UI Tasks 1 Comments

Introduction

Contributed by Jonathan Pool on 2009-06-21. Amended on 2009-08-17. This document contains comments on the document “UI Tasks 1”.

Legend

“UI Tasks 1” describes 9 tasks that can be performed with the Web user interface of PanLex. To perform a task, execute the actions described in the “User Action” column, in sequence.

Before performing each task, enter the following preparatory command to clear the PostgreSQL cache:

sudo -E /var/local/utils/pgcold.txt

This command runs a script containing the following three commands:

sync
echo 3 > /proc/sys/vm/drop_caches
sudo -E -u postgres pg_ctl restart

After this preparatory command, execute the entire task, one action after the other, measuring and recording each action’s execution time. These are the “cold” times. When the task is complete, execute the same task again, without the preparatory command. The times you record for this repeated task are the “hot” times.
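The timing procedure can be sketched in Python as follows. This is only an illustration: it assumes that each user action can be driven by a callable (for example, one that requests a page), and the names time_actions and actions are hypothetical. It does not issue the preparatory command; run that separately before the cold pass and omit it before the hot pass.

```python
import time
from typing import Callable, List


def time_actions(actions: List[Callable[[], None]]) -> List[float]:
    """Run the actions in sequence and return each one's elapsed seconds."""
    timings: List[float] = []
    for action in actions:
        start = time.perf_counter()   # monotonic wall-clock timer
        action()                      # hypothetical stand-in for one user action
        timings.append(time.perf_counter() - start)
    return timings
```

Calling time_actions once after the preparatory command and once more without it would give the cold and hot times for the same list of actions.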

In the tables below, the “OK” times are the maximum elapsed times I judge satisfactory for the responses to the specified user actions. The cold and hot times are those that I measured.

The envisioned typical PanLex use case involves a widely dispersed user population making occasional queries. I am assuming, conservatively, that in this environment the file blocks and query plans required for a typical query would not be cached and that therefore the “Cold” times are the ones that users would typically experience. Cold times that exceed the satisfactory times are shown in bold italic type.

Diagnostic Comments

Task 1, Action 6

This action is executed on line 297 of exvizcw.pl. The work is done by the “plx” database’s SQL function “tpexxs (text, integer)”. To replicate directly, submit the query:

select * from tpexxs ('peripher', 5000);

Task 2, Action 7

This action is executed on line 26 of trviz7w.pl. The work is done by the “plx” database’s SQL function “trlv (integer)”. To replicate directly, submit the query:

select * from trlv (241018);

Tasks 3 and 4

These tasks are identical to tasks 1 and 2, respectively, except for action 2. In these tasks, action 2 causes one function to execute more slowly on all subsequent actions than it does in tasks 1 and 2.

The UI labels (currently 138) are recorded in an artificial language, “PanLex”, and are translated at run time into the user’s interface language. To bootstrap these translations, the PanLex administrator has provided a table (“pl1”) containing translations into 7 languages (English, French, German, Russian, Turkish, Norwegian Bokmål, and Esperanto) and, sporadically, into others. When the user’s interface language is one of these 7, translation of the UI labels uses only table “pl1”. Otherwise, all translations found in the database from the PanLex expression and from any of its translations in “pl1” are tabulated, and the translation into the target language that occurs most often is chosen. This process is more expensive than the simple “pl1” lookup.
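The selection rule described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the database’s actual function: the tables are replaced by hypothetical in-memory dictionaries, languages are identified by name rather than by integer, and the sample entries are invented for the example.

```python
from collections import Counter
from typing import Dict, List, Optional, Set, Tuple

Expression = Tuple[str, str]  # (language name, expression text)


def translate_label(
    label: str,
    target: str,
    pl1: Dict[str, Dict[str, str]],
    translations: Dict[Expression, List[Expression]],
    bootstrap_languages: Set[str],
) -> Optional[str]:
    """Translate a PanLex UI label into the target language."""
    # Fast path: the target is one of the bootstrap languages, so use pl1 only.
    if target in bootstrap_languages:
        return pl1.get(label, {}).get(target)
    # Slow path: start from the PanLex expression and its pl1 translations.
    sources: List[Expression] = [("PanLex", label)]
    sources.extend(pl1.get(label, {}).items())
    # Tabulate every translation of any source into the target language.
    tally: Counter = Counter()
    for source in sources:
        for language, text in translations.get(source, []):
            if language == target:
                tally[text] += 1
    if not tally:
        return None  # no translation found
    return tally.most_common(1)[0][0]  # the most duplicated translation


# Hypothetical sample data, invented for illustration only.
SAMPLE_PL1 = {"trn": {"English": "translation", "French": "traduction"}}
SAMPLE_TRANSLATIONS = {
    ("PanLex", "trn"): [("Italian", "traduzione")],
    ("English", "translation"): [("Italian", "traduzione"), ("Italian", "versione")],
    ("French", "traduction"): [("Italian", "traduzione")],
}
SAMPLE_BOOTSTRAP = {"English", "French"}
```

With this sample data, a request for Italian takes the slow path and tallies “traduzione” three times against “versione” once, while a request for French is answered from the pl1 stand-in alone.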

The work described above is done by the “plx” database’s PL/pgSQL function “trp2a (integer, integer)”. To replicate directly, submit one of these queries:

select trp2a (4927, 187); -- translate “trn” from PanLex into English; fast
select trp2a (4927, 211); -- translate “trn” from PanLex into French; fast
select trp2a (4927, 620); -- translate “trn” from PanLex into Russian; fast
select trp2a (4927, 304); -- translate “trn” from PanLex into Italian; slow
select trp2a (4927, 128); -- translate “trn” from PanLex into Mandarin; slow
select trp2a (4927, 184); -- translate “trn” from PanLex into Greek; slow
select trp2a (4927, 933); -- translate “trn” from PanLex into Mochi; slow; none found

Tasks 5-9

Each of these tasks tabulates all the characters in all the expressions of some language. The more expressions the language has in the database, the more expensive the task. If the language has more than about 100,000 expressions, there is substantial latency. The latency is exacerbated if the repertoire of characters in the language’s expressions is very large. At present, one language has enough expressions to cause the server to crash with an out-of-memory error while trying to execute the character-tabulation operation.

The work described above is done by the statement on lines 18-22 of lvviz2w.pl. When the out-of-memory crash occurs, it is on this statement. This statement compiles a string containing all the instances of all the characters in all the expressions of the language, sorted. The actions performed by this statement for this purpose are:

  1. Obtain an array of the texts of all the expressions in the language.
  2. Concatenate the array elements into a single string.
  3. Decode the string from UTF-8 to character values.
  4. Split the characters in the string into an array of characters.
  5. Sort the array.
  6. Join the array elements into a string.
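
The six steps above can be sketched in Python to show where the memory goes. This is an illustration only, not the Perl statement in lvviz2w.pl: the function names are hypothetical, and the expression texts are assumed to arrive as UTF-8 byte strings (step 1 is represented by the function argument). The second function is a possible alternative, offered as an assumption rather than a tested fix, that counts characters one expression at a time instead of holding every character instance in memory.

```python
from collections import Counter
from typing import Iterable, List


def sorted_characters(expression_texts: List[bytes]) -> str:
    """Follow steps 2 to 6: build one sorted string of all character instances."""
    joined = b"".join(expression_texts)   # step 2: concatenate into one string
    decoded = joined.decode("utf-8")      # step 3: decode UTF-8 to characters
    characters = list(decoded)            # step 4: one array element per character
    characters.sort()                     # step 5: sort the array
    return "".join(characters)            # step 6: join back into a string


def character_counts(expression_texts: Iterable[bytes]) -> Counter:
    """Alternative sketch: tally characters incrementally, one expression at a time."""
    counts: Counter = Counter()
    for text in expression_texts:
        counts.update(text.decode("utf-8"))  # adds one count per character
    return counts


def sorted_from_counts(counts: Counter) -> str:
    """Rebuild the sorted string from the tally, if the full string is needed."""
    return "".join(ch * n for ch, n in sorted(counts.items()))
```

In the first function, the intermediate array in step 4 holds one element per character instance, so its size grows with the total length of all expressions. In the second, memory grows only with the number of distinct characters.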