Contributed by Jonathan Pool on 2010-11-28.
PanLex uses a PostgreSQL database named “plx” on the host uf.utilika.org. The host runs the current version of PostgreSQL, not the older one integrated into Red Hat Enterprise Linux 5.
The design of the database is described by “PanLex 2.0: The Database Design”.
The cluster hosting the “plx” database is not configured for point-in-time recovery from the write-ahead log (WAL). The WAL level is set to “minimal”, the default, which does not record enough information for such recovery. Point-in-time recovery appears to be too complex for practical use.
The “plx” database is archived automatically with pg_dump every Sunday into the server’s /var/local/archives/dbms directory. Each Sunday’s archive is combined with those of the other important cluster databases and rearchived in the /var/local/archives/general directory, where each weekly database archive is date-named and kept until an administrator copies it to another device and deletes it. In addition, an archive of the largest tables in “plx” is written to the /tmp directory nightly, replacing the prior night’s archive.
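The weekly archiving step could be sketched roughly as follows. This is only an illustration, not the script actually used on the server: the output file name “plx.dump”, the choice of pg_dump’s custom format, and the function names are assumptions. Only the database name, the directory, and the Sunday schedule come from the description above. The sketch builds the command without running it.

```python
from datetime import date
from pathlib import PurePosixPath
from typing import List

# Directory named in the text above.
ARCHIVE_DIR = PurePosixPath("/var/local/archives/dbms")


def pg_dump_command(db_name: str, archive_dir: PurePosixPath = ARCHIVE_DIR) -> List[str]:
    """Build (but do not run) a pg_dump command for one database.

    The file name "<db_name>.dump" and the custom format are assumptions
    made for illustration; the real archive may be named and formatted
    differently.
    """
    target = archive_dir / f"{db_name}.dump"
    return ["pg_dump", "--format=custom", f"--file={target}", db_name]


def is_archive_day(day: date) -> bool:
    """Return True on Sundays, the weekly archive day."""
    return day.weekday() == 6  # Monday is 0, Sunday is 6


if __name__ == "__main__":
    if is_archive_day(date.today()):
        print(" ".join(pg_dump_command("plx")))
```

The resulting list could be passed to subprocess.run by a scheduled job. An archive in pg_dump’s custom format is restored with pg_restore rather than psql.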
As an alternative to point-in-time recovery, as of December 2010 all changes to the “plx” database that are made through the PanLem interface are logged in a user-readable format. Once the last archive has been restored, the log entries can be converted straightforwardly into sequences of commands that bring the database forward to a selected point. The log files are stored in the /var/www/local/panlex/log directory. A new file is started every day and is automatically deleted 30 days after creation. The files are named with their dates. Each line describes a query or an uploaded approver file, and contains the time of day, the user ID, the PanLem state in which the operation took place, and the operation.
When an approver file is uploaded, only one line is added to the log, identifying the file, so restoring that operation requires uploading that file again. To do this, one can move the uploaded file from the /var/www/local/panlex/fin directory, where uploaded files are kept for 30 days, back to the /var/www/local/panlex/smp/bon (if simple text), /var/www/local/panlex/tot/bon (if full text), or /var/www/local/panlex/xml/bon (if XML) directory, where it will be automatically reprocessed by the appropriate daemon.
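Moving an uploaded approver file back for reprocessing could be done by hand or with a small helper such as the sketch below. The directory names are those given above; the function name, the labels “simple”, “full”, and “xml”, and the error handling are hypothetical and chosen only for illustration.

```python
import shutil
from pathlib import Path

# Root of the PanLem directories named in the text above.
PANLEX_ROOT = Path("/var/www/local/panlex")

# Reprocessing directory for each kind of approver file, relative to the root.
# The keys are hypothetical labels, not names used by PanLem itself.
REPROCESS_DIRS = {
    "simple": Path("smp") / "bon",  # simple text
    "full": Path("tot") / "bon",    # full text
    "xml": Path("xml") / "bon",     # XML
}


def requeue_upload(file_name: str, kind: str, root: Path = PANLEX_ROOT) -> Path:
    """Move an uploaded file from fin/ back to the matching bon/ directory."""
    if kind not in REPROCESS_DIRS:
        raise ValueError(f"unknown file kind: {kind!r}")
    source = root / "fin" / file_name
    if not source.is_file():
        # Uploaded files are kept for only 30 days, so the file may be gone.
        raise FileNotFoundError(f"no uploaded file at {source}")
    target = root / REPROCESS_DIRS[kind] / file_name
    shutil.move(str(source), str(target))
    return target
```

For example, requeue_upload("example.xml", "xml") would move a file of that (made-up) name into the xml/bon directory, after which the daemon would be expected to pick it up as described above.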
For debugging and security purposes, as of December 2010 PanLem sessions are also logged in the file /var/log/httpd/error_log. A line is produced for each form submission. It gives the date, time, client IP address, form state, and next form state. This is not a complete record of data submitted via forms. Any submissions that produce changes in the database are logged twice: in the error_log file and as described in the previous paragraph.