Batch Submission Specifications for PanLem
Contributed by Jonathan Pool.
Revised 2011-05-16.
The “PanLem” UI permits users to submit lexical data as files.
There are three input file formats: simple text, full text, and XML. This page specifies the syntax of the full text format.
The expansion formulae below use multi-character, space-delimited tokens as atoms. Spaces in the formulae only separate tokens and are not otherwise significant. The operators are represented with standard regular-expression symbols and the following special symbols:
| ‘’ | text quotation |
| ¶ | newline (U+000A) |
| «» | arbitrary reordering of repetitions of expansions of enclosed atoms |
| ⁑ | such that (expansion must comply with the following condition) |
| ⊖ () | all atoms in all expansions of all instances of enclosed atom must be unique |
| $ | introducer of variable referenced in condition query |
Varilingual Variant
The full text format has three variants: varilingual, centrilingual, and bilingual. The most general variant is the varilingual. Its specification is:
- file → ‘:’ ¶ ‘0’ ¶ mn0+
- mn0 → ¶ «mi, dms, dfs, dn0s»
- mi → (‘mi’ ¶ .{1,50} ¶)?
- dms → (‘dm’ ¶ dmlt ¶)* ⁑ ⊖ (dmlt)
- dmlt → lvi ¶ .{1,50}
- lvi → $1 ‘-’ $2 ⁑ (select count (lv) from lv where lc = $1 and vc = $2) = 1
- dfs → (‘df’ ¶ dflt ¶)* ⁑ ⊖ (dflt)
- dflt → lvi ¶ .{1,200}
- dn0s → (‘ex’ ¶ exlt ¶ exwms)* ⁑ ⊖ (exlt)
- exlt → lvi ¶ .{1,100}
- exwms → ((‘wc’ ¶ wctt | ‘md’ ¶ mdvv) ¶)* ⁑ ⊖ (wctt), ⊖ (mdvv)
- wctt → ‘adjv’|‘advb’|‘affx’|‘auxv’|‘conj’|‘detr’|‘ijec’|‘misc’|‘name’|‘noun’|‘post’|‘prep’|‘pron’|‘verb’|‘vpar’
- mdvv → .{1,50} ¶ .{1,100}
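The formulae above can be read as a recipe for writing a file. The following Python sketch serializes one meaning into the varilingual variant and enforces the length limits, the word-class list, the ⊖ uniqueness conditions, and the lvi condition. Everything not in the formulae is an assumption made for illustration: the function names, the dictionary layout of a meaning, the in-memory SQLite table `lv` (with text columns `lc` and `vc`), and the sample values `eng-000`, `fra-000`, and `m1`. The sketch always emits the items in one fixed order, which is only one of the orderings that «» permits.

```python
import sqlite3

WORD_CLASSES = {
    "adjv", "advb", "affx", "auxv", "conj", "detr", "ijec", "misc",
    "name", "noun", "post", "prep", "pron", "verb", "vpar",
}


def check_text(text, limit):
    """Enforce .{1,limit}: 1 to limit characters, none of them a newline."""
    if not 1 <= len(text) <= limit or "\n" in text:
        raise ValueError(f"bad field (limit {limit}): {text!r}")
    return text


def check_lvi(conn, lvi):
    """Enforce the lvi condition: exactly one lv row matches lc and vc."""
    lc, sep, vc = lvi.partition("-")
    count = conn.execute(
        "select count(lv) from lv where lc = ? and vc = ?", (lc, vc)
    ).fetchone()[0]
    if not sep or count != 1:
        raise ValueError(f"unknown language variety: {lvi!r}")
    return lvi


def check_unique(items, what):
    """Enforce a ⊖ condition: no expansion may be repeated."""
    if len(set(items)) != len(items):
        raise ValueError(f"duplicate {what} entry")


def serialize_meaning(conn, meaning):
    """Serialize one mn0. 'dm', 'df', and 'md' entries are tuples."""
    lines = [""]  # each mn0 begins with a newline
    if "mi" in meaning:
        lines += ["mi", check_text(meaning["mi"], 50)]
    for tag, limit in (("dm", 50), ("df", 200)):
        entries = meaning.get(tag, [])
        check_unique(entries, tag)
        for lvi, text in entries:
            lines += [tag, check_lvi(conn, lvi), check_text(text, limit)]
    exs = meaning.get("ex", [])
    check_unique([(ex["lvi"], ex["text"]) for ex in exs], "ex")
    for ex in exs:
        lines += ["ex", check_lvi(conn, ex["lvi"]), check_text(ex["text"], 100)]
        wcs, mds = ex.get("wc", []), ex.get("md", [])
        check_unique(wcs, "wc")
        check_unique(mds, "md")
        for wc in wcs:
            if wc not in WORD_CLASSES:
                raise ValueError(f"unknown word class: {wc!r}")
            lines += ["wc", wc]
        for name, value in mds:
            lines += ["md", check_text(name, 50), check_text(value, 100)]
    return "\n".join(lines) + "\n"


def serialize_file(conn, meanings):
    """Serialize a whole varilingual file (header ':' and '0')."""
    if not meanings:
        raise ValueError("mn0+ requires at least one meaning")
    return ":\n0\n" + "".join(serialize_meaning(conn, m) for m in meanings)


# Hypothetical stand-in for the database that the lvi condition queries.
conn = sqlite3.connect(":memory:")
conn.execute("create table lv (lv integer primary key, lc text, vc text)")
conn.executemany("insert into lv (lc, vc) values (?, ?)",
                 [("eng", "000"), ("fra", "000")])
meaning = {
    "mi": "m1",
    "ex": [
        {"lvi": "eng-000", "text": "dog", "wc": ["noun"]},
        {"lvi": "fra-000", "text": "chien"},
    ],
}
text = serialize_file(conn, [meaning])
print(text, end="")
```

The printed file consists of the lines `:` and `0`, a blank line that opens the meaning, and then one tag or value per line.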
Centrilingual Variant
The centrilingual variant is identical to the varilingual variant except as follows:
- The expansion formula “file → ‘:’ ¶ ‘0’ ¶ mn0+” is replaced with “file → ‘:’ ¶ ‘1’ ¶ lvi ¶ mn0+”.
- Within each mn0 expansion, the first “ex” entry in the reordering (that is, the first repetition of the expansion of “dn0s”) leaves the “lvi ¶” of its “exlt” unexpressed; that “lvi” is implicitly identical to the expansion of the “lvi” in the expansion of “file”.
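As an illustration of the two differences, the following Python sketch writes a centrilingual file for the simplified case in which each meaning contains only “ex” entries without “wc” or “md” items. The function name, the list-of-pairs data layout, and the sample codes `eng-000` and `fra-000` are assumptions made for the example, and the sketch does not check the lvi condition or the length limits.

```python
def serialize_centrilingual(central_lvi, meanings):
    """Write a centrilingual file.

    meanings is a non-empty list; each meaning is a list of (lvi, text)
    pairs, and the first pair of a meaning must use central_lvi.
    """
    if not meanings:
        raise ValueError("mn0+ requires at least one meaning")
    lines = [":", "1", central_lvi]  # header: ':' then '1' then lvi
    for expressions in meanings:
        lines.append("")  # each mn0 begins with a newline
        for index, (lvi, text) in enumerate(expressions):
            if index == 0:
                if lvi != central_lvi:
                    raise ValueError("first expression must use the central lvi")
                lines += ["ex", text]  # lvi unexpressed, implied by the header
            else:
                lines += ["ex", lvi, text]  # later entries keep their lvi
    return "\n".join(lines) + "\n"


centri = serialize_centrilingual(
    "eng-000",
    [[("eng-000", "dog"), ("fra-000", "chien")]],
)
print(centri, end="")
```

Only the first “ex” entry of each meaning loses its language-variety line; every later entry is written exactly as in the varilingual variant.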
Bilingual Variant
The bilingual variant is identical to the centrilingual variant except as follows:
- The expansion formula “file → ‘:’ ¶ ‘1’ ¶ lvi ¶ mn0+” is replaced with “file → ‘:’ ¶ ‘2’ ¶ lvi ¶ lvit ¶ mn0+”.
- The expansion formula “lvit → $1 ‘-’ $2 ⁑ (select count (lv) from lv where lc = $1 and vc = $2) = 1” is added.
- Within each mn0 expansion, every “ex” entry other than the first in the reordering (that is, every later repetition of the expansion of “dn0s”) leaves the “lvi ¶” of its “exlt” unexpressed; that “lvi” is implicitly identical to the expansion of the “lvit” in the expansion of “file”.
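Read in the other direction, these rules tell a reader of a bilingual file how to restore the unexpressed language varieties. The following Python sketch does that: it walks the lines of a bilingual file, skips “mi”, “dm”, “df”, “wc”, and “md” items by their fixed line counts, and returns each meaning's expressions with the language variety made explicit. The function name, the returned data layout, and the sample codes are assumptions for illustration; the sketch does not validate lengths, uniqueness, or the lvi and lvit conditions.

```python
def expand_bilingual(text):
    """Return a list of meanings, each a list of (lvi, expression text)."""
    if text.endswith("\n"):
        text = text[:-1]  # drop the newline that closes the last item
    lines = text.split("\n")
    if len(lines) < 4 or lines[:2] != [":", "2"]:
        raise ValueError("not a bilingual full text file")
    lvi, lvit = lines[2], lines[3]  # the two varieties named in the header
    meanings = []
    i = 4
    while i < len(lines):
        tag = lines[i]
        if tag == "":
            meanings.append([])  # a blank line opens a new mn0
            i += 1
            continue
        if not meanings:
            raise ValueError("item found before the first meaning")
        if tag == "ex":
            current = meanings[-1]
            # First "ex" of a meaning implies lvi; all later ones imply lvit.
            current.append((lvit if current else lvi, lines[i + 1]))
            i += 2
        elif tag in ("mi", "wc"):
            i += 2  # tag plus one value line
        elif tag in ("dm", "df", "md"):
            i += 3  # tag plus two value lines
        else:
            raise ValueError(f"unexpected line: {tag!r}")
    return meanings


source = (":\n2\neng-000\nfra-000\n"
          "\nex\ndog\nwc\nnoun\nex\nchien\n"
          "\nex\ncat\nex\nchat\n")
pairs = expand_bilingual(source)
print(pairs)
```

Because every text field must contain at least one character, a blank line can only be the newline that opens a meaning, which is what lets the sketch split meanings without lookahead.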