The Novial Lexike (NL) must be put online. Since N30 is the baseline from which we are working, the NL is the baseline for our vocabulary, so we need to get it online in its entirety in some form.
Being a printed book, the NL is in flat text format. To be more useful to us online, it needs to be converted into a format which conveys more information: SGML. I have written a DTD and assorted software for an SGML-formatted dictionary, but the text itself still needs to be tagged.
There are two types of dictionary. Multilingual dictionaries are for converting between languages, and as such they typically have short, one- or two-word definitions. Monolingual dictionaries define each word in its own language. The NL is of the first type, so we should go through it and write out full definitions in Novial for each word.
English, French, and German do not a quorum make. They do cover a large portion of our target audience, but we will need to expand the languages covered by the dictionary. Good early targets (imo) would be Spanish and Esperanto.
If you're interested in helping out on this, email me. The work may seem daunting, but doing a page or two at a time is not much of a commitment. At the moment, most of the work that needs to be done is converting to SGML, which is not as hard as it sounds, as I can do some preprocessing on the file to get you started.
So here's what you do if you want to help. Ideally, you email me, I set aside a section for you, and I email you back to confirm that it has been set aside. This completely avoids duplication of work.
However, I know that when I am helping other people on projects, I tend to have time right now, and I don't want to wait for approval before I can start. If you are of this sort, then the best we can do to avoid a collision is for you to pick some section at random, download it, and email me so I can officially set it aside ASAP. Just make sure you don't take one of the "in progress" sections, as those are already being worked on by someone.
SGMLizing: If you speak SGML, you might want to read the DTD itself, but otherwise: the SGML format is a lot like HTML, but with different tags. In our project, each letter is stored in its own file, and each file has this basic format:
<section letter=a>
  <pron type=pref><fra>a <eng>as in card, calm <deu>a
  ...
  <group root=abat>
    <word pos=sb>abate <deu>Abt <eng>abbot <fra>abbé
    <word pos=sb>abata <fra>abbesse <eng>abbess <deu>Äbtissin
    <word pos=sb>abatia <deu>Abtei <fra>abbaye <eng>abbey
  </group>
  ...
  <suffix from=sb to=sb>aje
    <fra>quelquechose composé de ou ayant le caractère de
    <eng>something made of, consisting of, having the character of
    <deu>bestehend aus, nach Art von
    <ex>lanaje <fra>article de laine <eng>woollen goods <deu>Wollware
    <ex>infantaje <fra>enfantillage <eng>childish act <deu>Kinderei
  </suffix>
  ...
  <prefix>arki <deu>Archi-, Erz- <eng>arch- <fra>arch-
    <ex>arkianjele
    <ex>arkiepiskope
    <ex>arkiduke
  </prefix>
  ...
</section>
Here is what each tag does:
section
    Self-explanatory.
pron
    Pronunciation. Its type can be one of pref(erred), pos(itional), or alt(ernate). There can be several prons for a certain letter; each contains a tag for some number of languages which have a corresponding pronunciation.
group
    contains a group of words which are derived from each other. Its root is the root form; for nouns and adjectives, just hack off the last vowel, and for verbs keep the vowel but drop the R. We may come up with a better way to do this later, but if you do it this way now it can be mass-converted later.
word
    is an individual word; its pos (part of speech) can be one of vb sb adj adv prep pron konj interj num. Also, it can have an irr argument if the word is irregularly derived (represented in NL by a double bar).
suffix
    is, of course, a suffix; from and to are what the suffix takes, and what it converts to; they can be any part of speech, plus any and same.
prefix
    is just that.
eng, fra, and deu
    and eventually others, are the languages we support.
nov
    is used when a definition is given in Novial.
cf
    is for a reference to another word (in NL, designated by "kp.").
ex
    is an example, and can itself be followed by definitions.
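The root rule given for group can be sketched in a few lines of Python. This is only an illustration of the rule as worded above, not part of the project's software: the function name derive_root is made up here, the vowel set is assumed to be a, e, i, o, u, and words the rule does not cover are simply returned unchanged.

```python
VOWELS = "aeiou"  # assumption: the vowels that can end a noun or adjective


def derive_root(word: str, pos: str) -> str:
    """Return a group root for `word`, following the rule described above."""
    if pos == "vb":
        # Verbs: keep the vowel but drop a final R.
        return word[:-1] if word.lower().endswith("r") else word
    if pos in ("sb", "adj"):
        # Nouns and adjectives: hack off the last vowel, if the word ends in one.
        return word[:-1] if word and word[-1].lower() in VOWELS else word
    # Other parts of speech: no rule is given, so leave the word alone.
    return word


# "abate" is the first word of the abat group in the sample file format.
print(derive_root("abate", "sb"))
```

The root belongs to the whole group, so it is normally derived once from the head word; the other words in the group (abata, abatia) share it.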
Above all, take a look at one of the completed SGML sections to get a good idea of what to do.
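Before sending a section back, a quick check of the pos values can catch typos. The following is a rough sketch using only Python's standard re module; the function name find_bad_pos is hypothetical, it looks only at word tags, and it is no substitute for validating against the DTD.

```python
import re

# The part-of-speech values listed for the word tag above.
ALLOWED_POS = {"vb", "sb", "adj", "adv", "prep", "pron", "konj", "interj", "num"}

WORD_TAG = re.compile(r"<word\b([^>]*)>", re.IGNORECASE)
POS_ATTR = re.compile(r"\bpos=([^\s>]+)")


def find_bad_pos(text: str) -> list[tuple[int, str]]:
    """Return (line number, value) for every word tag with an unknown pos."""
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for tag in WORD_TAG.finditer(line):
            attr = POS_ATTR.search(tag.group(1))
            if attr is None:
                continue  # no pos given; this sketch does not treat that as an error
            value = attr.group(1).strip("\"'")
            if value not in ALLOWED_POS:
                problems.append((lineno, value))
    return problems


sample = "<word pos=sb>abate <deu>Abt\n<word pos=noun>abata\n"
for lineno, value in find_bad_pos(sample):
    print(f"line {lineno}: unknown pos '{value}'")
```

Here "noun" is a deliberate mistake in the sample input, so the second line is reported.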