Dictionary entries have grown in complexity over the past several years because the constraints of physical printing have been removed. Originally, as John Simpson pointed out several times in The Word Detective [22], definitions had to be as concise as possible, owing to the need to conserve printed space. Electronic and online dictionaries don’t have such a restriction. Computer storage space, even on mobile phones, has grown year after year. The only restrictions on space now are those of the screen on which the dictionary is being presented.
Dictionaries will always have the first three categories of content, just like glossaries. However, given that electronic dictionaries are no longer constrained by printed size, many will have additional content that includes the various types of definitions.
1. Term or lemma – This is the term that will be defined. Lemma is the fancy word lexicographers give the term. Dictionaries always list their terms following strict capitalization rules.
2. Pronunciation helpers – Some dictionaries will have a small speaker icon next to the term’s entry. When pressed, the term will be spoken out loud to assist the reader in understanding how to vocalize the term. Other dictionaries will have a syllable breakdown of the term entry, such as the breakdown for dictionary [dik-shuh-ner-ee]. This was the original pronunciation helper.
3. Preferred, nonstandard, and alternate spelling variants – Not many dictionaries, as of yet, contain these categories of data. Preferred terms are those terms normally found within a manual of style, which lists certain terms to use instead of other, similar terms. Nonstandard variants have entered the dictionary world as a result of analyzing various sets of corpora (the documents you are using and drawing terms from) and determining that writers are using differing terms with the same definition. The nonstandard variants are those outlying term uses that get added to dictionaries to let Natural Language Processing engines know that, for example, personal data and individual’s information are the same thing. Alternate spelling variants are regional spellings, such as organization with a “z” in the US and organisation with an “s” in the UK.
4. Acronym – Some dictionaries will place the acronym on a line below the term entry. Others will simply follow the term entry with the acronym in parentheses.
5. Designator and Definition – Dictionary definitions are more stringent than glossary definitions. Dictionary definitions will always begin with the definition’s designator. A designator is needed because some terms have multiple definitions, such as the term report (it has multiple definitions as both a noun and a verb). All dictionaries will list whether the definition that follows is a noun, pronoun, adjective, determiner, verb, adverb, preposition, conjunction, or interjection. Custom dictionaries will take this concept further and will list whether the definition fits any specific type of named entity (we cover those later).
6. Attribution – Many online dictionaries, such as Wordnik and Compliance Dictionary, will include definitions from multiple sources. When including definitions from multiple sources, these dictionaries will include the source’s attribution along with the definition.
7. Related forms – Any electronic dictionary that is built with the intention of working with Natural Language Processing engines will also include all of the other forms that the term can take. The most common are plurals and possessives for nouns and all of the various tenses for verbs.
8. Relationships – Most dictionaries will list each term’s synonyms and antonyms. Dictionaries that also blend in a thesaurus will add additional terms related to the primary term. As of this writing, only Compliance Dictionary lists advanced semantic relationships such as category of, part of, used to enforce, references, manages, used to create, etc. These advanced semantic relationships are necessary for Natural Language Processing engines to understand the named entity relationships between terms.
9. Examples of use – Examples of use are wonderful. And with modern “document scraping” software, once a term has been identified, examples can be found in the dictionary’s corpus and brought to the forefront.
10. Reverse lookup – These are terms that the scraping engine of the dictionary has found that use the primary term in their definition.
11. Etymology – Some dictionaries will list the term’s origin, showing which parts of the term originated when and where and how the term has evolved.
12. Visuals – Some of the newer online dictionaries, like Wordnik, will also have pictures and illustrations of the term listed with the definitions. They say a picture is worth a thousand words...
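The reverse lookup in item 10 can be illustrated with a minimal Python sketch. It is not how any particular dictionary implements the feature; the function name build_reverse_lookup and the sample definitions are hypothetical, and the matching is deliberately naive (single-word terms, exact word matches only, so "findings" does not count as a use of "finding").

```python
import re


def build_reverse_lookup(definitions):
    """Map each term to the set of other terms whose definitions use it.

    `definitions` is a plain dict of term -> definition text. Matching is
    exact and case-insensitive on whole words; related forms such as
    plurals are intentionally not handled in this sketch.
    """
    reverse = {term: set() for term in definitions}
    for term, definition in definitions.items():
        # Break the definition into lowercase words.
        words = set(re.findall(r"[a-z]+", definition.lower()))
        for other in definitions:
            if other != term and other.lower() in words:
                reverse[other].add(term)
    return reverse


# Illustrative definitions only, not taken from a real dictionary.
sample = {
    "report": "a document that presents the findings of an audit",
    "audit": "a formal examination of an organization's accounts",
    "finding": "a conclusion stated in a report after an audit",
}

reverse = build_reverse_lookup(sample)
```

Here reverse["audit"] contains both "report" and "finding", because both definitions use the word audit. A production engine would presumably also match the related forms from item 7 rather than exact words alone.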
Not every dictionary will have every item. Wordnik, for example, has most of these items, but doesn’t have the named entity recognition designators or the advanced semantic relationships that go with them. Compliance Dictionary doesn’t have the reverse lookup, etymology, or visuals.
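One way to picture the twelve items together is as a single record in which only the term is required and everything else is optional. The following Python sketch is an assumption made for illustration, not the schema of Wordnik, Compliance Dictionary, or any other product; the class names, field names, and sample definitions are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Definition:
    designator: str                    # item 5: part of speech or named-entity type
    text: str                          # item 5: the definition itself
    attribution: Optional[str] = None  # item 6: source of the definition


@dataclass
class DictionaryEntry:
    lemma: str                                                  # item 1
    pronunciation: Optional[str] = None                         # item 2
    spelling_variants: List[str] = field(default_factory=list)  # item 3
    acronym: Optional[str] = None                               # item 4
    definitions: List[Definition] = field(default_factory=list)
    related_forms: List[str] = field(default_factory=list)      # item 7
    relationships: Dict[str, List[str]] = field(default_factory=dict)  # item 8
    examples: List[str] = field(default_factory=list)           # item 9
    reverse_lookup: List[str] = field(default_factory=list)     # item 10
    etymology: Optional[str] = None                             # item 11
    visuals: List[str] = field(default_factory=list)            # item 12

    def designators(self) -> List[str]:
        """Return the distinct designators used by this entry, sorted."""
        return sorted({d.designator for d in self.definitions})


# Illustrative entry; the definition texts are made up for this sketch.
entry = DictionaryEntry(
    lemma="report",
    definitions=[
        Definition("noun", "a document that presents information"),
        Definition("verb", "to give an account of something"),
    ],
    related_forms=["reports", "reported", "reporting"],
)
```

Because every field except lemma has a default, a dictionary that lacks etymology or visuals simply leaves those fields empty. Calling entry.designators() shows why the designator matters: the single term report carries both a noun and a verb definition.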
To write custom dictionary entries, follow steps 1 through 6 of how to write definitions.