This starter glossary is a reconstruction of an earlier project which was lost in a move. Most of the definitions are lifted from other glossaries. Eventually, with the help of readers, I hope that better definitions and a more refined set of important terms can be created. Please submit your suggestions for words and definitions to Steve Bett.
Credits: Glen Adams' Unicode Glossary, David Crystal's Encyclopedia of Linguistics and The English Language, D. Runes' Dictionary of Philosophy, and The Dictionary of Linguistics.
- Acronym
- Literally top or front (initial) name. The Acropolis was the "top city" or, more descriptively, the city on top of the rock. An acronym is a name constructed from the initial letters of a longer name. RCA is an abbreviation for Radio Corporation of America (now French owned). GM is an abbreviation for General Motors. Some abbreviations are pronounceable as in the case of NASA and ...
- Acrophonic
- Literally top or initial sound
- Allograph
- The marks which are taken as representative of a particular grapheme or letter.
- Allophone
- The sounds which within a particular language are taken as representative of a particular phoneme or sound category.
- Alphabet.
- An ordered set of sound signs. An ordered set of graphemes (marks or shapes) which are linked to and represent the phonemes or significant sound categories of a language. A writing system is said to be alphabetical if the symbols represent the phonemes of a language.
- A collection of symbol shapes used in a writing system to more or less represent the sounds of that language. The shapes symbolize or represent the sound. The correspondence between shapes and sounds may be more or less exact; most alphabets do not exhibit a one-to-one correspondence between distinct sounds (phonemes) and distinct symbols (graphemes). English has a many-to-many correspondence between letters and sounds. The typical letter represents over 12 different sounds and a typical sound category can be represented by over 10 different letters.
- A writing system in which a set of symbols (letters) represents the important sounds (phonemes) of a language.
- Alphabetic Principle
- A one-to-one correspondence between the phonemes of a language and the letters in a writing system. In writing systems that follow an alphabetic principle, there is a close correspondence between a particular shape or allograph and a particular sound or allophone, making it possible to spell just as one speaks. In English, which departs from the alphabetic principle about half the time, sounds are associated with several shapes (or letters) and most letters are associated with several sounds.
- Affective
- Affricate
- Alveolar
- Ambiguous
- Anagram
- Analogy
- Argot, cant, special vocabulary used by a social group. See also pidgin.
- ASCII
- B
- Base Character.
- See Non-Combining Character.
- Bidirectional Display (BIDI).
- The process or result of mixing left-to-right oriented text and right-to-left oriented text in a single line. The standard Arabic writing system writes numbers from left-to-right while writing all other text from right-to-left. Mixing English text and Hebrew text requires bidirectional display.
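As a rough illustration of why mixed-direction text needs special handling, the sketch below uses Python's standard `unicodedata` module to look up the Unicode bidirectional class of each character. The sample string and the helper name `bidi_classes` are assumptions made for this example. A real display engine would go on to apply the full Unicode Bidirectional Algorithm, which this sketch does not attempt.

```python
import unicodedata

def bidi_classes(text):
    """Return (character, bidirectional class) pairs for each character."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]

# Hypothetical sample: Latin letter, Hebrew letter, Arabic letter,
# European digit, Arabic-Indic digit.
sample = "a\u05d0\u06281\u0661"
classes = [cls for _, cls in bidi_classes(sample)]
print(classes)
```

The letters fall into left-to-right and right-to-left classes, while the digits carry their own number classes, which is how numbers inside Arabic text can keep a left-to-right order.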
- Blend
- Over half of the 15 vowels recognized by English speakers can be constructed from other vowels. The long A, for instance, can be a composite of the /ah/ or short o sound and /ee/ the long E sound. The problem with any representation of these blends or diphthongs is the ambiguity of the traditional sound signs. /ae/ /ai/ /oi/ /oe/ have all been used to denote the component sounds of the vowel in age /aj/. Pictographic monofon avoids these problems by linking the shape to a key word. Thus the letter AVIAN is a blend of OX and EEL.
- Character.
- (1) an element of a computer character set; (2) an element of an alphabet; (3) an element of the Han script (see Hanzi). See also glyph.
- Character Properties.
- An unordered list of property names and property values which is associated with individual character code elements. Unicode explicitly specifies the following properties: a character's unique name, an image of a nominal form, a directional property, and, optionally, spacing, number, and certain other properties. A number of other properties can be deduced from a character's name and its character description. For example, the Unicode character LATIN SMALL LETTER A may be assigned the following properties: <TYPE,LETTER>, <CASE,LOWER>, <SCRIPT,LATIN>, <SCRIPT-ELEMENT,A>, etc.
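A minimal sketch of looking up a few such properties, assuming Python's standard `unicodedata` module as the property source. The helper name `describe` and the choice of which properties to collect are illustrative only. The property tags in the entry above, such as `<CASE,LOWER>`, are not what the module returns; it uses its own short codes.

```python
import unicodedata

def describe(ch):
    """Collect a few properties that Python's Unicode database exposes."""
    return {
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),          # Ll = lowercase letter
        "bidirectional": unicodedata.bidirectional(ch),
        "combining": unicodedata.combining(ch),        # 0 = not a combining character
    }

props = describe("a")
print(props)
```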
- Character Set.
- A collection of elements used to organize, control, or represent information. Such information can be classified as either formal, functional, or a combination of both form and function. Certain types of information are normally excluded from such representation; for example, directly perceived information such as pictures, sounds, texture, etc.; in contrast, the information which is represented by character sets can normally be said to be symbolic in nature.
- Character Unification.
- The process of replacing a number of potential elements of a character set with one actual element. The criteria for unification may be according to abstract form, abstract function, or both.
- Code Element.
- A unit of character encoding referring to both the numeric code value and the character which the code value represents.
- Code Page.
- A coded character set, often referring to a coded character set used by a Personal Computer, for example, PC code page 437, the default coded character set used by the DOS operating system.
- Coded Character Set.
- A character set in which each character is assigned a numeric code value. Frequently abbreviated as character set when the context is sufficient to determine what is intended.
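To make the idea of a numeric code value concrete, the following sketch encodes one character under three coded character sets using Python's built-in codecs. The choice of the letter e with acute accent is an assumption made for illustration; `cp437` is Python's codec name for PC code page 437.

```python
# The same character receives different code values in different coded character sets.
ch = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

unicode_value = ord(ch)              # code value in Unicode
cp437_bytes = ch.encode("cp437")     # code value in PC code page 437
latin1_bytes = ch.encode("latin-1")  # code value in ISO 8859-1

print(hex(unicode_value), cp437_bytes, latin1_bytes)

# A byte value identifies a character only once a code page has been chosen.
round_trip = bytes([0x82]).decode("cp437")
```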
- Collation.
- The process of ordering units of textual information according to a well defined set of rules. These rules may be relatively easy, based solely on the order of component symbols; on the other hand, they may be arbitrarily complex, requiring grammatical or semantic interpretation.
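The difference between a very simple rule set and a slightly richer one can be sketched with Python's built-in `sorted`. The word list is made up for this example, and the case-insensitive key is only one of many possible rule sets; language-aware collation needs considerably more than this.

```python
words = ["apple", "Banana", "cherry", "banana"]

# Simple rule: compare code values only, so uppercase letters sort before lowercase.
binary_order = sorted(words)

# Slightly richer rule: ignore case first, then break ties by code value.
caseless_order = sorted(words, key=lambda w: (w.casefold(), w))

print(binary_order)
print(caseless_order)
```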
- Combining Character.
- A character whose visible form is intended to visually combine with or attach to the visible form of another character; the meaning of the resulting combination may be a combination of the meanings of its components, or it may be a new meaning altogether.
- Combining Character Sequence.
- A sequence of character code elements which starts with a non-combining character and includes one or more subsequent combining characters up to the next non-combining character. Also called Composed Character Sequence.
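One way to picture this grouping is the sketch below, which splits a string into runs of a base character followed by its combining characters. It is a simplification under a stated assumption: a character counts as combining when `unicodedata.combining` returns a non-zero class, which misses the combining marks whose class is zero. The function name `combining_sequences` is hypothetical.

```python
import unicodedata

def combining_sequences(text):
    """Split text into runs of one base character plus its combining characters."""
    sequences = []
    for ch in text:
        # Attach a combining character to the preceding run, if there is one.
        if unicodedata.combining(ch) and sequences:
            sequences[-1] += ch
        else:
            sequences.append(ch)
    return sequences

# e + acute, a + diaeresis + dot below, then a plain z
parts = combining_sequences("e\u0301a\u0308\u0323z")
print(len(parts))
```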
- Compatibility Character.
- A character which would normally not be encoded but which for compatibility reasons is encoded.
- Composition and Layout.
- The process of creating a final form document from a revisable form document. A rendering engine is one component of this process.
- Conjunct Form.
- A glyph depicting a combination of two or more glyphs which represent consonants. A conjunct is a type of ligature which appears in most scripts based on the Brahmi family of Indic scripts. Some of these scripts do not use conjuncts, e.g., Tamil and Thai. In Unicode, a conjunct is formed by using a virama character to create a dead consonant; dead consonants are normally joined with a subsequent consonant form to form a conjunct. The components of a conjunct form may be joined either horizontally or vertically; in some cases, the component forms are not distinguishable as such.
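The encoding side of a conjunct can be inspected with Python's `unicodedata` module. The Devanagari sequence below (KA, virama, SSA) is an example chosen for this sketch; whether it displays as a single joined glyph depends on the font and rendering engine, which the code does not touch.

```python
import unicodedata

# Devanagari KA + VIRAMA + SSA: three code elements that are commonly drawn as one conjunct.
conjunct = "\u0915\u094d\u0937"

names = [unicodedata.name(ch) for ch in conjunct]
for name in names:
    print(name)
```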
- Contextual Variant.
- An abstract form which depicts an underlying, more abstract element in some particular context. In the context of rendering characters visible, a contextual variant refers to a glyph which depicts some character(s) in a given context; for example, the Unicode character ARABIC LETTER BA is depicted using one of four contextual variants, depending on its context. However, complex Arabic styles, such as Thuluth and Nastaliq, may require many more than four contextual variants for each form.
- Creole
- Creoles are new languages that have developed from a mixture of old ones and now have a life of their own. There are around sixty (60) surviving English-based pidgins and creoles.
- D
- Decomposition.
- (1) the process of separating or analyzing a text element into component units. These component units may not have any functional status, but may be simply formal units, i.e., abstract shapes; (2) the process of replacing a code element with multiple code elements, which, together, represent the original code element in some manner, e.g., the shapes associated with the resulting code elements may combine to form the shape associated with the original code element.
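Sense (2) can be tried out with Unicode normalization in Python's `unicodedata` module. This is a sketch using canonical decomposition (form NFD) on one sample character picked for illustration; it is not meant to cover every kind of decomposition the entry describes.

```python
import unicodedata

precomposed = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

# Replace one code element with several that together represent it.
decomposed = unicodedata.normalize("NFD", precomposed)
mapping = unicodedata.decomposition(precomposed)

# Recomposing (form NFC) returns the original single code element.
recomposed = unicodedata.normalize("NFC", decomposed)

print(len(precomposed), len(decomposed), mapping)
```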
- Default Vowel.
- In writing systems based on a script in the Brahmi family of Indic scripts, a consonant letter symbol normally assumes the presence of a default vowel, unless otherwise indicated. The phonetic value of this vowel differs among the various languages written with these writing systems. A default vowel is overridden either by indicating another vowel with an explicit vowel sign or by using virama to create a dead consonant.
- Demotic Script.
- (1) a simplified form of the ancient Egyptian hieratic writing; (2) a script or a form of a script used to write the vernacular or common speech of some language community; from Greek "de:mosios", belonging to the people. See Hieratic.
- Dependent Vowel.
- A symbol or sign which represents a vowel, and which is attached or combined with another symbol, usually one which represents a consonant. For example, in writing systems based on Arabic, Hebrew, and Indic scripts, vowels are normally represented as dependent vowel signs.
- Diacritic.
- (1) a mark applied or attached to a symbol in order to create a new symbol that represents an entirely new value; (2) a mark applied to a symbol irrespective of whether it changes the value of that symbol. In the latter case, the diacritic usually represents an independent value, e.g., an accent, tone, or some other linguistic information. Also called diacritical mark, or diacritical. See also non-spacing mark and combining mark.
- Digraph.
- A pair of signs or symbols (two graphs) which, together, represent a single sound or a single linguistic unit. The English writing system employs many digraphs, e.g., th, ch, sh, qu, etc. The same two symbols may not always be interpreted as a digraph, e.g., cathouse versus cathode. When three signs are so combined, they are called a trigraph. More than three are usually called an n-graph.
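The cathouse and cathode contrast can be demonstrated with a deliberately naive splitter. The sketch below always prefers a two-letter digraph from the four named in the entry; the function name `greedy_graphs` is hypothetical, and the approach is shown because it fails, not because it is recommended.

```python
DIGRAPHS = {"th", "ch", "sh", "qu"}  # the examples named in the entry above

def greedy_graphs(word):
    """Split a word into graphs, always preferring a two-letter digraph."""
    graphs, i = [], 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in DIGRAPHS:
            graphs.append(pair)
            i += 2
        else:
            graphs.append(word[i])
            i += 1
    return graphs

cathode = greedy_graphs("cathode")
cathouse = greedy_graphs("cathouse")
print(cathode, cathouse)
```

Both words come back with a th digraph, which is right for cathode and wrong for cathouse, where t and h belong to different parts of the compound. Spelling alone does not settle the question; some knowledge of the word is needed.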
- Diphthong.
- A pair of vowels which are considered a single vowel for the purpose of phonemic distinction. One of the two vowels is more prominent than the other. For example, in American English, the words day /dai/ and row /rou/ each consist of a consonant followed by a diphthong. In writing systems, diphthongs are sometimes written with one symbol, and sometimes with more than one symbol, e.g., with a digraph.
- Display Cell.
- A rectangular region on a display device within which one or more glyphs are imaged. Traditionally, display devices have assumed that each such cell was non-overlapping with other cells, that only one glyph was imaged into the cell, and that this single glyph was represented by a single character code element. In contrast, Unicode assumes a many-to-many relation between character code elements and glyphs which are imaged into a display cell that may overlap other display cells.
- Display Element.
- A particular kind of text element which, for purposes of display, is treated as an atomic unit. For example, a Unicode rendering engine may treat a combining character sequence as a single display text element.
- Dual Alphabet
- The use of capital (majuscule) and small (minuscule) letters in a single system.
- Dyslexia
- A language disturbance that affects the ability to read (sometimes called alexia).
- Equivalence.
- In the context of text processing, the process or result of establishing whether two text elements are identical in some respect. Different types of equivalence can be employed. For example, a form of strong equivalence is identity of code element values; this is very important for performing binary operations on text, such as a binary sort of a symbol table. More often, however, weak equivalence is what is desired by users of text; e.g., the English words cat and Cat are usually considered equivalent, even though they would not be equivalent in terms of code elements, i.e., strongly equivalent. Many different forms of weak equivalence may be desired by a user.
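A small sketch of the two kinds of comparison, written in Python. The weak equivalence shown here (ignore case and differences in composition) is only one of the many forms the entry mentions, and the function names are made up for this example.

```python
import unicodedata

def strongly_equivalent(a, b):
    """Strong equivalence: identical code element values."""
    return a == b

def weakly_equivalent(a, b):
    """One possible weak equivalence: ignore case and composition differences."""
    def fold(s):
        return unicodedata.normalize("NFC", s).casefold()
    return fold(a) == fold(b)

print(strongly_equivalent("cat", "Cat"), weakly_equivalent("cat", "Cat"))
```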
- Final Form Document.
- The resulting form taken by a text (document) after performing composition and layout on its logical form. This form of a document is not considered revisable or editable, but represents a form that can be directly imaged without any further formatting or layout operations. An example of a final form document is a PostScript file (.ps, .pdf) which represents the imagable form of some text.
- Font
- The source (cf. fountain) or mother (matrix) of the letter shapes that appear on a page. A collection of glyphs used for the visual depiction of character data. A font is often associated with a set of parameters, e.g., size, posture, slant, weight, serifness, etc., which, when set to particular values, generate a collection of imagable glyphs. The term is often confused with related terms such as type face and type style. Times is a type face. The font is more specific and includes size, posture, weight, etc.
- Form.
- In the context of written language, a form is an abstract shape which, by itself, has no intrinsic meaning. In Unicode, some characters are defined solely on the basis of form; however, as characters, such a form carries or bears meaning by virtue of the context in which it is used.
- Function.
- In the context of written language, a function is an abstract meaning which, by itself, has no intrinsic form. In Unicode, some characters are defined solely on the basis of function; however, as characters, such functions usually refer to one or more nominal forms, the precise form of which can be determined only by context. Form and function are independent dimensions along which the characters of Unicode are defined.
- Glyph.
- An abstract form which represents one or more glyph images, and which is used to visually depict encoded character data. In displaying Unicode character data, one or more glyphs may be selected to depict a particular character. These glyphs are selected by a rendering engine during composition and layout processing. See also character.
- Glyph Image.
- The actual, concrete image of a glyph representation having been rasterized or otherwise imaged onto some display surface. A particular shape.
- Glyph Representation.
- The glyph shape and glyph metrics associated with a specific glyph in a font.
- Glyph Shape.
- A collection of information which specifies the desired shape of a glyph; for example, bitmaps, vectors, and outlines are common forms taken by glyph shapes.
- Grapheme.
- A minimally distinctive unit of writing in the context of a particular writing system. For example, b and d are distinct graphemes in English writing systems since there exist distinct words like big and dig. By contrast, the two printed forms of the letter a (double-story a and single-story ɑ) are not distinct graphemes since no word is distinguished on the basis of these two different forms. A grapheme is for a writing system what a phoneme is for a phonology.
- Hangul.
- An alphabetic script used to write the Korean language. As used by the Korean writing system, this script is primarily phonemic because each symbol generally corresponds to a single phoneme. These symbols are further organized graphically into blocks which correspond closely to syllables. Finally, these blocks are laid out sequentially to represent words.
- Hieratic
- An abbreviated semi-pictorial version of Hieroglyphics. The Greek name for this script is a variation of "sacred or priestly markings."
- Homograph.
- A word or a symbol which is visually the same as another, but which represents a different sound or meaning. A special form of homographs occur with names in Chinese; for, even though a single Han character may be pronounced the same, and refer to the same canonical family name, families who wish to distinguish themselves from others using the same name often create a slightly modified form of the character which they use for their own name. In Unicode, these stylistic variations on a single character are not distinctly encoded; however, a mechanism for indirectly encoding this information in Unicode plain text format is currently under consideration.
- Homonym.
- A word which has the same pronunciation as another but which has a different meaning. Writing systems often write homonyms slightly differently in order to distinguish among their meanings. For example, diacritical marks and spelling variations are used by many alphabetic writing systems to distinguish among homonyms. In writing a language like Chinese where the homonym density is extremely high, a rich morphemic writing system is almost a necessity.
- Ideograph.
- The common term used in the West to refer to Han characters. It refers to the fact that Han characters are used in various writing systems primarily not to represent sound, but to represent meaning. However, it is somewhat inaccurate since Han characters do in fact represent a great deal of sound information; nearly 98% of all modern Han characters indicate some sound information.
- Ideographic Writing
- A writing system in which the symbols primarily refer to ideas or meaning rather than sounds. The standard Chinese writing system is primarily ideographic. However, to be more accurate, it is actually a morpho-syllabic writing system since each symbol simultaneously refers to a morpheme and a syllable; that is, it is both a meaning and sound writing system. [Note: what to call Chinese writing is perhaps one of the most controversial subjects of discussions about writing; see J. DeFrancis, The Chinese Language: Fact and Fantasy, for an excellent treatment of this topic.]
- Internationalization.
- The process of making a system or application software independent of or transparent to natural language. If a system or application can support any language, then it is fully internationalized; if it supports only a limited subset of languages, then it is partially internationalized. The goal of Unicode is to support full internationalization.
- ITA (Initial Teaching Alphabet)
- A regularized alphabet similar to New Spelling and WES which used ligatures to represent digraphs. ITA requires a special font. New Spelling doesn't. ITA was popular in the 1960s. While students could master ITA at least twice as fast as the conventional writing system, the approach never became mainstream and by the 1990s was largely abandoned. ITA experiments were conducted in classes where the teachers did not believe in the approach and it still worked. With ITA, teachers could use "phonics" as a teaching method. However, ITA was never coupled with a particular teaching method. It was just a medium and different teachers introduced it in different ways. Given the variety of ways that ITA could be introduced, it was not unusual that there were a few studies indicating that those who quickly learned ITA had difficulty in the third year of school when they had to transition to the traditional writing system (TO). Some of those who were taught ITA complain 20 years later that they never know if they have spelled a word correctly because they might have spelled it phonemically. (Cf. Downing)
- Jamo.
- The Korean name for a single element of the Hangul script, it literally means consonant & vowel.
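The relation between a Hangul syllable block and its jamo can be observed through Unicode normalization. The sketch below uses Python's `unicodedata` module and one sample syllable chosen for illustration; it relies on the precomposed syllable having a canonical decomposition into conjoining jamo.

```python
import unicodedata

syllable = "\ud55c"  # HANGUL SYLLABLE HAN

# Canonical decomposition (NFD) splits the syllable block into its jamo.
jamo = unicodedata.normalize("NFD", syllable)
jamo_names = [unicodedata.name(ch) for ch in jamo]

for name in jamo_names:
    print(name)
```

Normalizing the jamo back with form NFC reassembles the single syllable block.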
- Kana.
- The name of a primarily syllabic script used by the Japanese writing system. It comes in two forms, hiragana and katakana. The former is used to write particles, grammatical affixes, and words which have no Kanji form; the latter is used primarily to write foreign words.
- Kanji.
- The name for Han characters used in Japanese; derived from Hanzi.
- Keyboard Order
- The two most common keyboard layouts are referred to as QWERTY, after the first six letters on the top row of keys, and DVORAK, after the inventor of a keyboard layout that positioned the frequently used keys in the home key position. The story is that QWERTY was purposely invented to slow down typists and prevent jamming on the early typewriters.
- Keying Order.
- The order of text entry as it is entered by the user on a keyboard or another input device.
- Language.
- The sounds, structure, meaning, and usage associated with some linguistic community. Note that this definition refers to language in a general sense, and not merely the sounds uttered by its speakers.
- Letter
- An element of an alphabet. A symbol used in a writing system to represent one or more speech sounds. A sound sign having a particular characteristic shape. (Cf. Article: The definition of letter by ...)
- Lexeme
- The smallest contrastive unit in a semantic system (a lexical item), e.g., run, cat, mat.
- Lexicon
- The vocabulary of a language, especially in dictionary form; also called lexis.
- Lingua Franca
- A medium of communication for people who speak different first languages. A language of commerce. Lingua refers to the tongue.
- Ligature.
- A character in which two or more letters have been joined or connected. Ligatures were used by early printers to improve the aesthetics of some letter combinations. ITA used ligatures to improve the recognition of digraphs. A visual form representing a combination of two or more visual forms. Ligatures may either be obligatory, in which case their use is required, or optional, when they may be freely used or not.
- Linear Display Order.
- The process or result of (1) writing symbols linearly with respect to the sounds or meanings to which they refer, or (2) displaying the glyphs which represent characters in a linear order with respect to those characters. The English writing system employs a linear display order; most Indic script based writing systems do not. See Non-Linear Display Order.
- Localization.
- The process or result of modifying system or application software to support a particular language environment. Often this entails making coding decisions based on the particular language supported. Localization differs from Internationalization which attempts to remove all references to language from a system or application.
- Logical Order.
- An ordering of character code elements within a string such that the ordering does not correspond to any particular physical ordering, e.g., display or keyboard ordering.
- Metalanguage
- A language used to talk about or describe a language. SGML is a metalanguage for HTML.
- Morpheme.
- The smallest contrastive unit of grammar. A minimally distinctive unit of meaning in the context of a particular language. For example, cats consists of two morphemes: cat and -s, the plural suffix. The -s is called a bound form while cat is a free (or stand alone) form. dogs also has the -s but it is pronounced /z/.
- Morphemic Writing.
- A writing system which maps its symbols onto the morphemes of the language being represented. The Chinese writing system is a good example of morphemic writing, although it is also partly syllabic writing. English, on the other hand, is also partly a morphemic writing, since it uses spelling variations to distinguish among homonyms.
- Morphology
- The study of word structure, especially in terms of morphemes.
- Non-Combining Character.
- A character whose visual form normally stands on its own. However, the visual forms of other combining characters may attach to a non-combining character. Also, the visual form of a non-combining character may combine with that of another character to form a ligature.
- Non-Joiner.
- An invisible character which affects the joining behavior of surrounding characters.
- Non-Linear Display Order.
- The process or result of (1) writing symbols non-linearly with respect to the sounds or meanings to which they refer, or (2) displaying glyphs which represent characters in a non-linear order with respect to those characters. Most Indic script based writing systems employ non-linear display order. See Linear Display Order.
- Non-Spacing Mark.
- The primary term adopted by The Unicode Standard to refer to a combining character. The forms of most combining characters do not possess intrinsic spacing, i.e., the current display point is not advanced after their glyphs are rendered. Abbreviated as NSM. See Combining Character.
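In current Unicode data, non-spacing marks carry the general category `Mn`, which makes them easy to pick out. The sketch below decomposes a string and drops every `Mn` character, a common way to strip accents. The function name and sample text are assumptions made for this example.

```python
import unicodedata

def strip_nonspacing_marks(text):
    """Decompose text, then drop characters whose general category is Mn."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")

plain = strip_nonspacing_marks("caf\u00e9 na\u00efve")
print(plain)
```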
- Orthography. (right writing, correct display)
- One of the three (independent) components of a writing system which is responsible for determining the relation between the linguistic units represented by the writing system and the symbols used to represent those units; the rules used in laying out or making these symbols visible.
The three components are sometimes called (1) the words and other linguistic units in a particular language, (2) the symbols or character set, and (3) the mapping rules, although the rules extend beyond simply mapping. One can display English sentences in hieroglyphic "letters" without changing anything else. We can also change the rules without changing the language. Orthographic reforms change the way the language is displayed; they do not change the way the language is spoken.
- Orthographic Composition.
- The process of combining symbols in a writing system with one another as they are depicted in visual form. Compositionality of forms may be very little, as in printed English, or considerable, as in Arabic.
- Phoneme.
- A minimally distinct sound in the context of a particular spoken language. For example, in American English, /p/ and /b/ are distinct phonemes because pat and bat are distinct; however, the two different sounds of /t/ in tick and stick are not distinct, even though they are distinct in other languages, e.g., in Thai.
- Phonemic Writing.
- A writing system that incorporates the alphabetic principle of one sound per symbol. No system in use can be said to be isomorphic.
- A writing system which maps the symbols it employs onto the phonemes of the language being represented. The Finnish writing system is a good example of a nearly pure phonemic writing system; on the other hand, English and French writing systems are highly impure phonemic writing, since they both possess many silent letters, digraphs, multiple graphemic options for the same sound, and historically derived spellings unrelated to current pronunciation. There are phonemic writing systems for English but none of them have caught on.
- Phone.
- A minimum unit of sound articulated during speech. A single phoneme can materialize as various phones when actually pronounced, e.g., the /l/ in film is a different phone from that in limb. The process used in rendering Unicode of selecting a glyph based on context is similar to this process of selecting a phone based on phonemic context.
- Phonetic Writing.
- A writing system whose symbols (graphemes) correspond with the sounds articulated during speech production (phones), irrespective of whether the sounds are meaningfully distinct; e.g., the /t/ in English time and might would be written slightly differently in phonetic writing. The primary script used for phonetic writing is the International Phonetic Alphabet (IPA).
- Physical Order.
- An ordering of character code elements within a string such that the order corresponds to a particular external order; for example, display order, keyboard order, phonetic order, sorting order, etc., are all examples of particular physical orders. In text processing, this term usually refers to display device order, i.e., the order of glyphs needed to display some character text onto the device. Unicode employs a logical order which has some similarity to keyboard and phonetic order, although, stricto sensu, it is neither of these; it is a true logical order.
- Pidgin
- A convenient means of communication between two linguistic communities. Typically a simplified form of English with grammatical elements and a few words from the native language. Developed pidgins become creoles, distinct languages often with a written form.
- Pidgin forms of English developed as lingua franca when British traders from the 17th century on needed to communicate quickly with peoples of other languages. From a base of English mixed with the other tongue or tongues, a simple grammar with minimum vocabulary would develop, so that it could be picked up very quickly. Pidgins may appear long-winded in comparison with English when they have only a limited vocabulary.
- Plain Text.
- A sequence of character code elements which represents only the core or primary content of some textual information. Text which is unformatted with the exception of spaces and line breaks. Monospaced typewriter text. Text composed of ASCII characters. Cf. Rich Text.
- Precomposed Element.
- A character code element which represents a composite form that can also be represented by decomposed character code elements.
- Presentation Form.
- A visual form or shape (glyph) used to display a contextual variant of an abstract form. Normally, Unicode text does not encode presentation forms; however, for compatibility reasons, Unicode supports the encoding of certain presentation forms. See Compatibility Characters.
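Encoded presentation forms can be mapped back to their abstract characters through compatibility normalization. The sketch below uses Python's `unicodedata` module on two sample characters picked for illustration: an Arabic initial form and a Latin ligature. It shows only the data lookup and says nothing about how a rendering engine picks forms.

```python
import unicodedata

initial_beh = "\ufe91"  # ARABIC LETTER BEH INITIAL FORM
fi_ligature = "\ufb01"  # LATIN SMALL LIGATURE FI

# The compatibility mapping records which abstract character(s) a form depicts.
beh_mapping = unicodedata.decomposition(initial_beh)
nominal_beh = unicodedata.normalize("NFKC", initial_beh)
nominal_fi = unicodedata.normalize("NFKC", fi_ligature)

print(beh_mapping, unicodedata.name(nominal_beh), nominal_fi)
```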
- Primary Script.
- A script whose function is to represent the primary linguistic information in a writing system. All primary scripts represent either sound, meaning, or a combination of both.
- Pseudo-Script.
- A collection of symbols which is used to represent the secondary linguistic information in a writing system. For example, collections of punctuation, symbols, numbers, shapes, etc. all can be considered as pseudo-scripts. Also called a secondary script. See dingbats, printer's symbols, decorations.
- Radical.
- A component of a Han character (Hanzi) which designates one of a number of semantic categories. The traditional number of such radicals is 214.
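The figure of 214 can be cross-checked against Unicode's Kangxi Radicals block, which encodes the traditional radicals as separate characters. The block range used below (U+2F00 through U+2FD5) is stated here as an assumption of the sketch, and the lookup relies on Python's `unicodedata` module.

```python
import unicodedata

# Kangxi Radicals block, assumed here to span U+2F00 through U+2FD5.
kangxi = [chr(cp) for cp in range(0x2F00, 0x2FD6)]
radical_names = [unicodedata.name(ch) for ch in kangxi]

print(len(kangxi), radical_names[0], radical_names[-1])
```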
- Regularity, Consistency (How regular is English?)
- Depending on how one calculates regularity or consistency, TO is consistent 6% to 84% of the time. The best measure of consistency is how many words would be written the same way in a phonemic writing system. As the chart indicates, only 43% would be written the same way as in TO if English were written consistently. English uses 561 different graphemes to represent about 40 distinctive sounds.
- Regularized Spelling
- Regular means according to regulations or conforming to rules. Regularized spelling would mean conforming to the alphabetical rule: one symbol for each distinct sound. There are a number of different proposals or notational schemes for regularizing English spelling. Some limit themselves to 26 letters, which means they are at least 14 letters short of having a unigraphic symbol for each sound. These proposals use digraphs (2-letter combinations) to represent the missing sound signs. Others limit the character set to what can be found on the typewriter keyboard. Some, such as ITA, augment the Roman character set by creating ligatures and completely new letter forms. Some employ a non-Roman character set.
- Rendering.
- (1) the process of selecting and laying out glyphs for the purpose of depicting character data; (2) the process of making glyphs visible on a display device.
- Rendering Engine.
- The component of a composition and layout process which implements the Rendering function.
- Repertoire.
- In the context of character sets, the collection of characters represented by a character set. Some character sets have open repertoires, in which new characters can be constructed; others have closed repertoires, which do not allow characters to be constructed from other characters. Unicode is a closed repertoire character set, since new characters cannot be created. Note that the operation of combining characters in Unicode does not have the effect of creating new characters; rather, combining characters is like composing words out of letters: the results are simply a type of text element.
- Revisable Form Document.
- The form of a document in which its content and formatting information can be edited or revised. A Unicode plain text file is a form of a revisable form document.
- Rich Text.
- The result of adding additional information to Plain Text. Examples of information that can be added: font data, visual appearance (e.g., color), formatting information, phonetic annotations, interlinear text, etc. Unicode does not address the representation of rich text. It is expected that systems and applications will implement proprietary forms of rich text. Some public forms of rich text are available, e.g., ODA, SGML, etc. When everything but primary content is removed from rich text, only plain text should remain.
- Schwa, shwa, /shwah/, Heb. schewa
- An unstressed mid-central vowel heard at the end of such words as after and the and at the beginning of ago, acute, and America. In IPA this sound sign is represented by a rotated e. In NF and PMF it is an acute accent (half of an up sign ^). In SoundSpel it is a c.
- Script.
- (multiple related meanings) Something written. Text produced by scribes. A collection of symbols used to represent textual information in a writing system. The visible part of a writing system. A font that resembles handwriting, especially a running cursive style.
- Segment
- To cut, mark off, divide. A unit whose boundaries can be identified in the stream of speech. A phoneme is a discrete category or segment cut out of the continuum of speech sounds. It is a man-made artifact. The vowel sounds merge into one another, but most native speakers can discern a clear instance of a particular vowel. Those who are unfamiliar with the language may not discriminate two vowels (e.g., uh and ah) unless their own language also isolates these sounds.
- Segmentation
- The process of dividing or segmenting textual information into segments that serve some text processing need. For example, line layout requires segmenting the rendered form of text into visible lines; to perform this, it is often necessary to find syllable or word boundaries where line breaks can occur.
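- The line-layout case can be sketched as a greedy line breaker. This is a minimal illustration under simplifying assumptions: word boundaries are taken to be whitespace, and width is counted in characters rather than in rendered glyph widths. The function name `break_lines` is hypothetical.

```python
def break_lines(text, width):
    """Segment text into lines of at most `width` characters,
    breaking only at word boundaries (whitespace).
    A single word longer than `width` is kept whole on its own line."""
    lines = []
    current = ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if len(candidate) <= width or not current:
            current = candidate      # the word still fits (or starts a line)
        else:
            lines.append(current)    # close the line at the word boundary
            current = word
    if current:
        lines.append(current)
    return lines


for line in break_lines("the quick brown fox jumps", 10):
    print(line)
```

Python's standard `textwrap` module offers a fuller version of the same idea. Breaking inside words would additionally require finding syllable boundaries, which this sketch does not attempt.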
- Sign
- 1 A feature of language that conveys meaning, especially as used conventionally in a system; also called a symbol (conventional sign). 2 A mark used as an element in a writing system (e.g., letter or sound sign).
- Phonemic vs. Phonetic
- One of the simplest ways to distinguish between notational systems is to say that a phonetic system should be able to distinguish dialects.
- Anyone reading aloud from a phonemic notation can be understood by native speakers, but will not be duplicating native speech patterns. The accents and pitch will probably be off.
- Simplified Spelling Society
- An organization headquartered in England and founded in 1908 that supports the regularization of English orthography. They have been connected with the ITA movement (Initial Teaching Alphabet) and the Shaw Alphabet but this connection is through its members rather than direct. A similar organization in the USA was originally called ... but now is the American Literacy Council. Both support a phonemic system that was originally called New Spelling but now has several variants. The SSS also endorses Cut Spelling, a partially regularized orthography that corrects about 80% of the problems with TO.
- There seem to be four divisions among those on the SimSpel mailing list. Those who
- - want to retain as much of TO as possible (NS and CS)
- - want to retain the shape pattern of TO (CS)
- - want spelling to be a near perfect guide to pronunciation (NS, YS, NF, SS, AT)
- - want every letter used in a digraph to be pronounceable (YS, NF, SS)
- - are opposed to digraphs (e.g. ANJeL Tug, PMF)
- This is more than four because there are divisions within the divisions.
- No silent letters, no phoney phonics (AT, YS, CS). Cut Spelling eliminates the silent letters but retains ambiguous vowel markers, e.g., e for an a sound, a for an e sound, etc.
- Sound Sign
- Typically a mark used to represent a sound.
- Spacing Mark
- See Non-Combining Character.
- Speech
- The oral medium of transmission for (spoken) language
- Speech act
- An utterance defined in terms of the intentions of the speaker and the effect it has on the listener.
- Spelling (Orthography)
- Orthography means right-writing. There are two components of a spelling system: (1) one rule defines or limits what is a graphemically correct way to represent a sound (phoneme) and (2) the other rule determines which of the graphemic options is lexically correct. The traditional English writing system (TO) is said to have too many orthographic options. There are too many ways to represent a speech sound. According to Dewey, there is an average of over 20 ways to represent a vowel sound and about 9 ways to represent most consonant sounds.
- Spelling pronunciation
- The pronunciation of a word based on its spelling, e.g., of as /oaf/, said as /sah-eed/. Spelling pronunciation works in most languages, but in English the spelling doesn't provide much of a guide to pronunciation.
- Spelling Reform
- A movement to make spelling more regular in its relation to speech. The goal of the Simplified Spelling Society.
- Stress Mark
- A diacritical mark which denotes stress; e.g., in English dictionaries, primary and secondary stress are marked with acute and grave diacritic marks, respectively. [a 'go /^ 'gou/]
- Syllable.
- A basic unit of articulation which corresponds to a pulmonary pulse. A syllable consists of an onset and a rhyme; the rhyme consists of a nucleus and a coda. Some writing systems assign symbols to the onsets and rhymes of syllables; these are referred to as sub-syllabic writing systems.
- Syllabic Writing.
- A writing system whose symbols correspond to syllables. A number of syllabic writing systems are in use; in some instances, they are part of a larger writing system. Writing based on the Kana and Cree scripts uses syllabic correspondences. The Amharic writing system, based on the Ethiopian script, is probably best classified as syllabic writing. Some writing systems employ syllabic layout or punctuation devices, e.g., Korean Hangul and Tibetan both mark syllable boundaries, either through layout or by punctuation. Wilkins' notational system for English marked syllable boundaries by connecting both the vowel and the consonant to the same stem. Single marks in this system could represent VC or CV. By leaving out a feature, the stem could represent stand-alone Vs (vowels) and Cs (consonants).
- Symbol.
- 1 a sign which (by convention) has a form, and in some context, a particular function; written words and logograms represent spoken words and concepts. 2 a (pictographic) form which visually depicts or re-presents its meaning, e.g., the symbol (e.g., crescent) whose form is a picture of the moon, etc. A yellow crescent shape (w/o points) could, by virtue of the two criterial attributes (shape and color), be interpreted to refer to a banana. According to Berman, most pictures are to some extent conventionalized. In a pictographic alphabet, the crescent could be an acrophonic representation of the sound sign C. Both the sound of the shape and the shape of the sound could be derived from its name.
- Synonym
- Phrases can be synonymous without having any correspondences among their individual words, e.g., coherent optical radar = pulsed laser reconnaissance device. Phrasal synonymy vs. lexical synonymy. There is a synonym relation between any two members of a lattice (Anglin).
- implication
- application (outward-inward)
- origin, associations, connotations
- perspective or point of view: anger vs. wrath.
- Differences
- One term more general - refuse-reject
- more intensive repudiate - refuse
- more emotive reject-decline
- censure vs. neutral thrifty, economical - stingy, miserly
- more professional decease-death
- more literary passing - death
- more colloquial turn down - refuse
- more local (dialect) flesher-butcher
- more childish daddy - father
- Syn. are interchangeable in some contexts but not in others. Dr. Johnson: Words are seldom exactly synonymous. Macaulay: Syn. change the structure of a sentence, substitute one syn. for another and the whole effect is destroyed.
- Text Element.
- A minimum unit of text in relation to a particular text process, a particular writing system, and possibly in relation to the text itself. A single text element may consist of multiple code elements; on the other hand, a single code element may represent multiple text elements. In general, the mapping between text elements and code elements is many-to-many.
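- Both directions of the mapping can be shown with Python's standard `unicodedata` module. The sketch assumes that one element of a Python string (a code point) stands in for a "code element", and that an accented letter and a single Latin letter each count as one text element; other text processes may draw the boundaries differently.

```python
import unicodedata

# One text element, two code elements:
# the letter e (U+0065) followed by COMBINING ACUTE ACCENT (U+0301).
decomposed = "e\u0301"

# The same text element as one precomposed code element (U+00E9).
precomposed = "\u00e9"

print(len(decomposed))    # number of code elements in the combining sequence
print(len(precomposed))   # number of code elements in the precomposed form
print(unicodedata.normalize("NFC", decomposed) == precomposed)

# One code element, two letters: LATIN SMALL LIGATURE FI (U+FB01).
ligature = "\ufb01"
letters = unicodedata.normalize("NFKD", ligature)
print(len(ligature), len(letters), letters)
```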
- Unicode
- (1) a registered trademark of Unicode, Inc.; (2) a world-wide character encoding standard, originally based on a 16-bit unit of encoding, developed by Unicode, Inc.; (3) a usage profile of ISO/IEC 10646 UCS-2, the international character set standard based on the Unicode Standard. (See ASCII.)
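- The 16-bit unit of encoding can be inspected with Python's built-in `utf-16-be` codec. This is only a sketch, and the helper name `utf16_units` is hypothetical. The last line relies on a fact about the current UTF-16 encoding form rather than on this glossary entry: characters outside the first 65,536 code points are encoded as two 16-bit units.

```python
def utf16_units(ch):
    """Return the 16-bit code units that encode `ch` in big-endian UTF-16."""
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


print([hex(u) for u in utf16_units("A")])           # Latin capital A
print([hex(u) for u in utf16_units("\u0259")])      # schwa, the rotated e of IPA
print([hex(u) for u in utf16_units("\U0001F600")])  # a character beyond 16 bits
```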
- Universal Character Set.
- A character set whose goal is to (1) adequately represent all written language; (2) represent all existing character data.
- Visual Order.
- A physical order based on display device order.
- Writing Direction.
- The direction or orientation of writing lines of text in a writing system. Three directions are common in modern writing systems: left to right, right to left, and top to bottom. Egyptian hieroglyphics used all three, with a preference in cursive for retrograde. A writing direction found in ancient Greek (ca. 900 b.c.) was "as the ox plows" or boustrophedon. This would alternate between left to right and retrograde.
- Writing System.
- A language, one or more scripts, and an orthography, which, together, account for a particular form of a written language. We can thus talk about the American English writing system, the British English writing system, the French writing system, etc. These different writing systems, though sharing one thing in common, i.e., use of the Latin script, nevertheless represent either different languages or different orthographic rules applied to the same language.
- Written Language.
- A more permanent non-oral form of language. A writing system is the means by which a language is written.
The entries in bold face have been reviewed and revised at least once. Volunteers are needed to make this list more useful. Some entries need to be removed, many others need to be added. Existing entries need to be rewritten. Send your suggestions to Steve Bett - sbett@oecrc.org or call 1-800-417-3950
See also the Unicode and Internationalization Glossary developed by Glen Adams in 1993