#20 – Marks = Letters = Numbers?


After finishing off the list of long-dead letters, we come to another question, the titular one: are marks, letters, and numbers all the same kind of thing?  Whilst we use marks and numbers to represent linguistic features within the text we write and type, many would not necessarily consider them letters of the alphabet.  They are, however, graphemes, and whilst most of us know what is and is not a letter, they could act as alphabets of their own in underlying orthographies.  The set of punctuation marks (which could be considered suprasegmental – units above the base context level) would certainly make up a grammatical inventory, whilst the numerical digits and the function symbols we use as shorthand for mathematical functions can surely make up a numerical and mathematical inventory (and who is to say that these are not separate inventories?).  Suffice to say, an alphabet, in terms of orthographies, is rather more a concept than a defined term to the lay speaker.

Letters occupy the same mysterious ground of being a concept, and are even less well defined within linguistics.  Notice how we have different words for the different layers of our alphabetic inventory in English, and how we sometimes use them interchangeably as well:

  • Letters – for obvious members of the written alphabet regardless of typeface.
  • Digits – for numbers.
  • Marks – for elements of punctuation that often seem too small to be a letter.
  • Symbols/Signs – for functions and uncategorised elements.

But all of these need some form of overarching consistency, as the conceptual nature of the spoken categories is not quite enough.  Such consistency would also help to classify the members of a writing system into groups depending upon their features.  And features, in linguistics, are a pretty useful way to define abstract concepts and predict what they are and are not going to do in a given environment.  The ability and desire to classify things is something we humans find innate, and it is the very reason preserved communicative media (writings) were designed.

So, we definitely have a term to describe all of the above: graphemes.  A grapheme is the smallest contrastive unit of a given writing system.  There are base graphemes, which could be any of the following: <ƿ g H 9 ? 2 & d f Ð>.  These are the obvious, base units of writing systems that convey information to the reader about what is going on.  Diacritical graphemes exist also, which are ‘extra’ units that are imposed upon a base grapheme to convey further detail about it – examples include the accents upon most vowels and some other letters in a lot of European orthographies, both vowel and voice markers in Punjabi and Japanese, and super-/subscript markers in the IPA, amongst others.  Lastly, suprasegmental graphemes are those that do not necessarily add meaning to the graphemic unit, but instead represent information about structure, and are often considered too small to be a true grapheme – examples would include commas, full stops, and spaces, amongst many more.
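For anyone who likes to see a classification in action, here is a minimal Python sketch.  It leans on the general categories that the Unicode standard assigns to every character and maps them onto the everyday labels used in this post.  The mapping itself, and the names EVERYDAY_LABELS, everyday_label, and split_diacritics, are my own assumptions for illustration; Unicode's categories do not line up exactly with the base/diacritical/suprasegmental split described above (a comma and a question mark both count as punctuation there, for instance).

```python
import unicodedata

# Hypothetical mapping (an assumption for illustration, not a standard):
# first letter of a Unicode general category -> everyday label from this post.
EVERYDAY_LABELS = {
    "L": "letter",     # Lu, Ll, Lo, ...
    "N": "digit",      # Nd, Nl, No
    "P": "mark",       # punctuation
    "S": "symbol",     # maths, currency, and other symbols
    "M": "diacritic",  # combining marks
    "Z": "space",      # separators
}


def everyday_label(grapheme):
    """Return a rough everyday label for a single character."""
    major_class = unicodedata.category(grapheme)[0]
    return EVERYDAY_LABELS.get(major_class, "other")


def split_diacritics(text):
    """Decompose text (NFD) so each diacritic becomes its own character."""
    return list(unicodedata.normalize("NFD", text))


if __name__ == "__main__":
    # Wynn, a digit, a question mark, a plus sign, and e-acute decomposed.
    for grapheme in ["\u01bf", "9", "?", "+"] + split_diacritics("\u00e9"):
        print(unicodedata.name(grapheme), "->", everyday_label(grapheme))
```

Decomposing <é> gives a base letter followed by a combining accent, which is a small picture of a diacritical grapheme being imposed upon a base grapheme.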

We have a great many mathematical functions, so representing every term in mathematics would be unreasonable: the list would never end, as the jargon graphemes of other academic and creative areas would need adding to it as well.  A line therefore has to be drawn around what is commonly used in mathematics, which would be a basic subset of signs, <+ – ÷ × =>, and the numerals themselves.  Numerals, in vastly different orthographies, are often presented on charts alongside the alphabets as well, so the idea that there are separate sets of graphemes for separate things is already present at an underlying level.

We also have a lot of markers for punctuation that serve morphosyntactic, phonological, phonetic, and lexical functions in order to give greater depth of meaning without longwinded explanations.  A few of them are base graphemes, like the ampersand, question mark, and exclamation mark, whilst some of them represent stress patterns from quotes, humour, or sarcasm, and others represent loudness, curiosity, phrasal breaks and pauses, lists – and the list goes on.  This mixture of suprasegmental and base graphemes fills up the grammatical orthographic inventory of English.

‘Alphabet’ has been a widely used term for a writing system, but it would not be a fitting description of the subsets of grammatical and numerical graphemes, as they are not alphabets to most people (though they are to some) – hence the avoidance of this term.  The term would again not fit language orthographies such as Japanese Hiragana & Katakana (which are syllabaries), Hindi Devanagari & Punjabi Gurmukhi (abugidas), or Mandarin Hanzi & Japanese Kanji (logographies).  Alphabets, linguistically, tend to be systems where strings of graphemes form syllables but remain separate components, so one syllable = x components.  Syllabaries are where each grapheme represents a syllable, so one syllable = one component.  Abugidas are where graphemes represent a syllable but can almost be defined as empty, where diacritical graphemes must be added to complete the syllable (kind of like a mixture between alphabetic and syllabic orthographies), so one syllable = one component + x diacritics.  Logographies are where one grapheme represents at least one syllable but carries no inherent information as to its pronunciation, and so the system gets its name from each grapheme being a remembered ‘logo’ – emojis form a modern digital (chronolectal) logography.
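The four groupings can be sketched in Python as a simple lookup.  This assumes that the first word of a character's official Unicode name is a good enough stand-in for its script, which holds for the examples in this post but is not a general rule.  The table SYSTEM_BY_SCRIPT and the function writing_system are hypothetical names of my own, and the table only covers the orthographies mentioned above.

```python
import unicodedata

# Hypothetical lookup (covers only the scripts named in this post):
# first word of a Unicode character name -> type of writing system.
SYSTEM_BY_SCRIPT = {
    "LATIN": "alphabet",
    "HIRAGANA": "syllabary",
    "KATAKANA": "syllabary",
    "DEVANAGARI": "abugida",
    "GURMUKHI": "abugida",
    "CJK": "logography",
}


def writing_system(grapheme):
    """Guess the writing-system type of a single character from its name."""
    name = unicodedata.name(grapheme, "")
    script = name.split(" ")[0]
    return SYSTEM_BY_SCRIPT.get(script, "unknown")


if __name__ == "__main__":
    # Latin a, Hiragana ka, Devanagari ka, Gurmukhi ka, and the Hanzi/Kanji U+6F22.
    for grapheme in ["a", "\u304b", "\u0915", "\u0a15", "\u6f22"]:
        print(unicodedata.name(grapheme), "->", writing_system(grapheme))

    # An abugida syllable: one component plus one diacritic (Devanagari "ki").
    print([unicodedata.name(part) for part in "\u0915\u093f"])
```

The last line prints the two parts of a single Devanagari syllable, a letter and a vowel sign, which is the ‘one component + x diacritics’ pattern in miniature.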

So, in short, marks, numbers, digits, symbols, letters, and signs are all essentially the same thing.  They are conceptual and do differ somewhat in flavour but all of them are graphemes and all are used to communicate and preserve language.  There are so many orthographies out there with so very many different ways of capturing this communication – anyone wanna make their own?

After talking about graphemes of all kinds, be they single, double, or triple combinations, ligatures, or extinct graphemes, I want to turn to a small set of little-known grammatical graphemes that were proposed – the grammatical markers that never were…

-DP, Linguistics student
