Wednesday, 14 December 2016

"Hznai" and words as pattern recognition.


“It dseno’t mtaetr in waht oerdr the ltteres in a wrod are, the olny iproamtnt tihng is taht the frsit and lsat ltteer be in the rghit pclae. The rset can be a taotl mses and you can sitll raed it whotuit a pboerlm.”
The above passage is taken from an interesting article written by a web-acquaintance of mine. This is an example of typoglycemia. The explanation for this is that generally we treat written words primarily as patterns. We only tend to look at each individual letter if the pattern is unfamiliar such as in a word that we do not know. Unlike Chinese, English words are composed of letters and their arrangement will give us some help in determining how the unfamiliar word is pronounced. Of course, given the numerous irregular rules of English the word may be pronounced in a quite different way to what the component letters may suggest to the reader!
That words are not fully read has been established by various experiments. Most of us read much faster than we could if it were necessary to register every letter. Many of us can also correctly comprehend a larger number of words than we can correctly spell. In one test I saw, participants rapidly read a prepared text out loud. Unbeknown to them, the text had deliberate mistakes such as “bifferent”. This was pronounced as “different” on the playback. Context doubtless also had an effect on such corrections.
One of the things that strikes me about the above passage is that it seems to suggest there is a fairly limited selection of commonly used word endings in English:
 “-m”, “-n” and “-ng” are used.
“-d” and “-t”
“-r” and “-l”
“-k” and “-g”
“-s”
“-z” and “-x” and some other letters are rarer but not unknown.
Vowel endings are also used. “-y” is phonetically “-i”. Some of the “-e” endings would probably be replaced by the above consonants if the words were first rendered into a more phonetic form. Words like “bole” or “fare” have homophones such as “bowl” or “fair”.
In a more practical vein grouping words by “first letter, last letter, approximate length” may greatly improve the capabilities of search engines and similar systems. One can envision a search engine mode where one enters the first and last letter and the word length. Words of six to nine letters would be grouped together, as would words of eight letters or more.
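As a rough illustration of this idea, here is a minimal Python sketch of an index keyed on first letter, last letter and length. The function names and the one-letter length tolerance are assumptions made for the example, not a description of how any real search engine works.

```python
from collections import defaultdict


def shape_key(word):
    """Return the (first letter, last letter, length) 'shape' of a word."""
    w = word.lower()
    return (w[0], w[-1], len(w))


def build_index(words):
    """Group known words by their shape."""
    index = defaultdict(set)
    for word in words:
        if word:  # skip empty strings
            index[shape_key(word)].add(word.lower())
    return index


def lookup(index, first, last, length, tolerance=1):
    """Find words with the given first and last letter whose length is
    within `tolerance` letters of `length` (the 'approximate length')."""
    matches = set()
    low = max(1, length - tolerance)
    for n in range(low, length + tolerance + 1):
        matches |= index.get((first.lower(), last.lower(), n), set())
    return sorted(matches)


vocabulary = ["problem", "without", "matter", "important", "letter"]
index = build_index(vocabulary)

# A scrambled word from the passage still has the shape of the real word.
candidates = lookup(index, *shape_key("pboerlm"))
```

With this small vocabulary the scrambled “pboerlm” leads back to “problem”. A real system would still need context or letter counts to choose between several words of the same shape.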

Wednesday, 14 September 2016

Phonetic Consonants in English


             Following the post on phonetic representation of vowels in English it is only logical that I make some comments on consonants.
            The good news is that the majority of consonants in English only have one phoneme. The bad news is that consonants are sometimes silent.
            Three consonants are unnecessary and are not used in phonetic spelling. These are C, Q and X.
            C in English either represents an “S” sound or a “K”.
            Q represents a “kw” sound that is more usefully represented by these letters. The “u” that customarily follows a Q is usually silent.
            X in English is generally pronounced as “ks”, or as a “z” at the start of a word. When it is preceded by an “e” the sound is often “eks”. “Taxi” can be rendered phonetically as either “taksi” or “takzi”, depending on dialect.
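The replacement of C, Q and X can be sketched as a handful of ordered substitutions. This is only a rough illustration: the rule list and the function name are assumptions made for the example. It uses the usual spelling convention that C is soft before “e”, “i” or “y”, renders X as “ks” as in “taksi”, and leaves the digraph “ch” untouched.

```python
import re

# Ordered rules: earlier rules run first. These are illustrative
# assumptions, not a complete description of English spelling.
RULES = [
    (r"ph", "f"),            # PH is represented by "f"
    (r"qu", "kw"),           # Q plus its customary "u" becomes "kw"
    (r"q", "k"),             # a bare Q falls back to "k"
    (r"x", "ks"),            # X as in "taksi"
    (r"ck", "k"),            # avoid a doubled "kk" in words like "quick"
    (r"c(?=[eiy])", "s"),    # soft C before e, i or y
    (r"c(?!h)", "k"),        # any other C, except in the digraph "ch"
]


def respell_consonants(word):
    """Replace C, Q, X and PH with more phonetic letters."""
    out = word.lower()
    for pattern, replacement in RULES:
        out = re.sub(pattern, replacement, out)
    return out


examples = {w: respell_consonants(w) for w in ["cat", "city", "taxi", "quick", "phone", "church"]}
```

Here “cat”, “city” and “quick” come out as “kat”, “sity” and “kwik”, while “church” is left alone because its only C's belong to “ch”. Vowels and silent letters are not touched by this sketch.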
            G is a consonant that has two phonemes, being either a “g” sound or a “j”. It is also a silent letter in some words. Phonetically G is used for a hard “g” and “j” is used for “j” sounds.
            J is a relatively young letter, dating back to the Middle Ages. In other words it was necessary to create a letter to represent a phoneme that was in common use. It is therefore a little surprising that “j” was a letter that Benjamin Franklin did not include in his phonetic alphabet. Instead he represented the sound with his letters representing “dsh”.
            Another letter Franklin eliminated was “W”. While some nationalities have trouble with pronouncing “w” it is a distinct phoneme in English. Like “j” it is a relatively new letter that came into common use in the early Middle Ages. Franklin represented “hw”/ “wh” and “w” with letter combinations such as “hu” and “uu”.
            Y is a distinct phoneme when at the start of a word or syllable. In English pronunciation “yog” is phonetically distinct from “jog”, for example.
            When H is placed after another consonant it generally has a softening effect. Some of the exceptions to this constitute some of the most widely used consonant digraphs.
            SH is used in words such as “shush”.
            CH is often rendered as TSH in many phonetic systems. An argument can be made that in the initial position “ch” may have a softer sound, closer to “jh”. This gives us the words “jhurtsh” and “jhiyna” for “church” and “china”.
            TH in English has two phonemes. It has an “f” sound in words such as “three” or “thigh”. This is rendered as “th” in many phonetic systems although “fh” may be closer in actual sound. TH words with a “d” or “v” like sound may be phonetically spelt with a “dh”. Many of the “dh” words are determiners or pronouns and include dhe, dhey, dhem, dhis, dhat, dhez, dhouz, dhayr and widh.
            PH is inherited from Greek and is phonetically represented by “f”.

Tuesday, 13 September 2016

20(ish) Vowels.

Officially English has twenty vowel sounds but only five vowel letters. The other vowel sounds are represented by combinations of two or more letters. Unfortunately most of these vowel sounds can be represented by multiple different combinations with no apparent logic or consistency. For example, the words “boot” and “hook” have quite different pronunciations, both of them closer to “u” sounds than “o”.
If a more phonetic form of English is desired then the vowel sounds seem a very logical place to start.
The five “basic” vowels are:
Sound    Usually written    Examples
/æ/      a                  mat, pat, lap
/ɛ/      e                  met, pet, let
/ɪ/      i                  bin, pit, lip
/ɒ/      o                  rot, pot, lot
/ʌ/      u                  fun, sun, luck = fun, sun, luhk
 
These are relatively consistent, so we will move on to the other vowel sounds. In the second column I suggest standardized letter combinations to represent these. Further discussion of these is in the section below:
Sound    Suggested spelling    Examples
/eɪ/     ay/ ey                wait, day, late = wayt, day, layt
/ɑ:/     ar/ aa                far, car
/eə/     er/ ayr               air, care, where = ayr, kayr, wayr
/iː/     ii                    sheep, meat, fiend, elite = shiip, miit, fiind, ayliit/ eyliit
/ɪə/     ir                    steer, near, here = stir, nir, hir
/ɜ:/     ur                    stir, her, word, bird, hurt = stur, hur, wurd, burd, hurt
/aɪ/     ai/ iy                I, sign, fight, dry, ice = ai, sain, fait, drai, ais/ iy, siyn, fiyt, driy, iys
/u:/     u/ uu                 do, doom, through, boot = du/ duu, duum, thru/ thruu, buut
/ɔɪ/     oy                    coin, toy = koyn, toy
/əʊ/     oh                    boat, note, snow, know = boht, noht, snoh, noh
/ɔː/     or                    for, oar, worn, door, more, saw, paw, lore = for, or, worn, dor, mor, sor, por, lor
/aʊ/     ou                    sound, cow, how, now = sound, kou, hou, nou
/ʊ/      u                     look, hook = luk, huk
/juː/    yu                    few, due, cube = fyu, dyu, kyub
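
To show how the two tables might be applied mechanically, here is a minimal Python sketch that respells a word given as a list of phonemes. Several things here are assumptions made for the example: where the table offers two spellings only one is chosen, the “or” row is keyed as /ɔː/ (the sound of “saw” and “sore”), consonants are passed through unchanged, and the function name is invented.

```python
# One spelling per vowel sound, taken from the tables above.
# Where two spellings are suggested, one is picked arbitrarily.
VOWEL_SPELLING = {
    "æ": "a", "ɛ": "e", "ɪ": "i", "ɒ": "o", "ʌ": "u",
    "eɪ": "ay", "ɑː": "ar", "eə": "ayr", "iː": "ii", "ɪə": "ir",
    "ɜː": "ur", "aɪ": "iy", "uː": "uu", "ɔɪ": "oy", "əʊ": "oh",
    "ɔː": "or", "aʊ": "ou", "ʊ": "u", "juː": "yu",
}


def respell(phonemes):
    """Join a list of phonemes into a phonetic spelling.

    Vowel symbols are looked up in VOWEL_SPELLING; anything else
    (assumed to be a consonant already in letter form) is kept as is.
    """
    letters = []
    for p in phonemes:
        key = p.replace(":", "ː")  # accept an ASCII colon as a length mark
        letters.append(VOWEL_SPELLING.get(key, p))
    return "".join(letters)


words = [
    respell(["sh", "iː", "p"]),
    respell(["k", "ɔɪ", "n"]),
    respell(["f", "aɪ", "t"]),
    respell(["b", "əʊ", "t"]),
]
```

The four calls give “shiip”, “koyn”, “fiyt” and “boht”, matching the examples in the table. Note that /ʌ/ and /ʊ/ both map to “u” in this sketch, which is exactly the “luck”/“look” clash.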
 
One of the surprises in constructing this table is the variability of how “u” is used. The dictionary insists words like “do” and “through” have a “long u” (/u:/) while I would be inclined to pronounce them “du” and “thru”. Indeed, this is how I would be inclined to pronounce any word ending in a “u”. “Look” is obviously a short “u” sound, “luk”. Spelling “luck” phonetically in the above system gives us “luk” too, although the pronunciation is obviously different, hence I used “luhk”. Further examination of the ways “u” is used as a phoneme may be needed.
The “i” in words like “high” or “wire” is represented by “ay” in SaypYu. To my mind this is too likely to be taken as “-ay” as in “may” by English speakers. As far as I know “ay” is always pronounced “/eɪ/” in English. SaypYu’s use of “ay” requires the unnecessary respelling of many perfectly reasonably spelt English words. In the past I have suggested that this should be “ai”, pronounced as in “thai”. A good case can be made for instead using the letter combination “iy”. Thus words such as “fire” become “fiyr”, which is fairly easy to comprehend.
SaypYu uses “ey” to represent the “a” sound in words such as “may” or “same”. If “iy” or “ai” is used to represent the gliding “i”, then “ay” can be used to represent the gliding “a”, which is more easily comprehended by English speakers. There may be a case for using both “ey” and “ay”. “Ey” would be used for “/eɪ/” sounds not traditionally spelt “ay”. One problem with phonetically spelling English is that it increases the number of homographs, something that the language hardly needs!
“er/ ayr” is another case where two different spellings may be used. Words such as “air”, “care” and “where” are more easily comprehended using the “ayr” spelling : “ayr”, “kayr”, “wayr”. The spelling of other words may be clearer using what is effectively a rhotic form of schwa. The distinction between these two may be more pronounced in some accents and dialects.
The long “a” is another phoneme that might be represented two ways. In the above table examples are given of a rhotic form but this may not be applicable for some words. Alternatively, “ah” may be used. Is “father” better spelt “fardher”, “fahdher” or “faadher”?
/ɔː/ as in “saw” or “sore” would usually be represented by “or”. For some uses such as /ɔːl/ or /ɔl/ in “ball” the use of “au” to create “aul” might be clearer.

Careful readers will have realised that the above options give this system more than twenty vowel sounds. They may also have noted that the two tables only have nineteen rows! The missing vowel sound is schwa, which is not represented by a letter in English. Unlike SaypYu I have attempted to create a vowel system that for the most part uses existing English constructions. This system should be comprehensible to native English speakers as well as easy for non-native speakers to learn. Schwa will usually be represented by “e” although in some words it may be clearer if another vowel is used.
The above vowel system is relatively easy to remember. Firstly, you have the digraphs that end in “-r” : ar, er, ir, or, and ur, to which we might add “ayr”.
Then come the doubled letters : ii, uu, au and possibly aa.
Next come the digraphs with a “y” in : yu, ay, ey, iy, oy.
And last, the other “o”s : oh, ou.
This gives seventeen vowels, which with schwa and the single letter vowels is twenty-three in total. Some of these vowels represent the same or similar phonemes. ah, eh, ih and uh might also be considered to be digraph vowels.  This gives us a much more logical and intuitive system than traditional English.
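The tally above can be checked with a few lines of Python. The sketch also includes a simple longest-match scan that picks the vowel units out of a respelled word. The scan is an assumption about how such spellings might be read mechanically, and its names are invented for the example. It will misparse some words, for instance by taking the “ir” in “spirit” as a single unit.

```python
# The vowel groups as listed above.
GROUPS = {
    "ends in r": ["ar", "er", "ir", "or", "ur", "ayr"],
    "doubled": ["ii", "uu", "au", "aa"],
    "with y": ["yu", "ay", "ey", "iy", "oy"],
    "other o": ["oh", "ou"],
}
SINGLE = ["a", "e", "i", "o", "u"]

combined = sum(len(units) for units in GROUPS.values())
total = combined + len(SINGLE) + 1  # the extra one is schwa

# Longest units first, so "ayr" is tried before "ay" and "a".
UNITS = sorted(
    {u for units in GROUPS.values() for u in units} | set(SINGLE),
    key=len,
    reverse=True,
)


def vowel_units(word):
    """Scan left to right, taking the longest vowel unit at each position."""
    found = []
    i = 0
    while i < len(word):
        for unit in UNITS:
            if word.startswith(unit, i):
                found.append(unit)
                i += len(unit)
                break
        else:
            i += 1  # not a vowel unit, so treat the letter as a consonant
    return found
```

For example, `vowel_units("kayr")` finds the single unit “ayr”, and `vowel_units("hyumen")` finds “yu” followed by “e”.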
An example:
“Aul hyumen biingz ar born frii and iikwel in digniti and riytz. Dhey ar endoud widh riizen and konshens and shud akt tewordz wun enodher in ey spirit ov brodherhud.”

Wednesday, 13 April 2016

Superlatives and Comparatives.

English has two ways to form comparatives and superlatives. The first is by preceding the item being described by the adverb “more” or “most”. This is the system used in many European languages and also in Mandarin. Some languages, such as Portuguese, use one word: “mais” means “more” and “o mais” means “the most”. The word “most” in English is somewhat ambiguous. “Most red” means nothing discussed is more red. “Most people” means the majority, not the entirety.

The second commonly used system uses the suffixes -er and -est. “-er” is also used to create agent nouns in English. It is also used for words that are neither comparatives nor agent nouns. Its actual pronunciation in RP English is “-ə”.

Both systems are widely used in English, the choice being determined by the number of syllables in the word being modified. The system used in Diinlang needs to be simpler to learn but remain versatile.

The first draft of Diinlang used the suffixes “-ha” and “-ho” for the comparative and superlative. Observing that the “h” sound could sometimes be problematic for my Portuguese-speaking friends, I then changed this to “-tah” and “-toh”. The latest idea is to instead convert these to prefixes. This is easier to learn for speakers of the many languages that form comparatives and superlatives with a word before the word of interest. It also maintains a convenient single word form for when the comparative or superlative word is used as an adjective.
Many quantities in English are described by a number of words. Temperature, for example is described by “hot”, “cold”, “warm”, “cool”, “tepid” etc. For Diinlang we want a logical system that is easier to learn. It should be easy and logical to deduce the word for a smaller or larger quantity of a property. The system I propose for Diinlang uses the prefixes “et/mes/tai”. “tai” comes from Chinese and is used in terms such as “tai chi” which means “great ultimate”. It also means “the highest part of a roof”. “et” is a diminutive used in some English words such as “bomblet”. “et” therefore means a small amount of something, “tai” a large amount.
To illustrate how this works, let us assume that the word for temperature is “hii”. This is adapted from the Dutton speedword for heat, “he”. Cold is “he-x”, meaning “opposite of heat” and temperature is actually “gre-he” where “gre” means “grade, degree or stage”.
“taihii” would mean hot or high temperature.
“ethii” would mean cold.
“meshii” would mean medium heat. This can be taken as a temperature comfortable for human beings.
“etmeshii” and “mestaihii” represent cool and warm temperatures.
With the comparative prefix added “tataihii” means hotter and thus “totaihii” is “hottest”.
With this basic system you only need to know the core word for weight, number, mass, height etc to form the derived words for large or small quantities, comparatives or superlatives.
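The derivation is regular enough to sketch in a few lines of Python. The function names are invented for the example, and the comparative and superlative prefixes “ta” and “to” are taken from the “tataihii” and “totaihii” forms above.

```python
# Degree prefixes from smallest to largest amount of a property.
DEGREES = ["et", "etmes", "mes", "mestai", "tai"]


def degree(core, prefix):
    """Attach a degree prefix to a core word, e.g. 'tai' + 'hii'."""
    return prefix + core


def comparative(word):
    """Form the comparative with the 'ta' prefix."""
    return "ta" + word


def superlative(word):
    """Form the superlative with the 'to' prefix."""
    return "to" + word


# The full temperature scale built from the core word "hii".
scale = [degree("hii", p) for p in DEGREES]
hotter = comparative(degree("hii", "tai"))
hottest = superlative(degree("hii", "tai"))
```

Swapping “hii” for the core word of any other property would give the same five-step scale for that property.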
A superlative or comparative usually needs to be compared with something. In English this is often introduced by the word “than”. “Your porridge is hotter than mine!” One option in Diinlang is to use “di” as the comparative conjunction. In many languages the equivalent to di (of/from) is used in this way.
In English comparisons are also made using the word “as”, particularly when the two things are regarded as similar. “You are nearly as tall as me!” Note the “as...as...” format, although the first “as” is sometimes omitted. “as” is a nice, compact word but with a definition that is hard to pin down. Possibly in Diinlang “as” can be used as a more general purpose conjunction and used instead of “than” even when there is a considerable difference between the items.
“Ti bi tataihii as mi” = “You are hotter than me”

Tuesday, 12 April 2016

Plurals, Gender and Possession.


This post continues the introduction of some of the basic framework of Diinlang.
Plurals.
As may have been deduced from the previous posts, plurals are formed by the addition of -z at the end. Phonetically this is the same as an -s ending is usually pronounced in English. The -z ending is used on nouns and also used to make the plural pronouns. “we”, “they”, “us”, “them”, “these” and “those” are all created by adding a -z to the equivalent singular pronoun. Hence we have miz, ziz, saz and siz. The z can also be added to the one letter words to form their plural. If a word ends in “-s” or some other construction that does not euphonically mesh with “-z” then “-iz” can be used instead.
The -z can be dropped if the sentence has an obvious indicator of plurality. “Three coffee” is an acceptable construction since it contains a plural number.
Ideally in Diinlang the only words ending in -z will be plurals. In an older draft I had “plu” for amount/much and “pluz” for number/ many. This will probably be changed.
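The plural rule can be sketched in Python as follows. The list of endings that do not mesh with “-z” is a guess made for the example, since only “-s” is named above, and the function and parameter names are invented.

```python
# Endings assumed to clash with a following "-z". Only "-s" is
# stated in the text; the others are guesses for illustration.
AWKWARD_ENDINGS = ("s", "z", "sh", "ch", "x")


def pluralise(word, marked_elsewhere=False):
    """Form a Diinlang plural.

    If the sentence already shows plurality (for example with a
    number), the word can be left unchanged.
    """
    if marked_elsewhere:
        return word
    if word.endswith(AWKWARD_ENDINGS):
        return word + "iz"
    return word + "z"


pronouns = [pluralise(p) for p in ["mi", "zi", "sa", "si"]]
```

The pronoun list comes out as miz, ziz, saz and siz. For “three coffee”, `pluralise("coffee", marked_elsewhere=True)` leaves the noun unchanged.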
Gender.
Most words in Diinlang are of neutral gender. One way to indicate the gender of an individual is to compound their designation with the relevant singular third person pronoun. This is the same as is sometimes done in English with constructions such as “she-wolf”. Another way to indicate gender is to add an -o suffix for a male or an -a suffix for a female. Since it is planned that most words in Diinlang end in -m, -n, -ng, -i or -u, the -o or -a endings can be added without needing to substitute letters. Obviously we want to avoid ungendered words that end in o or a. A workaround may be to spell such words more phonetically with an -oh or -ah but this is not entirely satisfactory. Nor is the fact that only nouns are likely to be gendered in this way.
Some pronouns take their gender using the same convention. The third person neuter singular pronoun “zi” can become “zio” or “zia” to mean “he” or “she”. In single letter form this becomes “zo” and “za”. Plural gendered constructions are also possible. A body of males could be referred to as “zoz”. “Ze” will most probably be used instead of “zi”. 
A number of non-noun words end in -o or -a. These include “ya”, “no”, “sa” and “so”, meaning “yes”, “no”, “this/here” and “that yonder”.

A simpler approach may be to gender nouns with -zo and -za, which agrees with the system proposed for gendered agent nouns and maintains the option for neutral words ending in -o or -a. Non-agent nouns can be gendered by using “zo” and “za” as prefixes.
Possession.
The use of the apostrophe, particularly for possession, is something that seems to baffle many native English speakers. A basic guideline is that if a word is both plural and ending in -s put an apostrophe at the end. If it is not both plural and ending in -s then add -’s. Children is plural but does not end in -s so becomes “children’s”. Not that difficult! Of course, English being the eccentric language it is there are oddities. Possessive pronouns such as “mine”, “yours”, “his”, “hers” and “whose” don’t take apostrophes, but “one’s” does.  
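The English guideline really is only two branches, as this small Python sketch shows. The function name is invented for the example. The caller has to say whether the noun is plural, because that cannot be worked out from the spelling alone.

```python
def possessive(noun, plural=False):
    """Apply the basic English possessive guideline.

    Plural and ending in -s: add an apostrophe.
    Anything else: add apostrophe plus s.
    """
    if plural and noun.endswith("s"):
        return noun + "'"
    return noun + "'s"


forms = [
    possessive("children", plural=True),  # plural, no final -s
    possessive("dogs", plural=True),      # plural, ends in -s
    possessive("boss"),                   # singular, ends in -s
    possessive("John"),
]
```

The possessive pronouns mentioned above (“mine”, “yours”, “his”, “hers”, “whose”) sit outside this rule and would need a separate lookup.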
There is no possessive apostrophe in Diinlang. In Diinlang there are several ways to indicate possession. One is the “_ of xxx” construction used in many European languages. The Diinlang word for “of” or “from” is “di” which can be represented by the single letter “d”. Incidentally, rather than saying “a play by Shakespeare”, in Diinlang the construction would translate as “a play from Shakespeare”, so “d” or “di” is used.
Possession can also be indicated by using the noun or pronoun as an adjective. “John’s book” and “his book” translate as “John book” and “he book”. Since this is a noun phrase this construction will often have an article before the noun or pronoun, for example “the John book” or “those John books”.
Sometimes there is a need to emphasise possession. In English you might say “Dean and myself got beers. I held his”. To an English speaker it is obvious that it is Dean’s beer that I was holding. In Diinlang “his” is usually replaced by “zio”. Translated this way, the sentence could be read as “I held him”. When the possessive nature of a noun or pronoun needs emphasis the marker “vo” is placed after it. “I held his” would be correctly written “mi held zio vo” or “m held zo vo”.