Introduction | Control Panel | The Tests | Notes | My Fonts |
This page provides a set of test cases for a Tai Tham renderer. It has been compiled with a view to putting the 'Universal Shaping Engine' through its paces for the Tai Tham script. This set of tests is incomplete in that it does not directly give the correct renderings, although someone in possession of the source documents could visually check them.
I was originally requested to provide words of one syllable for such a test. By syllable, I understand an Indic syllable of the form C+(M*V*M*C*M*)* with a single base consonant. (M = miscellaneous mark). I include cases where the rôle of the base consonant is played by something other than a letter. The post-vocalic consonants occur not only in other SEA Indic scripts such as the Khmer script and Lao script (as in the use of ຽ U+0EBD LAO SEMIVOWEL SIGN NYO in the Lao writing system), but also in Tibetan.
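The syllable pattern above can be tried out mechanically. The following is a minimal Python sketch, not part of this page's own scripts. The assignment of code points to the classes C, V and M is a simplifying assumption made purely for illustration (consonant signs, SAKOT, MAI KANG and the tone marks are all lumped into M); it is not taken from any shaping engine, and the names `classify` and `is_indic_syllable` are hypothetical.

```python
import re

# Assumed, simplified character classes (illustration only):
#   C = letters U+1A20..U+1A54
#   V = dependent vowel signs U+1A61..U+1A73
#   M = consonant signs, SAKOT, MAI KANG, tone marks and other signs
def classify(ch: str) -> str:
    cp = ord(ch)
    if 0x1A20 <= cp <= 0x1A54:
        return "C"
    if 0x1A61 <= cp <= 0x1A73:
        return "V"
    if 0x1A55 <= cp <= 0x1A60 or 0x1A74 <= cp <= 0x1A7C or cp == 0x1A7F:
        return "M"
    return "X"  # anything outside this simplified model

# The pattern exactly as written in the text: C+(M*V*M*C*M*)*
SYLLABLE = re.compile(r"C+(M*V*M*C*M*)*")

def is_indic_syllable(text: str) -> bool:
    """True if the class string of `text` matches the pattern in full."""
    classes = "".join(classify(ch) for ch in text)
    return SYLLABLE.fullmatch(classes) is not None

# 1A32 1A60 1A45 1A6B 1A61 classifies as C M C V V and is accepted.
print(is_indic_syllable("\u1A32\u1A60\u1A45\u1A6B\u1A61"))
```

As written, the pattern is permissive: once a string starts with a consonant, any further run of C, V and M characters is accepted, so the check mainly rejects strings that start with a mark or contain characters outside the model.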
However, many of the interesting cases occur in the second syllable of a word, and certain initial syllables are obligatorily followed by more characters of the word. I have therefore also supplied longer words when a conceivable problem would not appear in a word of one syllable.
The dependent vowel AA (U+1A63 and U+1A64) may form the base of its own little stack of dependent marks. Manual line breaking may also separate it from its base consonant. I have nevertheless counted it as part of the same syllable as a base consonant; the two stacks frequently interact in Northern Thai, with MAI KANG migrating to or towards the base consonant and interacting with its dependents.
The page was originally set up to use either my own stick font, 'Da Lekh', which is based on DejaVu Sans, or the cut-down version, 'Da Lekh Seri'. The Da Lekh font is intended to be suitable for use in preparing (but perhaps not publishing) Tai Tham text. It therefore includes work-arounds for known rendering engine problems. The Da Lekh Seri font deliberately does not include such work-arounds. You may be interested in using or examining my fonts for your own purposes.
I have added two other families of fonts. These fonts are available under the SIL Open Font License. The font 'A Tai Tham KH' relies only on the ccmp feature being enabled; it handles all Indic rearrangement itself.
The Hariphunchai font is an OpenType Layout font that looked promising when used with the South-East Asian shaper of HarfBuzz. Development seems to have drastically slowed when HarfBuzz switched Tai Tham to its implementation of the Universal Shaping Engine (USE). The code for this font is available on SourceForge and there is further documentation elsewhere. I have added work-arounds and a few further touches to enable it to work under the USE; I have dubbed the resulting font 'Lamphun'. I have included two versions in the menus, the 2014 version used for Lamphun, dubbed 'early Hariphunchai', and the latest (2019) version, dubbed 'Hariphunchai4'.
Feel free to adapt this web page to add your own fonts and test cases.
The test text is given in the table columns headed 'Text', and is the content of the first table cell in table rows with class tst1. Two further columns, headed 'Encoding' and 'Hacked via ASCII', are automatically derived from this text as the page is loaded. The hacked column is intended to show users how the text should look, though it too may suffer from rendering engine limitations. The font used for this column is the member of the Da Lekh font family last selected to display the first column.
Ideally, I would include images of the text from credible sources, but that may cause copyright problems, for the Unicode Consortium wishes to be able to use this document for commercial purposes.
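The format of the 'Encoding' column is simple enough to reproduce outside the page. The following Python sketch produces the same space-separated hexadecimal code points as seen in the tables. It is only an illustration of the format: the page's own derivation runs when the page loads and is not shown here, and the function name `encoding_column` is hypothetical.

```python
def encoding_column(text: str) -> str:
    """Return the code points of `text` as upper-case hex, space-separated."""
    return " ".join(f"{ord(ch):04X}" for ch in text)

# The first row of the first table: KA followed by vowel sign O.
print(encoding_column("\u1A20\u1A6B"))
```

Characters such as ZWNJ (U+200C) or a non-breaking space (U+00A0) are padded to four digits in the same way, which matches entries such as 200C and 00A0 in the tables.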
The 'Hacked via ASCII' column contains an unambiguous transliteration to ASCII of the Tai Tham text in the column headed 'Text'. Members of the Da Lekh font family contain an OpenType font feature, Stylistic Set 2, whose enabling may cause it to render the transliteration as the original Tai Tham text as it is intended to be rendered. For more details, see the style sheet in the source of this page.
The 'Meaning and Pronunciation' column is given to identify the word given as an example. There may be better glosses, and pronunciation can vary extensively within a nominal language. The pronunciation of the letter RA is particularly variable, ranging between /l/, /h/ and even /r/, and there are regional variations as to whether vowel length distinctions exist and, if so, whether they are phonemic. For the Tai languages the pronunciation is given using IPA, while Pali is simply transliterated (as Pali). I have omitted tone, as phonetic tone is also quite variable. Where no indication to the contrary is given, the Tai pronunciation given approximates that of Chiangmai.
The test words may, in principle, be extracted quite simply from this web page. Each test 'word' is the content of the first cell in each row whose class is tst1. For convenience, I have extracted the first two cells in such rows, along with titles, to a CSV file. Rows where there is a plausible case for treating the encoding used as erroneous are marked in pink. (Their CSS class is tst2.) For completeness, I have included alternative encodings which the Universal Shaping Engine (USE) calls for, with an orange background and CSS class tst3, when they are defensible encodings. The USE encoding is not well supported by fonts and is not robust to alternative classifications of combining marks.
The HTML comments within this web page should not be construed as holding test words.
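The extraction described above can be sketched with Python's standard `html.parser` module. This is a minimal sketch under stated assumptions, not the tool used to produce the CSV file: it assumes that test rows are `tr` elements carrying the class tst1 and that the test word is the text of the first `td` in such a row. The names `FirstCellExtractor` and `extract_test_words` are hypothetical. HTML comments are skipped because the parser's default comment handler does nothing.

```python
from html.parser import HTMLParser


class FirstCellExtractor(HTMLParser):
    """Collect the text of the first cell in every <tr> with class tst1."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._in_test_row = False
        self._cells_seen = 0
        self._buffer = None  # text pieces while inside the first cell

    def _flush(self):
        if self._buffer is not None:
            # Strip ASCII whitespace only: some test words deliberately
            # end in U+00A0, which a bare strip() would remove.
            self.words.append("".join(self._buffer).strip(" \t\r\n"))
            self._buffer = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._flush()
            classes = (dict(attrs).get("class") or "").split()
            self._in_test_row = "tst1" in classes
            self._cells_seen = 0
        elif tag in ("td", "th") and self._in_test_row:
            self._flush()  # copes with omitted </td> end tags
            self._cells_seen += 1
            if self._cells_seen == 1:
                self._buffer = []

    def handle_endtag(self, tag):
        if tag in ("td", "th", "tr", "table"):
            self._flush()
        if tag in ("tr", "table"):
            self._in_test_row = False

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)


def extract_test_words(html_text):
    parser = FirstCellExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.words
```

A call such as `extract_test_words(page_source)` returns the test words in document order; rows with class tst2 or tst3 are ignored by this sketch and would need the class test widened if they are wanted.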
This page is intended as a rendering engine test, rather than as a font test. However, you may modify this page to try out your own font. The necessary changes will be confined to the style sheet in the source code of this page, unless you use a different ASCIIfication scheme, in which case look at the usage of the JavaScript variable ss02_hack.
When this page was initially composed, in June 2015, the Da Lekh font mostly worked for the Tai Tham script in the Firefox and Chrome browsers. It worked in them because they use HarfBuzz to render the Tai Tham script. Since then, the HarfBuzz rendering engine used for Tai Tham has been brought into line with the Universal Shaping Engine, with a consequent dramatic fall in the rendering performance for the Tai Tham script.
The solution to this problem was to add numerous work-arounds to the font. These work-arounds have mostly restored performance, the main exceptions being subtle positioning errors where mark to base positioning is ignored and the default mark position is used instead.
The quality of the 'Hacked via ASCII' column varies from browser to browser and from operating system to operating system, and also varies over time. For Internet Explorer 11, Microsoft Edge and for the HarfBuzz-based browsers Firefox and Chrome, it is actually the best rendered column. (Script-specific rendering engines have a tendency to make the achievement of advanced script features difficult rather than easy; Tai Tham has many 'advanced' features.)
Traditionally, consonants used in neither Pali nor Sanskrit had no subscript forms. However, one significant text book, the 'big blue book', provides a subscript form for LOW FA for use in loans from English. This form is cramped and ugly, which goes against the tradition of Lanna script writing. The MFL treats the stroke distinguishing HIGH KXA, LOW KXA, LOW SA, LOW FA and LETTER UU from HIGH KHA, LOW KA, LOW CA, LOW PA and LETTER U as a diacritic. The Da Lekh font follows this interpretation, and leaves this diacritic above the baseline when the letter is subscripted.
If you have difficulty reading the Da Lekh fonts, you may find it useful to consult their glyph gallery.
These vowel combinations are taken from Revised proposal for encoding the Lanna script in the BMP of the UCS, ISO/IEC JTC1/SC2/WG2/N3207R, L2/07-007R (Everson, Hosken & Constable). Changes have been rung on the initial consonants to check for silly omissions.
A hyphen in the pronunciation indicates a syllable-final consonant that would be specified by a subscript consonant or following orthographic syllable.
Text | Meaning and Pronunciation | Encoding | Hacked via ASCII | Remarks |
---|---|---|---|---|
ᨠᩫ | N/A /ko-/ |
1A20 1A6B | ko | Section 5 No. 1. This sequence does not form a whole word. An example may be seen in a word for 'danger'. |
ᨣᩴ | then, and /kɔː/ |
1A23 1A74 | gM | Section 5 No. 2 |
ᨧᩢ | (irrealis marker) /tɕaʔ/ |
1A27 1A62 | ca | Section 5 No. 4 |
ᨲ᩠ᩅᩫᩡ | to prevaricate /tuaʔ/ |
1A32 1A60 1A45 1A6B 1A61 | t/woH | Section 5 No. 5 |
ᨷ᩠ᩅᩫ | lotus /bua/ |
1A37 1A60 1A45 1A6B | B/wo | Section 5 No. 6 |
ᨠ᩠ᩅ | N/A /kua-/ |
1A20 1A60 1A45 | k/w | Section 5 No. 7. This sequence does not form a whole word. An example may be seen in one of the words for 'big'. |
ᨡᩬᩴ | to request /kʰɔː/ |
1A21 1A6C 1A74 | khVOM | Section 5 No. 8 |
ᨠᩬ | N/A /kɔː-/ |
1A20 1A6C | kVO | Section 5 No. 9. This sequence does not form a whole word. An example may be seen in the fuller spelling of the word for 'belongings'. |
ᨦᩡ | to split up /ŋaʔ/ |
1A26 1A61 | GH | Section 5 No. 10 |
ᨠᩣ | crow /kaː/ |
1A20 1A63 | kA | Section 5 No. 11 |
ᨴᩤ | to paint /taː/ |
1A34 1A64 | d^A | Section 5 No. 12 |
ᩌᩣᩴ | to sprinkle /ham/ |
1A4C 1A63 1A74 | rhAM | Section 5 No. 13 |
ᨣᩤᩴ | word /kam/ |
1A23 1A64 1A74 | g^AM | Section 5 No. 14 |
ᨳᩥ | to pretend /tʰiʔ/ |
1A33 1A65 | th_i | Section 5 No. 15 |
ᨺᩦ | boil (n.) /fiː/ |
1A3A 1A66 | FI | Section 5 No. 16 |
ᨩᩧ | moist /tɕɯʔ/ |
1A29 1A67 | jue | Section 5 No. 17 |
ᨾᩨ | hand /mɯː/ |
1A3E 1A68 | mUE | Section 5 No. 18 |
ᨵᩩ | monk /tʰuʔ/ |
1A35 1A69 | dhu | Section 5 No. 19 |
ᨦᩪ | snake /ŋuː/ |
1A26 1A6A | GU | Section 5 No. 20 |
ᨲᩮᩡ | to kick /teʔ/ |
1A32 1A6E 1A61 | t_eH | Section 5 No. 21 |
ᨽᩮ | danger /pʰeː/ |
1A3D 1A6E | bh_e | Section 5 No. 22 |
ᨤᩯᩡ | to limp along /kʰɛʔ/ |
1A24 1A6F 1A61 | gx_EH | Section 5 No. 23 |
ᨧᩯ | corner /tɕɛː/ |
1A27 1A6F | c_E | Section 5 No. 24 |
ᨸᩮᩬᩥᩡ | mud /pɤʔ/ |
1A38 1A6E 1A6C 1A65 1A61 | p_eVO_iH | Section 5 No. 25 |
ᨸᩮᩥᩬᩡ | 1A38 1A6E 1A65 1A6C 1A61 | p_e_iVOH | Different from the proposals. | |
ᨶᩮᩬᩥ | (final particle for commands and entreaties) /nɤː/ |
1A36 1A6E 1A6C 1A65 | n_eVO_i | Section 5 No. 26. |
ᨶᩮᩥᩬ | 1A36 1A6E 1A65 1A6C | n_e_iVO | Different from the proposals. | |
ᨠᩮᩬᩨᩡ | N/A /kɯaʔ/ |
1A20 1A6E 1A6C 1A68 1A61 | k_eVOUEH | Section 5 No. 27 |
ᨠᩮᩨᩬᩡ | 1A20 1A6E 1A68 1A6C 1A61 | k_eUEVOH | Different from the proposals. | |
ᨠᩮᩬᩨ | /kɯa/ |
1A20 1A6E 1A6C 1A68 | k_eVOUE | Section 5 No. 28 |
ᨠᩮᩨᩬ | 1A20 1A6E 1A68 1A6C | k_eUEVO | Different to the proposals. | |
ᩁᩮᩢᩣ | we /hau/ |
1A41 1A6E 1A62 1A63 | r_eaA | Section 5 No. 29 |
ᨾᩳ | drunk /mau/ |
1A3E 1A73 | m^O | Section 5 No. 30. This example is not taken from the MFL, which does not use this vowel symbol. |
ᨠᩮᩣ | N/A /koː/ |
1A20 1A6E 1A63 | k_eA | Section 5 No. 31. This is very rare in monosyllables, but is quite common at the end of monks' names, e.g. Adittadhammo. |
ᨹ᩠ᨿᩮᩡ | a type of sound /pʰiaʔ/ |
1A39 1A60 1A3F 1A6E 1A61 | ph/_y_eH | Section 5 No. 32 |
ᨻ᩠ᨿᩮ | flower /pia/ |
1A3B 1A60 1A3F 1A6E | b/_y_e | Section 5 No. 33 |
ᨠ᩠ᨿ | N/A /kia-/ |
1A20 1A60 1A3F | k/_y | Section 5 No. 34. This sequence does not form a whole word. An example may be seen in a spelling of the word for 'city'. |
ᨾᩮᩬᩥᩋᩡ | mucus /mɯaʔ/ |
1A3E 1A6E 1A6C 1A65 1A4B 1A61 | m_eVO_i_qH | Section 5 No. 35. (2 syllables) |
ᨾᩮᩥᩬᩋᩡ | 1A3E 1A6E 1A65 1A6C 1A4B 1A61 | m_e_iVO_qH | Different from the proposals. | |
ᨠᩖᩮᩬᩥᩋ | salt /kɯa/ |
1A20 1A56 1A6E 1A6C 1A65 1A4B | kVl_eVO_i_q | Section 5 No. 36. (2 syllables) |
ᨠᩖᩮᩥᩬᩋ | 1A20 1A56 1A6E 1A65 1A6C 1A4B | kVl_e_iVO_q | Different from the proposals. | |
ᩈᩰᩡ | to practice /soʔ/ |
1A48 1A70 1A61 | sOH | Section 5 No. 37 |
ᨾᩰ | big /moː/ |
1A3E 1A70 | mO | Section 5 No. 38 |
ᨪᩰᩬᩡ | to gouge out /sɔʔ/ |
1A2A 1A70 1A6C 1A61 | jxOVOH | Section 5 No. 39 |
ᨩᩢ᩠ᨿ | victory /tɕai/ |
1A29 1A62 1A60 1A3F | ja/_y | Section 5 No. 40 |
ᨶᩲ | in /nai/ |
1A36 1A72 | naue | Section 5 No. 41 |
ᨢᩱ | to expose /kʰai/ |
1A22 1A71 | kxai | Section 5 No. 42 |
ᨴᩱ᩠ᨿ | Thailand /tai/ |
1A34 1A71 1A60 1A3F | dai/_y | Section 5 No. 43 |
ᨠᩮᩬᩨᩡ | Khün /kɤʔ/ |
1A20 1A6E 1A6C 1A68 1A61 | k_eVOUEH | Section 5.3 No. 22 |
ᨠᩮᩨᩬᩡ | 1A20 1A6E 1A68 1A6C 1A61 | k_eUEVOH | Different from proposals. | |
ᨠᩮᩬᩨ | Khün /kɤː/ |
1A20 1A6E 1A6C 1A68 | k_eVOUE | Section 5.3 No. 23 |
ᨠᩮᩨᩬ | 1A20 1A6E 1A68 1A6C | k_eUEVO | Different from proposals. | |
ᨠᩰᩢ | Khün /ko-/ |
1A20 1A70 1A62 | kOa | Section 5.3 No. 26 |
ᩈᩘ | First syllable of compounds of saṅgha. /saŋ/ |
1A48 1A58 | s>G | Section 5.3 No. 29. Apparently not a possible final syllable, but can be left stranded as a result of line-breaking. |
ᨴᩢ᩠ᨦ | whole /taŋ/ |
1A34 1A62 1A60 1A26 | da/G | Section 5.3 No. 30 |
ᩌᩥᩴ | edge /him/ |
1A4C 1A65 1A74 | rh_iM | Section 5.3 No. 31 (Example from Apiradee p53, but different language, different pronunciation, i.e. not /-iŋ/.) |
ᨠᩥ᩠ᨦ | /kiŋ/ |
1A20 1A65 1A60 1A26 | k_i/G | Section 5.3 No. 32 |
ᨠᩢ᩠ᨾ | /kam/ |
1A20 1A62 1A60 1A3E | ka/m | Section 5.3 No. 34 |
ᨠᩢᨾ | /kam/ |
1A20 1A62 1A3E | kam | Section 5.3 No. 35 |
ᨯᩭ | mountain /dɔːi/ |
1A2F 1A6D | Doi | Section 5.3 No. 36 |
Other explicit coding sequences are given in Revised proposal for encoding the Lanna script in the BMP of the UCS, ISO/IEC JTC1/SC2/WG2/N3207R, L2/07-007R (Everson, Hosken & Constable), and these are recorded here. Amended and exploratory material is highlighted in yellow; it is not vouched for by the proposal. The remarks are my own.
Text | Meaning and Pronunciation | Encoding | Hacked via ASCII | Remarks |
---|---|---|---|---|
᪓᩠ᨴ | thrice /saːm tiː/ |
1A93 1A60 1A34 | 3T/d | Section 2 |
ᨲ᩵ᩣ᩠ᨦ᩻ | different in my view /taːŋ taːŋ/ |
1A32 1A75 1A63 1A60 1A26 1A7B | t1A/G" | Section 7 |
ᨲᩣ᩠᩵ᨦ᩻ | 1A32 1A63 1A75 1A60 1A26 1A7B | tA1/G" | Different from the proposals. | |
ᨲᩣ᩠᩵ᨦ᩻ | 1A32 1A63 1A60 1A75 1A26 1A7B | tA/1G" | Normalisation of the above. | |
ᨳ᩠ᨶ᩻ᩫᩁ | path /tʰănon/ |
1A33 1A60 1A36 1A7B 1A6B 1A41 | th/n"o_r | Sections 7 and 14.6 (2 syllables - the second is a single character). |
ᨳ᩠ᨶᩫ᩻ᩁ | 1A33 1A60 1A36 1A6B 1A7B 1A41 | th/no"_r | Different from the proposals, which explicitly specified the various semantically sensitive positions of mai sam. For this word, the visual position of the marks above is free. | |
ᨡᩢ᩶᩻ᩬᨦ | belongings /kʰau kʰɔːŋ/ |
1A21 1A62 1A76 1A7B 1A6C 1A26 | kha2"VOG | Section 7 (2 syllables - the second is a single character) |
ᨡᩢᩬ᩶᩻ᨦ | 1A21 1A62 1A6C 1A76 1A7B 1A26 | khaVO2"G | Different from the proposals. | |
ᨡᩮᩢ᩶ᩣᨡᩬᨦ | belongings /kʰau kʰɔːŋ/ |
1A21 1A6E 1A62 1A76 1A63 1A21 1A6C 1A26 | kh_ea2AkhVOG | Section 7 (3 syllables - the third is a single character) |
ᨡᩮᩢᩣ᩶ᨡᩬᨦ | 1A21 1A6E 1A62 1A63 1A76 1A21 1A6C 1A26 | kh_eaA2khVOG | Different from the proposals. | |
᪭ᩣ | elephant /tɕaːŋ/ |
1AAD 1A63 | ᪭A | Section 11 |
ᩉ᩠ᨶᩦ | to flee /niː/ |
1A49 1A60 1A36 1A66 | h/nI | Section 14.1 |
ᨤ᩠ᩅᩯ᩶ᩁ | to blockade /kʰwɛːn/ |
1A24 1A60 1A45 1A6F 1A76 1A41 | gx/w_E2_r | Section 14.2 (2 syllables - the second is a single character) |
ᩉ᩠ᩅᩫ | head /hua/ |
1A49 1A60 1A45 1A6B | h/wo | Section 14.3 |
ᨯᩢ᩵ᨦ᩠ᨶᩦ᩶ | like this /daŋ niː/ |
1A2F 1A62 1A75 1A26 1A60 1A36 1A66 1A76 | Da1G/nI2 | Section 14.4 (2 syllables) |
ᩉᩥ᩠ᨶ | stone /hin/ |
1A49 1A65 1A60 1A36 | h_i/n | Section 14.5 |
ᨷ᩠᩵ᨾᩦ | to not have /bɔː miː/ |
1A37 1A75 1A60 1A3E 1A66 | B1/mI | Section 14.6. The proposal lists MAI KANG as a code point, but it is visually dropped in this compound. I presume the renderer is not intended to suppress the appearance of the character. The upper row drops the MAI KANG from the encoding, so is not the encoding intended, while the lower row uses the stated encoding. Da Lekh fails to arrange the marks above properly; arrangement is a proper challenge for a Tai Tham font. The phonetic syllable boundary is part of the context! |
ᨷᩴ᩠᩵ᨾᩦ | to not have /bɔː miː/ |
1A37 1A74 1A75 1A60 1A3E 1A66 | BM1/mI | |
ᨲᩣ᩠ᨾ | to follow /taːm/ |
1A32 1A63 1A60 1A3E | tA/m | Section 14.7 |
ᨻ᩠ᨿᩣ᩠ᨵᩥ | sickness /păɲaːt/ |
1A3B 1A60 1A3F 1A63 1A60 1A35 1A65 | b/_yA/dh_i | Section 14.8 |
ᨸ᩠ᩃ᩠ᨿ᩵ᩁ | to change /pian/ |
1A38 1A60 1A43 1A60 1A3F 1A75 1A41 | p/_l/_y1_r | Section 14.9 (2 syllables - the second is a single character) |
ᨾᩯ᩠᩶ᨶ᩠ᩅ᩵ᩣ | even though /mɛːn waː/ |
1A3E 1A6F 1A76 1A60 1A36 1A60 1A45 1A75 1A63 | m_E2/n/w1A | Section 14.9. A sophisticated font might transpose the tone marks. The phonetic syllable boundary should be part of the context. |
ᨾᩯ᩠᩶ᨶ᩠ᩅ᩵ᩣ | even though /mɛːn waː/ |
1A3E 1A6F 1A60 1A76 1A36 1A60 1A45 1A75 1A63 | m_E/2n/w1A | Same as above, but normalised, so not the code point sequence in the proposal. Proposal explicitly stated SAKOT was to have ccc=0, not 9, but ccc=9 was quietly inserted in draft properties and not noticed until too late. |
ᩈ᩠ᩅᩯ᩵ | to butt in /swɛː/ |
1A48 1A60 1A45 1A6F 1A75 | s/w_E1 | Section 14.10 |
ᩈᩯ᩠᩵ᩅ | to embroider /sɛːw/ |
1A48 1A6F 1A75 1A60 1A45 | s_E1/w | Section 14.10 (but the proposal has vowel and tone the wrong way round) |
ᩈᩯ᩠᩵ᩅ | to embroider /sɛːw/ |
1A48 1A6F 1A60 1A75 1A45 | s_E/1w | As above, but normalised, so very much not the codepoint sequence in the proposal. |
ᩈ᩵ᩯ᩠ᩅ | to embroider /sɛːw/ |
1A48 1A75 1A6F 1A60 1A45 | s1_E/w | As above, but uncorrected. Arguably, the rendering is unconstrained. |
ᨿᩪ | broom, whisk /ɲuː/ |
1A3F 1A6A | yU | Section 15 No. 1 |
ᨾᩦ | to have /miː/ |
1A3E 1A66 | mI | Section 15 No. 2 |
ᩉ᩠ᨾᩪ | pig /muː/ |
1A49 1A60 1A3E 1A6A | h/mU | Section 15 No. 3 |
ᩉ᩠ᨾᩦ | bear (n.) /miː/ |
1A49 1A60 1A3E 1A66 | h/mI | Section 15 No. 4 |
ᨹ᩠ᩅᩫ | husband /pʰua/ |
1A39 1A60 1A45 1A6B | ph/wo | Section 15 No. 5 |
ᩉ᩠ᩃᩬᩴ᩵ | to cast (in metal) /lɔː/ |
1A49 1A60 1A43 1A6C 1A74 1A75 | h/_lVOM1 | Section 15 No. 6 |
ᨾᩣ | to come /maː/ |
1A3E 1A63 | mA | Section 15 No. 7 |
ᩉᩱ᩵ | to hit /hai/ |
1A49 1A71 1A75 | hai1 | Section 15 No. 8 |
ᨾ᩠ᨿ | 1A3E 1A60 1A3F | m/_y | Section 15 No. 9 | |
ᩅ᩠ᨿᨦ | city /wiaŋ/ |
1A45 1A60 1A3F 1A26 | w/_yG | Section 15 No. 10 (2 syllables - the second is a single character) |
ᩉᩣ᩠ᨾ | to carry by the handles /haːm/ |
1A49 1A63 1A60 1A3E | hA/m | Section 15 No. 11 |
ᨯᩣᩴ | black /dam/ |
1A2F 1A63 1A74 | DAM | Section 15 No. 12 |
ᨡᩮ᩠ᩅ | 1A21 1A6E 1A60 1A45 | kh_e/w | Section 15 No. 13 | |
ᩉ᩠ᨾᩣ | dog /maː/ |
1A49 1A60 1A3E 1A63 | h/mA | Section 15 No. 14 |
ᨠᩕᩣ᩠ᨸ | to prostrate oneself /kʰaːp/ |
1A20 1A55 1A63 1A60 1A38 | kVrA/p | Section 15 No. 15. The later addition of SIGN BA to the repertoire makes the correct final consonant here unclear. |
ᨻᩕ᩵ᩣᩴ | indefatigable /pʰam/ |
1A3B 1A55 1A75 1A63 1A74 | bVr1AM | Section 15 No. 16 |
ᨻᩕᩣᩴ᩵ | 1A3B 1A55 1A63 1A74 1A75 | bVrAM1 | Different from the proposals. The USE diktat at December 2021 does not determine the relative order of the tone mark and mai kang. In some styles the tone mark is associated with and follows mai kang, either above or to the right of it, but in other styles the tone mark sits on the consonant and the mai kang on the spacing vowel. Both encodings are shown here. | |
ᨻᩕᩣ᩵ᩴ | 1A3B 1A55 1A63 1A75 1A74 | bVrA1M | ||
ᨠᩕᩬᨦ | garland; Mekong /kʰɔːŋ/ |
1A20 1A55 1A6C 1A26 | kVrVOG | Section 15 No. 17 (2 syllables - the second is a single character) |
ᩈᩕᩫᨾ᩠ᨱ᩺ | ascetic /sălom/ |
1A48 1A55 1A6B 1A3E 1A60 1A31 1A7A | sVrom/N^r | Section 15 No. 18. If the word is interpreted as having two phonetic syllables, then the medial consonant comes between an implicit vowel and an explicit vowel. (2 syllables) |
ᩈᩕ᩠ᩅᩫᨾ | to embrace /săluam/ |
1A48 1A55 1A60 1A45 1A6B 1A3E | sVr/wom | Section 15 No. 19 (2 syllables - the second is a single character). Ignore final ᨾ; it makes the spelling ungrammatical. However, a few such spellings do occur in the MFL. |
ᩈᩕ᩠ᩅᨾ | to embrace /săluam/ |
1A48 1A55 1A60 1A45 1A3E | sVr/wm | Spelling of above in the MFL, so this form's encoding is not given in the proposal. |
ᨯᩮᩬᩨᩁ | month /dɯan/ |
1A2F 1A6E 1A6C 1A68 1A41 | D_eVOUE_r | Section 15 No. 20 (2 syllables - the second is a single character) |
ᨯᩮᩨᩬᩁ | 1A2F 1A6E 1A68 1A6C 1A41 | D_eUEVO_r | Different from the proposals. | |
ᩁᩮᩬᩨᩋ | boat /hɯa/ |
1A41 1A6E 1A6C 1A68 1A4B | r_eVOUE_q | Section 15 No. 21 (2 syllables - the second is a single character) |
ᩁᩮᩨᩬᩋ | 1A41 1A6E 1A68 1A6C 1A4B | r_eUEVO_q | Different from the proposals. | |
ᩉ᩠ᩃᩮᩬᩨᩋ | to exceed /lɯa/ |
1A49 1A60 1A43 1A6E 1A6C 1A68 1A4B | h/_l_eVOUE_q | Section 15 No. 22 (2 syllables - the second is a single character) |
ᩉ᩠ᩃᩮᩨᩬᩋ | 1A49 1A60 1A43 1A6E 1A68 1A6C 1A4B | h/_l_eUEVO_q | Different from the proposals. | |
ᩉ᩠ᨾ᩵ᩣᩴ | to eat /mam/ |
1A49 1A60 1A3E 1A75 1A63 1A74 | h/m1AM | Section 15 No. 23 |
ᩉ᩠ᨾᩣᩴ᩵ | 1A49 1A60 1A3E 1A63 1A74 1A75 | h/mAM1 | The USE diktat does not specify whether mai kang or tone mark comes first. Both encodings are shown. | |
ᩉ᩠ᨾᩣ᩵ᩴ | 1A49 1A60 1A3E 1A63 1A75 1A74 | h/mA1M | ||
ᩈ᩠ᨾᩬᩥ᩻ | very level(?) /sămɤː sămɤː/ |
1A48 1A60 1A3E 1A6C 1A65 1A7B | s/mVO_i" | Section 15 No. 24. Encoding as given, omitting SIGN E, which is depicted in the proposal. Moreover, the word appears to be a misreading of the next but one. |
ᩈ᩠ᨾᩨᩬ᩻ | 1A48 1A60 1A3E 1A68 1A6C 1A7B | s/mUEVO" | Encoding is different from the proposals. | |
ᩈ᩠ᨾᩮᩬᩥ᩻ | 1A48 1A60 1A3E 1A6E 1A6C 1A65 1A7B | s/m_eVO_i" | Section 15 No. 24. SIGN E restored to encoding. | |
ᩈ᩠ᨾᩮᩥᩬ᩻ | 1A48 1A60 1A3E 1A6E 1A65 1A6C 1A7B | s/m_e_iVO" | SIGN E restored to a different encoding from the proposals. | |
ᩈ᩠ᨾ᩻ᩮᩬᩥ | level (adj.) /sămɤː/ |
1A48 1A60 1A3E 1A7B 1A6E 1A6C 1A65 | s/m"_eVO_i | Probable reading of above. Consequently, the encoding is not vouched for by the proposal. Phonetically, this is one or two syllables, depending on how one counts. |
ᩈ᩠ᨾᩮᩥᩬ᩻ | 1A48 1A60 1A3E 1A6E 1A65 1A6C 1A7B | s/m_e_iVO" | Probable reading of the above. The USE-compliant encodings of the two readings are the same, but each has compatible renderings inconsistent with the other interpretation. | |
ᩉ᩠ᨾᩮᩬᩨᨦ | mine (n.) /mɯaŋ/ |
1A49 1A60 1A3E 1A6E 1A6C 1A68 1A26 | h/m_eVOUEG | Section 15 No. 25 (2 syllables - the second is a single character) |
ᩉ᩠ᨾᩮᩨᩬᨦ | 1A49 1A60 1A3E 1A6E 1A68 1A6C 1A26 | h/m_eUEVOG | Different from the proposals. | |
ᩉ᩠ᨿᩮᩬᩨᨦ | to despise /ɲɯaŋ/ |
1A49 1A60 1A3F 1A6E 1A6C 1A68 1A26 | h/_y_eVOUEG | Section 15 No. 26 (2 syllables - the second is a single character) |
ᩉ᩠ᨿᩮᩨᩬᨦ | 1A49 1A60 1A3F 1A6E 1A68 1A6C 1A26 | h/_y_eUEVOG | Different from the proposals. | |
ᩉ᩠ᨾᩫ᩵ᩁ | winter melon (Benincasa hispida) /mon/ |
1A49 1A60 1A3E 1A6B 1A75 1A41 | h/mo1_r | Section 15 No. 27 (2 syllables - the second is a single character) |
ᩉ᩠ᩃᩣ᩠ᨿ | many /laːi/ |
1A49 1A60 1A43 1A63 1A60 1A3F | h/_lA/_y | Section 15 No. 28 |
ᩉ᩠ᩃᩮᩬᩨᨦ | yellow /lɯaŋ/ |
1A49 1A60 1A43 1A6E 1A6C 1A68 1A26 | h/_l_eVOUEG | Section 15 No. 29 (2 syllables - the second is a single character) |
ᩉ᩠ᩃᩮᩨᩬᨦ | 1A49 1A60 1A43 1A6E 1A68 1A6C 1A26 | h/_l_eUEVOG | Different from the proposals. |
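Several rows above are described as the normalisation of the row before them. The effect can be reproduced with Python's standard `unicodedata` module, as in the following sketch. It takes the code point sequence quoted for 'even though' (Section 14.9) and applies NFC; the helper name `hex_code_points` is hypothetical, and the result depends on the Unicode data shipped with the Python version in use.

```python
import unicodedata

SAKOT = "\u1A60"
TONE_2 = "\u1A76"


def hex_code_points(text: str) -> str:
    """Format text as space-separated upper-case hex code points."""
    return " ".join(f"{ord(ch):04X}" for ch in text)


# Sequence as given in the proposal: tone mark before SAKOT.
as_proposed = "\u1A3E\u1A6F\u1A76\u1A60\u1A36\u1A60\u1A45\u1A75\u1A63"
normalised = unicodedata.normalize("NFC", as_proposed)

# Canonical combining classes drive the reordering.
print("ccc(SAKOT)  =", unicodedata.combining(SAKOT))
print("ccc(TONE-2) =", unicodedata.combining(TONE_2))
print("proposed:   ", hex_code_points(as_proposed))
print("normalised: ", hex_code_points(normalised))
```

Because SAKOT has a non-zero combining class that is lower than that of the tone mark, canonical reordering moves SAKOT in front of the tone mark, which is why the normalised rows differ from the code point sequences in the proposal.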
The actual coding sequences to be used here are open to challenge.
Text | Meaning and Pronunciation | Encoding | Hacked via ASCII | Remarks |
---|---|---|---|---|
ᨠᩬᩢᩃ᩠ᨼ᩺ | golf /kɔp/ |
1A20 1A6C 1A62 1A43 1A60 1A3C 1A7A | kVOa_l/f^r | Section 2. The position of RA HAAM is debatable - cf. Thai กอล์ฟ. The first example places it on the second consonant, the second on the first. The third then normalises the spelling of the second. Note that this word consists of two orthographic syllables. |
ᨠᩬᩢᩃ᩠᩺ᨼ | 1A20 1A6C 1A62 1A43 1A7A 1A60 1A3C | kVOa_l^r/f | ||
ᨠᩬᩢᩃ᩠᩺ᨼ | 1A20 1A6C 1A62 1A43 1A60 1A7A 1A3C | kVOa_l/^rf | ||
ᨠᩢᩬᩃ᩠ᨼ᩺ | 1A20 1A62 1A6C 1A43 1A60 1A3C 1A7A | kaVO_l/f^r | In the December 2021 USE order. | |
ᨠᩢᩬᩃ᩠᩺ᨼ | 1A20 1A62 1A6C 1A43 1A7A 1A60 1A3C | kaVO_l^r/f | ||
ᨠᩢᩬᩃ᩠᩺ᨼ | 1A20 1A62 1A6C 1A43 1A60 1A7A 1A3C | kaVO_l/^rf | ||
ᨠᩕᩣ᩠ᨼ | graph /kaːp/ (?) |
1A20 1A55 1A63 1A60 1A3C | kVrA/f | Section 2. |
ᨴᩬᨼ᩠ᨼᩦ᩵ | toffee | 1A34 1A6C 1A3C 1A60 1A3C 1A66 1A75 | dVOf/fI1 | Section 2 (2 syllables) |
ᨠᨽᩚ | pregnant /kap pʰaʔ/ |
1A20 1A3D 1A5A | kbh^b | Section 4 (2 syllables - the first is a single character) |
ᩈᨱᩛᩣ᩠ᨶ | shape /san tʰaːn/ |
1A48 1A31 1A5B 1A63 1A60 1A36 | sNVbA/n | Section 4 (2 syllables - the first is a single character) |
ᩁᨭᩛᨷᩣ᩠ᩃ | government /rat tʰa baːn/ |
1A41 1A2D 1A5B 1A37 1A63 1A60 1A43 | r_TVbBA/_l | Section 4 (3 syllables) |
ᩁᩢᨭᩛᨷᩣ᩠ᩃ | government /rat tʰa baːn/ |
1A41 1A62 1A2D 1A5B 1A37 1A63 1A60 1A43 | ra_TVbBA/_l | Section 4 (3 syllables) |
ᩈᨻᩛ | omniscience /sap paʔ/ |
1A48 1A3B 1A5B | sbVb | Section 4 (2 syllables - the first is a single character) |
ᩋᨾᩛ | mango /ʔam paʔ/ |
1A4B 1A3E 1A5B | qmVb | Section 4 (2 syllables - the first is a single character) |
ᩁᩣᨩᨽᩢ᩠ᨮ | Rajabhat /laːt tɕa pʰat/ |
1A41 1A63 1A29 1A3D 1A62 1A60 1A2E | rAjbha/Th | Section 4 (3 syllables) |
ᨷᩢᨱ᩠ᨻᨷᩩᩁᩩᩈ | disciple "banop burus" |
1A37 1A62 1A31 1A60 1A3B 1A37 1A69 1A41 1A69 1A48 | BaN/bBu_ru_s | Section 4 (5 syllables) |
The mai kang lai character can be a challenge to a font. The character has a wide range of behaviours, from that of a spacing final character (as in modern Tai Khün fonts) to that of a repha-like character, the old-fashioned behaviour seen in Tai Khün, Thailand and Laos. The MFL dictionary shows an intermediate behaviour, where marks above the following base consonant cause it to be positioned within the previous syllable. This is the style employed by the Da Lekh font.
Text | Meaning and Pronunciation | Encoding | Hacked via ASCII | Remarks |
---|---|---|---|---|
ᨴᩘ᩠ᩃᩣ᩠ᨿ | all /taŋ laːi/ |
1A34 1A58 1A60 1A43 1A63 1A60 1A3F | d>G/_lA/_y | The ascending tail of SAKOT LA prevents the MAI KANG LAI moving on to a subsequent syllable/word. This prevents fonts exploiting the rphf feature of the Universal Shaping Engine. |
ᨴ᩠ᩃᩘᩣ᩠ᨿ | 1A34 1A60 1A43 1A58 1A63 1A60 1A3F | d/_l>GA/_y | With total disregard for logical order. | |
ᩈᩘᨥᩮᩣ | Nominative of Pali saṅgha <saṅgho> |
1A48 1A58 1A25 1A6E 1A63 | s>Ggh_eA | (2 syllables) |
ᩁᩘᩈᩦ | ray /raŋ siː/ |
1A41 1A58 1A48 1A66 | r>G_sI | (2 syllables) |
This is mostly a test for readers!
Text | Meaning and Pronunciation | Encoding | Hacked via ASCII | Remarks |
---|---|---|---|---|
ᨶᩣᩴ | to lead /nam/ |
1A36 1A63 1A74 | nAM | |
ᨾᨶᩮᩣ | heart, mind /maʔ noː/ |
1A3E 1A36 1A6E 1A63 | mn_eA | (2 syllables) |
ᨶᩮᩢᩣ | to sew a long stitch /nau/ |
1A36 1A6E 1A62 1A63 | n_eaA | Some fonts may fail here because they handle the ligature in pstf; this worked with HarfBuzz until pstf was moved to before Indic rearrangement. |
ᨶᩣ᩠ᨿ | leader /naːi/ |
1A36 1A63 1A60 1A3F | nA/_y | |
ᨶ᩵ᩣ᩠ᨶ | Nan /naːn/ |
1A36 1A75 1A63 1A60 1A36 | n1A/n | |
ᨶᩣ᩠᩵ᨶ | 1A36 1A63 1A75 1A60 1A36 | nA1/n | Using formalism where neither current nor historical speech defines phonetic order. The first of these two keeps user-perceivable characters contiguous, and the second is its normalisation (NFC/NFD). | |
ᨶᩣ᩠᩵ᨶ | 1A36 1A63 1A60 1A75 1A36 | nA/1n | ||
ᩍᨶ᩠ᨴᩣ | Indra /ʔin taː/ |
1A4D 1A36 1A60 1A34 1A63 | qqin/dA | The more usual form lacks U+1A63. (2 syllables - first has one character.) |
ᩋᩫᨶ᩠ᨲᩕᩣ᩠ᨿ | danger /ʔon tʰaʔ laːi/ |
1A4B 1A6B 1A36 1A60 1A32 1A55 1A63 1A60 1A3F | qon/tVrA/_y | (2 syllables) |
ᨶ᩶ᩣᩴ | water /nam/ |
1A36 1A76 1A63 1A74 | n2AM | This can be surprisingly hard to achieve in a font. Logic designed to stop Arabic vowel marks wrongly interacting has to be circumvented so that the two marks will interact! |
ᨶᩣ᩶ᩴ | 1A36 1A63 1A76 1A74 | nA2M | The USE rules do not dictate whether the tone mark comes before or after the mai kang. Both the canonically inequivalent forms are given here. | |
ᨶᩣᩴ᩶ | 1A36 1A63 1A74 1A76 | nAM2 | ||
ᨶ᩠ᩅᩣ᩠ᨷ | to falsely accuse /nwaːp/ |
1A36 1A60 1A45 1A63 1A60 1A37 | n/wA/B | MFL p352 |
ᨴᩤᩴᨶ᩠ᩅᩣ᩠ᨿ | to foretell /tam nwaːi/ |
1A34 1A64 1A74 1A36 1A60 1A45 200C 1A63 1A60 1A3F | d^AMn/wA/_y | NTDPLM p285. Sometimes the writer wants to avoid the ligature! (2 syllables) |
ᨲ᩵ᩣᩴᨶ᩠ᩅᩣ᩠ᨿ | to foretell /tam nwaːi/ |
1A32 1A75 1A63 1A74 1A36 1A60 1A45 1A63 1A60 1A3F | t1AMn/wA/_y | MFL p320, but only in transliteration. Shape of second syllable (ligature plus subscript consonant) is attested elsewhere. (2 syllables) |
ᨲᩣ᩵ᩴᨶ᩠ᩅᩣ᩠ᨿ | 1A32 1A63 1A75 1A74 1A36 1A60 1A45 1A63 1A60 1A3F | tA1Mn/wA/_y | The USE does not dictate whether mai kang or the tone mark comes first. Both options are given here. | |
ᨲᩣᩴ᩵ᨶ᩠ᩅᩣ᩠ᨿ | 1A32 1A63 1A74 1A75 1A36 1A60 1A45 1A63 1A60 1A3F | tAM1n/wA/_y | ||
ᨶᩣ | rice field /naː/ |
1A36 200C 1A63 | nA | An isolated test of the ZWNJ feature above. This form is to be expected in texts teaching the writing system. |
ᩉ᩠ᨶ᩶ᩣ | face /naː/ |
1A49 1A60 1A36 1A76 1A63 | h/n2A | Note that the SAKOT prevents ligature formation. |
ᩉ᩠ᨶᩣ᩶ | 1A49 1A60 1A36 1A63 1A76 | h/nA2 | Tone mark above consonant still follows the vowel. |
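The remarks above distinguish between alternative encodings that are canonically equivalent (one is merely the normalisation of the other) and those that are canonically inequivalent. A quick way to tell the two cases apart is to compare normalised forms, as in this Python sketch. The function name `canonically_equivalent` is hypothetical, and the answers depend on the Unicode data bundled with the Python version in use.

```python
import unicodedata


def canonically_equivalent(a: str, b: str) -> bool:
    """Two strings are canonically equivalent if their NFD forms are equal."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# 'Nan' with the tone mark before or after SAKOT: 1A36 1A63 1A75 1A60 1A36
# versus 1A36 1A63 1A60 1A75 1A36.
nan_tone_first = "\u1A36\u1A63\u1A75\u1A60\u1A36"
nan_sakot_first = "\u1A36\u1A63\u1A60\u1A75\u1A36"

# 'water' with the tone mark before or after MAI KANG: 1A36 1A63 1A76 1A74
# versus 1A36 1A63 1A74 1A76.
water_tone_first = "\u1A36\u1A63\u1A76\u1A74"
water_kang_first = "\u1A36\u1A63\u1A74\u1A76"

print(canonically_equivalent(nan_tone_first, nan_sakot_first))
print(canonically_equivalent(water_tone_first, water_kang_first))
```

The first pair compares equal, so a renderer will typically see the same sequence whichever was typed if the text has been normalised. The second pair does not, so a font has to cope with both orders.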
These examples are taken from the 'big blue book' pp151-6. Some of these renderings are unusual compared with the native tradition, and are included for that reason. The position of RA HAAM is particularly noteworthy.
The pronunciations given are guesswork where Siamese practice and Lanna script orthography conflict.
Text | Meaning and Pronunciation | Encoding | Hacked via ASCII | Remarks |
---|---|---|---|---|
ᨠᩯᩢ᩠ᩈ | gas /kɛs/ |
1A20 1A6F 1A62 1A60 1A48 | k_Ea/_s | |
ᨴᩕᩯ᩠ᨠᨴᩮᩬᩥᩁ᩺ | tractor /tʰɛːk tʰɤː/ |
1A34 1A55 1A6F 1A60 1A20 1A34 1A6E 1A6C 1A65 1A41 1A7A | dVr_E/kd_eVO_i_r^r | Slightly complicated set of consonants in first syllable. (2 syllables) |
ᨴᩕᩯ᩠ᨠᨴᩮᩥᩬᩁ᩺ | 1A34 1A55 1A6F 1A60 1A20 1A34 1A6E 1A65 1A6C 1A41 1A7A | dVr_E/kd_e_iVO_r^r | Vowel not as in the proposals. | |
ᨶᩰᩫ᩠᩶ᨲ | note /noːt/ |
1A36 1A70 1A6B 1A76 1A60 1A32 | nOo2/t | Vowel combination not listed above |
ᨷᩕᩰᨴᩦ᩠ᨶ | protein /pʰoː tiːn/ |
1A37 1A55 1A70 1A34 1A66 1A60 1A36 | BVrOdI/n | Tests reordering - the vowel symbol should appear first. (2 syllables) |
ᨼᩥᩅ᩠ᩈ᩺ | fuse /fiu/ |
1A3C 1A65 1A45 1A60 1A48 1A7A | f_iw/_s^r | |
ᩈᨲᩯᨾ᩠ᨷ᩺ | postage stamp /sa tɛːm/ |
1A48 1A32 1A6F 1A3E 1A60 1A37 1A7A | st_Em/B^r | (3 syllables) |
ᩈᩮᩥᩁ᩠᩺ᨷ | to serve /sɤːp/ |
1A48 1A6E 1A65 1A41 1A7A 1A60 1A37 | s_e_i_r^r/B | Compare the placement of RA HAAM with the previous word. The same contrast may be seen on p155 of the 'big blue book'. (2 syllables) |
These examples are all taken from Graphic Blends at SEAsite. The pronunciations given are Tai Lü.
Text | Meaning and Pronunciation | Encoding | Hacked via ASCII | Remarks |
---|---|---|---|---|
ᨴᩢ᩵ᩗᩣ | all /taŋ laːi/ |
1A34 1A62 1A75 1A57 1A63 | da1Vl+A | This word, in some of its various forms, seems to be the only word containing U+1A57 TAI THAM CONSONANT SIGN LA TANG LAI. I withdraw my previous, surprised, reading of the word shown as containing NGA as the base consonant. |
ᨡᨶ᩠ᨵᩣ | spell (magic) /kʰan tʰaː/ |
1A21 1A36 1A60 1A35 1A63 | khn/dhA | (2 syllables, first a single character) |
ᨣ᩠᩶ᨯᩦ | okay /kɔː diː/ |
1A23 1A76 1A60 1A2F 1A66 00A0 | g2/DI | A non-breaking space has been appended to avoid truncation. A sophisticated font would slide the vowel under the tone mark. |
ᨷ᩠᩶ᨾᩣ | to not come /bau maː/ |
1A37 1A76 1A60 1A3E 1A63 | B2/mA | |
ᨷ᩠᩶ᨾᩣ | 1A37 1A60 1A76 1A3E 1A63 | B/2mA | Same again, but normalised. | |
ᨷ᩠᩶ᨯᩣ᩠ᨿ | to not have /bau daːi/ |
1A37 1A76 1A60 1A2F 1A63 1A60 1A3F | B2/DA/_y | |
ᨧᩢ᩠ᩅᩤ | How big an area? /tsak vaː/ |
1A27 1A62 1A60 1A45 1A64 | ca/w^A | |
ᩈᩮ᩠ᩓ᩠ᩅ | deceased /seː lɛu/ |
1A48 1A6E 1A60 1A53 1A60 1A45 | s_e/lE/w | |
ᨴᩯ᩠ᨶᩳ | Really, is that true? /tɛː nɔː/ |
1A34 1A6F 1A60 1A36 1A73 | d_E/n^O | |
ᩓ᩠ᨾᩣ | to look this way /lɛ maː/ |
1A53 1A60 1A3E 1A63 | lE/mA | |
ᨠᩮ᩠ᩈᩣ | hair /keː saː/ |
1A20 1A6E 1A60 1A48 1A63 | k_e/_sA | |
ᨻᩱ᩠ᨾᩣ | to come and go /pai maː/ |
1A3B 1A71 1A60 1A3E 1A63 | bai/mA | |
ᩈᩮ᩠ᩅ᩶ᩤ | if /seː vaː/ |
1A48 1A6E 1A60 1A45 1A76 1A64 | s_e/w2^A | |
ᩈᩮ᩠ᩅᩤ᩶ | 1A48 1A6E 1A60 1A45 1A64 1A76 | s_e/w^A2 | Tone mark position not as in the proposals. | |
ᩅᩮ᩠ᩃᩣ | time /veː laː/ |
1A45 1A6E 1A60 1A43 1A63 | w_e/_lA | Also in Apiradee p49 |
ᨵᩤ᩠ᨲᩩ | physical body /tʰaː tuʔ/ |
1A35 1A64 1A60 1A32 1A69 | dh^A/tu | The vowel on the final consonant is inescapable - there is no way of rewriting the orthographic syllable to escape the combination. |
ᨩ᩠ᩓ | in conclusion /tsălɛː/ |
1A29 1A60 1A53 | j/lE | |
ᨻᩭ᩠ᩅ᩻ᩣ | because /pɔi vaː/ |
1A3B 1A6D 1A60 1A45 1A7B 1A63 | boi/w"A | The MAI SAM tags the WA as starting a chained syllable. The spelling presumes that a font can decide that the subscript WA goes to the left of the MAI KOY. |
ᨻᩭ᩠᩻ᩅᩣ | 1A3B 1A6D 1A7B 1A60 1A45 1A63 | boi"/wA | A purely visual placement of MAI SAM. | |
ᨻᩭ᩠᩻ᩅᩣ | 1A3B 1A6D 1A60 1A7B 1A45 1A63 | boi/"wA | Normalised form of the above. | |
ᩈᩫ᩠ᨦᩣ᩠ᨶ | world /suŋ saːn/ |
1A48 1A6B 1A60 1A26 1A63 1A60 1A36 | so/GA/n |
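The rows above marked as normalised differ from their neighbours only in the relative order of SAKOT (U+1A60) and a mark above. This follows from canonical reordering: SAKOT has canonical combining class 9, whereas the tone marks and MAI SAM have class 230, so normalisation moves SAKOT in front of them. The following is a minimal sketch using only Python's standard unicodedata module; the helper names from_hex, to_hex and normalise are my own inventions for illustration and are not part of this page's tooling.

```python
import unicodedata

def from_hex(encoding: str) -> str:
    """Turn an 'Encoding' cell such as '1A37 1A76 1A60' into a string."""
    return "".join(chr(int(cp, 16)) for cp in encoding.split())

def to_hex(text: str) -> str:
    """Turn a string back into space-separated upper-case hex code points."""
    return " ".join(f"{ord(ch):04X}" for ch in text)

def normalise(encoding: str) -> str:
    """Return the NFC form of an encoding, again as hex (hypothetical helper)."""
    return to_hex(unicodedata.normalize("NFC", from_hex(encoding)))

# Combining classes that drive the reordering.
print(unicodedata.combining("\u1A60"))  # SAKOT
print(unicodedata.combining("\u1A76"))  # TONE-2

# Tone mark typed before SAKOT, as in the 'to not come' row.
print(normalise("1A37 1A76 1A60 1A3E 1A63"))
# MAI SAM typed before SAKOT, as in the 'purely visual placement' row.
print(normalise("1A3B 1A6D 1A7B 1A60 1A45 1A63"))
```

A renderer that normalises its input will therefore never see the tone mark or MAI SAM before SAKOT, whatever the typist intended.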
These words are taken from the MA thesis 'Development of Tai Lue Scripts and Orthography' by Apiradee Techasiriwan (อภิรดี เตชะศิริวรรณ). The pronunciations given are Tai Lü. Comparative material from elsewhere is highlighted in yellow.
Text | Meaning | Encoding | Hacked via ASCII | Remarks |
---|---|---|---|---|
ᨻᩬᩳ᩵ | father /pɔː/ |
1A3B 1A6C 1A73 1A75 | bVO^O1 | p3. Vowel combination not listed above. Spelling is archaic. |
ᨻᩳᩬ᩵ | 1A3B 1A73 1A6C 1A75 | b^OVO1 | USE vowel ordering. | |
ᩈᨷ᩷ᩣ᩠ᨿ | content, well /săbaːi/ |
1A48 1A37 1A77 1A63 1A60 1A3F | sB3A/_y | p3. Rare example of a word with this tone mark. (2 syllables, first is a single character.) |
ᩈᨷᩣ᩠᩷ᨿ | 1A48 1A37 1A63 1A77 1A60 1A3F | sBA3/_y | USE tone positioning. | |
ᩅ᩠ᨿᩙ | city /weŋ/ |
1A45 1A60 1A3F 1A59 | w/_y^G | p4. |
ᨣᩪ᩺ | person /kun/ |
1A23 1A6A 1A7A | gU^r | p4. Unetymological, phonetic spelling. The mark above is serving as a final consonant, not a cancellation mark. |
ᨣ᩺ᩪ | 1A23 1A7A 1A6A | g^rU | USE ordering as vowels. | |
᪁᪂ ᨻᩢ᩠ᨶ᩻ᩣ | Sipsongpanna /sip sɔːŋ pan naː/ |
1A81 1A82 00A0 1A3B 1A62 1A60 1A36 1A7B 1A63 | 1P2P ba/n"A | p10. (Number precedes syllable). Example of mai sam marking a double-acting consonant. |
᪁᪂ ᨻᩢ᩠᩻ᨶᩣ | 1A81 1A82 00A0 1A3B 1A62 1A7B 1A60 1A36 1A63 | 1P2P ba"/nA | Best-looking hack for USE compliance. | |
ᨻᩱ᩻ᩣ᩠ᨿ | to go to the location /pai paːi/ |
1A3B 1A71 1A7B 1A63 1A60 1A3F | bai"A/_y | p47. |
ᨻᩱᩣ᩠᩻ᨿ | 1A3B 1A71 1A63 1A7B 1A60 1A3F | baiA"/_y | Best-looking hack for USE compliance | |
ᨻᩱᩣ᩠᩻ᨿ | 1A3B 1A71 1A63 1A60 1A7B 1A3F | baiA/"_y | Normalisation of the above. | |
ᨩ᩠ᨿᩙᨲᩩᩴ | Kengtung /tseŋ tuŋ/ |
1A29 1A60 1A3F 1A59 1A32 1A69 1A74 | j/_y^GtuM | p53. (2 syllables) Possibly the Chengtung on the Vietnamese border. |
ᩅᨲᩛᩩ | matter /wat tʰu/ |
1A45 1A32 1A5B 1A69 | wtVbu | p49. U+1A5B represents subscript HIGH THA rather than high RATHA. This is an issue for a font's repertoire of conjuncts. |
ᩅᨲ᩠ᨳᩩ | matter /wat tʰu/ |
1A45 1A32 1A60 1A33 1A69 | wt/thu | The Northern Thai writing of the above. Perhaps this should be rendered as the above when the language is Tai Lü or Lao. |
ᨯ᩠ᨿᩴ | one /deu/ |
1A2F 1A60 1A3F 1A74 | D/_yM | p53. Assuming the word has TAI THAM SIGN MAI KANG rather than unencoded *TAI THAM CONSONANT SIGN FINAL WA. |
ᩉ᩠ᨶᩦᩢ᩶ | debt /niː/ |
1A49 1A60 1A36 1A66 1A62 1A76 | h/nIa2 | p57. |
ᩁᩮᩂ᩠ᨠ | auspicious occasion /hɤːk/ |
1A41 1A6E 1A42 1A60 1A20 | r_e_R/k | p79. |
ᩁ᩠ᨿ᩺ | to learn /heːn/ |
1A41 1A60 1A3F 1A7A | r/_y^r | p118. |
The word typically meaning 'and...not' or 'and...then' may be written with a chained syllable, and this may present challenges to renderers. The form of the letter representing /b/ in a chained syllable presented an encoding challenge. N3207R proposed using the sequence <SAKOT, BA> for it, and using <SAKOT, HIGH PA> for the subscript form corresponding to both BA (common) and HIGH PA (extremely rare) in its rôle as a final (Thai sakot) consonant. During the ISO process, a new character was introduced instead for the special form, SIGN BA, and it is widely assumed that <SAKOT, BA> represents the usual subscript form corresponding to BA, both as a sakot consonant and in the Pali /mp/ and /pp/ intervocalic clusters.
When syllables are chained, shared vowel symbols are not repeated. This leads to ambiguity as to which symbol is dropped.
All the spellings in the table below represent the same careful pronunciation in Northern Thai, namely /kɔː bɔː/. The Tai Lü forms are written with different marks and pronounced with different vowels, but use the same two consonant forms in the stack.
Text | Meaning | Encoding | Hacked via ASCII | Remarks |
---|---|---|---|---|
ᨣᩴᨷᩴ᩵ | and...not, then...not | 1A23 1A74 1A37 1A74 1A75 | gMBM1 | Full form - 2 syllables, and arguably 2 words. |
ᨣᩴᨷᩴ | do. | 1A23 1A74 1A37 1A74 | gMBM | Univerbated form in MFL (2 syllables) |
ᨣᩝᩴ᩵ | do. | 1A23 1A5D 1A74 1A75 | gVBM1 | First mai kang dropped. |
ᨣᩴᩝ᩵ | do. | 1A23 1A74 1A5D 1A75 | gMVB1 | Second mai kang dropped. |
ᨣᩝᩴ | do. | 1A23 1A5D 1A74 | gVBM | First mai kang dropped. |
ᨣᩴᩝ | do. | 1A23 1A74 1A5D | gMVB | Second mai kang dropped. |
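The ambiguity over which mai kang is dropped can be read straight off the encodings in the table: what matters is whether MAI KANG (U+1A74) comes before or after SIGN BA (U+1A5D). The following is a small sketch of that check, assuming only Python's standard unicodedata module; the function name describe and its return strings are hypothetical and chosen to echo the Remarks column.

```python
import unicodedata

MAI_KANG = "\u1A74"  # TAI THAM SIGN MAI KANG
SIGN_BA = "\u1A5D"   # TAI THAM CONSONANT SIGN BA

def from_hex(encoding: str) -> str:
    """Turn an 'Encoding' cell into a string."""
    return "".join(chr(int(cp, 16)) for cp in encoding.split())

def describe(encoding: str) -> str:
    """Say which mai kang survives in a chained spelling (illustrative only)."""
    text = from_hex(encoding)
    if SIGN_BA not in text:
        return "full form"
    before, after = text.split(SIGN_BA, 1)
    if MAI_KANG in after and MAI_KANG not in before:
        return "first mai kang dropped"
    if MAI_KANG in before and MAI_KANG not in after:
        return "second mai kang dropped"
    return "unclear"

for cell in ("1A23 1A74 1A37 1A74 1A75",
             "1A23 1A5D 1A74 1A75",
             "1A23 1A74 1A5D 1A75"):
    names = [unicodedata.name(ch) for ch in from_hex(cell)]
    print(cell, "->", describe(cell), names)
```

The two contracted spellings contain exactly the same characters, so only their order tells a renderer (or a spell-checker) which analysis the writer had in mind.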
These words behave slightly oddly.
Text | Meaning and Pronunciation | Encoding | Hacked via ASCII | Remarks |
---|---|---|---|---|
ᩓᩯ | very much /lɛː/ |
1A53 1A6F | lE_E | Redundant vowel mark |
ᩐᩣ | to take /ʔau/ |
1A50 1A63 | qqUA | Vowel on independent vowel |
ᩐ᩵ᩣ | very hot /ʔau/ |
1A50 1A75 1A63 | qqU1A | Vowel and tone mark on independent vowel |
ᩐᩣ᩵ | 1A50 1A63 1A75 | qqUA1 | USE-compliant order | |
ᨯᩪᩕᩣ | listen to me /duː haː/ |
1A2F 1A6A 1A55 1A63 | DUVrA | Medial consonant between explicit vowels |
ᨯᩮᩬᩥᩁᨹᩫᩖᨣᩩᨱ᩺ | March /dɯan pʰon laʔ kun/ |
1A2F 1A6E 1A6C 1A65 1A41 1A39 1A6B 1A56 1A23 1A69 1A31 1A7A | D_eVO_i_rphoVlguN^r | NTDPLM p259. Double-acting medial consonant with implicit vowel after it. (3 syllables) |
ᨯᩮᩥᩬᩁᨹᩫᩖᨣᩩᨱ᩺ | 1A2F 1A6E 1A65 1A6C 1A41 1A39 1A6B 1A56 1A23 1A69 1A31 1A7A | D_e_iVO_rphoVlguN^r | USE-compliant vowel ordering | |
ᨻᩣᨷᩰᩖ | Pabol (sic) /paː boːn/ |
1A3B 1A63 1A37 1A70 1A56 | bABOVl | A mistake for Spanish Pablo seen on Wikipedia, but in light of the above a renderer should render it as intended. |
ᨶ᩶ᩭ | little /nɔːi/ |
1A36 1A76 1A6D | n2oi | Tai Khün spelling. |
ᨶᩭ᩶ | 1A36 1A6D 1A76 | noi2 | USE-compliant tone mark sequencing. | |
ᩉᩖ᩠ᩅᨦ | big /luaŋ/ |
1A49 1A56 1A60 1A45 1A26 | hVl/wG | Medial consonant in middle of stack. The proposal classified the final consonant of the stack as a 'medial vowel'. (2 syllables, second a single character) |
ᩉᩖ᩠ᩅᩣ | iron /lwaː/ |
1A49 1A56 1A60 1A45 1A63 | hVl/wA | Medial consonant in middle of stack. In this case, the WA is very much a consonant. |
ᨻᩕ᩠ᨿᩮᩡ | a type of sound /pʰiaʔ/ |
1A3B 1A55 1A60 1A3F 1A6E 1A61 | bVr/_y_eH | Preposed medial consonant in middle of stack along with a preposed vowel. |
ᨠᩩ᩶ᩣ᩠ᨶ᩠ᨦ | to prosper /kaːn kuŋ/ |
1A20 1A69 1A76 1A63 1A60 1A36 1A60 1A26 | ku2A/n/G | The first word in the MFL! Note that there are two final consonants. The SIGN AA prevents a phonetic spelling. |
ᨠᩩᩣ᩠᩶ᨶ᩠ᨦ | 1A20 1A69 1A63 1A76 1A60 1A36 1A60 1A26 | kuA2/n/G | USE-compliant tone mark placement. | |
ᩋᩢ᩠ᨭᩛ | a satang coin /ʔat/ |
1A4B 1A62 1A60 1A2D 1A5B | qa/_TVb | Two consonants in final consonant position (3 consonants in total) |
ᩆᩢᨠ᩠ᨯᩥ᩺ | rank /sak/ |
1A46 1A62 1A20 1A60 1A2F 1A65 1A7A | shak/D_i^r | Consonant-killer also killing explicit vowel above (2 syllables) |
ᩆᩢᨠ᩠ᨯᩥ᩼ | rank /sak/ |
1A46 1A62 1A20 1A60 1A2F 1A65 1A7C | shak/D_iX | Same again, but with KARAN instead of RA HAAM. Some people are using KARAN in Northern Thai instead of RA HAAM! (2 syllables) |
ᨾᩉᩣᩉᩥᨦ᩠ᨣᩩ᩺ | giant fennel /ma haː hiŋ/ |
1A3E 1A49 1A63 1A49 1A65 1A26 1A60 1A23 1A69 1A7A | m_hA_h_iG/gu^r | Consonant-killer also killing explicit vowel below (4 syllables) |
ᨾᩉᩣᩉᩥᨦ᩠ᨣ᩺ᩩ | 1A3E 1A49 1A63 1A49 1A65 1A26 1A60 1A23 1A7A 1A69 | m_hA_h_iG/g^ru | USE then requires that the killer precede the killed vowel. | |
ᨾᩉᩣᩉᩥᨦ᩠ᨣᩩ᩼ | 1A3E 1A49 1A63 1A49 1A65 1A26 1A60 1A23 1A69 1A7C | m_hA_h_iG/guX | Same again, but with KARAN. (4 syllables) | |
ᩆᩣᩈ᩠ᨲᩕ᩺ | science /saːt/ |
1A46 1A63 1A48 1A60 1A32 1A55 1A7A | shA_s/tVr^r | Consonant-killer also killing medial consonant. NT spelling. (2 syllables) |
ᩈᩣᩈ᩠ᨲᩕ᩼ | science /saːt/ |
1A48 1A63 1A48 1A60 1A32 1A55 1A7C | sA_s/tVrX | Consonant-killer also killing medial consonant. Tai Khün spelling. (2 syllables) |
ᩁᩪ᩠ᨷ | image /huːp/ |
1A41 1A6A 1A60 1A37 | rU/B | This spelling is archaic in Northern Thailand (but current in Tai Khün) |
ᨻᩦ᩠᩵ᨶᩬ᩶ᨦ | relatives /piː nɔːŋ/ |
1A3B 1A66 1A75 1A60 1A36 1A6C 1A76 1A26 | bI1/nVO2G | (2 syllables - second is a single character) |
ᩃᩢᩪ | child (progeny) /luːk/ |
1A43 1A62 1A6A | laU | USE demands that mai kak (see next) precede most of the vowels that it phonetically follows. |
ᩃᩪᩢ | 1A43 1A6A 1A62 | lUa | MAI SAT can serve as a final consonant, /k/. This leads to yet more formal vowel combinations. | |
ᨸᩢᩣ | mouth /paːk/ |
1A38 1A62 1A63 | paA | |
ᨯᩬᩢ | flower /dɔːk/ |
1A2F 1A6C 1A62 | DVOa | |
ᨯᩢᩬ | 1A2F 1A62 1A6C | DaVO | USE-compliant ordering. | |
ᨯᩢᩬᩡ | 1A2F 1A62 1A6C 1A61 | DaVOH | USE-compliant ordering. | |
ᨯᩬᩢᩡ | 1A2F 1A6C 1A62 1A61 | DVOaH | MAI SAT can even be reinforced by SIGN A. | |
ᨻ᩠ᩅᩢᩡ | group /puak/ |
1A3B 1A60 1A45 1A62 1A61 | b/waH | |
ᨲᩯ᩠ᨶᩬᩴ᩵ | wasp, hornet /tɛːn tɔː/ |
1A32 1A6F 1A60 1A36 1A6C 1A74 1A75 | t_E/nVOM1 | A single orthographic syllable. |
ᨲᩬᩴ᩵͏ᩯ᩠ᨶ | wasp, hornet /tɔː tɛːn/ |
1A32 1A6C 1A74 1A75 034F 1A6F 1A60 1A36 | tVOM1͏_E/n | Should normally be visually identical with the above - the font may be too crude. However, when font colouring is supported, the vowel below should be coloured differently in the Da Lekh Si font; that font is intended to reveal the order of characters. |
ᨲᩬᩴ᩵ᩯ᩠ᨶ | 1A32 1A6C 1A74 1A75 1A6F 1A60 1A36 | tVOM1_E/n | Would it be legitimate for this to render differently to the above? | |
ᩈ᩠ᨶᩫ᩻ | street /sănon/ |
1A48 1A60 1A36 1A6B 1A7B | s/no" | The mai sam represents the final consonant in addition to the epenthetic vowel. |
ᨠᨾᩛᩦ | scripture /kam piː/ |
1A20 1A3E 1A5B 1A66 | kmVbI | The surprise is that U+1A5B had InSC=Consonant_Final until Unicode 10.0. (2 syllables - the first is a single consonant in the first example.) |
ᨶᩥᨻᩛᩣ᩠ᨶ | nirvana /nip paːn/ |
1A36 1A65 1A3B 1A5B 1A63 1A60 1A36 | n_ibVbA/n | |
ᨵᨾᩜᩥᨠ | saintly /tʰam miʔ kaʔ/ |
1A35 1A3E 1A5C 1A65 1A20 | dhmVm_ik | Chiengtung p166. It has 3 syllables - the second is of interest. It may show a problem with U+1A5C having InSC=Consonant_Final until Unicode 10.0. |
ᩈᨵᩩ᩠ᨷ | stupa(?) /sătʰup/ |
1A48 1A35 1A69 1A60 1A37 | sdhu/B | Chiengtung p166. (2 syllables - the first is a single letter.) This shows the issue with placement of the vowel and 'sakot' consonant also applies to this explicit vowel. |
ᩋᩣᨴᩥᨲ᩠ᨲᨵᨾᩜᩮᩣ | Adittadhammo Pali <Ādittadhammo> |
1A4B 1A63 1A34 1A65 1A32 1A60 1A32 1A35 1A3E 1A5C 1A6E 1A63 | qAd_it/tdhmVm_eA | Chiengtung p264. (5 syllables) |
ᨬᩣᨱᨵᨾᩜᩮᩣ | Nyanadhammo Pali <Ñāṇadhammo> |
1A2C 1A63 1A31 1A35 1A3E 1A5C 1A6E 1A63 | nyANdhmVm_eA | Chiengtung p238. The individual referred to is not the one hyperlinked to. (4 syllables) |
ᩅᩥᩈᩮ᩠ᩈ | special /wiʔ seːt/ |
1A45 1A65 1A48 1A6E 1A60 1A48 | w_i_s_e/_s | Note the lack of a ligature. (2 syllables) |
ᨢ᩶ᩣ | slave /kʰaː/ |
1A22 1A76 1A63 | kx2A | Same character order as in Thai and Lao! |
ᨢᩣ᩶ | 1A22 1A63 1A76 | kxA2 | But not if the USE prevails! | |
ᩈᩣᩈᨶᩣ | religion /saː saʔ naː/ |
1A48 1A63 1A48 1A36 1A63 | sA_snA | Full (5 chars) and contracted (7 chars) forms. (3 and 2 syllables respectively) |
ᩈᩣᩈ᩠ᨶ᩻ᩣ | 1A48 1A63 1A48 1A60 1A36 1A7B 1A63 | sA_s/n"A | ||
ᩈ᩠ᨶ᩻ᩮᩢ᩶ᩣ | javelin /sănau/ |
1A48 1A60 1A36 1A7B 1A6E 1A62 1A76 1A63 | s/n"_ea2A | |
ᨲᩦ͏ᩣ᩠ᨿ | to beat to death /tiː taːi/ |
1A32 1A66 034F 1A63 1A60 1A3F | tI͏A/_y | Uses CGJ as an invisible MAI SAM to stand for the duplicated consonant. |
ᩋᩮᩰᩣᨽᩣᩈ | to illuminate /ʔoː pʰaː saʔ/ |
1A4B 1A6E 1A70 1A63 1A3D 1A63 1A48 | q_eOAbhA_s | MFL p919. While the spelling rules call for either just U+1A70 SIGN OO or just the combination of <U+1A6E SIGN E, U+1A63 SIGN AA>, this might conceivably be a private lexicographer's notation indicating that both occur that happened to escape into the published work. The graphical order, left-to-right, in the MFL is SIGN OO, SIGN E, LETTER A, SIGN AA. The 'hacked via ASCII' rendering is wrong. (3 syllables - first is of interest.) |
ᩉ᩠ᨾ᩵ᩣᩴ᩻ | Grub's up! /mam mam/ |
1A49 1A60 1A3E 1A75 1A63 1A74 1A7B | h/m1AM" | |
ᩉ᩠ᨾᩣᩴ᩵᩻ | 1A49 1A60 1A3E 1A63 1A74 1A75 1A7B | h/mAM1" | It is not clear whether a USE-compliant form should have MAI KANG or the tone mark first. | |
ᩉ᩠ᨾᩣ᩵ᩴ᩻ | 1A49 1A60 1A3E 1A63 1A75 1A74 1A7B | h/mA1M" | ||
ᩃᩮᩞ | trickery /leːs/ | 1A43 1A6E 1A5E | l_eVs | Tai Khün spelling, cited in N3384 |
ᩋᨶᩣᨳᨷᩥᨱ᩠ᨯᩥᨠᩈᩞ | Anathapindika's Pali <Anāthapiṇḍikassa> |
1A4B 1A36 1A63 1A33 1A37 1A65 1A31 1A60 1A2F 1A65 1A20 1A48 1A5E | qnAthB_iN/D_ik_sVs | A rare spelling of the Pali masculine genitive singular ending. Note that SIGN SA starts the final phonetic syllable. (7 syllables - the last one is of interest.) |
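Several rows above pair a spelling typed as seen with a 'USE-compliant' one in which the tone mark follows the vowel sign instead of preceding it. The sketch below reproduces just that one rearrangement for those rows. It is emphatically not the Universal Shaping Engine's cluster model: the function name tone_after_vowel and the two character ranges are my own simplifying assumptions, and the MAI SAT, RA HAAM and vowel-ordering rows are not covered.

```python
# Assumed ranges for this illustration only.
TONES = {chr(cp) for cp in range(0x1A75, 0x1A7A)}   # U+1A75..U+1A79
VOWELS = {chr(cp) for cp in range(0x1A61, 0x1A74)}  # U+1A61..U+1A73

def tone_after_vowel(encoding: str) -> str:
    """Move each tone mark past the vowel signs that immediately follow it."""
    chars = [chr(int(cp, 16)) for cp in encoding.split()]
    i = 0
    while i + 1 < len(chars):
        if chars[i] in TONES and chars[i + 1] in VOWELS:
            # Swap; the tone is now at i + 1 and is re-examined next time round.
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
        i += 1
    return " ".join(f"{ord(ch):04X}" for ch in chars)

print(tone_after_vowel("1A50 1A75 1A63"))  # 'very hot'
print(tone_after_vowel("1A36 1A76 1A6D"))  # 'little'
print(tone_after_vowel("1A22 1A76 1A63"))  # 'slave'
```

Running it on the as-seen encodings of 'very hot', 'little', 'slave' and 'to prosper' yields the encodings given in the corresponding USE rows, which is all the sketch is meant to show.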
These examples are intended to reveal the behaviour of the rendering system, rather than to be clear pass or fail tests.
Text | Meaning and Pronunciation | Encoding | Hacked via ASCII | Remarks | ||
---|---|---|---|---|---|---|
Interpretation | ||||||
ᨠ᩠ᨷ | (no meaning) | 1A20 1A60 1A37 | k/B | Interpretation of <SAKOT, BA> and <SAKOT, HIGH PA> respectively. This looks at font behaviour rather than at layout engine behaviour. | ||
ᨠ᩠ᨸ | (no meaning) | 1A20 1A60 1A38 | k/p | |||
| ||||||
ᩈᨾᩮᩣᨴ᩠ᨴᨾᩣᨶᩮᩉᩥ | with (things) on friendly terms
Pali <samoddamānehi> |
1A48 1A3E 1A6E 1A63 1A34 1A60 1A34 1A3E 00AD 1A63 1A36 1A6E 1A49 1A65 | sm_eAd/dmAn_e_h_i |
Split using a soft hyphen. (Many syllables.) The text occurs with and without dingbats (U+1AA5) so that one can see whether an inactive soft hyphen affects it. |
||
᪥᪥᪥ᩈᨾᩮᩣᨴ᩠ᨴᨾᩣᨶᩮᩉᩥ | with (things) on friendly terms
Pali <samoddamānehi> |
1AA5 1AA5 1AA5 1A48 1A3E 1A6E 1A63 1A34 1A60 1A34 1A3E 00AD 1A63 1A36 1A6E 1A49 1A65 | ᪥᪥᪥_sm_eAd/dmAn_e_h_i | |||
ᩈᨾᩮᩣᨴ᩠ᨴᨾᩣᨶᩮᩉᩥ | with (things) on friendly terms Pali <samoddamānehi> |
1A48 1A3E 1A6E 1A63 1A34 1A60 1A34 1A3E 200B 1A63 1A36 1A6E 1A49 1A65 | sm_eAd/dmAn_e_h_i | Split using zero width space - this uses the presentation-oriented view that ZWSP is simply a soft hyphen without visible rendering. This test is uninformative if the renderer refuses to make the break. See above for dingbats. (Many syllables) | ||
᪥᪥ᩈᨾᩮᩣᨴ᩠ᨴᨾᩣᨶᩮᩉᩥ | with (things) on friendly terms
Pali <samoddamānehi> |
1AA5 1AA5 1A48 1A3E 1A6E 1A63 1A34 1A60 1A34 1A3E 200B 1A63 1A36 1A6E 1A49 1A65 | ᪥᪥_sm_eAd/dmAn_e_h_i | |||
Baseless Marks and Non-alphabetic Bases | ||||||
ᩣ | (no meaning) | 1A63 | A | Bare vowel symbol | ||
ᩣ | (no meaning) | 00A0 1A63 | A | Vowel symbol 'on' NBSP | ||
ᩣ | (no meaning) | 00A0 200D 1A63 | A | Vowel symbol 'on' NBSP with ZWJ. | ||
ᨷ ◌ᩮ | N/A | 1A37 0020 25CC 1A6E | B ◌_e | And now discourage the use of multiple script runs by the renderer. | ||
Dependent Consonant Above and Tone Mark - What Chooses the Order? | ||||||
ᨾ᩠ᩅ᩺᩵ | to be fun Khün /mon/ |
1A3E 1A60 1A45 1A7A 1A75 | m/w^r1 | Typed as seen. Da Lekh fonts place the glyphs side by side, but the order is as in the Tai Khün manuscript. To be precise, it is an extract from a 1949 edition of the Khemarat Weekly, reproduced in L2/17-120 Figure 4. | ||
ᨾ᩠ᩅ᩵᩺ | to be fun Khün /mon/ |
1A3E 1A60 1A45 1A75 1A7A | m/w1^r | Typed with tone mark first. Da Lekh accepts the order, just as Thai does not rearrange THANTHAKHAT (or vowels above) with tone marks. The Da Lekh rendering does not match the Tai Khün manuscript. | ||
ᨾ᩠ᩅ᩺᩵᩻ | to be lots of fun Khün /mon mon/ |
1A3E 1A60 1A45 1A7A 1A75 1A7B | m/w^r1" | Not actually attested, but grammatical derivatives of the above. | ||
ᨾ᩠ᩅ᩵᩺᩻ | to be lots of fun Khün /mon mon/ |
1A3E 1A60 1A45 1A75 1A7A 1A7B | m/w1^r" | |||
ᨣᩪ᩺᩻ | everyone Tai Lü /kun kun/ |
1A23 1A6A 1A7A 1A7B | gU^r" | Theoretical derivative of the unetymological, phonetic spelling of the word for person. The first mark above is serving as a final consonant, not a cancellation mark. | ||
|
||||||
ᨭᩮ᩠ᨮ ᨭᩛᩮ | (no meaning) /te:t/ /-t tʰe:/ |
1A2D 1A6E 1A60 1A2E 0020 1A2D 1A5B 1A6E | T_e/Th _TVb_e | Should be different. (2 syllables) | ||
ᨱᩮ᩠ᨮ ᨱᩛᩮ | (no meaning) /ne:t/ /-n tʰe:/ |
1A31 1A6E 1A60 1A2E 0020 1A31 1A5B 1A6E | N_e/Th NVb_e | Should be different. (2 syllables) | ||
ᨲᩮ᩠ᨮ ᨲᩛᩮ | (no meaning) /te:t/ /-t tʰe:/ |
1A32 1A6E 1A60 1A2E 0020 1A32 1A5B 1A6E | t_e/Th tVb_e | Should probably be different. (2 syllables) | ||
ᨻᩮ᩠ᨻ ᨻᩛᩮ | (no meaning) /pe:p/ /-p pe:/ |
1A3B 1A6E 1A60 1A3B 0020 1A3B 1A5B 1A6E | b_e/b bVb_e | Should be different. (2 syllables) | ||
ᨾᩮ᩠ᨻ ᨾᩛᩮ | (no meaning) /me:p/ /-m pe:/ |
1A3E 1A6E 1A60 1A3B 0020 1A3E 1A5B 1A6E | m_e/b mVb_e | Should be different. (2 syllables) | ||
ᨠᩮ᩠ᩁ ᨻᩕᩮ | (no meaning) /keːn/ /kʰe/ |
1A20 1A6E 1A60 1A41 0020 1A3B 1A55 1A6E | k_e/_r bVr_e | Should be different. (2 syllables) | ||
ᨠᩮ᩠ᩃ ᨠᩖᩮ | (no meaning) /keːn/ /keː/ |
1A20 1A6E 1A60 1A43 0020 1A20 1A56 1A6E | k_e/_l kVl_e | Should be different. (2 syllables) | ||
ᨠᩖᩮ ᨠ᩠ᩃᩮ | (no meaning) /keː/ /keː/ |
1A20 1A56 1A6E 0020 1A20 1A60 1A43 1A6E | kVl_e k/_l_e | However, those who don't use MEDIAL LA won't make a visual distinction to show the position of the vowel! (2 syllables) |
||
ᩈ᩠ᩈ ᩈᩞ ᩔ | (no meaning) | 1A48 1A60 1A48 0020 1A48 1A5E 0020 1A54 | s/_s _sVs ss | Should be different. (3 syllables) | ||
ᨾ᩠ᨾ ᨾᩜ | (no meaning) | 1A3E 1A60 1A3E 0020 1A3E 1A5C | m/m mVm | Should be different. (2 syllables) | ||
Behaviour of <SAKOT, NYA> | ||||||
ᨬᩮ᩠ᨬ ᨬ᩠ᨬᩮ | (no meaning) /ɲeːn/ /-n ɲeː/ |
1A2C 1A6E 1A60 1A2C 0020 1A2C 1A60 1A2C 1A6E | ny_e/ny ny/ny_e | Would ideally be different, but this may not be readily and robustly achievable. (2 syllables) | ||
ᨬᩮ᩠ᨬ ᨬ᩠ᨬᩮ | (no meaning) /ɲeːn/ /-n ɲeː/ |
1A2C 1A6E 200C 1A60 1A2C 0020 1A2C 1A60 1A2C 1A6E | ny_e/ny ny/ny_e | With ZWNJ (U+200C) inserted before the SAKOT, these should be different. (2 syllables) | ||
ᨱ᩠ᨬ ᨬ᩠ᨬ | (no meaning) | 1A31 1A60 1A2C 0020 1A2C 1A60 1A2C | N/ny ny/ny | Should these be different? (2 syllables) | ||
ᨱᩮ᩠ᨬ ᨱ᩠ᨬᩮ | (no meaning) /neːn/ /-n ɲeː/ |
1A31 1A6E 1A60 1A2C 0020 1A31 1A60 1A2C 1A6E | N_e/ny N/ny_e | Should these be different? (2 syllables) | ||
Marks from outside the Tai Tham Block | ||||||
ᩋᩦ๊ | (meaningless syllable in refrain of a song) /ʔiː/ |
1A4B 1A66 0E4A | qI3K | Thai mai tri and mai chattawa are found on tua mueang 'words' on p236 of the big blue book! Of course, these might just be the unencoded THAI-LAO TONES THREE and FOUR. In this particular case, a rendering issue might be alleviated by making the default positions of the tone marks higher than that of the vowels above. | ||
ᩋᩦ๋ | (meaningless syllable in refrain of a song) /ʔiː/ |
1A4B 1A66 0E4B | qI4K | |||
Language-Sensitive Forms (Browser Test?) | ||||||
ᩌᩣᩴ ᩌᩣᩴ | bran (written twice) /ham/ |
1A4C 1A63 1A74 0020 1A4C 1A63 1A74 | rhAM rhAM |
The top two rows are declared to be in Lao, and the second also has a corresponding style-setting lest the language setting be ignored. The initial consonant takes the form in the Da Lekh family of the consonant form used in that role in Laos and Northeast Thailand, namely , which is only subtly different from U+1A41 TAI THAM LETTER RA. So doing may be improper behaviour, but is seen in fonts. The mai kang should appear on the vowel U+1A63 TAI THAM VOWEL SIGN AA, its usual position outside Thailand. Of course, this won't happen if the font cannot be appropriate for such writing systems. At least one browser has failed to render the final stack properly when it has been the final glyph in the glyph stream; this is why the word is written twice. The bottom row is not marked for language, and shows the same word (and encoding). The Da Lekh font follows the more technically challenging Chiangmai style by default, with the MAI KANG on the consonant. (2 words, so 2 syllables!) |
||
ᩌᩣᩴ ᩌᩣᩴ | bran (written twice) /ham/ |
1A4C 1A63 1A74 0020 1A4C 1A63 1A74 | rhAM rhAM | |||
ᩌᩣᩴ ᩌᩣᩴ | bran (written twice) /ham/ |
1A4C 1A63 1A74 0020 1A4C 1A63 1A74 | rhAM rhAM | |||
Tone before Vowel! | ||||||
ᨣ᩠ᩅ᩵ᩢᩣ᩠ᨶ | when /kan waː/ | 1A23 1A60 1A45 1A75 1A62 1A63 1A60 1A36 | g/w1aA/n | p118 - MFL clearly has the tone as the first mark! It may be that these are just typing errors. There are two other examples of tone and then vowel in the dictionary, the same tone and vowel as here. | ||
ᨣ᩠ᩅ᩵ᩢᩣ | and say /kɔʔ waː/ | 1A23 1A60 1A45 1A75 1A62 1A63 | g/w1aA |
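Several of the tests in this section depend on characters that have no glyph of their own: soft hyphen, ZWSP, ZWNJ, ZWJ, NBSP and, elsewhere on the page, CGJ. When copying test strings around it is easy to lose them, so it can help to list them mechanically. The following is a minimal sketch using Python's standard unicodedata module; the function name invisible_characters and the EXTRA set are hypothetical, introduced only for this illustration.

```python
import unicodedata

# Characters of interest that are not in general category Cf:
# NO-BREAK SPACE (Zs) and COMBINING GRAPHEME JOINER (Mn).
EXTRA = {"\u00A0", "\u034F"}

def invisible_characters(encoding: str):
    """List (hex, name) for format-like characters in an 'Encoding' cell."""
    found = []
    for cp in encoding.split():
        ch = chr(int(cp, 16))
        if unicodedata.category(ch) == "Cf" or ch in EXTRA:
            found.append((cp, unicodedata.name(ch)))
    return found

# The soft-hyphen row, the ZWNJ row and the 'NBSP with ZWJ' row.
print(invisible_characters(
    "1A48 1A3E 1A6E 1A63 1A34 1A60 1A34 1A3E 00AD 1A63 1A36 1A6E 1A49 1A65"))
print(invisible_characters("1A2C 1A6E 200C 1A60 1A2C"))
print(invisible_characters("00A0 200D 1A63"))
```

Comparing this list for a copied string against the Encoding column is a quick way to confirm that a test still exercises what it claims to.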
These words presented problems, now overcome (Version 0.05), while I was developing the Da Lekh font to cope with the Universal Shaping Engine of mid-2016. (The solution is not entirely compliant with the Unicode standard - dotted circles in the input are sometimes deleted.) These are offered as an aid to font developers fighting unhelpful layout engines; they are not expected to help developers of the core layout engines.
Text | Meaning and Pronunciation | Encoding | Hacked via ASCII | Remarks |
---|---|---|---|---|
ᨠ᩠ᩃ᩻ᩬ᩵ᨾ | Cambodian /kălɔːm/ |
1A20 1A60 1A43 1A7B 1A6C 1A75 1A3E | k/_l"VO1m | p4 (2 syllables, second a single consonant) |
ᨠ᩠ᩃᩬ᩵᩻ᨾ | 1A20 1A60 1A43 1A6C 1A75 1A7B 1A3E | k/_lVO1"m | Hard to interpret encoding. | |
ᨠ᩠ᩃᩬ᩻᩵ᨾ | 1A20 1A60 1A43 1A6C 1A7B 1A75 1A3E | k/_lVO"1m | USE-compatible, with interpretable rendering specification. | |
ᨠᩕᩥ᩠᩵ᨦ | suspicious /kʰiŋ/ |
1A20 1A55 1A65 1A75 1A60 1A26 | kVr_i1/G | p15 |
ᨡᩮᩢ᩶ᩬᩣ᩠ᨦ | belongings /kʰau kʰɔːŋ/ |
1A21 1A6E 1A62 1A76 1A6C 1A63 1A60 1A26 | kh_ea2VOA/G | p101 - the one syllable form. The first form minimises the disruption to the pattern of first element followed by second element. The second spelling tries sticking in CGJ to advise that the ordering of the marks is not an error. The third spelling follows the principle that if the components cannot be concatenated (with deletion and addition of SAKOT or equivalent as appropriate), then the ordering should be based on the visual layout of the marks. |
ᨡᩮᩢ᩶͏ᩬᩣ᩠ᨦ | 1A21 1A6E 1A62 1A76 034F 1A6C 1A63 1A60 1A26 | kh_ea2͏VOA/G | ||
ᨡᩮᩬᩢ᩶ᩣ᩠ᨦ | 1A21 1A6E 1A6C 1A62 1A76 1A63 1A60 1A26 | kh_eVOa2A/G | ||
ᨡᩮᩢᩬᩣ᩠᩶ᨦ | 1A21 1A6E 1A62 1A6C 1A63 1A76 1A60 1A26 | kh_eaVOA2/G | USE (December 2021)-compatible rearrangement of the above - but the final consonant is still incompatible at 2021. | |
ᨦ᩠ᩅ᩶ᩣ᩻ ᨪᩰᩫ᩠᩶ᨦ᩻ | spastic /ŋwaː ŋwaː soːŋ soːŋ/ |
1A26 1A60 1A45 1A76 1A63 1A7B 0020 1A2A 1A70 1A6B 1A76 1A60 1A26 1A7B | G/w2A" jxOo2/G" | p168 (2 syllables) |
ᨦ᩠ᩅᩣ᩶᩻ ᨪᩰᩫ᩠᩶ᨦ᩻ | 1A26 1A60 1A45 1A63 1A76 1A7B 0020 1A2A 1A70 1A6B 1A76 1A60 1A26 1A7B | G/wA2" jxOo2/G" | Vowel and tone order adjusted to the USE as at December 2021. | |
ᨴᩯ᩠᩶ᩃ | truth to tell /tɛː lɛː/ |
1A34 1A6F 1A76 1A60 1A43 | d_E2/_l | p318. The first entry has the written vowel with the first consonant, the second with the second, and the third entry is the same as the second but normalised. |
ᨴ᩠᩶ᩃᩯ | 1A34 1A76 1A60 1A43 1A6F | d2/_l_E | ||
ᨴ᩠᩶ᩃᩯ | 1A34 1A60 1A76 1A43 1A6F | d/2_l_E | ||
ᨳᩮᩬᩥᩡ᩻ ᨳᩮᩥ᩠ᨠ᩻ | bruised /tʰɤʔ tʰɤʔ tʰɤːk tʰɤːk/ |
1A33 1A6E 1A6C 1A65 1A61 1A7B 0020 1A33 1A6E 1A65 1A60 1A20 1A7B | th_eVO_iH" th_e_i/k" | p314. (2 syllables) |
ᨳᩮᩥᩬᩡ᩻ ᨳᩮᩥ᩠ᨠ᩻ | 1A33 1A6E 1A65 1A6C 1A61 1A7B 0020 1A33 1A6E 1A65 1A60 1A20 1A7B | th_e_iVOH" th_e_i/k" | With vowel order of the USE as at December 2021 | |
ᨾᩉᩫᩖᨿᩰᨴᩤ | great army /maʔ hon yoː tʰaː/ |
1A3E 1A49 1A6B 1A56 1A3F 1A70 1A34 1A64 | m_hoVl_yOd^A | NTDPLM p511. |
Short Name | Full Reference |
---|---|
N3207R | Everson M., Hosken M. & Constable P. Revised proposal for encoding the Lanna script in the BMP of the UCS, ISO/IEC JTC1/SC2/WG2/N3207R, L2/07-007R |
MFL | Rungrueangsi, Udom (2004) [1991]. Lanna-Thai Dictionary, Princess Mother Version พจนานุกรมล้านนา ~ ไทย ฉบับแม่ฟ้าหลวง ᨻᨧᨶᩣᨶᩩᨠᩕᩫ᩠ᨾᩃ᩶ᩣ᩠ᨶᨶᩣ ~ ᨴᩱ᩠ᨿ ᨨᨷᩢ᩠ᨷᨾᩯ᩵ᨼ᩶ᩣᩉᩖ᩠ᩅᨦ [Photchananukrom Lanna ~ Thai, Chabap Maefa Luang] (in Thai) (Revision 1 ed.). Chiang Mai: Rongphim Ming Mueang (โรงพิมพ์มิ่งเมือง). ISBN 974-8359-03-4. |
big blue book | Wacharasat, Bunkhit (2003). Language of Mueang Lanna ᨽᩣᩈᩣᨾᩮᩬᩨᨦᩃ᩶ᩣ᩠ᨶᨶᩣ ภาษาเมืองล้านนา [Phasa Mueang Lanna] (in Thai). ISBN 974-85472-0-5 |
Apiradee | Techasiriwan, Apiradee อภิรดี เตชะศิริวรรณ. พัฒนาการของอักษรและอัขรวิธีในเอกสานไทลื้ [Patthanakan khong Akson lae Akhara Witi nai Ekasan Thai Lue] Development of Tai Lue Scripts and Orthography. MA Thesis, Chiangmai University (in Thai) |
NTDPLM | Arunrat Wichiankhiao et al. อรุณรัตน์ วิเชียรเขียว (1996). ᨻᨧᨶᩣᨶᩩᨠᩕᩫ᩠ᨾᩃ᩶ᩣ᩠ᨶᨶᩣᨨᨻᩕᩰᩬᩡᨣᩤᩴᨴᩦ᩵ᨷᩕᩤᨠᩫ᩠ᨭᨶᩱᨷᩱᩃᩣ᩠ᨶ พจนานุกรมศัพท์ล้านนาเฉพาะคำที่ปรากฏในใบลาน The Northern Thai Dictionary of Palm-Leaf Manuscripts. ISBN 974-7067-77-2 |
Chiengtung | Chieng Tung: Its Way of Life ᨡᩮᨾᩁᨭᩛᨶᨣᩬᩁᨩ᩠ᨿᨦᨲᩩᨦ [Khemarattha Nakon Cheng Tung] เขมรัฐนครดชียงตุง [Khemarat Nakhon Chiang Tung] (in Thai, Tai Khün, French and English) Chiang Mai: Wat Tha Kradas (วัดท่ากระดาษ) |
L2/17-120 | Wordingham J.R. Corrections to the Indic Syllabic Category for the Tai Tham Script, L2/17-120 |
N3384 | Hosken M. Tai Tham Subjoined Variants, ISO/IEC JTC1/SC2/WG2/N3384, L2/08-073 |
This is Version 2.12 of the web page, which has been written by Richard Wordingham.
Version | Date | Changes |
---|---|---|
1.0 | 14 June 2015 | Initial 'stable' (i.e. abandoned) version. Work had started on 27 February 2015, and there may be earlier versions around. |
1.1 | 25 September 2016 | Converted from XML to HTML (by stripping off XML header) for new website. |
2.0 | 25 October 2016 | Added option to dynamically switch fonts - free font Da Lekh Seri for exposure to rendering engine foibles, and encumbered font Da Lekh for resistance. Both fonts are open source, but I created all the inked glyphs for the Da Lekh Seri font. ('Seri' means beholden to no one.) Completed references, and improved, pruned and extended the examples. |
2.1 | 26 October 2016 | Corrected typos. Started testing of display bases. |
2.2 | 7 November 2016 | Fixed transliterator bug. Added examples from testing of Da Lekh font work-arounds. Corrected more typos. Tested language sensitivity. |
2.3 | 14 November 2016 | Added styles to force Lao forms. Reorganised 'test and tell'. Added one new test word, for mai kam followed by mai sam. |
2.4 | 14 April 2017 | Improved 'bran' bug alert. Added 'A Tai Tham KH' font with and without ccmp enabled. The radio buttons are hidden, and anyone enabling them would also have to supply the font. Added test for double-acting MEDIAL LA. |
2.5 | 8 July 2017 |
Added test for tone plus SIGN OY. Added colour fonts to show phonetic position of subscripts relative to vowel. Added "onclick" for radio buttons. |
2.6 | 22 February 2018 |
Added test cases for karan on vowels and medial la following preposed vowel. |
2.7 | 12 May 2018 |
Added three new fonts - 'A Tai Tham KH', 'Hariphunchai' and my extension of the latter, 'Lamphun'. Added a few more examples of the ᨶᩣ ligature. Added query as to when ᨲᩬᩴ᩵͏ᩯ᩠ᨶ should render properly. Colour for spell-checking is now a reality. |
2.8 | 17 February 2019 |
Added test cases for ᩃᩮᩞ and ᨻᩕ᩠ᨿᩮᩡ. |
2.9 | 9 December 2021 |
Corrected feature ss99 to ss19. Clarified rôle of Da Lekh Si fonts. Updated repertoire of Da Lekh Seri fonts. Added play area for readers to try the fonts out. Massively duplicated navigation bars to avoid need to scroll to top or bottom. Made the difference between strings that shall be rendered and other possible sequences clearer. Promoted four test and tell cases to test cases - three sequences for displaying marks and one for the combination of RA HAAM and MAI SAM. Linked to my font compiler. |
2.10 | 30 December 2021 |
Noted that colour now works even in IE 11 and also in LibreOffice. Added most recent (2019) version of Hariphunchai, dubbed Hariphunchai 4. Added USE-compatible encodings to avoid maligning any fonts that assume a USE-compatible encoding. |
2.12 | 23 January 2022 |
(Including changes to 2.11). Fixed miscellaneous typos, including alternative encodings of ᨶ᩶ᩣᩴ. Changed shortcomings of 'my fonts' to shortcomings of 'Da Lekh'. Changed site from HTTP to HTTPS. Changed background for USE encoding from red to orange to avoid clash with coloured fonts. |
This web page has been developed with frequent testing on Firefox Version 54 and occasional viewing using Safari on iPhone (iOS 10.3.2), IE 11 (on Windows 7) and Microsoft Edge (on Windows 10).
Switching fonts has been tested in all these browsers.
You may freely use my four fonts mentioned here without modification and may freely examine them. See the respective licences for the conditions, including those on modification. I do not own all the intellectual property rights for the Da Lekh and Da Lekh Si fonts. The fonts are available as follows:
Name | Font file | Source file | Licence file |
---|---|---|---|
Da Lekh (ᨯᩣᩃᩮ᩠ᨡ) |
dalekh.ttf | File dalekh.txt in
dalekh.zip. This is also the
ultimate source code for the Da Lekh Seri font. See Makefile
therein for preprocessing directives. |
DejaVu licence |
Da Lekh Si (ᨯᩣᩃᩮ᩠ᨡᩈᩦ) |
dalekh_si.ttf | ||
Da Lekh Seri (ᨯᩣᩃᩮ᩠ᨡᩈᩮᩁᩥ) |
dalekh_seri.ttf | Either start from the source code, which is subject to
the DejaVu licence, for the Da Lekh font, or use the preprocessed file
dalekh_seri.txt. If the GNU Compiler
Collection is available,
one may use the following command to generate the
immediate 'source' code:
cc -E -fdirectives-only -DSERI -x c dalekh.txt | grep -v ^# >| dalekh_seri.txt
|
seri_license.htm |
Da Lekh Si Seri (ᨯᩣᩃᩮ᩠ᨡᩈᩦᩈᩮᩁᩥ) |
dalekh_si_seri.ttf | Either start from the source code, which is subject to
the DejaVu licence, for the Da Lekh font, or use the preprocessed file
dalekh_si_seri.txt. If the GNU Compiler
Collection is available,
one may use the following command to generate the
immediate 'source' code:
cc -E -fdirectives-only -DSERI -DCOLOUR -x c dalekh.txt | grep -v ^# >| dalekh_si_seri.txt
|
If you wish to have WOFF files, you should either generate them yourself from the font files listed above, or simply copy them from this website.
The fonts are generated from the source code by means of a DIY font compiler that still has many rough edges. However, the source code of the font, although spartanly commented, may make it clearer what the font is attempting to do. I have endeavoured to make reverse engineering unnecessary.
The font Da Lekh is partly intended for my practical use in analysing material in the Tai Tham script. It therefore contains a large set of Latin characters to support transcription and transliteration. It also contains work-arounds so that it may render properly despite problems with rendering engines.
The other purpose of the fonts is to explore issues in making an OpenType font for the Tai Tham script.
The font Da Lekh Seri is an unencumbered font intended for testing rendering engines. Besides the glyphs for Tai Tham writing systems, it therefore has just a bespoke set of (poor) ASCII glyphs; the extra characters required by Microsoft Office and the characters recommended for the Universal Shaping Engine; and the characters needed for transliteration style (feature ss04) and their closure under NFC. Known existing work-arounds have been removed. This removal is implemented by compiler directives.
The font Da Lekh Si (ᨯᩣᩃᩮ᩠ᨡᩈᩦ) differs from Da Lekh in that it aims to reveal the spelling of words. This is useful when using a spell-checker, for example on Firefox. The ideal is that subscript consonants in the coda of an orthographic syllable would be distinguished from those in the onset by colour, whence the word 'Si' in the name of the font. The colour technology used works in the dominant browsers (Chrome, Safari, Firefox, MS Edge and even Internet Explorer 11) and in the word processor of LibreOffice. The colouring is also applied to chained syllables.
It is possible that Da Lekh Si may be reduced to an optional OpenType feature applied to the Da Lekh font.
The font Da Lekh Si Seri is an unencumbered font that colours glyphs in the same way. Like Da Lekh Seri, it deliberately lacks work-arounds for problems with renderers. It is intended as an aid for the development of the Da Lekh Si font.
The Lamphun font is available under the SIL open font licence; the applicable customisation declares that "Hariphunchai" and "Lamphun" are reserved font names. The font file is lamphun.otf and what I have used as 'source' code to build the font is an untidy mess assembled in lamphun.zip:
Rôle | Name | Remarks |
---|---|---|
Glyphs | Hariphunchai.otf | A version of the font dated 5 May 2014, taken from SourceForge. The 'unique identifier' in the name table is FontForge : Hariphunchai : 5-5-2014. There were later .sfd and .fea files at the same location, but at best they offered improved glyphs compared to Lamphun. This is the file that defines the 'early' Hariphunchai font as used on this web page. |
OTL tables | lamphun.txt | This defines a font with the same glyph numbering, but with blank glyphs. I then replace 7 tables in the early Hariphunchai font with tables from this new font. |
Change log | fontlog.txt | Only for Lamphun. |
Make file | lamphun_makefile | The compiler invoked by '~/oft/parse' is my DIY font compiler. |
It is likely that I will create a variant coloured to indicate spelling.
There are two versions of the font used on this page. The fonts themselves are distinguished by the unique font identifiers in their name tables.
The early version of the Hariphunchai font, whose 'unique identifier' is FontForge : Hariphunchai : 5-5-2014, is available as Hariphunchai.otf both on SourceForge and within the Lamphun source zip file. The reversibly generated WOFF file is available here.
The 2019 version, whose 'unique identifier' in the name table is TragerStudio : Hariphunchai : 19-5-2019, is available as Hariphunchai4.otf on SourceForge. The reversibly generated WOFF file is available here.
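The unique identifiers quoted above are held as name ID 3 in a font's 'name' table, so they can be checked without a font editor. The following is a minimal sketch using only the Python standard library. It assumes a plain single-font .otf or .ttf file (not a WOFF file or a font collection), and the function name `unique_font_identifier` is invented for this illustration.

```python
import struct
import sys


def unique_font_identifier(data):
    """Return the name ID 3 string of an sfnt font held in `data`, or None."""
    num_tables = struct.unpack_from(">H", data, 4)[0]
    # Scan the table directory (16-byte records after the 12-byte header).
    for i in range(num_tables):
        tag, _checksum, offset, _length = struct.unpack_from(
            ">4sIII", data, 12 + 16 * i)
        if tag == b"name":
            break
    else:
        return None  # no 'name' table present

    _version, count, storage_offset = struct.unpack_from(">HHH", data, offset)
    for i in range(count):
        platform_id, _enc, _lang, name_id, str_len, str_off = struct.unpack_from(
            ">6H", data, offset + 6 + 12 * i)
        if name_id != 3:
            continue
        start = offset + storage_offset + str_off
        raw = data[start:start + str_len]
        if platform_id in (0, 3):   # Unicode and Windows platforms: UTF-16BE
            return raw.decode("utf-16-be")
        if platform_id == 1:        # Macintosh platform: assume Mac Roman
            return raw.decode("mac_roman")
    return None


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python this_script.py Hariphunchai.otf
    with open(sys.argv[1], "rb") as font_file:
        print(unique_font_identifier(font_file.read()))
```

Run against the two files, this should report the two identifiers given above, which is a quick way of telling which version of the font one has.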
The licence is available on SourceForge. The WOFF files, being derivative works, are licensed under the same licence. As the original OTF files can be recovered from them, they preserve the font names.
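The reason an OTF file can be recovered from a WOFF file is that WOFF 1.0 merely wraps each font table, optionally zlib-compressed, behind a new header and directory. The sketch below illustrates the round trip in Python with the standard library only. It is not necessarily how the WOFF files on this site were produced. It assumes WOFF 1.0 (not WOFF2) with no metadata or private data block, and it reproduces the original byte for byte only when that file has the conventional layout (tables contiguous, padded to four bytes, standard search fields in the header). The function names are invented for the illustration.

```python
import struct
import zlib


def _pad(data):
    """Pad a byte string with zeros to a multiple of four bytes."""
    return data + b"\0" * (-len(data) % 4)


def sfnt_to_woff(sfnt):
    flavor, num_tables = struct.unpack_from(">IH", sfnt, 0)
    # sfnt records: (tag, checksum, offset, length)
    records = [struct.unpack_from(">4sIII", sfnt, 12 + 16 * i)
               for i in range(num_tables)]
    dir_end = 44 + 20 * num_tables
    body, entries = b"", {}
    for tag, checksum, offset, length in sorted(records, key=lambda r: r[2]):
        raw = sfnt[offset:offset + length]
        comp = zlib.compress(raw)
        if len(comp) >= length:
            comp = raw  # stored uncompressed when compression does not help
        entries[tag] = (dir_end + len(body), len(comp), length, checksum)
        body += _pad(comp)
    total_sfnt = 12 + 16 * num_tables + sum((r[3] + 3) & ~3 for r in records)
    header = struct.pack(">4sIIHHIHHIIIII", b"wOFF", flavor,
                         dir_end + len(body), num_tables, 0, total_sfnt,
                         0, 0, 0, 0, 0, 0, 0)
    directory = b"".join(struct.pack(">4sIIII", r[0], *entries[r[0]])
                         for r in records)
    return header + directory + body


def woff_to_sfnt(woff):
    signature, flavor, _length, num_tables = struct.unpack_from(">4sIIH", woff, 0)
    if signature != b"wOFF":
        raise ValueError("not a WOFF 1.0 file")
    # WOFF entries: (tag, offset, compLength, origLength, origChecksum)
    entries = [struct.unpack_from(">4sIIII", woff, 44 + 20 * i)
               for i in range(num_tables)]
    entry_selector = num_tables.bit_length() - 1
    search_range = 16 << entry_selector
    header = struct.pack(">IHHHH", flavor, num_tables, search_range,
                         entry_selector, 16 * num_tables - search_range)
    data_start = 12 + 16 * num_tables
    body, placed = b"", {}
    for tag, offset, comp_len, orig_len, checksum in sorted(
            entries, key=lambda e: e[1]):
        raw = woff[offset:offset + comp_len]
        if comp_len < orig_len:
            raw = zlib.decompress(raw)
        placed[tag] = (checksum, data_start + len(body), orig_len)
        body += _pad(raw)
    directory = b"".join(struct.pack(">4sIII", e[0], *placed[e[0]])
                         for e in entries)
    return header + directory + body
```

Because the table checksums and the font names inside the 'name' table pass through unchanged, the recovered file carries the same names as the original.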