A better way to input Vietnamese

Earlier this year I had the pleasure of implementing for Firefox OS an input method for Vietnamese (a language I have some familiarity with). After being dissatisfied with the Vietnamese input methods on other smartphones, I was eager to do something better.

I believe Firefox OS is now the easiest smartphone on the market for out-of-the-box typing of Vietnamese.

The Challenge of Vietnamese

Vietnamese uses the Latin alphabet, much like English, but it has an additional 7 letters with diacritics (Ă, Â, Đ, Ê, Ô, Ơ, Ư). In addition, each word can carry one of five tone marks. The combination of diacritics and tone marks means that the character set required for Vietnamese gets quite large. For example, there are 18 different Os (O, Ô, Ơ, Ò, Ồ, Ờ, Ỏ, Ổ, Ở, Õ, Ỗ, Ỡ, Ó, Ố, Ớ, Ọ, Ộ, Ợ). The letters F, J, W, and Z are unused. The language is (orthographically, at least) monosyllabic, so each syllable is written as a separate word.
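To see how quickly the combinations multiply, here is a minimal Python sketch that rebuilds the 18 Os by applying each tone mark to the three base letters. It is only an illustration, not code from the keyboard; the names TONE_MARKS and tone_variants are made up for this example, and it relies on Unicode NFC normalization (via the standard unicodedata module) to merge each base letter and combining mark into a single character.

```python
import unicodedata

# Combining marks for the five written tones, in the order used in the list above.
TONE_MARKS = {
    "huyền": "\u0300",  # combining grave accent
    "hỏi": "\u0309",    # combining hook above
    "ngã": "\u0303",    # combining tilde
    "sắc": "\u0301",    # combining acute accent
    "nặng": "\u0323",   # combining dot below
}


def tone_variants(bases):
    """Return each base letter unmarked, then with every tone mark applied."""
    variants = [unicodedata.normalize("NFC", base) for base in bases]
    for mark in TONE_MARKS.values():
        for base in bases:
            # NFC merges the base letter and the combining mark into one character.
            variants.append(unicodedata.normalize("NFC", base + mark))
    return variants


all_os = tone_variants(["O", "Ô", "Ơ"])
print(len(all_os), "".join(all_os))
```

Passing the other vowels to the same function shows why a one-key-per-character layout is impractical.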

This makes entering Vietnamese a little more difficult than most other Latin-based languages. Whereas languages like French benefit from dictionary lookup, where the user can type A-R-R-E-T-E and the system can then prompt for the options ARRÊTE or ARRÊTÉ, that is much less useful for Vietnamese, where the letters D-O can correspond to one of 25 different Vietnamese words (do, , , , dỗ, , dở, dỡ, dợ, đo, đò, đỏ, đó, đọ, đô, đồ, đổ, đỗ, đố, độ, đơ, đờ, đỡ, đớ, or đợ).

Other smartphone platforms have not dealt with this situation well. If you’ve tried to enter Vietnamese text on an iPhone, you’ll know how difficult it is. The user has two options. One is to use the Telex input method, which involves memorizing an arbitrary mapping of letters to tone marks. (It was originally designed as an encoding for sending Vietnamese messages over the Telex telegraph network.) It is user-unfriendly in the extreme, and not discoverable. The other option is to hold down a letter key to see variants with diacritics and tone marks. For example, you can hold down A for a second and then scroll through the 18 different As that appear. You do that every time you need to type a vowel, which is painfully slow.

Fortunately, this is not an intractable problem. In fact, it’s an opportunity to do better. (I can only assume that the sorry state of Vietnamese input on the iPhone speaks to a lack of concern about Vietnamese inside Apple’s hallowed walls, which is unfortunate because it’s not like there’s a shortage of Vietnamese people in San José.)

Crafting a Solution

To some degree, this was already a solved problem. Back in the days of typewriters, there was a Vietnamese layout called AĐERTY. It was based on the French AZERTY, but it moved the F, J, W, and Z keys to the periphery and added keys for Ă, Đ, Ơ, and Ư. It also had five dead keys. The dead keys contained:

  • a circumflex diacritic for typing the remaining letters (Â, Ê, and Ô);
  • the five tone marks; and
  • four glyphs each representing the kerned combination of the circumflex diacritic with a tone mark, needed where the two marks would otherwise overlap.

Photo of a Vietnamese typewriter

My plan was to make a smartphone version of this typewriter. Already it would be an improvement over the iPhone. But since this is the computer age, there were more improvements I could make.

Firstly, I omitted F, J, W, and Z completely. If the user needs to type them — for a foreign word, perhaps — they can switch layouts to French. (Gaia will automatically switch to a different keyboard if you need to type a web address.) And obviously I could omit the glyphs that represent kerned pairs of diacritic & tone marks, since kerning is no longer a mechanical process.

The biggest change I made is that, rather than having keys for the five tone marks, words with tones appear as candidates after typing the letters. This has numerous benefits. It eliminates five weird-looking keys from the keyboard. It eliminates confusion about when to type the tone mark. (Tone marks are visually positioned in the middle of the word, but when writing Vietnamese by hand, tone marks are usually added last, after writing the rest of the word.) It also saves a keystroke, since we can automatically insert a space after the user selects the candidate. (For a word without a tone mark, the user can just hit the space bar. Think of the space bar as meaning “no tone”.)

This left just 26 letter keys plus one key for the circumflex diacritic. Firefox OS’s existing AZERTY layout had 26 letter keys plus one key for the apostrophe, so I put the circumflex where the apostrophe was. (The apostrophe is unused in Vietnamese.)

Screenshot of Vietnamese input method in use

In order to generate the tone candidates, I had to detect when the user had typed a valid Vietnamese syllable, because I didn’t want to display bizarre-looking nonsense as a candidate. Vietnamese has rules for what constitutes a valid syllable, based on phonotactics. And although the spelling isn’t purely phonetic (in particular, it inherits some peculiarities from Portuguese), it follows strict rules. This was the hardest part of writing the input method. I had to do some research about Vietnamese phonotactics and orthography. A good chunk of my code is dedicated to encoding these rules.
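To give a feel for what such a check involves, here is a rough Python sketch that splits typed letters into an initial consonant, a vowel nucleus, and a final consonant. It is not the code from the keyboard, and it is far looser than the real rules: the tables ONSETS and CODAS and the function split_syllable are hypothetical names for this illustration, and any run of vowel letters is accepted as a nucleus, which the actual orthography does not allow.

```python
import unicodedata

# Hypothetical, simplified tables. Real Vietnamese spelling rules are stricter
# (for example, they restrict which vowels may follow which consonants).
ONSETS = {"", "b", "c", "ch", "d", "đ", "g", "gh", "gi", "h", "k", "kh", "l",
          "m", "n", "ng", "ngh", "nh", "p", "ph", "qu", "r", "s", "t", "th",
          "tr", "v", "x"}
CODAS = {"", "c", "ch", "m", "n", "ng", "nh", "p", "t"}
VOWELS = set(unicodedata.normalize("NFC", "aăâeêioôơuưy"))


def split_syllable(text):
    """Return (onset, nucleus, coda) if text has a plausible shape, else None."""
    text = unicodedata.normalize("NFC", text.lower())
    # Try the longest possible onset first, so "ngh" wins over "ng" and "n".
    for i in range(len(text), -1, -1):
        onset = text[:i]
        if onset not in ONSETS:
            continue
        # Then try the longest possible nucleus for that onset.
        for j in range(len(text), i, -1):
            nucleus, coda = text[i:j], text[j:]
            if coda in CODAS and all(ch in VOWELS for ch in nucleus):
                return onset, nucleus, coda
    return None


print(split_syllable("viêt"))    # accepted: onset v, nucleus iê, coda t
print(split_syllable("Pleiku"))  # rejected: no valid split exists
```

An input that fails the check is still typed as-is; it simply produces no tone candidates.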

Knowing about the limited set of valid Vietnamese syllables, I was able to add some convenience to the input method. For example, if the user types V-I-E, a circumflex automatically appears on E because VIÊ is a valid sequence of letters in Vietnamese while VIE is not. If the user types T to complete the partial word VIÊT, only two tone candidates appear (VIẾT and VIỆT), because the other three tone marks can’t appear on a word ending with T.
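The tone restriction can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the keyboard's actual code: tone_position and tone_candidates are hypothetical names, the rule for choosing which vowel carries the mark is a rough approximation (it ignores the special cases of QU and GI, among others), and the input is assumed to already be a valid syllable.

```python
import unicodedata

# Combining marks in the order huyền, hỏi, ngã, sắc, nặng.
TONES = ["\u0300", "\u0309", "\u0303", "\u0301", "\u0323"]
STOP_TONES = ["\u0301", "\u0323"]  # only sắc and nặng before a stop consonant
STOP_CODAS = ("c", "ch", "p", "t")
VOWELS = set(unicodedata.normalize("NFC", "aăâeêioôơuưy"))
MODIFIED = set(unicodedata.normalize("NFC", "ăâêôơư"))


def tone_position(syllable):
    """Pick the index of the vowel that carries the tone mark (simplified rule)."""
    vowel_idx = [i for i, ch in enumerate(syllable) if ch in VOWELS]
    if not vowel_idx:
        return None
    modified = [i for i in vowel_idx if syllable[i] in MODIFIED]
    if modified:
        return modified[-1]  # e.g. the Ê in VIÊT, the Ơ in NƯƠC
    if len(vowel_idx) > 1 and vowel_idx[-1] == len(syllable) - 1:
        return vowel_idx[-2]  # open syllable: mark the second-to-last vowel
    return vowel_idx[-1]


def tone_candidates(syllable):
    """Return the toned spellings to offer for an untoned syllable."""
    syllable = unicodedata.normalize("NFC", syllable.lower())
    pos = tone_position(syllable)
    if pos is None:
        return []
    tones = STOP_TONES if syllable.endswith(STOP_CODAS) else TONES
    return [
        unicodedata.normalize("NFC", syllable[:pos + 1] + tone + syllable[pos + 1:])
        for tone in tones
    ]


print(tone_candidates("viêt"))
print(tone_candidates("đo"))
```

The first call yields two candidates and the second yields five, matching the behaviour described above.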

Using It Yourself

You can try the keyboard for yourself at Timothy Guan‑tin Chien’s website.

The keyboard is not included in the default Gaia builds. To include it, add the following line to the Makefile in the gaia root directory:

GAIA_KEYBOARD_LAYOUTS=fr,vi-Typewriter

The code is open source. Please steal it for your own open source project.

6 thoughts on “A better way to input Vietnamese”

  1. Mr. Clancy,

    At the behest of a friend of mine (one of Wikimedia’s language engineers), I’m posting some comments that were my *late-night* first impressions of your keyboard:

    ———-

    My thoughts:

    – the design is clever in removing “unused” (read : non-standard orthography) letters and mitigates the overcrowding of five tone choices for letters that _are_ overloaded (a (> a, â, ă), e (> e, ê), o (> o, ô, ơ), u (> u, ư)), but I dislike having to bounce between keyboards for loans/foreign words that will be somewhat inevitable on a mobile device/virtual experience, where code-switching has higher probability and/or likelihood, and for other ethnicities’ orthographies

    – I mention “Standard Vietnamese” orthography only because I have encountered more non-Standard_Viet orthographies written with this system (Mường, Tày, Coong, etc) and know that there’re still non-trivial phonological exceptions even in Std Vietnamese, e.g Pleiku. Yes, majority cases will find these trivial, but I prefer better architecturing that isn’t potentially excluding.

    – nitpick : the predictive/prescriptive tone/word choice is interesting, but, since one of the examples is flawed (only acute accent (◌́ ; dấu sắc) and dot below (◌̣ ; dấu nặng) are viable choices for *Standard Vietnamese* orthography in closed/non-nasal obstruent codas, …ế{p|t|ch|c} | …ệ{p|t|ch|c}, *never* ể), I am conservative in my outlook (I readily recognize that this is a first-pass MVP)

    – I am unsure of the absolutism in the claim that tone marks are _always_ written after a syllable/string/phrase has been completed. I can remember my teacher writing tone marks in linear/visual order as well as syllable logical order as well as a rare memory of once all letters in a string had been written, where the last case was IIRC a test for us students. For cursive handwriting, writing of diacritics after a string is done makes sense, but, not necessarily so for print handwriting.

    – I think that the predictive feature could be used for tone placement according to orthographic rules, where appropriate/applicable tone class diacritics could be prompted by either white space or hyphen (???), but that feels … expensive

    Given that this keyboard is seemingly primarily engineered for a smartphone / mobile device / virtual keyboard experience, it’s an amazingly neat first-pass.

    (…and it’s totally triggering my OCD right now and, it has me contemplating how I would address my own nitpicks and dislikes that I mentioned above…)

    • > there’re still non-trivial phonological exceptions even in Std Vietnamese, e.g Pleiku

      I’ll note that you can still type ‘Pleiku’ with this keyboard layout. The keyboard app will notice that it doesn’t conform to usual Vietnamese rules, but that just means it won’t try to generate tone candidates. You can still type the word. You only have to switch to a different layout if you want to type a word with F, J, W, or Z. My understanding is that when foreign words are borrowed into Vietnamese, those letters are usually replaced (F -> PH, J -> GI, W -> OU, Z -> D).

      I’ll also note that Firefox OS makes switching keyboard layouts easy. Toggling between AZERTY and AĐERTY is no more difficult than tapping a shift key.

      You are correct that this keyboard isn’t suitable for typing any of Vietnam’s numerous minority languages, like Mường or Tày. But let’s be clear, those languages are not Vietnamese. (Well, they are Vietnamese in the sense that they exist in Vietnam, but they are not the language known as “Vietnamese”. Tày isn’t even related to Vietnamese.) You will not find articles written in Mường or Tày on the Vietnamese-language Wikipedia.

      I’m not opposed to seeing support for languages like Mường or Tày, or indeed any of Vietnam’s 53 official minority languages, but that becomes a much more difficult undertaking (starting with the difficult question of “which languages do we support and why?”) and it wasn’t part of the requirements for this keyboard. If it makes you feel better, Firefox OS will have support for installable 3rd-party keyboards.

      > I am unsure of the absolutism in the claim that tone marks are _always_ written after a syllable/string/phrase has been completed

      Well, I didn’t say always. But I was under the impression that it was usual, and I’ve noticed it when people spell words aloud too. e.g. Nước will be spelt aloud as N-Ư-Ơ-C-dấu-sắc.

      But it doesn’t matter. The point is that it’s not clear when the key for a tone mark should be pressed. Back in the days of typewriters, tone marks were typed before the main vowel, so that makes three places a user might expect to type the tone mark (before the main vowel, after the main vowel, and at the end of the word).

      I thought about allowing a tone mark to be typed in all three places, but I wasn’t sure what the user experience for that would look like, and I also really wanted the tone mark key to automatically insert a space (which I couldn’t do if the user could type the tone mark in the middle of the word). In any case, I think generating tone candidates is a simpler experience.

      Thanks for your comments, Patrick!

  2. Great post, thank you for the detailed write-up. It’s fascinating to learn about the intricacies of the Vietnamese writing system. I also liked the fact that you found some inspiration in an old-school typewriter.

    If I understand correctly, in Vietnamese there might be only one tone mark over a single word (out of five). Does this mean that for each word the user inputs, the suggestion engine is guaranteed to show at least five suggestions, one for each tone mark? Plus any number of other suggestions which offer actual spelling fixes. Would it make sense (given enough screen estate) to have two suggestion bars: one for the variants of the current word with tone marks and another one for spelling suggestions?

    • > Does this mean that for each word the user inputs, the suggestion engine is guaranteed to show at least five suggestions, one for each tone mark?

      Usually five. For words that end in a stop consonant (c, ch, p, or t), there are only two options.

      > Would it make sense (given enough screen estate) to have two suggestion bars: one for the variants of the current word with tone marks and another one for spelling suggestions?

      The IME doesn’t currently attempt to fix spelling, but that would be an interesting future direction. I don’t know if we’d need two bars… we might be able to get away with one bar. If the user types something that isn’t a valid Vietnamese syllable, we could show the closest valid syllable, and all tone variations of that syllable. It would provide some typing assistance while not using too much screen real estate.

      I might have a stab at that if I have some time.

      Thanks for your comments!

  3. one key expectation i would have from this is that what it eventually drops into your content area should be Unicode text where the characters are properly normalised. You didn’t mention that. Does it? That would be very helpful.

    (ps. fwiw, i developed a vietnamese picker at http://rishida.io/pickers/vietnamese/ (which allows you to normalise in more than one way).)

    • I don’t normalize the text, but because I use precomposed characters only (no combining marks), the result should be identical to normalization form NFC.
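
      As a quick way to check this claim, here is a small Python sketch using the standard unicodedata module. The variable names are illustrative only; it compares a precomposed spelling of “Việt” with a decomposed one that uses combining marks.

```python
import unicodedata

precomposed = "Vi\u1ec7t"        # U+1EC7: e with circumflex and dot below, one code point
decomposed = "Vie\u0302\u0323t"  # e + combining circumflex + combining dot below

# The two strings render the same but are different code point sequences.
same_code_points = precomposed == decomposed

# Normalizing the decomposed form to NFC yields the precomposed form.
nfc_matches = unicodedata.normalize("NFC", decomposed) == precomposed

# The precomposed form is left unchanged by NFC, so it is already normalized.
already_nfc = unicodedata.normalize("NFC", precomposed) == precomposed

print(same_code_points, nfc_matches, already_nfc)
```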
