A common "international" base layout for English/German + Spanish/Portuguese/French #58

@leogama

Hello, there. I've been an enthusiastic user of the (Programmer's) Dvorak layout for almost a decade now, and learning it was a huge improvement over good ol' QWERTY. However, while it is really widespread and readily available on most current systems, its performance for the English language is suboptimal. Also, its variations for languages with similar alphabets, like my dear Portuguese, are still "super-terrible" (a bit less terrible than QWERTY thanks to the vowels on the left-hand home row).

The elephant in the room

I took a look at some of these newer designs, including yours. Congratulations, by the way! Amazing work. But the OP touched on a very important point that is still unaddressed by all of them: we live in an international, interconnected world now. Until the early 2000s, it wasn't a problem to have totally different keyboard layouts for every language. We even used different, incompatible text encodings! But now Unicode is the dominant text encoding standard on both new devices and the Internet. I believe the same transition should happen to keyboard layouts.

But is there a need for it? Well, most professionals who type a lot (journalists, academics, programmers, etc.) need either to create content in more than one language, usually their native one plus English, or at least to communicate often with foreigners through text. This applies even to countries that have English as their primary language, like the US, where more and more people speak Spanish as a primary or secondary language each year (more than 50 million today).

Is an "international" keyboard layout possible?

I know that many languages use completely different alphabets and, even when they use similar ones (like variations of the Latin or Cyrillic scripts), they have extra characters and wildly varying letter/n-gram frequencies. Therefore, there can't be a truly international base layout for keyboards. But can we do better?

Starting from English, the de facto international language, a non-monolingual layout can't stray far from ASCII. Among the languages with the most speakers in the world that use a Latin-script alphabet, the top positions are (Wikipedia/Ethnologue 2022):

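| Position | Language   | Family        | Branch   | 1st-language speakers | 2nd-language speakers | Total speakers |
|----------|------------|---------------|----------|-----------------------|-----------------------|----------------|
| 1        | English    | Indo-European | Germanic | 372.9 million         | 1.080 billion         | 1.452 billion  |
| 4        | Spanish    | Indo-European | Romance  | 474.7 million         | 73.6 million          | 548.3 million  |
| 5        | French     | Indo-European | Romance  | 79.9 million          | 194.2 million         | 274.1 million  |
| 9        | Portuguese | Indo-European | Romance  | 232.4 million         | 25.2 million          | 257.7 million  |
| 12       | German     | Indo-European | Germanic | 75.6 million          | 59.1 million          | 134.6 million  |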

I think it would be feasible to analyse these 5 languages, from two branches of the same language family (you already did it for two of them), and find a design that is awesome for one (likely English) but doesn't suck for any of the others.

A "Latin" or "Romance-Germanic" base keyboard layout

For whoever is interested, I propose the development of a base layout using the Latin alphabet that is optimized for all of these 5 languages. It wouldn't be a simple weighted optimization though. What I would expect to achieve with this design is:

  1. To have a common base for creating a new layout for each of the 5 languages;
  2. It must be really good at English, at least as good as other current designs by the same metrics;
  3. It should be reasonably good for the other 4 languages, but must not be terrible for any of them;
  4. The differences between the layouts should be minimal, so that one can constantly switch between layouts without hassle, create a custom hybrid bilingual layout, or not even need to switch at all.
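
As a rough illustration of how goals 2 and 3 could be turned into a single number to minimize, here is a minimal sketch. Everything in it is an assumption made for the example: the function name `combined_cost`, the language codes, the per-language `limits`, and the `penalty` factor are hypothetical, and the per-language costs would have to come from whatever scoring metric is actually chosen.

```python
from typing import Dict


def combined_cost(
    costs: Dict[str, float],
    weights: Dict[str, float],
    limits: Dict[str, float],
    primary: str = "en",
    penalty: float = 10.0,
) -> float:
    """Combine per-language layout costs into one value (lower is better).

    costs   -- hypothetical cost of one candidate layout for each language
    weights -- relative importance of each secondary language
    limits  -- cost above which a layout counts as "terrible" for a language
    """
    # The primary language (English here) counts in full.
    total = costs[primary]
    for lang, cost in costs.items():
        if lang == primary:
            continue
        # Secondary languages contribute through their weights.
        total += weights[lang] * cost
        # Only the part of the cost that exceeds the limit is penalized,
        # so a layout is pushed away from being terrible for any language.
        total += penalty * max(0.0, cost - limits[lang])
    return total


# Made-up numbers, only to show the behaviour.
example = combined_cost(
    costs={"en": 1.0, "pt": 2.0, "es": 3.0},
    weights={"pt": 0.5, "es": 0.25},
    limits={"pt": 2.5, "es": 2.5},
)
```

With these made-up numbers Portuguese stays under its limit and only adds its weighted cost, while Spanish exceeds its limit by 0.5 and is penalized for that excess.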

Steps necessary to achieve these goals:

  1. Obtain a text corpus and n-gram frequencies for French, German and Portuguese;
  2. Find the similarities between the 5 languages using some kind of distance measure(s);
  3. Define optimization weights for them considering these similarities, number of speakers, etc.;
  4. Develop a method for searching the layout space by optimizing primarily for English and secondarily for the 4 other languages (using the weights), with a penalty if the layout starts becoming too bad for any single language mid-search;
  5. Choose a winner base layout and then search for full layouts for each individual language, positioning specific keys and maybe repositioning some punctuation keys in the process.
  6. Profit. 😎
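
Steps 1 and 2 above can be sketched in a few lines. This is only one possible reading: the choice of the Jensen-Shannon distance is an assumption (the steps just ask for "some kind of distance measure"), the function names are hypothetical, and the short sample strings stand in for real corpora.

```python
import math
from collections import Counter
from typing import Dict


def ngram_frequencies(text: str, n: int = 2) -> Dict[str, float]:
    """Relative frequencies of letter n-grams inside words (case-folded)."""
    counts: Counter = Counter()
    for word in text.lower().split():
        # Keep letters only, including accented ones such as "ã" or "ü".
        letters = "".join(ch for ch in word if ch.isalpha())
        for i in range(len(letters) - n + 1):
            counts[letters[i:i + n]] += 1
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()} if total else {}


def js_distance(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Jensen-Shannon distance (base 2): 0 for identical, 1 for disjoint."""
    divergence = 0.0
    for gram in set(p) | set(q):
        pi, qi = p.get(gram, 0.0), q.get(gram, 0.0)
        mi = (pi + qi) / 2  # mixture of the two distributions
        if pi > 0:
            divergence += 0.5 * pi * math.log2(pi / mi)
        if qi > 0:
            divergence += 0.5 * qi * math.log2(qi / mi)
    return math.sqrt(max(divergence, 0.0))


# Tiny sample strings, far too small to mean anything statistically.
english = ngram_frequencies("the quick brown fox jumps over the lazy dog")
portuguese = ngram_frequencies("a rápida raposa marrom pula sobre o cão preguiçoso")
distance = js_distance(english, portuguese)
```

The resulting pairwise distances could then feed into step 3, for example by giving smaller weights to languages that are already close to English, but how to do that is left open here.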

Advantages

  • Beyond the obvious advantages for multilingual typists, this base layout and its derivatives would benefit from having a unified, larger user base. It would likely be very small in the beginning, but it could plausibly reach a critical size eventually.
  • Its software implementations could share a common codebase, following the pattern of a base layout file (either the English layout or just the base itself) plus modifications of it. This would be easier to maintain and port to different systems.
  • Being multilingual could be an eye-catching feature for anyone looking for a better layout to learn beyond QWERTY/Dvorak.
  • The new methods developed could be useful for custom/personal layout creation and also for other language subfamilies, like those that use the Cyrillic script.
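
The "base layout file plus modifications" pattern from the second point could look roughly like the sketch below. The key position names, the characters assigned to them, and the overrides are all hypothetical placeholders, not a proposed layout; the helper `differing_keys` also gives a simple way to check how far two derived layouts drift apart.

```python
from typing import Dict, List

# Hypothetical key positions and characters, for illustration only.
BASE: Dict[str, str] = {
    "top_1": "a", "top_2": "e",
    "home_1": "n", "home_2": "t",
    "bottom_1": "'", "bottom_2": ",",
}

# Each language only lists the keys it changes relative to the base.
OVERRIDES: Dict[str, Dict[str, str]] = {
    "en": {},
    "pt": {"bottom_1": "ç"},
    "de": {"bottom_1": "ß", "bottom_2": "ü"},
}


def build_layout(lang: str) -> Dict[str, str]:
    """Return a copy of the base layout with one language's changes applied."""
    changes = OVERRIDES[lang]
    unknown = set(changes) - set(BASE)
    if unknown:
        raise KeyError(f"overrides for {lang!r} use unknown keys: {sorted(unknown)}")
    layout = dict(BASE)  # copy, so the shared base is never mutated
    layout.update(changes)
    return layout


def differing_keys(a: Dict[str, str], b: Dict[str, str]) -> List[str]:
    """Key positions where two layouts built from the same base differ."""
    return sorted(key for key in a if a[key] != b[key])


english = build_layout("en")
german = build_layout("de")
changed = differing_keys(english, german)  # ['bottom_1', 'bottom_2']
```

Keeping the per-language files this small is what would make the layouts easy to maintain and port, since only the base needs to be reimplemented for each system.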

I'm seriously considering learning a new keyboard layout once more, but it would have to be a killer layout. It would have to be one to rule them all.

I am willing to dedicate some time to this idea if there are others interested. If not, maybe I'll end up trying to create my own Portuguese or Portuguese-English Engram layout.

Greetings from Brazil! 🇧🇷

Originally posted by @leogama in binarybottle/engram-es-2021#40 (comment)
