Philosophical Multicore

Sometimes controversial, sometimes fallacious, sometimes thought-provoking, and always fun.

The New Keyboard Layout Project (NKLP)

Posted by Michael Dickens on December 9, 2008


Right now I am expanding my corpus. If anyone else has big blocks of text, like a bunch of stuff they typed on their computer, send it to me at Tell me what’s in it (like, emails, computer programs, business letters) so I don’t have to read it. (For confidentiality reasons, and for my convenience.)

I’m trying to get my corpus up to 10,000 pages, because I think that’s enough to have a really good variety of text. Right now I have about 3000.

I have about 11,000 pages in my corpus. However, I only have about 1000 pages of casual text, and I’d like about 3000. I’d also like about 1000 pages of news, and only have about 400. But I’m close to being done. (Collecting news is just so tedious, though.)

I just realized that Carpalx has a good corpus that’s free. It doesn’t have everything that I want, but it has a lot of books and programming code.

VvV from wanted to make an evolutionary algorithm for non-latin letters. I don’t have any data for any languages other than English. But if I did, what languages could be done? I’ll look at the world’s most popular languages (from

1. Mandarin Chinese – 882 million
There are far too many characters in Chinese to make a keyboard layout.
2. Spanish – 325 million
3. English – 312-380 million
4. Arabic – 206-422 million
This could work.
5. Hindi – 181 million
This alphabet is also probably small enough.
6. Portuguese – 178 million
7. Bengali – 173 million
It looks kind of big, but it should work.
8. Russian – 146 million
9. Japanese – 128 million
Same deal as Chinese.
10. German – 96 million

Any comments you have relating to the NKLP should be posted here.


10 Responses to “The New Keyboard Layout Project (NKLP)”

  1. Phynnboi said


    If you’re unhappy with the results of your algorithm with a 1000 page corpus, you’re going to be even less happy with the results with a 10,000 page corpus.


    The first question one needs to answer before optimizing a keyboard layout is, “For what, specifically, am I optimizing this layout?”

    For instance, take the extreme: “I want my layout to be able to type any possible character sequence with the least possible effort.” In this case, your corpus would basically be a string of random characters. But then, every character will have the same frequency as every other character, so no layout will be any better than any other layout!

    To put it another way, there will quickly come a point in your corpus building where adding more text either has no significant effect on your layouts or actually makes them worse. Collecting a massive, varied corpus is, therefore, counter-productive.

    For my “grow a layout” project, I set a pretty narrow goal: “I want a layout that makes it easy to type the kind of stuff I tend to type.” My corpus, then, was a few thousand of my most recent Usenet posts. After stripping out characters I didn’t care about, the corpus weighed in at around 500,000 characters; even that was probably overkill. I would recommend you do something similar–collect a bunch of your own text and use that as the corpus. You might be surprised how well the resultant layouts adapt to typing other modern English texts.

    (As for layouts for other languages, I have no clue. I don’t type anything other than English, so frankly don’t care. Let the people who actually use those languages optimize layouts for them. I know, I know–typical Imperialist American.)

  2. mtgap said

    Why would a longer corpus be less accurate? You need to have enough text to accurately represent the average of future texts that will be typed. The more text you get, the closer you are to this average.

    I put in stuff besides my own text for two reasons:
    1. I want to get closer to this average.
    2. I want to make my corpus available to anyone else who might want to use it. There are already some comprehensive corpuses (corpi?) out there, probably better than anything I’d be able to get, but they cost a lot of money. If anyone else uses it, they will have different typing patterns.

    Even though it’s not all my own text, it will probably be at least as good. The only thing that might be skewed is word frequency; it’s possible that I use certain words more frequently than they are usually used. Oh, and punctuation use could also be messed up somewhat.

    I wouldn’t say I was unhappy with my shorter corpus. I used to have one that was only about 120 pages, and it was really not as good as my 2000 page long one. The 2000 page long one was good, but I wanted to have more accuracy. And it’s not like it really matters; I just enjoy going to a website, copying a bunch of text, and pasting it in a file. It’s meditative.

  3. vVv said

    I have no idea how the program works, but I would think that it shouldn’t matter what language it is used for, as long as the letters (as well as any other data that is needed) are taken as input. (By the way, I don’t really need this so it’s not very important.)

  4. Phynnboi said

    I’m just saying, you can’t make a layout that’s great at typing everything. You need to focus on one style, like informal, formal, technical, source code, etc. A layout that tries to be all things to all people is going to be crap for everyone. Also, if you already are focusing on one particular kind of text, adding tons more of the same kind of text isn’t going to improve your results appreciably. Remember, we’re dealing with algorithms that produce sub-optimal, “good enough” results, so an extra decimal place of precision isn’t really going to help.

    If you enjoy it, fine, but I think you’d be much better served coming up with a cleverer evaluation function.

  5. Bill Foster said

    I have to agree with Phynnboi. The more constraints you put on a problem, the less optimized the solution can be for any particular constraint. The layout that’s best for typing Java code will not be the same as the best layout for writing a novel. If you want a layout that’s good for both, there will be trade-offs involved.

  6. mtgap said

    vVv: There is a description of how my program works midway through the page at

    Phynnboi, you definitely have a point there. What I was planning on doing, something that I didn’t do for the last version, was to have the person running the program input the types of things that this layout would be used for. Then, the different texts would be weighted accordingly. For example, I have a lot of programming code, but most people would never program, so the algorithm doesn’t count the programming code. In this situation, the amount of data that’s actually used is relatively small, and more optimized for the person using it.

    However, most people don’t just type one thing. So the program would have to be able to find the best average of various texts, unless you want to use a different layout for programming, a different one for writing emails, etc. I certainly don’t. So multiobjective optimization comes into play.
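A minimal sketch of the weighting idea described above, not the program's actual code. The category names, the weights, and the choice to normalize each category by its own size (so a large category does not dominate just because it has more text) are all assumptions made for illustration.

```python
from collections import Counter

def weighted_digraph_counts(texts_by_category, weights):
    """Combine digraph frequencies across categories, scaled by user weights.

    Each category is first normalized to relative frequencies (an assumption),
    then multiplied by its weight. A weight of 0 drops the category entirely.
    """
    combined = Counter()
    for category, text in texts_by_category.items():
        weight = weights.get(category, 0.0)
        if weight == 0.0:
            continue
        counts = Counter(text[i:i + 2] for i in range(len(text) - 1))
        total = sum(counts.values())
        if total == 0:
            continue
        for digraph, n in counts.items():
            combined[digraph] += weight * n / total
    return combined

# Hypothetical corpus: someone who writes email but never programs.
corpus = {"email": "the then", "code": "{};{}"}
weights = {"email": 1.0, "code": 0.0}
counts = weighted_digraph_counts(corpus, weights)
```

With these weights the programming text contributes nothing, while a user who does both could set both weights above zero to get a blended average.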

    As you say Phynn, what we really need is a better evaluation function.

    Multiobjective optimization (MOP) is important for determining the layout criteria. They are sometimes mutually exclusive: hand rolls vs. alternation. Usually they aren’t.

    MOP will be tricky, because there are so many objectives. And we don’t always know what a keyboard layout with x hand alternation, x same finger, etc. will look like. So a lot of the problem is actually finding the layout that has all the stats you want. That’s what the computer does, I suppose. So the prerequisite is to find out what stats are the best, and then try to weigh them accordingly.

    I have tried a sort of weight on the different values such that as it increases, the penalty increases quadratically. But it completely broke the program. So that won’t work.

    List of objectives to MINIMIZE: (roughly in order of importance)
    -travel distance
    -same finger
    -same hand
    -home row jumping
    -changing rows
    -reaching to the center column

    List of objectives to MAXIMIZE:
    -inward rolls
    -outward rolls(?)
    -comfortable home row jumping (on QWERTY, typing IN is easy, but EX is hard)

    -Make it get the best layout
    -Make it as fast as possible

    Phynn: Algorithms don’t necessarily produce sub-optimal results. They can produce the best result for the given criteria. The thing about genetic algorithms is that they don’t always get the best result. But if done right, they usually will.

    We want to find the best ratio between values, while also having the values be low. This is trickier than one would think. You have to somehow make all the values be lower without making one be way lower than the others. They have to stay together.

    Currently, the best evaluation function that I have is the one where certain values are worth some amount of points, and every time that negative value occurs, points are added. Every time a positive value occurs, points are subtracted. (It works better this way than the inverse: easy things reduce effort, while hard things increase effort.)
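The point-based evaluation described above might look roughly like this. The cost values and statistic names are hypothetical; only the overall shape (hard things add points, easy things subtract them, lower is better) comes from the comment.

```python
# Hypothetical per-occurrence costs: positive values are hard to type,
# negative values are easy to type and therefore reduce the total effort.
COSTS = {
    "same_finger": 5.0,
    "same_hand": 1.0,
    "row_change": 2.0,
    "inward_roll": -1.5,
}

def effort(occurrences, costs=COSTS):
    """Sum cost * count over every measured statistic. Lower is better."""
    return sum(costs[name] * count for name, count in occurrences.items())

# Made-up occurrence counts for one candidate layout.
layout_stats = {"same_finger": 10, "same_hand": 40, "row_change": 15, "inward_roll": 20}
score = effort(layout_stats)  # 50 + 40 + 30 - 30 = 90.0
```

Because the score is a plain weighted sum, the result depends entirely on how the costs are chosen, which is the weighting problem discussed in the rest of this thread.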

    Potential better ones that I’ve thought of but didn’t work:
    -As the occurrences of a certain value increase, the cost increases nonlinearly.
    -Have it cost more for the values to not be in the proper ratios.

    I just got an idea. The program could FORCE the values to be in the proper ratios: if they’re not, it throws out the layout. Then from there, it finds the layout where the occurrences of each value are lowest. The problem here is that it might throw out a huge number of layouts, making the whole process take a lot longer. It would be faster to simply have the ratios reduce over time, but that would require costs. Maybe there could be separate costs for ratios and occurrences.

  7. Phynnboi said

    I think the way to go for layout optimization, where you’re trying to find the best compromise between a bunch of different parameters (objectives), is to use a kind of least squares optimization. You kinda-sorta mention trying this before without success.

    The trick to squaring is, you have to normalize all the values first. For instance, if you have one parameter that ranges from 0 to 10, and another parameter that ranges from 10 to 100, relatively small changes in the second parameter will dwarf even the largest of changes in the first parameter. So, you need to normalize them to be within the same range so they can be compared equally. I normalized all my parameters to between 0.0 and 1.0, but you could use any range you wanted.

    To normalize your parameters, you have to figure out what the minimum and maximum values are per parameter. Luckily, we have a method of doing that already! Just set your program to minimize and maximize each parameter individually, and record the minimum and maximum values it came up with. For example, set it up so that same-finger is the only thing measured, and try to minimize it and maximize it. You may need to run your algorithm several times if it reports something different each time. Obviously, these bounds will be specific to the particular corpus you use, but, unless you change your corpus a lot, it doesn’t really matter.

    Once you have all your bounds, you can use them to normalize your parameters in your overall evaluator, and then square each normalized parameter. The result is a kind of least squares optimizer, except without having to do any calculus. 🙂 The nice thing about scoring layouts this way is, you don’t have to come up with weights for each parameter. The program will naturally try to find the best compromise between all parameters.

    In case you don’t know, we want to square the terms so that bigger errors get punished more. For instance, say you have two layouts and are using two parameters in [0,10], with 0 being perfect and 10 being the worst. The first layout scores 5 on the first parameter and 5 on the second. The second layout scores 0 on the first parameter but 10 on the second. Just summing the parameter values, both layouts look the same. However, I contend that we’d rather favor layout 1, since, although it’s not perfect in anything, it’s also not super-awful in anything. Layout 2 is perfect in one thing, but absolutely wretched in another. Well, if we square the parameters, we get what we want: Layout 1 scores 5^2 + 5^2 = 25 + 25 = 50, while layout 2 scores 0^2 + 10^2 = 0 + 100 = 100. Lower scores are better, so layout 1 wins.

    Put another way, we don’t want to sacrifice one thing to make another thing really good–we want everything to balance out.
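A small sketch of the normalize-then-square scoring described above, under stated assumptions: the parameter names are hypothetical, and the bounds are taken as given rather than measured by minimizing and maximizing each parameter separately. It reuses the two-layout example, rescaled from [0,10] to [0.0,1.0].

```python
def normalize(value, low, high):
    """Map a raw parameter value from [low, high] onto [0.0, 1.0]."""
    if high == low:
        return 0.0  # degenerate bounds: the parameter cannot vary
    return (value - low) / (high - low)

def squared_score(stats, bounds):
    """Sum of squared normalized parameters. Lower is better."""
    total = 0.0
    for name, value in stats.items():
        low, high = bounds[name]
        total += normalize(value, low, high) ** 2
    return total

# Hypothetical bounds, as if found by optimizing each parameter on its own.
bounds = {"same_finger": (0, 10), "travel": (0, 10)}
layout_1 = {"same_finger": 5, "travel": 5}    # mediocre at both
layout_2 = {"same_finger": 0, "travel": 10}   # perfect at one, worst at the other

score_1 = squared_score(layout_1, bounds)  # 0.25 + 0.25 = 0.5
score_2 = squared_score(layout_2, bounds)  # 0.0 + 1.0 = 1.0
```

A plain sum would rate both layouts equally, while the squared version prefers the balanced one, matching the 50 versus 100 result above.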

    It works really well once you get a good set of parameters (which is the hard part). It sounds like you’re on the right track with the objectives you listed. In fact, everything you list but outward rolls and changing rows is factored into my evaluation function, somehow. Minimizing row changes tends to lead to lots of sequences on just the top or bottom rows, which for home-row-focused layouts is a lot less desirable than it sounds.

    One parameter I found interesting that you might want to try out is counting how many keys move from their Qwerty locations, with lower scores being better.
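One way the "keys moved from Qwerty" parameter could be counted is sketched below. Representing a layout as a 30-character string covering the three main rows is an assumption for illustration, not how either program necessarily stores layouts.

```python
# The three main rows of QWERTY, read left to right, top to bottom.
QWERTY = "qwertyuiopasdfghjkl;zxcvbnm,./"

def keys_moved(layout, reference=QWERTY):
    """Count positions whose character differs from the reference layout."""
    if len(layout) != len(reference):
        raise ValueError("layouts must cover the same number of keys")
    return sum(1 for a, b in zip(layout, reference) if a != b)

# The same three rows of Dvorak; only A and M stay in their QWERTY positions.
DVORAK = "',.pyfgcrlaoeuidhtns;qjkxbmwvz"
moved = keys_moved(DVORAK)  # 28
```

Like any other parameter, this count would need to be normalized before it is squared and added to the overall score.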

  8. TeckGecko said

    MTGAP, how did you determine the placement for all the special characters? You mention the special character layout being inspired by Arensito and while I do notice some similarities, the final configuration is actually fairly different.

    Further regarding the special characters, I’m curious as to what specifically you (dis)like about how they’re allocated in DDvorak ( Another layout that you may find has some merit is shown here:

    In regards to the evolutionary algorithm itself, the designers of Arensito and Capewell have some ideas that I’ve not seen you comment on here (though perhaps you did on the Colemak forum and I’ve not noticed).

    From Arensito’s designer:

    “We should also look for more efficient ways to permute/mutate a keyboard.

    A mutation that moves ‘e’ to a peripheral key should be less likely than the opposite move.
    Moving a frequent character to the shifted state should be less likely than the opposite direction.
    I am sure there are other smart mutation rules.


    Modify the Evolutionary Algorithm to include 48 keys, and possibly let us define characters for the thumbs.
    Modify the EA to make smart mutations
    The calculations must not use a list of the most frequently used words. Trigraphs may include characters from two different words, and punctuation characters break the word definition. We should base the information on monograph-, digraph- and/or trigraph-statistics (quadruple-statistics?), and possibly include additional statistics.
    (I have not studied the code yet, but we should…) Stop calculating the strain on a keyboard if it will be too high.
    Distribute the calculations of strain to clients on the internet. Or make a version that will run as a client process on people’s machines that calculates strain and forward results to a server.”

    And from Capewell:

    “Another thing I do differently than P.M. Klausler is that I don’t have the layouts race to type a weighted word list. Instead, it races to type a weighted key-combination list. I took a 120kb text file of random paragraphs of whatever I’ve read or written lately and counted 2- and 3-character combinations. I then threw out any 2-character combos with a space in them and any 3-letter combos that were not either character-space-character or space-space-character (space-space is also ‘\r\n’ to account for new paragraphs [It was easier to do it using the Windows-style endline]). This is both faster (~1900 layouts tested per second) and allows me to take into account the effects of ending one word and starting another. I had it originally implemented as simply reading through a text file, but this combo method scores within 2% of the text file method and is waaayyy faster. I did not try the word list method. The space-space-character combo indicates the beginning of a paragraph, so does not take into account any previously typed key. The character-space-character combo behaves like a character-character 2-key combo, except that the score is reduced in magnitude by a specified percentage since pressing space should let your fingers reset themselves somewhat… It currently does about 3000 layouts per second on a 2.8GHz Intel P4 Northwood. A typical run through the program takes about two to three hours and inspects 15+ million layouts”
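A simplified sketch of the key-combination counting Capewell describes, not his actual code. It keeps 2-character combos without spaces and 3-character combos of the form character-space-character; the space-space-character case for paragraph starts is left out to keep the example short, and the function name is made up.

```python
from collections import Counter

def count_combos(text):
    """Count space-free digraphs and character-space-character trigraphs."""
    digraphs = Counter()
    spaced = Counter()
    # 2-character combos: discard any that contain a space.
    for i in range(len(text) - 1):
        pair = text[i:i + 2]
        if " " not in pair:
            digraphs[pair] += 1
    # 3-character combos: keep only character-space-character.
    for i in range(len(text) - 2):
        a, mid, b = text[i:i + 3]
        if mid == " " and a != " " and b != " ":
            spaced[a + " " + b] += 1
    return digraphs, spaced

digraphs, spaced = count_combos("the cat")
```

Scoring a layout against these two tables means one pass over a few thousand weighted combos instead of one pass over the whole text, which is where the speedup he reports would come from.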

  9. mtgap said

    For the special characters, I had a few criteria, and was mostly trying to make it as simple and easy to remember as possible. I put all the math-related keys on one side. Keys that were on the same button in QWERTY were sometimes put next to each other. All the foreign-language keys were also put together. On top of that, I tried to organize everything roughly by frequency.

    I can’t say what I have against DDvorak, as I don’t really remember. I looked at it briefly just now, and can’t see any big problems.

    About the Arensito comments, I considered those ideas, but didn’t think of any good ways to implement them.

    About Capewell’s comments, I don’t know if I wrote about it anywhere, but that’s what I did. Instead of digraphs, I did trigraphs (which were necessary for how I was counting hand alternation), and I didn’t add in spaces.

  10. Phynnboi said

    Wow, 15+ million evaluations to convergence sounds really excessive. My program typically evaluates around 3,000 layouts before converging. Even at its least greedy (where I let it “climb down the hill” infinitely, if it wants), it rarely evaluates more than 20,000 layouts (and often does less than 10,000).

    I actually went to Capewell’s page, wanting to try out his layout (or at least run it through my program), but it seems to have 2 Ws and 0 Ms. Hmm…. There is a pretty obvious flaw I see right off the bat, though: TH, the most common digraph in English, is quite awkward to type on Capewell. I realize it’s a work in progress, but still, 15 million evaluations is a lot to end up with a problem like that. :/

