Philosophical Multicore

Sometimes controversial, sometimes fallacious, sometimes thought-provoking, and always fun.

Archive for the ‘New Keyboard Layout Project’ Category

New Keyboard Layout Project: The Optimal Layout?

Posted by Michael Dickens on August 28, 2009

MTGAP 3.5 is the best layout, even with several criteria changes:

' y l d b  j p u m ,
o i t s c  h n e r a
x q w g z  v f . k ;

So the question now is, is it really the best layout?

PROS
-Low same finger, most of which is on the strong middle or index finger.
-A lot of keys are in good positions. I very much like the position of l (ell).
-Very good inward rolls.
-A good balance of outward rolls to satisfy both those who find them easy and those who don’t like them.
-A good-looking home row.
-The word “are” is super easy to type.

CONS
-Same finger, while low, could be better.
-hand alternation is pretty bad, though better than Capewell or Arensito.

How does it compare to Colemak, MTGAP 2.0, and Carpalx GYLMW?

Read the rest of this entry »

Posted in Keyboard Release, Keyboards, New Keyboard Layout Project | 6 Comments »

New Keyboard Layout Project: Followup

Posted by Michael Dickens on August 27, 2009

My previous post was erroneous. MTGAP 3.6 was in fact the convergence point. But MTGAP 3.5 still scores better for some reason.

Posted in Keyboards, New Keyboard Layout Project | Leave a Comment »

New Keyboard Layout Project: Another Cost and a Strange Development

Posted by Michael Dickens on August 27, 2009

I added another fitness element. If the curve follows the natural roll of the hand, then it should be supported. I made it cheaper to change rows if the row changes were (QWERTY positions) DV/VD, FE/EF, MK/KM, or IJ/JI. After adding that and running the program once, I got this strange result:

, p h d k  j c u l y
a n o t m  g s e r i
q b . f z  v w ; x '

It looks so different from the other results that I don’t entirely trust it. But h and o on the same finger? What? But perhaps surprisingly, it actually has very low same finger.

I ran a comparison, and it turns out that it is actually not the best layout for the score set. Apparently, adding that extra fitness criterion caused the program to take longer to converge upon the best layout.

MTGAP 3.5

k c f g b  j h u w .
o s a t d  l n e r i
q v , p z  ; m y x '

Fitness:       18.99
Distance:      1822.85
Inward rolls:  7.50%
Outward rolls: 5.27%
Same hand:     22.29%
Same finger:   0.85%
Row change:    9.89%
Home jump:     0.30%
To center:     3.16%

MTGAP 3.6

, p h d k  j c u l y
a n o t m  g s e r i
q b . f z  v w ; x '

Fitness:       19.49
Distance:      1864.59
Inward rolls:  8.67%
Outward rolls: 6.06%
Same hand:     24.29%
Same finger:   0.76%
Row change:    11.67%
Home jump:     0.23%
To center:     2.92%

3.6 has better same finger, but 3.5 is a good deal better overall. How could this happen? At this time, I have no idea. I will probably not be using the additional criterion, since it makes the convergence rate so slow.

Posted in Keyboard Release, Keyboards, New Keyboard Layout Project | 6 Comments »

New Keyboard Layout Project: Some Cost Modifications

Posted by Michael Dickens on August 22, 2009

Today, I played around with the costs. I found a few things that didn’t work too well, and a few that did. I came across something interesting, though. I reduced the cost of outward rolls to zero, and I got a somewhat different layout. There was too much same finger usage for my taste, so I increased the cost of same finger. Interestingly, I got the exact same result as MTGAP 3.2.

k c f g b  j h u w .
o s a t d  l n e r i
q v , p z  ; m y x '

I don’t like the way the “he” digraph requires that I bend my hand, so I added something to the program to take that into account. When I factored that in, I came up with this layout, which I like a lot:

MTGAP 3.5

k y d c b  j f u l ,
o i t s g  h n e r a
q ' w p z  v m . x ;

This solves some of the problems of MTGAP 3.2. The “he” digraph is much easier in my opinion. Inward rolls are improved, and that awkward “he” is eliminated. There are some problems, though. Distance, same hand and same finger are all increased. So is this layout really any better?

You may wonder why I care so much about the “he” digraph. I mean, it’s just one digraph. But the thing is, it’s the second most common digraph after “th”. That digraph alone has a significant impact.

I will continue to look for a good balance.

Posted in Keyboards, New Keyboard Layout Project | 6 Comments »

New Keyboard Layout Project: Overview

Posted by Michael Dickens on August 19, 2009

http://mtgap.wordpress.com/2009/08/18/new-keyboard-layout-project-efficient-genetic-algorithm-optimization/

For p = 48, b = 16, k = 1, o = 1024, q = 2048, the program got the best layout possible in just over fifteen minutes. Considering that to find the best layout by brute force would take a billion times longer than the universe has existed for, that’s a pretty impressive feat.

Even more impressive is q = 64. You might say, “Why in the heck was q so large in the first place?” My rationale was this: it’s the final round. You want to get the best possible layout. And even in the worst-case scenario, if q = 2048 the keyboard has a 99% chance of improvement. But it looks like that was excessive, because q = 64 has about a 50% chance of getting the best possible layout, and in only about ten minutes. (I ran it twice: one time it got the best layout, and the other time it got a layout that is very very close to the best, but not quite.) That all-star round with 1024 layouts and 2048 iterations was gobbling up a lot of time.

I am currently using this cost distribution. See my keyboarding page for an explanation of what these mean.

Inward roll:  -14
Outward roll:  -4
Same hand:      2
Same finger on...
    pinky:    100
    ring:      90
    middle:    70
    index:     55
Row change:    10
Home row jump: 40
	index: -30
To center:     10

Also, each position on the keyboard has a cost based on its difficulty to reach. These costs are not just based on my own assessment, but also on the placement of keys on some of the better keyboards out there.

 70 40 30 30 60   80 30 30 40 70
 10  4  0  0 30   30  0  0  4 10
 90 85 60 50 95   70 40 60 80 90

Normally the ring finger and pinky would cost 0 since no movement is required, but I wanted to keep E and T off of the pinky and E off of the ring finger. Those letters are far more common than any other, and it would put too much stress on those fingers.

And the best possible layout under this scoring system is:

k c f p b  j h u w .
o s a t d  l n e r i
q v , g z  ; m y x '

By the way, this layout is officially called MTGAP 3.2. You don’t have to worry about the version-numbering system; it’s a little erratic.

Is this really the best possible layout, or is the scoring system flawed? Let’s look at some of the problems with this layout.

-The fa/af digraph is fairly common. I assume that the program made it because the middle finger can handle same finger usage rather well (at least according to the scoring system). Also, if you’ve ever tried, it is exceptionally hard to find stuff to put on the same finger as ‘a’. It is usually placed on the pinky and then put with either a really rare letter like q or j, or with punctuation. But the algorithm thought it better to put a on the middle finger. Why?

To find out, I worked with rearranging some parts of the layout.

-You can’t switch i and a. i can only be paired with u, y, and punctuation. The spot above the middle finger is too good for y, so that won’t work. e is already using u and there isn’t really anything else for e to go with, so u is out. That just leaves punctuation. But the spot above the middle finger is still too good for punctuation. What does that leave? Pretty much nothing. f, though, is common enough that it can be placed in that spot. (By the way, if all this talk about letter frequency is confusing you, check out this graph.

-You can’t switch a and s. You end up with the exact same problem as you had initially: what do you put over the a?

-You can’t switch o and a. o has the same problem as a and i, only worse. Remember when I said that a is exceptionally hard to pair with other letters? Well, o and i have it even worse, and e has it worst of all.

So how have other layouts coped with this problem? By making sacrifices.

q w f p g  j l u y ;
a r s t d  h n e i o
z x c v b  k m , . /

Colemak copes with the problem by sacrificing some finger travel distance. Colemak has the lowest same finger usage of any layout I’ve seen, but it can only do this by making sacrifices. c is placed in a hard-to-reach position, for one.

Colemak actually copes remarkably well. Capewell and Arensito do not cope very well at all. They end up placing punctuation and rare letters where they shouldn’t be. Dvorak is even worse. So it looks like having the fa/af combination on the same finger is the best choice after all.

I will look for more potential problems later. Right now I’ve got some other fish to fry. As always, I welcome comments and feedback.

Posted in Keyboarding Theory, Keyboards, New Keyboard Layout Project | 8 Comments »

New Keyboard Layout Project: Efficient Genetic Algorithm Optimization

Posted by Michael Dickens on August 18, 2009

See part 1 and part 2.

I’m using a slightly modified algorithm, designed for increased performance.


Step 1: Create a pool containing p randomly-generated keyboard layouts.
Step 2: Score each keyboard according to a fitness function and sort the keyboards by score.
Step 3: Randomly delete half of the pool (giving preference to keyboards with lower fitness) and create a copy of each remaining keyboard.
Step 4: Mutate the copies by randomly swapping the positions of two keys m times.
Step 5: Repeat steps 2-4 until the best keyboard in the pool has not changed for b rounds.
Step 6: Place this best k keyboards in pool O and sort each keyboard in O by score.
Step 7: Repeat steps 2-6 until O contains o layouts.
Step 8: Repeat steps 2-4 using pool O until the best keyboard in the pool has not changed for q rounds.

I did some empirical testing and came up with an optimized algorithm. For p = 2048, b = 16, k = 32, o = 4096, q = 2048, it took five hours to come up with a layout with fitness 2304432 (remember that lower is better). But for p = 128, b = 16, k = 1, o = 1024, q = 2048, it took only one hour to come up with a layout with fitness 2304182, which is slightly better. Not only that, but the second set of values optimizes faster. I ran the first version twice, and each time came up with a different layout (one scored 2304432, the other scored worse). But when I ran the second version twice, I came up with the same layout both times. This means that it is probably the best possible layout given the criteria. Here’s the layout:


kcfpb jhuw.
osatd lneri
qv,gz ;myx'

Not bad, eh?

Soon, I will post a comparison of this layout and various other layouts. For now I’ll just say that this layout beats out MTGAP 2.0 on every criterion but one. Impressive, huh?

Posted in Keyboarding Theory, Keyboards, New Keyboard Layout Project | Leave a Comment »

New Keyboard Layout Project Update: Bug Fixed

Posted by Michael Dickens on August 16, 2009

I discovered a bug that was completely skewing the results of the output. It was very hard to catch, since C arrays make it way too easy to screw up. It turns out that I had been going outside the bounds of an array, messing up the results. But now that it’s fixed, the layouts are looking better. It got this one after only two and a half minutes.


k c p g w j l u f .
a s i t d h n e r o
q v y b z ; m , x '

Posted in Keyboards, New Keyboard Layout Project | Leave a Comment »

The New Keyboard Layout Project: Quick Update

Posted by Michael Dickens on August 11, 2009

I have recently been working on an evolutionary algorithm in C. I’m still working out the technical details, but it’s really getting there. After just three and a half days of writing the program, it is already capable of getting some decent output. Just now, I got this after only a few minutes:

q f y p b j l u m .
a s i t d h n e r o
k c ' w g z v ; x ,

A comparably good layout would have taken about two hours to get in Ruby. Not only is C faster, but the program I’ve written is much more efficient.

I am working on a modification that will give costs to keyboard shortcuts.

This is my current scoring system:


same finger moving outward +100
same finger moving inward +50
jump over the home row* +40
same key twice in a row +10
change rows* +10
move to the center* +10
same hand +4
outward roll -8
inward roll -20

Assume that both key presses are on the same hand.

Posted in Keyboards, New Keyboard Layout Project | Leave a Comment »

Optimized Evolutionary Algorithm for Keyboard Design Part 2

Posted by Michael Dickens on August 11, 2009

Warning: Math-heavy post ahead. Read at your own risk.

In part 1, we looked at appropriate iteration numbers and mutation rates for when there is only one keyboard per pool. Let’s look at our algorithm:

Step 1: Create a pool containing p randomly-generated keyboard layouts.
Step 2: Score each keyboard according to a fitness function and sort the keyboards by score.
Step 3: Randomly delete half of the pool (giving preference to keyboards with lower fitness) and create a copy of each remaining keyboard.
Step 4: Mutate the copies by randomly swapping the positions of two keys m times.
Step 5: Repeat steps 2-4 until the best keyboard in the pool has not changed for b rounds.
Step 6: Place this best keyboard in pool O and sort each keyboard in O by score.
Step 7: Repeat steps 2-6 until the best keyboard in pool O has not changed for o rounds.
Step 8: Repeat steps 2-4 using pool O until the best keyboard in the pool has not changed for q rounds.

In this post, we will discuss the optimal value for b; that is, the number of rounds it iterates before it stops. Remember that it only stops if the best layout has not changed for b rounds. I will also discuss what happens when a keyboard gets “stuck”; that is, it is not yet optimal but there is no possible improvement.

Let’s look at when a layout will usually get stuck. Say every key is at its 2nd best possible position. The only way a mutation could improve the layout is if, when two keys are swapped, they both improve. Every key only has one possible better position (1 out of 29) so the chance of both keys improving is (1/29) * (1/29), or 1/841. Since the swap could go in either direction, we multiply the chance by two and get 1/420. So the chance of any particular mutation being an improvement is 1 out of 420. But there are 435 possible mutations. So the chance of at least one of them being an improvement is 1 - (419/420)^435, or ~0.65. Worst-case, the chance of finding this improvement is 1/435. So about 2000 rounds (1 - (434/435)^2000 ~ 0.99) are necessary to ensure that the improvement happens. But we can get away with a good deal less, since it’s likely that there will be more than one possible improvement.

But what if we’re satisfied with something worse? Then even fewer rounds are necessary. The chance of improvement is even higher if we are satisfied with every key being in the 3rd best position. Or the 4th.

Let’s assume that every key is in the 3rd best position. The chance of there being one possible improvement is ~0.67. We’d have to take about 2000 iterations to be 99% sure that we’ve found it. After only 500 iterations, we could be about 70% sure. But what if there are two possible improvements? Assuming that every key is in its 3rd best position, there is a 43% chance that there are two possible improvements. If there are two possible improvements, we can be 99% sure that we’ve found one of them after only 1000 rounds. So assuming that every key is in its best possible position, a reasonable value for b is around 500 to 1000.

But there is another factor to take into account: the size of the pool. Assume that the pool has 1000 keyboards in it. It is possible that one of these keyboards is better than all the others. Through the selection process, its “genes” will slowly take over the whole pool. So it’s entirely possible that every single keyboard is close “relative” to just one original keyboard. If they are all so similar, then the chance of improvement is higher since not just one layout could improve, but any layout could improve. So if there are 1000 keyboards then b doesn’t need to be so large. Just what b should be is harder to calculate.

Posted in Computer Science, Keyboarding Theory, Keyboards, Math, New Keyboard Layout Project, Programming | 4 Comments »

Optimized Evolutionary Algorithm for Keyboard Design Part 1

Posted by Michael Dickens on August 7, 2009

Warning: Math-heavy post ahead. Read at your own risk.

Trying to optimize a keyboard layout is tricky business. There are 30 factorial (30*29*28…) possible keyboard layouts. That’s about 10 to the 32nd power. This is far too many to search through every single possible keyboard layout to find the best. So an evolutionary algorithm (sometimes I call it a genetic algorithm, but there are subtle differences so that’s technically wrong) is a good way to find an optimal solution. This site provides a good introduction to genetic algorithms.

A keyboard optimization consists of several steps.

Step 1: Create a pool containing p randomly-generated keyboard layouts.
Step 2: Score each keyboard according to a fitness function and sort the keyboards by score.
Step 3: Delete the lowest-scoring half of the pool and create a copy of each remaining keyboard.
Step 4: Mutate the copies by randomly swapping the positions of two keys m times.
Step 5: Repeat steps 2-4 until an optimal keyboard is reached.

This is not the complete algorithm, as you will see later. And the fitness function is an entirely different matter. But even so, there are a few questions. What values should be used for p and m? How do we know when an optimal keyboard is reached?

I’ll start by answering that last question. I know of three ways to know when to stop. The first and simplest is to just run the algorithm for some predetermined number of rounds. The advantage to this is that it’s easy and doesn’t require much computing power. The disadvantage is that you don’t actually know if the resulting keyboard is the best you can get. To be very sure that the keyboard is the best you can get, you have to do a huge number of rounds, so if you really want to be sure then it’s not fast at all.

The second way is to stop when the best keyboard in the pool has reached a certain level of fitness. This can be tricky. You have to know what level of fitness you want. If you know that, this method can get you to exactly where you want but no better. The problem here is, what if it’s impossible to reach the desired level of fitness? Sometimes a keyboard will get “stuck” where it’s not yet at the optimal keyboard, but there is no possible improvement. Let’s imagine a scenario. Say that QWERTY is the perfect keyboard layout, and we’re trying to evolve towards it. QWERTY is of course not the perfect keyboard layout, but for the purpose of this exercise let’s say it is. Let’s say our algorithm has come up with this:

Current Optimal Keyboard-------QWERTY
QSCWD PO;IL------------------QWERT YUIOP
VEFBR UKYJM------------------ASDFG HJKL;
ZAXGT HN,./------------------ZXCVB NM,./

Let’s say that any possible change will make the keyboard worse. But it still hasn’t reached the optimal keyboard. So it’s stuck. It can’t change, so it can’t reach QWERTY. It has to get worse before it gets better. Are there any solutions to this problem? Perhaps; I’ll discuss this more later. What this shows for now is that the second method is not optimal.

The third method is this: look at the best keyboard layout in the pool and remember it. If the best layout stays the same for b rounds, that is, it isn’t replaced with any better keyboard, then we can be pretty sure that the algorithm will never find a better keyboard layout. This method can be unpredictable, but it is the most likely to efficiently find the best possible keyboard layout.

Now let’s revisit the problem of getting “stuck”. I know of three ways to fix this. Since these three are not mutually exclusive, any or all of them can be used.

The first method is fairly simple. Generate an optimized layout as described above. Then generate another optimized layout. Continue generating optimized layouts some number of times. Any of the three stopping methods described above can also be used here to decide when to stop. The difference here is that each resulting keyboard is created independently, so they don’t get “stuck”. Advantages: the layouts can be very different, and variety is good; getting stuck every time becomes extremely unlikely. Disadvantages: it’s very slow. To get really optimal results, this method can make the whole process take a thousand times longer or more.

The second method is to use roulette selection. Instead of deleting the worst half of the layouts in the pool, delete keyboards randomly. But not completely randomly: worse keyboards have a higher probability of being deleted. Advantages: it is much less likely that a layout gets stuck; it’s faster than the first method. Disadvantages: as with any random function, it’s unpredictable so it’s possible to completely miss a really good layout; this doesn’t matter so much, but it requires significantly more program code than there would be otherwise.

The third method involves increasing the value of m, the number of mutations. If there are more mutations per iteration, it will increase the potential for change. This allows a layout to get a lot better. But it also allows a layout to get a lot worse, so a high value for m will seriously slow down the program.

For most effective optimization, the first two methods will be used. Now let’s look at our new and improved algorithm.

Step 1: Create a pool containing p randomly-generated keyboard layouts.
Step 2: Score each keyboard according to a fitness function and sort the keyboards by score.
Step 3: Randomly delete half of the pool (giving preference to keyboards with lower fitness) and create a copy of each remaining keyboard.
Step 4: Mutate the copies by randomly swapping the positions of two keys m times.
Step 5: Repeat steps 2-4 until the best keyboard in the pool has not changed for b rounds.
Step 6: Place this best keyboard in pool O and sort each keyboard in O by score.
Step 7: Repeat steps 2-6 until the best keyboard in pool O has not changed for o rounds.
Step 8: Repeat steps 2-4 using pool O until the best keyboard in the pool has not changed for q rounds.

This is a bit longer, but still pretty simple. The remaining question is, what should the values be for the constants p, m, b, o and q?

Let us assume that p = 1 and m = 1 (that is, there is one layout in the pool, and a mutation swaps the positions of two keys). This makes it very simple to calculate b. We want to iterate long enough to be pretty sure that no improvement will be made. The keyboard that is least likely to improve is one where every key is in the best spot, except for two which are exactly swapped. So swapping those two keys is the only possible way to make the keyboard better. The probability of swapping those two keys is one out of 435 (two out of 30 chance to select one of the two keys multiplied by a one out of 29 chance to select the matching key).

Since If the keys to swap are being selected randomly, then b ≈ 1000. But if the key swaps are not actually random but are instead cyclical in a pseudorandom way, then after 435 iterations you can be guaranteed to hit the one swap that’s an improvement. So if there are no changes for 435 iterations, you can be positive that the keyboard is not going to get any better.

This can be extrapolated for higher values of m. Let’s say that m = 2. Worst-case scenario, the only possible improvement is two specific swaps. The probability of swapping those two pairs of keys is one out of 94612.5 (four out of 30 chance to select one of the four keys multiplied by a one out of 29 chance to select the matching key, multiplied by a two out of 30 chance to select a key from the remaining pair multiplied by a one out of 29 chance to select the matching key). We can then create a formula for any value of m where p = 1.

b = (1/29)^m * [(2m / 30) * (2(m-1) / 30) * (2(m-2) / 30) * . . . * (2(1) / 30)]

From this we can try to find an optimal value for m. We want to find a balance between mutability and efficiency. Perhaps the balance is as low as m = 1, since there are better ways to allow for mutability.

So we have m = 1 and b = 435. But should b really be such a large number? If it’s smaller, there will be a lower chance of reaching the optimal layout, but the program will run much faster. Instead of looking at the second best layout, let’s look at the third best. That is, there are two possible swaps that would be an improvement. The probability of getting one of these swaps in a random mutation is 1 out of 218. (This is of course an oversimplification; the second best layout might not be at all similar to the best layout. But this model works for our purposes.)

In general, if we are satisfied with the nth best layout, we should set b = (2n/30) * (1/29). You may say, Why would we compromise? Well, remember that we are later going to move on to an optimized pool O, and in this pool we can be more stingy. The average layout has about 218 possible improvements, so a layout with only, say, 50 possible improvements is going to be one of the better ones. To reach a layout with 50 possible improvements, only about 9 iterations are necessary. If we want a layout with only 5 possible improvement, 87 iterations are necessary. All things considered, that’s a pretty good equilibrium.

Now we must factor in what happens when we have p > 1; that is, there is more than one keyboard in the pool. Things become much more complicated. That will be the topic of discussion in part 2.

Posted in Computer Science, Keyboarding Theory, Keyboards, Math, New Keyboard Layout Project, Programming | 5 Comments »

 
Follow

Get every new post delivered to your Inbox.

Join 450 other followers

%d bloggers like this: