keyboard mapping issues when using non-US layout #4382

Open
phil-hands opened this issue Dec 2, 2021 · 37 comments

@phil-hands
Contributor

If one selects a non-US keyboard within the SUT and then uses type_string(), one sees, for example, that z and y are swapped when typing with a German keyboard selected.

It seems that this is because keys are mapped to keycodes in consoles/VNC.pm, which continues to assume a US layout, or perhaps the VM continues to apply a US keycode-to-layout mapping, or some such.

I notice that one can tell the VM to use another layout by setting KVMKB.

One can see the problem in this screenshot, where the colour code it's trying to type is #000000 but as a result of selecting a British keyboard during the install, that comes out as £000000 (which happens to work anyway :-) )

Interestingly, if one sets KVMKB=en-gb that changes to ~000000 as seen here, which I assume to be because consoles/VNC.pm includes code that puts # in shift_keys, apparently declaring # to be a shifted 3. On a UK keyboard however, # is not shifted, and is instead on the key adjacent to the lower part of the return key. If one shifts the # on a UK keyboard, one gets a ~ which I presume is where the ~ is coming from above.

This would seem to depend upon the backend in use, so I've no idea if a similar issue exists for non-KVM backends, or if there are related but different issues.

If it's actually possible, it would be nice to be able to inform the SUT that one's selected a non-US mapping, so that when one says one is wanting to type e.g. y or #, that each layer of this is sufficiently aware of the mapping to ensure that the same key gets to the SUT unchanged. Failing that, this problem and possible work-arounds should be documented somewhere, probably in the type_string() docs.
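The two symptoms described above can be reproduced with a toy model. This is only a sketch under stated assumptions: the tables cover a handful of keys, the key names (`KEY_3`, `KEY_HASH` and so on) are made-up labels rather than real keycodes, and the "extra shift" flag models my presumed explanation of the `~`, not confirmed behaviour of VNC.pm or QEMU.

```perl
use strict;
use warnings;

# Which physical key (plus shift state) produces a character, per layout.
# Key names are illustrative labels, not real keycodes.
my %press_for = (
    us => {'#' => ['KEY_3', 1],    '@' => ['KEY_2', 1],          '"' => ['KEY_APOSTROPHE', 1]},
    gb => {'#' => ['KEY_HASH', 0], '@' => ['KEY_APOSTROPHE', 1], '"' => ['KEY_2', 1]},
);

# What a guest using a UK layout prints for a physical key and shift state.
my %char_for = (
    gb => {
        'KEY_3/1'    => '£', 'KEY_2/1'    => '"', 'KEY_APOSTROPHE/1' => '@',
        'KEY_HASH/0' => '#', 'KEY_HASH/1' => '~',
    },
);

# $vm_layout:      mapping used to pick the physical key (what the VM-side setting changes)
# $force_us_shift: assumption - shift is added whenever the character is shifted on a US keyboard
sub guest_sees {
    my ($char, $vm_layout, $guest_layout, $force_us_shift) = @_;
    my ($key, $shift) = @{$press_for{$vm_layout}{$char}};
    $shift = 1 if $force_us_shift && $press_for{us}{$char}[1];
    return $char_for{$guest_layout}{"$key/$shift"};
}

print guest_sees('#', 'us', 'gb', 1), "\n";    # US key choice, UK guest: £
print guest_sees('#', 'gb', 'gb', 1), "\n";    # UK key choice, shift still added: ~
print guest_sees('@', 'us', 'gb', 1), "\n";    # the @ that turns into "
```

If the extra shift were not applied in the second case, the model would give `#`, which is what one would want.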

@okurz
Member

okurz commented Dec 2, 2021

Not sure about a resolution to your problem, but I know of tests that test a different keyboard layout; see
https://openqa.opensuse.org/tests/latest?arch=x86_64&distri=opensuse&flavor=DVD&machine=64bit&test=installer_extended&version=Tumbleweed
Surprisingly, this test does not add any special parameters to instruct the VM about the keyboard map.

@phil-hands
Contributor Author

phil-hands commented Dec 2, 2021

That test seems to rely on the behaviour in question.

Looking at the line where it types stuff, it actually types azerty on a French keyboard in order to see qwerty displayed.

Similarly, if you hunt for the string kezboard you'll find tests that are trying to type keyboard via a German layout.

@Martchus
Contributor

Martchus commented Dec 2, 2021

The test referenced by @okurz does this:

  1. Switch the keyboard layout to France
  2. Type "azerty".
  3. Expect "qwerty" to be actually typed.
  4. Switch the keyboard layout back to US so the rest of the test is unaffected.

So this test in fact relies on os-autoinst still using a (virtual) US keyboard all the time regardless of what the SUT expects.
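The effect of those steps can be mimicked with a simple character swap. This is a sketch only: it covers just the letters that differ between QWERTY and AZERTY in the examples here (a/q and z/w), and the function name is made up.

```perl
use strict;
use warnings;

# Letter keys that differ between QWERTY and AZERTY: a<->q and z<->w.
# (m and the punctuation keys also move, but are left out of this sketch.)
sub seen_by_french_guest {
    my ($typed) = @_;
    return $typed =~ tr/aqzwAQZW/qawzQAWZ/r;
}

print seen_by_french_guest('azerty'), "\n";    # qwerty
print seen_by_french_guest('banana'), "\n";    # bqnqnq
```

Because the swap is its own inverse, the same function also answers "what do I have to type to get this string on a French guest".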

I suppose you're asking for a way to change the layout of os-autoinst's virtual keyboard from US to something else? It looks like VNC.pm already allows assigning a different keymap. However, this functionality is not exposed as a test API function yet, and so far the only alternative mapping is "ikvm".


@phil-hands I hadn't seen your comment when I submitted mine.

@phil-hands
Contributor Author

phil-hands commented Dec 2, 2021

@Martchus I think that's probably right.

I suspect that setting KVMKB does most of what I want.

If you ran that azerty/qwerty test with KVMKB=fr-fr (or whatever the default French keyboard layout is) then I would guess that it would display azerty, and that if you were to check which keycodes were used to achieve that, it would have been the first five keys on the top row.

Just to make things awkward, the case I was caring about was the UK keyboard, and it seems that there are some assumptions about # and it being shifted in the code that do not hold for a UK keyboard, which should perhaps be handled as a separate bug.

My hope would be that one could end up in a situation where one could write a test that has an @ being typed into a shell, and not get a " instead if you happen to have the wrong mapping, or at least some documentation about the problem and how to handle it (which I'm happy to write myself, once I'm more certain what the right thing to do is, and where to put that documentation).

BTW the @ vs. " thing breaks curl's @filename magic when collecting data. I can work around it by setting the mapping back to US, but I'd rather fix it properly.

One could kludge around it by having a function that takes a string and turns it into whatever one has to type in order to fool the SUT into thinking you typed that string, but this doesn't strike me as the most elegant solution to the problem ;-)
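For what it's worth, such a kludge might look roughly like the sketch below. The helper name and the table are hypothetical (nothing like this exists in testapi), only a few UK characters are covered, and the last two entries assume that the UK # key sends the same keycode as the US \ key.

```perl
use strict;
use warnings;

# Hypothetical helper: for a guest using a UK layout behind a keyboard that
# is assumed to be US, swap each character for the one that has to be typed.
my %type_instead_gb = (
    '@' => '"',     # UK shift+' gives @, US shift+' gives "
    '"' => '@',     # UK shift+2 gives ", US shift+2 gives @
    '#' => '\\',    # assumption: the UK # key sits where the US \ key is
    '~' => '|',     # ... and this is its shifted form
);

sub fool_gb_guest {
    my ($wanted) = @_;
    return join '', map { $type_instead_gb{$_} // $_ } split //, $wanted;
}

print fool_gb_guest('curl --form upload=@file'), "\n";    # curl --form upload="file
```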

@Martchus
Contributor

Martchus commented Dec 2, 2021

Stupid question, but what's that KVMKB variable you're talking about? I'm only aware of QEMU's -k language CLI option.

@phil-hands
Contributor Author

Oops, sorry.
What I actually meant to say was VNCKB (I have a slight tendency to mix up TLAs, I'm afraid ... I should have checked)

BTW That's VNCKB as documented here: https://github.com/os-autoinst/os-autoinst/blob/master/doc/backend_vars.asciidoc

@AdamWill
Contributor

AdamWill commented Dec 2, 2021

FWIW, in Fedora's instance we also kinda expect/rely on the "qemu is always emulating a US keyboard" behaviour. I think we actually wrote a version of the "not elegant" function @phil-hands suggests at some point, but I think we've factored that out since...we just have several places where we type the "right wrong thing" in French :)

@phil-hands
Contributor Author

Additionally, Fedora has console_loadkeys_us() in lib/utils.pm in their tests, which runs loadkeys us in a terminal so that the subsequent scripting need not worry about this problem (as long as you only need to worry about French and Japanese keyboards).

I've added British as an option, and done it in a way that still works in the Debian Installer environment (which does not include the US mapping file, so you need to create it). Obviously, I should change the French & German options to do the same ckbcomp us trick if I want that to also work in D-I, but I thought I'd see what the outcome here was first.

The kludgy nature of this was another thing that prompted me to open this issue, since I can certainly imagine someone in Debian deciding that they want to really test alternate keyboard layouts, so I'd prefer to make something a bit more generally applicable if possible.

@AdamWill
Contributor

AdamWill commented Dec 3, 2021

@phil-hands if you want to poke at this, the code is mainly in os-autoinst VNC.pm starting at line 520. The docs for the "other end" are here. It does note, for instance, the issue you mention with keys that are shifted in some layouts but not in others: "The "shift state" (i.e. whether either of the Shift keysyms are down) should only be used as a hint when interpreting a keysym. For example, on a US keyboard the '#' character is shifted, but on a UK keyboard it is not. A server with a US keyboard receiving a '#' character from a client with a UK keyboard will not have been sent any shift presses. In this case, it is likely that the server will internally need to "fake" a shift press on its local system, in order to get a '#' character and not, for example, a '3'." I would assume qemu can do most of that handling, but I've never tested it.

It's notable that there are actually two different ways we could send keypresses to qemu, here. The one we're using is the older, VNC-standard one ("KeyEvent"), where we (the VNC client) send a keysym (approximately, a value that represents a character, not a physical key) to the server (qemu). qemu supports an extension, "QEMU Extended Key Event Message", which allows passing both a keysym and a scancode, with the keysym optional (0 means "unknown"). This is clearly an improvement for a human using a physical keyboard, because it avoids an unnecessary layer of translation (qemu translating a keysym to a keycode for the guest VM to translate back into a keysym becomes just qemu passing through a keycode for the VM to translate into a keysym). I'm not sure it'd be better for us, but it's worth thinking about, I guess.

It would, I think, at least be a more 'true' way to test alternate keyboard layouts extensively, if you really wanted to do that.
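To make the difference between the two messages concrete, here is a sketch that only builds the bytes and does not talk to any server. The field layouts are as I understand them from the RFB protocol description, so treat them as an assumption to check against the spec; the constant names are made up.

```perl
use strict;
use warnings;

use constant XK_a => 0x61;    # X11 keysym for the character 'a'
use constant XT_A => 0x1e;    # PC/XT scancode of the key labelled A on a US keyboard

# Standard RFB KeyEvent (assumed layout): type 4, down-flag,
# 2 bytes of padding, 32-bit keysym.
sub key_event {
    my ($down, $keysym) = @_;
    return pack('CCnN', 4, $down ? 1 : 0, 0, $keysym);
}

# QEMU Extended Key Event (assumed layout): type 255, subtype 0,
# 16-bit down-flag, 32-bit keysym (0 = unknown), 32-bit keycode.
sub qemu_ext_key_event {
    my ($down, $keysym, $keycode) = @_;
    return pack('CCnNN', 255, 0, $down ? 1 : 0, $keysym, $keycode);
}

print unpack('H*', key_event(1, XK_a)), "\n";                 # "press the character a"
print unpack('H*', qemu_ext_key_event(1, 0, XT_A)), "\n";     # "press this physical key"
```

The first message says nothing about which physical key is meant; the second one can leave the keysym out entirely.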

@AdamWill
Contributor

AdamWill commented Dec 3, 2021

BTW, I think I did experiment with VNCKB briefly but found it actually more trouble than help; the problem is the tests where we test non-US layouts do actually use a US layout configured for quite some time - all the way up until we actually get to configure the keyboard layout in the installer - and so if we need to type anything at that point, e.g. on the kernel command line when booting the installer, we have to translate it "the other way" if we have VNCKB set...

@stale

stale bot commented Mar 5, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Mar 5, 2022
@phil-hands
Contributor Author

[sorry for leaving this fallow for a while ... I'd not forgotten about it, just been busy.]

It seems to me that I need to add a function that allows one to set the mapping to use, so that one can call something like set_keyboard_mapping('en-gb'). I'd also tweak init_ikvm_keymap so that it uses that setting when creating the keymap, so that characters get passed on without being mangled.

However I'm a bit vague about how/where I'm supposed to add that in a way that makes it available via the testapi.

It seems that the required mappings ought to be already defined somewhere (vnc? qemu? in the Linux kernel?), in which case using that mapping as the basis of the conversion in KVM.pm would seem to be a way of doing it without having to hand-code a series of locale-based mappings in the style it's been done already for the US mapping.

Any suggestions on where to find the relevant info would be very welcome.

@stale stale bot removed the stale label Mar 8, 2022
@Martchus
Contributor

I found key-mappings in the /usr/share/kbd/keymaps folder on my machine (provided by the kbd package, see http://kbd-project.org and https://wiki.archlinux.org/title/Linux_console/Keyboard_configuration). Not sure whether these kinds of keymapping files are helpful to us.

@phil-hands
Contributor Author

OK, so I have come across mappings in various places that seem like what's available there, but I discounted them because, for example, if one looks at the apostrophe in US mapping from the kbd-project:

https://github.com/legionus/kbd/blob/master/data/keymaps/i386/qwerty/us.map#L36

you see that it's mapping decimal 40 to be the apostrophe (so hex 0x28, octal 0050)

whereas in the VNC.pm file we see that the code for that key is 0x34:

https://github.com/os-autoinst/os-autoinst/blob/master/consoles/VNC.pm#L542

As another example, the Escape key is 1 in the kbd mapping, but 0x29 in the VNC.pm mapping.

This appears to be a distinction between scancodes and keycodes or some such, and I'd presume that there's a mapping from one to the other somewhere (in the linux kernel perhaps?) which one could use to generate the numbers we need in VNC.pm, probably by combining them with the mapping from the kbd-project, such that we wouldn't need to reinvent the wheel for each keyboard we're interested in supporting.
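The two numbering schemes can be put side by side. The sketch below assumes that the kbd values are Linux input keycodes and that the VNC.pm values are USB HID usage IDs; only three keys are listed, and the entry for the A key is added purely for comparison.

```perl
use strict;
use warnings;

# The same physical keys in two numbering schemes (a few keys only).
# Assumption: kbd keymaps use Linux input keycodes, VNC.pm's table uses USB HID usage IDs.
my %codes = (
    escape     => {linux_keycode => 1,  usb_hid => 0x29},
    apostrophe => {linux_keycode => 40, usb_hid => 0x34},
    a          => {linux_keycode => 30, usb_hid => 0x04},
);

for my $key (sort keys %codes) {
    my $keycode = $codes{$key}{linux_keycode};
    printf "%-10s keycode %2d (0x%02x, 0%o)  HID usage 0x%02x\n",
        $key, $keycode, $keycode, $keycode, $codes{$key}{usb_hid};
}
```

A generated table of this kind, joined with a kbd keymap, is roughly what would be needed to avoid hand-coding each layout.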

@AdamWill
Contributor

Well, a few things.

First of all those values you're looking at are from $keymap_ikvm, which are USB scancodes - there's a reference linked in the comment at the start of the hash. This bug seems to be more about qemu than iKVM. For the qemu case, the codes we send are defined in $keymap_x11 above. The codes that aren't explicitly defined are derived using chr and ord in init_x11_keymap(). These will also not map to the kbd codes you found, because as I explained above, the codes we currently send are keysyms not keycodes. A keysym (broadly speaking) maps to a character - 0x61 is the keysym for "the character a". A keycode maps to a physical key on a keyboard - 30 is the keycode for "the key on a keyboard which has the character "a" printed on it if it's a US keyboard, but the character "q" if it's a French keyboard". The codes you find in those kbd layouts are keycodes, because those layouts are what the kernel uses to map keypresses to characters at the console. The 'us' layout says "keycode 30 is a". The 'fr' layout says "keycode 30 is q".

You do have to keep in mind the whole chain of what's happening here.

Remember that ultimately we're talking to a VM, here. What qemu must ultimately do when we tell it to "press a key" is have the VM's virtual keyboard emit a keypress, which the guest OS's kernel will read as a keycode. The guest OS will then decide based on the layout configuration it's currently using what character to map that keycode to. Given the mechanisms at play here, it's not actually possible to tell qemu "just make this exact character appear on the VM's screen". We're always working at a more indirect level than that.

What happens with our current mechanism is we tell qemu "press the A key". qemu then uses an internal US keyboard mapping to decide what physical key it should tell the VM was just pressed - i.e. it converts the keysym we sent it, 0x61, to the keycode that would give that keysym on a US keyboard, 30. The guest OS sees a press of the physical key with code 30, and converts that to whatever character it is mapped to currently at the guest OS level; if the guest OS is using a US keyboard layout, it'll print an "a", if the guest OS is using a French keyboard layout, it'll print a "q".

So effectively speaking, what we're really telling qemu when we say "press the [something] key" is "press the key that's labelled [something] on a US keyboard".
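The chain can be written down as two lookups. This is a toy model covering only the a and q keys, not qemu's actual code; the second qemu table shows what a French keysym-to-keycode mapping (as selected with -k) would look like in the same model.

```perl
use strict;
use warnings;

# Step 1 (qemu): keysym -> keycode, using qemu's own mapping (US by default).
my %keycode_for_keysym = (
    us => {ord('a') => 30, ord('q') => 16},    # a -> keycode 30, q -> keycode 16
    fr => {ord('a') => 16, ord('q') => 30},    # on a French keyboard a and q swap places
);

# Step 2 (guest OS): keycode -> character, using the guest's current layout.
my %char_for_keycode = (
    us => {30 => 'a', 16 => 'q'},
    fr => {30 => 'q', 16 => 'a'},
);

sub guest_output {
    my ($char, $qemu_layout, $guest_layout) = @_;
    my $keycode = $keycode_for_keysym{$qemu_layout}{ord($char)};
    return $char_for_keycode{$guest_layout}{$keycode};
}

for my $qemu (qw(us fr)) {
    for my $guest (qw(us fr)) {
        printf "qemu=%s guest=%s: send 'a' -> guest prints '%s'\n",
            $qemu, $guest, guest_output('a', $qemu, $guest);
    }
}
```

The output is only "a" when both sides agree on the layout, which is the whole problem in four lines.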

We do have the option of sending keycodes instead of keysyms to qemu, if we want. We can directly send the keycode 30 instead of the keysym 0x61. But if you think it through - at least, for me - it's not clear this would help us much.

As I figure it, we'd then have to keep code on the os-autoinst side to map from characters back to scancodes for every keyboard layout we want to support, and we'd have to have a mechanism for the test to keep track of what keyboard layout configuration it thinks the guest is currently using, so we'd know what layout to send the keycodes for when the test code says "press the a key". That seems like a lot of work that can all go wrong at some point. It's difficult with the current approach too, but it's not clear to me at least that changing from sending keysyms to keycodes would make it any less difficult. I think it's just inherently a hard problem.

Sending keycodes instead of keysyms is obviously better if the agent using qemu is a real person typing on a real keyboard, because it avoids two whole layers of translation that can go wrong and just passes the code for the physical key the person pressed straight through to the guest OS to interpret. But in os-autoinst, there isn't a human typing on a physical keyboard, there's just test code that says "type this character". Which is unfortunately a rather complex operation when you unpick it.

@AdamWill
Contributor

AdamWill commented Mar 11, 2022

Oh, forgot to mention - what happens if you pass qemu the -k argument (which is what happens if we set VNCKB) is that it tells qemu to use a different mapping for the keysym to keycode conversion step. So if we set VNCKB to 'fr', when we tell qemu "press the a key", it will use a French mapping to decide the keycode to send, and it will send keycode 16 (not 30 like it would with the default US mapping). So if the guest OS is using a French layout, it will print the character 'a'; if the guest OS is using a US layout, it will print the character 'q'.

As I said, the problem I had with using this mechanism is that in practice, when we're testing non-US languages and layouts, the VM still has a US keyboard layout loaded until we actually reach the installer and change it to the native layout. If you set VNCKB to 'fr' but find you need to type something before the guest OS reaches the installer (or desktop or whatever) and loads a French layout, you still have a mismatch and have to "type things wrong" until you manage to load the desired layout.

So if you need to edit kernel parameters in grub before the installer starts, or interact with the installer image's boot menu, or anything like that, you still have a problem.

@AdamWill
Contributor

AdamWill commented Mar 11, 2022

Another way to think about this mess:

The interface we want for test code is "specify a string to send". We want test writers to be able to say "type the string 'banana'", not "send this string of obscure physical keycodes". But we cannot change the fact that what ultimately gets generated in the VM under test is a keycode for its guest OS to convert into a character.

So, the "map this string of characters to a string of keycodes" step must happen. We only have the choice of letting qemu do it, or doing it ourselves. Whether we do it, or qemu does it, there's a tricky problem that needs solving: keeping track of what layout is currently configured in the guest, and making sure the string we send gets mapped to the keycodes that produce the desired output string with that layout.

Currently the way we're mostly doing this, I think, is by "typing things wrong": when we want to test a non-US layout, we just use 'wrong' strings that we know cause the correct keycodes to be emitted. So if we know the guest OS has a French layout loaded, we know to tell qemu to hit the "q" key when we want the guest OS to get an "a": we use type_string 'bqnqnq'; and it works out.

AFAICS, there are theoretically two other choices there:

  1. It would be interesting if there was a qmp command to change the mapping qemu uses for keysym-to-keycode conversions on the fly, but I don't think there is. So I don't think that's actually a choice.

  2. We could switch to sending keycodes to qemu, and implement the character->keycode conversion in os-autoinst. We could then implement on-the-fly changing of the mapping used to convert characters, so tests could change that mapping whenever they changed the layout configured in the guest OS. That would, theoretically, allow us to always use 'correct' strings in the test code, if we were careful about keeping the mapping used in the guest and the os-autoinst conversion mapping in sync. But it strikes me as quite an undertaking for probably a somewhat limited return. It would look something like this, I guess:

    # we start with defaults on os-autoinst and VM side, so both are assuming 'us'
    # we're at a console in the VM
    type_string "banana\n";
    # we type 'banana' into the console (and bash says command not found of course)
    assert_script_run 'loadkeys fr';
    # we just loaded a French layout into the VM, but os-autoinst is still using a US mapping
    type_string "banana\n";
    # this time we typed 'bqnqnq'
    testapi::use_layout('fr');
    # now we told os-autoinst to use FR instead of US mapping
    type_string "banana\n";
    # this time we typed 'banana'! success!
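The os-autoinst side of option 2 might be sketched like this. Everything here is hypothetical: use_layout() and keycodes_for() do not exist anywhere, and the tables contain only the four keys needed for 'banana'. The tables are written in the kbd direction (keycode to character) and inverted on demand.

```perl
use strict;
use warnings;

# kbd-style tables: keycode -> character the guest prints (four keys only).
my %layouts = (
    us => {16 => 'q', 30 => 'a', 48 => 'b', 49 => 'n'},
    fr => {16 => 'a', 30 => 'q', 48 => 'b', 49 => 'n'},
);

my $current = 'us';

# Hypothetical counterpart of the use_layout() call in the example above.
sub use_layout {
    my ($name) = @_;
    die "unknown layout '$name'\n" unless $layouts{$name};
    $current = $name;
    return;
}

# Invert the current table (character -> keycode) and translate a string.
sub keycodes_for {
    my ($string) = @_;
    my %keycode_for = reverse %{$layouts{$current}};
    my @codes;
    for my $char (split //, $string) {
        die "no key for '$char' in layout '$current'\n" unless exists $keycode_for{$char};
        push @codes, $keycode_for{$char};
    }
    return @codes;
}

print join(' ', keycodes_for('banana')), "\n";    # US mapping: 48 30 49 30 49 30
use_layout('fr');
print join(' ', keycodes_for('banana')), "\n";    # French mapping: 48 16 49 16 49 16
```

The hard part is not this lookup but keeping `$current` in step with whatever the guest really has loaded.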

@Martchus
Contributor

Thanks for the lengthy but enlightening comments. I suppose the first step is to check whether there's a qmp command. If we needed to do the mapping on our own we could likely make use of existing keymappings (although we'll have to use them the other way around). I'm wondering whether it is worth it considering that testing another layout is usually not done very extensively and "typing it wrong" was sufficient so far.

@AdamWill
Contributor

Yeah, that's kinda broadly my take. Plus I feel like the way we actually implement os-autoinst tests - with lots of reuse of modules in different contexts - it'd actually be quite tricky to do the "keep the guest layout and os-autoinst mapping in sync" part in practice, that feels like it'd get a bit bear trap-y.

The current options (VNCKB and "typing things wrong") do still have the issue @phil-hands noted at the start, I guess, when a character is input using a modifier key in the US layout but not in the guest OS layout, or vice versa. I think the only way you could deal with that under the current regime would be to bypass map_and_send_key(), which means you'd have to use send_key_event_up() and send_key_event_down() directly. I don't know if that's even possible (whether you can import those subs from test code) and even if you can it's probably a bad idea (for a start your test code isn't independent of the backend any more).

@AdamWill
Contributor

AdamWill commented Mar 11, 2022

well...hmm...I suppose it should be workaroundable, actually. If you want to type a # with the guest set to a UK layout, you just need to figure out what key on a US keyboard sends the keycode for the # key on a UK keyboard, and type the unshifted character on that key. If you want to type whatever shift-3 gets you on a UK keyboard, you can just do send_key '#';. So...um...unless I'm missing anything, there should always be a "typing it wrong" solution. I guess the one exception might be whatever the "extra" key on a 102- or 105-key keyboard is vs. a 101- or 104-key keyboard. US keyboards are 101/104 keys, so there's actually an "extra" keycode on a UK or French keyboard which a US keyboard just physically does not have. I don't know offhand if there's a way to make qemu send that keycode when it's using the US keysym->keycode mapping. I'll look into it.

so basically the challenge is: set the guest OS to the UK keyboard layout, then type a \ or a |. I believe the 'extra' key on a 102-key keyboard, as far as scan/keycodes are concerned, is the one to the right of the left shift key, which on a UK keyboard types \ unshifted and | shifted. The key to the left of 'enter' on a 102-key keyboard is the "same key" (at least once you get up to the keycode level, at the scancode level it is, uh, complicated) as the key above 'enter' on a 101-key keyboard.

The question is whether there's a keysym you can send qemu when it's using its "us" conversion mode which will produce the keycode for that "extra" key (referred to as "INT1", it seems). I think that keycode is 86 - it's defined in /usr/include/linux/input-event-codes.h as #define KEY_102ND 86.

It looks like qemu's en-us mapping does map that keycode, although AFAIK there is no official standard definition of what that key "should be" on a US keyboard. qemu decides it would be a key that types a "less than" character unshifted, a "greater than" character shifted, a "bar" (I think that's a pipe) with altgr, and a "broken bar" (that's ¦) with shift and altgr. I think that's similar to what that key does on most European layouts. (This is all starting to seem familiar for some reason. I feel like I've dealt with a bug in qemu that involved this same key before!)

So. I think the solution to the challenge may be to send the keysym 0x03c, which is the keysym for "less". I suspect qemu might convert an instruction to "press the key labelled less on a US keyboard" to a press of this key that doesn't actually exist on any normal physical US keyboard. (In case you're wondering, when a test instructs us to actually type the < character, what we do is send the keysym for "shift" and the keysym for "comma", because shift-comma is how you type a < on a US keyboard).

This is fun! I'm gonna test it out.

@phil-hands
Contributor Author

Sorry about the long pause getting back to this.

I just tried a test where a UK keyboard was selected when installing the SUT, and VNCKB is not set.

I then had it type_string the following:

\\ hash #  tilde ~  s-quote '  d-quotes \"  at @   pipe |  \n
\\ backslash \\  grave `  lt <   gt >  \n

which gives the following results:
[image: screenshot of the output typed into the SUT]
So far I've not discovered a way of "typing wrong" that produces either a backslash or a pipe.

Without pipe, I guess the solution (which I already use) is to tell the SUT that it should switch to a US mapping before doing anything complicated in a script. I was hoping that there would be a way of getting type_string to do the right thing regardless of the currently active mapping, but it looks like the benefit isn't worth the effort (particularly if one has to hand-code things for each different keyboard setting).

BTW adding any of broken-bar ¦, sterling £ or corner-thing ¬ to that results in it throwing an error when trying to type the relevant character:

Reason: backend died: No map for '�' at /usr/lib/os-autoinst/consoles/VNC.pm line 740. 

@phil-hands
Contributor Author

BTW one aspect of this that bites when one is using a UK mapping on the SUT is that upload_logs() breaks, because it builds a command to run starting from:

my $cmd      = "curl --form upload=\@$file --form upname=$upname ";

but typing that @ into the SUT mangles it into ", which curl then fails to deal with, unsurprisingly.

I briefly wondered if one could use something like \100 or $(echo A | tr 9-B 8-A) or even $(awk 'BEGIN {printf "%c", 64}') to generate an @ in the target shell, but each of those has one of \ | or " in them, so are no good.
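A quick way to check such candidates before trying them on the SUT is sketched below. The set of "unsafe" characters is just the three mentioned above; the helper name is made up.

```perl
use strict;
use warnings;

# Characters that get mangled or cannot be typed with a UK guest layout.
my $unsafe = qr/[\\|"]/;

my @candidates = (
    '\100',
    '$(echo A | tr 9-B 8-A)',
    q{$(awk 'BEGIN {printf "%c", 64}')},
);

# Return the distinct unsafe characters found in a candidate, in order of appearance.
sub unsafe_chars {
    my ($candidate) = @_;
    my %seen;
    return grep { !$seen{$_}++ } $candidate =~ /($unsafe)/g;
}

for my $candidate (@candidates) {
    my @hits = unsafe_chars($candidate);
    printf "%-35s uses: %s\n", $candidate, @hits ? join(' ', @hits) : 'nothing unsafe';
}
```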

@stale

stale bot commented Aug 11, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Aug 11, 2022
@stale

stale bot commented Sep 20, 2022

This Pull Request has been automatically closed as it did not have any activity in the last 97 days. Thank you.

@stale stale bot closed this as completed Sep 20, 2022
@okurz
Member

okurz commented Sep 23, 2022

Reopening as likely still valid. Also I created https://progress.opensuse.org/issues/117136 to prevent the stale bot from closing issues which are still valid.

@okurz okurz reopened this Sep 23, 2022
@stale stale bot removed the stale label Sep 23, 2022
@AdamWill
Contributor

yeah, this is still an interesting topic, but it's just not practically speaking very important so I can't justify getting back to it until I have nothing more important to work on :| don't know about phil.

@phil-hands
Contributor Author

Thanks for that.

However, I think being put in a position where I felt like I ought to put some more effort into this, and come up with something constructive to say before reopening it myself was probably quite motivating ... I've just not got round to actually applying that motivation, yet :-)

@stale

stale bot commented Dec 23, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Dec 23, 2022
@wasoeki

wasoeki commented Feb 12, 2024

Hi, I was dealing with some of the problems explained here, and first I have to say that this issue helped me a lot to understand what I was dealing with, so thanks. As for my skills in this area: I am learning Perl for the first time through openQA, so excuse me in advance if I am wrong about certain things. But I would like to contribute and maybe help improve the way keyboard layouts are handled, from openQA's type_string function down to the qemu virtual keyboard output on a VM display. Here are some insights into what we had to do in my team:

We test a custom preseeded Debian ISO which uses the French language and a French keyboard layout from the very beginning of the installation. All inputs in our tests obviously need to take this layout into account. We thought that setting VNCKB=fr would be enough, but several things got in the way:

The current state of openQA is that no layouts other than en-us are supported, as stated in the die handler in VNC.pm:

sub die_on_invalid_mapping ($key) {
    die decode_utf8 "No map for '$key' - layouts other than en-us are not supported\n";
}

This explained why many of the characters we tried to type didn't match, even with VNCKB=fr set. Most of the wrong characters were on the top row of the keyboard, where the digits and the special characters reached with shift are. So we had to edit this library to add support for the fr layout:

/usr/lib/os-autoinst/consoles/VNC.pm

As said, even with VNCKB=fr (which tells qemu to use a virtual keyboard with a French layout), we ran into unexpected issues:

VNC.pm includes a function called shift_keys which only handles the en-us layout, so characters come out wrong with any other keyboard layout. The errors seem to occur in a strange manner until you realise that it is a hard-coded list of key/value pairs 😃. So we edited it as follows:

sub shift_keys () {
    # see http://en.wikipedia.org/wiki/IBM_PC_keyboard
    # see https://www.tcl.tk/man/tcl8.4/TkCmd/keysyms.html
    # see https://www.ascii-code.com/fr
    # see https://doc.ubuntu-fr.org/tutoriel/comprendre_la_configuration_du_clavier

    return {
        '1' => '&',
        '2' => chr(233),    # é gets mis-encoded, so we refer to it as chr(233)
        '3' => '"',
        '4' => '\'',
        '5' => '(',
        '6' => '-',
        '7' => chr(232),    # è gets mis-encoded, so we refer to it as chr(232)
        '8' => '_',
        '9' => chr(231),    # ç gets mis-encoded, so we refer to it as chr(231)
        '0' => chr(224),    # à gets mis-encoded, so we refer to it as chr(224)
        #'°' => ')',
        chr(176) => ')',    # ° is sometimes mis-encoded, so we refer to it as chr(176)
        '+' => '=',

        # second line
        #'"' => '^', # diaeresis is buggy because it clashes with the double quote
        #'£' => '$',
        chr(163) => '$',    # £ is sometimes mis-encoded, so we refer to it as chr(163)

        # third line
        '%' => chr(249),    # ù gets mis-encoded, so we refer to it as chr(249)
        #'µ' => '*',
        chr(181) => '*',    # µ is sometimes mis-encoded, so we refer to it as chr(181)

        # fourth line
        '?' => ',',
        '.' => ';',
        '/' => ':',
        #'§' => '!',
        chr(167) => '!',    # § is sometimes mis-encoded, so we refer to it as chr(167)
        '>' => '<',
    };
}

⚠️ Note that shift_keys is only ever used for its keys, never for its values, which seems strange (see the code below). This caused problems in step 5 below.

for my $key (keys %{shift_keys()}) {
    die_on_invalid_mapping($key) unless $keymap{$key};
    $keymap{$key} = [$keymap{shift}, $keymap{$key}];
}

while it could be

my %shiftkeys = %{shift_keys()};
for my $key (keys %shiftkeys) {
    die_on_invalid_mapping($key) unless $keymap{$key};
    bmwqemu::diag "[Keytab Shift Keys] $key : [$keymap{shift},$keymap{$shiftkeys{$key}}]";
    $keymap{$key} = [$keymap{shift}, $keymap{$shiftkeys{$key}}];
}

making more sense to me...
Otherwise it leads to sending keysyms that qemu sometimes doesn't know (for example, there is no scancode for the # keysym, 35).
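The difference between the two loops above can be seen with a cut-down keymap. This is a stand-alone sketch: the shift_keys() stub lists only two pairs, which I assume match the en-us table, and the helper names are made up.

```perl
use strict;
use warnings;

# Reduced stand-in for VNC.pm's shift_keys(); assumed en-us pairs only.
sub shift_keys { return {'!' => '1', '<' => ','} }

# Reduced stand-in for the keymap: shift plus one keysym per character.
my %keymap = (shift => 0xffe1);
$keymap{$_} = ord($_) for ('!', '1', '<', ',');

# Existing loop: shift + the keysym of the shifted character itself.
sub shifted_entry_current {
    my ($key) = @_;
    return [$keymap{shift}, $keymap{$key}];
}

# Suggested loop: shift + the keysym of the unshifted character on the same key.
sub shifted_entry_proposed {
    my ($key) = @_;
    return [$keymap{shift}, $keymap{shift_keys()->{$key}}];
}

for my $key (sort keys %{shift_keys()}) {
    printf "%s  current: shift+0x%02x  proposed: shift+0x%02x\n",
        $key, shifted_entry_current($key)->[1], shifted_entry_proposed($key)->[1];
}
```

With the suggested variant the values of the hash matter, so a per-layout table like the French one above starts to have an effect.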

There aren't always matches between ASCII/UTF8 and keysym/keycode so it brings difficulties, and here are the steps we came through :

  1. from type_string() to map_and_send_key(): the characters weren't correctly decoded, so we added decode_utf8.
  2. we added all the special keys we needed to keymap_x11 (specifically to handle the altgr key):
my $keymap_x11 = {
    'esc' => 0xff1b,
    'down' => 0xff54,
    'right' => 0xff53,
    'up' => 0xff52,
    'left' => 0xff51,
    'equal' => ord('='),
    'spc' => ord(' '),
    'minus' => ord('-'),
    'shift' => 0xffe1,
    'ctrl' => 0xffe3,    # left, right is e4
    'ctrlright' => 0xffe4,    
    'caps' => 0xffe5,
    'meta' => 0xffe7,    # left, right is e8
    'metaright' => 0xffe8,    
    'alt' => 0xffe9,    # left one, right is ea
    'altgr' => 0xffea, 
    'ret' => 0xff0d,
    'tab' => 0xff09,
    'backspace' => 0xff08,
    'end' => 0xff57,
    'delete' => 0xffff,
    'home' => 0xff50,
    'insert' => 0xff63,
    'pgup' => 0xff55,
    'pgdn' => 0xff56,
    'sysrq' => 0xff15,
    'super' => 0xffeb,    # left, right is ec
    'superright' => 0xffec, 
};
  3. When building the keymap, we had to make sure the characters in the key table use the same encoding as the input. Otherwise 'é' came through as two characters, something like 'Ã©'. So instead of writing special characters literally, we used the chr function, which refers to them by their keysym. E.g. 'é' = chr(233), where 233 is the keysym of the character 'é'. The reverse function is ord: ord('é') = 233.
  4. We also created a function called replace_special_char() that has to be called inside a type_string function, for example, to automatically replace an 'é' with chr(233). That means yet another hard-coded matching table.
  5. As I understand it, we send qemu the keysym of each character we want to type on its virtual keyboard, and qemu translates these keysyms to keycodes. ⚠️ I don't know why, but not all keysyms are recognized by qemu... So we needed another matching table which, for each unrecognized character, gives another character that sits on the same key on a French keyboard and has a keysym known to qemu. E.g. to get 'é', we ask qemu to press the key where the character 2 sits. To get 2, we ask for the same key combined with the shift key.
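
Step 5 can be pictured with a small sketch like the one below. It is only an illustration under assumptions: the %same_key table and keysyms_for() are hypothetical names, the table covers just the French "2" key, and it assumes the SUT uses the French layout while qemu translates keysyms as if for a US keyboard.

```perl
use strict;
use warnings;

my $SHIFT = 0xffe1;    # X11 keysym: left shift
my $ALTGR = 0xffea;    # X11 keysym: right alt, used as altgr in keymap_x11 above

# Hypothetical excerpt for the French "2" key: wanted character =>
# [character whose keysym qemu accepts for that key, modifier keysym or undef]
my %same_key = (
    chr(233) => ['2', undef],     # é: plain press of the key
    '2'      => ['2', $SHIFT],    # 2: shift + same key
    '~'      => ['2', $ALTGR],    # ~: altgr + same key
);

# Returns the list of keysyms to send for one character
sub keysyms_for {
    my ($char) = @_;
    my $entry = $same_key{$char} or return (ord $char);    # not listed: send as is
    my ($base, $modifier) = @$entry;
    return defined $modifier ? ($modifier, ord $base) : (ord $base);
}

printf "%s\n", join ' + ', map { sprintf '0x%x', $_ } keysyms_for('2');
```

All three characters end up on the keysym of 2; only the modifier in front of it changes.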

@stale stale bot removed the stale label Feb 12, 2024
@wasoeki

wasoeki commented Feb 12, 2024

Finally, here is the rest of the code that works for us now:

sub special_keys () {
    # see https://www.tcl.tk/man/tcl8.4/TkCmd/keysyms.html
    # see https://www.ascii-code.com/fr
    # see https://doc.ubuntu-fr.org/tutoriel/comprendre_la_configuration_du_clavier
    # List of mis-encoded characters (ASCII vs UTF-8): °çéèà§µù || badly handled by qemu (altgr)
    return {
        chr(233) => '2',    # ord('é')=233 # é is mis-encoded, so we use the character 2, which is on the same key
        '~' => '2',    # the ~ keysym has no scancode in qemu, so we use the character 2, which is on the same key
        '#' => '3',    # '#' keysym = 35: no scancode in qemu
        '{' => '\'',    # '{' keysym: no scancode in qemu
        '[' => '(',    # '[' keysym: no scancode in qemu
        '|' => '-',    # '|' keysym: no scancode in qemu
        '`' => '7',    # the ` keysym has no scancode in qemu, so we use the character 7, which is on the same key
        chr(232) => '7',    # ord('è')=232 # è is mis-encoded, so we use the character 7, which is on the same key
        '\\' => '_',    # '\' keysym: no scancode in qemu
        chr(231) => '9',    # ord('ç')=231 # ç is mis-encoded, so we use the character 9, which is on the same key
        '^' => '9',    # ^ is mis-encoded, so we use the character 9, which is on the same key
        chr(224) => '0',    # ord('à')=224 # à is mis-encoded, so we use the character 0, which is on the same key
        '@' => '0',    # the @ keysym has no scancode in qemu, so we use the character 0, which is on the same key
        chr(176) => ')',    # ord('°')=176 # '°' is mis-encoded, so we refer to it as chr(176) and use the character ')', which is on the same key
        ']' => ')',    # ']' keysym: no scancode in qemu
        '}' => '=',    # '}' keysym: no scancode in qemu
        chr(249) => '%',    # ord('ù')=249 # ù is mis-encoded, so we use the character %, which is on the same key
        chr(181) => '*',    # ord('µ')=181 # µ is mis-encoded, so we use the character *, which is on the same key
        chr(167) => '!',    # ord('§')=167 # § is mis-encoded, so we use the character !, which is on the same key
    };
}

sub altgr_keys () {
    # see http://en.wikipedia.org/wiki/IBM_PC_keyboard
    # see https://www.tcl.tk/man/tcl8.4/TkCmd/keysyms.html
    # see https://www.ascii-code.com/fr
    # see https://doc.ubuntu-fr.org/tutoriel/comprendre_la_configuration_du_clavier
    return {
        '~' => chr(233),    # é is mis-encoded, so we refer to it as chr(233)
        '#' => '"',
        '{' => '\'',
        '[' => '(',
        '|' => '-',
        '`' => chr(232),    # è is mis-encoded, so we refer to it as chr(232)
        '\\' => '_',
        '^' => chr(231),    # ç is mis-encoded, so we refer to it as chr(231)
        '@' => chr(224),    # à is mis-encoded, so we refer to it as chr(224)
        ']' => ')',
        '}' => '=',
        #chr(164) => '$',    # ¤ commented out because it shows up as ê instead... # ¤ is mis-encoded, so we refer to it as chr(164)
        #chr(183) => ':',    # · seems unsupported or only reachable through another combination (shift+altgr+k) # · is mis-encoded, so we refer to it as chr(183)
        chr(128) => 'e',    # € is mis-encoded, so we refer to it as chr(128)
    };
}


sub init_x11_keymap ($self) {
    return if $self->keymap;
    # create a deep copy - we want to reuse it in other instances
    my %keymap = %$keymap_x11;

    for my $key (30 .. 255) {
        $keymap{chr($key)} ||= $key;
    }
    for my $key (1 .. 12) {
        $keymap{"f$key"} = 0xffbd + $key;
    }
    for my $key ("a" .. "z") {
        $keymap{$key} = ord($key);
        bmwqemu::diag "[Keytab Shift Keys] $key : [$keymap{shift}, ".ord(uc $key)."]";
        # shift-H looks strange, but that's how VNC works
        $keymap{uc $key} = [$keymap{shift}, ord(uc $key)];
    }
    # VNC doesn't use the unshifted values, only prepends a shift key
    for my $key (keys %{shift_keys()}) {
        die_on_invalid_mapping($key) unless $keymap{$key};
        bmwqemu::diag "[Keytab Shift Keys] $key : [$keymap{shift},$keymap{$key}]";
        $keymap{$key} = [$keymap{shift}, $keymap{$key}];
    }
    my %altgrkeys = %{altgr_keys()};
    for my $key (keys (%altgrkeys)) {
        die_on_invalid_mapping($key) unless $keymap{$key};
        bmwqemu::diag "[Keytab Altgr Keys] $key : [$keymap{altgr},$keymap{$altgrkeys{$key}}]";
        $keymap{$key} = [$keymap{altgr}, $keymap{$altgrkeys{$key}}];
    }
    my %specialkeys = %{special_keys()};
    for my $key (keys (%specialkeys)) {
        die_on_invalid_mapping($key) unless $keymap{$key};
        if (ref($keymap{$key}) eq 'ARRAY') {
            if (ref($keymap{$specialkeys{$key}}) eq 'ARRAY') {
                $keymap{$key}[-1] = $keymap{$specialkeys{$key}}[-1];
            }
            else {
                $keymap{$key}[-1] = $keymap{$specialkeys{$key}};
            }
            bmwqemu::diag "[Keytab Special Keys] $key : @{$keymap{$key}}";
        }
        else {
            if (ref($keymap{$specialkeys{$key}}) eq 'ARRAY') {
                $keymap{$key} = $keymap{$specialkeys{$key}}[-1];
            }
            else {
                $keymap{$key} = $keymap{$specialkeys{$key}};
            }
            bmwqemu::diag "[Keytab Special Keys] $key : $keymap{$key}";
        }
    }
    $self->keymap(\%keymap);
    foreach my $k (keys(%keymap)) {
        bmwqemu::diag "[Keytab] Key=Char=$k Val=Keysym=$keymap{$k}";
    }

}

sub replace_special_char {
    my $chaine = shift;

    # 233: keysym of the character é
    my $needle = chr(233);
    $chaine =~ s/é/${needle}/g;

    # 232: keysym of the character è
    $needle = chr(232);
    $chaine =~ s/è/${needle}/g;

    # 231: keysym of the character ç
    $needle = chr(231);
    $chaine =~ s/ç/${needle}/g;

    # 224: keysym of the character à
    $needle = chr(224);
    $chaine =~ s/à/${needle}/g;

    # 176: keysym of the character °
    $needle = chr(176);
    $chaine =~ s/°/${needle}/g;

    # 163: keysym of the character £
    $needle = chr(163);
    $chaine =~ s/£/${needle}/g;

    # 249: keysym of the character ù
    $needle = chr(249);
    $chaine =~ s/ù/${needle}/g;

    # 181: keysym of the character µ
    $needle = chr(181);
    $chaine =~ s/µ/${needle}/g;

    # 167: keysym of the character §
    $needle = chr(167);
    $chaine =~ s/§/${needle}/g;

    # 128: keysym of the character €
    $needle = chr(128);
    $chaine =~ s/€/${needle}/g;

    return $chaine;
}
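
For comparison, a table-driven variant might look like the sketch below. It assumes the string reaches the function as raw UTF-8 bytes; replace_special_char_sketch() and %remap are hypothetical names, not part of os-autoinst. After decoding, the accented Latin-1 characters already carry the code points used in the tables above, so only the euro sign needs an explicit entry (mapped to chr(128) to mirror the function above).

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);

# Characters that need a value other than their own code point after decoding.
# Only the euro sign (U+20AC) is listed; chr(128) mirrors the function above.
my %remap = ("\x{20AC}" => chr(128));

# Hypothetical alternative to replace_special_char(): takes raw UTF-8 bytes
sub replace_special_char_sketch {
    my ($bytes) = @_;
    my $text = decode_utf8($bytes);
    # Replace listed characters, leave everything else untouched
    $text =~ s/(.)/exists $remap{$1} ? $remap{$1} : $1/gse;
    return $text;
}

# "café 5€" given as UTF-8 bytes
my $typed = replace_special_char_sketch("caf\xC3\xA9 5\xE2\x82\xAC");
print join(' ', map { ord } split //, $typed), "\n";
```

Supporting another character then means adding one line to %remap instead of another substitution block.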

@Martchus
Contributor

But I would like to contribute and help maybe improving the way keyboard layouts are implemented…

Thanks - that's generally appreciated. Since you already have code to share it would make sense to create a PR then. Even if it isn't production-ready that helps because on a PR we can comment/discuss on code more easily (and see what has actually changed in your version).

I suppose what you're writing makes sense but I haven't looked into that topic for a while (as it is not a priority for us) so I would have to have a closer look later to give you more feedback.

@AdamWill
Contributor

AdamWill commented Feb 12, 2024

Same as @Martchus, but broadly I think if we want to make any extensive changes here to try and fix this "properly", it is more or less a prerequisite to implement qemu's keycode extension to VNC. It is just too hard/weird/impossible to do this properly by sending keysyms over the wire. There are just too many levels of translation going on in that case.

By implementing that extension we can send qemu physical scan codes instead of keysyms that it then translates to scan codes. Well, technically you're required to send both, but I think qemu ignores the keysym, it's just there to make the protocol happy. It reduces our problem to, more or less, "we know the character(s) we want to produce, and the layout our test has configured in the target system: what are the scan code(s) that represent the correct key(s) to produce the desired result?"
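
A rough sketch of what such a message could look like on the wire, assuming the layout given in the community RFB protocol description of the "QEMU Extended Key Event" message (message type 255, submessage type 0, then a 16-bit down flag, a 32-bit keysym and a 32-bit XT keycode, all big-endian). The function name is made up, and the negotiation that has to happen before the server accepts these messages is left out.

```perl
use strict;
use warnings;

# Packs one extended key event:
# U8 message-type (255), U8 submessage-type (0), U16 down-flag, U32 keysym, U32 keycode
sub pack_qemu_ext_key_event {
    my ($down, $keysym, $keycode) = @_;
    return pack('CCnNN', 255, 0, $down ? 1 : 0, $keysym, $keycode);
}

# Press and release of the physical key at XT scancode 0x15
# (labelled "y" on a US keyboard, "z" on a German one)
my $press   = pack_qemu_ext_key_event(1, ord('y'), 0x15);
my $release = pack_qemu_ext_key_event(0, ord('y'), 0x15);

printf "press:   %s\n", unpack('H*', $press);
printf "release: %s\n", unpack('H*', $release);
```

The keysym field is still filled in here, but the key that actually gets pressed in the guest is determined by the scancode.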

@AdamWill
Contributor

If we do it that way, and I'm thinking this through right, I think all we really need is one big lookup table per layout supported, with some kinda handling of modified presses. The default would be for a US English layout, and we would map each typable character to the scancode or combination of scancodes required to produce that character on a US English keyboard. To support another layout, you just add a lookup table.

It would be nice to make it easy to switch between them on demand in the test code, I guess, so you can implement tests where the SUT is using different layouts at different times (e.g. defaulting to US layout during boot, but you want to switch to French later). And there's a question just how sophisticated do we make it - do we handle compose keys? But I think it's a clearer and simpler design ultimately.
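
As a sketch of that idea (nothing here exists in os-autoinst; %layouts, set_layout() and scancodes_for() are hypothetical names, and the tables hold only three characters each), per-layout lookup tables with on-demand switching could look like this. The values are XT set 1 scancodes, with a modified press written as modifier scancode followed by key scancode.

```perl
use strict;
use warnings;

my $LSHIFT = 0x2a;    # XT scancode of the left shift key

# Deliberately tiny tables: character => scancodes to press together
my %layouts = (
    'en-us' => {'y' => [0x15], 'z' => [0x2c], '#' => [$LSHIFT, 0x04]},
    'de'    => {'y' => [0x2c], 'z' => [0x15], '#' => [0x2b]},
    'en-gb' => {'y' => [0x15], 'z' => [0x2c], '#' => [0x2b]},
);

my $current_layout = 'en-us';    # default, e.g. during boot

# Called by the test once the SUT has been switched to another layout
sub set_layout {
    my ($name) = @_;
    die "unknown layout '$name'\n" unless exists $layouts{$name};
    $current_layout = $name;
    return;
}

# Returns the scancodes that produce $char under the current layout
sub scancodes_for {
    my ($char) = @_;
    my $codes = $layouts{$current_layout}{$char}
      or die "no mapping for '$char' in layout '$current_layout'\n";
    return @$codes;
}

for my $layout ('en-us', 'de') {
    set_layout($layout);
    printf "%-5s y -> %s\n", $layout, join ' ', map { sprintf '0x%02x', $_ } scancodes_for('y');
}
```

With tables like these, the z/y swap from the original report becomes a matter of picking the 'de' table, and # on a UK keyboard is a plain press of the key at 0x2b rather than shift plus 3.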

@AdamWill
Contributor

I suppose that does beg the question "do we use the VNC console for anything that's not qemu? And if so, do those other things support qemu's extension?"

@Martchus
Contributor

do we use the VNC console for anything that's not qemu?

Yes, we do. For instance the svirt backend is using it (and it can use multiple virtualization solutions, e.g. VMware, and will also connect to a local Xvnc server).

I don't think we need to support different keyboard mappings for all those backends, though. Of course we need to take care not to break any of those use cases (e.g. make additional mapping that might be QEMU-specific an opt-in).

Note that there's also still the $keymap_ikvm-mapping defined in VNC.pm but I'm not sure whether it is actually still used.

And if so, do those other things support qemu's extension?

I guess that would need to be looked up for each use of the VNC console specifically. I guess in general (and we probably want to keep the VNC console generic) the answer is no.

@AdamWill
Contributor

ah, fun. that does make it less easy to do a really 'clean' approach :(

i guess what we'd want to do is just always send keysyms that are appropriate for a US English layout, and send keycodes in the way discussed above for qemu, or something along those lines. it does seem like we'd wind up with lots of duplicate/parallel logic and lookup tables :( an alternative approach might be to subclass VNC.pm just for qemu, I guess, but that just sort of moves the complexity around a bit...ah, well.

@wasoeki

wasoeki commented Feb 14, 2024

Thanks - that's generally appreciated. Since you already have code to share it would make sense to create a PR then. Even if it isn't production-ready that helps because on a PR we can comment/discuss on code more easily (and see what has actually changed in your version).

I created a PR here: os-autoinst/os-autoinst#2457. The problem with this PR is that it removes support for the en-us keyboard layout, so it can't be considered as is.
Thanks for considering changes for the support of other layouts, and switching between them. I will try my best to help on that matter.
