A new Left Recursive PEG Parser Generator for rust

I’d like to introduce a new PEG parser generator for rust, called lrpeg. I’ve recently started working on this. This PEG parser generator allows left recursion which makes PEG grammars much more useful.

When you want to use a parser generator with rust, there are a few options. First of all, there is lalrpop, which uses LR(1) grammars. This means there can only be one token lookahead, and no backtracking. This can make it very hard to write some grammars; often you’ll be faced with the dreaded shift-reduce errors. The Solang Solidity Compiler uses an lalrpop grammar. The grammar contains all sorts of tricks to avoid shift-reduce errors, which make the grammar totally unreadable. I’ve read Parsing Techniques by Dick Grune just to fully grok what can be done with LR(1) grammars. LR(1) complexity makes it hard to reason about.

The other option you have is pest. pest uses PEG grammars, without left recursion. This means that if you want to write a simple calculator, you have to re-write rules in a non-left recursive way, in other words the left-most rule in a definition cannot be the rule itself. The rule expr

num = { ASCII_DIGIT+ }
op = _{ "+" | "-" | "*" | "/" }
expr = { expr ~ op ~ term | term }
term = _{ num }

is not permitted because expr has itself as the left-most rule. So you have to rewrite this as:

num = { ASCII_DIGIT+ }
op = _{ "+" | "-" | "*" | "/" }
expr = { term ~ (op ~ term)* }
term = _{ num }

This grammar is still not complete: it does not deal with operator precedence, i.e. 1+2*4 should be parsed as 1+(2*4), not (1+2)*4. The way pest deals with this is with the precedence climber. This is some code which modifies the parse tree after parsing to adjust the precedence of operators. It is difficult to follow, and the precedence is not expressed in the grammar itself.
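To make the idea of precedence climbing concrete, here is a small Python sketch. It is not pest's implementation; it only shows the general technique applied to the flat list of numbers and operators that a rule like term ~ (op ~ term)* produces. The function name climb and the OPS table are made up for this illustration.

```python
import operator

# Hypothetical operator table: symbol -> (precedence, function).
# Higher precedence binds tighter; all operators are left-associative.
OPS = {
    "+": (1, operator.add),
    "-": (1, operator.sub),
    "*": (2, operator.mul),
    "/": (2, operator.floordiv),  # integer division keeps the toy simple
}


def climb(tokens, min_prec=1):
    """Evaluate a flat [num, op, num, op, ...] list, consuming it from the front."""
    lhs = tokens.pop(0)
    # Keep folding operators as long as they bind at least as tightly as min_prec.
    while tokens and OPS[tokens[0]][0] >= min_prec:
        prec, fn = OPS[tokens.pop(0)]
        # The right-hand side may only absorb operators that bind tighter.
        rhs = climb(tokens, prec + 1)
        lhs = fn(lhs, rhs)
    return lhs


print(climb([1, "+", 2, "*", 4]))
print(climb([8, "-", 3, "-", 2]))
```

The point is that the precedence lives in the OPS table and in this function, not in the grammar.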

Another option, although not available in rust right now, is GLR grammars, which GNU Bison supports (ANTLR, by contrast, uses an LL-based algorithm). Essentially GLR is LR(1) with backtracking, and that makes for a powerful parser generator. However, implementing GLR is quite tricky, and GLR parsers and grammars are hard to understand, even more so than LR(1). So I think that pest has the right idea by using PEG grammars. However, it is possible to write the parser generator so that it allows left-recursive rules, as Python's new parser does. So this is why I wrote lrpeg. lrpeg can deal with left recursion (and indirect left recursion, where the left recursion happens via another rule). With the lrpeg parser generator, the grammar for a calculator is simply:

expr <- expr "+" term
/ expr "-" term
/ term;

term <- term "*" num
/ term "/" num
/ "(" expr ")"
/ num;

num <- r"[0-9]+";

Note that the precedence of * over + is specified by the ordering of the rules; expr will first recurse to term and try * before falling back to +. A more complete example with more precedence levels is the peg grammar for IRP.
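One well-known way to support left recursion in a memoizing PEG parser is "seed growing": the first time a left-recursive rule is tried at a position, the recursive call is made to fail, and the result is then re-parsed and extended until it stops getting longer. The Python sketch below applies that to the calculator grammar above and evaluates the expression as it goes. It is a toy for illustration only, not lrpeg's code; the class Calc, its methods and the evaluate function are names invented here, and it does not handle whitespace.

```python
import re

NUM = re.compile(r"[0-9]+")


class Calc:
    """Toy evaluator for the calculator grammar, using seed growing."""

    def __init__(self, text):
        self.text = text
        self.memo = {}  # (rule name, position) -> (value, end) or None

    def grow(self, name, body, pos):
        key = (name, pos)
        if key in self.memo:
            return self.memo[key]
        self.memo[key] = None  # seed: the recursive call fails at first
        best = None
        while True:
            result = body(pos)
            # Stop once another pass no longer consumes more input.
            if result is None or (best is not None and result[1] <= best[1]):
                return best
            best = self.memo[key] = result

    def lit(self, s, pos):
        return pos + len(s) if self.text.startswith(s, pos) else None

    def num(self, pos):
        m = NUM.match(self.text, pos)
        return (int(m.group()), m.end()) if m else None

    def binary(self, left, op, right, pos, fn):
        lhs = left(pos)
        if lhs is None:
            return None
        after_op = self.lit(op, lhs[1])
        if after_op is None:
            return None
        rhs = right(after_op)
        if rhs is None:
            return None
        return (fn(lhs[0], rhs[0]), rhs[1])

    def paren(self, pos):
        start = self.lit("(", pos)
        if start is None:
            return None
        inner = self.expr(start)
        if inner is None:
            return None
        end = self.lit(")", inner[1])
        return (inner[0], end) if end is not None else None

    def expr(self, pos):
        return self.grow("expr", self.expr_body, pos)

    def expr_body(self, pos):
        # expr <- expr "+" term / expr "-" term / term
        return (self.binary(self.expr, "+", self.term, pos, lambda a, b: a + b)
                or self.binary(self.expr, "-", self.term, pos, lambda a, b: a - b)
                or self.term(pos))

    def term(self, pos):
        return self.grow("term", self.term_body, pos)

    def term_body(self, pos):
        # term <- term "*" num / term "/" num / "(" expr ")" / num
        return (self.binary(self.term, "*", self.num, pos, lambda a, b: a * b)
                or self.binary(self.term, "/", self.num, pos, lambda a, b: a // b)
                or self.paren(pos)
                or self.num(pos))


def evaluate(text):
    """Return the value of text, or None if it does not parse completely."""
    result = Calc(text).expr(0)
    if result is None or result[1] != len(text):
        return None
    return result[0]


print(evaluate("1+2*4"))
print(evaluate("8-3-2"))
```

Because the rules are left-recursive, 8-3-2 comes out as (8-3)-2 without any extra code for associativity.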

The lrpeg parser generator is a new code base. There are outstanding issues: for example, it does not detect unreachable alternatives or infinite loops in the grammar (e.g. expr <- expr "a";). It is also not self-hosting yet; in other words, it does not parse the peg grammar with peg but still uses lalrpop for that. This is why rules must end with a “;”: LR(1) has only one token of lookahead, and that is not enough to determine the end of a rule. On the other hand, it is a fully functional parser generator which can parse IRP.

PEG grammars offer another advantage over GLR grammars which I’m keen on. The Solidity language has a huge list of keywords. Keywords exist in programming languages because GLR and LR(1) parsers have a separate tokenizer or lexer, which handles keywords as a special case. The lexer does not have the context to know if a keyword is used as a variable name, so the parser just gives up. For example, in Solidity int seconds = 1; is not permitted. With PEG grammars, we can do away with all these problems. The lexer and the parser are in one big definition, so we can match seconds as a keyword only where it is expected; in other cases, it can be a variable name. This would mean an end to most keywords. That would be nice.

How to run Hyper-V/Docker on Windows 10 in qemu/libvirt/kvm

Hyper-V on Windows requires nested virtualization, since it runs linux in a virtual machine. You need Hyper-V to run docker. Windows 10 does not allow virtualization when it is run in kvm. If you open the task manager, it does not even show “Virtualization: Not available”.

The trick is to make Windows 10 think it is running on real hardware, with virtualization enabled. So you need hypervisor off but vmx on; on the qemu command line this would be -hypervisor,+vmx added to the -cpu option.

First, download a time-limited Windows 10 VM from Microsoft. The VirtualBox and vmware downloads contain a .vmdk file, which can be converted to qcow2 using qemu-img. In this example the file downloaded was WinDev2009Eval.VMware.zip; this might be different for you.

$ unzip WinDev2009Eval.VMware.zip WinDev2009Eval-disk1.vmdk
Archive: WinDev2009Eval.VMware.zip
inflating: WinDev2009Eval-disk1.vmdk
$ qemu-img convert -f vmdk WinDev2009Eval-disk1.vmdk -O qcow2 WinDev2009Eval-disk1.qcow2

Second, in Virtual Manager, create a new virtual machine. You have to select Windows 10 and the WinDev2009Eval-disk1.qcow2 as a disk image (not as an installer iso).

To make Hyper-V work, you have to edit the xml which describes the cpu in Virtual Manager. First you have to enable xml editing in Edit -> Preferences -> Enable XML editing. Then open your machine details, find the CPUs, go to xml and find the cpu element. This should read:

<cpu mode="custom" match="exact" check="partial">
<model fallback="allow">IvyBridge</model>
<feature policy="disable" name="hypervisor"/>
<feature policy="require" name="vmx"/>
</cpu>

Press apply to make it work and start your machine. Now you can install Docker in Windows 10.
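If you would rather script the edit than click through Virtual Manager, something like the following Python sketch can apply the same two feature settings to a libvirt domain XML document. It is only a sketch under assumptions: it expects the domain XML as a string (for example as exported with virsh dumpxml), the function name expose_vmx is made up here, and it leaves the cpu mode and model alone, so those still need to be set as shown above.

```python
import xml.etree.ElementTree as ET


def expose_vmx(domain_xml: str) -> str:
    """Return domain XML with hypervisor disabled and vmx required on the cpu."""
    root = ET.fromstring(domain_xml)
    cpu = root.find("cpu")
    if cpu is None:
        cpu = ET.SubElement(root, "cpu")

    wanted = {"hypervisor": "disable", "vmx": "require"}

    # Drop any existing entries for these two features so we do not duplicate them.
    for feature in cpu.findall("feature"):
        if feature.get("name") in wanted:
            cpu.remove(feature)

    for name, policy in wanted.items():
        ET.SubElement(cpu, "feature", {"policy": policy, "name": name})

    return ET.tostring(root, encoding="unicode")


example = '<domain><cpu mode="custom"><model fallback="allow">IvyBridge</model></cpu></domain>'
print(expose_vmx(example))
```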

docker in Windows 10 in KVM

The hypervisor hack was inspired by a similar hack for vmware.

Moving from lirc tools to rc-core tooling

The lirc project comes with a set of cli tools which have been used for years. rc-core has replacements for them. Here is a cheat-sheet to see how to use the rc-core tooling.

You’ll need the ir-ctl and ir-keytable tools. Both are in the v4l-utils package, except on debian and derivatives, where you will need the ir-keytable package as well as v4l-utils.

In a lirc world, you need to set up the lirc daemon to do almost anything. This is not true in the modern rc-core world; there is no daemon and the lirc daemon is not required. Do not install lirc-disable-kernel-rc.noarch on Fedora.


Simply run ir-ctl -r. From v4l-utils 1.18.0 onwards, this will give you a more concise output. If you do not like this run ir-ctl -r --mode2.


There is no equivalent of xmode2 other than running ir-ctl -r on the command line. Having said that, xmode2 is totally broken anyway.


With rc-core, there is no reason to map from IR events to X events. rc-core events are regular input events already.


You can send IR using:

  • ir-ctl -s file
  • ir-ctl -S protocol:scancode, e.g. ir-ctl -S rc5:0x1e01
  • ir-ctl -K KEY_VOLUMEUP -k /lib/udev/rc_keymaps/hauppauge.toml


Run ir-keytable -c -p all -t to see what your remote is sending. You will need to assemble the rc keymap by hand. There are various blog posts on this.


Since rc-core events are regular input events, a tool for executing commands on IR events should listen to regular linux input events. Shortcuts can be set up in your gnome environment, for example. Another example for headless execution is inputexec.


You can test your setup with ir-keytable -t.


Either use ir-keytable -t or evtest.

irpty, lirc-make-devinput

Since rc-core events are regular input events, this is not needed at all in the new order.

irpipe, irtext2udp, lirc-init-db

lirc specific tools which make no sense in rc-core.


ls /lib/udev/rc_keymaps/

irsimreceive, irsimsend

rc-core does not have an explicit simulated code path, however there is an rc loopback device which you can load using modprobe rc-loopback. You can use this to test IR decoding and sending.

Grant for Solang Solidity Compiler

Since the end of November, I’ve been working full time on a new Solidity compiler written in rust, called Solang. Back in March 2019, I decided to prototype an idea: write a new Solidity compiler from scratch using llvm and rust. The prototype worked fine and I managed to get it integrated with Hyperledger Burrow; I was working at Monax at that time.

The existing Solidity Smart Contracts compiler has a number of problems. It’s a huge inscrutable C++ code base, and the language has all sorts of weird limitations. By starting from scratch we can create a compiler which does not have these limitations. By using llvm, we can also generate more optimal code. Lastly, we can target different blockchains, not just ethereum.

I applied for a grant from the Web3 Foundation, and it was accepted. The grant covers language parity with the Ethereum Foundation Solidity compiler. So, over the next year, I’ll be working on Solang full time.

Support for keymap with raw IR and sending key from keymap

Kernel v5.3 will introduce a new BPF feature: loops. This makes it possible to decode IR based on raw IR. This means the keymap does not list a specific protocol or protocol decoder; for each key it simply lists the pulses and spaces that make up that key.

Here is an example based on the blaupunkt remote.

[[protocols]]
name = 'BLAUPUNKT'
protocol = 'raw'
[[protocols.raw]]
keycode = 'KEY_VIDEO_PREV'
raw = '+6875 -6850 +660'
[[protocols.raw]]
keycode = 'KEY_VIDEO_NEXT'
raw = '+8050 -8050 +660'
[[protocols.raw]]
keycode = 'KEY_UP'
raw = '+3850 -3850 +660'
[[protocols.raw]]
keycode = 'KEY_OK'
raw = '+7500 -7400 +660'
[[protocols.raw]]
keycode = 'KEY_DOWN'
raw = '+4450 -4400 +660'
[[protocols.raw]]
keycode = 'KEY_RIGHT'
raw = '+5660 -5630 +660'
[[protocols.raw]]
keycode = 'KEY_LEFT'
raw = '+6280 -6200 +660'
[[protocols.raw]]
keycode = 'KEY_VOLUMEUP'
raw = '+2650 -2580 +660'
[[protocols.raw]]
keycode = 'KEY_VOLUMEDOWN'
raw = '+3250 -3200 +660'
[[protocols.raw]]
keycode = 'KEY_MUTE'
raw = '+5050 -5000 +660'
[[protocols.raw]]
keycode = 'KEY_DOT'
raw = '+2050 -2000 +660'

So for each key, in the raw string the + denotes a pulse and the - a space. The + and - prefixes are actually optional, since the position in the string determines whether it is a pulse or a space. To load these keymaps, you’ll need ir-keytable built from git and kernel v5.3 or later.
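As an illustration of how such a raw string can be read, here is a small Python sketch. It is not the parser ir-keytable uses; the function name parse_raw is made up, and rejecting a sign that contradicts the position is a choice made for this example.

```python
def parse_raw(raw: str):
    """Split a raw string into a list of ('pulse' | 'space', duration) pairs."""
    result = []
    for index, field in enumerate(raw.split()):
        # Even positions are pulses, odd positions are spaces.
        kind = "pulse" if index % 2 == 0 else "space"
        expected_sign = "+" if kind == "pulse" else "-"
        if field[0] in "+-":
            # The prefix is optional, but if present it must agree with the position.
            if field[0] != expected_sign:
                raise ValueError(f"field {index} should be a {kind}: {field!r}")
            field = field[1:]
        result.append((kind, int(field)))
    return result


print(parse_raw("+2650 -2580 +660"))
print(parse_raw("2650 2580 660"))
```

Both calls give the same list, since the prefixes carry no extra information.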

The script lircd2toml.py to convert lircd remote conf into ir-keytable toml formats now also supports raw_codes. I’m very excited that this now means we support the vast majority of lirc remotes.

Having raw IR in keymaps makes them much more useful for sending IR too. So, ir-ctl now supports sending keys from keymaps. Save the above keymap to blaupunkt.toml, and you send keys like so:

ir-ctl -k blaupunkt.toml -K KEY_VOLUMEUP

This means that – as far as I know – we have feature parity with the lirc framework. However no daemon is required at all and it’s a much more modern setup.

How to add support for a new remote from lircd.conf file

You have a remote for which there is no rc keymap, but you do have a lircd.conf file which works. That requires running the lirc daemon; nowadays you most likely don’t need to do that any more.

Automagic conversion from lircd.conf to toml keymap

I’ve written a python script which parses a lircd.conf file, and outputs a toml file. Now this script should cover most cases, but it won’t work in all cases. Get lircd2toml.py from the v4l-utils contrib. This requires python3 to work.

  • Make lircd2toml.py executable: chmod 755 lircd2toml.py
  • Run ./lircd2toml.py -o output.toml yourlircdfile.lircd.conf

Once you have the output toml file, it probably needs a little fixing up. First of all, lircd does not usually use linux keycodes, so those will need adjusting. Have a look at the linux input keycodes for the list of keycodes and adjust the toml accordingly. Also make sure the name is set to a reasonable value. Once you have done that, test it using:

ir-keytable -c -w output.toml -t

Please do let me know how you get on and comment below. Alternatively, find us on the linux-media mailinglist or on irc #v4l on Freenode. If the keymap works fine please submit it to the mailinglist as a patch for v4l-utils.

What if lircd2toml.py doesn’t work

First of all, please ask. I’m eager to hear about people using this tool. Having said that, I should explain a little more about how lircd.conf files work.

IR decoders in a rc-core world

In the toml file there is the “protocol” field, which can either match a kernel decoder or BPF decoder.

There are two types of IR decoders in the rc-core world. There are the hardcoded decoders in the kernel, which cover the most common protocols: rc-5, nec, rc-6, rc-mm, sony, sanyo, jvc, sharp, xmp. If your remote uses one of these protocols, you can use the method described in the previous post. However, you can try the method with lircd2toml.py too, it should work for most cases.

There are also protocol decoders written in BPF, which also have parameters. So if your remote has a regular protocol with non-standard signaling length, you can specify that in your keymap, much like the parameters in lircd.conf files.

The protocol in the lircd.conf is set in the “flags” field. lircd has a few:

  • space_enc: This is anything where bits are encoded using the distance between two pulses or spaces. Many of these are plain nec, but not all. Those that are not can be decoded using the BPF pulse_distance decoder or the BPF pulse_length decoder, depending on whether the space length differs or the pulse length differs.
  • space_first: Not sure yet.
  • rc5 or shift_enc: This is just rc-5. Some versions use non-standard times, in which case you should use the BPF manchester decoder.
  • rc6: This is just rc-6.
  • rcmm: This is just rc-mm.
  • grundig: This is for some ancient grundig remotes. There is a BPF grundig decoder for them.
  • bo: This is a Bang & Olufsen IR protocol. There is no BPF decoder for this yet, patches welcome :)
  • goldstar: This is for some ancient goldstar remotes. There is no BPF decoder for this yet, and support has been removed from lirc too.
  • serial: This is for IR receivers that decode IR and put it on a tty. Not supported in rc-core, as this should be done by a device driver.
  • xmp: This is just xmp.
  • raw_codes: This is just a plain dump of the pulses/spaces for each button. This is not supported in BPF yet (but see below).
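The list above can be turned into a small lookup. The Python sketch below reads a flags line from a lircd.conf and suggests which rc-core decoder to try first. The table just restates the list (space_first and raw_codes are left out, since the list does not settle them), and the names SUGGESTIONS and suggest_decoders are made up for this example. It assumes the flags are separated by | as in the usual lircd.conf files; CONST_LENGTH in the usage line is simply an example of a flag that says nothing about the protocol.

```python
# flag (lower case) -> suggested decoder, or None if nothing is available
SUGGESTIONS = {
    "space_enc": "nec; otherwise BPF pulse_distance or pulse_length",
    "rc5": "rc-5; BPF manchester for non-standard timings",
    "shift_enc": "rc-5; BPF manchester for non-standard timings",
    "rc6": "rc-6",
    "rcmm": "rc-mm",
    "grundig": "BPF grundig",
    "bo": None,        # no BPF decoder yet
    "goldstar": None,  # no BPF decoder yet
    "serial": None,    # belongs in a device driver, not rc-core
    "xmp": "xmp",
}


def suggest_decoders(flags_line: str) -> dict:
    """Map the protocol flags on a 'flags A|B' line to decoder suggestions."""
    fields = flags_line.split(None, 1)
    if len(fields) != 2 or fields[0].lower() != "flags":
        return {}
    names = [name.strip().lower() for name in fields[1].split("|")]
    # Flags that are not about the protocol are simply ignored.
    return {name: SUGGESTIONS[name] for name in names if name in SUGGESTIONS}


print(suggest_decoders("flags SPACE_ENC|CONST_LENGTH"))
```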

The parameters for the BPF decoders can be seen in the source code; see the global ints declared at the beginning of the c files. Have a look at the output of ir-ctl -r and see if you can match things up. Also have a look at the lirc documentation for the lircd format.

How to deal with raw_codes

If you have a lircd.conf remote with a raw_codes protocol, then things are a bit harder to solve unfortunately. The format lists all the keys and the associated IR, by listing the raw IR. The first value is a pulse, then space, pulse, etc. Essentially the protocol is not reverse engineered so you are not much better off than without the lircd.conf. This is a case where you should reverse engineer the protocol, like you would if you had no lircd.conf file at all.

It might be possible to support raw_codes in BPF, but loops are not allowed so this is not trivial. You would have to create the state machine of walking through the list, and unrolling the loop enough so that it can successfully decode the IR.

Note that some raw_codes are a well-known protocol but the author of the lircd.conf did not realise. For example, this hauppauge lircd.conf file uses raw_codes but it’s actually rc-5.

How to add support for a new remote

So you have a remote, but you can’t find a keymap that works for it. How do you add a new keymap for your remote?

Figure out what IR protocol it uses

For this to work, ideally you want an rc device which has a raw IR receiver. When you plug in your device, you will get a message in the kernel log (dmesg or journalctl -k) like:

rc rc0: lirc_dev: driver winbond-cir registered at minor = 0, raw IR receiver, raw IR transmitter

Now, if yours says scancode receiver it might still work, as long as the IR receiver supports the same protocol as your remote. Run:

ir-keytable -c -p all -t

And now start pressing buttons on your remote. Hopefully you will start to see messages like:

3630.181420: lirc protocol(rc-5): scancode = 0x1e01

This means the protocol is rc-5.
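If you are pressing a lot of buttons, it can help to collect the results with a script. This Python sketch pulls the protocol and scancode out of a line like the one above. The regular expression is written against that single example line only, so treat it as an assumption about the output format; the name parse_event is made up.

```python
import re

# Matches lines such as:
#   3630.181420: lirc protocol(rc-5): scancode = 0x1e01
EVENT = re.compile(
    r"^\d+\.\d+: lirc protocol\((?P<protocol>[^)]+)\): "
    r"scancode = (?P<scancode>0x[0-9a-fA-F]+)"
)


def parse_event(line: str):
    """Return (protocol, scancode) for a scancode line, or None for anything else."""
    match = EVENT.match(line.strip())
    if match is None:
        return None
    return match.group("protocol"), int(match.group("scancode"), 16)


protocol, scancode = parse_event("3630.181420: lirc protocol(rc-5): scancode = 0x1e01")
print(protocol, hex(scancode))
```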

If you get nothing, then unfortunately you’re going to have to dig a little deeper. First of all, verify that IR is actually being received. Run:

ir-ctl -r

And press some buttons again. If you still get nothing, check your batteries and that you have a clear line of sight between your remote and the IR receiver. If you do get something, then most likely you are unlucky and the protocol your remote uses is not one of the standard protocols supported by the linux kernel; a protocol decoder written in BPF would be in order, but that is the subject for another post.

For now I’m assuming that ir-keytable -c -p all -t gave you something like rc-5 or nec and no custom BPF decoder is needed.

Create the kernel rc keymap for your remote

The keymaps in /lib/udev/rc_keymaps/ are written in toml, but they are generated from the kernel sources, and you should create your keymap there so that it can be submitted as a patch for everyone else to use.

git clone the linux kernel tree and go to drivers/media/rc/keymaps/. Copy one of the existing files to a new file, e.g. rc-foo.c, and edit the file. First of all, the protocol should be set to the one you discovered in the first step (e.g. RC_PROTO_NEC). Now you need to set the individual keys.

Using ir-keytable -c -p all -t you can get the scancode; that needs to be mapped to a linux keycode. All the keycodes are listed in include/uapi/linux/input-event-codes.h in the kernel tree. Note that there are entries for mouse buttons too.

Ensure you have the copyright and license header set correctly; also check you have created a new RC_MAP_FOO entry in include/media/rc-map.h and that your file is included in drivers/media/rc/keymaps/Makefile. Commit your changes.

In principle it is ready now to be submitted, but it should be tested of course.

Create the toml keymap for your keymap

  • In your cloned linux repo, ensure you can build the kernel and run it without errors
  • Do make headers_install
  • Clone v4l-utils git repo
  • Do make sync-with-kernel KERNEL_DIR=path/to/your/linux/repo in v4l-utils
  • Now you should have an additional toml file in utils/keytable/rc_keymaps/
  • Load the keymap with ir-keytable -c -w foo.toml and test

Share your keymap with everyone else

The keymap should be submitted to linux-media kernel list, following all rules for submitting patches for the linux kernel. The patch should be against the kernel tree, and the v4l-utils tree will get updated by us once it’s merged into the linux-media git tree.

What's new in kernel v5.0 for rc-core

It has been a little quiet on the rc-core front, with just cosmetic changes in kernel v4.19 and v4.20. However, we do have two new features in kernel v5.0.

Mouse movement decoding in IR BPF

Some remotes have some sort of directional pad or joystick which can be used as a mouse. When decoding using the lirc daemon, this is handled by lircmd.

From kernel v5.0 onwards the BPF function bpf_rc_pointer_rel(x, y) exists to report mouse movement. We use this to decode the iMON RSC remote protocol. This means that the SoundGraph iMON Station is now fully supported.

Driver for XBox DVD Remote

This is a usb device for the original XBox, and does not have a regular usb connector. You’ll need an adapter cable to make this work on non-XBox hardware.

The device only decodes IR from the remote it comes with. I used the opportunity to figure out the protocol, and now we also have a decoder for it written in BPF, should you not have the receiver hardware.

What's coming in kernel v4.18 for rc-core

In kernel v4.18 the major new feature is IR BPF.

IR decoding in BPF

Kernel v4.18 introduces a new type of BPF program, called BPF_PROG_LIRC_MODE2. This type of program can decode raw IR and report decoded scancodes. This is to support the many lircd.conf rc keymaps which are currently not supported by rc-core.

Now that the kernel space work is complete, ir-keytable needs to be extended to support BPF type decoders (loading, querying and detaching) and we need a new set of IR decoders. I’m currently working on this.

The ultimate goal is to support all the keymaps for IR decoding and sending that lircd currently supports, without the need to run a daemon.

Faster IR decoding

The in-kernel IR decoders have always been a little sluggish, since the timeouts they use are far greater than what is actually needed. These timeouts have now been reduced, so using IR is much more responsive and keys will be less “sticky”. I think this makes a huge difference.

Other changes

There are some other minor improvements, such as MCE Keyboard decoder improvements, and when a lirc device is registered, it reports if it has a transmitter and if it is a raw or scancode receiver (or no receiver at all).

What's coming in kernel v4.17 for rc-core

In kernel v4.17 there are only minor changes.


The rc_core_debug module parameter for the rc-core modules is gone. Debug must be enabled via dynamic debug.

Minor Fixes

There are minor fixes to ir-spi transmit and meson-ir timeout handling.


There is a new driver for the iMON Station, which is an external usb device (unlike most of the other iMON gear). It is a raw IR device, so it does no decoding in hardware.

The second new iMON feature is a decoder for the iMON PAD remotes. These remotes have their own protocol, which is decoded by the iMON Inside, iMON VFD or iMON Knob. I spent a fair amount of time decoding it, and I think this is the first time someone has figured out how it works. ir-keytable has to be updated for this feature, which is not done yet. In the meantime, you can use it like so:

ir-keytable -s rc0 -c -w /lib/udev/rc_keymaps/imon_pad
echo imon > /sys/class/rc/rc0/protocols

mceusb learning mode and carrier report

Any mceusb device should have a wideband receiver for learning mode, which was not supported up until now. Learning mode should give you a more accurate reading, and can also measure the carrier. The downside is that the wideband receiver only works for short distances, so you have to hold your remote as close as possible to the IR receiver.

Using ir-ctl -m -r you can enable learning mode and carrier reports. The wideband receiver will remain enabled until you execute ir-ctl -M.