Embedded and
wearable systems require new ideas for input controller hardware. But older
software still expects a keyboard and mouse. How do we bridge the gap without
having to write kernel drivers?
A few months ago, I built a
Linux computer small enough to fit on my wrist that ran a full X desktop
environment. The issue wasn’t just fitting the computer in, but also figuring
out how to control it. Touchscreens become problematic when the screen is
barely bigger than your finger.
Of course, a mouse and
keyboard are totally out of the question. Alas, all the good software still
expects them. So it’s important to be able to fake those standard controller
inputs using whatever crazy hardware you do build. That’s what we're going to
do in this article: learn how to create “virtual” input devices for an
operating system (once the electronics are working).
Wearable devices desperately
need new ideas for ways of controlling them. The good news is that Linux makes
it easy. There’s no need to write special mouse drivers or kernel modules, and
you can use whatever language you want (though C is still best).
To make my wrist-mounted
machine, I tried several input systems, including:
1. A magnetometer (or digital compass)
module and a “3D stylus” (a magnet) to move the mouse pointer and simulate
clicks.
2. An accelerometer mode where tilting the
module in any direction caused the mouse pointer to “roll” in that direction,
like a ball on a table. This was prohibitively hard to control.
3. A mode where velocity (integrated
acceleration) in X and Y directions moves the mouse pointer like a normal mouse
would. The issue here was that you can't use it while moving.
4. A mode where “tapping” the module caused
mouse clicks. This wasn't a great idea for fine control.
5. A gyroscope mode where rotating the
module directly translates to mouse movement.
It was the last method,
the gyroscope, that was the nicest to use,
along with a physical button for clicking. It turns out that humans can
perceive and control much smaller angular “twisting” movements compared to
lateral, “swaying” accelerations. This is why knobs exist.
Many people have asked why I
didn't do it the “obvious” velocity way, exactly copying how a desktop mouse
works. It’s important to note the gyro method wasn’t my first choice or my
original intention. Sure, I had some strong ideas, but if I’d stuck to them I
would have built a poor interface.
Experiment. Iterate. See it
in context. Let reality have a say in the matter.
The gyroscope-based system
gave the best mix of fast movement and fine control, and that’s what I was
looking for, so I wrote “gyromouse” and used it as
the basis for the manipulator.
It’s trivially easy these
days to wire up an I2C sensor
to an embedded Linux machine like the Raspberry Pi. Ground, VCC, SDA, and SCL.
Two power wires, two data wires—and that’s it. Job done.
Once physically connected,
the Pi’s I2C support is excellent. You need to
enable it through raspi-config, at least for
Wheezy and Jessie. (I have hopes that in a future Raspbian,
the I2C, SPI, and I2S buses might be enabled by default.)
The bundled command line tool i2cdetect can enumerate the devices on the bus,
so you can verify the hardware is mostly working without writing a line of code.
From there, it’s
straightforward to use your language of choice to talk to the chip with its
preferred protocol. Most I2C and
SPI sensor chips have a couple of configuration registers that you write to
turn them on and put them in the correct mode, and a couple of registers you
read continuously to get the sensor values back out.
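To make that register dance concrete, here’s a minimal sketch of the read side in C. It assumes an MPU6050 at its default bus address of 0x68 on bus 1, with the wake-up register PWR_MGMT_1 at 0x6B and the gyro output starting at 0x43; check the datasheet for your own chip before trusting any of those numbers.

#include <stdio.h>
#include <stdint.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <linux/i2c-dev.h>

int main(void) {
    // open the I2C bus device (bus 1 on most Pi models)
    int fd = open("/dev/i2c-1", O_RDWR);
    if(fd < 0) { perror("open"); return 1; }
    // address the sensor (0x68 is the MPU6050 default)
    if(ioctl(fd, I2C_SLAVE, 0x68) < 0) { perror("ioctl"); return 1; }
    // write a configuration register: clear PWR_MGMT_1 (0x6B) to wake the chip
    uint8_t wake[2] = { 0x6B, 0x00 };
    if(write(fd, wake, 2) != 2) { perror("write"); return 1; }
    for(;;) {
        // point at GYRO_XOUT_H (0x43), then read 6 bytes: X, Y, Z high/low pairs
        uint8_t reg = 0x43, buf[6];
        if(write(fd, &reg, 1) != 1 || read(fd, buf, 6) != 6) break;
        int16_t gx = (buf[0] << 8) | buf[1];
        int16_t gy = (buf[2] << 8) | buf[3];
        int16_t gz = (buf[4] << 8) | buf[5];
        printf("%d %d %d\n", gx, gy, gz); // one sample per line on stdout
        usleep(10000);                    // poll at roughly 100Hz
    }
    close(fd);
    return 0;
}

Run that and you get a stream of gyro samples on stdout, which is exactly the kind of reader program the rest of this article assumes.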
I’ve used the command line
tools, Python, C, C++, and even Javascript to
read a sensor. Most sensors have update rates in the 10-1000Hz range that are broadly
compatible with the millisecond-accurate timing loops in Linux.
We’re going to skip over the
specifics of talking to the chip and assume you’ve got a program of some kind
that can read your sensor and, for example, write a stream of numbers to stdout representing the samples.
Here’s the code I mashed
together for the MPU6050 Inertial Measurement Unit used in Gyromouse:
Gyromouse IMU Code
And here’s a similar wodge
for dealing with the HMC5883 magnetometer:
Gyromouse Magnetometer Code
Now comes the important part:
using that raw data to “fake” a typical input device like a mouse, touchpad, or
keyboard. We want to write a second program that takes the stream of sensor
samples, processes it, and feeds it back into the operating system like a
device driver would.
And this is where Linux
shines because it has user-space input devices. “User-space” means
you don’t have to write a kernel device driver, and this is a great, great
thing. You really want to avoid writing kernel modules. It will take over your
life. Ask Linus Torvalds.
Instead, we use /dev/uinput, a special file exposed by an existing
kernel module. By writing command messages to this file, your code can
ask the API to register a “virtual” device on your behalf and assign it
capabilities.
This works because UNIX
really doesn’t care how many keyboards or mice you have plugged in. It never
has. Whichever mouse is moved pushes the pointer, and whatever buttons are
pressed are accepted as keys.
There are three basic
“classes” of uinput device we’ll talk about
(though there are many more) and a single program can use any subset:
● Key events
● Relative events
● Absolute events
Key events include both keyboard
and mouse buttons. At this level, mouse button events don’t have coordinates
attached. In fact, they can’t because that would imply the
lowly input device knew what the X-windows pointer was up to. So mouse buttons
are basically treated like tiny keyboards within the uinput system.
Key events come in two
halves, the “down” and “up” parts, and you should send them in neat pairs.
(I’ve bent that rule more than once, so it’s clearly not enforced, but it does
confuse X.) You could soft-wire that straight to a GPIO pin connected to a
switch, so long as you debounce it a little.
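Here’s a sketch of that soft-wiring, assuming the pin has already been exported through the old sysfs GPIO interface (gpio17 is a made-up pin number) and using the send_button() helper we’ll write later in this article. The debounce is the crude kind: only act once the reading has held steady for a few samples.

#include <stdio.h>
#include <unistd.h>
#include <linux/input.h>

void send_button(int btn_code, int value); // defined later in this article

void poll_gpio_button(void) {
    int reported = 0;             // the state we last sent to uinput
    int candidate = 0, stable = 0;
    for(;;) {
        // sysfs GPIO reads give us '0' or '1' plus a newline
        FILE *f = fopen("/sys/class/gpio/gpio17/value", "r");
        if(!f) return;
        int now = (fgetc(f) == '1');
        fclose(f);
        if(now == candidate) {
            if(stable < 5) stable++;             // count identical reads
            if(stable == 5 && now != reported) { // ~25ms of agreement
                send_button(BTN_LEFT, now);      // 1 = down, 0 = up
                reported = now;
            }
        } else {
            candidate = now; // contact bounce: start the count again
            stable = 0;
        }
        usleep(5000); // sample at roughly 200Hz
    }
}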
During setup, you have to
tell the API which key codes you plan to send. If you’re implementing a full
keyboard, this gets tedious—but it’s a necessary evil so that Linux or X can
profile the device and potentially change behaviour if keys are different
(e.g., Meta/Windows/Mac keys, multilingual keyboards, third mouse buttons).
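If you really do want everything, the tedium can at least be a loop. This sketch claims every key code the kernel defines, which is heavy-handed but works; it assumes fd is an open /dev/uinput handle and that EV_KEY has already been enabled, as in the example later in this article.

#include <sys/ioctl.h>
#include <linux/input.h>
#include <linux/uinput.h>

// claim every key code the kernel knows about; a real keyboard
// would list only the keys it actually has
void claim_all_keys(int fd) {
    for(int code = 0; code < KEY_MAX; code++)
        ioctl(fd, UI_SET_KEYBIT, code);
}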
I don’t think it’s possible
to dynamically add or remove keys from a virtual device after setup, but you
can always destroy and re-create the virtual device with new definitions.
Mouse movement is the classic
example of a relative event. The mouse doesn’t know where it is—it just knows
it rolled left. Each axis gets its own event message, so REL_X events can be
sent at a different rate than REL_Y events, and then you send EV_SYN messages
after each “set” of updates. (Essentially, this is a “commit” once you’ve done
all X and Y and wheel updates for a common time period.)
Axis coordinates are sent as
deltas in the device’s own coordinate system with as many DPI as you want
(within sane limits). (DPI, by the way, is an abuse of “Dots Per Inch”: here
it’s shorthand for coordinate increments per distance traveled, not printed dots.)
X-windows has a “mouse speed/acceleration” preference panel which adjusts how X
turns those event values into arrow pointer movement (which isn’t our concern
here).
The weakness in that system
is if you’ve got devices with wildly different DPI-equivalents, there is no
optimum “mouse speed” which will work for all. This is why high-resolution mice
have that little switch on the bottom to make them ordinary again. Rather than
define your own scale, you want to send event values that sort-of correspond to
the deltas that a mouse would generate. You want it to be easy to drive the
pointer across an average screen (say, 1000 “dots”).
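As a sketch of what that looks like for a gyro: a gain constant maps the angular rate onto mouse-ish deltas. The value below is a made-up starting point, not anything from Gyromouse; you tune it until one comfortable twist sweeps the pointer across the screen.

// a made-up gain: tune until a comfortable twist sweeps ~1000 dots
#define GAIN 0.05f

// turn an angular rate (degrees/second) into a mouse-like delta
int rate_to_delta(float deg_per_sec) {
    return (int)(deg_per_sec * GAIN);
}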
Absolute events come from devices
with a finite, limited range of inputs, such as touchscreens or volume knobs. During
initialization, you are expected to send definitions about the range of each
axis, so you can say your touchscreen is [0...1023]×[0...1023]
if you have 10-bit ADCs that give a full range of sample values. It’s someone else’s problem to turn that into screen
coordinates at the current resolution. You deal purely in your sample values.
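A minimal sketch of that declaration, assuming the same legacy uinput setup used in the C example later in this article, for a hypothetical 10-bit touchscreen (the name “faketouch” is mine):

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/input.h>
#include <linux/uinput.h>

// declare a 1024x1024 absolute device on an already-open /dev/uinput handle
void create_fake_touchscreen(int fd) {
    ioctl(fd, UI_SET_EVBIT, EV_ABS);
    ioctl(fd, UI_SET_ABSBIT, ABS_X);
    ioctl(fd, UI_SET_ABSBIT, ABS_Y);
    ioctl(fd, UI_SET_EVBIT, EV_KEY);
    ioctl(fd, UI_SET_KEYBIT, BTN_TOUCH); // X expects touchscreens to report touches

    struct uinput_user_dev uidev;
    memset(&uidev, 0, sizeof(uidev));
    snprintf(uidev.name, UINPUT_MAX_NAME_SIZE, "faketouch");
    uidev.id.bustype = BUS_VIRTUAL;

    // the axis ranges tell the system what our raw sample values mean
    uidev.absmin[ABS_X] = 0;  uidev.absmax[ABS_X] = 1023;
    uidev.absmin[ABS_Y] = 0;  uidev.absmax[ABS_Y] = 1023;

    write(fd, &uidev, sizeof(uidev));
    ioctl(fd, UI_DEV_CREATE);
}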
Declaring the sensor’s own range doesn’t mean you should copy raw sensor
values straight into event values, though. That results in a very jittery
pointer as all the noise gets shown clearly to the user. Some form of sample
filtering is usually needed (if not already done by the sensor), so the output
range you pick is going to be related more to your filter math than to the I2C device’s specifications.
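One cheap filter is an exponential moving average; here’s a sketch, with the smoothing factor chosen by feel rather than by any theory:

// keep most of the history, mix in a little of each new sample;
// smaller factors mean less jitter but more lag
float smooth(float previous, float raw) {
    const float factor = 0.1f; // chosen by feel
    return previous + factor * (raw - previous);
}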
Just remember that X-windows
can’t invent resolution you don’t send, so if you’re expecting to drive a
pointer on a 4K screen with individual-pixel accuracy using a standard 8-bit
resistive touch-screen ... well, that’s not going to work. With 256 positions
spread across 3840 pixels, the pointer can only land on roughly every fifteenth
pixel. Menus are fine at that resolution, but not Photoshop.
As far as I know, there’s no
explicit prohibition of sending both relative and absolute events from a single
source. However, there are not many circumstances where it makes sense. Even if
your sensor inputs are coming from something like a GPS/inertial sensor
combination (which has perhaps 10Hz of “absolute” updates vs. 1000Hz of
“relative” from the IMU), you’d be fusing those inputs in your code and
presenting a single stream to /dev/uinput. Otherwise,
you have no idea how X’s mouse acceleration is tweaking the scaling between the
two modes.
You will usually be mixing
either relative or absolute events with key events. A standard mouse is the
classic case of relative + key events.
The following example is
written in pure C, but the same concepts can be applied in any language. If you’re
using Python or Javascript, there are libraries
which neatly wrap up this functionality—but this is what they’re all doing
underneath.
I’ve learned that C is a very
efficient language for doing low-level I2C and uinput work, and the lack of
dependency on any interpreted language or libraries means my mouse/keyboard
isn’t going to stop working because my Python/Javascript upgrade
broke.
Example C Code
There is no specific library
to install: you just open /dev/uinput as if it were a normal UNIX file, and
use the ioctl() and write() functions to send commands and data to the API.
The message block format and codes are defined in the standard Linux includes
that should already be on your system.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/input.h>
#include <linux/uinput.h>

int fd; // file handle for /dev/uinput

// print the error and bail out
void die(const char *msg) {
    perror(msg);
    exit(1);
}

void initialize_uinput() {
    // open the uinput fifo
    fd = open("/dev/uinput", O_WRONLY | O_NONBLOCK);
    if(fd < 0) die("could not open /dev/uinput");

    // enable the message types we're going to send
    if(ioctl(fd, UI_SET_EVBIT, EV_KEY) < 0) die("error: ioctl");
    if(ioctl(fd, UI_SET_KEYBIT, BTN_LEFT) < 0) die("error: ioctl");
    if(ioctl(fd, UI_SET_KEYBIT, BTN_RIGHT) < 0) die("error: ioctl");
    if(ioctl(fd, UI_SET_KEYBIT, BTN_MIDDLE) < 0) die("error: ioctl");
    if(ioctl(fd, UI_SET_EVBIT, EV_REL) < 0) die("error: ioctl");
    if(ioctl(fd, UI_SET_RELBIT, REL_X) < 0) die("error: ioctl");
    if(ioctl(fd, UI_SET_RELBIT, REL_Y) < 0) die("error: ioctl");

    // create our virtual input device
    struct uinput_user_dev uidev;
    memset(&uidev, 0, sizeof(uidev));
    snprintf(uidev.name, UINPUT_MAX_NAME_SIZE, "gyromouse");
    uidev.id.bustype = BUS_VIRTUAL;
    uidev.id.vendor = 0x1;
    uidev.id.product = 0x2;
    uidev.id.version = 1;
    if(write(fd, &uidev, sizeof(uidev)) < 0) die("error: write");
    if(ioctl(fd, UI_DEV_CREATE) < 0) die("error: ioctl");
}
During initialization, we
also provide metadata such as a device name (“gyromouse”
in this example, but you should change it to your own). However, if we were
trying to emulate a very specific device (say, for legacy software which needs
something specific), we could impersonate that device by using its
bus/vendor/product IDs instead of generic defaults.
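That impersonation is just a matter of filling in the id fields before the device is created. The values below are placeholders; you’d copy the real ones from lsusb output for the device you’re mimicking.

// pretend to be a specific USB device (placeholder IDs shown;
// copy the real ones from "lsusb" for the device you're mimicking)
uidev.id.bustype = BUS_USB;
uidev.id.vendor  = 0x1234;
uidev.id.product = 0x5678;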
Once setup is done, we can
write message blocks to the file that fire off UI events.
void send_syn(void); // declared here so send_button can call it; defined below

void send_button(int btn_code, int value) {
    // button event
    struct input_event btn_ev;
    memset(&btn_ev, 0, sizeof(struct input_event));
    btn_ev.type = EV_KEY;
    btn_ev.code = btn_code;
    btn_ev.value = value;
    if(write(fd, &btn_ev, sizeof(struct input_event)) < 0) die("error: write");
    // syn event
    send_syn();
}

void send_syn() {
    // syn event: "commit" everything sent since the last syn
    struct input_event syn_ev;
    memset(&syn_ev, 0, sizeof(struct input_event));
    syn_ev.type = EV_SYN;
    syn_ev.code = SYN_REPORT;
    syn_ev.value = 0;
    if(write(fd, &syn_ev, sizeof(struct input_event)) < 0) die("error: write");
}
The button/key code is the same one we told uinput about during
initialization. The value is 1 for “down” or 0 for “up” (or 2 for
hardware autorepeat). To simulate a full
press-release, you must send both events:
send_button(BTN_LEFT, 1); // send left mouse button down
send_button(BTN_LEFT, 0); // send left mouse button up
This time, we send a pair of
events for X and Y deltas (even if the delta is zero, which we could
theoretically skip). Then we send a sync.
void send_mouse(int dx, int dy) {
    struct input_event ev;
    // mouse X movement
    memset(&ev, 0, sizeof(struct input_event));
    ev.type = EV_REL;
    ev.code = REL_X;
    ev.value = dx;
    if(write(fd, &ev, sizeof(struct input_event)) < 0) die("error: write");
    // mouse Y movement
    memset(&ev, 0, sizeof(struct input_event));
    ev.type = EV_REL;
    ev.code = REL_Y;
    ev.value = dy;
    if(write(fd, &ev, sizeof(struct input_event)) < 0) die("error: write");
    // syn event
    send_syn();
}
As our program is closing, we
should send a final message to the API to destroy our device, in the logical
sense anyway. (If your hardware device ends up physically destroyed, that’s
another issue entirely.)
void shutdown() {
    // destroy our mouse device
    if(ioctl(fd, UI_DEV_DESTROY) < 0) die("error: ioctl");
    // close the file
    close(fd);
}
In addition to executing this
function when the program closes, you should trap SIGINT and SIGTERM so that
it also runs when a user presses Ctrl+C or sends a plain “kill”. (A “kill -9”
sends SIGKILL, which can’t be caught by anything.) But if you don’t, it’s not
the end of the system: the input device hangs around (probably until the
machine reboots) but doesn’t really interfere.
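A sketch of that trap, reusing the shutdown() function above:

#include <signal.h>
#include <stdlib.h>

void shutdown(void); // the cleanup function above

// destroy the virtual device and exit when we're interrupted or killed
void handle_signal(int sig) {
    (void)sig;
    shutdown();
    exit(0);
}

void install_signal_handlers(void) {
    signal(SIGINT, handle_signal);  // Ctrl+C
    signal(SIGTERM, handle_signal); // a plain "kill"
}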
If you’re running the program
repeatedly without cleanly destroying the device, then you might get problems
due to old references piling up in the kernel. So don’t do that. But having
dozens of hanging devices during development didn’t give me any grief, so don’t
feel like you have to reboot after every segfault.
This article merely scratches
the surface of the uinput system. There are
also EV_SW “stateful switch” events, meant for
laptop lid open/closed detection. There are EV_LED events to
query/control blinky lights such as caps
lock, EV_SND for beeps, EV_FF to control force-feedback rumble packs (yes,
really), modifiers to enable multi-touch and multi-tool detection (if you have
a tagged stylus with an eraser on the other end), and even tool-distance
(hovering) support for quasi-3D interfaces. So there is tremendous scope for
creating standard UNIX input devices with abilities far more complex than those
of a 2D mouse.
A good overview of the full
range of device types is laid out here.
Splitting the task into two
programs and using a connecting stream is the “UNIX way.” It allows us to do
tricks later like tee the sensor data stream to extra programs (since only one
program can access the hardware) or even separate programs across a network.
Since they’re normal UNIX
programs (not special device drivers), we have to make sure that they run at
boot time. I did it the quick and dirty way with an init.d script
that starts the two programs together, but advanced UNIX gurus will be able
to daemonize the programs and plumb them together with proper FIFOs.
So now you know it’s not that
hard to fake button presses. Any program can do it, so long as it has access to
the /dev/uinput file.
Most languages have a wrapper
library named some variant of uinput that
makes the process even easier. If you’ve already got the skills to write the
code to talk to your sensor, it won’t be hard. The tricky part is the “glue” in
the middle, which translates the raw data into user actions. It’s tricky
because there’s no right answer.
For example, the code
provided for Gyromouse has different
scaling for the X and Y axes because it’s easier to twist your wrist than tilt
your entire arm. It just felt nicer when the Y speed was toned
down. Of course, that applies only to devices on your wrist.
It’s pretty specific to the task at hand (badum-tsh!).
So that’s your job now:
connecting the hardware and software with new ideas. Making it feel
right. The tools are there for you!