Kernel space: Oopses, printk and config menu

I am entering my final step in the internship with the encoding almost finished (there are still some problems to consider and testing to be done, especially on machines with multiple CPUs). This article will be rather short as I am focusing on my coding now more.

Updates

I have used the library libqrencode after realizing that it was actually a LGPLv2.1 (I thought it was LGPLv3) and that kind of license can be transformed to a GPLv2. If it were a LGPLv3 it wouldn’t have been compatible with a  GPLv2. So what I did was translate to kernel space and use most of it as it is.

As I left the license transformation for later on I put all my work on gitlab because on github I would have been posting publicly (I don’t have private repos as an option for github) a code without the proper licensing.

Oops in depth

What I learned as a general thing is that the best way to understand something is by reading documentation, reading the source code and testing it. These were the steps I followed. As I mentioned before, an Oops occurs when you’re doing something illegal like a ‘NULL pointer deference’. The kernel may or may not continue after an Oops occurs (or a panic for that matter).

How does it actually work? It is a rather simple mechanism in principle. An Oops has several information useful for debugging like call trace, stack trace, eip, etc. The kernel dumps information to the console about these using ‘printk’ calls. For example, I did an inspection of the x86 dumpstack file. The architecture dependent files are located in the arch folder with a folder for each one, so here’s the link on lxr.free-electrons.com: http://lxr.free-electrons.com/source/arch/x86/kernel/dumpstack.c

In order for the ‘printk’ to work on multiple CPUs is by using a lock for exclusive access to the console. All this is done in the printk.h in the architecture independent part of the kernel. The idea I had was to use a configuration option so that each printk call during an Oops has a lateral effect of writing into a pagesize (4K) static buffer. Then, after the Oops handling is over and the last printk call (it ends with a End of Call Trace string) is done, the buffer is encoded and written.

Config menu

Another useful thing in the kernel is the configuration menu. Before compiling the kernel, you write the following command in your terminal

make menuconfig

You then enter a UI that lets you choose several options for compiling the kernel. You can see debug information, driver options, certain security information and many more. You select to include on by pressing the space bar and by pressing Enter you may navigate a submenu. For some options you can select it and there appears a ‘*’ before that option, deselect it (you have a ‘-’ before it) or insert it as a module ‘M’.

You can include your own option, as I did with my QR library option by adding information in the Kconfig file. So for example, right now my Kconfig file inside the lib/ directory (this is where the library folder is) has these following lines:

config QRLIB
    bool "QR encoding library"
    help
     QR encoding library

I can then use the CONFIG_QRLIB to test if this option has been enabled.

In order for the library to compile I had to write a Makefile inside the directory and then add a hook for it in the lib/ directory’s Makefile. The way the kernel builds is by having a sort of hierarchical Makefile (it is rather normal, if you think how all big projects are compiled, but the kernel has some additional scripts which is cool).

Going forth

The next step is doing the compression. This is rather delicate, but I hope to come with a good solution. There are several solutions of compression already in the kernel. I will test these first.

Reed Solomon & Kernel?

A lot of time has passed since I updated the blog so this will be a fairly big post. I’ll start with the next steps on implementing the QR code and then some kernel programming stuff.

Reed Solomon

When doing the encoding on a bit stream, in order to ensure that it can be decoded properly. By reading the error correcting bits, the decoder can determine if it read the data and correct it if the percent of errors are recoverable. There are four levels of error correction listed in the table below depending on the amount of bits used for the error correction. The higher the amount of recoverable data, the larger the QR will be.

Error Correction Level Error Correction Capability
L Recovers 7% of data
M Recovers 15% of data
Q Recovers 25% of data
H Recovers 30% of data

So, as I pointed out in my earlier post, each QR has a version level, and now a error correction level. Based on the input and the output size, the QR encoder decides which version and error correction it can use and how many bits it needs to encode. To get an idea on you can check out this table here.

Reed Solomon does two things: consider the bit stream input as coefficients for a polynomial function, generates another polynomial function, divides the two and uses the remainder as the error correction codewords. The original data codewords and the error correction codewords are interleaved based on the version of the QR resulting in the final bit stream. The final step is putting them in a function pattern, that’s a portion of the QR’s matrix where the decoder expects to read a certain part of the code in order to decode it to data. That’s just a very sketchy idea, but I’ll detail in the following sections.

I will explain some of the steps in the error correction process, but you can find it more detail in this tutorial. It is really nicely explained and clearly, even for someone who has forgotten algebra.

Message polynomial

The message polynomial uses the output of the data encoding part as coefficients. So, as an example, if the codewords, converted to integers, were 110, 118, and 35, the message polynomial would be 110x2 + 118x + 35.

Generator polynomial

The generator polynomial starts from a 2-codeword and that’s x2  + 3x1  + 2x0. This comes from multiplying the (x-1) which is (x-20) and (x-2) which is (x-21). So, can you see the pattern? This is done recursively, so that at each step N you multiply (x-2N-1) with the gN-1(x) in order to get the gN(x), the generator polynomial for N error correction codewords.

Galois field and arithmetic

This gets really mathematical. The main point here is that the exponents fall under certain arithmetic rules, so when you get an exponent (at division) that’s greater or equal to 256, you apply modulo 255 before doing anything else.

Division steps

(taken from the tutorial)

  1. Find the appropriate term to multiply the generator polynomial by. The result of the multiplication should have the same first term as the the message polynomial (in the first multiplication step) or remainder (in all subsequent multiplication steps).
  2. XOR the result with the message polynomial (in the first multiplication step) or remainder (in all subsequent multiplication steps).
  3. Perform these steps n times, where n is the number of data codewords.

Structure the final message

For smaller QRs you leave the remainder from the division as they are. But for large QR outputs you  blocks are interleaved. Then the error correction is added to the interleaved list of blocks. There is also a certain placement in the matrix you have to follow according to the standard, you can find them detailed in the standard or explained in the tutorial.

Anecdote about Galois

The ‘Galois field’ got me to exercise my brain a bit and remember the algebra I did in high school. I had a very nostalgic feeling about the name Galois and then I remembered the stories my math teacher told. Galois had a very interesting story and romantic, as he died in a duel and even on his deathbed he wrote the researches he had been carrying (even during the time he spent in prison!) that would later be published. He was a fervent mathematician with a powerful imagination. Well, our teacher said he died because of a duel for a young lady, but when I read about it on the Internet it was not the case. Either my memory is playing tricks on me or my teacher was trying to make a point (I wonder). Another thing about him is that he failed entering l’Ecole Polytechnique two times and some say that it may have been that he was far more qualified than his examiner. According to this “Legend has it that Galois, who worked almost entirely in his head and who was poor at presenting his ideas verbally, became so enraged at the stupidity of his examiner that he hurled an eraser at him.“. Either way, he managed to get recognized and publish articles, but his life remained tumultuous (as his genius) until the very end.

Reed Solomon in the Kernel

Well, after going on about how it’s done I found out from PJ (when looking where the QR library should be located in the kernel’s code) that Reed Solomon is actually already implemented in the kernel as a module and you can find the code on http://lxr.free-electrons.com/source/lib/reed_solomon/.

The Reed Solomon error correction code (according to wikipedia and another tutorial) is used in:

  • Storage devices (including tape, Compact Disk, DVD, barcodes, etc)
  • Wireless or mobile communications (including cellular telephones, microwave links, etc)
  • Satellite communications
  • Digital television / DVB
  • High-speed modems such as ADSL, xDSL, etc

It has many applications and could be used in driver implementation.

Thank you for reading so far, I know I haven’t been updating my blog as often as I should/could, but it’s my first time keeping a blog and I don’t have the discipline to do it, yet. I also had a “holiday” for my exams, so I had to focus on that for a week. This article has been in the back of my mind for a while now and I wanted to clear it out now, but next time it will get more Linux than Galois. :)

QR codes

As I’ve said in my previous post, I’ll keep this blog as a way to track my progress so far.

QR codes

Before going into the problems and the solutions I came up with after a couple of weeks into the internship, I’ll try explaining what a QR code is.

QR codes, as explained on wiki, are 2D barcodes. As of quite recently (meaning two years) they have become a popular way of advertising (I know I’ve seen them all over the place), but they were actually created to track vehicles during manufacture by Denso Ware, Japan. You can decode them with a smartphone app or a camera and a program that analyzes the camera’s sensor input. Usually, they lead to a URL or display a text.

There are four input modes: numeric, alphanumeric, binary/byte and kanji. Since Oops messages have characters that are not included in the alphanumerical set, the input mode is byte. The maximum number of characters for this input mode is 2,953. This may seem quite nice, but in reality there are two problems: kernel Oops messages are quite large and even so, we wouldn’t want to generate a QR code the size of the screen.

For example, let’s take a kernel Oops message on a x86 architecture (taken from here)

BUG: unable to handle kernel NULL pointer dereference at 00000000
IP: [<00000000>]
*pde = 00000000
Oops: 0000 [#1]
Modules linked in: stap_25635fa2767efdc6a286d47c0380042a_404(+) iTCO_wdt
iTCO_vendor_support sg intel_agp floppy e100 i2c_i801 agpgart rng_core button
pcspkr mii sr_mod cdrom i2c_i810 i2c_algo_bit i2c_core ata_generic ata_piix
libata dock sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ehci_hcd [last unloaded:
microcode]Pid: 1830, comm: staprun Not tainted (2.6.26-rc5 #1)
EIP: 0060:[<00000000>] EFLAGS: 00010246 CPU: 0
EIP is at 0x0
EAX: 00000000 EBX: df778e88 ECX: 00000000 EDX: df778e88
ESI: e0973568 EDI: c11a0e04 EBP: e0973700 ESP: df778df0
 DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process staprun (pid: 1830, ti=df778000 task=df79dc80 task.ti=df778000)
Stack: e092b053 ffffffe1 74737973 61746d65 df400070 00000005 c04ac269 df778e58
       00001360 00000000 00001709 00000005 df40a6c0 df407000 00000002 c13dce98
       c13dce80 00000672 70617473 3635325f 61663533 37363732 63646665 38326136
Call Trace:
 [<e092b053>] _stp_transport_init+0x2b7/0x43b
[stap_25635fa2767efdc6a286d47c0380042a_404]
 [<c04ac269>] ida_get_new_above+0xc2/0x17d
 [<c0433103>] sys_init_module+0x1599/0x16fb
 [<c04665b2>] do_path_lookup+0x154/0x199
 [<c04af285>] simple_strtoull+0x0/0xf0
 [<c040371e>] syscall_call+0x7/0xb
 =======================
Code:  Bad EIP value.
EIP: [<00000000>] 0x0 SS:ESP 0068:df778df0
---[ end trace df61255cfec2f882 ]---

And here is the QR code: http://tinyurl.com/nu78673. This is still readable on a phone app, but it is a very complex QR code so depending on the camera’s sensors it might get tricky to read them.

Existing libraries for QR encoding

The most popular implementation in C is libqrencode. There are several others that are not as complex as this one or use this one and write to framebuffer (which is what we need).

Unfortunately, the libqrencode is under LGPL (GNU Lesser General Public License) and it is not compatible with GPL (GNU General Public License), which is the license Linux kernel is under. So, either way I will have to implement a similar library tailored to the use of this project.

LE: Actually, the LGPLv2.1 is compatible is code in the Linux kernel that is GPLv2. The third version of LGPL is not compatible with GPLv2.

The implementation will have to follow one of standards of encoding data into a QR and the one I will follow is ISO/IEC 18004. I have started looking through the specifications of this standard, though very generally, since the pdf file containing the standard’s description has quite a lot of pages… I will follow up with a post on the algorithm used on encoding data soon.

QR codes and Oops messages

As I hinted above, Oops messages are quite large and encoding them is problematic. The whole idea with QR codes is that these messages don’t fit on a screen and are difficult to read so a gigantic QR code is out of the question. Plus the QR code’s character limit would be reached. That is why before the encoding is done the Oops has to be compressed. Since the input type of the QR encoding is fixed, the compression of the original Oops message should result in one of the input modes, in our case, byte.

So, what options do we have here? First thing that comes to mind is compressing the Oops message and encoding it with a binary-to-text encoding method (that’s only if we are keen on having the input as text).

As an example, let’s take the above mentioned Oops message. It has 1446 characters. After compression with gzip and encoding with base64 we get a string of 1139 characters.

 H4sIAI/kulIAA41UbWvjOBD+Lrj/MLBf2ttrT5Zs2THXg22aXRZ6u4Fs4SAEI0tyootjGVvpZv/9
 jRSn7cJ9OEHCjOaZ0TNv/uX+6VMJx07WrQHvYCc7jdLeDJ1p4cvT4yP0znbeDKDNYBr8dcqA9ECn
 Qz4vS1j/cVH/3JBfe23g7hXw1fVjGVVYv0s25C+nj60ZobXd3miwXQmjl33FMsGzRrJc5KbRSkhW
 CJ3mivKC0pTJKqXp1ftrsN/mX6vv2pMoPJtOu6Eaj33vBg/jFgLdtpLbHprW9f0PMAk+bZmqbEET
 QMNWInLotpVyg4H66L3rSK/Gfj/AwVoYh+rgNCg9uMPkmJwjyHbrqtr6qERv6WW1NZ0ZrIpyb+2J
 tLZGGbRTexh1DDaq0UbBnDyHf2oNh1pJtTNw3Clb7RRaLsK6laPHrrROaqNLcrBqcMppsyFkaXUJ
 ScHpb6Dc4XCu3XDs4Ivz4GVIXsMVuxW3TNwMKoN3yTVZhCZRKmj5tlOw+Pj44dMq9iahLBUwXz6h
 FuBgx9jlE2of/i5fugmLe9R0k+eFKQpYzH+yPbyxkcXqcwmGznKeCUQ+oKaSRFJDU4yynGx58Fst
 Jz/dUAIPkVJe4/0kfFxNA3RmyzmsoiAKssTKmHF8KcNV/1ogb+9i1ODp5bgP2kyr4qzdvjFfk5WX
 ah85sZpmHJp4TAJ5iiSRKIgkT4UWGTJN0SWnl8QzUDSVionZlHxWEDifYE64oK81ijc5nb06h3BS
 KBqF/ILBw7BeXCszewl31osJIXIGOQ20kB3H5eEsa5CmCCLe5HiXcwaCi1QIJM4LzgTSIXPZtvBt
 kMqUBJf3kjSORDX6vvKD7MawTpXtrH9PT6zOf6enlNdk/b9XdRMiXwqDka0Oi+KrznyvZO2eDcZV
 DMMmuZ6gnCc0kBh/jPHlsC/4qUBgks1mASqaesJiPjVDrHZVL/2uap3bH/sITQNyNru837AiC0Ht
 oW8N5jd4d2xbRFLEhXmLMMrzxJzfVlidKvwhJuSNT9799yFzXMoS4F5qCEvzLNujuT2v20+bho+F
 kcVBj1Nbvgz7zc3NGvATBj50A0dAJCzLVGMUa4qCwQYB5F8scBlFpgUAAA==

That’s not very impressive, it’s a 1.26 compression ratio (that’s uncompressed_size / compressed_size), but a better compression tool should be used. There are other compression algorithms that specialize on compressing text input with a better compression ratio. As a matter of fact, Oops messages are similar with log files, since they have repeating patterns/substrings. Log files are text files with the best compression ratio as described on maximumcompression.com.

Conclusion

For now, I’ll be implementing the QR encoding library (but only the byte input mode) after taking a good look on the algorithm and standard. I’ve set up a github repository and hope to make as many commits as I can until the year ends.

Oh, yeah! :-)

First step to being a part of FOSS: OPW Intern

Here I am! Feels like I’ve come aboard a very pleasant ride on my way to contributing to FOSS and not just any open-source, to the Linux kernel, thanks to the OPW.

My internship starts today and consists in encoding and decoding QR codes for the Oops messages in the kernel. I will be updating from now on this blog (sort of like a diary) and I hope I can add useful information to it.

So, who am I? Skipping all the philosophical details that come with that sort of question, I am a open source, open education and technology enthusiast. Quite enthusiastic, aren’t I? Other that that, I am a Computer Science student at Politehnica University of Bucharest, Romania.

I have used Linux distributions through out my years of university (mind you, 3 have passed and this is my last) and I have been a Linux fan for quite a while, thus making me want to contribute in a way to it and now I have this wonderful opportunity.

All in all, I’ll grab this opportunity and make the best of it.