IMG_0779

Thoughts on my first LinuxCon

Last week I had a great time at LinuxCon Europe in Dusseldorf and I want to share my thoughts not only about the presentations I went but also about the general atmosphere and the city ’cause it was a very whole experience for me – perhaps I’ll do the city talk in another post.

It was my first time ever participating in a conference so I can’t really compare this experience to anything else, but it was a very big event and the organizers were really well prepared. The overall content was very varied and most of the time I was interested in more than one presentation during the same time interval, which meant I had to carefully choose my schedule.

Keynotes

Linus and Dirk

Of course I was excited about the Linux: Where are we going keynote that features Dirk Hohndel and Linus Torvalds. It was a very fun and light talk. Compared to the rest of the presentations I went to, it wasn’t all that technical. Somebody raised the issue about the way the community can be rude or mean – I never experienced this, but there are certainly times when it is so – and I guess there are arguments on both sides. This was discussed in the Kernel Developer Panel as well.

Jono Bacon
Jono Bacon’s talk, Building Exponential Communities, was one the best out of the all the talks at LinuxCon! He very nicely captures the essence of building a (open-source) community: a level of empowerment, a sense of purpose and a sense of belonging. Couldn’t have it any better, really. I am not really into community management but these ideas are still relevant to me: even as a small piece of the puzzle it is very important to understand the foundations of communities in open-source.

#1

On my first day, I got only 4 hours of sleep ’cause I was very excited. I had rehearsed the night before my presentation in my hotel room. The walls were kinda thin and I tried not to talk really loud, but hey, who could blame the other hotel guests for mistaking my murmurs with a broken record player or some magic mantric incantations – sorry, guys.

Enhancing Real-Time Capabilities with the PRU - so PRU stands for Programmable Realtime Unit and it’s a very interesting architecture TI proposes. Basically, you want to get real-time operations over fast (that means very little latency) and deterministic so that’s why you have this specialized processing unit but you also get to connect that unit to another processing unit, an ARM processor in order to take care of the more complicated or more resource consuming operations. The very interesting part I guess is how the two parts communicate and there’s one from PRU to ARM which is sending interrupts and then from ARM to PRU who does polling.

Technical Feasability Study: Is Linux Ready for Medical Devices - the short answer is yes, but no. That means that you get comparable performance with Linux, but there may be other factors to take into account when choosing Linux for your products – for example, the technical support a commercial OS offers. Check the presentation out as it has more figures and goes through the steps in more detail.

Karen and Sarah (btw, they were really cool!) proposed we do a mockup presentation and that was really helpful. Last time I spoke in front of people was during my graduation project’s presentation and that didn’t go very well… I also met a part of the OPW interns and saw a preview of their presentations – really nice work! The rehearsal lasted about 45 minutes so I still had time to go to another presentation. I went to Christoph Lameter’s talk about slab allocators but as the time for my talk drew nearer I just couldn’t focus anymore. The Kernel Internship Report was really well done. I liked the fact that Sarah had some interesting statistics about the interns overall impact on making the Linux kernel cleaner. I was very nervous up until my time to talk. And then I saw we had a microphone. Don’t know why but microphones intimidate me thinking my voice would sound funny. Just when I started talking I managed to look in the crowd and find some familiar faces and calmed down a bit. At one point, I did utter a Romanian word (but I think it was in a very silent voice and nobody noticed). Overall, I think I managed to get my point across. You can also find my talk there. :)

#2

The first talk of the second day was also from ELCE and it opened a new perspective for me. Do you ever think of how the user interacts with the API or the tool you build, how user efficient is it? Well, the talk Building tools from the outside in: bringing user-centered design to embedded Linux tackles this aspect. Belen is an interactions engineer and it was the first time hearing about such a job (guess I’m still clueless). It was a very dynamic presentation and it made very clear why and how the Yocto guys build the Toaster project.

The Chromium OS audio system talk was moderately intelligible to me because I wasn’t familiar with how any audio system works but managed to get an idea which was the whole point. I went to another talk (couldn’t find the slides, but I found an article) who admittedly sounded rather promising but unfortunately the project was still in its initial stage – hope to hear more about this initiative and its results.

I didn’t know much about the big.LITTLE architecture and the Cut power consumption by 5x without losing performance: a big.LITTLE software strategy presentation was a very nice way to find out about it – like how it can be improved. The measurements were made for rendering scenes from browser, a very specific scenario.

Josh explained really well the whole stack of the OS in the Chrome OS internals talk. Although there was a lot of information for me to process, the thing that I found very distinguishable is the security approach Chrome OS takes. It is a very well structured presentation and worth checking out!

#3

High-speed data acquisition with the Linux IIO framework - I am starting to work with IIO and this was a great way of understanding the capabilities of the IIO framework. I mainly looked at how sensors can use IIO (usually having very little data to transmit when a hardware interrupt occurs) but never at how large streams of data could be sent using IIO and DMA.

How to design a Linux kernel API - this was very instructive and Michael Kerrisk really made a great point: it is really important to expose coherent and documented interfaces.

Open Source: A Job and an Adventure - there’s a lot of jobs in open source, not just developers. That’s a thing I haven’t really thought about before. Dawn talked about different ways to enter the open source community and get to actually work in open source. At the end of the talk though I felt a little bit discouraged by the fact that all these people had been a part of open source when it wasn’t that widely used and they were really avant-garde. One of those times when I wish I was born in another period – like when I want so bad to be at a Rush (yes, the Canadian band) concert in the 80s.

The best part of the closing event was the Spaceteam game. Basically you and your friends are part of a spaceship crew trying to go to hyperspace. Each one has only a part of the control system and has to yell out the instructions they see on the screen and cannot be activated using their own control panel ’cause another teammate has that function. The instructions were not always very scientific or serious – things like ‘dry clothes’ or unutterable things – and sometimes the scene looked like utter chaos but these things made it look more fun.

Kernel space: Oopses, printk and config menu

I am entering my final step in the internship with the encoding almost finished (there are still some problems to consider and testing to be done, especially on machines with multiple CPUs). This article will be rather short as I am focusing on my coding now more.

Updates

I have used the library libqrencode after realizing that it was actually a LGPLv2.1 (I thought it was LGPLv3) and that kind of license can be transformed to a GPLv2. If it were a LGPLv3 it wouldn’t have been compatible with a  GPLv2. So what I did was translate to kernel space and use most of it as it is.

As I left the license transformation for later on I put all my work on gitlab because on github I would have been posting publicly (I don’t have private repos as an option for github) a code without the proper licensing.

Oops in depth

What I learned as a general thing is that the best way to understand something is by reading documentation, reading the source code and testing it. These were the steps I followed. As I mentioned before, an Oops occurs when you’re doing something illegal like a ‘NULL pointer deference’. The kernel may or may not continue after an Oops occurs (or a panic for that matter).

How does it actually work? It is a rather simple mechanism in principle. An Oops has several information useful for debugging like call trace, stack trace, eip, etc. The kernel dumps information to the console about these using ‘printk’ calls. For example, I did an inspection of the x86 dumpstack file. The architecture dependent files are located in the arch folder with a folder for each one, so here’s the link on lxr.free-electrons.com: http://lxr.free-electrons.com/source/arch/x86/kernel/dumpstack.c

In order for the ‘printk’ to work on multiple CPUs is by using a lock for exclusive access to the console. All this is done in the printk.h in the architecture independent part of the kernel. The idea I had was to use a configuration option so that each printk call during an Oops has a lateral effect of writing into a pagesize (4K) static buffer. Then, after the Oops handling is over and the last printk call (it ends with a End of Call Trace string) is done, the buffer is encoded and written.

Config menu

Another useful thing in the kernel is the configuration menu. Before compiling the kernel, you write the following command in your terminal

make menuconfig

You then enter a UI that lets you choose several options for compiling the kernel. You can see debug information, driver options, certain security information and many more. You select to include on by pressing the space bar and by pressing Enter you may navigate a submenu. For some options you can select it and there appears a ‘*’ before that option, deselect it (you have a ‘-’ before it) or insert it as a module ‘M’.

You can include your own option, as I did with my QR library option by adding information in the Kconfig file. So for example, right now my Kconfig file inside the lib/ directory (this is where the library folder is) has these following lines:

config QRLIB
    bool "QR encoding library"
    help
     QR encoding library

I can then use the CONFIG_QRLIB to test if this option has been enabled.

In order for the library to compile I had to write a Makefile inside the directory and then add a hook for it in the lib/ directory’s Makefile. The way the kernel builds is by having a sort of hierarchical Makefile (it is rather normal, if you think how all big projects are compiled, but the kernel has some additional scripts which is cool).

Going forth

The next step is doing the compression. This is rather delicate, but I hope to come with a good solution. There are several solutions of compression already in the kernel. I will test these first.

Reed Solomon & Kernel?

A lot of time has passed since I updated the blog so this will be a fairly big post. I’ll start with the next steps on implementing the QR code and then some kernel programming stuff.

Reed Solomon

When doing the encoding on a bit stream, in order to ensure that it can be decoded properly. By reading the error correcting bits, the decoder can determine if it read the data and correct it if the percent of errors are recoverable. There are four levels of error correction listed in the table below depending on the amount of bits used for the error correction. The higher the amount of recoverable data, the larger the QR will be.

Error Correction Level Error Correction Capability
L Recovers 7% of data
M Recovers 15% of data
Q Recovers 25% of data
H Recovers 30% of data

So, as I pointed out in my earlier post, each QR has a version level, and now a error correction level. Based on the input and the output size, the QR encoder decides which version and error correction it can use and how many bits it needs to encode. To get an idea on you can check out this table here.

Reed Solomon does two things: consider the bit stream input as coefficients for a polynomial function, generates another polynomial function, divides the two and uses the remainder as the error correction codewords. The original data codewords and the error correction codewords are interleaved based on the version of the QR resulting in the final bit stream. The final step is putting them in a function pattern, that’s a portion of the QR’s matrix where the decoder expects to read a certain part of the code in order to decode it to data. That’s just a very sketchy idea, but I’ll detail in the following sections.

I will explain some of the steps in the error correction process, but you can find it more detail in this tutorial. It is really nicely explained and clearly, even for someone who has forgotten algebra.

Message polynomial

The message polynomial uses the output of the data encoding part as coefficients. So, as an example, if the codewords, converted to integers, were 110, 118, and 35, the message polynomial would be 110x2 + 118x + 35.

Generator polynomial

The generator polynomial starts from a 2-codeword and that’s x2  + 3x1  + 2x0. This comes from multiplying the (x-1) which is (x-20) and (x-2) which is (x-21). So, can you see the pattern? This is done recursively, so that at each step N you multiply (x-2N-1) with the gN-1(x) in order to get the gN(x), the generator polynomial for N error correction codewords.

Galois field and arithmetic

This gets really mathematical. The main point here is that the exponents fall under certain arithmetic rules, so when you get an exponent (at division) that’s greater or equal to 256, you apply modulo 255 before doing anything else.

Division steps

(taken from the tutorial)

  1. Find the appropriate term to multiply the generator polynomial by. The result of the multiplication should have the same first term as the the message polynomial (in the first multiplication step) or remainder (in all subsequent multiplication steps).
  2. XOR the result with the message polynomial (in the first multiplication step) or remainder (in all subsequent multiplication steps).
  3. Perform these steps n times, where n is the number of data codewords.

Structure the final message

For smaller QRs you leave the remainder from the division as they are. But for large QR outputs you  blocks are interleaved. Then the error correction is added to the interleaved list of blocks. There is also a certain placement in the matrix you have to follow according to the standard, you can find them detailed in the standard or explained in the tutorial.

Anecdote about Galois

The ‘Galois field’ got me to exercise my brain a bit and remember the algebra I did in high school. I had a very nostalgic feeling about the name Galois and then I remembered the stories my math teacher told. Galois had a very interesting story and romantic, as he died in a duel and even on his deathbed he wrote the researches he had been carrying (even during the time he spent in prison!) that would later be published. He was a fervent mathematician with a powerful imagination. Well, our teacher said he died because of a duel for a young lady, but when I read about it on the Internet it was not the case. Either my memory is playing tricks on me or my teacher was trying to make a point (I wonder). Another thing about him is that he failed entering l’Ecole Polytechnique two times and some say that it may have been that he was far more qualified than his examiner. According to this “Legend has it that Galois, who worked almost entirely in his head and who was poor at presenting his ideas verbally, became so enraged at the stupidity of his examiner that he hurled an eraser at him.“. Either way, he managed to get recognized and publish articles, but his life remained tumultuous (as his genius) until the very end.

Reed Solomon in the Kernel

Well, after going on about how it’s done I found out from PJ (when looking where the QR library should be located in the kernel’s code) that Reed Solomon is actually already implemented in the kernel as a module and you can find the code on http://lxr.free-electrons.com/source/lib/reed_solomon/.

The Reed Solomon error correction code (according to wikipedia and another tutorial) is used in:

  • Storage devices (including tape, Compact Disk, DVD, barcodes, etc)
  • Wireless or mobile communications (including cellular telephones, microwave links, etc)
  • Satellite communications
  • Digital television / DVB
  • High-speed modems such as ADSL, xDSL, etc

It has many applications and could be used in driver implementation.

Thank you for reading so far, I know I haven’t been updating my blog as often as I should/could, but it’s my first time keeping a blog and I don’t have the discipline to do it, yet. I also had a “holiday” for my exams, so I had to focus on that for a week. This article has been in the back of my mind for a while now and I wanted to clear it out now, but next time it will get more Linux than Galois. :)

QR codes

As I’ve said in my previous post, I’ll keep this blog as a way to track my progress so far.

QR codes

Before going into the problems and the solutions I came up with after a couple of weeks into the internship, I’ll try explaining what a QR code is.

QR codes, as explained on wiki, are 2D barcodes. As of quite recently (meaning two years) they have become a popular way of advertising (I know I’ve seen them all over the place), but they were actually created to track vehicles during manufacture by Denso Ware, Japan. You can decode them with a smartphone app or a camera and a program that analyzes the camera’s sensor input. Usually, they lead to a URL or display a text.

There are four input modes: numeric, alphanumeric, binary/byte and kanji. Since Oops messages have characters that are not included in the alphanumerical set, the input mode is byte. The maximum number of characters for this input mode is 2,953. This may seem quite nice, but in reality there are two problems: kernel Oops messages are quite large and even so, we wouldn’t want to generate a QR code the size of the screen.

For example, let’s take a kernel Oops message on a x86 architecture (taken from here)

BUG: unable to handle kernel NULL pointer dereference at 00000000
IP: [<00000000>]
*pde = 00000000
Oops: 0000 [#1]
Modules linked in: stap_25635fa2767efdc6a286d47c0380042a_404(+) iTCO_wdt
iTCO_vendor_support sg intel_agp floppy e100 i2c_i801 agpgart rng_core button
pcspkr mii sr_mod cdrom i2c_i810 i2c_algo_bit i2c_core ata_generic ata_piix
libata dock sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ehci_hcd [last unloaded:
microcode]Pid: 1830, comm: staprun Not tainted (2.6.26-rc5 #1)
EIP: 0060:[<00000000>] EFLAGS: 00010246 CPU: 0
EIP is at 0x0
EAX: 00000000 EBX: df778e88 ECX: 00000000 EDX: df778e88
ESI: e0973568 EDI: c11a0e04 EBP: e0973700 ESP: df778df0
 DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process staprun (pid: 1830, ti=df778000 task=df79dc80 task.ti=df778000)
Stack: e092b053 ffffffe1 74737973 61746d65 df400070 00000005 c04ac269 df778e58
       00001360 00000000 00001709 00000005 df40a6c0 df407000 00000002 c13dce98
       c13dce80 00000672 70617473 3635325f 61663533 37363732 63646665 38326136
Call Trace:
 [<e092b053>] _stp_transport_init+0x2b7/0x43b
[stap_25635fa2767efdc6a286d47c0380042a_404]
 [<c04ac269>] ida_get_new_above+0xc2/0x17d
 [<c0433103>] sys_init_module+0x1599/0x16fb
 [<c04665b2>] do_path_lookup+0x154/0x199
 [<c04af285>] simple_strtoull+0x0/0xf0
 [<c040371e>] syscall_call+0x7/0xb
 =======================
Code:  Bad EIP value.
EIP: [<00000000>] 0x0 SS:ESP 0068:df778df0
---[ end trace df61255cfec2f882 ]---

And here is the QR code: http://tinyurl.com/nu78673. This is still readable on a phone app, but it is a very complex QR code so depending on the camera’s sensors it might get tricky to read them.

Existing libraries for QR encoding

The most popular implementation in C is libqrencode. There are several others that are not as complex as this one or use this one and write to framebuffer (which is what we need).

Unfortunately, the libqrencode is under LGPL (GNU Lesser General Public License) and it is not compatible with GPL (GNU General Public License), which is the license Linux kernel is under. So, either way I will have to implement a similar library tailored to the use of this project.

LE: Actually, the LGPLv2.1 is compatible is code in the Linux kernel that is GPLv2. The third version of LGPL is not compatible with GPLv2.

The implementation will have to follow one of standards of encoding data into a QR and the one I will follow is ISO/IEC 18004. I have started looking through the specifications of this standard, though very generally, since the pdf file containing the standard’s description has quite a lot of pages… I will follow up with a post on the algorithm used on encoding data soon.

QR codes and Oops messages

As I hinted above, Oops messages are quite large and encoding them is problematic. The whole idea with QR codes is that these messages don’t fit on a screen and are difficult to read so a gigantic QR code is out of the question. Plus the QR code’s character limit would be reached. That is why before the encoding is done the Oops has to be compressed. Since the input type of the QR encoding is fixed, the compression of the original Oops message should result in one of the input modes, in our case, byte.

So, what options do we have here? First thing that comes to mind is compressing the Oops message and encoding it with a binary-to-text encoding method (that’s only if we are keen on having the input as text).

As an example, let’s take the above mentioned Oops message. It has 1446 characters. After compression with gzip and encoding with base64 we get a string of 1139 characters.

 H4sIAI/kulIAA41UbWvjOBD+Lrj/MLBf2ttrT5Zs2THXg22aXRZ6u4Fs4SAEI0tyootjGVvpZv/9
 jRSn7cJ9OEHCjOaZ0TNv/uX+6VMJx07WrQHvYCc7jdLeDJ1p4cvT4yP0znbeDKDNYBr8dcqA9ECn
 Qz4vS1j/cVH/3JBfe23g7hXw1fVjGVVYv0s25C+nj60ZobXd3miwXQmjl33FMsGzRrJc5KbRSkhW
 CJ3mivKC0pTJKqXp1ftrsN/mX6vv2pMoPJtOu6Eaj33vBg/jFgLdtpLbHprW9f0PMAk+bZmqbEET
 QMNWInLotpVyg4H66L3rSK/Gfj/AwVoYh+rgNCg9uMPkmJwjyHbrqtr6qERv6WW1NZ0ZrIpyb+2J
 tLZGGbRTexh1DDaq0UbBnDyHf2oNh1pJtTNw3Clb7RRaLsK6laPHrrROaqNLcrBqcMppsyFkaXUJ
 ScHpb6Dc4XCu3XDs4Ivz4GVIXsMVuxW3TNwMKoN3yTVZhCZRKmj5tlOw+Pj44dMq9iahLBUwXz6h
 FuBgx9jlE2of/i5fugmLe9R0k+eFKQpYzH+yPbyxkcXqcwmGznKeCUQ+oKaSRFJDU4yynGx58Fst
 Jz/dUAIPkVJe4/0kfFxNA3RmyzmsoiAKssTKmHF8KcNV/1ogb+9i1ODp5bgP2kyr4qzdvjFfk5WX
 ah85sZpmHJp4TAJ5iiSRKIgkT4UWGTJN0SWnl8QzUDSVionZlHxWEDifYE64oK81ijc5nb06h3BS
 KBqF/ILBw7BeXCszewl31osJIXIGOQ20kB3H5eEsa5CmCCLe5HiXcwaCi1QIJM4LzgTSIXPZtvBt
 kMqUBJf3kjSORDX6vvKD7MawTpXtrH9PT6zOf6enlNdk/b9XdRMiXwqDka0Oi+KrznyvZO2eDcZV
 DMMmuZ6gnCc0kBh/jPHlsC/4qUBgks1mASqaesJiPjVDrHZVL/2uap3bH/sITQNyNru837AiC0Ht
 oW8N5jd4d2xbRFLEhXmLMMrzxJzfVlidKvwhJuSNT9799yFzXMoS4F5qCEvzLNujuT2v20+bho+F
 kcVBj1Nbvgz7zc3NGvATBj50A0dAJCzLVGMUa4qCwQYB5F8scBlFpgUAAA==

That’s not very impressive, it’s a 1.26 compression ratio (that’s uncompressed_size / compressed_size), but a better compression tool should be used. There are other compression algorithms that specialize on compressing text input with a better compression ratio. As a matter of fact, Oops messages are similar with log files, since they have repeating patterns/substrings. Log files are text files with the best compression ratio as described on maximumcompression.com.

Conclusion

For now, I’ll be implementing the QR encoding library (but only the byte input mode) after taking a good look on the algorithm and standard. I’ve set up a github repository and hope to make as many commits as I can until the year ends.

Oh, yeah! :-)

First step to being a part of FOSS: OPW Intern

Here I am! Feels like I’ve come aboard a very pleasant ride on my way to contributing to FOSS and not just any open-source, to the Linux kernel, thanks to the OPW.

My internship starts today and consists in encoding and decoding QR codes for the Oops messages in the kernel. I will be updating from now on this blog (sort of like a diary) and I hope I can add useful information to it.

So, who am I? Skipping all the philosophical details that come with that sort of question, I am a open source, open education and technology enthusiast. Quite enthusiastic, aren’t I? Other that that, I am a Computer Science student at Politehnica University of Bucharest, Romania.

I have used Linux distributions through out my years of university (mind you, 3 have passed and this is my last) and I have been a Linux fan for quite a while, thus making me want to contribute in a way to it and now I have this wonderful opportunity.

All in all, I’ll grab this opportunity and make the best of it.