Compiling the Linux kernel docs

In the last article, I said that compiling and installing source versions of software was akin to “going rogue”. I must confess that I have compiled from source and installed software that wasn’t in my distribution, most recently TexStudio, one of the larger projects I have attempted, requiring tons of other libraries and whatnot to be installed (or, quite often, compiled from source on the side), since it wasn’t part of the Linux distro I was using at the time. It also wasn’t part of Cygwin, so I compiled it for that too. It was a great way to kill an afternoon.

But there was a time when I compiled the kernel from source. It was necessary for me, as speed was an issue and I had slow hardware at the time. What I also had was a mixture of hardware pulled from different computers at different times. I researched the specs on the sound cards, network cards, video cards and motherboard chipsets, and knew what settings to tweak in the kernel compilation dialogs to get the kernel to do the right thing: be fast and recognize all my hardware. I was doing this before the days of modules, with the version 1.x kernels. It worked, and it was noticeably faster than the stock kernels. X-Windows on my 80486 PC ran quite well with these compiled kernels, but was sluggish to the point of unusable with a stock kernel running. Every few versions of the kernel, I would re-compile a new kernel for my PC, and pretty soon the tcl/tk configuration dialogs made things easy enough that I could answer all the questions from memory.

But then that all ended with version 2. Yes, I compiled a version 2 kernel from source, and yes, it ran OK. But it also had modules. The precompiled kernels were now stripped down and lean, and modules would be loaded only as needed, when the kernel auto-detected the presence of the appropriate hardware. After compiling a few times, I no longer saw the point from a performance standpoint, and today we are well into kernel version 5.3, and I haven’t compiled my own kernel for a very long time.

For the heck of it, I downloaded the 5.3 kernel, which uncompressed into nearly 1 gigabyte of source code. I studied the config options and the Makefile options, and saw that I could run “make” against a documentation target to build only the documentation. So that’s what I did.
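
For anyone who wants to reproduce this, the sequence went roughly as follows. The tarball URL and the “pdfdocs” target here are my best recollection rather than a transcript, and the PDF build assumes Sphinx and a LaTeX toolchain are already installed:

# fetch and unpack the 5.3 source (any kernel.org mirror will do)
wget https://cdn.kernel.org/pub/linux/kernel/v5.x/linux-5.3.tar.xz
tar -xJf linux-5.3.tar.xz
cd linux-5.3

# "make help" lists the documentation targets; "make pdfdocs" builds only
# the PDF documentation, leaving the kernel itself uncompiled
make pdfdocs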

It created over 8,500 pages of documentation across dozens of PDF files. And 24 of them are zero-length PDFs, which presumably didn’t compile properly; otherwise the pagecount would easily have tipped the scales at 10,000. The pages were generated quickly: all 8,500-plus of them, errors and all, in about 3 minutes. The errors seemed to manifest themselves in the associated PDFs not showing up under the Documentation directory. I have a fast-ish processor, an Intel 4770K (a 4th-generation i7), which I never overclocked, running on what is now a fast-ish gaming motherboard (an ASUS Maximus VI Hero) with 32 gigs of fast-ish RAM. The compilation, even though it was only documentation, seemed to go screamingly fast on this computer, much faster than I was accustomed to (although I guess if I am using 80486’s and early Pentiums as a comparison …). The output to standard error from the LaTeX compilation was a veritable blur of underfull hbox warnings and page numbers.

For the record, the pagecount was generated using the following code:

#! /bin/bash
tot=0
for i in *.pdf ; do
        # if the PDF is of non-zero length then ...
        if [ -s "${i}" ] ; then
                # pull the "Pages:" line out of pdfinfo and keep only the number
                j=$(pdfinfo "${i}" | awk '/^Pages/ {print $2}')
                # give a pagecount/filename/running total
                echo "${j}    ${i}    ${tot}"
                # tally up the total so far
                tot=$((tot + j))
        fi
done

echo Total page count: ${tot}
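
And since it was the zero-length PDFs that kept the total from hitting 10,000, a quick way to count them would be something like the following one-liner (a separate check, not part of the script above):

# count the PDFs in the current directory that are zero bytes long
find . -maxdepth 1 -name '*.pdf' -empty | wc -l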

The next step for Linux development

As you might know, there are nearly 300 Linux distributions (currently 289, which is low in historical terms), and this is a testament to how successful the Linux kernel has become on the PC, as well as on other devices, especially in relation to previously-existing *NIX systems, which have either fallen by the wayside or are barely existing in comparison. The *NIX system that comes in a distant second might be BSD UNIX.

Just earlier today, I observed that for the installation of TexStudio, for instance, there are two installation images for MS-Windows (all versions from Windows 7 on up), the only distinction being between 32-bit and 64-bit. On the other hand, there was a plethora of Linux images, depending on which distro of Linux you used. My distro is Ubuntu Studio, and I use Gnome as the desktop environment. The only Ubuntu-based Linux images were for Xubuntu (which uses Xfce).

It also seems to be necessary to compile a separate image each time a Linux distro is upgraded. The 19 images I counted for Xubuntu covered versions 14 through 19. Now, I understand that separate images need to be compiled for different processors, but most of these are for PCs running 32-bit or 64-bit processors. The same was true for each upgrade of Debian, Fedora, or openSUSE. And even then, they all needed separate binaries from each other. There are easily more than 50 Linux-based installation images you can choose from at the moment.

The “package” system that is now near-universal in the Linux environment provides a convenient way for sysops to assure themselves that installations can happen without causing problems on the system. Before that, you compiled most new software from source, tweaking system variables and modifying config files to conform to whatever you had on your system. This has since become automated by the “./configure” scripts or “make config” targets that most source trees have these days. In other words, modernization of Linux seems to mean increasing levels of abstraction, and increasing levels of trust that the judgement of a configure script trumps human judgement of what needs configuring for the source to compile. On a larger scale, trusting a package manager over our own common sense can be seen as working “most of the time”, so there is a temptation to be lazy and, when an installation fails due to a package conflict, just find something else to install besides whatever our first choice was. Installing software by compiling from source, once the rite of passage of any sensible Linux geek, is now seen as “going rogue”, since it subverts the package manager and, in a sense, takes the law into your own hands.
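
For contrast, the from-source ritual that the package manager replaced went roughly like this (the package name and install prefix below are placeholders, not a real project):

# unpack the source tarball (name is hypothetical)
tar -xzf some-package-1.0.tar.gz
cd some-package-1.0

# the configure script probes the system and writes the Makefiles;
# in older projects you edited a config file or the Makefile by hand instead
./configure --prefix=/usr/local

# build, then install outside the package manager's view
make
sudo make install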

Of course, Linux distributions still exist for the latter kind of Linux user. The foremost, in my opinion, is Slackware (if you screw up, at least something will run), and a close second is Arch Linux. It is my understanding that Arch Linux requires much more knowledge of your own hardware in order to even boot the system, whereas Slackware is likely to at least boot if your knowledge of the hardware is not quite so keen (but still keen). My experience with Slackware is in the distant past, so I am not sure what the norm is these days, although I understand they still use tarballs, which I remember allowed me to play with a package by un-compressing it into a directory tree away from where it was meant to be installed, to see what was inside before I committed to installing it. The tarballs are compressed nowadays with “xz” compression, giving the files a “.txz” extension.
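
If you want to poke around inside one of these packages the same way, something like the following still works with the .txz format (the package filename here is made up):

# list the contents of the package without extracting anything
tar -tvJf some-package-1.0-x86_64-1.txz

# or unpack it into a scratch directory, well away from /
mkdir -p /tmp/inspect
tar -xJf some-package-1.0-x86_64-1.txz -C /tmp/inspect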

But I digress. Getting back to installation images, it should not be too difficult for the people who manage these Linux distros to make it less necessary to have so many different images for the same distribution. In the MS-Windows example, only one version of TexStudio was needed across three or four different Windows versions. I am running Windows 7 with software that didn’t exist in the days of Windows 7, and with other software that originated from Windows 2000. All of it still runs, and runs quite well. Fixing this problem is hopefully doable in the near future.