Wednesday, December 09, 2009

The kpkg issues

Hello everybody... Today I'm going to speak about kpkg.
Well, I wrote 4 package managers before I started kpkg: kwt-get, kum, kget and another one I can't remember.
With every package manager I wrote, things went from a very simple start (installing, searching and removing packages) to a more complex usage allowing several mirrors, dependency resolution out of the box, and finally the idea of third party packages (this last idea appeared in kpkg), package series, etc.
Let's analyze those 4 features:
1.- More than one mirror was something people asked for, but once I implemented it, there were never more than 3 or 4 mirrors out there.
2.- Dependency resolution is a huge deal, but the real thing is that under the hood it sucks. I see why big distributions with lots of developers have package managers supporting this feature: each of their developers takes 10 or 20 packages, builds them and fills the databases with the dependency tree. The fact is that this feature is badly implemented in every distribution. It shouldn't be something the developer has to fill in by hand (the database with its dependencies), it should be something auto-generated at build time, I don't know, a file listing the dependencies generated by configure (when using autotools) or something similar. I could spend like an hour talking about this feature since I made 3 different implementations of its resolution algorithm.
3.- Well, third party packages (also known as tpp). This actually wasn't a feature, it was a way to deal with an inconsistency in kpkg: packages were registered in mirror databases, so if one was repeated, it had to be a "third party package", and kpkg (without any environment variable altering its normal course) treated it as an unidentified package (or a package provided by a third party developer and not included in any mirror). Of course someone could deal with this using environment variables (MIRROR, STANDALONE and SERIE). So, without tpp, installing a package from the console without having it in a mirror and keeping the database clean was a huge deal. With tpp, kpkg tried to do, with the same tool, what Debian does with a separate tool (dpkg).
4.- Series was kinda one of those features that was supposed to help but didn't; in fact, it complicated things. If you start looking at a package system you think "Great, series, this is a way to pre-hash every package". Well, it kinda did that job, but it complicated everything else. Actually, this feature was one of those that pushed the "third party packages" "feature".
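
To make point 2 a bit more concrete, here is a minimal sketch of what a resolver has to do once it has the dependency data: collect everything a package needs and sort it so dependencies come first. This is not kpkg code and not one of the three implementations mentioned above. The package names and the DEPENDS map are made up for the example, and it assumes Python 3.9 or newer for the standard graphlib module.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency data: package -> packages it needs installed first.
# In a real system this is the part somebody (or some build tool) has to fill in.
DEPENDS = {
    "wget": {"openssl", "glibc"},
    "openssl": {"glibc"},
    "glibc": set(),
}


def install_order(target, depends):
    """Return target plus all its transitive dependencies, dependencies first."""
    needed = {}
    stack = [target]
    while stack:
        pkg = stack.pop()
        if pkg in needed:
            continue  # already visited
        needed[pkg] = set(depends.get(pkg, ()))
        stack.extend(needed[pkg])
    # static_order() raises graphlib.CycleError if packages depend on each other.
    return list(TopologicalSorter(needed).static_order())


print(install_order("wget", DEPENDS))  # ['glibc', 'openssl', 'wget']
```

The sorting is the easy part. The hard part described above is getting a correct DEPENDS map in the first place, which is why generating it at build time would be so much nicer than filling it in by hand.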

Except for dependency resolution (it was supported a few years back, but not anymore), kpkg supports all these features. By this time you might be asking "Why, god, why is this package manager so complex and twisted?"... Well, the fact is that even I hate kpkg nowadays.
Also, as most of you might know, kpkg is written in bash, which leaves us with a lot of issues. Why? Well, kpkg makes use of several console tools included in packages like coreutils (ls, rm, mktemp, md5sum, etc), findutils (find), sed, awk, grep, wget, tar, lzma, etc... Can you imagine what could happen if you try to update, for example, coreutils? Well, to understand this question and be able to answer it, you need to know some of the kpkg internals, but summarizing, what kpkg does is "Download the new package, remove the old package, install the new package", and almost every package manager does this (there are also other algorithms to deal with this, but you need to save more data about every file included in the package). Now, if you upgrade to a new version of, for example, coreutils, what kpkg does is: "Download the new coreutils, remove the old coreutils, install the new coreutils". Once the old coreutils is removed, the tools kpkg itself needs are gone, so kpkg starts failing pathetically, leaving your system out of use (sorry, toilet cleaning :-P), and we all have to pull the shades down and go home.
In fact this happened some weeks ago and a user came to me with insults, ranting and all the bad things you can imagine. Well, he was right about my mistake, but I'm human and I can make tons of those (and thank god we all keep making mistakes, it's proof that we are still human), and the fact is that no one is paying me for developing Kwort and all its tools, so take it easy.
Anyway, this leads us to the fact that we can't upgrade some (several) packages and, among them, of course, the libc.
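
As an illustration of the "other algorithms" mentioned above, here is a minimal sketch of an upgrade that never leaves a needed file missing: write the new files first, then remove only the files the new version no longer ships. It is not what kpkg does. The function name and the file lists are invented for the example, and it assumes the package manager keeps a per-package file list, which is exactly the extra data this approach needs.

```python
def upgrade_plan(old_files, new_files):
    """Plan an upgrade that overwrites in place instead of remove-then-install.

    old_files: paths recorded for the installed version (hypothetical data).
    new_files: paths shipped by the new version.
    """
    old, new = set(old_files), set(new_files)
    return {
        # Step 1: write every file of the new version, overwriting old copies.
        "write": sorted(new),
        # Step 2: remove only what the new version dropped.
        "remove": sorted(old - new),
    }


old = ["/bin/ls", "/bin/rm", "/bin/oldtool"]
new = ["/bin/ls", "/bin/rm", "/bin/newtool"]
plan = upgrade_plan(old, new)
print(plan["remove"])  # ['/bin/oldtool']
```

With this ordering /bin/rm is never absent during the upgrade, so a tool that depends on it keeps working. The price is that the file list of every installed package has to be stored somewhere.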

So, I'm dealing with all these problems by re-writing kpkg in what we can call "kpkg new generation" (LOL, it even sounds cool eh? :-P). How did I deal with all these problems? Well, some of those original "features" were dropped: "dependency resolution" was off the table way before starting, "series" support was removed since with the new database it isn't needed anymore, and third party packages were also dropped since the new approach is more consistent. What about the fact that you can't upgrade several packages? Well, kpkg is now written in C and statically compiled (so everything can be upgraded :-)).

The new kpkg uses sqlite3 as its database backend, libarchive for package decompression (giving us support for tgz, tbz2, tar, lzma, xz, zip, etc) and libcurl for package retrieval, which gives us support for tons of protocols. And every piece of the code is very well documented with doxygen, which will help everyone who wants to read the source code.
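
To give an idea of why a real database makes things like series unnecessary, here is a small sketch of a package database in SQLite. The schema, table names and helper functions are assumptions made up for this example, not the actual kpkg schema, and it uses Python's built-in sqlite3 module instead of the C API just to keep it short.

```python
import sqlite3

# In-memory database; the schema below is hypothetical, not kpkg's.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE packages (name TEXT PRIMARY KEY, version TEXT NOT NULL);
    CREATE TABLE files (package TEXT NOT NULL, path TEXT NOT NULL);
    CREATE INDEX files_path ON files(path);
""")


def register(name, version, paths):
    """Record a package and the files it owns in one transaction."""
    with db:
        db.execute("INSERT OR REPLACE INTO packages VALUES (?, ?)", (name, version))
        db.execute("DELETE FROM files WHERE package = ?", (name,))
        db.executemany("INSERT INTO files VALUES (?, ?)", [(name, p) for p in paths])


def provides(path):
    """Return the names of the packages that own the given path."""
    rows = db.execute("SELECT package FROM files WHERE path = ?", (path,))
    return [row[0] for row in rows]


def search(term):
    """Return (name, version) for packages whose name contains term."""
    rows = db.execute(
        "SELECT name, version FROM packages WHERE name LIKE ? ORDER BY name",
        (f"%{term}%",),
    )
    return rows.fetchall()


register("coreutils", "8.1", ["/bin/ls", "/bin/rm"])
register("findutils", "4.4.2", ["/usr/bin/find"])
print(provides("/bin/ls"))  # ['coreutils']
print(search("utils"))      # [('coreutils', '8.1'), ('findutils', '4.4.2')]
```

Lookups by name or by path are indexed queries here, so there is no need to pre-sort packages into series to find them quickly.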

So far, the new implementation of kpkg (in C) supports almost the same options (search, install, remove, provides, update and download) the current kpkg has; upgrade support is the only one missing (I hope I have time to code it this week). So I will try to upload the source to github as soon as I can, so everyone can start finding bugs in it and improving it.

See you soon guys, and if I don't see you before Christmas, Merry Christmas :-)

