Friday, June 27, 2008

Does Gentoo make sense?

When I mention to colleagues in the IT industry that compiling packages before installing them on a computer is a good thing, they give me either a blank look or an ever so slight smirk. What is the point of wasting many hours waiting for a package to compile instead of fetching the binaries from a repository and having them installed in seconds?

Come to think of it, I am actually writing this while waiting for Gentoo to upgrade GCC from 3.4.4 to 4.3.1. It may not sound like much, but it's actually a big deal. GCC is probably the package that takes the longest to build, on the order of two hours, even on recent dual-core 64-bit machines.

Portage, Gentoo's package management system, installs a package, say X, by fetching X's sources from some repository and then building X from those sources. For example, if package X is written in C, Portage will compile the sources and then link the resulting object files into an executable program. As mentioned previously, this process of building from sources can take from minutes to several hours depending on the package and its dependencies. Note that if package X requires packages A, B and C, and B requires D and E, and D requires F, Portage will build A, B, C, D, E and F in the correct order, that is, each package only after the packages it depends on.
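To make "the correct order" concrete, here is a minimal sketch of a depth-first dependency ordering for the X example above. The function name build_order and the dictionary layout are made up for illustration; this is not Portage's actual resolver, which also has to deal with versions, USE flags and much more.

```python
# Direct requirements from the example: X needs A, B and C;
# B needs D and E; D needs F. Packages not listed need nothing.
requires = {
    "X": ["A", "B", "C"],
    "B": ["D", "E"],
    "D": ["F"],
}

def build_order(requires, target):
    """Return the packages needed for `target`, dependencies first."""
    order = []        # finished packages, in build order
    visiting = set()  # packages on the current path, to detect cycles
    done = set()      # packages already placed in `order`

    def visit(pkg):
        if pkg in done:
            return
        if pkg in visiting:
            raise ValueError(f"circular dependency involving {pkg}")
        visiting.add(pkg)
        for dep in sorted(requires.get(pkg, ())):
            visit(dep)            # build every dependency first
        visiting.discard(pkg)
        done.add(pkg)
        order.append(pkg)         # only now is pkg itself buildable

    visit(target)
    return order

order = build_order(requires, "X")
print(order)  # ['A', 'F', 'D', 'E', 'B', 'C', 'X']
```

Any order in which F comes before D, D and E come before B, and everything comes before X would be equally valid; the sorted() call is only there to make the output repeatable.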

Clearly, building from sources is much slower than fetching the binary package. But building from sources implicitly checks that the required dependencies for the package under construction are available. If X requires A, B, C, E and F, and any of those five packages is missing, then X won't compile and hence will not install. Thus, if Portage is able to install X, then you can be fairly confident that it is installed correctly on your system. Of course, you would still need to configure X according to your needs, but as far as the binaries of X and its dependencies are concerned, you are reasonably safe.
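The argument above can be reduced to a toy model: the build step acts as a gate, and a package only lands in the installed set if everything it needs is already there. The names build_from_source and MissingDependency are invented for this sketch, and the "build" is just a membership check, not a real compile.

```python
class MissingDependency(Exception):
    """Raised when a pretend build cannot find a required package."""

def build_from_source(package, requires, installed):
    """Pretend to compile `package`; refuse if a requirement is absent."""
    missing = [dep for dep in requires if dep not in installed]
    if missing:
        raise MissingDependency(
            f"{package}: cannot build without {', '.join(missing)}"
        )
    installed.add(package)  # reached only when the build succeeded

installed = {"A", "B", "C", "E"}  # F has not been installed yet
try:
    build_from_source("X", ["A", "B", "C", "E", "F"], installed)
except MissingDependency as err:
    print(err)  # X: cannot build without F
```

Because the failure happens before X is added to the installed set, the mere presence of X on the system tells you its requirements were satisfied at build time.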

Contrast this with installing binary packages. You can never be sure that you are not missing a library or that an installed library does not have a conflicting version. Conceptually, Gentoo vs. Ubuntu is analogous to compiled and statically typed languages, e.g. C++ or Java, versus interpreted and dynamically typed languages, e.g. Python or Ruby.

Interpreted and dynamically typed languages enjoy a shorter development cycle but are somewhat more brittle, whereas compiled and statically typed languages have a slower development cycle but are often deemed more reliable.

Another analogy would be an RDBMS enforcing data integrity constraints, e.g. MySQL+InnoDB, versus an RDBMS ignoring data integrity constraints, e.g. MySQL+MyISAM.
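The database analogy is easy to try out without a MySQL server. The sketch below uses SQLite from Python's standard library as a stand-in: with foreign key enforcement switched off it plays the permissive role, and with it switched on it plays the strict one. The table names and the insert_orphan helper are made up for the example, and it assumes an SQLite build with foreign key support, which is the usual case.

```python
import sqlite3

def insert_orphan(enforce):
    """Insert a row whose foreign key points at a row that does not exist."""
    conn = sqlite3.connect(":memory:")
    # SQLite only checks foreign keys when this pragma is switched on.
    conn.execute(f"PRAGMA foreign_keys = {'ON' if enforce else 'OFF'}")
    conn.execute("CREATE TABLE package (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE dependency ("
        " package_id INTEGER REFERENCES package(id))"
    )
    try:
        # No package with id 42 exists, so this reference dangles.
        conn.execute("INSERT INTO dependency (package_id) VALUES (42)")
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"
    finally:
        conn.close()

print(insert_orphan(enforce=True))   # rejected
print(insert_orphan(enforce=False))  # accepted
```

The strict mode refuses the dangling reference up front, much like a build that fails on a missing dependency; the permissive mode accepts it and leaves the problem to be discovered later.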

As it stands, Portage is still building GCC.