Compiler Frontends: distcc

This was written by David Cantrell and given on Sat Sep 06 2003.

1. Introduction

Distcc is a client/server program that distributes the compilation of C, C++, Objective-C, or Objective-C++ code across machines on a network. It does not require a shared filesystem mounted on each host, nor does it require the hosts to run the same operating system or to have the same library versions or headers installed.

Since distcc is a drop-in tool for current build systems, you do not need to do any special reconfiguration of your network to get it working. You simply invoke distcc as your C compiler rather than cc or gcc and it takes over from there.

Distcc was written by Martin Pool.

2. Installation

Before you can use distcc, you must install and configure it on each machine that you want acting as a "volunteer" on your network. At the time of writing, the current version is 2.7.1. Download the source code, compile it, and install it:

wget http://distcc.samba.org/ftp/distcc/distcc-2.7.1.tar.bz2
bzip2 -dc distcc-2.7.1.tar.bz2 | tar -xvf -
cd distcc-2.7.1
./configure --prefix=/usr/local
make
make install

Most distributions offer precompiled distcc packages, so you may want to go that route.

3. Configuration
3.1. Run the Server

On each host that you plan to use as a volunteer, you need to run distccd to accept incoming distcc connections. Run this command (or one similar to it), replacing the network with the range of client addresses you want to allow:

distccd -a 192.168.1.0/24 --daemon

Note: distccd will not run compile jobs with root privileges. If you start it as root, it switches to an unprivileged user, so make a user for it. By default it wants to run as 'distcc'.
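As a minimal sketch of the step above, the daemon start can be wrapped in a small shell function so the same command is used on every volunteer. The function name start_volunteer and the DRY_RUN switch are hypothetical helpers invented for this example; the distccd options used (--daemon, --user, --allow) are the long forms of the ones discussed above, and the user name 'distcc' is assumed to exist already.

```shell
# Start distccd for one client network (sketch, not a full init script).
# With DRY_RUN=1 the command is printed instead of executed, which is
# handy for checking the arguments before running it for real.
start_volunteer() {
    subnet=$1
    # Build the command line once so the printed and executed forms match.
    set -- distccd --daemon --user distcc --allow "$subnet"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

Calling `DRY_RUN=1 start_volunteer 192.168.1.0/24` prints the command without starting anything; dropping DRY_RUN runs it.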

3.2. Set the Host List

On each machine from which you plan to start builds (the client), you need to configure the list of available distcc hosts. Start with localhost and then list the remote hosts. The syntax is host[:port][/maxjobs] , where host is a hostname or an IP address. Here is an example listing (from my laptop):

localhost/2
warp.burdell.org/3

This tells distcc, when run from my laptop, to spawn no more than two jobs on localhost and no more than three jobs on warp.

You have two choices for the location of the host list. One is in the DISTCC_HOSTS environment variable, like this:

export DISTCC_HOSTS="localhost/2 warp.burdell.org/3"

The other choice is to create a /usr/local/etc/distcc/hosts file with one host entry per line. I use the file method, but both methods work just fine. The file I mention here is the system-wide one. You can also have a per-user configuration file in ~/.distcc/hosts . The format is the same. Note that the DISTCC_HOSTS environment variable, when set, overrides whatever you have in the hosts files.
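One number worth deriving from the host list is the total count of job slots, since that is a reasonable starting point for the -j value you pass to make. The following is a minimal sketch: count_slots is a hypothetical helper written for this article, it only understands plain host[:port][/maxjobs] entries, and it counts an entry without a limit as one slot, which is an assumption of the sketch and not necessarily what distcc itself does.

```shell
# Sum the /maxjobs limits in a distcc-style host list.
# Assumption for this sketch: an entry without "/N" counts as one slot.
count_slots() {
    total=0
    for entry in $1; do
        case "$entry" in
            */*) limit=${entry##*/} ;;   # text after the last slash
            *)   limit=1 ;;
        esac
        total=$((total + limit))
    done
    echo "$total"
}

# Example with the host list from my laptop.
count_slots "localhost/2 warp.burdell.org/3"
```

For the example list this gives five slots in total, which matches the -j 5 used in the kernel build later in this article.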

4. Compiling

When you compile C or C++ code, the build goes through four main stages:

  • Generating preprocessed files from source and headers.
  • Compiling to assembly instructions.
  • Assembling to object files.
  • Linking object files and libraries.

The only stages that are sent to the volunteers are compiling and assembling. All preprocessing and linking are done on the main node, that is, the node where you invoked the job.
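To make the four stages concrete, here is a minimal sketch that runs them one at a time with the standard gcc driver options -E, -S, and -c. The function name build_in_stages and the file names are made up for this example, and it assumes a gcc-compatible compiler is installed. It does not use distcc at all; the comments only mark which stages distcc would hand to a volunteer.

```shell
# Run the four build stages separately for a single C source file.
# Usage: build_in_stages hello.c   (creates hello.i, hello.s, hello.o, hello)
build_in_stages() {
    src=$1
    base=${src%.c}
    cc=${CC:-gcc}    # assumption: a gcc-compatible compiler driver

    # 1. Preprocess: expand headers and macros (stays on the main node).
    $cc -E "$src" -o "$base.i" || return 1
    # 2. Compile the preprocessed source to assembly (sent to a volunteer).
    $cc -S "$base.i" -o "$base.s" || return 1
    # 3. Assemble to an object file (sent to a volunteer).
    $cc -c "$base.s" -o "$base.o" || return 1
    # 4. Link the object file into an executable (stays on the main node).
    $cc "$base.o" -o "$base"
}
```

Because only the preprocessed file crosses the network, the volunteer never needs your headers, which is why the hosts do not have to share them.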

The easiest way to invoke distcc is as an override to the CC variable that most Makefiles and configure scripts honor. For example:

CC="distcc" ./configure --prefix=/somewhere
make -j 47

If you are compiling C++ code, you can set CXX="distcc g++" . When no compiler name is given, distcc defaults to the regular C compiler, which is why you do not need to specify it for the CC variable.
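If the same build has to work on machines with and without distcc installed, the CC override can be chosen at run time. This is a minimal sketch under that assumption; pick_cc is a hypothetical helper name, and gcc is assumed to be the underlying compiler.

```shell
# Print the compiler command to use: distcc in front of gcc when distcc
# is installed, plain gcc otherwise.
pick_cc() {
    if command -v distcc >/dev/null 2>&1; then
        echo "distcc gcc"
    else
        echo "gcc"
    fi
}

# Show what would be used on this machine.
echo "CC would be: $(pick_cc)"
```

The result can then be passed along in the usual way, for example CC="$(pick_cc)" ./configure --prefix=/somewhere .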

Another way to invoke distcc is manually. Instead of:

gcc -O9 -fsuper-fast -march=mysystem -o proggie proggie.c

You can do:

distcc gcc -O9 -fsuper-fast -march=mysystem -o proggie proggie.c

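That's about it for invoking distcc. Not much to it. One caveat: distcc only distributes compile steps, so a command that compiles and links in one go, like the one above, ends up running locally. Builds that compile each file with -c and link separately, as most Makefiles do, get the benefit.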

5. Real-World Example: the Linux Kernel

Something we're all familiar with is compiling the Linux kernel. We're also all familiar with the fact that it can take some time, even on speedy machines. I did a comparison of kernel compiles, one using distcc and one not using distcc, just to get some numbers. The setup:

  • Computer : iBook2 500MHz G3, 384MB RAM
  • Kernel Version : 2.4.21
  • gcc version : 3.2.3
  • Volunteer : 2x 1GHz G4, 512MB RAM

I did a timed run of the build without using distcc first, then I did it with distcc. Here is the script I timed on each run:

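#!/bin/bash
# Usage: pass the compiler command (gcc or distcc) as the first argument.
# Start from a clean tree using the configuration saved in /boot/config.
make mrproper
cp /boot/config .config
make oldconfig
make dep clean
# Build the kernel image and the modules with five parallel jobs.
make MAKE="make -j 5" CC="$1" vmlinux
make MAKE="make -j 5" CC="$1" modules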

For the first run, I passed gcc as the argument. For the second run, I passed distcc as the argument. Here are the results:

Trial            real         user         sys
Without distcc   50m29.959s   47m19.550s   2m50.780s
With distcc      14m24.400s   11m13.560s   2m6.370s

The results speak for themselves. With just two systems, the wall-clock build time dropped from about 50.5 minutes to about 14.4 minutes, which is roughly 3.5 times faster.
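The speedup can be worked out directly from the "real" column of the results. The following is a minimal sketch; to_seconds and speedup are hypothetical helper names, and the parsing assumes times in the XmY.YYYs form that the shell's time builtin prints.

```shell
# Convert a time such as 50m29.959s into seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print how many times faster the second time is than the first.
speedup() {
    before=$(to_seconds "$1")
    after=$(to_seconds "$2")
    awk -v b="$before" -v a="$after" 'BEGIN { printf "%.1f\n", b / a }'
}

# Wall-clock times from the two kernel builds.
speedup 50m29.959s 14m24.400s
```

With the measured times this works out to 3029.959 seconds against 864.400 seconds, a ratio of about 3.5.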

6. Resources