An Overview of Linux Multimedia


1. History

1.1. Overview

In the past, the use of multimedia in Linux was hampered by the
rarity of hardware accelerated XFree86 servers and finicky ISAPNP
configuration. Full motion video software lacked the codec support
that other operating systems enjoyed. The permeation of quality audio
decoders on the platform, though, was well on its way.

As the accelerated XFree86 server and the release of DVD movies became
more common, Linux became a usable platform for high quality multimedia
video.

1.2. Some Silly Dates

March 1996 Oldest date in isapnptool’s changelog
September 1996 O’Reilly publishes Linux Multimedia Guide by Jeff Tranter
1997 Amp, written by Tomislav Uzelac, is the first desktop MP3 decoder
November 1997 First X11Amp Beta released
January 2000 MPAA files lawsuits for DeCSS DMCA violations

2. Hardware Considerations

Multimedia demands more from computer hardware than many other
computing activities, so there are some general requirements for
having a fulfilling multimedia experience in Linux. Recently, support
for popular hardware in Linux has been common.

2.1. Sound

Most new sound cards have at least basic support in Linux. While
many are supported directly by the kernel, there are several alternative
options including the Advanced Linux Sound Architecture and (a
commercial offering) the Open Sound System.

A sound card capable of 16-bit PCM output will suffice for all but
the most extreme audiophile. Hardware MIDI support is less frequent
for most cards, but utilities like timidity can simulate MIDI devices
in software.
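
For instance, assuming timidity is installed and already configured with a
patch set or soundfont, playing a MIDI file through an ordinary sound card
is a one-liner (the filename is just an example):

$ timidity song.mid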

3. Multimedia Utilities

3.1. Music

Linux users have available to them a plethora of programs for use in
playing MP3 audio, which may or may not continue to be the format of
choice for storing gigs upon gigs of perfectly legal backups of your
CD collection. Recently more popular, in light of legal hullabaloo
when Fraunhofer began making demands on royalties for MP3 software,
is the Ogg Vorbis format, which has some excellent software available
as well. We’ll discuss a few interesting options for playback.

3.1.1. amp

amp, with its last release in 1997, was created by Tomislav Uzelac while
he was a graduate student at the University of Zagreb in Croatia.
This is still a very viable solution for playing MP3 audio files — it is
non-interactive, and Just Works from the commandline. A webpage from
the PlayMedia site refers to amp’s MP3 decoding algorithms as "the
Rolls-Royce of MP3 playback technology." What I like about it is that
it’s very simple, and works nicely in small scripts. amp comes with
Slackware, and there is a Debian package for it in the "non-free"
section; it’s not GPL’d, but the source is available and the author
describes it as "free software". AMP stands for "Advanced Multimedia
Products", which was Uzelac’s company, later absorbed by PlayMedia.
AMP is also the AMP in X11AMP, WinAMP, and MacAMP, which are (or
at one point were) derivative works.
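
As a sketch of that scriptability (assuming, as its use in the CD recording
examples later in this document suggests, that amp takes filenames on the
command line), a whole directory of files can be played with a one-line loop:

$ for f in *.mp3 ; do amp "$f" ; done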

To learn more about Tomislav Uzelac, see
http://www.playmedia.hr/tomislav.htm.

3.1.2. mpg123

mpg123 plays and decodes MP3 audio files, and does it rather well. It’s
generally used for its simplicity, and because it works nicely as a backend
for other programs, some of which work from the commandline, under X, with
ncurses, or from within Emacs. While its source is available, mpg123’s
license does not allow for commercial uses, and as such, it isn’t Free
Software. Take a look at mpg321 for a Free option.

mpg123 can be found at http://www.mpg123.de/.
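
In its simplest form (the filename is a placeholder):

$ mpg123 track.mp3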

3.1.3. mpg321

mpg321 is a Free replacement for mpg123, intended to replicate its
functionality and allow people to integrate its code into new projects
without licensing entanglements. It uses the Ogg Vorbis project’s libao,
although as of yet it doesn’t play Ogg files (for that, see ogg123). In
a move that could either be seen as helpful or obnoxious, mpg321’s install
scripts by default create a symlink called mpg123 to the mpg321 binary.

mpg321 lives at http://mpg321.sourceforge.net/.

3.1.4. xmms

xmms stands for X MultiMedia System (formerly known as x11amp), and
is a graphical, modular program, using various plugins to handle
various sorts of media files that you might wish to play. It looks
and feels very much like Winamp, for those familiar with that
particular program. Most significantly, it’s used to play MP3,
although it also supports Ogg. xmms’s plugin system allows new file
formats, output methods, and effects to be added without disturbing
the rest of the program. Particularly entertaining is the "voice
removal" plugin, which comes with xmms. In a recent test, it managed
to nearly completely remove Frank Sinatra’s voice from "The Coffee
Song", allowing for an almost perfect karaoke experience. Success
with this particular plugin varies from vocalist to vocalist.

XMMS Screenshot

3.1.5. Ogg Vorbis Tools

Ogg Vorbis is an "open, free" audio codec, free from patents and
royalty problems, with a slick variable quality feature. It probably
deserves its own presentation sometime later. Generally
available are the excellent programs oggenc and ogg123, which encode
and decode .ogg files, respectively. Particularly under Slackware,
these ship as the "oggtools" package, but from vorbis.com, they
can be had as vorbis-tools-1.0.tar.gz. These are not the only
programs available for playing oggs, however — as mentioned earlier,
xmms works nicely, as does Zinf (formerly FreeAMP), apparently.
Various other packages are available, as are shell scripts to
convert your entire MP3 collection into oggs. This is generally
not considered a good idea, however, as converting from one lossy
format to another format lossy in different ways results in a
cumulative loss of sound quality. The xiph.org folks recommend
keeping one’s MP3s (or just re-encoding straight from the CDs
to Ogg) and encoding new music with oggenc, as many players can handle
both formats. xmms supports Ogg Vorbis with no modification, and
oggenc and ogg123 are very easy to use.
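
For instance, a minimal encode-and-play session might look like this, with
-q selecting the variable quality level mentioned above (filenames are
placeholders):

$ oggenc -q 5 track01.wav -o track01.ogg
$ ogg123 track01.ogg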

With great alacrity (and For Great Justice), take a look at:

http://www.xiph.org/

http://www.vorbis.com/

3.2. Video

Video is generally the last mile in Linux multimedia. Most hardware
acceleration still requires a fair amount of effort, although the most
popular new cards are well supported. Nvidia releases a closed source
accelerated driver for XFree86 (Linux only), while the now-defunct 3DFX’s
line of Voodoo cards and ATI’s Radeon line have DRI (Direct Rendering
Infrastructure) support in the Linux kernel.

Full motion video utilities can display hardware accelerated output
through the X-Video (Xv) extensions available in XFree86 4.X. X-Video
allows images to be displayed with quality scaling and filtering using
shared memory segments. To test whether hardware accelerated Xv output
is available, use the xvinfo(1) command.

Example xvinfo(1) output

$ xvinfo
X-Video Extension version 2.2
screen #0

Adaptor #0: "NeoMagic Video Engine"
number of ports: 1
port base: 55
operations supported: PutVideo PutImage
supported visuals:
depth 16, visualID 0x23

depth 16, visualID 0x24
number of attributes: 3
"XV_COLORKEY" (range 0 to 16777215)
client settable attribute
client gettable attribute (current value is 2110)

"XV_BRIGHTNESS" (range -128 to 127)
client settable attribute
client gettable attribute (current value is 0)
"XV_INTERLACE" (range 0 to 2)

3.3. Tracked Modules

Xmp, the Extended Module Player, is a tracked module player which has
support for a multitude of tracked module formats including XM, S3M,
MOD, IT, and many others. If you have GUS or SoundBlaster AWE hardware,
xmp can use the sequencer to play the modules. For more information see
http://xmp.sf.net/.
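
Playing a module is typically as simple as (the filename is an example):

$ xmp coolsong.xm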

3.4. CD Ripping

There are quite a few useful utilities for ripping and encoding CD audio.
Among these are cdparanoia, an excellent command-line ripper with error
correction, lame, an MP3 encoder which supports MPEG 1, 2, and 2.5 layer
III encoding, and oggenc, the official Ogg Vorbis encoder. Grip, a GNOME
application, can serve as a front end to all these utilities and will
automate the entire ripping/encoding/tagging process involved in archiving
CD audio.
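
As a rough sketch of what Grip automates behind the scenes (the device,
track number, and filenames here are arbitrary), ripping and encoding a
single track by hand might look like:

$ cdparanoia -d /dev/cdrom 1 track01.wav
$ lame -h track01.wav track01.mp3
$ oggenc -q 5 track01.wav -o track01.ogg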

4. Resources

Home Directory Sprawl (Using CVS)

CVS stands for the Concurrent Versions System (or Concurrent Versioning
System or Concurrent Versioning Software, and so forth) and is pronounced
Cee Vee Ess. It was designed as a replacement for RCS, or the Revision
Control System.

1. Background

The idea of tracking changes made to a development project has been around
for a while. It’s not a particularly interesting job, as evidenced by the
tools we have today. Some of the more popular source code control systems
are CVS, RCS, and SCCS.

CVS Concurrent Versions System. This is the de-facto standard used in
the open source community.
RCS Revision Control System. The old system used before we had CVS.
SCCS Source Code Control System. Used by companies like Sun. You
won’t find SCCS tools in the open source world. If you’re
itching to see SCCS, look in /usr/ccs/bin on a Solaris machine.

There are some new projects underway to replace CVS. While CVS does work,
it has some well-known limitations that are now becoming an issue as
CVS-controlled projects become larger and larger. A popular one in the
Linux community is BitKeeper (or simply bk). BitKeeper is made by
BitMover, Inc. They designed bk to be a good enough tool that Linus would
be willing to use it for kernel development. A difficult task, at best.
BitKeeper is in use in a lot of places, but one of the major differences
that you’ll find between bk and CVS is that you can’t get the bk source
code, only binaries. This is because BitKeeper is not an open source
project, it’s commercial software.

And there is yet another hopeful CVS replacement, Subversion. The
Subversion project was started by the CVS authors (!) because they wanted
to correct CVS’s deficiencies and make the upgrade somewhat painless.
We’ll see what becomes of that project, so far it’s looking pretty good.

… So, with all the above, why CVS? Several reasons, and most of them
are related to the current status of the other projects:

  1. CVS is well established and pretty much error-free.
  2. It’s open source, which means you can get it working on basically
    any platform.
  3. I don’t like trusting ALL of my data, or even just my development
    projects to a new, experimental, and developmental source control
    system.

In a few months, when I decide to look at Subversion again, I may choose
to move over to that. But for now I’m going with CVS. If you decide to
never use CVS for your own purposes, it’s still a good idea to understand
how it works because it is used in so many projects.

2. Terminology

Repository
The name of the CVS server. This is where all of the
data is stored, it’s where your changes go, and it’s
where you pull updates from.
Tag
This is the CVS name for a version or release.
Module
The project you work on from the repository. Typically a
repository will only have a few modules, one for each major
aspect of development. For example, FreeBSD has the ‘src’,
‘doc’, and ‘ports’ modules.
Branch
Concurrent development on the same module. CVS does a
really neat trick which allows you to branch development at
any time and it will start tracking changes specific to
that branch. This is useful for software development because you
can make a branch after a release and use it for security
updates.
Pserver
CVS’s internal server mechanism. Don’t use this.
Attic
CVS never deletes a file from the repository. Files
marked as deleted get put in the attic.

3. How CVS Works

The terminology section may have given you some hints as to how CVS works
and what it actually does besides "source code control." The easiest way
to think about it is to think of it as a development mediator. When multiple
developers work on the same project, CVS handles merging all their work,
sending out changes to each developer, and so forth.

Life in CVS begins by creating a project, which consists of at least one
module. In this module, you import the files that belong to it. In the
world of software development, this consists of C source code files,
header files, Makefiles, support files like X pixmaps, and documentation.

What doesn’t go in CVS? Anything that can be automatically generated.
Object files that the compiler creates, dependency files, the program or
library executables, and … configure scripts. Autoconf can regenerate
the ‘configure’ script, so you don’t want this in CVS. Before
distribution in gzipped tar format, most developers run ‘autoconf’ to
create that script for you.
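
One common way to keep such generated files from cluttering CVS output is a
.cvsignore file in the source directory. A sketch for an autoconf-based
project (adjust the patterns to taste):

$ cat .cvsignore
*.o
*.a
.deps
configure
config.log
config.status
Makefile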

Your source is now under CVS control. To work on it, you checkout a copy,
make your changes and commit them to the repository. Periodically you
will run the update command to pull down any changes from other
developers. And, if it’s like any other CVS project, you’ll have merge
conflicts that will need resolving by hand.

That’s the big picture. The important parts are understanding that you
are working on a copy of what’s under CVS control. You use CVS to manage
the changes to your copy and other copies.
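
Strung together, a typical work session on a checked-out copy looks
something like this (module and file names are placeholders):

$ cvs checkout myproject          # get a working copy
$ cd myproject
$ vi main.c                       # make some changes
$ cvs update                      # merge in other developers' work
$ cvs commit -m "fix the frobnicator"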

4. Problems With CVS

4.1. Permissions and ownerships

CVS does not store permissions and ownerships on files. You cannot flag a
file under CVS control as world readable, for instance. When you check
something out from CVS, the files come to your workstation and CVS chowns
and chmods them according to your umask. When you commit changes, CVS
pulls in your changes, but doesn’t modify the permissions in the
repository. Most projects use different group permissions on the
repository so they can restrict access to project developers.

People using CVS for more than software development make use of the post
operation scripts to overcome this limitation. You can tell CVS to run a
script or program after a CVS update or commit. In this script you can set
permissions and other such things that CVS doesn’t handle.
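
A minimal sketch of such a fixup script, run by hand or hooked in after an
update or commit (the paths are examples only):

#!/bin/sh
# restore permissions that CVS does not track
chmod 700 "$HOME/Mail"
find "$HOME/bin" -type f -exec chmod 755 {} \;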

4.2. Symbolic links

You cannot store symbolic links in CVS. Forget about it, ain’t gonna
happen. Use the script hack above or just say good bye to symbolic links
forever.

4.3. Binary files

CVS provides source code control. You can check out old versions,
generate patches between releases, and many other tasks specific to
software development. To perform these tasks, CVS must deal with plain
text files. This is how it can track changes between the files. This
presents a major issue for binary files. So much of an issue that CVS
just doesn’t handle binary files. Now we’re starting to have some
problems. To get around this problem, we can flag a file as ‘binary’ and
CVS won’t track changes to it. It will just make sure one copy is in the
repository. Generally speaking, this works fine, but it means you can’t
use CVS to track changes between, say, JPEG image files.

4.4. You cannot delete directories

Once a directory is added in a CVS project, you can’t delete it. Remember
I said that CVS never deletes a file, it just moves them in to the attic?
Well, if you checkout an old release that had a now deleted file in a now
deleted directory, CVS needs to know where to put it. The empty directory
is where it will put that. Because of this, CVS can think of empty
directories as deleted, which is what you want to do.

4.5. Spaces in filenames

Generally speaking, CVS does not like to deal with spaces in filenames. I
have seen tricks to make CVS deal with this, but I prefer to not have
spaces in filenames anyway. For some people, this may present an issue.

5. Commands

To perform a CVS operation, you run cvs and specify one of the commands
below. Each command has a help screen, which you can get with this
syntax:

cvs command --help

Below are the major CVS commands; you can see --help-commands for a
complete listing. All of these commands assume you have a working CVS
repository (this is covered in the second part of this presentation).

5.1. import

The import command is for creating new CVS-controlled projects. You run
this command from the directory you want in CVS. For software development
projects, this is usually your source directory. To import the current
directory, use this command:

cvs import -d project-name vendor-tag release-tag

The -d flag tells CVS to use each file’s mtime as the import time. This
way you can import projects that haven’t been under CVS control for years
and the timestamps are preserved. The project name is what you want to
call the CVS module, the vendor-tag is mostly a formality, and the
release-tag is just a symbolic name to represent the import. For the
vendor tag I use BURDELL. For the release tag on import operations, I use
‘start’.

5.2. checkout

The checkout command is what most people are probably familiar with. The
simplest syntax is:

cvs checkout module

Which checks out the latest revision of the specified module. Adding the
-r [rev] switch will check out a specific revision.
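
For example (the tag name is hypothetical):

cvs checkout -r RELEASE_1_0 module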

5.3. export

This command works like the checkout command, but it does not "check out"
a copy to work on. That is, the CVS server does not know you are working
on that copy. This command mainly exists for creating source archives for
distributions. Once you tag the release, you export it to another
directory and it’s free of the CVS repository and does not have those
‘CVS’ subdirectories all throughout the tree.

5.4. add

The add command is used to add files to a module you have checked out.
You must specify each file to add:

cvs add files...

If you want to add a directory, use the add command on the directory, but
then change in to the directory and add each file. The add command has a
special flag for adding binary files, the -kb switch. Use this switch
when you are adding any file that is not plain text.
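
For example (hypothetical filenames), adding a new directory containing
both text and binary files might go like this:

cvs add images
cd images
cvs add README
cvs add -kb logo.png photo.jpg
cvs commit -m "add images directory"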

5.5. remove

To remove a file from a module (or project), you must use the cvs remove
command. The biggest problem people have with this is that you must
specify each file separately. It’s really not that big of an issue: once
an entire project is under CVS control, you’ll find that you rarely remove
large sets of files. If you need to remove an entire directory tree
that’s under CVS control, use the -R switch to recursively remove the
directory. The syntax:

cvs remove [-R] [-f] files...

This CVS command has one major annoyance: you cannot run ‘cvs remove’ on a
file until you actually rm it from the filesystem. To get around this
default behavior, use the -f switch on this command to tell CVS to rm the
file before removing it from the project. Very useful.
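
So removing a file in one shot looks like this (the filename is a
placeholder):

cvs remove -f obsolete.c
cvs commit -m "drop obsolete.c"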

5.6. update

You will use this command as much as the commit command. The CVS update
command brings your local copy of the project up to date with all the
changes available in the repository since you last updated. This command
merges differences and also lets you know of merge conflicts. The syntax:

cvs update

Run this command from the main project directory and CVS will check the
repository and pull down and merge all the changes. The output from this
command can be a bit cryptic. The program displays a letter indicating
the operation, followed by the file involved with that operation. Here
are the letter codes you will most likely see:

U  file updated
A  new file added
P  file patched (like U, but not the entire file)
R  file removed
M  changes merged
C  MERGE CONFLICT
?  cvs hasn’t got a clue

The two options I use with cvs update are -d and -P. These two options
bring down all directories in the module and then "prune" empty ones,
which we assume are deleted directories.

If you get merge conflicts (you will), cvs does this really nice thing by
default where it stomps all over your copy of the file in conflict. It is
up to you to then move the file out of the way, get the copy from the
repository, diff the files, and merge the changes by hand. If you are
predicting merge conflicts, it’s a good idea to use the common CVS option
-n, which reports what would be done to your copy without actually doing
it. So running this command:

cvs -n update 2>&1 | grep "^C "

Will report what files have merge conflicts without stomping all over
them.

5.7. tag

This command is used for software development projects under CVS control.
If you’re tracking your home directory with CVS, you probably won’t make
releases of it at various points.

The tag command marks the state of the repository with a symbolic name.
Once you tag the module, you can later checkout a specific tag by name.

The tag command places the symbolic tag on your checked out copy of the
project, the rtag command puts the tag in the repository and does not
affect your copy.

When working with branches, you make use of the tag command. You can
create and merge branches with the tag command.
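
A sketch of the usual release-and-branch sequence (the tag names are made
up):

cvs tag RELEASE_1_0                # mark the release on your working copy
cvs tag -b RELEASE_1_0_FIXES       # create a branch for later fixes
cvs update -r RELEASE_1_0_FIXES    # switch the working copy to that branch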

5.8. log

When you make commits, CVS will prompt you for a log entry. Using this
wisely will produce a log for the project that can be referred to later.
Many people skip this step. It really doesn’t matter, but I like logs; they
make tracking down changes easy. With a CVS log, you get the file
listing, the revision numbers, and the annotation so you can quickly get
back to working copies of files.

5.9. diff

The diff and rdiff commands can display the changes between revisions of
files in either unified or context format. If you need to manually
resolve a merge conflict, generate a patch to a tagged release based on
the current developmental copy, or to see what you changed when you broke
something, the diff command is what you want to use. The syntax:

cvs diff -r rev1 -r rev2 files...

The diff command has a lot of options, most of which fall through to the
diff command. You can diff specific revisions, files from specific dates,
and you can process directories recursively to generate large patches.
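
For instance (the revision numbers and tag are invented):

cvs diff -u -r 1.4 -r 1.7 main.c
cvs diff -u -r RELEASE_1_0 > /tmp/since-release.patch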

5.10. Other Commands

The CVS commands that begin with an ‘r’ are like the similar command
above, but they operate on the repository instead of the checked out copy.

You may also be familiar with the ‘login’ and ‘logout’ commands. These
are generally used with the CVS pserver mechanism and not when a
repository is served via ssh.

5.11. Options

There are some common CVS options, such as -z (for compression) that apply
to all CVS commands. You can use the --help-options switch to see a list
of those.

On the subject of command options, the location on the command line where
you put options does matter. There are generic CVS options and command
specific options. You need to follow this order when using options:

cvs [common opts] command [command opts] [files...]

You can alias the CVS commands to the command plus the common options you
prefer. This is done in the ~/.cvsrc file.
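
A typical ~/.cvsrc that bakes in the options mentioned in this presentation
might read:

cvs -z3
update -dP
checkout -P
diff -u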

5.12. RCS variables

CVS uses the RCS file format to track changes. Because of this, you can
make use of special RCS variables within your plain text files. A common
one is $Id$, which expands to a description containing the RCS file name,
a timestamp, and the revision number. Another RCS variable is $Log$, which
expands to the commit log for the file. This can get really long for
files that change often (.c files for development projects), but for files
that rarely change, it can provide a quick way to look at the log. A list
of the common RCS variables:

$Id$         Identification string
$Log$        Commit log
$Revision$   Revision number
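
As an illustration, a script might carry the keyword in a comment, and CVS
expands it on commit and checkout (the expanded values below are invented):

# $Id$
# ...after a commit, CVS expands the line to something like:
# $Id: backup.sh,v 1.3 2002/10/14 21:05:11 david Exp $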

This also brings up a good point about CVS. The RCS man pages apply to
CVS, mostly. Specifically the co(1), ci(1), rcsintro(1), and rcs(1) man
pages.

6. Using CVS to Synchronize Your Home Directory

In this example, I’ll explain how I set up CVS on my own machines to
synchronize my home directory. This is a problem that I’m sure everyone
has encountered at least once.

6.1. The Problem

When you get a new user account on a system, you are given a place for
your files, your home directory. Over time you get more and more shell
accounts, sometimes even on your own machines and you begin to lose track
of where files are and you begin to have trouble maintaining the
environment profiles between them all. This is a problem I’ve fought with
for a long time, until I read the article by Joey Hess in the September
2002 Linux Journal. Joey explained how you can put CVS to work
synchronizing your home directory. It never occurred to me to try this,
but I decided to give it a shot. It’s been working great between my
machines. Below is a short description of what I did to move my life in
to CVS.

6.2. Layout your new home directory

Projects under CVS require some thought. Not being able to freely remove
directories and having files always exist means you can’t just throw
things anywhere. Well, I guess you could do that, but it would make for a
CVS managed mess. So, I created a new directory as the working tree for
what would become my new home directory. Inside this directory I have
these subdirectories:

GNUstep WindowMaker profile and other GNUstep stuff
Mail All of my email (now 118MB)
bin Scripts and programs I’ve written for myself
doc Like ‘My Documents’ on Windows
etc Location of configuration/support files for my ‘bin’ stuff
gt All of my Georgia Tech class stuff (now 425MB)
media Movies, pictures, and random audio files
src Programs I’m working on
tmp Scratch space

That’s it. I keep classwork under the gt subdirectory, general ‘work’
goes under doc in an appropriate subdirectory, and so forth. Keeping to
this structure will ensure your cvs attic doesn’t grow enormous.

6.3. Import

I start by importing an empty directory structure. I will be holding
plain text as well as binary files in my home directory, so I need to
specify the -kb switch on some of them.

cvs import -d david-homedir BURDELL start

Once the import is complete, I remove the directory I just imported and
check it out from CVS.
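
Put together, and assuming CVSROOT already points at the repository, the
sequence looks like this (directory names are just examples):

$ cd ~/newhome                            # the skeleton directory tree
$ cvs import -d david-homedir BURDELL start
$ cd .. && rm -rf newhome                 # remove the imported copy
$ cvs checkout -d newhome david-homedir   # check it back out from CVS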

6.4. Add files

With my empty directory, I start adding in files one at a time. I did
several subdirectories at once and then I’d commit the changes. This
process took a while, but I only have to do it once.

Remember to use the -kb flag for binary files. Use cvs commit to place
the new files in the repository.
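
A sketch of one round of that process (the binary patterns are examples;
directory names follow the layout above):

$ cvs add bin doc etc media
$ cvs add bin/*.sh etc/*.conf
$ cvs add -kb media/*.jpg media/*.ogg
$ cvs commit -m "merge in scripts, configs, and media"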

6.5. Update systems

Since my home directory is huge, I can quickly lose track of what I’ve
added to CVS and what I haven’t. I use the cvs update command and look at
the lines beginning with "?" and then go and handle those files. I pretty
much used the update command as my checklist for what I still needed to
merge in.
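
For instance, to list only the files that are not yet under CVS control:

$ cvs -nq update | grep '^?'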

6.6. Take your home directory to CVS

With everything in the repository, you can now replace your home directory
with a CVS checkout. It’s somewhat tricky, but here’s what I did.

$ cvs commit                      # final commit
$ cd                              # change to my home directory
$ cd ..                           # go up one level (/usr/home on my system)
$ rm -rf ~/*                      # remove everything in my home directory
$ cvs co -d david david-homedir   # check out my home directory

Now, I went to another terminal and tried logging in. Once I verified
everything was working, I logged out of the shell I did the checkout from
and started using normal shells.

6.7. Make CVS a habit

Living in CVS isn’t hard, it just requires a few extra commands on top of
the normal commands you type.

At the end of each day, you should do a cvs commit to commit your work for
the day to CVS. Each time you start working on something new, start in a
logical place and do cvs add on those files and directories. After about
a week, the CVS commands become second nature.

7. Conclusion

Is CVS the best tool for home directory synchronization? Probably not,
but for me it works fine. The advantages I get are:

  • Distributed backups
  • Home directory synchronization
  • History

Based on those advantages alone, I think CVS is the right tool for me. It
has some shortcomings, but I think the advantages above are worth it. I
used to use NIS and NFS for account and home directory management for my
systems, but that requires access to the NIS and NFS server all the time.
This doesn’t work well for laptops. After that, I tried hacking something
together with rsync and ssh, but there was no easy way to keep track of
which machine was the "master" copy of my home directory…rsync doesn’t
merge differences. And now I’m using CVS.

8. Resources

Filesystems

1. Definitions

1.1. What is a filesystem?

Most hard drives contain numerous tracks, each track
containing thousands of sectors (or blocks), each sector/block
containing 512 bytes, each byte containing 8 bits. That’s a lot
of 1s and 0s! These 1s and 0s are useless, however, if they are
not written to the hard drive in order and with some
organization. Filesystems facilitate not only the storage, but
also the location, retrieval and manipulation of data. By
telling where the data should be stored and how it should be
read and manipulated, filesystems enable us to make good use out
of those expensive hunks of metal, glass, ceramic, silicon, and
more.

1.2. Metadata

Metadata is crucial to filesystems. Metadata, "data about
data," is information stored in reference to, but not a part of,
the main data written to the disk. For instance, whenever a 4
kilobyte file, "foo.bar," is written to the disk, the metadata
remembers its size, position on the disk, name, and more.

1.3. I-nodes

Metadata is stored in sections of the hard disk referred to
as i-nodes. I-nodes also contain block maps, which store
detailed information as to exactly where the data is on the
disk. Unfortunately, not all files consist of contiguous bits on
the disk. Rather, a file might exist in many separate sections
scattered all over the disk (aka fragmentation).

1.4. Directories

Directories are simply containers for any number of files
(units of data). Directories can be easily stored as linear
lists, containing at least the name of the file and that file’s
inode. Of course there can be directories that contain
sub-directories. And aha! Linked lists 🙂 These linked lists
therefore form hiearchical structures, which should be familiar
to most computer users.

1.5. B-Trees (and B+trees, B*trees)

More advanced filesystems utilize B-trees (or sometimes hash
tables) which store the directory’s contents using a better,
mostly sorted structure. This makes indexing and searching much
easier and faster. They are also relatively compact (few keys)
and scalable.

1.6. Journaling (only for 2.4.x+ kernels)

Journaling is the coolest part of most modern filesystems
(IMHO). A relatively recent addition, journaling allows
filesystems to be more accurate, more reliable, and less
prone to corruption. If
your system ever crashes (not that Linux would ever crash, but
if you experience a power failure, or worse, you’re using
Windows), a journaling filesystem will be able to repair
itself. Metadata is the essential component of all journaling
filesystems, because it is the metadata that these
filesystems use to make sure the data is accurate and/or
complete. During bootup after a mishap, only the metadata
that has been manipulated recently (immediately before crash)
is analyzed. Therefore, the filesystem is repaired (brought
to a consistent state) very quickly, no matter how big the
drive is! In addition to how filesystems write/read the data
to and from the disk, the manner in which filesystems utilize
journaling makes them different from one another.

This presentation is a comparison of how XFS, Ext2/3, and
ReiserFS manage data and metadata.

2. Filesystem Types

2.1. Ext2

An oldie but a goodie. Ext2 is the most used filesystem for Linux because
it has been around for a long while. The great quality of Ext2 is that it
is really fast. Unlike some other filesystems, Ext2 leaves the cylinder
geometry to the hard drive itself; instead, it divides the drive into
separate block groups. The disadvantage of Ext2 is that it does not
implement any sort of journaling itself. In the rare case that the system
reboots unexpectedly, it is up to a filesystem check program (fsck) to
analyze and repair any damage during the next boot. However, one should
not immediately turn his or her head away from this filesystem. It is
supported very easily by most Unix, Linux, FreeBSD, etc operating systems.
So if you want compatibility and speed, and are willing to sacrifice the
awesome journaling capabilities of the not yet mentioned filesystems, then
Ext2 is for you!!!

Pros: Most used, tried and true, fast, awesome compatibility, well-rounded, "comfortable", solid

Cons: Can’t journal

2.2. Ext3

However, if you do want journaling maybe you should consider
Ext3. Ext3 uses the same code Ext2 does, so it is just as
compatible as Ext2. The only real difference is the addition of
journaling capabilities. Ext2 and Ext3 are interchangeable; one can
easily upgrade an Ext2 filesystem to Ext3, and vice versa (but
why would ya?).

The people who created Ext3 were a little creative with
journaling techniques. First of all, to ensure the integrity of
both the metadata and data, they recorded changes to both
metadata and data. While some other filesystems like XFS use
logical journaling, Ext3 uses physical journaling. Physical
journaling stores the complete replicas of the modified blocks –
which also contain unmodified data. This might seem a little
wasteful, but it has its advantages, discussed later. Logical
journaling, on the other hand, records only modified spans of
bytes (a partial snapshot). Physical journals are generally
less complex than logical journals. Also, physical journaling
allows some optimization, like being able to write the changes
to disk in one write operation (increasing speed and reducing
CPU overhead). Finally, after all this, both the data and metadata
will be consistent.

Unfortunately this is still a bit slow. Recently, Ext3
started using an alternative. This new method journals metadata
only (bear with me). The new driver combines the writes to data
and metadata into one entity, called a transaction. Basically,
each transaction keeps track of the data blocks that correspond
to each journal update, and consists of first writing the data,
then the journal (metadata). This provides the same
data/metadata consistency without the performance sacrifice.

Pros: All of those of Ext2, plus journaling! Easy to deploy!

Cons: Can be slow, depending on the type of journaling being used,
and creates unnecessary disk activity

2.3. ReiserFS

This filesystem is very often talked about. Hans Reiser, the
creator, wanted to create a filesystem that would meet the
performance and features needs of its users, without having them
create special solutions like databases that operate on top of
the filesystem (which degrades speed and efficiency). ReiserFS
is very good at handling small files. It does this by using
balanced B*trees, which boosts performance and is more scalable,
flexible, and efficient. Instead of having a fixed space for
inodes set during the creation of the filesystem (Ext2 does
this), ReiserFS dynamically allocates the inodes.
Another cool feature of ReiserFS deals with tails. Tails are files (or the
ends of files) that are smaller than the filesystem block. Filesystems like
Ext2 write these files to the disk like all other data, but since Ext2
allocates storage space in blocks of 1k or 4k, the rest of that reserved
block is wasted. ReiserFS, alternatively, stores these files directly in
the B*tree leaf nodes, instead of writing the address of the data in the
nodes and the files themselves elsewhere on the disk like all other files.
This is the trick to increasing small file performance, since both the data
and metadata are in one place and can therefore be read in one swoop.
ReiserFS also packs the tails together. Not only does the way ReiserFS
manages tails make it faster, but it also saves space (a typical 6%
increase of storage capacity over Ext2). Unfortunately, it’s not all great,
because whenever files are changed, ReiserFS must repack the tails; this
causes a decrease in performance. Tail packing can be turned off, for those
speed freaks out there.

As for journaling, ReiserFS uses logical journaling. Unlike
Ext3, it does not ensure that data is consistent with metadata.
This can potentially create a security risk since (although
rare) recently modified files could contain portions of
previously deleted files.

Pros: fast as hell, stable as of 2.4.18

Cons: not as reliable as Ext3 (in reference to data integrity)

2.4. SGI’s XFS

And finally there is XFS 🙂 Written by Silicon Graphics Inc in the early 90s,
this filesystem was based on the philosophy to "think big."

Accordingly, XFS is the fastest of the 3 journaling filesystems
discussed when dealing with large files. Its speed is very
close to that of ReiserFS when handling medium to small files,
unless certain optimizing parameters are passed during the
creation and mounting of the filesystem.

XFS also likes to cache data a lot, eliminating the
unnecessary disk activity that ails Ext3.

The really cool characteristic of XFS lies in what SGI refers
to as allocation groups. The block device is split into 8 or
more sections (allocation groups) depending on the size of your
partition, each allocation group being its own filesystem with
its own inodes. This allows multiple threads and processes to
run in parallel! Since XFS was designed for high-end hardware,
couple XFS with high-end hardware and you’ll get really nice
speed.

XFS fully utilizes B+trees, because of the incredible speed
and scalability advantages associated with them. In fact, XFS
uses 2 B+trees for each allocation group, one containing the
extents of free space ordered by size, and the other ordered by
their starting physical location. XFS is great at
maximizing write performance because of its ability to locate
free space quickly and efficiently. XFS also uses B+trees to
keep track of all the inodes on the disk, which are allocated
dynamically like ReiserFS, only in groups of 64.

One cool thing worth mentioning is XFS allows the journal to
exist on another block device, which improves speed even
more!

Unlike ReiserFS, when an XFS filesystem recovers from a
crash, it writes nulls (0s) to any unwritten blocks. This fixes
the security issue known to plague ReiserFS (although it isn’t
that frequent and significant).

Another feature of XFS (which is unique to XFS) is delayed
allocation. Instead of writing to the disk immediately, it waits
and saves the data to RAM. Basically, it waits so that it can
optimize the number of actual IO operations it will have to
make. This not only improves speed, but also allows data to be
written contiguously (reducing fragmentation). For instance, if
the data was going to be appended to a single file in the end,
XFS writes this file to one contiguous chunk, instead of that
file being here, there, and everywhere! 🙂 Also, this delayed allocation
eliminates the need to write volatile temporary files to
disk.

Procrastination pays off! See, that’s exactly why I
procrastinate with my assignments – so I can wait until I can do
all of the assignment in one chunk of time! Maybe it’s a good
habit afterall! 😛

Pros: fast, not much disk activity, more secure than ReiserFS (somewhat), smart, scalable

Cons: Slow when deleting files (should be fixed soon via patches), not as reliable as Ext3, Gentoo is starting not to like it that much – they recommend ext3 or reiser

# mke2fs /dev/hda1       ---- Ext2
# mke2fs -j /dev/hda1    ---- Ext3 (Ext2 w/ journaling)
# mkfs.xfs /dev/hda1     ---- XFS
    Options: -d agcount=n will change the number of allocation groups it
             creates - default is 1 every 4gb (36gb = 9 AGs)
             -l size=n will change the size of the journal ('n' is in
             megabytes) - 32mb is a good size
# mkreiserfs /dev/hda3   ---- ReiserFS

If switching a partition from ReiserFS to XFS, zero out the partition first.

*Make sure your kernel supports the filesystem(s) you have chosen to
use!
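
A quick way to check on a running system is to ask the kernel what it can
currently mount (filesystems built as modules only show up once loaded):

# cat /proc/filesystems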

Try to use Ext2 or Ext3 for the boot partition. If you use
ReiserFS, you must mount it with the ‘-o notail’ option, which
disables tail packing.
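
If you do put /boot on ReiserFS anyway, the corresponding /etc/fstab entry
might look like this (the device name is an example):

/dev/hda1   /boot   reiserfs   notail   1 2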

3. Conclusion

Ext2 = Standard FS

Ext3 = Rugged Journaling FS

ReiserFS = Speedy Journaling FS

XFS = Quick and smart, but Gentoo believes it to be flaky ("fry lots of data" – hmmm)

CD Duplication and Mastering under Linux

Note: Throughout this document, I refer to CD-RW drives for the recordable
drives. I mean both CD-RW and CD-R drives; it’s just easier to say
CD-RW when referring to both types of drive.

1. Hardware

CD-RW drives were originally only available as SCSI devices. This has
many advantages over other interfaces, but price isn’t one of them. ATAPI
CD-RW drives are now to a point where they perform equal to or, in some
cases, better than comparable SCSI drives. My first two burners were
SCSI, now I use an ATAPI one.

It is important to understand that all CD-RW drives built for a personal
computer speak the SCSI protocol. ATAPI is a way of sending SCSI commands
over IDE. It’s how we have IDE CD-ROM drives, for example.

Since IDE CD-RW drives speak SCSI commands, we can easily make them work
under Linux by using the SCSI emulation layer.

1.1. SCSI

If you prefer SCSI equipment, look no further than Plextor. These guys
make great drives and they work under all operating systems. The other
SCSI brand I don’t have trouble recommending is Ricoh. My first burner
was a Ricoh SCSI one and it lasted for four years before just giving out
after continuous use. HP also makes reasonable SCSI drives.

If you do go with SCSI, don’t forget the SCSI host adapter. If you don’t
have one of these, you’ll need to get one. It can add anywhere from $50
to $200 to the price of the drive. You will want a good host adapter.
Buying a top of the line professional $1000 SCSI burner and driving it off
an Adaptec 1502 ISA host adapter would be stupid. Generally speaking, go
with ATAPI unless you already have SCSI equipment.

1.2. ATAPI (IDE)

ATAPI drives work on most all IDE controllers. Plextor and Teac both make
nice ATAPI drives.

Isolating your burner on its own IDE channel is also a good idea. This
way, your hard disks and burner can operate at their maximum speeds. Also
note that if you plan to do direct CD to CD-R transfers, the two devices
should not be on the same channel.

1.3. Firewire and USB

Newcomers to the world of burners. I have not used a Firewire or USB
burner, but I hear that the ones that do work under Linux work like an
ATAPI device. You load the module for the device and then communicate
with it over the SCSI emulation layer.

1.4. Miscellaneous Recommendations

I was once told two good arguments for paying the extra money for an
external caddy-loading CD-RW drive. Caddy-loading drives have fewer
moving parts, which means there’s a less likely chance that the drive will
die from mechanical failure. And an external drive can be turned off,
which means it won’t be sucking dust from the inside of your case when the
drive is not in use.

Something to consider, but honestly, CD-RW drives are so cheap these days
that there’s really not much point in seeking out hardware that will last
forever. We all get new computers every few years anyway (or parts or
whatever).

1.5. Kernel Stuff

For a SCSI drive, you will want to enable the driver for your host
adapter. Virtually all Adaptec PCI adapters use the aic7xxx driver.
Symbios cards are the next most popular, and the sym53c8xx driver will
take care of most cards in that family.

The drive acts as both a reader and writer. For reading, we need to
enable the SCSI CD-ROM driver. To write to the drive, we enable the SCSI
generic driver and turn on the write permissions with the chmod command on
the device node. With both the CD-ROM and generic driver enabled, your
drive will be available as a /dev/srX and a /dev/sgX device, where X is a
number or letter depending on the driver. If you have only one device,
you would have /dev/sr0 and /dev/sga. The sr device is your SCSI Read
device, and the generic device (sg) will be what we use for writing (after
doing a "chmod +w" on it). Lastly, you should enable vendor specific
extensions under the SCSI configuration section of the kernel.

For an IDE drive, you don’t need a SCSI host adapter driver (not even the
dummy one). You do, however, need to enable SCSI emulation support. You
can leave the IDE CD-ROM support enabled, but I recommend disabling it as
it has been known to cause conflicts with the SCSI emulation driver.

With SCSI emulation support on, your IDE CD-ROM drive will be identified
as a /dev/srX and /dev/sgX device, just as the SCSI drives are. If you
leave the IDE CD-ROM driver enabled, you can pass the "hdX=ide-scsi" boot
parameter to tell the SCSI emulation driver which IDE drive you want
emulated as a SCSI drive.
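
For example, with the burner as the master drive on the second IDE channel
(hdc), the lilo.conf entry would carry something like:

append="hdc=ide-scsi"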

2. Standards and Terminology

There are some specific terms associated with CD recording. Below are
some of the common ones.

2.1. Standards

2.1.1. ISO 9660

Also known as the High Sierra filesystem, this is the standard
filesystem format for CD-ROM data discs. It’s supported under all
operating systems that support CD-ROM drives. This filesystem has
certain extensions by various groups and companies that make it more
usable on specific operating systems. Most people shorten this to "ISO" when
naming ISO 9660 CD-ROM images. This is incorrect, but people still
insist on it.

2.1.2. Rockridge

UNIX extensions to ISO 9660. These allow us to have permissions and
ownerships for files on a CD-ROM, among other things. The name for
this comes from an area of Oakland, CA (kind of like Midtown is to
Atlanta). There is a Rockridge BART station and a Rockridge public
library.

2.1.3. Joliet

Microsoft extensions to ISO 9660. Filenames are stored in Unicode.
Making a Joliet-only CD usually results in read problems for
non-Microsoft operating systems.

2.1.4. El Torito

The bootable CD-ROM specification created for Intel PCs. Its name
comes from a chain of Mexican restaurants in the San Francisco Bay
Area.

2.2. Terminology

2.2.1. disc-at-once (DAO)

Recording an entire disc without turning off the laser. Offers more
control over the disc layout. DAO will also eliminate the 2 second gap
between tracks on an audio CD.

2.2.2. track-at-once (TAO)

Recording a disc one track at a time, with the laser turning off then
on again between each track.

2.2.3. buffer underrun

This happens when your computer does not send data to the burner fast
enough. It results in a bad burn and is typically due to a slow
computer, not enough buffer memory on the CD-RW drive, bad software, or
a combination of any of the above.

2.2.4. pregap

Silence between audio tracks. CDs created in track-at-once mode have a
default 2 second pregap.

2.2.5. coaster

The physical media resulting from a bad burn. A coaster cannot be used
or repaired. I think it’s a bad name to use because CDs make terrible
coasters.

2.2.6. burn/roast/toast

To record data on a blank CD-R or CD-RW media.

3. Software

Under Linux, we use a collection of command line tools to record CDs.
There are numerous graphical frontends out there, but becoming familiar
with the actual command line tools will give you a better understanding of
the process of creating a CD under Linux.

3.1. cdwrite (historical)

The original CD recording tool for Linux. Only worked with a handful of
writers, and only SCSI ones at that. The original author was Adam Richter
(of Yggdrasil Computing, Inc.). Interestingly, it shares a syntax similar
to what cdrecord offers.

This program is deprecated, obsolete, broken, and should not be used.

3.2. cdrtools (cdrecord)

Cdrtools is the name of the software collection that includes cdrecord,
mkisofs, mkhybrid, and cdda2wav. You get mastering, writing, and ripping
tools all in one. Cdrecord is arguably the best writing program available
for UNIX. Cdrecord can speak with SCSI and ATAPI burners and supports DAO
and TAO recording. Many other options are available.

Cdda2wav is the audio ripper included with cdrtools. Most people prefer
to use cdparanoia for ripping as it tends to work with a wider range of
hardware. I find cdda2wav to be extremely fast on SCSI hardware, but your
mileage may vary.

3.3. cdrdao

Cdrdao was written to specialize in writing disc-at-once CDs. The common
".BIN/.CUE" format is supported by this program as well.
Though the build procedure seems a bit anti-GNU, it does work well and is
handy when cdrecord isn’t playing nice.

3.4. cdparanoia

Cdparanoia seems to be the preferred audio ripper amongst the Linux
community. It’s known for being able to rip scratched CDs successfully.

3.5. sox/wav2cdr

If you’ll be creating audio CDs from MP3 files, you have to send the
decoded MP3s through a swabbing process. Cdrecord claims to do this
transparently, though I have had varying success with it. I find sox to
be a much better solution. Sox can transform a .wav file in to a CD audio
track file, which can then be written to a CD. Wav2cdr offers similar
functionality.

3.6. mkisofs/mkhybrid

Since CDs are read-only, we need to create a self-contained filesystem to
write to the media. Under Linux we use the mkisofs tool to create ISO
9660 CDs. The mkhybrid tool is a special version of mkisofs used for
creating Macintosh format CDs, but its functionality has been merged in
to the larger and way more complex mkisofs program.

3.7. lame

There are many MP3 encoders out there. The LAME program is a nice encoder
if you want to rip an audio CD and encode it to MP3s.

3.8. Ogg Vorbis

Ogg Vorbis is a fully open standard for compressed audio. An entire
presentation can be done on Ogg Vorbis alone, so I won’t go into those
details here. If you want to rip a CD to Ogg Vorbis files or make a CD
from a collection of Ogg Vorbis files, you will need the Ogg Vorbis
utilities. Most major distributions now ship with the necessary
components (libogg, libvorbis, libao, vorbis-tools).

3.9. Graphical Front Ends

There are a lot of graphical frontends for burning CDs. All of them use
the tools described above to get the job done. Some are tailored to a
specific task (burning DAO audio CDs, for example), others aim to be the
most full featured programs around.

Some of the major burning frontends are:

X-CD-Roast http://www.xcdroast.org/
gcombust http://www.abo.fi/~jmunsin/gcombust/
KisoCD http://kisocd.sourceforge.net/kisocd.htm

Gnome Toaster http://gnometoaster.rulez.org/
KOnCD http://www.koncd.de/
KreateCD http://www.kreatecd.de/

It may be a good idea to understand the underlying commands these
frontends use, even if you plan on using a recording frontend all the
time. Sometimes the frontends fail and you have to resort to the command
line tools.

4. Examples

The following examples assume your CD-R(W) is /dev/sga and your CD-ROM is
/dev/sr0. Change the device nodes to correctly reference your system.

4.1. Finding your CD-RW device

Your burner should be accessible through the SCSI subsystem. The easiest
way to find it is to run cdrecord and scan the bus:

cdrecord -scanbus

This will return output similar to the following:

Cdrecord 1.10 (sparc-sun-solaris2.8) Copyright (C) 1995-2001 Jörg Schilling
Using libscg version 'schily-0.5'
scsibus0:
        0,0,0     0) 'QUANTUM ' 'ATLAS 10K 36LWS ' 'UCP0' Disk
        0,1,0     1) *
        0,2,0     2) *
        0,3,0     3) 'PLEXTOR ' 'CD-ROM PX-40TX  ' '1.04' Removable CD-ROM
        0,4,0     4) 'PLEXTOR ' 'CD-R   PX-W124TS' '1.06' Removable CD-ROM
        0,5,0     5) *
        0,6,0     6) *
        0,7,0     7) HOST ADAPTER

My CD-RW is device 0,4,0. I will use this when recording CDs rather than
the Linux device node (those may or may not change).

4.2. Erasing a CD-RW disc

Assuming my CD-RW device is 0,4,0, I use cdrecord with this command:

cdrecord -v dev=0,4,0 blank=fast

4.3. Making a bootable Linux CD on an x86 machine

This one requires knowing what’s needed for a bootable root filesystem.
But let’s assume you have that and you’ve managed to fit it into 2.5MB of
space (your root filesystem). First, you make an image file containing
that filesystem. Make a 2.88MB file, format and mount it, and copy your
root filesystem there. Call the image flop288.img. This is a 2.88MB
floppy disk image.
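
A sketch of building that image using the loopback device (the filesystem
type and mount point are up to you):

dd if=/dev/zero of=flop288.img bs=1k count=2880
mke2fs -F flop288.img
mount -o loop flop288.img /mnt/floppy
cp -a rootfs/* /mnt/floppy/
umount /mnt/floppy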

The El Torito standard emulates either a 1.44MB or 2.88MB floppy disk
through the CD-ROM. Your disk image will reside at the beginning of the
drive and will be loaded and booted as though it were a floppy.

Create a temporary filesystem to hold everything for this CD, including
your flop288.img file:

mkdir -p /tmp/cd-root
cp stuff flop288.img /tmp/cd-root

Now we can create the bootable image:

cd /tmp/cd-root
mkisofs -o /tmp/bootcd.img -R -V "My Boot CD" \
-v -d -D -N -b flop288.img -c boot.cat -A "My Boot CD" .

The bootable CD image file is now /tmp/bootcd.img. There are several
other options related to boot CDs. For instance, the -no-emul-boot option
causes a ‘no emulation’ El Torito image to be created. This is like a
normal El Torito image, but it doesn’t emulate the floppy disk. There are
also switches to add multiple boot images to the disc. And, you can
specify a hard disk partition for the -b switch, provided you also specify
the -hard-disk-boot option. Mkisofs will then use the hard disk to create
the boot image for the CD.

4.4. Copying a data CD

I prefer using cat to make image files of CDs I want to copy:

cat /dev/sr0 > /tmp/cd.img

The /tmp/cd.img file can now be burned to another disc using cdrecord,
like this:

cdrecord -v dev=/dev/sga -data -eject /tmp/cd.img

Some people like doing direct CD to CD copies. You can do this, provided
you have two drives. Say your burner is 0,4,0 and your reader is
identified under Linux as /dev/sr1. This command will do a disc to disc
copy:

cdrecord -v dev=0,4,0 -eject -isosize /dev/sr1

And you can always use dd to make the image file and then burn it using
cdrecord.
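
For instance:

dd if=/dev/sr0 of=/tmp/cd.img bs=2048
cdrecord -v dev=0,4,0 -data -eject /tmp/cd.img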

4.5. Creating an audio CD from MP3 files

First you’ll want to select either 74 minutes or 80 minutes worth of
music. Count the time, not the file sizes. XMMS can create a playlist
and give you total running time. There are plenty of other methods to
calculate running time for a set of MP3s.

Once you have your MP3 files selected, decode them to WAV files. I use
amp to do this:

amp -c -w <mp3 file> <wav file>

I run that command for each MP3 file and end up with a set of WAV files.
The WAV files are almost ready for burning. I need to reverse the byte
order so that they sound right on an audio CD player. For that, I use
sox:

sox -w -x <wav file> <cd audio track>

Now my set of WAV files have been converted to CD audio tracks and I can
burn them using cdrecord:

cdrecord -v dev=/dev/sga speed=12 -audio \
    -eject <cd audio track> ...

Just list the track files one after another at the end of the cdrecord
command. The program will burn them to the disc in this order. This
method burns the audio disc in track-at-once mode, which is somewhat nasty
for music CDs. A better method is to record in disc-at-once mode to get
rid of those pregaps:

cdrecord -v dev=/dev/sga speed=12 -audio -dao \
    -eject <track> ...

You can also use cdrdao to do the above steps. For the conversion from
MP3 to CD audio, we can get a bit more creative. Instead of decoding to
WAV then to CDDA, we can combine the two commands so it runs in one step:

amp -c -w <mp3 file> - | sox -w -x - <cd audio track>

So how about a for loop to do that?

for mp3file in *.mp3 ; do
cddafile="`basename $mp3file .mp3`.cdda"
amp -c -w $mp3file - | sox -w -x - $cddafile
done

4.6. Copying an audio CD

There are a couple ways to do this. Cdrdao is probably the easiest way to
make a perfect copy of an audio CD, so that’s what I like to use:

cdrdao copy \
--device 0,4,0 \
--source-device 0,3,0 \
--speed 12 \
--eject

You can also use cdrdao to make a copy of the disc for burning later:

cdrdao read-cd --source-device 0,3,0

Using the copy command with cdrdao will cause the program to create an
image file and then burn that. If you want no image file (like a cdrecord
disc to disc copy), use the --on-the-fly option.

4.7. Making a Data CD

Creating a data CD for a Linux system is pretty much the same as creating
a bootable Linux CD, you just don’t specify the -b or -c flags.

First, make a temporary directory and populate it with the files you want on
the CD. Assuming the temporary directory is /tmp/cd-root, issue this
command to make an image:

cd /tmp/cd-root
mkisofs -o /tmp/datacd.img -R -V "My Data CD" \
-v -d -D -N -A "My Data CD" .

This will make an ISO 9660 image file with Rockridge extensions. This is
generally what you should use when making a Linux or UNIX CD image. If
you will be using the disc on a Windows machine, consider using the -J
flag on mkisofs to generate Joliet data. One method of creating a Windows
CD is:

mkisofs -allow-lowercase -allow-multidot -D -l -iso-level 3 \
-J -relaxed-filenames -no-iso-translate -R \
-o /tmp/datacd.img .

Generating a Macintosh CD takes quite a bit more effort. Macs have
traditionally used the HFS filesystem on CDs as well as hard disks. The
mkisofs program can generate HFS CD images, but it can also generate CD
images using Apple’s extensions to ISO 9660 (kind of like Apple Joliet).
There is a lot of information you need to gather before generating the CD.
The positions of icons on the CD, type and creator codes for the files,
and probably some other stuff. If you’re really interested in making a CD
for Macs that will be read under pre MacOS X, just use a Mac.

4.8. Creating an Audio CD from Ogg Vorbis Files

You can use the oggdec program to convert Ogg Vorbis files to CDDA files:

for oggfile in *.ogg ; do
    cddafile="`basename "$oggfile" .ogg`.cdda"
    oggdec -R -b 16 -e 1 -o "$cddafile" "$oggfile"
done

You can then burn those CDDA files to a disc:

cdrecord -v dev=/dev/sga speed=8 -audio -dao pregap=0 -eject *.cdda

4.9. Making a VideoCD

There is a nice tutorial on creating VideoCDs under Linux at:

http://www.satlug.org/~bigjnsa/vcd-linux.html

There are some additional tools you will need installed in order to
convert movie files to VCD images. Links are provided on the page above
to get the additional tools.

4.10. Copying a PlayStation CD

Making backup copies of your Playstation games can be handy for several
reasons, educational purposes being the top one.

Cdrdao comes with a tool called psxcopy which can make copies of your
Playstation games. There are two C programs, cdjob and psxdump, which
handle copying the data and audio from the game disc. Two scripts are
included which combine the usage of those programs with cdrdao to make
copies of Playstation discs.

5. Resources

Linux and Wireless

Table of Contents

1. Introduction

1.1. So Many Standards

What is Wireless? Wireless Ethernet is Ethernet transferred over radio
waves at certain frequencies, providing a high speed network connection
without wires. It can be very useful for people who have laptops or
other mobile devices that need an internet connection. There are currently
many different wireless devices and many standards. Some of the standards
include 802.11, 802.11-DS, 802.11-b, 802.11-a, HiperLan, HiperLan II,
OpenAir, HomeRF/SWAP, and BlueTooth. We will focus on 802.11-b, 802.11-a,
and a little BlueTooth.

1.1.3. BlueTooth

BlueTooth is not Wireless LAN. It is a cable replacement technology
mostly developed and promoted by Ericsson with the help of Intel, offering
point-to-point links and no native support for IP. It is much like a
wireless USB. I only mention it here because of the common confusion
that BlueTooth is some kind of Wireless LAN.

2. 802.11-b Vocabulary

  1. ESSID The ESSID is used to
    identify cells which are part of the same virtual
    network.
    As opposed to the NWID, which defines a single cell,
    the ESSID defines a group of cells connected via
    repeaters or infrastructure, through which the user may
    roam. With some cards, you may disable ESSID
    checking (ESSID promiscuous) with off or any (and
    on to re-enable it).
  2. NWID As all adjacent wireless networks share the same medium, this parameter is used
    to differentiate them (create logical colocated
    networks) and identify nodes belonging to the same
    cell. With some cards, you may disable Network
    ID checking (NWID promiscuous) with off (and on to
    re-enable it).
  3. MODE The mode can be
    1. Ad-hoc
      (network composed of only one cell and without
      an Access Point)
    2. Managed (network composed of many
      cells, with roaming or with an Access Point)
    3. Master (the node is the synchronisation master or acts as an Access Point)
    4. Repeater (the node forwards
      packets on the air)
    5. Secondary (the node acts as a
      backup master/repeater)
    6. Auto
  4. Access Point An access point is just a device that acts as a path from your wireless device to a wired network. Many access points can be attached together to act as repeaters or even bridged together. But at the end of the line, an Access Point is in charge of transferring the wireless signal to a wired network. It acts just like a hub.
  5. Bandwidth Rate The rate is just a measure of the speed of the connection. In 802.11-b these range from 1 to 11 Mb/s. The nice thing is that if your signal is not strong enough for one rate, the card will drop down to the next lower rate. It will auto-negotiate the best rate possible and keep checking at some interval. (A sketch of how these parameters map onto iwconfig follows this list.)
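
To make the vocabulary concrete, here is a sketch of how these parameters
map onto iwconfig commands (the interface name eth1 and the key are just
examples):

iwconfig eth1 essid "GTwireless"     # join the virtual network "GTwireless"
iwconfig eth1 nwid off               # disable NWID checking (NWID promiscuous)
iwconfig eth1 mode Managed           # infrastructure mode, with an Access Point
iwconfig eth1 rate auto              # let the card auto-negotiate the bit rate
iwconfig eth1 key 4567-89AB-CD       # set a WEP key (see the next section)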

3. Wireless Encryption

The encryption in the 802.11 specification is the RC4 algorithm. Much has
been said about the WEP, or Wired Equivalent Privacy, key having some
serious security flaws. The key is easily cracked, but only after long
periods of time during which data can be received and then analyzed. The
idea of WEP, however, was just to provide privacy equivalent to wired
Ethernet. Wired Ethernet has privacy because someone would need to get into
a building or get to a wired port. WEP is just supposed to provide a
deterrent. A better solution to keeping people off your wireless network
is to have your AP only talk to specified MAC addresses, i.e. the unique
wireless cards of authorized users.

802.11-b supports both 64-bit and 128-bit encryption keys. The GT LAWN
uses a 64-bit setup. Sometimes you may hear about 40-bit or 104-bit
encryption. These are the same as 64/128-bit; some people simply count
the 24-bit initialization vector as a separate part of the key.

4. Wireless and Linux

4.1. Drivers

4.1.1. Pcmcia

First you need PCMCIA support to use PC Card devices. Most people
only run wireless on laptops, since wired ports are much faster for
desktops. I will not go over how to get PCMCIA support in Linux; that is
beyond the scope of this document. Most people would say to use the
kernel PCMCIA support and the scripts from the pcmcia-cs project.

4.1.2. Wireless Cards

The wireless card drivers are found either in the kernel or in a
separate package. There are a few packages available, including pcmcia-cs
and wg-lan. The file that takes care of assigning which driver goes with
which card is /etc/pcmcia/config.

4.2. Testing

Once you have everything set up, you can insert the wireless card and
look at your logs, specifically /var/log/messages, for information on your
wireless setup. If you do not see anything in those logs, you might want
to make sure the pcmcia service is started. The following is a portion of
my logs pertaining to my wireless card.

Jun 18 15:40:09 silver-bullet cardmgr[1005]: socket 2: Lucent Technologies WaveLAN/IEEE Adapter
Jun 18 15:40:09 silver-bullet cardmgr[1005]: executing: 'modprobe orinoco_cs'
Jun 18 15:40:09 silver-bullet cardmgr[1005]: executing: './network start eth1'
Jun 18 15:40:10 silver-bullet /etc/hotplug/net.agent: invoke ifup eth1
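
If nothing shows up, cardctl is a quick way to check whether the card was
even recognized (socket 2 matches the log above; use your own socket
number):

cardctl ident 2     # show what card pcmcia-cs thinks is in socket 2
cardctl status 2    # show the socket's current status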

4.3. Wireless Tools

The Wireless Tools are a set of utilities for manipulating the Wireless
Extensions. They use a textual interface and are rather crude, but aim to
support the full Wireless Extensions.

iwconfig – manipulates the basic wireless parameters
iwlist – (formerly part of iwspy) lists addresses, frequencies, and bit-rates
iwspy – reports per-node link quality
iwpriv – manipulates the Wireless Extensions specific to a driver (private)

The latest wireless tools package can be found at
http://www.hpl.hp.com/personal/Jean_Tourrilhes/Linux/Tools.html
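
A few quick examples of what the tools can report (eth1 is an example
interface name):

iwconfig eth1           # current ESSID, mode, frequency, bit rate, link quality
iwlist eth1 rate        # bit-rates the card supports
iwlist eth1 freq        # frequencies/channels the card supports
iwpriv eth1             # list the driver-private commands, if any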

4.4. Schemes

Schemes are nice for when you have multiple wireless networks that you
connect to at different times. For instance, you could have one for the
LAWN and one for your work/room. This way you can switch configurations
with a single command, using cardctl:

cardctl scheme <scheme-name>

For scheme support to work, you need to have support for it in two files,
network.opts and wireless.opts, which are located in
/etc/pcmcia/.

4.4.1. network.opts


# Network adapter configuration
#
# The address format is "scheme,socket,instance,hwaddr".
#
# Note: the "network address" here is NOT the same as the IP address.
# See the Networking HOWTO. In short, the network address is the IP
# address masked by the netmask.
#
case "$ADDRESS" in

lawn,*,*,*)

DHCP="y"
;;

room,*,*,*)
DHCP="y"
;;

esac

4.4.2. wireless.opts


# Wireless LAN adapter configuration
#

case "$ADDRESS" in

# Lawn stuff
lawn,*,*,*)

INFO="LAWN"
ESSID="GTwireless"
MODE="Managed"
KEY="xxxxxxxxxx"
;;

#room stuff
room,*,*,*)

INFO="apartment room"
ESSID="ssaw318"
MODE="Ad-Hoc"
KEY="off"
;;

# Generic example (describe all possible settings)
#*,*,*,*)
#INFO="Fill with your own settings..."
## ESSID (extended network name) : My Network, any
#ESSID=""
## NWID/Domain (cell identifier) : 89AB, 100, off
#NWID=""
## Operation mode : Ad-Hoc, Managed, Master, Repeater, Secondary,
##auto
#MODE=""
## Frequency or channel : 1, 2, 3 (channel) ; 2.422G, 2.46G
## (frequency)
#FREQ=""
#CHANNEL=""
## Sensitivity (cell size + roaming speed) : 1, 2, 3 ; -70 (dBm)
#SENS=""
## Bit rate : auto, 1M, 11M
#RATE=""
## Encryption key : 4567-89AB-CD, s:password
#KEY=""
## RTS threshold : off, 500
#RTS=""
## Fragmentation threshold : off, 1000
#FRAG=""
## Other iwconfig parameters : power off, ap 01:23:45:67:89:AB
#IWCONFIG=""
## iwspy parameters : + 01:23:45:67:89:AB
#IWSPY=""
## iwpriv parameters : set_port 2, set_histo 50 60
#IWPRIV=""
#;;
esac

5. GT LAWN

The LAWN is the Georgia Tech Local Area Wireless Network. It is an
ongoing project to provide wireless (802.11-b for now) service in campus
buildings. The equipment is Lucent/Agere wireless Access Points that
can easily be upgraded to 802.11-a. The LAWN works in the following way:
you need to get the WEP key that OIT uses for the LAWN from OIT. Then,
once you have the card set up, you insert the wireless card and
bring up the interface using DHCP. After that, just open a web browser
and sign in to the LAWN. Now you are on the LAWN.
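
As a rough sketch, the command line sequence looks something like this
(the scheme name "lawn" and interface eth1 are examples; see the schemes
section above):

cardctl scheme lawn     # select the scheme holding the GTwireless settings
cardctl insert          # or simply insert the card
dhcpcd eth1             # bring the interface up via DHCP (dhclient also works)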

6. LAWN Coverage

The most up-to-date information on LAWN coverage can be found in the
OIT FAQ: http://faq.oit.gatech.edu/0256.html

Building Coverage
500 Tech Parkway Full
845 Marietta Street Full
A.French Building Full
Architecture Building Full
Manufacturing Research Center Full
Bookstore Mall CyberCafe Full
Carnegie Building Full
Centennial Research Building Full
Coliseum Annex Full
College of Computing Full
Georgia Center for Advanced Telecom Tech Full
Human Resources Full
Library Full
Rich Computing Center Full
Van Leer Building Full
Wardlaw Center Full
811 Marietta Street 2nd floor
Alumni House 1st floor
Baker Building 2nd floor and SE corner of 1st floor
Calculator 2nd floor only
ESM building Ground floor
Instruction Center 2nd floor only
King Facilities Building SW office area
Lyman Hall 3rd floor S
MRDC Various parts of 2nd and 3rd floors
Pettit Building (MiRC) Room 102

7. Available Devices and Prices

These prices were taken from www.buy.com
just for an idea of what devices cost.

7.1. 802.11-b

Based on the 802.11 standard, which was finalized in September of 1997,
this standard operates at the 2.4 gigahertz frequency at speeds of 1, 2,
5.5, and 11 Mb/s. The latter two speeds were introduced with the 802.11-b
standard. There were only slight changes to the DS physical layer from
802.11. It is by far the most common standard in use today. The GT LAWN
runs on 802.11-b.

7.2. 802.11-a

802.11-a was standardized before 802.11-b, as the name implies. It is
just now starting to hit the market, however. It is sometimes referred to
as 802.11 at 5 GHz. It offers speeds of 6, 12, and 24 Mb/s, with optional
speeds of 9, 18, 36, 48, and 54 Mb/s.

8. Resources

Installing and Using Debian

Table of Contents

1. What is Debian?

Debian is a distribution of the GNU/Linux operating system that is
developed and maintained entirely by volunteers. The Debian Project is
committed to freedom, as defined by their social contract
(http://www.debian.org/social_contract). All software in Debian,
as well as all aspects of the distribution itself, may be modified
and redistributed, in source or binary form, free of charge.

Software that does not meet the Debian Free Software Guidelines
(DFSG) but has been configured for use in a Debian system is also
available, though it is not a part of the Debian Project, and
redistribution of these packages may be restricted. The non-free
section of the package archives contains software that does not meet
the requirements of the DFSG, and the contrib section contains packages
that are themselves free, but depend on software that is not.

Though freedom is the raison d'être of the Debian project, it is not
the only reason that people find to use this distribution. The packaging
system has been built in such a way that the entire system may be
upgraded with a minimal amount of effort and while the system is
still running. The large number of packages available allows for a
wide variety of software to be installed without worrying about how it
will fit with the rest of the system. The "unstable" branch of the
distribution is updated continuously, which allows users to keep up
with the latest in software development while retaining the advantages
of a packaging system. On the other end of the spectrum, the "stable"
branch is extensively tested before release, making it very stable and
secure.

1.1. A quick note on versions

Currently, three branches of the Debian distribution are maintained
at a time: stable, testing, and unstable. Stable is the most recent
official release, and contains packages that have been thoroughly
tested and meet the requirements for stability and security. The only
changes made to the stable branch are to repair security holes and
other critical bugs found after the release. Stable is often accused
of being out of date, since the release schedule tends to be a bit
slow. At the time of this writing, the current stable branch of
Debian is 22 months old.

Unstable contains the most recent available version of every package,
bugs and all.

The testing branch was created after the release of potato as an
attempt to create a compromise between stable and unstable. Packages
are automatically placed in testing from unstable based on activity in
the bug reports (http://bugs.debian.org/). Once the release
manager decides that it’s ready, testing will be renamed to frozen, and
the testing cycles will begin in preparation for a release.

All versions of Debian since 1.1 (buzz) have been named after
characters from the movie Toy Story, and these are usually the names
by which the versions are referred. The current active versions are
potato (2.2, stable), woody (3.0, testing), and sid (unstable).

2. Installing Debian

2.1. How to get debian

2.1.1. cdrom

As with most other Linux distributions, bootable ISO CD images of
the installation files are available for download. At one time there
was a frightening labyrinth of questions that had to be answered
correctly before finally being shown the list of mirrors, in an
attempt to lighten the load on them. Fortunately, this is no more,
and the CD page (http://www.debian.org/CD/) now tells you where
to find the CD images, rather than telling you to buy a CD or perform
some arcane ritual involving the package mirrors and rsync. However,
a full install CD may not be necessary. Unless you are unable to get
your network working during the install, it may be better to download
only enough to boot your system and install the rest over the network.

2.1.2. floppies

There are floppy images available for potato and woody which contain
enough to boot the system, install the kernel and drivers, and install
the rest of the base system through some other means. The base system
is contained in a tarball located in the same directory as the disk
images, and may be automatically downloaded over http or ftp by the
installer, downloaded manually and made available on a local partition or
over nfs, installed from a Debian CD, or installed from another stack
of floppies.

All floppy images are available in
/debian/dists/(potato|woody)/main/disks-<arch> in a package mirror,
where <arch> is the architecture of the machine on which you wish
to install Debian.

2.1.3. minimal cdrom

For those who loathe making floppies but don’t want or need a full
CD, there is hope. Small CD images containing only the contents of the
rescue, root, and driver floppy images are available from
http://www.debian.org/CD/netinst/ and work well for a network
installation of Debian.

2.2. The install

Though many people curse and say nasty things about the installation
process, it’s actually not too bad. It has gone through several
improvements in usability and stability in the recent past, and most
people will be able to simply walk through the steps in the
installation menu without difficulty. If a shell is needed for any
reason, one can be found on tty2 (accessible with alt-F2), or by
scrolling down the installation menu to "Execute a shell", which will
run a shell on that console and return to the install menu when the
shell exits.

Once the base packages have been installed and a means of booting
the system has been created, you may reboot the system into Debian, at
which point some basic configuration will be handled (root password,
timezone, non-root account), and further package installation will
commence.

3. The Debian Package System

Debian packages are similar to Redhat’s RPMs, with the greatest
difference being a more finely tuned system of dependencies. In addition
to dependencies and conflicts, deb packages have recommended and
suggested packages, which, though not absolutely necessary to install or
run the software in the given package, may be helpful.

The thing that makes Debian packages great, however, is the
Advanced Package Tool, apt. apt is a package management system that
handles downloading, installing, and configuring packages and their
dependencies. Most of the package management tools are actually
frontends to apt.

The only thing that you should need to worry about within apt itself
is the /etc/apt/sources.list file, which tells apt where to find
new packages. The sources.list file is a list of locations of
package archives, in order of preference. The newest available version
of a package will always be used, and if more than one location has
the newest version, the package will be downloaded from the one that
occurs earliest in sources.list.

Each line consists of the type (deb for binary packages, deb-src for
source packages), the location, the version to use (referenced either
by codename (potato, woody, etc) or status (stable, frozen, etc)), and
a list of the branches to use. The current branches are main, contrib,
and non-free, as well as non-US versions of each. Here is an example
sources.list:

# Local package mirror, generated by apt-move
deb file:/var/mirror/debian potato main

deb ftp://ftp-linux.cc.gatech.edu/debian potato main contrib non-free
deb http://www.debian.org/debian potato main contrib non-free

# Packages that can't be exported from the US, usually due to encryption or patents
deb ftp://nonus.debian.org/debian-non-US potato/non-US main contrib non-free

# Source packages for above, so 'apt-get source <pkg-name>' will work
deb-src ftp://ftp-linux.cc.gatech.edu/debian potato main contrib non-free
deb-src ftp://ftp.debian.org/debian potato main contrib non-free
deb-src ftp://nonus.debian.org/debian-non-US potato/non-US main contrib non-free

# If a security hole is found in the stable distribution, the fixed version
# of the package will be put here
deb http://security.debian.org/ stable/updates main contrib non-free

3.1. Command line utilities

3.1.1. apt-get

apt-get is the command line interface to apt. With a few simple
commands, it can be used to update the list of available packages,
install and remove individual packages and their dependencies, and
upgrade all the packages on the system to the latest available
versions. However, apt-get may not be best for installing packages,
since it ignores the Recommends and Suggests fields, which may leave a
package without some of its expected functionality. Suggestions
that apt-get have features added to report recommended and suggested
packages are generally met with claims that apt-get isn't meant to be
used that way and that you should use some other frontend. Regardless,
apt-get is still useful for many tasks, and it is helpful to at least be
familiar with some of the commands.

update
Running ‘apt-get update’ will download the latest package lists from
the locations you put in sources.list, updating the local list of
available packages. It is generally a good idea to run this before
doing anything else.
install <package name...>
Installs the packages and any dependencies.
remove <package name...>
Removes the packages and any packages that depend on them.
Configuration files, if present, will be left behind, so the packages
will be left in the ‘deinstall’ state until purged.
upgrade
Upgrades all packages in the system to the latest version.
dist-upgrade
Same as upgrade, but attempts to determine changed dependency names
and may be used for upgrading across versions. For example, if you
are running potato and want to run sid instead, you can simply change
all instances of potato in /etc/apt/sources.list to sid and run ‘apt-get
update; apt-get dist-upgrade’, which will upgrade your system with a
minimal amount of trouble.
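
A typical session, then, might look like the following (xmms is just an
example package name):

apt-get update                  # refresh the package lists
apt-get install xmms            # install a package and its dependencies
apt-get remove --purge xmms     # remove it along with its configuration files
apt-get -u dist-upgrade         # list what would change, then upgrade everything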

3.1.2. apt-cache

The full functionality of apt-cache isn't terribly useful unless you
really enjoy playing around with apt's package cache, but it can be useful
for finding a particular package name. ‘apt-cache search <regex>’
will give a list of all packages whose name or description matches the
regular expression, along with a single-line description. ‘apt-cache
depends <pkg name...>’ will print out all Dependencies,
Recommendations, and Suggestions for the packages.

3.1.3. apt-move

apt-move is very useful if you run Debian on more than one computer.
It can be used to generate a package archive, usable by other machines,
from the package files downloaded by APT.

3.2. The frontends

3.2.1. tasksel

tasksel is what is run when you choose the "Simple" package selection
method during the initial configuration. There are meta packages
available, named task-*, which contain no data and depend on groups
of other packages. tasksel simply chooses and installs these
task-* packages. However, tasksel cannot be used to uninstall
anything, and can only be used to install whatever the Debian
maintainers felt would make an appropriate group of packages, so it’s
not very useful outside of the initial configuration.

3.2.2. dselect

dselect, the "Advanced" side of the initial package selection, is the
tool that is generally suggested for an everyday package management
tool. It can be used to install, remove, purge, and hold back (package
won’t be upgraded even if newer version is available) individual
packages and dependencies. It notifies the user of recommended and
suggested packages and offers menus to handle situations where more than
one package meets a particular dependency, or to resolve dependency
conflicts. However, it’s extremely slow, has a very user unfriendly
interface, and is impossible to navigate. dselect is single-handedly
responsible for turning countless users away from Debian.

3.2.3. aptitude

aptitude, like dselect, has a curses interface, but, unlike dselect,
it can search through the package database at a reasonable speed, and
its interface is much easier to use. aptitude also allows operations
to be performed on the command line, much like apt-get.

3.2.4. synaptic

synaptic, formerly raptor, is a GUI frontend to apt-get. It is
friendlier (or at least prettier) to use, but suffers from the same
limitations as apt-get. It also has a fairly robust, though slightly
confusing, means of filtering the package list displayed.

3.2.5. stormpkg

Originally created for the now-defunct Stormix, stormpkg is a
GNOME-based APT GUI which can handle full dependency management, filter
the package list by package name or description to ease the search
for particular packages, and edit the sources.list file.
It seems friendly, but still has some stability issues to work
out, like not crashing when someone double-clicks on a package name.

4. Adding to the system.

Once in a while you’ll find some piece of software, or a particular
version of a program, that has no corresponding deb package. What now?
You can either put it in the system as a local addition, or roll your
own deb.

Debian packages will never overwrite anything in /usr/local, so this
is where you should put all software with no corresponding package.
Most of the time, autoconf scripts default to /usr/local as the install
location, so ‘./configure; make; make install’ is often all that is
needed to install a program.

If you do want to make your own package, the easiest way to do this
is usually dh_make, a part of the debhelper system. Start with a regular
source tree, run dh_make from within it, and then you can edit the
control files, preinst and postinst scripts, changelog, and other
package information files. All package files are located in the
debian directory in the source tree. A package can be built by running
debuild from within the source directory. See the policy manual
(http://www.debian.org/doc/debian-policy/) for more information on
package building.
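
A minimal sketch of that workflow, assuming an unpacked source tree called
myprog-1.0 (the directory name, email address, and tarball path are
placeholders):

cd myprog-1.0
dh_make -e you@example.com -f ../myprog-1.0.tar.gz   # creates the debian/ directory
# edit debian/control, debian/changelog, debian/rules, etc.
debuild -us -uc                                      # build an unsigned .deb in ..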

5. Miscellany

5.1. init scripts

Debian uses a System-V init system. Init scripts are marked as
configuration files in packages, so scripts that you add will never be
accidentally overwritten by new packages (at least not without prompting
you first). Init scripts are located in /etc/init.d, and may be added
to the runlevels either by making the symlinks in /etc/rc?.d by hand,
or by using the update-rc.d tool.
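
For example, to hook a locally added script into the default runlevels
(the script name here is hypothetical):

update-rc.d mydaemon defaults      # create the /etc/rc?.d symlinks
update-rc.d -f mydaemon remove     # remove them again (-f forces removal)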

5.2. menus

Debian uses a collection of menu configuration files to generate menus
for the various X window managers, so that packages can create menu
entries without needing to know anything about each menu system. Packages
place files in /usr/lib/menu. These files may be overridden globally
by placing files in /etc/menu, or on a per-user basis with files
~/.menu. The menu files are generated from the menu configs by
update-menus. See the documentation to the menu package for more
information on the menu file format. Menu entries may be created for
locally added programs by creating a menu file using the name "local.foo"
for the package name, where foo may be anything. Any local.* package
name is assumed to be always installed.
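
For instance, a local menu entry might look something like the following
(file name, section, title, and command are examples); drop it into
/etc/menu/local.mytool and run update-menus afterwards:

?package(local.mytool): needs="X11" section="Apps/Tools" title="My Tool" command="/usr/local/bin/mytool"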

5.3. kernel modules

modconf allows you to choose modules from a menu, attempt to load
them into the kernel, and then will set things to automatically load
the module on boot. All it does is add the module name to /etc/modules,
and possibly create a file in /etc/modutils if any special options
are needed. The files in /etc/modutils are concatenated together to
form /etc/modules.conf, a task handled by update-modules.
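
A sketch of the same thing done by hand (the module name and option are
only examples):

echo "3c59x" >> /etc/modules                                # load the module at boot
echo "options 3c59x full_duplex=1" > /etc/modutils/local    # any special options
update-modules                                              # regenerate /etc/modules.conf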

5.4. alternatives

Sometimes more than one program may be installed that handles the
same or similar functions. Debian allows you to choose one of these
to reference by a simple name through the alternatives system. For example,
you may have nvi, elvis, and vim installed on a system, and you
would like to be able to type ‘vi’ and run vim. What Debian will do
is create a symlink at /usr/bin/vi pointing to /etc/alternatives/vi, which will in
turn be a symlink to /usr/bin/vim. You can change the mapping of the
alternatives either by changing the symlinks in /etc/alternatives by
hand (the packages will detect when this has happened and not overwrite
them when the packages are updated), or you may use the update-alternatives
program. See the man page for details.
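
For example, with the vi alternative described above:

update-alternatives --display vi    # show the current target and all candidates
update-alternatives --config vi     # interactively choose which editor 'vi' runs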

6. Resources

Using Doxygen

Table of Contents

1. What is Doxygen?

Doxygen is a tool for generating formatted, browsable, and printable
documentation from specially-formatted comment blocks in source code.
This allows for developer documentation to be embedded in the files where
it is most likely to be kept complete and up-to-date.

Doxygen currently supports C, C++, Java, and IDL, using two different
styles of documentation. Output formats include HTML, LaTeX, XML, and
RTF; additional tools can be used to generate documentation in just about
any format imaginable, including hyperlinked PDF, man pages, PostScript,
compressed HTML (HTML Help), and Microsoft Word.

1.1. Other Doxygen Features

While Doxygen appears similar to Javadoc or Qt-Doc, Doxygen offers a
wealth of additional features:

  • Highly portable (available for Unix, Windows, and MacOS X).
  • Compatible with Javadoc, Qt-Doc, and KDOC.
  • Automatically recognizes and generates cross-references.
  • Can generate syntax-highlighted annotated source code.
  • Converts from HTML tags to markup in LaTeX, RTF, or man pages.
  • Automatically generates class diagrams.
  • Organizes elements into groups, with specialized documentation.
  • Can include LaTeX-style mathematical formulas.

1.2. Who Uses Doxygen?

A number of high-profile projects use Doxygen extensively and have
their source documentation available online; many, many more are listed
on the "Projects that use Doxygen" page of the Doxygen website.
Browsing through the generated documentation of various projects is the
best way to get a feel for the flexibility and style of Doxygen output.

2. Getting Started

The Doxygen download page offers binaries and source distributions for
the package-happy, as well as instructions for accessing the Doxygen CVS
tree. If you encounter problems while building or installing, be sure to
consult the Doxygen installation guide for known problems and workarounds.

To generate documentation from your project, you'll need a Doxygen
configuration file, usually named Doxyfile. To get started, you can
have Doxygen generate a simple, basic configuration:

doxygen -g <filename>

Alternatively, you can use the doxywizard tool, which provides a GUI
interface for creating and editing the configuration file. doxywizard
is included with Doxygen, but you’ll need to enable it during compilation
(if you compiled Doxygen from source) with the --with-doxywizard
flag to configure. doxywizard requires Qt 2.x to run.

A few options which I have found very useful to enable in the
configuration file are:

Option                 Description
EXTRACT_ALL=YES        Generate documentation for all elements, even if
                       they don't have documentation yet.
JAVADOC_AUTOBRIEF=YES  When using Javadoc-style comments, treat the first
                       sentence as the brief description, and everything
                       else as the detailed description (this is the
                       Javadoc standard behavior).
SOURCE_BROWSER=YES     Generate a list of source files with
                       cross-referenced entities.
GENERATE_HTML=YES      Generate HTML documentation, including class
                       diagrams.
GENERATE_LATEX=YES     Generate LaTeX documentation. A makefile will also
                       be generated so that you can build PostScript, PDF,
                       and DVI versions correctly.
RECURSIVE=YES          Recursively search from the current directory for
                       source files.
GENERATE_TREEVIEW=YES  For HTML output, generate a sidebar index.

More options can be found in the configuration section of the Doxygen
manual.
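
Put together, the relevant fragment of a Doxyfile might look something
like this (the project name and input paths are placeholders, not required
values):

PROJECT_NAME      = "MyProject"
INPUT             = src include
RECURSIVE         = YES
EXTRACT_ALL       = YES
JAVADOC_AUTOBRIEF = YES
SOURCE_BROWSER    = YES
GENERATE_HTML     = YES
GENERATE_LATEX    = YES
GENERATE_TREEVIEW = YES

Running doxygen in the directory containing this Doxyfile then produces
the HTML and LaTeX output.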

3. Writing Documentation

The good news is that if you already are familiar with Javadoc or Qt-Doc,
you already know the basics of writing Doxygen documentation. For those
unfamiliar with either system, the basic idea is that you place
specially-formatted comments immediately above anything you want to
document (such as a class, struct, method, field, etc.).

Javadoc-style example:

/**
 * Method documentation.
 * @param x The parameter.
 * @return The return value.
 * @see anotherFunction()
 */

/** Single-line documentation. */
/// Single-line documentation.

Qt-style example:

/*!
Method documentation.

\param x The parameter.
\return The return value.
\sa anotherFunction()
*/

/*! Single-line documentation. */
//! Single-line documentation.

In addition, Doxygen also lets you place documentation after an
element, useful for quickly documenting enums, structs, and member
variables:

int a;  ///< Javadoc-style.
char b; //!< Qt-style.

For C and C++ files, you can place the documentation for your
elements in either the header or the main file; Doxygen will match up
declarations with the actual code.

3.1. Common Doxygen Markup

Include the following markup commands in your documentation to denote
special items. I’m using Javadoc-style here; for the most part,
the Qt-style equivalents are the same keyword, but starting with a
backslash rather than an at-symbol.

Markup              Description
@param var desc...  Document a parameter called var to a function or method.
@return desc...     Document the return value of a function.
@see elem           Add a "see also" link to elem, which can be a function,
                    class, or any other documented identifier.
@author name        Indicate the author of an element.
@version ver        Indicate the version of an element.
@todo desc...       Leave a note about unfinished work.
@warning desc...    Leave a warning.

See the Special Commands Reference for a complete list of markup commands.

Securing MySQL

Table of Contents

1. What is MySQL?

The MySQL database server is the world’s most popular open source
database. Its architecture makes it extremely fast and easy to
customize. Extensive reuse of code within the software and a
minimalistic approach to producing functionally-rich features has
resulted in a database management system unmatched in speed,
compactness, stability and ease of deployment. The unique
separation of the core server from the storage engine makes it
possible to run with strict transaction control or with ultra-fast
transactionless disk access, whichever is most appropriate for the
situation.

The MySQL database server is available for free under the GNU
General Public License (GPL). Commercial licenses are available
for users who prefer not to be restricted by the terms of the GPL.
(Taken from http://www.mysql.com/products/mysql/)

There are four different versions of MySQL available. The one
most commonly referred to is "MySQL Standard," which includes the
standard MySQL storage engines as well as the InnoDB storage engine. For
99% of the applications out there, this is good enough ™. However, if
the licensee is a for-profit corporation (i.e., not non-profit),
MySQL comes in the Pro form, which basically is the same thing
with a different LICENSE.TXT file. 🙂

2. How do I install MySQL?

MySQL runs on most *nix platforms and quite a few Microsoft-based
ones as well. The source/binaries can be downloaded from
http://www.mysql.com/downloads/mysql.html.

For Gentoo (portage-based systems):

emerge mysql

For Redhat (RPM systems):

(download Server, Client programs, Libraries and Header files,
and Client shared libraries)

rpm -i MySQL-*.rpm

From source code (for MySQL Version 4):

groupadd mysql
useradd -g mysql mysql
tar xvfz mysql-VERSION.tar.gz
cd mysql-VERSION

CFLAGS="-O3 -mpentiumpro" CXX=gcc CXXFLAGS="-O3 -mpentiumpro \
-felide-constructors -fno-exceptions -fno-rtti" ./configure \
--prefix=/usr/local/mysql --enable-assembler \
--with-mysqld-ldflags=-all-static

make
make install
scripts/mysql_install_db
chown -R root /usr/local/mysql
chown -R mysql /usr/local/mysql/var
chgrp -R mysql /usr/local/mysql
cp support-files/my-medium.cnf /etc/my.cnf
/usr/local/mysql/bin/safe_mysqld --user=mysql &

3. Configuration Files

There are three files MySQL reads by default for configuration:

/etc/mysql/my.cnf   Global settings
DATADIR/my.cnf      Server-specific options
~/.my.cnf           User-specific options

The DATADIR directory will be the MySQL data directory, typically
/usr/local/mysql/data or /usr/local/var.

Option files can contain any of the following lines:

#comment
[group]
option
option=value
set-variable = variable=value

The standard *nix escape sequences still apply here. All leading
and trailing whitespace is automatically deleted.

A standard configuration file would be:

[client]
#password= my_password
port= 3306
socket= /var/run/mysqld/mysqld.sock

[safe_mysqld]
err-log= /var/log/mysql/mysql.err

[mysqld]
#skip-networking
skip-innodb
user= mysql
pid-file= /var/run/mysqld/mysqld.pid
socket= /var/run/mysqld/mysqld.sock
port= 3306
log= /var/log/mysql/mysql.log
basedir= /usr
datadir= /var/lib/mysql
tmpdir= /tmp
language= /usr/share/mysql/english
skip-locking
set-variable= key_buffer=16M
set-variable= max_allowed_packet=1M
set-variable= thread_stack=128K

[mysqldump]
quick
set-variable= max_allowed_packet=1M

[mysql]
#no-auto-rehash    # faster start of mysql but no tab completion

[isamchk]
set-variable= key_buffer=16M

4. Securing the Configuration Files

There are many ways to secure MySQL by using the /etc/mysql/my.cnf
configuration file.

  • Change the default port for the server and the clients:
    set ‘port = 3306’ to some other value, like ‘port = 27098’.
  • Use the ‘bind-address’ variable to bind MySQL only to the
    localhost (127.0.0.1).

    bind-address = 127.0.0.1
  • By using the ‘skip-name-resolve’ directive, clients that
    attempt to authenticate to the server must do so using an
    IP address only; that is, DNS resolution will not work for
    authentication. While this isn't a wonderful security
    procedure, it is something that could slow a potential attack.
  • The ‘safe-show-database’ directive will only display
    databases for which the authenticated user has some read/
    write privileges. Otherwise, ‘SHOW DATABASES’ will return
    every database on the system.

So a ‘more secure’ configuration file could be:

[client]
port= 40044
socket= /var/run/mysqld/mysqld.sock

[safe_mysqld]
err-log= /var/log/mysql/mysql.err

[mysqld]
skip-innodb
user= mysql
pid-file= /var/run/mysqld/mysqld.pid
socket= /var/run/mysqld/mysqld.sock
port= 40044
log= /var/log/mysql/mysql.log
basedir= /usr
datadir= /var/lib/mysql
tmpdir= /tmp
language= /usr/share/mysql/english
skip-locking
set-variable= key_buffer=16M
set-variable= max_allowed_packet=1M
set-variable= thread_stack=128K
bind-address= 127.0.0.1
skip-name-resolve
safe-show-database

[mysqldump]
quick
set-variable= max_allowed_packet=1M

[isamchk]
set-variable= key_buffer=16M

5. What about SSL magic?

MySQL 3.x does not support SSL; MySQL 4.x, however, does. Since the
4.x line was just released, this author does not have much experience
with all the changes and wrote this tutorial for 3.x. Documentation
can be found on MySQL’s website at http://www.mysql.com/.

However, MySQL 3.x can be used in conjunction with stunnel. stunnel
creates an encrypted tunnel from the client to the server through
which all database transactions can be securely transmitted over the
‘net.

stunnel can be obtained from http://www.stunnel.org/. It runs on *nix
and Windows boxes so it makes a very useful addition to the MySQL
setup. Compiling and installing is a simple matter of reading the
INSTALL file located in the tar file (or just typing ‘emerge
stunnel’, though this does not give you the latest version).

stunnel is currently in its fourth major version. On Gentoo it can be
installed by typing:

emerge /usr/portage/net-misc/stunnel/stunnel-4.x.ebuild

Otherwise, you can download and install stunnel from
http://www.stunnel.org/. The basic install procedures are simple enough.
You still need to create the stunnel.pem file, however (used for openssl
encryption). Since Gentoo handles all the installation, there are a few
commands beyond the ordinary you need to run:

ebuild /usr/portage/net-misc/stunnel/stunnel-4.x.ebuild \
fetch unpack compile
cd /var/tmp/portage/stunnel-4.x/work/stunnel/tools/

Now, regardless of operating system, you need to create the stunnel.pem
file:

make stunnel.pem
chown root:root stunnel.pem
chmod 400 stunnel.pem
cp stunnel.pem /etc/stunnel/stunnel.pem

This procedure will need to be repeated on both client and server.
There will be options you need to type in for stunnel.pem … simple
enough to BS, so have fun.

For the server, you will have to modify the /etc/stunnel/stunnel.conf
file as such:

cert = /etc/stunnel/stunnel.pem
pid = /var/tmp/stunnel/stunnel.pid

setuid = nobody
setgid = nobody

client = no

[3306]                  -- your regular mysql port
accept = 3307           -- your mysql ssl port
connect = 3306          -- your regular mysql port

For the client, you will have to modify the /etc/stunnel/stunnel.conf
file as such:

cert = /etc/stunnel/stunnel.pem
pid = /var/tmp/stunnel/stunnel.pid

setuid = nobody
setgid = nobody

client = yes

[3307]                  -- your mysql ssl port
accept = 3306           -- your regular mysql port
connect = server:3307   -- your server IP and mysql ssl port

After these files have been modified, start the stunnel daemon. On
Gentoo it can be started by running ‘/etc/init.d/stunnel start’. Make
sure both the client and the server are running this daemon and that the
server has the mysql engine running.

At this point, simply typing ‘mysql -h server -u user -p’ should connect
from the client to the server over the SSL connection.

A better reference for stunnel setup can be found at
http://www.freebsddiary.org/stunnel-v3-to-v4.php.

6. Miscellaneous Notes

Since I didn’t have a Redhat box available, I couldn’t see how to do
this via RPMs. However, after installing MySQL from the RPMS, you can
still configure all the options in the my.cnf file and can still use
stunnel to route connections over SSL.

The only RPMs for stunnel that I could find were for Red Hat's Rawhide, so
I don't hold any responsibility for faulty installations. 🙂 I would
definitely suggest either writing your own RPM for the 4.x line (and
publishing it) or just compiling it from source. Very few if any
programs actually have it as a dependency, so you should be good to go
by just downloading the source.

DO NOT UNDER ANY CIRCUMSTANCES USE THE stunnel.pem THAT MIGHT COME WITH
stunnel!
It is a standard SSL key that has been distributed all over
the Internet.

Other fault points I have discovered involve using MySQL with PHP.
When PHP code has to access the MySQL database, you need to supply a
username and password. These PHP scripts are usually world-readable,
and therefore so are the username and password used to access the database.
Special measures outside the scope of this document may be taken when
attempting to secure MySQL with PHP/Apache. Perhaps this can be
discussed in a different presentation, one concerned with securing
web-related applications.

Basic Linux Wireless How-to

Table of Contents

1. Overview

The Linux kernel has supported wireless extensions since 1996. In 2002, it
was updated with a new API for more user space support. The full header
code may be found in /usr/src/linux/include/linux/wireless.h.

Wireless technologies fall under the IEEE 802.11 committee. This family of
protocols is designed to bridge transparently to 802.3, or Ethernet,
networks, although on the air it uses CSMA/CA (collision avoidance) rather
than Ethernet's CSMA/CD. Sometimes it is referred to as "WiFi", though this
term more specifically means 802.11b.

Since the 802.11 standard was introduced, four specifications have evolved:
802.11, 802.11a, 802.11b, and 802.11g. The de facto standard nowadays is
802.11b. A brief synopsis of each can be found in the following table.

Name     Frequency  Max Speed  Modulation
802.11a  5-6 GHz    54 Mbps    OFDM
802.11b  2.4 GHz    11 Mbps    CCK
802.11g  2.4 GHz    54 Mbps    OFDM

OFDM – Orthogonal Frequency Division Multiplexing
CCK – Complementary Code Keying

Obviously support for wireless needs to be found in the kernel, so let’s
explore that now.

2. Kernel Support for WiFi

Since 99% of all wireless applications deal with laptops, I won't even cover
a desktop system. The methodology behind it is similar, however. For more
information, look at the Wireless How-to at The Linux Documentation Project.

The usermode PCMCIA card services are much better than the kernel's built-in
support. However, you need to compile your kernel in such a way that
pcmcia-cs can be loaded. When you run ‘make menuconfig’, go to the following menus:

General setup
PCMCIA/CardBus support

Under the last submenu, there should be an option for PCMCIA/CardBus support.
Set that to "N", or exclude it from the kernel. Now, traverse another set of
menus, starting back from the original screen:

Network device support

Wireless LAN (non-hamradio)

The only option that should be selected here is the top, "Wireless LAN
(non-hamradio)". It should be built-in to the kernel proper. All other
drivers should be excluded. Now, just make the kernel and you can continue
into user space.

3. Usermode Support for WiFi

You will first need to install the PCMCIA CardBus services since we did not
build them into the kernel. The source can be found at SourceForge:

http://pcmcia-cs.sourceforge.net/

The latest drivers as of this writing are pcmcia-cs-3.2.3. I personally run
version 3.2.1 for hardware reasons.

NOTE: If you are running the Orinoco wireless card and wish to do any sort
of wireless monitoring (using Kismet or Ethereal or tcpdump), you will
need to use the wavelan drivers. Under Gentoo, add "wavelan" to your
USE variable prior to compiling pcmcia-cs.

Once you get pcmcia-cs compiled and installed, the next step is to configure
it. Under /etc/conf.d/ (on Gentoo), you will find a pcmcia file. There
should be a line in this file that reads "PCIC". If there isn't, add
one. 🙂 If this is set to your CardBus chipset, then all is good to go. If
it isn't, add the appropriate value. For the Dell laptops, this line should
read:

PCIC="i82365"

There are other options, but they are outside the scope of this document.

Upon a reboot (and adding the pcmcia init script to your BOOT runlevel),
the pcmcia card services should be up and running. If your wireless driver
is supported by pcmcia-cs natively, the driver should be loaded at boot time.
If not, you will need to follow the manufacturer’s instructions for installing
your card’s drivers. (Good luck, is all I have to say.) I would like to
point out that most cards are supported by the prism2 driver included with
pcmcia-cs.

If you cannot find your drivers in the pcmcia-cs package, try the linux-wlan
project (http://www.linux-wlan.org/). They use the pcmcia-cs package for
cardbus services, but install their own drivers.

After you get pcmcia-cs installed and your driver loaded, you can work on
configuring your wireless options. The main file you will edit is
/etc/pcmcia/wireless.opts. While this file can have many options, the basic
few you need to access a wireless network follow this pattern:

case "$ADDRESS" in

scheme,socket,instance,hwaddr)
INFO="Description of wlan_name"
ESSID="essid_of_wlan_name"
MODE="Managed"/* Managed, Ad-Hoc */

RATE="auto"
KEY="wep key goes here"/* Omit if no wep */
;;

esac

The VERY basic identification block is:

*,*,*,*)
    INFO="Any ESSID"
    ESSID="any"
    ;;

The one you will need for the GT LAWN is:

gtwireless,*,*,*)
    INFO="GT LAWN"
    ESSID="GTwireless"
    MODE="Managed"
    RATE="auto"
    KEY="wep here"              # WEP key omitted for the web copy of this doc
    ;;

So now we’ve configured all the wireless options. You can change schemes by
using the cardctl command:

# cardctl scheme default
# cardctl scheme gtwireless

Now, for each entry in wireless.opts, create an entry in network.opts in the
same directory. These settings will be used in bringing a wireless interface
up. You can use DHCP, BOOTP, or statically assigned IPs. Two basic entries
might be:

case "$ADDRESS" in
gtwireless,*,*,*)

INFO="GT LAWN Network Setup"
DHCP="y"
;;
dorm,*,*,*)
INFO="Dormroom WAP settings -- NO DHCP FOR SECURITY REASONS"

DHCP="n"
BOOTP="n"
IPADDR="192.168.1.1"
NETMASK="255.255.255.0"
NETWORK="192.168.1.0"

BROADCAST="192.168.1.255"
GATEWAY="192.168.1.1"
DOMAIN="headnut.org"
DNS_1="192.168.1.1"
DNS_2="128.61.15.251"

DNS_3="128.61.15.244"
MOUNTS=""/* For any NFS mounts located in /etc/fstab */
MTU=""/* Override the default MTU here */
start_fn() { return; }
stop_fn(){ return; }

NO_CHECK=n/* Card ejection policy */
NO_FUSER=n/* Card ejection policy */
;;
*,*,*,*)
DHCP="y"
;;
esac

Needless to say, any of these options can be omitted or simply set to "".
There are many more, so feel free to look at Jean Tourrilhes’ pcmcia-cs
website: http://www.hpl.hp.com/personal/Jean_Tourrilhes/Linux/. The PCMCIA.txt
file under that directory is particularly concise (HINT HINT!).

Since we just created all the scheme information needed to start this puppy,
let’s change our default scheme. Remember the /etc/conf.d/pcmcia file? There
is another option you need to add to the file:

SCHEME="default_scheme_name_here"

Assuming your initialization scripts work correctly, upon the next reboot, all
will be happy in your wireless world.

4. Wireless Extensions

Wireless extensions under Linux have been made possible by Jean Tourrilhes.
Now in their 25th version, they merely consist of a proc file. 🙂

/proc/net/wireless contains all the networking stats you can pull from the
kernel and/or drivers. There are tools like iwconfig/iwspy/iwlist that allow
a user to poll data easily from this proc file, but that’s all, folks.

Jean's tutorial, Linux.Wireless.Extensions.html, can be found on the
wireless extensions page of his website (the link is given above).

The iwconfig utility acts much like ifconfig for wireless cards. In fact,
much of its code was taken from ifconfig. It is somewhat self-explanatory.

iwspy sounds like so much more than it really is. It can be used to pull
statistics for packets signed with specific MAC addresses. The basic syntax
is:

iwspy interface [[+/-] [ip_addr] [hw hw_addr]]

A third utility, iwpriv, is used by some drivers (like the patched Orinoco) to
extend the functionality of the system. By using ioctl(), it allows for a very
extensible solution to the rather rigid driver structure provided by Linux.
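
A couple of quick examples (eth1, the IP address, and the set_port command
are only illustrations; private commands vary from driver to driver):

iwspy eth1 + 192.168.1.1     # start tracking link quality for that node
iwspy eth1                   # read the collected per-node statistics
iwpriv eth1                  # list the private commands this driver exposes
iwpriv eth1 set_port 2       # example private command (driver-specific)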

5. IEEE 802.11 Family

As noted above in the Overview, there are quite a few specifications in the
802.11 family. The most common (and the one we run at GT) is 802.11b. This
will be the one we touch most upon. First, however, let us discuss the others.

802.11a operates in the 5 GHz frequency range, its modulation driven by the
OFDM protocol. This combination allows for speeds of up to 54 Mbps, but
with a VERY limited range. Users should opt for 802.11a if they need the speed
enhancement, if they are in an area filled with 2.4 GHz traffic, or if the
user base for wireless applications is very dense. Since 802.11a and 802.11b
operate on (a) different frequencies and (b) different modulations, they are
completely incompatible, ruling out any straightforward upgrade path from
one to the other.

802.11g operates in the 2.4 GHz range, using the same modulation as 802.11a.
This protocol has not been fully standardized at the time of this writing,
however, so many things can change between now and then. 🙂 Its main
advantage is that it is compatible with 802.11b in terms of frequency, so it
holds higher potential as an upgrade solution later down the line. The very
first 802.11g-enabled devices are just beginning to come out on the market,
following an early draft of the standards document.

Now for the crux of this section …

802.11b, also known as WiFi, is by far the most popular of all 802.11 specs.
Its popularity came with the DSL and cable modem boom a few years ago,
with every Tom, Dick, and Harry buying one of those Linksys routers and
some really cheap-assed WPC11 cards for their computers. (More on how this
is advantageous to YOU later.) According to some reports on the Internet,
with directional antennas the range can be over 4 miles! More realistically,
however, the range for a 1 Mbps signal is limited to under 800 feet
unobstructed, and less than that when walls and wiring get in the way of the
signal. For an 11 Mbps signal, the wireless card must be within 150 feet of
the access point.

For information about extending WiFi’s range, visit:
http://www.pbs.org/cringely/pulpit/pulpit20010628.html

There are two modes WiFi can run in: Ad-Hoc and Infrastructure. Ad-Hoc means
two or more clients connect to one another independent of an access point or
central means of regulating the traffic flow. Infrastructure mode depends on
an access point to handle all base communication between clients on the node.

Under 802.11, there are 11 separate "channels" numbered 1 through 11. Each
channel represents a separate wireless LAN. These roughly spherical coverage
cells can be interleaved so long as no two cells on the same channel "touch".
Typically, any given environment only needs 3 channels (1, 6, and 11) to
cover an enormous area.

6. Security under 802.11b

Security on a wireless network is ... touchy at best. It is best
accomplished using IPSec or some other point-to-point protocol, but there
do exist built-in methods of encrypting the data. Wired Equivalent Privacy,
or WEP, encrypts all packets on a node using a 64-bit or 128-bit key.
WEP is seeded by either a passphrase or a key. (Georgia Tech uses a 64-bit
key-based system.)

Needless to say, it sucks. Anyone, given enough time (usually less than 24
hours), can crack a WEP and read all your nice data being broadcast everywhere.
For some insane reason, the CIA/NSA have approved usage of specific 802.11b
applications. We’ll see …

7. Wireless Fun …

Twice now I have noted the lack of security inherent in the system. The
first was the widespread usage of Linksys routers for DSL and cable modems.
A Linksys router ships with these factory defaults:

IP Address = 192.168.0.1
DHCP Range = 192.168.0.100 - 192.168.0.254
Username = ""
Password = "admin"
WEP = disabled

Which means if you can find any of these, renew your IP address, and open
Mozilla, you have complete access to the Wireless Access Point (WAP).
Just visit http://192.168.0.1 and enjoy!

The second security concern with 802.11 is the WEP. Even a network secured
with a WEP can be decrypted with enough time. I suggest you look into network
sniffers like Ethereal and Kismet.

Linux 2.6/3.0 Changes

Table of Contents

1. Introduction

Linux was written back in the 90s by Linus Torvalds. It aimed to be a
POSIX-compliant Unix clone for x86 machines, but has grown into the
friendly penguin OS we all know and love. "Linux" actually refers to
the kernel of the operating system.

Definition: kernel

\Ker’nel\, n. (1) the inner and usually edible part of a
seed or grain or nut or fruit stone; (2) the choicest or
most essential or most vital part of some idea or
experience; (3) (operating system) the essential part of
Unix or other operating systems, responsible for resource
allocation, low-level hardware interfaces, security, etc.

http://dictionary.reference.com/search?q=kernel

When dealing with the Linux kernel, there are two categories one
generally falls into: stable and unstable. Stable kernels, meaning
they have been tested extensively and are the production version, have
an even minor number. Unstable versions have an odd minor number.

Kernel versioning schema:

x.y.z          2.4.21 (stable)
^ ^ ^          2.5.66 (unstable)
| | |          2.6.1  (stable)
| | +--- release number
| +----- minor version
+------- major version

As a rule, NEW USERS SHOULD NOT MESS WITH UNSTABLE KERNELS! 🙂
However, if you feel adventurous and have wonderful backups/don’t care
about your data, go for it. The official kernel website can be found
at http://www.kernel.org/. If you don’t feel like waiting, point your
browser to http://www.kernel.org/pub/linux/kernel/ for a full listing
of kernel-related source. This archive contains everything from the
original 1.0 kernel up to the latest unstable release, including some
obscure patches perhaps not found elsewhere.

2. Installing 2.5

The latest unstable kernel as of this writing is 2.5.66. Soon
(e.g. within my lifetime) the unstable branch will be locked,
accepting no further changes, and release candidate testing will
begin. Since some things have already been unofficially locked, we
can start discussing those. One of these is the install methodology.

From kernel 2.2, the install script has been:

make mrproper
make menuconfig         --- choose the options you want
make dep                --- figure out all dependencies
make bzImage            --- make the kernel proper
make modules            --- compile the modules
make modules_install    --- install the modules

Starting with 2.5, the script has been significantly cut back:

make menuconfig
make

By default ‘make’ will create the kernel proper for your architecture
and compile/install all modules. It supports -jN for parallel make
operation (running more than one copy of make at the same time). For
the kernel hackers out there, the make script also will compile
individual files by typing ‘make filename’. For the graphically
inclined, ‘make xconfig’ can replace ‘make menuconfig’, using the
latest in Qt graphic components. (It is slow, but bearable.)
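
For example (the single-file path is just an illustration):

make -j2                 # build the kernel and modules, two jobs in parallel
make fs/ext2/inode.o     # recompile just one file
make xconfig             # the qt-based configuration front-end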

Rumors started awhile back about the kbuild system being used in 2.5;
for those that don’t know, kbuild is an alternative build system for
large projects. It has NOT been included in this release.

3. Major changes in 2.5

  • /proc/stat format changed
  • the in-kernel module loader now frees memory marked __init or __initdata
  • kernel build system (see above section)
  • I/O subsystem reworked
    • faster due to new memory layers
    • 512 byte granularity on O_DIRECT I/O calls
    • access up to 16TB on 32-bit architectures, 8EB on 64-bit
  • /proc/sys/vm/swappiness now allows users to set preference for page
    cache over mapped memory
  • Ingo Molnar’s O(1) scheduler
    • sched_yield() problem
  • preemptive patches included kernel-wide
  • futexes (Fast Userspace Mutexes) (http://ds9a.nl/futex-manpages); see
    the sketch after this list
  • kernel threads improvements
    • ptrace functionality
    • /proc updated to reflect threads
  • core dump with style (/proc/sys/kernel/core_pattern)
  • ALSA included as standard (entered late into 2.4)
    • replaces OSS but provides backwards-compatibility
  • AGP 3.0 supported by overhauled agpgart
  • Faster system calls for chips that support the SYSENTER extension
    • Intel Pentium Pro/AMD Athlon and higher
    • need updated glibc (>= 2.3.1) for this to work
  • SCSI is almost completely broken
  • quotas have been completely rewritten
  • CD writing/reading overhaul
  • New filesystems (JFS, XFS, NFSv4, sysfs, CIFS, etc.)
  • CPU Frequency Scaling (like SpeedStep(tm) technology)
  • IPSec included in the mainstream
  • Number of ports expanded
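
About the futexes mentioned above: the fast path of a futex-based lock
is a single atomic operation in userspace, and the kernel is only
entered when there is contention. Here is a minimal, hypothetical
sketch of the two raw operations (the helper names are mine; real
programs normally reach futexes only through their threading library):

    #include <linux/futex.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* Sleep only while *addr still holds 'expected' (avoids lost wakeups). */
    static int futex_wait(int *addr, int expected)
    {
        return syscall(SYS_futex, addr, FUTEX_WAIT, expected, NULL, NULL, 0);
    }

    /* Wake up to 'count' tasks sleeping on this futex word. */
    static int futex_wake(int *addr, int count)
    {
        return syscall(SYS_futex, addr, FUTEX_WAKE, count, NULL, NULL, 0);
    }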

4. Deprecated in 2.5

  • khttpd (kernel-based webserver)
  • DRM for XFree86 4.0 (upgraded for 4.1.0)
  • system call table no longer exported
  • ham radio support moved to userspace
  • must boot from bootloader (e.g. no more straight floppy-based
    booting)
  • swap partitions using version 0 (only supports >= v1)
  • Compressed VFAT removed
    • remember the old DriveSpace from DOS 6.2?
  • usbdevfs
  • elvtune

5. The two R’s of CDs

Beginning with the 2.5 kernel, CD writing and ripping can be performed
under DMA mode. For the hardware illiterate, DMA stands for Direct
Memory Access; it allows certain devices to get a request for a block
of information and fill that request directly to preallocated memory,
leaving the CPU free to do other work. Hard drives and network cards
are the two most well-known DMA devices. Without DMA, a computer must
use PIO (Programmed Input/Output). Under PIO, the CPU must handle
the task of moving individual bytes of data from the device’s buffer
to RAM. The advent of DMA brought speed increases in hard drive and
compact disc technologies. Until now, however, CD writing and audio
ripping were still limited to PIO operation.

This contribution to the kernel was made by Jens Axboe
(axboe at suse dot de). It has actually been available in patch form for
the 2.4 kernel for some time now, though it was not merged until 2.5.
His work generally centers around multimedia block devices
(CDs and DVDs), so his website has great documentation about his speed
enhancements in this area.

http://kernel.org/pub/linux/kernel/people/axboe/

6. The (infamous CS3210) O(1) Scheduler

One fateful day, Ingo sat at his computer, a case of Jolt at his feet
and the stench of 1000 long coding sessions all around. As his
fingers began to type, he wrote …

Okay, what actually happened was an attempt to improve the scheduler
latency. Since the 1.0 kernel, the scheduler in Linux has always been
O(n). The reason for this is the data structures used to represent
active processes in Linux.

6.1. Brief History

A process in Linux is actually represented by a struct. This
struct contains things like the process id (PID), the nice value (a
hint about how much CPU time the process should get), and so on.
When a process wanted to be given processor time, its struct was
added to the scheduler’s "active processes" list.
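
Roughly, and with almost everything omitted, such a struct might look
like the following; the field names here are illustrative, not the
kernel’s exact ones:

    /* Heavily trimmed illustration -- the real struct task_struct in
     * include/linux/sched.h has many dozens of fields. */
    struct task {
        int          pid;        /* process id */
        long         nice;       /* user-supplied priority hint */
        long         counter;    /* CPU time remaining in this epoch */
        struct task *next_task;  /* 2.4 keeps runnable tasks on a list */
    };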

Barring special real-time processes, each process would be given a
quantum of time per epoch. What’s an epoch? Imagine the point in
the course of human history where caffeine can no longer be
produced, i.e. the end of time as we know it. To the scheduler,
an epoch occurs when all the processes that could be scheduled have
been scheduled. That is, every process available has run on the
CPU for at least its quantum of time. (A quantum is a small slice
of something.)

The scheduler in 2.4 simply maintained a linked list of processes
and would scan the whole list every time it was run to determine
which process should go next. The scheduling priority in Linux allows
some processes (interactive ones) to be scheduled before
non-interactive processes that simply hog the CPU. You can influence
this with the ‘nice’ command; a lower nice value means a higher
priority. (The familiar nice values run from -20 to 19, which the
kernel maps internally onto a range of roughly 0 … 40.)

6.2. Current History

Scanning this linked list EVERY SINGLE TIME the scheduler was run
kept the computer in kernelspace far too long: latency. In
attempting to improve this latency, Ingo Molnar reasoned that
finding the highest-priority runnable process could be done with a
priority queue. (Again, we’re not considering real-time processes
yet!) His kernel patch takes the original linked list and
transforms it into two priority queues, one of expired processes
and one of active processes.

The highest-priority process is removed from the "active" priority
queue. It runs for some quantum. If it is preempted or yields
control and still has quantum left, it is placed back on the active
queue. If it has no quantum left, however, its quantum is refilled
and the process is thrown onto the "expired" queue. This repeats
until the active queue is completely empty. At that point, the
expired queue becomes the active queue, and vice versa. Then the
show continues….
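
The following toy userspace model (my own sketch, not the kernel’s
code) shows the essential trick: one run list per priority level plus
a bitmap of non-empty levels, with the active and expired queues
swapped once the active one runs dry.

    #include <stdio.h>

    #define NPRIO 8     /* the real scheduler uses 140 priority levels */

    struct task {
        const char  *name;
        int          prio;      /* 0 is the highest priority here */
        int          quantum;   /* ticks left before expiring */
        struct task *next;
    };

    struct runqueue {
        unsigned int  bitmap;          /* bit p set => queue[p] non-empty */
        struct task  *queue[NPRIO];
    };

    static struct runqueue rq_a, rq_b, *active = &rq_a, *expired = &rq_b;

    static void enqueue(struct runqueue *rq, struct task *t)
    {
        t->next = rq->queue[t->prio];
        rq->queue[t->prio] = t;
        rq->bitmap |= 1u << t->prio;
    }

    static struct task *dequeue_highest(struct runqueue *rq)
    {
        /* The kernel finds the lowest set bit with a single bit-search
         * instruction, which is what makes this step O(1). */
        for (int p = 0; p < NPRIO; p++) {
            if (rq->bitmap & (1u << p)) {
                struct task *t = rq->queue[p];
                if (!(rq->queue[p] = t->next))
                    rq->bitmap &= ~(1u << p);
                return t;
            }
        }
        return NULL;
    }

    int main(void)
    {
        struct task ed = { "editor", 1, 2 }, cc = { "compiler", 5, 2 };
        enqueue(active, &ed);
        enqueue(active, &cc);

        for (int tick = 0; tick < 8; tick++) {
            struct task *t = dequeue_highest(active);
            if (!t) {                          /* active queue empty:    */
                struct runqueue *tmp = active; /* swap the two queues    */
                active = expired;              /* and start a new epoch  */
                expired = tmp;
                if (!(t = dequeue_highest(active)))
                    break;
            }
            printf("tick %d: running %s\n", tick, t->name);
            if (--t->quantum > 0)
                enqueue(active, t);   /* still has quantum: stay active */
            else {
                t->quantum = 2;       /* used up: refill and expire */
                enqueue(expired, t);
            }
        }
        return 0;
    }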

Priority queue removal in his scheduler runs in O(1). It uses
optimization techniques and hardware-assisted bitmap searches that
are beyond the scope of this presentation. Feel free to ask any
questions afterwards, however. There are a few problems with it:

  1. Interactive processes run for their quantum and then do not get
    scheduled again until ALL OTHER PROCESSES (including
    non-interactive ones) have expired their quanta. This leaves
    room for lag when the system is under high load.
  2. sched_yield() now causes processes to sleep for quite some
    time due to the dual-queue approach. This should not affect
    most programs, yet some (like OpenOffice) were written to take
    advantage of the old scheduler's behavior; see the snippet
    below. These programs will seem to respond more sluggishly
    until their developers recode those portions.
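
The pattern in question looks roughly like this (an illustrative
snippet, not any application's actual code): the loop spins on
sched_yield() waiting for some condition. That was cheap under the
old scheduler, but under O(1) each yield parks the task behind the
rest of the runnable processes.

    #include <sched.h>

    /* Hypothetical flag set by another thread when the resource frees up. */
    volatile int resource_free;

    void wait_for_resource(void)
    {
        while (!resource_free)
            sched_yield();   /* politely give up the CPU and try again */
    }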

7. Excuse me, I need to interrupt you

The preemptive patches in the kernel are quite a leap forward in
increasing the responsiveness of the Linux system overall. Combined
with Ingo’s O(1) scheduler, there is an amazing decrease in lag time.
It should be noted, however, that the O(1) scheduler and the
preemptive patches are developed independently of each other; you can
patch a 2.4 kernel with the O(1) scheduler, the preemptive patches,
or both.

If a process is preemptible, it means it can be interrupted
mid-execution to allow another process (usually with higher priority,
like an interrupt handler) to run. Code running inside the Linux
kernel, however, has never been preemptible. That is, once a process
makes a system call, it cannot be taken off the processor until the system
call has completed. (On the way out of the system call, the kernel
checks to see if the process needs to be rescheduled.)

The preempt kernel patch, maintained by Robert Love, allows 99.9% of
the kernel to be preempted. There are a few areas that cannot be
preempted (the scheduler and some SMP synchronization code, for
instance), but they disable the preemption mechanism for the duration
of their execution. More information about the preempt patches can be
found at their website:

http://kpreempt.sourceforge.net/
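
As a rough illustration of how those unsafe regions are handled,
kernel code that must not be preempted (per-CPU data handling, for
example) brackets itself with preempt_disable() and preempt_enable(),
which raise and lower a per-task preemption count. The function below
is a hypothetical kernel-side sketch for recent 2.5 trees, not
something to compile in userspace:

    #include <linux/preempt.h>

    /* Hypothetical example: update per-CPU state without being
     * preempted (and possibly migrated) in the middle of the update. */
    static void touch_per_cpu_state(void)
    {
        preempt_disable();   /* bumps preempt_count: no preemption from here */
        /* ... work with the per-CPU data ... */
        preempt_enable();    /* drops preempt_count, reschedules if needed */
    }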

8. User-mode Linux

Run Linux inside Linux! Wait a minute … has the penguin gotten to
you again?!

UML is actually a kernel virtual machine built into the Linux kernel.
It is designed to help developers poke around with the more sensitive
internals without having to crash machines and reboot servers. Some
people have gone so far as to run UML as their "main" system (David
Coulson and http://usermodelinux.org/ being the most notable).

With UML, a developer can test kernel code using normal tools like
electric fence, gdb/dbx, etc. The "real" kernel keeps the usermode
kernel separate from the hardware unless you allow it access; even so,
there are quite a few things that just cannot be done because of
abstraction and handling routines. All in all, however, it is a great
tool for the kernel (non) savvy.

9. Filesystems

Filesystems added in the 2.5 kernel include NFSv4, XFS, JFS, and sysfs.

NFSv4 is the new reimplementation of NFS, and is to include
support for ACLs, a "pseudo filesystem" for client caching of
directories and data, state management, and file locking.

sysfs, not to be confused with the system call of the same name, is
a memory-based filesystem for the representation and modification
of kernel objects (kobjects). This is similar to the functionality
currently provided by /proc, but now everything is a bit more well
defined, and sysfs doesn’t contain process information.

JFS is IBM’s journaling filesystem (from AIX) and XFS is Silicon
Graphics’ journaling filesystem (from IRIX). Filesystems were
covered more thoroughly by Ben McMillan at:
http://lugatgt.org/articles/filesystems/

10. Device Mapper

The kernel device-mapper is a driver that allows the definition of new
block devices consisting of sectors of existing block devices. LVM2
uses this to define its logical volumes.

11. Quota

The 2.5 kernel uses a new quota format that allows for 32-bit UIDs
and GIDs, needed for filesystems such as ReiserFS and XFS.

12. CPU Frequency Scaling

CPU frequency scaling allows the user to change the clock speed of the
CPU while the computer is running, which is useful for laptops.
Support for this exists in 2.5 proper.

13. IPSec

2.5 adds support for a new protocol family, PF_KEY, and for IPsec
network encryption. The IPsec support was ported from KAME, and is
used by VPNs and the like.
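
As a small illustration (my own sketch, and one that typically needs
root privileges to run), a key-management daemon talks to the kernel’s
IPsec machinery by opening a PF_KEY version 2 socket as defined in
RFC 2367:

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <linux/pfkeyv2.h>

    int main(void)
    {
        /* PF_KEY sockets carry SADB messages between the kernel's IPsec
         * security association database and userspace keying daemons. */
        int s = socket(PF_KEY, SOCK_RAW, PF_KEY_V2);
        if (s < 0) {
            perror("socket(PF_KEY)");
            return 1;
        }
        printf("PF_KEY socket opened (fd %d)\n", s);
        close(s);
        return 0;
    }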

14. Resources