Presented by David L. Cantrell on October 2, 2002
Table of Contents
- 1. Background
- 2. Terminology
- 3. How CVS Works
- 4. Problems With CVS
- 4.1. Permissions and ownerships
- 4.2. Symbolic links
- 4.3. Binary files
- 4.4. You cannot delete directories
- 4.5. Spaces in filenames
- 5. Commands
- 6. Using CVS to Synchronize Your Home Directory
- 6.1. The Problem
- 6.2. Layout your new home directory
- 6.3. Import
- 6.4. Add files
- 6.5. Update systems
- 6.6. Take your home directory to CVS
- 6.7. Make CVS a habit
- 7. Conclusion
- 8. Resources
CVS stands for the Concurrent Versions System (or Concurrent Versioning
System or Concurrent Versioning Software, and so forth) and is pronounced
Cee Vee Ess. It was designed as a replacement for RCS, or the Revision
Control System.
1. Background
The idea of tracking changes made to a development project has been around
for a while. It’s not a particularly interesting job, as evidenced by the
tools we have today. Some of the more popular source code control systems
are CVS, RCS, and SCCS.
| CVS | Concurrent Versions System. This is the de facto standard used in the open source community. |
| RCS | Revision Control System. The old system used before we had CVS. |
| SCCS | Source Code Control System. Used by companies like Sun. You won’t find SCCS tools in the open source world. If you’re itching to see SCCS, look in /usr/ccs/bin on a Solaris machine. |
There are some new projects underway to replace CVS. While CVS does work,
it has some well-known limitations that are now becoming an issue as
CVS-controlled projects become larger and larger. A popular one in the
Linux community is BitKeeper (or simply bk). BitKeeper is made by
BitMover, Inc. They designed bk to be a good enough tool that Linus would
be willing to use it for kernel development. A difficult task, at best.
BitKeeper is in use in a lot of places, but one of the major differences
that you’ll find between bk and CVS is that you can’t get the bk source
code, only binaries. This is because BitKeeper is not an open source
project; it’s commercial software.
And there is yet another hopeful CVS replacement, Subversion. The
Subversion project was started by the CVS authors (!) because they wanted
to correct CVS’s deficiencies and make the upgrade somewhat painless.
We’ll see what becomes of that project; so far it’s looking pretty good.
… So, with all the above, why CVS? Several reasons, and most of them
are related to the current status of the other projects:
- CVS is well established and pretty much error-free.
- It’s open source, which means you can get it working on basically
any platform.
- I don’t like trusting ALL of my data, or even just my development
projects, to a new, experimental, and developmental source control
system.
In a few months, when I decide to look at Subversion again, I may choose
to move over to that. But for now I’m going with CVS. If you decide to
never use CVS for your own purposes, it’s still a good idea to understand
how it works because it is used in so many projects.
2. Terminology
- Repository
- The name of the CVS server. This is where all of the
data is stored, it’s where your changes go, and it’s
where you pull updates from.
- Tag
- This is the CVS name for a version or release.
- Module
- The project you work on from the repository. Typically a
repository will only have a few modules, one for each major
aspect of development. For example, FreeBSD has the ‘src’,
‘doc’, and ‘ports’ modules.
- Branch
- Concurrent development on the same module. CVS does a
really neat trick which allows you to branch development at
any time and it will start tracking changes specific to
that. This is useful for software development because you
can make a branch after a release and use it for security
updates.
- Pserver
- CVS’s internal server mechanism. Don’t use this.
- Attic
- CVS never deletes a file from the repository. Files
marked as deleted get put in the attic.
3. How CVS Works
The terminology section may have given you some hints as to how CVS works
and what it actually does besides "source code control." The easiest way
to think about it is as a development mediator. When multiple
developers work on the same project, CVS handles merging all of their
work, sending out changes to each developer, and so forth.
Life in CVS begins by creating a project, which consists of at least one
module. In this module, you import the files that belong to it. In the
world of software development, this consists of C source code files,
header files, Makefiles, support files like X pixmaps, and documentation.
What doesn’t go in CVS? Anything that can be automatically generated.
Object files that the compiler creates, dependency files, the program or
library executables, and … configure scripts. Autoconf can regenerate
the ‘configure’ script, so you don’t want this in CVS. Before
distribution in gzipped tar format, most developers run ‘autoconf’ to
create that script for you.
Your source is now under CVS control. To work on it, you check out a copy,
make your changes, and commit them to the repository. Periodically you
will run the update command to pull down any changes from other
developers. And, if it’s like any other CVS project, you’ll have merge
conflicts that will need resolving by hand.
That’s the big picture. The important parts are understanding that you
are working on a copy of what’s under CVS control. You use CVS to manage
the changes to your copy and other copies.
4. Problems With CVS
4.1. Permissions and ownerships
CVS does not store permissions and ownerships on files. You cannot flag a
file under CVS control as world readable, for instance. When you check
something out from CVS, the files come to your workstation and CVS chowns
and chmods them according to your umask. When you commit changes, CVS
pulls in your changes, but doesn’t modify the permissions in the
repository. Most projects use different group permissions on the
repository so they can restrict access to project developers.
People using CVS for more than software development make use of the post
operation scripts to overcome this limitation. You can tell CVS to run a
script or program after a CVS update or commit. In this script you can set
permissions and other such things that CVS doesn’t handle.
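As a rough illustration of that kind of script, the sketch below reapplies permissions on a working copy. The function name fix_perms and the ‘bin’ and ‘private’ directory names are assumptions made for the example, and how you arrange for it to run after an update or commit is left out.

```shell
#!/bin/sh
# fix_perms DIR: reapply permissions that CVS does not record.
# Hypothetical helper; the directory names below are assumptions.
fix_perms() {
    top=$1
    # Scripts kept in bin/ should be executable by their owner.
    if [ -d "$top/bin" ]; then
        find "$top/bin" -type f -exec chmod u+x {} +
    fi
    # Keep private material unreadable by group and others.
    if [ -d "$top/private" ]; then
        chmod -R go-rwx "$top/private"
    fi
}

# Example: run by hand from anywhere after updating the working copy.
# fix_perms "$HOME"
```

Keeping the rules in one small function means the same script can be run by hand or from whatever post-operation mechanism your setup provides.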
4.2. Symbolic links
You cannot store symbolic links in CVS. Forget about it, ain’t gonna
happen. Use the script hack above or just say good bye to symbolic links
forever.
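One way to apply the script hack to symbolic links is to keep a plain text list of links under CVS control and recreate them after a checkout. The sketch below assumes a made-up manifest format of one "target linkname" pair per line, with no spaces in either name; restore_links is a hypothetical helper, and it relies on the -n option of ln, which GNU and BSD systems provide.

```shell
#!/bin/sh
# restore_links MANIFEST TOP: recreate symbolic links listed in a plain
# text manifest that is itself kept under CVS control.
# Each manifest line is "target linkname" (an assumed format).
restore_links() {
    manifest=$1
    top=$2
    while read -r target link; do
        # Skip blank lines and comments.
        case $target in
            ''|'#'*) continue ;;
        esac
        # -f replaces an existing link, -n avoids descending into it.
        ln -sfn "$target" "$top/$link"
    done < "$manifest"
}

# Example: restore_links "$HOME/etc/links.txt" "$HOME"
```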
4.3. Binary files
CVS provides source code control. You can check out old versions,
generate patches between releases, and many other tasks specific to
software development. To perform these tasks, CVS must deal with plain
text files. This is how it can track changes between the files. This
presents a major issue for binary files. So much of an issue that CVS
just doesn’t handle binary files. Now we’re starting to have some
problems. To get around this problem, we can flag a file as ‘binary’ and
CVS won’t track changes to it. It will just make sure one copy is in the
repository. Generally speaking, this works fine, but it means you can’t
use CVS to track changes between, say, JPEG image files.
4.4. You cannot delete directories
Once a directory is added to a CVS project, you can’t delete it. Remember
I said that CVS never deletes a file, it just moves it into the attic?
Well, if you check out an old release that had a now deleted file in a now
deleted directory, CVS needs to know where to put it. The empty directory
is where it will put that. Because of this, the convention is to treat
empty directories as deleted, which is what you want.
4.5. Spaces in filenames
Generally speaking, CVS does not like to deal with spaces in filenames. I
have seen tricks to make CVS deal with this, but I prefer to not have
spaces in filenames anyway. For some people, this may present an issue.
5. Commands
To perform a CVS operation, you run cvs and specify one of the commands
below. Each command has a help screen, which you can get with this
syntax:
cvs command --help
Below are the major CVS commands; run cvs --help-commands for a
complete listing. All of these commands assume you have a working CVS
repository (this is covered in the second part of this presentation).
5.1. import
The import command is for creating new CVS-controlled projects. You run
this command from the directory you want in CVS. For software development
projects, this is usually your source directory. To import the current
directory, use this command:
cvs import -d project-name vendor-tag release-tag
The -d flag tells CVS to use each file’s mtime as the import time. This
way you can import projects that have existed for years outside CVS
control and still preserve their timestamps. The project name is what you want
to call the CVS module, the vendor-tag mostly goes unused, and the
release-tag is just a symbolic name to represent the import. For the vendor
tag I use BURDELL. For the release tag on import operations, I use
‘start’.
5.2. checkout
The checkout command is what most people are probably familiar with. The
simplest syntax is:
cvs checkout project module
This checks out the latest revision of the specified module. Adding the
-r [rev] switch will check out a specific revision instead.
5.3. export
This command works like the checkout command, but it does not "check out"
a copy to work on. That is, the CVS server does not know you are working
on that copy. This command mainly exists for creating source archives for
distributions. Once you tag the release, you export it to another
directory and it’s free of the CVS repository and does not have those
‘CVS’ subdirectories all throughout the tree.
5.4. add
The add command is used to add files to a module you have checked out.
You must specify each file to add:
cvs add files...
If you want to add a directory, use the add command on the directory, but
then change in to the directory and add each file. The add command has a
special flag for adding binary files, the -kb switch. Use this switch
when you are adding any file that is not plain text.
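To make the ordering concrete, here is a small sketch that prints the add commands for a directory tree, parents before their contents, and flags likely binary files with -kb. It only prints a plan for you to review. The helper name plan_adds and the list of extensions are assumptions, and depending on your CVS version you may still need to change into each directory before running the printed commands.

```shell
#!/bin/sh
# plan_adds DIR: print the "cvs add" commands needed to add a directory
# tree, parents before children, flagging likely binary files with -kb.
# The extension list is an assumption; extend it for your own files.
plan_adds() {
    # Skip CVS bookkeeping directories; sorting puts each directory
    # ahead of the files inside it.
    find "$1" -name CVS -prune -o -print | LC_ALL=C sort |
    while read -r path; do
        if [ -d "$path" ]; then
            echo "cvs add $path"
        else
            case $path in
                *.jpg|*.png|*.gif|*.gz|*.pdf) echo "cvs add -kb $path" ;;
                *) echo "cvs add $path" ;;
            esac
        fi
    done
}

# Example: plan_adds media
```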
5.5. remove
To remove a file from a module (or project), you must use the cvs remove
command. The biggest problem people have with this is that you must
specify each file separately. It’s really not that big of an issue once
an entire project is under CVS control, you’ll find that you rarely remove
large sets of files. If you need to remove an entire directory tree
that’s under CVS control, use the -R switch to recursively remove the
directory. The syntax:
cvs remove [-R] [-f] files...
This CVS command has one major annoyance: you cannot run ‘cvs remove’ on a
file until you actually rm it from the filesystem. To get around this
default behavior, use the -f switch, which tells CVS to rm the
file before removing it from the project. Very useful.
5.6. update
You will use this command as much as the commit command. The CVS update
command brings your local copy of the project up to date with all the
changes available in the repository since you last updated. This command
merges differences and also lets you know of merge conflicts. The syntax:
cvs update
Run this command from the main project directory and CVS will check the
repository and pull down and merge all the changes. The output from this
command can be a bit cryptic. The program displays a letter indicating
the operation, followed by the file involved with that operation. Here
are the letter codes you will most likely see:
| U | file updated |
| A | new file added |
| P | file patched (like U, but not the entire file) |
| R | file removed |
| M | changes merged |
| C | MERGE CONFLICT |
| ? | cvs hasn’t got a clue |
The two options I use with cvs update are -d and -P. These two options
bring down all directories in the module and then "prune" empty ones,
which we assume are deleted directories.
If you get merge conflicts (you will), cvs does this really nice thing by
default where it stomps all over your copy of the file in conflict. It is
up to you to then move the file out of the way, get the copy from the
repository, diff the files, and merge the changes by hand. If you are
predicting merge conflicts, it’s a good idea to use the common CVS option
-n, which reports what would be done to your copy without actually doing
it. So running this command:
cvs -n update
Will report what files have merge conflicts without stomping all over
them.
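If the report is long, a small filter can pick out just the conflicts. This is a sketch that assumes the one-letter output format shown in the table above; list_conflicts is a hypothetical helper name.

```shell
#!/bin/sh
# list_conflicts: read "cvs -n update" output on stdin and print only
# the files flagged with C (merge conflict).
list_conflicts() {
    sed -n 's/^C //p'
}

# Typical use from the top of a working copy:
#   cvs -n update 2>/dev/null | list_conflicts
```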
5.7. tag
This command is used for software development projects under CVS control.
If you’re tracking your home directory with CVS, you probably won’t make
releases of it at various points.
The tag command marks the state of the repository with a symbolic name.
Once you tag the module, you can later check out a specific tag by name.
The tag command places the symbolic tag on your checked out copy of the
project, while the rtag command puts the tag in the repository and does not
affect your copy.
When working with branches, you make use of the tag command. You can
create and merge branches with the tag command.
5.8. log
When you make commits, CVS will prompt you for a log entry. Using this
wisely will produce a log for the project that can be referred to later.
Many people skip this step. It really doesn’t matter, but I like logs
because they make tracking down changes easy. With a CVS log, you get the file
listing, the revision numbers, and the annotation so you can quickly get
back to working copies of files.
5.9. diff
The diff and rdiff commands can display the changes between revisions of
files in either unified or context format. If you need to manually
resolve a merge conflict, generate a patch to a tagged release based on
the current developmental copy, or to see what you changed when you broke
something, the diff command is what you want to use. The syntax:
cvs diff -r rev -r rev files...
The cvs diff command has a lot of options, most of which are passed through
to the underlying diff program. You can diff specific revisions, files from
specific dates, and you can process directories recursively to generate large patches.
5.10. Other Commands
The CVS commands that begin with an ‘r’ are like the similar command
above, but they operate on the repository instead of the checked out copy.
You may also be familiar with the ‘login’ and ‘logout’ commands. These
are generally used with the CVS pserver mechanism and not when a
repository serves via ssh.
5.11. Options
There are some common CVS options, such as -z (for compression), that apply
to all CVS commands. You can use the --help-options switch to see a list
of those.
On the subject of command options, the location on the command line where
you put options does matter. There are generic CVS options and command
specific options. You need to follow this order when using options:
cvs [common opts] [cvs command] [command opts] files or something
You can alias the CVS commands to the command plus the common options you
prefer. This is done in the ~/.cvsrc file.
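For illustration, a ~/.cvsrc along the following lines would always compress traffic, always pass -d and -P to update, and always produce unified diffs. A line beginning with ‘cvs’ holds common options; any other line names a command followed by its default options. The particular choices here are only examples.

```
cvs -z3
update -dP
diff -u
```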
5.12. RCS variables
CVS uses the RCS file format to track changes. Because of this, you can
make use of special RCS variables within your plain text files. A common one is $Id$,
which expands to a description containing the RCS file name, a timestamp, and the
revision number. Another RCS variable is $Log$ which expands to the commit log for
the file. This can get really long for
files that change often (.c files for development projects), but for files
that rarely change, it can provide a quick way to look at the log. A list
of the common RCS variables:
| $Id$ | Identification string |
| $Log$ | Commit log |
| $Revision$ | Revision number |
This also brings up a good point about CVS. The RCS man pages apply to
CVS, mostly. Specifically the co(1), ci(1), rcsintro(1), and rcs(1) man
pages.
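As a small example of putting these variables to work, the sketch below pulls the revision number out of an expanded $Id$ string. The sample value, including the file name sync-home.sh, is made up for the example, and id_revision is a hypothetical helper.

```shell
#!/bin/sh
# An expanded $Id$ string has the form:
#   $Id: FILE,v REVISION DATE TIME AUTHOR STATE $
# The sample value below is made up for the example.
id='$Id: sync-home.sh,v 1.4 2002/10/02 18:30:00 david Exp $'

# id_revision STRING: print the revision field of an expanded $Id$ string.
id_revision() {
    printf '%s\n' "$1" | awk '{ print $3 }'
}

id_revision "$id"    # prints 1.4
```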
6. Using CVS to Synchronize Your Home Directory
In this example, I’ll explain how I set up CVS on my own machines to
synchronize my home directory. This is a problem that I’m sure everyone
has encountered at least once.
6.1. The Problem
When you get a new user account on a system, you are given a place for
your files, your home directory. Over time you get more and more shell
accounts, sometimes even on your own machines and you begin to lose track
of where files are and you begin to have trouble maintaining the
environment profiles between them all. This is a problem I’ve fought with
for a long time, until I read the article by Joey Hess in the September
2002 Linux Journal. Joey explained how you can put CVS to work
synchronizing your home directory. It never occurred to me to try this,
but I decided to give it a shot. It’s been working great between my
machines. Below is a short description of what I did to move my life in
to CVS.
6.2. Layout your new home directory
Projects under CVS require some thought. Not being able to freely remove
directories and having files always exist means you can’t just throw
things anywhere. Well, I guess you could do that, but it would make for a
CVS managed mess. So, I created a new directory as the working tree for
what would become my new home directory. Inside this directory I have
these subdirectories:
| GNUstep | WindowMaker profile and other GNUstep stuff |
| | All of my email (now 118MB) |
| bin | Scripts and programs I’ve written for myself |
| doc | Like ‘My Documents’ on Windows |
| etc | Location of configuration/support files for my ‘bin’ stuff |
| gt | All of my Georgia Tech class stuff (now 425MB) |
| media | Movies, pictures, and random audio files |
| src | Programs I’m working on |
| tmp | Scratch space |
That’s it. I keep classwork under the gt subdirectory, general ‘work’
goes under doc in an appropriate subdirectory, and so forth. Keeping to
this structure will ensure your cvs attic doesn’t grow enormous.
6.3. Import
I start by importing an empty directory structure. I will be holding
plain text as well as binary files in my home directory, so I need to
specify the -kb switch on some of them.
cvs import -d david-homedir BURDELL start
Once the import is complete, I remove the directory I just imported and
check it out from CVS.
6.4. Add files
With my empty directory, I start adding in files one at a time. I did
several subdirectories at once and then I’d commit the changes. This
process took a while, but I only have to do it once.
Remember to use the -kb flag for binary files. Use cvs commit to place
the new files in the repository.
6.5. Update systems
Since my home directory is huge, I can quickly lose track of what I’ve
added to CVS and what I haven’t. I use the cvs update command and look at
the lines beginning with "?" and then go and handle those files. I pretty
much used the update command as my checklist for what I still needed to
merge in.
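With a large tree the "?" lines scroll by quickly, so it can help to summarize them. The sketch below counts unknown files per top-level directory; it assumes the update output format described in the Commands section, and unknown_by_dir is a hypothetical helper name.

```shell
#!/bin/sh
# unknown_by_dir: read "cvs update" output on stdin and count the files
# CVS does not know about (lines starting with "?") per top-level
# directory. A file sitting at the top level is counted under its own name.
unknown_by_dir() {
    sed -n 's/^? //p' |
        awk -F/ '{ count[$1]++ } END { for (d in count) print count[d], d }' |
        LC_ALL=C sort -k2
}

# Typical use from the top of the working copy:
#   cvs -n update 2>/dev/null | unknown_by_dir
```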
6.6. Take your home directory to CVS
With everything in CVS, you can now take your home directory to CVS. It’s
somewhat tricky, but here’s what I did.
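$ cvs commit                     # make sure everything is in the repository
$ cd                             # go to my home directory
$ cd ..                          # step up to its parent
$ rm -rf ~/*                     # remove the old contents (* does not match dotfiles)
$ cvs co -d david david-homedir  # check out my home directory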
Now, I went to another terminal and tried logging in. Once I verified
everything was working, I logged out of the shell I did the checkout from
and started using normal shells.
6.7. Make CVS a habit
Living in CVS isn’t hard, it just requires a few extra commands on top of
the normal commands you type.
At the end of each day, you should do a cvs commit to commit your work for
the day to CVS. Each time you start working on something new, start in a
logical place and do cvs add on those files and directories. After about
a week, the CVS commands become second nature.
7. Conclusion
Is CVS the best tool for home directory synchronization? Probably not,
but for me it works fine. The advantages I get are:
- Distributed backups
- Home directory synchronization
- History
Based on those advantages alone, I think CVS is the right tool for me. It
has some shortcomings, but I think the advantages above are worth it. I
used to use NIS and NFS for account and home directory management for my
systems, but that requires access to the NIS and NFS server all the time.
This doesn’t work well for laptops. After that, I tried hacking something
together with rsync and ssh, but there was no easy way to keep track of
which machine was the "master" copy of my home directory…rsync doesn’t
merge differences. And now I’m using CVS.