Version Control Systems (RCS, CVS, SVN, GIT)
A version control system is an important tool in collaborative authoring of not only source code of computer programs but also professionally typeset papers in LaTeX. The version control systems I have used include RCS, CVS, SVN, and GIT (which I am learning now). I am ordering them based on what I have used most recently:
- GIT. It has gained popularity in recent years due to a better interface and more powerful features, including distributed repositories. This is what I am migrating to.
- SVN (tracks sets of files (“builds”) rather than individual files). This is where most of my files have been stored so far.
- CVS (built on RCS, supports merging simultaneous edits w/out lock, remote access)
- RCS (old, based on locking individual files)
The assumption is that you have access to a Unix-like shell and tools. On MacOSX, this is under /Applications/Utilities/Terminal.app; on Linux, you also have a terminal or X-window. On Windows, download Cygwin and use the setup script to install these tools. You are also assumed to have an account on the server.
Note: in this writeup, code segments that begin with % indicate a command to be typed into the shell command-line. % is the prompt character for csh and tcsh; in bash, it is the $ character, but the commands are the same. In any case, type the command without the prompt character.
1. GIT
GIT is a newer version control system that has caught on in recent years. Before I write my own, you might want to take a look at this Git – SVN Crash Course.
The reasons I use GIT are its better web interfaces and slightly more powerful features, plus good free-hosting services. I do miss the ability to set $Id:$ and other tags. For collaborative work, what you need are
- GIT client program
- account on a GIT hosting service, assumed to be gitlab.com.
- create a project on the server
- set up and upload local repository to the project on the server
- add collaborators
- start working with basic GIT commands: clone, add, commit, push, pull, …
1.1 GIT client
You should first install GIT if it is not already on your computer. Recent versions of MacOS X and Linux should already have it. For Windows, download and install with the default option of command-line prompt, which installs git-bash for you along with the rest of the tools you need, including ssh-keygen.
Here are some detailed instructions for installing the Windows version. I assume GitExtensions.
- Download GitExtensions246SetupComplete.msi and install it. Here is more information about Git Extensions, including video tutorials. It extends Windows Explorer with a few more features. Getting started.
- Note that it may ask you to install .NET 4.0 framework. If that is the case, then you can install that: Download Microsoft .NET Framework 4 Web Installer
- In the GitExtension installer you may be asked whether to install MsysGit 1.8.3 and KDiff3 0.9.97. Check Yes to both.
- When asked to install OpenSSH or PuTTY, I think I use OpenSSH, even though the default is PuTTY.
- When installing Git, it may ask you to Use Git Bash only or Run Git from Windows command prompt. The default is Git Bash Only, and I would use that. You can leave all other things as default.
- After installing, Git Setup will run, and ask you to enter your name and initials. Then you will click the Finish button.
- You will need an SSH Key. You only need to do this once. Run Git Gui, go to Help menu > Show SSH Key. It first says No Keys found. You can click on “Generate Key” and then “Copy to Clipboard”. You will need to paste this long string of characters to your gitlab account in step 2.
- After installation, you will have a folder called Git in your Start menu with two items in it: Git GUI and Git Bash. GUI is the graphical user interface, while Bash is the command line.
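If you prefer the command line to Git Gui, the key-generation step can be sketched with ssh-keygen, which comes with Git Bash. This is only an illustration: the ed25519 key type, the temporary directory, the empty passphrase, and the e-mail comment are all assumptions made so the sketch can run without touching your real keys.

```shell
#!/bin/sh
# Sketch: generate an SSH key pair from the command line.
# Assumptions: ed25519 key type, throwaway directory, placeholder comment.
set -e

keydir=$(mktemp -d)                 # throwaway location for this demo
keyfile="$keydir/id_ed25519"

# -N "" sets an empty passphrase so the command does not prompt;
# -C attaches a comment (commonly your e-mail) to the public key.
ssh-keygen -q -t ed25519 -N "" -C "firstname@example.com" -f "$keyfile"

# The .pub file holds the long string you paste into the hosting service.
pubkey=$(cat "$keyfile.pub")
echo "$pubkey"
```

For real use you would normally let ssh-keygen write to its default location under ~/.ssh and choose a passphrase. Only the public (.pub) half is ever pasted into the web page.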
1.2 Account on Gitlab
Gitlab is a free hosting service for git repositories. I use it because I find it friendly and reliable, and I don’t have to maintain it myself. You can use other repositories and they are similar.
- Sign Up for a new account at https://gitlab.com/users/sign_up
- sign in to gitlab.com with your user and password
- Click on the Profile Settings (which is the “Person”-like icon next to the + sign on the web page after you login). Once you see that, you will see another row of menu items, where SSH Keys is shown. Click on My SSH Keys, click on Add SSH Key, and it will take you to “Add an SSH Key” screen. Just paste in the SSH key that you generated in the second to the last step of 1.1 above.
1.3 Login and Create a Project on the Server
- Login to your gitlab.com account. Click on the + sign on the gray bar. It will say “New Project” when you mouse over. It will ask you “Project Name is”, and you can fill in the name and description. The project name may have to be just one word. It may show you. Let’s say you call your project “ssg2013”.
- After creating the project, it will take you to a screen with something like this. Don’t type the commands yet! Instead, look at it as a sample output, but type the commands in Step 1.4.
Git global setup:
git config --global user.name "FirstName LastName"
git config --global user.email "[email protected]"
Create Repository
mkdir ssg2013
touch README
cd ssg2013
git init
git add README
git commit -m 'first commit'
git remote add origin [email protected]:yourUserName/ssg2013.git
git push -u origin master
Existing Git Repo?
cd existing_git_repo
git remote add origin [email protected]:yourUserName/ssg2013.git
git push -u origin master
1.4 Set up and upload your local repository to gitlab
- Open Git Bash. You will see a black window with white and green letters, showing something that looks like user@host ~ $. This is called the prompt.
- At the prompt, type “cd ” (not including the quotes; just the word cd followed by a space). Go to Windows Explorer and find the folder for your LaTeX files. Use your mouse to drag the folder that contains your LaTeX source files onto this Git Bash window. The mouse cursor turns into a + temporarily. When you release the mouse button, it types the path for you.
- Hit return. If you are successful, then it should not give any warning message.
- Type the commands into Git Bash based on those from the first two black boxes (but not the 3rd black box) shown above, with some adjustments depending on your specific setting. Specifically,
git config --global user.name "Firstname Lastname"
git config --global user.email "[email protected]"
git init
git add
(drag and drop all those source files and folders from Windows Explorer onto the Bash Shell; don’t include generated files, such as .aux, .log, .bbl, .blg, or the generated pdf itself)
git commit -m 'first commit'
git remote add origin [email protected]:userName/ssg2013.git
git push -u origin master
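The sequence above can be tried safely before touching the real server. The following is a minimal sketch, not the exact procedure: a local bare repository stands in for the gitlab.com project, the identity is set per repository rather than with --global, and a .gitignore file (my addition, not part of the steps above) keeps the generated LaTeX files out. All paths, names, and the e-mail address are placeholders.

```shell
#!/bin/sh
# Sketch: first upload of a LaTeX folder, with a local bare repository
# standing in for the hosted project (assumption, so no network is needed).
set -e

demo=$(mktemp -d)
remote="$demo/ssg2013.git"                     # stand-in for the server project
git init -q --bare "$remote"
git --git-dir="$remote" symbolic-ref HEAD refs/heads/master

mkdir "$demo/paper"
cd "$demo/paper"
printf '%s\n' '\documentclass{article}' '\begin{document}' 'Hello.' '\end{document}' > paper.tex
touch paper.aux paper.log                      # pretend LaTeX has already run

git init -q
git symbolic-ref HEAD refs/heads/master        # call the branch master on any Git version
git config user.name  "Firstname Lastname"     # repo-local placeholder identity
git config user.email "firstname@example.com"

# Ignore generated files so they cannot be staged by accident.
printf '%s\n' '*.aux' '*.log' '*.bbl' '*.blg' 'paper.pdf' > .gitignore

git add .gitignore paper.tex
git commit -q -m 'first commit'
git remote add origin "$remote"
git push -q -u origin master

tracked=$(git ls-files | sort | tr '\n' ' ')
echo "tracked files: $tracked"
```

Against the real server, only the argument of git remote add origin changes; use the address shown on your project page.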
1.5 Add Collaborators
- Go back to your gitlab.com window. Say you are currently in the ssg2013.git project.
- Click on members in the side bar, or + for project members. In either case, it has the heading “Users with access to this project”. You can click on New Project Member, and it will ask you to choose the people you want in the project and set an access level for them. To invite someone, type in the email address they registered with in this system, and it will bring up their first and last name.
- You should give your collaborators at least “developer” access rights.
1.6 Basic GIT Commands
GIT differs from the others in that, in addition to the master repository, each person “clone”s a full copy of the repository and works with it locally, instead of doing a “checkout”. When you “commit”, you commit changes into your local repository. There is no “update” for the local repository because nobody besides you makes changes to it.
To submit changes back to the master repository, you do a “push”. To refresh your copy with newer changes from the master repository, you do a “pull”.
In GIT, the commands are
git clone repositorypath.git
This makes a clone
git add file1 file2 file3 etc
This adds the list of files to be committed (or added) to the “staging area” for next commit action. Useful especially if you just created a file that wasn’t in the repository yet.
git commit -m "log message"
This commits the staged files locally to your own repository. Important: the staging is assumed to be explicit. If nothing has been staged, then nothing is committed and the output is the same as that of git status, which displays what has been staged.
However, if you are used to svn, which automatically figures out what files have been changed, then you probably want to use
git commit -am "log message"
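This (-a) “automatically” adds modified files to the staging area and then commits. Note that -a only picks up files that are already tracked; a newly created file still needs a git add.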
Most of the time you would want to use this, instead of having to explicitly name the files to commit.
git pull
This grabs new changes from master repository to refresh your copy, by merging with your changes if necessary. Note that you should commit first before pulling.
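git push
This submits your committed changes back to the master repository. You need to commit first before pushing. Note that git push does not take a log message (it has no -m option; the message belongs to the commit), and -f forces the push and can overwrite your collaborators’ changes on the server, so leave it out in normal use.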
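To see how clone, commit, push, and pull fit together, here is a small sketch of two collaborators sharing one master repository. It is an illustration under assumptions: a local bare repository plays the role of the server, and the directory names, identities, and file contents are all made up.

```shell
#!/bin/sh
# Sketch: two working clones exchanging changes through one shared repository.
# The bare repository is a local stand-in for the server (assumption).
set -e

demo=$(mktemp -d)
server="$demo/ssg2013.git"
git init -q --bare "$server"
git --git-dir="$server" symbolic-ref HEAD refs/heads/master

# First collaborator: clone the (still empty) repository and add a file.
git clone -q "$server" "$demo/first" 2>/dev/null
cd "$demo/first"
git symbolic-ref HEAD refs/heads/master
git config user.name "First Collaborator"
git config user.email "first@example.com"
echo 'Introduction.' > paper.tex
git add paper.tex                      # new file: must be staged explicitly
git commit -q -m "add paper.tex"
git push -q -u origin master

# Second collaborator: clone, edit, commit with -a, push.
git clone -q "$server" "$demo/second"
cd "$demo/second"
git config user.name "Second Collaborator"
git config user.email "second@example.com"
echo 'Related work.' >> paper.tex
git commit -q -am "add related work"   # -a stages the modified tracked file
git push -q

# First collaborator: pull to pick up the new change.
cd "$demo/first"
git pull -q
lines=$(wc -l < paper.tex | tr -d ' ')
history=$(git log --oneline | wc -l | tr -d ' ')
echo "paper.tex has $lines lines after $history commits"
```

Each commit only changes the clone it was made in; the other clone sees it only after a push on one side and a pull on the other.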
2. SVN
SVN, which stands for Subversion, can be seen as further improvement on CVS with many of the same commands. Where CVS tracks versions as individual files, SVN tracks snapshots of your working directory. The reason is that you might make 27 modifications to file A but only 2 modifications to file B. CVS doesn’t tell you which combination of version of A and version of B makes a valid build, if you ever have to go back and dig up something. SVN, on the other hand, lets you commit snapshots, so that each build is a (hopefully) consistent snapshot as seen by the committer. The commands for SVN are also similar to those in CVS, but SVN keeps track of more details for the user. Also, SVN can do merging just like CVS.
2.1 Setup
Unlike CVS, which builds on RCS and uses the plain file system to track each file in a mostly human readable form, SVN uses its own file and directory format to track the builds, and it is not meant to be human readable. Also, setting up is entirely different for SVN. You need to use svnadmin for repository actions, and svn for working actions. It is recommended that you set up ssh public/private keys and work over a secure shell.
- Make sure the svn and svnadmin path is in your $path variable. If not, add it by editing your .tcshrc or .bashrc file. If .cshrc or .tcshrc, add the line
set path=($path /usr/bin/svnpackage/)
(assuming /usr/bin/svnpackage contains the executables for svn and svnadmin).
- You might want to make sure we have group write permission. Also put in your .tcshrc
umask 02
- To create a repository, login (ssh) to your server and type the command
% svnadmin create pathOfRepository
(note to my student:) I would do
% svnadmin create /svnroot/pai
just once.
(note to my student:) we also have websvn, but it doesn’t show up automatically. You need to add it by editing the file on embedded:/Library/WebServer/Documents/websvn/include/config.php, and add the line
$config->addRepository('pai', 'file:///svnroot/pai');
(replace with your own) for it to show up in the web listing.
- Once you have a repository, you would need to go back to your local machine and set up a working directory. You can use either svn co or svn import to establish this binding.
- Suppose someone else has already set up the repository and initial set of files. You can just do
% svn co svn+ssh://user@host/fullPathOfRepositoryOnTheHost
- (note to my student:) As an example, I already made the svnroot/pai/icmu10 directory in my repository. To get a copy of your own, do
% svn co svn+ssh://[email protected]/svnroot/pai/icmu10
- Suppose you already have your local directory with files and you want to put them into the repository. You can do
% svn import localDirName svn+ssh://user@host/fullPathOfRepositoryOnTheHost
and then do an svn co command (as described in the previous sub-bullet) to start working. localDirName is optional if you are already in the directory that you want to import (into SVN).
- (note to my student:) if I already have my files in the directory “code10” under my repository, and I want to put a copy of the files in my repository under …/pai/src10 (a new subdirectory in SVN) then I would do
% cd code10
% svn import . svn+ssh://[email protected]/svnroot/pai/src10
(note to my student:) If successful, it should logically add and physically copy the files into the repository.
But you shouldn’t start editing yet! The files (in code10) you just imported are not automatically under version control. So, you should do a checkout (co); the checked-out files are considered a separate set of files.
% svn co svn+ssh://[email protected]/svnroot/pai/src10
This will create the subdirectory src10 relative to your current directory. You can actually move the src10 subdirectory elsewhere and even rename it, and just work in there.
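The create, import, and checkout steps above can be rehearsed on one machine. The sketch below makes several assumptions: it uses a file:// URL instead of svn+ssh:// so that no server or account is needed, it puts everything in a temporary directory, and it skips itself if Subversion is not installed. The directory names code10 and src10 follow the example above; the file contents are made up.

```shell
#!/bin/sh
# Sketch: create a repository, import a directory, then check out a working copy.
# Assumption: file:// URLs on the local disk stand in for svn+ssh://user@host/...
set -e

if command -v svnadmin >/dev/null 2>&1 && command -v svn >/dev/null 2>&1; then
    demo=$(mktemp -d)
    repo="$demo/svnroot/pai"
    mkdir -p "$demo/svnroot"
    svnadmin create "$repo"                      # done once, by one person

    # Existing files that are not yet under version control.
    mkdir "$demo/code10"
    echo 'Introduction.' > "$demo/code10/paper.tex"
    cd "$demo/code10"
    svn import -q -m "initial import" . "file://$repo/src10"

    # code10 is still a plain directory; check out a real working copy.
    mkdir "$demo/work"
    cd "$demo/work"
    svn co -q "file://$repo/src10"
    cd src10
    echo 'Related work.' >> paper.tex
    svn commit -q -m "add related work"          # no file list needed

    youngest=$(svnlook youngest "$repo")
    echo "repository is now at revision $youngest"
else
    echo "Subversion is not installed; nothing to demonstrate"
fi
```

The import and the later commit each create one repository-wide revision, which is the “snapshot” behaviour described at the start of this section.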
2.2 Update, Commit
SVN and CVS share many commands.
Command | Meaning | (Side) effect |
---|---|---|
% svn co svn+ssh://user@host/dirPath | checkout files in dirPath on the remote user@host’s account | creates some bookkeeping files and SVN directory with your local working directory to keep track of the remote path |
% svn update [file(s) or directory] | update (i.e., copy from repository to your working directory; bring your working directory up to date). Normally you just do svn update without additional files or directory. When necessary, you can also update individual files. | merges your local files with changes from repository. |
% svn commit | commit (i.e., check in the snapshot into the repository) | unlocks also, if you locked any of the files you are committing. If you have $Id:$ tags or other tags, and you have set the property tag for some files, they get updated, too. |
% svn add file(s) | mark files to be added to the repository | the files are not actually submitted to the repository until you do the svn commit |
% svn rm file(s) | mark the files to be removed from the repository | the files are not actually removed until svn commit |
% svn propset svn:keywords "Id" files | sets the property tag (e.g., $Id:$) to be updated at the time of commit | not updated until commit |
% svn lock file(s) | lock the file(s) in the repository | prevent others from locking |
% svn unlock file(s) | unlocks the file(s) in the repository | enable others to lock |
So, SVN and CVS commands are quite similar. The main difference is that in SVN you don’t need to say which file or directory you commit or update; it works it out for you. Because it does not require locking to edit, you may run into merging problems. To avoid these problems, follow these tips:
- wrap your lines, keep them no more than 80 chars per line;
- use some instant messaging tool to know what your teammates are working on.
3. CVS
CVS stands for Concurrent Versions System. It is actually built on top of RCS, but makes several improvements:
- Commands can work remotely, so you don’t have to actually login with a shell to ci/co and manually copy files between machines
- It solves the locking problem by allowing multiple people to edit their (unlocked) copies at the same time, and it will attempt to do a merge when you do commit/update.
- Branching, though we won’t talk about it here.
For terminology,
- commit (in CVS, SVN) means ci (“checkin” in RCS): submit your changes to the repository.
- update (in CVS, SVN) means co (“checkout” in RCS): grab new files from the repository; bring your working directory up to date.
- checkout (in CVS, SVN) means to set up a local directory to be a working directory tied to a repository. After the set up, you can then do commit and update, etc.
- import (in CVS, SVN) means to import the files in your current directory into the repository’s tree by creating a new directory there. However, you would still need to do a checkout afterwards in order to start working.
3.1 Setup
Setting up is similar to RCS, in that you just need to create a directory. But we assume you want to be working on your own machine rather than on the server, so you will also need to set up SSH keys.
- One of the team members creates a repository directory somewhere on this shared file system. (Same as RCS):
% mkdir repositoryDirectory
- Make sure this directory is writable by the group. (Same as RCS)
% chmod g+rw repositoryDirectory
If the collaborators are in different (unix) groups, then you may have to use a+rw instead of g+rw, but that is less secure.
- Set up the SSH public/private keys in case you haven’t.
- To set up your own working copy, you would need to checkout the repository. It downloads a copy and sets up the server path.
% cvs -d :ext:user@hostname:/absolutePath/to/cvsroot co directory
checks out a copy of the (remote) directory to your (local) working copy.
3.2 Update, Commit
Unlike RCS, which uses two separate executables ci and co, CVS is one command that takes ci and co as subcommands within the cvs executable, but you should use cvs update and cvs commit most of the time.
Command | Meaning | (Side) effect |
---|---|---|
% cvs -d :ext:remoteCvsRoot co dirPath | checkout files in dirPath under the remoteCvsRoot in the remote user@host’s account (encoded as path of remoteCvsRoot) | creates some bookkeeping files and CVS directory with your local working directory to keep track of the remote path |
% cvs update [file(s) or directory] | update (i.e., copy from repository to your working directory; bring your working directory up to date). If you don’t specify files or directory, then it updates all of the files in the working directory. | merges your local files with changes from repository. |
% cvs commit [file(s) or directory] | commit (i.e., check in the file into the repository and unlock). If you don’t specify files or directory, then it commits all the changed files in the working directory. | unlocks also, if you locked any of the files you are committing. If you have $Id:$ tags or other tags, they get updated, too. |
% cvs add file(s) | mark files to be added to the repository | the files are not actually submitted to the repository until you do the cvs commit |
% cvs rm file(s) | mark the files to be removed from the repository | the files are not actually removed until cvs commit |
% cvs lock file(s) | lock the file(s) in the repository | prevent others from locking |
% cvs unlock file(s) | unlocks the file(s) in the repository | enable others to lock |
Note that you can still lock and unlock, though you are not required to. RCS commands still work.
Note that merging works most of the time, but not always. It does tell you where merging fails, and you would need to manually merge those changes. So, you still need to coordinate with your teammates somehow. In general,
- wrap your lines. That means insert returns in your text files, especially for LaTeX and other source files, even though some text editors visually wrap the lines for you. The reason is that CVS relies on diff, which relies on line breaks to mark units of comparison. Wrapping your lines ensures that diff has a reasonable unit for comparison purposes.
- Have some idea what your teammates are working on, by instant messaging or some other tool. This way, you avoid stepping on each other’s toes.
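The effect of line wrapping on diff can be seen with a small experiment. The sketch below is only an illustration: it uses the standard fold utility to hard-wrap a paragraph at 80 columns (your editor’s own wrap command would do just as well), makes the same one-word change to the unwrapped and the wrapped version, and measures how much text diff reports as changed. The file names and the sample sentence are made up.

```shell
#!/bin/sh
# Sketch: the same one-word edit produces a much smaller diff on wrapped text.
set -e

demo=$(mktemp -d)
cd "$demo"

# A paragraph saved as one long line, as a soft-wrapping editor would leave it.
sentence='Version control compares files line by line, so short lines give small diffs. '
printf '%s%s%s%s\n' "$sentence" "$sentence" "$sentence" "$sentence" > unwrapped.tex

# Hard-wrap at 80 columns, breaking at spaces (-s).
fold -s -w 80 unwrapped.tex > wrapped.tex
longest=$(awk '{ if (length($0) > max) max = length($0) } END { print max + 0 }' wrapped.tex)

# Change one word on the first line of each version.
sed '1s/Version/Revision/' unwrapped.tex > unwrapped_edited.tex
sed '1s/Version/Revision/' wrapped.tex   > wrapped_edited.tex

# Count the bytes that diff marks as removed ("<" lines) in each case.
unwrapped_bytes=$(diff unwrapped.tex unwrapped_edited.tex | grep '^<' | wc -c | tr -d ' ')
wrapped_bytes=$(diff wrapped.tex wrapped_edited.tex | grep '^<' | wc -c | tr -d ' ')

echo "longest wrapped line: $longest characters"
echo "diff size, unwrapped: $unwrapped_bytes bytes; wrapped: $wrapped_bytes bytes"
```

With the unwrapped file, diff has to flag the whole paragraph for a one-word change, which is exactly the situation in which two people’s edits are most likely to collide.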
When you get a conflict, you will be given a chance to resolve the conflict. The source file will include lines of <<<<<<< and >>>>>>> (around “mine” and “theirs”, separated by a line of =======) to mark the differences that it does not know how to resolve.
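To see what a marked-up conflict looks like without setting up a CVS server, the three-way merge can be imitated locally. This sketch is a stand-in, not CVS itself: it uses git merge-file, which performs the same kind of three-way text merge on plain files, and the three file names and their contents are invented for the demonstration.

```shell
#!/bin/sh
# Sketch: produce conflict markers with a three-way merge of plain files.
# git merge-file is used as a stand-in for the merge that CVS performs.
set -e

demo=$(mktemp -d)
cd "$demo"

# Common ancestor, my edit, and a teammate's edit of the same line.
printf '%s\n' 'Line one.' 'Shared line.' 'Line three.' > base.tex
printf '%s\n' 'Line one.' 'My edit of the shared line.' 'Line three.' > mine.tex
printf '%s\n' 'Line one.' 'Their edit of the shared line.' 'Line three.' > theirs.tex

# -p prints the merged text instead of overwriting mine.tex. The exit status
# is nonzero when conflicts remain, so it is tolerated here.
git merge-file -p mine.tex base.tex theirs.tex > merged.tex || true

cat merged.tex
```

To resolve the conflict, edit the merged file so that only the text you want remains, delete the three marker lines, and then commit.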
4. RCS
RCS is Revision Control System. It assumes collaborators are logged in to the same computer (at least computers sharing the same file system) and have access to a shared repository directory.
4.1 Setup
- One of the team members creates a repository directory somewhere on this shared file system.
% mkdir repositoryDirectory
(for my students: login by ssh to embedded.ece.uci.edu, cd ~super/cvsroot and do a mkdir command. You may have to sudo mkdir)
- Make sure this directory is writable by the group.
% chmod g+rw repositoryDirectory
If the collaborators are in different (unix) groups, then you may have to use a+rw instead of g+rw, but that is less secure.
- Each collaborator should create his/her own working directory
% cd
% mkdir workingDirectory
- In the working directory, each collaborator should make a symbolic link to the repository and name it RCS
% cd workingDirectory
% ln -s repositoryDirectory RCS
(for my students:)
% ln -s ~super/cvsroot/yourUserName/yourRepositoryDir RCS
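Put together, the setup steps above amount to the following sketch. It runs in a temporary directory so that it cannot disturb anything, and the names repositoryDirectory and workingDirectory are the same placeholders used in the steps; on a real shared machine the repository would live somewhere all collaborators can reach.

```shell
#!/bin/sh
# Sketch: shared repository directory plus a working directory whose RCS
# entry is a symbolic link to it. Paths are placeholders in a temp directory.
set -e

demo=$(mktemp -d)
repositoryDirectory="$demo/repositoryDirectory"
workingDirectory="$demo/workingDirectory"

# Done once, by one team member.
mkdir "$repositoryDirectory"
chmod g+rw "$repositoryDirectory"

# Done by each collaborator.
mkdir "$workingDirectory"
cd "$workingDirectory"
ln -s "$repositoryDirectory" RCS

link_target=$(readlink RCS)
echo "RCS -> $link_target"
```

With the link in place, ci and co store and read the ,v files through RCS/ rather than cluttering the working directory.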
4.2 Check-in, Check-out
RCS provides two commands: check-in (ci) and check-out (co). You have the option to lock and unlock. By default, you check out a read-only copy to your working directory. Before making a change, you should lock the file. When you are done, you check in the changes back into the repository. Others have the option of continuing to edit the files, but no more than one person may lock the same file at the same time.
Command | Meaning | Side effect |
---|---|---|
% ci file(s) | check in a file into the repository (linked by the RCS link) and unlock | remove working copy from your working directory. Modifies the corresponding file in the repository. If you have $Id:$ or other tags, they get updated, too. |
% ci -u file(s) | check in the file into the repository and unlock | keeps the working copy in your working directory. Makes your copy read-only. |
% ci -l file(s) | check in the file into the repository and keep the lock | keeps the working copy |
% co file(s) | check out the file(s) from the repository | overwrite the file(s) in your working directory |
% co -l file(s) | check out and lock the file(s) from the repository | overwrites the file(s) in your working directory, but warns you if you have a writable |
% ci -r2.1 file(s) | check in file with version number 2.1 | overrides the default version number. |
% co RCS/*,v | check out all the files in the repository | same (side) effect as co |
Note that in the repository, your files are named with a ,v extension. For example, paper.tex is named “paper.tex,v”. This is why when you do co RCS/*,v, you check out everything. Note that if you don’t create the RCS link, it puts the *,v files in the same directory, which can be very confusing.
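A full lock, edit, check-in cycle might look like the sketch below. It is an illustration under assumptions: it uses a plain RCS subdirectory in a temporary directory instead of a link to a shared repository, it passes the description and log message on the command line (-t- and -m) so that ci does not prompt, and it skips itself if RCS is not installed. The file name and messages are made up.

```shell
#!/bin/sh
# Sketch: check in, lock, edit, check in again, then inspect the revision.
# Assumption: a local RCS/ directory stands in for the shared repository link.
set -e

if command -v ci >/dev/null 2>&1 && command -v co >/dev/null 2>&1 \
   && command -v rlog >/dev/null 2>&1; then
    demo=$(mktemp -d)
    cd "$demo"
    mkdir RCS

    echo 'Introduction.' > paper.tex
    # First check-in; -u keeps a read-only working copy, -t- gives the description.
    ci -q -u -t-'LaTeX paper' paper.tex

    # Lock before editing; the working copy becomes writable.
    co -q -l paper.tex
    echo 'Related work.' >> paper.tex

    # Check in the change and release the lock; -m supplies the log message.
    ci -q -u -m'add related work' paper.tex

    head_rev=$(rlog -h paper.tex | sed -n 's/^head: //p')
    echo "head revision: $head_rev"
    ls RCS
else
    echo "RCS is not installed; nothing to demonstrate"
fi
```

The listing at the end shows the paper.tex,v file sitting inside RCS/, which is the ,v naming described above.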
Summary: RCS uses lock and unlock to track versions. Locking is good practice, though it is easy to forget. When someone locks a file, you are not automatically notified. RCS relies on file permissions in the working directory, where unlocked files are read-only, so that if you attempt to make a change, your text editor warns you. But if you are working remotely, then it is a copy of a copy, and the permissions can’t be set automatically (since you would ssh to the server to ci/co in your working directory on the server).