Version Control Reference

Introduction

CSE 331 uses git to distribute starter code and to turn in assignments. At the deadline, the staff collects a tagged current version of your files from your repository.

A note about terminology: git is a version control system that lets software engineers back up, manage, and collaborate on software projects. GitLab is a hosting service: a place to store git repositories for students in CSE 331.

Git stores the history of all versions of your files in a “repository”. Each user “clones” the repository, making a local copy of it. Cloning also creates a “working copy” of the latest version of the files. Each user can edit his or her working copy, without affecting other users or the master version.

Please read "Version control concepts and best practices". Even people who have used git before often find that it clarifies concepts.

Why use version control?

Every serious software project uses version control — even single-person projects. CSE 331 gives you practice with version control.

All version control systems, including git, provide the following functionality:

Over many years of teaching courses like CSE 331, we have observed that on average about one student's computer will either crash unrecoverably or be stolen during the quarter. Therefore, you should commit and push your work to the repository often. Committing often:

Two ways to perform each command

This document indicates how to perform version control tasks on the command line and in IntelliJ.

You can mix and match, performing some tasks on the command line and some in IntelliJ. Even if you mostly use IntelliJ, you may find the command-line versions helpful because IntelliJ lacks certain functionality, sometimes gets wedged, and sometimes gives less informative error messages.

If you are using Windows, you will need to change some directory names from the given examples.

Setup: Cloning the project (creating your working copy)

You always edit your own personal copy of files that are under git control. Before you can make such edits, you must “clone” to create your local repository (which is stored in a hidden directory) and your working copy (where you will do your programming). You need to do this step only once at the beginning of the term. If you plan to work at both UW CSE and home, you need to do these setup steps both while logged into a department machine and from your home computer, so you have a local copy on both machines.

  1. Ensure that your repository exists, by browsing to https://gitlab.cs.washington.edu/cse331-19sp-students/cse331-19sp-YourCSENetID (be sure to change the YourCSENetID part!).
  2. Set up an ssh key. Do this on each computer you plan to use. Follow the instructions to create an RSA Key and the instructions to add the key to GitLab. You should use an empty passphrase, which is less secure but perfectly reasonable in this scenario.
  3. Run ssh -T git@gitlab.cs.washington.edu to ensure that your key is correctly set up. You should receive a welcome message and should not be prompted for a password.
  4. Follow the instructions below for cloning the repository, either from the command line or from IntelliJ (not both).

Command Line

Execute the following commands at the command prompt to clone the project and create a working copy in ~/cse331-19sp-YourCSENetID:

  cd
  git clone git@gitlab.cs.washington.edu:cse331-19sp-students/cse331-19sp-YourCSENetID.git

Note for those who are new to the command line: When you try to type passwords in the command line, you may be alarmed that you can't see any text entered. To protect your password, your typing isn't being shown. Just type your password as normal and press enter.

You can run any git command in any directory of your working copy.

IntelliJ

NOTE: Occasionally, Windows users have had trouble cloning their repository within IntelliJ. These Windows users had success cloning their repository from the command line using the "git bash" tool, then doing all their work within IntelliJ after the initial clone step. Please try that if you have trouble.

First, set up an ssh key.

Follow the IntelliJ instructions on how to clone a repository.

Be sure to import as a Gradle Project, pick Java 11 as your Project SDK, turn on Auto-Import, and ensure that your project uses the recommended Gradle Wrapper option.

The URL to use when cloning your repository is git@gitlab.cs.washington.edu:cse331-19sp-students/cse331-19sp-YourCSENetID.git .

Updating Files

Git's "pull" command updates your local copy of files to reflect changes made to the remote repository by other people (or by you when working on a different computer system). The only changes made by people other than you will be made by the CSE 331 staff when we are adding new homeworks to your repositories. If you work at home and at UW CSE, you will need to use commit, push, and pull to propagate your changes between the two locations.

Git usually does a good job of merging changes made to multiple working copies (say, by different people or by you on your home computer and you at UW CSE), even if those changes are to different parts of the same file. However, if both people change the same line of a file, then git cannot decide which version should take precedence. In this case, git will signal a conflict during git pull, and you must resolve the conflict manually. This can be an unpleasant task.

To minimize the possibility of conflicting changes being made simultaneously, you should pull frequently and commit/push frequently.

Command Line

To update your local copy, go to the root of your repository (e.g. ~/cse331-19sp-YourCSENetID) and run:

  git pull origin master

This will display a list of files that have been updated.
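The commit, push, and pull cycle between two computers can be tried safely in a throwaway sandbox. The sketch below is only an illustration: a local bare repository stands in for GitLab, and the directory names home and lab, the file Hw.java, and the identity are hypothetical.

```shell
# Sandbox sketch: a local bare repository plays the role of GitLab (an assumption).
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/master

# "home" is your first computer: commit some work and push it.
git init -q home
cd home
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Student"        # hypothetical identity
git config user.email "student@example.com"
git remote add origin "$sandbox/remote.git"
printf 'v1\n' > Hw.java
git add Hw.java
git commit -q -m "Start homework"
git push -q origin master

# "lab" is your second computer: clone once, at the start of the term.
cd "$sandbox"
git clone -q remote.git lab

# Back at home: do more work, then commit and push.
cd "$sandbox/home"
printf 'v2\n' > Hw.java
git commit -q -am "Finish part 1"
git push -q origin master

# At the lab: pull to pick up the work done at home.
cd "$sandbox/lab"
git pull -q origin master
cat Hw.java
```

The final cat shows the contents that were committed at home, which is what you should expect whenever you pull before starting work on a different machine.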

IntelliJ

The IntelliJ Documentation has instructions on how to pull changes from a repository.

Committing Changes

After making changes to, adding, or removing files, you must “commit” your changes to git. This step causes git to record your changes in your local repository. Once you also push them, your changes are backed up and available to other people working on the repository, or to you when working on a different computer system.

In git, committing changes to your local copy does not change the remote copy of your repository on GitLab. To update the remote copy, you must “push” your commits to the remote repository.

In general, you should “pull” any new files or changes before committing your latest code. (And, if the pull results in any conflicts, you should resolve them before committing.) If you forget to pull, the commit itself will still succeed, but git will reject your next push and remind you to pull first.

It is a good idea to commit your changes frequently. It backs up your work, thus enabling you to revert to an earlier version of your code if you find yourself going down a wrong path. Also, when you are working with others, it minimizes conflicts.

Command Line

Determine what changed

To see a summary of the changes you have made since the last git commit, run:

  git status

This will either say that you have made no changes, or will list your staged files (changed, and marked to be committed on the next commit), unstaged files (changed, but not to be committed), and untracked files (new files not staged/added to git). Git never automatically adds files to your repository. You can also see details about the changes you have made since the last git commit. The changes are also called "diffs", short for "differences".

  git diff

The command git diff shows the difference between the staged changes and the changes in the working copy. To see the difference between the repository and the staged changes (which is what git commit will store in your repository), run:

  git diff --staged

We strongly recommend that you review the diffs before you commit, so that you do not accidentally commit a change you did not mean to.
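To see how git status, git diff, and git diff --staged relate, you can try them in a throwaway sandbox. This is a minimal sketch; the file Notes.txt and the identity are hypothetical.

```shell
# Sandbox sketch: nothing here touches your real repository.
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q demo
cd demo
git config user.name "Example Student"        # hypothetical identity
git config user.email "student@example.com"

printf 'line one\n' > Notes.txt
git add Notes.txt
git commit -q -m "Add Notes.txt"

# Make two edits: stage the first one, leave the second one unstaged.
printf 'line two\n' >> Notes.txt
git add Notes.txt
printf 'line three\n' >> Notes.txt

git status --short             # Notes.txt has both staged and unstaged changes

unstaged=$(git diff)           # working copy compared with the staging area
staged=$(git diff --staged)    # staging area compared with the last commit

printf '%s\n' "--- git diff ---" "$unstaged"
printf '%s\n' "--- git diff --staged ---" "$staged"
```

The first diff reports only the addition of "line three", and the second only the addition of "line two". A commit at this point would record "line two" but not "line three".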

Staging: Choose which changes to commit

By default, even if you have made changes in your working copy, git does not commit any of those changes to the local repository. You have to tell git which of your changes you wish to commit, by putting those changes in the staging area.

(What is the purpose of the staging area? Programmers try to commit their changes in small, logical chunks, rather than big ones. If they have made two different improvements to their working copy, they would make two commits. We do not require this in CSE 331, but you might find it useful. Or, just commit between logical chunks of work.)

Now you can stage, or add, a file or directory as follows:

  git add NewFile.java

Git also has tools for putting just some of a file's changes into the staging area. Programmers use this frequently, but we do not require it in CSE 331.
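As an illustration of committing in small, logical chunks, the sandbox sketch below makes two unrelated edits and records them as two separate commits by staging one file at a time. The file names A.java and B.java and the identity are hypothetical.

```shell
# Sandbox sketch: two unrelated improvements become two commits.
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q demo
cd demo
git config user.name "Example Student"        # hypothetical identity
git config user.email "student@example.com"

printf 'class A {}\n' > A.java
printf 'class B {}\n' > B.java
git add A.java B.java
git commit -q -m "Add A and B"

# Two unrelated edits made in the same editing session.
printf '// bug fix in A\n' >> A.java
printf '// new feature in B\n' >> B.java

# Stage and commit each improvement on its own.
git add A.java
git commit -q -m "Fix bug in A"
git add B.java
git commit -q -m "Add feature to B"

git --no-pager log --oneline   # three commits, newest first
```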

Commit changes

To commit changes, run

  git commit -m "a descriptive log message"

Don't forget to push after you commit.

If you just run git commit (without the -m flag and its message), git will open an editor where you can enter a message.

Alternatively, you can skip the staging step and, in one command, commit every change to files that git already tracks by using the -a flag. New (untracked) files are not included; you still need to git add them first. If you do this, be especially sure that you have reviewed all the changes by running git diff.

  git commit -am "a descriptive log message"
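One detail worth trying for yourself is which files git commit -am picks up. In the sandbox sketch below (the file names and the identity are hypothetical), a modified tracked file is committed, while a brand-new file is left behind as untracked.

```shell
# Sandbox sketch: "commit -a" covers tracked files only.
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q demo
cd demo
git config user.name "Example Student"        # hypothetical identity
git config user.email "student@example.com"

printf 'v1\n' > Tracked.java
git add Tracked.java
git commit -q -m "Add Tracked.java"

printf 'v2\n' > Tracked.java       # modify a file git already tracks
printf 'new\n' > Untracked.java    # create a file git has never seen

git commit -q -am "Update Tracked.java"

leftover=$(git status --porcelain)
printf '%s\n' "$leftover"          # the new file is still listed as untracked (??)
```

To include the new file, run git add on it before committing.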

IntelliJ

The IntelliJ Documentation has instructions on how to commit changes.

Pushing Commits to GitLab

Running git commit stores changes in your local repository. You need to push to propagate those changes to GitLab. Make sure to do this! TAs will grade the version of your work that appears in GitLab.

You may commit multiple times before pushing all those changes to GitLab. Or, you might choose to push every time you commit, to avoid forgetting.

Every time you push to GitLab, GitLab will validate your code, which is equivalent to running ./gradlew validateRemote on Attu. (Running tests on every push is a standard industry practice, called "continuous testing" or "continuous integration".) If you get email from GitLab saying "Your pipeline has failed.", that means that ./gradlew validateRemote has failed. (If you are not yet ready to submit your work, the failure will be no surprise. If you are ready to submit your homework and this informs you of a problem you had overlooked, it can be a lifesaver!) To see the output, click on the word "validate" in the email. You can also browse to GitLab to view the error messages from the failed pipeline.

Command Line

Run:

  git push
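Because the staff grades what is on GitLab, it can help to check that nothing is left unpushed. The sandbox sketch below shows one possible way to do this, by counting the commits that exist locally but not on the remote. A local bare repository stands in for GitLab, and the file name and identity are hypothetical.

```shell
# Sandbox sketch: count local commits that have not been pushed yet.
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q --bare remote.git                 # stand-in for GitLab (an assumption)
git --git-dir=remote.git symbolic-ref HEAD refs/heads/master

git init -q work
cd work
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Student"        # hypothetical identity
git config user.email "student@example.com"
git remote add origin "$sandbox/remote.git"

printf 'draft\n' > Hw.java
git add Hw.java
git commit -q -m "Draft"
git push -q origin master

# A second commit that has not been pushed yet.
printf 'final\n' > Hw.java
git commit -q -am "Final answer"

before=$(git rev-list --count origin/master..HEAD)
git push -q origin master
after=$(git rev-list --count origin/master..HEAD)
echo "unpushed commits before push: $before, after push: $after"
```

A count of zero after pushing means the remote has everything you committed.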

IntelliJ

The IntelliJ Documentation has instructions on how to push changes to the repository.

Resolving Conflicts

When multiple people (or the same person on multiple machines, such as the lab machines and your own computer) are working on the same file concurrently, git tries to merge the changes made by each person together as each person runs git pull. Usually, git succeeds.

The most common case of this is when the staff pushes out new sets of starter files for each homework. After this happens, you may be unable to git push any changes until your local copy is up to date with the new starter files. This is especially likely if you have made commits that you haven't pushed to GitLab yet. If you see an error that mentions "[rejected]" together with "non-fast-forward" or "fetch first", don't fret! Simply pull as you normally would and an automatic merge should happen.

Once the automatic merge completes, push to ensure the changes are passed on to GitLab.
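The rejected-push scenario can be reproduced in a sandbox. In this sketch a local bare repository stands in for GitLab and a second clone named staff plays the course staff; all file names and identities are hypothetical. The --no-rebase and --no-edit flags are an addition for this sketch: they ask for a plain merge with the default merge message, which recent git versions may otherwise ask you to choose.

```shell
# Sandbox sketch: push is rejected, pull merges automatically, push succeeds.
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q --bare remote.git                 # stand-in for GitLab (an assumption)
git --git-dir=remote.git symbolic-ref HEAD refs/heads/master

# The student's repository, with one pushed commit.
git init -q student
cd student
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Student"        # hypothetical identity
git config user.email "student@example.com"
git remote add origin "$sandbox/remote.git"
printf 'hw1\n' > Hw1.java
git add Hw1.java
git commit -q -m "Work on hw1"
git push -q origin master

# The "staff" clone pushes new starter files.
cd "$sandbox"
git clone -q remote.git staff
cd staff
git config user.name "Example Staff"          # hypothetical identity
git config user.email "staff@example.com"
printf 'starter\n' > Hw2Starter.java
git add Hw2Starter.java
git commit -q -m "Add hw2 starter files"
git push -q origin master

# Meanwhile the student commits more work and tries to push.
cd "$sandbox/student"
printf 'more hw1\n' >> Hw1.java
git commit -q -am "More hw1 work"
if git push -q origin master 2>/dev/null; then
  rejected=no
else
  rejected=yes        # the remote has commits that this repository lacks
fi
echo "push rejected: $rejected"

# Pull (automatic merge, since different files changed), then push again.
git pull -q --no-rebase --no-edit origin master
git push -q origin master
ls
```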

However, sometimes git is unable to merge the files together when there are two different changes to the same line of a file. In this case, git will signal a conflict during the update; git pull will produce output such as Automatic merge failed; fix conflicts, and git status will produce output such as You have unmerged paths.

Git conflicts are rare — most students will never encounter one — but if you do get a git conflict, you need to resolve it. This is a very brief primer about resolving conflicts; you can read the git documentation to get the full story. Also, it's better to prevent a merge conflict than to have to resolve it later on.

On the command line, to see the status of all your files, run git status. This will tell you, for each file and directory, whether it is currently in a conflicted state or not.

When git detects a file conflict, it changes the file to include both versions of any conflicting portions (yours and the one from the repository), in this format:

  <<<<<<< HEAD
  YOUR VERSION
  =======
  REPOSITORY'S VERSION
  >>>>>>> 4e2b407... -- repository version's revision number

For each conflicting file, edit it to choose one of the versions (or to merge them by hand). Be sure to remove the <<<<<<<, =======, and >>>>>>> lines. (Searching for "<<<" until you've resolved all the conflicts is generally a good idea.)

Once you have made these edits, tell git that you have resolved the conflicts by staging each file as you normally would, and then run git commit to conclude the merge. For example, to stage one resolved file:

  git add src/hw2/test/SpecificationTests.java
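If you would like to practice before you meet a real conflict, the sandbox sketch below creates one on purpose and then resolves it. As a simplification, it uses a second branch and git merge rather than a second computer and git pull; git pull performs the same kind of merge. The branch name other, the file Config.txt, and the identity are hypothetical.

```shell
# Sandbox sketch: create a conflict on purpose, then resolve it.
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Student"        # hypothetical identity
git config user.email "student@example.com"

printf 'greeting = hello\n' > Config.txt
git add Config.txt
git commit -q -m "Base version"

# The other side changes the line one way...
git checkout -q -b other
printf 'greeting = hi\n' > Config.txt
git commit -q -am "Other side's change"

# ...and you change the same line a different way.
git checkout -q master
printf 'greeting = hey\n' > Config.txt
git commit -q -am "My change"

# The merge stops because both sides edited the same line.
git merge other >/dev/null 2>&1 || echo "merge stopped with a conflict"
conflicted=$(cat Config.txt)
printf '%s\n' "$conflicted"                   # shows the conflict markers

# Resolve by hand: keep one version and remove the marker lines.
printf 'greeting = hey\n' > Config.txt
git add Config.txt
git commit -q -m "Merge other, keeping my greeting"
```

In a real conflict you would edit the file in your editor rather than overwrite it with printf; the staging and commit steps are the same.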

Preventing merge conflicts

The text above showed how to fix a merge conflict if one occurs. It's better to prevent them in the first place. Conflicts are possible even when you are working by yourself.

The remainder of this section gives tips for preventing merge conflicts when working with teammates.

Git is no replacement for management! Coordination of work is important, even if you're working separately. You should minimize working on the same file at the same time if possible. If you do work on the same file, work on different portions. Modularizing code into multiple files often makes parallelizing work more efficient. You should always pass major design decisions by your teammates before implementing them, particularly if they involve interfaces that will affect their code.

When and how often should you commit? If you commit too often without sufficient testing, you may introduce bugs into the repository that will affect your teammates' work. However, if you commit too rarely, your teammates will be using outdated code, which may cause wasted effort and merge conflicts later.

There is no hard and fast rule, but one good rule of thumb is to make sure everything at least compiles before you commit and push. If you push non-compiling code, your teammates will be very annoyed when they update (which is good practice) and they cannot compile the code any longer.

Another good rule of thumb (though this one is far more malleable) is that you should avoid leaving work uncommitted when you quit for the day. A lot can happen while you're not coding, and it's generally better to get your changes in working order and commit them before you leave. Committing a large amount of code all at once will result in many more conflicts than committing small amounts of code often. Since the previous rule (of never pushing non-working code) is more important, this can be hard to accomplish if you're making big changes. Thus, it's often good to tackle one feature at a time, so you can finish each piece quickly and keep the repository up-to-date.

Coordinating your efforts with your teammates is, of course, the true key to minimizing merging hassles. Again, git is no replacement for management!

Adding and Removing Directories

Command Line

You can add a subdirectory as normal with:

  mkdir dirname

Git will not recognize an empty directory as a change, so you have to populate it first with a file. Then you can add, commit, and push the file as normal.

To delete a directory or file from the repository, use the standard rm command:

  rm -rf dirname

After adding or deleting a directory, you must stage the change (for example, git add -A stages both new and deleted files) and perform a commit for the change to be reflected in the repository.
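The sandbox sketch below walks through adding and then removing a directory. The directory name docs, the file names, and the identity are hypothetical. It uses git add -A to stage the deletion; git rm -r docs would be another way to delete and stage in one step.

```shell
# Sandbox sketch: add a directory, then remove it again.
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q demo
cd demo
git config user.name "Example Student"        # hypothetical identity
git config user.email "student@example.com"
printf 'readme\n' > README.txt
git add README.txt
git commit -q -m "Initial commit"

# An empty directory is invisible to git.
mkdir docs
before=$(git status --porcelain)
echo "status with an empty directory: '$before'"

# Once the directory contains a file, it can be added and committed.
printf 'notes\n' > docs/Notes.txt
git add docs
git commit -q -m "Add docs directory"

# Delete with plain rm, then stage the deletion and commit it.
rm -rf docs
git add -A
git commit -q -m "Remove docs directory"

git ls-files                                  # only README.txt is still tracked
```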

Tracking Changes

Command Line

To see the change log, which is a list of the messages used when checking in changes:

  git log

To see differences between the working copy and the staging area (which matches the repository's latest copy if you have not staged anything):

  git diff filename(s)

Omit filename to see differences for all files. Each commit is associated with a long hash value, which you can see in git log as commit 2d03d7...

To see changes compared to a particular commit version, enter:

  git diff REVISION filename(s)

Here REVISION is either a commit hash, like 2d03d7..., or HEAD for the most recent commit in your local repository.
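A short sandbox sketch of these commands, assuming a hypothetical file Hw.java with two committed versions and one uncommitted edit:

```shell
# Sandbox sketch: compare the working copy against different revisions.
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q demo
cd demo
git config user.name "Example Student"        # hypothetical identity
git config user.email "student@example.com"

printf 'v1\n' > Hw.java
git add Hw.java
git commit -q -m "First version"
first=$(git rev-parse HEAD)                   # remember the first commit's hash

printf 'v2\n' > Hw.java
git commit -q -am "Second version"

printf 'v3\n' > Hw.java                       # an uncommitted edit

git --no-pager log --oneline                  # two commits, newest first

# Working copy compared with the most recent commit (v2 -> v3).
since_head=$(git diff HEAD -- Hw.java)
# Working copy compared with the first commit (v1 -> v3).
since_first=$(git diff "$first" -- Hw.java)

printf '%s\n' "$since_head" "$since_first"
```

The -- before the file name is optional here; it only makes explicit where revisions end and file names begin.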

IntelliJ

The IntelliJ Documentation has instructions on how to track changes.

Viewing an old version of a file

You can show an old version of a file under git control with commands like the following:

  git show HEAD:MyFile.java
  git show 7ff8dc80:MyFile.java
  git show HEAD@{2013-03-14}:MyFile.java

As seen in the first command, HEAD means the most recent commit on your local copy of the repository. The second command uses a commit hash and the third uses a date.

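You can also restore a file in your working copy to the version stored in a commit. With HEAD, as below, this discards your uncommitted changes to the file; substitute a commit hash for HEAD to get an older version: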

  git checkout HEAD MyFile.java

This command changes your working copy (that is, your local directory) and the staging area, but it does not change the commits stored in the repository. Do not attempt to edit the old version of the file in your working copy. Doing so will result in nasty merge conflicts and confusion.

To unstage everything you have staged (resetting the staging area to the most recent commit, without changing the files in your working copy), use the following command:

  git reset HEAD

Note that checkout and reset are complicated and can have adverse effects on your local repository. You MAY lose data playing with these commands (for example, git reset --hard also overwrites your working copy and discards uncommitted changes), and you should only use them with full understanding of what they do.

If you want to reuse parts of an old version, save a copy of it somewhere (git stash is one way to set changes aside temporarily), then git pull to bring your working copy to the current version of the file. Now, edit the current version in whatever way you like, possibly copying some or all of the differences from the old version that you saved. You can discard the old version when you are done; there is no need to check it into GitLab.
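One possible way to save a copy of an old version without disturbing your working copy is to redirect the output of git show into a separate file. In this sandbox sketch, the saved file name MyFile.old.java, the file contents, and the identity are hypothetical.

```shell
# Sandbox sketch: save an old version to a separate file for reference.
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q demo
cd demo
git config user.name "Example Student"        # hypothetical identity
git config user.email "student@example.com"

printf 'old helper\n' > MyFile.java
git add MyFile.java
git commit -q -m "Old version"
old=$(git rev-parse HEAD)                     # hash of the old commit

printf 'new code\n' > MyFile.java
git commit -q -am "Current version"

# Write the old version to a file outside the working copy.
saved="$sandbox/MyFile.old.java"
git show "$old":MyFile.java > "$saved"

cat "$saved"                                  # the old contents
cat MyFile.java                               # the current contents, unchanged
git status --short                            # nothing to commit
```

You can then copy whatever you need from the saved file into the current version and delete the saved file when you are done.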

Git Pitfalls

Some tips on avoiding common problems while using git: