Week 3

Day 5 - "Putting things back again"

Revert, I say. Revert!

Whilst working with your repository, something that occurs quite often is the need to go back in time, whether temporarily, permanently, or only partially. Git allows you to do this in a multitude of ways. Let's see a real-life situation where this need could arise.
In the trenches...
"No, I don't have a copy of the file. I was stupid and after I submitted it to you I er... deleted it".

John gave Michael the raised eyebrow look. It wasn't the first time Michael had come to him with a similar problem. Usually John would have had Michael go rooting through the archives to find it. This time though, he wondered if Git might just come to the rescue.

"Tell ya what Michael," he grinned, "Since this isn't the first time you've come to me with this kind of predicament, why don't you go and find out how to use Git to get the file back." Michael sighed. "I have tagged the repo each time we created an archive, so tell me what I need to run to extract it."

* * *

"Man," started Michael, running over to John's desk forty-five minutes later. "I never knew there were so many ways to skin a cat."

Michael was a little out of shape and, though he had only crossed a minor distance, he now stood there, leaning over John's desk, ever so slightly out of breath.

"So, you learn much?" asked John.

"Where d'ya want me to start?"

Where exactly do we start? Well, one of the neat things about Git is that there are many ways to produce the same result. While that may not seem like a benefit now, the trick is knowing just how to use each tool and what the benefit of each method is. Right now, we are ready to look at three methods for achieving the task of viewing old information in the repository. So how do we choose which method we wish to use? We need to answer a few more questions before we are ready to decide.

The table below shows the three methods that we have access to now. Note that this may not be a definitive list of methods, but that these can give us access to the data we need to view. The columns are requirements or criteria. We need to evaluate each command in order to determine which one is right for us. Once you have been using Git a while, these kinds of evaluations will become second nature to you, but right now, we will take a look at all available options, just to see what is out there.

Method Name | Alters Repository | Changes History | Alters Working Copy | Reversible | Multiple Files
Reset       | Possibly          | Possibly        | Possibly            | Possibly   | Yes
Checkout    | No                | No              | Yes                 | No         | Yes
Show        | No                | No              | No                  | N/A        | No

Let's take a look at each of these in turn. We are going to be covering two new commands and revisiting an old one. Let us start with git reset. We have already met this tool once. When we used it previously, its purpose was to pull files out of the index that we were not ready to commit. We were using git reset in its simplest state. Actually Git can perform several other kinds of reset. It should be noted here that using this can be quite dangerous as it can affect your index, your working copy, your branch and the pointer HEAD.
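As a reminder of that simplest use of git reset, here is a minimal sketch that pulls a file back out of the index. It runs in a throwaway repository, and the file name notes.txt, the identity settings and the commit message are placeholders chosen for illustration, not anything from John's repository.

```shell
#!/bin/sh
set -e

# Work in a scratch repository so no real data is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "john@example.com"   # placeholder identity
git config user.name "John"

echo "first version" > notes.txt           # hypothetical file name
git add notes.txt
git commit -q -m "Add notes"

echo "not ready yet" >> notes.txt
git add notes.txt                           # staged too early

# Pull the file out of the index; the edit itself stays in the working copy.
git reset -q HEAD -- notes.txt

staged=$(git diff --cached --name-only)     # files still in the index
unstaged=$(git diff --name-only)            # files changed but not staged
echo "staged: [$staged] unstaged: [$unstaged]"
```

Used this way, with a path, reset only touches the index. The branch and the working copy are left alone, which is why this form is considered safe.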

In order to use git reset in any sane way to achieve our goal, we would need to look at branching, which at the moment, we are not ready to do. In short, git reset can drastically affect your working copy, touching multiple files, and before we begin investigating, we really need to learn how to play in a safe environment.

The next method on our list to discuss is the git checkout command. This command can be used to bring back either a single file or multiple files and, once again at this stage, is best employed in conjunction with branches. At this point, you may be wondering why we are placing such emphasis on the use of branches. As you will see next week, branches are incredibly powerful things, which allow you to experiment and play with your data without the risk of losing anything. git checkout will pull files from a previous commit into our working copy. This is something to remember. When we check out a specific path, any uncommitted changes to that file are overwritten without warning, and when we switch to a different branch or commit, the checkout will refuse to proceed if it would overwrite our local changes.

The last method we can use to view data which was in a previous commit, is the git show command. This command literally pulls data from a previous version and dumps it to the standard output, a little like the cat command present in almost every single *nix environment.

Now that we have taken a quick look at our three methods, we must decide which one is going to be the most useful to us. Looking at the scenario above, we can deduce that we really only need to pull out one file. If our intention was to do large amounts of work on an old branch and pull many files from it, git reset may have been a good choice. As we are looking for only a single file, we should consider looking at the checkout and show tools.

So now let us see how we can use git checkout to take one of our files back to the past.
john@satsuki:~/coderepo$ git status
# On branch master
nothing to commit (working directory clean)
john@satsuki:~/coderepo$ git checkout v0.9 -- my_second_committed_file
john@satsuki:~/coderepo$ cat my_second_committed_file
Change1
john@satsuki:~/coderepo$ git checkout HEAD -- my_second_committed_file
john@satsuki:~/coderepo$ cat my_second_committed_file
Changed this file completely
john@satsuki:~/coderepo$

john@satsuki:~/coderepo$ git status
# On branch master
nothing to commit (working directory clean)
john@satsuki:~/coderepo$ git checkout v0.9 -- my_second_committed_file
john@satsuki:~/coderepo$ git status
# On branch master
# Changes to be committed:
# (use "git reset HEAD <file>..." to unstage)
#
# modified: my_second_committed_file
#
john@satsuki:~/coderepo$ git checkout HEAD -- my_second_committed_file
john@satsuki:~/coderepo$ git status
# On branch master
nothing to commit (working directory clean)
john@satsuki:~/coderepo$

Notice how we first checked that we didn't have any local modifications by running the git status command. Then we are safe to run the git checkout command. We used the v0.9 tag from earlier to refer to an earlier commit state. The next part of the command is the double hyphen -- that tells Git that what comes after it is the path. Finally we choose my_second_committed_file as the source file. After this, when we cat the file, we see that it has changed to what it used to be in v0.9.

We then switch the file back to the latest version by using the HEAD reference. Note that on the odd occasion, the HEAD reference doesn't point to where you think it does, but this is an area we are yet to cover. Then we run the command one more time, but this time we intersperse it with a git status to see that there are changes made to our local working copy.
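If you do not have a repository with a v0.9 tag to hand, the session above can be reproduced from scratch. The following is only a sketch: the scratch directory, identity settings and commit messages are assumptions made for illustration, while the file name, tag and file contents mirror the transcript.

```shell
#!/bin/sh
set -e

# Build a scratch repository with two versions of the file and a tag on the first.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "john@example.com"   # placeholder identity
git config user.name "John"

echo "Change1" > my_second_committed_file
git add my_second_committed_file
git commit -q -m "First version"
git tag v0.9

echo "Changed this file completely" > my_second_committed_file
git commit -q -a -m "Rewrite the file"

# Take the file back to its v0.9 state. The index is updated along with
# the working copy, which is why git status reports a staged change.
git checkout v0.9 -- my_second_committed_file
old_content=$(cat my_second_committed_file)
staged=$(git diff --cached --name-only)

# Return the file to the version recorded in the latest commit.
git checkout HEAD -- my_second_committed_file
new_content=$(cat my_second_committed_file)
status_after=$(git status --short)

echo "at v0.9: $old_content"
echo "at HEAD: $new_content"
```

The script captures the state after each checkout in a variable rather than printing git status, because the exact wording of the status output varies between Git versions.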

Show me the money

The git show command will have largely the same effect, except it grabs us the data without having to change existing files in our working copy. Let us view a quick example.
john@satsuki:~/coderepo$ git show v0.9:my_second_committed_file
Change1
john@satsuki:~/coderepo$ git show v0.9:my_second_committed_file > temp_file
john@satsuki:~/coderepo$ cat temp_file
Change1
john@satsuki:~/coderepo$

The format of the git show command is rather similar to the checkout command we used a few moments ago. The only difference is the presence of the colon, instead of the double hyphen. Notice how the effect of the first command is just to print out the contents of the requested file to the screen. In a Linux environment it is easy to send this output to a new file instead. In the example above, we redirect the output using the > character to the file called temp_file.
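To see that git show really does leave the working copy alone, the sketch below repeats the extraction while the file has an uncommitted edit. As before, the scratch repository, identity settings and commit messages are assumptions for illustration only.

```shell
#!/bin/sh
set -e

# Scratch repository with an old tagged version and a newer commit.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "john@example.com"   # placeholder identity
git config user.name "John"

echo "Change1" > my_second_committed_file
git add my_second_committed_file
git commit -q -m "First version"
git tag v0.9

echo "Changed this file completely" > my_second_committed_file
git commit -q -a -m "Rewrite the file"

# Simulate work in progress that has not been committed.
echo "uncommitted edit" >> my_second_committed_file

# Write the old version to a separate file; the working file is not touched.
git show v0.9:my_second_committed_file > temp_file

recovered=$(cat temp_file)
working=$(tail -n 1 my_second_committed_file)
echo "recovered: $recovered"
echo "last line of working file: $working"
```

Running git checkout v0.9 -- my_second_committed_file at the same point would have replaced the file and discarded the uncommitted edit, which is the practical difference between the two commands here.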

Hopefully you can now see that there are often several ways to achieve the same result and it is important to ensure that you choose the right tool for the job. The reset command was too dangerous to use, the checkout command modified our working copy, but the show tool allowed us to create a new file, guaranteeing that our working copy remained untouched.
In the trenches...
"So, if I currently have changes to the file you want, in my local repository," began John, "what command would you recommend I use?"

Michael paused, clearly considering each method in his head. The noise from the sandwich van's horn rang through the office and Michael immediately stood bolt upright and looked panicked. "The van John" he stuttered, "The van"

"You can go to the van when you tell me which command I should use." John smirked. Michael was one of the more junior members of the team and the managers often took the opportunity to haze him.

"I'm gonna go with git show," he said in a rush.

"Why?" asked John.

"So you don't harm the working tree." replied Michael smoothly, already walking out the door.

"You could have also branched," shouted Rob, who was a few steps ahead of him.

* * *

"So, what's the status then John?" asked Markus.

John pressed a button on his laptop and the slideshow on the screen advanced to show an organisational model. "Well, we've not had a whole lot of time this week with the release for project Manta, but we've managed to look at logging and diffing, which is something that we really needed to cover. Klaus also showed everyone how to tag things and went through our version numbering system again as several people had forgotten." Everyone in the room looked at Jack. "We also found out about how to pull older versions out of the repository in a variety of ways."

Markus looked pleased, "So, what's next?"

"Klaus?" asked John.

"Next, John put my team in charge of defining and teaching everyone about branching and merging. This is the really important stuff." Klaus took over control of the laptop and clicked onto the next slide, which detailed a list of features. "We really need to get a good handle on these topics to be successful. It is key to collaboration"

"Well done team," ended Markus, "Wayne is going to be impressed with this."

We now have a good working knowledge of how to do many key things in Git. Logging and diffing are supremely important for inspecting what changes have occurred in the repository. Though the options here are not an exhaustive list, they should give you a basic understanding of how to use the tools. It is well worth looking at the man pages for these commands to get an idea of just how expansive they can be. For example, the diff tool can not only show you differences between your working copy and the index, but also between your index and the latest commit, using the --cached option.
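The two diff comparisons mentioned above can be tried side by side. This is a minimal sketch in a throwaway repository; the file name report.txt, its contents and the identity settings are made up for the example.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "john@example.com"   # placeholder identity
git config user.name "John"

echo "line one" > report.txt               # hypothetical file name
git add report.txt
git commit -q -m "Add report"

echo "staged line" >> report.txt
git add report.txt                          # this change is now in the index
echo "unstaged line" >> report.txt          # this one exists only in the working copy

# Working copy compared with the index: only the unstaged change is added.
working_vs_index=$(git diff)

# Index compared with the latest commit: only the staged change is added.
index_vs_head=$(git diff --cached)

echo "$working_vs_index"
echo "$index_vs_head"
```

Reading the two outputs together gives a complete picture of what would and would not go into the next commit.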

Next we move on to branching and merging. Branching can be a tricky subject, so it is important to understand what is happening at the repository level. It would be prudent to look over the After Hours section for Week 2 before continuing as some of the terminology may be a little confusing otherwise.
