Week 8

Day 2 - "Looking for problems"

A problem shared is a problem bisected

During most software development, bugs are introduced. Sometimes these bugs are fixed immediately and sometimes they sit there in the code festering away for months on end until someone tests a specific case. Of course it is always best to have test suites and run them regularly against the code base, but on occasions either the test case itself has a bug, or the test case is written in such a way that a particular bug would never present itself. Tamagoyaki Inc. have a fairly rigorous testing procedure. Unfortunately it would seem that one particularly nasty bug has slipped through the cracks. Cue a difficult discussion.
In the trenches...
"But what I don't understand John, is that you now know what happened at every step in the process. How can something like this break and you not know about it?" As always Markus was getting snappy and as always John was having to bite his lip.

"It's not a question about not knowing about it," begain John, "The difficulty is knowing what change introduced the problem. We are on such a rapid development schedule that too many things are changing at once."

"Well, this is one of the reasons you guys have spent the last two months getting this version control system running." Markus got up and opened the door. "I suggest you fix it."

* * *

"Markus is blaming us for introducing a bug?" Rob was pretty shocked as he and Simon chatted at the water cooler.

"More like, Markus believed that a version control system was going to solve all of our problems," replied Simon.

Rob squinted his face up as a car drove into the buildings car park, showering the room with reflected sunlight. He shielded his eyes. "You know I heard there was a tool in Git for helping to find bugs. Think I may take a look over lunch, you know, be a real hero."

They both chuckled.

It is true that Git does have a very powerful tool for helping to detect revisions that introduced bugs into the system. The tool is called git bisect and it is used to successively checkout revisions from the repository, check them to see if the bug is present and then use that information to determine the revision that is most likely to have introduced the bug.

Let us assume that the bug in our repository is a fairly simple one. For some bizarre reason our codebase is broken unless the word Addition is present in one of the files. If we run a simple Linux grep command across the files, we can see that the word we are after is not there. However, if we go back to tag v1.0a and run the same command, we can see that the word is there.
john@satsuki:~/coderepo$ grep "Addition" *
john@satsuki:~/coderepo$ git checkout v1.0a
Note: checking out 'v1.0a'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

git checkout -b new_branch_name

HEAD is now at a022d4d... Messed with a few files
john@satsuki:~/coderepo$ grep "Addition" *
my_third_committed_file:Addition to the line
john@satsuki:~/coderepo$

Notice the warning about checking out a non-branch. This is perfectly normal and should not worry you but please be aware that it is obviously best to have a clean working directory before starting any type of bisect commands. We can see that the string we are looking for is present in the file called my_third_committed_file. As our repository is very small, it would not take us long to go through and check each revision to see when this string was deleted. In fact we have other tools available to search for the adding and removal of strings. For now let us assume that the bug is more complicated than this.

Let us go back to the facts. We know that the repository was good at tag v1.0a. We also know that the repository is bad in its current state. By feeding these details to the git bisect command, we can begin a search for the bug. What will happen at each stage is that Git will checkout a revision that it wants us to test and we tell Git if we think that revision is good or bad.
john@satsuki:~/coderepo$ git bisect start
Already on 'master'
john@satsuki:~/coderepo$ git bisect good v1.0a
john@satsuki:~/coderepo$ git bisect bad master
Bisecting: 9 revisions left to test after this (roughly 3 steps)
[ed2301ba223a63a5a930b536a043444e019460a7] Removed third file
john@satsuki:~/coderepo$

So we invoke the tool by running git bisect start. After this we tell Git the things that we know. It was good at v1.0a, git bisect good v1.0a. However, it was bad at master, our current revision, git bisect bad master. After this, Git checks out revision ed2301b and tells us that there are 9 revisions between the two points and that it should take only 3 more steps to complete. Now we run our test again.
john@satsuki:~/coderepo$ grep "Addition" *
john@satsuki:~/coderepo$

As we have no result here, this would be classed as a bad revision and so we mark it as so.
john@satsuki:~/coderepo$ git bisect bad
Bisecting: 3 revisions left to test after this (roughly 2 steps)
[9710177657ae00665ca8f8027b17314346a5b1c4] Added another file
john@satsuki:~/coderepo$

Git now presents us with a new choice and you can see that the number of revisions left to check has decreased dramatically from 9 to 3. We continue marking our revisions as good and bad.
john@satsuki:~/coderepo$ grep "Addition" *
my_third_committed_file:Addition to the line
john@satsuki:~/coderepo$ git bisect good
Bisecting: 2 revisions left to test after this (roughly 1 step)
[cfbecabb031696a217b77b0e1285f2d5fc2ea2a3] Fantastic new feature
john@satsuki:~/coderepo$ grep "Addition" *
my_third_committed_file:Addition to the line
john@satsuki:~/coderepo$ git bisect good
Bisecting: 0 revisions left to test after this (roughly 1 step)
[b119573f4508514c55e1c4e3bebec0ab3667d071] Merge branch 'wonderful'
john@satsuki:~/coderepo$ grep "Addition" *
my_third_committed_file:Addition to the line
john@satsuki:~/coderepo$ git bisect good
ed2301ba223a63a5a930b536a043444e019460a7 is the first bad commit
commit ed2301ba223a63a5a930b536a043444e019460a7
Author: John Haskins <john.haskins@tamagoyakiinc.koala>
Date: Fri Apr 1 07:37:34 2011 +0100

Removed third file

:100644 000000 68365cc0e5909dc366d31febf5ba94a3268751c6 0000000000000000000000000000000000000000 D my_third_committed_file
john@satsuki:~/coderepo$

Oh! Something different. Git has actually finished the bisect and has suggested to us that this commit was responsible for introducing the bug in our code. If we look at the comment it was in this revision that we removed a particular file. This file was the one that contained our special Addition string. Git was right! We screwed up then. At this point we can go back to our master branch and decide what to do about it.
john@satsuki:~/coderepo$ git branch -v
* (no branch) b119573 Merge branch 'wonderful'
develop aed985c More new deving
master 30900fe More new deving
wonderful 4d91aab Updated another file again
zaney 7cc32db Made another awesome change
john@satsuki:~/coderepo$ git checkout master
Previous HEAD position was b119573... Merge branch 'wonderful'
Switched to branch 'master'
john@satsuki:~/coderepo$

Notice that at the end of the bisect, Git does not return us to the master branch. We are left in the last tested checked out revision.

Automating the process

So bisecting is a very powerful way of quickly and efficiently finding the point at which bugs were introduced or regression testing. Git was spot on when it suggested that that revision was the one responsible for the mistake. Sometimes you may not be able to test a revision that Git checks out for you for other reasons. In this case you can always run git bisect skip to skip that revision. It is all very well being able to run this at each revision Git asks us to but to be honest, if you have 30-40 steps to test and you have to compile code to see if the bug is present it can get a little bit boring.

Git has a way of allowing us to test automatically. The example we are going to use is obviously based on a Linux environment, but if you are a developer on a Windows platform, you should have no trouble understanding what is happening here. We are going to create a small shell script that will automatically run our grep test. If the string is found we will exit with a status code of 0, indicating that it was successful and if the string is not found, we will exit with a status code of 123, indicating that the test was unsuccessful.

Git will use these status codes and interpret a code of 0 as good and a code of 123 as bad. Below is a copy of our shell script which we have saved as test.sh and have given relevant permissions to allow it to run etc. Notice we have had to exclude our test.sh file from the test, else the string Addition would have been found there which would have returned true every time.
john@satsuki:~/coderepo$ cat test.sh
#!/bin/bash

if grep -q Addition * --exclude=test.sh
then echo "Good"
exit 0
else
echo "Bad"
exit 123
fi
john@satsuki:~/coderepo$

Now we invoke git bisect slightly differently by asking it to start and itterate over the revisions master to v1.0a. At this point we have not told Git anything about which revisions are good or bad.
john@satsuki:~/coderepo$ git bisect start master v1.0a
Bisecting: 9 revisions left to test after this (roughly 3 steps)
[ed2301ba223a63a5a930b536a043444e019460a7] Removed third file
john@satsuki:~/coderepo$

Now we ask Git to continue testing, but to run our script at each iteration to determine the success or failure of each checked out revision.
john@satsuki:~/coderepo$ git bisect run sh ./test.sh
running sh ./test.sh
Bad
Bisecting: 3 revisions left to test after this (roughly 2 steps)
[9710177657ae00665ca8f8027b17314346a5b1c4] Added another file
running sh ./test.sh
Good
Bisecting: 2 revisions left to test after this (roughly 1 step)
[cfbecabb031696a217b77b0e1285f2d5fc2ea2a3] Fantastic new feature
running sh ./test.sh
Good
Bisecting: 0 revisions left to test after this (roughly 1 step)
[b119573f4508514c55e1c4e3bebec0ab3667d071] Merge branch 'wonderful'
running sh ./test.sh
Good
ed2301ba223a63a5a930b536a043444e019460a7 is the first bad commit
commit ed2301ba223a63a5a930b536a043444e019460a7
Author: John Haskins <john.haskins@tamagoyakiinc.koala>
Date: Fri Apr 1 07:37:34 2011 +0100

Removed third file

:100644 000000 68365cc0e5909dc366d31febf5ba94a3268751c6 0000000000000000000000000000000000000000 D my_third_committed_file
bisect run success
john@satsuki:~/coderepo$

The parameters after the git bisect run tell Git which command we wish to run at each stage. In our case it is sh ./test.sh. You can see Git invoking our test.sh script in each case, and the result of our script, either Good or Bad depending on which was echoed from the result of the grep test. Git has arrived at exactly the same result, but we have had to do nothing other than write a small script. For larger tests, this would have saved us a large amount of work.
In the trenches...
"Simon could I have a word?" It was Rob and he wasn't looking happy.

Simon turned to him and grinned, "Sure buddy what's up?" His face dropped when he saw Rob's expression.

"I think we'd better go grab the meeting room."

Simon looked confused.

"I used the bisect tool to find the bug. But you're not gonna like what I found." * * * "Simon how could you have done that?" John was asking the questions and they were coming thick and fast. "I mean changing the API key for the web sevice whilst developing was not a great idea to start with, but committing that to the repository was rediculous." Simon sat there with his head in his hands. "You know how secret that API key is right?" Simon nodded. "Simon we were supposed to be releasing this repository publically in a few weeks but now that the API is in there we can't do that."

"John I'm really sorry OK." Simon was kicking himself for his mistake.

John sighed, he had been really angry to begin with but now he was calming down, "It's OK Simon, we're all getting used to the repository and version control. Do you think we can fix it?"

Previous Day

Next Day

 
   
home | download | read now | source | feedback | legal stuff