Week 8

Day 1 - "Give a man a patch"

Collaborating with outsiders

We have spoken at great length now about rebasing and have seen that it is a very powerful tool. It can form part of your workflow in your development cycle. However, always heed the warning that should set alarm bells ringing in the back of your mind about rebasing: rebasing changes the past. Rebasing changes history. As such, it should be used a) with caution, and b) only by people who understand exactly what they are doing.

We are going to leave rebasing for a while now, take a quick look at a feature you really should know about and then focus on some of the more advanced features of Git. The following situation occurs fairly regularly for some people.
In the trenches...
John was stroking his chin and looking pensively out of the window when Simon approached his desk. The manager hadn't seen him yet and Simon instinctively swayed a little back and forth, trying to make himself known in as subtle a way as possible. Klaus, who was watching from the corner of his eye, took a more direct approach. He took the out-of-date org chart down from the office divider, screwed it up into a ball and launched it at John's head. It struck the manager squarely in the jaw, causing him to almost tip from his awkwardly balanced chair.

John noticed Simon standing there and looked a little surprised. He then noticed Klaus and in an instant understood the chain of events that had just taken place. "Sorry Simon," started John, "I've been trying to figure out a problem all morning."

"It's no problem." Simon pulled up a chair and sat down. "I was wondering if you had a few minutes to discuss Luigi?"

* * *

"Well as Luigi is a contractor, he's not going to get access to our repository here to perform commits directly. And he doesn't have the capability, nor do I really want him, making our code available on the internet. But he does have a clone of our repository from last week." John understood the problem.

"Right!"

"Have you heard of patching in Git?" asked John.

Simon looked at his shoes, "Can't say I have John, sorry."

John smiled, "No worries. What we can do is get Luigi to generate a patch of his changes. We can then take that patch and apply it to our codebase. Luigi can then just reset his clone when he comes into the office." Simon nodded as John continued, "Go and ask Martha about it. I think she's pretty hot on these types of things."

Klaus giggled, "Think she's hot eh John?"

The paper was returned.

It is a good question though. Sometimes you may have a repository that is either publicly available, or made available to a group of people. You do not necessarily want to set up a remote tracking branch and pull changes in from every single contributor. There are two primary reasons for this:
  1. There are a large number of people submitting small changes to the code.
  2. There are difficulties in communicating between the two repositories either for security or general reasons.

In these cases we need another way to apply changes from one branch into another. Many larger open source projects allow contributors to email in patches. Git does have some rather advanced ways of dealing with these types of scenarios. We are going to scratch the surface and look at using three commands: git apply, git format-patch and git am.

First, let us find a way of generating a patch. Let us take the example we have currently in our repository. Imagine that the develop branch exists on another computer in a clone of our repository. At some point in time, someone cloned our repository. They have the HEAD of our repository at the same point as we do, but they have continued to do some development in a new branch called develop. Now they are ready to give those changes back.

Firstly we are going to look at using the git diff tool to generate a patch file which we can apply.
john@satsuki:~/coderepo$ git checkout develop
Already on 'develop'
john@satsuki:~/coderepo$ git diff master develop
diff --git a/newfile2 b/newfile2
index 3545c1d..ff59f55 100644
--- a/newfile2
+++ b/newfile2
@@ -1,2 +1,3 @@
Another new file
and a new awesome feature
+newer dev work
diff --git a/newfile3 b/newfile3
index 638113c..2e00739 100644
--- a/newfile3
+++ b/newfile3
@@ -1 +1,2 @@
These changes are in the origin
+new dev work
john@satsuki:~/coderepo$

That generates a diff showing the changes needed to take the master branch to the state of the develop branch. We could copy and paste that information from the terminal window into a file, but the shell offers us an easier way of doing this: redirecting the output into a file.
john@satsuki:~/coderepo$ git diff master develop > our_patch.diff
john@satsuki:~/coderepo$ cat our_patch.diff
diff --git a/newfile2 b/newfile2
index 3545c1d..ff59f55 100644
--- a/newfile2
+++ b/newfile2
@@ -1,2 +1,3 @@
Another new file
and a new awesome feature
+newer dev work
diff --git a/newfile3 b/newfile3
index 638113c..2e00739 100644
--- a/newfile3
+++ b/newfile3
@@ -1 +1,2 @@
These changes are in the origin
+new dev work
john@satsuki:~/coderepo$

So we can see that the file itself has the information we are looking for. Now we can use the git apply tool to actually modify the files in master and bring in the changes that have happened in develop.
john@satsuki:~/coderepo$ git checkout master
Switched to branch 'master'
john@satsuki:~/coderepo$ git apply our_patch.diff
john@satsuki:~/coderepo$ git diff
diff --git a/newfile2 b/newfile2
index 3545c1d..ff59f55 100644
--- a/newfile2
+++ b/newfile2
@@ -1,2 +1,3 @@
Another new file
and a new awesome feature
+newer dev work
diff --git a/newfile3 b/newfile3
index 638113c..2e00739 100644
--- a/newfile3
+++ b/newfile3
@@ -1 +1,2 @@
These changes are in the origin
+new dev work
john@satsuki:~/coderepo$ git commit -a -m 'Updated with patch'
[master 81eee9f] Updated with patch
2 files changed, 2 insertions(+), 0 deletions(-)
john@satsuki:~/coderepo$ git diff develop master
john@satsuki:~/coderepo$

Of course doing things this way means that we still have to commit our changes. Plus, all of the changes that we have made in the patch are committed in one block. Sure, we could split that using some of the techniques in the After Hours sections, but then we may not always be aware of what should be split where.
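
Before applying a diff file like the one above, it can be worth asking Git what the patch would do. The following is a minimal sketch rather than part of the Tamagoyaki workflow: it builds a throwaway repository (the temporary directory, the identity and the single file are placeholders chosen for illustration) and uses the --stat and --check options of git apply, neither of which modifies any files.

```shell
#!/bin/sh
# Sketch only: a throwaway repository with a placeholder identity.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.invalid"

echo "These changes are in the origin" > newfile3
git add newfile3
git commit -q -m "Start new dev"

# Do some work on a develop branch.
git checkout -q -b develop
echo "new dev work" >> newfile3
git commit -q -a -m "Some new dev work"

# Write the differences between master and develop to a file.
git diff master develop > our_patch.diff

git checkout -q master
# --stat summarises what the patch touches without changing anything.
git apply --stat our_patch.diff
# --check tests whether the patch would apply cleanly, again without changing anything.
if git apply --check our_patch.diff; then
    git apply our_patch.diff
    check_result="applied"
else
    check_result="rejected"
fi
last_line=$(tail -n 1 newfile3)
echo "$check_result: $last_line"
```

As with the session above, the applied changes are only in the working tree at this point and still need to be committed.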

Can we have some order please?

There is another tool that can come to our rescue here. It is primarily used for working with mailboxes, but it also has some other uses which we will describe here. Would it not be nice to have each commit that we want to use as a patch in a separate patch file? The file our_patch.diff above contained two commits' worth of data. The tool that helps us in our fight against disparate systems is the git format-patch command.

First we will undo the changes we made previously by resetting the master branch back to its older position and deleting the our_patch.diff file.
john@satsuki:~/coderepo$ git reflog show master -n 4
81eee9f master@{0}: commit: Updated with patch
f8d5100 master@{1}: commit: Finished new dev
1968324 master@{2}: commit: Start new dev
john@satsuki:~/coderepo$ git reset --hard f8d5100
HEAD is now at f8d5100 Finished new dev
john@satsuki:~/coderepo$ rm our_patch.diff
john@satsuki:~/coderepo$

We used the git reflog command to show what the last four master HEAD values were. Then we reset the branch back to the point before the git apply. Finally we deleted the patch. Now let us see how to use the git format-patch command to create multiple patch files.
john@satsuki:~/coderepo$ git format-patch master..develop
0001-Some-new-dev-work.patch
0002-More-new-deving.patch
john@satsuki:~/coderepo$

It would appear that the result of this command is that two files have been generated. Let us confirm our suspicions and cat the contents of them to ensure that they contain the data we expect.
john@satsuki:~/coderepo$ cat 0001-Some-new-dev-work.patch
From af3c6d730a8632d99b5626a7c0e921d14af21f50 Mon Sep 17 00:00:00 2001
From: John Haskins <john.haskins@tamagoyakiinc.koala>
Date: Thu, 7 Jul 2011 19:01:59 +0100
Subject: [PATCH 1/2] Some new dev work

---
newfile3 | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/newfile3 b/newfile3
index 638113c..2e00739 100644
--- a/newfile3
+++ b/newfile3
@@ -1 +1,2 @@
These changes are in the origin
+new dev work
--
1.7.4.1

john@satsuki:~/coderepo$

Woah! Hold on a minute. This does not seem to be a normal diff file at all, and that is absolutely right. This is a patch file and the two are not the same. The patch file contains much more information than the simple diff file. For a start we get information about which commit this patch came from, who created it, when, and a subject. It looks almost like an email, because the format is designed to be easy to send by email.
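
To keep generated patch files out of the way, git format-patch can write them into a separate directory with the -o option. The sketch below assumes a throwaway repository with a placeholder identity and a hypothetical directory name, patches; it then reads the Subject header back out of the first file to show that the commit message travels with the patch.

```shell
#!/bin/sh
# Sketch only: a throwaway repository with a placeholder identity.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.invalid"

echo "These changes are in the origin" > newfile3
git add newfile3
git commit -q -m "Start new dev"

# Two commits on develop that master does not have.
git checkout -q -b develop
echo "new dev work" >> newfile3
git commit -q -a -m "Some new dev work"
echo "newer dev work" >> newfile3
git commit -q -a -m "More new deving"

# -o writes one file per commit into the named directory (here the
# hypothetical "patches"); -q suppresses the list of file names.
git format-patch -q -o patches master..develop

patch_count=$(ls patches | wc -l | tr -d ' ')
# The Subject header of each file carries the first line of the commit message.
first_subject=$(sed -n 's/^Subject: //p' patches/0001-*.patch)
echo "$patch_count patch files, first subject: $first_subject"
```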

We have specified a range of commits to the git format-patch command with the parameter master..develop. The format of that parameter should be familiar from earlier chapters when we utilised it for commands like git diff and git log. We could now take those files, email them to someone else and they could apply them. Let us learn one more tool, and see how we would apply those patches when they had been received at the other end.
john@satsuki:~/coderepo$ git am 0001-Some-new-dev-work.patch
Applying: Some new dev work
john@satsuki:~/coderepo$ git am 0002-More-new-deving.patch
Applying: More new deving
john@satsuki:~/coderepo$ git diff master..develop
john@satsuki:~/coderepo$
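
The two patches above were applied one at a time. As a rough sketch, git am also accepts several patch files in one invocation, and git am --abort returns the branch to where it started if something goes wrong. The repository, identity and the patches directory below are placeholders for illustration only.

```shell
#!/bin/sh
# Sketch only: a throwaway repository with a placeholder identity.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.invalid"

echo "These changes are in the origin" > newfile3
git add newfile3
git commit -q -m "Start new dev"

git checkout -q -b develop
echo "new dev work" >> newfile3
git commit -q -a -m "Some new dev work"
echo "newer dev work" >> newfile3
git commit -q -a -m "More new deving"

git format-patch -q -o patches master..develop

git checkout -q master
before=$(git rev-parse master)
# The shell expands the glob in sorted order, so 0001 is applied before 0002.
if git am patches/*.patch; then
    am_status="applied"
else
    # On failure, --abort restores the branch to its state before git am began.
    git am --abort
    am_status="aborted"
fi

# Count the commits added to master and confirm its content now matches develop.
new_commits=$(git rev-list --count "$before"..master)
if git diff --quiet master develop; then
    trees_match="yes"
else
    trees_match="no"
fi
echo "$am_status, $new_commits new commits, content matches develop: $trees_match"
```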

Of course this is just a simple example, and in actual usage there may be conflicts and other complications. Looking at the log output, we can see that the original dates and times of the commits are maintained and are not updated. We can change this if we wish by using the --ignore-date parameter, which uses the current date when committing the patch to the repository.
john@satsuki:~/coderepo$ git log -n4
commit 30900fe1b7e72411dabab8b02070f36e2431f704
Author: John Haskins <john.haskins@tamagoyakiinc.koala>
Date: Thu Jul 7 19:02:15 2011 +0100

More new deving

commit a8281fb589e36389cc8cb0da7ebee225b4d1adfc
Author: John Haskins <john.haskins@tamagoyakiinc.koala>
Date: Thu Jul 7 19:01:59 2011 +0100

Some new dev work

commit f8d5100142b43ffaba9bbd539ba4fd92af79bf0e
Author: John Haskins <john.haskins@tamagoyakiinc.koala>
Date: Thu Jul 7 08:39:29 2011 +0100

Finished new dev

commit 1968324ce2899883fca76bc25496bcf2b15e7011
Author: John Haskins <john.haskins@tamagoyakiinc.koala>
Date: Thu Jul 7 08:39:07 2011 +0100

Start new dev
john@satsuki:~/coderepo$
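
The effect of --ignore-date can be seen by applying the same patch twice, once with and once without the option. This is only a sketch under stated assumptions: the repository and identity are throwaway placeholders, the branch name fresh-dates is hypothetical, and the original commit is given a fixed author date in 2011 so that the difference is visible. The date that git am preserves, and that git log displays by default, is the author date.

```shell
#!/bin/sh
# Sketch only: a throwaway repository with a placeholder identity.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.invalid"

echo "These changes are in the origin" > newfile3
git add newfile3
git commit -q -m "Start new dev"
base=$(git rev-parse master)

# Commit on develop with a fixed author date (a Unix timestamp plus a
# timezone offset) so it is clearly older than "now".
git checkout -q -b develop
echo "new dev work" >> newfile3
GIT_AUTHOR_DATE="1310061719 +0100" git commit -q -a -m "Some new dev work"

git format-patch -q -o patches master..develop

# Plain git am keeps the author date recorded in the patch.
git checkout -q master
git am -q patches/*.patch
kept=$(git log -1 --format=%at)

# With --ignore-date the author date becomes the time of applying.
git checkout -q -b fresh-dates "$base"    # hypothetical branch name
git am -q --ignore-date patches/*.patch
refreshed=$(git log -1 --format=%at)

echo "author timestamp kept: $kept, with --ignore-date: $refreshed"
```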

Interestingly, if we use our alias for the log command, we see something that is maybe a little unexpected.
john@satsuki:~/coderepo$ git logg -n6
* 30900fe (HEAD, master) More new deving
* a8281fb Some new dev work
| * aed985c (develop) More new deving
| * af3c6d7 Some new dev work
|/
* f8d5100 Finished new dev
* 1968324 Start new dev
john@satsuki:~/coderepo$

Notice that the master branch has not simply been fast-forwarded to the tip of develop. This is because we have not performed a merge; in a sense we have manually made the changes to the files and created separate commits for them. As a result, the commits 30900fe and a8281fb are not the same as their develop counterparts, even though they introduce the same changes.
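
One way to confirm that two differently named commits carry the same change is the git cherry command, which compares commits by the content of their patches rather than by their hashes. The sketch below is an illustration only: the repository and identity are placeholders, and the develop commit is given fixed dates so that the commit created by git am is guaranteed to have a different hash.

```shell
#!/bin/sh
# Sketch only: a throwaway repository with a placeholder identity.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.invalid"

echo "These changes are in the origin" > newfile3
git add newfile3
git commit -q -m "Start new dev"

# Fixed author and committer dates keep this commit distinct from the
# one git am creates later, whose committer date is the current time.
git checkout -q -b develop
echo "new dev work" >> newfile3
GIT_AUTHOR_DATE="1310061719 +0100" GIT_COMMITTER_DATE="1310061719 +0100" \
    git commit -q -a -m "Some new dev work"

git format-patch -q -o patches master..develop
git checkout -q master
git am -q patches/*.patch

master_tip=$(git rev-parse master)
develop_tip=$(git rev-parse develop)

# git cherry lists the develop commits missing from master and prefixes
# a commit with "-" when master already contains an equivalent change.
git cherry -v master develop
cherry_marks=$(git cherry master develop | cut -c1)
```

Here the two branch tips have different hashes, yet git cherry marks the develop commit with a minus sign because master already holds an equivalent change.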

If you intend to use this workflow, it is worth spending some time reading the man pages for git am and git format-patch, as both hold valuable information regarding the customisation and handling of patches and emails. Tamagoyaki Inc. are not going to use this workflow often, so applying a few patches here and there from contractors using these methods is perfectly acceptable to them. If you were a large open source establishment, or any company that accepts a large number of patches, you may want to take a closer look at how to work with these. Now it is time to move on to some more advanced topics within Git, but first a little cleanup.
john@satsuki:~/coderepo$ rm 0001-Some-new-dev-work.patch
john@satsuki:~/coderepo$ rm 0002-More-new-deving.patch
john@satsuki:~/coderepo$
