git ready

learn git one commit at a time
by Nick Quaranto

pushing and pulling

committed 21 Jan 2009

Today we’re going to review another basic yet powerful concept that Git among other version control systems of its type has: distribution! As you may know, your commits are all local, and repositories are simply clones of each other. That means the real work in distributing your projects is in synchronizing the changes via git push and git pull.

If you’re new to Git, you may think that this is too much overhead and one that leads to a breakdown of control. Look at it this way: if your central server goes down, you’re usually hosed and prevented from working and collaborating with others. Since all of the work involved with actually creating revisions is done on your own machine, you can code whether the network is down without the permission of others or being subject to network issues. Did I mention it’s a lot faster too for most routine operations? Check out some more of advantages (and disadvantages) of DVCS at Wikipedia.

Oliver Steele has whipped up a great image of how pushing and pulling works:

The topmost part of the diagram is easy, that’s committing your changes to the staging area. Once that’s done, your version control is essentially in place, but now we want to synchronize it with our remote repository. This could be on a hosting site like GitHub, your work’s main git server, a production machine, or even another coworker’s repository. Once the changes have been delivered, others can pull them down and work with them. With most actions in Git there’s a few options, but let’s stick with pull for today.

Let’s go through a real world example. I’ve just added the Twitter link at the bottom of this blog. I want to push my changes up to the project’s GitHub repository so I can update the server. The syntax of this command is usually git push <remote> <branch>. A remote is basically the address of a cloned repository. It’s got pretty much all of the same data and history, it’s just waiting to get updated. Usually most projects work with an origin remote, and the master branch by default.

$ git commit -am "Adding twitter link"
  Created commit f2cd831: Adding twitter link
  1 files changed, 1 insertions(+), 0 deletions(-)

$ git push origin master
  Counting objects: 7, done.
  Compressing objects: 100% (4/4), done.
  Writing objects: 100% (4/4), 407 bytes, done.
  Total 4 (delta 2), reused 0 (delta 0)
     361303d..f2cd831  master -> master

Now our remote repository is updated, and if you visit GitHub you’d be able to see the commits. The output this command prints out has to deal with how Git transfers over the various blobs of data that have changed, and then lets us know that the repository has been updated by showing the changed SHA1.

But what about pull? We can use that to update any other repositories with the commits I just synchronized. The project’s crappy deploy script does exactly that:

$ git pull origin master
  remote: Counting objects: 7, done.        
  remote: Compressing objects: 100% (4/4), done.        
  remote: Total 4 (delta 2), reused 0 (delta 0)        
  Updating 361303d..f2cd831
  Fast forward
   _layouts/default.html |    1 +
    1 files changed, 1 insertions(+), 0 deletions(-)

The output shows us that the repository living on the server has some changes to bring in from remote repository, and it proceeds to bring them in. This process would be exactly the same if you wanted to bring in changes that another person committed to your repository. Those who have forked the gitready repository on GitHub can also now get the changes I’ve made. Future tutorials will go over that process more in depth.

If you can think of any other ideas or concepts that were missed (we’ll save refspecs for later tutorials), let us know.