git: fetch and merge, don’t pull

This is too long and rambling, but to steal a joke from Blaise Pascal I haven’t had time to make it shorter yet.  There is some discussion of this post on the git mailing list, but much of it is tangential to the points I’m trying to make here.

One of the git tips that I find myself frequently passing on to people is:

Don’t use git pull, use git fetch and then git merge.

The problem with git pull is that it has all kinds of helpful magic that means you don’t really have to learn about the different types of branch in git. Mostly things Just Work, but when they don’t it’s often difficult to work out why. What seem like obvious bits of syntax for git pull may have rather surprising results, as even a cursory look through the manual page should convince you.

The other problem is that by both fetching and merging in one command, your working directory is updated without giving you a chance to examine the changes you’ve just brought into your repository. Of course, unless you turn off all the safety checks, the effects of a git pull on your working directory are never going to be catastrophic, but you might prefer to do things more slowly so you don’t have to backtrack.

Branches

Before I explain the advice about git pull any further it’s worth clarifying what a branch is. Branches are often described as being a “line of development”, but I think that’s an unfortunate expression since:

  • If anything, a branch is a “directed acyclic graph of development” rather than a line.
  • It suggests that branches are quite heavyweight objects.

I would suggest that you think of branches in terms of what defines them: they’re a name for a particular commit and all the commits that are ancestors of it, so each branch is completely defined by the SHA1sum of the commit at the tip. This means that manipulating them is a very lightweight operation – you just change that value.
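To make that concrete, here is a small sketch you can run in a scratch directory. The repository, the throwaway identity and the commit messages are all made up for illustration; the point is only that a branch name resolves to a single commit ID, and that committing or creating a branch just changes or copies that value.

```shell
set -e
cd "$(mktemp -d)"                                # scratch directory, safe to delete
git init -q .
git symbolic-ref HEAD refs/heads/stable          # call the first branch "stable"
# Throwaway identity so the commits below work on any machine (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git commit -q --allow-empty -m "A"
tip_before=$(git rev-parse stable)               # the commit "stable" names now
git commit -q --allow-empty -m "C"
tip_after=$(git rev-parse stable)                # committing moved the tip

git branch new-idea                              # a second name for the same commit
other=$(git rev-parse new-idea)

echo "stable was:   $tip_before"
echo "stable is:    $tip_after"
echo "new-idea is:  $other"
```

The last two lines printed should show the same commit ID, since creating “new-idea” did nothing more than record the current tip under another name.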

This definition has some perhaps unexpected implications. For example, suppose you have two branches, “stable” and “new-idea”, whose tips are at revisions E and F:

  A-----C----E ("stable")
   \
    B-----D-----F ("new-idea")

So the commits A, C and E are on “stable” and A, B, D and F are on “new-idea”. If you then merge “new-idea” into “stable” with the following commands:

    git checkout stable   # Change to work on the branch "stable"
    git merge new-idea    # Merge in "new-idea"

… then you have the following:

  A-----C----E----G ("stable")
   \             /
    B-----D-----F ("new-idea")

If you carry on committing on “new-idea” and on “stable”, you get:

  A-----C----E----G---H ("stable")
   \             /
    B-----D-----F----I ("new-idea")

So now A, B, C, D, E, F, G and H are on “stable”, while A, B, D, F and I are on “new-idea”.
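If you want to check this for yourself, the following sketch rebuilds the same graph using empty commits whose messages are the letters in the diagram, then lists what is reachable from each branch. The helper c and the throwaway identity are my own inventions for the example, not anything git provides.

```shell
set -e
cd "$(mktemp -d)"                                # scratch directory
git init -q .
git symbolic-ref HEAD refs/heads/stable
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical helper: make an empty commit whose message is the given letter.
c() { git commit -q --allow-empty -m "$1"; }

c A
git branch new-idea                              # both branches start at A
c C; c E                                         # stable:   A-C-E
git checkout -q new-idea
c B; c D; c F                                    # new-idea: A-B-D-F
git checkout -q stable
git merge -q -m G new-idea                       # merge commit G, on stable only
c H
git checkout -q new-idea
c I

# Everything reachable from each tip, as a sorted string of letters.
on_stable=$(git log --format=%s stable | sort | tr -d '\n')
on_new_idea=$(git log --format=%s new-idea | sort | tr -d '\n')
echo "stable:   $on_stable"
echo "new-idea: $on_new_idea"
```

This should print ABCDEFGH for “stable” and ABDFI for “new-idea”, matching the lists above: the merge made F (and so B and D) an ancestor of “stable”, but nothing on “new-idea” changed.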

Branches do have some special properties, of course – the most important of these is that if you’re working on a branch and create a new commit, the branch tip will be advanced to that new commit. Hopefully this is what you’d expect. When merging with git merge, you only specify the branch you want to merge into the current one, and only your current branch advances.

Another common situation where this view of branches helps a lot is the following: suppose you’re working on the main branch of a project (called “master”, say) and realise later that what you’ve been doing might have been a bad idea, and you would rather it were on a topic branch. If the commit graph looks like this:

   last version from another repository
      |
      v
  M---N-----O----P---Q ("master")

Then you separate out your work with the following set of commands (where the diagrams show how the state has changed after them):

  git branch dubious-experiment

  M---N-----O----P---Q ("master" and "dubious-experiment")

  git checkout master

  # Be careful with this next command: make sure "git status" is
  # clean, you're definitely on "master" and the
  # "dubious-experiment" branch has the commits you were working
  # on first...

  git reset --hard <SHA1sum of commit N>

       ("master")
  M---N-------------O----P---Q ("dubious-experiment")

  git pull # Or something that updates "master" from
           # somewhere else...

  M--N----R---S ("master")
      \
       O---P---Q ("dubious-experiment")

This is something I seem to end up doing a lot… :)
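Here is the same rescue as a runnable sketch, stopping before the final update from elsewhere. It builds a toy history M to Q out of empty commits and assumes that N is three commits behind the tip; in a real repository you would look up the right commit with git log rather than count.

```shell
set -e
cd "$(mktemp -d)"                                # scratch directory
git init -q .
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

for m in M N O P Q; do git commit -q --allow-empty -m "$m"; done
last_upstream=$(git rev-parse 'master~3')        # commit N in this toy history

git branch dubious-experiment                    # keeps O, P and Q reachable
[ -z "$(git status --porcelain)" ]               # stop here unless the tree is clean
git reset -q --hard "$last_upstream"             # move "master" back to N

echo "master:             $(git log --format=%s master | tr '\n' ' ')"
echo "dubious-experiment: $(git log --format=%s dubious-experiment | tr '\n' ' ')"
```

Nothing is lost by the reset, because “dubious-experiment” still names Q and therefore everything before it.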

Types of Branches

The terminology for branches gets pretty confusing, unfortunately, since it has changed over the course of git’s development. I’m going to try to convince you that there are really only two types of branches. These are:

(a) “Local branches”: what you see when you type git branch, e.g. to use an abbreviated example I have here:

       $ git branch
         debian
         server
       * master

(b) “Remote-tracking branches”: what you see when you type git branch -r, e.g.:

       $ git branch -r
       cognac/master
       fruitfly/server
       origin/albert
       origin/ant
       origin/contrib
       origin/cross-compile

The names of remote-tracking branches are made up of the name of a “remote” (e.g. origin, cognac, fruitfly) followed by “/” and then the name of a branch in that remote repository. (“Remotes” are just nicknames for other repositories, synonymous with a URL or the path of a local directory – you can set up extra remotes yourself with “git remote”, but “git clone” by default sets up “origin” for you.)

If you’re interested in how these branches are stored locally, look at the files in:

  • .git/refs/heads/ [for local branches]
  • .git/refs/remotes/ [for tracking branches]

Both types of branches are very similar in some respects – they’re all just stored locally as single SHA1 sums representing a commit. (I emphasize “locally” since some people see “origin/master” and assume that in some sense this branch is incomplete without access to the remote server – that isn’t the case.)
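A quick way to see both kinds of branch side by side is git for-each-ref, which prints each ref along with the commit it points at. The sketch below sets up a toy “upstream” repository and a clone of it (both names are arbitrary) and lists everything under refs/heads and refs/remotes in the clone.

```shell
set -e
work=$(mktemp -d); cd "$work"                    # scratch directory
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q upstream                             # stands in for the remote repository
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream commit -q --allow-empty -m "first"
git -C upstream branch server

git clone -q upstream clone
cd clone
# One line per ref: abbreviated commit ID, then the full ref name.
git for-each-ref --format='%(objectname:short) %(refname)' refs/heads refs/remotes
```

You should see refs/heads/master alongside refs/remotes/origin/master and refs/remotes/origin/server, all resolved from data held in the clone itself, without contacting the other repository.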

Despite this similarity there is one particularly important difference:

  • The safe ways to change remote-tracking branches are with git fetch or as a side-effect of git push; you can’t work on remote-tracking branches directly. In contrast, you can always switch to local branches and create new commits to move the tip of the branch forward.

So what you mostly do with remote-tracking branches is one of the following:

  • Update them with git fetch
  • Merge from them into your current branch
  • Create new local branches based on them

Creating local branches based on remote-tracking branches

If you want to create a local branch based on a remote-tracking branch (i.e. in order to actually work on it) you can do that with git branch --track or git checkout --track -b, which is similar but also switches your working tree to the newly created local branch. For example, if you see in git branch -r that there’s a remote-tracking branch called origin/refactored that you want, you would use the command:

    git checkout --track -b refactored origin/refactored

In this example “refactored” is the name of the new branch and “origin/refactored” is the name of the existing remote-tracking branch to base it on. (In recent versions of git the “--track” option is actually unnecessary since it’s implied when the final parameter is a remote-tracking branch, as in this example.)

The “--track” option sets up some configuration variables that associate the local branch with the remote-tracking branch. These are useful chiefly for two things:

  • They allow git pull to know what to merge after fetching new remote-tracking branches.
  • If you do git checkout to a local branch which has been set up in this way, it will give you a helpful message such as:
    Your branch and the tracked remote branch 'origin/master'
    have diverged, and respectively have 3 and 384 different
    commit(s) each.

… or:

    Your branch is behind the tracked remote branch
    'origin/master' by 3 commits, and can be fast-forwarded.

The configuration variables that allow this are called “branch.<local-branch-name>.merge” and “branch.<local-branch-name>.remote”, but you probably don’t need to worry about them.
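If you’re curious, you can read those two variables back with git config. This sketch uses a toy “upstream” repository with a branch called “refactored” (matching the example above) and a fresh clone of it.

```shell
set -e
work=$(mktemp -d); cd "$work"                    # scratch directory
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q upstream                             # stands in for the remote repository
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream commit -q --allow-empty -m "first"
git -C upstream branch refactored

git clone -q upstream clone
cd clone
git checkout -q --track -b refactored origin/refactored

remote=$(git config branch.refactored.remote)    # which remote to fetch from
merge=$(git config branch.refactored.merge)      # which branch there to merge
echo "branch.refactored.remote = $remote"
echo "branch.refactored.merge  = $merge"
```

This prints origin and refs/heads/refactored. Note that the merge value is the branch’s name in the remote repository, not the name of the remote-tracking branch.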

You have probably noticed that after cloning from an established remote repository git branch -r lists many remote-tracking branches, but you only have one local branch. In that case, a variation of the command above is what you need to set up local branches that track those remote-tracking branches.
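If there are a lot of them, you could loop over the remote-tracking branches instead of typing the command once per branch. What follows is only a sketch under some assumptions: the remote is called origin, the toy branch names are borrowed from the listing earlier, and any local branch that already exists is left alone.

```shell
set -e
work=$(mktemp -d); cd "$work"                    # scratch directory
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q upstream                             # stands in for the remote repository
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream commit -q --allow-empty -m "first"
git -C upstream branch albert
git -C upstream branch contrib

git clone -q upstream clone
cd clone

prefix=refs/remotes/origin/
for ref in $(git for-each-ref --format='%(refname)' refs/remotes/origin); do
  name=${ref#"$prefix"}                          # e.g. "albert"
  if [ "$name" = HEAD ]; then continue; fi       # origin/HEAD is only a pointer
  if git show-ref --verify --quiet "refs/heads/$name"; then
    continue                                     # local branch already exists
  fi
  git branch -q --track "$name" "origin/$name"
done

git branch                                       # now lists albert, contrib and master
```

Using git branch rather than git checkout here means the working tree is never switched, so the loop is safe to run with uncommitted work lying around.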

You might care to note some confusing terminology here: the word “track” in “--track” means tracking of a remote-tracking branch by a local branch, whereas in “remote-tracking branch” it means the tracking of a branch in a remote repository by the remote-tracking branch. Somewhat confusing…

Now, let’s look at an example of how to update from a remote repository, and then how to push changes to a new repository.

Updating from a Remote Repository

So, if I want to get changes from the remote repository called “origin” into my local repository, I’ll type git fetch origin and see some output like this:

  remote: Counting objects: 382, done.
  remote: Compressing objects: 100% (203/203), done.
  remote: Total 278 (delta 177), reused 103 (delta 59)
  Receiving objects: 100% (278/278), 4.89 MiB | 539 KiB/s, done.
  Resolving deltas: 100% (177/177), completed with 40 local objects.
  From ssh://longair@pacific.mpi-cbg.de/srv/git/fiji
     3036acc..9eb5e40  debian-release-20081030 -> origin/debian-release-20081030
   * [new branch]      debian-release-20081112 -> origin/debian-release-20081112
   * [new branch]      debian-release-20081112.1 -> origin/debian-release-20081112.1
     3d619e7..6260626  master     -> origin/master

The most important bits here are the lines like these:

     3036acc..9eb5e40  debian-release-20081030 -> origin/debian-release-20081030
   * [new branch]      debian-release-20081112 -> origin/debian-release-20081112

The first line of these two shows that your remote-tracking branch origin/debian-release-20081030 has been advanced from the commit 3036acc to 9eb5e40. The bit before the arrow is the name of the branch in the remote repository. The second line similarly shows that since we last did this, a new remote-tracking branch has been created. (git fetch may also fetch new tags if they have appeared in the remote repository.)

The lines before those are git fetch working out exactly which objects it will need to download to our local repository’s pool of objects, so that they will be available locally for anything we want to do with these updated branches and tags.

git fetch doesn’t touch your working tree at all, so it gives you a little breathing space to decide what you want to do next. To actually bring the changes from the remote branch into your working tree, you have to do a git merge. So, for instance, if I’m working on “master” (after a git checkout master) then I can merge in the changes that we’ve just got from origin with:

    git merge origin/master

(This might be a fast-forward, if you haven’t created any new commits that aren’t on master in the remote repository, or it might be a more complicated merge.)

If instead you just wanted to see what the differences are between your branch and the remote one, you could do that with:

    git diff master origin/master

This is the nice point about fetching and merging separately: it gives you the chance to examine what you’ve fetched before deciding what to do next. Also, by doing this separately the distinction between when you should use a local branch name and a remote-tracking branch name becomes clear very quickly.
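Put together, the whole cycle might look like the sketch below. It fakes a remote with a second local repository called “upstream” (my name for it, purely for the example), adds a commit there after cloning, and then fetches, inspects and merges from the clone. I’ve used git log master..origin/master to list the incoming commits; git diff as shown above would do just as well.

```shell
set -e
work=$(mktemp -d); cd "$work"                    # scratch directory
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q upstream                             # stands in for the remote repository
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream commit -q --allow-empty -m "one"
git clone -q upstream clone
git -C upstream commit -q --allow-empty -m "two" # new work appears upstream

cd clone
git fetch -q origin                              # moves origin/master, nothing else
incoming=$(git log --format=%s master..origin/master)
echo "incoming commits: $incoming"               # look before you merge

git merge -q origin/master                       # a fast-forward in this case
echo "master is now at: $(git log -1 --format=%s master)"
```

Between the fetch and the merge, your local “master” and your working tree are exactly as you left them, which is the breathing space I was talking about.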

Pushing your changes to a remote repository

How about the other way round? Suppose you’ve made some changes to the branch “experimental” and want to push that to a remote repository called “origin”. This should be as simple as:

    git push origin experimental

You might get an error saying that the remote repository can’t fast-forward the branch, which probably means that someone else has pushed different changes to that branch. In that case you’ll need to fetch and merge their changes before trying the push again.
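Here is a sketch of that situation from start to finish. It uses a bare repository “shared.git” and two clones, “mine” and “theirs”; all three names are invented for the example, and the commits are empty so the merge can’t conflict.

```shell
set -e
work=$(mktemp -d); cd "$work"                    # scratch directory
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q seed                                 # only used to populate shared.git
git -C seed symbolic-ref HEAD refs/heads/experimental
git -C seed commit -q --allow-empty -m "base"
git clone -q --bare seed shared.git              # plays the part of "origin"
git clone -q shared.git mine
git clone -q shared.git theirs

git -C theirs commit -q --allow-empty -m "their change"
git -C theirs push -q origin experimental        # someone else gets there first

git -C mine commit -q --allow-empty -m "my change"
if ! git -C mine push -q origin experimental 2>/dev/null; then
  echo "push rejected: not a fast-forward"
  git -C mine fetch -q origin                    # bring in their commit
  git -C mine merge -q -m "merge their change" origin/experimental
  git -C mine push -q origin experimental        # now it is a fast-forward
fi
echo "commits on shared experimental: $(git -C shared.git rev-list --count experimental)"
```

After the second push the shared branch holds four commits: the base, their change, my change and the merge.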

Aside

If the branch has a different name in the remote repository (“experiment-by-bob”, say) you’d do this with:

      git push origin experimental:experiment-by-bob

On older versions of git, if “experiment-by-bob” doesn’t already exist, the syntax needs to be:

      git push origin experimental:refs/heads/experiment-by-bob

… to create the remote branch.  However that seems to be no longer the case, at least in git version 1.6.1.2 – see Sitaram’s comment below.

If the branch name is the same locally and remotely then it will be created automatically without you having to use any special syntax, i.e. you can just do git push origin experimental as normal.

In practice, however, it’s less confusing if you keep the branch names the same. (The <source-name>:<destination-name> syntax there is known as a “refspec”, about which we’ll say no more here.)

An important point here is that a successful git push to a named remote also updates the corresponding remote-tracking branch (origin/experimental in this case), so you don’t have to wait for the next git fetch for it to move. (An earlier version of this post claimed the opposite; thanks to Deskin Miller for pointing out the error in the comments below.)
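You can watch the remote-tracking branch move on a push with a sketch like this one, which again uses a made-up bare repository “shared.git” as origin and a clone called “mine”.

```shell
set -e
work=$(mktemp -d); cd "$work"                    # scratch directory
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q seed                                 # only used to populate shared.git
git -C seed symbolic-ref HEAD refs/heads/experimental
git -C seed commit -q --allow-empty -m "base"
git clone -q --bare seed shared.git              # plays the part of "origin"
git clone -q shared.git mine
cd mine

before=$(git rev-parse origin/experimental)
git commit -q --allow-empty -m "local work"
git push -q origin experimental                  # no fetch anywhere in this script
after=$(git rev-parse origin/experimental)

echo "origin/experimental before push: $before"
echo "origin/experimental after push:  $after"
echo "local experimental:              $(git rev-parse experimental)"
```

The “after” value should match the local branch tip even though git fetch was never run.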

Why not git pull?

Well, git pull is fine most of the time, and particularly if you’re using git in a CVS-like fashion then it’s probably what you want. However, if you want to use git in a more idiomatic way (creating lots of topic branches, rewriting local history whenever you feel like it, and so on) then it helps a lot to get used to doing git fetch and git merge separately.



Comments

138 responses to “git: fetch and merge, don’t pull”

  1. Deskin Miller

    “An important point here is that this git push doesn’t involve the remote-tracking branch origin/experimental at all – it will only be updated the next time you do git fetch”

    This is untrue: when pushing to a named remote, if the remote branch you’re pushing to is one which would be tracked according to your settings for that remote, your remote-tracking branch will be updated appropriately. This is the case even when pushing multiple branches.

    Excellent post overall!

    1. mark

      Deskin: thanks, and thank-you for catching that error; I’ve corrected the appropriate bits.

  2. Sitaram

    inside the little breakout box starting with “aside…”, it says:

    … or if “experiment-by-bob” doesn’t already exist, the syntax needs to be:

    git push origin experimental:refs/heads/experiment-by-bob

    I have just tested it and it is not necessary; I can make a brand new branch and push it to a *different* new name to origin without needing that “refs/heads” part.

    This is on Git 1.6.2; I did not check older versions but I do not remember ever having to do this.

    1. mark

      Sitaram: thanks for bringing that to my attention. I’ve just tested on one of the systems I use which still has git 1.5.3.5 and it does give an error in that case:

      $ git push origin topic:topic-new
      error: dst refspec topic-new does not match any existing ref on the remote and does not start with refs/.
      fatal: The remote end hung up unexpectedly
      error: failed to push to '/home/mark/tmp/foo/.git'

      … but the same test on git 1.6.1.2 works OK:

      $ git push origin topic:new-topic
      Total 0 (delta 0), reused 0 (delta 0)
      To /home/mark/tmp/quuz/
      * [new branch] topic -> new-topic

      At least I wasn’t just making it up :) I’ll update the post with a note that that only applies to older versions.

  3. alexandrul

    In the first place, thank you for the article.

    As a side question, after a fetch, how can I merge all tracked remote branches with the local ones in a single command? (besides creating a script to merge each branch in turn)

    1. mark

      I don’t think there’s a single command that will do that. You’d have to be quite careful about writing such a script, since you would need to safely checkout each branch (i.e. check that the working tree is clean before you switch) and check that the merge will be a fast-forward.

      I can’t really imagine wanting to do that, myself, because in practice the warning you get on checking out a branch (i.e. the one about the state of that branch with regard to the remote-tracking branch it tracks) stops me from forgetting to merge before carrying on work.

  4. chhh

    a very useful and descriptive post, thank you
    the only thing missing, in my opinion, is an example of updating a local --track branch with a remote-tracking one.

    1. mark

      Thanks for your comment, chhh. Do you mean something like the bit where I talk about doing “git merge origin/master”? I suppose it could go into more detail about the different options for doing that (e.g. “git reset --hard origin/master” to throw away your changes, “git rebase”, etc.) for various situations, but I’m a bit worried that the post is already overlong.

      1. chhh

        Well, it’s your article, so you decide.
        Usually I just scan through pages when searching for a solution, but I read this whole text.

        Yes, the “git merge origin/master” part may become confusing for many people who are only beginning to use git. In my opinion it would have been very helpful if you had mentioned ways to get out of merge conflicts (not in full-blown detail, just brief directions like git reset --merge/hard HEAD or something)

  5. Jason Wagner

    Excellent post! Thanks so much for taking the time to write this up, it has helped a lot. I’m pretty comfortable with SVN and some other proprietary SCMs, but I’m just learning Git. Some of the principles are quite tricky! :p

  6. Cyrus Master

    Fantastic article. Thanks!

  7. Ashwin Tumma

    Thank you for the post. It helps a lot ..

  8. pielgrzym

    Hi,

    thanks a lot for this in-depth yet very easy to understand explanation of git internals. Not only did it help me to fetch and merge branches, but it also enabled me to understand a lot of other git commands. I’ve never thought of git as a database containing simple graphs! This article should definitely make it into some git book – it helps to visualize git internals and thus use git more effectively. Thanks again!

  9. Ben Turner

    Just read this article, thought gosh how informative, and then laughed when I saw the author photo at the end :)

    Hey Mark, cool post ! Guess I’d better follow the RSS now I’ve found ya…

    1. admin

      Please do! You would be my third reader, as far as I know ;) I’m glad to hear the post was useful.

  10. Todd

    Really useful post, definitely helped me!

    Cheers :)

  11. Reuben

    Great article! It’s cleared up at least one thing that I’ve been wondering about for a while, and that’s merging from a remote to a local branch that you don’t have checked out.

    i.e. I’m working on a feature branch, and have discovered a change in the master that I really want included in my feature branch. What I would end up doing is

    $ git pull # from the feature branch, just to make sure I’m up to date
    $ git stash # because I probably have wip that I’m not ready to commit
    $ git checkout master # hopefully helpful message about being behind, and can fast-forward
    $ git pull # I should probably do a git merge, since I’ve already fetched from the previous pull
    $ git checkout feature
    $ git merge master # and that merges all changes from the master since the branch, into feature
    $ git stash pop # restore the wip

    For cases where it’s not convenient to merge all of master in to the feature, I’ve used cherry-pick.

    Seems a bit long winded, doesn’t it, and I’ve wondered if there’s a better way.

    For a while, I was using git pull origin/master from my feature branch, to fetch and merge those latest master changes into my feature. From this article, it might be better to use a bit more caution and git fetch origin/master, before merging with git merge origin/master.

    Does that seem like a better way of grabbing changes from a remote master into a local feature?

    1. mark

      Well, you can just do:


      git fetch origin # updates all remote-tracking branches for origin
      git checkout feature
      git merge origin/master

      (Of course, that leaves your master where it was – you could checkout master and merge origin/master into that as well, though.)

  12. Jeremy Rowe

    Great post.

  13. Arthur Khakimov

    Thanks!

  14. Dan Shumaker

    Great article. A light bulb definitely went on when I read the “remote-tracking branch name becomes clear very quickly” section. I was never sure what git considered “proper” addresses (or locations or names). The “refs/heads/master” naming convention is definitely a different way of naming locations than the “origin/branch” naming convention and confused me until I read your post. To check branch names I always did a “git branch -a” and thought that “remotes/origin/branchname” was an actual location to use on the command line rather than just “origin/branchname”.

    A point that still needs clarification for me is the definition of “track”. What exactly is tracked? Is it a connection of some sort? With all the git commands what happens automatically with this “connection” and what do I have to do manually. To me it seems a better name would be “reference” because nothing seems to happen automatically unless you do a “git pull”. I guess “tracking” implies (to me) an automatic updated pointer or something. I’m guessing the tracked thing is just the parent lineage (SHA1 sum) number?

    It seems from a branch maintenance point of view that the “tracking” feature of git falls short. I still find myself in deep development trees trying to find out who’s the ancestor of who and who is behind who. I’ve seen some gui tools that help visualize it but not that well (Tower).

    Thanks again for helping me see the light.

    1. mark

      The two different meanings of “track” in git are, indeed, rather confusing. I tried to address that in this Stack Overflow answer:

      http://stackoverflow.com/questions/6631337/are-there-different-meanings-to-the-concept-of-tracking-in-git/6631524#6631524

      In the “remote-tracking branch” sense, the “tracking” is defined in the configuration variables that define the remote. For example, in a clone of git’s source code that I have, the following configuration defines the remote:


      remote.origin.url=git://git.kernel.org/pub/scm/git/git.git
      remote.origin.fetch=+refs/heads/*:refs/remotes/origin/*

      The first option defines the URL of the repository that the remote refers to, and the second describes how the branch names are mapped when fetching from that remote – in this case, take all the branches under refs/heads in the remote repository, and update the remote-tracking branch with the same name under refs/remotes/origin/

  15. Chris Mear

    Thanks for this post, it’s a very useful resource. When I teach people Git, this is one of the things I explain in some detail, much as you’ve done here. It helps so much not only with their ability to pull safely, but also with equipping them with the understanding to answer their own Git questions in the future.

  16. Andrew Velis

    Thank you for your post. Being fairly new to git, is there a tutorial or book to help demonstrate the concepts made in this article?

    1. mark

      Personally, I like the book Pro Git by Scott Chacon, which is also readable online. There are plenty of good introductions to git concepts on the web – I particularly like git for computer scientists, but obviously that’s got a particular target audience. I tried to write a brief tutorial myself.

  17. Nathan B

    This is a great article, but I do agree that ‘git pull’ is perfectly fine most of the time.
    I can see the benefit of ‘git fetch’ and ‘git merge’ when you’re working on branches for debian or the linux kernel, but I think it would hardly ever be necessary for projects with only a few contributors.

  18. Jonathan

    The updating local from remote part of the article, though simple enough to understand, is where I’m having difficulty. After doing the fetch, git diff shows a difference, but the git merge reports ‘already up to date’. Must be something simple, am a beginner with git.

    git fetch doesn’t touch your working tree at all, so gives you a little breathing space to decide what you want to do next. To actually bring the changes from the remote branch into your working tree, you have to do a git merge. So, for instance, if I’m working on “master” (after a git checkout master) then I can merge in the changes that we’ve just got from origin with:

    git merge origin/master

    (This might be a fast-forward, if you haven’t created any new commits that aren’t on master in the remote repository, or it might be a more complicated merge.)

    If instead you just wanted to see what the differences are between your branch and the remote one, you could do that with:

    git diff master origin/master

    1. mark

      Hi Jonathan: if git is reporting “already up to date” when you try to merge origin/master into your master branch, that means that you already have the complete history of origin/master included in your master branch. If git diff master origin/master then shows a difference between the two, that means that your master branch is ahead of origin/master.

  19. Jonathan

    Thanks for the quick response Mark. From your response I see exactly what is going on; I do indeed have master ahead of origin/master and so I may have misunderstood the usage of merge. In trying to learn, I had created a test whereby I was trying to ‘undo’ a local commit by fetching/merging from origin/master but I suspect this probably isn’t an intended behaviour: it must suggest to use git reset or similar on the master. I read half the Loeliger book without understanding it correctly. Thanks again!

  20. umanko

    excellent article thanks for creating it !!

  21. Martin Keegan

    Mark,

    very useful article; got me unblocked. Delighted to see it was you who wrote it!

    1. mark

      Hi Martin, I’m glad to hear the post was of use — you might find some of the other things I’ve written here that are in the git category useful as well. I hope you’re well.

  22. Ravi

    Got my fundas straight. Great article.

  23. Al Brown

    I have this annoying habit of looking up words I don’t understand and ‘acylic’ is one of those. But can’t find it anywhere. Is ‘acyclic’ the word intended?

    1. mark

      Yes, acyclic as in “has no cycles”. In other words, if you start at any particular commit, you can never get back to the one you were at originally by following links to parent commits.

  24. Larry Siden

    Very nice article! Thank you!

  25. Richard Sheppard

    Thanks Mark – this topic has confused me for many months. I knew that fetch-merge was a safer option than pull from many other posts I’d read, but I now have an understanding why.

    My friend Randy Fay also has a good article, which explains what can go wrong with multiple people doing multiple commits, where everyone is doing pull and push. He has a newer posting to help if you forget to do fetch-merge or insist on always pulling: .

    Keep up the good work!

    1. mark

      (I’ve edited your original comment and removed the two followups, which I trust is what you’d want.)

      Thanks for the comments. Personally, I’ve always avoided the automatic rebase feature of git, since I like to decide just before pushing whether to rebase or keep any merge commits there – there’s nothing to stop one from pulling normally and rebasing later, after all.

  26. Peter StJ

    Good to know, I can finally understand how to use a GUI to git.

  27. 16aR

    Thank you for spelling out the differences between pull and fetch.
    But I think it’s still lacking the explanation that git fetch origin master will download the content of the master branch stored on the origin server into your LOCAL origin/master branch, thus leaving your LOCAL master branch untouched! You can then merge or rebase the LOCAL origin/master branch onto your local master branch.
    But with the examples and explanation, I figured that out. (didn’t know how to handle the git svn rebase workflow)

  28. Fred

    If I may you didn’t get it: there is a typo in the article, you wrote acylic instead of acyclic (second c is forgotten :-).
    Anyway great article!

    1. mark

      Oops – I’ve corrected that now. Thanks for pointing out that (and that I’d failed to understand the previous comment mentioning the same error…)

  29. sprei

    excellent lesson, and lessons that can be used to help my lecture today. thank you

  30. Roy Badami

    I’m still just reading about git – and I’m not sure I’ve got my head around its terminology yet which (I get the impression) is deliberately and self-indulgently incompatible with the terminology of other popular version control systems, so I may be completely confused, but…

    To me, a merge is when you take two different sets of edits to the same file, and try to combine the intent of the edits to produce a single resulting file. It is a process that can in some cases be performed automatically, and in other cases is impossible without (sometimes significant) manual work.

    Are you saying that a “git pull” (unlike an “hg pull”) implicitly performs a merge (in the above sense) without it being explicitly requested? That sounds… surprising… Either I don’t get git yet, or I don’t get git yet. (Or maybe both. Which is fine. It took me quite a while to even start to get Hg, too, and I’m still only in the “reading about” phase with git.)

    roy

    1. mark

      Hi Roy – it’s nice to hear from you. You’re quite right about the terminology – I’ve got a blog post in mind about the most confusing terms in git (e.g. the three different senses of “track” – aargh!)

      I hadn’t realized this one before, not being familiar with Mercurial, but, indeed, it seems that “hg pull” is closest to “git fetch”, while “hg pull -u” (or “hg fetch” – aargh again!) is closest to “git pull”.

      With regard to explicit / implicit, I dunno. I’d probably argue that using “git pull” is to explicitly request a merge as well as a fetch, since the summary in its man page says “Fetch from and merge with another repository or a local branch” (my emphasis). However, part of the point of this blog post is that a lot of newcomers to git don’t realize that you can do “git fetch” instead of “git pull” to avoid the automatic merge.

      The thing that I might change in your definition of a merge to make it more “git worldview” is that (usually) merges in git are between two commits (and thus the state of two complete source code trees) rather than a file-by-file operation. (I say “usually” because you can merge between more than two commits (octopus merges), there are subtree merges, etc. etc.)

      1. Mukundan

        Hey Mark,
        I read your article and it had just the piece of info I’m looking for, I hope :). So I just thought of confirming this with you.
        Here is my issue:
        I forgot to do a git pull before I started working on my local copy. By the end of the day I had made a lot of changes, and when I tried to commit and push them, it didn’t work. It was then that I realised a couple of changes had been made in the online repository by my colleagues. Since I’m new to GitHub I found Git Bash scary. Could you tell me how I could decide, through git gui, which files from which source should be retained and which should be overwritten? Please help me buddy :)

        1. mark

          Hi Mukundan. I’m afraid this isn’t a good place to ask general git questions of that kind. I would suggest that you go to Stack Overflow http://stackoverflow.com and write a question that phrases your problem as clearly as possible. I think you’ll find that if your question about git is clear then you’ll get an answer very fast.

  31. Roy Badami

    Oh, Hi Mark, didn’t realise it was you!

    I’m probably being unfair in my complaint about implicit merges – all source code control systems implicitly merge into the working copy all the time.

    But in source code control systems I have some familiarity with (SVN, a little Hg) the only merges that I can think of that happen without typing the command ‘merge’ happen when there are uncommitted changes in the working copy, and an operation is performed that updates the working copy (and hence has to merge those changes in). And AIUI that’s not what we’re talking about here.

    There’s also something odd going on with command naming – why is it that a pull is not the opposite of a push? To have push and fetch be the actual direct counterparts, with pull being something different, is… unnatural to say the least…

    roy

  32. Roy Badami

    Oh, and I don’t think (although I haven’t used it) that “hg pull -u” will ever perform a merge unless there are uncommitted changes in the working directory – everything I can see just says it’s equivalent to “hg pull; hg update”, which will just pull changes in and then update the working copy to the latest revision in the current branch.

    But whoa!, I didn’t know about hg fetch. Talk about implicit, not only does it implicitly merge, it even implicitly commits the results of the merge. (Ok, I imagine you’ll get prompted for a commit message, so you’re unlikely to do so by accident, but still… IMHO something as wacky as fetch should be an Hg extension so you have to explicitly enable it….)

    roy

  33. Bill Rosner

    Great article, TYVM!

  34. pleintious

    That is too out.

  35. jeyanth kumar

    heya mark… Awesome post.. :) thanks for the info will read rest of your git posts..:)

  36. Monica Anderson

    After a couple years of working with git this still taught me a lot of really important stuff – an excellent post at the “beyond-the-basics” level of git use. Probably the best one on the internet at that level. Thank you. And no, it’s not too long. Feel free to write a sequel. :-)

  37. Tyler Gillispie

    Great article. I’m new to Git. This has helped me clarify some steps required for properly merging between origin and local branches and the difference between the branches. Props.

  38. Marnen Laibow-Koser

    You’ve given a good explanation of how Git works, but your conclusion is bizarre, to say the least. I don’t see why I’d ever fetch code from my current remote if it’s not immediately going into my working tree. There’s simply no scenario that I can envision where that makes sense; can you envision one?

    To put the previous two sentences into Git, I just said that I’ll never run git fetch without git merge being the immediate next command issued. Therefore, it doesn’t make sense for me *not* to use git pull. (Yes, I’ve long known what git fetch does, but I’ve never once found it useful in 4 years of using Git regularly.) Either your workflow is radically different from mine, or your advice that fetch and merge is somehow “safer” is a simple case of not trusting your tools.

    I’m glad git fetch exists, so we have the option of using it. I just can’t see when it would ever be necessary by itself. The one scenario that comes to mind involves too little branching and all development being done on master — more a SVN-style workflow than a Git-style one, surely.

  39. Chris Anderson

    Amazing article. This really helped my team get over the hump when moving from SVN to Git. Thanks for taking the time to write it.

  40. Sumanth

    Let’s say I have done this:
    git checkout --track -b refactored origin/refactored

    In due course, origin/refactored has advanced by a few commits. Now,
    (1) what is the command to update the refs of just the origin/refactored remote branch?
    (2) what is the command to merge the updates into my local refactored branch?

    Thanks.. Your write up was short and useful!

    1. mark

      1. You would generally do “git fetch origin”, which would update all remote-tracking branches from origin, including origin/refactored. (If you really do just want to update that one remote-tracking branch, you could do “git fetch origin refactored:refs/remotes/origin/refactored”, but generally there’s no point in not updating all of them.)

      2. To then merge origin/refactored into your local branch called refactored, you would switch to that branch (“git checkout refactored”) and then do “git merge origin/refactored”.
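The two steps in the reply above can be tried end to end in a throwaway sandbox. This is only a sketch: the directory names `upstream` and `work`, the `demo` identity and the empty commits are all made up for illustration, and a local directory stands in for the real origin.

```shell
#!/bin/sh
set -e

# Throwaway sandbox; every path and name below is illustrative.
sandbox=$(mktemp -d)
cd "$sandbox"

# Commit identity for the sandbox only, so nothing depends on global config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# "upstream" plays the part of origin, with one commit on "refactored".
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/refactored
git -C upstream commit -q --allow-empty -m "first"

# The clone gets a local "refactored" branch that tracks origin/refactored.
git clone -q upstream work

# The branch on the remote side advances by one commit.
git -C upstream commit -q --allow-empty -m "second"

cd work

# Step 1: update the remote-tracking branches; local branches are untouched.
git fetch -q origin

# Optional: look at what arrived before merging it.
git log --oneline refactored..origin/refactored

# Step 2: merge origin/refactored into the local branch.
git checkout -q refactored
git merge -q origin/refactored
```

The `git log` line between the two steps is the pause that `git pull` does not give you: it lists the commits that are on origin/refactored but not yet on the local branch.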

  41. Dorian McFarland

    Brilliant article, very well explained. Thanks for taking the time to make my life easier!

  42. Ian

    So what would I need to run if another dev pushed a branch to origin/dev/new-site? I don’t see the new remote branch if I run git branch -a.

    Which of these would I run: git fetch, git fetch origin, git pull, or git remote update? What is the difference between them?

    I understand your concern with doing git pull, but would that do the same as git fetch origin meaning it updates all my branches that are tracking remotely? I only have two, develop and master.

    Also, would git remote show origin show what truly exists on the remote whereas git ls-remote is what I have locally?

    Trying to wrap my head around it. Thanks for any help!

  43. Ben Zittlau

    Good article. This clarified a few misconceptions I had around how remote branches, fetch, and pull work. Hopefully that will leave me less stupefied in the future when things don’t behave the way I thought they were going to.

  44. Dennie

    Hi Mark. Nice article. Very clear, and I learned quite a lot. Bookmarked it ;)

  45. Donna

    Dear Mark,

    Thanks for a great article. One comment you made really stuck out for me , and cleared up (I think) a major confusion I was having with git :

    “(I emphasize “locally” since some people see “origin/master” and assume that in some sense this branch is incomplete without access to the remote server – that isn’t the case.)”

    So, if I understand this, we actually have an ‘origin/master’ stored on our local machine? And so when I do a fetch, all I am doing is updating the local copy of the “remote” branch “origin/master” (which isn’t really ‘remote’, i.e. off on some distant server, but is actually a local copy of that distant server branch)? Fetch now makes a lot more sense to me!

    It would be interesting to see an article on which commands actually require an internet connection, and which ones don’t. Apparently, git fetch requires an internet connection.

    And what about gitk? If I don’t do any fetches, does ‘gitk’ go out to the internet to get updates from remote branches? Or does it only look at what is in your local copy of “origin/master” (for example), e.g. whatever is stored in the .git directory.

    1. mark

      So, if I understand this, we actually have an ‘origin/master’ stored on our local machine? And so when I do a fetch, all I am doing is updating the local copy of the “remote” branch “origin/master” (which isn’t really ‘remote’, i.e. off on some distant server, but is actually a local copy of that distant server branch)? Fetch now makes a lot more sense to me!

      That’s essentially right – if you have a look at the file .git/refs/remotes/origin/master you’ll see it just has the object name (hash) of a commit in it. “git fetch” makes sure that all the objects necessary for that commit are fetched and in .git/objects before updating that file.

      It would be interesting to see an article on which commands actually require an internet connection, and which ones don’t. Apparently, git fetch requires an internet connection.

      Of the commands that are commonly used, it’s only really “git fetch” (and thus “git pull” as well), “git push” and “git clone” (if you give it a non-local URL), but there are others that are less frequently used, like “git submodule update”, “git ls-remote”, “git remote update”, etc.

      And what about gitk? If I don’t do any fetches, does ‘gitk’ go out to the internet to get updates from remote branches? Or does it only look at what is in your local copy of “origin/master” (for example), e.g. whatever is stored in the .git directory.

      gitk doesn’t get any information from remote repositories; it only shows you what’s in your local repository, so the position of origin/master in gitk’s display will be from the last time you updated it with a fetch, pull or push.
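To see for yourself that origin/master lives in the local repository and only moves when you fetch, something like the following sandbox experiment may help. It is a sketch under assumptions: `upstream` and `work` are invented directory names, a local directory stands in for the remote server, and renaming that directory is used to imitate being offline. It reads the ref with `git rev-parse` rather than opening the file under .git/refs directly, since git may also keep refs in a packed form.

```shell
#!/bin/sh
set -e

# Throwaway sandbox; every path and name below is illustrative.
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# "upstream" stands in for the remote server.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream commit -q --allow-empty -m "first"
git clone -q upstream work

# Where the clone thinks origin/master is right now.
before=$(git -C work rev-parse origin/master)

# The remote moves on, but nothing has been fetched yet.
git -C upstream commit -q --allow-empty -m "second"
unfetched=$(git -C work rev-parse origin/master)

# Make the "remote" unreachable: reading origin/master still works,
# because it is answered entirely from the clone's own .git directory.
mv upstream upstream.offline
git -C work log --oneline origin/master
mv upstream.offline upstream

# Only a fetch talks to the remote and moves origin/master.
git -C work fetch -q origin
after=$(git -C work rev-parse origin/master)

echo "before fetch: $before"
echo "still:        $unfetched"
echo "after fetch:  $after"
```

The first two hashes printed are the same and the third differs, which matches the description above: the position of origin/master is whatever it was the last time you updated it.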

  46. Donna

    Thanks, Mark! Just a few more follow-up questions about gitk (which, I understand is less about git, but is still enormously helpful in building a mental map of the git world).

    “gitk doesn’t get any information from remote repositories; it only shows you what’s in your local repository, so the position of origin/master in gitk’s display will be from the last time you updated it with a fetch, pull or push.”

    So if I do git fetch , gitk won’t show any information about the state of that remote repository? i.e. I can’t actually see the history of that remote. I certainly do see lots of labels in gitk like “remotes/andy/master”. Here, what gitk is just showing me, then, is information about what remote ‘andy’ pushed to origin/master?

    Another general question/comment : I’ve always thought that “local” means “local to my machine”, and “remote” means on a physical remote server somewhere. But now I am getting this uneasy feeling that this is not always the intended sense in which these terms are used when describing git repositories…

    1. mark

      “gitk doesn’t get any information from remote repositories; it only shows you what’s in your local repository, so the position of origin/master in gitk’s display will be from the last time you updated it with a fetch, pull or push.”

      So if I do git fetch , gitk won’t show any information about the state of that remote repository? i.e. I can’t actually see the history of that remote. I certainly do see lots of labels in gitk like “remotes/andy/master”. Here, what gitk is just showing me, then, is information about what remote ‘andy’ pushed to origin/master?

      By “doesn’t get any information from remote repositories”, I meant “doesn’t fetch information over the internet when you run it”. It does show you the state of branches from remote repositories (as you’ve noted) from the last time you fetched from them because it’s showing remote-tracking branches like andy/master. (If you want to make sure that all of these are included, you can run gitk --all.)

      Another general question/comment : I’ve always thought that “local” means “local to my machine”, and “remote” means on a physical remote server somewhere. But now I am getting this uneasy feeling that this is not always the intended sense in which these terms are used when describing git repositories…

      Well, the “remote repository” might be another repository on the same computer, just in a different directory. You should also be aware that “a remote” (i.e. “remote” as a noun) in git terminology is also like an alias or nickname for the URL of a remote repository. For example, “origin” is a remote, as is “andy” in your example above – they might be on the same machine or a different one, depending on the URL.
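A small sandbox sketch of “a remote” as a nickname for a URL, with the remote repository sitting in another directory on the same machine. The directory names `andy-repo` and `mine` are invented for the example; the remote name `andy` is borrowed from the question above.

```shell
#!/bin/sh
set -e

# Throwaway sandbox; every path and name below is illustrative.
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Another repository on the same computer, with one commit on master.
git init -q andy-repo
git -C andy-repo symbolic-ref HEAD refs/heads/master
git -C andy-repo commit -q --allow-empty -m "andy's work"

# Your own repository, which knows nothing about andy-repo yet.
git init -q mine
cd mine

# "andy" becomes a remote: a name stored in .git/config that maps to a URL.
git remote add andy "$sandbox/andy-repo"
git config --get remote.andy.url

# Fetching creates the remote-tracking branch andy/master locally.
git fetch -q andy
git branch -r
```

Here the “remote” is a path on the local disk. With an ssh or https URL instead, the same commands would apply, and only `git fetch` would need the network.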

  47. Jon Doe

    Why do you suggest “git reset --hard” to move the master tag? As you mention, that is unsafe for changes in the working directory.
    Why not do:
    git checkout -b dubious-experiement
    git update-ref master

    This is working directory safe and gets you where you want to be.

    ps: To be fair, reset --hard seems to be a fairly common and accepted approach. I just thought I would offer what I have found to be simpler, because it is safer.

  48. Joe

    Excellent post! I wish somebody would’ve told me that the remote-tracking branches are actually stored locally. That would have helped me learn git more quickly and with less confusion.

  49. Jakub T. Jankiewicz

    Actually, the “directed acyclic graph of development” is the git repository, and a branch is a path in that graph.

    1. mark

      No, I’m afraid that’s not right – every commit that’s an ancestor of the tip of the branch is part of the branch, not just those on a single path from there. If you picked a path through the 0th parent of each commit from the branch tip, say, that would typically miss out lots of commits via the other parents that are nonetheless part of the development effort that’s furthered by that branch. (This view of what a branch consists of is backed up by tools like gitk if you click on commits and look at the branches that are listed.)
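The difference between “a single path from the tip” and “every ancestor of the tip” can be counted in a tiny sandbox history. This is a sketch with invented names: a `topic` branch with one commit is merged into `master`, and then the commits reachable from master are counted both ways. What the reply calls the 0th parent is what git's `--first-parent` option follows.

```shell
#!/bin/sh
set -e

# Throwaway sandbox; every path and name below is illustrative.
sandbox=$(mktemp -d)
cd "$sandbox"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master

# History:  A --- C --- M   (master)
#            \         /
#             +-- B --+     (topic)
git commit -q --allow-empty -m "A"
git checkout -q -b topic
git commit -q --allow-empty -m "B"
side=$(git rev-parse HEAD)
git checkout -q master
git commit -q --allow-empty -m "C"
git merge -q --no-ff -m "M: merge topic" topic

# Every commit that is an ancestor of the tip of master (the tip included).
all=$(git rev-list --count master)

# Only the commits on the path that follows the first parent each time.
first_parent=$(git rev-list --count --first-parent master)

echo "all ancestors: $all, first-parent path: $first_parent"

# B is not on the first-parent path, yet git reports master as containing it.
git branch --contains "$side"
```

In this history the first count is 4 (A, B, C and M) and the second is 3, because the first-parent walk from M never visits B. `git branch --contains` still lists master for B, which is the same information gitk shows when you click on a commit.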

  50. Jamel

    There’s certainly a great deal to learn about this topic.
    I love all the points you’ve made.
