Reset Demystified
One of the topics that I didn't cover in depth in the Pro Git book is the
reset
command. Most of the reason for this, honestly, is that
I never strongly understood the command beyond the handful of
specific use cases that I needed it for. I knew what the command did, but
not really how it was designed to work.
Since then I have become more comfortable with the command, largely thanks to
Mark Dominus's article
re-phrasing the content of the man-page, which I always found very difficult to
follow. After reading that explanation of the command, I now personally feel
more comfortable using reset
and enjoy trying to help others feel
the same way.
This post assumes some basic understanding of how Git branching works. If you don't really know what HEAD and the Index are on a basic level, you might want to read chapters 2 and 3 of this book before reading this post.
The Three Trees of Git

The way I now like to think about reset
and checkout
is through the mental frame of Git being a content manager of three different
trees. By 'tree' here I really mean "collection of files", not specifically the
data structure. (Some Git developers will get a bit mad at me here, because there
are a few cases where the Index doesn't exactly act like a tree, but for our purposes
it is easier - forgive me).
Git as a system manages and manipulates three trees in it's normal operation. Each of these is covered in the book, but let's review them.
Tree Roles | |
---|---|
The HEAD | last commit snapshot, next parent |
The Index | proposed next commit snapshot |
The Working Directory | sandbox |
The HEAD last commit snapshot, next parent
The HEAD in Git is the pointer to the current branch reference, which is in turn a pointer to the last commit you made or the last commit that was checked out into your working directory. That also means it will be the parent of the next commit you do. It's generally simplest to think of it as HEAD is the snapshot of your last commit.
In fact, it's pretty easy to see what the snapshot of your HEAD looks like. Here is an example of getting the actual directory listing and SHA checksums for each file in the HEAD snapshot:
$ cat .git/HEAD ref: refs/heads/master $ cat .git/refs/heads/master e9a570524b63d2a2b3a7c3325acf5b89bbeb131e $ git cat-file -p e9a570524b63d2a2b3a7c3325acf5b89bbeb131e tree cfda3bf379e4f8dba8717dee55aab78aef7f4daf author Scott Chacon1301511835 -0700 committer Scott Chacon 1301511835 -0700 initial commit $ git ls-tree -r cfda3bf379e4f8dba8717dee55aab78aef7f4daf 100644 blob a906cb2a4a904a152... README 100644 blob 8f94139338f9404f2... Rakefile 040000 tree 99f1a6d12cb4b6f19... lib
The Index next proposed commit snapshot
The Index is your proposed next commit. Git populates it with a list of all the
file contents that were last checked out into your working directory and what
they looked like when they were originally checked out. It's not technically a
tree structure, it's a flattened manifest, but for our purposes it's close
enough. When you run git commit
, that command only looks at your
Index by default, not at anything in your working directory. So, it's simplest
to think of it as the Index is the snapshot of your next commit.
$ git ls-files -s 100644 a906cb2a4a904a152e80877d4088654daad0c859 0 README 100644 8f94139338f9404f26296befa88755fc2598c289 0 Rakefile 100644 47c6340d6459e05787f644c2447d2595f5d3a54b 0 lib/simplegit.rb
The Working Directory sandbox, scratch area
Finally, you have your working directory. This is where the content of files are placed into actual files on your filesystem so they're easily edited by you. The Working Directory is your scratch space, used to easily modify file content.
$ tree . ├── README ├── Rakefile └── lib └── simplegit.rb 1 directory, 3 files
The Workflow
So, Git is all about recording snapshots of your project in successively better states by manipulating these three trees, or collections of contents of files.

Let's visualize this process. Say you go into a new directory with a single
file in it. We'll call this V1 of the file and we'll indicate it in blue.
Now we run git init
, which will create a Git repository with a
HEAD reference that points to an unborn branch (aka, nothing)

At this point, only the Working Directory tree has any content.
Now we want to commit this file, so we use git add
to take
content in your Working Directory and populate our Index with the updated
content

Then we run git commit
to take what the Index looks like now
and save it as a permanent snapshot pointed to by a commit, which HEAD is then
updated to point at.

At this point, all three of the trees are the same. If we run
git status
now, we'll see no changes because they're all the
same.
Now we want to make a change to that file and commit it. We will go through the same process. First we change the file in our working directory.

If we run git status
right now we'll see the file in red as
"changed but not updated" because that entry differs between our Index and our
Working Directory. Next we run git add
on it to stage it into our
Index.

At this point if we run git status
we will see the file in green
under 'Changes to be Committed' because the Index and HEAD differ - that is, our
proposed next commit is now different from our last commit. Those are the entries
we will see as 'to be Committed'. Finally, we run git commit
to
finalize the commit.

Now git status
will give us no output because all three trees
are the same.
Switching branches or cloning goes through a similar process. When you checkout a branch, it changes HEAD to point to the new commit, populates your Index with the snapshot of that commit, then checks out the contents of the files in your Index into your Working Directory.
The Role of Reset
So the reset
command makes more sense when viewed in this
context. It directly manipulates these three trees in a simple and predictable
way. It does up to three basic operations.
Step 1: Moving HEAD killing me --soft ly
The first thing reset
will do is move what HEAD points to. Unlike
checkout
it does not move what branch HEAD points to, it directly
changes the SHA of the reference itself. This means if HEAD is pointing to the
'master' branch, running git reset 9e5e64a
will first of all make
'master' point to 9e5e64a
before it does anything else.

No matter what form of reset
with a commit you invoke, this is
the first thing it will always try to do. If you add the flag --soft
,
this is the only thing it will do. With --soft
, reset
will simply stop there.
Now take a second to look at that diagram and realize what it did. It
essentially undid the last commit you made. When you run git commit
,
Git will create a new commit and move the branch that HEAD
points
to up to it. When you reset
back to HEAD~
(the parent
of HEAD), you are moving the branch back to where it was without changing the Index
(staging area) or Working Directory. You could now do a bit more work and
commit
again to accomplish basically what git commit --amend
would have done.
Step 2: Updating the Index having --mixed feelings
Note that if you run git status
now you'll see the in green the
difference between the Index and what the new HEAD is.
The next thing reset
will do
is to update the Index with the contents of whatever tree HEAD now
points to so they're the same.

If you specify the --mixed
option, reset
will stop
at this point. This is also the default, so if you specify no option at all,
this is where the command will stop.
Now take another second to look at THAT diagram and realize what it did. It
still undid your last commit
, but also unstaged everything.
You rolled back to before you ran all your git add
s AND
git commit
.
Step 3: Updating the Working Directory math is --hard, let's go shopping
The third thing that reset
will do is to then make the Working
Directory look like the Index. If you use the --hard
option, it
will continue to this stage.

Finally, take yet a third second to look at that diagram and think
about what happened. You undid your last commit, all the git add
s,
and all the work you did in your working directory.
It's important to note at this point that this is the only way to make the
reset
command dangerous (ie: not working directory safe). Any
other invocation of reset
can be pretty easily undone, the
--hard
option cannot, since it overwrites (without checking) any
files in the Working Directory. In this particular case, we still have v3
version of our file in a commit in our Git DB that we could get back by looking
at our reflog
, but if we had not committed it, Git still would have
overwritten the file.
Overview
That is basically it. The reset
command overwrites these three
trees in a specific order, stopping when you tell it to.
- #1) Move whatever branch HEAD points to (stop if
--soft
) - #2) THEN, make the Index look like that (stop here unless
--hard
) - #3) THEN, make the Working Directory look like that
There are also --merge
and --keep
options, but
I would rather keep things simpler for now - that will be for another article.
Boom. You are now a reset
master.
Reset with a Path
Well, I lied. That's not actually all. If you specify a path,
reset
will skip the first step and just do the other ones but limited
to a specific file or set of files. This actually sort of makes sense - if the
first step is to move a pointer to a different commit, you can't make it point
to part of a commit, so it simply doesn't do that part. However, you can
use reset
to update part of the Index or the Working Directory
with previously committed content this way.
So, assume we run git reset file.txt
. This assumes, since you
did not specify a commit SHA or branch that points to a commit SHA, and that you
provided no reset option, that you are typing the shorthand for
git reset --mixed HEAD file.txt
, which will:
#1) Move whatever branch HEAD points to (stop if--soft
)- #2) THEN, make the Index look like that
(stop here unless--hard
)
So it essentially just takes whatever file.txt
looks like in
HEAD and puts that in the Index.

So what does that do in a practical sense? Well, it unstages the
file. If we look at the diagram for that command vs what git add
does, we can see that it is simply the opposite. This is why the output of
the git status
command suggests that you run this to unstage a
file.

We could just as easily not let Git assume we meant "pull the data from HEAD"
by specifying a specific commit to pull that file version from to populate our
Index by running something like git reset eb43bf file.txt
.

So what does that mean? That functionally does the same thing as if we had
reverted the content of the file to v1, ran git add
on it,
then reverted it back to to v3 again. If we run git commit
,
it will record a change that reverts that file back to v1, even though
we never actually had it in our Working Directory again.
It's also pretty interesting to note that like git add --patch
,
the reset
command will accept a --patch
option to
unstage content on a hunk-by-hunk basis. So you can selectively unstage or
revert content.
A fun example
I may use the term "fun" here a bit loosely, but if this doesn't sound like fun to you, you may drink while doing it. Let's look at how to do something interesting with this newfound power - squashing commits.
If you have this history and you're about to push and you want to squash
down the last N commits you've done into one awesome commit that makes you
look really smart (vs a bunch of commits with messages like "oops.", "WIP"
and "forgot this file") you can use reset
to quickly and easily
do that (as opposed to using git rebase -i
).
So, let's take a slightly more complex example. Let's say you have a project where the first commit has one file, the second commit added a new file and changed the first, and the third commit changed the first file again. The second commit was a work in progress and you want to squash it down.

You can run git reset --soft HEAD~2
to move the HEAD
branch back to an older commit (the first commit you want to keep):

And then simply run git commit
again:

Now you can see that your reachable history, the history you would push, now looks like you had one commit with the one file, then a second that both added the new file and modified the first to it's final state.
Check it out
Finally, some of you may wonder what the difference between checkout
and reset
is. Well, like reset
, checkout
manipulates the three trees and it is a bit different depending on whether you
give the command a file path or not. So, let's look at both examples seperately.
git checkout [branch]
Running git checkout [branch]
is pretty similar to running
git reset --hard [branch]
in that it updates all three trees for
you to look like [branch]
, but there are two important differences.
First, unlike reset --hard
, checkout
is working
directory safe in this invocation. It will check to make sure it's not blowing
away files that have changes to them. Actually, this is a subtle difference,
because it will update all of the working directory except the files you've
modified if it can - it will do a trivial merge between what you're checking
out and what's already there. In this case, reset --hard
will
simply replace everything across the board without checking.
The second important difference is how it updates HEAD. Where reset
will move the branch that HEAD points to, checkout
will move HEAD
itself to point to another branch.
For instance, if we have two branches, 'master'
and 'develop' pointing at different commits, and we're currently on 'develop'
(so HEAD points to it) and we run git reset master
, 'develop' itself
will now point to the same commit that 'master' does.
On the other hand, if we instead run git checkout master
, 'develop'
will not move, HEAD itself will. HEAD will now point to 'master'. So, in both
cases we're moving HEAD to point to commit A, but how we do so is very
different. reset
will move the branch HEAD points to,
checkout
moves HEAD itself to point to another branch.

git checkout [branch] file
The other way to run checkout
is with a file path, which like
reset
, does not move HEAD. It is just like git reset [branch] file
in that it updates the index with that file at that commit, but it also
overwrites the file in the working directory. Think of it like
git reset --hard [branch] file
- it would be exactly the same thing,
it is also not working directory safe and it also does not move HEAD. The only
difference is that reset
will a file name will not accept --hard
,
so you can't actually run that.
Also, like git reset
and git add
, checkout
will accept a --patch
option to allow you to selectively revert file
contents on a hunk-by-hunk basis.
Cheaters Gonna Cheat
Hopefully now you understand and feel more comfortable with the
reset
command, but are probably still a little confused about how
exactly it differs from checkout
and could not possibly remember
all the rules of the different invocations.
So to help you out, I've created something that I pretty much hate, which
is a table. However, if you've followed the article at all, it may be a useful
cheat sheet or reminder. The table shows each class of the reset
and checkout
commands and which of the three trees it updates.
Pay especial attention to the 'WD Safe?' column - if it's red, really think about it before you run that command.
head | index | work dir | wd safe | |
---|---|---|---|---|
Commit Level | ||||
reset --soft [commit] | REF | NO | NO | YES |
reset [commit] | REF | YES | NO | YES |
reset --hard [commit] | REF | YES | YES | NO |
checkout [commit] | HEAD | YES | YES | YES |
File Level | ||||
reset (commit) [file] | NO | YES | NO | YES |
checkout (commit) [file] | NO | YES | YES | NO |
Good night, and good luck.