Taking Control of Your Git History:Editing History
Published on Parts of this series:
Much like human history, Git history doesn’t necessarily reflect exactly what happened, but unlike human history, it shouldn’t have to.
We might want to do that if we want to amend some commits with additional changes, edit their commit message, exclude them from the branch etc. Once we get comfortable with doing this, our commits are no longer final; we can tweak them as much as we want until their diffs and commit messages are just right and they are ready to be reviewed and (hopefully) merged. People viewing our commits shouldn’t have to go through of every aspect of our process, e.g. that we later remembered to add tests, or that we decided to revert a change. Instead, let’s make it concise and descriptive for them.
However, just because we can change history doesn’t mean we always should. Specifically, we shouldn’t change history on branches that multiple people are working on. If we change history of the upstream branch with git push --force
, then everyone else who also works on that branch has to have that same history locally, but if they already committed some changes, it can get really messy, so don’t do that. 😉
This part of the series is a collection of various instructions to help you create the Git history you want.
What about squashing?
When we’re working on our task, it rarely happens that we know exactly what changes we’re going to make. Sometimes it happens that in order to do our task, we need to do something else first. It’s easy to push a bunch of commits like this to our branch:
- “handle empty arrays in
extractWorkers
” - “fix config”
- “fix tests”
- “fix another test”
- “refactor
helpers.js
”
Then squash all those commits together into a single commit, e.g. “handle empty array in extractWorkers
”, upon merging our branch into master
. But what if one of our coworkers wanted to know why helpers.js
was refactored and stumbles upon our unrelated commit pertaining to extractWorkers
? Chances are we didn’t explain refactoring in our commit message.
We can do better. We need to start thinking about our commits as documenting changes for other people, for the sake of our coworkers and our future selves.
Note: I think squashing is very useful in open source, where pull requests are usually smaller, and where we can’t affect our contributors’ Git skills.
Last commit
I want to start by showing you a couple of useful operations we can perform on the last commit. They are simpler, so they can help us understand the concept of editing history before we move on to rebasing.
Amending
Let’s say that we want to add some more changes to a commit and/or edit its commit message. When the commit in question is the last commit, we can simply stage the changes (if any), and run the following:
git commit --amend
This will open Git’s editor with last commit’s message, which gives us the opportunity to change it if we want. After saving and exiting the file, the commit will be saved with new changes amended.
If we already pushed the commit before amending it, and we run:
git status
we should see that our branch is both 1 commit ahead and 1 behind. This is because the commit we amended is no longer the same commit, it has different properties, including SHA-1, so the upstream branch still has our old commit (1 behind) that we had locally overwritten with the new one (1 ahead).
So what should we do now? Well, the fact that we amended that commit means that we’re no longer interested in its old version, so we need to overwrite upstream’s history:
git push --force
Forcefully pushing commits is a serious command, so in order to avoid overwriting commits that someone else might have pushed in the meantime, I recommend running this instead:
git push --force-with-lease
This will forcefully push only if nobody else pushed any changes upstream that we haven’t pulled. We can even create a cute alias to make it shorter to type:
git config --global alias.please 'push --force-with-lease'
git please
Splitting into multiple commits
Sometimes we end up committing too many changes as a single commit that aren’t really related. If you notice this, I suggest committing these changes separately because it will give you much more flexibility:
- merge conflicts will be easier to resolve because a set of related changes will be easier for our brain to deal with
- we can decide to postpone certain changes by moving the commit out of the branch
- if for some reason we give up on the main commits in your branch, we can simply delete them and open a spin-off pull request with the rest of the changes
To start splitting the last commit into multiple commits, we can reset it and start committing again. For this task we should use soft reset:
git reset --soft HEAD^
HEAD
refers to the tip of the current branch, i.e. its last commit, so this command resets the last commit by moving the tip of the branch to the commit before the last commit, a.k.a. its parent, which we can refer to by appending ^
. Alternatively, you can use the SHA-1 of the last commit (appending ^
as well), or the commit before the last one without ^
. Possibilities are endless. 😜
Soft reset leaves file changes from reset commits intact. Now all those changes are back to being staged, just like before we committed them, so we can unstage them by running:
git reset
and start staging and committing changes the way we wanted.
Moving
Sometimes we may realize that the last commit isn’t really related to our pull request. We still want to get it merged eventually, but we also don’t want to pollute the diff with unrelated changes for the person reviewing the pull request. In that case we can move it to another branch following these steps:
- copy that commit’s SHA-1 to the clipboard
git checkout -b <branch> master
to create and switch to a new branch called<branch>
, diverging frommaster
git cherry-pick <commit>
to copy the commit to<branch>
<commit>
is the copied SHA-1
git checkout -
to return to the previous branchgit reset --hard HEAD^
to get rid of the last commit- unlike soft reset, this will erase that commit’s file changes from this branch, but we already copied them to
<branch>
so that’s what we want
- unlike soft reset, this will erase that commit’s file changes from this branch, but we already copied them to
Any commit
Now that we laid the groundwork for editing Git history, we’re getting to the more powerful stuff. 😎
Amending
If the commit we want to edit isn’t the last one, this is where interactive rebasing comes into play. First, we should decide which commits we want to change by viewing the history:
git log --pretty=oneline --abbrev-commit
We used the more compact format to make it easier to find the commits we’re looking for based on their subject:
5dae451 (HEAD -> master) Link logo to the site root
d3942e9 Use SVG format for the logo
e0662b2 Improve a11y of the logo image
09216b1 Add encouraging paragraph above the logo
5b5ffb9 Add index page
96be933 root
Side note: this is a test of how well we write commit subjects. If we’re constantly looking up diffs of commits to figure out which changes they contain, we should reconsider our style of writing subjects.
In order to amend a commit with some changes, we should commit those changes first. The subject can be anything because we’ll discard it anyway, but let’s use a special one:
fixup! <commit>
where <commit>
is the SHA-1 of the commit we want to amend. You’ll see what it does in a moment.
For example, let’s say that we want to amend commits “Improve a11y of the logo image” and “Use SVG format for the logo”. We would commit our changes using their SHA-1 in the subject:
f5409b6 fixup! d3942e9
41aac9e fixup! e0662b2
5dae451 (HEAD -> master) Link logo to the site root
d3942e9 Use SVG format for the logo
e0662b2 Improve a11y of the logo image
09216b1 Add encouraging paragraph above the logo
5b5ffb9 Add index page
96be933 root
After committing all the changes, we should copy the SHA-1 of the oldest commit to be amended, which in our case is e0662b2
because it has been committed before d3942e9
, and run interactive rebase onto its parent:
git rebase --interactive --autosquash e0662b2^
Now we should see something like this in our Git editor:
pick e0662b2 Improve a11y of the logo image
fixup 41aac9e fixup! e0662b2
pick d3942e9 Use SVG format for the logo
fixup f5409b6 fixup! d3942e9
pick 5dae451 Link logo to the site root
Because we used the --autosquash
option, the commits with fixup!
subjects have been automatically reordered to amend their respective commits. Below you’ll see instructions explaining what directives like pick
and fixup
mean:
# Rebase 09216b1..f5409b6 onto f5409b6 (5 commands)
#
# Commands:
# p, pick <commit> = use commit
# r, reword <commit> = use commit, but edit the commit message
# e, edit <commit> = use commit, but stop for amending
# s, squash <commit> = use commit, but meld into previous commit
# f, fixup <commit> = like "squash", but discard this commit's log message
# x, exec <command> = run command (the rest of the line) using shell
# b, break = stop here (continue rebase later with 'git rebase --continue')
# d, drop <commit> = remove commit
# l, label <label> = label current HEAD with a name
# t, reset <label> = reset HEAD to a label
# m, merge [-C <commit> | -c <commit>] <label> [# <oneline>]
# . create a merge commit using the original merge commit's
# . message (or the oneline, if no original merge commit was
# . specified). Use -c <commit> to reword the commit message.
#
# These lines can be re-ordered; they are executed from top to bottom.
#
# If you remove a line here THAT COMMIT WILL BE LOST.
#
# However, if you remove everything, the rebase will be aborted.
#
# Note that empty commits are commented out
After saving and existing this file, fixup!
commits will amend the commits before them, i.e. above them. If we wanted to also change commit message of amended commits, we can replace pick
with reword
(or just r
), just like the instructions suggest.
Interactive rebase takes some time getting used to, but it’s incredibly powerful, I suggest playing with it because it lets us tweak Git history as much as we want.
If you want to set --autosquash
as the default behavior of interactive rebase, turn it on in your global Git configuration:
git config --global rebase.autosquash=true
Moving
Now that we learned how to copy a commit to another branch and how to interactively rebase, moving any commit is so simple that I’m not even going to show you any code snippets because I know you can handle it:
- copy the commit to the target branch
- interactively rebase onto a commit before before the commit you want to delete
- as the instructions will tell you, simply delete the line with that commit
Git becomes way less cryptic once you learn these concepts, doesn’t it?
Syncing branches
Sometimes we need to sync our branch with changes from master
, maybe we need those changes or, but usually we need to resolve conflicts. There are two ways we can do this:
git merge master
git rebase master
Guess which one I prefer? 😄
As I mentioned at the beginning of this post, people viewing Git history don’t need to know how many times we synced our branch with master
, in my opinion it creates unnecessary noise. Let’s say that we had to merge master
into the branch style
in order to resolve a conflict:
The merge commit on style
only informs us that we synced it with master
, if it’s a larger branch we might have more of these. Do we need that information, though? Compare that to the rebase workflow instead:
Here we rebased onto master
in order to resolve the conflict with the commit “Add Open Graph metadata”. We “replayed” our commits on top of master
as if we committed those changes just now, so now our history is saying that we diverged from “Add Open Graph metadata”, even though we actually diverged from “Link logo to the root”. In my opinion telling this white lie makes history easier to follow in the long run, especially if we consider multiple syncs and multiple branches. Syncing branches is just something we occasionally have to do, I don’t think there is any point in logging that information to our history.
Undo ✨
Sometimes changing Git history can turn out to be a mistake, which is scary because what do we do now? Luckily, there actually is a way to undo it — using reflogs:
git reflog
5dae451 HEAD@{9}: rebase -i (finish): returning to refs/heads/master
5dae451 HEAD@{10}: rebase -i (start): checkout 09216b1
5dae451 HEAD@{11}: rebase -i (finish): returning to refs/heads/master
5dae451 HEAD@{12}: rebase -i (reword): Link logo to the site root
d3942e9 HEAD@{13}: rebase -i (reword): Use SVG format for the logo
aefe4d9 HEAD@{14}: rebase -i: fast-forward
e0662b2 HEAD@{15}: rebase -i (start): checkout e0662b2
a38472b HEAD@{16}: commit: Link logo to the root
aefe4d9 HEAD@{17}: commit: Use SVG for the logo
e0662b2 HEAD@{18}: commit: Improve a11y of the logo image
09216b1 HEAD@{19}: commit (amend): Add encouraging paragraph above the logo
7aff0b4 HEAD@{20}: commit (amend): Add encouraging paragraph above the logos
dc5d1a9 HEAD@{21}: commit: Add encouraging paragraph above the logo
5b5ffb9 HEAD@{22}: reset: moving to HEAD
5b5ffb9 HEAD@{23}: commit: Add index page
96be933 HEAD@{24}: commit (initial): root
At first glance it looks similar to a list of commits, like something a git log
might print, but you can think of it as a list of actions; if we examine it closely, we can recognize rebasing, committing and resetting.
Undoing our interactive rebase is as simple as resetting the actions starting with rebase -i
by passing HEAD@{<n>}
of the action before. In our case this is commit: Link logo to the root
:
git reset --hard HEAD@{16}
And voilá!
Be careful with this, though, I’m still very unfamiliar with reflogs so it would be a good idea to research them more.
Resolving conflicts
Resolving conflicts during rebase looks slightly different than with merging becuase the concepts are different. Initially I thought I would just briefly walk you through the process, but I started writing a lot, so I thought why not write another part solely on resolving conflicts? People hate doing it, so it might help to understand it better and to learn some useful tricks.