Taking Control of Your Git History:
Staging Accurately

Published on

Parts of this series:

  1. The Importance of Git History
  2. Staging Accurately
  3. Editing History
  4. Resolving Conflicts

Make sure that you set Git’s default editor. The operations you’ll be doing with this editor are very simple, so the editor doesn’t have to be fancy, it doesn’t even have to have a GUI (I’m using Vim). But let’s say that you want to use the same text editor that you code with, e.g. VS Code, this is how you would set it as Git’s editor:

git config --global core.editor "code --wait"

Find out how to set it to your own favorite editor.

As you probably already know, there is more to life than always staging entire files. Very often changes in a single file should belong to multiple commits, so instead of staging entires file we should stage only those changes that are relevant to our commit. This is sometimes challenging, but after reading this series I hope you’ll realize that you have complete control over this.

Let’s start going over all of our changes:

git add -p

Git will split changes into groups called hunks, which we can stage separately. At the bottom you should see instructions similar to this:

Stage this hunk [y,n,q,a,d,s,e,?]?

To see what each of these letters mean, type ? and press enter.

Using y to stage, n to skip, and s to split the hunk into smaller hunks is usually enough to stage only related changes. However, sometimes we need finer control:

  • when there are multiple changed lines next to each other, but we only want to stage some of them
  • when we made multiple changes to a single line that we want to commit separately

This is where e comes into play, it will open the current hunk in the text editor (that we configured for Git) so we can manually edit it. Let’s say that we’re staging changes of an HTML file, manually editing a hunk might look like this:

# Manual hunk edit mode -- see bottom for a quick guide.
@@ -7,6 +7,9 @@
   <title>Towards a Better Git History</title>
 </head>
 <body>
-  <img src="/images/logo.png" alt="logo">
+  <p>Manually editing hunks will impress your colleagues.</p>
+  <a href="/">
+    <img src="/images/logo.svg" alt="">
+  </a>
 </body>
 </html>
# ---
# To remove '-' lines, make them ' ' lines (context).
# To remove '+' lines, delete them.
# Lines starting with # will be removed.
#
# If the patch applies cleanly, the edited hunk will immediately be
# marked for staging.
# If it does not apply cleanly, you will be given an opportunity to
# edit again.  If all lines of the hunk are removed, then the edit is
# aborted and the hunk is left unchanged.

If you want to follow along in a test repo, I prepared a gist. First commit index-0.html, then replace its content with index-1.html. Now you should have the exact same diff.

This diff describes four changes:

  1. we added a paragraph above the logo
  2. we removed the text from the alt attribute to improve a11y
  3. we change the logo image format to SVG
  4. we wrapped a link around the logo

Ideally, each of these changes should be a separate commit, but this is where many people give up and cram changes into a single commit. This ends now! ✊

Our first commit can be adding the paragraph above the logo. See instructions below the diff? They explain what we should do to achieve this. We should delete + lines where we’re wrapping the link around the logo, replace - with a space, and move the paragraph line above the logo:

   <title>Towards a Better Git History</title>
 </head>
 <body>
+  <p>Manually editing hunks will impress your colleagues.</p>
   <img src="/images/logo.png" alt="logo">
 </body>
 </html>

Don’t worry, we aren’t making changes to our actual files, we’re just telling it what to stage.

After saving the file and exiting, you have now staged the changes for the first commit!

git commit -m "Add encouraging paragraph above the logo"

Now let’s continue staging:

git add -p
 </head>
 <body>
   <p>Manually editing hunks will impress your colleagues.</p>
-  <img src="/images/logo.png" alt="logo">
+  <a href="/">
+    <img src="/images/logo.svg" alt="">
+  </a>
 </body>
 </html>

The second change commit can be to remove the text from the alt attribute, but we want to preserve the .png format because that will be a separate commit. Remember: you can tell Git to stage anything, even “made-up” changes. We can delete all the + lines, copy the - line below, convert it to a + line, and remove the text logo from the alt attribute:

 </head>
 <body>
   <p>Manually editing hunks will impress your colleagues.</p>
-  <img src="/images/logo.png" alt="logo">
+  <img src="/images/logo.png" alt="">
 </body>
 </html>

Now we can commit this change:

git commit
# subject: Improve a11y of the logo image
# body: Not every image needs an `alt` attribute, reading the logo image isn't
#       useful here, passing an empty string will cause screen readers to skip it.
#
#       https://davidwalsh.name/accessibility-tip-empty-alt-attributes

As an excercise, try to stage and commit the last two changes yourself. 😉

When we make multiple changes to a file, Git only shows the total diff, so it’s up to us to break it down into steps when necessary. I strongly encourage you to start doing this, having correct diffs results in a much better Git history.

Sometimes it can happen that we accidentally added a patch that we would like to revert. We can un-stage changes with git reset:

git reset -p

As you would assume, it’s the opposite of git add in every way. This takes some time getting used to because it can be difficult to think this way, especially when we’re manually editing a hunk! 😅 If it becomes too hard for you, you can always just reset the entire hunk and partially stage it again.

Sometimes we can spent a lot of time staging changes for a single commit, and we can start to lose confidence that we staged exactly what we wanted. Unrelated changes can easily creep in, so I recommend getting into the habit of double-checking whether our staged changes make sense before committing them:

git diff --staged
Hot tip!

If our changes are predominantly word-based, e.g. documentation updates, it might be easier to double-check them using a word diff:

git diff --staged --word-diff

Finally, once we’re happy with our staged changes, we can commit:

git commit

Our Git editor will open, and we can write your commit message. Remember:

  • The first line is the subject, a short summary of your commit. Avoid subjects like “Changed README.md” in favor of subjects like “Clarify usage of tooltips in the docs”.

  • After leaving an empty line below you can write the description of the commit. Often the subject isn’t enough to describe your changes well, so this is the opportunity to do that. It’s not possible to write too much here, and keep in mind who you’re writing this for. Maybe this commit’s pull request contains a link that leads to a Slack message that leads to a Jira ticket that links to a Dropbox Paper document that describes these changes, but instead of counting on anyone being able to follow this trail, why not just describe all of that in the commit message directly?

Finally, we can double-check the diff of the last commit by running:

git show [<commit>]

where <commit> is its SHA-1. It’s optional, as the square brackets indicate, omitting it refers to the last commit. If the diff is big and you’re only interested in specific files, you can specify them after --, for example:

git show 3bb6e04 -- package.json babel.config.js

After committing changes we might decide that we aren’t happy with the diff or the commit message, what then? Now we’re entering the territory of changing Git’s history.