Overview
Teaching: 15 min
Exercises: 30 min
Questions
- How can we find out when exactly a line of code was changed?
- How can we navigate past versions of the code?
- How can we find out which commit broke or changed a functionality?
Objectives
- Quickly find a line of code, find out why it was introduced and when.
- Quickly find the commit that changed a behavior.
Preparation
Please make sure that you do not clone repositories inside an already tracked folder:
$ git status
If you are inside an existing Git repository, step out of it: we will clone a new repository, so you need a different location.
If you see this message, you are outside any repository, which is what we want here:
fatal: not a git repository (or any of the parent directories): .git
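The same check can be scripted. The following is a minimal sketch, assuming a hypothetical helper named inside_repo and throwaway directories created with mktemp; it relies on git rev-parse --is-inside-work-tree, which prints "true" only inside a working tree.

```shell
# Hypothetical helper: succeeds only if the given directory is inside a Git working tree.
inside_repo() {
    [ "$(git -C "$1" rev-parse --is-inside-work-tree 2>/dev/null)" = "true" ]
}

# A fresh temporary directory is normally not part of any repository.
outside=$(mktemp -d)
if inside_repo "$outside"; then before=yes; else before=no; fi

# After "git init", the new directory is a repository.
git init -q "$outside/project"
if inside_repo "$outside/project"; then after=yes; else after=no; fi

echo "temporary directory inside a repository: $before"
echo "initialized project inside a repository: $after"
```

If the first answer is "no", the location is safe for cloning.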
First the instructor demonstrates a few commands on a real-life example repository, https://github.com/networkx/networkx (mentioned in the amazing site The Programming Historian). Later we will practice these in groups in an archaeology exercise (below).
git grep to search through the repository
With git grep you can find all lines in a repository which contain some string or regular expression.
This is useful to find out where in the code some variable is used or some error message is printed:
$ git grep sometext
In the networkx repository you can try:
$ git clone https://github.com/networkx/networkx
$ cd networkx
$ git grep -i fixme
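A few options make git grep output easier to work with. This is a minimal sketch run in a throwaway repository so the results are predictable; the file names and contents are hypothetical and not part of networkx.

```shell
# Build a small throwaway repository (hypothetical files).
repo=$(mktemp -d)
cd "$repo"
git init -q
printf 'def area(r):\n    # FIXME: use math.pi\n    return 3.14 * r * r\n' > geometry.py
printf 'Notes\nfixme: describe area()\n' > notes.txt
git add geometry.py notes.txt

# git grep searches tracked files; staging them is enough here.
# -n prints line numbers, -i makes the match case-insensitive.
git grep -n -i fixme

# Restrict the search to Python files with a pathspec after "--".
python_hits=$(git grep -n -i fixme -- '*.py')
echo "$python_hits"

# -c prints the number of matching lines per file instead of the lines themselves.
counts=$(git grep -c -i fixme)
echo "$counts"
```

The same options work in the networkx clone, for example git grep -n -i fixme -- '*.py'.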
git log -S to search through the history of changes
While git grep searches the current state of the repository,
git log -S searches the history and lists the commits that added or removed “sometext”:
$ git log -S sometext
In the networkx repository you can try:
$ git log -S test_weakly_connected_component
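To see which commits git log -S reports, it can help to try it on a tiny history. The sketch below assumes a throwaway repository with a hypothetical file settings.py and an example identity set only for that repository.

```shell
# Throwaway repository with three commits (hypothetical content).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo 'limit = 10' > settings.py
git add settings.py
git commit -q -m "Add limit"

echo 'verbose = True' >> settings.py
git commit -q -a -m "Add verbose flag"

echo 'verbose = True' > settings.py
git commit -q -a -m "Remove limit"

# -S lists the commits that changed the number of occurrences of the string,
# that is, the commits that added or removed it. The middle commit is skipped.
found=$(git log -S limit --format=%s)
echo "$found"

# Adding -p also prints the diffs of the reported commits.
git log -S limit -p --format='%h %s'
```

Only the first and last commits are listed, newest first, because the middle commit did not touch the string "limit".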
git show to inspect commits
We have seen this one before already. Using git show we can inspect an individual commit if we know its hash:
$ git show somehash
For instance:
$ git show 759d589bdfa61aff99e0535938f14f67b01c83f7
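Besides printing the full commit, git show can summarize a commit or print a single file as it was at that commit. A minimal sketch, assuming a throwaway repository with a hypothetical file notes.txt:

```shell
# Throwaway repository with two commits (hypothetical content).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo 'version 1' > notes.txt
git add notes.txt
git commit -q -m "First version"
first=$(git rev-parse HEAD)

echo 'version 2' > notes.txt
git commit -q -a -m "Second version"

# Full commit: metadata, message and diff.
git show "$first"

# --stat lists the changed files instead of printing the whole diff.
git show --stat "$first"

# <hash>:<path> prints the file as it was in that commit.
old_content=$(git show "$first:notes.txt")
echo "$old_content"
```

The last form is handy for looking at one old file without switching branches.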
git annotate to annotate code with commit metadata
Try it out on a file: with git annotate you can see, line by line, who last modified each line and when.
It also prints the precise hash of the last change which modified each line. This is incredibly useful
for reproducibility.
$ git annotate somefile
Example:
$ git annotate networkx/convert_matrix.py
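The line-by-line attribution can be checked on a two-commit history. This sketch assumes a throwaway repository with a hypothetical file letters.txt; it also uses git blame, which reports the same information as git annotate in a slightly different output format, because its --porcelain mode prints full hashes that are easy to compare.

```shell
# Throwaway repository: the second commit changes only line 2 (hypothetical content).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'alpha\nbeta\n' > letters.txt
git add letters.txt
git commit -q -m "Add alpha and beta"
first=$(git rev-parse HEAD)

printf 'alpha\nBETA\n' > letters.txt
git commit -q -a -m "Capitalize beta"
second=$(git rev-parse HEAD)

# Each line is annotated with the commit that last changed it.
git annotate letters.txt

# -L limits the output to a line range; --porcelain starts with the full hash.
line1=$(git blame --porcelain -L 1,1 letters.txt | head -n 1 | cut -d ' ' -f 1)
line2=$(git blame --porcelain -L 2,2 letters.txt | head -n 1 | cut -d ' ' -f 1)
echo "line 1 last changed in $line1"
echo "line 2 last changed in $line2"
```

Line 1 is attributed to the first commit and line 2 to the second, even though both lines are in the same file.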
If you annotate in a terminal and the file is longer than the screen, Git by default uses the program less to scroll the output.
Use /sometext <ENTER> to find “sometext”; you can then cycle through the results with n (next) and N (previous).
You can also use page up/down to scroll. You can quit with q.
Discussion
Discuss how these two affect the annotation:
- wrapping long lines of text/code into shorter lines
- autoformatting tools such as black
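One way to explore this question is to reformat a file and compare the annotation before and after. The sketch below assumes a throwaway repository with a hypothetical file f.py and stands in for an autoformatter by changing only the indentation; it uses git blame, the close relative of git annotate, with its -w option, which ignores whitespace when tracing where a line came from.

```shell
# Throwaway repository: the second commit only changes indentation (hypothetical content).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'def f():\n  return 1\n' > f.py
git add f.py
git commit -q -m "Add f"
original=$(git rev-parse HEAD)

printf 'def f():\n    return 1\n' > f.py
git commit -q -a -m "Reindent with four spaces"
reformat=$(git rev-parse HEAD)

# By default the reformatting commit takes over the attribution of line 2.
plain=$(git blame --porcelain -L 2,2 f.py | head -n 1 | cut -d ' ' -f 1)

# With -w, whitespace-only changes are skipped and the original commit is reported.
ignoring_ws=$(git blame -w --porcelain -L 2,2 f.py | head -n 1 | cut -d ' ' -f 1)

echo "default:  $plain"
echo "with -w:  $ignoring_ws"
```

Note that -w only covers whitespace changes within lines. Rewrapping text moves content between lines, so it still changes the attribution; Git 2.23 and later also offer git blame --ignore-rev to skip a chosen commit.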
git checkout -b to inspect code in the past
We can create branches pointing to a commit in the past. This is the recommended mechanism to inspect old code:
$ git checkout -b branchname somehash
Example:
# create branch called "older-code" from hash 347e6292419b
$ git checkout -b older-code 347e6292419bd0e4bff077fe971f983932d7a0e9
# now you can navigate and inspect the code as it was back then
# ...
# after we are done we can switch back to "master" (some repositories call this branch "main")
$ git checkout master
# if we like we can delete the "older-code" branch
$ git branch -d older-code
On newer Git versions (2.23 and later) this is the preferred command:
$ git switch --create branchname somehash
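The whole round trip can be tried safely in a throwaway repository. This is a minimal sketch with a hypothetical file report.txt; it uses git checkout so that it also runs on older Git versions, and it looks up the name of the starting branch instead of assuming "master" or "main".

```shell
# Throwaway repository with two commits (hypothetical content).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo 'first draft' > report.txt
git add report.txt
git commit -q -m "First draft"
old=$(git rev-parse HEAD)

echo 'final text' > report.txt
git commit -q -a -m "Final text"

# Remember the branch we started on, whatever it is called.
start=$(git symbolic-ref --short HEAD)

# Create a branch at the old commit and look at the file as it was then.
git checkout -q -b older-code "$old"
past=$(cat report.txt)
echo "in the past: $past"

# Switch back and delete the temporary branch.
git checkout -q "$start"
present=$(cat report.txt)
echo "now: $present"
git branch -d older-code
remaining=$(git branch --list older-code)
```

Deleting with -d works here because the old commit is still part of the history of the starting branch, so nothing is lost.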
Key Points
- git log/grep/annotate/show is a powerful combination when doing archaeology in a project.
- git checkout -b <name> <hash> is the recommended mechanism to inspect old code.
- On newer Git you can use the more intuitive git switch --create branchname somehash.