186

I'm trying to do something fancy here with Git hooks, but I don't really know how to do it (or if it's possible).

What I need to do is: on every commit, take the commit's hash and then update a file in that commit with this hash.

Any ideas?

5
  • 18
    Basically I have a web application and I want to associate an installed version of that application with the exact commit that version was built from. My initial idea was to update a sort of about.html file with the commit hash. But after studying git's object model, I realized that this is kind of impossible =/ Commented Aug 9, 2010 at 19:22
  • 40
    This is a very practical problem. I ran into it too! Commented Jul 31, 2013 at 9:20
  • 9
    As for me, I would like my program to write a message like this to the logs: "myprog starting up, v.56c6bb2". That way, if someone files a bug and sends me the log files, I can find out exactly what version of my program was running. Commented Jun 30, 2016 at 20:45
  • 5
    @Jefromi, the actual use case is in fact very common, and hits beginners very easily. Having the real version somehow "imprinted" into baselined files is a basic need, and it's far from obvious why it would be a wrong idea, e.g. because that's pretty much your only option with manual revision control hacks. (Remember beginners.) Add to that that many projects simply don't have any sort of build/installation/deployment step at all which could grab and stamp the version into live files. Regardless, instead of pre-commit, the post-checkout hook could help even in those cases. Commented Nov 21, 2016 at 13:21
  • 1
    This is impossible! If you can do this you broke the SHA-1 hash algorithm... ericsink.com/vcbe/html/cryptographic_hashes.html Commented May 18, 2019 at 9:32

8 Answers 8

108

I would recommend doing something similar to what you have in mind: placing the SHA1 in an untracked file, generated as part of the build/installation/deployment process. It's obviously easy to do (git rev-parse HEAD > filename or perhaps git describe [--tags] > filename), and it avoids doing anything crazy like ending up with a file that's different from what git's tracking.

Your code can then reference this file when it needs the version number, or a build process could incorporate the information into the final product. The latter is actually how git itself gets its version numbers - the build process grabs the version number out of the repo, then builds it into the executable.
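
For anyone who wants the step spelled out, here is a minimal sketch of such a build step. The function name write_version and the file name version.txt are only assumptions for illustration; the idea is to list the file in .gitignore and regenerate it on every build.

```shell
#!/bin/sh
# Hypothetical build step: record the current commit in an untracked file.
# VERSION_FILE is an assumed name; list it in .gitignore so it is never committed.
VERSION_FILE=${VERSION_FILE:-version.txt}

write_version() {
    # Prefer a readable name such as v1.2-3-gabc1234; --always falls back to the
    # abbreviated hash when no tag exists, and --dirty marks local modifications.
    # Outside a Git repository (e.g. a source tarball) record "unknown" instead.
    version=$(git describe --tags --always --dirty 2>/dev/null) || version=unknown
    printf '%s\n' "$version" > "$1"
}

write_version "$VERSION_FILE"
```

The application (or a later build step) can then read the single line in that file whenever it needs to report its version.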

4 Comments

Could someone further expound with a step by step on how to do this? Or at least a nudge in the right direction?
@Joel How to do what? I mentioned how to put the hash in a file; the rest is presumably something about your build process? Maybe a new question if you're trying to ask about that part.
In my case, I added a rule to my Makefile that generates a "gitversion.h" file on every build. See stackoverflow.com/a/38087913/338479
You might be able to automate this with a post-checkout hook. The problem is that hooks have to be installed manually in every clone.
28

It's impossible to write the current commit hash into the commit itself: even if you managed to pre-calculate the future commit hash, it would change as soon as you modified any file.

However, there are three options:

  1. Use a script to increment 'commit id' and include it somewhere. Ugly
  2. .gitignore the file you're going to store the hash into. Not very handy
  3. In pre-commit, store the previous commit hash :) You don't modify/insert commits in 99.99% cases, so, this WILL work. In the worst case you still can identify the source revision.

I'm working on a hook script, will post it here 'when it's done', but still — earlier than Duke Nukem Forever is released :))

Update: code for .git/hooks/pre-commit:

#!/usr/bin/env bash
set -e

#=== 'prev-commit' solution by o_O Tync
#commit_hash=$(git rev-parse --verify HEAD)
commit=$(git log -1 --pretty="%H%n%ci") # hash \n date
commit_hash=$(echo "$commit" | head -1)
commit_date=$(echo "$commit" | head -2 | tail -1) # 2010-12-28 05:16:23 +0300

branch_name=$(git symbolic-ref -q HEAD) # http://stackoverflow.com/questions/1593051/#1593487
branch_name=${branch_name##refs/heads/}
branch_name=${branch_name:-HEAD} # 'HEAD' indicates detached HEAD situation

# Write it
printf "prev_commit='%s'\ndate='%s'\nbranch='%s'\n" "$commit_hash" "$commit_date" "$branch_name" > gitcommit.py

Now the only thing we need is a tool that converts prev_commit,branch pair to a real commit hash :)

I don't know whether this approach can tell merge commits apart. Will check it out soon.
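
One possible shape for that missing tool, as a sketch only: children_of is a hypothetical helper (not a Git command) that scans the output of git rev-list --parents, where every line reads "<commit> <parent1> [<parent2> ...]", and prints each commit that lists the given hash as a parent. A merge commit has several parents, so it is reported for each of them.

```shell
#!/usr/bin/env bash
# Sketch: find the commits that have a given commit as one of their parents.
# children_of is a hypothetical helper; it reads `git rev-list --parents`
# output on stdin, one "<commit> <parent1> [<parent2> ...]" record per line.
children_of() {
    awk -v parent="$1" '{ for (i = 2; i <= NF; i++) if ($i == parent) print $1 }'
}

# Usage inside a repository, with the values stored in gitcommit.py:
#   git rev-list --parents "$branch" | children_of "$prev_commit"
```

On a linear branch this should print exactly one hash. More than one result means the stored parent has several children on that branch (for example after a merge), and no result means history was rewritten after the file was generated.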

14

This can be achieved by using the filter attribute in gitattributes. You'd need to provide a smudge command that inserts the commit id, and a clean command that removes it, such that the file it's inserted in wouldn't change just because of the commit id.

Thus, the commit id is never stored in the blob of the file; it's just expanded in your working copy. (Actually inserting the commit id into the blob would become an infinitely recursive task. ☺) Anyone who clones this tree would need to set up the attributes for herself.
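
A minimal sketch of such a filter pair, under some assumptions: the placeholder is written as $Commit$ in the tracked file, the script is saved as commit-filter.sh at the top of the working tree, and the filter is called commitid. All three names are made up for this example.

```shell
#!/usr/bin/env bash
# commit-filter.sh (hypothetical name): smudge/clean pair for a $Commit$ placeholder.

# smudge: expand $Commit$ (or an older expansion) to $Commit: <hash>$.
smudge_commit() {
    sed 's/\$Commit[^$]*\$/$Commit: '"$1"'$/'
}

# clean: collapse any expansion back to $Commit$, so the stored blob never changes.
clean_commit() {
    sed 's/\$Commit[^$]*\$/$Commit$/'
}

# Git pipes the file content through the configured command on stdin/stdout.
case "${1:-}" in
    smudge) smudge_commit "$(git rev-parse HEAD)" ;;
    clean)  clean_commit ;;
esac

# One-time setup in each clone (file name, path and filter name are assumptions):
#   echo 'about.html filter=commitid' >> .gitattributes
#   git config filter.commitid.smudge './commit-filter.sh smudge'
#   git config filter.commitid.clean  './commit-filter.sh clean'
```

Two caveats with this sketch: the expansion is only refreshed when Git rewrites the file in the working tree, and while a checkout is still in progress HEAD may not yet point at the commit being checked out, so the expanded value can lag behind.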

4 Comments

Impossible task, not recursive task. Commit hash depends on tree hash which depends on file hash, which depends on file contents. You have to get self-consistency. Unless you will find a kind of [generalized] fixed point for SHA-1 hash.
@Jakub, is there some kind of trick in git that would allow creating tracked files which do not modify the resulting hash? Some way to override its hash, maybe. That would be a solution :)
@o_O Tync: Not possible. Changed file means changed hash (of a file) - this is by design, and by definition of a hash function.
This is a pretty good solution, but bear in mind that this involves hooks which have to be manually installed whenever you clone a repository.
14

Someone pointed me to "man gitattributes" section on ident, which has this:

ident

When the attribute ident is set for a path, git replaces $Id$ in the blob object with $Id:, followed by the 40-character hexadecimal blob object name, followed by a dollar sign $ upon checkout. Any byte sequence that begins with $Id: and ends with $ in the worktree file is replaced with $Id$ upon check-in.

If you think about it, this is what CVS, Subversion, etc do as well. If you look at the repository, you'll see that the file in the repository always contains, for example, $Id$. It never contains the expansion of that. It's only on checkout that the text is expanded.

1 Comment

ident is the hash of the file itself, not the hash of the commit. From git-scm.com/book/en/…: "However, that result is of limited use. If you’ve used keyword substitution in CVS or Subversion, you can include a datestamp — the SHA isn’t all that helpful, because it’s fairly random and you can’t tell if one SHA is older or newer than another." filter takes work, but it can get the commit info into (and out of) a file.
13

Think outside of the commit box!

Pop this into the file .git/hooks/post-checkout (and make it executable):

#!/bin/sh
git describe --all --long > config/git-commit-version.txt

The version will be available everywhere you use it.

1 Comment

I slightly modified your answer to ensure that the version file is always included in the commit by adding this at the end: git add config/git-commit-version.txt.
3

I don't think you actually want to do that, because when a file in the commit is changed, the hash of the commit is also changed.

1 Comment

Actually, if the git commit version needs to appear in the final build, then what @Cascabel suggested is the best option. This is what I'm doing in my GitHub Actions build steps: git describe --all --long > src/assets/git-commit-version.txt
2

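Let me explore why this is a challenging problem, using the git internals. You can get the sha1 of the current commit with:

#!/bin/bash
# Recompute the sha1 of HEAD by hand: hash "commit <size>\0" followed by the raw commit object.
commit=$(git cat-file commit HEAD)   # raw commit object; $(...) strips its trailing newline
size=$(echo "$commit" | wc -c)       # echo restores that newline, so the byte count matches
sha1=($( (printf 'commit %s\0' $size; echo "$commit") | sha1sum ))
echo "${sha1[0]}"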

Essentially you run a sha1 checksum on the message returned by git cat-file commit HEAD. Two things immediately jump out as a problem when you examine this message. One is the tree sha1 and the second is the commit time.

Now the commit time is easily taken care of by altering the message and guessing how long it takes to make a commit or scheduling to commit at a specific time. The true issue is the tree sha1, which you can get from git ls-tree $(git write-tree) | git mktree. Essentially you are doing a sha1 checksum on the message from ls-tree, which is a list of all the files and their sha1 checksum.

Therefore your commit sha1 depends on your tree sha1, which depends directly on the files' sha1 checksums, which in turn would have to depend on the commit sha1 written into one of those files. That closes the circle: the problem is circular, at least with the techniques available to me.

With less secure checksums, it has been shown possible to write the checksum of the file into the file itself through brute force; however, I do not know of any work that accomplished that task with sha1. This is not impossible, but next to impossible with our current understanding (but who knows maybe in a couple years it will be trivial). However, still this is even harder to brute force since you have to write the (commit) checksum of a (tree) checksum of a (blob) checksum into the file.
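
To see the first link of that chain concretely, here is a small sketch that computes a blob checksum by hand. blob_sha1 is a hypothetical helper, and the sketch assumes a SHA-1 repository format and a sha1sum binary (GNU coreutils). It hashes "blob <size>\0" followed by the file bytes, which should match what git hash-object reports for a plain file with no filters or line-ending conversion applied.

```shell
#!/usr/bin/env bash
# Sketch: compute a Git blob ID by hand (SHA-1 object format assumed).
# blob_sha1 is a hypothetical helper, not a Git command.
blob_sha1() {
    local size
    size=$(wc -c < "$1")
    # $size is left unquoted on purpose: some wc implementations pad the number.
    { printf 'blob %s\0' $size; cat "$1"; } | sha1sum | cut -d' ' -f1
}

tmp=$(mktemp)
printf 'hello\n' > "$tmp"
blob_sha1 "$tmp"             # compare with: git hash-object "$tmp"
printf 'hello!\n' > "$tmp"   # any edit to the content yields a different ID
blob_sha1 "$tmp"
rm -f "$tmp"
```

Writing the commit hash into the file is exactly such an edit: it changes the blob ID, hence the tree ID, hence the commit hash that was just written.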

1 Comment

Is there a way that one could commit the files, then do a checkout and have the latest commit hash placed as a comment at the beginning of each source code file? Then build and run from that?
1

I prefer simply writing the exact date-time and the parent commit's hash, so in hooks/pre-commit I write the below code:

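#!/bin/bash
# Record the commit time and the parent commit in a tracked file.
ver_file=version.txt
date +"%Y-%m-%d %T %:z" > "$ver_file"   # %:z (+hh:mm offset) requires GNU date
echo -n "Parent: " >> "$ver_file"
git rev-parse HEAD >> "$ver_file"        # HEAD is still the parent while pre-commit runs
git add "$ver_file"
echo "Date-time and parent commit added to '$ver_file'"
exit 0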

Sample auto-generated version.txt file:

2023-05-24 01:24:12 +03:30
Parent: 35acd10240a55d164b371aa28812e8e988ab0c8d

This method also keeps working when you check out another commit, because version.txt is stored exactly the way the other files are stored; for the same reason it works in remote repositories and submodules. You do have to make sure that the same script exists in .git/hooks/pre-commit for your submodule or remote repository as well.

Downsides

  1. Hooks don't run on GitHub (and perhaps some other hosting websites), so commits created there don't update version.txt.

  2. There will always be a conflict on version.txt when merging; however, nothing more than a git commit -am <message> is needed, since pre-commit also runs before the merge commit is created.

1 Comment

I don't know why you get downvoted but for me this is a very good answer
