Introduced sha256 support for git-sizer #109

Closed
wants to merge 14 commits into from

Conversation

fcharlie
image

fcharlie requested a review from a team as a code owner on April 17, 2023, 13:34
mhagger and others added 14 commits December 13, 2023 17:58
The name `gitDir` is less ambiguous. Also rename method `Path()` to
`GitDir()`.

Add a method `Repository.GitPath(relPath)`, which invokes `git
rev-parse --git-path $relPath` to find the path to a file within the
Git repository.

In `NewRepository()`, instantiate the `Repository` object earlier so
that the new method can be used to find the path to `shallow`.

Extract a method to determine whether the repository seems to be a
full clone. Call it from `NewRepository()`.

If you already have the desired `GIT_DIR`, there's no need to
determine it from the current path.

There's no need to deduce the `GIT_DIR` for a bare repository.

As of Git v2.38.0, there is an option to prevent Git from accessing bare
repositories unless asked for explicitly (via `--git-dir` or `GIT_DIR`):
`safe.bareRepository`.

The tests of `git sizer`, however, assume that Git will access a bare
repository when the current directory points inside that repository.
This only works if `safe.bareRepository` indicates that this is safe.

If that is not the case, i.e. if `safe.bareRepository` is set to
`explicit`, Git demands that the environment variable `GIT_DIR` is set
(either explicitly, or via `--git-dir`) when accessing bare
repositories.

So let's set `GIT_DIR` for the test cases that work on bare
repositories.

Signed-off-by: Johannes Schindelin <[email protected]>
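
As a rough illustration of the approach this commit message describes, a test helper could name the bare repository through `GIT_DIR` instead of relying on the current directory. The helper name `bareRepoCommand` and the path are hypothetical, and this is a sketch rather than the code from the commit.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// bareRepoCommand (hypothetical helper) builds a `git` command that
// names the repository explicitly through GIT_DIR, so it does not
// depend on `safe.bareRepository` permitting implicit discovery.
func bareRepoCommand(gitDir string, args ...string) *exec.Cmd {
	cmd := exec.Command("git", args...)
	// Append GIT_DIR last; os/exec uses the last value of a duplicated key.
	cmd.Env = append(os.Environ(), "GIT_DIR="+gitDir)
	return cmd
}

func main() {
	// The path is a placeholder; the command is only built, not run.
	cmd := bareRepoCommand("/path/to/repo.git", "rev-parse", "--is-bare-repository")
	fmt.Println(cmd.Args[1:], cmd.Env[len(cmd.Env)-1])
}
```

Passing `--git-dir` on the command line would presumably work equally well; the environment variable is used here only because the commit message mentions it.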

This is the result of running `go fmt -mod=readonly ./...`
Signed-off-by: Johannes Schindelin <[email protected]>

dscho (Contributor) commented Dec 14, 2023

@fcharlie I rebased this onto #117 to resolve merge conflicts, and also changed a few things on the way (please double-check!):

  • The dependencies were updated in a separate commit, to make reviewing this PR easier.
  • The hex size is double the byte size of the SHAs, so I modified the code in NewObjectIter() accordingly.
  • The hex size of SHA-256 is not 32, but 64. I modified the code comment of the NewOID() function.
  • Most importantly, I added a test to verify that this thing now does support SHA-256.
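
The size relationship in the second and third bullets can be checked against the Go standard library. This is a minimal sketch; `hexSize` is a hypothetical helper and not a function from the PR.

```go
package main

import (
	"crypto/sha1"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hexSize returns how many hex characters are needed to write a hash
// of byteSize bytes: two characters per byte.
func hexSize(byteSize int) int {
	return hex.EncodedLen(byteSize)
}

func main() {
	fmt.Println("SHA-1:  ", sha1.Size, "bytes,", hexSize(sha1.Size), "hex chars")
	fmt.Println("SHA-256:", sha256.Size, "bytes,", hexSize(sha256.Size), "hex chars")
}
```

This prints 20 bytes and 40 hex characters for SHA-1, and 32 bytes and 64 hex characters for SHA-256, which is why a line of `git rev-list` output must be compared against the hex length rather than the byte length.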
Here is the range-diff
  • -: ------- > 1: 69418a9 Re-format some comments

  • -: ------- > 2: 16b2fa2 Update dependencies

  • 1: e8687bb ! 3: 69ac7d8 Introduced sha256 support for git-sizer

    @@ Metadata
     Author: Force Charlie <[email protected]>
     
      ## Commit message ##
    -    Introduced sha256 support for git-sizer
    +    Introduce sha256 support for git-sizer
     
      ## git/git.go ##
     @@ git/git.go: type Repository struct {
    @@ git/git.go: type Repository struct {
     +	hashAlgo HashAlgo
      }
      
    - // smartJoin returns the path that can be described as `relPath`
    -@@ git/git.go: func NewRepository(path string) (*Repository, error) {
    - 	if err == nil {
    - 		return nil, errors.New("this appears to be a shallow clone; full clone required")
    + // smartJoin returns `relPath` if it is an absolute path. If not, it
    +@@ git/git.go: func NewRepositoryFromGitDir(gitDir string) (*Repository, error) {
    + 		)
      	}
    + 
     +	hashAlgo := HashSHA1
    -+	cmd = exec.Command(gitBin, "rev-parse", "--show-object-format")
    -+	if out, err = cmd.Output(); err == nil {
    ++	//nolint:gosec // `gitDir` is the path we need Git to access.
    ++	cmd := exec.Command(gitBin, "--git-dir", gitDir, "rev-parse", "--show-object-format")
    ++	if out, err := cmd.Output(); err == nil {
     +		if string(bytes.TrimSpace(out)) == "sha256" {
     +			hashAlgo = HashSHA256
     +		}
     +	}
    - 
    - 	return &Repository{
    --		path:   gitDir,
    ++
    + 	repo := Repository{
    +-		gitDir: gitDir,
     -		gitBin: gitBin,
    -+		path:     gitDir,
    ++		gitDir:   gitDir,
     +		gitBin:   gitBin,
     +		hashAlgo: hashAlgo,
    - 	}, nil
    - }
    + 	}
      
    -@@ git/git.go: func (repo *Repository) GitCommand(callerArgs ...string) *exec.Cmd {
    - func (repo *Repository) Path() string {
    - 	return repo.path
    + 	full, err := repo.IsFull()
    +@@ git/git.go: func (repo *Repository) GitPath(relPath string) (string, error) {
    + 	// current directory, we can use it as-is:
    + 	return string(bytes.TrimSpace(out)), nil
      }
     +
     +func (repo *Repository) HashAlgo() HashAlgo {
    @@ git/obj_iter.go: func (repo *Repository) NewObjectIter(ctx context.Context) (*Ob
      		headerCh: make(chan BatchHeader),
      	}
     -
    -+	hashSize := repo.HashSize()
    ++	hashHexSize := repo.HashSize() * 2
      	iter.p.Add(
      		// Read OIDs from `iter.oidCh` and write them to `git
      		// rev-list`:
    @@ git/obj_iter.go: func (repo *Repository) NewObjectIter(ctx context.Context) (*Ob
      			"copy-oids",
      			func(_ context.Context, _ pipe.Env, line []byte, stdout *bufio.Writer) error {
     -				if len(line) < 40 {
    -+				if len(line) < hashSize {
    ++				if len(line) < hashHexSize {
      					return fmt.Errorf("line too short: '%s'", line)
      				}
     -				if _, err := stdout.Write(line[:40]); err != nil {
    -+				if _, err := stdout.Write(line[:hashSize]); err != nil {
    ++				if _, err := stdout.Write(line[:hashHexSize]); err != nil {
      					return fmt.Errorf("writing OID to 'git cat-file': %w", err)
      				}
      				if err := stdout.WriteByte('\n'); err != nil {
     
    + ## git/obj_resolver.go ##
    +@@ git/obj_resolver.go: func (repo *Repository) ResolveObject(name string) (OID, error) {
    + 	cmd := repo.GitCommand("rev-parse", "--verify", "--end-of-options", name)
    + 	output, err := cmd.Output()
    + 	if err != nil {
    +-		return NullOID, fmt.Errorf("resolving object %q: %w", name, err)
    ++		return repo.HashAlgo().NullOID(), fmt.Errorf("resolving object %q: %w", name, err)
    + 	}
    + 	oidString := string(bytes.TrimSpace(output))
    + 	oid, err := NewOID(oidString)
    + 	if err != nil {
    +-		return NullOID, fmt.Errorf("parsing output %q from 'rev-parse': %w", oidString, err)
    ++		return repo.HashAlgo().NullOID(), fmt.Errorf("parsing output %q from 'rev-parse': %w", oidString, err)
    + 	}
    + 	return oid, nil
    + }
    +
      ## git/oid.go ##
     @@
      package git
      
      import (
     +	"bytes"
    ++	//nolint:gosec // Git indeed does use SHA-1, still
     +	"crypto/sha1"
     +	"crypto/sha256"
      	"encoding/hex"
    @@ git/oid.go
      }
      
     -// NewOID converts an object ID in hex format (i.e., `[0-9a-f]{40}`)
    -+// NewOID converts an object ID in hex format (i.e., `[0-9a-f]{40,32}`)
    ++// NewOID converts an object ID in hex format (i.e., `[0-9a-f]{40,64}`)
      // into an `OID`.
      func NewOID(s string) (OID, error) {
      	oidBytes, err := hex.DecodeString(s)
    @@ git/oid.go: func NewOID(s string) (OID, error) {
      	dst[0] = '"'
      	dst[len(dst)-1] = '"'
     
    - ## git/ref_filter.go ##
    -@@ git/ref_filter.go: func (f intersection) Filter(refname string) bool {
    - // If `f1` is `nil`, it is treated as including nothing.
    - type include struct{}
    - 
    --func (_ include) Combine(f1, f2 ReferenceFilter) ReferenceFilter {
    -+func (include) Combine(f1, f2 ReferenceFilter) ReferenceFilter {
    - 	if f1 == nil {
    - 		return f2
    - 	}
    - 	return union{f1, f2}
    - }
    - 
    --func (_ include) Inverted() Combiner {
    -+func (include) Inverted() Combiner {
    - 	return Exclude
    - }
    - 
    -@@ git/ref_filter.go: func (f union) Filter(refname string) bool {
    - // If `f1` is `nil`, it is treated as including everything.
    - type exclude struct{}
    - 
    --func (_ exclude) Combine(f1, f2 ReferenceFilter) ReferenceFilter {
    -+func (exclude) Combine(f1, f2 ReferenceFilter) ReferenceFilter {
    - 	if f1 == nil {
    - 		return inverse{f2}
    - 	}
    -@@ git/ref_filter.go: func (_ exclude) Combine(f1, f2 ReferenceFilter) ReferenceFilter {
    - 
    - }
    - 
    --func (_ exclude) Inverted() Combiner {
    -+func (exclude) Inverted() Combiner {
    - 	return include{}
    - }
    - 
    -@@ git/ref_filter.go: var Exclude exclude
    - 
    - type allReferencesFilter struct{}
    - 
    --func (_ allReferencesFilter) Filter(_ string) bool {
    -+func (allReferencesFilter) Filter(_ string) bool {
    - 	return true
    - }
    - 
    -@@ git/ref_filter.go: var AllReferencesFilter allReferencesFilter
    - // whose names start with the specified `prefix`, which must match at
    - // a component boundary. For example,
    - //
    --// * Prefix "refs/foo" matches "refs/foo" and "refs/foo/bar" but not
    --//   "refs/foobar".
    -+//   - Prefix "refs/foo" matches "refs/foo" and "refs/foo/bar" but not
    -+//     "refs/foobar".
    - //
    --// * Prefix "refs/foo/" matches "refs/foo/bar" but not "refs/foo" or
    --//   "refs/foobar".
    -+//   - Prefix "refs/foo/" matches "refs/foo/bar" but not "refs/foo" or
    -+//     "refs/foobar".
    - func PrefixFilter(prefix string) ReferenceFilter {
    - 	if prefix == "" {
    - 		return AllReferencesFilter
    -
      ## git/tree.go ##
     @@ git/tree.go: import (
      
    @@ git/tree.go: func (iter *TreeIter) NextEntry() (TreeEntry, bool, error) {
      	return entry, true, nil
      }
     
    - ## go.mod ##
    -@@ go.mod: module github.com/github/git-sizer
    - go 1.17
    - 
    - require (
    --	github.com/cli/safeexec v1.0.0
    --	github.com/davecgh/go-spew v1.1.1 // indirect
    -+	github.com/cli/safeexec v1.0.1
    -+	github.com/github/go-pipe v1.0.2
    - 	github.com/spf13/pflag v1.0.5
    --	github.com/stretchr/testify v1.8.1
    --	golang.org/x/sync v0.1.0 // indirect
    -+	github.com/stretchr/testify v1.8.2
    - )
    - 
    --require github.com/github/go-pipe v1.0.2
    --
    - require (
    --	github.com/kr/pretty v0.1.0 // indirect
    -+	github.com/davecgh/go-spew v1.1.1 // indirect
    -+	github.com/kr/pretty v0.3.1 // indirect
    - 	github.com/pmezard/go-difflib v1.0.0 // indirect
    --	gopkg.in/check.v1 v1.0.0-20180628173108-788fd7840127 // indirect
    -+	golang.org/x/sync v0.1.0 // indirect
    -+	gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c // indirect
    - 	gopkg.in/yaml.v3 v3.0.1 // indirect
    - )
    -
    - ## go.sum ##
    -@@
    --github.com/cli/safeexec v1.0.0 h1:0VngyaIyqACHdcMNWfo6+KdUYnqEr2Sg+bSP1pdF+dI=
    --github.com/cli/safeexec v1.0.0/go.mod h1:Z/D4tTN8Vs5gXYHDCbaM1S/anmEDnJb1iW0+EJ5zx3Q=
    -+github.com/cli/safeexec v1.0.1 h1:e/C79PbXF4yYTN/wauC4tviMxEV13BwljGj0N9j+N00=
    -+github.com/cli/safeexec v1.0.1/go.mod h1:Z/D4tTN8Vs5gXYHDCbaM1S/anmEDnJb1iW0+EJ5zx3Q=
    - github.com/creack/pty v1.1.9/go.mod h1:oKZEueFk5CKHvIhNR5MUki03XCEU+Q6VDXinZuGJ33E=
    - github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
    - github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
    - github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
    - github.com/github/go-pipe v1.0.2 h1:befTXflsc6ir/h9f6Q7QCDmfojoBswD1MfQrPhmmSoA=
    - github.com/github/go-pipe v1.0.2/go.mod h1:/GvNLA516QlfGGMtfv4PC/5/CdzL9X4af/AJYhmLD54=
    --github.com/kr/pretty v0.1.0 h1:L/CwN0zerZDmRFUapSPitk6f+Q3+0za1rQkzVuMiMFI=
    - github.com/kr/pretty v0.1.0/go.mod h1:dAy3ld7l9f0ibDNOQOHHMYYIIbhfbHSm3C4ZsoJORNo=
    -+github.com/kr/pretty v0.2.1/go.mod h1:ipq/a2n7PKx3OHsz4KJII5eveXtPO4qwEXGdVfWzfnI=
    -+github.com/kr/pretty v0.3.1 h1:flRD4NNwYAUpkphVc1HcthR4KEIFJ65n8Mw5qdRn3LE=
    -+github.com/kr/pretty v0.3.1/go.mod h1:hoEshYVHaxMs3cyo3Yncou5ZscifuDolrwPKZanG3xk=
    - github.com/kr/pty v1.1.1/go.mod h1:pFQYn66WHrOpPYNljwOMqo10TkYh1fy3cYio2l3bCsQ=
    - github.com/kr/text v0.1.0/go.mod h1:4Jbv+DJW3UT/LiOwJeYQe1efqtUx/iVham/4vfdArNI=
    - github.com/kr/text v0.2.0 h1:5Nx0Ya0ZqY2ygV366QzturHI13Jq95ApcVaJBhpS+AY=
    - github.com/kr/text v0.2.0/go.mod h1:eLer722TekiGuMkidMxC/pM04lWEeraHUUmBw8l2grE=
    -+github.com/pkg/diff v0.0.0-20210226163009-20ebb0f2a09e/go.mod h1:pJLUxLENpZxwdsKMEsNbx1VGcRFpLqf3715MtcvvzbA=
    - github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
    - github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
    -+github.com/rogpeppe/go-internal v1.9.0 h1:73kH8U+JUqXU8lRuOHeVHaa/SZPifC7BkcraZVejAe8=
    -+github.com/rogpeppe/go-internal v1.9.0/go.mod h1:WtVeX8xhTBvf0smdhujwtBcq4Qrzq/fJaraNFVN+nFs=
    - github.com/spf13/pflag v1.0.5 h1:iy+VFUOCP1a+8yFto/drg2CJ5u0yRoB7fZw3DKv/JXA=
    - github.com/spf13/pflag v1.0.5/go.mod h1:McXfInJRrz4CZXVZOBLb0bTZqETkiAhM9Iw0y3An2Bg=
    - github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
    -@@ go.sum: github.com/stretchr/objx v0.4.0/go.mod h1:YvHI0jy2hoMjB+UWwv71VJQ9isScKT/TqJzVSS
    - github.com/stretchr/objx v0.5.0/go.mod h1:Yh+to48EsGEfYuaHDzXPcE3xhTkx73EhmCGUpEOglKo=
    - github.com/stretchr/testify v1.7.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
    - github.com/stretchr/testify v1.8.0/go.mod h1:yNjHg4UonilssWZ8iaSj1OCr/vHnekPRkoO+kdMU+MU=
    --github.com/stretchr/testify v1.8.1 h1:w7B6lhMri9wdJUVmEZPGGhZzrYTPvgJArz7wNPgYKsk=
    - github.com/stretchr/testify v1.8.1/go.mod h1:w2LPCIKwWwSfY2zedu0+kehJoqGctiVI29o6fzry7u4=
    -+github.com/stretchr/testify v1.8.2 h1:+h33VjcLVPDHtOdpUCuF+7gSuG3yGIftsP1YvFihtJ8=
    -+github.com/stretchr/testify v1.8.2/go.mod h1:w2LPCIKwWwSfY2zedu0+kehJoqGctiVI29o6fzry7u4=
    - github.com/yuin/goldmark v1.3.5/go.mod h1:mwnBkeHKe2W/ZEtQ+71ViKU8L12m81fl3OWwC1Zlc8k=
    - go.uber.org/goleak v1.2.0 h1:xqgm/S+aQvhWFTtR0XK3Jvg7z8kGV8P4X14IzwN3Eqk=
    - go.uber.org/goleak v1.2.0/go.mod h1:XJYK+MuIchqpmGmUSAzotztawfKvYLUIgg7guXrwVUo=
    -@@ go.sum: golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8T
    - golang.org/x/xerrors v0.0.0-20191011141410-1b5146add898/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
    - golang.org/x/xerrors v0.0.0-20200804184101-5ec99f83aff1/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
    - gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
    --gopkg.in/check.v1 v1.0.0-20180628173108-788fd7840127 h1:qIbj1fsPNlZgppZ+VLlY7N33q108Sa+fhmuc+sWQYwY=
    - gopkg.in/check.v1 v1.0.0-20180628173108-788fd7840127/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
    -+gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c h1:Hei/4ADfdWqJk1ZMxUNpqntNwaWcugrBjAiHlqqRiVk=
    -+gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c/go.mod h1:JHkPIbrfpd72SG/EVd6muEfDQjcINNoR0C8j2r3qZ4Q=
    - gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
    - gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
    - gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
    -
      ## internal/testutils/repoutils.go ##
     @@ internal/testutils/repoutils.go: func (repo *TestRepo) UpdateRef(t *testing.T, refname string, oid git.OID) {
      
    @@ internal/testutils/repoutils.go: func (repo *TestRepo) UpdateRef(t *testing.T, r
     
      ## sizes/graph.go ##
     @@ sizes/graph.go: func ScanRepositoryUsingGraph(
    + 	nameStyle NameStyle,
    + 	progressMeter meter.Progress,
      ) (HistorySize, error) {
    - 	ctx, cancel := context.WithCancel(context.TODO())
    - 	defer cancel()
    --
     +	nullOID := repo.HashAlgo().NullOID()
    - 	graph := NewGraph(rg, nameStyle)
    + 	graph := NewGraph(nameStyle)
      
    - 	refIter, err := repo.NewReferenceIter(ctx)
    + 	objIter, err := repo.NewObjectIter(ctx)
     @@ sizes/graph.go: func ScanRepositoryUsingGraph(
      		case "tree":
      			trees = append(trees, ObjectHeader{obj.OID, obj.ObjectSize})
    @@ sizes/output.go: func (i *item) MarshalJSON() ([]byte, error) {
      		stat.ObjectName = i.path.OID.String()
      		stat.ObjectDescription = i.path.Path()
      	}
    -@@ sizes/output.go: func (t *Threshold) Type() string {
    - // A `pflag.Value` that can be used as a boolean option that sets a
    - // `Threshold` variable to a fixed value. For example,
    - //
    --//		pflag.Var(
    --//			sizes.NewThresholdFlagValue(&threshold, 30),
    --//			"critical", "only report critical statistics",
    --//		)
    -+//	pflag.Var(
    -+//		sizes.NewThresholdFlagValue(&threshold, 30),
    -+//		"critical", "only report critical statistics",
    -+//	)
    - //
    - // adds a `--critical` flag that sets `threshold` to 30.
    - type thresholdFlagValue struct {
  • -: ------- > 4: 0fc1aa3 Add a test case for SHA-256 support
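
The `git/oid.go` part of the range-diff turns on accepting object IDs of two different lengths. A simplified, self-contained sketch of that idea follows; the type `oid`, the function `parseOID`, and the constants are hypothetical stand-ins, not the PR's actual definitions.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

const (
	sha1Size   = 20 // bytes in a SHA-1 object ID
	sha256Size = 32 // bytes in a SHA-256 object ID
)

// oid stores either kind of object ID in a buffer sized for the
// larger hash and remembers how many bytes are in use.
type oid struct {
	v    [sha256Size]byte
	size int
}

// parseOID decodes a hex object ID and accepts exactly 40 or 64 hex
// characters (20 or 32 bytes), nothing in between.
func parseOID(s string) (oid, error) {
	b, err := hex.DecodeString(s)
	if err != nil {
		return oid{}, err
	}
	if len(b) != sha1Size && len(b) != sha256Size {
		return oid{}, errors.New("hex OID has the wrong length")
	}
	var o oid
	o.size = len(b)
	copy(o.v[:], b)
	return o, nil
}

// String formats only the bytes that are in use.
func (o oid) String() string { return hex.EncodeToString(o.v[:o.size]) }

func main() {
	for _, n := range []int{40, 64, 50} {
		_, err := parseOID(strings.Repeat("0", n))
		fmt.Println(n, "hex chars accepted:", err == nil)
	}
}
```

Only the 40- and 64-character inputs are accepted here; a 50-character input decodes to 25 bytes and is rejected.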

fcharlie (Author) commented Dec 23, 2023

@dscho Great job! I used in-house tools to convert git-sizer to a sha256 repository, then ran a git-sizer scan, and everything worked.

image image
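
Before scanning a converted repository, its object format can be confirmed with `git rev-parse --show-object-format`, the same command the PR uses for detection. The sketch below is an assumption-laden illustration: `parseObjectFormat` is a hypothetical helper, and unlike the PR (which falls back to SHA-1 when the command fails) it reports unknown formats as errors.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
)

// parseObjectFormat maps the output of
// `git rev-parse --show-object-format` to a hash size in bytes.
func parseObjectFormat(out []byte) (int, error) {
	switch format := string(bytes.TrimSpace(out)); format {
	case "sha1":
		return 20, nil
	case "sha256":
		return 32, nil
	default:
		return 0, fmt.Errorf("unknown object format %q", format)
	}
}

func main() {
	// Runs against the repository in the current directory, if any.
	out, err := exec.Command("git", "rev-parse", "--show-object-format").Output()
	if err != nil {
		fmt.Println("git failed:", err)
		return
	}
	size, err := parseObjectFormat(out)
	fmt.Println(size, err)
}
```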

fcharlie closed this by deleting the head repository on Sep 7, 2024

dscho (Contributor) commented Mar 12, 2025

So disappointing that this wasn't merged.

fcharlie (Author) commented Mar 13, 2025

> So disappointing that this wasn't merged.

Sorry, I forgot about this PR. I'll try again when I have time.

diff --git a/git/git.go b/git/git.go
index 096ce81..b710cef 100644
--- a/git/git.go
+++ b/git/git.go
@@ -24,6 +24,8 @@ type Repository struct {
 	// gitBin is the path of the `git` executable that should be used
 	// when running commands in this repository.
 	gitBin string
+	// hashAlgo is the hash algorithm used by the repository.
+	hashAlgo HashAlgo
 }
 
 // smartJoin returns `relPath` if it is an absolute path. If not, it
@@ -49,9 +51,18 @@ func NewRepositoryFromGitDir(gitDir string) (*Repository, error) {
 		)
 	}
 
+	hashAlgo := HashSHA1
+	cmd := exec.Command(gitBin, "--git-dir", gitDir, "rev-parse", "--show-object-format")
+	if out, err := cmd.Output(); err == nil {
+		if string(bytes.TrimSpace(out)) == "sha256" {
+			hashAlgo = HashSHA256
+		}
+	}
+
 	repo := Repository{
-		gitDir: gitDir,
-		gitBin: gitBin,
+		gitDir:   gitDir,
+		gitBin:   gitBin,
+		hashAlgo: hashAlgo,
 	}
 
 	full, err := repo.IsFull()
@@ -170,3 +181,15 @@ func (repo *Repository) GitPath(relPath string) (string, error) {
 	// current directory, we can use it as-is:
 	return string(bytes.TrimSpace(out)), nil
 }
+
+func (repo *Repository) HashAlgo() HashAlgo {
+	return repo.hashAlgo
+}
+
+func (repo *Repository) HashSize() int {
+	return repo.hashAlgo.HashSize()
+}
+
+func (repo *Repository) NullOID() OID {
+	return repo.hashAlgo.NullOID()
+}
diff --git a/git/obj_iter.go b/git/obj_iter.go
index cecdc2a..c367f11 100644
--- a/git/obj_iter.go
+++ b/git/obj_iter.go
@@ -30,7 +30,7 @@ func (repo *Repository) NewObjectIter(ctx context.Context) (*ObjectIter, error)
 		errCh:    make(chan error),
 		headerCh: make(chan BatchHeader),
 	}
-
+	hashHexSize := repo.HashSize() * 2
 	iter.p.Add(
 		// Read OIDs from `iter.oidCh` and write them to `git
 		// rev-list`:
@@ -68,10 +68,10 @@ func (repo *Repository) NewObjectIter(ctx context.Context) (*ObjectIter, error)
 		pipe.LinewiseFunction(
 			"copy-oids",
 			func(_ context.Context, _ pipe.Env, line []byte, stdout *bufio.Writer) error {
-				if len(line) < 40 {
+				if len(line) < hashHexSize {
 					return fmt.Errorf("line too short: '%s'", line)
 				}
-				if _, err := stdout.Write(line[:40]); err != nil {
+				if _, err := stdout.Write(line[:hashHexSize]); err != nil {
 					return fmt.Errorf("writing OID to 'git cat-file': %w", err)
 				}
 				if err := stdout.WriteByte('\n'); err != nil {
diff --git a/git/obj_resolver.go b/git/obj_resolver.go
index 418e293..fbeb246 100644
--- a/git/obj_resolver.go
+++ b/git/obj_resolver.go
@@ -9,12 +9,12 @@ func (repo *Repository) ResolveObject(name string) (OID, error) {
 	cmd := repo.GitCommand("rev-parse", "--verify", "--end-of-options", name)
 	output, err := cmd.Output()
 	if err != nil {
-		return NullOID, fmt.Errorf("resolving object %q: %w", name, err)
+		return repo.NullOID(), fmt.Errorf("resolving object %q: %w", name, err)
 	}
 	oidString := string(bytes.TrimSpace(output))
 	oid, err := NewOID(oidString)
 	if err != nil {
-		return NullOID, fmt.Errorf("parsing output %q from 'rev-parse': %w", oidString, err)
+		return repo.NullOID(), fmt.Errorf("parsing output %q from 'rev-parse': %w", oidString, err)
 	}
 	return oid, nil
 }
diff --git a/git/oid.go b/git/oid.go
index 2aefbcb..7df9bc9 100644
--- a/git/oid.go
+++ b/git/oid.go
@@ -1,32 +1,75 @@
 package git
 
 import (
+	"bytes"
+	"crypto/sha1"
+	"crypto/sha256"
 	"encoding/hex"
 	"errors"
 )
 
+const (
+	HashSizeSHA256 = sha256.Size
+	HashSizeSHA1   = sha1.Size
+	HashSizeMax    = HashSizeSHA256
+)
+
+type HashAlgo int
+
+const (
+	HashUnknown HashAlgo = iota
+	HashSHA1
+	HashSHA256
+)
+
 // OID represents the SHA-1 object ID of a Git object, in binary
 // format.
 type OID struct {
-	v [20]byte
+	v        [HashSizeMax]byte
+	hashSize int
 }
 
-// NullOID is the null object ID; i.e., all zeros.
-var NullOID OID
+func (h HashAlgo) NullOID() OID {
+	switch h {
+	case HashSHA1:
+		return OID{hashSize: HashSizeSHA1}
+	case HashSHA256:
+		return OID{hashSize: HashSizeSHA256}
+	}
+	return OID{}
+}
+
+func (h HashAlgo) HashSize() int {
+	switch h {
+	case HashSHA1:
+		return HashSizeSHA1
+	case HashSHA256:
+		return HashSizeSHA256
+	}
+	return 0
+}
+
+// defaultNullOID is the null object ID; i.e., all zeros.
+var defaultNullOID OID
+
+func IsNullOID(o OID) bool {
+	return bytes.Equal(o.v[:], defaultNullOID.v[:])
+}
 
 // OIDFromBytes converts a byte slice containing an object ID in
 // binary format into an `OID`.
 func OIDFromBytes(oidBytes []byte) (OID, error) {
 	var oid OID
-	if len(oidBytes) != len(oid.v) {
+	oidSize := len(oidBytes)
+	if oidSize != HashSizeSHA1 && oidSize != HashSizeSHA256 {
 		return OID{}, errors.New("bytes oid has the wrong length")
 	}
-	copy(oid.v[0:20], oidBytes)
+	oid.hashSize = oidSize
+	copy(oid.v[0:oidSize], oidBytes)
 	return oid, nil
 }
 
-// NewOID converts an object ID in hex format (i.e., `[0-9a-f]{40}`)
-// into an `OID`.
+// NewOID converts an object ID in hex format (i.e., `[0-9a-f]{40}` or `[0-9a-f]{64}`) into an `OID`.
 func NewOID(s string) (OID, error) {
 	oidBytes, err := hex.DecodeString(s)
 	if err != nil {
@@ -37,18 +80,18 @@ func NewOID(s string) (OID, error) {
 
 // String formats `oid` as a string in hex format.
 func (oid OID) String() string {
-	return hex.EncodeToString(oid.v[:])
+	return hex.EncodeToString(oid.v[:oid.hashSize])
 }
 
 // Bytes returns a byte slice view of `oid`, in binary format.
 func (oid OID) Bytes() []byte {
-	return oid.v[:]
+	return oid.v[:oid.hashSize]
 }
 
 // MarshalJSON expresses `oid` as a JSON string with its enclosing
 // quotation marks.
 func (oid OID) MarshalJSON() ([]byte, error) {
-	src := oid.v[:]
+	src := oid.v[:oid.hashSize]
 	dst := make([]byte, hex.EncodedLen(len(src))+2)
 	dst[0] = '"'
 	dst[len(dst)-1] = '"'
diff --git a/git/tree.go b/git/tree.go
index c31fa78..18cb3ee 100644
--- a/git/tree.go
+++ b/git/tree.go
@@ -10,13 +10,14 @@ import (
 
 // Tree represents a Git tree object.
 type Tree struct {
-	data string
+	data     string
+	hashSize int
 }
 
 // ParseTree parses the tree object whose contents are contained in
 // `data`. `oid` is currently unused.
 func ParseTree(oid OID, data []byte) (*Tree, error) {
-	return &Tree{string(data)}, nil
+	return &Tree{string(data), oid.hashSize}, nil
 }
 
 // Size returns the size of the tree object.
@@ -36,13 +37,15 @@ type TreeEntry struct {
 // TreeIter is an iterator over the entries in a Git tree object.
 type TreeIter struct {
 	// The as-yet-unread part of the tree's data.
-	data string
+	data     string
+	hashSize int
 }
 
 // Iter returns an iterator over the entries in `tree`.
 func (tree *Tree) Iter() *TreeIter {
 	return &TreeIter{
-		data: tree.data,
+		data:     tree.data,
+		hashSize: tree.hashSize,
 	}
 }
 
@@ -74,12 +77,12 @@ func (iter *TreeIter) NextEntry() (TreeEntry, bool, error) {
 	entry.Name = iter.data[:nulAt]
 
 	iter.data = iter.data[nulAt+1:]
-	if len(iter.data) < 20 {
+	if len(iter.data) < iter.hashSize {
 		return TreeEntry{}, false, errors.New("tree entry ends unexpectedly")
 	}
-
-	copy(entry.OID.v[0:20], iter.data[0:20])
-	iter.data = iter.data[20:]
+	entry.OID.hashSize = iter.hashSize
+	copy(entry.OID.v[0:iter.hashSize], iter.data[0:iter.hashSize])
+	iter.data = iter.data[iter.hashSize:]
 
 	return entry, true, nil
 }
diff --git a/git_sizer_test.go b/git_sizer_test.go
index 8a7a2d2..c74b459 100644
--- a/git_sizer_test.go
+++ b/git_sizer_test.go
@@ -849,3 +849,40 @@ func TestSubmodule(t *testing.T) {
 	assert.Equal(t, counts.Count32(2), h.UniqueBlobCount, "unique blob count")
 	assert.Equal(t, counts.Count32(3), h.MaxExpandedBlobCount, "max expanded blob count")
 }
+
+func TestSHA256(t *testing.T) {
+	t.Parallel()
+
+	ctx := context.Background()
+
+	t.Helper()
+
+	path, err := os.MkdirTemp("", "sha256")
+	require.NoError(t, err)
+
+	testRepo := testutils.TestRepo{Path: path}
+	defer testRepo.Remove(t)
+
+	// Don't use `GitCommand()` because the directory might not
+	// exist yet:
+	cmd := exec.Command("git", "init", "--object-format", "sha256", testRepo.Path)
+	cmd.Env = testutils.CleanGitEnv()
+	err = cmd.Run()
+	require.NoError(t, err)
+
+	timestamp := time.Unix(1112911993, 0)
+
+	testRepo.AddFile(t, "hello.txt", "Hello, world!\n")
+	cmd = testRepo.GitCommand(t, "commit", "-m", "initial")
+	testutils.AddAuthorInfo(cmd, &timestamp)
+	require.NoError(t, cmd.Run(), "creating initial commit")
+
+	cmd = testRepo.GitCommand(t, "commit", "-m", "initial", "--allow-empty")
+	testutils.AddAuthorInfo(cmd, &timestamp)
+	require.NoError(t, cmd.Run(), "creating commit")
+
+	repo := testRepo.Repository(t)
+
+	_, err = sizes.CollectReferences(ctx, repo, refGrouper{})
+	require.NoError(t, err)
+}
diff --git a/go.mod b/go.mod
index 9db294d..50134b5 100644
--- a/go.mod
+++ b/go.mod
@@ -1,13 +1,15 @@
 module github.com/github/git-sizer
 
-go 1.17
+go 1.23.0
+
+toolchain go1.24.1
 
 require (
-	github.com/cli/safeexec v1.0.0
+	github.com/cli/safeexec v1.0.1
 	github.com/davecgh/go-spew v1.1.1 // indirect
-	github.com/spf13/pflag v1.0.5
-	github.com/stretchr/testify v1.8.1
-	golang.org/x/sync v0.1.0 // indirect
+	github.com/spf13/pflag v1.0.6
+	github.com/stretchr/testify v1.10.0
+	golang.org/x/sync v0.12.0 // indirect
 )
 
 require github.com/github/go-pipe v1.0.2
diff --git a/go.sum b/go.sum
index 5c5d0a9..a6bdc0a 100644
--- a/go.sum
+++ b/go.sum
@@ -1,7 +1,5 @@
-github.com/cli/safeexec v1.0.0 h1:0VngyaIyqACHdcMNWfo6+KdUYnqEr2Sg+bSP1pdF+dI=
-github.com/cli/safeexec v1.0.0/go.mod h1:Z/D4tTN8Vs5gXYHDCbaM1S/anmEDnJb1iW0+EJ5zx3Q=
-github.com/creack/pty v1.1.9/go.mod h1:oKZEueFk5CKHvIhNR5MUki03XCEU+Q6VDXinZuGJ33E=
-github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
+github.com/cli/safeexec v1.0.1 h1:e/C79PbXF4yYTN/wauC4tviMxEV13BwljGj0N9j+N00=
+github.com/cli/safeexec v1.0.1/go.mod h1:Z/D4tTN8Vs5gXYHDCbaM1S/anmEDnJb1iW0+EJ5zx3Q=
 github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
 github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
 github.com/github/go-pipe v1.0.2 h1:befTXflsc6ir/h9f6Q7QCDmfojoBswD1MfQrPhmmSoA=
@@ -14,48 +12,16 @@ github.com/kr/text v0.2.0 h1:5Nx0Ya0ZqY2ygV366QzturHI13Jq95ApcVaJBhpS+AY=
 github.com/kr/text v0.2.0/go.mod h1:eLer722TekiGuMkidMxC/pM04lWEeraHUUmBw8l2grE=
 github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
 github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
-github.com/spf13/pflag v1.0.5 h1:iy+VFUOCP1a+8yFto/drg2CJ5u0yRoB7fZw3DKv/JXA=
-github.com/spf13/pflag v1.0.5/go.mod h1:McXfInJRrz4CZXVZOBLb0bTZqETkiAhM9Iw0y3An2Bg=
-github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
-github.com/stretchr/objx v0.4.0/go.mod h1:YvHI0jy2hoMjB+UWwv71VJQ9isScKT/TqJzVSSt89Yw=
-github.com/stretchr/objx v0.5.0/go.mod h1:Yh+to48EsGEfYuaHDzXPcE3xhTkx73EhmCGUpEOglKo=
-github.com/stretchr/testify v1.7.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
-github.com/stretchr/testify v1.8.0/go.mod h1:yNjHg4UonilssWZ8iaSj1OCr/vHnekPRkoO+kdMU+MU=
-github.com/stretchr/testify v1.8.1 h1:w7B6lhMri9wdJUVmEZPGGhZzrYTPvgJArz7wNPgYKsk=
-github.com/stretchr/testify v1.8.1/go.mod h1:w2LPCIKwWwSfY2zedu0+kehJoqGctiVI29o6fzry7u4=
-github.com/yuin/goldmark v1.3.5/go.mod h1:mwnBkeHKe2W/ZEtQ+71ViKU8L12m81fl3OWwC1Zlc8k=
+github.com/spf13/pflag v1.0.6 h1:jFzHGLGAlb3ruxLB8MhbI6A8+AQX/2eW4qeyNZXNp2o=
+github.com/spf13/pflag v1.0.6/go.mod h1:McXfInJRrz4CZXVZOBLb0bTZqETkiAhM9Iw0y3An2Bg=
+github.com/stretchr/testify v1.10.0 h1:Xv5erBjTwe/5IxqUQTdXv5kgmIvbHo3QQyRwhJsOfJA=
+github.com/stretchr/testify v1.10.0/go.mod h1:r2ic/lqez/lEtzL7wO/rwa5dbSLXVDPFyf8C91i36aY=
 go.uber.org/goleak v1.2.0 h1:xqgm/S+aQvhWFTtR0XK3Jvg7z8kGV8P4X14IzwN3Eqk=
 go.uber.org/goleak v1.2.0/go.mod h1:XJYK+MuIchqpmGmUSAzotztawfKvYLUIgg7guXrwVUo=
-golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w=
-golang.org/x/crypto v0.0.0-20191011191535-87dc89f01550/go.mod h1:yigFU9vqHzYiE8UmvKecakEJjdnWj3jj499lnFckfCI=
-golang.org/x/lint v0.0.0-20190930215403-16217165b5de/go.mod h1:6SW0HCj/g11FgYtHlgUYUwCkIfeOF89ocIRzGO/8vkc=
-golang.org/x/mod v0.4.2/go.mod h1:s0Qsj1ACt9ePp/hMypM3fl4fZqREWJwdYDEqhRiZZUA=
-golang.org/x/net v0.0.0-20190311183353-d8887717615a/go.mod h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg=
-golang.org/x/net v0.0.0-20190404232315-eb5bcb51f2a3/go.mod h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg=
-golang.org/x/net v0.0.0-20190620200207-3b0461eec859/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
-golang.org/x/net v0.0.0-20210405180319-a5a99cb37ef4/go.mod h1:p54w0d4576C0XHj96bSt6lcn1PtDYWL6XObtHCRCNQM=
-golang.org/x/sync v0.0.0-20190423024810-112230192c58/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
-golang.org/x/sync v0.0.0-20210220032951-036812b2e83c/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
-golang.org/x/sync v0.1.0 h1:wsuoTGHzEhffawBOhz5CYhcrV4IdKZbEyZjBMuTp12o=
-golang.org/x/sync v0.1.0/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
-golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
-golang.org/x/sys v0.0.0-20190412213103-97732733099d/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
-golang.org/x/sys v0.0.0-20201119102817-f84b799fce68/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
-golang.org/x/sys v0.0.0-20210330210617-4fbd30eecc44/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
-golang.org/x/sys v0.0.0-20210510120138-977fb7262007/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
-golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo=
-golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
-golang.org/x/text v0.3.3/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
-golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
-golang.org/x/tools v0.0.0-20190311212946-11955173bddd/go.mod h1:LCzVGOaR6xXOjkQ3onu1FJEFr0SW1gC7cKk1uF8kGRs=
-golang.org/x/tools v0.0.0-20191119224855-298f0cb1881e/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo=
-golang.org/x/tools v0.1.5/go.mod h1:o0xws9oXOQQZyjljx8fwUC0k7L1pTE6eaCbjGeHmOkk=
-golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
-golang.org/x/xerrors v0.0.0-20191011141410-1b5146add898/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
-golang.org/x/xerrors v0.0.0-20200804184101-5ec99f83aff1/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
+golang.org/x/sync v0.12.0 h1:MHc5BpPuC30uJk597Ri8TV3CNZcTLu6B6z4lJy+g6Jw=
+golang.org/x/sync v0.12.0/go.mod h1:1dzgHSNfp02xaA81J2MS99Qcpr2w7fw1gpm99rleRqA=
 gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
 gopkg.in/check.v1 v1.0.0-20180628173108-788fd7840127 h1:qIbj1fsPNlZgppZ+VLlY7N33q108Sa+fhmuc+sWQYwY=
 gopkg.in/check.v1 v1.0.0-20180628173108-788fd7840127/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
-gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
 gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
 gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
diff --git a/internal/testutils/repoutils.go b/internal/testutils/repoutils.go
index 48a8759..e14e487 100644
--- a/internal/testutils/repoutils.go
+++ b/internal/testutils/repoutils.go
@@ -165,7 +165,7 @@ func (repo *TestRepo) UpdateRef(t *testing.T, refname string, oid git.OID) {
 
 	var cmd *exec.Cmd
 
-	if oid == git.NullOID {
+	if git.IsNullOID(oid) {
 		cmd = repo.GitCommand(t, "update-ref", "-d", refname)
 	} else {
 		cmd = repo.GitCommand(t, "update-ref", refname, oid.String())
diff --git a/sizes/graph.go b/sizes/graph.go
index 0fb1c8a..2101a00 100644
--- a/sizes/graph.go
+++ b/sizes/graph.go
@@ -134,7 +134,7 @@ func ScanRepositoryUsingGraph(
 		case "tree":
 			trees = append(trees, ObjectHeader{obj.OID, obj.ObjectSize})
 		case "commit":
-			commits = append(commits, CommitHeader{ObjectHeader{obj.OID, obj.ObjectSize}, git.NullOID})
+			commits = append(commits, CommitHeader{ObjectHeader{obj.OID, obj.ObjectSize}, repo.NullOID()})
 		case "tag":
 			tags = append(tags, ObjectHeader{obj.OID, obj.ObjectSize})
 		default:
diff --git a/sizes/output.go b/sizes/output.go
index 933cc05..28ed130 100644
--- a/sizes/output.go
+++ b/sizes/output.go
@@ -155,7 +155,7 @@ func (i *item) Emit(t *table) {
 }
 
 func (i *item) Footnote(nameStyle NameStyle) string {
-	if i.path == nil || i.path.OID == git.NullOID {
+	if i.path == nil || git.IsNullOID(i.path.OID) {
 		return ""
 	}
 	switch nameStyle {
@@ -214,7 +214,7 @@ func (i *item) MarshalJSON() ([]byte, error) {
 		LevelOfConcern: float64(value) / i.scale,
 	}
 
-	if i.path != nil && i.path.OID != git.NullOID {
+	if i.path != nil && !git.IsNullOID(i.path.OID) {
 		stat.ObjectName = i.path.OID.String()
 		stat.ObjectDescription = i.path.Path()
 	}
@@ -279,10 +279,10 @@ func (t *Threshold) Type() string {
 // A `pflag.Value` that can be used as a boolean option that sets a
 // `Threshold` variable to a fixed value. For example,
 //
-//		pflag.Var(
-//			sizes.NewThresholdFlagValue(&threshold, 30),
-//			"critical", "only report critical statistics",
-//		)
+//	pflag.Var(
+//		sizes.NewThresholdFlagValue(&threshold, 30),
+//		"critical", "only report critical statistics",
+//	)
 //
 // adds a `--critical` flag that sets `threshold` to 30.
 type thresholdFlagValue struct {
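
The `git/tree.go` hunk in the diff above replaces the hard-coded 20-byte OID length with a per-repository hash size. A standalone sketch of parsing one tree entry that way is shown here; `nextEntry` is a hypothetical function written for illustration and does not mirror the PR's `TreeIter` exactly.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// nextEntry splits one "<mode> <name>\x00<raw OID>" record off the
// front of raw tree data. The raw OID is hashSize bytes long: 20 for
// SHA-1 repositories, 32 for SHA-256 repositories.
func nextEntry(data []byte, hashSize int) (mode, name string, oid, rest []byte, err error) {
	sp := bytes.IndexByte(data, ' ')
	if sp < 0 {
		return "", "", nil, nil, errors.New("missing space after mode")
	}
	nul := bytes.IndexByte(data[sp+1:], 0)
	if nul < 0 {
		return "", "", nil, nil, errors.New("missing NUL after name")
	}
	nameEnd := sp + 1 + nul
	oidStart := nameEnd + 1
	if len(data)-oidStart < hashSize {
		return "", "", nil, nil, errors.New("tree entry ends unexpectedly")
	}
	return string(data[:sp]), string(data[sp+1 : nameEnd]),
		data[oidStart : oidStart+hashSize], data[oidStart+hashSize:], nil
}

func main() {
	// One synthetic entry with a 32-byte (SHA-256-sized) all-zero OID.
	data := append([]byte("100644 hello.txt\x00"), make([]byte, 32)...)
	mode, name, oid, rest, err := nextEntry(data, 32)
	fmt.Println(mode, name, len(oid), len(rest), err)
}
```

Parsing the same data with the wrong hash size either fails the length check or misaligns every following entry, which is why the size has to travel with the tree.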

dscho (Contributor) commented Mar 13, 2025

@fcharlie oh, sorry, I didn't mean that I am disappointed by your actions! To the contrary, I am amazed by your tenacity. I assumed that you got too frustrated and gave up, because I would have.
