How to reduce the depth of an existing git clone?

Issue

I have a clone. I want to reduce the history on it, without cloning from scratch with a reduced depth. Worked example:

$ git clone [email protected]:apache/spark.git
# ...
$ cd spark/
$ du -hs .git
193M    .git

OK, so that’s not so bad, but it’ll serve for this discussion. If I try gc, it gets smaller:

$ git gc --aggressive
Counting objects: 380616, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (278136/278136), done.
Writing objects: 100% (380616/380616), done.
Total 380616 (delta 182748), reused 192702 (delta 0)
Checking connectivity: 380616, done.
$ du -hs .git
108M    .git

Still pretty big, though (git pull suggests that it’s still push/pullable to the remote). How about repack?

$ git repack -a -d --depth=5
Counting objects: 380616, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (95388/95388), done.
Writing objects: 100% (380616/380616), done.
Total 380616 (delta 182748), reused 380616 (delta 182748)
$ du -hs .git
108M    .git

Yup, it didn’t get any smaller. --depth for repack isn’t the same as --depth for clone: repack’s --depth caps the length of delta chains inside the pack, while clone’s --depth limits how many commits of history you fetch:

$ git clone --depth 1 [email protected]:apache/spark.git
Cloning into 'spark'...
remote: Counting objects: 8520, done.
remote: Compressing objects: 100% (6611/6611), done.
remote: Total 8520 (delta 1448), reused 5101 (delta 710), pack-reused 0
Receiving objects: 100% (8520/8520), 14.82 MiB | 3.63 MiB/s, done.
Resolving deltas: 100% (1448/1448), done.
Checking connectivity... done.
Checking out files: 100% (13386/13386), done.
$ cd spark
$ du -hs .git
17M .git

Git pull says it’s still in step with the remote, which surprises nobody.

OK, so how do you change an existing clone into a shallow clone, without nixing it and checking it out afresh?

Solution

git clone --mirror --depth=5 file://$PWD ../temp   # shallow mirror; file:// forces a real transport (--depth is ignored in plain local clones)
rm -rf .git/objects                                # drop the full-history object store
mv ../temp/{shallow,objects} .git                  # adopt the shallow marker and the trimmed objects
rm -rf ../temp                                     # discard the temporary mirror
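
To sanity-check the result (a sketch; the numbers you see will vary):

$ du -hs .git                  # should be a small fraction of the original 108M
$ git log --oneline | wc -l    # at most 5, the depth requested above
$ cat .git/shallow             # the commit(s) marking the shallow boundary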

This really isn’t cloning “from scratch”: it’s purely local work, and it creates virtually nothing beyond the shallowed-out pack files, probably tens of kilobytes in total. I’d venture you’re not going to get more efficient than this; any custom alternative will cost more in scripts and test work than this does in a few kB of temporary repo overhead.
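
For completeness: on reasonably recent Git you can likely get a similar effect in place with a shallow fetch, though treat this as an untested variation rather than part of the original answer:

git fetch --depth=5 origin              # marks the repository as shallow at depth 5
git reflog expire --expire=now --all    # drop reflog entries that pin the old history
git gc --prune=now                      # actually delete the now-unreachable objects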

Answered By – jthill

This answer, collected from Stack Overflow, is licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
