Git blobless repository

Issue

I’m wondering if there’s a way to get commit and tree objects only from a remote.

This may sound like a silly question, I’m not sure; I’m new to git plumbing. I’m building an app that associates metadata with git commits, authorships, and file system structure. My options are to build a kludgy in-database normalization of the data with some sort of hook-enabled syncing mechanism, or to use the powerful native git tools for syncing, attaching metadata, and querying history.

However, since I don’t actually need the blob objects, it’d save me a buck or two on hosting if I could shed them somehow. Is this or any incarnation of the concept possible?

Solution

Technically, a commit object only names a tree object, and then the tree object (once found) names more trees and blobs. Thus, a git repository in which all the blob object files were deliberately “broken” (e.g., overwritten with an empty file, or even removed entirely) would work to some degree—in fact, to the same degree that it does if you create such a thing manually:

$ chmod +w .git/objects/f7/0d6b139823ab30278db23bb547c61e0d4444fb
$ : > .git/objects/f7/0d6b139823ab30278db23bb547c61e0d4444fb
$ git status
# On branch master
nothing to commit, working directory clean
$ git cat-file -p HEAD:file
error: object file .git/objects/f7/0d6b139823ab30278db23bb547c61e0d4444fb is empty
fatal: Not a valid object name HEAD:file
$ git fsck
Checking object directories: 100% (256/256), done.
error: object file .git/objects/f7/0d6b139823ab30278db23bb547c61e0d4444fb is empty
error: sha1 mismatch f70d6b139823ab30278db23bb547c61e0d4444fb
error: f70d6b139823ab30278db23bb547c61e0d4444fb: object corrupt or missing
missing blob f70d6b139823ab30278db23bb547c61e0d4444fb

Clearly it sort-of-works. (In fact, git cat-file -p HEAD and git cat-file -p HEAD: also work here, as does git ls-tree -r HEAD.)
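The transcript above can be reproduced in a throwaway repository. This is only a minimal sketch, assuming a POSIX shell and a reasonably recent git; the temporary path, the file name, and the demo identity are made up for illustration and are not part of the original answer.

```shell
#!/bin/sh
set -eu

# Throwaway repository (path, file name and identity are illustrative only).
repo=$(mktemp -d)
cd "$repo"
git init -q .
echo hello > file
git add file
git -c user.name=demo -c user.email=demo@example.com commit -q -m initial

# Locate the loose object file for the blob and empty it, as in the transcript.
blob=$(git rev-parse HEAD:file)
obj=".git/objects/$(echo "$blob" | cut -c1-2)/$(echo "$blob" | cut -c3-)"
chmod +w "$obj"
: > "$obj"

# Commit and tree objects are untouched, so these still succeed.
git cat-file -p HEAD                  # commit: tree id, author, committer, message
git cat-file -p HEAD:                 # root tree: mode, type, id, name
tree_listing=$(git ls-tree -r HEAD)   # names the blob without reading its content
echo "$tree_listing"

# Reading the blob itself fails, because its object file is now empty.
if git cat-file -p HEAD:file 2>/dev/null; then blob_ok=yes; else blob_ok=no; fi
echo "blob readable: $blob_ok"
```

Everything an app needs about commits, authorship, and directory structure comes from the first three commands; only the last one depends on blob content.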

The problem you’re going to run into immediately is that git prefers to store objects in packs and to transfer packs between repositories, and the code that builds and transfers packs will notice the corrupted (or missing, if you rm them) objects. It might not even save that much space, depending on how well the objects compress in the packs (it’s been observed that the repo is sometimes smaller than the checked-out tree!).
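Before deciding whether shedding blobs is worth the trouble, it may help to measure how much of the repository they actually occupy. The following is a rough sketch, assuming a git new enough to support `cat-file --batch-all-objects` and the `%(objectsize:disk)` format atom; the optional repository argument and the demo repository are illustrative assumptions. Note that for objects stored as deltas in a pack, the on-disk size is the size of the delta, so the per-type totals are an estimate rather than an exact accounting.

```shell
#!/bin/sh
set -eu

# Optional first argument: path to an existing repository.
# Without it, a small throwaway repository is created for demonstration.
repo=${1:-}
if [ -z "$repo" ]; then
    repo=$(mktemp -d)
    git init -q "$repo"
    echo hello > "$repo/file"
    git -C "$repo" add file
    git -C "$repo" -c user.name=demo -c user.email=demo@example.com commit -q -m initial
    git -C "$repo" gc -q    # pack the objects, as git normally ends up doing
fi

# One line per object ("<type> <bytes on disk>"), then summed per type.
report=$(git -C "$repo" cat-file --batch-all-objects \
             --batch-check='%(objecttype) %(objectsize:disk)' |
         awk '{ n[$1]++; bytes[$1] += $2 }
              END { for (t in n) printf "%s %d %d\n", t, n[t], bytes[t] }' | sort)

echo "type count bytes-on-disk"
echo "$report"
```

If the blob row is small compared with the commit and tree rows, dropping blobs would save little hosting space.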

Answered By – torek

This answer, collected from Stack Overflow, is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
