Find commit with the smallest diff

Issue

I was sent a bunch of files that originate from the same git repo I work with, but they were developed against an older commit. How can I find out which commit they were based on? Something like the fewest lines of diff.

Solution

Well, you can do exactly that: find the commit with the smallest diff against your target directory. Loop over the commits in the history of your current branch, compute a diff between each one and the target directory, and remember the commit that produces the smallest diff.

Let’s assume you have your repository in ./repo and the files in question in ./target.

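#!/bin/bash

cd repo || exit 1
# Remember the current branch name (or the commit id if HEAD is already
# detached) so it can be restored at the end.
orig=$(git symbolic-ref -q --short HEAD || git rev-parse HEAD)

# Walk the history of the current branch, newest commit first.
git log --pretty='%H' | while read -r rev; do
    git checkout -q "$rev"
    # Number of diff lines between this commit's tree and the target.
    lines=$(diff -ruN -x .git . ../target | wc -l)

    # Print every new minimum; the last line printed is the overall best.
    if [[ -z "$minlines" ]] || [[ $lines -lt $minlines ]]; then
        minlines=$lines
        echo "$lines $rev"
    fi

    # An exact match cannot be beaten, so stop early.
    [[ $lines -eq 0 ]] && break
done | tail -1

git checkout -q "$orig"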

This will take a while for a repository with a long history, because every commit is checked out and diffed in turn, but it works. It prints the size of the smallest diff (in lines of diff output) followed by the id of the commit that produced it.

If you interrupt this script while it’s running, the repository is left on whichever commit was checked out last, so you’ll need to check out an appropriate branch head (e.g., git checkout master).
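
If leaving the working tree on an old commit is a concern, one possible variation is to never check anything out: export each commit into a scratch directory with git archive and diff that against the target instead. The following is only a sketch of that idea, not part of the original answer. The function name closest_commit and its two arguments (repository path, target path) are made up for illustration. It assumes git, tar, diff and mktemp are available, and it has the caveat that git archive skips any paths marked export-ignore in .gitattributes.

```shell
#!/bin/sh
# Hypothetical helper: closest_commit REPO TARGET
# Prints "<diff lines> <commit id>" for the commit on the current branch
# whose tree differs least from TARGET. Nothing is checked out in REPO.
# The body runs in a subshell "( ... )" so its variables stay local.
closest_commit() (
    repo=$1
    target=$(cd "$2" && pwd) || exit 1      # absolute path to the target
    tmp=$(mktemp -d) || exit 1              # scratch space for exported trees
    best=""

    git -C "$repo" log --pretty='%H' | while read -r rev; do
        # Export this commit's tree into an empty scratch directory.
        rm -rf "$tmp/tree"
        mkdir "$tmp/tree"
        git -C "$repo" archive "$rev" | tar -xf - -C "$tmp/tree"

        # Count diff lines; tr strips the padding some wc versions add.
        lines=$(diff -ruN -x .git "$tmp/tree" "$target" | wc -l | tr -d ' ')

        # Print every new minimum; the last line printed is the best one.
        if [ -z "$best" ] || [ "$lines" -lt "$best" ]; then
            best=$lines
            echo "$lines $rev"
        fi

        # An exact match cannot be beaten, so stop early.
        if [ "$lines" -eq 0 ]; then
            break
        fi
    done | tail -n 1

    rm -rf "$tmp"
)
```

Called as closest_commit ./repo ./target, it prints the same kind of "lines commit" pair as the script above. Since the repository's working tree is never touched, interrupting it leaves at most a stray temporary directory behind.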

Answered By – larsks

This answer, collected from Stack Overflow, is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
