GIT CRLF hook

When Windows and Linux developers work together on one central repository, as is the case in many companies, it makes sense to turn on GIT’s core.autocrlf feature on the Windows machines:

git config --global core.autocrlf true

This way GIT automatically converts CRLF to LF when committing and converts it back to CRLF when the files are checked out. The repository itself always stores everything with LF.
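
To see the conversion in practice, here is a small sketch that creates a throwaway repository, commits a file with CRLF line endings while core.autocrlf is on, and then counts the carriage returns in the stored blob. The temporary directory, the file name hello.txt and the demo identity are made up for illustration.

```shell
#!/bin/sh
# Sketch: show that a CRLF file is stored with LF when core.autocrlf is true.
# The repository, file name and identity below are made up for this demo.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"
git config core.autocrlf true    # local setting, only for this demo repository

# The working copy file uses CRLF line endings
printf 'first line\r\nsecond line\r\n' > hello.txt
git add hello.txt
git commit -q -m "add hello.txt"

# Count the carriage returns in the blob that ended up in the repository
cr_count=$(git cat-file blob HEAD:hello.txt | tr -cd '\r' | wc -c | tr -d ' ')
echo "carriage returns in stored blob: $cr_count"
```

If the conversion works as described, the count should be zero even though the file on disk still contains CRLF.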

But what happens if a Windows developer forgets to activate this feature? That developer commits CRLFs into the repo. This annoys Linux developers, but even worse, another Windows developer who has the feature turned on will get a long list of modified files that they never touched. Why is that?

GIT converts LF to CRLF when checking out on Windows. If the file already contains CRLF, GIT is clever enough to detect that and does not expand it to CRCRLF, which would be wrong. It keeps the CRLF, but this means the file has implicitly changed locally during the checkout: when it is committed again, the wrong CRLF will be corrected to LF. That’s why GIT must mark these files as modified.
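
To find out which tracked files were committed with CRLF in the first place, `git ls-files --eol` (available since Git 2.8) reports the line endings of each file in the index and in the working tree. A minimal sketch, where the function name `list_crlf_in_index` is my own choice for illustration:

```shell
#!/bin/sh
# Sketch: list tracked files whose content in the index contains CRLF.
# `git ls-files --eol` prints "i/<eol> w/<eol> attr/<attr>", a TAB, then the path.
# An index column of i/crlf or i/mixed means carriage returns were added to the index.
list_crlf_in_index()
{
    git ls-files --eol | grep -E '^i/(crlf|mixed)' | cut -f2
}
```

Run inside a working copy, `list_crlf_in_index` prints one path per line and prints nothing when all text files are stored with LF.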

It’s good to understand the problem, but we need a solution that prevents wrong line endings from being pushed to the central repo. The solution is to install an update hook on the central server.

#!/bin/sh
#
# Author: Gerhard Gappmeier, ascolab GmbH
# This script is based on the update.sample in git/contrib/hooks.
# You are free to use this script for whatever you want.
#
# To enable this hook, rename this file to "update".
#

# --- Command line
refname="$1"
oldrev="$2"
newrev="$3"
#echo "COMMANDLINE: $*"

# --- Safety check
if [ -z "$GIT_DIR" ]; then
    echo "Don't run this script from the command line." >&2
    echo " (if you want, you could supply GIT_DIR then run" >&2
    echo "  $0 <ref> <oldrev> <newrev>)" >&2
    exit 1
fi

if [ -z "$refname" ] || [ -z "$oldrev" ] || [ -z "$newrev" ]; then
    echo "Usage: $0 <ref> <oldrev> <newrev>" >&2
    exit 1
fi

# binary extensions to ignore in the CRLF check
BINARY_EXT="pdb dll exe exp png gif jpg doc vsd vss"

# returns 1 if the given filename has a binary extension, 0 otherwise
IsBinary()
{
    result=0
    for ext in $BINARY_EXT; do
        if [ "$ext" = "${1##*.}" ]; then
            echo "$1 is binary"
            result=1
            break
        fi
    done

    return $result
}

# make temp files and remove them again when the hook exits
tmp=$(mktemp /tmp/git.update.XXXXXX)
tree=$(mktemp /tmp/git.diff-tree.XXXXXX)
trap 'rm -f "$tmp" "$tree"' EXIT
ret=0

git diff-tree -r "$oldrev" "$newrev" > "$tree"
#echo
#echo diff-tree:
#cat "$tree"

# read $tree using the file descriptors
exec 3<&0
exec 0<"$tree"
while read old_mode new_mode old_sha1 new_sha1 status name
do
    # debug output
    #echo "old_mode=$old_mode new_mode=$new_mode old_sha1=$old_sha1 new_sha1=$new_sha1 status=$status name=$name"
    # skip lines showing parent commit
    test -z "$new_sha1" && continue
    # skip deletions
    [ "$new_sha1" = "0000000000000000000000000000000000000000" ] && continue

    # don't do a CRLF check for binary files
    IsBinary "$name"
    if [ $? -eq 1 ]; then
        #echo "skipping file"
        continue # skip binary files
    fi

    # check for CRLF: look for a carriage return at the end of a line
    git cat-file blob "$new_sha1" > "$tmp"
    RESULT=`grep -Pl '\r$' "$tmp"`
    #echo "$RESULT"
    if [ "$RESULT" = "$tmp" ]; then
        echo "###################################################################################################"
        echo "# '$name' contains CRLF! Dear Windows developer, please activate the GIT core.autocrlf feature,"
        echo "# or change the line endings to LF before trying to push."
        echo "# Use 'git config core.autocrlf true' to activate CRLF conversion."
        echo "# OR use 'git reset HEAD~1' to undo your last commit and fix the line endings."
        echo "###################################################################################################"
        ret=1
    fi
done
exec 0<&3
# --- Finished
exit $ret
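
Installing the hook comes down to copying the script into the hooks directory of the bare repository on the server and making it executable. GIT runs hooks/update once for each ref that is pushed, passing the ref name and the old and new object names, and a non-zero exit status rejects that ref. A small sketch, assuming the function name `install_update_hook` and both paths are placeholders of my own:

```shell
#!/bin/sh
# Sketch: install an update hook into a bare repository.
# $1 is the hook script, $2 the path of the bare repository (both placeholders).
install_update_hook()
{
    hook_src="$1"
    bare_repo="$2"
    cp "$hook_src" "$bare_repo/hooks/update" &&
    chmod +x "$bare_repo/hooks/update"
}
```

For example, `install_update_hook ./update /srv/git/project.git`, where /srv/git/project.git stands for wherever your central repository lives.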

A pre-commit hook in the local developer repositories would also make sense, so that the error already shows up when committing locally and not only when pushing the local changes to the server. But you cannot enforce this technically; each developer has to install it voluntarily.
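
Such a local hook could look roughly like the sketch below. It is not part of the update hook above, the function name `check_staged_crlf` is made up, and it relies on `grep -I` (supported by GNU and BSD grep) to skip binary content instead of using an extension list.

```shell
#!/bin/sh
# Sketch of a local pre-commit hook: save as .git/hooks/pre-commit and make it executable.
# It fails if a staged text file has a carriage return at the end of a line.
CR=$(printf '\r')

check_staged_crlf()
{
    # Collect staged files (added, copied or modified) whose staged content has CRLF.
    bad=$(git diff --cached --name-only --diff-filter=ACM |
        while IFS= read -r f; do
            # ":$f" names the staged version of the file;
            # -I makes grep ignore binary data, -q only sets the exit status
            if git show ":$f" | grep -Iq "$CR\$"; then
                echo "$f"
            fi
        done)

    if [ -n "$bad" ]; then
        echo "These staged files contain CRLF line endings:" >&2
        echo "$bad" >&2
        echo "Use 'git config core.autocrlf true' or convert them to LF." >&2
        return 1
    fi
    return 0
}

# As a hook, end the script with:  check_staged_crlf || exit 1
```

Because the check looks at the staged content, it stays silent when core.autocrlf has already converted the file to LF during `git add`.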


3 Responses to “GIT CRLF hook”


  1. 1 Senthil October 8, 2010 at 5:51 am

    That’s great information thanks a lot.
    But I have a doubt now: does GIT do the CRLF conversion during a commit or when the file is opened in vi? Let’s say I have some files on Windows with CRLF (not in a GIT repo). When I sync these files to a UNIX GIT repo and do git add/commit with autocrlf enabled, will these files get converted from CRLF to LF? Or does the conversion only happen when I edit these files in vi and commit them again?

    • 2 gergap October 8, 2010 at 9:12 am

      When you enable autocrlf, git converts the file from CRLF to LF when you commit it on Windows.
      So in the repo it is always stored with LF.
      It converts it back to CRLF when you check it out again.

  2. 3 Colin Mollenhour September 28, 2011 at 8:15 pm

    Thanks for the script, this is just what I was looking for!

    I made some changes to it:
    – replaced all of the temporary file uses with pipes
    – used grep pattern that only matches end of line
    – use grep -E instead of -P, used -q mode
    – reduced output when multiple files have CRLF

