-
Notifications
You must be signed in to change notification settings - Fork 130
Description
Bug
parseFileHeader in diff.go panics with slice bounds out of range [:-1] when strings.Index returns -1 for a diff header that doesn't contain the expected b/ pattern. Admittedly this is very edgecase but I've encountered it twice
edit: I figured out that the issue was a filename checked into git with a german umlaut (ä). So its actually only slightly edgecase.
Reproduction
old test
func TestParseFileHeader_MalformedDiffLine(t *testing.T) {
// A diff header without the ' b/' separator causes a panic.
// This can happen with unusual filenames or for truncated git output.
input := "diff --git a/somefile a/somefile\n"
done := make(chan SteamParseDiffResult, 1)
assert.NotPanics(t, func() {
StreamParseDiff(strings.NewReader(input), done, 10, 10, 10)
<-done
})
}This panics because strings.Index(line, " b/") returns -1, then line[beg+2 : -1] at line 278 triggers the panic.
Since StreamParseDiff (and (*Repository).Diff) runs in a goroutine, the panic is unrecoverable by the caller and crashes the entire process.
EDIT: here's tests with föo.txt / foo.txt
// TestParseFileHeader_AsymmetricQuoting reproduces a panic in parseFileHeader
// when git quotes only one side of a diff --git header.
//
// Git quotes each path independently based on whether it contains non-ASCII
// characters. A rename from a non-ASCII name to an ASCII name produces:
//
// diff --git "a/f\303\266o.txt" b/foo.txt
//
// The a-side is quoted (contains ö = \303\266), the b-side is not. The parser
// checks line[len(diffHead)] == '"' and, finding a quote, searches for ` "b/`.
// But the b-side has no quote, so strings.Index returns -1, causing a panic
// at: a := line[beg+2 : middle]
//
// Real-world trigger: git diff --full-index -M <parent> <child> on a repo
// where a file with non-ASCII characters was renamed to an ASCII filename.
func TestParseFileHeader_AsymmetricQuoting(t *testing.T) {
tests := []struct {
name string
diff string
}{
{
name: "quoted a-side, unquoted b-side",
diff: "diff --git \"a/f\\303\\266o.txt\" b/foo.txt\n" +
"similarity index 85%\n" +
"rename from \"f\\303\\266o.txt\"\n" +
"rename to foo.txt\n" +
"index abc1234..def5678 100644\n" +
"--- \"a/f\\303\\266o.txt\"\n" +
"+++ b/foo.txt\n" +
"@@ -1,3 +1,3 @@\n" +
"-old line\n" +
"+new line\n" +
" context\n",
},
{
name: "unquoted a-side, quoted b-side",
diff: "diff --git a/foo.txt \"b/f\\303\\266o.txt\"\n" +
"similarity index 85%\n" +
"rename from foo.txt\n" +
"rename to \"f\\303\\266o.txt\"\n" +
"index abc1234..def5678 100644\n" +
"--- a/foo.txt\n" +
"+++ \"b/f\\303\\266o.txt\"\n" +
"@@ -1,3 +1,3 @@\n" +
"-old line\n" +
"+new line\n" +
" context\n",
},
{
name: "both sides quoted (should already work)",
diff: "diff --git \"a/f\\303\\266o.txt\" \"b/f\\303\\266o.txt\"\n" +
"index abc1234..def5678 100644\n" +
"--- \"a/f\\303\\266o.txt\"\n" +
"+++ \"b/f\\303\\266o.txt\"\n" +
"@@ -1,3 +1,3 @@\n" +
"-old line\n" +
"+new line\n" +
" context\n",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
done := make(chan git.SteamParseDiffResult)
r := io.NopCloser(strings.NewReader(tt.diff))
go git.StreamParseDiff(r, done, 100, 100, 100)
result := <-done
if result.Err != nil {
t.Errorf("unexpected error: %v", result.Err)
}
if result.Diff == nil {
t.Fatal("expected non-nil diff")
}
if len(result.Diff.Files) != 1 {
t.Fatalf("expected 1 file, got %d", len(result.Diff.Files))
}
})
}
}Stack trace
panic: runtime error: slice bounds out of range [:-1]
goroutine 830 [running]:
github.com/aymanbagabas/git-module.(*diffParser).parseFileHeader(0xc0002124c0)
diff.go:278 +0xa6d
github.com/aymanbagabas/git-module.(*diffParser).parse(0xc0002124c0)
diff.go:489 +0x3af
github.com/aymanbagabas/git-module.StreamParseDiff(...)
diff.go:553 +0x1d0
edit/note: stacktrace shows https://github.com/aymanbagabas/git-module where I noticed the issue, but the code hasnt changed upstream
Fix
Add a bounds check before slicing:
if middle == -1 {
return nil, fmt.Errorf("malformed diff header: missing file separator in %q", line)
}-> this will catch other edge cases but doesnt solve the actual issue.
proposed fix:
func (p *diffParser) parseFileHeader() (*DiffFile, error) {
line := string(p.buffer)
p.buffer = nil
// Git quotes each side of the diff header independently when a path
// contains non-ASCII characters. All four combinations are possible:
// diff --git a/foo b/bar (neither quoted)
// diff --git "a/föo" "b/bär" (both quoted)
// diff --git "a/föo" b/bar (only a quoted)
// diff --git a/foo "b/bär" (only b quoted)
beg := len(diffHead)
aQuoted := line[beg] == '"'
var middle int
if aQuoted {
// Find closing quote of a-path to locate the separator
end := strings.Index(line[beg+1:], `"`)
if end == -1 {
return nil, fmt.Errorf("malformed diff header: unterminated quote: %s", line)
}
middle = beg + 1 + end + 1 // position of space after closing quote
} else {
middle = strings.Index(line, ` "b/`)
if middle == -1 {
middle = strings.Index(line, ` b/`)
}
}
if middle == -1 || middle >= len(line)-1 {
return nil, fmt.Errorf("malformed diff header: %s", line)
}
bQuoted := line[middle+1] == '"'
// Extract raw a and b path strings (with prefix, possibly quoted)
aRaw := line[beg:middle]
bRaw := line[middle+1:]
// Strip quotes and a/, b/ prefixes
var a, b string
if aQuoted {
inner := aRaw[1 : len(aRaw)-1]
a = string(UnescapeChars([]byte(inner[2:])))
} else {
a = aRaw[2:]
}
if bQuoted {
inner := bRaw[1 : len(bRaw)-1]
b = string(UnescapeChars([]byte(inner[2:])))
} else {
b = bRaw[2:]
}
file := &DiffFile{
Name: a,
oldName: b,
Type: DiffFileChange,
}