Skip to content

Panic in parseFileHeader when diff header lacks expected ' b/' separator #135

@ByteSizedMarius

Description

@ByteSizedMarius

Bug

parseFileHeader in diff.go panics with slice bounds out of range [:-1] when strings.Index returns -1 for a diff header that doesn't contain the expected b/ pattern. Admittedly this is very edgecase but I've encountered it twice

edit: I figured out that the issue was a filename checked into git with a german umlaut (ä). So its actually only slightly edgecase.

Reproduction

old test
func TestParseFileHeader_MalformedDiffLine(t *testing.T) {
	// A diff header without the ' b/' separator causes a panic.
	// This can happen with unusual filenames or for truncated git output.
	input := "diff --git a/somefile a/somefile\n"

	done := make(chan SteamParseDiffResult, 1)
	assert.NotPanics(t, func() {
		StreamParseDiff(strings.NewReader(input), done, 10, 10, 10)
		<-done
	})
}

This panics because strings.Index(line, " b/") returns -1, then line[beg+2 : -1] at line 278 triggers the panic.

Since StreamParseDiff (and (*Repository).Diff) runs in a goroutine, the panic is unrecoverable by the caller and crashes the entire process.

EDIT: here's tests with föo.txt / foo.txt

// TestParseFileHeader_AsymmetricQuoting reproduces a panic in parseFileHeader
// when git quotes only one side of a diff --git header.
//
// Git quotes each path independently based on whether it contains non-ASCII
// characters. A rename from a non-ASCII name to an ASCII name produces:
//
//   diff --git "a/f\303\266o.txt" b/foo.txt
//
// The a-side is quoted (contains ö = \303\266), the b-side is not. The parser
// checks line[len(diffHead)] == '"' and, finding a quote, searches for ` "b/`.
// But the b-side has no quote, so strings.Index returns -1, causing a panic
// at: a := line[beg+2 : middle]
//
// Real-world trigger: git diff --full-index -M <parent> <child> on a repo
// where a file with non-ASCII characters was renamed to an ASCII filename.
func TestParseFileHeader_AsymmetricQuoting(t *testing.T) {
	tests := []struct {
		name string
		diff string
	}{
		{
			name: "quoted a-side, unquoted b-side",
			diff: "diff --git \"a/f\\303\\266o.txt\" b/foo.txt\n" +
				"similarity index 85%\n" +
				"rename from \"f\\303\\266o.txt\"\n" +
				"rename to foo.txt\n" +
				"index abc1234..def5678 100644\n" +
				"--- \"a/f\\303\\266o.txt\"\n" +
				"+++ b/foo.txt\n" +
				"@@ -1,3 +1,3 @@\n" +
				"-old line\n" +
				"+new line\n" +
				" context\n",
		},
		{
			name: "unquoted a-side, quoted b-side",
			diff: "diff --git a/foo.txt \"b/f\\303\\266o.txt\"\n" +
				"similarity index 85%\n" +
				"rename from foo.txt\n" +
				"rename to \"f\\303\\266o.txt\"\n" +
				"index abc1234..def5678 100644\n" +
				"--- a/foo.txt\n" +
				"+++ \"b/f\\303\\266o.txt\"\n" +
				"@@ -1,3 +1,3 @@\n" +
				"-old line\n" +
				"+new line\n" +
				" context\n",
		},
		{
			name: "both sides quoted (should already work)",
			diff: "diff --git \"a/f\\303\\266o.txt\" \"b/f\\303\\266o.txt\"\n" +
				"index abc1234..def5678 100644\n" +
				"--- \"a/f\\303\\266o.txt\"\n" +
				"+++ \"b/f\\303\\266o.txt\"\n" +
				"@@ -1,3 +1,3 @@\n" +
				"-old line\n" +
				"+new line\n" +
				" context\n",
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			done := make(chan git.SteamParseDiffResult)
			r := io.NopCloser(strings.NewReader(tt.diff))
			go git.StreamParseDiff(r, done, 100, 100, 100)
			result := <-done
			if result.Err != nil {
				t.Errorf("unexpected error: %v", result.Err)
			}
			if result.Diff == nil {
				t.Fatal("expected non-nil diff")
			}
			if len(result.Diff.Files) != 1 {
				t.Fatalf("expected 1 file, got %d", len(result.Diff.Files))
			}
		})
	}
}

Stack trace

panic: runtime error: slice bounds out of range [:-1]

goroutine 830 [running]:
github.com/aymanbagabas/git-module.(*diffParser).parseFileHeader(0xc0002124c0)
	diff.go:278 +0xa6d
github.com/aymanbagabas/git-module.(*diffParser).parse(0xc0002124c0)
	diff.go:489 +0x3af
github.com/aymanbagabas/git-module.StreamParseDiff(...)
	diff.go:553 +0x1d0

edit/note: stacktrace shows https://github.com/aymanbagabas/git-module where I noticed the issue, but the code hasnt changed upstream

Fix

Add a bounds check before slicing:

if middle == -1 {
	return nil, fmt.Errorf("malformed diff header: missing file separator in %q", line)
}

-> this will catch other edge cases but doesnt solve the actual issue.

proposed fix:

func (p *diffParser) parseFileHeader() (*DiffFile, error) {
	line := string(p.buffer)
	p.buffer = nil

	// Git quotes each side of the diff header independently when a path
	// contains non-ASCII characters. All four combinations are possible:
	//   diff --git a/foo b/bar             (neither quoted)
	//   diff --git "a/föo" "b/bär"         (both quoted)
	//   diff --git "a/föo" b/bar           (only a quoted)
	//   diff --git a/foo "b/bär"           (only b quoted)
	beg := len(diffHead)
	aQuoted := line[beg] == '"'

	var middle int
	if aQuoted {
		// Find closing quote of a-path to locate the separator
		end := strings.Index(line[beg+1:], `"`)
		if end == -1 {
			return nil, fmt.Errorf("malformed diff header: unterminated quote: %s", line)
		}
		middle = beg + 1 + end + 1 // position of space after closing quote
	} else {
		middle = strings.Index(line, ` "b/`)
		if middle == -1 {
			middle = strings.Index(line, ` b/`)
		}
	}

	if middle == -1 || middle >= len(line)-1 {
		return nil, fmt.Errorf("malformed diff header: %s", line)
	}

	bQuoted := line[middle+1] == '"'

	// Extract raw a and b path strings (with prefix, possibly quoted)
	aRaw := line[beg:middle]
	bRaw := line[middle+1:]

	// Strip quotes and a/, b/ prefixes
	var a, b string
	if aQuoted {
		inner := aRaw[1 : len(aRaw)-1]
		a = string(UnescapeChars([]byte(inner[2:])))
	} else {
		a = aRaw[2:]
	}
	if bQuoted {
		inner := bRaw[1 : len(bRaw)-1]
		b = string(UnescapeChars([]byte(inner[2:])))
	} else {
		b = bRaw[2:]
	}

	file := &DiffFile{
		Name:    a,
		oldName: b,
		Type:    DiffFileChange,
	}

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions