fix: preserve string literals during schema qualification stripping (#371)#372
fix: preserve string literals during schema qualification stripping (#371)#372
Conversation
…371) stripSchemaQualificationsFromText used regex patterns that matched schema prefixes inside single-quoted SQL string literals. For example, with schema "s", Pattern 4 treated the single quote in 's.manage' as a valid non-double-quote prefix character, corrupting 's.manage' into 'manage'. This caused pgschema plan to generate destructive false-positive ALTER POLICY statements that silently truncated function arguments. Add stripSchemaQualificationsPreservingStrings which splits text on single-quoted string boundaries (handling '' escapes) and applies schema stripping only to non-string segments. Also add a fast-path strings.Contains check to skip all work when the schema name is absent. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR fixes schema-qualification stripping during pgschema plan so that schema prefixes inside single-quoted SQL string literals (e.g. 's.manage') are preserved, preventing destructive false-positive diffs for RLS policies.
Changes:
- Adds string-literal–aware schema qualification stripping by splitting non–dollar-quoted SQL on single-quoted boundaries before applying regex-based stripping.
- Adds a fast-path to skip processing when the schema name does not appear in the SQL text.
- Adds unit tests and extends diff fixtures to cover the RLS policy string-literal regression (Issue #371).
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| internal/postgres/desired_state.go | Updates schema qualification stripping to preserve single-quoted string literals and adds a strings.Contains fast-path. |
| internal/postgres/desired_state_test.go | Adds unit tests validating schema stripping while preserving string literals and escaped quotes. |
| testdata/diff/create_policy/alter_policy_using/old.sql | Extends policy diff fixture with a policy using a dotted string literal ('public.manage'). |
| testdata/diff/create_policy/alter_policy_using/new.sql | Extends policy diff fixture with the same string-literal policy to ensure no false-positive diff. |
| testdata/diff/create_policy/alter_policy_using/plan.json | Updates the expected plan fingerprint hash after fixture changes. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| sql: "SELECT 'public.a', public.t, 'public.b';", | ||
| schema: "public", | ||
| expected: "SELECT 'public.a', t, 'public.b';", | ||
| }, |
There was a problem hiding this comment.
The new unit tests cover basic string-literal preservation, but they don’t cover the failure mode where apostrophes appear inside SQL comments (e.g., -- don't ...) and interfere with the string-splitting state machine. Adding a test case with a comment containing a single quote followed by a schema-qualified identifier would help prevent regressions once comment-handling is fixed.
| }, | |
| }, | |
| { | |
| name: "handles apostrophe in comment followed by schema-qualified identifier", | |
| sql: "SELECT 1; -- don't drop public.t\nDROP TABLE public.t;", | |
| schema: "public", | |
| expected: "SELECT 1; -- don't drop public.t\nDROP TABLE t;", | |
| }, |
| inString := false | ||
|
|
||
| i := 0 | ||
| segStart := 0 | ||
| for i < len(text) { | ||
| ch := text[i] | ||
| if ch == '\'' { | ||
| if !inString { | ||
| // End of non-string segment — strip schema qualifications from it | ||
| result.WriteString(stripSchemaQualificationsFromText(text[segStart:i], schemaName)) | ||
| segStart = i | ||
| inString = true | ||
| i++ | ||
| } else { | ||
| // Check for escaped quote ('') | ||
| if i+1 < len(text) && text[i+1] == '\'' { | ||
| i += 2 // skip '' | ||
| } else { | ||
| // End of string literal — write it as-is | ||
| inString = false | ||
| i++ | ||
| result.WriteString(text[segStart:i]) | ||
| segStart = i | ||
| } | ||
| } | ||
| } else { | ||
| i++ | ||
| } | ||
| } |
There was a problem hiding this comment.
stripSchemaQualificationsPreservingStrings treats every single quote as the start/end of a SQL string literal, but it doesn’t account for quotes appearing inside SQL comments (e.g., -- don't ...). A single apostrophe in a -- or /* ... */ comment can flip inString and cause the rest of the SQL text to be treated as “string”, preventing schema qualification stripping for subsequent statements (potentially creating objects in the real schema instead of the temp schema). Consider extending the splitter to recognize and skip over line/block comments (and only toggle inString when not inside a comment).
| inString := false | |
| i := 0 | |
| segStart := 0 | |
| for i < len(text) { | |
| ch := text[i] | |
| if ch == '\'' { | |
| if !inString { | |
| // End of non-string segment — strip schema qualifications from it | |
| result.WriteString(stripSchemaQualificationsFromText(text[segStart:i], schemaName)) | |
| segStart = i | |
| inString = true | |
| i++ | |
| } else { | |
| // Check for escaped quote ('') | |
| if i+1 < len(text) && text[i+1] == '\'' { | |
| i += 2 // skip '' | |
| } else { | |
| // End of string literal — write it as-is | |
| inString = false | |
| i++ | |
| result.WriteString(text[segStart:i]) | |
| segStart = i | |
| } | |
| } | |
| } else { | |
| i++ | |
| } | |
| } | |
| inString := false | |
| inLineComment := false | |
| inBlockComment := false | |
| i := 0 | |
| segStart := 0 | |
| for i < len(text) { | |
| ch := text[i] | |
| if inString { | |
| if ch == '\'' { | |
| // Check for escaped quote ('') | |
| if i+1 < len(text) && text[i+1] == '\'' { | |
| i += 2 // skip '' | |
| continue | |
| } | |
| // End of string literal — write it as-is | |
| inString = false | |
| i++ | |
| result.WriteString(text[segStart:i]) | |
| segStart = i | |
| continue | |
| } | |
| // Inside string literal, just advance | |
| i++ | |
| continue | |
| } | |
| if inLineComment { | |
| // Line comment ends at newline | |
| if ch == '\n' { | |
| inLineComment = false | |
| } | |
| i++ | |
| continue | |
| } | |
| if inBlockComment { | |
| // Block comment ends at */ | |
| if ch == '*' && i+1 < len(text) && text[i+1] == '/' { | |
| inBlockComment = false | |
| i += 2 | |
| } else { | |
| i++ | |
| } | |
| continue | |
| } | |
| // Not in string or comment | |
| if ch == '-' && i+1 < len(text) && text[i+1] == '-' { | |
| // Start of line comment | |
| inLineComment = true | |
| i += 2 | |
| continue | |
| } | |
| if ch == '/' && i+1 < len(text) && text[i+1] == '*' { | |
| // Start of block comment | |
| inBlockComment = true | |
| i += 2 | |
| continue | |
| } | |
| if ch == '\'' { | |
| // Start of string literal. End of non-string segment — strip schema qualifications from it. | |
| result.WriteString(stripSchemaQualificationsFromText(text[segStart:i], schemaName)) | |
| segStart = i | |
| inString = true | |
| i++ | |
| continue | |
| } | |
| i++ | |
| } |
| if !inString { | ||
| // End of non-string segment — strip schema qualifications from it | ||
| result.WriteString(stripSchemaQualificationsFromText(text[segStart:i], schemaName)) | ||
| segStart = i |
There was a problem hiding this comment.
stripSchemaQualificationsFromText compiles multiple regexes on every call. With the new string-literal splitting, it can now be invoked many times per SQL statement (one per non-string segment), which can significantly increase CPU/time on large schemas. Consider compiling the regexes once per schemaName per stripSchemaQualifications invocation (or caching by schemaName) and reusing them across segments.
Greptile SummaryThis PR fixes a correctness bug in Key changes:
Two minor edge cases to be aware of:
Confidence Score: 5/5
Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["stripSchemaQualifications(sql, schemaName)"] --> B{schemaName == empty\nor not in sql?}
B -- Yes --> C[Return sql unchanged\nfast-path]
B -- No --> D["splitDollarQuotedSegments(sql)"]
D --> E{Segment\ndollar-quoted?}
E -- Yes --> F[Write segment as-is\npreserve function bodies]
E -- No --> G["stripSchemaQualificationsPreservingStrings(seg, schemaName)"]
G --> H{Scan character by character}
H --> I{char == single-quote?}
I -- Not in string --> J["Flush non-string segment\nto stripSchemaQualificationsFromText()\nEnter string mode"]
I -- In string, next is also quote --> K["Skip '' pair\nstay in string mode"]
I -- In string, next is not quote --> L["Write string literal as-is\nExit string mode"]
I -- Not a quote --> M[Advance i]
J --> H
K --> H
L --> H
M --> H
H -- End of text --> N{Still inString?}
N -- Yes --> O[Write remaining as-is\nunterminated literal]
N -- No --> P["Apply stripSchemaQualificationsFromText()\nto trailing segment"]
F --> Q[Reassemble result]
O --> Q
P --> Q
|
| } else { | ||
| // Check for escaped quote ('') | ||
| if i+1 < len(text) && text[i+1] == '\'' { | ||
| i += 2 // skip '' | ||
| } else { |
There was a problem hiding this comment.
E'...' backslash-escaped quote not handled
The escape-string syntax E'it\'s public.test' uses \' (backslash + single quote) to embed a quote rather than '' (doubling). The current parser only recognises the '' form:
if i+1 < len(text) && text[i+1] == '\'' {
i += 2 // skip ''
}With E'content\'' suffix, the \ is consumed as an ordinary character; then when ' (the real escaped quote) is reached and the following char is ' (the string closer), the code treats both as a '' pair and skips them — leaving inString = true beyond the true end of the string. Any schema identifiers after the closing quote will therefore be treated as string content and not stripped.
The failure mode is a false-negative (missed stripping) rather than a false-positive (corruption), so it is safe but subtly incorrect. A doc comment or a test case capturing this limitation would make the boundary explicit.
Summary
stripSchemaQualificationsFromTextused regex patterns that matched schema prefixes inside single-quoted SQL string literals, corrupting e.g.'s.manage'into'manage'when the target schema wasspgschema planto generate destructive false-positiveALTER POLICYstatements that silently truncated function arguments in USING/WITH CHECK expressionsstripSchemaQualificationsPreservingStringswhich splits text on single-quoted string boundaries before applying schema stripping, preserving string literal contentstrings.Containscheck to skip all work when the schema name is absentFixes #371
Test plan
go test -v ./internal/postgres -run TestStripSchemaQualifications_PreservesStringLiterals(6 cases covering string preservation, escaped quotes, mixed identifiers/literals)PGSCHEMA_TEST_FILTER="create_policy/" go test -v ./internal/diff -run TestDiffFromFiles(addedscope_checkpolicy withhas_scope('public.manage')to existingalter_policy_usingtest case)PGSCHEMA_TEST_FILTER="create_policy/" go test -v ./cmd -run TestPlanAndApply(all 10 policy tests pass, no false-positive diff for string literal policy)🤖 Generated with Claude Code