[feat](search) Support field-grouped query syntax field:(term1 OR term2) #60786

Merged
airborne12 merged 6 commits into apache:master from airborne12:pick-explicit-field-tests
Feb 21, 2026
@airborne12
Member

@airborne12 airborne12 commented Feb 18, 2026

What problem does this PR solve?

Issue Number: close #N/A

Problem Summary:

The search() function did not support ES query_string field-grouped syntax where all terms inside parentheses inherit the field prefix:

-- Previously failed with syntax error
SELECT * FROM t WHERE search('title:(rock OR jazz)', '{"fields":["title","content"]}');

ES semantics:

| Input | Expansion |
|-------|-----------|
| title:(rock OR jazz) | (title:rock OR title:jazz) |
| title:(rock jazz) with default_operator:AND | (+title:rock +title:jazz) |
| title:(rock OR jazz) AND music with fields:[title,content] | (title:rock OR title:jazz) AND (title:music OR content:music) |
| title:("rock and roll" OR jazz) | (title:"rock and roll" OR title:jazz) |
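
To make the expansion rules in the table concrete, here is a minimal Java sketch that distributes a field prefix over the members of a group. It is illustrative only: the class `FieldGroupExpansionSketch` and its methods are hypothetical names, it works on pre-split members rather than a parsed query, and it is not the code in `SearchDslParser.java`.

```java
import java.util.List;
import java.util.stream.Collectors;

public class FieldGroupExpansionSketch {

    /** title:(a OR b) becomes (title:a OR title:b): prefix every member, join with OR. */
    static String expandOr(String field, List<String> members) {
        return members.stream()
                .map(m -> field + ":" + m)
                .collect(Collectors.joining(" OR ", "(", ")"));
    }

    /** title:(a b) with default_operator AND becomes (+title:a +title:b). */
    static String expandAnd(String field, List<String> members) {
        return members.stream()
                .map(m -> "+" + field + ":" + m)
                .collect(Collectors.joining(" ", "(", ")"));
    }

    public static void main(String[] args) {
        System.out.println(expandOr("title", List.of("rock", "jazz")));
        System.out.println(expandAnd("title", List.of("rock", "jazz")));
        // A quoted phrase counts as one member, so the prefix is applied once to the phrase.
        System.out.println(expandOr("title", List.of("\"rock and roll\"", "jazz")));
    }
}
```

The sketch treats a quoted phrase as a single member, which is why `"rock and roll"` receives one `title:` prefix rather than three.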

Root cause

The ANTLR grammar SearchParser.g4 defined fieldQuery : fieldPath COLON searchValue where searchValue only accepts leaf values (TERM, QUOTED, etc.), not a parenthesized sub-clause. So title:( caused a syntax error.

Solution

Grammar (SearchParser.g4):

  • Add fieldGroupQuery : fieldPath COLON LPAREN clause RPAREN rule
  • Add it as an alternative in atomClause before fieldQuery

Visitor (SearchDslParser.java):

  • Add markExplicitFieldRecursive() helper — marks all leaf nodes in a group as explicitField=true to prevent MultiFieldExpander from re-expanding them across unintended fields
  • Modify visitBareQuery() in both QsAstBuilder and QsLuceneModeAstBuilder to use currentFieldName as field group context when set
  • Add visitFieldGroupQuery() to both AST builders: sets field context, visits inner clause, marks all leaves explicit
  • Update visitAtomClause() and collectTermsFromNotClause() to handle the new rule
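
The visitor changes listed above can be pictured with a small Java sketch of the field-context idea: a group sets a current field, bare terms visited inside it inherit that field, and the resulting leaves are flagged explicit. This is an assumption-laden toy, not the PR's code: `FieldContextSketch`, `Leaf`, `visitBare`, and `visitFieldGroup` are hypothetical names, and the real builders operate on ANTLR parse-tree contexts rather than string lists.

```java
import java.util.ArrayList;
import java.util.List;

public class FieldContextSketch {

    /** Hypothetical leaf node; field == null means a bare term with no field yet. */
    static final class Leaf {
        final String field;
        final String term;
        boolean explicitField;

        Leaf(String field, String term) {
            this.field = field;
            this.term = term;
        }
    }

    // Field-group context; null when the visitor is not inside field:( ... ).
    private String currentFieldName;

    /** A bare term picks up the group field when a context is set. */
    Leaf visitBare(String term) {
        return new Leaf(currentFieldName, term);
    }

    /** field:(t1 t2 ...): set the context, visit the inner terms, mark them explicit. */
    List<Leaf> visitFieldGroup(String field, List<String> innerTerms) {
        String saved = currentFieldName;
        currentFieldName = field;
        try {
            List<Leaf> leaves = new ArrayList<>();
            for (String t : innerTerms) {
                Leaf leaf = visitBare(t);
                leaf.explicitField = true; // pinned: a later expander must skip it
                leaves.add(leaf);
            }
            return leaves;
        } finally {
            currentFieldName = saved; // restore so later bare terms are unaffected
        }
    }

    public static void main(String[] args) {
        FieldContextSketch visitor = new FieldContextSketch();
        for (Leaf l : visitor.visitFieldGroup("title", List.of("rock", "jazz"))) {
            System.out.println(l.field + ":" + l.term + " explicit=" + l.explicitField);
        }
        Leaf bare = visitor.visitBare("music");
        System.out.println(bare.field + ":" + bare.term + " explicit=" + bare.explicitField);
    }
}
```

Restoring the saved context in `finally` is a design choice of this sketch: it keeps a bare term that follows the group (such as `music` in `title:(rock OR jazz) AND music`) from inheriting the group's field.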

Release note

Support ES query_string field-grouped syntax in search() function: field:(term1 OR term2) now correctly expands to (field:term1 OR field:term2), matching Elasticsearch behavior. Supports standard mode, lucene mode, multi-field mode, and all value types (terms, phrases, wildcards, regexps).

Check List (For Author)

  • Test

    • Regression test (test_search_field_group_query.groovy, 23/23 search suites pass)
    • Unit Test (SearchDslParserTest.java, 132/132 tests pass, +10 new tests)
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
  • Behavior changed:

    • Yes. title:(rock OR jazz) previously threw a syntax error; now it is parsed as (title:rock OR title:jazz).
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

Cherry-pick missing test cases from selectdb/selectdb-core#7369.
All code logic was already in master; only these two test methods
covering the lucene mode + best_fields combination were absent.

- testMultiFieldExplicitFieldNotExpanded: verifies that explicit
  field:term syntax (e.g., title:music) is not expanded across fields
  in lucene+best_fields mode, matching ES query_string behavior.

- testMultiFieldMixedExplicitAndBareTerms: verifies that explicit
  field prefix is pinned while bare terms are expanded across fields
  in lucene+best_fields mode.
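
The behavior these two tests check can be sketched as follows: a term that carries an explicit field stays pinned, while a bare term fans out over the configured fields. The Java below is a simplified illustration with hypothetical names (`MultiFieldExpansionSketch`, `Term`, `expand`); it models only the matching structure, says nothing about how best_fields scores results, and is not the PR's `MultiFieldExpander`.

```java
import java.util.List;
import java.util.stream.Collectors;

public class MultiFieldExpansionSketch {

    /** Hypothetical term node; a non-null field means the user wrote field:term. */
    static final class Term {
        final String field;
        final String text;

        Term(String field, String text) {
            this.field = field;
            this.text = text;
        }
    }

    /** Explicit terms stay pinned; bare terms are expanded over every configured field. */
    static String expand(Term t, List<String> fields) {
        if (t.field != null) {
            return t.field + ":" + t.text;
        }
        return fields.stream()
                .map(f -> f + ":" + t.text)
                .collect(Collectors.joining(" OR ", "(", ")"));
    }

    public static void main(String[] args) {
        List<String> fields = List.of("title", "content");
        System.out.println(expand(new Term("title", "music"), fields)); // explicit field
        System.out.println(expand(new Term(null, "music"), fields));    // bare term
    }
}
```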
@Thearas
Contributor

Thearas commented Feb 18, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@airborne12
Member Author

run buildall

Add support for ES query_string field-grouped syntax where all terms
inside parentheses inherit the field prefix, e.g.:
  title:(rock OR jazz)  →  (title:rock OR title:jazz)
  title:(rock jazz)     →  (+title:rock +title:jazz)  [with AND operator]

Previously this syntax caused a parse error because the grammar only
allowed leaf values after the colon in fieldQuery.

Changes:
- SearchParser.g4: add fieldGroupQuery rule (fieldPath COLON LPAREN clause RPAREN)
  and add it as an alternative in atomClause before fieldQuery
- SearchDslParser.java:
  - Add markExplicitFieldRecursive() helper to mark all leaf nodes in a
    group as explicit (prevents multi-field expander from re-expanding them)
  - Modify visitBareQuery() in both QsAstBuilder and QsLuceneModeAstBuilder
    to use currentFieldName as field group context when set
  - Add visitFieldGroupQuery() to both AST builders: sets field context,
    visits inner clause, then marks all leaves as explicit
  - Update visitAtomClause() and collectTermsFromNotClause() in both
    builders to handle the new fieldGroupQuery alternative
- SearchDslParserTest.java: add 10 unit tests covering simple OR/AND,
  phrase inside group, wildcard+regexp, mixed with bare query, multi-field
  explicit preservation, lucene mode, and subcolumn dot-notation paths
- regression-test: add test_search_field_group_query.groovy with end-to-end
  tests against a running cluster
@airborne12 airborne12 changed the title from "[test](search) Add lucene+best_fields explicit field expansion tests" to "[feat](search) Support field-grouped query syntax field:(term1 OR term2)" Feb 18, 2026
@airborne12
Member Author

run buildall

@doris-robot

TPC-H: Total hot run time: 28825 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 3c6832f2d2c41167cd580752f790dd5edab21e8f, data reload: false

------ Round 1 ----------------------------------
============================================
q1	17608	4503	4338	4338
q2	q3	10655	786	520	520
q4	4693	356	249	249
q5	7545	1188	1014	1014
q6	170	172	147	147
q7	789	855	666	666
q8	9291	1457	1293	1293
q9	4823	4698	4634	4634
q10	6756	1879	1659	1659
q11	436	244	246	244
q12	679	574	468	468
q13	17787	4186	3400	3400
q14	235	234	219	219
q15	941	785	788	785
q16	737	725	692	692
q17	704	848	420	420
q18	6093	5369	5334	5334
q19	1267	974	579	579
q20	502	492	386	386
q21	5000	1899	1510	1510
q22	398	347	268	268
Total cold run time: 97109 ms
Total hot run time: 28825 ms

----- Round 2, with runtime_filter_mode=off -----
============================================
q1	4616	4530	4514	4514
q2	q3	1811	2237	1793	1793
q4	897	1190	761	761
q5	4059	4365	4307	4307
q6	185	176	139	139
q7	1762	1650	1515	1515
q8	2603	2750	2547	2547
q9	7573	7234	7484	7234
q10	2716	2834	2398	2398
q11	510	439	407	407
q12	488	574	456	456
q13	3957	4448	3690	3690
q14	296	300	271	271
q15	908	806	798	798
q16	687	735	705	705
q17	1161	1458	1334	1334
q18	7002	6683	6598	6598
q19	850	979	905	905
q20	2050	2149	1978	1978
q21	3911	3510	3369	3369
q22	476	455	422	422
Total cold run time: 48518 ms
Total hot run time: 46141 ms

@doris-robot

TPC-DS: Total hot run time: 183787 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 3c6832f2d2c41167cd580752f790dd5edab21e8f, data reload: false

query5	5348	638	510	510
query6	330	233	205	205
query7	4233	460	266	266
query8	339	243	230	230
query9	8714	2760	2788	2760
query10	530	378	337	337
query11	16984	17520	17182	17182
query12	186	127	121	121
query13	1262	512	379	379
query14	7453	3400	3012	3012
query14_1	2908	2875	2870	2870
query15	197	197	197	197
query16	1019	488	434	434
query17	1104	679	583	583
query18	2715	430	351	351
query19	211	191	172	172
query20	140	123	120	120
query21	208	132	113	113
query22	4958	4962	4711	4711
query23	17241	16864	16591	16591
query23_1	16701	16733	16596	16596
query24	6988	1619	1210	1210
query24_1	1220	1255	1231	1231
query25	568	489	403	403
query26	1237	262	148	148
query27	2778	464	283	283
query28	4567	1884	1864	1864
query29	796	556	488	488
query30	316	240	209	209
query31	870	728	651	651
query32	79	72	72	72
query33	518	337	287	287
query34	922	918	560	560
query35	622	679	607	607
query36	1080	1176	970	970
query37	124	88	83	83
query38	2927	2910	2923	2910
query39	893	879	873	873
query39_1	835	823	833	823
query40	226	157	139	139
query41	62	60	59	59
query42	108	106	109	106
query43	413	387	372	372
query44	
query45	200	186	184	184
query46	880	985	603	603
query47	2160	2162	2087	2087
query48	300	316	227	227
query49	622	452	372	372
query50	675	283	221	221
query51	4050	4069	4054	4054
query52	105	104	95	95
query53	291	332	293	293
query54	297	273	270	270
query55	86	84	80	80
query56	307	306	303	303
query57	1391	1386	1279	1279
query58	284	272	270	270
query59	2563	2711	2540	2540
query60	326	341	323	323
query61	150	147	150	147
query62	616	596	537	537
query63	305	282	269	269
query64	4864	1263	1005	1005
query65	
query66	1382	457	349	349
query67	16509	16498	16330	16330
query68	
query69	387	312	301	301
query70	1014	994	933	933
query71	347	315	304	304
query72	2895	2864	2653	2653
query73	539	541	330	330
query74	10044	9942	9797	9797
query75	2852	2752	2475	2475
query76	2301	1054	663	663
query77	378	389	318	318
query78	11278	11289	10709	10709
query79	2698	883	610	610
query80	1837	633	574	574
query81	569	282	252	252
query82	988	158	117	117
query83	360	266	251	251
query84	258	133	111	111
query85	996	485	423	423
query86	424	312	301	301
query87	3118	3089	2993	2993
query88	3531	2685	2687	2685
query89	424	365	361	361
query90	1918	171	180	171
query91	160	152	132	132
query92	72	73	72	72
query93	1138	820	503	503
query94	648	310	274	274
query95	566	389	320	320
query96	652	519	230	230
query97	2459	2498	2412	2412
query98	235	223	218	218
query99	1006	991	866	866
Total cold run time: 256807 ms
Total hot run time: 183787 ms

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 59.21% (45/76) 🎉
Increment coverage report
Complete coverage report

eldenmoon
eldenmoon previously approved these changes Feb 19, 2026
Member

@eldenmoon eldenmoon left a comment

LGTM

@github-actions github-actions Bot added the approved label (Indicates a PR has been approved by one committer.) Feb 19, 2026
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

HappenLee
HappenLee previously approved these changes Feb 19, 2026
Contributor

@HappenLee HappenLee left a comment

LGTM

@airborne12 airborne12 dismissed stale reviews from HappenLee and eldenmoon via 0ce40ce February 19, 2026 07:36
@airborne12
Member Author

run buildall

@github-actions github-actions Bot removed the approved label (Indicates a PR has been approved by one committer.) Feb 19, 2026
zzzxl1993
zzzxl1993 previously approved these changes Feb 19, 2026
Contributor

@zzzxl1993 zzzxl1993 left a comment

LGTM

@github-actions github-actions Bot added the approved label (Indicates a PR has been approved by one committer.) Feb 19, 2026
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

…ssing brace, exception types

- Fix markExplicitFieldRecursive overriding inner explicit field bindings
  (e.g., title:(content:foo OR bar) now correctly keeps content:foo)
- Add recursion depth limit (MAX_FIELD_GROUP_DEPTH=32) to prevent StackOverflow
- Replace RuntimeException with SearchDslSyntaxException for consistency
- Fix missing closing brace in testFieldGroupQuerySubcolumnPath (merge artifact)
- Add test cases for inner explicit field preservation and NOT operator in groups
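
A minimal Java sketch of the two fixes described in this commit message, under stated assumptions: the marking pass skips leaves that already carry an explicit field, and recursion stops once a depth limit is exceeded. `MarkExplicitSketch` and `Node` are hypothetical, `IllegalArgumentException` stands in for the PR's `SearchDslSyntaxException`, and where exactly the real code checks `MAX_FIELD_GROUP_DEPTH` is not shown in this conversation.

```java
import java.util.ArrayList;
import java.util.List;

public class MarkExplicitSketch {

    static final int MAX_FIELD_GROUP_DEPTH = 32; // value named in the commit message

    /** Hypothetical AST node: a leaf when it has no children, otherwise a boolean group. */
    static final class Node {
        String field;
        final String term;
        boolean explicitField;
        final List<Node> children = new ArrayList<>();

        Node(String field, String term, boolean explicitField) {
            this.field = field;
            this.term = term;
            this.explicitField = explicitField;
        }
    }

    /** Pins unbound leaves to groupField; leaves with their own field are left alone. */
    static void markExplicit(Node node, String groupField, int depth) {
        if (depth > MAX_FIELD_GROUP_DEPTH) {
            // Stand-in for the PR's SearchDslSyntaxException.
            throw new IllegalArgumentException("field group nested too deeply");
        }
        if (node.children.isEmpty()) {
            if (!node.explicitField) { // keep inner bindings such as content:foo
                node.field = groupField;
                node.explicitField = true;
            }
            return;
        }
        for (Node child : node.children) {
            markExplicit(child, groupField, depth + 1);
        }
    }

    public static void main(String[] args) {
        // Models title:(content:foo OR bar)
        Node group = new Node(null, null, false);
        group.children.add(new Node("content", "foo", true));
        group.children.add(new Node(null, "bar", false));
        markExplicit(group, "title", 0);
        for (Node n : group.children) {
            System.out.println(n.field + ":" + n.term);
        }
    }
}
```

Failing with an exception at a fixed depth turns a pathological input into a reportable syntax error instead of a stack overflow, which matches the intent stated in the commit message.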
@airborne12
Member Author

run buildall

@github-actions github-actions Bot removed the approved label (Indicates a PR has been approved by one committer.) Feb 19, 2026
@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 57.32% (47/82) 🎉
Increment coverage report
Complete coverage report

…ent CI flakiness

When CI fuzzy testing sets default_variant_enable_doc_mode=true, variant
subcolumns are stored in document mode, causing inverted index iterators
to be unavailable in BE (VSearchExpr: No indexed columns available).
This results in empty query results for search() on variant subcolumns.

Fix: explicitly set default_variant_enable_doc_mode=false in both variant
search tests, following the pattern from variant_p0/predefine/ tests.
@airborne12 airborne12 force-pushed the pick-explicit-field-tests branch from 451344c to f073270 Compare February 20, 2026 03:23
@airborne12
Member Author

run buildall

@doris-robot

TPC-H: Total hot run time: 29027 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit f0732702952854065be5e1a528d08315d178471a, data reload: false

------ Round 1 ----------------------------------
============================================
q1	17622	4509	4271	4271
q2	q3	10658	797	517	517
q4	4679	360	254	254
q5	7574	1191	1022	1022
q6	178	172	145	145
q7	771	830	677	677
q8	9306	1486	1307	1307
q9	4919	4747	5004	4747
q10	6833	1882	1637	1637
q11	498	260	238	238
q12	702	561	466	466
q13	17800	4218	3421	3421
q14	232	227	214	214
q15	949	803	798	798
q16	757	714	670	670
q17	708	847	424	424
q18	5991	5377	5394	5377
q19	1232	985	626	626
q20	499	503	415	415
q21	4617	1985	1518	1518
q22	388	310	283	283
Total cold run time: 96913 ms
Total hot run time: 29027 ms

----- Round 2, with runtime_filter_mode=off -----
============================================
q1	4710	4523	4598	4523
q2	q3	1820	2276	1808	1808
q4	886	1205	802	802
q5	4012	4322	4339	4322
q6	192	193	153	153
q7	1805	1675	1492	1492
q8	2472	2685	2560	2560
q9	7636	7307	7333	7307
q10	2611	2870	2458	2458
q11	510	451	449	449
q12	536	687	475	475
q13	4392	4437	3609	3609
q14	278	287	270	270
q15	862	828	834	828
q16	721	742	710	710
q17	1143	1558	1347	1347
q18	7325	6883	6614	6614
q19	1081	896	876	876
q20	2060	2153	2007	2007
q21	4015	3471	3327	3327
q22	474	448	402	402
Total cold run time: 49541 ms
Total hot run time: 46339 ms

@doris-robot

TPC-DS: Total hot run time: 183853 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit f0732702952854065be5e1a528d08315d178471a, data reload: false

query5	4334	641	511	511
query6	316	219	204	204
query7	4208	468	282	282
query8	336	241	226	226
query9	8736	2751	2744	2744
query10	499	382	347	347
query11	17029	17484	17393	17393
query12	192	130	130	130
query13	1471	497	359	359
query14	6381	3357	2990	2990
query14_1	2897	2882	2922	2882
query15	201	197	175	175
query16	996	497	447	447
query17	2300	744	617	617
query18	2705	461	360	360
query19	205	202	181	181
query20	144	130	132	130
query21	213	139	116	116
query22	5613	5051	4676	4676
query23	17142	16752	16502	16502
query23_1	16794	16573	16679	16573
query24	7223	1619	1220	1220
query24_1	1253	1252	1224	1224
query25	577	498	439	439
query26	1243	261	165	165
query27	2764	471	300	300
query28	4525	1890	1888	1888
query29	836	608	467	467
query30	307	246	212	212
query31	863	715	649	649
query32	79	72	68	68
query33	511	337	280	280
query34	910	919	557	557
query35	624	672	591	591
query36	1079	1127	970	970
query37	130	98	80	80
query38	2955	2936	2855	2855
query39	899	879	974	879
query39_1	837	810	829	810
query40	231	154	133	133
query41	63	59	57	57
query42	105	101	102	101
query43	369	387	344	344
query44	
query45	193	185	181	181
query46	872	1000	604	604
query47	2140	2162	2033	2033
query48	320	323	242	242
query49	620	466	378	378
query50	683	280	211	211
query51	4040	4096	4035	4035
query52	106	110	94	94
query53	287	346	286	286
query54	289	281	269	269
query55	92	92	80	80
query56	351	309	318	309
query57	1373	1327	1289	1289
query58	287	273	267	267
query59	2547	2668	2598	2598
query60	334	344	316	316
query61	151	146	147	146
query62	640	585	537	537
query63	319	276	274	274
query64	4894	1261	1008	1008
query65	
query66	1424	452	352	352
query67	16489	16443	16354	16354
query68	
query69	412	324	285	285
query70	942	978	975	975
query71	336	299	298	298
query72	2796	2802	2584	2584
query73	544	554	326	326
query74	9981	9933	9796	9796
query75	2878	2785	2472	2472
query76	2292	1044	686	686
query77	372	398	325	325
query78	11308	11441	10711	10711
query79	1749	801	636	636
query80	1388	656	574	574
query81	564	281	258	258
query82	1000	158	124	124
query83	341	274	261	261
query84	260	128	100	100
query85	965	533	431	431
query86	405	310	300	300
query87	3103	3088	2985	2985
query88	3603	2687	2678	2678
query89	425	366	342	342
query90	2001	174	175	174
query91	171	152	131	131
query92	81	77	72	72
query93	1010	837	520	520
query94	634	323	293	293
query95	590	330	370	330
query96	640	514	229	229
query97	2452	2477	2390	2390
query98	224	220	217	217
query99	1006	979	890	890
Total cold run time: 255401 ms
Total hot run time: 183853 ms

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 35.37% (29/82) 🎉
Increment coverage report
Complete coverage report

…n tests

CI fuzzy testing randomly sets enable_common_expr_pushdown=false, which
prevents search() expressions from being pushed to the inverted index
evaluation path, causing "SearchExpr should not be executed without
inverted index" errors. Pin the variable to true at session level in all
15 search test files that were missing this protection.
@airborne12
Member Author

run buildall

@doris-robot

TPC-H: Total hot run time: 28745 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit b9516af3cdc720c8629d5803e690f8e31861efc6, data reload: false

------ Round 1 ----------------------------------
============================================
q1	17593	4468	4290	4290
q2	q3	10664	777	540	540
q4	4674	354	260	260
q5	7584	1218	993	993
q6	175	177	150	150
q7	786	856	667	667
q8	9293	1385	1379	1379
q9	4909	4705	4875	4705
q10	6843	1865	1647	1647
q11	469	269	232	232
q12	711	567	472	472
q13	17806	4233	3388	3388
q14	225	230	220	220
q15	963	799	786	786
q16	759	733	653	653
q17	727	884	446	446
q18	5903	5290	5234	5234
q19	1164	974	620	620
q20	492	505	401	401
q21	4409	1858	1414	1414
q22	344	289	248	248
Total cold run time: 96493 ms
Total hot run time: 28745 ms

----- Round 2, with runtime_filter_mode=off -----
============================================
q1	4389	4307	4348	4307
q2	q3	1772	2180	1726	1726
q4	855	1164	778	778
q5	4031	4323	4305	4305
q6	188	174	148	148
q7	1724	1578	1486	1486
q8	2437	2647	2514	2514
q9	7523	7335	7431	7335
q10	2713	2903	2491	2491
q11	518	441	441	441
q12	505	605	451	451
q13	3972	4427	3687	3687
q14	286	298	277	277
q15	897	847	790	790
q16	709	764	720	720
q17	1214	1624	1280	1280
q18	7136	6712	6595	6595
q19	903	888	942	888
q20	2110	2265	2033	2033
q21	4355	3589	3385	3385
q22	489	471	417	417
Total cold run time: 48726 ms
Total hot run time: 46054 ms

@doris-robot

TPC-DS: Total hot run time: 183470 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit b9516af3cdc720c8629d5803e690f8e31861efc6, data reload: false

query5	4860	637	508	508
query6	325	225	206	206
query7	4201	468	260	260
query8	326	261	230	230
query9	8743	2789	2755	2755
query10	530	404	334	334
query11	17005	17416	17254	17254
query12	235	130	124	124
query13	1790	484	376	376
query14	6778	3567	3054	3054
query14_1	3000	2924	2864	2864
query15	215	202	186	186
query16	1008	490	445	445
query17	1039	755	602	602
query18	2787	448	335	335
query19	214	229	183	183
query20	144	133	135	133
query21	214	141	115	115
query22	5646	4880	4765	4765
query23	17242	16756	16438	16438
query23_1	16918	16664	16629	16629
query24	7153	1620	1245	1245
query24_1	1237	1246	1236	1236
query25	559	480	423	423
query26	1238	269	154	154
query27	2766	471	287	287
query28	4435	1894	1892	1892
query29	818	582	485	485
query30	321	250	213	213
query31	884	741	641	641
query32	82	77	72	72
query33	567	333	281	281
query34	912	927	557	557
query35	648	675	587	587
query36	1067	1133	996	996
query37	139	89	80	80
query38	2932	2911	2840	2840
query39	903	860	836	836
query39_1	865	822	839	822
query40	242	150	136	136
query41	66	59	61	59
query42	104	101	101	101
query43	379	373	359	359
query44	
query45	199	187	184	184
query46	871	999	603	603
query47	2141	2111	2061	2061
query48	310	313	227	227
query49	633	471	383	383
query50	678	283	212	212
query51	4095	4059	4109	4059
query52	106	106	95	95
query53	292	338	270	270
query54	293	267	261	261
query55	93	86	80	80
query56	311	305	306	305
query57	1360	1359	1277	1277
query58	295	292	268	268
query59	2578	2658	2538	2538
query60	333	369	325	325
query61	144	143	144	143
query62	618	599	530	530
query63	308	278	286	278
query64	4807	1242	968	968
query65	
query66	1399	444	356	356
query67	16420	16351	16275	16275
query68	
query69	410	299	283	283
query70	991	979	960	960
query71	330	297	304	297
query72	2811	2654	2371	2371
query73	545	552	327	327
query74	10095	9921	9759	9759
query75	2861	2772	2480	2480
query76	2320	1049	718	718
query77	372	403	316	316
query78	11151	11359	10704	10704
query79	1155	822	604	604
query80	698	657	584	584
query81	491	293	248	248
query82	1359	150	114	114
query83	268	267	254	254
query84	254	119	104	104
query85	903	562	533	533
query86	361	313	328	313
query87	3097	3110	3015	3015
query88	3556	2715	2668	2668
query89	418	372	344	344
query90	1893	173	170	170
query91	166	155	135	135
query92	75	77	71	71
query93	911	842	515	515
query94	451	317	286	286
query95	586	387	312	312
query96	642	530	226	226
query97	2506	2470	2408	2408
query98	246	215	214	214
query99	978	979	907	907
Total cold run time: 254247 ms
Total hot run time: 183470 ms

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 57.32% (47/82) 🎉
Increment coverage report
Complete coverage report

Member

@eldenmoon eldenmoon left a comment

LGTM

@github-actions github-actions Bot added the approved label (Indicates a PR has been approved by one committer.) Feb 21, 2026
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

Contributor

@zzzxl1993 zzzxl1993 left a comment

LGTM

@airborne12 airborne12 merged commit 6cd22c9 into apache:master Feb 21, 2026
30 checks passed
@airborne12 airborne12 deleted the pick-explicit-field-tests branch February 21, 2026 04:05
airborne12 added a commit to airborne12/apache-doris that referenced this pull request Mar 3, 2026
…m2) (apache#60786)

### What problem does this PR solve?

Issue Number: close #N/A

Problem Summary:

The `search()` function did not support ES `query_string` field-grouped
syntax where all terms inside parentheses inherit the field prefix:

```sql
-- Previously failed with syntax error
SELECT * FROM t WHERE search('title:(rock OR jazz)', '{"fields":["title","content"]}');
```

ES semantics:
| Input | Expansion |
|-------|-----------|
| `title:(rock OR jazz)` | `(title:rock OR title:jazz)` |
| `title:(rock jazz)` with `default_operator:AND` | `(+title:rock +title:jazz)` |
| `title:(rock OR jazz) AND music` with `fields:[title,content]` | `(title:rock OR title:jazz) AND (title:music OR content:music)` |
| `title:("rock and roll" OR jazz)` | `(title:"rock and roll" OR title:jazz)` |

### Root cause

The ANTLR grammar `SearchParser.g4` defined `fieldQuery : fieldPath
COLON searchValue` where `searchValue` only accepts leaf values (TERM,
QUOTED, etc.), not a parenthesized sub-clause. So `title:(` caused a
syntax error.

### Solution

**Grammar** (`SearchParser.g4`):
- Add `fieldGroupQuery : fieldPath COLON LPAREN clause RPAREN` rule
- Add it as an alternative in `atomClause` before `fieldQuery`

**Visitor** (`SearchDslParser.java`):
- Add `markExplicitFieldRecursive()` helper — marks all leaf nodes in a
group as `explicitField=true` to prevent `MultiFieldExpander` from
re-expanding them across unintended fields
- Modify `visitBareQuery()` in both `QsAstBuilder` and
`QsLuceneModeAstBuilder` to use `currentFieldName` as field group
context when set
- Add `visitFieldGroupQuery()` to both AST builders: sets field context,
visits inner clause, marks all leaves explicit
- Update `visitAtomClause()` and `collectTermsFromNotClause()` to handle
airborne12 added a commit to airborne12/apache-doris that referenced this pull request Mar 4, 2026
…o branch-4.0

Squashed backport of the following master PRs:

- apache#59747 [fix](search) Make AND/OR/NOT operators case-sensitive in search DSL
- apache#60654 [refactor](search) Refactor SearchDslParser to single-phase ANTLR parsing and fix ES compatibility issues
- apache#60782 [fix](search) Upgrade query type for variant subcolumns with analyzer-based indexes
- apache#60784 [fix](search) Fix MATCH_ALL_DOCS query failing in multi-field search mode
- apache#60786 [feat](search) Support field-grouped query syntax field:(term1 OR term2)
- apache#60790 [fix](search) Add searcher cache reuse and DSL result cache for search() function
- apache#60793 [fix](search) Fix wildcard query on variant subcolumns returning empty results
- apache#60798 [fix](search) Use FE-provided analyzer key for multi-index columns in search()
- apache#60814 [fix](search) Fix implicit conjunction incorrectly modifying preceding term in lucene mode
- apache#60834 [test](search) Add regression test for wildcard query on variant subcolumns with multi-index
- apache#60873 [fix](search) fix MATCH_ALL_DOCS losing occur attribute in multi-field expansion
- apache#60891 [fix](search) inject MATCH_ALL_DOCS for multi-MUST_NOT queries in lucene mode
yiguolei pushed a commit that referenced this pull request Mar 4, 2026
… bug fixes (#61028)

### What problem does this PR solve?

Squashed backport of all search() function improvements and bug fixes
from master to branch-4.0.

This PR combines the following master PRs into a single backport:

| Master PR | Type | Description |
|-----------|------|-------------|
| #59747 | fix | Make AND/OR/NOT operators case-sensitive in search DSL |
| #60654 | refactor | Refactor SearchDslParser to single-phase ANTLR parsing and fix ES compatibility issues |
| #60782 | fix | Upgrade query type for variant subcolumns with analyzer-based indexes |
| #60784 | fix | Fix MATCH_ALL_DOCS query failing in multi-field search mode |
| #60786 | feat | Support field-grouped query syntax field:(term1 OR term2) |
| #60790 | fix | Add searcher cache reuse and DSL result cache for search() function |
| #60793 | fix | Fix wildcard query on variant subcolumns returning empty results |
| #60798 | fix | Use FE-provided analyzer key for multi-index columns in search() |
| #60814 | fix | Fix implicit conjunction incorrectly modifying preceding term in lucene mode |
| #60834 | test | Add regression test for wildcard query on variant subcolumns with multi-index |
| #60873 | fix | fix MATCH_ALL_DOCS losing occur attribute in multi-field expansion |
| #60891 | fix | inject MATCH_ALL_DOCS for multi-MUST_NOT queries in lucene mode |

### Release note

Backport search() function improvements including DSL parser
refactoring, multi-field search fixes, variant subcolumn support, query
caching, and field-grouped query syntax.

### Check List (For Author)

- Test
    - [x] Regression test
    - [x] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [x] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason

- Behavior changed:
    - [ ] No.
    - [x] Yes. New search() function features and bug fixes backported from master.

- Does this need documentation?
    - [x] No.
    - [ ] Yes.

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label
Labels

approved (Indicates a PR has been approved by one committer.), dev/4.0.4-merged, reviewed

9 participants