Query Information
PPL Command/Query:
source=opensearch_dashboards_sample_data_logs
| eval IPToCountry=geoip("my-datasource", clientip, "country_iso_code")
Expected Result:
The query should successfully execute and return geolocation data for each IP address in the clientip field, similar to when a literal IP string is provided.
Actual Result:
{
"error": {
"reason": "There was internal problem at backend",
"details": "java.sql.SQLException: Error while preparing plan [LogicalSystemLimit(fetch=[10000], type=[QUERY_SIZE_LIMIT])\n LogicalProject(referer=[$0], request=[$1], agent=[$2], extension=[$3], memory=[$4], ip=[$5], index=[$6], message=[$7], url=[$8], tags=[$9], geo=[$10], @timestamp=[$15], utc_time=[$16], bytes=[$17], machine=[$18], response=[$21], clientip=[$22], host=[$23], event=[$24], phpmemory=[$26], timestamp=[$27], IPToCountry=[GEOIP('my-datasource':VARCHAR, $22, 'country_iso_code':VARCHAR)])\n CalciteLogicalIndexScan(table=[[OpenSearch, opensearch_dashboards_sample_data_logs]])\n]",
"type": "RuntimeException"
},
"status": 500
}
Dataset Information
Dataset/Schema Type
Index Mapping
{
"mappings": {
"properties": {
"clientip": { "type": "ip" },
"host": {
"type": "text",
"fields": {
"keyword": { "type": "keyword" }
}
},
"geo": {
"properties": {
"coordinates": { "type": "geo_point" },
"dest": { "type": "keyword" },
"src": { "type": "keyword" }
}
}
}
}
}
Sample Data
{
"clientip": "223.87.60.27",
"host": "artifacts.opensearch.org",
"geo": {
"srcdest": "IN:US",
"src": "IN",
"dest": "US",
"coordinates": {
"lat": 39.41042861,
"lon": -88.8454325
}
}
}
Bug Description
Issue Summary:
The geoip() function fails when passed a field reference of type ip as the second parameter, but works correctly when passed a literal string IP address. This appears to be a type compatibility issue in the Calcite query planner where the IP type is not being accepted by the function's operand type checker which expects SqlTypeFamily.CHARACTER.
Steps to Reproduce:
- Load the OpenSearch Dashboards sample data (opensearch_dashboards_sample_data_logs)
- Configure a geospatial datasource (e.g., "my-datasource")
- Execute the failing query:
source=opensearch_dashboards_sample_data_logs
| eval IPToCountry=geoip("my-datasource", clientip, "country_iso_code")
- Observe the 500 error with "Error while preparing plan"
Comparison with Working Query:
source=opensearch_dashboards_sample_data_logs
| eval IPToCountry=geoip("my-datasource", "50.68.18.229", "country_iso_code")
This query succeeds because the literal string "50.68.18.229" is of type VARCHAR/CHARACTER.
Impact:
This bug prevents users from using the geoip() function with IP-typed fields in their indices, which is the primary use case for geolocation enrichment in log analysis and security analytics scenarios. Users must resort to explicit type casting as a workaround.
Environment Information
OpenSearch Version: 3.3.0-SNAPSHOT
Additional Details:
- The issue is specific to the Calcite-based query engine
- The
clientip field has OpenSearch mapping type ip which maps to ExprCoreType.IP
- The
geoip function's operand type checker in GeoIpFunction.java only accepts SqlTypeFamily.CHARACTER types
Tentative Root Cause Analysis
Preliminary Analysis - Requires Verification
The root cause appears to be in /opensearch/src/main/java/org/opensearch/sql/opensearch/functions/GeoIpFunction.java at the getOperandMetadata() method (lines 68-74):
@Override
public UDFOperandMetadata getOperandMetadata() {
return UDFOperandMetadata.wrap(
(CompositeOperandTypeChecker)
OperandTypes.CHARACTER_CHARACTER.or(
OperandTypes.family(
SqlTypeFamily.CHARACTER, SqlTypeFamily.CHARACTER, SqlTypeFamily.CHARACTER)));
}
The operand type checker restricts the second parameter to SqlTypeFamily.CHARACTER types only. However, when a field of OpenSearch type ip is referenced, it is represented as ExprIPType in Calcite, which extends ExprJavaType and is backed by the ExprIpValue Java class. This IP type does not belong to the CHARACTER type family, causing the query planner to fail during type validation.
The IP type in OpenSearch is defined as:
- OpenSearch mapping:
"ip" → ExprCoreType.IP (defined as IP(STRING) in ExprCoreType)
- Calcite representation:
ExprIPType (a custom UDT extending ExprJavaType)
While ExprCoreType.IP has STRING as its base type, the Calcite type system doesn't automatically recognize ExprIPType as belonging to SqlTypeFamily.CHARACTER.
Tentative Proposed Fix
Tentative and Preliminary Solution - Requires Verification:
Modify the getOperandMetadata() method in GeoIpFunction.java to accept IP types in addition to CHARACTER types:
@Override
public UDFOperandMetadata getOperandMetadata() {
return UDFOperandMetadata.wrap(
(CompositeOperandTypeChecker)
OperandTypes.CHARACTER_CHARACTER.or(
OperandTypes.family(
SqlTypeFamily.CHARACTER, SqlTypeFamily.CHARACTER, SqlTypeFamily.CHARACTER))
.or(
OperandTypes.family(
SqlTypeFamily.CHARACTER, SqlTypeFamily.ANY, SqlTypeFamily.CHARACTER)));
}
Alternatively, implement a custom operand type checker that explicitly accepts both CHARACTER and IP types for the second parameter, similar to how other IP-related functions like cidrmatch might handle this.
The implementation should also ensure that the fetchIpEnrichment method can handle ExprIpValue objects and convert them to strings appropriately.
Workaround
Users can work around this issue by explicitly casting the IP field to STRING:
source=opensearch_dashboards_sample_data_logs
| eval ip_str=CAST(clientip AS STRING)
| eval IPToCountry=geoip("my-datasource", ip_str, "country_iso_code")
This workaround successfully executes and returns the expected geolocation data.
Query Information
PPL Command/Query:
Expected Result:
The query should successfully execute and return geolocation data for each IP address in the
clientipfield, similar to when a literal IP string is provided.Actual Result:
{ "error": { "reason": "There was internal problem at backend", "details": "java.sql.SQLException: Error while preparing plan [LogicalSystemLimit(fetch=[10000], type=[QUERY_SIZE_LIMIT])\n LogicalProject(referer=[$0], request=[$1], agent=[$2], extension=[$3], memory=[$4], ip=[$5], index=[$6], message=[$7], url=[$8], tags=[$9], geo=[$10], @timestamp=[$15], utc_time=[$16], bytes=[$17], machine=[$18], response=[$21], clientip=[$22], host=[$23], event=[$24], phpmemory=[$26], timestamp=[$27], IPToCountry=[GEOIP('my-datasource':VARCHAR, $22, 'country_iso_code':VARCHAR)])\n CalciteLogicalIndexScan(table=[[OpenSearch, opensearch_dashboards_sample_data_logs]])\n]", "type": "RuntimeException" }, "status": 500 }Dataset Information
Dataset/Schema Type
Index Mapping
{ "mappings": { "properties": { "clientip": { "type": "ip" }, "host": { "type": "text", "fields": { "keyword": { "type": "keyword" } } }, "geo": { "properties": { "coordinates": { "type": "geo_point" }, "dest": { "type": "keyword" }, "src": { "type": "keyword" } } } } } }Sample Data
{ "clientip": "223.87.60.27", "host": "artifacts.opensearch.org", "geo": { "srcdest": "IN:US", "src": "IN", "dest": "US", "coordinates": { "lat": 39.41042861, "lon": -88.8454325 } } }Bug Description
Issue Summary:
The
geoip()function fails when passed a field reference of typeipas the second parameter, but works correctly when passed a literal string IP address. This appears to be a type compatibility issue in the Calcite query planner where the IP type is not being accepted by the function's operand type checker which expectsSqlTypeFamily.CHARACTER.Steps to Reproduce:
Comparison with Working Query:
This query succeeds because the literal string "50.68.18.229" is of type VARCHAR/CHARACTER.
Impact:
This bug prevents users from using the
geoip()function with IP-typed fields in their indices, which is the primary use case for geolocation enrichment in log analysis and security analytics scenarios. Users must resort to explicit type casting as a workaround.Environment Information
OpenSearch Version: 3.3.0-SNAPSHOT
Additional Details:
clientipfield has OpenSearch mapping typeipwhich maps toExprCoreType.IPgeoipfunction's operand type checker inGeoIpFunction.javaonly acceptsSqlTypeFamily.CHARACTERtypesTentative Root Cause Analysis
Preliminary Analysis - Requires Verification
The root cause appears to be in
/opensearch/src/main/java/org/opensearch/sql/opensearch/functions/GeoIpFunction.javaat thegetOperandMetadata()method (lines 68-74):The operand type checker restricts the second parameter to
SqlTypeFamily.CHARACTERtypes only. However, when a field of OpenSearch typeipis referenced, it is represented asExprIPTypein Calcite, which extendsExprJavaTypeand is backed by theExprIpValueJava class. This IP type does not belong to theCHARACTERtype family, causing the query planner to fail during type validation.The IP type in OpenSearch is defined as:
"ip"→ExprCoreType.IP(defined asIP(STRING)in ExprCoreType)ExprIPType(a custom UDT extendingExprJavaType)While
ExprCoreType.IPhasSTRINGas its base type, the Calcite type system doesn't automatically recognizeExprIPTypeas belonging toSqlTypeFamily.CHARACTER.Tentative Proposed Fix
Tentative and Preliminary Solution - Requires Verification:
Modify the
getOperandMetadata()method inGeoIpFunction.javato accept IP types in addition to CHARACTER types:Alternatively, implement a custom operand type checker that explicitly accepts both CHARACTER and IP types for the second parameter, similar to how other IP-related functions like
cidrmatchmight handle this.The implementation should also ensure that the
fetchIpEnrichmentmethod can handleExprIpValueobjects and convert them to strings appropriately.Workaround
Users can work around this issue by explicitly casting the IP field to STRING:
This workaround successfully executes and returns the expected geolocation data.