diff --git a/docs/dev/Pagination-v2.md b/docs/dev/Pagination-v2.md
index 77694c25109..2416ae51de2 100644
--- a/docs/dev/Pagination-v2.md
+++ b/docs/dev/Pagination-v2.md
@@ -7,7 +7,8 @@ A cursor is a SQL abstraction for pagination. A client can open a cursor, retrie
Currently, SQL plugin does not provide SQL cursor syntax. However, the SQL REST endpoint can return result a page at a time. This feature is used by JDBC and ODBC drivers.
# Scope
-Currenty, V2 engine supports pagination only for simple `SELECT * FROM
` queries without any other clauses like `WHERE` or `ORDER BY`.
+This document describes pagination in V2 sql engine for non-aggregate queries -- queries
+without `GROUP BY` clause or use of window functions.
# Demo
https://user-images.githubusercontent.com/88679692/224208630-8d38d833-abf8-4035-8d15-d5fb4382deca.mp4
@@ -15,7 +16,7 @@ https://user-images.githubusercontent.com/88679692/224208630-8d38d833-abf8-4035-
# REST API
## Initial Query Request
-Initial query request contains the search request and page size. Search query to OpenSearch is built during processing of this request. Neither the query nor page size can be change while scrolling through pages based on this request.
+Initial query request contains the search request and page size. Search query to OpenSearch is built during processing of this request. Neither the query nor page size can be changed while scrolling through pages based on this request.
The only difference between paged and non-paged requests is `fetch_size` parameter supplied in paged request.
```json
@@ -62,10 +63,7 @@ Similarly to v1 engine, the response object is the same as initial response if t
`cursor_id` will be different with each request.
## End of scrolling/paging
-
-When scrolling is finished, SQL plugin returns a final cursor. This cursor leads to an empty page, which has no cursor and no data hits. Receiving that page means all data was properly queried, and the scrolling cursor has been closed.
-
-The client will receive an [error response](#error-response) if executing this request results in an OpenSearch or SQL plug-in error.
+The last page in a response will not have a cursor id property.
## Cursor Keep Alive Timeout
@@ -79,7 +77,7 @@ Keep alive timeout is [configurable](../user/admin/settings.rst#plugins.sql.curs
## Error Response
-The client will receive an error response if any of the above REST calls result in an server-side error.
+The client will receive an error response if any of the above REST calls result in a server-side error.
The response object has the following format:
```json
@@ -98,7 +96,7 @@ The response object has the following format:
## OpenSearch Data Retrieval Strategy
-OpenSearch provides several data retrival APIs that are optimized for different use cases.
+OpenSearch provides several data retrieval APIs that are optimized for different use cases.
At this time, SQL plugin uses simple search API and scroll API.
@@ -106,51 +104,46 @@ Simple retrieval API returns at most `max_result_window` number of documents. `
Scroll API requests returns all documents but can incur high memory costs on OpenSearch coordination node.
-Efficient implementation of pagination needs to be aware of retrival API used. Each retrieval strategy will be considered separately.
+Efficient implementation of pagination needs to be aware of retrieval API used. Each retrieval strategy will be considered separately.
The discussion below uses *under max_result_window* to refer to scenarios that can be implemented with simple retrieval API and *over max_result_window* for scenarios that require scroll API to implement.
## SQL Node Load Balancing
-V2 SQL engine supports *sql node load balancing* -- a cursor request can be routed to any SQL node in a cluster. This is achieved by encoding all data necessary to retrieve the next page in the `cursor_id`.
+V2 SQL engine supports *sql node load balancing* — a cursor request can be routed to any SQL node in a cluster. This is achieved by encoding all data necessary to retrieve the next page in the `cursor_id` property in the response.
## Feature Design
+To support pagination, v2 SQL engine needs to:
+1. in REST front-end:
+ 1. Route supported paginated query to v2 engine for
+ 1. Initial requests,
+ 2. Next page requests.
+ 2. Fallback to v1 engine for queries not supported by v2 engine.
+ 3. Create correct JSON response from execution of paginated physical plan by v2 engine.
+2. during query planning:
+ 1. Differentiate between paginated and normal query plans.
+ 2. Push down pagination to table scan.
+ 3. Create a physical query plan from a cursor id.
+3. during query execution:
+ 1. Serialize an executing query and generate a cursor id after returning `fetch_size` number of elements.
+4. in OpenSearch data source:
+ 1. Support pagination push down.
+ 2. Support other push down optimizations with pagination.
+
+### Query Plan Changes
+
+All three kinds of query requests — non-paged, initial page, or subsequent page — are processed in the same way. Simplified workflow of query plan processing is shown below for reference.
-### Plan Tree changes
-
-Different plan trees are built during request processing. Read more about their purpose and stages [here](query-optimizer-improvement.md#Examples). The section below describes the changes being introduced to these trees by the pagination feature.
-
-Simplified workflow of plan trees is shown below. Initial Page Request is processed the same way as a non-paging request.
-
-```mermaid
-stateDiagram-v2
- state "Non Paged Request" as NonPaged {
- direction LR
- state "Parse Tree" as Parse
- state "Unresolved Plan Tree" as Unresolved
- state "Abstract Plan Tree" as Abstract
- state "Logical Plan Tree" as Logical
- state "Optimized Logical Plan Tree" as Optimized
- state "Physical Plan Tree" as Physical
-
- [*] --> Parse : ANTLR
- Parse --> Unresolved : AstBuilder
- Unresolved --> Abstract : QueryPlanner
- Abstract --> Logical : Planner
- Logical --> Optimized : Optimizer
- Optimized --> Physical : Implementor
- }
-```
```mermaid
stateDiagram-v2
- state "Initial Page Request" as Paged {
+ state "Request" as NonPaged {
direction LR
state "Parse Tree" as Parse
- state "Unresolved Plan Tree" as Unresolved
- state "Abstract Plan Tree" as Abstract
- state "Logical Plan Tree" as Logical
- state "Optimized Logical Plan Tree" as Optimized
- state "Physical Plan Tree" as Physical
+ state "Unresolved Query Plan" as Unresolved
+ state "Abstract Query Plan" as Abstract
+ state "Logical Query Plan" as Logical
+ state "Optimized Query Plan" as Optimized
+ state "Physical Query Plan" as Physical
[*] --> Parse : ANTLR
Parse --> Unresolved : AstBuilder
@@ -160,23 +153,13 @@ stateDiagram-v2
Optimized --> Physical : Implementor
}
```
-```mermaid
-stateDiagram-v2
- state "Subsequent Page Request" as Paged {
- direction LR
- state "Abstract Plan Tree" as Abstract
- state "Physical Plan Tree" as Physical
- [*] --> Abstract : QueryPlanner
- Abstract --> Physical : Deserializer
- }
-```
-New plan tree workflow was added for Subsequent Page Requests. Since a final Physical Plan tree was already created for Initial request, subsequent requests should have the same tree. The tree is serialized into a `cursor` by `PlanSerializer` to be de-serialized on the subsequence Page Request. Query parsing and analysis is not performed for Subsequent Page Requests, since the Physical Plan tree is instead de-serialized from the `cursor`.
+#### Unresolved Query Plan
-#### Abstract Plan tree
+Unresolved Query Plan for non-paged requests remains unchanged.
-Abstract Plan Tree for non-paged requests remains unchanged. The `QueryPlan`, as a top entry, has a new optional field `pageSize`, which is not defined for non-paged requests.
+To support initial query requests, the `QueryPlan` class has a new optional field `pageSize`.
```mermaid
classDiagram
@@ -187,15 +170,15 @@ classDiagram
-UnresolvedPlan plan
-QueryService queryService
}
- class UnresolvedPlanTree {
+ class UnresolvedQueryPlan {
<>
}
- QueryPlan --* UnresolvedPlanTree
+ QueryPlan --* UnresolvedQueryPlan
```
-Abstract plan tree for Initial Query Request has following changes:
-1. New Plan node -- `Paginate` -- added into the tree.
-2. `pageSize` parameter in `QueryPlan` is set, and `Paginate` is being added. It is converted to `LogicalPaginate` later.
+When `QueryPlanFactory.create` is passed initial query request, it:
+1. Adds an instance of `Paginate` unresolved plan as the root of the unresolved query plan.
+2. Sets `pageSize` parameter in `QueryPlan`.
```mermaid
classDiagram
@@ -211,31 +194,33 @@ classDiagram
-int pageSize
-UnresolvedPlan child
}
- class UnresolvedPlanTree {
+ class UnresolvedQueryPlan {
<>
}
QueryPlan --* Paginate
- Paginate --* UnresolvedPlanTree
+ Paginate --* UnresolvedQueryPlan
```
-Non-paging requests have the same plan tree, but `pageSize` value in `QueryPlan` is unset.
-
-Abstract plan tree for Subsequent Query Request (for second and further pages) contains only one node -- `ContinuePaginatedPlan`.
+When `QueryPlanFactory.create` is passed a subsequent query request, it:
+1. Creates an instance of `Cursor` unresolved plan as the sole node in the unresolved query plan.
```mermaid
classDiagram
- direction LR
- class ContinuePaginatedPlan {
- <>
- -String cursor
- -PlanSerializer planSerializer
- -QueryService queryService
- }
+ direction LR
+ class QueryPlan {
+ <>
+ -Optional~int~ pageSize
+ -UnresolvedPlan plan
+ -QueryService queryService
+ }
+ class Cursor {
+ <>
+ -String cursorId
+ }
+ QueryPlan --* Cursor
```
-`ContinuePaginatedPlan` translated to entire Physical Plan Tree by `PlanSerializer` on cursor deserialization. It bypasses Logical Plan tree stage, `Planner`, `Optimizer` and `Implementor`.
-
-The examples below show Abstract Plan Tree for the same query in different request types:
+The examples below show Abstract Query Plan for the same query in different request types:
```mermaid
stateDiagram-v2
@@ -271,14 +256,16 @@ stateDiagram-v2
}
state "Subsequent Query Request" As Sub {
- ContinuePaginatedPlan
+ Cursor
}
```
-#### Logical Plan tree
+#### Logical Query Plan
+
+There are no changes for non-paging requests.
-Changes to plan tree for Initial Query Request with pagination:
-1. `LogicalPaginate` is added to the top of the tree. It stores information about paging/scrolling should be done in a private field `pageSize` being pushed down in the `Optimizer`.
+Changes to logical query plan to support Initial Query Request:
+1. `LogicalPaginate` is added to the top of the tree. It stores information about paging should be done in a private field `pageSize` being pushed down in the `Optimizer`.
```mermaid
classDiagram
@@ -287,33 +274,32 @@ classDiagram
<>
int pageSize
}
- class LogicalPlanTree {
+ class LogicalQueryPlan {
<>
}
class LogicalRelation {
<>
}
- LogicalPaginate --* LogicalPlanTree
- LogicalPlanTree --* LogicalRelation
+ LogicalPaginate --* LogicalQueryPlan
+ LogicalQueryPlan --* LogicalRelation
```
-There are no changes for non-paging requests.
+For subsequent page requests, `Cursor` unresolved plan is mapped to `LogicalCursor` logical plan.
```mermaid
classDiagram
direction LR
- class LogicalPlanTree {
+ class LogicalQueryPlan {
<>
}
- class LogicalRelation {
+ class LogicalCursor {
<>
+ -String cursorId
}
- LogicalPlanTree --* LogicalRelation
+ LogicalQueryPlan --* LogicalCursor
```
-Note: This step is not executed for Subsequent Query Request.
-
-The examples below show Logical Plan Tree for the same query in different request types:
+The examples below show logical query plan for the same query in different request types:
```mermaid
stateDiagram-v2
@@ -343,32 +329,25 @@ stateDiagram-v2
FilterIP --> AggregationIP
AggregationIP --> RelationIP
}
+
+state "Subsequent Query Request" As Sub {
+Cursor
+}
```
-#### Optimized Logical Plan tree
-Changes:
-1. For pagination request, a `OpenSearchPagedIndexScanBuilder` is inserted to the bottom of the tree instead of `OpenSearchIndexScanQueryBuilder`. Both are instances of `TableScanBuilder` which extends `LogicalPlan` interface.
-2. `LogicalPaginate` is removed from the tree during push down operation in `Optimizer`.
+#### Optimized Logical Query Plan
-See [article about `TableScanBuilder`](query-optimizer-improvement.md#TableScanBuilder) for more details.
+Pagination is implemented by push down to OpenSearch. The following is only relevant for
+initial paged requests. Non-paged request optimization was not changed and there is no optimization
+to be done for subsequent page query plans.
-```mermaid
-classDiagram
- direction LR
- class LogicalProject {
- <>
- }
- class OpenSearchPagedIndexScanBuilder {
- <>
- }
-
- LogicalProject --* OpenSearchPagedIndexScanBuilder
-```
+Push down logical is implemented in `OpenSearchIndexScanQueryBuilder.pushDownPageSize` method.
+This method is called by `PushDownPageSize` rule during plan optimization. `LogicalPaginate` is removed from the query plan during push down operation in `Optimizer`.
-Note: No Logical Plan tree created for Subsequent Query Request.
+See [article about `TableScanBuilder`](query-optimizer-improvement.md#TableScanBuilder) for more details.
-The examples below show optimized Logical Plan Tree for the same query in different request types:
+The examples below show optimized Logical Query Plan for the same query in different request types:
```mermaid
stateDiagram-v2
@@ -383,38 +362,18 @@ stateDiagram-v2
SortNP --> RelationNP
}
- state "Initial Paged Request" as Paged {
- state "LogicalProject" as ProjectIP
- state "LogicalLimit" as LimitIP
- state "LogicalSort" as SortIP
- state "OpenSearchPagedIndexScanBuilder" as RelationIP
-
- ProjectIP --> LimitIP
- LimitIP --> SortIP
- SortIP --> RelationIP
- }
```
-#### Physical Plan tree
+#### Physical Query Plan and Execution
Changes:
-1. `OpenSearchPagedIndexScanBuilder` is converted to `OpenSearchPagedIndexScan` by `Implementor`.
-2. Entire Physical Plan tree is created by `PlanSerializer` for Subsequent Query requests. The deserialized tree has the same structure as the Initial Query Request.
+1. `OpenSearchIndexScanBuilder` is converted to `OpenSearchIndexScan` by `Implementor`.
+2. `LogicalPlan.pageSize` is mapped to `OpenSearchIndexScan.maxResponseSize`. This is the limit to the number of elements in a response.
+2. Entire Physical Query Plan is created by `PlanSerializer` for Subsequent Query requests. The deserialized plan has the same structure as the Initial Query Request.
+3. Implemented serialization and deserialization for `OpenSearchScrollRequest`.
-```mermaid
-classDiagram
- direction LR
- class ProjectOperator {
- <>
- }
- class OpenSearchPagedIndexScan {
- <>
- }
-
- ProjectOperator --* OpenSearchPagedIndexScan
-```
-The examples below show Physical Plan Tree for the same query in different request types:
+The examples below show physical query plan for the same query in different request types:
```mermaid
stateDiagram-v2
@@ -423,32 +382,38 @@ stateDiagram-v2
state "LimitOperator" as LimitNP
state "SortOperator" as SortNP
state "OpenSearchIndexScan" as RelationNP
+ state "OpenSearchQueryRequest" as QRequestNP
ProjectNP --> LimitNP
LimitNP --> SortNP
SortNP --> RelationNP
+ RelationNP --> QRequestNP
}
state "Initial Query Request" as Paged {
state "ProjectOperator" as ProjectIP
state "LimitOperator" as LimitIP
state "SortOperator" as SortIP
- state "OpenSearchPagedIndexScan" as RelationIP
+ state "OpenSearchIndexScan" as RelationIP
+ state "OpenSearchQueryRequest" as QRequestIP
ProjectIP --> LimitIP
LimitIP --> SortIP
SortIP --> RelationIP
+ RelationIP --> QRequestIP
}
state "Subsequent Query Request" As Sub {
state "ProjectOperator" as ProjectSP
state "LimitOperator" as LimitSP
state "SortOperator" as SortSP
- state "OpenSearchPagedIndexScan" as RelationSP
+ state "OpenSearchIndexScan" as RelationSP
+ state "OpenSearchScrollRequest" as RequestSP
ProjectSP --> LimitSP
LimitSP --> SortSP
SortSP --> RelationSP
+ RelationSP --> RequestSP
}
```
@@ -474,7 +439,7 @@ SQLService ->>+ QueryPlanFactory: execute
QueryService ->>+ Planner: optimize
Planner ->>+ CreateTableScanBuilder: apply
CreateTableScanBuilder -->>- Planner: index scan
- Planner -->>- QueryService: Logical Plan Tree
+ Planner -->>- QueryService: Logical Query Plan
QueryService ->>+ OpenSearchExecutionEngine: execute
OpenSearchExecutionEngine -->>- QueryService: execution completed
QueryService -->>- QueryPlanFactory: execution completed
@@ -485,9 +450,8 @@ SQLService ->>+ QueryPlanFactory: execute
Processing of an Initial Query Request has few extra steps comparing versus processing a regular Query Request:
1. Query validation with `CanPaginateVisitor`. This is required to validate whether incoming query can be paged. This also activate legacy engine fallback mechanism.
-2. Creating a paged index scan with `CreatePagingTableScanBuilder` `Optimizer` rule. A Regular Query Request triggers `CreateTableScanBuilder` rule instead.
-3. `Serialization` is performed by `PlanSerializer` - it converts Physical Plan Tree into a cursor, which could be used query a next page.
-4. Traversal of Physical Plan Tree to get total hits, which is required to properly fill response to a user.
+2. `Serialization` is performed by `PlanSerializer` - it converts Physical Query Plan into a cursor, which could be used query a next page.
+
```mermaid
sequenceDiagram
@@ -496,7 +460,7 @@ sequenceDiagram
participant CanPaginateVisitor
participant QueryService
participant Planner
- participant CreatePagingTableScanBuilder
+ participant CreatePagingScanBuilder
participant OpenSearchExecutionEngine
participant PlanSerializer
@@ -508,10 +472,10 @@ SQLService ->>+ QueryPlanFactory : execute
QueryPlanFactory ->>+ QueryService : execute
QueryService ->>+ Planner : optimize
rect rgb(91, 123, 155)
- Planner ->>+ CreatePagingTableScanBuilder : apply
- CreatePagingTableScanBuilder -->>- Planner : paged index scan
+ Planner ->>+ CreateTableScanBuilder : apply
+ CreateTableScanBuilder -->>- Planner : paged index scan
end
- Planner -->>- QueryService : Logical Plan Tree
+ Planner -->>- QueryService : Logical Query Plan
QueryService ->>+ OpenSearchExecutionEngine : execute
rect rgb(91, 123, 155)
Note over OpenSearchExecutionEngine, PlanSerializer : Serialization
@@ -529,10 +493,10 @@ SQLService ->>+ QueryPlanFactory : execute
#### Subsequent Query Request
Subsequent pages are processed by a new workflow. The key point there:
-1. `Deserialization` is performed by `PlanSerializer` to restore entire Physical Plan Tree encoded into the cursor.
-2. Since query already contains the Physical Plan Tree, all tree processing steps are skipped.
-3. `Serialization` is performed by `PlanSerializer` - it converts Physical Plan Tree into a cursor, which could be used query a next page.
-4. Traversal of Physical Plan Tree to get total hits, which is required to properly fill response to a user.
+1. `Deserialization` is performed by `PlanSerializer` to restore entire Physical Query Plan encoded into the cursor.
+2. Since query already contains the Physical Query Plan, analysis and optimization steps are no-ops.
+3. `Serialization` is performed by `PlanSerializer` - it converts Physical Query Plan into a cursor, which could be used query a next page.
+4. Traversal of Physical Query Plan to get total hits, which is required to properly fill response to a user.
```mermaid
sequenceDiagram
@@ -547,7 +511,7 @@ SQLService ->>+ QueryPlanFactory : execute
rect rgb(91, 123, 155)
note over QueryService, PlanSerializer : Deserialization
QueryService ->>+ PlanSerializer: convertToPlan
- PlanSerializer -->>- QueryService: Physical plan tree
+ PlanSerializer -->>- QueryService: Physical Query Plan
end
Note over QueryService : Planner, Optimizer and Implementor
are skipped
QueryService ->>+ OpenSearchExecutionEngine : execute
@@ -592,18 +556,18 @@ RestSQLQueryAction ->>+ SQLService : prepareRequest
#### Serialization and Deserialization round trip
-The SQL engine should be able to completely recover the Physical Plan tree to continue its execution to get the next page. Serialization mechanism is responsible for recovering the plan tree. note: `ResourceMonitorPlan` isn't serialized, because a new object of this type would be created for the restored plan tree before execution.
+The SQL engine should be able to completely recover the Physical Query Plan to continue its execution to get the next page. Serialization mechanism is responsible for recovering the query plan. note: `ResourceMonitorPlan` isn't serialized, because a new object of this type would be created for the restored query plan before execution.
Serialization and Deserialization are performed by Java object serialization API.
```mermaid
stateDiagram-v2
direction LR
- state "Initial Query Request Plan Tree" as FirstPage
+ state "Initial Query Request Query Plan" as FirstPage
state FirstPage {
state "ProjectOperator" as logState1_1
state "..." as logState1_2
state "ResourceMonitorPlan" as logState1_3
- state "OpenSearchPagedIndexScan" as logState1_4
+ state "OpenSearchIndexScan" as logState1_4
state "OpenSearchScrollRequest" as logState1_5
logState1_1 --> logState1_2
logState1_2 --> logState1_3
@@ -611,24 +575,24 @@ stateDiagram-v2
logState1_4 --> logState1_5
}
- state "Deserialized Plan Tree" as SecondPageTree
+ state "Deserialized Query Plan" as SecondPageTree
state SecondPageTree {
state "ProjectOperator" as logState2_1
state "..." as logState2_2
- state "OpenSearchPagedIndexScan" as logState2_3
- state "ContinuePageRequest" as logState2_4
+ state "OpenSearchIndexScan" as logState2_3
+ state "OpenSearchScrollRequest" as logState2_4
logState2_1 --> logState2_2
logState2_2 --> logState2_3
logState2_3 --> logState2_4
}
- state "Subsequent Query Request Plan Tree" as SecondPage
+ state "Subsequent Query Request Query Plan" as SecondPage
state SecondPage {
state "ProjectOperator" as logState3_1
state "..." as logState3_2
state "ResourceMonitorPlan" as logState3_3
- state "OpenSearchPagedIndexScan" as logState3_4
- state "ContinuePageRequest" as logState3_5
+ state "OpenSearchIndexScan" as logState3_4
+ state "OpenSearchScrollRequest" as logState3_5
logState3_1 --> logState3_2
logState3_2 --> logState3_3
logState3_3 --> logState3_4
@@ -641,16 +605,16 @@ stateDiagram-v2
#### Serialization
-All plan tree nodes which are supported by pagination should implement [`SerializablePlan`](https://github.com/opensearch-project/sql/blob/f40bb6d68241e76728737d88026e4c8b1e6b3b8b/core/src/main/java/org/opensearch/sql/planner/SerializablePlan.java) interface. `getPlanForSerialization` method of this interface allows serialization mechanism to skip a tree node from serialization. OpenSearch search request objects are not serialized, but search context provided by the OpenSearch cluster is extracted from them.
+All query plan nodes which are supported by pagination should implement [`SerializablePlan`](https://github.com/opensearch-project/sql/blob/f40bb6d68241e76728737d88026e4c8b1e6b3b8b/core/src/main/java/org/opensearch/sql/planner/SerializablePlan.java) interface. `getPlanForSerialization` method of this interface allows serialization mechanism to skip a tree node from serialization. OpenSearch search request objects are not serialized, but search context provided by the OpenSearch cluster is extracted from them.
```mermaid
sequenceDiagram
participant PlanSerializer
participant ProjectOperator
participant ResourceMonitorPlan
- participant OpenSearchPagedIndexScan
+ participant OpenSearchIndexScan
+ participant OpenSearchScrollRequest
participant OpenSearchScrollRequest
- participant ContinuePageRequest
PlanSerializer ->>+ ProjectOperator : getPlanForSerialization
ProjectOperator -->>- PlanSerializer : this
@@ -659,31 +623,31 @@ PlanSerializer ->>+ ProjectOperator : serialize
ProjectOperator ->>+ ResourceMonitorPlan : getPlanForSerialization
ResourceMonitorPlan -->>- ProjectOperator : delegate
Note over ResourceMonitorPlan : ResourceMonitorPlan
is not serialized
- ProjectOperator ->>+ OpenSearchPagedIndexScan : serialize
+ ProjectOperator ->>+ OpenSearchIndexScan : writeExternal
alt First page
- OpenSearchPagedIndexScan ->>+ OpenSearchScrollRequest : toCursor
- OpenSearchScrollRequest -->>- OpenSearchPagedIndexScan : serialized request
+ OpenSearchIndexScan ->>+ OpenSearchScrollRequest : writeTo
+ OpenSearchScrollRequest -->>- OpenSearchIndexScan : serialized request
else Subsequent page
- OpenSearchPagedIndexScan ->>+ ContinuePageRequest : toCursor
- ContinuePageRequest -->>- OpenSearchPagedIndexScan : serialized request
+ OpenSearchIndexScan ->>+ OpenSearchScrollRequest : writeTo
+ OpenSearchScrollRequest -->>- OpenSearchIndexScan : serialized request
end
- Note over OpenSearchPagedIndexScan : dump private fields
- OpenSearchPagedIndexScan -->>- ProjectOperator : serialized
+ Note over OpenSearchIndexScan : dump private fields
+ OpenSearchIndexScan -->>- ProjectOperator : serialized
ProjectOperator -->>- PlanSerializer : serialized
Note over PlanSerializer : Zip to reduce size
```
#### Deserialization
-Deserialization restores previously serialized Physical Plan tree. The recovered tree is ready to execute and should return the next page of the search response. To complete the tree restoration, SQL engine should build a new request to the OpenSearch node. This request doesn't contain a search query, but it contains a search context reference -- `scrollID`. To create a new `ContinuePageRequest` object it is require to access to the instance of `OpenSearchStorageEngine`. `OpenSearchStorageEngine` can't be serialized and it exists as a singleton in the SQL plugin engine. `PlanSerializer` creates a customized deserialization binary object stream -- `CursorDeserializationStream`. This stream provides an interface to access the `OpenSearchStorageEngine` object.
+Deserialization restores previously serialized Physical Query Plan. The recovered plan is ready to execute and returns the next page of the search response. To complete the query plan restoration, SQL engine will build a new request to the OpenSearch node. This request doesn't contain a search query, but it contains a search context reference — `scrollID`. To create a new `OpenSearchScrollRequest` object it requires access to the instance of `OpenSearchStorageEngine`. Note: `OpenSearchStorageEngine` can't be serialized, and it exists as a singleton in the SQL plugin engine. `PlanSerializer` creates a customized deserialization binary object stream — `CursorDeserializationStream`. This stream provides an interface to access the `OpenSearchStorageEngine` object.
```mermaid
sequenceDiagram
participant PlanSerializer
participant CursorDeserializationStream
participant ProjectOperator
- participant OpenSearchPagedIndexScan
- participant ContinuePageRequest
+ participant OpenSearchIndexScan
+ participant OpenSearchScrollRequest
Note over PlanSerializer : Unzip
Note over PlanSerializer : Validate cursor integrity
@@ -692,14 +656,14 @@ PlanSerializer ->>+ CursorDeserializationStream : deserialize
Note over ProjectOperator: load private fields
ProjectOperator -->> CursorDeserializationStream : deserialize input
activate CursorDeserializationStream
- CursorDeserializationStream ->>+ OpenSearchPagedIndexScan : create new
+ CursorDeserializationStream ->>+ OpenSearchIndexScan : create new
deactivate CursorDeserializationStream
- OpenSearchPagedIndexScan -->>+ CursorDeserializationStream : resolve engine
- CursorDeserializationStream ->>- OpenSearchPagedIndexScan : OpenSearchStorageEngine
- Note over OpenSearchPagedIndexScan : load private fields
- OpenSearchPagedIndexScan ->>+ ContinuePageRequest : create new
- ContinuePageRequest -->>- OpenSearchPagedIndexScan : created
- OpenSearchPagedIndexScan -->>- ProjectOperator : deserialized
+ OpenSearchIndexScan -->>+ CursorDeserializationStream : resolve engine
+ CursorDeserializationStream ->>- OpenSearchIndexScan : OpenSearchStorageEngine
+ Note over OpenSearchIndexScan : load private fields
+ OpenSearchIndexScan ->>+ OpenSearchScrollRequest : create new
+ OpenSearchScrollRequest -->>- OpenSearchIndexScan : created
+ OpenSearchIndexScan -->>- ProjectOperator : deserialized
ProjectOperator -->>- PlanSerializer : deserialized
deactivate CursorDeserializationStream
```
@@ -723,15 +687,15 @@ sequenceDiagram
participant OpenSearchExecutionEngine
participant ProjectOperator
participant ResourceMonitorPlan
- participant OpenSearchPagedIndexScan
+ participant OpenSearchIndexScan
OpenSearchExecutionEngine ->>+ ProjectOperator: getTotalHits
Note over ProjectOperator: default implementation
ProjectOperator ->>+ ResourceMonitorPlan: getTotalHits
Note over ResourceMonitorPlan: call to delegate
- ResourceMonitorPlan ->>+ OpenSearchPagedIndexScan: getTotalHits
- Note over OpenSearchPagedIndexScan: use stored value from the search response
- OpenSearchPagedIndexScan -->>- ResourceMonitorPlan: value
+ ResourceMonitorPlan ->>+ OpenSearchIndexScan: getTotalHits
+ Note over OpenSearchIndexScan: use stored value from the search response
+ OpenSearchIndexScan -->>- ResourceMonitorPlan: value
ResourceMonitorPlan -->>- ProjectOperator: value
ProjectOperator -->>- OpenSearchExecutionEngine: value
```