From cb7da205a3558eeb4f4848d178ade62a2bf17018 Mon Sep 17 00:00:00 2001
From: MaxKsyunz <maxk@bitquilltech.com>
Date: Fri, 19 May 2023 03:28:05 -0700
Subject: [PATCH 1/4] Update design document to reflact refactor.

Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
---
 docs/dev/Pagination-v2.md | 300 ++++++++++++++++----------------------
 1 file changed, 125 insertions(+), 175 deletions(-)
diff --git a/docs/dev/Pagination-v2.md b/docs/dev/Pagination-v2.md
index 77694c25109..05008edfcb1 100644
--- a/docs/dev/Pagination-v2.md
+++ b/docs/dev/Pagination-v2.md
@@ -7,7 +7,7 @@ A cursor is a SQL abstraction for pagination. A client can open a cursor, retrie
 Currently, SQL plugin does not provide SQL cursor syntax. However, the SQL REST endpoint can return result a page at a time. This feature is used by JDBC and ODBC drivers.
 
 # Scope
-Currenty, V2 engine supports pagination only for simple `SELECT * FROM <table>` queries without any other clauses like `WHERE` or `ORDER BY`.
+At this time, V2 engine supports pagination only for simple `SELECT * FROM <table>` queries without any other clauses like `WHERE` or `ORDER BY`.
 
 # Demo
 https://user-images.githubusercontent.com/88679692/224208630-8d38d833-abf8-4035-8d15-d5fb4382deca.mp4
@@ -15,7 +15,7 @@ https://user-images.githubusercontent.com/88679692/224208630-8d38d833-abf8-4035-
 # REST API
 ## Initial Query Request
 
-Initial query request contains the search request and page size. Search query to OpenSearch is built during processing of this request. Neither the query nor page size can be change while scrolling through pages based on this request.
+Initial query request contains the search request and page size. Search query to OpenSearch is built during processing of this request. Neither the query nor page size can be changed while scrolling through pages based on this request.
 The only difference between paged and non-paged requests is `fetch_size` parameter supplied in paged request.
 
 ```json
@@ -79,7 +79,7 @@ Keep alive timeout is [configurable](../user/admin/settings.rst#plugins.sql.curs
 
 ## Error Response
 
-The client will receive an error response if any of the above REST calls result in an server-side error.
+The client will receive an error response if any of the above REST calls result in a server-side error.
 
 The response object has the following format:
 ```json
@@ -98,7 +98,7 @@ The response object has the following format:
 
 ## OpenSearch Data Retrieval Strategy
 
-OpenSearch provides several data retrival APIs that are optimized for different use cases.
+OpenSearch provides several data retrieval APIs that are optimized for different use cases.
 
 At this time, SQL plugin uses simple search API and scroll API.
 
@@ -106,51 +106,30 @@ Simple retrieval API returns at most `max_result_window` number of documents.  `
 
 Scroll API requests returns all documents but can incur high memory costs on OpenSearch coordination node.
 
-Efficient implementation of pagination needs to be aware of retrival API used. Each retrieval strategy will be considered separately.
+Efficient implementation of pagination needs to be aware of retrieval API used. Each retrieval strategy will be considered separately.
 
 The discussion below uses *under max_result_window* to refer to scenarios that can be implemented with simple retrieval API and *over max_result_window* for scenarios that require scroll API to implement.
 
 ## SQL Node Load Balancing
 
-V2 SQL engine supports *sql node load balancing* -- a cursor request can be routed to any SQL node in a cluster. This is achieved by encoding all data necessary to retrieve the next page in the `cursor_id`.
+V2 SQL engine supports *sql node load balancing* &mdash; a cursor request can be routed to any SQL node in a cluster. This is achieved by encoding all data necessary to retrieve the next page in the `cursor_id`.
 
 ## Feature Design
 
-### Plan Tree changes
+### Query Plan Changes
 
-Different plan trees are built during request processing. Read more about their purpose and stages [here](query-optimizer-improvement.md#Examples). The section below describes the changes being introduced to these trees by the pagination feature.
+All three kinds of requests &mdash; non-paged, initial page, or subsequent page  &mdash; are processed in the same way. Simplified workflow of query plan processing is shown below for reference.
 
-Simplified workflow of plan trees is shown below. Initial Page Request is processed the same way as a non-paging request.
-
-```mermaid
-stateDiagram-v2
-  state "Non Paged Request" as NonPaged {
-    direction LR
-    state "Parse Tree" as Parse
-    state "Unresolved Plan Tree" as Unresolved
-    state "Abstract Plan Tree" as Abstract
-    state "Logical Plan Tree" as Logical
-    state "Optimized Logical Plan Tree" as Optimized
-    state "Physical Plan Tree" as Physical
-
-    [*] --> Parse : ANTLR
-    Parse --> Unresolved : AstBuilder
-    Unresolved --> Abstract : QueryPlanner
-    Abstract --> Logical : Planner
-    Logical --> Optimized : Optimizer
-    Optimized --> Physical : Implementor
-  }
-```
 ```mermaid
 stateDiagram-v2
-  state "Initial Page Request" as Paged {
+  state "Request" as NonPaged {
     direction LR
     state "Parse Tree" as Parse
-    state "Unresolved Plan Tree" as Unresolved
-    state "Abstract Plan Tree" as Abstract
-    state "Logical Plan Tree" as Logical
-    state "Optimized Logical Plan Tree" as Optimized
-    state "Physical Plan Tree" as Physical
+    state "Unresolved Query Plan" as Unresolved
+    state "Abstract Query Plan" as Abstract
+    state "Logical Query Plan" as Logical
+    state "Optimized Query Plan" as Optimized
+    state "Physical Query Plan" as Physical
 
     [*] --> Parse : ANTLR
     Parse --> Unresolved : AstBuilder
@@ -160,23 +139,13 @@ stateDiagram-v2
     Optimized --> Physical : Implementor
   }
 ```
-```mermaid
-stateDiagram-v2
-  state "Subsequent Page Request" as Paged {
-    direction LR
-    state "Abstract Plan Tree" as Abstract
-    state "Physical Plan Tree" as Physical
 
-    [*] --> Abstract : QueryPlanner
-    Abstract --> Physical : Deserializer
-  }
-```
 
-New plan tree workflow was added for Subsequent Page Requests. Since a final Physical Plan tree was already created for Initial request, subsequent requests should have the same tree. The tree is serialized into a `cursor` by `PlanSerializer` to be de-serialized on the subsequence Page Request. Query parsing and analysis is not performed for Subsequent Page Requests, since the Physical Plan tree is instead de-serialized from the `cursor`.
+#### Unresolved Query Plan
 
-#### Abstract Plan tree
+Unresolved Query Plan for non-paged requests remains unchanged. 
 
-Abstract Plan Tree for non-paged requests remains unchanged. The `QueryPlan`, as a top entry, has a new optional field `pageSize`, which is not defined for non-paged requests.
+To support initial page requests, the `QueryPlan` class has a new optional field `pageSize`.
 
 ```mermaid
 classDiagram
@@ -187,15 +156,15 @@ classDiagram
     -UnresolvedPlan plan
     -QueryService queryService
   }
-  class UnresolvedPlanTree {
+  class UnresolvedQueryPlan {
     <<UnresolvedPlan>>
   }
-  QueryPlan --* UnresolvedPlanTree
+  QueryPlan --* UnresolvedQueryPlan
 ```
 
-Abstract plan tree for Initial Query Request has following changes:
-1. New Plan node -- `Paginate` -- added into the tree.
-2. `pageSize` parameter in `QueryPlan` is set, and `Paginate` is being added. It is converted to `LogicalPaginate` later.
+When `QueryPlanFactory.create` is passed initial query request, it:
+1. Adds an instance of `Paginate` unresolved plan as the root of the unresolved query plan. 
+2. Sets `pageSize` parameter in `QueryPlan`.
 
 ```mermaid
 classDiagram
@@ -211,31 +180,33 @@ classDiagram
     -int pageSize
     -UnresolvedPlan child
   }
-  class UnresolvedPlanTree {
+  class UnresolvedQueryPlan {
     <<UnresolvedPlan>>
   }
   QueryPlan --* Paginate
-  Paginate --* UnresolvedPlanTree
+  Paginate --* UnresolvedQueryPlan
 ```
 
-Non-paging requests have the same plan tree, but `pageSize` value in `QueryPlan` is unset.
-
-Abstract plan tree for Subsequent Query Request (for second and further pages) contains only one node -- `ContinuePaginatedPlan`.
+When `QueryPlanFactory.create` is passed a subsequent query request, it:
+1. Creates an instance of `Cursor` unresolved plan as the sole node in the unresolved query plan.
 
 ```mermaid
 classDiagram
-  direction LR
-  class ContinuePaginatedPlan {
-    <<AbstractPlan>>
-    -String cursor
-    -PlanSerializer planSerializer
-    -QueryService queryService
-  }
+    direction LR
+    class QueryPlan {
+        <<AbstractPlan>>
+        -Optional~int~ pageSize
+        -UnresolvedPlan plan
+        -QueryService queryService
+    }
+    class Cursor {
+        <<UnresolvedPlan>>
+        -String cursorId
+    }
+    QueryPlan --* Cursor
 ```
 
-`ContinuePaginatedPlan` translated to entire Physical Plan Tree by `PlanSerializer` on cursor deserialization. It bypasses Logical Plan tree stage, `Planner`, `Optimizer` and `Implementor`.
-
-The examples below show Abstract Plan Tree for the same query in different request types:
+The examples below show Abstract Query Plan for the same query in different request types:
 
 ```mermaid
 stateDiagram-v2
@@ -271,14 +242,16 @@ stateDiagram-v2
   }
 
   state "Subsequent Query Request" As Sub {
-    ContinuePaginatedPlan
+    Cursor
   }
 ```
 
-#### Logical Plan tree
+#### Logical Query Plan
+
+There are no changes for non-paging requests.
 
-Changes to plan tree for Initial Query Request with pagination:
-1. `LogicalPaginate` is added to the top of the tree. It stores information about paging/scrolling should be done in a private field `pageSize` being pushed down in the `Optimizer`.
+Changes to logical query plan to support initial pagination request:
+1. `LogicalPaginate` is added to the top of the tree. It stores information about paging should be done in a private field `pageSize` being pushed down in the `Optimizer`.
 
 ```mermaid
 classDiagram
@@ -287,33 +260,32 @@ classDiagram
     <<LogicalPlan>>
     int pageSize
   }
-  class LogicalPlanTree {
+  class LogicalQueryPlan {
     <<LogicalPlan>>
   }
   class LogicalRelation {
     <<LogicalPlan>>
   }
-  LogicalPaginate --* LogicalPlanTree
-  LogicalPlanTree --* LogicalRelation
+  LogicalPaginate --* LogicalQueryPlan
+  LogicalQueryPlan --* LogicalRelation
 ```
 
-There are no changes for non-paging requests.
+For subsequent page requests, `Cursor` unresolved plan is mapped to `LogicalCursor` logical plan.
 
 ```mermaid
 classDiagram
   direction LR
-  class LogicalPlanTree {
+  class LogicalQueryPlan {
     <<LogicalPlan>>
   }
-  class LogicalRelation {
+  class LogicalCursor {
     <<LogicalPlan>>
+    -String cursorId
   }
-  LogicalPlanTree --* LogicalRelation
+  LogicalQueryPlan --* LogicalCursor
 ```
 
-Note: This step is not executed for Subsequent Query Request.
-
-The examples below show Logical Plan Tree for the same query in different request types:
+The examples below show logical query plan for the same query in different request types:
 
 ```mermaid
 stateDiagram-v2
@@ -343,32 +315,25 @@ stateDiagram-v2
     FilterIP --> AggregationIP
     AggregationIP --> RelationIP
   }
-```
 
-#### Optimized Logical Plan tree
+state "Subsequent Query Request" As Sub {
+Cursor
+}
+```
 
-Changes:
-1. For pagination request, a `OpenSearchPagedIndexScanBuilder` is inserted to the bottom of the tree instead of `OpenSearchIndexScanQueryBuilder`. Both are instances of `TableScanBuilder` which extends `LogicalPlan` interface.
-2. `LogicalPaginate` is removed from the tree during push down operation in `Optimizer`.
 
-See [article about `TableScanBuilder`](query-optimizer-improvement.md#TableScanBuilder) for more details.
+#### Optimized Logical Query Plan
 
-```mermaid
-classDiagram
-  direction LR
-  class LogicalProject {
-    <<LogicalPlan>>
-  }
-  class OpenSearchPagedIndexScanBuilder {
-    <<TableScanBuilder>>
-  }
+Pagination is implemented by push down to OpenSearch. The following is only relevant for
+initial paged requests. Non-paged request optimization was not changed and there is no optimization
+to be done for subsequent page query plans.
 
-  LogicalProject --* OpenSearchPagedIndexScanBuilder
-```
+Push down logical is implemented in  `OpenSearchIndexScanQueryBuilder.pushDownPageSize` method.
+This method is called by `PushDownPageSize` rule during plan optimization.  `LogicalPaginate` is removed from the query plan during push down operation in `Optimizer`.
 
-Note: No Logical Plan tree created for Subsequent Query Request.
+See [article about `TableScanBuilder`](query-optimizer-improvement.md#TableScanBuilder) for more details.
 
-The examples below show optimized Logical Plan Tree for the same query in different request types:
+The examples below show optimized Logical Query Plan for the same query in different request types:
 
 ```mermaid
 stateDiagram-v2
@@ -383,38 +348,18 @@ stateDiagram-v2
     SortNP --> RelationNP
   }
 
-  state "Initial Paged Request" as Paged {
-    state "LogicalProject" as ProjectIP
-    state "LogicalLimit" as LimitIP
-    state "LogicalSort" as SortIP
-    state "OpenSearchPagedIndexScanBuilder" as RelationIP
-
-    ProjectIP --> LimitIP
-    LimitIP --> SortIP
-    SortIP --> RelationIP
-  }
 ```
 
-#### Physical Plan tree
+#### Physical Query Plan and Execution
 
 Changes:
-1. `OpenSearchPagedIndexScanBuilder` is converted to `OpenSearchPagedIndexScan` by `Implementor`.
-2. Entire Physical Plan tree is created by `PlanSerializer` for Subsequent Query requests. The deserialized tree has the same structure as the Initial Query Request.
+1. `OpenSearchIndexScanBuilder` is converted to `OpenSearchIndexScan` by `Implementor`.
+2. `LogicalPlan.pageSize` is mapped to `OpenSearchIndexScan.maxResponseSize`. This is the limit to  the number of elements in a response.
+2. Entire Physical Query Plan is created by `PlanSerializer` for Subsequent Query requests. The deserialized plan has the same structure as the Initial Query Request.
+3. Implemented serialization and deserialization for `OpenSearchScrollRequest`.
 
-```mermaid
-classDiagram
-  direction LR
-  class ProjectOperator {
-    <<PhysicalPlan>>
-  }
-  class OpenSearchPagedIndexScan {
-    <<TableScanOperator>>
-  }
 
-  ProjectOperator --* OpenSearchPagedIndexScan
-```
-
-The examples below show Physical Plan Tree for the same query in different request types:
+The examples below show physical query plan for the same query in different request types:
 
 ```mermaid
 stateDiagram-v2
@@ -423,32 +368,38 @@ stateDiagram-v2
     state "LimitOperator" as LimitNP
     state "SortOperator" as SortNP
     state "OpenSearchIndexScan" as RelationNP
+    state "OpenSearchQueryRequest" as QRequestNP
 
     ProjectNP --> LimitNP
     LimitNP --> SortNP
     SortNP --> RelationNP
+    RelationNP --> QRequestNP
   }
 
   state "Initial Query Request" as Paged {
     state "ProjectOperator" as ProjectIP
     state "LimitOperator" as LimitIP
     state "SortOperator" as SortIP
-    state "OpenSearchPagedIndexScan" as RelationIP
+    state "OpenSearchIndexScan" as RelationIP
+    state "OpenSearchQueryRequest" as QRequestIP
 
     ProjectIP --> LimitIP
     LimitIP --> SortIP
     SortIP --> RelationIP
+    RelationIP --> QRequestIP
   }
 
   state "Subsequent Query Request" As Sub {
     state "ProjectOperator" as ProjectSP
     state "LimitOperator" as LimitSP
     state "SortOperator" as SortSP
-    state "OpenSearchPagedIndexScan" as RelationSP
+    state "OpenSearchIndexScan" as RelationSP
+    state "OpenSearchScrollRequest" as RequestSP
 
     ProjectSP --> LimitSP
     LimitSP --> SortSP
     SortSP --> RelationSP
+    RelationSP --> RequestSP
   }
 ```
 
@@ -474,7 +425,7 @@ SQLService ->>+ QueryPlanFactory: execute
     QueryService ->>+ Planner: optimize
       Planner ->>+ CreateTableScanBuilder: apply
         CreateTableScanBuilder -->>- Planner: index scan
-      Planner -->>- QueryService: Logical Plan Tree
+      Planner -->>- QueryService: Logical Query Plan
     QueryService ->>+ OpenSearchExecutionEngine: execute
       OpenSearchExecutionEngine -->>- QueryService: execution completed
     QueryService -->>- QueryPlanFactory: execution completed
@@ -485,9 +436,8 @@ SQLService ->>+ QueryPlanFactory: execute
 
 Processing of an Initial Query Request has few extra steps comparing versus processing a regular Query Request:
 1. Query validation with `CanPaginateVisitor`. This is required to validate whether incoming query can be paged. This also activate legacy engine fallback mechanism.
-2. Creating a paged index scan with `CreatePagingTableScanBuilder` `Optimizer` rule. A Regular Query Request triggers `CreateTableScanBuilder` rule instead.
-3. `Serialization` is performed by `PlanSerializer` - it converts Physical Plan Tree into a cursor, which could be used query a next page.
-4. Traversal of Physical Plan Tree to get total hits, which is required to properly fill response to a user.
+2. `Serialization` is performed by `PlanSerializer` - it converts Physical Query Plan into a cursor, which could be used query a next page.
+3Traversal of Physical Query Plan to get total hits, which is required to properly fill response to a user.
 
 ```mermaid
 sequenceDiagram
@@ -496,7 +446,7 @@ sequenceDiagram
     participant CanPaginateVisitor
     participant QueryService
     participant Planner
-    participant CreatePagingTableScanBuilder
+    participant CreatePagingScanBuilder
     participant OpenSearchExecutionEngine
     participant PlanSerializer
 
@@ -508,10 +458,10 @@ SQLService ->>+ QueryPlanFactory : execute
   QueryPlanFactory ->>+ QueryService : execute
     QueryService ->>+ Planner : optimize
       rect rgb(91, 123, 155)
-      Planner ->>+ CreatePagingTableScanBuilder : apply
-        CreatePagingTableScanBuilder -->>- Planner : paged index scan
+      Planner ->>+ CreateTableScanBuilder : apply
+        CreateTableScanBuilder -->>- Planner : paged index scan
       end
-      Planner -->>- QueryService : Logical Plan Tree
+      Planner -->>- QueryService : Logical Query Plan
     QueryService ->>+ OpenSearchExecutionEngine : execute
       rect rgb(91, 123, 155)
       Note over OpenSearchExecutionEngine, PlanSerializer : Serialization
@@ -529,10 +479,10 @@ SQLService ->>+ QueryPlanFactory : execute
 #### Subsequent Query Request
 
 Subsequent pages are processed by a new workflow. The key point there:
-1. `Deserialization` is performed by `PlanSerializer` to restore entire Physical Plan Tree encoded into the cursor.
-2. Since query already contains the Physical Plan Tree, all tree processing steps are skipped.
-3. `Serialization` is performed by `PlanSerializer` - it converts Physical Plan Tree into a cursor, which could be used query a next page.
-4. Traversal of Physical Plan Tree to get total hits, which is required to properly fill response to a user.
+1. `Deserialization` is performed by `PlanSerializer` to restore entire Physical Query Plan encoded into the cursor.
+2. Since query already contains the Physical Query Plan, analysis and optimization steps are no-ops.
+3. `Serialization` is performed by `PlanSerializer` - it converts Physical Query Plan into a cursor, which could be used query a next page.
+4. Traversal of Physical Query Plan to get total hits, which is required to properly fill response to a user.
 
 ```mermaid
 sequenceDiagram
@@ -547,7 +497,7 @@ SQLService ->>+ QueryPlanFactory : execute
     rect rgb(91, 123, 155)
     note over QueryService, PlanSerializer : Deserialization
     QueryService ->>+ PlanSerializer: convertToPlan
-      PlanSerializer -->>- QueryService: Physical plan tree
+      PlanSerializer -->>- QueryService: Physical Query Plan
     end
     Note over QueryService : Planner, Optimizer and Implementor<br />are skipped
     QueryService ->>+ OpenSearchExecutionEngine : execute
@@ -592,18 +542,18 @@ RestSQLQueryAction ->>+ SQLService : prepareRequest
 
 #### Serialization and Deserialization round trip
 
-The SQL engine should be able to completely recover the Physical Plan tree to continue its execution to get the next page. Serialization mechanism is responsible for recovering the plan tree. note: `ResourceMonitorPlan` isn't serialized, because a new object of this type would be created for the restored plan tree before execution. 
+The SQL engine should be able to completely recover the Physical Query Plan to continue its execution to get the next page. Serialization mechanism is responsible for recovering the query plan. note: `ResourceMonitorPlan` isn't serialized, because a new object of this type would be created for the restored query plan before execution. 
 Serialization and Deserialization are performed by Java object serialization API.
 
 ```mermaid
 stateDiagram-v2
     direction LR
-    state "Initial Query Request Plan Tree" as FirstPage
+    state "Initial Query Request Query Plan" as FirstPage
     state FirstPage {
         state "ProjectOperator" as logState1_1
         state "..." as logState1_2
         state "ResourceMonitorPlan" as logState1_3
-        state "OpenSearchPagedIndexScan" as logState1_4
+        state "OpenSearchIndexScan" as logState1_4
         state "OpenSearchScrollRequest" as logState1_5
         logState1_1 --> logState1_2
         logState1_2 --> logState1_3
@@ -611,24 +561,24 @@ stateDiagram-v2
         logState1_4 --> logState1_5
     }
 
-    state "Deserialized Plan Tree" as SecondPageTree
+    state "Deserialized Query Plan" as SecondPageTree
     state SecondPageTree {
         state "ProjectOperator" as logState2_1
         state "..." as logState2_2
-        state "OpenSearchPagedIndexScan" as logState2_3
-        state "ContinuePageRequest" as logState2_4
+        state "OpenSearchIndexScan" as logState2_3
+        state "OpenSearchScrollRequest" as logState2_4
         logState2_1 --> logState2_2
         logState2_2 --> logState2_3
         logState2_3 --> logState2_4
     }
 
-    state "Subsequent Query Request Plan Tree" as SecondPage
+    state "Subsequent Query Request Query Plan" as SecondPage
     state SecondPage {
         state "ProjectOperator" as logState3_1
         state "..." as logState3_2
         state "ResourceMonitorPlan" as logState3_3
-        state "OpenSearchPagedIndexScan" as logState3_4
-        state "ContinuePageRequest" as logState3_5
+        state "OpenSearchIndexScan" as logState3_4
+        state "OpenSearchScrollRequest" as logState3_5
         logState3_1 --> logState3_2
         logState3_2 --> logState3_3
         logState3_3 --> logState3_4
@@ -641,16 +591,16 @@ stateDiagram-v2
 
 #### Serialization
 
-All plan tree nodes which are supported by pagination should implement [`SerializablePlan`](https://github.com/opensearch-project/sql/blob/f40bb6d68241e76728737d88026e4c8b1e6b3b8b/core/src/main/java/org/opensearch/sql/planner/SerializablePlan.java) interface. `getPlanForSerialization` method of this interface allows serialization mechanism to skip a tree node from serialization. OpenSearch search request objects are not serialized, but search context provided by the OpenSearch cluster is extracted from them.
+All query plan nodes which are supported by pagination should implement [`SerializablePlan`](https://github.com/opensearch-project/sql/blob/f40bb6d68241e76728737d88026e4c8b1e6b3b8b/core/src/main/java/org/opensearch/sql/planner/SerializablePlan.java) interface. `getPlanForSerialization` method of this interface allows serialization mechanism to skip a tree node from serialization. OpenSearch search request objects are not serialized, but search context provided by the OpenSearch cluster is extracted from them.
 
 ```mermaid
 sequenceDiagram
     participant PlanSerializer
     participant ProjectOperator
     participant ResourceMonitorPlan
-    participant OpenSearchPagedIndexScan
+    participant OpenSearchIndexScan
+    participant OpenSearchScrollRequest
     participant OpenSearchScrollRequest
-    participant ContinuePageRequest
 
 PlanSerializer ->>+ ProjectOperator : getPlanForSerialization
   ProjectOperator -->>- PlanSerializer : this
@@ -659,31 +609,31 @@ PlanSerializer ->>+ ProjectOperator : serialize
   ProjectOperator ->>+ ResourceMonitorPlan : getPlanForSerialization
     ResourceMonitorPlan -->>- ProjectOperator : delegate
   Note over ResourceMonitorPlan : ResourceMonitorPlan<br />is not serialized
-  ProjectOperator ->>+ OpenSearchPagedIndexScan : serialize
+  ProjectOperator ->>+ OpenSearchIndexScan : writeExternal
     alt First page
-      OpenSearchPagedIndexScan ->>+ OpenSearchScrollRequest : toCursor
-        OpenSearchScrollRequest -->>- OpenSearchPagedIndexScan : serialized request
+      OpenSearchIndexScan ->>+ OpenSearchScrollRequest : writeTo
+        OpenSearchScrollRequest -->>- OpenSearchIndexScan : serialized request
     else Subsequent page
-      OpenSearchPagedIndexScan ->>+ ContinuePageRequest : toCursor
-        ContinuePageRequest -->>- OpenSearchPagedIndexScan : serialized request
+      OpenSearchIndexScan ->>+ OpenSearchScrollRequest : writeTo
+        OpenSearchScrollRequest -->>- OpenSearchIndexScan : serialized request
     end
-    Note over OpenSearchPagedIndexScan : dump private fields
-    OpenSearchPagedIndexScan -->>- ProjectOperator : serialized
+    Note over OpenSearchIndexScan : dump private fields
+    OpenSearchIndexScan -->>- ProjectOperator : serialized
   ProjectOperator -->>- PlanSerializer : serialized
 Note over PlanSerializer : Zip to reduce size
 ```
 
 #### Deserialization
 
-Deserialization restores previously serialized Physical Plan tree. The recovered tree is ready to execute and should return the next page of the search response. To complete the tree restoration, SQL engine should build a new request to the OpenSearch node. This request doesn't contain a search query, but it contains a search context reference -- `scrollID`. To create a new `ContinuePageRequest` object it is require to access to the instance of `OpenSearchStorageEngine`. `OpenSearchStorageEngine` can't be serialized and it exists as a singleton in the SQL plugin engine. `PlanSerializer` creates a customized deserialization binary object stream -- `CursorDeserializationStream`. This stream provides an interface to access the `OpenSearchStorageEngine` object.
+Deserialization restores previously serialized Physical Query Plan. The recovered plan is ready to execute and should return the next page of the search response. To complete the query plan restoration, SQL engine should build a new request to the OpenSearch node. This request doesn't contain a search query, but it contains a search context reference &mdash; `scrollID`. To create a new `OpenSearchScrollRequest` object it is required to access to the instance of `OpenSearchStorageEngine`. `OpenSearchStorageEngine` can't be serialized, and it exists as a singleton in the SQL plugin engine. `PlanSerializer` creates a customized deserialization binary object stream &mdash; `CursorDeserializationStream`. This stream provides an interface to access the `OpenSearchStorageEngine` object.
 
 ```mermaid
 sequenceDiagram
     participant PlanSerializer
     participant CursorDeserializationStream
     participant ProjectOperator
-    participant OpenSearchPagedIndexScan
-    participant ContinuePageRequest
+    participant OpenSearchIndexScan
+    participant OpenSearchScrollRequest
 
 Note over PlanSerializer : Unzip
 Note over PlanSerializer : Validate cursor integrity
@@ -692,14 +642,14 @@ PlanSerializer ->>+ CursorDeserializationStream : deserialize
     Note over ProjectOperator: load private fields
     ProjectOperator -->> CursorDeserializationStream : deserialize input
   activate CursorDeserializationStream
-  CursorDeserializationStream ->>+ OpenSearchPagedIndexScan : create new
+  CursorDeserializationStream ->>+ OpenSearchIndexScan : create new
   deactivate CursorDeserializationStream
-    OpenSearchPagedIndexScan -->>+ CursorDeserializationStream : resolve engine
-  CursorDeserializationStream ->>- OpenSearchPagedIndexScan : OpenSearchStorageEngine
-    Note over OpenSearchPagedIndexScan : load private fields
-    OpenSearchPagedIndexScan ->>+ ContinuePageRequest : create new
-      ContinuePageRequest -->>- OpenSearchPagedIndexScan : created
-    OpenSearchPagedIndexScan -->>- ProjectOperator : deserialized
+    OpenSearchIndexScan -->>+ CursorDeserializationStream : resolve engine
+  CursorDeserializationStream ->>- OpenSearchIndexScan : OpenSearchStorageEngine
+    Note over OpenSearchIndexScan : load private fields
+    OpenSearchIndexScan ->>+ OpenSearchScrollRequest : create new
+      OpenSearchScrollRequest -->>- OpenSearchIndexScan : created
+    OpenSearchIndexScan -->>- ProjectOperator : deserialized
   ProjectOperator -->>- PlanSerializer : deserialized
   deactivate CursorDeserializationStream
 ```
@@ -723,15 +673,15 @@ sequenceDiagram
     participant OpenSearchExecutionEngine
     participant ProjectOperator
     participant ResourceMonitorPlan
-    participant OpenSearchPagedIndexScan
+    participant OpenSearchIndexScan
 
 OpenSearchExecutionEngine ->>+ ProjectOperator: getTotalHits
   Note over ProjectOperator: default implementation
   ProjectOperator ->>+ ResourceMonitorPlan: getTotalHits
     Note over ResourceMonitorPlan: call to delegate
-    ResourceMonitorPlan ->>+ OpenSearchPagedIndexScan: getTotalHits
-      Note over OpenSearchPagedIndexScan: use stored value from the search response
-      OpenSearchPagedIndexScan -->>- ResourceMonitorPlan: value
+    ResourceMonitorPlan ->>+ OpenSearchIndexScan: getTotalHits
+      Note over OpenSearchIndexScan: use stored value from the search response
+      OpenSearchIndexScan -->>- ResourceMonitorPlan: value
     ResourceMonitorPlan -->>- ProjectOperator: value
   ProjectOperator -->>- OpenSearchExecutionEngine: value
 ```

From 5eac1b075d153368b436c9c41f30b5ed95847011 Mon Sep 17 00:00:00 2001
From: MaxKsyunz <maxk@bitquilltech.com>
Date: Fri, 19 May 2023 14:54:25 -0700
Subject: [PATCH 2/4] Update design document to reflect refactor.

Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
---
 docs/dev/Pagination-v2.md | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/docs/dev/Pagination-v2.md b/docs/dev/Pagination-v2.md
index 05008edfcb1..7418c16f922 100644
--- a/docs/dev/Pagination-v2.md
+++ b/docs/dev/Pagination-v2.md
@@ -62,10 +62,7 @@ Similarly to v1 engine, the response object is the same as initial response if t
 `cursor_id` will be different with each request.
 
 ## End of scrolling/paging
-
-When scrolling is finished, SQL plugin returns a final cursor. This cursor leads to an empty page, which has no cursor and no data hits. Receiving that page means all data was properly queried, and the scrolling cursor has been closed.
-
-The client will receive an [error response](#error-response) if executing this request results in an OpenSearch or SQL plug-in error.
+The last page in a response will not have a cursor id property.
 
 ## Cursor Keep Alive Timeout
 
@@ -112,9 +109,25 @@ The discussion below uses *under max_result_window* to refer to scenarios that c
 
 ## SQL Node Load Balancing
 
-V2 SQL engine supports *sql node load balancing* &mdash; a cursor request can be routed to any SQL node in a cluster. This is achieved by encoding all data necessary to retrieve the next page in the `cursor_id`.
+V2 SQL engine supports *sql node load balancing* &mdash; a cursor request can be routed to any SQL node in a cluster. This is achieved by encoding all data necessary to retrieve the next page in the `cursor_id` property in the response.
 
 ## Feature Design
+To support pagination, v2 SQL engine needs to:
+1. in REST front-end:
+    1. Route supported paginated query to v2 engine for
+        1. Initial requests,
+        2. Next page requests.
+    2. Fallback to v1 engine for queries not supported by v2 engine.
+    3. Create correct JSON response from execution of paginated physical plan by v2 engine.
+2. during query planning:
+    1. Differentiate between paginated and normal query plans.
+    2. Push down pagination to table scan.
+    3. Create paginated physical plan from cursor id.
+3. during query execution:
+    1. Execute paginated physical plan.
+4. in OpenSearch data source: 
+    1. Support pagination push down.
+    2. Support other push down optimizations with pagination.
 
 ### Query Plan Changes
 

From 8fc92f7081fe848a40196bf0f772cd1662a80077 Mon Sep 17 00:00:00 2001
From: Max Ksyunz <maxk@bitquilltech.com>
Date: Tue, 23 May 2023 10:44:10 -0700
Subject: [PATCH 3/4] Incorporate review feedback

Co-authored-by: Andrew Carbonetto <andrewc@bitquilltech.com>
Signed-off-by: Max Ksyunz <maxk@bitquilltech.com>
---
 docs/dev/Pagination-v2.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/dev/Pagination-v2.md b/docs/dev/Pagination-v2.md
index 7418c16f922..a61818752c0 100644
--- a/docs/dev/Pagination-v2.md
+++ b/docs/dev/Pagination-v2.md
@@ -638,7 +638,7 @@ Note over PlanSerializer : Zip to reduce size
 
 #### Deserialization
 
-Deserialization restores previously serialized Physical Query Plan. The recovered plan is ready to execute and should return the next page of the search response. To complete the query plan restoration, SQL engine should build a new request to the OpenSearch node. This request doesn't contain a search query, but it contains a search context reference &mdash; `scrollID`. To create a new `OpenSearchScrollRequest` object it is required to access to the instance of `OpenSearchStorageEngine`. `OpenSearchStorageEngine` can't be serialized, and it exists as a singleton in the SQL plugin engine. `PlanSerializer` creates a customized deserialization binary object stream &mdash; `CursorDeserializationStream`. This stream provides an interface to access the `OpenSearchStorageEngine` object.
+Deserialization restores previously serialized Physical Query Plan. The recovered plan is ready to execute and returns the next page of the search response. To complete the query plan restoration, SQL engine will build a new request to the OpenSearch node. This request doesn't contain a search query, but it contains a search context reference &mdash; `scrollID`. To create a new `OpenSearchScrollRequest` object it requires access to the instance of `OpenSearchStorageEngine`. Note: `OpenSearchStorageEngine` can't be serialized, and it exists as a singleton in the SQL plugin engine. `PlanSerializer` creates a customized deserialization binary object stream &mdash; `CursorDeserializationStream`. This stream provides an interface to access the `OpenSearchStorageEngine` object.
 
 ```mermaid
 sequenceDiagram

From 5daad43a160a3a4c4f55d4e9763b029fb4138c79 Mon Sep 17 00:00:00 2001
From: MaxKsyunz <maxk@bitquilltech.com>
Date: Tue, 23 May 2023 12:16:37 -0700
Subject: [PATCH 4/4] Address review feedback.

Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
---
 docs/dev/Pagination-v2.md | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/docs/dev/Pagination-v2.md b/docs/dev/Pagination-v2.md
index a61818752c0..2416ae51de2 100644
--- a/docs/dev/Pagination-v2.md
+++ b/docs/dev/Pagination-v2.md
@@ -7,7 +7,8 @@ A cursor is a SQL abstraction for pagination. A client can open a cursor, retrie
 Currently, SQL plugin does not provide SQL cursor syntax. However, the SQL REST endpoint can return result a page at a time. This feature is used by JDBC and ODBC drivers.
 
 # Scope
-At this time, V2 engine supports pagination only for simple `SELECT * FROM <table>` queries without any other clauses like `WHERE` or `ORDER BY`.
+This document describes pagination in V2 sql engine for non-aggregate queries -- queries 
+without `GROUP BY` clause or use of window functions.
 
 # Demo
 https://user-images.githubusercontent.com/88679692/224208630-8d38d833-abf8-4035-8d15-d5fb4382deca.mp4
@@ -122,16 +123,16 @@ To support pagination, v2 SQL engine needs to:
 2. during query planning:
     1. Differentiate between paginated and normal query plans.
     2. Push down pagination to table scan.
-    3. Create paginated physical plan from cursor id.
+    3. Create a physical query plan from a cursor id.
 3. during query execution:
-    1. Execute paginated physical plan.
+   1. Serialize an executing query and generate a cursor id after returning `fetch_size` number of elements.
 4. in OpenSearch data source: 
     1. Support pagination push down.
     2. Support other push down optimizations with pagination.
 
 ### Query Plan Changes
 
-All three kinds of requests &mdash; non-paged, initial page, or subsequent page  &mdash; are processed in the same way. Simplified workflow of query plan processing is shown below for reference.
+All three kinds of query requests &mdash; non-paged, initial page, or subsequent page  &mdash; are processed in the same way. Simplified workflow of query plan processing is shown below for reference.
 
 ```mermaid
 stateDiagram-v2
@@ -158,7 +159,7 @@ stateDiagram-v2
 
 Unresolved Query Plan for non-paged requests remains unchanged. 
 
-To support initial page requests, the `QueryPlan` class has a new optional field `pageSize`.
+To support initial query requests, the `QueryPlan` class has a new optional field `pageSize`.
 
 ```mermaid
 classDiagram
@@ -176,7 +177,7 @@ classDiagram
 ```
 
 When `QueryPlanFactory.create` is passed initial query request, it:
-1. Adds an instance of `Paginate` unresolved plan as the root of the unresolved query plan. 
+1. Adds an instance of `Paginate` unresolved plan as the root of the unresolved query plan.
 2. Sets `pageSize` parameter in `QueryPlan`.
 
 ```mermaid
@@ -263,7 +264,7 @@ stateDiagram-v2
 
 There are no changes for non-paging requests.
 
-Changes to logical query plan to support initial pagination request:
+Changes to logical query plan to support Initial Query Request:
 1. `LogicalPaginate` is added to the top of the tree. It stores information about paging should be done in a private field `pageSize` being pushed down in the `Optimizer`.
 
 ```mermaid
@@ -450,7 +451,7 @@ SQLService ->>+ QueryPlanFactory: execute
 Processing of an Initial Query Request has few extra steps comparing versus processing a regular Query Request:
 1. Query validation with `CanPaginateVisitor`. This is required to validate whether incoming query can be paged. This also activate legacy engine fallback mechanism.
 2. `Serialization` is performed by `PlanSerializer` - it converts Physical Query Plan into a cursor, which could be used query a next page.
-3Traversal of Physical Query Plan to get total hits, which is required to properly fill response to a user.
+
 
 ```mermaid
 sequenceDiagram