feat: implement RESTCatalog with database and table CRUD #160
discivigour wants to merge 8 commits into apache:main from
luoyuxia left a comment
@discivigour Thanks for the PR. I left some comments, PTAL.
    ///
    /// The mock server returns table metadata pointing to Spark-provisioned data on disk.
    #[tokio::test]
    async fn test_rest_catalog_read_append_table() {
Is there any difference between the log table and the DV (deletion-vector) table as far as the REST catalog is concerned?
If not, I think we only need to keep one test.
Also, I think you can reuse scan_and_read_with_projection: pass a Catalog trait object to it, and it should work for both the filesystem catalog and the REST catalog.
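The suggestion above can be sketched roughly as follows. This is only an illustration, not the crate's code: the `Catalog` trait here is a hypothetical, much-simplified stand-in, `FakeFsCatalog` and `FakeRestCatalog` are made-up fakes, and `read_with_projection` is an assumed name rather than the real `scan_and_read_with_projection` signature.

```rust
/// Hypothetical, much-simplified stand-in for the crate's `Catalog` trait.
trait Catalog {
    /// Column names of `table`, or `None` if the table does not exist.
    fn table_columns(&self, table: &str) -> Option<Vec<String>>;
}

/// Fake catalogs standing in for the filesystem and REST implementations.
struct FakeFsCatalog;
struct FakeRestCatalog;

/// Shared fixture: both fakes expose a single table `t` with two columns.
fn demo_columns(table: &str) -> Option<Vec<String>> {
    (table == "t").then(|| vec!["id".to_string(), "name".to_string()])
}

impl Catalog for FakeFsCatalog {
    fn table_columns(&self, table: &str) -> Option<Vec<String>> {
        demo_columns(table)
    }
}

impl Catalog for FakeRestCatalog {
    fn table_columns(&self, table: &str) -> Option<Vec<String>> {
        demo_columns(table)
    }
}

/// Shared test helper: it depends only on the trait, so one body serves
/// every catalog implementation.
fn read_with_projection(catalog: &dyn Catalog, table: &str, projection: &[&str]) -> Vec<String> {
    catalog
        .table_columns(table)
        .unwrap_or_default()
        .into_iter()
        .filter(|col| projection.contains(&col.as_str()))
        .collect()
}

fn main() {
    let catalogs: Vec<Box<dyn Catalog>> = vec![Box::new(FakeFsCatalog), Box::new(FakeRestCatalog)];
    for catalog in &catalogs {
        let cols = read_with_projection(catalog.as_ref(), "t", &["id"]);
        assert_eq!(cols, vec!["id".to_string()]);
    }
}
```

With a helper shaped like this, the REST test only has to construct a different catalog and can share the assertions with the filesystem test.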
    arrow-array = { workspace = true }
    tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
    futures = "0.3"
    serde_json = "1"
No, 'mod mock_server' needs it. But I will move it to dev-dependencies.
    }

    #[tokio::main]
    async fn main() {
Can these two examples be combined into a single example?
    pub struct DLFAuthProvider {
        uri: String,
    -   token: Option<DLFToken>,
    +   token: tokio::sync::Mutex<Option<DLFToken>>,
nit: use parking_lot::RwLock, i.e. RwLock<Option<DLFToken>>.
Thanks, but I think a mutex is better here because get_or_refresh_token() performs a read-then-write operation.
    let warehouse = options
        .get(CatalogOptions::WAREHOUSE)
        .cloned()
        .unwrap_or_default();
Should this return an error instead of falling back to the default?
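A minimal sketch of what returning an error could look like, assuming the options are a plain `HashMap<String, String>`. The `ConfigError` type, the `required_warehouse` helper, and the literal key `"warehouse"` are all hypothetical stand-ins; the real code would use `CatalogOptions::WAREHOUSE` and the crate's own error type.

```rust
use std::collections::HashMap;
use std::fmt;

/// Assumed key; a stand-in for `CatalogOptions::WAREHOUSE`.
const WAREHOUSE: &str = "warehouse";

/// Hypothetical error type; the real crate has its own error enum.
#[derive(Debug, PartialEq)]
struct ConfigError {
    key: &'static str,
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "missing required catalog option '{}'", self.key)
    }
}

/// Returns the configured warehouse, or an error when the option is
/// absent or empty, instead of silently defaulting to "".
fn required_warehouse(options: &HashMap<String, String>) -> Result<String, ConfigError> {
    options
        .get(WAREHOUSE)
        .filter(|w| !w.is_empty())
        .cloned()
        .ok_or(ConfigError { key: WAREHOUSE })
}

fn main() {
    let mut options = HashMap::new();
    assert!(required_warehouse(&options).is_err());

    options.insert(WAREHOUSE.to_string(), "oss://bucket/wh".to_string());
    assert_eq!(required_warehouse(&options), Ok("oss://bucket/wh".to_string()));
}
```

Failing early here surfaces a misconfigured catalog at construction time rather than as a confusing path error later on.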
    }

    // Refresh the token
    let new_token = self.refresh_token().await?;
Please don't hold the lock across .await points; it may well cause a deadlock.
    let api = match api_guard.as_ref() {
        Some(existing) => existing,
        None => {
            let new_api = RESTApi::new(self.catalog_options.clone(), false).await?;
Ditto: please don't hold the lock across .await points; it may well cause a deadlock.
    .context(IoUnexpectedSnafu {
        message: format!("Failed to list files in '{path}'"),
    })?;
    let entries = op.list_with(&list_path).await.context(IoUnexpectedSnafu {

    let list_path_normalized = list_path.trim_start_matches('/');
    for entry in entries {
        // opendal list_with includes the root directory itself as the first entry.

    let mut dirs = Vec::new();
    for status in statuses {
        if status.is_dir {
            // Skip the directory itself (opendal list_with includes the root entry)
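The root-entry handling mentioned in the code comments above can be illustrated in isolation. This is a sketch under assumptions: entries are modelled as plain path strings, the listed directory is assumed to appear with the same path as `list_path` once leading slashes are trimmed, and `children_only` is a hypothetical helper name, not a function from the PR.

```rust
/// Returns the entries under `list_path`, dropping the entry that
/// represents the listed directory itself.
fn children_only<'a>(list_path: &str, entries: &[&'a str]) -> Vec<&'a str> {
    // Normalize the same way the diff does: ignore leading slashes.
    let root = list_path.trim_start_matches('/');
    entries
        .iter()
        .copied()
        .filter(|entry| entry.trim_start_matches('/') != root)
        .collect()
}

fn main() {
    // Assumed listing shape: the root directory comes back alongside its children.
    let entries = ["warehouse/db/", "warehouse/db/t1/", "warehouse/db/t2/"];
    let children = children_only("/warehouse/db/", &entries);
    assert_eq!(children, vec!["warehouse/db/t1/", "warehouse/db/t2/"]);
}
```

Comparing against the normalized path, rather than skipping the first entry by position, does not rely on the root entry being listed first.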
Purpose
Linked issue: close #119
Implement a complete REST-based catalog (RESTCatalog) for Apache Paimon Rust, supporting database and table CRUD operations and token-based FileIO for OSS data access.
Brief change log
New files:
- catalog/rest/mod.rs: REST catalog module entry point.
- catalog/rest/rest_catalog.rs: RESTCatalog implementing the Catalog trait with full database and table CRUD (list, create, get, drop, alter, rename).
- catalog/rest/rest_token.rs: RESTToken struct for table-level data access credentials.
- catalog/rest/rest_token_file_io.rs: RESTTokenFileIO, a FileIO wrapper that lazily fetches and caches table tokens from the REST server, enabling OSS data access with short-lived credentials.
- catalog/database.rs: Database struct representing a catalog database.
- examples/rest_catalog_example.rs: End-to-end example for REST catalog operations.
- examples/rest_catalog_read_append_example.rs: Example for reading append-only tables via REST catalog.
- tests/mock_server.rs: Mock HTTP server (axum-based) simulating the Paimon REST API for testing.
- tests/rest_catalog_test.rs: Integration tests covering all CRUD operations on the mock server.
Modified files:
- api/rest_api.rs: Add options() accessor and load_table_token() method; fix variable shadowing in config merge; remove #[allow(dead_code)] on options field.
- catalog/filesystem.rs: Implement get_database for FileSystemCatalog.
- io/storage_oss.rs: Use HashMap::remove instead of get + clone for security_token.
- integration_tests/tests/read_tables.rs: Add REST catalog read tests using mock server.
Tests
- tests/rest_catalog_test.rs: Full CRUD integration tests (list/create/get/drop/alter database, list/create/get/drop/rename/alter table) against the mock server.
- tests/rest_api_test.rs: REST API layer tests including DLF ECS token loader.
- integration_tests/tests/read_tables.rs: End-to-end read tests for append-only and primary-key (deletion-vector) tables via REST catalog backed by mock server.
- cargo test -p paimon passes.
API and Format
Documentation