feat(hive): support structure comparison for Hive (#2872) #3322
Merged
Add DriverTypeHive = "Hive" to the driver type constants block in sqle/driver/v2/util.go, matching DMS-EE's DBTypeHive value exactly. This enables the SQLE plugin system to recognize Hive as a valid database driver type. issue: #2859
Add Hive plugin directory with PluginProcessor and HiveDriverImpl implementing the full Plugin interface.
Features:
- init() registration to BuiltInPluginProcessors
- GetDriverMetas with PluginName="Hive", port=10000
- additionalParams: auth (NOSASL/NONE/LDAP/KERBEROS), transport_mode (binary/http)
- Parse with SQL splitting and keyword-prefix classification (DQL/DML/DDL)
- Audit returning empty results (no rules in skeleton phase)
- Stub implementations for unsupported methods
issue: #2859
Map-case style tests covering:
- Plugin registration in BuiltInPluginProcessors
- GetDriverMetas: PluginName, DefaultPort, auth/transport_mode params, empty Rules and EnabledOptionalModule
- classifySQL: DQL (SELECT/WITH/SHOW/DESCRIBE/DESC/EXPLAIN), DML (INSERT/UPDATE/DELETE/MERGE/LOAD/EXPORT), DDL (CREATE/ALTER/DROP/GRANT)
- splitSQL: single/multiple/trailing semicolon/empty/whitespace
- Audit returns correct-length empty results
- Parse with single/multiple/empty SQL
- Ping with nil DSN returns error
- Open with nil DSN succeeds (offline mode)
issue: #2859
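The splitting and keyword-prefix classification exercised by these tests can be sketched roughly as follows. This is a minimal illustration, not the plugin's code: only the names splitSQL and classifySQL and the keyword sets come from the commit message; the signatures, the string return values, the quote handling, and the empty result for unlisted keywords are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// keywordKinds maps a leading keyword to a statement class. The keyword sets
// follow the test list in the commit message; the map layout is hypothetical.
var keywordKinds = map[string]string{
	"SELECT": "DQL", "WITH": "DQL", "SHOW": "DQL",
	"DESCRIBE": "DQL", "DESC": "DQL", "EXPLAIN": "DQL",
	"INSERT": "DML", "UPDATE": "DML", "DELETE": "DML",
	"MERGE": "DML", "LOAD": "DML", "EXPORT": "DML",
	"CREATE": "DDL", "ALTER": "DDL", "DROP": "DDL", "GRANT": "DDL",
}

func isLetter(b byte) bool {
	return (b >= 'a' && b <= 'z') || (b >= 'A' && b <= 'Z')
}

// classifySQL returns the class of the statement's first keyword, or "" when
// the keyword is not in the table (an assumption of this sketch).
func classifySQL(sql string) string {
	s := strings.TrimSpace(sql)
	end := 0
	for end < len(s) && isLetter(s[end]) {
		end++
	}
	return keywordKinds[strings.ToUpper(s[:end])]
}

// splitSQL splits on semicolons outside quoted literals and drops empty or
// whitespace-only pieces. Escaped quotes and comments are not handled.
func splitSQL(text string) []string {
	var out []string
	var cur strings.Builder
	var quote rune // 0 while outside a quoted literal
	flush := func() {
		if stmt := strings.TrimSpace(cur.String()); stmt != "" {
			out = append(out, stmt)
		}
		cur.Reset()
	}
	for _, r := range text {
		switch {
		case quote != 0:
			if r == quote {
				quote = 0
			}
			cur.WriteRune(r)
		case r == '\'' || r == '"' || r == '`':
			quote = r
			cur.WriteRune(r)
		case r == ';':
			flush()
		default:
			cur.WriteRune(r)
		}
	}
	flush()
	return out
}

func main() {
	for _, stmt := range splitSQL("USE db; select 1;\n ; DROP TABLE t;") {
		fmt.Printf("%-4s %s\n", classifySQL(stmt), stmt)
	}
}
```

A prefix lookup like this is cheap and needs no Hive grammar, at the cost of misclassifying statements whose first token is not a keyword.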
Replace placeholder Ping/Close with real gohive connectivity:
- Add newHiveConnection() that creates gohive.Connection from DSN parameters (host, port, user, password, database, auth, transport_mode) following DMS-EE NewHiveConn pattern
- Update Open() to establish real connection when DSN is provided
- Update Ping() to execute SELECT 1 via gohive cursor
- Update Close() to close gohive connection
- Add HiveDriverImpl.conn field for connection lifecycle
- Add unit tests for nil conn boundary cases
Offline audit mode (nil DSN) continues to work without connection.
The Hive built-in plugin's init() function was never called because sqle/server/sqled.go only imported the MySQL driver. Adding the blank import for github.com/actiontech/sqle/sqle/driver/hive ensures the Hive plugin registers itself into BuiltInPluginProcessors at startup, making the Hive data source type available in the DMS driver list. Fixes #2859
…ASES
The Schemas() method was returning an error "hive plugin does not support
Schemas", causing the /v1/projects/{project}/instances/{instance}/schemas
API to return 500. This blocked the data export workflow as users could
not select a database from the dropdown.
Now executes SHOW DATABASES via gohive cursor and returns the result list.
Fixes BUG-2.1-1
…as (EE-6, compat-RISK-1) (#2872)
…ejection (EE-7, compat-RISK-4/9) (#2872)
…-9, compat-RISK-6) (#2872)
…TE fallback (EE-8, compat-RISK-4/6/9) (#2872)
…DDL (EE-16, compat-RISK-1/4/6/9) (#2872)
…17, compat-RISK-10) (#2872)
HiveServer2's FetchResults stage returns a non-fatal ROW-ERR (StatusCode:ERROR_STATUS, InfoMessages:[Server-side error; please check HS2 logs.]) for statements that produce no result columns: USE <db>, SET ..., DDL. Prior to this fix gohiveQueryRunner.runSingleStringQuery surfaced the ROW-ERR as a hard error, which broke GetDatabaseObjectDDL / GetDatabaseDiffModifySQL whenever a USE <schema> was issued and produced TC-HIVE-005 4012 errors in web tests.
Changes:
- Extract isHS2NoResultRowErr classifier matching the canonical HS2 ROW-ERR markers (status + info-message substrings).
- Introduce hiveCursor interface + gohiveCursorAdapter so the fetch loop can be exercised by unit tests without a live HS2.
- Move the FetchOne / HasMore loop into fetchAllRows which tolerates the classified ROW-ERR (break out, return rows so far) and propagates every other cursor error unchanged.
- runSingleStringQuery and Schemas now delegate the loop to fetchAllRows so both code paths share the contract.
Tests:
- Test_IsHS2NoResultRowErr (5 sub-cases) covers nil, the real HS2 payload, a syntax error and two partial-match negatives.
- Test_FetchAllRows_RowErrTolerant (3 sub-cases) verifies USE row-err tolerance, SHOW TABLES with trailing row-err, and genuine syntax-error propagation.
- Test_RunSingleStringQuery_NilConn guards the early-return.
References: compat-RISK-10, docs/test/case-TC-HIVE-005.md
… (compat-RISK-10) (#2872)
server/compare/database_compare_ee.go::ExecDatabaseCompare forwards SchemaName but never populates DatabaseObjects on the DatabaseSchemaInfo it hands to drivers. MySQL handles this with an auto-discovery branch (mysql_ee.go::GetDatabaseObjectDDL lines 380-389 calling getAllSchemaObjects). The Hive driver previously did not, so even after the ROW-ERR tolerance fix the API returned comparison_result=same / inconsistent_num=0 and database_diff_objects was empty; the structure compare tree always showed zero diffs.
Changes:
- listAllSchemaObjects helper enumerates SHOW TABLES + SHOW VIEWS, classifies each name as TABLE/VIEW. SHOW VIEWS failures degrade to "all TABLE" so the helper still works on Hive < 2.2.
- GetDatabaseObjectDDL detects empty DatabaseObjects and fills it via listAllSchemaObjects after the USE <schema> succeeds.
- GetDatabaseDiffModifySQL takes the union of base + compared auto-discovery, then re-USEs both sides before the per-object SHOW CREATE TABLE so the diff round-trip is correct.
Tests:
- Test_ListAllSchemaObjects (4 sub-cases): tables+views classified, SHOW VIEWS failure degrades, SHOW TABLES error propagates, empty names skipped.
- Test_GetDatabaseObjectDDL_DefaultDiscovery: caller passes nil DatabaseObjects -> driver discovers t_base_only (TABLE) + v_user_summary (VIEW) and returns both DDLs.
- Test_GetDatabaseObjectDDL_EmptyObjectList updated to the new contract (auto-discovery vs prior "return empty").
References: compat-RISK-10 (secondary fix), task_test_003 episodic, mysql/mysql_ee.go::GetDatabaseObjectDDL behavioural parity.
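A minimal sketch of the auto-discovery helper, under stated assumptions: only the name listAllSchemaObjects and its described behaviour come from the commit message. The queryRunner type, the schemaObject struct, the fake runner, and the premise that SHOW TABLES output also contains view names (so SHOW VIEWS is used only to relabel them) are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// queryRunner is a hypothetical stand-in for the driver's single-column
// query helper: it runs one statement and returns one string per row.
type queryRunner func(sql string) ([]string, error)

// schemaObject is an illustrative result type, not the SQLE driver struct.
type schemaObject struct {
	Name string
	Type string // "TABLE" or "VIEW"
}

// errShowViews simulates a Hive version that rejects SHOW VIEWS.
var errShowViews = errors.New("SHOW VIEWS is not supported")

// listAllSchemaObjects enumerates SHOW TABLES and labels names that also
// appear in SHOW VIEWS as VIEW. A SHOW VIEWS failure degrades to "all TABLE";
// a SHOW TABLES failure is propagated; empty names are skipped.
func listAllSchemaObjects(run queryRunner) ([]schemaObject, error) {
	tables, err := run("SHOW TABLES")
	if err != nil {
		return nil, err
	}
	views := map[string]bool{}
	if names, viewErr := run("SHOW VIEWS"); viewErr == nil {
		for _, n := range names {
			views[strings.TrimSpace(n)] = true
		}
	}
	var objs []schemaObject
	for _, n := range tables {
		n = strings.TrimSpace(n)
		if n == "" {
			continue
		}
		typ := "TABLE"
		if views[n] {
			typ = "VIEW"
		}
		objs = append(objs, schemaObject{Name: n, Type: typ})
	}
	return objs, nil
}

// fakeRunner returns canned rows for the two statements used above.
func fakeRunner(tables, views []string, viewsErr error) queryRunner {
	return func(sql string) ([]string, error) {
		switch sql {
		case "SHOW TABLES":
			return tables, nil
		case "SHOW VIEWS":
			return views, viewsErr
		}
		return nil, errors.New("unexpected query: " + sql)
	}
}

func main() {
	run := fakeRunner([]string{"t_base_only", "v_user_summary"}, []string{"v_user_summary"}, nil)
	objs, err := listAllSchemaObjects(run)
	fmt.Println(objs, err)
}
```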
… compat-RISK-9) (#2872)
Align GetDatabaseObjectDDL and GetDatabaseDiffModifySQL FUNCTION branch with the PROCEDURE/TRIGGER/EVENT short-circuit: silently skip the unsupported FUNCTION object (no Go error, no placeholder DDL entry) and emit a structured WARN log carrying objectType=FUNCTION. The previous behaviour returned a hard error from the FUNCTION branch, which dropped otherwise valid TABLE/VIEW results in the same batch, observed as TC-HIVE-016 (mixed TABLE+FUNCTION) FAIL during Task-TEST-005.
Driver guarantees post-fix:
* GetDatabaseObjectDDL: empty results entry when every requested object is FUNCTION; mixed batches return the TABLE/VIEW DDLs only.
* GetDatabaseDiffModifySQL: USE-header preserved; FUNCTION never contributes SQL to ModifySQLs; TABLE main path keeps producing ALTER / DROP+CREATE statements.
* sqled.log: logrus.Fields {objectType=FUNCTION, object=<name>} so operators can detect the skipped objects without parsing free text.
Unit tests updated to pin the new contract:
* Test_GetDatabaseObjectDDL_FunctionRejected: results entry has 0 DDLs, err == nil.
* Test_GetDatabaseDiffModifySQL_FunctionRejected: only USE header in block; no DROP/CREATE/ALTER; err == nil.
* Test_GetDatabaseDiffModifySQL_MixedFunctionAndTable (new): TABLE ALTER CHANGE COLUMN amt amt BIGINT preserved; fake_fn absent.
* Test_GetDatabaseObjectDDL_MixedFunctionAndTable (new): TABLE DDL preserved; FUNCTION absent from DatabaseObjectDDLs.
Aligned with design §3.2.2 line 239 ("FUNCTION skips objInfo; results do not contain that entry"). compat-RISK-9 state moves implemented -> verified after TC-HIVE-015 / TC-HIVE-016 regression PASS.
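The skip-instead-of-fail contract for mixed batches can be illustrated with the sketch below. All names here (objectRef, collectDDLs, demoCollect, the fetch and warn callbacks) are hypothetical; the real driver logs through logrus with structured fields, which is replaced by a plain callback so the example stays dependency-free. Treating every type other than TABLE and VIEW as skippable is a simplification.

```go
package main

import "fmt"

// objectRef is an illustrative (name, type) pair for a requested object.
type objectRef struct {
	Name string
	Type string
}

// collectDDLs fetches DDL for TABLE and VIEW objects. Any other type is
// reported through warn and skipped: no error and no placeholder entry, so
// one unsupported object cannot drop the rest of the batch.
func collectDDLs(
	objs []objectRef,
	fetch func(objectRef) (string, error),
	warn func(objectRef),
) ([]string, error) {
	ddls := []string{}
	for _, o := range objs {
		switch o.Type {
		case "TABLE", "VIEW":
			ddl, err := fetch(o)
			if err != nil {
				return nil, err
			}
			ddls = append(ddls, ddl)
		default:
			warn(o)
		}
	}
	return ddls, nil
}

// demoCollect wires collectDDLs to a fake fetcher and records skipped objects.
func demoCollect(objs []objectRef) (ddls []string, skipped []string, err error) {
	fetch := func(o objectRef) (string, error) {
		return "CREATE " + o.Type + " " + o.Name + " ...", nil
	}
	warn := func(o objectRef) {
		skipped = append(skipped, o.Type+":"+o.Name)
	}
	ddls, err = collectDDLs(objs, fetch, warn)
	return ddls, skipped, err
}

func main() {
	ddls, skipped, err := demoCollect([]objectRef{
		{Name: "orders", Type: "TABLE"},
		{Name: "fake_fn", Type: "FUNCTION"},
	})
	fmt.Println(ddls, skipped, err)
}
```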
…modify SQL (#2872)
Until now the Hive plugin returned a hard error "hive plugin does not support Exec" for both Exec and ExecBatch, which means once the structure-compare flow produces modify SQL and the user tries to run it through an SQLE work order, the workflow execution fails immediately. Rewrite the Hive driver to actually execute statements against HiveServer2:
- Implement Exec: submit one statement per cursor.Exec, strip the trailing ';', skip empty / comment-only statements (the modify-SQL splitter can produce "-- WARNING: ..." trailers), tolerate the HS2 no-result-column ROW-ERR (same classifier as runSingleStringQuery, compat-RISK-10).
- Implement ExecBatch: iterate Exec per statement, stop on the first error and return the partial result set (matches MySQL driver's batch contract in sqle/driver/mysql/mysql.go::ExecBatch).
- Return a hiveExecResult that satisfies database/sql/driver.Result by surfacing a defensive "not supported by hive plugin" error for LastInsertId / RowsAffected; HiveServer2 / gohive do not expose either reliably and we don't want to fabricate zeros that downstream auditors might trust.
- Add execRunnerFactory injection point + fakeExecRunner test double to exercise the contract without a live HS2 cluster.
- Cover the new behaviour with eight test cases: Test_StripSQLTerminator, Test_IsAllCommentLines, Test_Exec_SingleStatement, Test_Exec_EmptyAndCommentStatementsAreNoOp, Test_Exec_PropagatesRunnerError, Test_Exec_NilConnAndNoFactoryFails, Test_ExecBatch_AllSucceed, Test_ExecBatch_StopsOnFirstError.
Fixes #2872
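The Exec / ExecBatch contract described above could be sketched as follows. The helper names stripSQLTerminator, isAllCommentLines and hiveExecResult mirror the commit message; their bodies, the execRunner function type, the recorder test double, and the choice to return a result entry for no-op statements are assumptions. The ROW-ERR tolerance step is omitted for brevity.

```go
package main

import (
	"database/sql/driver"
	"errors"
	"fmt"
	"strings"
)

// execRunner is a hypothetical stand-in for "submit one statement to HS2".
type execRunner func(stmt string) error

var errNotSupported = errors.New("not supported by hive plugin")

// hiveExecResult satisfies driver.Result but refuses to fabricate values.
type hiveExecResult struct{}

func (hiveExecResult) LastInsertId() (int64, error) { return 0, errNotSupported }
func (hiveExecResult) RowsAffected() (int64, error) { return 0, errNotSupported }

var _ driver.Result = hiveExecResult{}

// stripSQLTerminator trims whitespace and one trailing semicolon.
func stripSQLTerminator(s string) string {
	return strings.TrimSpace(strings.TrimSuffix(strings.TrimSpace(s), ";"))
}

// isAllCommentLines reports whether every non-blank line starts with "--".
func isAllCommentLines(s string) bool {
	for _, line := range strings.Split(s, "\n") {
		line = strings.TrimSpace(line)
		if line != "" && !strings.HasPrefix(line, "--") {
			return false
		}
	}
	return true
}

// exec runs one statement; empty and comment-only input is a no-op.
func exec(run execRunner, stmt string) (driver.Result, error) {
	stmt = stripSQLTerminator(stmt)
	if isAllCommentLines(stmt) {
		return hiveExecResult{}, nil
	}
	if err := run(stmt); err != nil {
		return nil, err
	}
	return hiveExecResult{}, nil
}

// execBatch stops at the first error and returns the results so far.
func execBatch(run execRunner, stmts []string) ([]driver.Result, error) {
	var results []driver.Result
	for _, s := range stmts {
		res, err := exec(run, s)
		if err != nil {
			return results, err
		}
		results = append(results, res)
	}
	return results, nil
}

// recorder is a test double that logs calls and fails on one statement.
type recorder struct {
	calls  []string
	failOn string
}

func (r *recorder) run(stmt string) error {
	r.calls = append(r.calls, stmt)
	if stmt == r.failOn {
		return errors.New("runner failed on: " + stmt)
	}
	return nil
}

func main() {
	rec := &recorder{failOn: "BAD"}
	res, err := execBatch(rec.run, []string{"USE db;", "-- WARNING: x", "BAD", "SELECT 1"})
	fmt.Println(len(res), rec.calls, err)
}
```

Returning an error from LastInsertId / RowsAffected rather than zero follows the rationale in the commit message: callers cannot mistake a missing value for a real count.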
Add github.com/beltran/gohive v1.6.0 for real HiveServer2 connectivity in the SQLE Hive plugin. Transitive deps: apache/thrift v0.19.0, beltran/gosasl, beltran/gssapi, go-zookeeper/zk v1.0.3. Version aligned with DMS-EE go.mod to avoid dependency conflicts.
…18n keys for Hive structure compare (compat-RISK-1) (#2872) Add error code 4009 (DatabaseCompareNotSupported) and bilingual i18n keys for the controller-layer compatibility check that rejects database types lacking OptionalGetDatabaseObjectDDL / OptionalGetDatabaseDiffModifySQL capability. EE controller wires these into the database compare whitelist. Refs design.md §3.2.3 / §3.5 (EE-11 / EE-12).
be277d3 to 3f3866f
…QL pattern Modified fetchTableDDL to return an empty string with exists=false for non-existent tables, including query errors. This change harmonizes the error handling with the MySQL implementation, improving consistency in the Hive driver.
Seechi-Yolo
approved these changes
May 27, 2026
User description
Summary
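Implements structure comparison for the Hive data source (issue actiontech/sqle-ee#2872):
- … sqle/driver/hive/), connecting to a real HiveServer2 through github.com/beltran/gohive
- OptionalGetDatabaseObjectDDL (fetches DDL for TABLE/VIEW; FUNCTION is rejected with ddl-compat-unsupported) and OptionalGetDatabaseDiffModifySQL (ALTER with a DROP/CREATE fallback; the DROP/CREATE branch prepends a -- WARNING: ... comment to the result)
- The DriverTypeHive constant; Chinese and English i18n messages for the new error code 4009 (DatabaseCompareNotSupported)
- … SHOW DATABASES; Ping/Close go through gohive; Exec/ExecBatch provide the entry point for executing modify SQL
- No-result statements such as USE/SET tolerate the row-err; an empty schema list falls back to auto-discovery via SHOW TABLES; FUNCTION objects are skipped within a batch instead of aborting it
- gohive v1.6.0 and its transitive dependencies (apache/thrift v0.19.0, beltran/gosasl, beltran/gssapi, go-zookeeper/zk v1.0.3), with versions aligned to DMS-EE to avoid conflicts
Test plan
- go vet ./... passes
- go test ./sqle/driver/hive/... passes
Fixes #2872 (actiontech/sqle-ee)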
Refs design.md §3.2.3 / §3.5
Description
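Adds the Hive driver plugin and support for structure comparison
Adds extensive unit tests covering Hive DDL and DiffModifySQL scenarios
Updates error codes, i18n messages, and log messages
Integrates the Hive driver dependencies (gohive, gosasl, gssapi, zk) into the project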
File Walkthrough
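2 files
Add unit tests for the Hive comparison feature, covering multiple scenarios
Add a unit test file for the Hive driver
5 files
Implement the Hive driver plugin, supporting object DDL retrieval and diff SQL generation
Update error codes, unifying structure-comparison errors under 4009
Register the Hive driver type in the driver constants
Import the Hive driver and integrate it into the server startup flow
Add the implementation of Hive table structure diff handling
4 files
Add Chinese i18n messages for Hive driver structure comparison
Add a placeholder for the Hive driver logo file
Update English i18n messages, adding Hive comparison prompts
Update Chinese i18n messages, adding Hive comparison prompts
1 files
Update dependencies, adding gohive, gosasl, gssapi and zk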