First (correct) invocation, via LogicalPlanBuilder::insert_into — the stream schema and the RecordBatch schema match:

STREAM DTYPE Schema { fields: [Field { name: "c1", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "c2", data_type: Int32, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }], metadata: {} }
RB SCHEMA: Schema { fields: [Field { name: "c1", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "c2", data_type: Int32, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }], metadata: {} }

Second (incorrect) invocation, via session.sql — the RecordBatch schema reports nullable: true while the stream schema reports nullable: false:

STREAM DTYPE Schema { fields: [Field { name: "c1", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "c2", data_type: Int32, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }], metadata: {} }
RB SCHEMA: Schema { fields: [Field { name: "c1", data_type: Utf8, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "c2", data_type: Int32, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }], metadata: {} }
Describe the bug
We were trying to implement a DataSink and found that we were being given different schemas for the record batches than per the RecordBatchStream. The first (correct) invocation comes from assembling a logical plan with LogicalPlanBuilder::insert_into; the second (incorrect) invocation comes from session.sql("INSERT INTO my_tbl VALUES ('hello', 42::INT);").

I figured I'd add an assertion into the RecordBatchStreamAdapter, and it looks like ~12 tests fail on main right now with mismatched schemas. I wonder if it's worth adding that as a debug assertion?
https://github.com/gatesn/datafusion/pull/new/ngates/record-batch-stream-schema
cc @AdamGS
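To make the mismatch concrete: a minimal, dependency-free sketch of the check described above. `Field` and `Schema` here are simplified stand-ins for the Arrow types in the log output (not the real arrow-rs types), and the two schemas below are transcribed from the debug output — they differ only in the `nullable` flag. The real assertion would live in RecordBatchStreamAdapter and compare each batch's schema against the stream's declared schema.

```rust
// Simplified stand-ins for arrow's Field/Schema, kept std-only for illustration.
#[derive(Debug, PartialEq)]
struct Field {
    name: &'static str,
    data_type: &'static str,
    nullable: bool,
}

#[derive(Debug, PartialEq)]
struct Schema {
    fields: Vec<Field>,
}

// Schema declared by the RecordBatchStream (nullable: false on both fields).
fn stream_schema() -> Schema {
    Schema {
        fields: vec![
            Field { name: "c1", data_type: "Utf8", nullable: false },
            Field { name: "c2", data_type: "Int32", nullable: false },
        ],
    }
}

// Schema carried by the RecordBatch from the session.sql path
// (nullable: true on both fields) -- the mismatch reported above.
fn batch_schema() -> Schema {
    Schema {
        fields: vec![
            Field { name: "c1", data_type: "Utf8", nullable: true },
            Field { name: "c2", data_type: "Int32", nullable: true },
        ],
    }
}

fn main() {
    // A debug-build-only check like the one proposed would fire on the
    // second invocation, since the schemas are not equal:
    // debug_assert_eq!(stream_schema(), batch_schema());
    assert_ne!(stream_schema(), batch_schema());
    println!("schemas differ only in nullability");
}
```

Using `debug_assert_eq!` rather than `assert_eq!` would keep the check out of release builds while still catching mismatches (like the ~12 failing tests mentioned above) during development.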
To Reproduce
No response
Expected behavior
No response
Additional context
No response