Skip to content

RFC for table encryption#2183

Draft
xanderbailey wants to merge 5 commits intoapache:mainfrom
xanderbailey:xb/encryption_rfc
Draft

RFC for table encryption#2183
xanderbailey wants to merge 5 commits intoapache:mainfrom
xanderbailey:xb/encryption_rfc

Conversation

@xanderbailey
Copy link
Copy Markdown
Contributor

Which issue does this PR close?

RFC for table encryption
Part of: #2034
Rough draft with some of the key parts: #2042

  • Closes #.

What changes are included in this PR?

Are these changes tested?

@mbutrovich
Copy link
Copy Markdown
Collaborator

Not specific feedback for the RFC, just sharing some context links and previous discussion that was helpful in getting the EncryptionFactory design into DataFusion that interfaces with the parquet crate. Comet makes good use of this design with a custom EncryptionFactory that uses JNI to interface with Spark-based custom KMSs.

apache/datafusion#16779
apache/datafusion#15216 (comment)

Also tagging @ggershinsky in case he has any cycles to read this. His guidance was instrumental on the DataFusion and Arrow-rs PME work, as well as Iceberg Java encryption implementation.

@ggershinsky
Copy link
Copy Markdown

Thanks, I'll be glad to have a look.

Comment thread docs/rfcs/0003_table_encryption.md Outdated
```
Master Key (in KMS)
└── wraps → KEK (Key Encryption Key) — stored in table metadata as EncryptedKey
└── wraps → DEK (Data Encryption Key) — stored in StandardKeyMetadata per file
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some DEKs (those for manifest list files) are also stored in table metadata as EncryptedKey. These DEKs are indeed packaged in a StandardKeyMetadata (along with AAD prefix and file length). The serialized StandardKeyMetadata is encrypted/wrapped by the KEK, and stored in the table metadata / encrypted_keys structure.

The manifest file DEKs are packaged in StandardKeyMetadata, and stored as-is (without encryption) in manifest list files. The latter are encrypted then.

The data file DEKs are packaged in StandardKeyMetadata, and stored as-is (without encryption) in manifest files. The latter are encrypted then.

Comment thread docs/rfcs/0003_table_encryption.md Outdated

- **Master keys** live in the KMS and never leave it
- **KEKs** are wrapped by the master key and stored in `TableMetadata.encryption_keys`
- **DEKs** are wrapped by a KEK and stored per-file in `StandardKeyMetadata`
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only manifest list DEKs are wrapped by a KEK. Other DEKs are encrypted in the parent files, by the parent DEKs

load_manifest_list(file_io, table_metadata)
1. Look up encryption_key_id in table_metadata.encryption_keys
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also need to unwrap the KEK (via a KMS client)

Comment thread docs/rfcs/0003_table_encryption.md Outdated
a. file_io.new_encrypted_output(path) → AGS1-encrypting OutputFile
b. em.wrap_key_metadata() → EncryptedKey for table metadata
c. Store key_id on Snapshot.encryption_key_id
3. Table updates include AddEncryptionKey for new KEKs
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also need to wrap the KEK (via a KMS client)

@xanderbailey
Copy link
Copy Markdown
Contributor Author

Thanks for taking a look @ggershinsky I've tried to fill in some of the details here

Copy link
Copy Markdown
Contributor

@blackmwk blackmwk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @xanderbailey for this pr, generally LGTM, left some suggestions.

Comment thread docs/rfcs/0003_table_encryption.md Outdated

### Iceberg Spec: Encryption

The [Iceberg table spec](https://iceberg.apache.org/spec/#table-metadata) defines encryption
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This link may be a better reference.

Comment thread docs/rfcs/0003_table_encryption.md Outdated
│ ├── kms/
│ │ ├── mod.rs
│ │ └── in_memory.rs # InMemoryKms (testing only)
│ └── integration_tests.rs # End-to-end encryption round-trip tests
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In rust we usually put integration tests in tests dir, next to src dir.

Comment thread docs/rfcs/0003_table_encryption.md Outdated

```rust
#[async_trait]
pub trait EncryptionManager: Debug + Send + Sync {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need this to be a trait? I think a no-op kms client would be enough?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah we actually don't need this to be a trait the standard encryption manager is the only one we need

Comment thread docs/rfcs/0003_table_encryption.md Outdated
`InputFile` and `OutputFile` are enums with three variants each:

```rust
pub enum InputFile {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we this enum should sth as following:

pub enum EncryptedInputFile {
   Encrypted {
     key_metadata: KeyMetadata
     inner: InputFile
   }, 
   NativeedEncryted {
     key_material: NativeKeyMaterial,
     inner: InputFile
   }
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this requires a lot more changes to the current code since encrypted inputs and outputs won't be handled transparently. Or is the suggestion to have InputFile as an enum of Plain, EncryptedInputFile and EncryptedInputFile is itself an enum?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it a little difficult to imagine what it will look like given different choice? Could you give raise some examples? I'm even thinking if we actually need an enum. InputFile, EncryptedInputFile and NativeEncryptedInputFile are used in differenent cases. Let's use an example, ManifestReader, if we remove the enum, what we need is to add an extra api for read(EncryptedInputFile). This is acceptable to me, some we have a FileRead trait to abstract out the descryption process.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm leaning towards the example snippet @blackmwk above and I don't think we should expand Input/OutputFile to an enum. EncryptedFile should be a wrapper of Input/OutputFile rather than a variation.

By the same logic, having a wrapper Storage like EncryptionStorage may not be ideal. For encryption, we only need to (un)wrap Input/OutputFile in the FileIO level

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can do this if you both think this is better but this requires some reasonable surgery on structs like ManifestListWriter which today takes an OutputFile and will now need to take a Box<dyn FileWrite> and ManifestWriter will need Box<dyn FileWrite> and a location.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay I tried it out and it's actually not too bad. If you're both happy with the changes I mentioned above then I think this is a good approach.

Comment thread docs/rfcs/0003_table_encryption.md Outdated
// Via FileIOBuilder extension (works with RestCatalog and any extension-aware catalog)
let file_io = FileIOBuilder::new("s3")
.with_prop("s3.region", "us-east-1")
.with_extension(encryption_manager)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been removed.

Comment thread docs/rfcs/0003_table_encryption.md Outdated
.build()?;

// Or via convenience method on FileIO
let file_io = file_io.with_encryption_manager(encryption_manager);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Comment thread docs/rfcs/0003_table_encryption.md Outdated
// Option A: EncryptionManager on the catalog
let catalog = GlueCatalogBuilder::default()
.with_storage_factory(Arc::new(OpenDalStorageFactory::S3))
.with_encryption_manager(encryption_manager)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what we actually need from user is KmsClientFactory?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh interesting, where they can provide a way to construct a number of KMS clients?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean the code to be as following:

let catalog = GlueCatalogBuilder::default()
    .with_storage_factory(Arc::new(OpenDalStorageFactory::S3))
    .with_kms_client_factory(new AwsKmsClientFactory())

Comment thread docs/rfcs/0003_table_encryption.md Outdated
.load("my_catalog", props)
.await?;

// Option B: Wrapping StorageFactory
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@github-actions
Copy link
Copy Markdown
Contributor

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions.

@github-actions github-actions Bot added the stale label Apr 13, 2026
@blackmwk
Copy link
Copy Markdown
Contributor

Don't close it, still under development.

@blackmwk blackmwk added not-stale and removed stale labels Apr 13, 2026
Copy link
Copy Markdown
Contributor

@blackmwk blackmwk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @xanderbailey , I think it's also done. Please address the comments and fix typos, then we will be able to merge it.


```rust
/// Wraps a plain InputFile with decryption capabilities.
pub enum EncryptedInputFile {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we remove the enum for now, just plain struct EncryptedInputFile, NativeEncryptedInputFile? I don't get the benefits of such an enum. We could always add it back later if necessary.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is cleaner!

}

/// Wraps a plain OutputFile with encryption capabilities.
pub enum EncryptedOutputFile {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.


The `EncryptionManager` is stored as a type-safe `FileIOBuilder` extension. This integrates
naturally with catalogs that support extensions (e.g. `RestCatalog.with_file_io_extension()`):
The `EncryptionManager` is attached to `FileIO` via a convenience method. The encryption
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's a good idea to attach it to FileIO.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead we should implement EncryptingStroage

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting idea!

  1. Looks like Storage is serde serialisable
    #[typetag::serde(tag = "type")]
    pub trait Storage: Debug + Send + Sync {
    which is hard to do for an EncryptionManager which holds an Arc<dyn KeyManagementClient> and that has caches and so on inside it.
  2. PME isn't transparent so Storage and FileIo would need to expose a native_encrypt/decrypt method which maybe seems like we're breaking the abstractions?

What do you think?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you look at java, bot KeyManagementClient and EncryptionManager are serializable, so I think we should also mark them as serializable.

What's PME?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PME is the Parquet Modular Encryption

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case how about we create an EncryptingFileIO, EncryptingStorage?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH I currently don't have a full picture of all the data structures involved, so instead of adding too much abstractions early, I prefer to keep more types. We could always add abstractions later to simplify types. WDYT?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EncryptingFileIO would mean more code changes since each call site would now need to take EncryptingFileIO which seems wrong also somehow?

I'm going to see if I can move EncryptionManager out of FileIo altogether and just have it on the Table and pass it down where it's needed. I'll report back when I have a better idea of how this looks

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah okay so I think this is cleaner, TableBuilder constructs the EncryptionManager and passes that down. FileIo is not aware of encryption. It's a little awkward now to pass down the EM everywhere but honestly, I think it's more explicit than sort of assuming the FileIo has been configured correctly.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make it explicit first, and polish the api later. As long as we be careful with pub visibility, we could reduce user impact as much as possible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants