Release 0.1.0-incubating (docs)

This is the first official Apache release of Apache XTable™ (Incubating), a project incubating at the Apache Software Foundation. Apache XTable™ (Incubating) facilitates omni-directional interoperability across data processing systems and query engines by letting users convert between open table formats without rewriting any data files. It currently supports the open-source table formats Apache Hudi, Apache Iceberg, and Delta Lake.

Features

Apache XTable™ (Incubating) provides users with the ability to translate metadata from one table format to another.
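As a rough illustration of what a translation request might look like, the sketch below renders a small dataset config as YAML text. The key names (`sourceFormat`, `targetFormats`, `datasets`, `tableBasePath`, `tableName`) follow the sample config in the project README as best recalled and should be checked against the current docs. The helper `render_dataset_config`, the table path, and the table name are hypothetical.

```python
def render_dataset_config(source_format, target_formats, tables):
    """Render a minimal dataset config as YAML text using only the standard library.

    `tables` is a list of (base_path, table_name) pairs. The key names are
    assumed from the project README; verify them before relying on this.
    """
    lines = [f"sourceFormat: {source_format}", "targetFormats:"]
    lines += [f"  - {fmt}" for fmt in target_formats]
    lines.append("datasets:")
    for base_path, name in tables:
        lines.append(f"  - tableBasePath: {base_path}")
        lines.append(f"    tableName: {name}")
    return "\n".join(lines) + "\n"


# Hypothetical example: expose one Hudi table as Iceberg and Delta.
config_text = render_dataset_config(
    "HUDI",
    ["ICEBERG", "DELTA"],
    [("file:///tmp/warehouse/people", "people")],
)
print(config_text)
```

The README describes passing a file like this to the bundled utilities jar through a `--datasetConfig` option; the exact jar name depends on the build, so consult the release artifacts.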

Apache XTable™ (Incubating) provides two sync modes, "incremental" and "full." Incremental mode is more lightweight and performs better, especially on large tables. If anything prevents incremental mode from working properly, the tool falls back to full sync mode.
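The fallback behavior can be pictured with a small toy model. This is a conceptual sketch only, not the project's implementation: every class and function name here is hypothetical, and the single failure condition modeled (the last synced commit is no longer in the source's retained history) is an assumed example of "anything that prevents the incremental mode from working."

```python
from dataclasses import dataclass
from typing import List, Optional


class IncrementalSyncUnavailable(Exception):
    """Raised when commits since the last sync cannot be replayed."""


@dataclass(frozen=True)
class Commit:
    commit_id: int
    added: frozenset = frozenset()
    removed: frozenset = frozenset()


@dataclass
class SourceTable:
    current_files: frozenset        # snapshot of all live data files
    retained_commits: List[Commit]  # history still available, oldest first


@dataclass
class TargetMetadata:
    files: frozenset = frozenset()
    last_synced_commit: Optional[int] = None


def incremental_sync(source: SourceTable, target: TargetMetadata) -> None:
    """Replay only the commits made after the last synced one."""
    ids = [c.commit_id for c in source.retained_commits]
    if target.last_synced_commit not in ids:
        raise IncrementalSyncUnavailable("last synced commit not in retained history")
    start = ids.index(target.last_synced_commit) + 1
    files = set(target.files)
    for commit in source.retained_commits[start:]:
        files -= commit.removed
        files |= commit.added
    target.files = frozenset(files)
    target.last_synced_commit = ids[-1]


def full_sync(source: SourceTable, target: TargetMetadata) -> None:
    """Rebuild the target metadata from the source's current snapshot."""
    target.files = source.current_files
    commits = source.retained_commits
    target.last_synced_commit = commits[-1].commit_id if commits else None


def sync(source: SourceTable, target: TargetMetadata) -> str:
    """Try the cheaper incremental path first and fall back to a full sync."""
    try:
        incremental_sync(source, target)
        return "incremental"
    except IncrementalSyncUnavailable:
        full_sync(source, target)
        return "full"


source = SourceTable(
    current_files=frozenset({"b.parquet", "c.parquet"}),
    retained_commits=[
        Commit(1, added=frozenset({"a.parquet"})),
        Commit(2, added=frozenset({"b.parquet"})),
        Commit(3, added=frozenset({"c.parquet"}), removed=frozenset({"a.parquet"})),
    ],
)
target = TargetMetadata()
first_mode = sync(source, target)   # no prior sync state, so the full path runs

source.retained_commits.append(Commit(4, added=frozenset({"d.parquet"})))
source.current_files = source.current_files | {"d.parquet"}
second_mode = sync(source, target)  # only commit 4 is replayed
print(first_mode, second_mode)
```

In this model the incremental path touches only the commits since the last sync, which is why it is cheaper on large tables, while the full path reads the whole current snapshot. Neither path rewrites data files; only the target's metadata changes.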

A sync provides users with the following:

  1. Syncing of data files along with their column-level statistics and partition metadata.
  2. Propagation of schema updates from the source to the target table metadata.
  3. Metadata maintenance for the target table formats:
    • For Hudi, unreferenced files will be marked as cleaned to control the size of the metadata table.
    • For Iceberg, snapshots will be expired after a configured amount of time.
    • For Delta, the transaction log will be retained for a configured amount of time.
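Time-based retention of the kind described for Iceberg snapshots can be sketched as a simple filter. This is an illustrative model, not the project's or Iceberg's code: `expire_snapshots` here is a hypothetical helper, snapshots are plain (id, timestamp) pairs, and always keeping the most recent snapshot is an assumption made so the table stays readable.

```python
from datetime import datetime, timedelta


def expire_snapshots(snapshots, now, retention):
    """Return the snapshots to keep.

    A snapshot is kept if it is newer than `now - retention`. The most recent
    snapshot is always kept (an assumption of this sketch).
    """
    if not snapshots:
        return []
    cutoff = now - retention
    latest = max(snapshots, key=lambda s: s[1])
    return [s for s in snapshots if s[1] >= cutoff or s == latest]


now = datetime(2024, 3, 10)
snapshots = [
    ("s1", datetime(2024, 3, 1)),
    ("s2", datetime(2024, 3, 8)),
    ("s3", datetime(2024, 3, 9, 12)),
]

# With a three-day retention window, only s1 falls before the cutoff.
kept_three_days = expire_snapshots(snapshots, now, timedelta(days=3))

# With a one-hour window every snapshot is older than the cutoff,
# but the latest one survives.
kept_one_hour = expire_snapshots(snapshots, now, timedelta(hours=1))
print([s[0] for s in kept_three_days], [s[0] for s in kept_one_hour])
```

The same shape applies to retaining a transaction log for a configured period: entries older than the cutoff become eligible for cleanup, while the state needed to read the current table version is preserved.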

Improvements

  1. Added an Apache release guide and infrastructure components to comply with the ASF release process.
  2. Fixed bugs, including dependency conflicts and a few edge cases involving column statistics.
  3. Improved the README, Docker demo, and website docs based on user feedback.
  4. Refactored the codebase to follow Apache naming practices.

GitHub Release Notes

https://github.com/apache/incubator-xtable/releases/tag/0.1.0-incubating