Floe

Floe is a policy-based table maintenance system for Apache Iceberg. It continuously assesses table health and automatically triggers maintenance operations based on configurable conditions.

  • Compact small files for better query performance
  • Expire old snapshots to reclaim storage
  • Remove orphan files not referenced by any snapshot
  • Optimize manifests for faster query planning
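
To make "configurable conditions" concrete, here is a minimal Python sketch of how a policy might map table-health metrics to the four operations listed above. It is an illustration only, not Floe's actual policy format or API: the `TableHealth` fields, the `Policy` thresholds and their defaults, and the operation names are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class TableHealth:
    """Hypothetical health metrics for one Iceberg table."""
    small_file_count: int
    snapshot_count: int
    orphan_file_count: int
    manifest_count: int


@dataclass
class Policy:
    """Hypothetical thresholds; an operation triggers when its metric exceeds the limit."""
    max_small_files: int = 100
    max_snapshots: int = 50
    max_orphan_files: int = 0
    max_manifests: int = 200


def plan_maintenance(health: TableHealth, policy: Policy) -> list[str]:
    """Return the operations whose trigger condition is met, in a fixed order."""
    checks = [
        ("compact_small_files", health.small_file_count > policy.max_small_files),
        ("expire_snapshots", health.snapshot_count > policy.max_snapshots),
        ("remove_orphan_files", health.orphan_file_count > policy.max_orphan_files),
        ("optimize_manifests", health.manifest_count > policy.max_manifests),
    ]
    return [name for name, triggered in checks if triggered]


if __name__ == "__main__":
    health = TableHealth(small_file_count=450, snapshot_count=12,
                         orphan_file_count=3, manifest_count=80)
    # Prints: ['compact_small_files', 'remove_orphan_files']
    print(plan_maintenance(health, Policy()))
```

Keeping the conditions as data (thresholds on a policy object) rather than hard-coded logic is what lets the same evaluation loop serve tables with very different maintenance needs.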

High-Level Architecture

Quick Start

git clone https://github.com/nssalian/floe.git
cd floe
make start

Open http://localhost:9091
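
The service may take a little while to come up after `make start`. The following Python sketch polls the URL above until it responds. The helper names (`is_up`, `wait_until_up`), the retry count, and the delay are illustrative choices; treating any HTTP response as "up" is an assumption, not documented Floe behavior.

```python
import time
import urllib.error
import urllib.request


def is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with any HTTP response at all."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with an error status (e.g. 404), so it is running.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return False


def wait_until_up(url, attempts=30, delay=2.0, probe=is_up, sleep=time.sleep):
    """Poll `url` until `probe` succeeds or the attempts run out."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)  # no need to sleep after the final attempt
    return False
```

Calling `wait_until_up("http://localhost:9091")` returns `True` as soon as the endpoint responds and `False` if it never does within the allotted attempts. The `probe` and `sleep` parameters exist so the loop can be exercised without a running service.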

Documentation

Section        Description
Policies       Policy configuration guide and API reference
Configuration  Configure catalogs, engines, storage
Architecture   System design and extension points

Catalogs

Catalog     Description
REST        Iceberg REST Catalog
Nessie      Git-like versioning
Polaris     Multi-engine access control
Hive        Hive Metastore
Lakekeeper  Open-source Iceberg catalog
Gravitino   Unified metadata lake
DataHub     DataHub catalog integration

Engines

Engine  Description
Spark   Apache Spark (via Livy)
Trino   Trino SQL

API

API documentation is available through the Swagger UI once the service is up and running.