Apache Spark Engine¶
Apache Spark for Iceberg maintenance operations, submitted via Apache Livy REST API.
Operation Support¶
Spark has full support for all Iceberg maintenance operations across all supported catalog types (REST, Hive, Nessie, Polaris, Lakekeeper, Gravitino).
Spark's SparkActions API provides the most comprehensive maintenance capabilities. Floe submits Spark jobs through Apache Livy,
which provides a REST interface to Spark clusters.
Configuration¶
Options¶
| Property | Required | Default | Description |
|---|---|---|---|
floe.livy.url |
Yes | - | Livy REST API URL |
floe.livy.job.jar |
No | local:///opt/floe/floe-maintenance-job.jar |
Path to maintenance job JAR |
floe.livy.job.main-class |
No | com.floe.spark.job.MaintenanceJob |
Main class |
floe.livy.driver-memory |
No | 2g |
Spark driver memory |
floe.livy.executor-memory |
No | 2g |
Spark executor memory |
Job Lifecycle¶
- Floe submits batch job to Livy
- Livy creates Spark application
- Job executes maintenance operation
- Floe polls for completion
- Results returned to Floe
Monitoring¶
Livy UI¶
Access at http://livy:8998/ui to see:
- Active sessions
- Batch job history
- Logs