Flink: handle Iceberg tables with nested and complex field types #2706 @dolfinus Creates SchemaDatasetFacet with nested fields for Iceberg tables with list, map and struct columns.
Flink: handle Avro schema with nested and complex field types #2711 @dolfinus Creates SchemaDatasetFacet with nested fields for Avro schemas with complex types (union, record, map, array, fixed).
Spark: add facets to Spark application events #2677 @dolfinus Adds support for Spark application start and stop events in the ExecutionContext interface.
Spark: add nested fields to SchemaDatasetFieldsFacet #2689 @dolfinus Adds support for nested Spark DataFrame fields to SchemaDatasetFieldsFacet. Also includes each field's comment as its description.
Spark: add SparkApplicationDetailsFacet #2688 @dolfinus Adds SparkApplicationDetailsFacet to runEvents emitted on Spark application start.
Spark: improve job suffix assigning mechanism #2665 @pawel-big-lebowski For some catalog handlers, the mechanism was creating different dataset identifiers on START and COMPLETE depending on whether a dataset was created or not. This improves the mechanism to assign a deterministic job suffix based on the output dataset at the moment of a start event. Note: this may change job names in some scenarios.
Airflow: fix empty dataset name for AthenaExtractor #2700 @kacpermuda The dataset name should not be empty when passing only a bucket as S3 output in Athena.
Flink: fix SchemaDatasetFacet for Protobuf repeated primitive types #2685 @dolfinus Fixes issues with the Protobuf schema converter.
Python: clean up Python client code, add logging #2653 @kacpermuda Cleans up client code and refactors logging in all Python modules.
SQL: catch TokenizerErrors, PanicException #2703 @mobuchowski The SQL parser now catches and handles these errors.
Python: suppress warning on importing v1 module in __init__.py #2713 @JDarDagran Suppresses the deprecation warning when v1 facets are used.
Integration/Java/Python: use UUIDv7 instead of UUIDv4 #2686 #2687 @dolfinus Uses UUIDv7 instead of UUIDv4 for runEvents. The new UUID version produces monotonically increasing values, which leads to more performant queries on the OL consumer side. Note: the UUID version is an implementation detail and may change in the future.
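
To illustrate why UUIDv7 values sort by creation time, here is a minimal sketch of the UUIDv7 bit layout (a 48-bit Unix millisecond timestamp followed by version, variant and random bits). The function name `uuid7_sketch` is hypothetical and this is not the generator used by the OpenLineage clients. It also makes no ordering guarantee for two values created within the same millisecond.

```python
import os
import time
import uuid


def uuid7_sketch(unix_ms=None):
    """Build a UUIDv7-style value (hypothetical helper, for illustration only)."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000  # current Unix time in milliseconds

    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits, 74 are used
    rand_a = (rand >> 62) & 0xFFF                 # 12 random bits after the version
    rand_b = rand & ((1 << 62) - 1)               # 62 random bits after the variant

    value = (unix_ms & ((1 << 48) - 1)) << 80     # bits 127..80: timestamp
    value |= 0x7 << 76                            # bits 79..76: version 7
    value |= rand_a << 64                         # bits 75..64: random
    value |= 0b10 << 62                           # bits 63..62: RFC variant
    value |= rand_b                               # bits 61..0: random
    return uuid.UUID(int=value)


earlier = uuid7_sketch(1_700_000_000_000)
later = uuid7_sketch(1_700_000_000_001)
# The timestamp occupies the most significant bits, so a later
# millisecond always compares greater, both as a UUID and as text.
print(earlier < later, str(earlier) < str(later))
```

Because the timestamp leads, values created in later milliseconds land at the end of a sorted index instead of at random positions, which is one plausible reason for the faster consumer-side queries mentioned in the entry above.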