Flowman 0.23.0 has been released. The main feature of this version is a significant improvement of the new documentation system, which now also includes column-level lineage. The automatically generated documentation is a valuable artifact that helps both developers and business experts understand the data models and transformations. Flowman projects can also specify quality checks (such as NOT NULL conditions, foreign key relationships or arbitrary SQL expressions), which are not only included in the documentation but also executed on the real data.
Moreover, support for SQL databases via JDBC has been improved again with the introduction of temporary staging tables, which are used to perform updates within a transactional commit.
- github-148: Support staging table for all JDBC relations
- github-120: Use staging tables for UPSERT and MERGE operations in JDBC relations
- github-147: Add support for PostgreSQL
- github-151: Implement column level lineage in documentation
- github-121: Correctly apply documentation, before/after and other common attributes to templates
- github-152: Implement new ‘cast’ mapping
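
To illustrate the general idea behind staging tables, here is a minimal sketch in Python using the standard `sqlite3` module. It is not Flowman's implementation and does not use Flowman's configuration syntax; the table names `customers` and `staging` and the function `upsert_via_staging` are hypothetical and chosen only for this example. New records are first written to a temporary table and are then merged into the target table within a single transaction, so readers of the target never see a partially applied update.

```python
import sqlite3


def upsert_via_staging(conn, rows):
    """Merge (id, name) rows into the hypothetical 'customers' table.

    Illustrative only: the rows are first loaded into a temporary staging
    table and then applied to the target inside one transaction.
    """
    # 1. Load the incoming records into a temporary staging table.
    #    The target table is not touched during this step.
    conn.execute("CREATE TEMP TABLE staging (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO staging (id, name) VALUES (?, ?)", rows)
    conn.commit()
    try:
        # 2. Apply the staged records to the target. 'with conn' commits
        #    both statements together, or rolls both back on an error.
        with conn:
            conn.execute(
                "UPDATE customers "
                "SET name = (SELECT s.name FROM staging s WHERE s.id = customers.id) "
                "WHERE id IN (SELECT id FROM staging)"
            )
            conn.execute(
                "INSERT INTO customers (id, name) "
                "SELECT s.id, s.name FROM staging s "
                "WHERE s.id NOT IN (SELECT id FROM customers)"
            )
    finally:
        # 3. The staging table is no longer needed, whatever the outcome.
        conn.execute("DROP TABLE staging")


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Alice"), (2, "Bob")])
    conn.commit()

    upsert_via_staging(conn, [(2, "Robert"), (3, "Carol")])
    print(conn.execute("SELECT id, name FROM customers ORDER BY id").fetchall())
```

In this sketch the existing row with id 2 is updated and the row with id 3 is inserted in the same commit. A real JDBC target would typically use the database's own MERGE or UPSERT statement for the second step.
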
Flowman is a data build tool on top of Apache Spark that uses a declarative approach for specifying the full data flow, including all sources, targets and transformations. As usual, you can find the latest version of Flowman prebuilt for different Spark / Hadoop versions at https://flowman.io