Version 0.23.0 of Flowman has been released. The main feature of this version is a significant improvement of the new documentation system, which now also includes column-level lineage. The automatically generated documentation is a valuable artifact that helps both developers and business experts understand the data models and transformations. Flowman projects can also specify quality checks (such as NOT NULL conditions, foreign key relationships or arbitrary SQL expressions), which are not only included in the documentation but also executed on the real data.

Moreover, support for SQL databases via JDBC has been improved again with the introduction of temporary staging tables, which are used to perform updates within a transactional commit.
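To illustrate the general idea behind staging tables, here is a minimal sketch in Python. It is not Flowman's actual implementation: it uses the built-in `sqlite3` module as a stand-in for a JDBC database, and the table names `target` and `target_staging` as well as the function `update_via_staging` are hypothetical. New records are first written to a staging table, and only then applied to the target table inside one transaction, so a failure in either step should leave the target in its previous state.

```python
import sqlite3


def update_via_staging(conn, rows):
    """Upsert `rows` into the (hypothetical) table `target` via a staging table."""
    cur = conn.cursor()

    # Step 1: load the new records into a staging table.
    # If this step fails, the target table has not been touched yet.
    cur.execute("DROP TABLE IF EXISTS target_staging")
    cur.execute("CREATE TABLE target_staging (id INTEGER PRIMARY KEY, value TEXT)")
    cur.executemany("INSERT INTO target_staging (id, value) VALUES (?, ?)", rows)

    # Step 2: apply the staged records to the target in a single transaction.
    cur.execute("BEGIN")
    try:
        cur.execute("DELETE FROM target WHERE id IN (SELECT id FROM target_staging)")
        cur.execute("INSERT INTO target (id, value) SELECT id, value FROM target_staging")
        cur.execute("COMMIT")
    except Exception:
        # Undo the partial update so the target keeps its previous contents.
        cur.execute("ROLLBACK")
        raise
    finally:
        # The staging table is temporary and is removed in either case.
        cur.execute("DROP TABLE IF EXISTS target_staging")


# isolation_level=None disables implicit transactions, so BEGIN/COMMIT are explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, value TEXT)")
conn.execute("INSERT INTO target (id, value) VALUES (1, 'old'), (2, 'keep')")

update_via_staging(conn, [(1, "new"), (3, "added")])
result = conn.execute("SELECT id, value FROM target ORDER BY id").fetchall()
print(result)
```

In this sketch, the slow part (loading the data) happens outside the transaction, while the transaction itself only moves already staged rows, which keeps the window in which the target is being modified short.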

Detailed Changes

About Flowman

Flowman is a data build tool on top of Apache Spark which uses a declarative approach for specifying the full data flow, including all sources, targets and transformations. As usual, you can find the latest version of Flowman prebuilt for different Spark / Hadoop versions at