Flowman 0.23.0 is now available!


Version 0.23.0 of Flowman has been released. The main feature of this version is a significant improvement to the new documentation system, which now also includes column-level lineage. The automatically generated documentation is a valuable artifact that helps both developers and business experts understand the data models and transformations. Flowman projects can also specify quality checks (such as NOT NULL conditions, foreign key relationships or arbitrary SQL expressions), which are not only included in the documentation but also executed on the real data.

Moreover, support for SQL databases via JDBC has been improved again: temporary staging tables are now used to perform updates within a single transactional commit.
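
To illustrate the general idea behind staging tables, here is a minimal sketch in Python. It is not Flowman's implementation: it uses SQLite from the standard library as a stand-in for a JDBC database, and the table names (customers, staging) and the function upsert_via_staging are hypothetical. New records are first written to a temporary staging table and then merged into the target table, all inside one transaction, so readers of the target table never see a half-finished update.

```python
import sqlite3


def upsert_via_staging(conn, rows):
    """Merge rows into 'customers' through a temporary staging table.

    Assumes 'conn' was opened with isolation_level=None, so that the
    transaction boundaries are controlled explicitly below.
    """
    cur = conn.cursor()
    cur.execute("BEGIN")
    try:
        # 1. Create the temporary staging table (hypothetical schema).
        cur.execute("CREATE TEMP TABLE staging (id INTEGER PRIMARY KEY, name TEXT)")
        # 2. Load all incoming records into the staging table.
        cur.executemany("INSERT INTO staging (id, name) VALUES (?, ?)", rows)
        # 3. Merge staging into the target: insert new ids, update existing ones.
        #    'WHERE 1' avoids a parsing ambiguity of SELECT followed by ON CONFLICT.
        cur.execute(
            "INSERT INTO customers (id, name) "
            "SELECT id, name FROM staging WHERE 1 "
            "ON CONFLICT(id) DO UPDATE SET name = excluded.name"
        )
        # 4. Clean up and make all changes visible at once.
        cur.execute("DROP TABLE staging")
        cur.execute("COMMIT")
    except Exception:
        # Any failure undoes the staging table and the partial merge.
        cur.execute("ROLLBACK")
        raise


conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'Alice')")

upsert_via_staging(conn, [(1, "Alicia"), (2, "Bob")])
result = conn.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
print(result)
```

If any step fails, the rollback leaves the target table exactly as it was before the update started, which is the main benefit of routing changes through a staging table.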

Detailed Changes

  • github-148: Support staging tables for all JDBC relations
  • github-120: Use staging tables for UPSERT and MERGE operations in JDBC relations
  • github-147: Add support for PostgreSQL
  • github-151: Implement column level lineage in documentation
  • github-121: Correctly apply documentation, before/after and other common attributes to templates
  • github-152: Implement new ‘cast’ mapping

About Flowman

Flowman is a data build tool on top of Apache Spark that uses a declarative approach for specifying the full data flow, including all sources, targets and transformations. As usual, you can find the latest version of Flowman prebuilt for different Spark / Hadoop versions at https://flowman.io
