Flowman is for Data Engineers

Data engineering is the complex task of building robust data pipelines between different systems. This requires expertise not only in classical data topics, but also in programming languages like Python or Scala and in frameworks like Apache Spark.

Flowman puts a clear focus on business logic and SQL, thereby reducing the cognitive load on data engineers, who no longer need to be experts in software engineering.

Modular design

Split up complex chains of transformations into multiple small and testable steps. Manageable pieces of code are much simpler to design and understand than complicated nested SQL queries.
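
For example, a single complex query can be broken into a chain of small named mappings in a Flowman YAML project. The sketch below is only illustrative; the mapping names, columns and conditions are made up, and the exact spec should be checked against the Flowman documentation:

    mappings:
      # Step 1: read raw records from a relation defined elsewhere in the project
      measurements_raw:
        kind: readRelation
        relation: measurements_raw

      # Step 2: keep only valid records, one small and testable transformation
      measurements_valid:
        kind: filter
        input: measurements_raw
        condition: "quality_flag = 1"

      # Step 3: derive the final columns in a separate, equally small step
      measurements:
        kind: select
        input: measurements_valid
        columns:
          station_id: "usaf"
          temperature: "air_temperature"

Each mapping can be inspected and tested on its own, instead of reasoning about one deeply nested SQL statement.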

Unit tests

Test your business logic in an isolated environment without needing access to any external service. Use the integrated test framework to mock external data sources and verify the correctness of your logic with carefully crafted test cases.
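
For illustration, a unit test might override the source mapping with a handful of in-memory records and then assert on the result. This is a hedged sketch; the exact keys of the test spec (mock records, assertions) may differ between Flowman versions:

    tests:
      test_measurements:
        # Replace the real source mapping with mocked in-memory records
        overrideMappings:
          measurements_raw:
            kind: mock
            records:
              - ["063700", -10.0, 1]

        # Verify the output of the mapping under test
        assertions:
          measurements_not_empty:
            kind: sql
            query: "SELECT COUNT(*) AS cnt FROM measurements"
            expected:
              - [1]

No database, object store or cluster is needed to run such a test; everything executes locally against the mocked records.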

Broad connectivity

Connect many different data sources, such as relational databases and object stores, in a single project. Easily join data from different sources without first copying it into staging tables.
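
As a rough sketch, a single project could declare both a database table and files in an object store as relations and then read and join them in its mappings. The connection and relation kinds shown here are assumptions based on typical Flowman specs; check the documentation for the precise syntax:

    connections:
      # Connection to a relational database (credentials via project variables)
      my_postgres:
        kind: jdbc
        driver: "org.postgresql.Driver"
        url: "jdbc:postgresql://dbserver/mydb"
        username: "$db_username"
        password: "$db_password"

    relations:
      # Table in the relational database
      customers:
        kind: jdbcTable
        connection: my_postgres
        table: "customers"

      # Parquet files in an object store such as S3
      transactions:
        kind: file
        format: parquet
        location: "s3a://my-bucket/transactions/"

Mappings can then read both relations and join them directly, with no staging step in between.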

Scalable

Start with small data and scale up as your data volume grows. Built on top of Apache Spark, Flowman scales processing from a single machine up to terabytes of data in a distributed cluster.
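
Because a project only describes the logic, scaling up is largely a matter of running the same project against a larger Spark cluster. As an assumed example, standard Spark properties such as the following could be supplied via the Flowman configuration; the property names are regular Spark options, but the surrounding config mechanism is an assumption:

    # Spark settings for running the same project on a cluster (illustrative values)
    config:
      - spark.master=yarn
      - spark.executor.instances=20
      - spark.executor.memory=8g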

Everything-as-code

Easily integrate the code into your existing environment, for example by promoting your pipeline descriptions from the development stage to production. Because everything is expressed in a simple standard text format (YAML), you are free to use the version control system of your choice and to conduct code reviews.
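
A Flowman project is just a small set of YAML files, so it can be versioned and reviewed like any other code. A minimal project descriptor might look like this (the project and module names are only an example):

    # project.yml - the entry point of a Flowman project
    name: weather
    version: "1.0"
    modules:
      - model
      - mapping
      - target
      - job

All mappings, relations, targets and jobs live as plain YAML files inside these module directories and move from development to production together with the rest of your code.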
