Frequently asked questions about Flowman (FAQ)
Here you will find answers to common product-level questions about Flowman. These will help you decide how Flowman fits into your overall application landscape. For more technical, developer-oriented questions, please visit the cookbook section of the official Flowman documentation.
- Files on a local file system, distributed storage (HDFS), or blob storage (S3, ABS, etc.)
- Hive tables and views
- Relational databases (MySQL, MariaDB, MS SQL Server, PostgreSQL, …)
- Plain text
- Fixed width format
- Hadoop Sequence files
- JSON files
- Avro files
- Delta Lake
Flowman supports many relational databases via JDBC connectivity.
- MS SQL Server
- Azure SQL
Support for additional databases can be added on request with little effort.
Flowman is neither a replacement for schedulers like Apache Airflow or Oozie, nor does it contain a job scheduler that automatically starts jobs at specific times. Job scheduling is an overarching concern that many different tools depend on, so leaving it out is not a shortcoming of Flowman but a deliberate design decision, since excellent tools for it already exist.
This means you can use any existing scheduler that can start a bash script (which is essentially what the Flowman executables are); Oozie and Airflow, for example, work just fine.
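To illustrate what "starting a bash script" amounts to from the scheduler's side, here is a minimal Python sketch of a task that launches a script and reports its exit code. It is not taken from the Flowman documentation: the function names and the wrapper script path in the usage comment are hypothetical, and the actual Flowman command line to put inside such a script should be taken from the official documentation.

```python
import subprocess
import sys
from typing import List, Sequence


def build_command(script: str, args: Sequence[str] = ()) -> List[str]:
    """Assemble the argument vector for a script call.

    Passing a list (instead of one shell string) avoids quoting problems.
    """
    return [script, *args]


def run_job(command: Sequence[str], timeout_seconds: float = 3600.0) -> int:
    """Start the command, wait for it to finish, and return its exit code.

    A scheduler task would typically treat any non-zero exit code as a failure.
    """
    completed = subprocess.run(list(command), timeout=timeout_seconds, check=False)
    return completed.returncode


# Usage (hypothetical path; the wrapper script would call a Flowman executable):
#   exit_code = run_job(build_command("/path/to/run_flowman_job.sh"))
#   sys.exit(exit_code)
```

Because the only contract between the scheduler and Flowman in this sketch is a process exit code, the same pattern carries over to any scheduler that can run a shell command.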