Apache Flink is an open source platform for scalable batch and stream data processing.
Flink’s core is a streaming dataflow engine that provides data distribution, communication, and fault tolerance for distributed computations over data streams.
Flink includes several APIs for creating applications that use the Flink engine:
- DataSet API for static data embedded in Java, Scala, and Python,
- DataStream API for unbounded streams embedded in Java and Scala, and
- Table API with a SQL-like expression language embedded in Java and Scala.
Flink also bundles libraries for domain-specific use cases:
Apache Flink is a platform for efficient, distributed, general-purpose data processing. It features powerful programming abstractions in Java and Scala, a high-performance runtime, and automatic program optimization. It has native support for iterations, incremental iterations, and programs consisting of large DAGs of operations.
If you want to try out the system quickly, see one of the available quickstarts. For a thorough introduction to the Flink API, refer to the Programming Guide.
[Figure: overview of Flink’s stack. Each component links to its respective documentation.]
This documentation is for Apache Flink version 0.9-SNAPSHOT, which is the current development version of the upcoming major release of Apache Flink.
The Scala API uses Scala 2.10. Please make sure to use a compatible version.