Airflow 2

Airflow is a platform to programmatically author, schedule and monitor workflows.


Use Airflow to author workflows as Directed Acyclic Graphs (DAGs) of tasks. The Airflow scheduler executes your tasks on an array of workers while following the specified dependencies. Rich command line utilities make performing complex surgeries on DAGs a snap. The rich user interface makes it easy to visualize pipelines running in production, monitor progress, and troubleshoot issues when needed.
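To make "following the specified dependencies" concrete, here is a minimal sketch using only the Python standard library (`graphlib`, Python 3.9+). It is not Airflow's API or its scheduler; the task names are hypothetical and the point is only to show that a DAG yields an order in which every task runs after the tasks it depends on.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical workflow: each task maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
    "notify": {"load"},
}


def execution_order(deps):
    """Return one valid execution order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)

# A graph with a cycle is not a DAG, so no valid order exists.
try:
    execution_order({"a": {"b"}, "b": {"a"}})
except CycleError:
    print("cycle detected: not a valid DAG")
```

Note that "report" and "notify" both depend only on "load", so either may come first; a real scheduler could run them in parallel on different workers.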

When workflows are defined as code, they become more maintainable, versionable, testable, and collaborative.


Principles


  • Dynamic: Airflow pipelines are configuration as code (Python), so you can write code that instantiates pipelines dynamically.

  • Extensible: Easily define your own operators, executors and extend the library so that it fits the level of abstraction that suits your environment.

  • Elegant: Airflow pipelines are lean and explicit. Parameterizing your scripts is built into the core of Airflow using the powerful Jinja templating engine.

  • Scalable: Airflow has a modular architecture and uses a message queue to orchestrate an arbitrary number of workers. Airflow is ready to scale to infinity.
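As a rough illustration of the "Dynamic" and "Elegant" principles, the sketch below generates one task per entry in a list and fills in a parameterized command. It is plain Python, not Airflow code: the table names, the `build_pipeline` helper and the command strings are all hypothetical, and `string.Template` is only a standard-library stand-in for the Jinja templating that Airflow actually uses.

```python
from string import Template

# Hypothetical configuration: the table names are illustrative only.
TABLES = ["users", "orders", "payments"]

# Stand-in for a Jinja-templated command. Airflow itself uses Jinja
# syntax, not string.Template.
COMMAND = Template("export --table $table --date $ds")


def build_pipeline(tables, ds):
    """Generate one export task per table, plus a final task that waits for all."""
    tasks = {}
    for table in tables:
        tasks[f"export_{table}"] = {
            "command": COMMAND.substitute(table=table, ds=ds),
            "upstream": set(),
        }
    tasks["publish"] = {
        "command": "publish --date " + ds,
        "upstream": {f"export_{table}" for table in tables},
    }
    return tasks


pipeline = build_pipeline(TABLES, ds="2020-09-01")
for name, task in pipeline.items():
    print(name, "->", task["command"])
```

Adding a table to `TABLES` adds a task to the pipeline without touching the rest of the code, which is the practical benefit of defining pipelines as code.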

Beyond the Horizon

Airflow is not a data streaming solution. Tasks do not move data from one to the other (though tasks can exchange metadata!). Airflow is not in the Spark Streaming or Storm space; it is more comparable to Oozie or Azkaban.

Workflows are expected to be mostly static or slowly changing. You can think of the structure of the tasks in your workflow as slightly more dynamic than a database structure would be. Airflow workflows are expected to look similar from one run to the next; this allows for clarity around unit of work and continuity.

Apache Airflow 2.0 is coming with a lot of big improvements. In this post, I’m going to explain the most important features to expect and what they solve. If you are totally new to Airflow, check my introduction course, or if you want an in-depth approach, check my in-depth course. Let’s get started.

Version 2.0 brings new features that take Apache Airflow to the next level in terms of scalability, resiliency and performance.

  • The scheduler won’t be a single point of failure anymore. It becomes highly available like the other components.
  • The REST API is going to be stable, following the OpenAPI 3 specification.
  • DAG serialization allows you to make your web server, and soon the scheduler as well, stateless.
  • KEDA with the Celery Executor is an incredible way to scale Apache Airflow, and the Kubernetes Executor is finally challenged.


and much more.
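To give an idea of what the stable REST API looks like from a client, here is a small sketch that builds (but does not send) a request to list DAGs. It assumes the `/api/v1/dags` path used by the released 2.x line and assumes the webserver has been configured to accept basic authentication; the host, username and password are hypothetical placeholders.

```python
import base64
import urllib.request


def build_list_dags_request(base_url, username, password):
    """Build (but do not send) a GET request for the DAG list endpoint."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/dags",
        headers={
            "Authorization": "Basic " + token,
            "Accept": "application/json",
        },
        method="GET",
    )


# Hypothetical host and credentials, for illustration only.
request = build_list_dags_request("http://localhost:8080", "admin", "admin")
print(request.get_method(), request.full_url)
```

Against a running webserver, passing `request` to `urllib.request.urlopen` should return a JSON document describing the DAGs, provided the configured authentication backend accepts the credentials.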


If you are wondering when Apache Airflow 2.0 will come out, well, it is scheduled for the third quarter of 2020, but no promises. Don’t forget, it’s an open source project and the vast majority of people work on it for free, so be patient.

Check the video below for a complete summary of all the features with beautiful examples 🙂

Enjoy.
