Apache Airflow Using Google Cloud Composer: Introduction
Last updated 5/2022
MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 KHz
Language: English | Size: 2.83 GB | Duration: 3h 51m
With Google Cloud Composer, learn Apache Airflow without any local installation, so the focus stays on Airflow topics.
What you'll learn
Understand automation of task workflows through Airflow
Airflow architecture: on-premises (local install), cloud, single node, multiple node
How to use connection functionality to connect to different systems to automate data pipelines
What Google Cloud BigQuery is and, briefly, how it can be used in data warehousing as well as in an Airflow DAG
Master core functionalities such as DAGs, Operators, and Tasks through hands-on demonstrations
Understand advanced functionalities like XCom, Branching, and SubDAGs through hands-on demonstrations
Get an overview of SLAs and the Kubernetes executor in Apache Airflow
The source files of the Python DAG programs (9 .py files) used in the demonstrations are available for download so students can practice
Requirements
Google Cloud Platform account, or even a free trial account. No install required.
A good understanding of Python code and some exposure to Bash shell scripting will help.
Description
Apache Airflow is an open-source platform to programmatically author, schedule, and monitor workflows. Cloud Composer is a fully managed workflow orchestration service that empowers you to author, schedule, and monitor pipelines that span across clouds and on-premises data centers. Built on the popular Apache Airflow open source project and operated using the Python programming language, Cloud Composer is free from lock-in and easy to use.
Because Apache Airflow is hosted in the cloud (Google Cloud Composer), learners can focus on Apache Airflow functionality and learn quickly, without the hassle of installing Apache Airflow locally on a machine.
Cloud Composer pipelines are configured as directed acyclic graphs (DAGs) using Python, making it easy for users of any experience level to author and schedule a workflow. One-click deployment yields instant access to a rich library of connectors and multiple graphical representations of your workflow in action, increasing pipeline reliability by making troubleshooting easy.
This course is designed with beginners in mind, that is, first-time users of Cloud Composer / Apache Airflow. Each topic starts with a presentation that discusses the concepts, followed by a hands-on demonstration to reinforce the understanding.
The source files of the Python DAG programs used in the demonstrations (9 Python files) are available for download for further practice by students. Happy learning!
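To make the DAG idea above concrete, here is a minimal sketch in plain Python. It is not Airflow code and it is not taken from the course files: the task names (`extract`, `transform`, `load`, `notify`) and the helper `run_order` are hypothetical, and the standard-library `graphlib` module (Python 3.9+) stands in for the scheduler. It only shows how declaring dependencies between tasks determines a valid run order, and why a cycle is rejected.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
pipeline = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "notify": {"load"},
}


def run_order(dependencies):
    """Return one valid execution order for the given task dependencies.

    Raises graphlib.CycleError if the dependencies contain a cycle,
    which is why the graph has to be acyclic.
    """
    return list(TopologicalSorter(dependencies).static_order())


order = run_order(pipeline)
print(order)  # ['extract', 'transform', 'load', 'notify']
```

In Airflow itself the same relationships are declared between operator tasks inside a DAG definition, and the scheduler, rather than a helper like this one, decides when each task runs.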
Overview
Section 1: Course Overview
Lecture 1 Course Overview - Topics of coverage
Section 2: Introduction
Lecture 2 Data pipelines & use cases for Apache Airflow
Lecture 3 What is a Task and why is Orchestration needed?
Lecture 4 What is Apache Airflow & environment options?
Section 3: What is Airflow - Directed Acyclic Graph (DAG) & operators?
Lecture 5 What is Airflow - Directed Acyclic Graph
Section 4: Apache Airflow architecture
Lecture 6 Apache Airflow architecture
Lecture 7 Apache Airflow - Single Node vs Multinode
Section 5: Google Cloud Platform: Cloud Composer used as Apache Airflow
Lecture 8 Provisioning Google Composer - Apache Airflow environment - Part 1
Lecture 9 Provisioning Google Composer - Apache Airflow environment - Part 2
Lecture 10 Navigation - Cloud Composer (Apache Airflow) Web UI
Section 6: Understanding Apache Airflow program structure
Lecture 11 Understanding Apache Airflow program structure
Section 7: Activity 1: Create and submit Apache Airflow DAG program
Lecture 12 Activity 1: Create and submit Apache Airflow DAG program
Section 8: Activity 2: Using Templating functionality in Apache Airflow program
Lecture 13 Activity 2: Using Templating functionality in Apache Airflow program
Lecture 14 Activity 2: Using Templating functionality in Apache Airflow program - Part 2
Section 9: Using Variables in Apache Airflow
Lecture 15 What is a variable in Apache Airflow and when to use it?
Lecture 16 Activity 3: Variables usage in DAG Python program
Section 10: Activity 4: Calling Bash script in different folder / different machine.
Lecture 17 Activity 4: Calling Bash script in different folder / different machine - Part 1
Lecture 18 Activity 4: Calling Bash script in different folder / different machine - Part 2
Section 11: Creating connections in Apache Airflow
Lecture 19 Why connections are required in Apache Airflow
Lecture 20 Navigation and creating connection steps in Apache Airflow
Lecture 21 Activity 5: Creating and testing connection in Apache Airflow - Part 1
Lecture 22 Activity 5: Creating and testing connection in Apache Airflow - Part 2
Section 12: Using Google Cloud BigQuery with Apache Airflow data pipelines
Lecture 23 What is Google Cloud BigQuery?
Lecture 24 Creation of custom BigQuery table
Lecture 25 BigQuery data upload from Excel sheet (CSV file)
Lecture 26 Activity 6: Apache Airflow DAG data pipeline for BigQuery
Section 13: Cross communication between tasks - XCom
Lecture 27 What is XCom?
Lecture 28 Activity 7: XCom demonstration pipeline
Section 14: Branching based on conditions
Lecture 29 Overview about Branching Functionality
Lecture 30 Activity 8: Tasks Branching demonstration
Section 15: SubDAGs
Lecture 31 What is a SubDAG?
Lecture 32 Activity 9: SubDAGs demonstration
Section 16: Other functionalities
Lecture 33 Service Level Agreement with Airflow
Lecture 34 Airflow now supports Kubernetes
Lecture 35 Sensors
Section 17: Apache Airflow Vs Apache Beam and Spark - Quick comparison
Lecture 36 Apache Airflow Vs Apache Beam and Spark - Quick comparison
Section 18: Bonus
Lecture 37 Concluding remarks
Who this course is for
People interested in data warehousing, big data, and data engineering
People interested in automated tools for task workflow scheduling
Students interested in learning about Airflow
Professionals who wish to explore how Apache Airflow can be used in task scheduling and building data pipelines