Data Engineering Fundamentals With Prefect Workflow
Published 3/2024
MP4 | Video: h264, 1920x1080 | Audio: AAC, 44.1 KHz
Language: English | Size: 2.11 GB | Duration: 3h 9m
Data Engineering Fundamentals with Prefect Data pipeline using Oracle Cloud Infrastructure - VM and Autonomous DB
What you'll learn
What Data Engineering is and how it differs from Data Analysis and Data Science
Provisioning of Virtual Machine and Oracle Cloud Autonomous Database in Oracle Cloud Infrastructure
Introduction to Data Pipeline workflow tool - Prefect.
Demonstration of the Prefect client with the Prefect dashboard & its integration
Building and executing tasks using the Python Prefect libraries, task dependencies, and views in the Prefect dashboard
Demonstration of Webhooks with Prefect.
Requirements
Access to Oracle Cloud Infrastructure free tier
Basic Linux and Python programming skills.
Description
Data engineering is the process of designing and building systems that let people collect and analyze raw data from multiple sources and formats. These systems empower people to find practical applications of the data, which businesses can use to thrive.
Companies of all sizes have huge amounts of disparate data to comb through to answer critical business questions. Data engineering is designed to support the process, making it possible for consumers of data, such as analysts, data scientists and executives, to reliably, quickly and securely inspect all of the data available.
About a decade ago, data analysis was limited to the structured data available in a relational database or an ERP system. Decisions were based on analysis of historic data, and ETL (Extract, Transform & Load) tools were used to feed the data warehousing system. However, in this dynamic, ever-changing world, non-relational data also needs to be used for quick analysis. So apart from database transactions, other sources of information such as CSV files, webhooks, HTTP and MQTT need to be handled as appropriate.
Furthermore, the process of ETL has evolved into data pipelines. A data pipeline is a method in which raw data is ingested from various data sources and then ported to a data store, like a data lake or data warehouse, for analysis. In a data pipeline, dependencies can be built between different tasks. A task can also be triggered by an event, such as an order being booked or an issue being raised. Webhooks are used for this.
Prefect is one such newly evolved data pipeline or workflow tool, in which one can build not only static task dependencies but also task dependencies based on events. This course uses the cloud version of the Prefect workflow tool, which can be invoked from a cloud-based virtual machine.
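The two ideas above, static task dependencies and event-triggered runs, can be sketched in plain Python. This is only an illustration using the standard library, not Prefect's API and not code from the course. The sample rows, the task names and the "order_booked" event type are all hypothetical.

```python
import json
from graphlib import TopologicalSorter

# Hypothetical in-memory source standing in for a CSV file or database table.
# Each row is "order_id,product,quantity".
RAW_ROWS = ["1,widget,2", "2,gadget,5", "3,widget,1"]


def extract(_results):
    """Ingest raw records from the source."""
    return list(RAW_ROWS)


def transform(results):
    """Parse each raw row and total the quantity per product."""
    totals = {}
    for row in results["extract"]:
        _order_id, product, qty = row.split(",")
        totals[product] = totals.get(product, 0) + int(qty)
    return totals


def load(results):
    """Stand-in for writing to a data store: return a JSON document."""
    return json.dumps(results["transform"], sort_keys=True)


# Static task dependencies: each task maps to the set of tasks it depends on.
DEPENDENCIES = {"extract": set(), "transform": {"extract"}, "load": {"transform"}}
TASKS = {"extract": extract, "transform": transform, "load": load}


def run_pipeline():
    """Run every task only after the tasks it depends on have finished."""
    results = {}
    for name in TopologicalSorter(DEPENDENCIES).static_order():
        results[name] = TASKS[name](results)
    return results


def handle_event(payload):
    """Event trigger: run the pipeline only for an 'order_booked' event.

    A webhook receiver would call something like this with the body of the
    incoming HTTP POST request.
    """
    event = json.loads(payload)
    if event.get("type") == "order_booked":
        return run_pipeline()["load"]
    return None


if __name__ == "__main__":
    print(handle_event('{"type": "order_booked"}'))
```

A workflow tool such as Prefect adds scheduling, retries, logging and a dashboard on top of this basic pattern, so the sketch is meant only to make the terms "task dependency" and "event trigger" concrete.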
Knowledge of Python & shell scripting is essential.
This course covers the following topics:
•Difference between Data Engineering Vs Data Analysis Vs Data Science
•An overview of Data Science & Machine Learning.
•Extract, Transform, Load vs Data pipeline.
•Provisioning an Oracle Linux Virtual Machine on Oracle Cloud Infrastructure.
•Prefect Cloud Data pipeline and Client VM setup.
•Documentation reference - Prefect Workflow / Data pipelines.
•Hands-on Demonstration of Prefect Flow with Task dependency.
•Building a Prefect dataflow pipeline for Oracle Database extract using Python.
•Introduction to Webhooks and Hands-on Demonstration with Prefect & GitHub.
•Career Path for Data Engineers
Happy Learning!
Overview
Section 1: Introduction
Lecture 1 Course Coverage
Section 2: Difference between Data Engineering Vs Data Analysis Vs Data Science
Lecture 2 Difference between Data Engineering Vs Data Analysis Vs Data Science
Lecture 3 An overview on Data science
Section 3: Extract, Transform, Load vs Data pipeline
Lecture 4 Extract, Transform, Load vs Data pipeline
Lecture 5 Comparison between Apache Airflow and Prefect Data pipeline - Orchestration
Section 4: Provisioning Oracle Linux Virtual machine On Oracle Cloud Infrastructure.
Lecture 6 What is Virtualization?
Lecture 7 Steps involved in creation of Linux Virtual Machine on OCI
Lecture 8 Creating Private & Public Keys using PuTTYgen
Lecture 9 Provisioning Compartment and Virtual cloud Network (VCN) in OCI
Lecture 10 Creating Linux 9 - Virtual Machine on OCI
Lecture 11 Connecting through PuTTY to Virtual Machine
Lecture 12 Executing scripts in VM for Linux GUI - Part 1
Lecture 13 Executing scripts in VM for Linux GUI - Part 2
Section 5: Prefect Cloud Datapipeline and Client VM
Lecture 14 Overview : Prefect Cloud Environment
Lecture 15 Prefect Client installation on Linux 9 - VM
Lecture 16 Connecting to Prefect Cloud Dashboard Data pipeline from Client VM
Lecture 17 Executing the first Flow based datapipeline program using Prefect orchestration
Section 6: Documentation reference - Prefect Workflow / Datapipelines
Lecture 18 Documentation reference - Prefect Workflow / Datapipelines
Section 7: Hands-on Demonstration of Prefect Flow with Tasks
Lecture 19 Hands-on Demonstration of Prefect Flow with Task
Section 8: Building Prefect dataflow pipeline for Oracle Database extract using Python
Lecture 20 What is Autonomous Cloud Database?
Lecture 21 Significance of Compartment & creation-deletion of Compartment.
Lecture 22 Provisioning the Autonomous Database on OCI
Lecture 23 Different Ways to Connect to Oracle Autonomous Database
Lecture 24 Connecting through Cloud Web SQL Developer
Lecture 25 Python Connect to Oracle Autonomous Database through python library - Part 1
Lecture 26 Python Connect to Oracle Autonomous Database through python library - Part 2
Lecture 27 OLTP Vs OLAP
Lecture 28 Prefect data pipeline with two tasks and building a dependency between tasks
Section 9: Introduction to Webhooks and Hands-on Demonstration with Prefect & GitHub
Lecture 29 Understanding the difference between Webhooks, MQTT, WebSockets
Lecture 30 Hands-on Demonstration of Webhooks with Prefect Workflow and GitHub - Part 1
Lecture 31 Automated Deployment of webhook for event based workflows - Prefect & GitHub
Section 10: Career Path for Data Engineers
Lecture 32 Career Path for Data Engineers
Section 11: Concluding Remarks
Lecture 33 Concluding Remarks
Computer science students, IT consultants