
Build Real World Big Data Projects

Posted By: ELK1nG
Published 12/2022
MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 KHz
Language: English | Size: 4.08 GB | Duration: 5h 36m

Work with Big Data Tools, SQL Databases, AWS, ETL, Data Integration Tools & more to master real-world Big Data Projects

What you'll learn

How to Build a Scalable Data Pipeline using various Components

Data Warehouse Design

Data Preparation, Cleaning, Data Transformation and Manipulation

Industry-ready projects

Requirements

Prior knowledge of SQL, programming basics, data pipelines, and ETL concepts is beneficial

Description

A real data engineering project usually involves multiple components. Setting up a data engineering project while conforming to best practices can be extremely time-consuming. If you are:

1. A data analyst, student, scientist, or engineer looking to gain data engineering experience, but unable to find a good starter project.
2. Wanting to work on a data engineering project that simulates a real-life project.
3. Looking for an end-to-end data engineering project.
4. Looking for a good project to get data engineering experience for job interviews.

Then this course is for you. In this course, you will:

Learn how to set up data infrastructure such as Airflow, Redshift, Snowflake, etc.
Learn data pipeline best practices.
Learn how to spot failure points in data pipelines and build systems resistant to failures.
Learn how to design and build a data pipeline from business requirements.
Learn how to build an end-to-end ETL pipeline.
Set up Apache Airflow, AWS EMR, AWS Redshift, Redshift Spectrum, and AWS S3.

Tech stack:

➔ Language: Python
➔ Package: PySpark
➔ Services: Docker, Kafka, Amazon Redshift, S3, IICS, dbt, and many more

Requirements

This course presumes that students have prior knowledge of AWS or its Big Data services. Having a fair understanding of Python and SQL would help, but it is not mandatory.

New projects will be added every month.
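As a rough illustration of the extract, transform, load flow and the failure handling mentioned in the description, here is a minimal sketch in plain Python. It is not taken from the course materials. It assumes an in-memory SQLite database as a stand-in for a warehouse such as Redshift, and every name in it (`extract`, `transform`, `load`, `with_retry`, `run_pipeline`, the `orders` table, and the sample data) is hypothetical.

```python
import csv
import io
import sqlite3
import time


def extract(raw_csv):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


def transform(rows):
    """Drop rows with a missing amount and cast the rest to typed tuples."""
    cleaned = []
    for row in rows:
        amount = (row.get("amount") or "").strip()
        if not amount:
            continue  # skip incomplete records instead of failing the batch
        cleaned.append(
            (int(row["order_id"]), row["customer"].strip(), float(amount))
        )
    return cleaned


def load(conn, records):
    """Idempotent load: re-running the same batch does not duplicate rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", records)
    conn.commit()


def with_retry(step, attempts=3, delay_seconds=0.0):
    """Run a step, retrying on failure; re-raise after the last attempt."""
    for attempt in range(1, attempts + 1):
        try:
            return step()
        except Exception:
            if attempt == attempts:
                raise
            time.sleep(delay_seconds)


def run_pipeline(raw_csv, conn):
    """Extract, transform, and load one batch; return the number of rows loaded."""
    rows = with_retry(lambda: extract(raw_csv))
    records = transform(rows)
    with_retry(lambda: load(conn, records))
    return len(records)


# Hypothetical sample batch; the second row has no amount and is dropped.
SAMPLE = "order_id,customer,amount\n1,alice,10.5\n2,bob,\n3,carol,7\n"

if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(run_pipeline(SAMPLE, connection))
```

The two ideas worth noting are the retry wrapper around the steps that touch external systems and the idempotent load keyed on `order_id`, so that a retried or re-run batch leaves the table in the same state. In a real project an orchestrator such as Airflow would typically own the retry policy instead of a hand-written helper.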

Overview

Section 1: Build ETL Data Pipeline on AWS EMR Cluster

Lecture 1 Exploration of the dataset

Lecture 2 Creating EMR Cluster

Lecture 3 Log in to EMR part 1

Lecture 4 Log in to EMR part 2

Lecture 5 Upload Data into Amazon S3

Lecture 6 Using Hive as ETL Tool

Lecture 7 Hive Data Insertion

Lecture 8 Install Tableau Desktop

Lecture 9 Install Driver

Lecture 10 Connect Tableau to Amazon EMR Hive

Lecture 11 Add data schema and Table

Lecture 12 Plot charts in Tableau part 1

Lecture 13 Plot charts in Tableau part 2

Lecture 14 Plot charts in Tableau part 3

Lecture 15 Plot charts in Tableau part 4

Lecture 16 Plot charts in Tableau part 5

Lecture 17 Building Dashboard and story

Section 2: Build Modern ETL Data Pipeline using IICS

Lecture 18 Tour to Architecture diagram

Lecture 19 Exploration of the dataset

Lecture 20 Upload data to AWS S3

Lecture 21 Create PostgreSQL in AWS

Lecture 22 Download pgAdmin

Lecture 23 Set up PostgreSQL and create schemas

Lecture 24 Create schemas and order table in your Postgres instance

Lecture 25 Set up Informatica Cloud account

Lecture 26 Add S3 Connection part 1

Lecture 27 Add S3 Connection part 2

Lecture 28 Add Postgres Connection

Lecture 29 Create customer Destination in Data Warehouse

Lecture 30 EL for AWS S3 to data warehouse

Lecture 31 Create order Destination in Data Warehouse

Lecture 32 EL for app database to data warehouse

Lecture 33 Create dbt account

Lecture 34 dbt part 1

Lecture 35 dbt part 2

Lecture 36 dbt part 3

Lecture 37 dbt part 4

Lecture 38 dbt part 5

Lecture 39 dbt part 6

Lecture 40 dbt part 7

Lecture 41 dbt part 8

Lecture 42 Tableau and Postgres setup

Lecture 43 How to Build charts in Tableau

Section 3: Create A Data Pipeline based on Messaging Using PySpark and Airflow

Lecture 44 Tour to Architecture diagram

Lecture 45 Create EC2 Instance

Lecture 46 SSH into EC2 Instance

Lecture 47 Environment setup with Docker

Lecture 48 Copy Important folder from local to EC2 and give required permissions

Lecture 49 Connect to different services locally after port forwarding

Lecture 50 Get into bash shell of different containers

Lecture 51 Insert NiFi Template

Lecture 52 Data Extraction with NiFi

Lecture 53 Data encryption parsing

Lecture 54 Data sources HDFS Kafka part 1

Lecture 55 Data sources HDFS Kafka part 2

Lecture 56 Data sources HDFS Kafka part 3

Lecture 57 Streaming data from Kafka to PySpark

Lecture 58 PySpark streaming output Kafka NiFi HDFS part 1

Lecture 59 PySpark streaming output Kafka NiFi HDFS part 2

Lecture 60 Move Data HDFS to Hive Table part 1

Lecture 61 Move Data HDFS to Hive Table part 2

Lecture 62 Dataflow Orchestration with Airflow part 1

Lecture 63 Dataflow Orchestration with Airflow part 2

Lecture 64 Connecting with Data Visualization Tool

Lecture 65 Plot charts

This course is for people with some software background who want to learn new technology in big data analysis. It focuses on various big data tools; some data engineering and data science concepts are introduced along the way, but they are not the focus. If you want to learn how to build data engineering projects, then this course is for you. It is also for data analysts and data engineers who are curious about big data tools and how they relate to their work.