    Data Engineering using AWS Analytics Services (Updated 4/2022)

    Posted By: BlackDove
    Data Engineering using AWS Analytics Services (Updated 10/2021)
    Genre: eLearning | MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 kHz
    Language: English | Size: 10.0 GB | Duration: 26h 15m

    Build Data Engineering Pipelines using AWS Analytics Services such as Glue, EMR, Athena, Kinesis, QuickSight, etc.

    What you'll learn
    Data Engineering leveraging AWS Analytics features
    Managing Tables using Glue Catalog
    Engineering Batch Data Pipelines using Glue Jobs
    Orchestrating Batch Data Pipelines using Glue Workflows
    Running Queries using Athena, a serverless query engine service
    Using AWS Elastic MapReduce (EMR) Clusters for building Data Pipelines
    Using AWS Elastic MapReduce (EMR) Clusters for reports and dashboards
    Data Ingestion using Lambda Functions
    Scheduling using EventBridge
    Engineering Streaming Pipelines using Kinesis
    Streaming Web Server logs using Kinesis Firehose
    Overview of data processing using Athena
    Running Athena queries or commands using CLI
    Running Athena queries using Python boto3
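
    The last item above, running Athena queries from Python, can be sketched roughly as follows. This is a minimal illustration and not taken from the course: the helper name `run_athena_query` and its parameters are hypothetical, while `start_query_execution`, `get_query_execution`, and `get_query_results` are the Athena client operations exposed by boto3. The client is passed in as an argument so the sketch can be read and exercised without AWS credentials.

```python
import time


def run_athena_query(client, sql, database, output_location, poll_seconds=1.0):
    """Submit a query to Athena, wait for it to finish, return rows as lists of strings.

    `client` is assumed to be an Athena client, e.g. boto3.client("athena").
    `output_location` is an S3 URI where Athena writes the result files.
    """
    started = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Poll until the query leaves the QUEUED/RUNNING states.
    while True:
        execution = client.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query {query_id} ended in state {state}")

    # Only the first page of results is fetched here; large result sets
    # would need pagination, which this sketch leaves out.
    results = client.get_query_results(QueryExecutionId=query_id)
    return [
        [cell.get("VarCharValue") for cell in row["Data"]]
        for row in results["ResultSet"]["Rows"]
    ]
```

    Against a real account, the call would look something like `run_athena_query(boto3.client("athena"), "SELECT 1", "default", "s3://<your-bucket>/athena-results/")`, where the database name and bucket are placeholders. For SELECT queries the first returned row holds the column headers.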

    Description
    Data Engineering is all about building Data Pipelines that move data from multiple sources into a Data Lake or Data Warehouse, and then from the Data Lake or Data Warehouse to downstream systems. As part of this course, I will walk you through how to build Data Engineering Pipelines using the AWS Analytics Stack. It includes services such as Glue, Elastic MapReduce (EMR), Lambda Functions, Athena, QuickSight, and many more.

    Here are the high-level steps which you will follow as part of the course.

    Setup Development Environment

    Getting Started with AWS

    Development Life Cycle of PySpark

    Overview of Glue Components

    Setup Spark History Server for Glue Jobs

    Deep Dive into Glue Catalog

    Exploring Glue Job APIs

    Glue Job Bookmarks

    Data Ingestion using Lambda Functions

    Streaming Pipeline using Kinesis

    Consuming Data from S3 using boto3

    Populating GitHub Data to DynamoDB

    Getting Started with AWS

    Introduction - AWS Getting Started

    Create S3 Bucket

    Create IAM Group and User

    Overview of Roles

    Create and Attach Custom Policy

    Configure and Validate AWS CLI

    Development Lifecycle for PySpark

    Setup Virtual Environment and Install PySpark

    Getting Started with PyCharm

    Passing Run Time Arguments

    Accessing OS Environment Variables

    Getting Started with Spark

    Create Function for Spark Session

    Setup Sample Data

    Read data from files

    Process data using Spark APIs

    Write data to files

    Validating Writing Data to Files

    Productionizing the Code
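
    Two of the steps in this section, passing run-time arguments and accessing OS environment variables, can be illustrated with a small standard-library sketch. It is not taken from the course: the function name `resolve_job_config`, the argument order, and the `ENVIRON` variable name are all assumptions made for the example.

```python
import os
import sys


def resolve_job_config(argv, environ):
    """Combine command-line arguments and environment variables into one config dict."""
    if len(argv) < 3:
        raise SystemExit("usage: app.py <source_dir> <target_dir>")
    return {
        "source_dir": argv[1],
        "target_dir": argv[2],
        # ENVIRON is a hypothetical variable name; fall back to "DEV" when unset.
        "environ": environ.get("ENVIRON", "DEV"),
    }


# In a real job this would be resolve_job_config(sys.argv, os.environ);
# explicit values are used here so the sketch runs on its own.
config = resolve_job_config(["app.py", "data/in", "data/out"], {"ENVIRON": "PROD"})
print(config)
```

    Passing `argv` and `environ` in as parameters, rather than reading `sys.argv` and `os.environ` inside the function, keeps the logic easy to test before the code is productionized.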

    Overview of Glue Components

    Introduction - Overview of Glue Components

    Create Crawler and Catalog Table

    Analyze Data using Athena

    Creating S3 Bucket and Role

    Create and Run the Glue Job

    Validate using Glue Catalog Table and Athena

    Create and Run Glue Trigger

    Create Glue Workflow

    Run Glue Workflow and Validate

    Who this course is for:
    Beginner or Intermediate Data Engineers who want to learn AWS Analytics Services for Data Engineering
    Intermediate Application Engineers who want to explore Data Engineering using AWS Analytics Services
    Data and Analytics Engineers who want to learn Data Engineering using AWS Analytics Services
    Testers who want to learn Databricks to test Data Engineering applications built using AWS Analytics Services