    PySpark Essentials For Data Scientists (Big Data + Python)

    Posted By: ELK1nG
    Last updated 5/2022
    MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 KHz
    Language: English | Size: 8.76 GB | Duration: 17h 17m

    Learn how to wrangle Big Data for Machine Learning using Python in PySpark taught by an industry expert!

    What you'll learn
    Use Python with Big Data on a distributed framework (Apache Spark)
    Work with REAL datasets on realistic consulting projects
    How to stream LIVE data from Twitter using Spark Structured Streaming
    Learn how to create a "Pandora Like" app that classifies songs into genres using machine learning
    Flag suspicious job postings using Natural Language Processing
    Use machine learning to predict optimal cement strength and the factors that affect it
    Classify Christmas cooking recipes using Topic Modeling (LDA)
    Customer Segmentation using Gaussian Mixture Modeling (Clustering)
    Use cluster analysis to develop a strategy designed to increase college graduation rates for underprivileged populations
    How to use the k-means clustering algorithm to define a marketing outreach strategy
    Integrate a UI to monitor your model training and development process with MLflow
    Theory and application of cutting edge data science algorithms
    Manipulate, Join and Aggregate Dataframes in Spark with Python
    Learn how to apply Spark's machine learning techniques on distributed Dataframes
    Cross Validation & Hyperparameter Tuning
    Frequent Pattern Mining Techniques
    Classification & Regression Techniques
    Data Wrangling for Natural Language Processing
    How to write SQL Queries in Spark
    Requirements
    Familiarity with Python is helpful but not required
    Some background in data science is helpful but not required
    A hunger to LEARN
    Description
    This course is for data scientists (or aspiring data scientists) who want to get PRACTICAL training in PySpark (Python for Apache Spark) using REAL WORLD datasets and APPLICABLE coding knowledge that you’ll use every day as a data scientist! By enrolling in this course, you’ll gain access to over 100 lectures, hundreds of example problems and quizzes, and over 100,000 lines of code! I’m going to provide the essentials you need to know to be an expert in PySpark by the end of this course, which I’ve designed based on my EXTENSIVE experience consulting as a data scientist for clients like the IRS, the US Department of Labor and United States Veterans Affairs.

    I’ve structured the lectures and coding exercises for real world application, so you can understand how PySpark is actually used on the job. We are also going to dive into custom functions that I wrote MYSELF to get you up and running in the MLlib API fast and make getting started building machine learning models a breeze! We will also touch on MLflow, which will help us manage and track our model training and evaluation process in a custom user interface and make you even more competitive on the job market!

    Each section will have a concept review lecture, as well as code-along activities and structured problem sets for you to work through to help you put what you have learned into action, along with the solutions to each problem in case you get stuck. Additionally, real world consulting projects have been provided in every section with AUTHENTIC datasets to help you think through how to apply each of the concepts we have covered.

    Lastly, I’ve written up some condensed review notebooks and handouts of all the course content to make it super easy for you to reference later on. This will be super helpful once you land your first job programming in PySpark!

    I can’t wait to see you in the lectures! And I really hope you enjoy the course! I’ll see you in the first lecture!

    Overview

    Section 1: Course Introduction

    Lecture 1 Frequently Asked Questions

    Lecture 2 Course Introduction

    Lecture 3 Course Orientation

    Lecture 4 Course Materials Bulk Download

    Lecture 5 Resources for Setting up PySpark

    Lecture 6 Python Cheatsheet Resources

    Lecture 7 Introduction to PySpark

    Lecture 8 Transitioning from Python to PySpark Concept Review

    Lecture 9 Transitioning from Python to PySpark Code Along Activity

    Section 2: Dataframe Essentials: Read, Write, Validate & Explore

    Lecture 10 Dataframe Essentials Concept Review

    Lecture 11 A little something to keep you going….

    Lecture 12 Read, Write and Validate Dataframes Code Along Activity

    Lecture 13 Read, Write and Validate Data HW

    Lecture 14 Read, Write and Validate Data HW Solutions Code Review

    Lecture 15 A little something to keep you going….

    Lecture 16 Search and Filter Dataframes Code Along Activity

    Lecture 17 Search and Filter Dataframes HW

    Lecture 18 Search and Filter Dataframes HW Solution Code Review

    Lecture 19 A little something to keep you going….

    Lecture 20 SQL Options in Spark/PySpark Code Along Activity

    Lecture 21 SQL Options in Spark/PySpark HW

    Lecture 22 SQL Options in Spark/PySpark HW Solutions

    Lecture 23 A little something to keep you going….

    Section 3: Dataframe Essentials: Clean, Manipulate, Join, Aggregate

    Lecture 24 Manipulating Dataframes Code Along Activity

    Lecture 25 Manipulating Dataframes HW

    Lecture 26 Manipulating Dataframes HW Solution

    Lecture 27 A little something to keep you going….

    Lecture 28 Aggregating Data in Dataframes Code Along Activity

    Lecture 29 Aggregating Data in Dataframes HW

    Lecture 30 Aggregating Data in Dataframes HW Solution

    Lecture 31 A little something to keep you going….

    Lecture 32 Joining and Appending Dataframes Code Along Activity

    Lecture 33 Joining and Appending Dataframes HW

    Lecture 34 Joining and Appending Dataframes HW Solution Code Review

    Lecture 35 A little something to keep you going….

    Lecture 36 Handling Missing Data in Dataframes Code Along Activity

    Lecture 37 Handling Missing Data in Dataframes HW

    Lecture 38 Handling Missing Data in Dataframes HW Solution

    Lecture 39 Dataframe Essentials Coding Master Review

    Lecture 40 A little something to keep you going….

    Section 4: Introduction to Spark MLlib

    Lecture 41 Introduction to Machine Learning Concept Review

    Lecture 42 Introduction to MLlib Concept Review

    Lecture 43 Model Selection and Tuning in MLlib Concept Review

    Lecture 44 Two Links to Bookmark

    Lecture 45 A little something to keep you going….

    Section 5: Classification in MLlib

    Lecture 46 Introduction to Classification in MLlib Concept Review

    Lecture 47 A little something to keep you going….

    Lecture 48 Classification in MLlib Code Along Part 1: Data Formatting and Transformations

    Lecture 49 Classification in MLlib Code Review Part 2.0: Train and Evaluate Models [Intro]

    Lecture 50 Classification in MLlib Code Review Part 2.1: Train & Test Models [Logistic]

    Lecture 51 Classification in MLlib Code Review Part 2.2: Train & Test Models [1 vs Rest]

    Lecture 52 A little something to keep you going….

    Lecture 53 Classification in MLlib Code Review Part 2.3: Train & Test Models[Multilayer PC]

    Lecture 54 Classification in MLlib Code Review Part 2.4: Train & Test Models [Naive Bayes]

    Lecture 55 Classification in MLlib Code Review Part 2.5: Train & Test Models [Linear SVM]

    Lecture 56 Classification in MLlib Code Review Part 2.6: Train & Test Models[Decision Tree]

    Lecture 57 Classification in MLlib Code Review Part 2.7: Train & Test Models[Random Forest]

    Lecture 58 Classification in MLlib Code Review Part 2.8: Train & Test Models [GBT]

    Lecture 59 A little something to keep you going….

    Lecture 60 BONUS: Add loop functions to your training and evaluation script

    Lecture 61 BONUS: Leverage MLflow to better track and manage your results

    Lecture 62 Classification Project

    Lecture 63 Remember to be creative with this project!

    Lecture 64 Classification Project Solution

    Section 6: Natural Language Processing in MLlib

    Lecture 65 Introduction to Natural Language Processing

    Lecture 66 Natural Language Processing Concept Review [Part 1: Feature Transformers]

    Lecture 67 Natural Language Processing Concept Review [Part 2: Feature Extractors]

    Lecture 68 A little something to keep you going….

    Lecture 69 Natural Language Processing Code Along Activity Part 1: Data Prep

    Lecture 70 Natural Language Processing Code Along Activity Part 2: Vectorize, Train & Eval

    Lecture 71 Natural Language Processing Project

    Lecture 72 Natural Language Processing Project Solution

    Lecture 73 A little something to keep you going….

    Section 7: Regression in MLlib

    Lecture 74 Regression in MLlib Concept Review

    Lecture 75 Regression in MLlib Code Review Introduction

    Lecture 76 Regression in MLlib Code Review Part 1: Data Prep

    Lecture 77 Regression in MLlib Code Review Part 2.0: Linear Regression

    Lecture 78 A little something to keep you going….

    Lecture 79 Regression in MLlib Code Review Part 2.1: Decision Tree Regression

    Lecture 80 Regression in MLlib Code Review Part 2.2: Random Forest Regression

    Lecture 81 Regression in MLlib Code Review Part 2.3: Gradient Boosted Tree Regression

    Lecture 82 A little something to keep you going….

    Lecture 83 BONUS: Add loop functions to your regression training and evaluation script

    Lecture 84 Regression Project

    Lecture 85 And finally… have FUN with this project and LOVE what you do!

    Lecture 86 Regression Project Solution Code Along Activity

    Section 8: Clustering in PySpark

    Lecture 87 Intro to Clustering in MLlib Concept Review

    Lecture 88 K-Means & Bisecting K-Means in MLlib Code Along Activity

    Lecture 89 Latent Dirichlet Allocation in MLlib Code Along Activity

    Lecture 90 A little something to keep you going….

    Lecture 91 Gaussian Mixture Modeling in MLlib Code Along Activity

    Lecture 92 Clustering Project Introduction

    Lecture 93 Clustering Project Solution Code Review

    Lecture 94 A little something to keep you going….

    Section 9: Frequent Pattern Mining in MLlib

    Lecture 95 Frequent Pattern Mining in MLlib Concept Review

    Lecture 96 Frequent Pattern Mining Code Along Activity [Part 1: FP-Growth]

    Lecture 97 Frequent Pattern Mining Code Along Activity [Part 2: PrefixSpan]

    Lecture 98 A little something to keep you going….

    Lecture 99 Frequent Pattern Mining Project Introduction

    Lecture 100 Frequent Pattern Mining Project Solution Code Review

    Section 10: Spark Structured Streaming

    Lecture 101 Intro to Spark Structured Streaming

    Lecture 102 Intro to Streaming Data Using Sockets

    Lecture 103 Twitter Structured Streaming Project Setup and Intro

    Lecture 104 Twitter Project Tweet Listener Setup

    Lecture 105 Twitter Project Structured Stream Setup and Implementation

    Lecture 106 Additional Spark Structured Streaming Resources

    Section 11: Course Wrap-up

    Lecture 107 Closing Remarks

    Lecture 108 Tips for success moving forward

    Lecture 109 And finally… remember to set your goals HIGH!

    Who this course is for
    Data Scientists interested in learning PySpark
    PySpark developers looking to strengthen their coding skills
    Python developers who need to work with big data
    Data Scientists who want to learn to work with big data