
Machine Learning For Aspiring Data Scientists: Zero To Hero

Posted By: ELK1nG
Published 8/2022
MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 KHz
Language: English | Size: 9.23 GB | Duration: 16h 7m

Learn the foundations of machine learning necessary to get a job in data science. No coding experience required.

What you'll learn
Understand the foundations of machine learning even if you're a total beginner
Be able to pass job interviews for data science jobs
Learn without wasting time on things that don't come up in interviews or real work
Avoid rookie mistakes that waste companies' time and money
Requirements
No programming or advanced math experience required! You'll learn everything you need to know.
Description
This course will teach you the complete foundations of machine learning that you need to get a job in data science (and do a great job afterward). The course will help you:

- Pass job interviews and technical quizzes
- Avoid rookie mistakes that waste companies' time and money
- Be prepared for real work

Important stuff about this course:

- You won't spend hours learning stuff that never comes up in a job interview.
- Total beginners are welcome; coding experience or advanced math knowledge are not required.
- It was designed by an industry expert who's been on the hiring side of the table and knows what companies are looking for.

This course will be of great help if you are:

- A student who wants to prepare for work in data science after graduating.
- An established professional or academic who wants to switch careers to data science.
- A total beginner who wants to dabble in machine learning and data science for the first time.

How is this different from an academic course or a bootcamp? In academic courses, your teacher spends hours speaking about calculus and linear algebra, but then none of that comes up in a job interview! That in-depth knowledge certainly has a place, but it is not what most companies are looking for. In bootcamps, you tend to learn how to use many tools but not how they work under the hood. This black-box knowledge is what companies want to avoid the most in applicants! This course sits in between: you gain foundational knowledge and truly understand machine learning, without spending time on unimportant stuff.

Overview

Section 1: Machine Learning Models

Lecture 1 Modeling an epidemic

Lecture 2 The machine learning recipe

Lecture 3 The components of a machine learning model

Lecture 4 Why model?

Lecture 5 On assumptions and can we get rid of them?

Lecture 6 The case of AlphaZero

Lecture 7 Overfitting/underfitting/bias/variance

Lecture 8 Why use machine learning

Lecture 9 Notes on machine learning models

Section 2: Linear regression

Lecture 10 The InsureMe challenge

Lecture 11 Supervised learning

Lecture 12 A quick note on the word "features"

Lecture 13 Linear assumption

Lecture 14 Linear regression template

Lecture 15 Non-linear vs proportional vs linear

Lecture 16 Linear regression template revisited

Lecture 17 Loss function

Lecture 18 Training algorithm

Lecture 19 Code time

Lecture 20 R squared

Lecture 21 Why use a linear model?

Lecture 22 Kaggle notebook on linear regression

Lecture 23 Notes on supervised learning and linear regression

Lecture 24 Finding closed-form solution to linear regression (optional)
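
The course's own code lives in its lectures and Kaggle notebook; as a rough, illustrative taste of what Section 2 covers, here is a minimal linear-regression sketch in scikit-learn. It uses the built-in diabetes dataset as a stand-in (the course's InsureMe challenge data is not available here):

```python
# Minimal linear regression sketch (illustrative only, not the course's code).
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression

X, y = load_diabetes(return_X_y=True)

model = LinearRegression()
model.fit(X, y)  # training: find the weights that minimize squared error

print("R^2 on training data:", model.score(X, y))  # R squared (Lecture 20)
print("learned coefficients:", model.coef_)
```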

Section 3: Scaling and Pipelines

Lecture 25 Introduction to scaling

Lecture 26 Min-max scaling

Lecture 27 Code time (min-max scaling)

Lecture 28 The problem with min-max scaling

Lecture 29 What's your IQ?

Lecture 30 Standard scaling

Lecture 31 Code time (standard scaling)

Lecture 32 Model before and after scaling

Lecture 33 Inference time

Lecture 34 Pipelines

Lecture 35 Code time (pipelines)

Lecture 36 Kaggle notebook on scaling and pipelines

Lecture 37 Notes on scaling and pipelines
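
To give a flavor of Section 3's topic, here is a hedged sketch of standard scaling wrapped in a pipeline, again on scikit-learn's built-in diabetes data (illustrative only):

```python
# Standard scaling inside a pipeline (illustrative only).
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_diabetes(return_X_y=True)

# Standard scaling centers each feature to mean 0 and unit variance;
# the pipeline re-applies the learned scaling at inference time.
pipe = make_pipeline(StandardScaler(), LinearRegression())
pipe.fit(X, y)
print("R^2 with scaling:", pipe.score(X, y))
```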

Section 4: Regularization

Lecture 38 Spurious correlations

Lecture 39 L2 regularization

Lecture 40 Code time (L2 regularization)

Lecture 41 L2 results

Lecture 42 L1 regularization

Lecture 43 Code time (L1 regularization)

Lecture 44 L1 results

Lecture 45 Why does L1 encourage zeros?

Lecture 46 L1 vs L2: Which one is best?

Lecture 47 Kaggle notebook on regularization

Lecture 48 Notes on regularization
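
As an illustrative sketch of Section 4's L2-vs-L1 comparison (not the course's own experiment), Ridge and Lasso in scikit-learn make the difference easy to see, since L1 tends to drive some weights exactly to zero:

```python
# L2 (Ridge) vs L1 (Lasso) regularization sketch (illustrative only).
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso, Ridge

X, y = load_diabetes(return_X_y=True)

ridge = Ridge(alpha=1.0).fit(X, y)  # L2: shrinks weights toward zero
lasso = Lasso(alpha=1.0).fit(X, y)  # L1: sets some weights exactly to zero

print("Ridge zero weights:", np.sum(ridge.coef_ == 0))
print("Lasso zero weights:", np.sum(lasso.coef_ == 0))  # typically > 0
```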

Section 5: Validation

Lecture 49 Introduction to validation

Lecture 50 Why not evaluate model on training data

Lecture 51 The validation set

Lecture 52 Code time (validation set)

Lecture 53 Error curves

Lecture 54 Model selection

Lecture 55 The problem with model selection

Lecture 56 Tainted validation set

Lecture 57 Monkeys with typewriters

Lecture 58 My own validation epic fail

Lecture 59 The test set

Lecture 60 What if the model doesn't pass the test?

Lecture 61 How not to be fooled by randomness

Lecture 62 Cross-validation

Lecture 63 Code time (cross validation)

Lecture 64 Cross-validation results summary

Lecture 65 AutoML

Lecture 66 Is AutoML a good idea?

Lecture 67 Red flags: Don't do this!

Lecture 68 Red flags summary and what to do instead

Lecture 69 Your job as a data scientist

Lecture 70 Kaggle notebook on validation and cross-validation

Lecture 71 30-minute code assignment with new dataset!

Lecture 72 Notes on validation and testing

Lecture 73 Extra reading: Model retraining

Lecture 74 Extra reading: The Difference between Statistics and Machine Learning
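
A minimal sketch of the workflow Section 5 describes, assuming the standard scikit-learn split-then-cross-validate pattern (the course's own notebooks may differ): hold out a test set first, use cross-validation for model selection, and touch the test set only once at the end.

```python
# Train/validation/test discipline sketch (illustrative only).
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score, train_test_split

X, y = load_diabetes(return_X_y=True)

# Hold out a test set first; it must stay untouched until the very end.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# 5-fold cross-validation on the remaining data for model selection.
scores = cross_val_score(Ridge(alpha=1.0), X_trainval, y_trainval, cv=5)
print("CV R^2 per fold:", scores, "mean:", scores.mean())

# Only after the model is chosen do we look at the test set, once.
final = Ridge(alpha=1.0).fit(X_trainval, y_trainval)
print("Test R^2:", final.score(X_test, y_test))
```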

Section 6: Common Mistakes

Lecture 75 Intro and recap

Lecture 76 Mistake #1: Data leakage

Lecture 77 The golden rule

Lecture 78 Helpful trick (feature importance)

Lecture 79 Real example of data leakage (part 1)

Lecture 80 Real example of data leakage (part 2)

Lecture 81 Another (funny) example of data leakage

Lecture 82 Mistake #2: Random split of dependent data

Lecture 83 Another example (insurance data)

Lecture 84 Mistake #3: Look-Ahead Bias

Lecture 85 Example solutions to Look-Ahead Bias

Lecture 86 Consequences of Look-Ahead Bias

Lecture 87 How to split data to avoid Look-Ahead Bias

Lecture 88 Cross-validation with temporally related data

Lecture 89 Mistake #4: Building model for one thing, using it for something else

Lecture 90 Sketchy rationale

Lecture 91 Why this matters for your career and job search

Lecture 92 Find the error: 10-minute code assignment

Lecture 93 Assignment solution

Lecture 94 Notes on common mistakes
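
One of the mistakes this section names, look-ahead bias, has a standard code-level remedy: split temporally ordered data by time instead of at random. A minimal sketch on synthetic data (my assumption of the fix, not the course's own example):

```python
# Time-aware split to avoid look-ahead bias (illustrative only;
# the data here is synthetic, not from the course).
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import TimeSeriesSplit

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))  # pretend rows are ordered by time
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=200)

# Each fold trains only on the past and validates on the future,
# unlike a random split, which would leak future information.
for train_idx, val_idx in TimeSeriesSplit(n_splits=4).split(X):
    model = Ridge().fit(X[train_idx], y[train_idx])
    print("train up to row", train_idx[-1],
          "-> validation R^2:", round(model.score(X[val_idx], y[val_idx]), 3))
```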

Section 7: Classification - Part 1: Logistic Model

Lecture 95 Classifying images of handwritten digits

Lecture 96 Why the usual regression doesn't work

Lecture 97 Machine learning recipe recap

Lecture 98 Logistic model template (binary)

Lecture 99 Decision function and boundary (binary)

Lecture 100 Logistic model template (multiclass)

Lecture 101 Decision function and boundary (multi-class)

Lecture 102 Summary: binary vs multiclass

Lecture 103 Code time!

Lecture 104 Why the logistic model is often called logistic regression

Lecture 105 One vs Rest, One vs One

Lecture 106 Kaggle notebook on logistic model for digit classification

Lecture 107 Notes on Logistic Model
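
As a taste of Section 7, here is a multiclass logistic model on scikit-learn's built-in 8x8 digit images (an illustrative stand-in; the course's MNIST setup may differ):

```python
# Multiclass logistic model on handwritten digits (illustrative only).
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)  # 8x8 digit images, 10 classes
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

# Despite the name, this is a classifier (see Lecture 104).
clf = LogisticRegression(max_iter=5000)
clf.fit(X_train, y_train)

print("validation accuracy:", clf.score(X_val, y_val))
print("class probabilities for one image:",
      clf.predict_proba(X_val[:1]).round(3))
```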

Section 8: Classification - Part 2: Maximum Likelihood Estimation

Lecture 108 Where we're at

Lecture 109 Brier score and why it doesn't work

Lecture 110 The likelihood function

Lecture 111 Optimization task and numerical stability

Lecture 112 Let's improve the loss function

Lecture 113 Loss value examples

Lecture 114 Adding regularization

Lecture 115 Binary cross-entropy loss

Lecture 116 Notes on Maximum Likelihood Estimation
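
The loss this section builds up to, binary cross-entropy, is short enough to write by hand. A sketch with made-up predictions, checked against scikit-learn's log_loss (illustrative only):

```python
# Binary cross-entropy (negative log-likelihood) sketch (illustrative only).
import numpy as np
from sklearn.metrics import log_loss

y_true = np.array([1, 0, 1, 1, 0])
p_pred = np.array([0.9, 0.2, 0.6, 0.99, 0.4])  # predicted P(y=1)

# Loss = -mean( y*log(p) + (1-y)*log(1-p) ); clipping p guards
# numerical stability when p is exactly 0 or 1 (cf. Lecture 111).
p = np.clip(p_pred, 1e-15, 1 - 1e-15)
bce = -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

print("hand-rolled BCE:", bce)
print("sklearn log_loss:", log_loss(y_true, p_pred))  # should match
```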

Section 9: Classification - Part 3: Gradient Descent

Lecture 117 Recap

Lecture 118 No closed-form solution

Lecture 119 Naive algorithm

Lecture 120 Fog analogy

Lecture 121 Gradient descent overview

Lecture 122 The gradient

Lecture 123 Numerical calculation

Lecture 124 Parameter update

Lecture 125 Convergence

Lecture 126 Analytical solution

Lecture 127 [Optional] Interpreting analytical solution

Lecture 128 Gradient descent conditions

Lecture 129 Beyond vanilla gradient descent

Lecture 130 Code time

Lecture 131 Reading the documentation

Lecture 132 10-minute coding exercise: Classify images of clothes

Lecture 133 Notes on Gradient Descent
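
Vanilla gradient descent, the subject of Section 9, fits in a few lines of NumPy. A sketch on synthetic least-squares data (illustrative only; the lectures cover the reasoning behind each step):

```python
# Vanilla gradient descent for least-squares regression (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
true_w = np.array([3.0, -1.0])
y = X @ true_w + rng.normal(scale=0.1, size=100)

w = np.zeros(2)  # parameter initialization
lr = 0.1         # learning rate (step size)

for step in range(200):
    grad = 2 / len(y) * X.T @ (X @ w - y)  # analytical gradient of MSE
    w -= lr * grad                         # parameter update

print("learned weights:", w, "(true:", true_w, ")")
```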

Section 10: Classification metrics and class imbalance

Lecture 134 Binary classification and class imbalance

Lecture 135 Assessing performance

Lecture 136 Accuracy

Lecture 137 Accuracy with different class importance

Lecture 138 Precision and Recall

Lecture 139 Sensitivity and Specificity

Lecture 140 F-measure and other combined metrics

Lecture 141 ROC curve

Lecture 142 Area under the ROC curve

Lecture 143 Custom metric (important stuff!)

Lecture 144 Other custom metrics

Lecture 145 Bad data science process :(

Lecture 146 Data rebalancing (avoid doing this!)

Lecture 147 Stratified split

Lecture 148 Notes on Classification Metrics
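
To illustrate why Section 10 warns about accuracy under class imbalance, here is a sketch on synthetic data with a 9:1 imbalance, computing the metrics the section covers (illustrative only):

```python
# Classification metrics on an imbalanced problem (illustrative only).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split

# Synthetic data with a 9:1 class imbalance.
X, y = make_classification(n_samples=2000, weights=[0.9], random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, stratify=y,  # stratified split
                                            random_state=0)

clf = LogisticRegression().fit(X_tr, y_tr)
pred = clf.predict(X_val)
proba = clf.predict_proba(X_val)[:, 1]

print("accuracy :", accuracy_score(y_val, pred))  # misleading when imbalanced
print("precision:", precision_score(y_val, pred))
print("recall   :", recall_score(y_val, pred))
print("F1       :", f1_score(y_val, pred))
print("ROC AUC  :", roc_auc_score(y_val, proba))
```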

Section 11: Neural Networks

Lecture 149 The inverted MNIST dataset

Lecture 150 The problem with linear models

Lecture 151 Neurons

Lecture 152 Multi-layer perceptron (MLP) for binary classification

Lecture 153 MLP for regression

Lecture 154 MLP for multi-class classification

Lecture 155 Hidden layers

Lecture 156 Activation functions

Lecture 157 Decision boundary

Lecture 158 Loss function

Lecture 159 Intro to neural network training

Lecture 160 Parameter initialization

Lecture 161 Saturation

Lecture 162 Non-convexity

Lecture 163 Stochastic gradient descent (SGD)

Lecture 164 More on SGD

Lecture 165 Code time!

Lecture 166 Backpropagation

Lecture 167 The problem with MLPs

Lecture 168 Deep learning

Lecture 169 Notes on Neural Networks

Lecture 170 20-minute coding task
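
As a small taste of Section 11, here is a multi-layer perceptron in scikit-learn (an assumption for illustration; the course may use a different library for its neural-network code):

```python
# Multi-layer perceptron sketch (illustrative only).
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_digits(return_X_y=True)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

# One hidden layer of 64 ReLU units; scaling helps avoid saturation.
mlp = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(64,), activation="relu",
                  max_iter=500, random_state=0),
)
mlp.fit(X_tr, y_tr)
print("validation accuracy:", mlp.score(X_val, y_val))
```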

Section 12: Tree-Based Models

Lecture 171 Decision trees

Lecture 172 Building decision trees

Lecture 173 Stopping tree growth

Lecture 174 Pros and cons of decision trees

Lecture 175 Decision trees for classification

Lecture 176 Decision boundary

Lecture 177 Bagging

Lecture 178 Random forests

Lecture 179 Gradient-boosted trees for regression

Lecture 180 Gradient-boosted trees for classification [optional]

Lecture 181 How to use gradient-boosted trees

Lecture 182 20-minute coding exercise (important!)
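
Section 12's three model families are easy to compare side by side. An illustrative sketch on scikit-learn's built-in breast-cancer data (not the course's exercise):

```python
# Decision tree, random forest, gradient boosting (illustrative only).
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

for model in [DecisionTreeClassifier(max_depth=4, random_state=0),
              RandomForestClassifier(n_estimators=200, random_state=0),
              GradientBoostingClassifier(random_state=0)]:
    model.fit(X_tr, y_tr)  # tree models need no feature scaling
    print(type(model).__name__, "validation accuracy:",
          round(model.score(X_val, y_val), 3))
```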

Section 13: K-nn and SVM

Lecture 183 Nearest neighbor classification

Lecture 184 K nearest neighbors

Lecture 185 Disadvantages of k-NN

Lecture 186 Recommendation systems (collaborative filtering)

Lecture 187 Introduction to Support Vector Machines (SVMs)

Lecture 188 Maximum margin

Lecture 189 Soft margin

Lecture 190 SVM vs Logistic Model (support vectors)

Lecture 191 Alternative SVM formulation

Lecture 192 Dot product

Lecture 193 Non-linearly separable data

Lecture 194 Kernel trick (polynomial)

Lecture 195 RBF kernel

Lecture 196 SVM remarks
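
A quick illustrative sketch of Section 13's two methods, k-NN and an RBF-kernel SVM, on the same built-in dataset (not the course's code; both are distance-based, so scaling matters here):

```python
# k-nearest neighbors vs an RBF-kernel SVM (illustrative only).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))

for name, model in [("k-NN", knn), ("RBF SVM", svm)]:
    model.fit(X_tr, y_tr)
    print(name, "validation accuracy:", round(model.score(X_val, y_val), 3))
```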

Section 14: Unsupervised Learning

Lecture 197 Intro to unsupervised learning

Lecture 198 Clustering

Lecture 199 K-means clustering

Lecture 200 K-means application example

Lecture 201 Elbow method

Lecture 202 Clustering remarks

Lecture 203 Intro to dimensionality reduction

Lecture 204 PCA (principal component analysis)

Lecture 205 PCA remarks

Lecture 206 Code time (PCA)
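
Section 14's two main tools, k-means and PCA, fit in a few lines. An illustrative sketch on the digits data, with labels deliberately ignored (not the course's application example):

```python
# K-means clustering and PCA on the same data (illustrative only).
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, _ = load_digits(return_X_y=True)  # labels are ignored: unsupervised

# Cluster the images into 10 groups. K is chosen here by prior knowledge;
# the elbow method (Lecture 201) is the data-driven alternative.
km = KMeans(n_clusters=10, n_init=10, random_state=0).fit(X)
print("cluster sizes:", sorted((km.labels_ == k).sum() for k in range(10)))

# Reduce 64 pixel features to 2 principal components.
X_2d = PCA(n_components=2).fit_transform(X)
print("shape before/after PCA:", X.shape, "->", X_2d.shape)
```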

Section 15: Feature Engineering

Lecture 207 Missing data

Lecture 208 Imputation

Lecture 209 Imputer within pipeline

Lecture 210 One-Hot encoding

Lecture 211 Ordinal encoding

Lecture 212 How to combine pipelines

Lecture 213 Code sample

Lecture 214 Feature Engineering

Lecture 215 Features for Natural Language Processing (NLP)

Lecture 216 Anatomy of a Data Science Project

Lecture 217 Next steps!

Lecture 218 Final Project: Predict Titanic survivors
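
In the spirit of Section 15 and the Titanic final project, here is a hedged sketch that combines imputation and one-hot encoding in a single pipeline. The tiny DataFrame and its column names are hypothetical stand-ins, not the actual Titanic data:

```python
# Imputation + one-hot encoding in one pipeline (illustrative only;
# the dataset and column names below are hypothetical).
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

df = pd.DataFrame({                              # tiny stand-in dataset
    "age": [22.0, np.nan, 38.0, 54.0],           # numeric, one value missing
    "sex": ["male", "female", "female", np.nan], # categorical, one missing
    "survived": [0, 1, 1, 0],
})

preprocess = ColumnTransformer([
    ("num", SimpleImputer(strategy="median"), ["age"]),
    ("cat", Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]), ["sex"]),
])

model = Pipeline([("prep", preprocess), ("clf", LogisticRegression())])
model.fit(df[["age", "sex"]], df["survived"])
print("predictions:", model.predict(df[["age", "sex"]]))
```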

Who this course is for:
Aspiring data scientists who want to get their first job in the field.
Software engineers who want to be involved in data science and machine learning.
Researchers who want to make the move from academia to industry.