R Data Pre-Processing & Data Management - Shape Your Data!

Posted By: Sigha

MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 KHz
Language: English (US) | Size: 1.93 GB | Duration: 6h 26m

Learn how to prepare your data for great analytics in R.

What you'll learn
import data into R in several ways while also being able to identify a suitable import tool
select and implement a proper object class (data.frame, data.table, data_frame)
convert your data into (and understand) a tidy data format
filter and query your data based on a wide range of parameters
join two data tables with dplyr's two-table verb syntax
use SQL code within R
translate basic R into SQL
work with dates and time
work with strings using regular expressions
detect outliers in datasets

Requirements
Computer with R and RStudio ready to use
You should have basic R / RStudio knowledge
Required add-on packages will be listed in the course orientation video

Description
Let’s get your data in shape!

Data Pre-Processing is the very first step in data analytics. You
cannot escape it; it is too important. Unfortunately, this topic is
widely overlooked and information is hard to find.


With this course I will change this!


Data Pre-Processing as taught in this course has the following steps:


1. Data Import: this might sound trivial, but if you consider all the
different data formats out there, you can imagine that this can be
confusing. In the course we will take a look at a standard way of
importing csv files, we will learn about the very fast fread function,
and I will show you what you can do if you have more exotic file
formats to handle.


2. Selecting the object class: a standard data.frame might be fine for
easy standard tasks, but there are more advanced classes out there,
like the data.table. Especially with today's huge datasets, a
data.frame might not be enough anymore. Alternatives will be
demonstrated in this course.


3. Getting your data in a tidy form: a tidy dataset has one row for
each observation and one column for each variable. This might sound
trivial, but in your daily work you will find instances where this
simple rule is not followed. Often you will not even notice that the
dataset is not tidy in its layout. We will learn how tidyr can help you
get your data into a clean and tidy format.


4. Querying and filtering: when you have a huge dataset, you need to
filter for the desired parameters. We will learn how to combine
parameters and implement advanced filtering methods. data.table in
particular has proven effective for this sort of querying on huge
datasets, so we will focus on this package in the querying section.


5. Data joins: when your data is spread over two different tables and
you want to join them based on given criteria, you will need joins.
There are several ways to join data in R, but here we will take a look
at dplyr and its two-table verbs, which are a great tool for working
with two tables at the same time.


6. Integrating and interacting with SQL: R is great at interacting
with SQL, and SQL is of course the leading database language, which
you will have to learn sooner or later as a data scientist. I will show
you how to use SQL code within R, and there is even an R to SQL
translator for standard R code. We will also set up a SQLite database
from within R.


7. Outlier detection: datasets often contain values outside a plausible
range. Faulty data generation or entry happens regularly. Statistical
methods of outlier detection help to identify these values. We will
take a look at the implementation of these.


8. Character strings as well as dates and times have their own rules
when it comes to pre-processing. In this course we will also take a
look at these types of data and how to handle them effectively in R.

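
To give a feel for how some of these steps fit together, here is a minimal sketch in R. It is not taken from the course scripts: the tiny dataset, its column names (id, site, y2022, y2023) and the sites lookup table are invented for illustration, and the 1.5 * IQR rule is only one of several possible outlier criteria. It assumes the data.table, tidyr and dplyr packages are installed.

```r
library(data.table)  # fread() and data.table filtering
library(tidyr)       # pivot_longer() for reshaping into tidy form
library(dplyr)       # two-table verbs such as left_join()

# Import: fread() can also parse literal text, used here in place of a csv file.
# data.table = FALSE returns a plain data.frame.
wide <- fread(text = "id,site,y2022,y2023
1,A,10,12
2,A,11,95
3,B,9,10
4,B,10,11", data.table = FALSE)

# Tidy form: one row per (id, year) observation, one column per variable.
long <- pivot_longer(wide, cols = c("y2022", "y2023"),
                     names_to = "year", names_prefix = "y",
                     values_to = "value")
long$year <- as.integer(long$year)

# Object class and querying: convert to a data.table and filter by condition.
dt <- as.data.table(long)
site_a_high <- dt[site == "A" & value > 10]

# Join: attach a second (hypothetical) lookup table via its key column.
sites <- data.frame(site = c("A", "B"), region = c("north", "south"))
joined <- left_join(long, sites, by = "site")

# Outlier detection with the 1.5 * IQR rule (an assumption, one choice among many).
q <- unname(quantile(long$value, c(0.25, 0.75)))
fence <- 1.5 * (q[2] - q[1])
outliers <- long[long$value < q[1] - fence | long$value > q[2] + fence, ]
print(outliers)
```

On this toy data the only value flagged as an outlier is the 95 recorded for id 2 in 2023. With real data you would read from a file path instead of inline text and choose the outlier rule to suit the distribution.
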
How do you best prepare yourself for this course?


You only need a basic knowledge of R to fully benefit from this
course. Once you know the basics of RStudio and R you are ready to
follow along with the course material. Of course you will also get the R
scripts, which make it even easier.


The screencasts are made in RStudio, so you should install it in
addition to R. The required add-on packages are listed in the course.


Again, if you want to make sure that you have proper data with a tidy
format, take a look at this course. It will make your analytics with R
much easier!



Who this course is for:
Data pre-processing is a crucial step of data-related work, so this course is intended for all R users

