Data Science with R: A Step By Step Guide With Visual Illustrations & Examples by Andrew Oleksy
English | November 16, 2018 | ISBN: N/A | ASIN: B07KNBFGFZ | 276 pages | AZW3 | 4.46 MB
A Step By Step Guide with Visual Illustrations and Examples
The Data Science field is expected to continue growing rapidly over the next several years, and Data Scientist is consistently rated as a top career. Data Science with R gives you the necessary theoretical background to start your Data Science journey and shows you how to apply the R programming language, through practical examples, to extract valuable knowledge from data. Professor Andrew Oleksy guides you through the important concepts of data science, including the R programming language, Data Mining, Clustering, Classification and Prediction, the Hadoop framework and more.
- Introduction to Data Mining
- Data Science
- Knowledge Discovery in Databases (KDD)
- Model Types
- Examples and Counterexamples
- Classification of Data Mining methods
- Applications
- Challenges
- The R Programming Language
- Basic Concepts, Definitions and Notations
- Tool Installation
- Introduction to R
- Data Types
- Basic Tasks
- Control Structures
- Functions
- Scoping Rules
- Iterated Functions
- Help from the console and Package Installation
- Types, Quality and Data Preprocessing
- Categories and Types of Variables
- Preprocessing processes
- dplyr and tidyr packages
- Summary Statistics and Visualization
- Measures of Position
- Measures of Dispersion
- Visualization of Qualitative Data
- Visualization of Quantitative Data
- Classification and Prediction
- Classification
- Prediction
- Overfitting and Regularization
- Clustering
- Unsupervised Learning
- Concept of Cluster
- K-means algorithm
- Hierarchical Clustering Algorithms
- DBSCAN Algorithm
- Mining of Frequent Itemsets and Association Rules
- Introduction
- Theoretical Background
- Apriori Algorithm
- Frequent Itemsets Types
- Positive and Negative Border of Frequent Itemsets
- Association Rules Mining
- Alternative Methods for Large Itemsets generation
- FP-Growth Algorithm
- arules Package
- Computational Methods for Big Data Analysis (Hadoop and MapReduce)
- Introduction
- Advantages of Hadoop's Distributed File System
- Hadoop Users
- Hadoop Architecture
- The Hadoop Cluster Architecture
- Hadoop Java API
- List Loops & Generic Classes and Methods