Data Scientist

Course Description

Accelerate your career in Data Science with the exclusive Data Scientist Master’s program in collaboration with IBM. Experience world-class Data Science training by an industry leader on the most in-demand Data Science and Machine Learning skills. Gain hands-on exposure to key technologies including R, Python, Tableau, Hadoop, and Spark. Become an expert Data Scientist today.

About the Program

IBM is the second-largest Predictive Analytics and Machine Learning solutions provider globally (source: The Forrester Wave report, September 2018). A partnership between Simplilearn and IBM introduces students to integrated blended learning, making them experts in Artificial Intelligence and Data Science. The Data Science course in collaboration with IBM will make students industry-ready for Artificial Intelligence and Data Science job roles.

IBM is a leading cognitive solutions and cloud platform company, headquartered in Armonk, New York, offering a wide range of technology and consulting services. Each year, IBM invests $6 billion in research and development, and it has earned five Nobel Prizes, nine US National Medals of Technology, five US National Medals of Science, six Turing Awards, and 10 inductions into the US Inventors Hall of Fame.

What can I expect from the Data Science courses developed in collaboration with IBM?

Upon completion of this Data Scientist online Master’s program, you will receive certificates from IBM and Simplilearn for the Data Science courses on the learning path*. These certificates will testify to your skills as an expert in Data Science. You will also receive the following:

USD 1200 worth of IBM cloud credits that you can leverage for hands-on exposure
Access to IBM cloud platforms featuring IBM Watson and other software for 24/7 practice
Industry-recognized Data Scientist Master’s certificate from Simplilearn

What are the learning objectives?

Data Scientist is one of the hottest professions. IBM predicts the demand for Data Scientists will rise by 28% by 2020. Simplilearn’s Data Scientist Master’s program, co-developed with IBM, helps you master skills including statistics, hypothesis testing, data mining, clustering, decision trees, linear and logistic regression, data wrangling, data visualization, Hadoop, Spark, PROC SQL, SAS Macros, recommendation engines, supervised and unsupervised learning, and more.

This Data Scientist Master’s program covers extensive Data Science training, combining online instructor-led classes and self-paced learning co-developed with IBM. The program concludes with a capstone project designed to reinforce the learning by building a real industry product encompassing all the key aspects learned throughout the program. The skills focused on in this program will help prepare you for the role of a Data Scientist.

Why be a Data Scientist?

A Data Scientist is the top-ranking professional in any analytics organization. Glassdoor ranks Data Scientists first in the 25 Best Jobs for 2019. In today’s market, Data Scientists are scarce and in demand. As a Data Scientist, you are required to understand the business problem, design a data analysis strategy, collect and format the required data, apply algorithms or techniques using the correct tools, and make recommendations backed by data.

What important Data Science skills will you learn with this Master’s program?

Gain an in-depth understanding of data structure and data manipulation
Understand and use linear and non-linear regression models and classification techniques for data analysis
Obtain an in-depth understanding of supervised and unsupervised learning models such as linear regression, logistic regression, clustering, dimensionality reduction, K-NN, and pipeline
Perform scientific and technical computing using the SciPy package and its sub-packages such as Integrate, Optimize, Statistics, IO, and Weave
Gain expertise in mathematical computing using the NumPy and Scikit-Learn packages
Understand the different components of the Hadoop ecosystem
Learn to work with HBase, its architecture, and its data storage, understand the difference between HBase and an RDBMS, and use Hive and Impala for partitioning
Understand MapReduce and its characteristics, plus learn how to ingest data using Sqoop and Flume
Master the concepts of recommendation engine and time series modeling and gain practical mastery over principles, algorithms, and applications of machine learning
Learn to analyze data using Tableau and become proficient in building interactive dashboards

What Data Science projects are included in this Master’s program?

This Data Scientist Master’s program includes 15+ real-life, industry-based projects on different domains to help you master concepts of Data Science and Big Data. A few of the projects that you will be working on are mentioned below:

Capstone Project:

Description: You will go through dedicated mentor classes in order to create a high-quality industry project, solving a real-world problem by leveraging the skills and technologies learned throughout the program. The capstone project will cover all the key aspects, from data extraction, cleaning, and visualization to model building and tuning. You also get the option of choosing the domain/industry dataset you want to work on from the options available.
After successful submission of the project, you will be awarded a capstone certificate that can be showcased to potential employers as a testament to your learning.

Project 1: Products rating prediction for Amazon

Domain: E-commerce

Description: Amazon, one of the leading US-based e-commerce companies, recommends products within the same category to customers based on their activity and reviews on other similar products. Amazon would like to improve this recommendation engine by predicting ratings for the non-rated products and adding them to recommendations accordingly.

Project 2: Improving customer experience for Comcast

Domain: Telecom

Description: Comcast, one of the leading US-based global telecommunication companies, wants to improve customer experience by identifying and acting on any problem areas that lower customer satisfaction. The company is also looking for key recommendations that can be implemented to deliver the best customer experience.

Project 3: Attrition Analysis for IBM

Domain: Workforce Analytics

Description: IBM, one of the leading US-based IT companies, would like to identify the factors that influence the attrition of employees. Based on the parameters identified, the company would also like to build a logistic regression model that can help predict whether an employee will churn or not.
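
As a rough illustration of the kind of model this project describes, the sketch below fits a logistic regression by plain gradient descent using only the Python standard library. The two features (an overtime flag and scaled years at the company), the eight records, and the function names `fit_logistic` and `predict_proba` are all hypothetical and invented for this example; they are not taken from the project dataset, and the course itself may use a library implementation instead.

```python
import math

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this record drives the gradient.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(w, b, x):
    """Estimated probability that the employee described by x leaves."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical, hand-made records: [overtime (0/1), years_at_company / 10]
X = [[1, 0.1], [1, 0.2], [1, 0.3], [0, 0.5],
     [0, 0.8], [0, 1.0], [1, 0.1], [0, 0.6]]
y = [1, 1, 1, 0, 0, 0, 1, 0]  # 1 = employee left

w, b = fit_logistic(X, y)
p_leaver = predict_proba(w, b, [1, 0.1])  # overtime, short tenure
p_stayer = predict_proba(w, b, [0, 0.9])  # no overtime, long tenure
```

The sign of each fitted weight gives a first hint about which factors push attrition up or down, which is the "identify the factors" half of the project brief.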

Project 4: Predict accurate sales for 45 stores of Walmart, one of the leading US-based retail chains, considering the impact of promotional markdown events. Check if macroeconomic factors like CPI, unemployment rate, etc. have an impact on sales.

Domain: Retail

Description: Walmart runs several promotional markdown events throughout the year. The markdowns precede prominent holidays, such as the Super Bowl, Labour Day, Thanksgiving, and Christmas. The weeks including these holidays are weighted five times higher in the evaluation than non-holiday weeks. The business is facing a challenge due to unforeseen demand, with stock running out at times because of inaccurate demand estimation. Macroeconomic factors like CPI, Unemployment Index, etc. also play an important role in predicting demand, but the business hasn’t been able to leverage these factors yet. As a part of this project, create a model to highlight the effects of markdowns on holiday weeks.
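
One way to read the five-times weighting of holiday weeks is as a weighted mean absolute error for scoring forecasts. The sketch below assumes that interpretation; the function name `weighted_mae`, the `holiday_weight` parameter, and the sales figures are hypothetical and are not real Walmart data.

```python
def weighted_mae(actual, predicted, is_holiday, holiday_weight=5.0):
    """Mean absolute error where holiday weeks count holiday_weight times."""
    if not (len(actual) == len(predicted) == len(is_holiday)):
        raise ValueError("inputs must have the same length")
    weights = [holiday_weight if h else 1.0 for h in is_holiday]
    total_error = sum(
        w * abs(a - p) for w, a, p in zip(weights, actual, predicted)
    )
    return total_error / sum(weights)

# Hypothetical weekly sales for one store (invented for illustration).
actual = [100.0, 120.0, 200.0, 110.0]
predicted = [90.0, 125.0, 170.0, 110.0]
is_holiday = [False, False, True, False]

score = weighted_mae(actual, predicted, is_holiday)
print(score)  # 20.625, versus an unweighted MAE of 11.25
```

Because the single holiday week carries five of the eight weight units, its 30-unit miss dominates the score, which is why a model tuned for this metric has to get holiday demand right.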

Project 5: Learn how Healthcare industry leaders make use of Data Science to strengthen their business.

Domain: HealthCare

Description: Predictive analytics can be used in healthcare to mediate hospital readmissions. In healthcare and other industries, predictors are most useful when they can be brought into action. However, historical and real-time data alone are worthless without intervention. More importantly, to judge the efficiency and value of forecasting a trend and ultimately changing behavior, both the predictor and the intervention must be integrated back into the same system and workflow where the trend originally occurred.

Project 6: Understand how Insurance leaders like Berkshire Hathaway, AIG, AXA, etc. make use of Data Science by working on a real-life project based on Insurance.

Domain: Insurance

Description: The use of predictive analytics has increased greatly in insurance businesses, especially for the biggest companies, according to the 2013 Insurance Predictive Modeling Survey. While the survey showed an increase in predictive modeling throughout the industry, all the respondents from companies that write over $1 billion in personal insurance employ predictive modeling, compared to 69% of companies with less than that amount of premium.

Project 7: See how banks like Citigroup, Bank of America, ICICI, HDFC, etc. make use of Data Science to stay ahead of the competition.

Domain: Banking

Description: A Portuguese banking institution ran a marketing campaign to convince potential customers to invest in a bank term deposit. Its marketing campaigns were conducted through phone calls, and sometimes the same customer was contacted more than once. Your job is to analyze the data collected from the marketing campaign.

Project 8: Learn how Stock Markets, such as NASDAQ, NSE, and BSE, leverage Data Science and Analytics to arrive at consumable data from complex datasets.

Domain: Stock Market

Description: Import data for the following companies using the Yahoo data reader: Yahoo, Apple, Amazon, Microsoft, and Google. Perform fundamental analytics, including plotting the closing price, plotting stock trade by volume, performing daily return analysis, and using a pair plot to show the correlation between all the stocks.
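
The daily return and correlation steps can be sketched without any data download. The example below uses only the Python standard library and a handful of made-up closing prices; the tickers are just labels, the numbers are invented, and the helper names `daily_returns` and `pearson` are hypothetical. In the actual project the prices would come from the data reader, and the plots would be produced with a plotting library.

```python
import math

def daily_returns(closes):
    """Fractional change from each closing price to the next."""
    return [(today - prev) / prev for prev, today in zip(closes, closes[1:])]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical closing prices, invented for illustration only.
closes = {
    "AAPL": [100.0, 102.0, 101.0, 105.0, 107.0],
    "MSFT": [200.0, 203.0, 202.0, 208.0, 211.0],
    "AMZN": [50.0, 49.0, 51.0, 50.0, 52.0],
}

returns = {ticker: daily_returns(prices) for ticker, prices in closes.items()}

# Correlation of daily returns for every pair of tickers.
tickers = sorted(returns)
correlation = {
    (a, b): pearson(returns[a], returns[b])
    for i, a in enumerate(tickers)
    for b in tickers[i + 1:]
}

for pair, value in correlation.items():
    print(pair, round(value, 3))
```

Correlating returns rather than raw prices is the usual choice here, since prices that merely trend in the same direction can look correlated even when their day-to-day moves are unrelated.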

Project 9: See how Data Science is used in the field of engineering by taking up this case study of MovieLens Dataset Analysis.

Domain: Engineering

Description: The GroupLens Research Project is a research group in the Department of Computer Science and Engineering at the University of Minnesota. The researchers of this group are involved in many research projects related to the fields of information filtering, collaborative filtering, and recommender systems.

Project 10: Understand how leading retail companies like Walmart, Amazon, Target, etc. make use of Data Science to analyze and optimize their product placements and inventory.

Domain: Retail

Description: Analytics is used in optimizing product placements on shelves or optimization of inventory to be kept in the warehouses using industry examples. Through this project, participants learn the daily cycle of product optimization from the shelves to the warehouse. This gives them insights into regular occurrences in the retail sector.

Who should take this Data Scientist Master’s program?

The Data Science role requires an amalgam of experience, data science knowledge, and correct tools and technologies. It is a solid career choice for both new and experienced professionals. Aspiring professionals of any educational background with an analytical frame of mind are most suited to pursue the Data Science course, including:

IT Professionals
Analytics Managers
Business Analysts
Banking and Finance Professionals
Marketing Managers
Supply Chain Network Managers
Beginners or Recent Graduates with a Bachelor’s or Master’s Degree



Course Syllabus

Course 1 Online Classroom Flexi Pass

Data Science Certification Training – R Programming

Simplilearn’s Data Science certification with R Programming training makes you an expert in data analytics using the R programming language. This course enables you to take your Data Science skills into a variety of companies, helping them analyze data and make more informed business decisions.

Data Science with R

Lesson 00 – Course Introduction01:31

Course Introduction01:31
Lesson 01 – Introduction to Business Analytics21:06
1.001 Overview00:44
1.002 Business Decisions and Analytics04:33
1.003 Types of Business Analytics03:53
1.004 Applications of Business Analytics08:57
1.005 Data Science Overview01:29
1.006 Conclusion01:30
Knowledge Check
Lesson 02 – Introduction to R Programming26:35
2.001 Overview00:31
2.002 Importance of R05:20
2.003 Data Types and Variables in R02:14
2.004 Operators in R04:39
2.005 Conditional Statements in R02:45
2.006 Loops in R05:07
2.007 R script01:44
2.008 Functions in R02:58
2.009 Conclusion01:17
Knowledge Check
Lesson 03 – Data Structures50:57
3.001 Overview01:04
3.002 Identifying Data Structures13:14
3.003 Demo Identifying Data Structures14:05
3.004 Assigning Values to Data Structures04:51
3.005 Data Manipulation09:23
3.006 Demo Assigning values and applying functions07:46
3.007 Conclusion00:34
Knowledge Check
Lesson 04 – Data Visualization29:40
4.001 Overview00:29
4.002 Introduction to Data Visualization03:03
4.003 Data Visualization using Graphics in R18:50
4.004 ggplot2 05:14
4.005 File Formats of Graphic Outputs01:08
4.006 Conclusion00:56
Knowledge Check
Lesson 05 – Statistics for Data Science-I14:10
5.001 Overview00:21
5.002 Introduction to Hypothesis02:06
5.003 Types of Hypothesis03:13
5.004 Data Sampling02:48
5.005 Confidence and Significance Levels04:33
5.006 Conclusion01:09
Knowledge Check
Lesson 06 – Statistics for Data Science-II29:55
6.001 Overview00:28
6.002 Hypothesis Test00:47
6.003 Parametric Test14:36
6.004 Non-Parametric Test08:31
6.005 Hypothesis Tests about Population Means02:09
6.006 Hypothesis Tests about Population Variance00:45
6.007 Hypothesis Tests about Population Proportions01:11
6.008 Conclusion01:28
Knowledge Check
Lesson 07 – Regression Analysis45:04
7.001 Overview00:26
7.002 Introduction to Regression Analysis01:11
7.003 Types of Regression Analysis Models01:38
7.004 Linear Regression08:59
7.005 Demo Simple Linear Regression07:29
7.006 Non-Linear Regression03:49
7.007 Demo Regression Analysis with Multiple Variables13:29
7.008 Cross Validation01:48
7.009 Non-Linear to Linear Models02:06
7.010 Principal Component Analysis02:45
7.011 Factor Analysis00:26
7.012 Conclusion00:58
Knowledge Check
Lesson 08 – Classification01:05:14
8.001 Overview00:31
8.002 Classification and Its Types04:24
8.003 Logistic Regression03:35
8.004 Support Vector Machines04:26
8.005 Demo Support Vector Machines11:13
8.006 K-Nearest Neighbours02:34
8.007 Naive Bayes Classifier02:53
8.008 Demo Naive Bayes Classifier06:15
8.009 Decision Tree Classification09:47
8.010 Demo Decision Tree Classification06:25
8.011 Random Forest Classification02:01
8.012 Evaluating Classifier Models06:04
8.013 Demo K-Fold Cross Validation04:09
8.014 Conclusion00:57
Knowledge Check
Lesson 09 – Clustering28:10
9.001 Overview00:17
9.002 Introduction to Clustering02:57
9.003 Clustering Methods07:47
9.004 Demo K-means Clustering11:15
9.005 Demo Hierarchical Clustering05:02
9.006 Conclusion00:52
Knowledge Check
Lesson 10 – Association23:13
10.001 Overview00:15
10.002 Association Rule06:20
10.003 Apriori Algorithm05:19
10.004 Demo Apriori Algorithm10:37
10.005 Conclusion00:42
Knowledge Check

Free Course

Data Science in Real life

Lesson 1 – Course Objective
Learning Objectives
Lesson 2 – Defining Data Science12:46
Learning Objectives
1.1 What is data science02:37
1.2 There are many paths to data science03:55
1.3 Any advice for new data scientist02:59
1.4 What is the cloud03:15
Lesson 3 – What do Data Science People do11:24
Learning Objectives
2.1 A day in the life of a data science person03:53
2.2 R versus Python01:51
2.3 Data science tools and technology05:40
Lesson 4 – Data Science in Business10:40
Learning Objectives
3.1 How should companies get started in data science03:00
3.2 Recruiting for data science07:40
Lesson 5 – Use Cases for Data Science06:28
Learning Objectives
4.1 Applications of data science06:28
Lesson 6 – Data Science People01:05
Learning Objectives
5.1 Things data science people say01:05
Unlocking IBM Certificate

Course 3 Online Classroom Flexi Pass

Machine Learning
Explore the concepts of Machine Learning and understand how it’s transforming the digital world. Machine Learning is an exciting branch of Artificial Intelligence, and this online certification course provides the skills you need to become a Machine Learning Engineer and unlock the power of this emerging field.

Machine Learning
Lesson 01 Course Introduction06:41
Course Introduction05:31
Accessing Practice Lab01:10
Lesson 02 Introduction to AI and Machine Learning19:36
2.1 Learning Objectives00:43
2.2 Emergence of Artificial Intelligence01:56
2.3 Artificial Intelligence in Practice01:48
2.4 Sci-Fi Movies with the Concept of AI00:22
2.5 Recommender Systems00:45
2.6 Relationship between Artificial Intelligence, Machine Learning, and Data Science: Part A02:47
2.7 Relationship between Artificial Intelligence, Machine Learning, and Data Science: Part B01:23
2.8 Definition and Features of Machine Learning01:30
2.9 Machine Learning Approaches01:48
2.10 Machine Learning Techniques02:21
2.11 Applications of Machine Learning: Part A01:34
2.12 Applications of Machine Learning: Part B02:11
2.13 Key Takeaways00:28
Knowledge Check
Lesson 03 Data Preprocessing35:57
3.1 Learning Objectives00:38
3.2 Data Exploration Loading Files: Part A02:52
3.2 Data Exploration Loading Files: Part B01:34
3.3 Demo: Importing and Storing Data01:27
Practice: Automobile Data Exploration – A
3.4 Data Exploration Techniques: Part A02:56
3.5 Data Exploration Techniques: Part B02:47
3.6 Seaborn02:18
3.7 Demo: Correlation Analysis02:38
Practice: Automobile Data Exploration – B
3.8 Data Wrangling01:27
3.9 Missing Values in a Dataset01:55
3.10 Outlier Values in a Dataset01:49
3.11 Demo: Outlier and Missing Value Treatment04:18
Practice: Data Exploration – C
3.12 Data Manipulation00:47
3.13 Functionalities of Data Object in Python: Part A01:49
3.14 Functionalities of Data Object in Python: Part B01:33
3.15 Different Types of Joins01:32
3.16 Typecasting01:23
3.17 Demo: Labor Hours Comparison01:54
Practice: Data Manipulation
3.18 Key Takeaways00:20
Knowledge Check
Storing Test Results
Lesson 04 Supervised Learning01:21:04
4.1 Learning Objectives00:31
4.2 Supervised Learning02:17
4.3 Supervised Learning- Real-Life Scenario00:53
4.4 Understanding the Algorithm00:52
4.5 Supervised Learning Flow01:50
4.6 Types of Supervised Learning: Part A01:54
4.7 Types of Supervised Learning: Part B02:03
4.8 Types of Classification Algorithms01:01
4.9 Types of Regression Algorithms: Part A03:20
4.10 Regression Use Case00:34
4.11 Accuracy Metrics01:23
4.12 Cost Function01:48
4.13 Evaluating Coefficients00:53
4.14 Demo: Linear Regression13:47
Practice: Boston Homes – A
4.15 Challenges in Prediction01:45
4.16 Types of Regression Algorithms: Part B02:40
4.17 Demo: Bigmart21:55
Practice: Boston Homes – B
4.18 Logistic Regression: Part A01:58
4.19 Logistic Regression: Part B01:38
4.20 Sigmoid Probability02:05
4.21 Accuracy Matrix01:36
4.22 Demo: Survival of Titanic Passengers14:07
Practice: Iris Species
4.23 Key Takeaways00:14
Knowledge Check
Health Insurance Cost
Lesson 05 Feature Engineering27:52
5.1 Learning Objectives00:27
5.2 Feature Selection01:28
5.3 Regression00:53
5.4 Factor Analysis01:57
5.5 Factor Analysis Process01:05
5.6 Principal Component Analysis (PCA)02:31
5.7 First Principal Component02:43
5.8 Eigenvalues and PCA02:32
5.9 Demo: Feature Reduction05:47
Practice: PCA Transformation
5.10 Linear Discriminant Analysis02:27
5.11 Maximum Separable Line00:44
5.12 Find Maximum Separable Line03:12
5.13 Demo: Labeled Feature Reduction01:53
Practice: LDA Transformation
5.14 Key Takeaways00:13
Knowledge Check
Simplifying Cancer Treatment
Lesson 06 Supervised Learning Classification55:43
6.1 Learning Objectives00:34
6.2 Overview of Classification02:05
Classification: A Supervised Learning Algorithm00:52
6.4 Use Cases of Classification02:37
6.5 Classification Algorithms00:16
6.6 Decision Tree Classifier02:17
6.7 Decision Tree Examples01:45
6.8 Decision Tree Formation00:47
6.9 Choosing the Classifier02:55
6.10 Overfitting of Decision Trees01:00
6.11 Random Forest Classifier- Bagging and Bootstrapping02:22
6.12 Decision Tree and Random Forest Classifier01:06
Performance Measures: Confusion Matrix02:21
Performance Measures: Cost Matrix02:06
6.15 Demo: Horse Survival08:30
Practice: Loan Risk Analysis
6.16 Naive Bayes Classifier01:28
6.17 Steps to Calculate Posterior Probability: Part A01:44
6.18 Steps to Calculate Posterior Probability: Part B02:21
6.19 Support Vector Machines : Linear Separability01:05
6.20 Support Vector Machines : Classification Margin02:05
6.21 Linear SVM : Mathematical Representation02:04
6.22 Non-linear SVMs01:06
6.23 The Kernel Trick01:19
6.24 Demo: Voice Classification10:42
Practice: College Classification
6.25 Key Takeaways00:16
Knowledge Check
Classify Kinematic Data
Lesson 07 Unsupervised Learning28:26
7.1 Learning Objectives00:29
7.2 Overview01:48
7.3 Example and Applications of Unsupervised Learning02:17
7.4 Clustering01:49
7.5 Hierarchical Clustering02:28
7.6 Hierarchical Clustering Example02:01
7.7 Demo: Clustering Animals05:39
Practice: Customer Segmentation
7.8 K-means Clustering01:46
7.9 Optimal Number of Clusters01:24
7.10 Demo: Cluster Based Incentivization08:32
Practice: Image Segmentation
7.11 Key Takeaways00:13
Knowledge Check
Clustering Image Data
Lesson 08 Time Series Modeling37:44
8.1 Learning Objectives00:24
8.2 Overview of Time Series Modeling02:16
8.3 Time Series Pattern Types: Part A02:16
8.4 Time Series Pattern Types: Part B01:19
8.5 White Noise01:07
8.6 Stationarity02:13
8.7 Removal of Non-Stationarity02:13
8.8 Demo: Air Passengers – A14:33
Practice: Beer Production – A
8.9 Time Series Models: Part A02:14
8.10 Time Series Models: Part B01:28
8.11 Time Series Models: Part C01:51
8.12 Steps in Time Series Forecasting00:37
8.13 Demo: Air Passengers – B05:01
Practice: Beer Production – B
8.14 Key Takeaways00:12
Knowledge Check
IMF Commodity Price Forecast
Lesson 09 Ensemble Learning35:41
9.01 Ensemble Learning00:24
9.2 Overview02:41
9.3 Ensemble Learning Methods: Part A02:28
9.4 Ensemble Learning Methods: Part B02:37
9.5 Working of AdaBoost01:43
9.6 AdaBoost Algorithm and Flowchart02:28
9.7 Gradient Boosting02:36
9.8 XGBoost02:23
9.9 XGBoost Parameters: Part A03:15
9.10 XGBoost Parameters: Part B02:30
9.11 Demo: Pima Indians Diabetes04:14
Practice: Linearly Separable Species
9.12 Model Selection02:08
9.13 Common Splitting Strategies01:45
9.14 Demo: Cross Validation04:18
Practice: Model Selection
9.15 Key Takeaways00:11
Knowledge Check
Tuning Classifier Model with XGBoost
Lesson 10 Recommender Systems25:45
10.1 Learning Objectives00:28
10.2 Introduction02:17
10.3 Purposes of Recommender Systems00:45
10.4 Paradigms of Recommender Systems02:45
10.5 Collaborative Filtering: Part A02:14
10.6 Collaborative Filtering: Part B01:58
10.7 Association Rule Mining01:47
Association Rule Mining: Market Basket Analysis01:43
10.9 Association Rule Generation: Apriori Algorithm00:53
10.10 Apriori Algorithm Example: Part A02:11
10.11 Apriori Algorithm Example: Part B01:18
10.12 Apriori Algorithm: Rule Selection02:52
10.13 Demo: User-Movie Recommendation Model04:19
Practice: Movie-Movie recommendation
10.14 Key Takeaways00:15
Knowledge Check
Book Rental Recommendation
Lesson 11 Text Mining43:58
11.1 Learning Objectives00:22
11.2 Overview of Text Mining02:11
11.3 Significance of Text Mining01:26
11.4 Applications of Text Mining02:23
11.5 Natural Language ToolKit Library02:35
11.6 Text Extraction and Preprocessing: Tokenization00:33
11.7 Text Extraction and Preprocessing: N-grams00:55
11.8 Text Extraction and Preprocessing: Stop Word Removal01:24
11.9 Text Extraction and Preprocessing: Stemming00:44
11.10 Text Extraction and Preprocessing: Lemmatization00:35
11.11 Text Extraction and Preprocessing: POS Tagging01:17
11.12 Text Extraction and Preprocessing: Named Entity Recognition00:54
11.13 NLP Process Workflow00:53
11.14 Demo: Processing Brown Corpus10:05
Wiki Corpus
11.15 Structuring Sentences: Syntax01:54
11.16 Rendering Syntax Trees00:55
11.17 Structuring Sentences: Chunking and Chunk Parsing01:38
11.18 NP and VP Chunk and Parser01:39
11.19 Structuring Sentences: Chinking01:44
11.20 Context-Free Grammar (CFG)01:56
11.21 Demo: Structuring Sentences07:46
Practice: Airline Sentiment
11.22 Key Takeaways00:09
Knowledge Check
FIFA World Cup
Lesson 12 Project Highlights02:40
Project Highlights02:40
Uber Fare Prediction
Amazon – Employee Access
Practice Projects

Course 4 Online Classroom Flexi Pass

Tableau Training
This Tableau certification course helps you master Tableau Desktop, a data visualization, reporting, and business intelligence tool used worldwide. Advance your career in analytics by learning Tableau and how to best use this training in your work.

Tableau 10

Lesson 01 – Course Introduction05:04
1.01 Course Introduction05:04
Lesson 02 – Getting Started with Tableau10:22
2.01 Getting Started with Tableau00:29
2.02 Download and Install Tableau Public02:01
2.03 Load Data from Excel03:42
2.04 User Interface of Tableau Public03:52
2.05 Key Takeaways00:18
Knowledge Check
Lesson 03 – Core Topics in Tableau11:38
3.01 Core Topics in Tableau00:31
3.02 Dimension vs Measures02:42
3.03 Discrete vs. Continuous01:27
3.04 Application of Discrete and Continuous Fields04:05
3.05 Aggregation in Tableau02:33
3.06 Key Takeaways00:20
Knowledge Check
Lesson 04 – Creating Charts in Tableau33:11
4.01 Creating Charts in Tableau00:43
4.02 Bar Chart02:51
4.03 Stacked Bar Chart02:01
4.04 Line Chart03:38
4.05 Scatter Plot02:55
4.06 Dual-Axis Charts05:42
4.07 Combined-Axis Chart02:01
4.08 Funnel Chart02:54
4.09 Cross Tabs01:50
4.10 Highlight Tables02:22
4.11 Maps03:17
4.12 Measure Name and Measure Values02:38
4.13 Key takeaways00:19
Knowledge Check
Customer Analysis
Lesson 05 – Working with Metadata13:57
5.01 Working with Metadata00:35
5.02 Data Types05:08
5.03 Rename, Hide, Unhide and Sort Columns03:42
5.04 Default Properties of Fields04:09
5.05 Key takeaways00:23
Knowledge Check
Lesson 06 – Filters in Tableau40:22
6.01 Filters in Tableau00:43
6.02 Dimension Filter07:38
6.03 Date Filter06:25
6.04 Measure Filter03:39
6.05 Visual Filter06:00
6.06 Interactive Filter08:13
6.07 Data source Filter02:27
6.08 Context Filter05:02
6.09 Key takeaways00:15
Knowledge Check
Product Analysis
Lesson 07 – Applying Analytics to the Worksheet57:02
7.01 Applying Analytics to the Worksheet00:42
7.02 Sets06:54
7.03 Parameters05:22
7.04 Group05:50
7.05 Calculated Fields06:16
7.06 Date Functions05:37
7.07 Text Functions05:28
7.08 Bins and Histogram04:05
7.09 Sort03:15
7.10 Reference and Trend Lines05:06
7.11 Table Calculations03:49
7.12 Pareto Chart02:52
7.13 Waterfall Chart01:26
7.14 Key Takeaways00:20
Knowledge Check
Lesson 08 – Dashboards01:13:25
8.01 Dashboards in Tableau00:41
8.02 Dashboard05:17
8.03 Working with Layout07:42
8.04 Objects in Dashboard09:37
8.05 Making Interactive Dashboard04:10
8.06 Actions in Dashboard08:23
8.07 Best Practices for Dashboard Creation00:59
8.08 Dashboards for Mobile03:28
8.09 Story03:22
Case Study29:20
8.11 Key Takeaways00:26
Knowledge Check
Sales Dashboard
Lesson 09 – Modifications to Data Connections17:59
9.01 Modifications to Data Connections00:37
9.02 Edit Data Source02:33
9.03 Union03:32
9.04 Joins07:04
9.05 Data Blending03:55
9.06 Key Takeaways00:18
Knowledge Check
Lesson 10 – Level of Detail17:11
10.01 Level of Detail00:32
10.02 Introduction to Level of Detail (LOD)02:54
10.03 Fixed LOD05:09
10.04 Include LOD03:33
10.05 Exclude LOD02:58
10.06 Publish to Tableau Public01:43
10.07 Key Takeaways00:22
Knowledge Check

Course 5 Online Classroom Flexi Pass

Big Data Hadoop and Spark Developer
Our Big Data Hadoop certification training course lets you master the concepts of the Hadoop framework, Big Data tools, and methodologies to prepare you for success in your role as a Big Data Developer. Learn how various components of the Hadoop ecosystem fit into the Big Data processing lifecycle.

Big Data Hadoop and Spark Developer

Lesson 1 Course Introduction08:51
1.1 Course Introduction05:52
1.2 Accessing Practice Lab02:59
Lesson 2 Introduction to Big Data and Hadoop43:59
1.1 Introduction to Big Data and Hadoop00:31
1.2 Introduction to Big Data01:02
1.3 Big Data Analytics04:24
1.4 What is Big Data02:54
1.5 Four Vs Of Big Data02:13
1.6 Case Study Royal Bank of Scotland01:31
1.7 Challenges of Traditional System03:38
1.8 Distributed Systems01:55
1.9 Introduction to Hadoop05:28
1.10 Components of Hadoop Ecosystem Part One02:17
1.11 Components of Hadoop Ecosystem Part Two02:53
1.12 Components of Hadoop Ecosystem Part Three03:48
1.13 Commercial Hadoop Distributions04:19
1.14 Demo: Walkthrough of Simplilearn Cloudlab06:51
1.15 Key Takeaways00:15
Knowledge Check
Lesson 3 Hadoop Architecture, Distributed Storage (HDFS) and YARN57:50
2.1 Hadoop Architecture Distributed Storage (HDFS) and YARN00:50
2.2 What Is HDFS00:54
2.3 Need for HDFS01:52
2.4 Regular File System vs HDFS01:27
2.5 Characteristics of HDFS03:24
2.6 HDFS Architecture and Components02:30
2.7 High Availability Cluster Implementations04:47
2.8 HDFS Component File System Namespace02:40
2.9 Data Block Split02:32
2.10 Data Replication Topology01:16
2.11 HDFS Command Line02:14
2.12 Demo: Common HDFS Commands04:39
HDFS Command Line
2.13 YARN Introduction01:32
2.14 YARN Use Case02:21
2.15 YARN and Its Architecture02:09
2.16 Resource Manager02:14
2.17 How Resource Manager Operates02:28
2.18 Application Master03:29
2.19 How YARN Runs an Application04:39
2.20 Tools for YARN Developers01:38
2.21 Demo: Walkthrough of Cluster Part One03:06
2.22 Demo: Walkthrough of Cluster Part Two04:35
2.23 Key Takeaways00:34
Knowledge Check
Hadoop Architecture, Distributed Storage (HDFS) and YARN
Lesson 4 Data Ingestion into Big Data Systems and ETL01:05:21
3.1 Data Ingestion into Big Data Systems and ETL00:42
3.2 Data Ingestion Overview Part One01:51
3.3 Data Ingestion Overview Part Two01:41
3.4 Apache Sqoop02:04
3.5 Sqoop and Its Uses03:02
3.6 Sqoop Processing02:11
3.7 Sqoop Import Process02:24
3.8 Sqoop Connectors04:22
3.9 Demo: Importing and Exporting Data from MySQL to HDFS05:07
Apache Sqoop
3.9 Apache Flume02:42
3.10 Flume Model01:56
3.11 Scalability in Flume01:33
3.12 Components in Flume’s Architecture02:40
3.13 Configuring Flume Components01:58
3.15 Demo: Ingest Twitter Data04:43
3.14 Apache Kafka01:54
3.15 Aggregating User Activity Using Kafka01:34
3.16 Kafka Data Model02:56
3.17 Partitions02:04
3.18 Apache Kafka Architecture03:02
3.21 Demo: Setup Kafka Cluster03:52
3.19 Producer Side API Example02:30
3.20 Consumer Side API00:43
3.21 Consumer Side API Example02:36
3.22 Kafka Connect01:14
3.26 Demo: Creating Sample Kafka Data Pipeline using Producer and Consumer03:35
3.23 Key Takeaways00:25
Knowledge Check
Data Ingestion into Big Data Systems and ETL
Lesson 5 Distributed Processing – MapReduce Framework and Pig01:01:09
4.1 Distributed Processing MapReduce Framework and Pig00:44
4.2 Distributed Processing in MapReduce03:01
4.3 Word Count Example02:09
4.4 Map Execution Phases01:48
4.5 Map Execution Distributed Two Node Environment02:10
4.6 MapReduce Jobs01:55
4.7 Hadoop MapReduce Job Work Interaction02:24
4.8 Setting Up the Environment for MapReduce Development02:57
4.9 Set of Classes02:09
4.10 Creating a New Project02:25
4.11 Advanced MapReduce01:30
4.12 Data Types in Hadoop02:22
4.13 OutputFormats in MapReduce02:25
4.14 Using Distributed Cache01:51
4.15 Joins in MapReduce03:07
4.16 Replicated Join02:37
4.17 Introduction to Pig02:03
4.18 Components of Pig02:08
4.19 Pig Data Model02:23
4.20 Pig Interactive Modes03:18
4.21 Pig Operations01:19
4.22 Various Relations Performed by Developers03:06
4.23 Demo: Analyzing Web Log Data Using MapReduce05:43
4.24 Demo: Analyzing Sales Data and Solving KPIs using PIG02:46
Apache Pig
4.25 Demo: Wordcount02:21
4.23 Key Takeaways00:28
Knowledge Check
Distributed Processing – MapReduce Framework and Pig
Lesson 6 Apache Hive59:47
5.1 Apache Hive00:37
5.2 Hive SQL over Hadoop MapReduce01:38
5.3 Hive Architecture02:41
5.4 Interfaces to Run Hive Queries01:47
5.5 Running Beeline from Command Line01:51
5.6 Hive Metastore02:58
5.7 Hive DDL and DML02:00
5.8 Creating New Table03:15
5.9 Data Types01:37
5.10 Validation of Data02:41
5.11 File Format Types02:40
5.12 Data Serialization02:35
5.13 Hive Table and Avro Schema02:38
5.14 Hive Optimization Partitioning Bucketing and Sampling01:28
5.15 Non Partitioned Table01:58
5.16 Data Insertion02:22
5.17 Dynamic Partitioning in Hive02:43
5.18 Bucketing01:44
5.19 What Do Buckets Do02:04
5.20 Hive Analytics UDF and UDAF03:11
5.21 Other Functions of Hive03:17
5.22 Demo: Real-Time Analysis and Data Filtration03:18
5.23 Demo: Real-World Problem04:30
5.24 Demo: Data Representation and Import using Hive03:52
5.25 Key Takeaways00:22
Knowledge Check
Apache Hive
Lesson 7 NoSQL Databases – HBase21:41
6.1 NoSQL Databases HBase00:33
6.2 NoSQL Introduction04:42
Demo: Yarn Tuning03:28
6.3 HBase Overview02:53
6.4 HBase Architecture04:43
6.5 Data Model03:11
6.6 Connecting to HBase01:56
HBase Shell
6.7 Key Takeaways00:15
Knowledge Check
NoSQL Databases – HBase
Lesson 8 Basics of Functional Programming and Scala48:00
7.1 Basics of Functional Programming and Scala00:39
7.2 Introduction to Scala02:59
7.3 Demo: Scala Installation02:54
7.3 Functional Programming03:08
7.4 Programming with Scala04:01
Demo: Basic Literals and Arithmetic Operators02:57
Demo: Logical Operators01:21
7.5 Type Inference Classes Objects and Functions in Scala04:45
Demo: Type Inference Functions Anonymous Function and Class05:04
7.6 Collections01:33
7.7 Types of Collections05:37
Demo: Five Types of Collections03:42
Demo: Operations on List03:16
7.8 Scala REPL02:27
Demo: Features of Scala REPL03:17
7.9 Key Takeaways00:20
Knowledge Check
Basics of Functional Programming and Scala
Lesson 9 Apache Spark Next Generation Big Data Framework36:54
8.1 Apache Spark Next Generation Big Data Framework00:43
8.2 History of Spark01:58
8.3 Limitations of MapReduce in Hadoop02:48
8.4 Introduction to Apache Spark01:11
8.5 Components of Spark03:10
8.6 Application of In-Memory Processing02:54
8.7 Hadoop Ecosystem vs Spark01:30
8.8 Advantages of Spark03:22
8.9 Spark Architecture03:42
8.10 Spark Cluster in Real World02:52
8.11 Demo: Running Scala Programs in Spark Shell03:45
8.12 Demo: Setting Up Execution Environment in IDE04:18
8.13 Demo: Spark Web UI04:14
8.11 Key Takeaways00:27
Knowledge Check
Apache Spark Next Generation Big Data Framework
Lesson 10 Spark Core Processing RDD01:16:31
9.1 Processing RDD00:37
9.1 Introduction to Spark RDD02:35
9.2 RDD in Spark02:18
9.3 Creating Spark RDD05:48
9.4 Pair RDD01:53
9.5 RDD Operations03:20
9.6 Demo: Spark Transformation Detailed Exploration Using Scala Examples03:13
9.7 Demo: Spark Action Detailed Exploration Using Scala03:32
9.8 Caching and Persistence02:41
9.9 Storage Levels03:31
9.10 Lineage and DAG02:11
9.11 Need for DAG02:51
9.12 Debugging in Spark01:11
9.13 Partitioning in Spark04:05
9.14 Scheduling in Spark03:28
9.15 Shuffling in Spark02:41
9.16 Sort Shuffle03:18
9.17 Aggregating Data with Pair RDD01:33
9.18 Demo: Spark Application with Data Written Back to HDFS and Spark UI09:08
9.19 Demo: Changing Spark Application Parameters06:27
9.20 Demo: Handling Different File Formats02:51
9.21 Demo: Spark RDD with Real-World Application04:03
9.22 Demo: Optimizing Spark Jobs02:56
9.23 Key Takeaways00:20
Knowledge Check
Spark Core Processing RDD
Lesson 11 Spark SQL – Processing DataFrames29:08
10.1 Spark SQL Processing DataFrames00:32
10.2 Spark SQL Introduction02:13
10.3 Spark SQL Architecture01:25
10.4 DataFrames05:21
10.5 Demo: Handling Various Data Formats03:21
10.6 Demo: Implement Various DataFrame Operations03:20
10.7 Demo: UDF and UDAF02:50
10.8 Interoperating with RDDs04:45
10.9 Demo: Process DataFrame Using SQL Query02:30
10.10 RDD vs DataFrame vs Dataset02:34
Processing DataFrames
10.11 Key Takeaways00:17
Knowledge Check
Spark SQL – Processing DataFrames
Lesson 12 Spark MLlib – Modeling Big Data with Spark34:04
11.1 Spark MLlib Modeling Big Data with Spark00:38
11.2 Role of Data Scientist and Data Analyst in Big Data02:12
11.3 Analytics in Spark03:37
11.4 Machine Learning03:27
11.5 Supervised Learning02:19
11.6 Demo: Classification of Linear SVM03:47
11.7 Demo: Linear Regression with Real World Case Studies03:41
11.8 Unsupervised Learning01:16
11.9 Demo: Unsupervised Clustering K-Means02:45
11.10 Reinforcement Learning02:02
11.11 Semi-Supervised Learning01:17
11.12 Overview of MLlib02:59
11.13 MLlib Pipelines03:42
11.14 Key Takeaways00:22
Knowledge Check
Spark MLlib – Modeling Big Data with Spark
Lesson 13 Stream Processing Frameworks and Spark Streaming01:13:16
12.1 Stream Processing Frameworks and Spark Streaming00:34
12.1 Streaming Overview01:41
12.2 Real-Time Processing of Big Data02:45
12.3 Data Processing Architectures04:12
12.4 Demo: Real-Time Data Processing02:28
12.5 Spark Streaming04:21
12.6 Demo: Writing Spark Streaming Application03:15
12.7 Introduction to DStreams01:52
12.8 Transformations on DStreams03:44
12.9 Design Patterns for Using ForeachRDD03:25
12.10 State Operations00:46
12.11 Windowing Operations03:16
12.12 Join Operations: Stream-Dataset Join02:13
12.13 Demo: Windowing of Real-Time Data Processing02:32
12.14 Streaming Sources01:56
12.15 Demo: Processing Twitter Streaming Data03:56
12.16 Structured Spark Streaming03:54
12.17 Use Case Banking Transactions02:29
12.18 Structured Streaming Architecture Model and Its Components04:01
12.19 Output Sinks00:49
12.20 Structured Streaming APIs03:36
12.21 Constructing Columns in Structured Streaming03:07
12.22 Windowed Operations on Event-Time03:36
12.23 Use Cases01:24
12.24 Demo: Streaming Pipeline07:07
Spark Streaming
12.25 Key Takeaways00:17
Knowledge Check
Stream Processing Frameworks and Spark Streaming
Lesson 14 Spark GraphX28:43
13.1 Spark GraphX00:35
13.2 Introduction to Graph02:38
13.3 GraphX in Spark02:41
13.4 Graph Operators03:29
13.5 Join Operators03:18
13.6 Graph Parallel System01:33
13.7 Algorithms in Spark03:26
13.8 Pregel API02:31
13.9 Use Case of GraphX01:02
13.10 Demo: GraphX Vertex Predicate02:23
13.11 Demo: Page Rank Algorithm02:33
13.12 Key Takeaways00:17
Knowledge Check
Spark GraphX
13.14 Project Assistance02:17
Practice Projects

Course 6

Data Science Capstone
Simplilearn’s Data Science Capstone project will give you an opportunity to implement the skills you learned in the Data Scientist Master’s Program. Through dedicated mentoring sessions, you’ll learn how to solve a real-world, industry-aligned data science problem, from data processing and model building to reporting your business results and insights. The project is the final step in the Data Scientist Master’s Program and will help you to show your expertise in data science to employers.

Day 1 – Problem and approach overview
Day 2 – Applying data pre-processing techniques to the data set
Day 3 – Model building and fine-tuning using various techniques
Day 4 – Dashboard problem statement to meet the business objective
Day 5 – Final evaluation

From: $14.99 / month

  • Access to a vast selection of courses and labs
  • Unlimited access from all devices
  • Learn from industry expert instructors
  • Assessment quizzes and progress monitoring
  • Blended Learning with Virtual Classes
  • Access to new courses every quarter
  • 100% satisfaction guarantee

You will receive a certification after completing this course.

Instructor Led Lectures
All IT Tutor Pro (formerly It Nuggets) courses replicate a live class experience with an instructor on screen delivering the course’s theories and concepts. These lectures are pre-recorded and available to the user 24/7. They can be repeated, rewound, and fast-forwarded.
Visual Demonstrations, Educational Games & Flashcards
IT Tutor Pro (formerly It Nuggets) recognizes that not all students learn alike and that different delivery mediums are needed to achieve success for a large student base. With that in mind, we deliver our content in a variety of ways to ensure that students stay engaged and productive throughout their courses.
Mobile Optimization & Progress Tracking
Our courses are optimized for all mobile devices allowing students to learn on the go whenever they have free time. Students can access their courses from anywhere and their progress is completely tracked and recorded.
Practice Quizzes And Exams
IT Tutor Pro (formerly It Nuggets) Online’s custom practice exams prepare you for your exams differently and more effectively than traditional exam prep on the market. You will have practice quizzes after each module to ensure you are confident in the topic you are learning.
World Class Learning Management System
IT Tutor Pro (formerly It Nuggets) provides a next-generation learning management system (LMS): an experience that combines the feature set of traditional learning management systems with advanced functionality designed to make learning management easy and online learning engaging from the user’s perspective.

Frequently Asked Questions

How does online education work on a day-to-day basis?
Instructional methods, course requirements, and learning technologies can vary significantly from one online program to the next, but the vast bulk of them use a learning management system (LMS) to deliver lectures and materials, monitor student progress, assess comprehension, and accept student work. LMS providers design these platforms to accommodate a multitude of instructor needs and preferences.
Is online education as effective as face-to-face instruction?
Online education may seem relatively new, but years of research suggest it can be just as effective as traditional coursework, and often more so. According to a U.S. Department of Education analysis of more than 1,000 learning studies, online students tend to outperform classroom-based students across most disciplines and demographics. Another major review published the same year found that online students had the advantage 70 percent of the time, a gap the authors projected would only widen as programs and technologies evolve.
Do employers accept online degrees?
All new learning innovations are met with some degree of scrutiny, but skepticism subsides as methods become more mainstream. Such is the case for online learning. Studies indicate employers who are familiar with online degrees tend to view them more favorably, and more employers are acquainted with them than ever before. The majority of colleges now offer online degrees, including most public, not-for-profit, and Ivy League universities. Online learning is also increasingly prevalent in the workplace as more companies invest in web-based employee training and development programs.
Is online education more conducive to cheating?
The concern that online students cheat more than traditional students is perhaps misplaced. When researchers at Marshall University conducted a study to measure the prevalence of cheating in online and classroom-based courses, they concluded, “Somewhat surprisingly, the results showed higher rates of academic dishonesty in live courses.” The authors suggest the social familiarity of students in a classroom setting may lessen their sense of moral obligation.
How do I know if online education is right for me?
Choosing the right course takes time and careful research no matter how one intends to study. Learning styles, goals, and programs always vary, but students considering online courses should also weigh technical skills, ability to self-motivate, and other factors specific to the medium. Online course demos and trials can also be helpful.
What technical skills do online students need?
Our platform is designed to be as user-friendly as possible: intuitive controls, clear instructions, and tutorials guide students through new tasks. However, students still need basic computer skills to access and navigate these programs. These skills include: using a keyboard and a mouse; running computer programs; using the Internet; sending and receiving email; using word processing programs; and using forums and other collaborative tools. Most online programs publish such requirements on their websites. If not, an admissions adviser can help.