DS Program-6

MS Excel, SAS, Python/R, Tableau, IBM SPSS, Hadoop
5-6 Months (Weekend Batches)
60,000 (Inclusive of Taxes)

Modules Included

MS Excel 20 HRS
BASE & Advanced SAS 30 HRS
Python/R 30 HRS
SPSS 20 HRS
Big Data & Hadoop 40 HRS

Data Analytics with Excel

Relevance in industry & need of the hour
Types of analytics – Marketing, Risk, Operations, etc
Business & Technology drivers for analytics
Future of analytics & critical requirement
Types of problems and business objectives in various industries
Different phases of Analytics Project
Introduction to Excel
Working with Formulas and functions
Formatting & Conditional Formatting
Filtering, sorting, paste special etc.
Functions (Logical & Text, Mathematical, Statistical etc)
Data Manipulation & Data Aggregation
Data Analysis using functions
Analyzing Data using Pivots
Descriptive Statistics
Creating Charts & Graphics
Data analytics tools (What-if analysis, Goal Seek, Data Table, Solver)
Protecting Workbooks, worksheets and formulas
Working with VBE (Visual Basic Editor)
Introduction to Excel Object Model
Understanding of Sub and Function Procedures
Key Components of a Programming Language
Understanding of If, Select Case, With...End With Statements
Looping with VBA
User Defined Function
Some Commonly Used Macro Examples
Error Handling
Object and Memory Management in VBA
User Form Controls
ActiveX Controls
Communicating with an MS Access Database through ADO - Exporting/Importing Data

BASE & Advanced SAS

Introduction to SAS, GUI
Concepts of Libraries, PDV, data execution etc
Building blocks of SAS (Data & Proc Steps - Statements & options)
Debugging SAS Codes
Importing different types of data & connecting to databases
Data Understanding (Metadata, variable attributes (format, informat, length, label etc.))
SAS Procedures for data import/export/understanding (Proc import/Proc contents/Proc print/Proc means/Proc freq)
Data Manipulation steps (Sorting, filtering, duplicates, merging, appending, subsetting, derived variables, sampling, Data type conversions, renaming, formatting etc.)
Data manipulation tools (Operators, Functions, Procedures, control structures, Loops, arrays etc)
SAS Functions (Text, numeric, date, utility functions)
SAS Procedures for data manipulation (Proc sort, proc format etc)
SAS Options (System Level, procedure level)
Introduction to exploratory data analysis
Descriptive statistics, Frequency Tables and summarization
Univariate Analysis (Distribution of data & Graphical Analysis)
Bivariate Analysis (Cross Tabs, Distributions & Relationships, Graphical Analysis)
SAS Procedures for Data Analysis (Proc freq/Proc means/Proc summary/Proc tabulate/Proc univariate etc.)
SAS Procedures for Graphical Analysis (Proc Sgplot, proc gplot etc)
Introduction to Reporting
SAS Reporting Procedures (Proc print, Proc Report, Proc Tabulate etc)
Exporting data sets into different formats (Using proc export)
Concept of ODS (output delivery system)
ODS System - Exporting output into different formats
Introduction to Advanced SAS - Proc SQL & Macros
Understanding select statement (From, where, group by, having, order by etc)
Proc SQL - Data creation/extraction
Proc SQL - Data Manipulation steps
Proc SQL - Summarizing Data
Proc SQL - Concept of sub queries, indexes etc
SAS Macros - Creating/defining macro variables
SAS Macros - Defining/calling macros
SAS Macros - Concept of local/global variables
Introduction to Statistics
Descriptive and inferential statistics
Explanatory Versus Predictive Modeling
Population and samples
Uses of variables: independent and dependent
Types of variables: quantitative and categorical
Descriptive Statistics: Introduction
Histogram
Measures of shape: skewness
Box Plots
The UNIVARIATE Procedure
Statistical graphics procedures
The SGPLOT Procedure
ODS Graphics Output
Using SAS to picture your data
Confidence Intervals for the Mean: Introduction
Distribution of sample means
Normality and the central limit theorem
Calculation of 95% confidence interval
Hypothesis Testing: Introduction
Decision Making Process
Steps in Hypothesis Testing
Types of error and power
The p-value, effect size and sample size
Statistical Hypothesis Test
The t statistic, t distribution and two-sided t test
Using proc univariate to generate a t statistic

Python

Why do we need Python?
Program structure in Python
Interactive Shell
Executable or script files.
User Interface or IDE
Numbers
Strings
List
Tuple
Dictionary
Other Core Types
Assignments, Expressions and prints
If tests and Syntax Rules
While and For Loops
Iterations and Comprehensions
Opening a file
Using Files
Other File tools
Function definition and call
Function Scope
Arguments
Function Objects
Anonymous Functions
Cleansing Data with Python
Data Manipulation steps (Sorting, filtering, duplicates, merging, appending, subsetting, derived variables, sampling, Data type conversions, renaming, formatting etc.)
Data manipulation tools (Operators, Functions, Packages, control structures, Loops, arrays etc.)
Python Built-in Functions (Text, numeric, date, utility functions)
Python User Defined Functions
Stripping out extraneous information
Normalizing data
Formatting data
Important Python Packages for data manipulation (Pandas, Numpy etc)
Importing Data from various sources (Csv, txt, excel, access etc)
Database Input (Connecting to database)
Viewing Data objects - subsetting, methods
Exporting Data to various formats
Introduction to exploratory data analysis
Descriptive statistics, Frequency Tables and summarization
Univariate Analysis (Distribution of data & Graphical Analysis)
Bivariate Analysis (Cross Tabs, Distributions & Relationships, Graphical Analysis)
Creating Graphs (Bar/pie/line chart/histogram/boxplot/scatter/density etc.)
Important Packages for Exploratory Analysis (NumPy Arrays, Matplotlib, Pandas, scipy.stats etc.)
Basic Statistics - Measures of Central Tendencies and Variance
Building blocks - Probability Distributions - Normal distribution - Central Limit Theorem
Inferential Statistics - Sampling - Concept of Hypothesis Testing
Statistical Methods - Z/t-tests (One sample, independent, paired), ANOVA, Correlation and Chi-square
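
As a rough illustration of the data manipulation steps listed in this module (removing duplicates, filtering, sorting, derived variables and summarization), here is a minimal sketch in plain Python. The sample records, field names and helper functions are hypothetical and chosen only for illustration; in the course these steps would more typically be done with Pandas.

```python
from statistics import mean

# Hypothetical sample records; the field names are illustrative only.
records = [
    {"id": 1, "region": "North", "sales": 120.0},
    {"id": 2, "region": "South", "sales": 80.0},
    {"id": 2, "region": "South", "sales": 80.0},  # exact duplicate row
    {"id": 3, "region": "North", "sales": 200.0},
    {"id": 4, "region": "East", "sales": 50.0},
]


def drop_duplicates(rows):
    """Keep the first occurrence of each identical row, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def summarize_by(rows, group_key, value_key):
    """Return the mean of value_key for each distinct value of group_key."""
    groups = {}
    for row in rows:
        groups.setdefault(row[group_key], []).append(row[value_key])
    return {group: mean(values) for group, values in groups.items()}


# Removing duplicates
clean = drop_duplicates(records)

# Filtering: keep rows with sales of at least 80
high = [row for row in clean if row["sales"] >= 80.0]

# Sorting: highest sales first
ranked = sorted(high, key=lambda row: row["sales"], reverse=True)

# Derived variable: each row's share of total sales
total_sales = sum(row["sales"] for row in clean)
with_share = [dict(row, share=row["sales"] / total_sales) for row in clean]

# Summarization: average sales per region
avg_by_region = summarize_by(clean, "region", "sales")
```

Each step returns a new list or dictionary and leaves the input untouched, which makes the intermediate results easy to inspect one at a time.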

R

Introduction to R/RStudio - GUI
Concept of Packages - Useful Packages (Base & other packages) in R
Data Structure & Data Types (Vectors, Matrices, factors, Data frames, and Lists)
Importing Data from various sources
Database Input (Connecting to database)
Exporting Data to various formats
Viewing Data (Viewing partial data and full data)
Variable & Value Labels – Date Values
Data Manipulation steps (Sorting, filtering, duplicates, merging, appending, subsetting, derived variables, sampling, Data type conversions, renaming, formatting etc.)
Data manipulation tools (Operators, Functions, Packages, control structures, Loops, arrays etc.)
R Built-in Functions (Text, numeric, date, utility functions)
R User Defined Functions
R Packages for data manipulation (base, dplyr, plyr, reshape, car, sqldf etc.)
Introduction to exploratory data analysis
Descriptive statistics, Frequency Tables and summarization
Univariate Analysis (Distribution of data & Graphical Analysis)
Bivariate Analysis (Cross Tabs, Distributions & Relationships, Graphical Analysis)
Creating Graphs (Bar/pie/line chart/histogram/boxplot/scatter/density etc.)
R Packages for Exploratory Data Analysis (dplyr, plyr, gmodels, car, vcd, Hmisc, psych, doBy etc.)
R Packages for Graphical Analysis (base, ggplot, lattice etc)

SPSS

The Research Process
Initial Observation
Generate Theory
Generate Hypotheses
Data collection to Test Theory
What to measure
How to Measure
Analyze data
Descriptive Statistics: Overview
Central Tendency
Measure of variation
Coefficient of Variation
Fitting Statistical Models
Conclusion
Types of statistical models
Populations and samples
Simple statistical models
The mean as a model
The variance and standard deviation
Central Limit Theorem
The standard error
Confidence Intervals
Test statistics
Non-significant results and significant results
One- and two-tailed tests
Type I and Type II errors
Effect Sizes
Statistical power
Accessing SPSS
Exploring the key windows in SPSS
Data editor
The viewer
The syntax editor
How to create variables
Enter Data and adjust the properties of your variables
How to Load Files and Save
Opening Excel Files
Recoding Variables
Deleting/Inserting a Case or a Column
Selecting Cases
Using SPSS Help
The art of presenting data
The SPSS Chart Builder
Histograms: a good way to spot obvious problems
Boxplots (box–whisker diagrams)
Graphing means: bar charts and error bars
Simple bar charts for independent means
Clustered bar charts for independent means
Simple bar charts for related means
Clustered bar charts for related means
Clustered bar charts for ‘mixed’ designs
Line charts
Graphing relationships: the scatterplot
Simple scatterplot
Grouped scatterplot
Simple and grouped 3-D scatterplots
Matrix scatterplot
Simple dot plot or density plot
Drop-line graph
Editing graphs
What are assumptions?
Assumptions of parametric data
The assumption of normality
Quantifying normality with numbers
Exploring groups of data
Testing whether a distribution is normal
Kolmogorov–Smirnov test on SPSS
Testing for homogeneity of variance
Correcting problems in the data
Looking at relationships
Standardization and the correlation coefficient
The significance of the correlation coefficient
Confidence intervals for r
Correlation in SPSS
i. Bivariate correlation
ii. Pearson’s correlation coefficient
iii. Spearman’s correlation coefficient
iv. Kendall’s tau (non-parametric)
v. Biserial and point–biserial correlations
vi. Partial correlation
vii. The theory behind part and partial correlation
viii. Partial correlation using SPSS
ix. Semi-partial (or part) correlations
Comparing correlations
Comparing independent rs
Comparing dependent rs
Calculating the effect size
How to report correlation coefficients
An introduction to regression
Some important information about straight lines
The method of least squares
Assessing the goodness of fit: sums of squares, R and R²
Doing simple regression on SPSS
Multiple regression: the basics
How to do multiple regression using SPSS
Descriptive
Checking assumptions
Background to logistic regression
What are the principles behind logistic regression?
Assessing the model: the log-likelihood statistic
Assessing the model: R and R²
Methods of logistic regression
Interpreting logistic regression
How to report logistic regression
Testing assumptions
Predicting several categories: multinomial logistic regression
Running multinomial logistic regression in SPSS
Looking at differences
The t-test
Rationale for the t-test
Reporting the dependent t-test
Reporting the independent t-test
Between groups or repeated measures?
The t-test as a general linear model
Comparing several means: ANOVA (GLM)
The theory behind ANOVA
Inflated error rates
Interpreting the F-test
ANOVA as regression
Assumptions of ANOVA
Planned contrasts
Post hoc procedure
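
To make the "method of least squares" and "Pearson's correlation coefficient" topics in this module concrete, the following is a minimal sketch in plain Python rather than SPSS. The function name `least_squares` and the sample data are hypothetical and used only for illustration; SPSS performs the equivalent calculation through its regression and correlation dialogs.

```python
from math import sqrt


def least_squares(x, y):
    """Fit y = intercept + slope * x and return (slope, intercept, r).

    r is Pearson's correlation coefficient between x and y.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")

    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of squared deviations and of cross-products
    ss_xx = sum((xi - mean_x) ** 2 for xi in x)
    ss_yy = sum((yi - mean_y) ** 2 for yi in y)
    ss_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    if ss_xx == 0 or ss_yy == 0:
        raise ValueError("x and y must each contain some variation")

    slope = ss_xy / ss_xx
    intercept = mean_y - slope * mean_x
    r = ss_xy / sqrt(ss_xx * ss_yy)
    return slope, intercept, r


# Illustrative data lying exactly on the line y = 1 + 2x
slope, intercept, r = least_squares([1, 2, 3, 4], [3, 5, 7, 9])
```

For simple regression with one predictor, squaring `r` gives the R² value used to assess goodness of fit.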

Big Data & Hadoop

Introduction and relevance
Uses of Big Data analytics in various industries such as Telecom, E-commerce, Finance and Insurance
Problems with Traditional Large-Scale Systems
Motivation for Hadoop
Different types of projects by Apache
Role of projects in the Hadoop Ecosystem
Key technology foundations required for Big Data
Limitations and Solutions of existing Data Analytics Architecture
Comparison of traditional data management systems with Big Data management systems
Evaluate key framework requirements for Big Data analytics
Hadoop Ecosystem & Hadoop 2.x core components
Explain the relevance of real-time data
Explain how to use big and real-time data as a Business planning tool
Hadoop Master-Slave Architecture
The Hadoop Distributed File System - Concept of data storage
Explain different types of cluster setups (Fully distributed/Pseudo etc.)
Hadoop cluster set up - Installation
Hadoop 2.x Cluster Architecture
A Typical enterprise cluster – Hadoop Cluster Modes
Understanding cluster management tools like Cloudera Manager/Apache Ambari
HDFS Overview & Data storage in HDFS
Getting data into Hadoop from a local machine (Data Loading Techniques) and vice versa
Map Reduce Overview (Traditional way Vs. MapReduce way)
Concept of Mapper & Reducer
Understanding MapReduce program Framework
Develop MapReduce Program using Java (Basic)
Develop MapReduce program with the streaming API (Basic)
Integrating Hadoop into an Existing Enterprise
Loading Data from an RDBMS into HDFS by Using Sqoop
Managing Real-Time Data Using Flume
Accessing HDFS from Legacy Systems
Apache Pig - MapReduce vs. Pig, Pig Use Cases
Pig’s Data Model
Pig Streaming
Pig Latin Program & Execution
Pig Latin : Relational Operators, File Loaders, Group Operator, COGROUP Operator, Joins and COGROUP, Union, Diagnostic Operators, Pig UDF
Writing Java UDFs
Embedded Pig in Java
Pig Macros
Parameter Substitution
Use Pig to automate the design and implementation of MapReduce applications
Use Pig to apply structure to unstructured Big Data
Apache Hive - Hive vs. Pig - Hive Use Cases
Discuss the Hive data storage principle
Explain the File formats and Records formats supported by the Hive environment
Perform operations with data in Hive
HiveQL: Joining Tables, Dynamic Partitioning, Custom Map/Reduce Scripts
Hive Script, Hive UDF
Hive Persistence formats
Loading data in Hive - Methods
Serialization & Deserialization
Handling Text data using Hive
Integrating external BI tools with Hadoop Hive
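
The "Concept of Mapper & Reducer" topic in this module can be illustrated with a word count, the usual introductory MapReduce example. The sketch below is plain Python that simulates the map, shuffle and reduce phases inside a single process; it does not call any Hadoop API, and the function names are hypothetical. On a real cluster the framework performs the shuffle and distributes the work across nodes.

```python
from collections import defaultdict


def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle phase: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reducer(key, values):
    """Reduce phase: collapse the values for one key into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over an iterable of text lines."""
    pairs = (pair for line in lines for pair in mapper(line))
    grouped = shuffle(pairs)
    return dict(reducer(key, values) for key, values in grouped.items())


counts = word_count(["big data", "Big Hadoop data"])
```

Keeping the mapper and reducer as separate functions with no shared state mirrors the constraint that lets MapReduce run them in parallel on different machines.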

What You Get?

Learn from our comprehensive collection of project case-studies, hand-picked by industry experts, to give you an in-depth understanding of how data science moves industries like telecom, transportation, e-commerce & more.
  1. Global Sales Store Data Analytics - Walmart
  2. Service Calls & Engineers Utilization Data Analytics – HCL Services
  3. Clinical Data Analytics of Cancer Patients Diagnosis & Medication – Global Health Care
  4. Data Analytics of Training & Development Program of Defense Forces – Indian Navy
  ...many more...
You will have 10-15 hours of e-learning exercises alongside the instructor-led training, which help candidates get the most out of each subject and build the logic to handle any new requirement.
This program has been designed in collaboration with some of the most influential analytics leaders and top academicians in data science.

Final Outcome

Thanks to the digital revolution sweeping the world, and India in particular, data scientists are now among the most sought-after professionals at big corporations and startups alike. Companies across industries are rewarding good data analysts and scientists with strong career growth and salaries.