
DS Program-4

MS Excel, Hadoop, Python
4 Months (Weekend Batches)
35,000 (Inclusive of Taxes)

Modules Included

MS Excel 20 HRS
Big Data & Hadoop 40 HRS
Python 20 HRS

Data Analytics with Excel

Relevance in industry & need of the hour
Types of analytics – Marketing, Risk, Operations, etc.
Business & Technology drivers for analytics
Future of analytics & critical requirement
Types of problems and business objectives in various industries
Different phases of Analytics Project
Introduction to Excel
Working with Formulas and functions
Formatting & Conditional Formatting
Filtering, sorting, Paste Special, etc.
Functions (Logical & Text, Mathematical, Statistical, etc.)
Data Manipulation & Data Aggregation
Data Analysis using functions
Analyzing Data using Pivots
Descriptive Statistics
Creating Charts & Graphics
Data analytics tools (What-if analysis, Goal Seek, Data Table, Solver)
Protecting Workbooks, worksheets and formulas
Working with VBE (Visual Basic Editor)
Introduction to Excel Object Model
Understanding of Sub and Function Procedures
Key Components of a Programming Language
Understanding of If, Select Case, With End With Statements
Looping with VBA
User Defined Function
Some Commonly Used Macro Examples
Error Handling
Object and Memory Management in VBA
User Form Controls
ActiveX Controls
Communicating with an MS Access database through ADO - Exporting/Importing Data

Big Data & Hadoop

Introduction and relevance
Uses of Big Data analytics in various industries like Telecom, E-commerce, Finance, Insurance, etc.
Problems with Traditional Large-Scale Systems
Motivation for Hadoop
Different types of projects by Apache
Role of projects in the Hadoop Ecosystem
Key technology foundations required for Big Data
Limitations and Solutions of existing Data Analytics Architecture
Comparison of traditional data management systems with Big Data management systems
Evaluate key framework requirements for Big Data analytics
Hadoop Ecosystem & Hadoop 2.x core components
Explain the relevance of real-time data
Explain how to use big and real-time data as a Business planning tool
Hadoop Master-Slave Architecture
The Hadoop Distributed File System - Concept of data storage
Explain different types of cluster setups (fully distributed, pseudo-distributed, etc.)
Hadoop cluster set up - Installation
Hadoop 2.x Cluster Architecture
A Typical enterprise cluster – Hadoop Cluster Modes
Understanding cluster management tools like Cloudera Manager/Apache Ambari
HDFS Overview & Data storage in HDFS
Getting data into Hadoop from a local machine (Data Loading Techniques) and vice versa
MapReduce Overview (Traditional way vs. MapReduce way)
Concept of Mapper & Reducer
Understanding MapReduce program Framework
Develop MapReduce Program using Java (Basic)
Develop MapReduce program with the Streaming API (Basic)
Integrating Hadoop into an Existing Enterprise
Loading Data from an RDBMS into HDFS by Using Sqoop
Managing Real-Time Data Using Flume
Accessing HDFS from Legacy Systems
Apache Pig - MapReduce vs. Pig, Pig Use Cases
Pig’s Data Model
Pig Streaming
Pig Latin Program & Execution
Pig Latin: Relational Operators, File Loaders, Group Operator, COGROUP Operator, Joins and COGROUP, Union, Diagnostic Operators, Pig UDF
Writing Java UDFs
Embedded Pig in Java
Pig Macros
Parameter Substitution
Use Pig to automate the design and implementation of MapReduce applications
Use Pig to apply structure to unstructured Big Data
Apache Hive - Hive vs. Pig - Hive Use Cases
Discuss the Hive data storage principle
Explain the file formats and record formats supported by the Hive environment
Perform operations with data in Hive
HiveQL: Joining Tables, Dynamic Partitioning, Custom Map/Reduce Scripts
Hive Script, Hive UDF
Hive Persistence formats
Loading data in Hive - Methods
Serialization & Deserialization
Handling Text data using Hive
Integrating external BI tools with Hadoop Hive
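
To give a feel for the Mapper & Reducer concept listed in the MapReduce topics above, here is a minimal word-count sketch in Python. It is an illustration only and not taken from the course material: it runs in a single process, the function names (`mapper`, `reducer`, `run_local`) and the sample sentences are hypothetical, and a plain `sorted` call stands in for the shuffle/sort phase that Hadoop performs between the map and reduce steps.

```python
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Map step: emit a (word, 1) pair for every word in the input."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def reducer(sorted_pairs):
    """Reduce step: sum the counts for each word.

    Expects pairs sorted by key, as the shuffle/sort phase would deliver them.
    """
    for word, group in groupby(sorted_pairs, key=itemgetter(0)):
        yield word, sum(count for _, count in group)


def run_local(lines):
    """Simulate map -> shuffle/sort -> reduce in a single process."""
    mapped = mapper(lines)
    # Sorting by key stands in for Hadoop's shuffle/sort (an assumption of this sketch).
    shuffled = sorted(mapped, key=itemgetter(0))
    return dict(reducer(shuffled))


if __name__ == "__main__":
    # Hypothetical sample input.
    sample = ["big data needs big clusters", "data moves to hadoop"]
    print(run_local(sample))
```

On a real cluster the map and reduce steps would run as separate tasks across many machines; the sketch only shows how the two roles divide the work.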


Python

Why do we need Python?
Program structure in Python
Interactive Shell
Executable or script files
User Interface or IDE
Other Core Types
Assignments, Expressions and prints
If tests and Syntax Rules
While and For Loops
Iterations and Comprehensions
Opening a file
Using Files
Other File tools
Function definition and call
Function Scope
Function Objects
Anonymous Functions
Cleansing Data with Python
Data Manipulation steps (Sorting, filtering, duplicates, merging, appending, subsetting, derived variables, sampling, data type conversions, renaming, formatting, etc.)
Data manipulation tools (Operators, Functions, Packages, control structures, Loops, arrays, etc.)
Python Built-in Functions (Text, numeric, date, utility functions)
Python User Defined Functions
Stripping out extraneous information
Normalizing data
Formatting data
Important Python Packages for data manipulation (Pandas, NumPy, etc.)
Importing Data from various sources (CSV, TXT, Excel, Access, etc.)
Database Input (Connecting to database)
Viewing Data objects - subsetting, methods
Exporting Data to various formats
Introduction to exploratory data analysis
Descriptive statistics, Frequency Tables and summarization
Univariate Analysis (Distribution of data & Graphical Analysis)
Bivariate Analysis (Cross Tabs, Distributions & Relationships, Graphical Analysis)
Creating Graphs (bar, pie, line chart, histogram, boxplot, scatter, density, etc.)
Important Packages for Exploratory Analysis (NumPy Arrays, Matplotlib, Pandas, scipy.stats, etc.)
Basic Statistics - Measures of Central Tendency and Variance
Building blocks - Probability Distributions - Normal distribution - Central Limit Theorem
Inferential Statistics - Sampling - Concept of Hypothesis Testing
Statistical Methods - Z/t-tests (One sample, independent, paired), ANOVA, Correlation and Chi-square
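
As a rough illustration of how the cleansing, descriptive statistics and hypothesis-testing topics above fit together, the sketch below uses only the Python standard library. It is not part of the course material: the row layout (`id`, `score`), the helper names `clean_rows` and `one_sample_t`, and the sample values are all assumptions made for the example. The t statistic is the usual one-sample form, (sample mean - mu0) / (s / sqrt(n)).

```python
import math
import statistics


def clean_rows(rows):
    """Normalise ids, then drop duplicate ids and rows with a missing score."""
    seen_ids = set()
    cleaned = []
    for row in rows:
        row_id = row["id"].strip().upper()  # strip extra whitespace, normalise case
        score_text = row["score"].strip()
        if not score_text or row_id in seen_ids:
            continue  # skip missing values and duplicate ids
        seen_ids.add(row_id)
        cleaned.append({"id": row_id, "score": float(score_text)})  # type conversion
    return cleaned


def one_sample_t(values, mu0):
    """t statistic for the null hypothesis that the population mean equals mu0."""
    n = len(values)
    return (statistics.mean(values) - mu0) / (statistics.stdev(values) / math.sqrt(n))


if __name__ == "__main__":
    # Hypothetical raw data with stray whitespace, a duplicate and a missing value.
    raw = [
        {"id": " a1 ", "score": "10"},
        {"id": "A2", "score": " 12"},
        {"id": "a1", "score": "10"},
        {"id": "A3", "score": ""},
        {"id": "A4", "score": "14"},
    ]
    scores = [row["score"] for row in clean_rows(raw)]
    print("mean:", statistics.mean(scores))
    print("sample variance:", statistics.variance(scores))
    print("t statistic vs. mu0 = 10:", one_sample_t(scores, 10))
```

In practice packages such as Pandas and scipy.stats, listed above, cover these steps with far less code; the standard-library version is used here only to keep the example self-contained.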

What You Get?

Learn from our comprehensive collection of project case-studies, hand-picked by industry experts, to give you an in-depth understanding of how data science moves industries like telecom, transportation, e-commerce & more.
  1. Global Sales Store Data Analytics - Walmart
  2. Service Calls & Engineers Utilization Data Analytics – HCL Services
  3. Clinical Data Analytics of Cancer Patients Diagnosis & Medication – Global Health Care
  4. Data Analytics of Training & Development Program of Defense Forces – Indian Navy
...and many more.
You will have access to 10-15 hours of e-learning exercises alongside instructor-led training, which help candidates get the most out of each subject and build the logic needed to handle any new requirement.
This program has been designed in collaboration with some of the most influential analytics leaders and top academicians in data science.

Final Outcome

Thanks to the digital revolution sweeping the world, and India in particular, data scientists are now among the most sought-after professionals at big corporations and startups alike. Companies across industries are rewarding good data analysts and scientists with desirable career growth and salaries.