Introduction to data science using Python
- What is Data Science?
- Common Terms in Analytics
- Types of problems and business objectives in various industries
- Overview of analytics tools & their popularity
- List of steps in Analytics projects
- Identify the most appropriate solution design for the given problem statement
- Why Python for data science?

Python: Essentials
- Overview of Python: Starting with Python
- Introduction to installation of Python
- Introduction to Python IDEs
- Understand Jupyter notebook
- Concept of Packages/Libraries – Important packages (NumPy, SciPy, scikit-learn, Pandas, Matplotlib, etc.)
- Installing & loading Packages & Namespaces
- Data Types & Data objects/structures (Strings, Tuples, Lists, Dictionaries)
- List and Dictionary Comprehensions
- Basic Operations
- Reading and writing data
- Simple plotting
- Control flow & conditional statements
- How to create classes and modules and how to use them
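As a small illustration of the data structures and comprehensions listed above, the following sketch uses made-up sample values; the names `scores`, `passed`, and `grade_by_name` are hypothetical and chosen only for this example.

```python
# Hypothetical sample data: a list of (name, score) tuples.
scores = [("asha", 72), ("ben", 45), ("chen", 88), ("dita", 59)]

# List comprehension: names of everyone scoring 60 or more.
passed = [name for name, score in scores if score >= 60]

# Dictionary comprehension: map each name to a pass/fail label.
grade_by_name = {
    name: ("pass" if score >= 60 else "fail") for name, score in scores
}

print(passed)
print(grade_by_name)
```

The same results could be built with explicit for loops; comprehensions are simply a more compact way to write the filter-and-transform pattern.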

Data analysis – Visualization using Python
- Introduction to exploratory data analysis
- Descriptive statistics, Frequency Tables and summarization
- Univariate Analysis (Distribution of data & Graphical Analysis)
- Bivariate Analysis (Cross Tabs, Distributions & Relationships, Graphical Analysis)
- Creating Graphs (bar, pie, line chart, histogram, boxplot, scatter, density, etc.)
- Important Packages for Exploratory Analysis (NumPy Arrays, Matplotlib, Pandas, etc.)

Introduction to statistics
- Basic Statistics – Measures of Central Tendency and Variance
- Building blocks – Probability Distributions – Central Limit Theorem
- Inferential Statistics – Sampling – Concept of Hypothesis Testing
- Statistical Methods – Z/t-tests (one sample, independent, paired), ANOVA, Correlations and Chi-square
- Important modules for statistical methods: NumPy, SciPy, Pandas
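To make the measures of central tendency, variance, and the one-sample t-test above concrete, here is a minimal sketch using only the Python standard library. The sample values and the hypothesized mean `mu0` are made up for illustration; in practice the modules listed above (for example `scipy.stats.ttest_1samp`) would typically be used and would also report a p-value.

```python
import math
import statistics

# Hypothetical sample: made-up daily order counts.
sample = [12, 15, 11, 14, 13, 16, 12, 15]

mean = statistics.mean(sample)          # central tendency
median = statistics.median(sample)
variance = statistics.variance(sample)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(sample)      # sample standard deviation

# One-sample t statistic against an assumed population mean (hypothetical).
mu0 = 12
t_stat = (mean - mu0) / (std_dev / math.sqrt(len(sample)))

print(f"mean={mean}, median={median}, variance={variance:.3f}")
print(f"t statistic vs mu0={mu0}: {t_stat:.3f}")
```

The t statistic measures how many standard errors the sample mean lies from the hypothesized mean; it is compared against a t distribution with n - 1 degrees of freedom to reach a decision.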

Scientific distributions used in Python for data science
- NumPy, Pandas, Matplotlib, scikit-learn, etc.

Accessing/importing and exporting data using Python modules
- Importing Data from various sources (CSV, TXT, Excel, etc.)
- Viewing Data objects – subsetting, methods
- Exporting Data to various formats
- Important Python modules: Pandas
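The import, view, and export steps above can be sketched with Pandas as follows. The CSV content and column names are hypothetical, and `io.StringIO` stands in for a real file path so that the example is self-contained.

```python
import io

import pandas as pd

# Hypothetical CSV content; io.StringIO stands in for a file on disk.
csv_text = "city,sales,units\nPune,1200,30\nDelhi,950,25\nMumbai,1800,40\n"
df = pd.read_csv(io.StringIO(csv_text))

# Viewing: first rows and column data types.
print(df.head())
print(df.dtypes)

# Subsetting: rows with sales above 1000, keeping two columns.
subset = df.loc[df["sales"] > 1000, ["city", "sales"]]

# Exporting: write the subset as CSV text, without the index column.
out = io.StringIO()
subset.to_csv(out, index=False)
exported = out.getvalue()
print(exported)
```

With real data, a file path would be passed to `pd.read_csv` and `to_csv` instead of the in-memory buffers; `pd.read_excel` and `DataFrame.to_excel` cover Excel files in a similar way.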

Data manipulation – Cleansing – Munging using Python modules
- Cleansing Data with Python
- Data Manipulation steps (Sorting, filtering, duplicates, merging, appending, subsetting, derived variables, sampling, data type conversions, renaming, formatting, etc.)
- Data manipulation tools (Operators, Functions, Packages, control structures, Loops, arrays, etc.)
- Normalizing data
- Formatting data
- Important Python modules for data manipulation (Pandas, NumPy, re, math, string, datetime, etc.)
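Several of the manipulation steps listed above (duplicates, merging, renaming, sorting, derived variables, filtering) can be chained in Pandas. The sketch below assumes two small, made-up tables; all table and column names are hypothetical.

```python
import pandas as pd

# Hypothetical order and customer tables (made-up values).
orders = pd.DataFrame({
    "order_id": [1, 2, 2, 3, 4],
    "cust_id": [10, 11, 11, 10, 12],
    "amount": [250.0, 90.0, 90.0, 400.0, 120.0],
})
customers = pd.DataFrame({
    "cust_id": [10, 11, 12],
    "segment": ["A", "B", "A"],
})

clean = (
    orders.drop_duplicates()                       # drop the repeated order 2
    .merge(customers, on="cust_id", how="left")    # merging: add segment
    .rename(columns={"amount": "order_amount"})    # renaming
    .sort_values("order_amount", ascending=False)  # sorting
    .reset_index(drop=True)
)

# Derived variable: flag orders of 200 or more (threshold is arbitrary).
clean["is_large"] = clean["order_amount"] >= 200

# Filtering: keep only orders from segment A.
segment_a = clean[clean["segment"] == "A"]

print(clean)
print(segment_a)
```

Chaining keeps each step visible and avoids intermediate variables; the same operations can also be written one statement at a time.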

Data exploration for modeling
- Need for structured exploratory data analysis
- EDA framework for exploring the data and identifying any problems with the data (Data Audit Report)
- Identify missing data
- Identify outliers
- Visualize the data trends and patterns
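A minimal sketch of the missing-data and outlier checks above, using Pandas on made-up values. The 1.5 × IQR rule used here is one common convention for flagging outliers and is an assumption of this example, not a method prescribed by the outline.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value per column and one extreme income.
df = pd.DataFrame({
    "age": [25, 31, np.nan, 42, 29, 35],
    "income": [40, 52, 48, np.nan, 45, 400],
})

# Identify missing data: count of missing values per column.
missing_counts = df.isna().sum()

# Identify outliers with the 1.5 * IQR rule (an assumed convention).
q1 = df["income"].quantile(0.25)
q3 = df["income"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = df[(df["income"] < lower) | (df["income"] > upper)]

print(missing_counts)
print(outliers)
```

Counts like these are the kind of summary a data audit report would collect for every variable before modeling.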

Introduction to predictive modeling
- Concept of a model in analytics and how it is used
- Common terminology used in analytics & modeling process
- Popular modeling algorithms
- Different Phases of Predictive Modeling