Data Science using Python

Introduction to data science using Python
What is Data Science?
Common Terms in Analytics
Types of problems and business objectives in various industries
Overview of analytics tools & their popularity
List of steps in Analytics projects
Identify the most appropriate solution design for the given problem statement
Why Python for data science?

Python: Essentials

Overview of Python: starting with Python
Introduction to installation of Python
Introduction to Python IDEs
Understanding the Jupyter notebook
Concept of Packages/Libraries – Important packages (NumPy, SciPy, scikit-learn, Pandas, Matplotlib, etc.)
Installing & loading packages & namespaces
Data types & data objects/structures (strings, tuples, lists, dictionaries)
List and Dictionary Comprehensions
Basic Operations
Reading and writing data
Simple plotting
Control flow & conditional statements
How to create classes and modules and how to call them
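As a minimal sketch of the comprehensions, control flow, and classes listed above (the names `scores`, `grade`, and `Student` and all values are illustrative assumptions, not part of the course material):

```python
# Illustrative data; the names and values are made up for this sketch.
scores = {"asha": 82, "ben": 47, "chen": 65}

# List comprehension: names of everyone who scored 50 or more.
passed = [name for name, score in scores.items() if score >= 50]

# Dictionary comprehension: map each name to a pass/fail label.
labels = {name: ("pass" if score >= 50 else "fail") for name, score in scores.items()}


def grade(score):
    """Control flow with if/elif/else: map a numeric score to a letter."""
    if score >= 80:
        return "A"
    elif score >= 60:
        return "B"
    else:
        return "C"


class Student:
    """A small class holding a name and a score."""

    def __init__(self, name, score):
        self.name = name
        self.score = score

    def letter(self):
        # Delegate to the module-level function defined above.
        return grade(self.score)


# Create one object per entry and call a method on each.
students = [Student(name, score) for name, score in scores.items()]
letters = [s.letter() for s in students]

print(passed)
print(labels)
print(letters)
```

Saving this code in a file such as `grading.py` (a hypothetical name) would make it a module, so that another script could use `from grading import Student`.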

Data analysis - Visualization using Python

Introduction to exploratory data analysis
Descriptive statistics, Frequency Tables and summarization
Univariate Analysis (Distribution of data & Graphical Analysis)
Bivariate Analysis (Cross Tabs, Distributions & Relationships, Graphical Analysis)
Creating graphs (bar, pie, line chart, histogram, boxplot, scatter, density, etc.)
Important packages for exploratory analysis (NumPy arrays, Matplotlib, Pandas, etc.)
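The descriptive statistics, frequency tables, and cross tabs listed above can be sketched with Pandas roughly as follows. The DataFrame and its column names (`region`, `segment`, `sales`) are made-up assumptions for illustration only.

```python
import pandas as pd

# Made-up sample data for illustration only.
df = pd.DataFrame({
    "region": ["north", "south", "north", "east", "south", "north"],
    "segment": ["retail", "retail", "online", "online", "retail", "retail"],
    "sales": [120.0, 95.0, 150.0, 80.0, 110.0, 130.0],
})

# Descriptive statistics for a numeric column (count, mean, std, quartiles).
summary = df["sales"].describe()

# Univariate analysis: frequency table for a categorical column.
region_counts = df["region"].value_counts()

# Bivariate analysis: cross tabulation of two categorical columns.
cross_tab = pd.crosstab(df["region"], df["segment"])

# Bivariate analysis: mean of a numeric column within each category.
mean_sales = df.groupby("region")["sales"].mean()

print(summary)
print(region_counts)
print(cross_tab)
print(mean_sales)
```

With Matplotlib installed, a call such as `df["sales"].plot(kind="hist")` would be one way to draw the corresponding histogram.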

Introduction to statistics

Basic Statistics – Measures of Central Tendency and Variance
Building blocks – Probability Distributions – Central Limit Theorem
Inferential Statistics – Sampling – Concept of Hypothesis Testing
Statistical Methods – Z/t-tests (one sample, independent, paired), ANOVA, Correlations and Chi-square
Important modules for statistical methods: NumPy, SciPy, Pandas
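A rough sketch of how the t-tests, correlation, and chi-square test listed above might be run with SciPy. The measurements, the hypothesised mean of 70, and the 2x2 table are made-up assumptions for illustration.

```python
import numpy as np
from scipy import stats

# Made-up measurements for illustration only.
before = np.array([72.0, 75.0, 78.0, 71.0, 74.0, 77.0, 73.0, 76.0])
after = np.array([70.0, 74.0, 75.0, 70.0, 71.0, 75.0, 72.0, 73.0])

# One-sample t-test: does the mean of `before` differ from 70?
one_sample = stats.ttest_1samp(before, popmean=70.0)

# Independent two-sample t-test: treats the two arrays as unrelated groups.
independent = stats.ttest_ind(before, after)

# Paired t-test: treats the arrays as repeated measures on the same subjects.
paired = stats.ttest_rel(before, after)

# Pearson correlation between the two series.
r, r_pvalue = stats.pearsonr(before, after)

# Chi-square test of independence on a 2x2 contingency table of counts.
table = np.array([[30, 10], [20, 20]])
chi2, chi2_pvalue, dof, expected = stats.chi2_contingency(table)

print("one-sample:", one_sample.statistic, one_sample.pvalue)
print("independent:", independent.statistic, independent.pvalue)
print("paired:", paired.statistic, paired.pvalue)
print("pearson r:", r, r_pvalue)
print("chi-square:", chi2, chi2_pvalue, dof)
```

The paired test is the appropriate choice here only if each `before` value and the `after` value at the same position come from the same subject; otherwise the independent test applies.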

Scientific distributions used in Python for data science

NumPy, Pandas, Matplotlib, scikit-learn, etc.

Accessing/importing and exporting data using Python modules
Importing Data from various sources (CSV, TXT, Excel, etc.)
Viewing Data objects – subsetting, methods
Exporting Data to various formats
Important Python modules: Pandas
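A minimal sketch of importing, viewing, subsetting, and exporting data with Pandas. To stay self-contained it reads from and writes to in-memory text buffers; in practice a file path would take their place. The CSV content and column names are made-up assumptions.

```python
import io

import pandas as pd

# Made-up CSV content; in practice this would be a file on disk.
csv_text = "id,name,amount\n1,alpha,10.5\n2,beta,20.0\n3,gamma,7.25\n"

# Importing: read_csv accepts a path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Viewing: first rows and the inferred column types.
first_rows = df.head(2)
column_types = df.dtypes

# Subsetting: keep rows matching a condition and only two of the columns.
large = df.loc[df["amount"] > 10, ["name", "amount"]]

# Exporting: write the subset back out as CSV without the row index.
buffer = io.StringIO()
large.to_csv(buffer, index=False)
exported = buffer.getvalue()

print(first_rows)
print(column_types)
print(exported)
```

Pandas also provides `read_excel` and `DataFrame.to_excel` for Excel files; these rely on an additional Excel engine package being installed.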

Data manipulation – Cleansing – Munging using Python modules
Cleansing Data with Python
Data manipulation steps (sorting, filtering, duplicates, merging, appending, subsetting, derived variables, sampling, data type conversions, renaming, formatting, etc.)
Data manipulation tools (operators, functions, packages, control structures, loops, arrays, etc.)
Normalizing data
Formatting data
Important Python modules for data manipulation (Pandas, NumPy, re, math, string, datetime, etc.)
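Several of the manipulation steps listed above (duplicates, type conversion, renaming, sorting, merging, derived variables, normalizing) can be sketched with Pandas as follows. The tables, column names, and the choice of min-max scaling as the normalization are assumptions made for illustration.

```python
import pandas as pd

# Made-up tables for illustration only.
orders = pd.DataFrame({
    "order_id": [3, 1, 2, 2],
    "cust": ["c2", "c1", "c1", "c1"],
    "amount": ["30", "10", "20", "20"],  # stored as text on purpose
})
customers = pd.DataFrame({"cust": ["c1", "c2"], "city": ["Pune", "Delhi"]})

clean = (
    orders
    .drop_duplicates()                         # remove the repeated order row
    .astype({"amount": "float64"})             # data type conversion: text to number
    .rename(columns={"cust": "customer_id"})   # renaming
    .sort_values("order_id")                   # sorting
    .reset_index(drop=True)
)

# Merging: attach the city of each customer with a left join.
merged = clean.merge(
    customers.rename(columns={"cust": "customer_id"}),
    on="customer_id",
    how="left",
)

# Derived variable: flag orders of 20 or more.
merged["is_large"] = merged["amount"] >= 20

# Normalizing: min-max scaling of the amount to the range 0 to 1.
amount_min = merged["amount"].min()
amount_max = merged["amount"].max()
merged["amount_scaled"] = (merged["amount"] - amount_min) / (amount_max - amount_min)

print(merged)
```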

Data exploration for modeling
Need for structured exploratory data analysis
EDA framework for exploring the data and identifying any problems with the data (Data Audit Report)
Identify missing data
Identify outliers
Visualize the data trends and patterns
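A small sketch of the missing-data and outlier checks listed above. The data are made up, and the 1.5 x IQR rule is only one common convention for flagging outliers; it is an assumption here, not a method named in this outline.

```python
import numpy as np
import pandas as pd

# Made-up data with some missing values and one extreme age.
df = pd.DataFrame({
    "age": [25, 31, np.nan, 29, 27, 30, 95],
    "income": [40.0, 52.0, 48.0, np.nan, 45.0, np.nan, 50.0],
})

# Identify missing data: count of missing values per column.
missing_counts = df.isna().sum()

# Identify outliers with the 1.5 * IQR rule on one column.
q1 = df["age"].quantile(0.25)
q3 = df["age"].quantile(0.75)
iqr = q3 - q1
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# Rows with a missing age compare as False and are not flagged.
outliers = df[(df["age"] < lower) | (df["age"] > upper)]

print(missing_counts)
print(outliers)
```

Counts like these, collected for every column, are the kind of information a data audit report would typically summarize.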

Introduction to predictive modeling
Concept of a model in analytics and how it is used
Common terminology used in analytics & modeling process
Popular modeling algorithms
Different Phases of Predictive Modeling
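As a hedged illustration of how the phases of a predictive modeling exercise might fit together, the sketch below prepares data, splits it, fits a model, and evaluates it on held-out data. This outline does not list the phases, so the breakdown, the synthetic data, and the use of a straight-line fit with NumPy are all assumptions.

```python
import numpy as np

# Phase 1 (assumed): prepare data. Synthetic points near the line y = 3x + 2.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 3.0 * x + 2.0 + rng.normal(0.0, 1.0, size=x.size)

# Phase 2 (assumed): split into training and hold-out sets (40 and 10 points).
indices = rng.permutation(x.size)
train_idx, test_idx = indices[:40], indices[40:]

# Phase 3 (assumed): fit the model on the training set only.
slope, intercept = np.polyfit(x[train_idx], y[train_idx], deg=1)

# Phase 4 (assumed): evaluate on the hold-out set with root mean squared error.
predictions = slope * x[test_idx] + intercept
rmse = float(np.sqrt(np.mean((predictions - y[test_idx]) ** 2)))

print("slope:", slope, "intercept:", intercept)
print("hold-out RMSE:", rmse)
```

Evaluating on points the model never saw during fitting is what makes the error a fair estimate of how it would perform on new data.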