Data Science using Python

Introduction to data science using Python

  • What is Data Science?
  • Common Terms in Analytics
  • Types of problems and business objectives in various industries
  • Overview of analytics tools & their popularity
  • List of steps in Analytics projects
  • Identify the most appropriate solution design for the given problem statement
  • Why Python for data science?

Python: Essentials

  • Overview of Python: Starting with Python
  • Introduction to installation of Python
  • Introduction to Python IDEs
  • Understanding the Jupyter Notebook
  • Concept of Packages/Libraries – Important packages (NumPy, SciPy, scikit-learn, pandas, Matplotlib, etc.)
  • Installing and loading packages; namespaces
  • Data types and data objects/structures (strings, tuples, lists, dictionaries)
  • List and Dictionary Comprehensions
  • Basic Operations
  • Reading and writing data
  • Simple plotting
  • Control flow & conditional statements
  • Creating classes and modules, and how to call them
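
As an illustration of the comprehension and conditional topics above, here is a minimal sketch. The variable names and sample values are hypothetical and are not taken from the course material.

```python
# Hypothetical sample data, chosen only for illustration.
scores = {"alice": 82, "bob": 67, "carol": 91}

# List comprehension: names of everyone scoring at least 80.
high_scorers = [name for name, score in scores.items() if score >= 80]

# Dictionary comprehension with a conditional expression:
# map each name to a pass/fail label (70 is an assumed threshold).
labels = {
    name: ("pass" if score >= 70 else "fail")
    for name, score in scores.items()
}

print(high_scorers)
print(labels)
```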

Data analysis – Visualization using Python

  • Introduction to exploratory data analysis
  • Descriptive statistics, Frequency Tables and summarization
  • Univariate Analysis (Distribution of data & Graphical Analysis)
  • Bivariate Analysis (Cross Tabs, Distributions & Relationships, Graphical Analysis)
  • Creating Graphs (bar, pie, line chart, histogram, boxplot, scatter, density, etc.)
  • Important Packages for Exploratory Analysis (NumPy arrays, Matplotlib, pandas, etc.)
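
The univariate and bivariate topics above could be sketched with pandas roughly as follows. The data set and its column names (`segment`, `churned`, `spend`) are made up for illustration.

```python
import pandas as pd

# Hypothetical toy data set; column names are illustrative only.
df = pd.DataFrame({
    "segment": ["A", "B", "A", "B", "A", "B"],
    "churned": ["no", "yes", "no", "no", "yes", "yes"],
    "spend": [120.0, 80.0, 150.0, 95.0, 60.0, 70.0],
})

# Univariate: descriptive statistics for a numeric column
# and a frequency table for a categorical column.
spend_summary = df["spend"].describe()
segment_counts = df["segment"].value_counts()

# Bivariate: cross tab of two categorical columns.
cross_tab = pd.crosstab(df["segment"], df["churned"])

print(spend_summary)
print(segment_counts)
print(cross_tab)
```

If Matplotlib is installed, `df["spend"].plot(kind="hist")` would draw a histogram of the same column.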

Introduction to statistics

  • Basic Statistics – Measures of Central Tendency and Variance
  • Building blocks – Probability Distributions – Central Limit Theorem
  • Inferential Statistics – Sampling – Concept of Hypothesis Testing
  • Statistical Methods – Z/t-tests (one sample, independent, paired), ANOVA, Correlations and Chi-square
  • Important modules for statistical methods: NumPy, SciPy, pandas
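
To make the one-sample t-test concrete, the sketch below computes the test statistic by hand with NumPy. The sample values and the hypothesized mean `mu0` are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical sample and hypothesized population mean.
sample = np.array([5.1, 4.9, 5.3, 5.0, 5.2, 4.8, 5.4, 5.1])
mu0 = 5.0

n = sample.size
sample_mean = sample.mean()
sample_std = sample.std(ddof=1)  # ddof=1 gives the sample standard deviation

# One-sample t statistic: (mean - mu0) / (s / sqrt(n)), with n - 1 degrees of freedom.
t_stat = (sample_mean - mu0) / (sample_std / np.sqrt(n))
dof = n - 1

print(f"t = {t_stat:.3f} with {dof} degrees of freedom")
```

In practice `scipy.stats.ttest_1samp(sample, mu0)` returns the same statistic together with a p-value.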

Scientific distributions used in Python for data science

  • NumPy, pandas, Matplotlib, scikit-learn, etc.

Accessing/importing and exporting data using Python modules

  • Importing Data from various sources (CSV, TXT, Excel, etc.)
  • Viewing Data objects – subsetting, methods
  • Exporting Data to various formats
  • Important Python modules: pandas
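
A minimal sketch of importing, viewing, subsetting and exporting with pandas is shown below. The CSV content is invented, and an in-memory `io.StringIO` buffer stands in for a real file path so the example is self-contained.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice this would be a file on disk.
csv_text = "id,name,amount\n1,widget,9.5\n2,gadget,12.0\n3,gizmo,7.25\n"

# read_csv accepts a path or a file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Viewing and subsetting.
first_rows = df.head(2)
expensive = df[df["amount"] > 8]

# Exporting: to_csv returns a string when no path is given.
exported = expensive.to_csv(index=False)

print(first_rows)
print(exported)
```

Passing a file name to `to_csv` (or using `to_excel`, which needs an Excel writer package such as openpyxl) writes to disk instead.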

Data manipulation – Cleansing – Munging using Python modules

  • Cleansing Data with Python
  • Data manipulation steps (sorting, filtering, duplicates, merging, appending, subsetting, derived variables, sampling, data type conversions, renaming, formatting, etc.)
  • Data manipulation tools (operators, functions, packages, control structures, loops, arrays, etc.)
  • Normalizing data
  • Formatting data
  • Important Python modules for data manipulation (pandas, NumPy, re, math, string, datetime, etc.)
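
Several of the manipulation steps listed above (removing duplicates, type conversion, renaming, sorting, merging, and normalizing) can be sketched with pandas as follows. The tables, column names, and the choice of min-max scaling are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical raw tables; "amount" is stored as text on purpose.
orders = pd.DataFrame({
    "order_id": [3, 1, 2, 2],
    "cust": ["c2", "c1", "c1", "c1"],
    "amount": ["30", "10", "20", "20"],
})
customers = pd.DataFrame({"cust": ["c1", "c2"], "region": ["north", "south"]})

clean = (
    orders.drop_duplicates()                         # remove the repeated order
          .astype({"amount": float})                 # data type conversion
          .rename(columns={"cust": "customer_id"})   # renaming
          .sort_values("order_id")                   # sorting
          .reset_index(drop=True)
)

# Merging: attach each customer's region to their orders.
merged = clean.merge(
    customers, left_on="customer_id", right_on="cust", how="left"
).drop(columns="cust")

# Normalizing: min-max scaling of "amount" to the range [0, 1].
amt = merged["amount"]
merged["amount_scaled"] = (amt - amt.min()) / (amt.max() - amt.min())

print(merged)
```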

Data exploration for modeling

  • Need for structured exploratory data analysis
  • EDA framework for exploring the data and identifying any problems with the data (Data Audit Report)
  • Identify missing data
  • Identify outliers
  • Visualize the data trends and patterns
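
The missing-data and outlier checks above might look like the sketch below. The data are invented, and the 1.5 × IQR rule is one common convention for flagging outliers, used here as an assumption rather than as the course's prescribed method.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing age and one extreme income.
df = pd.DataFrame({
    "age": [25, 31, np.nan, 42, 38, 29],
    "income": [40, 42, 45, 41, 300, 43],
})

# Missing data: count of missing values per column.
missing_counts = df.isna().sum()

# Outliers: flag values outside 1.5 * IQR from the quartiles.
q1 = df["income"].quantile(0.25)
q3 = df["income"].quantile(0.75)
iqr = q3 - q1
is_outlier = (df["income"] < q1 - 1.5 * iqr) | (df["income"] > q3 + 1.5 * iqr)
outliers = df.loc[is_outlier, "income"]

print(missing_counts)
print(outliers)
```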

Introduction to predictive modeling

  • Concept of a model in analytics and how it is used
  • Common terminology used in analytics & modeling process
  • Popular modeling algorithms
  • Different Phases of Predictive Modeling