Predictive Modeling
- The concept of a model in analytics and how it is used
- Common terminology used in analytics & modeling process
- Popular modeling algorithms
- Different Phases of Predictive Modeling
Data exploration for modeling
- Need for structured exploratory data analysis (EDA)
- EDA framework for exploring the data and identifying any problems with the data (Data Audit Report)
- Identify missing data
- Identify outliers
- Visualize the data trends and patterns
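
As an illustration of the missing-data and outlier checks listed above, here is a minimal sketch of a one-column data audit. The function name `audit_column`, the sample values, and the 1.5 x IQR outlier rule are assumptions chosen for the example; a real Data Audit Report would cover every column and usually more checks.

```python
import statistics

def audit_column(values):
    """Summarize one numeric column: count missing values and flag IQR outliers.

    Hypothetical helper for illustration; None marks a missing value.
    """
    present = [v for v in values if v is not None]
    missing = len(values) - len(present)
    # Quartile cut points (Q1, median, Q3) of the non-missing values.
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    # Assumed rule: anything beyond 1.5 * IQR from the quartiles is an outlier.
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in present if v < low or v > high]
    return {"missing": missing, "outliers": outliers}

# Made-up sample column with two missing entries and one extreme value.
report = audit_column([12, 14, None, 13, 15, 14, 98, None, 13])
print(report)
```
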
Linear regression: solving regression problems
- Introduction – Applications
- Assumptions of Linear Regression
- Building a Linear Regression Model
- Understanding standard metrics (Variable significance, R-square/Adjusted R-square, Global hypothesis, etc)
- Assess the overall effectiveness of the model
- Interpretation of Results
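
To make the metrics above concrete, the following sketch fits a one-predictor linear regression by ordinary least squares and computes R-square and Adjusted R-square. The function name `fit_simple_ols` and the data are made up for illustration; the course may well use a statistics package instead.

```python
def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by least squares (single predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-square: share of the variance in y explained by the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    # Adjusted R-square penalizes the number of predictors (p = 1 here).
    p = 1
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
    return slope, intercept, r2, adj_r2

# Illustrative data only.
slope, intercept, r2, adj_r2 = fit_simple_ols(
    [1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1]
)
print(slope, intercept, r2, adj_r2)
```

The slope is read as the expected change in y for a one-unit increase in x, which is the kind of statement the "Interpretation of Results" topic is concerned with.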

Logistic regression: solving classification problems
- Introduction – Applications
- Building the Logistic Regression Model
- Understanding standard model metrics (Concordance, Variable significance, Hosmer-Lemeshow Test, Gini, KS, Misclassification, ROC Curve, etc.)
- Validation of Logistic Regression Models
- Standard Business Outputs (ROC Curve, Probability Cut-offs, Lift charts, Model equation, Drivers or variable importance, etc)
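
A few of the metrics named above can be computed directly from predicted probabilities. The sketch below assumes a model has already produced scores; the function names, labels, scores, and the 0.5 cut-off are all hypothetical. Concordance is computed as the share of (event, non-event) pairs in which the event has the higher score, which equals the area under the ROC curve, and Gini is derived from it as 2 * AUC - 1.

```python
def concordance_auc(labels, scores):
    """Share of (event, non-event) pairs ranked correctly; ties count half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    concordant = sum(1 for p in pos for n in neg if p > n)
    ties = sum(1 for p in pos for n in neg if p == n)
    return (concordant + 0.5 * ties) / (len(pos) * len(neg))

def misclassification_rate(labels, scores, cutoff=0.5):
    """Fraction of cases whose predicted class at the cut-off is wrong."""
    wrong = sum(1 for l, s in zip(labels, scores) if (s >= cutoff) != (l == 1))
    return wrong / len(labels)

# Made-up labels (1 = event) and model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

auc = concordance_auc(labels, scores)
gini = 2 * auc - 1
error = misclassification_rate(labels, scores, cutoff=0.5)
print(auc, gini, error)
```
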
Supervised learning: Naive Bayes
- The concept of Conditional Probability
- Bayes Theorem and Its Applications
- Naïve Bayes for classification
- Applications of Naïve Bayes in Classifications
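
Bayes' theorem, P(A|B) = P(B|A) * P(A) / P(B), can be worked through with a small spam-filter example. The probabilities below are invented for illustration, and the single-word case is the simplest instance of what a Naïve Bayes classifier does across many features.

```python
# Assumed (made-up) probabilities for a one-word spam check.
p_spam = 0.2                # prior: P(spam)
p_word_given_spam = 0.6     # likelihood: P(word | spam)
p_word_given_ham = 0.05     # likelihood: P(word | not spam)

# Total probability of seeing the word at all.
p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)

# Bayes' theorem: posterior probability of spam given the word.
p_spam_given_word = p_word_given_spam * p_spam / p_word
print(p_spam_given_word)
```
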
Time series forecasting: Solving forecasting problems
- Introduction – Applications
- Basic Techniques – Averages, Smoothing, etc.
- Advanced Techniques – AR Models, ARIMA, etc
- Understanding Forecasting Accuracy – MAPE, MAD, MSE, etc
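
The three accuracy measures named above can be sketched as follows. The function name `forecast_errors` and the numbers are illustrative assumptions; MAPE is expressed in percent and is undefined when an actual value is zero.

```python
def forecast_errors(actual, forecast):
    """Return MAD, MSE, and MAPE (in percent) for paired actual/forecast values."""
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    mad = sum(abs(e) for e in errors) / n    # mean absolute deviation
    mse = sum(e * e for e in errors) / n     # mean squared error
    # Mean absolute percentage error; assumes no actual value is zero.
    mape = 100 * sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n
    return mad, mse, mape

# Made-up actuals and forecasts.
mad, mse, mape = forecast_errors([100, 200, 400], [110, 190, 380])
print(mad, mse, mape)
```
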
Supervised Learning: Decision Trees
- Decision Trees – Introduction – Applications
- Types of Decision Tree Algorithms
- Construction of Decision Trees through Simplified Examples; Choosing the “Best” attribute at each Non-Leaf node; Entropy; Information Gain, Gini Index, Chi-Square, Regression Trees
- Generalizing Decision Trees; Information Content and Gain Ratio; Dealing with Numerical Variables; other Measures of Randomness
- Pruning a Decision Tree; Cost as a consideration
- Decision Trees – Validation
- Overfitting – Best Practices to avoid
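
Choosing the "best" attribute at a node comes down to comparing impurity before and after a split. The sketch below computes entropy, Gini impurity, and information gain for a toy split; the function names and labels are assumptions made for the example, not a full tree-building algorithm.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())

def information_gain(labels, groups):
    """Entropy of the parent minus the size-weighted entropy of the child groups."""
    n = len(labels)
    weighted = sum(len(g) / n * entropy(g) for g in groups)
    return entropy(labels) - weighted

# Toy node: a split that separates the two classes perfectly.
parent = ["yes", "yes", "no", "no"]
children = [["yes", "yes"], ["no", "no"]]

gain = information_gain(parent, children)
print(entropy(parent), gini_impurity(parent), gain)
```

A split that leaves each child pure recovers all of the parent's entropy as gain; an uninformative split would give a gain close to zero.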
Segmentation: what it is and the role of ML in segmentation
- K-Means Clustering
- Expectation Maximization
- Principal Component Analysis (PCA)
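
As a rough illustration of K-Means, the sketch below runs the usual assign-then-update loop on one-dimensional data. The function name `kmeans_1d`, the points, the starting centers, and the fixed iteration count are assumptions for the example; practical implementations work in many dimensions and choose starting centers more carefully.

```python
def kmeans_1d(points, centers, iterations=10):
    """Alternate between assigning points to the nearest center and re-averaging."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            # Assignment step: index of the closest current center.
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
    return centers, clusters

# Made-up data with two obvious groups and deliberately poor starting centers.
centers, clusters = kmeans_1d([1, 2, 3, 10, 11, 12], [0.0, 5.0])
print(centers, clusters)
```
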
Supervised learning: support vector machines
- Motivation for Support Vector Machine & Applications
- Interpreting outputs and fine-tuning models with hyperparameters
Supervised Learning: KNN
- What is KNN, and what are its applications?
- KNN for missing-value treatment
- KNN For solving regression problems
- KNN for solving classification problems
- Validating KNN model
- Model fine tuning with hyperparameters
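
The classification and regression uses of KNN listed above differ only in how the neighbors' values are combined: a majority vote versus an average. The following is a minimal sketch assuming Euclidean distance and Python 3.8+ (for `math.dist`); the function names and training data are hypothetical, and `k` is the hyperparameter one would tune.

```python
import math
from collections import Counter

def k_nearest(train, query, k):
    """Return the k training rows (features, target) closest to the query."""
    return sorted(train, key=lambda row: math.dist(row[0], query))[:k]

def knn_classify(train, query, k=3):
    """Predict the majority class among the k nearest neighbors."""
    votes = Counter(label for _, label in k_nearest(train, query, k))
    return votes.most_common(1)[0][0]

def knn_regress(train, query, k=3):
    """Predict the mean target value of the k nearest neighbors."""
    neighbors = k_nearest(train, query, k)
    return sum(target for _, target in neighbors) / len(neighbors)

# Made-up training sets.
class_data = [((1, 1), "a"), ((1, 2), "a"), ((2, 1), "a"),
              ((8, 8), "b"), ((9, 8), "b"), ((8, 9), "b")]
reg_data = [((1,), 10.0), ((2,), 20.0), ((3,), 30.0), ((10,), 100.0)]

predicted_class = knn_classify(class_data, (2, 2), k=3)
predicted_value = knn_regress(reg_data, (2,), k=3)
print(predicted_class, predicted_value)
```
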