GitHub - yashj1301/lead-scoring: This project is a Lead Scoring Case Study, built as part of the UpGrad Data Science course, to help businesses identify high-converting leads using Logistic Regression Machine Learning model.

Lead Scoring Case Study (Capstone Project)

📌 Project Overview

This project is a Lead Scoring Case Study, built as part of the UpGrad Data Science course, to help businesses identify high-converting leads using Machine Learning models.

🎯 Objectives

Understand lead conversion behavior based on given features
Clean and preprocess the data to handle missing values and outliers
Engineer features to extract useful insights
Build ML models to predict lead conversion probability
Evaluate models and deploy the best-performing one
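
The objectives above can be sketched end to end as follows. This is a minimal illustration, not the project's actual notebook code: the data is synthetic, the column names (`time_on_site`, `lead_source`, `converted`, `lead_score`) are hypothetical, and the choices of median imputation, a 99th-percentile cap for outliers, and one-hot encoding are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for the leads data; all column names are hypothetical.
rng = np.random.default_rng(0)
n = 400
time_on_site = rng.exponential(scale=300.0, size=n)
source = rng.choice(["Website", "Reference", "Google"], size=n)
# Make conversion more likely for longer visits so the toy model has signal.
p_convert = 1.0 / (1.0 + np.exp(-(time_on_site - 300.0) / 150.0))
converted = (rng.random(n) < p_convert).astype(int)

leads = pd.DataFrame(
    {"time_on_site": time_on_site, "lead_source": source, "converted": converted}
)
# Inject a few missing values so the imputers have something to do.
leads.loc[leads.index[:10], "time_on_site"] = np.nan
leads.loc[leads.index[10:20], "lead_source"] = np.nan

# Outlier handling (assumed approach): cap at the 99th percentile.
cap = leads["time_on_site"].quantile(0.99)
leads["time_on_site"] = leads["time_on_site"].clip(upper=cap)

# Preprocessing: impute and scale numeric columns, impute and encode categoricals.
numeric = Pipeline(
    [("impute", SimpleImputer(strategy="median")), ("scale", StandardScaler())]
)
categorical = Pipeline(
    [
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]
)
preprocess = ColumnTransformer(
    [("num", numeric, ["time_on_site"]), ("cat", categorical, ["lead_source"])]
)

# Logistic Regression predicts the probability that a lead converts.
model = Pipeline(
    [("preprocess", preprocess), ("clf", LogisticRegression(max_iter=1000))]
)
X = leads[["time_on_site", "lead_source"]]
y = leads["converted"]
model.fit(X, y)

# Turn the predicted probability into a 0-100 lead score.
leads["lead_score"] = (model.predict_proba(X)[:, 1] * 100).round().astype(int)
print(leads[["time_on_site", "lead_source", "converted", "lead_score"]].head())
```

Keeping the preprocessing inside a single `Pipeline` means the same imputation and encoding steps are applied when new leads are scored, which avoids a mismatch between training and deployment.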

🛠️ Tech Stack & Tools

Programming Language: Python
Data Handling: Pandas, NumPy
Machine Learning: Scikit-Learn, Logistic Regression
Visualization: Matplotlib, Seaborn

Exploratory Analysis Outcomes

The overall lead conversion rate so far is 47.1%. The analysis suggests that leads tagged with high-intent indicators (such as 'Closed by Horizzon' or 'Interested in Next batch') are more likely to convert.
Channels such as websites and references show the highest conversion rates (>90%), while social media and search engines such as Google come a distant second (60-70%).
Developing countries with lower technology adaptability, such as Bahrain and Bangladesh, dominate the demographics of successful conversions, and working professionals in the field of management carry similar weight.
Leads that converted spent, on average, 1.5x as much time on the website as leads that did not. Moreover, a low bounce rate (6.42%) indicates that visitors who enter the website tend to explore it beyond the first few pages.
A lead is more likely to convert when contacted by email or by phone. The data also indicates that prospects prefer email communication over calls.
Conversion rises as a prospect's activity index increases, but falls as the profile index increases.
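
Figures like the ones above are typically obtained with simple group-wise aggregations. The sketch below shows one way to compute an overall conversion rate, a per-channel conversion rate, and the ratio of time spent by converted versus non-converted leads with Pandas. The toy data and column names (`lead_source`, `time_on_site`, `converted`) are made up for illustration and do not reproduce the numbers reported here.

```python
import pandas as pd

# Toy data with hypothetical column names; values are invented for the example.
leads = pd.DataFrame(
    {
        "lead_source": [
            "Website", "Website", "Reference", "Google",
            "Google", "Google", "Social Media", "Social Media",
        ],
        "time_on_site": [600, 450, 500, 300, 120, 480, 90, 390],
        "converted": [1, 1, 1, 1, 0, 1, 0, 1],
    }
)

# Overall conversion rate: the mean of a 0/1 target column.
overall_rate = leads["converted"].mean()

# Conversion rate per acquisition channel, highest first.
rate_by_source = (
    leads.groupby("lead_source")["converted"].mean().sort_values(ascending=False)
)

# Average time on site for converted (1) versus non-converted (0) leads.
avg_time = leads.groupby("converted")["time_on_site"].mean()
time_ratio = avg_time.loc[1] / avg_time.loc[0]

print(f"Overall conversion rate: {overall_rate:.1%}")
print(rate_by_source)
print(f"Converted leads spent {time_ratio:.2f}x as long on the site")
```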

Model Outcomes

This classification model can be deployed to help the business predict lead conversion with 96% accuracy.
The high precision and recall for both classes indicate that the model keeps both false positives (incorrectly predicting that a lead would convert) and false negatives (missing actual lead conversions) low.
A high ROC AUC score indicates that the model separates the two classes well across decision thresholds.
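
The metrics mentioned above can be computed with Scikit-Learn as in the sketch below. It trains a Logistic Regression model on synthetic data from `make_classification`, so the resulting numbers are illustrative only and are not the project's reported results; the variable names are assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split

# Synthetic binary classification data standing in for the leads dataset.
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = clf.predict(X_test)              # hard 0/1 predictions
y_prob = clf.predict_proba(X_test)[:, 1]  # probability of the positive class

# Confusion matrix counts: false positives are leads wrongly predicted to
# convert; false negatives are actual conversions the model missed.
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()

metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    # ROC AUC is computed from probabilities, not from hard predictions.
    "roc_auc": roc_auc_score(y_test, y_prob),
}

print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Evaluating on a held-out test split, rather than on the training data, gives a more honest estimate of how the model would perform on new leads.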

