Skip to content

A multidimensional ML project inspired by M-theory. Explores the Titanic dataset through structured preprocessing and visualization. Trains and optimizes multiple models to predict passenger survival. Includes an interactive Streamlit app for individual and batch predictions. Designed for users & theory enthusiasts exploring hidden data patterns.

Notifications You must be signed in to change notification settings

jigyasaG18/Titanic-Survival-Prediction-using-Machine-Learning

Repository files navigation

🚢 Titanic Survival Prediction — A Multidimensional, Theory-Oriented Machine Learning Exploration

Where Data Meets Higher-Dimensional Insight

Live App: 👉 https://titanic-survival-prediction-using-machine-learning.streamlit.app/


🌌 1. Concept & Theoretical Orientation

This project interprets the Titanic dataset not merely as a table of numbers, but as a multi-dimensional structure, where observable outcomes (survival) emerge from the interplay of hidden variables — much like how M-Theory proposes that our universe is shaped by dimensions beyond direct perception.

In this analogy:

Machine Learning Concept M-Theory Interpretation
Features (Age, Sex, Fare…) Dimensions of the system
Model Training Stabilizing a configuration of interacting dimensions
Hyperparameters Geometric constraints on the system
Prediction Emergent behavior from multidimensional dynamics
Random Forest Ensemble of “worlds,” each contributing to final reality

This project is built so that even users with no coding background can explore these relationships through an interactive interface.


🚀 2. Project Overview

This repository contains:

  • Full data preprocessing pipeline

  • Exploratory analysis & visualizations

  • Categorical encoding

  • Training and tuning of four models

  • Performance comparison

  • Exporting the best model (Random Forest)

  • A full Streamlit web application that delivers:

    • Individual predictions
    • Batch CSV prediction
    • Explanations & UI-friendly design

The project uses the Titanic dataset to predict whether a passenger survived based on seven core features.


🧠 3. Philosophical Insight

In real life, outcomes arise from countless interacting variables. Machine learning approximates this by building a mathematical structure that captures those relationships.

Similar to M-theory:

  • What we observe (survived/not survived) is a projection from a higher-dimensional structure (Age, Fare, Pclass, Sex…).

  • The Random Forest represents many “possible worlds” (decision trees), whose combined vote shapes the final prediction.

  • GridSearchCV is the analog of exploring different configurations of the system to find the most stable or accurate one.

Thus, this project is not only a technical demonstration but a conceptual visualization of how patterns emerge from data-space.


🗂️ 4. Technical Workflow Summary

1️⃣ Import Libraries

Includes numpy, pandas, matplotlib, seaborn, scikit-learn, etc.

2️⃣ Load Dataset

Explore shape, types, missing values, statistical properties.

3️⃣ Handle Missing Data

  • Drop Cabin (too sparse)
  • Fill Age using mean
  • Fill Embarked using mode

4️⃣ Visual Exploration

  • Survival counts
  • Gender distribution
  • Class-based differences
  • Combined visual plots

5️⃣ Encode Categorical Columns

Sex: male→0, female→1  
Embarked: S→0, C→1, Q→2

6️⃣ Select Features

Using:

Pclass, Sex, Age, SibSp, Parch, Fare, Embarked

7️⃣ Split Train/Test

8️⃣ Scale Data

Only for SVM & Logistic Regression.

9️⃣ Train Four Models

  • Logistic Regression
  • Decision Tree
  • Random Forest
  • SVM

All optimized using GridSearchCV (5-fold cross validation).

🔟 Compare Model Performance

Model Training Acc Test Acc
Logistic Regression 0.8146 0.7821
Random Forest 0.8694 0.7933
Decision Tree 0.8539 0.7709
SVM 0.8567 0.7877

🏆 Random Forest is the best performer.

1️⃣1️⃣ Save Final Model

best_random_forest_model.pkl

🌐 5. Streamlit App Features

🏠 Home

Theory, explanations, and visual introduction.

📝 Individual Prediction

Enter passenger details → app predicts survival.

📊 Batch Prediction

Upload a CSV → get full annotated results.

ℹ️ About the App

Explains model logic, dataset, and educational purpose.


## 🌐 6. Live Demo

Experience the full interactive application directly in your browser.
No installation, no setup, and no coding skills required.
Explore survival predictions for individual passengers or entire CSV files.
Understand how multidimensional data shapes real-world outcomes.
Dive into the model's behavior through an intuitive, theory-inspired interface.

👉 Launch the App:
https://titanic-survival-prediction-using-machine-learning.streamlit.app/


📊 7. Sample Dataset (For Batch Prediction)

PassengerId Survived Pclass Name Sex Age SibSp Parch Ticket Fare Cabin Embarked
1 0 3 Braund, Mr. Owen Harris male 22 1 0 A/5 21171 7.25 S
2 1 1 Cumings, Mrs. John Bradley female 38 1 0 PC 17599 71.2833 C85 C
3 1 3 Heikkinen, Miss. Laina female 26 0 0 STON/O2. 3101282 7.925 S
4 1 1 Futrelle, Mrs. Jacques Heath female 35 1 0 113803 53.1 C123 S
5 0 3 Allen, Mr. William Henry male 35 0 0 373450 8.05 S

🌟 8. Future Directions (Theory & ML)

  • Incorporating SHAP for explainability → interpret dimensions
  • Extending to generative modeling → alternate “universes”
  • Using boosted trees → deeper multi-branch reality surfaces
  • Adding neural networks → higher-order nonlinear manifolds

🙏 9. Acknowledgments

This project blends Data Science, theoretical inspiration, and user-friendly interaction. It demonstrates how prediction emerges from structure — just as physical reality may arise from deeper hidden dimensions.


About

A multidimensional ML project inspired by M-theory. Explores the Titanic dataset through structured preprocessing and visualization. Trains and optimizes multiple models to predict passenger survival. Includes an interactive Streamlit app for individual and batch predictions. Designed for users & theory enthusiasts exploring hidden data patterns.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published