DataFlowAI - Interactive Machine Learning Visualizations

DataFlowAI is an interactive web application that helps users understand four fundamental machine learning algorithms through visual demonstrations.

Algorithms Overview

1. Linear Regression

What is Linear Regression?

Linear Regression is a supervised learning algorithm that predicts a continuous output value based on one or more input features. It finds the best-fitting straight line (or hyperplane in higher dimensions) through the data points by minimizing the sum of squared errors.

Implementation Details

Data Points: Users can add points manually by clicking or generate random points
Training Process:
- Uses gradient descent to find optimal slope and intercept
- Animated line fitting during training
- Shows error lines between points and predictions
Performance Metrics:
- R² Score (0 to 1): Measures how well the model fits the data
  - 1.0 = Perfect fit
  - 0.0 = Poor fit
  - Negative = Worse than horizontal line
- Mean Squared Error (MSE): Average of squared differences between predictions and actual values
  - Penalizes larger errors more heavily
  - Always positive
  - Lower is better
- Mean Absolute Error (MAE): Average of absolute differences between predictions and actual values
  - Easier to interpret
  - Less sensitive to outliers than MSE
  - Lower is better

2. Logistic Regression

What is Logistic Regression?

Logistic Regression is a classification algorithm that predicts binary outcomes (0 or 1). It uses a sigmoid function to transform linear predictions into probabilities between 0 and 1.

Implementation Details

Data Points: Users can add points of two classes (0 and 1)
Training Process:
- Uses gradient descent with mini-batch processing
- L2 regularization to prevent overfitting
- Animated decision boundary
- Optimal threshold finding
Performance Metrics:
- Accuracy: Percentage of correct predictions (both classes)
- Precision: Of points predicted as Class 1, how many were actually Class 1
- Recall: Of all actual Class 1 points, how many were correctly identified
- F1 Score: Harmonic mean of precision and recall
- Decision Boundary: Shows the line where probability = 0.5
- Probability Tooltips: Shows P(class=1) for any point

3. K-Nearest Neighbors (KNN)

What is KNN?

KNN is a simple, non-parametric classification algorithm that classifies points based on the majority class of their k nearest neighbors. It's an example of instance-based learning.

Implementation Details

Data Points: Users can add training points of three classes
Test Point: Right-click to add/move test point
Classification Process:
- Calculates Euclidean distance to all training points
- Finds k nearest neighbors
- Predicts class based on majority vote
- Visualizes nearest neighbors with highlighted circles
Performance Metrics (using Leave-One-Out Cross Validation):
- Confusion Matrix: Shows prediction distribution for each class
- Per-Class Metrics:
  - True Positives (TP): Correctly predicted points of this class
  - False Positives (FP): Other classes predicted as this class
  - False Negatives (FN): This class predicted as other classes
  - Precision: TP / (TP + FP)
  - Recall: TP / (TP + FN)
  - F1 Score: 2 _ (Precision _ Recall) / (Precision + Recall)

4. K-Means Clustering

What is K-Means?

K-Means is an unsupervised learning algorithm that groups similar data points into k clusters. It iteratively assigns points to the nearest centroid and updates centroid positions.

Implementation Details

Data Points: Users can add points by clicking
Clustering Process:
- Randomly initializes k centroids
- Animated assignment of points to nearest centroid
- Smooth centroid movement during updates
- Convergence detection
Interactive Features:
- Adjustable number of clusters (k)
- Color-coded clusters
- Animated clustering process
- Triangle markers for centroids
Status Information:
- Number of points
- Current iteration
- Convergence status

Technical Implementation Details

Common Features Across All Models

Interactive canvas using Chart.js
Real-time visualization updates
Responsive design
Error handling and input validation
Clear and reset functionality
Informative tooltips and instructions

Animation and Visualization

Smooth transitions for model updates
Color-coded classes and clusters
Clear visual distinction between different elements
Responsive canvas sizing
Cross-browser compatibility

User Interface

Intuitive point addition with left/right clicks
Clear control buttons with appropriate disabled states
Real-time metric updates
Detailed instructions and explanations
Mobile-responsive design

Performance Optimizations

Efficient data structure usage
Optimized animation frames
Debounced event handlers
Proper cleanup of chart instances
Memory leak prevention

Usage Instructions

Each algorithm has specific interaction patterns:

Linear Regression: Click to add points, train model to fit line
Logistic Regression: Add points of two classes, train to find decision boundary
KNN: Add training points, right-click for test point
K-Means: Add points, adjust k, watch clustering animation

Technical Requirements

Modern web browser with JavaScript enabled
Recommended minimum screen width: 768px
Touch device support for basic interactions

Libraries Used

React for UI components
Chart.js for visualizations
TailwindCSS for styling
React Router for navigation

Future Improvements

Additional algorithms
More interactive features
Dataset import/export
Mobile optimization
Advanced visualization options

Name		Name	Last commit message	Last commit date
Latest commit History 33 Commits
public		public
src		src
.gitignore		.gitignore
README.md		README.md
eslint.config.js		eslint.config.js
index.html		index.html
package-lock.json		package-lock.json
package.json		package.json
vercel.json		vercel.json
vite.config.js		vite.config.js

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

DataFlowAI - Interactive Machine Learning Visualizations

Algorithms Overview

1. Linear Regression

What is Linear Regression?

Implementation Details

2. Logistic Regression

What is Logistic Regression?

Implementation Details

3. K-Nearest Neighbors (KNN)

What is KNN?

Implementation Details

4. K-Means Clustering

What is K-Means?

Implementation Details

Technical Implementation Details

Common Features Across All Models

Animation and Visualization

User Interface

Performance Optimizations

Usage Instructions

Technical Requirements

Libraries Used

Future Improvements

About

Uh oh!

Releases

Packages

Uh oh!

Languages

KingShivamX/DataFlowAI

Folders and files

Latest commit

History

Repository files navigation

DataFlowAI - Interactive Machine Learning Visualizations

Algorithms Overview

1. Linear Regression

What is Linear Regression?

Implementation Details

2. Logistic Regression

What is Logistic Regression?

Implementation Details

3. K-Nearest Neighbors (KNN)

What is KNN?

Implementation Details

4. K-Means Clustering

What is K-Means?

Implementation Details

Technical Implementation Details

Common Features Across All Models

Animation and Visualization

User Interface

Performance Optimizations

Usage Instructions

Technical Requirements

Libraries Used

Future Improvements

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages