Personal Finance Tracker - TrackYourTransactions
A web-based spending tracker that helps users understand, categorise, and visualise their bank transaction history. Simply upload a CSV file exported from your bank, and the app automatically categorises your transactions and generates meaningful insights.
- Backend (and frontend) hosted with AWS EC2 and Nginx.
- React + TailwindCSS.
- CSV parsing with react-papaparse.
- Data visualisation with Recharts and AG-Grid.
- Data persistence implemented with DexieDB.
- Node.js + Express.
- MongoDB for login information and metadata storage.
- Classification model(s) trained, evaluated, and implemented using TensorFlow + Pandas + NumPy.
- Text embedding handled using OpenAI's text-embedding-3-small.
- Endpoint containerised using Docker.
- Endpoint hosted using AWS Lambda + AWS API Gateway.
- MLP
- LogReg
- OpenAI (ChatGPT)
- Random Forest
- SVM
The final model chosen was the Multi-Layer Perceptron (MLP), trained on a 974-row training dataset (including duplicates).
It reaches ~75% accuracy (correct predictions * 100 / total predictions) on a 411-row test dataset.
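import tensorflow as tf

# Assumed values: text-embedding-3-small returns 1536-dimensional vectors
# by default, and the app uses 10 spending categories.
input_dim = 1536
num_classes = 10

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(input_dim,)),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dropout(0.2),  # regularisation against overfitting
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(num_classes, activation='softmax')  # class probabilities
])

model.compile(
    optimizer='adam',
    # Labels are integer class indices, hence the sparse loss.
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)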
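As a rough illustration of how an accuracy score like the one quoted above is usually computed (correct predictions as a percentage of all predictions), here is a minimal sketch. The function name `accuracy_percent` and the sample labels are hypothetical and are not taken from the project.

```python
def accuracy_percent(y_true, y_pred):
    """Return the percentage of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty dataset")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct * 100 / len(y_true)


# Hypothetical labels for four transactions; three of four are correct.
true_labels = ["Groceries", "Transport", "Eating Out", "Income"]
predicted = ["Groceries", "Transport", "Shopping", "Income"]
print(accuracy_percent(true_labels, predicted))  # 75.0
```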
Number of unique values in each category:
Category
Housing & Bills 6
Finance & Fees 8
Entertainment 11
Income 16
Shopping 21
Other / Misc 39
Health & Fitness 50
Transport 56
Groceries 69
Eating Out 93
Number of unique values in each category:
Category
Income 1
Transfer 1
Housing & Bills 3
Finance & Fees 3
Groceries 13
Other / Misc 14
Entertainment 16
Health & Fitness 23
Shopping 25
Transport 28
Eating Out 41
Uses OpenAI's text-embedding-3-small model to generate high-dimensional embeddings from the user's transaction data, then classifies these embeddings using a custom, pre-trained MLP.
The model weights (global and personalised) are stored in MongoDB, and are accessed by the AI Endpoint.
Every transaction is assigned to one of 10 spending categories:
Groceries, Eating Out, Transport, Housing & Bills, Entertainment, Shopping, Health & Fitness, Finance & Fees, Income, Other / Misc
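To make the final classification step concrete, the sketch below turns one vector of class scores into a category label and a confidence value. It is an illustration only: the index order of `CATEGORIES` is an assumption, the names `softmax` and `pick_category` are hypothetical, and the `softmax` helper merely stands in for the model's own softmax output layer.

```python
import math

# The ten categories listed above; the index order is an assumption.
CATEGORIES = [
    "Groceries", "Eating Out", "Transport", "Housing & Bills",
    "Entertainment", "Shopping", "Health & Fitness", "Finance & Fees",
    "Income", "Other / Misc",
]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pick_category(probabilities):
    """Return (category, confidence) for one transaction."""
    if len(probabilities) != len(CATEGORIES):
        raise ValueError("expected one probability per category")
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return CATEGORIES[best], probabilities[best]


# Hypothetical raw scores for a single transaction.
scores = [0.1, 2.5, 0.3, 0.0, 0.2, 0.4, 0.1, 0.0, 0.0, 0.6]
category, confidence = pick_category(softmax(scores))
print(category, round(confidence, 2))
```

The confidence value returned here is the kind of number a UI could use to flag transactions for manual review.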
- Personalised model: Each user has their own personalised model, which can:
- Learn more quickly from the user's historical spending patterns.
- Better adapt to individual merchant-specific habits (e.g., "Local Market" = Groceries for one user but Eating Out for another).
- Global model: The global model is a highly generalised model, receiving weight updates from all users. This (ideally) makes it a jack of all trades, master of none.
- Users can finetune the selected model based on re-categorised transactions. Finetuning the global model has a reduced influence compared to finetuning the personalised model.
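One possible way to give finetuning a reduced influence on the global model is to blend the finetuned weights into the current ones with a scaling factor. The sketch below shows that idea only; it is not confirmed to be the project's method, and `blend_weights`, `PERSONAL_INFLUENCE`, and `GLOBAL_INFLUENCE` are hypothetical names and values.

```python
def blend_weights(current, finetuned, influence):
    """Move `current` towards `finetuned` by a fraction `influence` in [0, 1]."""
    if not 0.0 <= influence <= 1.0:
        raise ValueError("influence must be between 0 and 1")
    if len(current) != len(finetuned):
        raise ValueError("weight vectors must have the same length")
    return [c + influence * (f - c) for c, f in zip(current, finetuned)]


# Hypothetical influence factors; the real values are not documented here.
PERSONAL_INFLUENCE = 1.0
GLOBAL_INFLUENCE = 0.1

current = [0.0, 1.0, -2.0]
finetuned = [1.0, 1.0, 0.0]
print(blend_weights(current, finetuned, PERSONAL_INFLUENCE))  # [1.0, 1.0, 0.0]
print(blend_weights(current, finetuned, GLOBAL_INFLUENCE))
```

With an influence of 1.0 the finetuned weights replace the current ones outright, while a small factor nudges the shared model only slightly per user update.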
- Keyword-based merchant classification using regex.
- Fallback rules for ambiguous transactions.
- Clean normalisation of merchant names ("TESCO 1234" → "Tesco").
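The rule-based steps above can be sketched with Python's `re` module. The patterns in `RULES`, the fallback category, and the function names are hypothetical examples, since the project's real rules are not shown here.

```python
import re

# Hypothetical keyword rules; the project's real patterns are not shown here.
RULES = [
    (re.compile(r"\b(tesco|aldi|lidl)\b", re.IGNORECASE), "Groceries"),
    (re.compile(r"\b(uber|trainline)\b", re.IGNORECASE), "Transport"),
]
FALLBACK_CATEGORY = "Other / Misc"  # assumed fallback for ambiguous rows


def normalise_merchant(raw):
    """Strip digits and extra whitespace, then title-case the name."""
    without_digits = re.sub(r"\d+", " ", raw)
    collapsed = re.sub(r"\s+", " ", without_digits).strip()
    return collapsed.title()


def classify_by_rules(description):
    """Return the first matching rule's category, else the fallback."""
    for pattern, category in RULES:
        if pattern.search(description):
            return category
    return FALLBACK_CATEGORY


print(normalise_merchant("TESCO 1234"))      # Tesco
print(classify_by_rules("TESCO 1234"))       # Groceries
print(classify_by_rules("UNKNOWN SHOP 99"))  # Other / Misc
```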
A highly customisable table that stores uploaded transaction data.
- Fully custom undo/redo system - supports row editing, row deletion, row addition.
- Model finetuning on user re-categorised rows.
- Filtering by uploaded CSV.
- CSV Export.
- Transaction re-categorisation.
- Filtering by month.
- Row deletion.
- Row editing.
- Sorting by column.
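An undo/redo system covering row edits, deletions, and additions is often built on two stacks. The sketch below uses full snapshots for simplicity and is written in Python for illustration; the app's actual implementation is not shown in this document, and the class and method names are hypothetical.

```python
class UndoRedoHistory:
    """Two-stack undo/redo over a list of row dictionaries."""

    def __init__(self, rows):
        self.rows = [dict(r) for r in rows]
        self._undo = []
        self._redo = []

    def _snapshot(self):
        return [dict(r) for r in self.rows]

    def _commit(self):
        # Save the state before a change; a new change clears redo history.
        self._undo.append(self._snapshot())
        self._redo.clear()

    def add_row(self, row):
        self._commit()
        self.rows.append(dict(row))

    def edit_row(self, index, **changes):
        row = self.rows[index]  # raises IndexError before any state is saved
        self._commit()
        row.update(changes)

    def delete_row(self, index):
        self.rows[index]  # validate the index before saving state
        self._commit()
        del self.rows[index]

    def undo(self):
        if self._undo:
            self._redo.append(self._snapshot())
            self.rows = self._undo.pop()

    def redo(self):
        if self._redo:
            self._undo.append(self._snapshot())
            self.rows = self._redo.pop()


history = UndoRedoHistory([{"description": "Tesco", "category": "Shopping"}])
history.edit_row(0, category="Groceries")
history.undo()
print(history.rows[0]["category"])  # Shopping
history.redo()
print(history.rows[0]["category"])  # Groceries
```

Snapshots keep the logic short but copy every row on each change; for large tables, storing only the inverse of each operation would use less memory.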
Implements a robust, multi-stage upload process that lets the user map bank CSV columns to the values expected by the system. Mappings can be saved to streamline future uploads. Errors are handled gracefully, and the user is warned if a column appears to have been incorrectly mapped.
- Upload CSV: The bank CSV is parsed using react-papaparse.
- Column mapping: Provides the user with an interface to map their bank's CSV columns to the system's expected columns.
- Transaction Date => Supporting EU or US format.
- Transaction Description.
- Amount => Supporting either a single amount column with both negative and positive values, or two amount columns (for debit and credit / income and expense).
- Duplicate Detection: Detects and warns the user of duplicate transactions between the uploaded CSV and stored transactions.
- Confidence scoring: Allows users to re-categorise low-confidence transactions.
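The mapping and duplicate-detection steps above can be sketched as follows. This is an illustration under stated assumptions: the `mapping` dictionary shape, the slash-separated date formats, the bank column names, and the choice of (date, description, amount) as the duplicate key are all hypothetical.

```python
from datetime import datetime
from decimal import Decimal

DATE_FORMATS = {"EU": "%d/%m/%Y", "US": "%m/%d/%Y"}  # assumed separators


def map_row(raw, mapping):
    """Convert one raw CSV row (a dict) into the system's expected fields."""
    date = datetime.strptime(
        raw[mapping["date"]].strip(), DATE_FORMATS[mapping["date_format"]]
    ).date()
    if "amount" in mapping:
        # Single signed column: negative = expense, positive = income.
        amount = Decimal(raw[mapping["amount"]])
    else:
        # Two columns: credit adds money, debit removes it.
        credit = Decimal(raw[mapping["credit"]] or "0")
        debit = Decimal(raw[mapping["debit"]] or "0")
        amount = credit - abs(debit)
    return {
        "date": date,
        "description": raw[mapping["description"]].strip(),
        "amount": amount,
    }


def find_duplicates(new_rows, stored_rows):
    """Return new rows whose (date, description, amount) is already stored."""
    seen = {(r["date"], r["description"], r["amount"]) for r in stored_rows}
    return [
        r for r in new_rows
        if (r["date"], r["description"], r["amount"]) in seen
    ]


# Hypothetical saved mapping for a bank with separate debit/credit columns.
mapping = {"date": "Date", "date_format": "EU",
           "description": "Details", "debit": "Paid out", "credit": "Paid in"}
raw = {"Date": "03/04/2024", "Details": "TESCO 1234",
       "Paid out": "12.50", "Paid in": ""}
row = map_row(raw, mapping)
print(row["date"], row["amount"])  # 2024-04-03 -12.50
print(find_duplicates([row], [row]) == [row])  # True
```

`Decimal` is used instead of `float` so that currency amounts compare exactly when checking for duplicates.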
- Secure and easy account deletion that deletes all associated account data.
- Secure custom login flow adhering to OWASP guidelines, utilising JWTs + Refresh Tokens.
- Auth0 integration to allow login with accredited vendors (Google, GitHub, etc.).
- CSV upload process streamlined via intuitive progress bar.
- Transaction data is only stored on the client side. (If auto-categorisation is enabled, transaction data is sent to the AI Endpoint upon upload, but it is never stored on our servers.)
- Automatic removal of empty CSV rows.
- Standardising of date formats using date-fns.
- Detecting and correcting negative/positive mismatches in amount column(s).
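Two of the cleaning steps above, dropping empty rows and correcting sign mismatches, might look like the sketch below. It assumes that "mismatch" means an expense recorded as a positive number (or income as a negative one); the `kind` label and both function names are hypothetical.

```python
def drop_empty_rows(rows):
    """Remove rows in which every cell is missing or only whitespace."""
    return [
        row for row in rows
        if any((cell or "").strip() for cell in row.values())
    ]


def fix_sign(amount, kind):
    """Force expenses to be negative and income to be positive.

    `kind` is a hypothetical label ("expense" or "income") taken from the
    column the value was read from.
    """
    if kind == "expense":
        return -abs(amount)
    if kind == "income":
        return abs(amount)
    raise ValueError(f"unknown kind: {kind!r}")


rows = [
    {"Date": "03/04/2024", "Details": "TESCO 1234", "Amount": "12.50"},
    {"Date": "", "Details": "  ", "Amount": None},
]
print(len(drop_empty_rows(rows)))   # 1
print(fix_sign(12.5, "expense"))    # -12.5
print(fix_sign(-2000.0, "income"))  # 2000.0
```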
- Pie Chart: Category distribution.
- Line Graph: Monthly totals.
- Bar Graph: Income vs. expenses summary.
Shows how your spending evolves month by month. Displays:
- Percentage increase/decrease between months for each major category.
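The month-over-month figure is a standard percentage change, (current - previous) * 100 / previous. A minimal sketch follows; the transaction shape (ISO date strings, signed amounts) and the function names are assumptions made for the example.

```python
from collections import defaultdict


def monthly_totals(transactions):
    """Sum spending per (month, category); month is a 'YYYY-MM' string."""
    totals = defaultdict(float)
    for t in transactions:
        totals[(t["date"][:7], t["category"])] += abs(t["amount"])
    return dict(totals)


def percent_change(previous, current):
    """Percentage change from previous to current, or None if undefined."""
    if previous == 0:
        return None  # no baseline to compare against
    return (current - previous) * 100 / previous


# Hypothetical transactions: 200 spent in March, 250 in April.
transactions = [
    {"date": "2024-03-05", "category": "Groceries", "amount": -200.0},
    {"date": "2024-04-02", "category": "Groceries", "amount": -150.0},
    {"date": "2024-04-20", "category": "Groceries", "amount": -100.0},
]
totals = monthly_totals(transactions)
change = percent_change(totals[("2024-03", "Groceries")],
                        totals[("2024-04", "Groceries")])
print(change)  # 25.0
```

Returning `None` when the previous month has no spending avoids a division by zero and lets the UI show a placeholder instead of a percentage.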
Displays:
- Total spend.
- Total income.
- Total net.
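For reference, the three summary figures reduce to simple sums over signed amounts. The sketch assumes expenses are stored as negative numbers and income as positive ones; `summarise` is a hypothetical helper.

```python
def summarise(amounts):
    """Return total spend, total income, and net for signed amounts."""
    spend = sum(-a for a in amounts if a < 0)
    income = sum(a for a in amounts if a > 0)
    return {"spend": spend, "income": income, "net": income - spend}


print(summarise([-200, -50, 1000]))
# {'spend': 250, 'income': 1000, 'net': 750}
```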
- Direct integration with banking apps, bypassing the need to download a CSV.
- Better DB (currently using free DB tier for heavy model storage).
- Improved ML classifier model accuracy and generalisation. (Currently trained on a limited and biased dataset).
- Fully/more customisable dashboard.
- Budgeting system (allowing users to set budgets per category).
- Custom category support.
- Automatic recurring payment detection from analysing transaction data.
- Payment integration (e.g. using Stripe).