Cnair02/CreditCard-Analysis

CreditCard Fraud Detection-Analysis

Title: Financial and Fraud Analysis Pipeline: ETL, PySpark and Analytics

Objective: Build an end-to-end data pipeline for finance data (credit_card_transactions.csv).

Skills: ETL, SQL schema design, Data Cleaning, Analytics, Visualization.

Tools: Python (Pandas), PySpark, Matplotlib, Seaborn.

Topics covered:

  1. Data extraction
  2. Transformation
  3. Loading data
  4. Exploratory analytics
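
The first three topics can be sketched as a small extract/transform/load pipeline. This is a minimal illustration, not the repository's actual code: it uses pandas rather than PySpark, loads into an in-memory SQLite database (an assumption, since the target database is not named here), and the column names `trans_time`, `category`, `amt` and `is_fraud` as well as the inline sample rows are hypothetical.

```python
import io
import sqlite3

import pandas as pd

# Hypothetical stand-in for credit_card_transactions.csv (toy rows, assumed columns).
RAW_CSV = io.StringIO(
    "trans_time,category,amt,is_fraud\n"
    "2020-01-01 23:10:00,grocery,120.0,1\n"
    "2020-01-01 23:10:00,grocery,120.0,1\n"
    "2020-01-02 14:00:00,travel,,0\n"
    "2020-01-02 02:30:00,gas,45.0,0\n"
)


def extract(source):
    """Extract: read the raw CSV into a DataFrame."""
    return pd.read_csv(source)


def transform(raw):
    """Transform: drop duplicate rows and rows without an amount, derive the hour."""
    clean = raw.drop_duplicates().dropna(subset=["amt"]).copy()
    clean["hour"] = pd.to_datetime(clean["trans_time"]).dt.hour
    return clean.reset_index(drop=True)


def load(clean, conn, table="transactions"):
    """Load: write the cleaned rows to a SQL table, replacing any existing one."""
    clean.to_sql(table, conn, if_exists="replace", index=False)


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
row_count = conn.execute("SELECT COUNT(*) FROM transactions").fetchone()[0]
print(row_count)
```

Keeping the three stages as separate functions makes it straightforward to swap the pandas calls for PySpark equivalents or to point `load` at a different database.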

Challenges: Loading the processed data into separate tables according to the schema definition; defining surrogate keys and using joins to establish the correct relationships between records.
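
One way to approach the surrogate-key step is sketched below with pandas. It is an illustration under assumptions, not the project's schema: the `merchant`, `category` and `amt` columns, the `merchant_id` key name and the sample rows are all hypothetical.

```python
import pandas as pd

# Hypothetical cleaned transactions (toy rows, assumed column names).
transactions = pd.DataFrame(
    {
        "merchant": ["Shop A", "Shop B", "Shop A", "Shop C"],
        "category": ["grocery", "travel", "grocery", "gas"],
        "amt": [12.5, 830.0, 47.1, 60.0],
    }
)

# Dimension table: one row per distinct merchant, numbered with a surrogate key.
merchant_dim = (
    transactions[["merchant", "category"]]
    .drop_duplicates()
    .reset_index(drop=True)
)
merchant_dim["merchant_id"] = merchant_dim.index + 1

# Fact table: join back on the natural attributes, then keep only the key.
# validate= raises if the dimension unexpectedly contains duplicate merchants.
fact = transactions.merge(
    merchant_dim,
    on=["merchant", "category"],
    how="left",
    validate="many_to_one",
)[["merchant_id", "amt"]]

print(merchant_dim)
print(fact)
```

The `validate="many_to_one"` argument is a cheap guard: if the dimension table is not unique on the join columns, the merge fails instead of silently duplicating fact rows.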

Conclusion:

  1. Identified the categories in which fraudulent activity is most common.
  2. Late-night to early-morning hours need more monitoring effort.
  3. Unauthorized transactions do not always involve higher amounts; they are more often seen in transactions of lower amounts (<$500).
  4. Identified the states with the highest activity.
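
The hour-of-day and amount findings above could be computed along the following lines. This is a sketch only: the column names `trans_time`, `amt` and `is_fraud` are assumptions, and the rows are made-up toy data that do not reproduce the project's results.

```python
import pandas as pd

# Toy data for illustration only (assumed column names, invented values).
df = pd.DataFrame(
    {
        "trans_time": pd.to_datetime(
            [
                "2020-01-01 23:10",
                "2020-01-02 02:30",
                "2020-01-02 14:00",
                "2020-01-02 23:45",
                "2020-01-03 03:15",
            ]
        ),
        "amt": [120.0, 45.0, 900.0, 30.0, 760.0],
        "is_fraud": [1, 1, 0, 0, 1],
    }
)

# Fraud rate per hour of day: the mean of a 0/1 flag is the fraction flagged.
df["hour"] = df["trans_time"].dt.hour
fraud_rate_by_hour = df.groupby("hour")["is_fraud"].mean()

# Share of fraudulent transactions with an amount below $500.
fraud = df[df["is_fraud"] == 1]
share_under_500 = (fraud["amt"] < 500).mean()

print(fraud_rate_by_hour)
print(share_under_500)
```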
