This repository contains Python scripts for payroll data processing and reconciliation of financial records. The tools extract, transform, and validate employee payroll data, and compare datasets to identify discrepancies between them.
✔ Automated Payroll Data Extraction – Extract employee details, payroll taxes, and paid taxes from Excel files.
✔ Data Integration – Combine extracted data into a single structured dataset, ensuring accuracy.
✔ File Reconciliation – Compare CSV files with matching identifiers to detect discrepancies.
✔ Security & Compliance – Supports data integrity checks and GDPR/HIPAA compliance validation.
Ensure you have the following installed:
- Python 3.x
- openpyxl (for Excel data handling)
- locale (part of the Python standard library, no installation needed; used for localized financial processing)
- pandas (for CSV data comparison)
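To confirm that the modules listed above are importable before running the scripts, a small check along these lines may help. This is a minimal sketch; the helper name `missing_modules` is hypothetical and not part of this repository.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found (hypothetical helper)."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Module names taken from the prerequisites list above.
    required = ["openpyxl", "pandas", "locale"]
    missing = missing_modules(required)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```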
Install dependencies using:
    pip install openpyxl pandas
- Clone the repository.
- Place your Excel payroll file inside the `data` directory.
- Update the script's file path (`v_path`) and file name (`v_file`).
- Run the payroll processing script:
    python payroll_processing.py
This will extract employee details, payroll taxes, and paid taxes, then compile them into a structured CSV file.
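The extraction step described above might look roughly like the sketch below. It assumes that each section in the worksheet starts with a heading in the first column and ends at the first empty row; that layout is an assumption, as is every function name here (`rows_under_heading`, `load_rows`). The actual logic in `payroll_processing.py` may differ.

```python
import os


def rows_under_heading(rows, heading):
    """Collect the rows that follow `heading`, stopping at the first empty row.

    Assumes the heading sits in the first cell of its row (hypothetical layout).
    """
    collected = []
    found = False
    for row in rows:
        if not found:
            first = row[0] if row else None
            if isinstance(first, str) and first.strip() == heading:
                found = True
            continue
        # A row with only empty cells marks the end of the section.
        if all(cell is None or cell == "" for cell in row):
            break
        collected.append(list(row))
    return collected


def load_rows(dirpath, filename):
    """Read the active worksheet as tuples of cell values (requires openpyxl)."""
    from openpyxl import load_workbook  # imported lazily; third-party dependency

    workbook = load_workbook(os.path.join(dirpath, filename), data_only=True)
    return list(workbook.active.iter_rows(values_only=True))
```

With a real workbook, the two pieces could be combined as `rows_under_heading(load_rows(v_path, v_file), heading)`, where the heading text depends on your file.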
- Place CSV files into the target directory following the naming convention:
  - plexus<number>.csv
  - lumber<number>.csv
- Run the reconciliation script:
    python compare_files.py
The script will compare matching files and highlight differences between datasets.
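To illustrate how files could be matched by the number in their names and compared row by row, here is a minimal sketch. It uses the standard library `csv` module rather than pandas so that it runs without extra dependencies, and it compares rows by position. Both choices are assumptions, and none of these function names come from `compare_files.py`.

```python
import csv
import re
from itertools import zip_longest
from pathlib import Path

# Matches the naming convention plexus<number>.csv / lumber<number>.csv.
PATTERN = re.compile(r"^(plexus|lumber)(\d+)\.csv$")


def pair_files(dir_path):
    """Map each number to its plexus and lumber paths, keeping complete pairs only."""
    pairs = {}
    for path in sorted(Path(dir_path).iterdir()):
        match = PATTERN.match(path.name)
        if match:
            pairs.setdefault(match.group(2), {})[match.group(1)] = path
    return {number: p for number, p in pairs.items() if len(p) == 2}


def read_rows(path):
    """Read a CSV file into a list of rows (each row is a list of strings)."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.reader(handle))


def diff_rows(left, right):
    """Return (row_index, left_row, right_row) for every position that differs."""
    return [
        (index, a, b)
        for index, (a, b) in enumerate(zip_longest(left, right))
        if a != b
    ]


def compare_directory(dir_path):
    """Compare every complete plexus/lumber pair found in dir_path."""
    return {
        number: diff_rows(read_rows(p["plexus"]), read_rows(p["lumber"]))
        for number, p in pair_files(dir_path).items()
    }
```

Row index 0 refers to the header line. A file that is longer than its counterpart shows up as rows paired with `None`.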
- extract_employee_info(dirpath, filename, heading) – Extracts employee details.
- payroll_taxes(dirpath, filename, heading) – Retrieves payroll tax data.
- extract_taxes_paid(dirpath, filename, heading) – Extracts tax payment records.
- combine_list(x, y, z) – Merges multiple lists into a single dataset.
- combine_data_lists(x, y, z) – Consolidates structured data into a unified format.
- write_to_file_in_directory(dirpath, filename, data) – Saves processed data as a CSV file.
- compare_files(dir_path) – Identifies mismatches in CSV datasets and prints discrepancies.
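The combine-and-write steps listed above could be sketched as follows. This assumes the three inputs are parallel lists with one row per employee in the same order, which the source does not confirm. The names `combine_rows_sketch` and `write_rows_sketch` are illustrative stand-ins, not the repository's `combine_list` or `write_to_file_in_directory`.

```python
import csv
import os


def combine_rows_sketch(x, y, z):
    """Join three parallel lists of rows into one row per employee (assumed layout)."""
    if not (len(x) == len(y) == len(z)):
        raise ValueError("input lists must have the same length")
    return [list(a) + list(b) + list(c) for a, b, c in zip(x, y, z)]


def write_rows_sketch(dirpath, filename, data):
    """Write rows to dirpath/filename as CSV and return the path written."""
    os.makedirs(dirpath, exist_ok=True)
    target = os.path.join(dirpath, filename)
    # newline="" lets the csv module control line endings.
    with open(target, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(data)
    return target
```

Raising on mismatched lengths is a deliberate choice in this sketch: silently truncating with `zip` could hide a missing employee record.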
This project is licensed under the MIT License.
For inquiries, reach out via email:
📧 [email protected]