This repository contains the complete analysis pipeline for a meta-analytic study examining careless responding prevalence and detection methods in psychological and organizational research. The repository is designed for full reproducibility of the dissertation analysis.
```
careless_meta/
├── data/                          # Data files
│   ├── coded_study_data_all.xlsx  # Main dataset for analysis
│   └── shared_studies_coded.xlsx  # Reliability analysis dataset
├── python/                        # Python preprocessing scripts
│   ├── 00_preprocess.py           # Data cleaning and preparation
│   ├── 01_counts.py               # Generate descriptive counts
│   ├── 02_reliability.py          # Inter-rater reliability analysis
│   ├── 03_proportions.py          # Calculate careless proportions
│   ├── 04_meta_analysis.py        # Initial meta-analysis preparation
│   └── utils.py                   # Utility functions
├── R/                             # R analysis scripts
│   ├── 00_eda.R                   # Exploratory data analysis
│   ├── 01_data_import.R           # Import preprocessed data
│   ├── 02_meta_analysis.R         # Core meta-analysis
│   ├── 03_compare_metas.R         # Compare meta-analytic approaches
│   ├── 04_meta_regression.R       # Meta-regression models
│   ├── 05_multilevel_positions.R  # Positional effects analysis
│   ├── 06_bias_sensitivity.R      # Publication bias and sensitivity analyses
│   ├── 07_results_summary.R       # Summarize results
│   ├── 08_visualization.R         # Create visualizations
│   └── run_all.R                  # Execute all R scripts sequentially
├── output/                        # Analysis outputs
│   ├── data_examination/          # Data quality checks
│   ├── python_results/            # Results from Python preprocessing
│   ├── r_results/                 # Results from R analyses
│   ├── figures/                   # Generated figures
│   └── tables/                    # Generated tables
├── docs/                          # Documentation
├── codebook.json                  # Codebook for variables and coding schemes
└── README.md                      # This file
```
The analysis requires two core data files in the `data/` directory:

- `coded_study_data_all.xlsx`: Main dataset containing:
  - Study characteristics
  - Sample information
  - Careless responding detection methods
  - Prevalence rates
  - Methodological details
- `shared_studies_coded.xlsx`: Reliability analysis dataset containing:
  - Shared studies coded by multiple raters
  - Inter-rater reliability data

These files are processed by the Python preprocessing pipeline to generate the analysis-ready datasets.
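Before launching the pipeline, it can help to confirm both workbooks are in place. A minimal pre-flight check (a hypothetical helper, not part of the repository's scripts) might look like:

```python
from pathlib import Path

# Hypothetical pre-flight check (not part of the repository's scripts):
# confirm both required workbooks exist before running the pipeline.
REQUIRED = [
    "data/coded_study_data_all.xlsx",
    "data/shared_studies_coded.xlsx",
]

def missing_inputs(root: str = ".") -> list[str]:
    """Return the required data files that are absent under `root`."""
    return [p for p in REQUIRED if not (Path(root) / p).is_file()]
```

If `missing_inputs()` returns a non-empty list, `00_preprocess.py` will have nothing to read.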
The analysis employs three complementary approaches to address different research questions:

1. **First-Method Approach (Primary Analysis)**
   - Combines single-method studies with first methods from sequential screening
   - Maximizes sample size while controlling for method ordering effects
   - Primary focus for prevalence estimates and method comparisons
2. **Single-Method Approach (Secondary Analysis)**
   - Restricted to studies using only one detection method
   - Provides method-specific estimates with maximum internal validity
   - Used for sensitivity analysis of method effects
3. **Overall Approach (Tertiary Analysis)**
   - Examines total careless responding rates across all methods
   - Provides general prevalence estimates
   - Used for temporal trend analysis
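The three subsets can be illustrated with a toy table of coded method-level rows. The column names used here (`n_methods`, `method_position`) are assumptions for illustration, not the actual codebook variables:

```python
import pandas as pd

# Toy coded dataset: one row per detection method applied in a study.
# Column names are illustrative assumptions, not the codebook's.
df = pd.DataFrame({
    "study_id":        [1, 1, 2, 3],
    "n_methods":       [2, 2, 1, 1],   # methods used in the study
    "method_position": [1, 2, 1, 1],   # order within sequential screening
})

single_method = df[df["n_methods"] == 1]        # Single-Method (secondary)
first_method  = df[df["method_position"] == 1]  # First-Method (primary)
overall       = df                              # Overall (tertiary)
```

In this sketch the First-Method subset keeps one row per study (the first method applied), while the Single-Method subset keeps only studies 2 and 3.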
1. Run the Python preprocessing pipeline:

   ```
   python python/00_preprocess.py
   python python/01_counts.py
   python python/02_reliability.py
   python python/03_proportions.py
   python python/04_meta_analysis.py
   ```

2. Run the R analysis pipeline:

   ```
   Rscript R/run_all.R
   ```
All outputs will be generated in the output/ directory, organized by analysis type.
- Meta-analytic Results: Pooled estimates, heterogeneity statistics, and subgroup analyses
- Method Comparisons: Forest plots comparing different detection methods
- Temporal Analysis: Trends in careless responding rates over time
- Publication Bias: Funnel plots and sensitivity analyses
- Influence Analysis: Cook's distance plots and leave-one-out analyses
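The pooled prevalence estimates come from the R meta-analysis scripts. Purely as a sketch of the arithmetic behind such a pooled proportion, a minimal DerSimonian-Laird random-effects pool on the logit scale (with an assumed 0.5 continuity correction; not the repository's actual model) could look like:

```python
import math

def pool_logit(events, ns):
    """Random-effects (DerSimonian-Laird) pooled proportion on the logit scale.

    `events` = careless respondents per study, `ns` = sample sizes.
    A 0.5 continuity correction is applied; this is a sketch, not the
    repository's actual metafor-style model.
    """
    yi = [math.log((e + 0.5) / (n - e + 0.5)) for e, n in zip(events, ns)]
    vi = [1 / (e + 0.5) + 1 / (n - e + 0.5) for e, n in zip(events, ns)]
    wi = [1 / v for v in vi]                                 # fixed-effect weights
    fixed = sum(w * y for w, y in zip(wi, yi)) / sum(wi)
    q = sum(w * (y - fixed) ** 2 for w, y in zip(wi, yi))    # Cochran's Q
    c = sum(wi) - sum(w * w for w in wi) / sum(wi)
    tau2 = max(0.0, (q - (len(yi) - 1)) / c)                 # between-study variance
    wr = [1 / (v + tau2) for v in vi]                        # random-effects weights
    mu = sum(w * y for w, y in zip(wr, yi)) / sum(wr)
    return 1 / (1 + math.exp(-mu))                           # back-transform to proportion
```

For two identical studies with 10/100 careless respondents each, heterogeneity is zero and the pooled proportion reduces to the continuity-corrected study proportion, 10.5/101.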
- `codebook.json`: Complete variable definitions and coding schemes
- `docs/`: Additional documentation on methodological decisions
- Script headers: Detailed comments explaining each analysis step
This repository is publicly available for research transparency and reproducibility purposes. Please cite the associated dissertation when using this code or analysis.