Empirical Research Template

易翊翼
20241011

___________________________________________________________________________________________________________________________

Project Description

This is an empirical research project template designed to provide a standardized project structure for:

Version control
Code synchronization between local machine and HPC
Decoupling of code and data
Separation of build and analysis phases

The template structure follows "Code and Data for the Social Sciences: A Practitioner's Guide" by Matthew Gentzkow and Jesse M. Shapiro. It also includes config.do and config.py files that set up paths and packages for Stata and Python, respectively.

Directory Structure

├── analysis            # Analysis phase
│   ├── code            # Analysis code
│   └── data
│       ├── input←←←←←| # Panel data for descriptive stats and regressions (4)
│       │    ↓        |
│       └── output    | # Generated tables and figures (5)
├── build             | # Data construction phase
│   ├── code          | # Data processing code
│   └── data          |
│       ├── raw       | # Raw data (1)
│       │    ↓        |
│       ├── temp      | # Temporary files, merge keys, etc. (2)
│       │    ↓        |
│       └── processed→| # Processed databases (3)
├── README.md           # Project documentation
├── README.py           # Directory tree generator
└── resource            # Related papers and materials
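
The contents of README.py are not shown here, so the following is only a minimal sketch of how a directory tree generator of this kind could work. The function name `render_tree` and the choice to skip hidden entries are assumptions made for illustration, not the script's actual behavior.

```python
from pathlib import Path


def render_tree(root: Path, prefix: str = "") -> list[str]:
    """Return the lines of a box-drawing tree for everything under `root`."""
    # Assumption: hidden entries such as .git and .gitkeep are left out.
    entries = sorted(
        (p for p in root.iterdir() if not p.name.startswith(".")),
        key=lambda p: p.name,
    )
    lines = []
    for i, entry in enumerate(entries):
        last = i == len(entries) - 1
        lines.append(prefix + ("└── " if last else "├── ") + entry.name)
        if entry.is_dir():
            # Children of the last entry get blank padding;
            # children of earlier entries keep the vertical rule.
            lines.extend(render_tree(entry, prefix + ("    " if last else "│   ")))
    return lines
```

Calling `print("\n".join(render_tree(Path("."))))` from the project root would print the tree without the hand-written comments and data-flow arrows shown above.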

Usage Guide

1. Data Management

build/data/raw: Raw data as received; treat as read-only
build/data/temp: Intermediate files such as merge keys
build/data/processed: Processed databases built from raw and temp
analysis/data/input: Analysis-ready data taken from build/data/processed
analysis/data/output: Analysis results (tables and figures)
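
The two hand-offs in this layout (raw data stays read-only, processed data feeds the analysis input) can be sketched in Python as follows. The helper names `protect_raw` and `promote_to_analysis` are hypothetical and not part of the template; the sketch assumes the project root is passed in explicitly.

```python
import shutil
import stat
from pathlib import Path


def protect_raw(project_root: Path) -> int:
    """Clear the write bits on every file under build/data/raw; return the count."""
    count = 0
    for path in (project_root / "build" / "data" / "raw").rglob("*"):
        if path.is_file():
            mode = path.stat().st_mode
            path.chmod(mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))
            count += 1
    return count


def promote_to_analysis(project_root: Path, filename: str) -> Path:
    """Copy one file from build/data/processed to analysis/data/input."""
    source = project_root / "build" / "data" / "processed" / filename
    target_dir = project_root / "analysis" / "data" / "input"
    target_dir.mkdir(parents=True, exist_ok=True)
    # copy2 also preserves timestamps, which helps when checking data vintage.
    return Path(shutil.copy2(source, target_dir / filename))
```

Copying rather than moving keeps build/data/processed as the single source for rebuilding the analysis inputs.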

2. Code Organization

build/code: Data cleaning and construction code
analysis/code: Analysis code
Each code file should open with a short header describing its purpose, inputs, and outputs

3. Version Control

List large data files in .gitignore so they are never committed
Place a .gitkeep file in each empty directory so Git still tracks it
Commit code regularly
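
Git does not track empty directories, which is why placeholder files are needed to keep the data folders in the repository. A minimal sketch of a helper that adds them is shown below; the function name `add_gitkeep` is hypothetical and the template may handle this differently.

```python
from pathlib import Path


def add_gitkeep(root: Path) -> list[Path]:
    """Create a .gitkeep file in every empty directory under `root`."""
    created = []
    # Collect the directories first so new files do not affect the walk.
    for directory in sorted(p for p in root.rglob("*") if p.is_dir()):
        if ".git" in directory.relative_to(root).parts:
            continue  # never touch Git's own metadata
        if not any(directory.iterdir()):
            keep = directory / ".gitkeep"
            keep.touch()
            created.append(keep)
    return created
```

For the placeholder to survive, .gitignore has to exclude the data files without excluding .gitkeep itself.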

4. Configuration

Use config.py for Python paths and packages
Use config.do for Stata paths and packages
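
The actual contents of config.py are not reproduced here. As a rough sketch of what a path-and-package setup along these lines might look like, the following assumes the directory layout above; the names `project_paths` and `missing_packages` and the dictionary keys are illustrative only.

```python
import importlib.util
from pathlib import Path


def project_paths(root: Path) -> dict[str, Path]:
    """Map short names to the template's code and data directories."""
    build, analysis = root / "build", root / "analysis"
    return {
        "build_code": build / "code",
        "raw": build / "data" / "raw",
        "temp": build / "data" / "temp",
        "processed": build / "data" / "processed",
        "analysis_code": analysis / "code",
        "input": analysis / "data" / "input",
        "output": analysis / "data" / "output",
    }


def missing_packages(names: list[str]) -> list[str]:
    """Return the top-level packages from `names` that cannot be imported."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

A script could then resolve locations through the returned dictionary, for example `project_paths(Path.cwd())["raw"]`, instead of hard-coding machine-specific paths. That keeps the same code usable on a local machine and on the HPC.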

Best Practices

Data Security
- Don't commit sensitive data
- Use .gitignore for large files
Code Standards
- Keep code clean and documented
- Use meaningful names
- Add appropriate comments
Performance
- Use chunking for large datasets
- Choose appropriate data structures
Collaboration
- Regular code sync
- Keep documentation updated
- Follow project standards
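
To illustrate the chunking advice under Performance, here is a minimal standard-library sketch that streams a CSV file in fixed-size batches rather than loading it whole. The function names and the default chunk size are assumptions. With pandas, `read_csv` with its `chunksize` argument gives a similar pattern.

```python
import csv
from itertools import islice
from pathlib import Path
from typing import Iterator


def read_in_chunks(path: Path, chunk_size: int) -> Iterator[list[dict[str, str]]]:
    """Yield lists of at most `chunk_size` rows from a CSV file with a header."""
    with path.open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        while True:
            chunk = list(islice(reader, chunk_size))
            if not chunk:
                return
            yield chunk


def column_sum(path: Path, column: str, chunk_size: int = 10_000) -> float:
    """Sum a numeric column while holding only one chunk in memory at a time."""
    total = 0.0
    for chunk in read_in_chunks(path, chunk_size):
        total += sum(float(row[column]) for row in chunk)
    return total
```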

Maintenance

Regular dependency updates
Documentation maintenance
Temporary file cleanup
Data backup
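
Temporary file cleanup can be scripted. The sketch below assumes that everything under build/data/temp except the .gitkeep placeholder is safe to delete; the function name `clean_temp` and the dry-run default are illustrative choices, not part of the template.

```python
from pathlib import Path


def clean_temp(project_root: Path, dry_run: bool = True) -> list[Path]:
    """List (and optionally delete) files under build/data/temp, keeping .gitkeep."""
    temp_dir = project_root / "build" / "data" / "temp"
    targets = sorted(
        p for p in temp_dir.rglob("*") if p.is_file() and p.name != ".gitkeep"
    )
    if not dry_run:
        for path in targets:
            path.unlink()
    return targets
```

Defaulting to a dry run means the first call only reports what would be removed. Pass `dry_run=False` once the list looks right.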

Yiyiyi89/0_project_template
