A reproduction of the paper "Advanced susceptibility analysis of ground deformation disasters using large language models and machine learning: A Hangzhou City case study"
This project reproduces and analyzes the study from: https://doi.org/10.1371/journal.pone.0310724
@article{Yu_2024,
title={Advanced susceptibility analysis of ground deformation disasters using large language models and machine learning: A Hangzhou City case study},
volume={19},
ISSN={1932-6203},
url={http://dx.doi.org/10.1371/journal.pone.0310724},
DOI={10.1371/journal.pone.0310724},
number={12},
journal={PLOS ONE},
publisher={Public Library of Science (PLoS)},
author={Yu, Bofan and Xing, Huaixue and Ge, Weiya and Zhou, Liling and Yan, Jiaxing and Li, Yun-an},
editor={Gul, Muhammet},
year={2024},
month=dec,
pages={e0310724}
}

The original data of the study is available here: https://figshare.com/articles/dataset/ML_and_LLM/25907179
@misc{Yu_2024_data,
doi = {10.6084/M9.FIGSHARE.25907179.V2},
url = {https://figshare.com/articles/dataset/ML_and_LLM/25907179/2},
author = {YU, Bofan},
keywords = {Geology not elsewhere classified},
title = {ML and LLM},
publisher = {figshare},
year = {2024},
copyright = {Creative Commons Attribution 4.0 International}
}

- Ground collapse susceptibility analysis using machine learning
- Multiple model implementations (original and alternative approaches)
- Feature importance analysis
- LLM-based analysis with multiple models and temperature settings
- Comprehensive evaluation with ROC curves and metrics
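
To make the ROC-based evaluation listed above concrete, here is a minimal, dependency-free sketch of how ROC points and the area under the curve can be computed. It is an illustration only: the function names `roc_points` and `auc` are hypothetical, and the project itself most likely relies on scikit-learn's metrics rather than code like this.

```python
from typing import List, Sequence, Tuple


def roc_points(y_true: Sequence[int], scores: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (FPR, TPR) points obtained by sweeping the decision threshold."""
    pos = sum(1 for y in y_true if y == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("ROC needs at least one positive and one negative sample")

    # Visit samples from the highest score to the lowest.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(order):
        # Samples with identical scores share one threshold, so count them together.
        threshold = scores[order[i]]
        while i < len(order) and scores[order[i]] == threshold:
            if y_true[order[i]] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under a piecewise-linear curve, using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy labels and scores, not data from the study.
toy_auc = auc(roc_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
print(f"AUC = {toy_auc:.2f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of collapse over non-collapse samples.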
- Python 3.8+
- Required libraries:
  - scikit-learn
  - pandas
  - matplotlib
  - joblib
  - numpy
  - OpenAI API (for LLM analysis)
The full list of libraries is in requirements.txt and pyproject.toml.
- Clone the repository:

      git clone https://github.com/RomanKyrychenko/groud_collapse.git

- Navigate to the project directory:

      cd groud_collapse

- Install dependencies:

      pip install -r requirements.txt

- Add a .env file with your own OPENAI_API_KEY.
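
For orientation, the sketch below shows one way a script could pick up OPENAI_API_KEY from a .env file using only the standard library. It is an assumption about the mechanism, not the project's actual loader (which may well use a package such as python-dotenv); the helper names `read_env_file` and `get_openai_api_key` are hypothetical.

```python
import os
from pathlib import Path
from typing import Dict


def read_env_file(path: str = ".env") -> Dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values: Dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_openai_api_key(path: str = ".env") -> str:
    """Prefer the process environment, then fall back to the .env file."""
    key = os.environ.get("OPENAI_API_KEY") or read_env_file(path).get("OPENAI_API_KEY", "")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; add it to .env or the environment")
    return key
```

The .env file itself then only needs a single line of the form `OPENAI_API_KEY=<your key>`, and it should stay out of version control.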
To build and run the Docker container for this project, follow these steps:
- Build the Docker image:

      docker build -t ground_collapse_analysis .

- Run the Docker container:

      docker run --rm -v $(pwd)/output:/app/output ground_collapse_analysis

This will execute the analysis and save the results in the output directory.
Run the main analysis script:

    python main.py --input_file "input/ground collapse.xlsx" --input_prompt_pdf "input/prompt.pdf"

Additional options:

    python main.py --help

- --input_file: Path to input data file
- --input_prompt_pdf: PDF file with LLM prompt
- --llm_repetitions: Number of LLM experiment repetitions
- --llm_models: List of LLM models to use (e.g., gpt-3.5-turbo, gpt-4o)
- --llm_temperatures: List of temperature values for LLM experiments
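
The options above could be wired up roughly as in the following sketch, which also shows how models, temperatures, and repetitions expand into a grid of LLM runs. The flag names come from this README; the types, defaults, space-separated list syntax, and the helpers `build_parser` and `experiment_grid` are assumptions, so check `python main.py --help` for the actual behavior.

```python
import argparse
from itertools import product
from typing import List, Sequence, Tuple


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the flags documented in this README."""
    parser = argparse.ArgumentParser(description="Ground collapse susceptibility analysis")
    parser.add_argument("--input_file", help="Path to input data file")
    parser.add_argument("--input_prompt_pdf", help="PDF file with LLM prompt")
    # Defaults and list syntax below are illustrative assumptions.
    parser.add_argument("--llm_repetitions", type=int, default=1,
                        help="Number of LLM experiment repetitions")
    parser.add_argument("--llm_models", nargs="+", default=["gpt-4o"],
                        help="List of LLM models to use")
    parser.add_argument("--llm_temperatures", nargs="+", type=float, default=[0.0],
                        help="List of temperature values for LLM experiments")
    return parser


def experiment_grid(models: Sequence[str], temperatures: Sequence[float],
                    repetitions: int) -> List[Tuple[str, float, int]]:
    """One (model, temperature, repetition index) tuple per LLM run."""
    return [(m, t, r) for m, t in product(models, temperatures) for r in range(repetitions)]


if __name__ == "__main__":
    args = build_parser().parse_args()
    for model, temperature, rep in experiment_grid(
            args.llm_models, args.llm_temperatures, args.llm_repetitions):
        print(f"run {rep}: model={model}, temperature={temperature}")
```

Under these assumptions, two models, two temperatures, and three repetitions yield 12 runs, which is worth keeping in mind when estimating API cost.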
- src/: Source code for models and analysis
  - original_model.py: Implementation of the original stacking model
  - alternative_model.py: Alternative implementation with hyperparameter tuning
  - fake_model.py: Baseline memorizer model
  - original_test.py: Evaluation of the original model
  - alternative_test.py: Evaluation of the alternative model
  - data_preproc.py & alternative_data_preproc.py: Data preprocessing modules
  - llm.py & llm_analysis.py: LLM-based analysis modules
- input/: Input data files, including test datasets and prompt files for LLM analysis
- output/: Directory containing generated model files, plots, and results
  - original_predicted_results.csv: Predictions from the original model
  - alternative_predicted_results.csv: Predictions from the alternative model
  - fake_model_results.csv: Predictions from the baseline fake model
  - roc_curve_comparison.png: Comparison of ROC curves of the different models
  - llm_agg_results.tex: LaTeX file with aggregated results from LLM experiments
  - llm_ground_collapse_plot.png: Plot of the ground collapse analysis using LLMs
The project implements several models:
- StackingModel: The original stacking classifier from the paper
- AlternativeStackingModel: Enhanced version with RandomForest and hyperparameter tuning
- FakeModel: Baseline model for comparison
- LLM Analysis: Evaluation of various LLMs on the task
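
As a rough illustration of the stacking idea behind StackingModel and AlternativeStackingModel, the sketch below trains a scikit-learn StackingClassifier on synthetic data and reports its ROC AUC. The base learners, meta-learner, and data here are assumptions chosen for brevity; they are not the configuration used in the paper or in src/.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the ground collapse dataset (assumption).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Base learners produce out-of-fold predictions (cv=5); the final
# estimator is then trained on those predictions.
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("knn", KNeighborsClassifier()),
    ],
    final_estimator=LogisticRegression(),
    cv=5,
)
stack.fit(X_train, y_train)

# Probability of the positive class, scored on the held-out split.
test_auc = roc_auc_score(y_test, stack.predict_proba(X_test)[:, 1])
print(f"Held-out ROC AUC: {test_auc:.3f}")
```

Evaluating on a held-out split matters here because a model that simply memorizes the training set, which is what a baseline like FakeModel is presumably meant to expose, can look strong when scored on data it has already seen.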