Hierarchical Knowledge Retrieval for Disaster Prevention Education
An advanced RAG (Retrieval-Augmented Generation) system that uses the RAPTOR algorithm to hierarchically organize and retrieve lessons from the 2011 Great East Japan Earthquake and Tsunami for educational purposes.
This system was developed to preserve and pass on the lessons learned from the Great East Japan Earthquake and Tsunami of March 11, 2011, to future generations. Using the RAPTOR (Recursive Abstractive Processing for Tree-Organized Retrieval) algorithm, it hierarchically organizes vast amounts of disaster lesson data, enabling context-preserving search and summarization.
- Hierarchical Knowledge Structure: Organizes lessons using the RAPTOR algorithm
- Context-Aware Search: Semantic search that considers hierarchical relationships
- Automatic Summarization: LLM-powered, context-preserving lesson summaries
- Optimized Clustering: Automatic selection of the cluster count using a Silhouette strategy
- Disaster Education Focused: Specialized prompts and chunking for disaster prevention education
- Advanced Visualization: Comprehensive visualization suite with NetworkX, t-SNE, UMAP
- 3D Dynamic Tree Visualization: Interactive 3D RAPTOR tree with cognee-inspired rendering
- Total Content: 50,892 characters of comprehensive disaster lessons
- Hierarchical Structure: 78-node knowledge tree with 4 depth levels
- Cluster Distribution: Mean 33.1 documents, median 7.0 documents per cluster
- Coverage: 20 major categories of disaster lessons
- LLM: granite-code:8b (GPU-optimized)
- Embeddings: mxbai-embed-large (1024 dimensions)
- Vector Storage: FAISS with hierarchical indexing
- Clustering: Silhouette-based optimization (max_depth=3)
- Visualization: NetworkX, matplotlib, seaborn, plotly, t-SNE, UMAP
- 3D Visualization: Plotly WebGL, cognee-inspired dynamic 3D tree rendering
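As a rough illustration of context-aware search over a hierarchy, the sketch below ranks summary nodes and leaf chunks together by cosine similarity, in the "collapsed tree" style of querying described for RAPTOR. It is a minimal sketch under stated assumptions, not this repository's code: the `Node` class, the `search` function, and the 3-dimensional toy vectors are hypothetical stand-ins for FAISS and the 1024-dimensional mxbai-embed-large embeddings.

```python
import math
from dataclasses import dataclass


@dataclass
class Node:
    """One node of a RAPTOR-style tree: a leaf chunk or an LLM summary."""
    level: int        # 0 = root summary, larger = closer to the leaves
    text: str
    embedding: list   # toy vector; the real system uses 1024 dimensions


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(nodes, query_embedding, top_k=2):
    """Rank every node, regardless of level, by similarity to the query."""
    ranked = sorted(
        nodes,
        key=lambda n: cosine(n.embedding, query_embedding),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical nodes and embeddings, for illustration only.
nodes = [
    Node(0, "Overview of disaster lessons", [0.6, 0.6, 0.5]),
    Node(1, "Summary: tsunami evacuation", [0.9, 0.3, 0.1]),
    Node(2, "Leaf: evacuate to high ground immediately", [1.0, 0.1, 0.0]),
    Node(2, "Leaf: reconstruction town planning", [0.0, 0.2, 1.0]),
]
query = [1.0, 0.2, 0.0]  # stands in for an embedded evacuation question
results = search(nodes, query)
for node in results:
    print(node.level, node.text)
```

In this toy setup the evacuation leaf ranks first and its parent summary second, so an answer can draw on both the specific lesson and its broader context.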
The system integrates authoritative information from multiple reliable sources:
- Damage Assessment & Initial Response
  - Earthquake/tsunami overview and casualties (15,900 deaths, 2,525 missing)
  - Emergency response by Self-Defense Forces, fire departments, and police
  - Transportation infrastructure, communication, and supply chain challenges
- Tsunami Evacuation Case Studies
  - Kesennuma City, Miyagi: Port facility evacuation
  - Otsuchi Town, Iwate: Disaster response headquarters damage and lessons
  - Iwaki City, Fukushima: Combined nuclear and tsunami disaster response
  - Minamisanriku Town, Miyagi: Disaster Prevention Building tragedy
  - Rikuzentakata City, Iwate: Urban destruction and reconstruction
- Disaster Prevention Education Examples
  - Kamaishi City: "Miracle of Kamaishi" and three principles
  - Sendai City: Disaster education supplementary readers and Arahama Elementary ruins
  - Ishinomaki City: Okawa Elementary lessons and school disaster system review
  - Fukushima Prefecture: Radiation education and scientific literacy
- Reconstruction Town Planning
  - Rikuzentakata City: 10m elevation reconstruction project (5 million m³ of soil)
  - Onagawa Town: Compact city and Seapal-Pier Onagawa
  - Higashimatsushima City: Environmental Future City concept and smart housing
  - Natori City Yuriage: Challenges and lessons in resident consensus building
- Fukushima Nuclear Accident Details
  - Nuclear plant incident timeline and technical analysis
  - Evacuation procedures and long-term impact assessment
  - Radiation monitoring and safety protocols
The system generates comprehensive visualizations to understand the hierarchical knowledge structure:
Hierarchical Knowledge Tree:
- Total Nodes: 78 (distributed across 4 levels)
- Root Level: 1 comprehensive overview node
- Intermediate Levels: 4 → 16 category-specific summary nodes
- Leaf Level: 57 specific lesson nodes
- Node Colors: Depth-based (Root: green, Intermediate: blue, Leaf: yellow)
- Node Sizes: Proportional to document count
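The depth-based colors and count-proportional sizes listed above can be sketched as two small helper functions. This is an illustrative sketch only: `node_color` and `node_size` are hypothetical names, and the linear scaling into the `(100, 3000)` range (the `node_size_range` value from the visualization configuration) is an assumption about how "proportional" is implemented.

```python
DEPTH_COLORS = {"root": "green", "intermediate": "blue", "leaf": "yellow"}


def node_color(depth, max_depth):
    """Pick a color by depth: root, leaf, or anything in between."""
    if depth == 0:
        return DEPTH_COLORS["root"]
    if depth == max_depth:
        return DEPTH_COLORS["leaf"]
    return DEPTH_COLORS["intermediate"]


def node_size(doc_count, min_count, max_count, size_range=(100, 3000)):
    """Linearly map a document count onto the drawing size range."""
    smallest, largest = size_range
    if max_count == min_count:
        return float(smallest)  # all nodes equal: avoid dividing by zero
    fraction = (doc_count - min_count) / (max_count - min_count)
    return smallest + fraction * (largest - smallest)


# Cluster sizes in this tree range from 1 to 433 documents.
print(node_color(0, 3), node_size(433, 1, 433))
print(node_color(3, 3), node_size(1, 1, 433))
```

The resulting values could be passed as `node_color` and `node_size` lists to a NetworkX drawing call, one entry per node.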
Statistical Distribution:
- Depth Distribution: Balanced 4-level hierarchy (1→4→16→57)
- Cluster Size Variation: Range from 1 to 433 documents
- Mean Cluster Size: 33.1 documents
- Median Cluster Size: 7.0 documents
Quality Metrics:
- Silhouette Coefficient: 0.0968 average (higher is better)
- Davies-Bouldin Index: 2.8291 average (lower is better)
- Optimal k Selection: k=2 (6 times), k=3 (2 times)
- Strategy: Silhouette-based automatic optimization
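To make the selection strategy concrete: for each point, the silhouette coefficient compares the mean distance to its own cluster (a) with the mean distance to the nearest other cluster (b) as (b - a) / max(a, b), and the candidate k with the highest average wins. The following is a minimal pure-Python sketch on toy 2D points. The tiny deterministic `kmeans`, the `select_k` helper, and the sample data are assumptions for illustration; the project itself may use a different clusterer or scikit-learn's implementations.

```python
import math


def kmeans(points, k, iters=20):
    """Tiny deterministic k-means: seeds are evenly spaced input points."""
    step = len(points) // k
    centers = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient over all points."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return -1.0  # undefined for a single cluster; treat as worst case
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


def select_k(points, candidates=(2, 3, 4)):
    """Return the candidate k with the highest mean silhouette."""
    scored = {k: silhouette(points, kmeans(points, k)) for k in candidates}
    return max(scored, key=scored.get), scored


# Two well-separated toy groups; real inputs would be 1024-D embeddings.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
best_k, scores = select_k(points)
print(best_k, scores)
```

On real text embeddings the scores are typically far lower than on toy data like this, which is consistent with the 0.0968 average reported above.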
Dimensionality Reduction:
- t-SNE: 1024D → 2D projection preserving local structure
- UMAP: Global structure preservation with clearer cluster boundaries
- Interactive Versions: Available in HTML format for detailed exploration
- Install Ollama (https://ollama.ai/)
- Download Required Models:

```bash
ollama pull mxbai-embed-large   # Embedding model (1024 dimensions)
ollama pull granite-code:8b     # LLM model (8B parameters)
```

- Install Python Dependencies:

```bash
pip install langchain langchain-community langchain-ollama
pip install faiss-cpu scikit-learn numpy pandas
pip install matplotlib seaborn plotly networkx
pip install umap-learn jupyter notebook
```

Test the installation:

```bash
python test_simple.py
```

Expected output:

```
Embedding model works! Vector dimension: 1024
LLM model works! Response: Hello! ...
All models are working correctly!
```
```bash
python quick_start.py
```

Sample Queries:
- "What were effective tsunami evacuation actions?"
- "What are the success factors of the Kamaishi Miracle?"
- "What is the counterpart system?"
- "What's important for disaster information transmission?"
- "What are the challenges in reconstruction town planning?"
```bash
python quick_start.py batch
python tsunami_lesson_raptor.py
jupyter notebook raptor_tree_visualization_tsunami.ipynb
```

The notebook contains 67 cells, including:
- Library imports and model configuration
- Tree construction and data loading
- NetworkX structure visualization
- Statistical analysis and distribution plots
- High-dimensional embedding projections
- Interactive plot generation
All visualizations are saved to output_figure/ directory:
| Visualization Type | File | Purpose | Key Information |
|---|---|---|---|
| Tree Structure | 01_tree_structure.png | Hierarchical overview | 78 nodes, 4 levels, parent-child relationships |
| Statistics | 02_cluster_statistics.png | Numerical analysis | Depth distribution, cluster sizes |
| Evaluation | 03_evaluation_metrics.png | Clustering quality | Silhouette scores, DBI, k-selection |
| t-SNE | 04_tsne_visualization.png | Local structure | Neighborhood preservation, cluster boundaries |
| UMAP | 05_umap_visualization.png | Global structure | Topology preservation, clear separation |
| Multi-layer | 06_multi_layer_comparison.png | Layer comparison | Abstraction process, level structures |
| Ultra-Fast 3D | 07_ultra_fast_3d_raptor.html | 3D tree structure | Interactive WebGL, GPU-optimized |
| Instant 3D | 08_instant_3d_raptor.html | One-click visualization | Instant 3D tree (~3.8s total) |
Ultra-Fast Interactive 3D Tree Rendering
The system includes a 3D visualization, inspired by cognee's rendering approach, that builds an interactive RAPTOR tree in about 3.8 seconds:
- Lightning Performance: 0.022s 3D generation, 3.8s total execution
- Interactive Controls: Mouse-based rotation, zoom, and pan
- Hierarchical Color Coding: Level-based visual distinction (Red→Orange→Yellow→Green)
- Dynamic Node Sizing: Document count-based proportional sizing
- Edge Visualization: Parent-child relationship mapping
- Multi-format Output: Interactive HTML + static PNG exports
78-Node Hierarchical Tree:

```
├── Level 0 (Root): 1 node - 433 documents (Red)
├── Level 1: 4 nodes - 108 docs/node (Orange)
├── Level 2: 16 nodes - 27 docs/node (Yellow)
└── Level 3: 57 nodes - 7 docs/node (Green)
```
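The per-level figures in the tree above can be reproduced with a few lines of arithmetic, assuming (this is an assumption, not something the source states) that each level partitions the same 433 documents and that the docs/node values are truncated averages.

```python
TOTAL_DOCS = 433
NODES_PER_LEVEL = [1, 4, 16, 57]  # levels 0 (root) through 3

total_nodes = sum(NODES_PER_LEVEL)
# Floor division reproduces the listed averages: 433/57 is about 7.6,
# which the tree reports as 7, so the values are truncated, not rounded.
docs_per_node = [TOTAL_DOCS // count for count in NODES_PER_LEVEL]

print(total_nodes)    # 78
print(docs_per_node)  # [433, 108, 27, 7]
```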
```python
PERFORMANCE_3D = {
    "initialization": "0.002s (cached)",
    "coordinate_calculation": "0.022s",
    "webgl_rendering": "<0.2s",
    "total_execution": "3.759s",
    "speed_improvement": "1500x faster than baseline"
}
```

- NumPy Vectorization: Batch coordinate calculation
- Plotly WebGL: GPU-accelerated browser rendering
- Pre-computed Data: Fixed tree structure for instant loading
- Conditional Initialization: Duplicate processing prevention
- Memory Optimization: Efficient data structures
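One way to picture the pre-computed coordinate step is a level-wise ring layout, where each tree level sits on its own circle and deeper levels are drawn lower and wider. The sketch below is hypothetical: the `layout_tree` function, the ring geometry, and the spacing parameters are assumptions, and it uses a plain Python loop for self-containment rather than the NumPy vectorization the project mentions. The hex colors are the ones given in `RAPTOR_3D_CONFIG`.

```python
import math

NODES_PER_LEVEL = [1, 4, 16, 57]
LEVEL_COLORS = {0: "#FF0000", 1: "#FF8000", 2: "#FFFF00", 3: "#00FF00"}


def layout_tree(nodes_per_level, radius_step=1.0, level_gap=1.0):
    """Place each level on a ring: radius grows and z drops with depth."""
    coords = []
    for level, count in enumerate(nodes_per_level):
        radius = level * radius_step
        for i in range(count):
            angle = 2 * math.pi * i / count  # spread nodes evenly on the ring
            coords.append({
                "level": level,
                "x": radius * math.cos(angle),
                "y": radius * math.sin(angle),
                "z": -level * level_gap,
                "color": LEVEL_COLORS[level],
            })
    return coords


coords = layout_tree(NODES_PER_LEVEL)
print(len(coords), coords[0])
```

The x, y, z, and color lists extracted from `coords` are the kind of input a Plotly 3D scatter trace expects, which is where WebGL rendering would take over.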
```python
RAPTOR_CONFIG = {
    "max_depth": 3,                      # Maximum tree depth
    "selection_strategy": "silhouette",  # Clustering strategy
    "chunk_size": 500,                   # Text chunk size
    "chunk_overlap": 100,                # Overlap between chunks
    "embedding_dimension": 1024,         # mxbai-embed-large dimension
    "temperature": 0.0,                  # LLM temperature for consistency
}
```

```python
VISUALIZATION_CONFIG = {
    "tree_layout": "spring",             # NetworkX layout algorithm
    "figure_size": (15, 10),             # Figure size (inches)
    "node_size_range": (100, 3000),      # Node size range
    "edge_alpha": 0.6,                   # Edge transparency
    "dpi": 300,                          # Image resolution
    "tsne_perplexity": 30,               # t-SNE parameter
    "umap_n_neighbors": 15,              # UMAP parameter
}

# 3D Visualization Configuration
RAPTOR_3D_CONFIG = {
    "instant_generation": True,          # One-click execution
    "webgl_optimization": True,          # GPU browser rendering
    "color_scheme": {                    # Hierarchical colors
        0: "#FF0000",                    # Root (Red)
        1: "#FF8000",                    # Level 1 (Orange)
        2: "#FFFF00",                    # Level 2 (Yellow)
        3: "#00FF00",                    # Level 3 (Green)
    },
    "node_size_scale": 20,               # Document count scaling
    "camera_position": (1.2, 1.2, 1.2),  # Default 3D view
    "background": "black",               # Scene background
    "export_formats": ["html", "png"],   # Output file types
}
```

- Tree Construction: ~2-3 minutes (initial build)
- Search Response: <1 second (after model loading)
- GPU Utilization: 100% (granite-code:8b)
- Memory Usage: ~4GB RAM, ~8GB VRAM
- 3D Visualization: 3.8s total execution (instant after initialization)
- Interactive Rendering: WebGL real-time performance
- Retrieval Accuracy: Context-aware hierarchical search
- Clustering Quality: Silhouette coefficient 0.0968
- Knowledge Coverage: 20 major disaster lesson categories
- Language Support: Japanese (primary), English (documentation)
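For reference, the `chunk_size` and `chunk_overlap` values in `RAPTOR_CONFIG` describe overlapping text windows. A simplified character-window version is sketched below. It is an assumption-laden illustration: `chunk_text` is a hypothetical helper, and the project may instead rely on a LangChain text splitter, which prefers to break on separators such as paragraphs and sentences rather than at fixed offsets.

```python
def chunk_text(text, chunk_size=500, chunk_overlap=100):
    """Split text into fixed-size character windows that overlap."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # each window starts 400 chars later
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


sample = "x" * 1200  # placeholder text, not real lesson data
chunks = chunk_text(sample)
print([len(c) for c in chunks])
```

The overlap means the last 100 characters of one chunk reappear at the start of the next, so a sentence that straddles a boundary still appears whole in at least one chunk when it is shorter than the overlap.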
To experience the interactive 3D RAPTOR tree visualization:
- Open Notebook: `raptor_tree_visualization_tsunami.ipynb`
- Run One-Click Cell: Execute the "ワンクリック超高速起動" ("One-click ultra-fast launch") cell
- Interactive Exploration: Use mouse to rotate, zoom, and explore the 78-node structure
- Export Options: Automatically generates HTML and PNG files
```python
# One-click 3D visualization execution
# Cell execution time: ~3.8 seconds
# Output: Interactive 3D tree with WebGL rendering
instant_fig = instant_3d_raptor()
instant_fig.show()
```

- `output_figure/08_instant_3d_raptor.html` - Interactive 3D visualization
- `output_figure/08_instant_3d_raptor.png` - Static image export
- Disaster Prevention Training: Structured lesson delivery
- Academic Research: Systematic disaster lesson analysis
- Policy Development: Evidence-based disaster preparedness
- Community Education: Accessible lesson sharing
- Knowledge Management: Hierarchical information organization
- RAG System Development: Advanced retrieval techniques
- Visualization Research: Multi-modal data representation
- AI Education: RAPTOR algorithm implementation example
- 3D Interactive Learning: Immersive knowledge exploration
- Web-based Education: Browser-compatible 3D tree navigation
- Multilingual Support: English, Chinese, Korean translations
- Web Interface: Browser-based interactive system
- Knowledge Base Expansion: Global tsunami lessons integration
- Multimedia Integration: Images, videos, audio materials
- Real-time Updates: Dynamic knowledge base updates
- Mobile Application: Smartphone-optimized interface
- API Development: RESTful API for integration
- Evaluation Dataset: JQaRA-format 30-question dataset
This project aims to preserve disaster lessons for educational purposes. We welcome contributions in:
- Knowledge Base Expansion: Additional disaster lesson sources
- Evaluation Dataset Creation: Standardized assessment materials
- Documentation Improvement: Enhanced guides and tutorials
- Internationalization: Translation and localization
- Visualization Enhancement: New analysis techniques
- Performance Optimization: Speed and efficiency improvements
This project is freely available for educational and research purposes. For commercial use, please check the license terms of the underlying LLM and embedding models.
This system is based on research and activities from:
- Cabinet Office Reconstruction Agency: "Lessons and Know-how Collection for Reconstruction"
- Tohoku University International Research Institute of Disaster Science: Prof. Fumihiko Imamura's research
- Kamaishi City Board of Education: Disaster prevention education practices
- Storytellers nationwide: Community-based lesson sharing activities
We pray for those who lost their lives in the Great East Japan Earthquake and hope these lessons will contribute to future disaster prevention.
Version: 1.0 - International Edition
Created: October 20, 2025
Project: Tsunami Lesson RAPTOR System
For questions, suggestions, or collaboration opportunities, please open an issue in this repository.
Repository: https://github.com/tk-yasuno/tsunami-lesson-rag




