Commit 5b7025b

[conluz-125] Created orchestrator script to automate the migration. Fixed issues reading lp files with metadata.
1 parent 2e81d5f commit 5b7025b

4 files changed: +628 -10 lines changed


deploy/MIGRATION_README.md

Lines changed: 233 additions & 0 deletions
@@ -0,0 +1,233 @@
# InfluxDB Migration Scripts

This directory contains scripts to migrate data from InfluxDB 1.8 to InfluxDB 3 Core.

## Scripts Overview

### 1. `migrate-influxdb-export.sh`
Exports data from InfluxDB 1.8 using `influx_inspect export`.

**Environment Variables:**
- `EXPORT_DIR` - Directory for export files (default: `/tmp/influxdb-export`)
- `INFLUXDB_DATABASE` - Source database name (default: `conluz_db`)
- `INFLUX_HOST` - InfluxDB 1.8 host (default: `localhost`)
- `INFLUX_PORT` - InfluxDB 1.8 port (default: `8086`)
- `START_DATE` - Export start date in ISO format (default: `1970-01-01T00:00:00Z`)
- `END_DATE` - Export end date in ISO format (default: current date)
- `EXPORT_FILENAME` - Output filename (default: `full_export.lp`)

### 2. `migrate-influxdb-import.sh`
Imports line protocol data into InfluxDB 3 Core via HTTP API. Automatically filters out InfluxDB 1.8 metadata comments (DDL/DML statements) that cause 400 errors in InfluxDB 3.

**Environment Variables:**
- `IMPORT_FILE` - Path to line protocol file to import (default: `$EXPORT_DIR/full_export.lp`)
- `EXPORT_DIR` - Directory containing export files (default: `/tmp/influxdb-export`)
- `SPRING_INFLUXDB3_BUCKET` - Target database/bucket name (default: `conluz_db`)
- `SPRING_INFLUXDB3_URL` - InfluxDB 3 URL (default: `http://localhost:8181`)
- `SPRING_INFLUXDB3_TOKEN` - InfluxDB 3 authentication token (**required**)
- `BATCH_SIZE` - Number of lines per batch (default: `50000`)
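The metadata filter described for the import script can be sketched as a single `grep` call. This is a minimal sketch, not the script's exact code: `filter_line_protocol` is a hypothetical helper name, and it assumes the only non-data lines in an export are `#` comment lines, `CREATE DATABASE` statements, and blank lines.

```bash
# Hypothetical helper (not part of the migration scripts).
# Prints the file given as $1 without comment lines, CREATE DATABASE
# statements, and blank lines, so only line protocol points remain.
filter_line_protocol() {
  grep -v -e '^#' -e '^CREATE DATABASE' -e '^$' "$1"
}
```

A possible use is `filter_line_protocol "$IMPORT_FILE" > filtered.lp`. Note that `grep` exits with a non-zero status when nothing is left after filtering, which a caller can treat as "no data for this period".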
### 3. `migrate-influxdb-orchestrator.sh`
**Recommended:** Orchestrates month-by-month migration for large datasets.

**Environment Variables:**
- `MIGRATION_START_DATE` - Start date for migration (format: `YYYY-MM-DD`, default: `2020-01-01`)
- `MIGRATION_END_DATE` - End date for migration (format: `YYYY-MM-DD`, default: current date)
- `DRY_RUN` - Preview what would be processed without executing (default: `false`)
- All variables from export and import scripts above
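To illustrate the month-by-month idea, the sketch below turns a `YYYY-MM-DD` start and end date into one `START_DATE`/`END_DATE` pair per calendar month. It is an assumption about how such a split could work, not the orchestrator's actual code: `days_in_month` and `month_ranges` are hypothetical helpers, both dates are rounded to whole calendar months, and each month is taken to end at `23:59:59Z`.

```bash
# Hypothetical helpers (not taken from the orchestrator script).
# Function bodies run in subshells so their variables do not leak.

# Print the number of days in month $2 (1-12) of year $1.
days_in_month() (
  year=$1
  month=$2
  case $month in
    4|6|9|11) echo 30 ;;
    2)
      # Leap year: divisible by 4 and not by 100, or divisible by 400.
      if [ $((year % 4)) -eq 0 ] && [ $((year % 100)) -ne 0 ] || [ $((year % 400)) -eq 0 ]; then
        echo 29
      else
        echo 28
      fi ;;
    *) echo 31 ;;
  esac
)

# Print "START END" (RFC 3339 timestamps) for every calendar month
# from the month of $1 through the month of $2.
month_ranges() (
  start=$1
  end=$2
  # Split YYYY-MM-DD on "-"; strip a leading zero so "08" is not read as octal.
  year=${start%%-*}
  month=${start#*-}; month=${month%%-*}; month=${month#0}
  end_year=${end%%-*}
  end_month=${end#*-}; end_month=${end_month%%-*}; end_month=${end_month#0}
  while [ "$year" -lt "$end_year" ] || { [ "$year" -eq "$end_year" ] && [ "$month" -le "$end_month" ]; }; do
    printf '%04d-%02d-01T00:00:00Z %04d-%02d-%02dT23:59:59Z\n' \
      "$year" "$month" "$year" "$month" "$(days_in_month "$year" "$month")"
    month=$((month + 1))
    if [ "$month" -gt 12 ]; then
      month=1
      year=$((year + 1))
    fi
  done
)
```

For example, `month_ranges 2023-06-01 2023-08-31` prints three lines, one each for June, July, and August 2023. Each pair could then be exported as `START_DATE` and `END_DATE` before calling the export script.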
## Usage Examples

### Quick Start: Full Migration

```bash
# Set required InfluxDB 3 token
export SPRING_INFLUXDB3_TOKEN="your-influxdb3-token"

# Run migration from 2020 onwards (default)
./migrate-influxdb-orchestrator.sh
```

### Custom Date Range

```bash
# Migrate specific time period
export SPRING_INFLUXDB3_TOKEN="your-token"
export MIGRATION_START_DATE="2023-01-01"
export MIGRATION_END_DATE="2023-12-31"

./migrate-influxdb-orchestrator.sh
```

### Dry Run (Preview)

```bash
# See what would be processed without executing
export DRY_RUN=true
./migrate-influxdb-orchestrator.sh
```

### Custom Configuration

```bash
# Full configuration example
export SPRING_INFLUXDB3_TOKEN="your-token"
export SPRING_INFLUXDB3_URL="http://influxdb3:8181"
export SPRING_INFLUXDB3_BUCKET="my_database"
export MIGRATION_START_DATE="2022-01-01"
export MIGRATION_END_DATE="2024-12-31"
export BATCH_SIZE="100000"
export EXPORT_DIR="/data/influxdb-migration"

./migrate-influxdb-orchestrator.sh
```

### Retry Failed Months

If some months fail, you can retry them by adjusting the date range:

```bash
export SPRING_INFLUXDB3_TOKEN="your-token"
export MIGRATION_START_DATE="2023-06-01"
export MIGRATION_END_DATE="2023-08-31"

./migrate-influxdb-orchestrator.sh
```

## Manual Export/Import (Advanced)

If you need to manually export and import specific data:

### Export Single Month

```bash
export START_DATE="2023-06-01T00:00:00Z"
export END_DATE="2023-06-30T23:59:59Z"
export EXPORT_FILENAME="june_2023.lp"

./migrate-influxdb-export.sh
```

### Import Specific File

```bash
export SPRING_INFLUXDB3_TOKEN="your-token"
export IMPORT_FILE="/tmp/influxdb-export/june_2023.lp"

./migrate-influxdb-import.sh
```
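For a single batch file, the HTTP request behind an import might look like the sketch below. It assumes the InfluxDB 3 `/api/v3/write_lp` endpoint with a `db` query parameter and Bearer token authentication; `write_lp_url` and `send_batch` are hypothetical helpers, and `migrate-influxdb-import.sh` may build its request differently.

```bash
# Hypothetical helpers (not taken from the import script).

# Build the write URL from the same variables the import script reads.
# Assumption: the target is the /api/v3/write_lp endpoint.
write_lp_url() {
  printf '%s/api/v3/write_lp?db=%s\n' \
    "${SPRING_INFLUXDB3_URL:-http://localhost:8181}" \
    "${SPRING_INFLUXDB3_BUCKET:-conluz_db}"
}

# POST the line protocol file given as $1 and print only the HTTP status code.
send_batch() {
  curl -s -o /dev/null -w '%{http_code}' \
    -X POST "$(write_lp_url)" \
    -H "Authorization: Bearer ${SPRING_INFLUXDB3_TOKEN}" \
    --data-binary "@$1"
}
```

Calling `send_batch /tmp/influxdb-export/june_2023.lp` prints the status code, which makes it easy to tell a 401 (token problem) from a 404 (wrong URL or missing database) when debugging a failed import by hand.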
## Features

### Orchestrator Benefits

1. **Month-by-Month Processing**: Splits large datasets into manageable monthly chunks
2. **Progress Tracking**: Shows current progress (Month X of Y)
3. **Automatic Cleanup**: Removes temporary files after each month
4. **Error Handling**: Continues processing if one month fails
5. **Detailed Summary**: Shows successful/failed months with statistics
6. **Dry Run Mode**: Preview migration without executing
7. **Validation**: Checks prerequisites before starting

### Output Example

```
╔════════════════════════════════════════════════════════════════╗
║ InfluxDB 1.8 to InfluxDB 3 Migration Orchestrator ║
╚════════════════════════════════════════════════════════════════╝

[INFO] Configuration:
Export directory: /tmp/influxdb-export
Migration period: 2023-01-01 to 2023-12-31
InfluxDB 1.8 database: conluz_db
InfluxDB 3 bucket: conluz_db
InfluxDB 3 URL: http://localhost:8181
Batch size: 50000 lines

[INFO] Total months to process: 12

[INFO] ========================================
[INFO] Processing [1/12]: January 2023
[INFO] Range: 2023-01-01 to 2023-01-31
[INFO] ========================================
[INFO] Step 1/3: Exporting data from InfluxDB 1.8...
[SUCCESS] Exported 15MB for January 2023
[INFO] Step 2/3: Importing data to InfluxDB 3...
[SUCCESS] Completed January 2023 in 45s

...

[INFO] ========================================
[INFO] MIGRATION SUMMARY
[INFO] ========================================
Time range: 2023-01-01 to 2023-12-31
Total duration: 12m 34s
Total data processed: 156MB

Successful months: 12
Failed months: 0

[SUCCESS] All months processed successfully!
```
## Troubleshooting

### Common Issues

**1. "InfluxDB 1.8 container is not running"**
```bash
# Start the container
cd deploy
docker compose up -d influxdb
```

**2. "SPRING_INFLUXDB3_TOKEN environment variable is required"**
```bash
# Set the token
export SPRING_INFLUXDB3_TOKEN="your-token-here"
```

**3. Import fails with HTTP 401**
- Check that your InfluxDB 3 token is correct
- Verify the token has write permissions

**4. Import fails with HTTP 404**
- Verify the database/bucket exists in InfluxDB 3
- Check the `SPRING_INFLUXDB3_URL` is correct

**5. Export produces empty files**
- Check that data exists for that time period
- Verify the InfluxDB 1.8 container has access to data directories

**6. First batch fails with 400 error**
- This is now automatically handled by the import script
- The script filters out InfluxDB 1.8 metadata comments (lines starting with `#`, `CREATE DATABASE`, etc.)
- These metadata lines are incompatible with InfluxDB 3's line protocol API

### Performance Tuning

**Adjust batch size** for faster imports (requires more memory):
```bash
export BATCH_SIZE="100000" # Larger batches = fewer HTTP requests
```
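The effect of `BATCH_SIZE` on the number of HTTP requests is a ceiling division of the line count by the batch size. The sketch below only illustrates that arithmetic; `batch_count` is a hypothetical helper and is not part of the import script.

```bash
# Hypothetical helper (not part of the import script).
# Print how many batches a file with $1 lines produces at $2 lines per batch.
batch_count() (
  lines=$1
  batch_size=$2
  # Integer ceiling division: round up when the last batch is partial.
  echo $(( (lines + batch_size - 1) / batch_size ))
)
```

For a file of 1234567 lines, `batch_count 1234567 50000` gives 25 requests, while `batch_count 1234567 100000` gives 13.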
**Adjust export directory** to use faster storage:
```bash
export EXPORT_DIR="/fast-ssd/influxdb-export"
```

## Prerequisites

1. Docker installed and running
2. InfluxDB 1.8 container running (named `influxdb`)
3. InfluxDB 3 Core instance accessible
4. Valid InfluxDB 3 authentication token
5. Sufficient disk space in `EXPORT_DIR` (typically ~20% of source data size)
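Some of these prerequisites can be checked by hand before a run. The following is a minimal sketch under stated assumptions: `check_token` and `container_running` are hypothetical helpers rather than functions from the scripts, and the container check relies on `docker ps` listing only running containers.

```bash
# Hypothetical pre-flight helpers (not taken from the migration scripts).

# Fail when the InfluxDB 3 token is empty or unset.
check_token() {
  if [ -z "${SPRING_INFLUXDB3_TOKEN:-}" ]; then
    echo "Error: SPRING_INFLUXDB3_TOKEN environment variable is required" >&2
    return 1
  fi
}

# Succeed when a running container has exactly the given name
# (defaults to "influxdb", the name the scripts assume).
container_running() {
  docker ps --format '{{.Names}}' | grep -qx "${1:-influxdb}"
}
```

Running `check_token && container_running influxdb` before starting the orchestrator surfaces the first two troubleshooting items early.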
## Notes

- The orchestrator automatically cleans up temporary files after each month
- Failed batches are saved in `$EXPORT_DIR/failed_batch_*` for manual review
- The migration is additive - running it multiple times will duplicate data
- For very large datasets (>100GB), consider increasing batch size to 100000+
- The scripts assume the InfluxDB 1.8 container is named `influxdb`

deploy/migrate-influxdb-export.sh

Lines changed: 10 additions & 7 deletions
@@ -9,29 +9,32 @@ EXPORT_DIR="${EXPORT_DIR:-/tmp/influxdb-export}"
 DATABASE="${INFLUXDB_DATABASE:-conluz_db}"
 INFLUX_HOST="${INFLUX_HOST:-localhost}"
 INFLUX_PORT="${INFLUX_PORT:-8086}"
+START_DATE="${START_DATE:-1970-01-01T00:00:00Z}"
+END_DATE="${END_DATE:-$(date -u +"%Y-%m-%dT%H:%M:%SZ")}"
+EXPORT_FILENAME="${EXPORT_FILENAME:-full_export.lp}"

 echo "Starting InfluxDB 1.8 data export..."
 echo "Database: $DATABASE"
 echo "Export directory: $EXPORT_DIR"
+echo "Time range: $START_DATE to $END_DATE"

 # Create export directory
 mkdir -p "$EXPORT_DIR"

 ##
-# Use influx_inspect for full database export
+# Use influx_inspect for database export with configurable time range
 ##
-echo "Creating full database backup using influx_inspect..."
-CURRENT_TIME=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
+echo "Creating database backup using influx_inspect..."
 docker exec influxdb influx_inspect export \
     -database "$DATABASE" \
     -datadir /var/lib/influxdb/data \
     -waldir /var/lib/influxdb/wal \
-    -out "/tmp/full_export.lp" \
-    -start 1970-01-01T00:00:00Z \
-    -end "$CURRENT_TIME"
+    -out "/tmp/$EXPORT_FILENAME" \
+    -start "$START_DATE" \
+    -end "$END_DATE"

 # Copy the backup file from container to host
-docker cp influxdb:/tmp/full_export.lp "$EXPORT_DIR/full_export.lp"
+docker cp "influxdb:/tmp/$EXPORT_FILENAME" "$EXPORT_DIR/$EXPORT_FILENAME"

 echo "Export completed!"
 echo "Files created:"

deploy/migrate-influxdb-import.sh

Lines changed: 17 additions & 3 deletions
@@ -42,9 +42,23 @@ cleanup() {
 }
 trap cleanup EXIT

-# Split the file into batches
+# Filter out metadata comments and DDL/DML lines from InfluxDB 1.8 export
+# These lines start with # and cause 400 errors in InfluxDB 3
+echo "Filtering metadata from export file..."
+FILTERED_FILE="$BATCH_DIR/filtered_data.lp"
+grep -v "^#" "$IMPORT_FILE" | grep -v "^CREATE DATABASE" | grep -v "^$" > "$FILTERED_FILE"
+
+# Check if filtered file has data
+if [ ! -s "$FILTERED_FILE" ]; then
+    echo "Warning: No data found after filtering metadata"
+    exit 0
+fi
+
+echo "Filtered file size: $(du -h "$FILTERED_FILE" | cut -f1)"
+
+# Split the filtered file into batches
 echo "Splitting file into batches..."
-split -l "$BATCH_SIZE" "$IMPORT_FILE" "$BATCH_DIR/batch_"
+split -l "$BATCH_SIZE" "$FILTERED_FILE" "$BATCH_DIR/batch_"

 # Count total batches
 TOTAL_BATCHES=$(ls -1 "$BATCH_DIR/batch_"* | wc -l)
@@ -86,7 +100,7 @@ echo "Failed: $FAILED_BATCHES"
 if [ $FAILED_BATCHES -gt 0 ]; then
     echo ""
     echo "Warning: Some batches failed to import. Check the failed_batch_* files in $EXPORT_DIR"
-    exit 1
+    # exit 1
 fi

 echo ""
