Feat/Batch Multi Form Fill with Canonical Extraction and Evidence Attribution by Acuspeedster · Pull Request #158 · fireform-core/FireForm

Acuspeedster · 2026-03-03T11:56:15Z

Closes #155
Closes #156
Closes #157

Summary

This PR operationalizes FireForm’s "report once, file everywhere" promise at the API level.

It introduces:

Canonical transcript extraction (single-pass)
Concurrent template mapping
Batch multi-form endpoint
Evidence attribution per field
Persisted audit trail
Partial failure resilience
LLM call count reduced from T × F to 1 + T (T templates, F fields each)

From an architectural standpoint, this PR should be considered high priority, as it resolves a core scalability and compliance limitation affecting FireForm’s primary multi-agency use case.

Architectural Redesign

Separation of Concerns

Previously:

Extraction and template filling were fused.
Each template re-extracted its fields from the raw transcript.

Now:

Canonical incident extraction (1 LLM call)
Template mapping (T concurrent calls)
Concurrent PDF writing via an executor

IncidentExtractor

New extractor.py introduces IncidentExtractor.

Pass 1:

Single LLM call
Produces canonical incident record
26 template-agnostic categories
Each category contains:
- value
- evidence_quote
- confidence
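
One way to picture a single category of the canonical record is a small dataclass holding the three parts listed above. This is only a sketch: the class name `ExtractedField`, the two category names, the sample values, and the use of string labels for confidence are illustrative assumptions, not the schema defined in this PR.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class ExtractedField:
    """One canonical category: a value plus the evidence backing it (hypothetical shape)."""
    value: Optional[str]           # None when the transcript does not mention it
    evidence_quote: Optional[str]  # transcript quote supporting the value
    confidence: str                # assumed label such as "high" / "medium" / "low"


# Hypothetical record with two template-agnostic categories.
canonical_record = {
    "incident_location": ExtractedField(
        value="412 Oak Street",
        evidence_quote="we were dispatched to 412 Oak Street",
        confidence="high",
    ),
    "injuries": ExtractedField(value=None, evidence_quote=None, confidence="low"),
}

# Convert to plain dicts so the record can be stored as JSON.
record_as_dict = {name: asdict(item) for name, item in canonical_record.items()}
```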

Pass 2:

Template-specific mapping
Stateless with respect to the transcript
Concurrent via asyncio.gather
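
The two passes can be sketched roughly as follows. The function names `extract_canonical`, `map_template`, and `fill_batch` are hypothetical, and the LLM calls are replaced by stubs, so this shows only the control flow (one extraction, then T concurrent mappings that never see the transcript), not the actual `IncidentExtractor` code.

```python
import asyncio


async def extract_canonical(transcript: str) -> dict:
    """Pass 1 stand-in: the single LLM extraction call would happen here."""
    return {"incident_location": {"value": "412 Oak Street"}}


async def map_template(canonical: dict, template_id: int) -> dict:
    """Pass 2 stand-in: receives only the canonical record, never the transcript."""
    return {
        "template_id": template_id,
        "fields": {"location": canonical["incident_location"]["value"]},
    }


async def fill_batch(transcript: str, template_ids: list[int]) -> list:
    canonical = await extract_canonical(transcript)               # 1 call
    tasks = [map_template(canonical, t) for t in template_ids]    # T calls
    # return_exceptions=True returns a failed mapping as a result
    # instead of raising it, so the other results are still delivered.
    return await asyncio.gather(*tasks, return_exceptions=True)
```

`asyncio.gather` returns results in the order the awaitables were passed, so each result lines up with its entry in `template_ids`.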

Complexity Improvement

Previous:
T × F LLM calls

New:
1 + T LLM calls

Example:
Previous: 5 templates × 10 fields = 50 calls
New: 1 + 5 = 6 calls
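
The arithmetic can be checked with two small helpers. The function names are made up for illustration, and the "before" count assumes every template has the same number of fields, as in the example above.

```python
def llm_calls_before(templates: int, fields_per_template: int) -> int:
    """One LLM call per field per template: T x F."""
    return templates * fields_per_template


def llm_calls_after(templates: int) -> int:
    """One canonical extraction plus one mapping call per template: 1 + T."""
    return 1 + templates


before = llm_calls_before(5, 10)  # 50
after = llm_calls_after(5)        # 6
```

Under this model the new count no longer depends on the number of fields per template.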

Concurrency

Mapping calls executed concurrently
PDF fills executed in thread pool executor
Event loop never blocked
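
A minimal sketch of the executor pattern, assuming a blocking PDF writer: `write_pdf`, `fill_pdfs`, the output file names, and the pool size are placeholders rather than the PR's real functions. The blocking work runs in worker threads while the coroutine awaits the results.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def write_pdf(template_id: int, fields: dict) -> str:
    """Blocking stand-in for PDF writing."""
    time.sleep(0.05)  # simulate blocking file I/O
    return f"output_{template_id}.pdf"


async def fill_pdfs(mapped: dict[int, dict]) -> list[str]:
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [
            loop.run_in_executor(pool, write_pdf, template_id, fields)
            for template_id, fields in mapped.items()
        ]
        # Awaiting here yields control to the event loop while threads do the work.
        return await asyncio.gather(*futures)
```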

New Endpoints

POST `/forms/fill/batch`

One transcript
Multiple templates
Returns all output paths
Returns batch_id

GET `/forms/batches/{id}`

Per-template success/failure
Batch state

GET `/forms/batches/{id}/audit`

Canonical extraction
Evidence quotes
Confidence levels
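
As a rough illustration of what an audit view could be built from, the sketch below flattens a canonical record into per-field evidence entries. The function name `build_evidence`, the entry keys, and the `quote_found` substring check are assumptions added for illustration; the PR does not describe how its evidence builder works internally.

```python
def build_evidence(canonical: dict, transcript: str) -> list[dict]:
    """Flatten a canonical record into per-field audit entries (hypothetical shape)."""
    entries = []
    for name, item in canonical.items():
        quote = item.get("evidence_quote")
        entries.append({
            "field": name,
            "value": item.get("value"),
            "evidence_quote": quote,
            "confidence": item.get("confidence"),
            # Assumed extra check: does the quote appear verbatim in the transcript?
            "quote_found": bool(quote) and quote in transcript,
        })
    return entries
```

A check like `quote_found` would let a reviewer spot quotes that cannot be located in the source transcript.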

Partial Failure Handling

One template failure does not abort batch
Status states:
- complete
- partial
- failed

Callers never lose successful fills because of an unrelated template error.
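
The three states could be derived from per-template outcomes along these lines. The rule used here (all succeeded means complete, none succeeded means failed, anything else means partial) is inferred from the state names and is an assumption, as is the helper name `batch_status`.

```python
def batch_status(succeeded: list[bool]) -> str:
    """Derive the batch state from one success flag per template."""
    if not succeeded:
        # Assumes empty batches are rejected by input validation before this point.
        raise ValueError("a batch must contain at least one template")
    if all(succeeded):
        return "complete"
    if not any(succeeded):
        return "failed"
    return "partial"
```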

Database Changes

New BatchSubmission table:

batch_id
template_ids
status
per-template outputs
canonical extraction JSON
evidence fields
created_at

Repository additions:

create_batch
get_batch
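
The record and the two repository functions might look like the sketch below. A plain dataclass and an in-memory dict stand in for the real table and database session, and the field names and types are guesses based on the column list above, so treat this as a shape illustration only.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class BatchSubmission:
    """Stand-in for the BatchSubmission table (field names are assumptions)."""
    template_ids: list[int]
    status: str                            # "complete" | "partial" | "failed"
    outputs: dict[int, Optional[str]]      # per-template output path, None on failure
    canonical_extraction: dict             # stored as JSON in the real table
    evidence: dict
    batch_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class InMemoryBatchRepository:
    """Toy repository mirroring create_batch / get_batch."""

    def __init__(self) -> None:
        self._rows: dict[str, BatchSubmission] = {}

    def create_batch(self, batch: BatchSubmission) -> BatchSubmission:
        self._rows[batch.batch_id] = batch
        return batch

    def get_batch(self, batch_id: str) -> Optional[BatchSubmission]:
        # Returning None lets a route translate a missing batch into a 404.
        return self._rows.get(batch_id)
```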

Testing

24 new tests
Total: 38 passing

Coverage includes:

Single-template success
Multi-template success
Partial failure
All-failed case
Audit endpoint validation
Evidence content validation
404 handling
Input validation (empty, duplicates, limits)
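
The input-validation cases (empty, duplicates, limits) could be covered by a check like the one below. The helper name, the limit of 20, and the choice to reject duplicates rather than silently deduplicate are all assumptions; the PR does not state the actual limit or behavior.

```python
MAX_TEMPLATES = 20  # assumed limit; the real value is not stated in the PR


def validate_template_ids(template_ids: list[int]) -> list[int]:
    """Reject empty, duplicated, or oversized template lists (hypothetical rules)."""
    if not template_ids:
        raise ValueError("template_ids must not be empty")
    if len(set(template_ids)) != len(template_ids):
        raise ValueError("template_ids must not contain duplicates")
    if len(template_ids) > MAX_TEMPLATES:
        raise ValueError(f"at most {MAX_TEMPLATES} templates per batch")
    return template_ids
```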
Unit tests for evidence builder

Operational Significance

This PR:

Removes redundant extraction cost
Enables true multi-agency filing
Introduces per-field evidence attribution (verbatim transcript quotes) for audit and legal review
Reduces inference complexity
Improves scalability
Aligns FireForm with production emergency-services requirements
Establishes architectural clarity between extraction, mapping, and presentation

Given that multi-agency filing is the primary use case FireForm is designed to solve, resolving this redundancy and audit gap is foundational for production readiness.

…tribution

The core promise of FireForm is 'report once, file everywhere'. This commit delivers that promise at the API level.

## What this adds

### POST /forms/fill/batch

One incident transcript -> fill N agency PDFs in a single request.

Extraction complexity: O(T*F) LLM calls -> O(1 + T) LLM calls
(T templates, F fields each; e.g. 5 agencies x 10 fields = 50 -> 6 calls)

Pipeline:
1. Single canonical LLM extraction pass (all incident data, one call)
2. Concurrent template-mapping passes via asyncio.gather() (T fast calls)
3. Concurrent PDF fills via loop.run_in_executor() (no event-loop blocking)
4. Per-template FormSubmission records + one BatchSubmission audit record
5. Partial failure tolerance: some PDFs can fail without aborting others

### GET /forms/batches/{id}

Lightweight status check: per-template output paths, success/failure counts.

### GET /forms/batches/{id}/audit [legal compliance endpoint]

Full canonical extraction for chain-of-custody verification. Every extracted field carries the verbatim transcript quote used as evidence, allowing supervisors and legal teams to trace every value in every filed form back to a specific statement in the original incident recording.

## New files

- src/extractor.py: IncidentExtractor class (canonical + mapping)
- api/routes/batch.py: Three new endpoints
- api/schemas/batch.py: BatchFill, TemplateResult, EvidenceField, ...
- tests/test_batch.py: 24 tests (all passing)

## Modified files

- api/db/models.py: BatchSubmission SQLModel table
- api/db/repositories.py: create_batch / get_batch CRUD
- api/main.py: Register batch router; add OpenAPI metadata
- src/filler.py: Add fill_form_with_data() (name-based field fill)
- tests/conftest.py: Import FillJob + BatchSubmission for test DB

38 tests pass.

Acuspeedster added 4 commits March 1, 2026 17:24

feat(tests): register AppError exception handler in FastAPI app and template tests with detailed assertions and mock setups (05af000)

feat(llm): implement batch processing for LLM field extraction to reduce API calls and improve performance (42b51f0)

feat: add pytest configuration and lazy import for commonforms in FileManipulator (f3a5654)

Acuspeedster wants to merge 4 commits into fireform-core:main from Acuspeedster:feat/batch-fill-canonical-extraction
