This repository was archived by the owner on Jan 26, 2025. It is now read-only.

Description
We need complete end-to-end documentation for running dstlr on a single node:
- Ingesting Washington Post into Solr.
- Running extraction on a subset of the docs. (I understand that extraction over the entire corpus might be unrealistic on a single node.)
- Running enrichment.
- Running sample data cleaning queries.
We have parts here and there already, but I'd like documentation down to the level of "copy and paste these commands into a shell", and it should just work.
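As one possible building block for such a walkthrough, a small sanity check could confirm that the ingestion step actually put documents into Solr before extraction is attempted. The sketch below is only an illustration, not part of dstlr: it assumes Solr is reachable at `http://localhost:8983` and that the Washington Post documents live in a core named `wapo` (a hypothetical name), and it uses only Solr's standard `/select` handler with `rows=0` to read the `numFound` count.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def select_url(base_url, core, query="*:*", rows=0):
    """Build a URL for Solr's standard /select handler."""
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base_url.rstrip('/')}/solr/{core}/select?{params}"


def num_found(response_text):
    """Extract the match count from a Solr JSON select response."""
    return json.loads(response_text)["response"]["numFound"]


def count_docs(base_url, core):
    """Ask a running Solr instance how many documents the core holds."""
    with urlopen(select_url(base_url, core)) as resp:
        return num_found(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Assumed values: adjust the host and the hypothetical core name "wapo".
    print(count_docs("http://localhost:8983", "wapo"))
```

A count of zero after ingestion would suggest that the indexing step did not complete, which is worth catching before running extraction on a subset.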