Kramerius OCR getter

Get the OCR text of a book hosted on a Kramerius site. Experimental: Kramerius doesn't work half of the time 😿

Requirements

pip install -r requirements.txt

Setup

Rename .env-example to .env
Put in your cookie from Kramerius. The script works without it, but with a cookie you can also download works unavailable on the market ("díla nedostupná na trhu").

Getting a cookie from Kramerius

Log in to ndk.cz with your account (for example, a university account)
Right-click the page and choose Inspect
In the inspector window, select Storage
Select Cookies
Copy the name of the shibsession cookie into the cookie name field in .env.
Copy the value of the shibsession cookie into the cookie value field in .env.
Save
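
To illustrate what the saved .env ends up providing, here is a minimal sketch of reading it and building a cookie dictionary. The key names COOKIE_NAME and COOKIE_VALUE are assumptions made for this example; the real keys are whatever .env-example defines, and Kramerius.py may load them differently.

```python
from pathlib import Path


def load_env(path=".env"):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def cookie_from_env(env):
    """Return {cookie_name: cookie_value}, or {} when no cookie is configured."""
    # COOKIE_NAME / COOKIE_VALUE are hypothetical keys; check .env-example.
    name = env.get("COOKIE_NAME", "")
    value = env.get("COOKIE_VALUE", "")
    return {name: value} if name and value else {}
```

A dictionary in this shape is what HTTP clients such as requests accept through their cookies argument. An empty dictionary corresponds to running without a cookie.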

I don't know how long the cookie persists. This needs more testing.

Usage

Download Kramerius.py
Make it executable with chmod +x Kramerius.py
Run ./Kramerius.py "Link to your book"

You can specify the output file with the --o flag.
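
For orientation, the command line described in this section could be modelled with argparse roughly as follows. This is only a sketch: the attribute names output and continue_from and the default output.txt are assumptions for the example, not taken from Kramerius.py. The --c flag is the continuation flag this README also mentions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented usage: link, --o, --c."""
    parser = argparse.ArgumentParser(
        description="Fetch OCR text for a book on a Kramerius instance."
    )
    parser.add_argument("link", help="link to your book")
    # Default file name is an assumption for this sketch.
    parser.add_argument("--o", dest="output", default="output.txt",
                        help="output file")
    parser.add_argument("--c", dest="continue_from", default=None,
                        help="uuid to continue the download from")
    return parser
```

Calling build_parser().parse_args() then yields the link, the chosen output file, and an optional uuid to resume from.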

If your download fails (Kramerius is tricky), the script returns the uuid on which it failed. You can then run ./Kramerius.py "link" --c "uuid" and it will continue the download into output_continuation.txt. Then run cat output_continuation.txt >> output.txt to join them.
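
If cat is not available (for example on Windows), the joining step can be done in Python instead. The sketch below appends one file to the end of another; the function name append_file is made up for this example.

```python
import shutil


def append_file(continuation, target):
    """Append the bytes of `continuation` to the end of `target`.

    Equivalent in effect to: cat continuation >> target
    """
    with open(target, "ab") as dst, open(continuation, "rb") as src:
        shutil.copyfileobj(src, dst)
```

For the file names used above, that would be append_file("output_continuation.txt", "output.txt").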

That's it! Download what you need :)

Supported Kramerius instances

-> ndk.cz
-> Moravská zemská knihovna

Considered support for:

-> kramerius.lib.cas.cz
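
As a rough illustration of how a link could be checked against the lists above before starting a download, here is a small sketch. It is not how Kramerius.py decides anything. Only the two hostnames that appear in this README are included; the hostname for Moravská zemská knihovna is not given here, so it is left out rather than guessed.

```python
from urllib.parse import urlparse

# Hostnames taken from this README only; extend as needed.
SUPPORTED_HOSTS = {"ndk.cz"}
CONSIDERED_HOSTS = {"kramerius.lib.cas.cz"}


def classify_instance(url):
    """Return "supported", "considered", or "unknown" for a book link."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host in SUPPORTED_HOSTS:
        return "supported"
    if host in CONSIDERED_HOSTS:
        return "considered"
    return "unknown"
```

The link must include the scheme (https://...); without it, urlparse finds no hostname and the link is reported as unknown.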

Experimental

If you have gTTS installed (pip install gTTS), you can use tts.py to generate a "quick" audiobook from your file.

Simply call tts.py "nameofyourfile.txt" and in a few minutes you will have a listenable file.
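
For reference, converting a text file to speech with gTTS takes only a few lines. The sketch below is not the actual script in this repository: the function names are invented for the example, and the default language "cs" (Czech) is an assumption based on the libraries listed above. gTTS sends the text to Google's text-to-speech service, so it needs a network connection.

```python
from pathlib import Path


def audio_path_for(text_path):
    """Return the .mp3 path that sits next to the input text file."""
    return Path(text_path).with_suffix(".mp3")


def text_file_to_mp3(text_path, lang="cs"):
    """Read a UTF-8 text file and save it as spoken audio with gTTS."""
    # Imported here so audio_path_for works even without gTTS installed.
    from gtts import gTTS

    text = Path(text_path).read_text(encoding="utf-8")
    out = audio_path_for(text_path)
    gTTS(text=text, lang=lang).save(str(out))
    return out
```

Calling text_file_to_mp3("nameofyourfile.txt") would write nameofyourfile.mp3 in the same folder.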

Let me know if anything breaks.

ridlees/Kramerius-text-scrapper
