Releases: Hk669/bpetokenizer
v1.2.1
06 Jun 18:34
What's Changed
feat: start/end timestamps with throughput reported in verbose mode, by @Hk669 in #10
Updates for the pretrained tokenizers, by @Hk669 in #11
Full Changelog: v1.2.0...v1.2.1
v1.2.0
05 Jun 15:15
v1.0.4
05 Jun 15:01
v1.0.32
29 May 15:00
Full Changelog: v1.0.31...v1.0.32
Added the min_frequency hyperparameter, which skips merging pairs that occur fewer times than the threshold, avoiding extra vocab entries; the default is 2.
Made some changes to the tests.
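The effect of min_frequency can be sketched with a minimal pair-counting loop (an illustrative stand-in, not the package's actual code; the helper names are assumptions):

```python
from collections import Counter

def count_pairs(ids):
    """Count every adjacent token-id pair in the sequence."""
    return Counter(zip(ids, ids[1:]))

def best_merge(ids, min_frequency=2):
    """Return the most frequent adjacent pair, or None when no pair
    reaches min_frequency -- in that case no merge is recorded and
    no extra vocab entry is created."""
    counts = count_pairs(ids)
    if not counts:
        return None
    pair = max(counts, key=counts.get)
    return pair if counts[pair] >= min_frequency else None

print(best_merge([1, 2, 1, 2, 3]))             # (1, 2) occurs twice, so it merges
print(best_merge([1, 2, 3], min_frequency=2))  # every pair is unique: None
```

With min_frequency=1 (no threshold), every training step would produce a merge even for pairs seen once, inflating the vocab with entries that rarely fire at encode time.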
v1.0.31
29 May 08:42
Full Changelog: v1.0.3...v1.0.31
Added a token-visibility feature so developers can view how their tokens are split, as well as the text chunks produced by the split pattern.
Added more samples.
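The chunk-splitting step can be illustrated with a simplified pattern (GPT-style split patterns typically need the third-party `regex` module for `\p{L}` classes; the stdlib-only pattern below is an assumed stand-in, not the package's actual pattern):

```python
import re

# Simplified stand-in for a GPT-style split pattern: letter runs,
# digit runs, punctuation runs, and whitespace runs become chunks.
SIMPLE_PATTERN = r"[A-Za-z]+|\d+|[^\sA-Za-z\d]+|\s+"

def show_chunks(text):
    """Return (and print) the text chunks the pattern produces,
    mimicking a verbose token-visibility view."""
    chunks = re.findall(SIMPLE_PATTERN, text)
    for chunk in chunks:
        print(repr(chunk))
    return chunks

show_chunks("Hello, world 123!")
# 'Hello'  ','  ' '  'world'  ' '  '123'  '!'
```

Splitting into chunks first keeps BPE merges from crossing word or punctuation boundaries.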
v1.0.3
28 May 08:28
Added a mode parameter to the save and load methods so developers can save and load the tokenizer's vocab and merges in their desired format.
Full Changelog: v1.0.21...v1.0.3
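A format-switching save/load pair can be sketched as follows (the mode values, on-disk layout, and function names here are assumptions for illustration, not the package's exact API):

```python
import json
import os
import tempfile

def save_tokenizer(vocab, merges, path, mode="json"):
    """Save vocab and merges; mode="json" writes one JSON file,
    mode="txt" writes merges as plain-text lines (layouts assumed)."""
    if mode == "json":
        data = {"vocab": vocab,
                "merges": [[a, b, idx] for (a, b), idx in merges.items()]}
        with open(path, "w", encoding="utf-8") as f:
            json.dump(data, f)
    elif mode == "txt":
        with open(path, "w", encoding="utf-8") as f:
            for (a, b), idx in merges.items():
                f.write(f"{a} {b} {idx}\n")
    else:
        raise ValueError(f"unsupported mode: {mode!r}")

def load_tokenizer(path, mode="json"):
    """Inverse of save_tokenizer for mode="json"."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    merges = {(a, b): idx for a, b, idx in data["merges"]}
    return data["vocab"], merges

vocab = {"256": "ab"}      # illustrative vocab entry
merges = {(97, 98): 256}   # pair (a, b) merged into token 256
path = os.path.join(tempfile.mkdtemp(), "tokenizer.json")
save_tokenizer(vocab, merges, path, mode="json")
loaded_vocab, loaded_merges = load_tokenizer(path)
```

Note the merges dict is converted to a list of triples before serializing, since JSON object keys must be strings and tuple keys would fail.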
v1.0.2
27 May 20:43
Build working correctly, ensuring the upload to PyPI works.
v1.0.10
27 May 17:44
Testing the automatic PyPI package upload.
v1.0.1
27 May 17:28
First release. Adds the following functionality:
BPETokenizer: build your own tokenizer for an LLM
Tokenizer: a base class that handles saving and loading of the vocab and merges
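What BPETokenizer does can be sketched as a classic byte-pair-merge training loop (a self-contained illustration in the spirit of BPE, not the package's actual implementation; all names are assumptions):

```python
from collections import Counter

def merge(ids, pair, new_id):
    """Replace every occurrence of pair in ids with new_id."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out

def train_bpe(text, num_merges):
    """Learn merges from raw UTF-8 bytes; returns (merges, final ids).
    New token ids start at 256, after the 256 byte values."""
    ids = list(text.encode("utf-8"))
    merges = {}
    for new_id in range(256, 256 + num_merges):
        counts = Counter(zip(ids, ids[1:]))
        if not counts:
            break
        pair = max(counts, key=counts.get)
        merges[pair] = new_id
        ids = merge(ids, pair, new_id)
    return merges, ids

merges, ids = train_bpe("aaabdaaabac", 2)
# first merge: (97, 97) -> 256; second merge: (256, 97) -> 257
```

Each learned merge becomes one vocab entry, which is why training with a larger num_merges (or vocab_size) yields shorter encoded sequences.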