Open
Description
Python -VV
Python 3.12.10 (main, Apr 9 2025, 04:11:22) [Clang 20.1.0 ]
Pip Freeze
accelerate==1.12.0
aiohappyeyeballs==2.6.1
aiohttp==3.11.16
aiosignal==1.3.2
anaconda-anon-usage @ file:///private/var/folders/sy/f16zz6x50xz3113nwtb9bvq00000gp/T/abs_9abq1vdv2c/croot/anaconda-anon-usage_1732732441000/work
annotated-types @ file:///private/var/folders/c_/qfmhj66j0tn016nkx_th4hxm0000gp/T/abs_fbh4enbns2/croot/annotated-types_1709542919423/work
anthropic==0.8.1
antlr4-python3-runtime==4.9.3
anyio==4.9.0
archspec @ file:///croot/archspec_1709217642129/work
attrs==25.3.0
beautifulsoup4==4.14.3
boltons @ file:///private/var/folders/c_/qfmhj66j0tn016nkx_th4hxm0000gp/T/abs_cc11jmhstw/croot/boltons_1737061711194/work
Brotli @ file:///private/var/folders/sy/f16zz6x50xz3113nwtb9bvq00000gp/T/abs_d7pp3g74_g/croot/brotli-split_1736182638718/work
certifi @ file:///private/var/folders/sy/f16zz6x50xz3113nwtb9bvq00000gp/T/abs_d0mlk2yciq/croot/certifi_1738623741969/work/certifi
cffi @ file:///private/var/folders/c_/qfmhj66j0tn016nkx_th4hxm0000gp/T/abs_51d1gdg4kr/croot/cffi_1736183297412/work
charset-normalizer @ file:///croot/charset-normalizer_1721748349566/work
click==8.3.1
colorlog==6.10.1
conda @ file:///private/var/folders/sy/f16zz6x50xz3113nwtb9bvq00000gp/T/abs_a6jlf16vzv/croot/conda_1738168381330/work
conda-anaconda-telemetry @ file:///private/var/folders/sy/f16zz6x50xz3113nwtb9bvq00000gp/T/abs_cfomk06nc8/croot/conda-anaconda-telemetry_1736524617095/work
conda-anaconda-tos @ file:///private/var/folders/c_/qfmhj66j0tn016nkx_th4hxm0000gp/T/abs_ddgaun3j93/croot/conda-anaconda-tos_1739299002132/work
conda-content-trust @ file:///private/var/folders/sy/f16zz6x50xz3113nwtb9bvq00000gp/T/abs_78eyko59n2/croot/conda-content-trust_1714483158098/work
conda-libmamba-solver @ file:///croot/conda-libmamba-solver_1737733694612/work/src
conda-package-handling @ file:///private/var/folders/sy/f16zz6x50xz3113nwtb9bvq00000gp/T/abs_b5_vfr2fqi/croot/conda-package-handling_1731369042242/work
conda_package_streaming @ file:///private/var/folders/sy/f16zz6x50xz3113nwtb9bvq00000gp/T/abs_58icnlhvbg/croot/conda-package-streaming_1731366295477/work
cryptography @ file:///private/var/folders/c_/qfmhj66j0tn016nkx_th4hxm0000gp/T/abs_97qpgd2gkx/croot/cryptography_1732131165709/work
curl_cffi==0.14.0
dataclasses-json==0.6.7
dill==0.4.1
discord==2.3.2
discord-protos==0.0.2
discord.py==2.5.2
discord.py-self==2.1.0
distro @ file:///private/var/folders/c_/qfmhj66j0tn016nkx_th4hxm0000gp/T/abs_b5l_bzm_c4/croot/distro_1714488255954/work
docling==2.70.0
docling-core==2.60.2
docling-ibm-models==3.11.0
docling-mcp==1.3.4
docling-parse==4.7.3
et_xmlfile==2.0.0
Faker==40.1.2
filelock==3.18.0
filetype==1.2.0
frozendict @ file:///private/var/folders/sy/f16zz6x50xz3113nwtb9bvq00000gp/T/abs_f2kfyv072k/croot/frozendict_1713194840232/work
frozenlist==1.5.0
fsspec==2025.7.0
google-api-core==2.29.0
google-api-python-client==2.188.0
google-auth==2.48.0
google-auth-httplib2==0.3.0
google-genai==1.60.0
googleapis-common-protos==1.72.0
greenlet==3.3.1
h11==0.16.0
hf-xet==1.1.5
httpcore==1.0.9
httplib2==0.31.2
httpx==0.28.1
httpx-sse==0.4.3
huggingface-hub==0.34.3
idna @ file:///private/var/folders/c_/qfmhj66j0tn016nkx_th4hxm0000gp/T/abs_2b_jn555_n/croot/idna_1714398852258/work
Jinja2==3.1.6
jsonlines==4.0.0
jsonpatch @ file:///private/var/folders/sy/f16zz6x50xz3113nwtb9bvq00000gp/T/abs_9dcqemvl4v/croot/jsonpatch_1714483445583/work
jsonpointer==2.1
jsonref==1.1.0
jsonschema==4.26.0
jsonschema-specifications==2025.9.1
langchain==1.2.7
langchain-classic==1.0.1
langchain-community==0.4.1
langchain-core==1.2.7
langchain-google-genai==4.2.0
langchain-text-splitters==1.1.0
langgraph==1.0.7
langgraph-checkpoint==4.0.0
langgraph-prebuilt==1.0.7
langgraph-sdk==0.3.3
langsmith==0.6.5
latex2mathml==3.78.1
libmambapy @ file:///private/var/folders/c_/qfmhj66j0tn016nkx_th4hxm0000gp/T/abs_bd_xm961h4/croot/mamba-split_1734469689868/work/libmambapy
lxml==6.0.2
markdown-it-py @ file:///Users/builder/cbouss/perseverance-python-buildout/croot/markdown-it-py_1699239904229/work
marko==2.2.2
MarkupSafe==3.0.3
marshmallow==3.26.2
mcp==1.26.0
mdurl @ file:///Users/builder/cbouss/perseverance-python-buildout/croot/mdurl_1699239487349/work
menuinst @ file:///private/var/folders/c_/qfmhj66j0tn016nkx_th4hxm0000gp/T/abs_27kvagn684/croot/menuinst_1738945388149/work
mpire==2.10.2
mpmath==1.3.0
multidict==6.3.2
multiprocess==0.70.19
mypy_extensions==1.1.0
networkx==3.6.1
numpy==2.4.1
ocrmac==1.0.1
omegaconf==2.3.0
opencv-python==4.13.0.90
openpyxl==3.1.5
orjson==3.11.5
ormsgpack==1.12.2
packaging @ file:///private/var/folders/c_/qfmhj66j0tn016nkx_th4hxm0000gp/T/abs_15t4xe1fp0/croot/packaging_1734472125760/work
pandas==2.3.3
pillow==11.3.0
platformdirs @ file:///Users/builder/cbouss/perseverance-python-buildout/croot/platformdirs_1701805067573/work
pluggy @ file:///private/var/folders/sy/f16zz6x50xz3113nwtb9bvq00000gp/T/abs_70ykkrsb12/croot/pluggy_1733169619735/work
polyfactory==3.2.0
propcache==0.3.1
proto-plus==1.27.0
protobuf==6.33.4
psutil==7.2.1
pyasn1==0.6.2
pyasn1_modules==0.4.2
pyclipper==1.4.0
pycosat @ file:///private/var/folders/sy/f16zz6x50xz3113nwtb9bvq00000gp/T/abs_67p9tvgx9q/croot/pycosat_1736868714508/work
pycparser @ file:///tmp/build/80754af9/pycparser_1636541352034/work
pydantic==2.12.5
pydantic-settings==2.12.0
pydantic_core==2.41.5
Pygments @ file:///Users/builder/cbouss/perseverance-python-buildout/croot/pygments_1699240212223/work
PyJWT==2.10.1
pylatexenc==2.10
pyobjc-core==12.1
pyobjc-framework-Cocoa==12.1
pyobjc-framework-CoreML==12.1
pyobjc-framework-Quartz==12.1
pyobjc-framework-Vision==12.1
pyparsing==3.3.2
pypdfium2==5.3.0
PySocks @ file:///Users/builder/cbouss/perseverance-python-buildout/croot/pysocks_1699239289103/work
python-dateutil==2.9.0.post0
python-docx==1.2.0
python-dotenv==1.1.1
python-multipart==0.0.22
python-pptx==1.0.2
python-telegram-bot==22.6
pytz==2025.2
PyYAML==6.0.2
rapidocr==3.5.0
referencing==0.37.0
regex==2026.1.15
requests==2.32.5
requests-toolbelt==1.0.0
rich @ file:///private/var/folders/sy/f16zz6x50xz3113nwtb9bvq00000gp/T/abs_4dgbejjrrq/croot/rich_1732638982215/work
rpds-py==0.30.0
rsa==4.9.1
rtree==1.4.1
ruamel.yaml @ file:///private/var/folders/c_/qfmhj66j0tn016nkx_th4hxm0000gp/T/abs_a0tb7e252p/croot/ruamel.yaml_1727980165152/work
ruamel.yaml.clib @ file:///private/var/folders/c_/qfmhj66j0tn016nkx_th4hxm0000gp/T/abs_47oq900ogu/croot/ruamel.yaml.clib_1727769824325/work
safetensors==0.7.0
scipy==1.17.0
semchunk==2.2.2
setuptools==75.8.0
shapely==2.1.2
shellingham==1.5.4
six==1.17.0
sniffio==1.3.1
soupsieve==2.8.3
SQLAlchemy==2.0.46
sse-starlette==3.2.0
starlette==0.52.1
sympy==1.14.0
tabulate==0.9.0
tenacity==9.1.2
tokenizers==0.22.2
torch==2.2.2
torchvision==0.17.2
tqdm @ file:///private/var/folders/c_/qfmhj66j0tn016nkx_th4hxm0000gp/T/abs_b8a_tjze9j/croot/tqdm_1738946058169/work
transformers==4.57.6
tree-sitter==0.25.2
tree-sitter-c==0.24.1
tree-sitter-javascript==0.25.0
tree-sitter-python==0.25.0
tree-sitter-typescript==0.23.2
truststore @ file:///private/var/folders/sy/f16zz6x50xz3113nwtb9bvq00000gp/T/abs_16_69k8s1i/croot/truststore_1736550123387/work
typer==0.19.2
typing-inspect==0.9.0
typing-inspection==0.4.2
typing_extensions==4.15.0
tzdata==2025.3
tzlocal==5.3.1
uritemplate==4.2.0
urllib3 @ file:///private/var/folders/sy/f16zz6x50xz3113nwtb9bvq00000gp/T/abs_8bqv7goib8/croot/urllib3_1737133637259/work
uuid_utils==0.14.0
uvicorn==0.40.0
websockets==15.0.1
wheel==0.45.1
xlsxwriter==3.2.9
xxhash==3.6.0
yarl==1.18.3
zstandard @ file:///private/var/folders/sy/f16zz6x50xz3113nwtb9bvq00000gp/T/abs_65utj9q8ya/croot/zstandard_1731360545821/work
Reproduction Steps
- I'm processing a PDF that contains multiple tables; some have 40+ rows, proper headers, etc.
- I run this call:

    response = client.ocr.process(
        model="mistral-ocr-latest",
        document={"type": "document_url", "document_url": document_url},
        pages=page_range,
        document_annotation_format=annotation_format,
        document_annotation_prompt=DOCUMENT_ANNOTATION_PROMPT,
        bbox_annotation_format=bbox_format,
        include_image_base64=(
            str(os.environ.get("MISTRAL_INCLUDE_IMAGE_BASE64", "true")).lower() == "true"
        ),
        table_format=os.environ.get("MISTRAL_TABLE_FORMAT", "html"),
    )

- I can't share the schema, but it has a "notes" field that lets the AI write a note about its extraction process.
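To check that the raw response really carries the full tables, the table HTML can be summarized with the standard library. This is only a minimal sketch: it assumes the table HTML strings have already been pulled out of the raw response (how to reach them is not shown here), and `TableRowCounter` and `summarize_table` are hypothetical helper names, not part of any SDK.

```python
from html.parser import HTMLParser


class TableRowCounter(HTMLParser):
    """Counts <tr> and <th> start tags in an HTML fragment (hypothetical helper)."""

    def __init__(self) -> None:
        super().__init__()
        self.rows = 0
        self.header_cells = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this hook.
        if tag == "tr":
            self.rows += 1
        elif tag == "th":
            self.header_cells += 1


def summarize_table(html: str) -> dict[str, int]:
    """Return the number of rows and header cells found in one table's HTML."""
    counter = TableRowCounter()
    counter.feed(html)
    counter.close()
    return {"rows": counter.rows, "header_cells": counter.header_cells}
```

Running `summarize_table` on each table string gives a quick way to compare the row count in the raw response with the number of entries in the structured output.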
Expected Behavior
I expect all the data in the tables to be extracted into the structured output. The table fields in the raw API response represent the tables correctly, with colspan, rowspan, headers, the data in each row, etc. However, when using document annotation, the table data does not seem to be passed to the model that creates the structured output (I assume it is one of your models). In the notes field it keeps saying: "Land data table not found in the document. Only one land parcel entry created based on the text description." or sometimes "[tbl-0.html] data is not provided in the text".
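The "[tbl-0.html]" note suggests the annotation model sees only a placeholder and not the table body. As a possible client-side workaround, the placeholders could be substituted with the table HTML before the text goes to a separate structured-output step. This is a rough sketch under assumptions: the placeholder shape (`[tbl-0.html]`, optionally followed by `(tbl-0.html)`) is inferred from the note text, and the `tables` mapping of id to HTML is something the caller would have to build from the raw response. `inline_tables` and `missing_tables` are hypothetical names.

```python
import re

# Assumption: placeholders look like "[tbl-0.html](tbl-0.html)" or a bare "[tbl-0.html]".
PLACEHOLDER = re.compile(r"\[(tbl-\d+\.html)\](?:\(\1\))?")


def inline_tables(markdown: str, tables: dict[str, str]) -> str:
    """Replace each table placeholder with its HTML; unknown ids are left untouched."""

    def substitute(match: re.Match) -> str:
        return tables.get(match.group(1), match.group(0))

    return PLACEHOLDER.sub(substitute, markdown)


def missing_tables(markdown: str, tables: dict[str, str]) -> list[str]:
    """List placeholder ids that appear in the markdown but have no HTML in `tables`."""
    return [
        match.group(1)
        for match in PLACEHOLDER.finditer(markdown)
        if match.group(1) not in tables
    ]
```

`missing_tables` is there to flag the failure case explicitly, so that a placeholder without a table body is reported before any extraction runs.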
Additional Context
No response
Suggested Solutions
No response