feat: Change OCR to multi-method model #716

benITo47 · 2026-01-14T09:13:05Z

Description

This PR refactors OCR and VerticalOCR to reduce the number of model instances that have to be loaded.

Previously, our implementation relied on multiple instances of the same detector and recognizer models, each handling a different input size (3x Detector, 4x Recognizer instances). This approach was resource-intensive.

This update introduces a more streamlined approach by using a single detector and a single recognizer model, each with multiple forward_ methods (e.g., forward_800, forward_320). These methods handle different input widths within the same model instance, significantly reducing the number of loaded models and simplifying the API.
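A minimal sketch of how an input width could be mapped to a method name under this scheme. Only the `forward_<width>` naming pattern and the widths 800 and 320 come from the description above; the helper `methodNameForWidth` and the `kSupportedWidths` list are hypothetical and not taken from the PR.

```cpp
#include <array>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical list of widths exported by the model. 320 and 800 are the
// examples named in the description; the real set may differ.
constexpr std::array<int32_t, 2> kSupportedWidths = {320, 800};

// Hypothetical helper: returns "forward_<width>" for a supported width and
// throws for anything else, so a bad width fails before inference starts.
inline std::string methodNameForWidth(int32_t inputWidth) {
  for (int32_t width : kSupportedWidths) {
    if (width == inputWidth) {
      return "forward_" + std::to_string(inputWidth);
    }
  }
  throw std::invalid_argument("Unsupported input width: " +
                              std::to_string(inputWidth));
}
```

With a helper like this, the caller picks a width and a single model instance serves every supported size.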

This is a breaking change: it modifies the arguments of useOCR, useVerticalOCR, OCRModule, and VerticalOCRModule.

Introduces a breaking change?

Yes
No

Type of change

Bug fix (change which fixes an issue)
New feature (change which adds functionality)
Documentation update (improves or adds clarity to existing documentation)
Other (chores, tests, code style improvements etc.)

Tested on

iOS
Android

Testing instructions

Manual sanity checks.

Screenshots

Related issues

#692

Checklist

I have performed a self-review of my code
I have commented my code, particularly in hard-to-understand areas
I have updated the documentation accordingly
My changes generate no new warnings

Additional notes

Refactor of TypeScript interfaces and hooks for OCR and VerticalOCR to support models that expose multiple inference methods for different input sizes. This commit simplifies the current setup by allowing a single detector source and a single recognizer source, rather than requiring separate entries for different input sizes.

…er model Adapts the C++ Recognition controllers to handle a single recognizer file that contains multiple inference methods.

This commit adapts the C++ OCR and VerticalOCR controllers to handle a single detector model with multiple inference methods.

msluszniak · 2026-01-14T09:37:51Z

The C++ code and docs look good; someone needs to review the rest and test it.

IgorSwat

Overall, good code.

IgorSwat · 2026-01-14T09:25:21Z

packages/react-native-executorch/common/rnexecutorch/models/ocr/Detector.h

-  std::vector<types::DetectorBBox> generate(const cv::Mat &inputImage);
-  cv::Size getModelImageSize() const noexcept;
+  std::vector<types::DetectorBBox> generate(const cv::Mat &inputImage,
+                                            const int inputWidth);


The const is unnecessary.
Also, use int32_t (or any other explicit-width integer type you want) instead of int.
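One likely reading of this comment, sketched below: top-level const on a by-value parameter is not part of the function type, so it adds nothing to a declaration in a header. The function `scaleWidth` is a hypothetical stand-in, not code from the PR.

```cpp
#include <cstdint>
#include <type_traits>

// These two declarations name the same function: top-level const on a
// by-value parameter is dropped when the function type is formed.
int32_t scaleWidth(int32_t inputWidth);
int32_t scaleWidth(const int32_t inputWidth); // redeclaration, not an overload

static_assert(
    std::is_same<int32_t(int32_t), int32_t(const int32_t)>::value,
    "top-level const does not change the function signature");

// A definition may still use const to keep the local copy from being
// modified; callers are unaffected either way.
int32_t scaleWidth(const int32_t inputWidth) { return inputWidth / 2; }
```

Using `int32_t` also fixes the parameter's width across platforms, which plain `int` does not guarantee.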

IgorSwat · 2026-01-14T09:31:49Z

packages/react-native-executorch/common/rnexecutorch/models/ocr/Detector.cpp

+  auto inputShapes = getAllInputShapes(methodName);
+  if (inputShapes.empty()) {
+    throw std::runtime_error("Detector model has no input shape for method: " +
+                             methodName);


I would change the error message to something like "Detector model: invalid method name XYZ"
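A rough sketch of the suggested wording, assuming the check stays an "empty shapes" test as in the hunk above. The helper `requireInputShapes`, its `modelLabel` parameter, and the `Shape` alias are hypothetical; the real code obtains the shapes from the loaded model.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for one input tensor shape (hypothetical alias).
using Shape = std::vector<int32_t>;

// Hypothetical helper: throws with the reviewer's suggested wording when a
// method has no input shapes, which here is taken to mean the method name
// is not valid for the model.
inline void requireInputShapes(const std::vector<Shape> &inputShapes,
                               const std::string &modelLabel,
                               const std::string &methodName) {
  if (inputShapes.empty()) {
    throw std::runtime_error(modelLabel + ": invalid method name " +
                             methodName);
  }
}
```

Passing the label as a parameter would let the Detector and the Recognizer share one message format.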

IgorSwat · 2026-01-14T09:41:10Z

packages/react-native-executorch/common/rnexecutorch/models/ocr/Recognizer.cpp


 std::pair<std::vector<int32_t>, float>
-Recognizer::generate(const cv::Mat &grayImage) {
+Recognizer::generate(const cv::Mat &grayImage, int inputWidth) {


int32_t please :)

IgorSwat · 2026-01-14T09:42:03Z

packages/react-native-executorch/common/rnexecutorch/models/ocr/Recognizer.cpp

+  if (shapes.empty()) {
+    throw std::runtime_error(
+        "Recognizer model has no input tensors for method " + method_name);
+  }


I would change the error message to something like "Recognizer model: invalid method name XYZ"

IgorSwat · 2026-01-14T09:42:36Z

packages/react-native-executorch/common/rnexecutorch/models/ocr/Recognizer.h

                      std::shared_ptr<react::CallInvoker> callInvoker);
-  std::pair<std::vector<int32_t>, float> generate(const cv::Mat &grayImage);
+  std::pair<std::vector<int32_t>, float> generate(const cv::Mat &grayImage,
+                                                  int inputWidth);


int32_t please :)

IgorSwat · 2026-01-14T09:46:21Z

packages/react-native-executorch/common/rnexecutorch/models/vertical_ocr/VerticalDetector.cpp

+  modelMediumImageSize =
+      calculateImageSizeForWidth(constants::kMediumDetectorWidth);
+  modelLargeImageSize =
+      calculateImageSizeForWidth(constants::kLargeDetectorWidth);


I recommend either using this-> (as in the line above, e.g. this->modelSmallImageSize) or marking class members with a trailing _ (e.g. modelSmallImageSize_).

That makes it clear which names are class members and which are not.
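A small sketch of the trailing-underscore option. The class `VerticalDetectorSketch`, the `ImageSize` struct, and the square-size helper are hypothetical and exist only to show the naming; they do not mirror the real VerticalDetector.

```cpp
#include <cstdint>

// Hypothetical stand-in for cv::Size, to keep the sketch dependency-free.
struct ImageSize {
  int32_t width;
  int32_t height;
};

class VerticalDetectorSketch {
public:
  explicit VerticalDetectorSketch(int32_t mediumWidth) {
    // The local has no suffix; the member carries a trailing underscore,
    // so the assignment target is visibly a class field.
    const ImageSize mediumSize = sizeForWidth(mediumWidth);
    modelMediumImageSize_ = mediumSize;
  }

  ImageSize mediumImageSize() const { return modelMediumImageSize_; }

private:
  // Assumes a square input purely for illustration.
  static ImageSize sizeForWidth(int32_t width) { return {width, width}; }

  ImageSize modelMediumImageSize_{};
};
```

Either convention works as long as it is applied consistently within the class.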

IgorSwat · 2026-01-14T09:49:15Z

packages/react-native-executorch/common/rnexecutorch/models/vertical_ocr/VerticalDetector.cpp

-      cv::Size(modelImageSize.width / 2, modelImageSize.height / 2));
-  float txtThreshold = this->detectSingleCharacters
+      cv::Size(modelInputSize.width / 2, modelInputSize.height / 2));
+  float txtThreshold = detectSingleCharacters


Why is this-> removed?

IgorSwat · 2026-01-14T09:49:21Z

packages/react-native-executorch/common/rnexecutorch/models/vertical_ocr/VerticalDetector.cpp


  // if this is Narrow Detector, do not group boxes.
-  if (!this->detectSingleCharacters) {
+  if (!detectSingleCharacters) {


Why is this-> removed?

benITo47 added 4 commits January 12, 2026 17:57

[REFACTOR] Change native code to support single multi-method recogniz…

deeac42

…er model Adapts the C++ Recognition controllers to handle a single recognizer file that contains multiple inference methods.

[REFACTOR] Support single multi-method detector on the Native side

043e5ef

This commit adapts the C++ OCR and VerticalOCR controllers to handle a single detector model with multiple inference methods.

[REFACTOR] Update documentation to current state of OCR

bdd64bd

benITo47 requested review from IgorSwat, chmjkb, mkopcins and msluszniak January 14, 2026 09:13

msluszniak assigned benITo47 Jan 14, 2026

msluszniak added the feature PRs that implement a new feature label Jan 14, 2026

msluszniak linked an issue Jan 14, 2026 that may be closed by this pull request

Re-export OCR to use single weights #692

Open

IgorSwat reviewed Jan 14, 2026
