Introduction
This repository hosts the PaddleOCR document helper models (page-orientation
classification, dewarping, and table-structure recognition) for the
React Native ExecuTorch library.
The models are fused into one multi-method .pte per backend for the ExecuTorch runtime
(XNNPACK, CoreML, Vulkan). They are document pre- and post-processing companions to
react-native-executorch-paddleocr,
not an OCR model on their own.
If you'd like to run these models in your own ExecuTorch runtime, refer to the official documentation for setup instructions.
Methods
The fused .pte exposes four methods. Each one is pure tensor→tensor; the client
handles normalization, argmax/softmax, grid sampling, and the decode loop:
| method | source model | input | output | purpose |
|---|---|---|---|---|
| orientation | PP-LCNet doc_ori | [1,3,224,224] | logits[1,4] | page rotation 0 / 90 / 180 / 270° (argmax) |
| dewarp | UVDoc | [1,3,712,488] | grid[1,2,45,31] | sampling grid → grid_sample to unwarp a curved/folded page |
| table_encode | SLANeXt | [1,3,488,488] | feat[1,256,96] | encode a cropped table image (run once) |
| table_decode_step | SLANeXt decoder | (feat[1,256,96], hidden[1,256], onehot[1,50]) | (probs[1,50], hidden[1,256]) | one autoregressive structure-token step |
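
Because the client owns the argmax and the decode loop, a rough sketch of that client-side logic may help. The Python below is illustrative only and is not the library's implementation. It assumes the four orientation logits are ordered 0, 90, 180, 270; that the decoder hidden state starts as zeros; that token 0 is the start token and token 49 the end token; and that 500 steps is a reasonable cap. None of these are stated in this README, so check them against the export scripts. The `step` callable is a hypothetical stand-in for invoking `table_decode_step`, and the batch dimension is dropped for brevity.

```python
from typing import Callable, List, Sequence, Tuple

NUM_TOKENS = 50     # width of onehot / probs, from the method table
HIDDEN_SIZE = 256   # width of hidden, from the method table
ROTATIONS = (0, 90, 180, 270)  # ASSUMED class order of the orientation logits

# Hypothetical wrapper type: (feat, hidden, onehot) -> (probs, new_hidden).
StepFn = Callable[[object, List[float], List[float]],
                  Tuple[Sequence[float], List[float]]]


def argmax(values: Sequence[float]) -> int:
    """Return the index of the largest value (first index wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def orientation_degrees(logits: Sequence[float]) -> int:
    """Map the 4 orientation logits to a rotation in degrees via argmax."""
    return ROTATIONS[argmax(logits)]


def greedy_decode(step: StepFn, feat: object,
                  start_id: int = 0,               # ASSUMED start token
                  end_id: int = NUM_TOKENS - 1,    # ASSUMED end token
                  max_steps: int = 500) -> List[int]:  # ASSUMED step cap
    """Greedy autoregressive loop over a table_decode_step-like callable."""
    hidden = [0.0] * HIDDEN_SIZE  # ASSUMED zero initial hidden state
    token = start_id
    tokens: List[int] = []
    for _ in range(max_steps):
        # Feed the previous token back in as a one-hot vector.
        onehot = [0.0] * NUM_TOKENS
        onehot[token] = 1.0
        probs, hidden = step(feat, hidden, onehot)
        token = argmax(probs)
        if token == end_id:
            break
        tokens.append(token)
    return tokens
```

The encoder output `feat` is computed once by `table_encode` and passed unchanged to every step; only `hidden` and the one-hot token change between iterations. The returned indices still need to be mapped to structure tokens with the model's dictionary, which is not part of this sketch.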
Backends & precision
| backend | target | precision | size |
|---|---|---|---|
| xnnpack | CPU | int8 | ~26 MB |
| coreml | Apple ANE | weight-only int8 | 11.9 MB |
| vulkan | Android GPU | fp16, except table_decode_step → XNNPACK | 23.9 MB |

table_decode_step is always computed in fp32 (autoregressive stability).
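
As a hedged illustration of how the table above might drive a backend choice, the sketch below encodes it as a lookup plus a small selection helper. The `pick_backend` function, its `platform` strings, and the `has_gpu` flag are hypothetical names introduced for this example; they are not part of React Native ExecuTorch. The fallback to XNNPACK is likewise an assumption, based only on it being the CPU target.

```python
# Backend facts copied from the table above.
BACKENDS = {
    "xnnpack": {"target": "CPU", "precision": "int8"},
    "coreml": {"target": "Apple ANE", "precision": "weight-only int8"},
    "vulkan": {"target": "Android GPU", "precision": "fp16"},
}


def pick_backend(platform: str, has_gpu: bool = False) -> str:
    """Hypothetical helper: choose a backend key for a given platform.

    ASSUMPTION: iOS prefers CoreML, Android with a usable GPU prefers
    Vulkan, and everything else falls back to the CPU (XNNPACK) build.
    """
    if platform == "ios":
        return "coreml"
    if platform == "android" and has_gpu:
        return "vulkan"
    return "xnnpack"
```

For example, `pick_backend("android", has_gpu=True)` selects the Vulkan build, while `pick_backend("android")` falls back to XNNPACK.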
Compatibility
If you intend to use these models outside of React Native ExecuTorch, make sure your runtime is
compatible with the ExecuTorch version used to export the .pte files. For more details, see
the compatibility note in the
ExecuTorch GitHub repository.
If you work with React Native ExecuTorch, the library constants guarantee compatibility with the
runtime used behind the scenes.