PP-OCRv6 released on Hugging Face, claims to surpass billion-scale VLMs

According to a new paper, PP-OCRv6 outperforms billion-parameter VLMs on OCR tasks with only 34.5M parameters. PaddleOCR 3.7 also adds transformers and ONNX Runtime backends.

PaddleOCR-VL-1.6: Expanding the Frontier of Document Parsing with Under-Optimized Region Refinement and Progressive Post-Training
28 days ago · Zelun Zhang, Hongen Liu, Suyin Liang, Yubo Zhang, Yiqing Xiang, Jiaxuan Liu, Ting Sun, Manhui Lin, Yue Zhang, Changda Zhou, Tingquan Gao, Cheng Cui, Yi Liu, Dianhai Yu, Yanjun Ma
arXiv cs.CV (computer vision) · Yubo Zhang, Xueqing Wang, Manhui Lin, Yue Zhang, Penglongyi Deng, Ting Sun, Tingquan Gao, Zelun Zhang, Jiaxuan Liu, Changda Zhou, Hongen Liu, Suyin Liang, Cheng Cui, Yi Liu, Dianhai Yu, Yanjun Ma