Multilingual OCR (currently recognises English and Chinese)

m_OCR (multilingual OCR) is a Vision-Encoder-Decoder model, based on the TrOCR concept, that uses Facebook's pre-trained vit-mae-large as the encoder and xlm-roberta-base as the decoder. It was trained on the IAM, SROIE 2019, text_renderer Chinese (synthetic), and TRDG (synthetic) datasets, approximately 1.4 million samples in total, for English and Chinese document text recognition.
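To make the encoder/decoder pairing concrete, here is a minimal sketch of how such a model could be assembled with the Hugging Face `transformers` library. It is not the author's training code: the function name `build_encoder_decoder`, the choice of start and padding tokens, and the step that disables ViT-MAE's patch masking are assumptions made for illustration. Only the two checkpoint names come from the description above.

```python
# Checkpoint names taken from the model description.
ENCODER_ID = "facebook/vit-mae-large"
DECODER_ID = "xlm-roberta-base"


def build_encoder_decoder(encoder_id=ENCODER_ID, decoder_id=DECODER_ID):
    """Pair a vision encoder with a text decoder (hypothetical helper).

    Downloads both pre-trained checkpoints, so it needs network access
    and the `transformers` and `torch` packages.
    """
    # Imported lazily so this file can be loaded without the heavy dependency.
    from transformers import AutoTokenizer, VisionEncoderDecoderModel

    # Loads the decoder as a causal LM and adds cross-attention layers,
    # which are randomly initialised and must be trained.
    model = VisionEncoderDecoderModel.from_encoder_decoder_pretrained(
        encoder_id, decoder_id
    )
    tokenizer = AutoTokenizer.from_pretrained(decoder_id)

    # Assumption: follow the usual TrOCR-style setup for generation.
    model.config.decoder_start_token_id = tokenizer.cls_token_id
    model.config.pad_token_id = tokenizer.pad_token_id

    # Assumption: ViT-MAE masks most patches by default (pre-training
    # behaviour); for recognition we likely want the encoder to see all of them.
    model.encoder.config.mask_ratio = 0.0

    return model, tokenizer


if __name__ == "__main__":
    model, tokenizer = build_encoder_decoder()
    print(type(model.encoder).__name__, type(model.decoder).__name__)
```

A model assembled this way still needs fine-tuning on text-recognition data before it produces useful output; to use the trained m_OCR weights, load the published checkpoint instead.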