Automatic recognition of handwritten medical forms

Bui Minh Triet
Nguyen Nguyen Ha Nhan
Mai Duc Minh
Tran Kim Long

Handwriting recognition has long been an active research problem in document digitization, computer vision, and sequence modelling. Each improvement in recognition quality has opened new ways for developers to bring the technology into daily life. In this project, our goal is to build a system that lets clinicians keep using handwriting in tetanus treatment, for its mobility and speed, while preserving the integrity of the recorded information through a combination of deep learning and image processing techniques.

 

The pipeline consists of five main components:

Paper Detection: A Mask R-CNN model, trained on 400 images, detects the relevant records and separates them from covering paper.


Preprocessing: A combination of CLAHE (contrast-limited adaptive histogram equalisation), Sauvola thresholding and OpenCV denoising is used to normalise the image, preserve handwriting strokes and remove small objects, which helps the Text Detection step.


Text Detection: A Fast R-CNN model, trained on 1000 handwriting line instances, separates useful handwriting lines from stamps, signatures, printed form text and similar elements. The model then crops these lines for the later steps in the pipeline.


Adaptive Preprocessing: The blurriness of each cropped line is measured, and CLAHE, Sauvola thresholding and denoising are then applied automatically with parameters chosen to match. This keeps strokes as intact as possible across sections with differing levels of blur.


Text Recognition: A VGG19-Transformer model, pretrained on general Vietnamese handwriting lines and then fine-tuned on 1400 lines taken from the medical records. The resulting model achieves a Character Error Rate of 0.12 and a Word Error Rate of 0.23.
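
The Character Error Rate and Word Error Rate quoted for Text Recognition are conventionally defined as the edit distance between the predicted and ground-truth text, divided by the length of the ground truth, counted in characters or in words. The sketch below illustrates that definition in plain Python. The function names and the corpus-level aggregation (total edits over total reference length) are assumptions made for illustration; the project does not state exactly how its figures were computed.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(references, hypotheses, tokenize):
    """Total edits divided by total reference length (assumed aggregation)."""
    edits = 0
    total = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tokens, hyp_tokens = tokenize(ref), tokenize(hyp)
        edits += edit_distance(ref_tokens, hyp_tokens)
        total += len(ref_tokens)
    return edits / total if total else 0.0


def cer(references, hypotheses):
    """Character Error Rate: every character (spaces included) is a token."""
    return error_rate(references, hypotheses, list)


def wer(references, hypotheses):
    """Word Error Rate: tokens are whitespace-separated words."""
    return error_rate(references, hypotheses, str.split)
```

Under this definition, a CER of 0.12 corresponds to roughly 12 character edits per 100 ground-truth characters.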

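The Adaptive Preprocessing component depends on a blur measurement for each cropped line. The project does not say which measure it uses; a common choice is the variance of the Laplacian, where a low variance suggests few sharp edges. The sketch below assumes that measure and maps the score to a parameter preset. The thresholds, preset values and parameter names are all hypothetical placeholders, not the project's settings.

```python
def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over the interior pixels.

    `gray` is a 2D grayscale image given as a list of rows of numbers.
    """
    h, w = len(gray), len(gray[0])
    responses = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            lap = (gray[y - 1][x] + gray[y + 1][x]
                   + gray[y][x - 1] + gray[y][x + 1]
                   - 4 * gray[y][x])
            responses.append(lap)
    if not responses:
        return 0.0
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


# Hypothetical presets: (upper bound on blur score, parameters to use).
PRESETS = [
    (100.0, {"clahe_clip_limit": 4.0, "sauvola_window": 15, "denoise_strength": 3}),
    (500.0, {"clahe_clip_limit": 3.0, "sauvola_window": 25, "denoise_strength": 7}),
]
# Hypothetical preset for lines that score as sharp.
SHARP_PRESET = {"clahe_clip_limit": 2.0, "sauvola_window": 35, "denoise_strength": 10}


def choose_params(gray):
    """Pick the first preset whose upper bound exceeds the blur score."""
    score = laplacian_variance(gray)
    for upper, params in PRESETS:
        if score < upper:
            return params
    return SHARP_PRESET
```

In practice the thresholds would be tuned on sample crops, since the Laplacian variance depends on image resolution and contrast.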

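Sauvola thresholding, used in both preprocessing stages, binarises each pixel against a threshold computed from its local neighbourhood: T = m * (1 + k * (s / R - 1)), where m and s are the local mean and standard deviation. The plain-Python sketch below shows the idea on a small image. The window size, k = 0.2 and R = 128 are commonly used defaults assumed here, not the project's values, and a real pipeline would more likely use a library routine such as scikit-image's `threshold_sauvola` for speed.

```python
import math


def sauvola_binarize(gray, window=15, k=0.2, r=128.0):
    """Binarise a 2D grayscale image (list of rows, values 0-255).

    Pixels brighter than their local Sauvola threshold become 255
    (background); all others become 0 (ink).
    """
    h, w = len(gray), len(gray[0])
    half = window // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # Neighbourhood clipped at the image border.
            values = [gray[yy][xx]
                      for yy in range(max(0, y - half), min(h, y + half + 1))
                      for xx in range(max(0, x - half), min(w, x + half + 1))]
            mean = sum(values) / len(values)
            std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
            threshold = mean * (1 + k * (std / r - 1))
            row.append(255 if gray[y][x] > threshold else 0)
        out.append(row)
    return out
```

Because the threshold follows the local mean, a faint stroke on an unevenly lit page can still be separated from the background, which is why this family of methods suits scanned or photographed forms.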

The pipeline has an accompanying UI for interacting with the modules above, along with annotation tools designed specifically to help OUCRU medical experts provide ground truth for Text Recognition training. The UI and annotation backend are deployed on AWS for professional use.


