This OCR pipeline detects text in cropped handwritten forms. To convert the cropped form data to text, synthetic data of the required type was generated and used to train a modified version of the Attention OCR model.
Sample prediction on real data
To reproduce the results of the pipeline on test data, kindly refer to `instructions.md`; for details about the analysis, comparison, and hyperparameter tuning of the neural network, refer to `Report.pdf`. A log report, `log.md`, has also been included in this repository, containing a succinct description of the daily work done.

Find the folder tree below. Note that some details have been omitted for brevity and can be found in the respective repositories of the source code; kindly refer to the separate `README.md` in such cases.
    Root
    |
    |   .gitignore
    |   requirements.txt
    |   log.md
    |   README.md
    |   MANIFEST.in
    |   setup.py
    |   instructions.md
    |   myrun.sh
    |   Report.pdf
    |
    |___aocr
    |   |
    |   |   __main__.py
    |   |   __init__.py
    |   |   defaults.py
    |   |   LICENSE.md
    |   |   README.md
    |   |
    |   |____model
    |   |   |
    |   |   |   __init__.py
    |   |   |   cnn.py
    |   |   |   model.py
    |   |   |   seq2seq.py
    |   |   |   seq2seq_model.py
    |   |
    |   |____util
    |       |
    |       |   __init__.py
    |       |   bucketdata.py
    |       |   data_gen.py
    |       |   dataset.py
    |       |   export.py
    |       |   visualizations.py
    |
    |___text_renderer
    |   |
    |   |   main.py
    |   |   README.md
    |   |   setup.py
    |   |
    |   |____dataset_labels
    |   |   |
    |   |   |   convert_labels.py
    |   |
    |   |____ocr_data
    |   |
    |   |____example_data
    |   |
    |   |____text_renderer
    |   |
    |   |____tools
    |   |
    |   |____docs
    |   |
    |   |____docker
    |
    |____experiments
    |   |
    |   |   TestSyntheticDataGen.ipynb
    |   |   Tfwriter.ipynb
    |   |   Train.ipynb
    |
    |____checkpoints
    |
    |____app
    |
    |____datasets
    |
    |____utils
- The folder `aocr` contains the main code for the attention OCR, along with a separate `README.md` which can be used as a reference.
- The folder `text_renderer` contains the code used to generate the synthetic dataset. The exact configuration used is inside `text_renderer/ocr_data/gen_data.py`.
- Since this model was trained and tested on Google Colab, sample `ipynb` files have been provided for reference in `experiments`.
- The `app` folder contains the Flask-based REST API for testing the endpoints.
- The model checkpoints are located in the `checkpoints` directory.
- Please read `Report.pdf` for a detailed summary of the work done!
In order to run the app on your local machine, follow these steps:

1. Clone the repository on your local machine: `git clone https://github.com/java-abhinav07/abhinav_java_9873155323-IITB-Assignment-Jul-Dec2020-Batch2.git`
2. Install `aocr` locally using `setup.py`: `cd abhinav_java_9873155323-IITB-Assignment-Jul-Dec2020-Batch2`, then `pip3 install -e ./`
3. Install the necessary packages: `pip3 install -r requirements.txt`
4. Having installed all the packages, run `python3 app/app.py`. This will start the server on the local machine on port 8001 (note that CPU inference might take up to 14 seconds per request).
5. Send the request to localhost as follows:
6. The response status will be either `completed` or `invalid request`, indicating a successful or unsuccessful request respectively.
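As a rough illustration of such a request, here is a minimal Python sketch using only the standard library. The `/predict` path (mirroring the public Heroku endpoint further below) and the raw-bytes upload format are assumptions, not the confirmed API spec of this app:

```python
# Hypothetical client sketch: the /predict path and raw-byte upload are
# assumptions; consult the actual API spec for the exact request format.
import urllib.request

URL = "http://localhost:8001/predict"  # port 8001, per the steps above

def build_request(image_bytes: bytes) -> urllib.request.Request:
    """Build a POST request carrying one cropped form image."""
    return urllib.request.Request(
        URL,
        data=image_bytes,
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )

# Once the server from `python3 app/app.py` is running, the request could be
# sent with urllib.request.urlopen(build_request(image_bytes)).
```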
To execute inference on your local machine, you can also use the provided bash script: `./myrun.sh TestImageFolderPath Output.txt`

Subsequently, the output file (`Output.txt` in the example above) will contain:

    <Testimagefilename1> <recognized text>
    <Testimagefilename2> <recognized text>
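One way to consume that output file programmatically — a sketch assuming one `<filename> <recognized text>` pair per line and filenames without spaces, which is inferred from the format shown above:

```python
def parse_predictions(lines):
    """Map each test image filename to its recognized text.

    Assumes one '<filename> <recognized text>' pair per line; the
    filename itself is assumed to contain no spaces.
    """
    results = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        filename, _, text = line.partition(" ")
        results[filename] = text
    return results

# Usage:
# with open("Output.txt") as f:
#     predictions = parse_predictions(f)
```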
In order to run requests over a public web server, use the following path: `https://formreader.herokuapp.com/predict`

Use the same API spec shared above to fetch the results.

The Heroku app has been modified and is now fully functional; however, the model initialization (first request) might take some time due to low memory constraints on Heroku.
This repository contains code from the following two repositories:
- https://github.com/emedvedev/attention-ocr : aocr
- https://github.com/oh-my-ocr/text_renderer : text_renderer
The report contains a list of papers that were referenced during the design of this OCR rendition.
This project was part of the application process for a Research Internship at IIT Bombay.