Microsoft's project code-named Oxford has a number of interesting APIs. Let's dig into one of them: OCR.
The Computer Vision API provides OCR conversion from images to text. It can be tested from here. It is really fast, but it requires some understanding of the subject. For example, I have a coloured image with text, and the OCR returns:
I don -t need l. Google my wife knows
Hmm, the last string is lost. But if you apply a charcoal transformation to remove some noise, the result is:
I don-t need Go, gle my wife knows everything
Much better, the last string is picked up. Microsoft uses the same OCR engine in OneNote, and it has the same problem.
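For the curious, here is a rough sketch of how the charcoal step could be scripted. It assumes ImageMagick is installed and uses its `-charcoal` option. The function names `charcoal_command` and `preprocess` and the default factor of 1 are mine, not from any tool, so treat the values as a starting point rather than the exact settings I used.

```python
import shutil
import subprocess


def charcoal_command(src, dst, factor=1, binary="convert"):
    """Build the ImageMagick command line for a charcoal transformation.

    `factor` is passed to ImageMagick's -charcoal option; the default of 1
    is an assumption, tune it for your own images.
    """
    return [binary, src, "-charcoal", str(factor), dst]


def preprocess(src, dst, factor=1):
    """Run the charcoal transformation and return the output path."""
    # ImageMagick 7 ships `magick`; older installs only have `convert`.
    binary = shutil.which("magick") or shutil.which("convert")
    if binary is None:
        raise RuntimeError("ImageMagick not found on PATH")
    subprocess.run(charcoal_command(src, dst, factor, binary), check=True)
    return dst
```

Calling `preprocess("photo.jpg", "photo-charcoal.png")` would write the cleaned-up image, which can then be sent to the OCR service instead of the original.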
Pros:
- Supports many languages and auto-detection (?)
- REST based
- Really interesting package of APIs and features
Cons:
- Requires an Azure subscription, which means it is not free
- Requires file preparation before loading
- Does not support vectorised images (SVG)
- Does not understand image orientation
- Does not support photo EXIF attributes. This is a key point, if you know what I mean
- Recognition is not as good or as customizable as what you can achieve with free tools like Tesseract-OCR, ImageMagick, AutoTrace and Inkscape.
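
Since the API is REST based, calling it from a script is straightforward. Below is a minimal sketch using only the Python standard library. It is not official client code: the endpoint URL is something you have to take from your own subscription page (I pass it in as a parameter), and I am assuming the usual `Ocp-Apim-Subscription-Key` header and a JSON response shaped as regions, then lines, then words. The function names are hypothetical.

```python
import json
import urllib.request


def build_ocr_request(endpoint, subscription_key, image_bytes):
    """Prepare a POST request carrying the raw image bytes.

    `endpoint` is whatever OCR URL your subscription gives you (assumption:
    it accepts a binary body and the subscription key in a header).
    """
    return urllib.request.Request(
        endpoint,
        data=image_bytes,
        headers={
            "Ocp-Apim-Subscription-Key": subscription_key,
            "Content-Type": "application/octet-stream",
        },
        method="POST",
    )


def lines_from_response(payload):
    """Flatten an assumed regions -> lines -> words JSON structure into text lines."""
    lines = []
    for region in payload.get("regions", []):
        for line in region.get("lines", []):
            words = [word.get("text", "") for word in line.get("words", [])]
            lines.append(" ".join(words))
    return lines


def recognise(endpoint, subscription_key, image_path):
    """Send one image file to the service and return the recognised lines."""
    with open(image_path, "rb") as image_file:
        request = build_ocr_request(endpoint, subscription_key, image_file.read())
    with urllib.request.urlopen(request) as response:
        return lines_from_response(json.load(response))
```

Keeping the request building and the response parsing in separate functions makes it easy to check the parsing against a saved response without spending API calls.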