Project Oxford: Computer Vision API pros & cons

Microsoft's project code-named Oxford has a number of interesting APIs. Let's dig into one of them: OCR.

The Computer Vision API provides OCR conversion from images to text. It can be tested from here. It is really fast, but it requires some understanding of the subject. For example, I have a coloured image with text:

[Image: snip_20150923092422]

Here is the result:

I don -t need 
l. Google 
my wife knows

Hmm, the last string is lost. But if you apply a charcoal transformation to remove some noise:

[Image: snip_20150923092751]

Here is the result:

I don-t need 
Go, gle 
my wife knows 

Much better: the last string is picked up. Microsoft uses the same OCR engine in OneNote, and it has the same problem.
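The post does not say which tool produced the charcoal version of the image. As a minimal sketch, assuming ImageMagick and its `-charcoal` operator were used, the preprocessing step could look roughly like this. The function names, the file names, and the factor of 2 are all hypothetical.

```python
import shutil
import subprocess


def charcoal_command(src, dst, factor=2):
    """Build an ImageMagick command line that applies the charcoal effect."""
    # Assumption: "convert" is the ImageMagick 6 binary.
    # ImageMagick 7 ships it as "magick" instead.
    return ["convert", src, "-charcoal", str(factor), dst]


def preprocess(src, dst, factor=2):
    """Run the command if the binary is on PATH; return True on success."""
    cmd = charcoal_command(src, dst, factor)
    if shutil.which(cmd[0]) is None:
        return False  # binary not found, nothing was written
    return subprocess.run(cmd, check=False).returncode == 0
```

Calling `preprocess("snip.png", "snip_charcoal.png")` would write the filtered copy, which is then the file to upload for OCR. The right factor depends on the image, so it is worth trying a few values.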


Pros:

  • Simple
  • Fast
  • Supports many languages and auto-detection (?)
  • REST based
  • Really interesting package of APIs and features


Cons:

  • Requires an Azure subscription, which means it is not free
  • Requires file preparation before upload
  • Does not support vectorised images (SVG)
  • Does not understand image orientation
  • Does not support photo EXIF attributes. This is a key point, if you know what I mean
  • Recognition is not as good or as customizable as what you can get with free tools like Tesseract-OCR, ImageMagick, AutoTrace and Inkscape.


About Andrew Butenko