In this article I will show you how to read text from an image using the OCR components in C#.NET.
What is OCR?
OCR (Optical Character Recognition) is the recognition of printed or written text characters by a computer. This involves photoscanning of the text character-by-character, analysis of the scanned-in image, and then translation of the character image into character codes, such as ASCII, commonly used in data processing.
OCR translates images of text, such as scanned documents, into actual text characters. Also known as text recognition, OCR makes it possible to edit and reuse the text that is normally locked inside scanned images. OCR works using a form of artificial intelligence known as pattern recognition, to identify individual text characters on a page, including punctuation marks, spaces, and ends of lines.
First off, you need to have MS Office 2007 installed. Office 2010 and later no longer ship the component used here, so on those versions it has to be installed separately. This is a real dependency if you deploy an application that uses the OCR capabilities in the field: it won't work without the component installed on the target machine. Furthermore, the OCR capability isn't installed by default when you install Office; you need to add a component called 'Microsoft Office Document Imaging' (MODI).
Instructions on how to add the required MODI component.
Step 1
Click Start, click Run, type appwiz.cpl in the Open box, and then click OK.
Step 2
Click to select the Office 2007 version that you have installed.
Step 3
Click Change.
Step 4
Click Add or Remove Features, and then click Continue.
Step 5
Expand Office Tools.
Step 6
Click Microsoft Office Document Imaging, and then click Run all from My Computer.
Step 7
Click Continue.
The MODI components are now installed on your machine. Let's create the OCR application in Visual Studio.
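Before moving on, it may help to confirm that the component is really registered, because the error 80040154 that several readers report in the comments is the COM "class not registered" code. The sketch below is not part of the original sample. It assumes that MODI registers the ProgID MODI.Document, and the names ModiCheck, IsComClassRegistered and Report are made up for illustration.

```csharp
using System;

public static class ModiCheck
{
    // Returns true when a COM class is registered under the given ProgID.
    public static bool IsComClassRegistered(string progId)
    {
        try
        {
            // GetTypeFromProgID returns null when the ProgID is unknown.
            return Type.GetTypeFromProgID(progId) != null;
        }
        catch (PlatformNotSupportedException)
        {
            // COM is only available on Windows.
            return false;
        }
    }

    public static void Report()
    {
        // Assumption: "MODI.Document" is the ProgID registered by the MODI component.
        bool found = IsComClassRegistered("MODI.Document");
        Console.WriteLine(found
            ? "MODI appears to be registered."
            : "MODI was not found for this process; repeat Steps 1 to 7.");

        // A 32-bit Office install registers its COM classes for 32-bit processes only.
        Console.WriteLine("Running as a 64-bit process: " + Environment.Is64BitProcess);
    }
}
```

Call ModiCheck.Report() at the top of Main. If it reports a 64-bit process while a 32-bit Office is installed, setting the project's platform target to x86 is a common remedy worth trying.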
Step 8
Create a Console Application and give the solution name as SolReadTextFromImage.
Step 9
Copy a sample image file into the application's base directory (./bin/debug/SampleImage.JPG).
Step 10
Add a MODI reference to the application so that it can read text from images. In Solution Explorer, right-click the project's References node, choose Add Reference, select the COM tab, and then select Microsoft Office Document Imaging 12.0 Type Library.
Step 11
The code below reads the text from the image. One overload prints it to the console and the other stores it in a text file. It looks like this:
#region Methods

/// <summary>
/// Read text from an image and display it in the console.
/// </summary>
/// <param name="ImagePath">Path of the image file</param>
private static void ReadTextFromImage(String ImagePath)
{
    // Exceptions are left to propagate so the caller sees the original type and stack trace.

    // Run OCR on the image
    MODI.Document ModiObj = new MODI.Document();
    ModiObj.Create(ImagePath);
    ModiObj.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);

    // Retrieve the text recognized on the first page
    MODI.Image ModiImageObj = (MODI.Image)ModiObj.Images[0];
    System.Console.WriteLine(ModiImageObj.Layout.Text);

    // Close without saving changes back to the image file
    ModiObj.Close(false);
}

/// <summary>
/// Read text from an image and store it in a text file.
/// </summary>
/// <param name="ImagePath">Path of the image file</param>
/// <param name="StoreTextFilePath">Path of the text file to create</param>
private static void ReadTextFromImage(String ImagePath, String StoreTextFilePath)
{
    // Run OCR on the image
    MODI.Document ModiObj = new MODI.Document();
    ModiObj.Create(ImagePath);
    ModiObj.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);

    // Retrieve the text recognized on the first page
    MODI.Image ModiImageObj = (MODI.Image)ModiObj.Images[0];

    // Save the recognized text; the using blocks close the file even if writing fails
    using (FileStream CreateFileObj = new FileStream(StoreTextFilePath, FileMode.Create))
    using (StreamWriter WriteFileObj = new StreamWriter(CreateFileObj))
    {
        WriteFileObj.Write(ModiImageObj.Layout.Text);
    }

    // Close without saving changes back to the image file
    ModiObj.Close(false);
}

#endregion
Step 12
Call both methods from the Main function, like this:
static void Main(string[] args)
{
    // Set the sample image path
    String ImagePath = AppDomain.CurrentDomain.BaseDirectory + "SampleImage.jpg";
    ReadTextFromImage(ImagePath);

    // Set the path of the text file that stores the image content
    String StoreTextFilePath = AppDomain.CurrentDomain.BaseDirectory + "SampleText.txt";
    ReadTextFromImage(ImagePath, StoreTextFilePath);
}
Full Code
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;

namespace SolReadTextFromImage
{
    class Program
    {
        static void Main(string[] args)
        {
            // Set the sample image path
            String ImagePath = AppDomain.CurrentDomain.BaseDirectory + "SampleImage.jpg";
            ReadTextFromImage(ImagePath);

            // Set the path of the text file that stores the image content
            String StoreTextFilePath = AppDomain.CurrentDomain.BaseDirectory + "SampleText.txt";
            ReadTextFromImage(ImagePath, StoreTextFilePath);
        }

        #region Methods

        /// <summary>
        /// Read text from an image and display it in the console.
        /// </summary>
        /// <param name="ImagePath">Path of the image file</param>
        private static void ReadTextFromImage(String ImagePath)
        {
            // Exceptions are left to propagate so the caller sees the original type and stack trace.

            // Run OCR on the image
            MODI.Document ModiObj = new MODI.Document();
            ModiObj.Create(ImagePath);
            ModiObj.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);

            // Retrieve the text recognized on the first page
            MODI.Image ModiImageObj = (MODI.Image)ModiObj.Images[0];
            System.Console.WriteLine(ModiImageObj.Layout.Text);

            // Close without saving changes back to the image file
            ModiObj.Close(false);
        }

        /// <summary>
        /// Read text from an image and store it in a text file.
        /// </summary>
        /// <param name="ImagePath">Path of the image file</param>
        /// <param name="StoreTextFilePath">Path of the text file to create</param>
        private static void ReadTextFromImage(String ImagePath, String StoreTextFilePath)
        {
            // Run OCR on the image
            MODI.Document ModiObj = new MODI.Document();
            ModiObj.Create(ImagePath);
            ModiObj.OCR(MODI.MiLANGUAGES.miLANG_ENGLISH, true, true);

            // Retrieve the text recognized on the first page
            MODI.Image ModiImageObj = (MODI.Image)ModiObj.Images[0];

            // Save the recognized text; the using blocks close the file even if writing fails
            using (FileStream CreateFileObj = new FileStream(StoreTextFilePath, FileMode.Create))
            using (StreamWriter WriteFileObj = new StreamWriter(CreateFileObj))
            {
                WriteFileObj.Write(ModiImageObj.Layout.Text);
            }

            // Close without saving changes back to the image file
            ModiObj.Close(false);
        }

        #endregion
    }
}
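The sample above reads only Images[0], which is the first page. For a multi-page file such as a multi-page TIFF, a variation along the following lines might collect the text of every page. This is a sketch under stated assumptions, not the article's method: it uses late binding through dynamic (C# 4 or later) instead of the Step 10 reference, it assumes the ProgID MODI.Document, and it assumes that the numeric value of MODI.MiLANGUAGES.miLANG_ENGLISH is 9. The names ModiAllPages and ReadAllPages are hypothetical.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;
using System.Text;

public static class ModiAllPages
{
    // Assumption: numeric value of MODI.MiLANGUAGES.miLANG_ENGLISH.
    private const int LangEnglish = 9;

    // Runs OCR on the file and returns the text of all pages, one page after another.
    public static string ReadAllPages(string imagePath)
    {
        if (!File.Exists(imagePath))
            throw new FileNotFoundException("Image not found.", imagePath);

        // Assumption: "MODI.Document" is the ProgID registered by the MODI component.
        Type modiType = Type.GetTypeFromProgID("MODI.Document");
        if (modiType == null)
            throw new InvalidOperationException("MODI is not registered for this process.");

        dynamic doc = Activator.CreateInstance(modiType);
        try
        {
            doc.Create(imagePath);
            doc.OCR(LangEnglish, true, true);

            var text = new StringBuilder();
            int pageCount = doc.Images.Count;
            for (int i = 0; i < pageCount; i++)
            {
                // Each image is one page; Layout.Text holds its recognized text.
                string pageText = doc.Images[i].Layout.Text;
                text.AppendLine(pageText);
            }

            // Close without saving changes back to the image file.
            doc.Close(false);
            return text.ToString();
        }
        finally
        {
            // Release the COM object so the file handle is not kept open.
            Marshal.FinalReleaseComObject((object)doc);
        }
    }
}
```

A call such as Console.WriteLine(ModiAllPages.ReadAllPages(ImagePath)) could then replace the first ReadTextFromImage call in Main. The file check runs before any COM call, so a wrong path fails with a clear FileNotFoundException rather than a MODI error.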
Output
Brothers always brothers, Thank you Very Much for U re Kindness. I Am Pleasure to say This Article for me. :)
Most Welcome Ganesh.
I prefer xsocr tool http://www.xspdf.com/guide-ocr/text-recognition-from-image/, it's using tesseract 3 engine, and recognize text with higher accuracy
Awesome!
Thanks for share your knowledge n__n
Most Welcome Fabian.........
Thanks a lot. its really worth full one.
Most Welcome Stalin
Awesome!
Most Welcome Rajiv Kumar.
Naik,
Thanks for ur Effort to build this one.From this article we can read the text from Image.Do u have any idea, to read the text(which will change dynamically like the Verification code) inside the image from the image.
Most Welcome Mahendran.
I think it's may not possible to read verification code from image.
ReadTextFromImage function is throwing exception
Object hasn't been initialized and can't be used yet...don't know why...
Can you sned your solution copy to my mail id????
Hi Kishor,
This is a nice post, thanks for sharing.
In my case, I am having image with English but the font is different, the font is "BlackJackRegular".
How can we set the font while reading image?
Thanks
Hi Anant
I hope you like my article.
MODI Supports only few Standard fonts.i tried lots of font but no result.even you cant read capcha too.
Thanks Kishor for sharing your valuable knowledge, It just give me the beginning to include this feature in my project, thanks a lot
Most Welcome Arun...
Thanks Brother, but i want take image from webcam then after extract from image.
I will write article on taking image from webcam very soon.
Thanks for your code, Can you please let met know how to read tab separated text from image using MODI object.
Thanks a lot it is easier to understand and very helpful.
works fine.
is that give a result for corsive writing??
thanks, can i use this code to extract information from .E01 hard disk images?
Thanks its very good article , but i have one doubt , its working if the content of the image is in horizontal , but if content is in vertical its not working , please help me out in this .
Thanks
Mohan
Great tutorial. Can this read pdf document?
cant it take path from a folder in c drive,instead of the image file being in bin folder.
Hi
thanks for the code.
@ sign is not converted in MODI.dll
when i get the text from email address rrizwann@gmail.com then @ sign skip and return rrizwangmail.com.
im getting some exception error...
IS IT WORKING FOR CURSIVE TEXT IMAGES
?
MODI Supports only Standard Fonts......it's not designed for Cursive Text...I tried but i failed to read cursive Text from Image.
Hi Kishor,
I am getting this error :
Retrieving the COM class factory for component with CLSID {40942A6C-1520-4132-BDF8-BDC1F71F547B} failed due to the following error: 80040154.
please let me know how to resolve it?
thanks
Did you add reference Microsoft Office Document Imaging 12.0 Type Library on your Solution????
can we do the same implementation using web application?
if yes then procedure please?
it really very nice article..
Its really good.
But i have a problem, it reads normal font accurately but it is failing to read out different font text.
Suppose there is an image file having different font text, it doesn't produce correct output.
Can we find out font of text using MODI so that it can give correct output.
MODI Supports only Standard Fonts.......
Hi Kishor
I have Microsoft office 2013 installed in my system. I don't se the option Microsoft Office Document Imaging. Instead I see Optical Character Recognition(OCR.Should I proceed with the same?
Install MODI DLL
http://social.technet.microsoft.com/Forums/office/en-US/93d6f285-dc98-46e2-b7e0-872bba9c4e35/microsoft-office-document-imaging
can u plz give some idea what we have to do if we want to read cursive text images?
i want to read a text from natural scene images, what shud i do for it?
Hi Kishor
Thanks for your great post,,,
But i had one problem...while reading it comes on unknown formart....
Is it possible set font for reading.....It is not detecting all fonts
Pls help me....
Thanks for the good post..is it possible to read different different fonts .....???
hi........... is it possible to get starting point of text from all the four sides.... Actually i have to auto crop image... plz reply ASAP.....
Worked like a charm, Thanks!
acually this code support only some font-families how will it support to all font families???????plzzz give me suggesion its ma task....
its not supporting all fonts...can u plzz suggest me code for it
can u do for asp.net web aapp
Is it possible to do it in MS word 2013?
Hi Kishor,
I am using MS Visual Studio 2008, I have done all these steps, I am getting Retrieving the COM class factory for component with CLSID {40942A6C-1520-4132-BDF8-BDC1F71F547B} failed due to the following error: 80040154 error. I rechecked the reference. I could see the MODI reference added in the solution explorer, still I am getting this error. Please let me know what need to done.
hi sir..
what will be the source code for converting handwritten image text into text
Let me know how to get particular text from image ?
very helpful. thank you so much
Does this method also work for PDF images?
Thank you for this! MUCH appreciated! Thank you :)
Team, this is awesome! It works perfectly with clear images and has a very good approach for dark images. But now, I have two questions:
1. I'm using Windows Azure for my website where I have this solution installed. But, do you know, how can I install the Sharepoint Designer in the Windows Azure server?
2. Do you know, how can I edit the size of the pictures? With this solution, the pictures that can be loaded must be from a certain MPx and size!
Retrieving the COM class factory for component with CLSID {40942A6C-1520-4132-BDF8-BDC1F71F547B} failed due to the following error: 80040154.
I’m searching for OCR solution recently and this ocr solution is one of my testing. Till now, it works.
Test: .net ocr sdk, c# ocr api
Step 8
Create a Console Application and give the solution name as SolReadTextFromImage.
pls help
What you want to know?
Please mail @ achaltrehan5@gmail.com :)
I will help you.. :)
While executing this code am getting error like 'OCR running error' please help
Thanks, Kishor. This is going to be a great hit in the data entry process. Can you please tell me if the code recognizes hand written characters....
thanks but there is no reliable results. can you guide how we could get accurate results ?
How to convert cursive image to txt
While executing this code am getting error like image below (dont have type for (MODI.Image)ModiObj.Images[0]; requre dynamic express)
http://prntscr.com/8szq02
Plz help me...Thank you somuch
Very Nice, Thanks for sharing your knowledge
i want to store text from image taken by web cam ,could u sugest me..how to do it.Thanks in advance..
hi, can u plz give some idea what we have to do if we want to read cursive text images?
That's a very great sharing. I just learn that we can extract text from image file. Is that possible to do using image stream instead of physical file?
Did not show reference in COM Reference list....plz help
Hi there. I'm trying to read text from a tiff file, but the program throws exception every time I get to the "Create" method saying that the file is empty or corrupt.
a great work bro keep it up plz give me your mobile number
i want to convert handwritten image to text,please anyone help me
Thanks for your help.
Kishor,
I have MS office 2016, I could not find MS Office Document imaging under Office Tools while I add or remove features using Control panel. Please help me sir.
thanks you very much for very much sir can this is possible in Asp,net
sir i cannot convert .. i think my file is in cursive text font .. have any option ?
Hello,
Can i get Specific text from the Image file like if Image file Contains first name and last name
how can it support in office 2013.bacause MODI Dll not a part of office 13 .I have already try.
Hai Author Good Information that i found here,do not stop sharing and Please keep updating us..... Thanks. Hire dot net developer
Hello,
Very nice article. Thanks.
Some images get error: "OCR running error". how to solve, please help me....
its not reading data from pdf file i am getting error like "File is empty or corrupted"