  1. #1
    Join Date
    Mar 2005
    Location
    Detroit MI
    Posts
    80

    Text In Image Recognition?

    Hi There,

    I'm looking to perform a text recognition function.

    Here's what I need:

    - I am given an image of, let's say, 100x75 pixels.
    - The image may or may not contain text.
    - The text would be in a computer font, not handwritten.
    - If it does contain text, it is always the same font, colour and text size.
    - It may contain up to around 20 characters.
    - If text is found, the routine would pass this back to the calling function as a string.

    Does anyone know of a free (it's a hobby project!) library that would do this for me? Failing that, a tutorial on how I might get started doing this myself?

    Many Thanks

    BW
    Regards,

    Big Winston

  2. #2
    Join Date
    Feb 2006
    Location
    London
    Posts
    238

    Re: Text In Image Recognition?

    I once did a similar thing to get through the registration process on a website automatically. In many cases it is trivial to do.

    1. Ideally you would know in advance some characteristic of the text (or you may test several alternatives automatically). For instance, the text may be lighter or darker than the background, or contain a higher proportion of green. Once you have decided on such a characteristic, transform the picture into a monochrome one based on that criterion (everything lighter than some threshold becomes white, the rest becomes black). You can always run a set of tests to ensure that your program makes this transformation correctly.
    2. The next stage is character separation. You need to determine the groups of connected black pixels in your image and extract the bounding rectangle of each group into a small image which will contain only one character.
    3. You need to precompute template sets of characters for all fonts and all font sizes (in my case there was only one font and one font size, and only digits were used, which seriously simplified my task).
    4. Finally you need to compare the image of every character you extracted with every image in your database of template images. Basically you compute the coincidence ratio: the number of black pixels that coincide with the template, divided by the total number of black pixels in the tested image. The template with the highest ratio is likely to be the character drawn in the picture.
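
    Steps 1 and 2 above could be sketched roughly as follows. This is only an illustration under assumptions that are not in the post: the image is already a grid of grey levels (0 to 255), the text is darker than the background, and the names `binarize`, `find_characters`, `crop` and the threshold of 128 are made up for the example.

```python
from collections import deque


def binarize(gray, threshold=128):
    """Step 1: mark a pixel as ink (True) when it is darker than the threshold."""
    return [[px < threshold for px in row] for row in gray]


def find_characters(ink):
    """Step 2: return inclusive bounding boxes (left, top, right, bottom),
    one per 8-connected group of ink pixels, sorted left to right."""
    height = len(ink)
    width = len(ink[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not ink[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill over one connected group.
            seen[y][x] = True
            queue = deque([(x, y)])
            left, top, right, bottom = x, y, x, y
            while queue:
                cx, cy = queue.popleft()
                left, right = min(left, cx), max(right, cx)
                top, bottom = min(top, cy), max(bottom, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and ink[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return sorted(boxes)


def crop(ink, box):
    """Cut one character out of the image as its own small grid."""
    left, top, right, bottom = box
    return [row[left:right + 1] for row in ink[top:bottom + 1]]
```

    Note that a character made of separate pieces (the dot of an "i", for example) would come out as two groups here, so a real version would probably need to merge boxes that overlap horizontally.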

    It is great fun to improve this algorithm and tune it to your particular problem, and to watch it recognise a higher and higher proportion of images.

    In my case I managed to increase the proportion of correctly recognised images from 5% to 95-97% in a couple of weeks.
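
    For step 4, the ratio could be computed along these lines. This is a minimal sketch, not the code I actually used: it assumes the extracted character and every template are boolean grids of the same size aligned at the top-left corner, and `from_rows`, `match_ratio` and `classify` are names invented for the example.

```python
def from_rows(rows):
    """Build a boolean grid from strings, where '#' means a black pixel."""
    return [[ch == "#" for ch in row] for row in rows]


def match_ratio(glyph, template):
    """Black pixels of the glyph that are also black in the template,
    divided by all black pixels of the glyph."""
    total = sum(cell for row in glyph for cell in row)
    if total == 0:
        return 0.0
    hits = sum(
        1
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
        if g and t
    )
    return hits / total


def classify(glyph, templates):
    """Return the key of the template with the highest match ratio."""
    return max(templates, key=lambda ch: match_ratio(glyph, templates[ch]))


# Hypothetical 3x3 templates, for illustration only.
TEMPLATES = {
    "1": from_rows([".#.", ".#.", ".#."]),
    "7": from_rows(["###", "..#", "..#"]),
}
```

    One weakness of this ratio is that a template with a lot of black in it scores well against almost anything, so it may be worth also penalising template pixels that the character does not cover.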

  3. #3
    Join Date
    Dec 2001
    Location
    Greece, Athens
    Posts
    1,015

    Re: Text In Image Recognition?

    Really nice explanation DragForce!
    I have never worked on OCR algorithms, but I have a note on the final step you described (the 4th), the one that deals with character classification. Another feature, more complicated than the coincidence ratio but possibly better, would be boundary statistics.
    Specifically, one could use a boundary tracking algorithm to extract the boundary of a character and then describe this boundary with a chain code. Finally, a statistic (e.g. the 2nd central moment) could be used to summarise the boundary. This statistic would then be compared to the statistics of the training characters to classify the character.
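
    A rough sketch of that idea, under my own assumptions: the character is a boolean grid holding a single connected shape, the boundary is followed with a Moore-neighbour style trace, and the "2nd central moment" is taken simply as the variance of the chain-code values, which is only one possible reading. The function names are made up.

```python
# Eight neighbour offsets in clockwise order on screen (y grows downward):
# 0=E, 1=SE, 2=S, 3=SW, 4=W, 5=NW, 6=N, 7=NE.
DX = (1, 1, 0, -1, -1, -1, 0, 1)
DY = (0, 1, 1, 1, 0, -1, -1, -1)


def trace_chain_code(ink):
    """Follow the outer boundary of the first shape found in raster order
    and return the list of direction codes (0..7) of the moves made."""
    height = len(ink)
    width = len(ink[0]) if height else 0
    start = next(((x, y) for y in range(height) for x in range(width)
                  if ink[y][x]), None)
    if start is None:
        return []
    x, y = start
    chain, first_step, search = [], None, 0
    while True:
        # Look clockwise around the current pixel for the next ink pixel.
        for i in range(8):
            d = (search + i) % 8
            nx, ny = x + DX[d], y + DY[d]
            if 0 <= nx < width and 0 <= ny < height and ink[ny][nx]:
                break
        else:
            return []  # isolated pixel: there is no boundary to walk
        if first_step is None:
            first_step = d
        elif (x, y) == start and d == first_step:
            return chain  # back at the start, about to repeat the first move
        chain.append(d)
        x, y = nx, ny
        # Resume the search just past the background pixel examined last.
        search = (d + 7) % 8 if d % 2 == 0 else (d + 6) % 8


def second_central_moment(values):
    """Mean squared deviation from the mean (population variance)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)
```

    Since chain codes wrap around (7 is next to 0), a plain variance is a crude summary; a histogram of the codes may turn out to discriminate better.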
    Theodore
    Personal Web Page (some audio segmentation tools): www.di.uoa.gr/~tyiannak

  4. #4
    Join Date
    Dec 2001
    Location
    Greece, Athens
    Posts
    1,015

    Re: Text In Image Recognition?

    Something else: if you decide to use image (or boundary) statistics, it would be better to use scale-invariant moments or, even better, Hu invariant moments, which are invariant under both rotation and scaling. This would make the system less dependent on the font.
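
    To give an idea, the first two Hu invariants of a binary character image could be computed like this. It is a sketch only: the grid-of-booleans input and the names `hu_first_two` and `rotate_90` are assumptions for the example, and on a pixel grid the scale invariance holds only approximately.

```python
def hu_first_two(ink):
    """Return the first two Hu invariants of a boolean grid,
    treating every ink pixel as a unit point mass."""
    points = [(x, y) for y, row in enumerate(ink)
              for x, cell in enumerate(row) if cell]
    n = len(points)  # zeroth moment m00
    if n == 0:
        raise ValueError("image contains no ink pixels")
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n

    def eta(p, q):
        # Central moment mu_pq, normalised by m00 ** (1 + (p + q) / 2).
        mu = sum((x - cx) ** p * (y - cy) ** q for x, y in points)
        return mu / n ** (1 + (p + q) / 2)

    eta20, eta02, eta11 = eta(2, 0), eta(0, 2), eta(1, 1)
    phi1 = eta20 + eta02
    phi2 = (eta20 - eta02) ** 2 + 4 * eta11 ** 2
    return phi1, phi2


def rotate_90(ink):
    """Rotate a grid by 90 degrees (rows become columns)."""
    return [list(row) for row in zip(*ink[::-1])]
```

    The two values of a character and of its rotated copy should agree up to rounding, so they could serve as features to compare against the training characters.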
    Theodore
    Personal Web Page (some audio segmentation tools): www.di.uoa.gr/~tyiannak

  5. #5
    Join Date
    May 2009
    Location
    Hyderabad
    Posts
    4

    Re: Text In Image Recognition?

    Hi,

    I am new to this. I want to recognize Telugu text from a scanned image.
    I want to develop this in C#.NET. Please guide me on how to accomplish this.
    It's very urgent. Please help me.

    Thanks in advance.
