Retrieve the word at the current mouse position?

**megaxoom** · April 8th, 2004, 06:52 PM

Hi everyone

Is there a way to retrieve the word at the current mouse position, regardless of which window it is in? Maybe the question is a little confusing, so let me describe it in detail. While my application is running and the user moves the mouse over a window that contains text, I would like my application to pick up the word (just the word, not all the text) at the current mouse position. Can this be done? Thanks for any help.

**olin** · April 8th, 2004, 07:21 PM

Well, I have to say it is not an easy thing to do.

Typically, you would use a hook to capture calls to GDI API functions such as TextOut and TextOutA, and then patch the function's code in memory and embed your own assembly code...

**Sam Hobbs** · April 8th, 2004, 09:31 PM

This question has been asked a few times, or at least similar questions.

First, are you saying you want to do this while the mouse is moving? That is, not from a mouse click?

Look at the ::GetMessagePos function; you can use that to get the mouse position when a message is being processed. Then you can use ChildWindowFromPoint.

Then it depends on what is there. If I can guess what it is you are doing, then the window will always be one or a few specific windows. Also, my guess is that you want to do this only for edit and/or rich edit controls. In which case, you can get the word by using the location. There are a lot of details, but hopefully that is the general idea.
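As a rough illustration of "get the word by using the location", here is a minimal C++ sketch of the last step only. It assumes the control's text and the character index under the mouse have already been obtained (for a standard edit control, messages such as WM_GETTEXT and EM_CHARFROMPOS can provide them). The helper names `is_word_char` and `word_at`, and the rule for what counts as a word character, are assumptions made for this example.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Assumption for this sketch: a word is a run of letters, digits and
// underscores. Real applications may need different rules.
static bool is_word_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
}

// Returns the word containing the character at `index`, or an empty string
// if `index` is out of range or does not point at a word character.
std::string word_at(const std::string& text, std::size_t index) {
    if (index >= text.size() || !is_word_char(text[index])) {
        return std::string();
    }

    // Walk left to the first character of the word.
    std::size_t begin = index;
    while (begin > 0 && is_word_char(text[begin - 1])) {
        --begin;
    }

    // Walk right to one past the last character of the word.
    std::size_t end = index + 1;
    while (end < text.size() && is_word_char(text[end])) {
        ++end;
    }

    return text.substr(begin, end - begin);
}
```

For example, `word_at("pick a word here", 8)` yields `"word"`, while an index that lands on a space yields an empty string.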

**olin** · April 8th, 2004, 10:44 PM

Hi, Sam Hobbs,

Yes, yours is an easy way. But with it you can only retrieve the title of certain windows, such as an edit box or some other controls. You can't retrieve words in other cases, such as in a web page, a word processor document, or a PDF file...

I don't know what megaxoom's intent is. If you only want to get words from an edit box etc., Sam Hobbs gave you a better way.

**megaxoom** · April 8th, 2004, 11:06 PM

Sorry for not describing it in more detail. I would like to pick a word from any application, like MS Word, Internet Explorer, Notepad, etc.

**olin** · April 9th, 2004, 12:50 AM

Then you should capture the GDI API calls, although there is another way: tracking the DC of the windows. But I think the former does better in terms of compatibility. After you capture the API, you can embed your own assembly code into the memory of the API function, or you can change the import address table to run your custom code.
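The idea behind changing a table entry so that calls reach your own code can be shown with a toy model in portable C++. This is only an illustration of the pointer swap, not real import address table patching: a real hook has to locate the import entry inside the target module and deal with memory protection, none of which appears here. Every name below (`g_import_slot`, `real_text_out`, `hooked_text_out`, and so on) is hypothetical.

```cpp
#include <string>
#include <vector>

// Signature shared by the "real" drawing routine and its replacement.
using TextOutFn = void (*)(int x, int y, const std::string& text);

// One intercepted call: where the text was drawn and what it said.
struct TextCall {
    int x;
    int y;
    std::string text;
};

std::vector<TextCall> g_log;  // calls seen by the hook
int g_drawn = 0;              // how often the "real" routine ran

// Stand-in for the original API function.
void real_text_out(int, int, const std::string&) { ++g_drawn; }

// One-slot stand-in for an import table: callers dispatch through it.
TextOutFn g_import_slot = real_text_out;
TextOutFn g_original = nullptr;

// Replacement routine: record the call, then forward to the original so
// the caller's behaviour is unchanged.
void hooked_text_out(int x, int y, const std::string& text) {
    g_log.push_back({x, y, text});
    g_original(x, y, text);
}

// Installs the hook by swapping the table entry and keeping the old pointer.
void install_hook() {
    if (g_import_slot == hooked_text_out) return;  // already installed
    g_original = g_import_slot;
    g_import_slot = hooked_text_out;
}

// Restores the original table entry.
void remove_hook() {
    if (g_original != nullptr) {
        g_import_slot = g_original;
        g_original = nullptr;
    }
}
```

After `install_hook()`, a call such as `g_import_slot(10, 20, "hello")` is recorded in `g_log` and still reaches `real_text_out`; after `remove_hook()` calls go straight to the original again.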

**megaxoom** · April 9th, 2004, 10:32 AM

Hi Olin

Thanks for your help, but could you provide some information on where I can start with this? Thanks.

**sephiroth2m** · April 9th, 2004, 04:48 PM

As olin said, capturing the GDI API will work fine. After injecting your code, each time that API is called you'll get the start position of the string; compare it with the cursor's position and you'll get the underlying text. See GetTextExtentPoint32 and GetTextAlign for details.
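The comparison described above can be sketched as follows. This is a minimal, portable C++ sketch rather than working hook code: the `PrefixWidth` callback is a hypothetical stand-in for a measurement call such as GetTextExtentPoint32, left-aligned text is assumed (GetTextAlign would tell you otherwise), and only the horizontal direction is handled.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Measures the pixel width of the first `count` characters of `text`.
// Hypothetical stand-in for a call such as GetTextExtentPoint32.
using PrefixWidth =
    std::function<int(const std::string& text, std::size_t count)>;

// Returns the index of the character drawn under horizontal position
// `cursor_x`, or -1 if the cursor is outside the string. `start_x` is the
// x coordinate where drawing began (left-aligned text is assumed).
int char_index_at(const std::string& text, int start_x, int cursor_x,
                  const PrefixWidth& prefix_width) {
    if (cursor_x < start_x) {
        return -1;  // cursor is left of the string
    }
    const int offset = cursor_x - start_x;
    for (std::size_t i = 0; i < text.size(); ++i) {
        // Character i covers pixels [width(i), width(i + 1)).
        if (offset < prefix_width(text, i + 1)) {
            return static_cast<int>(i);
        }
    }
    return -1;  // cursor is right of the last character
}

// Fixed-pitch measurer used only for illustration: every character is
// `char_width` pixels wide.
PrefixWidth fixed_pitch(int char_width) {
    return [char_width](const std::string&, std::size_t count) {
        return static_cast<int>(count) * char_width;
    };
}
```

With 8-pixel characters and a string drawn at x = 100, a cursor at x = 151 falls on character index 6. The index can then be expanded to the surrounding word.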

**OReubens** · April 9th, 2004, 05:14 PM

The idea of intercepting 'textout' type calls, and storing them is a bogus solution. It just never will work for a variety of reasons.

1) You still won't know if the text is still there when the mouse is over it. It may already have been painted over by something else.
A simple example of this would be a tooltip. You may be able to grab the textout that made the tooltip display itself, but it would require a lot of additional work to know that the tooltip has disappeared.

3) Several windows can paint text at identical locations. (an app with several levels of dialogs is a simple example). As a result you would also need to be aware of focus.

4) Text can move around without calling any of the text type functions. Scrolling the window or blitting part of the window to another location is a possibility

5) Many programs (and IE is among them) use off-screen buffers (often this is a MemDC, but not necessarily) to prepare the painting, and then do a single blit from this off-screen buffer to the video memory, thus avoiding flicker.
It's impossible to track text in this type of program.

6) It'll require a HUGE amount of memory to store all the strings that have been painted, and there is no (easy) way to know which strings have become obsolete.

There are possibly a whole bunch of other reasons why capturing and storing GDI calls is just never going to work, the above are just the most obvious ones. This technique is not only going to be incredibly hard to even remotely achieve, it'll never work 'right'.

--
The best possible 'easy' solution is indeed the one Sam Hobbs gave.
Identify the type of window at the cursor-location, and then have a set of Windowtype-specific routines to calculate what text is under the cursor.
Doing this for the common controls (static, edit, list, tree, combo...) should be easy enough (although quite a bit of work), but other windows are going to be problematic if not impossible as there is no API available for them to do what you want.
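One way to organise a set of window-type-specific routines is a lookup table keyed by window class name. The sketch below shows only that dispatch step in portable C++. The `ExtractorRegistry` type is hypothetical, the per-class routines are left to the caller, and the class name is assumed to have been obtained already (for instance with GetClassName).

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

struct Point {
    int x;
    int y;
};

// A routine that returns the word under `pt` for one kind of window.
using WordExtractor = std::function<std::string(Point pt)>;

// Maps window class names to the routine that knows how to read them.
class ExtractorRegistry {
public:
    void add(const std::string& window_class, WordExtractor extractor) {
        table_[window_class] = std::move(extractor);
    }

    // Returns the word under `pt`, or an empty string when no routine is
    // registered for this window class (the "problematic" windows).
    std::string word_under(const std::string& window_class, Point pt) const {
        const auto it = table_.find(window_class);
        if (it == table_.end()) {
            return std::string();
        }
        return it->second(pt);
    }

private:
    std::map<std::string, WordExtractor> table_;
};
```

Unknown window classes simply produce an empty result, which is the point at which a fallback such as the OCR approach would have to take over.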

Another solution (and I know of a program that does it this way) would involve grabbing a bitmap of the area around the cursor location and applying an OCR-type algo to it to figure out what the word is. It works, and works always, but making an OCR algo is going to be a pain...

Ah sigh... Textmode DOS did have its perks in this area ;-)

**Sam Hobbs** · April 9th, 2004, 05:24 PM

Assuming that OCR requires a lot of processing, using it in realtime as the mouse scampers around is likely to be impossible also.

**OReubens** · April 10th, 2004, 05:18 AM

Well yes, you would think so, but a picture grabbed from screen is somewhat easier than a document scanned through a scanner...

Text is always perfectly horizontal (well... most of the time) and it's always locked into fixed pixel positions, and you can match characters to fixed pixel patterns. I'm not saying it's easy (hence the 'pain' I was referring to), but apparently it can be done fast enough to be usable.
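Matching characters to fixed pixel patterns could look roughly like the sketch below. It is a deliberately tiny model and not how any particular program works: glyphs are assumed to be 5x5 monochrome bitmaps that have already been cut out of the screenshot, and the types `Glyph` and `Template` and the function `recognise` are made up for this example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// A glyph is a 5x5 monochrome bitmap; each row uses the low 5 bits.
using Glyph = std::array<std::uint8_t, 5>;

// A known character together with its pixel pattern.
struct Template {
    char ch;
    Glyph pixels;
};

// Counts the pixels that differ between two glyphs.
int mismatch(const Glyph& a, const Glyph& b) {
    int diff = 0;
    for (std::size_t row = 0; row < a.size(); ++row) {
        unsigned bits = static_cast<unsigned>(a[row] ^ b[row]) & 0x1Fu;
        while (bits != 0) {
            diff += static_cast<int>(bits & 1u);
            bits >>= 1;
        }
    }
    return diff;
}

// Returns the character whose template differs from `patch` in the fewest
// pixels, or '?' when even the best match exceeds `max_mismatch`.
char recognise(const Glyph& patch, const std::vector<Template>& font,
               int max_mismatch = 0) {
    char best = '?';
    int best_diff = max_mismatch + 1;
    for (const Template& candidate : font) {
        const int diff = mismatch(patch, candidate.pixels);
        if (diff < best_diff) {
            best_diff = diff;
            best = candidate.ch;
        }
    }
    return best;
}
```

With `max_mismatch` left at 0 only pixel-exact matches are accepted, which fits the observation that on-screen text sits at fixed pixel positions. A small tolerance would be one way to cope with anti-aliasing or noise, at the cost of more false matches.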

The program I was referring to has its word capturing lag behind somewhat. It won't start grabbing the word until you have held the mouse at a specific place for half a second. And it seems to be able to capture the word near instantaneously (on a 2.3 GHz machine, I'll give them that :-)) if it's a 'normal' size (8 to 14pt) and a common font (Courier, Arial, MS Sans Serif, MS Serif, System, Terminal...). The kinds of fonts you'd see on your screen 99% of the time.
On extremely large or 'novelty' fonts, processing time can increase to about a second or so.
Still not bad if you ask me...
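The "hold the mouse still for half a second" behaviour can be sketched as a small dwell detector that is fed mouse samples and reports when the expensive capture step should run. This is an assumption about how such a delay might be implemented, not a description of the program mentioned above. The class name `DwellDetector` and the pixel tolerance are invented for the example.

```cpp
#include <chrono>
#include <cstdlib>

// Reports when the pointer has rested in (roughly) one place long enough.
class DwellDetector {
public:
    using Clock = std::chrono::steady_clock;

    DwellDetector(std::chrono::milliseconds dwell, int tolerance_px)
        : dwell_(dwell), tolerance_(tolerance_px) {}

    // Feed every mouse sample. Returns true exactly once per rest period,
    // at the first sample taken after the pointer has stayed within the
    // tolerance for the dwell time.
    bool update(int x, int y, Clock::time_point now) {
        const bool moved = !has_anchor_ ||
                           std::abs(x - anchor_x_) > tolerance_ ||
                           std::abs(y - anchor_y_) > tolerance_;
        if (moved) {
            // Start timing a new rest period at this position.
            anchor_x_ = x;
            anchor_y_ = y;
            since_ = now;
            has_anchor_ = true;
            fired_ = false;
            return false;
        }
        if (!fired_ && now - since_ >= dwell_) {
            fired_ = true;
            return true;
        }
        return false;
    }

private:
    std::chrono::milliseconds dwell_;
    int tolerance_;
    bool has_anchor_ = false;
    bool fired_ = false;
    int anchor_x_ = 0;
    int anchor_y_ = 0;
    Clock::time_point since_{};
};
```

Passing the timestamp in, instead of reading the clock inside `update`, keeps the class easy to test. The capture itself (OCR or otherwise) would run only when `update` returns true, so it does not have to keep up with every mouse move.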

**Sam Hobbs** · April 10th, 2004, 02:46 PM

However this question says that the word should be chosen as the mouse moves, which implies to me that it should be capable of working multiple times per second. I am assuming that is a requirement. If the problem were clarified then perhaps it is reasonable for the solution to require a couple of seconds. However it is probably an insignificant issue in this situation.

**sephiroth2m** · April 10th, 2004, 07:22 PM

1) You still won't know if the text is still there when the mouse is over it. It may already have been painted over by something else.
A simple example of this would be a tooltip. You may be able to grab the textout that made the tooltip display itself, but it would require a lot of additional work to know that the tooltip has disappeared.

There are many ways to know whether the text is still there; if you want, you can even keep the tooltip from getting the chance to show. Anyway, I don't think there are many tooltips that overlap their own text.

3) Several windows can paint text at identical locations. (an app with several levels of dialogs is a simple example). As a result you would also need to be aware of focus.

There's no need to be aware of focus; you get the necessary window with just one or two functions. If there are several windows at the same location, only the topmost one receives the invalidated region, which means only it triggers 'textout'.

4) Text can move around without calling any of the text type functions. Scrolling the window or blitting part of the window to another location is a possibility

Text always has relative coordinates, called logical coordinates, and you can convert these to screen coordinates very easily.

5) Many programs (and IE is among them) use off-screen buffers (often this is a MemDC, but not necessarily) to prepare the painting, and then do a single blit from this off-screen buffer to the video memory, thus avoiding flicker.
It's impossible to track text in this type of program.

It's possible, because before blitting to the screen the program must first 'textout' into its off-screen buffer.

6) It'll require a HUGE amount of memory to store all the strings that have been painted, and there is no (easy) way to know which strings have become obsolete.

You don't have to store any strings; the GDI functions did that. Really, if you draw a block of text, it will be divided into smaller sections, so no 'textout' call draws more than one line.

Of course, this solution doesn't work in all cases, but it will work in almost all of them. Besides, you don't need to care about the processing time or the OCR algo, and not much about the font... I think the OCR solution is a good solution too, but it won't work in all cases either.

**Sam Hobbs** · April 10th, 2004, 07:34 PM

Originally posted by sephiroth2m
Text always has relative coordinates, called logical coordinates, and you can convert these to screen coordinates very easily.

I have not heard of logical coordinates, especially not that can be converted to screen coordinates. Do you mean client coordinates?

**sephiroth2m** · April 12th, 2004, 05:54 AM

No, Sam, it's logical coordinates, which almost all GDI functions use. Screen/window/client coordinates are device coordinates (which have their origin at the top-left and use pixels as the unit). You use functions such as LPtoDP, ClientToScreen... to convert between them.
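To make the distinction concrete, here is a small C++ sketch of the arithmetic behind a logical-to-device conversion followed by a client-to-screen offset. It follows the usual window/viewport mapping formula, D = (L - WindowOrg) * ViewportExt / WindowExt + ViewportOrg, but it is only an approximation of what LPtoDP does: it ignores world transforms and uses plain integer division rather than GDI's own rounding. The `Mapping` struct and both function names are invented for the example.

```cpp
#include <cassert>

struct Point {
    int x;
    int y;
};

// Window/viewport origins and extents of a device context.
struct Mapping {
    Point window_org;
    Point window_ext;
    Point viewport_org;
    Point viewport_ext;
};

// Converts a logical point to device (client-area pixel) coordinates using
// D = (L - WindowOrg) * ViewportExt / WindowExt + ViewportOrg.
Point logical_to_device(const Mapping& m, Point p) {
    assert(m.window_ext.x != 0 && m.window_ext.y != 0);
    return Point{
        (p.x - m.window_org.x) * m.viewport_ext.x / m.window_ext.x +
            m.viewport_org.x,
        (p.y - m.window_org.y) * m.viewport_ext.y / m.window_ext.y +
            m.viewport_org.y};
}

// Converts client coordinates to screen coordinates, given the screen
// position of the client area's top-left corner.
Point client_to_screen(Point client_origin_on_screen, Point p) {
    return Point{p.x + client_origin_on_screen.x,
                 p.y + client_origin_on_screen.y};
}
```

When all extents are 1 and all origins are 0, which corresponds to the default MM_TEXT mapping mode, the first function is the identity, so logical and client coordinates coincide. That may be why the two are easy to confuse.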
