How to get the word under the cursor in Windows?

C#, Windows, WinAPI, Hook, OCR

C# Problem Overview


I want to create an application which gets the word under the cursor (not only for text fields), but I can't find how to do that. Using OCR is pretty hard. The only thing I've seen working is the Deskperience components. They support a 'native' way, but they cost a lot. Now I'm trying to figure out what this 'native' way is (maybe some kind of hooking). Any help will be appreciated.

EDIT: I found a way, but it gets only the whole text of the control. Any idea how to get only the word under the cursor from the whole text?

C# Solutions


Solution 1 - C#

On recent versions of Windows, the recommended way to gather information from one application to another (if you don't own the targeted application of course) is to use the UI Automation technology. Wikipedia is pretty good for more information on this: Microsoft UI Automation

Basically, UI Automation will use whatever means are available to gather what can be gathered.

Here is a small console application that spies on the UI of other apps. Run it and move the mouse over different applications. Each application supports a different set of "UI automation patterns". For example, there is the Value pattern and the Text pattern, as demonstrated here.

using System;
using System.Threading;
using System.Windows.Automation;
using System.Windows.Automation.Text;

// Assembly references needed: UIAutomationClient, UIAutomationTypes,
// WindowsBase, System.Windows.Forms and System.Drawing.
static class Program
{
    static void Main(string[] args)
    {
        do
        {
            // use the Windows Forms cursor position rather than the WPF mouse API
            System.Drawing.Point mouse = System.Windows.Forms.Cursor.Position;
            AutomationElement element = AutomationElement.FromPoint(new System.Windows.Point(mouse.X, mouse.Y));
            if (element == null)
            {
                // no element under the mouse: stop spying
                return;
            }

            Console.WriteLine("Element at position " + mouse + " is '" + element.Current.Name + "'");

            object pattern;
            // the "Value" pattern is supported by many applications (including IE & FF)
            if (element.TryGetCurrentPattern(ValuePattern.Pattern, out pattern))
            {
                ValuePattern valuePattern = (ValuePattern)pattern;
                Console.WriteLine(" Value=" + valuePattern.Current.Value);
            }

            // the "Text" pattern is supported by some applications (including Notepad)
            // and returns, for example, the current selection
            if (element.TryGetCurrentPattern(TextPattern.Pattern, out pattern))
            {
                TextPattern textPattern = (TextPattern)pattern;
                foreach (TextPatternRange range in textPattern.GetSelection())
                {
                    // -1 means "no length limit"
                    Console.WriteLine(" SelectionRange=" + range.GetText(-1));
                }
            }

            Thread.Sleep(1000); // poll once per second
            Console.WriteLine();
            Console.WriteLine();
        }
        while (true);
    }
}

UI automation is actually supported by Internet Explorer and Firefox, but not by Chrome to my knowledge. See this link: When will Google Chrome be accessible?

Now, this is just the beginning of work for you :-), because:

  • Most of the time, all this has heavy security implications. Using this technology (or direct Windows technology such as WindowFromPoint) will require sufficient rights to do so (such as being an administrator). And I don't think Deskperience has any way to overcome these limitations, unless they install a kernel driver on the computer.

  • Some applications will not expose anything to anyone, even with proper rights. For example, if I'm writing a banking application, I don't want you to spy on what my application will display :-). Other applications such as Outlook with DRM will not expose anything for the same reasons.

  • Only the UI Automation Text pattern can give more information (like the word) than just the whole text. Alas, this specific pattern is supported by neither IE nor FF, even though they support UI Automation in general.
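For elements that do expose the Text pattern, the word under a screen point can be requested directly. The following is only a sketch under that assumption; the class and method names (WordUnderCursor, GetWordAt) are made up for illustration, and the same assembly references as the console example (UIAutomationClient, UIAutomationTypes, WindowsBase) are assumed.

```csharp
using System;
using System.Windows.Automation;
using System.Windows.Automation.Text;

static class WordUnderCursor
{
    // Hypothetical helper: returns the word at the given screen point, or null
    // when the element there does not support the Text pattern.
    public static string GetWordAt(System.Windows.Point screenPoint)
    {
        AutomationElement element = AutomationElement.FromPoint(screenPoint);
        if (element == null)
            return null;

        object pattern;
        if (!element.TryGetCurrentPattern(TextPattern.Pattern, out pattern))
            return null;

        TextPattern textPattern = (TextPattern)pattern;
        try
        {
            // Empty range at the text position closest to the point.
            TextPatternRange range = textPattern.RangeFromPoint(screenPoint);
            // Grow the range until it covers the enclosing word.
            range.ExpandToEnclosingUnit(TextUnit.Word);
            return range.GetText(-1); // -1 means "no length limit"
        }
        catch (ArgumentException)
        {
            // The point fell outside the element's bounding rectangle.
            return null;
        }
    }
}
```

In the polling loop, this could be called with the same System.Windows.Point that is passed to AutomationElement.FromPoint. How a provider defines a "word" is up to the target application, so results may differ between apps.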

So, if all this does not work for you, you will have to dive deeper and use OCR or shape recognition techniques. Even then, there will be some cases where you won't be able to do it at all (because of security rights).

Solution 2 - C#

This is non-trivial if the application you want to "spy" on is drawing the text itself. One possible solution is to trigger the other application to paint a portion of its window by invalidating the area directly under the cursor.

When the other application paints, you will have to intercept the text drawing calls. One way to do so is to inject code into the other application and intercept calls into the GDI functions that draw text. When you debug native applications, this is what Visual Studio does to implement breakpoints. To test the idea you could use a library like Detours (but that's not free for commercial use).

You could also check if the application supports one of the accessibility APIs that are in Windows to facilitate things like screen readers for blind people.
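The first half of this idea, forcing a repaint of the area under the cursor, might look roughly like the sketch below. It only uses documented user32 functions, but the class name and the size of the invalidated rectangle (100 x 20 pixels) are arbitrary assumptions, and the hooking of the GDI text calls, which is the hard part, is not shown.

```csharp
using System;
using System.Runtime.InteropServices;

static class RepaintUnderCursor
{
    [StructLayout(LayoutKind.Sequential)]
    struct POINT { public int X; public int Y; }

    [StructLayout(LayoutKind.Sequential)]
    struct RECT { public int Left; public int Top; public int Right; public int Bottom; }

    [DllImport("user32.dll")]
    static extern bool GetCursorPos(out POINT pt);

    [DllImport("user32.dll")]
    static extern IntPtr WindowFromPoint(POINT pt);

    [DllImport("user32.dll")]
    static extern bool ScreenToClient(IntPtr hWnd, ref POINT pt);

    [DllImport("user32.dll")]
    static extern bool InvalidateRect(IntPtr hWnd, ref RECT rect, bool erase);

    [DllImport("user32.dll")]
    static extern bool UpdateWindow(IntPtr hWnd);

    static void Main()
    {
        POINT pt;
        if (!GetCursorPos(out pt))
            return;

        IntPtr hwnd = WindowFromPoint(pt);
        if (hwnd == IntPtr.Zero)
            return;

        // InvalidateRect expects client coordinates of the target window.
        ScreenToClient(hwnd, ref pt);

        // Arbitrary small area around the cursor (an assumption, tune as needed).
        RECT area = new RECT { Left = pt.X - 50, Top = pt.Y - 10, Right = pt.X + 50, Bottom = pt.Y + 10 };

        InvalidateRect(hwnd, ref area, false);
        UpdateWindow(hwnd); // ask the window to paint the invalid area now
    }
}
```

The text drawing calls triggered by this repaint would still have to be captured by code running inside the target process.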

One word of caution: I have not done any of this myself.

Solution 3 - C#

If the app needs to handle more than just .NET apps, I would start by importing native functions via P/Invoke.
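The declarations this answer refers to are not shown on this page. As a rough sketch of what such imports could look like, assuming the goal is to read the text of the window under the cursor (the class name and the choice of functions are my assumption, not the author's code):

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

static class WindowTextUnderCursor
{
    const uint WM_GETTEXT = 0x000D;
    const uint WM_GETTEXTLENGTH = 0x000E;

    [StructLayout(LayoutKind.Sequential)]
    struct POINT { public int X; public int Y; }

    [DllImport("user32.dll")]
    static extern bool GetCursorPos(out POINT pt);

    [DllImport("user32.dll")]
    static extern IntPtr WindowFromPoint(POINT pt);

    [DllImport("user32.dll", CharSet = CharSet.Unicode)]
    static extern IntPtr SendMessage(IntPtr hWnd, uint msg, IntPtr wParam, IntPtr lParam);

    [DllImport("user32.dll", CharSet = CharSet.Unicode)]
    static extern IntPtr SendMessage(IntPtr hWnd, uint msg, IntPtr wParam, StringBuilder lParam);

    static void Main()
    {
        POINT pt;
        if (!GetCursorPos(out pt))
            return;

        // Deepest window (often a child control) at the cursor position.
        IntPtr hwnd = WindowFromPoint(pt);
        if (hwnd == IntPtr.Zero)
            return;

        // Ask the control how long its text is, then fetch it.
        int length = (int)SendMessage(hwnd, WM_GETTEXTLENGTH, IntPtr.Zero, IntPtr.Zero);
        StringBuilder buffer = new StringBuilder(length + 1);
        SendMessage(hwnd, WM_GETTEXT, (IntPtr)buffer.Capacity, buffer);

        Console.WriteLine("Text of window under cursor: " + buffer);
    }
}
```

This only returns something useful for controls that answer WM_GETTEXT with their content, such as standard edit controls; custom-drawn windows will typically return a caption or nothing.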

Later you can iterate over the controls and try to get the text from inside, based on the control type. If I find some time, I will try to publish such code.

After some checking, it looks like the best way (unfortunately also the hardest) is to hook into GDI text rendering: some discussion
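Once the whole text of a control and a character index inside it are known, cutting out the word is plain string work. The sketch below assumes the index has already been obtained somehow (for standard edit controls the EM_CHARFROMPOS message is one possible source); WordExtractor, WordAt, and the definition of a word character are illustrative choices, not part of any API.

```csharp
using System;

static class WordExtractor
{
    // Returns the run of word characters around text[index], or "" when the
    // index is out of range or points at a non-word character.
    public static string WordAt(string text, int index)
    {
        if (string.IsNullOrEmpty(text) || index < 0 || index >= text.Length)
            return string.Empty;
        if (!IsWordChar(text[index]))
            return string.Empty;

        // Walk left to the first character of the word.
        int start = index;
        while (start > 0 && IsWordChar(text[start - 1]))
            start--;

        // Walk right to the last character of the word.
        int end = index;
        while (end < text.Length - 1 && IsWordChar(text[end + 1]))
            end++;

        return text.Substring(start, end - start + 1);
    }

    // Assumption: letters, digits, underscore and apostrophe make up a word.
    static bool IsWordChar(char c)
    {
        return char.IsLetterOrDigit(c) || c == '_' || c == '\'';
    }
}
```

For example, WordExtractor.WordAt("the quick brown fox", 6) returns "quick", and an index that lands on a space returns an empty string.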

Solution 4 - C#

I'd echo what Patricker said, but I think there is no reliable way to do what you want.

You probably obtained the window text or something like that. But what if the cursor is over a window that doesn't use the window text to store its content? Windows are under no obligation to store their data in a particular way.

This ends up pointing you towards character recognition where you look at the pixels under the cursor and try and figure out what words are there. But not only is this very non-trivial, it also is not foolproof. What if part of the word is not visible because it extends out of the window?

This is definitely not trivial. There are a couple of ways to approach it. But there is no reliable way that will work with all windows.

Solution 5 - C#

There is an SDK for getting the text using OCR (http://www.screenocr.com/screen-ocr-library-sdk.htm). It's not free, but it's quite cheap compared to other products. They also have an application which provides the same features, so you can try the demo too.

Solution 6 - C#

To achieve this you need a multi-pronged approach.

UIA does work in many applications, but you need to experiment to see where the text is returned. It may be in Element, Value or Range. There is no consistency, even across Office applications.

If UIA fails, enumerate the running object table (ROT) and retrieve the COM pointers to the various apps registered in it. You can then cast these pointers to the underlying Office types, for example:

// while enumerating the ROT, for each registered object:
Excel._Workbook wb = (Excel._Workbook)enumerator.Value;
string strText = wb.Application.ActiveCell.Text.ToString();
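The "enumerate ROT" step is not spelled out above. One way it might be done is sketched here with the COM interop types from System.Runtime.InteropServices.ComTypes; the RunningObjects class and its Enumerate method are hypothetical names, and error handling is kept to a minimum.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;
using System.Runtime.InteropServices.ComTypes;

static class RunningObjects
{
    [DllImport("ole32.dll")]
    static extern int GetRunningObjectTable(uint reserved, out IRunningObjectTable rot);

    [DllImport("ole32.dll")]
    static extern int CreateBindCtx(uint reserved, out IBindCtx bindCtx);

    // Returns display name -> COM object for the entries registered in the ROT.
    public static Dictionary<string, object> Enumerate()
    {
        var result = new Dictionary<string, object>();

        IRunningObjectTable rot;
        IBindCtx bindCtx;
        if (GetRunningObjectTable(0, out rot) != 0 || CreateBindCtx(0, out bindCtx) != 0)
            return result; // a non-zero HRESULT means the call failed

        IEnumMoniker enumerator;
        rot.EnumRunning(out enumerator);

        var monikers = new IMoniker[1];
        // Next returns 0 (S_OK) while it still hands out a moniker.
        while (enumerator.Next(1, monikers, IntPtr.Zero) == 0)
        {
            string name;
            monikers[0].GetDisplayName(bindCtx, null, out name);

            object instance;
            if (rot.GetObject(monikers[0], out instance) == 0)
                result[name] = instance;
        }
        return result;
    }
}
```

Each value in the returned dictionary can then be tested against the Office interop types (for instance with the as operator and Excel._Workbook, which requires a reference to the Excel interop assembly) before reading the active cell as shown above.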

If the above two methods fail, then make use of the free OCR system in MODI (Microsoft Office Document Imaging 12.0 Type Library).

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | blez | View Question on Stackoverflow
Solution 1 - C# | Simon Mourier | View Answer on Stackoverflow
Solution 2 - C# | user180326 | View Answer on Stackoverflow
Solution 3 - C# | bartosz.lipinski | View Answer on Stackoverflow
Solution 4 - C# | Jonathan Wood | View Answer on Stackoverflow
Solution 5 - C# | Giorgi | View Answer on Stackoverflow
Solution 6 - C# | RichardB | View Answer on Stackoverflow