-
July 6th, 2007, 10:09 AM
#1
Searching a word doc
I managed to open a Word Document in an MFC application. What is the best way to search for specific strings in that document? I want to extract those strings and compare them to a set of presaved CString objects.
Would it work in a similar way to the CStdioFile class? (eg finput.ReadString)
-
July 6th, 2007, 11:18 AM
#2
Re: Searching a word doc
How did you open the Word document? Can you please be more specific?
Please use code tags [code] [/code]
We would change the world, but God won't give us the sourcecode..
Undocumented futures are fun and useful....
_________
Gili
-
July 6th, 2007, 11:21 AM
#3
Re: Searching a word doc
I used automation:
Code:
_Application wordApp;
CString id("Word.Application");
BOOL status = wordApp.CreateDispatch(id);
_Document myDoc;
COleVariant covTrue((short)TRUE);
COleVariant covFalse((short)FALSE);
COleVariant covOptional((long)DISP_E_PARAMNOTFOUND, VT_ERROR);
Documents docs = wordApp.GetDocuments();
CString ReportFileName(filenameWord);
myDoc = docs.Open(COleVariant(ReportFileName),
covOptional,
covFalse, // ReadOnly
covOptional,
covOptional,
covOptional,
covOptional,
covOptional,
covOptional,
covOptional,
covOptional,
covOptional);
wordApp.SetVisible(TRUE);
wordApp.Activate();
myDoc.SetUserControl(TRUE);
The parameter "filenameWord" is defined in another section of the code, and it contains the file's path.
I don't need to modify or save the Word document. All I need to do is search it for some specific CString objects that look like "<some text goes here>".
-
July 6th, 2007, 01:15 PM
#4
Re: Searching a word doc
Not necessarily the best way but this is A way to search.
Code:
COleVariant covTrue((short)TRUE),
covFalse((short)FALSE),
covOpt((long)DISP_E_PARAMNOTFOUND, VT_ERROR);
// Get the IDispatch pointer and attach it to the objWord object.
if (!objWord.CreateDispatch("Word.Application"))
{
    AfxMessageBox("Couldn't get Word object.");
    return;
}
objWord.SetVisible(TRUE);
_Document doc;
Documents docs(objWord.GetDocuments());
doc=docs.Open(COleVariant("c:/My Documents/ThingsToTry.doc"));
Selection sel;
sel=objWord.GetSelection();
sel.WholeStory();
Find fnd;
fnd=sel.GetFind();
fnd.SetText("iteration");
fnd.SetForward(TRUE);
fnd.Execute(covOpt,covOpt,covOpt,covOpt,covOpt,covOpt,covOpt,covOpt,covOpt,covOpt,covOpt);
The calls may take different arguments depending on the version of Word you are using. This was written years ago using the MS Word 97 typelib.
Verere testudinem! (Fear the turtle)
Once you can accept the universe as matter expanding into nothing that is something, wearing stripes with plaid comes easy. -Albert Einstein
Robots are trying to steal my luggage.
-
July 6th, 2007, 01:57 PM
#5
Re: Searching a word doc
Thanks for the code, I was able to get it working...don't know exactly how I managed to do that...heh
In any case, it does find what I give it, but it only works once, and doesn't find the rest of the strings. Here is the code to clarify what I mean.
Code:
Selection sel = wordApp.GetSelection();
Find find = sel.GetFind();
Replacement repl = find.GetReplacement();
for (int k = 0; k < numberOfKeys; k++) {
    find.SetText(listOfKeys[k]);
    find.SetForward(TRUE);
    if (find.Execute(covOptional, covOptional, covOptional, covOptional, covOptional,
                     covOptional, covOptional, covOptional, covOptional, covOptional,
                     covOptional, covOptional, covOptional, covOptional, covOptional)) {
        foutput.WriteString(listOfKeys[k]);
        foutput.WriteString(_T(" = "));
        foutput.WriteString(database[listOfKeys[k]]);
        foutput.WriteString(_T("\n"));
    }
}
listOfKeys is an array of CString that contains the strings that I want to search for.
numberOfKeys is just an int with the total number of keys in the database.
database is a CMapStringToString object that has all the CStrings from the array as keys, and some comments as the data that comes with it. I am using this object to print stuff to a file.
-
July 6th, 2007, 02:06 PM
#6
Re: Searching a word doc
I was able to repeat the search multiple times until it had found all instances of the word. Execute returns TRUE if it finds the word and FALSE if it doesn't. Maybe that isn't what you meant.
Code:
fnd=sel.GetFind();
fnd.SetText("iteration");
fnd.SetForward(TRUE);
BOOL res = TRUE;
while (res == TRUE)
{
    res = fnd.Execute(covOpt,covOpt,covOpt,covOpt,covOpt,covOpt,covOpt,covOpt,covOpt,covOpt,covOpt);
}
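For anyone who wants to try the repeat-until-not-found idea without Word in the picture, here is a minimal sketch of the same pattern on a plain std::string. The function name findAllOccurrences is made up for illustration, and std::string::find is only a stand-in for Find.Execute, so treat it as an analogy rather than as automation code.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Collect the start offset of every non-overlapping occurrence of word in
// text by searching again from just past the previous match.
// (findAllOccurrences is a hypothetical helper, not part of the Word typelib.)
std::vector<std::size_t> findAllOccurrences(const std::string& text,
                                            const std::string& word)
{
    std::vector<std::size_t> offsets;
    if (word.empty())
        return offsets;  // an empty pattern would match at every position

    std::size_t pos = text.find(word);
    while (pos != std::string::npos) {
        offsets.push_back(pos);
        // Resume the search right after the match that was just recorded.
        pos = text.find(word, pos + word.size());
    }
    return offsets;
}
```

The loop stops on the first failed search, just as the while loop above stops when Execute returns FALSE.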
I can replace all instances of "iteration" with "Whizbang" by using the ReplaceWith and Replace arguments of Execute.
Code:
fnd=sel.GetFind();
fnd.SetText("iteration");
fnd.SetForward(TRUE);
BOOL res;
res=fnd.Execute(covOpt,covOpt,covOpt,covOpt,covOpt,covOpt,covOpt,covOpt,covOpt,COleVariant("Whizbang"),COleVariant(2L)); //2=replaceall
Last edited by Tom Frohman; July 6th, 2007 at 02:16 PM.
-
July 6th, 2007, 02:26 PM
#7
Re: Searching a word doc
What does the SetForward do?
From what I posted, you can see that my code cycles through an array element by element. The element is used in the find.SetText statement.
Then I call find.Execute(....). If it is true, I go into the statement and print something to a file. If not, I go back to the beginning of the for loop and simply change the statement in the find.SetText command, then I search for that instead, etc...
The Word Doc that I am searching has the following text:
<T5><T4><T3><T2><T1>
The file that is generated only shows <T5>, and none of the other things, even though they are stored in the database.
-
July 6th, 2007, 03:25 PM
#8
Re: Searching a word doc
The problem is that Find resets the selection to the found text.
You need to reset the selection each time, by calling sel.WholeStory() or some other method, before calling Execute.
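To make the effect easier to see, here is a rough toy model in plain C++. It is not the Word object model: ToySelection, wholeStory, execute and findKeys are hypothetical names, and the model simply assumes that a search only looks inside the current selection and that a successful search shrinks the selection to the match.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a selection over a document. Hypothetical type, used only
// to illustrate the behaviour described in this thread.
struct ToySelection {
    std::string text;       // the whole "document"
    std::size_t start = 0;  // the selection covers [start, end)
    std::size_t end = 0;

    explicit ToySelection(std::string documentText)
        : text(std::move(documentText)), end(text.size()) {}

    // Select the entire document again.
    void wholeStory() { start = 0; end = text.size(); }

    // Search forward inside the current selection only. On success the
    // selection shrinks to the matched text.
    bool execute(const std::string& key) {
        const std::size_t pos = text.find(key, start);
        if (pos == std::string::npos || pos + key.size() > end)
            return false;
        start = pos;
        end = pos + key.size();
        return true;
    }
};

// Search for each key in turn, optionally resetting the selection first.
std::vector<std::string> findKeys(const std::string& documentText,
                                  const std::vector<std::string>& keys,
                                  bool resetBeforeEachSearch)
{
    ToySelection sel(documentText);
    std::vector<std::string> found;
    for (const std::string& key : keys) {
        if (resetBeforeEachSearch)
            sel.wholeStory();
        if (sel.execute(key))
            found.push_back(key);
    }
    return found;
}
```

In this model, searching "<T5><T4><T3><T2><T1>" for the five tags without a reset reports only the first key that matches, while resetting before every search reports all five.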
-
July 6th, 2007, 03:40 PM
#9
Re: Searching a word doc
Wow, that worked!
Thanks a lot for your help, I couldn't have figured it out without some direction.
Also out of curiosity...is there another way to reset the selection other than WholeStory...just for future reference.
Now the other issue is that it is extremely slow, but that's simply an efficiency issue I'll work out later.
-
July 7th, 2007, 07:14 AM
#10
Re: Searching a word doc
An alternative:
All Text To String
In the link above all the text is selected and read into a string.
You could then use the CString Find() method to find the text.
It might not be practical if the documents are too big, but it might be worth a try.
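Once the text is in a string, the lookup itself could look something like the sketch below. It uses standard C++ so it can be tried without MFC: std::string::find stands in for CString::Find, std::map stands in for the CMapStringToString database, and reportKeysFound is a made-up helper name. Getting the text out of Word is not shown.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Build one "key = comment" line for every key that occurs in documentText.
// Assumption: documentText already holds the full text of the document.
// (reportKeysFound is a hypothetical helper, not an MFC or Word function.)
std::string reportKeysFound(const std::string& documentText,
                            const std::vector<std::string>& keys,
                            const std::map<std::string, std::string>& database)
{
    std::ostringstream out;
    for (const std::string& key : keys) {
        // Each search starts from the beginning of the text, so the order
        // of the keys does not matter.
        if (documentText.find(key) == std::string::npos)
            continue;  // key does not occur anywhere in the text

        const auto entry = database.find(key);
        out << key << " = "
            << (entry != database.end() ? entry->second : std::string())
            << "\n";
    }
    return out.str();
}
```

With the text "<T5><T4><T3><T2><T1>" and the keys "<T1>", "<T2>" and "<T9>", only the first two keys produce a line. Because the search never touches the selection, nothing has to be reset between keys.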
The select method of the document class in the link is an alternative to the select whole story command.
One way to speed up automation is to NOT make the document visible and/or turn off screen updating.
Tom
-
July 9th, 2007, 08:38 AM
#11
Re: Searching a word doc
I tried to hide the Word document but it doesn't work. Someone here at the office told me to set SetVisible to FALSE, so I did this:
wordApp.SetVisible(FALSE);
wordApp is an object of type _Application.
It compiles but crashes when I attempt to open the document. Something tells me he has no idea what he's talking about...
-
July 9th, 2007, 09:39 AM
#12
Re: Searching a word doc
SetVisible(FALSE) is the right way. I think the crash is a separate issue.
If that doesn't work, try
wordApp.SetScreenUpdating(FALSE);
at the start of the operation and
wordApp.SetScreenUpdating(TRUE);
at the end
Last edited by Tom Frohman; July 9th, 2007 at 09:47 AM.
-
July 9th, 2007, 09:49 AM
#13
Re: Searching a word doc
Thanks for the input. When I do it that way and I open the file, I get a "Cannot activate application" error...I can tell that the program starts running and everything works, then a few seconds into it it gives me that error. Any ideas on what could be causing it?
-
July 9th, 2007, 10:16 AM
#14
Re: Searching a word doc
I tried your suggestions but it still doesn't work...I'm not exactly sure why though, I'm debugging right now to see what's happening.
Edit: it doesn't crash anymore, but it still opens up the document...if I have SetVisible(FALSE) in there as well, it still gives me the same error (cannot activate application).
Last edited by ahammad; July 9th, 2007 at 10:24 AM.