Retrieving HTML text from a web page


Aaron Croasmun
March 12th, 2001, 08:22 AM
I'm trying to write a search engine 'robot', which works except for one major detail: the program reads a file on the hard drive rather than the text of a web page. That is, it uses file.asp itself instead of the text you'd see when browsing to file.asp over the internet. My question is, how can I grab all of the text from a web page after it's been opened in a browser? Thanks!
-Aaron

Iouri
March 12th, 2001, 08:44 AM
'Ref: ( it is in mshtml.dll)
'Microsoft HTML Object Library
'Microsoft Internet Controls

Option Explicit

Dim htmBody As New HTMLBody
Dim htmDoc As New HTMLDocument

Private Sub Command1_Click() 'Get WEB
wbcMain.Navigate Text1.Text 'Text1 holds the URL to load
End Sub


Private Sub Command2_Click() 'Get text to msgbox
Set htmDoc = wbcMain.Document
Set htmBody = htmDoc.body
MsgBox htmBody.innerText
End Sub

Private Sub Command3_Click() 'print text
Printer.Print htmBody.innerText
Printer.EndDoc 'sends the job to the printer; a NewPage call after this would queue a blank page
End Sub

Private Sub Command4_Click() 'exit
Unload Me
End Sub

Private Sub Form_Resize() 'resize
Dim hgt As Single

hgt = ScaleHeight - wbcMain.Top
If hgt < 120 Then hgt = 120
wbcMain.Move 0, wbcMain.Top, ScaleWidth, hgt
End Sub
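
'----------------
'A possible refinement (a sketch, not part of the code above): clicking
'Command2 before the page has finished loading can fail because the
'document is not ready yet. Assuming the same wbcMain control and the
'module-level htmDoc and htmBody variables, the text could instead be
'read in the control's DocumentComplete event. Debug.Print is only a
'placeholder for whatever the robot does with the text.

Private Sub wbcMain_DocumentComplete(ByVal pDisp As Object, URL As Variant)
    'The event also fires once per frame; act only on the top-level page
    If Not (pDisp Is wbcMain.Object) Then Exit Sub

    'Skip non-HTML content, for which Document is not an HTMLDocument
    If Not TypeOf wbcMain.Document Is HTMLDocument Then Exit Sub

    Set htmDoc = wbcMain.Document
    Set htmBody = htmDoc.body
    If Not htmBody Is Nothing Then Debug.Print htmBody.innerText
End Sub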
'================
'another way
Add the Microsoft Internet Transfer Control to the form,
then write this code:

strvar = Inet1.OpenURL("http://ipaddress/virtual directory/filename")

or

strvar = Inet1.OpenURL("c:/path.../filename")

strvar will then contain the contents of the page as returned, that is, its HTML source rather than the rendered text.
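
A minimal sketch of wrapping that call with a timeout and basic error handling, assuming the control is named Inet1. FetchPage is a hypothetical helper name, and the 30-second timeout is an arbitrary choice:

Private Function FetchPage(ByVal strURL As String) As String
    On Error GoTo FetchFailed
    Inet1.RequestTimeout = 30 'seconds to wait before giving up
    FetchPage = Inet1.OpenURL(strURL, icString) 'icString: return the data as text
    Exit Function
FetchFailed:
    FetchPage = "" 'treat any error (timeout, bad URL) as "no content"
End Function

'Usage: strvar = FetchPage(Text1.Text)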




Iouri Boutchkine
iouri@hotsheet.com