Hi all, can anyone advise me on how to process large text files in VB6? How can I read multiple text files from a directory? How can I read Unicode text files from a directory? Thanks in advance.
Depends on what you'd like to do with the data. You should SEARCH the forums for examples
I want to read 2000 Unicode text files. Each file is about 13-15 KB, and the files fall into 4 categories. I want to find the frequency of each word for each category, then put the result in a database table or another text file. What I want to do is preprocess the text files so they are ready for text categorization. One more thing: can I use a hash table for that, and how?
13-15 KB is not a large file.
In fact, you could easily store a whole file in a single string.
So you could store all 2000 files in an array of strings.
You do need to watch out, however, for how you treat Unicode files.
I have some cool links for Unicode processing in my office. I shall post them tomorrow.
Generally, once the Unicode strings are loaded properly, processing should be no problem. You only have to take care when writing Unicode back to a file.
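To make the "array of strings" idea concrete, here is a minimal sketch that walks a folder with Dir$ and loads every *.txt file into a string array. It assumes the files are little-endian UTF-16; LoadFolder and ReadRaw are names made up for this illustration, not part of any library.

```vb
' Sketch only: load every *.txt file in a folder into a string array.
' Assumes UTF-16 little-endian files. LoadFolder/ReadRaw are hypothetical names.
Option Explicit

Public Function LoadFolder(ByVal folder As String, texts() As String) As Long
    Dim fileName As String, n As Long
    If Right$(folder, 1) <> "\" Then folder = folder & "\"
    fileName = Dir$(folder & "*.txt")        ' first match, "" if there is none
    Do While Len(fileName) > 0
        ReDim Preserve texts(0 To n)
        texts(n) = ReadRaw(folder & fileName)
        n = n + 1
        fileName = Dir$()                    ' next match
    Loop
    LoadFolder = n                           ' number of files loaded
End Function

Private Function ReadRaw(ByVal path As String) As String
    Dim f As Integer, b() As Byte, s As String
    f = FreeFile
    Open path For Binary Access Read As #f
    If LOF(f) > 0 Then
        ReDim b(0 To LOF(f) - 1)
        Get #f, , b                          ' raw bytes, no ANSI conversion
        s = b                                ' byte array -> String copies the bytes as-is
    End If
    Close #f
    ' Drop the byte order mark (U+FEFF) if the file starts with one.
    If Len(s) > 0 Then
        If AscW(s) = &HFEFF Then s = Mid$(s, 2)
    End If
    ReadRaw = s
End Function
```

The caller would declare a dynamic array (Dim texts() As String) and pass it in; the return value says how many entries were filled. At roughly 15 KB per file, 2000 files come to about 30 MB of raw data, which should fit in memory on most machines.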
Thank you very much, I really appreciate it. I will wait for your links tomorrow.
By the way, how about finding the frequency of each word across the whole text? How can I store the results? What is a suitable data structure to store each word and its frequency?
You have to be aware that there are several Unicode encodings.
I only have experience with UTF-16, which stores two bytes per character for most text (characters outside the Basic Multilingual Plane take four). It is very easy to handle.
The first two bytes of a Unicode file usually identify the encoding (the byte order mark). If they are hex FF FE, the file is little-endian UTF-16, which VB6 can handle.
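This, in short, is how to read a file:
Code:
Dim sig%, i$
Open FileName For Binary As #1
Get #1, , sig ' read the 2-byte byte order mark
' Bytes FF FE on disk load into an Integer as &HFEFF (little-endian)
If sig = &HFEFF Then
    i$ = InputB(LOF(1) - 2, #1) ' rest of the file, copied without ANSI conversion
Else
    MsgBox "No UTF16"
End If
Close #1
i$ then contains the complete file as a Unicode string.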
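For the opposite direction, the thing to watch when writing is that Put with a String variable converts the text to ANSI. A rough sketch of one way around that is to write the byte order mark yourself and then the raw bytes of the string. WriteUtf16 is a name I made up here; it is not one of the routines in modUnicodeRW.bas.

```vb
' Sketch only: write a VB6 string to disk as UTF-16 little-endian with a BOM.
' WriteUtf16 is a hypothetical name used for illustration.
Public Sub WriteUtf16(ByVal path As String, ByVal content As String)
    Dim f As Integer, bom As Integer, b() As Byte
    If Len(Dir$(path)) > 0 Then Kill path    ' Binary mode does not truncate an existing file
    f = FreeFile
    Open path For Binary Access Write As #f
    bom = &HFEFF                             ' ends up on disk as the bytes FF FE
    Put #f, , bom
    If Len(content) > 0 Then
        b = content                          ' String -> byte array keeps the raw UTF-16 bytes
        Put #f, , b                          ' Binary mode writes the array data only
    End If
    Close #f
End Sub
```

A file written this way should pass the signature check in the reading code, since it starts with the same two bytes.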
Look at this rather good tutorial: http://www.cyberactivex.com/UnicodeT...lVb.htm#FileIO
In the FileIO section, download modUnicodeRW.bas.
It contains several routines to read and write Unicode, some of them using API calls.
You can study them and choose.
I also recommend going through the other interesting parts of the tutorial.
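On the hash table question from earlier in the thread: one option that is available from VB6 is the Scripting.Dictionary object from the Microsoft Scripting Runtime, which maps keys to values. Below is a minimal sketch that counts words in an already loaded string. CountWords is a hypothetical name, and splitting only on spaces, tabs and line breaks is a simplifying assumption; real text would also need punctuation handling.

```vb
' Sketch only: count word frequencies with a Scripting.Dictionary.
' Late-bound, so no project reference is needed. CountWords is a hypothetical name.
Public Function CountWords(ByVal content As String) As Object
    Dim dict As Object, words() As String, i As Long, w As String
    Set dict = CreateObject("Scripting.Dictionary")
    dict.CompareMode = vbTextCompare         ' treat "Word" and "word" as the same key
    ' Turn line breaks and tabs into spaces so Split sees one kind of separator.
    content = Replace(content, vbCr, " ")
    content = Replace(content, vbLf, " ")
    content = Replace(content, vbTab, " ")
    words = Split(content, " ")
    For i = LBound(words) To UBound(words)
        w = words(i)
        If Len(w) > 0 Then                   ' skip the empty pieces left by repeated spaces
            dict(w) = dict(w) + 1            ' a missing key starts as Empty, which counts as 0
        End If
    Next i
    Set CountWords = dict
End Function
```

You could keep one dictionary per category and feed it every file of that category, then loop over dict.Keys to write each word and its count to a database table or a text file.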