Large text document processing in vb6

Thread: Large text document processing in vb6

  1. #1
    Join Date
    Oct 2011
    Location
    Malaysia , Selangor
    Posts
    89

    Large text document processing in vb6

    Hi all. Can anyone advise me on how to process large text files in VB6? How can I read multiple text files from a directory? How can I read Unicode text files from a directory? Thanks in advance.

  2. #2
    Join Date
    Jan 2006
    Location
    Chicago, IL
    Posts
    14,945

    Re: Large text document processing in vb6

    Depends on what you'd like to do with the data. You should SEARCH the forums for examples
    David

    CodeGuru Article: Bound Controls are Evil-VB6
    2013 Samples: MS CODE Samples

    CodeGuru Reviewer
    2006 Dell CSP
    2006, 2007 & 2008 MVP Visual Basic
    If your question has been answered satisfactorily, and it has been helpful, then, please, Rate this Post!

  3. #3
    Join Date
    Oct 2011
    Location
    Malaysia , Selangor
    Posts
    89

    Re: Large text document processing in vb6

    I want to read 2000 Unicode text files, each about 13-15 KB. The files fall into 4 categories. I want to find the frequency of each word for each category, then put the result in a database table or another text file. What I want to do is preprocess the text files so they are ready for text categorization. One more thing: can I use a hash table for that, and how?

  4. #4
    Join Date
    Jul 2006
    Location
    Germany
    Posts
    3,722

    Re: Large text document processing in vb6

    13-15 KB is not a large file.
    In fact, you could easily store one file in ONE string.
    So you could store all 2000 files in an array of strings.
    You do have to be careful, however, about how you treat Unicode files.
    I have some cool links for Unicode processing in my office. I shall post them tomorrow.
    Generally, once the Unicode strings are loaded properly, processing should be no problem. You only have to be careful when writing Unicode back to a file.
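
    On the point about writing Unicode back to a file: VB6's string-based file output converts text to ANSI, so one way around that is to write the raw bytes yourself. The following is only a minimal sketch under that assumption. WriteUtf16File is a hypothetical helper name, not something from this thread or from VB6 itself, and it writes UTF-16 little-endian with a leading byte order mark.

```vb
Option Explicit

' Hypothetical helper: writes text to path as UTF-16 little-endian,
' preceded by the byte order mark (bytes FF FE).
Public Sub WriteUtf16File(ByVal path As String, ByVal text As String)
    Dim fileNum As Integer
    Dim bom As Integer
    Dim raw() As Byte

    ' Binary mode does not truncate, so remove any existing file first.
    If Len(Dir$(path)) > 0 Then Kill path

    bom = &HFEFF                ' stored little-endian, i.e. as bytes FF FE
    fileNum = FreeFile
    Open path For Binary Access Write As #fileNum
    Put #fileNum, , bom
    If Len(text) > 0 Then
        raw = text              ' copies the string's UTF-16 bytes unchanged
        Put #fileNum, , raw     ' a Byte array is written without ANSI conversion
    End If
    Close #fileNum
End Sub
```

    Going through a Byte array is the design choice here: a Put of a String variable would be converted to ANSI on the way out, while a Byte array is written as-is.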

  5. #5
    Join Date
    Oct 2011
    Location
    Malaysia , Selangor
    Posts
    89

    Re: Large text document processing in vb6

    Thank you very much, really appreciated. I will wait for your links tomorrow.
    By the way, how about finding the frequency of each word in a text as a whole? How can I store them? What is a suitable data structure to store each word and its frequency?

  6. #6
    Join Date
    Jan 2006
    Location
    Chicago, IL
    Posts
    14,945

    Re: Large text document processing in vb6

    Kind of like GOOGLE does?
    David


  7. #7
    Join Date
    Oct 2011
    Location
    Malaysia , Selangor
    Posts
    89

    Re: Large text document processing in vb6

    It is automatic text categorization (ATC) for a specific language.

  8. #8
    Join Date
    Jul 2006
    Location
    Germany
    Posts
    3,722

    Re: Large text document processing in vb6

    You have to be aware that there are several Unicode encodings.
    I only have experience with UTF-16, which stores two bytes per character for most text (characters outside the Basic Multilingual Plane take four). It is very easy to handle.
    The first two bytes of a Unicode file can identify the type of encoding. If they are the bytes FF FE (the little-endian byte order mark), it is UTF-16 that VB6 can handle.
    In short, this is how to read a file:
    Code:
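      Dim sig%, i$
      Open FileName For Binary As #1
      Get #1, , sig                      ' read the 2-byte signature
      ' Bytes FF FE arrive in a little-endian Integer as &HFEFF
      If sig = &HFEFF Then
          i$ = InputB(LOF(1) - 2, #1)    ' rest of the file, without the signature
      Else
          MsgBox "No UTF16"
      End If
      Close #1
    i$ then contains the complete file as a Unicode string.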
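
    The same read can be applied to every file in a directory, which was the other part of the original question. This is only a rough sketch: ReadUtf16File and LoadFolder are hypothetical helper names, the *.txt pattern is an assumption about how the files are named, and Dir$ may not cope with file names containing characters outside the ANSI code page.

```vb
Option Explicit

' Hypothetical helper: returns the text of a UTF-16 little-endian file,
' or "" if the file does not start with the bytes FF FE.
Private Function ReadUtf16File(ByVal path As String) As String
    Dim fileNum As Integer
    Dim sig As Integer

    fileNum = FreeFile
    Open path For Binary Access Read As #fileNum
    If LOF(fileNum) > 2 Then
        Get #fileNum, , sig
        ' Bytes FF FE arrive in a little-endian Integer as &HFEFF
        If sig = &HFEFF Then
            ReadUtf16File = InputB(LOF(fileNum) - 2, #fileNum)
        End If
    End If
    Close #fileNum
End Function

' Hypothetical helper: loads every *.txt file in folderPath into texts()
' and returns the number of files found.
Public Function LoadFolder(ByVal folderPath As String, texts() As String) As Long
    Dim fileName As String
    Dim count As Long

    If Right$(folderPath, 1) <> "\" Then folderPath = folderPath & "\"

    fileName = Dir$(folderPath & "*.txt")   ' first match
    Do While Len(fileName) > 0
        ReDim Preserve texts(0 To count)
        texts(count) = ReadUtf16File(folderPath & fileName)
        count = count + 1
        fileName = Dir$()                   ' next match
    Loop
    LoadFolder = count
End Function
```

    The caller would declare a dynamic array (Dim texts() As String) and pass it in. With four categories, one call per category folder would be one possible layout, but the thread does not say how the files are organized.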

    Have a look at this rather good tutorial:
    http://www.cyberactivex.com/UnicodeT...lVb.htm#FileIO
    In the FileIO section, download modUnicodeRW.bas.
    It contains several routines to read and write Unicode, some of them using API calls.
    You can study them and choose.
    I recommend going through all the interesting parts of the tutorial, too.
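
    As for the hash table question asked earlier in the thread: once a file is in a string, one option is the Dictionary object from the Microsoft Scripting Runtime, which maps keys to values. The sketch below is only an illustration. CountWords and DemoCountWords are hypothetical names, and the separator list is an assumption that would need adapting to the language being categorized.

```vb
Option Explicit

' Hypothetical helper: adds the word counts of one text to counts,
' a Scripting.Dictionary keyed by word.
Public Sub CountWords(ByVal text As String, ByVal counts As Object)
    Dim separators As Variant
    Dim words() As String
    Dim i As Long

    ' Assumption: words are delimited by whitespace and basic punctuation.
    separators = Array(vbCr, vbLf, vbTab, ".", ",", ";", ":", "!", "?", "(", ")", """")
    For i = LBound(separators) To UBound(separators)
        text = Replace(text, separators(i), " ")
    Next i

    words = Split(text, " ")
    For i = LBound(words) To UBound(words)
        If Len(words(i)) > 0 Then
            ' Reading a missing key creates it as Empty, which counts as 0 here.
            counts.Item(words(i)) = counts.Item(words(i)) + 1
        End If
    Next i
End Sub

' Example use: one dictionary per category, filled file by file.
Public Sub DemoCountWords()
    Dim counts As Object
    Dim keys As Variant
    Dim i As Long

    Set counts = CreateObject("Scripting.Dictionary")
    counts.CompareMode = vbTextCompare   ' treat "The" and "the" as one word

    CountWords "the cat and the hat", counts

    keys = counts.Keys
    For i = 0 To counts.Count - 1
        Debug.Print keys(i), counts.Item(keys(i))
    Next i
End Sub
```

    Calling CountWords once per file with the same dictionary accumulates the totals for a category. The keys and counts can then be written to a database table or a text file.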

  9. #9
    Join Date
    Oct 2011
    Location
    Malaysia , Selangor
    Posts
    89

    Re: Large text document processing in vb6

    Thank you very much
