Thread: Reading text files quickly?

  1. #16
    Join Date
    Jul 2000
    Location
    Milano, Italy
    Posts
    7,726

    If instead you have big files...

    If instead you have big files (e.g. larger than 10 MB)
    and you want to search for something inside them (that is: open, read, search), or even get the whole content (for
    that you will have to modify the following code), the best way is not to
    read everything in a single step, but to read in chunks.
    I.e. (attachment: whole code and testing app):
    Code:
    Private Function ScanChunked( _
              ByVal FileName As String _
            , ByVal SearchString As String _
            , ByVal MatchCase As Integer _
            ) As Boolean
        Dim intFree As Integer
        intFree = FreeFile 'a free file number
        
        Dim bytePos As Long 'current read position in the file (1-based)
        Dim sContent As String 'chunk just read from the file
        
        Dim Remind As Long 'remainder: bytes left after the last full chunk
        Dim sPrevBytes As String 'tail of the previous chunk, prepended to the next one so that
                                 'a match split across a chunk boundary is still found
        
        
        
        'this may take a while,
        'so report progress to the user
        Dim lProgress As Long
        Dim lFileSize As Long
        Dim lIter As Long
        'ensure that the file exists;
        'Dir$ without vbSystem also returns "" for system files, so those are skipped too
        If Len(Dir$(FileName, vbArchive Or vbHidden Or vbNormal Or vbReadOnly)) = 0 Then
            RaiseEvent FileError("File " & FileName & " skipped: not found or a system file!")
            
            Exit Function
            'Err.Raise 53  ' File not found
        End If
       
        Open FileName For Binary Access Read As #intFree
        
        lFileSize = LOF(intFree)
        RaiseEvent FileSearched(FileName & " size: " & Format$(CLng(lFileSize / 1024), "Standard") & " Kb.")
        'mFileInfo.Text = fileName & " size: " & Format$(CLng(lFileSize / 1024), "Standard") & " Kb."
        
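        Remind = lFileSize Mod MaxLen
        lIter = lFileSize \ MaxLen 'number of full chunks (integer division; CLng(x / y) would round)
        If lIter > 0 Then
            'scan the full chunks only; the tail is read once, in the Remind block below
            For bytePos = 1 To lFileSize - Remind Step MaxLen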
                
                DoEvents 'lets user stop process if needed
                If mbStop Then
                    Close #intFree
                    Exit Function
                End If
                
                'read data
                sContent = ReadData(intFree, bytePos, MaxLen)
                
                'check for match, including previous chars (if any) enough
                'to make the searched string
                If CheckInstr(sPrevBytes & sContent, SearchString, MatchCase, FileName) Then
                    ScanChunked = True
                    Close #intFree
                    RaiseEvent FileJobPercent(100) 'ended with this one
                    Exit Function
                End If
                
                'keep the tail of this chunk so that, joined with the next one,
                'a match spanning the chunk boundary can still be found
                sPrevBytes = getPrevString(sContent, SearchString)
                RaiseEvent FileJobPercent(CInt(bytePos / lFileSize * 100))
            Next bytePos
        End If
        If Remind > 0 Then
            'get the last bytes
            For bytePos = lFileSize - Remind + 1 To lFileSize Step Remind
                'should be one single step...
                If mbStop Then
                    Close #intFree
                    Exit Function
                End If
                
                'read the remainder
                sContent = ReadData(intFree, bytePos, Remind)
                
                If CheckInstr(sPrevBytes & sContent, SearchString, MatchCase, FileName) Then
                    ScanChunked = True
                    Close #intFree
                    RaiseEvent FileJobPercent(100)
                    Exit Function
                End If
                sPrevBytes = getPrevString(sContent, SearchString)
                RaiseEvent FileJobPercent(100)   'whole file has been read
            Next
        End If
        Close #intFree
    End Function
    
    Private Function getPrevString( _
              ByVal sContent As String _
            , ByVal SearchString As String _
            ) As String
        
        If Len(sContent) > Len(SearchString) Then
            getPrevString = Right$(sContent, Len(SearchString))
        ElseIf Len(sContent) > 0 Then
            getPrevString = sContent
        End If
    
    End Function
    
    Private Function CheckInstr(ByVal sContent As String, ByVal SearchString As String, ByVal MatchCase As Integer, ByVal sFilename As String) As Boolean
        If InStr(1, sContent, SearchString, MatchCase) Then
            RaiseEvent FileFound(sFilename)
            CheckInstr = True
        End If
    End Function
    
    Function ReadData(intFree As Integer, bytePos As Long, byteLength As Long) As String
        Dim sBuffer As String
        sBuffer = Space(byteLength)
        Get #intFree, bytePos, sBuffer
        ReadData = sBuffer
    End Function
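
    The code above relies on a few module-level names that live in the attachment and are not shown in the post (MaxLen, mbStop and the four events). What follows is only a sketch of what those declarations might look like, assuming the routines sit in a class module (RaiseEvent needs a class, form or user control). The 64 KB chunk size, the event parameter names, and the FileContains / StopScan wrappers are my assumptions, not the attached code.

    ```vb
    Option Explicit

    'Assumed chunk size (64 KB); the real value is in the attachment.
    Private Const MaxLen As Long = 65536

    'Set to True to make a running ScanChunked close the file and exit.
    Private mbStop As Boolean

    'Event signatures inferred from the RaiseEvent calls in the post.
    Public Event FileError(ByVal Message As String)
    Public Event FileSearched(ByVal Info As String)
    Public Event FileFound(ByVal FileName As String)
    Public Event FileJobPercent(ByVal Percent As Integer)

    'Hypothetical public wrapper around the private ScanChunked.
    'vbTextCompare makes InStr ignore case; use vbBinaryCompare to match case.
    Public Function FileContains(ByVal FileName As String, _
                                 ByVal SearchString As String) As Boolean
        mbStop = False
        FileContains = ScanChunked(FileName, SearchString, vbTextCompare)
    End Function

    'Hypothetical helper: call it (e.g. from a Cancel button) to stop the scan.
    Public Sub StopScan()
        mbStop = True
    End Sub
    ```

    A form would declare the class WithEvents, call FileContains, and update a progress bar in the FileJobPercent handler. A bigger MaxLen means fewer Get calls but more memory per chunk, so it is worth trying a few values on your own files.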
    Last edited by Cimperiali; February 21st, 2005 at 02:56 PM.
    ...at present time, using mainly Net 4.0, Vs 2010



    Special thanks to Lothar "the Great" Haensler, Chris Eastwood , dr_Michael, ClearCode, Iouri and
    all the other wonderful people who made and make Codeguru a great place.
    Come back soon, you Gurus.

  2. #17
    Join Date
    Jul 2000
    Location
    Milano, Italy
    Posts
    7,726
    Originally posted by DinoVaught
    I have to disagree when you say my tests are meaningless.
    I agree with you, DinoVaught, but for a consistent test, 4
    iterations are not enough.
    If you look at how Balena runs this kind of test, you will see
    that he puts the code to benchmark inside a loop and repeats it around
    10,000 times, more than once, alternating the order (that is:
    first time method A and then method B, second time method B
    and then method A, and so on many, many times, with a
    reboot of the machine in the middle and all other unnecessary
    processes stopped).
    That is because the CPU is never available to your software in exactly the
    same way twice, so one method may get more CPU time than the other while it is being tested.
    In any case, binary access is really fast, and Balena's second solution
    (to tell the truth, it was a suggestion from someone else that Balena
    reported) performed great.
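
    To make the idea concrete, here is a rough sketch of such a timing harness. It is not Balena's code: MethodA and MethodB are empty placeholders for the two routines you want to compare, and the iteration and pass counts are just example values. GetTickCount is the Win32 API from kernel32; its resolution is typically only 10 to 16 ms, which is one more reason to time many iterations rather than a single call.

    ```vb
    Option Explicit

    Private Declare Function GetTickCount Lib "kernel32" () As Long

    Private Const ITERATIONS As Long = 10000 'calls per timed run (example value)
    Private Const PASSES As Long = 10        'alternated runs (example value)

    Public Sub CompareMethods()
        Dim totalA As Long, totalB As Long
        Dim pass As Long

        For pass = 1 To PASSES
            'alternate the order so neither method always runs first
            If pass Mod 2 = 1 Then
                totalA = totalA + TimeMethod("A")
                totalB = totalB + TimeMethod("B")
            Else
                totalB = totalB + TimeMethod("B")
                totalA = totalA + TimeMethod("A")
            End If
        Next pass

        Debug.Print "Method A: " & totalA & " ms"
        Debug.Print "Method B: " & totalB & " ms"
    End Sub

    'Runs one method ITERATIONS times and returns the elapsed milliseconds.
    Private Function TimeMethod(ByVal which As String) As Long
        Dim i As Long, t0 As Long

        t0 = GetTickCount
        For i = 1 To ITERATIONS
            If which = "A" Then
                MethodA
            Else
                MethodB
            End If
        Next i
        TimeMethod = GetTickCount - t0
    End Function

    Private Sub MethodA()
        'placeholder: first file-reading routine goes here
    End Sub

    Private Sub MethodB()
        'placeholder: second file-reading routine goes here
    End Sub
    ```

    Keep in mind that after the first read the file will probably come from the operating system cache, so this compares the reading code more than the disk.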

