  1. #1
    Join Date
    Feb 2001
    Posts
    342

    string concatenation & speed question

    I have an app which generates somewhat lengthy HTML files (largest might be around 600K in normal use).

    I emit the HTML tags in small pieces as the app calculates what the page should look like, then I write the tags to the file (a CFile object).

    This naturally leads to a great many file writes, which are somewhat time-consuming. Profiling the app shows that this is definitely an opportunity for improvement.

    So, I thought, simple enough: I'll just emit the HTML code to an internal string, and write the entire thing with one file write.

    Well, using a CString as the buffer made things horribly worse. I suspect that CString concatenation is very slow.

    Any suggestions? What would be faster than using a CString?

  2. #2
    Join Date
    Aug 2002
    Location
    Madrid
    Posts
    4,588
    Yes, string concatenation is generally slow, since each append can mean allocating a new buffer and copying the whole existing string into it with the extra bit appended.

    The fastest way to do this is to use buffered I/O. How do you write your file currently? Depending on how bad the speed problem is, you may want to consider implementing your own buffering, but that could be overkill.
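
A minimal sketch of the "own buffering" idea, written against plain C stdio rather than MFC so it can be compiled anywhere. The class name `BufferedEmitter`, its 64 KB default capacity, and the `flushCount()` accessor are all assumptions made for illustration; nothing here comes from the original poster's code.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: collects small writes in memory and hands them
// to the file in large chunks instead of one write per fragment.
class BufferedEmitter {
public:
    explicit BufferedEmitter(std::FILE* file, std::size_t capacity = 64 * 1024)
        : file_(file), capacity_(capacity) {
        assert(file_ != nullptr);
        buffer_.reserve(capacity_);  // allocate the buffer once, up front
    }
    ~BufferedEmitter() { Flush(); }  // write whatever is still pending

    BufferedEmitter(const BufferedEmitter&) = delete;
    BufferedEmitter& operator=(const BufferedEmitter&) = delete;

    void Emit(char c) {
        buffer_.push_back(c);
        if (buffer_.size() >= capacity_) Flush();
    }

    void Emit(const std::string& text) {
        buffer_ += text;
        if (buffer_.size() >= capacity_) Flush();
    }

    // Writes the pending bytes with a single fwrite and empties the buffer.
    void Flush() {
        if (!buffer_.empty()) {
            std::fwrite(buffer_.data(), 1, buffer_.size(), file_);
            buffer_.clear();
            ++flushCount_;
        }
    }

    int flushCount() const { return flushCount_; }

private:
    std::FILE* file_;
    std::size_t capacity_;
    std::string buffer_;
    int flushCount_ = 0;
};
```

The same shape should carry over to a CFile target by replacing the `fwrite` call with the file object's own write call; that part is left out here because it cannot be tested without MFC.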

  3. #3
    Join Date
    Apr 1999
    Posts
    27,449
    I don't know if CString has a "reserve" function where you can set up the memory up front, but std::string has such a function.

    You state how much memory to reserve up front so that concatenation does not incur the overhead of repeated reallocation. The trick is to determine what's a good number.
    Code:
    #include <string>
    #include <ctime>
    #include <iostream>
    
    using namespace std;
    
    void slow()
    {
       std::string s;
       for ( int i = 0; i < 100000; ++i )
            s += "0123456789";
    }
    
    void fast()
    {
       std::string s;
       s.reserve(1000000);  // Use reserve
       for ( int i = 0; i < 100000; ++i )
            s += "0123456789";
    }
    
    int main()
    {
        clock_t start = clock();
        slow();
        clock_t stop = clock();
        cout << "slow took " << stop - start << " ticks" << endl;
    
        start = clock();
        fast();
        stop = clock();
    
        cout << "fast took " << stop - start << " ticks" << endl;
    }
    
    Output:
    
    slow took 58156 ticks
    fast took 0 ticks
    Note the dramatic difference.
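
Tick counts like these vary a lot between compilers and library versions, so it can help to look at the cause directly. The sketch below counts how often the string's capacity changes during the same loop. `CountReallocations` and its parameters are hypothetical names introduced for this illustration. Without `reserve` the count depends on the library's growth policy, and libraries that grow the buffer geometrically may show a much smaller gap than the timings quoted above.

```cpp
#include <cstddef>
#include <string>

// Appends `pieces` copies of a 10-character chunk and counts how many times
// the string's capacity changed, i.e. how many reallocations took place.
// A `reserveBytes` of 0 means "do not call reserve".
int CountReallocations(int pieces, std::size_t reserveBytes) {
    std::string s;
    if (reserveBytes > 0) {
        s.reserve(reserveBytes);
    }
    int reallocations = 0;
    std::size_t lastCapacity = s.capacity();
    for (int i = 0; i < pieces; ++i) {
        s += "0123456789";
        if (s.capacity() != lastCapacity) {
            ++reallocations;
            lastCapacity = s.capacity();
        }
    }
    return reallocations;
}
```

Calling `CountReallocations(100000, 1000000)` mirrors `fast()` and `CountReallocations(100000, 0)` mirrors `slow()`.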

    Regards,

    Paul McKenzie

  4. #4
    Join Date
    Feb 2001
    Posts
    342

    Here's a code chunk:

    I have several overloaded methods to emit the HTML code:

    Code:
    try	{
    	tgtfile = new CFile(strFileName, CFile::modeCreate | CFile::modeWrite );
    }
    		
    catch(CFileException* e)	// MFC throws its exceptions as pointers
    {
    	MessageBox("HTML file could not be opened","SDX",MB_ICONERROR);
    	e->Delete();	// release the exception object
    }
    
    ...
    
    void CHotViewView::Emit(char c)
    {
    	tgtfile->Write(&c,1);
    }
    
    void CHotViewView::Emit(CString strString)
    {
    	tgtfile->Write(strString,strString.GetLength());
    }
    
    etc.

  5. #5
    Join Date
    Apr 1999
    Posts
    27,449
    Well, first off, you are making the compiler do unnecessary copying when you pass the CString by value in the Emit function:
    Code:
    // the usage of a const reference
    void CHotViewView::Emit(const CString& strString)
    {
        //...
    }
    If your CString has many characters, passing it by value makes the compiler create a copy on every call, thereby slowing down the function. Passing a const reference does not incur the copy.
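
To make the difference visible, here is a small sketch with a stand-in type that counts its own copies. `CountingString`, `EmitByValue` and `EmitByConstRef` are hypothetical names used only for this illustration, and the type is assumed to copy its data on every copy construction; it is not a model of how CString is implemented.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a string class: every copy construction
// duplicates the text and bumps a counter.
struct CountingString {
    std::string text;
    static int copies;

    explicit CountingString(const char* s) : text(s) {}
    CountingString(const CountingString& other) : text(other.text) { ++copies; }
};
int CountingString::copies = 0;

// By value: the argument is copy-constructed on every call.
std::size_t EmitByValue(CountingString s) { return s.text.size(); }

// By const reference: the caller's object is used directly, no copy.
std::size_t EmitByConstRef(const CountingString& s) { return s.text.size(); }
```

Calling `EmitByValue` with an existing object raises `CountingString::copies` by one, while `EmitByConstRef` leaves it unchanged.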

    Regards,

    Paul McKenzie

  6. #6
    Join Date
    Aug 2002
    Location
    Madrid
    Posts
    4,588
    Another thing is that it clearly says in MSDN that CFile is unbuffered; this is what makes it so slow. As a start, just try replacing CFile with CStdioFile and check the speed difference.
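
CStdioFile sits on top of a C run-time stream, so the effect can be approximated without MFC using plain stdio. The sketch below is only a stand-in under that assumption: `WriteFragments` is a hypothetical function that writes many small fragments through a stream whose buffering is set with `setvbuf`, with a buffer size of 0 meaning unbuffered. Timing the two modes is left to the reader since the numbers depend on the machine.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Writes `pieces` small fragments to `path` through a stdio stream.
// bufferBytes == 0 selects unbuffered mode; any other value requests a
// fully buffered stream of that size.
// Returns the number of bytes written, or -1 on failure.
long WriteFragments(const char* path, int pieces, std::size_t bufferBytes) {
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return -1;

    // setvbuf has to be called before any other operation on the stream.
    if (bufferBytes == 0) {
        std::setvbuf(f, nullptr, _IONBF, 0);
    } else {
        std::setvbuf(f, nullptr, _IOFBF, bufferBytes);
    }

    const std::string cell = "<td>x</td>";
    long written = 0;
    for (int i = 0; i < pieces; ++i) {
        written += static_cast<long>(std::fwrite(cell.data(), 1, cell.size(), f));
    }

    if (std::fclose(f) != 0) return -1;  // fclose flushes the buffer
    return written;
}
```

Both modes produce the same file; the difference is how many times data is handed to the operating system along the way.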

  7. #7
    Join Date
    Nov 2002
    Location
    California
    Posts
    4,556
    Originally posted by Yves M
    Another thing is that it clearly says in MSDN that CFile is unbuffered; this is what makes it so slow. As a start, just try replacing CFile with CStdioFile and check the speed difference.
    CStdioFile is remarkably faster than CFile in these circumstances. I've done the same kind of HTML output as the OP, for files of up to and over 1 meg. The files contained tables that were output element-by-element (i.e., one <TD> element at a time). Using CFile was terrible. With CStdioFile there's a noticeable delay, but it's small enough (one to two seconds for the largest files) that a CWaitCursor is enough to keep the user satisfied.

    -Mike
