  1. #1
    Join Date
    Nov 2011
    Posts
    189

    The string thread

    Strings can always be a little confusing, so I made this thread.
    As for the first question: let's suppose we have Java code written in a .txt file. We check whether a line starts with "//" and, if it does, we want to add the character "¥" at the end of that line. I have no problem determining whether a line starts with "//", but I don't know how to append "¥" at the end of that particular line.
    Any help is much appreciated

  2. #2
    Join Date
    May 2006
    Location
    UK
    Posts
    4,473

    Re: The string thread

    Code:
    myString = myString + "¥";
    Beware: if you're appending lots of things to the same line don't use this technique, always use a StringBuilder (or StringBuffer if you need synchronization). To understand why read this short tutorial I wrote.
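    As a minimal sketch of both points, assuming the lines are already held in memory: the single append uses plain concatenation, while joining many lines into one result goes through a StringBuilder. The class and method names (CommentMarker, markIfComment, markAll) are made up for illustration, and "\u00A5" is simply the escaped form of "¥".

```java
public class CommentMarker {

    // Hypothetical helper: appends the yen sign to a line that starts with "//".
    // A single concatenation like this is fine with the + operator.
    static String markIfComment(String line) {
        return line.startsWith("//") ? line + "\u00A5" : line;
    }

    // Hypothetical helper: builds one big string out of many lines.
    // Repeated appends to the same growing text go through a StringBuilder.
    static String markAll(String[] lines) {
        StringBuilder out = new StringBuilder();
        for (String line : lines) {
            out.append(markIfComment(line)).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] lines = { "// a comment", "int x = 1;", "  // indented comment" };
        System.out.print(markAll(lines));
    }
}
```

    Note that startsWith("//") is a literal test, so the indented comment in the sample is left unchanged; trimming the line before the test would be one way to catch it, if that is what is wanted.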
    Posting code? Use code tags like this: [code]...Your code here...[/code]
    Click here for examples of Java Code

  3. #3
    Join Date
    Jan 2009
    Posts
    596

    Re: The string thread

    Quote Originally Posted by keang
    Code:
    myString = myString + "¥";
    Beware: if you're appending lots of things to the same line don't use this technique, always use a StringBuilder (or StringBuffer if you need synchronization). To understand why read this short tutorial I wrote.
    I've just looked at the link to your tutorial, and spotted a problem which is causing you to overestimate the time difference between using direct concatenation and using StringBuilder. That is, the loop using '+' is also writing the loop variable to System.out, whereas the StringBuilder loop isn't.

    This will account for some (most!) of the 74,000 times difference in speed you are seeing!

  4. #4
    Join Date
    May 2009
    Posts
    2,413

    Re: The string thread

    Quote Originally Posted by Peter_B
    This will account for some (most!) of the 74,000 times difference in speed you are seeing!
    That's right.

    Since Java compilers were allowed to use StringBuffer/Builder to optimize String expressions, String code is as fast as handcrafted StringBuffer/Builder code. To prefer StringBuffer/Builder over String for performance reasons was relevant like maybe 7 years ago.

    Today there's little or no penalty for using String even where it's not the most appropriate choice. The Java language has been changed so as not to penalize people for using String where a mutable alternative such as StringBuffer/Builder would be better.

    Unfortunately, optimization "advice" (à la keang's tutorial) takes on a life of its own, lingering on long after it was relevant (if it ever was). Another example: even today, in new Java code, you can see reference variables being "nulled" just before they go out of scope in order to "help" the garbage collector reclaim objects.

    So Donald Knuth's famous advice is as relevant as ever: don't optimize prematurely!
    Last edited by nuzzle; February 28th, 2012 at 03:47 AM.

  5. #5
    Join Date
    May 2009
    Posts
    2,413

    Re: The string thread

    Quote Originally Posted by cens
    Any help is much appreciated.
    You must create a new text file. It's not possible to append text to an existing file unless it's at the very end.

    So you need to copy the existing file line by line to a new file appending "¥" in the process.
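    A minimal sketch of that copy, assuming Java 7 or later and a UTF-8 encoded file. The class name CommentLineCopier, its method names and the two command-line arguments are assumptions for illustration, not something from the original question.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class CommentLineCopier {

    // The character to append, written as an escape so the source file's
    // own encoding does not matter ("\u00A5" is the yen sign).
    static final String MARKER = "\u00A5";

    // Returns the line unchanged unless it starts with "//".
    static String transform(String line) {
        return line.startsWith("//") ? line + MARKER : line;
    }

    // Reads the source line by line and writes each (possibly marked)
    // line to a new target file. The source file itself is not modified.
    static void copyWithMarkers(Path source, Path target) throws IOException {
        try (BufferedReader in = Files.newBufferedReader(source, StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.write(transform(line));
                out.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java CommentLineCopier <source> <target>");
            return;
        }
        copyWithMarkers(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

    Once the copy has succeeded, the new file can replace the old one if the original name has to be kept.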

  6. #6
    Join Date
    May 2009
    Posts
    2,413

    Re: The string thread

    Quote Originally Posted by keang
    To understand why read this short tutorial I wrote.
    In this tutorial you're using compiler output to support arguments about best coding practices.

    This is always wrong because there's no THE Java compiler. There are many different compilers from different suppliers and they don't generate identical code. And even the same compiler may generate different code over time because it's being updated and improved.

    To support arguments of this kind the only relevant source is the Java Language Specification (not compiler output nor standard library source code).

  7. #7
    Join Date
    May 2006
    Location
    UK
    Posts
    4,473

    Re: The string thread

    Quote Originally Posted by Peter_B
    I've just looked at the link to your tutorial, and spotted a problem which is causing you to overestimate the time difference between using direct concatenation and using StringBuilder. That is, the loop using '+' is also writing the loop variable to System.out, whereas the StringBuilder loop isn't.
    Yes you are right, well spotted, the posted code is wrong - thanks for pointing that out. However, I believe the stats were run on the correct code as I've just re-run the code (without the system out) and the results are:
    Code:
    Concatenate strings - time = 427704ms, string length = 500000
    Using StringBuilder - time = 15ms, string length = 500000
    This was run on a different machine and a different version of Java (6 u 23) but still shows a difference of several orders of magnitude in performance.

    Having re-read the article it's wrong on a number of counts:

    • It doesn't clearly point out that such a huge difference in performance only applies when appending multiple times to large strings. Multiple concatenations to small strings, whilst still slower, are nowhere near as inefficient (maybe 5-10 times slower).
    • As nuzzle has pointed out it relates to a single compiler. At the time I wrote the article this output was typical of the compilers I tried and had been for a number of years but it should have made this clear and that things may/will change.
    • When this article was written this was considered "best practice" but now I would agree that this no longer holds true. The performance inefficiencies for most uses (which would be relatively small strings and relatively low numbers of concatenations) are going to be insignificant on modern computers. However, if there is a performance issue when concatenating strings, there is a good chance improvements can be made by using StringBuilder/StringBuffer.

    Quote Originally Posted by nuzzle
    Since Java compilers were allowed to use StringBuffer/Builder to optimize String expressions, String code is as fast as handcrafted StringBuffer/Builder code. To prefer StringBuffer/Builder over String for performance reasons was relevant like maybe 7 years ago.
    Do you have Java 7 or any other vendor's compiler installed, so that you can run the test code (without that spurious System.out call in the first loop) on your machine and post the results? It'll be interesting to see how different compilers handle string concatenation.
    BTW I'm currently using the Java 6 update 23 compiler from Oracle, which clearly still isn't optimizing the loop and isn't 7 years out of date.

    Code:
    static void test()
         {
         String text = "";
    
         // warm up system
         for ( int i = 0; i < 10000; ++i )
             text = text + "*";
    
         System.out.println("Running concatenation time test...");
    
         text = "";
         long t = System.currentTimeMillis();
    
         for ( int i = 0; i < 500000; ++i )
             text = text + "*";
    
         System.out.println("Concatenate strings - time = "+(System.currentTimeMillis()-t)+"ms, string length = "+text.length());
    
         text = "";
         t = System.currentTimeMillis();
    
         StringBuilder sb = new StringBuilder(text);
    
         // warm up system
         for ( int i = 0; i < 10000; ++i )
             sb.append("*");
    
         sb = new StringBuilder(text);
    
         for ( int i = 0; i < 500000; ++i )
             sb.append("*");
    
         text = sb.toString();
    
         System.out.println("Using StringBuilder - time = "+(System.currentTimeMillis()-t)+"ms, string length = "+text.length());
         }
    Interestingly, the Java Tutorials (and before you say it, yes, I know they are not the bible and only relate to Oracle's compiler which, to be fair, most of the people using this site probably use) still say "Strings should always be used unless string builders offer an advantage in terms of simpler code (see the sample program at the end of this section) or better performance. For example, if you need to concatenate a large number of strings, appending to a StringBuilder object is more efficient."

    Quote Originally Posted by nuzzle
    In this tutorial you're using compiler output to support arguments about best coding practices.
    Yes that is wrong, my intention was to use the compiler output to explain why there is such a discrepancy in the times for each loop to run. The conclusion is/was wrong as it is clearly compiler dependent albeit at the time I wrote the article it did apply to all the compilers I tried.

    Quote Originally Posted by nuzzle
    Unfortunately, optimization "advice" (à la keang's tutorial) takes on a life of its own, lingering on long after it was relevant (if it ever was)
    Lots of advice on many different subjects has a limited shelf life but that doesn't mean you should not give it at the time it is relevant.

  8. #8
    Join Date
    Jan 2009
    Posts
    596

    Re: The string thread

    I've just modified keang's code to do the comparison for different numbers of iterations:
    Code:
    class StringSpeedTest {
    
        public static void main(String[] args) {
    
            // warm up system
            String text = "";
            for ( int i = 0; i < 10000; ++i )
                text = text + "*";
    
            System.out.format("%15s%15s%15s%n", "Iterations", "Concatenation", "Builder");
    
            for (int i=1; i < 2000000; i=i*2) {
                long concatTime = Concatenation(i);
                long builderTime = Builder(i);
                System.out.format("%15d%15d%15d%n", i, concatTime, builderTime);
            }
        }
    
        public static long Concatenation(int iterations) {
            String text = "";
    
            long t = System.currentTimeMillis();
            for ( int i = 0; i < iterations; ++i)
                text = text + "*";
    
            return System.currentTimeMillis()-t;
        }
    
        public static long Builder(int iterations) {
            String text = "";
    
            long t = System.currentTimeMillis();
            StringBuilder sb = new StringBuilder(text);
            for ( int i = 0; i < iterations; ++i)
                sb.append("*");
            text = sb.toString();
    
            return System.currentTimeMillis()-t;
        }
    }

    The results are as follows:
    Code:
         Iterations  Concatenation        Builder
                  1              0              0
                  2              0              0
                  4              0              0
                  8              0              0
                 16              0              0
                 32              0              0
                 64              0              0
                128              0              0
                256              2              0
                512              3              0
               1024              9              0
               2048             35              0
               4096            142              0
               8192            590              1
              16384           2497              2
              32768          16882              4
              65536         171438              5
             131072        1391516             10
             262144              #             40
             524288              #             98
            1048576              #            142
    # = skipped these as would take too long!

    The zeros are too quick to time.

    You can see that for string lengths of ~100 characters, either approach will work well. But it is amazing how much quicker the StringBuilder approach gets once you are using strings of ~10000 characters.

    I used this cool online curve-fitting service (http://zunzun.com/) to fit a power curve to the concatenation results (first graph below), which gives time ~ iterations**3

    Whereas the StringBuilder results (second graph) look more like time directly proportional to iterations.
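
    The power-law reading can be cross-checked without a curve-fitting tool. If time grows like iterations**k, then doubling the iterations multiplies the time by 2**k, so k is roughly log2 of the ratio between consecutive rows. The sketch below applies that to the concatenation column from 4096 iterations upwards (the smaller rows are too close to the timer resolution to be useful). The class and method names are made up for illustration.

```java
public class GrowthExponent {

    // Estimates k in time ~ n**k from two timings taken at n and 2n:
    // t(2n) / t(n) = 2**k, so k = log2(t(2n) / t(n)).
    static double localExponent(long timeAtN, long timeAt2N) {
        return Math.log((double) timeAt2N / timeAtN) / Math.log(2.0);
    }

    public static void main(String[] args) {
        // Concatenation timings (ms) copied from the results table.
        int[] iterations = { 4096, 8192, 16384, 32768, 65536, 131072 };
        long[] millis = { 142, 590, 2497, 16882, 171438, 1391516 };

        for (int i = 0; i + 1 < millis.length; i++) {
            double k = localExponent(millis[i], millis[i + 1]);
            System.out.format("%7d -> %7d : exponent ~ %.2f%n",
                    iterations[i], iterations[i + 1], k);
        }
    }
}
```

    On these figures the estimate starts at about 2 for the smaller sizes and climbs to about 3 for the largest ones. Copying alone would predict an exponent of 2, since each concatenation copies the whole existing string and 1 + 2 + ... + n = n(n+1)/2; the table by itself does not say what pushes the larger runs beyond that.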

    P.S. The version of java for this was:
    java version "1.5.0_10"
    Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_10-b03)
    Java HotSpot(TM) Client VM (build 1.5.0_10-b03, mixed mode, sharing)

    which is that on the Knoppix 5.1 Live Linux CD (yes, I know it's old! But I'm using it until I get around to reinstalling Gentoo)
    [Attached images: power-curve fit of the concatenation times; linear fit of the StringBuilder times]
    Last edited by Peter_B; February 28th, 2012 at 11:52 AM. Reason: Added java version info

  9. #9
    Join Date
    May 2006
    Location
    UK
    Posts
    4,473

    Re: The string thread

    Thanks for that Peter_B, very interesting. Hopefully people post results from different compilers so we can see how they compare.

  10. #10
    Join Date
    Nov 2011
    Posts
    189

    Re: The string thread

    Emm... Anyway, I took the read-from-file-string approach and it does what I want.

  11. #11
    Join Date
    May 2006
    Location
    UK
    Posts
    4,473

    Re: The string thread

    Sorry Cens, we appear to have hijacked your thread. Glad to hear you've got it working.

  12. #12
    Join Date
    Nov 2011
    Posts
    189

    Re: The string thread

    No prob, the thread was supposed to be a general-StringProblem-based place anyways

  13. #13
    Join Date
    May 2006
    Location
    UK
    Posts
    4,473

    Re: The string thread

    Ok, I've done some further testing:

    Compiling and running under Java 7 update 3 is marginally quicker than Java 6 update 23 but still displays the same non-linear time issues when concatenating strings. This slight speed difference is down to the runtime and not the compiler - the generated code is identical.

    Compiling and running under Netbeans 6.7.1 and IntelliJ IDEA community edition, which both use the Oracle JDK compiler, is unsurprisingly virtually identical to manually compiling using the Oracle compiler.

    Compiling and running under Eclipse Helios, which I believe uses a compiler based on the VisualAge compiler, is approximately twice as quick at concatenating strings as Java 6 update 23 but still displays the same non-linear time issues when concatenating strings.

    The increase in speed is because the Eclipse compiler generates code that creates a new StringBuilder passing in the current string to the constructor and then appends the string whereas the JDK compiler creates an empty StringBuilder and appends both strings.
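
    Written out by hand, the two translations described above look roughly like this. It is a sketch of the description only, not actual compiler output: the method names are made up, and a real compiler also has to cope with null operands, which this ignores.

```java
public class ConcatDesugaring {

    // Roughly what the JDK compiler is described as generating for
    // text = text + "*": an empty builder, then both strings appended.
    static String jdkStyle(String text) {
        return new StringBuilder().append(text).append("*").toString();
    }

    // Roughly what the Eclipse compiler is described as generating:
    // the current string goes into the constructor, then one append.
    static String eclipseStyle(String text) {
        return new StringBuilder(text).append("*").toString();
    }

    public static void main(String[] args) {
        String a = "";
        String b = "";
        for (int i = 0; i < 1000; i++) {
            a = jdkStyle(a);       // new builder and full copy on every pass
            b = eclipseStyle(b);   // likewise, so the loop stays non-linear
        }
        System.out.println(a.equals(b) + " " + a.length());
    }
}
```

    Either way a fresh StringBuilder is created and the whole existing string is copied on every pass through the loop, which fits the observation that both compilers show the same non-linear behaviour.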

    Does anybody else have any other Java compilers?
