-
February 27th, 2012, 04:00 PM
#1
The string thread
Strings can always be a little confusing, so I made this thread:
As for the 1st question: let's suppose we have Java code written in a txt file. We determine whether a line starts with "//", and if it does we want to add the character "¥" at the end of that line. I have no problem determining whether a line starts with "//", but I don't know how to append "¥" to the end of the RESPECTIVE line.
Any help is much appreciated
-
February 27th, 2012, 05:57 PM
#2
Re: The string thread
Code:
myString = myString + "¥";
Beware: if you're appending lots of things to the same line don't use this technique, always use a StringBuilder (or StringBuffer if you need synchronization). To understand why read this short tutorial I wrote.
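As a minimal sketch of the difference (the class and method names are made up for illustration and are not taken from the tutorial): a one-off append with + is fine, while many appends to the same text are usually better done through a single StringBuilder.

```java
public class AppendSketch {
    // "\u00A5" is the yen sign used in the question.
    static final String MARKER = "\u00A5";

    // One-off append: plain concatenation is perfectly adequate here.
    static String markLine(String line) {
        return line + MARKER;
    }

    // Many appends to the same text: reuse one StringBuilder instead of
    // creating a new String on every pass through the loop.
    static String repeat(String piece, int count) {
        StringBuilder sb = new StringBuilder(piece.length() * count);
        for (int i = 0; i < count; i++) {
            sb.append(piece);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(markLine("// a comment"));
        System.out.println(repeat("*", 10));
    }
}
```

For the original question, which appends one character per line, markLine is all that is needed.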
-
February 27th, 2012, 06:45 PM
#3
Re: The string thread
Originally Posted by keang
Code:
myString = myString + "¥";
Beware: if you're appending lots of things to the same line don't use this technique, always use a StringBuilder (or StringBuffer if you need synchronization). To understand why read this short tutorial I wrote.
I've just looked at the link to your tutorial, and spotted a problem which is causing you to overestimate the time difference between using direct concatenation and using StringBuilder. That is, the loop using '+' is also writing the loop variable to System.out, whereas the StringBuilder loop isn't.
This will account for some (most!) of the 74,000 times difference in speed you are seeing!
-
February 28th, 2012, 12:06 AM
#4
Re: The string thread
Originally Posted by Peter_B
This will account for some (most!) of the 74,000 times difference in speed you are seeing!
That's right.
Since Java compilers were allowed to use StringBuffer/Builder to optimize String expressions, String code is as fast as handcrafted StringBuffer/Builder code. To prefer StringBuffer/Builder over String for performance reasons was relevant like maybe 7 years ago.
Today there's little or no penalty for using String even where it's not the most appropriate choice. The Java language has been changed so as not to penalize people for using String where a mutable alternative such as StringBuffer/Builder would be better.
Unfortunately, optimization "advice" (à la keang's tutorial) takes on a life of its own, lingering on long after it was relevant (if it ever was). Another example: even today, in new Java code, you can see reference variables being "nulled" just before they go out of scope in order to "help" the garbage collector reclaim objects.
So Donald Knuth's famous advice is as relevant as ever before - don't optimize prematurely!
Last edited by nuzzle; February 28th, 2012 at 03:47 AM.
-
February 28th, 2012, 12:23 AM
#5
Re: The string thread
Originally Posted by cens
Any help is much appreciated.
You must create a new text file. It's not possible to append text to an existing file unless it's at the very end.
So you need to copy the existing file line by line to a new file appending "¥" in the process.
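A rough sketch of that copy-and-append approach, assuming Java 7 or later and UTF-8 text files. The class name, method names and the command-line arguments are hypothetical, and lines that have whitespace before the "//" are deliberately left alone because the question only asks about lines that start with it.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class CommentMarker {
    // "\u00A5" is the yen sign from the question.
    static final String MARKER = "\u00A5";

    // Returns the line unchanged unless it starts with "//".
    static String mark(String line) {
        return line.startsWith("//") ? line + MARKER : line;
    }

    // Copies source to target line by line, marking comment lines on the way.
    static void copyMarked(Path source, Path target) throws IOException {
        try (BufferedReader in = Files.newBufferedReader(source, StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.write(mark(line));
                out.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("usage: java CommentMarker <source> <target>");
            return;
        }
        copyMarked(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

Once the copy has succeeded, the new file can replace the old one if the change is meant to happen "in place".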
-
February 28th, 2012, 01:12 AM
#6
Re: The string thread
Originally Posted by keang
In this tutorial you're using compiler output to support arguments about best coding practices.
This is always wrong because there's no THE Java compiler. There are many different compilers from different suppliers and they don't generate identical code. And even the same compiler may generate different code over time because it's being updated and improved.
To support arguments of this kind the only relevant source is the Java Language Specification (not compiler output nor standard library source code).
-
February 28th, 2012, 07:39 AM
#7
Re: The string thread
Originally Posted by Peter_B
I've just looked at the link to your tutorial, and spotted a problem which is causing you to overestimate the time difference between using direct concatenation and using StringBuilder. That is, the loop using '+' is also writing the loop variable to System.out, whereas the StringBuilder loop isn't.
Yes you are right, well spotted, the posted code is wrong - thanks for pointing that out. However, I believe the stats were run on the correct code as I've just re-run the code (without the system out) and the results are:
Code:
Concatenate strings - time = 427704ms, string length = 500000
Using StringBuilder - time = 15ms, string length = 500000
This has been run on a different machine and a different version of Java (6 u 23) but still shows a difference of several orders of magnitude in performance.
Having re-read the article it's wrong on a number of counts:
- It doesn't clearly point out that such a huge difference in performance only applies when appending multiple times to large strings. Multiple concatenations to small strings, whilst still slower, are nowhere near as inefficient (maybe 5-10 times slower).
- As nuzzle has pointed out it relates to a single compiler. At the time I wrote the article this output was typical of the compilers I tried and had been for a number of years but it should have made this clear and that things may/will change.
- When this article was written this was considered "best practice" but now I would agree that this no longer holds true. The performance inefficiencies for most uses (which would be relatively small strings and relatively low numbers of concatenations) are going to be insignificant on modern computers. However, if there is a performance issue when concatenating strings, there is a good chance improvements can be made by using StringBuilder/StringBuffer.
Originally Posted by nuzzle
Since Java compilers were allowed to use StringBuffer/Builder to optimize String expressions, String code is as fast as handcrafted StringBuffer/Builder code. To prefer StringBuffer/Builder over String for performance reasons was relevant like maybe 7 years ago.
Do you have Java 7 installed, or any other vendor's compiler, so you can run the test code (without that spurious system out in the first loop) on your machine and post the results? It'll be interesting to see how different compilers handle string concatenation.
BTW I'm currently using the Java 6 update 23 compiler from Oracle which clearly still isn't optimizing the loop and isn't 7 years out of date.
Code:
static void test()
{
    String text = "";
    // warm up system
    for ( int i = 0; i < 10000; ++i )
        text = text + "*";

    System.out.println("Running concatenation time test...");

    // time repeated concatenation with '+'
    text = "";
    long t = System.currentTimeMillis();
    for ( int i = 0; i < 500000; ++i )
        text = text + "*";
    System.out.println("Concatenate strings - time = "+(System.currentTimeMillis()-t)+"ms, string length = "+text.length());

    text = "";
    StringBuilder sb = new StringBuilder(text);
    // warm up system
    for ( int i = 0; i < 10000; ++i )
        sb.append("*");

    // time repeated append on a fresh StringBuilder; the timer starts
    // after the warm-up so the warm-up is not counted
    sb = new StringBuilder(text);
    t = System.currentTimeMillis();
    for ( int i = 0; i < 500000; ++i )
        sb.append("*");
    text = sb.toString();
    System.out.println("Using StringBuilder - time = "+(System.currentTimeMillis()-t)+"ms, string length = "+text.length());
}
Interestingly, the Java Tutorials (and before you say it, yes, I know they are not the bible and only relate to Oracle's compiler, which, to be fair, most of the people using this site probably use) still say: "Strings should always be used unless string builders offer an advantage in terms of simpler code (see the sample program at the end of this section) or better performance. For example, if you need to concatenate a large number of strings, appending to a StringBuilder object is more efficient."
In this tutorial you're using compiler output to support arguments about best coding practices.
Yes, that is wrong. My intention was to use the compiler output to explain why there is such a discrepancy in the times for each loop to run. The conclusion was wrong as it is clearly compiler dependent, although at the time I wrote the article it did apply to all the compilers I tried.
Originally Posted by nuzzle
Unfortunately, optimization "advice" (à la keang's tutorial) takes on a life of its own, lingering on long after it was relevant (if it ever was)
Lots of advice on many different subjects has a limited shelf life but that doesn't mean you should not give it at the time it is relevant.
-
February 28th, 2012, 11:45 AM
#8
Re: The string thread
I've just modified keang's code to do the comparison for different numbers of iterations:
Code:
class StringSpeedTest {
    public static void main(String[] args) {
        // warm up system
        String text = "";
        for ( int i = 0; i < 10000; ++i )
            text = text + "*";

        System.out.format("%15s%15s%15s%n", "Iterations", "Concatenation", "Builder");

        // double the number of iterations on each pass
        for (int i=1; i < 2000000; i=i*2) {
            long concatTime = Concatenation(i);
            long builderTime = Builder(i);
            System.out.format("%15d%15d%15d%n", i, concatTime, builderTime);
        }
    }

    // time in ms to build a string of the given length using '+'
    public static long Concatenation(int iterations) {
        String text = "";
        long t = System.currentTimeMillis();
        for ( int i = 0; i < iterations; ++i)
            text = text + "*";
        return System.currentTimeMillis()-t;
    }

    // time in ms to build the same string using StringBuilder.append
    public static long Builder(int iterations) {
        String text = "";
        long t = System.currentTimeMillis();
        StringBuilder sb = new StringBuilder(text);
        for ( int i = 0; i < iterations; ++i)
            sb.append("*");
        text = sb.toString();
        return System.currentTimeMillis()-t;
    }
}
The results are as follows:
Code:
     Iterations  Concatenation        Builder
              1              0              0
              2              0              0
              4              0              0
              8              0              0
             16              0              0
             32              0              0
             64              0              0
            128              0              0
            256              2              0
            512              3              0
           1024              9              0
           2048             35              0
           4096            142              0
           8192            590              1
          16384           2497              2
          32768          16882              4
          65536         171438              5
         131072        1391516             10
         262144              #             40
         524288              #             98
        1048576              #            142
# = skipped, as these would take too long!
Times are in milliseconds; the zeros are runs that were too quick to time.
You can see that for string lengths of ~100 characters, either approach will work well. But it is amazing how much quicker the StringBuilder approach gets once you are using strings of ~10000 characters.
I used this cool online curve-fitting service (http://zunzun.com/) to fit a power curve to the concatenation results (first graph below), which gives time ~ iterations**3
Whereas the StringBuilder results (second graph) look more like time directly proportional to iterations.
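A quick way to sanity-check a power-law fit without a curve-fitting tool: if time ~ c * n^k, then the exponent k between two measurements is log(t2/t1) / log(n2/n1). The sketch below applies that to the concatenation times in the table above. The class and method names are made up for illustration, and this is not the method the zunzun.com fit uses.

```java
public class GrowthExponent {
    // If time ~ c * n^k, then k = log(t2 / t1) / log(n2 / n1).
    static double exponent(long n1, long t1, long n2, long t2) {
        return Math.log((double) t2 / t1) / Math.log((double) n2 / n1);
    }

    public static void main(String[] args) {
        // Iteration counts and concatenation times (ms) copied from the table.
        long[] n = {4096, 8192, 16384, 32768, 65536, 131072};
        long[] t = {142, 590, 2497, 16882, 171438, 1391516};
        for (int i = 1; i < n.length; i++) {
            System.out.printf("%d -> %d: exponent ~ %.2f%n",
                    n[i - 1], n[i], exponent(n[i - 1], t[i - 1], n[i], t[i]));
        }
    }
}
```

On these figures the estimate comes out close to 2 between 4096 and 16384 iterations and around 3 for the largest runs. An exponent of 2 is what you would expect if every concatenation copies the whole string built so far (n appends then copy about n*n/2 characters in total). The higher values at the top end are presumably down to memory or garbage-collection effects, but that is a guess rather than something measured here.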
P.S. The version of java for this was:
java version "1.5.0_10"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_10-b03)
Java HotSpot(TM) Client VM (build 1.5.0_10-b03, mixed mode, sharing)
which is that on the Knoppix 5.1 Live Linux CD (yes, I know it's old! But I'm using it until I get around to reinstalling Gentoo)
Last edited by Peter_B; February 28th, 2012 at 11:52 AM.
Reason: Added java version info
-
February 28th, 2012, 01:58 PM
#9
Re: The string thread
Thanks for that Peter_B, very interesting. Hopefully people post results from different compilers so we can see how they compare.
-
February 28th, 2012, 02:01 PM
#10
Re: The string thread
Emm.. Anyways.. I took the read-from-file-string approach and it does what I want.
-
February 28th, 2012, 02:07 PM
#11
Re: The string thread
Sorry Cens, we appear to have hi-jacked your thread. Glad to hear you've got it working.
-
February 28th, 2012, 03:28 PM
#12
Re: The string thread
No prob, the thread was supposed to be a general-StringProblem-based place anyways
-
March 5th, 2012, 05:19 PM
#13
Re: The string thread
Ok, I've done some further testing:
Compiling and running under Java 7 update 3 is marginally quicker than Java 6 update 23 but still displays the same non-linear time issues when concatenating strings. This slight speed difference is down to the runtime and not the compiler - the generated code is identical.
Compiling and running under Netbeans 6.7.1 and IntelliJ IDEA community edition, which both use the Oracle JDK compiler, is unsurprisingly virtually identical to manually compiling using the Oracle compiler.
Compiling and running under Eclipse Helios, which I believe uses a compiler based on the VisualAge compiler, is approximately twice as quick at concatenating strings as Java 6 update 23 but still displays the same non-linear time issues when concatenating strings.
The increase in speed is because the Eclipse compiler generates code that creates a new StringBuilder, passing the current string to the constructor, and then appends the new string, whereas the JDK compiler creates an empty StringBuilder and appends both strings.
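Written out as source, the two forms described above would look roughly like this. This is a hand-written approximation for illustration, not decompiled output, and the class and method names are invented.

```java
public class ConcatForms {
    // Roughly the JDK compiler's form of text + "*":
    // start with an empty builder and append both operands.
    static String jdkStyle(String text) {
        return new StringBuilder().append(text).append("*").toString();
    }

    // Roughly the Eclipse compiler's form:
    // seed the builder with the current string, then append.
    static String eclipseStyle(String text) {
        return new StringBuilder(text).append("*").toString();
    }

    public static void main(String[] args) {
        String text = "abc";
        // Both forms produce the same string; only the work done differs.
        System.out.println(jdkStyle(text).equals(eclipseStyle(text))); // prints "true"
    }
}
```

According to the StringBuilder documentation, the no-argument constructor starts with a capacity of 16 characters, while the String constructor starts with 16 plus the length of the argument. So the second form already has room for the appended character, whereas the first may have to grow its buffer once the text is longer than 16 characters. Either way a new String is still created on every pass through the loop, which fits the non-linear times seen with both compilers.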
Does anybody else have any other Java compilers?