Using VS2008, .NET 3.5.
I'm working on an app that will potentially parse thousands of XML files. It is actually a rewrite of a Java application that I wrote and that has been in production for some time now. Everything I've read says that the .NET implementations of DOM and SAX/pull parsing are slower than Java's, but the performance hit that I'm seeing is ridiculous.
The results of the benchmarks I've run, comparing my previous Java implementation against two .NET implementations, are stark and disappointing. I'm hoping that I'm doing something wrong and that all the 50lb heads in this forum might be able to save the day. In a benchmark I ran today, my Java implementation finished parsing in 15 seconds, while my best .NET implementation ran for 95 seconds.
The two implementations I've tried are:
1) ...xmlDoc.GetElementsByTagName(...) and a bunch of if statements to grab the nodes I'm looking for.
2) LINQ to query the XML file(s) for the info I need to grab.
As you might assume, LINQ is the faster of the two implementations. I just started down a third route using XmlTextReader, and simply reading through each entire file, without doing anything with the data, was still slow.
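A read-only pass of that kind might look roughly like the sketch below. It is an illustration rather than the actual benchmark code: the class name ReadBenchmark, the CountElements helper, and the folder passed on the command line are all hypothetical, and counting elements is only there so the loop has something cheap to do.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Xml;

static class ReadBenchmark
{
    // Hypothetical helper: walks every node in the input and counts
    // element start tags, without keeping any of the data.
    public static int CountElements(TextReader input)
    {
        int count = 0;
        using (XmlTextReader reader = new XmlTextReader(input))
        {
            // Skip whitespace-only nodes so they are not reported.
            reader.WhitespaceHandling = WhitespaceHandling.None;
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                    count++;
            }
        }
        return count;
    }

    static void Main(string[] args)
    {
        if (args.Length == 0)
        {
            Console.WriteLine("usage: ReadBenchmark <folder-of-xml-files>");
            return;
        }

        // args[0] is an assumed folder containing the XML files to time.
        Stopwatch timer = Stopwatch.StartNew();
        int total = 0;
        foreach (string path in Directory.GetFiles(args[0], "*.xml"))
        {
            using (StreamReader file = new StreamReader(path))
                total += CountElements(file);
        }
        timer.Stop();
        Console.WriteLine("{0} elements in {1} ms", total, timer.ElapsedMilliseconds);
    }
}
```

Timing the whole folder with one Stopwatch keeps the measurement comparable to a wall-clock figure such as the 95 seconds above, since it includes file I/O as well as parsing.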
I've been searching the interwebs to no avail... Does anybody have a definitive answer on what the FASTEST XML parsing implementation for .NET is? I'm open to third-party solutions as well. I'm at my wits' end on this...