July 23rd, 2009, 11:10 AM
#1
[RESOLVED] Help parsing gigantic text files (64 MB+)
I have these giant text files I want to parse. I am looking for a FAST method and was considering using RegEx. However, let me give you some observations first.
If I load the text file into Notepad and do some simple string searches, it can take 20 or 30 seconds or more to find a hit.
If I load it in a test C# program I have, read the whole file into a single string, and run some simple RegEx expressions flagged as multi-line, it can take a long time too, although it is faster than Notepad.
At this point I would say, well, it is a giant text file, so it takes some time. Except for the next part:
There is this application someone told me to check out, TextPad. So I downloaded and installed it and loaded up my text file, which it does instantly (Notepad grinds for a while). Then I did the same string searches and they were instant. I can say, find all occurrences of X and bookmark them. BAM! It's done.
So there *is* a way, somehow. Maybe someone here knows how they do it or *THE WAY*, or maybe not.
What I would like advice on is an efficient method of parsing text files with an eye towards speed. For example, maybe the RegEx engine is just a poor choice for large files. Or perhaps the "Multi-line" option is inefficient, and I should store my file as a series of lines (as strings) rather than a single gigantic string and iterate over each line.
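To make the line-by-line idea concrete, here is a minimal sketch of what I have in mind. It assumes a plain substring search is enough (no RegEx), and the names LineSearch, FindLines and FindLinesInFile are just ones I made up for illustration. It streams the file through a StreamReader instead of loading it into one string, and records the line number of every hit, roughly like TextPad's "bookmark all" feature. I have no idea whether this is how TextPad actually does it, and I have not measured it against the RegEx version.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class LineSearch
{
    // Returns the 1-based line numbers of every line containing `needle`.
    // Works on any TextReader, so the input is never held as one giant string.
    public static List<int> FindLines(TextReader reader, string needle)
    {
        var hits = new List<int>();
        int lineNumber = 0;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            lineNumber++;
            // Ordinal comparison: a plain character-by-character match,
            // with no culture-specific comparison rules applied.
            if (line.IndexOf(needle, StringComparison.Ordinal) >= 0)
            {
                hits.Add(lineNumber);
            }
        }
        return hits;
    }

    // Convenience wrapper (hypothetical): opens the file and streams it.
    public static List<int> FindLinesInFile(string path, string needle)
    {
        using (var reader = new StreamReader(path))
        {
            return FindLines(reader, needle);
        }
    }
}
```

Calling it would look something like LineSearch.FindLinesInFile(@"C:\data\big.txt", "X"), where the path is only a placeholder. A pattern that has to match across line boundaries would not work with this approach, so it only covers the simple single-line searches I described.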
Anyway, this is kind of a green field, so I am open to suggestions.