-
February 10th, 2012, 05:22 PM
#1
Efficient Text File parser
Hi,
I was wondering if I could get some advice on the most efficient way I can parse a text file which contains the following pair:
<String LABEL:> Paragraph
For example:
PAGENUMBER: 1
TEXT: text with multiple lines
TEXT: some more text
PAGENUMBER: 2 TEXT: random te:xt here with colon just to add some spice
PAGENUMBER: 3
I was wondering what might be the best way to do this: I could tokenize on ":" but only if it matches with a specific label like TEXT or PAGENUMBER.
Thank you for your help,
Jack Higgins.
-
February 10th, 2012, 05:47 PM
#2
Re: Efficient Text File parser
"I was wondering if I could get some advice on the most efficient way I can parse a text file which contains the following pair"
That all depends what you mean by "most efficient".
"I was wondering what might be the best way to do this:"
If you know for sure that the label is at the beginning of the line and is always followed by a colon, then read the file in a line at a time. For each line, either use a regex to split the line, or call the string's indexOf(':') method to find the index of the first colon and then use the substring() method to split the string at that index.
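A minimal sketch of the indexOf()/substring() approach, assuming every label starts its line and is followed by a colon. The class and method names (LineLabelParser, splitAtFirstColon, parse) are made up for illustration, and lines without a colon are simply skipped here, which may not be what you want for multi-line paragraphs.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class LineLabelParser {
    // Splits one line at its FIRST colon into {label, value}.
    // Returns null when the line contains no colon.
    static String[] splitAtFirstColon(String line) {
        int idx = line.indexOf(':');
        if (idx < 0) {
            return null;
        }
        return new String[] { line.substring(0, idx).trim(), line.substring(idx + 1).trim() };
    }

    // Reads the input one line at a time and collects the {label, value} pairs.
    // Assumption: lines without a colon are ignored.
    static List<String[]> parse(BufferedReader reader) throws IOException {
        List<String[]> pairs = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            String[] pair = splitAtFirstColon(line);
            if (pair != null) {
                pairs.add(pair);
            }
        }
        return pairs;
    }

    public static void main(String[] args) throws IOException {
        String sample = "PAGENUMBER: 1\nTEXT: random te:xt here\n";
        for (String[] pair : parse(new BufferedReader(new StringReader(sample)))) {
            System.out.println(pair[0] + " -> " + pair[1]);
        }
    }
}
```

Because only the first colon is used, later colons in the paragraph text (such as "te:xt") stay part of the value.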
-
February 11th, 2012, 07:25 AM
#3
Re: Efficient Text File parser
Thanks Keang.
I guess the problem is that the label with the colon might not always be on a new line.
Jack.
-
February 11th, 2012, 08:18 AM
#4
Re: Efficient Text File parser
You need to have some way of identifying the label or the task is impossible.
If the label is not necessarily at the beginning of a line, then what are the identification criteria?
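One possible criterion, purely as an assumption since the thread does not settle it: the set of labels is fixed and known in advance (here just PAGENUMBER and TEXT, taken from the example), and a label counts wherever it appears as a whole upper-case word followed by a colon. Under that assumption, a rough sketch using java.util.regex could look like this; KnownLabelParser and parse are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class KnownLabelParser {
    // Assumption: these are the only labels that can occur in the file.
    private static final Pattern LABEL = Pattern.compile("\\b(PAGENUMBER|TEXT):");

    // Returns {label, value} pairs; each value runs up to the next known label
    // (or the end of the input) and may span several lines.
    static List<String[]> parse(String input) {
        List<String[]> pairs = new ArrayList<>();
        Matcher m = LABEL.matcher(input);
        String label = null;
        int valueStart = 0;
        while (m.find()) {
            if (label != null) {
                // Close off the previous label's value at the start of this label.
                pairs.add(new String[] { label, input.substring(valueStart, m.start()).trim() });
            }
            label = m.group(1);
            valueStart = m.end();
        }
        if (label != null) {
            // The last value runs to the end of the input.
            pairs.add(new String[] { label, input.substring(valueStart).trim() });
        }
        return pairs;
    }
}
```

The match is case-sensitive, so a stray colon in ordinary text like "te:xt" is left alone. The obvious weakness is that a paragraph containing a literal "TEXT:" or "PAGENUMBER:" would be split there, so this only works if the labels can never appear inside the text itself.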