I was wondering if I could get some advice on the most efficient way I can parse a text file which contains the following pair:
<String LABEL:> Paragraph
For example:
PAGENUMBER: 1
TEXT: text with multiple lines
TEXT: some more text
PAGENUMBER: 2 TEXT: random te:xt here with colon just to add some spice
PAGENUMBER: 3
I was wondering what might be the best way to do this: I could tokenize on ":" but only if it matches with a specific label like TEXT or PAGENUMBER.
I was wondering if I could get some advice on the most efficient way I can parse a text file which contains the following pair
That all depends on what you mean by "most efficient".
I was wondering what might be the best way to do this:
If you know for sure that the label is at the beginning of the line and is always followed by a colon, then read the file a line at a time. For each line, either use a regex to split the line, or call the string's indexOf(':') method to find the index of the first colon and then use the substring() method to split the string at that index.
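The indexOf/substring approach could be sketched roughly as follows. This is only a minimal illustration, assuming every label starts its line. The class name LabelSplitter and the methods splitLine and parse are made up for the example, and the sample input is read from a string rather than a real file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, named only for this example.
class LabelSplitter {

    // Splits a line at its FIRST colon only, so later colons stay in the value.
    // Returns {label, value}, or null when the line contains no colon.
    static String[] splitLine(String line) {
        int colon = line.indexOf(':');
        if (colon < 0) {
            return null;
        }
        String label = line.substring(0, colon).trim();
        String value = line.substring(colon + 1).trim();
        return new String[] { label, value };
    }

    // Reads line by line and collects the label/value pairs.
    // Assumption: lines without a colon are simply skipped here; a real
    // parser might instead append them to the previous value.
    static List<String[]> parse(BufferedReader reader) throws IOException {
        List<String[]> pairs = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            String[] pair = splitLine(line);
            if (pair != null) {
                pairs.add(pair);
            }
        }
        return pairs;
    }

    public static void main(String[] args) throws IOException {
        String sample = "PAGENUMBER: 1\nTEXT: te:xt with a colon\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            for (String[] pair : parse(reader)) {
                System.out.println(pair[0] + " -> " + pair[1]);
            }
        }
    }
}
```

Note that a continuation line which happens to contain a colon would be misread as a new label, so this only works if the "label at the start of every line" assumption really holds.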
You need to have some way of identifying the label or the task is impossible.
If the label is not necessarily at the beginning of a line then what are the identification criteria?
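One possible answer, going by the example in the question, is that the labels come from a small fixed set (TEXT and PAGENUMBER). If that assumption holds, a regex that matches only those labels followed by a colon can find them anywhere in the input, including mid-line. The following is a rough sketch under that assumption. The class name KnownLabelParser and its parse method are invented for illustration, and the whole input is assumed to fit in one String.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named only for this example.
class KnownLabelParser {

    // Assumption: these two labels are the complete set. Matching is
    // case-sensitive, so a stray "te:xt" in the body is not taken as a label.
    private static final Pattern LABEL = Pattern.compile("\\b(PAGENUMBER|TEXT):");

    // Returns {label, value} pairs; each value runs from the end of its
    // label to the start of the next label (or to the end of the input).
    static List<String[]> parse(String input) {
        List<String[]> pairs = new ArrayList<>();
        Matcher m = LABEL.matcher(input);
        String label = null;
        int valueStart = 0;
        while (m.find()) {
            if (label != null) {
                pairs.add(new String[] { label, input.substring(valueStart, m.start()).trim() });
            }
            label = m.group(1);
            valueStart = m.end();
        }
        if (label != null) {
            pairs.add(new String[] { label, input.substring(valueStart).trim() });
        }
        return pairs;
    }

    public static void main(String[] args) {
        String sample = "PAGENUMBER: 2 TEXT: random te:xt here with colon\nPAGENUMBER: 3";
        for (String[] pair : parse(sample)) {
            System.out.println(pair[0] + " -> " + pair[1]);
        }
    }
}
```

This still breaks if the body text itself contains a literal "TEXT:" or "PAGENUMBER:", so it is only as reliable as the identification rule you can actually guarantee for your files.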