-
January 18th, 2011, 11:35 AM
#1
Cleaning up code with Regular Expressions
I don't know regular expressions very well, so I mostly find examples and try to adapt them. My code works, but it usually ends up being really ugly.
Code:
string classifications = LineNode.InnerText;
//Makes This_TypeOfStringIsLong > This_Type Of String Is Long
string spaced = Regex.Replace(classifications, @"([a-z])([A-Z])", @"$1 $2", RegexOptions.None);
//Makes This_Type Of String Is Long > This_Type,Of,String,Is,Long
string modded = spaced.Replace(" ", ",");
//Makes This_Type,Of,String,Is,Long > This Type,Of,String,Is,Long
string again = modded.Replace("_", " ");
string[] LineArray = again.Split(',');
Is there a simpler, shorter way of doing this?
We begin with a string like "This_TypeOfStringIsLong". It has to be split at each capitalized word, and any _ has to become a space, but words joined by a _ have to stay together as one entry.
Example: Blue_SkyDarkNightOwl > Blue Sky, Dark, Night, Owl
-
January 18th, 2011, 05:36 PM
#2
Re: Cleaning up code with Regular Expressions
Hi Bix.
I'm certainly no better than you with regular expressions, and almost certainly not AS good, but it did occur to me that one step in the process might be eliminated by combining two steps into a single statement.
In the original code:
Step 1 inserts a space anywhere in the line where a lower-case char is followed immediately by an upper-case char, thus tokenizing the string.
Step 2 converts those spaces to commas for the impending split.
Step 3 converts the underscore to a space.
Step 4 splits the line on the aforementioned commas.
It seems likely that in this particular case, one might successfully combine steps 1 & 2 as follows:
Code:
string classifications = "This_TypeOfStringIsLong";
// Tokenize the string
string spaced = Regex.Replace(classifications, @"([a-z])([A-Z])", @"$1,$2");
// Replace the underscore with a space
string again = spaced.Replace("_", " ");
string[] LineArray = again.Split(',');
Yet another option, though much less desirable, would be to combine all the operations into a single statement without creating any intermediate strings, similar to the following:
Code:
string[] LineArray2 = Regex.Replace(classifications, @"([a-z])([A-Z])", @"$1,$2").Replace("_", " ").Split(',');
Personally, I'd avoid that option like the plague because it makes the operation hard to read, but I guess it is an option, however foul.
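For what it's worth, here is one more alternative that is not from the posts above, just a rough sketch: split directly at the lower-case/upper-case boundary with a zero-width lookaround pattern in Regex.Split, so no placeholder comma is needed. The class and method names are made up for illustration. Like the original pattern, it assumes plain ASCII letters, so runs of capitals or digits are not split.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class ClassificationSplitter
{
    // Hypothetical helper name, not something from the thread.
    public static string[] SplitClassifications(string classifications)
    {
        // (?<=[a-z]) requires a lower-case letter just before the split point,
        // (?=[A-Z]) requires an upper-case letter just after it.
        // Both are zero-width, so no characters are consumed by the split.
        return Regex.Split(classifications, @"(?<=[a-z])(?=[A-Z])")
                    // Turn the underscore inside each token into a space.
                    .Select(token => token.Replace('_', ' '))
                    .ToArray();
    }

    public static void Main()
    {
        string[] parts = SplitClassifications("Blue_SkyDarkNightOwl");
        // Prints: Blue Sky, Dark, Night, Owl
        Console.WriteLine(string.Join(", ", parts));
    }
}
```

Whether this reads better than the three-step version is a matter of taste. The lookaround pattern is shorter but harder to decipher if you are not comfortable with regular expressions.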
Last edited by ThermoSight; January 18th, 2011 at 10:07 PM.
-
January 19th, 2011, 08:45 PM
#3
Re: Cleaning up code with Regular Expressions
Yeah, I fixed it and made it waaay smaller just by changing the way I save out the XML file. Before, I had it like this >> "This_Is_OneHere_Is_AnotherYetMoreAgainSplitEachByCapital_Word"
Now it looks like this >> "This_Is_One,Here_Is_Another,Yet,More,Again,Split,Each,By,Capital_Word"
The code is SUPER simplified with this:
Code:
string classifications = LineNode.InnerText;
string modded = classifications.Replace("_", " ");
string[] LineArray = modded.Split(',');
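As a small sketch of what the comma-delimited format produces, the snippet below runs the same replace-and-split on the sample string from this post. The class and method names are invented for illustration. Trimming the tokens and dropping empty entries are extra assumptions on my part, in case the saved file ever contains stray spaces or a trailing comma.

```csharp
using System;
using System.Linq;

public static class SavedClassifications
{
    // Hypothetical helper name, not something from the thread.
    public static string[] Parse(string classifications)
    {
        return classifications
            // Split on commas, skipping empty tokens (assumption: e.g. a trailing comma).
            .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
            // Underscores become spaces; Trim guards against stray whitespace (assumption).
            .Select(token => token.Replace('_', ' ').Trim())
            .ToArray();
    }

    public static void Main()
    {
        string sample = "This_Is_One,Here_Is_Another,Yet,More,Again,Split,Each,By,Capital_Word";
        // Writes each entry on its own line, starting with "This Is One".
        foreach (string entry in Parse(sample))
        {
            Console.WriteLine(entry);
        }
    }
}
```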