get left and right characters


Trainwreck
March 21st, 2008, 08:01 PM
Hey, I really have a problem that I can't figure out by myself.

Suppose I have the following string: "12.34_56:78"

What I need, probably with a regex, is to search for the '_' character and take out the 34 on the left and the 56 on the right of it.

Is there any way to accomplish this? I really have no idea how to write a regex for this kind of thing.

dlorde
March 21st, 2008, 09:05 PM
Your question is not very clear - when you say 'take out' the 34 and the 56, do you mean you want the result to be the remainder of the text, or given that text, you want the result to be 34 and 56?

It's probably best if you explain more clearly and provide several examples of input text and the output you would expect to get.

If you cannot describe what you are doing as a process, you don't know what you're doing...
W.E. Deming

Trainwreck
March 22nd, 2008, 07:15 AM
Yeah, sorry, I can imagine that it's unclear to others.
Here's what I want. I'm going to use the string from above as the example again, so I have:
"12.34_56:78"

I need to search that string for the underscore '_'. When it is found, I want to get all the digits immediately to the left and right of the underscore, until there are no more digits.

So suppose a function or regex searches the string and eventually sees an underscore.
Now it has to look to the left of the underscore to collect the digits. It first sees the digit 4, then the digit 3, then a dot '.', which is not a digit, so it stops there.

That leaves me with 34 from the left side. Now it does the same thing for the right side:

5 is a digit, 6 is a digit, and then it bumps into a non-digit character, ':', so it stops searching there too, which leaves me with 56.
With a result of
34
56

A few other examples:
2.145_7:9778

result:
145
7
-----------------------------
245.909_5421:09

result:
909
5421
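
The left/right scan described above can also be written directly, without a regex. A minimal sketch in Java, assuming only the first underscore in the string matters (the class and method names are made up for illustration):

```java
public class ScanAroundUnderscore {
    // Returns {left, right}: the digit runs touching the first '_' in text,
    // or null when the text contains no underscore.
    public static String[] scan(String text) {
        int mid = text.indexOf('_');
        if (mid < 0) {
            return null;
        }
        // Walk left from the underscore while the previous character is a digit.
        int start = mid;
        while (start > 0 && Character.isDigit(text.charAt(start - 1))) {
            start--;
        }
        // Walk right from the underscore while the next character is a digit.
        int end = mid + 1;
        while (end < text.length() && Character.isDigit(text.charAt(end))) {
            end++;
        }
        return new String[] { text.substring(start, mid), text.substring(mid + 1, end) };
    }

    public static void main(String[] args) {
        String[] result = scan("245.909_5421:09");
        System.out.println(result[0]); // prints 909
        System.out.println(result[1]); // prints 5421
    }
}
```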

I hope you get the idea of what I need to accomplish here. If it's still unclear, just say so and I'll try to explain it better.

dlorde
March 22nd, 2008, 02:27 PM
OK, it's clear enough now.

I'm not an expert in regular expressions, so I may be missing a simpler way, but ISTM you could do it in two stages - first find all the occurrences of a sequence of digits either side of an underscore (nn_nn), then split those occurrences using the underscore as the delimiter.

To find digits either side of an underscore, create a Matcher from a Pattern using "\\d+_\\d+" as the expression. Then loop through the Matcher with the find() method (e.g. while (matcher.find()) {... ). Inside the loop, get the matched group from the Matcher and split it using "_" as the delimiter. This will give you an array containing the numbers on either side of the underscore.
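
The two stages described above could look roughly like this. It is only a sketch: the class name and the extract method are invented for the example, and the inputs are the ones posted earlier in the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnderscoreDigits {
    // Returns one {left, right} pair for each "digits_digits" occurrence in text.
    public static List<String[]> extract(String text) {
        List<String[]> pairs = new ArrayList<>();
        Matcher matcher = Pattern.compile("\\d+_\\d+").matcher(text);
        while (matcher.find()) {
            // matcher.group() is the whole match, e.g. "34_56";
            // splitting it on "_" yields the two numbers.
            pairs.add(matcher.group().split("_"));
        }
        return pairs;
    }

    public static void main(String[] args) {
        String[] inputs = { "12.34_56:78", "2.145_7:9778", "245.909_5421:09" };
        for (String input : inputs) {
            for (String[] pair : extract(input)) {
                System.out.println(pair[0] + " " + pair[1]);
            }
        }
    }
}
```

For the three inputs this prints "34 56", "145 7" and "909 5421", one pair per line.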

The art of programming is the art of organizing complexity, of mastering multitude, and avoiding its bastard, chaos...
E. Dijkstra

Trainwreck
March 22nd, 2008, 06:35 PM
Thanks, I think this is going to work for me :)

vivendi
March 23rd, 2008, 09:52 AM
Hey guys, this is also exactly what I need, but I have one problem with the code.
If I use the regex you gave:


"\\d+_\\d+"


and change the underscore '_' to an asterisk '*', it gives me an error:


"\\d+*\\d+" //Doesn't work
"\\d+\\*\\d+" //Doesn't work either


Any idea how to solve this problem?

dlorde
March 23rd, 2008, 10:35 AM
It works just as before for me if I replace '_' with "\\*" throughout. Without seeing your code, I can't say more.

All truths are easy to understand once they are discovered; the point is to discover them...
G. Galilei

vivendi
March 23rd, 2008, 10:55 AM
This is the code that I have now:


Matcher m = Pattern.compile("\\d+\\*\\d+").matcher(text);
while(m.find()) {
    String strTemp = m.group();

    String strSum[] = strTemp.split("*");
    int iLeft = new Integer(strSum[0]).intValue();
    int iRight = new Integer(strSum[1]).intValue();

    int iResult = iLeft * iRight;
    String strResult = Integer.toString(iResult);

    text = text.replace(strTemp, strResult);

    System.out.println(text);
}

dlorde
March 23rd, 2008, 11:00 AM
If you read my previous message, you'll see that I say "replace '_' with "\\*" throughout". '*' is a special character in a regex, so it must be escaped. You correctly escape it in the regex passed to Pattern.compile(..), but not in the one passed to split(..), which also takes a regex.
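
As a sketch of the escaping applied in both places (the class and method names are invented for the example, and it collects the output with appendReplacement/appendTail instead of calling text.replace inside the loop):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EscapedAsterisk {
    // Replaces every "digits*digits" occurrence in text with its product.
    public static String multiplyAll(String text) {
        Matcher m = Pattern.compile("\\d+\\*\\d+").matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // split() takes a regex too, so the asterisk is escaped here as well.
            // Pattern.quote("*") is an equivalent way to write the delimiter.
            String[] parts = m.group().split("\\*");
            int product = Integer.parseInt(parts[0]) * Integer.parseInt(parts[1]);
            m.appendReplacement(out, Integer.toString(product));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(multiplyAll("2*3+4*5")); // prints 6+20
    }
}
```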

In the particular is contained the universal...
J. Joyce

vivendi
March 23rd, 2008, 12:16 PM
Ahh yeah, I forgot to escape it in the split() call. Thanks, now everything is working great ^^

vivendi
March 24th, 2008, 08:05 AM
Sorry to bring up someone else's topic again, but I'm struggling with a similar problem.

I want to get everything between parentheses. So if I had this string: "(2+2)*1+(4+4)"
then I want to get:
2+2
4+4

I've tried it with a matcher() and a pattern.split() but I can't get it to work.
This is the last thing that I tried:

String input = "(2+2)*1+(4+4)";
Pattern p = Pattern.compile("\\(.\\)");
String strParenthesis[] = p.split(input);

for(int i=0; i<strParenthesis.length; i++)
{
    System.out.println("sum: "+strParenthesis[i]);
}

vivendi
March 24th, 2008, 12:29 PM
I couldn't let this go, and suddenly I got it! I think I now know how regex basically works. The answer to my question is:

"\\(\\d+.\\d+\\)"

:)

keang
March 24th, 2008, 12:33 PM
Your problem is the '.', which matches any character, but only one character. You need to match one or more of any character, so use the plus quantifier, which matches the preceding element one or more times, i.e. '.+'.

This still doesn't totally solve the problem, though, because '.+' is greedy: it will match from the first opening bracket to the last closing bracket, which isn't what you want. You need a lazy (reluctant) quantifier, which tells the matcher to consume as few characters as possible. You get this by adding a '?' after the repetition character, i.e. '.+?'.

Your pattern should now be: "\\(.+?\\)".
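
To see the difference between the greedy and the lazy version, here is a small sketch (findAll is just a helper made up for this example):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVsLazy {
    // Collects every match of regex in input, in order.
    public static List<String> findAll(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "(2+2)*1+(4+4)";
        // Greedy: one match running from the first '(' to the last ')'.
        System.out.println(findAll("\\(.+\\)", input));  // prints [(2+2)*1+(4+4)]
        // Lazy: each match stops at the nearest ')'.
        System.out.println(findAll("\\(.+?\\)", input)); // prints [(2+2), (4+4)]
    }
}
```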

If you now use the matcher to find this pattern, you will get a string starting with an opening bracket, followed by one or more characters, and ending with a closing bracket. If you don't want the brackets, use substring to extract the bit in between them. There's probably a way to get the regex to drop the opening and closing brackets, but that's beyond my regex ability.
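
One way to have the regex drop the brackets is a capturing group: wrap the part you want in unescaped parentheses and read it back with group(1). A sketch, with names invented for the example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class InsideParentheses {
    // Returns the text found between each pair of parentheses in input.
    public static List<String> contents(String input) {
        List<String> result = new ArrayList<>();
        // \\( and \\) match literal brackets; the unescaped ( ) form group 1.
        Matcher m = Pattern.compile("\\((.+?)\\)").matcher(input);
        while (m.find()) {
            result.add(m.group(1)); // group(1) excludes the literal brackets
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(contents("(2+2)*1+(4+4)")); // prints [2+2, 4+4]
    }
}
```

Note that this still does not handle nested parentheses such as "((1+2)*3)".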

I suggest you read a regex tutorial. There are several online, but I find this one (http://www.regular-expressions.info/tutorial.html) to be particularly good.

keang
March 24th, 2008, 12:40 PM
vivendi wrote:
I couldn't let this go, and suddenly I got it! I think I now know how regex basically works. The answer to my question is:

"\\(\\d+.\\d+\\)"

Yes, this will work provided the brackets contain digits, followed by any single character, followed by digits. Of course, anything more complicated will not match, e.g.:

(2+3) will match
(254) will match
(12+56) will match
(12+5+78) will not match
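
These four cases are easy to check with Pattern.matches, which tests the whole string against the pattern. A quick sketch (the class and method names are arbitrary):

```java
import java.util.regex.Pattern;

public class DigitsCharDigits {
    // True when s is, in full: '(' digits, any one character, digits ')'.
    public static boolean matches(String s) {
        return Pattern.matches("\\(\\d+.\\d+\\)", s);
    }

    public static void main(String[] args) {
        String[] samples = { "(2+3)", "(254)", "(12+56)", "(12+5+78)" };
        for (String sample : samples) {
            System.out.println(sample + " -> " + matches(sample));
        }
    }
}
```

(254) matches because the '.' is free to consume the middle digit.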

vivendi
March 24th, 2008, 03:10 PM
True, something like 12+5+78 will not work, but solving that kind of thing will be my next step, I guess. No idea how to do it right now, though...

keang
March 24th, 2008, 05:32 PM
vivendi wrote:
True, something like 12+5+78 will not work, but solving that kind of thing will be my next step, I guess. No idea how to do it right now, though...

When you check your thread, it's advisable to read all of the replies since your last post rather than just the reply at the end of the thread. If you had read all of the replies, you would have seen that I posted twice, and the first post showed you a way to do this.

dlorde
March 24th, 2008, 05:42 PM
Remember that the regex you specified to identify stuff between parentheses won't work in split(..) if you want what's between the parentheses - because split(..) uses the regex to specify the delimiter, so you'll get everything except what the regex specifies.

Using Pattern and Matcher is the way to go if you want to extract the stuff matched by the regex.
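
A short sketch of the difference, using the same pattern both ways (the names are made up for the example):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SplitVsFind {
    static final Pattern PARENS = Pattern.compile("\\(.+?\\)");

    // split(..) treats each match as a delimiter and returns what lies between.
    public static String[] viaSplit(String input) {
        return PARENS.split(input);
    }

    // find() walks over the matches themselves.
    public static List<String> viaFind(String input) {
        List<String> found = new ArrayList<>();
        Matcher m = PARENS.matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "(2+2)*1+(4+4)";
        // Two pieces: an empty string (before the first match) and "*1+".
        for (String piece : viaSplit(input)) {
            System.out.println("split piece: \"" + piece + "\"");
        }
        System.out.println(viaFind(input)); // prints [(2+2), (4+4)]
    }
}
```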

Note that if you are trying to construct some kind of expression evaluator (all your regexes seem to concern math expressions), regular expressions are not the way to go. For this sort of thing, you need an expression parser. This is a well-explored area, and one where you'll find plenty of help online.
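
To give a rough idea of what such a parser looks like, here is a minimal recursive-descent sketch. It assumes non-negative integers, the operators + - * / (integer division) and parentheses only; the class name and grammar are just for illustration, not a complete evaluator.

```java
public class ExprParserSketch {
    private final String src;
    private int pos = 0;

    private ExprParserSketch(String src) {
        this.src = src.replace(" ", ""); // ignore spaces
    }

    // Evaluates the whole expression or throws IllegalArgumentException.
    public static int eval(String expression) {
        ExprParserSketch parser = new ExprParserSketch(expression);
        int value = parser.expr();
        if (parser.pos != parser.src.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + parser.pos);
        }
        return value;
    }

    private boolean at(char c) {
        return pos < src.length() && src.charAt(pos) == c;
    }

    // expr := term (('+' | '-') term)*
    private int expr() {
        int value = term();
        while (at('+') || at('-')) {
            char op = src.charAt(pos++);
            int rhs = term();
            value = (op == '+') ? value + rhs : value - rhs;
        }
        return value;
    }

    // term := factor (('*' | '/') factor)*
    private int term() {
        int value = factor();
        while (at('*') || at('/')) {
            char op = src.charAt(pos++);
            int rhs = factor();
            value = (op == '*') ? value * rhs : value / rhs;
        }
        return value;
    }

    // factor := digits | '(' expr ')'
    private int factor() {
        if (at('(')) {
            pos++; // consume '('
            int value = expr();
            if (!at(')')) {
                throw new IllegalArgumentException("Missing ')' at position " + pos);
            }
            pos++; // consume ')'
            return value;
        }
        int start = pos;
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Number expected at position " + pos);
        }
        return Integer.parseInt(src.substring(start, pos));
    }

    public static void main(String[] args) {
        System.out.println(eval("(2+2)*1+(4+4)")); // prints 12
        System.out.println(eval("12+5+78"));       // prints 95
    }
}
```

Each grammar rule becomes one method, so operator precedence and nested parentheses fall out of the call structure rather than out of a pattern.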

Vague and nebulous is the beginning of all things, but not their end...
K. Gibran