regex problem


vivendi
March 20th, 2008, 02:03 PM
Hey, I have a little problem with my regex function.
Suppose I have the following string: 2+3-4+5



Pattern p = Pattern.compile("");
// Split input with the pattern
String[] result = p.split("2+3-4+5");

for (int i = 0; i < result.length; i++)
    System.out.println(result[i]);


What I want to do is split out all the positive numbers. So the result should be:
2
3
5

What I get with my regexp is:
2
3-4
5

I don't want it to take the -4 with it. How can I solve this?

ProgramThis
March 20th, 2008, 02:53 PM
Please post your regex, that will help.

From what I understand about regex, you can put in a \-[^0-9]+ or something like that. Basically, you exclude any combination of numbers (the ^[0-9]+ part) which are followed by the - (by using \-, I believe).

Here, check out this (http://java.sun.com/j2se/1.4.2/docs/api/java/util/regex/Pattern.html) link.

And here (http://www.javaworld.com/javaworld/jw-07-2001/jw-0713-regex.html?page=2) is an example of it in use.

vivendi
March 20th, 2008, 03:04 PM
Sorry, I accidentally deleted my regex from the code above, but it's just as simple as this:

Pattern p = Pattern.compile("[+]");

With your regex I get the following result:
2+3
+5
Which isn't exactly what I want. I only need the numbers, stored one by one in my object 'result'.

2
3
5

Like that. Any idea how to do that?
BTW, I checked that link of yours. I had already used that site earlier today, but I just can't figure out how to achieve my goal.

dlorde
March 20th, 2008, 04:28 PM
The split pattern you want is something like this: "\\+|(-\\d+)+\\+".
This translates as PLUS OR [(MINUS followed by one or more digits) one or more times, followed by PLUS].

The '+' is escaped because it's a special character, the OR is '|', and one or more digits is '\\d+'. If all your numbers are single digits, you won't need the extra '+' after the '\\d'. The '+' following the parentheses means one or more times: the '-' followed by some digits is repeated in case several negative numbers follow each other. At the end of a negative number sequence we must also include the '+' in front of the next positive number (remember that the delimiter is everything between one desired output number and the next).

This is all documented in the Pattern (http://java.sun.com/javase/6/docs/api/java/util/regex/Pattern.html) JavaDocs.
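
As a rough sketch of how that split pattern could be exercised on the string from the first post (the names SplitPositives and splitPositives are made up for illustration and are not from the thread):

```java
import java.util.regex.Pattern;

class SplitPositives {
    // Delimiter: a lone '+', OR one or more "-digits" groups followed by
    // the '+' that introduces the next positive number.
    static final Pattern DELIMITER = Pattern.compile("\\+|(-\\d+)+\\+");

    // Hypothetical helper: returns the pieces left between delimiters.
    static String[] splitPositives(String input) {
        return DELIMITER.split(input);
    }

    public static void main(String[] args) {
        // Prints 2, 3 and 5 on separate lines.
        for (String number : splitPositives("2+3-4+5")) {
            System.out.println(number);
        }
    }
}
```

Note that this only drops a negative number when a '+' follows it, so an input that ends in a negative number would probably need a slightly different pattern.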

Optimism is an occupational hazard of programming: testing is the treatment...
K. Beck

vivendi
March 20th, 2008, 04:53 PM
Hey thanks, it does work. There's just one thing though: I also have to check for a * (multiply) or / (divide) character and exclude those.
So if I had this now: 2+3-4+5*6/7

Then it should still give me:
2
3
5

I've tried to edit your regex to this: "\\+|(-\\d+)+|(*\\d+)+\\+" //This one is still without the divide character

But that gives me an error. Sorry, I'm really trying to understand regex, but it's pretty hard to understand the more advanced stuff. I'm still trying though :)

dlorde
March 20th, 2008, 06:13 PM
The '*' is a special character, so you should escape it if you want it as a literal: '\\*'. Any character that can be used as a special character in a pattern must be escaped if you want it as a literal.
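
To illustrate the escaping rule, here is a small sketch (EscapeDemo and compiles are hypothetical names). It tries the unescaped pattern from the post above against an escaped version, and also shows Pattern.quote, the standard library method for escaping a whole literal string:

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class EscapeDemo {
    // Hypothetical helper: true if the regex compiles, false on a syntax error.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Unescaped '*' right after '(' has nothing to repeat: prints false.
        System.out.println(compiles("\\+|(-\\d+)+|(*\\d+)+\\+"));
        // Escaped '\\*' is a literal asterisk: prints true.
        System.out.println(compiles("\\+|(-\\d+)+|(\\*\\d+)+\\+"));
        // Pattern.quote turns any string into a literal pattern: prints true.
        System.out.println(Pattern.compile(Pattern.quote("*")).matcher("*").matches());
    }
}
```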

The new requirement is to match '*' and '/' (including the digits following) the same as '-'. So if you replace '-' in my expression with '*' OR '/' OR '-', you should get the result you want. The way to OR characters together is to put them inside square brackets: "[]".
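
A sketch of that substitution, with some assumptions stated up front: the class name SplitWithOperators and both constant names are invented here, and the '-' is placed last inside the brackets so it is read as a literal rather than as a range. The second pattern is an adjustment not given in the thread: it also accepts end of input after the excluded run, since the sample string ends in "*6/7" with no '+' after it.

```java
import java.util.regex.Pattern;

class SplitWithOperators {
    // The expression as described: '-' replaced by the class [*/-].
    static final Pattern AS_DESCRIBED = Pattern.compile("\\+|([*/-]\\d+)+\\+");

    // Assumed variant: the excluded run may also be followed by end of input.
    static final Pattern WITH_END = Pattern.compile("\\+|([*/-]\\d+)+(\\+|$)");

    static String[] split(Pattern delimiter, String input) {
        return delimiter.split(input);
    }

    public static void main(String[] args) {
        String input = "2+3-4+5*6/7";
        // The trailing "*6/7" has no '+' after it, so it stays attached: 2, 3, 5*6/7
        System.out.println(String.join(", ", split(AS_DESCRIBED, input)));
        // Allowing end of input removes the trailing run as well: 2, 3, 5
        System.out.println(String.join(", ", split(WITH_END, input)));
    }
}
```

The variant relies on String-style split behaviour in Pattern.split, which discards trailing empty strings, so the match at the very end of the input does not leave an empty last element.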

Regular expressions aren't really difficult - just unfamiliar. They require that you can logically express what you need to match, and then follow the regex rules exactly. The first bit is where people usually have trouble - they can't express what they want to match logically. It's hard to do because it's not something we normally have to do. The brain is so good at pattern matching, no conscious analysis is needed. When we need to explain it to a computer, paradoxically it takes a while to figure out just what it is we usually do without thinking :rolleyes:

That language is an instrument of human reason, and not merely a medium for the expression of thought, is a truth generally admitted...
G. Boole