Using Regular Expression (regex) in Java Programming
Hi all,
I'm new to using regular expressions in Java.
For example, I have a String s like this:
String s = "Why John Smith and Alan Smith and Nick Gates are the same?";
How can I use a regex to get the sub-strings "John Smith", "Alan Smith" and "Nick Gates" (names of people, where each word starts with an upper-case character) from s?
I tried to use a regex in Java, but it doesn't work at all.
This is my code:
/*
String EXAMPLE_TEST = "Why John Smith and Alan Smith and Nick Gates are the same?";
System.out.println(EXAMPLE_TEST.matches("([A-Z]&&[a-z])"));
String[] splitString = (EXAMPLE_TEST.split("([A-Z]&&[a-z])"));
System.out.println(splitString.length);
for (String string : splitString) {
System.out.println(string);
}
*/
Please help me find the right regex for this case.
Thanks in advance, all!
Re: Using Regular Expression (regex) in Java Programming
If the only way of spotting a name is that two adjacent words both start with a capital letter, then the following regex will do this:
Code:
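// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
// Two adjacent words, each an upper-case letter followed by lower-case letters
Pattern pattern = Pattern.compile("[A-Z][a-z]+ [A-Z][a-z]+");
Matcher matcher = pattern.matcher("Why John Smith and Alan Smith and Nick Gates are the same?");
while (matcher.find()) {
    System.out.println("main: " + matcher.group());
}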
However, this still fails on the given sentence: the first word starts with a capital letter and the second word is part of a name, so the above regex will pull out "Why John" instead of "John Smith". Other failures may occur if the sentence contains other capitalised words, such as proper nouns or abbreviations.
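To see the "Why John" problem concretely, here is a minimal sketch that wraps the same regex in a helper method. The class name AdjacentCapitalsDemo and the method findPairs are made up for illustration and are not from the original post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names here are assumptions for illustration.
public class AdjacentCapitalsDemo {
    // Two adjacent capitalised words separated by a single space.
    private static final Pattern PAIR = Pattern.compile("[A-Z][a-z]+ [A-Z][a-z]+");

    // Collects every non-overlapping match, scanning left to right.
    static List<String> findPairs(String text) {
        List<String> result = new ArrayList<>();
        Matcher matcher = PAIR.matcher(text);
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "Why John Smith and Alan Smith and Nick Gates are the same?";
        System.out.println(findPairs(s));
        // prints [Why John, Alan Smith, Nick Gates]
    }
}
```

Because find() resumes scanning after the end of the previous match, "John" is consumed by "Why John", so "John Smith" is never reported.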
I wrote a simple regex test utility to speed up testing regular expressions, which may be of help to you. You can run your tests using the applet on the page or download a standalone version.
Re: Using Regular Expression (regex) in Java Programming
Thanks very much, keang!
I put a space at the start of your regex to exclude "Why", and it worked well:
Pattern pattern = Pattern.compile(" [A-Z][a-z]+ [A-Z][a-z]+");
main: John Smith
main: Alan Smith
main: Nick Gates
Re: Using Regular Expression (regex) in Java Programming
Your solution will only work if the first word of the sentence isn't part of a name, because it needs a space in front of every name.
A slightly better regex is "(?<= )([A-Z][a-z]+ [A-Z][a-z]+\b)". It matches the same names as your regex, but the lookbehind (?<= ) checks for the leading space without including it in the returned value. The \b (written as "\\b" inside a Java string literal) requires a word boundary after the second word, so a second word that continues with digits, such as "Gates2", is rejected. Your original regex would match just the alphabetic part of such a word.
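As a rough sketch of how the lookbehind version behaves, the following wraps it in a small helper and runs it on three inputs. The class name LookbehindNameDemo, the method findNames and the two extra test sentences are assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names here are assumptions for illustration.
public class LookbehindNameDemo {
    // (?<= ) requires a preceding space without consuming it;
    // \\b requires a word boundary after the second word.
    private static final Pattern NAME =
            Pattern.compile("(?<= )([A-Z][a-z]+ [A-Z][a-z]+\\b)");

    // Collects every match of the pattern, scanning left to right.
    static List<String> findNames(String text) {
        List<String> result = new ArrayList<>();
        Matcher matcher = NAME.matcher(text);
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // The sentence from the original question.
        System.out.println(findNames("Why John Smith and Alan Smith and Nick Gates are the same?"));
        // prints [John Smith, Alan Smith, Nick Gates]

        // A name at the very start has no space before it, so it is missed.
        System.out.println(findNames("John Smith met Alan Smith"));
        // prints [Alan Smith]

        // A second word that continues with a digit fails the \b check.
        System.out.println(findNames("Hi Nick Gates2 here"));
        // prints []
    }
}
```

The second call shows the remaining limitation: the lookbehind still needs a space before the name, so a name that starts the sentence is not found.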
Re: Using Regular Expression (regex) in Java Programming
The OP has cross-posted this question to other forums (Java Programming Forums), so be aware that you may be wasting your time answering a question that has already been answered elsewhere.
To the OP - please don't cross-post without notification. If you do, ensure you update each thread with any new information from the others.
Experience is a poor teacher: it gives its tests before it teaches its lessons...
Anon.
Re: Using Regular Expression (regex) in Java Programming
Thanks for the heads up.
I particularly like the way the OP added the post 'I figured out by using regex like this: " [A-Z][a-z]+ [A-Z][a-z]+"'. Plagiarism?
@lordelf2004 If you want your posts answered in future I suggest you take dlorde's advice and also give credit to any posters that have helped to solve your problem rather than claiming to have done it yourself.
Re: Using Regular Expression (regex) in Java Programming
Hi dlorde and keang,
First of all, I'm so sorry for any inconvenience I caused you!
My mistake was that I did not cite keang, who suggested the solution, in this post: http://www.javaprogrammingforums.com...html#post12195 , right?
I posted my problem in several forums because I wanted to get it resolved as quickly as possible.
I DID NOT intend to do anything like "plagiarism".
When the problem was solved, I just updated the other threads. I only wanted to post the solution for the posters in those forums, so that they would not waste their time trying to help me.
This is the first time I have joined these forums, and I didn't know about rules like this.
I'll cite any posters who helped me solve the problem if I post it anywhere else.
again, I'm so sorry...
Re: Using Regular Expression (regex) in Java Programming
Quote:
again, I'm so sorry...
That's ok, we all make mistakes.