boudino
November 30th, 2007, 05:55 AM
Hello
I just need a consultation. I want to check whether a string is a grammatically correct email address (syntax only, without checking the MX record for the domain). There are tons of such regexes on the internet, but they don't meet my needs. Many of them allow addresses like mailbox@123.122., and those which claim to be complete implementations of RFC 2822 are too permissive in my opinion.
So I ended up with the following two regexes:
1:^(?:[\w~-]+)(?:\.[\w~-]+)*@(?!\b\d+\b)(?:[\w-]+)(?:\.(?!\b\d+\b)(?:[\w-]+))*$
2:^(?:[\w~-]+)(?:\.[\w~-]+)*@(?:[\w-]+\.)*(?!\b\d+\b)(?:[\w-]+)$
They should allow digits, letters, "-" and "~" in the mailbox, and also periods inside it.
The first should accept any domain name made up of one or more groups separated by periods, where each group is a combination of letters, digits and "-", but a group consisting of digits only is not allowed.
The second is almost the same but less restrictive about the domain name: only the last group is required not to be digits only.
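One way to check the behaviour described above is to run both patterns against a handful of addresses. The following is only a sketch: it assumes Python's `re` module (the post does not say which regex engine is used, and other engines may treat `\w`, `\d` and `\b` differently), and the `check` helper and the sample addresses are made up for illustration.

```python
import re

# The two patterns from the post, copied verbatim.
# Assumption: Python's `re` engine; the original engine is not stated.
REGEX_1 = re.compile(
    r"^(?:[\w~-]+)(?:\.[\w~-]+)*@(?!\b\d+\b)(?:[\w-]+)(?:\.(?!\b\d+\b)(?:[\w-]+))*$"
)
REGEX_2 = re.compile(
    r"^(?:[\w~-]+)(?:\.[\w~-]+)*@(?:[\w-]+\.)*(?!\b\d+\b)(?:[\w-]+)$"
)

# Hypothetical sample addresses, chosen to probe the domain rules.
SAMPLES = [
    "mailbox@example.com",
    "first.last@example.com",
    "mailbox@123.122.",
    "mailbox@123.example.com",
    "mailbox@example.123",
    "mailbox@123-abc.com",
    "user+tag@example.com",
]


def check(address):
    """Return (accepted by regex 1, accepted by regex 2)."""
    return (
        REGEX_1.match(address) is not None,
        REGEX_2.match(address) is not None,
    )


if __name__ == "__main__":
    for address in SAMPLES:
        first, second = check(address)
        print(f"{address:26} regex1={first!s:5} regex2={second}")
```

Under these assumptions, both patterns reject mailbox@123.122. and mailbox@example.123, while mailbox@123.example.com passes only the second one. Two side effects are worth knowing: the first pattern also rejects mailbox@123-abc.com, because `\b` sees a word boundary between "3" and "-", and `\w` lets underscores through in both the mailbox and the domain.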
What do you think about them? Will they evaluate the way I describe? Which one do you prefer?
Thanks.
And one more question: is "+" a valid character in the mailbox part? What other characters like that are allowed there? (I know that everything is in the RFCs, but I am too lazy to read them all ;) )
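For reference, RFC 2822 section 3.2.4 defines the characters allowed in an unquoted mailbox ("atext") as letters, digits and ! # $ % & ' * + - / = ? ^ _ ` { | } ~, so "+" is allowed. A minimal sketch of that rule, again assuming Python; the function name is hypothetical, and quoted mailboxes such as "john smith"@example.com are deliberately not handled.

```python
import re

# atext from RFC 2822 section 3.2.4: letters, digits and the listed specials.
# The trailing "-" is a literal hyphen inside the character class.
ATEXT = r"A-Za-z0-9!#$%&'*+/=?^_`{|}~-"

# dot-atom: one or more atext runs separated by single periods.
DOT_ATOM = re.compile(rf"[{ATEXT}]+(?:\.[{ATEXT}]+)*")


def is_dot_atom_local_part(local_part: str) -> bool:
    """True if local_part is an unquoted (dot-atom) mailbox name.

    Hypothetical helper; quoted-string mailboxes are out of scope.
    """
    return DOT_ATOM.fullmatch(local_part) is not None


if __name__ == "__main__":
    for candidate in ["user+tag", "first.last", "first..last", ".user"]:
        print(candidate, is_dot_atom_local_part(candidate))
```

Whether a given mail server actually accepts all of these characters is a separate matter; the sketch only covers the grammar.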