-
March 14th, 2008, 04:20 PM
#16
Re: MySQL Collation issue
Regular expresions don't work.
I've checked everything, collations, caracter sets...
I've seen pájaro is stored in the DB as: 70 C3 A1 6A 61 72 6F which is 7 bytes long. This is the utf8 code. These 7 bytes, viewed as latin1 appear as pájaro
However actually pájaro in latin1_spanish is 6 bytes long 70 E1 6A 61 72 6F
It looks like utf-8 data in DB is being taken byte by byte as latin1, ignoring character sets and collations. When I search for pájaro this word is translated to pájaro and then I find records because there is a match. However when I search for pajaro there is not any match.
I don't understand.
I will try to change character sets and collations to latin1 to see if I have the same problems.
Last edited by Doctor Luz; March 14th, 2008 at 04:26 PM.
-
March 14th, 2008, 08:57 PM
#17
Re: MySQL Collation issue
Originally Posted by Doctor Luz
Regular expresions don't work.
I've checked everything, collations, caracter sets...
I've seen pájaro is stored in the DB as: 70 C3 A1 6A 61 72 6F which is 7 bytes long. This is the utf8 code. These 7 bytes, viewed as latin1 appear as pájaro
However actually pájaro in latin1_spanish is 6 bytes long 70 E1 6A 61 72 6F
It looks like utf-8 data in DB is being taken byte by byte as latin1, ignoring character sets and collations. When I search for pájaro this word is translated to pájaro and then I find records because there is a match. However when I search for pajaro there is not any match.
I don't understand.
I will try to change character sets and collations to latin1 to see if I have the same problems.
Why aren't the regular expressions working? And the thing is, if you use regular expressions, you have to do it from the php code so that á gets set as Ãi instead, making it [aÃi].
If you insert the pájaro from php code and it becomes pÃijaro, and then try to search for
"SELECT * FROM aves WHERE nombre RLIKE 'p[áa]j[áa]ro';", it should turn it into "SELECT * FROM aves WHERE nombre RLIKE 'p[áÃi]j[áÃi]ro';"
Alternatively, I guess you could do it yourself with either
Code:
SELECT * FROM aves WHERE nombre RLIKE 'p[áÃi]j[áÃi]ro';
or
Code:
SELECT * FROM aves WHERE nombre RLIKE 'p([á]|[Ã][i])j([á]|[Ã][i])ro';
Sincerely,
Martin Svendsen
-
March 15th, 2008, 10:50 AM
#18
Re: MySQL Collation issue
In my most recent attachment, you will notice that I was playing with the ASCII values because I suspected that. I also saw the Ãi a while back.
When saved as UTF, the ASCII character is 195. However, when using Latin 1 the ASCII character is 225. Why can't you recurse your database changing the ASCII value to match your character encoding?
If the post was helpful...Rate it! Remember to use [code] or [php] tags.
-
March 17th, 2008, 12:40 PM
#19
Re: MySQL Collation issue
Queries for matching exact values are working now.
Now I find another problem, REGEXP is not working as I expected.
I have something like this:
$sql="select * from aves where nombre REGEXP '[[:<:]]pajaro[[:>:]]'";
Is there a way for making this query to find me the row "el pájaro loco"?
I only can do it this way [[:<:]]pájaro[[:>:]]
REGEXP does not use collation?
-
March 17th, 2008, 12:55 PM
#20
Re: MySQL Collation issue
Why don't you use % as the wildcard? Just like *.doc returns all documents, in SQL %.doc would return all documents.
¿O entendÃ* mal?
If the post was helpful...Rate it! Remember to use [code] or [php] tags.
-
March 17th, 2008, 01:22 PM
#21
Re: MySQL Collation issue
with LIKE?
Yes but it is giving me results like "espantapájaros" I would like to obtain only full words...
I would like to work with REGEXP brecuse the work is done, although maybe processing the results to exclude things like "espantapájaros"...
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|