PHP: How to change encoding with simplexml_load_file()?

Toshioo
April 20th, 2011, 03:34 PM
Hi, I'm using simplexml_load_file() on an RSS feed to get the titles. The problem is that when, for example, a ' appears in a title, strange characters appear instead of the '. How do I fix this :S? Thanks.

PeejAvery
April 20th, 2011, 10:43 PM
That means the file you're loading is not UTF-8. Since SimpleXML was created to read/write UTF-8, the file you're reading needs to be in that character encoding. An alternative, which will be slower to process, is to read the file using file_get_contents() (http://ar.php.net/manual/en/function.file-get-contents.php), convert it to UTF-8, and then use simplexml_load_string() (http://www.php.net/manual/en/function.simplexml-load-string.php).

Toshioo
April 21st, 2011, 04:43 AM
I tried that, but there are still strange characters instead of the '. The code is:

    $temp = mb_convert_encoding( file_get_contents("feed_url"), 'UTF-8' );
    $feed = simplexml_load_string($temp);

I also tried utf8_encode() instead of mb_convert_encoding() and the same thing happened. After I get the title I send it by e-mail, and that is where I see it. Could that be the problem?

PeejAvery
April 21st, 2011, 07:57 AM
What is the character encoding of the e-mail?

Toshioo
April 21st, 2011, 09:02 AM
I wrote the e-mail's header like this:

    $headers = "Content-type: text/html; charset=UTF-8\r\n";

That shouldn't be the problem though. I just tried using echo and the strange characters still appear.

PeejAvery
April 21st, 2011, 10:17 AM
Save the file and upload it here.

Toshioo
April 21st, 2011, 01:50 PM
Ok, here it is:

    <html>
    <body>
    <?php
    $max_news = 5;
    $i = 0;

    // Fetch the feed and parse it
    $str = file_get_contents("http://feeds.feedburner.com/PokerNewsDaily?format=xml");
    $temp = mb_convert_encoding($str, "UTF-8");
    $feed = simplexml_load_string($temp);

    foreach ($feed->channel->item as $item) {
        // Stop once $max_news items have been sent
        if ($i >= $max_news) break;
        sleep(10);

        $subject = $item->title;
        $link = $item->link;
        $description = $item->description;
        $body = "<html><body>" . $description . "<br /><br /><i>PokerDailyNews</i>: <a href=\"" . $link . "\">Read the full report</a></body></html>";

        $to = "email@email.com";
        $headers = "Content-type: text/html; charset=UTF-8\r\n";

        if (mail($to, $subject, $body, $headers)) {
            echo "<p>" . $subject . " - sent</p>";
        } else {
            echo "<p>" . $subject . " - NOT sent</p>";
        }
        $i++;
    }
    ?>
    </body>
    </html>

PeejAvery
April 21st, 2011, 08:06 PM
I'm not seeing any invalid characters.

Toshioo
April 22nd, 2011, 04:45 AM
Hmm, then could it be because of the server I use?

PeejAvery
April 23rd, 2011, 08:06 AM
I doubt it. Is your PHP document saved in UTF-8? Are you also outputting a UTF-8 header?

Toshioo
April 23rd, 2011, 03:05 PM
It's good now :D The problem was that the internal encoding wasn't UTF-8. I wrote this at the beginning of the file:

    header('Content-Type:text/html; charset=UTF-8');

And I didn't have to convert the string, because the feeds are already in UTF-8. Thanks a lot for your help :)
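
For anyone finding this thread later, here is a minimal sketch of what the fix seems to come down to. It assumes the feed is already UTF-8, as it turned out to be here, and it uses a hypothetical inline RSS snippet in place of the real feed URL so that it runs without network access. The names $xml, $isUtf8 and $misread and the sample title are illustrative only, not taken from the code posted above. The Windows-1252 conversion is there purely to reproduce the kind of strange characters you get when UTF-8 bytes are decoded as a single-byte charset.

```php
<?php
// Hypothetical stand-in for the feed: a UTF-8 RSS snippet whose title
// contains a typographic apostrophe (U+2019, bytes E2 80 99).
$xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
     . "<rss version=\"2.0\"><channel><item>"
     . "<title>Poker\xE2\x80\x99s big night</title>"
     . "</item></channel></rss>";

$feed = simplexml_load_string($xml);

// Cast to string: SimpleXML hands back element text as UTF-8.
$title = (string) $feed->channel->item->title;

// The parsed title is valid UTF-8, so no conversion is needed.
$isUtf8 = mb_check_encoding($title, 'UTF-8');

// Illustration only: reinterpret the same bytes as Windows-1252.
// This is what a client does when no UTF-8 charset is declared,
// and the apostrophe turns into three unrelated characters.
$misread = mb_convert_encoding($title, 'UTF-8', 'Windows-1252');

// Declare the charset before any output so the browser decodes
// the bytes as UTF-8.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}

echo '<p>' . htmlspecialchars($title, ENT_QUOTES, 'UTF-8') . '</p>';
```

Note that header() has to run before any output, so in a file that starts with plain HTML it needs to move above the opening <html> tag. If a feed really were in another encoding, passing that encoding as the third argument of mb_convert_encoding() would be the place to start. utf8_encode() assumes ISO-8859-1 input, so running it on text that is already UTF-8 encodes it a second time.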
codeguru.com
Copyright Internet.com Inc., All Rights Reserved.