
Thread: Charsets

  1. #1
    Join Date
    Jan 2006
    Posts
    352

    Charsets

    I must admit these are confusing me a lot!

    When I load a page which sends:
    Code:
    Content-Type: text/html; charset=windows-1250
    I get this text in Firefox 3, both in the rendered page and in the page source (Ctrl+U):
    Code:
    zagrebačka županija
    But when I retrieve the page with PHP's cURL and write it to a file, I get:
    Code:
    zagrebaèka županija
    How come?

    And I am so confused that I don't even know which charset my .php files are in.
    Ipsens

  2. #2
    Join Date
    May 2002
    Posts
    10,943

    Re: Charsets

    Did you create the files on Windows or on a Mac? On Windows, very few applications let you choose which character set to use, so the files are probably ISO-8859-1 if they were created there. If they were created on a Mac, they are most likely UTF-8.
    If the post was helpful...Rate it! Remember to use [code] or [php] tags.

  3. #3
    Join Date
    Jan 2006
    Posts
    352

    Re: Charsets

    Yes, I can see that now.

    PHP 5 has no native support for Unicode (UTF-8).
    PHP 5 defaults to ISO-8859-1 (Latin-1), as does MySQL.

    PHP 6, however, is planned to have native Unicode (UTF-8) support by default.

    But it still bugs me that the page loaded in the browser and the same page retrieved to a file with cURL look different.
    They are different.
    Ipsens

  4. #4
    Join Date
    May 2002
    Posts
    10,943

    Re: Charsets

    Quote Originally Posted by Ipsens
    But it still bugs me that the page loaded in the browser and the same page retrieved to a file with cURL look different.
    They are different.
    That could be because of the charset of the PHP file retrieving the data.
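
    One way to narrow this down, as a minimal sketch: as far as I know cURL writes the server's bytes unchanged, so the difference may simply come from which charset the file is viewed in. Byte 0xE8 is "č" in windows-1250 but "è" in windows-1252, while 0x9E is "ž" in both, which matches the output in post #1. The sample bytes below are typed by hand for illustration, and the sketch assumes the iconv extension is available and knows windows-1250.

```php
<?php
// Hand-typed sample (an assumption, not captured from the real page):
// "zagrebačka županija" encoded as windows-1250 bytes.
$raw = "zagreba\xE8ka \x9Eupanija";

// Decoded with the charset the server declares, the text is correct.
$as1250 = iconv('WINDOWS-1250', 'UTF-8', $raw);

// Decoded as windows-1252 (a common editor default), 0xE8 turns into "è".
$as1252 = iconv('WINDOWS-1252', 'UTF-8', $raw);

echo $as1250, "\n"; // zagrebačka županija
echo $as1252, "\n"; // zagrebaèka županija
```

    If the file on disk shows the second line, the bytes are probably fine and only the viewer's charset setting is off.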

  5. #5
    Join Date
    Jan 2006
    Posts
    352

    Re: Charsets

    OK, just to report back.
    I've edited my object, which generates valid XHTML, so that one line before the HTML output it converts the markup from ISO-8859-1 (Latin-1) to UTF-8 for web browsers.
    PHP Code:
    // Convert $this->xhtml from PHP's default ISO-8859-1 (Latin-1) to UTF-8
    $this->xhtml = utf8_encode($this->xhtml);

    echo $this->xhtml;
    In the HTML content:
    HTML Code:
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title>Test Document</title>
    </head>.....
    And I edited php.ini to:
    Code:
    default_mimetype = "text/html"
    default_charset = "utf-8"
    PHP runs as CGI.
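
    A note on the conversion step, as a hedged sketch: utf8_encode() only converts from ISO-8859-1, so markup that actually holds windows-1250 bytes (like the page in post #1) would come out with the wrong letters. It is also deprecated as of PHP 8.2. An alternative is iconv() with the source charset named explicitly. The helper name to_utf8 and the sample string are made up for illustration, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper: convert markup from a known source charset to UTF-8.
function to_utf8(string $markup, string $sourceCharset): string
{
    $converted = iconv($sourceCharset, 'UTF-8', $markup);
    if ($converted === false) {
        // iconv() returns false when the input cannot be converted.
        throw new RuntimeException("Cannot convert from $sourceCharset");
    }
    return $converted;
}

// Made-up sample: "županija, čaša" as windows-1250 bytes.
$xhtml = "<p>\x9Eupanija, \xE8a\x9Aa</p>";

$page = to_utf8($xhtml, 'WINDOWS-1250');
echo $page, "\n"; // <p>županija, čaša</p>
```

    The point of the design is that the source charset is stated by the caller instead of being assumed to be Latin-1.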


    Is this recommended?
    Ipsens

  6. #6
    Join Date
    May 2002
    Posts
    10,943

    Re: Charsets

    All looks fine and kosher to me. Just remember that a file can be saved in a different character set than its headers or meta tags claim.

    For example, you can have a PHP file whose headers claim ISO-8859-1 when the file was actually saved to disk as UTF-8. I have fought with that before.
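
    A rough way to spot that kind of mismatch, sketched under assumptions: check whether the bytes are well-formed UTF-8. This relies on PCRE's /u modifier, which refuses subjects that are not valid UTF-8. The function name is_valid_utf8 is hypothetical, and the two sample strings are hand-typed.

```php
<?php
// Hypothetical check: does a byte string hold well-formed UTF-8?
function is_valid_utf8(string $bytes): bool
{
    // With /u, preg_match() returns false for subjects that are not
    // valid UTF-8, and 1 here when the empty pattern matches.
    return preg_match('//u', $bytes) === 1;
}

$savedAsUtf8 = "\xC4\x8D"; // "č" encoded as UTF-8
$savedAs1250 = "\xE8";     // "č" encoded as windows-1250

var_dump(is_valid_utf8($savedAsUtf8)); // bool(true)
var_dump(is_valid_utf8($savedAs1250)); // bool(false)
```

    To test a script itself, the same function could be fed file_get_contents() of that file. A pure ASCII file passes too, so a "true" only says the bytes are compatible with UTF-8.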

  7. #7
    Join Date
    Jan 2006
    Posts
    352

    Re: Charsets

    Oh, those charsets are driving me nuts, really.

    And what is worse, you can't rely on PHP's internal functions to detect the charset of a given string.

    It really is like working in chaos when it comes to non-default charsets.
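
    Since detection is unreliable, one workaround is to trust the charset the server declares rather than guess. A minimal sketch, assuming the Content-Type value is already at hand (with cURL it can be read via curl_getinfo() and CURLINFO_CONTENT_TYPE). The helper name charset_from_content_type is made up, and the nullable return type assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper: pull the charset parameter out of a Content-Type value.
function charset_from_content_type(string $contentType): ?string
{
    // Accepts an optional opening quote and is case-insensitive.
    if (preg_match('/charset\s*=\s*"?([A-Za-z0-9._:-]+)/i', $contentType, $m)) {
        return strtolower($m[1]);
    }
    return null; // no charset declared
}

echo charset_from_content_type('text/html; charset=windows-1250'), "\n";
// windows-1250
```

    The result can then be passed to iconv() as the source charset. When it is null, there is still no safe guess.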
    Last edited by Ipsens; August 25th, 2008 at 04:01 PM.
    Ipsens
