Yet another character encoding issue

The language is PHP. The editor is PhpStorm. The editor encoding is UTF-8, and so is the file encoding. mb_detect_encoding() also reports UTF-8, but PHP does not recognize č, ć, ž, đ and others. Does anyone know what the problem is?
I know that this is yet another character encoding question and that the solution is never clear in this case, but thank you for any answer.
EDIT
I use my own PHP framework and the index.php file is encoded as ANSI, not UTF-8, but the rest of the files are UTF-8. If I try to change it from ANSI to UTF-8, I get a content encoding error.

I solved the problem with these lines of code:
ini_set('mbstring.internal_encoding','UTF-8');
ini_set( 'default_charset', 'UTF-8' );
ini_set('mbstring.func_overload',7);
header('Content-Type: text/html; charset=UTF-8');
Actually, setting 'default_charset' alone is enough, but since I spent the last 4 hours trying to solve this, the hell with it. Let it all fire.

Related

local server ruins character encoding

I need to learn all this encoding stuff properly, because this is the third time I have wasted my time on silly encoding problems. Here is the problem:
I have a simple PHP file.
The file is saved as UTF-8.
If I run it on my local server, it turns ı => Ä± and ö => Ãœ
If I rename the extension to .html it works perfectly, so the problem is definitely the local server.
To correct this issue, I have done the following:
I've read this, this and this
Double checked the file encoding, it's UTF-8
Added the meta tag <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
Added the header inside php tag: header("Content-type:text/html; charset: UTF-8");
Added the internal encoding inside php tag mb_internal_encoding('UTF-8');
Corrected the line in php.ini file default_charset = UTF-8
Added the following in httpd.conf file: AddDefaultCharset utf-8
Everything is hard-coded, so I don't use a database; it's not related to MySQL encoding
I'm using WAMP, and the machine runs Windows 7 (English). I'm completely exhausted, so I really need help.
Thanks.

Check whether the document encoding is UTF-8 without BOM.
If that does not work, try utf8_encode() and utf8_decode().
EDIT
$text = "A strange string to pass, maybe with some ø, æ, å characters.";
foreach(mb_list_encodings() as $chr){
echo mb_convert_encoding($text, 'UTF-8', $chr)." : ".$chr."<br>";
}

In the end I found the problem. I wrote
header("Content-type:text/html; charset: UTF-8");
After Content-Type we need a colon (:), but after charset we need an equals sign (=), which was very frustrating for me. So,
header('Content-Type: text/html; charset=utf-8');
solves the problem.
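
One way to make the two separators harder to mix up is to build the header value in a single place. The following is only a minimal sketch: contentTypeHeader() is a hypothetical helper, not a PHP built-in, and its default arguments are assumptions.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: puts ":" after the header name and "=" after
// the charset parameter, so the two cannot be confused at call sites.
function contentTypeHeader(string $mimeType = 'text/html', string $charset = 'utf-8'): string
{
    return sprintf('Content-Type: %s; charset=%s', $mimeType, $charset);
}

// header() must be called before any output is sent.
if (!headers_sent()) {
    header(contentTypeHeader());
}
```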

get_meta_tags and persian phrases

I used this function,
$code = get_meta_tags('http://www.narenji.ir/');
and I've seen this
'Ù…Ú©Ø§Ù†ÛŒ Ø¨Ø±Ø§ÛŒ Ø¢Ø´Ù†Ø§ÛŒÛŒ Ø¨Ø§ Ø§Ø¨Ø²Ø§Ø±Ù‡Ø§ Ùˆ Ø§Ø®Ø¨Ø§Ø± Ø¯Ø§Øº Ø¯Ù†ÛŒØ§ÛŒ ÙÙ†Ø§ÙˆØ±ÛŒ'
How can I fix this issue?
Can I fix it without using JSON?

You must be missing some link here, your code just works:
Example
The key point is that you preserve the UTF-8 encoding so that Persian is supported. Otherwise you would need some other encoding (one that I do not yet know) that supports Persian and a library that is able to re-encode that.
Which encoding do you want to use for Persian output?

If you are executing your script from a browser, make sure you are sending UTF-8 as your content encoding. Add a Content-Type header before echoing anything.
header('Content-Type:text/html; charset=utf-8');

utf8_decode() is built specifically for converting from UTF-8 to ISO-8859-1 (Latin-1). Persian characters are not in Latin-1, so why would it be necessary here?
working example: http://codepad.viper-7.com/tEjZAz
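
To illustrate why a UTF-8 to ISO-8859-1 conversion cannot help with Persian text, here is a small sketch. It uses mb_convert_encoding() as a stand-in for utf8_decode() (which is deprecated as of PHP 8.2) and assumes the mbstring extension is loaded. The sample word is an arbitrary choice, not taken from the page in the question.

```php
<?php
declare(strict_types=1);

// The Persian word "salam", written with escapes so the example
// does not depend on how this file itself is saved.
$persian = "\u{0633}\u{0644}\u{0627}\u{0645}";

// ISO-8859-1 has no Persian letters, so this conversion loses the text.
$latin1 = mb_convert_encoding($persian, 'ISO-8859-1', 'UTF-8');
$roundTrip = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

var_dump($roundTrip === $persian);              // false: the original cannot be recovered
var_dump(mb_check_encoding($persian, 'UTF-8')); // true: the untouched string is valid UTF-8
```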

Character encoding in PHP

I never had this problem before; it was usually my database or the HTML page. But now I think it's my PHP. I import text from a CSV file or from a textarea, and it goes wrong either way.
For example, é changes to Ã©. I used htmlentities() to fix this but it didn't work. The function didn't return the entity for é but the entities for Ã©, so the real characters are already lost before htmlentities() comes into play. So does that mean my PHP file has the wrong encoding or something?
I hope someone can help me out..
Thanks!
Chris

A file is usually ISO-8859-1 (Latin) or UTF-8 ... ISO-8859-1 is 1 byte per char, UTF-8 is 1-4 bytes per char. So if you get 2 chars when you expect one, then you are reading UTF-8 and showing it as ISO-8859-1 ... if you get strange chars, then you are reading ISO-8859-1 and showing it as UTF-8.
If you provide more details, it would be easier to pinpoint, but in short, you have inconsistent charsets and need to convert one or the other so they're all the same. From what it seems, you're using ISO-8859-1 in your project but reading some UTF-8 from somewhere. Use utf8_decode($text) if that data really does arrive as UTF-8, or find the data and convert it manually.
EDIT: If you are using AJAX somewhere, then you will ALWAYS get UTF-8 from it, and you'll have to decode it yourself with utf8_decode() if you want to keep using ISO-8859-1.
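
The answer above converts toward ISO-8859-1; the same "make everything one charset" idea also works toward UTF-8. A minimal sketch, assuming the mbstring extension is loaded and, as a deliberate simplification, that any input which is not valid UTF-8 is ISO-8859-1. toUtf8() is a hypothetical helper name.

```php
<?php
declare(strict_types=1);

// Hypothetical helper. Assumption: bytes that are not valid UTF-8
// are treated as ISO-8859-1 and converted.
function toUtf8(string $bytes): string
{
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return $bytes; // already UTF-8, leave untouched
    }
    return mb_convert_encoding($bytes, 'UTF-8', 'ISO-8859-1');
}

$latin1 = "caf\xE9";     // "café" in ISO-8859-1 (1 byte for é)
$utf8   = "caf\xC3\xA9"; // "café" in UTF-8 (2 bytes for é)

var_dump(toUtf8($latin1) === $utf8); // true
var_dump(toUtf8($utf8) === $utf8);   // true
```

Running every imported string (CSV field, form value) through one such function at the boundary keeps the rest of the code on a single encoding.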

Try opening your PHP file and changing its encoding to UTF-8.
If that doesn't help, add this to your PHP:
header('Content-Type: text/html; charset=utf-8');
Or this to your html:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

Take a look at PHP's iconv().

Encoding issue with Apache , displaying diamond characters in browser

Please help me set up an Apache server on CentOS. It looks like an encoding issue, but I have not been able to resolve it yet.
Instead of rendering the HTML, Chrome and Firefox display the HTML source, with a � character after each "<" symbol. IE 9 works fine.
http://pdf.gen.in/index1.htm
The second problem is with PHP. http://pdf.gen.in/index.php displays the PHP source code with similar diamond characters wherever a "<" character occurs. The PHP issue seems to be related to the first one.

Those files are encoded with UTF-16LE. For the static HTML page, you might be able to get it to work by setting the charset correctly in the MIME type (it's currently text/html; charset=UTF-8). I don't know how strong PHP's Unicode support is. Try using UTF-8 instead; it's generally better supported due to its partial overlap with ASCII.
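
If re-saving the files in an editor is not an option, the conversion can also be scripted. This is a rough sketch, assuming the mbstring extension is loaded and that the files start with the UTF-16LE byte order mark (FF FE), which the answer above does not confirm. utf16leToUtf8() is a hypothetical helper name.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: if the bytes start with the UTF-16LE byte order
// mark (FF FE), drop it and re-encode the rest as UTF-8.
function utf16leToUtf8(string $bytes): string
{
    if (strncmp($bytes, "\xFF\xFE", 2) === 0) {
        return mb_convert_encoding(substr($bytes, 2), 'UTF-8', 'UTF-16LE');
    }
    return $bytes; // no UTF-16LE BOM found: returned unchanged
}

$utf16 = "\xFF\xFE<\x00p\x00>\x00"; // "<p>" in UTF-16LE with a BOM
var_dump(utf16leToUtf8($utf16) === '<p>'); // true
```

The zero byte that follows each ASCII character in UTF-16LE is consistent with the stray symbol the browsers show after every "<".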

You should use a decent text editor, and always set the encoding of PHP/HTML files to "UTF-8 without BOM".
Create a file named "test.php", paste the code below and save it with "UTF-8 without BOM" encoding; then it will work just fine.
<?php
phpinfo();
?>

utf-8 decoding problem in php

I got a .vcf file with parts encoded as UTF-8:
CATEGORIES;CHARSET=UTF-8:StraÃŸe & â€“dienste
Now "â€“" should be a "-" and "StraÃŸe" should convert to "Straße".
I tried
utf8_decode()
iconv()
mb_convert_encoding()
And have been playing with several output encoding options like
header('content-type: text/html; charset=utf-8');
mb_internal_encoding('UTF-8');
mb_http_output( "UTF-8" );
But I don't get the wanted results - instead: "StraÃ?e & â??dienste"
Anyone getting that knot out of my brain? Thanks a lot.

Solved.
I had to convert the PHP file back to ISO-8859-1 (instead of UTF-8).
I thought that would make no difference, but it does!

You may actually want to try utf8_encode(). I had a similar problem when retrieving UTF-8 encoded information from MySQL and displaying it on a UTF-8 HTML page.

forgot to mention: there is no MySQL...
plain php ;-)
echo "StraÃŸe & â€“dienste";
echo utf8_decode("StraÃŸe & â€“dienste");
should somehow become "Straße & -dienste"... but won't, won't, won't

I don't have an answer for you, as I'm not sure I understand fully what you are trying to do (read a .vcf file in PHP?)....
But a clue is this: "StraÃŸe" is "Straße" encoded in UTF-8, but then interpreted as Latin1 (or Windows-1252).
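
That clue can be checked in code. A small sketch, assuming the mbstring extension is loaded; the strings are written with escapes so the result does not depend on how the PHP file itself is saved. Windows-1252 is used rather than ISO-8859-1 because the "Ÿ" in the garbled text exists only in Windows-1252, which may also be why utf8_decode() produced a "?" at that position.

```php
<?php
declare(strict_types=1);

$original = "Stra\u{00DF}e"; // "Straße" as UTF-8

// Reading the UTF-8 bytes as Windows-1252 produces the garbled form.
$garbled = mb_convert_encoding($original, 'UTF-8', 'Windows-1252');
var_dump($garbled === "Stra\u{00C3}\u{0178}e"); // true: "StraÃŸe"

// Reversing that step recovers the original bytes.
$repaired = mb_convert_encoding($garbled, 'Windows-1252', 'UTF-8');
var_dump($repaired === $original); // true
```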
