I use PHP Simple DOM to grab a URL. When I print the URL's content to the screen, I get:
you’ll
instead of:
you'll
If I run
$str = utf8_decode('you’ll');
echo $str;
I get:
you?ll
I'm obviously not understanding the fundamentals of encoding. Can someone please tell me what I'm missing?
Try setting the encoding to UTF-8 before doing anything else.
Start your php file with this:
<?php
header('Content-Type: text/html; charset=UTF-8');
mb_internal_encoding('UTF-8');
?>
and try to echo/print it without utf8_decode.
Note:
If you're using MySQL through the old mysql_* extension (removed in PHP 7), use this too:
<?php
mysql_query("SET CHARACTER SET UTF8");
mysql_query("SET NAMES UTF8");
?>
Edit: also, make sure you save your PHP file in UTF-8 (without BOM) format.
You need to declare the document that you are outputting to be UTF-8 (assuming it actually is UTF-8) so the browser knows what to expect. You could convert the encoding, but if all you are doing is displaying it in the browser it would be better to leave the content as it is.
Add this line to your PHP before you output anything:
header('Content-Type: text/html; charset=utf-8');
...and add this meta tag as the first child element of your <head>:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
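To see where the ’ comes from, here is a minimal sketch. It assumes the fetched page is UTF-8 and the browser falls back to Windows-1252 because no charset was declared; misread_as_cp1252 is a hypothetical helper name, not part of any library.

```php
<?php
// Hypothetical helper: show how a browser that assumes Windows-1252
// would render bytes that are really UTF-8.
function misread_as_cp1252(string $utf8Bytes): string
{
    // Treat each input byte as one Windows-1252 character, then express
    // those characters in UTF-8 so they can be printed.
    return mb_convert_encoding($utf8Bytes, 'UTF-8', 'Windows-1252');
}

$original = "you\u{2019}ll";               // U+2019 is the curly apostrophe

echo bin2hex("\u{2019}"), "\n";            // e28099: three bytes in UTF-8
echo misread_as_cp1252($original), "\n";   // you’ll
```

The three bytes e2 80 99 are read as three separate characters, which is why declaring the charset fixes the display without converting anything.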
Related
I am trying to scrape a webpage in Arabic. Everything works fine, except that when I echo the text I get garbled output, even though I have set the header to UTF-8.
Here is my code
<?php
header('Content-Type: text/html; charset=UTF-8');
require 'vendor/autoload.php';
use Goutte\Client;
$client = new Client();
$crawler = $client->request('GET', 'http://www.lebanonfiles.com');
$news_container = $crawler->filter('#mcs4_container .line');
$news_container->each(function ($node) {
    echo $node->text();
});
?>
What I get is this piece of garbled text
Try putting this line at the beginning of your PHP file: ini_set('default_charset', 'UTF-8'); This may solve your issue.
Have a nice day.
ALL attributes must be set to UTF-8, on all levels of your application/script.
Save the document as UTF-8 or as UTF-8 w/o BOM (If you're using Notepad++, it's Format -> Convert to UTF-8)
Note that even though they are both UTF-8, they can behave differently!
The header in both PHP and HTML should be set to UTF-8
HTML: <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
PHP: header('Content-Type: text/html; charset=utf-8');
You may need to specify your charset in your php.ini file, using default_charset = "utf-8", although this has been the default since PHP 5.6
Everything that can be set to a specific charset, should be set to the same.
There might be different aspects of your code that need to be set to a specific charset.
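The PHP-side settings from the list above can be gathered in one place. This is only a sketch: use_utf8_everywhere is a made-up name, and it does not cover the file encoding, the HTML meta tag, or the database connection.

```php
<?php
// Hypothetical helper that applies the PHP-level UTF-8 settings in one call.
function use_utf8_everywhere(): void
{
    // Charset PHP puts in its default Content-Type header.
    ini_set('default_charset', 'UTF-8');

    // Encoding the mb_* string functions assume when none is passed.
    mb_internal_encoding('UTF-8');

    // Headers can only be changed before any output has been sent.
    if (!headers_sent()) {
        header('Content-Type: text/html; charset=UTF-8');
    }
}

use_utf8_everywhere();
echo ini_get('default_charset'), "\n";   // UTF-8
```

Call it at the very top of the script, before anything is echoed, so the header can still be sent.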
I have these Chinese characters:
汉字/漢字''test
If I do
echo utf8_encode($chinesevar);
it displays
??/??''test
Or even if I just do a simple
echo $chinesevar
it still displays some weird characters...
So how am I going to display these Chinese characters without using the <meta> tag with the UTF-8 thingy .. or the ini_set UTF-8 thing or even the header() thing with UTF-8?
Simple:
save your source code in UTF-8
output an HTTP header to specify to your browser that it should interpret the page using UTF-8:
header('Content-Type: text/html; charset=utf-8');
Done.
utf8_encode is for converting Latin-1 encoded strings to UTF-8. You don't need it.
For more details, see Handling Unicode Front To Back In A Web App.
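A small sketch of why utf8_encode hurts on text that is already UTF-8. utf8_encode is deprecated as of PHP 8.2, so this uses the equivalent mb_convert_encoding call; latin1_to_utf8 is a hypothetical name, and the source file is assumed to be saved as UTF-8.

```php
<?php
// Same conversion utf8_encode() performs: every input byte is taken to be
// one ISO-8859-1 character and re-encoded as UTF-8.
function latin1_to_utf8(string $bytes): string
{
    return mb_convert_encoding($bytes, 'UTF-8', 'ISO-8859-1');
}

$alreadyUtf8   = "\u{6C49}\u{5B57}";             // 汉字, two characters
$doubleEncoded = latin1_to_utf8($alreadyUtf8);   // each byte becomes a character

echo strlen($alreadyUtf8), "\n";                 // 6 bytes
echo strlen($doubleEncoded), "\n";               // 12 bytes
echo mb_strlen($doubleEncoded, 'UTF-8'), "\n";   // 6 characters, none of them Chinese
```

The string is mangled rather than fixed, so the right move is to leave it alone and declare UTF-8 in the header.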
Check that your file is saved as UTF-8 without BOM and that your web server delivers your site in UTF-8
HTML:
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
in PHP:
header('Content-Type: text/html; charset=utf-8');
And if you read the text from a database, check that the database is in UTF-8 too.
$chinesevarOK = mb_convert_encoding($chinesevar, 'HTML-ENTITIES', 'UTF-8');
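The 'HTML-ENTITIES' pseudo-encoding used above is deprecated as of PHP 8.2. A possible replacement is mb_encode_numericentity, sketched below. The function name to_numeric_entities is made up, and the input is assumed to be valid UTF-8.

```php
<?php
// Hypothetical helper: turn every non-ASCII character into a decimal
// HTML entity, so the page displays correctly whatever charset is declared.
function to_numeric_entities(string $utf8): string
{
    // Map format: [first code point, last code point, offset, mask].
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity($utf8, $map, 'UTF-8');
}

echo to_numeric_entities("\u{6C49}\u{5B57}/test"), "\n";   // &#27721;&#23383;/test
```

The output is pure ASCII, which is why it survives a missing or wrong charset declaration, at the cost of a larger and less readable page source.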
Perhaps take a look at the following solutions:
Your database, table and field COLLATE should be utf8_unicode_ci
Check if your records are showing the correct characters within the database...
Set your html to utf8
Add the following line to your php after connecting to the database
mysqli_set_charset($con,"utf8");
http://www.w3schools.com/php/func_mysqli_set_charset.asp
save your source code in UTF-8 No BOM
How do I interpret some characters into their proper form in PHP?
For example, the string is \u00c9rwin but PHP prints it as Érwin, when the correct form is Érwin
What is the proper PHP code for this? I am pretty sure this is not an HTML entity, or is it?
P.S. no encoding was declared on the PHP file
Look into utf8_encode and utf8_decode.
It's also important to use UTF-8 across the whole stack. That means your database connection should use UTF-8 (here's how in MySQL), your HTTP Content-Type header should declare UTF-8 (see mgraph's example below), and the meta tag should declare it too. With the same charset everywhere, there is no need to encode or decode at all.
Add this in the header:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
or:
<?php header ('Content-type: text/html; charset=utf-8'); ?>
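If the string literally contains the six characters \u00c9 (a JSON-style escape, which is an assumption about where the data comes from), json_decode can turn it into real UTF-8. A minimal sketch, with decode_json_escapes as a hypothetical helper name:

```php
<?php
// Hypothetical helper: decode \uXXXX escapes by treating the text as the
// body of a JSON string. Assumes the input has no unescaped double quotes
// or raw control characters.
function decode_json_escapes(string $escaped): string
{
    $decoded = json_decode('"' . $escaped . '"');
    if (!is_string($decoded)) {
        throw new InvalidArgumentException(
            'Not a valid JSON string body: ' . json_last_error_msg()
        );
    }
    return $decoded;
}

// Single quotes keep the backslash, so this really is \u00c9rwin.
$name = decode_json_escapes('\u00c9rwin');

echo bin2hex($name), "\n";   // c3897277696e, i.e. É as UTF-8 followed by "rwin"
```

The result is UTF-8, so it still needs the charset header or meta tag above to be displayed as É rather than É.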
I am trying to GET values from a URL and have run into a problem in IE; in all other browsers it works great.
This is my issue:
If the text is UTF-8, for example:
$x=$_GET['txt'];
echo $x;
I get
???????
only in IE
I still have the same problem, and this is all my code:
<?php
header('Content-Type: text/html; charset=utf-8');
$x=$_GET['id'];
echo $x;
?>
Try with this word as the id:
سسسسسسس
You can put this meta tag inside the <head> if it's a charset issue (as an alternative to using header inside PHP):
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
UPDATE
If you're not encoding the value of x from the URL you should do something like:
Link
Using your sample text (سسسسسسس), once that's encoded using urlencode it should look like this:
%D8%B3%D8%B3%D8%B3%D8%B3%D8%B3%D8%B3%D8%B3
I got it working by adding a charset meta tag and doing a simple urldecode:
echo urldecode($_GET['x']);
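A sketch of the round trip, assuming the parameter is called x as in this answer. build_link_query is a hypothetical helper, and parse_str stands in for what PHP does when it fills $_GET, which it already decodes for you.

```php
<?php
// Hypothetical helper: build the query part of a link with the value
// percent-encoded byte by byte.
function build_link_query(string $value): string
{
    return 'x=' . urlencode($value);
}

$text  = str_repeat("\u{0633}", 7);   // the Arabic sample from the question
$query = build_link_query($text);

echo $query, "\n";                    // x= followed by %D8%B3 seven times

// parse_str() decodes a query string much as PHP does for $_GET.
parse_str($query, $params);
var_dump($params['x'] === $text);     // bool(true)
```

Because the encoded link contains only ASCII, it no longer depends on how the browser chooses to send non-ASCII URLs.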
Try setting your page so the browser will recognize its encoding correctly. Mostly sending a proper header is enough:
header('Content-Type: text/html; charset=utf-8');
This is for UTF8 but you can send any encoding you want.
As you already stated yourself, this could depend on the browser's character encoding settings. Try the UTF-8 functions in PHP, such as
http://www.php.net/manual/de/function.utf8-decode.php
and
http://www.php.net/manual/de/function.utf8-encode.php
(:
Also look here:
Handling unicode values in GET parameters with PHP
Try wrapping your $x in urlencode and urldecode.
When I try to execute this code to print out an Arabic string: print("إضافة"); I get this output: إضاÙØ©. If I utf8_decode() it I'll get ?????. I have "AddLanguage ar" in my Apache configuration but it doesn't help. How do I print out this Arabic string?
Also set your page charset to UTF-8, e.g.:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
and then see if it worked. If that still doesn't work, check this out; it is a complete solution for the Arabic language using PHP:
http://www.ar-php.org/en_index_php_arabic.html
You may want to check out this too:
http://www.phpclasses.org/browse/package/2875.html
It might be necessary to indicate to the browser which charset you are using -- I'm guessing it's UTF-8.
In order to achieve that, you might try putting this portion of code at the beginning of your script, before any output is generated:
header('Content-type: text/html; charset=UTF-8');
utf8_decode will try to convert your string from UTF-8 to Latin-1, which cannot represent Arabic characters, hence the '?' characters.
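The loss can be reproduced with a short sketch. It uses mb_convert_encoding, which performs the same UTF-8 to Latin-1 conversion as utf8_decode, and assumes mbstring's default substitute character ('?'). utf8_to_latin1 is a hypothetical name.

```php
<?php
// Hypothetical helper: the UTF-8 to ISO-8859-1 conversion that
// utf8_decode() performs.
function utf8_to_latin1(string $utf8): string
{
    return mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');
}

// The five Arabic letters from the question, written as code points.
$arabic = "\u{0625}\u{0636}\u{0627}\u{0641}\u{0629}";

echo utf8_to_latin1($arabic), "\n";                     // ?????
var_dump(utf8_to_latin1("caf\u{00E9}") === "caf\xE9");  // bool(true): é exists in Latin-1
```

Each Arabic letter has no Latin-1 equivalent, so it is replaced and the original text cannot be recovered afterwards.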
You may want to set
default_charset = "utf-8"
in your php.ini. The default_charset directive instructs the server to produce the correct Content-Type header.
You can also do it at runtime:
ini_set('default_charset', 'utf-8');
You may also want to check whether your browser font has Arabic support. Stick to common fonts like Arial Unicode and Times New Roman.
Well,
First: add this at the beginning of the HTML page
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
Second:
If you are using AJAX, encode the data using encodeURIComponent
Third:
First line of your PHP page should be
header('Content-Type: text/html; charset=utf-8');
and decode the data sent to PHP using urldecode
Regards,