I've tried converting the text to or from utf8, which didn't seem to help.
I'm getting:
"It’s Getting the Best of Me"
It should be:
"It’s Getting the Best of Me"
I'm getting this data from this url.
To convert to HTML entities:
<?php
echo mb_convert_encoding(
file_get_contents('http://www.tvrage.com/quickinfo.php?show=Surviver&ep=20x02&exact=0'),
"HTML-ENTITIES",
"UTF-8"
);
?>
See docs for mb_convert_encoding for more encoding options.
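A possible alternative, since the "HTML-ENTITIES" pseudo-encoding is deprecated in mbstring as of PHP 8.2: the sketch below uses mb_encode_numericentity() to turn every non-ASCII character into a numeric entity. The helper name to_numeric_entities is made up for illustration, and the input is assumed to be valid UTF-8.

```php
<?php
// Hypothetical helper: encode all non-ASCII code points as numeric entities.
// Assumes $utf8 really is UTF-8.
function to_numeric_entities(string $utf8): string
{
    // Map format: [first code point, last code point, offset, mask]
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity($utf8, $map, 'UTF-8');
}

// "\u{2019}" is the right single quotation mark from the question.
echo to_numeric_entities("It\u{2019}s Getting the Best of Me"), "\n";
```

As the next answers point out, this is only needed when the page itself cannot be served as UTF-8.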
Make sure your HTML header specifies UTF-8:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
That usually does the trick for me (obviously if the content IS utf8).
You don't need to convert to html entities if you set the content-type.
Your content is fine; the problem is with the headers the server is sending:
Connection:Keep-Alive
Content-Length:502
Content-Type:text/html
Date:Thu, 18 Feb 2010 20:45:32 GMT
Keep-Alive:timeout=1, max=25
Server:Apache/2.2.8 (Ubuntu) PHP/5.2.4-2ubuntu5.7 with Suhosin-Patch
X-Powered-By:PHP/5.2.4-2ubuntu5.7
Content-Type should be set to Content-type: text/plain; charset=utf-8, because this page is not HTML and uses the utf-8 encoding. Chromium on Mac guesses ISO-8859-1 and displays the characters you're describing.
If you are not in control of the site, specify the encoding as UTF-8 to whatever function you use to retrieve the content. I'm not familiar enough with PHP to know how exactly.
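One way this could look in PHP, as a rough sketch: read the charset the server declared (if any) and fall back to an assumed one when it is missing, as it is in the headers above. The function names charset_from_headers and body_to_utf8 are made up for this example, and the UTF-8 fallback is an assumption based on this answer's diagnosis.

```php
<?php
// Hypothetical helper: pull the charset out of raw response header lines.
function charset_from_headers(array $headers): ?string
{
    foreach ($headers as $line) {
        if (preg_match('/^Content-Type:.*?charset\s*=\s*"?([\w.:-]+)/i', $line, $m)) {
            return strtoupper($m[1]);
        }
    }
    return null; // the server did not declare a charset
}

// Hypothetical helper: convert a body to UTF-8 using the declared charset,
// or $assumed when nothing was declared.
function body_to_utf8(string $body, array $headers, string $assumed = 'UTF-8'): string
{
    $charset = charset_from_headers($headers) ?? $assumed;
    return mb_convert_encoding($body, 'UTF-8', $charset);
}

// The headers quoted above carry no charset, so the fallback is used.
var_dump(charset_from_headers(['Content-Type:text/html']));
```

After a file_get_contents() call on an HTTP URL, the header lines would typically come from $http_response_header. When re-serving the text, header('Content-Type: text/plain; charset=utf-8') tells the browser what it is getting.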
I know the question was answered, but setting the meta tag didn't help in my case and the selected answer was not clear enough, so I wanted to provide a simpler answer.
To keep it simple, store the string in a variable and process it like this:
$TVrageGiberish = "It’s Getting the Best of Me";
$notGiberish = mb_convert_encoding($TVrageGiberish, "HTML-ENTITIES", 'UTF-8');
echo $notGiberish;
Which should return what you wanted: It’s Getting the Best of Me
If you are parsing something, you can perform the conversion while assigning values to a variable like this, where $TVrage is an object holding all the values (XML in this example, from a feed that has a "Title" tag which may contain special characters such as ‘ or ’).
$cleanedTitle = mb_convert_encoding($TVrage->title, "HTML-ENTITIES", 'UTF-8');
If you're here because you're experiencing issues with junk characters in your WordPress site, try this:
Open wp-config.php
Comment out define('DB_CHARSET', 'utf8') and define('DB_COLLATE', '')
/** MySQL hostname */
define('DB_HOST', 'localhost');
/** Database Charset to use in creating database tables. */
//define('DB_CHARSET', 'utf8');
/** The Database Collate type. Don't change this if in doubt. */
//define('DB_COLLATE', '');
It sounds like you're using standard string functions on a UTF-8 character (’) that doesn't exist in ISO 8859-1. Check that you are using Unicode-compatible PHP settings and functions. See also the multibyte string functions.
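To illustrate the difference, here is a small sketch comparing byte-oriented and multibyte functions on a string containing the right single quotation mark (U+2019). The variable name $title is just for the example.

```php
<?php
// "\u{2019}" is the curly apostrophe; it takes three bytes in UTF-8.
$title = "It\u{2019}s";

// strlen() counts bytes, mb_strlen() counts characters.
echo strlen($title), "\n";             // 6
echo mb_strlen($title, 'UTF-8'), "\n"; // 4

// substr() can cut a character in half; mb_substr() cannot.
$broken = substr($title, 0, 3);           // ends with a stray 0xE2 byte
$intact = mb_substr($title, 0, 3, 'UTF-8');

var_dump(mb_check_encoding($broken, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($intact, 'UTF-8')); // bool(true)
```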
We had success going the other direction using this:
mb_convert_encoding($text, "HTML-ENTITIES", "ISO-8859-1");
Just try this.
If $text contains strange characters, do this:
$mytext = mb_convert_encoding($text, "HTML-ENTITIES", 'UTF-8');
and you are done.
If nothing else seems to work, this could be your best solution.
<?php
$content="It’s Getting the Best of Me";
$content = str_replace("’", "'", $content);
echo $content;
?>
==or==
<?php
$content="It’s Getting the Best of Me";
$content = str_replace("’", "'", $content);
echo $content;
?>
Try this:
html_entity_decode(mb_convert_encoding(stripslashes($text), "HTML-ENTITIES", 'UTF-8'))
For fopen and file_put_contents, this will work:
str_replace("’", "'", htmlspecialchars_decode(mb_convert_encoding($string_to_be_fixed, "HTML-ENTITIES", "UTF-8")));
You should check the original encoding first, then convert to the correct encoding.
In my case, I read CSV files and then import them into a database. Some files displayed well and some did not. I checked the encoding and saw that files encoded as ASCII displayed well, while files encoded as UTF-8 were broken. So I used the following code to convert the encoding:
if(mb_detect_encoding($content) == 'UTF-8') {
$content = iconv("UTF-8", "ASCII//TRANSLIT", $content);
file_put_contents($file_path, $content);
} else {
$content = mb_convert_encoding($content, 'UTF-8', 'UTF-8');
file_put_contents($file_path, $content);
}
After converting, I write the content back to the file and then run the import into the database. Now it displays well on the front end.
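A different approach from the one above, sketched here as an alternative rather than a fix: instead of transliterating to ASCII, keep the text and normalize everything to UTF-8. mb_check_encoding() tests whether the bytes are valid UTF-8; anything else is assumed to be in a fallback legacy encoding. The function name normalize_to_utf8 and the Windows-1252 fallback are assumptions for illustration.

```php
<?php
// Hypothetical helper: return $content as UTF-8.
// Valid UTF-8 (which includes plain ASCII) passes through unchanged;
// anything else is assumed to be in $fallback and converted.
function normalize_to_utf8(string $content, string $fallback = 'Windows-1252'): string
{
    if (mb_check_encoding($content, 'UTF-8')) {
        return $content;
    }
    return mb_convert_encoding($content, 'UTF-8', $fallback);
}

// "caf\xE9" is "café" as a single-byte legacy string.
echo bin2hex(normalize_to_utf8("caf\xE9")), "\n";
```

This only helps if the database connection and the page are also UTF-8; otherwise the characters are mangled again further down the line.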
If none of the above solutions work:
In my case I noticed that the single quote was a different style of single quote. Instead of ' my data had a ’. Notice the difference in the single quote? So I simply wrote a str_replace to replace it and it fixed the problem. Probably not the most elegant solution but it got the job done.
$string = str_replace("’", "'", $string);
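If several typographic quotes need the same treatment, the idea extends naturally to a lookup table. A minimal sketch, assuming the input is UTF-8; the function name ascii_quotes is made up for this example.

```php
<?php
// Hypothetical helper: map common curly quotes to their ASCII counterparts.
function ascii_quotes(string $text): string
{
    return strtr($text, [
        "\u{2018}" => "'", // left single quotation mark
        "\u{2019}" => "'", // right single quotation mark
        "\u{201C}" => '"', // left double quotation mark
        "\u{201D}" => '"', // right double quotation mark
    ]);
}

echo ascii_quotes("It\u{2019}s Getting the Best of Me"), "\n";
```

Note that this changes the text rather than fixing the encoding, so it is a workaround.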
I looked at the link, and it looks like UTF-8 to me. i.e., in Firefox, if you pick View, Character Encoding, UTF-8, it will appear correctly.
So, you just need to figure out how to get your PHP code to process that as UTF-8. Good luck!
use this
<meta http-equiv="Content-Type" content="text/html; charset=utf8_unicode_ci" />
instead of this
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
If nothing works try this mb_convert_encoding($elem->textContent, 'UTF-8', 'utf8mb4');
I've got a problem with some special characters in PHP. I have a table in MySQL (utf8_hungarian_ci) that contains some text with special characters like á, á, Ó, Ö, ö, ü, and I would like to show this text on my page. I've tested:
$text = htmlentities($text); //to convert the simple spec chars
$search = array("&otilde;","&Otilde;","&ucirc;","&Ucirc;");
$replace = array("&#337;","&#336;","&#369;","&#368;");
$text = str_replace($search, $replace, $text);
echo $text;
But this code works only if $text isn't set from the database. If I use this code and my $text is selected from the database, it doesn't show me any text, and if I only use:
echo $text; without htmlentities and replacements
I get characters like this one: �
I know there were some questions about this and I have tried accepted answers, but it still doesn't work, so please help me if you want and if you have time. Thank you anyway. A good day to you all!
Also try setting your HTTP header to use UTF-8 encoding.
In your PHP file, add
header('Content-type: text/html; charset=utf-8');
as well as specifying UTF-8 in your <meta> tag, to make sure the browser is told the encoding, and see if it fixes the issue.
The meta tag looks like this:
<head>
<meta http-equiv="Content-type" content="text/html; charset=utf-8" />
...
</head>
Edit:
If you have access to Apache configuration, see if AddDefaultCharset is set to another encoding.
Try using mysql_set_charset() (mysqli_set_charset() if you're using MySQLi).
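As a rough sketch of that suggestion with MySQLi: set the connection charset right after connecting, before running any query. The function name open_utf8_connection and its parameters are placeholders, and 'utf8mb4' is an assumption; 'utf8' would match the table's declared charset.

```php
<?php
// Hypothetical helper: open a MySQLi connection that talks UTF-8.
// Host, user, password and database name are supplied by the caller.
function open_utf8_connection(string $host, string $user, string $pass, string $db): mysqli
{
    $link = mysqli_connect($host, $user, $pass, $db);
    if ($link === false) {
        throw new RuntimeException('Connection failed: ' . mysqli_connect_error());
    }
    // Without this, the server may send rows in its default connection
    // charset (often latin1), whatever the table collation is.
    if (!mysqli_set_charset($link, 'utf8mb4')) {
        throw new RuntimeException('Could not set charset: ' . mysqli_error($link));
    }
    return $link;
}
```

With the connection, the HTTP header and the meta tag all agreeing on UTF-8, the text should not need htmlentities() just to display correctly.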
Try to put this in your HTML header:
<meta http-equiv="Content-type" content="text/html; charset=utf-8" />
(Also, you may need to save your file in "utf-8" file encoding.)
Secondly, you could use this to try to translate or remove the disturbing character that always prints out in your case:
$str_out = @iconv("ISO-8859-1", "UTF-8//TRANSLIT//IGNORE", $str_in);
This is a slightly generic answer but please read up this article I wrote on common character-encoding pitfalls in the PHP/MySQL stack and if you still have problems let's try to work through them.
http://webmonkeyuk.wordpress.com/2011/04/23/how-to-avoid-character-encoding-problems-in-php/
Chrome converts this: aöüß to %C3%A4%C3%B6%C3%BC%C3%9F
But Firefox converts it to this strange thing: a%F6%FC%DF
I can't seem to find a way to convert the Firefox thing back to the original in PHP.
Urldecode and rawurldecode unfortunately don't work. Does anyone know how to deal with that? Thanks.
As Tei already guessed: Chrome is using UTF-8 (as probably recommended) for URL parameters while Firefox uses Latin-1. I don't think that you can control this behavior. Also this will be difficult to handle, because you pretty much need to guess the encoding that was used.
This is how the decoding works (browser-dependent, assuming that you're using UTF-8 in your application):
Chrome:
$text = urldecode($_GET['text']);
Firefox:
$text = utf8_encode(urldecode($_GET['text']));
This may be a solution that works in most cases:
function urldecode_utf8($text) {
$decoded = urldecode($text);
if (!mb_check_encoding($decoded, 'UTF-8')) {
$decoded = utf8_encode($decoded);
}
return $decoded;
}
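Since utf8_encode() is deprecated as of PHP 8.2, the same idea can be sketched with mb_convert_encoding(). The function name urldecode_to_utf8 is made up here, and, like the version above, it assumes that anything which is not valid UTF-8 is Latin-1.

```php
<?php
// Hypothetical variant of the function above without utf8_encode().
function urldecode_to_utf8(string $text): string
{
    $decoded = urldecode($text);
    if (!mb_check_encoding($decoded, 'UTF-8')) {
        // Assumption: non-UTF-8 input is ISO-8859-1 (Latin-1).
        $decoded = mb_convert_encoding($decoded, 'UTF-8', 'ISO-8859-1');
    }
    return $decoded;
}

// Both browser variants end up as UTF-8 text.
echo urldecode_to_utf8('%C3%B6%C3%BC%C3%9F'), "\n";
echo urldecode_to_utf8('a%F6%FC%DF'), "\n";
```

Keep in mind that PHP has usually URL-decoded $_GET values already, so the raw query string may be the better input for a function like this.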
Force your page to use UTF-8. These codes are probably differently encoded umlauts: one is something like Latin-1, and the other is perhaps UTF-8.
The best way to force UTF-8 is a meta tag in the HTML.
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
$test = "selvf&oslash;lgelig";
$test = html_entity_decode($test);
I want this to output "selvfølgelig", but it turns out as selvf�lgelig
How can I make this output what I want?
header("Content-Type: text/html; charset=UTF-8");
$string = "selvf&oslash;lgelig";
echo html_entity_decode($string, ENT_QUOTES, 'UTF-8');
I think you don't actually need to use this function.
EDIT: you can use the
< meta >
tag that Qualcuno provided instead of throwing a header.
It's an encoding issue, not related to html_entity_decode.
Make sure your page is in UTF-8: put this tag in your <head></head> section of the page.
<meta http-equiv="Content-Type" content="text/html;charset=UTF-8" />
Set, then, the third parameter of html_entity_decode(), charset, to 'UTF-8' (the default value).
PS: Please note that setting the encoding of the page to UTF-8 will solve many issues, but may introduce others, since UTF-8 can use multibyte characters. There may be security issues too, if you fail to validate input data correctly.
html_entity_decode($test, ENT_COMPAT, 'UTF-8');
There are other charsets too:
html_entity_decode($test, ENT_COMPAT, 'ISO-8859-15');
You need to define with which charset you're working with.
You will have to use the third parameter of html_entity_decode to specify a character set.
$test = html_entity_decode($test, ENT_COMPAT, 'utf-8');
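To make the effect of the third parameter visible, here is a small sketch that decodes the same entity into two different encodings and compares the resulting byte lengths. The variable names are only for illustration.

```php
<?php
$entity = 'selvf&oslash;lgelig';

// Decoded as UTF-8, the "ø" becomes two bytes (0xC3 0xB8).
$asUtf8 = html_entity_decode($entity, ENT_QUOTES, 'UTF-8');

// Decoded as ISO-8859-1, the "ø" becomes a single byte (0xF8).
$asLatin1 = html_entity_decode($entity, ENT_QUOTES, 'ISO-8859-1');

echo strlen($asUtf8), "\n";   // 13
echo strlen($asLatin1), "\n"; // 12
```

Whichever charset is chosen has to match the one the page declares; a single 0xF8 byte on a page declared as UTF-8 is exactly what shows up as the replacement character.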
I'm having a headache with the damn charset.
Portuguese charset=iso-8859-1
On my HTML I have:
<meta http-equiv="Content-type" content="text/html; charset=iso-8859-1" />
On my config.php:
$config['charset'] = 'ISO-8859-1';
I have the word ‘café’, coffee.
It is being displayed as: cafŽ.
Any ideas?!
Thanks in advance for any help
**Edit
I don't know if it matters but I'm using Eclipse
What's the encoding of the file in Eclipse set to? Right-click on the file in Eclipse and check under "Properties". It must be the same as in your meta tag.
Thanks so much, I believe your answer is the best one:
$string = 'café';
$string = utf8_decode($string);
OR
$string = 'café';
$string = utf8_encode($string);
With the meta charset in the header of each file, the issue with Portuguese characters will be solved.
Why don't you switch to UTF-8?
Edit: You might also want to switch to using entities.
&eacute; would be é
http://www.w3schools.com/tags/ref_entities.asp
I would look at the default charset in the browser first; it could be set to ISO-8859-15 or UTF-8. I have had the reverse problem, where my browser encoding was set to ISO-8859-1 instead of UTF-8.
Secondly, is this data static or coming from a database? If it is from MySQL, for example, check the collation of the database: is it latin1 or utf8?
If it comes from a UTF-8 collated database (or not, as you're using PHP) you can try
$string = 'café';
$string = utf8_decode($string);
OR
$string = 'café';
$string = utf8_encode($string);
Moving to UTF-8 may be a good idea because of functions like PHP's utf8_encode() and utf8_decode(), but if it's not appropriate to your market then that is that.
If the utf8_encode or utf8_decode functions work, you should look at your input method and input encoding, as you will likely find a problem there.
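Because utf8_encode() and utf8_decode() are deprecated as of PHP 8.2, here is a sketch of the same two conversions with mb_convert_encoding(), which names both encodings explicitly. The variable names are only for illustration, and the byte strings are written with escapes so the example does not depend on how the source file is saved.

```php
<?php
// "café" as ISO-8859-1: the "é" is the single byte 0xE9.
$latin1 = "caf\xE9";

// Latin-1 to UTF-8 (what utf8_encode() did).
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

// UTF-8 back to Latin-1 (what utf8_decode() did).
$roundTrip = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');

echo bin2hex($latin1), "\n"; // 636166e9
echo bin2hex($utf8), "\n";   // 636166c3a9
```

Which direction is needed depends on what the data really is and what the page declares, so it is worth checking the bytes with bin2hex() before converting.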
P.S. I have the same problems from time to time being in Brazil... I feel your pain mate!
Try this one here:
$string = 'café';
echo htmlentities($string, ENT_COMPAT, 'utf-8');
Take care!
Go to the resource bundle in the Project Explorer, right-click on that file, change the character set to UTF-8, and then save the settings.