Trying to display Vietnamese characters with php - php

When I try to display Vietnamese characters with the following code:
<?php
$str = "Nghệ thuật cắm hoa vải";
//echo utf8_encode(html_entity_decode(($str)));
echo html_entity_decode($str);
//echo $str;
?>
I get Ngh�? thu�?t c??m hoa va?i as a result.
I tried several options but couldn't make it work. Any ideas?

Is the PHP script encoded in UTF-8? If it is, send a header indicating so:
header("Content-type: text/html; charset=utf-8");
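Alternatively, convert the non-ASCII characters to HTML entities (note that the "HTML-ENTITIES" target encoding is deprecated as of PHP 8.2):
echo mb_convert_encoding($str, "HTML-ENTITIES", "UTF-8");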
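A minimal sketch of both suggestions together, assuming the script file itself is saved as UTF-8. It uses mb_encode_numericentity() rather than the "HTML-ENTITIES" conversion, and the conversion map is an assumption chosen to cover every non-ASCII code point:

```php
<?php
// Assumption: this file is saved as UTF-8 without a byte order mark.
if (!headers_sent()) {
    // Tell the browser how to decode the bytes we send.
    header('Content-Type: text/html; charset=utf-8');
}

$str = "Nghệ thuật cắm hoa vải";

// Conversion map (an assumption): start, end, offset, mask.
// It selects every code point from U+0080 up to U+10FFFF.
$convmap = [0x80, 0x10FFFF, 0, 0x1FFFFF];
$entities = mb_encode_numericentity($str, $convmap, 'UTF-8');

echo $str, "\n";      // raw UTF-8, relies on the charset header
echo $entities, "\n"; // pure ASCII, safe under any charset
```

With the header in place the first echo should already display correctly; the entity form is only needed when the page charset cannot be controlled.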

Works fine for me: http://codepad.org/uTmORRmz
Does your browser support Unicode?

Related

problems displaying html characters

I'm building a sign-in/up form and I have problems displaying HTML characters. When a user signs up, I use this function for the sign-up data and then insert it into the database.
function clearInput( $string) {
$string = stripslashes($string);
return htmlentities($string);
}
When a user signs up with the name <p>hello</p>, it looks like this in the db: "&lt;p&gt;hello&lt;/p&gt;".
If the user signs in and I var_dump the name that is saved in the session, it looks like this in the browser: '<p>hello</p>'.
If I echo this <p>hello</p> manually in the document, it displays <p>hello</p>, as it normally should.
Does someone know how it shows <p>hello</p> when I var_dump the session name?
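I don't fully understand your question, but this may help.
$str = "This is some &lt;b&gt;bold&lt;/b&gt; text.";
echo htmlspecialchars_decode($str); // outputs: This is some <b>bold</b> text.
$str = '&lt;a href=&quot;https://www.test.com&quot;&gt;test.com&lt;/a&gt;';
echo html_entity_decode($str); // outputs: <a href="https://www.test.com">test.com</a>
The browser renders the second result as a link labelled test.com.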
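To make the round trip concrete, here is a small sketch of what htmlentities() stores and what html_entity_decode() gives back. The variable names are hypothetical, and escaping on output instead of on input is a common alternative, not something the question's code does:

```php
<?php
// Hypothetical user input from the sign-up form.
$name = '<p>hello</p>';

// What clearInput() effectively stores (stripslashes omitted here).
$stored = htmlentities($name, ENT_QUOTES, 'UTF-8');

// Decoding restores the original markup.
$decoded = html_entity_decode($stored, ENT_QUOTES, 'UTF-8');

// Alternative: keep $name raw in the database and escape only when printing.
$safeForHtml = htmlspecialchars($name, ENT_QUOTES, 'UTF-8');

echo $stored, "\n";
echo $decoded, "\n";
```

A browser shows the escaped form as the literal text <p>hello</p>, while the decoded form is interpreted as a real paragraph element.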

php file_get_contents() converts HTML entities like &ouml; to ö

I try to display the contents of a php file in a textarea with
<textarea ...>
<?php echo file_get_contents("file.php"); ?>
</textarea>
but it converts HTML entities like &ouml; to ö. As it is PHP code, I cannot use a conversion like htmlspecialchars because it would break the <>, quotes, etc.
EDIT:
I checked the source code in the browser and there really is an ö.
Please help, thank you!
htmlspecialchars will change:
< &ouml;
to:
&lt; &amp;ouml;
and that's fine.
The textarea will display:
< &ouml;
After you submit the form, the text sent to the server will be
< &ouml;
not:
&lt; &amp;ouml;
Tested in Chrome with jQuery:
$('#layout').html('<form method="POST"><textarea name="x">&lt;&amp;ouml;</textarea><input type="submit" /></form>')
I think there is some non-UTF-8 character in your file.php. Try this:
$content = file_get_contents('/tmp/file.php');
$content = mb_convert_encoding($content, 'UTF-8',mb_detect_encoding($content, 'UTF-8, ISO-8859-1', true));
print_r($content);
You can refer to this comment: http://php.net/manual/en/function.file-get-contents.php#85008.
<textarea ...>
<?php echo htmlentities(file_get_contents("file.php")); ?>
</textarea>
Example
Convert some characters to HTML entities:
<?php
$str = '<a href="https://www.w3schools.com">Go to w3schools.com</a>';
echo htmlentities($str);
?>
The HTML output of the code above will be (View Source):
&lt;a href=&quot;https://www.w3schools.com&quot;&gt;Go to w3schools.com&lt;/a&gt;
The browser output of the code above will be:
<a href="https://www.w3schools.com">Go to w3schools.com</a>
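The textarea case can be sketched as follows. The $source string is a hypothetical stand-in for the contents of file.php, and the textarea name is made up; the point is that escaping once keeps &ouml; intact after the browser decodes the textarea content:

```php
<?php
// Hypothetical stand-in for file_get_contents("file.php").
$source = '<?php echo "&ouml;";';

// Escape once for HTML; the browser undoes exactly this one level.
$escaped = htmlspecialchars($source, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

echo '<textarea name="code">' . $escaped . '</textarea>', "\n";

// What the browser shows in the textarea and later submits to the server.
$submitted = htmlspecialchars_decode($escaped, ENT_QUOTES);
```

Here $submitted equals the original source, so angle brackets and quotes survive the edit cycle unchanged.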

PHP: Get encoded html entities

I'm trying to get the html entities of a UTF-8 string,
Example: example.com/search?q=مرحبا
<?php
echo htmlentities($_GET['q']);
?>
I got:
مرحبا0مرحبا
That is UTF-8 text, not HTML entities.
What I need is:
&#1605;&#1585;&#1581;&#1576;&#1575;
I have tried the urldecode and htmlentities functions!
Add this code to the start of your file:
header('Content-Type: text/html; charset=utf-8');
The browser needs to know the page is UTF-8. This tag can also go in the head section for good measure:
<meta http-equiv="Content-type" content="text/html; charset=utf-8" />
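I think you can solve it by taking each character in the string and getting its code point value.
Combining Mark Baker's answer and vartec's answer, you get the following (_uniord() is not a PHP built-in, so it has to be taken from those answers):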
<?php
$chrArray = preg_split('//u',$_GET['q'], -1, PREG_SPLIT_NO_EMPTY);
$htmlEntities = "";
foreach ($chrArray as $chr) {
$htmlEntities .= '&#'._uniord($chr).';';
}
echo $htmlEntities;
?>
I have not tested it.
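A self-contained variant of the same idea, sketched with the built-in mb_ord() (PHP 7.2+) in place of the _uniord() helper. The function name toNumericEntities is hypothetical, and unlike the snippet above it leaves plain ASCII characters unencoded:

```php
<?php
// Hypothetical helper: encode every non-ASCII character as a decimal entity.
function toNumericEntities(string $text): string
{
    // Split into UTF-8 characters; preg_split returns false on invalid UTF-8.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        return '';
    }

    $out = '';
    foreach ($chars as $chr) {
        $code = mb_ord($chr, 'UTF-8');
        // Keep ASCII as-is, encode everything else.
        $out .= $code < 0x80 ? $chr : '&#' . $code . ';';
    }
    return $out;
}

// The search term from the question, written with code point escapes.
$q = "\u{0645}\u{0631}\u{062D}\u{0628}\u{0627}";
echo toNumericEntities($q), "\n"; // &#1605;&#1585;&#1581;&#1576;&#1575;
```

In a real page the input would come from $_GET['q'], which PHP has already URL-decoded.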

trying to get "é" character to print out correctly

I am trying to read the RSS/XML feed from iTunes, and I have noticed that artists and songs with special characters such as é, as in Beyoncé, are showing as Beyoncé.
I have tried the following to get it to show correctly, without success. I have searched Google and this site for the correct answer, but nothing has worked.
Here is what I have tried; I may be way off.
echo html_entity_decode($entry->imartist, ENT_COMPAT, 'UTF-8');
Here is the full code:
function itunes(){
    $itunes_feed = "https://itunes.apple.com/au/rss/topsongs/limit=100/explicit=true/xml";
    $itunes_feed = file_get_contents($itunes_feed);
    // Strip the colon from namespaced tags, e.g. <im:artist> becomes <imartist>
    $itunes_feed = preg_replace("/(<\/?)(\w+):([^>]*>)/", "$1$2$3", $itunes_feed);
    $itunes_xml = new SimpleXMLElement($itunes_feed);
    $itunes_entry = $itunes_xml->entry;
    foreach($itunes_entry as $entry){
        echo html_entity_decode($entry->title."<br>", ENT_COMPAT, 'UTF-8');
        echo html_entity_decode($entry->imartist, ENT_COMPAT, 'UTF-8');
        echo "<br><br>";
        // Get the value of the entry ID, by using the 'im' namespace within the <id> attribute
        $entry_id['im'] = $entry->id->attributes('im', TRUE);
        echo (string)$entry_id['im']['artist'];
        //echo $entry_id['artist']."<br>";
    }
}
That feed is valid UTF-8, so you shouldn't need to decode it with html_entity_decode. What happens if you add <meta charset="utf-8" /> in the <head> of the HTML page?
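A rough sketch of reading a namespaced element without html_entity_decode, assuming the page is served as UTF-8. The XML below is a made-up two-element sample, not the real feed, and reading the im prefix through children() is an alternative to the regex in the question:

```php
<?php
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

// Made-up sample standing in for the downloaded feed.
$xml = <<<XML
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:im="http://itunes.apple.com/rss">
  <entry>
    <title>Halo - Beyoncé</title>
    <im:artist>Beyoncé</im:artist>
  </entry>
</feed>
XML;

$feed = new SimpleXMLElement($xml);
$artists = [];
foreach ($feed->entry as $entry) {
    // children('im', true) selects child elements by namespace prefix.
    $artists[] = (string) $entry->children('im', true)->artist;
}

foreach ($artists as $artist) {
    // SimpleXML already returns UTF-8; only escape for HTML output.
    echo htmlspecialchars($artist, ENT_QUOTES, 'UTF-8'), "<br>\n";
}
```

If the name still appears garbled with this approach, the likely cause is the page's declared charset and not the feed data.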

Character encoding issues - UTF-8 / Issue while transmitting data on the internet?

I've got data being sent from the client side like this:
// $booktitle = "Comí habitación bailé"
$xml_obj = new DOMDocument('1.0', 'utf-8');
// node created with booktitle and added to xml_obj
// NO htmlentities / other transformations done
$returnHeader = drupal_http_request($url, $headers = array("Content-Type: text/xml; charset=utf-8"), $method = 'POST', $data = $xml_data, $retry = 3);
When I receive it at my end (via that drupal_http_request) and I do htmlentities on it, I get the following:
Comí habitación bailé
Which when displayed looks like gibberish:
Comí Habitación Bailé
What is going wrong?
Edit 1)
<?php
$title = "Comí habitación bailé";
echo "title=$title\n";
echo 'encoding is '.mb_detect_encoding($title);
$heutf8 = htmlentities($title, ENT_COMPAT, "UTF-8");
echo "heutf8=$heutf8\n";
?>
Running this test script on a Windows machine and redirecting to a file shows:
title=Comí habitación bailé
encoding is UTF-8heutf8=
Running this on a linux system:
title=Comí habitación bailé
encoding is UTF-8PHP Warning: htmlentities(): Invalid multibyte sequence in argument in /home/testaccount/public_html/test2.php on line 5
heutf8=
I don't think you should encode the entities with htmlentities just to output the text correctly (as stated in the comments, you should use htmlspecialchars to avoid cross-site scripting). Just set the correct header and meta tag, and echo the values normally:
<?php
header ('Content-type: text/html; charset=utf-8');
?>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
</body>
</html>
htmlentities interprets its input as ISO-8859-1 by default (before PHP 5.4; later versions default to UTF-8). Are you passing UTF-8 for the charset parameter?
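The empty heutf8= output in the test script is what htmlentities() returns when the bytes are not valid UTF-8. A small sketch of that behaviour, assuming the received title is actually ISO-8859-1 (an assumption; the question does not confirm the real encoding):

```php
<?php
// The title as genuine UTF-8, written with code point escapes.
$utf8 = "Com\u{ED} habitaci\u{F3}n bail\u{E9}";

// Simulate the same text arriving as ISO-8859-1 bytes (assumed scenario).
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');

// Valid UTF-8 input gives named entities.
$ok = htmlentities($utf8, ENT_COMPAT, 'UTF-8');

// Invalid UTF-8 input gives an empty string unless ENT_SUBSTITUTE or ENT_IGNORE is set.
$bad = htmlentities($latin1, ENT_COMPAT, 'UTF-8');

// Check first, convert if needed, then encode.
$input = mb_check_encoding($latin1, 'UTF-8')
    ? $latin1
    : mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
$fixed = htmlentities($input, ENT_COMPAT, 'UTF-8');

echo $ok, "\n";
echo $fixed, "\n";
```

mb_check_encoding() is a more reliable test here than mb_detect_encoding(), which can report UTF-8 for input that is not strictly valid.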
Try passing the header information as a key/value array.
Something like:
$headers = array("Content-Type" => "text/xml; charset=utf-8");
