I have a form, and in a textarea I want to display some text that has Spanish characters encoded as HTML entities. The problem is that instead of the Spanish character it displays the HTML code. I'm using htmlentities to display it in the form. My code to display it is:
<?php echo htmlentities($string, ENT_QUOTES, "UTF-8") ?>
Any ideas, or should I just not use htmlentities in a form? Thanks!
EDIT
Let's say $string = '&aacute;'
When I just do <?php echo $string; ?> I get á
If I do <?php echo htmlentities($string, ENT_QUOTES, "UTF-8") ?> I get &aacute;
I'm so confused!
You can try explicitly adding the content type at the top of your file, as below:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
If it's already encoded as HTML, then you need to decode it. You can use html_entity_decode($string);
The string echoed in the form should be á, not the entity &aacute; as returned from the database:
$string = '&aacute;'; // your string as fetched from the database
echo html_entity_decode($string, ENT_QUOTES, "UTF-8"); // this will display á in the textarea
And before saving to the database you need to call
htmlentities($_POST['txtAreaName'], ENT_QUOTES, "UTF-8"); // returns `&aacute;`
If I understand you correctly, you need to use...
<meta charset="utf-8">
in your page header, and then...
<?php echo html_entity_decode($string, ENT_QUOTES); ?>
This will convert your HTML entities back to their proper characters
You might be looking for htmlspecialchars.
echo htmlspecialchars('<á>', ENT_COMPAT | ENT_HTML5, "UTF-8");
outputs &lt;á&gt;.
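The answers above can be combined into one round trip. This is a minimal sketch, assuming the stored value contains HTML entities as in the question; the variable names and the sample value are made up for illustration. It decodes the entities to real UTF-8 characters first, then escapes only the HTML-significant characters when writing the textarea.

```php
<?php
// Hypothetical stored value: entity-encoded text, as described in the question.
$stored = '&aacute; &lt;b&gt;';

// Step 1: turn the entities back into real UTF-8 characters.
$plain = html_entity_decode($stored, ENT_QUOTES, 'UTF-8');

// Step 2: escape only <, >, &, and quotes for output; á stays a real character.
$forTextarea = htmlspecialchars($plain, ENT_QUOTES, 'UTF-8');

echo '<textarea name="txtAreaName">' . $forTextarea . '</textarea>' . "\n";
```

With this order the browser shows á in the textarea, while any markup in the value is displayed as text instead of being interpreted.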
Related
I need help with encoding the "Title" tag. I have names with special characters, like "ě, š, č, ř, ž, ý, á, í, é", and it works in demo.browse.php. Here is my code; I don't know where the problem is. Can you help me, please? :) Thanks
<?php
require_once('getid3.php');
$PageEncoding = 'UTF-8';
$getID3 = new getID3;
$getID3->setOption(array('encoding' => $PageEncoding));
$FullFileName = "test.webm";
$ThisFileInfo = $getID3->analyze($FullFileName);
getid3_lib::CopyTagsToComments($ThisFileInfo);
echo '<html><head>';
echo '<title>getID3() - (sample script)</title>';
echo '<meta http-equiv="Content-Type" content="text/html;charset='.$PageEncoding.'" />';
echo '</head><body>';
echo htmlentities(!empty($ThisFileInfo['comments_html']['title'])?implode('<br>', $ThisFileInfo['comments_html']['title']) : chr(160));
echo '</body></html>';
?>
The result is: "V zajet&#237; d&#233;mon&#367;"
And the original is: "V zajetí démonů"
I tried iconv() and utf8_encode(), but they don't work. Thanks
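You call htmlentities on the result yourself. The values in comments_html appear to be entity-encoded already (the output shows numeric entities such as &#237;), so this call escapes each & a second time and the entities show up as literal text.
Echo the comments_html value without htmlentities, or use htmlspecialchars on the unencoded comments value instead.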
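To see the double encoding in isolation, here is a small sketch that does not use getID3 at all. It assumes a string that already contains numeric entities, like the one in the question, and compares a plain htmlentities call with one that passes false for the fourth parameter (double_encode).

```php
<?php
// Assumed input: text that is already entity-encoded.
$alreadyEncoded = 'V zajet&#237; d&#233;mon&#367;';

// Default behaviour: every & is escaped again, so the entities become visible text.
$twice = htmlentities($alreadyEncoded, ENT_QUOTES, 'UTF-8');

// double_encode = false: existing entities are left alone.
$once = htmlentities($alreadyEncoded, ENT_QUOTES, 'UTF-8', false);

echo $twice, "\n"; // V zajet&amp;#237; d&amp;#233;mon&amp;#367;
echo $once, "\n";  // V zajet&#237; d&#233;mon&#367;
```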
I still don't understand how iconv works.
For instance,
$string = "Löic & René";
$output = iconv("UTF-8", "ISO-8859-1//TRANSLIT", $string);
I get,
Notice: iconv() [function.iconv]: Detected an illegal character in input string in...
With $string = "Löic"; or $string = "René";
I get,
Notice: iconv() [function.iconv]: Detected an incomplete multibyte character in input string in.
I get nothing with $string = "&";
There are two sets of different outputs that I need to store in two different columns of a table in my database:
I need to convert Löic & René to Loic & Rene for clean URL purposes.
I need to keep them as they are, Löic & René as Löic & René, and only convert them with htmlentities($string, ENT_QUOTES); when displaying them on my HTML page.
I tried some of the suggestions from php.net below, but they still don't work:
I had a situation where I needed some characters transliterated, but the others ignored (for weird diacritics like ayn or hamza). Adding //TRANSLIT//IGNORE seemed to do the trick for me. It transliterates everything that is able to be transliterated, but then throws out stuff that can't be.
So:
$string = "ʿABBĀSĀBĀD";
echo iconv('UTF-8', 'ISO-8859-1//TRANSLIT', $string);
// output: [nothing, and you get a notice]
echo iconv('UTF-8', 'ISO-8859-1//IGNORE', $string);
// output: ABBSBD
echo iconv('UTF-8', 'ISO-8859-1//TRANSLIT//IGNORE', $string);
// output: ABBASABAD
// Yay! That's what I wanted!
and another,
Andries Seutens 07-Nov-2009 07:38
When doing transliteration, you have to make sure that your LC_COLLATE is properly set, otherwise the default POSIX will be used.
To transform "rené" into "rene" we could use the following code snippet:
setlocale(LC_CTYPE, 'nl_BE.utf8');
$string = 'rené';
$string = iconv('UTF-8', 'ASCII//TRANSLIT', $string);
echo $string; // outputs rene
How can I actually make them work?
Thanks.
EDIT:
This is the source file I use to test the code:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" class="no-js">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<?php
$string = "Löic & René";
$output = iconv("UTF-8", "ISO-8859-1//TRANSLIT", $string);
?>
</html>
$clean = iconv('UTF-8', 'ASCII//TRANSLIT', utf8_encode($s));
And did you save your source file in UTF-8 encoding? If not (and I guess you didn't since that will produce the "incomplete multibyte character" error), then try that first.
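Putting the two suggestions together, the clean-URL case could look roughly like the sketch below. The function name slugify is hypothetical, the fallback to ISO-8859-1 for non-UTF-8 input is an assumption, and the locale name is an assumption that must match a locale installed on the server. It needs the mbstring and iconv extensions. The exact transliteration that iconv produces depends on the iconv implementation and the active locale, so no fixed output is claimed for accented input.

```php
<?php
// Hypothetical helper, not from the thread.
function slugify(string $text): string
{
    // iconv raises "incomplete multibyte character" when the input is not
    // really UTF-8 (e.g. a source file saved as Latin-1), so check first.
    if (!mb_check_encoding($text, 'UTF-8')) {
        // Assumption: anything that is not valid UTF-8 is ISO-8859-1.
        $text = mb_convert_encoding($text, 'UTF-8', 'ISO-8859-1');
    }

    // Assumption: this locale exists on the system; transliteration uses LC_CTYPE.
    setlocale(LC_CTYPE, 'en_US.UTF-8');

    // Transliterate what can be transliterated and drop the rest.
    $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', $text);
    if ($ascii === false) {
        $ascii = $text; // fall back; non-ASCII bytes are replaced below
    }

    // Keep lowercase letters and digits, collapse everything else into "-".
    $slug = preg_replace('/[^a-z0-9]+/', '-', strtolower($ascii));
    return trim($slug, '-');
}

echo slugify("Löic & René"), "\n";
echo slugify("abc def"), "\n"; // abc-def
```

The original string can be stored unchanged in the second column and escaped only at display time.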
How can I make htmlentities work with Cyrillic symbols?
Now, when I try to input some Cyrillic, "Тест", it returns "ТеÑ"
My code:
$var = htmlentities($var);
Encoding: utf-8.
Thanks!
I had the same problem, try this solution:
<?php echo htmlentities("Текст на русском языке", ENT_QUOTES, 'UTF-8') ?>
In order to bring closure to this question -
I want my users not to enter HTML code in their comments
This is not necessary; htmlspecialchars() will convert all special characters necessary to prevent HTML from being shown.
The default behaviour is ENT_HTML401, which contains only a few entities. Try using ENT_HTML5:
<?php echo htmlentities("Текст на русском языке", ENT_COMPAT | ENT_HTML5, 'UTF-8') ?>
If you want to know which entities are replaced, you can use get_html_translation_table:
<?php print_r(get_html_translation_table(HTML_ENTITIES, ENT_COMPAT | ENT_HTML401)) ?>
<?php print_r(get_html_translation_table(HTML_ENTITIES, ENT_COMPAT | ENT_HTML5)) ?>
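The effect of the charset argument can be checked directly. This is a small sketch, assuming the script file itself is saved as UTF-8; it passes the charset explicitly in both calls so the result does not depend on the PHP version's default.

```php
<?php
$text = "Тест"; // UTF-8 bytes, assuming this file is saved as UTF-8

// Correct charset: the HTML 4.01 entity table has no Cyrillic entries,
// so the string comes back unchanged.
$asUtf8 = htmlentities($text, ENT_QUOTES, 'UTF-8');

// Wrong charset: each byte is treated as a separate Latin-1 character,
// which produces the kind of garbage described in the question.
$asLatin1 = htmlentities($text, ENT_QUOTES, 'ISO-8859-1');

var_dump($asUtf8 === $text);   // bool(true)
var_dump($asLatin1 === $text); // bool(false)
```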
I am reading an RSS feed: http://beersandbeans.com/feed/
The feed says it is in UTF-8 format, and I am using SimplePie to import the content. When I grab the content and store it in $content, I perform the following:
<?php
header ('Content-type: text/html; charset=utf-8');
?>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en"><head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
</head><body>
<?php
echo $content;
echo $enc = mb_detect_encoding($content, "UTF-8,ISO-8859-1", true);
echo $content = mb_convert_encoding($content, "UTF-8", $enc);
echo $enc = mb_detect_encoding($content, "UTF-8,ISO-8859-1", true);
?>
</body></html>
This then produces:
..... Camping: 2,000isk/day for 5 days) = $89 .....
ISO-8859-1
..... Camping: Â Â 2,000isk/day for 5 days) = $89 .....
UTF-8
Why is it outputting the Â?
Try not specifying "UTF-8,ISO-8859-1" and see what encoding it gives you. It might be detecting ISO-8859-1 because it's the last one in that list, rather than the actual encoding of the string.
Set strict-mode to true in mb_detect_encoding(), see http://www.php.net/manual/de/function.mb-detect-encoding.php#102510
Also try http://www.php.net/manual/de/function.mb-convert-encoding.php instead of iconv()
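One possible cause of a stray Â is a UTF-8 non-breaking space (bytes C2 A0) being converted from ISO-8859-1 a second time. The sketch below reproduces that with a made-up snippet; whether the feed actually contains non-breaking spaces is an assumption. It uses mb_check_encoding to skip the conversion when the string is already valid UTF-8, and the "\u{...}" escapes need PHP 7 or later.

```php
<?php
// Hypothetical snippet: valid UTF-8 containing a non-breaking space (U+00A0).
$content = "Camping:\u{00A0}2,000isk/day";

// Treating valid UTF-8 as ISO-8859-1 turns C2 A0 into "Â" followed by U+00A0.
$doubleEncoded = mb_convert_encoding($content, 'UTF-8', 'ISO-8859-1');

// Safer: only convert when the bytes are not valid UTF-8.
$fixed = mb_check_encoding($content, 'UTF-8')
    ? $content
    : mb_convert_encoding($content, 'UTF-8', 'ISO-8859-1');

echo bin2hex("\u{00A0}"), "\n";                       // c2a0
var_dump(strpos($doubleEncoded, "\u{00C2}") !== false); // bool(true): Â appeared
var_dump($fixed === $content);                        // bool(true): left untouched
```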