HTMLENTITIES doesn't work with cyrillics - php

How can I make htmlentities work with Cyrillic symbols?
Now, when I try to input some Cyrillic text such as "Тест", it returns "ТеÑ"
My code:
$var = htmlentities($var);
Encoding: utf-8.
Thanks!

I had the same problem, try this solution:
<?php echo htmlentities("Текст на русском языке", ENT_QUOTES, 'UTF-8') ?>
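A minimal sketch of why the explicit third argument helps, assuming the garbling comes from an older PHP: before PHP 5.4 htmlentities() defaulted to ISO-8859-1, so each byte of a UTF-8 character was escaped separately. The variable names are illustrative only.

```php
<?php
// Assumption: the source file and the page are both UTF-8.
$text = "Тест";

// Passing the encoding explicitly avoids depending on the PHP version
// or on the default_charset ini setting.
$escaped = htmlentities($text, ENT_QUOTES, 'UTF-8');

// The HTML 4.01 table has no named entities for Cyrillic letters,
// so the text passes through unchanged.
echo $escaped, "\n"; // Тест
```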

To bring closure to this question:
"I want my users not to enter HTML code in their comments"
htmlentities() is not necessary for that; htmlspecialchars() converts all the special characters needed to prevent the HTML from being rendered.
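As a rough illustration of that point, the sketch below escapes a hypothetical user comment (the input string is made up for the example). Only the characters that are special in HTML are converted; the Cyrillic text is left alone.

```php
<?php
// Hypothetical comment submitted by a user (illustrative input only).
$comment = '<b>Привет</b> & "пока"';

// Escapes <, >, & and both quote styles; everything else is untouched.
$safe = htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');

echo $safe, "\n";
// &lt;b&gt;Привет&lt;/b&gt; &amp; &quot;пока&quot;
```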

The default document type is ENT_HTML401, whose translation table holds only the HTML 4.01 named entities, none of which are Cyrillic. Try using ENT_HTML5:
<?php echo htmlentities("Текст на русском языке", ENT_COMPAT | ENT_HTML5, 'UTF-8') ?>
If you want to know which entities are replaced, you can use get_html_translation_table:
<?php print_r(get_html_translation_table(HTML_ENTITIES, ENT_COMPAT | ENT_HTML401)) ?>
<?php print_r(get_html_translation_table(HTML_ENTITIES, ENT_COMPAT | ENT_HTML5)) ?>
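Printing both tables in full produces a lot of output, so a shorter comparison may be easier to read. This sketch only counts the entries and checks whether one Cyrillic letter appears as a key; the exact counts depend on the PHP build, so no numbers are claimed here.

```php
<?php
$html401 = get_html_translation_table(HTML_ENTITIES, ENT_COMPAT | ENT_HTML401, 'UTF-8');
$html5   = get_html_translation_table(HTML_ENTITIES, ENT_COMPAT | ENT_HTML5, 'UTF-8');

// The HTML5 table is considerably larger than the HTML 4.01 one.
printf("HTML 4.01 entries: %d\n", count($html401));
printf("HTML5 entries:     %d\n", count($html5));

// Is the Cyrillic capital letter "Т" covered by each table?
var_dump(isset($html401['Т']));
var_dump(isset($html5['Т']));
```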

Related

Using PHP to decode a string

I'm trying to decode a string using PHP but it doesn't seem to be returning the correct result.
I've tried using html_entity_decode as well as utf8_decode(urldecode())
Current code:
$str = "joh&#39;#test.com";
$decodeStr = html_entity_decode($str, ENT_COMPAT, "UTF-8");
Expected return is john#test.com
I suppose your html entity code for character 'n' is wrong.
Working example:
$str = "joh&#110;#test.com";
echo $decodeStr = html_entity_decode($str, ENT_COMPAT, "UTF-8");
The HTML entity code for n is &#110;, whereas the entity code in your string, &#39;, is for a single apostrophe. If you want to convert single quotes, the ENT_QUOTES flag must be used when calling html_entity_decode(), because the default before PHP 8.1 was ENT_COMPAT | ENT_HTML401 (from the PHP docs), which doesn't convert single quotes. If you need additional flags, you can combine them using the pipe | symbol like this: ENT_HTML401 | ENT_QUOTES.
If you're expecting john#test.com:
$str = "joh&#110;#test.com";
$decodeStr = html_entity_decode($str, ENT_COMPAT, "UTF-8");
echo $decodeStr; // john#test.com
Or if you're expecting joh'#test.com:
$str = "joh&#39;#test.com";
$decodeStr = html_entity_decode($str, ENT_QUOTES, "UTF-8");
echo $decodeStr; // joh'#test.com
Shouldn't the entity for # be &#35; instead of &#39;, which is for an apostrophe?
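To make the effect of the quote flags concrete, here is a small sketch using a made-up input string that mixes the three numeric entities discussed in this thread (&#110; for n, &#39; for the apostrophe, &#35; for #). The flags are passed explicitly so the result does not depend on the PHP version's defaults.

```php
<?php
// Illustrative input, not the asker's exact string.
$encoded = "joh&#110;&#39;s &#35;1";

// ENT_COMPAT decodes everything except single-quote entities.
$compat = html_entity_decode($encoded, ENT_COMPAT, 'UTF-8');

// ENT_QUOTES also decodes the single-quote entity.
$quotes = html_entity_decode($encoded, ENT_QUOTES, 'UTF-8');

echo $compat, "\n"; // john&#39;s #1
echo $quotes, "\n"; // john's #1
```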

Convert French characters to HTML special characters

I'm using Zend Framework with MongoDB. I need to convert French characters to HTML entities.
For example: Prénom -> Pr&eacute;nom. What could I do?
htmlentities ( http://php.net/htmlentities ) can do this if you call:
htmlentities('Prénom', ENT_COMPAT, 'UTF-8');
I get:
Pr&eacute;nom
as the result
Maybe you can take a look at the strtr function (read more at http://php.net/strtr)?
I think the right place to look is either mb_convert_encoding or htmlentities (note that the 'HTML-ENTITIES' target of mb_convert_encoding is deprecated as of PHP 8.2).
Here is an example:
$text = "Prénom";
echo mb_convert_encoding($text, 'HTML-ENTITIES', 'UTF-8');
echo "\n";
echo htmlentities($text, ENT_COMPAT | ENT_HTML401, 'UTF-8');

Show spanish characters in a form

I have a form, and in a textarea I want to display some text that contains Spanish characters encoded as HTML entities. The problem is that instead of the Spanish character, it displays the HTML code. I'm using htmlentities to display it in the form. My code to display it is:
<?php echo htmlentities($string, ENT_QUOTES, "UTF-8") ?>
Any idea, or should I just not use htmlentities in a form? Thanks!
EDIT
Let's say $string = '&aacute;'
When I just do <?php echo $string; ?> I get á
If I do <?php echo htmlentities($string, ENT_QUOTES, "UTF-8") ?> I get &aacute;
I'm so confused!
You can try explicitly adding content type at the top of your file as below
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
If it's already encoded as HTML, then you need to decode it now. You can use html_entity_decode($string);
The string echoed in the form should be á, not the &aacute; stored in the database:
$string = '&aacute;'; // your string as fetched from the database
echo html_entity_decode($string); // this will display á in the textarea
and before saving to the database you need to call
htmlentities($_POST['txtAreaName'], ENT_QUOTES, "UTF-8"); // returns `&aacute;`
If I understand you correctly, you need to use...
<meta charset="utf-8">
in your page header, and then...
<?php echo html_entity_decode($string, ENT_QUOTES); ?>
This will convert your HTML entities back to their proper characters
You might be looking for htmlspecialchars.
echo htmlspecialchars('<á>', ENT_COMPAT | ENT_HTML5, "UTF-8");
outputs &lt;á&gt;.
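The confusion in this thread looks like double encoding: the stored value is already an entity, and htmlentities() escapes its ampersand again. The sketch below walks through the variants, assuming the database holds the entity form; all variable names are hypothetical.

```php
<?php
// Assumption: the database stores the already-encoded form.
$stored = '&aacute;';

// Encoding again escapes the ampersand, so the browser shows "&aacute;".
$double = htmlentities($stored, ENT_QUOTES, 'UTF-8');

// The fourth argument (double_encode = false) leaves existing entities alone.
$kept = htmlentities($stored, ENT_QUOTES, 'UTF-8', false);

// Decode to the real character, then escape only HTML-special characters
// before printing inside the textarea.
$decoded     = html_entity_decode($stored, ENT_QUOTES, 'UTF-8');
$forTextarea = htmlspecialchars($decoded, ENT_QUOTES, 'UTF-8');

echo $double, "\n";      // &amp;aacute;
echo $kept, "\n";        // &aacute;
echo $forTextarea, "\n"; // á
```

Storing the raw UTF-8 text and escaping only at output time would avoid the problem altogether, but that is a design choice beyond what the question states.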

How do I convert arabic letters in htmlentities symbols?

I need to convert Arabic letters into HTML entities. Codepage: ISO-8859-1.
سك is an example of the Arabic text.
htmlentities("سك")
returns:
س�
How can I get the HTML entities &#1587;&#1603; from these symbols?
htmlentities() only converts characters that have named entities; arbitrary characters have to be converted into numeric entities by other means.
You're probably not targeting the correct charset. Try: htmlentities('سك', ENT_QUOTES, 'UTF-8');
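One way to get numeric entities, sketched here under the assumption that the mbstring extension is available and the input is UTF-8, is mb_encode_numericentity(). The conversion map below (code points from 0x80 upward) is a common choice, not something taken from the answers above.

```php
<?php
// Assumption: the string is UTF-8 encoded and mbstring is loaded.
$text = "سك";

// With the HTML 4.01 table there are no named entities for Arabic letters,
// so htmlentities() leaves the text unchanged.
$named = htmlentities($text, ENT_QUOTES, 'UTF-8');

// Convert every non-ASCII code point to a decimal numeric entity.
// Map format: [first code point, last code point, offset, mask].
$convmap = [0x80, 0x10FFFF, 0, 0x1FFFFF];
$numeric = mb_encode_numericentity($text, $convmap, 'UTF-8');

echo $named, "\n";   // سك
echo $numeric, "\n"; // &#1587;&#1603;
```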
I'm using a function to make sure no HTML code or quotation marks are posted by the user:
function cleartext($x1) {
    // Strip double and single quotes outright
    $x1 = str_replace('"', '', $x1);
    $x1 = str_replace("'", '', $x1);
    // Escape whatever HTML-special characters remain
    $x1 = htmlentities($x1, ENT_QUOTES, 'UTF-8');
    return $x1;
}
So thanks for (ENT_QUOTES, 'UTF-8'); it helped me find what I was looking for.

PHP - convert a string with - or + signs to HTML

How do I convert a string that has a - or + sign to a html friendly string?
I mean converting those characters to HTML notation, the way a space becomes &nbsp; and so on...
ps: htmlentities doesn't work. I still see the -/+
Try this
$string = str_replace('+', '&#43;', $string); // Convert + sign
$string = str_replace('-', '&#45;', $string); // Convert - sign
I don't think there are entities for these symbols; see: http://www.w3schools.com/tags/ref_entities.asp
I tested with
$str = "- and +"; echo htmlentities($str);
and didn't get entities. According to: http://us.php.net/manual/en/function.htmlentities.php
I would expect them to be encoded if there was encoding available.
No idea what you want to accomplish. But this escapes selected characters to html entities:
$html = preg_replace_callback('/[+-]/', function ($m) { return '&#' . ord($m[0]) . ';'; }, $html); // callback form, since the /e modifier was removed in PHP 7
As far as I am aware, - and + are fine in HTML, and don't have an entity equivalent. See http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references
Are you sure you're not thinking of URL encoding?
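To show the difference that question is pointing at, the sketch below puts URL encoding next to an HTML numeric-entity replacement for the same made-up value. In a URL the + sign is special and gets percent-encoded, while the - sign is left as it is.

```php
<?php
// Illustrative value only.
$value = '1+1-2';

// URL encoding: "+" becomes %2B, "-" is an unreserved character.
$forUrl = urlencode($value);

// HTML numeric entities for the same two characters (43 and 45 are
// their ASCII codes).
$forHtml = strtr($value, ['+' => '&#43;', '-' => '&#45;']);

echo $forUrl, "\n";  // 1%2B1-2
echo $forHtml, "\n"; // 1&#43;1&#45;2
```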
Specify that you want it to use unicode as follows:
htmlentities($str, ENT_QUOTES | ENT_IGNORE, "UTF-8");
Have a look at the 2nd comment on this page:
http://www.php.net/manual/en/function.htmlentities.php#100388
This will enable more encoding characters.
If you just want to encode some, then this is a little lighter weight:
<?php
$ent = array(
'+'=>'&#43;',
'-'=>'&#45;'
);
echo strtr('+ and -', $ent);
?>
