I'm using Zend Framework with MongoDB. I need to convert French characters to HTML entities.
For example: Prénom -> Pr&eacute;nom. What could I do?
htmlentities ( http://php.net/htmlentities ) can do this if you call:
htmlentities('Prénom', ENT_COMPAT, 'UTF-8');
I get:
Pr&eacute;nom
as the result
Maybe you can take a look at strtr function (Read more at http://php.net/strtr)?
I think that the right way to look is either mb_convert_encoding or htmlentities
Here is an example:
$text = "Prénom";
echo mb_convert_encoding($text, 'HTML-ENTITIES', 'UTF-8');
echo "\n";
echo htmlentities($text, ENT_COMPAT | ENT_HTML401, 'UTF-8');
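As a quick sanity check, the conversion can be round-tripped. This is only a minimal sketch, assuming the script file itself is saved as UTF-8; the variable names are illustrative. Note that the 'HTML-ENTITIES' pseudo-encoding of mb_convert_encoding is deprecated as of PHP 8.2, so the sketch sticks to htmlentities and html_entity_decode:

```php
<?php
// Assumption: this source file is saved as UTF-8.
$text = 'Prénom';

// Encode every character that has a named entity.
$encoded = htmlentities($text, ENT_QUOTES, 'UTF-8');

// Decode it again to confirm nothing was lost.
$decoded = html_entity_decode($encoded, ENT_QUOTES, 'UTF-8');

echo $encoded, "\n"; // Pr&eacute;nom
echo $decoded, "\n"; // Prénom
```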
I'm trying to decode a string using PHP but it doesn't seem to be returning the correct result.
I've tried using html_entity_decode as well as utf8_decode(urldecode())
Current code:
$str = "joh&#39;#test.com";
$decodeStr = html_entity_decode($str, ENT_COMPAT, "UTF-8");
Expected return is john#test.com
I suppose your HTML entity code for the character 'n' is wrong.
Working example:
$str = "joh&#110;#test.com";
echo $decodeStr = html_entity_decode($str, ENT_COMPAT, "UTF-8");
The HTML entity code for n is &#110;, whereas the entity code in your string, &#39;, is for a single apostrophe. If you want to convert single quotes, the ENT_QUOTES flag must be used when calling html_entity_decode(), as the default before PHP 8.1 is ENT_COMPAT | ENT_HTML401 (from the PHP docs), which doesn't convert single quotes. If you need additional flags, you can combine them using the pipe | symbol like this: ENT_HTML401 | ENT_QUOTES.
If you're expecting john#test.com:
$str = "joh&#110;#test.com";
$decodeStr = html_entity_decode($str, ENT_COMPAT, "UTF-8");
echo $decodeStr; // john#test.com
Or if you're expecting joh'#test.com:
$str = "joh&#39;#test.com";
$decodeStr = html_entity_decode($str, ENT_QUOTES, "UTF-8");
echo $decodeStr; // joh'#test.com
Shouldn't the entity for # be &#35; instead of &#39;, which is for an apostrophe?
How do I convert the text below to something like "Växjö" using PHP?
V&auml;xj&ouml;
I have tried
html_entity_decode(preg_replace("/U\+([0-9A-F]{4})/", "&#x\\1;", $text), ENT_NOQUOTES, 'UTF-8')
iconv(mb_detect_encoding($text, mb_detect_order(), true), "UTF-8", $text)
Any PHP version from 5.0 onwards should be fine with...
$decoded = html_entity_decode('V&auml;xj&ouml;', ENT_COMPAT, 'UTF-8');
Demo here - http://3v4l.org/DZc59
echo html_entity_decode('V&auml;xj&ouml;', ENT_QUOTES, 'UTF-8');
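Whichever form the input uses, html_entity_decode handles named, decimal and hexadecimal references alike. A small sketch to illustrate this; the three input strings are made up for the comparison:

```php
<?php
// The same word written with named, decimal and hexadecimal references.
$named   = html_entity_decode('V&auml;xj&ouml;', ENT_QUOTES, 'UTF-8');
$numeric = html_entity_decode('V&#228;xj&#246;', ENT_QUOTES, 'UTF-8');
$hex     = html_entity_decode('V&#xE4;xj&#xF6;', ENT_QUOTES, 'UTF-8');

// All three decode to the same UTF-8 string.
var_dump($named === $numeric && $numeric === $hex); // bool(true)
echo $named, "\n"; // Växjö
```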
I have xml like this:
<formula type="inline">
<default:math xmlns="http://www.w3.org/1998/Math/MathML">
<default:mi>
ℤ
</default:mi>
</default:math>
</formula>
My goal is to get rid of all named entities, like the one that renders as ℤ, by replacing them with their numeric character references.
I tried:
$test = <content of the xml>;
$convmap = array(0x80, 0xffff, 0, 0xffff);
$test = mb_encode_numericentity($test, $convmap, 'UTF-8');
But this does not replace the entity for ℤ. Any idea?
My goal is to get:
&#8484;
as shown here: http://www.fileformat.info/info/unicode/char/2124/index.htm
Thank you.
Your converter is converting your LaTeX into MathML, not HTML entities. You need something that converts directly into HTML character references, or a MathML to HTML character reference converter.
You should be able to use htmlentities:
htmlentities($symbolsToEncode, ENT_XML1, 'UTF-8');
http://pt1.php.net/htmlentities
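You can also add the ENT_SUBSTITUTE flag (for example ENT_XML1 | ENT_SUBSTITUTE); invalid code unit sequences are then replaced with the Unicode replacement character U+FFFD (or &#xFFFD; for non-Unicode encodings) instead of making the function return an empty string.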
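To see what ENT_SUBSTITUTE changes, here is a small sketch with a deliberately broken input. The byte string is an assumption made up for the demonstration (a Latin-1 "é" inside text that is declared as UTF-8):

```php
<?php
// "\xE9" is "é" in ISO-8859-1, but on its own it is not valid UTF-8.
$broken = "Pr\xE9nom";

// Without ENT_SUBSTITUTE (or ENT_IGNORE) invalid input yields an empty string.
$strict = htmlentities($broken, ENT_QUOTES, 'UTF-8');

// With ENT_SUBSTITUTE the invalid byte becomes U+FFFD and the rest survives.
$substituted = htmlentities($broken, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

var_dump($strict === '');                             // bool(true)
var_dump(strpos($substituted, "\u{FFFD}") !== false); // bool(true)
```

The flags are passed explicitly because the default flag set changed in PHP 8.1 and already includes ENT_SUBSTITUTE there.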
As an alternative, you could use strtr to convert the characters to something you specify:
$chars = array(
    "ℤ" => "&#8484;", // U+2124, decimal 8484
    // ...
);
$convertedXML = strtr($xml, $chars);
http://php.net/strtr
Someone has done something similar on GitHub.
So you need to decode the named entities first:
function decodeNamedEntities($string) {
    static $entities = NULL;
    if (NULL === $entities) {
        $entities = array_flip(
            array_diff(
                get_html_translation_table(HTML_ENTITIES, ENT_COMPAT | ENT_HTML5, 'UTF-8'),
                get_html_translation_table(HTML_ENTITIES, ENT_COMPAT | ENT_XML1, 'UTF-8')
            )
        );
    }
    return str_replace(array_keys($entities), $entities, $string);
}
After that you can use htmlentities to encode them in a different format if it is really needed.
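The two steps can be combined into one helper. The following is only a sketch under stated assumptions: namedToNumeric is a hypothetical name, it uses mb_encode_numericentity (which the question already tried) for the second step instead of htmlentities, and it only knows the one entity name per character that PHP's HTML5 translation table lists, so some aliases may stay untouched:

```php
<?php
// Hypothetical helper: replace HTML5 named entities with decimal numeric
// references while leaving XML's own entities (&amp;, &lt;, ...) alone.
function namedToNumeric(string $xml): string
{
    $html5 = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    $xml1  = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES | ENT_XML1, 'UTF-8');

    // Entities known to HTML5 but not predefined in XML: '&eacute;' => 'é', ...
    $named = array_flip(array_diff($html5, $xml1));

    // Step 1: decode the named entities into real UTF-8 characters.
    $decoded = strtr($xml, $named);

    // Step 2: encode every non-ASCII character as a decimal reference.
    $convmap = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity($decoded, $convmap, 'UTF-8');
}

$numeric = namedToNumeric('<mi>&eacute;</mi> &amp; &lt;');
echo $numeric, "\n"; // <mi>&#233;</mi> &amp; &lt;
```

Keeping &amp; and &lt; encoded matters here, because decoding them would break the surrounding XML.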
echo $title gives me something like \u00ca\u00e0\u00f7\u00e5\u00eb\u00e8.
It should be a readable text instead. How do I decode it correctly?
I've tried html_entity_decode($title, 0, 'UTF-8'), but it doesn't work for non-English languages. I get something like Êà÷åëè instead of the real text.
Try echo htmlentities($str, ENT_QUOTES | ENT_IGNORE, "UTF-8");
Try this:
$title = mb_convert_encoding($title, 'HTML-ENTITIES', 'UTF-8');
Hope this works for you.
Edit:
Try this and see if it works:
$title = iconv(mb_detect_encoding($title, mb_detect_order(), true), "UTF-8", $title);
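If the \uXXXX escapes decode to Latin-looking garbage such as Êà÷åëè, one possible explanation is that the original bytes were in a single-byte Cyrillic code page and were read as Latin-1 somewhere along the way. The sketch below assumes exactly that, with Windows-1251 as the guessed source encoding; neither assumption is confirmed by the question:

```php
<?php
// The escaped value from the question, kept literal by the single quotes.
$title = '\u00ca\u00e0\u00f7\u00e5\u00eb\u00e8';

// Step 1: let the JSON parser turn the \uXXXX escapes into UTF-8 characters.
$latin = json_decode('"' . $title . '"');

// Step 2: map those characters back to the single bytes they came from.
$bytes = mb_convert_encoding($latin, 'ISO-8859-1', 'UTF-8');

// Step 3 (assumption): reinterpret the bytes as Windows-1251 text.
$text = mb_convert_encoding($bytes, 'UTF-8', 'Windows-1251');

echo $text, "\n"; // Качели
```

If the source encoding is something else (KOI8-R, for instance), step 3 needs the matching name instead.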
How can I make htmlentities work with Cyrillic characters?
Now, when I try to input some Cyrillic text, "Тест", it returns "ТеÑ"
My code:
$var = htmlentities($var);
Encoding: utf-8.
Thanks!
I had the same problem, try this solution:
<?php echo htmlentities("Текст на русском языке", ENT_QUOTES, 'UTF-8') ?>
In order to bring closure to this question -
I want my users not to enter HTML code in their comments
For that, htmlentities() is not necessary; htmlspecialchars() converts the special characters (&, <, > and quotes) that need escaping to prevent HTML from being rendered.
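A minimal sketch of that approach, assuming the comment arrives as a UTF-8 string; the sample comment is made up:

```php
<?php
// A made-up user comment containing markup and Cyrillic text.
$comment = '<script>alert(1)</script> Тест';

// Only the HTML-significant characters are escaped; Cyrillic stays readable.
$safe = htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');

echo $safe, "\n"; // &lt;script&gt;alert(1)&lt;/script&gt; Тест
```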
The default document type is ENT_HTML401, whose entity table is comparatively small. Try using ENT_HTML5:
<?php echo htmlentities("Текст на русском языке", ENT_COMPAT | ENT_HTML5, 'UTF-8') ?>
If you want to know which entities are replaced, you can use get_html_translation_table:
<?php print_r(get_html_translation_table(HTML_ENTITIES, ENT_COMPAT | ENT_HTML401)) ?>
<?php print_r(get_html_translation_table(HTML_ENTITIES, ENT_COMPAT | ENT_HTML5)) ?>
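Since print_r on these tables produces a lot of output, a shorter comparison may be easier to read. This is a sketch only; the character looked up is an arbitrary example:

```php
<?php
$html401 = get_html_translation_table(HTML_ENTITIES, ENT_COMPAT | ENT_HTML401, 'UTF-8');
$html5   = get_html_translation_table(HTML_ENTITIES, ENT_COMPAT | ENT_HTML5, 'UTF-8');

// Compare how many characters each table knows a named entity for.
printf("HTML 4.01: %d entries, HTML5: %d entries\n", count($html401), count($html5));

// Look up one character in both tables; "(none)" means no named entity.
$char = 'é';
echo $html401[$char] ?? '(none)', "\n";
echo $html5[$char] ?? '(none)', "\n";
```

Changing $char to a Cyrillic letter shows quickly whether a given table has an entry for it.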