Html Special Chars PHP - php

fm API to get event description, venue name, etc...
Sometimes I get special characters back, such as ' é à, but they show up scrambled.
How can I display them properly? The description also comes back with HTML tags, and I do want to keep those.
Can someone help me with both cases? The language I'm using is PHP.
Thanks in advance.

Specify the encoding in the page head:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
...
Encode the string when handling the input (note that utf8_encode() assumes the input is ISO-8859-1, and it is deprecated as of PHP 8.2):
$str=utf8_encode($str);
If you are displaying the input back as-is, no escaping is required.
However, if the value is the content of an input or textarea, escape the HTML characters:
<?php echo htmlspecialchars($str); ?>
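As a minimal sketch of the two cases in the question, the snippet below escapes a value that goes into a form field and prints a description with its tags intact. The sample strings and variable names are made up for illustration, and printing the description unescaped assumes the API is a trusted source.

```php
<?php
// Hypothetical sample values standing in for what the API returns.
$venue       = "O'Neill's Café <main hall>";
$description = '<p>Soirée <b>spéciale</b> à 20h</p>';

// Value destined for an attribute or textarea: escape quotes and angle
// brackets, and tell PHP the string is UTF-8 so accented letters pass through.
$fieldValue = htmlspecialchars($venue, ENT_QUOTES, 'UTF-8');
echo '<input type="text" name="venue" value="' . $fieldValue . '">', "\n";

// Description whose tags should be kept: print it unescaped.
// Assumption: the markup comes from a source you trust.
echo $description, "\n";
```

Passing the charset explicitly keeps the behaviour the same across PHP versions whose default differs.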

For Latin characters, use
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
in your <head> section.

You need to be sure about two things: the encoding declared in the meta header, and the encoding actually used for the text you serve.
If you declare UTF-8 in the header, make sure the text you serve is converted to UTF-8; see encoding conversion functions such as mb_convert_encoding.
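A small sketch of that conversion, assuming the incoming text is ISO-8859-1 when it is not already valid UTF-8. The helper name toUtf8 and the default source encoding are illustrative choices, not something the answers above prescribe.

```php
<?php
// Hypothetical helper: convert to UTF-8 only when the input is not already
// valid UTF-8, so text that is fine as-is does not get double-encoded.
function toUtf8(string $text, string $assumedSource = 'ISO-8859-1'): string
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }
    return mb_convert_encoding($text, 'UTF-8', $assumedSource);
}

$latin1 = "caf\xE9";        // "café" as ISO-8859-1 bytes
$utf8   = toUtf8($latin1);  // same word, now with the é stored as c3 a9

echo bin2hex($utf8), "\n";  // 636166c3a9
```

The validity check is only a heuristic: a short single-byte string can happen to be valid UTF-8, so knowing the real source encoding is always better than guessing.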

Related

Some characters become "�" in our webpage

I use PHP to access a database to get a string like this
‘Chloe’ Fashion Show & Dinner
and then I do a printf() to output the string as html, but my webpage shows this:
�Chloe� Fashion Show & Dinner
All contents are English-based, do I miss something in PHP?
Where should I be checking?
Check if your .php file is encoded as UTF-8 without BOM
Check that your connection to the database is UTF-8
Check that you send <meta charset="utf-8"> in your HTML markup in the <head> tag
If your connection to the database is not UTF-8 and you don't want to change it (though I recommend changing it: using UTF-8 everywhere is the safest defense against garbled characters), use utf8_encode($databaseValue) to make sure the value is UTF-8.
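The first check in the list above can be automated. This is a rough sketch with a hypothetical helper, hasUtf8Bom, that looks for the three-byte UTF-8 byte order mark at the start of a string.

```php
<?php
// Hypothetical helper: true when the given bytes start with the UTF-8
// byte order mark (EF BB BF).
function hasUtf8Bom(string $bytes): bool
{
    return strncmp($bytes, "\xEF\xBB\xBF", 3) === 0;
}

var_dump(hasUtf8Bom("\xEF\xBB\xBF<?php echo 'hi';")); // bool(true)
var_dump(hasUtf8Bom("<?php echo 'hi';"));             // bool(false)
```

To check a real script, pass it the contents of the file, for example the result of file_get_contents() on the path in question.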
Make sure that you use:
<meta charset="utf-8">
in the head of your page.
You need to add a charset meta tag in the <head> section of the HTML.
Note that the meta tag must appear within the first 1024 bytes of the rendered page.
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>

’ is being displayed instead of -

’ is being displayed instead of - in php page
I tried using different encoding types like:
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"/>
and
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
but the result is the same. What could be the problem?
Input
<strong style="color:#A8A8A8;">1</strong> – Lorem Ipsum.
Result
1 – Lorem Ipsum.
Make sure your html header specifies utf8
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
That usually does the trick for me (obviously if the content IS utf8).
You don't need to convert to html entities if you set the content-type.
Check http://php.net/manual/en/function.mb-convert-encoding.php
<?php
header('Content-Type: text/html; charset=utf-8');
mb_internal_encoding('utf-8');
?>
Maybe this will help you.
It looks like your source data is converted from one to another encoding along the way. Try to make sure ALL steps have the same encoding.
Is your (MySQL?) data stored as UTF8?
Is your .php file saved as UTF8?
Conversion errors like this usually pop up when handling UTF8 data as ISO-8859-1 data. (multibyte vs singlebyte? not sure).
The fact that the meta tag doesn't change the output is a strong indicator that something is overriding it, probably the charset specified in the HTTP header (which takes precedence over the meta tag). Are you sure you're not setting it there?
Your document is most likely encoded in UTF-8 since – is the iso-8859-1 presentation of the UTF-8 encoded character –.
What you need is the meta-tag you describe:
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
Since it isn't working, the tag might be ignored. My suggestion is to check in the browser which encoding it tries to use (Tools - Encoding in Chrome).
If the browser uses UTF-8, you have double-encoded the characters. If so, check your code for a redundant utf8_encode(...) call.
If the browser uses Latin1 (iso-8859-1) your tag is ignored or overridden by the HTTP header. Try to validate your HTML with an online validator. Check the sent header information with your browser's development tool to make sure iso-8859-1 is not set as encoding.
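To see where the stray characters come from, the sketch below reproduces the garbling and then reverses it. It uses Windows-1252 rather than strict ISO-8859-1 on the assumption that the browser treats the iso-8859-1 label as Windows-1252, which is how current browsers behave.

```php
<?php
$dash = "\xE2\x80\x93"; // EN DASH (U+2013) as three UTF-8 bytes

// Misreading those three bytes as Windows-1252 turns each byte into its
// own character; re-encoding that as UTF-8 is the "double encoding" case.
$garbled = mb_convert_encoding($dash, 'UTF-8', 'Windows-1252');
echo $garbled, "\n"; // –

// Undoing the step: map the three characters back to single bytes.
$repaired = mb_convert_encoding($garbled, 'Windows-1252', 'UTF-8');
echo bin2hex($repaired), "\n"; // e28093
```

Repairing like this only works when the text went through exactly one wrong conversion; fixing the mismatch at its source is the more reliable route.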
I had the same problem when creating a file from Java code; setting the encoding to UTF-16 did the trick.

How to save Russian characters in a UTF-8 encoded file

OK so I have a PHP file with several strings of text in various languages. For most languages like French or Spanish I just simply type in the characters.
The problem I have is with Russian language characters. The PHP file is encoded in UTF-8, how can I make sure that the Russian characters are both saved correctly and displayed correctly on the output web page... Is it just a case of pasting the text into the PHP file, or is there a way to guarantee the characters will be saved into the file correctly - perhaps converting it into HTML-like notation for example?
Obviously I am assuming the end user will have the correct encoding set in their web browser, I just want to make sure I got it all covered from my end.
I am using Notepad++ on Windows to edit my PHP file.
Thanks!
If you want to tell browsers your encoding, place this inside your <head> tag:
<meta http-equiv='Content-Type' content='text/html; charset=utf-8'>
Or the short version:
<meta charset='utf-8'>
That should be enough for Russian characters to be displayed correctly on a webpage.
If your doctype is HTML, declare <meta http-equiv='Content-Type' content='text/html; charset=UTF-8'>; if your doctype is XHTML, declare <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />.
Never assume in your design that the end user will act correctly.
If you already have a document, edit its meta tag for the charset declaration, use Notepad++'s Encoding > Convert to UTF-8 without BOM, save the document, and carry on with your multilingual structure from there.
The php tag is irrelevant to your question, since you don't mention any database charset setting.
There is no difference between Latin and Cyrillic characters in UTF-8. Both are just byte sequences. Configure your server or PHP script to send Content-Type: text/html;charset=utf-8, and you are rather safe.
Your editor might have problems when the font you are using does not contain Russian characters. Choose another font then.
And please ignore the <meta> element recommendations. You don't need that: it is useless when your HTTP headers are correct, and maybe harmful if they aren’t.
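A brief sketch of that point: a Russian word pasted into the script is just a longer byte sequence. It assumes this file itself is saved as UTF-8; the sample word and variable names are illustrative.

```php
<?php
// Declare the encoding in the HTTP header, before any output.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

// Assumption: this source file is saved as UTF-8, so the literal below is
// stored as UTF-8 bytes exactly as typed.
$russian = 'Привет';

echo strlen($russian), "\n";             // 12 (bytes: two per Cyrillic letter)
echo mb_strlen($russian, 'UTF-8'), "\n"; // 6  (characters)
echo $russian, "\n";
```

No HTML-style escaping of the letters is needed as long as the file encoding and the declared charset agree.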
Well, you have to check two things.
To ensure that the *.php file is UTF-8, I use PSPad. If the file is not in UTF-8, I save it like this: http://stepolabs.com/upload/utf-8.png
Then your website must declare UTF-8 encoding in a <meta> tag:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
Finally, if both are done correctly (file format and meta declaration), everything should display properly!

javascript encodeURIComponent equivalent in php

I need to convert %F3%BE%AE%A2 to this char "󾮢" in PHP. When I tried using
rawurldecode('%F3%BE%AE%A2');
it gives 2 characters instead of 1.
How can I convert it properly?
EDIT:
To be more specific, it's a character that needs a UTF-16 surrogate pair: it gives \udbba\udfa2 in JavaScript. If I
wanted to send the data via the JavaScript API, I could easily send "󾮢" as a single character.
But for security reasons I need to use PHP, and that's where the problem starts. Decoding '%F3%BE%AE%A2' with rawurldecode(), along with a UTF-8 header, doesn't seem to give me the character I want.
I hope I have explained it well. Thanks for your help.
Actually, rawurldecode() is giving you the correct result. That character consists of four bytes when encoded in UTF-8, and the rule in URL encoding is to convert each byte to %XX notation. rawurldecode() gives you back those four bytes, but you have probably not set your page's encoding to UTF-8, so your browser is misinterpreting them. Add this to your <head>:
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
and you should see the right character.
This is a test page I made:
<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
</head>
<body>
<?php echo rawurldecode('%F3%BE%AE%A2'); ?>
</body>
</html>
what I see in my browser:
󾮢
exactly the character you want to see.
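The claim that the four decoded bytes form a single character can be checked directly. This is a quick sketch that assumes the mbstring extension is available (mb_ord() needs PHP 7.2 or later).

```php
<?php
$decoded = rawurldecode('%F3%BE%AE%A2');

echo strlen($decoded), "\n";                  // 4 (bytes)
echo mb_strlen($decoded, 'UTF-8'), "\n";      // 1 (character)
printf("U+%X\n", mb_ord($decoded, 'UTF-8'));  // U+FEBA2
```

U+FEBA2 lies outside the Basic Multilingual Plane, which is why JavaScript reports it as the surrogate pair \udbba\udfa2 while UTF-8 stores it as four bytes.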

Convert html entities into characters problem

I'm having trouble converting HTML entities into characters (&#8217;). I have tried all the different PHP functions (html_entity_decode, htmlspecialchars, etc.). None seem to be working. Any ideas which function I need to use?
Thank you!
Your problem isn't that the characters are not decoded correctly, but that the browser is misinterpreting the decoded characters.
As the page is encoded using UTF-8, you need to specify that in the header:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
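To confirm that the decoding itself works, here is a minimal sketch that decodes the entity from the question and inspects the resulting bytes. The sample sentence is made up.

```php
<?php
$encoded = 'It&#8217;s here';

// Decode to UTF-8; the numeric entity becomes U+2019 (right single quote).
$decoded = html_entity_decode($encoded, ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo bin2hex("\u{2019}"), "\n"; // e28099, the UTF-8 bytes of U+2019
echo $decoded, "\n";            // renders correctly only if the page is read as UTF-8
```

If the output still looks wrong in the browser, the bytes are fine and the declared charset is the thing to check.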
