I'm trying to send these characters through PHP:
áéíóúüchlñÁÉÍÓÚÜCLÑ
They show up in the received email like this:
áéÃóúüchlñÃÃÃÃÃ
I tried htmlentities but without success:
$newsubject = htmlentities($subject, ENT_COMPAT, "UTF-8");
mail($notes,$newsubject,$message,$headers);
Does anybody have an idea what I could try?
Thanks
I think you need to use MIME (Multipurpose Internet Mail Extensions).
Add the following to your mail headers:
MIME-Version: 1.0
Content-Type: text/plain;charset=utf-8
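To make the above concrete, here is a minimal sketch, not a definitive implementation. It assumes the mbstring extension is available and that $subject is already valid UTF-8; buildUtf8Mail() is a made-up helper name. Since the garbled text in the question is the subject line, the sketch also wraps the subject in an RFC 2047 encoded-word rather than running it through htmlentities(), because header values themselves have to be ASCII.

```php
<?php
// Sketch only: buildUtf8Mail() is a hypothetical helper, not part of PHP.
// Assumes the mbstring extension is loaded and $subject is valid UTF-8.
function buildUtf8Mail(string $subject): array
{
    mb_internal_encoding('UTF-8');

    // Header values must be ASCII, so wrap the subject in an RFC 2047
    // encoded-word ("B" = base64) instead of escaping it as HTML.
    $encodedSubject = mb_encode_mimeheader($subject, 'UTF-8', 'B');

    // Declare the body charset so the mail client decodes the body as UTF-8.
    $headers = "MIME-Version: 1.0\r\n"
             . "Content-Type: text/plain; charset=utf-8";

    return ['subject' => $encodedSubject, 'headers' => $headers];
}

// Usage (variables as in the question):
// $mail = buildUtf8Mail($subject);
// mail($notes, $mail['subject'], $message, $mail['headers']);
```

The helper only builds the strings, so you can inspect them before handing them to mail().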
You are attempting to send them as UTF-8 but your PHP is handling them as latin-1.
Call utf8_encode on the input string to treat it as UTF-8 again.
EDIT: Misread the question. Add a header to the email you're sending:
Content-Type: text/plain; charset=utf-8
Your character set is wrong on the characters themselves. Try this (Windows): copy and paste those characters from a "UTF-8 character set" site back into your application. Make sure your document is saved as UTF-8 with the BOM signature disabled.
I faced a problem displaying UTF-8 content (Tamil text).
<?php
// SAMPLE CODE
header('Content-Type: text/plain; charset=UTF-8');
echo 'Hello Loréane !';
After some googling, I changed the file's encoding in my editor from 'ANSI' to 'UTF-8'. Now the problem is solved and I get the correct content in the browser.
My question is:
Why does it work after I changed the file's encoding, when sending the UTF-8 header on its own did not?
header('Content-Type: text/plain; charset=UTF-8');
This just informs the browsers what kind of content you're going to send it and how it should treat it. It does not set the encoding of the actual content you're sending. It's completely up to you to fulfil your own promise. Your content is not going to magically transform from whatever to UTF-8 just because you set that header. If you tell the browser to treat the content as UTF-8, but you're sending it Latin-1 encoded data, of course it will break.
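If you cannot re-save the source as UTF-8 (for example, the text comes from a legacy file), you can keep the promise made by the header by converting the bytes yourself. A minimal sketch follows. toUtf8() is a hypothetical helper, and it assumes that anything that is not valid UTF-8 is ISO-8859-1, which may not be true for your data.

```php
<?php
// Sketch only: toUtf8() is a hypothetical helper. It ASSUMES that input which
// is not valid UTF-8 is ISO-8859-1 (Latin-1); adjust for your real source.
function toUtf8(string $text): string
{
    // Already valid UTF-8: leave it alone to avoid double encoding.
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }
    // Otherwise transcode the Latin-1 bytes to UTF-8.
    return mb_convert_encoding($text, 'UTF-8', 'ISO-8859-1');
}

// "\xE9" is "é" in Latin-1; after conversion it becomes the bytes C3 A9.
// header('Content-Type: text/plain; charset=UTF-8');
// echo toUtf8("Hello Lor\xE9ane !");
```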
I am creating a site with html and php.
When I run my PHP page in the browser using localhost (XAMPP server), some symbols (ï»¿) are displayed, but when I check my HTML/PHP code, no symbol like ¿ or » is found.
If I am wrong somewhere, please let me know.
That's a UTF-8 byte-order marker. You should configure your editor to save UTF-8 without BOM. It isn't mandatory for the UTF-8 encoding; in fact, its use is discouraged and it only causes problems.
Additionally, make sure your web server is sending an appropriate Content-Type HTTP header:
Content-Type: text/plain; charset=utf-8
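If changing the editor setting is not an option, the three BOM bytes (EF BB BF) can also be detected and removed in code. A minimal sketch, where stripUtf8Bom() is a hypothetical helper name:

```php
<?php
// Sketch only: stripUtf8Bom() is a hypothetical helper.
// It removes a leading UTF-8 byte-order mark (bytes EF BB BF) if present.
function stripUtf8Bom(string $data): string
{
    $bom = "\xEF\xBB\xBF";
    if (strncmp($data, $bom, 3) === 0) {
        return substr($data, 3);
    }
    return $data;
}

// Usage, assuming $path points at a file you are allowed to rewrite:
// file_put_contents($path, stripUtf8Bom(file_get_contents($path)));
```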
¿ and » can be written as HTML entities, so they look different in the PHP code than in the browser. You can look them up in any HTML entity reference. You may also have an issue with a BOM.
My best guess: you have an encoding issue (UTF-8 vs. ISO-8859-1). Check which encoding your editor uses when saving, and declare that same encoding to the browser, e.g. header("Content-type:text/html;charset=UTF-8")
Sounds like you're dealing with a character encoding problem.
Try to declare the encoding in your headers:
header("Content-Type: text/html; charset=UTF-8");
This needs to be called before any output is sent to the client.
I'm reading a log file pasted into the body of an email. The entries are in various languages, and all characters seem to display correctly except for Russian.
Here is an example of what the Russian says in the log file:
Ссылка на объект не указывает на экземпляр объекта.
в
From what I have read, I need to specify some decoding or encoding, something along the lines of mb_encoding (UTF-8), but I am a bit lost on how to actually structure it without affecting text that isn't Russian. When echoed out, it gets converted to this:
СÑылка на объект не указывает на ÑкземплÑÑ€ объекта.
в
Here is the code I'm using already. I am a PHP beginner and some of this isn't my code; I have edited it to suit, but I'm not 100% sure what everything is doing:
$mailbox = "xxx#gmail.com";
$mailboxPassword = "xxx";
$mailbox = imap_open("{imap.gmail.com:993/imap/ssl}INBOX",
$mailbox, $mailboxPassword);
mb_internal_encoding("UTF-8");
$subject = mb_decode_mimeheader(str_replace('_', ' ', $subject));
$body = imap_fetchbody($mailbox, $val, 1);
$body = base64_decode($body);
echo $body;
Once I echo out the body, the Russian text turns into that garbled output. Any pointers to similar code I can dissect to learn how to fix this?
Please bear in mind that numerous languages are being read from the email. For the most part it's just a few snippets and the rest is basic logging, but I am worried that if I set a new decoding step it will mess up characters in other languages.
Despite its wide adoption, email is still tricky to work with. If your IMAP client has a limited set of requirements, your job will be easy. Otherwise, for a truly general-purpose GMail client, there's no silver bullet and you have to understand how email works: SMTP, MIME and finally IMAP.
Basic MIME knowledge is absolutely needed, and I won't paste the whole Wikipedia article, but you should really read it and understand how it works. IMAP is somewhat easier to understand.
Usually, email messages contain either a single text/plain body, or a multipart/alternative body with both a text/plain and a text/html part. But, you know, there are attachments, so you are also likely to find a multipart/mixed body, which can contain really anything, and if it's binary content you should treat it differently than text. There are two headers (which you can find on the message as a whole or on a part inside a multipart envelope) involved in charset issues: Content-Type and Content-Transfer-Encoding.
From your code, we must assume that you are only interested in base64-encoded textual parts. Once you have decoded them, they are a sequence of bytes representing text in the charset specified by the sender in the Content-Type header, which is non-ASCII here and thus looks like this:
Content-Type: text/plain; charset=ISO-8859-1
Note that the charset may be utf-8 or really any other you can think of; you have to check this in your program. Your job is transcoding this piece of input into the output charset of your HTML page. If your page does not use a Unicode encoding (like UTF-8), chances are that you won't be able to show the message correctly, and '?' will be printed instead of missing characters. Since you require your application to be used worldwide (not just in Russia), and since it's good practice anyway, you should use UTF-8 in your HTML responses, and thus when you want to echo the message body:
echo mb_convert_encoding(imap_base64($body), "UTF-8", $input_charset);
where $input_charset is the one found in the Content-Type header for the processed part. For the subject line, you should use imap_mime_header_decode(), which returns an array of objects with text and charset properties (binary string, charset), which you have to output in the same manner as above.
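The two steps described above (undo the transfer encoding, then transcode from the sender's charset) can be sketched as one function. This is only an illustration under assumptions: decodeTextPart() is a hypothetical helper, and $transferEncoding and $charset are assumed to have been read from the part's headers already (for example via imap_fetchstructure()). It does not touch the IMAP connection, so it also works for parts that are quoted-printable rather than base64.

```php
<?php
// Sketch only: decodeTextPart() is a hypothetical helper. $transferEncoding and
// $charset are ASSUMED to come from the part's Content-Transfer-Encoding and
// Content-Type headers.
function decodeTextPart(string $raw, string $transferEncoding, string $charset): string
{
    // Step 1: undo the transfer encoding to recover the sender's original bytes.
    switch (strtolower($transferEncoding)) {
        case 'base64':
            $bytes = base64_decode($raw);
            break;
        case 'quoted-printable':
            $bytes = quoted_printable_decode($raw);
            break;
        default: // 7bit, 8bit, binary: nothing to undo
            $bytes = $raw;
    }

    // Step 2: transcode from the sender's charset to UTF-8 for output.
    // mbstring does not know every charset label; an unknown one raises an error,
    // so real code needs a fallback here.
    return mb_convert_encoding($bytes, 'UTF-8', $charset);
}
```

Because the conversion is driven by each part's own charset, Russian text sent as Windows-1251 and text in other languages sent as UTF-8 or ISO-8859-1 all end up as UTF-8 without interfering with each other.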
TL;DR
The bytes in the UTF-8 encoded input text map quite nicely to the output if we assume it's CP-1252 encoded (maybe you didn't copy some non-printable ones). This means that the input is UTF-8, but the browser thinks the page is Windows-1252. Likely this is the default browser behavior for your locale, and you can easily correct it by sending the appropriate header before any other output:
header("Content-Type: text/html; charset=utf-8");
This should be enough to solve this issue, but will also likely cause problems with non-ASCII characters in string literals and the database (if any). If you want a multilingual application, Unicode is the way, but you have to transcode your database and your PHP files from CP-1252 to UTF-8.
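If you want to check this diagnosis yourself, you can reproduce the garbling on purpose by decoding UTF-8 bytes as Windows-1252. A small sketch, assuming mbstring is available; showAsWindows1252() is a hypothetical helper name:

```php
<?php
// Sketch only: showAsWindows1252() is a hypothetical helper. It reproduces what
// a browser displays when it decodes UTF-8 bytes as if they were Windows-1252.
function showAsWindows1252(string $utf8Bytes): string
{
    // Treat each input byte as one Windows-1252 character and re-encode the
    // resulting characters as UTF-8 so they can be printed.
    return mb_convert_encoding($utf8Bytes, 'UTF-8', 'Windows-1252');
}

// "é" is the two bytes C3 A9 in UTF-8, which Windows-1252 reads as "Ã" and "©".
// echo showAsWindows1252('é');
```

If feeding the correct text through this gives the same kind of garbage you see in the browser, the data is fine and only the declared charset is wrong.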
I've got a program on which I have non-ASCII characters which do not show properly on ISO-8859-1. Is there a way to use PHP and change the browser encoding somehow, and also allow the characters to display properly in the browser even though the encoding is ISO-8859-1?
Much Appreciated.
Use the header function to send an (explicit) HTTP Content-Type response header.
header('Content-Type: text/html; charset=ISO-8859-1');
… replacing ISO-8859-1 with whatever encoding you are actually using. Hopefully that will be UTF-8.
you should use the header function
header( 'Content-Type: text/html; charset=ISO-8859-1');
Note: you should make sure no content has been sent to the browser yet, or you can't modify the headers anymore, so I advise you to call this as early as possible in your script.
The browser itself doesn't have an encoding. It supports many encodings and uses the one you tell it to. If you specify (in headers and/or HTML) that the encoding is ISO-8859-1, then your document should be in that encoding and you should make sure that all characters you send are in the right encoding. So you should actually send ISO-8859-1 characters. You cannot send a document that uses different encodings for different sections of the document.
For some characters, you may send an HTML entity instead. For instance é can be sent as &eacute;. This will work regardless of encoding.
If you have the choice, I'd opt to use UTF-8. It supports any character and you don't have to worry about escaping diacritics or other special characters, except those that are special to HTML/XML itself.
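If you are stuck with ISO-8859-1, the entity approach mentioned above can be automated. Here is a minimal sketch, assuming the source text is UTF-8 and mbstring is available; toAsciiHtml() is a hypothetical helper name. It uses numeric entities rather than named ones so that it also covers characters that have no named entity.

```php
<?php
// Sketch only: toAsciiHtml() is a hypothetical helper. It ASSUMES $utf8Text is
// valid UTF-8 and turns every non-ASCII character into a numeric HTML entity.
function toAsciiHtml(string $utf8Text): string
{
    // Convert map: [first code point, last code point, offset, mask].
    // Everything from U+0080 upward is converted; plain ASCII is left as-is.
    $convmap = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity($utf8Text, $convmap, 'UTF-8');
}

// header('Content-Type: text/html; charset=ISO-8859-1');
// echo toAsciiHtml('café');
```

The output is pure ASCII, so it displays the same under any ASCII-compatible charset. Characters that are special to HTML itself (<, >, &) are not touched here and still need htmlspecialchars().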
Like others have said, using the header function:
header('Content-type: text/html; charset=ISO-8859-1');
or, if you want to serve valid XHTML files instead of the standard HTML:
header('Content-type: application/xhtml+xml; charset=ISO-8859-1');
It is possible to call the header later on in the script, unlike what RageZ said, but you will need to have enabled output buffering for that, using ob_start().
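A rough sketch of the output-buffering approach, not a definitive implementation; renderWithLateHeader() is a hypothetical helper name:

```php
<?php
// Sketch only: renderWithLateHeader() is a hypothetical helper.
function renderWithLateHeader(callable $render): string
{
    ob_start();            // hold output in a buffer instead of sending it
    $render();             // the page body is produced here, before the header

    // Nothing has reached the client yet, so the header can still be set.
    // The guard skips the call if something was flushed earlier in the script.
    if (!headers_sent()) {
        header('Content-Type: text/html; charset=ISO-8859-1');
    }

    return ob_get_clean(); // hand back the buffered body and stop buffering
}

// Usage:
// echo renderWithLateHeader(function () { echo '<p>Hello</p>'; });
```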
I have a PHP file with one simple echo function:
echo 'アクセスは撥ねりません。';
but when I access that page i get this:
????????????
Can someone help me?
I also have my page encoding set to UTF-8, and I know it, because all of the browsers I used said so.
I also do this before the echo function:
mb_internal_encoding('UTF-8');
What does this do?
Does it help me?
All I need is to be able to echo a static Japanese string.
Thanks!
There are a few places where this could go wrong.
Firstly, if you aren't setting the output encoding in php with header()
header('Content-type: text/html; charset=utf-8');
or in your html with a meta tag:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
you will need to check the php.ini setting default_charset. Chances are this is defaulted to iso-8859-1
Secondly, you may also need to check the encoding your PHP script is saved in. If it is saved as ASCII or some Latin charset, the characters will be mangled.
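Both checks can be done from inside the script. The following is only a debugging sketch; describeEncoding() is a hypothetical helper name and it assumes mbstring is available:

```php
<?php
// Sketch only: describeEncoding() is a hypothetical debugging helper.
function describeEncoding(string $literal): array
{
    return [
        // What PHP will advertise in Content-Type if you send no charset yourself.
        'default_charset' => ini_get('default_charset'),
        // Whether the bytes of the literal form valid UTF-8.
        'is_valid_utf8'   => mb_check_encoding($literal, 'UTF-8'),
        // The raw bytes as stored in the script file.
        'hex'             => bin2hex($literal),
    ];
}

// var_dump(describeEncoding('ア'));
// 'hex' is "e382a2" when this file is saved as UTF-8.
```

If the hex dump shows 3f (a literal question mark), the characters were already lost when the file was saved, and no header or mbstring setting will bring them back.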
I got it.
I just had to set the mbstring extension settings to handle internal strings in UTF-8. That extension is standard with my build of PHP 5.3.0.
Maybe you are printing Japanese characters contained in UTF-16 (extended set of chars)?
I just did a quick test and your example works for me, so it's most likely one of these:
Your file is not saved in UTF-8, but some other encoding, such as Shift-JIS. A decent editor should be able to let you see what encoding it used
Your server is sending bad http headers. Can you use some tool to check the headers and paste the results? Or the results you got from the browser?
The browser is using an incompatible font
I saved a file in UTF-8, pasted your code into it, and my server is serving the file with Content-Type: text/html; charset=utf-8 and it shows up just fine. Did not need to use the mb_ function or anything else.