I'm building a multilingual website that uses PHP to load language files. Paragraphs in the language file are set in define() constants.
After opening the page in the browser, I get a bunch of characters like "?". In the markup I have the encoding set to UTF-8. What can be done to make it work, other than replacing all unknown characters with HTML character entities?
header('Content-Type:text/html; charset=UTF-8');
Put this line at the top of your PHP code to send the header; it also helps when validating dynamic pages, since it gets rid of the UTF warning.
Make sure your language files are UTF-8,
make sure your HTML files and templates are UTF-8,
make sure you add the following tag to your page:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
As long as your data is UTF-8, the template and HTML files are UTF-8, and you explicitly specify that the HTML page is UTF-8, it should work.
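A minimal sketch of how these pieces might fit together, assuming the script itself is saved as UTF-8 without BOM. The constant WELCOME_TEXT and the helper render_page() are hypothetical names used only for illustration; on a real site the define() call would sit in its own language file.

```php
<?php
// Hypothetical language constant; normally this define() would live in a
// separate language file (saved as UTF-8 without BOM) loaded with require.
define('WELCOME_TEXT', 'Bienvenue, élèves et invités');

// Hypothetical helper: wraps the text in a page that declares UTF-8 in the markup.
function render_page(string $body): string
{
    return "<!DOCTYPE html>\n"
        . "<html>\n<head>\n"
        . "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\" />\n"
        . "</head>\n<body>\n<p>"
        // Escapes only &, <, >, and quotes; accented letters stay as raw UTF-8.
        . htmlspecialchars($body, ENT_QUOTES, 'UTF-8')
        . "</p>\n</body>\n</html>\n";
}

// The header has to go out before any output; headers_sent() guards a late call.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}

echo render_page(WELCOME_TEXT);
```

With the header, the meta tag, and the file encoding all agreeing on UTF-8, the accented letters should not need to be replaced with HTML entities.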
How can I remove only � (using cURL to get the data)?
$str = "Check this out <a href=�http://www.somewebsite.com�>Somewebsite</a>, this is a great website
Windows� (XP 32bit/Vista/7/8/8.1)";
I just want � to be removed.
I tried
$output = preg_replace("/[^A-Za-z0-9]/","",$str);
It removes the HTML as well, but I want to keep the HTML.
Rather than using a bad work-around like that, you should fix your charset issue. Your problem is likely that you don't use the same character encoding at all levels of your application/scripts. Anything that has, or can be set to, a specific character encoding should be set to the same one. The most common ones are below.
Save the document as UTF-8 (or UTF-8 w/o BOM) (if you're using Notepad++, it's Format -> Convert to UTF-8 or UTF-8 w/o BOM)
The header in both PHP and HTML should be set to UTF-8
HTML: <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />, inside the <head>-tag in your document.
PHP: header('Content-Type: text/html; charset=utf-8'); - PHP headers have to be set BEFORE any output is made (no HTML, no whitespace, no echo/print - nothing).
There are other aspects as well that might need to be set to UTF-8, it depends on what kind of PHP functions you are using and so on. But the above is generally a good start.
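If the fetched data cannot be requested as UTF-8 in the first place, one option is to convert it once, right after fetching, instead of stripping characters. The sketch below is only an illustration: the helper name to_utf8() is hypothetical, and Windows-1252 as the source encoding is an assumption (it is a common origin of curly quotes and ® turning into �), not something the question confirms.

```php
<?php
// Hypothetical helper: returns the input unchanged when it is already valid
// UTF-8, otherwise converts it from an ASSUMED source encoding.
function to_utf8(string $raw, string $assumedSource = 'Windows-1252'): string
{
    if (mb_check_encoding($raw, 'UTF-8')) {
        return $raw;
    }
    return mb_convert_encoding($raw, 'UTF-8', $assumedSource);
}

// Sample bytes as they might arrive from cURL: \x93 and \x94 are curly quotes
// and \xAE is the registered sign in Windows-1252.
$raw = "<a href=\x93http://www.somewebsite.com\x94>Somewebsite</a> Windows\xAE";

$clean = to_utf8($raw);
var_dump(mb_check_encoding($clean, 'UTF-8')); // the HTML tags are left intact
```

If the real source encoding is different, the second argument would have to change accordingly; guessing wrong produces different garbage rather than an error.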
I have some texts in French (containing accented characters such as "é"), stored in a MySQL table whose collation is utf8_unicode_ci (both the table and the columns), that I want to output on an HTML5 page.
The HTML page charset is UTF-8 (<meta charset="utf-8" />) and the PHP files themselves are encoded as "UTF-8 without BOM" (I use Notepad++ on Windows). I use PHP5 to request the database and generate the HTML.
However, on the output page, the special characters (such as "é") appear garbled and are replaced by "�".
When I browse the database (via phpMyAdmin) those same accented characters display just fine.
What am I missing here?
(Note: changing the page encoding (through Firefox's "web developer" menu) to ISO-8859-1 solves the problem... except for the special characters that appear directly in the PHP files, which now become corrupted. But anyway, I'd rather understand why it doesn't work as UTF-8 than change the encoding without understanding why that works. ^^;)
I experienced the same problem before; here is what I did:
1) use Notepad++ (which can handle almost any encoding) or Eclipse, and be sure to open and save the file as UTF-8 without BOM.
2) set the encoding in the PHP header, using header('Content-type: text/html; charset=UTF-8');
3) remove any extra spaces at the start and end of my PHP files.
4) set the collation of all my tables and columns to utf8mb4_general_ci or utf8mb4_unicode_ci via phpMyAdmin or any MySQL client you have.
5) set the MySQL connection charset to UTF-8 (I use PDO for my database connection) with one of the following:
PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8"
PDO::MYSQL_ATTR_INIT_COMMAND => "SET CHARACTER SET utf8"
or just execute the SQL queries before fetching any data
6) use a meta tag <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
7) declare the language code for French
<meta http-equiv="Content-language" content="fr" />
8) change the html element lang attribute to the desired language
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="fr" lang="fr">
I will keep updating this, because I had a really hard time solving this problem when I was dealing with Japanese characters in past projects.
9) some fonts are not available on the client PC; in that case you need to include them in your CSS, for example via Google Fonts
10) Don't end your PHP source file with ?>
NOTE:
If nothing above works, try adjusting your encoding to the character set you actually want to display; I set everything to SHIFT-JIS to display all my Japanese characters and it works fine. But using UTF-8 should be your priority.
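As a sketch of step 5, pdo_mysql also accepts the connection charset directly in the DSN, which avoids a separate SET NAMES query. The helper names, host, database name, and credentials below are hypothetical, and utf8mb4 is used here as an assumption so that the connection matches the utf8mb4 collations from step 4.

```php
<?php
// Hypothetical helper: builds a pdo_mysql DSN that includes the connection charset.
function utf8_dsn(string $host, string $dbName): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=utf8mb4', $host, $dbName);
}

// Hypothetical helper: opens the connection. Host, database, user, and
// password are placeholders to be replaced with real values.
function connect_utf8(string $host, string $dbName, string $user, string $password): PDO
{
    return new PDO(utf8_dsn($host, $dbName), $user, $password, [
        // Throw exceptions instead of failing silently.
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);
}

// Only the DSN is printed here, so the sketch runs without a database.
echo utf8_dsn('localhost', 'my_site'), "\n";
```

A call such as connect_utf8('localhost', 'my_site', 'user', 'secret') would then return a connection whose client, connection, and result character sets are all utf8mb4.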
This works for me
Make your database utf8_general_ci
Save your files in Notepad++ as UTF-8 without BOM
Put $mysqli->query('SET NAMES utf8'); after the connection to the database in your PHP file
Put <meta charset="utf-8" /> in your HTML files
Works perfectly.
If default_charset in your php.ini is not set to UTF-8, you need to send a Content-Type header to declare the encoding of your data. Apply the following header at the top of your file(s):
header("Content-type: text/html; charset=utf-8");
If you still have trouble with encoding, the cause may be one of the following:
a database server charset problem (check the encoding of your server)
a database client charset problem (check the encoding of your connection)
a database table charset problem (check the encoding of your table)
a PHP default encoding problem (check the default_charset setting in php.ini)
a misconfigured multibyte extension (see the mbstring settings in php.ini)
a <form> charset problem (check that it is sent as UTF-8)
an <html> charset problem (where no charset is declared in your HTML file)
a Content-Type header problem (where the wrong charset is sent by Apache).
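To narrow down the PHP-side items in this checklist, a small diagnostic along the following lines may help. It is only a sketch: the function name encoding_report() is hypothetical, and it covers the PHP settings only, not the database or Apache.

```php
<?php
// Hypothetical helper: collects PHP settings that influence the output encoding.
function encoding_report(): array
{
    return [
        // Charset PHP adds to the Content-Type header when none is sent explicitly.
        'default_charset' => ini_get('default_charset'),
        // Encoding the mb_* functions assume when none is passed.
        'mb_internal_encoding' => function_exists('mb_internal_encoding')
            ? mb_internal_encoding()
            : 'mbstring not loaded',
        // True once output has started, i.e. too late for header().
        'headers_sent' => headers_sent(),
    ];
}

foreach (encoding_report() as $name => $value) {
    echo $name, ': ', var_export($value, true), "\n";
}
```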
SET NAMES worked for me.
My issue was that on one of my editing pages the field with the foreign characters would not display; on the production web pages there was no problem.
I know you already have an answer. That's great. But strangely none of these answers solved my issue. I'd like to share my answer for the benefit of the others who may encounter the same issues.
I also had the same problems as the OP, with regards to French accents in a multi-lingual application.
But I encountered this issue for the first time when I had to pass (French accented) data as segments in AJAX calls.
Yes, we must have the database set to work with UTF-8. But because the AJAX calls carried the text in the URL (in my case as segments, since I'm using CodeIgniter), I simply had to encode the French text.
To do this on the client side, use the JavaScript encodeURI() function on your data.
And to reverse it in PHP, just use urldecode($MyStr) where the data is received as parameters.
Hope this helps.
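The round trip can be checked from PHP alone. In this sketch rawurlencode() stands in for the client-side encoding (for accented letters and spaces it produces the same percent-encoding), and the helper names are hypothetical.

```php
<?php
// Hypothetical helper: percent-encodes text so it can travel as a URL segment.
function encode_segment(string $text): string
{
    return rawurlencode($text);
}

// Hypothetical helper: reverses the encoding on the receiving side.
function decode_segment(string $segment): string
{
    return urldecode($segment);
}

$original = "cr\u{00E8}me br\u{00FB}l\u{00E9}e"; // "crème brûlée" as UTF-8
$segment  = encode_segment($original);

echo $segment, "\n";                 // cr%C3%A8me%20br%C3%BBl%C3%A9e
echo decode_segment($segment), "\n"; // the original UTF-8 text
```

One caveat: urldecode() also turns a literal + into a space, so rawurldecode() may be the safer choice when the data can contain plus signs.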
Type some text full of French accented characters into your (PHP) file.
Save that file as UTF-8.
Put the line below at the top of your page, before any output:
header('Content-Type: text/html; charset=utf-8');
The page (file) should now look good.
If it looks good, move on to checking the MySQL side (SET NAMES).
I'm including a menubar through php include and for some reason the menubar doesn't display UTF-8 special characters.
The rest of the page works fine, just not the left navigation.
What could be the reason? I tried adding:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
on the included file too, but I still can't get the characters.
I'm testing it here:
http://www.lilianasanmiguel.com.ar/acol1.php
Make sure that you actually save the files in UTF-8 encoding in your editor.
Check whether the file you include is UTF-8 encoded. In Eclipse you can do so by right-clicking the file and checking under "Properties".
Be aware that many standard PHP string functions (strlen, substr, and so on) operate on bytes and effectively assume a single-byte character set, so they can miscount or split UTF-8 characters. Check your code and make sure you're using the mb_* equivalents where that matters.
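A short illustration of the difference between the byte-based functions and their mb_* counterparts, assuming the mbstring extension is loaded:

```php
<?php
$word = "\u{00E9}t\u{00E9}"; // "été": 3 characters, 5 bytes in UTF-8

echo strlen($word), "\n";             // 5, counts bytes
echo mb_strlen($word, 'UTF-8'), "\n"; // 3, counts characters

// substr() cuts in the middle of the two-byte "é" and leaves an invalid sequence.
$broken = substr($word, 0, 1);
var_dump(mb_check_encoding($broken, 'UTF-8')); // bool(false)

// mb_substr() cuts on character boundaries.
$first = mb_substr($word, 0, 1, 'UTF-8');
var_dump($first === "\u{00E9}"); // bool(true)
```

A string cut in the wrong place like $broken is one way a stray � can end up on the page even when every file and header is declared as UTF-8.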
I have a problem in Language Encoding in PHP as my php file should display both English and Arabic Characters.
Some web page parts are static and others are dynamic (data comes from a Sybase database) and the language encoding of database is ok as data is displayed well in it.
My web page has some drop down lists that are dynamic but they display the data in a strange format which is not English or Arabic like squares and unknown symbols.
I checked the possible causes and tried many solutions, such as:
Changing the encoding of the PHP script:
Saving File with the Name : WebPage1 of Type : PHP and Encoding : ANSI or UTF-8 or Unicode.
Changing the HTML encoding declaration:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta http-equiv="Content-Type" content="text/html; charset=windows-1256" />
Changing the PHP encoding declaration:
header('Content-Type: text/html; charset=UTF-8');
header('Content-Type: text/html; charset=windows-1256');
Changing the database tables font and language:
Arial(Arabic).
The problem still exists and I do not know what I can do to solve that.
Can you suggest any solution?
Check your database connection, make sure the sybase_connect connects with UTF-8 as charset.
See http://php.net/manual/en/function.sybase-connect.php
From the comment that you are using ODBC to connect: There seems to be an issue with PHP/ODBC and UTF8. Some suggestions are mentioned in this thread: Php/ODBC encoding problem
Always use UTF-8.
Your first header is correct, except that you should use a single = instead of ==. Make sure you call the header() function before sending any output to the browser.
Open your files in a Unicode-aware editor such as EditPlus or Notepad++, and when saving any source code or HTML file, use Save As and choose UTF-8 on the Save As screen. If you use Eclipse, import your project into Eclipse, right-click it, go to the project settings, and apply UTF-8 as the charset for all source code.
If there's something wrong with data coming from MySQL database, then use appropriate collation on any text storing column (varchar, blob etc). Those are the usual suggestions for it. If you use Sybase, then use Google for collation settings.
And don't change your font to Arabic; Arial already supports it.
You seem to confuse something. Neither UTF-8 nor windows-1256 describe languages, they denote character sets/encodings. Although the character sets may contain characters that are typically used in certain languages, their use doesn’t say anything about the language.
Now as the characters of Windows-1256 are contained in Unicode’s character set and thus can be encoded with UTF-8, you should choose UTF-8 for both languages.
And if you want to declare the language for your contents, read the W3C’s tutorial on Declaring Language in XHTML and HTML.
In your case you could declare your primary document language as en (English) and parts of your document as ar (Arabic):
header('Content-Language: en');
header('Content-Type: text/html;charset=UTF-8');
echo '<p>The following is in Arabic: <span lang="ar">العربية</span></p>';
Make sure to use UTF-8 for both.
I am working in PHP. In my index.php page I have included right.php, which contains Greek text; index.php has the HTML headers. The Greek text is not showing correctly. When I open right.php in Dreamweaver and save the page, it gives a warning about the text. What can I do to solve this? right.php has common content that is used on many pages.
This is all to do with the content type of your pages. Most likely you are trying to save / display the text in latin1 format which doesn't support the characters you are trying to display.
The most sensible thing to do is convert everything to UTF-8. If you're manually editing the text then ensure your text editor (i.e. Dreamweaver) is set to save the files as UTF-8 and then ensure you have the following meta tag on your page
<meta http-equiv="content-type" content="text/html;charset=utf-8" />
Make sure you are saving your files as UTF-8 encoding (check preferences in DreamWeaver to find file encoding). Also make sure your HTML <head> tags include charset similar to this: <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
You can use a different character set if you prefer, but UTF-8 supports the entire Unicode character space, so it's pretty safe.
You have to set file encoding to utf-8 and set it also in <meta> charset tag in <head> HTML.
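To check whether a file such as right.php was really saved as UTF-8, its raw bytes can be inspected. This is a sketch only: the helper name inspect_encoding() is hypothetical, the samples are in-memory strings standing in for file contents (file_get_contents('right.php') would supply the real bytes), and the ISO-8859-7 sample is an assumption about what a legacy Greek encoding might look like.

```php
<?php
// Hypothetical helper: reports whether raw bytes are valid UTF-8 and whether
// they start with a UTF-8 byte order mark.
function inspect_encoding(string $bytes): array
{
    return [
        'valid_utf8' => mb_check_encoding($bytes, 'UTF-8'),
        'has_bom'    => strncmp($bytes, "\xEF\xBB\xBF", 3) === 0,
    ];
}

// In-memory samples standing in for file contents.
$samples = [
    'Greek, UTF-8'      => "\u{03B1}\u{03B2}\u{03B3}", // alpha beta gamma
    'Greek, ISO-8859-7' => "\xE1\xE2\xE3",             // same letters, legacy bytes
    'UTF-8 with BOM'    => "\xEF\xBB\xBFabc",
];

foreach ($samples as $label => $bytes) {
    $info = inspect_encoding($bytes);
    printf(
        "%s: valid UTF-8 = %s, BOM = %s\n",
        $label,
        $info['valid_utf8'] ? 'yes' : 'no',
        $info['has_bom'] ? 'yes' : 'no'
    );
}
```

A file that reports invalid UTF-8 needs to be re-saved in the editor; one that reports a BOM may send output before header() gets a chance to run.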