I'm working on a website which uses / stores accented characters in the database. I have the page template set so that the config.php charset variable matches the setting, e.g.:
<meta charset="<?php echo $this->config->item('charset');?>">
The problem I'm having is, when $config['charset'] is set to UTF-8, the form validation fails and it's as if no characters were submitted if an accented character was included. So, for example, a required field will bounce back if á is included anywhere in the string. The string minus the á works fine.
I've managed to get this working by changing $config['charset'] to ISO-8859-1 and converting text to UTF-8 before inserting into / after retrieving from the database with PHP's utf8_encode() and utf8_decode(). Is this the best way, or am I missing something needed to get UTF-8, with accented characters, working in CodeIgniter?
Any advice appreciated.
You have to make sure that you use UTF-8 everywhere, and that both PHP and MySQL are configured to handle UTF-8.
In the html, add the meta-tag:
<meta charset="utf-8" />
And save the file itself in UTF-8 format, for example with Notepad++.
Define the MySQL tables to support UTF-8 by creating them with:
DEFAULT CHARSET=utf8;
And set the connection to:
mysql_set_charset('utf8', $con);
Enable UTF-8 in the php.ini:
default_charset = "utf-8"
For a full guide, see "Handling Unicode Front To Back In A Web App".
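As a rough illustration of the connection step on a current PHP version: the mysql_* functions shown above were removed in PHP 7, so this sketch uses PDO and sets the charset in the DSN instead. The helper names buildUtf8Dsn() and connectUtf8(), the host, the database name and the credentials are all made up for the example, and utf8mb4 is assumed in place of utf8 so that 4-byte characters are covered too.

```php
<?php
// Hypothetical helper: builds a PDO DSN that asks MySQL to use utf8mb4
// on the connection (the charset DSN parameter is honoured since PHP 5.3.6).
function buildUtf8Dsn(string $host, string $dbName): string
{
    return "mysql:host={$host};dbname={$dbName};charset=utf8mb4";
}

// Hypothetical helper: opens the connection. Credentials are placeholders.
function connectUtf8(string $host, string $dbName, string $user, string $pass): PDO
{
    return new PDO(buildUtf8Dsn($host, $dbName), $user, $pass, [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);
}

echo buildUtf8Dsn('localhost', 'mydb'), "\n";
// prints: mysql:host=localhost;dbname=mydb;charset=utf8mb4
```

Only buildUtf8Dsn() runs here; connectUtf8() needs a reachable MySQL server and real credentials.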
Related
I've recently come across a weird problem with the meta charset.
If I don't set any charset in my header, all accents like é, è, à are shown correctly (even variables from PHP), except that text coming from my database is replaced by little question marks in a lozenge.
If I set one of these charsets in my header (I tried both of them)
<meta http-equiv="Content-Type" content="text/html" charset="iso-8859-15" />
<meta charset="UTF-8">
Text from my database is okay, but everything else shows the little question marks instead of the accents.
My database character set is UTF-8 Unicode, and the collation is utf8_general_ci.
Note that I'm using Smarty, but I didn't change the charset in its config because its default is UTF-8.
Okay, I found a solution. I'm using an ORM, and I only had to add charset=utf8 in the set_connections method like this
$config->set_connections(array(
'development' => 'mysql://user:pass#localhost/mydb;charset=utf8')
);
I'm hoping for an explanation of why some UTF-8 text is being saved to a database table incorrectly...
I created an HTML form and the page's meta content is set to UTF-8:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
The PHP and template files are all Unicode/UTF-8.
The form field data is submitted to a utf8_unicode_ci encoded database table.
If I submit the form with characters such as "éçä" (which I created from Windows' Character Map program set to the Unicode character set), they show up incorrectly in the database ("Ã©Ã§Ã¤"). I'm viewing the database via phpMyAdmin (which is also set to UTF-8 character encoding).
However, if I run iconv() on the string to convert it to ISO-8859-1 before inserting it into the database, then the characters show correctly:
$input = iconv("UTF-8", "ISO-8859-1//TRANSLIT", $input);
What is going on? Shouldn't the fact that everything is UTF-8/Unicode from beginning to end result in it being correct in the DB? What am I doing wrong, and why did converting the data to ISO-8859-1 work?
The only other thing done to the data is a FILTER_SANITIZE_MAGIC_QUOTES:
$input = filter_var($input,FILTER_SANITIZE_MAGIC_QUOTES);
Thank you for your time and input.
Two steps you haven't mentioned:
Specify UTF-8 in HTTP Content-Type header
Specify UTF-8 when connecting to MySQL, e.g. specifying charset in PDO
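To illustrate why the second step matters: if the connection charset is left at a Latin-1 default (an assumption about this setup, but a common one), MySQL treats each byte of the UTF-8 input as a separate Latin-1 character and converts it to UTF-8 a second time on the way into the utf8 column. The sketch below reproduces that double encoding in PHP alone with mb_convert_encoding(); no database is involved and the variable names are only illustrative.

```php
<?php
// "éçä" written as explicit UTF-8 bytes, so the example does not depend
// on the encoding this file is saved in.
$input = "\xC3\xA9\xC3\xA7\xC3\xA4";

// Simulate the mistake: treat the UTF-8 bytes as ISO-8859-1 and convert
// them to UTF-8 a second time.
$doubleEncoded = mb_convert_encoding($input, 'UTF-8', 'ISO-8859-1');

echo bin2hex($input), "\n";         // c3a9c3a7c3a4 (6 bytes)
echo bin2hex($doubleEncoded), "\n"; // c383c2a9c383c2a7c383c2a4 (12 bytes)

// The iconv() workaround: once the input really is ISO-8859-1, it matches
// what the connection claims, so a single Latin-1 -> UTF-8 conversion
// yields the correct stored value.
$latin1 = mb_convert_encoding($input, 'ISO-8859-1', 'UTF-8');
$stored = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

echo bin2hex($latin1), "\n";        // e9e7e4
var_dump($stored === $input);       // bool(true)
```

Declaring UTF-8 on the connection removes the need for either conversion, since MySQL then takes the bytes as they are.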
I have some texts in French (containing accented characters such as "é"), stored in a MySQL table whose collation is utf8_unicode_ci (both the table and the columns), that I want to output on an HTML5 page.
The HTML page charset is UTF-8 (<meta charset="utf-8" />) and the PHP files themselves are encoded as "UTF-8 without BOM" (I use Notepad++ on Windows). I use PHP5 to query the database and generate the HTML.
However, on the output page, the special characters (such as "é") appear garbled and are replaced by "�".
When I browse the database (via phpMyAdmin) those same accented characters display just fine.
What am I missing here?
(Note: changing the page encoding (through Firefox's "web developer" menu) to ISO-8859-1 solves the problem... except for the special characters that appear directly in the PHP files, which then become corrupted. But anyway, I'd rather understand why it doesn't work as UTF-8 than change the encoding without understanding why it works. ^^;)
I experienced the same problem before, and this is what I did:
1) Use Notepad++ (it can handle almost any encoding) or Eclipse, and be sure to open and save the files as UTF-8 without BOM.
2) set the encoding in the PHP header, using header('Content-type: text/html; charset=UTF-8');
3) remove any extra spaces at the start and end of my PHP files.
4) set the collation of all my tables and columns to utf8mb4_general_ci or utf8mb4_unicode_ci via phpMyAdmin or any other MySQL client. A comparison of the two collations is worth reading before choosing.
5) set the MySQL connection charset to UTF-8 (I use PDO for my database connection)
PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8"
PDO::MYSQL_ATTR_INIT_COMMAND => "SET CHARACTER SET utf8"
or just execute those SQL queries before fetching any data
6) use a meta tag <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
7) declare the content language, here French
<meta http-equiv="Content-language" content="fr" />
8) change the html element lang attribute to the desired language
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="fr" lang="fr">
(I will keep adding to this list, because I had a really hard time solving this problem when I was dealing with Japanese characters in past projects.)
9) Some fonts are not available on the client PC; you may need to include a web font, for example from Google Fonts, in your CSS
10) Don't end your PHP source file with ?>
NOTE:
If nothing above works, try adjusting your encoding to the character set you really want to display. In my case I set everything to Shift-JIS to display all my Japanese characters, and it works fine. But using UTF-8 should be your priority.
This works for me
Make your database utf8_general_ci
Save your files in Notepad++ as UTF-8 without BOM
Put $mysqli->query('SET NAMES utf8'); after the connection to the database in your PHP file
Put <meta charset="utf-8" /> in your HTML files
Works perfectly.
If your php.ini default_charset is not set to UTF-8, you need to send a Content-Type header that declares the encoding of your output. Apply the following header at the top of your file(s):
header("Content-type: text/html; charset=utf-8");
If you still have trouble with encoding, the cause may be one of the following:
a database server charset problem (check encoding of your server)
a database client charset problem (check encoding of your connection)
a database table charset problem (check encoding of your table)
a PHP default encoding problem (check the default_charset setting in php.ini)
a misconfigured multibyte extension (see the mbstring settings in php.ini)
a <form> charset problem (check that it is sent as utf-8)
an <html> charset problem (where no charset is declared in your HTML file)
a Content-Type header problem (where the wrong charset is sent by Apache).
SET NAMES worked for me.
My issue was that on one of my editing pages the field with the foreign characters would not display, while on the production web pages there was no problem.
I know you already have an answer. That's great. But strangely none of these answers solved my issue. I'd like to share my answer for the benefit of the others who may encounter the same issues.
I also had the same problems as the OP, with regards to French accents in a multi-lingual application.
But I encountered this issue for the first time when I had to pass (French accented) data as segments in AJAX calls.
Yes, we must have the database set to work with UTF-8. But because the AJAX calls carried the data in the URL (in my case as segments, since I'm using CodeIgniter), I simply had to encode the French text.
To do this on the client side, use the JavaScript encodeURI() function on your data.
And to reverse it in PHP, just use urldecode($MyStr) where data was received as parameters.
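A small sketch of that round trip, seen from the PHP side. The sample word is made up, and rawurlencode() only stands in for the browser here: for this particular string it produces the same percent-encoding that encodeURI() would.

```php
<?php
// "café" with "é" written as explicit UTF-8 bytes.
$original = "caf\xC3\xA9";

// Stand-in for the client: JavaScript's encodeURI("café") yields "caf%C3%A9".
$encoded = rawurlencode($original);
echo $encoded, "\n";              // caf%C3%A9

// Server side: urldecode() restores the original UTF-8 bytes.
$decoded = urldecode($encoded);
var_dump($decoded === $original); // bool(true)
```

The URL then contains only ASCII characters, so nothing on the way can reinterpret the accented bytes.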
Hope this helps.
Type some text full of French accented characters into your PHP file
Save that file as UTF-8
Paste the line below at the top of your page script
header('Content-Type: text/html; charset=utf-8');
The page (file) should now look good.
If it does, move on to the MySQL side and set the connection charset (SET NAMES).
I am building a webpage in HTML with PHP and MySQL, and I ran into trouble with the Swedish characters ÅÄÖ when running the page. They show up as � instead of Å/Ä/Ö.
I have set the charset to UTF-8 in both HTML meta-tag and via PHP:
<?php
header('Content-type: text/html; charset=UTF-8');
?>
<meta charset="UTF-8">
Also, MySQL runs utf8_general_ci collation on all tables.
All files should also be encoded and saved as UTF-8 without Unicode Signature (BOM) and no normalization form.
All this has worked flawlessly before, but today, no matter what I try, I end up with � instead of Å/Ä/Ö. Is there a good way to debug this and find the problem?
Are any of my steps unnecessary, or have I forgotten anything?
What you need from deceze's article is the part regarding the SET NAMES:
mysql_set_charset('utf8', $connection); //not mysql_query("SET NAMES 'utf8'");
Just add that at the beginning of your PHP code, after the database connection has been opened.
You could try saving your PHP files in UTF-8 encoding. I assume the files are written in something else (possibly ISO-xxxx or ANSI).
To do that with Notepad++, select all the lines and copy them to the clipboard, change the encoding to UTF-8 without BOM in the Encoding menu, then paste over everything and save.
Is this for only a few records, or for all records with Swedish characters?
You can change the page encoding manually in the browser settings. This is how you test it: change it to latin1/iso-8859-1 to see whether the characters that are wrong as utf-8 then display correctly.
Chances are someone is using a browser that does not support utf8, or has fiddled with the encoding manually.
Also make sure your db connection is utf8 too. (set names utf8;)
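On the question of how to debug this, one option is to look at the raw bytes of a value just before it is echoed. The sketch below uses mb_check_encoding() and bin2hex(); the function name describeBytes() and the sample strings are made up for illustration. A Latin-1 "Å" is the single byte C5, which is not valid UTF-8 on its own, and that is the kind of byte a browser renders as � on a UTF-8 page.

```php
<?php
// Hypothetical helper: shows a string's bytes and whether they form valid UTF-8.
function describeBytes(string $value): string
{
    $valid = mb_check_encoding($value, 'UTF-8') ? 'valid UTF-8' : 'NOT valid UTF-8';
    return bin2hex($value) . ' (' . $valid . ')';
}

echo describeBytes("\xC3\x85"), "\n"; // "Å" in UTF-8:      c385 (valid UTF-8)
echo describeBytes("\xC5"), "\n";     // "Å" in ISO-8859-1: c5 (NOT valid UTF-8)
```

Running a value fetched from the database through such a check shows whether the connection delivered UTF-8 or Latin-1 bytes.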
Currently in my application, the UTF-8 encoded data is corrupted by PHP's internal encoding.
How can I make it consistent with UTF-8?
EDIT: To show examples, please tell me how to output the current internal encoding in PHP.
In php.ini I found the following:
default_charset = "iso-8859-1"
Which means Latin1.
How do I change it to UTF-8? Say, what's the ISO version of UTF-8?
Change it to:
default_charset = "utf-8"
There is no ISO version of UTF-8.
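If editing php.ini is not possible, the same setting can usually be changed at runtime, and the current values can be printed to see what PHP is using. A minimal sketch, assuming the mbstring extension is loaded:

```php
<?php
// Override the php.ini default for this request.
ini_set('default_charset', 'UTF-8');

// Set the encoding the mb_* functions assume when none is passed.
mb_internal_encoding('UTF-8');

echo ini_get('default_charset'), "\n"; // UTF-8
echo mb_internal_encoding(), "\n";     // UTF-8
```

Called without an argument, mb_internal_encoding() returns the current value, so it also answers the question of how to output the internal encoding.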
You'll need to be specific about the details, since the encoding can be mangled in many different places in your PHP application.
The common problem areas are:
Saving and retrieving from DB:
The database encoding must be the same as that of the strings sent to it from PHP, or you must convert the strings to the DB encoding.
PHP4's single byte string functions:
PHP's byte-oriented functions such as strlen() and substr() do not produce the correct results on multibyte encodings such as UTF-8, since they count and cut single bytes rather than characters.
Page encoding:
Make sure the browser knows you are sending it UTF-8.
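The single-byte point is easy to see on a short string. A minimal sketch, assuming the mbstring extension is loaded; the sample word is made up:

```php
<?php
$word = "\xC3\xA9t\xC3\xA9"; // "été" as UTF-8 bytes: 3 characters, 5 bytes

echo strlen($word), "\n";             // 5 (bytes)
echo mb_strlen($word, 'UTF-8'), "\n"; // 3 (characters)

// substr() cuts in the middle of "é"; mb_substr() respects character boundaries.
echo bin2hex(substr($word, 0, 1)), "\n";             // c3   (half a character)
echo bin2hex(mb_substr($word, 0, 1, 'UTF-8')), "\n"; // c3a9 ("é")
```

The half character returned by substr() is an invalid UTF-8 sequence, which a browser shows as �.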
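You can also convert the accented characters to HTML entities in your PHP page, so that they display regardless of the page encoding. Use the following function, passing the variable itself (not a quoted '$old_value', which PHP would treat as literal text):
$new_value = htmlentities($old_value, ENT_COMPAT, "UTF-8");
You can also add the following to the HTML head section: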
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
I hope this will help to solve your problem.