I'm using Amazon RDS and I'm having difficulty reading UTF-8 data from the DB. Although the data is stored with UTF-8 encoding in the database, the results come back garbled and show up as bad characters.
The DB was transferred from another MySQL database (not Amazon RDS) and when the code communicates with that database everything is fine.
I checked the character set and collation of every table and of the DB itself; they are all UTF-8 with utf8_general_ci.
The pages are using utf-8 encoding like this
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
I also tried passing this query "SET NAMES utf8", still didn't help.
I noticed someone else has had the same problem with RDS
Can't store UTF-8 in RDS despite setting up new Parameter Group using Rails on Heroku
The solution given there was to specify the character set in the connection string. That apparently worked for Rails, but I don't think PHP has such a thing as a connection string.
This is how I connect to mysql
mysqli_connect($this->host, $this->login, $this->pw, $this->database)
Also if I change the data type of those columns to binary data types such as BLOB they will work properly.
Instead of a connection string, what you're looking for is most likely mysqli::set_charset(), which changes the default client character set:
if (!$mysqli->set_charset("utf8")) {
printf("Error loading character set utf8: %s\n", $mysqli->error);
} else {
printf("Current character set: %s\n", $mysqli->character_set_name());
}
Related
I'm having a problem getting UTF-8 names written into a MySQL database... Here's what I have.
PHP page head has....
<meta charset="utf-8">
the MySQL column is: Char (80) with utf8_unicode_ci (these were originally latin1... I've changed them to UTF-8, truncated the database, then rerun the code)
The variable echoes to screen as: Germán Mera
but it is written to the database as: GermÃ¡n Mera
I tried putting utf8_encode(); around the variable, but then it writes to database as: Germán Mera and screen as Germán Mera (I know that command only works on iso-8859-1.. I think the JSON page is already UTF-8)
Here is an excerpt of the code I am using to get the name (for sake of simplicity, I'm only showing relevant code - I know what's shown below is not secure)
$str = file_get_contents('http://fantasy.mlssoccer.com/web/api/elements/647/');
$jsonarray = json_decode($str, true);
$name = $jsonarray['web_name'];
mysqli_query ($con, "INSERT INTO mlsprices (name) VALUES ('$name')");
Any idea how I can get this to write to the database properly? When I search, I only get quite complicated answers (eg, this) and there's surely an easier way.
Try setting the connection character set to UTF-8 right after connecting to MySQL (this has the same effect as SET NAMES 'utf8'):
$con = mysqli_connect("host", "user", "pw", "db");
if (!$con) {
    die('Failed to connect to MySQL: ' . mysqli_connect_error());
}
/* change character set to utf8 */
if (!$con->set_charset("utf8")) {
    printf("Error loading character set utf8: %s\n", $con->error);
}
As the manual says:
SET NAMES indicates what character set the client will use to send SQL
statements to the server... It also specifies the character set that the server should
use for sending results back to the client.
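To see why the connection character set matters, here is a minimal sketch that needs no database. It assumes the connection is left at latin1 while PHP sends UTF-8 bytes, and it imitates the server-side conversion with mbstring (my stand-in, not what MySQL literally runs). The name is the one from the question.

```php
<?php
// Simulates a latin1 connection receiving UTF-8 bytes.
// Assumption: mb_convert_encoding() stands in for the server's conversion.

$name = 'Germán Mera';   // UTF-8 bytes, as decoded from the JSON feed

// The server treats every incoming byte as one latin1 character and
// converts each of them to UTF-8 before storing it in the utf8 column.
$stored = mb_convert_encoding($name, 'UTF-8', 'ISO-8859-1');

echo bin2hex('á'), "\n";   // c3a1: one character, two bytes
echo $stored, "\n";        // GermÃ¡n Mera
```

Once the connection is declared as UTF-8, the server no longer performs that extra conversion and the bytes are stored as sent.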
I've got a login screen that checks entered username and password against a MySQL database.
My problem is that it doesn't recognize Swedish characters like "ÅÄÖ".
For example, the password "lösenord" is in the database but it isn't accepted, however "losenord" is.
The database has "utf8_general_ci" connection collation and I've set the charset to UTF-8 in my index.html but not in my php scripts.
I've read what feels like a million different ways to solve UTF 8 issues like this but I can't get it to work.
If someone could at least point me in the right direction I would be very thankful.
Do I need to encode each mysql query, set some META tag?
Cheers
Try setting the connection character set to UTF-8 right after connecting to MySQL (this has the same effect as SET NAMES 'utf8'):
$con = mysqli_connect("host", "user", "pw", "db");
if (!$con) {
    die('Failed to connect to MySQL: ' . mysqli_connect_error());
}
/* change character set to utf8 */
if (!$con->set_charset("utf8")) {
    printf("Error loading character set utf8: %s\n", $con->error);
}
As the manual says:
SET NAMES indicates what character set the client will use to send SQL
statements to the server... It also specifies the character set that the server should
use for sending results back to the client.
Also use utf8_swedish_ci in your table, otherwise string comparison will go wrong and MySQL will treat 'ö' and 'o' as the same character.
I am trying to convert a string from HTML-ENTITIES to UTF-8 and then save the encoded string in my database. The html entities are greek letters and look for example like this: νω
Now I tried thousands of different ways, starting from just using utf8_encode or html_entity_decode until now I came across the function mb_convert_encoding().
Now the really weird thing is that when converting my string and then outputting it, it is correctly encoded to utf-8, but when inserting this string into my database I end up getting something like: ξÏνω.
This is the code for the encoding:
header('Content-Type: text/html; charset=utf-8');
mb_internal_encoding('utf-8');
......
while($arr = $select->fetch_array(MYSQLI_ASSOC))
{
$text = $arr["greek"];
$result = mb_convert_encoding($text, 'UTF-8', 'HTML-ENTITIES');
$mysqli->query("UPDATE some SET greek = '".$result."'");
}
When I output my query and then run it manually in phpMyAdmin it works fine, so it doesn't seem to be a problem with my DB. There must be some problem when transferring the encoded string to my database...
As you see in your script, you are instructing the browser to use UTF8. That is the first step.
However your database needs the same thing and also the encoding/collation on the tables need to be UTF8 too.
You can either recreate your tables using utf8_general_ci or utf8_unicode_ci as the collation, or convert the existing tables (see here)
You need to also make sure that your database connection i.e. php code to mysql is using UTF8. If you are using PDO there are plenty of articles that show how to do that. The simplest way is to do:
$mysqli->query('SET NAMES utf8');
NOTE The change you will make now is final. If you change the connection encoding to your database, you could affect existing data.
EDIT You can do the following to set the connection
$mysqli = new mysqli($host, $user, $pass, $db);
if (!$mysqli->set_charset("utf8")) {
    die(sprintf("Error loading character set utf8: %s\n", $mysqli->error));
}
$mysqli->close();
Links of interest:
Whether to use "SET NAMES"
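For the entity-decoding step in the question, a small sketch using html_entity_decode() instead of mb_convert_encoding() with 'HTML-ENTITIES' (that pseudo-encoding is deprecated as of PHP 8.2). The sample entities below are made up for illustration, not taken from the asker's table.

```php
<?php
// Decode HTML entities straight into a UTF-8 string.
// The input is a hypothetical sample: two named entities and one numeric one.

$text = '&nu;&omega; &#973;';

// ENT_HTML5 selects the HTML5 entity table; the third argument is the
// encoding of the returned string.
$decoded = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $decoded, "\n";           // νω ύ
echo bin2hex($decoded), "\n";  // cebdcf8920cf8d
```

The decoded string is plain UTF-8, so it will only survive the UPDATE if the connection character set is UTF-8 as well.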
Execute the SET NAMES 'utf8' query prior to any others.
I'm trying to save French accents in my database, but they aren't saved like they should be in the DB. For example, an "é" is saved as "Ã©". I've tried setting my files to "Unicode (utf-8)", and the fields in the DB are "utf8_general_ci", as is the DB itself. When I look at my data posted through AJAX with Firebug, I see the accent passed as "é", so it's correct. Thanks, and let me know if you need more info!
Personally I solved the same issue by adding after the MySQL connection code:
mysql_set_charset("utf8");
or for mysqli:
mysqli_set_charset($conn, "utf8");
or the mysqli OOP equivalent:
$conn->set_charset("utf8");
Sometimes you'll also have to set PHP's internal character encoding by adding this code:
mb_internal_encoding('UTF-8');
On the client HTML side you have to add the following header data :
<meta http-equiv="Content-type" content="text/html;charset=utf-8" />
To return JSON from AJAX calls (e.g. when using jQuery), set the response header and echo the encoded data:
header("Content-Type: application/json; charset=utf-8");
echo json_encode($some_data);
This should do the trick
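A related thing to watch for with JSON responses: json_encode() only accepts valid UTF-8. The sketch below uses made-up sample values to show what happens when a string arrives as latin1 bytes, for example from a connection whose character set was never set.

```php
<?php
// json_encode() requires UTF-8 input. Sample values are illustrative.

$utf8   = ['name' => 'é'];      // valid UTF-8 (bytes c3 a9)
$latin1 = ['name' => "\xE9"];   // the same letter as a single latin1 byte

$escaped = json_encode($utf8);                          // {"name":"\u00e9"}
$literal = json_encode($utf8, JSON_UNESCAPED_UNICODE);  // {"name":"é"}

$broken  = json_encode($latin1);                        // false
$isUtf8Error = json_last_error() === JSON_ERROR_UTF8;   // true

echo $escaped, "\n", $literal, "\n";
var_dump($broken, $isUtf8Error);
```

Both encoded forms are equivalent for a JSON parser; JSON_UNESCAPED_UNICODE just keeps the output readable.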
The best bet is that your database connection is not UTF-8 encoded - it is usually ISO-8859-1 by default.
Try sending a query
SET NAMES utf8;
after making the connection.
mysqli_set_charset($conn, "utf8");
If you use PDO, you must instantiate it like this:
new \PDO("mysql:host=$host;dbname=$schema", $username, $password, array(\PDO::MYSQL_ATTR_INIT_COMMAND => 'SET NAMES utf8') );
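As an alternative to the init command, PDO's MySQL driver also accepts a charset in the DSN itself (honoured since PHP 5.3.6), which is the closest PHP gets to a "connection string". A minimal sketch; buildMysqlDsn() is a hypothetical helper and the host and schema names are placeholders.

```php
<?php
// Builds a PDO MySQL DSN that carries the connection character set.
// buildMysqlDsn() is a hypothetical helper, not part of PDO.

function buildMysqlDsn(string $host, string $schema, string $charset = 'utf8mb4'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $schema, $charset);
}

$dsn = buildMysqlDsn('localhost', 'mydb');
echo $dsn, "\n";   // mysql:host=localhost;dbname=mydb;charset=utf8mb4

// Usage, commented out because it needs a reachable server:
// $pdo = new PDO($dsn, $username, $password, [
//     PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
// ]);
```

utf8mb4 is MySQL's full four-byte UTF-8 and requires MySQL 5.5.3 or later; pass 'utf8' as the third argument on older servers.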
Use UTF-8:
Set a meta tag in your <head>:
<meta http-equiv="Content-type" content="text/html;charset=utf-8" />
When you connect to your MySQL DB, force the encoding so you don't have to touch your MySQL settings:
$conn = mysql_connect('server', 'user', 'password') or die('Could not connect to mysql server.');
mysql_select_db('mydb') or die('Could not select database.');
mysql_set_charset('utf8',$conn); //THIS IS THE IMPORTANT PART
If you use AJAX, set your encoding like this:
header('Content-type: text/html; charset=utf-8');
Have you reviewed http://dev.mysql.com/doc/refman/5.0/en/charset-unicode.html:
Client applications that need to
communicate with the server using
Unicode should set the client
character set accordingly; for
example, by issuing a SET NAMES 'utf8'
statement. ucs2 cannot be used as a
client character set, which means that
it does not work for SET NAMES or SET
CHARACTER SET. (See Section 9.1.4,
“Connection Character Sets and
Collations”.)
Further to that:
if you get data via php from your
mysql-db (everything utf-8) but still
get '?' for some special characters in
your browser (), try this:
after mysql_connect() , and
mysql_select_db() add this lines:
mysql_query("SET NAMES utf8");
worked for me. i tried first with the
utf8_encode, but this only worked for
äüöéè... and so on, but not for
kyrillic and other chars.
You need to a) make sure your tables are using a character encoding that can encode such characters (UTF-8 tends to be the go-to encoding these days) and b) make sure that your form submissions are being sent to the database in the same character encoding. You do this by saving your HTML/PHP/whatever files as UTF-8, and by including a meta tag in the head that tells the browser to use UTF-8 encoding.
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
Oh, and don't forget c): when connecting to the database, make sure you're actually using the correct character set by executing SET NAMES utf8 right after the connection is made.
The PHP manual (php.net) advises against setting the charset after connecting with a query like SET NAMES utf8, because the functions that escape data inside MySQL statements might then not work as intended.
Do not use SET NAMES utf8 but use the appropriate ..._set_charset() function (or method) instead, in case you are using PHP.
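A rough sketch of that advice with mysqli, assuming a server that supports utf8mb4 (MySQL 5.5.3 or later; use 'utf8' otherwise). Both function names are hypothetical helpers, and the table and column are borrowed from the INSERT in the earlier question. Using a prepared statement also sidesteps the escaping concern entirely.

```php
<?php
// Hypothetical helpers: open a connection with the charset set through the
// API rather than a SET NAMES query, then insert via a prepared statement.

function openUtf8Connection(string $host, string $user, string $pw, string $db): mysqli
{
    // Make mysqli throw exceptions instead of returning false on errors.
    mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);

    $con = new mysqli($host, $user, $pw, $db);
    $con->set_charset('utf8mb4');   // the driver now knows the real charset
    return $con;
}

function insertName(mysqli $con, string $name): void
{
    // The value travels separately from the SQL text, so no manual escaping.
    $stmt = $con->prepare('INSERT INTO mlsprices (name) VALUES (?)');
    $stmt->bind_param('s', $name);
    $stmt->execute();
    $stmt->close();
}

// Usage, commented out because it needs a reachable server:
// $con = openUtf8Connection('host', 'user', 'pw', 'db');
// insertName($con, 'Germán Mera');
```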
Ok I have found a working solution for me :
Run this mysql command
show variables like 'char%';
Here you have many variables : "character_set_server", "character_set_system" etc.
In my case I have "Ã©" for "é" in the database and I want to show "é" on my website.
To work I have to change "character_set_server" value from "utf8mb4" to "latin1".
All my correct values are:
And the other values are:
With these values the wrongly stored accents in the database are corrected and displayed properly by the server.
But each case can be different.
I have nearly completed the task of overhauling my web app to be properly "UTF-8 aware". I have found, though, that if I set the connection character set to utf8 using mysqli_set_charset, the result is that output appears incorrectly (indeed it appears as though the page's character encoding had been misidentified), whereas if I do not set the connection character set, it appears correctly.
For example, a string stored in one table in my database - the column's character set is utf8 - is echoed properly as Página principal if I do not set the connection character set. If I do set the connection character set, it appears as PÃ¡gina principal.
Details: The PHP scripts I am using to test this behaviour have the following meta tag in the <head> section:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
I have determined that the default character set for new connections on my host is latin1. The database I am connecting to has default character set utf8. Here is the code used to create the database link:
$cxn = @mysqli_connect('localhost', $db_user, $db_password, $db_database) or die('Failed to connect to the database.');
mysqli_set_charset($cxn, 'utf8');
mysqli_query($cxn, 'SET SESSION sql_mode = \'TRADITIONAL\'');
Additionally: In case it should serve as some extra forensic evidence, I have found that if I view the page in Firefox and manually change the character encoding to ISO-8859-1, the aforementioned string appears as Página principal.
This probably results from the character encoding of the connection having been different when the data was inserted, so the data may be stored in the wrong encoding (is the table encoding set to utf8?). Check whether freshly inserted data, i.e. data inserted over a utf8 connection, comes back fine.
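If the old rows do turn out to be double-encoded, they can often be repaired by reversing the extra conversion. The sketch below is a heuristic, not a guaranteed fix: repairDoubleEncoded() is a hypothetical helper, and it assumes the damage came from exactly one UTF-8-read-as-ISO-8859-1 round trip. Back up the table before applying anything like this to real data.

```php
<?php
// Heuristic repair of text that was UTF-8 encoded twice.
// repairDoubleEncoded() is a hypothetical helper; requires mbstring.

function repairDoubleEncoded(string $value): string
{
    // Undo one layer: map each character back to a single byte.
    $candidate = mb_convert_encoding($value, 'ISO-8859-1', 'UTF-8');

    // Accept the result only if nothing was lost on the way (round trip
    // matches) and the recovered bytes are themselves valid UTF-8.
    $roundTrip = mb_convert_encoding($candidate, 'UTF-8', 'ISO-8859-1');
    if ($roundTrip === $value && mb_check_encoding($candidate, 'UTF-8')) {
        return $candidate;
    }
    return $value;   // already fine, or not repairable this way
}

echo repairDoubleEncoded('PÃ¡gina principal'), "\n";  // Página principal
echo repairDoubleEncoded('Página principal'), "\n";   // Página principal (unchanged)
```

Correctly stored strings are returned untouched, so the function can be run over a mix of old and freshly inserted rows.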