Encoding problems in PHP / MySQL - php

EDIT: After feedback on my original post, I've changed the text to clarify my problem.
I have the following query (pseudo code):
$conn = mysql_connect('localhost', 'mysql_user', 'mysql_password');
mysql_query("SET NAMES 'utf8'; COLLATE='utf8_danish_ci';");
mysql_query("SELECT id FROM myTable WHERE name = 'Fióre`s måløye'", $conn);
This returns 0 rows.
In my logfile, I see this:
255 Connect root@localhost on
255 Query SET NAMES 'utf8'; COLLATE='utf8_danish_ci'
255 Init DB norwegianfashion
255 Query SELECT id FROM myTable WHERE name = 'Fióre`s måløye'
255 Quit
If I run the query directly in phpMyAdmin, I get the result.
Table encoding: UTF-8
HTML page encoding: UTF-8
I can add records (from form input) where names use accents (e.g. "Fióre`s Häßelberg")
I can read records with accents when using -> "name LIKE '$labelName%'"
The information in the DB looks fine
I have no clue why I can't select any rows whose name has accented characters.
I really hope someone can help me.
UPDATE 1:
I've come to a compromise. I'll be converting accents with htmlentities when storing data, and html_entity_decode when retrieving data from the DB. That seems to work.
The only drawback I see so far is that I can't read the names in cleartext using phpMyAdmin.

I think you should rather return $result than $this->query.
Additionally you should be aware of SQL injection and consider using mysql_real_escape_string or Prepared Statements to protect you against such attacks. addslashes is not a proper protection.

As other answers indicate, this very much seems like an encoding problem. I suggest turning on query logging ( http://dev.mysql.com/doc/refman/5.1/en/query-log.html ) as it can show you what the database really receives.
UPDATE:
I finally found a page explaining the dirty details of PHP and UTF-8 (http://www.phpwact.org/php/i18n/charsets). Also, make sure you read this (http://niwo.mnsys.org/saved/~flavell/charset/form-i18n.html) to understand how to get proper data returned from form posts.

Try this query. If you get results, then it's an issue with your backtick character in the query
SELECT * FROM sl_label WHERE name Like 'Church%'

Maybe try checking for error messages after calling the query (if you aren't already doing this outside that function). It could be telling you exactly what's wrong.
As Artem commented, printing out the actual query is a good idea - sometimes things aren't exactly as you expect them to be.

This might be an encoding issue: the ' in Church's might be a fancy (typographic) character. phpMyAdmin could be using UTF-8 while your own PHP website uses ISO-8859-1 (Latin-1).
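To see why a page in one encoding and a table in another can produce zero matches, here is a minimal sketch that compares the bytes PHP would send for the same visible name in UTF-8 and in ISO-8859-1. The sample word and the byte escapes are illustrative and are not taken from the asker's actual data.

```php
<?php
// The same visible text "måløye", written out byte by byte in two encodings.
$utf8   = "m\xC3\xA5l\xC3\xB8ye"; // UTF-8: å = C3 A5, ø = C3 B8
$latin1 = "m\xE5l\xF8ye";         // ISO-8859-1: å = E5, ø = F8

// PHP strings are plain byte sequences, so these two never compare equal.
var_dump($utf8 === $latin1);      // bool(false)

// Hex dumps make the difference visible, e.g. next to the MySQL query log.
echo bin2hex($utf8), "\n";        // 6dc3a56cc3b87965
echo bin2hex($latin1), "\n";      // 6de56cf87965
```

Running bin2hex() on the string right before it goes into the query is a cheap way to check which of the two forms the script is really sending.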

I'm looking at this line
mysql_query("SET NAMES 'utf8'; COLLATE='utf8_danish_ci';");
and I think it might be an error. With the ';' you are sending two queries to the server, but COLLATE is a clause, not a legal statement on its own. Try:
mysql_query("SET NAMES 'utf8' COLLATE 'utf8_danish_ci'");
If the COLLATE clause is not being accepted by the server, your label column may have a danish_ci collation while the incoming statements use the default (probably utf8_general_ci). There would be no match for the accented characters, but the wildcard works because the representation of the basic ASCII characters is the same.
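For readers on the newer mysqli extension, the same idea might look roughly like the sketch below. It is an assumption-laden illustration, not the asker's code: the function name find_id_by_name is made up, and the table and column names are simply reused from the question. It sets the connection charset through the client library, sets the collation in its own statement, and passes the name through a prepared statement.

```php
<?php
// Sketch only: find_id_by_name is a hypothetical helper; myTable/name come from the question.
function find_id_by_name(mysqli $db, string $name): ?int
{
    // Tell both the client library and the server that this connection uses UTF-8.
    if (!$db->set_charset('utf8')) {
        throw new RuntimeException('Cannot set charset: ' . $db->error);
    }
    // Optional: choose the connection collation as a separate statement.
    if (!$db->query("SET collation_connection = 'utf8_danish_ci'")) {
        throw new RuntimeException('Cannot set collation: ' . $db->error);
    }

    // A prepared statement sends the name separately from the SQL text.
    $stmt = $db->prepare('SELECT id FROM myTable WHERE name = ?');
    if ($stmt === false) {
        throw new RuntimeException('Prepare failed: ' . $db->error);
    }
    $stmt->bind_param('s', $name);
    $stmt->execute();
    $stmt->bind_result($id);
    $found = $stmt->fetch();   // true when a row was fetched, null when there is none
    $stmt->close();

    return $found ? (int) $id : null;
}
```

It would be called with an open connection, for example find_id_by_name($db, 'Fióre`s måløye'), and the PHP file itself has to be saved as UTF-8 for the literal to carry the right bytes.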

Related

Can not insert french string in database mysql php

I have a form with a text input. When I submit the text
Un sac à main de femme recèlerait une quantité importante de bactéries
only "Un sac" is stored in the database.
I have tried addslashes, mysql_real_escape_string, htmlspecialchars, etc., and also UTF-8 encoding, but the whole string still does not get inserted.
You should use utf8_unicode_ci as your column's collation in order for French strings to be stored in it.
In order to store non-US strings in the database, you must ensure that each of the following 3 steps is correctly implemented:
1. Your database table must be set to a charset compatible with French. To be future proof, I recommend creating tables with UTF-8. For more information see the MySQL documentation.
2. Your database connection must be set to a proper character set both when storing and when querying. To do this, use mysqli_set_charset() (or whatever your MySQL connector offers).
3. Your input form AND your view page must be served with the exact character set as your data. To do that, you will need to set the following header: header('Content-Type: text/html; charset=UTF-8'); (If you are using a different charset, change it accordingly.)
You can of course use a different character set for storage and representation but why would you want to do that?
Also, when working with databases and HTML, you should consider:
ALWAYS escape your data as it goes into the database. Use mysqli_real_escape_string() or whatever escape method your database connector offers. Also, do NOT set the connection charset by using SET NAMES UTF8, otherwise your connector library will not know what charset to use for escaping. For more information google "sql injection".
ALWAYS escape your data as it goes into HTML with htmlspecialchars(). Also pay attention to ALWAYS provide the correct character set. For more information google "xss".
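As a small illustration of the HTML side of that advice, the sketch below escapes a value for output and names the character set explicitly. The sample value is made up, and the file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical value read back from the database, stored as UTF-8.
$name = "<b>Fióre's måløye</b>";

// Escape for HTML output. ENT_QUOTES also converts single quotes,
// and the third argument names the encoding of the input string.
$safe = htmlspecialchars($name, ENT_QUOTES, 'UTF-8');

echo $safe, "\n"; // &lt;b&gt;Fióre&#039;s måløye&lt;/b&gt;
```

Note that the accented letters pass through untouched: only the HTML metacharacters are replaced, so the page still has to be served as UTF-8 for them to display correctly.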
After breaking my head for 2 days straight and reading all the possible answers here's what solved the problem and allows me to insert additional weird characters like em dash etc. and retrieve data without seeing weird characters.
Here's the complete step-by-step setup.
The collation of the db column needs to be: utf8_general_ci
The type is: varchar(250)
In the PHP header set the default client character set to UTF8
mysql_set_charset("UTF8", $link);
Set the character set result so we can show french characters
$sql = "SET character_set_results=utf8";
$result = mysql_query($sql);
In the html header specify, so you can view the french characters:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
When inserting the data, do NOT use utf8_decode; just the below will work fine
$query = 'insert into tbl (col) VALUES ("'.mysql_real_escape_string($variable).'")';
Use normal queries to retrieve data, example query:
$query = "select * from tbl;";
Finally got this fixed, hope this is helpful to others.
In the php:
header ('Content-type: text/html; charset=utf-8');
After connection:
mysql_set_charset("utf8");
Just to follow up on this: I was using dbForge Studio and just pasting in French text, and I had all the collations/encodings set properly. The one thing I didn't have set was the actual encoding for the connection to the db. I set it to UTF8 and all was well again. See #2 in @Janoszen's answer.
Had the same problem. The input text came from an ANSI file, so it wasn't actually UTF-8, despite all my utf8 settings. utf8_encode(input_text) solved it.
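A hedged sketch of that situation, using mbstring instead of utf8_encode: it checks whether the input is already valid UTF-8 and converts only if it is not. The source encoding ISO-8859-1 is an assumption here; a Windows "ANSI" file may well be Windows-1252 instead. An invalid byte where the à should be would also be consistent with the text in the question being cut off after "Un sac", though that is a guess.

```php
<?php
// Requires the mbstring extension.
// "Un sac à main" as it would arrive from an ISO-8859-1 file: à is the single byte E0.
$input = "Un sac \xE0 main";

// A lone E0 byte is not valid UTF-8, so the check fails.
var_dump(mb_check_encoding($input, 'UTF-8'));   // bool(false)

// Convert only when needed, naming the source encoding explicitly.
// ISO-8859-1 is an assumption; adjust it to whatever the file really uses.
if (!mb_check_encoding($input, 'UTF-8')) {
    $input = mb_convert_encoding($input, 'UTF-8', 'ISO-8859-1');
}

var_dump(mb_check_encoding($input, 'UTF-8'));   // bool(true)
echo bin2hex($input), "\n";                     // 556e2073616320c3a0206d61696e
```

Checking first matters because converting a string that is already UTF-8 would encode it a second time and produce garbage.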
I have tried htmlentities(); it saves the string as it is in the database.
You should try this to insert special characters in MySQL:
$con = mysql_connect($server, $uname, $pass);
$res = mysql_select_db($database, $con);
mysql_set_charset("latin1", $con);

Superscript character in PHP causing a MySQLi select query to find 0 rows

I am using PHP 5.3.3 and MySQL 5.1.61. The column in question is using UTF-8 encoding and the PHP file is encoded in UTF-8 without BOM.
When doing a MySQLi query with a ² character in SQLyog on Windows, the query executes properly and the correct search result displays.
If I do this same exact query in PHP, it will execute but will show 0 affected_rows.
Here's what I tried:
Using LIKE instead of =
Changing the encoding of the PHP file to ANSI, UTF-8 without BOM, and UTF-8
Doing 'SET NAMES utf-8' and 'latin1' before running the query
Did header('Content-Type: text/html; charset=UTF-8'); in PHP
Escaping using MySQLi::real_escape_string
Doing a filter_var($String, FILTER_SANITIZE_STRING)
Tried a MySQLi stmt bind
The only way I could get it to work properly is if I swapped the ² for a % and changed = to LIKE in PHP.
How can I get it to query properly in PHP when using the ²?
You should be able to get the query to work by ensuring the following:
Prepping PHP for UTF-8
You first need to make sure the PHP pages that will be issuing these queries are served as UTF-8 encoded pages. This will ensure that any UTF-8 output coming from the database is displayed properly. In Firefox, you can check to see if this is the case by visiting the page you're interested in and using the View Page Info menu item. When you do so, you should see UTF-8 as the value for the page's Encoding. If the page isn't being served as UTF-8, you can do so one of two ways. Either you can set the encoding in a call to header(), like this:
header('Content-Type: text/html; charset=UTF-8');
Or, you can use a meta tag in your page's head block:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
Prepping MySQL for UTF-8
Next up, you need to make sure the database is set up to use the UTF-8 encoding. This can be set at the server, database, table, or column levels. If you're on a shared host, you probably can only control the table and column levels of your hierarchy. If you have control of the server or database, you can check to see what character encoding they are using by issuing these two commands:
SHOW VARIABLES LIKE 'character_set_server';
SHOW VARIABLES LIKE 'character_set_database';
Changing the database level encoding can be done using a command like this:
(CREATE | ALTER) DATABASE ... DEFAULT CHARACTER SET utf8;
To see what character encoding a table uses, simply do:
SHOW CREATE TABLE myTable;
Similarly, here's how to change a table-level encoding:
(CREATE | ALTER) TABLE ... DEFAULT CHARACTER SET utf8;
I recommend setting the encoding as high as you possibly can in the hierarchy. This way, you don't have to remember to manually set it for new tables. Now, if your character encoding for a table is not already set to UTF-8, you can attempt to convert it using an alter statement like this:
ALTER TABLE ... CONVERT TO CHARACTER SET utf8;
Be very careful about using this statement! If you already have UTF-8 values in your tables, they may become corrupted when you attempt to convert. There are some ways to get around this, however.
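The kind of corruption meant here is usually double encoding: bytes that are already UTF-8 get treated as a single-byte charset and are encoded once more. The sketch below reproduces the effect in PHP with mbstring. It illustrates the mechanism only and is not MySQL's exact conversion, since MySQL's latin1 differs slightly from ISO-8859-1.

```php
<?php
// Requires the mbstring extension.
// "é" correctly encoded as UTF-8 (two bytes).
$good = "\xC3\xA9";

// If those two bytes are mislabeled as ISO-8859-1 and converted "to UTF-8",
// each byte is treated as its own character and encoded again.
$double = mb_convert_encoding($good, 'UTF-8', 'ISO-8859-1');

echo bin2hex($good), "\n";    // c3a9
echo bin2hex($double), "\n";  // c383c2a9, which displays as "Ã©"

// Reversing the mistaken step restores the original bytes.
$repaired = mb_convert_encoding($double, 'ISO-8859-1', 'UTF-8');
var_dump($repaired === $good); // bool(true)
```

Seeing pairs like "Ã©" where a single accented letter should be is a strong hint that the data was double encoded somewhere along the way.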
Forcing MySQLi to Use UTF-8
Finally, right after you connect to your database and before you run any queries, make sure you issue the appropriate call to say that you are using the UTF-8 encoding. Here's how:
$db = new mysqli(DB_HOST, DB_USERNAME, DB_PASSWORD, DB_NAME);
// Change the character set to UTF-8 (have to do it early)
if (!$db->set_charset("utf8"))
{
    printf("Error loading character set utf8: %s\n", $db->error);
}
Once you do that, everything should hopefully work as expected. The only characters you need to worry about encoding are the big 5 for HTML: <, >, ', ", and &. You can handle that using the htmlspecialchars() function.
If you want to read more (and get links to additional resources), feel free to check out the articles I wrote about this process. There are two parts: Unicode and the Web: Part 1, and Unicode and the Web: Part 2. Good luck!

How to encode cyrillic in mysql?

what's up? :-)
I have a problem and I hope you can help me with it.
A friend of mine has a simple, plain HTML website, and I implemented a little PHP CRUD system for articles. The problem I came across is storing Cyrillic characters in, and reading them from, the MySQL database.
What I want to achieve is this:
The main navigation has some separate sections, whose names, ids and item order I want to store in MySQL, then pull the names out and render each one as a link. The names are supposed to be in Cyrillic.
The problem comes when I use PHP's mysql_fetch_assoc function to display names that were inserted with Cyrillic characters in the database row. The collation of the column is utf8_general_ci, and I end up with ????? instead of the original characters. If I submit Cyrillic characters to MySQL via a form, it shows something like this У.
How can I solve this? Thanks in advance! :-)
Make sure you call this after connecting to database.
mysql_query("SET NAMES UTF8");
Also make sure that HTML file has charset meta tag set to UTF-8 or send header before output.
header("Content-Type: text/html; charset=utf-8");
I had the same problem until I set the collation of the column in my table to 'utf8_bin'.
If it's really mysql_fetch_assoc messing up, you should try mysql_set_charset(). From the docs:
Note: This is the preferred way to change the charset. Using mysql_query() to execute SET NAMES .. is not recommended.
Also make sure your files are saved as UTF-8, and check iconv_set_encoding / iconv_get_encoding.
For anyone having more complex issues with legacy project upgrades from versions before PHP 5.6 and MYSQL 5.1 to PHP 7 & Latest MySQL/Percona/MariaDB etc...
If the project uses utf8_encode($value) you can either try removing the function from the value being prepared and use the accepted answer for setting UTF-8 encoding for all input.
--- OR ---
Try replacing utf8_encode($value) with mb_convert_encoding($value, 'utf-8')
PDO USERS
If you are using PDO here are two ways how to set utf8:
$options = [
\PDO::MYSQL_ATTR_INIT_COMMAND => 'SET NAMES utf8'
];
new \PDO($dsn, $username, $passwd, $options);
--- OR ---
$dsn = 'mysql:host=localhost;charset=utf8;';
new \PDO($dsn, $username, $passwd);
I can confirm that mb_convert_encoding($value, 'utf-8') to SQL table using utf8_unicode_ci works for Cyrillic and Umlaut.

Which collation should I use to store these country names in MySQL?

I am trying to store a list of countries in a mySQL database.
I am having problems storing (non English) names like these:
São Tomé and Príncipe
República de El Salvador
They are stored with strange characters in the db, (and therefore output strangely in my HTML pages).
I have tried using different combinations of collations for the database and the MySQL connection collation:
The "obvious" setting was to use utf8_unicode_ci for both the database and the connection information. To my utter surprise, that did not solve the problem.
Does anyone know how to resolve this issue?
[Edit]
It turns out the problem is not to do with collation, but rather encoding, as pointed out by the col. I notice that at the command line, I can type two separate commands:
SET NAMES utf8
followed by
[QUERY]
where [QUERY] is my SQL statement to retrieve the names, and that works (names are no longer mangled). However, when I do the same thing programmatically (i.e. through code), I still get the mangled names. I tried combining the two statements like this:
SET NAMES utf8; [QUERY]
at the command line, again, this returned the correct strings. Once again, when I tried the same statements through code, I got wrong values.
This is a snippet of what my code looks like:
$mysqli = self::get_db_connection();
$mysqli->query('SET NAMES utf8');
$sql = 'SELECT id, name FROM country';
$results = self::fetch($sql);
the fetch method is:
private static function fetch($query)
{
    $rows = array();
    if (!empty($query))
    {
        $mysqli = self::get_db_connection();
        if ($mysqli->connect_errno)
        {
            self::logError($mysqli->connect_error);
        }
        else
        {
            if ($result = $mysqli->query($query))
            {
                if (is_object($result)) {
                    while ($row = $result->fetch_array(MYSQLI_ASSOC))
                        $rows[] = $row;
                    $result->close();
                }
            }
        }
    }
    return $rows;
}
Can anyone spot what I may be doing that's wrong?
Just to clarify, the HTTP headers in the page are set correctly
'Content-type': 'text/html; charset=utf-8'
so that's not the issue here.
As a matter of fact, collation does nothing of the kind. It is used for ordering and comparison, not for recoding.
It is the encoding that is responsible for the characters themselves.
So, your problem comes not from the table collation but from the connection encoding. A
SET NAMES utf8
query should solve the problem, at least for the newly inserted data.
If you use utf8 everywhere*, it will work. It seems like you forgot something.
*everywhere means: for your database collation and connection, for your (PHP?) script files, and for the pages that are sent to the browser (by setting a meta tag or, better, sending a UTF-8 header)

MySQL: SELECT statement with Chinese and Japanese characters (empty result?)

I'm trying to query my database to get some results in Chinese and Japanese languages as follows:
$str = '日本';
$get_character = mysql_fetch_array (mysql_query("SELECT id FROM `mytable` WHERE ch = '$str'"));
print $get_character[0];
The problem is that it returns nothing. For testing purposes I changed 日本 in the database to "test", and then I do get the right id. What's the problem?
Thanks!
Probably you need to set your connection to UTF-8 (assuming that's what you're using):
mysql_query('SET NAMES "utf8"');
The collation (or maybe encoding) is probably set incorrectly on the field, likely to English or something similar so characters in other languages get mutilated when you try to insert them.
