php json_encode utf8 char problem ( mysql ) [duplicate] - php

I am writing form data (submitted with jQuery) to the database after encoding it with json_encode.
However, the data ends up corrupted in the database.
$db->query("SET NAMES utf8");
$kelime = array("Merhaba","Dünya");
$bilgi = json_encode($kelime);
$incelemeEkle = "
INSERT INTO incelemeRapor SET
bigData = '".$bilgi."'
";
$db->query($incelemeEkle);
Database table schema:
CREATE TABLE `incelemeRapor` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`bigData` text COLLATE utf8_unicode_ci,
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=2 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
Example of the data as stored in MySQL:
["Merhaba","Du00fcnya"]

Always escape your data before putting it in a SQL query:
$incelemeEkle = "
INSERT INTO incelemeRapor SET
bigData = '".mysql_real_escape_string($bilgi)."'
";
(added mysql_real_escape_string() call)
json_encode() encodes non-ascii characters with the \u<code-point> notation; so json_encode(array("Merhaba","Dünya")); returns ["Merhaba","D\u00fcnya"].
Then this string is embedded in a SQL query:
INSERT INTO incelemeRapor SET
bigData = '["Merhaba","D\u00fcnya"]'
There is no special meaning for the escape sequence \u, so MySQL just removes the \; and this results in ["Merhaba","Du00fcnya"] being stored in database.
So if you escape the string, the query becomes:
$incelemeEkle = "
INSERT INTO incelemeRapor SET
bigData = '["Merhaba","D\\u00fcnya"]'
";
And ["Merhaba","D\u00fcnya"] is stored in the database.
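The effect described above can be reproduced without a database. The sketch below is only a rough model: mysqlStripUnknownEscapes() is a hypothetical helper that imitates how MySQL drops the backslash from escape sequences it does not recognise (real escapes such as \n are not modelled), and addslashes() merely stands in for the connection-aware mysql_real_escape_string().

```php
<?php
// Hypothetical helper: a simplified model of how MySQL reads a string
// literal. A backslash followed by any character yields that character,
// so "\u" becomes "u" and "\\" becomes "\". Real escapes such as \n or \0
// are not modelled here.
function mysqlStripUnknownEscapes($literal)
{
    return preg_replace('/\\\\(.)/s', '$1', $literal);
}

// "\xC3\xBC" is the UTF-8 byte sequence for "ü".
$json = json_encode(array("Merhaba", "D\xC3\xBCnya"));

// Unescaped: the backslash is lost and the \u sequence is corrupted.
$storedRaw = mysqlStripUnknownEscapes($json);

// Escaped first (addslashes() stands in for a real escaping function):
// the doubled backslash collapses back to a single one.
$storedEscaped = mysqlStripUnknownEscapes(addslashes($json));

echo $json, "\n";          // what json_encode() produced
echo $storedRaw, "\n";     // what ends up stored without escaping
echo $storedEscaped, "\n"; // what ends up stored with escaping
```

In real code the escaping should come from the database driver (or from bound parameters), not from addslashes().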

I tried mysql_real_escape_string(), but it did not work for me (it resulted in an empty field in the database).
So I looked here: http://php.net/manual/fr/json.constants.php and the flag JSON_UNESCAPED_UNICODE worked fine for me:
$json_data = json_encode($data,JSON_UNESCAPED_UNICODE);
JSON_UNESCAPED_UNICODE is available only since PHP 5.4.0 !
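For comparison, here is a small sketch of what the flag changes. It assumes PHP 5.4 or later and UTF-8 input. Note that the unescaped output contains raw multi-byte characters, so the connection and column must really be UTF-8, and the string still has to be escaped or bound as a parameter because of the double quotes it contains.

```php
<?php
$data = array("Merhaba", "D\xC3\xBCnya"); // "\xC3\xBC" is "ü" in UTF-8

$escaped   = json_encode($data);                         // default: \uXXXX escapes
$unescaped = json_encode($data, JSON_UNESCAPED_UNICODE); // PHP >= 5.4: raw UTF-8

echo $escaped, "\n";
echo $unescaped, "\n";

// Both forms decode to the same array.
var_dump(json_decode($escaped, true) === json_decode($unescaped, true));
```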

So in addition to ensuring that your database is using utf8_unicode_ci, you also want to make sure PHP is using the proper encoding. Typically I run the following two commands at the top of any function which is going to potentially have foreign characters within them. Even better is to run it as one of the first commands when your app starts:
mb_language('uni');
mb_internal_encoding('UTF-8');
Those two lines have saved me a ton of headaches!

Like user576875 says, you just need to correctly treat your string before inserting it into the database. mysql_real_escape_string() is one way to do that. Prepared statements are another way. This will also save you from the SQL injection security issue that you might be susceptible to if you write user input directly into SQL. Always use one of the above two methods.
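A minimal sketch of the prepared-statement route, using the table and column names from the question. As an assumption made only so the example runs without a server, an in-memory SQLite database stands in for MySQL. SQLite does not treat backslashes specially, so this shows the mechanics of binding rather than the MySQL-specific corruption.

```php
<?php
// Assumption: an in-memory SQLite database stands in for MySQL so the
// sketch runs without a server. With MySQL only the DSN would differ,
// e.g. 'mysql:host=localhost;dbname=test;charset=utf8'.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE incelemeRapor (id INTEGER PRIMARY KEY, bigData TEXT)');

$bilgi = json_encode(array("Merhaba", "D\xC3\xBCnya"));

// The value is bound as a parameter, so it is never parsed as SQL text
// and the backslash in \u00fc reaches the column untouched.
$stmt = $pdo->prepare('INSERT INTO incelemeRapor (bigData) VALUES (:bigData)');
$stmt->execute(array(':bigData' => $bilgi));

$stored = $pdo->query('SELECT bigData FROM incelemeRapor')->fetchColumn();
echo $stored, "\n";
```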
Also, note that this has little to do with UTF-8. JSON output is ASCII-safe by default, so as long as you use an ASCII-compatible character set (utf8, iso-8859-1), the data will be inserted and stored correctly.

I would apply BASE64 encoding to the JSON string. This should work with nearly every php setting, database, database version and setting:
$values = array("Test" => 1, "the" => 2, "West" => 3);
$encoded = base64_encode(json_encode($values));
$decoded = json_decode(base64_decode($encoded), true);

Related

How do you escape strings when writing data Migrations in Phinx?

I've got some simple updates inside my migration with strings that may contain special characters. For example:
$this->execute("UPDATE `setting` SET `classname` = 'org\foo\Bar' WHERE `id` = 1 ");
The problem is that when a value such as org\foo\Bar is inserted into MySQL, the \ is treated as an escape character. For each database Phinx supports, I'm sure there are special characters in strings that need handling; when using PDO directly you'd get around this with prepared statements and bound parameters.
Is there any native way in phinx to escape strings or do I need to fall back on something like PDO::quote()?
As alluded to in Charlotte's comments on the question, it doesn't look like this feature exists. The workaround is the following:
Grab the PDO connection
Use quote(), or manually construct a query using the connection directly
Here's my code example using quote():
public function change()
{
$conn = $this->getAdapter()->getConnection();
$quotedString = $conn->quote('org\foo\Bar');
$this->execute("UPDATE `setting` SET `classname` = $quotedString WHERE `id` = 1 ");
}

Android plus PHP special characters

I'm currently sending info by webservice from Android to a function in PHP which sends that data to the database.
Although, if I have accents in my words like names, I receive this error from PHP:
Incorrect string value: '\xE1\xE1mos ...' for column 'firstname' at row 1
The names could have accents, like:
António, João, etc.
In PHP function, before inserting into database, I did this without success:
$db->set_charset('utf8');
$query = $db->prepare("SET NAMES 'utf8';");
$query->execute();
If I send the data without any kind of accents, the service works perfectly.
Edit: Solved using utf8_encode(variable).
When sending data to your WebServices in Android, try URL encoding the strings before putting them in the URL/form data, like:
// ...
String encodedName = URLEncoder.encode(name, Http.UTF_8);
// use encodedName instead of name to pass as parameters to the webservices
On the PHP side, the values in $_GET are already URL-decoded, so read the parameter directly; calling urldecode() on it again can mangle values that contain % or +.
$name = $_GET['my_param'];
And keep using UTF-8 in the database. Also, check that your tables have been created with a UTF-8 collation.
It is not enough to specify the encoding of the connection. You also need to change the character set of your tables and database to UTF-8.
To do so, run the following:
ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_general_ci;
ALTER TABLE tbl_name CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
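The bytes in the error message are a useful hint here: \xE1 is "á" in ISO-8859-1 but is not valid UTF-8 on its own, which suggests the client sent Latin-1 text. That would explain why utf8_encode() helped, since it converts ISO-8859-1 to UTF-8 (the function is deprecated as of PHP 8.2). The sketch below shows a more defensive variant; toUtf8() is a hypothetical helper, it requires the mbstring extension, and it assumes that any input which is not valid UTF-8 is ISO-8859-1.

```php
<?php
// Hypothetical helper: returns $value as UTF-8, assuming that anything
// which is not already valid UTF-8 was sent as ISO-8859-1.
function toUtf8($value)
{
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value; // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1');
}

$latin1 = "Jo\xE3o";     // "João" encoded as ISO-8859-1
$utf8   = "Jo\xC3\xA3o"; // "João" encoded as UTF-8

echo bin2hex(toUtf8($latin1)), "\n"; // converted to UTF-8
echo bin2hex(toUtf8($utf8)), "\n";   // left unchanged
```

Checking first avoids double-encoding names that already arrive as UTF-8, which a blind utf8_encode() call would do.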

Displaying fields from MySQL table but no field with quotes being printed

I am fetching data from a MySQL table to display it on a page. The script displays the information, but in my table normal quotes were inserted as typographic quote characters such as ( ’ ) and ( “ ” ), which Microsoft Word 2010 produces automatically; Word was used to type most of the entries in the table. So my guess is that those are special characters. But whenever I test displaying a field with actual straight quotes ( ' ) and ( " " ), I receive a "mysql_fetch_row() expects parameter 1 to be resource, boolean given" error. This is the code I use:
$result = mysql_query("SELECT `question` FROM {$db_table_alt}");
while($field = mysql_fetch_row($result)) {
foreach($field as $fields) {
//build a unique section ID based on the ID that the Question belongs to
$uid = mysql_query("SELECT `id` FROM `questions` WHERE `question` LIKE '%$fields%'");
while($uidfield = mysql_fetch_row($uid)) {
But whenever I use this line
$fields = mysql_real_escape_string(stripslashes($fields));
the field with real quotes is displayed, but with backslashes before the quotes.
Can somebody help me find a solution to this, please?
If mysql_fetch_row() is complaining about a boolean, that means mysql_query() returned false, i.e. the SQL had an error (mysql_error() will tell you what it was). A query that merely matches no rows still returns a resource.
If you're getting backslashes before the quotes in your returned data, then they are getting put in the database. That sounds like magic_quotes are enabled. You really want to turn that off as it's an obsolete and broken solution to a problem.
Also, I think you're going to have to learn about character encodings. A default MySQL install will not be UTF8, I'm afraid; it will probably be latin1. Word used to like writing text in Windows-1252, which is not quite the same. And then it gets more complicated depending on what the browser, the website and other things that talk to the database use. I believe phpMyAdmin tries to run in UTF8, so data will get converted on its way into your tables if they're not UTF8. This will also affect your queries looking for the "smart quotes".
You seem to have two distinct problems here.
One where $result is evaluating to a boolean, which means there is an error in the query generated (SELECT question FROM {$db_table_alt}). Try echoing that query out and manually running it. It may be that the table/view named by {$db_table_alt} does not exist.
The second is a string escaping problem. I expect the quotes are escaped in the database - using mysql_real_escape_string on the query will not alter whether the returned results are escaped or not.
Also, if your data in the database is escaped by slashes, you should read up on magic quotes and what to do about them: PHP docs on magic quotes. You should not have to do any string escaping when pulling data out of the DB.

JSON specialchars JSON php 5.2.13

I'm going crazy over these encoding problems...
I use json_decode and json_encode to store and retrieve data. What I found out is that JSON always needs UTF-8. No problem there. I give json_encode 'hellö' in UTF-8, and in my DB it looks like hellu00f6. OK, a code point. But when I use json_decode, it won't decode the code point back, so I still have hellu00f6.
Also, in PHP 5.2.13 it seems there are still no optional flags for the JSON functions. How can I convert the code point characters back to the correct special characters for display in the browser?
Greetz and thanks
Maenny
It could be because of the backslash preceding the code point in the JSON Unicode string: ö is represented as \u00f6. When it is stored in your DB, the DBMS doesn't know how to interpret \u00f6, so I guess it reads (and stores) it as u00f6.
Are you using an escaping function?
Try adding a backslash before the Unicode-escaped characters:
$json = str_replace("\\u", "\\\\u", $json);
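To see why the lost backslash matters on the way back out, here is a small sketch comparing the two stored forms from the question. json_decode() can only restore the character if the \u escape is still intact.

```php
<?php
$intact  = '"hell\u00f6"'; // backslash survived the INSERT
$damaged = '"hellu00f6"';  // backslash was stripped by the database

$ok  = json_decode($intact);  // the \u00f6 escape is decoded to "ö"
$bad = json_decode($damaged); // a plain string, nothing left to decode

echo bin2hex($ok), "\n"; // hex dump of the decoded UTF-8 bytes
echo $bad, "\n";
```

So the fix belongs on the write path (escaping or bound parameters); once the backslash is gone, json_decode() has nothing to work with.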
The preceding post already explains, why your example did not work as expected.
However, there are some good coding practices when working with databases, which are important to improve the security of your application (i.e. prevent SQL-injection).
The following example intends to show some of these practices, and assumes PHP 5.2 and MySQL 5.1. (Note that all files and database entries are stored using UTF-8 encoding.)
The database used in this example is called test, and the table was created as follows:
CREATE TABLE `test`.`entries` (
`id` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`data` VARCHAR( 100 ) NOT NULL
) ENGINE = InnoDB CHARACTER SET utf8 COLLATE utf8_bin
(Note that the encoding is set to utf8_bin.)
The PHP code below is used both for adding new entries and for creating the JSON output:
<?php
$conn = new PDO('mysql:host=localhost;dbname=test','root','xxx');
$conn->exec("SET NAMES 'utf8'"); // Enable UTF-8 charset for db-communication ..
if(isset($_GET['add_entry'])) {
    header('Content-Type: text/plain; charset=UTF-8');
    // Add new DB-Entry:
    $data = $conn->quote($_GET['add_entry']);
    if($conn->exec('INSERT INTO `entries` (`data`) VALUES ('.$data.')')) {
        $id = $conn->lastInsertId();
        echo 'Created entry '.$id.': '.$_GET['add_entry'];
    } else {
        $info = $conn->errorInfo();
        echo 'Unable to create entry: '. $info[2];
    }
} else {
    // application/json is the registered media type for JSON
    header('Content-Type: application/json; charset=UTF-8');
    // Output DB-Entries as JSON:
    $entries = array();
    if($res = $conn->query('SELECT * FROM `entries`')) {
        $res->setFetchMode(PDO::FETCH_ASSOC);
        foreach($res as $row) {
            $entries[] = $row;
        }
    }
    echo json_encode($entries);
}
?>
Note the use of the method $conn->quote(..) before passing data to the database. As mentioned in the preceding post, it would be even better to use prepared statements, since they take care of the escaping. Thus, it would be better to write:
$prepStmt = $conn->prepare('INSERT INTO `entries` (`data`) VALUES (:data)');
if($prepStmt->execute(array('data'=>$_GET['add_entry']))) {...}
instead of
$data = $conn->quote($_GET['add_entry']);
if($conn->exec('INSERT INTO `entries` (`data`) VALUES ('.$data.')')) {...}
Conclusion: Using UTF-8 for all character data stored or transmitted to the user is reasonable. It makes developing internationalized web applications much easier. To make sure user input is passed safely to the database, using an escape function is a good idea. Better still, prepared statements make development easier and improve your application's security, since they prevent SQL injection.

Encoding problems in PHP / MySQL

EDIT: After feedback on my original post, I've changed the text to clarify my problem.
I have the following query (pseudo code):
$conn = mysql_connect('localhost', 'mysql_user', 'mysql_password');
mysql_query("SET NAMES 'utf8'; COLLATE='utf8_danish_ci';");
mysql_query("SELECT id FROM myTable WHERE name = 'Fióre`s måløye'", $conn);
This returns 0 rows.
In my logfile, I see this:
255 Connect root@localhost on
255 Query SET NAMES 'utf8'; COLLATE='utf8_danish_ci'
255 Init DB norwegianfashion
255 Query SELECT id FROM myTable WHERE name = 'Fióre`s måløye'
255 Quit
If I run the query directly in phpMyAdmin, I get the result.
Table encoding: UTF-8
HTML page encoding: UTF-8
I can add records (from form input) where names uses accents (e.g. "Fióre`s Häßelberg")
I can read records with accents when using -> "name LIKE '$labelName%'"
The information in the DB looks fine
I have no clue why I can't select any rows whose name has accented characters.
I really hope someone can help me.
UPDATE 1:
I've come to a compromise. I'll be converting accents with htmlentities when storing data, and with html_entity_decode when retrieving data from the DB. That seems to work.
The only drawback I see so far is that I can't read the names in cleartext using phpMyAdmin.
I think you should rather return $result than $this->query.
Additionally you should be aware of SQL injection and consider using mysql_real_escape_string or Prepared Statements to protect you against such attacks. addslashes is not a proper protection.
As other answers indicate, this very much seems like an encoding problem. I suggest turning on query logging ( http://dev.mysql.com/doc/refman/5.1/en/query-log.html ) as it can show you what the database really receives.
UPDATE:
I finally found a page explaining the dirty details of PHP and UTF-8 (http://www.phpwact.org/php/i18n/charsets). Also, make sure you read this (http://niwo.mnsys.org/saved/~flavell/charset/form-i18n.html) to understand how to get proper data returned from form posts.
Try this query. If you get results, then it's an issue with your backtick character in the query
SELECT * FROM sl_label WHERE name Like 'Church%'
Maybe try checking for error messages after calling the query (if you aren't already doing this outside that function). It could be telling you exactly what's wrong.
As Artem commented, printing out the actual query is a good idea - sometimes things aren't exactly as you expect them to be.
This might be an encoding issue, the ' in Church's might be a fancy character. PHPMyAdmin could be UTF-8, and your own PHP website could be iso-latin1.
I'm looking at this line
mysql_query("SET NAMES 'utf8'; COLLATE='utf8_danish_ci';");
and I think it might be an error. With the ';' you are sending two queries to the server, but COLLATE is a clause, not a legal statement on its own. Try:
mysql_query("SET NAMES 'utf8' COLLATE 'utf8_danish_ci'");
If the COLLATE clause is not being accepted by the server, you might have the problem that your label column has a danish_ci collation while the incoming statements use the default (probably utf8_general_ci). There would be no match for the accented characters, but the wildcard works because the representation of the basic ASCII characters is the same.
