†characters in mysql database - php

I'm storing some values in a MySQL database. The input is sanitized with mysql_real_escape_string($value) and it displays fine. However, when I query the database directly, I see characters like †in every text field that has been edited using this form. How does that happen? They don't show up when I display the values on web pages, but how can I prevent these characters from appearing at all?
I looked at this question: Strange characters in mysql dbase, which seemed to have some advice on setting the character set on the input, but how can I fix this once and for all? I believe the person who's updating these values is copying and pasting directly from Microsoft Word, so I'm sure it has something to do with the "smart quotes" and other such fancy formatting that MS Word likes to use.

As the answer you linked shows, this comes from PHP, which connects to MySQL with the latin1 encoding by default, so the data is not inserted into the database correctly.
A further problem is that if you read the data back in PHP, it looks correct, because it is "decoded" the same way it was encoded. But if you query the database directly (say, with the mysql client on the console), the data looks broken.
That's why the answer is to issue "SET NAMES utf8" before anything else.
You could also configure the MySQL server to force utf8 on every connection. I do not see any other solution.
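The effect can be reproduced without a database. This is only a sketch, assuming the column is utf8, the connection is latin1 (which MySQL treats as Windows-1252), and the pasted character is a left "smart quote"; the variable names are illustrative:

```php
<?php
// A left smart quote as Word would paste it, encoded as UTF-8 (bytes E2 80 9C).
$pasted = "\xE2\x80\x9C";

// On a latin1 connection each byte is taken to be one Windows-1252
// character and is converted to UTF-8 for storage.
$stored = iconv('Windows-1252', 'UTF-8', $pasted);
echo $stored, "\n"; // three characters, the first two being â and €

// Reading back over the same latin1 connection reverses the mistake,
// which is why the web page still looks right.
$readBack = iconv('UTF-8', 'Windows-1252', $stored);
var_dump($readBack === $pasted);
```

A client that talks UTF-8 to the server sees the stored characters as they really are, which would explain why the garbage only shows up in direct queries.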

Related

Mysql dealing with non-English char input, fine with phpMyAdmin command box, not via PHP

My database deals with Chinese and Japanese characters.
When I insert rows on the phpMyAdmin command box, it works beautifully.
But a problem occurs when I send the query from my website. The query is handled by a PHP file, like:
$query = $_POST['query'];
$result = $dbc->query($query);
The non-English characters just become rubbish in database, like
ID column1
1666 ä½ å¥½å•Šå•Šå•Š
I checked that the php file receives the characters fine, the problem should come from Mysql. All charset is utf-8.
I am new to mysql, please let me know if you may need more info.
Thank you in advance.
column's collation set to utf8_general_ci
More information : https://dev.mysql.com/doc/refman/5.7/en/charset-applications.html
Add this line into your config.inc.php in PHPMyAdmin
$cfg['DefaultCharset'] = 'utf8';
$cfg['DefaultConnectionCollation'] = 'utf8_general_ci';
Since phpMyAdmin works as expected, it sounds like your application is at fault here.
Perhaps you could clarify what you meant by "php file receives the characters fine": how did you determine that? The way I read it, you can create the proper characters from your application and they're displayed correctly in the database through phpMyAdmin, but when you try to display them through your application you get gibberish. In that case it seems the part of the application that retrieves and displays the data is at fault.
Regardless, there are many answers here about gibberish characters. The best place to start is UTF-8 all the way through, and you could also read more at Special characters in PHP / MySQL or Problem with PHP and Mysql UTF-8 (Special Character). Finally, even though it's off the Stack Overflow network, the phpMyAdmin wiki also covers this pretty well.
As a hint, most of the time I see this issue it's caused by not issuing SET NAMES (or using an equivalent function) when setting up the MySQL connection from your PHP script.
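For reference, a minimal sketch of the "equivalent function" with mysqli, assuming $dbc in the question is a mysqli object. The helper name and the credentials are placeholders, not anything from the question:

```php
<?php
// Hypothetical helper; host, user, password and database name are placeholders.
function open_utf8_connection(string $host, string $user, string $pass, string $db): mysqli
{
    // Make mysqli throw exceptions instead of failing silently.
    mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);

    $dbc = new mysqli($host, $user, $pass, $db);

    // Does what SET NAMES utf8 does, and also tells the client library
    // which charset real_escape_string() should assume.
    $dbc->set_charset('utf8');

    return $dbc;
}
```

It would be called once, before any query, e.g. $dbc = open_utf8_connection('localhost', 'user', 'secret', 'mydb'). Note that MySQL's utf8 stores at most three bytes per character; if four-byte characters are expected, utf8mb4 is the charset to use instead.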

Can bad encoding in a MySQL database break AJAX requests (scripted in PHP)?

I've got a weird scenario going on here.
On my localhost running WAMP server (Apache, MySQL, PHP), I've created a webpage that displays a list of messages from a table in my database.
Let's say the DIV container was called: #message-list
This list gets displayed correctly (when the page is launched, PHP renders the whole page).
The HTML markup that PHP echoes-out works just fine.
The MySQL database lookup therefore also works! Great.
Now...
With a bit of AJAX and jQuery magic, I've created a form to add more messages on-the-fly, by sending a POST request to a PHP script that uses the SAME underlying code that generated the initial #message-list DIV.
The AJAX'ed PHP script does two things:
add a record for the new message from the user;
echoes the list (which should be updated with the new message now);
When the AJAX response comes back to the browser, the JavaScript side replaces the old list with the new #message-list content.
Now this... partially works.
What goes wrong - On one given page, it seems some of my previous posts are somehow "corrupting" something inside the AJAX request on the PHP side, resulting in a null response (basically no HTML code gets generated to replace the #message-list DIV tag).
On some other pages though, the AJAX response works fine.
So my question is:
Is it possible that some String data in the Database breaks the execution of my PHP script because of some invalid character, badly encoded, or a quote / double-quote?
I've tried using PHP's htmlentities() and mysql_real_escape_string() functions to solve this, but one of my pages still doesn't properly refresh the list after the AJAX response is received.
Could it just be that I just need to cleanup / sanitize the existing content in my table?
If so, is there any easy script / query I can use to do this?
Thanks!
EDIT #1:
MySQL version = 5.5.24-log
By using mysql_client_encoding, this shows "latin1" (ah HA! That may be the issue then!)
In PHP, using the mysql_... methods (such as mysql_connect, mysql_select_db, mysql_query, etc.);
Sample of database table with possible issue:
http://pastebin.com/PjLVmXEF
By the looks of it, many developers say PDO is recommended. I'll give that a shot and see if all errors vanish. Thanks a lot for your help so far everyone!
EDIT #2:
My current solution has been this:
I've used these queries to modify my database and the table with the encoding problem:
// SQL queries:
ALTER DATABASE timegrasp charset=utf8;
ALTER TABLE tg_messages CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
Second, I noticed some characters in a specific record were not displaying correctly (an  around some double-quoted sentences). So I manually backspaced and reinserted the double-quote in MySQL Query Browser to be sure it was completely gone.
On the PHP side, I only encode the messages on the way "in" to the database, with this:
$htmlConverted = htmlentities( $pMessage, ENT_COMPAT | ENT_HTML401, "UTF-8" );
return mysql_real_escape_string( $htmlConverted );
And make sure I begin my MySQL connection with this:
mysql_set_charset("utf8", $DB_LINK);
Then, I can just read the String directly from the table without any decoding / conversion.
Finally, to test this - I copied the same message from the source (a Skype chat with my client) which had the special characters, pasted it in my web form, and now it works fine! :)
I'm not certain all the steps and parameters above were necessary, but this is what helped fix my issue.
It would be good to know for future reference, though, whether any of this is bad practice or a common "don't" when handling special characters in MySQL tables.
The PHP json_encode function refuses to process strings that are not valid UTF-8: it fails instead of producing output (it returns false in current PHP versions; older versions produced null for the offending string). If you don't set a character encoding for the database connection, it's possible that some other encoding is being used, not UTF-8, and the data you pass to this function is in fact not valid UTF-8.
If you mention details such as which database API and which connection parameters you are using, I could give further advice...
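A small sketch of that failure, assuming the AJAX script builds its response with json_encode (the question does not say so) and that the string came back from a latin1 connection. The sample string and array key are made up for illustration:

```php
<?php
// "\xE9" is "é" in latin1; on its own it is not a valid UTF-8 sequence.
$fromLatin1Connection = "caf\xE9";

$json = json_encode(['message' => $fromLatin1Connection]);
$error = json_last_error();
var_dump($json);                      // bool(false)
var_dump($error === JSON_ERROR_UTF8); // bool(true)

// Converting to UTF-8 first gives json_encode something it can handle.
$utf8 = iconv('Windows-1252', 'UTF-8', $fromLatin1Connection);
$fixedJson = json_encode(['message' => $utf8], JSON_UNESCAPED_UNICODE);
echo $fixedJson, "\n";
```

Setting the connection charset to utf8 makes the manual conversion step unnecessary, since the server then hands back UTF-8 in the first place.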

PHP mysql fixed connection to utf8, but now existing greek data is useless

I have a MySQL database storing some fields in Greek characters. In my HTML I have charset=utf-8 and my database columns are defined with the collation utf8_general_ci. But I was not setting the connection encoding so far. As a result I have a database that doesn't display the Greek characters well, but when reading back in PHP, it all shows well.
Now I'm trying to do this the right way, so I also added this to my database functions:
$mysqli->set_charset("utf8");
This works great for new entries.
But for existing entries, the problem is that when I read the data in PHP, it comes back garbled, since the connection encoding has now changed.
Is there a way to fix my data and make them useful again? I can continue working my old way, but I know it's wrong and can cause me more problems in the future.
I solved this issue as follows:
In a PHP script, retrieve the information as before, i.e. without setting the connection charset. This way the mistake is inverted and corrected, and the PHP script has the characters in correct UTF-8.
In the same PHP script, write the information back over a connection that is set to UTF-8.
At this point the correct characters are in the database.
Finally, I changed all the read/write functions of the site to use UTF-8 from now on.
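The steps above could be sketched roughly as follows. This is an assumption-laden outline, not the exact script used: the table `items` and its columns `id` and `title` are hypothetical, and both function names are invented for the example.

```php
<?php
// True when $s is well-formed UTF-8 (the /u modifier makes PCRE validate it).
function is_valid_utf8(string $s): bool
{
    return preg_match('//u', $s) === 1;
}

// $oldStyle: connection opened the old way, with no charset set.
// $utf8:     connection on which set_charset('utf8') has been called.
// Table `items` and columns `id`, `title` are hypothetical.
function repair_rows(mysqli $oldStyle, mysqli $utf8): int
{
    $update = $utf8->prepare('UPDATE items SET title = ? WHERE id = ?');
    $fixed = 0;

    // Reading over the old connection undoes the original mistake.
    $result = $oldStyle->query('SELECT id, title FROM items');
    while ($row = $result->fetch_assoc()) {
        $id = (int) $row['id'];
        $title = $row['title'];

        // Skip NULLs and anything that does not come back as valid UTF-8.
        if ($title === null || !is_valid_utf8($title)) {
            continue;
        }

        // Writing over the utf8 connection stores the text correctly.
        $update->bind_param('si', $title, $id);
        $update->execute();
        $fixed++;
    }

    return $fixed;
}
```

This should run once, on a backed-up table, and only over rows written before the connection charset was fixed; rows that are already correct would be damaged by a second pass.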

Mysql Collation for arabic using php

I am facing a problem with a MySQL database. I can't save Arabic text into the database, even though I changed the collation to cp1256_general_ci and tried other collations. I couldn't get much help from searching.
If anyone can help me out, please do.
I have changed the collation at database level, as well as at column level for some fields, to cp1256_general_ci.
Please suggest how I should set this up, as I am new to PHP and MySQL.
I also wrote a simple INSERT statement to insert the input data into MySQL. Do I have to take any special care when inserting data into MySQL if it is in Arabic?
The short answer is: Use UTF-8 everywhere. "Everywhere" means
In all forms that are used to store data in your database
The database connection
The database tables and columns
In all pages that output data from the database.
If you have existing CP-1256 data (or incoming data in that character set that you can't change) you can use iconv() to convert it into UTF-8.
Note that when using UTF-8 you need to make sure you use multibyte-safe string functions. Standard functions like strlen() are only mapped to mb_strlen() if mbstring function overloading (mbstring.func_overload) is enabled; that feature was deprecated in PHP 7.2 and removed in PHP 8.0, so it is safer to call the mb_* functions explicitly. To find out whether overloading is active on your server, see the manual entry on the issue.
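A short sketch of the iconv() conversion and of why byte-counting functions mislead on UTF-8 text. The sample bytes are the Arabic word مرحبا (hello) in CP-1256, chosen for illustration only:

```php
<?php
// The word مرحبا as CP-1256 bytes: one byte per letter.
$cp1256 = "\xE3\xD1\xCD\xC8\xC7";

// Convert to UTF-8 before storing it in a utf8 column.
$utf8 = iconv('CP1256', 'UTF-8', $cp1256);

// strlen() counts bytes; each of these letters takes two bytes in UTF-8.
echo strlen($utf8), "\n";                // 10
// iconv_strlen() counts characters in the given encoding.
echo iconv_strlen($utf8, 'UTF-8'), "\n"; // 5
```

iconv_strlen() is used here only because it ships with the same extension as iconv(); mb_strlen() from mbstring gives the same character count.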

MySQL Database has non escaped single quotes in entries ... How to show them?

The database has a ton of entries that were not escaped, because they were typed in manually when they were inserted, so they look like: Don't inside of the entry. But when I try to display them, they show weird characters in my PHP output. Before putting anything into the database I would usually use mysqli_real_escape_string and then do the same when I go to retrieve the data, but since the data is already stored without using real_escape, how do I display it properly?
The character being displayed instead of the single quotes looks like this: �
If it helps the data is stored as 'text'.
Thanks!
For future users of the same problem here's the steps:
Check your website headers to see what the encoding is
Check your mysql table columns and make sure they match.
If they don't, change them to match. utf8_general in mysql and utf8 in my HTML worked for me
You will have to go back through the old mysql tables and update them so the new encoding is set properly.
New entries should work fine
When you output your results in PHP (or I guess whatever language you use), you may need stripslashes() if the stored text picked up extra backslashes from being escaped twice. mysqli_real_escape_string itself is only for building queries, not for output.
You'll need to read up on text encoding.
The usual solution is to make sure everything (the content-type encoding on your pages, and your mysql) are set to UTF-8
Chances are your data is Latin1 and you're displaying UTF-8 or vice versa
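A quick way to see that mismatch, as a sketch only: it assumes the stored apostrophe is the Windows-1252 "curly" one (byte 0x92), which is a guess about the data in the question.

```php
<?php
// "Don’t" as Latin1/Windows-1252 bytes: the apostrophe is the single byte 0x92.
$latin1 = "Don\x92t";

// On its own, 0x92 is not valid UTF-8, so a UTF-8 page would show
// the replacement character in its place.
$isValidUtf8 = preg_match('//u', $latin1) === 1;
var_dump($isValidUtf8); // bool(false)

// After conversion the apostrophe is a proper three-byte UTF-8 sequence.
$utf8 = iconv('Windows-1252', 'UTF-8', $latin1);
echo htmlspecialchars($utf8, ENT_QUOTES, 'UTF-8'), "\n";
```

If the connection charset is set to utf8, MySQL performs this conversion itself for latin1 columns, so the manual iconv() call is only needed when the bytes reach PHP unconverted.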
