I am having a little issue with storing mcrypt_module_open('rijndael-256','','ofb',''); in a MySQL db.
When it inserts the encrypted data into the MySQL db it looks like this ˜9ÏÏd‰.
It should look like this
÷`¥¶Œ"¼¦q…ËoÇ
I am wondering if I have to do something to get it to work?
Use a blob field type for storing binary data (BLOB, VARBINARY, BINARY)
If you're not doing this already: escape your values with the proper methods if you're using them directly in a SQL-statement. Or even better: use query parameters/prepared statements.
As a last resort you could just encode your data with either base64_encode or bin2hex.
If you want to display binary data on the console or in the browser (even for debugging purpose) use one of those encodings too. Otherwise you might not see the actual data because the browser might not display your binary correctly.
In general, it might be a good idea to base64 encode and decode binary data like this. See Best way to use PHP to encrypt and decrypt passwords? .
Have you tried to Collation of your table that your character supports.
The characters '÷`¥¶Œ"¼¦q…ËoÇ' looks like UTF-8 or someother charset, find charset of your characters and update table Collation based your charset
Related
I noticed that when doing database queries in PHP (e.g. Zend_db, mysqli...), you can set the character set. For example: mysqli_set_charset($con,"utf8"); I'm a little foggy as to what this actually does behind the scenes.
If I use php to do a database SELECT query, and I indicate a character set, what happens if it's not the same character set that the column was defined as in the database?
I mean, the database returns a binary sequence, but what is actually returned if the character is not encoded the same in the two character sets? Will mySQL take the internal binary data and return it "As-is"?
Or will MySQL try to convert it to the binary sequence that's the equivalent in the character set you indicated?
I guess the gist of my question is that I would like to know how the data is encoded when PHP is sending in the query, how it's transmitted back from MySQL, and whether there's another step of translation after PHP gets it back and stores it into a string in PHP internal memory.
Similarly, if you're doing an INSERT or update, how does it get sent from PHP to MySQL? Does PHP convert it to the correct binary encoding THEN send it into MySQL?
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Update:
Thanks to Raymond Nijland. I was able to fix my bug. But I did notice that for nonstandard characters, the charset does seem to matter.
I did a select statement using $db = new \PDO("mysql:host=$host;dbname=$database;charset=latin1", $dbuser, $dbpassword);. First, I tried latin1, then I tried utf8.
My problem was that I had a column with encrypted data, which I guess had some wierd characters in it. if I did an md5 on that field directly in the database query, it gave me an encoding that began with 889... BUT, I tried to pulled it into PHP with a SELECT statement. If I used PDO with a charset of latin1, then did an MD5() inside of PHP, it gives me the same hash (889...). Which implies that PHP has an exact copy of the binary that's in the database. BUT if I did used PDO with charset 'UTF-8', then did an MD5() in PHP, it gave me a hash beginning with 087... So somewhere, a conversion must be taking place.
At this point, my orignal bug is fixed, but I'm still curious as to what's happening. Is MYSQL doing the conversion before returning it to PHP, or does PDO do some sort of conversion on the PHP side?
mysqli_set_charset($con,"utf8"); (or other code for other client libraries) declares to MySQL that the encoding in the client will be MySQL's CHARACTER SET utf8. If bytes with a different encoding are sent to (think INSERT) mysql, garbage or errors will occur.
That setting also declares that the client desires that encoding from SELECTs.
The CHARACTER SET on each column in each table may be something else (eg, "latin1"). If so, MySQL will attempt to convert the encoding during the transmission.
Caution: MySQL's CHARACTER SET utf8 is a subset of the well-known UTF-8. To get the latter, use CHARACTER SET utf8mb4 in tables and mysqli_set_charset($con,"utf8mb4"); when connecting.
Going forward, utf8mb4 is preferred in most situations.
Non-text stuff ("as-is") should be put into BLOB or VARBINARY columns -- this bypasses any checking of the encoding. (Think a .jpg or AES_ENCRYPT.)
MySQL's MD5() function returns a hex string. UNHEX(MD5('...')) return binary stuff and must be store in, say, a BINARY(16) column.
Many forms of garbled text are discussed in Trouble with UTF-8 characters; what I see is not what I stored .
I'm getting an XML API response, parsing the data I need, but want to store full XML response in MySQL for add'l data use later.
I intially stored the XML in a BLOB, but found special characters sometimes in values break the INSERT.
So, I first convert XML with htmlentities into a BLOB to keep original API response data integrity. Is this a good way to do it, or is there a better method?
Properly escaped, any data can be INSERTed into a BLOB. htmlentities is not the way. For the mysqli API, use mysqli::real_escape_string. For PDO use binding.
If you are running on Windows, be sure to fetch the data in binary mode, else CR/LF may get "fixed" during the read.
In theory XML should be UTF-8, so simply putting into TEXT CHARACTER SET utf8mb4 should work instead of BLOB.
If all else fails, consider converting (in the client) the blob into Base64. It will be 4/3 bigger.
I have this following
$html = <div>ياں ان کي پرائيويٹ ليمٹڈ کمپنياں ہيں</div>
But it is being stored in the mysql database as following format
تو يہ اسمب
لي ميں غر
يب کو آنے
نہيں
Actually, When I retrieve the data from mysql database and shows it on the webpage it is shown correctly.
But I want to know that Is it the standard format of unicode to store in the database, or the unicode data should be stored as it is (ياں ان کي پرائيويٹ ليمٹڈ کمپنياں ہيں)
When you store unicode in your database...
First off, your database has to be set as 'utf-general', which is not the default. With MySQL, you have to set both the table to utf format, AND individual columns to utf. In addition to this, you have to be sure that your connection is a utf-8 connection, but doing that varies based on what method you use to store the unicode text into your database.
To set your connection's char-set, if you are using Mysqli, you would do this:
$c->set_charset('utf8'); where $c is a Mysqli connection.
Still, you have to change your database charsets like I said before.
EDIT: I honestly don't think it matters MUCH how you store it, though I store it as the actual unicode characters, because that way if some user were to input '& #1610;' into the database, it wouldn't be retrieved as a unicode character by mistake.
EDIT: Here is a good example, if you remove that space between & and #1610; in my answer, it will be mistakenly retrieved from the server as a unicode character, unless you want users to be able to create unicode characters by using a code like that.
Not a perfect example since stackoverflow does that on purpose, and it doesn't work like that really, but the concept is the same.
Something wrong with data charset. I don't know what exactly.
This is workaround. Do it before insert/update:
$str = html_entity_decode($str, ENT_COMPAT, 'UTF-8');
it looks like to me that this is HTML encoding, the way PHP encode unicode to make sure it will display OK on the web page, no matter the page encoding.
Did you tried to fetch the same data using MySQL Workbench?
It seems that somewhere in your PHP code htmlentities is being used on the text -- instead of htmlspecialchars. The difference with htmlentities is that it escapes a lot of non-ASCII characters in the form you see there. Then the result of that is being stored in the database. It's not MySQL's doing.
In theory this shouldn't be necessary. It should be okay to output the plain characters if you set the character set of the page correctly. Aassuming UTF-8, for example, use header('Content-Type: text/html; charset=utf-8'); or <meta http-equiv="Content-Type" value="text/html; charset=utf-8">.
This might result in gibberish (mojibake) if you view the database directly (although it will display fine on the web page) unless you also make sure the character set of the database is set correctly. That means the table columns, table, database, and connection character set all to, probably, utf8mb4_general_bin or utf8_general_bin (or ..._general_ci). In practice getting it all working can be a bit of a nuisance. If you didn't write this code, then probably someone in your code base decided at some point to use htmlentities on it to convert the exotic characters to ASCII HTML entities, to make storage easier. Or sometimes people use htmlentities out of habit when the merer htmlspecialchars would be fine.
I'm storing data in a MySQL database that may have some special characters. I'm wondering how to store it so that these characters are preserved if they're either output to HTML via PHP OR via JavaScript, e.g. createTextNode.
For example, the division symbol (÷) has the html code ÷, and when I store it as that it shows up fine when put directly into HTML by PHP, but when I pull it into JavaScript using $.getJSON and then insert it with createTextNode it shows up looking like ÷.
I also tried storing the symbol in the SQL directly, but my understanding is that the column would need to be changed from VARCHAR to NVARCHAR and that would cause a performance hit that doesn't seem necessary.
Given that I can modify the SQL, the PHP, or the JavaScript, is there an easy fix here? Maybe a way to unescape the HTML entity in JavaScript?
As answered by Yogesh, you should switch your collation of the DB to utf8_general_ci
So there's probably two things going on:
JSON escapes special characters.
Somewhere, something in your code flow is URL encoding the strings too.
So you just need to decode the string in your JavaScript, or you need to find what part of your code is URL encoding those strings and fix it.
When encoding newline of textarea before storing into mysql using PHP with rawurlencode function encodes newline as %0D%0A.
For Example:
textarea text entered by user:
a
b
encoding using rawurlencode and store into database will store value as a%0D%0Ab
When retrieving from database and decoding using rawurldecode does not work and code gives error. How to overcome this situation and what is the best way to store and retrieve and display textarea values.
can you first encode this textarea string using base64_encode and then perform a base64_decode on the same, if the above does not work for you.
If the textarea does not contain URLs, you should rather use base64_encode then rawurlencode and then store as normal.
You simply should not use rawurlencode for escaping data for your database.
Each target format has it's own escaping method which in general terms makes sure it is stored/display/transferred safely from one place to another, and it doesn't need decoding at the other end.
For instance:
displaying text in HTML, use htmlentities or htmlspecialchars
storing in database, use mysqli_real_escape_string, pg_escape_string, etc...
transferring variablename, use urlencode
transferring variablecontent, use rawurlencode
etc...
You should notice that decoding these things is often done by the browser/database. So no data is actually stored escaped. And decoding doesn't need te be done by your code.
The problem is probably because you escape a sequence with rawurlencode, but your database expected the escaped format for the specific brand of database. And de-escaped it using that assumption, which was wrong, which messed up your string.
Conclusion: find out what brand database you are using, look up the specific escape function for that database, and use the proper escaping function on all your content "transferral".
P.S.: some definition may not be correct, please comment on that. I wanted to make the idea stick but am probably not using all the right terms.
First of all it is very uncommon to run textarea through urlencode()
urlencode was not designed for this purpose.
Second, if you still want to do this, then maybe the problem comes from database. First you need to tell us what database you using and what TYPE you using for storing this data: do you store it as TEXT or as BINARY data? Have you setup the correct charset in database?