PHP - How to serialize data with special characters from a database?

I have a string with special characters, like:
$text = "NÃO";
When I call serialize($text), it returns
a:1:{i:0;s:4:"NÃO";}
But when I use a string that I get from my database, like:
$query = "SELECT special_text FROM ...";
(...)
$text = $row->special_text;
serialize($text);
it returns
a:1:{i:0;s:3:"NÃO";}
which crashes my script.
What do I have to do when I serialize data from the database?
Thanks, and sorry for my English.

If you don't want to use UTF-8, encode the serialized string before saving it to the database and decode it after retrieving it. Base64 is one option: its output is plain ASCII, so it is safe with any 8-bit encoding in your database, at the cost of increasing the data size by a factor of about 4/3. Alternatively, use a binary (BLOB) field and compress your data with gzip or something similar. That is very useful for large amounts of text, but it increases CPU load.
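Both options can be sketched roughly as follows. The helper names are hypothetical, gzcompress() assumes the zlib extension is loaded, and the sample value is "NÃO" written out as ISO-8859-1 bytes so that it matches the 3-byte length seen in the question.

```php
<?php
// Hypothetical helpers for this sketch; none of these names come from PHP itself.

// Option 1: Base64 output is plain ASCII, so a text column cannot re-encode it.
function pack_for_text_column($value)
{
    return base64_encode(serialize($value));
}

function unpack_from_text_column($stored)
{
    // allowed_classes => false (PHP 7+) refuses to instantiate objects from stored data.
    return unserialize(base64_decode($stored), ['allowed_classes' => false]);
}

// Option 2: compress the raw bytes and store them in a binary (BLOB) column.
function pack_for_blob_column($value)
{
    return gzcompress(serialize($value));
}

function unpack_from_blob_column($stored)
{
    return unserialize(gzuncompress($stored), ['allowed_classes' => false]);
}

$data = ["N\xC3O"]; // "NÃO" as ISO-8859-1 bytes: Ã is the single byte 0xC3

$asText = pack_for_text_column($data); // ASCII only, about 4/3 the size
$asBlob = pack_for_blob_column($data); // binary, must go into a BLOB column
```

Whichever variant is used, the stored value has to pass through the matching unpack function before use; the compressed form must never be written to a CHAR or TEXT column.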

Related

Store php generated JSON string in a CSV file

I am trying to generate a JSON string from an array using PHP, and I would like to store that string in a CSV file, also with PHP. I want to do this because I am working with quite a large amount of data, and I would like to use MySQL's LOAD DATA LOCAL INFILE to populate and update my database table.
This is the code I have:
$tmpFileProducts = 'path/to/file';
$tmpFileProductsHandler= fopen($tmpFileProducts, 'w');
foreach ($attributeBatch as $productId => $batch) {
fputcsv($tmpFileProductsHandler, array($productId, $batch['title'], $batch['parsed'], json_encode($batch['attributes'])), "|", "\"");
}
My problem is that when I create the CSV file, the JSON double quotes are not escaped, so I end up with lines like this in my CSV file:
43541|"telefon mobil 3l 2020 4g "|"2020-12-05 17:38:19"|"{""color"":""dark chrome"",""memory_value"":4294967296,""storage_value"":68719476736,""sim_slot"":""dual""}"
My first possible solution would be to change the string enclosure of my CSV file, but what enclosure should I use to ensure that no conflicts arise with the inner JSON column? I am in complete control of the array that is stringified; it will only contain ASCII characters.
Would there be a way to keep the current string enclosure and instead escape the JSON string somehow? Later on, I will need to fetch the data that was loaded into the database and convert it to an array again.
DISCLAIMER: I am well aware that instead of storing the data as a JSON string, I could store it in a specific relational table (which I am also doing). However, I need quick access to this data for a background script, and I would like to save the time spent querying the relational table, since the background script doesn't need to search in this data when it uses it.
Follow-up question: as I am explicitly telling the fputcsv function what to use as the string enclosure, shouldn't it automatically escape all the similar inner strings?
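For what it's worth, doubling the enclosure character is how fputcsv escapes it, so the "" pairs in the line above are the escaped form. The sketch below, which uses made-up sample data and an in-memory stream, writes one row and reads it back with fgetcsv to show that the JSON survives the round trip. Passing '' as the escape character needs PHP 7.4 or later. How LOAD DATA treats the doubled quotes depends on its ENCLOSED BY and ESCAPED BY settings and is not covered here.

```php
<?php
// Illustrative data only; the shape mirrors the row shown in the question.
$attributes = ['color' => 'dark chrome', 'sim_slot' => 'dual'];
$row = [43541, 'telefon mobil', json_encode($attributes)];

$handle = fopen('php://memory', 'w+');

// '' as the escape character (PHP 7.4+) switches off the backslash escape,
// so the only escaping applied is doubling of the enclosure character.
fputcsv($handle, $row, '|', '"', '');

rewind($handle);
$line = stream_get_contents($handle); // the raw CSV line, with "" inside the JSON field

rewind($handle);
$parsed = fgetcsv($handle, 0, '|', '"', ''); // doubled quotes are collapsed again
fclose($handle);

$decoded = json_decode($parsed[2], true); // back to the original array
```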

How is json decoding of utf emoji possible?

As far as I can recall, valid JSON data comes in this format: '{"key":"value"}'
But while surfing, I found an article about sending UTF-8 codes as emoji.
The emoji was stored in a variable as
$emoji = "\ud83d\udc4e";
For it to work properly, the answer was to use json_decode($emoji);. I tried it out and it returned a thumbs-down emoji. I was expecting NULL, but it turns out that it was valid JSON data. So I'm confused about how that is possible.
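One possible explanation, assuming the article passed the escape sequence wrapped in double quotes: a JSON text does not have to be an object. A lone string literal is also a valid JSON value, and \ud83d\udc4e inside it is a UTF-16 surrogate pair that json_decode turns into a single character. The variable names below are only for illustration.

```php
<?php
// Single quotes keep the backslashes literal, so json_decode() sees the JSON escapes.
$bare   = '\ud83d\udc4e';   // no surrounding quotes: not a JSON value
$quoted = '"\ud83d\udc4e"'; // a JSON string literal holding a surrogate pair

$fromBare   = json_decode($bare);
$fromQuoted = json_decode($quoted);

var_dump($fromBare);            // NULL
var_dump(bin2hex($fromQuoted)); // string(8) "f09f918e", the UTF-8 bytes of U+1F44E
```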

php unserialize returns false

I have the following problem. I am retrieving a MySQL text field that holds the serialized text of an invoice. I am working on two different projects, and both run the same version of PHP. The data was exported from one database and imported into the other. If I var_dump the data from db1, it reports a length of x. When I do the same with db2, I get x+2:
string(595)
"a:3:{s:11:"userdetails";a:20:{s:4:"name";s:3:"bas";s:8:"lastname";s:7:"schmitz";s:5:"email";s:17:"email#test.de";s:6:"street";s:11:"f�rstenwall";s:7:"street2";s:0:"";s:7:"company";s:0:"";s:3:"zip";s:5:"40215";s:9:"residence";s:10:"d�sseldorf";s:7:"country";s:7:"Germany";s:5:"phone";s:7:"3033185";s:3:"fax";s:0:"";s:10:"customerID";i:202771;s:2:"nr";s:3:"228";s:6:"region";s:3:"nrw";s:10:"phone_code";s:3:"211";s:8:"fax_code";s:0:"";s:10:"salutation";s:2:"Mr";s:5:"sales";s:0:"";s:12:"country_code";s:0:"";s:10:"vat_number";s:0:"";}s:6:"domain";s:15:"bas-schmitz2.de";s:10:"has_domain";b:1;}"
string(597)
"a:3:{s:11:"userdetails";a:20:{s:4:"name";s:3:"bas";s:8:"lastname";s:7:"schmitz";s:5:"email";s:17:"email#test.de";s:6:"street";s:11:"fürstenwall";s:7:"street2";s:0:"";s:7:"company";s:0:"";s:3:"zip";s:5:"40215";s:9:"residence";s:10:"düsseldorf";s:7:"country";s:7:"Germany";s:5:"phone";s:7:"3033185";s:3:"fax";s:0:"";s:10:"customerID";i:202771;s:2:"nr";s:3:"228";s:6:"region";s:3:"nrw";s:10:"phone_code";s:3:"211";s:8:"fax_code";s:0:"";s:10:"salutation";s:2:"Mr";s:5:"sales";s:0:"";s:12:"country_code";s:0:"";s:10:"vat_number";s:0:"";}s:6:"domain";s:15:"bas-schmitz2.de";s:10:"has_domain";b:1;}"
As I paste these, I can see that there is a difference in how the German characters are displayed.
Any idea why this is happening?
The output of serialize() cannot be handled as plain text:
Return Values
Returns a string containing a byte-stream representation of value that
can be stored anywhere.
Note that this is a binary string which may include null bytes, and
needs to be stored and handled as such. For example, serialize()
output should generally be stored in a BLOB field in a database,
rather than a CHAR or TEXT field.
Thus your data is corrupted in the first place.
If you're unable to change the database design (which would be the proper fix), you need to re-encode serialised data in a plain text encoding such as Base64:
$encoded = base64_encode(serialize($foo));
$decoded = unserialize(base64_decode($encoded));
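To illustrate why the two extra bytes matter, here is a small sketch. It assumes the import converted the column from ISO-8859-1 to UTF-8, which is simulated with str_replace on one of the values from the question; the real cause in the asker's setup may differ.

```php
<?php
$latin1 = "f\xFCrstenwall";       // "fürstenwall" as ISO-8859-1: 11 bytes
$serialized = serialize($latin1); // starts with s:11:

// Simulated export/import that converts ISO-8859-1 to UTF-8: every "ü" (0xFC)
// becomes the two bytes 0xC3 0xBC, but the s:11 length prefix is left as it was.
$transcoded = str_replace("\xFC", "\xC3\xBC", $serialized);

// The declared length no longer matches the byte count, so this yields false.
$broken = @unserialize($transcoded);

// Base64 text is pure ASCII, which such a conversion leaves untouched.
$safe = base64_encode($serialized);
$restored = unserialize(base64_decode($safe));
```

With two such characters in the invoice ("fürstenwall" and "düsseldorf"), the converted copy is two bytes longer, which matches the 595 versus 597 in the dumps above.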

Accessing POST data

I'm very new to PHP but have a good understanding of C.
When I want to access some POST data in an API I'm creating in PHP, I use:
$_POST['date_set']
to fetch a value being passed for the date. This all works perfectly; however, I read that I should be fetching it like this:
$date_set = trim(urldecode($_POST['date_set']));
This always returns a 00:00:00 value for the date after it's stored in my DB.
When I access directly using $_POST['date_set'] I get whatever value was posted, for example: 2013-08-28 10:31:03
Can someone tell me what I'm messing up?
You should try it like this:
$date_set = explode(' ', $_POST['date_set']); // e.g. explode(' ', '2013-08-28 10:31:03')
echo $date_set[1];
or
echo date('H:i:s', strtotime($_POST['date_set']));
// echo date('H:i:s', strtotime('2013-08-28 10:31:03'));
If you are very new to PHP, then read the documentation for date().
You should only run urldecode over data that is URL-encoded. PHP will have decoded it before populating $_POST, so you certainly shouldn't be using that. (You might have to if you are dealing with double-encoded data, but the right solution there is to not double-encode the data.)
trim removes leading and trailing whitespace. It is useful if you have a free-form input in which rogue spaces might be typed. You will need to do further sanity checking afterwards.
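As a rough sketch of what such sanity checking could look like for this field, assuming the client always posts the format Y-m-d H:i:s (the helper name is hypothetical):

```php
<?php
// Hypothetical helper: returns a DateTimeImmutable, or null when the value
// is missing or does not look like "YYYY-MM-DD HH:MM:SS".
function parse_posted_datetime(array $post, $key)
{
    if (!isset($post[$key]) || !is_string($post[$key])) {
        return null;
    }

    $raw = trim($post[$key]); // drop stray whitespace only; no urldecode()
    $parsed = DateTimeImmutable::createFromFormat('Y-m-d H:i:s', $raw);

    // Formatting back and comparing rejects overflowing values
    // such as "2013-02-30 10:00:00", which would otherwise roll over.
    if ($parsed === false || $parsed->format('Y-m-d H:i:s') !== $raw) {
        return null;
    }

    return $parsed;
}

// Sample input arrays standing in for $_POST.
$date = parse_posted_datetime(['date_set' => ' 2013-08-28 10:31:03 '], 'date_set');
$bad  = parse_posted_datetime(['date_set' => 'not a date'], 'date_set');
```

In a real handler the first argument would be $_POST, and a null result would be turned into an error response instead of being written to the database.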
urldecode — Decodes URL-encoded string
Description
string urldecode ( string $str )
Decodes any %## encoding in the given string. Plus symbols ('+') are decoded to a space character.
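urldecode is meant for strings that are still URL-encoded, which is not the case for values PHP has already placed in its request superglobals. You should be fine using $_POST['date_set'] only.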
http://php.net/manual/en/function.urldecode.php
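A quick way to see what a second decoding pass does is to run urldecode over values as they would already appear in $_POST. The sample values below are made up; only the first contains a character that urldecode treats specially.

```php
<?php
// Values as they would already appear in $_POST, i.e. decoded once by PHP.
$withOffset = '2013-08-28T10:31:03+02:00';
$plain      = '2013-08-28 10:31:03';

$decodedAgain = urldecode($withOffset); // the "+" is turned into a space
$unchanged    = urldecode($plain);      // nothing to decode, value stays the same

var_dump($decodedAgain); // string(25) "2013-08-28T10:31:03 02:00"
var_dump($unchanged);    // string(19) "2013-08-28 10:31:03"
```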
You'd better do it this way:
if (isset($_POST['date_set'])) {
    $date_set = $_POST['date_set'];
}
Then you can use $date_set however you want.
If you still get 00:00:00 for $date_set, the problem comes from the code that supplies the $_POST value.

Is PHP serialize function compatible UTF-8?

I have a site I want to migrate from ISO to UTF-8.
I have a record in the database indexed by the following primary key:
s:22:"Informations générales";
The problem is that now (with UTF-8), when I serialize the string, I get:
s:24:"Informations générales";
(Notice that the size of the string is now the number of bytes, not the string length.)
So this is not compatible with the previous non-UTF-8 records!
Did I do something wrong? How can I fix this?
Thanks
The behaviour is completely correct. Two strings with different encodings will generate different byte streams, thus different serialization strings.
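The difference can be reproduced directly. In this sketch both strings are written with explicit byte escapes, so the result does not depend on how the script file itself is encoded.

```php
<?php
$latin1 = "Informations g\xE9n\xE9rales";         // ISO-8859-1: é is one byte
$utf8   = "Informations g\xC3\xA9n\xC3\xA9rales"; // UTF-8: é is two bytes

// serialize() records the byte count reported by strlen(), not the character count.
$prefixLatin1 = substr(serialize($latin1), 0, 5);
$prefixUtf8   = substr(serialize($utf8), 0, 5);

var_dump($prefixLatin1); // string(5) "s:22:"
var_dump($prefixUtf8);   // string(5) "s:24:"
```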
Dump the database in latin1.
In the command line:
sed -e 's/latin1/utf8/g' -i ./DBNAME.sql
Import the converted file into a new database in UTF-8.
Use a PHP script to update each field.
Make a query, loop through each field and update the serialized string using this:
$str = preg_replace_callback('!s:(\d+):"(.*?)";!s',
    function ($m) { return 's:' . strlen($m[2]) . ':"' . $m[2] . '";'; }, $str);
After that, I was able to use unserialize() and everything worked with UTF-8.
To unserialize a UTF-8 encoded serialized array:
$array = @unserialize($arrayFromDatabase);
if ($array === false) {
    $array = @unserialize(utf8_decode($arrayFromDatabase)); // decode first
    $array = array_map('utf8_encode', $array); // encode the array again
}
PHP 4 and 5 do not have built-in Unicode support; I believe PHP 6 is starting to add more Unicode support, although I'm not sure how complete that is.
You did nothing wrong. PHP prior to v6 just isn't Unicode-aware, and as such doesn't support it unless you force it to (e.g., via the mbstring extension or other means).
We wrote our own wrapper around serialize() to remedy this. You could also move to other serialization techniques, like JSON (with json_encode() and json_decode(), available in PHP since 5.2.0).
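A minimal sketch of the JSON route, assuming the data is already valid UTF-8 (json_encode returns false for ISO-8859-1 input). JSON stores no byte counts, and by default non-ASCII characters are written as \uXXXX escapes, so the stored text is plain ASCII. The array key is made up for the example.

```php
<?php
// json_encode() expects UTF-8 input; these are the UTF-8 bytes of "Informations générales".
$record = ['title' => "Informations g\xC3\xA9n\xC3\xA9rales"];

$json = json_encode($record); // non-ASCII characters become \uXXXX escapes by default
$restored = json_decode($json, true);

echo $json, "\n"; // {"title":"Informations g\u00e9n\u00e9rales"}
```

Because there is no length prefix to go stale, a later change of the column's character set does not invalidate the stored value.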
