I'm referencing this question: "Store GZIP:ed text in mysql?".
I want to store serialized sessions in the database (they are actually stored in a memcached pool, but I keep this as a failsafe). I am gzipping and uncompressing from PHP.
I want to ask the following:
1) Is this a good move? I am doing this to avoid using MEDIUMTEXT, as the data may be bigger than TEXT allows. I expect to have a lot of sessions stored there. Is it worth gzipping in this case? The table is MyISAM.
2) Do I need to set the encoding of the table field to binary? Or only if I am storing a complete gzipped file?
3) Is serializing a bad move? Should I use json_encode instead (because of the smaller size, I guess)?
Thanks,
You should use a MEDIUMBLOB field instead of MEDIUMTEXT. BLOBs have no encoding, as they are raw byte streams.
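As a rough illustration of the round trip discussed above, here is a minimal sketch of serializing and gzipping a session before it goes into a MEDIUMBLOB column. The `$session` array is made up for the example, and the actual database write is left out.

```php
<?php
// Hypothetical session payload; a real one would come from the session handler.
$session = ['user_id' => 42, 'roles' => ['editor'], 'cart' => [101, 102, 103]];

// serialize() yields a binary-safe string; gzencode() wraps it in the gzip format.
$serialized = serialize($session);
$packed = gzencode($serialized, 6);

// $packed is the raw byte string that would be written to the MEDIUMBLOB column.
// Reading it back reverses the two steps.
$restored = unserialize(gzdecode($packed), ['allowed_classes' => false]);

printf("serialized: %d bytes, gzipped: %d bytes\n", strlen($serialized), strlen($packed));
```

Note that the gzip format adds a fixed header and trailer, so very small payloads can come out larger than the input. Compression tends to pay off only once sessions reach a few hundred bytes or more.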
Hi everyone,
Is there a way to cache BLOB types temporarily in Laravel?
Scenario:
I'm going to temporarily cache some MEDIUMBLOB data, each part 2048 KB in size.
These parts belong to a single large 16 MB file.
After all parts are cached, they will be combined into a single file and then removed from the cache.
The content of each part is read with the file_get_contents function.
I'm already doing this with MySQL. (However, it takes a lot of queries and a long time to complete.)
Is there a better way to store MEDIUMBLOB data temporarily in cache storage?
I've come across Redis and Laravel's Cache, but I'm not sure they support MEDIUMBLOB.
Here's the main point. In the end a BLOB is just a string of bytes. The major difference with a formal string type is that a BLOB doesn't have any sort of encoding or collation associated with it. It's a kind of binary string which is another way of saying "generic data". file_get_contents also generally returns this sort of "generic data".
Laravel's cache is built to be generic like that, so it doesn't have any specific data type associated with it. It's just a store of key/value pairs: the keys must be ASCII strings, while the values can be anything. Laravel serializes values before they go into the cache, so basically anything that can be represented as a variable in PHP and is serializable can be cached.
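To make the "just bytes under string keys" idea concrete, here is a small sketch in plain PHP. The `$store` array is only a stand-in for a cache backend, and the key names and chunk size are invented for the example. In a Laravel application the Cache facade would take the place of the array.

```php
<?php
// Stand-in for a cache store: an in-memory array keyed by ASCII strings.
// (Hypothetical; in Laravel this role would be played by the Cache facade.)
$store = [];

// Random bytes standing in for what file_get_contents() would return.
$original = random_bytes(10 * 1024);
$chunkSize = 4 * 1024;

// Cache each part under its own key, serialized the way a cache store would do it.
foreach (str_split($original, $chunkSize) as $i => $part) {
    $store["upload:demo:part:$i"] = serialize($part);
}

// Reassemble the parts in order, removing each one from the store as we go.
$combined = '';
for ($i = 0; isset($store["upload:demo:part:$i"]); $i++) {
    $key = "upload:demo:part:$i";
    $combined .= unserialize($store[$key], ['allowed_classes' => false]);
    unset($store[$key]);
}
```

The binary parts survive the serialize/unserialize round trip unchanged, which is why no special BLOB support is needed on the cache side.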
I have some text data I would like to store in a MySQL database. I currently have the data stored in a variable as a string.
I'm concerned that the table will become quite large due to the amount of text data I have for each row.
Therefore, what is the easiest way (preferably using built-in PHP functions) to compact this string data into a format suited to storage and retrieval?
You could gzip the string with gzencode.
That's pretty standard, so it should be reversible from other languages if you need that.
I would advise storing a Base64 version of the result.
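A minimal sketch of that suggestion, with a made-up sample string: gzip the text, Base64-encode the result so it fits in a text column, and reverse both steps on the way out.

```php
<?php
// Hypothetical, highly repetitive sample text; real data will compress differently.
$text = str_repeat('The quick brown fox jumps over the lazy dog. ', 200);

// Compress, then Base64-encode so the result contains only ASCII characters.
$gz = gzencode($text, 9);
$b64 = base64_encode($gz);

// Reverse the steps when reading the value back.
$back = gzdecode(base64_decode($b64, true));

printf("original: %d, gzipped: %d, base64: %d bytes\n", strlen($text), strlen($gz), strlen($b64));
```

Base64 makes the compressed data about a third larger again, so it is mainly useful when the column has to be a text type. A binary column can hold the gzencode output directly.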
If you're using InnoDB you can enable compression on entire tables which doesn't impact your code at all.
ALTER TABLE database.tableName ENGINE='InnoDB' ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;
You can alter the KEY_BLOCK_SIZE to smaller values to get more compression (depending on the data), but this adds more overhead to the CPU.
After testing a range of tables, I found a KEY_BLOCK_SIZE of 8 to be a good balance of compression vs performance.
I am storing serialized data in MySQL and am unsure which field type to choose.
One example of the serialized data output is below:
string(393) "a:3:{s:4:"name";s:22:"PACMAN-Appstap.net.rar";s:8:"trackers";a:6:{i:0;s:30:"http://tracker.ccc.de/announce";i:1;s:42:"http://tracker.openbittorrent.com/announce";i:2;s:36:"http://tracker.publicbt.com/announce";i:3;s:23:"udp://tracker.ccc.se:80";i:4;s:35:"udp://tracker.openbittorrent.com:80";i:5;s:29:"udp://tracker.publicbt.com:80";}s:5:"files";a:1:{s:22:"PACMAN-Appstap.net.rar";i:4147632;}}"
The string length of the data can vary greatly, up to around 20,000 characters.
I understand that I should not use the TEXT data type, as it could corrupt the data because of the character sets it would have to use.
I am stuck when it comes to choosing between VARBINARY, BLOB, MEDIUMBLOB, etc.
If I use VARBINARY(20000), does this mean that I can safely insert a string up to 20,000 bytes in length, and that a longer one causes the insert to be discarded?
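I agree with PLB in that you should use BLOB. For VARBINARY, the length attribute specifies the maximum number of bytes that can be saved in the column. Neither type pads stored values: it is the fixed-length BINARY type that fills unused space with padding, whereas VARBINARY and BLOB only reserve the actual length of the data (plus a small length prefix). The main practical difference is that a VARBINARY column needs a declared maximum and counts toward the row size limit, whereas a BLOB holds up to 65,535 bytes without one.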
But as PLB said, only use this if you absolutely must, because it slows down the whole DB in most cases. A better solution would be to store the files in your server's filesystem and save the file's path in the DB.
I am serializing a lot of arrays in PHP that are to be stored in a database using MySQL.
The length of the final string can vary greatly, anywhere from 2,000 to 100,000+ characters. What would be the best column type for this?
I currently have it set as LONGTEXT, but I feel this is overkill! The database is already active and has around 3 million rows; this is a new column which will be added soon.
Thanks
Always use a BLOB data type for serialized data so that it is stored in a binary-safe manner and does not get cut off, which would break the serialization. If there is no maximum to the length of the final string, you will need LONGBLOB. If you know that the data won't reach 2^24 bytes, you could use a MEDIUMBLOB. MEDIUMBLOB holds about 16MB while LONGBLOB holds about 4GB, so I would say you're pretty safe with MEDIUMBLOB.
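The size limits above can be turned into a quick check. The helper below is a hypothetical illustration (the function name is made up) that returns the smallest MySQL BLOB type able to hold a given number of bytes.

```php
<?php
/**
 * Return the smallest MySQL BLOB type that can hold $bytes bytes.
 * Hypothetical helper for illustration; limits are 2^8-1, 2^16-1, 2^24-1 and 2^32-1 bytes.
 */
function smallestBlobType(int $bytes): string
{
    if ($bytes < 0) {
        throw new InvalidArgumentException('Byte count must not be negative.');
    }
    $limits = [
        'TINYBLOB'   => 2 ** 8 - 1,
        'BLOB'       => 2 ** 16 - 1,
        'MEDIUMBLOB' => 2 ** 24 - 1,
        'LONGBLOB'   => 2 ** 32 - 1,
    ];
    foreach ($limits as $type => $max) {
        if ($bytes <= $max) {
            return $type;
        }
    }
    throw new InvalidArgumentException('Value is too large for any BLOB type.');
}

// Measure in bytes with strlen(), not in characters, before choosing.
echo smallestBlobType(strlen(serialize(range(1, 50000)))), "\n";
```

For the sizes in the question, a 2,000-byte string fits in a plain BLOB, while anything over 65,535 bytes needs MEDIUMBLOB.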
Why a binary data type? Text data types in MySQL have an encoding, and character encoding affects how the serialized data is converted between encodings. E.g. when stored as Latin-1 but then read out as UTF-8 (for example because of the connection encoding setting of the database driver), the serialized data can break: the byte offsets shift, but the length prefixes inside the serialized data were not written with such shifts in mind. PHP's serialized strings are binary data, not text in any specific encoding.
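The breakage is easy to reproduce without a database. In the sketch below, a str_replace call stands in for the transcoding a text column or connection setting might perform, and the sample string is made up.

```php
<?php
// "café" encoded as UTF-8 is 5 bytes, so serialize() records a length of 5.
$utf8 = serialize("caf\u{00E9}");

// Simulate a UTF-8 to Latin-1 conversion of the one non-ASCII character:
// the two bytes C3 A9 become the single byte E9, but the "5" stays as it was.
$latin1 = str_replace("\xC3\xA9", "\xE9", $utf8);

$ok = unserialize($utf8);
// unserialize() reports an error and returns false here; @ hides the message.
$broken = @unserialize($latin1);

var_dump($ok === "caf\u{00E9}", $broken);
```

The length prefix no longer matches the byte count after conversion, so unserialize() rejects the string. A binary column avoids the conversion altogether.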
You should choose BLOB (as Marc B noted) per the PHP manual for serialize():
"Note that this [outputs] a binary string which may include null bytes, and needs to be stored and handled as such. For example, serialize() output should generally be stored in a BLOB field in a database, rather than a CHAR or TEXT field."
Source: http://php.net/serialize
Of course J.Money's input regarding sizes must be borne in mind as well - even BLOB has its limits, and if you are going to exceed them then you would need MEDIUMBLOB or LONGBLOB.
I am having a little issue with storing data encrypted via mcrypt_module_open('rijndael-256','','ofb',''); in a MySQL db.
When it inserts the encrypted data into the MySQL db it looks like this ˜9ÏÏd‰.
It should look like this
÷`¥¶Œ"¼¦q…ËoÇ
I am wondering if I have to do something to get it to work?
Use a blob field type for storing binary data (BLOB, VARBINARY, BINARY)
If you're not doing this already: escape your values with the proper methods if you're using them directly in a SQL-statement. Or even better: use query parameters/prepared statements.
As a last resort you could just encode your data with either base64_encode or bin2hex.
If you want to display binary data on the console or in the browser (even for debugging purpose) use one of those encodings too. Otherwise you might not see the actual data because the browser might not display your binary correctly.
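For the last two points, a short sketch of both encodings. The random bytes are only a stand-in for real ciphertext.

```php
<?php
// Random bytes standing in for encrypted output.
$cipher = random_bytes(16);

// Both encodings produce plain ASCII that is safe to print or store as text.
$hex = bin2hex($cipher);        // two hex digits per byte
$b64 = base64_encode($cipher);  // four characters per three bytes, padded

// Both are fully reversible.
$fromHex = hex2bin($hex);
$fromB64 = base64_decode($b64, true);

echo "hex:    $hex\n";
echo "base64: $b64\n";
```

Hex doubles the size and Base64 adds about a third, so storing the raw bytes in a binary column remains the most compact option. The encodings are mostly useful for display and debugging.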
In general, it might be a good idea to base64 encode and decode binary data like this. See "Best way to use PHP to encrypt and decrypt passwords?".
Have you tried setting the collation of your table to one that supports your characters?
The characters '÷`¥¶Œ"¼¦q…ËoÇ' look like UTF-8 or some other charset. Find the charset of your characters and update the table collation to match it.