I have multiple very large text files, and storing them in MySQL takes a lot of space (100 texts are already over 1 MB, and that is just an example). I was wondering if it is possible to encrypt them to make the text shorter, so it uses less MySQL DB space, and then decrypt them when I get them back from MySQL so I can see the real text?
I tried base64 and gzip compression, but all of them made the size much bigger than the original.
How can I compress the text files (encrypt/decrypt)?
You can use InnoDB (engine) table compression. As you asked about ZIP: InnoDB compression uses the same zlib library that ZIP compression is based on.
The answer is no :) You can't make text files smaller using encryption, but you can compress text data in the database. See this InnoDB compression example in MySQL.
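For completeness, compression can also be done in the application before the INSERT (this is zlib compression in PHP, not the InnoDB page compression described above). A minimal sketch, assuming a hypothetical notes table with a MEDIUMBLOB column body; note the compressed bytes go into the BLOB as-is, because base64-encoding them afterwards is exactly what makes the data bigger than the original:
<?php
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');
$text = file_get_contents('large.txt');
$stmt = $pdo->prepare('INSERT INTO notes (body) VALUES (?)');
$stmt->execute([gzcompress($text, 9)]);      // store raw compressed bytes, not base64
$row = $pdo->query('SELECT body FROM notes ORDER BY id DESC LIMIT 1')->fetch();
echo gzuncompress($row['body']);             // the original text comes back
?>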
PHP has the ability to manipulate .zip files.
You could save your text into a .zip file, and simply store the filename in the database. This would save a lot of MySQL database space, but you will need some way to generate unique filenames, and somewhere to store those files.
At least they would be zipped, to save as much disk space as possible...
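A minimal sketch of that approach, assuming a hypothetical /var/data/texts/ directory; ZipArchive ships with PHP:
<?php
$zipName = bin2hex(random_bytes(16)) . '.zip';          // unique filename
$zip = new ZipArchive();
if ($zip->open('/var/data/texts/' . $zipName, ZipArchive::CREATE) === true) {
    $zip->addFromString('content.txt', $largeText);     // text is compressed inside the archive
    $zip->close();
}
// ... then INSERT $zipName into the table instead of the text itself
?>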
If you want to keep the database small, you can save large texts as files (on the local server, a CDN, or remote servers). Keep only the filenames and additional information about the texts in the DB.
As a result, you can still use the database in your application while reading the files from disk.
From what I know, in a database context a BLOB, or Binary Large OBject, is really nothing but stored binary code for some given data. It can reserve space in the gigabytes and can be used to store virtually any data type. But what is it actually used for?
My major is Computer Vision and I'm fairly new to databases and web development. Currently I'm working on a sentiment analysis project and want to collect a large dataset for this purpose, i.e. a huge number of images, and I also want to keep a record of whether an image has been used for the analysis or not. I thought storing the images in a database, with a separate column for the access record, was the best thing I could do to have an organized and systematic approach. But everyone I talked with recommends not storing images as BLOBs in the database, and instead keeping just the URL or name there, with the images themselves in a dedicated folder.
Moreover, since a BLOB is just a binary encoding of a file, how would we decode it back into an image? I found code like the following to convert a BLOB value into an image:
echo '<img src="data:image/png;base64,' . base64_encode($image->getimageblob()) . '" />';
echo '<img src="data:image/jpg;base64,' . base64_encode($image->getimageblob()) . '" />';
But this code is specific to the file extension (and personally I haven't had success with any such code). Different formats surely use different encoding schemes, so one snippet cannot be used for images of every extension. My dataset targets the visual content of an image, not its extension, and thus contains images of various formats, so how can one deal with them using a BLOB?
So the approach of storing just the names in the database and the images in a dedicated folder sounds good, but then what is the use of the database itself? Couldn't we have some renaming mechanism for images via PHP and store them directly in that folder? Why use a database when we can name images like img_1_accesses_5.png and split the filename to get the ID and the number of times it was accessed?
If a BLOB can store virtually every type of data, why is using BLOBs considered so horrible, and why does everyone recommend against it? What is the problem with injecting images directly into the database as BLOBs? And finally, if it is suitable for images, how should one deal with it?
So my question is: how does one use a BLOB effectively, and for which purposes is it suitable?
So my question is: how does one use a BLOB effectively, and for which purposes is it suitable?
Quick and dirty answer
The simple answer is: BLOBs smaller than 256 KB are more efficiently handled by a database, while a filesystem is more efficient for those greater than 1 MB. Of course, this will vary between different databases and filesystems.
There is a Microsoft technical report here: Compare blob and ntfs filesystem. The report is quite old (2006), but I don't think much has changed since then.
Imagine you want to read a file that is stored in a BLOB: you send a request to your database software, and the database engine then reads the BLOB data, which itself lives in the filesystem. Instead of reading directly from the filesystem, you go through a two-step process. So as your files get bigger, BLOBs will slow your database down a lot. And we all know that speed is the main requirement for a database.
Hope that helps.
I am working in the Ionic framework, currently designing a posts page with text and images. Users can post their data and images, and everything is secure.
So I use base64 encoding and save the images in the database:
encodeURIComponent($scope.image)
Each time a user makes a request, I select the rows from the table, display them along with the text, and decode them:
decodeURIComponent($scope.image)
with HTML "data:image/jpeg;base64,_______" conversion.
Works fine, but take so much time that i expected. Hence, image are 33% bigger size, and totally looks bulgy.
Then i decide to move on file upload plugin of cordova. But i realize, maintain file in this way is so much risk and complected. I also try to save binary data into database. But failed.
Text selecting without base64 data are dramatically reduce time. If it is possible to select image individually in another http call, after selecting other column and display. Is it a right mechanism to handle secure images?
As a rule of thumb, don't save files in the database.
What does the MySQL manual have to say about it?
http://dev.mysql.com/doc/refman/5.7/en/miscellaneous-optimization-tips.html
With Web servers, store images and other binary assets as files, with the path name stored in the database rather than the file itself. Most Web servers are better at caching files than database contents, so using files is generally faster. (Although you must handle backups and storage issues yourself in this case.)
Don't save base64-encoded files in a database at all
It works fine, but takes much more time than I expected. The images are 33% bigger, and the whole thing feels bloated.
As you discovered, there is unwanted overhead in encoding/decoding, plus extra space used up, which means extra data transferred back and forth as well.
As @mike-m has mentioned, Base64 encoding is not a compression method. Why Base64 encoding is used is also answered by a link that @mike-m posted: What is base 64 encoding used for?.
In short, there is nothing to gain and much to lose by base64-encoding images before storing them on the file system, be it S3 or otherwise.
What about gzip or other forms of compression, without involving base64? Again, the answer is that there is nothing to gain and much to lose. For example, I just gzipped a 1,941,980-byte JPEG image and saved 4,000 bytes; that's a 0.2% saving.
The reason is that images are already stored in compressed formats. They cannot be compressed any further.
When you store images without extra compression, they can be delivered directly to browsers and other clients, and they can be cached. If they are compressed (or base64-encoded), they need to be decompressed by your app first.
Modern browsers are able to display base64 images embedded in the HTML, but then they cannot be cached, and the data is about 30% larger than it needs to be.
Is this an exception to the norm?
Users can post their data and images, and everything is secure.
I presume you mean that a user can download images that belong to him or are shared with him. This can easily be achieved by saving the files outside the web root in the file system and storing only the path in the database. The file is then sent to the client (after the required checks) with fpassthru.
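A hedged sketch of that flow; userCanAccess(), $row, and the path layout stand in for your own authorization check and DB lookup:
<?php
if (!userCanAccess($currentUserId, $imageId)) {   // your own check, not a real API
    http_response_code(403);
    exit;
}
$path = '/var/data/images/' . $row['relative_path']; // path comes from the DB row
header('Content-Type: ' . mime_content_type($path));
header('Content-Length: ' . filesize($path));
$fp = fopen($path, 'rb');
fpassthru($fp);                                   // stream the file to the client
fclose($fp);
?>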
What about when I grow to 100,000 users?
How do they take care of the image files? Performance-wise, when that many users are involved, it seems to me I would need 100,000 folders for 100,000 users, plus their subfolders. When a large number of users browse the same root folder, how does the file system handle each unique folder?
Use a CDN, or use a file system that is specially suited for this, like BTRFS.
A database has good search facilities, good thread-safe connections, and good session management. Does that change when large operations are involved?
Yes indeed. Use it to the fullest by saving all the information about the file, including its path, in the database, and save the file itself in the file system. You get the best of both worlds.
Since they're just personal files, you could store them in S3.
To be safe about file uploads, just check the file's MIME type before sending it to whatever storage you choose.
http://php.net/manual/en/function.mime-content-type.php
Just run a quick check on the uploaded file:
$mime = mime_content_type($file_path);
if ($mime === 'image/jpeg') return true;
No big deal!
Keeping files in the database is bad practice; it should be your last resort. S3 is great for many use cases, but it's expensive for heavy usage, and local files should be used only for intranets and non-publicly available apps.
In my opinion, go with S3.
Amazon's SDK is easy to use and you get 1 GB of free storage for testing.
You could also use your own server, just keep it out of your database.
Solution for storing images on filesystem
Let's say you have 100,000 users and each one of them has 10 pics. How do you handle storing them locally?
Problem: a Linux filesystem starts to struggle once a single directory holds tens of thousands of images, so your file structure should be designed to avoid that.
Solution:
Make the folder path floor(userID/1000)*1000/userID.
That way, the images of the user with ID 989787 will be stored in the folder:
989000/989787/img1.jpeg
989000/989787/img2.jpeg
989000/989787/img3.jpeg
And there you have it: a way of storing images for a million users that doesn't break the Unix filesystem.
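In PHP the scheme boils down to something like this (the base path is hypothetical):
<?php
$userId = 989787;
$bucket = (int) (floor($userId / 1000) * 1000);             // 989000
$dir = sprintf('/var/data/images/%d/%d', $bucket, $userId); // 989000/989787
if (!is_dir($dir)) {
    mkdir($dir, 0755, true);                                // create both levels at once
}
?>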
How about storage sizes?
Last month I had to compress 1.3 million JPEGs for the e-commerce site I work on. When uploading images, compress them using Imagick with lossless flags and 80% quality. That removes the invisible pixels and optimizes your storage. Since our images vary from 40x40 (thumbnails) to 1500x1500 (zoom images), we have an average of 700x700 images, times 1.3 million images, which filled around 120 GB of storage.
So yes, it's possible to store it all on your filesystem.
When things start to get slow, you hire a CDN.
How will that work?
The CDN sits in front of your image server. Whenever the CDN is asked for a file it doesn't have in its storage (a cache miss), it copies it from your image server. Later, when the CDN is asked for it again, it delivers the image from its own cache.
This way no code is needed to migrate to CDN image delivery; all you need to do is change the URLs in your site and hire a CDN. The same works for an S3 bucket.
It's not a cheap service, but it's way cheaper than CloudFront, and when you get to the point of needing it, you can probably afford it.
I would suggest you continue with the base64 string only; you can use the LZ-string compression technique to reduce the string size. I've been using it and it works pretty well.
I don't know how close this is to your question, but I hope it helps you out.
Here is the LZ compression library: https://github.com/pieroxy/lz-string/
I'm making an Android application which takes a photo and pushes the image (as a base64-encoded string) to a PHP script; from there I'll be storing data about the image in a MySQL database.
Would it be wise to store the image inside the database (since it's passed as a base64 string), or would it be better to convert it back to an image and store it on the filesystem?
A base64-encoded image takes up too much space (about 33% more than the binary equivalent).
MySQL offers binary formats (BLOB, MEDIUMBLOB); use them.
Alternatively, most people prefer to store in the DB only a key to a file that the filesystem will store more efficiently, especially if it's a big image. That's the solution I prefer for the long term. I usually use a SHA1 hash of the file content to form the path to the file, so that identical content is never stored twice and it's easy to retrieve the record from the file if I want to (I use a three-level file tree, the first two levels being made respectively from the first two characters and characters 3 and 4 of the hash, so that no directory has too many direct children). Note that this is, for example, the logic of Git's object storage.
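A sketch of that layout in PHP; the base directory and variable names are illustrative:
<?php
$hash = sha1_file($uploadedTmpPath);             // e.g. "2fd4e1c6..."
$dir  = $baseDir . '/' . substr($hash, 0, 2) . '/' . substr($hash, 2, 2);
$dest = $dir . '/' . $hash;
if (!is_dir($dir)) {
    mkdir($dir, 0755, true);
}
if (!file_exists($dest)) {
    rename($uploadedTmpPath, $dest);             // identical content is stored only once
}
// store $hash (or $dest) in the database row
?>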
The advantage of storing them in the DB is that backups are easier to manage, especially while your project is small. The database will give you a cache, but so will your server and the client, so it's hard to decide a priori which will be fastest, and the difference won't be big (I assume you don't have too many concurrent writes).
I've done it both ways, and every time I come back to code where I stored binary data in a MySQL table I always switch it to filesystem with a pointer in the MySQL table.
When it comes to performance, you're going to be much better off going to the FS as pulling multiple large BLOBs from a MySQL server will tend to saturate its pipe quickly. Usually it's a pipe you don't want clogged.
You could always save the base64_encode($image) output in a file and store only the file path in the database, then use fopen() to get the encoded image back.
My apologies if I didn't understand the question correctly.
"wise" is pretty subjective, I think. I think it would be wise from a "keep people from directly linking to my images" perspective. Also, it may be helpful as far as if you decide you need to change up dir structures etc.. it might make it easier on you (but this really depends on how you wrote your scripts to begin with..) but other than that... offhand I can't really think of any benefits to doing this.
I'm creating a blog with a featured image on each post. I have a dilemma: I'm unsure what to do with my image data.
Should I insert the image data into my MySQL database using a BLOB?
Or should I just create an uploader which makes a directory in the user's images folder and uploads the photo that way, then just reference it directly in the image field when adding a blog post?
Is there a standardised way?
Kind regards,
adam
Upload the files to your server and save the location of the file in your database. Less strain on your DB and your HTTP daemon is better at serving images than MySQL.
The general approach is not to store files in the DB unless you understand why you need them stored there. So, since you are not sure, it's much simpler to store them in an upload folder.
But, just in case you decide you do need to store files (whether images or something else) in the DB, you have to declare a BLOB field and then save the data using a BLOB-capable DB mechanism. 'PHP's MySQLi extension: Storing and retrieving blobs' is a good example of how it can be done.
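In the spirit of that article, a hedged mysqli sketch; the credentials and the files table are assumptions:
<?php
$db = new mysqli('localhost', 'user', 'pass', 'mydb');
$stmt = $db->prepare('INSERT INTO files (name, data) VALUES (?, ?)');
$name = 'photo.jpg';
$null = null;
$stmt->bind_param('sb', $name, $null);              // 'b' marks a BLOB parameter
$stmt->send_long_data(1, file_get_contents($name)); // parameter indexes are 0-based
$stmt->execute();
?>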
You should store the images in a folder. The link below gives an idea of how to crop images to several different sizes and store the image names in a database table:
How can I upload images in a normal insert form (MySql)? after upload the image should have three versions of different sizes and different names
Convert the image data to base64. This can be done within PHP:
<?php
$image = file_get_contents("image.png");
$image = base64_encode($image);
?>
Storing images in a DB is a good idea for secure images.
Always store images, music files, etc. as regular files on disk, and then store the URLs to them in the database. That will make it:
1) faster
2) easier to configure security settings
3) better in every way I can imagine
Disadvantage:
If the file system gets corrupted, you will have a hard time recovering.
You can also use third-party image hosting sites, such as Amazon S3 or Mosso Cloud Files.
The problem with the file system is that it is difficult to scale.
Facebook uses Cassandra to store images.
Since this is a blog, you can store the images in the filesystem.
Both are valid approaches.
They have different advantages/disadvantages.
Storing it in the database means you need extra code to turn the image into a representation that fits inside an INSERT/UPDATE statement (base64 is one approach, and requires an equivalent decode, but you could just use mysql_real_escape_string()). Although you can't query the image content directly (other than finding exact matches), it may reduce the number of seek and I/O operations required to retrieve the data, compared with looking up the path in the database and then retrieving the file.
It's also a lot simpler to set up replication of a database than to set up replication of the database AND the filesystem if you run on multiple nodes. And there's the issue of keeping filesystem and database backups synchronized.
OTOH, using a filesystem makes your data tables much smaller, and therefore faster to retrieve records from.
which makes a directory in the user's images folder
You certainly don't want to allow users to upload content directly into your web server's document tree. Regardless of which route you take, the data should be stored in a location that is not directly accessible by the web server but is accessible to your code.
I want to upload large files, up to 10 MB, to my MySQL database. Using .htaccess I changed PHP's own file upload limit to "10485760" (= 10 MB), and I am able to upload files up to 10 MB without any problem.
But I cannot insert a file into the database if it is more than 1 MB in size.
I am using file_get_contents to read all the file data and pass it to the INSERT query as a string, to be inserted into a LONGBLOB field.
But files bigger than 1 MB are not added to the database, although I can use print_r($_FILES) to confirm that the file is uploaded correctly. Any help will be appreciated, and I need it within the next 6 hours, so please help!
You will want to check the MySQL configuration value max_allowed_packet, which might be set too small, preventing the INSERT (which is itself large) from happening.
Run the following from a MySQL command prompt:
mysql> show variables like 'max_allowed_packet';
Make sure it's large enough. For more information on this config option, see
MySQL max_allowed_packet
This also impacts mysql_escape_string() and mysql_real_escape_string() in PHP, limiting the size of the strings they can build.
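If you want to fail gracefully from PHP, you can compare the upload against the server's current limit before inserting. A hedged sketch; the connection details and the form field name are assumptions:
<?php
$mysqli = new mysqli('localhost', 'user', 'pass', 'mydb');
$row = $mysqli->query("SHOW VARIABLES LIKE 'max_allowed_packet'")->fetch_assoc();
$limit = (int) $row['Value'];
if (filesize($_FILES['upload']['tmp_name']) >= $limit) {
    exit("File exceeds max_allowed_packet ($limit bytes); raise it under [mysqld] in my.cnf.");
}
?>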
As far as I know, it's generally quicker, and better practice, not to store the file in the DB, as the DB will get massive very quickly and slow down. It's best to store the file in a directory and then just store the location of the file in the DB.
We do it for images/PDFs/MPEGs etc. in the CMS we have at work, by creating a folder for each file named after its URL-safe filename and storing the folder name in the DB. It's then easy to write out the file's URL in the presentation layer.
Some PHP extensions for MySQL have issues with the LONGBLOB and LONGTEXT data types. The extensions may not support BLOB streaming (posting the BLOB one segment at a time), so they have to post the entire object in one go.
So if PHP's memory limit or MySQL's packet size limit restricts the size of an object you can post to the database, you may need to change some configuration on either the PHP or the MySQL side to allow it.
You didn't say which PHP extension you're using (there are at least three for MySQL), and you didn't show any of the code you're using to post the blob to the database.
The best answer is to use an implementation that sidesteps the limit altogether.
You can read an article about one here. Store 10 MB or 1000 MB, it doesn't matter: the implementation chunks the file into many smaller pieces and stores them in multiple rows. This helps with loading and fetching, so memory doesn't become an issue either.
You could use MySQL's LOAD_FILE function to store the file, but you still have to obey the max_allowed_packet value, and the file must be on the same server as the MySQL instance.
You don't say what error you're getting (use mysql_error() to find out), but I suspect you may be hitting the maximum packet size.
If this is the case, you'd need to change your MySQL configuration max_allowed_packet
Well, I have the same problem. Data cannot be written to the MySQL database chunk by chunk in an "IO mode", i.e.:
loop:
    read $data from file
    write $data to blob
end loop
close file
close blob
A solution seems to be to create a table with multi-part BLOBs, like:
CREATE TABLE data_details
(
    id INT AUTO_INCREMENT PRIMARY KEY,
    chunk_number INT NOT NULL,
    dataPart BLOB
);
Is that the right approach?
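For what it's worth, a hedged PHP sketch of filling such a table, one small chunk per row so no single INSERT comes near max_allowed_packet; the connection details and input file are assumptions:
<?php
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');
$stmt = $pdo->prepare('INSERT INTO data_details (chunk_number, dataPart) VALUES (?, ?)');
$fp = fopen('bigfile.bin', 'rb');
for ($n = 0; !feof($fp); $n++) {
    // a BLOB column holds at most 64 KB, so keep each chunk below that
    $stmt->execute([$n, fread($fp, 60000)]);
}
fclose($fp);
// read back with: SELECT dataPart FROM data_details ORDER BY chunk_number, then concatenate
?>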