Best way to migrate 6k images from Access into MySQL? - php

I wouldn't mind donating to anyone who helps me with this issue.
Should I store the binary data with a BLOB data type, or store VARCHAR paths? I don't know how to automate either approach at the moment. The images are currently embedded in an Access database as OLE Objects. This migration cannot be manual; it will have to be done automatically with scripts or programs, because there are about 6k records.
Any ideas or recommendations?

You can use Leban's OLEtoDisk to export your images all at once. You can specify a "naming" column (your primary key, for example) and constant fields to be appended/prepended to the naming column.
Your pictures are then called "exported1.jpg", "exported2.jpg", ..., assuming you chose to prepend "exported" and the IDs were 1 and 2. It should then be simple to move the files to a server and write a script to insert the correct paths into the MySQL database (a sketch follows below) - assuming this is a one-time thing, because that's what it sounds like.
I just tested it with 4,000 small (~150 KB) pictures; it was done in 2 minutes on a limited virtual machine, so 6,000 should not be a problem.
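For the path-insertion step, here is a minimal PHP sketch of the script mentioned above. The connection details, the `images` table layout, and the export location are assumptions; adjust them to your schema.

```php
<?php
// Sketch: after exporting with OLEtoDisk, scan the export directory and
// insert one row per image into MySQL, keyed by the original Access id.
// Assumed table: images(access_id INT, path VARCHAR(255)).
$pdo  = new PDO('mysql:host=localhost;dbname=mydb;charset=utf8mb4', 'user', 'pass');
$stmt = $pdo->prepare('INSERT INTO images (access_id, path) VALUES (?, ?)');

foreach (glob('/var/www/uploads/exported*.jpg') as $file) {
    // Recover the Access primary key from the "exported<ID>.jpg" name.
    if (preg_match('/exported(\d+)\.jpg$/', $file, $m)) {
        $stmt->execute([(int)$m[1], 'uploads/' . basename($file)]);
    }
}
```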

Which is better for storing and retrieving images? [duplicate]

I want to create an image gallery and, obviously, it must have images in it.
I've been wondering which is better: storing the images in a directory and retrieving them one by one, or storing them in the database as BLOB data?
I am willing to learn either of the methods, so please enlighten me.
Thank you, people! Cheers!
This question has been debated for many years. Advocates will make strong cases for each. Neither side has ever been definitively proven right in all cases.
Both methods break down when the number of images you need to warehouse gets very large. Both databases and file systems have become better in the years since I benchmarked the two options against each other. At that time, you could fix the performance hit on the "file system" option by creating a hierarchy of directories instead of putting everything in one directory. By now, file systems may have been optimized so that they don't choke when the number of directory entries gets large.
This is truly a "your mileage may vary" situation. Factors include which file system you will use, which database engine you will use, how many images there are, and what their average size is. Will you be "tagging" the images in the database as well as storing them?
Typically, you have to just try all the options until you find something that works in your configuration.
Definitely stress test it. If you think you need to store one million images, don't test with five and assume that it will scale.
Test it with at least a million images, if not with two or five million.
That said, if you only need to store 1,000 images or fewer (maybe even 10k or fewer), and if you need to index the images by attributes like date, location, subject matter, etc., then at the risk of offending many well-meaning people, I am going to recommend storing the image as a BLOB in the database. The convenience of using the database to join the image to its metadata will outweigh anyone's performance concerns at that scale. When you store the metadata in a database with a pointer to a file on the file system, it is too easy for things to get out of sync: the file gets moved, renamed, or deleted; your database won't know, and now your system is broken.
Using a database will ensure the integrity of your data, including the images, for you.
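If you do take the BLOB route, serving an image back out of the table is a short script. A minimal sketch, assuming a table images(id, mime, data MEDIUMBLOB); all names are illustrative:

```php
<?php
// Sketch: fetch an image stored as a BLOB and stream it to the browser.
// Assumed schema: images(id INT, mime VARCHAR(50), data MEDIUMBLOB).
$pdo  = new PDO('mysql:host=localhost;dbname=gallery;charset=utf8mb4', 'user', 'pass');
$stmt = $pdo->prepare('SELECT mime, data FROM images WHERE id = ?');
$stmt->execute([(int)($_GET['id'] ?? 0)]);

if ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
    header('Content-Type: ' . $row['mime']);   // e.g. image/jpeg
    echo $row['data'];
} else {
    http_response_code(404);
}
```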
First, upload the file to a predefined folder, then store the name of that file in your database (in a VARCHAR column).
When you fetch those records, use that name to rebuild the image path wherever the application requires it.
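A minimal sketch of that flow; the uploads/ folder, the "image" form field, and the table layout are assumptions:

```php
<?php
// Sketch: save the uploaded file under a generated name, then store only
// that name in a VARCHAR column. Assumed table: images(id, filename VARCHAR(255)).
$pdo = new PDO('mysql:host=localhost;dbname=gallery;charset=utf8mb4', 'user', 'pass');

if (isset($_FILES['image']) && $_FILES['image']['error'] === UPLOAD_ERR_OK) {
    // Never trust the client's file name; generate a safe, unique one.
    $ext  = pathinfo($_FILES['image']['name'], PATHINFO_EXTENSION);
    $name = uniqid('img_', true) . '.' . $ext;

    if (move_uploaded_file($_FILES['image']['tmp_name'], __DIR__ . '/uploads/' . $name)) {
        $pdo->prepare('INSERT INTO images (filename) VALUES (?)')->execute([$name]);
    }
}
// When fetching, rebuild the public path from the stored name,
// e.g. '/uploads/' . $row['filename'] in your image tag.
```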

Best practice for storing and searching applicant Résumé or CV file

I am starting a recruitment consultancy, and sooner or later we will be dealing with many applicant résumés or CVs (curricula vitae). I am building a simple application with PHP and MySQL (the target server will be Windows) to let applicants upload CVs on our website. Currently I restrict uploads to MS Word docs with a max size of 500 KB.
Now my question is around two operations that will be performed on these files:
Search the content inside these files for specific keywords, to find résumés matching the required skills.
Then serve these files to our employers, either through a download link or by emailing the résumés to them.
Coming straight to the questions:
Do I store the actual files on the file system and perform Windows Search on them?
Or do I insert only the content into a MySQL BLOB/CLOB, perform the search on the table, and then serve the content from the table itself to the employer?
Or do I store the file on the file system and also insert the content into a MySQL BLOB, search the content in MySQL, and serve the file from the file system?
I am of the opinion that once the number of résumés reaches the thousands, Windows Search will be extremely slow, but then I searched the internet and found that it is not advisable to store huge amounts of file content in a database.
So I just need your suggestion on the approach I should adopt, in light of the assumption that at some point we will be storing and retrieving thousands of résumés.
Thanks in advance for your help.
One option, a hybrid: index the résumés into a DB, but store a filesystem path as the location. When you get a hit in the DB and want to retrieve the résumé, get it off the file system via the path indicated in the DB.
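A minimal sketch of that hybrid retrieval; the table and column names are assumptions:

```php
<?php
// Sketch: look the résumé up in the DB, then stream the file from disk.
// Assumed table: resumes(id INT, path VARCHAR(255), original_name VARCHAR(255)).
$pdo  = new PDO('mysql:host=localhost;dbname=recruit;charset=utf8mb4', 'user', 'pass');
$stmt = $pdo->prepare('SELECT path, original_name FROM resumes WHERE id = ?');
$stmt->execute([(int)($_GET['id'] ?? 0)]);

if (($row = $stmt->fetch(PDO::FETCH_ASSOC)) && is_file($row['path'])) {
    header('Content-Type: application/msword');   // MS Word docs only, per the upload rule
    header('Content-Disposition: attachment; filename="' . basename($row['original_name']) . '"');
    readfile($row['path']);                       // serve straight off the file system
} else {
    http_response_code(404);                      // row and file out of sync, or bad id
}
```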
What you want is a fulltext index of the documents. This tends to be a job for e.g. Solr (see this cross-reference on Stack Overflow: How do I index documents in Solr). The database would keep a reference to the file on disk. You should not try to save BLOB data to an InnoDB table that does not run on the Barracuda format with ROW_FORMAT=DYNAMIC. Please refer to the MySQL Performance Blog for further details on the topic of BLOB storage in InnoDB.
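For the reference-plus-searchable-text layout, a hedged sketch follows. It assumes MySQL 5.6+, where InnoDB supports FULLTEXT indexes and (with Barracuda enabled) ROW_FORMAT=DYNAMIC; all names are illustrative:

```php
<?php
// Sketch: a table that stores only a file reference plus extracted text
// for searching, using the dynamic row format discussed above.
$pdo = new PDO('mysql:host=localhost;dbname=recruit;charset=utf8mb4', 'user', 'pass');
$pdo->exec("
    CREATE TABLE resumes (
        id        INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
        path      VARCHAR(255) NOT NULL,   -- where the .doc lives on disk
        body_text MEDIUMTEXT,              -- text extracted from the document
        FULLTEXT KEY ft_body (body_text)   -- InnoDB FULLTEXT needs MySQL 5.6+
    ) ENGINE=InnoDB ROW_FORMAT=DYNAMIC
");
```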

Scalable way to store files on server (PHP)?

I'm creating my first web application - a really simplistic online text editor.
What I need to do is find the best way to store text-based files - a lot of them.
These text files can be more than 10,000 words in size (text words, not computer words). In essence, I want the text documents to be limitless in size.
I was thinking about storing the text files in my MySQL database - but thought there was a better way.
Instead, I'm planning on storing the text files in an XML-based format in a directory on my server.
The rows in the database define the name of the XML-based text file and the user who created the text, along with basic metadata.
An ID is generated using a V4 GUID generator, which gives the text an ID and stores the text in the "/store" directory on my server. The text definitions on my server contain this ID, and the Android app I'm developing gets the contents of the text file by retrieving the text definition and then downloading the text to the local device using the GUID in the text definition.
I just think this is a botch job. How can I improve this system?
There have been cases of GUIDs colliding.
I don't want this to happen. A "slim" possibility isn't good enough - I need to make sure there is absolutely no chance of a GUID collision.
I was planning on checking the database for texts that have the same ID before storing a text with a particular ID. However, I believe that with over 20,000 pieces of text in my database, this would take a long time and put unneeded stress on the server.
How can I make GUID safe?
What happens when a GUID collides?
The server backend is going to be written in PHP.
You've got several questions here, so I'll try to answer them all.
Is XML with GUID the best way to do this?
"Best" is usually subjective. This is certainly one way to do it, but you're probably adding unneeded overhead. If it's just text you're storing, why not put it in the SQL with varchar(MAX)?
Are GUID collisions possible?
Yes, but the chance of that happening is small. Ridiculously small. There are much bigger things to worry about.
How can I make GUIDs safe?
Stop worrying about them.
What happens when a GUID collides?
This depends on how you're using them. In this case, the old data stored in the location indicated by the GUID would probably be overwritten by the new data.
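If you do keep GUIDs, generate them from a cryptographic random source. A minimal version 4 sketch, assuming PHP 7+ for random_bytes():

```php
<?php
// Sketch: generate an RFC 4122 version 4 UUID from a CSPRNG.
function uuid_v4(): string
{
    $bytes    = random_bytes(16);                       // 128 random bits
    $bytes[6] = chr((ord($bytes[6]) & 0x0f) | 0x40);    // set version to 4
    $bytes[8] = chr((ord($bytes[8]) & 0x3f) | 0x80);    // set RFC 4122 variant
    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($bytes), 4));
}

echo uuid_v4();  // e.g. 3f2b8c1a-9d4e-4c7b-a1f0-5e6d7c8b9a0f
```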
Well, I don't know if I'd use a GUID; I would probably just use the auto_increment key on the DB table and name the files after it, because unless you have deleted records from the DB without cleaning up the file system, they will always be unique. I don't know if the GUID is a requirement on the Android side, though.
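A minimal sketch of that alternative; the table name, columns, and sample values are assumptions:

```php
<?php
// Sketch: let MySQL's auto_increment hand out the unique id, then name
// the stored file after it. Assumed table: documents(id, user_id, title).
$pdo = new PDO('mysql:host=localhost;dbname=editor;charset=utf8mb4', 'user', 'pass');

$userId  = 42;                          // assumption: the authenticated user's id
$title   = 'My first note';
$xmlBody = '<doc><p>hello</p></doc>';   // the editor's serialized content

$pdo->prepare('INSERT INTO documents (user_id, title) VALUES (?, ?)')
    ->execute([$userId, $title]);

$id = (int)$pdo->lastInsertId();        // guaranteed unique by the DB
file_put_contents(__DIR__ . "/store/$id.xml", $xmlBody);
```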
There's nothing wrong with using MySQL to store the documents!
What is storing them in XML going to give you? Adding an additional format layer will only increase the processing time when they are read and formatted.
Placing them as files on disk would be no different from storing them in an RDBMS, and in the longer term would probably cause you further issues down the line (file access, disk seeks, locking, and race conditions come to mind).

How do I store and call 500k images?

I have 500k unique images in my folder/directory. I want to call them by name, and all the names are stored in a MySQL database. But I heard that images can be stored in a database, so my question is: which option displays an image faster? Do I need to store them in MySQL, or can I keep the same method I am following now?
If I need to store them in MySQL, then how do I create a table for it, and how do I store all these images?
This has been answered quite a few times, but you haven't talked about what type of application you are building. If you are building a web application, then storing the images on the file system has advantages re: caching.
Storing Images In Filesystem As Files Or In BLOB Database Field As Binaries
Storing Images in DB - Yea or Nay?
It's easy enough to store the images in a table; I would definitely avoid it, though, if your images are quite large.
I do not think 500k entries in a single directory will go over very well: How many files can I put in a directory?
Once upon a time, Linux ext3 started running slowly for very simple operations once a directory accumulated around 2,000 files (O(N) sorts of slowly!). After the htree patches were merged, large-directory performance improved drastically, but I'd still recommend partitioning your image store based on some very easy criterion. (The Squid web cache, for example, creates directory trees like 01/, 02/, and places files based on the first two characters of the filename.)
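A minimal PHP sketch of that kind of partitioning, in the spirit of Squid's scheme; the base directory and the two-level layout are assumptions:

```php
<?php
// Sketch: spread 500k images over a two-level directory tree using the
// first characters of an md5 of the file name, so no single directory
// ever holds more than a few hundred entries.
function shardedPath(string $filename, string $base = '/var/www/images'): string
{
    $hash = md5($filename);   // even spread regardless of how names cluster
    return sprintf('%s/%s/%s/%s', $base, substr($hash, 0, 2), substr($hash, 2, 2), $filename);
}

echo shardedPath('cat_1234.jpg');
// e.g. /var/www/images/3e/a1/cat_1234.jpg (hash prefix is illustrative)
```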
Do not store that much data in a DB like MySQL, especially if, as it sounds, you are not very familiar with it. Keep the images on the file system.
You should use the file system; storing them in a database is not going to work very well. You should read about the Facebook photo storage architecture to learn how Facebook does it: they have the most photos in the world, and their Haystack file store is built for exactly this problem.
Also interesting: http://www.fredberinger.com/high-performance-at-massive-scale-lessons-learned-at-facebook/
Storing images in the database (in a BLOB datatype) is much more inefficient than keeping those images stored on the file system.
By the way, inserting binary data into a MySQL table is straightforward; a sketch follows.
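A hedged sketch of such an insert with PDO; the table layout and file name are assumptions:

```php
<?php
// Sketch: insert an image file into a BLOB column with a prepared
// statement, so the binary data never has to be escaped by hand.
// Assumed table: images(id INT AUTO_INCREMENT, name VARCHAR(255), data MEDIUMBLOB).
$pdo  = new PDO('mysql:host=localhost;dbname=gallery', 'user', 'pass');
$stmt = $pdo->prepare('INSERT INTO images (name, data) VALUES (?, ?)');

$fh = fopen('/path/to/photo.jpg', 'rb');
$stmt->bindValue(1, 'photo.jpg');
$stmt->bindValue(2, $fh, PDO::PARAM_LOB);   // stream the file into the BLOB
$stmt->execute();
fclose($fh);
```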
If you have Redis, you could put all the images in memory; that would be the quickest way.

Copy MySQL BLOB field from one database to the other

I happen to have a database with pictures stored as BLOB fields. Can't help it; it was the previous developer's choice.
Now I need that data in a new site, and the provider won't let me copy the data the easy way (the file has become 11 MB - they won't accept an upload that big, and I don't have shell access).
So I thought I'd write a script that opens a connection to db1, selects all the records, and then copies each into a table in the new db2.
It all works fine if I exclude the BLOBs. If I want to copy them too, the insert fails.
Anyone had something similar before?
Should I treat the blobs differently when it comes to inserting?
Thanks for any ideas or help.
11 MB isn't a huge file; I'm surprised your host has such a low max upload size.
Have you thought about exporting as SQL, splitting the file in two (in Notepad++ or something), then uploading it in smaller sections? It wouldn't take long.
Perhaps check whether you can increase the max_allowed_packet setting on your MySQL DB. I'm not sure if it affects inserts, but I remember having to adjust this setting when I worked on a web app that allowed users to download 3-5 MB binaries from BLOB fields in the DB.
This link may be helpful, from a quick Google search: http://www.astahost.com/info.php/max_allowed_packet-mysql_t2725.html
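If the script route is the only option, here is a hedged sketch of the row-by-row copy. Hosts, credentials, and column names are assumptions; make sure max_allowed_packet on the target exceeds your largest picture.

```php
<?php
// Sketch: read each row from the old database and re-insert it into the
// new one, binding the BLOB column explicitly so it is treated as binary.
$src = new PDO('mysql:host=old-host;dbname=db1', 'user', 'pass');
$dst = new PDO('mysql:host=new-host;dbname=db2', 'user', 'pass');

$ins = $dst->prepare('INSERT INTO pictures (id, name, img) VALUES (?, ?, ?)');

foreach ($src->query('SELECT id, name, img FROM pictures') as $row) {
    $ins->bindValue(1, $row['id'], PDO::PARAM_INT);
    $ins->bindValue(2, $row['name']);
    $ins->bindValue(3, $row['img'], PDO::PARAM_LOB);   // don't let the blob be mangled
    $ins->execute();
}
```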
