I am wondering what the best workflow would be to handle this process.
The basic steps are:
The user selects a csv file and uploads it.
The csv file is then checked against a set of rules.
If the csv file is invalid
The user is shown the rows that are invalid.
The user is given the choice to terminate the upload or strip the invalid lines.
If the csv file is valid, or strip is clicked
The user is shown a screen to choose the filename.
If the filename is already taken the user is given the choice to
a) rename the file (to a name of their choosing)
b) replace existing file.
c) rename the file to filename_1 etc
When the name is chosen a table is created in the database called (csv_filename);
Then data from the csv is entered into the table.
The file is deleted.
The user is taken to a page showing the file data (from the table)
My issue is,
This is all run through ajax.
How do I handle reporting what file we are dealing with?
I don't want to pass back the filename in an ajax response, as that is too easy to tamper with.
I don't want to create a table to hold the filepath and pass back an id, as it seems a waste to have a table for just this.
There is one issue with this: once the file is uploaded in the first step, it's done. It has a file name, and the upload can't be terminated because the file is already there. It has to be, or how would you analyze it?
From the user's perspective you can make it look like that's how it's working, which may be what you meant.
Anyway, to keep track of which file you're dealing with, store it in a session variable.
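A minimal sketch of the session approach, assuming the server keeps the real path and the ajax responses only carry an opaque token. The function names and the 'uploads' session key are made up for illustration, not taken from the question.

```php
<?php
// Hypothetical helpers: the path stays server-side in the session,
// and only a random token travels in the ajax responses.
function rememberUpload(array &$session, string $path): string
{
    $token = bin2hex(random_bytes(16)); // 32 hex characters
    $session['uploads'][$token] = $path;
    return $token;
}

// Returns the stored path, or null when the token is unknown or tampered with.
function resolveUpload(array $session, string $token): ?string
{
    return $session['uploads'][$token] ?? null;
}
```

In a request handler this would be called as rememberUpload($_SESSION, $path) after session_start(). A tampered token simply resolves to null, so the client never gets to name a file directly.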
I have a client requirement for a resumable file import. The use case: the imported file is a CSV whose rows correspond to data in three MySQL tables. If any warning occurs during the import, such as a name or description that is too long, execution should stop and the warning should be shown to the client. The client then has two options: ignore or re-upload. If they ignore the warning, the import resumes from the point where the error happened; the other option is simply to upload the file again. How should I continue the script execution after showing the error message? How should I track the number of CSV lines that have been completed?
Now let me explain my plan. I am using jQuery ajax in the first place. After saving each line, I save the number of lines processed in a session variable. If the user ignores the warning, I re-upload the CSV and skip ahead until the line number matches the one in the session variable. Is this an elegant way of doing things? The reason I ask is that, with my logic, every time the user ignores an error message they have to wait until the line stored in the session is reached. Please explain with an example if there is a better solution to this problem. Thanks in advance.
A possible workflow
1. Analyze each line of the CSV file and load it into an empty table created specially for this purpose. The table needs only the columns you import from the CSV file. It can also contain additional columns for data computed during the import; for example, if processing and validating the input requires lookups into data that already exists in the database, the values found should also be stored in the import table to avoid running the same queries again.
2. Ignore the lines from the CSV file that cannot be imported, but keep track of their information (content, failure reason, etc.). You'll need it to display to the user.
3. If there are no errors in the CSV file, move the (validated) data from the special table into the destination tables and call it a success.
4. If there are errors, show them to the user.
5. If the user decides to reload the CSV, start over with step 1.
6. If the user decides to ignore the errors, resume with step 3.
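The validation pass (analyze each line, keep the good ones, record why the others failed) might look roughly like the sketch below. The two-column layout, the length limits, and the function name are assumptions made for illustration; the valid rows would then be inserted into the import table.

```php
<?php
// Sketch only: assumes each line is "name,description" and that the
// limits below mirror the destination columns. Both are hypothetical.
function validateCsvLines(array $lines, int $maxName = 50, int $maxDescription = 200): array
{
    $valid = [];
    $rejected = [];
    foreach (array_values($lines) as $index => $line) {
        $lineNumber = $index + 1;
        $fields = str_getcsv($line, ',', '"', '\\');
        if (count($fields) < 2) {
            $reason = 'expected 2 columns';
        } elseif (strlen((string) $fields[0]) > $maxName) { // strlen counts bytes
            $reason = 'name too long';
        } elseif (strlen((string) $fields[1]) > $maxDescription) {
            $reason = 'description too long';
        } else {
            $valid[] = ['name' => $fields[0], 'description' => $fields[1]];
            continue;
        }
        // Keep content and failure reason so they can be shown to the user.
        $rejected[] = ['line' => $lineNumber, 'content' => $line, 'reason' => $reason];
    }
    return ['valid' => $valid, 'rejected' => $rejected];
}
```

Because the rejected lines are only recorded, not fatal, the whole file is processed in one pass and nothing needs to be resumed.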
Advantages
There is no need to upload the CSV file again.
There is no need to keep track of the progress.
Disadvantages
A lot of work is done and then discarded if the file is large, the error is near the beginning of the CSV file, and the user decides to upload it again. This can be improved by using an extra database table that holds the raw lines of the CSV file, plus an extra step (step 0): load the CSV file into that table, then use the table instead of the CSV file as the input for step 1. When a line is valid, remove it from the raw-lines table; when a line is not valid, skip to step 4. When the user decides to ignore the error, resume with step 1.
When I import data from a file, such as a csv or text file, I would like to display a summary page (table view) of what is going to be imported into the database, where the user can select or deselect what will be imported. What is the best way of temporarily storing this data so it can be displayed on the summary page?
To give a clearer idea of what I am talking about, for example:
I have a csv file that holds a list of products.
I would run an import to read the csv file and display it on a summary page with
a table view and checkboxes to select/deselect some of the data.
After the selection, I would click the confirm button to store the checked data.
So between reading the csv data and displaying the summary page, where should I store the data: in the database as tbl_temp data that is cleaned up when done, or just read directly from the csv file?
Thanks.
I would think an array would do the trick. If you think your environment is really unstable, or you need it for user interface reasons, then just store to a temp file. A temp DB table seems a bit much.
I would just display it in the browser. As for where it is stored: it's already stored, in the uploaded CSV file!
Unless you run into an actual problem with that approach, why care?
Just use HTML to display the data read from the file in a web browser. Use standard HTML Form and form controls. Use PHP scripting logic to control what happens when the form is submitted. You could have the submit button hand off to a service or some other php file that could handle writing it to the db.
Why bother writing to a temp table, just to read it back out and display it in a form, to then be written to a permanent table?
If this were a process that you would be performing a lot, I might give it a little more consideration.
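As a rough sketch of the "read the file and show it in a plain HTML form" idea: the function names, the rows[] field name, and the form action are assumptions, and splitting on line breaks means quoted fields that span several lines are not handled.

```php
<?php
// Parse CSV text into an array of rows (one array of cells per line).
function parseCsv(string $csv): array
{
    $rows = [];
    foreach (preg_split('/\R/', trim($csv)) as $line) {
        if ($line === '') {
            continue;
        }
        $rows[] = str_getcsv($line, ',', '"', '\\');
    }
    return $rows;
}

// Render one table row per CSV row, each with a pre-checked checkbox
// whose value is the row index.
function renderSelectionForm(array $rows, string $action): string
{
    $html = '<form method="post" action="' . htmlspecialchars($action) . "\">\n<table>\n";
    foreach ($rows as $i => $row) {
        $html .= '<tr><td><input type="checkbox" name="rows[]" value="' . $i . '" checked></td>';
        foreach ($row as $cell) {
            $html .= '<td>' . htmlspecialchars((string) $cell) . '</td>';
        }
        $html .= "</tr>\n";
    }
    return $html . "</table>\n<button type=\"submit\">Confirm</button>\n</form>";
}
```

On submit, the handler would re-read the same uploaded file and import only the rows whose indexes appear in $_POST['rows'], so nothing has to be copied into a temporary table.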
I have a form where an admin uploads three pictures with different dimensions to three different designated directories. To avoid duplicate file names, my PHP currently compares the uploaded file name against the designated directory; if a file with that name already exists, it echoes an error and stops the script.
A friend pointed out that it is bad practice to ask the admin to rename the picture manually and deal with the duplication problem themselves. He suggested renaming the file automatically, storing the new name in the database, and then moving the file to its directory.
I am unsure what naming scheme to use for the renamed file so that it stays unique. To be more precise, let me explain my directory structure.
As I said, the admin will upload three picture files, namely
a) Title Picture
b) Brief Picture
c) Detail Picture
All three picture files are moved to their respective directories: the title picture goes to the title directory, and so on.
I am currently using the script below just to move the file and store its name with path in a varchar column in the database.
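// Temporary path assigned by PHP, and the file name sent by the client.
$ns_pic_title_loc = $_FILES["ns_pic_title"]["tmp_name"];
$ns_pic_title_name = $_FILES["ns_pic_title"]["name"];
// move_uploaded_file() returns false on failure.
move_uploaded_file($ns_pic_title_loc, $ns_title_target.$ns_pic_title_name) or die("Could not move the uploaded file.");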
That is just sample code; I haven't included the validation function I am using. I was thinking of renaming all the files like this:
a) In the title directory the files should be stored as
title_1.jpg
title_2.jpg
title_3.jpg
title_4.jpg
and so on
and the same for the rest of the pictures. How do I do that? What function do I use to achieve this? And if this is not a good way to rename the files, I would appreciate any other suggestion.
Thanks in advance.
Well, here's a possible solution:
Get uploaded filename from $_FILES["ns_pic_title"]["name"] and separate extension OR if we are only talking about image files get the image type with getimagesize($_FILES["ns_pic_title"]["tmp_name"]);
Check your database for the maximum id of the image records and set the $file_name variable to 'title_'.($max_id + 1)
At this point you should have $file_name and $file_extension so do move_uploaded_file($_FILES["ns_pic_title"]["tmp_name"], $ns_title_target.$file_name.'.'.$file_extension)
Hopefully this makes sense and helps.
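The name-building part of those steps could be sketched as follows. The function name is hypothetical, and $maxId is assumed to come from a query such as SELECT MAX(id) on the image table. Note that two uploads that read the same maximum id at the same moment would produce the same name, so this is only safe when uploads do not overlap.

```php
<?php
// Build "title_<max id + 1>.<extension>" from the client's file name.
function nextTitleFileName(int $maxId, string $originalName): string
{
    $extension = strtolower(pathinfo($originalName, PATHINFO_EXTENSION));
    $name = 'title_' . ($maxId + 1);
    return $extension === '' ? $name : $name . '.' . $extension;
}
```

The result would be appended to $ns_title_target and passed as the destination of move_uploaded_file().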
There are a couple of good options with various pros and cons.
Use php's tempnam when moving the file, and store the path in your mysql database. tempnam generates a unique filename.
Use mysql to store the image content in a blob. This way you will access the image content via an id instead of a pathname.
Instead of having logic to figure out what the latest picture name is and calculate the next number, why not just use PHP's tempnam() function? It generates a unique name with a prefix of your choice (e.g., "title", "brief", "detail"). You could also simply prepend a timestamp to the file name -- if you don't have a whole lot of admins uploading pictures at the same time, that should handle most name conflicts.
Since your pictures are going to be sorted into title, brief and detail directories already, it's not really necessary to name each picture title_*, brief_*, and detail_*, right? If it's in the title directory, then it's obviously a title picture.
Also, you're going to be putting the file names in the database. Then elsewhere in the app, when you want to display a picture, I assume you are getting the correct file name from the database. So it isn't really important what the actual file name is as long as the application knows where to find it. If that's correct, it's not necessary to have a very friendly name, thus a tempnam() file name or a timestamp plus the original file name would be acceptable.
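A small sketch of the tempnam() approach, with a hypothetical wrapper name. tempnam() creates an empty file at the returned path, and the name has no image extension; the prefix may also be truncated on some platforms.

```php
<?php
// Reserve a unique file inside $directory whose name starts with $prefix.
function reserveUniquePath(string $directory, string $prefix): string
{
    $path = tempnam($directory, $prefix);
    if ($path === false) {
        throw new RuntimeException("Could not create a file in $directory");
    }
    return $path;
}
```

The returned path can then be used as the destination, for example move_uploaded_file($_FILES["ns_pic_title"]["tmp_name"], $path), and stored in the database as the file's location.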
Because you are storing references into the DB, I would prefer to just md5 the datetime and use that for the filename and store the disk filename to the DB also. It doesn't matter what name it is written to disk with as long as you can point to it with the unique name into the DB.
I use this methodology, and in none of my testing does the disk name (md5 from the datetime) ever require multiple tries.
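A sketch of that idea with two additions that are not part of the answer above: random bytes are mixed into the hash so that two uploads at the same instant still differ, and the name is regenerated in the unlikely case that it already exists on disk. The function name is made up.

```php
<?php
// Generate an md5-based file name that does not yet exist in $directory.
function uniqueDiskName(string $directory, string $extension): string
{
    do {
        // Current time (as in the answer) plus random bytes (an addition).
        $name = md5(microtime() . random_bytes(8)) . '.' . $extension;
    } while (file_exists($directory . '/' . $name));
    return $name;
}
```

The returned name is what gets stored in the database alongside the record that points to the picture.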
I am currently using the Zend Framework and have an upload file form. An authenticated user has the ability to upload a file, which will be stored in a directory in the application, and the location stored in the database. That way it can be displayed as a file that can be downloaded.
But something I am noticing is that a file with the same name will overwrite a file in the uploads directory. There is no error message, nor does the filename increment. So I think the file must be overwritten (or never uploaded).
What are some best practices I should be aware of when uploading, moving, or storing these files? Should I always be renaming the files so that the filename is always unique?
Generally, we don't store files under the name given by the user, but under a name that we (i.e. our application) choose.
For instance, if a user uploads my_file.pdf, we would :
store a line in the DB, containing :
id ; an autoincrement, the primary key -- "123", for instance
the name given by the user ; so we can send the right name when someone tries to download the file
the content-type of the file ; application/pdf or something like that, for instance.
"our" name : file-123 for instance
when there is a request to the file with id=123, we know which physical file should be fetched ('file-' . $id) and sent.
and we can set a header to send the correct "logical" name to the browser, using the name we stored in the DB, for the "save as" dialog box
same for the content-type, btw
This way, we make sure :
that no file has any "wrong" name, as we are the ones choosing it, and not the client
that there is no overwriting: as our filenames include the primary key of our table, those file names are unique
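The scheme above could be sketched like this, assuming a DB record with hypothetical keys 'original_name' and 'content_type'. The function names are illustrative and not part of Zend Framework.

```php
<?php
// "Our" name on disk is derived from the primary key, so it is unique.
function physicalName(int $id): string
{
    return 'file-' . $id;
}

// Headers that give the browser the user's original name and content type.
function downloadHeaders(array $record): array
{
    // Drop characters that would break the quoted header value.
    $safeName = str_replace(['"', "\r", "\n"], '', $record['original_name']);
    return [
        'Content-Type: ' . $record['content_type'],
        'Content-Disposition: attachment; filename="' . $safeName . '"',
    ];
}
```

A download script would pass each returned string to header() and then stream the upload directory's physicalName($id) file with readfile().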
Continuing on Pascal MARTIN's answer:
If you use an id as the name, you can also come up with a directory naming strategy. It takes no longer to get /somedir/part1ofID/part2OfID from the filesystem than /somedir/theWholeID, but how you split the ID into path and file name lets you control how many files are stored in the same directory.
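One possible split, as a sketch: pad the id to at least six digits and use the last three digits as the file name, so each directory holds at most 1000 files. The padding width, the split point, and the function name are arbitrary choices for illustration.

```php
<?php
// Map an id to <base>/<leading digits>/<last three digits>.
function shardedPath(string $baseDir, int $id): string
{
    $padded = str_pad((string) $id, 6, '0', STR_PAD_LEFT);
    $directoryPart = substr($padded, 0, -3);
    $filePart = substr($padded, -3);
    return rtrim($baseDir, '/') . '/' . $directoryPart . '/' . $filePart;
}
```

For example, id 123456 would be stored at /somedir/123/456, and id 123 at /somedir/000/123.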
The next good thing is that the script that you use to actually output the file to the user can choose if the user is authorized to see the file or not. This of course requires the files to be stored somewhere not readable by everyone by default.
You may also want to look at this other question. Not totally related, but good to be aware of.
Yes, you need to come up with a way to name them uniquely. I've seen all kinds of strategies for this, ranging from a hash based on the original filename, the pk of the db record and the upload timestamp, to some type of slugging, again based on various fields in the db record it's attached to or in related records.
I'm creating an online game in PHP where users can create playable characters. Each character can have a user-uploaded portrait. A player can simultaneously have multiple characters, and the pictures for them can be changed anytime. Naturally, the pictures have to be resized and re-compressed to avoid huge files. Here's my problem:
When the player changes his data (among it the picture), and then hits "save", server side validation kicks in. It checks for things like non-unique character names, empty mandatory fields, etc. If any errors are found, they are displayed. In this case the form should be pre-populated with the data the player entered, so he only has to change the faulty bit, not re-type everything. But how do you save the picture in such a "temporary" state?
I cannot pre-populate the file upload field, the browsers don't allow that. If I save it in a temporary file, the picture then has to be cleaned up at some point, because the player can simply close his browser and abort the whole process. When should that be? And what file name should I choose for the temporary file? If the player opens the same character to edit in two browser tabs, they should not conflict (each of them should have their own copy).
How would you solve this problem?
I'd save the file to a temporary location and store both a unique identifier as well as the current timestamp in the filename. Then put the filename in the user's session. When a user has successfully created or updated their account, you save the image file to its permanent location and remove the temporary file. You can run a cron process to scan the temporary directory and check the timestamps, deleting any files older than your expiration (an hour perhaps).
If you're unable to run a cron job, you could always just launch the directory clean-up each time you have a successful create/update validation. This might be a bit more inefficient (extra directory reads and possibly file operations for every successful submission) but unless you're dealing with a lot of traffic, you probably won't even notice.
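A sketch of the temporary-file naming and clean-up described above. The name format (timestamp, underscore, random identifier, .tmp suffix) and both function names are assumptions. The clean-up function can be called from a cron script or after a successful save.

```php
<?php
// File name carrying its creation time and a random identifier.
function tempPortraitName(int $now): string
{
    return $now . '_' . bin2hex(random_bytes(8)) . '.tmp';
}

// Delete temporary files whose embedded timestamp is older than
// $maxAgeSeconds. Returns the number of files removed.
function purgeExpired(string $directory, int $maxAgeSeconds, int $now): int
{
    $deleted = 0;
    foreach (glob($directory . '/*.tmp') ?: [] as $path) {
        $timestamp = (int) explode('_', basename($path))[0];
        if ($now - $timestamp > $maxAgeSeconds && unlink($path)) {
            $deleted++;
        }
    }
    return $deleted;
}
```

Because every upload gets its own random identifier, two browser tabs editing the same character end up with separate temporary files; the name kept in the session (or in a hidden form field per tab) says which one belongs to which form.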
Create a table to hold references to images.
When a file is uploaded, if it's a valid image, do all your resizing, etc, and create a record in the table that points at the file.
Pass the id of the reference record around with the form data. Display the image when you redisplay the form, so the user knows they don't have to re-upload.
When you finally accept the new character object, set avatar_id or whatever.
Run a regular cron-job to cull orphaned image records (deleting the files on disk as well).
You could always populate a disabled text box with the name of the picture - it won't populate the browse input field, but it is better than nothing. For editing, to avoid conflicts you could add a "modifying" column for each user's characters, and set it to true on a character editing request. When the user saves the character, set it back to false. Grant an edit request only when "modifying" is false.
I'd recommend either updating the image immediately, regardless of error in the form, or separating the image updating to a separate form. That way you'll get rid of two problems without complex state machines and cleaning up.