I'm creating an online game in PHP where users can create playable characters. Each character can have a user-uploaded portrait. A player can simultaneously have multiple characters, and the pictures for them can be changed anytime. Naturally, the pictures have to be resized and re-compressed to avoid huge files. Here's my problem:
When the player changes his data (among it the picture), and then hits "save", server side validation kicks in. It checks for things like non-unique character names, empty mandatory fields, etc. If any errors are found, they are displayed. In this case the form should be pre-populated with the data the player entered, so he only has to change the faulty bit, not re-type everything. But how do you save the picture in such a "temporary" state?
I cannot pre-populate the file upload field because browsers don't allow that. If I save it in a temporary file, the picture then has to be cleaned up at some point, because the player can simply close his browser and abort the whole process. When should that be? And what file name should I choose for the temporary file? If the player opens the same character to edit in two browser tabs, they should not conflict (each of them should have their own copy).
How would you solve this problem?
I'd save the file to a temporary location and store both a unique identifier as well as the current timestamp in the filename. Then put the filename in the user's session. When a user has successfully created or updated their account, you save the image file to its permanent location and remove the temporary file. You can run a cron process to scan the temporary directory and check the timestamps, deleting any files older than your expiration (an hour perhaps).
If you're unable to run a cron job, you could always just launch the directory clean-up each time you have a successful create/update validation. This might be a bit more inefficient (extra directory reads and possibly file operations for every successful submission) but unless you're dealing with a lot of traffic, you probably won't even notice.
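As a rough illustration of the naming and clean-up scheme described above, here is a minimal sketch. The helper names (`makeTempUploadName`, `cleanExpiredUploads`), the file name layout and the one-hour limit are assumptions made for the example, not part of the original answer.

```php
<?php
// Hypothetical helpers; the naming layout and the one-hour limit are assumptions.

/** Build a temp file name of the form "<timestamp>_<16 hex chars>.<ext>". */
function makeTempUploadName(string $extension, ?int $now = null): string
{
    $now = $now ?? time();
    $id = bin2hex(random_bytes(8)); // unique per upload, so two browser tabs never collide
    return $now . '_' . $id . '.' . $extension;
}

/** Delete temp files whose embedded timestamp is older than $maxAge seconds. */
function cleanExpiredUploads(string $tempDir, int $maxAge = 3600, ?int $now = null): int
{
    $now = $now ?? time();
    $deleted = 0;
    foreach (glob($tempDir . '/*') ?: [] as $path) {
        // Only touch files that follow the "<timestamp>_<id>.<ext>" pattern.
        if (!preg_match('/^(\d+)_[0-9a-f]{16}\.\w+$/', basename($path), $m)) {
            continue;
        }
        if ($now - (int) $m[1] > $maxAge && unlink($path)) {
            $deleted++;
        }
    }
    return $deleted;
}
```

The generated name is what you would keep in the user's session. `cleanExpiredUploads()` could be called from a cron job or, as suggested above, after each successful save.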
Create a table to hold references to images.
When a file is uploaded, if it's a valid image, do all your resizing, etc, and create a record in the table that points at the file.
Pass the id of the reference record around with the form data. Display the image when you redisplay the form, so the user knows they don't have to re-upload.
When you finally accept the new character object, set avatar_id or whatever.
Run a regular cron-job to cull orphaned image records (deleting the files on disk as well).
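The steps above could look roughly like the following sketch. The table and column names (`images`, `characters.avatar_id`), the SQLite-flavoured schema and the age threshold (so that images still attached to an open edit form are not culled) are all assumptions for illustration; on MySQL the primary keys would need `AUTO_INCREMENT`.

```php
<?php
// Sketch only: schema, names and the one-hour grace period are assumptions.

function createSchema(PDO $db): void
{
    $db->exec('CREATE TABLE images (id INTEGER PRIMARY KEY, path TEXT NOT NULL, created_at INTEGER NOT NULL)');
    $db->exec('CREATE TABLE characters (id INTEGER PRIMARY KEY, name TEXT NOT NULL, avatar_id INTEGER)');
}

/** Record a processed upload and return the id to pass around with the form. */
function registerImage(PDO $db, string $path, ?int $now = null): int
{
    $stmt = $db->prepare('INSERT INTO images (path, created_at) VALUES (?, ?)');
    $stmt->execute([$path, $now ?? time()]);
    return (int) $db->lastInsertId();
}

/** Remove image rows (and their files) that no character references and that are older than $maxAge. */
function cullOrphanedImages(PDO $db, int $maxAge = 3600, ?int $now = null): int
{
    $cutoff = ($now ?? time()) - $maxAge;
    $stmt = $db->prepare(
        'SELECT id, path FROM images
         WHERE created_at < ?
           AND id NOT IN (SELECT avatar_id FROM characters WHERE avatar_id IS NOT NULL)'
    );
    $stmt->execute([$cutoff]);
    $delete = $db->prepare('DELETE FROM images WHERE id = ?');
    $count = 0;
    foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
        if (is_file($row['path'])) {
            unlink($row['path']); // delete the file on disk as well
        }
        $delete->execute([$row['id']]);
        $count++;
    }
    return $count;
}
```

The id returned by `registerImage()` can travel in a hidden form field, and `cullOrphanedImages()` is the part a cron job would call.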
You could always populate a disabled text box to hold the name of the picture - it won't populate the browse input field, but it is better than nothing. For editing, to avoid conflicts you could create a "modifying" column for each user's characters, and on a character editing request change the value to true. When the user saves the character, set it back to false. Grant an edit request only when "modifying" is false.
I'd recommend either updating the image immediately, regardless of error in the form, or separating the image updating to a separate form. That way you'll get rid of two problems without complex state machines and cleaning up.
Related
I have a simple image upload form. When someone uploads an image, it is for a football pool, so there always is a $poolid that goes with the image they upload.
Right now, I am naming the uploaded image using the poolid. So for example, if someone uploads an image, it might get named P0714TYER7EN.png.
All the app will ever do is, when it outputs the football pool's page, it will check to see if an image exists for that pool and if so, it will show it. It checks like this:
if (file_exists("uploads/".$poolid.".png")) { //code to show it }
My first thought when planning this was to add a field called "image" in my MySQL database's table for all the pool information (called pools) and I would store a value of either the image name (P0714TYER7EN.png) or empty if there wasn't one uploaded. Then I would check that field in the database to determine if an image exists or not.
But I realized I don't really need to store anything in the database because I can simply use the PHP file_exists check above to know if there is an image or not.
In other words, it would seem redundant to have a field in the database.
Everything works doing it this way (i.e. NOT having a field in the database) but I'm wondering if this is bad practice for any reason?
If anyone feels that I should absolutely still have a field in the database, please share your thoughts. I just want to do it the proper way.
Thank you.
The approach could depend a lot on what exactly you're trying to do. It seems like the options you have are:
File System Only
Benefits would be the speed of accessing static files of an image and use of it in your HTML directly which makes it a more simple solution. Also if you're comfortable with using these functions it will be faster to finish.
Drawbacks would be that you're limited to using file_exists and similar. Any code to manage files this way has to be very specific and static. You also cannot search or perform operations efficiently on this. In general, relying on the file system alone is not a best practice in my experience.
Database Only
Benefits, you can use Blob type as a column with meta data like owner, uploader, timestamp, etc. in the same row. This makes checking for existing files faster as well as any searching or other operations fast and efficient.
Drawbacks, you can't serve files statically using a CDN or even a cookie-less subdomain or other strategies for page performance. You also have to use PHP and MySQL to generate then serve any images via code rather than just referring to the image file directly.
Hybrid
Benefits, basically the same benefits as both above. You can have your metadata in MySQL with an MD5 hash and the location of the file available as well. Your PHP then renders the page with a direct link to the file rather than processing the Blob to an image. You could use this in conjunction with a CDN by prefixing or storing the CDN location as well.
Drawbacks, if you manually changed names of files on the server you'd have to rely on a function matching hashes to detect this, though this would also affect a File System Only that needs to detect a duplicate file potentially.
TL;DR: the Hybrid approach is what you'll see most software use, such as WordPress, and I believe it would be considered a best practice, while file system only is a bit of a hack.
Note: Database only could be a best approach in specific situations where you want database clustering and replication of images directly in your database rather than to a file system (especially if the file system is restricted access or unable to be modified for any reason, then you have full flexibility on the DB).
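To make the Hybrid option a little more concrete, here is a minimal sketch of the metadata side. The field names, the function names and the CDN host are illustrative assumptions; the returned array is what you might insert as a row in MySQL.

```php
<?php
// Sketch only: the record fields and the CDN base URL are illustrative assumptions.

/** Build the metadata row that would be stored in MySQL for a file on disk. */
function buildImageRecord(string $filePath, string $relativePath, int $uploaderId): array
{
    return [
        'path'        => $relativePath,       // where the file lives, relative to the upload root
        'md5'         => md5_file($filePath), // helps detect duplicates or files changed by hand
        'uploader_id' => $uploaderId,
        'uploaded_at' => time(),
    ];
}

/** Turn a stored record into a direct link, optionally prefixed with a CDN host. */
function imageUrl(array $record, string $baseUrl = '/uploads'): string
{
    return rtrim($baseUrl, '/') . '/' . ltrim($record['path'], '/');
}
```

The page then only needs something like `imageUrl($record, 'https://cdn.example.com/img')` to emit a direct link, so the image itself is still served as a static file.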
You can also use the BLOB data types from MySQL. There you can save the image as binary data next to the data about the football pool.
So when you want to load a football pool you simply fire an SQL statement and check if it returns a result. If so, load the image from the database and display the data; otherwise throw an error.
If you have very frequent access you can put the images into a separate table and load the image independently of the data about the football pool. Additionally, set some cache headers on the image and serve it from a separate file; this way you only need to save the primary key of the image in the football pool table. When you want to display the web page, you simply load another document and pass it the primary key of the image. The image is loaded there or, if the browser has it cached, it is loaded from the cache without querying the database.
This way you also have a better consistency of data and images.
You're uploading an image to a specific folder, named with the poolid, which will be unique. It should work just fine.
Problem :
The code you have written works great. The problem is that if the first uploaded image is a .png and a later one is a .jpeg or .jpg, file_exists won't find it, and hence the check may fail.
Caution :
If you have already taken care to check that the uploaded image must be a png, then file_exists will work great.
Alternate Solution :
In case you're not checking that the image type is .png, I highly advise you to add a boolean column to your table, named is_image_uploaded or something similar, which is set every time you upload a file.
This makes sure that the next time you want to upload an image you can go directly to your database table and see whether the is_image_uploaded column is set. If it is not set, upload; otherwise ignore it or do whatever you want.
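If you would rather stay with the file system check, one possible way around the extension problem is to test each allowed extension in turn. This is only a sketch; the function name and the extension list are assumptions.

```php
<?php
// Sketch only: the extension list and function name are assumptions.

/** Return the path of the pool's image, or null when none has been uploaded. */
function findPoolImage(
    string $uploadDir,
    string $poolId,
    array $extensions = ['png', 'jpg', 'jpeg', 'gif']
): ?string {
    foreach ($extensions as $ext) {
        $path = $uploadDir . '/' . $poolId . '.' . $ext;
        if (is_file($path)) {
            return $path; // first match wins
        }
    }
    return null;
}
```

With this approach you would also want to delete any older image for the same pool when a new one is uploaded with a different extension, otherwise the stale file may be the one that is found first.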
I have a PHP script where a user can upload an image. This image is stored in the temporary directory and is returned to the user. The user can then use a JavaScript interface to crop the image. (x1,y1)(x2,y2) is sent to the script, which uses it to crop the image. It is then returned to the user for another preview and/or crop. Once the user is sufficiently satisfied he will click "save". The temp file is copied over to the original and the temp deleted. These are not per-user images, but rather images of equipment. Any user in the organization can replace any image of equipment. This approach is good but there are a few issues:
1) Let's say the user uploads an image for preview but then closes the browser window. I will be left with a temporary file. This can become an issue. Sure I can have a CRON clean them up but in theory I can have a ton of temporary files (this is ugly). The cron can also delete the user's temp file during an edit.
2) To deal with number 1 I can always have a temporary file per piece of equipment, such as equip1.temp and equip1.jpg. All uploads are stored in equip1.temp, all commits are transferred to equip1.jpg. If two users are trying to upload pictures of the same piece of equipment at the same time this could mess them up (highly unlikely + not an issue, but still ugly).
3) I can always pass the image back and forth (user "uploads" image and it gets echoed back as an <img src="base64....." />. The resulting edits + original base64 string are sent back to PHP for processing). This solution relieves the temp file issue but I noticed it takes several seconds to send high res images back and forth.
How would you deal with this situation?
I had a similar issue. If I recall correctly (it's been a while), I ended up creating a table in a DB to store file names and session keys/time. Each time the script loaded, if there was a dead session in the database, the corresponding session and image/file was deleted.
I don't know if that's a good solution or not, but it solved the multiple user access problem for me.
I wouldn't recommend #3 due to the reasons you mentioned.
I suggest you do this instead:
User uploads file to a random temporary name. equip1.jpg gets stored as equip1_fc8293ae82f72cf7.jpg. Be sure your script juggles both file names around. It will allow two users to upload the same equipment, with the last one to upload being the winner, but no conflict along the way.
Every time your cropper works with the temp image, you should "touch" it to update the modified time.
Let the user finish their edits, move the temp file in place of the final image name.
Have a cron, or a section of your uploader script, that deletes abandoned temp files that have an mtime older than an hour or so. You suggest this is messy, because of the potential of lots of temp files, but do you expect a lot of images to be abandoned? Garbage collection is a very standard method for this problem.
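The steps above can be sketched roughly as follows. The function names and the one-hour limit are assumptions for the example, and `tempNameFor()` assumes the final name has an extension.

```php
<?php
// Sketch only: function names and the one-hour limit are assumptions.

/** "equip1.jpg" -> "equip1_<16 random hex chars>.jpg". Assumes $finalName has an extension. */
function tempNameFor(string $finalName): string
{
    $info = pathinfo($finalName);
    return $info['filename'] . '_' . bin2hex(random_bytes(8)) . '.' . $info['extension'];
}

/** Move the finished temp file over the final image; the last commit wins. */
function commitTemp(string $dir, string $tempName, string $finalName): bool
{
    return rename($dir . '/' . $tempName, $dir . '/' . $finalName);
}

/** Delete temp files (name_<16 hex>.ext) that were not modified within $maxAge seconds. */
function collectAbandoned(string $dir, int $maxAge = 3600): int
{
    $deleted = 0;
    foreach (glob($dir . '/*_*.*') ?: [] as $path) {
        if (!preg_match('/_[0-9a-f]{16}\.\w+$/', basename($path))) {
            continue; // not one of our temp files
        }
        clearstatcache(true, $path); // make sure we read a fresh mtime
        if (time() - filemtime($path) > $maxAge && unlink($path)) {
            $deleted++;
        }
    }
    return $deleted;
}
```

Each crop request would call `touch($dir . '/' . $tempName)` so that a file being actively edited is never old enough to be collected.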
I am wondering what the best workflow would be to handle this process.
Basic steps are.
The user selects a csv file and uploads it.
The csv file is then checked against a set of rules.
If the csv file is invalid
The user is shown the rows that are invalid.
The user is given the choice to terminate the upload or, strip the invalid lines.
If the csv file is valid, or strip is clicked
The user is shown a screen to choose the filename.
If the filename is already taken the user is given the choice to
a) rename the file (to a name of their choosing)
b) replace existing file.
c) rename the file to filename_1 etc
When the name is chosen a table is created in the database called (csv_filename);
Then data from the csv is entered into the table.
The file is deleted.
The user is taken to a page showing the file data (from the table)
My issue is,
This is all run through ajax.
How do I handle reporting what file we are dealing with?
I don't want to pass back the filename in an ajax response as that is too easy to tamper with.
I don't want to create a table to hold the filepath and pass back an id, as it seems to be a waste to have a table for just this.
There are some issues with this. When the file is uploaded in the first step, it's done: it's got a file name, and it can't be terminated because it's already there. It has to be, or how will you analyze it?
From the user's perspective you can make it look like that's how it's working, which maybe is what you meant.
Anyways, to report what file you're dealing with, store it in a session variable.
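A minimal sketch of the session idea follows. The session key `csv_uploads`, the function names and the use of an opaque token (so that several uploads can be open at once) are assumptions added for illustration; a single `$_SESSION['csv_file']` entry would work as well.

```php
<?php
// Sketch only: the session key "csv_uploads" and the function names are assumptions.

/** Remember a server-side path and return an opaque token to send to the browser. */
function rememberUpload(array &$session, string $path): string
{
    $token = bin2hex(random_bytes(16));
    $session['csv_uploads'][$token] = $path; // the real path never leaves the server
    return $token;
}

/** Look up the path for a token; unknown or tampered tokens yield null. */
function resolveUpload(array $session, string $token): ?string
{
    return $session['csv_uploads'][$token] ?? null;
}
```

After `session_start()`, you would pass `$_SESSION` to both functions. The ajax responses only ever carry the token, so changing it on the client can at worst point at nothing.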
I need a way to remove "unused" images from my filesystem, i.e. images that are never accessed from any point in my website (doesn't matter if I break external links. I might disable external hotlinking altogether). What's the best way of going about this? Regular users can add multiple attachments to topics/posts and content contributors can bulk upload large numbers of images which can be used in articles or image galleries.
The problem is that the images could be referenced in any of the following ways:
From user content (text/html, possibly Markdown or BBCode) stored in the database
Hardcoded into an HTML page
Hardcoded into a PHP file
Hardcoded into a CSS file
As an "attachment" field in a database table, usually containing only the filename itself with no path, because the application assumes that it would be in a certain folder.
And to top it off, the path of the image could be an absolute or relative HTTP or PHP path and may or may not be built with string concatenation in PHP.
So obviously find/replace or regexing the database or filesystem is out of the question. But luckily for you and me, this system isn't fully implemented yet and I don't need anything that deals with an existing hoard of images. I just need to set up some efficient structure that will allow this in the future.
Some ideas I've thought of:
Intercepting the HTTP request for the image with PHP, and keeping track of the HTTP_REFERER. The problem with this is that just because no one has clicked on a link at the time of checking this doesn't mean the link doesn't exist.
Use extreme database normalization - i.e. make a table for images and use foreign keys for anything that references it. However this would result in making a metric craptonne of many-to-many relationships (and the crosstables) in addition to being impractical for any regular user to use.
Backup all the images and delete them, and check every single 404 request and run a script each time that attempts to find the image from the backup folder and puts it in the "real" folder. The problem is that this cache would have to be purged every so often and the server might be strained when rebuilding the cache.
Ideas/suggestions? Is this just something you have to ignore and live with even if you're making a site with a ridiculous amount of images? Even if it's not worth it, how would something work just for proof-of-concept (I added the garbage-collection tag just because this might be going into that area conceptually).
I will admit that my experience with this was simpler than yours. I had no 'user generated content' so to speak, and my images were all in only templates or database with full path. But what I did is create a Perl script that:
Analyzed my HTML templates, database table, and CSS, and generated a list of files:
In the HTML it looked for <img> tags
In the CSS it looked for any .png, .jp*g, or .gif regex strings
The tables were easy because I had an Image table for the image data
The files list was then ordered to remove duplicates
The script iterated through the list and wrote a csv like: filename,(CSS filename|HTML filename|DBTABLE),(exists|notexists) for auditing
In another iteration it renamed all files not in the list by appending .del to the filename
After regression testing I called the script with a -docleanup tag which told it to go through and delete all the .del appended files.
If for whatever reason an image was tagged as .del and shouldn't have been, I just manually renamed it back to its original form.
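The script described above was written in Perl and is not shown. As a rough PHP approximation of its two core steps (collecting referenced file names, then marking everything else with .del), something like the following could work. The function names and regular expressions are assumptions, and the database lookup and CSV audit are left out.

```php
<?php
// Sketch only: a PHP approximation of the Perl script described above; names are assumptions.

/** Collect image file names referenced by <img> tags in HTML and by image paths in CSS. */
function referencedImages(string $html, string $css): array
{
    preg_match_all('/<img\b[^>]*\bsrc\s*=\s*["\']([^"\']+)["\']/i', $html, $h);
    preg_match_all('/[\w.\/-]+\.(?:png|jpe?g|gif)\b/i', $css, $c);
    $names = array_map('basename', array_merge($h[1], $c[0]));
    return array_values(array_unique($names)); // duplicates removed
}

/** Rename every image in $dir that is not in $keep by appending ".del"; returns the renamed names. */
function markUnused(string $dir, array $keep): array
{
    $marked = [];
    foreach (scandir($dir) as $name) {
        if (!preg_match('/\.(?:png|jpe?g|gif)$/i', $name) || in_array($name, $keep, true)) {
            continue; // not an image, or still referenced
        }
        if (rename("$dir/$name", "$dir/$name.del")) {
            $marked[] = $name;
        }
    }
    return $marked;
}
```

Because files are only renamed, a wrongly marked image can be restored by removing the .del suffix, and a later clean-up pass can delete whatever is still marked after testing.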
A couple of notes: I realize that I could have made this script 'smoother' and done multiple things in multiple steps, but its use grew over time and I wanted clearly delineated processing steps so it couldn't ever run amok. I used the CSV to go back and clean up the information where the image didn't exist.
I'm building a site where users can upload images and then "use" them. What I would like is some thoughts and ideas about how to manage temporary uploads.
For example, a user uploads an image but decides not to do anything with it and just leaves the site. I have then either uploaded the file to the server, or loaded it to the server memory, but how do I know when the image can be removed? First, I thought of just having a temporary upload folder which is emptied periodically, but it feels like there must be something better?
BTW I'm using CakePHP and MySQL. Although images are stored on the server, only the location is stored in the DB.
Save the information about the file to MySQL, together with the last time the image was viewed. This can be done via a script that updates the record every time the image is used. Then check the database for images not used for 30 days and delete them.
You could try to define a "session" in some way and give the user some information about it. For example, in SO, there is a popup when you started an answer but try to leave the site (and your answer would be lost). You could do the same and delete the uploaded image if the user proceeds. Of course, you can still use a timeout or some other rules (maximum image folder size etc.).
I'm not sure what "temporary upload" means in your app. The file is either uploaded or not, and under the ownership of a user. If a user doesn't want to do anything at the moment, you have no other choice but to leave the file where it is.
What you can do is put a warning somewhere on your image management page about unused images, but removing them yourself seems like a bad practice (at least from the user perspective).
As a user, when I upload an image to a server (assuming I want to use it later) and leave the site, I don't expect it to be deleted if I am a registered user.
I would prefer it to be there in my account until I come back. I would suggest thinking along those lines and implementing a solution to save the users' images if possible.
Check the last accessed/modified time of the file to see if it has been used.