Best way to store and deliver files on a CMS


First of all, this isn't another question about storing images on DB vs file system. I'm already storing the images on the file system. I'm just struggling to find a better way to show them to all my users.
I'm currently using the following system: I generate a random filename using md5(uniqid()). The problem I'm facing is that the image is then presented with a link like this:
<img src="/_Media/0027a0af636c57b75472b224d432c09c.jpg" />
As you can see, this isn't the prettiest way to show an image (or a file), and the name doesn't say anything about the image.
I've been working with a CMS at work that stores all uploaded files in a single table. To access an image, it uses something like this:
<img src="../getattachment/94173104-e03c-499b-b41c-b25ae05a8ee1/Menu-1/Escritorios.aspx?width=175&height=175" />
As you can see, the path to the image now has a meaning compared to the other one. The problem is that this puts a big strain on the DB. For example, the last site I made has an area with around 60 images; to show those 60 images, I would have to run at least 60 individual queries against the database, besides the other queries that retrieve the rest of the content on the page.
I think you understand my dilemma. Has anyone gone through this problem who can give me some pointers on how to solve it?
Thanks..

You could always use .htaccess to rewrite the url and strip the friendly name. So taking your example, you could display the image source as something like:
/Media/0027a0af636c57b75472b224d432c09c/MyPictures/Venice.jpg
You could use htaccess to actually request:
/Media/0027a0af636c57b75472b224d432c09c.jpg
The other option is to have a totally friendly name:
/Media/MyPictures/Venice.jpg
And redirect it to a PHP file which examines the URL and generates the hash, so that it knows the actual image file name on the server. The PHP script should then set the content type, read the image, and output it. The major downside of this method is that you may end up with collisions, as two images may have the same hash. Given that the same thing can also occur with your current method, I assume it isn't an issue.
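A minimal sketch of the first option, done in PHP instead of .htaccess: it takes a friendly URL that still carries the hash and returns the path of the stored file. The function name resolveMediaPath and the assumption that the hash is always a 32-character lowercase md5 string are mine, not part of the answer above.

```php
<?php
// Hypothetical helper: map a friendly URL path to the stored file path.
// Assumes URLs of the form /Media/<32-char md5 hex>/<any folders>/<name>.<ext>
function resolveMediaPath(string $urlPath): ?string
{
    $pattern = '#^/Media/([0-9a-f]{32})/(?:[^/]+/)*[^/]+\.([A-Za-z0-9]+)$#';
    if (preg_match($pattern, $urlPath, $m) !== 1) {
        return null; // not a friendly media URL
    }
    // Drop the friendly part, keep the hash and the extension.
    return '/Media/' . $m[1] . '.' . strtolower($m[2]);
}
```

With this sketch, /Media/0027a0af636c57b75472b224d432c09c/MyPictures/Venice.jpg resolves to /Media/0027a0af636c57b75472b224d432c09c.jpg, and a URL without a hash segment returns null.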

Related

How to get the newest image (named highest number) from a server to show on a website?

A partner is uploading some images to a server that is accessible by URL without login. The newest image gets the highest number at the end of its name. There are a bunch of different digits at the start of the name, so I need to target the last 4 digits of the name and display that image. It would be nice to automate this once a day so I don't have to do it by hand every day for 2 years.
What I have figured out is that this should be done with PHP? (Maybe JavaScript can do this?) I have looked up a bunch of different things, but it's overwhelming to figure out how to put something together without knowing much basic PHP and JS, although I'm diving into JavaScript at the moment and PHP is highly interesting.
This is for a WordPress website. I normally only do CSS and basic HTML, but I have experience with C++. I have used Fotorama.io and linked images manually from the server by URL. There are also some thumbnail images below with Fotorama. It would be really nice to get the previous images there, but the most important part is the main picture that is shown big.
Edit: Sorry for not being specific enough. The files were on a different server. I asked to get the files on the same server as the website and they now get loaded in to its own folder. When I go to the folder-URL then all the images are listed.
The files are imported with the name of "image19-09-17_10-59-55-57_00923.jpg" as the 923rd picture in the folder.

Your files have a file name format that basically puts the latest file the last in lexicographical order.
So just call glob and pick the last file returned.
$files = glob("/path/*");
$latest = array_pop($files);
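If the date part of the name turns out not to sort in upload order, a variation is to compare the trailing counter directly. This is only a sketch: the function name latestByTrailingNumber is made up, and it assumes every file name ends in an underscore, a number, and an extension, like the example in the question.

```php
<?php
// Pick the file whose name ends in the highest number before the extension,
// e.g. "image19-09-17_10-59-55-57_00923.jpg" has the counter 923.
function latestByTrailingNumber(array $files): ?string
{
    $latest = null;
    $highest = -1;
    foreach ($files as $file) {
        if (preg_match('/_(\d+)\.[A-Za-z0-9]+$/', $file, $m) !== 1) {
            continue; // skip names without a trailing counter
        }
        $n = (int) $m[1];
        if ($n > $highest) {
            $highest = $n;
            $latest = $file;
        }
    }
    return $latest;
}

// Usage (the path is a placeholder):
// $latest = latestByTrailingNumber(glob('/path/*.jpg') ?: []);
```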

For image upload, should I add a field to the MYSQL database to check against, or simply use PHP to check if the image exists?

I have a simple image upload form. When someone uploads an image, it is for a football pool, so there always is a $poolid that goes with the image they upload.
Right now, I am naming the uploaded image using the poolid. So for example, if someone uploads an image, it might get named P0714TYER7EN.png.
All the app will ever do is, when it outputs the football pool's page, it will check to see if an image exists for that pool and if so, it will show it. It checks like this:
if (file_exists("uploads/".$poolid.".png")) { /* code to show it */ }
My first thought when planning this was to add a field called "image" to the table in my MySQL database that holds all the pool information (called pools), and store either the image name (P0714TYER7EN.png) or an empty value if no image was uploaded. Then I would check that field in the database to determine whether an image exists.
But I realized I don't really need to store anything in the database because I can simply use the PHP file_exists check above to know if there is an image or not.
In other words, it would seem redundant to have a field in the database.
Everything works doing it this way (i.e. NOT having a field in the database) but I'm wondering if this is bad practice for any reason?
If anyone feels that I should absolutely still have a field in the database, please share your thoughts. I just want to do it the proper way.
Thank you.

The approach could depend a lot on what exactly you're trying to do. It seems the options you have are:
File System Only
Benefits would be the speed of accessing static image files and being able to use them in your HTML directly, which makes it a simpler solution. Also, if you're comfortable with these functions, it will be faster to finish.
Drawbacks would be that you're limited to using file_exists and similar. Any code that manages files this way has to be very specific and static. You also cannot search or perform operations on the files efficiently. In general, relying on the file system alone is not a best practice, in my experience.
Database Only
Benefits, you can use Blob type as a column with meta data like owner, uploader, timestamp, etc. in the same row. This makes checking for existing files faster as well as any searching or other operations fast and efficient.
Drawbacks, you can't serve files statically using a CDN or even a cookie-less subdomain or other strategies for page performance. You also have to use PHP and MySQL to generate then serve any images via code rather than just referring to the image file directly.
Hybrid
Benefits, basically the same benefits as both above. You can keep your metadata in MySQL, along with an MD5 hash and the location of the file. Your PHP then renders the page with a direct link to the file rather than converting a Blob into an image. You could use this in conjunction with a CDN by prefixing or storing the CDN location as well.
Drawbacks, if you manually changed the names of files on the server, you'd have to rely on a function that matches hashes to detect this, though the same applies to a File System Only approach that needs to detect duplicate files.
TL;DR: the Hybrid approach is what you'll see most software use, like WordPress and others, and I believe it would be considered a best practice, while file system only is a bit of a hack.
Note: Database only could be a best approach in specific situations where you want database clustering and replication of images directly in your database rather than to a file system (especially if the file system is restricted access or unable to be modified for any reason, then you have full flexibility on the DB).

You can also use the blob datatypes from MySQL. There you can save the image as binary data next to the data about the football pool.
So when you want to load a football pool, you simply fire an SQL statement and check whether it returns a result. If so, load the image from the database and display the data; otherwise, throw an error.
If you have very frequent access, you can put the images into a separate table and load the image independently of the data about the football pool. Additionally, set some cache headers on the image and serve it from a separate script; this way you only need to save the primary key of the image in the football table. When you want to display the web page, you load another document and pass it the primary key of the image. The image is loaded there, or, if the browser has it cached, it is loaded from the cache without querying the database.
This way you also have a better consistency of data and images.

You're uploading an image to a specific folder, named with the poolid, which will be unique. It should work just fine.
Problem:
The code you have written works great. But the problem is that if the first uploaded image is a .png and a later upload is a .jpeg or .jpg, file_exists won't find it, and hence the check may fail.
Caution:
If you have already made sure that the uploaded image must be a png, then file_exists will work great.
Alternate Solution:
If you're not checking that the image type is .png, then I highly advise you to add a boolean column to your table, named is_image_uploaded or something similar, which is set every time you upload a file.
This way, the next time you want to upload an image, you can check directly in your database table whether the is_image_uploaded column is set. If it is not set, upload; otherwise ignore it or do whatever you want.
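If you stay with the file-system check, one way around the extension problem is to probe a short list of allowed extensions. This is a rough sketch: findPoolImage and the extension list are assumptions, and it presumes $poolid has already been validated (for example, letters and digits only).

```php
<?php
// Return the path of the pool's image, whatever its extension, or null.
// Assumes $poolid was validated before this call.
function findPoolImage(string $dir, string $poolid, array $extensions = ['png', 'jpg', 'jpeg']): ?string
{
    foreach ($extensions as $ext) {
        $path = $dir . '/' . $poolid . '.' . $ext;
        if (is_file($path)) {
            return $path;
        }
    }
    return null; // no image uploaded for this pool
}
```

The page code can then call findPoolImage("uploads", $poolid) and show the image only when the result is not null.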

What is the best way to keep track of images uploaded by an user using PHP

I am planning to build a photo album website, so each user may upload any number of images. What is the best way to keep track of the images for an individual user? What should the server configuration be to handle this part?
-Lokesh

Depending on the amount of images, you will probably want to store them on a static domain. Then, have a table in whatever database you are using to store the paths to each of the images for each user.

Well like many design topics there are lots of different ways to go about it. Two ways that come to mind right now are as follows.
You could simply have a directory created on the server for each user and then save the images each user uploads into that directory. Of course, you'd want to make sure they didn't overwrite any existing images with images of the same name. You could do this by warning them about conflicting names or by adding some sort of nonce string (like a timestamp) to the end of the file name. This is a pretty straightforward solution and means that you can log in to your server and see all the images each user has uploaded, right there for you to do anything you like with.
Another idea would be to save the images in a database. This can be done by serializing the images to a string and storing it in the database. This is nice because it means you don't have to worry about handling directories and duplicate file names. You will have to deserialize each image when you want to display it, which puts load on your DB, so for a very high-traffic site this might not really be the way to go.
There are of course combinations of these ideas, and many others. It really comes down to working out which solution best fits your exact needs.
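The per-user directory idea could look roughly like the sketch below. The function name uniqueUploadPath, the numeric user id, and the exact suffix format are all assumptions made for illustration; the point is only that a timestamp suffix is added when the name is already taken.

```php
<?php
// Build a target path inside the user's own directory, adding a
// timestamp-based suffix when a file of the same name already exists.
function uniqueUploadPath(string $baseDir, int $userId, string $originalName): string
{
    $userDir = $baseDir . '/' . $userId;
    if (!is_dir($userDir)) {
        mkdir($userDir, 0755, true);
    }
    // basename() drops any directory part a client may have sent.
    $info = pathinfo(basename($originalName));
    $stem = $info['filename'];
    $ext = isset($info['extension']) ? '.' . $info['extension'] : '';

    $candidate = $userDir . '/' . $stem . $ext;
    $i = 0;
    while (file_exists($candidate)) {
        $i++;
        $candidate = $userDir . '/' . $stem . '_' . time() . '_' . $i . $ext;
    }
    return $candidate;
}
```

The returned path would then be passed to move_uploaded_file() as the destination.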

Linking an image to a PHP file

Here's a bit of history first: Recently finished an application that allows me to upload images and store them in a directory, it also stores the information of that file in a database. Database stores the location, name and gives it an ID (auto_increment).
Okay, so what I'm doing now is allowing people to insert images into posts. Throwing a few ideas around on the best way to do this, as the application I designed allows people to move files around, and I don't want images in posts to break if an image is moved to a different directory (hence the storing of IDs).
What I'm thinking of doing is when linking to images, instead of linking to the file directly, I link it like so:
<img src="/path/to/functions.php?method=media&id=<IMG_ID_HERE>" alt="" />
So it takes the ID, searches the database, then from there determines the mime type and what not, then spits out the image.
So really, my question is: Is this the most efficient way?
Note that on a single page there could be from 3 to 30 images, all making a call to this function.

Doing that should be fine as long as you are aware of your memory limitations configured by both PHP and the web server. (Though you'll run into those problems merely by receiving the file first)
Otherwise, if you're strict about this being just for images, it could prove more efficient to go with Mike B's approach. Design a static area and just drop the images off in there, and record those locations in the records for their associated post. It's less work, and less to worry about... and I'm willing to bet your web server is better at serving files than most developers' custom application code will be.

Normally, I would recommend keeping the src of an image static (instead of a PHP script). But if you're allowing users to move them around the filesystem, you need a way to track them.
Some form of caching would help reduce the number of database calls required to fetch the filesystem location of each image. Should be pretty easy to put an indefinite TTL on the cache and invalidate upon the image being moved.
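As a rough illustration of that caching idea, the class below remembers id-to-path lookups and forgets an entry when the image is moved. ImagePathCache and the loader callback are hypothetical names. The array only lives for one request, so a real setup would likely put a shared store such as APCu or Memcached behind the same two methods.

```php
<?php
// Cache of image id => filesystem path with no expiry; entries are only
// dropped when invalidate() is called (for example after a file is moved).
class ImagePathCache
{
    private $paths = [];
    private $loader;

    // $loader is any callable that fetches the path for an id,
    // typically a database query.
    public function __construct(callable $loader)
    {
        $this->loader = $loader;
    }

    public function get(int $id): ?string
    {
        if (!array_key_exists($id, $this->paths)) {
            $this->paths[$id] = ($this->loader)($id);
        }
        return $this->paths[$id];
    }

    public function invalidate(int $id): void
    {
        unset($this->paths[$id]);
    }
}
```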

I don't think you should worry about that, what you have planned sounds fine.
But if you want to go out of your way to minimise requests or whatever, you could instead do the following: when someone embeds an image in a post, replace the image tag with some special character sequence, like [MYIMAGE=1234] or something. Then when a page with one or more posts is viewed, search through all the posts to find all the [MYIMAGE=] sequences, query the database to get all of the images' locations, and then output the posts with the [MYIMAGE=] sequences replaced with the appropriate image tags. You might or might not want to make sure users cannot directly add [MYIMAGE=] tags to their submitted content.

The way you have suggested will work, and it's arguably the nicest solution, but I should warn you that I've tried something similar before and it completely fell apart under load. The database seemed to be keeping up, but the script would start to time out and the image wouldn't arrive. That was probably down to some particular server configuration, but it's worth bearing in mind.
Depending on how much access you have to the server it's running on, you could just create a symlink whenever the user moves a file. It's a little messy but it'll be fast and reliable, and will also handle collisions if a user moves a file to where another one used to be.

Use the format proposed by Hammerite, and use [MYIMAGE=1234] tags (or something similar).
You can then fetch the id-path mappings before display, and replace the [MYIMAGE] tags with proper tags which link to images directly. This will yield much better performance than outputting images using php.
You could even bypass the database completely, and simply use image paths like (for example) /images/hash(IMAGEID).jpg.
(If there are different file formats, use [MYIMAGE=1234.png], so you can append png/jpg/whatever without a database call)
If the need arises to change the image locations, output method, or anything else, you only need to change the method where [MYIMAGE] tags are converted to full file paths.
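The tag replacement could be sketched like this. renderImageTags and the $lookupPaths callback are made-up names; the callback stands in for a single database query that returns an array of id => path for all requested ids.

```php
<?php
// Replace every [MYIMAGE=<id>] tag with an <img> tag that links to the
// file directly, looking up all ids in one batch.
function renderImageTags(string $html, callable $lookupPaths): string
{
    $pattern = '/\[MYIMAGE=(\d+)\]/';
    if (!preg_match_all($pattern, $html, $matches)) {
        return $html; // no tags, so no lookup needed
    }
    $ids = array_values(array_unique(array_map('intval', $matches[1])));
    $paths = $lookupPaths($ids); // one call for the whole page

    return preg_replace_callback($pattern, function ($m) use ($paths) {
        $id = (int) $m[1];
        if (!isset($paths[$id])) {
            return ''; // unknown image: drop the tag
        }
        return '<img src="' . htmlspecialchars($paths[$id], ENT_QUOTES) . '" alt="" />';
    }, $html);
}
```

Because the conversion lives in one function, changing image locations or the output format later only touches this spot.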

Keeping track of links or references to image files and deleting unused ones (PHP/Database)

I need a way to remove "unused" images from my filesystem, i.e. images that are never accessed from any point in my website (it doesn't matter if I break external links; I might disable external hotlinking altogether). What's the best way of going about this? Regular users can add multiple attachments to topics/posts, and content contributors can bulk upload large numbers of images which can be used in articles or image galleries.
The problem is that the images could be referenced in any of the following ways:
From user content (text/html, possibly Markdown or BBCode) stored in the database
Hardcoded into an HTML page
Hardcoded into a PHP file
Hardcoded into a CSS file
As an "attachment" field in a database table, usually containing only the filename itself with no path, because the application assumes that it would be in a certain folder.
And to top it off, the path of the image could be an absolute or relative HTTP or PHP path and may or may not be built with string concatenation in PHP.
So obviously find/replace or regexing the database or filesystem is out of the question. But luckily for you and me, this system isn't fully implemented yet and I don't need anything that deals with an existing hoard of images. I just need to set up some efficient structure that will allow this in the future.
Some ideas I've thought of:
Intercepting the HTTP request for the image with PHP, and keeping track of the HTTP_REFERER. The problem with this is that just because no one has clicked on a link at the time of checking this doesn't mean the link doesn't exist.
Use extreme database normalization - i.e. make a table for images and use foreign keys for anything that references it. However this would result in making a metric craptonne of many-to-many relationships (and the crosstables) in addition to being impractical for any regular user to use.
Backup all the images and delete them, and check every single 404 request and run a script each time that attempts to find the image from the backup folder and puts it in the "real" folder. The problem is that this cache would have to be purged every so often and the server might be strained when rebuilding the cache.
Ideas/suggestions? Is this just something you have to ignore and live with even if you're making a site with a ridiculous amount of images? Even if it's not worth it, how would something work just for proof-of-concept (I added the garbage-collection tag just because this might be going into that area conceptually).

I will admit that my experience with this was simpler than yours. I had no 'user generated content' so to speak, and my images were all either in templates or in the database with a full path. But what I did was create a Perl script that:
Analyzed my HTML templates, database table, and CSS, and generated a list of files.
In the HTML it looked for <img> tags.
In the CSS it looked for any .png, .jp*g, or .gif regex strings.
The tables were easy because I had an Image table for the image data.
The files list was then ordered to remove duplicates.
The script iterated through the list and wrote a CSV like: filename,(CSS filename|HTML filename|DBTABLE),(exists|notexists) for auditing.
In another iteration it renamed all files not in the list by appending .del to the filename.
After regression testing I called the script with a -docleanup flag, which told it to go through and delete all the .del appended files.
If for whatever reason an image was tagged as .del and shouldn't have been, I just manually renamed it back to its original form.
A couple of notes: I realize that I could have made this script 'smoother' and done multiple things in multiple steps, but its use grew over time and I wanted clearly delineated processing steps so it couldn't ever run amok. I used the CSV to go back and clean up the information where the image didn't exist.
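The two-phase cleanup (mark with .del first, delete later) could be sketched as follows. This is written in PHP rather than Perl and is not the original script. The function names are invented, and it assumes all images sit in one flat directory and that the list of referenced file names has already been collected.

```php
<?php
// Phase 1: rename every file that is not in the referenced list by
// appending ".del". Returns the names that were marked.
function markUnreferenced(string $dir, array $referenced): array
{
    $keep = array_flip($referenced);
    $marked = [];
    foreach (scandir($dir) as $name) {
        $path = $dir . '/' . $name;
        if (!is_file($path) || substr($name, -4) === '.del' || isset($keep[$name])) {
            continue; // directories, already-marked files, and files in use
        }
        rename($path, $path . '.del');
        $marked[] = $name;
    }
    return $marked;
}

// Phase 2 (run only after testing): delete everything marked ".del".
// Returns the number of files removed.
function deleteMarked(string $dir): int
{
    $count = 0;
    foreach (glob($dir . '/*.del') ?: [] as $path) {
        if (unlink($path)) {
            $count++;
        }
    }
    return $count;
}
```

Keeping the rename and the delete as separate steps leaves room to restore a wrongly marked file by renaming it back.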
