Random URL like megaupload - php

I need to sell pictures. I need to create a Megaupload-like system that generates random URLs, like this: "http://download.server.com/7fdfug87g89f7g98fd7g/image.jpg", associated with the session and IP address.
I'm using PHP, Apache or Nginx.
How can I achieve this?
Any Ideas?

Use mod_rewrite in the .htaccess file to redirect requests matching some patterns you define to a php file, 'index.php' perhaps.
This way you can pass the requested string as a URL parameter to the page. And then in the script you can use the parameter to find and return the related image.
It's called 'URL rewriting', and it is how sites with meaningful URLs work, such as the URLs of stackoverflow.
For uniqueness, rather than bare hash codes, you'll probably need to keep a DB that maps the codes to files. The codes can then be totally random, of any length you wish, and never collide: during assignment you create a new random one if the one you just generated collides with one already in the DB. You can also store the IP and session info in clear in the DB record. This also removes the need for heavy hashing calculations.
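A minimal sketch of the "generate, check, retry" idea described above. The function name `generate_unique_code`, the `$exists` callback, and the in-memory `$records` array are assumptions for illustration; in a real system the callback would query the mapping table.

```php
<?php
// Hypothetical helper: $exists is any callable that reports whether a code
// is already taken (for example, a SELECT against the mapping table).
function generate_unique_code(callable $exists, int $bytes = 10): string
{
    do {
        // random_bytes() is a CSPRNG; bin2hex() yields 2 hex chars per byte.
        $code = bin2hex(random_bytes($bytes));
    } while ($exists($code)); // retry on the rare collision

    return $code;
}

// Usage with an in-memory stand-in for the database table.
$records = [];
$code = generate_unique_code(fn(string $c): bool => isset($records[$c]));
$records[$code] = [
    'file'    => 'image.jpg',
    'session' => 'example-session-id', // would be session_id() in a real request
    'ip'      => '203.0.113.5',        // would be $_SERVER['REMOTE_ADDR']
];

echo "http://download.server.com/{$code}/image.jpg\n";
```

On each download request, the script behind the rewrite rule would look up the code and compare the stored session and IP with the current ones before returning the image.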

Something like md5 would be within reason.
$my_seed = "something random here";
$path = md5($my_seed . $_SESSION['something'] . $_SERVER['REMOTE_ADDR']);
echo "http://download.server.com/" . $path . "/" . $file;
That should give you a fairly unique path to put files in, one that will rarely collide. You should still check whether the hashed path already exists, though.

Use one of the popular hash functions out there, such as MD5. PHP has this built in as md5().

I usually use a hashing function like sha1 or md5 to generate a pseudorandom string of hexadecimal digits based on the current time plus some other bit of data unique to whatever the URL is about.

Related

php how to link a file from file server to that info from database

I'm new to PHP, and I'm trying to upload a file to a file server and the file's information to a MySQL database. I have done the file server and database upload parts, but I need to retrieve the info for a specific file from my file server folder when I click that file, and I'm trying to work out that logic. Please help me if there is a solid solution for this. (Correct me if I'm wrong: my idea was to store the file path in the database along with the info. Will this give me a solution? But the filename can be duplicated.)
I figured I would write a short (for me this is short) "answer" just so I could summarize my points.
Some "Best Practices" when creating a file storage system. File storage is a broad category, so your mileage may vary for some of these. Take them as suggestions of what I have found works well.
Filenames
Don't store the file under the name given to it by an end user. They can and will use all kinds of awkward characters that will make your life miserable. Some can be as bad as ' single quotes, which on Linux make the file painful to read or even delete from the shell unless you quote the name carefully. Some things can seem simple, like a space, but depending on where you use it and the OS on your server you could wind up with one%20two.txt or one+two.txt or one two.txt, which may or may not create all kinds of issues in your links.
The best thing to do is create a hash, something like sha1. The input can be as simple as {user_id}{original_name}. The user ID makes collisions with other users' filenames less likely.
I prefer doing hash_file('sha1', $path) on the uploaded file; that way, if someone uploads the same file more than once you can catch it (if the contents are the same, the hash is the same). But if you expect to have large files you may want to do some benchmarking to see how it performs. I mostly handle small files, so it works fine for that.
Regardless of what you do, I would prefix it with a timestamp: time().'-'.$filename. This is useful information to have, because it's the absolute time the file was created.
-note- with the timestamp, a duplicate file can still be saved because the full name is different, but the duplicate is easy to spot and can be verified in the database.
As for the name a user gives the file, just store that in the database record. This way you can show them the name they expect, but use a name you know is always safe for links.
$filename = 'some crapy^ fileane.jpg';
$ext = strrchr($filename, '.');
echo "\nExt: {$ext}\n";
$hash = sha1('some crapy^ fileane.jpg');
echo "Hash: {$hash}\n";
$time = time();
echo "Timestamp: {$time}\n";
$hashname = $time.'-'.$hash.$ext;
echo "Hashname: $hashname\n";
Outputs
Ext: .jpg
Hash: bb9d2c2c7c73bb8248537a701870e35742b41c02
Timestamp: 1511853063
Hashname: 1511853063-bb9d2c2c7c73bb8248537a701870e35742b41c02.jpg
Paths
Never store the full path to the file. All you need in the database is the hash from creating the hashed name. The "root" path to the folder the file is stored in should be set in PHP. This has several benefits.
Prevents directory traversal. Because you're not passing any part of the path around, you don't have to worry as much about someone slipping a ../.. (or \..\..) in there and going places they shouldn't. An example of this would be someone overwriting a .htpasswd file by uploading a file with that name and a directory traversal in it.
https://en.wikipedia.org/wiki/Directory_traversal_attack
Has more uniform-looking links: uniform size, uniform set of characters.
Maintenance. Paths change. Servers change. Demands on your system change. If you need to relocate those files, but you stored the absolute full path to them in the DB, you're stuck gluing everything together with symlinks or updating all your records.
There are some exceptions to this. If you want to store them in a monthly folder or by username, you could save that part of the path in a separate field. But even in that case, you could build it dynamically based on data saved in the record. I have found it's best to save as little path info as possible, and then make a config value or a constant you can use in all the places you need the path to the file.
Also, the path and the link are very different, so by saving only the name you can link it from whatever PHP page you want without having to subtract data from the path. I've always found it easier to add to the filename than to subtract from a path.
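A small sketch of keeping the root in one place and building the on-disk path from the stored hashname only. The constant `UPLOAD_ROOT`, its value, and the function `upload_path` are assumptions for illustration; the pattern checked is the {timestamp}-{hash}.{ext} shape used in this answer.

```php
<?php
// Assumed config value; the name and location are illustrative only.
const UPLOAD_ROOT = '/var/www/private/uploads';

// Build the on-disk path from the stored hashname and nothing else.
function upload_path(string $hashname): string
{
    // Accept only "{timestamp}-{sha1}" with an optional simple extension,
    // so slashes and ".." can never reach the filesystem call.
    if (!preg_match('/\A\d+-[0-9a-f]{40}(\.[A-Za-z0-9]+)?\z/', $hashname)) {
        throw new InvalidArgumentException('Unexpected hashname format');
    }

    return UPLOAD_ROOT . '/' . $hashname;
}

echo upload_path('1511853063-bb9d2c2c7c73bb8248537a701870e35742b41c02.jpg'), "\n";

try {
    upload_path('../../.htpasswd');
} catch (InvalidArgumentException $e) {
    echo 'Rejected: ', $e->getMessage(), "\n";
}
```

If the files ever move, only the constant changes; the database records stay as they are.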
Database (just some suggestions, your use may vary)
As always with data, ask yourself: who, what, where, when.
id - int, primary key, auto increment
user_id - int, foreign key: who uploaded it
hash - char(40), sha1, unique: what the file is
hashname - varchar, {timestamp}-{hash}.{ext}: where the file lives, i.e. its name on the hard drive
filename - varchar: the original name given by the user, so we can show them the name they expect (if that is important)
status - enum(public, private, deleted, pending, etc.): status of the file. Depending on your use case, you may have to review the files, or maybe some are private so only the user can see them, maybe some are public, etc.
status_date - timestamp|datetime: when the status was changed
create_date - timestamp|datetime: when the file was created. A timestamp is preferred as it makes some things easier, but in that case it should be the same timestamp used in the hashname.
type - varchar: mime type, can be useful for setting the mime type when downloading etc.
If you expect different users to upload the same file and you use hash_file, you can make the unique index a combined index on user_id and hash; this way it would only conflict if the same user uploaded the same file. You could also do it based on the timestamp and hash, depending on your needs.
That's the basic stuff I could think of. This isn't absolute, just some fields I thought would be useful.
It's useful to have the hash by itself. If you store it by itself you can use CHAR(40) for sha1 (a fixed-width column avoids VARCHAR's length prefix) and set the collation to utf8_bin, which is binary. This makes comparisons on it exact and case sensitive. PHP's sha1() always returns lowercase hexadecimal (0-9, a-f), so this is mostly a safeguard against a differently-cased value being treated as a match.
You can always build the hashname on the fly if you store the extension and the timestamp separately. If you find yourself rebuilding it time and time again, you may just want to store it in the DB to simplify the work in PHP.
I like just putting the hash in the link, no extension no anything so my links look like this.
http://www.example.com/download/ad87109bfff0765f4dd8cf4943b04d16a4070fea
Real simple, real generic, safe in urls always the same size etc..
The hashname for this "file" would be like this
1511848005-ad87109bfff0765f4dd8cf4943b04d16a4070fea.jpg
If you do have conflicts with the same file and different users (which I mentioned above), you can always add the timestamp part into the link, the user_id, or both. If you use the user_id, it might be useful to left pad it with zeros. For example, some users may have ID:1 and some may have ID:234, so you could left pad it to 4 places and make them 0001 and 0234. Then add that to the hash, which is almost unnoticeable:
1511848005-ad87109bfff0765f4dd8cf4943b04d16a4070fea0234.jpg
The important thing here is that because sha1 is always 40 and the id is always 4 we can separate the two accurately and easily. And this way, you can still look it up uniquely. There are a lot of different options but so much depends on your needs.
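The fixed-width split described above can be sketched like this. The function names `make_token` and `split_token` are made up for the example, and the 4-digit width assumes user IDs never exceed 9999; a larger ID would need a wider pad.

```php
<?php
// Append a zero-padded user ID to the 40-character sha1 hash.
function make_token(string $hash, int $userId): string
{
    return $hash . str_pad((string) $userId, 4, '0', STR_PAD_LEFT);
}

// Split the token back into its two fixed-width parts.
function split_token(string $token): array
{
    return [
        'hash'    => substr($token, 0, 40),       // sha1 is always 40 hex chars
        'user_id' => (int) substr($token, 40, 4), // the padded ID that follows
    ];
}

$token = make_token('ad87109bfff0765f4dd8cf4943b04d16a4070fea', 234);
echo $token, "\n";
print_r(split_token($token));
```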
Access
Such as downloading. You should always output the file with PHP; don't give users direct access to the file. The best way is to store the files outside of the webroot (above the public_html or www folder). Then in PHP you can set the headers to the correct type and basically read out the file. This works for pretty much everything except video. I don't handle videos, so that's a topic outside of my experience. But I find it best to think of all file data as text; it's the headers that make that text into an image, or an Excel file, or a PDF.
The big advantage of not giving them direct access to the file is that if you have a membership site, or don't want your content accessible without a login, you can easily check in PHP whether they are logged in before giving them the content. And, as the file is outside the webroot, they can't access it any other way.
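A rough sketch of "set the headers and read out the file". The function names `download_headers` and `send_file` are hypothetical, the `$row` array is assumed to hold the `filename` and `type` fields from the table suggested earlier, and the login check is left to the caller.

```php
<?php
// Build the response headers from a DB row (assumed keys: filename, type).
function download_headers(array $row, int $size): array
{
    // Strip characters that could break out of the quoted filename.
    $name = str_replace(['"', '\\', "\r", "\n"], '', $row['filename']);

    return [
        'Content-Type: ' . $row['type'],
        'Content-Length: ' . $size,
        'Content-Disposition: attachment; filename="' . $name . '"',
    ];
}

// Send the file; $path is built server-side and points outside the webroot.
function send_file(array $row, string $path): void
{
    if (!is_file($path)) {
        http_response_code(404);
        return;
    }
    foreach (download_headers($row, filesize($path)) as $header) {
        header($header);
    }
    readfile($path); // streams the file contents to the client
}
```

A download page would check the session first, look up the row by hash, and only then call `send_file()`.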
The most important thing is to pick something consistent, that is still flexible enough to handle all your needs.
I'm sure I can come up with more, but if you have any suggestions, feel free to comment.
BASIC PROCESS FLOW
User submits form (enctype="multipart/form-data")
https://www.w3schools.com/tags/att_form_enctype.asp
Server receives the POST from the form in the superglobals $_POST and $_FILES
http://php.net/manual/en/reserved.variables.files.php
$_FILES = [
    'fieldname' => [
        'name' => "MyFile.txt", // (comes from the browser, so treat as tainted)
        'type' => "text/plain", // (sent by the browser and not verified by PHP, so treat as tainted)
        'tmp_name' => "/tmp/php/php1h4j1o", // (could be anywhere on your system, depending on your config settings, but the user has no control, so this isn't tainted)
        'error' => 0, // UPLOAD_ERR_OK (= 0)
        'size' => 123, // (the size in bytes)
    ],
];
Check for errors if(!$_FILES['fieldname']['error'])
Sanitize display name $filename = htmlentities($_FILES['fieldname']['name'], ENT_NOQUOTES, "UTF-8");
Save file, create DB record ( PSEUDO-CODE )
Like this:
$path = __DIR__.'/uploads/'; // for example
$time = time();
$hash = hash_file('sha1', $_FILES['fieldname']['tmp_name']);
$type = $_FILES['fieldname']['type']; // tainted, sent by the browser
$hashname = $time.'-'.$hash.strrchr($_FILES['fieldname']['name'], '.'); // the extension is user input too
$status = 'pending';
if (!move_uploaded_file($_FILES['fieldname']['tmp_name'], $path.$hashname)) {
    // failed
    // do something for errors.
    die();
}
// store record in db
http://php.net/manual/en/function.move-uploaded-file.php
Create link (varies based on routing). The simple way is to make your link like this: http://www.example.com/download?file={$hash}, but it's uglier than http://www.example.com/download/{$hash}
User clicks the link and goes to the download page.
Get the input and look up the record
$hash = $_GET['file'];
$stmt = $PDO->prepare("SELECT * FROM attachments WHERE hash = :hash LIMIT 1");
$stmt->execute([":hash" => $hash]);
$row = $stmt->fetch(PDO::FETCH_ASSOC);
print_r($row);
http://php.net/manual/en/intro.pdo.php
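Before running the lookup it may be worth rejecting anything that does not look like a sha1 hash. A minimal sketch; the helper name `valid_hash` is an assumption, and the prepared statement already protects against SQL injection, so this is only an early sanity check.

```php
<?php
// Return the input if it looks like a sha1 hex digest, otherwise null.
function valid_hash($input): ?string
{
    if (is_string($input) && preg_match('/\A[0-9a-f]{40}\z/', $input)) {
        return $input;
    }

    return null;
}

// In the download page this would be valid_hash($_GET['file'] ?? null).
var_dump(valid_hash('ad87109bfff0765f4dd8cf4943b04d16a4070fea'));
var_dump(valid_hash('../etc/passwd'));
```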
Etc....
Cheers!

PHP encode function for file name

I'm learning about the function urlencode. Is it possible to use this on a file name? So - when you upload a file to your server and then use that file name later, you would be able to use it in a url?
$promotionpicture=$_FILES["promotionpicture"]["name"];
$promotionpicture=rawurlencode($promotionpicture);
Then later...
$imagesource="http://mysite.com/".$userID."/".$promotionpicture;
I'm trying to do this, but every time I navigate to the picture, I get a "Bad request" from my server. Is there a specific PHP encode function I should use? Or is this wrong altogether? Thanks in advance for your help.
urlencode and similar functions are for making an HTTP friendly URL. You would want to keep the normal filename and then when printing the img src, use urlencode.
Note that this is not really the preferred way to do it as you can run into duplicate filenames and misc security issues. It's better to generate a filename for it using a uuid or timestamp or something, that way you can bypass those types of issues.
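To illustrate encoding at output time only: a sketch in which the stored name is left untouched and just the filename segment of the URL is encoded. The helper name `image_url` and the sample values are assumptions; rawurlencode() is used here (as in the question) because it encodes spaces as %20, which suits a path segment.

```php
<?php
// Build the image URL, encoding only the filename segment.
function image_url(string $base, int $userId, string $filename): string
{
    return rtrim($base, '/') . '/' . $userId . '/' . rawurlencode($filename);
}

$src = image_url('http://mysite.com', 42, 'summer sale #1.jpg');

// Escape for HTML separately when printing the attribute.
echo '<img src="' . htmlspecialchars($src, ENT_QUOTES, 'UTF-8') . '">', "\n";
```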
Pictures are really just raw data, like any other file. It is possible to do something like what you're doing, but not necessarily advisable.
If you want to do something like that, I recommend instead doing something to strip special characters.
$newfilename=preg_replace('/[^a-zA-Z0-9.]/','',$filename);
(from Regex to match all characters except letters and numbers)
That said, keep in mind what others have said. How will you handle file name collisions? Where will the images be stored and how?
One easy way to do this much more robustly is to store in a database the original file name and the MD5 hash. Save the file by its hash instead of by name, and write a script that retrieves the file by matching the original name to the MD5 using the database. If you store the file type, you can issue correct headers and when the user downloads the file or uses it to embed in a web page, it will retain its original name, or display as expected respectively.

URL GET variable has a necessary hash symbol

I am creating a url link and one of the GET variables has a hash symbol in it. The webpage will not read any data after the hash mark. I cannot take it out for two reasons.
The website database (not designed by me in any way) has hash symbols for various items of data. I have no authorization to edit the database. And I'm sure if I did other things would break.
I cannot edit the webpage of the url. It was designed by someone else and again I don't have any authorization to edit it.
The url looks something like this
www.example.com?datapoint1=abc&datapoint2=#def
where the #def is necessary as the webpage will search the database for this exact string. If I could edit the webpage php I could put the hash in when necessary, but as I said, I don't.
To explain a little further: the user collects data (in a Java app), and the data is put into a long URL (like the above example but more complicated) and is automatically emailed to a specific user with this link. The second user clicks on the link and does whatever he/she has to do.
I think the only way is to edit the php or javascript of the webpage. Any ideas would be appreciated.
You'll have to encode the # as %23, so your URL would look like this:
www.example.com?datapoint1=abc&datapoint2=%23def
To make it easier, you could use PHP's built-in urlencode function: http://php.net/urlencode
You need to escape the hash in the URL if you don't want it to become the fragment part. The URL-encoded form of # is %23.
You can use urlencode() (php.net doc) in PHP to escape values.
You might also like to know about http_build_query(), which can generate the URL query and encode the values properly from a key-value array. Check out the php.net examples for more information.
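As a short sketch, here is the example URL from the question built with PHP's http_build_query(); the host and parameter names are the ones from the question, and the separator is passed explicitly so the result does not depend on ini settings.

```php
<?php
// Key-value pairs; the '#' in the value is encoded automatically.
$params = ['datapoint1' => 'abc', 'datapoint2' => '#def'];

$query = http_build_query($params, '', '&');
$url   = 'http://www.example.com/?' . $query;

echo $url, "\n";
```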
If you can't access the PHP but can use JS (which is sub-optimal), you could add a small script that rewrites the URL when it sees that a hash is present (this will only work if a hash is never present otherwise)
if (window.location.hash) {
    // Hash detected, let's rebuild the URL.
    // href already ends with the "#..." part, so cut it off before re-adding it encoded.
    window.location.href = window.location.href.split('#')[0] + '%23' + window.location.hash.slice(1);
}

md5 hash for urls in unique Index

I asked this before, slightly differently from the current question, but did not get the answer I was looking for.
My question is: do I need to store md5($url) in a unique index in MySQL? I have seen this in some code, though I don't remember where. This is a large database with more than 5 million URLs, and the indexing is done by calling URLs.
Any ideas?
I don't think you should hash your URLs. The only plausible reason would be to save space (if most of the URLs are larger than 32 chars) at the expense of increased risk of collisions.
What you should do is normalize the URLs.
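What "normalize" means depends on the site, so the following is only one possible sketch: it lowercases the scheme and host, drops default ports and the fragment, and gives an empty path a "/". The function name `normalize_url` is an assumption, and user/password parts are ignored for brevity.

```php
<?php
// One possible normalization; the exact rules are a per-site decision.
function normalize_url(string $url): string
{
    $parts = parse_url(trim($url));
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return trim($url); // leave anything we cannot parse untouched
    }

    $scheme = strtolower($parts['scheme']);
    $host   = strtolower($parts['host']);

    // Keep the port only when it is not the default for the scheme.
    $defaultPorts = ['http' => 80, 'https' => 443];
    $port = '';
    if (isset($parts['port']) && ($defaultPorts[$scheme] ?? null) !== $parts['port']) {
        $port = ':' . $parts['port'];
    }

    $path  = $parts['path'] ?? '/';
    $query = isset($parts['query']) ? '?' . $parts['query'] : '';

    // The fragment is dropped on purpose: it is never sent to the server.
    return $scheme . '://' . $host . $port . $path . $query;
}

echo normalize_url('HTTP://Example.COM:80/a?b=1#frag'), "\n";
```

With the URLs stored in one canonical form, a unique index (on the URL itself or on a hash of it) catches duplicates that differ only in case or default port.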
Some sites use hashing for URLs in the database because they use hashes in URLs, say to redirect users to an external URL. I can't see any reason to do this if that is not the case.
Are you saying that the URL is called as such:
www.yourdomain.com?id=89ce9250e9f469c9d1816e1cc0fb47a1
and then the id (89ce9250e9f469c9d1816e1cc0fb47a1 which is an md5() of the real url querystring) is looked up from the database to resolve the actual url which could be:
www.yourdomain.com?user=23&location=5&eventtype=23&year=2010
Is this the kind of usage you're referring to??
jim

URL shortening: using inode as short name?

The site I am working on wants to generate its own shortened URLs rather than rely on a third party like tinyurl or bit.ly.
Obviously I could keep a running count of new URLs as they are added to the site and use that to generate the short URLs. But I am trying to avoid that if possible since it seems like a lot of work just to make this one thing work.
As the things that need short URLs are all real physical files on the webserver my current solution is to use their inode numbers as those are already generated for me ready to use and guaranteed to be unique.
function short_name($file) {
    $ino = @fileinode($file);
    $s = base_convert($ino, 10, 36);
    return $s;
}
This seems to work. Question is, what can I do to make the short URL even shorter?
On the system where this is being used, the inodes for newly added files are in a range that makes the function above return a string 7 characters long.
Can I safely throw away some (half?) of the bits of the inode? And if so, should it be the high bits or the low bits?
I thought of using the crc32 of the filename, but that actually makes my short names longer than using the inode.
Would something like this have any risk of collisions? I've been able to get down to single digits by picking the right value of "$referencefile".
function short_name($file, $referencefile) {
    $ino = @fileinode($file);
    // arbitrarily selected pre-existing file,
    // as all newer files will have higher inodes
    $ino = $ino - @fileinode($referencefile);
    $s = base_convert($ino, 10, 36);
    return $s;
}
Not sure this is a good idea: if you have to change servers, or change or reformat a disk, the inode numbers of your files will most probably change, and all your short URLs will be broken or lost!
Same thing if, for any reason, you need to move your files to another partition of your disk, btw.
Another idea might be to calculate some crc/md5/whatever of the file's name, like you suggested, and use some algorithm to "shorten" it.
Here are a couple of articles about that:
Create short IDs with PHP - Like Youtube or TinyURL
Using Php and MySQL to create a short url service!
Building a URL Shortener
Rather clever use of the filesystem there. If you are guaranteed that inode IDs are unique, it's a quick way of generating the unique numbers. I wonder if this could work consistently over NFS, because obviously different machines will have different inode numbers. You'd then just serialize the link info in the file you create there.
To shorten the urls a bit, you might take case sensitivity into account, and do one of the safe encodings (you'll get about base62 out of it - 10 [0-9] + 26 (a-z) + 26 (A-Z), or less if you remove some of the 'conflict' letters like I vs l vs 1... there are plenty of examples/libraries out there).
You'll also want to 'home' your ids with an offset, like you said. You will also need to figure out how to keep temp file/log file, etc creation from eating up your keyspace.
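PHP's base_convert() stops at base 36, so a case-sensitive encoding needs a small helper. A minimal sketch of the base62 idea mentioned above; the function name `base62_encode` and the alphabet order (digits, lowercase, uppercase) are assumptions.

```php
<?php
// Encode a non-negative integer using 0-9, a-z, A-Z (62 symbols).
function base62_encode(int $n): string
{
    $alphabet = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';

    if ($n < 0) {
        throw new InvalidArgumentException('Only non-negative integers are supported');
    }
    if ($n === 0) {
        return '0';
    }

    $out = '';
    while ($n > 0) {
        $out = $alphabet[$n % 62] . $out; // prepend the least significant digit
        $n   = intdiv($n, 62);
    }

    return $out;
}

// Compare lengths for a sample inode-sized number.
$ino = 123456789;
echo base_convert((string) $ino, 10, 36), ' vs ', base62_encode($ino), "\n";
```

Removing look-alike characters such as I, l and 1 just means shortening the alphabet string and using its length in place of 62.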
Check out Lessn by Sean Inman; Haven't played with it yet, but it's a self-hosted roll your own URL solution.
