Secure image download (by url) with php

I would like to allow users to download images from a URL (the same way you can on imgur.com). I know how to do it with copy(), cURL or file_get_contents(), but is that 100% secure?
What is the most secure way to do it?
Thanks

Is that 100% secure ?
No. Nothing is.
If you're trying to prevent eavesdropping, where an attacker can figure out what a user is downloading, then using https for everything on the download page should be sufficient for almost anything web related.
Even with https, an attacker might be able to tell. If there's one particularly large file, simple traffic analysis (looking at how much is downloaded) will tell you when it's downloaded.
If you allow uploading of SVG images, then, since they can contain and run scripts, they can phone home when downloaded and displayed.
You might also want to check out Tor, which provides better browser-based anonymity. Users have to install it themselves (it ships as the Tor Browser), but if you can suggest that your users use it, it adds a layer of protection: even if an eavesdropper can tell what is being downloaded, it will be much harder to tell who is downloading it.

Related

View/download PHP uploads - how to do it virus safe?

Now I've read a bunch of SO topics on how to check whether PHP uploads are virus safe and the gist from that is: I can't 100% guarantee that uploads aren't full of viruses - no matter the extension. One proposed solution is to remove the extension during the upload and then reassemble it when people want to download.
However, I want to let users view files directly on the website. How do I go about doing that? For example, generating an iframe with an uploaded PDF inside - is that safe or is it like executing it which would give potential viruses the opportunity to spread? With DOCs I wanted to use Google Docs, so I'd embed an iframe of Google Docs which GETs a URL of the DOC on my server. Is that safe then?
Or is there simply no way other than only allowing downloads to prevent potential viruses from spreading on the server? If so, how does reassembling the extension work? I'd guess that when someone uploads a test.exe, I'd remove the .exe part but store it in a database. Then when someone requests the download, I rename the test file to test.exe and push the download. After that I rename it back to test. Is that correct?
Also: how do services like Trello do this? When I upload an image file there, it gets shown directly - without noticeable delay through virus scans or whatever. I thought about using the virustotal.com API but that certainly takes quite long, doesn't it? Would it be okay though to let people upload, then not show them publicly until a virustotal.com-scan is done and then consider the file safe?
Thanks and cheers for all help and sorry, if I missed something.

There are a few approaches I've seen in practice over the years:
Scan it locally, using e.g. ClamAV.
Pro: If your virus detections are up-to-date, you'll catch any known viruses this way.
Con: Anti-virus software is an attack surface. See many of the findings of Tavis Ormandy from Google Project Zero.
Con: Could be taxing to server resources. (Maybe spin up a different server dedicated to AV purposes?)
Use an API, such as VirusTotal.
Pro: Less attack surface.
Con: You have to share the file with VirusTotal, which might be a bad idea if the files you're letting users upload are particularly sensitive (e.g. protected health information).
I'm not sure which to recommend, because I don't know your threat model or operational constraints.
However, the more general problem of not serving browser exploits (e.g. XSS) or allowing reverse shells on the server is actually somewhat easy, but not trivial.

Fetching a file on a server, resizing with PHP GD2, security considerations

What are the security considerations when a server fetches a file from an untrusted domain?
What are the security considerations when resizing an image that you don't trust with PHP's GD2 library?
The file will be stored on the server machine, and will be offered for download. I know I can't trust the MIME-Type header. Is there anything else I should be aware of?
I have a webservice that looks like this:
input
An http-URL (or a String that is expected to be a URL)
output
A meta description of the file, or an error if there was one.
The meta description has one of two forms:
It's an image + a URL to the image on my domain + a thumbnail of the image (generated on and hosted by my server)
It's not an image + a URL to the file on my domain
update
Concerns that I can come up with:
The remote server is a malicious server that will send tiny bits of information, enough to keep the socket open, but doesn't do anything useful - like slowloris. I don't know how real of a threat this is. I suppose it could be easily avoided with timeout + progress check.
The remote server serves something that looks like an image (headers, mime-type) but causes PHP to crash when I load it with GD2.
The server sends a useless or bad MIME-type header, like text/plain for binary files.
The remote server serves an image with a virus in it. I assume that resizing the image will get rid of the virus, but I will serve the original image if there is no reason to scale.
The remote server serves a file with a virus in it. The file will not be treated as an image so my server will do nothing with it. Nothing will happen until the user downloads, and runs it.
Also, I assume I can trust the users of my service. This is a private application in a situation where users can be held accountable for bad behavior. I assume they won't intentionally try to break it.

What are the security considerations when a server fetches a file from an untrusted domain?
Neither the domain (host) nor the file can be trusted. This breaks down into two points:
Transport
Data
To transport the data safely, use a timeout and a size limit. Modern HTTP client libraries offer both. If the file could not be fetched in time, drop the connection. If the file is too large, drop the data. Tell the user that there was a problem getting the file. Alternatively, let the user handle the transport to that server by using the user's browser and JavaScript to obtain the file, then post it to you. Set the post limit in your script.
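The timeout and size limit described above could look roughly like the following with PHP's cURL extension. This is only a sketch under assumptions: the function names `makeCappedWriter` and `fetchWithLimits`, the 5 MB cap and the 10 second timeout are made up for illustration and are not from the answer. The write callback aborts the transfer as soon as the cap would be exceeded, and `CURLOPT_TIMEOUT` bounds the total transfer time, which also covers a server that drips data slowly.

```php
<?php
declare(strict_types=1);

/**
 * Build a cURL write callback that appends to $buffer and aborts the
 * transfer once more than $maxBytes would be stored.
 */
function makeCappedWriter(string &$buffer, int $maxBytes): callable
{
    return function ($handle, string $chunk) use (&$buffer, $maxBytes): int {
        if (strlen($buffer) + strlen($chunk) > $maxBytes) {
            // Returning anything other than the chunk length makes cURL abort.
            return 0;
        }
        $buffer .= $chunk;
        return strlen($chunk);
    };
}

/**
 * Fetch $url over HTTP(S) only. Returns the body, or null on any problem
 * (bad scheme, timeout, oversized body, non-200 status).
 * The default limits are illustrative assumptions.
 */
function fetchWithLimits(string $url, int $maxBytes = 5000000, int $timeoutSeconds = 10): ?string
{
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    if (!in_array($scheme, ['http', 'https'], true)) {
        return null; // refuse file://, ftp://, php:// and friends
    }

    $body = '';
    $handle = curl_init($url);
    curl_setopt_array($handle, [
        CURLOPT_WRITEFUNCTION  => makeCappedWriter($body, $maxBytes),
        CURLOPT_CONNECTTIMEOUT => 5,               // seconds to establish the connection
        CURLOPT_TIMEOUT        => $timeoutSeconds, // seconds for the whole transfer
        CURLOPT_FOLLOWLOCATION => false,           // do not follow redirects blindly
        CURLOPT_PROTOCOLS      => CURLPROTO_HTTP | CURLPROTO_HTTPS,
    ]);
    $ok = curl_exec($handle);
    $status = curl_getinfo($handle, CURLINFO_RESPONSE_CODE);

    return ($ok !== false && $status === 200) ? $body : null;
}
```

A caller would treat a `null` result as "tell the user there was a problem getting the file". The sketch does not check where the host name resolves to, so a URL pointing at an internal address would still be fetched; that check would have to be added separately.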
As long as the data is untrusted you need to handle it with caution. That means you implement a process of your own that runs different security checks on the file before you mark it as "safe".
What are the security considerations when resizing an image that you don't trust with PHP's GD2 library?
Do not pass untrusted data to the image library then. See the step above, bring it into a safe state first.
The file will be stored on the server machine, and will be offered for download. I know I can't trust the MIME-Type header. Is there anything else I should be aware of?
I think you're still at the point above. How to come to safe from untrusted. Sure you can't trust the Content-Type header, however it's good to understand it as well.
You want to protect against the Unrestricted File Upload vulnerability (OWASP).
Check the filename. If you store the data on your server, give it a safe temporary name that can not be guessed upfront and that is not accessible via the web.
Check the data associated with the filename, e.g. the URL information of the source of that file. Properly handle encoding.
Drop anything that does not meet your expectations, so check the pre-conditions you formulate strictly.
Validate the file data before you continue, for example by using a virus checker.
Validate the image data before you continue. This includes the file headers (magic numbers) as well as checking that the file size and file content are valid. You should use a library that specializes in the job, e.g. an image-file-format malformation checker. This is specialized software, so if it is a core part of your business, invest in the topic. Plenty of free image-handling code exists; I mention this only for information, since you can't trust any recommendation blindly and need to dig into the topic yourself.
If you plan to resize the image yourself, you need to make everything double-safe, because next to hosting you plan to process the data. So know what you do with the data first to locate potential fields of problems.
Do logging and monitoring.
Have a plan for the case that everything goes wrong.
Consider repeating the process for already existing files, so that if you change your procedure, you can automatically apply the new rules to uploads that were made in the past as well.
Create a system for each type of work that can be cleaned after the work has been done: one system to do the download, one system to obtain the metadata, etc. After each action, restore the system from an image. If a single component fails, it won't be left behind in an exploited state. Additionally, if you detect a failure, you can take your whole system out of service until you have found the flaw.
All this depends a bit how much you want to do, but I think you get the idea. Create a process that works for you knowing where improvement can be added, but first create an infrastructure that is modular enough to deal with error-cases and which probably encapsulates the process enough to deal with any outcome.
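The point above about giving stored data "a safe temporary name that can not be guessed upfront" can be sketched as follows. The function name `storeUntrusted` and the idea of passing in a storage directory are assumptions for illustration; the directory is expected to live outside the web root so the file cannot be requested directly.

```php
<?php
declare(strict_types=1);

/**
 * Store untrusted bytes under a random name with no extension and
 * return that name. $storageDir is assumed to be outside the web root.
 */
function storeUntrusted(string $bytes, string $storageDir): string
{
    // 16 random bytes from the CSPRNG give a 32-character hex name.
    $name = bin2hex(random_bytes(16));
    $path = rtrim($storageDir, '/') . '/' . $name;

    if (file_put_contents($path, $bytes, LOCK_EX) === false) {
        throw new RuntimeException('Could not write ' . $path);
    }
    chmod($path, 0600); // readable and writable by the owner only

    return $name;
}
```

The original filename and the detected type would then be kept in a database next to the random name, never as part of the path on disk.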
You could delegate critical parts to a system that you don't need to care about, e.g. to separate processing from hosting. Additionally, when you host the images the webserver must not be clever. The more stupid a system is, the less exploitable it is (normally).
If hosting is not part of your business, why not hand it over to amazon s3 or similar stores? Your domain can be preserved via DNS settings.
Keep the libraries you use to verify images up to date (which implies you know which libraries are used and their versions, e.g. the PHP exif extension makes use of mbstring and so on; track the whole tree down). Make sure you're in a position to report flaws to the library maintainers in a useful way, e.g. with logging, storing upload data to reproduce problems, etc.
Get knowledge about which exploits for images did exist in the past and which systems/components/libraries (example, see disclaimer there) were affected.
Also get into the topic which are common ways to exploit something, to get the basics together (I'm sure you are aware, however it's always good to re-read some stuff):
Secure file upload in PHP web applications (Alla Bezroutchko; June 13, 2007; PDF)
Some related questions, assorted:
Is it important to verify that the uploaded file is an actual image file?
PHP Upload file enhance security

What you're describing basically comes down to an input validation problem; you don't trust what your application is reading in as input and processing.
To address this, you should download the resource in question and then attempt to determine its true file type. There are multiple ways to attempt this, but basically you will want to use either some custom code or a library to parse through the file and look for the telltale signs of a certain type. There is a good SO discussion on how to do this in PHP here - How can I determine a file's true extension/type programatically? - I would check the second answer, which lists some PHP-specific functions to do this. When your application receives a file, it should perform some true file typing like this and then compare the result to the MIME type specified by the remote server; if they match, accept the file, and if they do not, drop it.
I would also suggest using a whitelist of allowable filetypes (a list of everything your service will support and then ONLY accept files of those types). If you have a very general-purpose service, then you should at least do a blacklist of disallowed filetypes (a list of everything your service absolutely will not support and drop those immediately based on the outcome of your MIME type compares). Again, the use of these is entirely dependent on your use-cases.
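A minimal sketch of the "true file typing plus whitelist" idea, written in plain PHP without any extension. The names `sniffImageType` and `acceptDownload` and the three-entry whitelist are assumptions chosen for illustration. Note that a matching signature only says what the file claims to be at byte zero; it does not prove the rest of the file is a well-formed image.

```php
<?php
declare(strict_types=1);

/**
 * Return the MIME type implied by the leading magic bytes,
 * or null if the data matches nothing on the whitelist.
 */
function sniffImageType(string $bytes): ?string
{
    // Whitelist: only these types are ever accepted.
    $signatures = [
        'image/png'  => "\x89PNG\r\n\x1a\n",
        'image/jpeg' => "\xFF\xD8\xFF",
        'image/gif'  => 'GIF8', // covers GIF87a and GIF89a
    ];
    foreach ($signatures as $mime => $magic) {
        if (strncmp($bytes, $magic, strlen($magic)) === 0) {
            return $mime;
        }
    }
    return null;
}

/**
 * Accept the download only if the sniffed type is whitelisted AND
 * agrees with the Content-Type the remote server declared.
 */
function acceptDownload(string $bytes, string $declaredContentType): bool
{
    $detected = sniffImageType($bytes);
    // Drop parameters such as "; charset=binary" before comparing.
    $declared = strtolower(trim(explode(';', $declaredContentType)[0]));

    return $detected !== null && $detected === $declared;
}
```

PHP's fileinfo extension (`finfo`) does the same kind of sniffing for many more formats, if it is available on the server.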
Once you've got a type, the concern becomes if what the remote server has sent you is a bad file that targets your server (contains malicious code, buffer overflow designed to make the GD2 library blow up and run arbitrary code, etc). Basically, you are relying on the GD2 library to not contain bugs that would lead to such a successful exploit. There's not much you can do here, short of running security audit on the library yourself and I'm going to assume that's out-of-scope. Basically, keep up on any reported security bugs with the library and patch as soon as you can; as a consumer of the library, you are really relying on the maintainers to find and remedy security vulnerabilities like this.
Next, the concern is that the remote server has sent you a bad file that targets your users/clients (contains malicious code, buffer overflows, viruses, etc). Here, if there is corrupted data that is really malware in the image, it will most likely either (1) break or exploit GD2 when it is read (see above for that scenario) or (2) be eliminated when the resize operation is performed by the library if GD2 can successfully process it. There is still a chance it will remain despite the processing, but there's not much you can do there either. If you're really concerned about this, you can apply a virusscan using an external product designed for that; I would suggest that if you're doing that to do so both (1) after the download and before GD2 processing and then (2) on the manipulated file before you serve it out. Personally, I don't think you get much by doing this, but if you want to provide an additional check / warm fuzzies to your users, it cannot hurt.
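The "eliminated when the resize operation is performed" case can be sketched like this: decode the untrusted bytes with GD, scale, and write out a freshly encoded PNG so that only pixel data survives. The function name `reencodeAsPng` and the limits (4096 px per side, 200 px thumbnail width) are assumptions for illustration. The dimension check runs before decoding so that an image with absurd declared dimensions is rejected before GD allocates memory for it.

```php
<?php
declare(strict_types=1);

/**
 * Decode untrusted image bytes with GD and return a newly encoded PNG
 * thumbnail, or null if the data is not a usable image.
 */
function reencodeAsPng(string $bytes, int $maxSide = 4096, int $thumbWidth = 200): ?string
{
    if (!extension_loaded('gd')) {
        return null; // GD is required for this sketch
    }

    // Read only the header first and refuse unreasonable dimensions.
    $info = @getimagesizefromstring($bytes);
    if ($info === false || $info[0] < 1 || $info[1] < 1
        || $info[0] > $maxSide || $info[1] > $maxSide) {
        return null;
    }

    $image = @imagecreatefromstring($bytes);
    if ($image === false) {
        return null; // header looked fine but the body did not decode
    }

    // Scale down (never up); the height follows the aspect ratio.
    $thumb = imagescale($image, min($thumbWidth, $info[0]));
    if ($thumb === false) {
        return null;
    }

    // imagepng() writes to output, so capture it in a buffer.
    ob_start();
    imagepng($thumb);
    return (string) ob_get_clean();
}
```

As the answer says, this still relies on GD itself being free of exploitable bugs, so it complements keeping the library patched rather than replacing it.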
To address the slow-feeding of data to keep a connection open, put a timeout on any connection to deal with this problem; unless you are dealing with a specific threat to your use-case here, I do not think this is a huge concern.

1) My primary concern with blindly fetching a file from an untrusted domain would be how to verify that the file is, in fact, what you expected to get. Could the untrusted server trick your script into downloading a harmful file (like a virus) or possibly a script that would allow a backdoor into your system?
2) I haven't read any security issues with resizing an image with the GD2 library. If it's not an image to begin with, the GD2 functions would throw an error. I don't think you have much to worry about with this part.
3) I (personally) would not ever do this without reviewing every single file that my script downloaded first. If you want to partially automate this, you might consider running magic number tests on all the files as a pre-filter. But a human look is the safest way to serve random files. When you finish this project - before you make it live - try to break / trick / hack it as hard as you can. Get some knowledgeable friends involved to help.

When it is not an image, you store the file anyway, regardless of what kind of file it is? So someone can upload a PHP file and browse to it to execute PHP code on your server?

Large file uploads from web pages

I code primarily in PHP and Perl. I have a client who is insisting on seeking video submissions (any encoding) from the public via one of their pages rather than letting YouTube do its job.
Server in question is a virtual machine and I can adjust ini settings for max post, max upload size etc as needed.
My initial thought is to use a Flash based uploader with PHP on the back end but I wondered if someone might have useful advice and experience on the subject?

Doing large file transfers over HTTP is not usually fun -- but sometimes it's necessary.
For large files, you'll definitely want to provide some kind of progress gauge for end-users.
There are flash-based tools that do this (swfUpload comes to mind).
If you want to avoid flash and do it with pretty html/javascript/css, you can leverage PHP's APC extension, which for some reason provides support for getting upload status from the server, as explained here

You can adjust the post size and use a normal HTML form. The big problem is not Apache, it's HTTP. If anything goes wrong in the transmission you will have no way to detect the error. Furthermore, there is no way to resume the transfer. This is exactly why BitTorrent is so popular.

I don't know how against youtube your client is, but you can use their api to do the uploads from a page on your site.
http://code.google.com/apis/youtube/2.0/developers_guide_protocol.html#Uploading_Videos
See: browser based uploading.

For web-based uploads, there aren't many options. Regardless of web platform, web server, etc. you're still transferring over HTTP. The transfer is all or nothing.
Your best option might be to find a Flash, Java, or other client side option that can chunk files and upload them piecemeal, then do a checksum to verify. That will allow for resuming uploads. Unfortunately, I don't know of any such open source component that does this.
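On the server side, the "chunk, upload piecemeal, then checksum" approach could be received roughly as follows. This is a sketch under assumptions: the function name `assembleChunks` is made up, the chunks are assumed to have already been saved as separate files in the correct order, and the client is assumed to send a SHA-256 hash of the complete file.

```php
<?php
declare(strict_types=1);

/**
 * Concatenate already-uploaded chunk files (in order) into $targetPath
 * and verify the result against the SHA-256 the client reported.
 * Returns false, and removes the target, if anything does not add up.
 */
function assembleChunks(array $chunkPaths, string $targetPath, string $expectedSha256): bool
{
    $out = fopen($targetPath, 'wb');
    if ($out === false) {
        return false;
    }
    foreach ($chunkPaths as $chunkPath) {
        $in = @fopen($chunkPath, 'rb');
        if ($in === false) {
            fclose($out);
            unlink($targetPath);
            return false; // a chunk is missing: the client must resend it
        }
        stream_copy_to_stream($in, $out);
        fclose($in);
    }
    fclose($out);

    $actual = (string) hash_file('sha256', $targetPath);
    if (!hash_equals(strtolower($expectedSha256), $actual)) {
        unlink($targetPath);
        return false; // corrupted or incomplete transfer
    }
    return true;
}
```

Because each chunk is a small, independent request, a failed chunk can be retried on its own, which is what makes resuming possible.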

Try to convince your client to change their point of view.
Using HTTP (and the browser, hell, the browser!) for this kind of thing is rarely a good deal. Will their users wait 40 minutes with the computer and the browser running until the upload is complete?
I don't think so.
Maybe you could set up a public FTP account where users can upload but cannot download or see other users' files. Then whoever wants to use FTP software can, and whoever prefers to do it via the browser can too.
The big problem with using a browser is that, if something goes wrong, you can't resume and have to restart from zero.
Last year I had the same issue. I had a look at ZUpload, but I didn't use it, so I can't really recommend it. (We wrote a small Python script that we send to our customers; the script creates a torrent of the folder the customer needs to send to us, and we download it via uTorrent ;)
P.S.: again, sorry for my bad English ;)

I used jupload. Yes it looks horrible, but it just works.
With that said, it's still a better idea to convince the client that doing so is stupid.

I would agree with others stating that using HTML is a poor option. I believe there is a size limitation using Flash as well. I know of a script that uses a JavaScript Applet to perform an actual FTP transfer. It is called Simple2FTP and can be found at http://www.simple2ftp.com
Not sure but perhaps worth a try?

Security issue?

I am writing a small PHP application and I am not sure whether I have a security issue. So this is what the application does:
the user can upload either image files (png, gif, jpg, jpeg, tiff and a few others) or zip files
I check the MIME type and extension, and if it's not an allowed one I reject the upload (this is not the part I am worried about).
Now once uploaded I rename the file to a unique hash and store in a folder outside root access.
The user can now access the file through a short URL. I make the file accessible by setting the right mime-type for the header and then I just use readfile().
My question is whether the exploit where a jar file is included inside the image file works here? I am serving the image as a pure image.
If it does what are ways to prevent this?
Thanks.

MIME type checks will not solve the GIFAR issue. 2009's JREs are already patched, but if you want to solve the issue you can either
Serve your images from a different domain
Run a server side code to check if an image contains a valid JAR, like mentioned here
Anything else (short of denying the file to any Java enabled browser with an old enough JRE) may fail on specific cases.
Also remember that to perform a good attack with this technique your server infrastructure would have to be somewhat open (the fact that a request comes from the same domain doesn't mean that you should give any information it asks for.)
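The server-side check mentioned above can be approximated in PHP. A JAR is a ZIP archive, and a ZIP is located by its end-of-central-directory record, which starts with the bytes `PK\x05\x06` and sits within the last 22 + 65535 bytes of the file. The sketch below only looks for that marker; the function name `containsZipTrailer` is an assumption, and it is not the tool the answer links to. It can produce false positives, because compressed image data may contain those four bytes by chance.

```php
<?php
declare(strict_types=1);

/**
 * Heuristic GIFAR check: does the tail of the data contain a ZIP
 * end-of-central-directory signature?
 */
function containsZipTrailer(string $bytes): bool
{
    // The record is at least 22 bytes long and may be followed by a
    // comment of up to 65535 bytes, so only the tail needs scanning.
    $tail = (string) substr($bytes, -(22 + 65535));

    return strpos($tail, "PK\x05\x06") !== false;
}
```

An upload handler could reject (or re-encode) any image for which this returns true.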

Checking the MIME type is not sufficient because that (or any other) HTTP header field can be forged. The best way to confirm that a file is a valid image is to attempt to read it as an image programmatically. If it can be parsed as an image, you can be reasonably confident that it's not malicious code.

Related: ensuring uploaded files are safe
Any kind of hidden exploit like you describe should not affect the server because of the way you handle it. You're simply storing binary information and retrieving binary information, without processing it in any way. Browsers attempting to display exploited content might be at risk, but standard image types are fairly safe.
If you'd like to be safer, you could run an anti-virus on each uploaded file. If you're on a *nix platform, you can use the industry-standard ClamAV.
I'd be more worried about someone trying to upload a very large image file.
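For the "store binary, retrieve binary" pattern from the question (hashed filename outside the web root, then a header plus readfile()), a sketch of the serving side might look like this. The function names `downloadHeaders` and `sendStoredFile` are assumptions. The `X-Content-Type-Options: nosniff` header asks browsers not to second-guess the declared type, and the download name is sanitized so it cannot inject extra header lines.

```php
<?php
declare(strict_types=1);

/**
 * Build the response headers for a stored file. The MIME type should
 * come from your own validation, never from the uploader.
 */
function downloadHeaders(string $mime, string $downloadName, int $size): array
{
    // Keep only harmless characters; this also strips CR/LF.
    $safeName = (string) preg_replace('/[^A-Za-z0-9._-]/', '_', $downloadName);

    return [
        'Content-Type: ' . $mime,
        'Content-Length: ' . $size,
        'X-Content-Type-Options: nosniff',
        'Content-Disposition: attachment; filename="' . $safeName . '"',
    ];
}

/**
 * Send a file stored outside the web root to the client.
 */
function sendStoredFile(string $path, string $mime, string $downloadName): void
{
    foreach (downloadHeaders($mime, $downloadName, (int) filesize($path)) as $line) {
        header($line);
    }
    readfile($path);
}
```

The sketch uses `attachment`, which forces a download; for images that should be displayed in the page, `inline` would be used instead.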

You can do two things. Serve your images from images.domain.com. This would have to be on another physical/virtual server, or firewalled such that no open ports on the server can be accessed from that domain.
Or you can run the image file through a Java program (not JavaScript) like the one here. This will tell you if there is a JAR file embedded in the image.
More info on this issue here:
http://www.gnucitizen.org/blog/java-jar-attacks-and-features/

I didn't actually even hear about this attack before your question, so first off, thanks for enlightening me! Googling around, it seems that there are basically two different attack vectors here. Both include the attacker luring "regular" users to a malicious site pointing to the masqueraded JAR file, and both have to do with the fact that the JAR will be executing in the "context" of your site (i.e. the origin will be your site).
First attack has to do with the applet being able to read user cookies, which basically means it'll be able to steal the user's login information for your domain.
The second one has to do with the fact that the applet is now allowed to open connections to other sockets within your domain, which is pretty bad if one of the users behind your server's firewall visits the malicious page (enabling the attacker to effectively bypass your firewall).
So this attack does not necessarily harm your server directly, but it does harm your users - and hopefully you care about your users. The two things you can do ensure their safety have already been mentioned in most of the other answers and are summarized on this page.

How can I protect my site from being leeched?

I am using PHP's header() function to send the file to the browser with a small piece of code. It works well, and I have set it up so that if anyone requests the file with a referer other than my site, they are redirected to a page first.
Unfortunately it's not working with Internet Download Manager.
What I want to know is how sites like RapidShare and 4shared do this.

You could use sessions to make sure the download is being requested by a valid user.

Not all browsers or other software that can fetch web pages will send a Referer to your server. Some sites build a browser "fingerprint", usually hashed, which might be Referer, User-Agent and a couple of other headers strung together to make a unique identifier for that user, and restrict access based on it as you describe.
Of course, I may have completely missed the point of your post!
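A hashed fingerprint of the kind described above could be sketched like this. Which headers go into it is an assumption made here for illustration (User-Agent, Accept-Language and the client address), as are the function name `requestFingerprint` and the server-side secret. All of these values are under the client's control, so this only deters casual leeching.

```php
<?php
declare(strict_types=1);

/**
 * Derive a keyed hash from a few request properties. $server is the
 * $_SERVER array (passed in so the function stays testable) and
 * $secret is a value known only to the server.
 */
function requestFingerprint(array $server, string $secret): string
{
    $parts = [
        $server['HTTP_USER_AGENT'] ?? '',
        $server['HTTP_ACCEPT_LANGUAGE'] ?? '',
        $server['REMOTE_ADDR'] ?? '',
    ];

    return hash_hmac('sha256', implode("\n", $parts), $secret);
}
```

The landing page would store `requestFingerprint($_SERVER, $secret)` in the session, and the download script would recompute it and compare the two with `hash_equals()` before sending the file.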

A typical design pattern is using a front controller to have a single entry point for all requests. By having a front controller, you can control exactly what the client sees.
You can configure this in Apache so that all requests go through a single file (it's been a while since I've done this because I now concentrate on Java). I think you would need to look at pathinfo documentation for Apache.
This might require a significant change in the rest of your application code. But, the code will be more secure and maintainable in the long run.
I've served images and other binary files through this pattern. This allowed me to easily verify users were authenticated before actually sending them the file. Obfuscation is not security, so if you rely on obfuscating your URL, an attacker may be delayed in getting in, but it is just a matter of time.
Walter

The problem is probably that sending a file through a PHP script (with the headers you mentioned) doesn't support starting the download at a given position. Download managers use this feature to download a file using several simultaneous connections (on the assumption that the server limits the speed of each connection).
For a small project I would recommend making a copy of the file with a unique filename just for the duration of the download and redirecting the user to this copy. This way the user gets the web server's full download features, and it doesn't load the processor the way PHP does. Disadvantages: more disk space is required, and the download directory needs to be cleaned up.
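If the file does have to go through PHP, "starting the download at a certain position" means honoring the HTTP `Range` request header. The parsing step can be sketched as below; the function name `parseByteRange` is an assumption, and the sketch deliberately handles only a single range (requests for several comma-separated ranges are treated as unsupported).

```php
<?php
declare(strict_types=1);

/**
 * Parse a single-range "Range: bytes=..." header value.
 * Returns [start, end] (inclusive byte offsets) or null if the header
 * is absent, unsupported, or cannot be satisfied for this file size.
 */
function parseByteRange(string $header, int $fileSize): ?array
{
    if ($fileSize < 1 || !preg_match('/^bytes=(\d*)-(\d*)$/', trim($header), $m)) {
        return null;
    }
    [, $first, $last] = $m;

    if ($first === '') {
        // Suffix form "bytes=-N": the last N bytes of the file.
        if ($last === '' || (int) $last === 0) {
            return null;
        }
        $start = max(0, $fileSize - (int) $last);
        $end = $fileSize - 1;
    } else {
        $start = (int) $first;
        // An open end, or one past the file, is clamped to the last byte.
        $end = ($last === '') ? $fileSize - 1 : min((int) $last, $fileSize - 1);
    }

    return ($start <= $end && $start < $fileSize) ? [$start, $end] : null;
}
```

When a range is returned, the script would answer with status 206, a `Content-Range: bytes start-end/size` header and only those bytes (for example via `fseek()` and `fread()`); otherwise it sends the whole file as before.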
