In PHP, the allow_url_fopen flag controls whether or not remote URLs can be used by various file system functions, in order to access remote files.
It is recommended security best practice nowadays to disable this option, as it is a potential attack vector. However, any code which depends on this functionality in order to work would be broken if the setting is disabled. For example, I know of at least one reCaptcha plugin which uses file_get_contents() to access the Google API and which therefore depends on this flag.
In order to check the code in our applications to determine whether it is safe to disable this flag (with a view to rewriting, where necessary) I need a canonical list of the PHP functions that it affects. However, I have been unable to find such a list - there doesn't seem to be one on the PHP website and a Google search didn't turn anything up.
Can anyone provide a list of all PHP functions whose behaviour is affected by allow_url_fopen?
The accepted answer should reference an authoritative source or provide details about methodology used to compile the list, to demonstrate its correctness and completeness.
The list of functions is massive, as the allow_url_fopen ini directive is implemented in PHP's streams system, meaning anything that uses PHP's network streams is affected.
This includes functions from pretty much every PHP extension that does not use an external library to access remote files. Some extensions, such as cURL, use their own transport layer outside of PHP's streams and are therefore not affected.
Some extensions, notably ext/soap, bypass this directive in some ways (why, I don't know exactly, as I'm not familiar with the internals of that extension).
Any function from the standard library (implemented in main/, Zend/, ext/standard, ext/spl) respects this directive, which covers the Filesystem, Stream, include and URL wrapper functionality. Off the top of my head, I also know that ext/exif does.
I cannot remember offhand whether the XML-based extensions (such as ext/libxml, ext/simplexml, ext/xmlreader, ext/xmlwriter, ext/dom) respect it, but I'm certain that at some point in the past they did not, as the path was passed directly to libxml2 underneath.
This is crying out for a list of functions/methods that can take either a file path or a URL when allow_url_fopen is enabled. Making it community wiki, as the reason I found this question was that I was looking for such a list and am unsure that I am considering every corner case.
Opens a file
copy
file
file_get_contents
file_put_contents
fopen
simplexml_load_file
Stats a file
file_exists
filemtime
filesize
filetype
is_dir
is_file
Note: not all of these will work for every kind of URL. For example, https:// URLs do not allow writing, so copy and file_put_contents will fail on such destinations. Meanwhile, ftp:// URLs do allow writing. Similar issues apply to file_exists.
I am deliberately not including functions like fwrite and fclose, because those take the result of fopen. To my mind, it is fopen that is impacted, not fwrite or fclose: of those three, only fopen can open a file. So it is fopen that needs to be checked, not subsequent uses of fwrite or fclose, which will work or fail as fopen does.
This is why I find answers like "Any function from the standard library" less than helpful. Most of those functions will work with streams opened under allow_url_fopen, but they will not themselves open such a stream. There may be many functions and methods that take resources that were originally opened via a URL, but I don't care about them unless they participate in the opening.
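To audit a codebase along these lines, one option is to scan the source for calls to the path-taking functions only. The following is a minimal sketch, not a complete tool: the function name find_url_capable_calls is hypothetical, the default list simply mirrors the list above (which may be incomplete), and it assumes the tokenizer extension is available. It ignores method calls and function definitions, and it will miss dynamic calls such as $fn($url) and namespaced names.

```php
<?php
// Sketch only: report calls to functions that take a path and may open it as a URL.
// The default list mirrors the community-wiki list and is not claimed to be exhaustive.
function find_url_capable_calls(string $source, array $functions = [
    'copy', 'file', 'file_get_contents', 'file_put_contents', 'fopen',
    'simplexml_load_file', 'file_exists', 'filemtime', 'filesize',
    'filetype', 'is_dir', 'is_file',
]): array
{
    $wanted = array_flip(array_map('strtolower', $functions));
    $tokens = token_get_all($source);
    $hits = [];
    $previous = null; // id of the last non-whitespace token seen

    foreach ($tokens as $i => $token) {
        $id = is_array($token) ? $token[0] : $token;
        if ($id === T_WHITESPACE) {
            continue;
        }
        // Skip $obj->copy(), Foo::copy() and "function copy()" definitions.
        $isCandidate = $id === T_STRING
            && isset($wanted[strtolower($token[1])])
            && !in_array($previous, [T_OBJECT_OPERATOR, T_DOUBLE_COLON, T_FUNCTION], true);
        $previous = $id;
        if (!$isCandidate) {
            continue;
        }
        // A call is the name followed (after optional whitespace) by "(".
        $next = $tokens[$i + 1] ?? null;
        if (is_array($next) && $next[0] === T_WHITESPACE) {
            $next = $tokens[$i + 2] ?? null;
        }
        if ($next === '(') {
            $hits[] = ['function' => strtolower($token[1]), 'line' => $token[2]];
        }
    }
    return $hits;
}
```

Each hit still has to be reviewed by hand, since the scan cannot tell whether the argument can ever be a URL.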
Another way of stating this is that I'm trying to list all functions that accept a URL (e.g. https://stackoverflow.com/ ) as a file path when allow_url_fopen is enabled (but not when it is disabled). Functions like fwrite and fclose do not do this (they take resources, not file paths). So I don't care about them even if their behavior is impacted by allow_url_fopen. I realize that the original question does not make this clear, but I believe that this was what was intended.
Related: list of supported protocols and wrappers.
The fsockopen and curl functions can open URLs even with allow_url_fopen turned off.
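For deciding at runtime whether a given path is the kind that depends on allow_url_fopen, PHP's stream_is_local() may help. This is a small sketch under that assumption; the helper names path_is_remote_url and url_fopen_enabled are made up for illustration.

```php
<?php
// Sketch: true when $path resolves to a non-local stream wrapper (http://, ftp://, ...),
// i.e. the kind of path that only opens when allow_url_fopen is enabled.
function path_is_remote_url(string $path): bool
{
    return !stream_is_local($path);
}

// Runtime view of the directive itself.
function url_fopen_enabled(): bool
{
    return filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN);
}
```

A caller could check path_is_remote_url($path) && !url_fopen_enabled() before handing $path to any of the functions listed above.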
I'm interested in making certain that a file uploaded via PHP and loaded into a database is locked down. Currently the key functions I'm using are fopen and fgetcsv. Unfortunately, information on this subject seems quite nebulous on the web.
The file isn't "executed" but is opened and walked with fgetcsv. What steps do I need to take to make certain that no foul play occurs on my server through this module?
Currently I limit the file size and check the extension.
Do I need to verify the file uploaded is actually a csv and not just some file with a csv extension? I assume this would be through a file type recognizer?
What do I need to do to avoid multibyte/encoding exploits?
***Edit
I found this link helpful, and it may be to others: http://php.net/manual/en/features.file-upload.post-method.php
Thanks
If you are relying on a library to parse user input, you should have confidence in the quality of the library.
If you don't then picking a separate library is advisable.
If no sufficiently stable library can be found for the task, the only viable option in a security-critical application is to implement the functionality yourself.
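As an illustration of validating the parsed content rather than trusting the extension, here is a minimal sketch. It is not a complete defence: read_csv_strict is a hypothetical helper, and the fixed column count and the UTF-8 requirement are assumptions about the application. Handling of the upload itself (for example is_uploaded_file() and move_uploaded_file()) is a separate step not shown here.

```php
<?php
// Sketch: read a CSV defensively. Returns the rows, or throws on the first problem.
function read_csv_strict(string $path, int $expectedColumns): array
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException('Cannot open file');
    }

    $rows = [];
    try {
        // Delimiter, enclosure and escape are passed explicitly (the usual defaults).
        while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
            if ($row === [null]) {
                continue; // fgetcsv reports a blank line as [null]
            }
            if (count($row) !== $expectedColumns) {
                throw new UnexpectedValueException('Unexpected column count');
            }
            foreach ($row as $field) {
                // preg_match with the /u modifier fails on invalid UTF-8 input.
                if (preg_match('//u', $field) !== 1) {
                    throw new UnexpectedValueException('Invalid UTF-8 in field');
                }
            }
            $rows[] = $row;
        }
    } finally {
        fclose($handle);
    }
    return $rows;
}
```

Values that pass these checks are still untrusted input and should reach the database through prepared statements.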
I was always sure that the PHP functions file_get_contents and readfile execute any PHP code in any files - regardless of file type - that are given to it. I tried this on multiple setups, and it always worked.
I received a question regarding this here, and the user seems to think that this is not the case.
I looked at the PHP documentation for the functions, and they do not mention code execution (which is something that I would expect if this is normally the case, as it has serious security implications).
I also searched for it, and found a lot of claims that the functions do not execute PHP code. For example:
readfile does not execute the code on your server so there is no issue there. source
Searching for "php file_get_contents code execution" also returns various questions trying to execute the retrieved PHP code, which seems odd if it would indeed normally execute any given PHP code.
I also found one question that asks how to avoid executing PHP code, so execution does seem to happen to others as well.
So my questions are:
do the functions file_get_contents and readfile execute PHP code in retrieved files?
does this depend on some php.ini setting? If so, what setting(s)?
does it depend on the PHP version, and if so, what versions are affected?
if it is not normally the case, what may be the reasons that they execute the PHP code in my setups?
file_get_contents and readfile do not execute code. All they do is return the raw contents of the file. That could be text, PHP code, binary (e.g. image files), or anything else. No interpretation of the files' contents is happening at all.
The only situations in which it may appear as if execution is happening are:
<?php ?> tags will likely be hidden by the browser because it's trying to interpret them as HTML tags, so this may lead to the impression that the PHP disappeared and hence may have been executed.
You're reading from a source which executes the code, e.g. when reading from http://example.com/foo.php. In this case the functions have the same effect as visiting those URLs in a web browser: the serving web server is executing the PHP code and returning the result, but file_get_contents merely gets that result and returns it.
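The difference can be checked locally without a web server. The sketch below writes a small PHP snippet to a temporary file (the file contents and variable names are made up for the demonstration), reads it with file_get_contents, and then runs it with include for comparison.

```php
<?php
// Write a tiny PHP script to a temporary file.
$path = tempnam(sys_get_temp_dir(), 'demo');
file_put_contents($path, '<?php echo 1 + 1; ?>');

// file_get_contents returns the source text unchanged; nothing is executed.
$raw = file_get_contents($path);

// include, by contrast, executes the file; capture what it prints.
ob_start();
include $path;
$executed = ob_get_clean();

// Escaping makes the tags visible in a browser instead of being swallowed as HTML.
$visible = htmlspecialchars($raw);

unlink($path);
```

Here $raw holds the literal source, while $executed holds the output of running it.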
Those functions are described in the «Function Reference / File System Related Extensions / Filesystem» section of the manual, while functions that execute code are described under «Function Reference / Process Control Extensions».
I'm pretty sure the misunderstanding comes from a fairly widespread confusion between file system and network, made worse by the PHP streams feature, which provides protocol wrappers that let you use the same functions to transparently open any kind of resource: local files, network resources, compressed archives, etc. I see endless posts here where someone does something like this:
file_get_contents('http://example.com/inc/database.inc.php');
... and wonders why he cannot see this database connection. And the answer is clear: you are not loading a file, you're fetching a URL. As a result, code inside database.inc.php gets effectively executed... though rather indirectly.
for security reasons I have disabled the function glob in the php.ini and it works as expected, but I also noticed that phpinfo reveals the following information:
Registered PHP Streams: php, file, glob, data, http, ftp, zip, compress.zlib, phar
So if I take following source:
$it = new DirectoryIterator("glob://C:\wamp\www\*");
foreach($it as $f) {
printf("%s: %.1FK\n", $f->getFilename(), $f->getSize()/1024);
}
It would still return the contents of the specified directory.
How can I globally unregister PHP Streams such as glob?
The short answer is: don't even bother trying.
PHP is a complete enough language that if someone is going to write dirty or vulnerable code, they can do it despite any block you put in place. The only thing disabling functions like that does is make application developers' lives hell.
It's been proven that things like safe_mode and open_basedir don't actually secure anything. The reason is twofold:
Black lists (which is what safe_mode is) don't work. This has been proven over and over and over.
You can't secure on top of an insecure base. It's already too late. PHP itself already has enough access that even if you disable all the fun parts, people can still get around it.
Instead, protect from the bottom up. Install a chroot jail, and run PHP inside that. Use proper permissions. Vet the code that you run on your server. Monitor the server for intrusions. Nothing fancy. Just good old fashioned sys-admin work...
To answer your original question
The only way that you can unregister a stream wrapper is to do it yourself via stream_wrapper_unregister(). You could use the auto_prepend_file ini directive to do it for you (to run that code before every script).
But realize that it's trivial to implement glob in PHP. So there really isn't much point in disabling it...
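For completeness, the prepend-file approach might look like the sketch below. The file name prepend.php and its path are hypothetical. Note that any script can call stream_wrapper_restore('glob') to bring a built-in wrapper back, which supports the point that this is not a real security boundary.

```php
<?php
// prepend.php (hypothetical name), referenced from php.ini with something like:
//   auto_prepend_file = /path/to/prepend.php
// Disables the built-in glob:// wrapper for the current request, if it is registered.
if (in_array('glob', stream_get_wrappers(), true)) {
    stream_wrapper_unregister('glob');
}
```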
I am in the process of migrating a lot of files in a large PHP application from local to remote storage. File operations are being transitioned using PHP stream wrappers as an intermediate solution so that we can easily change calls such as fopen('/local/file/path') to fopen('scheme://remote/file/path').
So far I've come across only one feature which is broken by this, which is the GD image library (its file write methods such as imagejpeg, imagegif and imagepng will not write to file streams).
In addition, PHP's security options deny include() and require() calls on URLs (allow_url_include is off by default).
I've tried looking for a list of known incompatibilities but can't find one.
I already have several workarounds available, so I'm covered there, and we'll perform extensive testing, but I would like to know in advance of any pain points if someone's been through the same process before.
Specifically, we are using PHP 5.3.6 on Debian Squeeze.
I would suggest reading this:
http://www.php.net/manual/en/class.streamwrapper.php
A lot of your answers will be found there.
I use the file_get_contents function to grab data from sites and store it in a database. It would be very inconvenient for me if the script stopped working one day.
I know that it can stop working if they change the structure of the site, but now I'm afraid that there may be mechanisms to disable this function, maybe from the server?
I tried to find documentation about it, but couldn't, so maybe you can help me?
Thanks
"I know, that it can start not working, if they change the structure of site, but now i'm afraid, that maybe there are mechanisms to disable the working of this function, maybe from server?"
Yes, remote access can be disabled from php.ini with the allow_url_fopen option. You have other options too, such as the cURL extension.
Note also that the openssl extension must be enabled in php.ini if you are going to use file_get_contents to read over a secure protocol such as https.
So in case file_get_contents is or gets disabled, you can go for the cURL extension.
It is possible to disable certain functions using disable_functions. Furthermore, support for URLs in filesystem functions like file_get_contents can be disabled with allow_url_fopen. So chances are that file_get_contents might not work as expected one day.
There are at least two PHP configuration directives that can break your script:
If allow_url_fopen is disabled, file_get_contents() will not be able to fetch files that are not on the local disk, i.e. it will not be able to load remote pages via HTTP.
Note: I've seen that option disabled quite a few times.
And, of course, with disable_functions, any PHP function can be disabled.
Chances are pretty low that file_get_contents() itself will ever get disabled...
But remote-file loading... Well, it might be wise to add an alternative loading mechanism to your script that uses curl in case allow_url_fopen is disabled.
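Such a fallback could be sketched as follows. This is one possible shape, not a definitive implementation: the function names choose_transport and fetch_url are hypothetical, the 10-second timeout is an arbitrary assumption, and error handling is kept to a bare minimum.

```php
<?php
// Sketch: decide which transport to use. Kept as a pure function so it can be checked offline.
function choose_transport(bool $urlFopenAllowed, bool $curlAvailable): ?string
{
    if ($urlFopenAllowed) {
        return 'streams';
    }
    if ($curlAvailable) {
        return 'curl';
    }
    return null;
}

// Fetch a URL with file_get_contents when allowed, otherwise with cURL.
// Returns the response body as a string, or false on failure.
function fetch_url(string $url)
{
    $transport = choose_transport(
        filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN),
        function_exists('curl_init')
    );

    if ($transport === 'streams') {
        return file_get_contents($url);
    }
    if ($transport === 'curl') {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
        curl_setopt($ch, CURLOPT_TIMEOUT, 10);          // assumed timeout in seconds
        return curl_exec($ch);
    }
    return false; // neither transport is available
}
```

Calling fetch_url('http://example.com/') would then work on hosts with either allow_url_fopen enabled or the cURL extension installed.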