Apache with PHP serves a base file instead of 404 - php

Problem: Suppose a URL is requesting a file that doesn't exist, e.g. mydomain.com/index.php/bogus
There is no folder named 'bogus' so I expect a '404 not found' response, but instead Apache sends the request to /index.php (which does exist). Why? How do I change it to respond '404 not found'?
I suppose that, in theory, Apache does this to let me generate a custom index page for the folder 'bogus' (which however does not exist). But in practice, by returning a page with 200 response, it is causing confusion to search engines and accidental visitors. My PHP code in 'index.php' is not expecting this URL and so it generates broken links in its dynamic navigation routines.
I've tried to disable indexes (Options -Indexes) and directory indexing (DirectoryIndex disabled) and removed .htaccess (AllowOverride None). None of these changed the response. I've searched Stack Overflow and it has plenty of "how to serve a file instead of 404" questions, but this is the opposite: I want Apache to return 404 instead of serving a PHP file from higher up in the file system.
My server environment is Windows Server 2008, Apache 2.2.22, and PHP 5.3. No mod_rewrite.

The solution that works is to add AcceptPathInfo Off to the Apache config file.
This directive controls whether requests that contain trailing pathname information that follows an actual filename (or non-existent file in an existing directory) will be accepted or rejected. The trailing pathname information can be made available to scripts through the CGI (common gateway interface) specifications.
When AcceptPathInfo is 'Off', Apache treats the URL as one long path and accepts the request only if a file matching the whole path exists in your filesystem.
When AcceptPathInfo is 'On', Apache separates the URL into a script name plus the trailing characters, which are made available to the script as PATH_INFO.
The Apache core docs have more info: http://httpd.apache.org/docs/2.0/mod/core.html#acceptpathinfo

You don't have a folder named index.php, you have a file with that name. I think Apache finds the file and decides it has found what was requested, so it serves the file.
In your index.php file, you can check that $_SERVER['REQUEST_URI'] is a valid request for index.php. If it isn't, you can use the PHP http_response_code(404) function (available from PHP 5.4) or the header() function to make your index.php return 404 for invalid URLs.
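The check described above could be sketched roughly as follows. This is only an illustration: it tests $_SERVER['PATH_INFO'] rather than parsing REQUEST_URI (a swapped-in, simpler test that assumes the server populates PATH_INFO), and the function names has_unexpected_path_info and send_not_found are hypothetical.

```php
<?php
// Hypothetical helper (not from the original answer): true when extra path
// segments follow the script name, e.g. /index.php/bogus.
function has_unexpected_path_info(array $server)
{
    return isset($server['PATH_INFO']) && $server['PATH_INFO'] !== '';
}

// Hypothetical helper: emit a 404 status line and a short body.
function send_not_found()
{
    // header() is available on PHP 5.3; http_response_code(404) needs PHP 5.4+.
    header('HTTP/1.1 404 Not Found');
    echo '404 Not Found';
}
```

At the top of index.php this might be used as: if (has_unexpected_path_info($_SERVER)) { send_not_found(); exit; }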

Related

Why do URLs to PHP pages not yield a 404 when nonsense is appended to the URL

Let's take a proper URL to a php-Page like:
https://secure.php.net/ChangeLog-7.php
If we now add a trailing slash and some random garbage like this:
https://secure.php.net/ChangeLog-7.php/nonexistentfolder/anotherfile.html
the URL still works. In my opinion, it should have generated a 404 error, because neither the folder "nonexistentfolder" nor the file "anotherfile.html" exists on the remote server.
This seems to happen generally, independent from webserver or rewrite-rules, so it seems to have its source in the PHP-Webserver-Module.
I do understand what PATH_INFO is, but I do not understand why calling such a URL does not generate a 404 response, which would be the case if the existing file in the URL were .html (and not .php).
How do people deal with this, e.g. to keep such bogus links from making their way into search engines or the like?
Thanks!
According to the Apache documentation, the default behaviour of AcceptPathInfo depends on the handler used to answer the request. The handlers for .html and .php files are different, and it seems the handler for .php accepts PATH_INFO by default.
If you want the webserver to reply with a 404 status when the URL points to a nonexistent file or folder but begins with a valid .php file, you can do so by adding the following, e.g. to an .htaccess file:
<Files ~ "\.php$">
AcceptPathInfo Off
</Files>

How to protect my php files on the server from being requested

I'm very new to PHP and the web. I'm now learning about OOP in PHP and how to divide my program into classes, each in its own .php file. Until now, all I knew about PHP programs was that I might have these files in my root folder:
home.php
about.php
products.php
contact.php
So, whenever the client requests any of that in the browser
http://www.example.com/home.php
http://www.example.com/about.php
http://www.example.com/products.php
http://www.example.com/contact.php
No problem, the files will output the proper page to the client.
Now, I have a problem. I also have files like these in the root folder
class1.php
class2.php
resources/myFunctions.php
resources/otherFunctions.php
How do I prevent the user from requesting these files by typing something like this in the browser?
http://www.example.com/resources/myFunctions.php
One way I have been thinking of is adding the line exit; at the top of each of these files.
Or, I know there is something called .htaccess, an Apache configuration file that affects the way Apache works.
What do real-life applications do to solve this problem?
You would indeed use whatever server side configuration options are available to you.
Depending on how your hosting is set up, you could either modify the include path for PHP (http://php.net/manual/en/ini.core.php#ini.include-path) or restrict the various documents/directories to specific hosts/subnets/no access in the Apache site configuration (https://httpd.apache.org/docs/2.4/howto/access.html).
If you are on shared hosting, this level of lockdown isn't usually possible, so you are stuck with the .htaccess file: combine an easy-to-handle file naming convention (i.e. classFoo.inc.php and classBar.inc.php) with the FilesMatch directive to block access to *.inc.php - http://www.askapache.com/htaccess/using-filesmatch-and-files-in-htaccess/
FWIW all else being equal the Apache foundation says it is better/more efficient to do it in server side config vs. using .htaccess IF that option is available to you.
A real-life application often uses a so-called public/ or webroot/ folder in the root of the project, which holds all files that may be requested over the web.
An .htaccess file in the project root then forwards all HTTP requests to this folder with internal URL rewrites like the following:
RewriteEngine On
# Match either nothing (www.mydomain.com) or anything else (www.mydomain.com/home.php)
RewriteRule ^$ webroot/ [L]
RewriteRule ^(.*)$ webroot/$1 [L]
.htaccess uses regular expressions to match the request URI (everything in the URL after the hostname) and prepends that with webroot/, in this example.
www.mydomain.com/home.php becomes www.mydomain.com/webroot/home.php,
www.mydomain.com/folder/file.php becomes www.mydomain.com/webroot/folder/file.php
Note: this will not be visible in the url in the browser.
When configured properly, files placed outside of this folder cannot be accessed by a regular HTTP request. Your application (your PHP scripts), however, can still access those private files, because PHP runs on your server and so has filesystem access to them.
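As a rough sketch of that layout, a script inside the web-served folder might load a private file from one level above it. The file location webroot/home.php and the resources/ folder sitting next to webroot/ are assumptions made for illustration, reusing names from the question.

```php
<?php
// Assumed location of this script: <project>/webroot/home.php
// Assumed private folder:          <project>/resources/ (not reachable over HTTP)
$privateDir = dirname(__DIR__) . '/resources';
$functionsFile = $privateDir . '/myFunctions.php';

// PHP reads the file through the filesystem, so no URL for it needs to exist.
if (is_file($functionsFile)) {
    require $functionsFile;
}
```

The is_file() guard only keeps the sketch from failing when the assumed file is absent; a real application would more likely let require fail loudly.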

Apache does not generate a 404

If I have /faq.php on the server, it can also be accessed via /faq.php/nonexistant.gif. Why? I have made sure MultiViews is disabled. Why do the contents of /faq.php get shown when I access the URI /faq.php/randomstuff.gif? FYI, I have no .htaccess file in the same directory.
/nonexistant.gif will be the CGI "PATH_INFO": http://www.ietf.org/rfc/rfc3875, section 4.1.5
Basically, the webserver will scan "down" a url until it hits an actual file. Anything after that file in the url becomes PATH_INFO.
http://example.com/some/path/leading/to/realfile.php/extra/stuff/that/becomes/path/info
                   ^^^^^^^^^^^^^^^^^^^^--- real directories
                                        ^^^^^^^^^^^^--actual file, scanning stops here
                                                    ^^-----onwards = path_info
That is called path_info. You can disable it using AcceptPathInfo Off in the Apache config. People generally use it as a fake mod_rewrite when mod_rewrite is not available.
http://httpd.apache.org/docs/2.2/mod/core.html#acceptpathinfo
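The "scan down the URL until an actual file is hit" behaviour described in this thread can be imitated in a few lines of PHP. This is a simplified model, not Apache's actual implementation; the function name split_script_and_path_info and the injected $isFile predicate (standing in for the filesystem) are hypothetical.

```php
<?php
// Simplified model: walk the URL path segment by segment until an existing
// file is found; whatever follows that file becomes PATH_INFO.
// $isFile is an injected predicate (an assumption) standing in for the filesystem.
function split_script_and_path_info($urlPath, $isFile)
{
    $segments = explode('/', ltrim($urlPath, '/'));
    $current = '';
    foreach ($segments as $i => $segment) {
        $current .= '/' . $segment;
        if (call_user_func($isFile, $current)) {
            // Everything after the real file is path info (may be empty).
            $rest = array_slice($segments, $i + 1);
            $pathInfo = count($rest) > 0 ? '/' . implode('/', $rest) : '';
            return array($current, $pathInfo);
        }
    }
    // No real file anywhere on the path: a genuine 404.
    return null;
}
```

With a predicate that treats only /faq.php as a file, the path /faq.php/nonexistant.gif splits into the script /faq.php and the path info /nonexistant.gif, while a path containing no real file returns null.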

Apache configuration broken after software update

I just updated my outdated Ubuntu development server, and the update broke some configuration.
Now Apache/PHP does not properly handle URLs like index.php/profile, but it still handles plain index.php correctly.
Basically, if there is some path after index.php, it returns a 404 error:
The requested URL /index.php/profile was not found on this server.
What configuration option is likely to fix this problem? I need to fix this urgently. Thanks in advance!
Check the setting of AcceptPathInfo:
This directive controls whether requests that contain trailing pathname information that follows an actual filename (or non-existent file in an existing directory) will be accepted or rejected. The trailing pathname information can be made available to scripts in the PATH_INFO environment variable.
For example, assume the location /test/ points to a directory that contains only the single file here.html. Then requests for /test/here.html/more and /test/nothere.html/more both collect /more as PATH_INFO.

How to treat a php file as directory

I wanted to know if it is possible to treat a PHP file as a directory, so that index.php/abc/def really calls index.php. The index.php should then know the subdirectory path (i.e. /abc/def).
I'm looking for a plain PHP solution. I know that I could use mod_rewrite to map the directory to GET parameters.
If I recall correctly, you should be able to parse out this info from $_SERVER['PHP_SELF'].
Edit: Even better, take a look at $_SERVER["PATH_INFO"].
This will work if Apache accepts PATH_INFO for the request (see the AcceptPathInfo directive), which is the default for script handlers. From the manual:
The treatment of requests with trailing pathname information is determined by the handler responsible for the request. The core handler for normal files defaults to rejecting PATH_INFO requests. Handlers that serve scripts, such as cgi-script and isapi-handler, generally accept PATH_INFO by default.
You can query the path entered using the $_SERVER["PATH_INFO"] variable. You'll have to parse the paths yourself inside the script.
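Parsing the path inside the script could look something like the sketch below. The helper name path_info_segments is hypothetical, and the code assumes the server populates PATH_INFO as described above.

```php
<?php
// Hypothetical helper: split a PATH_INFO value such as "/abc/def" into
// its segments, returning an empty array when there is no path info.
function path_info_segments(array $server)
{
    $pathInfo = isset($server['PATH_INFO']) ? $server['PATH_INFO'] : '';
    // trim() drops leading and trailing slashes before splitting.
    $parts = explode('/', trim($pathInfo, '/'));
    // array_filter with 'strlen' drops empty parts (e.g. from "//") but keeps "0".
    return array_values(array_filter($parts, 'strlen'));
}
```

For a request to index.php/abc/def, calling path_info_segments($_SERVER) would give the two segments abc and def, which the script can then route on.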
A plain PHP solution is impossible, since what gets called will be the file.
The PHP code in the file does not care where the file is located or how it was called, other than for include purposes.
What you are looking for is either to convince the OS to pass /abc/def/ as a parameter to the script, or to get the webserver to do the same thing (i.e. mod_rewrite for Apache).
