I'm trying to force a file to be downloaded by sending it special headers. In doing so, I have to redirect URL requests for PDF documents through my download script.
I pass a query parameter called $seg3, which is base64_encode()d before sending, and then base64_decode()d and urlencode()d when trying to request the file.
I've tried the following:
if (ini_get('allow_url_fopen') == '1')
{
$data = file_get_contents(urlencode(base64_decode($seg3)));
}
else
{
$fp = fopen(urlencode(base64_decode($seg3)), 'rb');
$data = stream_get_contents($fp);
fclose($fp);
}
But both file_get_contents() and stream_get_contents() fail with:
fopen($URL): failed to open stream: HTTP request failed! HTTP/1.1 404 Not Found
Yet when I dump the URL that is being sent, I can copy and paste it in my browser and open the file.
It only seems to occur when there are spaces in the file name, yet the error occurs whether I use urlencode() or not.
It may be that a URL returns a 404 status code but still sends regular content. So while the browser may display a regular page, this function will fail because of the status code.
I'm sure this has been fixed already, but if there are spaces in the URL, encoding them might solve your problem. Note that urlencode() applied to the whole URL also encodes the scheme, colon and slashes, so encode only the path segments, for example with rawurlencode().
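To illustrate the point about spaces, here is a minimal sketch that encodes only the path segments of an absolute URL and leaves the scheme, host and query string alone. The helper name encodeUrlPath() is made up for this example, and it assumes the decoded value is a plain http(s) URL without user info or a fragment:

```php
<?php
// Hypothetical helper (not from the original post): percent-encode each
// path segment of an absolute URL, leaving scheme, host, port and query as-is.
function encodeUrlPath(string $url): string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return $url; // not an absolute URL; return it unchanged
    }

    // Decode first so segments that are already encoded are not encoded twice.
    $segments = explode('/', $parts['path'] ?? '');
    $encoded  = array_map(function ($segment) {
        return rawurlencode(rawurldecode($segment));
    }, $segments);

    $result = $parts['scheme'] . '://' . $parts['host'];
    if (isset($parts['port'])) {
        $result .= ':' . $parts['port'];
    }
    $result .= implode('/', $encoded);
    if (isset($parts['query'])) {
        $result .= '?' . $parts['query'];
    }
    return $result;
}

// urlencode() on the whole URL mangles the scheme and slashes:
$broken = urlencode('http://example.com/a b.pdf');

// Encoding only the path keeps the URL usable:
$fixed = encodeUrlPath('http://example.com/docs/my file.pdf');
```

The result could then be passed to file_get_contents() in place of urlencode(base64_decode($seg3)).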
I'm using the following line to get the page content:
$handle = file_get_contents(
"http://www.mywebsite.com/index.php?show=users&action=msg&section=send",
NULL,
NULL,
1000,
19000);
And then, I'm getting the following message:
Warning:
file_get_contents(http://www.mywebsite.com/index.php?show=users&action=msg §ion=send):
failed to open stream: HTTP request failed! HTTP/1.0 403 Forbidden
(Please note the "§ion" where "&section" should be.)
What happened to it? Why does it change the URL parameter?
I don't think PHP is changing querystring parameters.
If you are reading that message in a browser, it is just a matter of how the output is rendered as HTML: the browser interprets "&sect" as the entity for the section sign (§). The request itself still uses "&section", so the 403 error you are getting should not be related to any unwanted URL transformation.
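As a small sketch of the display issue: when you print a URL into an HTML page yourself, escaping it first stops the browser from reading "&sect" as an entity. This only affects how the text is shown, not the request that is sent:

```php
<?php
$url = 'http://www.mywebsite.com/index.php?show=users&action=msg&section=send';

// Escape "&" (and quotes) so the browser shows the URL literally
// instead of turning "&sect" into the section sign.
$safe = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');

echo $safe, "\n";
```

Viewing the page source (rather than the rendered page) is another quick way to see what PHP really printed.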
I am using PHP fopen() to make a GET request to an ASP.NET MVC endpoint. When the request is successful, there is an empty response and an HTTP 204 status code.
However, fopen is throwing a warning so I'm trying to figure out the best way to resolve this.
$handle = fopen("http://myservice.com/test.php?foo=bar", "r");
Warning is:
Warning: fopen(http://myservice.com/test.php?foo=bar) [function.fopen]: failed to open stream: HTTP request failed! HTTP/1.1 204 No Content in E:\web\test.php on line 18
I'm confused as to why fopen is even throwing a warning. What is the best way to handle this? Should I:
Prefix the fopen call with "@" to suppress warnings?
Change the webservice to return some content like "OK" so the status will be a HTTP 200 OK?
... something else?
You're right, it should not throw a warning. All 20x status codes should be treated as OK. But it might depend on your version of PHP. Since PHP 5.3 it checks for response_code >= 200 and < 400 and only complains otherwise.
But PHP 5.2, for example (http://svn.php.net/repository/php/php-src/branches/PHP_5_2/ext/standard/http_fopen_wrapper.c), contains this tidbit:
switch(response_code) {
case 200:
case 206: /* partial content */
case 302:
case 303:
case 301:
reqok = 1;
break;
This excludes your 204 status code and would explain why your code triggers a warning. (At least there's no need to file a bug report, as it apparently has been fixed already.)
I'm not sure whether it suits your use case, and it should be used with care, but @ would indeed suppress the warning.
If you want to use fopen(), this is what I dug up.
$context = stream_context_create(
array('http' => array('ignore_errors' => 1))
);
$handle = fopen("url...", "r", false, $context);
Using these leads:
http://nadeausoftware.com/articles/2007/07/php_tip_how_get_web_page_using_fopen_wrappers
http://ca2.php.net/manual/en/function.stream-context-create.php
http://ca2.php.net/manual/en/context.http.php
You may also want to consider something such as curl.
If all you're doing is checking the response, you may want to check out: http://us3.php.net/manual/en/function.get-headers.php
Edit: I just looked at some of my code that gets an HTTP status and it looks like I've been using the suppression operator. This code has been in production for years and there have been no problems (aside from relying on the fopen URL wrappers):
$fp = @fopen($url, 'rb');
if (isset($http_response_header[0])) $http_status = $http_response_header[0];
if ($fp) fclose($fp);
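Building on that, here is a rough sketch of how the status line could be turned into a number and checked against the whole 2xx range, so that a 204 counts as success. The function names are made up for this example; the input is assumed to be an array of header lines such as the one PHP puts in $http_response_header:

```php
<?php
// Hypothetical helper: extract the numeric status code from an array of
// response header lines. After redirects there can be several status
// lines, so the last one wins.
function httpStatusFromHeaders(array $headers): ?int
{
    $status = null;
    foreach ($headers as $line) {
        if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m)) {
            $status = (int) $m[1];
        }
    }
    return $status;
}

// Treat every 2xx code (200, 204, 206, ...) as success.
function isSuccessStatus(?int $status): bool
{
    return $status !== null && $status >= 200 && $status < 300;
}

$example = httpStatusFromHeaders(array(
    'HTTP/1.1 204 No Content',
    'Cache-Control: private',
));
```

In real use you would pass $http_response_header right after the fopen() call instead of a hard-coded array.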
I've got a simple php script to ping some of my domains using file_get_contents(), however I have checked my logs and they are not recording any get requests.
I have
$result = file_get_contents($url);
echo $url . " pinged ok\n"; // double quotes so \n is a real newline
where $url for each of the domains is just a simple string of the form http://mydomain.com/, echo verifies this. Manual requests made by myself are showing.
Why would the get requests not be showing in my logs?
Actually I've got it to register the hit when I send $result to the browser. I guess this means the webserver only records browser requests? Is there any way to mimic such in php?
ok tried curl php:
// create curl resource
$ch = curl_init();
// set url
curl_setopt($ch, CURLOPT_URL, "getcorporate.co.nr");
//return the transfer as a string
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// $output contains the output string
$output = curl_exec($ch);
// close curl resource to free up system resources
curl_close($ch);
same effect though - no hit registered in logs. So far it only registers when I feed the http response back from my script to the browser. Obviously this will only work for a single request and not a bunch as is the purpose of my script.
If something else is going wrong, what debugging output can I look at?
Edit: D'oh! See comments below accepted answer for explanation of my erroneous thinking.
If the request is actually being made, it would be in the logs.
Your example code could be failing silently.
What happens if you do:
<?PHP
if (($result = file_get_contents($url)) !== false){ // an empty page is not a failure
echo "Success";
}else{
echo "Epic Fail!";
}
If that's failing, you'll want to turn on some error reporting or logging and try to figure out why.
Note: if you're in safe mode, or otherwise have fopen url wrappers disabled, file_get_contents() will not grab a remote page. This is the most likely reason things would be failing (assuming there's not a typo in the contents of $url).
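For what it's worth, here is a minimal sketch of how you could surface the reason for a silent failure instead of guessing. The function name is made up for this example; it relies only on ini_get(), error_clear_last() and error_get_last(), and assumes PHP 7 or later:

```php
<?php
// Hypothetical helper: fetch a URL (or file) and return array($data, $error).
// $error is null on success and a message string on failure.
function fetchWithDiagnostics(string $url): array
{
    // Remote URLs need the fopen URL wrappers to be enabled.
    if (preg_match('#^https?://#i', $url) && !ini_get('allow_url_fopen')) {
        return array(false, 'allow_url_fopen is disabled');
    }

    error_clear_last();
    $data = @file_get_contents($url); // suppress the warning, inspect it below
    if ($data === false) {
        $err = error_get_last();
        return array(false, $err['message'] ?? 'unknown error');
    }
    return array($data, null);
}

list($data, $error) = fetchWithDiagnostics('/nonexistent/path/file.txt');
if ($error !== null) {
    echo "Fetch failed: $error\n";
}
```

Logging $error for each domain in the ping loop should show whether the requests are really being made.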
Use curl instead?
That's odd. Maybe there is some caching afoot? Have you tried changing the URL dynamically ($url = $url."?timestamp=".time() for example)?
I would like to check to a remote website if it contains some files. Eg. robots.txt, or favicon.ico. Of course the files should be accessible (read mode).
So if the website is http://www.example.com/, I would like to check whether http://www.example.com/robots.txt exists.
I tried fetching a URL like http://www.example.com/robots.txt. Sometimes you can tell whether the file is there, because a missing file gives a "page not found" status in the headers.
But some websites handle this error themselves, and all you get is some HTML saying that the page cannot be found.
In that case you get headers with status code 200.
So does anybody have an idea how to check whether the file really exists?
Thanx,
Granit
I use a quick function with cURL to do this. So far it handles things fine, even if the URL's server tries to redirect:
function remoteFileExists($url){
$curl = curl_init($url);
curl_setopt($curl, CURLOPT_NOBODY, true);
$result = curl_exec($curl);
$ret = false;
if ($result !== false) {
$statusCode = curl_getinfo($curl, CURLINFO_HTTP_CODE);
if ($statusCode == 200) {
$ret = true;
}
}
curl_close($curl);
return $ret;
}
$url = "http://www.example.com";
$exists = remoteFileExists("$url/robots.txt");
if($exists){
$robottxt = file_get_contents("$url/robots.txt");
}else{
$robottxt = "none";
}
If they serve an error page with HTTP 200 I doubt you have a reliable way of detecting this. Needless to say that it's extremely stupid to serve error pages that way ...
You could try:
Issuing a HEAD request which yields you only the headers for the requested resource. Maybe you get more reliable status codes that way
Check the Content-Type header. If it's text/html you can assume that it's a custom error page instead of a robots.txt (which should be served as text/plain). For favicons likewise. But I think simply checking for text/html would be the most reliable way here.
Well, if the website gives you an error page with a success status code, there is not much you can do about it.
Naturally, if you're just after robots.txt or favicon.ico or something else very specific, you can simply check if the response document is in correct format... like robots.txt should be text/plain containing stuff that robots.txt is allowed to contain and favicon.ico should be an image file.
The header content-type for a .txt file should be text/plain, so if you receive text/html it's not a simple text file.
To check if a picture is a picture you would need to retrieve the content-type as it will usually be image/png or image/gif. There is also the possibility of using PHP's GD library to check if it is in fact an image.
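A rough sketch of the Content-Type check suggested above, under the assumption that a text/html response to a request for robots.txt or favicon.ico is a custom error page. The function name is made up for this example:

```php
<?php
// Hypothetical heuristic: a robots.txt or favicon.ico request that comes
// back as HTML is most likely a "soft 404" error page.
function isLikelyErrorPage(string $contentTypeHeader): bool
{
    // Drop parameters such as "; charset=UTF-8" before comparing.
    $type = strtolower(trim(explode(';', $contentTypeHeader)[0]));
    return $type === 'text/html' || $type === 'application/xhtml+xml';
}

$htmlResponse  = isLikelyErrorPage('text/html; charset=UTF-8');
$plainResponse = isLikelyErrorPage('text/plain');
```

With cURL, the header value can be read after curl_exec() using curl_getinfo($curl, CURLINFO_CONTENT_TYPE). This is only a heuristic and will misjudge a server that sends the wrong type for a real file.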
I'm trying to get the contents from another file with file_get_contents (don't ask why).
I have two files: test1.php and test2.php. test1.php returns a string based on the user that is logged in.
test2.php tries to get the contents of test1.php and is being executed by the browser, thus getting the cookies.
To send the cookies with file_get_contents, I create a streaming context:
$opts = array('http' => array('header'=> 'Cookie: ' . $_SERVER['HTTP_COOKIE']."\r\n"));
I'm retrieving the contents with:
$contents = file_get_contents("http://www.example.com/test1.php", false, $opts);
But now I get the error:
Warning: file_get_contents(http://www.example.com/test1.php) [function.file-get-contents]: failed to open stream: HTTP request failed! HTTP/1.1 404 Not Found
Does anybody know what I'm doing wrong here?
edit:
Forgot to mention: without the stream context, the page just loads. But without the cookies I don't get the info I need.
First, this is probably just a typo in your question, but the third argument to file_get_contents() needs to be your stream context, NOT the array of options. I ran a quick test with something like this, and everything worked as expected:
$opts = array('http' => array('header'=> 'Cookie: ' . $_SERVER['HTTP_COOKIE']."\r\n"));
$context = stream_context_create($opts);
$contents = file_get_contents('http://example.com/test1.txt', false, $context);
echo $contents;
The error indicates the server is returning a 404. Try fetching the URL from the machine PHP is running on and not from your workstation/desktop/laptop. It may be that your web server is having trouble reaching the site, your local machine has a cached copy, or some other network screwiness.
Be sure you repeat your exact request when running this test, including the cookie you're sending (command line curl is good for this). It's entirely possible that the page in question may load fine in a browser without the cookie, but when you send the cookie the site actually is returning a 404.
Make sure that $_SERVER['HTTP_COOKIE'] has the raw cookie you think it does.
If you're screen scraping, download Firefox and a copy of the LiveHTTPHeaders extension. Perform all the necessary steps to reach whatever page it is you want in Firefox. Then, using the output from LiveHTTPHeaders, recreate the exact same request sequence. Include every header, not just the cookies.
Finally, PHP Curl exists for a reason. If at all possible, (I'm not asking!) use it instead. :)
Just to share this information.
When using session_start(), the session file is locked by PHP, so the current script is the only one that can access it. If you make a request that uses the same session via fsockopen() or file_get_contents(), it can wait a long time, because it tries to open a file that is still locked.
One way to solve this problem is to use the session_write_close() to unlock the file and relock it after with session_start().
Example:
<?php
$opts = array('http' => array('header'=> 'Cookie: ' . $_SERVER['HTTP_COOKIE']."\r\n"));
$context = stream_context_create($opts);
session_write_close(); // unlock the file
$contents = file_get_contents('http://120.0.0.1/controler.php?c=test_session', false, $context);
session_start(); // Lock the file
echo $contents;
?>
Since file_get_contents() is a blocking function, the two scripts never try to modify the session file at the same time.
But I'm sure this is not the best way to handle a session across such a nested request.
Btw: it's faster than cURL and fsockopen()
Let me know if you find something better.
Just out of curiosity, are you attempting file_get_contents on a page that has a space in it? I remember trying to use fgc on a URL that had a space in the name and while my web browser parsed it just fine, fgc didn't. I ended up having to use a str_replace to replace ' ' with '%20'.
I would think that this should have been relatively easy to spot, though, as the error would report only half of the file name. Also, I noticed that in one of these posts someone used \r\n while defining the headers. Keep in mind that PHP does not interpret these escapes in single-quoted strings (they stay as literal backslash characters), but they work fine in double-quoted strings.
Make sure that test1.php exists on the server. Try opening it in your own browser to make sure!
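A quick way to see the quoting difference for yourself (the cookie value here is just a placeholder):

```php
<?php
// Single quotes: \r and \n stay as literal backslash + letter (4 characters).
$single = 'Cookie: a=b\r\n';

// Double quotes: \r\n becomes a real carriage return and line feed (2 characters).
$double = "Cookie: a=b\r\n";

var_dump(strlen($single)); // int(15)
var_dump(strlen($double)); // int(13)
```

Only the double-quoted version ends the header line the way HTTP expects.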