While logging HTTP requests to a file, I found something I did not expect.
I simply write $_SERVER['REQUEST_URI'] to the log.
Guess what I found: a URL with a #fragment attached:
18/05: requested cat/page.html#fragment
Note: out of 2477 lines of logs, I found only one line with a fragment attached.
Everyone knows (or should) that the fragment is never sent to the server; only JavaScript code can read it. So what is happening here?
I am running PHP 5.3 on Apache 2.X (Debian).
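The logging boils down to something like this (log path and format simplified for this question):
<?php
// Sketch of the logging: append the request URI to a log file.
file_put_contents(
    '/var/log/myapp/requests.log',   // path simplified for this question
    date('d/m') . ': requested ' . $_SERVER['REQUEST_URI'] . "\n",
    FILE_APPEND
);
?>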
Your assertion that "the fragment is never sent to the server; only JavaScript code can read it" is a little short-sighted.
Whilst it's true that, in normal operation with a conventional browser, a fragment is not included in the request to the server, there is nothing stopping me from writing whatever I want into an HTTP request:
echo "GET /lol/werent/expecting/this#were_you HTTP/1.1" > /dev/tcp/yourwebsite.com/80
Someone's testing, someone's playing around, someone's attempting a bizarre hack, or someone's using a buggy browser.
I wouldn't worry about it.
I am using file_get_contents() to fetch the contents of a page. It was working perfectly, but it suddenly stopped working and started showing the error below:
"Warning: file_get_contents(https://uae.souq.com/ae-en/apple-iphone-x-with-facetime-256gb-4g-lte-silver-24051446/i/): failed to open stream: HTTP request failed! in /home/xxx/xxxx/xxx/index.php on line 6.
So I tried the same code on my local server, and it worked perfectly. Then I tried it on another server, and it worked perfectly there too. So I contacted the hosting provider, and they said the problem is with the URL: the site may be blocking access. So I tried another URL (https://www.w3schools.com/) and it fetched the contents without any error.
Now I am really confused about what the problem is. If the problem were with my server, other URLs shouldn't have worked. And if the problem were with the URL, it shouldn't have worked on the second server or on my local server.
Here is the test code:
<?php
$html = file_get_contents("https://uae.souq.com/ae-en/apple-iphone-x-with-facetime-256gb-4g-lte-silver-24051446/i/");
echo $html;
?>
What is the problem here? Even if the problem is with the URL or the server, why was it working perfectly earlier?
It sounds like that site (souq.com) has blocked your server. The block may be temporary or it may be permanent. This may have happened because you made too many requests in a short time, or did something else that looked "suspicious," which triggered a mechanism that prevents misbehaving robots from scraping the site.
You can try again after a while. Another thing you can try is setting the User-Agent request header to impersonate a browser. You can find how to do that here: PHP file_get_contents() and setting request headers
If your intention is to make a well-behaved robot, you should set the User-Agent header to something that identifies the request as coming from a bot, and follow the rules the site specifies in its robots.txt.
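For example, a minimal sketch using a stream context (the User-Agent string here is just an illustrative browser-like value, not a requirement):
<?php
// Sketch: send a browser-like User-Agent via a stream context.
$context = stream_context_create([
    'http' => [
        'header' => "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64)\r\n",
    ],
]);
$html = file_get_contents(
    "https://uae.souq.com/ae-en/apple-iphone-x-with-facetime-256gb-4g-lte-silver-24051446/i/",
    false,
    $context
);
echo $html;
?>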
I was trying to get RSS info from a website the other day, but when I tried to load it using PHP it returned a 403 error.
This was my PHP code:
<?php
$rss = file_get_contents('https://hypixel.net/forums/-/index.rss');
echo $rss;
?>
And the error I got was:
failed to open stream: HTTP request failed! HTTP/1.1 403 Forbidden
I must say that loading it regularly from a browser works just fine, but when I try loading it using PHP or any other server-side method it won't work.
Some people don't like servers accessing their stuff. They provide a service intended for human consumers, not bots. Therefore they may include code that checks whether you are in fact a human using a web browser, a check your naïve PHP script fails. The third party then returns a 403 Forbidden error, indicating that your program is forbidden from accessing the resource.
There are ways around this, of course, depending on how it's implemented. The most obvious thing to do is send a User-Agent header pretending to be a browser, as sketched below. But servers may do cleverer checks than this, and it's morally questionable.
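A minimal sketch of that, assuming the 403 is keyed off the default User-Agent (PHP's http stream wrapper also accepts a dedicated user_agent context option):
<?php
// Sketch: the 'user_agent' context option sets the User-Agent header.
$context = stream_context_create([
    'http' => ['user_agent' => 'Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0'],
]);
$rss = file_get_contents('https://hypixel.net/forums/-/index.rss', false, $context);
echo $rss;
?>
If the site checks more than the User-Agent (cookies, JavaScript challenges), this alone won't be enough.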
What is the difference between setting the response status in PHP with
header("HTTP/1.0 404 Not Found");
and
header("Status: 404 Not Found");
What is the difference from the client's point of view (i.e., a browser or a client implementation for a RESTful web service)? I understand that the second one has something to do with CGI.
HTTP/1.0 404 Not Found is the real HTTP status line; it's what allows clients to determine whether a request succeeded or not.
Status: 404 Not Found just sets an extra header field called Status with the value of 404 Not Found. It has no intrinsic meaning, it's like setting header('Foo: Bar'). It may mean something to somebody, but it's not officially specified what it should mean. The HTTP response code will be a normal 200 OK.
There seems to be a special case when running PHP through FastCGI. Apparently you can't set the HTTP/ status line directly when invoking PHP this way. Instead you have to set this unofficial Status: header, which will be converted to a real HTTP/ status line before it's sent back to the client (apparently a limitation of how PHP talks to the web server when invoked via CGI). In all other cases, it'll just be sent as-is (with no special meaning) and the real HTTP response code will be 200 OK.
That's what I could gather from the description in the manual, at least; I've never had to use this. Also, you're insane if you run PHP through CGI, so hopefully nobody needs this in this day and age. ;o)
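For completeness, a short sketch of the options (http_response_code() exists since PHP 5.4 and picks the right mechanism for you, including under FastCGI):
<?php
// Under mod_php, this sets the real status line directly:
header("HTTP/1.0 404 Not Found");

// Since PHP 5.4, this is the portable way to do it:
http_response_code(404);

// header() can also set the status code via its third argument:
header("Content-Type: text/html", true, 404);
?>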
So I just got a nasty surprise when I deployed some code I thought I'd tested. It would seem there must be some difference between my test machine and my server. The exact same code, featuring a header redirect, worked perfectly on my test machine and not at all on the server. The redirect on the server simply didn't happen, leaving a blank page as a result.
The header is called somewhere in the middle of the script, but nothing will have been output by then: the script doesn't output anything until the very end, long after everything else has run. It buffers everything.
Both the server and the test machine are running the same PHP version and the same Apache version. Is there something in the configuration files that would allow the header to work on one and not on the other? Is there something else going on here that would cause it to fail?
EDIT:
Here's the line that sets the header:
public function setRedirect($url) {
    header('Location: ' . $url);
}
And here's the code that calls that:
$url = new URL('index');
$this->layout->setRedirect($url->toString());
Where URL::toString() always generates a fully qualified URL, in this case: http://domain/index.php?action=index
I checked both the PHP and Apache error logs. Nada.
Probably there was some whitespace or other form of output before the header call.
A redirect after output like that will only work if the ini setting output_buffering is on (or if you explicitly start output buffering, but in that case the redirect should work on both machines).
You can confirm this by turning on error reporting.
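A sketch to confirm it from code (headers_sent() reports where output started; the $url value here is taken from the question):
<?php
// Sketch: detect premature output before attempting the redirect.
$url = 'http://domain/index.php?action=index';
if (headers_sent($file, $line)) {
    // Output (even stray whitespace) already started at $file:$line,
    // so the Location header can no longer be sent.
    error_log("Output already started at $file:$line; redirect will fail");
} else {
    header('Location: ' . $url);
    exit;
}
?>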
Use Fiddler or some other client-side tool to check your headers. Verify that the Location: header is actually being sent. Also, some browsers are picky about the order in which headers are sent.
I think the most likely explanation is that an error is causing the script to exit on your server, and you have display_errors turned off (hence the blank screen). I would suggest checking the Apache error log on your server to see if PHP is putting something in there.
Otherwise you could use a browser extension like LiveHTTPHeaders (for Firefox) to see if the location header is being sent at all, or try debugging the script to see if it's even getting as far as that header call.
I think your server puts some script into your pages to track visitors and give you traffic stats, or something similar. Ideally you should get an error for this, but maybe your server has error reporting disabled, which gives you a blank page.
I suggest you run a script with a syntax error and check whether your server has error reporting disabled.
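For example, a debugging-only sketch (note that a parse error in the same file will still give a blank page, since the file never runs; set display_errors in php.ini or a separate front script to catch those):
<?php
// Sketch: temporarily surface all errors while debugging; remove afterwards.
ini_set('display_errors', '1');
error_reporting(E_ALL);
?>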
I'm working on building a PHP based proxy script to access a particular ASP.NET page that uses lots of AJAX. So far most of the website works, but one of the forms produces the following error upon submittal:
Sys.WebForms.PageRequestManagerParserErrorException: The message received from the server could not be parsed. Common causes for this error are when the response is modified by calls to Response.Write(), response filters, HttpModules, or server trace is enabled.
Details: Error parsing near '
<!DOCTYPE html P'.
I've checked the headers that my proxy script sends and receives, and they're identical to what would actually be sent by a web browser like Firefox. I've checked the page source to make sure everything is intact. I've also verified there aren't any JavaScript errors on the page.
Can anyone suggest an approach to continue troubleshooting the issue?
Thanks.
If you miss an AJAX call in your proxy, there could be cross-domain errors. Also, make sure you are not accidentally stripping any non-standard headers like X-MicrosoftAjax.
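If the proxy happens to be cURL-based, a sketch of forwarding those headers (the target URL and header list are assumptions for illustration):
<?php
// Sketch: pass through AJAX-related request headers instead of dropping them.
$headers = [];
foreach (['X-MicrosoftAjax', 'X-Requested-With'] as $name) {
    $key = 'HTTP_' . strtoupper(str_replace('-', '_', $name));
    if (isset($_SERVER[$key])) {
        $headers[] = $name . ': ' . $_SERVER[$key];
    }
}
if (isset($_SERVER['CONTENT_TYPE'])) {
    $headers[] = 'Content-Type: ' . $_SERVER['CONTENT_TYPE'];
}
$ch = curl_init('https://example.com/page.aspx'); // hypothetical target page
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, file_get_contents('php://input'));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
echo curl_exec($ch);
curl_close($ch);
?>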