Recently, with no changes to my code, my PHP page started to hang at a certain area. It generates all of the HTML on the page right before this line:
$tickerJSON = file_get_contents("http://mtgox.com/code/data/ticker.php");
I commented out everything else and this is the cause of the error.
I know the JSON URL is valid and the array names are correct. I'm not sure where the problem lies in this case. Any help?
Note: it doesn't display a partial or white page; it keeps loading forever with no output at all.
The problem is that the remote server appears to purposely stall requests that don't send a user-agent string. By default, PHP's user-agent string is blank.
Try adding this line directly above your call:
ini_set('user_agent', 'PHP/' . PHP_VERSION);
I've tested the above using this script and it worked great for me:
<?php
ini_set('user_agent', 'PHP/' . PHP_VERSION);
$tickerJSON = file_get_contents("http://mtgox.com/code/data/ticker.php");
echo $tickerJSON;
Update: the HTTPS version of the endpoint can also be fetched by shelling out to wget:
$tickerJSON = shell_exec('wget --no-check-certificate -q -O - https://mtgox.com/code/data/ticker.php');
The remote connection you make takes a very long time. You can work around that by providing a timeout value: if the request takes too long, the function won't return any data, but it also won't keep the rest of the script from running.
In addition, you need to set the user agent:
// Create a stream context with a timeout and a user agent
$opts = array(
    'http' => array(
        'timeout'    => 3, // 3 second timeout
        'user_agent' => 'hashcash',
        'header'     => "Accept-language: en\r\n"
    )
);
$context = stream_context_create($opts);
$url = "https://mtgox.com/code/data/ticker.php";
$tickerJSON = file_get_contents($url, FALSE, $context);
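If the request times out, file_get_contents() returns false, so it's worth checking for that before decoding the JSON. A minimal sketch of that follow-up:
if ($tickerJSON === false) {
    // The request failed or timed out; handle it gracefully.
    echo "Ticker temporarily unavailable.";
} else {
    // Decode the JSON into an associative array as usual.
    $ticker = json_decode($tickerJSON, true);
    var_dump($ticker);
}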
Related
I am using PHP to get JSON from a remote server via file_get_contents. Here is the piece of code I used:
$opts = array(
'https'=>array(
'method'=>'GET',
'header'=>'Accept-language: en\r\n' .
'Authorization: MAC ["3","ios2.5.0","123","123abc","123=","abc="]\r\n' .
'User-Agent: abc/1.1.1 iOS/10.0.2 iPhone/iPhone7,1\r\n'
)
);
$context = stream_context_create($opts);
$file = file_get_contents('https://www.google.com/v11/file?search=ios&with=users%2Cfiles%2Cquestions', false, $context);
echo $file;
I did some quick debugging:
Using Postman, I was able to get the JSON file with the same headers.
I tried a different JSON file from a different URL; it works.
I tried a local file; it works.
You have to understand what file_get_contents does. It issues a plain request to fetch the file from the server, in this case requesting https://www.google.com/v11/file in one single step. Since your URL seems to use headers to verify your origin, it is probably an endpoint meant for AJAX requests, meaning the server side wasn't set up to answer plain file_get_contents requests; it likely expects cURL-style requests instead.
So you can use curl_exec() instead.
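As a rough sketch, the same request could be sent with cURL like this (the URL and header values are copied straight from the question and may need adjusting):
<?php
$ch = curl_init('https://www.google.com/v11/file?search=ios&with=users%2Cfiles%2Cquestions');
curl_setopt_array($ch, array(
    CURLOPT_RETURNTRANSFER => true, // return the body instead of echoing it
    CURLOPT_HTTPHEADER     => array(
        'Accept-language: en',
        'Authorization: MAC ["3","ios2.5.0","123","123abc","123=","abc="]',
        'User-Agent: abc/1.1.1 iOS/10.0.2 iPhone/iPhone7,1'
    )
));
$file = curl_exec($ch);
if ($file === false) {
    echo 'cURL error: ' . curl_error($ch); // surface the failure instead of a blank page
}
curl_close($ch);
echo $file;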
I have a PHP script that connects to a URL through cURL and then does something, depending on the returned HTTP status code:
$ch = curl_init();
$options = array(
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_URL            => $url,
    CURLOPT_USERAGENT      => "What?!?"
);
curl_setopt_array($ch, $options);
$out = curl_exec($ch);
$code = curl_getinfo($ch)["http_code"];
curl_close($ch);

if ($code == "200") {
    echo "200";
} else {
    echo "not 200";
}
Some web servers are slow to reply, and although the page loads in my browser after a few seconds, my script, when it tries to connect to that server, tells me that it did not receive a positive ("200") reply. So, apparently, the connection initiated by cURL timed out.
But why? I don't set a timeout in my script, and according to other answers on this site the default timeout for cURL is definitely longer than the three or four seconds it takes for the page to load in my browser.
So why does the connection time out, and how can I get it to last longer, if, apparently, it is already set to infinite?
Notes:
The same URL doesn't always time out. So sometimes cURL can connect.
It is not one specific URL that sometimes times out, but different URLs at different times.
I'm on a shared server, so I don't have root access to any files.
I tried to look at curl_getinfo($ch) and curl_error($ch) – as per @drew010's suggestion in the comments – but both were empty whenever the problem happened.
The whole script runs for a little more than one minute. In this time it connects to 300+ URLs successfully. Even when one of the URLs fails, the other connections are successfully made. So the script does not time out.
cURL does not time out either, because when I try to connect to a URL where a script sleeps for 59 seconds before answering, cURL connects successfully. So apparently the slowness of the failing URL is not a problem in itself for cURL.
Update
Following @Karlos' suggestion in his answer, I used:
CURLOPT_VERBOSE => 1,
CURLOPT_STDERR => $curl_log
(using code from this answer) and found the following in $curl_log when a URL failed (URL and IP changed):
* About to connect() to www.somesite.com port 80 (#0)
* Trying 104.16.37.249... * connected
* Connected to www.somesite.com (104.16.37.249) port 80 (#0)
> GET /wp_german/?feed=rss2 HTTP/1.1
> User-Agent: myURL
> Host: www.somesite.com
> Accept: */*
* Recv failure: Connection reset by peer
* Closing connection #0
So, I have found the why – thank you @Karlos! – and apparently @Axalix was right and it is a network problem. I'll now follow the suggestions given on this site for that kind of failure. Thanks to everyone for their help!
My experience working with curl has shown me that sometimes, when using the option:
CURLOPT_RETURNTRANSFER => true
the server might not give a successful reply, or at least not a reply within the timeframe that curl has to receive the response and cache it, so no results are returned by curl into the variable you assign. In your code:
$out = curl_exec($ch);
In this Stack Overflow question, CURLOPT_RETURNTRANSFER set to true doesnt work on hosting server, you can see that the option CURLOPT_RETURNTRANSFER is directly affected by the requested host's web server implementation.
Since your code relies on the response headers and does not actually use the response body, a good way to solve this might be to set:
CURLOPT_RETURNTRANSFER => false
and execute the curl code so that it works on the response headers.
Once you have the header with the code you are interested in, you could run a PHP script that echoes the curl response and parse it yourself:
<?php
$url = isset($_GET['url']) ? $_GET['url'] : 'http://www.example.com';
$ch = curl_init();
$options = array(
    CURLOPT_RETURNTRANSFER => false, // echo the response directly instead of returning it
    CURLOPT_URL            => $url,
    CURLOPT_USERAGENT      => "myURL"
);
curl_setopt_array($ch, $options);
curl_exec($ch);
curl_close($ch);
?>
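If all you need is the status code, a minimal sketch of my own (not from the original answer) asks for the headers only and reads the code from curl_getinfo(); note that some servers answer HEAD requests differently from GET:
<?php
$ch = curl_init();
$options = array(
    CURLOPT_URL            => $url,
    CURLOPT_USERAGENT      => 'myURL',
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_NOBODY         => true  // HEAD-style request: fetch headers only
);
curl_setopt_array($ch, $options);
curl_exec($ch);
// The status code is available from curl_getinfo() even without a body.
$code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);
echo $code;
?>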
In any case, as to why your request does not produce an error, I suspect that the CURLOPT_NOSIGNAL option and the different timeout options explained in the curl_setopt PHP manual might get you closer to an answer.
To dig further, the CURLOPT_VERBOSE option can give you extra information about the request's behaviour through STDERR.
The reason may be that your hosting provider is imposing limits on outgoing connections.
Here is what you can do to make your script robust:
Create a queue in the DB with all the URLs that need to be fetched.
Run cron every minute or 5 minutes; take a few URLs from the DB and mark them as in progress.
Try to fetch those URLs. Mark every fetched URL as a success in the DB.
Increment a failure count for the unsuccessful ones.
Keep going through the queue until it's empty.
If you implement such a solution you will be able to process every single URL under any unfavourable conditions; a minimal sketch of such a worker follows.
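This sketch assumes a hypothetical url_queue table with id, url, status, and failures columns (all names are illustrative, not from the original answer):
<?php
// Hypothetical worker run by cron every few minutes.
$db = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

// Grab a small batch of pending URLs.
$rows = $db->query("SELECT id, url FROM url_queue WHERE status = 'pending' LIMIT 10")
           ->fetchAll(PDO::FETCH_ASSOC);

foreach ($rows as $row) {
    // Mark the URL as in progress so other runs skip it.
    $db->prepare("UPDATE url_queue SET status = 'in_progress' WHERE id = ?")
       ->execute(array($row['id']));

    $ch = curl_init($row['url']);
    curl_setopt_array($ch, array(
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_TIMEOUT        => 10  // give up on slow hosts; they get retried next run
    ));
    $out  = curl_exec($ch);
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);

    if ($out !== false && $code == 200) {
        $db->prepare("UPDATE url_queue SET status = 'done' WHERE id = ?")
           ->execute(array($row['id']));
    } else {
        // Put it back in the queue and count the failure.
        $db->prepare("UPDATE url_queue SET status = 'pending', failures = failures + 1 WHERE id = ?")
           ->execute(array($row['id']));
    }
}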
I have a curl put request that works fine on my localhost but on the live server it throws back a 500 error. Here is my code:
public static function send( $xml )
{
    $xml = str_replace( "\n", "", $xml );

    // Write to a temporary file
    $put_data = tmpfile();
    fwrite( $put_data, $xml );
    fseek( $put_data, 0 );

    $options = array(
        CURLOPT_URL => 'http://*****************/cgi-bin/commctrl.pl?SessionId=' . Xml_helper::generate_session_id() . '&SystemId=live',
        CURLOPT_RETURNTRANSFER => 1,
        CURLOPT_HTTPHEADER => array( 'Content-type: text/xml' ),
        CURLOPT_PUT => TRUE,
        CURLOPT_INFILE => $put_data,
        CURLOPT_INFILESIZE => strlen( $xml )
    );

    $curl = curl_init();
    curl_setopt_array( $curl, $options );
    $result = curl_exec( $curl );
    curl_close( $curl );

    return $result;
}
I do have curl enabled on the server!
Does anyone have any ideas why it is not working on the server? I am on shared hosting if that helps.
I have also enabled error reporting at the top of the file, but no errors show after the curl call has completed. I just get the generic 500 error page.
Thanks
UPDATE:
I have been in contact with the client and they have confirmed that the information that is sent is received by them and inserted into their back office system. So the cause must be something to do with the response. It is a small block of XML that is supposed to be returned.
ANOTHER UPDATE
I have tried the same script on a different server and on Heroku, and I still get the same result.
ANOTHER UPDATE
I think I may have found the root of the issue. The script seems to be timing out because of a timeout on FastCGI, and because I am on shared hosting I cannot change it. Can anyone confirm this?
FINAL UPDATE
I got in contact with my hosting provider and they confirmed that the script was timing out due to a timeout value on the server, not one I can change with any PHP function or ini_set().
If the error is, like you think it is, to do with a script timeout and you do not have access to the php.ini file, there is an easy fix:
simply call set_time_limit(INT), where INT is the number of seconds, at the beginning of your script to override the setting in the php.ini file.
Setting set_time_limit(128) should solve all your problems and is generally accepted as a reasonable upper limit.
More info can be found here: http://php.net/manual/en/function.set-time-limit.php
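For example:
<?php
// Allow this script up to 128 seconds before PHP aborts it.
set_time_limit(128);
// ... rest of the script ...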
Here are a few things to try:
Remove the variability in the script – for testing, hardcode the session id, so that the curl call is the same each time. You cannot reliably test something if it changes each time you run it.
Try using curl directly from the command line, via something like curl 'http://*****************/cgi-bin/commctrl.pl?SessionId=12345&SystemId=live' (quote the URL so the shell doesn't interpret the &). This will show you if the problem is really due to the computer itself, or something to do with PHP.
Check the logs on your server, probably something like /var/log/apache/error.log depending on what OS your server uses. Also look at the access log, so that you can see whether you are actually receiving the same request.
Finally, if you really run out of ideas, you can use a program like wireshark or tcpdump/WinDump to monitor the connection, so that you can compare the packets being sent from each computer. This will give you an idea of how they are different - are they being mangled by a firewall? Is php adding extra headers to one of them? Are different CURL defaults causing different data to be included?
I suspect your server does not support tmpfile(). Just to verify:
public static function send( $xml ) {
    $xml = str_replace( "\n", "", $xml );

    // Write to a temporary file
    $put_data = tmpfile();
    if (!$put_data) die('tmpfile failed');
    ...
If you are on a GoDaddy server, check this out: https://stackoverflow.com/questions/9957397/tmpfile-returns-false-on-godaddy-server
Which server is actually showing the 500? From your code it seems to be the local server rather than the remote one.
change
public static function send( $xml )
{
to
public static function send( $xml )
{
    error_reporting(E_ALL);
    if (!function_exists('curl_exec')) {
        var_dump("NO CURL");
        return false;
    }
Does that work?
This is almost certainly NOT the php timeout setting.
If you are using FastCGI, as you have stated, then you need to edit this file:
/etc/httpd/conf.d/fcgid.conf
And change:
FcgidIOTimeout 3600
Then do:
service httpd restart
This was driving me insane for 3 days. The top voted answer to this is wrong!
I need a function that I can use in my script to contact another script to send it some GET data. But I need to be able to set a timeout so that it only loads for a few seconds, then continues with the rest of the script. I know I could easily use cURL to do this, but I'd like to know if there are any alternatives?
You can specify a timeout for the standard file access functions (like file_get_contents()) using stream_context_create():
<?php
$opts = array(
'http'=>array(
'method'=>"GET",
'timeout' => 5
)
);
$context = stream_context_create($opts);
$fp = fopen('http://www.example.com', 'r', false, $context);
fpassthru($fp);
fclose($fp);
?>
See the list of context options for an explanation on the timeout option.
This requires, of course, that you can access external URLs using fopen() and consorts.
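The same context also works directly with file_get_contents(), which matches the question's use case; a short sketch (the target URL is a placeholder):
<?php
$opts = array(
    'http' => array(
        'method'  => 'GET',
        'timeout' => 5  // seconds before the request is abandoned
    )
);
$context = stream_context_create($opts);
// Returns false on failure or timeout, so the rest of the script can continue.
$response = file_get_contents('http://www.example.com/other.php?foo=bar', false, $context);
if ($response === false) {
    // Timed out or failed; carry on regardless.
}
?>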
The nice thing about curl is that it lets you run requests in parallel even though PHP doesn't support threads. You can kick off the call with curl_multi and let the rest of the script run, so your regular processing isn't blocked. This reduces the need for a short timeout.
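A minimal sketch of that fire-and-continue pattern with curl_multi (the URL is a placeholder):
<?php
$mh = curl_multi_init();
$ch = curl_init('http://www.example.com/notify.php?foo=bar');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_multi_add_handle($mh, $ch);

// Start the transfer; curl_multi_exec() returns immediately instead of blocking.
do {
    $status = curl_multi_exec($mh, $active);
} while ($status === CURLM_CALL_MULTI_PERFORM);

// ... the rest of the script runs here while the request is in flight ...

// Drain whatever is left before cleaning up.
while ($active && $status === CURLM_OK) {
    curl_multi_select($mh);
    $status = curl_multi_exec($mh, $active);
}
curl_multi_remove_handle($mh, $ch);
curl_multi_close($mh);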
I'm trying to get the contents from another file with file_get_contents (don't ask why).
I have two files: test1.php and test2.php. test1.php returns a string, based on the user that is logged in.
test2.php tries to get the contents of test1.php and is executed by the browser, so it receives the cookies.
To send the cookies with file_get_contents, I create a streaming context:
$opts = array('http' => array('header'=> 'Cookie: ' . $_SERVER['HTTP_COOKIE']."\r\n"));
I'm retrieving the contents with:
$contents = file_get_contents("http://www.example.com/test1.php", false, $opts);
But now I get the error:
Warning: file_get_contents(http://www.example.com/test1.php) [function.file-get-contents]: failed to open stream: HTTP request failed! HTTP/1.1 404 Not Found
Does somebody know what I'm doing wrong here?
edit:
Forgot to mention: without the stream context, the page loads just fine. But without the cookies I don't get the info I need.
First, this is probably just a typo in your question, but the third argument to file_get_contents() needs to be your stream context, NOT the array of options. I ran a quick test with something like this, and everything worked as expected:
$opts = array('http' => array('header'=> 'Cookie: ' . $_SERVER['HTTP_COOKIE']."\r\n"));
$context = stream_context_create($opts);
$contents = file_get_contents('http://example.com/test1.txt', false, $context);
echo $contents;
The error indicates the server is returning a 404. Try fetching the URL from the machine PHP is running on and not from your workstation/desktop/laptop. It may be that your web server is having trouble reaching the site, your local machine has a cached copy, or some other network screwiness.
Be sure you repeat your exact request when running this test, including the cookie you're sending (command line curl is good for this). It's entirely possible that the page in question may load fine in a browser without the cookie, but when you send the cookie the site actually is returning a 404.
Make sure that $_SERVER['HTTP_COOKIE'] has the raw cookie you think it does.
If you're screen scraping, download Firefox and a copy of the LiveHTTPHeaders extension. Perform all the necessary steps to reach whatever page it is you want in Firefox. Then, using the output from LiveHTTPHeaders, recreate the exact same request sequence. Include every header, not just the cookies.
Finally, PHP Curl exists for a reason. If at all possible, (I'm not asking!) use it instead. :)
Just to share this information.
When using session_start(), the session file is locked by PHP, so the current script is the only one that can access it. If you then request a script that uses the same session via fsockopen() or file_get_contents(), you can wait a long time, because you are trying to open a file that is still locked.
One way to solve this problem is to use session_write_close() to unlock the file, and re-lock it afterwards with session_start().
Example:
<?php
$opts = array('http' => array('header'=> 'Cookie: ' . $_SERVER['HTTP_COOKIE']."\r\n"));
$context = stream_context_create($opts);
session_write_close(); // unlock the file
$contents = file_get_contents('http://127.0.0.1/controler.php?c=test_session', false, $context);
session_start(); // Lock the file
echo $contents;
?>
Since file_get_contents() is a blocking function, the two scripts won't try to modify the session file concurrently.
But I'm sure this is not the best way to manipulate sessions over an extended connection.
By the way: it's faster than cURL and fsockopen().
Let me know if you find something better.
Just out of curiosity, are you attempting file_get_contents on a URL that has a space in it? I remember trying to use fgc on a URL that had a space in the name, and while my web browser parsed it just fine, fgc didn't. I ended up having to use str_replace to replace ' ' with '%20'.
I would think this should have been relatively easy to spot, though, as it would report only half of the filename. Also, I noticed in one of these posts someone used \r\n while defining the headers. Keep in mind that PHP doesn't interpret these escapes in single quotes, but they work fine in double quotes.
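For example, a one-line fix of that kind:
// Percent-encode the spaces so file_get_contents() accepts the URL.
$url = str_replace(' ', '%20', $url);
$contents = file_get_contents($url);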
Make sure that test1.php exists on the server. Try opening it in your own browser to make sure!