Using PHP, I am reading image links from my MySQL database and fetching each one with cURL. There are about 600 links, but the script keeps stopping after about 100. It is not a logic error; it seems some setting is stopping PHP from continuing the cURL requests. Which setting should I increase to allow cURL to run longer? Thanks!
Here is what I am using now:
function file_get_contents_curl($url)
{
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch,CURLOPT_BINARYTRANSFER, true);
$data = curl_exec($ch);
return $data;
}
$htmlaa = file_get_contents_curl($getimagefrom);
$docaa = new DOMDocument();
#$docaa->loadHTML($htmlaa);
Again, it is working just fine, but it keeps stopping after running for maybe 3 minutes.
You can set the curl timeout like so:
curl_setopt($ch, CURLOPT_TIMEOUT, 1000); // maximum number of seconds the transfer may take
Since there are multiple factors that influence execution time you should also check out these two as well:
http://php.net/manual/en/function.set-time-limit.php
http://php.net/manual/en/info.configuration.php#ini.max-execution-time
Also note that CURLOPT_TIMEOUT is the maximum number of seconds a cURL function is allowed to run. You should also check out the CURLOPT_CONNECTTIMEOUT option, which limits only the time spent trying to connect.
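To make the difference between the two options concrete, here is a minimal sketch. The function name fetch_with_timeouts() and the default values are my own assumptions, not something from the question; adjust them to your case.

```php
<?php
// Sketch only: fetch_with_timeouts() and its defaults are hypothetical.
function fetch_with_timeouts(string $url, int $timeout = 30, int $connectTimeout = 5): array
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $connectTimeout); // connection phase only
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);               // the whole transfer

    $body  = curl_exec($ch);
    $errno = curl_errno($ch);
    $error = curl_error($ch);
    curl_close($ch);

    return [
        'ok'    => $body !== false,
        'body'  => $body === false ? null : $body,
        'errno' => $errno,
        'error' => $error,
    ];
}

// Lift PHP's own execution time limit for this script (0 means no limit).
set_time_limit(0);
```

Checking 'errno' and 'error' after each call should tell you whether a request actually timed out or failed for another reason, which helps narrow down which limit you are hitting.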
Related
The following script just seems to run forever. It never finishes.
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
for ($i = 500; $i < 3000; $i++) {
$url = "http://abcedfg.com/$i/index.html";
curl_setopt($ch, CURLOPT_URL, $url);
$response = curl_exec($ch);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
}
Try calling curl_init and curl_close for every request.
Like this:
function callurl($myurl) {
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $myurl);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_HEADER, true);
$response = curl_exec ($ch);
curl_close ($ch);
return $response;
}
You'll then have to call this function for every URL, for example in a for loop.
Also, test with only 10-20 requests before going big.
Consider that 2,500 requests, at 1 second per request, add up to more than 41 minutes of activity.
No server is configured by default to let a PHP script run for 40 minutes. You can change this setting if you have access to the server configuration.
It's also possible that you're stuck because the server doesn't have enough resources to make that many requests at once. Ideally, you should fine-tune your server configuration to get better performance.
Also consider using curl_multi_init for better performance and asynchronous requests.
But this does not guarantee that requests will not be dropped because of a timeout, so tuning the server may still be needed.
Also check this post for how to increase the time limit:
It's better to close the handle every time you open one, so that the memory it holds is released.
You can build the list of all the URLs in the loop first, and then do a multi-cURL request.
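A minimal sketch of that multi-cURL approach, using the same header-only options as above. fetch_all() is a hypothetical helper name I made up, and the timeout default is an assumption.

```php
<?php
// Sketch only: fetch_all() is a hypothetical helper, not part of cURL.
function fetch_all(array $urls, int $timeout = 10): array
{
    $mh = curl_multi_init();
    $handles = [];

    foreach ($urls as $key => $url) {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_NOBODY, true);   // headers only, as in the question
        curl_setopt($ch, CURLOPT_HEADER, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
        curl_multi_add_handle($mh, $ch);
        $handles[$key] = $ch;
    }

    // Drive all transfers until none is still running.
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running > 0 && curl_multi_select($mh, 1.0) === -1) {
            usleep(100000); // select failed; back off briefly instead of spinning
        }
    } while ($running > 0 && $status === CURLM_OK);

    // Collect the HTTP status code of every request (0 if no response arrived).
    $codes = [];
    foreach ($handles as $key => $ch) {
        $codes[$key] = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }
    curl_multi_close($mh);

    return $codes;
}
```

Note that this starts every transfer at once, so for 2,500 URLs you would probably want to pass them in smaller groups (for example with array_chunk) rather than in a single call.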
I have an array with approximately 45k usernames in it. I want to hit a URL using cURL that gives me a response for each of those usernames. The issue is that I want to do this in less time.
$username = ['123', '456', '789' /* ... */]; // up to 45k entries
for($i=0;$i<sizeof($username);$i++)
{
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://abc.com.pk/hxc/get_user_details.php?uname='.$username[$i]);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
curl_setopt($ch, CURLOPT_USERAGENT, $ua);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_MAXREDIRS, 20);
curl_setopt($ch, CURLOPT_HTTPGET, true);
$result = curl_exec($ch);
curl_close($ch);
}
The above code shows what I am doing right now, but because there are so many usernames it takes a lot of time to return all the responses. Is there any way I can do this in less time?
Have a look at https://github.com/php-curl-class/php-curl-class , it speeds up our curl requests a lot.
It has multi-curl support enabled and it's very easy to use.
As for your question on the time, you can set the time out using
Curl::setTimeout($seconds)
Or in the case of MultiCurl
MultiCurl::setTimeout($seconds)
You can extend the timeout as much time as needed.
You can use curl_multi_init and curl_multi_exec so that your requests are processed asynchronously.
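As a rough sketch of how that could look for a large list of usernames: the helper below sends the requests in batches so that only a limited number run at the same time. The name fetch_user_details(), the batch size, and the timeout are assumptions of mine; the `uname` query parameter is taken from the question.

```php
<?php
// Sketch only: fetch_user_details(), $batchSize and $timeout are hypothetical.
function fetch_user_details(array $usernames, string $baseUrl, int $batchSize = 50, int $timeout = 10): array
{
    $results = [];

    // Process the usernames in batches so only $batchSize requests run at once.
    foreach (array_chunk($usernames, $batchSize) as $batch) {
        $mh = curl_multi_init();
        $handles = [];

        foreach ($batch as $name) {
            $ch = curl_init();
            curl_setopt($ch, CURLOPT_URL, $baseUrl . '?uname=' . urlencode((string) $name));
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
            curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
            curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
            curl_multi_add_handle($mh, $ch);
            $handles[$name] = $ch;
        }

        // Run the whole batch concurrently.
        do {
            $status = curl_multi_exec($mh, $running);
            if ($running > 0 && curl_multi_select($mh, 1.0) === -1) {
                usleep(100000); // select failed; wait briefly before polling again
            }
        } while ($running > 0 && $status === CURLM_OK);

        // Store status code and body per username, then release the handles.
        foreach ($handles as $name => $ch) {
            $results[$name] = [
                'code' => curl_getinfo($ch, CURLINFO_HTTP_CODE),
                'body' => curl_multi_getcontent($ch),
            ];
            curl_multi_remove_handle($mh, $ch);
            curl_close($ch);
        }
        curl_multi_close($mh);
    }

    return $results;
}
```

The right batch size depends on what the remote server tolerates, so it is probably worth starting small and measuring before raising it.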
I'm currently making a PHP site that gets its data from an API. At first cURL seemed to do that perfectly, but if the API returns an empty response (in which case we make the request again, because we can't give a correct response without it), it seems to spawn a child process. This didn't use too much CPU during development, but in production it can get as high as 150% CPU load.
Code used to get data from the API:
while (empty($output )) {
$ch = curl_init();
set_curl($ch);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_TIMEOUT, 8);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 4);
curl_setopt($ch, CURLOPT_URL, "http://domain.com");
$output = curl_exec($ch);
}
Is there any way to fix this?
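One thing that may contribute to the load is that the loop above retries immediately, without any limit, and never closes the handles it opens. I can't confirm that this is the cause, but a sketch with a capped number of attempts and a pause between them could look like this. fetch_with_retry(), $maxAttempts and $delayMs are hypothetical names and values; the two timeouts are the ones from the question.

```php
<?php
// Sketch only: fetch_with_retry(), $maxAttempts and $delayMs are hypothetical.
function fetch_with_retry(string $url, int $maxAttempts = 5, int $delayMs = 500): ?string
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, 8);
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 4);
        $output = curl_exec($ch);
        curl_close($ch); // release the handle on every attempt

        if ($output !== false && $output !== '') {
            return $output;
        }

        if ($attempt < $maxAttempts) {
            // Pause before the next attempt, doubling the delay each time.
            usleep($delayMs * 1000 * (2 ** ($attempt - 1)));
        }
    }

    return null; // give up after $maxAttempts empty or failed responses
}
```

The caller then has to handle the null case, for example by showing an error, instead of looping forever.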
I want to run my PHP script every 5 minutes. Here is my PHP code.
function call_remote_file($url){
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_exec($ch);
curl_close($ch);
}
set_time_limit(0);
$root='http://mywebsiteurl'; //remote location of the invoking and the working script
$url=$root."invoker.php";
$workurl=$root."script.php";
call_remote_file($workurl);//call working script
sleep(60*5);// wait for 300 seconds.
call_remote_file($url); //call again this script
I run this code once. It works perfectly, even after I close the entire browser window.
The problem is that it stops working if I turn off my system's internet connection.
How can I solve this problem? Please help me out.
While I wouldn't really recommend doing this for something critical (you're going to have stability issues), this could work:
function call_remote_file($url){
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_exec($ch);
curl_close($ch);
}
set_time_limit(0);
$root='http://mywebsiteurl'; //remote location of the invoking and the working script
$url=$root."invoker.php";
$workurl=$root."script.php";
while(true)
{
call_remote_file($workurl);//call working script
sleep(60*5);// wait for 300 seconds.
}
Another way would be to call it from the command line using exec():
function call_remote_file($url){
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_exec($ch);
curl_close($ch);
}
set_time_limit(0);
$root='http://mywebsiteurl'; //remote location of the invoking and the working script
$url=$root."invoker.php";
$workurl=$root."script.php";
call_remote_file($workurl);//call working script
sleep(60*5);// wait for 300 seconds.
exec('php ' . $_SERVER['SCRIPT_FILENAME']);
You should really use cron though if at all possible.
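If you do go the cron route, the script no longer needs to sleep or call itself; cron starts it every 5 minutes. Below is a rough sketch of such a worker with a simple lock so that two runs don't overlap. The crontab line, the paths, and the helper name run_exclusive() are all assumptions on my part.

```php
<?php
// Sketch only: run_exclusive() and the paths below are hypothetical.
// Assumed crontab entry (placeholder path), running the script every 5 minutes:
//   */5 * * * * php /path/to/worker.php

function run_exclusive(string $lockFile, callable $job): bool
{
    $fp = fopen($lockFile, 'c'); // create the lock file if it does not exist
    if ($fp === false) {
        return false;
    }

    // Non-blocking exclusive lock: skip this run if the previous one is still active.
    if (!flock($fp, LOCK_EX | LOCK_NB)) {
        fclose($fp);
        return false;
    }

    try {
        $job();
        return true;
    } finally {
        flock($fp, LOCK_UN);
        fclose($fp);
    }
}
```

In worker.php you would then wrap the actual work, for example `run_exclusive(sys_get_temp_dir() . '/worker.lock', function () { /* fetch $workurl here */ });`.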
The above code is ok but if you want to add multiple scripts to run at different intervals then the coding becomes far more complicated.
If you try phpjobscheduler (open source so free to use) it provides an interface to add, modify and remove scripts to run.
I am trying to test the below function but every time I try to use any sort of proxy IP (I have tried about 15 now) - I generally get the same error:
Received HTTP code 0 from proxy after CONNECT
Here is the function, anything wrong with it? It could just be the proxies I am using but I have tried several times now.
function getPage($proxy, $url, $referer, $agent, $header, $timeout) {
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, $header);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_PROXY, $proxy);
curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
curl_setopt($ch, CURLOPT_REFERER, $referer);
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
$result['EXE'] = curl_exec($ch);
$result['INF'] = curl_getinfo($ch);
$result['ERR'] = curl_error($ch);
curl_close($ch);
return $result;
}
Also, in general, is there any way I can improve it?
I appreciate all help.
Update
As I submitted this, I tried another proxy and it worked!
The other question still stands, how can I improve the above. It takes about 3-4 seconds to execute, anything I can do, or is this too minimal?
I know you sort of answered your first problem, but code 0 is not a valid HTTP status code. They should all begin with 1 (informational), 2 (success), 3 (redirection), 4 (client error), or 5 (server error). I would be really interested if anyone knows why you might get this code. Searching the libcurl site didn't bring anything up.
(More detailed information is here if you are interested:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html)
For the second question, I think you would need to find out which operation takes the longest. The microtime() function might be useful to you here. The documentation for microtime() has some example scripts to help you use the timer.
I suspect, though, that most of the 3-4 seconds is spent waiting for the response via the proxy at curl_exec($ch).
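Here is a small sketch of how you could measure that, combining microtime() with the timing values cURL itself reports through curl_getinfo(). time_request() is a hypothetical helper and the timeout default is an assumption.

```php
<?php
// Sketch only: time_request() is a hypothetical helper.
function time_request(string $url, ?string $proxy = null, int $timeout = 10): array
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
    if ($proxy !== null) {
        curl_setopt($ch, CURLOPT_PROXY, $proxy);
    }

    // Wall-clock time around curl_exec(), measured with microtime().
    $start = microtime(true);
    curl_exec($ch);
    $elapsed = microtime(true) - $start;

    // cURL's own breakdown of where the time went (all values in seconds).
    $timings = [
        'wall_clock'  => $elapsed,
        'name_lookup' => curl_getinfo($ch, CURLINFO_NAMELOOKUP_TIME),
        'connect'     => curl_getinfo($ch, CURLINFO_CONNECT_TIME),
        'first_byte'  => curl_getinfo($ch, CURLINFO_STARTTRANSFER_TIME),
        'total'       => curl_getinfo($ch, CURLINFO_TOTAL_TIME),
    ];
    curl_close($ch);

    return $timings;
}
```

If 'connect' or 'first_byte' accounts for most of 'total', the time is being spent on the proxy or the remote server rather than in your PHP code, and there is probably little to gain by changing the function itself.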