I have an array with approximately 45k usernames, and I want to hit a URL using cURL to get a response for each of those usernames. The issue is that I want to achieve this in less time.
$username = ['123', '456', '789', ...]; // up to 45k entries
for ($i = 0; $i < sizeof($username); $i++) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, 'http://abc.com.pk/hxc/get_user_details.php?uname=' . $username[$i]);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_USERAGENT, $ua); // $ua holds the user-agent string, set earlier
    curl_setopt($ch, CURLOPT_AUTOREFERER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 20);
    curl_setopt($ch, CURLOPT_HTTPGET, true);
    $result = curl_exec($ch);
    curl_close($ch);
}
The above code shows what I am doing right now, but because there are so many usernames it takes a lot of time to get all the responses. Is there any way I can do this in less time?
Have a look at https://github.com/php-curl-class/php-curl-class ; it speeds up cURL requests a lot. It has multi-cURL support and it's very easy to use.
As for your question about time, you can set the timeout using
Curl::setTimeout($seconds)
or, in the case of MultiCurl,
MultiCurl::setTimeout($seconds)
You can extend the timeout as much as needed.
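For example, here is a rough sketch of the username job above using the library's MultiCurl class (the method names are taken from the library's README; the concurrency cap and the callbacks are my own assumptions):

require 'vendor/autoload.php';

use Curl\MultiCurl;

$results = [];
$multiCurl = new MultiCurl();
$multiCurl->setConcurrency(50); // cap simultaneous connections (arbitrary value)
$multiCurl->setTimeout(30);
$multiCurl->success(function ($instance) use (&$results) {
    $results[$instance->url] = $instance->response;
});
$multiCurl->error(function ($instance) {
    // Log failures and keep going rather than aborting the whole batch.
    error_log('Failed: ' . $instance->url . ' - ' . $instance->errorMessage);
});
foreach ($username as $uname) {
    $multiCurl->addGet('http://abc.com.pk/hxc/get_user_details.php', ['uname' => $uname]);
}
$multiCurl->start(); // blocks until all queued requests have completed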
You can use curl_multi_init and curl_multi_exec so that your requests are processed concurrently.
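Here is a minimal sketch with the raw curl_multi API, batched so you don't open 45,000 sockets at once (the batch size of 50 is an arbitrary choice; the URL is the one from the question):

$results = [];
foreach (array_chunk($username, 50) as $batch) {
    $mh = curl_multi_init();
    $handles = [];
    foreach ($batch as $uname) {
        $ch = curl_init('http://abc.com.pk/hxc/get_user_details.php?uname=' . urlencode($uname));
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, 10);
        curl_multi_add_handle($mh, $ch);
        $handles[$uname] = $ch;
    }
    // Drive all handles in this batch to completion.
    do {
        $status = curl_multi_exec($mh, $active);
        if ($active) {
            curl_multi_select($mh); // wait for activity instead of busy-looping
        }
    } while ($active && $status === CURLM_OK);
    foreach ($handles as $uname => $ch) {
        $results[$uname] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }
    curl_multi_close($mh);
}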
The following script just seems to run forever; it never finishes.
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
for ($i = 500; $i < 3000; $i++) {
    $url = "http://abcedfg.com/$i/index.html";
    curl_setopt($ch, CURLOPT_URL, $url);
    $response = curl_exec($ch);
    $httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
}
Try wrapping curl_init and curl_close around every request, like this:
function callurl($myurl) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $myurl);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_NOBODY, true);
    curl_setopt($ch, CURLOPT_HEADER, true);
    $response = curl_exec($ch);
    curl_close($ch);
    return $response;
}
And you'll have to call this function for every URL, for example with a for loop, as sketched below.
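A minimal usage sketch, reusing the URL pattern from the question:

$responses = [];
for ($i = 500; $i < 3000; $i++) {
    $responses[$i] = callurl("http://abcedfg.com/$i/index.html");
}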
Also try to test with only 10-20 requests before going big.
Consider that 2500 requests, at one second each, translate to about 41 minutes of activity.
No server is configured by default to keep a PHP script running for 40 minutes. You can change this setting if you have access to the server.
It's also possible that you're stuck because the server doesn't have the resources to handle so many requests at once. Ideally you should fine-tune your server configuration to achieve better performance.
Also consider using curl_multi_init for better performance and asynchronous requests. But even that will not guarantee that requests won't be dropped because of a timeout, so tuning the server may still be needed.
Check also this post on how to increase the time limit.
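For reference, a minimal sketch of lifting PHP's own execution limit (this assumes your host allows overriding it; 0 means no limit):

set_time_limit(0); // lift the execution time limit for this script only
// or, equivalently, via the ini setting:
ini_set('max_execution_time', '0');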
It's also better to close a file every time you open it, so that the memory held for the open file is released.
You can build the full list of URLs by running the loop first, and then issue a single multi-cURL request for the whole list.
I made a PHP app that works against an API, but to send requests to the API I use cURL with a proxy, and that proxy has limited bandwidth. Can someone please tell me about some practices that may help me reduce bandwidth usage?
P.S. I already know about CURLOPT_NOBODY, which helps reduce bandwidth usage a bit, but I need to save more.
My current code:
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_USERAGENT, 'My User Agent');
curl_setopt($ch, CURLOPT_PROXY, '209.XXX.XXX.XX:8081');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $postfields);
curl_exec($ch);
curl_close($ch);
I'm currently making a PHP site that gets its data from an API. At first cURL seemed to handle that perfectly, but if the API returns an empty response (and we make the request again, because we can't give a correct response without it), it seems to spawn a child process. This didn't use much CPU during development, but in production it can get as high as 150% CPU load.
Code used to get data from the API:
while (empty($output)) {
    $ch = curl_init();
    set_curl($ch); // helper that applies our shared cURL defaults
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
    curl_setopt($ch, CURLOPT_TIMEOUT, 8);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 4);
    curl_setopt($ch, CURLOPT_URL, "http://domain.com");
    $output = curl_exec($ch);
    curl_close($ch); // release the handle on every iteration
}
Is there any way to fix this?
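One common way to bound a loop like this is a retry cap with back-off (a sketch only; the cap and delay values are arbitrary, and set_curl() is the question's own helper):

$output = false;
$attempts = 0;
$maxAttempts = 5; // give up eventually instead of retrying forever

while (empty($output) && $attempts < $maxAttempts) {
    $ch = curl_init();
    set_curl($ch);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_TIMEOUT, 8);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 4);
    curl_setopt($ch, CURLOPT_URL, "http://domain.com");
    $output = curl_exec($ch);
    curl_close($ch);
    if (empty($output)) {
        $attempts++;
        usleep(250000 * $attempts); // back off a little longer on each retry
    }
}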
I am trying to access an API using cURL. I can access the API from my browser, but I cannot get the data from the same API (using the same API key) with cURL.
I am getting this error:
403 Developer Over Qps
Please let me know what the reason for this could be. It was working earlier; I have been facing this issue for the past two days.
Please check the code below:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://api.perfb.com/api/api.php?requestmethod=json&responsemethod=xml');
curl_setopt($ch, CURLOPT_TIMEOUT, 900);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($ch, CURLOPT_FAILONERROR, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers); // $headers is built earlier in the script
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $vJson); // $vJson holds the JSON request body
$response = curl_exec($ch);
$info = curl_getinfo($ch);
echo '<pre>';
print_r($info);
exit;
QPS means queries per second.
Are you hitting the server repeatedly with cURL in a loop, for example? Try adding a pause after each call and see if that works.
That error usually signifies that you're hitting the server too often (i.e. the developer is over the allowed queries per second). Slow down your code and put some delays in, as in the sketch below. In the browser you're doing it manually, so you're likely quite a bit slower than your code.
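For example, a simple client-side throttle (a sketch only; the cap of 5 requests per second is an assumption, so use whatever your API plan actually allows, and $urls stands in for your request list):

$maxPerSecond = 5;
foreach ($urls as $url) {
    $start = microtime(true);

    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $response = curl_exec($ch);
    curl_close($ch);

    // Sleep out the remainder of this request's time slice.
    $elapsed = microtime(true) - $start;
    $slice = 1.0 / $maxPerSecond;
    if ($elapsed < $slice) {
        usleep((int) (($slice - $elapsed) * 1000000));
    }
}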
I am trying to test the function below, but every time I use any sort of proxy IP (I have tried about 15 now), I generally get the same error:
Received HTTP code 0 from proxy after CONNECT
Here is the function. Is there anything wrong with it? It could just be the proxies I am using, but I have tried several times now.
function getPage($proxy, $url, $referer, $agent, $header, $timeout) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, $header);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_PROXY, $proxy);
    curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    curl_setopt($ch, CURLOPT_REFERER, $referer);
    curl_setopt($ch, CURLOPT_USERAGENT, $agent);
    $result['EXE'] = curl_exec($ch);
    $result['INF'] = curl_getinfo($ch);
    $result['ERR'] = curl_error($ch);
    curl_close($ch);
    return $result;
}
Also, in general, is there any way I can improve it?
I appreciate all help.
Update
As I submitted this, I tried another proxy and it worked!
The other question still stands: how can I improve the above? It takes about 3-4 seconds to execute. Is there anything I can do, or is that already minimal?
I know you sort of answered your first problem, but code 0 is not a valid HTTP status code. Status codes all begin with 1 (informational), 2 (success), 3 (redirection), 4 (client error), or 5 (server error). I would be really interested if anyone knows why you might get this code; searching the libcurl site didn't bring anything up.
(More detailed information is here if you are interested: http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html)
For the second question, I think you would need to find where the longest operation is. The microtime() function might be useful to you here; its documentation has some example scripts to help you use the timer.
I suspect, though, that most of the 3-4 seconds is spent waiting for the response via the proxy inside curl_exec($ch).
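For example, a rough timing sketch around getPage() (the connect_time and total_time keys come from curl_getinfo(), which the function already returns in $result['INF']; the argument values are placeholders):

$t0 = microtime(true);
$result = getPage($proxy, $url, $referer, $agent, 1, 30);
$wall = microtime(true) - $t0;

// Compare wall-clock time against cURL's own breakdown.
printf(
    "wall: %.2fs  connect: %.2fs  total: %.2fs\n",
    $wall,
    $result['INF']['connect_time'],
    $result['INF']['total_time']
);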