I'm using PHP's cURL functions and explode() to extract the upvote count from a Reddit post page remotely.
It's quite slow: it takes several seconds between the button click and the return of the data. My question is, how can I speed it up? Where can I optimize? Is the slowness in cURL fetching the URL, or in exploding the page?
Here's how I'm locating the upvote div and getting its contents:
function between($src, $start, $end){
    $txt = explode($start, $src);
    $txt2 = explode($end, $txt[1]);
    return trim($txt2[0]);
}
$title = between($data, '<div class="score unvoted">','</div>');
Here's the function I'm using to get the page data from Reddit.
function get_data($url) {
    $ch = curl_init();
    $timeout = 5;
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}
It might be worth looking into a profiling tool like Webgrind to see exactly where the slowdown occurs.
Chances are it is cURL that's slowing down your page, but without profiling you cannot tell for certain.
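If you don't want to set up a full profiler, a quick sanity check is to time the two phases separately with microtime(). This is only a sketch: the Reddit URL is a placeholder, and it assumes the get_data() and between() functions from the question.
$url = 'http://www.reddit.com/'; // placeholder: use the actual post URL here

$t0 = microtime(true);
$data = get_data($url);                                            // network fetch via cURL
$t1 = microtime(true);
$title = between($data, '<div class="score unvoted">', '</div>');  // string parsing
$t2 = microtime(true);

printf("cURL fetch: %.3f s, explode/parse: %.3f s\n", $t1 - $t0, $t2 - $t1);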
Related
I have the following code, in which I am fetching data through PHP cURL.
$URL = '//abc.com';
$gb = curl_init();
curl_setopt($gb,CURLOPT_URL,$URL);
curl_setopt($gb,CURLOPT_RETURNTRANSFER,1);
curl_setopt($gb,CURLOPT_CONNECTTIMEOUT,10);
curl_setopt($gb,CURLOPT_TIMEOUT,10);
curl_setopt($gb,CURLOPT_SSL_VERIFYPEER,false);
$res = curl_exec($gb);
curl_close($gb);
$data = json_decode($res,true);
What is the best way to make cURL requests when I have multiple variants of the URL, like:
1). //abc.com
2). //abc.com/abc
3). //abc.com/123
Should I call cURL multiple times, or is there a better way to define this in a PHP function?
You can do something like this:
function curlRequest($url){
    $gb = curl_init();
    curl_setopt($gb, CURLOPT_URL, $url);
    curl_setopt($gb, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($gb, CURLOPT_CONNECTTIMEOUT, 10);
    curl_setopt($gb, CURLOPT_TIMEOUT, 10);
    curl_setopt($gb, CURLOPT_SSL_VERIFYPEER, false);
    $res = curl_exec($gb);
    curl_close($gb); // free the handle once the request is done
    return json_decode($res, true);
}
$urls = ["http://url1", "http://url2", "http://url3"];
foreach ($urls as $url) {
    $data = curlRequest($url); // do something with $data
}
I don't see a difference between calling the same domain or another one, since different routes return different information. If all of these URLs are equivalent, you don't need a foreach or for loop at all.
It depends.
But if you connect to the same website or service every time, you can freely reuse the same connection.
This allows you to set parameters for the connection only once (e.g. cookies or other headers).
You can also extract the cURL handle creation into a separate function and then only switch the URL for each specific request.
Your code would then look like this:
function init_my_curl() {
    $h = curl_init();
    curl_setopt($h, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($h, CURLOPT_CONNECTTIMEOUT, 10);
    curl_setopt($h, CURLOPT_TIMEOUT, 10);
    curl_setopt($h, CURLOPT_SSL_VERIFYPEER, false);
    return $h;
}
function do_request($handle, $url) {
    curl_setopt($handle, CURLOPT_URL, $url);
    $result = curl_exec($handle);
    return json_decode($result, true);
}
Then you call:
$curl = init_my_curl();
do_request($curl, '//abc.com');
do_request($curl, '//abc.com/abc');
do_request($curl, '//abc.com/123');
curl_close($curl);
You can also wrap everything in a class, but that depends on the PHP version you are using and your code style.
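For example, a small wrapper class along those lines could look like the sketch below. The class name is made up, it reuses the same options as above, and the URL at the end is just the question's example host.
class CurlClient {
    private $handle;

    public function __construct() {
        $this->handle = curl_init();
        curl_setopt($this->handle, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($this->handle, CURLOPT_CONNECTTIMEOUT, 10);
        curl_setopt($this->handle, CURLOPT_TIMEOUT, 10);
    }

    // Perform a GET request on the shared handle and decode the JSON body.
    public function getJson($url) {
        curl_setopt($this->handle, CURLOPT_URL, $url);
        $result = curl_exec($this->handle);
        return json_decode($result, true);
    }

    public function __destruct() {
        curl_close($this->handle);
    }
}

$client = new CurlClient();
$data = $client->getJson('http://abc.com/abc'); // example host from the question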
I saw the following code on another post...
<?php
$ch = curl_init(); // create curl handle
$url = "http://www.google.com";
/**
* For https, there are more options that you must define, these you can get from php.net
*/
curl_setopt($ch,CURLOPT_URL,$url);
curl_setopt($ch,CURLOPT_POST, true);
curl_setopt($ch,CURLOPT_POSTFIELDS, http_build_query(['array_of_your_post_data']));
curl_setopt($ch,CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch,CURLOPT_CONNECTTIMEOUT ,3); //timeout in seconds
curl_setopt($ch,CURLOPT_TIMEOUT, 20); // same for here. Timeout in seconds.
$response = curl_exec($ch);
curl_close ($ch); //close curl handle
echo $response;
?>
The only thing is I have no clue how to actually implement it or how it will work.
Here is what I am trying to do...
sitea.com/setup sets PHP variables. If you visited that page, it would set $var1 = "hello" and $var2 = "hi".
If someone visits siteb.com, I want to use PHP to somehow get those variables from sitea.com/setup over to siteb.com and set them as new PHP variables there. I'm assuming cURL is the best option from what I've read, but I can't figure out how to get it to work (or where to put it and how to call it, for that matter).
I'm experimenting with code trying to see how to do something for an upcoming project. Any help would be greatly appreciated.
I should note that I need this to be able to work from one domain on server1 to another domain on server2.
In a simple way, it can be done like this:
Site a: file.php
<?php
$a = 10;
$b = 20;
echo $a . ':' . $b;
?>
Site b: curl.php
<?php
$ch = curl_init(); // create curl handle
$url = "http://sitea/file.php";
/**
* For https, there are more options that you must define, these you can get from php.net
*/
curl_setopt($ch,CURLOPT_URL,$url);
curl_setopt($ch,CURLOPT_POST, true);
curl_setopt($ch,CURLOPT_POSTFIELDS, http_build_query(['array_of_your_post_data']));
curl_setopt($ch,CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch,CURLOPT_CONNECTTIMEOUT ,3); //timeout in seconds
curl_setopt($ch,CURLOPT_TIMEOUT, 20); // same for here. Timeout in seconds.
$response = curl_exec($ch);
curl_close ($ch); //close curl handle
echo $response;
$parts = explode(':', $response);
$var1 = $parts[0];
$var2 = $parts[1];
?>
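If you later need more than two values, a slightly more robust variant of the same idea is to serialize them as JSON instead of a colon-separated string. This is only a sketch, reusing the same file layout as above.
Site a: file.php
<?php
// Expose the variables as JSON instead of a hand-rolled format
echo json_encode(array('var1' => 'hello', 'var2' => 'hi'));
?>
Site b: after $response = curl_exec($ch):
$vars = json_decode($response, true);
$var1 = $vars['var1']; // "hello"
$var2 = $vars['var2']; // "hi"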
I'm making a scoreboard and implementing the Steam API to retrieve avatars for users. At first I was using file_get_contents, but it was so slow! So someone suggested I use cURL.
Old method
$url = 'http://www.com';
$content = file_get_contents($url);
$json = json_decode($content, true);
I then used a foreach loop to grab the items I wanted from the data.
foreach($output['response']['players'] as $item) {
}
New cURL code:
$url = 'www.com';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
echo $output = curl_exec($ch);
curl_close($ch);
$json = json_decode($output, true);
I get pretty much the same result as with the json method, just a little faster. But it is still extremely slow. Is there any way to increase the speed of this? Can I load the table first and then load the avatars as they become available?
Scoreboard
http://fyre.site.nfoservers.com/index.php
Consider using for loops, since those can speed things up. If you are talking about the load times (the time until the page loads and displays) being slow, consider using output buffering like this:
Unset arrays or values that you don't need anymore.
Note that the Steam API accepts 100 IDs at once, so the friends list is split into chunks of 100.
Output buffering pushes the information out as soon as each chunk is done, so the page does not wait until everything has finished. Try it out, I guess.
$totalfriends = count($friends);
$chunkedfriends = array_chunk($friends, 100);
$chunks = ceil($totalfriends / 100);
if (ob_get_length() > 0) {
    ob_end_flush();
    ob_implicit_flush();
}
for ($i = 0; $i < $chunks; $i++) {
    $url = "https://api.steampowered.com/ISteamUser/GetPlayerSummaries/v0001/?key=" . $steamkey . "&steamids=" . implode(',', $chunkedfriends[$i]);
    $friendscountchunk = count($chunkedfriends[$i]);
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_PIPEWAIT, false);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_URL, $url);
    $urlresult = curl_exec($ch);
    curl_close($ch);
    $json_decoded = json_decode($urlresult);
    if (ob_get_length() > 0) {
        ob_end_flush();
        ob_implicit_flush();
    }
    for ($x = 0; $x < $friendscountchunk; $x++) {
        ?>
        <li class="friendsli"><a href="steamuser.php?id=<?=$json_decoded->response->players->player[$x]->steamid?>">
        <img src='<?=$json_decoded->response->players->player[$x]->avatar?>'/><p class="friendname"> <?=$json_decoded->response->players->player[$x]->personaname?> </p>
        </a></li> <?php
    }
}
unset($friends); unset($player); unset($json_decoded);
I don't think it is the best script or method, but it will help for sure.
You cannot speed up an external API, but you can improve and adapt your code.
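For example, if the sequential requests themselves are the slow part, one possible adaptation is to send all the chunk requests at once with curl_multi and render when the responses come back. This is only a sketch, reusing the $chunkedfriends and $steamkey variables from the code above.
$mh = curl_multi_init();
$handles = array();
foreach ($chunkedfriends as $i => $chunk) {
    $url = "https://api.steampowered.com/ISteamUser/GetPlayerSummaries/v0001/?key=" . $steamkey . "&steamids=" . implode(',', $chunk);
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($mh, $ch);   // register every chunk request up front
    $handles[$i] = $ch;
}

// Let cURL run all requests concurrently
do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh);            // wait for activity instead of busy-looping
} while ($running > 0);

// Collect and decode each response, then render it as in the loop above
foreach ($handles as $i => $ch) {
    $json_decoded = json_decode(curl_multi_getcontent($ch));
    // ... output the <li> markup for this chunk here ...
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);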
I'm using the following php script to get search results from Google.
include("simple_html_dom.php");
include("random-user-agent.php");
$query = 'facebook';
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, 'http://www.google.com/search?q='.$query.'');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1); // must be enabled so curl_exec() returns the page instead of printing it
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($curl, CURLOPT_USERAGENT,random_user_agent());
$str = curl_exec($curl);
curl_close($curl);
$html= str_get_html($str);
$i = 0;
foreach ($html->find('li[class=g]') as $element) {
    foreach ($element->find('h3') as $item) {
        $title[$i] = $item->plaintext;
    }
    $i++;
}
print_r($title);
When this script runs in a cronjob (with a 5-second sleep) I receive a warning from Google and have to fill in a captcha (obviously). I always thought that using cURL and a random user agent would avoid this. What is the correct solution?
A better way to avoid the captcha is to set a randomized sleep of 3-6 seconds between requests.
The best solution is to use proxies.
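A rough sketch combining both suggestions; the proxy addresses below are placeholders you would replace with your own pool.
// Pick a proxy at random from your own pool (placeholder addresses)
$proxies = array('198.51.100.10:8080', '198.51.100.11:3128');
curl_setopt($curl, CURLOPT_PROXY, $proxies[array_rand($proxies)]);

// ... run curl_exec($curl) and parse the results as above ...

// Wait a randomized 3-6 seconds before the next request
sleep(rand(3, 6));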
I want to create a function using cURL and the bit.ly API to shorten links.
My question is: how can I get the string of data returned by cURL and continue to use it throughout the function? This doesn't seem to work, I assume because of the attempt to return $string and then use $string in the rest of the function.
Here's what I have, which just displays a blank page:
function shorten_url($bit_login, $bit_api, $long_url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, "http://api.bitly.com/v3/shorten?login=".$bit_login."&apiKey=".$bit_api."&longUrl=".$long_url."&format=xml");
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_exec($ch);
    curl_close($ch);
    return $string;
    $xml = simplexml_load_string($string);
    $short_url = $xml->data[0]->url;
    echo $short_url;
}
shorten_url($login,$apikey,"http://www.google.com");
I've also tried to return $short_url and echo it outside of the function, after the function is run (below shorten_url()), which doesn't work either.
The function will return NULL. You are missing $string = curl_exec($ch);, so the response is never captured. Also, everything after return $string; is unreachable, so the XML parsing and the echo never run.
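With that fixed, and the unreachable lines moved before the return, the function could look like this sketch. The urlencode() call on the long URL is an extra precaution I added so it survives as a query parameter.
function shorten_url($bit_login, $bit_api, $long_url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, "http://api.bitly.com/v3/shorten?login=" . $bit_login . "&apiKey=" . $bit_api . "&longUrl=" . urlencode($long_url) . "&format=xml");
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    $string = curl_exec($ch); // capture the response instead of discarding it
    curl_close($ch);

    $xml = simplexml_load_string($string);
    return $xml->data[0]->url; // hand the short URL back to the caller
}

echo shorten_url($login, $apikey, "http://www.google.com");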