I'm making a scoreboard and using the Steam API to retrieve avatars for users. At first I was using file_get_contents(), but it was so slow! Someone suggested I use cURL instead.
Old method:
$url = 'http://www.com';
$content = file_get_contents($url);
$json = json_decode($content, true);
I then used a foreach loop to grab the items I wanted from the data.
foreach($json['response']['players'] as $item) {
    // ... grab the fields I need from each player ...
}
New cURL code:
$url = 'http://www.com';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$output = curl_exec($ch);
curl_close($ch);
$json = json_decode($output, true);
I get pretty much the same result from the JSON method, and it is a little faster, but it is still extremely slow. Is there any way to increase the speed of this? Can I load the table first and then load the avatars as they become available?
Scoreboard
http://fyre.site.nfoservers.com/index.php
If you are talking about the load times (the time until the page loads and displays) being slow, consider using output buffering as in the script below: it pushes out each chunk of information as soon as it is done, so the browser does not wait for the whole page. Also unset arrays or values that you don't need anymore, and consider plain for loops, which can be marginally faster than foreach. Note that the Steam API accepts 100 IDs at once, so the friends list is split into chunks of 100. Try it out, I guess.
$totalfriends = count($friends);
$chunkedfriends = array_chunk($friends, 100); // Steam API accepts at most 100 IDs per request
$chunks = ceil($totalfriends / 100);

// flush any buffered output so results reach the browser as they are produced
if(ob_get_length() > 0) {
    ob_end_flush();
    ob_implicit_flush();
}

for($i = 0; $i < $chunks; $i++){
    $url = "https://api.steampowered.com/ISteamUser/GetPlayerSummaries/v0001/?key=" . $steamkey . "&steamids=" . implode(',', $chunkedfriends[$i]);
    $friendscountchunk = count($chunkedfriends[$i]);

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_PIPEWAIT, false); // don't wait for a multiplexed connection
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_URL, $url);
    $urlresult = curl_exec($ch);
    curl_close($ch);

    $json_decoded = json_decode($urlresult);

    // push this chunk to the browser before requesting the next one
    if(ob_get_length() > 0) {
        ob_end_flush();
        ob_implicit_flush();
    }

    for($x = 0; $x < $friendscountchunk; $x++){
?>
<li class="friendsli"><a href="steamuser.php?id=<?=$json_decoded->response->players->player[$x]->steamid?>">
<img src='<?=$json_decoded->response->players->player[$x]->avatar?>'/><p class="friendname"><?=$json_decoded->response->players->player[$x]->personaname?></p>
</a></li>
<?php
    }
}
unset($friends); unset($player); unset($json_decoded);
I don't think it is the best script or method, but it will help for sure.
You cannot speed up an external API, but you can improve and adapt your code.
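One further improvement, not part of the answer above: since each chunk is an independent request, the requests can be issued in parallel with curl_multi, so the total wait is roughly one round-trip instead of one per chunk. A rough sketch, assuming the $chunkedfriends and $steamkey variables from the script above:
// Sketch: fetch all chunks in parallel with curl_multi.
// Assumes $chunkedfriends and $steamkey exist as in the script above.
$mh = curl_multi_init();
$handles = array();

foreach ($chunkedfriends as $i => $chunk) {
    $url = "https://api.steampowered.com/ISteamUser/GetPlayerSummaries/v0001/?key="
         . $steamkey . "&steamids=" . implode(',', $chunk);
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($mh, $ch);
    $handles[$i] = $ch;
}

// run all transfers until every one has finished
$running = null;
do {
    curl_multi_exec($mh, $running);
    if ($running > 0) {
        curl_multi_select($mh); // wait for activity instead of busy-looping
    }
} while ($running > 0);

// collect and decode each response
foreach ($handles as $i => $ch) {
    $json_decoded = json_decode(curl_multi_getcontent($ch));
    // ... render this chunk's players as in the inner loop above ...
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);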
Related
I need to reduce the load time of my script, which uses cURL and Simple HTML DOM.
This is my script; I need help :(
It takes about two minutes, and I need to parse many different pages!
require_once ("simple_html_dom.php");
function curl ($page){
ob_start();
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "URL");
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, "POSTFIELDS");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$result = curl_exec($ch);
curl_close($ch);
ob_end_clean();
return $result;
}
$start = microtime(true);
set_time_limit(0);
$text = "here text of last page";
$i = 0;
while(strpos(str_get_html(curl($i)), $text) == null){
    $html = str_get_html(curl($i));
    foreach($html->find('div#box-container-inner div.box') as $e){
        // print(...) - only for testing
    }
    echo "parsed page ".($i+1)."<br>";
    $i++;
}
$time_elapsed_mins = (microtime(true) - $start)/60;
echo $time_elapsed_mins;
You appear to be running CURL twice in each loop (once to evaluate the while-loop condition, once to set $html) and converting the resulting string into an object for each loop. That's four potentially-intensive processes each time you loop that you can knock down to two per loop.
Instead, you can assign the $html variable to the result of str_get_html(curl($i)) while within the while-loop evaluation:
while(strpos(($html = str_get_html(curl($i))), $text) === false) {
    // $html = str_get_html(curl($i)); // this assignment inside the loop is no longer needed
Also add a cURL timeout, set to 60 seconds or a value of your choice:
curl_setopt($ch, CURLOPT_TIMEOUT, 60);
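Putting the two answers together (my sketch, not verbatim from either): fetch each page once, test the raw HTML string for the stop text, and only then parse it, with the timeout inside the helper. "URL" and "POSTFIELDS" are the question's placeholders, not real values.
require_once("simple_html_dom.php");

// cURL helper with a timeout
function curl($page){
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, "URL");
    curl_setopt($ch, CURLOPT_POST, 1);
    curl_setopt($ch, CURLOPT_POSTFIELDS, "POSTFIELDS");
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_TIMEOUT, 60); // give up after 60 seconds
    $result = curl_exec($ch);
    curl_close($ch);
    return $result;
}

$text = "here text of last page";
$i = 0;
// one fetch and one parse per iteration
while (($raw = curl($i)) !== false && strpos($raw, $text) === false) {
    $html = str_get_html($raw);
    foreach ($html->find('div#box-container-inner div.box') as $e) {
        // process each box here
    }
    $i++;
}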
I'm still a bit new to using curl to pull data and I've recently started using Fiddler to help find what options need to be set.
I'm trying to see if I can pull an image from a site. I first hit a search page, set the search parameters, and then start hitting links in the results. When I attempt to follow a link to an image in one of the results, I get an empty string returned from curl_exec().
The weird thing is - at one point, it worked - I got the data back and successfully saved the image locally. But then it stopped, and I have no idea what I was doing to have it working. Naturally, everything works OK in the browser. :(
I'm using Simple HTML DOM to parse the results and cURL for the actual page requests. curl_error() does not show an error, and curl_getinfo() thinks everything is OK too. It's probably something trivial, but I'm not sure how to troubleshoot it beyond where I am.
<?php
include 'includes/simple_html_dom.php';
$url = "http://nwweb.co.bell.tx.us/NewWorld.Aegis.WebPortal/Corrections/InmateInquiry.aspx";
// Get Cookie - ASP.NET_SessionId
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
$r = curl_exec($ch);
preg_match_all('/^Set-Cookie:\s*([^;]*)/mi', $r, $matches);
$cookies = array();
foreach($matches[1] as $item)
{
    // parse_str() converts '.' in names to '_', so the value of
    // ASP.NET_SessionId ends up under the key ASP_NET_SessionId
    parse_str($item, $cookie);
    $cookies = array_merge($cookies, $cookie);
}
$sessionCookie = "ASP_NET_SessionId=".$cookies['ASP_NET_SessionId'];
// now load up page into Simple HTML DOM and get all inputs - ignore buttons and populate our dates
$startDate = "02%2F01%2F2000";
$endDate = "02%2F07%2F2016";
$getInputs = str_get_html($r);
$inputs = $getInputs->find('input');
$inputs_array = array();
$buttons_array = array();
for ($i=0; $i<count($inputs); $i++)
{
if ($inputs[$i]->type != "submit")
{
$inputs_array[$inputs[$i]->id] = $inputs[$i]->value;
if (stripos($inputs[$i]->id, "FromDate") > 0)
$inputs_array[$inputs[$i]->id] = $startDate;
if (stripos($inputs[$i]->id, "ToDate") > 0)
$inputs_array[$inputs[$i]->id] = $endDate;
}
}
// build up our curl data - includes hidden inputs, our to & from dates, plus the Search button
$curl_data = http_build_query($inputs_array)."&ctl00%24DefaultContent%24uxSearch=Search";
// POST the data, include session cookie
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $curl_data);
curl_setopt($ch, CURLOPT_COOKIE, $sessionCookie);
$response = curl_exec($ch);
// this shows that we can get data
// find the links from the HTML
$htmlDom = str_get_html($response); // load up Simple HTML DOM
// get the table of results
$divTable = $htmlDom->find('div#ctl00_DefaultContent_uxResultsWrapper',0)->find('table',0);
$rows = $divTable->find('tr');
for ($i=1; $i<count($rows);$i++)
{
if ($i>3) break; // limit the length of script for debugging
$link = $rows[$i]->find('td',1)->find('a',0)->href;
// build up query to get inmate details from the link above
$url = "http://nwweb.co.bell.tx.us/NewWorld.Aegis.WebPortal/Corrections/".$link;
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_COOKIE, $sessionCookie);
$page = curl_exec($ch);
$pageData = str_get_html($page);
// Now find the Photo, there's a thumb in div.BookingPhotos
// It is linked to a full size image, the link is of the form http://nwweb.co.bell.tx.us/NewWorld.Aegis.WebPortal/GetImage.aspx?ImageKey=17C030IS, but in the href, it has ../GetImage.aspx?ImageKey=xxxx
$photoLink = $pageData->find('div.BookingPhotos',0)->find('a',0)->href;
// get rid of .. and put the base URL on the front
$imgLink = str_replace("..", "http://nwweb.co.bell.tx.us/NewWorld.Aegis.WebPortal", $photoLink);
// now attempt to pull the image
$ch = curl_init($imgLink);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_COOKIE, $sessionCookie);
// here is the PROBLEM - NO DATA RETURNED
$imgData = curl_exec($ch); // I get a header back, but NO data
}
?>
I'm using the following php script to get search results from Google.
include("simple_html_dom.php");
include("random-user-agent.php");
$query = 'facebook';
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, 'http://www.google.com/search?q='.$query.'');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1); // needed so curl_exec() returns the page instead of printing it
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($curl, CURLOPT_USERAGENT,random_user_agent());
$str = curl_exec($curl);
curl_close($curl);
$html= str_get_html($str);
$i = 0;
foreach($html->find('li[class=g]') as $element) {
    foreach($element->find('h3') as $item)
    {
        $title[$i] = $item->plaintext;
    }
    $i++;
}
print_r($title);
When this script runs in a cronjob (with a 5-second sleep) I receive a warning from Google and have to fill in a captcha (obviously). I always thought that using cURL and a random user agent could avoid this. What is the correct solution?
A better way to avoid the captcha is to use a randomized sleep of 3 to 6 seconds between requests.
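For example, something like this between requests (a sketch; fetch_google_results() stands in for the cURL code above and is not a real function):
// sleep a random 3-6 seconds between requests so the timing looks less bot-like
foreach ($queries as $query) {
    $results = fetch_google_results($query); // hypothetical wrapper around the cURL code above
    // ... process $results ...
    sleep(random_int(3, 6)); // random_int() requires PHP 7+; mt_rand(3, 6) works on older versions
}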
The best solution is to use proxies.
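With cURL that means pointing each request at a proxy; a minimal sketch (the proxy address is a placeholder, not a real endpoint):
// route the request through a proxy so Google sees a different source IP
curl_setopt($curl, CURLOPT_PROXY, 'proxy.example.com:8080');
// for proxies that require authentication:
// curl_setopt($curl, CURLOPT_PROXYUSERPWD, 'user:password');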
I'm using PHP's cURL and explode functions to extract the upvotes from a Reddit post page remotely.
It's quite slow; it takes several seconds between the button click and the return of the data. My question is: how can I speed this up? Where can I optimize? Is the slow part cURL fetching the URL, or exploding the page?
Here's how I'm locating the upvote div and getting its contents:
function between($src, $start, $end){
    $txt = explode($start, $src);
    $txt2 = explode($end, $txt[1]);
    return trim($txt2[0]);
}
$title = between($data, '<div class="score unvoted">','</div>');
Here's the function I'm using to get the page data from Reddit.
function get_data($url) {
    $ch = curl_init();
    $timeout = 5;
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}
It might be worth looking into a profiling tool like WebGrind to see exactly where the slowdown occurs.
Chances are it is cURL that's slowing down your page, but without profiling you cannot tell for certain.
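Short of a full profiler, a quick check (my sketch, using the two functions from the question; the URL is a placeholder) is to time the fetch and the parse separately:
// time the cURL fetch and the explode-based parsing separately
$t0 = microtime(true);
$data = get_data('https://www.reddit.com/comments/xxxxxx/'); // placeholder URL
$t1 = microtime(true);
$score = between($data, '<div class="score unvoted">', '</div>');
$t2 = microtime(true);
printf("fetch: %.3fs, parse: %.3fs", $t1 - $t0, $t2 - $t1);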
I found a script in Web Designer magazine that enables you to gather Album Data from a Facebook Fan Page, and put it on your site.
The script utilizes PHP's file_get_contents() function, which works great on my personal server, but is not allowed on the Network Solutions hosting.
In looking through their documentation, they recommended that you use a cURL session to gather the data. I have never used cURL sessions before, and so this is something of a mystery to me. Any help would be appreciated.
The code I "was" using looked like this:
<?php
$FBid = '239319006081415';
$FBpage = file_get_contents('https://graph.facebook.com/'.$FBid.'/albums');
$photoData = json_decode($FBpage);
$albumID = $photoData->data[0]->id;
$albumURL = "https://graph.facebook.com/".$albumID."/photos";
$rawAlbumData = file_get_contents("https://graph.facebook.com/".$albumID."/photos");
$photoData2 = json_decode($rawAlbumData);
$a = 0;
foreach($photoData2->data as $data) {
$photoArray[$a]["source"] = $data->source;
$photoArray[$a]["width"] = $data->width;
$photoArray[$a]["height"] = $data->height;
$a++;
}
?>
The code that I am attempting to use now looks like this:
<?php
$FBid = '239319006081415';
$FBUrl = "https://graph.facebook.com/".$FBid."/albums";
$ch = curl_init($FBUrl);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
$contents = curl_exec($ch);
curl_close($ch);
$photoData = json_decode($contents);
?>
When I try to echo or manipulate the contents of $photoData however, it's clear that it is empty.
Any thoughts?
Try removing curl_setopt($ch, CURLOPT_BINARYTRANSFER, 1); — I'm not exactly sure what that does, but I'm not using it and my code otherwise looks very similar. I'd also use json_decode($contents, true); this puts the results in an array instead of an object, and I've had better luck with that approach. Put it in the "works for me" category.
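Applied to the question's code, that suggestion would look roughly like this (a sketch; the Graph API URL is the one from the question):
// the question's snippet with BINARYTRANSFER removed and json_decode() returning an array
$FBid = '239319006081415';
$FBUrl = "https://graph.facebook.com/".$FBid."/albums";
$ch = curl_init($FBUrl);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
$contents = curl_exec($ch);
curl_close($ch);
$photoData = json_decode($contents, true); // associative array instead of an object
if ($photoData === null) {
    var_dump($contents); // decoding failed - inspect the raw response
} else {
    $albumID = $photoData['data'][0]['id']; // array syntax now, not ->data[0]->id
}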
Try this, it could work:
$ch = curl_init($FBUrl);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
$contents = curl_exec($ch);
$pageData = json_decode($contents);
//object to array
$objtoarr = get_object_vars($pageData);
curl_close($ch);
Use jQuery's getJSON instead. This tip is from the FB Album downloader GreaseMonkey script.