I need help avoiding duplicate code (copy pasting code twice) - php

I'm constantly trying to improve my programming skills, and so far I have learned everything online. But I can't find a way to avoid duplicate code. Here's my code:
public function Curl($page, $check_top = 0, $pages = 1, $pagesources = array()){
//$page is the URL
//$check_top 0 = false 1 = true. When true it needs to check both false & true
//$pages is the amount of pages it needs to check.
$agent = "Mozilla/5.0 (Windows NT x.y; Win64; x64; rv:10.0) Gecko/20100101 Firefox/10.0";
try{
for($i = 0; $i < $pages; $i++){
$count = $i * 25; //Page 1 starts at 0, page 2 at 25 etc..
$ch = curl_init($page . "/?count=" . $count);
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, 60);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
$pagesource = curl_exec($ch);
$pagesources[] = $pagesource;
}
if($check_top == 1){
for($i = 0; $i < $pages; $i++){
$count = $i * 25;
$ch = curl_init($page . "/top/?sort=top&t=all&count=" . $count);
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, 60);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
$pagesource = curl_exec($ch);
$pagesources[] = $pagesource;
}
}
} catch (Exception $e){
echo $e->getMessage();
}
return $pagesources;
}
What I'm trying to do:
I want to get the HTML page sources for a specific page range (for example pages 1 to 5). There are top pages and standard pages, and I want to get the sources from both for the same page range. My code works fine, but obviously there must be a better way.

Here's a short example of how you can avoid duplicate code by writing functions and using them together.
class A
{
public function methodA($paramA, $paramB, $paramC)
{
if ($paramA == 'A') {
$result = $this->methodB($paramB);
} else {
$result = $this->methodB($paramC);
}
return $result;
}
public function methodB($paramA)
{
// do something with the given param and return the result
}
}
$classA = new A();
$result = $classA->methodA('foo', 'bar', 'baz');
The code above shows a simple class with two methods. Since you declared your function Curl as public in your example, I guess you're using a class. The class in the example is very basic: methodA calls methodB with a different parameter depending on the condition.
What does this mean for you? You have to find out which parameters your helper function needs. Once you know, write another class method that executes the cURL functions with the given parameters. Easy as pie.
If you're new to classes and methods in PHP, I suggest reading the documentation, which describes the basic functionality of classes, methods and members: http://php.net/manual/en/classobj.examples.php.
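Applied to the Curl method from the question, the idea could look roughly like the sketch below. The class name PageFetcher and the helper names buildUrls and fetch are made up for illustration; the cURL options are copied from the question unchanged. The two loops differ only in the URL suffix, so that suffix becomes a parameter and the cURL calls are written once.

```php
<?php
class PageFetcher
{
    const RESULTS_PER_PAGE = 25; // page 1 starts at 0, page 2 at 25, etc.
    const AGENT = "Mozilla/5.0 (Windows NT x.y; Win64; x64; rv:10.0) Gecko/20100101 Firefox/10.0";

    // Build the URLs for one listing; $suffix is the only part that differs
    // between the standard pages and the top pages.
    public function buildUrls($page, $suffix, $pages)
    {
        $urls = array();
        for ($i = 0; $i < $pages; $i++) {
            $urls[] = $page . $suffix . ($i * self::RESULTS_PER_PAGE);
        }
        return $urls;
    }

    // The cURL calls that were written twice in the question now live here.
    private function fetch($url)
    {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_USERAGENT, self::AGENT);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_TIMEOUT, 60);
        curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); // kept from the question
        $pagesource = curl_exec($ch);
        curl_close($ch);
        return $pagesource;
    }

    public function Curl($page, $check_top = 0, $pages = 1, $pagesources = array())
    {
        $urls = $this->buildUrls($page, "/?count=", $pages);
        if ($check_top == 1) {
            $topUrls = $this->buildUrls($page, "/top/?sort=top&t=all&count=", $pages);
            $urls = array_merge($urls, $topUrls);
        }
        foreach ($urls as $url) {
            $pagesources[] = $this->fetch($url);
        }
        return $pagesources;
    }
}
```

The try/catch from the question is left out of this sketch because the cURL functions report failure through their return value (curl_exec returns false) rather than by throwing an exception.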

Related

How to get a specified row using cUrl PHP

Hey guys, I use cURL to communicate with an external web server, but the response comes back as HTML. I was able to convert it to JSON (more than 4000 rows), but I have no idea how to get the specific row that contains my result. Any ideas?
Here is my cURL code:
require_once('getJson.php');
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://www.reputationauthority.org/domain_lookup.php?ip=website.com&Submit.x=9&Submit.y=5&Submit=Search');
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
curl_setopt($ch, CURLOPT_TIMEOUT, 5);
$data = curl_exec($ch);
$httpcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);
$data = '<<<EOF'.$data.'EOF';
$json = new GetJson();
header("Content-Type: text/plain");
$res = json_encode($json->html_to_obj($data), JSON_PRETTY_PRINT);
$myArray = json_decode($res,true);
For getJson.php
class GetJson{
function html_to_obj($html) {
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
return $this->element_to_obj($dom->documentElement);
}
function element_to_obj($element) {
if ($element->nodeType == XML_ELEMENT_NODE){
$obj = array( "tag" => $element->tagName );
foreach ($element->attributes as $attribute) {
$obj[$attribute->name] = $attribute->value;
}
foreach ($element->childNodes as $subElement) {
if ($subElement->nodeType == XML_TEXT_NODE) {
$obj["html"] = $subElement->wholeText;
}
else {
$obj["children"][] = $this->element_to_obj($subElement);
}
}
return $obj;
}
}
}
Instead of walking through the rows to reach line 2175 (doing something like $data['children'][2]['children'][7]['children'][3]['children'][1]['children'][1]['children'][0]['children'][1]['children'][0]['children'][1]['children'][2]['children'][0]['children'][0]['html'] does not seem like a good idea to me), I want to go directly to it.
If the HTML being returned has a consistent structure every time, and you just want one particular value from one part of it, you may be able to use regular expressions to parse the HTML and find the part you need. This is an alternative to trying to put the whole thing into an array. I have used this technique before to parse an HTML document and find a specific item. Here's a simple example. You will need to adapt it to your needs, since you haven't specified the exact nature of the data you're seeking. You may need to go down several levels of parsing to find the right bit:
$data = curl_exec($ch);
//Split the output into an array that we can loop through line by line
$array = preg_split('/\n/',$data);
//For each line in the output
foreach ($array as $element)
{
//See if the line contains a hyperlink
if (preg_match('/<a href/', $element))
{
// Do something here, e.g. store the data retrieved, or do more matching to find something within it
}
}
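Since the question already loads the response into a DOMDocument, another way to "go directly" to one node is an XPath query, which is a different technique from the regular-expression approach above. The sketch below is only an illustration: the markup and the class name "score" are invented, because the real page structure is not shown in the question.

```php
<?php
// Hypothetical markup standing in for the real response body.
$html = '<html><body><table>'
      . '<tr><td class="label">Reputation</td><td class="score">42</td></tr>'
      . '</table></body></html>';

libxml_use_internal_errors(true); // suppress warnings caused by imperfect HTML
$dom = new DOMDocument();
$dom->loadHTML($html);

$xpath = new DOMXPath($dom);
// Select the cell by what identifies it, not by its position in the tree.
$nodes = $xpath->query('//td[@class="score"]');

$score = ($nodes !== false && $nodes->length > 0)
    ? trim($nodes->item(0)->textContent)
    : null;

echo $score, "\n";
```

The query string is the only part that would need adapting once the real markup around the wanted value is known.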

determine and sort distance for several locations

I have about 15 locations in a mysql table with lat and long information.
Using PHP and the Google Maps API, I am able to calculate the distance between two locations.
function GetDrivingDistance($lat1, $lat2, $long1, $long2)
{
$url = "https://maps.googleapis.com/maps/api/distancematrix/json?origins=".$lat1.",".$long1."&destinations=".$lat2.",".$long2."&mode=driving&language=en-US";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_PROXYPORT, 3128);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
$response = curl_exec($ch);
curl_close($ch);
$response_a = json_decode($response, true);
$dist = $response_a['rows'][0]['elements'][0]['distance']['text'];
$time = $response_a['rows'][0]['elements'][0]['duration']['text'];
return array('distance' => $dist, 'time' => $time);
}
I want to select one location as fixed, e.g. row 1 with its given lat and long:
$query = "SELECT lat, long FROM table WHERE location=1";
$locationStart = $conn->query($query);
I want to calculate the distance to all the other locations in the table (the other rows) and return the results sorted by distance.
I tried calculating each one separately and ended up with very long code that takes too long to fetch via the API, and I'm still not able to sort the results this way!
Any hints?
Disclaimer: This is not a working solution, nor have I tested it; it is just a quick example off the top of my head to provide a code sample to go with my comment.
My brain's still not fully warmed up, but I believe the code below should at least act as a guide to the idea I was putting across in my comment. I'll try to answer any questions you have when I'm free. Hope it helps.
<?php
define('MAXIMUM_REQUEST_STORE', 5); // Store 5 requests in each multi_curl_handle
function getCurlInstance($url) {
$handle = curl_init();
curl_setopt($handle, CURLOPT_URL, $url);
curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
return $handle;
}
$data = []; // Build up an array of Endpoints you want to hit. I'll let you do that.
// Initialise Variables
$totalRequests = count($data);
$parallelCurlRequests = [];
$handlerID = 0;
// Set up our first handler
$parallelCurlRequests[$handlerID] = curl_multi_init();
// Loop through each of our curl handles
for ($i = 0; $i < $totalRequests; ++$i) {
// We want to create a new handler/store every 5 requests. -- Goes off the constant MAXIMUM_REQUEST_STORE
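if ($i > 0 && $i % MAXIMUM_REQUEST_STORE == 0) {
++$handlerID;
// The new store needs its own multi handle before anything is added to it
$parallelCurlRequests[$handlerID] = curl_multi_init();
}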
// Create a Curl Handle for the current endpoint
// ... and store the it in an array for later use.
$curl[$i] = getCurlInstance($data[$i]);
// Add the Curl Handle to the Multi-Curl-Handle
curl_multi_add_handle($parallelCurlRequests[$handlerID], $curl[$i]);
}
// Run each Curl-Multi-Handler in turn
foreach ($parallelCurlRequests as $request) {
$running = null;
do {
curl_multi_exec($request, $running);
if ($running) {
curl_multi_select($request); // wait for activity instead of spinning
}
} while ($running);
}
$distanceArray = [];
// You can now pull out the data from the request.
foreach ($curl as $handle) {
$content = curl_multi_getcontent($handle);
if (!empty($content)) {
// Build up some form of array from the decoded JSON (not from the raw string).
$response = json_decode($content);
$location = $response->someObject[0]->someRow->location;
$distance = $response->someObject[0]->someRow->distance;
$distanceArray[$location] = $distance;
}
}
natsort($distanceArray);
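For the sorting step on its own, a minimal sketch might look like the following. It assumes the Distance Matrix request is made with one origin and several destinations separated by "|", and that each returned element carries a numeric distance value in metres next to the text field used in the question. The function names, the location ids and the sample response are all hypothetical, and a real request would normally also need an API key.

```php
<?php
// Build one request URL for a single origin and many destinations.
// Each point is a [lat, long] pair.
function buildMatrixUrl(array $origin, array $destinations)
{
    $pairs = array_map(function ($p) {
        return $p[0] . ',' . $p[1];
    }, $destinations);

    return 'https://maps.googleapis.com/maps/api/distancematrix/json?'
        . http_build_query(array(
            'origins'      => $origin[0] . ',' . $origin[1],
            'destinations' => implode('|', $pairs),
            'mode'         => 'driving',
        ));
}

// Map each element of the response back to its location id and sort by
// the numeric distance, keeping the ids as array keys.
function sortByDistance(array $locationIds, array $elements)
{
    $distances = array();
    foreach ($elements as $i => $element) {
        if (isset($element['distance']['value'])) { // skip failed lookups
            $distances[$locationIds[$i]] = $element['distance']['value'];
        }
    }
    asort($distances, SORT_NUMERIC);
    return $distances;
}

// Hypothetical decoded 'elements' for the locations with ids 2, 3 and 4.
$elements = array(
    array('distance' => array('text' => '12.5 km', 'value' => 12500)),
    array('distance' => array('text' => '0.8 km', 'value' => 800)),
    array('status' => 'ZERO_RESULTS'),
);
$sorted = sortByDistance(array(2, 3, 4), $elements);
```

Sorting on the numeric value avoids comparing strings such as "0.8 km" and "12.5 km", and asort keeps the location ids attached to their distances.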

Inaccurate data share counts from Transient API Cache

Hi, I have a WordPress multisite where I am using the Transients API to cache social media share counts. I'm using the answer posted here: Caching custom social share count in WordPress
Everything is working, however it does not give me accurate share counts for all of the posts. Some have the correct share count; others show what appears to be a random number. For example, a post that has 65 Facebook likes shows only 1 when the Transient code is added. When I remove the Transient code, it shows the accurate number of shares for all of them. Any ideas what could cause this?
Here is my code added to functions.php:
class shareCount {
private $url,$timeout;
function __construct($url,$timeout=10) {
$this->url=rawurlencode($url);
$this->timeout=$timeout;
}
function get_fb() {
$json_string = $this->file_get_contents_curl('http://api.facebook.com/restserver.php?method=links.getStats&format=json&urls='.$this->url );
$json = json_decode($json_string, true);
return isset($json[0]['total_count'])?intval($json[0]['total_count']):0;
}
private function file_get_contents_curl($url){
// Create unique transient key
$transientKey = 'sc_' + md5($url);
// Check cache
$cache = get_site_transient($transientKey);
if($cache) {
return $cache;
}
else {
$ch=curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);
curl_setopt($ch, CURLOPT_FAILONERROR, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
curl_setopt($ch, CURLOPT_TIMEOUT, $this->timeout);
$count = curl_exec($ch);
if(curl_error($ch))
{
die(curl_error($ch));
}
// Cache results for 1 hour
set_site_transient($transientKey, $count, 60 * 60);
return $count;
}
}
}
Everything works if I remove the if($cache) { return $cache; } check, but then the page is really slow.
I have spent hours trying to figure this out, so figured I'd ask the experts. I've attached a screen shot comparing the post share counts with and without the Transient API so you can see the differences.
Comparison of Share Counts
Thanks
I have used this snippet, and it did the job with the SharedCount API:
function aesop_share_count(){
$post_id = get_the_ID();
//$url = 'http://nickhaskins.co'; // this one used for testing to return a working result
$url = get_permalink();
$apiurl = sprintf('http://api.sharedcount.com/?url=%s', urlencode($url)); // encode the permalink before using it as a query value
$transientKey = 'AesopShareCounts'. (int) $post_id;
$cached = get_transient($transientKey);
if (false !== $cached) {
return $cached;
}
$fetch = wp_remote_get($apiurl, array('sslverify'=>false));
// wp_remote_get() returns a WP_Error on failure, so check the response itself
if( is_wp_error( $fetch ) ) {
return 0;
}
$remote = wp_remote_retrieve_body($fetch);
$count = json_decode( $remote,true);
$twitter = $count['Twitter'];
$fb_like = $count['Facebook']['like_count'];
$total = $fb_like + $twitter;
$out = sprintf('%s',$total);
set_transient($transientKey, $out, 600);
return $out;
}
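One detail in the question's file_get_contents_curl is worth checking: the transient key is built with +, which is arithmetic in PHP, not string concatenation. On older PHP versions 'sc_' + md5($url) collapses to a number, so many different URLs can end up sharing one key and reading each other's cached counts, which would match the symptom described. PHP 8 rejects the expression with a TypeError. A minimal sketch of the key built with the concatenation operator (the helper name is made up for illustration):

```php
<?php
// Hypothetical helper mirroring the key built in file_get_contents_curl().
function shareCountTransientKey($url)
{
    // "." concatenates strings; "+" would attempt numeric addition.
    return 'sc_' . md5($url);
}

$keyA = shareCountTransientKey('http://example.com/post-a');
$keyB = shareCountTransientKey('http://example.com/post-b');
```

The if($cache) test in the question also treats an empty or "0" cached value as a miss; the strict false !== $cached comparison used in the snippet above avoids that.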

Using cURL and PHP for CACTI in Windows

I was recently tasked with monitoring external webpage response/loading time via CACTI. I found some PHP scripts that work (pageload-agent.php and class.pageload.php) using cURL. All was working fine until I was asked to move it from Linux to a Windows Server 2012 R2 machine. I'm having a very hard time modifying the scripts to work on Windows. PHP and cURL are already installed, and both work when tested. Here are the scripts, taken from askaboutphp.
class.pageload.php
<?php
class PageLoad {
var $siteURL = "";
var $pageInfo = "";
/*
* sets the URLs to check for loadtime into an array $siteURLs
*/
function setURL($url) {
if (!empty($url)) {
$this->siteURL = $url;
return true;
}
return false;
}
/*
* extract the header information of the url
*/
function doPageLoad() {
$u = $this->siteURL;
if(function_exists('curl_init') && !empty($u)) {
$ch = curl_init($u);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_ENCODING, "gzip");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_NOBODY, false);
curl_setopt($ch, CURLOPT_FRESH_CONNECT, false);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)");
$pageBody = curl_exec($ch);
$this->pageInfo = curl_getinfo($ch);
curl_close ($ch);
return true;
}
return false;
}
/*
* compile the page load statistics only
*/
function getPageLoadStats() {
$info = $this->pageInfo;
//stats from info
$s['dest_url'] = $info['url'];
$s['content_type'] = $info['content_type'];
$s['http_code'] = $info['http_code'];
$s['total_time'] = $info['total_time'];
$s['size_download'] = $info['size_download'];
$s['speed_download'] = $info['speed_download'];
$s['redirect_count'] = $info['redirect_count'];
$s['namelookup_time'] = $info['namelookup_time'];
$s['connect_time'] = $info['connect_time'];
$s['pretransfer_time'] = $info['pretransfer_time'];
$s['starttransfer_time'] = $info['starttransfer_time'];
return $s;
}
}
?>
pageload-agent.php
#! /usr/bin/php -q
<?php
//include the class
include_once 'class.pageload.php';
// read in an argument - must make sure there's an argument to use
if ($argc==2) {
//read in the arg.
$url_argv = $argv[1];
if (!eregi('^http://', $url_argv)) {
$url_argv = "http://$url_argv";
}
// check that the arg is not empty
if ($url_argv!="") {
//initiate the results array
$results = array();
//initiate the class
$lt = new PageLoad();
//set the page to check the loadtime
$lt->setURL($url_argv);
//load the page
if ($lt->doPageLoad()) {
//load the page stats into the results array
$results = $lt->getPageLoadStats();
} else {
//do nothing
print "";
}
//print out the results
if (is_array($results)) {
//expecting only one record as we only passed in 1 page.
$output = $results;
print "dns:".$output['namelookup_time'];
print " con:".$output['connect_time'];
print " pre:".$output['pretransfer_time'];
print " str:".$output['starttransfer_time'];
print " ttl:".$output['total_time'];
print " sze:".$output['size_download'];
print " spd:".$output['speed_download'];
} else {
//do nothing
print "";
}
}
} else {
//do nothing
print "";
}
?>
Thank you. Any type of assistance is greatly appreciated.
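One thing that may be worth checking, offered as an assumption because the question does not include an error message: eregi() was removed in PHP 7, so a newer PHP installed on the Windows server would stop at that line of pageload-agent.php. A sketch of the same check written with preg_match follows; the helper name is hypothetical, and unlike the original it also accepts https:// so that such URLs are not prefixed a second time.

```php
<?php
// Hypothetical helper: add a scheme when the argument has none.
function normalizeUrl($arg)
{
    // The "i" modifier makes the match case-insensitive, as eregi() was.
    if (!preg_match('#^https?://#i', $arg)) {
        return "http://$arg";
    }
    return $arg;
}

echo normalizeUrl('example.com'), "\n";
```

On Windows the shebang line is not used to locate the interpreter, so the script would typically be started by calling php.exe with the script path and the URL as arguments.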

How to Know when to repeat a loop, based on information within the loop

I am new to PHP and still trying to grasp the concepts.
My question is: how do I decide whether or not to loop, based on information that is only available once I start the loop?
This is the code I have come up with, with comments to try and explain my thinking.
I'm basically trying to get the "business done" for every page, whether there is one or many.
NOTE: $object contains the result set, which may span pages. If it does span pages then $object->pagination will exist, otherwise it will not. $object->pagination->pages = the total number of pages, and $object->pagination->page = the current page.
//get first page. some how this needs to be a loop i'm guessing.
$page = 1;
$url = "api/products/page/" . $page;
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// Set both headers in one call; a second CURLOPT_HTTPHEADER call would replace the first
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Content-Type: application/json; charset=utf-8", "Accept: application/json"));
curl_setopt($ch, CURLOPT_USERPWD, "username-password");
$contents = curl_exec ($ch);
curl_close ($ch);
$object = json_decode($contents);
//$data returned with product info and pagination info if there is more than one page.
//now check in that first page to see if there is other pages set
if(isset($object->pagination)){
while($object->pagination->page < $object->pagination->pages) {
$page = $page+1 ;
//do some business
} else {//stop going through the pages}
}
<?php
$page = 1;
$url = "api/products/page/" . $page;
//$url is used for cURL. $object returned with product info and pagination info if there is more than one page.
//$object is the JSON decoded object returned from cURL.
//now check in that first page to see if there is other pages set
if(isset($object->pagination)){
while($object->pagination->page < $object->pagination->pages)
{
//do some business
$object->pagination->page++;
}
}
?>
I think you mean to loop over all the pages in the object:
if (isset($object->pagination)) {
while ($object->pagination->page < $object->pagination->pages) {
$object->pagination->page++; // $x++ is the same as $x = $x+1
//do some business
}
}
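Because the page count is only known after the first request, a do-while loop is one way to express "fetch at least once, then decide whether to continue". In the sketch below, fetchPage is a hypothetical stand-in for the cURL request plus json_decode from the question, returning made-up data for three pages so that the loop can run without a network call.

```php
<?php
// Hypothetical stand-in for the cURL request plus json_decode().
function fetchPage($page)
{
    $object = new stdClass();
    $object->products = array("product-$page-a", "product-$page-b");
    $object->pagination = new stdClass();
    $object->pagination->page = $page;
    $object->pagination->pages = 3; // pretend the API reports three pages
    return $object;
}

$page = 1;
$allProducts = array();
do {
    $object = fetchPage($page);

    // do some business with the current page
    $allProducts = array_merge($allProducts, $object->products);

    // Continue only if pagination exists and there are pages left.
    $hasMore = isset($object->pagination)
        && $object->pagination->page < $object->pagination->pages;
    $page++;
} while ($hasMore);

echo count($allProducts), "\n";
```

When the result fits on a single page and pagination is not set, $hasMore is false after the first pass and the loop body runs exactly once.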
Try this:
if(is_array($values)){
foreach($values as $value){
//do something with your value
}
}else{
// do something with $values
}
Here is an example:
$x = 1;
// Loop 1 through 5
while( $x <= 5 )
{
// Output x
echo("$x\n");
// Check if x is a certain value and do something different
if($x==3)
{
echo("Almost there...\n");
}
// Increment x
$x++;
}
// This outputs...
// 1
// 2
// 3
// Almost there...
// 4
// 5
