PHP cURL & XPath giving inconsistent results - php

I'm trying to loop over a URL parameter and pass each URL to a function that fetches the page HTML with cURL and runs XPath queries on it. But the results vary: sometimes a field comes back as an empty string. Is there something special I need to consider when using cURL or XPath? The code works apart from this flaw, which is really hard to debug.
Here is the code I use.
private function getArticles($url){
// Instantiate cURL to grab the HTML page.
$c = curl_init($url);
curl_setopt($c, CURLOPT_HEADER, false);
curl_setopt($c, CURLOPT_USERAGENT, $this->getUserAgent());
curl_setopt($c, CURLOPT_FAILONERROR, true);
curl_setopt($c, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($c, CURLOPT_AUTOREFERER, true);
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
curl_setopt($c, CURLOPT_TIMEOUT, 10);
// Grab the data.
$html = curl_exec($c);
// Check if the HTML didn't load right, if it didn't - report an error
if (!$html) {
echo "<p>cURL error number: " .curl_errno($c) . " on URL: " . $url ."</p>" .
"<p>cURL error: " . curl_error($c) . "</p>";
}
// Close connection.
curl_close($c);
// Parse the HTML information and return the results.
$dom = new DOMDocument();
@$dom->loadHtml($html);
$xpath = new DOMXPath($dom);
// Get a list of articles from the section page
$cname = $xpath->query('//*[@id="item-details"]/div/div[1]/h1');
$link = $xpath->query('//*[@id="item-details"]/div/ul/li[1]/a/@href');
$streetadress = $xpath->query('//*[@id="item-details"]/div[2]/div[3]/div[1]/text()[1]');
$zip = $xpath->query('//*[@id="item-details"]/div[2]/div[3]/div[1]/text()[2]');
$phone1 = $xpath->query('//*[@id="item-details"]/div/h2/span[2]');
$phone2 = $xpath->query('//*[@id="item-details"]/div/h2[2]/span[2]');
$ceo = $xpath->query('//*[@id="company-financials"]/div/div[2]/span');
$orgnr = $xpath->query('//*[@id="company-financials"]/div/div[1]/span');
$turnover13 = $xpath->query('//*[@class="geb-turnover1"]');
$turnover12 = $xpath->query('//*[@class="geb-turnover2"]');
$turnover11 = $xpath->query('//*[@class="geb-turnover3"]');
$logo = $xpath->query('//*[@id="item-info"]/p/img/@src');
$desc = $xpath->query('//*[@id="item-info"]/div[1]/div');
$capturelink = "";
// $capturelink = $this->getWebCapture($link->item(0)->nodeValue);
return array(
'companyname' => $cname->item(0)->nodeValue,
'streetadress' => $streetadress->item(0)->nodeValue,
'zip' => $zip->item(0)->nodeValue,
'phone1' => $phone1->item(0)->nodeValue,
'phone2' => $phone2->item(0)->nodeValue,
'link' => $link->item(0)->nodeValue,
'ceo' => $ceo->item(0)->nodeValue,
'orgnr' => $orgnr->item(0)->nodeValue,
'turnover2013' => $turnover13->item(0)->nodeValue,
'turnover2012' => $turnover12->item(0)->nodeValue,
'turnover2011' => $turnover11->item(0)->nodeValue,
'description' => $desc->item(0)->nodeValue,
'logo' => $logo->item(0)->nodeValue,
'capturelink' => $capturelink);
}
// End Get Articles
Edit:
I really tried everything on this one, but ended up using phpQuery, and now it works. I suspect PHP's DOM and XPath are not always a good mix, at least for me in this case.
This is how I use it instead of XPath:
....
require('phpQuery.php');
phpQuery::newDocumentHTML($html);
$capture = "";
// $capture = $this->getWebCapture(pq('.website')->attr('href'));
return array(
'companyname' => pq('.header')->find('h1')->text(),
'streetadress' => pq('.address-container:first-child')->text(),
'zip' => pq('.address-container')->text(),
'phone1' => pq('.phone-number')->text(),
'phone2' => pq('.phone-number')->text(),
'link' => pq('.website')->attr('href'),
'ceo' => pq('.geb-ceo')->text(),
'orgnr' => pq('.geb-org-number')->text(),
'turnover2013' => pq('.geb-turnover1')->text(),
'turnover2012' => pq('.geb-turnover2')->text(),
'turnover2011' => pq('.geb-turnover3')->text(),
'description' => pq('#item-info div div')->text(),
'logo' => pq('#item-info logo img')->attr('src'),
'capture' => $capture);

Is there something special I need to consider using curl or xpath?
Since you ask: I think you would benefit from getting more comfortable with what cURL does, what XPath does, and where the two are related and where they are not.
The code works, just this flaw that is really hard to debug.
Well, the function you've got there is pretty long and does too many things at once. That is why it's hard to debug, too. Move code out of that function into subroutines you call from that function. That will also help you to structure the code more.
Additionally, you can keep records of what your program does. When debugging, you can then take the exact HTML of a past request (because you've stored it) and verify whether your XPath queries really fit the data.
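As a minimal sketch of both suggestions (smaller subroutines, and replaying stored HTML), something along these lines could work. The helper names xpathFromHtml and firstNodeValue and the log file path are made up for illustration; they are not part of the original code. Returning null when a query matches nothing also makes the "empty" cases visible instead of failing on item(0).

```php
<?php
// Hypothetical helper: build a DOMXPath object from an HTML string.
function xpathFromHtml(string $html): DOMXPath
{
    $dom = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    if ($html !== '') {
        // An empty string (e.g. after a failed download) is skipped on purpose,
        // so every query simply finds nothing.
        $dom->loadHTML($html);
    }
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return new DOMXPath($dom);
}

// Hypothetical helper: trimmed text of the first node matching $query,
// or null when the query matches nothing.
function firstNodeValue(DOMXPath $xpath, string $query): ?string
{
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return trim($nodes->item(0)->nodeValue);
}

// Replaying a stored response (the path is an assumption for illustration):
// $xpath = xpathFromHtml(file_get_contents('/tmp/scrape-log/page-42.html'));
// var_dump(firstNodeValue($xpath, '//*[@id="item-details"]/div/div[1]/h1'));
```

With the download, the parsing and the extraction separated like this, a null result can be traced to either a failed request (empty HTML) or a query that does not fit that particular page.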

Related

Calling a PHP file from a PHP loop in background

I have a PHP loop where I need to call another PHP file in the background to insert/update some information based on a variable sent to it. I have tried to use cURL, but it does not seem to work.
I need it to call SQLupdate.php?symbol=$symbol. Is there another way of calling that PHP file with the parameter in the background, and can it also be done synchronously, with a response back for each iteration of the loop?
while(($row=mysqli_fetch_array($res)) and ($counter < $max))
{
$ch = curl_init();
$curlConfig = array(
CURLOPT_URL => "SQLinsert.php",
CURLOPT_POST => true,
CURLOPT_RETURNTRANSFER => true,
CURLOPT_POSTFIELDS => array(
'symbol' => $symbol,
)
);
curl_setopt_array($ch, $curlConfig);
$result = curl_exec($ch);
curl_close($ch);
}
I'm going to weigh in down here in hopes of getting this one "away & done".
Although it isn't entirely clear from your post, it seems you're trying to call your PHP file via an HTTP(s) protocol.
In many configurations of PHP, you could do this and avoid some potential cURL overhead by using file_get_contents() instead:
while(($row=mysqli_fetch_array($res)) and ($counter < $max)) {
$postdata = http_build_query(
array(
'symbol' => $row['symbol']
)
);
$opts = array('http' =>
array(
'method' => 'POST',
'header' => 'Content-type: application/x-www-form-urlencoded',
'content' => $postdata
)
);
$context = stream_context_create($opts);
$result = file_get_contents('http://example.com/SQLinsert.php', false, $context);
$counter++; // you didn't mention this, but you don't want an endless loop...
}
That's pretty much a textbook example copied from the manual, actually.
To use cURL instead, as you originally tried to do, you can set up the handle once and make just one curl_setopt() call inside the loop, which is pretty clean:
$ch = curl_init();
$curlConfig = array(
CURLOPT_URL => "http://example.com/SQLinsert.php",
CURLOPT_POST => true,
CURLOPT_RETURNTRANSFER => true
);
curl_setopt_array($ch, $curlConfig);
while(($row=mysqli_fetch_array($res)) and ($counter < $max)) {
curl_setopt($ch, CURLOPT_POSTFIELDS, array('symbol' => $row['symbol']));
$result = curl_exec($ch);
$counter++; //see above
}
// do this *after* the loop
curl_close($ch);
Now the actual and original problem may be that $symbol isn't initialized; at least, it isn't in the example you have provided. I've attempted to fix this by using $row['symbol'] in both my examples. If this isn't the name of the column in the database then you would obviously need to use the correct name.
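When a call "does not seem to work", it usually helps to look at what curl_exec() actually returned. The following is only a sketch: postSymbol is a hypothetical helper, it assumes the handle was already configured with CURLOPT_POST and CURLOPT_RETURNTRANSFER as above, and it sends the field URL-encoded via http_build_query() rather than as an array.

```php
<?php
// Hypothetical helper: POST one symbol on an existing cURL handle and
// report whether the transfer itself succeeded.
// Assumes CURLOPT_RETURNTRANSFER is already enabled on $ch.
function postSymbol($ch, string $symbol): array
{
    curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query(['symbol' => $symbol]));
    $body = curl_exec($ch);
    if ($body === false) {
        // Transport-level failure: DNS, refused connection, timeout, ...
        return ['ok' => false, 'error' => curl_error($ch)];
    }
    return [
        'ok'     => true,
        'status' => curl_getinfo($ch, CURLINFO_HTTP_CODE),
        'body'   => $body,
    ];
}
```

Inside the loop, $result = postSymbol($ch, $row['symbol']); followed by a check of $result['ok'] and $result['status'] would show whether the request ever reached SQLinsert.php.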
Finally, be advised that it's almost always better to access a secondary resource via the fastest available mechanism; if "SQLinsert.php" is local to the calling script, using HTTP(s) is going to be terribly under-performant, and you should rewrite both pieces of the system to work from a local (e.g. 'disk-based') point-of-view (which has already been recommended by a plethora of commenters):
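//SQLinsert.php
function myInsert($symbol) {
// you've not given us any DB schema information ...
global $db; //hack, *cough*; assumes $db is an open mysqli connection
// a prepared statement keeps $symbol from being interpreted as SQL
$stmt = $db->prepare("insert into `myTable` (symbol) values (?)");
if (!$stmt) return false;
$stmt->bind_param('s', $symbol);
return $stmt->execute(); // true on success, false on failure
}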
//script.php
require_once("SQLinsert.php");
while(($row=mysqli_fetch_array($res)) and ($counter < $max)) {
$ins = myInsert($row['symbol']);
if ($ins) { // let's only count *good* inserts, which is possible
// because we've written 'myInsert' to return a boolean
$counter++;
}
}

How to output the contents of file_get_contents

I'm trying to figure out how to get a php file (html, css and javascript) and load it into the content section below.
The following is the original...
function wpse_124979_add_help_tabs() {
if ($screen = get_current_screen()) {
$help_tabs = $screen->get_help_tabs();
$screen->remove_help_tabs();
$screen->add_help_tab(array(
'id' => 'my_help_tab',
'title' => 'Help',
'content' => 'HTML CONTENT',
I have tried the following, but it fails. I added a file_get_contents() call (first line) and then tried to pull the result in with 'content' => $adminhelp.
The following is with my amended code...
$adminhelp = file_get_contents('admin-help.php');
function wpse_124979_add_help_tabs() {
if ($screen = get_current_screen()) {
$help_tabs = $screen->get_help_tabs();
$screen->remove_help_tabs();
$screen->add_help_tab(array(
'id' => 'my_help_tab',
'title' => 'WTV Help',
'content' => $adminhelp,
Any ideas what's wrong?
If you want the output of the PHP file to be saved as $adminhelp use:
$adminhelp = file_get_contents('http://YOUR_DOMAIN/admin-help.php');
Right now you're loading the source code of admin-help.php into $adminhelp.
Another example for getting the output of a webpage is cURL:
// create curl resource
$ch = curl_init();
// set url
curl_setopt($ch, CURLOPT_URL, "http://YOUR_DOMAIN/admin-help.php");
//return the transfer as a string
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// $output contains the output string
$adminhelp = curl_exec($ch);
// close curl resource to free up system resources
curl_close($ch);
If this doesn't answer your question, please include the error message that you're receiving.
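If admin-help.php lives on the same server, an HTTP round trip is not strictly needed: output buffering can capture what the file prints. This is a sketch under that assumption, and renderPhpFile is a hypothetical helper name. Note also that a variable assigned outside a function is not visible inside it, so the content should be produced (or passed in) inside wpse_124979_add_help_tabs() itself.

```php
<?php
// Hypothetical helper: execute a local PHP file and return everything it prints.
function renderPhpFile(string $path): string
{
    if (!is_file($path)) {
        return ''; // nothing to render
    }
    ob_start();       // start capturing output
    include $path;    // runs the file; its echo/HTML output goes into the buffer
    return (string) ob_get_clean(); // return the captured output and stop buffering
}

// Possible use inside the function from the question (path is an assumption):
// 'content' => renderPhpFile(__DIR__ . '/admin-help.php'),
```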

Paginate data returned by facebook graph API

I am using PHP SDK 5.0. I have been able to pull posts from my FB page and display it on my site with this code below. I need help to achieve pagination of the results returned by this code.
session_start();
require_once __DIR__. '/Facebook/autoload.php';
$fb = new Facebook\Facebook([
'app_id' => 'xxxxxxxxxxxx',
'app_secret' => 'xxxxxxxxxxxx',
'default_graph_version' => 'v2.4',
'default_access_token' => 'xxxxxxxxxxxx',
]);
$request = $fb->request('GET','/500px/feed/', array('fields' => 'created_time,message', 'limit' => '3',));
try {
$response = $fb->getClient()->sendRequest($request);
$data_array = $response->getDecodedBody();
} catch(Facebook\Exceptions\FacebookResponseException $e) {
echo 'Graph returned an error: ' . $e->getMessage();
exit;
} catch(Facebook\Exceptions\FacebookSDKException $e) {
echo 'Facebook SDK returned an error: ' . $e->getMessage();
exit;
}
Facebook returns the next and previous links with every feed result, as in the example below. How can I paginate using those? Or is there a better alternative?
[data] => Array
(
[0] => Array
(
[created_time] => 2015-09-23T17:00:53+0000
[message] => some message pulled from the post
[id] => 57284451149_10152992926641150
)
)
[paging] => Array
(
[previous] => https://graph.facebook.com/v2.4/57284451149/feed?fields=created_time,message&limit=1&since=1443027653&access_token=xxxxxxxxxxx&__paging_token=enc_xxxxxxxxxxxx&__previous=1
[next] => https://graph.facebook.com/v2.4/57284451149/feed?fields=created_time,message&limit=1&access_token=xxxxxxxxxxx&until=1443027653&__paging_token=enc_xxxxxxxxxxxx
)
I am still learning PHP, and at this point I have no clue how to go beyond this. Ideally there would be three results per page and the results would display on the same page. If not a solution, I would really appreciate pseudocode, helpful suggestions, or a roadmap that would help me do it myself.
TEMP SOLUTION - I could be on wrong track, but here is what I have done as a temporary solution. Seems like facebook does all the work, eg offset etc and provides us a calculated url, and all we need to do is work with the provided urls.
//set your url parameters here
function fetchUrl($url){
$feedData = false; // returned as-is when cURL is unavailable or the request fails
if(is_callable('curl_init')){
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, 20);
// CURLOPT_SSL_VERIFYPEER is left at its default (enabled): the URL carries the access token
$feedData = curl_exec($ch);
curl_close($ch);
}
return $feedData;
}
if(isset($_POST['nextLink'])){
$url = $_POST['nextLink'];
} elseif(isset($_POST['prevLink'])){
$url = $_POST['prevLink'];
} else {
$url = 'https://graph.facebook.com/v2.4/'.$pageID.'/feed?fields='.$fields.'&limit='.$limit.'&access_token='.$accessToken;
}
$json_object = fetchUrl($url);
$FBdata = json_decode($json_object, true);
//parse the received data in your desired format
//display data
//get the next link, construct and set a url
$nextLink = $FBdata['paging']['next'];
$prevLink = $FBdata['paging']['previous'];
//Your next and previous form here
I could use the HTTP GET method, but I don't like ugly long URLs, so I am using POST to pass the next and previous URLs. Note that I am using cURL instead of the PHP SDK. This is a simplified example and needs more work.
I am not writing this as an answer because it is just a workaround, not a solution; I am still looking to do it with the PHP SDK. I just could not get hold of the SDK-generated URL. Any input?
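One possible refinement of this workaround, sketched under assumptions: instead of posting the whole paging URL (which contains the access token) back through the form, only the cursor parameters could be passed around and the URL rebuilt on the server. pagingParams is a hypothetical helper, and the list of cursor keys is taken solely from the sample response above.

```php
<?php
// Hypothetical helper: extract only the cursor parameters from a Graph API paging URL.
function pagingParams(string $pagingUrl): array
{
    $query = parse_url($pagingUrl, PHP_URL_QUERY);
    if (!is_string($query)) {
        return []; // no query string, nothing to extract
    }
    parse_str($query, $params);
    // Keys seen in the sample "paging" block; the access token is deliberately dropped.
    $wanted = ['since', 'until', '__paging_token', '__previous'];
    return array_intersect_key($params, array_flip($wanted));
}

// Possible use (variables as in the snippet above):
// $cursor = pagingParams($FBdata['paging']['next']);
// $url = 'https://graph.facebook.com/v2.4/'.$pageID.'/feed?'.http_build_query(
//     ['fields' => $fields, 'limit' => $limit, 'access_token' => $accessToken] + $cursor
// );
```

This keeps the form fields short and means the server never fetches an arbitrary URL supplied by the browser.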

Trouble with assigning variables in php arrays [duplicate]

This question already has answers here:
Reference: What is variable scope, which variables are accessible from where and what are "undefined variable" errors?
(3 answers)
Closed 7 years ago.
I have this snippet of a function inside a larger php class.
I also get some simple data from my database which I have trouble getting into the array when fetching the variables.
$json = array();
$params = array(
'receiver_name' => $var,
'receiver_address1' => 'this works'
);
$ch = curl_init();
$query = http_build_query($params);
curl_setopt($ch, CURLOPT_URL, self::API_ENDPOINT . '/' . 'shipments/imported_shipment');
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $query);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$output = curl_exec ($ch);
$http_code = curl_getinfo( $ch, CURLINFO_HTTP_CODE);
curl_close ($ch);
$output = json_decode($output, true);
I've tried to assign the variable before the array starts, like this:
$var = $row['field_1'];
But it doesn't work when I try to insert this code. So I then tried:
$var2 = "'$var'"; // but quickly went back.
Since I thought it might have something to do with the quotes. I've tried several other ways of getting the variable to pass properly into the array. But when I don't know exactly what to read up on, I get stuck like this.
$params = array(
'receiver_name' => ".$var.", //this prints ..
'receiver_name' => "'.$var.'", //this prints '..'
Perhaps someone here can tell me what terminology I should search for to find more information on this subject, as I seemingly have a lot to read up on?
Update
I think everything should be cut out and simplified nicely now?
<?php
while($row = mysql_fetch_array($retval, MYSQL_ASSOC))
{
$p_name = $row['input_name'];
}
echo $p_name; // echo's correctly
class labels {
public function myFunction() {
$json = array();
$params = array(
'token' => $this->_token,
'name' => $p_name,
'hardcoded' => 'works fine'
);
}
}
echo $p_name; // echo's correctly
?>
You're trying to use a variable that is declared outside the local scope of the method. Variables from the global scope are not visible inside a function or method, so you need to pass that variable as a parameter to the class method.
class labels {
public function myFunction($p_name) {
$json = array();
$params = array(
'token' => $this->_token,
'name' => $p_name,
'hardcoded' => 'works fine'
);
}
}
And when you call your method
$labels = new labels();
$labels->myFunction($p_name);
Then it should work.

writing cURL like function in a rails app

I'm trying to convert this PHP cURL function to work with my Rails app. The piece of code is from an SMS payment gateway that needs to verify the POST parameters. Since I'm a big PHP noob, I have no idea how to handle this problem.
$verify_url = 'http://smsgatewayadress';
$fields = '';
$d = array(
'merchant_ID' => $_POST['merchant_ID'],
'local_ID' => $_POST['local_ID'],
'total' => $_POST['total'],
'ipn_verify' => $_POST['ipn_verify'],
'timeout' => 10,
);
foreach ($d as $k => $v)
{
$fields .= $k . "=" . urlencode($v) . "&";
}
$fields = substr($fields, 0, strlen($fields)-1);
$ch = curl_init($verify_url); //creates a cURL handle for $verify_url; no connection is made until curl_exec() is called
curl_setopt($ch, CURLOPT_POST, 1); //sets the request method to POST
curl_setopt($ch, CURLOPT_POSTFIELDS, $fields); //the data sent via POST; here the URL-encoded string built in the foreach loop above
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); //if the target URL answers with a redirect header, cURL follows it
curl_setopt($ch, CURLOPT_HEADER, 0); //leaves the response headers out of the output
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); //makes curl_exec() return the response body as a string instead of printing it
$result = curl_exec($ch); //sends the data via POST to the URL and returns the response body (false on failure)
if ($result == true)
{
//confirmed
$can_download = true;
}
else
{
//failed
$can_download = false;
}
}
if (strpos($_SERVER['REQUEST_URI'], 'ipn.php'))
echo $can_download ? '1' : '0'; //we tell the sms sever that we processed the request
I've googled a cURL lib counterpart in Rails and found a ton of options but none that I could understand and use in the same way this script does.
If anyone could give me a hand with converting this script from php to ruby it would be greatly appreciated.
The most direct approach might be to use the Ruby curb library, which is the most straightforward wrapper for cURL. A lot of the options in Curl::Easy map directly to what you have here. A basis might be:
url = "http://smsgatewayadress/"
Curl::Easy.http_post(url,
Curl::PostField.content('merchant_ID', params[:merchant_ID]),
# ...
)
