Extracting TD Information with DOMXPath - php

I'm using the following code to get information from a specific table column, but I keep getting the following error:
Object of class DOMElement could not be converted to string
Here is the table column I'm trying to get information from:
<td id="billing_address">1099 Somewhere Lane<br /> Some City, IL 60118</td>
Here is the CURL script to get the information:
$url = "http://www.somesite.com";
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout );
curl_setopt($ch, CURLOPT_POST, 3);
curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_ANY);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/x-www-form-urlencoded'));
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_fields);
$result = curl_exec($ch);
$document = new DOMDocument();
@$document->loadHTML($result);
$xp = new DOMXpath($document);
$nodes = $xp->query('//td[@id="billing_address"]');
$node = $nodes->item(0);
$client_info = $node;
echo $client_info;
curl_close($ch);
Thanks to Lauris, I was able to get the content to display using this:
$client_info = $node->nodeValue;
echo $client_info;
The problem now is that the line break tags separating the lines are missing from the echoed output. I need to be able to tell apart the lines that those tags separate.
I figured I would just grab whatever was between the td tags with the id of billing_address, and then use the PHP explode function to split out the individual lines. But the line break tags aren't there for me to split on.
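nodeValue returns only the concatenated text of the cell, so the line break elements are dropped. One way to keep the lines apart, sketched here against the sample cell from the question (the inline $html string is a stand-in for the cURL result), is to walk the cell's child nodes and collect each text node separately:

```php
<?php
// Stand-in for the cURL response; only the sample cell from the question.
$html = '<table><tr><td id="billing_address">1099 Somewhere Lane<br /> Some City, IL 60118</td></tr></table>';

$document = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$document->loadHTML($html);
libxml_clear_errors();

$xp = new DOMXPath($document);
$node = $xp->query('//td[@id="billing_address"]')->item(0);

$lines = array();
if ($node !== null) {
    foreach ($node->childNodes as $child) {
        // The <br /> elements are skipped; each text node between them is one line.
        if ($child->nodeType === XML_TEXT_NODE) {
            $text = trim($child->nodeValue);
            if ($text !== '') {
                $lines[] = $text;
            }
        }
    }
}

print_r($lines);
```

With the lines already in an array, no explode call is needed. If the raw markup is preferred instead, $document->saveHTML($node) returns the cell with its tags intact.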

Related

PHP Curl failed to post data

I am coding an auto-booking appointment script using PHP. There is a calendar with available dates to book.
I can successfully log in, get random dates, and get the parameters of the available dates, but I finally fail to post the data and book the appointment.
Once booking works with this simple script, I need to add a condition: if there are available dates, try to book; otherwise keep refreshing.
<?php
set_time_limit(0);// to infinity for
$ch = curl_init();
$headers[] = "Accept: */*";
$headers[] = "Connection: Keep-Alive";
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_URL, 'https://example.com/login/login.php');
curl_setopt($ch, CURLOPT_COOKIEFILE, 'cookie.txt');
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie.txt');
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
$co = curl_exec($ch);
$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($co);
# Parse the HTML.
# libxml_use_internal_errors(true) above suppresses any warnings that
# loadHTML might throw because of invalid HTML in the page.
$xpath = new DOMXPath($doc);
$val1 = $xpath->query('//input[@name="_sid"]/@value')->item(0)->nodeValue;
echo $val1;
echo '<br/>';
$field['process'] = 'login';
$field['_sid'] = $val1;
$field['email'] = 'myemail@example.com';
$field['pwd'] = '123456';
$datafield = http_build_query($field);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_COOKIEFILE, 'cookie.txt');
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie.txt');
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $datafield);
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_exec($ch);
curl_setopt($ch, CURLOPT_URL, 'https://example.com/login/myapp.php?fg_id=5568094');
$cur = curl_exec($ch);
$do = new DOMDocument(); // new DOMDocument to get the URLs from the calendar of available dates
libxml_use_internal_errors(true);
$do->loadHTML($cur);
# Parse the HTML.
# libxml_use_internal_errors(true) above suppresses any warnings that
# loadHTML might throw because of invalid HTML in the page.
$xpath = new DOMXPath($do);
$onClickAttrNodeList = $xpath->query('//a[@class="dispo"]/@onclick'); // node list containing the URLs
$array = array(); // CONVERT NODE LIST OBJECT TO ARRAY
foreach($onClickAttrNodeList as $node){
$array[] = $node;
}
$x=array();
foreach($array as $node) {
for($i = 0; $i < 10; ++$i) {
$x[] = $node->nodeValue; //PARSE ALL LINK AS TABLE
}
}
$randlink = array_rand($x, 10); // get random links from the calendar of available dates
$link = $x[$randlink[0]];
echo '<br/>';
preg_match_all('#\bhttps?://[^,\s()<>]+(?:\([\w\d]+\)|([^,[:punct:]\s]|/))#', $link, $match); //Get URL from the last array
echo "<pre>";
$url = $match[0];
print_r($url[0]);
echo'<br/>';
parse_str( parse_url( $url[0], PHP_URL_QUERY), $arrayurl ); // get the parameters from the URL of the available date to book
var_dump($arrayurl);
/* In this part of the code
I am trying to post the
parameters to book an
appointment; this is the step that fails. */
$fieldbook['timestamp'] = $arrayurl[0];
$fieldbook['skey'] = $arrayurl[1];
$fieldbook['process'] = $arrayurl[2];
$fieldbook['what'] = $arrayurl[3];
$fieldbook['fg_id'] = $arrayurl[4];
$fieldbook['result'] = $arrayurl[5];
$fieldbook['issuer_view'] = $arrayurl[6];
$datafieldbook = http_build_query($fieldbook);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_URL, 'https://example.com/login/action.php');
curl_setopt($ch, CURLOPT_COOKIEFILE, 'cookie.txt');
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie.txt');
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $datafieldbook);
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_exec($ch);
curl_setopt($ch, CURLOPT_URL, 'https://example.com/login/myapp.php?fg_id=5568094');
$book = curl_exec($ch);
echo'<br/>';
echo $book;
curl_close($ch);
?>
Thank you.
Problem solved: I just added a user agent and some headers.
Thank you, guys :-).
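Separately from the user agent fix, one detail in the script above may be worth checking: parse_str() fills an array keyed by parameter name, so numeric indices such as $arrayurl[0] would be undefined. A minimal sketch, assuming the booking link names its query parameters the same way as the $fieldbook keys (the link and its values below are made up, since the real one is not shown):

```php
<?php
// Hypothetical booking link; parameter names mirror the $fieldbook keys above,
// the values are invented for illustration.
$link = 'https://example.com/login/action.php?timestamp=1400000000&skey=abc123'
      . '&process=book&what=slot&fg_id=5568094&result=0&issuer_view=1';

parse_str(parse_url($link, PHP_URL_QUERY), $params);

// parse_str() keys the array by parameter name, not by position.
$wanted = array('timestamp', 'skey', 'process', 'what', 'fg_id', 'result', 'issuer_view');
$fieldbook = array();
foreach ($wanted as $name) {
    if (isset($params[$name])) {
        $fieldbook[$name] = $params[$name];
    }
}

// Explicit '&' separator so the result does not depend on the ini settings.
$datafieldbook = http_build_query($fieldbook, '', '&');
echo $datafieldbook, "\n";
```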

HTML <object>: Issue when formatting PHP call result

I have a php page which returns a simple integer. I'm calling this page using the html <object> tag. I get the result but I cannot format it by changing size, color and the like.
Example of the php:
$url = 'https://mydomain/execute.json';
$request = $url;
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "GET");
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-type: application/json'));
curl_setopt($ch, CURLOPT_URL, $request);
curl_setopt($ch, CURLOPT_USERPWD, "fih#XXXXXXXXXXXXX");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
$json_data = curl_exec($ch);
curl_close($ch);
$data1 = json_decode($json_data, true);
$data1 = $data1['rows'];
echo count($data1);
?>
I call this php file like this:
<html>
<body>
<span style="font-size:500%"><object width="400" height="400" data="http://anotherdomain/ikketildelte.php"></object></span>
</body>
</html>
The result is an integer, but the styling has no effect on it.
The Chrome developer tools show that it's embedded in a #document node.
Does anyone have an idea how to format this result?
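An object element embeds a separate nested document (the #document node shown in the developer tools), and CSS from the outer page does not cascade into it. One possible approach, assuming the outer page can itself run PHP, is to fetch the number server-side and print it directly inside the styled element. The helper name renderCount and the commented-out fetch are hypothetical:

```php
<?php
// Hypothetical helper: wraps the number in a styled <span> in the page itself,
// so the outer page's CSS applies to it.
function renderCount($count, $fontSize = '500%')
{
    $safeSize = htmlspecialchars($fontSize, ENT_QUOTES, 'UTF-8');
    return '<span style="font-size:' . $safeSize . '">' . (int) $count . '</span>';
}

// In the real page the value would come from the other script, for example:
// $count = (int) file_get_contents('http://anotherdomain/ikketildelte.php');
$count = 42; // placeholder value for this sketch

echo renderCount($count), "\n";
```

The cast to int keeps anything unexpected in the remote response out of the markup.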

Get currency exchange rate from bank site

I'm trying to get content from a bank's site using cURL.
http://www.zaba.hr/home/wps/wcm/connect/zaba_hr/zabapublic/tecajna
The site is unusual because it uses AJAX to fill the currency exchange table. There is a link for downloading the data to a file, but you need the same session ID to be able to use it.
I'm trying this code:
$url="http://www.zaba.hr/home/wps/wcm/connect/zaba_hr/zabapublic/tecajna";
$useragent = $_SERVER['HTTP_USER_AGENT'];
$ch = curl_init();
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
curl_setopt($ch, CURLOPT_URL,$url);
$cl = curl_exec($ch);
$dom = new DOMDocument();
@$dom->loadHTML($cl);
@$link = $dom->getElementById('tecajPrn');
echo $suburl = "http://www.zaba.hr".$link->getAttribute('href');
After this I get a link to the file, but I can't open it.
Another strange thing is that the link I get with cURL is http://www.zaba.hr/home/ZabaUtilsWeb/utils/tecaj/danasPrn but the real link when I click on the icon is http://www.zaba.hr/ZabaUtilsWeb/utils/tecaj/prn/62/2014
You are probably running into a cookie and AJAX issue. Here is a workaround. Try this:
First, send a request to the page to obtain the cookie.
$url="http://www.zaba.hr/home/wps/wcm/connect/zaba_hr/zabapublic/tecajna";
$ch = curl_init();
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, "mozilla 5.0");
curl_setopt($ch, CURLOPT_URL,$url);
curl_setopt($ch, CURLOPT_COOKIEFILE,"cookie.txt");
curl_setopt($ch, CURLOPT_COOKIEJAR,"cookie.txt");
$cl = curl_exec($ch);
curl_close($ch);
After that make another curl request. This time to obtain the json data:
$ch = curl_init();
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, "mozilla 5.0");
curl_setopt($ch, CURLOPT_HTTPHEADER, array("X-Requested-With: XMLHttpRequest", "Referer: http://www.zaba.hr/home/wps/wcm/connect/zaba_hr/zabapublic/tecajna"));
curl_setopt($ch, CURLOPT_URL,"http://www.zaba.hr/ZabaUtilsWeb/utils/tecaj/danas");
curl_setopt($ch, CURLOPT_COOKIEFILE,"cookie.txt");
curl_setopt($ch, CURLOPT_COOKIEJAR,"cookie.txt");
$cl = curl_exec($ch);
curl_close($ch);
Your JSON is now available in the $cl variable. Parse it using json_decode():
// now parse json from $cl
print $cl;
From there, extract whatever fields you need.
Note: Make sure you have write permission for the cookie.txt file. Also, it's better to use an absolute path like c:/test/cookie.txt or /var/tmp/cookie.txt.
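The json_decode() step could look roughly like the sketch below. The sample string and its field names (currency, rate) are invented for illustration and may not match what the bank's endpoint actually returns:

```php
<?php
// Placeholder for the $cl string returned by curl_exec(); the structure and
// field names are assumptions made for this sketch only.
$cl = '[{"currency":"EUR","rate":"7.62"},{"currency":"USD","rate":"5.53"}]';

$rates = json_decode($cl, true);   // true => associative arrays instead of objects

if ($rates === null && json_last_error() !== JSON_ERROR_NONE) {
    // curl_exec() may have returned false or a non-JSON error page.
    die('Could not decode response: ' . json_last_error_msg());
}

foreach ($rates as $row) {
    echo $row['currency'], ': ', $row['rate'], "\n";
}
```

Running var_dump($rates) on the real response first shows which keys are actually present.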

Curl request - invalid xml request when posting user&password after xml

I need to send the following request, which is XML with a username and password on the end of the xml. I can get this to work as a URL when I paste into the browser:
URL?XML=<>...</>&user=123456&password=123456
But when I try to create the curl call it says it's an invalid XML request.
I've tried having the user&password within the POSTFIELDS. I've also tried setting them as a variable before $trial ='&user=123456&password=123456' but still can't seem to get it to work.
$xml = <<<ENDOFXML
<?xml version="1.0" encoding="UTF-8"?><xmldata>...</xmldata>
ENDOFXML;
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-type: application/x-www-form-urlencoded'));
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, 'xml='.urlencode($xml).urlencode('&user=123456&password=123456'));
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_close($ch);
This is wrong:
curl_setopt($ch, CURLOPT_POSTFIELDS, 'xml='.urlencode($xml).urlencode('&user=123456&password=123456'));
The separators (& and =) must stay unencoded. URL-encode only the variable names and the values (or only the values, if you know the variable names are safe):
curl_setopt($ch, CURLOPT_POSTFIELDS, 'xml='.urlencode($xml).'&user=' . urlencode('123456').'&password=' . urlencode('123456'));
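As an alternative sketch, http_build_query() encodes every value and joins the pairs with unencoded separators, which avoids hand-building the string. The field names and values are the ones from the question; the XML body is the same elided placeholder:

```php
<?php
$xml = '<?xml version="1.0" encoding="UTF-8"?><xmldata>...</xmldata>';

// Each value is URL-encoded individually; the '&' and '=' separators are not.
$post = http_build_query(array(
    'xml'      => $xml,
    'user'     => '123456',
    'password' => '123456',
), '', '&');

echo $post, "\n";

// With an existing cURL handle $ch this would be passed as:
// curl_setopt($ch, CURLOPT_POSTFIELDS, $post);
```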

How to fix unrecognizable characters when fetching this page with cURL

My code is below. This problem is different from my other question: I can fetch the page with cURL successfully, but the characters I see are unrecognizable, e.g. "��Ƶ�̳̣�2012�棩". How can I make them display normally?
$cookie_file = tempnam('./temp','cookie');
$login_url = 'http://bbs.php100.com/login.php';
$post_fields = 'cktime=3600&step=2&pwuser=username&pwpwd=password';
$ch = curl_init($login_url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_fields);
curl_exec($ch);
curl_close($ch);
$url = 'http://bbs.php100.com/index.php';//or specific page
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 0);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file);
$contents = curl_exec($ch);
//preg_match("",$contents,$arr);
//echo $arr[1];
curl_close($ch);
You need to save the contents of the page to a variable and convert its encoding.
$url = 'http://bbs.php100.com/index.php';//or specific page
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_HEADER, 0);
// Note "1"! It is needed for curl_exec() to return contents of the page
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file);
$contents = curl_exec($ch);
$contents = iconv('GBK', 'UTF-8', $contents);
echo $contents;
If your own page does not use UTF-8, set the second parameter of iconv to the encoding you need.
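To see the conversion in isolation, here is a small sketch that converts a known GBK byte sequence to UTF-8. It assumes the source page really is GBK-encoded, which is worth confirming from its Content-Type header or meta tag:

```php
<?php
// The two characters "你好" encoded as GBK bytes, standing in for the fetched page.
$contents = "\xC4\xE3\xBA\xC3";

// Source encoding first, target encoding second.
$utf8 = iconv('GBK', 'UTF-8', $contents);

if ($utf8 === false) {
    die('Conversion failed; the input may not be GBK-encoded.');
}

echo $utf8, "\n";
```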
