Curl - Php get element content that it's updated with ajax - php

I have this code:
$status_url = $site_properties["status_url"];
//$listeners_url = $site_properties["listeners_url"];
//$messages_url = $site_properties["messages_url"];
//$html = file_get_html($status_url);
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $status_url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$res = curl_exec($ch);
curl_close($ch);
$dom = new DomDocument();
# $dom->loadHTML($res);
$radio_listeners = $dom->getElementById('listeners_cont');
echo $radio_listeners->textContent;
I was wondering how can i write this script to wait a few seconds (10 for example), so the setInterval ajax in $status_url page, will be started and all the fields will be updated correctly.
Some screen shot to explain:

curl and PHP doesn't execute any javascript at all. You need something like PhantomJS - or just get the AJAX call in your browser's network tab and implement this call only.

Related

Open a URL with a PHP within it using cURL

I am trying to fetch data from a URL using cURL...this is something I would be able to do but the URL that I want to fetch data from has a section of php code within it - i.e. the URL is populated using a values generated by an earlier section of code.
I want to access the URL, take the figures out of it and then plug those figures into something else. At the minute I am hitting a bit of a roadblock as the php doesn't populate the URL...any ideas?
The code I am using is:
$url = "https://*START OF API ADDRESS*<?php echo $request_ids; ?>*END OF API ADDRESS*";
$ch = curl_init();
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_URL,$url);
$result=curl_exec($ch);
curl_close($ch);
//var_dump(json_decode($result, true));
$rates = json_decode($result, true);
// Save result
$rates['last_updated'] = time();
file_put_contents($file, json_encode($rates));
}

how do php scripts for data mining from web pages work?

[Edited for better explanation and code included]
Hi! I have a php script on my web server that logs in to my heat pump web interface nibeuplink.com and gets all my temperature readings and so forth and returns them in a json-format.
freeboard.io is a free service for visualizing data, so I'm making a freeboard.io for my heat pump values. in freeboard.io I can add any json data as a data source, so I have added the link to my php-script. It fetches the data once but it seems there is some kind of cached values that it uses after that so they are not updated with new values from the script. freeboard.io uses a get-function to get the url. If i use a normal web browser to run the php script and refresh it, the values are updated - and also immediately updated in freeboard.io. Freeboard.io has a setting to automatically update the data source every 5 seconds.
It seems that there is something that triggers the script correctly when it is fetched from my web browser, but not when it is fetched from freeboard.io that uses a get function every 5 seconds to get new data.
in freeboard I can add headers to the get request, is there some header that would help me here to discard any cached data?
I hope that explains my problem better.
Is there anything i can add to my code in the beginning to always force an override of any cached data?
<?php
/*
* read nibe heatpump values from nibeuplink status web page and return them in json format.
* based on: https://www.symcon.de/forum/threads/25663-Heizung-Nibe-F750-Nibe-Uplink-auslesen-auswerten
* to get the code which is required as parameter, log into nibe uplink, open status page of your heatpump, and check url:
* https://www.nibeuplink.com/System/<code>/Status/Overview
*
* usage: nibe.php?email=<email>&password=<password>&code=<code>
*/
// to add additional debug output to the resulting page:
$debug = false;
date_default_timezone_set('Europe/Helsinki');
$date = time();
// Create temp file to store cookies
$ckfile = tempnam ("/tmp", "CURLCOOKIE");
// URL to login page
$url = "https://www.nibeuplink.com/LogIn";
// Get Login page and its cookies and save cookies in the temp file
$ch = curl_init();
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); // Accepts all CAs
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_COOKIEJAR, $ckfile); // Stores cookies in the temp file
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$output = curl_exec($ch);
// Now you have the cookie, you can POST login values
$ch = curl_init();
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_POST, 2);
curl_setopt($ch, CURLOPT_POSTFIELDS, "Email=".$_GET['email']."&Password=".$_GET['password']);
curl_setopt($ch, CURLOPT_COOKIEFILE, $ckfile); // Uses cookies from the temp file
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // Tells cURL to follow redirects
$output = curl_exec($ch);
curl_setopt($ch, CURLOPT_URL, "https://www.nibeuplink.com/System/".$_GET['code']."/Status/ServiceInfo");
curl_setopt($ch, CURLOPT_COOKIEFILE, $ckfile);
curl_setopt($ch, CURLOPT_POST, 0);
$result = curl_exec($ch);
$pattern = '/<h3>(.*?)<\/h3>\s*<table[^>]*>.+?<tbody>(.+?)<\/tbody>\s*<\/table>/s';
if ($debug) echo "pattern: <xmp>".$pattern."</xmp><br>";
$pattern2 = '/<tr>\s*<td>(.+?)<span[^>]*>[^<]*<\/span>\s*<\/td>\s*<td>\s*<span[^>]*>([^<]*)<\/span>\s*<\/td>\s*<\/tr>/s';
if ($debug) echo "pattern2: <xmp>".$pattern2."</xmp><br>";
preg_match_all($pattern, $result, $matches);
// build json format from matches
echo '{';
$first = true;
foreach ($matches[1] as $i => $title) {
echo ($first ? '"' : ',"').trim($title).'":{';
$content = $matches[2][$i];
preg_match_all($pattern2, $content, $values);
$nestedFirst = true;
foreach ($values[1] as $j => $field) {
echo ($nestedFirst ? '"' : ',"').trim($field).'":"'.$values[2][$j].'"';
$nestedFirst = false;
}
echo "}";
$first = false;
}
echo ",\"time\":{\"Last fetch\":\"$date\"}";
echo "}";
if ($debug) {
echo "<pre><xmp>";
echo print_r($matches);
echo "<br><br>";
echo $result;
echo "</xmp></pre>";
}
?>
You can make an ajax call to php script to refresh the part of webpage. I don't understand what do you mean by io i.e. are you talking about fetching the data from database and if any changes occurred in database then only newly added records must be fetched. If you mean it in that sense then you can use cookie to track any new records added into database and only if it finds new records it can make ajax call to php script to run your algorithm on fetched total dataset.

curl_exec returns empty string

I'm still a bit new to using curl to pull data and I've recently started using Fiddler to help find what options need to be set.
I'm trying to see if I can pull an image from a site. I first hit a search page - I set the search parameters, then start hitting links in the results. When I attempt to go a link in one of the results for an image, I get an empty string returned from curl_exec().
The weird thing is - at one point, it worked - I got the data back and successfully saved the image locally. But then it stopped, and I have no idea what I was doing to have it working. Naturally, everything works OK in the browser. :(
I'm using Simple HTML DOM to parse through results and cUrl for the actual page requests. curl_error() does not show an error, curl_getinfo() thinks everything is OK too. It's probably something trivial, but I'm not sure how to troubleshoot it beyond where I am.
<?php
include 'includes/simple_html_dom.php';
$url = "http://nwweb.co.bell.tx.us/NewWorld.Aegis.WebPortal/Corrections/InmateInquiry.aspx";
// Get Cookie - ASP.NET_SessionId
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
$r = curl_exec($ch);
preg_match_all('/^Set-Cookie:\s*([^;]*)/mi', $r, $matches);
$cookies = array();
foreach($matches[1] as $item)
{
parse_str($item, $cookie);
$cookies = array_merge($cookies, $cookie);
}
$sessionCookie = "ASP_NET_SessionId=".$cookies['ASP_NET_SessionId'];
// now load up page into Simple HTML DOM and get all inputs - ignore buttons and populate our dates
$startDate = "02%2F01%2F2000";
$endDate = "02%2F07%2F2016";
$getInputs = str_get_html($r);
$inputs = $getInputs->find('input');
$inputs_array = array();
$buttons_array = array();
for ($i=0; $i<count($inputs); $i++)
{
if ($inputs[$i]->type != "submit")
{
$inputs_array[$inputs[$i]->id] = $inputs[$i]->value;
if (stripos($inputs[$i]->id, "FromDate") > 0)
$inputs_array[$inputs[$i]->id] = $startDate;
if (stripos($inputs[$i]->id, "ToDate") > 0)
$inputs_array[$inputs[$i]->id] = $endDate;
}
}
// build up our curl data - includes hidden inputs, our to & from dates, plus the Search button
$curl_data = http_build_query($inputs_array)."&ctl00%24DefaultContent%24uxSearch=Search";
// POST the data, include session cookie
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $curl_data);
curl_setopt($ch, CURLOPT_COOKIE, $sessionCookie);
$response = curl_exec($ch);
// this shows that we can get data
// find the links from the HTML
$htmlDom = str_get_html($response); // load up Simple HTML DOM
// get the table of results
$divTable = $htmlDom->find('div#ctl00_DefaultContent_uxResultsWrapper',0)->find('table',0);
$rows = $divTable->find('tr');
for ($i=1; $i<count($rows);$i++)
{
if ($i>3) break; // limit the length of script for debugging
$link = $rows[$i]->find('td',1)->find('a',0)->href;
// build up query to get inmate details from the link above
$url = "http://nwweb.co.bell.tx.us/NewWorld.Aegis.WebPortal/Corrections/".$link;
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_COOKIE, $sessionCookie);
$page = curl_exec($ch);
$pageData = str_get_html($page);
// Now find the Photo, there's a thumb in div.BookingPhotos
// It is linked to a full size image, the link is of the form http://nwweb.co.bell.tx.us/NewWorld.Aegis.WebPortal/GetImage.aspx?ImageKey=17C030IS, but in the href, it has ../GetImage.aspx?ImageKey=xxxx
$photoLink = $pageData->find('div.BookingPhotos',0)->find('a',0)->href;
// get rid of .. and put the base URL on the front
$imgLink = str_replace("..", "http://nwweb.co.bell.tx.us/NewWorld.Aegis.WebPortal", $photoLink);
// now attempt to pull the image
$ch = curl_init($imgLink);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_COOKIE, $sessionCookie);
// here is the PROBLEM - NO DATA RETURNED
$imgData = curl_exec($ch); // I get a header back, but NO data
}
?>

How to use GET on an external URL

I am using an API that returns JSON from a GET request
Eg.
https://api.domain.com/v1/Account/{auth_id}/Call/{call_uuid}
Returns
{
"call_duration": 4,
"total_amount": "0.00400"
}
How can I call this page from within a script and save call_duation and total_amount as separate variables?
Something like the following?:
$call_duration =
$_GET[https://api.domain.com/v1/Account/{auth_id}/Call/{call_uuid}, 'call_duration'];
$_GET[] contains the get parameters that are passed to your code - they don't generate a GET request.
You could use curl to make your request:
$ch = curl_init("https://api.domain.com/v1/Account/{auth_id}/Call/{call_uuid}");
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$output = curl_exec($ch);
curl_close($ch);
$result = json_decode($output);
If PHP has allow_url_fopen enabled you can simply do
json_decode(file_get_contents('https://api.domain.com/v1/Account/{auth_id}/Call/{call_uuid}'))
Otherwise you'll have to resort to using something like Curl to get the request going. $_GET is a superglobal array which doesn't actually do anything. It only contains what the script was started with. It does not make any requests itself.
Use curl to get the JSON, then json_decode to decode it into PHP variables
$auth_id = 'your auth id here';
$call_uuid = 'your call_uuid here';
// initialise curl, set URL and options
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,"https://api.domain.com/v1/Account/{$auth_id}/Call/{$call_uuid}");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 0);
// get the response and decode it
$response = curl_exec($ch);
curl_close($ch);
$response = json_decode($response);
$call_duration = $response['call_duration'];
$total_amount = $response['total_amount'];

how to read html body content using cURL

I'm using the following code to get response from a requested page using php
$ch = curl_init('http://myPageURL/');
curl_setopt($ch, CURLOPT_HEADER, 1);
$c = curl_exec($ch);
echo curl_getinfo($ch, CURLINFO_HTTP_CODE);
Here the response is showing header and other info including body content.
but i only need the body content as a response, what would be the code to do it ?
thanks in advance
$ch = curl_init('http://myPageURL/');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_AUTOREFERER, TRUE);
$result = curl_exec($ch);
echo $result;
This will give the contents and I made it to add the content in the result variable and added some settings to make sure the content is received if the page you are going is redirecting to another.
since you are only interested in the body tag you can do something like this:
<?php
$response = curl_getinfo($ch, CURLINFO_HTTP_CODE);
$start = stripos($response, "<body");
$end = stripos($response, "</body");
$body = substr($response,$start,$end-$start);
?>
This is just a quick example how you can do this. But be aware that there can be several body tags in a page (if iframes are used). And also the body tag can contain attributes.

Categories