Display external html specific value - php

I want to display the "hiOutsideTemp" value from this HTML page, http://amira.meteokrites.gr/ZWNTANA.htm, on my own page. It's a temperature value.
I'm using the following code:
<?php
require_once 'simple_html_dom.php';
$html_string = file_get_contents('http://amira.meteokrites.gr/ZWNTANA.htm');
$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($html_string);
libxml_clear_errors();
$xpath = new DOMXpath($dom);
$values = array();
$row = $xpath->query('//td[@id="/html/body/table/tbody/tr[7]/td[2]"]');
foreach($row as $value) {
$values[] = trim($value->textContent);
}
echo '<pre>';
print_r($values);
?>
But I get no result.
What should I do?

Looking at the content at that URL, it appears to be a list of settings rather than an HTML page (this is only displayed if you don't have the actual HTML file included). The content looks something like this:
imerominia="29/03/18";
ora=" 9:37";
sunriseTime=" 7:10";
sunsetTime="19:37";
ForecastStr=" Mostly cloudy and cooler. Windy with possible wind shift to the W, NW, or N. ";
tempUnit="°C";
outsideTemp="9.3";
hiOutsideTemp="9.5";
hiOutsideTempAT=" 0:03";
lowOutsideTemp="6.2";
lowOutsideTempAT=" 4:24";
...
So you can load it as though it were in INI format, which gives you an associative array of the data.
$html_string = file_get_contents('http://amira.meteokrites.gr/ZWNTANA.htm');
$data = parse_ini_string($html_string);
echo $data["hiOutsideTemp"]; // outputs - 9.5
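A minimal, self-contained sketch of the same idea, assuming the feed keeps the key="value"; layout shown above. The $sample string is a stand-in for the remote file, and $hiTemp is a name introduced here for illustration. The trailing semicolons start an INI comment, so the parser ignores them. parse_ini_string() returns false on a syntax error, so it seems worth checking the result before indexing into it.

```php
<?php
// Sample standing in for the remote file (values copied from the excerpt above).
$sample = implode("\n", [
    'outsideTemp="9.3";',
    'hiOutsideTemp="9.5";',
    'hiOutsideTempAT=" 0:03";',
]) . "\n";

// parse_ini_string() returns an associative array, or false on failure.
$data = parse_ini_string($sample);

if ($data === false || !isset($data['hiOutsideTemp'])) {
    $hiTemp = null; // feed unreadable or key missing
} else {
    $hiTemp = $data['hiOutsideTemp'];
}

echo $hiTemp === null ? "not available" : $hiTemp, "\n";
```

With the live URL you would replace $sample with the result of file_get_contents(), and also check that call for false.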

Related

php read html and handle double id-appearance

For my project I'm reading an external website that uses the same ID twice. I can't change that.
I need the content from the second occurrence of that ID, but my code only returns the first one and does not see the second.
Also, count($data) returns 1, not 2.
I'm desperate. Does anyone have an idea how to access the second element with the ID 'hours'?
<?PHP
$url = 'myurl';
$contents = file_get_contents($url);
$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTMLFile($url);
$data = $dom->getElementById("hours");
echo $data->nodeValue."\n";
echo count($data);
?>
As @rickdenhaan points out, getElementById always returns a single element: the first one that has that id value. However, you can use DOMXPath to find all nodes with a given id value and then pick out the one you want (this code finds the second one):
$url = 'myurl';
$contents = file_get_contents($url);
$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTMLFile($url);
$xpath = new DOMXPath($dom);
$count = 0;
foreach ($xpath->query("//*[@id='hours']") as $node) {
if ($count == 1) echo $node->nodeValue;
$count++;
}
As @NigelRen points out in the comments, you can simplify this further by selecting the second element directly in the XPath, i.e.
$node = $xpath->query("(//*[@id='hours'])[2]")[0];
echo $node->nodeValue;
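For reference, a self-contained sketch of the XPath approach that runs without a remote URL. The inline markup and the values in it are made up for illustration; only the duplicated id 'hours' comes from the question. Checking the node list length first avoids calling a property on null when the page has fewer matches than expected.

```php
<?php
// Hypothetical markup with the same id used twice.
$html = '<div><span id="hours">9-17</span><span id="hours">10-14</span></div>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // duplicate ids make libxml complain
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$nodes = $xpath->query("//*[@id='hours']");

// item() is zero-based, so index 1 is the second occurrence.
$second = $nodes->length >= 2 ? trim($nodes->item(1)->textContent) : null;

echo $nodes->length, "\n";
echo $second ?? 'no second element', "\n";
```

Unlike getElementById, the node list here reports both matches, so its length can be used where the question tried count().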

How would I parse this html in php?

I've exported my Firefox bookmarks as HTML so I can download my extensive music collection onto my phone. My problem is that there is no easy way to do this that I know of.
My intention is to use PHP to parse the HTML into an array of the URLs.
Here's what the HTML looks like:
<DT><A HREF="https://www.youtube.com/watch?v=Ue8PpA557Bc">Don Diablo - Knight Time (Official Music Video) - YouTube</A>
How would I do this?
If $html contains a valid HTML string, you can parse it with DOMDocument and select the href attributes with XPath.
<?php
$html = '<DT><A HREF="https://www.youtube.com/watch?v=Ue8PpA557Bc">Don Diablo - Knight Time (Official Music Video) - YouTube</A>';
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DomXPath($doc);
$nodeList = $xpath->query("//a/@href");
$links_array = [];
foreach($nodeList as $node){
$links_array[] = $node->nodeValue;
}
echo "<pre>";
print_r($links_array);
echo "</pre>";
The output here is:
Array
(
[0] => https://www.youtube.com/watch?v=Ue8PpA557Bc
)
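Alternatively, without XPath, you can loop over the a elements directly (here $bookmarks holds the exported HTML):
$doc = new DOMDocument();
$doc->loadHTML($bookmarks);
$urls = [];
foreach ($doc->getElementsByTagName("a") as $node) {
$urls[] = $node->getAttribute("href");
}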

XPath Substring-After Help / Query/Evaluate?

I'm building a PHP script to transfer selected contents of an XML file to an SQL database.
One of the hardcoded XML contents is formatted like this:
<visualURL>
id=18144083|img=http://upload.wikimedia.org/wikipedia/en/8/86/Holyrollernovacaine.jpg
</visualURL>
And I'm looking for a way to just get the contents of the URL (all text after img=).
$Image = $xpath->query("substring-after(/Playlist/PlaylistEntry[1]/visualURL[1]/text(), 'img=')", $element)->item(0)->nodeValue;
This produces a property-of-non-object error in my PHP output.
There must be another way to just extract the URL contents using XPath that I want, no?
Any help would be greatly appreciated!
EDIT:
Here is the minimum code
<?php
$xmlDoc = new DOMDocument();
$xmlDoc->loadXML('<Playlist>
<PlaylistEntry>
<visualURL>
id=12582194|img=http://upload.wikimedia.org/wikipedia/en/9/96/Sometime_around_midnight.jpg
</visualURL>
</PlaylistEntry>
</Playlist>');
$xpath = new DOMXpath($xmlDoc);
$elements = $xpath->query("/Playlist/PlaylistEntry[1]");
if (!is_null($elements))
foreach ($elements as $element)
$Image = $xpath->query("substring-after(/Playlist/PlaylistEntry[1]/visualURL[1]/text(), 'img=')", $element)->item(0)->nodeValue;
print "Finished Item: $Image";
?>
EDIT 2:
After some research I believe I must use
$xpath->evaluate
instead of my current use of
$xpath->query
See this link:
Same XPath query is working with Google docs but not PHP
I'm not exactly sure how to do this yet, but I will investigate more in the morning. Again, any help would be appreciated.
You're heading in the right direction. Use DOMXPath::evaluate() for XPath expressions that don't return nodes, such as substring-after() (it returns a string, as documented in the linked page). The following code prints the expected output:
$xmlDoc = new DOMDocument();
$xml = <<<XML
<Playlist>
<PlaylistEntry>
<visualURL>
id=12582194|img=http://upload.wikimedia.org/wikipedia/en/9/96/Sometime_around_midnight.jpg
</visualURL>
</PlaylistEntry>
</Playlist>
XML;
$xmlDoc->loadXML($xml);
$xpath = new DOMXpath($xmlDoc);
$elements = $xpath->query("/Playlist/PlaylistEntry");
foreach ($elements as $element) {
$Image = $xpath->evaluate("substring-after(visualURL, 'img=')", $element);
print "Finished Item: $Image <br>";
}
Output:
Finished Item: http://upload.wikimedia.org/wikipedia/en/9/96/Sometime_around_midnight.jpg
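A small variation on the same approach, as a sketch: because the text inside visualURL is surrounded by line breaks, the extracted string carries that whitespace with it. Wrapping the expression in the XPath function normalize-space() trims it before it reaches PHP. The variable names $entry and $url are introduced here for illustration.

```php
<?php
$xml = <<<XML
<Playlist>
<PlaylistEntry>
<visualURL>
id=12582194|img=http://upload.wikimedia.org/wikipedia/en/9/96/Sometime_around_midnight.jpg
</visualURL>
</PlaylistEntry>
</Playlist>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// query() returns nodes; evaluate() can also return a string, number or boolean.
$entry = $xpath->query('/Playlist/PlaylistEntry')->item(0);

// The path is relative to $entry; normalize-space() strips the surrounding newlines.
$url = $xpath->evaluate("normalize-space(substring-after(visualURL, 'img='))", $entry);

echo $url, "\n";
```

If the stored value needs to go into a database column, trimming at this stage avoids storing a URL with a trailing newline.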

xpath extract complete html

I am trying to extract a complete table including the HTML tags, with XPath, that I can store in a variable, do a bit of string replacement on, then echo directly to the screen. I have found numerous posts on getting the text out of the table but I want to retain the HTML formatting since I am just going to display it (after minor modification).
At present I am extracting the table using string functions stristr, substr etc. but I would prefer to use XPath.
I can display the contents of the table with the following but it just displays the table TD fields with no formatting. It also does not store it in a variable that I can manipulate.
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$arr = $xpath->query('//table');
foreach($arr as $el) {
echo $el->textContent;
}
I tried this but got no output:
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$arr = $xpath->query('//table');
echo $arr->saveHTML();
Use DOMNode::C14N():
foreach($arr as $el) {
echo $el->C14N();
}
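Another option, offered as a sketch rather than as the answer's method: DOMDocument::saveHTML() accepts a node argument and returns the markup for just that node as a string, which covers the "store it in a variable and modify it" part of the question. The sample table and the class name "cell" below are made up for illustration.

```php
<?php
// Hypothetical input; in the question this would be the fetched page.
$html = '<html><body><table><tr><td>a</td><td>b</td></tr></table></body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// query() returns a DOMNodeList, which has no saveHTML() method of its own;
// serialise each node through the owning document instead.
$table = $xpath->query('//table')->item(0);
$markup = $table !== null ? $dom->saveHTML($table) : '';

// The markup is now an ordinary string, so string replacement works on it.
$markup = str_replace('<td>', '<td class="cell">', $markup);

echo $markup, "\n";
```

This also suggests why the second attempt in the question printed nothing: saveHTML() belongs to the document, not to the list returned by query().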

Keep new line, when the HTML is on 1 line and new line layout is done with <div>

I need to get content from a site. Specifically, I need the contents of
/html/body/div/div[2]/table/tbody/tr/td/div/div[2]/form/fieldset[2]/table[2]
or
<table class='properties'>
(the code is visible here: http://paste.pocoo.org/show/347881/), with each piece of content on its own line.
I don't care about padding or other formatting; I just want to keep the new lines.
For example a proper output would be
tájékoztató
az eljárás eredményéről
A Közbeszerzések Tanácsa (Szerkesztőbizottsága) tölti ki
A hirdetmény kézhezvételének dátuma____________________
KÉ nyilvántartási szám_________________________________
I. SZAKASZ: AJÁNLATKÉRŐ
I.1) Név, cím és kapcsolattartási pont(ok)
The problem I face is that the new lines are created by the divs, and I cannot find a way to preserve them.
Update
This will be executed by a PHP cron job, so there is no access to JavaScript.
There is a library called phpQuery: http://code.google.com/p/phpquery/
You can walk through the DOM much as you would with jQuery:
phpQuery::newDocument($htmlCode)->find('table.properties');
Call strip_tags on the matched element's content and you will get the plain text of that table.
The trick is to fetch the inner divs in an xpath expression, then use their textContent property:
<?php
$domd = new DOMDocument();
libxml_use_internal_errors(true);
$domd->loadHTML(file_get_contents("..."));
libxml_use_internal_errors(false);
$domx = new DOMXPath($domd);
$items = $domx->query("/html/body/div/div[2]/table/tr/td/div/div[2]/form/fieldset[2]/table[2]/tr/td/div//div/div[@style='padding-left: 0px;']");
$output = "";
foreach ($items as $item) {
$output .= $item->textContent . "\n";
}
echo $output;
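The same technique on a tiny inline document, as a sketch that can be run without the remote page. The markup is invented, and the selector is an assumption that differs from the answer's: instead of matching on the style attribute, it picks divs that contain no further divs, so each innermost block becomes one output line.

```php
<?php
// Hypothetical single-line markup where divs provide the visual line breaks.
$html = '<table class="properties"><tr><td>'
      . '<div><div>first line</div><div>second line</div></div>'
      . '</td></tr></table>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Innermost divs only: a div with no div descendants.
$items = $xpath->query("//table[@class='properties']//div[not(.//div)]");

$lines = [];
foreach ($items as $item) {
    $lines[] = trim($item->textContent);
}

// One div per line.
$output = implode("\n", $lines);
echo $output, "\n";
```

On a real page the selector would need adjusting to whatever distinguishes the line-level divs there, which is what the style attribute test does in the answer above.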
