How to parse this table and extract data from it? - php

I have the following table: http://www.nbs.rs/kursnaListaModul/srednjiKurs.faces?lang=lat
It is a currency exchange list and I need to extract some data from it. The currency ID numbers are on the left side of the table. Would it be possible to extract data from specific rows based on their IDs?
For example, from the table above, I want to extract currencies with IDs 978, 203, and 348.
Output should be:
EUR 104,2182
CZK 4,2747
HUF 38,7919
By looking at similar examples here, I came up with this: http://pastebin.com/hFZs1H7C
I need to somehow detect the IDs and then print the proper values. I'm new to programming and would appreciate your help.
<?php
$data = file_get_contents('http://www.nbs.rs/kursnaListaModul/srednjiKurs.faces?lang=lat');
$dom = new domDocument;
@$dom->loadHTML($data);
$dom->preserveWhiteSpace = false;
$tables = $dom->getElementsByTagName('table');
$rows = $tables->item(1)->getElementsByTagName('tr');
foreach ($rows as $row) {
$cols = $row->getElementsByTagName('td');
foreach ($cols as $col) {
echo $col;
}
}
?>

Collecting the table data as an array for later usage:
$dom = new DomDocument;
$dom->loadHtmlFile('http://www.nbs.rs/kursnaListaModul/srednjiKurs.faces?lang=lat');
$xpath = new DomXPath($dom);
// collect header names
$headerNames = array();
foreach ($xpath->query('//table[@id="index:srednjiKursLista"]//th') as $node) {
$headerNames[] = $node->nodeValue;
}
// collect data
$data = array();
foreach ($xpath->query('//tbody[@id="index:srednjiKursLista:tbody_element"]/tr') as $node) {
$rowData = array();
foreach ($xpath->query('td', $node) as $cell) {
$rowData[] = $cell->nodeValue;
}
$data[] = array_combine($headerNames, $rowData);
}
print_r($data);
Output:
Array
(
[0] => Array
(
[ŠIFRA VALUTE] => 978
[NAZIV ZEMLJE] => EMU
[OZNAKA VALUTE] => EUR
[VAŽI ZA] => 1
[SREDNJI KURS] => 104,2182
)
...
)
Example usage:
foreach ($data as $entry) {
printf(
'%s %s' . PHP_EOL,
$entry['OZNAKA VALUTE'],
$entry['SREDNJI KURS']
);
}
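To print only the currencies asked about in the question (IDs 978, 203 and 348), the $data array can be filtered on the ŠIFRA VALUTE column. This is a minimal sketch, assuming $data has the shape shown in the output above. The rows are hard-coded from the question's expected output so the snippet runs without fetching the page, and the row with ID 999 is a made-up placeholder that only shows non-matching rows being skipped.

```php
<?php
// Sample rows shaped like the scraped $data (only the columns used here).
// Three rows come from the question; the row with ID 999 is hypothetical.
$data = array(
    array('ŠIFRA VALUTE' => '978', 'OZNAKA VALUTE' => 'EUR', 'SREDNJI KURS' => '104,2182'),
    array('ŠIFRA VALUTE' => '999', 'OZNAKA VALUTE' => 'XXX', 'SREDNJI KURS' => '0,0000'),
    array('ŠIFRA VALUTE' => '203', 'OZNAKA VALUTE' => 'CZK', 'SREDNJI KURS' => '4,2747'),
    array('ŠIFRA VALUTE' => '348', 'OZNAKA VALUTE' => 'HUF', 'SREDNJI KURS' => '38,7919'),
);

// IDs to keep, as strings because scraped cell values are strings.
$wantedIds = array('978', '203', '348');

$lines = array();
foreach ($data as $entry) {
    // trim() guards against stray whitespace around the scraped cell text
    if (in_array(trim($entry['ŠIFRA VALUTE']), $wantedIds, true)) {
        $lines[] = $entry['OZNAKA VALUTE'] . ' ' . $entry['SREDNJI KURS'];
    }
}

echo implode(PHP_EOL, $lines), PHP_EOL;
```

The rows are printed in table order, so the order of $wantedIds does not matter.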

You can use the XPath and DOMDocument features of PHP to extract specific data from HTML (or XML).
$src = new DOMDocument('1.0', 'utf-8');
$src->formatOutput = true;
$src->preserveWhiteSpace = false;
$content = file_get_contents("http://www.nbs.rs/kursnaListaModul/srednjiKurs.faces?lang=lat");
@$src->loadHTML($content);
$xpath = new DOMXPath($src);
$values = $xpath->query('//td[contains(@class, "tableCell")]');
foreach($values as $value)
{
echo $value->nodeValue."<br />";
}
This will print the text content of every td element whose class attribute contains "tableCell".

Related

How to parse html table to array with symfony dom crawler

I have an HTML table and I want to build an array from it.
$html = '<table>
<tr>
<td>satu</td>
<td>dua</td>
</tr>
<tr>
<td>tiga</td>
<td>empat</td>
</tr>
</table>';
My array must look like this
array(
array(
"satu",
"dua",
),
array(
"tiga",
"empat",
)
)
I have tried the below code but could not get the array as I need
$crawler = new Crawler();
$crawler->addHTMLContent($html);
$row = array();
$tr_elements = $crawler->filterXPath('//table/tr');
foreach ($tr_elements as $tr) {
// ???????
}
$table = $crawler->filter('table')->filter('tr')->each(function ($tr, $i) {
return $tr->filter('td')->each(function ($td, $i) {
return trim($td->text());
});
});
print_r($table);
The above example will give you a multidimensional array where the first layer holds the table rows ("tr") and the second layer holds the table columns ("td").
EDIT
If you have nested tables, this code will flatten them out nicely into a single-dimension array.
$html = 'MY HTML HERE';
$crawler = new Crawler($html);
$flat = function(string $selector) use ($crawler) {
$result = [];
$crawler->filter($selector)->each(function ($table, $i) use (&$result) {
$table->filter('tr')->each(function ($tr, $i) use (&$result) {
$tr->filter('td')->each(function ($td, $i) use (&$result) {
$html = trim($td->html());
if (strpos($html, '<table') !== FALSE) return;
$iterator = $td->getIterator()->getArrayCopy()[0];
$address = $iterator->getNodePath();
if (!empty($html)) $result[$address] = $html;
});
});
});
return $result;
};
// The selector must point to the outermost table.
print_r($flat('#Prod fieldset div table'));
$html = '<table>
<tr>
<td>satu</td>
<td>dua</td>
</tr>
<tr>
<td>tiga</td>
<td>empat</td>
</tr>
</table>';
$crawler = new Crawler();
$crawler->addHTMLContent($html);
$rows = array();
$tr_elements = $crawler->filterXPath('//table/tr');
// iterate over filter results
foreach ($tr_elements as $i => $content) {
$tds = array();
// create crawler instance for result
$crawler = new Crawler($content);
//iterate again
foreach ($crawler->filter('td') as $i => $node) {
// extract the value
$tds[] = $node->nodeValue;
}
$rows[] = $tds;
}
var_dump($rows );exit;
will display
array
0 =>
array
0 => string 'satu'
1 => string 'dua'
1 =>
array (size=2)
0 => string 'tiga'
1 => string 'empat'

Get data from XML with PHP

Below is part of the XML I'm trying to get data from. Basically, I need to insert the values into an array where "Role" is the key and "Entry" is the value.
Here is XML:
<CommunicationDetailList>
<CommunicationDetail>
<Role>Phone1</Role>
<Entry>727831333</Entry>
</CommunicationDetail>
<CommunicationDetail>
<Role>Mobile</Role>
<Entry>727834125</Entry>
</CommunicationDetail>
<CommunicationDetail>
<Role>Fax1</Role>
<Entry>123456789</Entry>
</CommunicationDetail>
<CommunicationDetail>
<Role>EMail1</Role>
<Entry>moj#mail.sk</Entry>
</CommunicationDetail>
</CommunicationDetailList>
This is my PHP code. Unfortunately it doesn't work correctly (it adds just the first pair and not the rest, so I only have access to Phone1):
//this is somewhere on top of my code
$doc = new DOMDocument();
//Load XML to DOM
$doc->loadXml($xml);
.
.
// here I parse rest of XML, where `<tags>` are unique
.
.
//and here is that important part
$communicationDetails = $doc->getElementsByTagName( "CommunicationDetailList" );
foreach( $communicationDetails as $detail )
{
$keys = $detail->getElementsByTagName( "Role" );
$key = $keys->item(0)->nodeValue;
$values = $detail->getElementsByTagName( "Entry" );
$value = $values->item(0)->nodeValue;
//adding login and password to array
$data[$key] = $value;
}
Can someone help me access this XML?
Try using SimpleXMLElement like this
<?php
$xml = 'data.xml';
//load xml from file
$doc = simplexml_load_file($xml);
// or load from string
// $doc = simplexml_load_string($xmlString);
foreach($doc->CommunicationDetail as $detail){
//print $detail->Role . ' - ' . $detail->Entry . PHP_EOL;
$data[(string)$detail->Role] = (string)$detail->Entry;
// we cast the xml elements as strings to be used as keys and values in the array
}
print_r($data);
//output is
Array
(
[Phone1] => 727831333
[Mobile] => 727834125
[Fax1] => 123456789
[EMail1] => moj#mail.sk
)
Try this; it may help:
foreach( $communicationDetails as $detail )
{
$keys = $detail->getElementsByTagName( "Role" );
$values = $detail->getElementsByTagName( "Entry" );
$length = $keys->length;
for($i = 0; $i < $length; $i++)
{
$key = $keys->item($i)->nodeValue;
$value = $values->item($i)->nodeValue;
$data[$key] = $value;
}
}
The problem is with
item(0)
which only ever reads the first element. If you were to use an iterated loop like
for ($i = 0; $i < $keys->length; $i++) { echo $keys->item($i)->nodeValue; }
then it would go through the entire node list.
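The original loop runs only once because it iterates over the single CommunicationDetailList element and then takes item(0) inside it. Staying with DOMDocument, one option is to loop over the CommunicationDetail elements instead, so each iteration sees exactly one Role/Entry pair. A minimal sketch, using a shortened copy of the question's XML:

```php
<?php
// Shortened copy of the XML from the question.
$xml = <<<XML
<CommunicationDetailList>
  <CommunicationDetail>
    <Role>Phone1</Role>
    <Entry>727831333</Entry>
  </CommunicationDetail>
  <CommunicationDetail>
    <Role>Mobile</Role>
    <Entry>727834125</Entry>
  </CommunicationDetail>
  <CommunicationDetail>
    <Role>Fax1</Role>
    <Entry>123456789</Entry>
  </CommunicationDetail>
</CommunicationDetailList>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);

$data = array();
// One iteration per <CommunicationDetail>, not per list container.
foreach ($doc->getElementsByTagName('CommunicationDetail') as $detail) {
    $role  = $detail->getElementsByTagName('Role')->item(0);
    $entry = $detail->getElementsByTagName('Entry')->item(0);
    // Skip incomplete entries instead of reading nodeValue on null.
    if ($role !== null && $entry !== null) {
        $data[$role->nodeValue] = $entry->nodeValue;
    }
}

print_r($data);
```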

remove one or more <tr> tag xpath php

Question: In my code I want to remove the header tr tags (the ones with the attribute <tr class='background1'>).
URL: http://www.rayansaba.com/index.php?ukey=pricelist
The following code has an if condition meant to skip the headers, but it doesn't remove them.
$table_rows = $xpath->query("//div[@class='cpt_maincontent']/center/table/tr"); // target the rows (the browser renders a <tbody>, but the markup doesn't actually have one)
if($table_rows->length <= 0) { // exit if not found
echo 'no table rows found';
exit;
}
$i = 0;
$trlength = count($table_rows);
foreach($table_rows as $tr){
$row = $tr->childNodes;
if($row->item(0)->tagName!='<tr class="background1"></tr>') { // avoid headers
$data[] = array(
'Name' => trim($row->item(0)->nodeValue),
'Price' => trim($row->item(2)->nodeValue),
);
}
}
Actually, you can do the exclusion inside the XPath query itself by skipping rows that have a class attribute, then extract the values and push the node values accordingly. Example:
$dom = new DOMDocument;
@$dom->loadHTMLFile('http://www.rayansaba.com/index.php?ukey=pricelist');
$xpath = new DOMXpath($dom);
$table_rows = $xpath->query("//div[@class='cpt_maincontent']/center/table/tr[not(@class)]");
$data = array();
foreach($table_rows as $tr) {
if($tr->getElementsByTagName('td')->length > 2) {
$name = trim($tr->getElementsByTagName('td')->item(0)->nodeValue);
$price = trim($tr->getElementsByTagName('td')->item(2)->nodeValue);
$data[] = array('Name' => $name, 'Price' => $price);
}
}
echo '<pre>', print_r($data, 1), '</pre>';
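The effect of the [not(@class)] predicate is easier to see on a small inline document. The sketch below is only an illustration: the markup, product names and prices are made up to mimic the structure described in the question, and the XPath uses the descendant axis (//tr) so it works whether or not a tbody is present.

```php
<?php
// Hypothetical markup; header and section rows carry class="background1".
$html = '<div class="cpt_maincontent"><center><table>'
      . '<tr class="background1"><td>Name</td><td>Code</td><td>Price</td></tr>'
      . '<tr><td>Item A</td><td>a1</td><td>100</td></tr>'
      . '<tr class="background1"><td>Section</td><td></td><td></td></tr>'
      . '<tr><td>Item B</td><td>b2</td><td>250</td></tr>'
      . '</table></center></div>';

$dom = new DOMDocument;
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
// Keep only rows that have no class attribute at all.
$rows = $xpath->query("//div[@class='cpt_maincontent']//tr[not(@class)]");

$data = array();
foreach ($rows as $tr) {
    $cells = $tr->getElementsByTagName('td');
    if ($cells->length > 2) {
        $data[] = array(
            'Name'  => trim($cells->item(0)->nodeValue),
            'Price' => trim($cells->item(2)->nodeValue),
        );
    }
}

print_r($data);
```

Note that not(@class) drops every row with any class attribute. If data rows also carried a class, a narrower predicate such as [not(@class='background1')] would be needed.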

PHP Simple DOM Parser extract single value from two url

I use a DOM parser to grab text from two HTML documents that share the same li class, but I get both values back together.
<?php
include_once('simple_html_dom.php');
$links = array (
"Model_one" => "car.html",
"Model_two" => "car/edition.html"
);
foreach ($links as $key=>$link) {
$html = file_get_html($link);
$ret[] = $html->find('ul li[class=dotCar]',0)->plaintext;
$pattern = '/.\d+(?:\.\d{2})?((?<=[0-9])(?= usd))/';
preg_match_all($pattern, $ret[0], $result);
$price = array();
foreach($result[0] as $k=>$v) {
$price[] = $v;
echo $price[0];
}
}
// $price[0]= 10.55 11
?>
How can I associate the model key from the $links array with the value in $price to obtain this result:
model_one 10.55
model_two 11.00
In this way I can retrieve the single value to insert in a MySQL table.
Perhaps something like this:
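$prices = array();
foreach ($links as $key => $link) {
// $key is the model name (Model_one etc.), $link is the page to fetch
$html = file_get_html($link);
$text = $html->find('ul li[class=dotCar]', 0)->plaintext;
// $pattern is the regular expression from the question
if (preg_match($pattern, $text, $match)) {
$prices[$key] = $match[0];
}
}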
This might give you an idea of the logic needed.

Parse DOM document to array php

What I want is for the DOM results to be stored in an array instead of being printed line by line in a foreach loop. It should look like a list, i.e.
"[0] 16GB USB Stick" "[1] Computer monitor" "[2] wireless keyboard"
etc.
So far I have this, but it only stores the last value from the foreach loop. Please help!
$html = new DOMDocument();
@$html->loadHtmlFile('some online shop');
$xpath = new DOMXPath($html);
$nodelist = $xpath->query( "//div[@class='productname']/p" );
foreach ($nodelist as $n)
{
$value = $n->nodeValue;
$list = array($value);
}
echo $list[0];
That's because you're overwriting it in each loop iteration. Create an array, and add to that array:
$list = array();
foreach ($nodelist as $n)
{
$value = $n->nodeValue;
$list[] = $value;
}
// Check there's at least one item in the array before accessing it
if (count($list) > 0)
{
echo $list[0];
}
You need to look into how arrays work in PHP. What you're doing wrong is you are re-declaring the array on each iteration, instead of adding more information to it.
$list = array();
foreach ($nodelist as $n) {
$list[] = $n->nodeValue;
}
var_dump($list);
Explanation:
[] basically means: add an item to this array and auto-generate the key.
The foreach I wrote is equivalent to this one:
$i = 0;
foreach ($nodelist as $n) {
$list[$i] = $n->nodeValue;
$i ++;
}
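As an alternative to the explicit loop, the node list can be copied into a plain array with iterator_to_array() and then mapped to its text values. This is a sketch under assumptions: the markup is a made-up stand-in for the shop page, reusing the three product names from the question.

```php
<?php
// Hypothetical markup standing in for the shop page.
$markup = '<div class="productname"><p>16GB USB Stick</p></div>'
        . '<div class="productname"><p>Computer monitor</p></div>'
        . '<div class="productname"><p>wireless keyboard</p></div>';

$html = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$html->loadHTML($markup);
libxml_clear_errors();

$xpath = new DOMXPath($html);
$nodelist = $xpath->query("//div[@class='productname']/p");

// Copy the node list into a plain array, then map each node to its text.
$list = array_map(
    function ($node) { return $node->nodeValue; },
    iterator_to_array($nodelist)
);

print_r($list);
```

The result is the same indexed array the foreach versions build; which form reads better is mostly a matter of taste.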
