Extract Value from HTML using PHP - php

I'm retrieving an HTML page using cURL. The HTML page has a table like this:
<table class="table2" style="width:85%; text-align:center">
<tr>
<th>Refference ID</th>
<th>Transaction No</th>
<th>Type</th>
<th>Operator</th>
<th>Amount</th>
<th>Slot</th>
</tr>
<tr>
<td>130717919020ffqClE0nRaspoB</td>
<td>8801458920369</td>
<td>Purchase</td>
<td>Visa</td>
<td>50</td>
<td>20130717091902413</td>
</tr>
</table>
This is the only table on that HTML page. I need to extract the Refference ID and Slot values using PHP, but I have no idea how that can be done.
EDIT:
This one helped me a lot.

A regex based solution like the accepted answer is not the right way to extract information from HTML documents.
Use a DOMDocument based solution like this instead:
$str = '<table class="table2" style="width:85%; text-align:center">
<tr>
<th>Refference ID</th>
...
<th>Slot</th>
</tr>
<tr>
<td>130717919020ffqClE0nRaspoB</td>
...
<td>20130717091902413</td>
</tr>
</table>';
// Create a document out of the string. Initialize XPath
$doc = new DOMDocument();
$doc->loadHTML($str);
$selector = new DOMXPath($doc);
// Query the values in a stable and easy to maintain way using XPath
$refResult = $selector->query('//table[@class="table2"]/tr[2]/td[1]');
$slotResult = $selector->query('//table[@class="table2"]/tr[2]/td[6]');
// Check if the data was found
if($refResult->length !== 1 || $slotResult->length !== 1) {
die("Data is corrupted");
}
// XPath->query always returns a node set, even if
// this contains only a single value.
$refId = $refResult->item(0)->nodeValue;
$slot = $slotResult->item(0)->nodeValue;
echo "RefId: $refId, Slot: $slot", PHP_EOL;
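The fixed positions td[1] and td[6] above break if the page ever reorders its columns. As a minimal sketch of one alternative, the cells can be keyed by the header text of their column. The function name extractRowByHeaders is hypothetical and not part of the original answer, and the sketch assumes the header row is the first tr and the data row is the second, as in the question.

```php
<?php
// Hypothetical helper (not from the original answer): returns the cells of
// the first data row, keyed by the header text of each column.
function extractRowByHeaders(string $html): array
{
    $doc = new DOMDocument();
    // Pages fetched with cURL are often not valid HTML, so collect parser
    // warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath   = new DOMXPath($doc);
    $headers = $xpath->query('//table[@class="table2"]//tr[1]/th');
    $cells   = $xpath->query('//table[@class="table2"]//tr[2]/td');

    $row = [];
    for ($i = 0; $i < $headers->length; $i++) {
        $cell = $cells->item($i);
        if ($cell !== null) {
            $row[trim($headers->item($i)->nodeValue)] = trim($cell->nodeValue);
        }
    }
    return $row;
}

// Sample markup copied from the question.
$html = '<table class="table2" style="width:85%; text-align:center">
<tr><th>Refference ID</th><th>Transaction No</th><th>Type</th>
<th>Operator</th><th>Amount</th><th>Slot</th></tr>
<tr><td>130717919020ffqClE0nRaspoB</td><td>8801458920369</td><td>Purchase</td>
<td>Visa</td><td>50</td><td>20130717091902413</td></tr>
</table>';

$row = extractRowByHeaders($html);
echo "RefId: {$row['Refference ID']}, Slot: {$row['Slot']}", PHP_EOL;
```

The lookup key has to match the header text exactly, including the page's own spelling "Refference ID".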

$str = '<table class="table2" style="width:85%; text-align:center">
<tr>
<th>Refference ID</th>
<th>Transaction No</th>
<th>Type</th>
<th>Operator</th>
<th>Amount</th>
<th>Slot</th>
</tr>
<tr>
<td>130717919020ffqClE0nRaspoB</td>
<td>8801458920369</td>
<td>Purchase</td>
<td>Visa</td>
<td>50</td>
<td>20130717091902413</td>
</tr>
</table>';
preg_match_all('/<td>([^<]*)<\/td>/', $str, $m);
$reference_id = $m[1][0];
$slot = $m[1][5];

Related

How to follow the condition to underline in the table?

I have a question about how to underline table cells according to a column's value. The example below shows the problem I am facing.
If the Underline column is 1, the first name in that row should be underlined; if it is 0, the first name should not be underlined. The sample below is hardcoded, but in the real situation I have too many rows to add text-decoration: underline; to each td one by one. I hope someone can guide me on how to solve this. I am using PHP to set a variable that defines the underline.
<!--Below the php code I just write the logic, because I don't know how to write to detect the column underline value-->
<?php
if ( <th>Underline</th> == 1) {
$add_underline = "text-decoration: underline;";
}
if ( <th>Underline</th> == 0) {
$add_underline = "text-decoration: underline;";
}
?>
<table style="width:100%">
<tr>
<th>Firstname</th>
<th>Lastname</th>
<th>Underline</th>
</tr>
<tr>
<td style="<?php echo $add_underline;?> ">Jill</td>
<td>Smith</td>
<td>1</td>
</tr>
<tr>
<td style="<?php echo $add_underline;?>">Eve</td>
<td>Jackson</td>
<td>0</td>
</tr>
<tr>
<td style="<?php echo $add_underline;?>">John</td>
<td>Doe</td>
<td>1</td>
</tr>
</table>
My current output looks like the picture below:
My expected result looks like the picture below, with Jill and John underlined:
Why not use JavaScript to achieve this? No matter what the server sends, it will check whether 1 is set and underline accordingly. You would have to use classes to select the table cells that hold the values; I added class='name' to the name <td> tags and class='underline' to the underline <td> tags.
// get the values of the elements with a class of 'name'
let names = document.getElementsByClassName('name');
// get the values of the elements with a class of 'underline'
let underline = document.getElementsByClassName('underline');
// loop over elements using for and use the keys to get and set values
// `i` will iterate until it reaches the length of the list of elements with class of underline
for(let i = 0; i < underline.length; i++){
// use the key to get the text content and check if 1 is set use Number to change string to number for strict evaluation
if(Number(underline[i].textContent) === 1){
// set values set to 1 to underline in css style
names[i].style.textDecoration = "underline";
}
}
<table style="width:100%">
<tr>
<th>Firstname</th>
<th>Lastname</th>
<th>Underline</th>
</tr>
<tr>
<td class="name">Jill</td>
<td>Smith</td>
<td class='underline'>1</td>
</tr>
<tr>
<td class="name">Eve</td>
<td>Jackson</td>
<td class='underline'>0</td>
</tr>
<tr>
<td class="name">John</td>
<td>Doe</td>
<td class='underline'>1</td>
</tr>
</table>
Or using the td child values...
// select every row; the loop starts at 1 to skip the header row
let tr = document.querySelectorAll("tr");
for(let i = 1; i < tr.length; i++){
if(Number(tr[i].lastElementChild.innerHTML) === 1){
tr[i].firstElementChild.style.textDecoration = "underline";
}
}
<table style="width:100%">
<tr>
<th>Firstname</th>
<th>Lastname</th>
<th>Underline</th>
</tr>
<tr>
<td>Jill</td>
<td>Smith</td>
<td>1</td>
</tr>
<tr>
<td>Eve</td>
<td>Jackson</td>
<td>0</td>
</tr>
<tr>
<td>John</td>
<td>Doe</td>
<td>1</td>
</tr>
</table>
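Since the question asked for PHP, the same result can probably be produced on the server by deciding the style inside the loop that prints each row, rather than once for the whole page. This is only a sketch: the $rows array, its keys, and the renderRow function are hypothetical stand-ins for however the real rows are loaded.

```php
<?php
// Hypothetical sample rows; in a real page these would come from a database.
$rows = [
    ['first' => 'Jill', 'last' => 'Smith',   'underline' => 1],
    ['first' => 'Eve',  'last' => 'Jackson', 'underline' => 0],
    ['first' => 'John', 'last' => 'Doe',     'underline' => 1],
];

// Hypothetical helper: builds one <tr> and chooses the style for that row only.
function renderRow(array $row): string
{
    $style = ((int) $row['underline'] === 1)
        ? ' style="text-decoration: underline;"'
        : '';
    return sprintf(
        "<tr><td%s>%s</td><td>%s</td><td>%d</td></tr>\n",
        $style,
        htmlspecialchars($row['first']),
        htmlspecialchars($row['last']),
        $row['underline']
    );
}

$html = "<table style=\"width:100%\">\n"
      . "<tr><th>Firstname</th><th>Lastname</th><th>Underline</th></tr>\n";
foreach ($rows as $row) {
    $html .= renderRow($row);
}
$html .= "</table>\n";
echo $html;
```

Because the style is computed per row, the number of rows does not matter and no client-side script is needed.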

Editing HTML table with PHP DOM

I've tried to add a column to a table that I loaded with DOMDocument, but I can't get this code working.
<?php
$dom = new DOMDocument();
$dom->loadHTML('<table>
<tbody>
<tr>
<th>Firstname</th>
<th>Lastname</th>
<th>Age</th>
</tr>
<tr>
<td>Jill</td>
<td>Smith</td>
<td>50</td>
</tr>
</tbody>
</table>');
$tr = $dom->getElementsByTagName('tr');
$th = $dom->createElement('th', 'Comment');
$tr->item(0)->appendChild($th);
Your code is working fine. You have successfully added a new th element to the DOM tree, but you still need to output it to the browser with
echo $dom->saveHTML();
The output in HTML is:
<html>
<body>
<table>
<tbody>
<tr>
<th>Firstname</th>
<th>Lastname</th>
<th>Age</th>
<th>Comment</th>
</tr>
<tr>
<td>Jill</td>
<td>Smith</td>
<td>50</td>
</tr>
</tbody>
</table>
</body>
</html>
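Adding only the th leaves the data rows one cell short. A minimal sketch of filling the whole column follows; the placeholder text 'n/a' is an assumption, since the question does not say what the comment cells should contain.

```php
<?php
$dom = new DOMDocument();
$dom->loadHTML('<table><tbody>
<tr><th>Firstname</th><th>Lastname</th><th>Age</th></tr>
<tr><td>Jill</td><td>Smith</td><td>50</td></tr>
</tbody></table>');

$rows = $dom->getElementsByTagName('tr');
for ($i = 0; $i < $rows->length; $i++) {
    // The first row is the header row: it gets a th, all others get a td.
    $tag  = ($i === 0) ? 'th' : 'td';
    $text = ($i === 0) ? 'Comment' : 'n/a'; // 'n/a' is a placeholder value
    $rows->item($i)->appendChild($dom->createElement($tag, $text));
}

$out = $dom->saveHTML();
echo $out;
```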

xPath, getting tabular data

I have HTML like this:
<table>
<tr>
<th>Name</th>
<th>Email</th>
<th>Age</th>
</tr>
<tr>
<td>Joe Bloggs</td>
<td>joe@bloggs.com</td>
<td>40</td>
</tr>
<tr>
<td>John Doe</td>
<td>john@doe.com</td>
<td>40</td>
</tr>
</table>
Is there a way to use XPath to get the first 2 columns, i.e. the Name and Email fields?
I can get the table data using $data = $xpath->query( '//table'); but I'm unsure how to get only the first 2 columns.
Many thanks
Get the first two td's:
//table/tr/td[position() <= 2]
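As a sketch of how that expression might be used from PHP, the query can be run relative to each row so that the two cells stay grouped per person. The sample markup mirrors the question; the array keys 'name' and 'email' are names chosen here for illustration.

```php
<?php
$html = '<table>
<tr><th>Name</th><th>Email</th><th>Age</th></tr>
<tr><td>Joe Bloggs</td><td>joe@bloggs.com</td><td>40</td></tr>
<tr><td>John Doe</td><td>john@doe.com</td><td>40</td></tr>
</table>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$people = [];
// tr[td] selects only rows that contain td cells, which skips the header row.
foreach ($xpath->query('//table/tr[td]') as $tr) {
    // Relative query: evaluated with the current row as the context node.
    $cells = $xpath->query('td[position() <= 2]', $tr);
    $people[] = [
        'name'  => $cells->item(0)->nodeValue,
        'email' => $cells->item(1)->nodeValue,
    ];
}
print_r($people);
```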

php regex or html dom parsing

I use regex for HTML parsing but I need your help to parse the following table:
<table class="resultstable" width="100%" align="center">
<tr>
<th width="10">#</th>
<th width="10"></th>
<th width="100">External Volume</th>
</tr>
<tr class='odd'>
<td align="center">1</td>
<td align="left">
http://xyz.com
</td>
<td align="right">210,779,783<br />(939,265 / 499,584)</td>
</tr>
<tr class='even'>
<td align="center">2</td>
<td align="left">
http://abc.com
</td>
<td align="right">57,450,834<br />(288,915 / 62,935)</td>
</tr>
</table>
I want to get all domains with their volume (in an array or variable), for example:
http://xyz.com - 210,779,783
Should I use regex or an HTML DOM parser in this case? I don't know how to parse a large table. Can you please help? Thanks.
Here's an XPath example that happens to parse the HTML from the question.
<?php
$dom = new DOMDocument();
$dom->loadHTMLFile("./input.html");
$xpath = new DOMXPath($dom);
$trs = $xpath->query("//table[@class='resultstable'][1]/tr");
foreach ($trs as $tr) {
$tdList = $xpath->query("td[2]/a", $tr);
if ($tdList->length == 0) continue;
$name = $tdList->item(0)->nodeValue;
$tdList = $xpath->query("td[3]", $tr);
$vol = $tdList->item(0)->childNodes->item(0)->nodeValue;
echo "name: {$name}, vol: {$vol}\n";
}
?>
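The loop above looks for a link element inside the second cell. If the URL is plain text in that cell, as in the markup shown in the question, a variant along these lines may be closer to what is needed. It is a sketch that assumes the first text node of the third cell is the volume, and it collects the pairs into an array (the name $volumes is chosen here) instead of printing them.

```php
<?php
$html = <<<'HTML'
<table class="resultstable" width="100%" align="center">
<tr><th width="10">#</th><th width="10"></th><th width="100">External Volume</th></tr>
<tr class='odd'>
<td align="center">1</td>
<td align="left">
http://xyz.com
</td>
<td align="right">210,779,783<br />(939,265 / 499,584)</td>
</tr>
<tr class='even'>
<td align="center">2</td>
<td align="left">
http://abc.com
</td>
<td align="right">57,450,834<br />(288,915 / 62,935)</td>
</tr>
</table>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$volumes = [];
foreach ($xpath->query("//table[@class='resultstable']/tr[td]") as $tr) {
    // normalize-space() trims the line breaks around the cell text.
    $domain = $xpath->evaluate('normalize-space(td[2])', $tr);
    // text()[1] is the text before the <br>, i.e. the first number only.
    $volume = $xpath->evaluate('normalize-space(td[3]/text()[1])', $tr);
    $volumes[$domain] = $volume;
}
print_r($volumes);
```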

Why does it loop more than once... am I missing something?

There is only one record in the table, so why does it loop as if I had several tables with one letter each?
$query = "Select * from click_tracker";
$result = mysql_query($query);
$all_clicks = mysql_fetch_array($result);
foreach($all_clicks as $click){
print "
<table border=\"1\">
<tr>
<th>Location</th>
<th>Visit Count</th>
</tr>
<tr>
<td>{$click['url_destination']}</td>
<td>{$click['count']}</td>
</tr>
</table>";
}
here is the table returned
<table border="1">
<tr>
<th>Location</th>
<th>Visit Count</th>
</tr>
<tr>
<td>2</td>
<td>2</td>
</tr>
</table>
<table border="1">
<tr>
<th>Location</th>
<th>Visit Count</th>
</tr>
<tr>
<td>2</td>
<td>2</td>
</tr>
</table>
<table border="1">
<tr>
<th>Location</th>
<th>Visit Count</th>
</tr>
<tr>
<td>h</td>
<td>h</td>
</tr>
</table>
<table border="1">
<tr>
<th>Location</th>
<th>Visit Count</th>
</tr>
<tr>
<td>h</td>
<td>h</td>
</tr>
</table>
<table border="1">
<tr>
<th>Location</th>
<th>Visit Count</th>
</tr>
<tr>
<td>5</td>
<td>5</td>
</tr>
</table>
<table border="1">
<tr>
<th>Location</th>
<th>Visit Count</th>
</tr>
<tr>
<td>5</td>
<td>5</td>
</tr>
</table>
mysql_fetch_array fetches one row as an array. When you try to loop over that result with your foreach, you are actually looping through all the columns of the row you returned (twice, actually, because by default mysql_fetch_array returns an array with both numeric and associative keys).
If you want to get all the rows in your result set (and you more than likely do), you need to use a while loop to keep fetching rows until there aren't any more:
$all_clicks = array();
while ($row = mysql_fetch_array($result))
{
$all_clicks[] = $row;
}
and then when you iterate over $all_clicks, each iteration will have a complete row.
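The doubling can be reproduced without a database. The sketch below builds an array shaped like the default result of mysql_fetch_array and counts the loop iterations; the column name id and all sample values are assumptions, since the question does not show the table schema.

```php
<?php
// Hypothetical row shaped like the default (MYSQL_BOTH) result of
// mysql_fetch_array(): each column appears under a numeric and a named key.
$row = [
    0 => '2',                  'id'              => '2',
    1 => 'http://example.com', 'url_destination' => 'http://example.com',
    2 => '5',                  'count'           => '5',
];

$iterations = 0;
foreach ($row as $key => $value) {
    $iterations++; // runs once per key, not once per record
    echo "$key => $value", PHP_EOL;
}
echo "Loop ran $iterations times for one record", PHP_EOL;
```

Three columns under two keys each give six iterations, which would be consistent with the six tables in the question's output.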
mysql_fetch_array() returns rows, it looks like your foreach is looping over fields in a row not rows in a result set.
You appear to be printing multiple tables. I don't think this is what you intend though. You need to print the table's opening and closing tags, and the headings, outside of the loop. You should also call mysql_fetch_array() in the loop and not just once.
print "
<table border=\"1\">
<tr>
<th>Location</th>
<th>Visit Count</th>
</tr>";
$query = "Select * from click_tracker";
$result = mysql_query($query);
while ($click = mysql_fetch_array($result)) {
print "
<tr>
<td>{$click['url_destination']}</td>
<td>{$click['count']}</td>
</tr>";
}
print "</table>";
You should also consider escaping the data in $click, but I don't know what your data looks like so I'm not sure what to put in the area just between the while and print statements.
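One common way to do that escaping is htmlspecialchars for text and an integer cast for numbers. The following is a sketch with a made-up row, assuming url_destination is free text and count is numeric.

```php
<?php
// Hypothetical row; real values would come from the click_tracker query.
$click = ['url_destination' => 'http://example.com/?a=1&b=<2>', 'count' => '5'];

// Escape text before it is placed into HTML; cast numeric columns.
$url   = htmlspecialchars($click['url_destination'], ENT_QUOTES, 'UTF-8');
$count = (int) $click['count'];

$rowHtml = "<tr>\n<td>{$url}</td>\n<td>{$count}</td>\n</tr>";
echo $rowHtml, PHP_EOL;
```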
You need to do it like this:
$query = "Select * from click_tracker";
$result = mysql_query($query);
while($click = mysql_fetch_assoc($result)) {
// print one table row per $click here
}
Do a print_r($all_clicks) and check that the result is what you expect it to be.
You don't really need to use a foreach if there's only one result.
$query = "Select * from click_tracker";
$result = mysql_query($query);
$all_clicks = mysql_fetch_array($result);
print "
<table border=\"1\">
<tr>
<th>Location</th>
<th>Visit Count</th>
</tr>
<tr>
<td>{$all_clicks['url_destination']}</td>
<td>{$all_clicks['count']}</td>
</tr>
</table>";
