PHP DOM: get href attribute from a table

I'm trying to get multiple hrefs from a table like this:
<table class="table table-bordered table-hover">
<thead>
<tr>
<th class="text-center">No</th>
<th>TITLE</th>
<th>DESCRIPTION</th>
<th class="text-center"><span class="glyphicon glyphicon-download-alt"></span></th>
</tr>
</thead>
<tbody>
<tr data-key="11e44c4ebff985d08ca5313231363233">
<td class="text-center" style="width: 50px;">181</td>
<td style="width:auto; white-space: normal;">Link 1</td>
<td style="width:auto; white-space: normal;">Lorem ipsum dolor 1</td>
<td class="text-center" style="width: 50px;"><img src="https://example.com/img/pdf.png" width="15" height="20" alt="myImage"></td>
</tr>
<tr data-key="11e44c4e4222d630bdd2313231323532">
<td class="text-center" style="width: 50px;">180</td>
<td style="width:auto; white-space: normal;">Link 2</td>
<td style="width:auto; white-space: normal;">Lorem ipsum dolor 2</td>
<td class="text-center" style="width: 50px;"><img src="https://example.com/img/pdf.png" width="15" height="20" alt="myImage"></td>
</tr>
</tbody>
</table>
I tried PHP DOM like this:
<?php
$html = file_get_contents('data2.html');
$htmlDom = new DOMDocument;
$htmlDom->preserveWhiteSpace = false;
$htmlDom->loadHTML($html);
$tables = $htmlDom->getElementsByTagName('table');
$rows = $tables->item(0)->getElementsByTagName('tr');
foreach ($rows as $row)
{
$cols = $row->getElementsByTagName('td');
echo @$cols->item(0)->nodeValue.'<br />';
echo @$cols->item(1)->nodeValue.'<br />';
echo trim($cols->item(1)->getElementsByTagName('a')->item(0)->getAttribute('href')).'<br />';
echo @$cols->item(2)->nodeValue.'<br />';
echo trim($cols->item(3)->getElementsByTagName('a')->item(0)->getAttribute('href')).'<br />';
}
?>
I get this error
Fatal error: Uncaught Error: Call to a member function getElementsByTagName() on null
getAttribute causes the error
Could someone help me out here, please? Thanks.

Your $rows holds every <tr> inside the <table>. That includes not only the rows in the table body but also the row in the table head, which has no <td> in it. So when the loop reaches that row, $cols->item(0) and $cols->item(1) both return NULL.
The warnings you got when reading ->nodeValue on those items (which you silenced with the error-suppression operator) were already a hint.
Try to change this:
$rows = $tables->item(0)->getElementsByTagName('tr');
into this:
$rows = $tables
->item(0)->getElementsByTagName('tbody')
->item(0)->getElementsByTagName('tr');
Now it only searches for <tr> within your <tbody>, which should fix the issue for this particular HTML.
For more robust code, check the variables before acting on them. A type check or a count check would do.
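The checks suggested above could look roughly like this. It is only a sketch: the sample markup and the <a> tags in it are assumptions (the HTML in the question shows no anchors), and $hrefs is a name made up for the illustration.

```php
<?php
// Hypothetical sample markup; the <a> element is assumed for illustration.
$html = '<table>
<thead><tr><th>No</th><th>TITLE</th></tr></thead>
<tbody>
<tr><td>181</td><td><a href="https://example.com/a.pdf">Link 1</a></td></tr>
<tr><td>180</td><td>Link 2</td></tr>
</tbody>
</table>';

$dom = new DOMDocument();
$dom->loadHTML($html);

$hrefs = [];
$table = $dom->getElementsByTagName('table')->item(0);
if ($table !== null) {
    foreach ($table->getElementsByTagName('tr') as $row) {
        $cols = $row->getElementsByTagName('td');
        if ($cols->length < 2) {
            continue; // header row (only <th>) or a short row: nothing to read
        }
        // item(0) returns null when the cell contains no <a>
        $link = $cols->item(1)->getElementsByTagName('a')->item(0);
        if ($link !== null && $link->hasAttribute('href')) {
            $hrefs[] = trim($link->getAttribute('href'));
        }
    }
}
print_r($hrefs);
```

Every item() result is checked before it is used, so a header row or a cell without a link is skipped instead of causing a fatal error.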

Because the earlier accesses to $cols are all prefixed with the error-suppression operator, this is the first one that complains.
A simple fix would be to just skip the rest of the loop body if no <td> elements are found (as in the header row)...
foreach ($rows as $row)
{
$cols = $row->getElementsByTagName('td');
if ( count($cols) == 0 ) {
continue;
}
You could alternatively use XPath and only select <tr> tags which contain <td> tags.
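A minimal sketch of the XPath variant, assuming a small sample table in place of data2.html (the markup and the $titles array are made up for the example):

```php
<?php
// Hypothetical sample markup standing in for data2.html.
$html = '<table>
<thead><tr><th>No</th><th>TITLE</th></tr></thead>
<tbody>
<tr><td>181</td><td>Link 1</td></tr>
<tr><td>180</td><td>Link 2</td></tr>
</tbody>
</table>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// tr[td] matches only rows that have at least one <td> child,
// so the header row (which has only <th>) is never returned.
$rows = $xpath->query('//table//tr[td]');

$titles = [];
foreach ($rows as $row) {
    $cols = $xpath->query('td', $row); // <td> children of this row
    $titles[] = trim($cols->item(1)->nodeValue);
}
print_r($titles);
```

The predicate does the filtering in the query itself, so the loop body needs no separate check for header rows.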

Related

How to follow the condition to underline in the table?

I have a question about how to underline text in a table according to a column's value. The example code below shows the problem I am facing:
I want to check the Underline column: if it is 1, the first name should be underlined; if it is 0, the first name should not be underlined. The sample below is hardcoded. In the real situation I have too many rows to add text-decoration: underline; to each td one by one. I hope someone can show me how to solve this. I am using PHP to set a variable that defines the underline.
<!--Below the php code I just write the logic, because I don't know how to write to detect the column underline value-->
<?php
if ( <th>Underline</th> == 1) {
$add_underline = "text-decoration: underline;";
}
if ( <th>Underline</th> == 0) {
$add_underline = "text-decoration: underline;";
}
?>
<table style="width:100%">
<tr>
<th>Firstname</th>
<th>Lastname</th>
<th>Underline</th>
</tr>
<tr>
<td style="<?php echo $add_underline;?> ">Jill</td>
<td>Smith</td>
<td>1</td>
</tr>
<tr>
<td style="<?php echo $add_underline;?>">Eve</td>
<td>Jackson</td>
<td>0</td>
</tr>
<tr>
<td style="<?php echo $add_underline;?>">John</td>
<td>Doe</td>
<td>1</td>
</tr>
</table>
My current output (image not included):
My expected result (image not included), where Jill and John are underlined:
Why not use JavaScript to achieve this? No matter what the server sends, it evaluates the condition and underlines the name when 1 is set. You would need classes to find the table cells holding the values; I added class='name' to the name <td> tags and class='underline' to the underline <td> tags.
// get the values of the elements with a class of 'name'
let names = document.getElementsByClassName('name');
// get the values of the elements with a class of 'underline'
let underline = document.getElementsByClassName('underline');
// loop over elements using for and use the keys to get and set values
// `i` will iterate until it reaches the length of the list of elements with class of underline
for(let i = 0; i < underline.length; i++){
// use the key to get the text content and check if 1 is set use Number to change string to number for strict evaluation
if(Number(underline[i].textContent) === 1){
// set values set to 1 to underline in css style
names[i].style.textDecoration = "underline";
}
}
<table style="width:100%">
<tr>
<th>Firstname</th>
<th>Lastname</th>
<th>Underline</th>
</tr>
<tr>
<td class="name">Jill</td>
<td>Smith</td>
<td class='underline'>1</td>
</tr>
<tr>
<td class="name">Eve</td>
<td>Jackson</td>
<td class='underline'>0</td>
</tr>
<tr>
<td class="name">John</td>
<td>Doe</td>
<td class='underline'>1</td>
</tr>
</table>
Or using the td child values...
let tr = document.querySelectorAll("tr");
for(let i = 1; i < tr.length; i++){
if(Number(tr[i].lastElementChild.innerHTML) === 1){
tr[i].firstElementChild.style.textDecoration = "underline";
}
}
<table style="width:100%">
<tr>
<th>Firstname</th>
<th>Lastname</th>
<th>Underline</th>
</tr>
<tr>
<td>Jill</td>
<td>Smith</td>
<td>1</td>
</tr>
<tr>
<td>Eve</td>
<td>Jackson</td>
<td>0</td>
</tr>
<tr>
<td>John</td>
<td>Doe</td>
<td>1</td>
</tr>
</table>
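If the rows are generated in PHP anyway, the style can also be decided per row on the server, which is closer to what the question asks for. A rough sketch, assuming the data is available as an array (the $people array and its keys are hypothetical; in a real page they would come from a database query):

```php
<?php
// Hypothetical rows standing in for the real data source.
$people = [
    ['first' => 'Jill', 'last' => 'Smith',   'underline' => 1],
    ['first' => 'Eve',  'last' => 'Jackson', 'underline' => 0],
    ['first' => 'John', 'last' => 'Doe',     'underline' => 1],
];

$out  = "<table style=\"width:100%\">\n";
$out .= "<tr><th>Firstname</th><th>Lastname</th><th>Underline</th></tr>\n";
foreach ($people as $p) {
    // Decide the style for this row only; an empty string means no underline.
    $style = ((int) $p['underline'] === 1)
        ? ' style="text-decoration: underline;"'
        : '';
    $out .= sprintf(
        "<tr><td%s>%s</td><td>%s</td><td>%d</td></tr>\n",
        $style,
        htmlspecialchars($p['first']),
        htmlspecialchars($p['last']),
        $p['underline']
    );
}
$out .= "</table>\n";
echo $out;
```

The important part is that $style is computed inside the loop, once per row, instead of once for the whole page.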

How to resolve the min/max width error in dompdf while generating a PDF?

I'm getting the following error in my code while converting to PDF.
No inline-block style is used and a width is defined for every table header, yet the issue persists.
<?php
//print_invoice.php
if(isset($_GET["pdf"]) && isset($_GET["id"]))
{
require_once 'pdf.php';
include('connection2.php');
$output = '';
$statement = $connect->prepare("
SELECT * FROM POrder
WHERE order_id = :order_id
LIMIT 1
");
$statement->execute(
array(
':order_id' => $_GET["id"]
)
);
$result = $statement->fetchAll();
foreach($result as $row)
{
$output .= '
<table width="100%" border="1" cellpadding="5" cellspacing="0">
<tr>
<td colspan="2" align="center" style="font-size:18px"><b>Invoice</b></td>
</tr>
<tr>
<td colspan="2">
<table width="100%" cellpadding="5">
<tr>
<td width="65%">
To,<br />
<b>Vendors Name</b><br />
Name : '.$row["vendorname"].'<br />
Description : '.$row["description"].'<br />
</td>
<td width="35%">
Reverse Charge<br />
Invoice No. : '.$row["order_no"].'<br />
Invoice Date : '.$row["order_date"].'<br />
</td>
</tr>
</table>
<br />
<table width="100%" border="1" cellpadding="5" cellspacing="0">
<tr>
<th>Sr No.</th>
<th>Item Name</th>
<th>Quantity</th>
<th>Price</th>
<th>Actual Amt.</th>
<th colspan="2">GST (%)</th>
<th rowspan="2">Total</th>
</tr>
<tr>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th>Rate</th>
<th>Amt.</th>
</tr>';
$statement = $connect->prepare(
"SELECT * FROM POrder_item
WHERE order_id = :order_id"
);
$statement->execute(
array(
':order_id' => $_GET["id"]
)
);
$item_result = $statement->fetchAll();
$count = 0;
foreach($item_result as $sub_row)
{
$count++;
$output .= '
<tr>
<td>'.$count.'</td>
<td>'.$sub_row["item_name"].'</td>
<td>'.$sub_row["item_quantity"].'</td>
<td>'.$sub_row["item_price"].'</td>
<td>'.$sub_row["item_price_bt"].'</td>
<td>'.$sub_row["item_gst"].'</td>
<td>'.$sub_row["item_price_at"].'</td>
<td>'.$sub_row["final_amount"].'</td>
</tr>
';
}
$output .= '
<tr>
<td align="right" colspan="11"><b>Total</b></td>
<td align="right"><b>'.$row["total_after_tax"].'</b></td>
</tr>
<tr>
<td colspan="11"><b>Total Amt. Before Tax :</b></td>
<td align="right">'.$row["total_before_tax"].'</td>
</tr>
<tr>
<td colspan="11">Add : GST :</td>
<td align="right">'.$row["gst"].'</td>
</tr>
<td colspan="11"><b>Total Tax Amt. :</b></td>
<td align="right">'.$row["order_total_tax"].'</td>
</tr>
<tr>
<td colspan="11"><b>Total Amt. After Tax :</b></td>
<td align="right">'.$row["total_after_tax"].'</td>
</tr>
';
$output .= '
</table>
</td>
</tr>
</table>
';
}
$pdf = new Pdf();
$file_name = 'Invoice-'.$row["order_no"].'.pdf';
$pdf->loadHtml($output);
$pdf->render();
$pdf->stream($file_name, array("Attachment" => false));
}
?>
// pdf.php
<?php
require_once 'dompdf/autoload.inc.php';
use Dompdf\Dompdf;
class Pdf extends Dompdf{
public function __construct() {
parent::__construct();
}
}
?>
I expect to get a PDF, but instead I get this error:
Fatal error: Uncaught exception 'Dompdf\Exception' with message
'Min/max width is undefined for table rows' in
/Applications/XAMPP/xamppfiles/htdocs/NTPC/dompdf/src/FrameReflower/TableRow.php:72
Stack trace: #0
/Applications/XAMPP/xamppfiles/htdocs/NTPC/dompdf/src/FrameDecorator/AbstractFrameDecorator.php(903):
Dompdf\FrameReflower\TableRow->get_min_max_width() #1
/Applications/XAMPP/xamppfiles/htdocs/NTPC/dompdf/src/FrameReflower/AbstractFrameReflower.php(268):
Dompdf\FrameDecorator\AbstractFrameDecorator->get_min_max_width() #2
/Applications/XAMPP/xamppfiles/htdocs/NTPC/dompdf/src/FrameDecorator/AbstractFrameDecorator.php(903):
Dompdf\FrameReflower\AbstractFrameReflower->get_min_max_width() #3
/Applications/XAMPP/xamppfiles/htdocs/NTPC/dompdf/src/FrameReflower/AbstractFrameReflower.php(268):
Dompdf\FrameDecorator\AbstractFrameDecorator->get_min_max_width() #4
/Applications/XAMPP/xamppfiles/htdocs/NTPC/dompdf/src/FrameDecorator/AbstractFrameDecorator.php(903):
Dompdf\FrameReflower\AbstractFrameReflower->get_min_max_width in
/Applications/XAMPP/xamppfiles/htdocs/NTPC/dompdf/src/FrameReflower/TableRow.php
on line 72
It seems that including dompdf like you do is no longer supported, see issue 1153. The guy who's asking gets exactly the same error messages as you do.
I'd recommend following the dompdf installation manual and installing it with Composer (as it is, IMO, the most hassle-free way in the long term). I've also found something on installing Composer on XAMPP, but I can't really help with this since I don't know XAMPP. As a fallback you could download a pre-configured package (described a few lines further down in the manual).
Also check the quick start tutorial first to see whether dompdf generally works before using your own code, because some of it might be deprecated.
Hope this helps, good luck!
Do not apply a display property to your table (neither in inline styles nor in external stylesheets).
Found from web:
In this case, the fix ended up being pretty simple, it didn’t like the inline style display:block; that I had added to the table.
Upon a little more testing, I found that it would allow for display:inline; or display:inline-block;.
This makes sense as a table has natively the property display:table; and I think block is probably not really valid (although it works fine in browsers, is a neat trick to apply to td elements to create a responsive table, and didn’t generate any warnings during validation).
The solution that worked for me was to downgrade dompdf to 1.0.0.
I'm not saying it's the best solution, but for now it works for me because the project did not depend on many other packages. Since dompdf comes with phenx/php-svg-lib and phenx/php-font-lib, those were downgraded as well.
A downside is that dompdf gets installed among the main packages in composer.json.
Note: This will upgrade, downgrade and remove packages currently locked to specific versions, as required by dompdf/dompdf:1.0.0.
The command I used is composer require dompdf/dompdf:1.0.0 -w
For me, @Ghazni Ali had the right cause.
Adding any type of display to a table made this error occur.
I was trying to get my elements properly spaced and inline.
What I found was I had to add width to my table and then add additional widths to the td inside of the table.
Below is trying to get a 20% and 80% split.
<table style="width: 100%;">
<tbody>
<tr>
<td style="width: 20% !important; border: 1px solid black;">
<p >Left Test</p>
</td>
<td style="width: 80% !important; border: 1px solid black;">
<p >Right Test</p>
</td>
</tr>
</tbody>
</table>
I tried similar methods using div tags, but the result did not render correctly.
The first example is with the table and the bottom two are with div tags.

PHP XPath to parse table

Firstly here is my table HTML:
<table class="xyz">
<caption>Outcomes</caption>
<thead>
<tr class="head">
<th title="a" class="left" nowrap="nowrap">A1</th>
<th title="a" class="left" nowrap="nowrap">A2</th>
<th title="result" class="left" nowrap="nowrap">Result</th>
<th title="margin" class="left" nowrap="nowrap">Margin</th>
<th title="area" class="left" nowrap="nowrap">Area</th>
<th title="date" nowrap="nowrap">Date</th>
<th title="link" nowrap="nowrap">Link</th>
</tr>
</thead>
<tbody>
<tr class="data1">
<td class="left" nowrap="nowrap">56546</td>
<td class="left" nowrap="nowrap">75666</td>
<td class="left" nowrap="nowrap">Lower</td>
<td class="left" nowrap="nowrap">High</td>
<td class="left">Area 3</td>
<td nowrap="nowrap">Jan 2 2016</td>
<td nowrap="nowrap">http://localhost/545436</td>
</tr>
<tr class="data1">
<td class="left" nowrap="nowrap">55546</td>
<td class="left" nowrap="nowrap">71666</td>
<td class="left" nowrap="nowrap">Lower</td>
<td class="left" nowrap="nowrap">High</td>
<td class="left">Area 4</td>
<td nowrap="nowrap">Jan 3 2016</td>
<td nowrap="nowrap">http://localhost/545437</td>
</tr>
...
And there are many more <tr> after that.
I am using this PHP code:
$html = file_get_contents('http://localhost/outcomes');
$document = new DOMDocument();
$document->loadHTML($html);
$xpath = new DOMXPath($document);
$xpath->registerNamespace('', 'http://www.w3.org/1999/xhtml');
$elements = $xpath->query("//table[@class='xyz']");
How can I, now that I have the table as the first element in $elements, get the values of each <td>?
Ideally I want to get arrays like:
array(56546, 75666, 'Lower', 'High', 'Area 3', 'Jan 2 2016', 'http://localhost/545436'),
array(55546, 71666, 'Lower', 'High', 'Area 4', 'Jan 3 2016', 'http://localhost/545437'),
...
But I'm not sure how I can dig that deeply into the table code.
Thank you for any advice.
First, get all the table rows in the <tbody>
$rows = $xpath->query('//table[@class="xyz"]/tbody/tr');
Then, you can iterate over that collection and query for each <td>
foreach ($rows as $row) {
$cells = $row->getElementsByTagName('td');
// alt $cells = $xpath->query('td', $row)
$cellData = [];
foreach ($cells as $cell) {
$cellData[] = $cell->nodeValue;
}
var_dump($cellData);
}

php regex or html dom parsing

I use regex for HTML parsing but I need your help to parse the following table:
<table class="resultstable" width="100%" align="center">
<tr>
<th width="10">#</th>
<th width="10"></th>
<th width="100">External Volume</th>
</tr>
<tr class='odd'>
<td align="center">1</td>
<td align="left">
http://xyz.com
</td>
<td align="right">210,779,783<br />(939,265 / 499,584)</td>
</tr>
<tr class='even'>
<td align="center">2</td>
<td align="left">
http://abc.com
</td>
<td align="right">57,450,834<br />(288,915 / 62,935)</td>
</tr>
</table>
I want to get all domains with their volume (in an array or a variable), for example:
http://xyz.com - 210,779,783
Should I use regex or an HTML DOM parser in this case? I don't know how to parse a large table. Can you please help? Thanks.
Here's an XPath example that happens to parse the HTML from the question.
<?php
$dom = new DOMDocument();
$dom->loadHTMLFile("./input.html");
$xpath = new DOMXPath($dom);
$trs = $xpath->query("//table[@class='resultstable'][1]/tr");
foreach ($trs as $tr) {
$tdList = $xpath->query("td[2]/a", $tr);
if ($tdList->length == 0) continue;
$name = $tdList->item(0)->nodeValue;
$tdList = $xpath->query("td[3]", $tr);
$vol = $tdList->item(0)->childNodes->item(0)->nodeValue;
echo "name: {$name}, vol: {$vol}\n";
}
?>

Using php to parse html document

I am making a php app to parse HTML contents. I need to store a certain table column in php variables.
Here is my code:
$dom = new domDocument;
@$dom->loadHTML($html);
$dom->preserveWhiteSpace = false;
$tables = $dom->getElementsByTagName('table');
$rows = $tables->item(0)->getElementsByTagName('tr');
$flag=0;
foreach ($rows as $row)
{
if($flag==0) $flag=1;
else
{
$cols = $row->getElementsByTagName('td');
foreach ($cols as $col)
{
echo $col->nodeValue; //NEED HELP HERE
}
echo '<hr />';
}
}
In each row, the first column is the KEY and the second is the VALUE. How can I create key-value pairs from the table and store them as arrays in PHP?
I tried many things, but every time I just get DOMElement Object() as the value.
Any help is deeply appreciated.
HTML as requested:
<table align='center' border='0' cellpadding='0' cellspacing='0' style='border-collapse: collapse' width='780' height=100%>
<tr><td height=96% align=center><BR><BR>
<html>
<head>
</head>
<body style="background:url(uptu_logo1.gif); background-repeat:no-repeat; background-position:center">
<p align="center" style="font-size:18px"><span style='font-size:20px'>this text is unimportant gibberish that is not required by my app</span><br/><span style='font-size:16px'>this text is unimportant gibberish that is not required by my app</span><br/><u>B.Tech. Third Year Result 2009-10. this text is unimportant gibberish that is not required by my app</u></p>
<br/>
<table align="center" border="1" cellpadding="0" cellspacing="0" bordercolor="#E3DDD5" width="700" style="border-collapse: collapse; font-size: 11px">
<tr>
<td width="50%"><b>Name:</b></td>
<td width="50%">John Fernandes </td>
</tr>
<tr>
<td><b>Fathers Name:</b></td>
<td>Caith Fernandes </td>
</tr>
<tr>
<td><b>Roll No:</b></td>
<td>0702410099</td>
</tr>
<tr>
<td><b>Status:</b></td>
<td>REGULAR </td>
</tr>
<tr>
<td><b>Course/Branch:</b></td>
<td>B. Tech. </td>
</tr>
<tr>
<td><b>Institute Name</b></td>
<td>Imperial College of Science and Technology</td>
</tr>
</table>
My PHP code outputs:
Name:John Fernandes <hr />
Fathers Name:Caith Fernandes <hr />
Roll No:0702410099<hr />
Status:REGULAR <hr />
Course/Branch:B. Tech. Computer Science and Engineering (10)<hr />
Imperial College of Science and Technology<hr />
Also, how do I get rid of the stray character after each value? I saw it in the original HTML, so I tried to sanitize the text with PHP's html_entity_decode(), but it is still there.
What is the HTML that you are loading? I am assuming that it's something simple like so:
<table>
<tr>
<td>heading</td>
<td>heading</td>
</tr>
<tr>
<td>key</td>
<td>value</td>
</tr>
</table>
Looks like the first tr is skipped (the headings), and then you have just 2 columns that you want to pair up as KEY => VALUE;
$cols = $row->getElementsByTagName('td');
$key = $cols->item(0)->nodeValue; // string(3) "key"
$val = $cols->item(1)->nodeValue; // string(5) "value"
The above code will return the items you want.
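To collect all rows into one associative array, something along these lines might work. It is a sketch on a cut-down sample of the question's table. It also assumes the stray character is a non-breaking space (&nbsp;), which the question does not confirm; the $data array and the stripping of the trailing colon are choices made for the example.

```php
<?php
// Cut-down sample; the &nbsp; after the name is an assumption about the real page.
$html = '<table>
<tr><td><b>Name:</b></td><td>John Fernandes&nbsp;</td></tr>
<tr><td><b>Roll No:</b></td><td>0702410099</td></tr>
</table>';

$dom = new DOMDocument();
$dom->loadHTML($html);

$data = [];
foreach ($dom->getElementsByTagName('tr') as $row) {
    $cols = $row->getElementsByTagName('td');
    if ($cols->length < 2) {
        continue; // not a key/value row
    }
    // nodeValue is UTF-8, and &nbsp; arrives as U+00A0 (bytes C2 A0),
    // which trim() alone does not remove, so replace it first.
    $key = trim(str_replace("\xC2\xA0", ' ', $cols->item(0)->nodeValue));
    $val = trim(str_replace("\xC2\xA0", ' ', $cols->item(1)->nodeValue));
    $data[rtrim($key, ':')] = $val;
}
print_r($data);
```

nodeValue already returns a plain string, so the DOMElement Object() output mentioned in the question usually means the element itself was stored rather than its nodeValue.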
