PHP HTML DOM parsing - php

<table width="100%" cellspacing="0" cellpadding="0" border="0" id="Table4">
<tbody>
<tr>
<td valign="top" class="tx-strong-dgrey">
<a class="anc-noul" href="http://www.example.com/catalog/proddetail.asp?logon=&langid=EN&sku_id=0665000FS10129471&catid=25653">
Apple 8GB 3rd Generation iPod Touch</a></td>
</tr>
<tr>
<td valign="top" class="element-spacer"/>
</tr>
<tr>
<td valign="top" class="tx-normal-grey">
Product detail
<a href="http://www.example.com/catalog/proddetail.asp?logon=&langid=EN&sku_id=0665000FS10129471&catid=25653">
More Info</a></td>
</tr>
<tr>
<td valign="top" class="element-spacer"/>
</tr>
<tr>
<td valign="top" class="tx-normal-red">
<span class="tx-strong-dgrey">Price:</span>
$189.99</td>
</tr>
<tr>
<td valign="top">You save: $9.00 after instant savings</td>
</tr>
<tr>
<td valign="top" class="element-spacer"/>
</tr>
<tr>
<td valign="top" class="tx-normal-grey">
<a href="http://www.example.com/catalog/subclass.asp?catid=25653&logon=&langid=EN">
View similar products</a>
<a href="http://www.example.com/catalog/mfr.asp?man=Apple&catid=19&logon=&langid=EN">
View similar products with same brand</a>
</td></tr>
<tr>
<td valign="top" class="element-spacer"/>
</tr>
</tbody>
</table>
I want to be able to get the $189.99.
echo $ret[0]->find('tr', 4)->plaintext;
This outputs 'Price: $189.99', but I need only $189.99, without the 'Price:' label.

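// Split on the colon and keep the part after the "Price:" label.
$exp = explode(":", $ret[0]->find('tr', 4)->plaintext);
$price = trim($exp[1]); // trim() drops the whitespace around the amount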

Related

Simple Dom Parser or CURL TABLE PARSING

I need help getting the data out of a table. It is an internet usage table, and the HTML is below:
<table width="572" border="0" align="center" cellspacing="0">
<tbody><tr valign="top">
<td width="1" class="bgsidelines"></td>
<td width="*" class="bgbottom">
<table summary="" width="100%" border="0" cellpadding="0">
<tbody><tr>
<td width="10" rowspan="2" bgcolor="#CCCCCC"></td>
<td width="443">
<table width="443" height="10" border="0" align="center" cellpadding="8">
<tbody>
<tr>
<td width="100%" class="path"><b>Internet usage</b></td>
</tr>
<tr>
<td class="reg"><!-- Begin yours codes -->
<table width="100%" cellpadding="0" cellspacing="0" border="0">
<tbody><tr>
<table cellpadding="5" cellspacing="1" border="0">
<tbody>
<tr>
<td width="43" bgcolor="#EEEEEE" class="grey"><b><center>MB</center></b>
</td>
<td width="44" bgcolor="#EEEEEE" class="grey"><b><center>GB</center></b>
</td>
<td width="44" bgcolor="#EEEEEE" class="grey"><b><center>MB</center></b>
</td>
<td width="44" bgcolor="#EEEEEE" class="grey"><b><center>GB</center></b>
</td>
<td width="60" bgcolor="#EEEEEE" class="grey"><b><center>MB</center></b>
</td>
<td width="60" bgcolor="#EEEEEE" class="grey"><b><center>GB</center></b>
</td>
</tr>
<tr>
<td bgcolor="#FFFFFF" class="reg" nowrap="nowrap">2017-06-01 to<br>2017-
06-18</td>
<td bgcolor="#FFFFFF" align="right" valign="top" class="reg">54815.06</td>
<td bgcolor="#FFFFFF" align="right" valign="top" class="reg">53.53</td>
<td bgcolor="#FFFFFF" align="right" valign="top" class="reg">52114.59</td>
<td bgcolor="#FFFFFF" align="right" valign="top" class="reg">50.89</td>
<td bgcolor="#FFFFFF" align="right" valign="top" class="reg">106929.65</td>
<td bgcolor="#FFFFFF" align="right" valign="top" class="reg">104.42</td>
</tr>
</tbody></table></td></tr>
</tbody></table>
<!-- End yours codes -->
</tr>
</tbody></table></td></tr>
</tbody></table></td></tr>
</tbody></table>
My approach works only some of the time, which must be due to the user agent. It also fetches the entire table, whereas I would like each internet usage value separately, i.e. the ones in the td class="reg" cells (54815.06, 53.53, ...). This is hard because there is a table nested inside a table.
My PHP:
require_once 'advanced_html_dom.php';
$numvl = $_POST['numvl'];
$url = 'https://extranet.videotron.com/services/secur/extranet/tpia/Usage.do?compteInternet='.$numvl;
$html = new AdvancedHtmlDom();
$html->load_file($url);
$element = $html->find("tr");
echo $element[1]->innertext;
There is no need for an external library (advanced_html_dom.php? I have never heard of it); just use PHP's built-in DOMDocument and DOMXPath.
Example:
<?php
declare(strict_types=1);
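$domd = new DOMDocument();
// @ suppresses any warnings libxml may raise for malformed markup.
@$domd->loadHTML(getHTML());
$xpath = new DOMXPath($domd);
// In XPath, @ selects an attribute: match only the value cells (valign='top', class='reg').
foreach ($xpath->query("//td[@valign='top' and @class='reg']") as $ele) {
    var_dump($ele->textContent);
}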
function getHTML():string{
$html=<<<'HTML'
<table width="572" border="0" align="center" cellspacing="0">
<tbody><tr valign="top">
<td width="1" class="bgsidelines"></td>
<td width="*" class="bgbottom">
<table summary="" width="100%" border="0" cellpadding="0">
<tbody><tr>
<td width="10" rowspan="2" bgcolor="#CCCCCC"></td>
<td width="443">
<table width="443" height="10" border="0" align="center" cellpadding="8">
<tbody>
<tr>
<td width="100%" class="path"><b>Internet usage</b></td>
</tr>
<tr>
<td class="reg"><!-- Begin yours codes -->
<table width="100%" cellpadding="0" cellspacing="0" border="0">
<tbody><tr>
<table cellpadding="5" cellspacing="1" border="0">
<tbody>
<tr>
<td width="43" bgcolor="#EEEEEE" class="grey"><b><center>MB</center></b>
</td>
<td width="44" bgcolor="#EEEEEE" class="grey"><b><center>GB</center></b>
</td>
<td width="44" bgcolor="#EEEEEE" class="grey"><b><center>MB</center></b>
</td>
<td width="44" bgcolor="#EEEEEE" class="grey"><b><center>GB</center></b>
</td>
<td width="60" bgcolor="#EEEEEE" class="grey"><b><center>MB</center></b>
</td>
<td width="60" bgcolor="#EEEEEE" class="grey"><b><center>GB</center></b>
</td>
</tr>
<tr>
<td bgcolor="#FFFFFF" class="reg" nowrap="nowrap">2017-06-01 to<br>2017-
06-18</td>
<td bgcolor="#FFFFFF" align="right" valign="top" class="reg">54815.06</td>
<td bgcolor="#FFFFFF" align="right" valign="top" class="reg">53.53</td>
<td bgcolor="#FFFFFF" align="right" valign="top" class="reg">52114.59</td>
<td bgcolor="#FFFFFF" align="right" valign="top" class="reg">50.89</td>
<td bgcolor="#FFFFFF" align="right" valign="top" class="reg">106929.65</td>
<td bgcolor="#FFFFFF" align="right" valign="top" class="reg">104.42</td>
</tr>
</tbody></table></td></tr>
</tbody></table>
<!-- End yours codes -->
</tr>
</tbody></table></td></tr>
</tbody></table></td></tr>
</tbody></table>
HTML;
return $html;
}
output:
string(8) "54815.06"
string(5) "53.53"
string(8) "52114.59"
string(5) "50.89"
string(9) "106929.65"
string(6) "104.42"

HTML CSS PHP, Table design

I am having trouble creating a table like the one in this image on a website. Can anyone help me? The part I mean is marked with a blue line in the picture. I just don't know how to build a table like this, especially the Reference No row, which has a custom border line.
Here is the table design.
You can also check my code here: MY CODE
This is my code:
<html xmlns="http://www.w3.org/1999/xhtml">
<head profile="http://www.w3.org/2005/10/profile">
<title>Glisten - A free web template</title>
</head>
<body>
<table width="800" border="1" align="center">
<tbody>
<tr>
<td colspan="2" align="center" bgcolor=""><table width="800" border="1" align="center">
<tbody>
<tr>
<td width="125">Reference No</td>
<td colspan="4"> </td>
<td colspan="2" bgcolor="#8B8A8A" align="center"></td>
</tr>
<tr>
<td align="center" bgcolor="#FF0004"><strong>NG TINEM</strong></td>
</tr>
<tr>
<td>Site ID</td>
<td colspan="3" align="center"></td>
<td width="185">BSC Name</td>
<td colspan="2" align="center"></td>
</tr>
<tr>
<td>Site Name</td>
<td colspan="3" align="center"></td>
<td>New Site ID</td>
<td colspan="2" align="center"></td>
</tr>
<tr>
<td>Sales Cluster</td>
<td colspan="3" align="center"></td>
<td>LAC</td>
<td colspan="2" align="center"></td>
</tr>
<tr>
<td>Ne Type</td>
<td colspan="3" align="center"></td>
<td>Config</td>
<td colspan="2" align="center"></td>
</tr>
<tr>
<td>Band</td>
<td colspan="3" align="center"></td>
<td>PO Number</td>
<td colspan="2" align="center"></td>
</tr>
<tr>
<td>Cell ID</td>
<td width="80" align="center"></td>
<td width="82" align="center"></td>
<td width="80" align="center"></td>
<td> </td>
<td colspan="2"> </td>
</tr>
<tr>
<td colspan="7" > </td>
</tr>
<tr>
<td align="center">Integration Date</td>
<td align="center"></td>
<td align="center">On Air Date</td>
<td align="center"></td>
<td align="center">Acceptance Date</td>
<td colspan="2" align="center"></td>
</tr>
<tr>
</tbody>
</table>
</body></html>
First, work out the total number of columns the table needs.
Then use the "colspan" attribute on individual cells (td or th) so that a cell spans several columns.
Add the following style to your table (I have added the class "table" to your second table tag):
<style type="text/css">
.table>tbody>tr>td, .table>tbody>tr>th, .table>tfoot>tr>td, .table>tfoot>tr>th, .table>thead>tr>td, .table>thead>tr>th
{
padding: 8px;
line-height: 1.42857143;
vertical-align: top;
}
table {
border-collapse: collapse;
border-spacing: 0;
-webkit-border-horizontal-spacing: 0px;
-webkit-border-vertical-spacing: 0px;
}
</style>
<table width="800" align="center">
<tbody>
<tr>
<td colspan="2" align="center" bgcolor=""><table class="table" width="800" border="1" align="center">
<tbody>
<?php
// mysql_* functions were removed in PHP 7; this assumes $query is a mysqli_result.
while($data = mysqli_fetch_assoc($query)){
if($data['dt_report']=='Yes'){
$check_dt_report='checked="checked"';
}
else{
$check_dt_report='';
}if($data['kpi_stats']=='Yes'){
$check_kpi_stats='checked="checked"';
}
else{
$check_kpi_stats='';
}
if($data['clear_alarm']=='Yes'){
$check_clear_alarm='checked="checked"';
}
else{
$check_clear_alarm='';
}
if($data['configuration']=='Yes'){
$check_configuration='checked="checked"';
}
else{
$check_configuration='';
}
if($data['neighbor']=='Yes'){
$check_neighbor='checked="checked"';
}
else{
$check_neighbor='';
}
?>
<tr>
<td width="125">Reference No</td>
<td colspan="4"> </td>
<td colspan="2" bgcolor="#8B8A8A" align="center"><?php echo $data['no_ref']; ?></td>
</tr>
<tr>
<td align="center" bgcolor="#FF0004"><strong>NG TINEM</strong></td>
</tr>
<tr>
<td>Site ID</td>
<td colspan="3" align="center"><?php echo $data['site_id']; ?></td>
<td width="185">BSC Name</td>
<td colspan="2" align="center"><?php echo $data['bsc_name']; ?></td>
</tr>
<tr>
<td>Site Name</td>
<td colspan="3" align="center"><?php echo $data['site_name']; ?></td>
<td>New Site ID</td>
<td colspan="2" align="center"><?php echo $data['new_site_id']; ?></td>
</tr>
<tr>
<td>Sales Cluster</td>
<td colspan="3" align="center"><?php echo $data['sales_cluster']; ?></td>
<td>LAC</td>
<td colspan="2" align="center"><?php echo $data['lac']; ?></td>
</tr>
<tr>
<td>Ne Type</td>
<td colspan="3" align="center"><?php echo $data['ne_type']; ?></td>
<td>Config</td>
<td colspan="2" align="center"><?php echo $data['config']; ?></td>
</tr>
<tr>
<td>Band</td>
<td colspan="3" align="center"><?php echo $data['band']; ?></td>
<td>PO Number</td>
<td colspan="2" align="center"><?php echo $data['po_number']; ?></td>
</tr>
<tr>
<td>Cell ID</td>
<td width="80" align="center"><?php echo $data['cell_id1']; ?></td>
<td width="82" align="center"><?php echo $data['cell_id2']; ?></td>
<td width="80" align="center"><?php echo $data['cell_id3']; ?></td>
<td> </td>
<td colspan="2"> </td>
</tr>
<tr>
<td colspan="7" > </td>
</tr>
</tbody>
</table>

Duplicated Results on SQL Query

PHP newbie here! I have been struggling with this for a few days now and have decided I can't figure it out on my own.
Basically, I have two database tables, "projects_2016" and "attachment".
I want to show the data from "projects_2016" in the top table, then check "attachment" for a matching id number and, if one exists, list all of the matching results under the "projects_2016" data.
At the moment it works, but it duplicates the "projects_2016" data for every "attachment" entry.
Here is my code; any input is appreciated!
PS: I am not too concerned about SQL injection yet. Still learning that!
<?php include '../../../connection_config.php';
$sql = "SELECT DISTINCT * FROM attachment JOIN projects_2016 ON attachment.attachment_ABE_project_number = projects_2016.id ORDER BY `attachment_ABE_project_number` DESC";
$result = $conn->query($sql);
if ($result->num_rows > 0) {
while($row = $result->fetch_assoc()) {
?>
<table width="20" border="1" cellspacing="0" cellpadding="2">
<tr>
<th height="0" scope="col"><table width="990" border="0" align= "center" cellpadding="3" cellspacing="0">
<tr class="text_report">
<td width="107" height="30" align="left" valign="middle" nowrap="nowrap" bgcolor="#F5F5F5"><strong>PNo</strong></td>
<td width="871" align="left" valign="middle" nowrap="nowrap" bgcolor="#F5F5F5"><strong>Project Name</strong></td>
</tr>
<tr>
<td height="20" align="left" valign="middle" bgcolor="#FFFFFF" class="text_report"><strong><?php echo "<br>". $row["ID"]. "<br>";?></strong></td>
<td height="20" align="left" valign="middle" bgcolor= "#FFFFFF" class="text_report"><strong><?php echo "<br>". $row["project_name"]. "<br>";?></strong></td>
</tr>
</table>
<?php
$photo_id = $row["ID"];
$contacts = "SELECT DISTINCT * FROM attachment WHERE attachment_ABE_project_number = '$photo_id'" ;
$result_contacts = $conn->query($contacts);
if ($result_contacts->num_rows > 0) {
// output data of each row
while($row_contacts = $result_contacts->fetch_assoc()) {
?>
<table width="990" border="0" align="center" cellpadding= "3" cellspacing="0" class="text_report">
<tr>
<td height="0" colspan="4" align="left" valign="middle" nowrap="nowrap" bgcolor="#FFFFFF"> </td>
</tr>
<tr>
<td width="319" height="30" align="left" valign="middle" nowrap="nowrap" bgcolor="#F5F5F5"><strong>File Name</strong></td>
<td width="279" align="left" valign="middle" nowrap="nowrap" bgcolor="#F5F5F5"><strong>File Type</strong></td>
<td width="315" align="left" valign="middle" nowrap="nowrap" bgcolor="#F5F5F5"><strong>File Size (KB)</strong></td>
<td width="53" align="right" valign="middle" nowrap="nowrap" bgcolor="#F5F5F5"><strong>View File</strong></td>
</tr>
<tr>
<td height="20" align="left" valign="middle" bgcolor="#FFFFFF"><?php echo $row_contacts ['file'] ?></td>
<td height="20" align="left" valign="middle" bgcolor="#FFFFFF"><?php echo $row_contacts ['type'] ?></td>
<td height="20" align="left" valign="middle" bgcolor="#FFFFFF"><?php echo $row_contacts ['size'] ?></td>
<td align="right" valign="middle" bgcolor="#FFFFFF">view file</td>
</tr>
<tr>
<td height="0" colspan="4" align="left" valign="middle" bgcolor="#FFFFFF"> </td>
</tr>
<?php
}
?>
</table>
<?php
}
?></th>
</tr>
</table>
<table width="1000" border="0" cellspacing="0" cellpadding="0">
<tr>
<th height="26"> </th>
</tr>
</table>
<p>
<?php
}
}
?>
</p>
</table>
<?php $conn->close();
?>
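The JOIN in the outer query returns one row per attachment, so each project is repeated once for every file it has. Select from projects_2016 only, and use EXISTS to keep just the projects that have at least one attachment:
$sql = "SELECT * FROM projects_2016
WHERE EXISTS (SELECT * FROM attachment WHERE projects_2016.id = attachment_ABE_project_number) ORDER BY id DESC ";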

Python regex ignore new line

I have a web page that looks like this:
<td valign="top">
<table width="100%" border="0" cellspacing="2" cellpadding="1" class="main_tb3">
<tr>
<td colspan="2">
<div align="center">
<a href="/title/name.php" target="_blank">
<img src="./movie/image.jpg" alt="TitleName" border="0" height="100" width="225" />
</a>
</div>
</td>
</tr>
<tr>
<td colspan="2"><h1 align="center">Title - secondname</h1></td>
</tr>
<tr>
<td><span class="style10">Cat1 :</span></td>
<td>1st name</td>
</tr>
<tr>
<td width="32%"><span class="style10">Cat2 :</span></td>
<td width="68%"><b><i>secondname</i></b></td>
</tr>
<tr>
<td><span class="style10">cat4 :</span></td>
<td>Bla bla</td>
</tr>
<tr>
<td><span class="style10">Cat3 :</span></td>
<td>thirdName2</td>
</tr>
</table>
</td>
<td valign="top">
<table width="100%" border="0" cellspacing="2" cellpadding="1" class="main_tb3">
<tr>
<td colspan="2">
<div align="center">
<a href="/title/name.php" target="_blank">
<img src="./movie/image.jpg" alt="TitleName" border="0" height="100" width="225" />
</a>
</div>
</td>
</tr>
<tr>
<td colspan="2"><h1 align="center">Title - secondname</h1></td>
</tr>
<tr>
<td><span class="style10">Cat1 :</span></td>
<td>1st name</td>
</tr>
<tr>
<td width="32%"><span class="style10">Cat2 :</span></td>
<td width="68%"><b><i>secondname</i></b></td>
</tr>
<tr>
<td><span class="style10">cat4 :</span></td>
<td>Bla bla</td>
</tr>
<tr>
<td><span class="style10">Cat3 :</span></td>
<td>thirdName2</td>
</tr>
</table>
</td>
I would like to get certain values from this page using a Python regex.
After <div align="center"> I would like to get the href value "/title/name.php", the img src "./movie/image.jpg", and the text "Title - secondname" from <h1 align="center">Title - secondname</h1>.
I have tried this:
regex = 'class="main_tb3"*\n<a href="(.+?)" target="_blank">\n<img src="(.+?)"'
Please help me.
You can use the regexes below.
For the href value: <a href="(.*?)"
For the image src: <img src="(.*?)"
For the title: <h1 align="center">(.*?)<
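To address the newline part of the question directly, here is a minimal sketch, assuming the markup always has the shape shown in the question. It uses \s* to skip the line breaks between tags and re.DOTALL so that a lazy .*? can cross lines on the way to the <h1>. The sample string and the variable names are illustrative only, and a regex like this is fragile if the attribute order or spacing on the real page changes.

```python
import re

# Shortened sample of the markup from the question (one table only).
html = """<table width="100%" border="0" cellspacing="2" cellpadding="1" class="main_tb3">
<tr>
<td colspan="2">
<div align="center">
<a href="/title/name.php" target="_blank">
<img src="./movie/image.jpg" alt="TitleName" border="0" height="100" width="225" />
</a>
</div>
</td>
</tr>
<tr>
<td colspan="2"><h1 align="center">Title - secondname</h1></td>
</tr>
</table>"""

# \s* matches any run of whitespace, including the newlines between tags.
# re.DOTALL lets the lazy ".*?" also cross line breaks on the way to <h1>.
pattern = re.compile(
    r'<div align="center">\s*'
    r'<a href="(.+?)" target="_blank">\s*'
    r'<img src="(.+?)"'
    r'.*?<h1 align="center">(.+?)</h1>',
    re.DOTALL,
)

# One (href, src, title) tuple per table found in the input.
matches = pattern.findall(html)
for href, src, title in matches:
    print(href, src, title)
```

Because every quantifier is lazy, each match stops at the first closing quote or the first <h1> it reaches, so running the same pattern over a page with several such tables yields one tuple per table.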
You will find it a lot simpler to install something like BeautifulSoup to do this:
from bs4 import BeautifulSoup
html = """
<td valign="top">
<table width="100%" border="0" cellspacing="2" cellpadding="1" class="main_tb3">
<tr>
<td colspan="2">
<div align="center">
<a href="/title/name.php" target="_blank">
<img src="./movie/image.jpg" alt="TitleName" border="0" height="100" width="225" />
</a>
</div>
</td>
</tr>
<tr>
<td colspan="2"><h1 align="center">Title - secondname</h1></td>
</tr>
<tr>
<td><span class="style10">Cat1 :</span></td>
<td>1st name</td>
</tr>
<tr>
<td width="32%"><span class="style10">Cat2 :</span></td>
<td width="68%"><b><i>secondname</i></b></td>
</tr>
<tr>
<td><span class="style10">cat4 :</span></td>
<td>Bla bla</td>
</tr>
<tr>
<td><span class="style10">Cat3 :</span></td>
<td>thirdName2</td>
</tr>
</table>
</td>
<td valign="top">
<table width="100%" border="0" cellspacing="2" cellpadding="1" class="main_tb3">
<tr>
<td colspan="2">
<div align="center">
<a href="/title/name.php" target="_blank">
<img src="./movie/image.jpg" alt="TitleName" border="0" height="100" width="225" />
</a>
</div>
</td>
</tr>
<tr>
<td colspan="2"><h1 align="center">Title - secondname</h1></td>
</tr>
<tr>
<td><span class="style10">Cat1 :</span></td>
<td>1st name</td>
</tr>
<tr>
<td width="32%"><span class="style10">Cat2 :</span></td>
<td width="68%"><b><i>secondname</i></b></td>
</tr>
<tr>
<td><span class="style10">cat4 :</span></td>
<td>Bla bla</td>
</tr>
<tr>
<td><span class="style10">Cat3 :</span></td>
<td>thirdName2</td>
</tr>
</table>
</td>"""
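# Name the parser explicitly so the result does not depend on which parsers are installed.
soup = BeautifulSoup(html, "html.parser")
for table in soup.find_all("table", class_="main_tb3"):
    print(table.find('a').get('href'))
    print(table.find('h1').text)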
For the HTML you have given, this will print the following:
/title/name.php
Title - secondname
/title/name.php
Title - secondname
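If installing a third-party package is not an option, the same idea can be sketched with the standard library's html.parser module. This is only an illustration under the assumption that the tables are not nested inside each other; the class name TableInfoParser and the record layout are hypothetical and not part of any library. Unlike the snippet above, it also collects the img src that the question asked for.

```python
from html.parser import HTMLParser


class TableInfoParser(HTMLParser):
    """Collects the first link, first image and <h1> text of each main_tb3 table."""

    def __init__(self):
        super().__init__()
        self.records = []    # one dict per matching table
        self.current = None  # record being filled, or None outside a table
        self.in_h1 = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "table" and attrs.get("class") == "main_tb3":
            self.current = {"href": None, "src": None, "title": ""}
        elif self.current is None:
            return  # ignore everything outside a matching table
        elif tag == "a" and self.current["href"] is None:
            self.current["href"] = attrs.get("href")
        elif tag == "img" and self.current["src"] is None:
            self.current["src"] = attrs.get("src")
        elif tag == "h1":
            self.in_h1 = True

    def handle_endtag(self, tag):
        if tag == "h1":
            self.in_h1 = False
        elif tag == "table" and self.current is not None:
            self.records.append(self.current)
            self.current = None

    def handle_data(self, data):
        # Only text inside an <h1> of a matching table is kept.
        if self.in_h1 and self.current is not None:
            self.current["title"] += data


# Shortened sample of the markup from the question (one table only).
sample = """<table width="100%" border="0" cellspacing="2" cellpadding="1" class="main_tb3">
<tr><td colspan="2"><div align="center">
<a href="/title/name.php" target="_blank">
<img src="./movie/image.jpg" alt="TitleName" border="0" height="100" width="225" />
</a></div></td></tr>
<tr><td colspan="2"><h1 align="center">Title - secondname</h1></td></tr>
</table>"""

parser = TableInfoParser()
parser.feed(sample)
parser.close()
records = parser.records
for record in records:
    print(record["href"], record["src"], record["title"])
```

The parser is event driven, so it never needs the whole document to be well-formed; it simply reacts to the tags it recognises and ignores the rest.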

PHP DOM get element which contains

I need help parsing HTML with PHP DOM.
This is a small part of a much larger HTML document:
<table width="100%" border="0" align="center" cellspacing="3" cellpadding="0" bgcolor='#ffffff'>
<tr>
<td align="left" valign="top" width="20%">
<span class="tl">Obchodne meno:</span>
</td>
<td align="left" width="80%">
<table width="100%" border="0">
<tr>
<td width="67%">
<span class='ra'>STORE BUSSINES</span>
</td>
<td width="33%" valign='top'>
<span class='ra'>(od: 02.10.2012)</span>
</td>
</tr>
</table>
</td>
</tr>
</table>
What I need is the text "STORE BUSSINES". Unfortunately, the only thing I can match on is "Obchodne meno", the content of the first span, so starting from that element I need to walk to parent->parent->first sibling->child->child->child->child->content. I have limited experience parsing HTML in PHP, so any help would be valuable. Thanks in advance!
Use the DOMDocument class: loop over the <span> tags and collect their text in an array.
<?php
$html=<<<XCOE
<table width="100%" border="0" align="center" cellspacing="3" cellpadding="0" bgcolor='#ffffff'>
<tr>
<td align="left" valign="top" width="20%">
<span class="tl">Obchodne meno:</span>
</td>
<td align="left" width="80%">
<table width="100%" border="0">
<tr>
<td width="67%">
<span class='ra'>STORE BUSSINES</span>
</td>
<td width="33%" valign='top'>
<span class='ra'>(od: 02.10.2012)</span>
</td>
</tr>
</table>
</td>
</tr>
</table>
XCOE;
$dom = new DOMDocument;
$dom->loadHTML($html);
$spanarr = []; // text of every <span>, in document order
foreach ($dom->getElementsByTagName('span') as $tag) {
    $spanarr[] = $tag->nodeValue;
}
echo $spanarr[1]; // prints "STORE BUSSINES", the second span in the markup
