Is there a regex to read the <td> contents from the following table? Note that the page contains many similar tables, and I only want to read the contents of this one.
I want to read the data from the 3rd, 4th and 5th <td> of every <tr>.
My regex looks like the following, but it doesn't work:
$match = preg_replace('~<td width="80" bgcolor="#F3F3E4" align="left"> <a onmouseout="ChangeImage(AE1,1)" onmouseover="ChangeImage(AE1,0)" href="/charts/livegold.html">GOLD</a></td>#[a-z0-9]{6}~i','',$match[3]);
echo '<table><tr>' . $match . '</tr></table>';
The table is as follows:
<table width="540" cellspacing="1" cellpadding="0" border="0" align="center">
<tbody><tr>
<td width="16" bgcolor="#000000" align="center"> </td>
<td width="80" bgcolor="#000000" align="center"><font size="1" face="Arial, Helvetica, sans-serif" color="#FFFFFF">www.kitco.com</font></td>
<td width="369" bgcolor="#000000" align="center" colspan="5"><p class="white">The World Spot Price - Asia/Europe/NY markets</p></td>
<td width="73" bgcolor="#000000" align="right"><img width="39" vspace="0" hspace="0" height="17" border="0" alt="light" src="/images/lightgreen.gif"></td>
</tr>
<tr>
<td width="16" bgcolor="#000000" align="center"> </td>
<td width="522" bgcolor="#F3F3E4" align="center" colspan="7"><font size="2" face="Verdana, Arial, Helvetica, sans-serif"><b><font color="GREEN">MARKET IS OPEN</font><br>(Will close in 17 hrs. 41 mins.)<!-- 1486.00--></b></font></td>
</tr>
<tr bgcolor="#F3F3E4">
<td width="16" bgcolor="#000000" align="center"> </td>
<td width="80" bgcolor="#CCCC99" align="center">Metals</td>
<td width="80" bgcolor="#CCCC99" align="center">Date</td>
<td width="80" bgcolor="#CCCC99" align="center">Time (EST)</td>
<td width="68" bgcolor="#CCCC99" align="center">Bid</td>
<td width="68" bgcolor="#CCCC99" align="center">Ask</td>
<td width="146" bgcolor="#CCCC99" align="center" colspan="2">Change from NY Close</td>
</tr>
<tr bgcolor="#F3F3E4">
<td width="16" bgcolor="#000000" align="center"><a onmouseout="ChangeImage('AE1','1')" onmouseover="ChangeImage('AE1','0')" href="/charts/livegold.html"><img width="16" vspace="0" hspace="0" height="13" border="0" alt="Gold Charts" name="AE1" src="http://www.kitco.com/images/graph_down.gif"></a></td>
<td width="80" bgcolor="#F3F3E4" align="left"> <a onmouseout="ChangeImage('AE1','1')" onmouseover="ChangeImage('AE1','0')" href="/charts/livegold.html">GOLD</a></td>
<td width="80" align="center">06/04/2013</td>
<td width="80" align="center">23:34</td>
<td width="68" align="center">1405.50</td>
<td width="68" align="center">1406.50</td>
<td width="73" align="center"><p class="spotgreen">+5.50</p></td>
<td width="73" align="center"><p class="spotgreen">+0.39%</p></td>
</tr>
<tr bgcolor="#F3F3E4">
<td width="16" bgcolor="#000000" align="center"><a onmouseout="ChangeImage('AE2','1')" onmouseover="ChangeImage('AE2','0')" href="/charts/livesilver.html"><img width="16" vspace="0" hspace="0" height="13" border="0" alt="Silver Charts" name="AE2" src="http://www.kitco.com/images/graph_down.gif"></a></td>
<td width="80" align="left"> <a onmouseout="ChangeImage('AE2','1')" onmouseover="ChangeImage('AE2','0')" href="/charts/livesilver.html">SILVER</a></td>
<td width="80" align="center">06/04/2013</td>
<td width="80" align="center">23:34</td>
<td width="68" align="center">22.59</td>
<td width="68" align="center">22.69</td>
<td width="73" align="center"><p class="spotgreen">+0.05</p></td>
<td width="73" align="center"><p class="spotgreen">+0.20%</p></td>
</tr>
<tr bgcolor="#F3F3E4">
<td width="16" bgcolor="#000000" align="center"><a onmouseout="ChangeImage('AE3','1')" onmouseover="ChangeImage('AE3','0')" href="/charts/liveplatinum.html"><img width="16" vspace="0" hspace="0" height="13" border="0" alt="Platinum Charts" name="AE3" src="http://www.kitco.com/images/graph_down.gif"></a></td>
<td width="80" align="left"><p> <a onmouseout="ChangeImage('AE3','1')" onmouseover="ChangeImage('AE3','0')" href="/charts/liveplatinum.html">PLATINUM</a></p></td>
<td width="80" align="center">06/04/2013</td>
<td width="80" align="center">23:34</td>
<td width="68" align="center">1501.00</td>
<td width="68" align="center">1509.00</td>
<td width="73" align="center"><p class="spotgreen">+9.00</p></td>
<td width="73" align="center"><p class="spotgreen">+0.60%</p></td>
</tr>
<tr bgcolor="#F3F3E4">
<td width="16" bgcolor="#000000" align="center"><a onmouseout="ChangeImage('AE4','1')" onmouseover="ChangeImage('AE4','0')" href="/charts/livepalladium.html"><img width="16" vspace="0" hspace="0" height="13" border="0" alt="Palladium Charts" name="AE4" src="http://www.kitco.com/images/graph_down.gif"></a></td>
<td width="80" align="left"> <a onmouseout="ChangeImage('AE4','1')" onmouseover="ChangeImage('AE4','0')" href="/charts/livepalladium.html">PALLADIUM</a></td>
<td width="80" align="center">06/04/2013</td>
<td width="80" align="center">23:25</td>
<td width="68" align="center">755.00</td>
<td width="68" align="center">761.00</td>
<td width="73" align="center"><p class="spotgreen">+6.00</p></td>
<td width="73" align="center"><p class="spotgreen">+0.80%</p></td>
</tr>
</tbody></table>
Here is the data I want to extract.
OK, here is a solution using a regex:
$patt = "/<td[^>]*width=['\"]68['\"][^>]*>([0-9\.]+)<\/td>\s*<td[^>]*width=['\"]68['\"][^>]*>([0-9\.]+)<\/td>/i";
if (preg_match_all($patt, $html, $matches))
{
    // $matches[1] holds every Bid value, $matches[2] every Ask value
    //print_r($matches);
    for ($i = 0; $i < count($matches[1]); $i++)
    {
        echo "Bid: ".$matches[1][$i].", Ask: ".$matches[2][$i]."\n";
    }
}
You can use an HTML parser instead. For PHP there is one at http://simplehtmldom.sourceforge.net/
Once the DOM has been loaded into the library, find the TABLE element and iterate through each of its TR elements.
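As a rough sketch of that approach, the following uses PHP's built-in DOMDocument and DOMXPath rather than the Simple HTML DOM library linked above (that substitution is mine). The sample markup is a trimmed-down copy of the table from the question, and the function name extractCells and the idea of locating the table through its "Metals" header cell are assumptions made for illustration.

```php
<?php
// Trimmed-down sample of the question's table (illustrative only).
$html = <<<'HTML'
<table>
<tr><td>icon</td><td>Metals</td><td>Date</td><td>Time (EST)</td><td>Bid</td><td>Ask</td></tr>
<tr><td>icon</td><td><a href="/charts/livegold.html">GOLD</a></td><td>06/04/2013</td><td>23:34</td><td>1405.50</td><td>1406.50</td></tr>
<tr><td>icon</td><td><a href="/charts/livesilver.html">SILVER</a></td><td>06/04/2013</td><td>23:34</td><td>22.59</td><td>22.69</td></tr>
</table>
HTML;

/**
 * Hypothetical helper: returns the text of the 3rd, 4th and 5th <td>
 * of every row that follows the header row containing "Metals".
 */
function extractCells(string $html): array
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);   // tolerate sloppy real-world HTML
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    // Find the header row by its "Metals" cell, then take the rows after it.
    $rows = $xpath->query("//tr[td[normalize-space()='Metals']]/following-sibling::tr");

    $result = [];
    foreach ($rows as $row) {
        $cells = $row->getElementsByTagName('td');
        if ($cells->length < 5) {
            continue;                    // not a data row
        }
        $result[] = [
            trim($cells->item(2)->textContent),   // 3rd cell
            trim($cells->item(3)->textContent),   // 4th cell
            trim($cells->item(4)->textContent),   // 5th cell
        ];
    }
    return $result;
}

print_r(extractCells($html));
```

Anchoring on the header cell is one way to single out this table when the page contains many similar ones; any other stable marker in the markup would serve the same purpose.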
I need help getting the data out of a table. It's an internet usage table, and the HTML code is below:
<table width="572" border="0" align="center" cellspacing="0">
<tbody><tr valign="top">
<td width="1" class="bgsidelines"></td>
<td width="*" class="bgbottom">
<table summary="" width="100%" border="0" cellpadding="0">
<tbody><tr>
<td width="10" rowspan="2" bgcolor="#CCCCCC"></td>
<td width="443">
<table width="443" height="10" border="0" align="center" cellpadding="8">
<tbody>
<tr>
<td width="100%" class="path"><b>Internet usage</b></td>
</tr>
<tr>
<td class="reg"><!-- Begin yours codes -->
<table width="100%" cellpadding="0" cellspacing="0" border="0">
<tbody><tr>
<table cellpadding="5" cellspacing="1" border="0">
<tbody>
<tr>
<td width="43" bgcolor="#EEEEEE" class="grey"><b><center>MB</center></b>
</td>
<td width="44" bgcolor="#EEEEEE" class="grey"><b><center>GB</center></b>
</td>
<td width="44" bgcolor="#EEEEEE" class="grey"><b><center>MB</center></b>
</td>
<td width="44" bgcolor="#EEEEEE" class="grey"><b><center>GB</center></b>
</td>
<td width="60" bgcolor="#EEEEEE" class="grey"><b><center>MB</center></b>
</td>
<td width="60" bgcolor="#EEEEEE" class="grey"><b><center>GB</center></b>
</td>
</tr>
<tr>
<td bgcolor="#FFFFFF" class="reg" nowrap="nowrap">2017-06-01 to<br>2017-
06-18</td>
<td bgcolor="#FFFFFF" align="right" valign="top" class="reg">54815.06</td>
<td bgcolor="#FFFFFF" align="right" valign="top" class="reg">53.53</td>
<td bgcolor="#FFFFFF" align="right" valign="top" class="reg">52114.59</td>
<td bgcolor="#FFFFFF" align="right" valign="top" class="reg">50.89</td>
<td bgcolor="#FFFFFF" align="right" valign="top" class="reg">106929.65</td>
<td bgcolor="#FFFFFF" align="right" valign="top" class="reg">104.42</td>
</tr>
</tbody></table></td></tr>
</tbody></table>
<!-- End yours codes -->
</tr>
</tbody></table></td></tr>
</tbody></table></td></tr>
</tbody></table>
I've done it in a way that works, but only sometimes; this must be due to the user agent. It also fetches the entire table, while I would like each internet usage value separately, the ones in the td class="reg" cells (54815.06, 53.53..). It's hard because there is a table inside a table. Also it's
My PHP:
require_once 'advanced_html_dom.php';
$numvl = $_POST['numvl'];
$url = 'https://extranet.videotron.com/services/secur/extranet/tpia/Usage.do?compteInternet='.$numvl;
$html = new AdvancedHtmlDom();
$html->load_file($url);
$element = $html->find("tr");
echo $element[1]->innertext;
There is no need for an external library (advanced_html_dom.php? I've never heard of it); just use PHP's DOMDocument and DOMXPath.
Example:
<?php
declare(strict_types=1);
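$domd=new DOMDocument();
@$domd->loadHTML(getHTML()); // @ suppresses warnings about the malformed markup
$xpath=new DOMXPath($domd);
// select the cells that carry both valign="top" and class="reg"
foreach($xpath->query("//td[@valign='top' and @class='reg']") as $ele){
var_dump($ele->textContent);
}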
function getHTML():string{
$html=<<<'HTML'
<table width="572" border="0" align="center" cellspacing="0">
<tbody><tr valign="top">
<td width="1" class="bgsidelines"></td>
<td width="*" class="bgbottom">
<table summary="" width="100%" border="0" cellpadding="0">
<tbody><tr>
<td width="10" rowspan="2" bgcolor="#CCCCCC"></td>
<td width="443">
<table width="443" height="10" border="0" align="center" cellpadding="8">
<tbody>
<tr>
<td width="100%" class="path"><b>Internet usage</b></td>
</tr>
<tr>
<td class="reg"><!-- Begin yours codes -->
<table width="100%" cellpadding="0" cellspacing="0" border="0">
<tbody><tr>
<table cellpadding="5" cellspacing="1" border="0">
<tbody>
<tr>
<td width="43" bgcolor="#EEEEEE" class="grey"><b><center>MB</center></b>
</td>
<td width="44" bgcolor="#EEEEEE" class="grey"><b><center>GB</center></b>
</td>
<td width="44" bgcolor="#EEEEEE" class="grey"><b><center>MB</center></b>
</td>
<td width="44" bgcolor="#EEEEEE" class="grey"><b><center>GB</center></b>
</td>
<td width="60" bgcolor="#EEEEEE" class="grey"><b><center>MB</center></b>
</td>
<td width="60" bgcolor="#EEEEEE" class="grey"><b><center>GB</center></b>
</td>
</tr>
<tr>
<td bgcolor="#FFFFFF" class="reg" nowrap="nowrap">2017-06-01 to<br>2017-
06-18</td>
<td bgcolor="#FFFFFF" align="right" valign="top" class="reg">54815.06</td>
<td bgcolor="#FFFFFF" align="right" valign="top" class="reg">53.53</td>
<td bgcolor="#FFFFFF" align="right" valign="top" class="reg">52114.59</td>
<td bgcolor="#FFFFFF" align="right" valign="top" class="reg">50.89</td>
<td bgcolor="#FFFFFF" align="right" valign="top" class="reg">106929.65</td>
<td bgcolor="#FFFFFF" align="right" valign="top" class="reg">104.42</td>
</tr>
</tbody></table></td></tr>
</tbody></table>
<!-- End yours codes -->
</tr>
</tbody></table></td></tr>
</tbody></table></td></tr>
</tbody></table>
HTML;
return $html;
}
Output:
string(8) "54815.06"
string(5) "53.53"
string(8) "52114.59"
string(5) "50.89"
string(9) "106929.65"
string(6) "104.42"
One of my old sites has pretty messy, outdated code, and I am having problems with its navigation menu in Chrome. It aligns perfectly in Firefox and IE, but for some reason in Chrome only the first 3 tabs are centered properly.
http://jsfiddle.net/8b2Cm/1/
<table width="765" border="0" align="center" cellpadding="0" cellspacing="0" bgcolor="#FFFFFF">
<td valign="top">
<table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr valign="top">
<td width="19%"><img src="http://LINK" alt="" width="331" height="95" border="0"></td>
<td width="81%"><img src="http://LINK/images/logo.jpg" alt="" width="434" height="95"></td>
</tr>
</table>
</td>
<td valign="top" class="back1"><table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr valign="top">
<td width="1%"><img src="http://LINK/images/left-top.jpg" width="23" height="30" alt=""></td>
<td width="98%" valign="middle"><table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr>
<td><table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr>
<td width="90%" class="left-text11"><div align="center"> Home</div></td>
<td width="10%"><img src="http://LINK/images/line1.jpg" width="8" height="30" alt=""></td>
</tr>
</table></td>
<td><table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr>
<td width="90%" class="left-text11"><div align="center" class="left-text11">
<? if(!$_SESSION['sbprj_userid'])
{
?><strong>Signup</strong>
<?
}else
{
?>My Account
<?
}
?></div></td>
<td width="10%"><img src="http://LINK/images/line1.jpg" width="8" height="30" alt=""></td>
</tr>
</table></td>
<td><table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr>
<td align="center" width="90%" class="left-text11"><div align="center">Free Poker Money </div></td>
</tr>
</table></td>
<td><table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr>
<td width="10%"><img src="http://LINK/images/line1.jpg" width="8" height="30" alt=""></td>
</tr>
</table></td>
<td><table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr>
<td align="center" width="90%" class="left-text11"><div align="center">Poker School</div></td>
</tr>
</table></td>
<td><table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr>
<td width="10%"><img src="http://LINK/images/line1.jpg" width="8" height="30" alt=""></td>
</tr>
</table></td>
<td class="left-text11"><table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr>
<td width="90%" class="left-text11"><div align="center">News </div></td>
<td width="10%"><img src="http://LINK/images/line1.jpg" width="8" height="30" alt=""></td>
</tr>
</table></td>
<td class="left-text11"><table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr>
<td width="90%" class="left-text11"><div align="center">Support </div></td>
<td width="10%"><img src="http://LINK/images/line1.jpg" width="8" height="30" alt=""></td>
</tr>
</table></td>
<td class="left-text11"><table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr>
<td width="90%" class="left-text11"><div align="center"><img src="http://LINK/images/facebook.png" border="0" width="28" height="25" /> <img src="http://LINK/images/twitter.png" border="0" width="28" height="25" /> <img src="http://LINK/images/googleplus.png" border="0" width="28" height="25" /></div></td>
</tr>
</table></td>
</tr>
</table></td>
<td width="1%"><img src="http://LINK/images/right-top.jpg" width="24" height="30" alt=""></td>
</tr>
</table></td>
<td valign="top"><table width="100%" border="0" cellspacing="0" cellpadding="0">
<tr valign="top">
<td width="1%"><img src="http://LINK/images/top-1.jpg" width="23" height="17" alt=""></td>
<td width="98%" class="back2"><img src="http://LINK/images/back2.jpg" width="9" height="17" alt=""></td>
<td width="1%"><img src="http://LINK/images/right-1.jpg" width="24" height="17" alt=""></td>
</tr>
</table></td>
This is the code. Any suggestions on how to fix this?
The separators are placed inconsistently.
Some separators sit inside the tab cells, while others stand alone in a separate column, so they take up as much space as a tab column. Check them and move them into the tab cells, as is done in the first two tabs. (By the way, you should put only the horizontal table with the tabs into the fiddle and the question, not the whole page; the extra markup makes it harder for us to understand.)
I am trying to load a view from a constructor, passing it the parameter $data. It was working fine earlier and has now suddenly stopped working. Below is the code:
permissionerror view:
<table width="100%" border="0" cellpadding="0" cellspacing="0">
<tr>
<td width="6" align="left" valign="top" background="<?php echo base_url();?>images/bg/topbg.gif"><img src="<?php echo base_url();?>images/bg/top-left.gif" width="5" height="34" /></td>
<td height="34" align="left" valign="middle" background="<?php echo base_url();?>images/bg/topbg.gif" class="heading">Access Forbidden</td>
<td width="6" align="right" valign="top" background="<?php echo base_url();?>images/bg/topbg.gif"><img src="<?php echo base_url();?>images/bg/top-right.gif" width="5" height="34" /></td>
</tr>
<tr>
<td align="left" valign="top" background="<?php echo base_url();?>images/bg/main-content-bg-left.gif"> </td>
<td height="165" valign="top" bgcolor="#FFFFFF" class="leftredheading"><div style="margin:30px auto;padding-left:15px"><? if(isset($sn_error)) echo $sn_error;?></div></td>
<td align="right" valign="top" background="<?php echo base_url();?>images/bg/main-content-bg-right.gif"> </td>
</tr>
<tr>
<td align="left" valign="bottom" background="<?php echo base_url();?>images/bg/bottom-bg.gif"><img src="<?php echo base_url();?>images/bg/bottom-left-corner.gif" width="5" height="5" /></td>
<td valign="top" background="<?php echo base_url();?>images/bg/bottom-bg.gif"></td>
<td align="right" valign="bottom" background="<?php echo base_url();?>images/bg/bottom-bg.gif"><img src="<?php echo base_url();?>images/bg/bottom-right-corner.gif" width="5" height="5" /></td>
</tr>
</table>
Template code:
<?php $this->load->view('includes/login/header'); ?>
<table width="100%" border="0" cellpadding="0" cellspacing="0" bgcolor="#e7e7de">
<tr>
<td width="18"><img src="<?php echo base_url();?>images/bg/zero.gif" width="18" height="1" /></td>
<td width="191"><img src="<?php echo base_url();?>images/bg/zero.gif" width="191" height="1" /></td>
<td width="18"><img src="<?php echo base_url();?>images/bg/zero.gif" width="18" height="1" /></td>
<td width="100%"> </td>
<td width="161"><img src="<?php echo base_url();?>images/bg/zero.gif" width="18" height="1" /></td>
<td width="191"><img src="<?php echo base_url();?>images/bg/zero.gif" width="191" height="1" /></td>
<td width="18"><img src="<?php echo base_url();?>images/bg/zero.gif" width="18" height="1" /></td>
</tr>
<tr>
<td> </td>
<td align="left" valign="top"><table width="191" border="0" cellspacing="0" cellpadding="0">
<tr>
<td background="<?php echo base_url();?>images/bg/left-menu-white-bg.gif"> <?php $this->load->view('includes/login/top_right'); ?> </td>
</tr>
<tr>
<td height="50" background="<?php echo base_url();?>images/bg/left-menu-white-bg.gif"> </td>
</tr>
<tr>
<td background="<?php echo base_url();?>images/bg/left-menu-white-bg.gif"><table width="166" border="0" align="center" cellpadding="0" cellspacing="0">
<tr>
<td align="left" valign="top"><img src="<?php echo base_url();?>images/bg/needhelp-top.gif" width="166" height="7" /></td>
</tr>
<tr>
<td align="center" valign="top" bgcolor="#F8F7EC"> <?php $this->load->view('includes/login/bottom_right'); ?>
<br /></td>
</tr>
<tr>
<td align="left" valign="top" bgcolor="#F8F7EC"><img src="<?php echo base_url();?>images/bg/needhelp-bottom.gif" width="166" height="7" /></td>
</tr>
</table>
<br /></td>
</tr>
<tr>
<td background="<?php echo base_url();?>images/bg/left-menu-white-bg.gif"><table width="191" border="0" cellspacing="0" cellpadding="0">
<tr>
<td height="6" align="left" valign="top"><img src="<?php echo base_url();?>images/bg/left-bttom.gif" width="6" height="6" /></td>
<td width="100%" background="<?php echo base_url();?>images/bg/leftbottom.gif"></td>
<td height="6" align="right" valign="top"><img src="<?php echo base_url();?>images/bg/right-bttom.gif" width="6" height="6" /></td>
</tr>
</table></td>
</tr>
</table>
<br /></td><td> </td>
<td valign="top" <? if($main_content != 'dashboard' || $main_content == 'publisher_dashboard') { ?> colspan="3" <? } ?>><?php $this->load->view($main_content); ?></td>
<td> </td>
<? if($main_content == 'dashboard' || $main_content == 'publisher_dashboard') { ?>
<td align="right" valign="top"> <?php $this->load->view('includes/login/left'); ?>
</td>
<? } ?>
<td> </td>
</tr>
</table>
<?php $this->load->view('includes/login/footer'); ?>
In the code above I check whether the logged-in user has permission to access the page. If so, the else branch runs; otherwise the if branch throws the error.
In the if case it loads the error template.
Can somebody please help me fix this issue?
Call parent::__construct(), not parent::Controller().
Edit:
After it was made clear that CodeIgniter 1.7.1 is used and after the view itself was shown, this is your issue:
Your view tries to echo $error instead of $sn_error. If you don't pass $data to the view, then $sn_error (which is checked for) isn't set and that error doesn't occur - that's why the view gets loaded in that case.
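To illustrate the point about variable names, here is a minimal sketch in plain PHP. It is not CodeIgniter's actual loader: the hypothetical renderView() function turns the keys of $data into local variables, roughly the way a view receives them, and the echoed markup stands in for the permissionerror view. The key name sn_error comes from the question; everything else is assumed for illustration.

```php
<?php
/**
 * Hypothetical stand-in for loading a view: each key of $data becomes
 * a local variable that the template code can read.
 */
function renderView(array $data = []): string
{
    extract($data);          // e.g. ['sn_error' => '...'] defines $sn_error
    ob_start();
    // Template body: guard with isset() so a missing key does not raise a warning.
    echo '<div>';
    if (isset($sn_error)) {
        echo htmlspecialchars($sn_error);
    }
    echo '</div>';
    return ob_get_clean();
}

echo renderView(['sn_error' => 'Access denied']), "\n"; // variable is set
echo renderView(), "\n";                                // variable is not set
```

The name used inside the template has to match the array key exactly; a template that reads a differently named variable only fails on the code path where that variable is actually evaluated.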
I'm trying to read a table from an HTML file into an array, and I'm stuck.
Any help would be appreciated.
Every table element should be stored in one array value.
Example: $arr[1]= DER HE1 ges 1
PHP:
<?php
libxml_use_internal_errors(true);
$i=0;
// new dom object
$dom = new DOMDocument();
//load the html
$html = $dom->loadHTMLFile("106642new.html");
//discard white space
$dom->preserveWhiteSpace = false;
//the table by its tag name
$tables = $dom->getElementsByTagName('table');
//get all rows from the table
$rows = $tables->item(0)->getElementsByTagName('tr');
// $test = $tables->item(0)->getElementsByTagName('td');
// loop over the table rows
foreach ($rows as $row) {
// get each column by tag name
$cols = $row->getElementsByTagName('td');
$i= $i + 1 ;
$value = "Nummer: ".$i.": ".$cols->item(0)->nodeValue.PHP_EOL;
// $value = "test: ".$i.": ".$cols->item(0)->nodeValue.PHP_EOL;
$cols = array(1, 2, 3, 4, 5);
echo $value;
// $cols[$i] = $row;
// echo the values
//echo $cols->item(0)->nodeValue ;
}
?>
HTML:
<body bgcolor="#FFFFFF" topmargin="0" leftmargin="0" marginwidth="0" marginheight="0">
<div align=left>
<table BORDER=0 CELLSPACING=0 CELLPADDING=0 WIDTH="100%" height="100%">
<tr><td valign="top"> </td></tr>
<tr><td valign="top">
<p font class="Header">Basisrooster schooljaar 2011 2012 (m.i.v. 12-09-11)</font></p>
<br><div font class="lNameHeader"> </font> </div><table border=1>
<tr class="AccentDark">
<td align="left" width="65" class="tableHeader"></td>
<td align="center" width="auto" class="tableHeader">Maandag</td>
<td align="center" width="auto" class="tableHeader">Dinsdag</td>
<td align="center" width="auto" class="tableHeader">Woensdag</td>
<td align="center" width="auto" class="tableHeader">Donderdag</td>
<td align="center" width="auto" class="tableHeader">Vrijdag</td>
</tr><tr>
<td align="left" width="50" class="tableHeader">1e uur</td>
<td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
<tr>
<td align="left" width="41" class="tableCell"></td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="75" class="tableCell"></td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="73" class="tableCell"></td>
<td align="left" width="9" class="tableCell"></td>
</tr>
</table>
</td>
<td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
<tr>
<td align="left" width="41" class="tableCell">WAS</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="75" class="tableCell">HE09</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="73" class="tableCell">econ</td>
<td align="left" width="9" class="tableCell">5</td>
</tr>
</table>
</td>
<td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
<tr>
<td align="left" width="41" class="tableCell">WIK</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="75" class="tableCell">HC17</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="73" class="tableCell">biol</td>
<td align="left" width="9" class="tableCell">4</td>
</tr>
</table>
</td>
<td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
<tr>
<td align="left" width="41" class="tableCell">OTT</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="75" class="tableCell">HC01</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="73" class="tableCell">dutl</td>
<td align="left" width="9" class="tableCell">6</td>
</tr>
</table>
</td>
<td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
<tr>
<td align="left" width="41" class="tableCell"></td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="75" class="tableCell"></td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="73" class="tableCell"></td>
<td align="left" width="9" class="tableCell"></td>
</tr>
</table>
</td>
</tr>
<tr>
<td align="left" width="50" class="tableHeader">2e uur</td>
<td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
<tr>
<td align="left" width="41" class="tableCell">KEJ</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="75" class="tableCell">HC02</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="73" class="tableCell">wisA</td>
<td align="left" width="9" class="tableCell">3</td>
</tr>
</table>
</td>
<td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
<tr>
<td align="left" width="41" class="tableCell">BRT</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="75" class="tableCell">HE05</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="73" class="tableCell">netl</td>
<td align="left" width="9" class="tableCell"></td>
</tr>
</table>
</td>
<td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
<tr>
<td align="left" width="41" class="tableCell">OTT</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="75" class="tableCell">HC01</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="73" class="tableCell">dutl</td>
<td align="left" width="9" class="tableCell">6</td>
</tr>
</table>
</td>
<td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
<tr>
<td align="left" width="41" class="tableCell">BAU</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="75" class="tableCell">HG01</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="73" class="tableCell">lo</td>
<td align="left" width="9" class="tableCell"></td>
</tr>
</table>
</td>
<td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
<tr>
<td align="left" width="41" class="tableCell">MET</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="75" class="tableCell">HD02</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="73" class="tableCell">entl</td>
<td align="left" width="9" class="tableCell"></td>
</tr>
</table>
</td>
</tr>
<tr>
<td align="left" width="50" class="tableHeader">3e uur</td>
<td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
<tr>
<td align="left" width="41" class="tableCell">WAS</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="75" class="tableCell">HE07</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="73" class="tableCell">econ</td>
<td align="left" width="9" class="tableCell">5</td>
</tr>
</table>
</td>
<td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
<tr>
<td align="left" width="41" class="tableCell">MET</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="75" class="tableCell">HD02</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="73" class="tableCell">entl</td>
<td align="left" width="9" class="tableCell"></td>
</tr>
</table>
</td>
<td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
<tr>
<td align="left" width="41" class="tableCell">WAS</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="75" class="tableCell">HE05</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="73" class="tableCell">econ</td>
<td align="left" width="9" class="tableCell">5</td>
</tr>
</table>
</td>
<td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
<tr>
<td align="left" width="41" class="tableCell">BAU</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="75" class="tableCell">HG01</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="73" class="tableCell">lo</td>
<td align="left" width="9" class="tableCell"></td>
</tr>
</table>
</td>
<td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
<tr>
<td align="left" width="41" class="tableCell">KEJ</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="75" class="tableCell">HC02</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="73" class="tableCell">wisA</td>
<td align="left" width="9" class="tableCell">3</td>
</tr>
</table>
</td>
</tr>
<tr>
<td align="left" width="50" class="tableHeader">4e uur</td>
<td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
<tr>
<td align="left" width="41" class="tableCell"></td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="75" class="tableCell"></td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="73" class="tableCell"></td>
<td align="left" width="9" class="tableCell"></td>
</tr>
</table>
</td>
<td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
<tr>
<td align="left" width="41" class="tableCell">DER</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="75" class="tableCell">HE08</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="73" class="tableCell">ges</td>
<td align="left" width="9" class="tableCell">1</td>
</tr>
</table>
</td>
<td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
<tr>
<td align="left" width="41" class="tableCell">KEJ</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="75" class="tableCell">HC06</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="73" class="tableCell">wisA</td>
<td align="left" width="9" class="tableCell">3</td>
</tr>
</table>
</td>
<td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
<tr>
<td align="left" width="41" class="tableCell">DER</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="75" class="tableCell">HE10</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="73" class="tableCell">ges</td>
<td align="left" width="9" class="tableCell">1</td>
</tr>
</table>
</td>
<td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
<tr>
<td align="left" width="41" class="tableCell">CHR</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="75" class="tableCell">HB15</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="73" class="tableCell">ckv</td>
<td align="left" width="9" class="tableCell"></td>
</tr>
</table>
</td>
</tr>
<tr>
<td align="left" width="50" class="tableHeader">5e uur</td>
<td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
<tr>
<td align="left" width="41" class="tableCell">DOC</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="75" class="tableCell">HE09</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="73" class="tableCell">m&o</td>
<td align="left" width="9" class="tableCell">2</td>
</tr>
</table>
</td>
<td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
<tr>
<td align="left" width="41" class="tableCell"></td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="75" class="tableCell"></td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="73" class="tableCell"></td>
<td align="left" width="9" class="tableCell"></td>
</tr>
</table>
</td>
<td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
<tr>
<td align="left" width="41" class="tableCell">MET</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="75" class="tableCell">HD02</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="73" class="tableCell">entl</td>
<td align="left" width="9" class="tableCell"></td>
</tr>
</table>
</td>
<td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
<tr>
<td align="left" width="41" class="tableCell">BRT</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="75" class="tableCell">HE05</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="73" class="tableCell">netl</td>
<td align="left" width="9" class="tableCell"></td>
</tr>
</table>
</td>
<td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
<tr>
<td align="left" width="41" class="tableCell">OTT</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="75" class="tableCell">HC03</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="73" class="tableCell">dutl</td>
<td align="left" width="9" class="tableCell">6</td>
</tr>
</table>
</td>
</tr>
<tr>
<td align="left" width="50" class="tableHeader">6e uur</td>
<td align="left" width="auto" class="tableCell"><table border="0" cellpadding="0" cellspacing="0" >
<tr>
<td align="left" width="41" class="tableCell">OTT</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="75" class="tableCell">HC03</td>
<td align="left" width="3" class="tableCell"> </td>
<td align="left" width="73" class="tableCell">dutl</td>
<td align="left" width="9" class="tableCell">6</td>
</tr>
</table>
</td>
I think the problem is that your first table is a container for other tables.
If you want the contents of all the tables, then you should also iterate through the list of tables.
If you just want the contents of one inner table, then first locate it in the DOM. I suggest finding the first table, then getting all the table elements inside it and iterating through them.
var_dump is a good starting point for debugging. You don't need anything beyond what you already have; just debug and test more :)
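The iteration described above could look roughly like this. It is only a sketch: the markup in $html is a shortened, hypothetical stand-in for the timetable in the question, and the XPath expression assumes that the inner tables are simply "tables nested inside another table".

```php
<?php
// Hypothetical sample: an outer table whose cells each hold an inner table.
$html = <<<EOT
<table>
<tr>
<td><table><tr><td>DER</td><td>HE10</td><td>ges</td></tr></table></td>
<td><table><tr><td>CHR</td><td>HB15</td><td>ckv</td></tr></table></td>
</tr>
</table>
EOT;

libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$rows = [];

// Select only tables that sit inside another table, i.e. the inner tables.
foreach ($xpath->query('//table//table') as $innerTable) {
    $cells = [];
    // The leading dot keeps the query relative to the current inner table.
    foreach ($xpath->query('.//td', $innerTable) as $td) {
        $cells[] = trim($td->textContent);
    }
    $rows[] = $cells;
}

var_dump($rows);
```

Each entry of $rows is then the list of cell texts from one inner table, which you can inspect with var_dump as suggested.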
I'm guessing that the invalid HTML/XML is what is tripping you up.
You're using the loadHTMLFile() function, which tolerates malformed HTML up to a point, but there may still be markup it cannot recover from cleanly.
If the input were parsed as strict XML, the "<br>" would not be read as a stand-alone node but as the start of a node, meaning that everything after it would become a child of "<br>".
Furthermore this line here doesn't make any sense:
<p font class="Header">Basisrooster schooljaar 2011 2012 (m.i.v. 12-09-11)</font></p>
The <font> tag has been obsolete for years and should not be used, but more importantly this is not a font tag at all: it is a p tag with a stray "font" attribute, and it still gets closed as if it were a font tag. Just do:
<p class="Header">Basisrooster schooljaar 2011 2012 (m.i.v. 12-09-11)</p>
So the cause may be that your HTML/XML is invalid, and the fix is to clean it up.
(Dan Bizdadea also has a good point.)
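Rather than guessing how the parser copes with this markup, you can ask it. The sketch below feeds the malformed line from the question (plus a bare "<br>" and a made-up second paragraph, which is my own addition) to DOMDocument, prints whatever libxml complained about, and then checks how the tree was built. I would expect "<br>" to come out as an empty element on the HTML parser, but treat that as something to verify on your own PHP/libxml build.

```php
<?php
// The malformed line from the question, followed by a bare <br> and a
// hypothetical second paragraph used only to see where it ends up.
$html = '<p font class="Header">Basisrooster schooljaar 2011 2012 (m.i.v. 12-09-11)</font></p>'
      . '<br><p>next</p>';

libxml_use_internal_errors(true);   // collect parse errors instead of emitting warnings
$doc = new DOMDocument();
$doc->loadHTML($html);

// Print whatever the parser complained about (messages vary by libxml version).
foreach (libxml_get_errors() as $error) {
    echo 'line ', $error->line, ': ', trim($error->message), PHP_EOL;
}
libxml_clear_errors();

// Inspect the tree that was actually built.
$br = $doc->getElementsByTagName('br')->item(0);
$brHasChildren = $br !== null && $br->hasChildNodes();
$paragraphs = $doc->getElementsByTagName('p');

var_dump($brHasChildren, $paragraphs->length);
```

If $brHasChildren is false and both paragraphs are found, the "<br>" is not swallowing the rest of the document, and the problem lies elsewhere (for example in the nested tables).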
I have the following page and I want to use a regular expression to get the text between these tags:
<td colspan="2" align="left" valign="top" bgcolor="#FBFAF4"> ..... </td>
I am trying the following, but it returns an empty $matches array.
preg_match_all("/<td(.*) bgcolor=\"#FBFAF4\"\>(.*)\<\/td>/",$old_filecontents,$matches);
What is the correct pattern for this?
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> <html> <head> <title>Exotiq - Ðñïúüíôá</title> <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-7"> <link href="Styles.css" rel="stylesheet" type="text/css"> <link href="stylesheets/Styles.css" rel="stylesheet" type="text/css"> <script src="scripts/PopBox.js" type="text/javascript"></script> <script type="text/javascript"> popBoxWaitImage.src = "images/spinner40.gif"; popBoxRevertImage = "images/magminus.gif"; popBoxPopImage = "images/magplus.gif"; </script> <script type="text/javascript"> AC_FL_RunContent('codebase', 'http://download.macromedia.com/pub/shockwave/ cabs/flash/swflash.cab#version=9,0,28,0', 'width','675','height','445','title','Morpork', 'src','assets/flash/morepork','loop', 'false','quality','high','pluginspage', 'http://www.adobe.com/shockwave/download/download.cgi?P1_Prod_Version=ShockwaveFlash', 'wmode','transparent','movie','assets/flash/morepork'); </script> </head> <body background="images/fonto2.jpg" topmargin="0"> <table width="948" border="0" align="center" cellpadding="0" cellspacing="0"> <tr> <td><table width="948" border="0" align="center" cellpadding="0" cellspacing="0"> <tr> <td width="24"> </td> <td height="150" colspan="3"><object classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,29,0" width="900" height="150"> <param name="movie" value="flash/top02.swf"> <param name="quality" value="high"> <param name="wmode" value="transparent"> <embed src="flash/top02.swf" quality="high" pluginspage="http://www.macromedia.com/go/getflashplayer" type="application/x-shockwave-flash" width="900" height="150"></embed></object></td> <td width="24" height="150"> </td> </tr> <tr> <td height="31" colspan="5" valign="middle"> <div align="center"> <script src="menu/xaramenu.js"></script> <script Webstyle4 src="menu/menu_.js"></script> </div></td> </tr> <tr> <td width="24"> </td> <td 
width="200" valign="top" background="images/GreenFasa.jpg"> <br> <table width="180" border="0" align="center" cellpadding="0" cellspacing="1"> <tr> <td height="25" class="styles"> Makuti<br> <hr> </td> </tr> <tr> <td height="25" class="styles"> Fun Palm<br> <hr> </td> </tr> <tr> <td height="25" class="styles"> Alang-Alang<br> <hr> </td> </tr> <tr> <td height="25" class="styles"> Thatch<br> <hr> </td> </tr> <tr> <td height="25" class="styles"> <strong>Abaca</strong><br> <hr> </td> </tr> <tr> <td height="25" class="styles"> </td> </tr> </table></td> <td colspan="2" align="left" valign="top" bgcolor="#FBFAF4"> <div align="left"> <table width="680" border="0" align="center" cellpadding="0" cellspacing="0"> <tr> <td width="600" height="40" class="titles">ÊáôáóêåõÝò - ÏìðñÝëåò - Abaca</td> <td width="50" align="right" valign="middle" class="titles"> <div align="right"><img src="images/uk-flag.jpg" width="30" height="17" border="0"></div></td> </tr> <tr> <td colspan="2" class="body"><p>Ç ïìðñÝëá <strong>Abaca</strong> Ýñ÷åôáé ùò Üîéïò áíôéêáôáóôÜôçò ôçò ïìðñÝëáò Rattan ðïõ åðß 15 ÷ñüíéá óôïëßæåé ôéò åëëçíéêÝò ðáñáëßåò. Ôï <strong>Abaca</strong> åßíáé Ýíá öõóéêü õëéêü ðéï <strong>áíèåêôéêü</strong> êáé ðéï üìïñöï áðü ôï Rattan. 
<br> Ðáñáäßäåôáé ìå <strong>îýëéíï êïñìü åìðïôéóìïý</strong> Ö8åê.<br> <br> </p> <table width="680" border="0" cellspacing="0" cellpadding="0"> <tr> <td width="340" height="150" valign="middle"> <div align="left"><img src="images/Manufactures/Umbrelas/Abaca/AbacaUmbrela.jpg" width="328" height="500"></div></td> <td width="340" height="150" valign="bottom" class="body"> <table width="340" border="0" cellspacing="0" cellpadding="0"> <tr> <td width="170" height="130"> <div align="center"><img src="images/Manufactures/Umbrelas/Abaca/1_Abaca02_s.jpg" width="152" height="101" class="PopBoxImageSmall" onclick="Pop (this,50,'PopBoxImageLarge');" title="ÌåãÝèõíóç" pbsrc="images/Manufactures/Umbrelas/Abaca/1_Abaca02.jpg" pbCaption="Abaca - ÏìðñÝëá ðáñáëßáò" popBoxCaptionBelow="true" /></div></td> <td width="170" height="130"> <div align="center"><img src="images/Manufactures/Umbrelas/Abaca/2_Abaca03_s.jpg" width="150" height="112" class="PopBoxImageSmall" onclick="Pop (this,50,'PopBoxImageLarge');" title="ÌåãÝèõíóç" pbsrc="images/Manufactures/Umbrelas/Abaca/2_Abaca03.jpg" pbCaption="Abaca - ÏìðñÝëá ðáñáëßáò" popBoxCaptionBelow="true" /></div></td> </tr> <tr> <td width="170" height="130"> <div align="center"><img src="images/Manufactures/Umbrelas/Abaca/3_Abaca01_s.jpg" width="150" height="112" class="PopBoxImageSmall" onclick="Pop (this,50,'PopBoxImageLarge');" title="ÌåãÝèõíóç" pbsrc="images/Manufactures/Umbrelas/Abaca/3_Abaca01.jpg" pbCaption="Abaca - ÏìðñÝëá ðáñáëßáò" popBoxCaptionBelow="true" /></div></td> <td width="170" height="130"> <div align="center"></div></td> </tr> <tr> <td width="170" height="130"> <div align="center"></div></td> <td width="170" height="130"> <div align="center"></div></td> </tr> <tr> <td width="170" height="130"> <div align="center"></div></td> <td width="170" height="130"> <div align="center"></div></td> </tr> </table></td> </tr> <tr> <td width="340" height="50" valign="top"> <p align="center"> </p></td> <td width="340" height="50" 
valign="top"> <div align="center" class="perigrafes">ÊëéêÜñåôáé ðÜíù óôéò öùôïãñáößåò ãéá ìåãÝèõíóç</div></td> </tr> <tr> <td width="340" valign="bottom"> <div align="center"> </div></td> <td width="340" valign="bottom"> <p align="center"> </p></td> </tr> <tr> <td width="340" valign="top"> <div align="center"></div></td> <td width="340" valign="top"> <p align="center"> </p></td> </tr> <tr> <td height="20" colspan="2" valign="top"> </td> </tr> </table></td> </tr> </table> <font color="#FFFFFF"></font></div></td> <td width="24" height="420"> </td> </tr> <tr> <td width="24"> </td> <td width="200"> </td> <td width="600"> </td> <td width="100"> </td> <td width="24"> </td> </tr> </table></td> </tr> <tr> <td height="22"><table width="900" border="0" align="center" cellpadding="0" cellspacing="0" bgcolor="#007F3E"> <tr> <td height="25"> <div align="center" class="styles">All rights reserved ® Designed by CONTINENTAL ADVERTISING </div></td> </tr> </table></td> </tr> </table> <script type="text/javascript"> var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www."); document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E")); </script> <script type="text/javascript"> try { var pageTracker = _gat._getTracker("UA-12742174-1"); pageTracker._trackPageview(); } catch(err) {}</script> </body> </html>
Given that the cell you're talking about contains HTML (another table, in fact), you can't simply match up to the closing tag: you'll get the content between the cell's opening tag and the first </td> you find. Plus, '.' doesn't match newlines unless you add the s modifier, so unless your cell opens and closes on the same line, you'll get no matches.
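Both effects are easy to reproduce. The snippet below is a sketch on a shortened, hypothetical version of the cell, and the pattern is my own simplification of the one in the question, not a recommended fix.

```php
<?php
// Hypothetical, shortened stand-in for the cell in the question:
// it spans several lines and contains a nested table.
$html = "<td colspan=\"2\" bgcolor=\"#FBFAF4\">\n"
      . "<table><tr><td>inner</td></tr></table>\n"
      . "tail</td>";

// Without the s modifier the dot stops at "\n", so a cell spanning
// several lines cannot match at all.
$withoutS = preg_match('~<td[^>]*bgcolor="#FBFAF4"[^>]*>(.*?)</td>~i', $html, $m1);

// With the s modifier it matches, but the lazy group ends at the FIRST
// </td>, which belongs to the nested table, not to the outer cell.
$withS = preg_match('~<td[^>]*bgcolor="#FBFAF4"[^>]*>(.*?)</td>~is', $html, $m2);

var_dump($withoutS, $withS, $m2[1]);
```

Making the group greedy does not help either: it would then run to the last </td> in the whole document.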
I'd say don't use regular expressions for this. Try an XML/HTML parser.
If the cell held only plain text, a regex would be fine, but because the content you want is HTML that itself contains your terminator, you'll need a parser with some kind of DOM depth awareness ... or a way to count terminators in the regex.
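As a rough sketch of the parser route, here is how it could look with PHP's DOMDocument and DOMXPath. The markup in $html is a cut-down, hypothetical version of the page, and selecting the cell by its bgcolor attribute is an assumption based on the pattern in the question.

```php
<?php
// Cut-down, hypothetical version of the page: the target cell holds a nested table.
$html = '<table><tr>'
      . '<td>menu</td>'
      . '<td colspan="2" align="left" valign="top" bgcolor="#FBFAF4">'
      . '<table><tr><td class="titles">Abaca</td></tr></table>'
      . '</td>'
      . '</tr></table>';

libxml_use_internal_errors(true);   // real-world pages usually trigger parser warnings
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// The parser knows where the cell really ends, however deeply it nests.
$cell = $xpath->query('//td[@bgcolor="#FBFAF4"]')->item(0);

$innerHtml = '';
if ($cell !== null) {
    // Serialise each child node to rebuild the cell's inner HTML.
    foreach ($cell->childNodes as $child) {
        $innerHtml .= $doc->saveHTML($child);
    }
}

echo $innerHtml, PHP_EOL;
```

For the real page you would pass $old_filecontents to loadHTML() instead of the sample string. If you only need the text and not the markup, $cell->textContent is enough.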