How to get a td value using SimpleHTMLDOM? - php

I have the following table, which I'm trying to scrape with SimpleHtmlDom.
How would I access the date value of "Application Received" and put it into a variable? I've tried the plaintext approach, but this outputs the text of every td.
<table id="simpleDetailsTable" summary="Case Details">
<tbody><tr>
<th scope="row" width="40%">
Reference
</th>
<td>
LA01/2018/0235/F
</td>
</tr>
<tr>
<th scope="row" width="40%">
Application Received
</th>
<td>
Fri 23 Feb 2018
</td>
</tr>
<tr>
<th scope="row" width="40%">
Address
</th>
<td>
206 Straid Road Bushmills.
</td>
</tr>
<tr>
<th scope="row" width="40%">
Proposal
</th>
<td>
Alterations to existing car showroom.
</td>
</tr>
<tr>
<th scope="row" width="40%">
Status
</th>
<td>
<span class="caseDetailsStatus">Application Received</span>
</td>
</tr>
<tr>
<th scope="row" width="40%">
Authority Decision
</th>
<td>
Not Available
</td>
</tr>
<tr>
<th scope="row" width="40%">
Authority Decision Date
</th>
<td>
Not Available
</td>
</tr>
<tr>
<th scope="row" width="40%">
PAC Decision
</th>
<td>
Not Available
</td>
</tr>
<tr>
<th scope="row" width="40%">
PAC Decision Date
</th>
<td>
Not Available
</td>
</tr>
</tbody></table>
Using the following code I'm able to echo out each listing.
curl_setopt($ch, CURLOPT_URL,$item);
$output = curl_exec($ch);
$html = str_get_html($output);
$table = $html->find('table',0);
$rowdata = array();
foreach($table->find('tr') as $row){
foreach($row->find('td') as $cell){
$listing = $cell->plaintext;
echo $listing,'<br>';
//echo '<pre>'; print_r($listing);echo'<pre/>';
}
}
which produces the following result:
LA01/2018/0254/O
Tue 27 Feb 2018
39 Lyttlesdale Garvagh.
Proposed new dwelling and proposed new paired access to include demolition of existing garage.
Application Received
Not Available
Not Available
Not Available
Not Available
LA10/2018/0265/F
Tue 27 Feb 2018
45 Glen Road Killyculla Tempo BT94 3JU
Replacement dwelling-amended position,house type (split level) and domestic garage from previous approval LA10/2017/0529/RM for same
Application Received
Not Available
Not Available
Not Available
Not Available
Can anyone put me right on what it is I should be doing?
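One possible approach is to select the row by its header text and read the td in that same row, instead of looping over every cell. The sketch below swaps in PHP's built-in DOMDocument and DOMXPath rather than SimpleHTMLDOM, so it runs without the library. The helper name `cellForHeader` and the trimmed sample markup are assumptions made for illustration, and it assumes the label contains no double quotes.

```php
<?php
// Hypothetical helper: returns the trimmed text of the <td> in the row whose
// <th> text equals $label, or null when no such row is found.
function cellForHeader(string $html, string $label): ?string
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);   // tolerate sloppy real-world markup
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    // Match on the <th> only, so the "Status" cell that also contains the
    // words "Application Received" is not picked up by mistake.
    $query = sprintf(
        '//table[@id="simpleDetailsTable"]//tr[th[normalize-space(.)="%s"]]/td',
        $label
    );
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return trim($nodes->item(0)->textContent);
}

// Trimmed copy of the table from the question (assumed well-formed here).
$sample = <<<HTML
<table id="simpleDetailsTable" summary="Case Details">
<tbody>
<tr><th scope="row">
Reference
</th><td>
LA01/2018/0235/F
</td></tr>
<tr><th scope="row">
Application Received
</th><td>
Fri 23 Feb 2018
</td></tr>
<tr><th scope="row">
Status
</th><td>
<span class="caseDetailsStatus">Application Received</span>
</td></tr>
</tbody></table>
HTML;

$received = cellForHeader($sample, 'Application Received');
echo $received, PHP_EOL;
```

In the scraping loop from the question, the same call could be made once per fetched page, passing `$output` in place of `$sample`.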

Related

Why does the data from datagrid_data1.json not appear?

In my project, I use easyui-datagrid.
Here is my HTML code:
<tr>
<td style="width:10%" colspan="2">attachment</td>
<td style="width:90%" colspan="18">
<table class="easyui-datagrid" data-options="url:'datagrid_data1.json',method:'get'" title="" style="width:100%">
<thead>
<tr>
<th data-options="field:'ck',checkbox:true"></th>
<th data-options="field:'name',align:'center'">filename</th>
</tr>
</thead>
</table>
</td>
</tr>
The code of datagrid_data1.json:
{"total":2,"rows":[
{"name":"1.jpg"},
{"name":"123.jpg"}
]}
Unfortunately, it doesn't work: the two names, 1.jpg and 123.jpg, do not appear. Can anyone help?

Date value comes back null for one table when grabbing HTML

Here's my problem: I'm grabbing dates from two tables, stripping special characters, and converting them to a date format. It works for one table, but for the other the date comes back as 0000-00-00 00:00:00. The tables, code, and output are below.
Table 1
<TABLE class="tab1" border="1" cellpadding="0" cellspacing="0"
summary="">
<TR>
<TH align=left colspan=2 bgcolor=#0066CC><H1> Start RIP Job</H1>
</TH>
</TR>
<TR>
<TH align=left> Send Date:
</TH>
<TD class="td1" align=left> 1/9/2017 1:15 PM
</TD>
</TR>
<TR>
<TH align=left> RIP Start Date and Time:
</TH>
<TD class="td1" align=left> 13:21:22 09/01/2017
</TD>
</TR>
<TR>
<TH align=left> RIP End Date and Time:
</TH>
<TD class="td1" align=left> 13:21:33 09/01/2017
</TD>
</TR>
<TR>
<TH align=left> RIP Duration:
</TH>
<TD class="td1" align=left> 11 seconds
</TD>
</TR>
<TR>
<TH align=left colspan=2 bgcolor=#0066CC><H1> End RIP Job</H1>
</TH>
</TR>
</TABLE>
Table 2
<TABLE class="tab1" border="1" cellpadding="0" cellspacing="0"
summary="">
<TR>
<TH align=left colspan=2 bgcolor=#0066CC><H1> Start RIP Job</H1>
</TH>
</TR>
<TR>
<TH align=left> Printer:
</TH>
<TD class="td1" align=left> RunJiang Flora 3204P
</TD>
</TR>
<TR>
<TH align=left> Send Date:
</TH>
<TD class="td1" align=left> 9/29/2017 10:09 PM
</TD>
</TR>
<TR>
<TH align=left> RIP Start Date and Time:
</TH>
<TD class="td1" align=left> 22:09:49 29/09/2017
</TD>
</TR>
<TR>
<TH align=left> RIP End Date and Time:
</TH>
<TD class="td1" align=left> 22:10:13 29/09/2017
</TD>
</TR>
<TR>
<TH align=left> RIP Duration:
</TH>
<TD class="td1" align=left> 24 seconds
</TD>
</TR>
<TR>
<TH align=left colspan=2 bgcolor=#0066CC><H1> End RIP Job</H1>
</TH>
</TR>
</TABLE>
CODE :
$source=file_get_contents("C://xampp/htdocs/Champion/machine-logs/LogPrinting04/nulldate.HTML");
$dom = new DOMDocument();
$dom->loadHTML($source);
// print_r($dom);
$xp = new DOMXPath($dom);
$textList = $xp->query("//table[//th[contains(text(),'')]]");
foreach ( $textList as $text ) {
$enddate = $xp->evaluate(
"string(descendant::tr[th[contains(text(),'RIP End Date and Time') or contains(text(),'Output End Date And Time')]]/td/text())",
$text);
$date = preg_replace("/[^0-9a-zA-Z \/:\-]/", "", $enddate);
$xtime = strtotime($date);
$tes = date("Y-m-d H:i:s",$xtime);
echo "enddate=".$tes.PHP_EOL;
}
OUTPUT :
table 1 : 2017-09-01 13:21:33
table 2 : 0000-00-00 00:00:00
'29' is not a valid month value.
Look at the one that is returning a value. Notice that it's returning September 1, not January 9.
13:21:33 09/01/2017 -> 2017-09-01 13:21:33
         mm dd yyyy    yyyy mm dd
Then look at the one that is returning zeros.
22:10:13 29/09/2017
         mm dd yyyy
29 is not a valid value for month. (If we're expecting this input to be a representation of September 29, then we're also probably expecting the first one to be a representation of January 9.)
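If the logs really are written as day/month/year, one way around the ambiguity is to state the format explicitly with `DateTime::createFromFormat()` instead of letting `strtotime()` guess. This is a minimal sketch under that assumption; the helper name `parseRipDate` is made up for illustration, and the cleanup regex is the one from the question.

```php
<?php
// Hypothetical helper: parse "HH:MM:SS dd/mm/yyyy" with an explicit format so
// the slashes are never read as mm/dd/yyyy. Returns null if parsing fails.
function parseRipDate(string $raw): ?string
{
    // Same character whitelist as the question's preg_replace, then trim.
    $clean = trim(preg_replace('/[^0-9a-zA-Z \/:\-]/', '', $raw));
    $dt = DateTime::createFromFormat('H:i:s d/m/Y', $clean);
    if ($dt === false) {
        return null;
    }
    return $dt->format('Y-m-d H:i:s');
}

echo parseRipDate(' 13:21:33 09/01/2017'), PHP_EOL; // 2017-01-09 13:21:33
echo parseRipDate(' 22:10:13 29/09/2017'), PHP_EOL; // 2017-09-29 22:10:13
```

With this format string both tables parse, and the first one now yields January 9 rather than September 1.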

PHP Xpath get both a href and text node

I have a table that contains a number of headings like this:
<TR>
<TH CLASS="ddtitle" scope="colgroup" ><A HREF="http://foo.com/">Linked text</A></TH>
</TR>
The table is thousands of lines long, so I can't share it in full, but here is the initial tag and one full item within the table. Sadly, there is no wrapper element nested around each item, and the comments are mine, so it's a pain to work out where one item begins and ends.
<TABLE CLASS="datadisplaytable" SUMMARY="Layout table" width="100%"><CAPTION class="captiontext">Items Found</CAPTION>
<!-- START of first item in the table -->
<TR>
<TH CLASS="ddtitle" scope="colgroup" ><A HREF="http://foo.com/">Linked text</A></TH>
</TR>
<TR>
<TD CLASS="dddefault">
<SPAN class="fieldlabeltext">Term: </SPAN>Fall
<BR>
<SPAN class="fieldlabeltext">Registration: </SPAN>Jan 1, 2018 to Aug 1, 2018
<BR>
<SPAN class="fieldlabeltext">Levels: </SPAN>Undergraduate
<BR>
<BR>
Location
<BR>
Lecture Schedule Type
<BR>
3.000 Credits
<BR>
View Entry
<BR>
<BR>
<TABLE CLASS="datadisplaytable" SUMMARY="Meeting time table"><CAPTION class="captiontext">Scheduled Meeting Times</CAPTION>
<TR>
<TH CLASS="ddheader" scope="col" >Type</TH>
<TH CLASS="ddheader" scope="col" >Time</TH>
<TH CLASS="ddheader" scope="col" >Days</TH>
<TH CLASS="ddheader" scope="col" >Where</TH>
<TH CLASS="ddheader" scope="col" >Date Range</TH>
<TH CLASS="ddheader" scope="col" >Schedule Type</TH>
<TH CLASS="ddheader" scope="col" >Instructors</TH>
</TR>
<TR>
<TD CLASS="dddefault">Lecture</TD>
<TD CLASS="dddefault">9:20 am - 10:10 am</TD>
<TD CLASS="dddefault">MWF</TD>
<TD CLASS="dddefault">Some Building Room 101</TD>
<TD CLASS="dddefault">Aug 1, 2018 - Dec 1, 2018</TD>
<TD CLASS="dddefault">Lecture</TD>
<TD CLASS="dddefault">Instructor Name (<ABBR title= "Primary">P</ABBR>)<A HREF="mailto:email@foo.com" target="Instructur Name" ><IMG SRC="/wtlgifs/email.png" ALIGN="middle" ALT="E-mail" CLASS="headerImg" TITLE="E-mail" NAME="web_email" HSPACE=0 VSPACE=0 BORDER=0 HEIGHT=16 WIDTH=16></A></TD>
</TR>
</TABLE>
<BR>
<BR>
</TD>
</TR>
<!-- END first item in the table -->
I want to extract the item details, starting with the course name (which is the text content, "linked text," inside th.ddtitle) and the course link (which is the a href inside th.ddtitle). Here's what I've tried for grabbing those two items:
$dom = new DOMDocument();
$myHtml = file_get_contents(__DIR__.'/myfile.html');
$dom->loadHTML($myHtml);
$xpath = new DOMXPath($dom);
// first part changes an outer table with the same class, so I can get inner tables without the outer one
$tables = $xpath->query("//table[@class='datadisplaytable']");
for($i=0; $i<1; $i++) {
$tables[$i]->setAttribute('class', 'masterTable');
}
$html = $dom->saveHTML();
// now, the query I'm having trouble with:
$textAndLink = $xpath->query("//th[@class='ddtitle']/*");
$i=1;
foreach($textAndLink as $info) {
foreach($info->childNodes as $child) {
if($i%2 == 0) {
echo $child->getAttribute('href') . '<br>';
} else {
echo $child->nodeValue . '<br>';
}
}
$i++;
}
I've also tried print_r($child) and the only items displayed are the text nodes, no <a> tags. How can I get both the anchor's "href" attribute and the text content? What I am expecting from the code above is a list like this:
http://foo.com/<br>
Linked text<br>
http://foo.com/secondlink<br>
Second linked text<br>
and so on and so forth.
Try this code snippet:
<?php
ini_set('display_errors', 1);
$string = '
<TABLE CLASS="datadisplaytable" SUMMARY="Layout table" width="100%"><CAPTION class="captiontext">Items Found</CAPTION>
<!-- START of first item in the table -->
<TR>
<TH CLASS="ddtitle" scope="colgroup" ><A HREF="http://foo.com/">Linked text</A></TH>
</TR>
<TR>
<TD CLASS="dddefault">
<SPAN class="fieldlabeltext">Term: </SPAN>Fall
<BR>
<SPAN class="fieldlabeltext">Registration: </SPAN>Jan 1, 2018 to Aug 1, 2018
<BR>
<SPAN class="fieldlabeltext">Levels: </SPAN>Undergraduate
<BR>
<BR>
Location
<BR>
Lecture Schedule Type
<BR>
3.000 Credits
<BR>
View Entry
<BR>
<BR>
<TABLE CLASS="datadisplaytable" SUMMARY="Meeting time table"><CAPTION class="captiontext">Scheduled Meeting Times</CAPTION>
<TR>
<TH CLASS="ddheader" scope="col" >Type</TH>
<TH CLASS="ddheader" scope="col" >Time</TH>
<TH CLASS="ddheader" scope="col" >Days</TH>
<TH CLASS="ddheader" scope="col" >Where</TH>
<TH CLASS="ddheader" scope="col" >Date Range</TH>
<TH CLASS="ddheader" scope="col" >Schedule Type</TH>
<TH CLASS="ddheader" scope="col" >Instructors</TH>
</TR>
<TR>
<TD CLASS="dddefault">Lecture</TD>
<TD CLASS="dddefault">9:20 am - 10:10 am</TD>
<TD CLASS="dddefault">MWF</TD>
<TD CLASS="dddefault">Some Building Room 101</TD>
<TD CLASS="dddefault">Aug 1, 2018 - Dec 1, 2018</TD>
<TD CLASS="dddefault">Lecture</TD>
<TD CLASS="dddefault">Instructor Name (<ABBR title= "Primary">P</ABBR>)<A HREF="mailto:email@foo.com" target="Instructur Name" ><IMG SRC="/wtlgifs/email.png" ALIGN="middle" ALT="E-mail" CLASS="headerImg" TITLE="E-mail" NAME="web_email" HSPACE=0 VSPACE=0 BORDER=0 HEIGHT=16 WIDTH=16></A></TD>
</TR>
</TABLE>
<BR>
<BR>
</TD>
</TR>';
$domDocument = new DOMDocument();
$domDocument->loadHTML($string);
$domXPath = new DOMXPath($domDocument);
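// select the <a> elements directly, so both the attribute and the text are available
$results = $domXPath->query('//tr/th[@class="ddtitle"]/a');
foreach($results as $result)
{
// print the link target, then the link text, each followed by a line break
echo $result->getAttribute("href") . '<br>';
echo $result->textContent . '<br>';
}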
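As a possible explanation of why the original attempt printed only text: a query ending in `/*` already lands on the child elements of the heading (the links themselves), so iterating over each result's `childNodes` reaches only the text nodes inside the link. The sketch below reads the attribute and the text from the matched `<a>` node directly. The helper name `titleLinks` and the two-row sample table are assumptions for illustration; libxml's HTML parser normally lowercases tag and attribute names, which is why `@class` can match markup written as `CLASS`.

```php
<?php
// Hypothetical helper: collect [href, text] pairs for every link that sits
// directly inside a th.ddtitle heading.
function titleLinks(string $html): array
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);   // tolerate sloppy real-world markup
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $pairs = [];
    // Each result is an <a> element, so query it directly instead of
    // descending into its childNodes.
    foreach ($xpath->query('//th[@class="ddtitle"]/a') as $a) {
        $pairs[] = [$a->getAttribute('href'), trim($a->textContent)];
    }
    return $pairs;
}

// Assumed sample: two headings in the shape described in the question.
$sample = <<<HTML
<table class="datadisplaytable">
<tr><th class="ddtitle" scope="colgroup"><a href="http://foo.com/">Linked text</a></th></tr>
<tr><td class="dddefault">Details</td></tr>
<tr><th class="ddtitle" scope="colgroup"><a href="http://foo.com/secondlink">Second linked text</a></th></tr>
</table>
HTML;

foreach (titleLinks($sample) as [$href, $text]) {
    echo $href, '<br>', PHP_EOL, $text, '<br>', PHP_EOL;
}
```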

How to create an html link to another page from "<?=$objResult["id"];?>"

I'm working on a PHP/MySQL application and need to make the output of one column in each row an HTML link to another page. I have not been able to find any relevant information. The "<?=$objResult["id"];?>" value needs to be the link. Thanks for any help.
<table id="display" style="width:800px;">
<tr>
<th width="40">ID</th>
<th width="70">Last</th>
<th width="70">First</th>
<th width="10">Middle</th>
<th width="70">Birth</th>
<th width="70">Death</th>
<th width="170">Notes</th>
<th width="100">Cemetery</th>
</tr>
<?
while($objResult = mysql_fetch_array($objQuery))
{
?>
<tr>
<td style='text-align:center;'><?=$objResult["id"];?></td>
<td style='text-align:center;'><?=$objResult["last"];?></td>
<td style='text-align:center;'><?=$objResult["first"];?></td>
<td style='text-align:center;'><?=$objResult["middle"];?></td>
<td style='text-align:center;'><?=$objResult["birth"];?></td>
<td style='text-align:center;'><?=$objResult["death"];?></td>
<td><?=$objResult["notes"];?></td>
<td style='text-align:center;'><?=$objResult["cemetery"];?></td>
</tr>
<?
}
?>
</table>
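One way this might be done is to wrap the id in an anchor whose URL carries the id as a query parameter. The sketch below is an assumption-laden illustration: the target page `detail.php`, the `id` parameter name, and the helper `idLink` are all hypothetical and do not come from the question.

```php
<?php
// Hypothetical helper: wrap a record id in a link to a detail page.
// "detail.php" and the "id" query parameter are placeholders.
function idLink($id, string $page = 'detail.php'): string
{
    // urlencode() keeps the query string valid; htmlspecialchars() keeps the
    // generated HTML valid even if the id contains special characters.
    $href = $page . '?id=' . urlencode((string) $id);
    return '<a href="' . htmlspecialchars($href, ENT_QUOTES, 'UTF-8') . '">'
         . htmlspecialchars((string) $id, ENT_QUOTES, 'UTF-8') . '</a>';
}

echo idLink(42), PHP_EOL; // <a href="detail.php?id=42">42</a>
```

Inside the loop, the first cell would then become `<td style='text-align:center;'><?=idLink($objResult["id"]);?></td>`, and the target page could read the value from `$_GET['id']`.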

Problems scraping all data within a table

I'm having some issues getting all the data I need from two specific html tables. Tables at the bottom of this post.
The code above states html table id "table1". I also need to grab values from a table called "table2" in the exact same format. I have tried this code and can extract the td values but not the few values that are within the span specifiers within the td. I've tried multiple ways to do this but I'm just not getting it. My code looks something like:
$dom = file_get_html("internets.html");
//not sure how to specify the table exactly!? because this code didn't work.
//$tds = $dom->find('table[id=table1]',0)->find('tr');
foreach($dom->find('tr') as $key => $tr)
{
$td = $tr->find('td');
echo $td[0]->innertext . "<br>";
}
Any assistance much appreciated. I have done some searching here and also used the simple php dom manual.
Here is the format of a table:
<table id="table1">
<tbody>
<tr>
<th width="48%" scope="row">
Prev Close:
</th>
<td class="yfnc_tabledata1">
0.02
</td>
</tr>
<tr>
<th width="48%" scope="row">
Open:
</th>
<td class="yfnc_tabledata1">
0.02
</td>
</tr>
<tr>
<th width="48%" scope="row">
Bid:
</th>
<td class="yfnc_tabledata1">
<span id="yfs_b00_pgo.ax">
0.0180
</span>
</td>
</tr>
<tr>
<th width="48%" scope="row"></th>
<td class="yfnc_tabledata1"></td>
</tr>
<tr>
<th width="48%" scope="row">
1y Target Est:
</th>
<td class="yfnc_tabledata1">
N/A
</td>
</tr>
<tr>
<th width="48%" scope="row">
Beta:
</th>
<td class="yfnc_tabledata1">
N/A
</td>
</tr>
<tr>
<th width="54%" scope="row">
Next Earnings Date:
</th>
<td class="yfnc_tabledata1">
N/A
</td>
</tr>
</tbody>
</table>
<?php
$html=<<<XHTML
<table id="table1">
<tbody>
<tr>
<th width="48%" scope="row">
Prev Close:
</th>
<td class="yfnc_tabledata1">
0.02
</td>
</tr>
<tr>
<th width="48%" scope="row">
Open:
</th>
<td class="yfnc_tabledata1">
0.02
</td>
</tr>
<tr>
<th width="48%" scope="row">
Bid:
</th>
<td class="yfnc_tabledata1">
<span id="yfs_b00_pgo.ax">
0.0180
</span>
</td>
</tr>
<tr>
<th width="48%" scope="row"></th>
<td class="yfnc_tabledata1"></td>
</tr>
<tr>
<th width="48%" scope="row">
1y Target Est:
</th>
<td class="yfnc_tabledata1">
N/A
</td>
</tr>
<tr>
<th width="48%" scope="row">
Beta:
</th>
<td class="yfnc_tabledata1">
N/A
</td>
</tr>
<tr>
<th width="54%" scope="row">
Next Earnings Date:
</th>
<td class="yfnc_tabledata1">
N/A
</td>
</tr>
</tbody>
</table>
XHTML;
$dom = new DOMDocument;
$dom->loadHTML($html);
$xp = new DOMXPath($dom);
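// select every td inside the table with id "table1"; nodeValue also includes text from nested spans
foreach ($xp->query("//table[@id='table1']//td") as $i=>$node) {
echo trim($node->nodeValue), PHP_EOL;
}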
?>
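Since the question also needs the values that sit inside a span, and the same layout for a second table, a small helper that maps each row's label to its value may be convenient. This is a sketch using PHP's built-in DOMXPath; the helper name `tableToMap` and the shortened sample table are assumptions for illustration, and the table id is assumed to contain no double quotes.

```php
<?php
// Hypothetical helper: map each row's <th> label to its <td> text for the
// table with the given id. Rows without a label are skipped.
function tableToMap(string $html, string $tableId): array
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);   // tolerate sloppy real-world markup
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $map = [];
    $rows = $xpath->query(sprintf('//table[@id="%s"]//tr[th and td]', $tableId));
    foreach ($rows as $row) {
        // normalize-space() uses the element's full string value, so text
        // nested in a <span> is included and extra whitespace is collapsed.
        $label = $xpath->evaluate('normalize-space(th)', $row);
        $value = $xpath->evaluate('normalize-space(td)', $row);
        if ($label !== '') {
            $map[rtrim($label, ':')] = $value;
        }
    }
    return $map;
}

// Shortened sample in the same shape as the table from the question.
$sample = <<<HTML
<table id="table1">
<tbody>
<tr><th width="48%" scope="row">
Prev Close:
</th><td class="yfnc_tabledata1">
0.02
</td></tr>
<tr><th width="48%" scope="row">
Bid:
</th><td class="yfnc_tabledata1">
<span id="yfs_b00_pgo.ax">
0.0180
</span>
</td></tr>
<tr><th width="48%" scope="row"></th><td class="yfnc_tabledata1"></td></tr>
</tbody>
</table>
HTML;

foreach (tableToMap($sample, 'table1') as $label => $value) {
    echo $label, ' = ', $value, PHP_EOL;
}
```

Calling the helper again with `'table2'` would cover the second table, provided it follows the same th/td layout.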
