I have a table that contains a number of headings like this:
<TR>
<TH CLASS="ddtitle" scope="colgroup" >Linked text</TH>
</TR>
The table is thousands of lines long, so I can't share it in full, but here is the opening tag and one full item from the table. Sadly, there is no wrapper element around each item, and the comments are mine, so it's a pain to decipher where one item begins and ends.
<TABLE CLASS="datadisplaytable" SUMMARY="Layout table" width="100%"><CAPTION class="captiontext">Items Found</CAPTION>
<!-- START of first item in the table -->
<TR>
<TH CLASS="ddtitle" scope="colgroup" >Linked text</TH>
</TR>
<TR>
<TD CLASS="dddefault">
<SPAN class="fieldlabeltext">Term: </SPAN>Fall
<BR>
<SPAN class="fieldlabeltext">Registration: </SPAN>Jan 1, 2018 to Aug 1, 2018
<BR>
<SPAN class="fieldlabeltext">Levels: </SPAN>Undergraduate
<BR>
<BR>
Location
<BR>
Lecture Schedule Type
<BR>
3.000 Credits
<BR>
View Entry
<BR>
<BR>
<TABLE CLASS="datadisplaytable" SUMMARY="Meeting time table"><CAPTION class="captiontext">Scheduled Meeting Times</CAPTION>
<TR>
<TH CLASS="ddheader" scope="col" >Type</TH>
<TH CLASS="ddheader" scope="col" >Time</TH>
<TH CLASS="ddheader" scope="col" >Days</TH>
<TH CLASS="ddheader" scope="col" >Where</TH>
<TH CLASS="ddheader" scope="col" >Date Range</TH>
<TH CLASS="ddheader" scope="col" >Schedule Type</TH>
<TH CLASS="ddheader" scope="col" >Instructors</TH>
</TR>
<TR>
<TD CLASS="dddefault">Lecture</TD>
<TD CLASS="dddefault">9:20 am - 10:10 am</TD>
<TD CLASS="dddefault">MWF</TD>
<TD CLASS="dddefault">Some Building Room 101</TD>
<TD CLASS="dddefault">Aug 1, 2018 - Dec 1, 2018</TD>
<TD CLASS="dddefault">Lecture</TD>
<TD CLASS="dddefault">Instructor Name (<ABBR title= "Primary">P</ABBR>)<A HREF="mailto:email@foo.com" target="Instructur Name" ><IMG SRC="/wtlgifs/email.png" ALIGN="middle" ALT="E-mail" CLASS="headerImg" TITLE="E-mail" NAME="web_email" HSPACE=0 VSPACE=0 BORDER=0 HEIGHT=16 WIDTH=16></A></TD>
</TR>
</TABLE>
<BR>
<BR>
</TD>
</TR>
<!-- END first item in the table -->
I want to extract the item details, starting with the course name (which is the text content, "linked text," inside th.ddtitle) and the course link (which is the a href inside th.ddtitle). Here's what I've tried for grabbing those two items:
$dom = new DOMDocument();
$myHtml = file_get_contents(__DIR__ . '/myfile.html');
$dom->loadHTML($myHtml);
$xpath = new DOMXPath($dom);
// first part changes an outer table with the same class, so I can get inner tables without the outer one
$tables = $xpath->query("//table[@class='datadisplaytable']");
for($i=0; $i<1; $i++) {
$tables[$i]->setAttribute('class', 'masterTable');
}
$html = $dom->saveHTML();
// now, the query I'm having trouble with:
$textAndLink = $xpath->query("//th[@class='ddtitle']/*");
$i=1;
foreach($textAndLink as $info) {
foreach($info->childNodes as $child) {
if($i%2 == 0) {
echo $child->getAttribute('href') . '<br>';
} else {
echo $child->nodeValue . '<br>';
}
}
$i++;
}
I've also tried print_r($child) and the only items displayed are the text nodes, no <a> tags. How can I get both the anchor's "href" attribute and the text content? What I am expecting from the code above is a list like this:
http://foo.com/<br>
Linked text<br>
http://foo.com/secondlink<br>
Second linked text<br>
and so on and so forth.
Try this code snippet:
<?php
ini_set('display_errors', 1);
$string = '
<TABLE CLASS="datadisplaytable" SUMMARY="Layout table" width="100%"><CAPTION class="captiontext">Items Found</CAPTION>
<!-- START of first item in the table -->
<TR>
<TH CLASS="ddtitle" scope="colgroup" >Linked text</TH>
</TR>
<TR>
<TD CLASS="dddefault">
<SPAN class="fieldlabeltext">Term: </SPAN>Fall
<BR>
<SPAN class="fieldlabeltext">Registration: </SPAN>Jan 1, 2018 to Aug 1, 2018
<BR>
<SPAN class="fieldlabeltext">Levels: </SPAN>Undergraduate
<BR>
<BR>
Location
<BR>
Lecture Schedule Type
<BR>
3.000 Credits
<BR>
View Entry
<BR>
<BR>
<TABLE CLASS="datadisplaytable" SUMMARY="Meeting time table"><CAPTION class="captiontext">Scheduled Meeting Times</CAPTION>
<TR>
<TH CLASS="ddheader" scope="col" >Type</TH>
<TH CLASS="ddheader" scope="col" >Time</TH>
<TH CLASS="ddheader" scope="col" >Days</TH>
<TH CLASS="ddheader" scope="col" >Where</TH>
<TH CLASS="ddheader" scope="col" >Date Range</TH>
<TH CLASS="ddheader" scope="col" >Schedule Type</TH>
<TH CLASS="ddheader" scope="col" >Instructors</TH>
</TR>
<TR>
<TD CLASS="dddefault">Lecture</TD>
<TD CLASS="dddefault">9:20 am - 10:10 am</TD>
<TD CLASS="dddefault">MWF</TD>
<TD CLASS="dddefault">Some Building Room 101</TD>
<TD CLASS="dddefault">Aug 1, 2018 - Dec 1, 2018</TD>
<TD CLASS="dddefault">Lecture</TD>
<TD CLASS="dddefault">Instructor Name (<ABBR title= "Primary">P</ABBR>)<A HREF="mailto:email@foo.com" target="Instructur Name" ><IMG SRC="/wtlgifs/email.png" ALIGN="middle" ALT="E-mail" CLASS="headerImg" TITLE="E-mail" NAME="web_email" HSPACE=0 VSPACE=0 BORDER=0 HEIGHT=16 WIDTH=16></A></TD>
</TR>
</TABLE>
<BR>
<BR>
</TD>
</TR>';
$domDocument = new DOMDocument();
$domDocument->loadHTML($string);
$domXPath = new DOMXPath($domDocument);
$results = $domXPath->query('//tr/th[@class="ddtitle"]/a');
foreach($results as $result)
{
print_r($result->textContent);
print_r($result->getAttribute("href"));
}
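For what it's worth, the loop in the question only printed text because `//th[@class='ddtitle']/*` already selects the `<a>` element, so its `childNodes` are plain text nodes with no `href`. Below is a minimal sketch of the same idea as the snippet above. Note the assumptions: the sample markup lost the anchor inside `th.ddtitle`, so the `<a href="...">` elements here are reconstructed from the question's expected output, and `extractTitleLinks` is a made-up helper name.

```php
<?php
// Assumed markup: the anchors inside th.ddtitle are reconstructed from the
// question's description and expected output, not copied from the real page.
$html = '
<table class="datadisplaytable">
<tr><th class="ddtitle" scope="colgroup"><a href="http://foo.com/">Linked text</a></th></tr>
<tr><td class="dddefault">Details...</td></tr>
<tr><th class="ddtitle" scope="colgroup"><a href="http://foo.com/secondlink">Second linked text</a></th></tr>
</table>';

/**
 * Hypothetical helper: returns [['href' => ..., 'text' => ...], ...]
 * for every anchor that sits directly inside a th.ddtitle heading.
 */
function extractTitleLinks(string $html): array
{
    $dom = new DOMDocument();
    // Real pages are rarely valid HTML; collect parser warnings instead of printing them.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $links = [];
    // Select the <a> elements themselves so each node carries both href and text.
    foreach ($xpath->query("//th[@class='ddtitle']/a") as $a) {
        $links[] = [
            'href' => $a->getAttribute('href'),
            'text' => trim($a->textContent),
        ];
    }
    return $links;
}

foreach (extractTitleLinks($html) as $link) {
    echo $link['href'], "<br>\n", $link['text'], "<br>\n";
}
```

Because only `th.ddtitle` is matched, the nested "Meeting time table" (which uses `ddheader`) is ignored without having to rename the outer table's class first.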
I'm trying to use a Bootstrap 5 Collapse. The same code was copied from a similar page, but the rows do not expand to show the collapsed (hidden) row. Instead I get the error below in the browser console.
The browser's console has these lines at the bottom (I can click the top row to expand the remaining rows, which looks like the attached pic):
SyntaxError: The string did not match the expected pattern.
querySelector - index.js:64
e - index.js:64
(anonymous function) - collapse.js:317
If I tap any of the lines after the f symbols, the browser redirects me to the "Sources" tab to the relevant JS file for Bootstrap.
Here is the code (an excerpt, not the full page):
<?php
$x = 1;
$leadingScore = -100;
if (isset($leaderboard)) {
foreach ($leaderboard as $score) {
$pid = (string) $score['playerId'];
?>
<div class="collapse">
<tr data-bs-toggle='collapse' data-bs-target='#<?=$pid?>'>
<th scope="row">
<?php
if ($leadingScore != $score['score'])
echo $x;
?>
</th>
<td style="text-align: left;">
<?php echo $score['playerName']." (".$score['hcap'].")"; ?>
</td>
<td>
<?php
echo $score['score'];
$leadingScore = $score['score'];
?>
</td>
<td>
<?php
echo $score['thru'];
?>
</td>
</tr>
<tr id='<?=$pid?>' class='collapse'>
<td colspan=4>
<table class='table'>
<tr>
<th class='sub-th' scope='col'></th>
<th class='sub-th' scope='col'>1</th>
<th class='sub-th' scope='col'>2</th>
<th class='sub-th' scope='col'>3</th>
<th class='sub-th' scope='col'>4</th>
<th class='sub-th' scope='col'>5</th>
<th class='sub-th' scope='col'>6</th>
<th class='sub-th' scope='col'>7</th>
<th class='sub-th' scope='col'>8</th>
<th class='sub-th' scope='col'>9</th>
<th class='sub-th' scope='col'>TOTAL</th>
</tr>
</table>
</td>
</tr>
</div>
<?php
$x++;
}
}
?>
In the browser source, the primary table row's HTML is:
<div class="collapse">
<tr data-bs-toggle='collapse' data-bs-target='#609d0993906429612483cf49'>
The collapsible row's HTML is:
<tr id='609d0993906429612483cf49' class='collapse'>
<td colspan=4>
<table class='table'>
...
So the target and id attributes are populated from the DB.
As stated by @johansenja, starting the id with a letter rather than a digit resolved the issue.
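The reason this matters: `data-bs-target` is handed to `querySelector`, and a CSS id selector such as `#609d...` is not valid when the identifier starts with an unescaped digit, hence the SyntaxError. A minimal sketch of the fix, assuming you are free to choose the id format; `collapseId` and the `player-` prefix are made up for illustration.

```php
<?php
/**
 * Hypothetical helper: builds an HTML id that is safe to use in a "#id"
 * CSS selector by making sure it starts with a letter.
 */
function collapseId(string $playerId): string
{
    return 'player-' . $playerId;
}

$id = htmlspecialchars(collapseId('609d0993906429612483cf49'), ENT_QUOTES);

// The toggle row and the collapsible row must use the same prefixed id.
$toggleRow   = sprintf('<tr data-bs-toggle="collapse" data-bs-target="#%s">', $id);
$collapseRow = sprintf('<tr id="%s" class="collapse">', $id);

echo $toggleRow, "\n", $collapseRow, "\n";
```

In the template from the question this would mean replacing both `<?=$pid?>` occurrences with the prefixed value.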
I have the following table which I'm trying to scrape with SimpleHtmlDom.
How would I reference the date value of "Application Received" and put it into a variable? I've tried using the plain text approach, but this produces all the text in all of the td's.
<table id="simpleDetailsTable" summary="Case Details">
<tbody><tr>
<th scope="row" width="40%">
Reference
</th>
<td>
LA01/2018/0235/F
</td>
</tr>
<tr>
<th scope="row" width="40%">
Application Received
</th>
<td>
Fri 23 Feb 2018
</td>
</tr>
<tr>
<th scope="row" width="40%">
Address
</th>
<td>
206 Straid Road Bushmills.
</td>
</tr>
<tr>
<th scope="row" width="40%">
Proposal
</th>
<td>
Alterations to existing car showroom.
</td>
</tr>
<tr>
<th scope="row" width="40%">
Status
</th>
<td>
<span class="caseDetailsStatus">Application Received</span>
</td>
</tr>
<tr>
<th scope="row" width="40%">
Authority Decision
</th>
<td>
Not Available
</td>
</tr>
<tr>
<th scope="row" width="40%">
Authority Decision Date
</th>
<td>
Not Available
</td>
</tr>
<tr>
<th scope="row" width="40%">
PAC Decision
</th>
<td>
Not Available
</td>
</tr>
<tr>
<th scope="row" width="40%">
PAC Decision Date
</th>
<td>
Not Available
</td>
</tr>
</tbody></table>
Using the following code I'm able to echo out each listing.
curl_setopt($ch, CURLOPT_URL,$item);
$output = curl_exec($ch);
$html = str_get_html($output);
$table = $html->find('table',0);
$rowdata = array();
foreach($table->find('tr') as $row){
foreach($row->find('td') as $cell){
$listing = $cell->plaintext;
echo $listing,'<br>';
//echo '<pre>'; print_r($listing);echo'<pre/>';
}
}
which produces the following result;
LA01/2018/0254/O
Tue 27 Feb 2018
39 Lyttlesdale Garvagh.
Proposed new dwelling and proposed new paired access to include demolition of existing garage.
Application Received
Not Available
Not Available
Not Available
Not Available
LA10/2018/0265/F
Tue 27 Feb 2018
45 Glen Road Killyculla Tempo BT94 3JU
Replacement dwelling-amended position,house type (split level) and domestic garage from previous approval LA10/2017/0529/RM for same
Application Received
Not Available
Not Available
Not Available
Not Available
Can anyone put me right on what it is I should be doing?
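One possible approach, sketched with PHP's built-in DOMDocument and DOMXPath rather than SimpleHtmlDom (that swap is my choice, not something from the question): find the `th` whose text is the label and take the `td` right after it. The table below is a shortened copy of the one in the question, and `detailValue` is a hypothetical helper name.

```php
<?php
// Shortened copy of the table from the question.
$html = '
<table id="simpleDetailsTable" summary="Case Details">
<tbody>
<tr><th scope="row" width="40%"> Reference </th><td> LA01/2018/0235/F </td></tr>
<tr><th scope="row" width="40%"> Application Received </th><td> Fri 23 Feb 2018 </td></tr>
<tr><th scope="row" width="40%"> Status </th><td><span class="caseDetailsStatus">Application Received</span></td></tr>
</tbody>
</table>';

/**
 * Hypothetical helper: returns the text of the <td> that follows the <th>
 * with the given label, or null when no such row exists.
 * Assumes $label contains no single quote.
 */
function detailValue(string $html, string $label): ?string
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true); // keep warnings about sloppy markup quiet
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $query = sprintf(
        "//table[@id='simpleDetailsTable']//th[normalize-space()='%s']/following-sibling::td[1]",
        $label
    );
    $cell = $xpath->query($query)->item(0);

    return $cell === null ? null : trim($cell->textContent);
}

$received = detailValue($html, 'Application Received');
echo $received;
```

Matching on the `th` text means the "Status" row, whose `td` also happens to say "Application Received", is not picked up by mistake.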
I'm working on a PHP/MySQL application and need to make the output of one column in each row an HTML link to another page. I have not been able to find any relevant information. The "" needs to be the link. Thanks for any help.
<table id="display" style="width:800px;">
<tr>
<th width="40">ID</th>
<th width="70">Last</th>
<th width="70">First</th>
<th width="10">Middle</th>
<th width="70">Birth</th>
<th width="70">Death</th>
<th width="170">Notes</th>
<th width="100">Cemetery</th>
</tr>
<?
while($objResult = mysql_fetch_array($objQuery))
{
?>
<tr>
<td style='text-align:center;'><?=$objResult["id"];?></td>
<td style='text-align:center;'><?=$objResult["last"];?></td>
<td style='text-align:center;'><?=$objResult["first"];?></td>
<td style='text-align:center;'><?=$objResult["middle"];?></td>
<td style='text-align:center;'><?=$objResult["birth"];?></td>
<td style='text-align:center;'><?=$objResult["death"];?></td>
<td><?=$objResult["notes"];?></td>
<td style='text-align:center;'><?=$objResult["cemetery"];?></td>
</tr>
<?
}
?>
</table>
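A minimal sketch of one way to do it: wrap the cell's text in an `<a>` whose URL carries the row's id. Everything specific here is an assumption, since the question doesn't say which column or target page is meant: `details.php`, the `id` query parameter, and the `linkCell` helper are placeholders.

```php
<?php
/**
 * Hypothetical helper: renders one table cell whose text links to a detail
 * page. The page name and the "id" query parameter are placeholders.
 */
function linkCell(string $text, string $page, int $id): string
{
    $url = $page . '?' . http_build_query(['id' => $id]);

    return sprintf(
        '<td style="text-align:center;"><a href="%s">%s</a></td>',
        htmlspecialchars($url, ENT_QUOTES),  // escape the URL for the attribute
        htmlspecialchars($text, ENT_QUOTES)  // escape the visible text
    );
}

// Stand-in for one row fetched from the database.
$row = ['id' => 7, 'last' => 'Smith'];

echo linkCell($row['last'], 'details.php', (int) $row['id']);
```

Inside the `while` loop from the question, the matching `<td>` line would be replaced by a call like this one, with the target page reading the id from `$_GET['id']`.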
I'm having some issues getting all the data I need from two specific html tables. Tables at the bottom of this post.
The code above states html table id "table1". I also need to grab values from a table called "table2" in the exact same format. I have tried this code and can extract the td values but not the few values that are within the span specifiers within the td. I've tried multiple ways to do this but I'm just not getting it. My code looks something like:
$dom = file_get_html("internets.html");
//not sure how to specify the table exactly!? because this code didn't work.
//$tds = $dom->find('table[id=table1]',0)->find('tr');
foreach($dom->find('tr') as $key => $tr)
{
$td = $tr->find('td');
echo $td[0]->innertext . "<br>";
}
Any assistance much appreciated. I have done some searching here and also used the simple php dom manual.
Here is the format of a table:
<table id="table1">
<tbody>
<tr>
<th width="48%" scope="row">
Prev Close:
</th>
<td class="yfnc_tabledata1">
0.02
</td>
</tr>
<tr>
<th width="48%" scope="row">
Open:
</th>
<td class="yfnc_tabledata1">
0.02
</td>
</tr>
<tr>
<th width="48%" scope="row">
Bid:
</th>
<td class="yfnc_tabledata1">
<span id="yfs_b00_pgo.ax">
0.0180
</span>
</td>
</tr>
<tr>
<th width="48%" scope="row"></th>
<td class="yfnc_tabledata1"></td>
</tr>
<tr>
<th width="48%" scope="row">
1y Target Est:
</th>
<td class="yfnc_tabledata1">
N/A
</td>
</tr>
<tr>
<th width="48%" scope="row">
Beta:
</th>
<td class="yfnc_tabledata1">
N/A
</td>
</tr>
<tr>
<th width="54%" scope="row">
Next Earnings Date:
</th>
<td class="yfnc_tabledata1">
N/A
</td>
</tr>
</tbody>
</table>
<?php
$html=<<<XHTML
<table id="table1">
<tbody>
<tr>
<th width="48%" scope="row">
Prev Close:
</th>
<td class="yfnc_tabledata1">
0.02
</td>
</tr>
<tr>
<th width="48%" scope="row">
Open:
</th>
<td class="yfnc_tabledata1">
0.02
</td>
</tr>
<tr>
<th width="48%" scope="row">
Bid:
</th>
<td class="yfnc_tabledata1">
<span id="yfs_b00_pgo.ax">
0.0180
</span>
</td>
</tr>
<tr>
<th width="48%" scope="row"></th>
<td class="yfnc_tabledata1"></td>
</tr>
<tr>
<th width="48%" scope="row">
1y Target Est:
</th>
<td class="yfnc_tabledata1">
N/A
</td>
</tr>
<tr>
<th width="48%" scope="row">
Beta:
</th>
<td class="yfnc_tabledata1">
N/A
</td>
</tr>
<tr>
<th width="54%" scope="row">
Next Earnings Date:
</th>
<td class="yfnc_tabledata1">
N/A
</td>
</tr>
</tbody>
</table>
XHTML;
$dom = new DOMDocument;
$dom->loadHTML($html);
$xp = new DOMXPath($dom);
foreach ($xp->query("//table[@id='table1']//td") as $i => $node) {
    // nodeValue also includes text nested inside child elements such as <span>
    echo trim($node->nodeValue), PHP_EOL;
}
?>
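If you need the values keyed by their labels, and from both tables, here is a rough sketch along the same lines. It assumes "table2" has the same th/td row layout as "table1" (as the question says); the "Volume:" row is invented sample data, and `tableValues` is a hypothetical helper.

```php
<?php
// table1 rows are taken from the question; the table2 row is invented sample data.
$html = '
<table id="table1"><tbody>
<tr><th scope="row">Prev Close:</th><td class="yfnc_tabledata1"> 0.02 </td></tr>
<tr><th scope="row">Bid:</th><td class="yfnc_tabledata1"><span id="yfs_b00_pgo.ax"> 0.0180 </span></td></tr>
<tr><th scope="row"></th><td class="yfnc_tabledata1"></td></tr>
</tbody></table>
<table id="table2"><tbody>
<tr><th scope="row">Volume:</th><td class="yfnc_tabledata1"><span>1,000</span></td></tr>
</tbody></table>';

/**
 * Hypothetical helper: returns ['label' => 'value', ...] for one table,
 * skipping rows whose heading cell is empty.
 */
function tableValues(DOMXPath $xp, string $tableId): array
{
    $values = [];
    foreach ($xp->query("//table[@id='$tableId']//tr") as $row) {
        // string(...) flattens nested elements, so text inside <span> is included.
        $label = trim($xp->evaluate('string(th)', $row));
        $value = trim($xp->evaluate('string(td)', $row));
        if ($label !== '') {
            $values[rtrim($label, ':')] = $value;
        }
    }
    return $values;
}

$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();
$xp = new DOMXPath($dom);

print_r(tableValues($xp, 'table1'));
print_r(tableValues($xp, 'table2'));
```

Passing the row as the second argument to `evaluate` keeps each lookup relative to that row, which is what ties a label to its own value.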
I have a spinner. When the form is submitted, the page should display the word "quest" as many times as the number in the spinner. E.g., if the number in the spinner is 3, it should display "quest" 3 times in the table.
The problem is displaying it in the table.
At the moment with my current code it is displaying it like this:
quest
quest
quest
Question Id, Option Type, Duration .... These are table headings
It is displaying the words quest outside the table
Instead I want the word "quest" to be displayed in the Question Id column like this:
Question Id, Option Type, Duration...
quest
quest
quest
How can I get it to display it like the example above?
Below is the code:
<table border=1 id="qandatbl" align="center">
<tr>
<th class="col1">Question No</th>
<th class="col2">Option Type</th>
<th class="col1">Duration</th>
<th class="col2">Weight(%)</th>
<th class="col1">Answer</th>
<th class="col2">Video</th>
<th class="col1">Audio</th>
<th class="col2">Image</th>
</tr>
<?php
$spinnerCount = $_POST['txtQuestion'];
if($spinnerCount > 0) {
for($i = 1; $i <= $spinnerCount; $i++) {
echo "<tr>quest";
}
}
?>
<td class='qid'></td>
<td class="options"></td>
<td class="duration"></td>
<td class="weight"></td>
<td class="answer"></td>
<td class="video"></td>
<td class="audio"></td>
<td class="image"></td>
</tr>
</table>
I did try echo "<td class='qid'></td>"; but this completely failed as well.
Try this:
<table border=1 id="qandatbl" align="center">
<tr>
<th class="col1">Question No</th>
<th class="col2">Option Type</th>
<th class="col1">Duration</th>
<th class="col2">Weight(%)</th>
<th class="col1">Answer</th>
<th class="col2">Video</th>
<th class="col1">Audio</th>
<th class="col2">Image</th>
</tr>
<?php
$spinnerCount = $_POST['txtQuestion'];
if($spinnerCount > 0) {
for($i = 1; $i <= $spinnerCount; $i++) {
?>
<tr>
<td class='qid'><?php echo 'quest'; ?></td>
<td class="options"></td>
<td class="duration"></td>
<td class="weight"></td>
<td class="answer"></td>
<td class="video"></td>
<td class="audio"></td>
<td class="image"></td>
</tr>
<?php
} // For
} // If
?>
</table>
Is this what you want to do? Display "quest" in the first column?
<table border=1 id="qandatbl" align="center">
<tr>
<th class="col1">Question No</th>
<th class="col2">Option Type</th>
<th class="col1">Duration</th>
<th class="col2">Weight(%)</th>
<th class="col1">Answer</th>
<th class="col2">Video</th>
<th class="col1">Audio</th>
<th class="col2">Image</th>
</tr>
<?php
$spinnerCount = $_POST['txtQuestion'];
if($spinnerCount > 0) {
for($i = 1; $i <= $spinnerCount; $i++) { ?>
<tr>
<td class='qid'>quest</td>
<td class="options"></td>
<td class="duration"></td>
<td class="weight"></td>
<td class="answer"></td>
<td class="video"></td>
<td class="audio"></td>
<td class="image"></td>
</tr>
<?php
}
}
?>
</table>
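Both answers boil down to the same thing: text placed directly inside `<tr>` but outside any `<td>` gets pushed out of the table by the browser, so the whole row has to be produced inside the loop. Here is a small sketch of that as a function, which also casts the submitted value to an integer. `questionRows` is a hypothetical helper; the class names are the ones from the question.

```php
<?php
/**
 * Hypothetical helper: builds $count complete table rows, each with "quest"
 * in the first cell and the remaining cells left empty.
 */
function questionRows(int $count): string
{
    $emptyCells = ['options', 'duration', 'weight', 'answer', 'video', 'audio', 'image'];
    $rows = '';
    for ($i = 1; $i <= $count; $i++) {
        // The text goes inside a <td>, never directly inside the <tr>.
        $rows .= '<tr><td class="qid">quest</td>';
        foreach ($emptyCells as $class) {
            $rows .= '<td class="' . $class . '"></td>';
        }
        $rows .= "</tr>\n";
    }
    return $rows;
}

// Cast the submitted value to an integer; a missing field yields no rows.
$spinnerCount = (int) ($_POST['txtQuestion'] ?? 0);
echo questionRows($spinnerCount);
```

The call would sit between the heading row and the closing `</table>` tag.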