How can I merge the date td/cell into the time td/cell?
I would like the table row to consist of 3 cells, the middle cell should read date time.
My Code:
$dom = new DOMDocument;
$dom->loadHTMLFile("test.html");
$dom->validateOnParse = true;
$xpath = new DOMXPath($dom);
$table = $xpath->query("//*[@class='mytable']//tbody")->item(0);
$td = $table->getElementsByTagName("td");
test.html file contents:
<table class="mytable">
<tbody><tr>
<td>date</td>
<td>td1</td>
<td>time</td>
<td>td2</td>
</tr></tbody>
</table>
Desired result:
<table class="mytable">
<tbody><tr>
<td>td1</td>
<td>date time</td>
<td>td2</td>
</tr></tbody>
</table>
Collect the cells of each tbody tr. Overwrite the 3rd cell's text with the 1st cell's text followed by its own text, then delete the first cell.
Code (Demo)
$html = <<<HTML
<table class="mytable">
<tbody><tr>
<td>date</td>
<td>td1</td>
<td>time</td>
<td>td2</td>
</tr></tbody>
</table>
HTML;
$dom = new DOMDocument;
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($dom);
foreach ($xpath->query("//table[@class='mytable']/tbody/tr") as $tr) {
    $tds = $tr->getElementsByTagName("td");
    $tds->item(2)->nodeValue = $tds->item(0)->nodeValue .
        ' ' . $tds->item(2)->nodeValue;
    $tr->removeChild($tds->item(0));
}
echo $dom->saveHTML();
Output:
<table class="mytable">
<tbody><tr>
<td>td1</td>
<td>date time</td>
<td>td2</td>
</tr></tbody>
</table>
<table>
<tr>
<th>Year</th>
<th>Score</th>
</tr>
<tr>
<td>2014</td>
<td>3078</td>
</tr>
</table>
If I have the above table being successfully stored as a variable, how could I append it to a div with an overflow-x style attribute?
I've tried the following snippet but no cigar:
$div = str_get_html('<div style="overflow-x:auto;"></div>');
$div = $div->find('div');
$div = $div->appendChild($table);
return $div;
so expected output should be:
<div style="overflow-x:auto;">
<table>
<tr>
<th>Year</th>
<th>Score</th>
</tr>
<tr>
<td>2014</td>
<td>3078</td>
</tr>
</table>
</div>
This should give you a basic idea of one implementation, using DOMDocument:
<?php
ini_set('display_errors', 1);
//creating table node
$tableNode='<table><tr><th>Year</th><th>Score</th></tr><tr><td>2014</td><td>3078</td></tr></table>';
$domDocument = new DOMDocument();
$domDocument->encoding="UTF-8";
$domDocument->loadHTML($tableNode);
$domXPath = new DOMXPath($domDocument);
$table = $domXPath->query("//table")->item(0);
//creating empty div node.
$domDocument = new DOMDocument();
$element=$domDocument->createElement("div");
$element->setAttribute("style", "overflow-x:auto;");
$result=$domDocument->importNode($table,true);//importing the node (and its children) from the other DOMDocument
$element->appendChild($result);
echo $domDocument->saveHTML($element);
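If the table already lives in a DOMDocument, a second document and importNode() may not be needed. The following is a minimal sketch, assuming the table markup is available as a string (the variable names $tableHtml, $div and $wrapped are illustrative, not from the question): it creates the div in the same document, puts it where the table was, and then moves the table inside it.

```php
<?php
$tableHtml = '<table><tr><th>Year</th><th>Score</th></tr><tr><td>2014</td><td>3078</td></tr></table>';

$dom = new DOMDocument();
$dom->loadHTML($tableHtml);

$table = $dom->getElementsByTagName('table')->item(0);

// Create the wrapper in the same document, so no importNode() is required
$div = $dom->createElement('div');
$div->setAttribute('style', 'overflow-x:auto;');

// Put the div where the table was, then move the table inside it
$table->parentNode->replaceChild($div, $table);
$div->appendChild($table);

// Serialize only the wrapper and its contents
$wrapped = $dom->saveHTML($div);
echo $wrapped;
```

Passing the node to saveHTML() limits the output to the div and its contents, leaving out the html and body elements that loadHTML() adds.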
I have HTML content like this:
$html = <<<EOF
<table id="specialTbl">
<tbody>
<tr>
<td> row-1-td-1</td>
<td> row-1-td-2</td>
<td> row-1-td-3</td>
<td>
<table class="runsOn"> // Problem starts here
<tbody>
<tr>
<td>row-1-td-4-Child-1</td>
<td>row-1-td-4-Child-2</td>
</tr>
</tbody>
</table>
</td>
<td> row-1-td-5</td>
<td> row-1-td-6</td>
</tr>
<tr>
<td> row-2-td-1</td>
<td> row-2-td-2</td>
<td> row-2-td-3</td>
<td>
<table class="runsOn">
<tbody>
<tr>
<td>row-2-td-4-Child-1</td>
<td>row-2-td-4-Child-2</td>
</tr>
</tbody>
</table>
</td>
<td> row-2-td-5</td>
<td> row-2-td-6</td>
</tr>
<tr>
<td> row-3-td-1</td>
<td> row-3-td-2</td>
<td> row-3-td-3</td>
<td>
<table class="runsOn">
<tbody>
<tr>
<td>row-3-td-4-Child-1</td>
<td>row-3-td-4-Child-2</td>
</tr>
</tbody>
</table>
</td>
<td> row-3-td-5</td>
<td> row-3-td-6</td>
</tr>
<tr>
<td> row-4-td-1</td>
<td> row-4-td-2</td>
<td> row-4-td-3</td>
<td>
<table class="runsOn">
<tbody>
<tr>
<td>row-4-td-4-Child-1</td>
<td>row-4-td-4-Child-2</td>
</tr>
</tbody>
</table>
</td>
<td> row-4-td-5</td>
<td> row-4-td-6</td>
</tr>
<tr>
<td> row-5-td-1</td>
<td> row-5-td-2</td>
<td> row-5-td-3</td>
<td>
<table class="runsOn">
<tbody>
<tr>
<td>row-5-td-4-Child-1</td>
<td>row-5-td-4-Child-2</td>
</tr>
</tbody>
</table>
</td>
<td> row-5-td-5</td>
<td> row-5-td-6</td>
</tr>
<tr>
<td> row-6-td-1</td>
<td> row-6-td-2</td>
<td> row-6-td-3</td>
<td>
<table class="runsOn">
<tbody>
<tr>
<td>row-6-td-4-Child-1</td>
<td>row-6-td-4-Child-2</td>
</tr>
</tbody>
</table>
</td>
<td> row-6-td-5</td>
<td> row-6-td-6</td>
</tr>
<tr>
<td> row-7-td-1</td>
<td> row-7-td-2</td>
<td> row-7-td-3</td>
<td>
<table class="runsOn">
<tbody>
<tr>
<td>row-7-td-4-Child-1</td>
<td>row-7-td-4-Child-2</td>
</tr>
</tbody>
</table>
</td>
<td> row-7-td-5</td>
<td> row-7-td-6</td>
</tr>
</tbody>
</table>
EOF;
$html= str_get_html($html);
$table =$html->find('table#specialTbl',0) ;
$response["response_code"] = 200;
$response["rows"] = array();
foreach($table->find('tr') as $key=>$value) {
$post["td1"]= trim(strip_tags($value->find('td',0)->plaintext));
$post["td2"]= trim(strip_tags($value->find('td',1)->plaintext));
$post["td3"]= trim(strip_tags($value->find('td',2)->plaintext));
$post["td4"]= trim(strip_tags($value->find('td',3)->plaintext));
$post["td5"]= trim(strip_tags($value->find('td',4)->plaintext));
$post["td6"]= trim(strip_tags($value->find('td',5)->plaintext));
array_push($response["rows"], $post);
}
$json_content = json_encode($response);
echo $json_content;
And the JSON response is:
{
"response_code":200,
"rows":[
{
"td1":"row-1-td-1",
"td2":"row-1-td-2",
"td3":"row-1-td-3",
"td4":"row-1-td-4-Child-1 row-1-td-4-Child-2",
"td5":"row-1-td-4-Child-1",
"td6":"row-1-td-4-Child-2"
},
{
"td1":"row-1-td-4-Child-1",
"td2":"row-1-td-4-Child-2",
"td3":"",
"td4":"",
"td5":"",
"td6":""
},
{
"td1":"row-2-td-1",
"td2":"row-2-td-2",
"td3":"row-2-td-3",
"td4":"row-2-td-4-Child-1 row-2-td-4-Child-2",
"td5":"row-2-td-4-Child-1",
"td6":"row-2-td-4-Child-2"
},
{
"td1":"row-2-td-4-Child-1",
"td2":"row-2-td-4-Child-2",
"td3":"",
"td4":"",
"td5":"",
"td6":""
},
{
"td1":"row-3-td-1",
"td2":"row-3-td-2",
"td3":"row-3-td-3",
"td4":"row-3-td-4-Child-1 row-3-td-4-Child-2",
"td5":"row-3-td-4-Child-1",
"td6":"row-3-td-4-Child-2"
},
{
"td1":"row-3-td-4-Child-1",
"td2":"row-3-td-4-Child-2",
"td3":"",
"td4":"",
"td5":"",
"td6":""
},
{
"td1":"row-4-td-1",
"td2":"row-4-td-2",
"td3":"row-4-td-3",
"td4":"row-4-td-4-Child-1 row-4-td-4-Child-2",
"td5":"row-4-td-4-Child-1",
"td6":"row-4-td-4-Child-2"
},
{
"td1":"row-4-td-4-Child-1",
"td2":"row-4-td-4-Child-2",
"td3":"",
"td4":"",
"td5":"",
"td6":""
},
{
"td1":"row-5-td-1",
"td2":"row-5-td-2",
"td3":"row-5-td-3",
"td4":"row-5-td-4-Child-1 row-5-td-4-Child-2",
"td5":"row-5-td-4-Child-1",
"td6":"row-5-td-4-Child-2"
},
{
"td1":"row-5-td-4-Child-1",
"td2":"row-5-td-4-Child-2",
"td3":"",
"td4":"",
"td5":"",
"td6":""
},
{
"td1":"row-6-td-1",
"td2":"row-6-td-2",
"td3":"row-6-td-3",
"td4":"row-6-td-4-Child-1 row-6-td-4-Child-2",
"td5":"row-6-td-4-Child-1",
"td6":"row-6-td-4-Child-2"
},
{
"td1":"row-6-td-4-Child-1",
"td2":"row-6-td-4-Child-2",
"td3":"",
"td4":"",
"td5":"",
"td6":""
},
{
"td1":"row-7-td-1",
"td2":"row-7-td-2",
"td3":"row-7-td-3",
"td4":"row-7-td-4-Child-1 row-7-td-4-Child-2",
"td5":"row-7-td-4-Child-1",
"td6":"row-7-td-4-Child-2"
},
{
"td1":"row-7-td-4-Child-1",
"td2":"row-7-td-4-Child-2",
"td3":"",
"td4":"",
"td5":"",
"td6":""
}
]
}
The problem is with the foreach. How can I skip the tr elements inside a td? There are 7 rows in the table with id "specialTbl", but the foreach returns 14 rows in the JSON because it also loops through the nested runsOn tables.
How can I avoid looping through the table inside the 4th td?
It would be easier to use DOMDocument with XPath, as follows. DOMDocument ships with PHP 5 and later, and this gives you the desired output.
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXpath($doc);
$response["response_code"] = 200;
$response["rows"] = array();
$trs = $xpath->query("//table[@id='specialTbl']/tbody/tr"); // all child tr's in all child tbody's in any table that has id 'specialTbl'
foreach ($trs as $tr) {
$post = array();
$tds = $xpath->query("td", $tr); // all child td's in $tr
foreach ($tds as $key => $td) {
$post["td" . ++$key] = $td->textContent;
}
array_push($response["rows"], $post);
}
$json_content = json_encode($response);
echo $json_content;
But you could also keep using http://simplehtmldom.sourceforge.net/manual.htm and use css-like selectors (untested code, I don't have simplehtmldom):
$html= str_get_html($html);
$response["response_code"] = 200;
$response["rows"] = array();
$trs = $html->find("table#specialTbl>tbody>tr");
foreach ($trs as $tr) {
$post = array();
$tds = $tr->children();
foreach ($tds as $key => $td) {
$post["td" . ++$key] = $td->innertext;
}
array_push($response["rows"], $post);
}
$json_content = json_encode($response);
echo $json_content;
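For the DOMDocument/XPath version, the cell text in the question has leading spaces, so the values may need trimming. Here is a small sketch on a shortened, made-up table (two rows, three cells; the cell contents are placeholders) that uses XPath's normalize-space() for that while still selecting direct child rows only:

```php
<?php
$html = <<<HTML
<table id="specialTbl"><tbody>
<tr><td> a1</td><td><table class="runsOn"><tbody><tr><td>n1</td><td>n2</td></tr></tbody></table></td><td> a3</td></tr>
<tr><td> b1</td><td><table class="runsOn"><tbody><tr><td>m1</td><td>m2</td></tr></tbody></table></td><td> b3</td></tr>
</tbody></table>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$rows = [];
// "/" selects direct children only, so rows of the nested tables are never visited
foreach ($xpath->query("//table[@id='specialTbl']/tbody/tr") as $tr) {
    $row = [];
    foreach ($xpath->query("td", $tr) as $i => $td) {
        // normalize-space() trims and collapses whitespace in the cell's text
        $row["td" . ($i + 1)] = $xpath->evaluate("normalize-space(.)", $td);
    }
    $rows[] = $row;
}

echo json_encode(["response_code" => 200, "rows" => $rows]);
```

Note that the text of a cell holding a nested table is still the concatenated text of that nested table; only the extra rows are avoided.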
You could use PHP's DOM parser, and before searching for the tr items, get it to prune out all the nested tables from the HTML structure:
// Parse the HTML into a DOM object & find the table by ID
$doc = new DOMDocument();
$doc->loadHTML($html);
$table = $doc->getElementById('specialTbl');
// Remove all nested TRs from the DOM table object
$nested = $table->getElementsByTagName('table');
foreach ($nested as $element)
{
    // Remove the TRs from this nested table.
    // The node list is live, so copy it first; otherwise removals can skip rows.
    foreach (iterator_to_array($element->getElementsByTagName('tr')) as $tr)
        $tr->parentNode->removeChild($tr);
}
// Now when we search through TRs, we only get the top level ones
$rows = $table->getElementsByTagName('tr');
$response = array();
foreach ($rows as $row)
{
// Collect the values of this row's TDs
$tds = array();
foreach ($row->getElementsByTagName('td') as $td)
{
$tds[] = trim($td->nodeValue);
}
// Add this row to the response
$response['rows'][] = $tds;
}
// Add extra response details
$response['response_code'] = 200; // You shouldn't need to explicitly send this
$json = json_encode($response);
// Output JSON
header('Content-type: application/json'); // Use the correct MIME type
echo $json;
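One thing to watch with this pruning approach: getElementsByTagName() returns a live list, so removing nodes while iterating over it directly can skip elements when a nested table has more than one row. A minimal sketch of the safer pattern, on a made-up four-row table, copies the list with iterator_to_array() before removing anything:

```php
<?php
$doc = new DOMDocument();
$doc->loadHTML('<table><tr><td>1</td></tr><tr><td>2</td></tr><tr><td>3</td></tr><tr><td>4</td></tr></table>');

$table = $doc->getElementsByTagName('table')->item(0);

// Copy the live node list into a plain array before mutating the tree
$rows = iterator_to_array($table->getElementsByTagName('tr'));
foreach ($rows as $tr) {
    $tr->parentNode->removeChild($tr);
}

// Count what is left in the table after the removals
$remaining = $table->getElementsByTagName('tr')->length;
echo $remaining;
```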
I have this table in output from a program (string converted in a DomDocument in PHP):
<table>
<tr>
<td width="50">Â </td>
<td>My content</td>
<td width="50">Â </td>
</tr>
</table>
I need to remove the two <td width="50">Â </td> cells (I don't know why the program adds them, but there they are), like this:
<table>
<tr>
<td>My content</td>
</tr>
</table>
What's the best way to do it in PHP?
Edit:
The program is JasperReport Server. I call the report rendering function from a web application:
//this is the call to the server library that generates the report
$reportGen = $reportServer->runReport($myReport);
$domDoc = new \DomDocument();
$domDoc->loadHTML($reportGen);
return $domDoc->saveHTML($domDoc->getElementsByTagName('table')->item(0));
This returns the table shown above, which I need to fix.
Try this
<?php
$domDoc = new DomDocument();
$domDoc->loadHTML($reportGen);
$xpath = new DOMXpath($domDoc);
$tags = $xpath->query('//td');
foreach($tags as $tag) {
$value = $tag->nodeValue;
if(preg_match('/^(Â )/',$value))
$tag->parentNode->removeChild($tag);
}
?>
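A variation on the same XPath idea, sketched under the assumption that the unwanted cells always carry width="50" and contain only a non-breaking space (which is what the "Â " usually is once the encoding is read correctly). It selects the cells by attribute instead of matching the garbled character; the sample $reportGen string here is a stand-in for the real report output.

```php
<?php
// Stand-in for the report output; the real string comes from the report server
$reportGen = '<table><tr><td width="50">&nbsp;</td><td>My content</td><td width="50">&nbsp;</td></tr></table>';

$domDoc = new DOMDocument();
$domDoc->loadHTML($reportGen);
$xpath = new DOMXPath($domDoc);

// XPath results are a static list, so removing nodes inside the loop is safe
foreach ($xpath->query('//td[@width="50"]') as $td) {
    // Only drop the cell if it holds nothing but whitespace / non-breaking spaces
    $text = str_replace("\u{00A0}", '', $td->textContent);
    if (trim($text) === '') {
        $td->parentNode->removeChild($td);
    }
}

$table = $domDoc->getElementsByTagName('table')->item(0);
echo $domDoc->saveHTML($table);
```

The "\u{00A0}" escape needs PHP 7 or later.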
Regex and replace:
$var = '<table>
<tr>
<td width="50">Ã</td>
<td>My interesting content</td>
<td width="50">Ã</td>
</tr>
</table>';
$final = preg_replace('#(<td width="50".*?>).*?(</td>)#', '$1$2', $var);
$final = str_replace('<td width="50"></td>', '', $final);
echo $final;
I use regex for HTML parsing but I need your help to parse the following table:
<table class="resultstable" width="100%" align="center">
<tr>
<th width="10">#</th>
<th width="10"></th>
<th width="100">External Volume</th>
</tr>
<tr class='odd'>
<td align="center">1</td>
<td align="left">
http://xyz.com
</td>
<td align="right">210,779,783<br />(939,265 / 499,584)</td>
</tr>
<tr class='even'>
<td align="center">2</td>
<td align="left">
http://abc.com
</td>
<td align="right">57,450,834<br />(288,915 / 62,935)</td>
</tr>
</table>
I want to get all domains with their volume (in an array or variable), for example:
http://xyz.com - 210,779,783
Should I use regex or an HTML DOM parser in this case? I don't know how to parse a large table. Can you please help? Thanks.
Here's an XPath example that happens to parse the HTML from the question.
<?php
$dom = new DOMDocument();
$dom->loadHTMLFile("./input.html");
$xpath = new DOMXPath($dom);
$trs = $xpath->query("//table[@class='resultstable'][1]/tr");
foreach ($trs as $tr) {
$tdList = $xpath->query("td[2]/a", $tr);
if ($tdList->length == 0) continue;
$name = $tdList->item(0)->nodeValue;
$tdList = $xpath->query("td[3]", $tr);
$vol = $tdList->item(0)->childNodes->item(0)->nodeValue;
echo "name: {$name}, vol: {$vol}\n";
}
?>
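To collect the results into an array, as the question asks, something along these lines might work. It is only a sketch and assumes the domain sits as plain text in the second cell, as in the table posted above; the $volumes array name is illustrative.

```php
<?php
$html = <<<HTML
<table class="resultstable" width="100%" align="center">
<tr><th width="10">#</th><th width="10"></th><th width="100">External Volume</th></tr>
<tr class='odd'><td align="center">1</td><td align="left">
http://xyz.com
</td><td align="right">210,779,783<br />(939,265 / 499,584)</td></tr>
<tr class='even'><td align="center">2</td><td align="left">
http://abc.com
</td><td align="right">57,450,834<br />(288,915 / 62,935)</td></tr>
</table>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$volumes = [];
// tr[td] skips the header row, which only contains <th> cells
foreach ($xpath->query("//table[@class='resultstable']//tr[td]") as $tr) {
    $domain = $xpath->evaluate("normalize-space(td[2])", $tr);
    // text()[1] is the first text node of the cell, i.e. the part before <br>
    $volume = $xpath->evaluate("normalize-space(td[3]/text()[1])", $tr);
    $volumes[$domain] = $volume;
}

foreach ($volumes as $domain => $volume) {
    echo "{$domain} - {$volume}\n";
}
```

Keying the array by domain keeps the lookup simple; if the same domain can appear in several rows, a list of pairs would be the safer structure.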