Add dynamic classes to an unordered list with PHP - php

I'm trying to add consecutive classes to all list-items in a list with the class of 'nav'. Essentially, I want every list-item to have a class of 'nthChild-x', where x represents its position in the list. I'm a major noob to PHP, so be easy.
Here is the current markup:
<ul id="primaryNav" class="nav">
<li>Blah Blah Uno</li>
<li>Blah Blah Dos</li>
<li>Blah Blah Tres</li>
</ul>
I want this list to be rendered as the following:
<ul id="primaryNav" class="nav">
<li class="nthChild-1">Blah Blah Uno</li>
<li class="nthChild-2">Blah Blah Dos</li>
<li class="nthChild-3">Blah Blah Tres</li>
</ul>
Please don't reply with a JavaScript or JQuery solution. I know how to do this with JS but need this to be server-side. Also, I don't necessarily want to target the ID of the list because I'd rather do it once and target all lists (though that could be a start).
Any ideas?

You can use DOMDocument for that.
This one will work with existing classes and won't add the same class twice.
$dom = new DOMDocument;
$dom->loadHTML($html);
$lists = $dom->getElementsByTagName('ul');
foreach($lists as $list) {
$index = 1;
foreach($list->childNodes as $node) {
if ($node->nodeName != 'li') {
continue;
}
$class = array();
if ($node->hasAttribute('class')) {
$class = preg_split('/\s+/', $node->getAttribute('class'));
}
$addClass = 'nthChild-' . $index;
$index++; // count every <li>, including ones that already carry the class
if (in_array($addClass, $class)) {
continue;
}
$class[] = $addClass;
$node->setAttribute('class', implode(' ', $class));
}
}
$html = '';
foreach($dom->getElementsByTagName('body')->item(0)->childNodes as $element) {
$html .= $dom->saveXML($element, LIBXML_NOEMPTYTAG);
}
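Since the question asks for lists with the class 'nav' only, here is a minimal sketch of the same DOMDocument idea restricted by an XPath query. The function name addNthChildClasses and its $listClass parameter are made up for illustration; the XPath expression matches a whole class token, so class="nav main" matches but class="navbar" does not.

```php
<?php
// Sketch only: addNthChildClasses() is a hypothetical helper, not a PHP built-in.
function addNthChildClasses(string $html, string $listClass = 'nav'): string
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);   // keep libxml warnings about fragments quiet
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    // Match <ul> elements whose class attribute contains $listClass as a whole token.
    $query = "//ul[contains(concat(' ', normalize-space(@class), ' '), ' $listClass ')]";

    foreach ($xpath->query($query) as $list) {
        $index = 1;
        foreach ($list->childNodes as $node) {
            if ($node->nodeName !== 'li') {
                continue;               // skip whitespace text nodes and other children
            }
            $classes = preg_split('/\s+/', $node->getAttribute('class'), -1, PREG_SPLIT_NO_EMPTY);
            $classes[] = 'nthChild-' . $index++;
            $node->setAttribute('class', implode(' ', array_unique($classes)));
        }
    }

    // loadHTML() wraps the fragment in <html><body>; return only the body's children.
    $out = '';
    foreach ($dom->getElementsByTagName('body')->item(0)->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

echo addNthChildClasses('<ul id="primaryNav" class="nav"><li>Blah Blah Uno</li><li>Blah Blah Dos</li></ul>');
```

Only direct <li> children are numbered, so a nested list inside an item starts counting from 1 again if it also carries the class.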

I suppose you don't use a template system (like Smarty) to generate this. Then you have to use a loop to generate the items, together with an integer variable that stores the iteration number. It can be done, for example, like this:
<ul id="primaryNav" class="nav">
<?php
$values = array(1 => "Blah Blah Uno", 2 => "Blah Blah Dos", 3 => "Blah Blah Tres");
for ($i = 1; $i <= 3; $i++) {
echo "<li class=\"nthChild-" . $i . "\">" . $values[$i] . "</li>";
}
?>
</ul>
This solution is quite simple, of course there are many better ways how to do this.
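One of those other ways, as a rough sketch: wrap the loop in a small function so the counter does not depend on a hard-coded 3, and escape the labels. The name renderNumberedList and its parameters are assumptions for illustration only.

```php
<?php
// Sketch only: renderNumberedList() is a hypothetical helper name.
function renderNumberedList(array $items, string $id = 'primaryNav', string $class = 'nav'): string
{
    $html = '<ul id="' . htmlspecialchars($id) . '" class="' . htmlspecialchars($class) . '">' . "\n";
    $position = 1;                       // 1-based position of the current item
    foreach ($items as $label) {
        $html .= '<li class="nthChild-' . $position++ . '">' . htmlspecialchars($label) . "</li>\n";
    }
    return $html . "</ul>\n";
}

echo renderNumberedList(['Blah Blah Uno', 'Blah Blah Dos', 'Blah Blah Tres']);
```

This only helps when the list is generated from PHP data in the first place; for markup that already exists as a string, a DOM-based approach is the safer route.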

If you only have the list formatted as you've shown here (not in a PHP array), you can do it like this
$markup = '<ul id="primaryNav" class="nav">
<li>Blah Blah Uno</li>
<li>Blah Blah Dos</li>
<li>Blah Blah Tres</li>
</ul>';
if (preg_match_all("/<li>(.*)<\/li>/U",$markup,$result) > 0)
{
$newMarkup = "<ul id=\"primaryNav\" class=\"nav\">\n";
$count = 0;
foreach ($result[1] as $listElement)
{
$count++;
$newMarkup .= "\t<li class=\"nthChild-{$count}\">$listElement</li>\n";
}
print $newMarkup."</ul>\n";
}

Related

PHP string search and replace - possible use of DOM Needed

I can't seem to figure out how to achieve my goal.
I want to find and replace a specific class link based off of a generated RSS feed (need the option to replace later no matter what link is there)
Example HTML:
<a class="epclean1" href="#">
WHAT IT SHOULD LOOK LIKE:
<a class="epclean1" href="google.com">
May need to incorporate get element using DOM as the Full php has a created document. If that is the case I would need to know how to find by class and add the href url that way.
FULL PHP:
<?php
$rss = new DOMDocument();
$feed = array();
$urlArray = array(array('url' => 'https://feeds.megaphone.fm')
);
foreach ($urlArray as $url) {
$rss->load($url['url']);
foreach ($rss->getElementsByTagName('item') as $node) {
$item = array (
'title' => $node->getElementsByTagName('title')->item(0)->nodeValue
);
array_push($feed, $item);
}
}
usort( $feed, function ( $a, $b ) {
return strcmp($a['title'], $b['title']);
});
$limit = sizeof($feed);
$previous = null;
$count_firstletters = 0;
for ($x = 0; $x < $limit; $x++) {
$firstLetter = substr($feed[$x]['title'], 0, 1); // Getting the first letter from the Title you're going to print
if($previous !== $firstLetter) { // If the first letter is different from the previous one then output the letter and start the UL
if($count_firstletters != 0) {
echo '</ul>'; // Closing the previously open UL only if it's not the first time
echo '</div>';
}
echo '<button class="glanvillecleancollapsible">'.$firstLetter.'</button>';
echo '<div class="glanvillecleancontent">';
echo '<ul style="list-style-type: none">';
$previous = $firstLetter;
$count_firstletters ++;
}
$title = str_replace(' & ', ' &amp; ', $feed[$x]['title']);
echo '<li>';
echo '<a class="epclean'.$i++.'" href="#" target="_blank">'.$title.'</a>';
echo '</li>';
}
echo '</ul>'; // Close the last UL
echo '</div>';
?>
</div>
</div>
The above fullphp shows on site like so (this is shortened as there is 200+):
<div class="modal-glanvillecleancontent">
<span class="glanvillecleanclose">×</span>
<p id="glanvillecleaninstruct">Select the first letter of the episode that you wish to get clean version for:</p>
<br>
<button class="glanvillecleancollapsible">8</button>
<div class="glanvillecleancontent">
<ul style="list-style-type: none">
<li><a class="epclean1" href="#" target="_blank">80's Video Vixen Tawny Kitaen 044</a></li>
</ul>
</div>
<button class="glanvillecleancollapsible">A</button>
<div class="glanvillecleancontent">
<ul style="list-style-type: none">
<li><a class="epclean2" href="#" target="_blank">Abby Stern</a></li>
<li><a class="epclean3" href="#" target="_blank">Actor Nick Hounslow 104</a></li>
<li><a class="epclean4" href="#" target="_blank">Adam Carolla</a></li>
<li><a class="epclean5" href="#" target="_blank">Adrienne Janic</a></li>
</ul>
</div>
You're not very clear about how your question relates to the code shown, but I don't see any attempt to replace the attribute within the DOM code. You'd want to look at XPath to find the desired elements:
function change_clean($content) {
$dom = new DomDocument;
$dom->loadXML($content);
$xpath = new DomXpath($dom);
$nodes = $xpath->query("//a[@class='epclean1']");
foreach ($nodes as $node) {
if ($node->getAttribute("href") === "#") {
$node->setAttribute("href", "https://google.com/");
}
}
return $dom->saveXML();
}
$xml = '<?xml version="1.0"?><foo><bar><a class="epclean1" href="#">test1</a></bar><bar><a class="epclean1" href="https://example.com">test2</a></bar></foo>';
echo change_clean($xml);
Output:
<foo><bar><a class="epclean1" href="https://google.com/">test1</a></bar><bar><a class="epclean1" href="https://example.com">test2</a></bar></foo>
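The page in the question is HTML rather than XML, so here is a hedged sketch of the same idea using loadHTML(). The function name setHrefByClass is made up for illustration. The XPath expression compares whole class tokens, which matters with 200+ links: a plain contains(@class, 'epclean1') would also hit epclean10, epclean11 and so on.

```php
<?php
// Sketch only: setHrefByClass() is a hypothetical helper name.
function setHrefByClass(string $html, string $class, string $url): string
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);   // tolerate imperfect real-world markup
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    // Whole-token match on the class attribute.
    $query = "//a[contains(concat(' ', normalize-space(@class), ' '), ' $class ')]";
    foreach ($xpath->query($query) as $link) {
        $link->setAttribute('href', $url);
    }

    // Return only what was passed in, without the <html><body> wrapper.
    $out = '';
    foreach ($dom->getElementsByTagName('body')->item(0)->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

echo setHrefByClass(
    '<ul><li><a class="epclean1" href="#">Abby Stern</a></li><li><a class="epclean10" href="#">Adam Carolla</a></li></ul>',
    'epclean1',
    'https://google.com/'
);
```

If the links are produced by your own echo statements, it is usually simpler to look up the URL while generating each link than to rewrite the finished markup afterwards.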
Hmm. I think your pattern and replacement might be your problem.
What you have
$pattern = 'class="epclean1 href="(.*?)"';
$replacement = 'class="epclean1 href="google.com"';
Fix
$pattern = '/class="epclean1" href=".*"/';
$replacement = 'class="epclean1" href="google.com"';
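One caveat with the fixed pattern: .* is greedy, so on a line that has further attributes after href it runs on to the last double quote on that line. A small sketch of a tighter variant, assuming the attribute order class then href as in the generated markup; [^"]* stops at the closing quote of href.

```php
<?php
$html = '<a class="epclean1" href="#" target="_blank">Abby Stern</a>';

// Group 1 keeps everything up to the opening quote of href,
// [^"]* consumes only the old URL, group 2 is the closing quote.
$pattern = '/(class="epclean1"\s+href=")[^"]*(")/';
$result  = preg_replace($pattern, '${1}https://google.com/${2}', $html);

echo $result;
```

Because the pattern includes the closing quote after epclean1, links with classes such as epclean10 are left alone.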

Want to get specific data from a webpage

I am trying hard to get data from following portion of a webpage
<div id="menu_pannel">
<ul class="sf-menu" id="nav">
<li class="current"><a href="/" class="current" >Home</a></li>
<li class="">Schedule</li>
<li class="">All Channels</li>
<li class="">Sports Channels
<ul id="submenu">
<li>Sky Sports 1</li>
<li>Sky Sports 2</li>
<li><a href="http://www.time4tv.com/2011/03/sky-sports-3.php">Sky Sports
I want to get the data from the nav list. For that I am using:
$pattern = '|<ul id="nav" class="sf-menu">(.*?)</ul>|';
preg_match($pattern, $html, $data);
but I am getting an empty array.
If strip_tags($html) doesn't return what you want, you can use this example to get an array of text:
function getTextBetweenTags($string, $tagname) {
preg_match_all("#<$tagname.*?>([^<]+)</$tagname>#", $string, $matches);
return $matches[1];
}
$values = getTextBetweenTags ($html, 'a' );
foreach($values as $value) {
echo $value . '<br>';
}
where $html is a var containing your html.
If you decide to use dom parser
$doc = new DOMDocument();
$doc->loadHTML($str);
$x = new DomXpath($doc);
$ul = $x->query('//ul[@id="nav"]'); // 'id' is a unique identifier!
// Echo outerHTML of ul[@id="nav"]
echo $doc->saveHTML($ul->item(0));
Use DOMDocument class for manipulating HTML content:
// $html_str is your html fragment
$doc = new DOMDocument();
$doc->loadHTML($html_str);
$ul_content = "";
$ul = $doc->getElementsByTagName("ul")->item(0);
if ($ul && $ul->getAttribute('class') == 'sf-menu') {
foreach ($ul->childNodes as $n) {
$ul_content .= $doc->saveHTML($n);
}
}
echo $ul_content;
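For completeness, the pattern in the question returns nothing for two reasons: it expects id before class while the markup has class first, and without the s modifier the dot does not match the newlines inside the list. If what you are after is the menu entries themselves, a sketch along these lines collects the link text and href of each one. The sample markup is a shortened stand-in for the real page and the /schedule.php href is invented for the example.

```php
<?php
// Shortened stand-in for the fetched page; the second href is made up.
$html = '<div id="menu_pannel"><ul class="sf-menu" id="nav">
<li class="current"><a href="/" class="current">Home</a></li>
<li><a href="/schedule.php">Schedule</a></li>
</ul></div>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // real pages are rarely valid HTML
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$links = [];
// Every <a> anywhere below the <ul id="nav">, submenus included.
foreach ($xpath->query('//ul[@id="nav"]//a') as $a) {
    $links[] = [
        'text' => trim($a->textContent),
        'href' => $a->getAttribute('href'),
    ];
}

print_r($links);
```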

Format my JSON string into an <ol> ordered list in PHP

I'm working on a simple CMS for a pet project. I currently have a JSON string that contains a list of page ID's and Parent page ID's for a menu structure.
I want to now convert this string into a nested or hierarchical list (ordered list).
I've tried looping through but seem to have ended up with an overly complex range of sub classes. I'm struggling to find a suitable light-weight solution in PHP.
Here's the JSON:
[{"id":3,"children":[{"id":4,"children":[{"id":5}]}]},{"id":6},{"id":2},{"id":4}]
Here's the desired output:
<ol>
<li>3
<ol>
<li>4</li>
<ol>
<li>5</li>
</ol>
</ol>
</li>
<li>6</li>
<li>2</li>
<li>4</li>
</ol>
Is there anything built in to PHP that can simplify this process? Has anyone had any experience of this before?
I'm a newbie to PHP and SE. Looking forward to hearing from you.
Here's my current progress (it's not working too well)
<ol>
<?php
$json = '[{"id":3,"children":[{"id":4,"children":[{"id":5}]}]},{"id":6},{"id":2},{"id":4}]';
$decoded = json_decode($json);
$pages = $decoded;
foreach($pages as $page){
$subpages = $decoded->children;
echo "<li>".$page->id."</li>";
foreach($subpages as $subpage){
echo "<li>".$subpage->id."</li>";
}
}
?>
</ol>
You can use recursion to get deep inside the data. If the current value is an array, then recurse again. Consider this example:
$json_string = '[{"id":3,"children":[{"id":4,"children":[{"id":5}]}]},{"id":6},{"id":2},{"id":4}]';
$array = json_decode($json_string, true);
function build_list($array) {
$list = '<ol>';
foreach($array as $key => $value) {
foreach($value as $key => $index) {
if(is_array($index)) {
$list .= build_list($index);
} else {
$list .= "<li>$index</li>";
}
}
}
$list .= '</ol>';
return $list;
}
echo build_list($array);
Using a function that can recursively go through your JSON, you can get the functionality you wish. Consider the following code: (this only accounts for an attribute of id as getting listed, as your desired code shows)
$json = '[{"id":3,"children":[{"id":4,"children":[{"id":5}]}]},{"id":6},{"id":2},{"id":4}]';
function createOLList($group) {
$output = (is_array($group)) ? "<ol>" : "";
foreach($group as $attr => $item) {
if(is_array($item) || is_object($item)) {
$output .= createOLList($item);
} else {
if($attr == "id") {
$output .= "<li>$item</li>";
}
}
}
$output .= (is_array($group)) ? "</ol>" : "";
return $output;
}
print(createOLList(json_decode($json)));
This will produce the following HTML output.
<ol>
<li>3</li>
<ol>
<li>4</li>
<ol>
<li>5</li>
</ol>
</ol>
<li>6</li>
<li>2</li>
<li>4</li>
</ol>
What you're looking for is called recursion, which can be done by a function calling itself.
Once you have written a function that lists all the nodes of one list, you can apply the same function to every child list. Each child list then does the same for its own children.
call_user_func(function ($array, $id = 'id', $list = 'children') {
$ul = function ($array) use (&$ul, $id, $list) {
echo '<ul>', !array_map(function ($child) use ($ul, $id, $list) {
echo '<li>', $child[$id], isset($child[$list]) && $ul($child[$list])
, '</li>';
}, $array), '</ul>';
};
$ul($array);
}, json_decode('[{"id":3,"children":[{"id":4,"children":[{"id":5}]}]},{"id":6},
{"id":2},{"id":4}]', TRUE));
As this example shows, the $ul function is called recursively over the list and all children. There are other solutions, but most often recursion is a simple method here to get the job done once you've wrapped your head around it.
Demo: https://eval.in/153471 ; Output (beautified):
<ul>
<li>3
<ul>
<li>4
<ul>
<li>5</li>
</ul>
</li>
</ul>
</li>
<li>6</li>
<li>2</li>
<li>4</li>
</ul>
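The same recursion can also be written as a plain named function that returns a string instead of echoing, which some may find easier to follow. This is only a sketch; renderTree is a made-up name, and it assumes the JSON was decoded to arrays (second argument of json_decode set to true). It produces <ol> as asked in the question and keeps each nested list inside its parent <li>.

```php
<?php
// Sketch only: renderTree() is a hypothetical helper name.
function renderTree(array $nodes): string
{
    $html = '<ol>';
    foreach ($nodes as $node) {
        $html .= '<li>' . htmlspecialchars((string) $node['id']);
        if (!empty($node['children'])) {
            // The nested list is emitted before the parent <li> is closed.
            $html .= renderTree($node['children']);
        }
        $html .= '</li>';
    }
    return $html . '</ol>';
}

$json = '[{"id":3,"children":[{"id":4,"children":[{"id":5}]}]},{"id":6},{"id":2},{"id":4}]';
echo renderTree(json_decode($json, true));
```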
<?php
$json_array = array();
array_push($json_array, array(
'id' => 3,
'children' => array(
'id' => 4,
'children' => array(
'id' => 5,
)
)
));
array_push($json_array, array('id' => 6));
array_push($json_array, array('id' => 2));
array_push($json_array, array('id' => 4));
//your json object
$json_object = json_encode($json_array);
//echo $json_object;
//here is where you decode your json object
$json_object_decoded = json_decode($json_object,true);
//for debug to see how your decoded json object looks as an array
/*
echo "<pre>";
print_r($json_object_decoded);
echo "</pre>";
*/
echo "<ol>";
foreach($json_object_decoded as $node){
if(isset($node['id'])){
echo "<li>" . $node['id'];
if(isset($node['children'])){
echo "<ol>";
echo "<li>" . $node['children']['id'] . "</li>";
if(isset($node['children'])){
echo "<ol>";
echo "<li>" . $node['children']['children']['id'] . "</li>";
echo "</ol>";
}
echo "</ol>";
}
echo "</li>";
}
}
echo "</ol>";
?>
I found that I had to fix or simplify almost every one of the functions above.
So here is something simple that works, still using recursion.
function build_list($array) {
$list = '<ul>';
foreach($array as $key => $value) {
if (is_array($value)) {
$list .= "<strong>$key</strong>: " . build_list($value);
} else {
$list .= "<li><strong>$key</strong>: $value</li>";
}
}
$list .= '</ul>';
return $list;
}
echo build_list(json_decode($json_string, true));

Creating a table of contents in php

I am looking to create a very simple, very basic nested table of contents in php which gets all the h1-6 and indents things appropriately. This means that if I have something like:
<h1>content</h1>
<h2>more content</h2>
I should get:
content
more content.
I know it will be css that creates the indents, that's fine, but how do I create a table of contents with working links to the content on the page?
Apparently it's hard to grasp what I am asking for...
I am asking for a function that reads an html document and pulls out all the h1-6 and makes a table of contents.
I used this package, it's pretty easy and straight forward to use.
https://github.com/caseyamcl/toc
Install via Composer by including the following in your composer.json file:
{
"require": {
"caseyamcl/toc": "^3.0"
}
}
Or, drop the src folder into your application and use a PSR-4 autoloader to include the files.
Usage
This package contains two main classes:
TOC\MarkupFixer: Adds id anchor attributes to any H1...H6 tags that do not already have any (you can specify which header tag levels to use at runtime)
TOC\TocGenerator: Generates a Table of Contents from HTML markup
Basic Example:
$myHtmlContent = <<<END
<h1>This is a header tag with no anchor id</h1>
<p>Lorum ipsum doler sit amet</p>
<h2 id='foo'>This is a header tag with an anchor id</h2>
<p>Stuff here</p>
<h3 id='bar'>This is a header tag with an anchor id</h3>
END;
$markupFixer = new TOC\MarkupFixer();
$tocGenerator = new TOC\TocGenerator();
// This ensures that all header tags have `id` attributes so they can be used as anchor links
$htmlOut = "<div class='content'>" . $markupFixer->fix($myHtmlContent) . "</div>";
//This generates the Table of Contents in HTML
$htmlOut .= "<div class='toc'>" . $tocGenerator->getHtmlMenu($myHtmlContent) . "</div>";
echo $htmlOut;
This produces the following output:
<div class='content'>
<h1 id="this-is-a-header-tag-with-no-anchor-id">This is a header tag with no anchor id</h1>
<p>Lorum ipsum doler sit amet</p>
<h2 id="foo">This is a header tag with an anchor id</h2>
<p>Stuff here</p>
<h3 id="bar">This is a header tag with an anchor id</h3>
</div>
<div class='toc'>
<ul>
<li class="first last">
<span></span>
<ul class="menu_level_1">
<li class="first last">
This is a header tag with an anchor id
<ul class="menu_level_2">
<li class="first last">
This is a header tag with an anchor id
</li>
</ul>
</li>
</ul>
</li>
</ul>
</div>
For this you have just to search for the tags in the HTML code.
I wrote two functions (PHP 5.4.x).
The first one returns an array that contains the data of the table of contents. The data is only the headline itself, the id of the tag (if you want to use anchors), and a sub-table of contents.
function get_headlines($html, $depth = 1)
{
if($depth > 7)
return [];
$headlines = explode('<h' . $depth, $html);
unset($headlines[0]); // contains only text before the first headline
if(count($headlines) == 0)
return [];
$toc = []; // will contain the (sub-) toc
foreach($headlines as $headline)
{
list($hl_info, $temp) = explode('>', $headline, 2);
// $hl_info contains attributes of <hi ... > like the id.
list($hl_text, $sub_content) = explode('</h' . $depth . '>', $temp, 2);
// $hl contains the headline
// $sub_content contains maybe other <hi>-tags
$id = '';
if(strlen($hl_info) > 0 && ($id_tag_pos = stripos($hl_info,'id')) !== false)
{
$id_start_pos = stripos($hl_info, '"', $id_tag_pos) + 1; // first character after the opening quote
$id_end_pos = stripos($hl_info, '"', $id_start_pos); // position of the closing quote
$id = substr($hl_info, $id_start_pos, $id_end_pos-$id_start_pos);
}
$toc[] = [ 'id' => $id,
'text' => $hl_text,
'sub_toc' => get_headlines($sub_content, $depth + 1)
];
}
return $toc;
}
The second returns a string that formats the toc with HTML.
function print_toc($toc, $link_to_htmlpage = '', $depth = 1)
{
if(count($toc) == 0)
return '';
$toc_str = '';
if($depth == 1)
$toc_str .= '<h1>Table of Content</h1>';
foreach($toc as $headline)
{
$toc_str .= '<p class="headline' . $depth . '">';
if($headline['id'] != '')
$toc_str .= '<a href="' . $link_to_htmlpage . '#' . $headline['id'] . '">';
$toc_str .= $headline['text'];
$toc_str .= ($headline['id'] != '') ? '</a>' : '';
$toc_str .= '</p>';
$toc_str .= print_toc($headline['sub_toc'], $link_to_htmlpage, $depth+1);
}
return $toc_str;
}
Both functions are far away from being perfect, but they work fine in my tests. Feel free to improve them.
Notice: get_headlines is not a parser, so it does not work on broken HTML code and just crashes. It also only works with lowercase <hi>-tags.
How about this (although it can only do one H level) ...
function getTOC(string $html, int $level=1) {
$toc="";
$x=0;
$n=0;
$html1="";
$safety=1000;
while ( $x>-1 and $safety-->0 ) {
$html0=strtolower($html);
$x=strpos($html0, "<h$level");
if ( $x>-1 ) {
$y=strpos($html0, "</h$level>");
$part=strip_tags(substr($html, $x, $y-$x));
$toc .="<a href='#head$n'>$part</a>\n";
$html1.=substr($html,0,$x)."<a name='head$n'></a>".substr($html, $x, $y-$x+5)."\n";
$html=substr($html, $y+5);
$n++;
}
}
$html1.=$html;
$html=$toc."\n<HR>\n".$html1;
return $html;
}
This will create a basic list of links
$html="<html><body>";
$html.="<h1>Heading 1a</h1>One Two Three";
$html.="<h2>heading 2a</h2>Four Five Six";
$html.="<h1 class='something'>Heading 1b</h1>Seven Eight Nine";
$html.="<h2>heading 2b</h2>Ten Eleven Twelve";
$html.="</body></html>";
echo getTOC($html, 1);
gives...
<a href='#head0'>Heading 1a</a>
<a href='#head1'>Heading 1b</a>
<HR>
<html><body><a name='head0'></a><h1>Heading 1a</h1>
One Two Three<h2>heading 2a</h2>Four Five Six<a name='head1'></a><h1
class='something'>Heading 1b</h1>
Seven Eight Nine<h2>heading 2b</h2>Ten Eleven Twelve</body></html>
See https://onlinephp.io/c/fceb0 for a running example
This function returns the string with an appended table of contents for h2 tags only. 100% tested code.
function toc($str){
$html = preg_replace('/<img[^>]+\>/i', '$0 <section>In This Article</section>', $str, 1); //toc just after first image in content; the <section> is looked up again further down
$doc = new DOMDocument();
$doc->loadHTML($html);
// create document fragment
$frag = $doc->createDocumentFragment();
// create initial list
$frag->appendChild($doc->createElement('ul'));
$head = &$frag->firstChild;
$xpath = new DOMXPath($doc);
$last = 1;
// get all H2 elements
$tagChek = array();
foreach ($xpath->query('//*[self::h2]') as $headline) {
// get level of current headline
sscanf($headline->tagName, 'h%u', $curr);
array_push($tagChek,$headline->tagName);
// move head reference if necessary
if ($curr < $last) {
// move upwards
for ($i=$curr; $i<$last; $i++) {
$head = &$head->parentNode->parentNode;
}
} elseif ($curr > $last && $head->lastChild) {
// move downwards and create new lists
for ($i=$last; $i<$curr; $i++) {
$head->lastChild->appendChild($doc->createElement('ul'));
$head = &$head->lastChild->lastChild;
}
}
$last = $curr;
// add list item
$li = $doc->createElement('li');
$head->appendChild($li);
$a = $doc->createElement('a', $headline->textContent);
$head->lastChild->appendChild($a);
// build ID
$levels = array();
$tmp = &$head;
// walk subtree up to fragment root node of this subtree
while (!is_null($tmp) && $tmp != $frag) {
$levels[] = $tmp->childNodes->length;
$tmp = &$tmp->parentNode->parentNode;
}
$id = 'sect'.implode('.', array_reverse($levels));
// set destination
$a->setAttribute('href', '#'.$id);
// add anchor to headline
$a = $doc->createElement('a');
$a->setAttribute('name', $id);
$a->setAttribute('id', $id);
$headline->insertBefore($a, $headline->firstChild);
}
// echo $frag;
// append fragment to document
if(!empty($tagChek)):
$doc->getElementsByTagName('section')->item(0)->appendChild($frag);
return $doc->saveHTML();
else:
return $str;
endif;
}

Automatically generate nested table of contents based on heading tags

Which one of you crafty programmers can show me an elegant php coded solution for automatically generating a nested table of contents based on heading tags on the page?
So I have a html document thus:
<h1> Animals </h1>
Some content goes here.
Some content goes here.
<h2> Mammals </h2>
Some content goes here.
Some content goes here.
<h3> Terrestrial Mammals </h3>
Some content goes here.
Some content goes here.
<h3> Marine Mammals </h3>
Some content goes here.
Some content goes here.
<h4> Whales </h4>
Some content goes here.
Some content goes here.
More specifically, I want a linked table of contents in the form of a nested list of links to headings on the same page:
Table of Contents (automatically generated by PHP code)
Animals
Mammals
Terrestrial_Mammals
Marine_Mammals
Whales
I don't find it elegant, but might help in getting general idea how to create one ;)
It uses simple_html_dom to find and manipulate elements in original html
$htmlcode = <<< EOHTML
<h1> Animals </h1>
Some content goes here.
Some content goes here.
<h2> Mammals </h2>
Some content goes here.
Some content goes here.
<h3> Terrestrial Mammals </h3>
Some content goes here.
Some content goes here.
<h3> Marine Mammals </h3>
Some content goes here.
Some content goes here.
<h4> Whales </h4>
Some content goes here.
Some content goes here.
EOHTML;
// simple_html_dom or another DOM manipulation library
require_once 'simple_html_dom.php';
$html = str_get_html($htmlcode);
$toc = '';
$last_level = 0;
foreach($html->find('h1,h2,h3,h4,h5,h6') as $h){
$innerTEXT = trim($h->innertext);
$id = str_replace(' ','_',$innerTEXT);
$h->id= $id; // add id attribute so we can jump to this element
$level = intval($h->tag[1]);
if($level > $last_level)
$toc .= "<ol>";
else{
$toc .= str_repeat('</li></ol>', $last_level - $level);
$toc .= '</li>';
}
$toc .= "<li><a href='#{$id}'>{$innerTEXT}</a>";
$last_level = $level;
}
$toc .= str_repeat('</li></ol>', $last_level);
$html_with_toc = $toc . "<hr>" . $html->save();
Here’s an example using DOMDocument:
$doc = new DOMDocument();
$doc->loadHTML($code);
// create document fragment
$frag = $doc->createDocumentFragment();
// create initial list
$frag->appendChild($doc->createElement('ol'));
$head = &$frag->firstChild;
$xpath = new DOMXPath($doc);
$last = 1;
// get all H1, H2, …, H6 elements
foreach ($xpath->query('//*[self::h1 or self::h2 or self::h3 or self::h4 or self::h5 or self::h6]') as $headline) {
// get level of current headline
sscanf($headline->tagName, 'h%u', $curr);
// move head reference if necessary
if ($curr < $last) {
// move upwards
for ($i=$curr; $i<$last; $i++) {
$head = &$head->parentNode->parentNode;
}
} else if ($curr > $last && $head->lastChild) {
// move downwards and create new lists
for ($i=$last; $i<$curr; $i++) {
$head->lastChild->appendChild($doc->createElement('ol'));
$head = &$head->lastChild->lastChild;
}
}
$last = $curr;
// add list item
$li = $doc->createElement('li');
$head->appendChild($li);
$a = $doc->createElement('a', $headline->textContent);
$head->lastChild->appendChild($a);
// build ID
$levels = array();
$tmp = &$head;
// walk subtree up to fragment root node of this subtree
while (!is_null($tmp) && $tmp != $frag) {
$levels[] = $tmp->childNodes->length;
$tmp = &$tmp->parentNode->parentNode;
}
$id = 'sect'.implode('.', array_reverse($levels));
// set destination
$a->setAttribute('href', '#'.$id);
// add anchor to headline
$a = $doc->createElement('a');
$a->setAttribute('name', $id);
$a->setAttribute('id', $id);
$headline->insertBefore($a, $headline->firstChild);
}
// append fragment to document
$doc->getElementsByTagName('body')->item(0)->appendChild($frag);
// echo markup
echo $doc->saveHTML();
I found this method, by Alex Freeman (http://www.10stripe.com/articles/automatically-generate-table-of-contents-php.php):
preg_match_all('#<h[4-6]*[^>]*>.*?<\/h[4-6]>#',$html_string,$resultats);
//reformat the results to be more usable
$toc = implode("\n",$resultats[0]);
$toc = str_replace('<a name="','<a href="#',$toc);
$toc = str_replace('</a>','',$toc);
$toc = preg_replace('#<h([4-6])>#','<li class="toc$1">',$toc);
$toc = preg_replace('#<\/h[4-6]>#','</a></li>',$toc);
//plug the results into appropriate HTML tags
$toc = '<div id="toc">
<p id="toc-header">Table des matières</p>
<hr />
<ul>
'.$toc.'
</ul>
</div><br /><br />';
return $toc;
In the HTML, the headers have to be written as:
<h2><a name="target"></a>Text</h2>
Combined some of the above to make a nested index of the headings. This function also inserts links into html itself so it can be linked. Pure php no library needed.
function generateIndex($html)
{
preg_match_all('/<h([1-6])*[^>]*>(.*?)<\/h[1-6]>/',$html,$matches);
$index = "<ul>";
$prev = 2;
foreach ($matches[0] as $i => $match){
$curr = $matches[1][$i];
$text = strip_tags($matches[2][$i]);
$slug = strtolower(str_replace("--","-",preg_replace('/[^\da-z]/i', '-', $text)));
$anchor = '<a name="'.$slug.'">'.$text.'</a>';
$html = str_replace($text,$anchor,$html);
$prev <= $curr ?: $index .= str_repeat('</ul>',($prev - $curr));
$prev >= $curr ?: $index .= "<ul>";
$index .= '<li>'.$text.'</li>';
$prev = $curr;
}
$index .= "</ul>";
return ["html"=>$html,"index"=>$index];
}
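One weak spot of the function above is that str_replace on the heading text also rewrites the same words wherever else they appear in the page. As a rough alternative sketch, still without any library, DOMDocument can put an id on each heading while a small stack of open levels builds the nested list. The name buildToc and the slug scheme are assumptions for illustration; the slug step only keeps a-z and 0-9, so non-ASCII headings would need a different scheme.

```php
<?php
// Sketch only: buildToc() is a hypothetical helper name.
function buildToc(string $html): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $toc   = '';
    $stack = [];   // heading levels of the lists that are currently open
    $used  = [];   // slugs handed out so far, to keep ids unique

    $query = '//*[self::h1 or self::h2 or self::h3 or self::h4 or self::h5 or self::h6]';
    foreach ($xpath->query($query) as $h) {
        $level = (int) substr($h->nodeName, 1);
        $text  = trim($h->textContent);

        // Build a unique, URL-friendly id from the heading text.
        $slug = trim(preg_replace('/[^a-z0-9]+/', '-', strtolower($text)), '-');
        if ($slug === '') {
            $slug = 'section';
        }
        $base = $slug;
        for ($n = 2; isset($used[$slug]); $n++) {
            $slug = $base . '-' . $n;
        }
        $used[$slug] = true;
        $h->setAttribute('id', $slug);

        // Close lists that are deeper than the current heading.
        while ($stack && end($stack) > $level) {
            $toc .= '</li></ul>';
            array_pop($stack);
        }
        if ($stack && end($stack) === $level) {
            $toc .= '</li>';          // sibling at the same depth
        } else {
            $toc .= '<ul>';           // first item at a new depth
            $stack[] = $level;
        }
        $toc .= '<li><a href="#' . $slug . '">' . htmlspecialchars($text) . '</a>';
    }
    $toc .= str_repeat('</li></ul>', count($stack));

    $body = '';
    foreach ($doc->getElementsByTagName('body')->item(0)->childNodes as $child) {
        $body .= $doc->saveHTML($child);
    }
    return ['html' => $body, 'index' => $toc];
}

$result = buildToc('<h1>Animals</h1><p>Some content.</p><h2>Mammals</h2><h3>Terrestrial Mammals</h3><h3>Marine Mammals</h3><h4>Whales</h4>');
echo $result['index'], "\n<hr>\n", $result['html'];
```

Skipped levels (an h3 directly after an h1, say) simply open one list instead of two, which keeps the output valid but flattens the gap.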
Have a look at the TOC class. It allows generating table of contents from nested headings. h1 tag can be followed by any lower level h tag. The class uses recursion to extract the headings from article text
Short solution using SimpleHTMLDom :
public function getSummary($body)
{
$dom = new Htmldom($body);
$summ = "<ul>";
$prev = 2;
foreach($dom->find("h2,h3,h4") as $x => $htag)
{
$curr = intval(substr($htag->tag, -1));
$prev <= $curr ?: $summ .= "</ul>";
$prev >= $curr ?: $summ .= "<ul>";
$summ .= "<li>$htag->plaintext</li>";
$prev = $curr;
}
$summ .= "</ul>";
return $summ;
}
You have a very simple library for this caseyamcl/toc
$html='<h1>Title</h1>text<h2>...<h2>...';
$tocGenerator = new TOC\TocGenerator();
$toc = $tocGenerator->getHtmlMenu($html);
echo $toc;
Bonus: if you want, the library can add ids to the header tags that lack one. Insert this code before generating the TOC:
$markupFixer = new TOC\MarkupFixer();
$html = $markupFixer->fix($html);
