Want to get specific data from a webpage - php

I am trying to get data from the following portion of a webpage:
<div id="menu_pannel">
<ul class="sf-menu" id="nav">
<li class="current"><a href="/" class="current" >Home</a></li>
<li class="">Schedule</li>
<li class="">All Channels</li>
<li class="">Sports Channels
<ul id="submenu">
<li>Sky Sports 1</li>
<li>Sky Sports 2</li>
<li><a href="http://www.time4tv.com/2011/03/sky-sports-3.php">Sky Sports
I want to get the data from it; for that I am using
$pattern = '|<ul id="nav" class="sf-menu">(.*?)</ul>|';
preg_match($pattern, $html, $data);
but I am getting an empty array.

If strip_tags($html) doesn't return what you want, you can use this example to get an array of text:
function getTextBetweenTags($string, $tagname) {
preg_match_all("#<$tagname.*?>([^<]+)</$tagname>#", $string, $matches);
return $matches[1];
}
$values = getTextBetweenTags ($html, 'a' );
foreach($values as $value) {
echo $value . '<br>';
}
where $html is a variable containing your HTML.

If you decide to use a DOM parser:
$doc = new DOMDocument();
$doc->loadHTML($str);
$x = new DomXpath($doc);
$ul = $x->query('//ul[@id="nav"]'); // 'id' is a unique identifier!
// Echo outerHTML of ul[@id="nav"]
echo $doc->saveHTML($ul->item(0));

Use DOMDocument class for manipulating HTML content:
// $html_str is your html fragment
$doc = new DOMDocument();
$doc->loadHTML($html_str);
$ul_content = "";
$ul = $doc->getElementsByTagName("ul")->item(0);
if ($ul && $ul->getAttribute('class') == 'sf-menu') {
foreach ($ul->childNodes as $n) {
$ul_content .= $doc->saveHTML($n);
}
}
echo $ul_content;
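As far as I can tell, the pattern in the question comes back empty for reasons visible in the post itself: it expects id before class while the markup has class first, and without the s modifier the dot does not match the newlines inside the list. Even with both fixed, (.*?)</ul> would stop at the first </ul>, which belongs to the nested submenu. Below is a minimal sketch of the DOM route that collects the label of each top-level menu item. The $html sample is my own shortened, closed version of the markup from the question, and $labels is just an illustrative name.

```php
<?php
// Assumed sample: a shortened, well-formed version of the menu from the question.
$html = <<<HTML
<div id="menu_pannel">
<ul class="sf-menu" id="nav">
<li class="current"><a href="/" class="current">Home</a></li>
<li class="">Schedule</li>
<li class="">Sports Channels
<ul id="submenu">
<li>Sky Sports 1</li>
<li>Sky Sports 2</li>
</ul>
</li>
</ul>
</div>
HTML;

libxml_use_internal_errors(true);   // real pages rarely parse without warnings
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$labels = [];
// Direct <li> children of ul#nav only; submenu items are not visited here.
foreach ($xpath->query('//ul[@id="nav"]/li') as $li) {
    // Collect the item's own text (plain text and link text), skipping a nested <ul>.
    $label = '';
    foreach ($li->childNodes as $child) {
        if ($child->nodeName !== 'ul') {
            $label .= $child->textContent;
        }
    }
    $labels[] = trim($label);
}
print_r($labels);
```

With the sample above, $labels should hold Home, Schedule and Sports Channels. To read the submenu as well, the same loop can be run over '//ul[@id="submenu"]/li'.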

Related

PHP How to get value using the Class

I have made a form with a description, and inside that description there is a list called "services-list". I need to get the value of the elements with the class "services-list".
Here is the code I use to get the whole description section:
<p class="card"><?php echo substr(strip_tags($row->description), 0, 200); ?>...</p>
The description comes from a form field (for="description") and is saved to the database.
In that markup there is a section with the class "services-list" that lists all the services. How can I select that specific section to print?
I am thinking of something like this:
<?php echo substr(strip_tags($row->description .'.services-list'), 0, 200); ?>
but I am not sure about this code.
Assuming $row->description contains this text:
$row = (object) [ 'description' => '
<div>
<li class="other-class">Service A</li>
<li class="services-list">Service B</li>
<li class="services-list">Service C</li>
<li class="services-list">Service D</li>
</div>
'];
You could use DOMDocument to get all elements with the services-list class:
$doc = new DOMDocument();
$doc->loadHTML($row->description);
$xpath = new DOMXPath($doc);
$nodeList = $xpath->query("//li[@class='services-list']");
$service_list = [];
foreach ($nodeList as $val) {
if (!empty($val->nodeValue)) {
$service_list[] = $val->nodeValue;
}
}
// implode() takes the separator first; the reversed order no longer works in PHP 8
$service_list = implode(', ', $service_list); // separate each item with a comma
// To check the result:
echo "<p>" . $service_list . "</p>";
The output will be :
<p>Service B, Service C, Service D</p>
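One caveat with an exact @class='services-list' comparison: it stops matching as soon as an element carries a second class. A common XPath 1.0 workaround is to pad the attribute with spaces and look for the class as a whole token. The sketch below assumes sample markup of my own (the "featured" class is made up for illustration) and is not taken from the question.

```php
<?php
// Assumed sample: one item carries a second class, which an exact @class match would miss.
$description = '
<ul>
<li class="other-class">Service A</li>
<li class="services-list">Service B</li>
<li class="services-list featured">Service C</li>
</ul>';

$doc = new DOMDocument();
$doc->loadHTML($description);
$xpath = new DOMXPath($doc);

// Pad the attribute with spaces so "services-list" is matched as a whole token.
$query = "//li[contains(concat(' ', normalize-space(@class), ' '), ' services-list ')]";

$services = [];
foreach ($xpath->query($query) as $li) {
    $services[] = trim($li->textContent);
}
echo '<p>' . htmlspecialchars(implode(', ', $services)) . '</p>';
```

Here $services should end up with Service B and Service C, while Service A is skipped because "other-class" is a different token.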

PHP string search and replace - possible use of DOM Needed

I can't seem to figure out how to achieve my goal.
I want to find and replace the href of a link with a specific class, in markup generated from an RSS feed (I need to be able to replace it later no matter what link is there).
Example HTML:
<a class="epclean1" href="#">
WHAT IT SHOULD LOOK LIKE:
<a class="epclean1" href="google.com">
I may need to use DOM methods to get the element, since the full PHP already creates a document. If that is the case, I would need to know how to find the element by class and set the href URL that way.
FULL PHP:
<?php
$rss = new DOMDocument();
$feed = array();
$urlArray = array(array('url' => 'https://feeds.megaphone.fm')
);
foreach ($urlArray as $url) {
$rss->load($url['url']);
foreach ($rss->getElementsByTagName('item') as $node) {
$item = array (
'title' => $node->getElementsByTagName('title')->item(0)->nodeValue
);
array_push($feed, $item);
}
}
usort( $feed, function ( $a, $b ) {
return strcmp($a['title'], $b['title']);
});
$limit = sizeof($feed);
$previous = null;
$count_firstletters = 0;
for ($x = 0; $x < $limit; $x++) {
$firstLetter = substr($feed[$x]['title'], 0, 1); // Getting the first letter from the Title you're going to print
if($previous !== $firstLetter) { // If the first letter is different from the previous one then output the letter and start the UL
if($count_firstletters != 0) {
echo '</ul>'; // Closing the previously open UL only if it's not the first time
echo '</div>';
}
echo '<button class="glanvillecleancollapsible">'.$firstLetter.'</button>';
echo '<div class="glanvillecleancontent">';
echo '<ul style="list-style-type: none">';
$previous = $firstLetter;
$count_firstletters ++;
}
$title = str_replace(' & ', ' &amp; ', $feed[$x]['title']);
echo '<li>';
echo '<a class="epclean'.$i++.'" href="#" target="_blank">'.$title.'</a>';
echo '</li>';
}
echo '</ul>'; // Close the last UL
echo '</div>';
?>
</div>
</div>
The above PHP renders on the site like so (shortened, as there are 200+ entries):
<div class="modal-glanvillecleancontent">
<span class="glanvillecleanclose">×</span>
<p id="glanvillecleaninstruct">Select the first letter of the episode that you wish to get clean version for:</p>
<br>
<button class="glanvillecleancollapsible">8</button>
<div class="glanvillecleancontent">
<ul style="list-style-type: none">
<li><a class="epclean1" href="#" target="_blank">80's Video Vixen Tawny Kitaen 044</a></li>
</ul>
</div>
<button class="glanvillecleancollapsible">A</button>
<div class="glanvillecleancontent">
<ul style="list-style-type: none">
<li><a class="epclean2" href="#" target="_blank">Abby Stern</a></li>
<li><a class="epclean3" href="#" target="_blank">Actor Nick Hounslow 104</a></li>
<li><a class="epclean4" href="#" target="_blank">Adam Carolla</a></li>
<li><a class="epclean5" href="#" target="_blank">Adrienne Janic</a></li>
</ul>
</div>
You're not very clear about how your question relates to the code shown, but I don't see any attempt to replace the attribute within the DOM code. You'd want to look at XPath to find the desired elements:
function change_clean($content) {
$dom = new DomDocument;
$dom->loadXML($content);
$xpath = new DomXpath($dom);
$nodes = $xpath->query("//a[@class='epclean1']");
foreach ($nodes as $node) {
if ($node->getAttribute("href") === "#") {
$node->setAttribute("href", "https://google.com/");
}
}
return $dom->saveXML();
}
$xml = '<?xml version="1.0"?><foo><bar><a class="epclean1" href="#">test1</a></bar><bar><a class="epclean1" href="https://example.com">test2</a></bar></foo>';
echo change_clean($xml);
Output:
<foo><bar><a class="epclean1" href="https://google.com/">test1</a></bar><bar><a class="epclean1" href="https://example.com">test2</a></bar></foo>
Hmm. I think your pattern and replacement might be your problem.
What you have
$pattern = 'class="epclean1 href="(.*?)"';
$replacement = 'class="epclean1 href="google.com"';
Fix
$pattern = '/class="epclean1" href="[^"]*"/';
$replacement = 'class="epclean1" href="google.com"';
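If more than one link needs a new href, the XPath approach from the earlier answer can be extended with a lookup table keyed by class name. This is only a sketch: the $targets map and its example.com URLs are hypothetical, since the question does not say where each link should point, and the $html sample is a cut-down version of the generated markup.

```php
<?php
// Hypothetical mapping from link class to target URL; the real URLs are not given in the question.
$targets = [
    'epclean1' => 'https://example.com/clean/1',
    'epclean2' => 'https://example.com/clean/2',
];

// Assumed sample of the generated markup.
$html = '<ul><li><a class="epclean1" href="#" target="_blank">First</a></li>'
      . '<li><a class="epclean2" href="#" target="_blank">Second</a></li>'
      . '<li><a class="epclean3" href="#" target="_blank">Third</a></li></ul>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Every <a> whose class attribute starts with "epclean".
foreach ($xpath->query('//a[starts-with(@class, "epclean")]') as $a) {
    $class = $a->getAttribute('class');
    if (isset($targets[$class])) {
        $a->setAttribute('href', $targets[$class]);
    }
}

// Serialize only the list, not the <html><body> wrapper that loadHTML adds.
$result = $dom->saveHTML($dom->getElementsByTagName('ul')->item(0));
echo $result;
```

Links without an entry in $targets (epclean3 here) keep href="#". Since the PHP in the question builds these links itself, setting the href while echoing each <a> may be simpler than rewriting the markup afterwards.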

How to get list sub-item nodes separately using PHP DOM

I was looking at this tip:
PHP DOM get items from first ul element
But in this case:
<li>First item
<ul>
<li>
First SubItem
</li>
<li>
Second SubItem
</li>
</ul>
</li>
PHP Code:
$DOM = new DOMDocument;
libxml_use_internal_errors(true);
$DOM->loadHTML( $output);
$items = $DOM->getElementsByTagName('ul');
echo '<ul>';
foreach ($items->item(3)->getElementsByTagName('li') as $li) {
var_dump($li);die();
echo '<li>'.$li->nodeValue;
$ul = $li->getElementsByTagName('ul');
echo '<ul>';
echo '--->'.$ul->length.'<br>';
for($u=0;$u<$ul->length;$u++){
foreach ($ul->item($u)->getElementsByTagName('li') as $lii) {
echo '<li>'.$lii->nodeValue.'</li>';
}
}
echo '</ul>';
echo '</li>';
}
echo '</ul>';
The problem is:
I am getting "First itemFirst SubItemSecond SubItem" from $li->nodeValue for the first node.
I need to get these items (the sub-items) separately.
I'm assuming you just want to retrieve the text values from those <li> tags.
You can greatly simplify the query with DOMXPath as ->query('//li') will fetch all <li> tags in your code snippet.
$DOM = new DOMDocument();
$DOM->loadHTML($output);
$xPath = new DOMXPath($DOM);
if($xpResponse = $xPath->query('//li/text()')) {
echo "<ul>\n";
foreach($xpResponse as $xNode) {
echo "<li>" . trim($xNode->nodeValue) . "</li>\n";
}
echo "</ul>\n";
}
This will simply output (as HTML):
First item
First SubItem
Second SubItem
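If the parent/child relationship matters and not just the flat text, one option is to read each item's own text nodes and then query its nested ul/li relative to that item. The following is a sketch under my own assumptions: $output is a closed version of the snippet from the question, and ownText() and $tree are hypothetical names.

```php
<?php
// Assumed sample: the nested list from the question, wrapped in an outer <ul>.
$output = '<ul><li>First item
<ul>
<li>First SubItem</li>
<li>Second SubItem</li>
</ul>
</li></ul>';

$dom = new DOMDocument();
$dom->loadHTML($output);
$xpath = new DOMXPath($dom);

// Hypothetical helper: text that belongs to this <li> itself, not to nested lists.
function ownText(DOMXPath $xpath, DOMNode $li): string
{
    $text = '';
    foreach ($xpath->query('text()', $li) as $node) {
        $text .= $node->nodeValue;
    }
    return trim($text);
}

$tree = [];
// Top-level items are <li> elements that have no <li> ancestor.
foreach ($xpath->query('//li[not(ancestor::li)]') as $li) {
    $children = [];
    // Relative query: only the sub-items nested directly under this item.
    foreach ($xpath->query('ul/li', $li) as $sub) {
        $children[] = ownText($xpath, $sub);
    }
    $tree[ownText($xpath, $li)] = $children;
}
print_r($tree);
```

For the sample, $tree should map "First item" to an array holding "First SubItem" and "Second SubItem". Using the label as an array key is only for brevity; duplicate labels would overwrite each other.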

DOMXpath & PHP: how to wrap a bunch of <li> inside an <ul>

I have a html-document with this not-so-nice markup, without the 'ul':
<p>Lorem</p>
<p>Ipsum...</p>
<li class='item'>...</li>
<li class='item'>...</li>
<li class='item'>...</li>
<div>...</div>
I am now trying to "grab" all li-elements and wrap them inside an ul-list which I'd like to place in the same spot, using PHP and DOMXPath. I manage to find and "remove" the li-elements:
$elements = $xpath->query('//li[@class="item"]');
$wrapper = $document->createElement('ul');
foreach($elements as $child) {
$wrapper->appendChild($child);
}
Maybe you can get the parentNode of the first <li> and then use the insertBefore method:
$html = <<<HTML
<p>Lorem</p>
<p>Ipsum...</p>
<li class='item'>...</li>
<li class='item'>...</li>
<li class='item'>...</li>
<div>...</div>
HTML;
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXpath($doc);
$elements = $xpath->query('//li[@class="item"]');
$wrapper = $doc->createElement('ul');
$elements->item(0)->parentNode->insertBefore(
$wrapper, $elements->item(0)
);
foreach($elements as $child) {
$wrapper->appendChild($child);
}
echo $doc->saveHTML();
Here's what you need. You may need to tweak the XPath query for your real HTML.
$document = new DOMDocument;
// We don't want to bother with white spaces
$document->preserveWhiteSpace = false;
$html = <<<EOT
<p>Lorem</p>
<p>Ipsum...</p>
<li class='item'>...</li>
<li class='item'>...</li>
<li class='item'>last...</li>
<div>...</div>
EOT;
$document->loadHTML($html);
$xpath = new DOMXPath($document);
$elements = $xpath->query('//li[@class="item"]');
// Saves a reference to the node positioned right after our <li> elements
$ref = $xpath->query('//li[@class="item"][last()]')->item(0)->nextSibling;
$wrapper = $document->createElement('ul');
foreach($elements as $child) {
$wrapper->appendChild($child);
}
$ref->parentNode->insertBefore($wrapper, $ref);
echo $document->saveHTML();
Running example: https://repl.it/B3UO/24
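One thing to watch with the nextSibling variant: if the stray <li> elements are the last nodes in their parent, nextSibling is null and there is nothing to insert before. Placing the wrapper before the first <li>, as in the earlier answer, avoids that. Here is a sketch of that variant which also prints only the contents of <body>; the $html sample is my own and deliberately ends with the list items.

```php
<?php
// Assumed sample: the stray <li> elements are the last nodes, so there is no next sibling.
$html = "<p>Lorem</p><li class='item'>one</li><li class='item'>two</li>";

libxml_use_internal_errors(true);   // tolerate warnings about the invalid markup
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

$items = $xpath->query('//li[@class="item"]');
if ($items->length > 0) {
    $first = $items->item(0);
    $wrapper = $doc->createElement('ul');
    // Put the empty <ul> where the first <li> currently sits ...
    $first->parentNode->insertBefore($wrapper, $first);
    // ... then move every matched <li> into it.
    foreach ($items as $li) {
        $wrapper->appendChild($li);
    }
}

// Serialize only the children of <body> to skip the skeleton that loadHTML adds.
$result = '';
foreach ($doc->getElementsByTagName('body')->item(0)->childNodes as $node) {
    $result .= $doc->saveHTML($node);
}
echo $result;
```

This assumes all matched <li> elements share one parent and belong in a single list; separate runs of items would each need their own wrapper.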

Catch first and last <li> inside php variable

echo $ul; // gives this code:
<ul id="menu">
<li id="some_id" class="some_class">...</li>
<li id="some_id" class="some_class">...</li>
<li id="some_id" class="some_class">...</li>
</ul>
How to add some class for the first and the last <li>?
Need a regex solution.
echo $ul; should give (if we add class my_class for the last <li>):
<ul id="menu">
<li id="some_id" class="some_class">...</li>
<li id="some_id" class="some_class">...</li>
<li id="some_id" class="some_class my_class">...</li>
</ul>
The DOM solution
$dom = new DOMDocument;
$dom->loadHTML( $ul );
$xPath = new DOMXPath( $dom );
$xPath->query( '/html/body/ul/li[last()]/@class' )
->item( 0 )
->value .= ' myClass';
echo $dom->saveXml( $dom->getElementById( 'menu' ) );
If you know the HTML to be valid, you can also use loadXML instead. That way DOM does not add the HTML skeleton. Note that you then have to change the XPath to '/ul/li[last()]/@class'.
In case you are not familiar with XPath queries, you can also use the regular DOM interface, e.g.
$dom = new DOMDocument;
$dom->loadHTML( $ul );
$liElements = $dom->getElementsByTagName( 'li' );
$lastLi = $liElements->item( $liElements->length-1 );
$classes = $lastLi->getAttribute( 'class' ) . ' myClass';
$lastLi->setAttribute( 'class', $classes );
echo $dom->saveXml( $dom->getElementById( 'menu' ) );
EDIT Since you changed the question to have classes for first and last now, here is how to do that using XPath. This assumes your markup is valid XHTML. If not, switch back to loadHTML (see code above):
$dom = new DOMDocument;
$dom->loadXML( $html );
$xpath = new DOMXPath( $dom );
$first = $xpath->query( '/ul/li[1]/@class' )->item( 0 );
$last = $xpath->query( '/ul/li[last()]/@class' )->item( 0 );
$last->value .= ' last';
$first->value .= ' first';
echo $dom->saveXML( $dom->documentElement );
Alternatively, you could use "#menu li:last-child" in your CSS instead of a class name, that way you don't have to modify your PHP code.
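The XPath versions above select the @class attribute node, so they depend on every <li> already having a class attribute; without one, item(0) returns null. A slightly more defensive sketch selects the elements themselves and goes through a small helper. addClass() is a hypothetical helper of mine, not part of the DOM extension, and the sample markup is assumed.

```php
<?php
// Assumed sample: the first <li> has no class attribute at all.
$ul = '<ul id="menu"><li>a</li><li class="some_class">b</li><li class="some_class">c</li></ul>';

// Hypothetical helper: append a class only if it is not already present.
function addClass(DOMElement $el, string $class): void
{
    $classes = preg_split('/\s+/', $el->getAttribute('class'), -1, PREG_SPLIT_NO_EMPTY);
    if (!in_array($class, $classes, true)) {
        $classes[] = $class;
    }
    $el->setAttribute('class', implode(' ', $classes));
}

$dom = new DOMDocument();
$dom->loadHTML($ul);
$xpath = new DOMXPath($dom);

// Select the elements themselves rather than their @class attributes.
$items = $xpath->query('//ul[@id="menu"]/li');
if ($items->length > 0) {
    addClass($items->item(0), 'first');
    addClass($items->item($items->length - 1), 'last');
}

$first = $items->item(0)->getAttribute('class');
$last  = $items->item($items->length - 1)->getAttribute('class');
echo $first, "\n", $last, "\n";
```

With a single <li>, that one element simply receives both classes.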
If you MUST use regex for this (not exactly advisable), something like this should work:
// Last <li>: its content must not contain another <li, and it must be followed by </ul>
$pattern1 = '#(<li\s[^>]*class=")([^"]*)("[^>]*>(?:(?!<li[\s>]).)*</li>\s*</ul>)#s';
$ul = preg_replace($pattern1, '$1$2 class_last$3', $ul);
// First <li>: the one directly after the opening <ul>
$pattern2 = '#(<ul[^>]*>\s*<li\s[^>]*class=")([^"]*)(")#s';
$ul = preg_replace($pattern2, '$1$2 class_first$3', $ul);
If you really want to do that job with regex, you could try :
$ul = '
<ul id="menu">
<li id="some_id" class="some_class">...</li>
<li id="some_id" class="some_class">...</li>
<li id="some_id" class="some_class">...</li>
</ul>
';
// explode input string to an array
$lines = explode("\n", $ul);
$found_first = 0; // has the first li been found
$found_last = 0; // index of the last li
$class_first = "class_first"; // class for the first li
$class_last = "class_last"; // class for the last li
// loop on all lines
for ($i = 0; $i < count($lines); $i++) {
$line = $lines[$i];
// the line begins with <li
if (preg_match("/^<li/", $line)) {
// is it the first one
if (!$found_first) {
// add the class
$lines[$i] = preg_replace('/ class="([^"]+?)"/', " class=\"$1 $class_first\"", $lines[$i]);
// the first li has been found
$found_first = 1;
}
// remember the last li line processed
$found_last = $i;
}
}
// this will add class_last even if the last li
// is also the first one (ie: only one li)
if ($found_last) {
$lines[$found_last] = preg_replace('/ class="([^"]+?)"/', " class=\"$1 $class_last\"", $lines[$found_last]);
}
$ul = implode("\n", $lines);
echo $ul;
Output:
<ul id="menu">
<li id="some_id" class="some_class class_first">...</li>
<li id="some_id" class="some_class">...</li>
<li id="some_id" class="some_class class_last">...</li>
</ul>
You can use counters:
<?php
$list = array('aaa', 'bbb', 'ccc', 'ddd');
$items = count($list); // count items in list
$i = 1; // set counter to one, because first item in list will be item number: 1
echo '<ul>';
// create loop
foreach($list as $value) {
// first item
if($i == 1) {
$class = 'some_class first_class';
// last item
} elseif ($i == $items) {
$class = 'some_class last_class';
// not first / not last item
} else {
$class = 'some_class';
}
echo '<li id="some_id" class="'. $class .'">' . $value . '</li>';
$i++; // raise $i by one
}
echo '</ul>';
?>
Will output:
<ul>
<li id="some_id" class="some_class first_class">aaa</li>
<li id="some_id" class="some_class">bbb</li>
<li id="some_id" class="some_class">ccc</li>
<li id="some_id" class="some_class last_class">ddd</li>
</ul>
However, my suggestion would be:
<ul id="menu">
<li>aaa</li>
<li>bbb</li>
<li>ccc</li>
<li>ddd</li>
</ul>
Within your css:
#menu {
}
#menu li:first-child {
}
#menu li {
}
#menu li:last-child {
}
