Using DOM PHP Parse child contents - php

<h2>
<strong>
<a href="http://www.linkedin.com/pub/rom-agustin/0/111/947" title="Rom Agustin">
<span class="given-name">Rom</span>
<span class="family-name">Agustin</span>
</a>
</strong>
</h2>
So I need to parse the two span elements and store the text of each in a variable:
span.given-name = $given_name
span.family-name = $family_name
Right now my code is:
foreach($vcard as $items):
$names = $items->getElementsByTagName('h2');
$name = $names->item(0)->nodeValue;
echo $name; //Rom Agustin
endforeach;
How do I properly separate those two? Or how can I just target the class? I've read the DOM documentation on php.net, but there is no getElementByClass. I tried explode, but it got very messy.

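Use Simple HTML DOM (http://simplehtmldom.sourceforge.net), which makes this a breeze; http://simplehtmldom.sourceforge.net/manual.htm#section_create explains how to work from a plain string.
include 'simple_html_dom.php';
$htmltext = "...your snippet...";
$html = str_get_html($htmltext);
foreach($html->find("span") as $span) {
// use $span here, e.g. $span->class and $span->plaintext
}
// Or target each span directly by its class:
$given_name = $html->find('span.given-name', 0)->plaintext;
$family_name = $html->find('span.family-name', 0)->plaintext;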
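If you would rather stay with PHP's built-in DOM extension, a class can be targeted with DOMXPath. The following is only a minimal sketch: the helper name first_text_by_class() is made up for illustration, and the markup is the snippet from the question collapsed into one string.

```php
<?php
// Markup from the question, inlined so the example is self-contained.
$htmltext = '<h2><strong><a href="http://www.linkedin.com/pub/rom-agustin/0/111/947" title="Rom Agustin">'
    . '<span class="given-name">Rom</span> <span class="family-name">Agustin</span>'
    . '</a></strong></h2>';

// first_text_by_class() is a hypothetical helper, not part of the DOM API.
function first_text_by_class(DOMXPath $xpath, string $class): ?string
{
    // Match the class as a whole word inside a space-separated class attribute.
    $query = sprintf(
        '//span[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );
    $nodes = $xpath->query($query);
    return $nodes->length > 0 ? trim($nodes->item(0)->textContent) : null;
}

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$dom->loadHTML($htmltext);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$given_name  = first_text_by_class($xpath, 'given-name');
$family_name = first_text_by_class($xpath, 'family-name');

echo $given_name, ' / ', $family_name, PHP_EOL;
```

The concat/normalize-space test is used instead of a plain @class="given-name" comparison so that the query still matches if the element carries more than one class.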

Related

PHP: How can I put HTML Code with PHP Code within into a PHP Variable?

I have a question about " and ' in PHP. I have to put a complete <li> element into a PHP variable, but this doesn't work; the output is completely wrong...
$list =
"
<li class=\"<?php if($info[1] < 700) {echo \"half\";} else {echo \"full\";} ?>\">
<div class=\"onet-display\">
<?php if ($image = $post->get('image.src')): ?>
<a class=\"onet-display-block\" href=\"<?= $view->url('#blog/id', ['id' => $post->id]) ?>\"><img class=\"onet-thumb\" src=\"<?= $image ?>\" alt=\"<?= $post->get('image.alt') ?>\"></a>
<?php endif ?>
<h1 class=\"onet-thumb-title\"><?= $post->title ?></h1>
<div class=\"uk-margin\"><?= $post->excerpt ?: $post->content ?></div>
</div>
</li>
";
Is it because there is PHP Content in the HTML Code? How can I solve this?
Can someone help me and explain why this doesn't work?
<?php ... <?php
Since your string contains PHP tags, I suppose you expect them to be evaluated. The opening PHP tag within another PHP tag is interpreted as a part of the PHP code. For example, the following outputs <?php echo time();:
<?php echo "<?php echo time();";
There are several ways to build a PHP string from PHP expressions.
Concatenation
You can create functions returning strings and concatenate the calls to them with other strings:
function some_function() {
return time();
}
$html = "<li " . some_function() . ">";
or use sprintf:
$html = sprintf('<li %s>', some_function());
eval
Another way is to use eval, but I wouldn't recommend it as it allows execution of arbitrary PHP code and may cause unexpected behavior.
Output Buffering
If you are running PHP as a template engine, you can use the output control functions, e.g.:
<?php ob_start(); ?>
<li data-time="<?= time() ?>"> ...</li>
<?php
$html = ob_get_contents();
ob_end_clean();
echo $html;
Output
<li data-time="1483433793"> ...</li>
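The buffering pattern above can be wrapped in a small function so the buffer is always closed, even if the rendering code throws. This is a sketch under assumptions: capture_output() is a made-up helper name, and $info holds sample data standing in for the array from the question.

```php
<?php
// capture_output() is a hypothetical helper, not a PHP built-in.
function capture_output(callable $render): string
{
    ob_start();                 // start buffering everything that is echoed
    try {
        $render();
    } catch (Throwable $e) {
        ob_end_clean();         // discard the partial output before re-throwing
        throw $e;
    }
    return ob_get_clean();      // return the buffer contents and stop buffering
}

$info = [0, 650];               // assumed sample data, not from the original post

$html = capture_output(function () use ($info) {
    $class = $info[1] < 700 ? 'half' : 'full';
    echo '<li class="', $class, '">...</li>';
});

echo $html, PHP_EOL;
```

ob_get_clean() does the work of ob_get_contents() followed by ob_end_clean() in a single call.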
Here Document Syntax
If, however, the string is supposed to be assigned as is, use the nowdoc form of the here document syntax (quoted identifier), which performs no interpolation:
$html = <<<'HTML'
<li data-time="{$variable_will_NOT_be_parsed}">...</li>
HTML;
or, if variables should be interpolated, the heredoc form (unquoted identifier):
$html = <<<HTML
<li data-time="{$variable_WILL_be_parsed}">...</li>
HTML;
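Applied to the <li> from the question, the heredoc form might look like the sketch below. A heredoc interpolates variables but does not run statements, so the class is decided beforehand. The sample values for $info, $title and $excerpt are assumptions, and escaping with htmlspecialchars() is an addition of this sketch (skip it for fields that intentionally contain HTML).

```php
<?php
// Assumed sample values; in the question these come from $info and $post.
$info    = [0, 650];
$title   = 'Hello <world>';
$excerpt = 'Short teaser text';

// Decide the class up front: a heredoc cannot contain if/else statements.
$class = $info[1] < 700 ? 'half' : 'full';

// Escape plain text before it is placed inside the markup.
$safeTitle   = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');
$safeExcerpt = htmlspecialchars($excerpt, ENT_QUOTES, 'UTF-8');

$list = <<<HTML
<li class="{$class}">
<div class="onet-display">
<h1 class="onet-thumb-title">{$safeTitle}</h1>
<div class="uk-margin">{$safeExcerpt}</div>
</div>
</li>
HTML;

echo $list, PHP_EOL;
```

No double quote needs escaping here, which is the main readability gain over the original double-quoted string.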
You want to store some HTML in a variable.
Your source file should start with (if it does not already):
<?php
Then you start building the contents of $list.
Starting from your code, the nearest fix is to build $list by appending strings:
<?php
$list = "<li class=";
if($info[1] < 700)
{
$list .= "\"half\""; // RESULT: <li class="half"
}
else
{
$list .= "\"full\""; // RESULT: <li class="full"
}
// ...and so on...
Now a couple of things to note.
First, $list .= "..." is a short form of $list = $list . "...",
where the dot operator (.) joins two strings.
Second, you can make the code easier to read by mixing single and double quotes:
use double quotes in PHP and single quotes in the generated HTML:
<?php
$list = "<li class=";
if($info[1] < 700)
{
$list .= "'half'"; // RESULT: <li class='half'
}
else
{
$list .= "'full'"; // RESULT: <li class='full'
}
// ...and so on...
This way you don't need to escape every double quote.
I think you have to work like this:
$test = "PHP Text";
$list = "<strong>here is HTML</strong>" . $test . "<br />";
echo $list;

Filter Child Tags Values

How to filter child tags value of a <div> and display just the parent inner texts?
For example in this code :
include('simple_html_dom.php');
$htm='<div class="date">
<span class="title">post date <!-- or Anything --> :</span>
2103/04/07 13:06
</div>';
$html = str_get_html($htm);
$date = $html->find('.date ',0)->plaintext;
echo $date;
The result is :
post date : 2103/04/07 13:06
But I need :
2103/04/07 13:06
Is there any way to filter the <span> values? I prefer to not use patterns in my case.
Thank you
This is possibly not the cleanest solution, but it works:
$dateDiv = $html->find('.date', 0);
$textElems = $dateDiv->find('text');
$str = '';
foreach ($textElems as $subText) {
if ($subText->parent() === $dateDiv) {
$str .= $subText->plaintext;
}
}
echo $str;
It retrieves all text blocks and then removes all nodes which are not direct children of <div class="date">.
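The same idea (keep only the text that belongs directly to the div) can also be sketched with PHP's built-in DOM extension instead of Simple HTML DOM. This is an alternative under that assumption, not the answerer's method; it walks the direct children of the div and keeps only the text nodes.

```php
<?php
$htm = '<div class="date">
<span class="title">post date <!-- or Anything --> :</span>
2103/04/07 13:06
</div>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$dom->loadHTML($htm);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$div = $xpath->query('//div[@class="date"]')->item(0);

$date = '';
if ($div !== null) {
    foreach ($div->childNodes as $child) {
        // Text nodes are the div's own text; the <span> is an element node and is skipped.
        if ($child->nodeType === XML_TEXT_NODE) {
            $date .= $child->nodeValue;
        }
    }
}
$date = trim($date);

echo $date, PHP_EOL;
```

No pattern matching is involved, which fits the constraint stated in the question.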

Detailed preg_match_all

I am having an issue getting a detailed preg_match_all to work. I keep getting a blank Array.
Here is my code:
<?php
$remote_search = file_get_contents('http://wiki.seg.org/index.php?title=Special%3ASearch&search=drilling&button=');
preg_match_all('%<li><div class=\'mw-search-result-heading\'>(.*) </div> <div class=\'searchresult\'>(.*)</div>
<div class=\'mw-search-result-data\'>(.*)</div></li>%si', $remote_search, $links);
echo '<ul class=\'mw-search-results\'>';
for($i = 0; $i < count($links[1]); $i++) {
echo '<li><div class=\'mw-search-result-heading\'><a href="' . $links[5][$i] . '" title="' . $links[4][$i] . '">' . $links[3][$i] . '<\/a> </div> <div class=\'searchresult\'>' . $links[2][$i] . '<\/div><div class=\'mw-search-result-data\'>' . $links[1][$i] . '<\/div><\/li>';
}
echo '</ul>';
?>
I am trying to grab the link details from code shown below:
<li><div class='mw-search-result-heading'>Dictionary:Cable drilling </div> <div class='searchresult'>{{lowercase}}{{#category_index:C|cable <span class='searchmatch'>drilling</span>}}
</div>
<div class='mw-search-result-data'>132 B (22 words) - 19:58, 20 December 2011</div></li>
When I perform a var_dump($links); I get Array as the result.
The code below successfully grabs the contents of the section from which I am trying to pull the variables.
<?php
$remote_search = file_get_contents('http://wiki.seg.org/index.php?title=Special%3ASearch&search=drilling&button=');
preg_match_all('%<ul class=\'mw-search-results\'>(.*)</ul>%si', $remote_search, $links);
$bar = $links[0];
echo '<ul class=\'mw-search-results\'>';
echo $bar;
echo '</ul>';
var_dump($links);
?>
The echo $bar; prints Array and no other output.
The var_dump($links); in this snippet outputs the content of the ul.
Does anyone see the error in my top snippet that is preventing me from parsing the code the way I am intending it?
Try:
preg_match_all('#<li><div\s*class=\'mw-search-result-heading\'><a\s*href=.([^"]*).\s*title=.([^"]*).>([^<]*)<\/a>\s*<\/div>\s*<div\s*class=\'searchresult\'>(.*?)<\/div>\s*<div\s*class=.mw-search-result-data.>([^<]*)<\/div><\/li>#sim', $remote_search, $links);
print_r($links);
The logic error in your code was the way you were matching <div class=\'searchresult\'>(.*)</div> against <div class='searchresult'>{{lowercase}}{{#category_index:C|cable <span class='searchmatch'>drilling</span>}}</div>
This doesn't work well with regular expressions since there is a nested tag, the span. So I changed your matching logic to non-greedy: .*?. Also notice that I changed the modifiers of the regular expression to sim: s lets . match newlines, i makes the match case-insensitive, and m makes ^ and $ match at line boundaries. I use these three modifiers whenever I run a regular expression against HTML, often enough that I arranged the letters into the word "sim" as a memory aid.
Happy coding!
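Never try to parse HTML with regex. Use DOMDocument instead. In your case, to get the links from the page you can do something like:
$dom = new DOMDocument();
libxml_use_internal_errors(true); // real-world HTML is rarely valid; collect warnings instead of printing them
$dom->loadHTMLFile($url); // load() expects XML; loadHTMLFile() parses HTML
$elements = $dom->getElementsByTagName('a');
$links = array();
foreach ($elements as $element) {
$links[] = $element->getAttribute('href');
}
var_dump($links);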
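As a sketch of how the DOM approach could pull out the three parts of each search result, the example below runs XPath queries against the sample markup from the question, wrapped in the <ul> that the question's second snippet matches. The array keys heading, snippet and data are names chosen for this illustration.

```php
<?php
// Sample markup from the question, wrapped in the surrounding <ul>.
$sample = <<<'HTML'
<ul class='mw-search-results'>
<li><div class='mw-search-result-heading'>Dictionary:Cable drilling </div> <div class='searchresult'>{{lowercase}}{{#category_index:C|cable <span class='searchmatch'>drilling</span>}}
</div>
<div class='mw-search-result-data'>132 B (22 words) - 19:58, 20 December 2011</div></li>
</ul>
HTML;

$dom = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$dom->loadHTML($sample);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

$results = [];
foreach ($xpath->query('//ul[@class="mw-search-results"]/li') as $li) {
    // string(...) returns the text content of the first matching node,
    // including text inside nested tags such as the <span>.
    $results[] = [
        'heading' => trim($xpath->evaluate('string(div[@class="mw-search-result-heading"])', $li)),
        'snippet' => trim($xpath->evaluate('string(div[@class="searchresult"])', $li)),
        'data'    => trim($xpath->evaluate('string(div[@class="mw-search-result-data"])', $li)),
    ];
}

print_r($results);
```

Nested tags and varying whitespace between the divs do not matter here, which is where the regex version was fragile.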

Stripping Specific HTML & Content from a page with PHP for RSS

I am building a mobile version of my company website, and one thing we are in need of is an RSS feed.
I have the RSS pulling in fine with this code:
<?php
$url = 'http://www.someurl.com/rss/articles';
$feed = simplexml_load_file($url, 'SimpleXMLIterator');
$filtered = new LimitIterator($feed->channel->item, 0, 15);
foreach ($filtered as $item) { ?>
<li data-icon="false">
<h2><?php echo $item->title; ?></h2>
<p class="desc"><?php echo $item->description; ?></p>
<br />
<p class="category"><b><?php echo $item->category; ?></b></p>
<a class="link" href="<?php echo $item->link; ?>">Read More</a>
<br />
<p class="pubDate"><?php echo $item->pubDate; ?></p>
<br />
</li>
<?php } ?>
What I would like to do is utilize either the fopen() or file_get_contents() to handle the clicking of the 'Read More' link and strip all of the contents of the incoming page except for the <article> tag.
I have searched Google the past day, and have not been successful in finding any tutorials on this subject.
EDIT:
I would like to load the stripped HTML contents into their own view within my framework.
SECOND EDIT:
I would just like to share how I solved this problem.
I modified my $item->link; to be passed through the URL as a variable:
Read More
On the article.php page, I collect the variable with an if() statement:
if (isset($_GET['rss_url']) && is_string($_GET['rss_url'])) {
$url = $_GET['rss_url'];
}
Then, building on the suggestions in the comments below, I built a way to fetch the incoming URL and strip everything except the tags I need, so the result can be formatted for my mobile view:
<div id="article">
<?php
$link = file_get_contents($url);
$article = strip_tags($link, '<title><div><article><aside><footer><ul><li><img><h1><h2><span><p><a><blockquote><script>');
echo $article;
?>
</div>
Hopefully this helps anyone else who may encounter this problem :)
I'm not sure if I understand it correctly but are you trying to output the contents on the current page whenever someone clicks the more link?
I would probably use JavaScript to do that, maybe jQuery's .load() function, which loads HTML from another page and lets you load only specific fragments of it. But if you need to use PHP, I would look into Simple HTML DOM Parser:
$html = file_get_html($yourUrl);
$article = $html->find('article', 0); // Assuming you only have 1 article/page
echo $article;
The only way I can see is to set up your own separate script to route the links through.
So, instead of echo $item->link use
echo 'LinkProcessor.php?link=' . urlencode($item->link);
Then, set up a script called LinkProcessor.php and use file_get_contents on that page. You can then process the XML to only show the article tag and echo the results:
$article = file_get_contents($_GET['link']);
// SimpleXMLElement only accepts well-formed XML (e.g. XHTML) and throws on ordinary HTML
$xml = new SimpleXMLElement($article);
$articleXml = $xml->xpath('//article');
echo $articleXml[0]->asXML(); // asXML() outputs the markup, not just the text
Note that the code is untested, but it should be OK.
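Since most pages are not well-formed XML, a more forgiving variant is to parse the page with DOMDocument and serialize only the <article> node. This is a sketch rather than a drop-in solution: extract_article() is a made-up helper name and $page is inline sample markup standing in for the result of file_get_contents().

```php
<?php
// extract_article() is a hypothetical helper, not part of PHP.
function extract_article(string $html): ?string
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);   // HTML5 tags such as <article> can trigger parser warnings
    $dom->loadHTML($html);
    libxml_clear_errors();

    // Take the first <article> element, if the page has one.
    $article = $dom->getElementsByTagName('article')->item(0);

    // saveHTML($node) serializes just that node and its children.
    return $article !== null ? $dom->saveHTML($article) : null;
}

// Inline sample page; a real script would fetch the markup from the feed link.
$page = '<html><body><header>Site menu</header>'
      . '<article><h1>Title</h1><p>Body text</p></article>'
      . '<footer>Footer</footer></body></html>';

$articleHtml = extract_article($page);

echo $articleHtml, PHP_EOL;
```

If the URL comes from a query parameter, as in the answers above, it is worth validating it against the hosts you expect before fetching it.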

Where should HTML be rendered in Object-Oriented PHP design?

Using Object-Oriented PHP, where should HTML be rendered?
The business processes include several actions to maintain customer records.
Should the rendering of each business process get a separate PHP file, e.g. viewCustomerTransactions.php?
Where should code like this reside?
$custTrans = Customer::getTransactions(); // assumes a static method; PHP uses :: or ->, not . for method calls
foreach ($custTrans as $ct){
$amount = $ct[0];
$date = $ct[1];
$product = $ct[2];
echo '<div class="custTrans">';
echo '<span class="custTransAmount">'.$amount.'</span>';
echo '<span class="custTransDate">'.$date.'</span>';
echo '<span class="custTransproduct">'.$product.'</span>';
echo '</div>';
}
Perhaps an MVC framework like CodeIgniter would be better?
I'm still figuring out the best way to keep PHP and layout separate without too much fuss. For the moment I really like the include-templating approach, because it's so simple and has no restrictions.
So, for your example, you would have a php file (example.php) that looks like this:
<?php
$custTrans = Customer::getTransactions(); // assumes getTransactions() is a static method
$displ_transactions = array();
foreach ($custTrans as $ct){
$transaction = array(
'amount' => $ct[0],
'date' => $ct[1],
'product' => $ct[2],
);
$displ_transactions[] = $transaction; // this will push the transaction into the array
}
include 'example.tpl.php';
?>
And then you need a second file (example.tpl.php):
<?php foreach ($displ_transactions as $transaction) { ?>
<div class="custTrans">
<span class='custTransAmount'><?php echo $transaction['amount'] ?></span>
<span class='custTransDate'><?php echo $transaction['date'] ?></span>
<span class='custTransproduct'><?php echo $transaction['product'] ?></span>
</div>
<?php } ?>
Just call example.php in your browser and you will see the same result as you had before.
This is all well and good for small websites, but the method causes some overhead. If you are serious about templating, use Smarty. It's easy to learn, and it has automatic caching, so it's very fast.
I just realized you can also do it this way:
example.php:
<?php
$custTrans = Customer::getTransactions(); // assumes getTransactions() is a static method
foreach ($custTrans as $ct){
$amount = $ct[0];
$date = $ct[1];
$product = $ct[2];
include 'example.tpl.php';
}
?>
example.tpl.php:
<div class="custTrans">
<span class='custTransAmount'><?php echo $amount ?></span>
<span class='custTransDate'><?php echo $date ?></span>
<span class='custTransproduct'><?php echo $product ?></span>
</div>
Use whatever suits you best :)
If I were you, I would store the HTML in a variable instead of echoing it out, like so:
$custTrans = Customer::getTransactions(); // assumes getTransactions() is a static method
$html = "";
foreach ($custTrans as $ct){
$amount = $ct[0];
$date = $ct[1];
$product = $ct[2];
$html .= "<div class='custTrans'>";
$html .= "<span class='custTransAmount'>".$amount."</span>";
$html .= "<span class='custTransDate'>".$date."</span>";
$html .= "<span class='custTransproduct'>".$product."</span>";
$html .= "</div>";
}
You then have the HTML stored in the variable $html, and you can echo it wherever you like.
echo $html;
Does that solve your problem, mate?
W.
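One way to take the "build a string, echo it later" idea a step further is to put the loop into a function that returns the HTML, so the calling script decides where it is printed. The sketch below makes several assumptions: render_transactions() is a made-up name, the sample row stands in for the result of getTransactions(), and escaping with htmlspecialchars() is an addition that the answers above do not include.

```php
<?php
// render_transactions() is a hypothetical helper; each row is [amount, date, product].
function render_transactions(array $transactions): string
{
    $html = '';
    foreach ($transactions as [$amount, $date, $product]) {
        $html .= sprintf(
            "<div class='custTrans'>"
            . "<span class='custTransAmount'>%s</span>"
            . "<span class='custTransDate'>%s</span>"
            . "<span class='custTransproduct'>%s</span>"
            . "</div>\n",
            // Escape every value so stored data cannot inject markup.
            htmlspecialchars((string) $amount, ENT_QUOTES, 'UTF-8'),
            htmlspecialchars((string) $date, ENT_QUOTES, 'UTF-8'),
            htmlspecialchars((string) $product, ENT_QUOTES, 'UTF-8')
        );
    }
    return $html;
}

// Assumed sample data standing in for the customer's transactions.
$sample = [[19.99, '2011-12-20', 'Widget & Co']];
$html = render_transactions($sample);

echo $html;
```

Because the function only returns a string, the same markup can be echoed in a page, cached, or compared in a test.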
I can confirm that include-templating (mentioned by Jules Colle) is one valid answer. Nevertheless, it can get messy to maintain when the project grows large, so remember to document which file is included by which. There is (currently) no (free) IDE feature for this kind of chaining, that is, one that would easily take you from one file to another or lay everything out as procedural-like code.
Edit:
Do not forget the magic constants (such as __FILE__ and __DIR__):
http://www.php.net/manual/en/language.constants.predefined.php
and this:
$_SERVER['DOCUMENT_ROOT']
