Detailed preg_match_all - php

I am having an issue getting a detailed preg_match_all to work. I keep getting a blank Array.
Here is my code:
<?php
$remote_search = file_get_contents('http://wiki.seg.org/index.php?title=Special%3ASearch&search=drilling&button=');
preg_match_all('%<li><div class=\'mw-search-result-heading\'>(.*) </div> <div class=\'searchresult\'>(.*)</div>
<div class=\'mw-search-result-data\'>(.*)</div></li>%si', $remote_search, $links);
echo '<ul class=\'mw-search-results\'>';
for($i = 0; $i < count($links[1]); $i++) {
echo '<li><div class=\'mw-search-result-heading\'><a href="' . $links[5][$i] . '" title="' . $links[4][$i] . '">' . $links[3][$i] . '<\/a> </div> <div class=\'searchresult\'>' . $links[2][$i] . '<\/div><div class=\'mw-search-result-data\'>' . $links[1][$i] . '<\/div><\/li>';
}
echo '</ul>';
?>
I am trying to grab the link details from code shown below:
<li><div class='mw-search-result-heading'>Dictionary:Cable drilling </div> <div class='searchresult'>{{lowercase}}{{#category_index:C|cable <span class='searchmatch'>drilling</span>}}
</div>
<div class='mw-search-result-data'>132 B (22 words) - 19:58, 20 December 2011</div></li>
When I perform a var_dump($links); I get Array as the result.
The code below works to grab the contents in the section I am trying to pull the variables.
<?php
$remote_search = file_get_contents('http://wiki.seg.org/index.php?title=Special%3ASearch&search=drilling&button=');
preg_match_all('%<ul class=\'mw-search-results\'>(.*)</ul>%si', $remote_search, $links);
$bar = $links[0];
echo '<ul class=\'mw-search-results\'>';
echo $bar;
echo '</ul>';
var_dump($links);
?>
The echo $bar; results in Array and no output.
The var_dump($links); in this snippet outputs the content of the ul.
Does anyone see the error in my top snippet that is preventing me from parsing the code the way I am intending it?

Try:
preg_match_all('#<li><div\s*class=\'mw-search-result-heading\'><a\s*href=.([^"]*).\s*title=.([^"]*).>([^<]*)<\/a>\s*<\/div>\s*<div\s*class=\'searchresult\'>(.*?)<\/div>\s*<div\s*class=.mw-search-result-data.>([^<]*)<\/div><\/li>#sim', $remote_search, $links);
print_r($links);
The logic error in your code was the way you were matching <div class=\'searchresult\'>(.*)</div> against <div class='searchresult'>{{lowercase}}{{#category_index:C|cable <span class='searchmatch'>drilling</span>}}</div>
This doesn't work well with regular expressions because there is a nested tag, the span. So I changed your matching logic to non-greedy: .*?. Also notice that I changed the modifiers on the regular expression to sim. I always use these three modifiers whenever I run a regular expression against HTML. I use them so often that I arranged the letters into a word, "sim", as a memory aid.
Happy coding!
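To see the greedy versus non-greedy difference in isolation, here is a minimal sketch. The sample string is made up for illustration (it is not the real wiki output); it just has two result blocks that each contain a nested span.

```php
<?php
// Hypothetical sample: two result blocks, each with a nested <span>.
$html = "<div class='searchresult'>cable <span class='searchmatch'>drilling</span> rig</div>"
      . "<div class='searchresult'>rotary <span class='searchmatch'>drilling</span></div>";

// Greedy: (.*) runs to the LAST </div>, so both blocks collapse into one match.
preg_match_all("%<div class='searchresult'>(.*)</div>%s", $html, $greedy);

// Lazy: (.*?) stops at the FIRST </div> after each opening tag.
preg_match_all("%<div class='searchresult'>(.*?)</div>%s", $html, $lazy);

echo count($greedy[1]), "\n"; // number of greedy matches
echo count($lazy[1]), "\n";   // number of lazy matches
print_r($lazy[1]);
```

The lazy version yields one capture per result block, which is what the loop in the question expects.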

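Never try to parse HTML with regex. Use DOMDocument instead. In your case, to get the links from the page you can do something like:
$dom = new DOMDocument();
libxml_use_internal_errors(true); // suppress warnings from imperfect real-world HTML
$dom->loadHTMLFile($url); // load() expects well-formed XML; loadHTMLFile() parses HTML
$elements = $dom->getElementsByTagName('a');
$links = array();
foreach ($elements as $element) {
    $links[] = $element->getAttribute('href');
}
var_dump($links);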
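If you want the individual fields of each search result rather than every link on the page, DOMXPath can select by class. This is only a sketch: the markup is adapted from the question, and the href value is an assumption, since the question does not show the real link.

```php
<?php
// Sample markup adapted from the question; the href is a made-up placeholder.
// A real page would come from file_get_contents() instead.
$html = <<<'HTML'
<ul class='mw-search-results'>
<li><div class='mw-search-result-heading'><a href="/wiki/Dictionary:Cable_drilling" title="Dictionary:Cable drilling">Dictionary:Cable drilling</a> </div> <div class='searchresult'>cable <span class='searchmatch'>drilling</span>
</div>
<div class='mw-search-result-data'>132 B (22 words) - 19:58, 20 December 2011</div></li>
</ul>
HTML;

libxml_use_internal_errors(true); // real-world HTML is rarely valid; keep warnings quiet
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$results = [];
foreach ($xpath->query("//ul[@class='mw-search-results']/li") as $li) {
    // Each query is evaluated relative to the current <li>.
    $link = $xpath->query(".//div[@class='mw-search-result-heading']/a", $li)->item(0);
    $text = $xpath->query(".//div[@class='searchresult']", $li)->item(0);
    $data = $xpath->query(".//div[@class='mw-search-result-data']", $li)->item(0);
    $results[] = [
        'href'    => $link ? $link->getAttribute('href') : null,
        'title'   => $link ? trim($link->textContent) : null,
        'snippet' => $text ? trim($text->textContent) : null,
        'data'    => $data ? trim($data->textContent) : null,
    ];
}
print_r($results);
```

Nested tags such as the span inside the snippet are handled by the parser, so there is no greedy or non-greedy matching to worry about.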


PHP: How can I put HTML Code with PHP Code within into a PHP Variable?

I have a question about " and ' in PHP. I have to put a complete <li> element into a PHP variable but this doesn't work, the output is completely false...
$list =
"
<li class=\"<?php if($info[1] < 700) {echo \"half\";} else {echo \"full\";} ?>\">
<div class=\"onet-display\">
<?php if ($image = $post->get('image.src')): ?>
<a class=\"onet-display-block\" href=\"<?= $view->url('#blog/id', ['id' => $post->id]) ?>\"><img class=\"onet-thumb\" src=\"<?= $image ?>\" alt=\"<?= $post->get('image.alt') ?>\"></a>
<?php endif ?>
<h1 class=\"onet-thumb-title\"><?= $post->title ?></h1>
<div class=\"uk-margin\"><?= $post->excerpt ?: $post->content ?></div>
</div>
</li>
";
Is it because there is PHP Content in the HTML Code? How can I solve this?
Can someone help me and explain why this doesn't work?
<?php ... <?php
Since your string contains PHP tags, I suppose you expect them to be evaluated. The opening PHP tag within another PHP tag is interpreted as a part of the PHP code. For example, the following outputs <?php echo time();:
<?php echo "<?php echo time();";
There are several ways to build a PHP string from PHP expressions.
Concatenation
You can create functions returning strings and concatenate the calls to them with other strings:
function some_function() {
return time();
}
$html = "<li " . some_function() . ">";
or use sprintf:
$html = sprintf('<li %s>', some_function());
eval
Another way is to use eval, but I wouldn't recommend it as it allows execution of arbitrary PHP code and may cause unexpected behavior.
Output Buffering
If you are running PHP as a template engine, you can use the output control functions, e.g.:
<?php ob_start(); ?>
<li data-time="<?= time() ?>"> ...</li>
<?php
$html = ob_get_contents();
ob_end_clean();
echo $html;
Output
<li data-time="1483433793"> ...</li>
Here Document Syntax
If, however, the string is supposed to be assigned as is, use the nowdoc syntax (a here document with a single-quoted identifier), which does not parse variables:
$html = <<<'HTML'
<li data-time="{$variable_will_NOT_be_parsed}">...</li>
HTML;
or, if variables should be interpolated, the heredoc syntax:
$html = <<<HTML
<li data-time="{$variable_WILL_be_parsed}">...</li>
HTML;
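Applied to the markup from the question, a heredoc version might look like the sketch below. The values of $info, $title and $excerpt are hypothetical stand-ins for the question's $info and $post data. Heredoc interpolates variables but does not run statements, so the if/else is resolved into $class beforehand.

```php
<?php
// Hypothetical stand-ins for the question's $info and $post values.
$info    = [0, 650];
$title   = 'Hello';
$excerpt = 'Short text';

// Decide the class before building the string.
$class = $info[1] < 700 ? 'half' : 'full';

// Escape plain-text values; this assumes the title is not meant to contain markup.
$safeTitle = htmlspecialchars($title);

$list = <<<HTML
<li class="{$class}">
    <div class="onet-display">
        <h1 class="onet-thumb-title">{$safeTitle}</h1>
        <div class="uk-margin">{$excerpt}</div>
    </div>
</li>
HTML;

echo $list;
```

No quote escaping is needed inside the heredoc body, which is the main readability gain over the double-quoted string in the question.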
You want to store some html into a variable.
Your source should (if not yet) start with
<?php
Then you start building the contents of $list.
Starting from your code the nearest fix is to build $list by appending strings:
<?php
$list = "<li class=";
if($info[1] < 700)
{
$list .= "\"half\""; // RESULT: <li class="half"
}
else
{
$list .= "\"full\""; // RESULT: <li class="full"
}
// ...and so on...
Now a couple things to note:
$list .= "... is a short form of $list = $list . "...
Where the . dot operator joins two strings.
Second, you can make the code easier to read by mixing single and double quotes:
Use double quotes in PHP and single quotes in the generated HTML:
<?php
$list = "<li class=";
if($info[1] < 700)
{
$list .= "'half'"; // RESULT: <li class='half'
}
else
{
$list .= "'full'"; // RESULT: <li class='full'
}
// ...and so on...
This way you don't need to escape every double quote.
I think you have to do it like this:
$test = "PHP Text";
$list = "<strong>here is HTML</strong>" . $test . "<br />";
echo $list;

Get id value with PHP Simple HTML DOM

I have this scenario:
<div class="listing_title">
<strong>
<a href="http://www.mywebsite.com/dectails23291.html" id="url_id_1977">
Listing Title
</a>
</strong>
</div>
To get Listing Title, I have implemented this code:
$page = "http://www.mywebsite.com/listings.html";
$html = new simple_html_dom();
$html->load_file($page);
foreach($html->find('.listing_title') as $element)
echo $element->first_child()->plaintext . '<br>';
OUTPUT IS:
Listing Title
Now I need to get the id value
url_id_1977
preferably only "1977", without the "url_id_" prefix, but I don't know how. Thanks in advance!
Add this inside of your foreach loop:
echo end(explode('_', $element->find('a', 0)->id));
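end() takes its argument by reference, so passing it the result of explode() directly raises an "Only variables should be passed by reference" notice. To get rid of it, assign the exploded id to a variable first: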
$id = explode('_', $element->find('a', 0)->id);
echo $id[2];
Or, if your anchor's id always starts with url_id_, just use str_replace():
echo str_replace('url_id_', '', $element->find('a', 0)->id);
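If the prefix might vary, another option is to take whatever digits the id ends with. A small sketch; numeric_suffix() is a hypothetical helper name, not part of Simple HTML DOM, and in the loop you would pass it $element->find('a', 0)->id:

```php
<?php
// Hypothetical helper: returns the trailing digits of an id such as "url_id_1977",
// or null when the id does not end in digits.
function numeric_suffix(string $id): ?string
{
    return preg_match('/(\d+)$/', $id, $m) ? $m[1] : null;
}

echo numeric_suffix('url_id_1977'), "\n";
```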
Try this:
foreach($html->find('*[class=listing_title] a') as $element)
echo $element->id. '<br>';

DOM not parsing correctly

I have a script which loads items from an XML file, and displays them, so a user can choose which item they want to remove from their file. Here's the PHP:
<?php
global $current_user;
get_currentuserinfo();
$userid= $current_user->ID;
$dom = new DOMDocument;
$dom->load("playlists/$userid.xml");
echo '<div class="styled-select">';
echo '<center><form name="input" action="/remove/removesure.php" method="get">';
echo '<select name="q[]" size="2" multiple>';
$titles = $dom->getElementsByTagName('title');
foreach ($titles as $title) {
echo '<option>'.$title->nodeValue.'</option>';
}
echo '</select>';
echo '<input type="submit" class="submit" value="Remove">';
echo '</form></center>';
echo '</div>';
?>
The problem I've run into is that some items are not displayed correctly, mainly titles with hyphens (it displays – instead of -) and titles with spaces at the end. Because of this, my removal code doesn't find the item and so can't remove it. I don't know what to do, and I don't know why it's doing this. I'm running the code in WordPress, if that makes a difference.
Any ideas?
If there is no chance of concurrent changes to the file, I would suggest using the index of each title element as the value of its option. E.g.:
$titles = $dom->getElementsByTagName('title');
$counter = 0;
foreach ($titles as $title) {
echo '<option value='.$counter.' >'.$title->nodeValue.'</option>';
$counter++;
}
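In removesure.php, you could handle this with an XPath expression like this one:
(//title)[2]
where "2" is the position of the title that must be removed. XPath element names are case-sensitive and positions start at 1, so add 1 to the zero-based counter value sent by the form.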
That is one possible solution. Another option is to encode the titles properly when printing them; htmlentities() is the function you should use:
echo '<option>'.htmlentities($title->nodeValue).'</option>';
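For the index-based approach, the removal side could look roughly like this. The playlist layout below is an assumption (the question does not show the XML file), and $index stands in for a value that would arrive via $_GET['q'].

```php
<?php
// Hypothetical playlist layout; the real file's structure is not shown in the question.
$xml = <<<'XML'
<playlist>
  <item><title>First - song </title></item>
  <item><title>Second</title></item>
  <item><title>Third</title></item>
</playlist>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

$index    = 1;          // zero-based value from the form (hard-coded for this sketch)
$position = $index + 1; // XPath positions are 1-based

// (//title)[n] picks the n-th title in document order.
$title = $xpath->query(sprintf('(//title)[%d]', $position))->item(0);
if ($title !== null) {
    $item = $title->parentNode;            // the enclosing <item>
    $item->parentNode->removeChild($item); // drop the whole entry
}

$remaining = [];
foreach ($dom->getElementsByTagName('title') as $t) {
    $remaining[] = $t->nodeValue;
}
print_r($remaining);
```

Because the lookup is by position, trailing spaces or odd characters in the title no longer matter. When several indexes are submitted at once, removing the highest index first keeps the remaining positions valid.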

Using DOM PHP Parse child contents

<h2>
<strong>
<a href="http://www.linkedin.com/pub/rom-agustin/0/111/947" title="Rom Agustin">
<span class="given-name">Rom</span>
<span class="family-name">Agustin</span>
</a>
</strong>
</h2>
So I need to parse the two span classes and store each in a variable.
span.given-name = $given_name
span.family-name = $family_name
Right now my code is :
foreach($vcard as $items):
$names = $items->getElementsByTagName('h2');
$name = $names->item(0)->nodeValue;
echo $name; //Rom Agustin
endforeach;
How do I properly separate those two? Or how can I just target the class? I've read the DOM documentation on php.net, but there is no getElementsByClassName-style method. I tried explode, but it was very messy.
Use http://simplehtmldom.sourceforge.net, which makes it a breeze, with http://simplehtmldom.sourceforge.net/manual.htm#section_create explaining how to work from plain string.
$htmltext = "...your snippet...";
$html = str_get_html($htmltext);
foreach($html->find("span") as $span) {
// use $span here!
}
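If you would rather stay with the built-in DOM extension, DOMXPath can target a class directly. A sketch using the snippet from the question; span_text() is a hypothetical helper name introduced here for illustration.

```php
<?php
$html = <<<'HTML'
<h2>
<strong>
<a href="http://www.linkedin.com/pub/rom-agustin/0/111/947" title="Rom Agustin">
<span class="given-name">Rom</span>
<span class="family-name">Agustin</span>
</a>
</strong>
</h2>
HTML;

libxml_use_internal_errors(true); // keep parser warnings quiet for fragment input
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Hypothetical helper: text of the first <span> whose class list contains $class.
function span_text(DOMXPath $xpath, string $class): ?string
{
    $query = sprintf(
        "//span[contains(concat(' ', normalize-space(@class), ' '), ' %s ')]",
        $class
    );
    $node = $xpath->query($query)->item(0);
    return $node ? trim($node->textContent) : null;
}

$given_name  = span_text($xpath, 'given-name');
$family_name = span_text($xpath, 'family-name');
echo $given_name, ' / ', $family_name, "\n";
```

The concat/normalize-space expression also matches elements that carry several classes, which a plain @class='given-name' comparison would miss.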

How to rewrite this php function into HTML-embeddable function <?php ?>

Could someone convert this line of code to be readable by HTML?
echo '<h3>'. $r['title'] .'</h3>';
into something like this:
<?php echo...blah blah blah ?> /* To display the title in HTML */
I am sure I am not doing it right, that's why it's still not working :(.
Edit: There seems to be a confusion here. I am not going to modify the original php function. What I need to do is call it to my HTML page, to display the Title of the page
function r($text, $level = 3)
{
$tag = 'h' . $level . '>';
return '<' . $tag . $text . '</' . $tag;
}
Thanks for the downvote. The given question is totally unclear and constantly edited.
Ah, you mean this?
<?php echo "<h3>{$r['title']}</h3>"; ?>
That could be an answer to this unclear question.
Save the result into a variable.
<?php $title = '<h3>'. $r['title'] .'</h3>';?>
<?php echo $title; ?>
Not exactly sure what you're asking, but you can't use PHP code within a plain HTML page; the file has to be processed by PHP (typically a .php file).
The line
<?php echo '<h3>'. $r['title'] .'</h3>'; ?>
Within a PHP file, will print out the contents of $r['title'], within <h3> tags.
There is no function involved; $r is an associative array variable and title is a key to a particular value.
