Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I'm looking for a way to get specific content from a remote web page.
The content I want is inside JavaScript variables like these:
var Example1 = 0;
var Example2 = 14;
The variable names always stay the same, and the values are only numbers.
Thank you
Find the script elements in the HTML source with DOMDocument, then match the variable declarations with a regex:
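// $output is assumed to hold the HTML of the remote page, fetched beforehand
libxml_use_internal_errors(true); // suppress warnings from malformed HTML
$DOM = new DOMDocument();
$DOM->loadHTML($output);
$res = [];
foreach ($DOM->getElementsByTagName('script') as $script) {
    // capture "var Name = 123;" declarations: names in $m[1], numbers in $m[2]
    preg_match_all('/var\s+(\w+)\s*=\s*(\d+)\s*;/', $script->textContent, $m);
    $res = array_merge($res, array_combine($m[1], $m[2]));
}
print_r($res);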
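As a quick check, the same approach can be run against an inline HTML string. This is only a minimal sketch: the $output markup below is made up for illustration, and in practice it would hold the fetched page source.

```php
<?php
// Hypothetical page source; in practice $output would hold the fetched HTML.
$output = '<html><head><script>var Example1 = 0; var Example2 = 14;</script></head>'
        . '<body><p>Hello</p><script>var Other = 7;</script></body></html>';

libxml_use_internal_errors(true);   // keep parser warnings out of the output
$dom = new DOMDocument();
$dom->loadHTML($output);

$res = [];
foreach ($dom->getElementsByTagName('script') as $script) {
    // Variable name lands in $m[1], numeric value (as a string) in $m[2].
    if (preg_match_all('/var\s+(\w+)\s*=\s*(\d+)\s*;/', $script->textContent, $m)) {
        $res = array_merge($res, array_combine($m[1], $m[2]));
    }
}

print_r($res);
```

The captured values are strings, so cast them with (int) if you need numbers.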
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 2 years ago.
<a data-track='' _sp= class=s-item__link href=get_this_href>...</a>
In the link above, data-track contains some JSON data, _sp can contain letters, numbers, and a period (.), and the class is s-item__link.
I need the get_this_href value; I can go from there.
This is the regex I tried, but I'm stuck from here:
<a\b(?=[^>]* class="[^"]*(?<=[" ])s-item__link[" ])(?=[^>]* href="([^"]*))
Here is an example: https://regex101.com/r/rVPeUI/1
$link = ""; // URL I'm scraping
$html = file_get_html($link);
// find() is part of simple_html_dom.php; each li.s-item is treated as an $item
foreach ($html->find('li.s-item') as $item) {
    // $item contains a fair number of nested divs with spans and links
}
Rather than a regex, it's better to use DOMDocument to parse the HTML:
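// $html must be the page's HTML as a string
libxml_use_internal_errors(true); // suppress warnings from malformed HTML
$doc = new DOMDocument();
$doc->loadHTML($html); // loadHTML is an instance method, not a static one
$xpath = new DOMXPath($doc);
// XPath selects attributes with @, not #
$query = "//a[@class='s-item__link']";
$entries = $xpath->query($query);
foreach ($entries as $entry) {
    echo "HREF " . $entry->getAttribute("href");
}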
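An exact comparison on the class attribute stops matching as soon as the link carries a second class. A minimal sketch of a token-based match follows, using made-up markup modelled on the link in the question (the URLs and the extra class are assumptions for illustration):

```php
<?php
// Hypothetical markup modelled on the link in the question.
$html = '<ul>'
      . '<li class="s-item"><a data-track="{}" _sp="p2351.m1" class="s-item__link" href="https://example.com/item/1">One</a></li>'
      . '<li class="s-item"><a class="s-item__link extra" href="https://example.com/item/2">Two</a></li>'
      . '<li class="s-item"><a class="other" href="https://example.com/skip">Skip</a></li>'
      . '</ul>';

libxml_use_internal_errors(true);   // keep parser warnings out of the output
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Match s-item__link as a whole class token, even when other classes are present.
$query = "//a[contains(concat(' ', normalize-space(@class), ' '), ' s-item__link ')]";

$hrefs = [];
foreach ($xpath->query($query) as $a) {
    $hrefs[] = $a->getAttribute('href');
}
print_r($hrefs);
```

Padding the attribute with spaces before calling contains() avoids false hits on class names that merely contain s-item__link as a substring.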
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
My main page runs some PHP code that includes:
foreach ($fbdata->feed->data as $fbpost)
{
...
}
How can I convert this into a loop that runs over an index range, from i to z (for example, 0 to 10)?
Use a simple for loop:
for ($i = 0; $i < 10; $i++) {
    $fbpost = $fbdata->feed->data[$i];
    ...
}
Or, if you prefer to keep the foreach ... as syntax, slice the array before looping over it:
$fbPosts = array_slice($fbdata->feed->data, 0, 10);
foreach ($fbPosts as $fbpost) {
    ...
}
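One thing to watch with an index-based loop is a feed that holds fewer items than the upper bound. A minimal sketch that clamps the range is shown here; the JSON stand-in for $fbdata and the $start and $end names are assumptions for illustration:

```php
<?php
// Hypothetical stand-in for the decoded feed: 3 posts, fewer than the limit.
$fbdata = json_decode('{"feed":{"data":[{"id":"a"},{"id":"b"},{"id":"c"}]}}');

$start = 0;   // first index to visit
$end   = 10;  // one past the last index to visit

$posts = $fbdata->feed->data;
$ids = [];
// Stop at the end of the array so short feeds do not trigger undefined-offset warnings.
for ($i = $start; $i < min($end, count($posts)); $i++) {
    $fbpost = $posts[$i];
    $ids[] = $fbpost->id;
}
print_r($ids);
```

array_slice handles this case on its own, since it simply returns fewer elements when the array is short.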
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I need to extract the attribute values of an HTML-like tag that is stored in a PHP variable:
[video src="http://localhost/video.mp4" poster="http://localhost/thumb.jpg" preload="none"][/video]
The values of src and poster should be extracted separately and stored in an array.
If your text is in a PHP variable, you can use preg_match:
<?php
$text = '[video src="http://localhost/video.mp4" poster="http://localhost/thumb.jpg" preload="none"][/video]';
// capture the quoted value that follows each attribute name
preg_match('/src="([^"]+)"/i', $text, $src);
preg_match('/poster="([^"]+)"/i', $text, $poster);
$array = [];
$array['src'] = $src[1];
$array['poster'] = $poster[1];
print_r($array);
?>
You can also use an HTML parser such as simplehtmldom: http://simplehtmldom.sourceforge.net/.
// requires simple_html_dom.php to be included; note the input here is a real <video> tag, not [video]
$html = str_get_html('<video src="http://localhost/video.mp4" poster="http://localhost/thumb.jpg" preload="none"></video>');
$array = [];
foreach ($html->find('video') as $element) {
    $array['src'] = $element->src;
    $array['poster'] = $element->poster;
}
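If more attributes are needed later, a single preg_match_all call can collect every name="value" pair of the shortcode at once. This is a minimal sketch, assuming attribute values are always double-quoted; the $attrs name is introduced here for illustration:

```php
<?php
$text = '[video src="http://localhost/video.mp4" poster="http://localhost/thumb.jpg" preload="none"][/video]';

// Each match: attribute name in $m[1], value in $m[2].
preg_match_all('/(\w+)="([^"]*)"/', $text, $m);
$attrs = array_combine($m[1], $m[2]);

// Keep only the two attributes the question asks for.
$array = [
    'src'    => $attrs['src'] ?? null,
    'poster' => $attrs['poster'] ?? null,
];
print_r($array);
```

The ?? null fallback keeps the code from raising a warning when one of the attributes is missing from the tag.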
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
Is it possible to get the text between <p></p> tags and store it in a variable?
Given <p>blabla</p>, I would like to get the text "blabla" and assign it to a PHP variable, like this:
<?php $test = 'blabla'; ?>
Try:
$html = "<p>blabla</p>";
$dom = new DOMDocument;
$dom->loadXML($html);
$arr = $dom->getElementsByTagName('p');
foreach ($arr as $value) {
    echo $value->nodeValue; // prints: blabla
}
Many other methods are available depending on your needs, so take a look at the DOMDocument documentation.
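Note that loadXML only accepts well-formed XML. For ordinary HTML, loadHTML is more forgiving. Here is a minimal sketch, with a made-up $html fragment, that stores the text of the first paragraph in $test:

```php
<?php
// Hypothetical fragment: the unclosed <br> would make loadXML fail.
$html = '<div><p>blabla</p><br><p>second</p></div>';

libxml_use_internal_errors(true);   // keep parser warnings out of the output
$dom = new DOMDocument();
$dom->loadHTML($html);

// Collect the text of every <p> element in document order.
$texts = [];
foreach ($dom->getElementsByTagName('p') as $p) {
    $texts[] = $p->textContent;
}

$test = $texts[0] ?? null;  // text of the first <p>, or null if there is none
echo $test, "\n";
```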
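You can use this function; it is self-explanatory:
function getTextBetweenTags($string, $tagname)
{
    // escape the tag name so it is treated literally inside the pattern
    $tag = preg_quote($tagname, '/');
    // the s flag lets the text span several lines
    $pattern = "/<$tag>(.*?)<\/$tag>/s";
    // return the first match, or null when the tag is not found
    return preg_match($pattern, $string, $matches) ? $matches[1] : null;
}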
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
I'm trying to parse out a very specific part of a page using PHP, and I'm currently using getElementsByTagName.
It works, but it also returns other elements that use the same tag,
so I looked for a more specific pattern and found that the element I want has a unique class attribute:
<li class="unique">
// get all li elements
$items = $DOM->getElementsByTagName('li');
// display the text of every li
for ($i = 0; $i < $items->length; $i++) {
    echo $items->item($i)->nodeValue . "<br/>";
}
You can select all li elements whose class is unique using an XPath query (attributes are selected with @):
$xpath = new DOMXPath($DOM);
$items = $xpath->query('//li[@class="unique"]');
Then loop over them:
for ($i = 0; $i < $items->length; $i++) {
    echo $items->item($i)->nodeValue . "<br/>";
}
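To try the query in isolation, it can be run against a small inline list. This is a minimal sketch: the $html list below is made up for illustration, and only one item carries the class from the question.

```php
<?php
// Hypothetical list: only one item carries the class from the question.
$html = '<ul><li>first</li><li class="unique"> target </li><li class="other">last</li></ul>';

libxml_use_internal_errors(true);   // keep parser warnings out of the output
$DOM = new DOMDocument();
$DOM->loadHTML($html);

$xpath = new DOMXPath($DOM);
$matches = [];
foreach ($xpath->query('//li[@class="unique"]') as $li) {
    // trim() drops the whitespace that surrounds the text inside the tag
    $matches[] = trim($li->nodeValue);
}
print_r($matches);
```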