Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 2 years ago.
<a data-track='' _sp= class=s-item__link href=get_this_href>...</a>
In the link above, data-track contains some JSON data. The _sp attribute can contain letters, digits, and a period (.). The class is s-item__link.
I need the get_this_href value, and then I can go from there.
This is the regex I tried, but I'm stuck from here:
<a\b(?=[^>]* class="[^"]*(?<=[" ])s-item__link[" ])(?=[^>]* href="([^"]*))
Here is an example: https://regex101.com/r/rVPeUI/1
$link = ""; // URL I'm scraping
$html = file_get_html($link);
// find() comes from simple_html_dom.php; each li.s-item element is treated as an $item.
foreach ($html->find('li.s-item') as $item) {
    // $item contains a fair number of nested divs with spans and links.
}
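Rather than using a regex, it's better to parse the HTML with DOMDocument:
// $html must be the page source as a string
$doc = new DOMDocument();
libxml_use_internal_errors(true); // suppress warnings from real-world markup
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
// XPath attribute tests use @, not #
$query = "//a[@class='s-item__link']";
$entries = $xpath->query($query);
foreach ($entries as $entry) {
    echo "HREF " . $entry->getAttribute("href");
}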
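The exact-match test @class='s-item__link' stops matching as soon as the link carries a second class. A minimal sketch of a token-based class match follows; the sample markup, the example.com URLs, and the extra class name are assumptions made up for illustration, not taken from the scraped page.

```php
<?php
// Hypothetical sample markup standing in for the scraped page.
$html = '<ul><li class="s-item">'
      . '<a data-track=\'{"id":1}\' _sp="p2351460.m1686" class="s-item__link extra" href="https://example.com/item/1">One</a>'
      . '<a class="other" href="https://example.com/skip">Skip</a>'
      . '</li></ul>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // keep parser warnings out of the output
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Match s-item__link as a whole class token, even when other classes are present,
// and select the href attribute node directly.
$query = "//a[contains(concat(' ', normalize-space(@class), ' '), ' s-item__link ')]/@href";

$hrefs = [];
foreach ($xpath->query($query) as $attr) {
    $hrefs[] = $attr->nodeValue;
}
print_r($hrefs);
```

Selecting /@href in the query avoids a separate getAttribute() call, and links without an href are skipped automatically.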
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 4 years ago.
I have an XML document with several nested subcategories.
In PHP, I want to extract the "long_name" where the type is "administrative_area_level_3".
How can I do this?
This is my XML: https://ibb.co/fY27bJ
I tried this, but it doesn't work:
<?
$string_data = "https://maps.googleapis.com/maps/api/geocode/xml?latlng=41.51,15.16&key=AIzaSyClG_vc2nkQCzXqvDzW1maPrUWLyADI7xI";
$xml = simplexml_load_string($string_data);
$citta = (string) $xml->result[0]->address_component[3]->long_name;
echo "<p>".$citta."</p>";
?>
You are missing geoname in $xml->name.
Try it like this:
$xml = simplexml_load_string($string_data);
$citta = (string)$xml->geoname->name;
echo $citta;
If you want to loop through multiple items, you could use:
foreach ($xml->geoname as $item) {
echo $item->name;
}
Update:
For the updated part you could use the same technique:
$xml = simplexml_load_file("https://maps.googleapis.com/maps/api/geocode/xml?latlng=41.51,15.16&key=AIzaSyClG_vc2nkQCzXqvDzW1maPrUWLyADI7xI");
foreach ($xml->result as $item) {
if ((string)$item->type === "administrative_area_level_3") {
echo $item->address_component->long_name;
}
}
Or by index [1]:
echo $xml->result[1]->address_component->long_name;
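An alternative that does not depend on element order is to let XPath select the address_component by its type. This is only a sketch: the XML below is a trimmed, made-up sample shaped like the geocoding response (the place names are placeholders), so it runs without a network call or an API key.

```php
<?php
// Hypothetical, trimmed-down sample shaped like the geocoding XML response.
$sample = <<<XML
<GeocodeResponse>
  <status>OK</status>
  <result>
    <type>route</type>
    <address_component>
      <long_name>Foggia</long_name>
      <short_name>FG</short_name>
      <type>administrative_area_level_2</type>
      <type>political</type>
    </address_component>
    <address_component>
      <long_name>Lucera</long_name>
      <short_name>Lucera</short_name>
      <type>administrative_area_level_3</type>
      <type>political</type>
    </address_component>
  </result>
</GeocodeResponse>
XML;

$xml = simplexml_load_string($sample);
// The predicate is true when any <type> child equals the wanted value.
$matches = $xml->xpath('//address_component[type="administrative_area_level_3"]/long_name');
$citta = $matches ? (string) $matches[0] : null;
echo $citta, PHP_EOL;
```

With the live service, simplexml_load_file() on the request URL would take the place of simplexml_load_string($sample).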
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I'm looking for a way to get specific content from a remote web page.
The content I want is inside JavaScript variables, like these:
var Example1 = 0;
var Example2 = 14;
The variable names stay the same and the values are only numbers.
Thank you
Find the script elements in the HTML source with DOMDocument, then match the variable declarations with a regex:
$DOM = new DOMDocument();
// $output holds the HTML source of the remote page
$DOM->loadHTML($output);
$res = [];
$scripts = $DOM->getElementsByTagName('script');
$lnt = $scripts->length;
for ($i = 0; $i < $lnt; $i++) {
    // Capture "var Name = 123;" declarations as name => value pairs
    preg_match_all('/var\s+(\w+)\s*=\s*(\d+)\s*;/', $DOM->saveHTML($scripts->item($i)), $m);
    $res = array_merge($res, array_combine($m[1], $m[2]));
}
print_r($res);
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I need to extract the attribute values of an HTML-like tag that is stored in a PHP variable:
[video src="http://localhost/video.mp4" poster="http://localhost/thumb.jpg" preload="none"][/video]
The values of src and poster should be extracted separately and stored in an array.
If your text is in a PHP variable, you can use preg_match:
<?php
$text = '[video src="http://localhost/video.mp4" poster="http://localhost/thumb.jpg" preload="none"][/video]';
preg_match('/src=\"([^\"]+)\"/si', $text, $src);
preg_match('/poster=\"([^\"]+)\"/si', $text, $poster);
$array['src'] = $src[1];
$array['poster'] = $poster[1];
print_r($array);
?>
You can also use simplehtmldom.
You can use an HTML parser for this. See this one: http://simplehtmldom.sourceforge.net/.
$html = str_get_html('<video src="http://localhost/video.mp4" poster="http://localhost/thumb.jpg" preload="none"></video>');
$array = [];
foreach ($html->find('video') as $element) {
    $array['src'] = $element->src;
    $array['poster'] = $element->poster;
}
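If more attributes may be needed later, a single preg_match_all can collect every name="value" pair at once. The sketch below assumes all attribute values are double-quoted, as in the example string; the $attrs variable is introduced here for illustration only.

```php
<?php
$text = '[video src="http://localhost/video.mp4" poster="http://localhost/thumb.jpg" preload="none"][/video]';

// Capture every name="value" pair in one pass.
preg_match_all('/([\w-]+)="([^"]*)"/', $text, $m);
$attrs = array_combine($m[1], $m[2]);

// Keep only the attributes of interest.
$array = array_intersect_key($attrs, array_flip(['src', 'poster']));
print_r($array);
```

Here $attrs also holds preload, so other attributes can be picked out the same way without writing another pattern.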
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
Is it possible to get the text between <p></p> tags and store it in a variable?
Given <p>blabla</p>, I would like to get the text "blabla" and store it in a PHP variable, like this:
<?php $test = "blabla"; ?>
Try:
$html = "<p>blabla</p>";
$dom = new DOMDocument;
$dom->loadXML($html);
$arr = $dom->getElementsByTagName('p');
foreach ($arr as $value) {
echo $value->nodeValue; // result => blabla
}
There are many other methods you can use depending on your needs, so take a look at the DOMDocument documentation.
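You can use this function; it is self-explanatory:
function getTextBetweenTags($string, $tagname)
{
    // Non-greedy match of the text between <tagname> and </tagname>
    $pattern = "/<$tagname>(.*?)<\/$tagname>/s";
    if (preg_match($pattern, $string, $matches)) {
        return $matches[1];
    }
    return null; // the tag was not found
}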
Closed. This question is off-topic. It is not currently accepting answers.
Closed 11 years ago.
Let's say I have the following $string...
<span style='text-decoration:underline; display:none;'>Some text</span>
I only want to allow the style text-decoration, so I want a PHP function like the following...
$string = stripStyles($string, array("text-decoration"));
Similar to strip_tags, but using an array instead. So $string will now be...
<span style='text-decoration:underline;'>Some text</span>
I am using Cake, so if this can be done with Sanitize then all the better.
This is tricky, but you should be able to do it with DOMDocument. This should get you started, but it's likely to require some serious tweaking.
// Load your HTML string
$dom = new DOMDocument();
$dom->loadHTML($your_html_string);
// Get all the <span> tags
$spans = $dom->getElementsByTagName("span");
// Loop over the span tags
foreach ($spans as $span) {
    // If the style attribute contains "text-decoration:", replace its
    // contents with only the text-decoration component.
    if ($style = $span->getAttribute("style")) {
        if (preg_match('/text-decoration:([^;]*);/i', $style)) {
            $span->setAttribute("style", preg_replace('/^(.*)text-decoration:([^;]*);(.*)$/i', "text-decoration:$2;", $style));
        }
        // Otherwise, erase the style attribute
        else {
            $span->setAttribute("style", "");
        }
    }
}
$output = $dom->saveHTML();
It may be better to parse the style attribute by explode()ing on ;
// This replaces the body of the foreach ($spans as $span) loop above,
// taking the place of the preg_replace().
$style = $span->getAttribute("style");
$styles = explode(";", $style);
$replaced_style = FALSE;
foreach ($styles as $s) {
    if (preg_match('/text-decoration/', $s)) {
        $span->setAttribute("style", trim($s) . ";");
        $replaced_style = TRUE;
        break;
    }
}
// If a text-decoration wasn't found, empty out the style
if (!$replaced_style) $span->setAttribute("style", "");
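Putting the pieces together, one possible shape for the stripStyles() function the question asks for is sketched below. It is not a Cake or Sanitize feature; the function body is an assumption built on DOMDocument and explode(), and splitting on ; is naive (a value such as a data: URL containing a semicolon would be cut apart).

```php
<?php
// Hypothetical helper: keep only the CSS properties named in $allowed.
function stripStyles(string $html, array $allowed): string
{
    $dom = new DOMDocument();
    libxml_use_internal_errors(true); // keep parser warnings out of the output
    // Wrap the fragment in a div so it can be pulled back out afterwards.
    $dom->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    foreach ($xpath->query('//*[@style]') as $el) {
        $kept = [];
        foreach (explode(';', $el->getAttribute('style')) as $decl) {
            $parts = explode(':', $decl, 2); // property, value
            if (count($parts) === 2 && in_array(strtolower(trim($parts[0])), $allowed, true)) {
                $kept[] = trim($parts[0]) . ':' . trim($parts[1]) . ';';
            }
        }
        if ($kept) {
            $el->setAttribute('style', implode(' ', $kept));
        } else {
            $el->removeAttribute('style'); // nothing allowed remains
        }
    }

    // Serialize only the children of the wrapper div.
    $out = '';
    $wrapper = $dom->getElementsByTagName('div')->item(0);
    foreach ($wrapper->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

$string = "<span style='text-decoration:underline; display:none;'>Some text</span>";
echo stripStyles($string, ['text-decoration']), PHP_EOL;
```

Unlike the span-only loops above, this version visits every element that has a style attribute and removes the attribute entirely when no allowed property is left. Note that saveHTML() writes attribute values with double quotes, so the quoting of the input is not preserved.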