Using XPath, how can I get the value of the href attribute in the following case (grabbing only the URL of the right link)?
a wrong one
the right one
a wrong one
That is, to get the value of the href attribute if the link has a particular text.
This will select the attribute:
"//a[text()='the right one']/@href"
I think this is the best solution; you can use each of them as an array element.
$String= '
a wrong one
the right one
a wrong one
';
$array=get_all_string_between($String,'href="','">');
print_r($array);//just to see what is inside the array
//now get each of them
foreach($array as $value){
echo $value.'<br>';
}
function get_all_string_between($string, $start, $end)
{
    $result = array();
    $offset = 0;
    // keep searching until no further start delimiter is found
    while (($ini = strpos($string, $start, $offset)) !== false)
    {
        $ini += strlen($start);
        $stop = strpos($string, $end, $ini);
        if ($stop === false)
            break; // no closing delimiter left, stop searching
        $result[] = substr($string, $ini, $stop - $ini);
        $offset = $stop + strlen($end);
    }
    return $result;
}
"//a[@href='http://example.com']"
I'd use an open-source class like simple_html_dom.php:
$oHtml = new simple_html_dom();
$oHtml->load($sBody);
foreach($oHtml->find('a') as $oElement) {
    echo $oElement->href;
}
Here's a full example using SimpleXML:
$xml = '<html>a wrong one'
. 'the right one'
. 'a wrong one</html>';
$tree = simplexml_load_string($xml);
$nodes = $tree->xpath('//a[text()="the right one"]');
$href = (string) $nodes[0]['href'];
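For comparison, here is a minimal sketch of the same lookup with DOMDocument and DOMXPath, selecting the attribute directly with /@href. The markup and the two "wrong" URLs are placeholders I made up for illustration; only http://example.com appears in the answers above.

```php
<?php
// Hypothetical markup: the hrefs are placeholders, not the asker's real links.
$html = '<html><body>'
      . '<a href="http://example.org/wrong-1">a wrong one</a>'
      . '<a href="http://example.com">the right one</a>'
      . '<a href="http://example.org/wrong-2">a wrong one</a>'
      . '</body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Select the href attribute node of the link whose text matches exactly.
$attrs = $xpath->query("//a[text()='the right one']/@href");

// query() returns a DOMNodeList; guard against "no match" before reading it.
$href = $attrs->length > 0 ? $attrs->item(0)->nodeValue : null;

echo $href, "\n";
```

If the link text may carry surrounding whitespace, normalize-space(text()) in the predicate is probably a safer comparison than text().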
Related
How can I extract a specific string after a specific word using the HTML DOM in PHP? I have:
<div class="abc">
<script type="text/javascript">
var flashvars = { word : path } </script>
Now i want to extract path after word
thanks for your response.
I got the solution for what i was looking.
Here is the code in case someone needs it.
Explanation :
'$results' is the curl response.
Enter div class name (which you want to fetch) inside "$xpath->query() function"
You will get source code for entire class inside "$tag->textContent"
$dom = new DOMDocument();
$dom->loadHTML($results);
$xpath = new DOMXPath($dom);
$tags = $xpath->query('//div[@class="e"]');
foreach ($tags as $tag)
{
echo "<br>----------<br>";
var_dump($tag->textContent);
echo "<br>----------<br>";
}
Now you have the text content of the matched div (including the script text) inside "$tag->textContent".
Now you can fetch anything from the string between "start" and "end" points using below function.
function get_string_between($string, $start, $end){
$string = ' ' . $string;
$ini = strpos($string, $start);
if ($ini == 0) return '';
$ini += strlen($start);
$len = strpos($string, $end, $ini) - $ini;
return substr($string, $ini, $len);
}
In my case i used it like this :
$price = get_string_between($tag->textContent,'swf', '+');
echo $price;
Here "swf" is the starting point of the path and "+" is the end point.
Hope it saves somebody else time :)
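If the value you are after follows a known key, a regular expression can be a shorter alternative to the start/end helper. This is only a sketch: it assumes the value after "word :" is a double-quoted string, and the sample script text and path are made up for illustration.

```php
<?php
// Hypothetical script text, standing in for $tag->textContent.
$script = 'var flashvars = { word : "/media/player.swf" }';

$path = null;
// Capture whatever sits between the double quotes after "word :".
if (preg_match('/\bword\s*:\s*"([^"]*)"/', $script, $m)) {
    $path = $m[1];
}

echo $path, "\n";
```

If the real value is not quoted, the capture group would need a different terminator, for example a comma or the closing brace.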
So right now I have 2 php codes that work exactly as they are supposed to
the first one pulls all info from the "src" of an "img" tag
<?php
$url="foo";
$html = file_get_contents($url);
$doc = new DOMDocument();
@$doc->loadHTML($html);
$tags = $doc->getElementsByTagName('img');
foreach ($tags as $tag) {
echo $tag->getAttribute('src') . "<br>";
}
?>
the second one is designed to pull a string of characters from between two others strings
<?php
function get_string_between($string, $start, $end){
$string = " ".$string;
$ini = strpos($string,$start);
if ($ini == 0) return "";
$ini += strlen($start);
$len = strpos($string,$end,$ini) - $ini;
return substr($string,$ini,$len);
}
$fullstring = "this is my [tag]dog[/tag]";
$parsed = get_string_between($fullstring, "[tag]", "[/tag]");
echo $parsed; // (result = dog)
?>
What I need is to figure out how to use the second code to pull only a piece of each "src", repeating for as long as there are still "img" tags to process.
So if the tag comes back as "/pics/foo.jpg", I can remove the "/pics/" and the ".jpg", leaving me with just "foo".
I hope I have made some sense. Thanks.
Why do you want to use the second code? You can do it by exploding the src value on "/":
$exploded_tag = explode('/', $tag->getAttribute('src'));
Then you need the last element of the array (foo.jpg):
$last_part = end($exploded_tag);
Then explode that on "." and take the first element (foo):
$exploded_lastpart = explode('.', $last_part);
$piece = $exploded_lastpart[0];
You don't need the second code. PHP has a function called pathinfo(), so you can just do:
$path_parts = pathinfo('/pics/foo.jpg');
echo $path_parts['filename'];
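Putting the two pieces together, a rough sketch of the loop over every "img" tag could look like this. The inline HTML is a made-up stand-in for the page fetched with file_get_contents(), and PATHINFO_FILENAME asks pathinfo() for just the name without directory or extension.

```php
<?php
// Hypothetical HTML standing in for the fetched page.
$html = '<html><body><img src="/pics/foo.jpg"><img src="/pics/bar.png"></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);

$names = [];
foreach ($doc->getElementsByTagName('img') as $tag) {
    // "/pics/foo.jpg" becomes "foo"
    $names[] = pathinfo($tag->getAttribute('src'), PATHINFO_FILENAME);
}

echo implode("<br>", $names), "\n";
```

Note that for a name with several dots, such as "foo.tar.gz", pathinfo() strips only the last extension.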
My XML file:
<pasaz:Envelope>
<pasaz:Body>
<loadOffe>
<offe>
<off>
<id>120023</id>
<name>my name John</name>
<name>Test</name>
</off>
</offe>
</loadOffe>
</pasaz:Body>
</pasaz:Envelope>
How can I display the id and name in PHP?
If you're just looking for a simple way to extract the contents of a tag, but don't want to go to all the trouble of parsing the XML properly, you could do something like this:
$xml = ""; // your xml data as a string
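function get_tag_contents($xml, $tagName) {
    $openTag = "<" . $tagName . ">";
    $startPosition = strpos($xml, $openTag);
    $endPosition = strpos($xml, "</" . $tagName . ">");
    if ($startPosition === false || $endPosition === false) {
        return null; // tag not found
    }
    // skip past the opening tag so only the contents are returned
    $startPosition += strlen($openTag);
    return substr($xml, $startPosition, $endPosition - $startPosition);
}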
$id = get_tag_contents($xml, "id");
$name = get_tag_contents($xml, "name");
This assumes you haven't assigned any attributes to your tags, and that each tag is unique (in the example you gave us I noted two "name" tags, and if you want both you'll need to make this solution a bit more robust or do proper XML parsing).
How to get all items?
Example (does not work ..)
$pliks = simplexml_load_file("file.xml");
foreach ($pliks->children('pasaz', true) as $body)
{
foreach ($body->children() as $loadOffe)
{
if ($loadOffe->offe->off) {
echo "<p>id: $loadOffe->id</p>";
echo "$id->id";
echo "<p>name: <b>$name->name</b></p>";
}
}
// echo $loadOffe->offe->off->id;
}
As Marc B suggested in his comment, you should use DOM: either getElementsByTagName() or DOMXPath. An example using getElementsByTagName():
$dom = new DOMDocument;
$dom->loadXML($xml);
$ids = $dom->getElementsByTagName('id');
if( !$ids || !$ids->length){
throw new Exception( 'Id not found');
}
return $ids->item(0);
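To get all items rather than just the first id, one option is to walk every "off" element with DOMXPath. This is a sketch under an assumption: the XML in the question does not declare the "pasaz" prefix, so a placeholder namespace URI is added here to keep the sample well-formed.

```php
<?php
// Sample data based on the question; the xmlns URI is a placeholder assumption.
$xml = <<<XML
<pasaz:Envelope xmlns:pasaz="http://example.com/pasaz">
  <pasaz:Body>
    <loadOffe><offe><off>
      <id>120023</id>
      <name>my name John</name>
      <name>Test</name>
    </off></offe></loadOffe>
  </pasaz:Body>
</pasaz:Envelope>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

$offers = [];
// "off" carries no namespace prefix, so a plain //off matches it.
foreach ($xpath->query('//off') as $off) {
    $names = [];
    foreach ($xpath->query('name', $off) as $name) {
        $names[] = $name->nodeValue;
    }
    $offers[] = [
        'id'    => $xpath->evaluate('string(id)', $off),
        'names' => $names,
    ];
}

foreach ($offers as $offer) {
    echo "<p>id: {$offer['id']}</p>";
    echo "<p>name: <b>" . implode(', ', $offer['names']) . "</b></p>";
}
```

Elements inside the prefixed Envelope and Body would need registerNamespace() on the DOMXPath object before they can be addressed by prefix.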
I have these php lines:
<?php
$start_text = '<username="';
$end_text = '" userid=';
$source = file_get_contents('http://mysites/users.xml');
$start_pos = strpos($source, $start_text) + strlen($start_text);
$end_pos = strpos($source, $end_text) - $start_pos;
$found_text = substr($source, $start_pos, $end_pos);
echo $found_text;
?>
I want to see just the names from entire file, but it shows me just the first name. I want to see all names.
I think it is something like: foreach ($found_text as $username).... but here I am stuck.
Update from OP post, below:
<?php
$xml = simplexml_load_file("users.xml");
foreach ($xml->children() as $child)
{
foreach($child->attributes() as $a => $b)
{
echo $a,'="',$b,"\"</br>";
}
foreach ($child->children() as $child2)
{
foreach($child2->attributes() as $c => $d)
{
echo "<font color='red'>".$c,'="',$d,"\"</font></br>";
}
}
}
?>
with this code, i receive all details about my users, but from all these details i want to see just 2 or 3
Now i see :
name="xxx"
type="default"
can_accept="true"
can_cancel="false"
image="avatars/trophy.png"
title="starter"
........etc
Other details from the same user, shown in red (as defined in the script):
reward_value="200"
reward_qty="1"
expiration_date="12/07/2012"
.....etc
What do I want to see?
For example, the first line from the first group, name="xxx", and expiration_date="12/07/2012" from the second group.
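One way to print just those two values is to read the attributes by name instead of looping over all of them. The structure of users.xml is not shown in the question, so the element names below (users, user, reward) are hypothetical; only the attribute names come from the output above.

```php
<?php
// Hypothetical document shape; only the attribute names are from the question.
$xml = simplexml_load_string(
    '<users>'
    . '<user name="xxx" type="default" title="starter">'
    . '<reward reward_value="200" reward_qty="1" expiration_date="12/07/2012"/>'
    . '</user>'
    . '</users>'
);

$lines = [];
foreach ($xml->children() as $child) {
    // Array-style access reads a single attribute by name.
    $name = (string) $child['name'];
    foreach ($child->children() as $child2) {
        $expires = (string) $child2['expiration_date'];
        $lines[] = 'name="' . $name . '" expiration_date="' . $expires . '"';
    }
}

echo implode("<br>", $lines), "\n";
```

With the real file, simplexml_load_file("users.xml") would replace the inline string.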
You will need to repeat the loop, using the 3rd parameter, offset, of the strpos function. That way, you can look for a new name each time.
Something like this (untested)
<?php
$start_text = '<username="';
$end_text = '" userid=';
$source = file_get_contents('http://mysites/users.xml');
$offset = 0;
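while (false !== ($start_pos = strpos($source, $start_text, $offset)))
{
    $start_pos += strlen($start_text);
    // look for the closing delimiter after the point where this name starts
    $end_pos = strpos($source, $end_text, $start_pos);
    if (false === $end_pos) break; // no closing delimiter, stop
    $text_length = $end_pos - $start_pos;
    $found_text = substr($source, $start_pos, $text_length);
    echo $found_text, "<br>";
    // continue searching after this match
    $offset = $end_pos + strlen($end_text);
}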
?>
You should either use XMLReader or DOM or SimpleXML to read XML files. If you don't see the necessity, try the following regular expressions approach to retrieve all usernames:
<?php
$xml = '<xml><username="hello" userid="123" /> <something /> <username="foobar" userid="333" /></xml>';
if (preg_match_all('#<username="(?<name>[^"]+)"#', $xml, $matches, PREG_PATTERN_ORDER)) {
var_dump($matches['name']);
} else {
echo 'no <username="" found';
}
How can I get everything from $string after "<div class='partFive'>"?
try this,
$prefix = "<div class='partFive'>";
$index = strpos($string, $prefix) + strlen($prefix);
$result = substr($string, $index);
obviously you don't have to re-calculate the "strlen" part of it each time if the $prefix value is static.
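One caveat with the snippet above: strpos() returns false when the prefix is missing, and false + strlen($prefix) silently becomes an offset into the wrong place. A small wrapper can guard against that; string_after() is a name I made up for this sketch.

```php
<?php
// Hypothetical helper: returns everything after the first occurrence of
// $prefix, or null when $prefix does not occur in $haystack.
function string_after(string $haystack, string $prefix): ?string
{
    $pos = strpos($haystack, $prefix);
    if ($pos === false) {
        return null;
    }
    return substr($haystack, $pos + strlen($prefix));
}

$string = "<p>intro</p><div class='partFive'>the rest</div>";
echo string_after($string, "<div class='partFive'>"), "\n";
```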
$myString = strstr($string, "<div class='partFive'>"); // note: the result starts with the prefix itself
Do you need everything after <div class='partFive'> or everything in that DOM element?
I'm assuming you mean you want to grab everything in that DOM element, and the easiest way would be to grab it by using Zend_Dom.
$dom = new Zend_Dom_Query($html);
$results = $dom->query('div.partFive');
foreach ($results as $result) {
// $result is a DOMElement
}
You could just remove the part you don't want, like so (note that this keeps any text that came before the div):
$myString = str_replace("<div class='partFive'>","",$string);