DOMDocument cannot change parentNode - php

I cannot change the DOMDocument parentNode from null. I have tried using both appendChild and replaceChild, but haven't had any luck.
Where am I going wrong here?
error_reporting(E_ALL);
function xml_encode($mixed, $DOMDocument=null) {
if (is_null($DOMDocument)) {
$DOMDocument =new DOMDocument;
$DOMDocument->formatOutput = true;
xml_encode($mixed, $DOMDocument);
echo $DOMDocument->saveXML();
} else {
if (is_array($mixed)) {
$node = $DOMDocument->createElement('urlset', 'hello');
$DOMDocument->parentNode->appendChild($node);
}
}
}
$data = array();
for ($x = 0; $x <= 10; $x++) {
$data['urlset'][] = array(
'loc' => 'http://www.example.com/user',
'lastmod' => 'YYYY-MM-DD',
'changefreq' => 'monthly',
'priority' => 0.5
);
}
header('Content-Type: application/xml');
echo xml_encode($data);
?>
http://runnable.com/VWhQksAhdIJYEPLj/xml-encode-for-php

Since the document has no parent node, you need to append the root node directly to the document, like this:
$DOMDocument->appendChild($node);
This works since DOMDocument extends DOMNode.
Working example:
error_reporting(E_ALL);
function xml_encode($mixed, &$DOMDocument=null) {
if (is_null($DOMDocument)) {
$DOMDocument =new DOMDocument;
$DOMDocument->formatOutput = true;
xml_encode($mixed, $DOMDocument);
return $DOMDocument->saveXML();
} else {
if (is_array($mixed)) {
$node = $DOMDocument->createElement('urlset', 'hello');
$DOMDocument->appendChild($node);
}
}
}
$data = array();
for ($x = 0; $x <= 10; $x++) {
$data['urlset'][] = array(
'loc' => 'http://www.example.com/user',
'lastmod' => 'YYYY-MM-DD',
'changefreq' => 'monthly',
'priority' => 0.5
);
}
header('Content-Type: application/xml');
echo xml_encode($data);
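To see why appending to the document itself works, here is a minimal sketch (not the asker's full function). The `url`, `loc` and `changefreq` element names are assumptions based on the sitemap-like data in the question. It checks that a fresh document has neither a parent nor a root element, then builds a small tree under the root:

```php
<?php
$doc = new DOMDocument('1.0', 'UTF-8');
$doc->formatOutput = true;

// A document is the top of the tree, so it has no parent,
// and a new document has no root element yet.
$parentBefore = $doc->parentNode;      // null
$rootBefore   = $doc->documentElement; // null

// Append the root element to the document itself.
$urlset = $doc->appendChild($doc->createElement('urlset'));

// Further elements go under the root. The second argument of
// createElement() becomes the element's text content.
$url = $urlset->appendChild($doc->createElement('url'));
$url->appendChild($doc->createElement('loc', 'http://www.example.com/user'));
$url->appendChild($doc->createElement('changefreq', 'monthly'));

$xml = $doc->saveXML();
echo $xml;
```

After the first `appendChild()` call, `$doc->documentElement` is the `urlset` element, which makes it a more useful thing to test than `parentNode`.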
By the way, if you just want to serialize an XML file, DOM is a bit of overhead. I would use a template engine for this, that is, handle it as plain text.

This should work. When you create a new DOMDocument, it doesn't have a root element yet, so you can just create one and add it to the document:
//You could add this to the top of xml_encode
if($DOMDocument->documentElement === null) {
$root = $DOMDocument->createElement('root');
$root = $DOMDocument->appendChild($root);
}
//Your script working:
<?php
error_reporting(E_ALL);
function xml_encode($mixed, $DOMDocument=null) {
if (is_null($DOMDocument)) {
$DOMDocument =new DOMDocument();
$DOMDocument->formatOutput = true;
//add here, but note that in the "else" it isn't sure if the DOMDocument has a root element
$root = $DOMDocument->createElement('root');
$root = $DOMDocument->appendChild($root);
xml_encode($mixed, $root);
echo $DOMDocument->saveXML();
} else {
if (is_array($mixed)) {
// $DOMDocument holds the root element here, so create nodes through its owner document
$node = $DOMDocument->ownerDocument->createElement('urlset', 'hello');
$DOMDocument->appendChild($node);
}
}
}
I'm not sure why you need parentNode; you could just call $DOMDocument->appendChild($node).

Related

Using DomDocuments, finding and returning value of ID

I have this jQuery that I can run in the console, and it finds the element.
$.get("http://www.roblox.com/groups/group.aspx?gid=2755722", function(webpage) {
if ($(webpage).find("#ctl00_cphRoblox_rbxGroupFundsPane_GroupFunds .robux").length) {
alert("Eureka I found it!")
} else {
alert("nope!")
}
})
<div id="ctl00_cphRoblox_rbxGroupFundsPane_GroupFunds" class="StandardBox" style="padding-right:0">
<b>Funds:</b>
<span class="robux" style="margin-left:5px">29</span>
<span class="tickets" style="margin-left:5px">45</span>
</div>
When I try to do the same in PHP, using functions and DOMDocument to handle it all, it doesn't return anything. (The following is all part of a class.)
protected function xpath($url,$path)
{
libxml_use_internal_errors(true);
$dom = new DomDocument;
$dom->loadHTML($this->file_get_contents_curl($url));
$xpath = new DomXPath($dom);
return $xpath->query($path);
}
public function GetGroupStats($id)
{
$elements = array (
'Robux' => "//span[@id='ctl00_cphRoblox_rbxGroupFundsPane_GroupFunds .robux']",
'Tix' => "//span[@id='ctl00_cphRoblox_rbxGroupFundsPane_GroupFunds .tickets']",
);
$data = array();
foreach($elements as $name => $element)
{
foreach ($this->xpath('http://www.roblox.com/Groups/group.aspx?gid='.$id,$element) as $i => $node)
$data[$name] = $node->nodeValue;
}
return $data;
}
//File that includes the class and runs the function (ignore the login stuff because it isn't required for this situation)
<?php
$randomstuffdude = include 'RApi.php';
$GetAccessToken = $_GET['token'];
if ($GetAccessToken == "secrettoken6996") {
$rbxBot = new Roblox();
$rbxBot -> DoLogin();
$StatsArray = $rbxBot->GetGroupStats(2755722);
foreach ($StatsArray as $other => $array) {
echo $other . ' : ' . $array . ' / ';
}
} else {
echo "no";
}
?>
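The jQuery selector `#ctl00_cphRoblox_rbxGroupFundsPane_GroupFunds .robux` means "an element with class `robux` inside the element with that id", while the XPath in `GetGroupStats` looks for a span whose id is the whole selector string. A minimal sketch of the equivalent XPath, run against the markup quoted in the question rather than the live page (so it assumes the server really returns that markup and does not fill it in with JavaScript):

```php
<?php
// Sample markup copied from the question; the live page may differ.
$html = '<div id="ctl00_cphRoblox_rbxGroupFundsPane_GroupFunds" class="StandardBox" style="padding-right:0">
<b>Funds:</b>
<span class="robux" style="margin-left:5px">29</span>
<span class="tickets" style="margin-left:5px">45</span>
</div>';

libxml_use_internal_errors(true);
$dom = new DOMDocument;
$dom->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

// Select the container by id first, then the span by class inside it.
$container = "//div[@id='ctl00_cphRoblox_rbxGroupFundsPane_GroupFunds']";
$elements = array(
    'Robux' => $container . "//span[@class='robux']",
    'Tix'   => $container . "//span[@class='tickets']",
);

$data = array();
foreach ($elements as $name => $path) {
    $nodes = $xpath->query($path);
    if ($nodes->length > 0) {
        $data[$name] = trim($nodes->item(0)->nodeValue);
    }
}
print_r($data);
```

The same two path strings could be dropped into the `$elements` array of `GetGroupStats`; the rest of that method would stay as it is.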

Get Element by ClassName with DOMdocument() Method

Here is what I am trying to achieve: retrieve all products on a page and put them into an array. Here is the code I am using:
$page2 = curl_exec($ch);
$doc = new DOMDocument();
@$doc->loadHTML($page2);
$nodes = $doc->getElementsByTagName('title');
$noders = $doc->getElementsByClassName('productImage');
$title = $nodes->item(0)->nodeValue;
$product = $noders->item(0)->imageObject.src;
It works for the $title but not for the product. For info, in the HTML code the img tag looks like this:
<img alt="" class="productImage" data-altimages="" src="xxxx">
I have been looking at this (PHP DOMDocument how to get element?) but I still don't understand how to make it work.
PS: I get this error:
Call to undefined method DOMDocument::getElementsByclassName()
I finally used the following solution:
$classname="blockProduct";
$finder = new DomXPath($doc);
$spaner = $finder->query("//*[contains(@class, '$classname')]");
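One caveat with `contains(@class, ...)` is that it is a substring test, so a class such as `productImageThumb` would also match `productImage`. A common XPath 1.0 idiom pads the attribute with spaces and matches the whole token instead. The sketch below uses made-up sample markup and file names to show the difference:

```php
<?php
// Hypothetical sample markup; the src values are placeholders.
$html = '<html><body>
<img alt="" class="productImage" data-altimages="" src="a.jpg">
<img alt="" class="productImage large" src="b.jpg">
<img alt="" class="productImageThumb" src="c.jpg">
</body></html>';

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$classname = 'productImage';
$finder = new DOMXPath($doc);

// Pad the class list with spaces so that only whole class names match.
$query = "//*[contains(concat(' ', normalize-space(@class), ' '), ' $classname ')]";

$sources = array();
foreach ($finder->query($query) as $img) {
    $sources[] = $img->getAttribute('src');
}
print_r($sources);
```

Here the third image is skipped because `productImageThumb` is a different class name, even though it contains the string `productImage`.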
https://stackoverflow.com/a/31616848/3068233
Linking this answer as it helped me the most with this problem.
function getElementsByClass(&$parentNode, $tagName, $className) {
$nodes=array();
$childNodeList = $parentNode->getElementsByTagName($tagName);
for ($i = 0; $i < $childNodeList->length; $i++) {
$temp = $childNodeList->item($i);
if (stripos($temp->getAttribute('class'), $className) !== false) {
$nodes[]=$temp;
}
}
return $nodes;
}
That's the code, and here's the usage:
$dom = new DOMDocument('1.0', 'utf-8');
$dom->loadHTML($html);
$content_node=$dom->getElementById("content_node");
$div_a_class_nodes=getElementsByClass($content_node, 'div', 'a');
function getElementsByClassName($dom, $ClassName, $tagName=null) {
if($tagName){
$Elements = $dom->getElementsByTagName($tagName);
}else {
$Elements = $dom->getElementsByTagName("*");
}
$Matched = array();
for($i=0;$i<$Elements->length;$i++) {
if($Elements->item($i)->attributes->getNamedItem('class')){
if($Elements->item($i)->attributes->getNamedItem('class')->nodeValue == $ClassName) {
$Matched[]=$Elements->item($i);
}
}
}
return $Matched;
}
// usage
$dom = new \DOMDocument('1.0');
@$dom->loadHTML($html);
$elementsByClass = getElementsByClassName($dom, $className, 'h1');

Get Vine video url and image using PHP simple HTML DOM Parser

I would like to get a Vine's image URL and video URL using PHP Simple HTML DOM Parser.
http://simplehtmldom.sourceforge.net/
Here is an example Vine URL:
https://vine.co/v/bjHh0zHdgZT
I need to take this info from the URL. For the image URL:
<meta property="twitter:image" content="https://v.cdn.vine.co/v/thumbs/8B474922-0D0E-49AD-B237-6ED46CE85E8A-118-000000FFCD48A9C5_1.0.6.mp4.jpg?versionId=mpa1lJy2aylTIEljLGX63RFgpSR5KYNg">
and for the video URL:
<meta property="twitter:player:stream" content="https://v.cdn.vine.co/v/videos/8B474922-0D0E-49AD-B237-6ED46CE85E8A-118-000000FFCD48A9C5_1.0.6.mp4?versionId=ul2ljhBV28TB1dUvAWKgc6VH0fmv8QCP">
I want to take only the content of these meta tags. If anyone can help, I would really appreciate it. Thanks.
Instead of using the lib you pointed out, I'm using native PHP DOM in this example, and it should work.
Here's a small class I created for something like that:
<?php
class DomFinder {
function __construct($page) {
$html = @file_get_contents($page);
$doc = new DOMDocument();
$this->xpath = null;
if ($html) {
$doc->preserveWhiteSpace = true;
$doc->resolveExternals = true;
@$doc->loadHTML($html);
$this->xpath = new DOMXPath($doc);
$this->xpath->registerNamespace("html", "http://www.w3.org/1999/xhtml");
}
}
function find($criteria = NULL, $getAttr = FALSE) {
if ($criteria && $this->xpath) {
$entries = $this->xpath->query($criteria);
$results = array();
foreach ($entries as $entry) {
if (!$getAttr) {
$results[] = $entry->nodeValue;
} else {
$results[] = $entry->getAttribute($getAttr);
}
}
return $results;
}
return NULL;
}
function count($criteria = NULL) {
$items = 0;
if ($criteria && $this->xpath) {
$entries = $this->xpath->query($criteria);
foreach ($entries as $entry) {
$items++;
}
}
return $items;
}
}
To use it you can try:
$url = "https://vine.co/v/bjHh0zHdgZT";
$dom = new DomFinder($url);
$content_cell = $dom->find("//meta[@property='twitter:player:stream']", 'content');
print $content_cell[0];
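The image URL can be read the same way by querying the `twitter:image` property. As a self-contained sketch that avoids the network, the example below parses a shortened, hypothetical stand-in for the page's head (the `content` URLs are placeholders, and `meta_content` is a helper name made up for this example):

```php
<?php
// Hypothetical, shortened stand-in for the Vine page markup.
$html = '<html><head>
<meta property="twitter:image" content="https://v.cdn.vine.co/v/thumbs/example.mp4.jpg">
<meta property="twitter:player:stream" content="https://v.cdn.vine.co/v/videos/example.mp4">
</head><body></body></html>';

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

// Return the content attribute of the first matching meta tag, or null.
function meta_content(DOMXPath $xpath, $property) {
    $nodes = $xpath->query("//meta[@property='$property']");
    return $nodes->length > 0 ? $nodes->item(0)->getAttribute('content') : null;
}

$image = meta_content($xpath, 'twitter:image');
$video = meta_content($xpath, 'twitter:player:stream');
echo $image, "\n", $video, "\n";
```

Returning null when nothing matches avoids indexing into an empty result, which is what `$content_cell[0]` would do if the meta tag were missing.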

php xml generation with xpath

I wrote a small helper function to do basic search and replace using XPath, because it lets me write manipulations that are short and, at the same time, easy to read and understand.
Code:
<?php
function xml_search_replace($dom, $search_replace_rules) {
if (!is_array($search_replace_rules)) {
return;
}
$xp = new DOMXPath($dom);
foreach ($search_replace_rules as $search_pattern => $replacement) {
foreach ($xp->query($search_pattern) as $node) {
$node->nodeValue = $replacement;
}
}
}
The problem is that now I need to do different search/replace operations on different parts of the XML DOM. I had hoped something like the following would work, but DOMXPath can't use a DOMDocumentFragment :(
The first part of the example below (up to the foreach loop) works like a charm. I'm looking for inspiration for an alternative approach that is still short and readable (without too much boilerplate).
Code:
<?php
$dom = new DOMDocument;
$dom->loadXml(file_get_contents('container.xml'));
$payload = $dom->getElementsByTagName('Payload')->item(0);
xml_search_replace($dom, array('//MessageReference' => 'SRV4-ID00000000001'));
$payloadXmlTemplate = file_get_contents('payload_template.xml');
foreach (array(array('id' => 'some_id_1'),
array('id' => 'some_id_2')) as $request) {
$fragment = $dom->createDocumentFragment();
$fragment->appendXML($payloadXmlTemplate);
xml_search_replace($fragment, array('//PayloadElement' => $request['id']));
$payload->appendChild($fragment);
}
Thanks to Francis Avila I came up with the following:
<?php
function xml_search_replace($node, $search_replace_rules) {
if (!is_array($search_replace_rules)) {
return;
}
$xp = new DOMXPath($node->ownerDocument);
foreach ($search_replace_rules as $search_pattern => $replacement) {
foreach ($xp->query($search_pattern, $node) as $matchingNode) {
$matchingNode->nodeValue = $replacement;
}
}
}
$dom = new DOMDocument;
$dom->loadXml(file_get_contents('container.xml'));
$payload = $dom->getElementsByTagName('Payload')->item(0);
xml_search_replace($dom->documentElement, array('//MessageReference' => 'SRV4-ID00000000001'));
$payloadXmlTemplate = file_get_contents('payload_template.xml');
foreach (array(array('id' => 'some_id_1'),
array('id' => 'some_id_2')) as $request) {
$fragment = $dom->createDocumentFragment();
$fragment->appendXML($payloadXmlTemplate);
xml_search_replace($payload->appendChild($fragment),
array('//PayloadElement' => $request['id']));
}
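Two details are worth keeping in mind when scoping the replacement to one inserted payload. An XPath expression that starts with `//` is evaluated from the document root whatever context node is passed to `query()`, and a document fragment is empty once it has been appended. The sketch below is one possible way around both, not the original author's code: it remembers the fragment's children before appending and uses a relative path. The inline XML strings are hypothetical stand-ins for `container.xml` and `payload_template.xml`.

```php
<?php
// Hypothetical stand-ins for container.xml and payload_template.xml.
$dom = new DOMDocument;
$dom->loadXML('<Container><MessageReference/><Payload/></Container>');
$payload = $dom->getElementsByTagName('Payload')->item(0);
$payloadXmlTemplate = '<Request><PayloadElement/></Request>';

function xml_search_replace(DOMNode $node, array $rules) {
    // A DOMDocument has no ownerDocument, so use the node itself in that case.
    $doc = $node instanceof DOMDocument ? $node : $node->ownerDocument;
    $xp = new DOMXPath($doc);
    foreach ($rules as $pattern => $replacement) {
        foreach ($xp->query($pattern, $node) as $match) {
            $match->nodeValue = $replacement;
        }
    }
}

foreach (array('some_id_1', 'some_id_2') as $id) {
    $fragment = $dom->createDocumentFragment();
    $fragment->appendXML($payloadXmlTemplate);
    // Keep references to the children; the fragment is empty after appendChild().
    $inserted = iterator_to_array($fragment->childNodes);
    $payload->appendChild($fragment);
    foreach ($inserted as $child) {
        if ($child instanceof DOMElement) {
            // Relative path: only matches inside the node that was just inserted.
            xml_search_replace($child, array('descendant-or-self::PayloadElement' => $id));
        }
    }
}
echo $dom->saveXML();
```

With the relative path, each inserted `Request` keeps its own id instead of every `PayloadElement` in the document being overwritten on each pass.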

How can I extract all img tag within an anchor tag?

I would like to extract all img tags that are inside an anchor tag using the PHP DOM object.
I am trying it with the code below, but it gets every anchor tag, and the link text comes back empty when the anchor contains only an img tag.
function get_links($url) {
// Create a new DOM Document to hold our webpage structure
$xml = new DOMDocument();
// Load the url's contents into the DOM
@$xml->loadHTMLFile($url);
// Empty array to hold all links to return
$links = array();
//Loop through each <a> tag in the dom and add it to the link array
foreach($xml->getElementsByTagName('a') as $link)
{
$hrefval = '';
if(strpos($link->getAttribute('href'),'www') > 0)
{
//$links[] = array('url' => $link->getAttribute('href'), 'text' => $link->nodeValue);
$hrefval = '#URL#'.$link->getAttribute('href').'#TEXT#'.$link->nodeValue;
$links[$hrefval] = $hrefval;
}
else
{
//$links[] = array('url' => GetMainBaseFromURL($url).$link->getAttribute('href'), 'text' => $link->nodeValue);
$hrefval = '#URL#'.GetMainBaseFromURL($url).$link->getAttribute('href').'#TEXT#'.$link->nodeValue;
$links[$hrefval] = $hrefval;
}
}
foreach($xml->getElementsByTagName('img') as $link)
{
$srcval = '';
if(strpos($link->getAttribute('src'),'www') > 0)
{
//$links[] = array('src' => $link->getAttribute('src'), 'nodval' => $link->nodeValue);
$srcval = '#SRC#'.$link->getAttribute('src').'#NODEVAL#'.$link->nodeValue;
$links[$srcval] = $srcval;
}
else
{
//$links[] = array('src' => GetMainBaseFromURL($url).$link->getAttribute('src'), 'nodval' => $link->nodeValue);
$srcval = '#SRC#'.GetMainBaseFromURL($url).$link->getAttribute('src').'#NODEVAL#'.$link->nodeValue;
$links[$srcval] = $srcval;
}
}
//Return the links
//$links = unsetblankvalue($links);
return $links;
}
This returns all anchor tags and all img tags separately.
$xml = new DOMDocument;
libxml_use_internal_errors(true);
$xml->loadHTMLFile($url);
libxml_clear_errors();
libxml_use_internal_errors(false);
$xpath = new DOMXPath($xml);
foreach ($xpath->query('//a[contains(@href, "www")]/img') as $entry) {
var_dump($entry->getAttribute('src'));
}
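If the link target is needed together with each image, the enclosing anchor can be looked up from the img node. A minimal sketch on made-up sample markup (the URLs and file names are placeholders), using `//a[@href]//img` so that images nested deeper inside a link are found as well:

```php
<?php
// Hypothetical sample markup.
$html = '<html><body>
<a href="http://www.example.com/a"><img src="inside-link.png" alt=""></a>
<a href="http://www.example.com/b">text only</a>
<img src="outside-link.png" alt="">
</body></html>';

libxml_use_internal_errors(true);
$xml = new DOMDocument;
$xml->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($xml);
$images = array();
foreach ($xpath->query('//a[@href]//img') as $img) {
    // ancestor::a[1] is the nearest enclosing anchor of this image.
    $anchor = $xpath->query('ancestor::a[1]', $img)->item(0);
    $images[] = array(
        'href' => $anchor->getAttribute('href'),
        'src'  => $img->getAttribute('src'),
    );
}
print_r($images);
```

The image outside any link and the text-only link are both left out of the result.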
The usage of strpos() function is not correct in the code.
Instead of using
if(strpos($link->getAttribute('href'),'www') > 0)
Use
if(strpos($link->getAttribute('href'),'www')!==false )
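The reason is that `strpos()` returns the position of the match, which is 0 when the string starts with the needle, and returns `false` when there is no match. A short illustration with made-up URLs (`contains_www` is a helper name invented for this example):

```php
<?php
$atStart  = strpos('www.example.com/page', 'www');   // match at position 0
$inMiddle = strpos('http://www.example.com', 'www'); // match at position 7
$missing  = strpos('/relative/path', 'www');         // no match: false

// "> 0" wrongly treats a match at position 0 as "not found".
$looseCheck  = ($atStart > 0);       // false, although 'www' is present
// "!== false" distinguishes position 0 from "not found".
$strictCheck = ($atStart !== false); // true

function contains_www($href) {
    return strpos($href, 'www') !== false;
}

var_dump(contains_www('www.example.com/page')); // bool(true)
var_dump(contains_www('/relative/path'));       // bool(false)
```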
