beginner attempting to read xml into php - php

I have an XML feed located here that I am trying to read into a PHP script, then cycle through the <packages> and sum the <downloads>. I've attempted to do this using DOMDocument, but have so far failed.
The basic method I've been trying to use is as follows:
<?php
$dom = new DomDocument;
$dom->loadXML('http://www.phogue.net/feed');
$packages = $dom->getElementsByTagName('package');
foreach($packages as $item)
{
echo $item->getAttribute('uid').'<br>';
}
?>
The above code is meant to just print out the name of each item, but it's not working. I am currently getting the following error:
Warning: DOMDocument::loadXML() [domdocument.loadxml]: Start tag expected, '<' not found in Entity, line: 1 in /home/a8744502/public_html/userbar.php on line 3
WORKING CODE:
<?php
$dom = new DomDocument;
$dom->load('http://www.phogue.net/feed/');
$package = $dom->getElementsByTagName('package');
$value=0;
foreach ($package as $plugin) {
$downloads = $plugin->getElementsByTagName("downloads");
$download = $downloads->item(0)->nodeValue;
$authors = $plugin->getElementsByTagName("author");
$author = $authors->item(0)->nodeValue;
if($author == "Zaeed")
{
$value += $download;
}
}
echo $value;
?>

DOMDocument::loadXML() expects a string of XML. Try DOMDocument::load() instead - http://www.php.net/manual/en/domdocument.load.php
Keep in mind that to open an XML file via HTTP, you will need the appropriate wrapper enabled.
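To make the difference concrete, here is a minimal sketch. The sample XML and its element layout (package, author, downloads) are assumptions for illustration and may not match the real feed; sumDownloads() is a hypothetical helper, not part of DOMDocument.

```php
<?php
// Hypothetical sample data; the real feed's structure may differ.
$xml = <<<XML
<packages>
  <package uid="a"><author>Zaeed</author><downloads>10</downloads></package>
  <package uid="b"><author>Other</author><downloads>5</downloads></package>
  <package uid="c"><author>Zaeed</author><downloads>7</downloads></package>
</packages>
XML;

// Hypothetical helper: sums <downloads> over all <package> elements by one author.
function sumDownloads(DOMDocument $dom, string $author): int
{
    $total = 0;
    foreach ($dom->getElementsByTagName('package') as $package) {
        $name  = $package->getElementsByTagName('author')->item(0);
        $count = $package->getElementsByTagName('downloads')->item(0);
        if ($name !== null && $count !== null && $name->nodeValue === $author) {
            $total += (int) $count->nodeValue;
        }
    }
    return $total;
}

$dom = new DOMDocument();
$dom->loadXML($xml);        // loadXML() takes the XML markup itself
// $dom->load('feed.xml');  // load() takes a filename or URL instead
$total = sumDownloads($dom, 'Zaeed');
echo $total, "\n";
```

Passing a URL to loadXML() makes the parser treat the URL text as markup, which is why it complains that no '<' was found.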

You have an open parenthesis at the beginning of your echo.

Related

PHP How to avoid this warning: DOMDocument::loadHTML(): Invalid char in CDATA

I'm trying to collect some info from a web service, but I'm having issues with the CDATA Section of a page, because everything goes right when I use something like this:
$url = 'http://www.example.com';
$content = file_get_contents($url);
$doc = new DOMDocument();
$doc->loadHTML($content);
foreach($doc->getElementsByTagName('h3') as $subtitle) {
echo $subtitle->textContent; //The output is the Subtitle/s.
}
But when the page contains CDATA sections there is a problem with this error on the line $doc->loadHTML($content).
Warning: DOMDocument::loadHTML(): Invalid char in CDATA
I've seen over here a solution that I tried to implement without any success.
function sanitize_html($content) {
if (!$content) return '';
$invalid_characters = '/[^\x9\xa\x20-\xD7FF\xE000-\xFFFD]/';
return preg_replace($invalid_characters,'', $content);
}
$url = 'http://www.example.com';
$content = file_get_contents($url);
$cleanContent = sanitize_html($content);
$doc = new DOMDocument();
$doc->loadHTML($cleanContent); //Warning: DOMDocument::loadHTML(): htmlParseEntityRef: no name in Entity
But I got this other error:
Warning: DOMDocument::loadHTML(): htmlParseEntityRef: no name in Entity
What could be a good way to deal with the CDATA sections of a page? Greetings.
The solution is to replace the & symbol with &amp;
or, if you must keep that & as it is, you could enclose it in a CDATA section: <![CDATA[ - ]]>
Try adding PCLZIP before load IOFactory as shown:
require_once '/Classes/PHPExcel.php';
\PHPExcel_Settings::setZipClass(\PHPExcel_Settings::PCLZIP);
Add libxml_use_internal_errors(true) and libxml_clear_errors(); this worked for me. The code is shown in this screenshot:
https://i.stack.imgur.com/6MN4H.png
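The screenshot's code is not available as text, so the following is only a sketch of how those two libxml functions are commonly combined with loadHTML(). The sample markup is made up and stands in for the downloaded page.

```php
<?php
// Sample markup with a bare ampersand; stands in for downloaded content.
$html = '<html><body><h3>Fish & Chips</h3></body></html>';

$previous = libxml_use_internal_errors(true); // collect errors instead of emitting warnings
$doc = new DOMDocument();
$doc->loadHTML($html);
$problems = libxml_get_errors();              // inspect these if you need the details
libxml_clear_errors();                        // empty the error buffer
libxml_use_internal_errors($previous);        // restore the earlier setting

$subtitles = [];
foreach ($doc->getElementsByTagName('h3') as $subtitle) {
    $subtitles[] = $subtitle->textContent;
}
print_r($subtitles);
```

This hides the warnings without fixing the markup; the parser still recovers as best it can.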

PHP DomDocument failing to handle quotes in a url

When I try to open a url like that :
http://api.anghami.com/rest/v1/GETsearch.view?sid=11754134061397734622103190992&query=Can't Remember to Forget You Shakira&searchtype=SONG&ook&songCount=1
containing a quote with the browser everything works fine and the output is good as an xml
But when I try to call it from a php file:
$url = "http:/api.anghami.com/rest/v1/GETsearch.view?sid=11754134061397734622103190992&query=Can't Remember to Forget You Shakira&searchtype=SONG&ook&songCount=1"
//using DOMDocument for parsing.
$data = new DOMDocument();
// loading the xml from Anghami API.
if($data->load("$url")){// Getting the Tag song.
foreach ($data->getElementsByTagName('song') as $searchNode)
{
$count++;
$n++;
//Getting the information of Anghami Song from the XML file.
$valueID = $searchNode->getAttribute('id');
$titleAnghami = $searchNode->getAttribute('title');
$album = $searchNode->getAttribute('album');
$albumID = $searchNode->getAttribute('albumID');
$artistAnghami = $searchNode->getAttribute('artist');
$track = $searchNode->getAttribute('track');
$year = $searchNode->getAttribute('year');
$coverArt = $searchNode->getAttribute('coverArt');
$ArtistArt = $searchNode->getAttribute('ArtistArt');
$size = $searchNode->getAttribute('size');
}
}
I get this error:
'Warning: DOMDocument::load(): I/O warning : failed to load external entity /var/www/html/http:/api.anghami.com/rest/v1/GETsearch.view?sid=11754134061397734622103190992&query=Can't Remember to Forget You Shakira&searchtype=SONG&ook&songCount=1" in /var/www/html/search.php on line 93'
Can anyone help please?
@Fracsi is correct: the URL needs to start with http:// not http:/
The other problem is that the XML has a default namespace (defined with the xmlns attribute on the root element), so you need to use
$data->getElementsByTagNameNS('http://api.anghami.com/rest/v1', 'song')
to select all the "song" elements.
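A minimal sketch of the namespace-aware lookup follows. The XML string is a hypothetical stand-in for the API response (only the namespace URI and the "song" element with a "title" attribute come from the thread), so the real document may be shaped differently.

```php
<?php
// Hypothetical response with a default namespace declared on the root element.
$xml = '<response xmlns="http://api.anghami.com/rest/v1">'
     . '<song id="1" title="First"/><song id="2" title="Second"/>'
     . '</response>';

$data = new DOMDocument();
$data->loadXML($xml);

$ns = 'http://api.anghami.com/rest/v1';
$titles = [];
// Match elements by namespace URI plus local name.
foreach ($data->getElementsByTagNameNS($ns, 'song') as $song) {
    $titles[] = $song->getAttribute('title');
}
print_r($titles);
```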

Domdocument Load not loading

I am trying to load xml file through url(i.e. rss).
But when I Use
$doc = new DOMDocument();
$doc->load($url);
if($doc->load($url,LIBXML_NOWARNING)===false)
{
echo "Hello";
//echo @$doc->load($url,LIBXML_NOWARNING);
//exit;
$error = $doc->load($url);
print_r($error);exit;
}
It only prints Hello.
No warning is displayed for line 2.
Please help me find out which error occurs, as I am getting no other output.
Remove exit; from the code after echo "Hello"; that is the reason
That is because the rendered content is not visible. Try pressing Ctrl+U on your browser.
Also, instead of print_r try with var_dump
$error = $doc->load($url);
var_dump($error);
EDIT :
So it seems like your $doc->load failed in the first place. You need to change your if statement to
if($doc->load($url,LIBXML_NOWARNING)===true) // Replaced false with true.
or simply
if($doc->load($url,LIBXML_NOWARNING))
Your XML load failed that's why it went inside the if statement. Check whether the URL is spelled right or check if the URL really exists.
Try using libxml_use_internal_errors() to capture XML parsing errors:
<?php
$doc = new DOMDocument();
$doc->recover = true;
libxml_use_internal_errors(true);
$url = 'http://page2rss.com/rss/91a83628a27c43b6ab4f0b3959f69f5a';
$doc->load($url);
$errors = libxml_get_errors();
foreach ($errors as $error) {
printf("Error %d at line %d, column %d:\n\t%s\n",
$error->code, $error->line, $error->column, $error->message);
}
libxml_use_internal_errors(false);
// Error 9 at line 82, column 155:
// Input is not proper UTF-8, indicate encoding !
// Bytes: 0xAE 0x20 0x28 0x52

simplexml load on google weather api problem

Hi, I have been having problems with the Google Weather API, getting errors such as: Warning: simplexml_load_string() [function.simplexml-load-string]: Entity: line 2: parser error ....
I tried using the original authors' scripts (thinking the problem was in my edited script), but I still get these errors. I tried two:
//komunitasweb.com/2009/09/showing-the-weather-with-php-and-google-weather-api/
and
//tips4php.net/2010/07/local-weather-with-php-and-google-weather/
The weird part is that sometimes it fixes itself and then goes back to the error again. I have been using it for months without any problem; this only started yesterday. The authors' demo pages are working, even though I have exactly the same code. Any help, please?
this is my site http://j2sdesign.com/weather/widgetlive1.php
@Mike I added your code
<?
$xml = file_get_contents('http://www.google.com/ig/api?weather=jakarta'); if (! simplexml_load_string($xml)) { file_put_contents('malformed.xml', $xml); }
$xml = simplexml_load_file('http://www.google.com/ig/api?weather=jakarta');
$information = $xml->xpath("/xml_api_reply/weather/forecast_information");
$current = $xml->xpath("/xml_api_reply/weather/current_conditions");
$forecast_list = $xml->xpath("/xml_api_reply/weather/forecast_conditions");
?>
and made a list of the errors, but I can't seem to see the error, because it keeps fixing itself and then after some time goes back to the error again.
Here is the content of the file:
<?php
include_once('simple_html_dom.php');
// create doctype
$dom = new DOMDocument("1.0");
// display document in browser as plain text
// for readability purposes
//header("Content-Type: text/plain");
// create root element
$xmlProducts = $dom->createElement("products");
$dom->appendChild($xmlProducts);
$pages = array(
    'http://myshop.com/small_houses.html',
    'http://myshop.com/medium_houses.html',
    'http://myshop.com/large_houses.html'
);
foreach($pages as $page) {
    $product = array();
    $source = file_get_html($page);
    foreach($source->find('img') as $src) {
        if (strpos($src->src,"http://myshop.com") === false) {
            $product['image'] = "http://myshop.com/$src->src";
        }
    }
    foreach($source->find('p[class*=imAlign_left]') as $description) {
        $product['description'] = $description->innertext;
    }
    foreach($source->find('span[class*=fc3]') as $title) {
        $product['title'] = $title->innertext;
    }
    //debug purposes!
    echo "Current Page: " . $page . "\n";
    print_r($product);
    echo "\n\n\n"; //Clear separator
}
?>
When simplexml_load_string() fails, you need to store the data you're trying to load somewhere for review. Examining the data is the first step in diagnosing what is causing the error.
$xml = file_get_contents('http://example.com/file.xml');
if (!simplexml_load_string($xml)) {
file_put_contents('malformed.xml', $xml);
}
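Alongside saving the raw data, the parser's own messages can help locate the problem. The sketch below uses a deliberately malformed sample string rather than the real API response, and collects the libxml errors when parsing fails.

```php
<?php
// Deliberately malformed sample: <b> is never closed.
$xml = '<a><b></a>';

libxml_use_internal_errors(true);   // keep parser errors in a buffer
$parsed = simplexml_load_string($xml);
$messages = [];
if ($parsed === false) {
    foreach (libxml_get_errors() as $error) {
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }
    libxml_clear_errors();
}
libxml_use_internal_errors(false);  // back to normal warning output
print_r($messages);
```

For an intermittent failure like the one described, logging these messages together with the saved response shows what the server sent when parsing broke.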

Warning: DOMDocument::loadHTML(): htmlParseEntityRef: expecting ';' in Entity,

$html = file_get_contents("http://www.somesite.com/");
$dom = new DOMDocument();
$dom->loadHTML($html);
echo $dom;
throws
Warning: DOMDocument::loadHTML(): htmlParseEntityRef: expecting ';' in Entity,
Catchable fatal error: Object of class DOMDocument could not be converted to string in test.php on line 10
To suppress the warning, you can use libxml_use_internal_errors(true):
// create new DOMDocument
$document = new \DOMDocument('1.0', 'UTF-8');
// set error level
$internalErrors = libxml_use_internal_errors(true);
// load HTML
$document->loadHTML($html);
// Restore error level
libxml_use_internal_errors($internalErrors);
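I would bet that if you looked at the source of http://www.somesite.com/ you would find special characters that haven't been converted to HTML entities. Maybe something like this:
<a href="/script.php?foo=bar&hello=world">link</a>
Should be
<a href="/script.php?foo=bar&amp;hello=world">link</a>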
$dom->@loadHTML($html);
This is incorrect, use this instead:
@$dom->loadHTML($html);
There are 2 errors: the second occurs because $dom is not a string but an object, and thus cannot be "echoed". The first is a warning from loadHTML, caused by invalid syntax in the HTML document being loaded (probably an & (ampersand) used as a parameter separator and not masked as an entity with &amp;).
You can ignore and suppress this error message (not the error, just the message!) by calling the function with the error control operator "@" (http://www.php.net/manual/en/language.operators.errorcontrol.php):
@$dom->loadHTML($html);
The reason for your fatal error is DOMDocument does not have a __toString() method and thus can not be echo'ed.
You're probably looking for
echo $dom->saveHTML();
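Putting the two fixes together, a minimal sketch might look like this. The markup is a made-up sample with an unescaped ampersand standing in for the fetched page, and warnings are buffered with libxml_use_internal_errors() rather than silenced with the error control operator.

```php
<?php
// Sample markup with an unescaped ampersand in the href (illustrative only).
$html = '<html><body><a href="/script.php?foo=bar&hello=world">link</a></body></html>';

$previous = libxml_use_internal_errors(true); // buffer parser warnings
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$output = $dom->saveHTML();   // serialize the document back to an HTML string
echo $output;
```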
Regardless of the echo (which would need to be replaced with print_r or var_dump), if an exception is thrown the object should stay empty:
DOMNodeList Object
(
)
Solution
Set recover to true, and strictErrorChecking to false
$content = file_get_contents($url);
$doc = new DOMDocument();
$doc->recover = true;
$doc->strictErrorChecking = false;
$doc->loadHTML($content);
Use php's entity-encoding on the markup's contents, which is a most common error source.
replace the simple
$dom->loadHTML($html);
with the more robust ...
libxml_use_internal_errors(true);
if (!$dom->loadHTML($html))
{
$errors="";
foreach (libxml_get_errors() as $error) {
$errors.=$error->message."<br/>";
}
libxml_clear_errors();
print "libxml errors:<br>$errors";
return;
}
$html = file_get_contents("http://www.somesite.com/");
$dom = new DOMDocument();
$dom->loadHTML(htmlspecialchars($html));
echo $dom;
try this
I know this is an old question, but if you ever want to fix the malformed '&' signs in your HTML, you can use code similar to this:
$page = file_get_contents('http://www.example.com');
$page = preg_replace('/\s+/', ' ', trim($page));
fixAmps($page, 0);
$dom->loadHTML($page);
function fixAmps(&$html, $offset) {
$positionAmp = strpos($html, '&', $offset);
$positionSemiColumn = strpos($html, ';', $positionAmp+1);
$string = substr($html, $positionAmp, $positionSemiColumn-$positionAmp+1);
if ($positionAmp !== false) { // If an '&' can be found.
if ($positionSemiColumn === false) { // If no ';' can be found.
$html = substr_replace($html, '&amp;', $positionAmp, 1); // Replace straight away.
} else if (preg_match('/&(#[0-9]+|[A-Z|a-z|0-9]+);/', $string) === 0) { // If a standard escape cannot be found.
$html = substr_replace($html, '&amp;', $positionAmp, 1); // This means we need to escape the '&' sign.
fixAmps($html, $positionAmp+5); // Recursive call from the new position.
} else {
fixAmps($html, $positionAmp+1); // Recursive call from the new position.
}
}
}
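A non-recursive alternative is a single regular expression with a negative lookahead. This is a sketch, not a full HTML-aware fix: escapeBareAmpersands() is a hypothetical helper, and it treats anything shaped like an entity (name, decimal, or hex reference followed by ';') as already escaped.

```php
<?php
// Hypothetical helper: escapes every "&" that does not already start an entity.
function escapeBareAmpersands(string $html): string
{
    // The lookahead skips named (&amp;), decimal (&#38;) and hex (&#x26;) references.
    $pattern = '/&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)/';
    return preg_replace($pattern, '&amp;', $html);
}

$fixed = escapeBareAmpersands('<a href="?a=1&b=2">Tom &amp; Jerry &#38; co & more</a>');
echo $fixed, "\n";
```

Because it makes one pass over the string, it avoids deep recursion on pages that contain many ampersands.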
Another possible solution is
$sContent = htmlspecialchars($sHTML);
$oDom = new DOMDocument();
$oDom->loadHTML($sContent);
echo html_entity_decode($oDom->saveHTML());
Another possible solution: maybe your file is an ASCII-type file; just change the type of your files.
Even with this warning my code works fine, so I just suppressed all warning messages with this statement at line 1:
<?php error_reporting(E_ERROR); ?>
