PHP DOMElement is Immutable. = 'No Modification Allowed Error' - php

I cannot understand why this fails. Does a DOMElement need to be part of a Document?
$domEl = new DOMElement("Item");
$domEl->setAttribute('Something','bla');
Throws exception
> Uncaught exception 'DOMException' with message 'No Modification Allowed Error';
I would have thought I could just create a DOMElement and it would be mutable.

From http://php.net/manual/en/domelement.construct.php
Creates a new DOMElement object. This object is read only. It may be appended to a document, but additional nodes may not be appended to this node until the node is associated with a document. To create a writeable node, use DOMDocument::createElement or DOMDocument::createElementNS.

I needed to pass one instance of \DOMElement to a function in order to add children elements, so I ended up with a code like this:
class FooBar
{
public function buildXml() : string
{
$doc = new \DOMDocument();
$doc->formatOutput = true;
$parentElement = $doc->createElement('parentElement');
$this->appendFields($parentElement);
$doc->appendChild($parentElement);
return $doc->saveXML();
}
protected function appendFields(\DOMElement $parentElement) : void
{
// This will throw "No Modification Allowed Error"
// $el = new \DOMElement('childElement');
// $el->appendChild(new \DOMCDATASection('someValue'));
// but these will work
$el = $parentElement->appendChild(new \DOMElement('childElement1'));
$el->appendChild(new \DOMCdataSection('someValue1'));
$el2 = $parentElement->appendChild(new \DOMElement('childElement2'));
$el2->setAttribute('foo', 'bar');
$el2->appendChild(new \DOMCdataSection('someValue2'));
}
}

Related

PHP Doc comment after unserialization

The ReflectionMethod instance from PHP (http://php.net/manual/en/class.reflectionmethod.php) has the getDocComment method that returns the annotation of a method. This works ok, unless you use unserialized object.
$ref = new ReflectionClass('a');
var_dump(method_exists($ref, 'getDocComment')); //bool(true)
var_dump($ref->getDocComment()); //bool(false)
$ref = unserialize(serialize($ref));
var_dump(method_exists($ref, 'getDocComment')); //bool(true)
var_dump($ref->getDocComment()); //PHP Warning: Uncaught Error: Internal error: Failed to retrieve the reflection object
Is there any way of testing if the ReflectionMethod object has correctly defined doc comment? I mean, I do not care about getting the annotation after serialize/unserialize, but I want to check if calling getDocComment is safe.
Edit: According to responses that advice error handling + fallback, I rephrase the Q.
I have some simple cache of reflections (array of ReflectionMethod objects). Until I use item from that cache, I wold like to chech its correctness. I do NOT want to handle error, I want to "predict error". Awesome would be something like hasDocComment method that does not generate any error, but returns only true/false within any ReflectionMethod object state.
The general approach of serializing reflection objects is wrong. There exists a PHP Bug report for it, but it has been set to "irrelevant":
https://bugs.php.net/bug.php?id=70719
The reason is, that you cannot connect a reflection object back to its class again, because you would have to deal with source code changes and all kinds of stuff. What you should do instead is, to serialize the name of the class and generate a NEW reflection object from that class, when you unserialize.
Code Example:
class A { }
$ref = new ReflectionClass('A');
var_dump(method_exists($ref, 'getDocComment'));
// serialize only the class name
$refClass = unserialize(serialize($ref->getName()));
// get a new reflection object from that class ...
$ref = new ReflectionClass($refClass);
var_dump(method_exists($ref, 'getDocComment'));
// If you want to serialize an object
$a = new A();
$a2 = unserialize(serialize($a));
$ref = new ReflectionClass(get_class($a2));
var_dump(method_exists($ref, 'getDocComment'));
If you need to be able to handle errors, you can try/catch the execution block. Since alt-php71-7.1.0-1 (which you seem to be using), this will throw an instance of Error instead of simply a Fatal Error, which allows you to do error handling.
<?php
class A { }
$ref = new ReflectionClass('A');
var_dump(method_exists($ref, 'getDocComment')); //bool(true)
var_dump($ref->getDocComment()); //bool(false)
// serialize only the class name
$refClass = unserialize(serialize($ref));
try {
$refClass->getDocComment();
// do your work
}
catch (Error $e) {
echo "Malformed Reflection object: ".$e->getMessage();
}
Demo
And since you can still get the class name from the malformed Reflection instance, you can instantiate a new one right in your catch block:
<?php
class A { }
$ref = new ReflectionClass('A');
var_dump(method_exists($ref, 'getDocComment')); //bool(true)
var_dump($ref->getDocComment()); //bool(false)
// serialize only the class name
$refClass = unserialize(serialize($ref));
try {
$refClass->getDocComment();
}
catch (Error $e) {
$recoveredRef = new ReflectionClass($refClass->getName());
var_dump($recoveredRef);
var_dump($recoveredRef->getDocComment()); // works correctly
echo "Malformed Reflection object, but recovered: ".$e->getMessage();
}
Demo

PHP DOMDocument loading custom tags [duplicate]

I need to parse some HTML files, however, they are not well-formed and PHP prints out warnings to. I want to avoid such debugging/warning behavior programatically. Please advise. Thank you!
Code:
// create a DOM document and load the HTML data
$xmlDoc = new DomDocument;
// this dumps out the warnings
$xmlDoc->loadHTML($fetchResult);
This:
#$xmlDoc->loadHTML($fetchResult)
can suppress the warnings but how can I capture those warnings programatically?
Call
libxml_use_internal_errors(true);
prior to processing with with $xmlDoc->loadHTML()
This tells libxml2 not to send errors and warnings through to PHP. Then, to check for errors and handle them yourself, you can consult libxml_get_last_error() and/or libxml_get_errors() when you're ready:
libxml_use_internal_errors(true);
$dom->loadHTML($html);
$errors = libxml_get_errors();
foreach ($errors as $error) {
// handle the errors as you wish
}
To hide the warnings, you have to give special instructions to libxml which is used internally to perform the parsing:
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();
The libxml_use_internal_errors(true) indicates that you're going to handle the errors and warnings yourself and you don't want them to mess up the output of your script.
This is not the same as the # operator. The warnings get collected behind the scenes and afterwards you can retrieve them by using libxml_get_errors() in case you wish to perform logging or return the list of issues to the caller.
Whether or not you're using the collected warnings you should always clear the queue by calling libxml_clear_errors().
Preserving the state
If you have other code that uses libxml it may be worthwhile to make sure your code doesn't alter the global state of the error handling; for this, you can use the return value of libxml_use_internal_errors() to save the previous state.
// modify state
$libxml_previous_state = libxml_use_internal_errors(true);
// parse
$dom->loadHTML($html);
// handle errors
libxml_clear_errors();
// restore
libxml_use_internal_errors($libxml_previous_state);
Setting the options "LIBXML_NOWARNING" & "LIBXML_NOERROR" works perfectly fine too:
$dom->loadHTML($html, LIBXML_NOWARNING | LIBXML_NOERROR);
You can install a temporary error handler with set_error_handler
class ErrorTrap {
protected $callback;
protected $errors = array();
function __construct($callback) {
$this->callback = $callback;
}
function call() {
$result = null;
set_error_handler(array($this, 'onError'));
try {
$result = call_user_func_array($this->callback, func_get_args());
} catch (Exception $ex) {
restore_error_handler();
throw $ex;
}
restore_error_handler();
return $result;
}
function onError($errno, $errstr, $errfile, $errline) {
$this->errors[] = array($errno, $errstr, $errfile, $errline);
}
function ok() {
return count($this->errors) === 0;
}
function errors() {
return $this->errors;
}
}
Usage:
// create a DOM document and load the HTML data
$xmlDoc = new DomDocument();
$caller = new ErrorTrap(array($xmlDoc, 'loadHTML'));
// this doesn't dump out any warnings
$caller->call($fetchResult);
if (!$caller->ok()) {
var_dump($caller->errors());
}

Catchable fatal error Object of class DOMDocument could not be converted to string

I am lost with this error:
Catchable fatal error Object of class DOMDocument could not be converted to string
This is my PHP code:
<?php
require_once('includes/mysqlConnect.php');
require_once('includes/utility.php');
//calling utility
$utility = new Utility();
//Creating a connection
$connection= new mySQL();
$connection->connect();
$getContent= file_get_contents('http://www.example.com/');
//echo $getContent;
//create a new DOMDocument Object
$doc= new DOMDocument();
//load HTML into DOMDoc
libxml_use_internal_errors(true);
$doc->loadHTML($getContent);
$utility->removeElementsByTagName('script', $doc);
$utility->removeElementsByTagName('style', $doc);
$utility->removeElementsByTagName('link', $doc);
echo $doc->saveHTML();
//Insert HTMl to DB
try
{
$result=$connection->db_query("CALL finalaggregator.insert_html('$doc')");
if ($result==0){
echo "<span style='color:red;'>Error! Data Saving Processes Unsuccessful</span>";
}
else {
echo "<span style='color:green;'>Data Successfully Saved!</span>";
}
}
catch (Exception $e){
echo "<span color='color:red;'>Error in Storing Data! Please Check Store Procedure.</span>";
}
?>
But always end up with showing
DOM Document could not be converted to string on line 29
I want to store the the value of $doc into a database.
When I am trying to to call the stored procedure from Mysql:
call finalaggregator.insert_html("<p>Testing123</p>");
it is working fine.
Please help me. I am new to php.
My stored procedure is as follow:
CREATE DEFINER=`root`#`localhost` PROCEDURE `insert_html`( IN HTML LONGTEXT)
BEGIN
INSERT INTO finalaggregator.site_html (html) VALUES(HTML);
END
You cannot simply use the instance of DOMDocument as a string in your query. You have to explicitly convert it to an HTML string first:
$html = $doc->saveHTML();
$result = $connection->db_query("CALL finalaggregator.insert_html('$html')");

libxml error handler with OOP

I need catch libxml errors. But I want to use my class for this. I know about libxml_get_errors and other function. But I need something like libxml_set_erroc_class("myclass") and in all case for error will call my class. I don't want in each case after use $dom->load(...) create some construction like foreach(libxml_get_errors as $error) {....}. Can you help me?
libxml errors are mostly generated when reading or writing xml document because automatic validation is done.
So this is where you should concentrate and you don't need to overwrite the set_error_handler .. Here is a prove of concept
Use Internal Errors
libxml_use_internal_errors ( true );
Sample XML
$xmlstr = <<< XML
<?xml version='1.0' standalone='yes'?>
<movies>
<movie>
<titles>PHP: Behind the Parser</title>
</movie>
</movies>
XML;
echo "<pre>" ;
I guess this is an example of the kind of what you want to achieve
try {
$loader = new XmlLoader ( $xmlstr );
} catch ( XmlException $e ) {
echo $e->getMessage();
}
XMLLoader Class
class XmlLoader {
private $xml;
private $doc;
function __construct($xmlstr) {
$doc = simplexml_load_string ( $xmlstr );
if (! $doc) {
throw new XmlException ( libxml_get_errors () );
}
}
}
XmlException Class
class XmlException extends Exception {
private $errorMessage = "";
function __construct(Array $errors) {
$x = 0;
foreach ( $errors as $error ) {
if ($error instanceof LibXMLError) {
$this->parseError ( $error );
$x ++;
}
}
if ($x > 0) {
parent::__construct ( $this->errorMessage );
} else {
parent::__construct ( "Unknown Error XmlException" );
}
}
function parseError(LibXMLError $error) {
switch ($error->level) {
case LIBXML_ERR_WARNING :
$this->errorMessage .= "Warning $error->code: ";
break;
case LIBXML_ERR_ERROR :
$this->errorMessage .= "Error $error->code: ";
break;
case LIBXML_ERR_FATAL :
$this->errorMessage .= "Fatal Error $error->code: ";
break;
}
$this->errorMessage .= trim ( $error->message ) . "\n Line: $error->line" . "\n Column: $error->column";
if ($error->file) {
$this->errorMessage .= "\n File: $error->file";
}
}
}
Sample Output
Fatal Error 76: Opening and ending tag mismatch: titles line 4 and title
Line: 4
Column: 46
I hope this helps
Thanks
There is no facility to do this directly. Your options would be:
extend the PHP class that uses libxml and wrap your custom error handling logic around the stock implementation (not that good), or
write your own class that aggregates an instance of that PHP class and create your own public interface around it (better, because you are in control of the public interface and you don't risk problems if the PHP class is extended in the future), or
replace the global error handler for the duration of your parsing and restore it afterwards (not as powerful, may be problematic if your code calls into other code, however easier to do)
Solutions 1 and 2 have the advantage that they do not modify the behavior of any other code in your application no matter what.
edit (confused exceptions with errors):
Use set_exception_handler to catch global exceptions.
Have your code throw these exceptions in cases like $dom->load(). Because libxml doesn't seem to throw exceptions on its own, your other option is to create a wrapper around it, use the wrapper in your code and have it check libxml for errors and throw them in necessary cases.
Handle the exceptions inside "myclass".
Be wary though that the set_exception_handler will be catching all your exceptions.
Here's an example of what you could do:
//inheritance example (composition, though, would be better)
class MyDOMWrapper extends DOMDocument{
public function load($filename, $options = 0){
$bool = parent::load($filename, $options);
if (!$bool){
throw new MyDOMException('Shit happens. Feeling lucky.', 777);
}
}
}
class MyDOMException extends DOMException{
public $libxml;
public function __construct($message, $code){
parent::__construct($message, $code);
$this->libxml = libxml_get_errors();
}
}
class MyDOMExceptionHandler{
public static function handle($e){
//handle exception globally
}
}
libxml_use_internal_errors(true);
set_exception_handler(array('MyDOMErrorHandler', 'handle'));
//global exception handler
$myDom = new MyDOMWrapper();
$myDom->load('main.xml');
//when special handling is necessary
try {
$myDom = new MyDOMWrapper();
$myDom->load('main.xml');
} catch (MyDOMException $e) {
//handle or
throw $e; //rethrow
}

loadXML unhandleable error

I'm using PEAR XML_Feed_Parser.
I have some bad xml that I give to it and get error.
DOMDocument::loadXML(): Input is not proper UTF-8, indicate encoding !
Bytes: 0xE8 0xCF 0xD3 0xD4 in Entity, line: 7
It's actually html in wrong encoding - KOI8-R.
It's ok to get error but I can't handle it!
When I create new XML_Feed_Parser instance with
$feed = new XML_Feed_Parser($xml);
it calls to __construct() that looks like that
$this->model = new DOMDocument;
if (! $this->model->loadXML($feed)) {
if (extension_loaded('tidy') && $tidy) {
/* tidy stuff */
}
} else {
throw new Exception('Invalid input: this is not valid XML');
}
Where we can see that if loadXML() failed then it throw exception.
I want to catch error from loadXML() to skip bad XMLs and notify user. So i wrapped my code with try-catch like that
try
{
$feed = new XML_Feed_Parser($xml);
/* ... */
}
catch(Exception $e)
{
echo 'Feed invalid: '.$e->getMessage();
return False;
}
But even after that I get that error
DOMDocument::loadXML(): Input is not proper UTF-8, indicate encoding !
Bytes: 0xE8 0xCF 0xD3 0xD4 in Entity, line: 7
I've read about loadXML() and found that
If an empty string is passed as the source, a warning will be generated. This warning is not generated by libxml and cannot be handled using libxml's error handling functions.
But somehow instead of warning i get error that halts my application. I've written my error handler and I saw that this is really warning ($errno is 2).
So i see 2 solutions:
Revert warnings to warnings - do not
treat them like errors. (Google
doesn't help me here). After that
handle False returned from loadXML.
Somehow catch that error.
Any help?
libxml_use_internal_errors(true) solved my problem. It made libxml to use normal errors so i can catch False from loadXML().
Try this one:
$this->model = new DOMDocument;
$converted = mb_convert_encoding($feed, 'UTF-8', 'KOI8-R');
if (! $this->model->loadXML($converted)) {
if (extension_loaded('tidy') && $tidy) {
/* tidy stuff */
}
} else {
throw new Exception('Invalid input: this is not valid XML');
}
or you can do it without need to modify XML_Feed_Parser like this:
$xml = mb_convert_encoding($loaded_xml, 'UTF-8', 'KOI8-R');
$feed = new XML_Feed_Parser($xml);

Categories