I have a script that scrapes some old HTML. It does about 1000 pages a day, and every so often it chokes for some reason and throws up the following error:
PHP Catchable fatal error: Argument 1 passed to DOMXPath::__construct() must be an instance of DOMDocument, null given, called in /var/scraper/autotrader/inc/QueryPath/QueryPath/CSS/DOMTraverser.php on line 417 and defined in /var/scraper/autotrader/inc/QueryPath/QueryPath/CSS/DOMTraverser.php on line 467
At first I thought the error was generated when htmlqp($html) was called, but I wrapped that call in a try {} block and it didn't catch anything:
UPDATE:
I've found the offending line of code by using # to see when the script would terminate without error. It's this line:
try {
$items = $html->find('.searchResultHeader')->find('.vehTitle'); //this one
} catch (Exception $e) {
var_dump(get_class($e));
echo 'big dump'.$e->getTraceAsString();
}
When it bombs out, it doesn't even echo 'big dump', so it really doesn't seem to be catching it.
I'm wondering if this is maybe a fault with QueryPath's error handling rather than my own?
This:
$html->find('.searchResultHeader')->find('.vehTitle');
is the same as this:
$html->find('.searchResultHeader .vehTitle');
But without the risk of calling null->find();
If you really want to do it in 2 steps, use an if, not a try:
if($el = $html->find('.searchResultHeader')) $items = $el->find('.vehTitle');
Or maybe a ternary:
$items = ($el = $html->find('.searchResultHeader')) ? $el->find('.vehTitle') : null;
It is not being caught because a standard try/catch block does not catch errors of this type. To handle a 'Catchable' fatal error, you need to register an error handler with set_error_handler() that covers E_RECOVERABLE_ERROR.
See also: How can I catch a “catchable fatal error” on PHP type hinting?
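A rough sketch of the error-handler approach, in case it helps. The names countItems and tryCount are made up for illustration, and ArrayObject only stands in for whatever class the type hint names. The handler rethrows PHP errors as ErrorException, which is what makes an E_RECOVERABLE_ERROR visible to catch on PHP 5. On PHP 7 and later the same type-hint failure is thrown by the engine as a TypeError, so the sketch catches both.

```php
<?php
// Rethrow PHP errors as ErrorException so that try/catch can see them.
set_error_handler(function ($severity, $message, $file, $line) {
    if (!(error_reporting() & $severity)) {
        return false; // not covered by error_reporting: fall through to PHP's handler
    }
    throw new ErrorException($message, 0, $severity, $file, $line);
});

// Hypothetical function with a class type hint, standing in for library code.
function countItems(ArrayObject $items)
{
    return count($items);
}

// Hypothetical wrapper that reports what happened instead of dying.
function tryCount($items)
{
    try {
        return 'ok: ' . countItems($items);
    } catch (ErrorException $e) { // PHP 5: E_RECOVERABLE_ERROR converted by the handler
        return 'caught ' . get_class($e);
    } catch (TypeError $e) {      // PHP 7+: thrown by the engine itself
        return 'caught ' . get_class($e);
    }
}

echo tryCount(new ArrayObject(array(1, 2))), "\n";
echo tryCount(null), "\n";
```

The handler is global, so it also turns ordinary warnings and notices into exceptions. That may or may not be what you want in a long-running scraper.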
Related
I'd like to be able to catch an exception and continue with the execution of the subsequent functions (and possibly log an error in the catch section). In the code sample below, there are cases where $html->find doesn't find the element and an 'undefined offset' error is raised. In such cases, the entire script fails. I don't want to test specifically for this error, but rather handle any error that may occur within the code block in the try section.
public function parsePage1($provider)
{
$path = $this->getFile($provider);
$link = $this->links[$provider];
if (file_exists($path)) {
$string = file_get_contents($path);
$html = \HTMLDomParser::str_get_html($string);
$wrapper = $html->find('.classToLookFor')[0];
unset($string);
}
}
try {
$this->parsePage1('nameOfProvider');
} catch(Exception $e) {
// continue...
}
try {
$this->parsePage2('nameOfProvider');
} catch(Exception $e) {
// continue...
}
No, there is no way to make the code within the try block continue past an exception. An exception terminates the function just like a return would; there is no way to restore the state of the function afterwards.
Instead, avoid triggering the error in the first place:
$wrappers = $html->find('.classToLookFor'); # <-- no [0]!
if (count($wrappers)) {
$wrapper = $wrappers[0];
...
}
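To illustrate the point about an exception ending the function like a return would, here is a small self-contained sketch. The steps function and the $log array are made up for this example. The line after the throw never runs, but the script carries on after the catch block.

```php
<?php
// Hypothetical function that fails halfway through.
function steps(array &$log)
{
    $log[] = 'before';
    throw new RuntimeException('stop');
    $log[] = 'after'; // never reached: the throw ends the function
}

$log = array();
try {
    steps($log);
} catch (Exception $e) {
    $log[] = 'caught';
}
$log[] = 'continued'; // execution resumes here, after the catch block

echo implode(', ', $log), "\n";
```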
Just to be clear, the 'error' in this case was a notice. If your error_reporting level does not include notices, which is typically the case in production, your code will continue past that point.
That said, notices and warnings are there to prompt developers to add checks for expected input, as in duskwuff's example.
Unfortunately, duskwuff's answer is problematic on PHP 7.2 and later, because count() expects either an array or an object that implements Countable.
With the newest version you will get a Warning:
Warning: count(): Parameter must be an array or an object that implements Countable in
With count() alone you are back where you started. A simple fix is to add an is_array() check:
$wrappers = $html->find('.classToLookFor'); # <-- no [0]!
if (is_array($wrappers) && count($wrappers)) {
$wrapper = $wrappers[0];
...
}
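If this check shows up in many places, it could be pulled into a small helper. This is only a sketch; firstOrNull is a made-up name and not part of any parser library. It returns null for anything that is not a non-empty array, so the caller only has to test for null.

```php
<?php
// Hypothetical helper: first element of a find()-style result, or null
// when the result is empty or is not an array at all (null, false, ...).
function firstOrNull($items)
{
    if (is_array($items) && count($items) > 0) {
        return reset($items); // works on the local copy, caller's array is untouched
    }
    return null;
}

var_dump(firstOrNull(array('a', 'b')));
var_dump(firstOrNull(array()));
var_dump(firstOrNull(null));
```

Usage would then look like $wrapper = firstOrNull($html->find('.classToLookFor')); followed by an if ($wrapper !== null) check.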
I also want to point out, that per my original comment, the whole purpose of exception catching is to protect against program termination errors.
This was not a good example of the types of errors where you should apply try-catch, but to be clear, your original code does continue... just not within the try section of the code, but after the catch()
This simulation of your original problem illustrates that:
<?php
function findit($foo) {
return $foo[0];
}
try {
findit('');
} catch(Exception $e) {
var_dump($e);
}
echo 'Hey look we continued';
Output will be something like:
Notice: Uninitialized string offset: 0 in ... on line 4
Hey look we continued
I feel this needs to be added as a response because people in the future are going to probably find this question, which really has nothing much to do with try-catch handling, and really has to do with code that expects to work with an array, but might not get one.
I'm trying to catch the error raised when invalid data is passed to imagecreatefromjpeg(). CakePHP displays an error page saying Fatal Error Cake\Error\FatalErrorException, so this code should work, but it does not:
try {
$src_img = imagecreatefromjpeg($image);
} catch (\Cake\Error\FatalErrorException $e) {
echo 'Caught exception: ', $e->getMessage(), "\n";
}
I also tried \Exception, \Cake\Core\Exception\Exception and \ErrorException, with no success.
imagecreatefromjpeg() normally shouldn't cause fatal errors, only warnings, so you may want to investigate that further.
Anyhow, catching fatal errors via try...catch is only possible as of PHP 7, where most of them have been changed to exceptions. You'd have to catch \Error or \Throwable in that case. However, there are still fatal errors that cannot be caught, for example when require() fails or the memory limit is exceeded.
\Cake\Error\FatalErrorException is created internally in a regular error handler, where (uncaught) fatal errors are handled. That exception is never thrown, and therefore cannot be caught.
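A minimal sketch of catching \Error on PHP 7 or later, using a method call on null as the trigger. This is not CakePHP-specific; findTitles and the '.vehTitle' selector are just placeholders for this example.

```php
<?php
// Hypothetical helper: $node may be null when an earlier lookup failed.
function findTitles($node)
{
    try {
        return $node->find('.vehTitle');
    } catch (Error $e) {
        // PHP 7+: "Call to a member function find() on null" arrives here.
        // Catching Throwable instead would also cover ordinary exceptions.
        return array();
    }
}

var_dump(findTitles(null));
```

On PHP 5 the same call is a plain fatal error and the catch block is never reached.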
See also
PHP Manual > Language Reference > Errors > Errors in PHP 7
PHP Manual > Language Reference > Predefined Exceptions > Error
PHP Manual > Language Reference > Predefined Interfaces and Classes > Throwable
CakePHP Source > \Cake\Error\BaseErrorHandler::handleFatalError()
I am experiencing an exception caused by a recursive function in my code. Due to the nature of what I am coding, the exception can simply be ignored and FALSE returned instead. Here is some simplified code to illustrate my issue.
function recursive() {
try{ recursive(); }
catch(Exception $e)
{ echo "Error Caught!"; }
}
recursive();
I can't seem to catch the 'Maximum function nesting level of '100' reached, aborting!' exception.
Have I misunderstood how try-catch's work?
Because it is a Fatal Error and not an Exception, you cannot use try and catch.
An Error generally means that execution of the program cannot continue and has to be aborted.
An Exception, on the other hand, is like a warning: something has gone wrong, but with the right handling the program can continue.
An example for try and catch could be:
Try to connect to the database. The connect function throws an exception because the database server is not reachable. You catch the exception and decide to read the data from a cached file instead. The intention behind exceptions is to let the developer decide whether to catch the exception and continue execution or to stop it.
"PHP Fatal error: Maximum function nesting level of '100' reached, aborting!"
It is a "Fatal error", not an exception. There is no way in PHP to convert it to an Exception using set_error_handler (which is good for converting lower-level errors to exceptions).
In case of a "Fatal error" the only thing you can do is some cleanup using register_shutdown_function, where you can call error_get_last and recognize that this particular fatal error occurred. But that's all you can do; there is no way to continue the designed program flow.
BTW, this particular fatal error can happen only when you have the XDebug module enabled in your php.ini.
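Roughly, the shutdown-function cleanup could look like the sketch below. describeFatal is a made-up helper name, and which error types count as fatal is an assumption you may want to adjust. It only logs the error; it does not resume anything.

```php
<?php
// Hypothetical formatter: turns the array from error_get_last() into a log line,
// or returns null when the script did not end on a fatal error.
function describeFatal($error)
{
    $fatalTypes = array(E_ERROR, E_PARSE, E_CORE_ERROR, E_COMPILE_ERROR);
    if (!is_array($error) || !in_array($error['type'], $fatalTypes, true)) {
        return null;
    }
    return sprintf('fatal: %s in %s:%d', $error['message'], $error['file'], $error['line']);
}

register_shutdown_function(function () {
    $line = describeFatal(error_get_last());
    if ($line !== null) {
        error_log($line); // cleanup or logging only; execution cannot resume
    }
});
```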
How do I catch this type of error?
ContextErrorException: Catchable Fatal Error: Argument 1 passed to AA\SomeBundle\Entity\SomeEntity::setCity() must be an instance of AA\SomeBundle\Entity\City, null given, called in /srv/dev/some_path/vendor/symfony/symfony/src/Symfony/Component/PropertyAccess/PropertyAccessor.php on line 360 and defined in /srv/dev/some_path/src/AA/SomeBundle/Entity/SomeEntity.php line 788
And I am trying to catch everything like that:
$form = $this->createForm(new SomeFormType(), $instanceOfSomeEntity);
try {
$form->handleRequest($request);
} catch (\Exception $e) {
$form->addError(new FormError('missing_information'));
}
The easiest way to fix this is to prefix the offending code with the @ symbol, so any warnings are suppressed. Any errors should then be caught in the try...catch.
This is not ideal, as @ has non-trivial performance implications. Otherwise, you are looking at replacing the error handling or, as in my case when reading from XML, checking the existence of every tag before trying to get the value.
This is my code, fixed by adding the '@':
try {
$value = @$this->XML->StructuredXMLResume->ContactInfo->ContactMethod->PostalAddress->DeliveryAddress->AddressLine;
} catch (\Exception $e) {
$value = '';
}
As you can imagine, checking every level down to AddressLine would be ridiculous.
You have to disable error reporting and fetch the last error with the error_get_last() function. Here is an example from the Symfony Finder component: https://github.com/symfony/symfony/blob/master/src/Symfony/Component/Finder/SplFileInfo.php#L65
The other way is to create a custom error handler. Here is an example from Monolog: https://github.com/Seldaek/monolog/blob/master/src/Monolog/ErrorHandler.php
I know, that by its very definition, a fatal exception is supposed to kill the execution, and should not be suppressed, but here's the issue.
I'm running a script that scrapes, parses and stores in a DB about 10,000 pages. This takes a couple of hours, and in rare cases (1 in 1000) a page fails parsing and throws a fatal exception.
Currently, I'm doing this:
for ($i=0;$i<$count;$i++)
{
$classObject = $classObjects[$i];
echo $i . " : " . memory_get_usage(true) . "\n";
$classDOM = $scraper->scrapeClassInfo($classObject,$termMap,$subjectMap);
$class = $parser->parseClassInfo($classDOM);
$dbmanager->storeClassInfo($class);
unset($classDOM,$class,$classObject);
}
Can I do something like
for ($i=0;$i<$count;$i++)
{
$classObject = $classObjects[$i];
echo $i . " : " . memory_get_usage(true) . "\n";
try
{
$classDOM = $scraper->scrapeClassInfo($classObject,$termMap,$subjectMap);
$class = $parser->parseClassInfo($classDOM);
$dbmanager->storeClassInfo($class);
unset($classDOM,$class,$classObject);
}
catch (Exception $e)
{
//log the error here
continue;
}
}
The code above doesn't work for fatal exceptions.
Would it be possible to do something like this:
If I moved the main loop into a method, and then call the method from register_shutdown_function ?
Like this:
function runLoop($start) // renamed: "do" is a reserved word and cannot be a function name
{
for($i=$start;$i<$count;$i++)
{
//do stuff here
}
}
register_shutdown_function('shutdown');
function shutdown()
{
runLoop();
}
This is the message that is output when execution stops:
Fatal error: Call to a member function find() on a non-object in ...
I expect this above message when a page isn't parse-able by the method I am using. I'm fine with just skipping that page and moving on to the next iteration of the loop.
Fatal errors are fatal and terminate execution. There is no way around this if a fatal error occurs. However, your error:
Fatal error: Call to a member function find() on a non-object in ...
is entirely preventable. Just add a check to make sure you have an instance of the correct object, and if not, handle the error:
if ($foo instanceof SomeObject) {
$foo->find(...);
} else {
// something went wrong
}
First, there is a distinct difference between exceptions and errors. What you encountered is an error, not an exception. Based on your message and the code you posted, the problem lies in something you haven't put into your question. What variable are you trying to call find() on? That variable isn't an object. There is no way to trap fatal errors and ignore them; you must find where you are calling find() on a non-object and fix it.
It seems to me that the only possible way to "catch" a fatal error is by registering a shutdown function. Remember to wrap all (or maybe groups of) queries in transactions and roll them back if something fails, just to ensure consistency.
I had a similar problem, and I found that an is_object() check before the find() call lets you avoid the fatal error.
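Put together, the guard-and-skip idea might look something like this. FakeDom, loadDom and scrapeAll are stand-ins invented for the sketch, since the real scraper and parser classes are not shown in the question. The point is only that a failed parse is detected with is_object() and the loop moves on.

```php
<?php
// Hypothetical stand-in for a parsed document.
class FakeDom
{
    public function find($selector)
    {
        return array('match for ' . $selector);
    }
}

// Hypothetical loader: returns false when a page cannot be parsed.
function loadDom($html)
{
    return $html === '' ? false : new FakeDom();
}

function scrapeAll(array $pages)
{
    $result = array('parsed' => array(), 'skipped' => array());
    foreach ($pages as $i => $html) {
        $dom = loadDom($html);
        if (!is_object($dom)) {
            $result['skipped'][] = $i; // log and skip instead of hitting a fatal error
            continue;
        }
        $result['parsed'][$i] = $dom->find('.title');
    }
    return $result;
}

print_r(scrapeAll(array('<p>a</p>', '', '<p>b</p>')));
```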