Handle errors in simple html dom - php

I have some code that fetches publicly available data from a website:
//Array of params
foreach($params as $par){
$html = file_get_html('WEBSITE.COM/$par');
$name = $html->find('div[class=name]');
$link = $html->find('div[class=secondName]');
foreach($link as $i => $result2)
{
$var = $name[$i]->plaintext;
echo $result2->href,"<br>";
//Insert to database
}
}
So it requests the given website with a different parameter in the URL on each iteration of the loop. I keep getting errors that break the script when a 404 comes up or the server is temporarily unavailable. I have tried code that checks the headers, and code that checks whether $html is an object first, but I still get the errors. Is there a way I can just skip the failed requests and carry on with the script?
Code I have tried to check the headers:
function url_exists($url){
if ((strpos($url, "http")) === false) $url = "http://" . $url;
$headers = @get_headers($url);
//print_r($headers);
if (is_array($headers)){
//Check for http error here....should add checks for other errors too...
if(strpos($headers[0], '404 Not Found'))
return false;
else
return true;
}
else
return false;
}
Code I have tried to check whether $html is an object:
if (method_exists($html,"find")) {
// then check if the html element exists to avoid trying to parse non-html
if ($html->find('html')) {
// and only then start searching (and manipulating) the dom

You need to be more specific, what kind of errors are you getting? Which line errors out?
Edit: Since you did specify the errors you're getting, here's what to do:
I've noticed you're using SINGLE quotes around a string that contains a variable. PHP does not interpolate variables inside single-quoted strings, so the requested URL literally ends in $par. Use double quotes instead, i.e.:
$html = file_get_html("WEBSITE.COM/$par");
Perhaps this is the issue?
Also, you could use file_get_contents(), which returns false on failure:
if (file_get_contents("WEBSITE.COM/$par") !== false) {
...
}
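A minimal sketch of skipping failed requests inside the loop, assuming a failed fetch should simply be ignored. The helper names fetch_or_null() and fetch_all() are made up for illustration. file_get_contents() returns false on failure (with the default stream settings this includes an HTTP 404), and @ only hides the accompanying warning.

```php
<?php
// Hypothetical helper: returns the response body, or null when the fetch fails.
function fetch_or_null(string $url): ?string
{
    // @ suppresses the warning; the false return value is what we test.
    $body = @file_get_contents($url);
    return $body === false ? null : $body;
}

// Hypothetical loop: collect the bodies that could be fetched, skip the rest.
function fetch_all(array $urls): array
{
    $results = [];
    foreach ($urls as $url) {
        $body = fetch_or_null($url);
        if ($body === null) {
            continue; // 404, timeout, server unavailable: move on to the next URL
        }
        $results[$url] = $body;
    }
    return $results;
}
```

The same pattern should carry over to file_get_html(): test its return value before calling find() on it, and continue when the fetch failed.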

Related

Automatic conversion of false to array is deprecated

I get this warning on PHP 8+ in a block of code that checks the user on the page:
if ($_POST['go'] ?? null) {
// $_SESSION_VALUES is an array $db, $nick are classes of mine
$_SESSION_VALUES = $nick->get_cookie (COOKIE_NAME); // get the name of the cookie
if ($db->check_user (USERS_TABLE, $_POST['nick'], $db->encode_password($_POST['password']))) {
$_SESSION_VALUES['_USERNAME'] = $db->user_rec['nick']; // get the nickname from the cookie
$_SESSION_VALUES['_PASSWORD'] = $db->user_rec['password']; //get the password
$_SESSION_VALUES['_USER'] = $db->user_type;
if (! $nick->set_cookie (COOKIE_NAME, $_SESSION_VALUES)) die ('Cannot write the cookie'); // record the cookie
header('Location: ./copertina'); }
else $_SESSION_VALUES['_USER'] = -1;
}
The execution of
else $_SESSION_VALUES['_USER'] = -1;
gives "Automatic conversion of false to array is deprecated"
Following a suggestion from Stack Overflow, I tried this:
$_SESSION_VALUES = [];
if ($_POST['go'] ?? null) {
...
but apparently it doesn't work.
Any ideas?
Thanks
I assume that $nick->get_cookie(COOKIE_NAME); returns false.
Try changing:
else $_SESSION_VALUES['_USER'] = -1;
to:
else $_SESSION_VALUES = ['_USER' => -1];
This will probably get rid of the deprecation message you reported, but I don't know if the rest of your code, which I cannot see, will accept this.
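Initialising $_SESSION_VALUES = [] before the if presumably did not help because the get_cookie() call inside the if overwrites it again. A minimal sketch of normalising the return value instead, assuming get_cookie() returns false when there is no cookie; the helper name to_array_or_empty() and the $sessionValues variable are made up for illustration.

```php
<?php
// Hypothetical helper: turn a "false on failure" return value into an array.
function to_array_or_empty($value): array
{
    return is_array($value) ? $value : [];
}

// Stand-in for $nick->get_cookie(COOKIE_NAME) returning false (an assumption).
$cookie = false;

$sessionValues = to_array_or_empty($cookie);
$sessionValues['_USER'] = -1; // no deprecation: the variable is already an array
```

This keeps any values that were in the cookie, whereas assigning a fresh array in the else branch discards them.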

How to add an error handling to read an XML file in php?

I am developing a PHP script that allows me to modify tags in an XML file and move them once done.
My script works correctly, but I would like to add error handling: if my SQL query returns nothing, the script should display an error message (or, better, send an email), leave the file with the error where it is, and move on to the next file.
I did some tests, but the code never displays the error and it moves the file anyway.
Can someone help me to understand why? Thanks
<?php
}
}
$xml->formatOutput = true;
$xml->save($source_file);
rename($source_file,$destination_file);
}
}
closedir($dir);
?>
Give this one a try
$result = odbc_fetch_array($exec);
if ($result === false || $result['GEAN'] === null) {
echo "GEAN not found for $SKU_CODE";
// continue; // uncomment when this runs inside the loop over files, to skip this file
}
$barcode = (string) $result['GEAN'];
echo $barcode; echo "<br>"; //9353970875729
$node->getElementsByTagName("SKU")->item(0)->nodeValue = "";
$node->getElementsByTagName("SKU")->item(0)->appendChild($xml->createTextNode($result['GEAN']));

PHP script can't open certain URLs

I'm calling through Axios a PHP script checking whether a URL passed to it as a parameter can be embedded in an iframe. That PHP script starts with opening the URL with $_GET[].
Strangely, a page with cross-origin-opener-policy: same-origin (like https://twitter.com/) can be opened with $_GET[], whereas a page with Referrer Policy: strict-origin-when-cross-origin (like https://calia.order.liven.com.au/) cannot.
I don't understand why, and it's annoying because for the pages that cannot be opened with $_GET[] I'm unable to perform my checks on them - the script just fails (meaning I get no response and the Axios call runs the catch() block).
So basically there are 3 types of pages: (1) those who allow iframe embeddability, (2) those who don't, and (3) the annoying ones who not only don't but also can't even be opened to perform this check.
Is there a way to open any page with PHP, and if not, what can I do to prevent my script from failing after several seconds?
PHP script:
$source = $_GET['url'];
$response = true;
try {
$headers = get_headers($source, 1);
$headers = array_change_key_case($headers, CASE_LOWER);
if (isset($headers['content-security-policy'])) {
$response = false;
}
else if (isset($headers['x-frame-options']) &&
$headers['x-frame-options'] == 'DENY' ||
$headers['x-frame-options'] == 'SAMEORIGIN'
) {
$response = false;
}
} catch (Exception $ex) {
$response = $ex;
}
echo $response;
EDIT: below is the console error.
Access to XMLHttpRequest at 'https://path.to.cdn/iframeHeaderChecker?url=https://calia.order.liven.com.au/' from origin 'http://localhost:3000' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource.
CustomLink.vue?b495:61 Error: Network Error
at createError (createError.js?2d83:16)
at XMLHttpRequest.handleError (xhr.js?b50d:84)
VM4758:1 GET https://path.to.cdn/iframeHeaderChecker?url=https://calia.order.com.au/ net::ERR_FAILED
The error you have shown is coming from Javascript, not from PHP. get_headers() returns false on failure, it will not throw an exception - the catch() never happens. get_headers() just makes an http request, like your browser, or curl, and the only reason that would fail is if the URL is malformed, or the remote site is down, etc.
It is the access from http://localhost:3000 to https://path.to.cdn/iframeHeaderChecker with Javascript that has been blocked, not PHP access to the URLs you are passing as parameters in $_GET['url'].
What you're seeing is a standard CORS error when you try to access a different domain than the one the Javascript is running on. Under the browser's same-origin policy, Javascript running on one origin cannot read the responses of http requests to another origin, unless that other origin explicitly allows it with CORS headers. In this case, the Javascript running at http://localhost:3000 is making an http request to a remote site https://path.to.cdn/. That's a cross-origin request (localhost !== path.to.cdn), and the server/script receiving that request on path.to.cdn is not returning any specific CORS headers allowing that request, so the request is blocked.
Note though that if the request is classed as "simple", it will actually run. So your PHP is already working every time, but because the right headers aren't returned, your Javascript is blocked from reading the result in the browser. This can lead to confusion because, for example, you might notice a delay while it gets the headers from a slow site, whereas it is very fast for a fast site. Or maybe you have logging which shows the script running every time, despite nothing showing up in your browser.
My understanding is that https://path.to.cdn/iframeHeaderChecker is your PHP script, some of the code of which you have shown in your question? If so, you have 2 choices:
First option: update iframeHeaderChecker to return the appropriate CORS headers, so that your cross-origin JS request is allowed. As a quick, insecure hack to allow access from anyone and anywhere (not a good idea for the long term!) you could add:
header("Access-Control-Allow-Origin: *");
But it would be better to restrict access to only your app, and not everyone else. You'll have to evaluate the best way to do that depending on the specifics of your application and infrastructure. There are many questions here on SO about CORS/PHP/AJAX to check for reference. You could also configure this at the web server level, rather than the application level, e.g. by configuring Apache to return those headers.
Second option: if iframeHeaderChecker is part of the same application as the Javascript calling it, is it also available locally, on http://localhost:3000? If so, update your JS to use the local version, not the remote one on path.to.cdn, and you avoid the whole problem!
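For the first option, restricting the header to known origins could be sketched roughly as follows. The allowlist contents and the helper name cors_header_for() are assumptions for illustration; only header() and $_SERVER['HTTP_ORIGIN'] are standard PHP.

```php
<?php
// Hypothetical allowlist; replace with the origins of your own app.
const ALLOWED_ORIGINS = ['http://localhost:3000'];

// Returns the header line to send, or null when the origin is not allowed.
function cors_header_for(?string $origin): ?string
{
    if ($origin !== null && in_array($origin, ALLOWED_ORIGINS, true)) {
        return 'Access-Control-Allow-Origin: ' . $origin;
    }
    return null;
}

$line = cors_header_for($_SERVER['HTTP_ORIGIN'] ?? null);
if ($line !== null) {
    header($line);
    header('Vary: Origin'); // the response differs per origin, so caches must know
}
```

Echoing back only origins that appear in the list keeps the script usable from your app without opening it to every site.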
This is just a rough guess at what might be wrong with your code.
I noticed that you:
compare values from $headers without ensuring they have the same letter case as the values you compare against. Applied: strtoupper().
check with isset() without first testing whether the key exists. Applied: key_exists().
check with isset() where perhaps you should use !empty() instead.
Compare the results:
$value = "";
var_dump(isset($value)); // (bool) true
var_dump(!empty($value)); // (bool) false
$value = "something";
var_dump(isset($value)); // (bool) true
var_dump(!empty($value)); // (bool) true
unset($value);
var_dump(isset($value)); // (bool) false
var_dump(!empty($value)); // (bool) false
The code with applied changes:
<?php
declare(strict_types=1); // must be the very first statement in the file
error_reporting(E_ALL);
header('Access-Control-Allow-Origin: *');
ob_start();
try {
$response = true;
if (!key_exists('url', $_GET)) {
$msg = '$_GET does not have a key "url"';
throw new \RuntimeException($msg);
}
$source = $_GET['url'];
if ($source !== filter_var($source, \FILTER_SANITIZE_URL)) {
$msg = 'Passed url is invalid, url: ' . $source;
throw new \RuntimeException($msg);
}
if (filter_var($source, \FILTER_VALIDATE_URL) === FALSE) {
$msg = 'Passed url is invalid, url: ' . $source;
throw new \RuntimeException($msg);
}
$headers = get_headers($source, 1);
if (!is_array($headers)) {
$msg = 'Headers should be array but it is: ' . gettype($headers);
throw new \RuntimeException($msg);
}
$headers = array_change_key_case($headers, \CASE_LOWER);
if ( key_exists('content-security-policy', $headers) &&
isset($headers['content-security-policy'])
) {
$response = false;
}
elseif ( key_exists('x-frame-options', $headers) &&
(
strtoupper($headers['x-frame-options']) == 'DENY' ||
strtoupper($headers['x-frame-options']) == 'SAMEORIGIN'
)
) {
$response = false;
}
} catch (Exception $ex) {
$response = "Error: " . $ex->getMessage() . ' at: ' . $ex->getFile() . ':' . $ex->getLine();
}
$phpOutput = ob_get_clean();
if (!empty($phpOutput)) {
$response .= \PHP_EOL . 'PHP Output: ' . $phpOutput;
}
echo $response;
Using Throwable instead of Exception will also catch Errors in PHP7.
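One more case worth guarding against: when get_headers() is called with its second argument set, a header that occurs more than once (for example across redirects) comes back as an array of values, and strtoupper() on an array fails. A minimal sketch of the header check as a pure function that accepts both shapes; the function name frame_blocked() is made up for illustration.

```php
<?php
// Hypothetical helper: decide from an already-fetched header array
// (as returned by get_headers($url, 1)) whether framing looks blocked.
function frame_blocked(array $headers): bool
{
    $headers = array_change_key_case($headers, CASE_LOWER);

    if (isset($headers['content-security-policy'])) {
        return true; // same simplification as the code above
    }

    // A repeated header arrives as an array of values; a single one as a string.
    $values = (array) ($headers['x-frame-options'] ?? []);
    foreach ($values as $value) {
        if (in_array(strtoupper(trim($value)), ['DENY', 'SAMEORIGIN'], true)) {
            return true;
        }
    }
    return false;
}
```

Keeping the decision separate from the network call also makes it easy to check with hand-written header arrays.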
Keep in mind that:
$response = true;
echo $response; // prints "1"
but
$response = false;
echo $response; // prints ""
So for $response = false you'll get an empty string, not 0.
If you want 0 for false and 1 for true, change every $response = true; to $response = 1; and every $response = false; to $response = 0;.
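An alternative that leaves the booleans alone is to convert only at the point of output. A minimal sketch; the helper name bool_to_response() is made up for illustration.

```php
<?php
// Hypothetical helper: render a boolean result as an unambiguous string.
function bool_to_response(bool $value): string
{
    return $value ? '1' : '0';
}

echo bool_to_response(false); // prints "0"
```

If the caller is Javascript, json_encode($response) is another option, since it yields the literals true and false.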
I hope that somehow helps

PHP Not loading rest of page after exit;

I'm very new to PHP, and I can't figure out why this is happening.
For some reason, when exit fires the entire page stops loading, not just the PHP script. Like, it'll load the top half of the page, but nothing below where the script is included.
Here's my code:
$page = $_GET["p"] . ".htm";
if (!$_GET["p"]) {
echo("<h1>Please click on a page on the left to begin</h1>\n");
// problem here
exit;
}
if ($_POST["page"]) {
$handle = fopen("../includes/$page", "w");
fwrite($handle, $_POST["page"]);
fclose($handle);
echo("<p>Page successfully saved.</p>\n");
// problem here
exit;
}
if (file_exists("../includes/$page")) {
$FILE = fopen("../includes/$page", "rt");
while (!feof($FILE)) {
$text .= fgets($FILE);
}
fclose($FILE);
} else {
echo("<h1>Page \"$page\" does not exist.</h1>\n");
// echo("<h1>New Page: $page</h1>\n");
// $text = "<p></p>";
// problem here
exit;
}
Even if you have HTML code following your PHP code, from the web server's perspective it is strictly a PHP script. When exit is called, that is the end of it: PHP will process and output no more HTML, and the web server will send nothing further. In other words, it is working exactly as it is supposed to work.
If you need to terminate the flow of PHP code execution without preventing any further HTML from being output, you will need to reorganize your code accordingly.
Here is one suggestion. If there is a problem, set a variable indicating so. In subsequent if() blocks, check to see if previous problems were encountered.
$problem_encountered = FALSE;
if (!$_GET["p"]) {
echo("<h1>Please click on a page on the left to begin</h1>\n");
// problem here
// Set a boolean variable indicating something went wrong
$problem_encountered = TRUE;
}
// In subsequent blocks, check that you haven't had problems so far.
// preg_match() validates that the page name contains only letters & numbers,
// to protect against directory traversal.
// Never pass user input into file operations, even checking file_exists(),
// without also whitelisting the input.
if (!$problem_encountered && !preg_match('/^[a-z0-9]+$/', $_GET["p"])) {
echo("<h1>Invalid page name.</h1>\n");
$problem_encountered = TRUE;
}
if (!$problem_encountered) {
$page = $_GET["p"] . ".htm";
}
if (!$problem_encountered && !empty($_POST["page"])) {
$handle = fopen("../includes/$page", "w");
fwrite($handle, $_POST["page"]);
fclose($handle);
echo("<p>Page successfully saved.</p>\n");
// problem here
$problem_encountered = TRUE;
}
if (!$problem_encountered && file_exists("../includes/$page")) {
$text = "";
$FILE = fopen("../includes/$page", "rt");
while (!feof($FILE)) {
$text .= fgets($FILE);
}
fclose($FILE);
} elseif (!$problem_encountered) {
echo("<h1>Page \"$page\" does not exist.</h1>\n");
// echo("<h1>New Page: $page</h1>\n");
// $text = "<p></p>";
// problem here
$problem_encountered = TRUE;
}
There are lots of ways to handle this, many of which are better than the example I provided. But this is a very easy way for you to adapt your existing code without needing to do too much reorganization or risk breaking much.
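One of those other ways, sketched under assumptions: move the logic into a function and return early instead of calling exit, so the rest of the page still renders. The function name render_page_section() and its parameters are made up for illustration, and the save branch is left out to keep it short.

```php
<?php
// Hypothetical wrapper: returns the HTML to print instead of calling exit.
function render_page_section(?string $p, string $dir): string
{
    if ($p === null || $p === '') {
        return "<h1>Please click on a page on the left to begin</h1>\n";
    }
    // Only letters and numbers, to protect against directory traversal.
    if (!preg_match('/^[a-z0-9]+$/', $p)) {
        return "<h1>Invalid page name.</h1>\n";
    }
    $file = "$dir/$p.htm";
    if (!file_exists($file)) {
        return "<h1>Page \"$p.htm\" does not exist.</h1>\n";
    }
    return file_get_contents($file);
}

echo render_page_section($_GET['p'] ?? null, '../includes');
```

A return only ends the function, so any HTML after the PHP block is still sent.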
In PHP 5.3+ you can use the goto statement to jump to a label just before the ?> instead of using exit in the example given in the question.
It wouldn't work well with more structured code (jumping out of functions), though.
Maybe this should be a comment, who knows.

Having trouble getting the right idea

I'm writing PHP code to edit tags and the data inside those tags, but I'm having a lot of trouble getting my head around it.
Basically, I have an XML file similar to this, but bigger:
<users>
<user1>
<password></password>
</user1>
</users>
and the PHP code I'm using to try to change the user1 tag is this:
function mod_user() {
// Get global Variables
global $access_level;
// Pull the data from the form
$entered_new_username = $_POST['mod_user_new_username'];
$entered_pass = $_POST['mod_user_new_password'];
$entered_confirm_pass = $_POST['mod_user_confirm_new_password'];
$entered_new_roll = $_POST['mod_user_new_roll'];
$entered_new_access_level = $_POST['mod_user_new_access_level'];
// Grab the old username from the last page as well so we know who we are looking for
$current_username = $_POST['mod_user_old_username'];
// !!-------- First thing is first. we need to run checks to make sure that this operation can be completed ----------------!!
// Check to see if the user exist. we just use the normal phaser since we are only reading and it's much easier to make loop through
$xml = simplexml_load_file('../users/users.xml');
// read the xml file find the user to be modified
foreach ($xml->children() as $xml_user_get)
{
$xml_user = ($xml_user_get->getName());
if ($xml_user == $entered_new_username){
// Set array to send data back
//$a = array ("error"=>103, "entered_user"=>$new_user, "entered_roll"=>$new_roll, "entered_access"=>$new_access_level);
// Add to session to be sent back to other page
// $_SESSION['add_error'] = $a;
die("Username Already exist - Pass");
// header('location: ../admin.php?page=usermanage&task=adduser');
}
}
// Check the passwords and make sure they match
if ($entered_pass == $entered_confirm_pass) {
// Encrypt the new password and unset the old password variables so they don't stay in memory un-encrytped
$new_password = hash('sha512', $entered_pass);
unset ($entered_pass, $entered_confirm_pass, $_POST['mod_user_new_password'], $_POST['mod_user_confirm_pass']);
}
else {
die("passwords did not match - Pass");
}
if ($entered_new_access_level != "") {
if ($entered_new_access_level < $access_level){
die("Access level is not sufficiant to grant access - Pass");
}
}
// Now to load up the xml file and commit changes.
$doc = new DOMDocument;
$doc->formatOutput = true;
$doc->preserveWhiteSpace = false;
$doc->load('../users/users.xml');
$old_user = $doc->getElementsByTagName('users')->item(0)->getElementsByTagName($current_username)->item(0);
// For initial debugging - to be deleted
if ($old_user == $current_username)
echo "old username found and matches";
// Check the variables to see if there is something to change in the data.
if ($entered_new_username != "") {
$xml_old_user = $doc->getElementsByTagName('users')->item(0)->getElementsByTagName($current_username)->item(0)->replaceChild($entered_new_username, $old_user);
echo "Username is now: " . $current_username;
}
if ($new_pass != "") {
$current_password = $doc->getElementsByTagName($current_user)->item(0)->getElementsByTagName('password')->item(0)->nodeValue;
//$replace_password = $doc
}
}
When run with just the username entered for change, I get this error:
Catchable fatal error: Argument 1 passed to DOMNode::replaceChild() must be an instance of DOMNode, string given, called in E:\xampp\htdocs\CGS-Intranet\admin\html\useraction.php on line 252 and defined in E:\xampp\htdocs\CGS-Intranet\admin\html\useraction.php on line 201
Could someone explain to me how to do this, or show me how they'd do it? It might make more sense to me once I see how it's done.
Thanks
$entered_new_username is a string, so you'll need to wrap it in a DOM node, via something like $doc->createElement():
$xml_old_user = $doc->getElementsByTagName('users')->item(0)->getElementsByTagName($current_username)->item(0)->replaceChild($doc->createElement($entered_new_username), $old_user);
This may not be quite right, but hopefully it points you in the correct direction.
Alright, I got it creating and replacing the node that I want, but I have run into other issues I have to work out (i.e. it's replacing the whole subtree rather than just changing the node name).
Anyway, the code I used is:
// For initial debugging - to be deleted
if ($old_user == $current_username)
echo "old username found and matches";
// Check the variables to see if there is something to change in the data.
if ($entered_new_username != "") {
try {
$new_node_name = $doc->createElement($entered_new_username);
$old_user->parentNode->replaceChild($new_node_name, $old_user);
}
catch (DOMException $e) {
echo $e;
}
echo "Username is now: " . $current_username;
}
if ($new_pass != "") {
$current_password = $doc->getElementsByTagName($current_user)->item(0)->getElementsByTagName('password')->item(0)->nodeValue;
//$replace_password = $doc
}
$doc->save('../users/users.xml');
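On the remaining issue (replaceChild() swaps out the whole subtree, so the children of the old element are lost), one possible workaround is to move the children into the new element before swapping it in. This is a sketch only, assuming the classic DOMDocument/DOMElement classes, which as far as I know offer no in-place rename; the helper name rename_element() and the sample XML are made up for illustration.

```php
<?php
// Hypothetical helper: "rename" an element by creating a new one, moving the
// old element's children into it, and swapping it into the tree.
function rename_element(DOMElement $old, string $newName): DOMElement
{
    $new = $old->ownerDocument->createElement($newName);

    // Appending a child elsewhere removes it from $old, so loop until none are left.
    while ($old->firstChild !== null) {
        $new->appendChild($old->firstChild);
    }
    // Copy plain (non-namespaced) attributes as well.
    foreach ($old->attributes as $attr) {
        $new->setAttribute($attr->nodeName, $attr->nodeValue);
    }
    $old->parentNode->replaceChild($new, $old);
    return $new;
}

$doc = new DOMDocument();
$doc->loadXML('<users><user1><password>abc</password></user1></users>');
$user = $doc->getElementsByTagName('user1')->item(0);
rename_element($user, 'alice');
echo $doc->saveXML($doc->documentElement);
```

In the code above this would replace the createElement()/replaceChild() pair inside the try block, with $old_user and $entered_new_username as the arguments.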
