PHP script can't open certain URLs - php

I'm calling through Axios a PHP script checking whether a URL passed to it as a parameter can be embedded in an iframe. That PHP script starts with opening the URL with $_GET[].
Strangely, a page with cross-origin-opener-policy: same-origin (like https://twitter.com/) can be opened with $_GET[], whereas a page with Referrer Policy: strict-origin-when-cross-origin (like https://calia.order.liven.com.au/) cannot.
I don't understand why, and it's annoying because for the pages that cannot be opened with $_GET[] I'm unable to perform my checks on them - the script just fails (meaning I get no response and the Axios call runs the catch() block).
So basically there are 3 types of pages: (1) those that allow iframe embedding, (2) those that don't, and (3) the annoying ones that not only don't but also can't even be opened to perform this check.
Is there a way to open any page with PHP, and if not, what can I do to prevent my script from failing after several seconds?
PHP script:
$source = $_GET['url'];
$response = true;
try {
    $headers = get_headers($source, 1);
    $headers = array_change_key_case($headers, CASE_LOWER);
    if (isset($headers['content-security-policy'])) {
        $response = false;
    }
    else if (isset($headers['x-frame-options']) &&
        $headers['x-frame-options'] == 'DENY' ||
        $headers['x-frame-options'] == 'SAMEORIGIN'
    ) {
        $response = false;
    }
} catch (Exception $ex) {
    $response = $ex;
}
echo $response;
EDIT: below is the console error.
Access to XMLHttpRequest at 'https://path.to.cdn/iframeHeaderChecker?url=https://calia.order.liven.com.au/' from origin 'http://localhost:3000' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource.
CustomLink.vue?b495:61 Error: Network Error
at createError (createError.js?2d83:16)
at XMLHttpRequest.handleError (xhr.js?b50d:84)
VM4758:1 GET https://path.to.cdn/iframeHeaderChecker?url=https://calia.order.com.au/ net::ERR_FAILED

The error you have shown comes from JavaScript, not from PHP. get_headers() returns false on failure (and emits a warning); it does not throw an exception, so the catch block never runs. get_headers() just makes an HTTP request, like your browser or curl, and it only fails if the URL is malformed, the remote site is down or unreachable, and so on.
It is the JavaScript access from http://localhost:3000 to https://path.to.cdn/iframeHeaderChecker that has been blocked, not the PHP access to the URLs you are passing as parameters in $_GET['url'].
What you're seeing is a standard CORS error, which occurs when JavaScript tries to access a different origin than the one it is running on. Under the browser's same-origin policy, JavaScript running on one origin cannot read responses from another origin unless that other origin explicitly allows it through CORS headers. In this case, the JavaScript running at http://localhost:3000 is making an HTTP request to a remote site, https://path.to.cdn/. That's a cross-origin request (localhost !== path.to.cdn), and the server/script receiving that request on path.to.cdn is not returning any CORS headers allowing it, so the request is blocked.
Note though that if the request is classed as "simple", it will actually be sent and processed by the server. So your PHP is already working, always, but because the right headers aren't returned, the browser blocks your JavaScript from reading the result. This can lead to confusion because, for example, you might notice a delay while it gets the headers from a slow site, whereas it is super fast for a fast site. Or maybe you have logging which you see is working all the time, despite nothing showing up in your browser.
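Because get_headers() signals failure by returning false rather than by throwing, the check in the question can be sketched as below. This is only a sketch under assumptions: isEmbeddable() and fetchHeaders() are hypothetical names, the 5 second timeout is an arbitrary choice, the union types need PHP 8, and the "any CSP header means not embeddable" rule simply mirrors the question's own logic.

```php
<?php
// Hypothetical helper: decide embeddability from the result of get_headers().
// Returns null when the headers could not be fetched at all.
function isEmbeddable(array|false $headers): ?bool
{
    if ($headers === false) {
        return null; // get_headers() failed: bad URL, site down, timeout...
    }
    $headers = array_change_key_case($headers, CASE_LOWER);
    $xfo = $headers['x-frame-options'] ?? '';
    // A header sent more than once arrives as an array; use the last value.
    if (is_array($xfo)) {
        $xfo = end($xfo);
    }
    $xfo = strtoupper(trim((string) $xfo));
    if ($xfo === 'DENY' || $xfo === 'SAMEORIGIN') {
        return false;
    }
    // Mirrors the question: any Content-Security-Policy header counts as "no".
    return !isset($headers['content-security-policy']);
}

// Hypothetical wrapper: fetch headers with a timeout (5 seconds is an assumption)
// so that a slow site cannot stall the script indefinitely.
function fetchHeaders(string $url): array|false
{
    $context = stream_context_create([
        'http' => ['method' => 'HEAD', 'timeout' => 5],
    ]);
    return @get_headers($url, true, $context);
}
```

With this split, the script can report three distinct outcomes (true, false, or null for "could not be checked") instead of failing silently.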
My understanding is that https://path.to.cdn/iframeHeaderChecker is your PHP script, part of whose code you have shown in your question? If so, you have 2 choices:
1. Update iframeHeaderChecker to return the appropriate CORS headers, so that your cross-origin JS request is allowed. As a quick, insecure hack to allow access from anyone and anywhere (not a good idea for the long term!) you could add:
header("Access-Control-Allow-Origin: *");
But it would be better to restrict access to only your app, and not everyone else. You'll have to evaluate the best way to do that depending on the specifics of your application and infrastructure. There are many questions here on SO about CORS/PHP/AJAX to check for reference. You could also configure this at the web server level rather than the application level, e.g. by configuring Apache to return those headers.
2. If iframeHeaderChecker is part of the same application as the JavaScript calling it, is it also available locally, on http://localhost:3000? If so, update your JS to use the local version, not the remote one on path.to.cdn, and you avoid the whole problem!
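For the first choice, restricting access to known origins could look roughly like the sketch below. The allow-list contents and the corsHeaderFor() name are assumptions made for illustration; only header() and the Origin request header are standard.

```php
<?php
// Hypothetical allow-list; replace with the origins of your own front end.
const ALLOWED_ORIGINS = ['http://localhost:3000', 'https://app.example.com'];

// Hypothetical helper: return the CORS header line for an allowed origin,
// or null when the origin is missing or not on the list.
function corsHeaderFor(?string $origin, array $allowed = ALLOWED_ORIGINS): ?string
{
    if ($origin !== null && in_array($origin, $allowed, true)) {
        return 'Access-Control-Allow-Origin: ' . $origin;
    }
    return null;
}

// At the top of the script, before any output is sent:
$corsHeader = corsHeaderFor($_SERVER['HTTP_ORIGIN'] ?? null);
if ($corsHeader !== null) {
    header($corsHeader);
    header('Vary: Origin'); // the response now depends on the Origin header
}
```

Echoing back only origins found on the list keeps the script usable from your app while other sites still get no CORS header.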

This is just a rough guess about what could be wrong with your code.
I noticed that you:
1. compare values from $headers without ensuring they have the same letter case as the values you compare against. Applied: strtoupper().
2. check with isset() without first testing whether the key exists. Applied: key_exists().
3. check with isset(), but perhaps you should use !empty() instead of isset().
Compare the results:
$value = "";
var_dump(isset($value)); // (bool) true
var_dump(!empty($value)); // (bool) false
$value = "something";
var_dump(isset($value)); // (bool) true
var_dump(!empty($value)); // (bool) true
unset($value);
var_dump(isset($value)); // (bool) false
var_dump(!empty($value)); // (bool) false
The code with applied changes:
<?php
// declare(strict_types=1) must be the very first statement in the script
declare(strict_types=1);
error_reporting(E_ALL);
header('Access-Control-Allow-Origin: *');
// Buffer any stray output (warnings, notices) so it can be appended to the response
ob_start();
try {
    $response = true;
    if (!key_exists('url', $_GET)) {
        $msg = '$_GET does not have a key "url"';
        throw new \RuntimeException($msg);
    }
    $source = $_GET['url'];
    if ($source !== filter_var($source, \FILTER_SANITIZE_URL)) {
        $msg = 'Passed url is invalid, url: ' . $source;
        throw new \RuntimeException($msg);
    }
    if (filter_var($source, \FILTER_VALIDATE_URL) === FALSE) {
        $msg = 'Passed url is invalid, url: ' . $source;
        throw new \RuntimeException($msg);
    }
    // get_headers() returns false on failure, hence the is_array() check
    $headers = get_headers($source, 1);
    if (!is_array($headers)) {
        $msg = 'Headers should be array but it is: ' . gettype($headers);
        throw new \RuntimeException($msg);
    }
    $headers = array_change_key_case($headers, \CASE_LOWER);
    if ( key_exists('content-security-policy', $headers) &&
        isset($headers['content-security-policy'])
    ) {
        $response = false;
    }
    elseif ( key_exists('x-frame-options', $headers) &&
        (
            strtoupper($headers['x-frame-options']) == 'DENY' ||
            strtoupper($headers['x-frame-options']) == 'SAMEORIGIN'
        )
    ) {
        $response = false;
    }
} catch (Exception $ex) {
    $response = "Error: " . $ex->getMessage() . ' at: ' . $ex->getFile() . ':' . $ex->getLine();
}
$phpOutput = ob_get_clean();
if (!empty($phpOutput)) {
    $response .= \PHP_EOL . 'PHP Output: ' . $phpOutput;
}
echo $response;
Using Throwable instead of Exception will also catch Errors in PHP7.
Keep in mind that:
$response = true;
echo $response; // prints "1"
but
$response = false;
echo $response; // prints ""
so for $response = false you'll get an empty string, not 0.
If you want 0 for false and 1 for true, change $response = true; to $response = 1; and $response = false; to $response = 0; everywhere.
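Another way to sidestep the "1" versus empty string ambiguity is to send the result as JSON, which Axios parses on the client side. A minimal sketch, assuming a hypothetical encodeResult() helper and made-up field names (JSON_THROW_ON_ERROR needs PHP 7.3 or later):

```php
<?php
// Hypothetical helper: encode the outcome so that false survives being echoed.
function encodeResult(bool $embeddable, ?string $error = null): string
{
    return json_encode(
        ['embeddable' => $embeddable, 'error' => $error],
        JSON_THROW_ON_ERROR
    );
}

echo encodeResult(false), PHP_EOL; // prints {"embeddable":false,"error":null}
```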
I hope that somehow helps

Related

Why does my second function always return false? Also, why does code in my if statement run regardless?

SOLVED!! Sorry for wasting your time.
Problems:
Second function "verify_webhook_2" always returns false.
Code in if statement runs whether tests return true or not.
I copied and pasted the first function, then made (what I would think to be) appropriate changes, so I can verify webhooks coming from two different Shopify stores. I'm sure it's something simple that I am just oblivious to, as I'm still fairly new to all of this. If I change $verify to the secret for $verify2 then webhooks received from that shop will verify true.
And I cannot for the life of me understand why the code in the if statement runs even when both requirements test false. There's no way I can think of that either could prove true when receiving a webhook from the shop related to the $verify2 secret. Probably a rookie mistake?
$verify = "xxxxsecretxxxx";
$verify2 = "xxxxsecretxxxx";
define('SHOPIFY_APP_SECRET', $verify);
define('SHOPIFY_APP_SECRET_2', $verify2);
function verify_webhook($data, $hmac_header)
{
    $calculated_hmac = base64_encode(hash_hmac('sha256', $data, SHOPIFY_APP_SECRET, true));
    return hash_equals($hmac_header, $calculated_hmac);
}
function verify_webhook_2($data, $hmac_header)
{
    $calculated_hmac_2 = base64_encode(hash_hmac('sha256', $data, SHOPIFY_APP_SECRET_2, true));
    return hash_equals($hmac_header, $calculated_hmac_2);
}
$hmac_header = $_SERVER['HTTP_X_SHOPIFY_HMAC_SHA256'];
$data = file_get_contents('php://input');
$verified = verify_webhook($data, $hmac_header);
$verified_2 = verify_webhook_2($data, $hmac_header);
error_log('Webhook verified: '.var_export($verified, true)); //check error.log to see the result
if ($verified == true || $verified_2 == true){
    header("HTTP/1.1 200 OK"); //respond with success
    http_response_code(201); //respond with success
    file_put_contents('/var/www/html/temp/webhook.json', $data);
    $POST = json_decode(file_get_contents('/var/www/html/temp/webhook.json'), true);
    //$POST = $POST['id'];
    $report = "id: " . $POST['id'] . " - email: " . $POST['email'] . " - name: " . $POST['customer']['first_name'] . " " . $POST['customer']['last_name'] ;
}else{
}
Of course, right after posting the question, I realized my failure. I had only written to the error log for the first function's comparison, so when I kept seeing "webhook verified: false" in the error logs, I assumed that was regardless of the shop I was sending data from.
I added:
error_log('Webhook verified_2: '.var_export($verified_2, true)); //check error.log to see the result
just below the first error_log call, then added another error log into the else section of my if statement, and all is working correctly, and responding correctly.
It was a lack of understanding on my part that led to me believing it was not working correctly, when in fact, everything was, but I was missing information.
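The two near-identical functions could also be folded into one that reports which secret matched, which makes a missing log line harder to overlook. This is a sketch only: matchingSecret() is a hypothetical name and the secrets are placeholders, while the HMAC calculation is the same one used in the question.

```php
<?php
// Hypothetical helper: return the position of the first secret whose HMAC
// matches the header, or null when none of them does.
function matchingSecret(string $data, string $hmacHeader, array $secrets): ?int
{
    foreach (array_values($secrets) as $index => $secret) {
        $calculated = base64_encode(hash_hmac('sha256', $data, $secret, true));
        // hash_equals() compares in constant time
        if (hash_equals($calculated, $hmacHeader)) {
            return $index;
        }
    }
    return null;
}

// Example with placeholder secrets: the body was signed by the second shop.
$body = '{"id":1}';
$signature = base64_encode(hash_hmac('sha256', $body, 'second-secret', true));
$match = matchingSecret($body, $signature, ['first-secret', 'second-secret']);
error_log('Webhook matched secret: ' . var_export($match, true)); // logs 1
```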

PHP - URL gets malformed during redirect

So, I have an image link that has this href:
http://www.app.com/link?target=www.target.com&param1=abc&param2=xyz
This is processed like so (I use laravel):
function out (Request $request) {
    $url = $request->target;
    $qs = $request->except('target');
    if ( !empty($qs) ) {
        $url .= strpos($url, '?') !== false ? '&' : '?';
        $url .= http_build_query($qs);
    }
    return redirect($url);
}
Most of the time, this works. However, lately, we've been experiencing an issue where param1 and param2 are attached to the URL in a seemingly infinite loop causing us to hit a 414 Request URI too long Error.
The problem is that it happens so randomly that I really don't know where to check because I added a checker before the return statement.
if ( substr_count($url, 'param1') > 1 ) {
    $file = storage_path() . '/logs/logger.log';
    $log = "[ " . date("d-m-Y H:i:sa") . " ] [ {$request->ip()} ] - {$url} \n";
    file_put_contents($file, $log, FILE_APPEND);
}
And it hasn't logged a single hit. Even after our testers experienced the bug.
Is it possible that the receiving application is breaking the URL somehow?
What information should I be looking out for? Have you seen an issue like this before?
Could http_build_query be causing this, with my checker just not working as expected (though I did test it, and it logged my test URL)?
Any help on the matter would be great.
Assuming an issue with http_build_query:
Well, one thing you could try is to rewrite the code without $request->except and http_build_query.
If you don't have any special reason to use http_build_query, I would suggest using $request->input.
Example with $request->input:
function out (Request $request) {
    $url = $request->target;
    $param1 = $request->input('param1', '');
    $param2 = $request->input('param2', '');
    // Add the query separator only if there is something to append
    if (!empty($param1) || !empty($param2)) {
        $url .= '?';
    }
    if (!empty($param1) && !empty($param2)) {
        $url .= 'param1=' . $param1 . '&param2=' . $param2;
    } else {
        // At most one of the two parameters is set here
        $url .= !empty($param1) ? 'param1=' . $param1 : '';
        $url .= !empty($param2) ? 'param2=' . $param2 : '';
    }
    return redirect($url);
}
The solution is a little more verbose, but with it you can be sure that this code is not what generates the duplication.
Absurd, remote possibility:
The second thing I would try is to check your logs. For instance, if you are running under Apache you should have a file called access.log under /var/log/apache2/ (or under /var/log/nginx/ with nginx).
In there you should have the history of all your HTTP requests.
There is a chance that some of the weird requests with multiple params come from an unfamiliar IP address.
If so, it could mean that someone is probing or testing the website (potentially with the strange parameters), for example for security reasons.
In that case, I guess you are on plain HTTP and you should switch to HTTPS.
Anyway, with the new code, you should be sure about the code and be able to investigate any other part of the system.
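If the duplication turns out to come from the target already carrying param1 and param2 in its own query string (an assumption, since the thread does not confirm the cause), merging the parameters instead of appending them guarantees each key appears only once. A sketch with a hypothetical buildRedirectUrl() helper; URL fragments are not handled:

```php
<?php
// Hypothetical helper: append $params to $target so that every key appears
// at most once, even when $target already has a query string.
function buildRedirectUrl(string $target, array $params): string
{
    $existing = [];
    $queryPos = strpos($target, '?');
    if ($queryPos !== false) {
        // Split off and parse whatever query string the target already has
        parse_str(substr($target, $queryPos + 1), $existing);
        $target = substr($target, 0, $queryPos);
    }
    // Later values win, so incoming parameters replace existing ones
    $merged = array_merge($existing, $params);
    if ($merged === []) {
        return $target;
    }
    return $target . '?' . http_build_query($merged);
}

echo buildRedirectUrl('www.target.com?param1=old&x=1', ['param1' => 'abc']), PHP_EOL;
// prints www.target.com?param1=abc&x=1
```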

Incorrect Client IP in PHP

I am getting a weird result for the client IP in PHP in some cases.
Result in Most Cases (Expected Result) :
192.123.132.123
Erroneous Result Type 1:
for="192.123.132.123"
Erroneous Result Type 2:
for="192.123.132.123:1232"
Code for getting the IP:
<?php
function getIP(){
    $ip = isset($_SERVER['REMOTE_ADDR']) ? $_SERVER['REMOTE_ADDR'] : '-';
    $proxy = false;
    if (!empty($_SERVER['HTTP_VIA']) || !empty($_SERVER['HTTP_X_FORWARDED_FOR'])) {
        $proxy = true;
    } elseif (!empty($_SERVER['REMOTE_HOST'])) {
        $aProxyHosts = array('proxy','cache','inktomi');
        foreach ($aProxyHosts as $proxyName) {
            if (strpos($_SERVER['REMOTE_HOST'], $proxyName) !== false) {
                $proxy = true;
                break;
            }
        }
    }
    // Has the viewer come via an HTTP proxy?
    if ($proxy) {
        // Try to find the "real" IP address the viewer has come from
        $aHeaders = array('HTTP_FORWARDED','HTTP_FORWARDED_FOR','HTTP_X_FORWARDED','HTTP_X_FORWARDED_FOR','HTTP_CLIENT_IP');
        foreach ($aHeaders as $header) {
            if (!empty($_SERVER[$header])) {
                $ip = $_SERVER[$header];
                break;
            }
        }
    }
    if (!empty($ip)) {
        // The "remote IP" may be a list, ensure that
        // only the last item is used in that case
        $ip = explode(',', $ip);
        $ip = trim($ip[count($ip) - 1]);
    }
    return $ip;
}
?>
I know that I can clean the result to get the correct value (IP) but I am puzzled at why is this happening in the first place.
PS: 192.123.132.123 is an arbitrary IP used to explain the issue.
You're reading arbitrary HTTP headers... not all of them contain purely the IP, some are in the form of for=... and some include the port as well.
Using any HTTP header instead of $_SERVER['REMOTE_ADDR'] means you're allowing anyone to mask or fake their IP address simply by sending an HTTP header. You should know exactly where such headers are set, which usually means they are set by a proxy you control. In this case you obviously don't know where those headers are coming from, so you should not use them.
If you decide to use an HTTP header, you should know which one exactly you want to read and what format it's in. If its format is for=..., then parse that format correctly.
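The for=... values in the question match the format of the standard Forwarded header (RFC 7239). If you do trust the proxy that sets it, parsing could look like the simplified sketch below; parseForwardedFor() is a hypothetical name, and quoted values containing commas or semicolons are not handled.

```php
<?php
// Hypothetical helper: extract the first "for" address from a Forwarded
// header value, dropping quotes, IPv6 brackets and any port.
// Returns null when there is no valid IP address to return.
function parseForwardedFor(string $header): ?string
{
    foreach (explode(',', $header) as $element) {
        foreach (explode(';', $element) as $pair) {
            $parts = explode('=', trim($pair), 2);
            if (count($parts) !== 2 || strtolower($parts[0]) !== 'for') {
                continue;
            }
            $value = trim($parts[1], " \t\"");
            if (preg_match('/^\[([^\]]+)\](?::\d+)?$/', $value, $m)) {
                $value = $m[1]; // bracketed IPv6, optional port
            } elseif (preg_match('/^([^:]+):\d+$/', $value, $m)) {
                $value = $m[1]; // IPv4 followed by a port
            }
            return filter_var($value, FILTER_VALIDATE_IP) !== false ? $value : null;
        }
    }
    return null;
}

echo parseForwardedFor('for="192.123.132.123:1232"'), PHP_EOL; // prints 192.123.132.123
```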

Handle errors in simple html dom

I have some code to get some publicly available data that I am fetching from a website:
//Array of params
foreach($params as $par){
    $html = file_get_html('WEBSITE.COM/$par');
    $name = $html->find('div[class=name]');
    $link = $html->find('div[class=secondName]');
    foreach($link as $i => $result2)
    {
        $var = $name[$i]->plaintext;
        echo $result2->href,"<br>";
        //Insert to database
    }
}
So it goes to the given website with a different parameter in the URL on each iteration of the loop. I keep getting errors that break the script when a 404 comes up or the server is temporarily unavailable. I have tried code to check the headers and to check whether $html is an object first, but I still get the errors. Is there a way I can just skip the errors, leave them out, and carry on with the script?
Code I have tried to check the headers:
function url_exists($url){
    if ((strpos($url, "http")) === false) $url = "http://" . $url;
    $headers = @get_headers($url);
    //print_r($headers);
    if (is_array($headers)){
        //Check for http error here....should add checks for other errors too...
        if(strpos($headers[0], '404 Not Found'))
            return false;
        else
            return true;
    }
    else
        return false;
}
Code i have tried to check if object
if (method_exists($html,"find")) {
// then check if the html element exists to avoid trying to parse non-html
if ($html->find('html')) {
// and only then start searching (and manipulating) the dom
You need to be more specific, what kind of errors are you getting? Which line errors out?
Edit: Since you did specify the errors you're getting, here's what to do:
I've noticed you're using SINGLE quotes with a string that contains variables. This won't work, use double quotes instead, i.e.:
$html = file_get_html("WEBSITE.COM/$par");
Perhaps this is the issue?
Also, you could use file_get_contents()
if (file_get_contents("WEBSITE.COM/$par") !== false) {
...
}
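To skip failed pages and carry on, the loop can test each result and continue on failure. The sketch below assumes the fetch function returns false (or null) when it fails, as file_get_contents() does; fetchAll() is a hypothetical helper that takes the fetcher as a callable so it can be tried without network access.

```php
<?php
// Hypothetical helper: call $fetch for every URL and keep only the
// results that did not fail.
function fetchAll(array $urls, callable $fetch): array
{
    $pages = [];
    foreach ($urls as $url) {
        $html = $fetch($url);
        if ($html === false || $html === null) {
            continue; // 404, timeout, etc.: skip this URL and carry on
        }
        $pages[$url] = $html;
    }
    return $pages;
}

// Stand-in fetcher for demonstration; in real use something like
// function ($url) { return @file_get_contents($url); } would go here.
$fakeFetch = function (string $url) {
    return $url === 'bad' ? false : "<html>$url</html>";
};
$pages = fetchAll(['a', 'bad', 'b'], $fakeFetch);
echo implode(',', array_keys($pages)), PHP_EOL; // prints a,b
```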

Exception handling with get_meta_tags() & get_headers()?

In PHP, I am using get_meta_tags() and get_headers(), however, when there is a 404, those two functions throw a warning. Is there any way for me to catch it?
Thanks!
get_headers does not throw a Warning/Error on 404, but get_meta_tags does.
So you can check the header response and do something, when it's not OK:
$url = 'http://www.example.com/';
$headers = array();
$metatags = array();
$validhost = filter_var(gethostbyname(parse_url($url,PHP_URL_HOST)), FILTER_VALIDATE_IP);
if($validhost){
    // get headers only when Domain is valid
    $headers = get_headers($url, 1);
    if(substr($headers[0], 9, 3) == '200'){
        // read Metatags only when Statuscode OK
        $metatags = get_meta_tags($url);
    }
}
"those two functions throw a warning. Is there any way for me to catch it?"
You shouldn't have to care. Naturally, an E_WARNING message upon failure while developing is fine; it's even desirable, as you can instantly see that something went wrong. I can imagine though that you don't want your customers to see those warnings, but you should not be handling that per function call; you should handle it globally: turn display_errors off in php.ini in the production environment, and your customers will never see such messages.
That said, if you don't want them to appear in the error logs, you'll have to check to see if the page exists before trying to retrieve the meta tags. get_headers doesn't appear to throw a warning, instead it returns an array of which the first element contains the string "HTTP/1.1 404 Not Found". You can use this to your advantage:
<?php
$url = 'http://stackoverflow.com';
$headers = get_headers( $url );
// get_headers() returns false on failure, so check before matching
if( $headers !== false && preg_match( '~HTTP/1\.(?:1|0) (\d{3})~', $headers[0], $matches ) ) {
    $code = $matches[1];
    if( $code === '200' ) {
        $tags = get_meta_tags( $url );
    }
}
If you start using this code, mind that 200 isn't the only notification of a successful request; 304 Not Modified - for example - is equally valid.
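A sketch of a status check that accepts every 2xx code plus 304 might look like this. statusCode() and isSuccess() are hypothetical names, the union type needs PHP 8, and only the first status line is inspected, so a redirect will show up as its 3xx code.

```php
<?php
// Hypothetical helper: pull the numeric status code out of the first line
// returned by get_headers(); null when it is missing or unparsable.
function statusCode(array|false $headers): ?int
{
    if ($headers === false || !isset($headers[0])) {
        return null;
    }
    if (preg_match('~^HTTP/\d(?:\.\d)?\s+(\d{3})~', $headers[0], $m) !== 1) {
        return null;
    }
    return (int) $m[1];
}

// Hypothetical helper: treat any 2xx code and 304 Not Modified as success.
function isSuccess(?int $code): bool
{
    return $code !== null && (($code >= 200 && $code < 300) || $code === 304);
}

var_dump(isSuccess(statusCode(['HTTP/1.1 404 Not Found']))); // bool(false)
```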
You can silence it by calling them like this:
@get_meta_tags();
You can't "catch" it (easily), but you can check the return values.
Also, you can disable or redirect warnings; see error_reporting() and the ini directives "display_errors" and similar.
