403 calling api.weather.gov using Buzz client w/ https - php

I'm trying to retrieve data with Buzz (using either its curl or file_get_contents client) in my Symfony application, from an endpoint like this one:
https://api.weather.gov/points/44.3537,-73.8636/forecast/hourly
No matter what I try, I get a 403 Forbidden error. Does anyone have any advice?
$location = sprintf(
'/points/%f,%f/forecast/hourly', $this->latitude, $this->longitude
);
$request = new Request('GET', $location, 'https://api.weather.gov');
$response = new Response();
try {
$this->httpClient->send($request, $response);
} catch (\Exception $e) {
throw new ServiceResponseException('Failed to send Request', 0, $e);
}
if (!$response->isSuccessful()) {
throw new ServiceResponseException('Unsuccessful Response', $response->getStatusCode());
}
return $response->getContent();
In Buzz's FileGetContents class:
$url = $request->getHost().$request->getResource();
$url evaluates to:
https://api.weather.gov/points/44.353700,-73.863600/forecast/hourly
The same happens with the curl client.
Headers returned:
Array
(
[0] => HTTP/1.1 403 Forbidden
[1] => Server: AkamaiGHost
[2] => Mime-Version: 1.0
[3] => Content-Type: text/html
[4] => Content-Length: 334
[5] => Expires: Thu, 13 Apr 2017 14:15:56 GMT
[6] => Date: Thu, 13 Apr 2017 14:15:56 GMT
[7] => Connection: close
)

You need to set the Accept, Version, and User-Agent headers.
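A minimal sketch of sending those headers, using PHP's stream wrapper rather than Buzz. The helper name buildWeatherContextOptions and all header values are assumptions for illustration; the answer does not say which values the API expects, so take them from the API documentation.

```php
<?php
// Hypothetical helper: builds stream-context options that carry explicit
// request headers. The header values are supplied by the caller.
function buildWeatherContextOptions($userAgent, $accept, $version = null)
{
    $headers = array(
        'User-Agent: ' . $userAgent,
        'Accept: ' . $accept,
    );
    if ($version !== null) {
        $headers[] = 'Version: ' . $version;
    }

    return array(
        'http' => array(
            'method'        => 'GET',
            'header'        => implode("\r\n", $headers),
            'ignore_errors' => true, // still return the body on 4xx/5xx responses
        ),
    );
}

// Usage (placeholder values, not taken from the answer):
// $options = buildWeatherContextOptions('my-app (me@example.com)', 'application/json', '1');
// $body = file_get_contents(
//     'https://api.weather.gov/points/44.3537,-73.8636/forecast/hourly',
//     false,
//     stream_context_create($options)
// );
```

With 'ignore_errors' enabled, a 403 response body is still returned, which helps when checking why the server rejected the request.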


I'm writing a links-checker in php and I can't get the HTTP error (e.g. 404 or 400) when the site displays a screen like 'this page can't be found'

I am making a links-checker tool to avoid broken links in our site content and it works when the page doesn't exist or can't be loaded - except when the external site replaces it with a screen saying something like 'This page doesn't seem to exist. Search for the content you are looking for from our menu...'.
Apart from the HTML/CSS/JS code for this tool, here is the main PHP code that checks the links:
$headers = get_headers($url);
$headers = (is_array($headers)) ? implode( "\n ", $headers) : $headers;
$exists = (bool)preg_match('#^HTTP/.*\s+[(200|301|302)]+\s#i', $headers);
$status = (is_array($headers)) ? $headers[0] : $headers;
The JavaScript then uses this information, including $status, but no error code is returned when the external site shows a 'not found' screen (e.g. http://www.drdansiegel.com/resources/healthy_mind_platter).
You get back redirects before the request resolves to the 404. I would invert your logic and check whether a 404 or 400 is ever present:
$notexists = (bool)preg_match('#^HTTP/.*\s40[04]\s#mi', $headers);
Note the m modifier, which makes the leading anchor match at the start of each line rather than only at the start of the whole string.
Additionally, note that a character class is a list of characters; you can't do groupings in it as you have. [(200|301|302)] says a (, 2, 0, 0 (again), |, 3, 0 (again), etc. are all allowed. You would write that as (200|301|302) if you wanted 200, 301, or 302 to be the allowed values. You could use a character class for the last digit of the redirect status code (and should add 7 and 8 to it, as those are valid redirects as well). So it could be (200|30[1278]).
Here's what your $headers contained for the example link:
Array
(
[0] => HTTP/1.1 301 Moved Permanently
[1] => Server: nginx
[2] => Date: Sat, 15 May 2021 01:57:44 GMT
[3] => Content-Type: text/html
[4] => Content-Length: 162
[5] => Connection: close
[6] => Location: https://www.drdansiegel.com/resources/healthy_mind_platter
[7] => HTTP/1.1 301 Moved Permanently
[8] => Server: nginx
[9] => Date: Sat, 15 May 2021 01:57:45 GMT
[10] => Content-Type: text/html; charset=UTF-8
[11] => Content-Length: 0
[12] => Connection: close
[13] => Expires: Sat, 15 May 2021 02:57:45 GMT
[14] => Cache-Control: max-age=3600
[15] => X-Redirect-By: WordPress
[16] => Location: https://drdansiegel.com/resources/healthy_mind_platter
[17] => HTTP/1.1 404 Not Found
[18] => Server: nginx
[19] => Date: Sat, 15 May 2021 01:57:46 GMT
[20] => Content-Type: text/html; charset=UTF-8
[21] => Connection: close
[22] => Vary: Accept-Encoding
[23] => Expires: Wed, 11 Jan 1984 05:00:00 GMT
[24] => Cache-Control: no-cache, must-revalidate, max-age=0
[25] => Link: <https://drdansiegel.com/wp-json/>; rel="https://api.w.org/"
)
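As a sketch of the same idea without a regex over the joined string, the status of the last response in the chain can be read directly from the array that get_headers() returns. The function name finalStatusCode is an assumption for illustration. Working on the array also avoids depending on how the lines were joined: the question joins them with "\n ", which leaves a leading space on every line after the first and would stop ^HTTP from matching there.

```php
<?php
// Returns the status code of the last response in a get_headers() result,
// or null when no status line is present.
function finalStatusCode(array $headers)
{
    $code = null;
    foreach ($headers as $line) {
        // Each hop in a redirect chain contributes its own "HTTP/x.y NNN" line.
        if (preg_match('#^HTTP/\S+\s+(\d{3})\b#i', $line, $m)) {
            $code = (int) $m[1];
        }
    }
    return $code;
}

// Trimmed version of the header dump shown above.
$headers = array(
    'HTTP/1.1 301 Moved Permanently',
    'Location: https://www.drdansiegel.com/resources/healthy_mind_platter',
    'HTTP/1.1 301 Moved Permanently',
    'Location: https://drdansiegel.com/resources/healthy_mind_platter',
    'HTTP/1.1 404 Not Found',
    'Content-Type: text/html; charset=UTF-8',
);

$status = finalStatusCode($headers);
// Treat only a final 2xx/3xx status as "the link exists".
$exists = $status !== null && $status >= 200 && $status < 400;
var_dump($status, $exists);
```

For the example link this reports 404 and false, because the two 301 responses are overwritten by the final status line.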

get_headers() used on live site is not returning any array but on localhost it is

When I call get_headers($url) with $url = "https://www.example.com/product.php?id=15" on my live site, it does not return an array for the given URL; I get nothing. But when the same code runs on my localhost, I get the following:
Array
(
[0] => HTTP/1.1 200 OK
[1] => Cache-Control: private
[2] => Content-Type: text/html; charset=utf-8
[3] => Server: Microsoft-IIS/8.5
[4] => Set-Cookie: ASP.NET_SessionId=wumg0dyscw3c4pmaliwehwew; path=/; HttpOnly
[5] => X-AspNetMvc-Version: 4.0
[6] => X-AspNet-Version: 4.0.30319
[7] => X-Powered-By: ASP.NET
[8] => Date: Fri, 18 Aug 2017 13:06:18 GMT
[9] => Connection: close
[10] => Content-Length: 73867
)
So why does the function not work on the live site?
EDIT
<?php
if(isset($_POST['prdurl']))
{
$url = $_POST['prdurl'];
print_r(get_headers($url)); // prints no array on live, but does on localhost
if(is_array(@get_headers($url)))
{
// some code goes here...
}
else
{
echo "URL doesn't exist!";
}
}
?>
One more thing to note: I'm using file_get_html to retrieve the HTML page from the remote URL. It works on my localhost but not on the live site either.
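The thread shows no answer. One possible cause, stated here as an assumption rather than a diagnosis, is that the live server disables URL access for stream functions or lacks HTTPS support. The @ operator in the code above would hide the resulting warning. A small sketch for comparing the two environments (the function name urlWrapperDiagnostics is hypothetical):

```php
<?php
// Hypothetical diagnostic helper: reports settings that URL-based functions
// such as get_headers() and file_get_contents() depend on.
function urlWrapperDiagnostics()
{
    return array(
        // The http/https wrappers are only usable when allow_url_fopen is on.
        'allow_url_fopen' => (bool) ini_get('allow_url_fopen'),
        // https:// URLs need the https stream wrapper to be registered.
        'https_wrapper'   => in_array('https', stream_get_wrappers(), true),
        // The https wrapper is provided by the OpenSSL extension.
        'openssl_loaded'  => extension_loaded('openssl'),
    );
}

print_r(urlWrapperDiagnostics());
```

Running this on both localhost and the live server and comparing the output would show whether the configuration differs. Removing the @ temporarily would also reveal any warning that get_headers() raises.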

PHP get response time and http status code same request

I can get the HTTP status code of a URL using Curl and I can get the response time of a URL by doing something like the following...
<?php
// check response time for a web server
function pingDomain($domain){
$starttime = microtime(true);
// suppress error messages with @
$file = @fsockopen($domain, 80, $errno, $errstr, 10);
$stoptime = microtime(true);
$status = 0;
if (!$file){
$status = -1; // Site is down
} else {
fclose($file);
$status = ($stoptime - $starttime) * 1000;
$status = floor($status);
}
return $status;
}
?>
However, I'm struggling to think of a way to get the HTTP status code and the response time using the same request. If this is possible to do only via curl that would be great.
Note: I don't want/need any other information from the URL as this will slow down my process.
Use the get_headers() function; it returns the status code. See the PHP docs: http://php.net/manual/en/function.get-headers.php
<?php
$url = "http://www.example.com";
$header = get_headers($url);
print_r($header);
$status_code = $header[0];
echo $status_code;
?>
Output -->
Array
(
[0] => HTTP/1.0 200 OK
[1] => Cache-Control: max-age=604800
[2] => Content-Type: text/html
[3] => Date: Sun, 07 Feb 2016 13:04:11 GMT
[4] => Etag: "359670651+gzip+ident"
[5] => Expires: Sun, 14 Feb 2016 13:04:11 GMT
[6] => Last-Modified: Fri, 09 Aug 2013 23:54:35 GMT
[7] => Server: ECS (cpm/F9D5)
[8] => Vary: Accept-Encoding
[9] => X-Cache: HIT
[10] => x-ec-custom-error: 1
[11] => Content-Length: 1270
[12] => Connection: close
)
HTTP/1.0 200 OK
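The get_headers() approach above returns the status line but not the timing. As a sketch of getting both from one curl request, curl_getinfo() reports http_code and total_time for the transfer that was just executed. The function names probeUrl and summarizeCurlInfo are assumptions for illustration.

```php
<?php
// Reduces a curl_getinfo() array to the two values of interest.
function summarizeCurlInfo(array $info)
{
    return array(
        'status'  => (int) $info['http_code'],
        // total_time is reported in seconds; convert to whole milliseconds.
        'time_ms' => (int) floor($info['total_time'] * 1000),
    );
}

// Sends a single body-less request and returns its status code and duration.
function probeUrl($url, $timeoutSeconds = 10)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_NOBODY         => true, // HEAD-style request: skip the body
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_CONNECTTIMEOUT => $timeoutSeconds,
        CURLOPT_TIMEOUT        => $timeoutSeconds,
    ));
    curl_exec($ch);
    $summary = summarizeCurlInfo(curl_getinfo($ch));
    curl_close($ch);
    return $summary; // status is 0 when no HTTP response was received
}

// Usage: print_r(probeUrl('http://www.example.com'));
```

Because CURLOPT_NOBODY skips the response body, the measured time covers connecting and receiving headers only, which matches the requirement of not fetching any other information.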

Google Play links validation via PHP

I want to check via a script whether a Google Play link for an app is valid:
https://play.google.com/store/apps/details?id=com.ketchapp.zigzaggame - valid
https://play.google.com/store/apps/details?id=com.ketchapp.zigzaggamessdasd - invalid
but every script I've bought or found for free gives me a 404 or 303 response. There is probably a redirect involved.
How can I validate links like that? I need to check about 1000 links in my ad system to see whether the apps exist in the Google Play store.
I will write the loops, database reads, etc. myself, but could someone familiar with PHP please help with the check? I've spent about $300 on this and was cheated by two people whose scripts "check" the link but always return 404 or 303.
Try this:
<?php
/**
 * Check google play app
 *
 * @param string $url Url to check
 *
 * @return boolean True if it exists, false otherwise
 * @throws \Exception On Curl error, an exception is thrown
 */
function checkGooglePlayApp($url)
{
$curlOptions = array(
CURLOPT_RETURNTRANSFER => true,
CURLOPT_CUSTOMREQUEST => 'GET',
CURLOPT_URL => $url
);
$ch = curl_init();
curl_setopt_array($ch, $curlOptions);
$result = curl_exec($ch);
$http_code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
if ($curl_error = curl_error($ch))
{
// Exception has no CURL_ERROR constant; use the cURL error number as the code.
throw new \Exception($curl_error, curl_errno($ch));
}
curl_close($ch);
return $http_code == '200';
}
$url = 'https://play.google.com/store/apps/details?id=com.ketchapp.zigzaggameERRORERROR';
$result = checkGooglePlayApp($url);
var_dump($result); // Should return false
$url = 'https://play.google.com/store/apps/details?id=com.ketchapp.zigzaggame';
$result = checkGooglePlayApp($url);
var_dump($result); // Should return true
It will return :
bool(false)
bool(true)
This can easily be done with the get_headers function. For example, for an incorrect URL:
$file = 'https://play.google.com/store/apps/details?id=com.ketchapp.zigzaggamessdasd';
$file_headers = get_headers($file);
print_r($file_headers);
This will return:
Array
(
[0] => HTTP/1.0 404 Not Found
[1] => Cache-Control: no-cache, no-store, max-age=0, must-revalidate
[2] => Pragma: no-cache
[3] => Expires: Fri, 01 Jan 1990 00:00:00 GMT
[4] => Date: Tue, 03 Mar 2015 04:23:31 GMT
[5] => Content-Type: text/html; charset=utf-8
[6] => Set-Cookie: NID=67=QFThy03gh34QypYfoLFTz7bJDI-qzXvuzI05DtrF3aVs1L7NJO9byV6kemHRVVkViz-sodx3Z0GuCQTu9a_1JvToen6ZtjfhNy8MH6DDgH6zix2I4Gm9mauBPCxipnlG;Domain=.google.com;Path=/;Expires=Wed, 02-Sep-2015 04:23:31 GMT;HttpOnly
[7] => P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
[8] => X-Content-Type-Options: nosniff
[9] => X-Frame-Options: SAMEORIGIN
[10] => X-XSS-Protection: 1; mode=block
[11] => Server: GSE
[12] => Alternate-Protocol: 443:quic,p=0.08
[13] => Accept-Ranges: none
[14] => Vary: Accept-Encoding
)
If the app does exist, it will return:
Array
(
[0] => HTTP/1.0 200 OK
[1] => Content-Type: text/html; charset=utf-8
[2] => Set-Cookie: PLAY_PREFS=CgJVUxC6uYnvvSkourmJ770p:S:ANO1ljKvPst7-nSw; Path=/; Secure; HttpOnly
[3] => Set-Cookie: NID=67=iFUl_Ls8EhAJE7STIJD7Wdq6NF-y4i6Xrlb78My75ZaruVWlAKObDRDNGDddGxD0hSsLRpvrQK7Tp5nuKCgGg2jF1GUf9_4H_zYsUDQ548Be2n8EDjp9clDfXKLYjmSg;Domain=.google.com;Path=/;Expires=Wed, 02-Sep-2015 04:26:14 GMT;HttpOnly
[4] => Cache-Control: no-cache, no-store, max-age=0, must-revalidate
[5] => Pragma: no-cache
[6] => Expires: Fri, 01 Jan 1990 00:00:00 GMT
[7] => Date: Tue, 03 Mar 2015 04:26:14 GMT
[8] => P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
[9] => X-Content-Type-Options: nosniff
[10] => X-Frame-Options: SAMEORIGIN
[11] => X-XSS-Protection: 1; mode=block
[12] => Server: GSE
[13] => Alternate-Protocol: 443:quic,p=0.08
[14] => Accept-Ranges: none
[15] => Vary: Accept-Encoding
)
So you can create a script like:
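<?php
$files = ['https://play.google.com/store/apps/details?id=com.ketchapp.zigzaggame', 'https://play.google.com/store/apps/details?id=com.ketchapp.zigzaggamesadasd'];
$results = [];
foreach($files as $file)
{
    $headers = get_headers($file);
    // Record a result for every link instead of returning after the first one.
    // A link counts as valid when headers came back and the status line is not a 404.
    $results[$file] = ($headers !== false && $headers[0] != 'HTTP/1.0 404 Not Found');
}
print_r($results);
?>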
You can simply do this:
function checkGooglePlayApp($url)
{
$headers = get_headers($url);
return $headers[0] == 'HTTP/1.0 404 Not Found';
}
$inValid = checkGooglePlayApp("https://play.google.com/store/apps/details?id=com.ketchapp.zigzaggame");
if(!$inValid)
{
echo "URL Valid";
}
else{
echo "URL Invalid";
}

PHP check download link without downloading the file

On my site I have a couple of links for downloading a file, and I want to make a PHP script that checks whether each download link is still online.
This is the code I'm using:
$cl = curl_init($url);
curl_setopt($cl,CURLOPT_CONNECTTIMEOUT,10);
curl_setopt($cl,CURLOPT_HEADER,true);
curl_setopt($cl,CURLOPT_NOBODY,true);
curl_setopt($cl,CURLOPT_RETURNTRANSFER,true);
if(!curl_exec($cl)){
echo 'The download link is offline';
die();
}
$code = curl_getinfo($cl, CURLINFO_HTTP_CODE);
if($code != 200){
echo 'The download link is offline';
}else{
echo 'The download link is online!';
}
The problem is that it downloads the whole file which makes it really slow, and I only need to check the headers. I saw that curl has an option CURLOPT_CONNECT_ONLY, but the webhost I'm using has php version 5.4 which doesn't have that option. Is there any other way I can do this?
CURLOPT_CONNECT_ONLY would be good, but it's only available in PHP 5.5 and above. So instead, try using get_headers, or another method using fopen, stream_context_create and stream_get_meta_data. First, the get_headers method:
// Set a test URL.
$url = "https://www.google.com/";
// Get the headers.
$headers = get_headers($url);
// Check if the headers are empty.
if(empty($headers)){
echo 'The download link is offline';
die();
}
// Use a regex to see if the response code is 200.
preg_match('/\b200\b/', $headers[0], $matches);
// Act on whether the matches are empty or not.
if(empty($matches)){
echo 'The download link is offline';
}
else{
echo 'The download link is online!';
}
// Dump the array of headers for debugging.
echo '<pre>';
print_r($headers);
echo '</pre>';
// Dump the array of matches for debugging.
echo '<pre>';
print_r($matches);
echo '</pre>';
And the output of this—including the dumps used for debugging—would be:
The download link is online!
Array
(
[0] => HTTP/1.0 200 OK
[1] => Date: Sat, 14 Jun 2014 15:56:28 GMT
[2] => Expires: -1
[3] => Cache-Control: private, max-age=0
[4] => Content-Type: text/html; charset=ISO-8859-1
[5] => Set-Cookie: PREF=ID=6e3e1a0d528b0941:FF=0:TM=1402761388:LM=1402761388:S=4YKP2U9qC6aMgxpo; expires=Mon, 13-Jun-2016 15:56:28 GMT; path=/; domain=.google.com
[6] => Set-Cookie: NID=67=Wun72OJYmuA_TQO95WXtbFOK5g-xU53PQZ7dAIBtzCaBWxhXzduHQZfBVPf4LpaK3MVH8ZKbrBIc3-vTKuMlEnMdpWH0mcft5pA_0kCoe4qolDmednpPJqezZF_HyfXD; expires=Sun, 14-Dec-2014 15:56:28 GMT; path=/; domain=.google.com; HttpOnly
[7] => P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
[8] => Server: gws
[9] => X-XSS-Protection: 1; mode=block
[10] => X-Frame-Options: SAMEORIGIN
[11] => Alternate-Protocol: 443:quic
)
Array
(
[0] => 200
)
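One caveat with the get_headers method: by default it issues a GET request. A sketch of switching it to HEAD by changing the default stream context, which get_headers() uses when no context is passed (the helper name headRequestContextOptions is an assumption for illustration):

```php
<?php
// Returns context options that make the http wrapper send HEAD requests.
function headRequestContextOptions()
{
    return array('http' => array('method' => 'HEAD'));
}

// Apply the options to the default context used by get_headers().
stream_context_set_default(headRequestContextOptions());

// Usage: $headers = get_headers('https://www.google.com/');
```

Changing the default context affects later stream calls in the same script as well, so it may be worth restoring the method to GET afterwards if the script also fetches page bodies.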
And here is another method using fopen, stream_context_create & stream_get_meta_data. The benefit of this method is it gives you a bit more info on what actions were taken to fetch the URL in addition to the headers:
// Set a test URL.
$url = "https://www.google.com/";
// Set the stream_context_create options.
$opts = array(
'http' => array(
'method' => 'HEAD'
)
);
// Create context stream with stream_context_create.
$context = stream_context_create($opts);
// Use fopen with rb (read binary) set and the context set above.
$handle = fopen($url, 'rb', false, $context);
// Get the headers with stream_get_meta_data.
$headers = stream_get_meta_data($handle);
// Close the fopen handle.
fclose($handle);
// Use a regex to see if the response code is 200.
preg_match('/\b200\b/', $headers['wrapper_data'][0], $matches);
// Act on whether the matches are empty or not.
if(empty($matches)){
echo 'The download link is offline';
}
else{
echo 'The download link is online!';
}
// Dump the array of headers for debugging.
echo '<pre>';
print_r($headers);
echo '</pre>';
And here is the output of that:
The download link is online!
Array
(
[wrapper_data] => Array
(
[0] => HTTP/1.0 200 OK
[1] => Date: Sat, 14 Jun 2014 16:14:58 GMT
[2] => Expires: -1
[3] => Cache-Control: private, max-age=0
[4] => Content-Type: text/html; charset=ISO-8859-1
[5] => Set-Cookie: PREF=ID=32f21aea66dcfd5c:FF=0:TM=1402762498:LM=1402762498:S=NVP-y-kW9DktZPAG; expires=Mon, 13-Jun-2016 16:14:58 GMT; path=/; domain=.google.com
[6] => Set-Cookie: NID=67=mO_Ihg4TgCTizpySHRPnxuTp514Hou5STn2UBdjvkzMn4GPZ4e9GHhqyIbwap8XuB8SuhjpaY9ZkVinO4vVOmnk_esKKTDBreIZ1sTCsz2yusNLKA9ht56gRO4uq3B9I; expires=Sun, 14-Dec-2014 16:14:58 GMT; path=/; domain=.google.com; HttpOnly
[7] => P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
[8] => Server: gws
[9] => X-XSS-Protection: 1; mode=block
[10] => X-Frame-Options: SAMEORIGIN
[11] => Alternate-Protocol: 443:quic
)
[wrapper_type] => http
[stream_type] => tcp_socket/ssl
[mode] => rb
[unread_bytes] => 0
[seekable] =>
[uri] => https://www.google.com/
[timed_out] =>
[blocked] => 1
[eof] =>
)
Try adding curl_setopt($cl, CURLOPT_CUSTOMREQUEST, 'HEAD'); to send a HEAD request.
