Is it possible to check the response headers (200 = OK) and download a file in a single cURL request?
Here is my code. The problem is that it makes two requests, so the second request can return a different response, and the saved file will be overwritten with it. This is a problem with a rate-limited API. I searched here on Stack Overflow, but most solutions still make two requests.
// Check response first, we don't want to download the response error to the file
$urlCheck = checkRemoteFile($to_download);
if ($urlCheck) {
// Response is 200, continue
} else {
// Do not overwrite existing file
echo 'Download failed, response code header is not 200';
exit();
}
// File Handling
$new_file = fopen($downloaded, "w") or die("cannot open" . $downloaded);
// Setting the curl operations
$cd = curl_init();
curl_setopt($cd, CURLOPT_URL, $to_download);
curl_setopt($cd, CURLOPT_FILE, $new_file);
curl_setopt($cd, CURLOPT_TIMEOUT, 30); // timeout is 30 seconds, to download the large files you may need to increase the timeout limit.
// Running curl to download file
curl_exec($cd);
if (curl_errno($cd)) {
echo "the cURL error is : " . curl_error($cd);
} else {
$status = curl_getinfo($cd);
echo $status["http_code"] == 200 ? "File Downloaded" : "The error code is : " . $status["http_code"] ;
// the http status 200 means everything is going well. the error codes can be 401, 403 or 404.
}
// close and finalize the operations.
curl_close($cd);
fclose($new_file);
# FUNCTIONS
function checkRemoteFile($url) {
$curl = curl_init($url);
//don't fetch the actual page, you only want to check the connection is ok
curl_setopt($curl, CURLOPT_NOBODY, true);
//do request
$result = curl_exec($curl);
$ret = false;
//if request did not fail
if ($result !== false) {
//if request was ok, check response code
$statusCode = curl_getinfo($curl, CURLINFO_HTTP_CODE);
if ($statusCode == 200) {
$ret = true;
}
}
curl_close($curl);
return $ret;
}
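One way to avoid the second request is to stream the body to a temporary file, read the status code from the same handle, and only then move the file into place. The following is a minimal sketch under that assumption, not a definitive implementation; `finalizeDownload` and `downloadIfOk` are hypothetical names introduced here for illustration.

```php
<?php
// Hypothetical helper: keep the temp file only if the status code is 200.
function finalizeDownload(string $tmpPath, string $destPath, int $httpCode): bool
{
    if ($httpCode === 200) {
        return rename($tmpPath, $destPath);
    }
    unlink($tmpPath); // discard the error body; $destPath is left untouched
    return false;
}

// Hypothetical wrapper: a single GET request, body streamed to a temp file.
function downloadIfOk(string $url, string $destPath, int $timeout = 30): bool
{
    $tmpPath = tempnam(sys_get_temp_dir(), 'dl_');
    $fp = fopen($tmpPath, 'w');
    if ($fp === false) {
        return false;
    }
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $fp);
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
    $ok = curl_exec($ch);
    $code = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    fclose($fp);
    // A transport error is treated the same way as a non-200 response.
    return finalizeDownload($tmpPath, $destPath, $ok === false ? 0 : $code);
}
```

Because the existing file is only replaced after a 200 response, an error page never overwrites a good download, and the rate-limited API is hit once per file.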
I am trying to retrieve information about a file from its URL. How can I get that information before downloading the file to my server?
I need file information such as file size, file type, etc.
I found code to validate and download the file, but how do I get the information before actually downloading the file to the server?
<?php
function is_url_exist($url)
{
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_exec($ch);
$code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
// $status is always defined and is a real boolean
$status = ($code == 200);
curl_close($ch);
if ($status)
{
$name = "abc.png";
if (file_put_contents("uploads/$name", file_get_contents($url)))
echo "file uploaded";
else
echo "error check upload link";
}
}
$url = "http://theonlytutorials.com/wp-content/uploads/2015/06/blog-logo1.png";
is_url_exist($url);
?>
You can get information about a remote file with the get_headers function. Try the following code to find the content type, content length, etc.
$url = "http://theonlytutorials.com/wp-content/uploads/2015/06/blog-logo1.png";
$headers = get_headers($url,1);
print_r($headers);
For more about get_headers, see http://php.net/manual/en/function.get-headers.php
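To turn that array into the size and type directly, something like the following sketch could be used. `remoteFileInfo` is a hypothetical helper, and it assumes the array came from `get_headers($url, 1)`, where header names are the keys. Note that a server may omit Content-Length, in which case the size comes back as null.

```php
<?php
// Hypothetical helper: pull size and type out of a get_headers($url, 1) array.
function remoteFileInfo(array $headers): array
{
    // Header names are case-insensitive, so normalise the keys first.
    $h = array_change_key_case($headers, CASE_LOWER);
    $pick = function (string $name) use ($h) {
        if (!isset($h[$name])) {
            return null;
        }
        // After redirects a header can appear several times; keep the last one.
        return is_array($h[$name]) ? end($h[$name]) : $h[$name];
    };
    $size = $pick('content-length');
    return [
        'size' => $size === null ? null : (int) $size,
        'type' => $pick('content-type'),
    ];
}
```

Usage would look like `$info = remoteFileInfo($headers);` after checking that `get_headers` did not return false.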
I am writing a PHP program that downloads a PDF from a backend and saves it to a local drive. How do I check whether the file exists before downloading?
Currently I am using curl (see code below) to check and download, but it still downloads the file, which is 1 KB in size.
$url = "http://wedsite/test.pdf";
$path = "C:\\test.pdf";
downloadAndSave($url,$path);
function downloadAndSave($urlS,$pathS)
{
$fp = fopen($pathS, 'w');
$ch = curl_init($urlS);
curl_setopt($ch, CURLOPT_FILE, $fp);
$data = curl_exec($ch);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
echo $httpCode;
//If 404 is returned, then file is not found.
if ($httpCode == 404)
{
echo $httpCode;
echo $urlS;
}
fclose($fp);
}
I want to check whether the file exists before even downloading. Any idea how to do it?
You can do this with a separate curl HEAD request:
curl_setopt($ch, CURLOPT_NOBODY, true);
$data = curl_exec($ch);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
When you actually want to download, you can set CURLOPT_NOBODY back to false.
Call this before your download function and it's done:
<?php function remoteFileExists($url) {
$curl = curl_init($url);
//don't fetch the actual page, you only want to check the connection is ok
curl_setopt($curl, CURLOPT_NOBODY, true);
//do request
$result = curl_exec($curl);
$ret = false;
//if request did not fail
if ($result !== false) {
//if request was ok, check response code
$statusCode = curl_getinfo($curl, CURLINFO_HTTP_CODE);
if ($statusCode == 200) {
$ret = true;
}
}
curl_close($curl);
return $ret;
}
?>
Since you are using HTTP to fetch a resource on the internet, what you really want to check is that the return code is a 404.
You might expect file_exists($url) to work out of the box, but the http:// wrapper does not support stat(), so file_exists() does not work for HTTP URLs. http://www.php.net/manual/en/wrappers.http.php
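Here is a function much like file_exists but for URLs, using get_headers:
<?php function curl_exists($url) {
// @ suppresses the warning emitted when the request fails
$file_headers = @get_headers($url);
if ($file_headers === false || $file_headers[0] == 'HTTP/1.1 404 Not Found') {
$exists = false;
}
else {
$exists = true;
}
return $exists;
} ?>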
source: http://www.php.net/manual/en/function.file-exists.php#75064
Sometimes the CURL extension isn't installed with PHP. In that case you can still use the socket library in the PHP core:
<?php function url_exists($url) {
$a_url = parse_url($url);
if (!isset($a_url['port'])) $a_url['port'] = 80;
$errno = 0;
$errstr = '';
$timeout = 30;
if(isset($a_url['host']) && $a_url['host']!=gethostbyname($a_url['host'])){
$fid = fsockopen($a_url['host'], $a_url['port'], $errno, $errstr, $timeout);
if (!$fid) return false;
$page = isset($a_url['path']) ? $a_url['path'] : '/';
$page .= isset($a_url['query'])?'?'.$a_url['query']:'';
fputs($fid, 'HEAD '.$page.' HTTP/1.0'."\r\n".'Host: '.$a_url['host']."\r\n\r\n");
$head = fread($fid, 4096);
// keep everything if the server did not send 'Connection: close'
$end = strpos($head, 'Connection: close');
if ($end !== false) $head = substr($head, 0, $end);
fclose($fid);
// match the status code as a whole, not as a character class
if (preg_match('#^HTTP/\S+\s+(200|302)\s#i', $head)) {
$pos = strpos($head, 'Content-Type');
return $pos !== false;
}
return false;
} else {
return false;
}
} ?>
source: http://www.php.net/manual/en/function.file-exists.php#73175
An even faster function can be found here:
http://www.php.net/manual/en/function.file-exists.php#76246
In the first example above $file_headers[0] may contain more than or something other than 'HTTP/1.1 404 Not Found', e.g:
HTTP/1.1 404 Document+%2Fdb%2Fscotbiz%2Freports%2FR20131212%2Exml+not+found
So it's important to use some other test, e.g., regex, as '==' is not reliable.
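As a rough illustration of such a regex test, the status code can be pulled out of the status line and compared as a number, so the reason phrase no longer matters. `statusCodeFromLine` is a hypothetical name used only for this sketch.

```php
<?php
// Hypothetical helper: extract the numeric code from an HTTP status line.
function statusCodeFromLine(string $statusLine): ?int
{
    // Accepts "HTTP/1.1 404 ..." as well as "HTTP/2 200".
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})\b#', $statusLine, $m)) {
        return (int) $m[1];
    }
    return null; // not a status line
}
```

With this, the check becomes `statusCodeFromLine($file_headers[0]) === 404` instead of a comparison against the full string.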
I basically created a script using Curl and PHP that sends data to the website e.g. host, port and time. Then it submits the data. How would I know if the Curl/PHP actually sent those data to the web pages?
$fullcurl = "?host=".$host."&time=".$time;
Is there any way to see whether it actually sent the data to those URLs on my MySQL?
You can use curl_getinfo() to get the status code of the response like so:
// set up curl to point to your requested URL
$ch = curl_init($fullcurl);
// tell curl to return the result content instead of outputting it
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
// execute the request, I'm assuming you don't care about the result content
curl_exec($ch);
if (curl_errno($ch)) {
// this would be your first hint that something went wrong
die('Couldn\'t send request: ' . curl_error($ch));
} else {
// check the HTTP status code of the request
$resultStatus = curl_getinfo($ch, CURLINFO_HTTP_CODE);
if ($resultStatus == 200) {
// everything went better than expected
} else {
// the request did not complete as expected. common errors are 4xx
// (not found, bad request, etc.) and 5xx (usually concerning
// errors/exceptions in the remote script execution)
die('Request failed: HTTP status code: ' . $resultStatus);
}
}
curl_close($ch);
For reference: http://en.wikipedia.org/wiki/List_of_HTTP_status_codes
Or, if you are making requests to some sort of API that returns information on the result of the request, you would need to actually get that result and parse it. This is very specific to the API, but here's an example:
// set up curl to point to your requested URL
$ch = curl_init($fullcurl);
// tell curl to return the result content instead of outputting it
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
// execute the request, but this time we care about the result
$result = curl_exec($ch);
if (curl_errno($ch)) {
// this would be your first hint that something went wrong
die('Couldn\'t send request: ' . curl_error($ch));
} else {
// check the HTTP status code of the request
$resultStatus = curl_getinfo($ch, CURLINFO_HTTP_CODE);
if ($resultStatus != 200) {
die('Request failed: HTTP status code: ' . $resultStatus);
}
}
curl_close($ch);
// let's pretend this is the behaviour of the target server
if ($result == 'ok') {
// everything went better than expected
} else {
die('Request failed: Error: ' . $result);
}
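Many APIs return a structured body rather than a bare string. As a sketch only, assuming a purely hypothetical JSON response of the form {"status": "ok"} or {"status": "error", "message": "..."}, the parsing step might look like this (`apiErrorFromBody` is an illustrative name, not part of any real API):

```php
<?php
// Hypothetical response shape: {"status": "ok"} or
// {"status": "error", "message": "..."}. Returns null on success,
// otherwise a description of what went wrong.
function apiErrorFromBody(string $body): ?string
{
    $data = json_decode($body, true);
    if (!is_array($data) || !isset($data['status'])) {
        return 'Unparseable response';
    }
    if ($data['status'] === 'ok') {
        return null;
    }
    return isset($data['message']) ? (string) $data['message'] : 'Unknown error';
}
```

The string returned by `curl_exec` would be passed in as `$body`; the real field names depend entirely on the API being called.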
To be sure that curl actually sends something, you will need a packet sniffer.
You can try Wireshark, for example.
I hope this will help you,
Jerome Wagner
Is there a way to check if the server responds with an error code before sending a user there?
Currently, I am redirecting based on user editable input from the backend (client request, so they can print their own domain, but send people elsewhere), but I want to check if the URL will actually respond, and if not send them to our home page with a little message.
You can do this with CURL:
$ch = curl_init('http://www.example.com/');
//make a HEAD request - we don't need the response body
curl_setopt($ch, CURLOPT_NOBODY, true);
// Execute
curl_exec($ch);
// Check if any error occurred
if(!curl_errno($ch))
{
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE); //integer status code
}
// Close handle
curl_close($ch);
You can then check if $httpCode is OK. Generally a 2XX response code is ok.
You could try the following, but beware that this is a separate request from the redirect, so if something goes wrong in between, a user can still get sent to an erroneous location.
$headers = get_headers($url);
if ($headers !== false && strpos($headers[0], '200') !== FALSE) {
// redirect to $url
} else {
// redirect to homepage with error notice
}
The PHP manual for get_headers(): http://www.php.net/manual/en/function.get-headers.php
I don't understand what you mean by making sure the URL will respond. But if you want to display a message you can use a $_SESSION variable. Just remember to put session_start() on every page that will use the variable.
So when you want to redirect them back to the home page, you could do this.
// David Caunt's answer
$ch = curl_init('http://www.example.com/');
// Execute
curl_exec($ch);
// Check if any error occurred
if(!curl_errno($ch))
{
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE); //integer status code
// My addition
if( $httpCode >= 200 && $httpCode < 300 ) {
// All is good
}else {
// This doesn't exist
// Set the error message
$_SESSION['error_message'] = "This domain doesn't exist";
// Send the user back to the home page
header('Location: /home.php'); // url based: http://your-site.com/home.php
exit; // stop the script so nothing runs after the redirect
}
// My addition ends here
}
// Close handle
curl_close($ch);
Then on your home page, you'll need something like this.
// Make sure the error_message is set
if( isset($_SESSION['error_message']) ) {
// Put the error on the page
echo '<div class="notification warning">' . $_SESSION['error_message'] . '</div>';
}
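As written, the message stays in the session and would be shown on every later visit. A small sketch of a read-once helper, assuming you want the message displayed a single time (`consumeFlash` is a hypothetical name):

```php
<?php
// Hypothetical helper: read a one-time message and remove it from the session array.
function consumeFlash(array &$session, string $key): ?string
{
    if (!isset($session[$key])) {
        return null;
    }
    $message = (string) $session[$key];
    unset($session[$key]); // so it is not shown again on the next page load
    return $message;
}
```

On the home page this would be called as `$msg = consumeFlash($_SESSION, 'error_message');` after session_start(), and the div echoed only when `$msg` is not null.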
I need to create a function that returns if a URL is reachable or valid.
I am currently using something like the following to determine a valid url:
static public function urlExists($url)
{
$fp = @fopen($url, 'r');
if($fp)
{
return true;
}
return false;
}
It seems like there would be something faster, maybe something that just fetched the page header or something.
You can use curl as follows:
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_NOBODY, true); // set to HEAD request
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // don't output the response
curl_exec($ch);
$valid = curl_getinfo($ch, CURLINFO_HTTP_CODE) == 200;
curl_close($ch);
You could check the HTTP status code.
Here is some code you could use to check that a URL returns a 2xx or 3xx HTTP code, to ensure the URL works.
<?php
$url = "http://stackoverflow.com/questions/1122845";
function urlOK($url)
{
$url_data = parse_url ($url);
if (!$url_data) return FALSE;
$errno="";
$errstr="";
$fp=0;
$fp=fsockopen($url_data['host'],80,$errno,$errstr,30);
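if($fp===false) return FALSE; // fsockopen returns false on failure
// the path already starts with '/', so default to '/' and do not add another
$path = isset($url_data['path']) ? $url_data['path'] : '/';
if (isset( $url_data['query'])) $path .= '?' .$url_data['query'];
$out="GET $path HTTP/1.1\r\n";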
$out.="Host: {$url_data['host']}\r\n";
$out.="Connection: Close\r\n\r\n";
fwrite($fp,$out);
$content=fgets($fp);
$code=trim(substr($content,9,4)); //get http code
fclose($fp);
// if http code is 2xx or 3xx url should work
return ($code[0] == 2 || $code[0] == 3) ? TRUE : FALSE;
}
echo $url;
if (urlOK($url)) echo " is a working URL";
else echo " is a bad URL";
?>
Hope this helps!
You'll likely be limited to sending some kind of HTTP request. Then you can check HTTP status codes.
Be sure to send only a "HEAD" request, which doesn't pull back all the content. That ought to be sufficient and lightweight enough.
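For a socket-based check, the HEAD request itself is just a few lines of text. The sketch below only builds that text from a URL; `buildHeadRequest` is a hypothetical helper, and sending it over a socket and reading the status line is left to the surrounding code.

```php
<?php
// Hypothetical helper: build the raw text of an HTTP/1.1 HEAD request for a URL.
function buildHeadRequest(string $url): ?string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'])) {
        return null; // not an absolute URL
    }
    // An empty path must be sent as "/".
    $target = $parts['path'] ?? '/';
    if (isset($parts['query'])) {
        $target .= '?' . $parts['query'];
    }
    return "HEAD $target HTTP/1.1\r\n"
         . "Host: {$parts['host']}\r\n"
         . "Connection: close\r\n\r\n";
}
```

With curl, the equivalent is setting CURLOPT_NOBODY to true, which makes the request method HEAD.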