Calculate MD5 of file being downloaded with PHP and CURL - php

I have a cURL call that downloads a large file.
Is it possible to calculate the hash while the file is still downloading?
I think the progress callback function is the right place to do that:
function get($urlget, $filename) {
//Init Stuff[...]
$this->fp = fopen($filename, "w+");
$ch = curl_init();
//[...] irrelevant curlopt stuff
curl_setopt($ch, CURLOPT_FILE, $this->fp);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_NOPROGRESS, 0);
curl_setopt($ch, CURLOPT_PROGRESSFUNCTION, array($this,'curl_progress_cb'));
$ret = curl_exec($ch);
// CURLINFO_HTTP_CODE is only meaningful after curl_exec() has run
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
if( curl_errno($ch) ){
$ret = FALSE;
}
curl_close($ch);
fclose($this->fp);
return $ret;
}
function curl_progress_cb($dltotal, $dlnow, $ultotal, $ulnow ){
//... Calculate MD5 of file here with $this->fp
}

It's possible to calculate the MD5 hash of a partially downloaded file, but it does not make much sense. Every downloaded byte changes the hash completely, so what is the reason for going with this kind of solution?
If you need the MD5 hash of the entire file, then the answer is no. Your program has to download the file first and then generate the hash.

I just did it.
Put the code below in a file named wget-md5.php:
<?php
function writeCallback($resource, $data)
{
global $handle;
global $handle_md5_val;
global $handle_md5_ctx;
$len = fwrite($handle,$data);
hash_update($handle_md5_ctx,$data);
return $len;
}
$handle=FALSE;
$handle_md5_val=FALSE;
$handle_md5_ctx=FALSE;
function wget_with_curl_and_md5_hashing($url,$uri)
{
global $handle;
global $handle_md5_val;
global $handle_md5_ctx;
$handle_md5_val=FALSE;
$handle_md5_ctx=hash_init('md5');
$handle = fopen($uri,'w');
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_BUFFERSIZE, 64000);
curl_setopt($curl, CURLOPT_WRITEFUNCTION, 'writeCallback');
echo "wget_with_curl_and_md5_hashing[".$url."]=downloading\n";
curl_exec($curl);
curl_close($curl);
fclose($handle);
$handle_md5_val = hash_final($handle_md5_ctx);
$handle_md5_ctx=FALSE;
echo "wget_with_curl_and_md5_hashing[".$url."]=downloaded,md5=".$handle_md5_val."\n";
}
wget_with_curl_and_md5_hashing("http://archlinux.polymorf.fr/core/os/x86_64/core.files.tar.gz","core.files.tar.gz");
?>
and run:
$ php -f wget-md5.php
wget_with_curl_and_md5_hashing[http://archlinux.polymorf.fr/core/os/x86_64/core.files.tar.gz]=downloading
wget_with_curl_and_md5_hashing[http://archlinux.polymorf.fr/core/os/x86_64/core.files.tar.gz]=downloaded,md5=5bc1ac3bc8961cfbe78077e1ebcf7cbe
$ md5sum core.files.tar.gz
5bc1ac3bc8961cfbe78077e1ebcf7cbe core.files.tar.gz
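The same approach can be written without globals by handing cURL a closure. This is only a sketch: make_hashing_writer() and download_with_md5() are hypothetical names, not part of the answer above, and error handling is kept to a minimum.

```php
<?php
// Hypothetical helper: returns a cURL write callback that writes each chunk
// to $fp and feeds the same bytes into the incremental hash context $ctx.
function make_hashing_writer($fp, $ctx): Closure
{
    return function ($ch, string $data) use ($fp, $ctx) {
        $written = fwrite($fp, $data);
        if ($written === false) {
            return 0; // returning fewer bytes than received makes cURL abort
        }
        hash_update($ctx, $data);
        return $written;
    };
}

// Hypothetical wrapper: downloads $url to $path and returns the MD5 hex
// digest of the received bytes, or false if the download failed.
function download_with_md5(string $url, string $path)
{
    $fp = fopen($path, 'wb');
    if ($fp === false) {
        return false;
    }
    $ctx = hash_init('md5');
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_WRITEFUNCTION, make_hashing_writer($fp, $ctx));
    $ok = curl_exec($ch);
    curl_close($ch);
    fclose($fp);
    return $ok === false ? false : hash_final($ctx);
}
```

Because hash_update() sees every chunk exactly once, in order, hash_final() should give the same digest as running md5sum on the finished file.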

Related

Curl request not flushing data periodically

I am making a cURL request to a script that flushes its output periodically. I want to display each piece of data in the calling script as soon as it is flushed, but instead the whole response is displayed at once after the request finishes. I want to display the response as it arrives.
Code
require_once 'mystream.php';
stream_wrapper_register("var", "mystream") or die("Failed to register protocol");
$myVar = '';
// Open the "file"
$fp = fopen("var://myVar", "r+");
// Configuration of curl
$ch = curl_init();
$output = ' ';
$url = '';
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $output);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 0 );
//curl_setopt($ch, CURLOPT_BUFFERSIZE, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FILE, $fp); // Data will be sent to our stream ;-)
curl_exec($ch);
curl_close($ch);
// Don't forget to close the "file" / stream
fclose($fp);
mystream.php code
<?php
class mystream{
protected $buffer = '';
function stream_open($path, $mode, $options, &$opened_path) {
return true;
}
public function stream_write($data) {
ob_start();
echo $data;
ob_end_flush();
ob_flush();
flush();
// Extract the lines; in my tests, data was 8192 bytes long, never more
$lines = explode("\n", $data);
// The buffer contains the end of the last line from the previous call
// => It goes at the beginning of the first line we are getting this time
$lines[0] = $this->buffer . $lines[0];
// And the last line is only partial
// => save it for next time, and remove it from the list this time
$nb_lines = count($lines);
$this->buffer = $lines[$nb_lines-1];
unset($lines[$nb_lines-1]);
// Here, do your work with the lines you have in the buffer
//var_dump($lines);
//echo '<hr />';
return strlen($data);
}
}
Any leads would be highly appreciated.
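The line-splitting logic described in the stream_write() comments can be isolated and checked on its own. The sketch below assumes a hypothetical LineBuffer class that is not part of the question's code; it keeps the trailing partial line between calls, as the comments describe.

```php
<?php
// Hypothetical helper class: collects arbitrary chunks and returns only
// the lines that are complete so far.
class LineBuffer
{
    private $partial = '';

    // Feed one chunk; returns the lines that this chunk completed.
    public function push(string $chunk): array
    {
        $lines = explode("\n", $this->partial . $chunk);
        // The last element is '' if the chunk ended on "\n",
        // otherwise it is the beginning of a line still in progress.
        $this->partial = array_pop($lines);
        return $lines;
    }

    // Whatever is left over once the stream has ended.
    public function rest(): string
    {
        return $this->partial;
    }
}
```

A write callback could call push() on each chunk it receives, then echo and flush() the returned lines.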

How to get the image extension from function file_get_contents()

I have a URL like this:
$url ="www.domain.com/image.php?id=123&idlocation=987&number=01";
Previously I got the extension with the following code:
$img_details= pathinfo($url);
But this no longer works because the URL now carries other query parameters. How can I get the image name and extension in this case?
I know I should first download the file using
$contenido = file_get_contents($url);
but I don't know how to get the file's name and extension from that.
Thanks in advance.
$ch = curl_init();
$url = 'http://www.domain.com/image.php?id=123&idlocation=987&number=01';
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_NOBODY, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// split() was removed in PHP 7; explode() does the same job here
$results = explode("\n", trim(curl_exec($ch)));
foreach($results as $line) {
if (strtok($line, ':') == 'Content-Type') {
$parts = explode(":", $line);
echo trim($parts[1]);
}
}
Return: image/png
Already answered in: Get mime type of external file using cURL and php
The answers above both focus on the MIME type. They will work in most cases, but both cost extra resources: either a second network call to get the MIME type, or extra disk reads and writes to save the file and run exif_imagetype() on it. Neither returns the file name, which was part of the question. Here is a download function using cURL that returns the downloaded file as an array with name, MIME type and content. If the server does not send a file name, it tries to take the name from the URL.
Sample usage
$file=downloadfile($url);
echo "Name: ".$file["name"]."<br>";
echo "Type: ".$file["mimetype"]."<br>";
Code
function downloadfile($url){
global $headers;
$headers=array();
$file=array("content"=>"","mimetype"=>"","name"=>"");
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_HEADERFUNCTION, 'readHeaders');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$file["content"]=curl_exec($ch);
if(sizeof($headers)){
foreach($headers as $header){
if(!strncasecmp($header,'content-type:',13)){
$file["mimetype"]=trim(substr($header,13));
}
elseif(!strncasecmp($header,'content-disposition:',20)){
$file["name"]=trim(substr(strstr($header,'filename='),9));
}
}
}
if(!$file["name"]){
// strpos() takes the haystack first; cut the URL at the query string, if any
$query=strpos($url,"?");
$file["name"]=basename(substr($url,0,($query!==false?$query:strlen($url))));
}
unset($headers);
return $file;
}
function readHeaders($ch, $header) {
global $headers;
array_push($headers, $header);
return strlen($header);
}
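Since the question asked for an extension and downloadfile() returns a MIME type, a small lookup can bridge the two. This is a sketch only: extension_from_mime() and its table are assumptions added for illustration, and the table covers just a few common image types.

```php
<?php
// Hypothetical lookup: map a Content-Type value to a file extension.
// Returns null for types that are not in the table.
function extension_from_mime(string $contentType): ?string
{
    static $map = [
        'image/jpeg' => 'jpg',
        'image/png'  => 'png',
        'image/gif'  => 'gif',
        'image/webp' => 'webp',
    ];
    // Drop parameters such as "; charset=binary" and normalise the case.
    $type = strtolower(trim(explode(';', $contentType)[0]));
    return $map[$type] ?? null;
}
```

It could be called as extension_from_mime($file["mimetype"]) on the array that downloadfile() returns.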

Running a for loop on an ASP page

I was trying to download the results of a batch and was coding a program for this.
On an aspx file, I was able to write the following PHP code since the URL included the parameters:
function get_data($url) {
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$data = curl_exec($ch);
curl_close($ch);
return $data;
}
for ($i=1; $i<15000; $i++) {
$url = "http://example.com/result.aspx?ClassId=342&TermId=95&StudentId=".$i;
$returned_content = get_data($url);
if (stripos($returned_content,'Roll') !== false) {
echo "Student ID:" . $i;
echo $returned_content;
}
}
However, when a result is queried on a .ASP file, the URL simply says 'results.asp' without any additional parameters. Is there a way to use CURL requests to run a for loop and download this data in a similar manner?
Thanks for any help!
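One possibility, and it is only an assumption about how results.asp works, is that the page receives its parameters in a POST body rather than in the URL. If so, the loop could send the same fields with CURLOPT_POSTFIELDS. The field names below are carried over from the .aspx URL and are hypothetical for the .asp page; the real names would have to be read from the form in the browser.

```php
<?php
// Build a form-encoded POST body. The field names are assumptions.
function build_result_query(int $studentId): string
{
    return http_build_query([
        'ClassId'   => 342,
        'TermId'    => 95,
        'StudentId' => $studentId,
    ]);
}

// Same idea as get_data() in the question, but sending a POST body.
function post_data(string $url, string $body)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $body);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}
```

Inside the for loop, post_data($url, build_result_query($i)) would then take the place of get_data($url).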

Saving an image from a PHP URL using curl and PHP

I have tried to download an image from a PHP link. When I try the link in a browser it downloads the image. I enabled curl and I set “allow_url_fopen” to true. I’ve used the methods discussed here Saving image from PHP URL but it didn’t work. I've tried "file_get_contents" too, but it didn't work.
I made a few changes, but it still doesn't work. This is the code:
$URL_path='http://…/index.php?r=Img/displaySavedImage&id=68';
$ch = curl_init ($URL_path);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_BINARYTRANSFER,1);
$raw=curl_exec($ch);
curl_close ($ch);
$fp = fopen($path_tosave.'temp_ticket.jpg','wb');
fwrite($fp, $raw);
fclose($fp);
Do you have any idea how to make it work? Please help. Thanks.
<?php
if( ini_get('allow_url_fopen') ) {
//set the index url
$source = file_get_contents('http://…/index.php?r=Img/displaySavedImage&id=68');
$filestr = "temp_ticket.jpg";
$fp = fopen($filestr, 'wb');
if ($fp !== false) {
fwrite($fp, $source);
fclose($fp);
}
else {
// File could not be opened for writing
}
}
else {
// allow_url_fopen is disabled
// See here for more information:
// http://php.net/manual/en/filesystem.configuration.php#ini.allow-url-fopen
}
?>
This is what I used to save an image without an extension (a dynamic image generated by the server). Hope it works for you. Just make sure that the file path is fully qualified and points to an image. As @ComFreek pointed out, you can use file_put_contents(), which is equivalent to calling fopen(), fwrite() and fclose() successively to write data to a file.
You can use it as a function:
function getFile($url){
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$tmp = curl_exec($ch);
curl_close($ch);
if ($tmp !== false){
return $tmp;
}
}
And to call it:
$content = getFile(URL);
Or save its content to a file:
file_put_contents(PATH, getFile(URL));
You're missing a closing quote and semicolon on the first line:
$URL_path='http://…/index.php?r=Img/displaySavedImage&id=68';
Also, your URL is in $URL_path but you initialise cURL with $path_img which is undefined based on the code in the question.
Why use cURL when file_get_contents() does the job?
<?php
$img = 'http://…/index.php?r=Img/displaySavedImage&id=68';
$data = file_get_contents( $img );
file_put_contents( 'img.jpg', $data );
?>

How can I get file name from header using curl in php?

How can I determine the file name from the response headers when I download a file with PHP? Look at this:
<?php
/*
This is useful when you are downloading big files, as it
will prevent the script from timing out:
*/
set_time_limit(0);
ini_set('display_errors',true);//Just in case we get some errors, let us know....
$fp = fopen (dirname(__FILE__) . '/tempfile', 'w+');//This is the file where we save the information
$ch = curl_init('http://www.example.com/getfile.php?id=4456'); // Here is the file we are downloading
/*
The server sends me a header 'Content-Disposition: attachment; filename="myfile.pdf"'.
I want to get 'myfile.pdf' from the headers. How can I get it?
*/
$fileNameFromHeader = '?????????????';
curl_setopt($ch, CURLOPT_TIMEOUT, 50);
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_exec($ch);
curl_close($ch);
fclose($fp);
// rename file
rename(dirname(__FILE__) . '/tempfile', dirname(__FILE__) . '/' . $fileNameFromHeader);
Create a callback that reads headers, and parse them yourself. Something like:
function readHeader($ch, $header)
{
global $responseHeaders;
$url = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
$responseHeaders[$url][] = $header;
return strlen($header);
}
... curl stuff ...
// if you need to call a class method use this:
// curl_setopt($ch, CURLOPT_HEADERFUNCTION, array(&$this,'readHeader'));
// for a non class method use this
curl_setopt($ch, CURLOPT_HEADERFUNCTION, 'readHeader');
$result = curl_exec($ch);
// all headers are now in $responseHeaders
I would use the function Byron mentioned. However, if you just want the fileNameFromHeader, I would include this in the readHeader function:
function readHeader($ch, $header)
{
global $responseHeaders;
global $fileNameFromHeader;
$url = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
$responseHeaders[$url][] = $header;
// $url = 'http://stackoverflow.com/questions/4091203/how-can-i-get-file-name-from-header-using-curl-in-php';
// Take the last path segment of the effective URL as the file name
$params = explode('/', $url);
$fileNameFromHeader = $params[count($params) - 1];
// A header callback must return the number of bytes it handled, or cURL aborts the transfer
return strlen($header);
}
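To read the name from the Content-Disposition header itself, as the question asks, each header line can be checked inside the callback. The function below is a sketch under assumptions: filename_from_header() is a hypothetical name, it handles only the plain filename= parameter (quoted or unquoted), and it ignores the extended filename*= form.

```php
<?php
// Hypothetical helper: pull the file name out of one raw header line.
// Returns null if the line is not a Content-Disposition header or
// carries no usable filename parameter.
function filename_from_header(string $header): ?string
{
    if (stripos($header, 'Content-Disposition:') !== 0) {
        return null;
    }
    // Group 1 captures a quoted value, group 2 an unquoted token.
    if (!preg_match('/filename\s*=\s*(?:"([^"]*)"|([^;\s]+))/i', $header, $m)) {
        return null;
    }
    $name = ($m[1] ?? '') !== '' ? $m[1] : ($m[2] ?? '');
    if ($name === '') {
        return null;
    }
    // basename() drops any path components the server may have sent.
    return basename($name);
}
```

Inside readHeader(), a non-null result can be stored in a variable for the later rename() call, while the callback itself still returns strlen($header).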
