cURL - store everything from a webpage - PHP

I'm saving cookies in a text file using this code:
$cookie_file_path = "".dirname(__FILE__)."/cookie.txt"; // Please set your Cookie File path
$fp = fopen($cookie_file_path,'wb');
fclose($fp);
$ch = curl_init();
// other curl functions here //
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);
$loginpage_html = curl_exec ($ch);
curl_close ($ch);
It saves the cookies to cookie.txt in the same folder as the script, and it reuses the same cookies when connecting.
I'd like to save the images (plus CSS, scripts and everything else) to the same folder. Any advice?

I suggest using the PHP DOM extension: http://php.net/manual/en/book.dom.php
It's quite similar to the DOM in JavaScript. You loop through the relevant tags such as <img>, <script> and <link> (for stylesheets), read their src or href attributes to get the URLs of the referenced resources, and retrieve those using the same cURL setup or file_get_contents.
Check out the DOM manual; it has a lot of useful comments.
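As a rough sketch of the DOM approach described above, the following collects resource URLs from an HTML string. The function name collect_resource_links and the tag-to-attribute map are illustrative assumptions, not part of the original answer. Relative URLs are returned as-is and would still need to be resolved against the page URL before downloading.

```php
<?php
// Hypothetical helper: collect URLs of images, scripts and stylesheets from an HTML string.
function collect_resource_links(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings silently instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Tag name => attribute holding the resource URL (an assumed, non-exhaustive map).
    $map = ['img' => 'src', 'script' => 'src', 'link' => 'href'];
    $links = [];
    foreach ($map as $tag => $attr) {
        foreach ($doc->getElementsByTagName($tag) as $node) {
            // For <link>, keep only stylesheets.
            if ($tag === 'link' && strtolower($node->getAttribute('rel')) !== 'stylesheet') {
                continue;
            }
            $url = trim($node->getAttribute($attr));
            if ($url !== '') {
                $links[] = $url;
            }
        }
    }
    // Drop duplicates and reindex.
    return array_values(array_unique($links));
}

$html = '<html><head><link rel="stylesheet" href="/css/site.css"><script src="app.js"></script></head>'
      . '<body><img src="logo.png"><img src="logo.png"></body></html>';
print_r(collect_resource_links($html));
```

The returned array can then be fed to a cURL loop that reuses the same cookie file.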

Try wget with the recursive switch (-r).

First, I see you create the file using fopen and fclose; you can just use touch for that.
cURL only fetches the contents of the requested page. What you can do is parse the HTML for links and then use cURL in a loop to fetch each of them.
There is an option, CURLOPT_FILE, that sets where the output goes. For example:
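<?php
foreach ($links as $link) {
    // Save each resource next to this script, named after the last part of the URL
    $file = dirname(__FILE__) . "/" . basename($link);
    // CURLOPT_FILE needs an open file handle, not a path
    $fp = fopen($file, 'wb');
    // Fetch the resource, reusing the cookies from the login request
    $ch = curl_init($link);
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path);
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);
    // Write the response body straight to the file (no CURLOPT_RETURNTRANSFER needed)
    curl_setopt($ch, CURLOPT_FILE, $fp);
    curl_exec($ch);
    curl_close($ch);
    fclose($fp);
}
?>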
I didn't check that code, but that's a base for what you want. Just use a regex or some other functions to get the links.

Related

Download a file from Dropbox with cURL

I would like some help with the following problem. I'm on Windows and I can download any file using cURL, but when it comes to downloading from Dropbox I'm unable to do it. Even if I use ?raw=1 or ?dl=1, which should redirect me to the file, I still can't do it.
Here is the script I'm using:
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,'any url?raw=1');
$fp = fopen('backup.wpress', 'w+');
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_exec ($ch);
curl_close ($ch);
fclose($fp);
Thanks in advance. I would be very grateful for any suggestions and help.
There's a 302 redirect on that URL, so you'll need to add the line
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
(You might also want to edit that URL out of the post; I'm not sure whether it's sensitive data.)
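Putting the question's script and the suggested option together, a minimal sketch could look like this. The function name download_following_redirects and the redirect cap of 5 are assumptions made for illustration; the URL and target path are supplied by the caller.

```php
<?php
// Hypothetical helper: download $url to $path, following HTTP redirects.
// Returns the final URL after redirects, or false on failure.
function download_following_redirects(string $url, string $path, int $maxRedirects = 5)
{
    $fp = fopen($path, 'wb');
    if ($fp === false) {
        return false;
    }
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $fp);                // stream the body into the file
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);     // follow Location headers (e.g. a 302)
    curl_setopt($ch, CURLOPT_MAXREDIRS, $maxRedirects); // give up on redirect loops
    $ok = curl_exec($ch);
    $finalUrl = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
    curl_close($ch);
    fclose($fp);
    return $ok ? $finalUrl : false;
}
```

Calling it with the shared link (including ?dl=1) and 'backup.wpress' as the path would mirror the original script; the returned URL shows where the redirect chain ended.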

Download image using PHP but image is htaccess redirected?

I want to download an image to my server using PHP. This image's html only allows target="_self" meaning it can only be downloaded from the browser apparently. I try to access the image directly in the browser and I get redirected. Is there any way to download this image onto my server via PHP? Maybe I'm missing an option in cURL?
Thanks!
Yes, you have to tell cURL to follow redirects. Try this function:
function wgetImg($img, $pathToSaveTo) {
$ch = curl_init($img);
$fp = fopen($pathToSaveTo, 'wb');
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_setopt($ch, CURLOPT_AUTOREFERER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_exec($ch);
curl_close($ch);
fclose($fp);
}

How to ensure a download does not break using PHP cURL

I am using PHP's cURL functions to download a list of PDFs from the backend, but sometimes some of the PDFs are corrupted.
I think this happens because a download breaks off and the script starts downloading the next PDF before the previous download has completed.
Any idea how to prevent this? I am using the code below:
function downloadAndSave($urlS,$pathS)
{
$fp = fopen($pathS, 'w');
$ch = curl_init();
curl_setopt($ch,CURLOPT_PROXY,"http://test:1234");
curl_setopt($ch,CURLOPT_PROXYPORT,1234);
curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, 0);
curl_setopt($ch,CURLOPT_URL,$urlS);
curl_setopt($ch, CURLOPT_FILE, $fp);
$data = curl_exec($ch);
fclose($fp);
}
I have tried using CURLOPT_CONNECTTIMEOUT, but it made no difference. Is there any other way to prevent this?
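One thing worth checking first: curl_exec() blocks until the transfer has finished or failed, so a later download should not start while an earlier one is still running. A corrupted file is therefore more likely a transfer that failed partway, or an error page saved as a PDF. The sketch below checks the result of each transfer and deletes partial files. The name download_checked is hypothetical, and the proxy options from the question are left out for brevity.

```php
<?php
// Hypothetical helper: download $url to $path and report whether the transfer succeeded.
// A partial file is deleted on failure so it cannot be mistaken for a complete PDF.
function download_checked(string $url, string $path): bool
{
    $fp = fopen($path, 'wb');
    if ($fp === false) {
        return false;
    }
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $fp);
    curl_setopt($ch, CURLOPT_FAILONERROR, true);    // treat HTTP status >= 400 as a failure
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    $ok = curl_exec($ch);                           // blocks until this transfer has finished
    $error = curl_error($ch);
    curl_close($ch);
    fclose($fp);

    if ($ok === false) {
        error_log("Download of $url failed: $error");
        unlink($path);
        return false;
    }
    return true;
}
```

A caller could retry a URL for which the function returns false, rather than moving on with a broken file.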

Can I use a URL as the source for imagecreatefromjpeg() without enabling fopen wrappers?

I know it’s possible to use imagecreatefromjpeg(), imagecreatefrompng(), etc. with a URL as the ‘filename’ with fopen(), but I'm unable to enable the wrappers due to security issues. Is there a way to pass a URL to imagecreatefromX() without enabling them?
I’ve also tried using cURL, and that too is giving me problems:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,"http://www.../image31.jpg"); //Actually complete URL to image
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$data = curl_exec($ch);
curl_close($ch);
$image = imagecreatefromstring($data);
var_dump($image);
imagepng($image);
imagedestroy($image);
You can download the file using cURL then pipe the result into imagecreatefromstring.
Example:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $imageurl);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the body as a string instead of printing it
curl_setopt($ch, CURLOPT_BINARYTRANSFER, 1); // no effect since PHP 5.1.3: RETURNTRANSFER already returns the raw bytes
$data = curl_exec($ch);
curl_close($ch);
$image = imagecreatefromstring($data);
You could even implement a cURL based stream wrapper for 'http' using stream_wrapper_register.
You could always download the image (e.g. with cURL) to a temporary file, and then load the image from that file.
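The temporary-file route mentioned above might look roughly like this. It assumes the cURL and GD extensions are loaded; the function name load_image_via_temp_file and the choice to support only JPEG, PNG and GIF are illustrative assumptions.

```php
<?php
// Hypothetical helper: fetch an image with cURL and load it into GD via a
// temporary local file, so no URL-aware fopen wrapper is needed.
function load_image_via_temp_file(string $url)
{
    $tmp = tempnam(sys_get_temp_dir(), 'img');
    $fp = fopen($tmp, 'wb');
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $fp);            // write the response body to the temp file
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    $ok = curl_exec($ch);
    curl_close($ch);
    fclose($fp);

    $image = false;
    if ($ok) {
        // getimagesize() detects the real format from the file contents.
        $info = @getimagesize($tmp);
        if ($info !== false) {
            switch ($info[2]) {
                case IMAGETYPE_JPEG: $image = imagecreatefromjpeg($tmp); break;
                case IMAGETYPE_PNG:  $image = imagecreatefrompng($tmp);  break;
                case IMAGETYPE_GIF:  $image = imagecreatefromgif($tmp);  break;
            }
        }
    }
    unlink($tmp); // the temp file is no longer needed once GD has read it
    return $image; // GD image on success, false otherwise
}
```

Compared with imagecreatefromstring(), this keeps the downloaded bytes out of a PHP string, at the cost of a short-lived file on disk.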

capturing curl stdout into variable in PHP

Task
Downloading binary files from a remote media processing server to a web server.
This works but I cannot capture the curl stdout output
$result = shell_exec('curl -v http://domain.com/images/dir/dir/dir/file.jpg --user username:password -o /usr/www/htdocs/images/dir/dir/dir/file.jpg');
Note:
I have had no luck using the PHP curl wrapper methods for this task, partly because I've never used PHP curl wrappers for FTP downloads.
Question
Can someone explain how to capture the output from the command that I am shelling out to or a simple example using PHP curl wrappers?
Here is what I tried and the part I'm stumped on is how to get the new file placed on the target server - The line that's wrong is the CURLOPT_FILE line. I don't want to have to create a stub file, open it and then write to it.
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $ftpserver.$file['directory'].$filename); #input
curl_setopt($curl, CURLOPT_FILE, $dest); #output
curl_setopt($curl, CURLOPT_USERPWD, "$user:$password");
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
$result = curl_exec($curl);
Thanks
According to a comment in the PHP manual, you must close both your cURL handle AND your file handle before the file is fully written. I'll copy the example here for search purposes:
<?php
$fh = fopen('/tmp/foo', 'w');
$ch = curl_init('http://example.com/foo');
curl_setopt($ch, CURLOPT_FILE, $fh);
curl_exec($ch);
curl_close($ch);
# at this point the file may still be incomplete or corrupted
fclose($fh);
# now you can use your file
readfile('/tmp/foo');
?>
Also, I would debate the merits of using CURLOPT_RETURNTRANSFER with CURLOPT_FILE, as CURLOPT_RETURNTRANSFER tells curl to return the fetched results as the result of curl_exec(). You probably don't need that if you're just writing it to a file.
$dest needs to be a file resource opened via
$dest = fopen("filename.ext", "w");
Does that work for you?
This fixed it: curl -v writes its verbose output to stderr, so redirecting stderr to stdout with 2>&1 lets shell_exec capture it.
So this works:
$result = shell_exec('curl -v http://domain.com/images/dir/dir/dir/file.jpg --user username:password -o /usr/www/htdocs/images/dir/dir/dir/file.jpg 2>&1');
It would be great to be able to use the PHP curl wrappers but not if it means opening a stub file on the source machine and then writing to it. The curl command line version just moves over the file, nice and neat. If anyone knows how to do what I'm trying to do using PHP wrappers, I'd love to see how you do it.
/**
 * Downloads a binary file into a string
 * @param string $file URL of the file
 * @param string $ref Referer
 * @return string $downloaded_binary string containing the binary file
 */
function download_file($file, $ref)
{
$curl_obj = curl_init();
curl_setopt($curl_obj, CURLOPT_URL, $file);
curl_setopt($curl_obj, CURLOPT_REFERER, $ref);
curl_setopt($curl_obj, CURLOPT_BINARYTRANSFER, 1);
curl_setopt($curl_obj, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($curl_obj, CURLOPT_MAXREDIRS, SPDR_MAX_REDIR);
curl_setopt($curl_obj, CURLOPT_FOLLOWLOCATION, TRUE); //followlocation cannot be used when safe_mode/open_basedir are on
curl_setopt($curl_obj, CURLOPT_SSL_VERIFYPEER, FALSE);
$downloaded_binary = curl_exec($curl_obj);
curl_close($curl_obj);
return $downloaded_binary;
}
I think the way to do this is with ob_start().
ob_start();
curl_exec($curl);
$result = ob_get_clean();
