I have to download large videos with cURL, because the site I download from returns 403 errors to other methods and only cURL gets through. Below is my PHP cURL function. I need it to handle the download, but it shouldn't eat up my memory.
function fakeip()
{
//return long2ip( mt_rand(0, 65537) * mt_rand(0, 65535) );
return "127.0.0.1";
}
function curlx($feed,$coo=null,$ref=null)
{
$ch = curl_init();
$timeout = 0;
curl_setopt ($ch, CURLOPT_URL, $feed);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HTTPHEADER, array("REMOTE_ADDR: ".fakeip(),"X-Client-IP: ".fakeip(),"Client-IP: ".fakeip(),"HTTP_X_FORWARDED_FOR: ".fakeip(),"X-Forwarded-For: ".fakeip()));
curl_setopt($ch,CURLOPT_USERAGENT,'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0');
//curl_setopt($ch, CURLOPT_INTERFACE, "88.150.225.55");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_FAILONERROR, true);
if(!empty($coo))
{
curl_setopt($ch, CURLOPT_COOKIEFILE, $coo);
curl_setopt($ch, CURLOPT_COOKIEJAR, $coo);
}
if(empty($ref))
{
curl_setopt($ch, CURLOPT_REFERER,$feed);
}
else
{
curl_setopt($ch, CURLOPT_REFERER,$ref);
}
curl_setopt ($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$veri= curl_exec($ch);
curl_close($ch);
return str_replace(array("\n", "\t", "\r",), null, $veri);
}
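One common way to keep memory use flat is to hand cURL an open file handle through CURLOPT_FILE, so each chunk is written to disk as it arrives instead of being returned as one large string. The sketch below assumes that approach. download_to_file() and its parameters are hypothetical names, and the header-spoofing options from curlx() are left out for brevity.

```php
<?php
// Sketch: stream a response straight to disk so the body never sits in a PHP string.
// download_to_file() is a hypothetical helper, not part of the question's code.
function download_to_file(string $url, string $dest, ?string $referer = null): bool
{
    $fp = fopen($dest, 'wb');
    if ($fp === false) {
        return false;
    }

    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $fp);             // write body chunks to $fp as they arrive
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_FAILONERROR, true);     // treat HTTP status >= 400 as a failure
    curl_setopt($ch, CURLOPT_REFERER, $referer ?? $url);
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:40.0) Gecko/20100101 Firefox/40.0');

    // With CURLOPT_FILE set and CURLOPT_RETURNTRANSFER unset, curl_exec() returns true or false.
    $ok = curl_exec($ch);
    curl_close($ch);
    fclose($fp);

    if ($ok === false) {
        unlink($dest);   // do not leave a partial file behind
        return false;
    }
    return true;
}
```

Note that stripping "\n", "\t" and "\r" from the result, as curlx() does, would corrupt binary video data, so the sketch writes the bytes untouched.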
I have used PHP cURL to fetch or echo a page's HTML, but with the code below it suddenly redirects.
$cookie = tempnam ("/tmp", "CURLCOOKIE");
$ch = curl_init();
function get_data( $ch, $url, $post, $cookie ){
$agent = "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.12) Gecko/20050915 Firefox/1.0.7";
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
//curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 0);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);
if( $post != '' )
curl_setopt($ch, CURLOPT_POSTFIELDS, $post);
return curl_exec($ch);
}
$url = 'https://iapps.courts.state.ny.us/webcivil/FCASSearch?param=I';
$html = get_data( $ch, $url, '', '' );
echo $html; exit;
I have played with these options:
CURLOPT_RETURNTRANSFER,
CURLOPT_FOLLOWLOCATION,
CURLOPT_COOKIEJAR,
CURLOPT_COOKIEFILE
But I still get redirected when trying to fetch the HTML. How can I get the HTML of the page, or is there anything else I should try?
Here is a fixed, working version that grabs the source of the page.
$cookie = tempnam ("/tmp", "CURLCOOKIE");
$ch = curl_init();
function get_data( $curl, $url, $post, $cookie ){
$agent = "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.12) Gecko/20050915 Firefox/1.0.7";
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_USERAGENT, $agent);
curl_setopt($curl, CURLOPT_COOKIEFILE, $cookie);
curl_setopt($curl, CURLOPT_COOKIEJAR, $cookie);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 0);
curl_setopt($curl, CURLOPT_HEADER, 0);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 2);
if( $post != '' )
curl_setopt($curl, CURLOPT_POSTFIELDS, $post);
return curl_exec($curl);
}
$url = 'https://iapps.courts.state.ny.us/webcivil/FCASSearch?param=I';
$html = get_data( $ch, $url, '', '' );
echo htmlspecialchars($html);
But have you seen what you get back? It is almost entirely JavaScript, which doesn't seem very useful to parse.
You can take the idea from this code. Set $live_url to the page whose HTML content you want to fetch.
$live_url = "http://www.example.com/page/header.php";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $live_url);
curl_setopt($ch, CURLOPT_TIMEOUT, 1000);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$content = curl_exec($ch);
$res = curl_getinfo($ch);
curl_close($ch);
echo $content;
I am running Linux on a VPS.
When I run this command:
curl https://www.bloomingdales.com/account/signin -H "Cookie: ewqeqweq" -X GET
... I get the source code of the site.
I tried to write the same command in PHP:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://www.bloomingdales.com/account/signin');
curl_setopt($ch, CURLOPT_ENCODING ,"");
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_COOKIESESSION, 1);
curl_setopt($ch, CURLOPT_HEADER, array("Cookie: ewqeqweq"));
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie.txt');
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie.txt');
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
curl_setopt($ch, CURLOPT_REFERER, 'https://www.bloomingdales.com/account/signin');
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:47.0) Gecko/20100101 Firefox/47.0');
$get_ = curl_exec($ch);
echo $get_;
curl_close($ch);
However, the result of this code is NULL.
What is wrong with my code?
You can use my source code
function _curl($url,$post="",$usecookie = false,$_sock = false,$timeout = false) {
$ch = curl_init();
if($post) {
curl_setopt($ch, CURLOPT_POST ,1);
curl_setopt ($ch, CURLOPT_POSTFIELDS, $post);
}
if($timeout){
curl_setopt($ch, CURLOPT_TIMEOUT,$timeout);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
}
if($_sock){
curl_setopt($ch, CURLOPT_PROXY, $_sock);
curl_setopt($ch, CURLOPT_PROXYTYPE, CURLPROXY_SOCKS5);
}
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.10) Gecko/20100914 Firefox/3.6.10");
if ($usecookie) {
curl_setopt($ch, CURLOPT_COOKIEJAR, $usecookie);
curl_setopt($ch, CURLOPT_COOKIEFILE, $usecookie);
}
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Language: en-US,en;q=0.5'
));
// Let cURL send Accept-Encoding itself and decode the compressed response
curl_setopt($ch, CURLOPT_ENCODING, '');
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
$result=curl_exec ($ch);
curl_close ($ch);
return $result;
}
$socks5 = '176.126.196.52:24369';
$cookie = tempnam('cookies','coo'.rand(1000000,9999999));
$url = "https://www.bloomingdales.com/account/signin";
$post = "";
$s = _curl($url,$post,$cookie,$socks5,'');
echo $s;
unlink($cookie);
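For comparison with the command-line call in the question, here is a minimal sketch. CURLOPT_HEADER is a boolean that controls whether response headers are included in the output, so request headers such as Cookie go through CURLOPT_HTTPHEADER instead. Whether this alone explains the NULL result is an assumption, and fetch_with_cookie() is a hypothetical helper name.

```php
<?php
// Sketch: the command-line request expressed as PHP cURL options.
// fetch_with_cookie() is a hypothetical helper, not code from the thread.
function fetch_with_cookie(string $url, string $cookie): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,                    // return the body instead of printing it
        CURLOPT_HEADER         => false,                   // boolean: include response headers in the output?
        CURLOPT_HTTPHEADER     => ['Cookie: ' . $cookie],  // request headers belong here
        CURLOPT_ENCODING       => '',                      // accept and decode any supported compression
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_TIMEOUT        => 10,
    ]);

    $body = curl_exec($ch);
    if ($body === false) {
        // Read the error before closing the handle.
        $error = curl_error($ch);
        curl_close($ch);
        throw new RuntimeException('cURL request failed: ' . $error);
    }

    curl_close($ch);
    return $body;
}
```

Checking for false and reading curl_error() makes a failed transfer visible instead of echoing an empty value.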
Here is my cURL code:
function curl_cookieset() {
$fp = fopen("cookie.txt", "w");
fclose($fp);
/*visit the homepage to set the cookie properly */
$ch = curl_init ("http://www.autoscout24.de/ListGN.aspx?vis=1&state=A&atype=C&cy=D&page=1&results=20&ustate=N,U&sort=price&rfde=True&custtype=P&zipc=D");
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt");
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.001 (windows; U; NT4.0; en-US; rv:1.0) Gecko/25250101");
curl_setopt($ch, CURLOPT_HEADER, 0);
$output = curl_exec ($ch);
}
function curl_download($Url ){
if (!function_exists('curl_init')){
die('Sorry cURL is not installed!');
}
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $Url);
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt");
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.001 (windows; U; NT4.0; en-US; rv:1.0) Gecko/25250101');
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
$output = curl_exec($ch);
curl_close($ch);
return $output;
}
And this is how I use it,
curl_cookieset();
while(true) {
//some other process
$result = curl_download($url);
//some other process
}
This works as expected on localhost, but when I run it on an external server, it doesn't set the cookie and I don't get the same result.
Note: cookie.txt is created.
cookie.txt chmod = 666
What could be the problem?
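One possible cause, offered as an assumption rather than a diagnosis, is that the relative path "cookie.txt" resolves against a different working directory on the server than on localhost. A minimal sketch that pins the jar to an absolute path and fails loudly when the directory is not writable is shown below. cookie_jar_path() and apply_cookie_jar() are hypothetical helper names.

```php
<?php
// Sketch: build an absolute cookie jar path and verify the directory is writable.
// Both helpers are hypothetical and not part of the question's code.
function cookie_jar_path(string $dir, string $name = 'cookie.txt'): string
{
    $real = realpath($dir);
    if ($real === false || !is_dir($real) || !is_writable($real)) {
        throw new RuntimeException("Cookie directory is not writable: $dir");
    }
    return $real . DIRECTORY_SEPARATOR . $name;
}

// Point both the read path and the write path at the same absolute file.
function apply_cookie_jar($ch, string $jar): bool
{
    return curl_setopt($ch, CURLOPT_COOKIEJAR, $jar)
        && curl_setopt($ch, CURLOPT_COOKIEFILE, $jar);
}
```

In the question's functions this would replace the literal "cookie.txt", for example with $jar = cookie_jar_path(__DIR__); followed by apply_cookie_jar($ch, $jar);.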
Any idea where my code is going wrong? I am trying to connect through a proxy with the cURL functions in PHP. I assumed the proxy worked because I tried a few from this list, http://hidemyass.com/proxy-list/search-234921, but I can't get any of them to work.
Thoughts?
function my_fetch($url,$user_agent='Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)')
{
$ch = curl_init();
curl_setopt ($ch, CURLOPT_URL, $url);
curl_setopt ($ch, CURLOPT_USERAGENT, $user_agent);
curl_setopt ($ch, CURLOPT_HEADER, 0);
curl_setopt ($ch, CURLOPT_REFERER, 'http://www.google.com/');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, 1);
curl_setopt($ch, CURLOPT_PROXY, '75.74.244.122:1523');
$data = curl_exec();
curl_close($ch);
return $result;
}
It doesn't look like the proxy you are using is working:
jasonfunk#jasonfunk-laptop:$ telnet 75.74.244.122 1523
Trying 75.74.244.122...
telnet: Unable to connect to remote host: Connection refused
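Apart from the dead proxy, the function in the question calls curl_exec() without the handle and returns $result, which is never assigned, so it could not return data even through a working proxy. A corrected sketch follows. The optional $proxy parameter, the connect timeout, and the error logging are additions for illustration, not part of the original.

```php
<?php
// Sketch: my_fetch() with the handle passed to curl_exec() and the fetched data returned.
// The $proxy parameter is a hypothetical addition so the proxy is no longer hard-coded.
function my_fetch(
    string $url,
    string $user_agent = 'Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)',
    ?string $proxy = null
) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_REFERER, 'http://www.google.com/');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

    if ($proxy !== null) {
        curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, 1);
        curl_setopt($ch, CURLOPT_PROXY, $proxy);          // "host:port"
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);     // give up quickly on dead proxies
    }

    $data = curl_exec($ch);                               // the handle must be passed in
    if ($data === false) {
        error_log('cURL error: ' . curl_error($ch));      // e.g. a refused proxy connection
    }
    curl_close($ch);

    return $data;                                         // string on success, false on failure
}
```

Logging curl_error() would have surfaced the refused connection directly, without needing telnet.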
You can try multiple proxies, picking one at random for each request, with this script.
Get a random proxy:
function get_random_proxy(){
// Read one proxy per line, dropping newlines and blank lines
$f_contents = file("proxy.txt", FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$line = $f_contents[array_rand($f_contents)];
return trim($line);
}
Call the cURL function with one randomly chosen proxy:
function get_curl_proxy($url){
$proxy_ip = get_random_proxy();
$agent = "Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/532.4 (KHTML, like Gecko) Chrome/4.0.233.0 Safari/532.4";
$referer = "http://www.google.com/";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, 1);
curl_setopt($ch, CURLOPT_PROXY, $proxy_ip);
curl_setopt($ch, CURLOPT_REFERER, $referer);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_MAXREDIRS, 2);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
$data = curl_exec($ch);
curl_close($ch);
return $data;
}
For further reference see this
http://altafphp.blogspot.in/2012/06/using-proxies-with-curl-in-php.html
I am writing a script to download files from a password-protected members area. It works right now by using one cURL call to log in and then download. What I would like instead is for one script to log in and save the cookie, and then another script to use that cookie to download the file. I am not sure whether this is possible.
Here is my working code:
$cookie_file_path = "downloads/cookie.txt";
$fp = fopen($cookie_file_path, "w");
fclose($fp);
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_NOBODY, false);
curl_setopt($ch, CURLOPT_URL, $loginUrl);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);
curl_setopt($ch, CURLOPT_USERAGENT,
"Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.12) Gecko/20050915 Firefox/1.0.7");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "POST");
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $loginPostInfo);
curl_exec($ch);
// hardcode some known data
$downloadSize = 244626770;
$chuckSize = 1024*2048;
$filePath = "downloads/file.avi";
$file = fopen($filePath, "w");
$downloaded = 0;
$startTime = microtime(true);
while ($downloaded < $downloadSize) {
// DOWNLOAD
curl_setopt($ch, CURLOPT_RANGE, $downloaded."-".($downloaded + $chuckSize - 1));
curl_setopt($ch, CURLOPT_URL, $downloadUrl);
$result = curl_exec($ch);
$nowTime = microtime(true);
fwrite($file, $result);
echo "\n\nprogress: ".$downloaded."/".$downloadSize." - %".(round($downloaded / $downloadSize, 4) * 100);
$downloaded += $chuckSize;
// calculate kbps
$totalTime = $nowTime - $startTime;
$kbps = $downloaded / $totalTime;
echo "\ndownloaded: ".$downloaded." bytes";
echo "\ntime: ".round($totalTime, 2);
echo "\nkbps: ".(round($kbps / 1024, 2));
}
fclose($file);
curl_close($ch);
So is it possible to close the curl after the login curl_exec and then open a curl call again to download the file using the cookie I saved during the login part?
Yes, it's possible.
CURLOPT_COOKIEJAR is the write path for cookies, while CURLOPT_COOKIEFILE is the read path for cookies. If you provide CURLOPT_COOKIEFILE with the same path as you did with CURLOPT_COOKIEJAR, cURL will persist the cookies across requests:
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path);
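The split into two steps can be sketched as follows: one function logs in and lets cURL write the jar, and a second function, which could live in a separate script, reads the same file and streams the download to disk. The function names and parameters are hypothetical, and the login URL, POST fields, and download URL are assumed to come from the question's variables.

```php
<?php
// Sketch: step 1 logs in and saves cookies; step 2 reuses them on a fresh handle.
// Both function names are hypothetical.
function login_and_save_cookies(string $loginUrl, $postFields, string $cookieFile): bool
{
    $ch = curl_init($loginUrl);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $postFields);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);   // write path for cookies

    $ok = curl_exec($ch) !== false;
    curl_close($ch);   // the jar is flushed once the handle is closed or released
    return $ok;
}

function download_with_cookies(string $downloadUrl, string $dest, string $cookieFile): bool
{
    $fp = fopen($dest, 'wb');
    if ($fp === false) {
        return false;
    }

    $ch = curl_init($downloadUrl);
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile);  // read path for cookies
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_FAILONERROR, true);
    curl_setopt($ch, CURLOPT_FILE, $fp);                // stream the body to disk

    $ok = curl_exec($ch);
    curl_close($ch);
    fclose($fp);
    return $ok === true;
}
```

The second function only needs the cookie file path, so it works from another script as long as the session is still valid on the server.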
On ImpressPages I've done it this way:
//initial request with login data
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://www.example.com/login.php');
curl_setopt($ch, CURLOPT_USERAGENT,'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/32.0.1700.107 Chrome/32.0.1700.107 Safari/537.36');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, "username=XXXXX&password=XXXXX");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_COOKIESESSION, true);
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie-name'); //could be empty, but cause problems on some hosts
curl_setopt($ch, CURLOPT_COOKIEFILE, '/var/www/ip4.x/file/tmp'); //could be empty, but cause problems on some hosts
$answer = curl_exec($ch);
if (curl_error($ch)) {
echo curl_error($ch);
}
//another request preserving the session
curl_setopt($ch, CURLOPT_URL, 'http://www.example.com/profile');
curl_setopt($ch, CURLOPT_POST, false);
curl_setopt($ch, CURLOPT_POSTFIELDS, "");
$answer = curl_exec($ch);
if (curl_error($ch)) {
echo curl_error($ch);
}
Yes, please look at CURLOPT_COOKIEJAR and CURLOPT_COOKIEFILE. I see you already use CURLOPT_COOKIEJAR, so you probably only need to dive into CURLOPT_COOKIEFILE.