cURL is retrieving encoded HTML from Pirate Bay - php

I'm writing a script that scrapes thepiratebay.se. It was working fine two or three days ago, but now I'm having problems with it.
This is my code:
$URL = 'http://thepiratebay.se';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $URL);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1");
curl_setopt($ch, CURLOPT_COOKIE, "language=pt_BR; c[thepiratebay.se][/][language]=pt_BR");
$fonte = curl_exec ($ch);
curl_close ($ch);
echo $fonte;
The response of this code is not clean HTML, but looks like this instead:
��[s۸N>��k�9��-ىmI7��$�8�.v��͕���$h���y�G�Sg:ӷ>�5����ʱ�aor&���.v)���������) d�w��8w�l����c�u""1����F*G��ِ�2$�6�C�}��z(bw�� 4Ƒz6�S��t4�K��x�6u���~�T���ACJb��T^3�USPI:Mf��n�'��4��� ��XE�QQ&�c5�`'β�T Y]D�Q�nBfS�}a�%� ���R) �Zn��̙ ��8IB�a����L�
I have already tried setting the user agent in .htaccess, in PHP, and in cURL, without success.

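The response appears to be gzip-compressed, and cURL is handing it back undecoded. Add this:
curl_setopt($ch, CURLOPT_ENCODING , "gzip");
I tested it in my local environment and it works fine with this option.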
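For context, here is a minimal sketch of the whole request with that option in place. It assumes the server compresses its responses. Passing an empty string to CURLOPT_ENCODING makes cURL advertise every encoding it supports and decode the body automatically. The function names fetch_decoded and looks_gzipped are illustrative only and do not come from the original answer; the manual gzdecode fallback is an extra safety net, not something the answer requires.

```php
<?php
// Returns true when $body starts with the gzip magic bytes (0x1f 0x8b),
// which is what an undecoded gzip response looks like.
function looks_gzipped(string $body): bool
{
    return strncmp($body, "\x1f\x8b", 2) === 0;
}

// Hypothetical helper: fetch $url and let cURL decode compressed responses.
// Returns the body as a string, or false if the transfer failed.
function fetch_decoded(string $url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    // "" sends an Accept-Encoding header listing all encodings this cURL
    // build supports and enables automatic decoding of the body.
    curl_setopt($ch, CURLOPT_ENCODING, '');
    $body = curl_exec($ch);
    curl_close($ch);

    // Fallback (assumption): if the body still arrives compressed,
    // try to decode it by hand with the zlib extension.
    if (is_string($body) && looks_gzipped($body)) {
        $decoded = gzdecode($body);
        if ($decoded !== false) {
            $body = $decoded;
        }
    }
    return $body;
}
```

Usage would be along the lines of echo fetch_decoded('http://thepiratebay.se'); in place of the original curl_exec call.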

Amazon Blocks cURL Request?

I am trying to use PHP cURL to fetch an Amazon web page, but I get
HTTP/1.1 503 Service Temporarily Unavailable instead. Is Amazon blocking cURL?
http://www.amazon.com/gp/offer-listing/B003B7Q5YY/
<?php
function get_html_content($url) {
// fake user agent
$userAgent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.2) Gecko/20070219 Firefox/2.0.0.2';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch,CURLOPT_COOKIEFILE,'cookies.txt');
curl_setopt($ch,CURLOPT_COOKIEJAR,'cookies.txt');
$string = curl_exec($ch);
curl_close($ch);
return $string;
}
echo get_html_content("http://www.amazon.com/gp/offer-listing/B003B7Q5YY");
?>
I use a simple request:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $offers_page);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 GTB5');
$html = curl_exec($ch);
curl_close($ch);
But I have another problem: if you send a lot of queries to Amazon, they start sending you a 500 page.
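One common way to cope with server errors under load is to slow down and retry. The following is only a sketch under stated assumptions: it treats any 5xx status as "try again later" and doubles the wait between attempts. Amazon's actual limits are not known here, and the names curl_get_with_status and fetch_with_backoff, as well as the default delays, are made up for illustration.

```php
<?php
// Default transport: performs a GET and returns [HTTP status, body].
function curl_get_with_status(string $url): array
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    $body = curl_exec($ch);
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    return [$status, $body === false ? '' : $body];
}

// Hypothetical helper: retry on 5xx responses, doubling the wait each time.
// $request is injectable so the retry logic can be tried without a network.
function fetch_with_backoff(
    string $url,
    int $maxAttempts = 4,
    int $baseDelayMs = 500,
    ?callable $request = null
): ?string {
    $request = $request ?? 'curl_get_with_status';
    for ($attempt = 0; $attempt < $maxAttempts; $attempt++) {
        list($status, $body) = $request($url);
        if ($status >= 200 && $status < 300) {
            return $body; // success
        }
        if ($status < 500) {
            return null; // not a 5xx error, so waiting is unlikely to help
        }
        if ($attempt < $maxAttempts - 1) {
            // Wait baseDelay * 2^attempt milliseconds before the next try.
            usleep($baseDelayMs * (2 ** $attempt) * 1000);
        }
    }
    return null; // still failing after all attempts
}
```

With the defaults, a failing URL is tried four times with pauses of 0.5, 1 and 2 seconds in between. Spacing out the requests in the first place is probably at least as important as retrying.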

curl_init not working

I'm using the following code, which falls back to cURL:
$url = "http://www.web_site.com";
$string = @file_get_contents($url);
if(!$string){
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0');
$string = curl_exec($ch);
curl_close($ch);
}
But suddenly my website stopped working because of this code, and once I remove the cURL part it works fine.
I thought my hosting provider had disabled cURL, so I checked, and it appears to be enabled.
So what is wrong? Any help is appreciated. What should I tell my hosting provider?
The file_get_contents function doesn't inspect the response headers. Try using cURL with CURLOPT_FOLLOWLOCATION enabled and CURLOPT_MAXREDIRS set to the value you prefer.
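A minimal sketch of that suggestion follows. The helper names build_fetch_options and fetch_url are hypothetical, and the timeout values are an assumption added here: a request that never times out is one possible reason for a page that appears to stop, but that is a guess, not a diagnosis.

```php
<?php
// Hypothetical helper: the suggested options collected in one array.
function build_fetch_options(string $url, int $maxRedirects = 5): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_HEADER         => false,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,          // follow Location: redirects
        CURLOPT_MAXREDIRS      => $maxRedirects, // but not forever
        CURLOPT_CONNECTTIMEOUT => 10,            // assumption: seconds to connect
        CURLOPT_TIMEOUT        => 30,            // assumption: seconds in total
    ];
}

// Returns the body, or null when cURL is unavailable or the request fails.
function fetch_url(string $url, int $maxRedirects = 5): ?string
{
    if (!function_exists('curl_init')) {
        return null; // the cURL extension is not loaded
    }
    $ch = curl_init();
    curl_setopt_array($ch, build_fetch_options($url, $maxRedirects));
    $body = curl_exec($ch);
    if ($body === false) {
        // Log the reason instead of failing silently.
        error_log('cURL error: ' . curl_error($ch));
        $body = null;
    }
    curl_close($ch);
    return $body;
}
```

Logging curl_error gives you something concrete to pass on to the hosting provider if the request keeps failing.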

CURL: PHP : can't submit

I need to write a PHP script that logs in to my admin page and then submits an RSS form.
I'm able to log in with the code below, but I can't submit the RSS form.
<?php
function rssadd($url,$post,$post2) {
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; ru; rv:1.9.0.4) Gecko/2008102920 AdCentriaIM/1.7 Firefox/3.0.4');
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post);
curl_setopt($ch, CURLOPT_REFERER, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$result = curl_exec($ch);
curl_close($ch);
$ch2 = curl_init($url);
curl_setopt($ch2, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; ru; rv:1.9.0.4) Gecko/2008102920 AdCentriaIM/1.7 Firefox/3.0.4');
curl_setopt($ch2, CURLOPT_POST, 1);
curl_setopt($ch2, CURLOPT_POSTFIELDS, $post2);
curl_setopt($ch2, CURLOPT_REFERER, $url);
curl_setopt($ch2, CURLOPT_RETURNTRANSFER, 1);
$result2 = curl_exec($ch2);
return $result . $result2;
}
$page2 = rssadd('http://site.com/admin.php?mod=rss&action=news&id=4','subaction=dologin&username=admin&password=pass','subaction=doit');
echo $page2;
?>
This is the HTML on "http://site.com/admin.php?mod=rss&action=news&id=4" that I'm not able to submit:
<input type="submit" name="subaction" value="doit" class="buttons">
Perhaps you need to retain cookies between requests? Try setting CURLOPT_COOKIEFILE to '' (the empty string). You'll also need to use the same curl handle on both requests - instead of closing the first handle and initializing a new one, just change the options on the first one and run it again.
I originally learned how to do this from this Stack Overflow answer: https://stackoverflow.com/a/5758471/638544
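The advice above could look roughly like this. It is a sketch, not a tested fix for the site in question: the function name rssadd_same_handle is hypothetical, and it assumes the admin page tracks the login through cookies.

```php
<?php
// Hypothetical rewrite of rssadd(): one handle, cookies kept in memory.
function rssadd_same_handle(string $url, string $loginPost, string $submitPost): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_REFERER, $url);
    // An empty file name switches on cURL's cookie engine without
    // reading any file; cookies then live as long as the handle does.
    curl_setopt($ch, CURLOPT_COOKIEFILE, '');

    // Step 1: log in. Any session cookie is stored on the handle.
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $loginPost);
    $login = curl_exec($ch);

    // Step 2: reuse the SAME handle, changing only the POST body,
    // so the session cookie from step 1 is sent along.
    curl_setopt($ch, CURLOPT_POSTFIELDS, $submitPost);
    $submit = curl_exec($ch);

    curl_close($ch);
    // curl_exec returns false on failure; the casts turn that into ''.
    return (string) $login . (string) $submit;
}
```

It takes the same three arguments as the original rssadd() call.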

cURL to traverse a login with many pages

I am trying to use cURL to automate a login that involves multiple steps. The problem is that I get the first page of the login fine, but on the next page I must select or follow a link to continue. How do I "keep going"? I've tried putting the next URL into my cURL code, but that does not work: it goes directly to that page and errors out because I have not gone through the first page of the login process. Here is my code.
$ch = curl_init();
$data = array('fp_software' => '', 'fp_screen' => '', 'fp_browser' => '','txtUsername' => "$username", 'btnLogin' => 'Log In');
curl_setopt($ch, CURLOPT_USERAGENT,'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
curl_setopt($ch, CURLOPT_URL, 'https://www.website.com/Login.aspx');
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_exec($ch);
curl_close ($ch);
The next URL is www.website.com/PassMarkFrame.aspx. Basically, I need to crawl through this login process.
I tried this, but it didn't work:
curl_setopt($ch, CURLOPT_USERAGENT,'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
curl_setopt($ch, CURLOPT_URL, 'https://www.website.com/Login.aspx'); // use the URL that shows up in your <form action="...url..."> tag
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_exec($ch);
curl_setopt($ch, CURLOPT_URL, 'https://www.website.com/PassMarkFrame.aspx'); // use the URL that shows up in your <form action="...url..."> tag
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_exec($ch);
curl_close ($ch);
Is that the right syntax?
Don't close the cURL handle after each stage. If cookies are being set and you haven't configured the cookiejar/cookiefile options, then each new handle starts out fresh and clean, with no "memory" of the previous requests.
Keep the same cURL handle going, and any cookies set by the site will be preserved.
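As a rough illustration of keeping one handle alive across several pages, the sketch below walks a list of steps and also stores cookies in a jar file. The function name run_login_steps and the step format are invented for this example, and sending the form fields URL-encoded via http_build_query is an assumption about what the site expects.

```php
<?php
// Hypothetical helper: run a list of steps on ONE handle.
// Each step is ['url' => string, 'post' => array or null].
// Returns the response body (or false) of every step, in order.
function run_login_steps(array $steps, string $cookieFile): array
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    // Read cookies from, and later save them to, the same file.
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile);
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);

    $pages = [];
    foreach ($steps as $step) {
        curl_setopt($ch, CURLOPT_URL, $step['url']);
        if (!empty($step['post'])) {
            curl_setopt($ch, CURLOPT_POST, true);
            curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($step['post']));
        } else {
            // Switch back to GET in case the previous step was a POST.
            curl_setopt($ch, CURLOPT_HTTPGET, true);
        }
        $pages[] = curl_exec($ch);
    }

    // Only now release the handle; the cookie jar is written at that point.
    curl_close($ch);
    return $pages;
}
```

For the login above, the first step would carry the Login.aspx URL with the $data array as 'post', and the second step the PassMarkFrame.aspx URL with 'post' set to null.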

PHP CURL Login with cookie

I have a chat service, and I would like to build an announcement bot that runs from cron and posts daily updates to the chat. The URL is http://www6.cbox.ws/box/?boxid=524970&boxtag=7xpsk7&sec=form
I have tried various cURL examples found online, but none of them gets the job done. This is my latest attempt, which failed:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www6.cbox.ws/box/?boxid=&boxtag=&sec=profile&n=andysmith&k=0000000000000000000000000000000000000000");
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_REFERER, 'http://www6.cbox.ws/box/?boxid=&boxtag=&sec=profile&n=andysmith&logpword=iloveJD');
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cbox.txt');
curl_setopt($ch, CURLOPT_COOKIEFILE, 'cbox.txt');
curl_exec($ch);
curl_close($ch);
I just need it to log in and post a message.
Try this:
$result=curl_exec($ch);
//print $result;
if($result === false)
{
echo '<br/>Curl error: '.curl_error($ch);
curl_close($ch);
exit;
}
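and see what error is reported.
Note: CURLOPT_REFERER only sets the Referer request header and does not depend on any other option. If you want to see the response headers in the output while debugging, you can also set
curl_setopt($ch, CURLOPT_HEADER, true);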
