I am trying to log into a page through cURL. A successful login redirects you to the actual site, where you see the content.
Basically, there are two URLs: the first is where the login credentials are posted, and the second is where the content is visible after login.
I managed to send a POST request to the login URL, and it successfully creates a valid cookie too, but I can't figure out how to use the cookie to see the content of the page at the second URL.
I am trying to do a normal cURL request (without CURLOPT_POSTFIELDS in the code) with these two options to retrieve the content of page 2, but when I view the source of the result, it just shows the HTML that redirects to the login URL.
curl_setopt($ch1, CURLOPT_COOKIEJAR, 'cookie.txt');
curl_setopt($ch1, CURLOPT_COOKIEFILE, 'cookie.txt');
Any ideas on what I might be doing wrong?
Try adding more options to your cURL request:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; MSIE 7.0; Windows NT 6.0; en-US)');
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt");
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$string = curl_exec ($ch);
curl_close ($ch);
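A minimal sketch of the two-step flow (POST to the login URL, then GET the protected page with the same cookie file) might look like the following. The helper names `sessionOptions()` and `sessionRequest()`, the example.com URLs, and the form field names `username` and `password` are assumptions for illustration; the real field names have to be taken from the login form.

```php
<?php
// Build the cURL options for one request that shares a cookie file.
// $postFields is null for a plain GET, or an array of form fields for a POST.
function sessionOptions(string $url, string $cookieFile, ?array $postFields = null): array
{
    $options = [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_COOKIEJAR      => $cookieFile, // received cookies are saved here
        CURLOPT_COOKIEFILE     => $cookieFile, // stored cookies are read from here and sent
    ];
    if ($postFields !== null) {
        $options[CURLOPT_POST]       = true;
        $options[CURLOPT_POSTFIELDS] = http_build_query($postFields); // URL-encodes the values
    }
    return $options;
}

// Run one request and return the response body, or false on failure.
function sessionRequest(string $url, string $cookieFile, ?array $postFields = null)
{
    $ch = curl_init();
    curl_setopt_array($ch, sessionOptions($url, $cookieFile, $postFields));
    $body = curl_exec($ch);
    unset($ch); // releasing the handle writes the cookie jar to disk
    return $body;
}

// Usage (hypothetical URLs and field names):
// $cookieFile = __DIR__ . '/cookie.txt';
// sessionRequest('https://example.com/login', $cookieFile, ['username' => 'me', 'password' => 'secret']);
// echo sessionRequest('https://example.com/members', $cookieFile);
```

The first request has to finish and release its handle before the second one starts, because the jar is only written to disk at that point.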
Hi, how can I search data on another website using cURL and PHP? I want to look up an IMEI number on this website: https://www.example.com/xxx
This is what I have tried so far:
$imei = '013887009861498';
$cookie_file_path = "cookies/cookiejar.txt";
$fp = fopen($cookie_file_path, "w") or die("<BR><B>Unable to open cookie file $cookie_file_path for write!<BR>");
fclose($fp);
$url="https://example.com/xxx";
$agent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.1.4322)";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,$url);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS,$imei);
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);
$result = curl_exec ($ch);
echo $result ;
(This is not a full answer, but it is too long to be a comment. I can't be arsed to figure out all the small details for you.)
There are several different problems here. The first is how to do a POST request with PHP/cURL, of which you can find an example here.
Another problem is how to parse HTML in PHP, for which there are several options listed here. (I highly recommend the DOMDocument & DOMXPath combo.)
Another problem is how to get past CAPTCHA challenges in PHP. One solution is to use the deathbycaptcha API (which is a paid service, by the way); you can find an example of that here.
Another problem is that they're using 3 different CSRF-like tokens, called __VIEWSTATE, __EVENTVALIDATION, and hdnCaptchaInstance, all of which must be parsed out and submitted with the captcha answer. You also need to handle cookies, as the CSRF tokens and the captcha are tied to your cookie session (luckily, you can let cURL handle cookies automatically with CURLOPT_COOKIEFILE).
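As a rough sketch of the token-parsing step with DOMDocument and DOMXPath: the helper name `extractHiddenFields()` is hypothetical, and it assumes the tokens are ordinary hidden `<input>` elements whose `type` attribute is written in lowercase.

```php
<?php
// Collect the value of every hidden <input> in the page, keyed by its name attribute.
function extractHiddenFields(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $fields = [];
    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//input[@type="hidden"][@name]') as $input) {
        $fields[$input->getAttribute('name')] = $input->getAttribute('value');
    }
    return $fields;
}

// Usage: $fields = extractHiddenFields($result);
// then merge $fields with the search value and the captcha answer before POSTing.
```

The returned array can be merged with the form's visible fields and passed through `http_build_query()` for CURLOPT_POSTFIELDS.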
I'm opening a page as logged user, and it kind of seems to work, except the website has some sort of a protection system. If I do this normally, I'll get the page I want, but if I do it with cURL, I'll get 'Welcome back user (userid)' and a link to the page I requested. Once I click the link, I'll get where I want to be. Now I tried faking the referer and checking the data that gets sent to the page, there's nothing special there. When I click the link, I simply get redirected to the page I wanted in the first place. My question is why doesn't this code get me there as well:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL , "http://www.site.com/sell/index");
curl_setopt($ch, CURLOPT_REFERER, 'http://www.site.com');
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like Gecko) Chrome/0.A.B.C Safari/525.13");
curl_setopt($ch, CURLOPT_COOKIESESSION, true);
curl_setopt($ch, CURLOPT_FAILONERROR, false);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_FRESH_CONNECT, false);
curl_setopt($ch, CURLOPT_POST, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt");
$response = curl_exec($ch);
curl_close($ch);
echo $response;
Just before I do this, I perform the login procedure and grab the cookie. I do get to open the page as a logged-in user; I just can't seem to access it without clicking the link.
PS. The same thing would happen if I logged in, opened the page I wanted, closed the browser, and opened it again. So I'm thinking it has to do with the referer?
CURLOPT_COOKIEJAR only tells cURL where to save the cookies it receives in the response; it does not send any stored cookies. That's why it is not working for you. Use CURLOPT_COOKIEFILE so that cURL sends the stored cookies with the request:
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
Also, use an absolute path (/var/tmp/cookie.txt) instead of a relative path.
Now, Be Happy!
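A relative path such as "cookie.txt" is resolved against the current working directory of the PHP process, which may not be the script's directory. One way to rule that out is sketched below; `cookieFilePath()` is a hypothetical helper, not a cURL function.

```php
<?php
// Return an absolute path for the cookie file inside $dir, creating the file
// if needed, and throw if this PHP process cannot write to that location.
function cookieFilePath(string $dir, string $name = 'cookie.txt'): string
{
    $realDir = realpath($dir);
    if ($realDir === false || !is_dir($realDir) || !is_writable($realDir)) {
        throw new RuntimeException("Cookie directory is not writable: $dir");
    }
    $path = $realDir . DIRECTORY_SEPARATOR . $name;
    if (!file_exists($path) && !touch($path)) {
        throw new RuntimeException("Cannot create cookie file: $path");
    }
    if (!is_writable($path)) {
        throw new RuntimeException("Cookie file is not writable: $path");
    }
    return $path;
}

// Usage with an existing handle $ch:
// $cookieFile = cookieFilePath(__DIR__);
// curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile);
// curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);
```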
The HTML output of the code below contains some additional data that is not present in the page at all. I compared this output with the browser's view-source. The extra data starts from "Find a different......"
$url : http://www.linkedin.com/pub/senthil-selvaraj/36/90b/5b9
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "$url");
curl_setopt($ch, CURLOPT_TIMEOUT, 30);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER,false);
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt");
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)");
if ($proxystatus == 'on')
{
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
curl_setopt($ch, CURLOPT_HTTPPROXYTUNNEL, TRUE);
curl_setopt($ch, CURLOPT_PROXY, $proxy);
}
$body = curl_exec($ch);
This is most probably connected to cookies or headers, as cURL does not emulate a real browser in every way. Your output can therefore differ, since cURL may send different request headers (Accept, Accept-Language, and so on) than your browser does.
Have you tried different browsers? Also, is the cURL request going out from the same IP address you are browsing from?
EDIT: What you can try is to install Firebug in Firefox, open it with the F12 key, switch to the Net (or Network) tab, and check which headers your browser sends to the server. Then you may be able to emulate these headers in your cURL request.
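Once the browser's request headers are known, they can be passed to cURL through CURLOPT_HTTPHEADER, which expects a list of "Name: value" strings. A small sketch follows; `buildHeaderLines()` is a hypothetical helper, and the header values are examples rather than the ones your browser actually sends.

```php
<?php
// Convert an associative array of header names and values into the
// list of "Name: value" strings that CURLOPT_HTTPHEADER expects.
function buildHeaderLines(array $headers): array
{
    $lines = [];
    foreach ($headers as $name => $value) {
        $lines[] = $name . ': ' . $value;
    }
    return $lines;
}

// Example values only; copy the real ones from the browser's network panel.
$browserHeaders = buildHeaderLines([
    'Accept'          => 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language' => 'en-US,en;q=0.5',
]);

// Usage with an existing handle $ch:
// curl_setopt($ch, CURLOPT_HTTPHEADER, $browserHeaders);
```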
<?php
$ebay_user_id = "id"; // Please set your Ebay ID
$ebay_user_password = "password"; // Please set your Ebay Password
$cookie_file_path = dirname(__FILE__).'/cookie.txt'; // Please set your Cookie File path
$LOGINURL = "http://signin.ebay.com/aw-cgi/eBayISAPI.dll?SignIn";
$agent = "Mozilla/4.0 (compatible;)";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,$LOGINURL);
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);
$result = curl_exec ($ch);
curl_close ($ch);
$LOGINURL = "http://signin.ebay.com/aw-cgi/eBayISAPI.dll";
$POSTFIELDS = 'MfcISAPICommand=SignInWelcome&siteid=0&co_partnerId=2&UsingSSL=0&ru=&pp=&pa1=&pa2=&pa3=&i1=-1&pageType=-1&userid='. urlencode($ebay_user_id) .'&pass='. urlencode($ebay_user_password); // URL-encode the credentials so special characters survive
$reffer = "http://signin.ebay.com/aw-cgi/eBayISAPI.dll?SignIn";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,$LOGINURL);
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS,$POSTFIELDS);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_REFERER, $reffer);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);
$result = curl_exec ($ch);
curl_close ($ch);
print $result; ?>
I'm really new to cURL...
I am currently using this code to log into eBay.
The problem right now is the cookies: the site tells me they are being blocked by something.
The message it shows: Your web browser settings are blocking cookies.
I use Firefox for testing, and I tried other browsers too, with the same result.
I have confirmed that my browser settings accept cookies.
I have also checked that there is content inside the cookie.txt file, which means the file can be accessed correctly.
So... what is causing this? Is the code I used correct?
Thanks, everyone, for the help.
Try changing the user agent to something like:
'Mozilla/5.0 (Windows NT 6.1; rv:15.0) Gecko/20100101 Firefox/15.0.1'
Edit: actually, I believe the problem is that you need to request the sign-in page first.
First visit "http://signin.ebay.com/aw-cgi/eBayISAPI.dll?SignIn";
this will set the cookies. Then sign in as you do now.
You can reproduce this in a browser: navigate to the eBay sign-in page,
clear your cookies, and then sign in.
You will get the error about the browser not supporting cookies.
You need to understand that doing an HTTP request with cURL through PHP has nothing to do with your browser. The website you are accessing doesn't care what browser you use to run the PHP script. The actual request is made by your server, not by your browser.
On the other hand, if eBay's engineers are smart, they'd block this. You probably aren't supposed to do things like this; that's what the eBay APIs are for.
And a little tip: use an HTTP client library. Doing things like this in plain cURL is a pain and produces some very bad, unreadable code.
Check https://github.com/guzzle/guzzle, for example.
I tried many tutorials, but all failed. I know it might be obvious for an experienced user; thanks anyway.
Here is the simple form:
https://www.shab.ch/shabforms/COMMON/application/applicationGrid.jsp?template=1&view=2&page=/COMMON/search/searchForm.jsp?MODE=SHAB
Here is my script, which returns only the empty form instead of the results of my POST search
(I used Tamper Data to get the POST variables; I also use HTTPS):
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://www.shab.ch/shabforms/COMMON/application/applicationGrid.jsp?template=1&view=2&page=/COMMON/search/searchForm.jsp?MODE=SHAB');
curl_setopt($ch, CURLOPT_POSTFIELDS,'KEYWORDS=&NOTICE_NR=&TIMESPAN=TODAY&STAT_TM_1=&STAT_TM_2=&SELTYPE=HR&TYPE_CD_AW=&TYPE_CD_AN=&TYPE_CD_BL=&TYPE_CD_VM=&TYPE_CD_HR=HR01&LEGAL_FORM_NR_HR=&FIRM_ID_HR=&HR_CANTON_AG=ON&HR_CANTON_BE=ON&TYPE_CD_IS=&TYPE_CD_KK=&YN_KK=&TYPE_CD_IP=&TYPE_CD_NA=&YN_NA=&TYPE_CD_SB=&YN_SB=&TYPE_CD_SR=&FIRM_NAME_TX_UP=&FIRM_CITY_TX_UP=&command=Recherchieren');
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_REFERER,"https://www.shab.ch/shabforms/COMMON/application/applicationGrid.jsp?template=1&view=2&page=/COMMON/search/searchForm.jsp%3Fcategory%3DHR");
curl_setopt($ch, CURLOPT_COOKIEJAR, "my_cookies.txt");
curl_setopt($ch, CURLOPT_COOKIEFILE, "my_cookies.txt");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.3) Gecko/20070309 Firefox/2.0.0.3");
$result = curl_exec($ch);
echo $result;
Strangely, this has been written into my_cookies.txt:
www.shab.ch FALSE /shabforms FALSE 0 JSESSIONID E884A3B4187C68253CEEBCD58E7E934E
www.shab.ch FALSE / FALSE 1287673522 BC_HA_C30B29681466613B 131BDF
What is wrong? :)
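For reference, the lines in my_cookies.txt follow the Netscape cookie-file format that cURL writes: seven tab-separated fields (domain, subdomain flag, path, secure flag, expiry timestamp, name, value), where an expiry of 0 marks a session cookie. A sketch of a reader for that format is below; `parseCookieJar()` is a hypothetical helper, handy only for checking what was actually stored.

```php
<?php
// Parse the Netscape-format cookie file that cURL writes via CURLOPT_COOKIEJAR.
// Each cookie line holds seven tab-separated fields.
function parseCookieJar(string $contents): array
{
    $cookies = [];
    foreach (preg_split('/\R/', $contents) as $line) {
        // cURL marks HttpOnly cookies with a "#HttpOnly_" prefix on the domain.
        $httpOnly = strpos($line, '#HttpOnly_') === 0;
        if ($httpOnly) {
            $line = substr($line, strlen('#HttpOnly_'));
        } elseif ($line === '' || $line[0] === '#') {
            continue; // blank line or comment
        }
        $fields = explode("\t", $line);
        if (count($fields) !== 7) {
            continue; // not a cookie line
        }
        $cookies[] = [
            'domain'            => $fields[0],
            'includeSubdomains' => $fields[1] === 'TRUE',
            'path'              => $fields[2],
            'secure'            => $fields[3] === 'TRUE',
            'expires'           => (int) $fields[4], // 0 means a session cookie
            'name'              => $fields[5],
            'value'             => $fields[6],
            'httpOnly'          => $httpOnly,
        ];
    }
    return $cookies;
}

// Usage: print_r(parseCookieJar(file_get_contents('my_cookies.txt')));
```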
UPDATE:
OK, I found the error. It was related to the POST URL. The script on the website seems to run the search via AJAX, without changing the URL it is sent to (I could not even find the correct URL in Tamper Data!).
Fortunately, I could figure it out: it's "shabforms/servlet/web/DocumentSearch".
Now it works, thanks.
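In code, the fix amounts to sending the POST to the servlet rather than to the form page. The sketch below assumes the path "shabforms/servlet/web/DocumentSearch" is relative to the site root, which the post does not state explicitly; `endpointUrl()` is a hypothetical helper.

```php
<?php
// Combine the scheme and host of the page URL with a root-relative path.
// Assumption: $path is relative to the site root; any port or credentials
// in $pageUrl are ignored in this sketch.
function endpointUrl(string $pageUrl, string $path): string
{
    $parts = parse_url($pageUrl);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        throw new InvalidArgumentException("Not an absolute URL: $pageUrl");
    }
    return $parts['scheme'] . '://' . $parts['host'] . '/' . ltrim($path, '/');
}

$formUrl = 'https://www.shab.ch/shabforms/COMMON/application/applicationGrid.jsp?template=1&view=2&page=/COMMON/search/searchForm.jsp?MODE=SHAB';
$postUrl = endpointUrl($formUrl, 'shabforms/servlet/web/DocumentSearch');

// Pass $postUrl to CURLOPT_URL instead of the form page URL:
// curl_setopt($ch, CURLOPT_URL, $postUrl);
```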
I just ran this script and got a German website saved in $result.
Maybe your cURL setup needs tweaking? Have you got it working with another site?