i tried many tutorials but all failed, i know for an experienced user it might be obvious, thx anyway.
there is the simple form:
https://www.shab.ch/shabforms/COMMON/application/applicationGrid.jsp?template=1&view=2&page=/COMMON/search/searchForm.jsp?MODE=SHAB
here is my script which returns only the empty form instead of my POST search:
(i used tamper-data to get the Post-variables, i also use https)
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://www.shab.ch/shabforms/COMMON/application/applicationGrid.jsp?template=1&view=2&page=/COMMON/search/searchForm.jsp?MODE=SHAB');
curl_setopt($ch, CURLOPT_POSTFIELDS,'KEYWORDS=&NOTICE_NR=&TIMESPAN=TODAY&STAT_TM_1=&STAT_TM_2=&SELTYPE=HR&TYPE_CD_AW=&TYPE_CD_AN=&TYPE_CD_BL=&TYPE_CD_VM=&TYPE_CD_HR=HR01&LEGAL_FORM_NR_HR=&FIRM_ID_HR=&HR_CANTON_AG=ON&HR_CANTON_BE=ON&TYPE_CD_IS=&TYPE_CD_KK=&YN_KK=&TYPE_CD_IP=&TYPE_CD_NA=&YN_NA=&TYPE_CD_SB=&YN_SB=&TYPE_CD_SR=&FIRM_NAME_TX_UP=&FIRM_CITY_TX_UP=&command=Recherchieren');
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_REFERER,"https://www.shab.ch/shabforms/COMMON/application/applicationGrid.jsp?template=1&view=2&page=/COMMON/search/searchForm.jsp%3Fcategory%3DHR");
curl_setopt($ch, CURLOPT_COOKIEJAR, "my_cookies.txt");
curl_setopt($ch, CURLOPT_COOKIEFILE, "my_cookies.txt");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.3) Gecko/20070309 Firefox/2.0.0.3");
$result = curl_exec($ch);
echo $result;
This has strangely been written into my_cookies.txt
www.shab.ch FALSE /shabforms FALSE 0 JSESSIONID E884A3B4187C68253CEEBCD58E7E934E
www.shab.ch FALSE / FALSE 1287673522 BC_HA_C30B29681466613B 131BDF
What is wrong? :)
UPDATE:
Ok, i got the error. it was related to the post-url. the script on the website seems to do the process by ajax,... without changing the url to send (i could not even find the correct url in tamper data!!).
Fortunately i could figure that out, its "shabforms/servlet/web/DocumentSearch".
Now it works, thx
I just ran this script and got a German website saved in $result.
Maybe your curl setup needs tweeking? Have you got it working with another site?
it was related to the post-url. the script on the website seems to do the process by ajax,... without changing the url to send (i could not even find the correct url in tamper data!!). Fortunately i could figure that out, its "shabforms/servlet/web/DocumentSearch".
Now it works, thx
Related
Hi how can i search data from other website using curl and php. i want to search imei number from this website https://www.example.com/xxx
this is what i have tried so far
$imei = '013887009861498';
$cookie_file_path = "cookies/cookiejar.txt";
$fp = fopen("$cookie_file_path","w") or die("<BR><B>Unable to open cookie file $mycookiefile for write!<BR>");
fclose($fp);
$url="https://example.com/xxx";
$agent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.1.4322)";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,$url);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS,$imei);
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);
$result = curl_exec ($ch);
echo $result ;
(this is not a full answer, but too long to be a comment. i can't be arsed to figure out all the small details for you)
there are several different problems here, the first is how to do a POST request with php/curl, of which you can find an example here.
another problem, is how to parse HTML in PHP, of which there are several options listed here. (i highly recommend the DOMDocument & DOMXPath combo)
another problem, is how to get past CAPTCHA challenges in PHP, 1 solution is to use the deathbycaptcha API (which is a paid service, by the way), you can find an example of that here.
another problem is that they're using 3 different CSRF-like tokens, called __VIEWSTATE, __EVENTVALIDATION, and hdnCaptchaInstance, all of which must be parsed out and submitted with the captcha answer. also you need to handle cookies, as the CSRF tokens and captcha is tied to your cookie session (luckily you can let curl handle cookies automatically with CURLOPT_COOKIEFILE )
I'm opening a page as logged user, and it kind of seems to work, except the website has some sort of a protection system. If I do this normally, I'll get the page I want, but if I do it with cURL, I'll get 'Welcome back user (userid)' and a link to the page I requested. Once I click the link, I'll get where I want to be. Now I tried faking the referer and checking the data that gets sent to the page, there's nothing special there. When I click the link, I simply get redirected to the page I wanted in the first place. My question is why doesn't this code get me there as well:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL , "http://www.site.com/sell/index");
curl_setopt($ch, CURLOPT_REFERER, 'http://www.site.com');
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like Gecko) Chrome/0.A.B.C Safari/525.13");
curl_setopt($ch, CURLOPT_COOKIESESSION, true);
curl_setopt($ch, CURLOPT_FAILONERROR, false);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_FRESH_CONNECT, false);
curl_setopt($ch, CURLOPT_POST, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt");
$response = curl_exec($ch);
curl_close($ch);
echo $response;
Just before I do this, I perform login procedure and grab the cookie. And I do get to open the page as logged in user, I just can't seem to access it without clicking the ahref.
PS. The same thing would happen if I logged in, open the page I wanted, closed browser and opened it again. So I'm thinking it has to do with referer?
cookie-jar means it will save your cookie from curl's response. That's why it is not working for you. Instead use cookie-file so that your curl send stored cookie with request:
curl_setopt($ch, CURLOPT_COOKEFILE, "cookie.txt");
Also, use absolute path(/var/tmp/cookie.txt) instead of relative path.
Now, Be Happy!
There's a webpage that I need to log in to. I used CURL with post to login, but it's not enough. When you log in from the website the post also includes a string that is always changing. Is threre a way to get over that?
I use this:
$ch = curl_init();
curl_setopt ($ch, CURLOPT_URL, $url);
curl_setopt ($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; he-IL; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6");
curl_setopt ($ch, CURLOPT_REFERER, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_POST, TRUE);
$post = "username=$username&password=$password";
curl_setopt($ch, CURLOPT_POSTFIELDS, $post);
$result = curl_exec($ch);
$info = curl_getinfo($ch);
curl_close($ch);
It's like I need the code to actually go to the webpage and fill the form regularly.
I looked everywhere but all I could find was using post data.
Thanks!
To pass that you need to visit the page with the form, grab the field and then use it in POST request when you submit the form.
I suggest you visit the form page not only for that, but also for the following reasons (some of which can be used to figure people using automatic requests):
You recieve cookies
You don't fake referrer, you actually visited the page
You might want to check form fields to see if there's any new ones added since you wrote the script. That could be the case if form setup changes and you might want to adapt to that, if you don't then your script might stop working one day
Using PHP and cURL, I'd like to check if I can login to a website using the provided user credentials. For that I'm currently retrieving the entire website and then use regex to filter for keywords that might indicate the login didn't work.
The url itself contains the string "errormessage" if a wrong username/password has been entered. Is it possible to only use curl to get the url address, without the contents to speed it up?
Here's my curl PHP code:
function curl_get_request($referer, $submit_url, $ch)
{
global $cookie_path;
// sends a request via curl to the string specifics listed
$agent = "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4) Gecko/20030624 Netscape/7.1 (ax)";
curl_setopt($ch, CURLOPT_URL, $submit_url);
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_REFERER, $referer);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_path);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_path);
return $result = curl_exec ($ch);
}
Also, if somebody has a better idea on how to handle a problem like this, please let me know!
What you should do is check the URL each time there is a redirect. Most redirects are going to be done with the proper HTTP headers. If that is the case, see this answer:
PHP: cURL and keep track of all redirections
Basically, turn off automatic redirection following, and check the HTTP status code for 301 or 302. If you get one of those, you can continue to follow the redirection if needed, or exit from there.
If instead, the redirection is happening client side, you will have to parse the page with a DOM parser.
I want to access https://graph.facebook.com/19165649929?fields=name (obviously it's also accessable with "http") with cURL to get the file's content, more specific: I need the "name" (it's json).
Since allow_url_fopen is disabled on my webserver, I can't use get_file_contents! So I tried it this way:
<?php
$page = 'http://graph.facebook.com/19165649929?fields=name';
$ch = curl_init();
//$useragent="Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1";
//curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
curl_setopt($ch, CURLOPT_URL, $page);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_exec($ch);
curl_close($ch);
?>
With that code I get a blank page! When I use another page, like http://www.google.com it works like a charm (I get the page's content). I guess facebook is checking something I don't know... What can it be? How can I make the code work? Thanks!
did you double post this here?
php: Get html source code with cURL
however in the thread above we found your problem beeing unable to resolve the host and this was the solution:
//$url = "https://graph.facebook.com/19165649929?fields=name";
$url = "https://66.220.146.224/19165649929?fields=name";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Host: graph.facebook.com'));
$output = curl_exec($ch);
curl_close($ch);
Note that the Facebook Graph API requires authentication before you can view any of these pages.
You basically got two options for this. Either you login as an application (you've registered before) or as a user. See the api documentation to find out how this works.
My recommendation for you is to use the official PHP-SDK. You'll find it here. It does all the session and cURL magic for you and is very easy to use. Take the examples which are included in the package and start to experiment.
Good luck.