Alright, I am practicing using cURL to log in to different web services. For this particular try, I am doing YouTube. This was a pretty big challenge, but I finally got it... almost.
After posting the HUGE amount of post data to the login page, you get sent to a checkCookie kind of page. The checkcookie page verifies that you have the right cookies and then redirects you to youtube.com (logged into your account). This is what's messing me up.
When I have this:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
I get the source of the checkcookie page. It simply says "Document Moved". This isn't what I want; I want the source of the logged-in page stored in a variable. So I tried something else...
When I use this setup:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
I get sent to the YouTube page and I am logged in! It seems to work! Except... I don't want to be redirected off my script. My goal is to get the source of youtube.com with me logged in.
In other words, cURL is logging in just fine; the problem is that my browser literally gets redirected to YouTube, which I don't want.
Any suggestions? It's like I need to follow the redirects... but not be redirected.
Thanks for any help!
Try this:
<?php
function getURL($url) {
$curlHandle = curl_init(); // init curl
curl_setopt($curlHandle, CURLOPT_URL, $url); // set the url to fetch
curl_setopt($curlHandle, CURLOPT_HEADER, 0);
curl_setopt($curlHandle, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curlHandle, CURLOPT_TIMEOUT,30);
curl_setopt($curlHandle, CURLOPT_POST, 0);
$content = curl_exec($curlHandle);
curl_close($curlHandle);
return $content;
}
?>
It looks like you are being redirected because you echo the result of curl_exec, which contains JavaScript code for redirection. Since you are likely requesting your script from a browser, the browser runs that code and redirects you to YouTube. If that's the case, the obvious solution is to turn off JS or filter what you echo.
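If you go the filtering route, one option is to escape the fetched HTML before printing it, so the browser shows the source as text instead of running it. A minimal sketch, assuming you only want to look at the source; show_source_safely is a made-up helper name, not part of cURL:

```php
<?php
// Hypothetical helper: turn fetched HTML into inert text for display.
function show_source_safely(string $html): string
{
    // htmlspecialchars converts <, >, &, and quotes to entities, so any
    // <script> or <meta> tags in the fetched page are shown, not executed.
    return '<pre>' . htmlspecialchars($html, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8') . '</pre>';
}
```

You would then echo show_source_safely($content) instead of echoing $content directly. If you only need the page in a variable, not echoing it at all works just as well.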
I built a custom WordPress plugin that lets people post without having to register or log in, just by confirming a password twice. It has been working well and spam free, but someone has started posting spammy links.
I wrote a plugin that detects the pattern based on IP address, then blocks the IP and deletes all posts from anyone who gets blocked. However, I think this spammer is using a tool that spoofs or switches IP addresses, and has started posting from a different IP address. One thing I found in common is that the links go to the same URL after a series of redirects.
I've tried the following function to trace the destination, but no luck.
function myfunction( $url ){
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_exec($ch);
$lastUrl = curl_getinfo($ch);
curl_close($ch);
return $lastUrl;
}
I've also tried getting the header information from the link, but no luck.
So I tried many online tools that grab the final URL from a link, and none of them worked.
The URL shortener service this spammer uses is http://urnic.com/
I don't think it is doing a JavaScript redirect as it worked with JS turned off from my Chrome.
You can use cURL's CURLOPT_FOLLOWLOCATION together with CURLINFO_EFFECTIVE_URL to find the final address, provided that the redirects you speak of are HTTP redirects (i.e. an HTTP 3xx status such as 301 Moved Permanently, 302 Found, or 307 Temporary Redirect):
function get_final_url(string $redirect_url):string{
$ch=curl_init($redirect_url);
curl_setopt_array($ch,array(
CURLOPT_FOLLOWLOCATION=>1,
CURLOPT_ENCODING=>'',
CURLOPT_USERAGENT=>'many_websites_block_UAless_requests',
CURLOPT_RETURNTRANSFER=>1, // ideally we should use CURLOPT_NOBODY but some websites respond differently to HEAD requests, so using GET requests is the safest option =/ (also if you're worried about ram usage, you should set CURLOPT_OUTFILE to /dev/null or enable CURLOPT_WRITEFUNCTION)
));
curl_exec($ch);
$ret=curl_getinfo($ch,CURLINFO_EFFECTIVE_URL);
curl_close($ch);
return $ret;
}
(PS: untested, so there might be a typo somewhere, but it should work in theory.)
As I mentioned in the code comment, the function can be optimized to use less RAM if you're worried about huge responses: CURLOPT_RETURNTRANSFER keeps the entire response in memory, which can be avoided with a CURLOPT_WRITEFUNCTION callback that discards the data.
Anyhow, that should return the final URL.
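For what it's worth, the low-memory variant could look roughly like this. It is an untested sketch along the same lines; discard_body and get_final_url_low_memory are names made up here, and the redirect cap of 10 is an arbitrary assumption:

```php
<?php
// Write callback: tell cURL every byte was handled, but store nothing.
// cURL aborts the transfer if the returned count differs from the chunk size.
function discard_body($ch, string $chunk): int
{
    return strlen($chunk);
}

function get_final_url_low_memory(string $redirect_url): string
{
    $ch = curl_init($redirect_url);
    curl_setopt_array($ch, array(
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_MAXREDIRS      => 10, // assumed cap, adjust as needed
        CURLOPT_ENCODING       => '',
        CURLOPT_USERAGENT      => 'many_websites_block_UAless_requests',
        CURLOPT_WRITEFUNCTION  => 'discard_body', // body is thrown away as it arrives
    ));
    curl_exec($ch);
    $final = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
    curl_close($ch);
    return $final;
}
```

It still sends GET requests, so sites that answer HEAD differently behave the same as before; only the buffering changes.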
You can do it with preg_match and catch the Location URL. It works perfectly for me.
$curlhandle = curl_init();
curl_setopt($curlhandle, CURLOPT_URL, $url);
curl_setopt($curlhandle, CURLOPT_HEADER, 1);
curl_setopt($curlhandle, CURLOPT_USERAGENT, 'googlebot');
curl_setopt($curlhandle, CURLOPT_FOLLOWLOCATION, 0);
curl_setopt($curlhandle, CURLOPT_RETURNTRANSFER, 1);
$final = curl_exec($curlhandle);
if (preg_match('~Location: (.*)~i', $final, $lasturl)) {
$loc = trim($lasturl[1]);
echo $loc;
} else {
echo "Dont have redirect url...";
}
This will behave like Googlebot and show you the redirect URL.
The only addition is this line: curl_setopt($curlhandle, CURLOPT_USERAGENT, 'googlebot');
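One caveat with the pattern above: it is not anchored, so it would also match a Content-Location header. A slightly stricter sketch that only accepts lines starting with Location: and returns every such line in the header dump (extract_locations is a made-up name):

```php
<?php
// Hypothetical helper: pull every Location header out of a raw header dump.
function extract_locations(string $rawHeaders): array
{
    // ^ with the m flag anchors at the start of each header line, so
    // "Content-Location:" is skipped; i makes the header name case-insensitive.
    preg_match_all('~^Location:[ \t]*(.*)$~mi', $rawHeaders, $matches);
    // trim() removes the trailing \r left over from the \r\n line endings.
    return array_map('trim', $matches[1]);
}
```

With CURLOPT_FOLLOWLOCATION set to 0 the dump contains a single response, so you get at most one entry and would have to request that URL again to see the next hop.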
I need to fetch data from this web page: Sender Score.
I tried to use cURL, but it renders a white page.
Here is my code:
$ch = curl_init();
$keyword = "an-example.com";
curl_setopt($ch, CURLOPT_URL, 'https://www.senderscore.org/lookup.php?lookup='.$keyword.'&validLookup=true');
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$data = curl_exec($ch);
print_r($data);
curl_close($ch);
Any idea?
Regards.
You're getting a blank page because of the captcha that needs to be filled in. Perhaps Sender Score has an API which you can use? Or maybe there's another website available that does the same thing. I thought this was about scoring email statuses, right? Then maybe this site will help you out: http://www.reputationauthority.org/domain_lookup.php?ip=somedomain.nl&Submit.x=0&Submit.y=0&Submit=Search
I can use this site without the need of captcha or any other bot interference.
I am trying to use PHP's cURL functions to log in to an HTTPS web page, "https://portal.opalonline.co.uk/Home/PortalCore/SignIn/SignIn.aspx",
but I have run out of ideas for how to post values (username, password) to this particular page and 'press' 'sign in'.
$postfields = array('ctl00_MasterContentContentPane_Signin1_userID_txt'=>'email#address.com',
'ctl00_MasterContentContentPane_Signin1_password_txt'=>'somepassword123');
/* LOG IN TO TalkTalk ACCOUNT */
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "https://portal.opalonline.co.uk/Home/PortalCore/SignIn/SignIn.aspx?");
curl_setopt($ch, CURLOPT_HEADER, false);
// curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
// curl_setopt($ch, CURLOPT_COOKIEJAR, COOKIE_FILE);
// curl_setopt($ch, CURLOPT_COOKIE, COOKIE_FILE);
curl_setopt($ch, CURLOPT_POSTFIELDS, $postfields);
curl_setopt($ch, CURLOPT_POST, 1);
var_dump($ch);
$string_exec = curl_exec($ch);
var_dump($string_exec);
I cannot even display the page with var_dump :( . Ideas / suggestions much appreciated.
First, I don't think you can pass an array like that, because that makes PHP/cURL create a multipart form post instead, and this is not such a form. Provide the data in "name=value&name2=value2" style.
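If you would rather not hand-write that string, http_build_query can produce it from the array and takes care of URL-encoding. A small sketch; build_form_body is just an illustrative wrapper name:

```php
<?php
// Hypothetical wrapper: turn a field array into "name=value&name2=value2".
function build_form_body(array $fields): string
{
    // Passing '&' explicitly avoids depending on the arg_separator.output ini setting.
    return http_build_query($fields, '', '&');
}
```

Passing the resulting string to CURLOPT_POSTFIELDS makes cURL send it as application/x-www-form-urlencoded rather than multipart/form-data.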
Then, make sure you also submit all the hidden fields in the form. There are at least four of them. One of them is set by the HTML to a long value that you need to extract and set, and there is also some javascript magic that sets some of the others. You probably need to use your browser's networking tool to snoop on what exactly your browser sends to be able to mimic that perfectly.
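For the hidden fields that are already present in the HTML, something like the following could collect them. It is a rough regex-based sketch that assumes quoted attribute values with no ">" inside them (a real HTML parser such as DOMDocument would be sturdier), extract_hidden_fields is a made-up name, and it cannot see values that JavaScript fills in later:

```php
<?php
// Hypothetical helper: collect name => value for every <input type="hidden">.
function extract_hidden_fields(string $html): array
{
    $fields = array();
    preg_match_all('~<input\b[^>]*>~i', $html, $tags);
    foreach ($tags[0] as $tag) {
        // Read the quoted attributes of this one tag into a lowercase-keyed map.
        $attrs = array();
        preg_match_all('~([\w:-]+)\s*=\s*(["\'])(.*?)\2~s', $tag, $found, PREG_SET_ORDER);
        foreach ($found as $attr) {
            $attrs[strtolower($attr[1])] = html_entity_decode($attr[3], ENT_QUOTES | ENT_HTML5, 'UTF-8');
        }
        if (isset($attrs['type'], $attrs['name']) && strtolower($attrs['type']) === 'hidden') {
            $fields[$attrs['name']] = $attrs['value'] ?? '';
        }
    }
    return $fields;
}
```

The result can be merged with your username and password fields before building the POST body. Compare it against what the browser's network tool shows, since script-set fields will be missing or stale.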
The login page sets cookies and you probably need to pass those cookies on when you submit the login form. So you need to first fetch (GET) the login form page to get the cookies, then file the login POST.
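The GET-then-POST sequence could be sketched like this. It is untested against this particular site; get_then_post and the $buildBody callback are names invented for the example, and the callback is assumed to return the "name=value&..." string for the form page it is given:

```php
<?php
// Hypothetical helper: GET the login form first so the cookies get set,
// then POST on the same handle so those cookies are sent back.
function get_then_post(string $loginUrl, callable $buildBody)
{
    $ch = curl_init($loginUrl);
    curl_setopt_array($ch, array(
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_COOKIEFILE     => '', // empty string: keep cookies in memory on this handle
    ));

    $formPage = curl_exec($ch); // step 1: GET the form, receive cookies
    if ($formPage === false) {
        curl_close($ch);
        return false;
    }

    curl_setopt_array($ch, array(
        CURLOPT_POST       => true,
        CURLOPT_POSTFIELDS => $buildBody($formPage), // string body, built from the form page
    ));
    $result = curl_exec($ch); // step 2: POST with the cookies from step 1
    curl_close($ch);
    return $result;
}
```

Reusing one handle keeps the cookies without a cookie file on disk; CURLOPT_COOKIEJAR with a file path is the alternative if the session has to survive across script runs.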
With that fixed, you should be closer. If that isn't all that takes, then continue comparing the browser's request with what your request is sending and make sure they are as similar as possible.
Open the website in Google Chrome, open the developer tools, and go to the Network tab.
Log in to the website. You should see the request in the Network tab. Right-click it and select "Copy as cURL". It will give you a command line that will help you understand what you need.
I have the following URL:
http://www.davesinclairstpeters.com/auto2_inventorylist?i=37647&c=12452&npg=1&ns=50&echo=2
I want to retrieve the content of this URL using cURL, but every time I make the request it shows me an error, as it is not passing the required parameters.
Below is my code:
$ch = curl_init(); // start CURL
curl_setopt($ch, CURLOPT_URL, $json_url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPGET, true);
curl_setopt($ch, CURLOPT_FRESH_CONNECT, true);
$response = curl_exec($ch);
That page doesn't give any information stating that the parameters aren't being passed properly. In fact, it tells you that they have been received. By viewing the source, you can see:
<!--
javax.servlet.forward.request_uri = /auto2_inventorylist
...
javax.servlet.forward.servlet_path = /auto2_inventorylist
...
javax.servlet.forward.query_string = i=37647&c=12452&npg=1&ns=50&echo=2
-->
This tells you the information has in fact been received.
Therefore, it's no problem with your code, but with the website itself. You should make sure the URL you are using is valid, or contact that website to get more information.
With regards to your code itself - the curl_setopt($ch, CURLOPT_HTTPGET, true); isn't necessary, as this is already set by default, and you can also pass the URL as an argument of the curl_init function. Doesn't impact performance, but makes for neater code.
$ch = curl_init($json_url); // start CURL
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FRESH_CONNECT, true);
$response = curl_exec($ch);
Your code is perfectly fine. If something wrong is returned, simply paste the URL into your web browser and check the result. In this case the website simply failed for some reason. There's nothing you can do about that, as the problem is NOT on your side.
This URL yields a page of cars with links to more cars. Looks like the URL you're starting with is old, or has some sort of expiration factor that's not obvious.
Without knowing which filtering parameters you're shooting for, it's hard to say what else may be wrong, other than that your starting URL is bad.
working url:
http://www.davesinclairlincolnstpeters.com/all-inventory/index.htm?listingConfigId=auto-new%2Cauto-used&compositeType=&year=&make=&start=0&sort=&facetbrowse=true&quick=true&preserveSelectsOnBack=true&searchLinkText=SEARCH&showInvTotals=false&showRadius=false&showReset=true&showSubmit=true&facetbrowseGridUnit=BLANK&showSelections=true&dependencies=model%3Amake%2Ccity%3Aprovince%2Ccity%3Astate&suppressAllConditions=false
Basically, I'm trying to log into a site. I've got it logging in, but the site redirects to another part of the site, and upon doing so, it redirects my browser as well.
For example:
It successfully logs into http://example.com/login.php
But then my browser goes to http://mysite.com/site.php?page=loggedin
I just want it to return the contents of the page, not be redirected to it.
How would I do this?
As requested, here is my code
// create curl resource
$ch = curl_init();
// set url
curl_setopt($ch, CURLOPT_URL, $loginURL);
//Some setopts
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $postFields);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
curl_setopt($ch, CURLOPT_HEADER, false);
// CURLOPT_FOLLOWREDIRECT is not a cURL option; CURLOPT_FOLLOWLOCATION above already covers it
curl_setopt($ch, CURLOPT_REFERER, $referrer); // the constant is spelled with a single R
// $output contains the output string
$output = curl_exec($ch);
// close curl resource to free up system resources
curl_close($ch);
echo $output;
Figured it out. The webpage was echoing a meta refresh, and since I was echoing the output, my browser followed.
Removed the echo $output; and it no longer does that.
I feel kind of dumb for not recognizing that in the beginning.
Thanks everyone.
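In case anyone else hits this: a quick way to check whether a fetched page carries a meta refresh, before echoing it, could look like the sketch below. find_meta_refresh is a name made up for this example, and the regex assumes a reasonably conventional tag with a quoted content attribute:

```php
<?php
// Hypothetical helper: return the target of a <meta http-equiv="refresh"> tag, or null.
function find_meta_refresh(string $html): ?string
{
    preg_match_all('~<meta\b[^>]*>~i', $html, $tags);
    foreach ($tags[0] as $tag) {
        // Only look at meta tags whose http-equiv is "refresh".
        if (!preg_match('~http-equiv\s*=\s*["\']?refresh\b~i', $tag)) {
            continue;
        }
        // content looks like "0; url=http://..." (delay in seconds, then the target).
        if (preg_match('~content\s*=\s*(["\'])\s*\d+\s*;\s*url\s*=\s*(.*?)\s*\1~i', $tag, $m)) {
            return trim($m[2], "'\" ");
        }
    }
    return null;
}
```

If it returns a URL, the browser will jump there as soon as the output is echoed, which matches the behaviour described above.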
Using cURL you have to find the redirect and follow it, then return that page's content. I'm not sure why your browser would be redirecting unless you have some weird header code that you are returning from the login page.
Set CURLOPT_FOLLOWLOCATION to false.
curl_setopt($ch , CURLOPT_FOLLOWLOCATION , FALSE);
This might help you.