cURL call to web service not working - php

I am trying to make a simple get call to a .NET RESTful web service at https://salesgenie.com/brandingservice/details?url=att.salesgenie.com The test page making the cURL call is at https://test-cms.salesgenie.com/wp-content/themes/salesgenie/branding-proxy.php The web service can be called in the browser but is timing out when called via cURL. Here is the code from the branding-proxy.php page:
if (!function_exists('curl_init')){
die('Sorry cURL is not installed!');
}
$url = 'https://salesgenie.com/brandingservice/details?url=att.salesgenie.com';
$handle = curl_init();
curl_setopt($handle, CURLOPT_URL, $url);
curl_setopt($handle, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($handle, CURLOPT_SSL_VERIFYHOST, 2);
curl_setopt($handle, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($handle, CURLOPT_SSLVERSION, 3);
$response = curl_exec($handle);
$code = curl_getinfo($handle, CURLINFO_HTTP_CODE);
curl_close($handle);
setcookie('CmsCookie', $code . ' ' . $response, 0, '/', 'salesgenie.com', 0);
echo 'Code:' . $code."<br />Response: ".$response;

I can reproduce the timeout using your links, but I think the problem lies somewhere else. The reason I've come to that conclusion is that I uploaded your code to my own website and it worked fine.
Important
If you are the owner of salesgenie.com, then I think the server itself has a problem: the responses I got were sometimes a timeout (code 0), sometimes code 200 (with a result), and sometimes code 320.
Tricks
Try removing the setcookie() call to spare some processing.
Also try plain http instead of https, since https adds extra negotiation overhead.
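On top of those tips, here is a minimal sketch of the same call with explicit timeouts, so the script fails fast instead of hanging when the service is slow. The timeout values are arbitrary assumptions, and the forced CURLOPT_SSLVERSION is dropped, since pinning SSLv3 can itself cause handshake failures on servers that have disabled it.
// Hypothetical variant of the proxy call with explicit timeouts (values are examples only).
$url = 'https://salesgenie.com/brandingservice/details?url=att.salesgenie.com';
$handle = curl_init($url);
curl_setopt_array($handle, array(
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_CONNECTTIMEOUT => 10,  // give up if no connection within 10 seconds
    CURLOPT_TIMEOUT        => 30,  // give up if the whole request takes longer than 30 seconds
    CURLOPT_SSL_VERIFYPEER => false,
));
$response = curl_exec($handle);
$code     = curl_getinfo($handle, CURLINFO_HTTP_CODE);
$error    = curl_error($handle);
curl_close($handle);
echo 'Code: ' . $code . '<br />Response: ' . ($response !== false ? $response : $error);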

How to find last (final) URL after series of redirects via shortened URL from PHP

I built a custom WordPress plugin that lets people post without having to register or log in, just by confirming a password twice. It has been working well, spam free, but someone started posting spammy links.
I wrote a plugin to detect the pattern based on IP address, then block the IP and delete all posts from those who got blocked. However, I think this spammer is using a tool that spoofs or switches IP addresses, and they started posting from a different IP address. One thing in common I found is that the links go to the same URL after a series of redirects.
I've tried the following function to trace the destination, but no luck.
function myfunction( $url ){
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_exec($ch);
$lastUrl = curl_getinfo($ch);
curl_close($ch);
return $lastUrl;
}
I've also tried getting the header information from the link, but no luck.
So I tried many online tools that grab the final URL from a link, and none of them worked.
The URL shortener service this spammer uses is http://urnic.com/
I don't think it is doing a JavaScript redirect, as it still worked with JS turned off in Chrome.
You can use cURL's CURLOPT_FOLLOWLOCATION together with CURLINFO_EFFECTIVE_URL to find the final address, provided the redirects you speak of are HTTP redirects (e.g. 3xx responses such as 300 Multiple Choices, 301 Moved Permanently, 302 Found, or 307 Temporary Redirect):
function get_final_url(string $redirect_url): string {
    $ch = curl_init($redirect_url);
    curl_setopt_array($ch, array(
        CURLOPT_FOLLOWLOCATION => 1,
        CURLOPT_ENCODING       => '',
        CURLOPT_USERAGENT      => 'many_websites_block_UAless_requests',
        // Ideally we would use CURLOPT_NOBODY, but some websites respond differently
        // to HEAD requests, so a GET is the safest option =/ (if you're worried about
        // RAM usage, point CURLOPT_FILE at /dev/null or supply a CURLOPT_WRITEFUNCTION).
        CURLOPT_RETURNTRANSFER => 1,
    ));
    curl_exec($ch);
    $ret = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
    curl_close($ch);
    return $ret;
}
(PS: untested, there might be a typo or something, but it should work in theory.)
As mentioned in the code comment, the function can use less RAM if you're worried about huge responses: CURLOPT_RETURNTRANSFER puts the entire response in RAM, which you can avoid with an empty CURLOPT_WRITEFUNCTION.
Anyhow, that should return the final URL.
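For completeness, a rough sketch of that lower-memory variant (untested, same caveats as above): a CURLOPT_WRITEFUNCTION callback discards the body as it streams in, and only has to report how many bytes it handled.
function get_final_url_low_mem(string $redirect_url): string {
    $ch = curl_init($redirect_url);
    curl_setopt_array($ch, array(
        CURLOPT_FOLLOWLOCATION => 1,
        CURLOPT_USERAGENT      => 'many_websites_block_UAless_requests',
        // Throw the body away instead of buffering it; returning strlen($data)
        // tells cURL every chunk was consumed.
        CURLOPT_WRITEFUNCTION  => function ($ch, $data) { return strlen($data); },
    ));
    curl_exec($ch);
    $ret = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
    curl_close($ch);
    return $ret;
}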
You can also use preg_match to capture the Location header URL; it works perfectly for me.
$curlhandle = curl_init();
curl_setopt($curlhandle, CURLOPT_URL, $url);
curl_setopt($curlhandle, CURLOPT_HEADER, 1);
curl_setopt($curlhandle, CURLOPT_USERAGENT, 'googlebot');
curl_setopt($curlhandle, CURLOPT_FOLLOWLOCATION, 0);
curl_setopt($curlhandle, CURLOPT_RETURNTRANSFER, 1);
$final = curl_exec($curlhandle);
if (preg_match('~Location: (.*)~i', $final, $lasturl)) {
$loc = trim($lasturl[1]);
echo $loc;
} else {
echo "Dont have redirect url...";
}
This behaves like Googlebot and will show you the redirect URL.
The key addition is curl_setopt($curlhandle, CURLOPT_USERAGENT, 'googlebot');.

How to bypass Cloudflare using PHP or JavaScript

My problem is:
I made a cURL request to the Paxful API. Earlier it returned results, but now it returns 503.
Here is my code
$handle = curl_init();
curl_setopt($handle, CURLOPT_URL, 'https://paxful.com/buy-bitcoin?format=json');
curl_setopt($handle, CURLOPT_POST, false);
curl_setopt($handle, CURLOPT_BINARYTRANSFER, false);
curl_setopt($handle, CURLOPT_HEADER, true);
curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
curl_setopt($handle, CURLOPT_CONNECTTIMEOUT, 10);
$response = curl_exec($handle);
$hlength = curl_getinfo($handle, CURLINFO_HEADER_SIZE);
$httpCode = curl_getinfo($handle, CURLINFO_HTTP_CODE);
$body = substr($response, $hlength);
// If HTTP response is not 200, throw exception
if ($httpCode != 200) {
throw new Exception($httpCode);
}
I get this error:
Error Fatal error: Uncaught exception 'Exception' with message '503'
I googled and found that my IP address might be blocked, but when I make the GET request in a browser it gives me results.
So I came to the conclusion that they are not allowing plain GET requests.
If you open the URL https://paxful.com/buy-bitcoin?format=json, it first runs a browser check and then returns the result.
How can we get results from the Paxful API? Please suggest.
Here is a snapshot of what happens: their API then redirects to a 404 at
url http://localhost/cdn-cgi/l/chk_jschl?jschl_vc=c5b74eae14eb1b1e5862f913b9f0f178&pass=1499953121.017-h%2FljgkjMr%2B&jschl_answer=18913
It's not possible through JavaScript either.
fiddle http://jsfiddle.net/00cvyyuo/350/
I found the link How to bypass cloudflare bot/ddos protection in Scrapy?, but it covers Python, so can someone help with PHP or JavaScript?
Try using Chrome's Network tab in Developer Tools (F12) to see the actual request being sent. If it works there, repeat the request from a REST client where you can edit the headers, to check what works and what doesn't.
Once you get it working, set all of those headers in cURL. If it still fails, enable verbose output and check what was actually sent.
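For reference, a minimal sketch of that last step in PHP cURL; the header values below are placeholders you would replace with whatever the browser actually sent.
$ch = curl_init('https://paxful.com/buy-bitcoin?format=json');
curl_setopt_array($ch, array(
    CURLOPT_RETURNTRANSFER => true,
    // Copy the headers the browser sent (these values are only placeholders).
    CURLOPT_HTTPHEADER => array(
        'User-Agent: Mozilla/5.0',
        'Accept: application/json',
    ),
    // Log the full request/response exchange to stderr for debugging.
    CURLOPT_VERBOSE => true,
));
$response = curl_exec($ch);
curl_close($ch);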

Issue with cURL in PHP, returning false regardless

I'm rather new to using cURL in PHP. I was told to use it for a task this week, it's been nothing but a pain, and I can't seem to find a solution to my problem no matter how hard I search.
What I am attempting to do is send a file to an upload directory on my hosted server from a remote portal manager that I have built. The file upload handler in the portal manager connects via cURL to the remote destination, and the remote destination grabs the info and processes the file like normal. No matter what I try, though, everything just throws back a failed response.
Here is the updated version of the code I am working with:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_POST, false);
//curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_POSTFIELDS, array(
    'file' => '#' . $_FILES['doc']['tmp_name'][$i]
            . ';filename=' . $_FILES['doc']['name'][$i]
            . ';type=' . $_FILES['doc']['type'][$i]
));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
//$response = curl_exec($ch);
if(curl_exec($ch) === false){
echo curl_error($ch);
echo curl_errno($ch);
}else{
echo "ok";
}
With all of this information set I am getting no value for curl_error but I get a value of 43 for curl_errno.
From what I have been researching, error 43 for curl is
CURLE_BAD_FUNCTION_ARGUMENT (43)
Internal error. A function was called with a bad parameter.
However, all my curl_setopt() calls appear to be put together correctly based on the info from php.net, so this is where I am now confused, because I have no idea what is causing this to happen. Thanks again for the help!
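It might be worth switching the upload to CURLFile, which has been the documented way to attach a file since PHP 5.5, and making the request an actual POST. A hedged sketch, reusing $url and the $_FILES fields from the question above:
// Hypothetical rewrite using CURLFile instead of the old string-prefix syntax.
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_POST, true); // the request has to actually be a POST
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, array(
    'file' => new CURLFile(
        $_FILES['doc']['tmp_name'][$i],   // path to the uploaded temp file
        $_FILES['doc']['type'][$i],       // MIME type
        $_FILES['doc']['name'][$i]        // file name to present to the server
    ),
));
if (curl_exec($ch) === false) {
    echo curl_error($ch) . ' (' . curl_errno($ch) . ')';
} else {
    echo "ok";
}
curl_close($ch);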

How can I properly follow all redirects on sites I am trying to scrape with cURL in PHP?

I am using cURL to try to scrape an ASP site that is not on my server, with the following option to automatically follow redirects it comes across:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
but it is not properly following all redirects that the website sends back: it is putting several of the redirect URLs as relative to my server and my PHP script's path, instead of the website's server and the path that the website's pages should be relative to. Is there any way to set the base path or server path in cURL, so my script can properly follow the relative redirects it comes across when scraping through the other website?
For example: If I authenticate on their site and then try to access "https://www.theirserver.com/theirapp/mainForm/securepage.aspx" with my script at "https://www.myserver.com/php/myscript.php", then, under some circumstances, their website tries to redirect back to their login page, but this causes a big problem, because the redirect sends my cURL client to "https://www.myserver.com/php/mainForm/login.aspx", that is, '/mainForm/login.aspx' relative to my script on my server, instead of the correct "https://www.theirserver.com/theirapp/mainForm/login.aspx" relative to the site I am scraping on their server.
I would expect cURL's FOLLOWLOCATION option to properly follow relative redirects based on the "Location:" header of the web pages I am accessing, but it seems that it doesn't and can't. Since this seems to not work, preferably I want a way to tell cURL a base path for the server or for all relative redirects it sees, so I can just use FOLLOWLOCATION. If not, then I need to figure out some code that will do the same thing FOLLOWLOCATION does, but that can let me specify a base path to handle these relative URLs when it comes across them.
I see several similar questions about following relative paths with cURL, but none of the answers have any good suggestions for dealing with this problem, where I don't own the website's server and I don't know every single redirect that might come up. In fact, none of the answers I've seen for similar questions seem to even understand that a person might be trying to scrape an external website and would want any relative redirects they come across while scraping the site to just be relative to that site.
EDIT: Here is the code in question:
$urlLogin = "https://www.theirsite.com/theirApp/MainForm/login.aspx";
$urlSecuredPage = "https://www.theirsite.com/theirApp/ContentPages/content.aspx";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $urlLogin);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_AUTOREFERER, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_ENCODING, "");
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0; yie8)");
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 120);
curl_setopt($ch, CURLOPT_TIMEOUT, 120);
curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
// GET login page
$data=curl_exec($ch);
// Read ASP viewstate and eventvalidation fields
$viewstate = parseExtract($data,$regexViewstate, 1);
$eventval = parseExtract($data, $regexEventVal, 1);
//set POST data
$postData = '__EVENTTARGET='.$eventtarget
.'&__EVENTARGUMENT='.$eventargument
.'&__VIEWSTATE='.$viewstate
.'&__EVENTVALIDATION='.$eventval
.'&'.$nameUsername.'='.$valUsername
.'&'.$namePassword.'='.$valPassword
.'&'.$nameLoginBtn.'='.$valLoginBtn;
// POST authentication
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $postData);
curl_setopt($ch, CURLOPT_URL, $urlLogin);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);
$data = curl_exec($ch);
/******************************************************************
GET secure page (This is where a redirect fails... when getting
the secure page, it redirects to /mainForm/login.aspx relative to my
script, instead of /mainForm/login.aspx on their site.
*****************************************************************/
curl_setopt($ch, CURLOPT_POST, FALSE);
curl_setopt($ch, CURLOPT_URL, $urlSecuredPage);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile);
$data = curl_exec($ch);
echo $data; // Page Not Found
You may be running into redirects that are JavaScript redirects.
To find out what is there, the following will give you additional info:
curl_setopt($ch, CURLOPT_FILETIME, true);
You should set fail on error:
curl_setopt($ch, CURLOPT_FAILONERROR,true);
You may also need to see all the Request and Response headers:
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLINFO_HEADER_OUT, true);
The big thing you are missing is curl_getinfo($ch);
It has info on all the redirects and the headers.
You may want to turn off CURLOPT_FOLLOWLOCATION and do each request individually. You can get the redirect location from curl_getinfo($ch, CURLINFO_REDIRECT_URL).
Or you can set CURLOPT_MAXREDIRS to the number of redirects that succeed, then do a separate curl request for the problematic redirect location (a rough sketch of this manual loop follows the response-header code below).
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
When you get the response, if there is no curl error, read the response header:
$data = curl_exec($ch);
if (curl_errno($ch)){
    $data .= 'Retrieve Base Page Error: ' . curl_error($ch);
    echo $data;
}
else {
    $skip = intval(curl_getinfo($ch, CURLINFO_HEADER_SIZE));
    $responseHeader = substr($data, 0, $skip);
    $data = substr($data, $skip);
    $info = curl_getinfo($ch);
    $info = var_export($info, true);
    echo $responseHeader . $info . $data;
}
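Pulling those pieces together, a rough sketch (untested) of the manual approach: turn off CURLOPT_FOLLOWLOCATION, read the would-be redirect target after each request, and, as a defensive fallback, glue a path-only target onto the remote host so it is never resolved against your own server.
// Sketch only: follow redirects by hand, reusing the question's handle so the
// cookie jar and other options stay in effect.
function follow_redirects($ch, $url, $maxRedirs = 10) {
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    for ($i = 0; $i <= $maxRedirs; $i++) {
        curl_setopt($ch, CURLOPT_URL, $url);
        $data = curl_exec($ch);
        // Empty string means the last response was not a redirect.
        $next = curl_getinfo($ch, CURLINFO_REDIRECT_URL);
        if ($next === '' || $next === null || $next === false) {
            return $data;
        }
        // Defensive fallback: a bare path (no scheme) gets resolved against the
        // host we just requested, not against the local script.
        if (parse_url($next, PHP_URL_SCHEME) === null && $next[0] === '/') {
            $parts = parse_url($url);
            $next  = $parts['scheme'] . '://' . $parts['host'] . $next;
        }
        $url = $next;
    }
    return false; // gave up: too many redirects
}
// Usage after logging in, with the question's handle:
// $data = follow_redirects($ch, $urlSecuredPage);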
A better way to scrape a web page is to use two PHP packages: Guzzle + DomCrawler.
I ran a lot of tests with this combination and came to the conclusion that it is the best choice.
Here you will find an example for your implementation.
Let me know if you have any problem! ;)
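Since the example link isn't reproduced here, a minimal sketch of that combination (assuming both packages are installed via Composer, and reusing the secured-page URL from the question):
require 'vendor/autoload.php';

use GuzzleHttp\Client;
use Symfony\Component\DomCrawler\Crawler;

// Guzzle follows redirects and keeps cookies across requests for us.
$client = new Client([
    'cookies'         => true,
    'allow_redirects' => ['max' => 10],
]);

$response = $client->request('GET', 'https://www.theirsite.com/theirApp/ContentPages/content.aspx');
$html     = (string) $response->getBody();

// DomCrawler makes it easy to pull values out of the returned markup.
$crawler = new Crawler($html);
echo $crawler->filter('title')->text();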

curl CLI to curl PHP

I use the following command in some old scripts:
curl -Lk "https://www.example.com/stuff/api.php?"
I then record the header into a variable and make comparisons and so forth. What I would really like to do is convert the process to PHP. I have enabled curl and openssl, and I believe I have everything ready.
What I cannot seem to find is a handy translation to convert that command line syntax to the equivalent commands in PHP.
I suspect something along the lines of:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
// What goes here so that I just get the Location and nothing else?
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
// Get the response and close the channel.
$response = curl_exec($ch);
curl_close($ch);
The goal is for $response to hold the data from the API ("OK=1&ect").
Thank you
I'm a little confused by your comment:
// What goes here so that I just get the Location and nothing else?
Anyway, if you want to obtain the response body from the remote server, use:
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$response = curl_exec($ch);
If you want to get the headers in the response (i.e.: what your comment might be referring to):
curl_setopt($ch, CURLOPT_HEADER, 1);
If your problem is that there is a redirection between the initial call and the response, use:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
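Putting those pieces together, a rough PHP equivalent of curl -Lk might look like the sketch below; the URL is the placeholder from the question.
// Rough equivalent of: curl -Lk "https://www.example.com/stuff/api.php?"
$ch = curl_init('https://www.example.com/stuff/api.php?');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);   // return the body instead of printing it
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);   // -L: follow redirects
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);  // -k: skip certificate verification
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);      // -k also skips host name checks
curl_setopt($ch, CURLOPT_HEADER, true);           // include headers if you still want to compare them
$response = curl_exec($ch);
curl_close($ch);
// $response now holds the headers followed by the body, e.g. the "OK=1&..." data.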
