I'm writing a cURL script, but how can I check that it is working and passing its settings (such as the user agent) correctly when it visits the website?
$ckfile = '/tmp/cookies.txt';
$useragent= "Mozilla/5.0 (iPhone; U; CPU iPhone OS 3_0_1 like Mac OS X; en-us) AppleWebKit/528.18 (KHTML, like Gecko) Mobile/7A400";
$ch = curl_init ("http://website.com");
curl_setopt($ch, CURLOPT_AUTOREFERER , true);
curl_setopt($ch, CURLOPT_USERAGENT, $useragent); // set user agent
curl_setopt ($ch, CURLOPT_COOKIEJAR, $ckfile);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
$output = curl_exec ($ch);
curl_close($ch);
Just make a PHP page like this on your server and run your script against your own URL:
var_dump($_SERVER);
and check the HTTP_USER_AGENT string.
You can also see the same thing by looking at the Apache logs.
But I am pretty sure curl is setting the User-Agent string like it should ;-)
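To check from inside the script what cURL actually sent, one option is to have cURL record the outgoing request headers. The sketch below is only an illustration, not the asker's code: the function names `findRequestHeader` and `inspectRequest` are made up for this example, and it assumes the PHP cURL extension is installed. `CURLINFO_HEADER_OUT` and `curl_getinfo()` are standard parts of that extension.

```php
<?php
// Hypothetical helper: pull one header value out of the raw request
// header text that cURL records when CURLINFO_HEADER_OUT is enabled.
function findRequestHeader(string $rawHeaders, string $name): ?string
{
    foreach (preg_split('/\r\n|\n/', $rawHeaders) as $line) {
        $parts = explode(':', $line, 2);
        if (count($parts) === 2 && strcasecmp(trim($parts[0]), $name) === 0) {
            return trim($parts[1]);
        }
    }
    return null;
}

// Hypothetical wrapper: run the request and report what was sent and received.
function inspectRequest(string $url, string $userAgent): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_USERAGENT, $userAgent);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLINFO_HEADER_OUT, true); // keep a copy of the request headers

    $body = curl_exec($ch);
    $sent = (string) curl_getinfo($ch, CURLINFO_HEADER_OUT);
    $info = [
        'ok'         => $body !== false,                             // false means the transfer failed
        'error'      => curl_error($ch),                             // empty string when there was no error
        'status'     => curl_getinfo($ch, CURLINFO_HTTP_CODE),       // status of the last response
        'final_url'  => curl_getinfo($ch, CURLINFO_EFFECTIVE_URL),   // URL after any redirects
        'sent_agent' => findRequestHeader($sent, 'User-Agent'),
    ];
    curl_close($ch);
    return $info;
}
```

Calling `print_r(inspectRequest('http://website.com', $useragent));` would then show the status code, the final URL after redirects, and the User-Agent value that went out on the last request.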
You'll find that the Firefox extension LiveHTTPHeaders helps you see exactly what happens to the headers during a normal browsing session.
http://livehttpheaders.mozdev.org/
This will increase your understanding of how your target server responds, and even shows if it redirects your request internally.
Related
If we open the same URL in the browser, it downloads the CSV file automatically. I want the same behavior using PHP cURL, saving the CSV file in the same folder, but it gives me an empty result every time. Can you please tell me what is wrong in the code below?
$url="https://www.centrano.com/catalog_download.php?email=info#sporttema.com&password=dHB3L1FpTEg1c2pLZ29SUkdnUWcwWTFqN2RIamQx";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
$agent= "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.117 Safari/537.36";
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, false);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 100000); //time out of 15 seconds
$output = curl_exec($ch);
print_r($output);
curl_close($ch);
Add the following two CURL options to make it work:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_COOKIEJAR, '/tmp/cookies.txt');
The page redirects several times to the same host and requires the session cookies to remain present (hence storing them in a cookie jar).
Also: change your centrano.com account password IMMEDIATELY! Even though it helped solve this, it's generally not a good idea to make it public.
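Combined with a file handle, the two options can be sketched roughly as follows. The function name `downloadToFile` and the URL and paths in the usage comment are placeholders invented for this example. `CURLOPT_FILE` is a standard option that streams the response body straight into an open file instead of returning it.

```php
<?php
// Hypothetical helper: follow redirects, keep session cookies between hops,
// and stream the response body into $destPath. Returns true on success.
function downloadToFile(string $url, string $destPath, string $cookieJar): bool
{
    $fp = fopen($destPath, 'w');
    if ($fp === false) {
        return false;
    }

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);  // follow the redirect chain
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieJar); // turn on cookie handling and save cookies at the end
    curl_setopt($ch, CURLOPT_FILE, $fp);             // write the body to the file

    // Success means the transfer completed and the final response was 200.
    $ok = curl_exec($ch) !== false
        && curl_getinfo($ch, CURLINFO_HTTP_CODE) === 200;

    curl_close($ch);
    fclose($fp);

    if (!$ok) {
        unlink($destPath); // do not leave an empty or partial file behind
    }
    return $ok;
}

// Usage (placeholder URL and paths):
// downloadToFile('https://example.com/export.csv', __DIR__ . '/catalog.csv', '/tmp/cookies.txt');
```

Checking the return value, rather than printing the result of `curl_exec()`, makes it easier to tell a failed transfer apart from an empty file.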
I am using PHP Curl with this code:
curl_setopt($ch, CURLOPT_URL, 'https://www.segundamano.mx/anuncios/ciudad-de-mexico/alvaro-obregon/florida/renta-inmuebles/departamentos?precio=0-10000');
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookies);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookies);
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_ANY);
//curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1; rv:22.0) Gecko/20100101 Firefox/22.0");
$uagent = 'Mozilla/5.0 (Windows NT 6.1; rv:22.0) Firefox/22.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/36.0.1985.125 Chrome/36.0.1985.125 Safari/537.36';
curl_setopt($ch, CURLOPT_USERAGENT, $uagent);
curl_setopt($ch, CURLOPT_REFERER, 'http://www.google.com');
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
My question is: why does PHP cURL give a different result than opening the URL in a browser?
PHP cURL returns a large body that contains this line:
In Spanish: "No encontramos resultados para tu búsqueda..."
In English: "There are no results for your search..."
What is happening with this URL?
How can I fetch this URL with cURL and read the real results in code, as the browser shows them?
Please help me!
Thanks!
The link you mentioned is a single-page web application: a site that interacts with the user by dynamically rewriting the current page rather than loading entire new pages from a server.
Also, this website uses Vue.js.
See the links below for more details.
https://en.wikipedia.org/wiki/Single-page_application
https://vuejs.org/
Because JavaScript is the root of all evil: the website fetches the search results you want with AJAX after the page has loaded successfully. Just open the "Network" tab of your browser's inspection tools and watch the requests fly by.
Fun part: the website does have a (seemingly authorized) API that it talks to, so maybe you can try that? https://webapi.segundamano.mx/nga/api/v1.1/public
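If the API route works out, the response will most likely be JSON rather than HTML, so it can be decoded directly instead of scraped. This is only a sketch: `fetchJson` and `decodeJsonBody` are names made up here, and the exact endpoint path, parameters, and response shape under that base URL are not known from this thread, so check them in the network tab first.

```php
<?php
// Hypothetical helper: decode a JSON response body into an array,
// or return null when the body is not a JSON object/array (e.g. an HTML error page).
function decodeJsonBody(string $body): ?array
{
    $data = json_decode($body, true);
    return is_array($data) ? $data : null;
}

// Hypothetical helper: GET a URL and decode the JSON it returns.
function fetchJson(string $url): ?array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_HTTPHEADER, ['Accept: application/json']);
    $body = curl_exec($ch);
    curl_close($ch);

    return $body === false ? null : decodeJsonBody($body);
}
```

A `null` result here means either the transfer failed or the server answered with something other than JSON, which is worth distinguishing while debugging.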
I am using cURL on my server to fetch www.yelp.com, but when the script runs under https://localhost the output comes without any CSS stylesheets. I have tried:
CURLOPT_SSL_VERIFYPEER, FALSE
But the issue is not fetching the page; it is that my browser (Chrome) does not seem to apply any CSS formatting. Any ideas?
For example, running the code below from http://localhost gives a well-formatted page. Running the same code from https://localhost gives a page without CSS.
<?php
$url="http://www.yelp.com/";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.2; WOW64; rv:17.0) Gecko/20100101 Firefox/17.0");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch,CURLOPT_TIMEOUT,10);
$cl = curl_exec($ch);
echo $cl;
exit;
I'm using the google text to speech api, but for some reason it's being really slow when I connect to it via php or command line.
I'm doing this:
$this->mp3data = file_get_contents("http://translate.google.com/translate_tts?tl=en&q={$text}");
Where $text is just a urlencoded string.
I've also tried doing it via wget on the command line:
wget "http://translate.google.com/translate_tts?tl=en&q=test"
Either way it takes about 20 seconds or more. With PHP it does eventually get the contents and add them to a new file on my server, as I want it to. With wget the connection times out.
However, if I just go to that URL in the browser, it's pretty much instant.
Could anyone shed any light on why this might be occurring?
Thanks.
It's due to how Google treats robots. You need to spoof the User-Agent header so the request looks like it comes from a regular browser.
Some info on how to go about this is here:
https://duckduckgo.com/?q=php%20curl%20spoof%20user%20agent
Managed to sort this out now. This is what I ended up doing, and it now takes only a few seconds:
$header=array("Content-Type: audio/mpeg");
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $uri);
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
$this->mp3data = curl_exec($ch);
curl_close($ch);
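The snippet above does not show how `$uri` is built. One way to do it, assuming the same `translate_tts` endpoint and the `tl` and `q` parameters from the question, is to let `http_build_query()` handle the URL encoding. `buildTtsUrl` is a name invented for this sketch.

```php
<?php
// Hypothetical helper: build the request URL with properly encoded parameters.
function buildTtsUrl(string $text, string $lang = 'en'): string
{
    // The explicit '&' separator avoids depending on the arg_separator.output ini setting.
    return 'http://translate.google.com/translate_tts?'
        . http_build_query(['tl' => $lang, 'q' => $text], '', '&');
}

$uri = buildTtsUrl('hello world & more');
// $uri is now http://translate.google.com/translate_tts?tl=en&q=hello+world+%26+more
```

Encoding the text this way keeps characters such as `&` or spaces in the phrase from being read as part of the query string.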
I have written a PHP script that authenticates me and sends back a URL with a ticket: www.example.com/login?ticket=xxxx
When I enter this URL in the browser, it works perfectly. Checking the HTTP headers sent in that case confirms that no POST variables are required and no cookie is sent.
When I send exactly the same HTTP headers through cURL, I keep getting a 500 error with this description:
--cas:authenticationFailure code='INVALID_TICKET'--
However, the ticket itself can't be invalid, since it works perfectly in a browser.
The cURL configuration is the following:
curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, false);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_COOKIESESSION, 0);
curl_setopt ($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11');
Thanks in advance for helping on this.