I'm trying to learn cURL with PHP to spoof the referrer to a website.
With the following script I expected to accomplish this...but it seems to not work.
Any ideas/suggestion where I am going wrong??
Or do you know of any tutorials that could help me figure this out?
Thanks!
Jessica
<?php
$host = "http://mysite.com";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $host);
curl_setopt($ch, CURLOPT_VERBOSE, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_AUTOREFERER, false);
curl_setopt($ch, CURLOPT_REFERER, "http://google.com");
curl_setopt($ch, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1);
curl_setopt($ch, CURLOPT_HEADER, 0);
$result = curl_exec($ch);
curl_close($ch);
?>
You wont be able to see the result in webserver's analytics because it might probably using a javascript to get the analytics and curl wont run/execute the javascript. All Curl will do is get the content of the page as it like it is a text file. It wont run any of the scripts or anything.
To be more clear if you have an html tag like
<img src="path/to/image/image.jpg" />
The curl will treat it as a line of text. it wont load the image.jpg from the server. The same goes with the js if their is a
<script type="text/javascript" src="analytics.js"></script>
Normally the browser will load that analytics.js and run it, but the curl wont.
Related
My php Curl request(also curl command in terminal) unfortunately shows a different code than opening the URL manually in the browser.
Here is my problem: I want to display the currently available films from https://reservierung.kinolambach.at/filmlist/?location_id=1 (german cinema)
on my website. When you check your network tab, the website sends an ajax call to a url with the specific date to the server and gets a response. Try it out by calling https://reservierung.kinolambach.at/filmlist/?location_id=1&date=4.1.2021&film_table=true in your browser(due to Covid there is a test film available on 4. Jan 2021)
My website does not run on the same domain.
Using an ajax call did not work because of CORS-Policy, so my second approach is to load the data via curl from php.
Here is my php code, which does not work. For every request I make I get the result "No film available on this day". It seems as if "No film available on this day" (german: 'Für diesen Tag ist kein Programm verfügbar.') is the "default" response.
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://reservierung.kinolambach.at/filmlist/?location_id=1&date=4.1.2021&film_table=true');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 0);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_HTTPGET, 1);
curl_setopt($ch, CURLOPT_COOKIESESSION, 0);
$result = curl_exec($ch);
echo($result);
curl_close($ch);
There are several curl_setopt which I tried out, none did work.
Here is my question
How is it possible, that curl shows a different result than calling the url in the browser? You can even try it in your terminal with curl https://reservierung.kinolambach.at/filmlist/?location_id=1&date=1.1.2021&film_table=true
How can I change this? I guess it has to do with the header.
Thanks in advance and all the best!
curl_setopt($ch, CURLOPT_URL, 'https://reservierung.kinolambach.at/filmlist/?
location_id=1&date=4.1.2021&film_table=true');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_COOKIEFILE, 'C:/tmp/cookies.txt');
curl_setopt($ch, CURLOPT_COOKIEJAR, 'C:/tmp/cookies.txt');
curl_setopt($ch,CURLOPT_SSL_VERIFYPEER, false);
Works form me with cookies enabled. On second request as my comment before.
I have the following code to capture the html code of a given url:
$url = "https://fnet.bmfbovespa.com.br/fnet/publico/exibirDocumento?id=77212&cvm=true";
$ch = curl_init();
curl_setopt($ch, CURLOPT_CAINFO, '/etc/ssl/certs/cacert.pem');
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$html = curl_exec($ch);
echo "$url\n\n";
die($html);
For some reason the result of the following url is not as expected:
"https://fnet.bmfbovespa.com.br/fnet/publico/exibirDocumento?id=77212&cvm=true"
Instead of the code, the result is a giant meaningless string.
I've have successfully used the same code with other pages of the same domain.
I can assure that the desired page's content is not loaded by any js/ajax method (i did the test loading the page when disabling javascript).
My question is:
There is any cUrl option that i should set to correct this error?
My whole site depends on capturing this pages.
Any help would be truly appreciated.
That is base64 encoded, all you need to do is decode it back to plain text like this
echo base64_decode($html);
and you will see HTML
I'm trying to parse a shopping site webpage using CURL in PHP.
the url is: http://computers.pricegrabber.com/printers/HP-Officejet-Pro-8600-Plus-All-One-Wireless-Inkjet-Printer/m916995235.html/zip_code=97045/sort_type=bottomline
Here's the code I use.
function getWebsiteCURL($url)
{
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_FRESH_CONNECT, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$output = curl_exec($ch);
curl_close($ch);
return $output;
}
echo getWebsiteCURL("http://computers.pricegrabber.com/printers/HP-Officejet-Pro-8600-Plus-All-One-Wireless-Inkjet-Printer/m916995235.html/zip_code=97045/sort_type=bottomline");
It works, but I couldn't get the full HTML code.
Anybody have any idea why?
TIA.
This is often caused by connection timeout.
Try setting the opt:
CURLOPT_TIMEOUT => 120
curl cannot interpret Javascript, you can see what curl will see if you disable javascript in a browser and navigate to the page. If you need to interpret Javascript then I would use a headless browser like phantomjs. In PHP you could use PHP PhantomJS.
my curl function cannot follow the redirection of Facebook external link redirector, l.php and i have no idea what's wrong...
here is the code that i'm working on and i commented the lines that i've tried... and an example link (http://www.facebook.com/l.php?u=http%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DGvhFyNLK66A%26feature%3Dyoutu.be&h=xAQFD_3svAQFKxF5YrtqNQ5cL3lIQxo0uaC9PoB7qAvG7Yw&enc=AZPxNZ8P5q54FREC37UC_MP02pwh2DOmsI5bbFkoQm5VUPUlYeNzQASjarRjhTtcedRkmM3mDjK7J_r_P5pRpYhL)
function connect($u) {
$ch= curl_init();
curl_setopt($ch, CURLOPT_URL, $u);
curl_setopt($ch, CURLOPT_USERAGENT, $_SERVER['HTTP_USER_AGENT']);
curl_setopt($ch, CURLOPT_HEADER, true);
//curl_setopt($ch, CURLOPT_FRESH_CONNECT, true);
//curl_setopt($ch, CURLOPT_REFERER, 'spie');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
//curl_setopt($ch, CURLOPT_AUTOREFERER, true );
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
//curl_setopt($ch, CURLOPT_VERBOSE, true);
//curl_setopt($ch, CURLOPT_MAXREDIRS, 10);
$source=curl_exec($ch);
curl_close($ch);
return $source;
}
thank you..
I first thought this was a redirect issue with cURL (safe mode enabled for instance). But it actually comes from how Facebook redirector works.
There is no Location: header, so curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); won't help you with it.
The Facebook link page actually redirects you using Javascript:
<script type="text/javascript">document.location.replace("http:\/\/www.youtube.com\/watch?v=GvhFyNLK66A&feature=youtu.be");</script>
cURL cannot analyse the content of the page nor execute javascript so this is exepcted behaviour. If you still want to do this, you'll need to parse the content of the page, grab the URL from the javascript, and issue an new cURL request to this URL.
Apparently only HTTP redirects are supported by cURL with the '--location' option.
Reference: https://everything.curl.dev/http/redirects#non-http-redirects
With this code I'm trying to download this web page: http://www.kayak.com/s/...
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,'http://www.kayak.com/s/search/air?ai=kayaksample&do=y&ft=ow&ns=n&cb=e&pa=1&l1=ZAG&t1=a&df=dmy&d1=4/10/2010&depart_flex=exact&r1=y&l2=LON&t2=a&d2=11/10/2010&return_flex&r2=y');
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FAILONERROR, true);
curl_setopt($ch, CURLOPT_REFERER,"http://wwww.google.com");
$content = curl_exec ($ch);
echo $content;
You can see the demo at: http://www.pointout.org/test.php
As you can see the part with prices is missing.
What could be wrong?
This is not going to work the way you think it will. The reason is the prices are not in the initial HTML response that you get. Rather, there is some Javascript magic occurring which is using AJAX to load the prices when the page is loaded.