(PHP) Getting Blank Page with Curl (Mytischtennis) - php

I´m new to php and trying to program a few things.
In the first step I want to show the following page: "https://www.mytischtennis.de/public/home" on my website. I am using Curl to grab the page. But every time I want to output the page I am getting a blank page.
My code looks like this:
<?php
function grab_page($site){
$ch = curl_init();
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.52 Safari/537.17');
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_VERBOSE, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, 40);
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
curl_setopt($ch, CURLOPT_URL, $site);
ob_start();
return curl_exec($ch);
ob_end_clean();
curl_close($ch);
}
echo grab_page("https://www.mytischtennis.de/public/home");
echo "hallo";
?>
With other pages this code works. Only for "https://www.mytischtennis.de/public/home" it wont work for me?
Can someone help me why i get only with this site a blank page?
Thank you :)

You need to set two more options in your curl request:
// Add some more headers here if you need to
$headers = [
"Accept-Encoding: gzip, deflate, br"
];
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
// Decode the response automatically when needed
curl_setopt($ch, CURLOPT_ENCODING, '');
After this you will get the page you want.

Related

Get an existing Captcha image via cURL

I'm trying to get a Captcha(old, the image one) image from a web page. But, I know it always changes and being regenerated on every HTTP request. But I can't get the image via cURL.
I've tried this with this code in PHP:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://example.com/login.aspx');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_COOKIESESSION, true);
curl_setopt($ch, CURLOPT_FRESH_CONNECT, true);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.2309.372 Safari/537.36");
curl_setopt($ch, CURLOPT_NETRC, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
$data = curl_exec($ch);
curl_close($ch);
Image just comes as empty. There is field like Captcha but nothing is written on it. I couldn't understand if there is a difference between browser request or cURL request.

Simulate a real browser in curl

for some time I have been trying to write the ideal browser simulations, for this purpose I wrote a script that to some extent simulates the browser and works correctly on many pages with ssl. Recently during the test site pornhub.com and wikipedia.com encountered a strange error in my script, just for pornhub after a few page reloads shows the status of header "Loading .. Content-length: 1456" and the number of loaded data changes in real time on smaller and larger values. I have a question for already very experienced and professional programmers: Have you met with such a situation, if so you have any hints or corrections for my script ?.
I post my code (test for wikipedia). If you fire it on 3 browser tabs and you will refresh, you will get an error.
<?php
function curl($url)
{
$headers = [
'Accept-Language: pl,en-US;q=0.9,en;q=0.8',
'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
'User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36'
];
$cookie = 'cookie.txt';
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_MAXREDIRS, 5);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_VERBOSE, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 0);
curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_ANY);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
curl_setopt($ch, CURLOPT_REFERER, 'https://www.wikipedia.org');
curl_setopt($ch, CURLOPT_ENCODING, 'gzip');
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
if (!file_exists($cookie)){
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie);
}else{
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie);
}
$c = curl_exec($ch);
curl_close($ch);
return $c;
}
echo curl('https://www.wikipedia.org');
?>

PHP Curl not executing

I am trying to retrieve the HTML from a user profile on Instagram using cURL.
I am new to cURL so do not know the cause of this error.
Nothing happens when the cURL is executed , the page seems to refresh?
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://www.instagram.com/zohebchaudhry1/');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_COOKIEFILE, 'cookiess.txt');
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookiess.txt');
curl_setopt($ch ,CURLOPT_TIMEOUT , 10);
curl_setopt( $ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36" );
$html = curl_exec($ch);
curl_close($ch);
echo $html;
above is the PHP cURL code.
It appears that cURL is working, however you're unable to see the output because printing HTML may not be desired.
I suggest replacing echo $html; with echo htmlentities($html);
Read more: php.net/htmlentities

Server Error in '/' Application when attempting to cURL

I am trying to download a pdf using cURL and am getting stuck on a "Server Error in '/' Application" page. My code:
$url = "https://some.domain.com/Reports/Report?ReportID=123456"
$ch = curl_init($url);
$header = array ('Host: some.domain.com');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.76 Safari/537.36');
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($ch, CURLOPT_SSLVERSION, 4);
curl_setopt($ch, CURLOPT_SSL_CIPHER_LIST, implode(':', $arrayCiphers));
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_COOKIE, "ASP.NET_SessionId=XXXXXXXX; __RequestVerificationToken_XXXXXXX=lots-of-alpha-numeric-characters");
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookies.txt");
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookies.txt");
curl_setopt($ch, CURLOPT_REFERER, "https://some.domain.com");
$output = curl_exec($ch);
echo $output;
curl_close($ch);
Is there something else I can try or some more debugging I can do?
[edit] Apparently it's caused by one of my parameters. There are several parameters in the URL. &Flag=True seems to be causing the error. If I change it to &Flag=False I get a blank page.
This error was being caused because my curl request header did not include cookies that were required to download the file. Adding those cookies in the header fixed it.

Using cURL to download a site's HTML source, but getting different file than intended

I'm trying to use cURL and PHP to download the HTML source (as it appears in the browser) of here. But instead of the actual source code, this is returned instead (a meta refresh link set to 0).
<html>
<head><title>Object moved</title></head>
<body>
<h2>Object moved to here.
</h2>
</body>
</html>
I'm trying to spoof the referral header to be the site, but it seems I'm doing it wrong. Code is below. Any suggestions? Thanks
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://www.windowsphone.com/en-US/apps/ea39f002-ac30-e011-854c-00237de2db9e');
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 5.1) AppleWebKit/535.6 (KHTML, like Gecko) Chrome/16.0.897.0 Safari/535.6');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1);
curl_setopt($ch, CURLOPT_AUTOREFERER, false);
curl_setopt($ch, CURLOPT_REFERER, "http://www.windowsphone.com/en-US/apps/ea39f002-ac30-e011-854c-00237de2db9e");
$html = curl_exec($ch);
curl_close($ch);
Add the curl option to follow redirects:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
If it is a meta refresh and not an HTTP moved header, see:
PHP: Can CURL follow meta redirects
As mentioned by flesk, you may also need to store the cookies.
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://www.windowsphone.com/en-US/apps/ea39f002-ac30-e011-854c-00237de2db9e');
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 5.1) AppleWebKit/535.6 (KHTML, like Gecko) Chrome/16.0.897.0 Safari/535.6');
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt");
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($ch, CURLOPT_REFERER, "http://www.windowsphone.com");
$html = curl_exec($ch);
curl_close($ch);
echo $html;
The problem isn't the referrer but that you need to enable cookies for it to work.
Try something like this:
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt");
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
You have to query the page twice. First allow redirects to get the cookie from login.live.com, then query again with the cookie set.

Categories