I need to access this URL in PHP:
https://wmf.ok.ru/play;jsessionid=a-pt2O8FJKq_wzqod9LAJNtwgjNSjaNa-KVIGc1d1eRUSWhdAw9dlDo13fLzh57rGyKPzk2V0jMFrnKw8R4HjA.p162X6pZ_FG0kKMmKa6bkQ?client=flash&jsonp=&tid=40542951634095&ctx=my
But my PHP code gets a 404 error. As far as I can tell everything else is set up correctly, so I suspect the ; symbol in the URL is the problem. The link above opens in Chrome, but not through PHP cURL. Here is my code:
<?php
error_reporting(E_ALL);
ini_set('display_errors', 1);
function file_get_contents_curl($url) {
$ch = curl_init();
curl_setopt($ch, CURLOPT_AUTOREFERER, TRUE);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch,CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36');
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'Accept: application/json, text/javascript, */*; q=0.01',
'Accept-Encoding: gzip, deflate, br',
'Accept-Language: en-US,en;q=0.9,az;q=0.8,tr;q=0.7,uz;q=0.6,ru;q=0.5',
'Referer: https://ok.ru/',
'Origin: https://ok.ru'
));
$data = curl_exec($ch);
if(curl_error($ch))
{
echo 'error:' . curl_error($ch);
}
echo curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
curl_close($ch);
return $data;
}
$url = 'https://wmf.ok.ru/play;jsessionid=a-pt2O8FJKq_wzqod9LAJNtwgjNSjaNa-KVIGc1d1eRUSWhdAw9dlDo13fLzh57rGyKPzk2V0jMFrnKw8R4HjA.p162X6pZ_FG0kKMmKa6bkQ?client=flash&jsonp=&tid=40542951634095&ctx=my';
echo file_get_contents_curl($url);
?>
After running this code, I get a 404 error page from a Microsoft server. How can I make cURL open URLs like this?
Just add this to your function and it will work:
curl_setopt($ch, CURLOPT_ENCODING, 'gzip, deflate');
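A minimal sketch of what this option changes, assuming the server compresses its response because the request advertises Accept-Encoding: gzip. Without CURLOPT_ENCODING, cURL hands back the compressed bytes as received; with it, cURL decodes them before returning. The sample payload below is made up, and gzencode()/gzdecode() from PHP's zlib extension stand in for the server and for cURL's decoding step:

```php
<?php
// Hypothetical JSON body, standing in for what the server would send.
$original = '{"status":"ok"}';

// What arrives on the wire when the server honours "Accept-Encoding: gzip".
$compressed = gzencode($original);

// Without decoding, the body starts with the gzip magic bytes 0x1f 0x8b
// and is not readable text.
$looksLikeGzip = substr($compressed, 0, 2) === "\x1f\x8b";

// Decoding restores the payload; CURLOPT_ENCODING asks cURL to do this step.
$decoded = gzdecode($compressed);

var_dump($looksLikeGzip);         // bool(true)
var_dump($decoded === $original); // bool(true)
```

Passing an empty string to CURLOPT_ENCODING instead of a fixed list makes cURL advertise every encoding it was built to decode.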
Here is the full working function:
<?php
error_reporting(E_ALL);
ini_set('display_errors', 1);
function file_get_contents_curl($url) {
$ch = curl_init();
curl_setopt($ch, CURLOPT_AUTOREFERER, TRUE);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch,CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36');
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_ENCODING, 'gzip, deflate');
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'Accept: application/json, text/javascript, */*; q=0.01',
'Accept-Encoding: gzip, deflate, br',
'Accept-Language: en-US,en;q=0.9,az;q=0.8,tr;q=0.7,uz;q=0.6,ru;q=0.5',
'Referer: https://ok.ru/',
'Origin: https://ok.ru'
));
$data = curl_exec($ch);
if(curl_error($ch))
{
echo 'error:' . curl_error($ch);
}
echo curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
curl_close($ch);
return $data;
}
$url = 'https://wmf.ok.ru/play;jsessionid=a-pt2O8FJKq_wzqod9LAJNtwgjNSjaNa-KVIGc1d1eRUSWhdAw9dlDo13fLzh57rGyKPzk2V0jMFrnKw8R4HjA.p162X6pZ_FG0kKMmKa6bkQ?client=flash&jsonp=&tid=40542951634095&ctx=my';
echo file_get_contents_curl($url);
?>
Related
I am trying to scrape a site that is behind Cloudflare.
My code is:
$ch = curl_init();
curl_setopt($ch,CURLOPT_URL, 'https://targetsite');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "POST");
curl_setopt($ch, CURLOPT_POSTFIELDS, '{"current_bid_status":true}');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookie.txt");
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookie.txt");
curl_setopt($ch, CURLOPT_HTTPHEADER, [
'User-Agent: Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.130 Safari/537.36',
'Accept: application/json',
'Accept-Language: en-US,en;q=0.5',
'Content-Type: application/x-www-form-urlencoded',
'Content-Length: '.strlen($data)
]);
$result = curl_exec($ch);
$status = curl_getinfo ($ch);
The response status was 403, and the response body was: error code: 1020
It looks like Cloudflare is blocking the request.
But when I route the request through the Fiddler proxy:
curl_setopt($ch, CURLOPT_PROXY, '127.0.0.1:8888');
It works nicely!
What could be the reason? Is it related to the SSL certificate?
I'm trying to get just the HTML of a page. I have added every header and SSL option I could think of to the cURL request, but the site still detects that it is a cURL bot. How can I get around this, or how do they detect it?
When I request other pages on the same site I am not detected as a bot; it only happens on the search page.
$url = "https://suchen.mobile.de/fahrzeuge/search.html?damageUnrepaired=NO_DAMAGE_UNREPAIRED&isSearchRequest=true&maxPowerAsArray=PS&minPowerAsArray=PS&scopeId=C";
$data = curl($url);
echo $data;
function curl($url, $post = "") {
$cookie = "cookie.txt";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,$url);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_COOKIESESSION, true);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie);
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_ENCODING, '');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36');
curl_setopt($ch, CURLOPT_HTTPHEADER, array('authority: suchen.mobile.de', 'path: /fahrzeuge/search.html?damageUnrepaired=NO_DAMAGE_UNREPAIRED&isSearchRequest=true&maxPowerAsArray=PS&minPowerAsArray=PS&scopeId=C', 'scheme: https', 'accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8', 'accept-encoding: gzip, deflate, br', 'accept-language: en-US,en;q=0.9', 'upgrade-insecure-requests: 1'));
$data = curl_exec ($ch);
if (curl_error($ch))
return "Bad";
if (curl_getinfo($ch)["http_code"] == 200)
return $data;
}
I think the Firefox automatic proxy configuration uses TLS 1.2:
$headers = array(
'host:ip-adress.eu',
'User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:48.0) Gecko/20100101 Firefox/48.0',
'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Language: en-US,en;q=0.5',
'Accept-Encoding: gzip, deflate, br',
'Proxy-Authorization: Basic b2xkLmV1cm9tYWtlckB5YWhvby5jb206WEpzV0Uwc0ZqT1pOK2MydDZWZWc4WWFreklaUVVDSUcxbDVrWE1yK0xKVT0=',
'Connection: keep-alive'
);
$proxy = "GQ2S4MZSFYZTMLRRHAYSGMJUG4ZTQMJRGIYDA.cd-n.net:143";
$url = 'ip-adress.eu';
$curl = curl_init();
curl_setopt($curl, CURLOPT_SSLVERSION, 6);
curl_setopt($curl, CURLOPT_HTTPPROXYTUNNEL, true); // OK
curl_setopt($curl, CURLOPT_PROXY, $proxy); // OK
curl_setopt($curl, CURLOPT_URL, trim($url)); // OK
curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:48.0) Gecko/20100101 Firefox/48.0'); // OK
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_HTTPHEADER, $headers);
curl_setopt($curl, CURLOPT_TIMEOUT, 30);
curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 30);
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($curl, CURLOPT_VERBOSE, false);
$httpresult = curl_exec($curl);
$httpstatus = curl_getinfo($curl, CURLINFO_HTTP_CODE);
print curl_errno ($curl);
curl_close($curl);
This code returns the cURL error: curl: (56) Recv failure: Connection reset by peer
I am having a problem with PHP file_get_contents. I am trying to fetch information from the following URL, but I get a captcha page instead.
$link = 'http://www.wayfair.com/a/product_review_page/get_update_reviews_json?_format=json&product_sku=KUS1523&page_number=5&sort_order=relevance&filter_rating=&filter_tag=&item_per_page=5';
$Page_information = file_get_contents($link);
print_r($Page_information);
I have also tried to get the page with PHP cURL, but the same captcha page is displayed.
$cookie='cookie.txt';
if(!file_exists($cookie)){
$fh = fopen($cookie, "w");
fwrite($fh, "");
fclose($fh);
}
$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_URL, "http://www.wayfair.com/a/product_review_page/get_update_reviews_json?_format=json&product_sku=KUS1523&page_number=5&sort_order=relevance&filter_rating=&filter_tag=&item_per_page=5");
curl_setopt($ch, CURLOPT_BINARYTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_COOKIE,1);
curl_setopt($ch, CURLOPT_COOKIEJAR,$cookie);
curl_setopt($ch, CURLOPT_COOKIEFILE,$cookie);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);
$result11 = curl_exec($ch);
print_r($result11);
If you analyze the request headers sent by a browser with cookies and JavaScript disabled, you will see the bare minimum a client sends. Some of these headers, perhaps all of them, may be required by the site; with file_get_contents they are set through the context argument.
/* set the options for the stream context */
$args=array(
'http'=>array(
'method' => "GET",
'header' => array(
'User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:44.0) Gecko/20100101 Firefox/44.0',
'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Host: www.wayfair.com',
'Accept-Encoding: gzip, deflate'
)
)
);
/* create the context */
$context=stream_context_create( $args );
$link = 'http://www.wayfair.com/a/product_review_page/get_update_reviews_json?_format=json&product_sku=KUS1523&page_number=5&sort_order=relevance&filter_rating=&filter_tag=&item_per_page=5';
/* Get the response from remote url */
$res = file_get_contents( $link, false, $context );
/* process the response */
print_r( $res );
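Because the context above sends Accept-Encoding: gzip, deflate, the server may reply with a compressed body, which file_get_contents() returns as-is. A minimal sketch of decoding it, assuming the raw response header lines are available as an array. The helper name decode_http_body is made up for illustration, and only gzip is handled:

```php
<?php
/**
 * Hypothetical helper: gunzip $body when the response headers declare
 * "Content-Encoding: gzip"; otherwise return $body unchanged.
 */
function decode_http_body(string $body, array $headers): string
{
    foreach ($headers as $line) {
        // Header names are case-insensitive, so match with the "i" flag.
        if (preg_match('/^Content-Encoding:\s*gzip\s*$/i', $line)) {
            $decoded = gzdecode($body);
            // gzdecode() returns false on corrupt input; fall back to the raw body.
            return $decoded === false ? $body : $decoded;
        }
    }
    return $body;
}

// Offline demonstration with made-up header lines and a made-up payload.
$plain   = decode_http_body('hello', ['HTTP/1.1 200 OK', 'Content-Type: text/plain']);
$gzipped = decode_http_body(gzencode('hello'), ['HTTP/1.1 200 OK', 'content-encoding: gzip']);
```

After a file_get_contents() call over HTTP, the wrapper fills $http_response_header with the header lines, so the call would look like decode_http_body($res, $http_response_header).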
$url = "http://www.wayfair.com/a/product_review_page/get_update_reviews_json?_format=json&product_sku=KUS1523&page_number=5&sort_order=relevance&filter_rating=&filter_tag=&item_per_page=5";
$cookie = getcwd().DIRECTORY_SEPARATOR.'cookie.txt';
$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_COOKIE,1);
curl_setopt($ch, CURLOPT_COOKIEJAR,$cookie);
curl_setopt($ch, CURLOPT_COOKIEFILE,$cookie);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);
//added
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.95 Safari/537.36");
$result11 = curl_exec($ch);
print_r($result11);
try this
<?php
function instagram_login($data_sent_username, $data_sent_password){
$data_filtered_data = instagram_gettoken();
$data_rec_token = $data_filtered_data[0];
$data_rec_mid = $data_filtered_data[1];
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://www.instagram.com/accounts/login/ajax/');
curl_setopt($ch, CURLOPT_POSTFIELDS, 'username='.urlencode($data_sent_username).'&password='.urlencode($data_sent_password));
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'Host: www.instagram.com',
'Connection: keep-alive',
'Content-Length: 25',
'Origin: https://www.instagram.com',
'X-Instagram-AJAX: 1',
'Content-Type: application/x-www-form-urlencoded; charset=UTF-8',
'Accept: */*',
'X-Requested-With: XMLHttpRequest',
'X-CSRFToken: '.$data_rec_token.'',
'DNT: 1',
'Referer: https://www.instagram.com/accounts/login/',
'Accept-Encoding: gzip,deflate',
'Accept-Language: en-US',
'Cookie: mid='.$data_rec_mid.'; ig_pr=1; ig_vw=1319; csrftoken='.$data_rec_token.''));
curl_setopt($ch, CURLOPT_COOKIEFILE, getcwd() . '/instagram_cookie.txt');
curl_setopt($ch, CURLOPT_COOKIEJAR, getcwd() . '/instagram_cookie.txt');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/26.0.1410.43 Safari/537.31");
$data_rec_page = curl_exec($ch) or die(curl_error($ch));
echo $data_rec_page;
}
function instagram_gettoken(){
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://www.instagram.com/accounts/login/');
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('application/x-www-form-urlencoded', 'charset=UTF-8'));
curl_setopt($ch, CURLOPT_COOKIEFILE, getcwd() . '/instagram_cookie.txt');
curl_setopt($ch, CURLOPT_COOKIEJAR, getcwd() . '/instagram_cookie.txt');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/26.0.1410.43 Safari/537.31");
curl_setopt($ch, CURLOPT_REFERER, "https://www.instagram.com/");
$data_local_lines = file('instagram_cookie.txt');
foreach($data_local_lines as $data_local_line) {
if($data_local_line[0] != '#' && substr_count($data_local_line, "\t") == 6) {
$data_filter_tokens = explode("\t", $data_local_line);
$data_filter_tokens = array_map('trim', $data_filter_tokens);
$data_filtered_data[] = $data_filter_tokens[6];
}
}
return $data_filtered_data;
}
instagram_login("jackzett10","password");
?>
The output, surprisingly, is in JSON format, and no matter what password I enter I always get the same output:
{"status":"ok","authenticated":false,"user":"jackzett10"}
It took me some time to get the token and mid right for each request, so that is not the problem.
If I delete the cookie from the header array, the output doesn't change.
I took the headers from a login request I captured in Burp Suite.
Does anyone have any idea what I'm doing wrong here?
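1) Try adding curl_exec($ch); to instagram_gettoken(). As written, that function configures the handle but never executes the request before reading the cookie file.
2) Save the certificate file locally and try this: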
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);
curl_setopt($ch, CURLOPT_CAINFO, getcwd() . "/ca/DigiCertHighAssuranceEVRootCA.crt");
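Separately from the two suggestions above, the hard-coded 'Content-Length: 25' header in the question may be worth checking; this is an assumption, not a confirmed diagnosis. cURL computes Content-Length from CURLOPT_POSTFIELDS on its own, and a header that understates the body length can lead the server to read only part of the body. A small sketch comparing the two values, using a made-up password:

```php
<?php
// Placeholder credentials; the username is the one from the question.
$fields = ['username' => 'jackzett10', 'password' => 'example-password'];

// http_build_query() URL-encodes each value and joins the pairs with "&".
$body = http_build_query($fields, '', '&');

$declared = 25;            // value hard-coded in the question's header array
$actual   = strlen($body); // what the body really measures, in bytes

echo $body, "\n";
echo "declared: $declared, actual: $actual\n";
```

If the two numbers differ, removing the Content-Length entry from the CURLOPT_HTTPHEADER array lets cURL fill in the correct value.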