PHP cURL to read data from a webpage

I have been given a project that involves fetching data from this URL.
Simple HTML DOM has already failed for this, so I am now working with cURL:
function curl_download($Url){
if (!function_exists('curl_init')){
die('Sorry cURL is not installed!');
}
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $Url);
curl_setopt($ch, CURLOPT_REFERER, "www.idealo.de/preisvergleich/MainSearchProductCategory.html?q=0018208925063");
curl_setopt($ch, CURLOPT_USERAGENT, "MozillaXYZ/1.0");
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
$output = curl_exec($ch);
curl_close($ch);
return $output;
}
print curl_download('www.idealo.de/preisvergleich/MainSearchProductCategory.html?q=0018208925063');
This code returns a blank page. Can anyone please help me?

The likely reason is that the User-Agent string you send does not look like a real browser's, so the site refuses to serve the page.
Try the one below:
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.130 Safari/537.38");
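If the page is still blank after changing the User-Agent, it may help to see why. `curl_exec()` returns `false` on failure, and printing `false` outputs nothing. The following is a minimal sketch, not a definitive fix: the helper name `ensure_scheme`, the message formats, and the added `CURLOPT_FOLLOWLOCATION` option are assumptions for illustration.

```php
<?php
// Hypothetical helper: prepend a scheme when the URL has none,
// as in the question's 'www.idealo.de/...' argument.
function ensure_scheme(string $url, string $scheme = 'http'): string
{
    return preg_match('#^[a-z][a-z0-9+.-]*://#i', $url) ? $url : $scheme . '://' . $url;
}

function curl_download(string $url): string
{
    $ch = curl_init(ensure_scheme($url));
    // User-Agent string copied from the answer above.
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.130 Safari/537.38');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // assumption: the site may redirect
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);

    $output = curl_exec($ch);
    if ($output === false) {
        // Transport-level failure: report it instead of printing nothing.
        $output = 'cURL error: ' . curl_error($ch);
    } elseif ($output === '') {
        // The request succeeded but the body is empty: show the status code.
        $output = 'Empty body, HTTP status ' . curl_getinfo($ch, CURLINFO_HTTP_CODE);
    }
    curl_close($ch);
    return $output;
}
```

Calling `print curl_download('www.idealo.de/preisvergleich/MainSearchProductCategory.html?q=0018208925063');` with this version prints either the page, the cURL error, or the HTTP status code.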

Related

How to create a cURL function that behaves like wp_remote_post in core PHP

I want to create a cURL function that gives me the same response as WordPress's wp_remote_post. I am working with core PHP.
function curlPost($url, $data=null, $header=null) {
$ch = curl_init($url);
if(!empty($header)){
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
}
curl_setopt($ch, CURLOPT_TIMEOUT, 30);
if(!empty($header)){
curl_setopt( $ch, CURLOPT_HTTPHEADER, array('Content-Type:application/json'));
}
curl_setopt($ch, CURLOPT_USERAGENT,'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.52 Safari/537.17');
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_VERBOSE, 1);
$response = curl_exec($ch);
curl_close($ch);
return $response;
}
I am using this simple function, but I am not able to get the same response as wp_remote_post.
I want to create a function that takes the same input and produces the same result as wp_remote_post. I am new to WordPress and don't know how this function is written.
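For comparison, wp_remote_post returns an array with keys such as `headers`, `body`, and `response` (holding the status code), rather than a bare string. The sketch below shows one way to approximate that shape in plain PHP. The names `parse_header_block` and `curl_post_like_wp` are hypothetical, and the result is only loosely modelled on the WordPress one. Note also that the function in the question only sets `CURLOPT_POSTFIELDS` when `$header` is non-empty, so no body is sent unless a header is passed; the sketch checks `$data` instead.

```php
<?php
// Hypothetical helper: turn a raw header block into a lowercase-keyed array.
function parse_header_block(string $raw): array
{
    $headers = [];
    foreach (preg_split('/\r\n|\n/', trim($raw)) as $line) {
        if (strpos($line, ':') === false) {
            continue; // skip status lines and blank lines
        }
        [$name, $value] = explode(':', $line, 2);
        $headers[strtolower(trim($name))] = trim($value);
    }
    return $headers;
}

// Hypothetical function: POST with cURL and return an array loosely
// shaped like the result of wp_remote_post().
function curl_post_like_wp(string $url, $data = null, array $headers = []): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_POST, true);
    if ($data !== null) {
        curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
    }
    if ($headers) {
        curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
    }
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HEADER, true); // include response headers in the output
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);

    $raw = curl_exec($ch);
    if ($raw === false) {
        $error = curl_error($ch);
        curl_close($ch);
        return ['error' => $error];
    }
    $headerSize = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
    $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);

    return [
        'headers'  => parse_header_block(substr($raw, 0, $headerSize)),
        'body'     => substr($raw, $headerSize),
        'response' => ['code' => $code],
    ];
}
```

A call such as `curl_post_like_wp($url, json_encode($payload), ['Content-Type: application/json'])` would then give `$result['body']` and `$result['response']['code']`, similar to reading the WordPress result.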

Get an existing Captcha image via cURL

I'm trying to get a Captcha image (the old, image-based kind) from a web page. I know it changes and is regenerated on every HTTP request, but I can't get the image via cURL at all.
I've tried the following PHP code:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://example.com/login.aspx');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_COOKIESESSION, true);
curl_setopt($ch, CURLOPT_FRESH_CONNECT, true);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.2309.372 Safari/537.36");
curl_setopt($ch, CURLOPT_NETRC, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
$data = curl_exec($ch);
curl_close($ch);
The image just comes back empty. There is a field for the Captcha, but nothing is drawn in it. I can't tell whether there is a difference between the browser's request and the cURL request.

A website URL is not loading with Curl php

I am using Curl PHP to fetch data from remote site. My Script is:
<?php
$url = 'https://www.(url).com/';
$sleep = rand(10, 12);
sleep($sleep);
$agent= 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.110 Safari/537.36';
$ch = curl_init();
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_VERBOSE, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8','accept-encoding:gzip, deflate, sdch','accept:image/webp,image/*,*/*;q=0.8'));
curl_setopt($ch, CURLOPT_PROXY, "x.x.x.x:x");
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt($ch, CURLOPT_URL,$url);
$result=curl_exec($ch);
$mainPage = new simple_html_dom;
echo $mainPage->load($result);
But it returns a 403 Forbidden error in the response.
I have tried other, more complete User-Agent strings, but I still get this error.
Thanks in advance for suggestions and comments.

PHP Curl not executing

I am trying to retrieve the HTML from a user profile on Instagram using cURL.
I am new to cURL, so I do not know the cause of this error.
Nothing happens when the cURL request is executed; the page just seems to refresh.
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://www.instagram.com/zohebchaudhry1/');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_COOKIEFILE, 'cookiess.txt');
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookiess.txt');
curl_setopt($ch ,CURLOPT_TIMEOUT , 10);
curl_setopt( $ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36" );
$html = curl_exec($ch);
curl_close($ch);
echo $html;
Above is the PHP cURL code.
It appears that cURL is working, but you cannot see the output as text, probably because the browser renders the returned HTML instead of showing its source.
I suggest replacing echo $html; with echo htmlentities($html);
Read more: php.net/htmlentities
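To illustrate the difference, here is a small sketch with a stand-in string in place of the real response (the `$html` value is made up for the example and is not Instagram's output):

```php
<?php
// Stand-in for the markup returned by curl_exec().
$html = '<h1>Profile & "posts"</h1>';

// Escaping turns markup characters into entities, so a browser
// displays the tags as text instead of rendering them.
$escaped = htmlentities($html, ENT_QUOTES, 'UTF-8');
echo $escaped, PHP_EOL;
// prints: &lt;h1&gt;Profile &amp; &quot;posts&quot;&lt;/h1&gt;
```

In a browser the escaped string shows up as the literal source, which makes it easier to confirm that cURL actually returned something.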

simple_html_dom: trying to find height in google search

Can anyone explain what is wrong with this code and how I can get the height value? I am trying to get the heights of celebrities. Any suggestions?
Thanks.
My code (Updated with CURL user agent setting as advised):
$url='https://www.google.com/webhp?ie=UTF-8#q=ailee+height';
//Set CURL user agent
$ch = curl_init();
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.118 Safari/537.36');
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $url);
$data = curl_exec($ch);
curl_close($ch);
//simple html dom
require_once('lib/simple_html_dom.php');
$html = str_get_html($data);
$height= $html->find('div[class="_eF"]',0)->innertext;
echo $height;
I get empty output from the above code. In this case, I want it to return:
5' 5" (1.65 m)
The problem is that cURL does not execute JavaScript, and Google serves a different page when JavaScript is unavailable. In that page, the div is replaced by a span with a different class:
<span class="_m3b">1.65 m</span>
Also, the link you were using wasn't working for me.
Try this instead:
<?php
header('Content-Type: text/html; charset=utf-8');
$url='https://www.google.pt/search?q=ailee+height&num=10&gbv=1';
//Set CURL user agent
$ch = curl_init();
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.118 Safari/537.36');
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $url);
$data = curl_exec($ch);
curl_close($ch);
require_once('simple_html_dom.php');
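$html = str_get_html($data);
// find() returns null when nothing matches, so check before reading innertext
$span = $html ? $html->find('span[class="_m3b"]', 0) : null;
echo $span ? $span->innertext : 'height not found';
//1.65 m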
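One possible reason the original link did not work with cURL is that its search terms sit after the `#`, in the URL fragment. Fragments are handled by the client and are not part of the HTTP request, so the server never sees `q=ailee+height`. This is offered as a likely explanation rather than a confirmed diagnosis. A short sketch using the URL from the question:

```php
<?php
// URL from the question: the search terms come after '#'.
$original = 'https://www.google.com/webhp?ie=UTF-8#q=ailee+height';
$parts = parse_url($original);

echo 'query:    ', $parts['query'], PHP_EOL;    // prints: query:    ie=UTF-8
echo 'fragment: ', $parts['fragment'], PHP_EOL; // prints: fragment: q=ailee+height
```

The replacement URL in the answer puts `q=ailee+height` in the query string instead, which is the part the server receives.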
