I am trying to learn how to use PHP cURL by following a tutorial, using WAMP. When I open the page on localhost, I never see the result of the code, no matter what changes I make. All I see is:
This is my code:
<html>
<head>
</head>
<body>
<?php
function curl($url){
$options = Array(
CURLOPT_RETURNTRANSFER => TRUE,
CURLOPT_FOLLOWLOCATION => TRUE,
CURLOPT_AUTOREFERER => TRUE,
CURLOPT_CONNECTTIMEOUT => 120,
CURLOPT_TIMEOUT => 120,
CURLOPT_MAXREDIRS => 10,
CURLOPT_USERAGENT => "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1a2pre) Gecko/2008073000 Shredder/3.0a2pre ThunderBrowse/3.2.1.8",
CURLOPT_URL => $url,
$ch = curl_init();
curl_setopt_array($ch, $options);
$data = curl_exec($ch);
curl_close($ch);
return $data;
}
function scrape_between($data, $start, $end){
$data= stristr($data, $start);
$data= substr($data, strlen($start));
$stop= stripos($data, $end);
$data= substr($data, 0, $stop);
return $data;
}
$scraped_page = curl("http://www.imdb.com"); // Downloading IMDB home page to variable $scraped_page
$scraped_data = scrape_between($scraped_page, "<title>", "</title>"); // Scraping downloaded data in $scraped_page for content between <title> and </title> tags
echo $scraped_data; // Echoing $scraped data, should show "The Internet Movie Database (IMDb)"
?>
</body>
</html>
Change the file extension from .html to .php (i.e. make the file name end in .php). To quote John Conde from this answer:
You can't run PHP in .html files because the server does not recognize that as a valid PHP extension unless you tell it to.
Alternatively, you could configure the web server (e.g. Apache, IIS, etc.) to process files with the .html extension as PHP files.
Also, make sure the assignment of the options array ends with a closing parenthesis followed by a semicolon. For more information about arrays, see php.net/array. So this block:
$options = Array(
CURLOPT_RETURNTRANSFER => TRUE,
CURLOPT_FOLLOWLOCATION => TRUE,
CURLOPT_AUTOREFERER => TRUE,
CURLOPT_CONNECTTIMEOUT => 120,
CURLOPT_TIMEOUT => 120,
CURLOPT_MAXREDIRS => 10,
CURLOPT_USERAGENT => "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1a2pre) Gecko/2008073000 Shredder/3.0a2pre ThunderBrowse/3.2.1.8",
CURLOPT_URL => $url,
should be updated to:
$options = Array(
CURLOPT_RETURNTRANSFER => TRUE,
CURLOPT_FOLLOWLOCATION => TRUE,
CURLOPT_AUTOREFERER => TRUE,
CURLOPT_CONNECTTIMEOUT => 120,
CURLOPT_TIMEOUT => 120,
CURLOPT_MAXREDIRS => 10,
CURLOPT_USERAGENT => "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1a2pre) Gecko/2008073000 Shredder/3.0a2pre ThunderBrowse/3.2.1.8",
CURLOPT_URL => $url
);
You can see this working on this phpfiddle.
You are missing the closing ')' on line 18 (CURLOPT_URL => $url,).
Try this:
<html>
<head>
</head>
<body>
<?php
function curl($url){
$options = Array(
CURLOPT_RETURNTRANSFER => TRUE,
CURLOPT_FOLLOWLOCATION => TRUE,
CURLOPT_AUTOREFERER => TRUE,
CURLOPT_CONNECTTIMEOUT => 120,
CURLOPT_TIMEOUT => 120,
CURLOPT_MAXREDIRS => 10,
CURLOPT_USERAGENT => "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1a2pre) Gecko/2008073000 Shredder/3.0a2pre ThunderBrowse/3.2.1.8",
CURLOPT_URL => $url);
$ch = curl_init();
curl_setopt_array($ch, $options);
$data = curl_exec($ch);
curl_close($ch);
return $data;
}
function scrape_between($data, $start, $end){
$data= stristr($data, $start);
$data= substr($data, strlen($start));
$stop= stripos($data, $end);
$data= substr($data, 0, $stop);
return $data;
}
$scraped_page = curl("http://www.imdb.com"); // Downloading IMDB home page to variable $scraped_page
$scraped_data = scrape_between($scraped_page, "<title>", "</title>"); // Scraping downloaded data in $scraped_page for content between <title> and </title> tags
echo $scraped_data; // Echoing $scraped data, should show "The Internet Movie Database (IMDb)"
?>
</body>
</html>
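To check the scrape_between() helper on its own, without a network request, it can be run against a hard-coded string. This is only a sketch: the sample markup is made up, and the two `=== false` guards are additions that are not in the original code (without them, a missing start or end marker yields an empty or unexpected result rather than an error).

```php
<?php
// Same idea as the helper in the question: return the text between $start and $end
// (both matched case-insensitively).
function scrape_between($data, $start, $end) {
    $data = stristr($data, $start);          // drop everything before $start
    if ($data === false) {
        return '';                           // guard added for this sketch: $start not found
    }
    $data = substr($data, strlen($start));   // drop $start itself
    $stop = stripos($data, $end);            // position of $end in the remainder
    if ($stop === false) {
        return '';                           // guard added for this sketch: $end not found
    }
    return substr($data, 0, $stop);
}

// Hypothetical sample markup standing in for a downloaded page.
$sample = "<html><head><title>Sample Page</title></head><body></body></html>";
echo scrape_between($sample, "<title>", "</title>"), "\n";   // prints "Sample Page"
```

If this prints the expected title from a .php file but the real page still shows nothing, the problem is more likely in the request itself than in the string handling.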
I am using the following code to run cURL, but for one URL I am getting a 502 Bad Gateway error.
<?php
//$proxy = '127.0.0.1:80';
$curl = curl_init();
curl_setopt_array($curl, array(
CURLOPT_URL => '<requesturl>',
CURLOPT_RETURNTRANSFER => true,
CURLOPT_ENCODING => '',
CURLOPT_MAXREDIRS => 10,
CURLOPT_TIMEOUT => 0,
CURLOPT_FOLLOWLOCATION => true,
CURLOPT_SSL_VERIFYPEER=>false,
CURLOPT_SSL_VERIFYHOST=>false,
//CURLOPT_PROXY=>$proxy,
CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
CURLOPT_CUSTOMREQUEST => 'GET',
CURLOPT_HTTPHEADER => array(
'Cookie: PHPSESSID=e03338f51c56ada6870d530080127581'
),
));
$response = curl_exec($curl);
$err = curl_error($curl);
print_r($response);
print_r($err);
curl_close($curl);
?>
I also tried changing the URL from https to http, but that did not work either.
My PHP version is 5.6.
Thanks,
Rekha
I have found the answer. I added a user agent and it worked fine.
curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
To get the visitor's user agent dynamically in PHP:
$userAgent = $_SERVER['HTTP_USER_AGENT'];
Hopefully this will help someone.
Thanks,
Rekha
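One caveat with reading $_SERVER['HTTP_USER_AGENT'] directly is that the key is not set when the script runs from the command line or when the client sends no User-Agent header. A minimal sketch of a fallback, assuming a hypothetical helper named pick_user_agent() (not part of PHP or cURL):

```php
<?php
// Hypothetical helper: use the visitor's user agent if the request carried one,
// otherwise fall back to a fixed string (e.g. when running from the CLI).
function pick_user_agent(array $server, $fallback) {
    if (isset($server['HTTP_USER_AGENT']) && $server['HTTP_USER_AGENT'] !== '') {
        return $server['HTTP_USER_AGENT'];
    }
    return $fallback;
}

// Fallback string taken from the answer above.
$fallback  = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13';
$userAgent = pick_user_agent($_SERVER, $fallback);

if (function_exists('curl_init')) {
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_USERAGENT, $userAgent);
    // ... set CURLOPT_URL and the remaining options here, then call curl_exec($curl) ...
}
```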
This is the PHP code that I am using.
I tested this code on localhost (XAMPP) and everything works, but when I upload it to my host (1and1 hosting) it does not work. Can you help me find out why?
Thank you very much!
$url = "https://www.packagetrackr.com/track/1ZX799390355046642";
$page = get_web_page($url);
echo $page;
function get_web_page($url)
{
$options = array(
CURLOPT_RETURNTRANSFER => true, // return web page
CURLOPT_HEADER => true, // include response headers in the output
CURLOPT_FOLLOWLOCATION => true, // follow redirects
CURLOPT_ENCODING => "", // handle all encodings
CURLOPT_USERAGENT => "Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_2_1 like Mac OS X; en-us) AppleWebKit/533.17.9 (KHTML, like Gecko) Version/5.0.2 Mobile/8C148 Safari/6533.18.5", // who am i
CURLOPT_AUTOREFERER => true, // set referer on redirect
CURLOPT_CONNECTTIMEOUT => 120, // timeout on connect
CURLOPT_TIMEOUT => 120, // timeout on response
CURLOPT_MAXREDIRS => 10, // stop after 10 redirects
CURLOPT_SSL_VERIFYPEER => false, // Disabled SSL Cert checks
CURLOPT_REFERER => $url // referent
);
$ch = curl_init( $url );
curl_setopt_array( $ch, $options );
$content = curl_exec( $ch );
curl_close( $ch );
return $content;
}
You can try turning on allow_url_fopen in your php.ini file,
or
add php_value allow_url_fopen On to your .htaccess file.
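Before changing settings, it may help to compare what the two environments actually report. The sketch below is only a diagnostic aid under that assumption; remote_fetch_diagnostics() is a hypothetical name, and the keys it returns are chosen for illustration. Run it on both localhost and the host and compare the output.

```php
<?php
// Hypothetical diagnostic helper: reports settings that commonly differ
// between a local setup and shared hosting.
function remote_fetch_diagnostics() {
    return array(
        'php_version'     => PHP_VERSION,
        'curl_loaded'     => extension_loaded('curl'),
        'openssl_loaded'  => extension_loaded('openssl'),
        // ini_get() returns a string; normalise it to a real boolean.
        'allow_url_fopen' => filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN),
    );
}

print_r(remote_fetch_diagnostics());
```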
How can I redirect a user to another website along with a cookie?
I am using this code:
<?php
$fields_string = 'client_login=jadro&client_pass=jadro&client_remember=on&action=client_login';
$options = array(
CURLOPT_RETURNTRANSFER => true,
CURLOPT_HEADER => true,
CURLOPT_FOLLOWLOCATION => true,
CURLOPT_USERAGENT => "Mozilla/5.0 (Windows; U; Windows NT 6.1; ru; rv:1.9.1.7) Gecko/20091221 Firefox/3.5.7 FirePHP/0.3",
CURLOPT_AUTOREFERER => false,
CURLOPT_CONNECTTIMEOUT => 120,
CURLOPT_TIMEOUT => 120,
CURLOPT_MAXREDIRS => 10,
);
$ch = curl_init();
curl_setopt_array( $ch, $options );
curl_setopt($ch,CURLOPT_URL,'http://orion10.ru');
//curl_setopt($ch,CURLOPT_POST,count(explode('&',$fields)));
curl_setopt($ch,CURLOPT_POSTFIELDS,$fields_string);
//curl_setopt($ch,CURLOPT_COOKIEJAR, 'cooc.txt');
//curl_setopt($ch,CURLOPT_COOKIEFILE, 'cooc.txt');
curl_setopt($ch, CURLOPT_COOKIEJAR, getcwd()."/cookies.txt");
curl_setopt($ch, CURLOPT_COOKIEFILE, getcwd()."/cookies.txt");
$result = curl_exec($ch);
echo $result;
//header("Location: http://orion10.ru".session_name().'='.session_id());
header('Refresh: 15; URL='.$url['http://orion10.ru']);
exit();
?>
I need to authorize the user on another site.
cURL is executed on your server, so the website in question sees your server as the user. That means that when you redirect the actual user to that website, it won't recognize them. Read this.
I am trying to write a simple crawler method. When I use PHP cURL to get the www.yahoo.com page, I fetch nothing. How can I fetch the page?
My code is as follows.
public function getWebPage($url, $timeout = 120) {
$options = array(
CURLOPT_RETURNTRANSFER => true,
CURLOPT_HEADER => false,
CURLOPT_FOLLOWLOCATION => true,
CURLOPT_ENCODING => "",
CURLOPT_USERAGENT => "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.19) Gecko/20081216 Ubuntu/8.04 (hardy) Firefox/2.0.0.19",
CURLOPT_AUTOREFERER => true,
CURLOPT_CONNECTTIMEOUT => $timeout,
CURLOPT_TIMEOUT => $timeout,
CURLOPT_MAXREDIRS => 10,
);
$ch = curl_init($url);
curl_setopt_array($ch, $options);
$content = curl_exec($ch);
$err = curl_errno($ch);
$errmsg = curl_error($ch);
$header = curl_getinfo($ch);
curl_close($ch);
return $content;
}
yahoo.com is served over SSL (HTTPS), so add this cURL option to your existing set:
CURLOPT_SSL_VERIFYPEER => false,
and also disable the USERAGENT option.
The working code (tested):
<?php
class A
{
public function getWebPage($url, $timeout = 120) {
$options = array(
CURLOPT_RETURNTRANSFER => true,
CURLOPT_HEADER => false,
CURLOPT_FOLLOWLOCATION => true,
CURLOPT_ENCODING => "",
//CURLOPT_USERAGENT => "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.19) Gecko/20081216 Ubuntu/8.04 (hardy) Firefox/2.0.0.19",
CURLOPT_AUTOREFERER => true,
CURLOPT_CONNECTTIMEOUT => $timeout,
CURLOPT_SSL_VERIFYPEER => false,
CURLOPT_TIMEOUT => $timeout,
CURLOPT_MAXREDIRS => 10,
);
$ch = curl_init($url);
curl_setopt_array($ch, $options);
$content = curl_exec($ch);
$err = curl_errno($ch);
$errmsg = curl_error($ch);
$header = curl_getinfo($ch);
curl_close($ch);
return $content;
}
}
$a = new A;
echo $a->getWebPage('www.yahoo.com');
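Note that getWebPage() collects $err, $errmsg and $header but never uses them, so a failed request just looks like an empty result. A minimal sketch that returns those details instead, assuming a hypothetical function name fetch_with_diagnostics() and a shorter timeout for the demonstration:

```php
<?php
// Sketch: a request similar to getWebPage(), but returning the error details
// instead of discarding them. fetch_with_diagnostics() is a hypothetical name.
function fetch_with_diagnostics($url, $timeout = 120) {
    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_CONNECTTIMEOUT => $timeout,
        CURLOPT_TIMEOUT        => $timeout,
        CURLOPT_MAXREDIRS      => 10,
    ));
    $content = curl_exec($ch);   // false on failure
    return array(
        'content'   => $content,
        'errno'     => curl_errno($ch),
        'error'     => curl_error($ch),
        'http_code' => curl_getinfo($ch, CURLINFO_HTTP_CODE),
    );
}

if (function_exists('curl_init')) {
    $result = fetch_with_diagnostics('https://www.yahoo.com', 10);
    if ($result['content'] === false) {
        echo "cURL error {$result['errno']}: {$result['error']}\n";
    } else {
        echo "HTTP {$result['http_code']}, " . strlen($result['content']) . " bytes\n";
    }
}
```

Printing the cURL error number and message usually shows whether the failure is a certificate problem, a timeout, or something else, which is more informative than an empty page.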
I am trying to scrape a website using PHP and cURL, submitting a form with the POST method before scraping the resulting page. The problem is with the POST request: no data is submitted to the server, so the scraped page doesn't contain what I am looking for.
I am quite sure the problem is connected with the form type: enctype="multipart/form-data".
How can I manage this POST request, considering that the form is multipart/form-data?
Do I have to encode the post_string in a special way?
Here's the code I'm using:
function curl($url) {
//POST string
$post_string="XXXX";
$options = Array(
CURLOPT_RETURNTRANSFER => TRUE,
CURLOPT_FOLLOWLOCATION => TRUE,
CURLOPT_AUTOREFERER => TRUE,
CURLOPT_CONNECTTIMEOUT => 120,
CURLOPT_TIMEOUT => 120,
CURLOPT_MAXREDIRS => 10,
CURLOPT_USERAGENT => "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1a2pre) Gecko/2008073000 Shredder/3.0a2pre ThunderBrowse/3.2.1.8",
CURLOPT_URL => $url,
CURLOPT_CAINFO => dirname(__FILE__)."/cacert.pem",
CURLOPT_POSTFIELDS => $post_string,
);
$ch = curl_init();
curl_setopt_array($ch, $options);
$data = curl_exec($ch);
curl_error($ch);
curl_close($ch);
return $data;
}
$scraped_page = curl("XXXURLXXX");
echo $scraped_page;
Thank you!
Set CURLOPT_POST to true in the options array:
CURLOPT_POST => true,
Then build your post fields as an array, like this:
$postfields = array();
$postfields['field1'] = 'value1';
$postfields['field2'] = 'value2';
CURLOPT_POSTFIELDS => $postfields,
As the PHP manual says: "If value is an array, the Content-Type header will be set to multipart/form-data."
Yes, $post_string needs to be an array.
Also set CURLOPT_POST to true.
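Putting the two answers together, the difference between the two encodings can be sketched as follows. build_post_options() is a hypothetical helper, the field names are the placeholders used above, and http://example.com/form is a stand-in URL; the request itself is left commented out.

```php
<?php
// Hypothetical helper: build the POST-related cURL options for a form submission.
// An array passed to CURLOPT_POSTFIELDS is sent as multipart/form-data;
// a URL-encoded string is sent as application/x-www-form-urlencoded.
function build_post_options($url, array $fields, $multipart = true) {
    return array(
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => $multipart ? $fields : http_build_query($fields),
    );
}

// Placeholder field names and values.
$postfields = array('field1' => 'value1', 'field2' => 'value2');

if (function_exists('curl_init')) {
    $options = build_post_options('http://example.com/form', $postfields);
    $ch = curl_init();
    curl_setopt_array($ch, $options);
    // $data = curl_exec($ch);   // uncomment to actually send the request
}

// For comparison, the URL-encoded form of the same fields:
echo http_build_query($postfields), "\n";   // prints "field1=value1&field2=value2"
```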