Fetch the description from wikipedia from an article - php

I am trying to make an API call to Wikipedia through http://en.wikipedia.org/w/api.php?action=parse&page=Petunia&format=xml, but the XML is full of HTML and CSS tags.
Is there a way to fetch only the plain text, without tags? Thanks!
Edit 1:
$json = json_decode(file_get_contents('http://en.wikipedia.org/w/api.php?action=parse&page=Petunia&format=json'));
$txt = strip_tags($json->text);
var_dump($json);
This displays NULL.

The question was partially answered here:
$url = 'http://en.wikipedia.org/w/api.php?action=parse&page=Petunia&format=json&prop=text';
$ch = curl_init($url);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_USERAGENT, "TestScript"); // required by wikipedia.org server
$c = curl_exec($ch);
$json = json_decode($c);
var_dump(strip_tags($json->{'parse'}->{'text'}->{'*'}));
I was not able to use file_get_contents, but it works fine with cURL.
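The two steps in the answer above (request `prop=text`, then strip the tags from `parse.text.*`) can be sketched as two small helpers. This is only a sketch: the function names `wikipediaParseUrl` and `extractPlainText` are hypothetical, and the entity decoding and whitespace collapsing are additions that the original answer does not perform. Note that `strip_tags` keeps the text inside `<style>` elements, so some CSS may still leak through.

```php
<?php
// Hypothetical helper names; the query parameters are the ones used in the answer above.

/** Build the action=parse request URL for a page title. */
function wikipediaParseUrl(string $page): string
{
    return 'https://en.wikipedia.org/w/api.php?' . http_build_query([
        'action' => 'parse',
        'page'   => $page,
        'format' => 'json',
        'prop'   => 'text',
    ], '', '&');
}

/** Turn the JSON body returned by action=parse into plain text, or null on failure. */
function extractPlainText(string $jsonBody): ?string
{
    $data = json_decode($jsonBody, true);
    if (!is_array($data) || !isset($data['parse']['text']['*'])) {
        return null; // empty response, API error, or unexpected structure
    }
    $text = strip_tags($data['parse']['text']['*']);
    // Assumption: entities such as &amp; should become real characters.
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    // Collapse runs of whitespace left behind by the removed markup.
    return trim(preg_replace('/\s+/u', ' ', $text));
}
```

The body passed to `extractPlainText` would be whatever `curl_exec` returned for `wikipediaParseUrl('Petunia')`; checking for null first avoids the silent NULL seen in the question.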

It is possible to fetch the info or description from Wikipedia as XML:
$url = "http://en.wikipedia.org/w/api.php?action=opensearch&search=".urlencode($term)."&format=xml&limit=1";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_HTTPGET, TRUE);
curl_setopt($ch, CURLOPT_POST, FALSE);
curl_setopt($ch, CURLOPT_HEADER, false); // Include head as needed
curl_setopt($ch, CURLOPT_NOBODY, FALSE); // Return body
curl_setopt($ch, CURLOPT_VERBOSE, FALSE); // Minimize logs
curl_setopt($ch, CURLOPT_REFERER, ""); // Referer value
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE); // No certificate
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE); // Follow redirects
curl_setopt($ch, CURLOPT_MAXREDIRS, 4); // Limit redirections to four
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE); // Return in string
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 6.1; he; rv:1.9.2.8) Gecko/20100722 Firefox/3.6.8"); // Webbot name
$page = curl_exec($ch);
$xml = simplexml_load_string($page);
if((string)$xml->Section->Item->Description) {
print_r(array((string)$xml->Section->Item->Text,
(string)$xml->Section->Item->Description,
(string)$xml->Section->Item->Url));
} else {
echo "sorry";
}
Note that cURL must be installed on the server.

Related

set CURL on template blade laravel

I want to get an image from an Instagram link: https://www.instagram.com/p/B_zZCRpB895/media/?size=t
If you open that link, it redirects to the actual image URL. The redirect URL is this:
https://scontent-xsp1-1.cdninstagram.com/v/t51.2885-15/e35/c0.180.1440.1440a/s150x150/95528722_148590620037203_2029915803294658254_n.jpg?_nc_ht=scontent-xsp1-1.cdninstagram.com&_nc_cat=111&_nc_ohc=jeCE5U4ckBkAX8I42Og&oh=79d2aee23705bd88b58801d877642aed&oe=5F22A9C3
That redirect URL is what I want to capture and store in the DB.
Because I'm using an AWS server, I can't get the actual image link from the controller; I need to get it on the front end. So I created a Blade template called instagram.blade.php with this code:
<?php
$image = "https://www.instagram.com/p/B_zZCRpB895/media/?size=t";
$ch = curl_init();
curl_setopt ($ch, CURLOPT_URL, $image);
curl_setopt ($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0');
curl_setopt ($ch, CURLOPT_HEADER, TRUE);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
$result = curl_exec($ch);
curl_close($ch);
preg_match("|location: (https?://\S+)|", $result, $link);
dump($link[1]);
?>
This code runs fine on my local PC, but when I tried it on my AWS server it threw this exception:
Undefined offset: 1 (View: /home/ec2-user/......./views/instagram.blade.php)
How do I solve this?
$link[1] should contain this link:
https://scontent-xsp1-1.cdninstagram.com/v/t51.2885-15/e35/c0.180.1440.1440a/s150x150/95528722_148590620037203_2029915803294658254_n.jpg?_nc_ht=scontent-xsp1-1.cdninstagram.com&_nc_cat=111&_nc_ohc=jeCE5U4ckBkAX8I42Og&oh=79d2aee23705bd88b58801d877642aed&oe=5F22A9C3
Please help.
try{
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, ['your header']); // must be an array of header lines
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode('your fields data'));
$result = curl_exec($ch);
curl_close($ch);
} catch (Exception $e) {
return $e ;
}
return $result ;
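One possible cause of the "Undefined offset: 1" error in the question is that the pattern `location:` is matched case-sensitively. Header names are case-insensitive, so one server or HTTP version may send `Location:` and another `location:`, and the match then fails on one machine only. A minimal sketch of a more tolerant extraction follows; the function name `lastLocationHeader` is hypothetical, and this has not been verified against Instagram itself.

```php
<?php
/**
 * Return the value of the last Location header found in a raw response
 * (headers as returned when CURLOPT_HEADER is enabled), or null if none.
 */
function lastLocationHeader(string $rawResponse): ?string
{
    // "m" lets ^ match at every line start, "i" ignores header-name case.
    if (!preg_match_all('/^location:\s*(\S+)/mi', $rawResponse, $matches)) {
        return null; // no redirect header present: the caller must handle this
    }
    return end($matches[1]);
}
```

Checking for null before using the result avoids the undefined-offset notice. Since CURLOPT_FOLLOWLOCATION is already enabled in the question, `curl_getinfo($ch, CURLINFO_EFFECTIVE_URL)` called before `curl_close` is another way to read the final URL without parsing headers.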

Logging into secure site using CURL [duplicate]

I'm new to using cURL and its hard to find good resources for it.
What I'm trying to do is login to a remote site, by having curl do the login form and then send back that it was successful.
The code I have doesn't seem to work and only tries to show the main page of the site.
$username="mylogin#gmail.com";
$password="mypassword";
$url="http://www.myremotesite.com/index.php?page=login";
$cookie="cookie.txt";
$postdata = "email=".$username."&password=".$password;
$ch = curl_init();
curl_setopt ($ch, CURLOPT_URL, $url);
curl_setopt ($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt ($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6");
curl_setopt ($ch, CURLOPT_TIMEOUT, 60);
curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, 0);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_COOKIEJAR, $cookie);
curl_setopt ($ch, CURLOPT_REFERER, $url);
curl_setopt ($ch, CURLOPT_POSTFIELDS, $postdata);
curl_setopt ($ch, CURLOPT_POST, 1);
$result = curl_exec ($ch);
echo $result;
curl_close($ch);
What am I doing wrong? Once this is working, I want to redirect to another page and get content from my site.
I let this go for a good while but revisited it later. Since this question is viewed regularly, here is what I eventually ended up using, which worked for me.
define("DOC_ROOT","/path/to/html");
//username and password of account
$username = trim($values["email"]);
$password = trim($values["password"]);
//set the directory for the cookie using defined document root var
$path = DOC_ROOT."/ctemp";
//build a unique path per request to store the info per user, using a custom function. I used this function to build unique paths based on member ID; that was for my use case. It can be a regular dir.
//$path = build_unique_path($path); // this was for my use case
//login form action url
$url="https://www.example.com/login/action";
$postinfo = "email=".$username."&password=".$password;
$cookie_file_path = $path."/cookie.txt";
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_NOBODY, false);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);
//set the cookie the site has for certain features, this is optional
curl_setopt($ch, CURLOPT_COOKIE, "cookiename=0");
curl_setopt($ch, CURLOPT_USERAGENT,
"Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.12) Gecko/20050915 Firefox/1.0.7");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_REFERER, $_SERVER['REQUEST_URI']);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "POST");
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $postinfo);
curl_exec($ch);
//page with the content I want to grab
curl_setopt($ch, CURLOPT_URL, "http://www.example.com/page/");
//do stuff with the info with DomDocument() etc
$html = curl_exec($ch);
curl_close($ch);
Update: This code was never meant to be copied and pasted. It shows how I used it for my specific use case. You should adapt it to your code as needed (directories, variables, etc.).
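The snippets above build the POST body by plain string concatenation, which breaks if the e-mail address or password contains characters such as `&`, `=` or `+`. A small sketch of a safer way to build the same body is shown here; the helper name `buildLoginBody` is hypothetical, and the field names `email` and `password` are simply the ones used in the code above.

```php
<?php
/** Build an application/x-www-form-urlencoded body for the login POST. */
function buildLoginBody(string $email, string $password): string
{
    // http_build_query() percent-encodes every value, so special
    // characters in the credentials cannot corrupt the body.
    return http_build_query(
        ['email' => $email, 'password' => $password],
        '',
        '&'
    );
}
```

The returned string can be passed to CURLOPT_POSTFIELDS in place of the concatenated `$postinfo`.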
I had the same question and found this answer on this website.
I changed it just a little bit (the curl_close on the last line).
$username = 'myuser';
$password = 'mypass';
$loginUrl = 'http://www.example.com/login/';
//init curl
$ch = curl_init();
//Set the URL to work with
curl_setopt($ch, CURLOPT_URL, $loginUrl);
// ENABLE HTTP POST
curl_setopt($ch, CURLOPT_POST, 1);
//Set the post parameters
curl_setopt($ch, CURLOPT_POSTFIELDS, 'user='.$username.'&pass='.$password);
//Handle cookies for the login
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie.txt');
//Setting CURLOPT_RETURNTRANSFER variable to 1 will force cURL
//not to print out the results of its query.
//Instead, it will return the results as a string return value
//from curl_exec() instead of the usual true/false.
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
//execute the request (the login)
$store = curl_exec($ch);
//the login is now done and you can continue to get the
//protected content.
//set the URL to the protected file
curl_setopt($ch, CURLOPT_URL, 'http://www.example.com/protected/download.zip');
//execute the request
$content = curl_exec($ch);
curl_close($ch);
//save the data to disk
file_put_contents(getenv('HOME').'/download.zip', $content); // PHP does not expand "~", so build the path explicitly
I think this is what you were looking for. Am I right?
Here is a useful related question about how to keep a session alive in cURL: https://stackoverflow.com/a/13020494/2226796
View the source of the login page and look for the form HTML tag. Within that tag there is an attribute that looks like action=... . Use that value as $url, not the URL of the page that displays the form.
Also, while you are there, verify that the input boxes are named the way you have them listed.
For example, a basic login form will look similar to:
<form method='post' action='postlogin.php'>
Email Address: <input type='text' name='email'>
Password: <input type='password' name='password'>
</form>
Using the above form as an example, change your value of $url to:
$url="http://www.myremotesite.com/postlogin.php";
Verify the values you have listed in $postdata:
$postdata = "email=".$username."&password=".$password;
and it should work just fine.
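Reading the action attribute and the input names by hand can also be done in code. The following is a rough sketch using PHP's DOM extension (assumed to be installed); the function name `parseLoginForm` is hypothetical, and it only looks at the first form on the page, which may not be the login form on a real site.

```php
<?php
/**
 * Extract the action, method and named inputs of the first <form> in $html.
 * Returns null when the page contains no form.
 */
function parseLoginForm(string $html): ?array
{
    if (trim($html) === '') {
        return null;
    }
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // tolerate sloppy real-world HTML
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $form = $doc->getElementsByTagName('form')->item(0);
    if ($form === null) {
        return null;
    }

    $fields = [];
    foreach ($form->getElementsByTagName('input') as $input) {
        $name = $input->getAttribute('name');
        if ($name !== '') {
            // Hidden inputs (tokens etc.) keep their preset value.
            $fields[$name] = $input->getAttribute('value');
        }
    }

    return [
        'action' => $form->getAttribute('action'),
        'method' => strtolower($form->getAttribute('method')) ?: 'get',
        'fields' => $fields,
    ];
}
```

For the example form above this yields the action `postlogin.php` and the field names `email` and `password`. The action is usually relative, so it still has to be resolved against the site's base URL before it is used as $url.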
This is how I solved this in ImpressPages:
//initial request with login data
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'http://www.example.com/login.php');
curl_setopt($ch, CURLOPT_USERAGENT,'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/32.0.1700.107 Chrome/32.0.1700.107 Safari/537.36');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, "username=XXXXX&password=XXXXX");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_COOKIESESSION, true);
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookie-name'); //could be empty, but that causes problems on some hosts
curl_setopt($ch, CURLOPT_COOKIEFILE, '/var/www/ip4.x/file/tmp'); //could be empty, but that causes problems on some hosts
$answer = curl_exec($ch);
if (curl_error($ch)) {
echo curl_error($ch);
}
//another request preserving the session
curl_setopt($ch, CURLOPT_URL, 'http://www.example.com/profile');
curl_setopt($ch, CURLOPT_POST, false);
curl_setopt($ch, CURLOPT_POSTFIELDS, "");
$answer = curl_exec($ch);
if (curl_error($ch)) {
echo curl_error($ch);
}
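The session handling in the answer above can be summarised as a reusable options array. In this sketch CURLOPT_COOKIEFILE (where cURL reads cookies from) and CURLOPT_COOKIEJAR (where it writes them when the handle is closed) point at the same file, which is a common arrangement but differs from the two separate paths used above. The function name `sessionOptions` is hypothetical, and the cURL extension is assumed to be loaded.

```php
<?php
/**
 * Options for one request in a cookie-preserving session.
 * Pass an already URL-encoded $postBody for a POST, or null for a GET.
 */
function sessionOptions(string $url, string $cookieFile, ?string $postBody = null): array
{
    $options = [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_COOKIEFILE     => $cookieFile, // read cookies (also enables cookie handling)
        CURLOPT_COOKIEJAR      => $cookieFile, // write cookies when the handle is closed
    ];
    if ($postBody !== null) {
        $options[CURLOPT_POST] = true;
        $options[CURLOPT_POSTFIELDS] = $postBody;
    } else {
        $options[CURLOPT_HTTPGET] = true; // switch a reused handle back to GET
    }
    return $options;
}
```

Each request would then be prepared with `curl_setopt_array($ch, sessionOptions(...))` on the same handle: first with the login URL and body, then with the protected URL and no body.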
Panama Jack's example did not work for me; it gives "Fatal error: Call to undefined function build_unique_path()". I used this code instead (simpler, in my opinion):
// options
$login_email = 'alabala#gmail.com';
$login_pass = 'alabala4807';
$cookie_file_path = "/tmp/cookies.txt";
$LOGINURL = "http://alabala.com/index.php?route=account/login";
$agent = "Nokia-Communicator-WWW-Browser/2.0 (Geos 3.0 Nokia-9000i)";
// begin script
$ch = curl_init();
// extra headers
$headers[] = "Accept: */*";
$headers[] = "Connection: Keep-Alive";
// basic curl options for all requests
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);
// set first URL
curl_setopt($ch, CURLOPT_URL, $LOGINURL);
// execute session to get cookies and required form inputs
$content = curl_exec($ch);
// grab the hidden inputs from the form required to login
$fields = getFormFields($content);
$fields['email'] = $login_email;
$fields['password'] = $login_pass;
// set postfields using what we extracted from the form
$POSTFIELDS = http_build_query($fields);
// change URL to login URL
curl_setopt($ch, CURLOPT_URL, $LOGINURL);
// set post options
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $POSTFIELDS);
// perform login
$result = curl_exec($ch);
print $result;
function getFormFields($data)
{
    // NOTE: the original form-matching pattern was lost when this page was extracted;
    // the generic <form> pattern below is an assumption and matches the first form on the page
    if (preg_match('/(<form\b.*?<\/form>)/is', $data, $matches)) {
        $inputs = getInputs($matches[1]);
        return $inputs;
    } else {
        die('didnt find login form');
    }
}
function getInputs($form)
{
    $inputs = array();
    $elements = preg_match_all("/(<input[^>]+>)/is", $form, $matches);
    if ($elements > 0) {
        for ($i = 0; $i < $elements; $i++) {
            $el = preg_replace('/\s{2,}/', ' ', $matches[1][$i]);
            if (preg_match('/name=(?:["\'])?([^"\'\s]*)/i', $el, $name)) {
                $name = $name[1];
                $value = '';
                // separate match variable so $value stays a string when there is no value attribute
                if (preg_match('/value=(?:["\'])?([^"\'\s]*)/i', $el, $valueMatch)) {
                    $value = $valueMatch[1];
                }
                $inputs[$name] = $value;
            }
        }
    }
    return $inputs;
}
$grab_url = 'http://grab.url/alabala';
//page with the content I want to grab
curl_setopt($ch, CURLOPT_URL, $grab_url);
//do stuff with the info with DomDocument() etc
$html = curl_exec($ch);
curl_close($ch);
var_dump($html);
die;

curl hitting the url but data not recieved by server

Can anyone help me? I am trying to hit a website using cURL.
My code is as follows:
$data = array(
'utm_source'=>'Google',
'utm_medium'=>'Google',
'utm_campaign'=>'Sales',
'utm_term'=>'united flights tickets'
);
$data = http_build_query($data, '', '&');
$proxies[] = 'user:password#71.6.46.151:443';
$proxies[] = 'user:password#14.102.19.206:61217';
$proxies[] = 'user:password#187.95.230.65:8080';
$url = 'https://www.example.com/index.php?'.$data;
$ch = curl_init();
if (isset($proxy)) { // If the $proxy variable is set, then
curl_setopt($ch, CURLOPT_PROXY, $proxy); // Set CURLOPT_PROXY with proxy in $proxy variable
}
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; rv:5.0) Gecko/20100101 Firefox/5.0');
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_REFERER, 'https://www.google.com');
curl_setopt($ch, CURLOPT_HTTPGET, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_POSTFIELDS,$data);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 1);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 500);
curl_setopt($ch, CURLOPT_URL, $url);
$results = curl_exec($ch); // Execute a cURL request
if (curl_error($ch)) {
$error_msg = curl_error($ch);
}
$info = curl_getinfo($ch);
curl_close($ch);
print_r($error_msg);
print_r($info);
print_r($results);
It hits the website, and the hit is also displayed in the Google Analytics account, but the only problem is that the referrer and source show as "direct" in Google Analytics.
If there is any other way to achieve this, please tell me.

How to get full html content of specific url?

I tried several methods to get the HTML content of aptoide.com in PHP:
1) file_get_contents();
2) readfile();
3) curl as php function
function get_dataa($url) {
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (compatible; Konqueror/4.0; Microsoft Windows) KHTML/4.0.80 (like Gecko)");
$data = curl_exec($ch);
curl_close($ch);
return $data;
}
4)PHP Simple HTML DOM Parser
include_once('simple_html_dom.php');
$url="http://aptoide.com";
$html = file_get_html($url);
But all of them give empty output for aptoide.com.
Is there a way to get the full HTML content of that URL?
echo file_get_contents('http://www.aptoide.com/'); works fine for me.
So it's possible that aptoide.com has blocked you. If you want to change your IP (as you said in a comment), you can go through a proxy like this:
$url = 'http://aptoide.com/';
$proxy = '127.0.0.1:9095'; // Your proxy
// $proxyauth = 'user:password'; // Proxy authentication if required
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,$url);
curl_setopt($ch, CURLOPT_PROXY, $proxy);
//curl_setopt($ch, CURLOPT_PROXYUSERPWD, $proxyauth);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
$curl_scraped_page = curl_exec($ch);
curl_close($ch);
echo $curl_scraped_page;
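Because CURLOPT_HEADER is enabled in the snippet above, the string returned by curl_exec contains the response headers followed by the body, so echoing it prints both. A minimal sketch for separating the two is shown below; the helper name `splitResponse` is hypothetical, and the header size is assumed to come from `curl_getinfo($ch, CURLINFO_HEADER_SIZE)`, read before `curl_close`.

```php
<?php
/**
 * Split a raw cURL response (headers + body) at the given header size.
 * $headerSize is the byte length of all header blocks, including those
 * of any proxy or redirect responses that preceded the final one.
 */
function splitResponse(string $raw, int $headerSize): array
{
    return [
        'headers' => (string) substr($raw, 0, $headerSize),
        'body'    => (string) substr($raw, $headerSize), // cast covers older PHP returning false
    ];
}
```

Only the `body` element then needs to be echoed or parsed as HTML.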
Use your cURL get_dataa function with this line added:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
because that page redirects to www.aptoide.com.
Full function:
function get_dataa($url) {
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (compatible; Konqueror/4.0; Microsoft Windows) KHTML/4.0.80 (like Gecko)");
$data = curl_exec($ch);
curl_close($ch);
return $data;
}

