CURL Authentication being lost? - php

I am authenticating a login via cURL just fine. I have a variable I am using to display the returned HTML, and it is returning my user control panel as if I am logged in.
After authenticating, I want to communicate variables with a form on another page within the site, but for some reason the HTML from that page is returning a non-authenticated version of the header (as if the original authentication never took place).
I have a cookies.txt file with 777 permissions. I have even tried just fetching the contents of the same page that is shown when I authenticate, and it is as if I am losing the associated session/cookie data somewhere along the way.
Here is my curl.class.php file:
<?php
class Curl {
    public $cookieJar = "";
    public $curl; // current cURL handle

    // Make sure the cookies.txt file has read/write permissions
    public function __construct($cookieJarFile = 'cookies.txt') {
        $this->cookieJar = $cookieJarFile;
    }

    function setup() {
        $header = array();
        $header[0]  = "Accept: text/xml,application/xml,application/xhtml+xml,";
        $header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
        $header[] = "Cache-Control: max-age=0";
        $header[] = "Connection: keep-alive";
        $header[] = "Keep-Alive: 300";
        $header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
        $header[] = "Accept-Language: en-us,en;q=0.5";
        $header[] = "Pragma: "; // browsers keep this blank.

        curl_setopt($this->curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7');
        curl_setopt($this->curl, CURLOPT_HTTPHEADER, $header);
        curl_setopt($this->curl, CURLOPT_COOKIEJAR, $this->cookieJar);
        curl_setopt($this->curl, CURLOPT_COOKIEFILE, $this->cookieJar);
        curl_setopt($this->curl, CURLOPT_AUTOREFERER, true);
        curl_setopt($this->curl, CURLOPT_COOKIESESSION, true);
        curl_setopt($this->curl, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($this->curl, CURLOPT_RETURNTRANSFER, true);
    }

    function get($url) {
        $this->curl = curl_init($url);
        $this->setup();
        return $this->request();
    }

    function getAll($reg, $str) {
        preg_match_all($reg, $str, $matches);
        return $matches[1];
    }

    function postForm($url, $fields, $referer = '') {
        $this->curl = curl_init($url);
        $this->setup();
        curl_setopt($this->curl, CURLOPT_URL, $url);
        curl_setopt($this->curl, CURLOPT_POST, 1);
        curl_setopt($this->curl, CURLOPT_REFERER, $referer);
        curl_setopt($this->curl, CURLOPT_POSTFIELDS, $fields);
        return $this->request();
    }

    function getInfo($info) {
        return ($info == 'lasturl') ? curl_getinfo($this->curl, CURLINFO_EFFECTIVE_URL) : curl_getinfo($this->curl, $info);
    }

    function request() {
        return curl_exec($this->curl);
    }
}
?>
And here is my curl.php file:
<?php
include('curl.class.php'); // This path would change to where you store the file
$curl = new Curl();

$url = "http://www.site.com/public/member/signin";
$fields = "MAX_FILE_SIZE=50000000&dado_form_3=1&member[email]=email&member[password]=pass&x=16&y=5&member[persistent]=true";
// Calling URL
$referer = "http://www.site.com/public/member/signin";

$html = $curl->postForm($url, $fields, $referer);
echo($html);
?>
<hr style="clear:both;"/>
<?php
$html = $curl->postForm('http://www.site.com/index.php', 'nid=443&sid=733005&tab=post&eval=yes&ad=&MAX_FILE_SIZE=10000000&ip=63.225.235.30', 'http://www.site.com/public/member/signin');
echo $html; // This should show the HTML of the page you are logged into
?>
Any ideas?

As always when doing HTTP scripting, you should use LiveHTTPHeaders or a similar tool to record a manual browser session first, and then mimic that as closely as possible when you write your cURL code.
Also (unfortunately), the command-line curl tool offers slightly better debugging and tracing options than the PHP binding does, which makes it the better tool for working out exactly what you need to do; once that works, you can convert it to a PHP program.
See http://curl.haxx.se/docs/httpscripting.html for further details.
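If you do stay inside PHP, you can get a rough equivalent of `curl -v` by turning on CURLOPT_VERBOSE and pointing its output at a log file. A minimal sketch (the URL and log file name are just placeholders):
<?php
$ch = curl_init('http://www.site.com/public/member/signin');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_VERBOSE, true);        // trace the request/response exchange
$trace = fopen('curl_trace.log', 'a');          // verbose output goes to this file instead of stderr
curl_setopt($ch, CURLOPT_STDERR, $trace);
$body = curl_exec($ch);
curl_close($ch);
fclose($trace);
// curl_trace.log can now be compared line by line with the LiveHTTPHeaders capture.
?>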

Err, please tell us what authentication scheme the server is using. Not all schemes use cookies.
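For example, if the server turned out to use HTTP Basic or Digest authentication rather than a login form, the cookie jar would be irrelevant and you would authenticate on every request instead. A sketch, with a placeholder URL and credentials:
<?php
$ch = curl_init('http://www.site.com/protected/page');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_BASIC);     // or CURLAUTH_DIGEST / CURLAUTH_ANY
curl_setopt($ch, CURLOPT_USERPWD, 'username:password'); // sent with this request, no cookies involved
$html = curl_exec($ch);
curl_close($ch);
?>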

Related

getting page from twitter or facebook anonymously by curl

I'm trying to make a kind of page parser (more specifically, one that highlights certain words on pages) and I've run into some problems with it. I'm getting whole-page data from a URL using cURL, and most pages cooperate nicely, while others don't.
My goal is to get all of the page's HTML just as a browser gets it, and to do it anonymously, like a browser does. I mean that if a page requires a login to show its data to the browser, it doesn't interest me. The problem is that I can't get the Twitter or Facebook pages that I can reach anonymously from a regular browser, even when I set all the headers just as they are normally sent from Firefox or Chrome.
Is there any way to simply emulate a browser to get pages from these sites, or do I have to use OAuth (and can someone explain why browsers don't need to use it)?
EDIT
I got the solution! If anybody else has this problem, you should:
-> try switching the protocol from https to http
-> get rid of the /#!/ element if there is one in the URL
-> in my case the header "Accept-Encoding: gzip, deflate" was also causing problems; I didn't know why at the time, but now everything is OK (see the note after the code below)
My code:
if (substr($this->url, 0, 5) == 'https')
    $this->url = str_replace('https://', 'http://', $this->url);
$this->url = str_replace('/#!/', '/', $this->url);
// check if a valid url is provided
if (!filter_var($this->url, FILTER_VALIDATE_URL))
    return false;
$curl = curl_init();
$header = array();
$header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
$header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
// -> gives an error: $header[] = "Accept-Encoding: gzip, deflate";
$header[] = "Accept-Language: pl,en-us;q=0.7,en;q=0.3";
$header[] = "Cache-Control: max-age=0";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Pragma: "; // browsers keep this blank.
curl_setopt($curl, CURLOPT_HTTPHEADER,$header);
curl_setopt($curl, CURLOPT_HEADER, false);
curl_setopt($curl, CURLOPT_URL, $this->url);
curl_setopt($curl, CURLOPT_COOKIEJAR, "cookie.txt");
curl_setopt($curl, CURLOPT_COOKIEFILE, "cookie.txt");
curl_setopt($curl, CURLOPT_CONNECTTIMEOUT,10);
curl_setopt($curl, CURLOPT_COOKIESESSION,true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER,1);
curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; pl; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7 (.NET CLR 3.5.30729)');
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);
$response = curl_exec($curl);
curl_close($curl);
if ($response) return $response;
return false;
This was all inside a class, but you can extract the code very easily. For me it's fetching both sites (Twitter and Facebook) nicely.
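A note on the Accept-Encoding problem mentioned above: the most likely explanation is that sending that header by hand makes the server return a gzip-compressed body, which this code never decompresses. If you do want compression, let cURL negotiate and decode it itself, for example:
// Instead of the manual "Accept-Encoding" header, let cURL handle compression:
curl_setopt($curl, CURLOPT_ENCODING, 'gzip,deflate'); // cURL decompresses the response for you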
Yes, it is possible to emulate a browser, but you need to carefully watch all the HTTP headers (including cookies) that are sent by the browser, and handle redirects as well. Some of this can be automated by cURL functions; the rest you'll need to handle manually.
Note: I'm not talking about the HTML headers in the page's code; these are HTTP headers sent and received by the browser.
The easiest way to spot these is to use Fiddler to monitor the traffic. Choose a URL and inspect it on the right, and you'll see the headers that get sent and the headers that are received.
Facebook makes this more complicated with a myriad of iframes, so I suggest you start on a simpler website!

Using curl to load an HTML file that submit an external form with JS

I need a little help here:
I have 2 files
index.php
form0.html
form0.html automatically fills in a form and submits it.
When I go straight to the file it works fine, but when I try to access it through my PHP script it won't work unless I print the results.
PHP CODE:
<?php
set_time_limit(30);
$delay_time = 2; // time to wait before looping again.
$loop_times = 1; // number of times to loop.
$url = array("http://localhost/htmlfile0.html");
for($x=0;$x<$loop_times;$x++)
{
echo count($url);
for($i=0;$i<count($url);$i++)
{
$url1=$url[$i];
$curl = curl_init(); // Initialize the cURL handler
$header = array();
$header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
$header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: max-age=0";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 30000";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: en-us,en;q=0.5";
$header[] = "Pragma: "; // browsers keep this blank.
$var_host = parse_url($url1,PHP_URL_HOST);
$cookieJar = 'cookies/'.$var_host.'.txt'; // change it according to your requirement. make dynamic for multi URL
curl_setopt($curl, CURLOPT_HTTPHEADER, $header); // Browser like header
curl_setopt($curl, CURLOPT_COOKIEJAR, $cookieJar); // file for cookies if site requires
curl_setopt($curl, CURLOPT_COOKIEFILE, $cookieJar); // file for cookies if site requires
curl_setopt($curl, CURLOPT_HEADER, 0);
curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9) Gecko/2008052906 Firefox/3.0');
curl_setopt($curl, CURLOPT_ENCODING, 'gzip,deflate');
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1); // Follow any redirects, just in case
curl_setopt($curl, CURLOPT_AUTOREFERER, true);
curl_setopt($curl,CURLOPT_RETURNTRANSFER, true); // set curl to return the page
curl_setopt($curl, CURLOPT_POSTFIELDS, '');
curl_setopt($curl, CURLOPT_POST, true); //post the form
curl_setopt($curl, CURLOPT_URL, $url1); // Set the URL
$ch = curl_exec($curl); // Execute the request and capture the response
curl_close($curl); // Close the cURL handle (note: close $curl, not $ch)
echo date('h:i:s') . "\n";
if($ch) echo "Success: ".$url1;
else echo "Fail: ".$url1;
echo '<hr>';
sleep(5);
echo date('h:i:s') . "\n";
}
if($x < $loop_times) sleep($delay_time);
}
?>
How can I get past this?
Thanks.
Correct me if I'm wrong, but you're trying to execute JS using a cURL request, which you can't, at least not directly.
When you go straight to the file, it works fine: your browser's JS interpreter executes the JS properly and the form is submitted. By using cURL you are doing something different: you are sending an HTTP request to a certain URL (http://localhost/htmlfile0.html), and you may or may not fetch the response content.
Even if you do fetch the content (and run the JS in a browser's JS interpreter), the JavaScript may be written to refuse the action depending on whether or not it is running at the correct URL.
Example:
do some action if it is the right URL, i.e. if the code is reached at http://example.com/script.html;
do nothing if it's not the above URL, which is what happens when your cURL PHP script fetches that URL and then outputs the document's code at http://localhost.
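The usual workaround is to skip the auto-submitting page entirely and POST the form's fields straight to its action URL with cURL. A rough sketch only; the action URL and field names below are hypothetical and need to be copied from the <form> in htmlfile0.html:
<?php
// Hypothetical form action and fields; read them out of htmlfile0.html's <form> tag.
$curl = curl_init('http://localhost/form_action.php');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_POST, true);
curl_setopt($curl, CURLOPT_POSTFIELDS, http_build_query(array(
    'field1' => 'value1',
    'field2' => 'value2',
)));
$response = curl_exec($curl); // this is the server's reply to the submitted form
curl_close($curl);
?>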

Why can I not scrape the title off this site?

I'm using simple-html-dom to scrape the title off of a specified site.
<?php
include('simple_html_dom.php');
$html = file_get_html('http://www.pottermore.com/');
foreach($html->find('title') as $element)
echo $element->innertext . '<br>';
?>
Any other site I've tried works, apple.com for example.
But if I input pottermore.com, it doesn't output anything. Pottermore has flash elements on it, but the home screen I'm trying to scrape the title off of has no flash, just html.
This works for me :)
$url = 'http://www.pottermore.com/';
$html = get_html($url);
file_put_contents('page.htm',$html);//just to test what you have downloaded
echo 'The title from: '.$url.' is: '.get_snip($html, '<title>','</title>');
function get_html($url)
{
$ch = curl_init();
$header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
$header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: max-age=0";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: en-us,en;q=0.5";
$header[] = "Pragma: "; //browsers keep this blank.
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows;U;Windows NT 5.0;en-US;rv:1.4) Gecko/20030624 Netscape/7.1 (ax)');
curl_setopt($ch, CURLOPT_ENCODING, 'gzip,deflate');
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 20);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_COOKIEFILE, COOKIE); // COOKIE should be a constant holding a cookie file path, e.g. define('COOKIE', 'cookies.txt');
curl_setopt($ch, CURLOPT_COOKIEJAR, COOKIE);
$result = curl_exec ($ch);
curl_close ($ch);
return($result);
}
function get_snip($string,$start,$end,$trim_start='1',$trim_end='1')
{
$startpos = strpos($string,$start);
$endpos = strpos($string,$end,$startpos);
if($trim_start!='')
{
$startpos += strlen($start);
}
if($trim_end=='')
{
$endpos += strlen($end);
}
return(substr($string,$startpos,($endpos-$startpos)));
}
Just to confirm what others are saying, if you don't send a user agent string this site sends 403 Forbidden.
Adding this worked for me:
User-Agent: Mozilla/5.0 (Windows;U;Windows NT 5.0;en-US;rv:1.4) Gecko/20030624 Netscape/7.1 (ax)
The function file_get_html uses file_get_contents under the covers. This function can pull data from a URL, but to do so, it sends a User Agent string.
By default, this string is empty. Some webservers use this fact to detect that a non-browser is accessing its data and opt to forbid this.
You can set user_agent in php.ini to control the User Agent string that gets sent. Or, you could try:
ini_set('user_agent','UA-String');
with 'UA-String' set to whatever you like.
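Putting that together with the original simple-html-dom snippet, something along these lines should be enough (the User-Agent string here is only an example):
<?php
include('simple_html_dom.php');
ini_set('user_agent', 'Mozilla/5.0 (compatible; TitleScraper/1.0)'); // any non-empty string should do
$html = file_get_html('http://www.pottermore.com/');
foreach($html->find('title') as $element)
    echo $element->innertext . '<br>';
?>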

file_get_contents script works with some websites but not others

I'm looking to build a PHP script that parses HTML for particular tags. I've been using this code block, adapted from this tutorial:
<?php
$data = file_get_contents('http://www.google.com');
$regex = '/<title>(.+?)</';
preg_match($regex,$data,$match);
var_dump($match);
echo $match[1];
?>
The script works with some websites (like google, above), but when I try it with other websites (like, say, freshdirect), I get this error:
"Warning: file_get_contents(http://www.freshdirect.com) [function.file-get-contents]: failed to open stream: HTTP request failed!"
I've seen a bunch of great suggestions on StackOverflow, for example to enable extension=php_openssl.dll in php.ini. But (1) my version of php.ini didn't have extension=php_openssl.dll in it, and (2) when I added it to the extensions section and restarted the WAMP server, per this thread, still no success.
Would someone mind pointing me in the right direction? Thank you very much!
It just requires a user-agent ("any" really, any string suffices):
file_get_contents("http://www.freshdirect.com",false,stream_context_create(
array("http" => array("user_agent" => "any"))
));
See more options.
Of course, you can set user_agent in your ini:
ini_set("user_agent","any");
echo file_get_contents("http://www.freshdirect.com");
... but I prefer to be explicit for the next programmer working on it.
$html = file_get_html('http://google.com/');
$title = $html->find('title', 0)->innertext;
Or, if you prefer preg_match (and you should really be using cURL instead of file_get_contents):
function curl($url){
$headers[] = "User-Agent:Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13";
$headers[] = "Accept:text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
$headers[] = "Accept-Language:en-us,en;q=0.5";
$headers[] = "Accept-Encoding:gzip,deflate";
$headers[] = "Accept-Charset:ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$headers[] = "Keep-Alive:115";
$headers[] = "Connection:keep-alive";
$headers[] = "Cache-Control:max-age=0";
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_HTTPHEADER, $headers);
curl_setopt($curl, CURLOPT_ENCODING, "gzip");
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, 1);
$data = curl_exec($curl);
curl_close($curl);
return $data;
}
$data = curl('http://www.google.com');
$regex = '#<title>(.*?)</title>#mis';
preg_match($regex,$data,$match);
var_dump($match);
echo $match[1];
Another option: some hosts disable CURLOPT_FOLLOWLOCATION, so a recursive function that follows redirects itself is what you want; it will also log any errors to a text file. Below is also a simple example of how to use DOMDocument() to extract the content; obviously it's not extensive, but it's something you could build upon.
<?php
function file_get_site($url){
(function_exists('curl_init')) ? '' : die('cURL Must be installed. Ask your host to enable it or uncomment extension=php_curl.dll in php.ini');
$curl = curl_init();
$header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
$header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: max-age=0";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: en-us,en;q=0.5";
$header[] = "Pragma: ";
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 5.1; rv:5.0) Gecko/20100101 Firefox/5.0 Firefox/5.0');
curl_setopt($curl, CURLOPT_HTTPHEADER, $header);
curl_setopt($curl, CURLOPT_HEADER, true);
curl_setopt($curl, CURLOPT_REFERER, $url);
curl_setopt($curl, CURLOPT_ENCODING, 'gzip,deflate');
curl_setopt($curl, CURLOPT_AUTOREFERER, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_TIMEOUT, 60);
$html = curl_exec($curl);
$status = curl_getinfo($curl);
curl_close($curl);
if($status['http_code']!=200){
if($status['http_code'] == 301 || $status['http_code'] == 302) {
list($header) = explode("\r\n\r\n", $html, 2);
$matches = array();
preg_match("/(Location:|URI:)[^(\n)]*/", $header, $matches);
$url = trim(str_replace($matches[1],"",$matches[0]));
$url_parsed = parse_url($url);
return (isset($url_parsed))? file_get_site($url):'';
}
$oline='';
foreach($status as $key=>$eline){$oline.='['.$key.']'.$eline.' ';}
$line =$oline." \r\n ".$url."\r\n-----------------\r\n";
$handle = #fopen('./curl.error.log', 'a');
fwrite($handle, $line);
return FALSE;
}
return $html;
}
function get_content_tags($source,$tag,$id=null,$value=null){
$xml = new DOMDocument();
#$xml->loadHTML($source);
foreach($xml->getElementsByTagName($tag) as $tags) {
if($id!=null){
if($tags->getAttribute($id)==$value){
return $tags->getAttribute('content');
}
continue; // keep looking for a tag whose attribute matches
}
return $tags->nodeValue;
}
}
$source = file_get_site('http://www.freshdirect.com/about/index.jsp');
echo get_content_tags($source,'title'); //FreshDirect
echo get_content_tags($source,'meta','name','description'); //Online grocer providing high quality fresh......
?>

Saving images only available when logged in

I've been having some trouble downloading images from a website that requires you to be logged in. The images can only be viewed when you are logged in to the site, but you cannot view them directly in the browser if you copy their location into a new tab/window (it redirects to an error page, so I guess the containing folder has been .htaccess-ed).
Anyway, the code I have below allows me to log in and grab the HTML content, which works well, but I cannot grab the images ... this is where I need help!
<?php
// curl.php
class Curl {
public $cookieJar = "";
public function __construct($cookieJarFile = 'cookies.txt') {
$this->cookieJar = $cookieJarFile;
}
function setup() {
$header = array();
$header[0] = "Accept: text/xml,application/xml,application/xhtml+xml,";
$header[0] .= "text/html;q=0.9,text/plain;q=0.8,image/gif;q=0.8,image/x-bitmap;q=0.8,image/jpeg;q=0.8,image/png,*/*;q=0.5";
$header[] = "Cache-Control: max-age=0";
$header[] = "Connection: keep-alive";
$header[] = "Keep-Alive: 300";
$header[] = "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7";
$header[] = "Accept-Language: en-us,en;q=0.5";
$header[] = "Pragma: "; // browsers keep this blank.
curl_setopt($this->curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7');
curl_setopt($this->curl, CURLOPT_HTTPHEADER, $header);
curl_setopt($this->curl, CURLOPT_COOKIEJAR, $this->cookieJar);
curl_setopt($this->curl, CURLOPT_COOKIEFILE, $this->cookieJar);
curl_setopt($this->curl, CURLOPT_AUTOREFERER, true);
curl_setopt($this->curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($this->curl, CURLOPT_RETURNTRANSFER, true);
}
function get($url) {
$this->curl = curl_init($url);
$this->setup();
return $this->request();
}
function getAll($reg, $str) {
preg_match_all($reg, $str, $matches);
return $matches[1];
}
function postForm($url, $fields, $referer = '') {
$this->curl = curl_init($url);
$this->setup();
curl_setopt($this->curl, CURLOPT_URL, $url);
curl_setopt($this->curl, CURLOPT_POST, 1);
curl_setopt($this->curl, CURLOPT_REFERER, $referer);
curl_setopt($this->curl, CURLOPT_POSTFIELDS, $fields);
return $this->request();
}
function getInfo($info) {
$info = ($info == 'lasturl') ? curl_getinfo($this->curl, CURLINFO_EFFECTIVE_URL) : curl_getinfo($this->curl, $info);
return $info;
}
function request() {
return curl_exec($this->curl);
}
}
?>
And below is the page that uses it.
<?php
// data.php
include('curl.php');
$curl = new Curl();
$url = "http://domain.com/login.php";
$newURL = "http://domain.com/go_here.php";
$username = "user";
$password = "pass";
$fields = "user=$username&pass=$password";
// Calling URL
$referer = "http://domain.com/refering_page.php";
$html = $curl->postForm($url, $fields, $referer);
$html = $curl->get($newURL);
echo $html;
?>
I've tried putting the direct URL of the image into $newURL, but that doesn't get the image; it simply returns an error saying that the folder is not available to view directly. I've tried varying the above in different ways, but I haven't been successful in getting an image, though I have managed to get a page back that basically says error 405 and/or 406 (but not the image I want).
Any help would be great!
Wow, this seems like a convoluted issue.
What I would do is compare a browser session with your PHP code at the HTTP layer and see what's different.
Grab Wireshark, connect using your browser successfully. You will need to filter out all other traffic and only dump what's on port 80. If you right click on a packet and click "follow TCP stream" it'll give you the HTTP headers and the output of the page.
Then do the same but this time with the PHP script.
Then compare the headers and see what's different. Maybe you're missing one or two headers, maybe you need to go to a page first, maybe your PHP script isn't sending the right cookies.
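To get the PHP side of that comparison, you can also ask cURL itself for the exact request headers it sent and diff those against the Wireshark capture. For example, with the Curl class above:
curl_setopt($this->curl, CURLINFO_HEADER_OUT, true); // keep a copy of the outgoing request headers
$html = curl_exec($this->curl);
echo curl_getinfo($this->curl, CURLINFO_HEADER_OUT); // the headers PHP/cURL actually sent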
From the site's behavior it seems to me that it is not a session (cookie) problem, otherwise opening another tab would allow you to download the images.
Check the HTTP Referer header; it is the first suspect on my list.
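One way to test that theory is to do the login and the image request on the same cURL handle, sending a Referer the site would expect, and then write the raw response bytes to disk. A sketch only; the URLs, field names and Referer value are placeholders:
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_COOKIEFILE, ''); // enable the in-memory cookie engine for this handle
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7');

// 1. Log in, so the session cookies end up on this handle.
curl_setopt($ch, CURLOPT_URL, 'http://domain.com/login.php');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, 'user=user&pass=pass');
curl_exec($ch);

// 2. Fetch the image with a Referer the site expects.
curl_setopt($ch, CURLOPT_URL, 'http://domain.com/images/protected.jpg'); // placeholder image URL
curl_setopt($ch, CURLOPT_HTTPGET, true);
curl_setopt($ch, CURLOPT_REFERER, 'http://domain.com/go_here.php');      // the page that embeds the image
$image = curl_exec($ch);
curl_close($ch);

if ($image !== false) {
    file_put_contents('protected.jpg', $image); // save the raw bytes exactly as received
}
?>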
