Download image from remote website and show to use - php

What is the quickest most memory efficient way to download a remote image and then display that to the user? - Note I don't want to save it, I just want to pass it on to the user.
i.e.
a user goes to www.website.com/image1.jpg
In the background I would use a getFile script similar to this to retrieve the file, but how do I display this to the user and is this a memory efficient way of doing it?
function getUrl($url, $method='', $vars='', $fh = '') {
$ch = curl_init();
if ($method == 'post') {
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $vars);
}
if ($fh != '')
{
curl_setopt($ch, CURLOPT_FILE, $fh);
}
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookies/cookies.txt');
curl_setopt($ch, CURLOPT_COOKIEFILE, 'cookies/cookies.txt');
$buffer = curl_exec($ch);
curl_close($ch);
return $buffer;
}

Here it is:
function proxyImage( $fromHost, $path, $bufsize=4096 ) {
$conn = fsockopen($fromHost,80);
fwrite( $conn, "GET {$path} HTTP/1.0\r\n" );
fwrite( $conn, "Host: {$fromHost}\r\n" );
fwrite( $conn, "User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0\r\n" );
fwrite( $conn, "Connection: close\r\n" );
fwrite( $conn, "\r\n" );
$answer = '';
while( !feof($conn) and !$bodystarted ) {
$portion = fread( $conn, $bufsize );
if( strpos($portion,"\r\n\r\n")!==false ) $bodystarted = true;
$answer .= $portion;
}
list( $headers, $bodypart ) = explode( "\r\n\r\n", $answer, 2 );
foreach( explode("\r\n",$headers) as $h )
header($h);
echo $bodypart;
while( !feof($conn) )
echo fread( $conn, $bufsize );
fclose($conn);
}
proxyImage( 'www.newyorker.com', '/online/blogs/photobooth/NASAEarth-01.jpg' );

Related

Remove the line that found in array while processes a code PHP

i have the following code which getting some links from remote.
Now the code while processesing and found the hash [if($done==true)] keep sending request to the host even if the hash have been found in the link.
i need the code stop sending requests to the host that we found its hash, and keep working on the other hosts, so it will be exclude all founded hosts from the the second foreach [foreach (file("hosts.txt")] and not send more request to them.
please if any one can help to fix my code .
sorry about bad english.
$MAXPROCESS=50;
$execute=0;
$Arr = array();
foreach (file("hashes.txt") as $hashkey => $hash)
{
$hash = trim($hash);
$pid = pcntl_fork();
$execute++;
if ($execute >= $MAXPROCESS)
{
while (pcntl_waitpid(0, $status) != -1)
{
$status = pcntl_wexitstatus($status);
$execute =0;
//sleep(1);
flush();
//echo " [$ipkey] Child $status completed\n";
}
}
if (!$pid)
{
foreach (file("hosts.txt") as $hostkey => $hosts)
{
$host = trim($hosts);
if(!in_array($host,$Arr))
{
$done = CHECK_URL($host.$hash.'.xml');
if($done==true)
{
$Arr[] = $host;
echo "\n\r\n\r".$host." : Found [$hashkey] ------------\n\r\n\r";
}else{
echo $host." : error [$hashkey]\n";
}
}else{
return false;
}
flush();
}
exit;
}
}
function CHECK_URL($url)
{
$ch = curl_init();
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.7.3) Gecko/20041001 Firefox/0.10.1" );
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
curl_setopt( $ch, CURLOPT_CONNECTTIMEOUT, 5 );
curl_setopt( $ch, CURLOPT_TIMEOUT, 5 );
$result = curl_exec($ch);
$response = curl_getinfo($ch);
//if($response['http_code']=="200" || preg_match('/xml/i', $result))
if($response['http_code']=="200")
{
file_put_contents('hash.found.txt', $url."\n", FILE_APPEND);
return true;
}
curl_close($ch); // Close the curl stream
}
.............................

PHP's fopen() can't get url, but curl can

A while back, I wrote a little utility function that takes inPath and outPath, opens both and copies from one to the other using fread() and fwrite(). allow_url_fopen is enabled.
Well, I've got a url that I'm trying to get the contents of, and fopen() doesn't get any data, but if I use curl to do the same, it works.
The url in question is: http://www.deltagroup.com/Feeds/images.php?lid=116582497&id=1
fopen version:
$in = #fopen( $inPath, "rb" );
$out = #fopen( $outPath, "wb" );
if( !$in || !$out )
{
echo 0;
}
while( $chunk = fread( $in, 8192 ) )
{
fwrite( $out, $chunk, 8192 );
}
fclose( $in );
fclose( $out );
if( file_exists($outPath) )
{
echo 1;
}
else
{
echo 0;
}
curl version:
$opt = "curl -o " . $outPath . " " . $inPath;
$res = `$opt`;
if( file_exists($outPath) )
{
echo 1;
}
else
{
echo 0;
}
Any idea why this would happen?
Even using php's curl, I was unable to download the file- until I added a curlopt_useragent string. Nothing in the response indicated that it was required (no errors, nothing other than an HTTP 200).
Final code:
$out = #fopen( $outPath, "wb" );
if( !$out )
{
echo 0;
}
$ch = curl_init();
curl_setopt( $ch, CURLOPT_URL, $inPath );
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt( $ch, CURLOPT_FILE, $out );
curl_setopt( $ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13');
curl_setopt( $ch, CURLOPT_FOLLOWLOCATION, true );
curl_setopt( $ch, CURLOPT_CONNECTTIMEOUT, 15 );
curl_setopt( $ch, CURLOPT_TIMEOUT, 18000 );
$data = curl_exec( $ch );
curl_close( $ch );
fclose( $out );
if( file_exists($outPath) )
{
echo 1;
}
else
{
echo 0;
}

Select HTML content using PHP

I want to get the paragraphs under this tag:
I tried to:
<?php
$doc = new DOMDocument();
$doc->loadHTMLFile("https://sabq.org/xMQjz2");
$elements = $doc->getElementsByTagName('p');
if (!is_null($elements)) {
foreach ($elements as $element) {
$nodes = $element->childNodes;
foreach ($nodes as $node) {
echo $node->textContent. "\n";
}
}
}
?>
And I got the paragraphs I wanted along with unwanted ones, and they were duplicated.
EDIT:
I changed the URL, hope it works
The link that you have provided throws an error when accessing it so what I did, I found a function that could get the contents of the webpage using curl instead of the DOMDocument class which you were using.
I used preg_match and regex to extract the specific element that you were looking for.
Here's the code:
<?php
//opened url
$content = get_fcontent("https://sabq.org/%D8%B4%D8%A7%D9%87%D8%AF-%D8%A3%D9%84%D9%81-%D8%B5%D9%81%D8%AD%D8%A9-%D8%AA%D8%B1%D9%88%D9%8A-%D9%82%D8%B5%D8%B5-%D8%A7%D9%84%D8%AD%D8%B1%D9%85%D9%8A%D9%86-%D9%85%D9%86%D8%B0-%D8%A7%D9%86%D8%B7%D9%84%D8%A7%D9%82-%D8%A7%D9%84%D8%B9%D9%87%D8%AF-%D8%A7%D9%84%D8%B3%D8%B9%D9%88%D8%AF%D9%8A");
//extract specific html tag and its innerHTML
preg_match('/<p .*? ng\-bind\-html\=\"getContent\(material\.content\)\" .*?>.*?<\/p>/m', $content[0], $matches);
//display the wanted element
echo $matches[0];
//getting contents using curl because threw error: failed to open stream
function get_fcontent( $url, $javascript_loop = 0, $timeout = 5 ) {
$url = str_replace( "&", "&", urldecode(trim($url)) );
$cookie = tempnam ("/tmp", "CURLCOOKIE");
$ch = curl_init();
curl_setopt( $ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.7.3) Gecko/20041001 Firefox/0.10.1" );
curl_setopt( $ch, CURLOPT_URL, $url );
curl_setopt( $ch, CURLOPT_COOKIEJAR, $cookie );
curl_setopt( $ch, CURLOPT_FOLLOWLOCATION, true );
curl_setopt( $ch, CURLOPT_ENCODING, "" );
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
curl_setopt( $ch, CURLOPT_AUTOREFERER, true );
curl_setopt( $ch, CURLOPT_SSL_VERIFYPEER, false ); # required for https urls
curl_setopt( $ch, CURLOPT_CONNECTTIMEOUT, $timeout );
curl_setopt( $ch, CURLOPT_TIMEOUT, $timeout );
curl_setopt( $ch, CURLOPT_MAXREDIRS, 10 );
$content = curl_exec( $ch );
$response = curl_getinfo( $ch );
curl_close ( $ch );
if ($response['http_code'] == 301 || $response['http_code'] == 302) {
ini_set("user_agent", "Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.7.3) Gecko/20041001 Firefox/0.10.1");
if ( $headers = get_headers($response['url']) ) {
foreach( $headers as $value ) {
if ( substr( strtolower($value), 0, 9 ) == "location:" )
return get_url( trim( substr( $value, 9, strlen($value) ) ) );
}
}
}
if ( ( preg_match("/>[[:space:]]+window\.location\.replace\('(.*)'\)/i", $content, $value) || preg_match("/>[[:space:]]+window\.location\=\"(.*)\"/i", $content, $value) ) && $javascript_loop < 5) {
return get_url( $value[1], $javascript_loop+1 );
} else {
return array( $content, $response );
}
}
?>
For testing, I created a local file called test.html:
<!DOCTYPE html>
<html>
<head>
<title></title>
</head>
<body>
<p>This should not be showing.</p>
<p ng-bind-html="getContent(material.content)" id="dev-content" class="details-text">This is a test.</p>
</body>
</html>
I used the local url http://localhost/example/test.html instead of the link you provided for testing purposes.
And from the local file I created for testing, I got the following result:
<p ng-bind-html="getContent(material.content)" id="dev-content" class="details-text">This is a test.</p>
Here's the result that I got from the original url:
<p ng-bind-html="getContent(material.content)" id="dev-content" class="details-text"></p>
I hope this helps!

How to download a copy of the html from a website?

How do I download a copy of the html from a website that has language detection (eg google, youtube) and redirection? I have tried file_get_contents but it is to limited.
I am trying to use curl in php to get the html from www.google.com but it detects that I am from the UK and sends me a 302 redirect to www.google.co.uk.
I have tried many different things with no joy, is this possible? websites like www.markosweb.com do it..
my code:
$ch = curl_init( "http://www.google.com/" );
// $userAgent = "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Win64; x64; Trident/5.0)";
// $userAgent = 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)';
$userAgent = 'Googlebot/2.1 (http://www.googlebot.com/bot.html)';
$header = array(
"Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5",
"Accept-Language: en-US,us;q=0.7,en-us;q=0.5,en;q=0.3",
"Accept-Charset: windows-1251,utf-8;q=0.7,*;q=0.7",
"Keep-Alive: 300");
curl_setopt($ch,CURLOPT_RETURNTRANSFER,TRUE); //TRUE to return the transfer as a string of the return value of curl_exec() instead of outputting it out directly.
curl_setopt($ch,CURLOPT_CONNECTTIMEOUT,5); //The number of seconds to wait while trying to connect.
curl_setopt($ch, CURLOPT_USERAGENT, $userAgent); //The contents of the "User-Agent: " header to be used in a HTTP request.
curl_setopt($ch, CURLOPT_FAILONERROR, TRUE); //To fail silently if the HTTP code returned is greater than or equal to 400.
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1); //To follow any "Location: " header that the server sends as part of the HTTP header.
curl_setopt($ch, CURLOPT_AUTOREFERER, TRUE); //To automatically set the Referer: field in requests where it follows a Location: redirect.
curl_setopt($ch, CURLOPT_TIMEOUT, 10); //The maximum number of seconds to allow cURL functions to execute.
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, FALSE);
curl_setopt($curl, CURLOPT_REFERER, $url);
curl_setopt($ch, CURLOPT_HTTPHEADER, 0);
$content = curl_exec( $ch );
$err = curl_errno( $ch );
$errmsg = curl_error( $ch );
$header = curl_getinfo( $ch );
curl_close( $ch );
$header['errno'] = $err;
$header['errmsg'] = $errmsg;
$header['content'] = $content;
return $header;
I have tried changing the useragent to lots of things, tried with and without header details. I managed to get something if I used header info : "Accept-Language: ru-ru,ru;q=0.7,en-us;q=0.5,en;q=0.3" but it was in russian or something.
Thanks for your help.
Carl
Try this proxy script:
// Change these configuration options if needed, see above descriptions for info.
$enable_jsonp = false;
$enable_native = false;
$valid_url_regex = '/.*/';
// ############################################################################
$url = $_GET['url'];
if ( !$url ) {
// Passed url not specified.
$contents = 'ERROR: url not specified';
$status = array( 'http_code' => 'ERROR' );
} else if ( !preg_match( $valid_url_regex, $url ) ) {
// Passed url doesn't match $valid_url_regex.
$contents = 'ERROR: invalid url';
$status = array( 'http_code' => 'ERROR' );
} else {
$ch = curl_init( $url );
if ( strtolower($_SERVER['REQUEST_METHOD']) == 'post' ) {
curl_setopt( $ch, CURLOPT_POST, true );
curl_setopt( $ch, CURLOPT_POSTFIELDS, $_POST );
}
if ( $_GET['send_cookies'] ) {
$cookie = array();
foreach ( $_COOKIE as $key => $value ) {
$cookie[] = $key . '=' . $value;
}
if ( $_GET['send_session'] ) {
$cookie[] = SID;
}
$cookie = implode( '; ', $cookie );
curl_setopt( $ch, CURLOPT_COOKIE, $cookie );
}
curl_setopt( $ch, CURLOPT_FOLLOWLOCATION, true );
curl_setopt( $ch, CURLOPT_HEADER, true );
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
curl_setopt( $ch, CURLOPT_USERAGENT, $_GET['user_agent'] ? $_GET['user_agent'] : $_SERVER['HTTP_USER_AGENT'] );
list( $header, $contents ) = preg_split( '/([\r\n][\r\n])\\1/', curl_exec( $ch ), 2 );
$status = curl_getinfo( $ch );
curl_close( $ch );
}
// Split header text into an array.
$header_text = preg_split( '/[\r\n]+/', $header );
if ( $_GET['mode'] == 'native' ) {
if ( !$enable_native ) {
$contents = 'ERROR: invalid mode';
$status = array( 'http_code' => 'ERROR' );
}
// Propagate headers to response.
foreach ( $header_text as $header ) {
if ( preg_match( '/^(?:Content-Type|Content-Language|Set-Cookie):/i', $header ) ) {
header( $header );
}
}
print $contents;
} else {
// $data will be serialized into JSON data.
$data = array();
// Propagate all HTTP headers into the JSON data object.
if ( $_GET['full_headers'] ) {
$data['headers'] = array();
foreach ( $header_text as $header ) {
preg_match( '/^(.+?):\s+(.*)$/', $header, $matches );
if ( $matches ) {
$data['headers'][ $matches[1] ] = $matches[2];
}
}
}
// Propagate all cURL request / response info to the JSON data object.
if ( $_GET['full_status'] ) {
$data['status'] = $status;
} else {
$data['status'] = array();
$data['status']['http_code'] = $status['http_code'];
}
// Set the JSON data object contents, decoding it from JSON if possible.
$decoded_json = json_decode( $contents );
$data['contents'] = $decoded_json ? $decoded_json : $contents;
// Generate appropriate content-type header.
$is_xhr = strtolower($_SERVER['HTTP_X_REQUESTED_WITH']) == 'xmlhttprequest';
header( 'Content-type: application/' . ( $is_xhr ? 'json' : 'x-javascript' ) );
// Get JSONP callback.
$jsonp_callback = $enable_jsonp && isset($_GET['callback']) ? $_GET['callback'] : null;
// Generate JSON/JSONP string
$json = json_encode( $data );
print $jsonp_callback ? "$jsonp_callback($json)" : $json;
}
Make sure to perform a request like this:
http://example.com/script?url=http://whateverurl.com/
Oh, and this PHP script will display the result in JSON.
From there, you can parse it using jQuery.
Like I use this jQuery code:
<script type="text/javascript">
$(document).ready(function(){
var url='+++++URL WHICH THE PHP PROXY SCRIPT IS IN++++++';
$(window).load(function(){
$.getJSON(url,function(json){
$("#resu").append(""+json.contents+"");
});
});
});
</script>
Edit: This script is not a true proxy in the sense that it does fake an IP address. Sorry for the confusion.

Fogbugz XML_API PHP CURL File upload

I have built a php script which receives values in $_POST and $_FILES
I'm catching those values, and then trying to use CURL to make posts to FogBugz.
I can get text fields to work, but not files.
$request_url = "http://fogbugz.icarusstudios.com/fogbugz/api.php";
$newTicket = array();
$newTicket['cmd'] = 'new';
$newTicket['token'] = $token;
$newTicket['sPersonAssignedTo'] = 'autobugz';
$text = "\n";
foreach( $form as $pair ) {
$text .= $pair[2] . ": " . $pair[0] . "\n";
}
$text = htmlentities( $text );
$newTicket['sEvent'] = $text;
$f = 0;
foreach ($_FILES as $fk => $v) {
if ($_FILES[$fk]['tmp_name'] != '') {
$extension = pathinfo( $_FILES[$fk]['name'], PATHINFO_EXTENSION);
//only take the files we have specified above
if (in_array( array( $fk, $extension ) , $uploads)) {
$newTicket['File'.$f] = $_FILES[$fk]['tmp_name'];
//echo ( $_FILES[$fk]['name'] );
//echo ( $_FILES[$fk]['tmp_name'] );
//print $fk;
//print '<br/>';
//print_r( $v );
}
}
}
$ch = curl_init( $request_url );
$timeout = 5;
curl_setopt($ch, CURLOPT_POST,1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
curl_setopt($ch, CURLOPT_POSTFIELDS, $newTicket );
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$data = curl_exec($ch);
curl_close($ch);
To upload files with CURL you should prepend a # to the path, see this example:
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_VERBOSE, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible;)");
curl_setopt($ch, CURLOPT_URL, _VIRUS_SCAN_URL);
curl_setopt($ch, CURLOPT_POST, true);
// same as <input type="file" name="file_box">
$post = array(
"file_box"=>"#/path/to/myfile.jpg",
);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post);
$response = curl_exec($ch);
Taken from http://dtbaker.com.au/random-bits/uploading-a-file-using-curl-in-php.html.
The other answer -- for FogBugz reasons only --
$f cannot be set to 0 initially. It must be 1, so the files go through as File1, File2, etc.
The # symbol is also key.

Categories