I'm using an external web service that returns an image URL, which I then display on my website. For example:
$url = get_from_web_service();
echo '<img src="'.$url.'" />';
Everything works fine, except that when I have 100 images to show, calling the web service becomes time- and resource-consuming.
//the problem
foreach($items as $item) {
$url = get_from_web_service($item);
echo '<img src="'.$url.'" />';
}
So now I'm considering two options:
//Option 1: Using PHP's file_get_contents():
foreach($items as $item)
{
echo '<img src="url_to_my_website/get_image.php?id='.$item->id.'" />';
}
get_image.php:
$id = $_GET['id']; // the item ID passed in by the <img> tag above
$url = get_from_web_service($id);
header("Content-Type: image/png");
echo file_get_contents($url);
//Option 2: Using AJAX:
echo '<img src="dummy_image_or_website_logo" data-id="123" />';
//AJAX call to the web service with id=123 to get the URL, then set the src attribute of that image.
THOUGHTS
The first option seems more straightforward, but my server is involved in every single image request and might be overloaded.
The second option is handled entirely by the browser and the web service, so my server is not involved at all. But for each image I'm making two calls: one AJAX call to get the image URL and another one to get the image. So loading time might vary, and AJAX calls might fail when there are a large number of them.
Information
Around 50 images will be displayed on that page.
This service will be used by around 100 users at any given time.
I have no control over the web service, so I can't change its functionality, and it doesn't accept more than one image ID per call.
My Questions
Is there any better option I should consider?
If not, which option should I follow, and most importantly, why?
Thanks
Method 1: Rendering in PHP
Pros:
Allows for custom headers that're independent of any server software. If you're using something that's not generally cached (like a PHP file with a query string) or are adding this to a package that needs header functionality regardless of server software, this is a very good idea.
If you know how to use GD or Imagick, you can easily resize, crop, compress, index, etc. your images to reduce the image file size (sometimes drastically) and make the page load significantly faster.
If width and height are passed as variables to the PHP file, the dimensions can be set dynamically:
<div id="gallery-images">
<noscript>
<!-- So that the thumbnail is small for old mobile devices //-->
<img src="get-image.php?id=123&h=200&w=200" />
</noscript>
</div>
<script type="text/javascript">
/* Something to create an image element inside of the div.
* In theory, the browser height and width can be pulled dynamically
* on page load, which is useful for ensuring that images are no larger
* than they need to be. Having a function to load the full image
* if the browser becomes bigger isn't a bad idea though.
*/
</script>
This would be incredibly considerate of mobile users on a page that has an image gallery. This is also very considerate of users with limited bandwidth (like almost everyone in Alaska. I say this from personal experience).
Allows you to easily clear the EXIF data of images if they're uploaded by users on the website. This is important for user privacy as well as making sure there aren't any malicious scripts living in your JPGs.
Gives potential to dynamically create a large image sprite and drastically reduce your HTTP requests if they're causing latency. It'd be a lot of work so this isn't a very strong pro, but it's still something you can do using this method that you can't do using the second method.
Cons:
Depending on the number and size of images, this could put a lot of strain on your server. When used with browser-caching, the dynamic images are being pulled from cache instead of being re-generated, however it's still very easy for a bot to be served the dynamic image a number of times.
It requires knowledge of HTTP headers, basic image manipulation skills, and an understanding of how to use image manipulation libraries in PHP to be effective.
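For the dynamic-dimensions idea under Method 1, here is a minimal sketch of how get-image.php might sanitize the w and h parameters and work out a target size before handing off to GD or Imagick. The function names, the default of 200 and the 2000-pixel cap are assumptions for illustration, not something from the answer above.

```php
<?php
// Hypothetical helpers: names and limits are assumptions for illustration.

/** Clamp a dimension taken from the query string to a sane range. */
function clamp_dimension($value, int $default, int $max = 2000): int
{
    $n = filter_var($value, FILTER_VALIDATE_INT);
    if ($n === false || $n < 1) {
        return $default; // missing or invalid input falls back to the default
    }
    return min($n, $max); // never let a client request an enormous image
}

/** Largest size that fits in the requested box, keeps the aspect ratio and never upscales. */
function fit_within(int $srcW, int $srcH, int $boxW, int $boxH): array
{
    $scale = min($boxW / $srcW, $boxH / $srcH, 1.0);
    return [
        max(1, (int) round($srcW * $scale)),
        max(1, (int) round($srcH * $scale)),
    ];
}

// Possible use inside get-image.php ($origW and $origH come from the source image):
// $w = clamp_dimension($_GET['w'] ?? null, 200);
// $h = clamp_dimension($_GET['h'] ?? null, 200);
// [$newW, $newH] = fit_within($origW, $origH, $w, $h);
```

Capping the size matters here because every distinct w/h pair can trigger a fresh resize on the server, which is exactly the strain mentioned in the cons.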
Method 2: AJAX
Pros:
The page would finish loading before any of the images. This is important if your content absolutely needs to load as fast as possible, and the images aren't very important.
Is far simpler and significantly faster to implement than any kind of dynamic PHP solution.
It spaces out the HTTP requests, so the initial content loads faster (since the HTTP requests can be sent based on browser action instead of just page load).
Cons:
It doesn't decrease the number of HTTP requests, it simply spaces them out. Also note that there will be at least one additional external JS file in addition to all of these images.
Displays nothing if the end device (such as older mobile devices) does not support JavaScript. The only way you could fix this is to have all of the images load normally between some <noscript> tags, which would require PHP to generate twice as much HTML.
Would require you to add loading.gif (and another HTTP request) or Please wait while these images load text to your page. I personally find this annoying as a website user because I want to see everything when the page is "done loading".
Conclusion:
If you have the background knowledge or time to learn how to effectively use Method 1, it gives far more potential because it allows for manipulation of the images and HTTP requests sent by your page after it loads.
Conversely, if you're looking for a simple method to space out your HTTP Requests or want to make your content load faster by making your extra images load later, Method 2 is your answer.
Looking back at methods 1 and 2, it looks like using both methods together could be the best answer: have two of your cached and compressed images load with the page (one is visible, the other is a buffer so that the user doesn't have to wait every time they click "next"), and have the rest load one by one as the user sees fit.
In your specific situation, I think that Method 2 would be the most effective if your images can be displayed in a "slideshow" fashion. If all of the images need to be loaded at once, try compressing them and applying browser-caching with method 1. If too many image requests on page load is destroying your speed, try image spriting.
As of now, you are contacting the webservice 100 times. You should change it so it contacts the webservice only once and retrieves an array of all the 100 images, instead of each image separately.
You can then loop over this array, which will be very fast as no further webtransactions are needed.
If the images you are fetching from the web service are not dynamic in nature, i.e. they do not get changed or modified frequently, I would suggest setting up a scheduled process/cron job on your server that gets the images from the web service and stores them locally (on your own server). You can then display the images on the web page from your own server and avoid a round trip to the third-party server every time the page is served to end users.
Neither of the two options will solve your problem, and they may make it worse.
For option 1:
The step that costs the most time is get_from_web_service($item), and this option only moves its execution into another script (get_image.php, which still runs on the same server).
For option 2:
It only makes the browser trigger the image request; your server still has to process get_from_web_service($item).
One thing must be clear: the problem is the performance of get_from_web_service, so the most direct fix is to make it perform better. Alternatively, we can reduce the number of concurrent connections. I haven't thought this through fully, but I have three suggestions:
Asynchronous: Users don't view your whole page at once; they only see the top of it. If your images aren't all displayed at the top, you can use the jquery.lazyload extension, which holds back requests for images in the invisible region until they become visible.
CSS Sprites: An image sprite is a collection of images put into a single image. If the images on your page don't change frequently, you can write some code to merge them daily.
Cache Image: You can cache your images on your server, or (better) on another server, and keep a key->value mapping: the key is derived from the $item, and the value is the resource location (URL).
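The key->value idea from the last suggestion could be sketched roughly like this. It is only a file-based illustration: cached_image_url(), the cache directory and the one-hour lifetime are assumptions, and $lookup stands in for get_from_web_service().

```php
<?php
/**
 * Return the cached URL for $id, or call $lookup once and remember the result.
 * $lookup stands in for get_from_web_service(); $cacheDir and $ttl are assumptions.
 */
function cached_image_url(string $id, callable $lookup, string $cacheDir, int $ttl = 3600): string
{
    if (!is_dir($cacheDir)) {
        mkdir($cacheDir, 0775, true);
    }
    // Hash the key so that arbitrary IDs cannot escape the cache directory.
    $file = $cacheDir . '/' . sha1($id) . '.url';

    // Fresh cache entry: skip the web service entirely.
    if (is_file($file) && filemtime($file) + $ttl > time()) {
        $cached = file_get_contents($file);
        if ($cached !== false && $cached !== '') {
            return $cached;
        }
    }

    // Cache miss or stale entry: ask the web service and store the answer.
    $url = $lookup($id);
    file_put_contents($file, $url, LOCK_EX);
    return $url;
}
```

With something like this in place, 100 users viewing the same 50 images would cause roughly 50 web service calls per hour instead of 50 per page view. Memcached or a database table would serve the same purpose as the files.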
I am not a native English speaker; I hope I made this clear and helpful to you.
I'm not an expert, but I'm thinking that every time you echo, it takes time. Getting 100 images shouldn't be a problem by itself.
Also, maybe get_from_web_service($item); should be able to take an array?
$counter = 1;
$urls = array();
foreach($items as $item)
{
$urls[$counter] = get_from_web_service($item);
$counter++;
}
// and then you can echo the information?
foreach($urls as $url)
{
//echo each or use a function to better do it
//echo '<img url="url_to_my_website/get_image?id='.$url->id.'" />'
}
get_image.php :
$url = get_from_web_service($item);
header("Content-Type: image/png");
echo file_get_contents($url);
at the end, it would be mighty nice if you can just call
get_from_web_service($itemArray); //intake the array and return images
Option 3:
cache the requests to the web service
Option one is the best option. I would also want to make sure that the images are cached on the server, so that multiple round trips are not required from the original web server for the same image.
If you're interested, this is the core of the code that I use for caching images etc. (note that a few things, like re-serving the same content back to the client, are missing):
<?php
function error404() {
header("HTTP/1.0 404 Not Found");
echo "Page not found.";
exit;
}
function hexString($md5, $hashLevels=3) {
$hexString = substr($md5, 0, $hashLevels );
$folder = "";
while (strlen($hexString) > 0) {
$folder = "$hexString/$folder";
$hexString = substr($hexString, 0, -1);
}
if (!file_exists('cache/' . $folder))
mkdir('cache/' . $folder, 0777, true);
return 'cache/' . $folder . $md5;
}
if (!isset($_GET['img']))
error404();
function getFile($url) {
// true to enable caching, false to delete cache if already cached
$cache = true;
$defaults = array(
CURLOPT_HEADER => FALSE,
CURLOPT_RETURNTRANSFER => 1,
CURLOPT_FOLLOWLOCATION => 1,
CURLOPT_MAXCONNECTS => 15,
CURLOPT_CONNECTTIMEOUT => 30,
CURLOPT_TIMEOUT => 360,
CURLOPT_USERAGENT => 'Image Download'
);
$ch = curl_init();
curl_setopt_array($ch, $defaults);
curl_setopt($ch, CURLOPT_URL, $url);
$key = hexString(sha1($url));
if ($cache && file_exists($key)) {
return file_get_contents($key);
} elseif (!$cache && file_exists($key)) {
unlink($key);
}
$data = curl_exec($ch);
$info = curl_getinfo($ch);
if ($cache === true && $info['http_code'] == 200 && strlen($data) > 20)
file_put_contents($key, $data);
elseif ($info['http_code'] != 200)
error404();
return $data;
}
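$content = getFile($_GET['img']);
if ($content !== null && $content !== false) {
// Success! Detect the actual image type instead of sending a bare "image" type.
$finfo = new finfo(FILEINFO_MIME_TYPE);
header("Content-Type: " . $finfo->buffer($content));
echo $content;
}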
Neither of the two options will resolve the server resource usage issue. Out of the two, though, I would recommend option 1. The second one will delay page loading, slowing the website down and reducing your SEO ratings.
Best option for you would be something like:
foreach($items as $item) {
echo '<img src="url_to_my_website/get_image.php?id='.$item->id.'" />';
}
Then where the magic happens is get_image.php:
$id = (int) $_GET['id']; // assumes numeric IDs; also keeps the ID safe to use in a file name
$path = '/path_to_local_storage/image_'.$id.'.png';
if(file_exists($path)) {
$img = file_get_contents($path);
} else {
$url = get_from_web_service($id);
$img = file_get_contents($url);
file_put_contents($path, $img); // store under the same name the check above looks for
}
header("Content-Type: image/png");
echo $img;
This way you will only run the request to the web service once per image and then store it in your local space. The next time the image is requested, you will serve it from your local space, skipping the request to the web service.
Of course, this assumes image IDs are unique and persistent.
Probably not the best solution, but should work pretty well for you.
Since you're putting the URL of the web-service-provided image right in the <img> tag's src attribute, one can safely assume that these URLs are not secret or confidential.
Knowing that, the following snippet in get_image.php will work with the least overhead possible:
$url = get_from_web_service($id);
header("Location: $url");
If you're getting a lot of subsequent requests for the same id from a given client, you can reduce the number of requests somewhat by exploiting the browser's internal cache.
header("Cache-Control: private, max-age=$seconds");
header("Expires: ".gmdate('D, d M Y H:i:s', time()+$seconds).' GMT'); // HTTP dates must end in "GMT"
Else resort to server-side caching by means of Memcached, database, or plain files like so:
is_dir('cache') or mkdir('cache');
$cachedDataFile = "cache/$id";
$cacheExpiryDelay = 3600; // an hour
if (is_file($cachedDataFile) && filesize($cachedDataFile)
&& filemtime($cachedDataFile) + $cacheExpiryDelay > time()) {
$url = file_get_contents($cachedDataFile);
} else {
$url = get_from_web_service($id);
file_put_contents($cachedDataFile, $url, LOCK_EX);
}
header("Cache-Control: private, max-age=$cacheExpiryDelay");
header("Expires: ".gmdate('D, d M Y H:i:s', time() + $cacheExpiryDelay).' GMT');
header("Location: $url");
Related
I'm using image src encryption with base64_encode, but that code makes my site slower. Does anybody have a solution to make my site faster with this type of encryption? My code is below.
<?php
while ($user = mysqli_fetch_array($queryResult, MYSQLI_ASSOC)){
if ($user["main_picture"]){
$imageData = base64_encode(file_get_contents($user["main_picture"]));
$result .= '<td><div class="user_image_container"><img src="data:image/jpeg;base64,'.$imageData.'" /></div></td>';
}
else{
$result .= '<td></td>';
}
}
?>
Can anybody help me with this?
When you load your page with this code the PHP interpreter has to finish fetching the image over the network before it can interpret the rest of the page, which adds time.
If you were to load the page without this code and use a direct link to the image, the page itself would load a lot faster and then the browser would load the image.
Potential workaround: use a database to store single-use tokens that map to images. When a user loads the page, generate a token (which will be quicker than pulling the image) and have the image src point to an image-serving endpoint that you set up, which checks the token, marks it as used, fetches the file and then sends the image. You might run into problems with caching if you want it to be truly single use, but it does at least hide the source of the image.
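A rough sketch of the token idea, with an in-memory array standing in for the database table so that it runs on its own. The class and method names are made up for illustration; a real version would persist the tokens somewhere shared between the page request and the image request.

```php
<?php
/**
 * In-memory stand-in for the token table described above (hypothetical class).
 * A real setup would keep the tokens in a database shared between requests.
 */
class SingleUseTokens
{
    private array $tokens = [];

    /** Create a random token that maps to an image path. */
    public function issue(string $imagePath): string
    {
        $token = bin2hex(random_bytes(16)); // 32 hex characters, not guessable
        $this->tokens[$token] = $imagePath;
        return $token;
    }

    /** Return the image path once, then invalidate the token. */
    public function redeem(string $token): ?string
    {
        if (!isset($this->tokens[$token])) {
            return null; // unknown or already used
        }
        $path = $this->tokens[$token];
        unset($this->tokens[$token]);
        return $path;
    }
}
```

The page would then print something like an img tag whose src is image.php?t= followed by the token, and image.php would call redeem(), send a 404 on null, and otherwise stream the file with the right Content-Type.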
Amazon AWSSDKforPHP too slow
Hi there,
I'm using Amazon AWSSDKforPHP to connect my web application to S3. But there's an issue with the process of making requests to the service that makes it too slow.
For example, I have this code:
// Iterate an array of user images
foreach($images as $image){
// Return the Bucket URL for this image
$urls[] = $s3->get_object_url($bucket, 'users/'.trim($image).'.jpg', '5 minutes');
}
Supposing that $images is an array of user pictures, this returns an array called $urls that holds (as its name says) the URLs of the pictures, with credentials valid for 5 minutes. This request takes at least 6 seconds with 35 images, and that's OK. But when a picture does not exist in the bucket, I want to assign a default image for the user, something like 'images/noimage.png'.
Here's the code:
// Iterate an array of user images
foreach($images as $image){
// Check if the object exists in the Bucket
if($s3->if_object_exists($bucket, 'users/'.trim($image).'.jpg')){
// Return the Bucket URL for this image
$urls[] = $s3->get_object_url($bucket, 'users/'.trim($image).'.jpg', '5 minutes');
} else {
// Return the default image
$urls[] = 'http://www.example.com/images/noimage.png';
}
}
And the condition works, but SLOOOOOW. With the condition "$s3->if_object_exists()", the script takes at least 40 seconds with 35 images!
I have modified my Script, making the request using cURL:
// Iterate an array of user images
foreach($images as $image){
// Setup cURL
$ch = curl_init($s3->get_object_url($bucket, 'users/'.trim($image).'.jpg', '1 minutes') );
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);
// Get Just the HTTP response code
$res = curl_getinfo($ch,CURLINFO_HTTP_CODE);
if($res == 200){ //the image exists
$urls[] = $s3->get_object_url($bucket, 'users/'.trim($image).'.jpg', '5 minutes');
}else{ // The response is 403
$urls[] = 'http://www.example.com/images/noimage.png';
}
}
And this modified Script takes between 16 and 18 Seconds. This is a big difference, but it's still a lot of time :(.
Please, any help is so much appreciated.
Thank you.
Why not change how you are doing your checks? Store the locations/buckets of the images locally in a database; this way you do not have to worry about this check.
This way you minimize the number of API calls, which is 35 in your case now but could grow very large over time. And you are not doing just one call per image but, for the most part, two calls per image. This is highly inefficient and relies on your network connection being fairly fast.
Storing the location data, and whether or not the image exists, locally is a much better choice in terms of performance. It also looks like this check would only have to be done once if you store the result ahead of time.
I would think that if you wanted to be able to read directory type of information from S3, you might best use something like s3fs to mount your bucket as a system drive. s3fs can also be configured with a local cache to speed things up (cache on fast ephemeral storage if you are using EC2).
This would allow you to do regular PHP directory handling (DirectoryIterator, etc.) with ease.
If this is more than you want to mess with, at least store the filename data in a database and just expect the files to be in the proper S3 locations, or cache the results of individual API checks locally in some manner so as to not need to make an API call for each similar request.
It's slow because you're calling if_object_exists() in every iteration through the loop, kicking off a network request to AWS.
The user "thatidiotguy" said:
I do not know about the S3 API, but could you ask for a list of files in the bucket and do the string matching/searching yourself in the script? There is no way 34 string match tests should take anywhere near that long in a PHP script.
He's right.
Instead of calling if_object_exists(), you can instead call get_object_list() once — at the beginning of the script — then compare your user photo URL to the list using PHP's in_array() function.
You should see a speed-up of approximately a zillion percent. Don't quote me on that, though. ;)
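Something along these lines, as a sketch rather than tested SDK code. To keep it runnable without AWS credentials, the bucket listing and the URL signer are passed in: $existingKeys would be whatever the single get_object_list() call returns, and $signUrl is a hypothetical wrapper around $s3->get_object_url(). It also uses array_flip() plus isset() instead of in_array(), which avoids a linear scan per image.

```php
<?php
/**
 * Build the URL list from one bucket listing instead of one existence check per image.
 *
 * $existingKeys: object keys from a single listing call (e.g. get_object_list()).
 * $signUrl:      hypothetical wrapper around $s3->get_object_url().
 */
function build_image_urls(array $images, array $existingKeys, callable $signUrl, string $defaultUrl): array
{
    // Flip once so each lookup is a hash check rather than an in_array() scan.
    $known = array_flip($existingKeys);

    $urls = [];
    foreach ($images as $image) {
        $key = 'users/' . trim($image) . '.jpg';
        $urls[] = isset($known[$key]) ? $signUrl($key) : $defaultUrl;
    }
    return $urls;
}

// Possible wiring (assumes get_object_list() returns an array of key strings):
// $keys = $s3->get_object_list($bucket);
// $urls = build_image_urls($images, $keys, function ($key) use ($s3, $bucket) {
//     return $s3->get_object_url($bucket, $key, '5 minutes');
// }, 'http://www.example.com/images/noimage.png');
```

If the bucket is large, the listing itself can get slow, so it may be worth caching the key list for a short while rather than fetching it on every page view.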
I made a simple parser for saving all images per page, using Simple HTML DOM and a get-image class, but I had to put a loop inside a loop in order to go page by page. I think something is not optimized in my code, as it is very slow and always times out or exceeds the memory limit. Could someone have a quick look at the code? Maybe you'll see something really stupid that I did.
Here is the code without libraries included...
$pageNumbers = array(); //Array to hold number of pages to parse
$url = 'http://sitename/category/'; //target url
$html = file_get_html($url);
//Simply detecting the paginator class and pushing into an array to find out how many pages to parse placing it into an array
foreach($html->find('td.nav .str') as $pn){
array_push($pageNumbers, $pn->innertext);
}
// initializing the get image class
$image = new GetImage;
$image->save_to = $pfolder.'/'; // save to folder, value from post request.
//Start reading pages array and parsing all images per page.
foreach($pageNumbers as $ppp){
$target_url = 'http://sitename.com/category/'.$ppp; //Here i construct a page from an array to parse.
$target_html = file_get_html($target_url); //Reading the page html to find all images inside next.
//Final loop to find and save each image per page.
foreach($target_html->find('img.clipart') as $element) {
$image->source = url_to_absolute($target_url, $element->src);
$get = $image->download('curl'); // download using cURL
echo 'saved'.url_to_absolute($target_url, $element->src).'<br />';
}
}
Thank you.
I suggest making a function to do the actual simple html dom processing.
I usually use the following 'template'... note the 'clear memory' section.
Apparently there is a memory leak in PHP 5... at least I read that someplace.
function scraping_page($iUrl)
{
// create HTML DOM
$html = file_get_html($iUrl);
// get text elements
$aObj = $html->find('img');
// do something with the element objects
// clean up memory (prevent memory leaks in PHP 5)
$html->clear(); // **** very important ****
unset($html); // **** very important ****
return; // also can return something: array, string, whatever
}
Hope that helps.
You are doing quite a lot here, I'm not surprised the script times out. You download multiple web pages, parse them, find images in them, and then download those images... how many pages, and how many images per page? Unless we're talking very small numbers then this is to be expected.
I'm not sure what your question really is, given that, but I'm assuming it's "how do I make this work?". You have a few options, it really depends what this is for. If it's a one-off hack to scrape some sites, ramp up the memory and time limits, maybe chunk up the work to do a little, and next time write it in something more suitable ;)
If this is something that happens server-side, it should probably be happening asynchronously to user interaction - i.e. rather than the user requesting some page, which has to do all this before returning, this should happen in the background. It wouldn't even have to be PHP, you could have a script running in any language that gets passed things to scrape and does it.
Currently I'm using PHP to load multiple XML files from around the web (non-local) using simplexml_load_file(). This, as you can imagine, is quite a clunky process and is slowing load time significantly (7 seconds to load 7 files), and there could possibly be more files to load. These files don't change often, but changes should be displayed on the page as soon as they are made.
One idea I had was to cache a version of each feed and the html output I generate from that feed in my DB. Then, each time the user loads the page, the feeds would be compared; if they are different I would run my existing code, generate the HTML, output it, and save it to the DB. However, if it is the same, I could simply output the cached HTML.
My two concerns with this are:
Security: If I am storing a copy of an XML file, could this pose a security threat, seeing as I don't control the content of that file?
Speed: The main goal here is to increase the speed of the overall page load. Would the process described above increase the speed, or would it just bog down the server with more to do? Thanks for your help!
How about having a cron job crawl through every external XML source, say, hourly or quarter-hourly and update it if necessary?
It wouldn't be in 100% real time, but would take the load off your web page - that would always be using cached files. I don't think there is a reliable way of polling external sources for updates other than actually downloading the file (in theory, it should be possible to get the correct cache headers, but I wouldn't rely on them being configured correctly.)
Security: If I am storing a copy of an XML file, could this pose a security threat, seeing as I don't control the content of that file?
Hardly. To make totally sure, store the cached XML files outside the web root. Any threat that remains then is the same as if you were passing the stream through live.
One idea I had was to cache a version of each feed and the html output I generate from that feed in my DB. Then, each time the user loads the page, the feeds would be compared; if they are different I would run my existing code, generate the HTML, output it, and save it to the DB. However, if it is the same, I could simply output the cached HTML.
Rather than caching the XML file yourself, you should set the If-None-Match or If-Modified-Since fields in the request header. This way you can check to see if the files have changed without necessarily downloading them.
This can be done by setting a stream context for libxml before running simplexml_load_file(). If the file hasn't changed, you'll get a 304 Not Modified response, and simplexml_load_file will fail.
You could also use stream_context_get_default to set the general stream context, then retrieve the XML file into a string with file_get_contents and pass it to simplexml_load_string().
Here's an example of the first way:
Class CachedXml {
public $element,$url;
private $mod_date, $etag;
public function __construct($url){
$this->url = $url;
$this->element = NULL;
$this->mod_date = FALSE;
$this->etag = FALSE;
}
public function updateXml(){
if($this->mod_date || $this->etag){
$opts = array(
'http'=>array(
'header'=>"If-Modified-Since: $this->mod_date\r\n" .
"If-None-Match: $this->etag\r\n"
)
);
$context = stream_context_create($opts);
libxml_set_streams_context($context);
}
if($attempt = @simplexml_load_file($this->url)){
$this->element = $attempt;
$headers = get_headers($this->url,1);
$this->mod_date = $headers['Last-Modified'];
$this->etag = $headers['ETag'];
return TRUE;
}
return FALSE;
}
}
$bob = new CachedXml('http://example.com/xml/test.xml');
if($bob->updateXml()){
echo "Bob was just updated.<br />";
echo " Bob's name is " . $bob->element->getName() . ".<br />";
}
else{
echo "Bob was not updated.<br />";
}
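The second way mentioned above (file_get_contents plus simplexml_load_string) might look roughly like the following. This is a sketch under assumptions: the function names are made up, the caller is expected to remember the ETag and Last-Modified values from the previous fetch, and the remote server has to honour conditional requests for any of it to help.

```php
<?php
/** Build stream-context options carrying the conditional request headers. */
function conditional_http_options(?string $etag, ?string $lastModified): array
{
    $headers = [];
    if ($etag) {
        $headers[] = "If-None-Match: $etag";
    }
    if ($lastModified) {
        $headers[] = "If-Modified-Since: $lastModified";
    }
    return ['http' => [
        'header' => implode("\r\n", $headers),
        'ignore_errors' => true, // let us inspect non-2xx responses such as 304
    ]];
}

/** Extract the final status code from the header lines of an HTTP response. */
function http_status(array $responseHeaders): int
{
    // After redirects there are several status lines; the last one wins.
    $status = 0;
    foreach ($responseHeaders as $line) {
        if (preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m)) {
            $status = (int) $m[1];
        }
    }
    return $status;
}

/** Returns the parsed XML when the feed changed, or null on 304 or on failure. */
function fetch_xml_if_changed(string $url, ?string $etag, ?string $lastModified): ?SimpleXMLElement
{
    $context = stream_context_create(conditional_http_options($etag, $lastModified));
    $body = @file_get_contents($url, false, $context);
    // PHP fills $http_response_header in the local scope after an HTTP stream request.
    if ($body === false || http_status($http_response_header ?? []) !== 200) {
        return null;
    }
    $xml = simplexml_load_string($body);
    return $xml === false ? null : $xml;
}
```

A null result then means "keep serving the cached HTML", and a SimpleXMLElement means "regenerate and store".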
I'm pulling binary data out of my MySQL database and want to display it as an image.
I do not want to make a separate page to display the image (this would involve an extra call to the database, among other things).
I simply want to be able to do
Pretty much, but the $Image variable is in its longblob format and I need to convert it.
Thanks in advance.
I know this is not a specific answer to your question, but consider that by removing that database call, you are dramatically increasing your server load, increasing the size of each page and slowing down the responsiveness of your site.
Consider any page on Stack Overflow. Most of it is dynamic, so the page cannot be cached, but the users' thumbnails are static and can be cached.
If you send the thumbnail as a data URI, you are doing the DB lookup and data transfer for every thumbnail on every page.
If you send it as a linked image, you incur a single DB lookup for when the image is first loaded, and from then on it will be cached (if you send the correct HTTP headers with it), making your server load lighter, and your site run faster!
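To illustrate the "correct HTTP headers" part, here is a minimal sketch of what a separate image script could decide to send. The function name, the one-day max-age and the JPEG content type are assumptions; the function returns the status, headers and body instead of emitting them, so it can be tried without a web server.

```php
<?php
/**
 * Decide what an image endpoint should send back (hypothetical helper).
 * Returns [status code, headers, body]; the caller emits them with header() and echo.
 */
function image_response(string $bytes, ?string $ifNoneMatch, int $maxAge = 86400): array
{
    $etag = '"' . sha1($bytes) . '"';
    $headers = [
        'ETag' => $etag,
        'Cache-Control' => "public, max-age=$maxAge",
    ];
    if ($ifNoneMatch !== null && trim($ifNoneMatch) === $etag) {
        // The browser already holds this exact image: send no body.
        return [304, $headers, ''];
    }
    $headers['Content-Type'] = 'image/jpeg'; // assumption: thumbnails are stored as JPEG
    $headers['Content-Length'] = (string) strlen($bytes);
    return [200, $headers, $bytes];
}

// Possible use in the image script, after reading $bytes from the database:
// [$status, $headers, $body] = image_response($bytes, $_SERVER['HTTP_IF_NONE_MATCH'] ?? null);
// http_response_code($status);
// foreach ($headers as $name => $value) { header("$name: $value"); }
// echo $body;
```

While the max-age has not expired the browser does not contact the server at all; after that, the ETag check at least saves re-sending the image data, although this sketch still reads the blob to compute the hash.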
I do not want to make a separate page for it to display the image
You can base64 encode your image data and include it directly into the markup as a data URI. In most cases, that's not a good idea though:
It's not supported by IE < 8
It (obviously) sizes up the HTML page massively.
It slows down rendering because the browser has to load the resource first before it can finish HTML rendering
Better build a separate script, and make that one extra call.
You could probably do this using Base64-encoded Data URIs.
I'm not sure if it's possible to do straight into a img-tag, but you can do it by setting a background-image for a div.
Basically you change the regular
.smurfette {
background: url(smurfette.png);
}
to
.smurfette {
background: url( [...] P6VAAAAAElFTkSuQmCC);
}
Data URIs are supported in:
* Firefox 2+
* Safari – all versions
* Google Chrome – all versions
* Opera 7.2+
* Internet Explorer 8+
Info borrowed from Robert Nyman: http://robertnyman.com/2010/01/15/how-to-reduce-the-number-of-http-requests/
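If you do go the data URI route from PHP, building the string is short. A minimal sketch follows; the function name is made up, and the MIME type has to match whatever is actually stored in the blob. In browsers that support data URIs it works in an img src as well as in CSS.

```php
<?php
/** Build a data URI from raw image bytes (hypothetical helper). */
function image_data_uri(string $bytes, string $mime = 'image/png'): string
{
    return 'data:' . $mime . ';base64,' . base64_encode($bytes);
}

// Possible use with a blob fetched from the database:
// echo '<img src="' . image_data_uri($row['Image'], 'image/jpeg') . '" alt="" />';
```

Keep in mind that base64 makes the data about a third larger and that the image is re-sent with every page, so this only makes sense for small images.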
Read the $_GET value into a separate variable, i.e. $myvar = $_GET['Id'];, before you build the $ImageResult query, e.g.:
$myid = (int) $_GET['Id']; // cast to int (assumes numeric IDs) so the value cannot inject SQL
$ImageResult = "select player.Image from player where Id = '$myid'";
Thanks for the answers. I've decided to go with a separate GetImage.php page, but now I can't seem to do the simplest of tasks:
$ImageResult = "select player.Image from player where Id = " . $_GET['Id'];
$result = mysql_query($ImageResult) or die ("data recovery failed -3");
header("Content-type: image/jpeg");
echo mysql_result($result, 0);
But this just returns a broken link, and I cannot work out what I have missed.