Website screenshots - php

Is there any way of taking a screenshot of a website in PHP, then saving it to a file?

LAST EDIT: after 7 years I'm still getting upvotes for this answer, but I guess this one is now much more accurate.
Sure you can, but you'll need something to render the page.
If you really want to use only PHP, I suggest HTMLTOPS, which renders the page and outputs it as a PS file (for Ghostscript), which you can then convert to .jpg, .png, .pdf, etc. It can be a little slow with complex pages, and it doesn't support all of CSS.
Otherwise, you can use wkhtmltopdf to output an HTML page as PDF, JPG, etc.
It supports CSS 2 and uses WebKit (the engine behind Safari) to render the page, so the output should be fine.
You have to install it on your server as well.
UPDATE: With newer HTML5 and JavaScript features, it is also possible to render the page into a canvas object using JavaScript. Here is a nice library to do that: Html2Canvas, and here is an implementation by the same author that collects feedback the way G+ does.
Once you have rendered the DOM into the canvas, you can send it to the server via AJAX and save it as a JPG.
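On the server side, the canvas usually arrives as a base64 data URL (what canvas.toDataURL('image/jpeg') returns). A minimal sketch of the receiving PHP, assuming the client POSTs it in a field named "image" (the field name and output path are made up for illustration):

```php
<?php
// Decode a JPEG data URL such as the one produced by canvas.toDataURL('image/jpeg').
// Returns the raw image bytes, or null if the input is not a base64 JPEG data URL.
function decodeJpegDataUrl(string $dataUrl): ?string
{
    if (!preg_match('#^data:image/jpeg;base64,([A-Za-z0-9+/=]+)$#', $dataUrl, $m)) {
        return null;
    }
    $bytes = base64_decode($m[1], true); // strict mode rejects invalid input
    return $bytes === false ? null : $bytes;
}

// Hypothetical endpoint body: the field name "image" is an assumption.
if (isset($_POST['image'])) {
    $bytes = decodeJpegDataUrl($_POST['image']);
    if ($bytes !== null) {
        file_put_contents(__DIR__ . '/screenshot.jpg', $bytes);
    }
}
```

Validating the prefix before decoding keeps arbitrary POST data from being written to disk as an image.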
EDIT: You can use the ImageMagick tool to convert PDF to PNG, since my version of wkhtmltopdf does not support image output. E.g. convert html.pdf -append html.png.
EDIT: This small shell script gives a simple but working usage example on Linux with php5-cli and the tools mentioned above.
EDIT: I have just noticed that the wkhtmltopdf team is working on another project, wkhtmltoimage, which gives you the JPG directly.
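For what it's worth, calling wkhtmltoimage from PHP could look roughly like this. It is only a sketch: it assumes the wkhtmltoimage binary is installed and on the PATH, and the function name is my own.

```php
<?php
// Build a wkhtmltoimage command line; every user-supplied value is shell-escaped.
// Assumes the wkhtmltoimage binary is installed and on the PATH.
function buildWkhtmltoimageCommand(string $url, string $outputFile, int $width = 1280): string
{
    return sprintf(
        'wkhtmltoimage --format png --width %d %s %s',
        $width,
        escapeshellarg($url),
        escapeshellarg($outputFile)
    );
}

$command = buildWkhtmltoimageCommand('https://stackoverflow.com/', '/tmp/screenshot.png');
echo $command, PHP_EOL;
// To actually run it: exec($command, $outputLines, $exitCode); then check $exitCode === 0.
```

Keeping the command construction in its own function makes it easy to log or inspect the exact command before handing it to exec().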

Since PHP 5.2.2 it is possible to capture a website with PHP alone!
imagegrabscreen — Captures the whole screen
<?php
$img = imagegrabscreen();
imagepng($img, 'screenshot.png');
?>
imagegrabwindow - Grabs a window or its client area using a windows handle (HWND property in COM instance)
<?php
$Browser = new COM('InternetExplorer.Application');
$Browserhandle = $Browser->HWND;
$Browser->Visible = true;
$Browser->Fullscreen = true;
$Browser->Navigate('http://www.stackoverflow.com');
while ($Browser->Busy) {
    com_message_pump(4000);
}
$img = imagegrabwindow($Browserhandle, 0);
$Browser->Quit();
imagepng($img, 'screenshot.png');
?>
Edit: Note, these functions are available on Windows systems ONLY!

If you don't want to use any third-party tools, I have come across a simple solution that uses the Google PageSpeed Insights API.
You just need to call its API with the parameter screenshot=true.
https://www.googleapis.com/pagespeedonline/v1/runPagespeed?url=https://stackoverflow.com/&key={your_api_key}&screenshot=true
For the mobile site view, also pass &strategy=mobile in the parameters:
https://www.googleapis.com/pagespeedonline/v1/runPagespeed?url=http://stackoverflow.com/&key={your_api_key}&screenshot=true&strategy=mobile
DEMO.

You can use a simple headless browser like PhantomJS to grab the page.
You can also use PhantomJS with PHP.
Check out this little PHP script that does this: https://github.com/microweber/screen
And here is the API: http://screen.microweber.com/shot.php?url=https://stackoverflow.com/questions/757675/website-screenshots-using-php

There are a lot of options, and they all have their pros and cons. Here is a list of options ordered by implementation difficulty.
Option 1: Use an API (the easiest)
ApiFlash (based on chrome)
EvoPDF (has an option for html)
Grabzit
...
Pros
Execute Javascript
Near perfect rendering
Fast when caching options are correctly used
Scale is handled by the APIs
Precise timing, viewport, ...
Most of the time they offer a free plan
Cons
Not free if you plan to use them a lot
Option 2: Use one of the many available libraries
dom-to-image
wkhtmltoimage (included in the wkhtmltopdf tool)
phpwkhtmltopdf
...
Pros
Conversion is quite fast most of the time
Cons
Bad rendering
Does not execute javascript
No support for recent web features (FlexBox, Advanced Selectors, Webfonts, Box Sizing, Media Queries, HTML5 tags...)
Sometimes not so easy to install
Complicated to scale
Option 3: Use PhantomJs and maybe a wrapper library
PhantomJs
php-phantomjs (php wrapper library for PhantomJs)
...
Pros
Execute Javascript
Quite fast
Cons
Bad rendering
PhantomJs has been deprecated and is not maintained anymore.
No support for recent web features (FlexBox, Advanced Selectors, Webfonts, Box Sizing, Media Queries, HTML5 tags...)
Complicated to scale
Not so easy to make it work if there are images to be loaded ...
Option 4: Use Chrome Headless and maybe a wrapper library
Chrome Headless
chrome-devtools-protocol
puphpeteer
...
Pros
Execute Javascript
Near perfect rendering
Cons
Not so easy to have exactly the wanted result regarding:
page load timing
proxy integration
auto scrolling
...
Complicated to scale
Quite slow and even slower if the html contains external links
Disclaimer: I'm the founder of ApiFlash. I did my best to provide an honest and useful answer.
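For Option 4 without a wrapper library, headless Chrome can be driven straight from the command line. A rough sketch, assuming the binary is called google-chrome on your system (it may be chromium or chromium-browser instead); the function name is made up:

```php
<?php
// Build a headless Chrome screenshot command.
// Assumption: the binary is called "google-chrome"; adjust for your system.
function buildChromeScreenshotCommand(
    string $url,
    string $outputFile,
    int $width = 1280,
    int $height = 800,
    string $binary = 'google-chrome'
): string {
    return implode(' ', [
        escapeshellarg($binary),
        '--headless',
        '--hide-scrollbars',
        sprintf('--window-size=%d,%d', $width, $height),
        escapeshellarg('--screenshot=' . $outputFile),
        escapeshellarg($url),
    ]);
}

echo buildChromeScreenshotCommand('https://stackoverflow.com/', '/tmp/shot.png'), PHP_EOL;
// To run it: exec($command, $lines, $exitCode); page load timing is not handled here.
```

This captures whatever is rendered when Chrome decides the page is loaded, which is exactly the timing problem listed in the cons above.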

Well, PhantomJS is a browser that can easily be installed on a server and integrated with PHP. You can find the code at WDudes. They have included a lot more features, like specifying the image size, caching, downloading as a file or displaying in an img src, etc.
<img src="screenshot.php?url=google.com" />
URL Parameters
Width and Height: screenshot.php?url=google.com&w=1000&h=800
With cropping:
screenshot.php?url=google.com&w=1000&h=800&clipw=800&cliph=600
Disable cache and load a fresh screenshot:
screenshot.php?url=google.com&cache=0
To download the image: screenshot.php?url=google.com&download=true
You can see the tutorial here: Capture Screenshot of a Website using PHP without API

CutyCapt saves web pages to most image formats (jpg, png, ...). Download it from Synaptic; it works much better than wkhtmltopdf.

I finally set this up using microweber/screen, as proposed by @boksiora.
Initially, when I tried the link mentioned above, this is what I got:
Please download this script from here https://github.com/microweber/screen
I'm on Linux, so if you want to run it, you may need to adjust my steps to your environment.
Here are the steps I ran in my shell in the DOCUMENT_ROOT folder:
$ sudo wget https://github.com/microweber/screen/archive/master.zip
$ sudo unzip master.zip
$ sudo mv screen-master screen
$ sudo chmod +x screen/bin/phantomjs
$ sudo yum install fontconfig
$ sudo yum install freetype*
$ cd screen
$ sudo curl -sS https://getcomposer.org/installer | php
$ sudo php composer.phar update
$ cd ..
$ sudo chown -R apache screen
$ sudo chgrp -R www screen
$ sudo service httpd restart
Point your browser to screen/demo/shot.php?url=google.com. When you see the screenshot, you are done. Discussion of more advanced settings is available here and here.

There are many open source projects that can generate screenshots. For example PhantomJS, webkit2png etc
The big problem with these projects is that they are based on older browser technology and have problems rendering many sites, especially sites that use webfonts, flexbox, svg and various other additions to the HTML5 and CSS spec over the last couple of months/years.
I've tried a few of the third party services, and most are based on PhantomJS, meaning they also produce poor quality screenshots. The best third party service for generating website screenshots is urlbox.io. It is a paid service, although there is a free 7-day trial to test it out without committing to any paid plan.
Here is a link to the documentation, and below are simple steps to get it working in PHP with composer.
// 1 . Get the urlbox/screenshots composer package (on command line):
composer require urlbox/screenshots
// 2. Set up the composer package with Urlbox API credentials:
$urlbox = UrlboxRenderer::fromCredentials('API_KEY', 'API_SECRET');
// 3. Set your options (all options such as full page/full height screenshots, retina resolution, viewport dimensions, thumbnail width etc can be set here. See the docs for more.)
$options['url'] = 'example.com';
// 4. Generate the Urlbox url
$urlboxUrl = $urlbox->generateUrl($options);
// $urlboxUrl is now 'https://api.urlbox.io/v1/API_KEY/TOKEN/png?url=example.com'
// 5. Now stick it in an img tag, when the image is loaded in browser, the API call to urlbox will be triggered and a nice PNG screenshot will be generated!
<img src="$urlboxUrl" />
For example, here's a full-height screenshot of this very page:
https://api.urlbox.io/v1/ca482d7e-9417-4569-90fe-80f7c5e1c781/8f1666d1f4195b1cb84ffa5f992ee18992a2b35e/png?url=http%3A%2F%2Fstackoverflow.com%2Fquestions%2F757675%2Fwebsite-screenshots-using-php%2F43652083%2343652083&full_page=true

I'm on Windows so I was able to use the imagegrabwindow function after reading the tip on here from stephan. I added in cropping (to get rid of the Browser header, scroll bars, etc.) and resizing to get a final image. Here's my code. Hope that helps someone.

I used bluga. The api allows you to take 100 snapshots a month without paying, but sometimes it uses more than 1 credit for a single page. I just finished upgrading a drupal module, Bluga WebThumbs to drupal 7 which allows you to print a thumbnail in a template or input filter.
The main advantage to using this api is that it allows you to specify browser dimensions in case you use adaptive css, so I am using it to get renderings for the mobile and tablet layout as well as the regular one.
There are api clients for the following languages:
PHP,
Python,
Ruby,
Java,
.Net C#,
Perl
and Bash (the shell script looks like it requires perl)

It all depends on how you wish to take the screenshot.
You could do this via PHP, using a web service to get the image for you.
grabz.it has a web service to do just this; here's an article showing a simple example of using the service.
http://www.phpbuilder.com/articles/news-reviews/miscellaneous/capture-screenshots-in-php-with-grabzit-120524022959.html

There are some ways in which you can achieve this in PHP, but realistically it's better to delegate this to a non-PHP based API which you can build yourself, or you can pay for. Many people have already listed screenshot APIs in the answers, and you can use any of those to achieve this. My own screenshot API is extremely well tested and covers many rendering cases that most APIs don't cover, but for most people, this is overkill, honestly.
My recommendation is to build your own API using Puppeteer, which is the canonical solution nowadays to build screenshot solutions. My service is built on top of Puppeteer and it really works well for most basic use cases.
You can build a serverless Puppeteer solution on AWS or GCP using something like https://www.npmjs.com/package/chrome-aws-lambda, which is an excellent serverless package for Puppeteer that comes pre-loaded with Chromium.

You can use https://grabz.it solution.
It's got a PHP API which is very flexible and can be called in different ways such as from a cronjob or a PHP web page.
In order to implement it you will need to first get an app key and secret and download the (free) SDK.
And an example for implementation. First of all initialization:
include("GrabzItClient.class.php");
// Create the GrabzItClient class
// Replace "APPLICATION KEY", "APPLICATION SECRET" with the values from your account!
$grabzIt = new GrabzItClient("APPLICATION KEY", "APPLICATION SECRET");
And a screenshot example:
// To take an image screenshot
$grabzIt->URLToImage("http://www.google.com");
// Or to take a PDF screenshot
$grabzIt->URLToPDF("http://www.google.com");
// Or to convert online videos into animated GIF's
$grabzIt->URLToAnimation("http://www.example.com/video.avi");
// Or to capture table(s)
$grabzIt->URLToTable("http://www.google.com");
Next is saving. You can use one of the two save methods: Save if a publicly accessible callback handler is available, and SaveTo if not. Check the documentation for details.

After a lot of searching on the web, I found this.
PPTRAAS: a free tool that captures a screenshot when you pass your URL as a parameter.
They provide multiple options, all available by simply hitting their URL.
Get a full page screenshot
https://pptraas.com/screenshot?url={YOUR URL HERE}
Get a page screenshot of a specific size
https://pptraas.com/screenshot?url={YOUR URL HERE}&size=400,400
One can even convert the page to PDF
https://pptraas.com/pdf?url={YOUR URL HERE}
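Since the page URL goes inside a query string, it should be URL-encoded. A small sketch of building these URLs from PHP; the helper name is made up, and the endpoints and the size parameter are just the ones listed above (I can't promise the service is still running):

```php
<?php
// Build a pptraas.com request URL with a properly encoded target URL.
// $endpoint is "screenshot" or "pdf"; $size is an optional [width, height] pair.
function buildPptraasUrl(string $endpoint, string $pageUrl, ?array $size = null): string
{
    $params = ['url' => $pageUrl];
    if ($size !== null) {
        $params['size'] = $size[0] . ',' . $size[1];
    }
    return 'https://pptraas.com/' . $endpoint . '?' . http_build_query($params);
}

echo buildPptraasUrl('screenshot', 'https://stackoverflow.com/', [400, 400]), PHP_EOL;
// Saving the result would then be something like:
// file_put_contents('shot.png', file_get_contents(buildPptraasUrl('screenshot', $pageUrl)));
```

Note that http_build_query also encodes the comma in size as %2C, which a server decodes back to the same value.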

Not directly. Software such as Selenium has features like this and can be controlled by PHP, but it has other dependencies (such as running its Java-based server on the computer with the browser you want to screenshot).

You can use CutyCapt.
wkhtml is deprecated and renders pages like an old browser.

I've found this to be the best and easiest tool around: ScreenShotMachine. It's a paid service, but you get 100 free screenshots and you can buy another 2,000 for (about) $20, so it's a pretty good deal. It has a very simple usage, you just use a URL, so I wrote this little script to save a file based on it:
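<?php
// Fetch the screenshot bytes from the API (replace {mykey} with your own key).
$image = file_get_contents("http://api.screenshotmachine.com/?key={mykey}&url=https://stackoverflow.com&size=X");
if ($image === false) {
    die("could not fetch the screenshot");
}
// Write the bytes to disk; the snapshots/ directory must already exist.
if (file_put_contents("snapshots/stack.jpg", $image) === false) {
    die("could not save the file");
}
die("saved file!");
?>
They have very good documentation here, so you should definitely take a look.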

Related

extract images from PDF with PHP

The thing is that the client wants to upload a pdf with images as a way of batch processing multiple images at once.
I already looked around, and out of the box PHP can't read PDFs.
What are my alternatives?
I already know the host has not installed ImageMagick or any PDF library, and the exec function is disabled. That's basically leaving me with nothing to work with, I guess?
Does anyone know if there is an online service that can do this, with an API of sorts?
Thanks in advance.
AFAIK, there is no PHP module to do it. There is a command line tool, pdfimages (part of xpdf). For reference, here's how that works:
pdfimages -j source.pdf image
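Which will extract all images from source.pdf as image-000.jpg, image-001.jpg, etc. Note that with -j only images stored as JPEG (DCT) inside the PDF come out as .jpg files; any other images are written as PPM or PBM files.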
Possible Options
Being a command line tool, you need exec (or system, passthru, any of the command executing functions built into PHP). As your environment doesn't have that, I see four options:
Beg that exec be turned on for you (your hosting provider can limit what you can exec to a single command)
Change the design -- how about a ZIP upload?
Roll your own, using the source code of pdfimages as a model
Let pdfimages do the heavy lifting, by running it on a remote host you do control
Regarding #3, rolling your own, I don't think rolling your own, to solve a very narrow definition of requirements, would be too difficult. I seem to recall that the image boundaries in PDF are well defined: just read in the file to a boundary, cut to the end of the boundary, base64_decode, and write to a file -- repeat. However, that may be too much...
If rolling your own is too complicated, then option #4 is kind of like what Joel Spolsky describes for working with complicated Excel objects (see the numbered list under the bold heading "Let Office do the heavy work for you").
Find a cheap hosting environment (e.g. Amazon EC2) that lets you exec and curl
Install pdfimages
Write a PHP script that takes a URL to a PDF, curl opens that PDF, writes it to disk, passes it to pdfimages, then returns the URL to the resulting images.
An example exchange could look like this:
GET http://www.cheaphost.com/pdfimages.php?extract=http://www.limitedhost.com/path/to/uploaded.pdf
Content-type: text/html
<html>
<body>
<ul>
<li>http://www.cheaphost.com/pdfimages.php?retrieve=ab9895v/image-000.jpg</li>
<li>http://www.cheaphost.com/pdfimages.php?retrieve=ab9895v/image-001.jpg</li>
</ul>
</body>
</html>
So your single pdfimages.php script (running on the host with the exec functionality) can both extract images, and give you access to the extracted images. When extracting, it reads a PDF you tell it, runs pdfimages on it, and gives you back a list of URL to call to retrieve the extracted images. When retrieving, it just gives you back a straight image.
You would need to deal with cleanup, perhaps the thing to do would be to delete the image after retrieval. You would also need to handle security -- don't know what's in these images, but the content might need to be wrapped in SSL and other precautions taken.
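The extract side of such a pdfimages.php could be sketched roughly as below. This is only an illustration: the function names, the working directory layout and the retrieve URL prefix are assumptions modelled on the example exchange, and downloading the PDF with curl is left out.

```php
<?php
// Command that extracts the images of $pdfPath into files named <imageRoot>-NNN.*
// Assumes pdfimages is installed on the host that allows exec.
function buildPdfimagesCommand(string $pdfPath, string $imageRoot): string
{
    return sprintf('pdfimages -j %s %s', escapeshellarg($pdfPath), escapeshellarg($imageRoot));
}

// List the extracted files in $dir as retrieval URLs, sorted by name.
// $retrieveBase is a hypothetical URL prefix, as in the example exchange.
function extractedImageUrls(string $dir, string $retrieveBase): array
{
    $files = glob(rtrim($dir, '/') . '/image-*') ?: [];
    sort($files);
    return array_map(
        fn(string $file): string => $retrieveBase . basename($dir) . '/' . basename($file),
        $files
    );
}

echo buildPdfimagesCommand('/tmp/ab9895v/uploaded.pdf', '/tmp/ab9895v/image'), PHP_EOL;
// After exec() of that command:
// extractedImageUrls('/tmp/ab9895v', 'http://www.cheaphost.com/pdfimages.php?retrieve=');
```

Using one random directory per upload (like ab9895v above) keeps extractions from different PDFs apart and makes cleanup a single directory delete.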
You can use pdfimages and install it this way:
apt install poppler-utils
Then use it this way to get all the images as PNG files:
pdfimages -png mypdf.pdf image
Images will be placed in the same folder as image-000.png, image-001.png, etc.
There are many options available, including some to change the output format, more information here.
I hope this helps!

HTML to picture ( php parser for html2jpg / html2pic / html2img )

I need to make thumbnails from html (and css) code. Similar to flash's AIR1 HTMLLoader.
Is there a php class or php script that does that?
If you have access to the command line from PHP (via exec() or shell_exec()) you can check out PhantomJS, a headless WebKit browser with a JavaScript API. I use this to do exactly what you're describing:
I generate an HTML file locally
I pass the path of that HTML file and an output path for the image (in my case a PNG) to a bash script that executes a call to PhantomJS (there are great examples when you download the package)
I serve up the generated image
I've tried a LOT of potential solutions and spent many hours on this problem and PhantomJS is by far the easiest I've found. There is a bit of lag waiting for the headless browser to start up, but from what I understand the latest version allows you to keep it running on a port of your choosing. I haven't been able to try this yet. You'll need GhostDriver for this as well. Check out the 1.8 Release Notes for more info.
Good luck!

how to scan image on php or javascript?

I don't know whether this is possible.
I have a project in PHP.
Can we scan an image via a scanner from PHP or JavaScript? Is there any way to do that?
Assuming that the scanner is connected to the server where PHP is being executed:
Currently there are some good solutions for scanning images through PHP by making use of the open source SANE scanning software:
SCANPHP
A PHP Lightweight Scanning GUI which makes use of the Open Source SANE Scanning software. The PHP GUI can be installed on any Web Server as long as PHP can be run. PHP calls the scanimage command in order to provide the scan. Post scanning, the image is piped "|" to gocr / pnmtojpeg in order to provide the acquired file.
PHPSANE
phpSANE is a web-based frontend for SANE written in HTML/PHP so you can scan with your web-browser. It also supports OCR.
A code example would be:
exec("scanimage --mode Gray --resolution 150 | pnmtojpeg > /tmp/image.jpg");
You can refer to the SANE Project Homepage for further information on options available, installation steps, etc.
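If the output path or resolution comes from user input, it is safer to build that command with escaping. A minimal sketch, assuming SANE (scanimage) and Netpbm (pnmtojpeg) are installed on the server; the function name is made up, and which --mode values are valid depends on your scanner backend:

```php
<?php
// Build the scanimage | pnmtojpeg pipeline from the example above.
// Assumes scanimage and pnmtojpeg are installed on the server.
function buildScanCommand(string $outputFile, int $resolution = 150, string $mode = 'Gray'): string
{
    return sprintf(
        'scanimage --mode %s --resolution %d | pnmtojpeg > %s',
        escapeshellarg($mode),
        $resolution,
        escapeshellarg($outputFile)
    );
}

echo buildScanCommand('/tmp/image.jpg'), PHP_EOL;
// To run it: exec(buildScanCommand('/tmp/image.jpg'), $lines, $exitCode);
```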
It's possible, but only using third-party plug-ins or applets, most of which are not free and limited to a platform (PC / Windows mostly, and some even to Internet Explorer, although there are ActiveX wrappers for other browsers, too.)
Check out the answers to this question. They should give you a good overview about what's possible.
It is impossible to use a scanner from JavaScript on the client unless you have a special browser interface for that. No major browser has built-in support for it; it is only possible via plugins.
Here is some more info:
http://www.ciansoft.com/samples/tcxbrowser.htm
http://www.chestysoft.com/ximage/twainupload.asp

Download web page with images and stylesheets and (optionally) E-mailing it

I need to make snapshots of web pages programmatically using PHP and get them into a HTML E-Mail.
I tried wget --page-requisites. It downloads everything all right, but it doesn't change the HTML page's source code to point to the downloaded files rather than the on-line originals. Also, that HTML is of course a long way from being displayed properly in a HTML E-Mail.
I am interested to know whether there are ready-made solutions for this. I would already be happy with a solution that takes a HTML snapshot and changes the HTML accordingly. Being able to E-Mail it would be the icing on the cake.
I control the web pages being snapshot, so I have the possibility to adjust the content to optimize the results.
My server-side platform is PHP but with very liberal settings, I can execute things like wget and Perl scripts from within PHP. I do however not have root access and can not install additional packages or programs.
The task is to make a snapshot of a product page each time somebody places an order, so there is documentation about what the page looked like at the time.
wget has a -k (--convert-links) option, which will convert both links and references to embedded content (like images). See e.g. wget advanced use (also here).
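Called from PHP, that could look something like the sketch below. It assumes wget is available on the server and that PHP is allowed to exec it; the function name and target directory are just for illustration.

```php
<?php
// Build a wget command that downloads a page with its images and stylesheets
// and rewrites the links in the saved HTML to point at the local copies.
function buildWgetSnapshotCommand(string $url, string $targetDir): string
{
    return sprintf(
        'wget --page-requisites --convert-links --directory-prefix=%s %s',
        escapeshellarg($targetDir),
        escapeshellarg($url)
    );
}

echo buildWgetSnapshotCommand('http://www.example.com/product/42', '/tmp/snapshots/order-1001'), PHP_EOL;
// To run it: exec($command, $lines, $exitCode);
```

Using one target directory per order keeps each snapshot separate, which fits the "what did the page look like at order time" use case.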
For the email part of your question, I'm sure you can use one of the existing libraries. For example, PHP has a PEAR package (I do not remember the exact name) to handle HTML emails; I'm pretty sure both Perl and Python have something similar.
In this case, you are trying to mirror a website using wget. A simple solution is to use httrack, which is a simple command-line tool. It's very powerful and configurable, try it!
The httrack website presents a GUI, but you don't need it, all is possible from the command-line (or from PHP).

Web Page Screenshots with PHP?

I know there is not a direct way to take a screenshot of a web page with PHP. What would be the most straightforward way to accomplish this? Are there any command line tools that could do this that I might be able to execute from a PHP script? (I'm thinking of something that would run in a 'NIX OS, OS X and/or Linux in particular.)
Edit: Or maybe some sort of web service I could access via SOAP or REST or ...
Edit #2: I found a related question discussing the CLI option, but I'd still be open to other methods if anyone knows of anything.
See webkit2png for an OSX commandline program that does this.
The page also mentions Linux alternatives.
[edit]: wkhtmltoimage is the newest kid in town, and it works better than anything else I've ever used.
[edit2]: As of 2014, PhantomJS seems to be the way to go, as it has the newest webkit version of the alternatives I know about.
[edit3]: In 2019, Puppeteer is the way to go. Official headless chrome, always up to date.
You can use the GD functions imagegrabscreen() or imagegrabwindow() to take a screenshot, but they're only available on Windows at the moment.
http://www.thumbshots.org/
html2ps does a decent job for relatively simple pages, and it requires very little in terms of external binaries, meaning it's very easy to install/use. If you control the pages you'll be capturing, then you can ensure that they'll render appropriately in html2ps. If you're hoping to capture arbitrary URLs, however, I'm not sure that the PHP port of HTML2PS is up to the task. It's also not the fastest thing in the world (expect render times in the seconds for complex pages), but that doesn't really matter for some applications.
Not sure if this would be enough for you, because it has some added stuff there, but would be worth giving it a try: http://www.snap.com
It's possible to get a base64 encoded image of a site by using the Google pagespeed api.
You can specify desktop or mobile views, but you are limited to an image of a certain size.
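As far as I remember, the older API versions returned the image as URL-safe base64 (with - and _ instead of + and /) in a screenshot.data field of the JSON response. Treat that field name and encoding as an assumption and check the response you actually get; newer API versions are shaped differently. A sketch of the decoding step:

```php
<?php
// Decode URL-safe base64 (as assumed for the PageSpeed screenshot field)
// into raw image bytes. Returns null if the data is not valid base64.
function decodePageSpeedScreenshot(string $data): ?string
{
    $standard = strtr($data, '-_', '+/'); // map URL-safe alphabet back to standard base64
    $bytes = base64_decode($standard, true);
    return $bytes === false ? null : $bytes;
}

// Hypothetical usage, assuming $json holds the decoded API response:
// $bytes = decodePageSpeedScreenshot($json['screenshot']['data']);
// if ($bytes !== null) { file_put_contents('screenshot.jpg', $bytes); }
```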
