Convert HTML to Image in PHP without shell

I want the option of converting HTML to an image and showing the result to the user. I would be creating an $html variable with PHP, and instead of displaying it using echo $html, I want to display it as an image so the user can save the file if they need to.
I was hoping there would be something as simple as $image = convertHTML2Image($html); :p Does that exist?!
Thanks!!

As Pekka says, turning HTML code into an image is the job of a full-blown web browser.
If you want to do this sort of thing, you therefore need to have a script that does the following:
Opens the page in a browser.
Captures the rendered page from the browser as a graphic.
Outputs that graphic to your user.
Traditionally, this would have been a tough task, because web browsers are typically driven by the user and not easy to automate in this way.
Fortunately, there is now a solution, in the form of PhantomJS.
PhantomJS is a headless browser, designed for exactly this kind of thing -- automated tasks that require a full-blown rendering engine.
It's basically a full browser, but without the user interface. It renders page content exactly as another browser would (it's based on WebKit, so results are similar to Chrome's), and it can be controlled by a script.
As it says on the PhantomJS homepage, one of its target use-cases is for taking screenshots or thumbnail images of websites.
(Another good use for it is automated testing of your site.)
Hope that helps.
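As a concrete sketch (the URL and filenames are just placeholders, and it assumes the phantomjs binary is installed), a minimal capture script looks like this:

```shell
# Write a minimal PhantomJS capture script to disk. The JS uses only
# documented PhantomJS APIs: require('webpage').create(), page.open(),
# page.render().
cat > capture.js <<'EOF'
var page = require('webpage').create();
page.viewportSize = { width: 1024, height: 768 };
page.open('https://example.com/', function (status) {
    if (status !== 'success') {
        console.log('Failed to load the page');
        phantom.exit(1);
    } else {
        page.render('screenshot.png');  // also supports .jpg, .gif and .pdf
        phantom.exit();
    }
});
EOF

# Run it only if phantomjs is available, so the sketch degrades gracefully.
if command -v phantomjs >/dev/null 2>&1; then
    phantomjs capture.js
else
    echo "phantomjs not installed; script written to capture.js"
fi
```

From PHP you would then launch this (e.g. via exec()) and send screenshot.png to the user; note that this does mean running an external process, which the original question hoped to avoid.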

This is not possible in pure PHP.
What you call "converting" is in fact a huge, non-trivial task: the HTML page has to be rendered. To do this in PHP, you'd have to rewrite an entire web browser.
You'll either have to use an external tool (which usually taps into a browser's rendering engine) or a web service (which does the same).

It is possible to convert HTML to an image; however, you must first convert it to PDF.

You may have a look at dompdf, a PHP library that converts an HTML file to a PDF.

Use wkhtmltopdf. It works like a charm and converts any page to PDF; a JPEG can then be obtained with a further conversion step.
http://code.google.com/p/wkhtmltopdf/
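As a hedged sketch, the companion tool wkhtmltoimage (shipped alongside wkhtmltopdf) can skip the PDF step entirely; the URL and filename are placeholders, and the tool must be installed:

```shell
# Render a page straight to a JPEG, no intermediate PDF needed.
# --quality sets JPEG compression quality (0-100).
wkhtmltoimage --format jpg --quality 85 https://example.com/ page.jpg
```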

Related

Script to take screenshot of webpage running on localhost

Is there any script that can automatically take a screenshot of this webpage (running on a server) on a daily basis and store the captured images?
First of all, let's define the task and understand its boundaries, because there is no simple and easy solution to your question.
To capture a screenshot of a web page, the page must first be rendered, which is quite a complicated process. You tagged PHP, so in short: you cannot do it using only PHP. I would recommend reading briefly about how the rendering of web pages works; this article is a great source: https://www.html5rocks.com/en/tutorials/internals/howbrowserswork/
Hence, you first need some combination of "services" that will render the page and then capture it as a bitmap (the graphic format does not matter). Then you can retrieve it from PHP (via REST or any other suitable way). Roughly speaking, you need some browser-like system (or the browser itself) that renders the page and returns a bitmap.
If you are looking for a simple, practical solution without much burden, you have several options:
For getting a screenshot of any remote page, you can use the paid API https://thumbnail.ws/ . It has a free tier with limits.
For getting a screenshot and other related thumbnail data, you can use Google's PageSpeed API. Example code can be found at https://gist.github.com/ajdruff/e6b69e3eb5a3bc1dc081
Use one of the available extensions for Google Chrome or Firefox (or write your own using JavaScript), then use the data from it.
There are many packages out there for this purpose; one of them, for example, is Screen.
Here's an example of how to use it:
Assuming you've already installed it
require './vendor/autoload.php';
use Screen\Capture;
$url = 'https://example.com'; // webpage you want to capture
$screenCapture = new Capture($url);
$screenCapture->save('./test'); // test is the name of the screenshot (default type is 'jpg')
In order to correctly take screenshots of webpages, you first need to make sure that the page renders correctly, and the most convenient way to do that is to use a real browser.
As others mentioned, there are different options:
You can use a paid API to do it (a waste of money, IMHO).
You can write your own code to do it (which is neither easy nor necessarily safe).
You can use ready-to-use tools (mostly command-line tools):
cutycapt (cutycapt --url=<URL> --out=<filename>)
firefox (firefox -screenshot <filename> <URL>)
wkhtmltoimage (part of the wkhtmltopdf tool)
Chrome (google-chrome --headless --disable-gpu --screenshot <URL>)
You can run your own server if you are doing screenshot in bulk (for example, https://github.com/filovirid/screenshot_server)
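As a concrete sketch of the headless Chrome option above (assumes Chrome 59 or later is installed; the URL and output name are placeholders):

```shell
# Take a 1280x1024 screenshot with headless Chrome; writes output.png
# to the current directory.
google-chrome --headless --disable-gpu \
    --window-size=1280,1024 \
    --screenshot=output.png \
    https://example.com/
```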
There is no ready-made script that automatically takes the screenshot for you, unless you develop software for it.
Alternatively, you can write a Python script that scrapes the data on the webpage daily and stores it in a file, or use the Selenium tool for this purpose.

Capturing image generated dynamically when scraping a page

I'm trying to capture some images from an old database.
When writing scrapers, I use ruby (but am comfortable with php as well) to directly open() a website and read its contents. I sometimes also use the script to call the appropriate curl ... command.
However, the database I'm scraping returns a page and then embeds the target image under a name made of a series of random numbers, assigned (I assume) by the server-side script. For example:
<img ... show_image.jsp?343523.jpg
However, I cannot call this show_image script directly (denied), it only works when embedded in the website as a whole.
Can I use curl, or do something within Ruby or PHP, to download an entire page (for example, 1929.2.14.aspx) in such a way that it includes the embedded image generated by show_image.jsp?343523.jpg?
If I simply curl the aspx file directly, I naturally just get the HTML. How might one save both the HTML and the embedded image via scripting, the way a browser's "web archive" feature does manually?
Any tips, links to tutorials, etc. appreciated...
You should probably be using Mechanize to scrape websites in Ruby. When you do, it will set cookies and the Referer header for you, so getting the image will be as easy as:
agent.get(image_url).save_as 'local_filename.jpg'
If the script (show_image.jsp, for example) is doing a simple referrer check, you may be able to work around it by writing your PHP (or Ruby) scraper so that it sets the referrer before making the GET:
curl --referer http://www.example.com http://www.example.com/show_image.jsp?bar.jpg
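If the server checks session cookies as well as the referrer, a two-step curl sketch would first load the page to fill a cookie jar, then request the image (hostnames and paths are placeholders based on the question):

```shell
# Step 1: fetch the embedding page, saving any session cookies it sets.
curl -c cookies.txt -o page.html http://www.example.com/1929.2.14.aspx

# Step 2: request the image with the cookie jar and a Referer header,
# so the request looks like it came from the embedding page.
curl -b cookies.txt --referer http://www.example.com/1929.2.14.aspx \
     -o 343523.jpg "http://www.example.com/show_image.jsp?343523.jpg"
```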

creating pdf from web page with SWF files

I am trying to generate a PDF from a web page that has pictures and SWF files.
The final PDF should include the pictures (each SWF should be converted into an image; the last frame is sufficient).
I am able to generate a PDF when only images are present, but I am stuck creating a PDF when the web page contains SWF files.
I've used wkhtmltopdf before to render PDFs programmatically from web sites. I'm not sure if it'll cope with SWF, but it may, since it uses a version of WebKit compiled into Qt.
You might be able to use wkhtmltopdf --enable-plugins, but according to this bug report it might not work with the Flash plugin (Java, however, does!): http://code.google.com/p/wkhtmltopdf/issues/detail?id=48
Another option is running a browser in headless mode, or on a virtual X server. Firefox 3 supposedly works if you use the "CommandLinePrint" extension.
Xvfb :2 -screen 0 1600x1200x24 &
firefox --display=localhost:2.0 -print http://flashgames.com -printmode pdf -printfile '/tmp/test.pdf'
Info taken from http://spielwiese.la-evento.com/xelasblog/archives/31-Headless-Firefox-als-HTML-to-PDF.html (in German, however).
There are a few more guides like this ("headless browser, HTML to PDF"). I would link to one of the dupes here on Stack Overflow, but I'm too lazy to search right now.
Since you want to output the target page as a PDF, I would look at using .rdlc (Report Definition Language Client). It is part of the Microsoft.Reporting namespace and is designed to work with ASP.NET. It is freely usable and redistributable.
In many cases the layout of a web page is not "printer friendly". By using this technique you can re-arrange the layout and spacing of the PDF output to a presentation that is more printer friendly.
This will not "directly" convert your page to a PDF, but rather allow you to adapt your page layout and data to a dataset and use that to build a report. That report can then be output programmatically at runtime using the ReportViewer control. If this approach interests you, let me know and I will be glad to help you get it set up and working.

Curl preg_match

We download images to our computers when we open webpages. For example: if a webpage has an image (image.jpg), our computer downloads it while we are browsing that page.
Some webpages use AJAX methods. For example: you don't see an image in the page's source code, yet your computer downloads one, because if you click a link on that page, AJAX will show that image...
Let me show an example:
<div id="ajax_will_load_image_here"></div>
Okay, how can PHP cURL see (or download) that image? cURL can't see the image when I try to use the preg_match function, yet the image actually exists. I want to download that image using PHP cURL. Any advice?
If I understand the question correctly, there is no convenient way of doing that.
Your crawler/spider would have to parse the website and evaluate javascript.
There are libraries for that but support is very limited.
There are, however, methods where an actual browser is used to evaluate the page (without displaying it, but with proper environment variables set, such as the resolution).
The generated source, including JavaScript DOM modifications, is then available.
This is, for example, how the Google search previews are generated.
But if you require user interaction it gets pretty specific and complicated.
I am sorry to disappoint you, but using curl and preg_match the old-school way, as we did when JavaScript was not yet so common, won't work.
However, for most legitimate use cases this is more than sufficient, and websites today are more and more designed to work without JavaScript, especially where content is meant to be crawled. That is a must in search engine optimization, and which website doesn't want that?
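When the image URL is present in the AJAX endpoint's response (rather than in the page itself), you can fetch that endpoint with curl and pull the URL out with a pattern. A sketch, using an inline string to stand in for the fetched response (the markup and image name are hypothetical):

```shell
# Stand-in for: html=$(curl -s "http://example.com/ajax_endpoint.php")
html='<div id="ajax_will_load_image_here"><img src="images/photo_42.jpg"></div>'

# Extract the image path the same way preg_match would; the result can
# then be downloaded with a second curl call.
echo "$html" | grep -oE 'src="[^"]+"' | sed 's/^src="//;s/"$//'
```

This prints images/photo_42.jpg; the real work is discovering which endpoint the page's JavaScript calls (your browser's network tab shows that).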

How can I create image from html using PHP?

Something like Painty, but with advanced options supporting div, font, font size, style, etc.
I would like to have a coupon design in HTML and output it as an image, preferably JPG, but Painty does not support those features.
You can find the Painty code I am currently using here: http://www.rabuser.info/painty.php
Thanks, and waiting for a reply.
Creating this with pure PHP is a bad idea; it will be slow as hell.
As far as I know, in production this is achieved with an external screenshot app and a standard browser run via exec() or a similar function.
There's khtml2png, which renders the whole page and takes a screenshot; however, it's a standalone executable (and it needs an X server or xvfb), so you need to be able to run it on your server (so probably not on a shared hosting). This may be a bit of an overkill, but it gives you complete control over the final appearance.
You could also use one of the HTML-to-PDF converters and then use ImageMagick to convert the PDF to JPEG.
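A sketch of that two-step pipeline (assumes wkhtmltopdf and ImageMagick are installed; the filenames are placeholders):

```shell
# Step 1: render the HTML coupon to a PDF.
wkhtmltopdf coupon.html coupon.pdf

# Step 2: rasterize the first PDF page to a JPEG with ImageMagick.
# -density controls the rendering resolution before rasterizing.
convert -density 150 "coupon.pdf[0]" -quality 90 coupon.jpg
```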
