PHP: Get all CSS files of an HTML web page - php

I'm trying to get all the CSS files referenced by an HTML page, given its URL.
I know that getting the HTML itself is easy: PHP's file_get_contents function does that.
The question is whether I can easily search the HTML behind a URL and get the files, or the contents, of all related CSS files.
Note: I want to build an engine that collects a lot of CSS files, which is why reading the source by hand is not enough.
Thanks,

You could try using http://simplehtmldom.sourceforge.net/ for HTML parsing.
require_once 'SimpleHtmlDom/simple_html_dom.php';
// Include the scheme; without it the string is treated as a local file path
$url = 'http://www.website-to-scan.com';
$website = file_get_html($url);
// You might need to tweak the selector based on the website you are scanning
// Example: some websites don't set the rel attribute
// others might use less instead of css
//
// Some other options:
// link[href] - Any link with a href attribute (might get favicons and other resources but should catch all the css files)
// link[href*=".css"] - Might miss files that don't have a .css extension but return valid CSS (e.g. .less, .php, etc.)
// link[type="text/css"] - Might miss stylesheets without this attribute set
foreach ($website->find('link[rel="stylesheet"]') as $stylesheet)
{
    $stylesheet_url = $stylesheet->href;
    // Do something with the URL
}
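If you would rather not depend on a third-party library, roughly the same selection can be sketched with PHP's bundled DOM extension. This is only a minimal sketch: the function name extract_stylesheet_urls and the sample markup are made up for illustration, and the returned href values are exactly as written in the page, so relative URLs still have to be resolved against the page URL before you fetch them.

```php
<?php
// Hypothetical helper (not part of any library): returns the href of every
// <link> whose rel attribute contains the token "stylesheet".
function extract_stylesheet_urls(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = [];
    foreach ($doc->getElementsByTagName('link') as $link) {
        // rel may hold several space-separated tokens, e.g. "alternate stylesheet"
        $rel = preg_split('/\s+/', strtolower(trim($link->getAttribute('rel'))));
        $href = $link->getAttribute('href');
        if (in_array('stylesheet', $rel, true) && $href !== '') {
            $urls[] = $href;
        }
    }
    return $urls;
}

// Inline sample so the sketch runs without a network request.
$sample = '<html><head>'
    . '<link rel="stylesheet" href="/css/main.css">'
    . '<link rel="icon" href="/favicon.ico">'
    . '<link rel="Stylesheet" href="theme.php?v=2">'
    . '</head><body></body></html>';

print_r(extract_stylesheet_urls($sample));
```

For a live page you would pass in the result of file_get_contents($url) instead of $sample.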

You need to parse the HTML tags, looking for references to CSS files. You can do it, for example, with preg_match_all and a matching regex.
A regex that finds such files might look like this (the first capture group holds the URL):
<link[^>]+href="([^"]+\.css[^"]*)"[^>]*>

Related

PHP include svg assets in cake php

Does CakePHP have a way to include SVG assets with PHP? I know how to use helpers to create an <img> tag pointing to the SVG for the img's src attribute, but I'd like to actually include the file rather than reference it within an <img> tag.
No, CakePHP doesn't ship with such functionality. You'll have to come up with something on your own, or use one of the many PHP-based SVG inliners out there; it should be easy enough to wrap that in a custom helper.
If you just need to embed the file, you could even stick to simply reading and outputting the file contents with the XML declaration and doctype removed, something like:
$svg = file_get_contents($path);
$svg = preg_replace('/^<\?xml.*?\?>\s*(<!DOCTYPE.*?>\s*)?/is', '', $svg);
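In the end, this was a silly question. You can of course just use <?php include 'img/thefile.svg' ?>, assuming your SVG is in the webroot/img folder. Note that include runs the file through the PHP parser, so an SVG that starts with an <?xml ... ?> declaration will cause a parse error when short_open_tag is enabled. If the SVG is not in a publicly accessible folder, I would look into creating a custom helper, as another post suggested.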

I want to get data from another website and display it on mine, but with my style.css

My school has a very annoying way of showing my timetable: you have to click through 5 links to get to it.
This is the link for my class (it updates weekly without the link changing):
https://webuntis.a12.nl/WebUntis/?school=roc%20a12#Timetable?type=1&departmentId=0&id=2147
I want to display the content of that page on my website, but with my own stylesheet.
I don't mean this:
<?php
$homepage = file_get_contents('http://www.example.com/');
echo $homepage;
?>
or an iframe....
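I think this is better done with jQuery and AJAX. You can have jQuery load the target page, use selectors to pull out what you need, then attach it to your document tree. You should then be able to style it any way you like. Keep in mind that browsers block cross-origin AJAX requests unless the target site allows them, so the page may have to be fetched through a script on your own server.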
I would recommend the cURL library: http://www.php.net/manual/en/curl.examples.php
You will get the whole HTML document, though, so you have to extract the part of the page you want to display.
You'd probably read the whole page into a string variable (for example with file_get_contents, as you mentioned) and parse the content. There are a couple of ways to do that:
Regular expressions
Walking the DOM tree (e.g. using PHP's DOMDocument classes)
After that, you'd most likely strip all the style="..." and class="..." attributes, or replace them with your own.
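The DOM-walking option could be sketched roughly as follows. Treat it as an illustration under assumptions: the function name extract_unstyled_fragment, the element id "timetable" and the sample markup are all invented here, and the real page may use different ids or build its content with JavaScript, in which case the fetched HTML will not contain it.

```php
<?php
// Hypothetical helper: returns the element with the given id, with every
// style and class attribute removed, or null if no such element exists.
// Assumes $id contains no double quote (it is placed inside an XPath string).
function extract_unstyled_fragment(string $html, string $id): ?string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // tolerate sloppy markup
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $node = $xpath->query(sprintf('//*[@id="%s"]', $id))->item(0);
    if ($node === null) {
        return null;
    }

    // Visit the element itself and everything below it.
    foreach ($xpath->query('descendant-or-self::*', $node) as $element) {
        $element->removeAttribute('style');
        $element->removeAttribute('class');
    }

    // Serialize only the selected element, not the whole document.
    return $doc->saveHTML($node);
}

// Inline sample standing in for the fetched page.
$page = '<html><body>'
    . '<div id="menu" class="nav">Menu</div>'
    . '<div id="timetable" class="tt" style="color:red">'
    . '<p class="row" style="margin:0">Monday 09:00 Maths</p>'
    . '</div></body></html>';

$fragment = extract_unstyled_fragment($page, 'timetable');
echo $fragment, "\n";
```

You would then echo the fragment inside your own page, where your own stylesheet applies to it.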

Creating a personalization engine with php

I am new to PHP and I want to create a PHP engine that changes the content of a web page using data in MySQL, for example reordering the navigation links on a page so that the ones with the highest click count come first. I am not sure how PHP will read the HTML file and change the elements in the HTML file and also output the HTML file with the changes. Is this possible?
I am not quite sure why you would want to generate the HTML, read it, change it and then output it. It seems a lot easier to just generate it the way you want in the first place.
I am not sure how PHP will read the HTML file and change the elements in the HTML file and also output the HTML file with the changes. Is this possible?
You could use file_get_contents:
$html = file_get_contents($url);
Then use an HTML parser like Simple HTML DOM Parser, change what you want, and output the result.
If you want to modify the HTML structure, use ganon, an HTML DOM parser for PHP:
include('path/ganon.php');
// Parse the google code website into a DOM
$html = file_get_dom('http://code.google.com/');
foreach ($html('p[class]') as $element) {
    echo $element->class, "<br>\n";
}
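To make the navigation example from the question concrete, here is a rough sketch using PHP's bundled DOM extension rather than either of the libraries above. Everything named here is an assumption for illustration: the function reorder_nav, the list id "nav", and the $clicks array, which stands in for whatever a MySQL query would return (href mapped to click count).

```php
<?php
// Hypothetical helper: moves the <li> children of the list with id $listId
// so that the links with the highest click count come first, and returns
// the markup of that list. Returns the input unchanged if the list is missing.
function reorder_nav(string $html, string $listId, array $clicks): string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $list = $xpath->query(sprintf('//*[@id="%s"]', $listId))->item(0);
    if ($list === null) {
        return $html;
    }

    // Collect the direct <li> children first, then reorder them.
    $items = [];
    foreach ($list->childNodes as $child) {
        if ($child instanceof DOMElement && $child->tagName === 'li') {
            $items[] = $child;
        }
    }

    // Click count of the first link inside an item; unknown links count as 0.
    $score = function (DOMElement $li) use ($clicks): int {
        $a = $li->getElementsByTagName('a')->item(0);
        return $a ? (int) ($clicks[$a->getAttribute('href')] ?? 0) : 0;
    };

    usort($items, function (DOMElement $x, DOMElement $y) use ($score) {
        return $score($y) <=> $score($x); // descending
    });

    // appendChild() moves a node that is already in the tree.
    foreach ($items as $li) {
        $list->appendChild($li);
    }
    return $doc->saveHTML($list);
}

$nav = '<ul id="nav">'
    . '<li><a href="/home">Home</a></li>'
    . '<li><a href="/blog">Blog</a></li>'
    . '<li><a href="/shop">Shop</a></li>'
    . '</ul>';
$clicks = ['/home' => 5, '/blog' => 40, '/shop' => 12]; // stand-in for MySQL data

$reordered = reorder_nav($nav, 'nav', $clicks);
echo $reordered, "\n";
```

As the first answer points out, if your own PHP generates the navigation anyway, sorting the links before printing them is simpler than parsing the finished HTML.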

Render HTML pages from private folder in PHP

I have a web site whose HTML pages are stored in a private folder. I want a PHP script that can read an HTML file and then push it to the browser.
My thought was to get the HTML file with PHP's file() function and then echo it to the browser. That works for the HTML content of the page. The images and the CSS do not come along, however.
I heard of a "render" function in IIS or ASP that renders the HTML content of a web page in a private folder and then sends the images in a binary format. Does PHP have something similar?
Currently I read the file as follows:
$htmlFile = file(PATHTOFILE);
echo(implode('',$htmlFile));
The reason we are trying to do this is to protect the URLs and information of the pages contained in this folder. The user will have to connect to the web service, and then the PHP script will push the HTML pages.
You can use the base tag to solve the problem of the files' relative paths, something like this:
$html = file_get_contents($url);
$html = str_replace('<head>', '<head><base href="FULL PATH OF DIR" />', $html);
echo $html;
CSS and images are not displayed because their paths in the HTML files are relative to the HTML files, right? And if you keep the CSS and images in the same private folder, how can you expect the user's browser to fetch them?
Indirectly: you would have to serve the CSS and images the same way you serve the HTML. But this means you have to replace all the paths in the HTML you display, which is quite absurd. In fact, we are now talking about some kind of proxy.
Why do you need it?
Anyway, echo file_get_contents(PATHTOFILE); is simpler than file() followed by implode().
Another option, if it is an <img /> tag and the image is also stored outside of the web root: point the src attribute at a script, like so:
src="get_image.php?file=thisfile.png" // add a $_GET if needed to distinguish files
then get_image.php:
$file = $_GET['file'];
// Validate $file here (e.g. with basename() or a whitelist); never pass raw user input to readfile()
header('Content-Type: image/png');
readfile($file);
exit;
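The security check is worth spelling out, because passing $_GET['file'] straight to readfile() lets a visitor request any file the web server can read. One way to do it is sketched below; the function name resolve_private_file and the directory /path/to/private are placeholders, not anything from the original setup.

```php
<?php
// Hypothetical helper: maps a user-supplied name to a file directly inside
// $baseDir. Returns the absolute path, or null if the file does not exist
// or would resolve to somewhere outside $baseDir.
function resolve_private_file(string $baseDir, string $requested): ?string
{
    $base = realpath($baseDir);
    if ($base === false) {
        return null;
    }

    // basename() drops any directory part, so "../../etc/passwd" becomes "passwd".
    $path = realpath($base . DIRECTORY_SEPARATOR . basename($requested));
    if ($path === false || !is_file($path)) {
        return null;
    }

    // Guard against symlinks that point outside the private folder.
    if (strpos($path, $base . DIRECTORY_SEPARATOR) !== 0) {
        return null;
    }
    return $path;
}

// Possible use inside get_image.php (placeholder directory):
// $path = resolve_private_file('/path/to/private', $_GET['file'] ?? '');
// if ($path === null) { http_response_code(404); exit; }
// header('Content-Type: ' . mime_content_type($path));
// readfile($path);
```

The same script can serve the CSS files as well, since mime_content_type() (from the fileinfo extension) picks the Content-Type per file instead of hard-coding image/png.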

How to replace unlinked images with a link to the image?

In PHP
I'm converting a large number of blog posts into something more mobile friendly. As these blog posts seem to have a lot of large images in them, I would like to use a regex or something similar to go through each article's HTML and replace any image that isn't currently linked with a link to that image. This will allow the mobile browser to display the article without any images, with a link in place of each image, which reduces the page download size.
Alternatively, if anyone knows of any PHP classes or functions that can make the job of formatting these posts easier, please suggest them.
Any help would be brilliant!
To parse HTML, use an HTML parser. Regex is not the correct tool to parse HTML or XML documents. PHP's DOM offers the loadHTML() function. See http://in2.php.net/manual/en/domdocument.loadhtml.php. Use the DOM's functions to access and modify img elements.
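A rough sketch of that DOM approach follows. The function name link_unlinked_images, the wrapper id "wrap" and the sample markup are invented for illustration; also note that loadHTML() assumes ISO-8859-1 unless the markup declares a charset, so UTF-8 posts may need extra handling.

```php
<?php
// Hypothetical helper: replaces every <img> that is not already inside an
// <a> with a link to the image, and returns the modified HTML fragment.
function link_unlinked_images(string $html): string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment so there is one known container to serialize afterwards.
    $doc->loadHTML('<div id="wrap">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Only images without an <a> ancestor; the XPath result is a snapshot,
    // so replacing nodes while looping over it is safe.
    foreach ($xpath->query('//img[not(ancestor::a)]') as $img) {
        $src = $img->getAttribute('src');
        $label = $img->getAttribute('alt') ?: basename($src);

        $a = $doc->createElement('a');
        $a->setAttribute('href', $src);
        $a->appendChild($doc->createTextNode($label));
        $img->parentNode->replaceChild($a, $img);
    }

    // Return the children of the wrapper, not the wrapper itself.
    $wrap = $xpath->query('//div[@id="wrap"]')->item(0);
    $out = '';
    foreach ($wrap->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$post = '<p><img src="/img/a.jpg" alt="Cat"></p>'
    . '<p><a href="/big.jpg"><img src="/img/b.jpg"></a></p>';

$mobile = link_unlinked_images($post);
echo $mobile, "\n";
```

Images that are already wrapped in a link are left alone, which matches the "not currently linked" condition in the question.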
How about doing it in jQuery instead of PHP? That way, it will work across different blogging software.
You can do something like...
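$(document).ready(function () {
    // Only images that are not already inside a link
    $('#content img').not('a img').each(function () {
        var imageUrl = $(this).attr('src');
        // Swap the image for a link that points to it
        $(this).replaceWith($('<a>').attr('href', imageUrl).text(imageUrl));
    });
});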
With a regexp, something like this:
$c = 0;
// $match[0] is the whole <img> tag, $match[1] its src; the counter caps the loop at 100 images
while (preg_match('/<img src="(.*?)">/', $html, $match) && $c++ < 100) {
    $html = str_replace($match[0], '<a href="' . $match[1] . '">Image ' . $c . '</a>', $html);
}
You can also use preg_replace (which saves the loop), but the loop is easy to extend, e.g. to create thumbnails with the image functions.
