Clean HTML Sourcecode with PHP - php

I'm looking for a way to keep the HTML code I output via PHP clean.
If you look into the source code, the result looks like this:
<section><div class="card">
<div class="card-body">
<h5 class="card-title">Special title treatment</h5>
<p class="card-text">With supporting text below as a natural lead-in</p>
Go somewhere </div>
</div>
</section><section><div class="card">
<div class="card-body">
<h5 class="card-title">Special title treatment</h5>
<p class="card-text">With supporting text below as a natural lead-in content.</p>
Go somewhere </div>
</div></section>
I want it to look like this:
<section>
<div class="card">
<div class="card-body">
<h5 class="card-title">Special title treatment</h5>
<p class="card-text">With supporting text below as a natural lead-in</p>
Go somewhere </div>
</div>
</section>
<section>
<div class="card">
<div class="card-body">
<h5 class="card-title">Special title treatment</h5>
<p class="card-text">With supporting text below as a natural lead-in</p>
Go somewhere </div>
</div>
</section>
this is my php output code:
ob_start();
include_once ROOT.'/global/header.php';
print $content_output; // the included files
include_once ROOT.'/global/footer.php';
$output = ob_get_contents();
ob_end_clean();
echo $output;
The reason for this is that I am building a scaffold where blocks are created for a website. For example the start page consists of block2, block7, block1 and block5. At the end the customer gets a clean HTML, which consists of the above mentioned blocks.

If your PHP fully renders the HTML, why would you want it to look good? It is not like any other developer is going to look inside the compiled HTML, right?
The browser does not care how your HTML is formatted, if it is valid HTML, it is valid HTML. This should also not affect the SEO of your webpage.
If you are writing HTML by hand inside PHP code, you should avoid echoing full HTML strings. You can do this by using inline PHP (the alternative control-structure syntax) as much as you can. For example:
<?php if ($condition): ?>
<h1><?= $test ?></h1>
<?php endif; ?>
This way you know PHP is not going to affect the indentation of the markup.

You can use DOMDocument to process & format HTML. DOMDocument is tough to use & much of it could use better documentation.
If all you want to do is pretty print the html, something like this should do what you need:
$html = '<div>Happy Coding todayyy</div>';
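// The constructor takes a version and an encoding, not markup; load the HTML separately.
$doc = new \DOMDocument();
$doc->loadHTML($html);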
$doc->formatOutput = true;
$cleanHtml = $doc->saveHTML();
You could also look for an HTML beautifier, but it doesn't look like there are any particularly mature projects for that.
I also want to add that running DOMDocument on every single request to format html adds additional overhead. More cpu cycles means more energy, so something to be mindful of. You probably won't see any real change in script execution time though.
Some existing projects that might make DOM work easier for you. Things to maybe try if DOMDocument doesn't do quite what you want (I'm not 100% sure the code above will do the trick, nor do I know if any of these repos can definitely solve your problem):
Voku's port of simple_html_dom <- simple_html_dom has been around for a while. I haven't tried Voku's port, but his repos that I've reviewed are usually very good quality.
A DomDocument extension by ivopetkov <- I think this one is the most mature
Another option by scotteh <- Don't know anything about it
A DomDocument extension by me. Stable, small but nice feature set
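To tie the DOMDocument idea back to the output-buffer code in the question, here is a minimal sketch. The helper name formatHtml is made up for illustration, and how much line-breaking or indentation libxml actually adds with formatOutput varies, so inspect the result rather than assume it matches the desired layout exactly.

```php
<?php
// Hypothetical helper (not from the original answer): pass captured HTML
// through DOMDocument before sending it to the client.
function formatHtml(string $html): string
{
    $doc = new DOMDocument();
    $doc->preserveWhiteSpace = false; // must be set before loading
    $doc->formatOutput = true;

    // Older libxml versions report HTML5 tags such as <section> as errors;
    // collect them quietly instead of emitting warnings.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return (string) $doc->saveHTML();
}

// Capture the page the same way the question does, then format it once.
ob_start();
echo '<section><div class="card"><p class="card-text">Hello</p></div></section>';
$output = formatHtml(ob_get_clean());
echo $output;
```

Note that loadHTML wraps a fragment in a doctype plus html and body elements, which is fine for a full page like the one built from header.php and footer.php.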

Related

Is there a way to assign html to a php variable without losing its markup?

I am trying to assign some HTML to a variable in PHP in an effort to modularise my HTML.
$html_block1 = "<div class='item'>
<img src='chicago.jpg' alt='Chicago'>
<div class='carousel-caption'>
<h3>Chicago</h3>
<p>Thank you, Chicago!</p>
</div>
</div>";
HTML-specific markup is lost, however, since every text editor will treat this as a plain string instead of the HTML it actually is. This makes later editing very impractical. Is there a way to store this HTML in a PHP file without losing the markup, in a way that can easily be used later by other files? Thank you.
As an alternative, you may use Output Control Functions:
Your code becomes:
<?php
ob_start();
?>
<div class='item'>
<img src='chicago.jpg' alt='Chicago'>
<div class='carousel-caption'>
<h3>Chicago</h3>
<p>Thank you, Chicago!</p>
</div>
</div>
<?php
$html_block1 = ob_get_clean();
This way also allows you to move your HTML templates to a view file:
script.php
<?php
ob_start();
require VIEW_FOLDER . '/my_view.php';
$html_block1 = ob_get_clean();
my_view.php
<div class='item'>
<img src='chicago.jpg' alt='Chicago'>
<div class='carousel-caption'>
<h3>Chicago</h3>
<p>Thank you, Chicago!</p>
</div>
</div>
If you're going to do stuff like this, you should however consider looking into template engines, such as Twig.
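Short of adopting a template engine, the view-file approach above can be wrapped in a small helper. This is only a sketch: renderView and the $city variable are hypothetical names, and the view is written to a temporary file here purely so the example runs on its own.

```php
<?php
// Hypothetical helper: render a view file into a string.
function renderView(string $viewFile, array $vars = []): string
{
    // Expose each array key as a local variable inside the view,
    // without overwriting $viewFile or $vars themselves.
    extract($vars, EXTR_SKIP);
    ob_start();
    try {
        require $viewFile;
        return (string) ob_get_clean();
    } catch (\Throwable $e) {
        ob_end_clean(); // do not leak a half-rendered buffer
        throw $e;
    }
}

// Stand-in for my_view.php so the sketch is self-contained.
$template = <<<'VIEW'
<div class='item'>
  <h3><?= htmlspecialchars($city) ?></h3>
  <p>Thank you, <?= htmlspecialchars($city) ?>!</p>
</div>
VIEW;
$viewFile = tempnam(sys_get_temp_dir(), 'view');
file_put_contents($viewFile, $template);

$html_block1 = renderView($viewFile, ['city' => 'Chicago']);
unlink($viewFile);
echo $html_block1;
```

Because the markup lives in its own file, editors highlight it as HTML, which is what the question was after.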

whitespace before/after php tags

I have one small issue with my code below: I'd like the HTML view of the source code to be as "tidy" as the PHP. I've spent days searching but have drawn a blank as to why the HTML side is not tidy. My only guess is that extra whitespace is being generated, but if that's the case I'm not sure why. If anyone could help it would be appreciated, thanks.
html
<div class='col-md-12'>
<div class='alert alert- info'>
Already Logged In
</div>
</div>
php
<div class='col-md-12'>
<?php if ($action=='logged-in'): ?>
<div class='alert alert- info'>
Already Logged In
</div>
<?php endif ?>
</div>
Your lines with just PHP code still keep their whitespace, so the spaces at the beginning of that line and the \n at the end still count.
Unfortunately, this means you'd have to have your code look like this:
<div class='col-md-12'>
<?php if ($action=='logged-in'): ?> <div class='alert alert- info'>
Already Logged In
</div>
<?php endif ?></div>
Pretty ugly, right?
You have a couple options:
Run the final HTML (captured using an output buffer) through something like Tidy, which has indentation correction built-in as an option. A lot of work for something no one's ever gonna see, but it'll do the trick.
Use a templating system to separate out the PHP a bit. Something like Twig can probably be massaged a little easier into the nicely indented HTML you want, but there'll still be some of the same troubles if you're not careful.
Stop caring. (This is my recommendation.) Focus on the readability and simplicity of the code you'll actually be working with, not the whitespace of the resulting HTML, which pretty much no one will ever notice or care about. Take a look at the HTML generated for any major website - Amazon.com, Facebook.com, Google.com, etc. - and you'll see that this is the standard practice.
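For the first option, a minimal sketch could look like the following. It assumes the PHP tidy extension is installed (the helper falls back to the unchanged markup if it is not); tidyIndent is a made-up name, and the option values are just one plausible choice.

```php
<?php
// Hypothetical helper: re-indent captured HTML with the tidy extension.
function tidyIndent(string $html): string
{
    if (!function_exists('tidy_repair_string')) {
        return $html; // tidy extension not available: leave the markup as-is
    }
    $config = [
        'indent'         => true, // indent nested block elements
        'indent-spaces'  => 2,
        'wrap'           => 0,    // do not hard-wrap long lines
        'show-body-only' => true, // return only the fragment, not a full page
    ];
    $result = tidy_repair_string($html, $config, 'utf8');
    return $result === false ? $html : $result;
}

$action = 'logged-in';

// Capture the unevenly indented output, then clean it up in one place.
ob_start();
echo "<div class='col-md-12'>\n";
if ($action == 'logged-in') {
    echo "        <div class='alert alert-info'>\nAlready Logged In\n    </div>\n";
}
echo "</div>\n";
$clean = tidyIndent(ob_get_clean());
echo $clean;
```

This keeps the PHP templates readable and pays the formatting cost once per response, which is the trade-off the first option describes.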
Since it looks like the root <div> will be there no matter what, try using the following:
echo
'<div class="col-md-12">',
($action == 'logged-in' ? '
<div class="alert alert-info">
Already Logged In
</div>' : '
'), '
</div>';

I want to remove certain parent- and child-divs in all my wordpress posts with php or some other script

Is there a quick way, via script maybe, to remove a certain pair of div's out of all my wordpress posts? For example:
I want to go from this:
<div class="single_textimage">
<div class="youtube_play"><iframe src="-,-"></iframe></div>
<div class="single_textimage_text">Some text.</div>
<div class="single_textimage_copyright">Some text.</div>
</div>
To this:
<div class="youtube_play"><iframe src="-,-"></iframe></div>
AND
From this:
<div class="single_textimage">
<img class="aligncenter size-full wp-image-1700" src="-,-" />
<div class="single_textimage_text">Some text.</div>
<div class="single_textimage_copyright">Some text.</div>
</div>
To this:
<img class="aligncenter size-full wp-image-1700" src="-,-" />
So I want the divs: single_textimage, single_textimage_text and single_textimage_copyright to go.
I hope there is an easy script, or difficult for that matter. Via "php", "mysql" or "jquery" for example, that I can put in test.php in the root or something...
I hope I supplied you with enough information. If I haven't made myself clear enough, please reply. :)
Seems to me like you should be able to take those out of whatever template you're using, probably in a PHP include, but I don't really use WordPress, so I wouldn't know where without seeing all your files. If you're bent on using jQuery instead of modifying the template, I would throw in some CSS too, to hide the elements that will be removed:
.single_textimage, .single_textimage_text, .single_textimage_copyright{
display:none;
}
Then you can take the elements you want to keep out of their parent DIVs, and place them right after (or before):
$('.youtube_play, .wp-image-1700').each(function(){
$(this).parent().after($(this));
});
Then you can remove the elements you don't want from the page:
$('.single_textimage, .single_textimage_text, .single_textimage_copyright').remove();
Here's a fiddle: https://jsfiddle.net/3uztorzL/
I would use this search and replace utility to update all of the content in the DB:
https://interconnectit.com/products/search-and-replace-for-wordpress-databases/
You'll need a regex to replace <div class="single_textimage_text">Some text.</div> (assuming the "some text" is different in each post). The utility supports regex replace. This may do it:
<div class="single_textimage_text">(.*?)</div>
Make sure you make a backup before you do the replace.
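If a regex turns out to be too brittle for nested divs, the same clean-up can be sketched in PHP with DOMDocument and DOMXPath. This is an assumption-laden sketch, not a WordPress-specific solution: unwrapTextImage and the unwrap-root id are made-up names, the function works on one post's HTML string, and loading and saving the posts themselves is left out.

```php
<?php
// Hypothetical helper: strip the single_textimage wrapper plus its text and
// copyright children from one post's HTML, keeping the media element.
function unwrapTextImage(string $html): string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true);
    // The meta tag tells libxml the fragment is UTF-8; the wrapper div with a
    // made-up id lets us pull the fragment back out afterwards.
    $doc->loadHTML(
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
        . '<div id="unwrap-root">' . $html . '</div>'
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // 1. Drop the caption and copyright divs.
    $unwanted = '//div[@class="single_textimage_text" or @class="single_textimage_copyright"]';
    foreach (iterator_to_array($xpath->query($unwanted)) as $node) {
        $node->parentNode->removeChild($node);
    }

    // 2. Move whatever is left out of each wrapper, then drop the wrapper.
    foreach (iterator_to_array($xpath->query('//div[@class="single_textimage"]')) as $wrapper) {
        while ($wrapper->firstChild !== null) {
            $wrapper->parentNode->insertBefore($wrapper->firstChild, $wrapper);
        }
        $wrapper->parentNode->removeChild($wrapper);
    }

    // 3. Serialize the children of the temporary root back into a string.
    $result = '';
    foreach ($xpath->query('//div[@id="unwrap-root"]')->item(0)->childNodes as $child) {
        $result .= $doc->saveHTML($child);
    }
    return trim($result);
}

$post = '<div class="single_textimage">
<div class="youtube_play"><iframe src="-,-"></iframe></div>
<div class="single_textimage_text">Some text.</div>
<div class="single_textimage_copyright">Some text.</div>
</div>';

$clean = unwrapTextImage($post);
echo $clean;
```

Run something like this against a copy of the content first; the backup advice above applies just as much here.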

HTML attributes and PHP - where to draw the line between conflicting principles

I have an HTML file which contains two tabs like this (besides lots of other stuff):
<div id="tabs">
<div id="tab1" class="">Tab 1</div>
<div id="tab2" class="">Tab 2</div>
</div>
This HTML file is used in several contexts, so it seems reasonable to have one single HTML file that I include with PHP in order not to violate the Don't Repeat Yourself principle.
Depending on various conditions these tabs need to have different classes when the page loads. This is achieved through PHP, either
$tab1Class = 'active';
$tab2Class = 'inactive';
or
$tab1Class = 'inactive';
$tab2Class = 'active';
with this mixed HTML and PHP:
<div id="tabs">
<div id="tab1" class="<?php echo $tab1Class;?>">Tab 1</div>
<div id="tab2" class="<?php echo $tab2Class;?>">Tab 2</div>
</div>
My question is what would be considered best practice when having to balance between repeating code or inserting PHP into the HTML file - how much repetition is ok and how deep into the HTML is it ok to bury PHP snippets?
Is, for example, the following degree of separation better? The PHP isn't as deeply buried into the HTML here, but there's more code that I have to repeat in the PHP file(s) in order to generate the div elements every time.
<div id="tabs">
<?php echo $tabs;?>
</div>
I understand that one possible answer is that this is something that has to be decided for each case individually, but if there are any principles that are more or less generally accepted it would be nice to hear about them.
IMHO the first approach, where you print just the CSS classes, is better. If you want to take this to the next level, consider using a template engine such as Smarty or Twig. That way you can fully separate your logic (code) from your presentation.
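One way to keep the first approach while cutting the repetition in the calling code is to derive both classes from a single value. A minimal sketch, where tabClass and $activeTab are hypothetical names rather than anything from the question:

```php
<?php
// Hypothetical helper: derive a tab's class from the one value that varies.
function tabClass(string $tabId, string $activeTab): string
{
    return $tabId === $activeTab ? 'active' : 'inactive';
}

$activeTab = 'tab2'; // set by whatever condition applies on this page
$tabs = ['tab1' => 'Tab 1', 'tab2' => 'Tab 2'];

echo '<div id="tabs">', "\n";
foreach ($tabs as $id => $label) {
    printf(
        '<div id="%s" class="%s">%s</div>' . "\n",
        $id,
        tabClass($id, $activeTab),
        htmlspecialchars($label)
    );
}
echo '</div>', "\n";
```

Each including page then only sets $activeTab, and the markup for the tabs still lives in one place.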

Getting unstyled text using QueryPath in PHP

I'm just getting to grips with QueryPath after using HTML Simple Dom for quite some time and am finding that the QP documentation doesn't seem to offer much in the way of examples for all of its functions.
At the moment I'm trying to retrieve some text from a HTML doc that doesn't make much use of ID's or Classes, so I'm a little outside of my comfort zone.
Here's the HTML:
<div class="blue-box">
<div class="top">
<h2><img src="pic.gif" alt="Advertise"></h2>
<p>Some uninteresting stuff</p>
<p>More stuff</p>
</div>
</div>
<div class="blue-box">
<div class="top">
<h2><img src="pic2.gif" alt="Location"></h2>
**I NEED THIS TEXT**
<div style="margin:stuff">
<img src="img3.gif">
</div>
</div>
</div>
I was thinking about selecting the class 'blue-box' as the starting point and then descending from there. The issue is that there could be any number of blue-box elements in the HTML doc.
Therefore I was thinking that maybe I should try to select the image with alt="Location" and then use ->next()->text() or something along those lines?
I've tried about 15 variations so far and none are getting the text I need.
Assistance most appreciated!
Can you have a look at this example: http://jsfiddle.net/Pedro3M/mujtk/
I did what you suggested and used the alt attribute, assuming you can confirm it is always unique:
$("img[alt='Location']").parent().parent().text();
How about:
$doc->find('div.top:has(img[alt="Location"])')->text();
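Both selectors above return the text of the whole container. If only the bare text node after the heading is wanted, a sketch using PHP's built-in DOMDocument and DOMXPath (swapped in here instead of QueryPath, whose API is not assumed) could look like this. It relies on the alt="Location" image being unique, as discussed above.

```php
<?php
$html = <<<'MARKUP'
<div class="blue-box">
<div class="top">
<h2><img src="pic2.gif" alt="Location"></h2>
**I NEED THIS TEXT**
<div style="margin:stuff">
<img src="img3.gif">
</div>
</div>
</div>
MARKUP;

$doc = new DOMDocument();
$previous = libxml_use_internal_errors(true); // keep parser warnings quiet
$doc->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xpath = new DOMXPath($doc);

// From the image, step up to its <h2>, then take the first non-blank text
// node that follows the heading at the same level.
$text = $xpath->evaluate(
    'normalize-space(//img[@alt="Location"]/parent::h2'
    . '/following-sibling::text()[normalize-space()][1])'
);

echo $text;
```

Selecting the text node directly avoids picking up text from the nested div that follows it.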
