Clean up PHP/HTML pages - php

Does anybody know of a good tool that cleans up files with php and html in it? I've used Tidy before but it doesn't do a good job at leaving the php code alone. I know there are various implementations of tidy but does any tool reign champion specifically for pages with html and php?

Cleaning your code starts with separating PHP from HTML!

I am aware that this is a pretty old question but still a valid one. I currently use this and it seems to be doing a decent job: PHP Formatter
For HTML, CSS and JS, DirtyMarkup is a handy tool. Only drawback of these is that you have to copy and paste the code twice.

As far as I know, Tidy is the "reigning champion" when it comes to cleaning HTML code. The only other tool I've personally used for cleaning code is the one built into Adobe Dreamweaver.

I would agree with separating your HTML and your PHP code. However, I think you have to think of it kind of backwards: separate your HTML code from your PHP code, rather than the other way around. Break your HTML into blocks and pull each one in with include 'html_code_1.php';. That way you can run Tidy on the HTML files without worrying about it affecting your PHP code.
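A minimal sketch of that include approach, assuming a hypothetical helper named render_template() and a throwaway template written to the temp directory so the example is self-contained (in a real project html_code_1.php would simply be a file in your source tree):

```php
<?php
// Hypothetical helper: runs an HTML-first template file and returns its markup.
function render_template(string $file, array $vars = []): string
{
    extract($vars, EXTR_SKIP); // expose array keys as local variables for the template
    ob_start();                // capture what the template prints
    include $file;
    return ob_get_clean();
}

// Demo only: create the template on the fly. The nowdoc keeps the PHP tags literal.
$markup = <<<'HTML'
<p>Hello, <?= htmlspecialchars($name) ?>!</p>
HTML;
$template = sys_get_temp_dir() . '/html_code_1.php';
file_put_contents($template, $markup);

$html = render_template($template, ['name' => 'Ada & Co']);
echo $html;
```

Because the template is almost pure HTML, an HTML formatter has very little PHP to trip over, and the escaping stays next to the markup it protects.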

I previously had this problem, but other programs kept reorganizing what I had coded, and trying to clean it up usually did more harm than good. To solve this, I am starting to learn the ins and outs of CodeIgniter, a basic PHP framework that uses the MVC approach to split HTML and PHP. I haven't tested much, but it looks like much less hassle than writing HTML and PHP straight into a single file.

You can use this PHP class if you can't install the "Tidy" module (some shared hosts don't allow it).
http://www.barattalo.it/html-fixer/

Related

Is there anything in PHP like middleman in Ruby?

I've been on a project with a buddy who is leading us with Middleman. We are coding in HAML and SASS and he's obviously a Ruby Dev. I'd like to know if there's ANY type of equivalent for PHP? I'm going to eventually lead a team and I'm much more comfortable with PHP than Ruby.
I'd like to have a layout file (like Zend's layout file)
I'd like to...at one command, convert all of the source files from PHP to static HTML and place those static files in a 'build' folder so we can hand it over to the client.
Anyone know of some cool things out there to make this happen? Thanks a bunch!
A project I work on, www.findbigmail.com, was written completely in PHP to start with and then I did some Ruby/Rails work for a different project, and coming back to PHP was a grind. After using HAML, SCSS and other wonderful things like CSS and JS minify, oh and Compass to build sprites, it was painful to go back to PHP and work again in PHP files with embedded HTML etc.
So, driven by pure slothfulness, I looked around and found MiddleManApp (MM) - after a couple of side trips along the way.
Now we have a very strong separation between what is now a mostly static html site built by MM, with some PHP files that are POSTed to and then redirect back to html pages. Where we need more dynamic behaviour, we've added javascript to the pages and have them call PHP API wrappers around our pre-existing code.
Our site performance has jumped hugely (doh, now it's all static html), and it's poised to take another jump when the next MiddleMan version comes out with its improved cache-busting abilities inherited from the Rails 3.1 asset pipeline. E.g. we'll be able to reference main.css in our source scripts (which itself is made up of sub-scss files like _index.scss, _pricing.scss) and it will be built with references to main-2348jlk23489kdj.css -- allowing us to set the server to cache for a year and/or deploy many more files to CDN.
Our engineering performance is way up too. We're no longer reluctant to touch UI code for fear of introducing a syntax error into the PHP code. And no more mismatched HTML tags to cause grief. The other PHP developer wasn't familiar with the Ruby/Rails derived toolchain, but has quickly become proficient (though he is a rockstar developer, so that's not too surprising!)
Coming soon is i18n support. Most of that is in MM already, with Javascript support hopefully arriving real-soon-now.
We also explored generating pages from HAML with PHP added to them. We decided it was probably quite simple - e.g. add a ":php" tag to the HAML pipeline and then use .php partials as needed. But, we found that between Javascript and wrapping the existing PHP code as an "engine API", we were able to keep the codebases neatly separated -- which we found we prefer overall.
I hope this helps! Happy to explain more.
There is one for PHP called Piecrust.
I ended up choosing Middleman for the bundled coffeescript, sass, etc., but Piecrust is well done.
http://bolt80.com/piecrust/
PHP can render static HTML from PHP code quite easily:
Easiest way to convert a PHP page to static HTML page
Generate HTML Static Pages from Dynamic Php Pages
PHP - How to programmatically bake out static HTML file?
You could wire up something with existing template systems like Twig or use PHP Markdown to more or less mimic what Middleman is doing and create static HTML pages from your source files.
EDIT: As Hari K T mentioned above, http://www.phrozn.info/en/ does exactly this.
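As a rough illustration of the "bake to a build folder" idea, here is a sketch using PHP's output buffering. The function name bake_page() and all paths are made up for this example; it is not how Phrozn or PieCrust work internally.

```php
<?php
// Hypothetical build step: run one PHP page and save its output as static HTML.
function bake_page(string $sourceFile, string $buildDir): string
{
    ob_start();
    include $sourceFile;   // executes the page; its output lands in the buffer
    $html = ob_get_clean();

    if (!is_dir($buildDir)) {
        mkdir($buildDir, 0777, true);
    }
    $target = $buildDir . '/' . basename($sourceFile, '.php') . '.html';
    file_put_contents($target, $html);
    return $target;
}

// Demo with throwaway paths (assumed names, not from the posts above).
$src = sys_get_temp_dir() . '/bake_demo_' . uniqid();
mkdir($src);
file_put_contents($src . '/index.php', '<h1><?php echo "Built at build time"; ?></h1>');

$out = bake_page($src . '/index.php', $src . '/build');
echo file_get_contents($out);
```

Looping this over every source file (for example with glob()) would give the one-command build described in the question, though a dedicated generator handles layouts and assets for you.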

Is there any php script that using regex or other methods to clean html pages?

Here is my idea, I want to create a tool that can create static html pages, out of php pages, perhaps generated by a cms.
Then I want to use some kind of regex, or clean tool, to reorganize the html to generate a cleaner, more standardized, yslow compliant html pages.
I may be asking for something that does not exist; if so, any suggestions for a close cousin solution?
Thank you for your time.
Take a look at Tidy: http://php.net/manual/en/book.tidy.php
Works great for cleaning up html.
Not regex but an extension.

what is the recommended way to embed html codes inside php codes?

Lets say I have 2 cases:
<?php
print("html code goes here");
?>
vs.
<?php
?>
html codes goes here
<?php
?>
Would the performance for PHP interpreter in the first case be worse than the second one? (Due to the extra overhead of processing inside print function).
So, does anyone have a recommended way to insert html codes inside php codes?
Oh, for the sake of all those who edit your code later, please never put any significant amount of HTML code inside a string and echo it out. In any language.
HTML belongs in HTML files, formatted and styled by your IDE or editor, so it can be read. Giant string blocks are the biggest cause of HTML errors I have ever seen.
Performance shouldn't matter too much, in this case, but I would assume the second would be faster, because it is streamed directly to the output or buffer.
If you want it to be easier to read, enable short tags, and write it like this:
?><b>blah blah blah</b><?
Plus, with short tags enabled, it's easier to echo out variables:
Hello, <?= $username ?>
If you are using this to generate some sort of reusable library, there are other options.
You should put HTML outside of PHP code in order for better maintenance and scalability. It's also very beneficial to do all your necessary data processing before displaying any data, in order to separate logic and presentation.
Rather than trying to think about constantly separating your PHP and HTML, you should instead be in the mindset of separating your backend logic and display logic.
The MVC pattern is a good way of thinking about your code - In order to correctly use PHP you must use MVC (model-view-controller) pattern
Never put HTML inside PHP code unless you specifically intend to do so or it's very small. But then again, 100% separation is what I recommend. People will have to work very hard to understand your code later if you mix them up, especially designers who may not be comfortable with PHP.
The golden rule is that separating the front-end and back-end processes as much as possible helps in every aspect. Keep things where they belong: styles in CSS, JavaScript in JS files, PHP in a library folder/files, and just use the required classes/functions.
Use short tags <? if required (but I don't like them :P ), and the <?= tag to echo output. That said, short tags are better avoided.
Don't do it that way at all! Use a templating system like Smarty so you can separate your logic from your display code. It also allows you to bring in a designer that can work with html and might not know php at all. Smarty templates read more like HTML than PHP and that can make a big difference when dealing with a designer.
Also, you don't have to worry about your designer messing with your code while doing a find/replace!
Better yet would be to go to a setup like Django or Rails that has a clearly delineated code/template setup, but if you can't do that (cheap hosting solutions) then at least use templating. I don't know if Smarty is the best thing for PHP, but it's far better than both solutions you are proposing.
[head above the parapet] Many of us have learnt templating from WordPress, where without embedding PHP it's virtually impossible to do anything. I can quite understand why people advocate strict MVC or engines such as Smarty, but the fact is that in the case of WordPress development you need to manipulate output on the fly with PHP. In fact, coming from that background, I always used to assume that the 'hp' in PHP was for exactly that reason. So I could write 'normal' looking HTML, do a bit of server-side processing and then return to HTML.
So, from my point of view, the answer to your question is that the second of your examples is much easier to read, which is one of the fundamentals of elegant coding. But it does depend. If there's a lot of processing to produce a simple piece of HTML then it may be easier to build a large variable and echo it at the end. I abhor multiple lines of echo statements. In this case I am likely to use a function to keep my HTML clean. Again, WordPress does this a lot; for instance the_title() outputs a simple string but does a deal of processing before doing so, so <h1><? the_title(); ?> </h1> reads well.
That is the POV of a WordPress developer who was never formally taught complex coding. I expect to lose a fair amount of reputation points over this answer. [/head above the parapet]

Pretty-print HTML via PHP without validation?

I'd like to automatically pretty-print (indentation, mostly) the HTML output that my PHP scripts generate. I've been messing with Tidy, but have found that in its efforts to validate and clean my code, Tidy is changing way too much. I know Tidy's intentions are good but I'm really just looking for an HTML beautifier. Is there a simpler library out there that can run in PHP and just do the pretty-printing? Or, is there a way to configure Tidy to skip all the validation stuff and just beautify?
The behaviour that you've observed when using Tidy is a result of the underlying use of DOM API. Instead of manipulating the provided source code, DOM API will reconstruct the whole source, thus making fixes along the way.
I've written Dindent, which is a library that uses regex. It does not do anything beyond adding indentation and removing whitespace. However, I advise against using this implementation beyond development purposes.
I've never used Tidy but it seems pretty customizable.
Here's the quick reference of configuration options: http://tidy.sourceforge.net/docs/quickref.html
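As a hedged sketch of what a "beautify only" configuration might look like with PHP's tidy extension: the option names below come from the Tidy quick reference, but Tidy still repairs invalid markup, so this narrows its changes rather than switching validation off. The wrapper name pretty_print_html() is hypothetical, and it returns the input unchanged when the extension is not installed.

```php
<?php
// Sketch: ask Tidy to indent and little else. Falls back to the input without ext-tidy.
function pretty_print_html(string $html): string
{
    if (!extension_loaded('tidy')) {
        return $html; // assumption: no Tidy available, leave the markup untouched
    }
    $config = [
        'indent'         => true,  // indent block-level elements
        'indent-spaces'  => 2,
        'wrap'           => 0,     // do not hard-wrap long lines
        'show-body-only' => true,  // do not add doctype/html/head wrappers
        'tidy-mark'      => false, // no generator meta tag
    ];
    $out = tidy_repair_string($html, $config, 'utf8');
    return $out === false ? $html : $out;
}

echo pretty_print_html('<div><p>Hello</p><ul><li>One</li><li>Two</li></ul></div>');
```

If the output still changes more than you want, that is the DOM-rebuilding behaviour described in the earlier answer, and a pure indenter is the better fit.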
But really, with tools like Firebug, I've never seen the need to Tidy HTML output.
Since you do not want to have it validate for whatever reason, I will not suggest htmlpurifier ; ). Why not just use an IDE to get everything indented nicely, like Alt-Shift-F in Netbeans.
Facing the same problem, I currently use a combination of two commands:
cat template-home.php | js-beautify --type html | prettier --parser php
js-beautify formats the html bits and prettier formats the php code

separating php and html... why?

So I have seen some comments on various web sites, pages, and questions I have asked about separating php and html.
I assume this means doing this:
<?php
myPhpStuff();
?>
<html>
<?php
morePhpStuff();
?>
Rather than:
<?php
doPhpStuff();
echo '<html>';
?>
But why does this matter? Is it really important to do or is it a preference?
Also it seems like when I started using PHP doing something like breaking out of PHP in a while loop would cause errors. Perhaps this is not true anymore or never was.
I made a small example with this concept but to me it seems so messy:
<?php
$cookies = 100;
while ($cookies > 0)
{
    $cookies = $cookies - 1;
?>
<b>Fatty has </b><?php echo $cookies; ?> <b>cookies left.</b><br>
<?php
}
?>
Are there instances when it is just better to have the HTML inside the PHP?
<?php
$cookies = 100;
while ($cookies > 0)
{
    $cookies = $cookies - 1;
    echo '<b>Fatty has </b> ' . $cookies . ' <b>cookies left.</b><br>';
}
?>
When people talk about separating PHP and HTML they are probably referring to the practice of separating a website's presentation from the code that is used to generate it.
For example, say you had a DVD rental website and on the homepage you showed a list of available DVDs. You need to do several things: get DVD data from a database; extract and/or format that data, and maybe mix in data from several tables; format it for output; and combine the DVD data with HTML to create the webpage the user is going to see in their browser.
It is good practice to separate the HTML generation from the rest of the code. This means you can easily change your HTML output (presentation) without having to change the business logic (the reading and manipulation of data). The opposite is also true: you can change your logic, or even your database, without having to change your HTML.
A common pattern for this is called MVC (model view controller).
You might also want to look at the Smarty library - it's a widely used PHP library for separating presentation and logic.
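To make the DVD example concrete, here is a small sketch of that split. The function names and the hard-coded array (standing in for the database) are assumptions for illustration; in a real project the presentation half would more likely be a template file than a function.

```php
<?php
// Logic layer: fetch and prepare data. A hard-coded array stands in for the database.
function available_dvds(): array
{
    $rows = [
        ['title' => 'Alien', 'year' => 1979, 'in_stock' => true],
        ['title' => 'Heat',  'year' => 1995, 'in_stock' => false],
        ['title' => 'Up',    'year' => 2009, 'in_stock' => true],
    ];
    $inStock = array_filter($rows, function (array $row): bool {
        return $row['in_stock'];
    });
    return array_values($inStock); // reindex after filtering
}

// Presentation layer: turns prepared data into HTML and does nothing else.
function render_dvd_list(array $dvds): string
{
    $items = '';
    foreach ($dvds as $dvd) {
        $items .= sprintf(
            "  <li>%s (%d)</li>\n",
            htmlspecialchars($dvd['title']),
            $dvd['year']
        );
    }
    return "<ul>\n" . $items . "</ul>\n";
}

echo render_dvd_list(available_dvds());
```

Swapping the array for a real query touches only available_dvds(), and changing the markup touches only render_dvd_list(), which is the point of the separation.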
Let's make it clear what is not separation
you switch from php mode to html mode
you use print or echo statements to write out html code
you use small php snippets inside html files
If you do this, there is no separation at all, no matter if you escape from php to html blocks or do it the other way and put php code into html.
Have a look at a good templating engine; there are plenty of reasons in the "why use ...." parts of the manuals. I'd suggest www.smarty.net, especially http://www.smarty.net/whyuse.php
It will answer all the questions you have now.
It is very important to separate application logic from presentation logic in projects. The benefits include:
Readability: Your code will be much easier to read if it does not mix PHP and HTML. Also, HTML can become difficult to read if it's stored and escaped in PHP strings.
Reusability: If you hard-code HTML strings within PHP code, the code will be very specific to your project and it won't be possible to reuse your code in later projects. On the other hand, if you write small functions that do one task at a time, and put HTML into separate template files, reusing your code in future projects will be possible and much easier.
Working in a team: If you are working in a team that contains developers and designers, separation of application logic and presentation templates will be advantageous to both. Developers will be able to work on the application without worrying about the presentation, and designers (who don't necessarily know PHP very well) will be able to create and update templates without messing with PHP code.
For pages that contain a lot of HTML, embedding PHP code into the page could be easier. This is one of the first intentions behind PHP. However, when you are developing an application with lots and lots of logic, different types of connectivity, data manipulation, and so on, your PHP code gets too complicated if you just embed it in the same pages that are shown to users. And then the story of maintenance begins. How are you going to change something in the code, fix a bug, add a new feature?
The best way is to separate your logic (where most of the code is PHP) into different files (even directories) from your page files (where most of the code is HTML, XML, CSV, ...).
This has been a concern for developers for many years, and there are recommendations for handling these general problems, called design patterns.
Since not everyone has the experience to apply these design patterns in their application, some experienced developers create frameworks that help other developers use all the knowledge and experience lying at the heart of that framework.
When you look at today's most used PHP frameworks, you see that all of them put code into PHP classes in special directories, hold configuration, and so on, and in none of these files do you see a line of HTML. But there are special files that are used to show the results to users, and they have a lot of HTML, so you can embed your PHP values inside those HTML pages to show to users. Remember, though, that these values are not calculated on the same page; they are the results of a lot of other PHP code, written in other PHP files that have no HTML in them.
I find it preferable to separate application logic from the view file (done well with the CodeIgniter framework with MVC) as it leaves code looking relatively tidy and understandable. I have also found that separating the two leaves less margin for PHP errors: if the HTML elements are separated from the PHP, there is a smaller amount of PHP that can go wrong.
Ultimately I believe it is down to preference however I feel that separation has the following pros:
Tidier Code
Less of an Error Margin
Easy to Interpret
Easier to change HTML elements
Easier to change Application Logic
Faster Loading (HTML is not going from Parser->Browser it goes straight to browser)
However some cons may be:
It only works in PHP5 (I Believe, could be wrong, correct if needed)
It may not be what one is used to
Untidy if done incorrectly (without indentation etc, however this is the same with anything)
But as you can see, the pros outweigh said cons. Also, try not to mix the two, with some separation and some integration; this may get confusing for yourself and other developers that work with you.
I hope this helped.
Benefits of the first method (separating PHP and HTML):
You don't need to escape characters
It's also possible for code editors to highlight/indent the markup.
It's arguably easier to read
There is no downside to this method, compared to the second method.
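For reference, the cookie loop from the question reads a little better in the first style when written with PHP's alternative control-structure syntax (while: ... endwhile;) and the <?= echo tag. This is only a sketch; the function wrapper cookie_lines() is a made-up name so the output can be captured and reused.

```php
<?php
// Sketch: the question's loop, leaving PHP mode for the markup instead of echoing strings.
function cookie_lines(int $cookies): string
{
    ob_start(); // capture the markup printed below
    while ($cookies > 0):
        $cookies--;
        ?>
<b>Fatty has </b><?= $cookies ?> <b>cookies left.</b><br>
        <?php
    endwhile;
    return ob_get_clean();
}

echo cookie_lines(3);
```

The markup line needs no quotes, dots, or escaping, and an editor can highlight it as HTML, which matches the benefits listed above.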
Functionally: they both will work, so ultimately it is a preference.
Yet, you might consider that comments are a preference as well: your code would compile and run exactly the same without comments. However, most people would agree comments are essential to writing and maintaining good code. I see this as being a similar subject matter. In the long run it will be easier to read and maintain the code if the two are separated.
So is it important? I would say Yes.
I'll kick off with: the first one you can open in a WYSIWYG editor and still see some markup, which might make it easier to maintain.
Whatever you put in echo '' is first processed by the programming language and then sent to the browser, but if you put the HTML there directly, outside PHP, that code will load faster because there is no programming involved.
And the second reason as people above said is that you should have your 'large programming code' stored separately of the html code, and in the html code just put some calls to print results like 'echo $variable'. Or use a template engine like Smarty (like I do).
Best regards,
Alexandru.
Ouch!
All of the examples in your question are perfectly impossible to read. I'd say you would do yourself, and those who might read your code, a great favour by using a template engine of some sort, say, Smarty. It is extremely easy to set up and use, and it WILL separate your code from presentation. It doesn't require you to put everything in classes; it just makes sure that your logic is in one file and your presentation in another.
I don't know about PHP, but in ASP.NET separation has the following advantages.
1. separated code is easy to understand and develop
2. a designer can work on the HTML at the same time as a developer writes code
