PHP Performance Question (readable code or not, web app) - php

I wondered if I should write my code clean and readable or rather small and unreadable... Or should I write it readable and then compress it afterwards when I'm publishing it on the web?
Ps. I'm building a web app,
the faster, the better!
Thanks_

I think you are greatly underestimating PHP's performance if you think this will affect it.
Write clean, readable code. In fact write code as if the next guy to maintain it is a sociopath that knows where you live.
Edit In response to AESM's comment... not in any way that matters. Also you can edit your question if you want to expand on it, instead of leaving a comment.

PHP parses the code before executing. The first stage is tokenization, which throws out all comments and whitespaces, and converts all identifiers to tokens. This means neither meaningfull names, nor sensible comments and clean formatting will have any effects at runtime. In fact all speed effects you seem to expect from compression are already lost during tokenization.
If you do have "bigger" source files due to clean coding, then tokenization will effectively take longer. However this effect is barely meassurable compared to actual parsing and execution.
If you feel you want to optimize at that point, please consider using eaccelerator, which makes an actual difference.
greetz
back2dos

"Programs must be written for people to read, and only incidentally for machines to execute."

I'd say, write a clean/readable code and then eliminate the bottlenecks, if needed.

Related

Obfuscating the php code before deploying on Cpanel? [duplicate]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
Has anybody used a good obfuscator for PHP? I've tried some but they don't work for very big projects. They can't handle variables that are included in one file and used in another, for instance.
Or do you have any other tricks for stopping the spread of your code?
You can try PHP protect which is a free PHP obfuscator to obfuscate your PHP code.
It is very nice, easy to use and also free. EDIT: This service is not live anymore.
As for what others have written here about not using obfuscation because it can be broken etc:
I have only one thing to answer them - don't lock your house door because anyone can pick your lock.
This is exactly the case, obfuscation is not meant to prevent 100% code theft. It only needs to make it a time-consuming task so it will be cheaper to pay the original coder.
People will offer you obfuscators, but no amount of obfuscation can prevent someone from getting at your code. None. If your computer can run it, or in the case of movies and music if it can play it, the user can get at it. Even compiling it to machine code just makes the job a little more difficult. If you use an obfuscator, you are just fooling yourself. Worse, you're also disallowing your users from fixing bugs or making modifications.
Music and movie companies haven't quite come to terms with this yet, they still spend millions on DRM.
In interpreted languages like PHP and Perl it's trivial. Perl used to have lots of code obfuscators, then we realized you can trivially decompile them.
perl -MO=Deparse some_program
PHP has things like DeZender and Show My Code.
My advice? Write a license and get a lawyer. The only other option is to not give out the code and instead run a hosted service.
See also the perlfaq entry on the subject.
Nothing will be perfect. If you just want something to stop non-programmers then here's a little script I wrote you can use:
<?php
$infile=$_SERVER['argv'][1];
$outfile=$_SERVER['argv'][2];
if (!$infile || !$outfile) {
die("Usage: php {$_SERVER['argv'][0]} <input file> <output file>\n");
}
echo "Processing $infile to $outfile\n";
$data="ob_end_clean();?>";
$data.=php_strip_whitespace($infile);
// compress data
$data=gzcompress($data,9);
// encode in base64
$data=base64_encode($data);
// generate output text
$out='<?ob_start();$a=\''.$data.'\';eval(gzuncompress(base64_decode($a)));$v=ob_get_contents();ob_end_clean();?>';
// write output text
file_put_contents($outfile,$out);
I'm not sure you can label obfuscation of an interpreted language as pointless (I'm unable to add a comment to Schwern's post, so here goes a new entry).
I think it's a little shortsighted to assume you know all the possible scenarios where someone would like to obfuscate code, and you assume that anyone will actually be willing to go to whatever necessary lengths to view that code once obfuscated. Consider my current scenario:
I work for a consulting company that is developing a large and fairly sophisticated PHP-based site. The project will be hosted on a client's server that is hosting other sites developed by other consultancies. Technically any code we write is owned by the client, so we can't license it. However, any other consultancy (competitor) with access to the server can copy our code without getting permission from the client first. We therefore have a genuine reason for obfuscation - to make the effort required for a competitor to understand our code more than the effort of creating a copy of our work from scratch.
See our SD Thicket PHP Obfuscator for an obfuscator that works just fine with arbitrarily large sets of pages. It operates primarily by scrambling identifier names. With modest to large applications, this can make the code extremely difficult to understand, which is the entire purpose.
It doesn't waste any energy on "eval(decode(encodedprogramcode))" schemes, which a lot of PHP "obfuscators" do [these are "encoder"s, not "obfuscator"s], because any clod can find that call and execute the eval-decode himself and get the decoded code.
It uses a language-precise parser to process the PHP; it will tell you if your program is syntactically invalid. More importantly, it knows the whole language precisely; it won't get lost or confused, and it won't break your code (other that what happens if you obfuscate "incorrectly", e.g., fail to identify the public API of the code correctly).
Yes, it obfuscates identifiers identically across pages; if it didn't do that, the result wouldn't work.
The best I've seen is Zend Guard.
Try this one: http://www.pipsomania.com/best_php_obfuscator.do
Recently I wrote it in Java to obfuscate my PHP projects, because I didnt find any good and compatible ready written on the net, I decided to put it online as saas, so everyone use it free. It does not change variable names between different scripts for maximum compatibility, but is obfuscating them very good, with random logic, every instruction too. Strings... everything. I believe its much better then this buggy codeeclipse, that is by the way written in PHP and very slow :)
Thicketâ„¢ Obfuscator for PHP
The PHP Obfuscator tool scrambles PHP source code to make it very difficult to understand or reverse-engineer (example). This provides significant protection for source code intellectual property that must be hosted on a website or shipped to a customer. It is a member of SD's family of Source Code Obfuscators.
Using SourceGuardian is good as it comes with a cool and easy to use GUI.
But be aware:
Pay attention to its -rather funny- licensing terms.
You are only allowed to run 1 per machine -so far this is acceptable
If you want to run the command line interface on another machine, say your web server, YOU WILL NEED ANOTHER LICENSE (Yes, it's funny and I can hear you laughing too).
Obfuscation is only adding another layer of potential bugs and security vulnerabilities to your program. Please don't do it.
The kind of people who write obfuscation software usually seem very sketchy and non-skilled anyway.
If your code is "great", crackers will go through great lengths to spread it, regardless of whether or not it is obfuscated. If nobody knows/cares about your code, they probably won't, either.

Embed C in PHP to improve speed

I'm thinking of writing a PHP extension from C, just to improve the speed. strpos() and preg_match() etc. are way too slow for my project.
But it struck me, that strpos() and preg_match() must have been 'originally' written in C or some other primitive language.
So, here my question: Is it meaningful, that I write some extension in C, just in order to improve the computation speed?
It might be useful if you can identify a "self-contained" bottleneck. PHP is still a scripting language. There are a lot of lookup operations, some memory operations which can be optimized away in C, maybe a handle/value/memory block from one of the underlying libraries that you could store/use more efficiently in your specific case, and so on and on.
But, make sure that the code block you're touching is worth the effort. I.e. first identify the bottleneck. Run a php profiler (like e.g. xdebug) and then maybe even a C profiler to see where the time is spent in the php runtime. And keep in mind that if you write the extension it's your job to keep it up to date, running and functional (including bug tracking/fixing, quality assurance, ...).
It is really cool that you are interested in writing some extension in PHP.
Please go through the below link to understand more about how Facebook started the HipHop project to increase speed. They achieved this by writing some code in the native language like C instead of PHP.
http://developers.facebook.com/blog/post/2010/02/02/hiphop-for-php--move-fast/
But instead of rewriting some already written ext in PHP, try to write your new one, you will find many articles on writing a new extension in PHP.
The existing extensions are already optimized, so if you want to do some specific work, and have a good algorithm to support it, go for writing your own extension.
not proofed, but i think, that you can't gain significantly better speed when you just do your own low-level implementation of regular-expressions or string-scanning...
php is written in c and highly optimized already...
check your code and improve the flow...
if it's impossible, take a look at "HipHop" from facebook...
Its not meaningful to write another implementation of strpos() or preg_match() in C. because PHP has already implemented them in C.
rather its meaningful to make your PHP code optimized such that It can use those functions instead of abusing them
But still If you really want to speed things up by providing yet another implementation It might help if and only if its fast enough. otherwise its just waste of time and labour.
You can have a look on PHP source code and check the current implementation of these function and see if you can really improve or not.

Using a single PHP script for an entire site

I had an idea today (that millions of others have probably already had) of putting all the sites script into a single file, instead of having multiple, seperate ones. When submitting a form, there would also be a hidden field called something like 'action' which would represent which function in the file would handle it.
I know that things like Code Igniter and CakePHP exist which help seperate/organise the code.
Is this a good or bad idea in terms of security, speed and maintenance?
Do things like this already exist that i am not aware of?
What's the point? It's just going to make maintenance more difficult. If you're having a hard time managing multiple files, you should invest the time into finding a better text editor / IDE and stop using Notepad or whatever is making it so difficult in the first place!
Many PHP frameworks rely on the Front Controller design: a single small PHP script serves as the landing point for all requests. Based on request arguments, the front controller invokes code in other PHP scripts.
But storing all code for your site in a single file is not practical, as other people have commented.
There are many forums that do this. Personally, I don't like it, mainly because if you make an error in the file, the entire site is broken until you fix it.
I like separation of each part, but I guess it has its plusses.
It's likely bad for maintenance, as you can't easily disable a section of your site for an update.
Speed: I'm not sure to be honest.
Security: You could accomplish the exact same security settings but just adding a security check to a file and then including that file in all your pages.
If you're not caching your scripts, everything in a single file means less disk I/O, and since generally, disk I/O is an expensive operation, this probably can be a significant benefit.
The thing is, by the time you're getting enough traffic for this to matter, you're probably better off going with caching anyway. I suppose it might make some limited sense, though, in special cases where you're stuck on a shared hosting environment where bandwidth isn't an issue.
Maintenance and security: composing software out of small integral pieces of code a programmer can fit inside their head (and a computer can manage neatly in memory) is almost always a better idea than a huge ol' file. Though if you wanted to make it hell for other devs to tinker with your code, the huge ol' file might serve well enough as part of an obfuscation scheme. ;)
If for some reason you were using the single-file approach to try and squeeze out extra disk I/O, then what you'd want to do is create a build process, where you did your actual development work in a series of broken-out discrete files, and issued make or ant like command to generate your single file.

What is your experience of PHP encrypters? Which one would you recommend?

We have an application that is written in PHP that we are going to license to a customer. Our company believes that the customer might intend to steal the source code and create their own fork of the software, therefore we want to encrypt the source code.
I have searched some for PHP-encrypters and found several that seems good, but since we have no previous experience of PHP-encrypters it hard to say which one is the best. Which PHP encrypters have you used and what is your experience?
So, First:
It is impossible to encrypt your entire code base because at some point there has to be an eval statement, and if the user changes the eval to an echo, they get all of your code in the browser.
And here is a bunch of people who agree with me.
Furthermore:
People will offer you obfuscators, but no amount of obfuscation can prevent someone from getting at your code. None. If your computer can run it, or in the case of movies and music if it can play it, the user can get at it. Even compiling it to machine code just makes the job a little more difficult. If you use an obfuscator, you are just fooling yourself. Worse, you're also disallowing your users from fixing bugs or making modifications. - Schwern
Now thats done:
Bytecompiling is something completely different than encrypting. It makes the PHP code into already interpreted bytes, similar to an exe file. You can include these files just like any other php file.
The byte code produced is able to be reverse engineered, but it would take lots of time and is not worth the company's time.
Check out the byte compiler PHP extension.
I'd also like to note that PHP comes with several ways of reverse engineering classes. Such as the Reflection Class. This basically allows people to see every method, variables, and constant in each of your classes without the need for your source code.
Frankly, once someone sees the functions you use, it is pretty easy to piece it together after that.
There's a lot of obfusticaters out there masquerading as encrypters.
If you really must encrypt your code use Zend.
IMHO shutting your customers out of your code is inherently evil and would rather hide some symbology in the code and sell it under a no-modify/re-sell contract. Then sue the ass off them if they try to sell it on. You could argue that encrypting your code closes down a business opportunity ;) !
C.

Seriously, should I write bad PHP code? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
I'm doing some PHP work recently, and in all the code I've seen, people tend to use few methods. (They also tend to use few variables, but that's another issue.) I was wondering why this is, and I found this note "A function call with one parameter and an empty function body takes about the same time as doing 7-8 $localvar++ operations. A similar method call is of course about 15 $localvar++ operations" here.
Is this true, even when the PHP page has been compiled and cached? Should I avoid using methods as much as possible for efficiency? I like to write well-organized, human-readable code with methods wherever a code block would be repeated. If it is necessary to write flat code without methods, are there any programs that will "inline" method bodies? That way I could write nice code and then ugly it up before deployment.
By the way, the code I've been looking at is from the Joomla 1.5 core and several WordPress plugins, so I assume they are people who know what they're doing.
Note: I'm pleased that everyone has jumped on this question to talk about optimization in general, but in fact we're talking about optimization in interpreted languages. At least some hint of the fact that we're talking about PHP would be nice.
How much "efficiency" do you need? Have you even measured? Premature optimization is the root of all evil, and optimization without measurement is ALWAYS premature.
Remember also the rules of Optimization Club.
The first rule of Optimization Club is, you do not Optimize.
The second rule of Optimization Club is, you do not Optimize without measuring.
If your app is running faster than the underlying transport protocol, the optimization is over.
One factor at a time.
No marketroids, no marketroid schedules.
Testing will go on as long as it has to.
If this is your first night at Optimization Club, you have to write a test case.
I think Joomla and Wordpress are not the greatest examples of good PHP code, with no offense. I have nothing personal against the people working on it and it's great how they enable people to have a website/blog and I know that a lot of people spend all their free time on either of those projects but the code quality is rather poor (with no offense).
Review security announcements over the past year if you don't believe me; also assuming you are looking for performance from either of the two, their code does not excel there either. So it's by no means good code, but Wordpress and Joomla both excel on the frontend - pretty easy to use, people get a website and can do stuff.
And that's why they are so successful, people don't select them based on code quality but on what they enabled them to do.
To answer your performance question, yes, it's true that all the good stuff (functions, classes, etc.) slow your application down. So I guess if your application/script is all in one file, so be it. Feel free to write bad PHP code then.
As soon as you expand and start to duplicate code, you should consider the trade off (in speed) which writing maintainable code brings along. :-)
IMHO this trade off is rather small because of two things:
CPU is cheap.
Developers are not cheap.
When you need to go back into your code in six months from now, think if those nano seconds saved running it, still add up when you need to fix a nasty bug (three or four times, because of duplicated code).
You can do all sorts of things to make PHP run faster. Generally people recommend a cache, such as APC. APC is really awesome. It runs all sorts of optimizations in the background for you, e.g. caching the bytecode of a PHP file and also provides you with functions in userland to save data.
So for example if you parse a configuration file each time you run that script disk i/o is really critical. With a simple apc_store() and apc_fetch() you can store the parsed configuration file either in a file-based or a memory-based (RAM) cache and retrieve it from there until the cache expired or is deleted.
APC is not the only cache, of course.
You should see the responses to this question: Should a developer aim for readability or performance first?
To summarize the consensus: Unless you know for a fact (through testing/profiling) that your performance needs to be addressed in some specific area, readability is far more important.
In 99% of the cases, you should better worry about code understandability. Write code easy to test, understand and mantain.
In those few cases where performance really is critical, scripting languages like PHP are not your best choice. There's a reason many base library functions in PHP are written in C, after all.
Personally, while there may be overhead for a function call, if it means I write the code once (parameterized), and then use it in 85 places, I'm WAY further ahead because I can fix it in one place.
Scripting languages tend to give people the idea that "good enough" and "works" are the only criteria to consider when coding.
Especially with a fast interpreter like PHP's, I don't think lack of readability/maintainability is EVER worth the efficiency you may (or may not!) gain from it.
And a note about WordPress: I've done a lot of browsing of the WordPress code. Don't assume those people know anything about good code, please.
To answer your first question, yes it is true and it is also true for compiled op-code. Yes you can make your code faster by avoiding function calls except in extreme cases where your code grows too large because of code duplication.
You should do what you like "I like to write well-organized, human-readable code with methods wherever a code block would be repeated."
If your going to commit this horrible atrocity of removing all function calls at least use a profiler and only do it to the 10% of your code that matters.
An example of how micro-optimization leads to macro slowdowns:
If you're seriously considering manually inlining functions, consider manually unrolling loops.
JMPs are expensive, and if you can eliminate loops by unrolling and also eliminate all conditional blocks, you'll eliminate all that time wasted merely seeking around the CPU's cache.
Variable augmentation at runtime is slow too, as is pulling things out of a database, so you should inline all that data into your code as well.
Actually, loading up an interpreter for merely executing code and copying memory out to a user is exhaustively wasteful, why don't we just pre-compute all the possible pages and store each page in memory ready to go so its just a mem-copy? surely thats fast!
Ah, now we've got that slow thing called the internet between us, which is hindering user experience and limiting how much content we can use, how about we pre-compute the pages in advance, and archive them all and run them on the users local machine? that'll be really fast!
But that's going to waste cpu cycles, lots of them, what with page load time and browser content rendering etc, we'll skip the middleman and just deliver the pages to them on printed media!. Genius!.
/me watches your company collapse on its face while you spend 10 years precomputing (by hand) and printing pages nobody wants to see.
This may sound silly to you, but to the rest of us, what you proposed is just that ridiculous.
Optimisation is good, but draw the line somewhere sensible so you don't have to worry about future people whom work on the code tracking you down in your sleep for having such a crappy codebase thats unmaintainable.
note: yes, I use gentoo. how did you guess?
Of course you shouldn't write bad PHP code. But once you have something written bad, you may always use perfomance as an excuse :-)
This is premature optimization. While the statement is true that a function call costs more than increasing a local integer variable (nearly everything costs more), the costs of a function call are still very low compared to a database query.
See also:
Wikipedia -> Optimization -> When to optimize
c2.com Wiki -> Premature Optimization
PHP's main strength is that it's quick and easy to get a working app. That strength comes from the opportunity to write loose (bad) code and have it still operate in a somewhat expected way.
If you are in a position to need to conserve a few CPU cycles, PHP is not what you should be using. When PHP web apps perform poorly, it is far more likely due to inefficient queries, not the speed of the code execution.
If you're that worried about every bit on efficiency, then why on earth are you using a scripting language? You should be programming in a much faster language (insert your favorite compiled language here), probably resulting in more, and less readable code, but it'll run really fast, and you can still aim for best coding practices.
Seriously, if you're coding for running speed, you shouldn't be using PHP at all.
If you develop web applications with a MVC architectural pattern, you can greatly benefit from caching and serialization. You can cache views, or portions of it, and you can serialize models.
From experience, models often parse and generate most of the data that's being displayed. If you know a certain model won't be generating new data frequently, like a model that parses an RSS feed, you can just have it stuffed somewhere with all the parsed data and have it refreshed every once in a while.
If you look at wordpress php code, it intermingles php tags in between its html which leads to spaghetti in my mind.
Phpbb3 however is way better in that regard. For example it has a strict division between the php part, and the styles part, which are xhtml formatted files with {template} tags, parsed by a template engine. Which is much cleaner.
Write a couple 10 minute examples and run them in your profiler.
That will tell you which is faster to the millisecond.
If you don't have a profiler, post them here, and I will run them in my PHPEd profiler.
I suspect that much of the time difference, if any, comes from having to open the file that a class is stored in, but that would have to be tested too.
Then ask yourself if you care that much about a few milliseconds vs having to maintain spaghetti code - will any of your users ever notice?
Edit
The profiler won't simulate high traffic volumes, but it will tell you which method is faster for a single user, and which parts of the code are using how much time. Especially if you profile the operations being done repeatedly - say 1000 times each in a loop.
We can assume (though not always) that faster code used by a lot of people will be faster than slower code used by a lot of people.
Those who will lecture you about code micro-optimization are generally the same ones which will have 50 SQL queries per page, taking up a total of 2 seconds, because they never heard about profiling. But their code is optimizized !!! (and slow as hell)
Fact : adding another webserver is not difficult. Replicating a database is.
Optimizing webserver code can be a net loss if it adds load on the DB.
Note : 2-3 ms for simple pages (like a forum topic) including SQL is a good target for a PHP website. My old website used to do that.

Categories