I can't believe anyone would normally object to documentation and comments, but what's the best (and most practical) way to do this in PHP?
In JavaScript, you can document your code and then run a minimizer to produce a production version of your code for use. What about PHP: do the extra lines of text affect performance? Should you keep your documentation in a separate file (which I imagine might be important for APIs and frameworks, but would slow down your development)?
Edit:
My logic was that it would necessarily take longer, because the file is larger and adds more work (though possibly negligible) for the parser to sort out. Say you had 1,000 lines of code and 500 lines of comments: should you put a summarized version in the code instead, and direct readers to the actual documentation page?
With PHP (like ASP.NET) the code can be compiled on the server and HTML generated to be sent down the wire to the client.
PHP source code is compiled on-the-fly to an internal format that can be executed by the PHP engine. In order to speed up execution time and not have to compile the PHP source code every time the webpage is accessed, PHP scripts can also be deployed in executable format using a PHP compiler.
Source
During this process the comments are ignored and play no further part in proceedings. They don't slow down the compiler and hence don't affect performance.
You should comment your code (where appropriate) and keep the comments with the source code. Otherwise there's an even greater chance that they'll get out of step with the code, and then they're worse than useless.
If you assume an O(1) CPU cost for every comment character, and your code is 4 KB with half of it comments, then you waste 2k basic operations. With modern processors, a basic operation is ~20 clock cycles, but since there are pipeline optimizations, we can assume maybe 5 cycles per character. Since we're dealing with gigahertz CPUs these days, that's 2k × 5 / 1G ≈ 1/100,000th of a second wasted on comment parsing.
Comment your code!
Negligible elements:
4 KB is one disk block, so disk reads are minimized; you'll have to go to disk for the program code anyway. CPUs have enough cache that the whole program, comments and all, will be held in memory.
Context switching events are likely to occur, but the amount of time executing PHP code is much greater than the amount of time ignoring comments.
The interpreter must maintain state, but that state maintenance adds nothing extra to the processing time: once we know we're in a comment, we just keep reading until we detect the end of the comment. That's O(1) per character, except when starting or ending a comment.
Setup/teardown of the PHP interpreter can be neglected, because comment detection happens whether or not comments exist: the absence of comments in your code does not stop the interpreter from looking for them. Therefore, no speedup for you.
Comment your code!
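If you'd rather sanity-check the estimate than trust the arithmetic, a hypothetical micro-benchmark along these lines (the loop body and the comment filler are made up) compares evaluating the same code with and without comment padding:

```php
<?php
// Hypothetical micro-benchmark: time eval() on identical code, once plain
// and once padded with 500 comment lines.
$body = 'for ($i = 0; $i < 1000; $i++) { $x = $i * 2; }';
$comments = str_repeat("// filler comment line\n", 500);

$t0 = microtime(true);
eval($body);
$plain = microtime(true) - $t0;

$t1 = microtime(true);
eval($comments . $body);
$commented = microtime(true) - $t1;

printf("plain: %.6fs, with comments: %.6fs\n", $plain, $commented);
```

On any recent machine the two timings should be indistinguishable from run-to-run noise, which is the point of the answer above.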
No, it does not affect performance, but it will affect your readability. A wise man once said: "Comment as if the next guy to maintain your code is a homicidal maniac who knows where you live."
Minimizing JavaScript is a good idea because the whole source is sent to the client: for code with a lot of comments, or clients with slow connections, this could lead to a substantial delay in processing (where substantial could mean seconds).
This is not the case with PHP: the code executes directly on the server, and thus only the processed result is sent to the client. There may be some small overhead whilst the PHP parser reads the code, but I would expect this to be negligible - a few milliseconds at most.
Adding documentation to your code will in no way impact your performance (neither negatively nor positively). Documenting your code is really important so that other people (and you yourself, after a few months) can figure out what's going on. It may take perhaps a millisecond longer for the lexer, but that's barely worth mentioning.
It will increase your file size. But the parser ignores it, so it won't impact performance in terms of execution time, etc.
I think the general consensus is to comment within the actual code. While I suppose you could maintain separate files of documentation with references to your code, I can only imagine how cumbersome that would be to maintain.
Definitely, without question, document your PHP code. It's already hard enough for someone trying to understand what you've done. Any detriment to documenting code (and I cannot think of any) is completely outweighed by the benefits.
I am willing to bet that removing comments from almost any code results in zero noticeable speed improvement.
So the answer to your "best practice" question is: don't remove comments.
The primary difference is that JavaScript has to be sent to the client, so those extra bytes take up resources during transfer. And it's not really important to send the documentation to the client. In PHP on the server side, those comments will be ignored by the parser. So document to your heart's content :)
The parser will ignore it. Keep the documentation there and never skimp on it; the more comments the better.
When using docblocks to comment your classes/functions/variables etc., IDEs that support code suggestion/highlighting can give you information about the resources you are using as you write. Also, when it comes to creating documentation, there are various tools out there to automate the process using docblocks; in short, if you comment correctly then you have already created your documentation too.
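For illustration, a minimal docblock sketch (the function and its tags are invented for the example). An IDE can pick up the parameter types and descriptions from the block, and tools such as phpDocumentor can turn it into reference pages:

```php
<?php
/**
 * Calculate the final amount after compound interest.
 *
 * @param float $principal Initial amount.
 * @param float $rate      Interest rate per period (e.g. 0.05 for 5%).
 * @param int   $periods   Number of compounding periods.
 *
 * @return float The compounded amount.
 */
function compoundInterest(float $principal, float $rate, int $periods): float
{
    return $principal * pow(1 + $rate, $periods);
}

echo compoundInterest(1000.0, 0.05, 10); // roughly 1628.89
```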
In terms of efficiency, yes, the parser will have to filter out comments when the file is loaded into memory (along with whitespace, I might add). However, this filtering runs only once on the server, before the PHP script itself executes, and the script will take considerably longer to run than the filtering. This means that the proportion of time spent executing the intended PHP is much greater by comparison, and any decrease in efficiency will be negligible.
Also, think of the time that you will save your successors; that is reason enough ;)
Related
A little bit of a generic question but it has been playing on my mind for a while.
Whilst learning PHP coding, to help me create a WordPress theme from scratch, I have noticed that some arrays/parameters are kept on a single line whilst others are listed one underneath another. Personally, I prefer listing the arrays underneath one another, as I feel this helps with readability and generally just looks tidier - especially if the array is long.
Does anyone know if listing arrays/parameters this way has any performance ill effects, such as slowing down the page load speed? As far as I can see, it is just a coder's preference. Is this a correct assumption?
Code formatting has no effect on performance.
Even if you argue that a larger file takes longer to read: if you are using at least PHP 5.5, then PHP will use an opcode cache - it caches the parsed form of your files for subsequent requests, eliminating any effect of the formatting in your file.
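If you want to check whether the opcode cache is actually active on your server, a small sketch (assuming the bundled Zend OPcache extension; the helper name is made up):

```php
<?php
// Hypothetical helper: summarize whether an opcode cache is in effect.
// Assumes the Zend OPcache extension, bundled since PHP 5.5 but sometimes
// disabled in php.ini.
function opcacheSummary(): string
{
    if (!function_exists('opcache_get_status')) {
        return 'no OPcache: every request re-parses your source files';
    }
    $status = opcache_get_status(false);
    return ($status !== false && !empty($status['opcache_enabled']))
        ? 'OPcache on: files are compiled once and served from memory'
        : 'OPcache loaded but not enabled';
}

echo opcacheSummary(), "\n";
```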
I have a PHP application where heavy calculations are sometimes needed (I search through operations recorded by the users and run a lot of economic analysis over long periods of time).
I'd like to improve the speed of these calculations. Is it worth rewriting the calculation parts in C? (Among the faster languages, C is the one I know best.)
I had already decided to do this, but while looking for how to do it I found this Stack Overflow question. There, someone commented "Why not just write the whole site/page using either PHP or C?", and I realized I need more information.
If you are really worried about performance, first measure whether a PHP (or other) implementation is fast enough. You may well find that there is no need to worry. If the calculations are really heavy (and there is a chance they will grow in complexity as your application evolves), it could be worth running them asynchronously in a separate backend service. For example, your PHP frontend could dispatch to a C/C++ service, which eventually places results in a database. This requires extra logic, and somebody (the client) will have to poll regularly, but it scales nicely.
There are other points to consider besides performance: if your math is complex and keeps growing, PHP may not be a good environment in which to express it. Then again, maybe a Java-based stack with a clear separation of frontend and business logic would be better just from a maintenance point of view.
I've just finished writing a pretty big class, and it has turned out to be noticeably slow (when executing one of its functions).
The class contains 3 public functions and at least 10 private functions, amounting to 1,569 lines of code (a fair amount of which is comments).
Now I must admit there is a lot of singing and dancing going on behind the scenes when calling one of the public functions, but the response time is so bad that the class is not even worth using.
The class file is 57 KB. I'm wondering if the file size is the problem here, or if it's simply the code I'm executing. Can I just break the file down, or am I going to have to make compromises in my code?
I've also tried simply including the class file from another file, and the result is the same...
In case this does any good:
1.) There are quite a few functions in the class that involve file reading/writing (such as file_exists(), fopen(), fwrite(), etc.)
2.) There are no database queries/connections
3.) There are no huge (over count of 20) loops
Any help is much appreciated!
I/O is very likely your slowest operation.
The size of the program itself isn't much of an issue. You can profile your code with Xdebug to determine what specifically is dragging you down, and use that to prioritize which code to optimize first.
I've had big files with no problems in terms of speed.
The most likely reason for the slowness is that you're not using file handles efficiently. For example, if you close a file after every use, it will take significantly longer than closing it once at the end of the script (or just letting it be closed implicitly by not closing it manually).
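A minimal sketch of the two patterns (the file name is made up for the example):

```php
<?php
// Contrast: reopening a file on every write vs. keeping one handle open.
$file = sys_get_temp_dir() . '/handle_demo.log';

// Slow pattern: open and close the file on every single write.
for ($i = 0; $i < 100; $i++) {
    $h = fopen($file, 'a');
    fwrite($h, "line $i\n");
    fclose($h);
}

// Faster pattern: open once, write many times, close once.
$h = fopen($file, 'a');
for ($i = 0; $i < 100; $i++) {
    fwrite($h, "line $i\n");
}
fclose($h);

unlink($file); // clean up the demo file
```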
I generally include one functions file in the header of my site. The site is pretty high-traffic, and I like to make every little thing the best that I can, so my question is:
Is it better to include multiple smaller function files with just the code that's needed for each page, or does it really make no difference to load it all as one big file? My current functions file has all the functions for my whole site; it's about 4,000 lines long and is loaded on every single page load, site-wide. Is that bad?
It's difficult to say. 4,000 lines isn't that large in the realms of file parsing. In terms of code management, that's starting to get on the unwieldy side, but you're not likely to see much of a measurable performance difference by breaking it up into 2, 5 or 10 files, and having pages include only the few they need (it's better coding practice, but that's a separate issue). Your differential in number-of-lines read vs. number-of-files that the parser needs to open doesn't seem large enough to warrant anything significant. My initial reaction is that this is probably not an issue you need to worry about.
On the opposite side of the coin, I worked on an enterprise-level project where some operations had an include() tree that often extended into the hundreds of files. Profiling these operations indicated that the time taken by the include() calls alone made up 2-3 seconds of a 10 second load operation (this was PHP4).
If you can install extensions on your server, you should take a look at APC (see also).
It is free, by the way ;-) but you must be an admin of your server to install it, so it's generally not available on shared hosting...
It is what is called an "opcode cache".
Basically, when a PHP script is called, two things happen:
the script is "compiled" into opcodes
the opcodes are executed
APC keeps the opcodes in RAM, so the file doesn't have to be re-compiled each time it is called -- and that's a great thing for both CPU load and performance.
To answer the question a bit more:
4,000 lines is not that much, performance-wise; open a couple of files from any big application / framework, and you'll quickly get to a couple of thousand lines
a really important thing to take into account is maintainability: what will be easier for you and your team to work with?
loading many small files might imply many system calls, which are slow; but those would probably be cached by the OS... so probably not that relevant
If you are doing even one database query, that query (including the network round-trip between the PHP server and the DB server) will probably take more time than the parsing of a couple of thousand lines ;-)
I think it would be better if you could split the functions file into components appropriate to each page, and include those components on the appropriate pages. Just my 2 cents!
P.S. I'm a PHP amateur and I'm trying my hand at making a PHP site; I'm not using any functions. So can you enlighten me on what functions you would need for a site?
In my experience having a large include file which gets included everywhere can actually kill performance. I worked on a browser game where we had all game rules as dynamically generated PHP (among others) and the file weighed in at around 500 KiB. It definitely affected performance and we considered generating a PHP extension instead.
However, as usual, I'd say you should do what you're doing now until it is a performance problem and then optimize as needed.
If you load a 4,000-line file and use maybe one function that is 10 lines, then yes, I would say it is inefficient. Even if you used lots of functions amounting to a combined 1,000 lines, it is still inefficient.
My suggestion would be to group related functions together and store them in separate files. That way if a page only deals with, for example, database functions you can load just your database functions file/library.
Another reason for splitting the functions up is maintainability. If you need to change a function, you have to find it in your monolithic include file. You may also have functions that are very, very similar without even realising it. Sorting functions by what they do allows you to compare them and get rid of things you don't need, or merge two functions into one more general-purpose function.
Most of the time disk I/O is what will kill your server, so I think the fewer files you fetch from disk the better. Furthermore, if it is possible to install APC, the file will be stored compiled in memory, which is a big win.
Generally it is better, file management wise, to break stuff down into smaller files because you only need to load the files that you actually use. But, at 4,000 lines, it probably won't make too much of a difference.
I'd suggest a solution similar to this:
function inc_lib($name)
{
    // Include a library file from the shared lib directory.
    include '/path/to/lib/' . $name . '.lib.php';
}

function inc_class($name)
{
    // Include a class file from the shared lib directory.
    include '/path/to/lib/' . $name . '.class.php';
}
I would like to implement Singular Value Decomposition (SVD) in PHP. I know that there are several external libraries which could do this for me. But I have two questions concerning PHP, though:
1) Do you think it's possible and/or reasonable to code the SVD in PHP?
2) If (1) is yes: Can you help me to code it in PHP?
I've already coded some parts of the SVD myself. Here's the code, with comments describing the course of action. Some parts of this code aren't completely correct.
It would be great if you could help me. Thank you very much in advance!
SVD-python
is a very clear, parsimonious implementation of the SVD. It's practically pseudocode and should be fairly easy to understand and compare with or draw on for your PHP implementation, even if you don't know much Python.
That said, as others have mentioned, I wouldn't expect to be able to do very heavy-duty LSA with a PHP implementation on what sounds like a pretty limited web host.
Cheers
Edit:
The module above doesn't do anything all by itself, but there is an example included in the opening comments. Assuming you downloaded the Python module and it is accessible (e.g. in the same folder), you could implement a trivial example as follows:
#!/usr/bin/python
import svd

# Example 8x5 matrix from the module's opening comments.
a = [[22., 10.,  2.,   3.,  7.],
     [14.,  7., 10.,   0.,  8.],
     [-1., 13., -1., -11.,  3.],
     [-3., -2., 13.,  -2.,  4.],
     [ 9.,  8.,  1.,  -2.,  4.],
     [ 9.,  1., -7.,   5., -1.],
     [ 2., -6.,  6.,   5.,  1.],
     [ 4.,  5.,  0.,  -2.,  2.]]

u, w, vt = svd.svd(a)
print(w)  # the singular values
Here 'w' contains your list of singular values.
Of course this only gets you part of the way to latent semantic analysis and its relatives. You usually want to reduce the number of singular values, then employ some appropriate distance metric to measure the similarity between your documents, or words, or documents and words, etc. The cosine of the angle between your resultant vectors is pretty popular.
Latent Semantic Mapping (pdf) is by far the clearest, most concise and informative paper I've read on the remaining steps you need to work out following the SVD.
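On the PHP side, the cosine measure mentioned above is only a few lines. A minimal sketch (the function name is made up; the vectors would be rows of your reduced SVD output):

```php
<?php
// Cosine similarity between two equal-length vectors, e.g. reduced
// document vectors after truncating the SVD.
function cosineSimilarity(array $a, array $b): float
{
    $dot = $na = $nb = 0.0;
    foreach ($a as $i => $x) {
        $dot += $x * $b[$i];       // accumulate dot product
        $na  += $x * $x;           // squared norm of $a
        $nb  += $b[$i] * $b[$i];   // squared norm of $b
    }
    return $dot / (sqrt($na) * sqrt($nb));
}

echo cosineSimilarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]); // parallel vectors: 1
```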
Edit 2: Also note that if you're working with very large term-document matrices (I'm assuming this is what you are doing), it is almost certainly going to be far more efficient to perform the decomposition offline, and then perform only the comparisons live in response to requests. While svd-python is great for learning, SVDLIBC is more what you would want for such heavy computation.
Finally, as mentioned in the Bellegarda paper above, remember that you don't have to recompute the SVD every single time you get a new document or request. Depending on what you are trying to do, you could probably get away with performing the SVD once every week or so, offline on a local machine, and then uploading the results (size/bandwidth concerns notwithstanding).
Anyway, good luck!
Be careful when you say "I don't care what the time limits are". SVD is an O(N^3) operation (or O(MN^2) if it's a rectangular m*n matrix), which means you could very easily be in a situation where your problem takes a very long time. If the 100*100 case takes one minute, the 1000*1000 case would take 10^3 minutes, or nearly 17 hours (and probably worse, realistically, as you're likely to fall out of cache). With something like PHP, the prefactor -- the number multiplying the N^3 in the required FLOP count -- could be very, very large.
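That scaling argument is easy to make concrete. A small sketch (the function name is made up) that extrapolates a measured runtime under the O(N^3) assumption:

```php
<?php
// Extrapolate the runtime of an O(n^3) algorithm: given a measured time
// at problem size $n1, estimate the time at size $n2.
function scaleCubic(float $seconds, int $n1, int $n2): float
{
    return $seconds * pow($n2 / $n1, 3);
}

// If a 100x100 SVD takes 60 s, a 1000x1000 one should take ~10^3 times
// longer: 60,000 s, i.e. nearly 17 hours (before cache effects make it worse).
echo scaleCubic(60.0, 100, 1000); // 60000
```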
Having said that, of course it's possible to code it in PHP -- the language has the required data structures and operations.
I know this is an old Q, but here's my 2-bits:
1) A true SVD is much slower than the calculus-inspired approximations used, e.g., in the Netflix Prize. See: http://www.sifter.org/~simon/journal/20061211.html
There's an implementation (in C) here:
http://www.timelydevelopment.com/demos/NetflixPrize.aspx
2) C would be faster but PHP can certainly do it.
PHP Architect author Cal Evans: "PHP is a web scripting language... [but] I’ve used PHP as a scripting language for writing the DOS equivalent of BATCH files or the Linux equivalent of shell scripts. I’ve found that most of what I need to do can be accomplished from within PHP. There is even a project to allow you to build desktop applications via PHP, the PHP-GTK project."
Regarding question 1: It definitely is possible. Whether it's reasonable depends on your scenario: How big are your matrices? How often do you intend to run the code? Is it run in a web site or from the command line?
If you do care about speed, I would suggest writing a simple extension that wraps calls to the GNU Scientific Library.
Yes, it's possible, but implementing SVD in PHP isn't the optimal approach. As you can see here, PHP is slower than C and also slower than C++, so it might be better to implement it in one of these languages and call it as a function to get your results. You can find an implementation of the algorithm here, so you can guide yourself through it.
For the function calling, you can use:
The exec() Function
The system function is quite useful and powerful, but one of the biggest problems with it is that all resulting text from the program goes directly to the output stream. There will be situations where you might like to format the resulting text and display it in some different way, or not display it at all; that is where exec() comes in, since it returns the program's output lines in an array instead of echoing them.
The system() Function
The system function in PHP takes a string argument with the command to execute as well as any arguments you wish passed to that command. This function executes the specified command, and dumps any resulting text to the output stream (either the HTTP output in a web server situation, or the console if you are running PHP as a command line tool). The return of this function is the last line of output from the program, if it emits text output.
The passthru() Function
One fascinating function that PHP provides similar to those we have seen so far is the passthru function. This function, like the others, executes the program you tell it to. However, it then proceeds to immediately send the raw output from this program to the output stream with which PHP is currently working (i.e. either HTTP in a web server scenario, or the shell in a command line version of PHP).
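A minimal sketch contrasting the three calls described above, using a plain echo command so the example is self-contained:

```php
<?php
// exec() captures output lines into an array and sets the exit status;
// system() echoes the output and returns the last line; passthru() streams
// the raw output directly. The command here is just `echo`.
exec('echo hello', $lines, $code);
printf("exec captured: %s (exit %d)\n", $lines[0], $code);

$last = system('echo hello'); // prints "hello" and returns it
passthru('echo hello');       // prints "hello" directly, returns nothing
```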
Yes, this is perfectly possible to implement in PHP.
I don't know what a reasonable time frame for execution would be, or how large a matrix it could handle; I would probably have to implement the algorithm to get a rough idea.
Yes, I can help you code it. But why do you need help? Doesn't the code you wrote work?
Just as an aside question. What version of PHP do you use?