This may be better suited to server fault but I thought I'd ask here first.
We have a file that is prepended to every PHP file on our servers using auto-prepend that contains a class called Bootstrap that we use for autoloading, environment detection, etc. It's all working fine.
However, when there is an "OUT OF MEMORY" error directly preceding (i.e., less than a second or even at the same time) a request to another file on the same server, one of three things happens:
Our check, if (class_exists('Bootstrap')), which we used to wrap the class definition when we first got this error, returns true, meaning that class has already been declared despite this being the auto-prepend file.
We get a "cannot redeclare class Bootstrap" error from our auto-prepended file, meaning that class_exists('Bootstrap') returned false but it somehow was still declared.
The file is not prepended at all, leading to a one-time fatal error for files that depend on it.
We could, of course, try to fix the out of memory issues since those seem to be causing the other errors, but for various reasons, they are unfixable in our setup or very difficult to fix. But that's beside the point - it seems to me that this is a bug in PHP with some sort of memory leak causing issues with the auto-prepend directive.
This is more curiosity than anything since this rarely happens (maybe once a week on our high-traffic servers). But I'd like to know - why is this happening, and what can we do to fix it?
We're running FreeBSD 9.2 with PHP 5.4.19.
EDIT: A few things we've noticed while trying to fix this over the past few months:
It seems to happen only on our secure servers. The out of memory issues are predominantly on our secure servers (they're usually from our own employees trying to download too much data), so it could just be a coincidence, but it's worth pointing out.
The dump of get_declared_classes when we have this issue contains classes that are not used on the page that is triggering the error. For example, the output of $_SERVER says the person is on xyz.com, but one of the declared classes is only used in abc.com, which is where the out of memory issues usually originate from.
All of this leads me to believe that PHP is not doing proper end-of-cycle garbage collection after getting an out of memory error, which causes the Bootstrap class to either be entirely or partly in memory on the next page request if it's soon enough after the error. I'm not familiar enough with PHP garbage collection to actually act on this, but I think this is most likely the issue.
You might not be able to "fix" the problem without fixing the out of memory issue. Without knowing the framework you're using, I'll just go down the list of areas that come to mind.
You stated "they're usually from our own employees trying to download too much data". I would start there, as it could be the biggest/loudest opportunity for optimization; a few ideas come to mind.
if the data being downloaded is files, perhaps you could use streams to chunk the reads to a constant size, so memory is not gobbled up on big downloads (see the sketch after this list).
consider download queueing or throttling.
if the data is coming from a database, besides optimizing your queries, you could rate limit them, reduce result set sizes, and ideally move such workloads to a dedicated environment with mirrored data.
ensure your code releases file pointers and database connections responsibly; leaving that to PHP teardown can result in delayed garbage collection and a sort of cascading effect in high-traffic situations.
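For the chunked-read idea above, here is a minimal sketch; the function name, headers, and chunk size are illustrative, not from your setup:

// Hypothetical helper: stream a file to the client in fixed-size chunks
// so memory use stays flat no matter how large the download is.
function streamDownload($path, $chunkSize = 8192)
{
    header('Content-Type: application/octet-stream');
    header('Content-Length: ' . filesize($path));

    $handle = fopen($path, 'rb');
    while (!feof($handle)) {
        echo fread($handle, $chunkSize); // send one chunk at a time
        flush();                         // push it out instead of buffering it
    }
    fclose($handle);                     // release the file pointer promptly
}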
Other low-hanging fruit when it comes to memory limits:
you are running PHP 5.4.19; if your software permits it, consider updating to a more recent version (PHP 5.4 has not been patched since 2015), and besides, PHP 7 comes with a whole slew of performance improvements.
if you have a client-side application involved, monitor its XHR and overall network activity; look for excessive polling and hanging connections.
as for your autoloader, based on your comment "The dump of get_declared_classes when we have this issue contains classes that are not used on the page that is triggering the error", you may want to check the implementation to make sure it's not loading some sort of bundled class cache; if you are using Composer, dump-autoload might be helpful.
sessions: I've seen some applications load files based on cookies and sessions; if you have such a setup, I would audit that logic and ensure there are no sticky sessions loading unneeded resources.
It's clear from your question that you are running a multi-tenancy server. Without proper stats it's hard to be more specific, but I would think it's clear the issue is not a PHP issue, as it seems to be somewhat isolated, based on your description.
Proper Debugging and Profiling
I would suggest installing a PHP profiler, even for a short time; New Relic is pretty good. You will be able to see exactly what is going on and have the data to fix the right problem. I think they have a free trial, which should get you pointed in the right direction... There are others too, but their names escape me at the moment.
Note that class_exists('Bootstrap') returns false when only an interface of that name exists, yet you still cannot declare a class with the same name as an existing interface; that combination can produce the "cannot redeclare" error even though class_exists returned false.
Check both class_exists('Bootstrap') and interface_exists('Bootstrap') before declaring, to make sure you do not redeclare.
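A sketch of that guard around the auto-prepended class definition (PHP allows class declarations inside a conditional):

// Declare Bootstrap only if neither a class nor an interface with that
// name is already in the symbol table.
if (!class_exists('Bootstrap') && !interface_exists('Bootstrap')) {
    class Bootstrap
    {
        // ... autoloading, environment detection, etc. ...
    }
}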
Did you have a look at the __autoload function?
I believe that you could workaround this issue by creating some function like that in your code:
function __autoload($className)
{
    if (\file_exists($className . '.php')) {
        // Load the real class definition if the file exists.
        include_once $className . '.php';
    } else {
        // Declare a "ghost" class whose __call absorbs any method call.
        eval('class ' . $className . ' { function __call($method, $args) { return false; } }');
    }
}
If you have a file called Bootstrap.php with class Bootstrap declared inside it, PHP will automatically load the file; otherwise it declares a ghost class that can handle any method call, avoiding any error messages. Note that for the ghost class I used the __call magic method.
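As an aside, spl_autoload_register() supersedes __autoload() and allows several autoloaders to coexist; a minimal sketch of the same idea:

spl_autoload_register(function ($className) {
    // Load the class file if it exists; otherwise any other registered
    // autoloader gets a chance to handle the class.
    if (file_exists($className . '.php')) {
        include_once $className . '.php';
    }
});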
Related
As we know, the return values of file_exists(), filesize(), etc. are cached by PHP, causing considerable annoyance to developers. I often see and hear advice like "you must place clearstatcache() before your file info calls" or "write your own real_filesize() and place clearstatcache() at the first line". I have seen a lot of code filled with clearstatcache() calls. Additionally, before PHP 5.3 added the $filename parameter to clearstatcache(), it was not possible to clear the cache per file; you had to clear the whole cache every time.
Real software either a) requests file information rarely, or b) always needs fresh information. If someone really needs caching, they can easily implement it with a little bit of coding.
So, currently I can only see the downside of this caching. I think filestat caching is one of the major broken things in PHP, and it is included in PHP 7 too. The question, to anyone who knows: what are the benefits of caching file information in this unusable way?
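To make the annoyance concrete, a small self-contained example (the file name is arbitrary):

file_put_contents('data.txt', 'hello');
echo filesize('data.txt');   // 5

file_put_contents('data.txt', 'hello world');
echo filesize('data.txt');   // typically still 5 -- the stat cache is stale

clearstatcache();            // wipe the whole cache...
echo filesize('data.txt');   // ...now 11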
To give some context:
I had a discussion with a colleague recently about the use of Autoloaders in PHP. I was arguing in favour of them, him against.
My point of view is that Autoloaders can help you minimise manual source dependency which in turn can help you reduce the amount of memory consumed when including lots of large files that you may not need.
His response was that including files you do not need is not a big problem, because after a file has been included once it is kept in memory by the Apache child process, and that portion of memory is available for subsequent requests. He argues that you should not be concerned about the number of included files, because soon enough they will all be loaded into memory and used on demand from memory. Therefore memory is less of an issue, and the overhead of trying to find the file you need on the filesystem is much more of a concern.
He's a smart guy and tends to know what he's talking about. However, I always thought that the memory used by Apache and PHP was specific to that particular request being handled.
Each request is allowed an amount of memory up to the memory_limit PHP option, and any source compilation and processing is only valid for the life of the request.
Even with opcode caches such as APC, I thought that each individual request still needs to load each file into its own portion of memory and that APC is just a shortcut to having it pre-compiled for the responding process.
I've been searching for some documentation on this but haven't managed to find anything so far. I would really appreciate it if someone can point me to any useful documentation on this topic.
UPDATE:
Just to clarify, the autoloader discussion part was more of a context :).
It may not have been clear but my main question is about whether Apache will pool together its resources to respond to multiple requests (especially memory used by included files), or whether each request will need to retrieve the code required to satisfy the execution path in isolation from other requests handled from the same process.
e.g.:
Files 1, 2, 3 and 4 are an equal size of 100KB each.
Request A includes file 1, 2 and 3.
Request B includes file 1, 2, 3 and 4.
In his mind he's thinking that Request A will consume 300KB for the entirety of its execution and Request B will only consume a further 100KB, because files 1, 2 and 3 are already in memory.
In my mind it's 300KB and 400KB, because each request is processed independently (even if by the same process).
This brings him back to his argument that "just include the lot 'cos you'll use it anyway" as opposed to my "only include what you need to keep the request size down".
This is fairly fundamental to how I approach building a PHP website, so I would be keen to know if I'm off the mark here.
I've also always been of the belief that, for a large-scale website, memory is the most precious resource and more of a concern than the file-system checks of an autoloader, which are probably cached by the kernel anyway.
You're right though, it's time to benchmark!
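One crude way to gather those numbers: log the marginal memory cost of the includes inside each request and compare across repeated requests handled by the same process (the file names here are placeholders):

$before = memory_get_usage();
require 'file1.php';
require 'file2.php';
require 'file3.php';
// If the "kept in memory" theory were right, this cost would drop to
// roughly zero on the second request served by the same Apache child.
error_log('include cost: ' . (memory_get_usage() - $before) . ' bytes');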
Here's how you win arguments: run a realistic benchmark and be on the right side of the numbers.
I've had this same discussion, so I tried an experiment. Using APC, I tried a Kohana app with a single monolithic include (containing all of Kohana) as well as with the standard autoloader. The final result was that the single include was faster at a statistically irrelevant rate (less than 1%) but used slightly more memory (according to PHP's memory functions). Running the test without APC (or XCache, etc) is pointless, so I didn't bother.
So my conclusion was to continue using autoloading because it's much simpler to use. Try the same thing with your app and show your friend the results.
Now you don't need to guess.
Disclaimer: I wasn't using Apache. I cannot emphasize enough to run your own benchmarks on your own hardware on your own app. Don't trust that my experience will be yours.
You are the wiser ninja, grasshopper.
Autoloaders don't load the class file until the class is requested. This means they will use at most the same amount of memory as manual includes, but usually much less.
Classes get read fresh from file on each request, even though an Apache thread can handle multiple requests, so your friend's "eventually all are read" argument doesn't hold water.
You can prove this by putting an echo 'foo'; above the class definition in the class file. You'll see that on each new request the line is executed, regardless of whether you autoload or manually include the whole world of class files at the start.
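In other words, a class file like this will print on every request that loads it, not only on the first request handled by the process:

<?php
echo 'foo'; // appears in the output of every request that includes this file

class MyClass
{
    // ... class body ...
}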
I couldn't find any good, concise documentation on this (I may write some with memory usage examples), as I have also had to explain this to others and show evidence to get it to sink in. I think the folks at Zend didn't expect anyone to miss the benefits of autoloading.
Yes, APC and such (like all caching solutions) can overcome the resource negatives and even eke out small gains in performance, but you eat up lots of unneeded memory if you do this with a non-trivial number of libraries while serving a large number of clients. Try loading a healthy chunk of the PEAR libraries in a massive include file while handling 500 connections hitting your page at the same time.
Even when using things like APC, you benefit from using autoloaders with any non-namespaced classes (most existing PHP code), as they can help avoid global namespace pollution when dealing with large numbers of class libraries.
This is my opinion.
I think autoloaders are a very bad idea for the following reasons
I like to know what and where my scripts are grabbing the data/code from. Makes debugging easier.
There are also configuration problems, insofar as if one of your developers changes a file (an upgrade, etc.) or the configuration and things stop working, it is harder to find out where it is broken.
I also think that it is lazy programming.
As for memory/performance issues, it is just as cheap to buy some more memory for the computer if it is struggling with that.
We are using the Recess framework for a web service. As part of this we're using Recess' caching mechanism, provided as an SQLite database.
We've been using this caching mechanism happily for about a year. However on 3 occasions now we've had issues with the SQLite database getting "locked" and causing issues. The message that we get is "exception 'PDOException' with message 'SQLSTATE[HY000]: General error: 5 database is locked' in...".
I've had a search around and it seems to be a common issue, and there's a lot of discussion about ways to minimise its likelihood or prevent it (e.g. the Drupal boards). However, my issue seems to be slightly different. The situations I've seen described indicate that it's to do with concurrency: when two PHP processes try to access the SQLite database at exactly the same time, one of them gets a lock error. In that situation, efforts to simply minimise the issue make sense. But in my application, once the issue starts happening (presumably because of concurrency), the SQLite database is permanently locked from that point on. Every single cache-accessing request from then on gets a PDOException. Our solution has been to just delete the cache file, which is not the end of the world, but it requires manual intervention and means we lose the built-up cache data.
Why would this be happening? Are there other reasons why we might get the lock to start with? Why does the lock persist? Is there a way to programmatically clear it? Is there a way to prevent it in the first place?
The two "solutions" I'm considering so far are:
Put a try-catch around the cache accessing functions. If we get exceptions, just ignore the cache and notify tech support to manually clear the cache.
Use a mutex (via PHP's flock) on the SQLite file to prevent the concurrency issue (but again, I'm not even certain this is the root cause); see the sketch below.
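For what it's worth, a minimal sketch of option 2; the lock-file path is illustrative:

$lock = fopen('/tmp/recess-cache.lock', 'c');
if (flock($lock, LOCK_EX)) {   // block until we hold the exclusive lock
    // ... perform the SQLite cache read/write here ...
    flock($lock, LOCK_UN);     // release as soon as the access is done
}
fclose($lock);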
Any information or suggestions would be greatly appreciated!
Well...
Makes tech support handle a non-existent problem.
Ignores the fact that SQLite does its own locking. (SQLite3 has even optimized it!)
Leaving the geek in me standing at the door, I would use "solution" number 3 (which you didn't list) and simply put a try-catch around the cache-accessing functions. If you get an exception, do a short sleep and then call the exception-causing function again.
function thisFunction(/* ... */)
{
    // ...
    try
    {
        // ... the cache access that may throw ...
    }
    catch (Exception $e)
    {
        // Back off briefly, then retry the same operation.
        sleep(rand(1, 3));
        return thisFunction(/* ... */);
    }
    // ...
}
That way, you actually gain something from the error SQLite is trying to help you with. ;)
I hope this is not a completely stupid question. I have searched quite a bit for an answer, but I can't find (or recognise) one exactly on point.
I understand that functions in PHP are not parsed until actually run. Therefore, if I have a large class with many functions, only one of which requires a large include file, will I potentially save memory if I only include the "include file" within the function (as opposed to at the top of the class file)?
I presume that, even if this would save memory, it would only do so until such time as the function was called, after which the memory would not be released until the current script stopped running?
Many Thanks,
Rob
I love this saying: "Make it work and then, if needed, make it fast." -some good programmer?
In most cases you would probably be better off focusing on good OOP structure and application design than on speed. If your server is using something like Zend Optimizer, having all your methods in a single file won't make any difference, since it is all pre-compiled and stored in memory. (It's more complicated than this, but you get the idea.)
You can also load all your include files when Apache starts; then all the functions are loaded in memory. You wouldn't want to do that while developing, unless you like restarting Apache every time you make a code change, but done on production servers it can make a huge difference. And if you really want to make things fast, you can write the code in C++ and load it as a module for Apache.
But in the end... do you really need that speed?
Yes it will, but be sure the function doesn't depend on any other functions included by the parent. Memory consumption also depends on a couple of things, from the size of the file itself to the amount of virtual memory it requires for its variables and for proper garbage collection.
If the function is inside a class, it's called a method, and it might depend on its class to extend another class.
Just some things to consider. Always include the bare minimum.
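For instance, a hedged sketch of the pattern in question (the class and library names are hypothetical):

class ReportGenerator
{
    public function generatePdf()
    {
        // The heavy library is parsed only if this method actually runs,
        // rather than on every load of this class file.
        require_once __DIR__ . '/lib/BigPdfLibrary.php';
        // ... use the library here ...
    }
}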
Don't save memory in such cases unless you really need to; save development time instead. Memory is usually cheap, but development/support time isn't. Use a PHP opcode cache like eAccelerator or APC; it will increase execution speed because all files will be pre-compiled and stored in memory.
I was just reading over this thread where the pros and cons of using include_once and require_once were being debated. From that discussion (particularly Ambush Commander's answer), I've taken away the fact(?) that any sort of include in PHP is inherently expensive, since it requires the processor to parse a new file into OP codes and so on.
This got me to thinking.
I have written a small script which will "roll" a number of Javascript files into one (appending the all contents into another file), such that it can be packed to reduce HTTP requests and overall bandwidth usage.
Typically for my PHP applications, I have one "includes.php" file which is included on each page, and that then includes all the classes and other libraries I need. (I know this probably isn't the best practice, but it works - the __autoload feature of PHP5 is making this better in any case.)
Should I apply the same "rolling" technique on my PHP files?
I know of that saying about premature optimisation being evil, but let's take this question as theoretical, ok?
There is a problem with Apache/PHP on Windows which causes the application to be extremely slow when loading or even touching too many files (a page which loads approx. 50-100 files may spend a few seconds on file operations alone). This problem appears both with including/requiring and with working with files (fopen, file_get_contents, etc.).
So if you (or, more likely, anybody else, given the age of this post) ever run your app on Apache/Windows, reducing the number of loaded files is absolutely necessary. Combine multiple PHP classes into one file (an automated script for this would be useful; I haven't found one yet) or be careful not to touch any unneeded file in your app.
That would depend somewhat on whether it is more work to parse several small files or one big one. If you require files on an as-needed basis (not saying you necessarily should do things that way), then presumably for some execution paths there would be considerably less compilation required than if all your code was rolled into one big PHP file that the parser had to encode in its entirety, whether it was needed or not.
In keeping with the question, this is thinking aloud more than expertise on the internals of the PHP runtime; it doesn't sound as though there is any real-world benefit to getting too involved with this at all. If you run into a serious slowdown in your PHP, I would be very surprised if the use of require_once turned out to be the bottleneck.
As you've said: "premature optimisation ...". Then again, if you're worried about performance, use an opcode cache like APC, which makes this problem almost disappear.
This isn't an answer to your direct question, just about your "js packing".
If you leave your JavaScript files alone and allow them to be included individually in the HTML source, the browser will cache those files. Then on subsequent requests, when the browser asks for the same JavaScript file, your server will return a 304 Not Modified header and the browser will use the cached version. However, if you're "packing" the JavaScript files together on every request, the browser will re-download the file on every page load.