Okay, so I'm relatively naive about the PHP VM and I've been wondering about something lately: in particular, what the request lifecycle looks like in PHP for a web application. I found an article here that gives a good explanation, but I feel that there has to be more to the story.
From what the article explains, the script is parsed and executed each time a request is made to the server! This just seems crazy to me!
I'm trying to learn PHP by writing a little micro-framework that takes advantage of many PHP 5.3/5.4 features. As such, I got to thinking about what static means and how long a static class variable actually lives. I was hoping that my application could have a setup phase that caches its results into a class with static properties. However, if the entire script is parsed and executed on each request, I fail to see how I can avoid running the application initialization steps for every request served!
I just really hope that I am missing something important here... Any insight is greatly appreciated!
From what the article explains, the script is parsed and executed each time a request is made to the server! This just seems crazy to me!
No, that article is accurate. There are various ways of caching the results of the parsing/compilation, but the script is executed in its entirety each time. No instances of classes or static variables are retained across requests. In essence, each request gets a fresh, never-before-executed copy of your application.
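As a quick illustration (a hypothetical snippet, not from the article): a static property only lives for the duration of one request.

<?php
class Counter
{
    public static $hits = 0;   // reset to 0 at the start of every request
}

Counter::$hits++;
echo Counter::$hits;           // always prints 1, no matter how many requests arrive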
I fail to see how I can avoid running the application initialization steps for every request served!
You can't, nor should you. You need to initialize your app to some blank state for each and every request. You could serialize a bunch of data into $_SESSION which is persisted across requests, but you shouldn't, until you find there is an actual need to do so.
I just really hope that I am missing something important here...
You seem to be worried over nothing. Every PHP site in the world works this way by default, and the vast, vast majority never need to worry about performance problems.
No, you are not missing anything. If you need to keep some application state, you must do it using a database, files, Memcache, etc.
Although this can sound crazy if you're not used to it, it's actually good for scaling, among other things: because you keep your state in other services, you can easily run several instances of the PHP server.
A static variable, like any other PHP variable, only persists for the life of the script execution and as such does not 'live' anywhere. Persistence between script executions is handled via session handlers.
Related
This will be a newbie question, but I'm learning PHP for one sole purpose (at the moment): to implement a solution. Everything I've learned about PHP was learned in the last 18 hours.
The goal is to add indirection to my JavaScript GET requests to allow cross-domain access to another website. I also don't wish to overload said website, so I want to put throttling safeguards in place. I can't rely on those safeguards living in the JavaScript, because that can't account for other peers sending their own requests.
So right now I have the following makeshift code, without any throttling measures:
<?php
$expires = 15; // cache lifetime in seconds

if (empty($_GET["target"]))
    exit();

$fn   = md5($_GET["target"]);
$file = "cache/" . $fn;

if (empty($_GET["cache"])) {
    // Read mode: if there is no cached copy, or it has expired,
    // fetch the target directly; otherwise serve the cached copy.
    if (!file_exists($file) || time() - filemtime($file) > $expires)
        echo file_get_contents($_GET["target"]);
    else
        echo file_get_contents($file);
} else if (!empty($_GET["data"])) {
    // Write mode: store the data supplied by the client under the target's hash.
    file_put_contents($file, $_GET["data"]);
}
?>
It works perfectly, as far as I can tell (it doesn't account for the improbable hash collision). Now what I want to know, and what my search queries in Google refuse to procure for me, is how PHP actually launches and when it ends.
Obviously, if I were running my own web server I'd have a bit more insight into this; I'm not, and I have no shell access either.
Basically I'm trying to figure out whether I can control when the script ends in the code, and whether every GET request to the PHP file launches a new instance of the script or whether it can 'wake up' the same script. The reason is that I wish to track whether, say, it already sent a request to 'target' within the last n milliseconds, and it seems a bit wasteful to dump the value to a save file and then recover it, over and over, for something that doesn't need to be kept in memory for very long.
Every HTTP request starts a new instance of the interpreter; it's basically an implementation detail whether this is a whole new process, or a reuse of an existing one.
This generally pushes you towards good, simple, and scalable designs: you can run multiple server processes and threads and you won't get varying behaviour depending on whether the request goes back to the same instance or not.
Loading a recently-touched file will be very fast on Linux, since it will come right from the cache. Don't worry about it.
Do worry about the fact that by directly appending request parameters to the path you have a serious security hole: people can pass data=../../../etc/passwd and so on. Read http://www.php.net/manual/en/security.variables.php for more. (In this particular example you're hashing the inputs before putting them in the path, so it's not a practical problem, but it is something to watch for.)
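As a general pattern (a hypothetical snippet, not part of your code, and $_GET["file"] is just an example parameter), one way to keep user input from escaping the cache directory is to resolve the path and check it stays inside:

<?php
$name = basename($_GET["file"]);                 // drops any ../ components
$path = realpath(__DIR__ . "/cache/" . $name);   // resolve to an absolute path

if ($path === false || strpos($path, __DIR__ . "/cache/") !== 0) {
    exit("Invalid path");                        // refuse anything outside cache/
}
echo file_get_contents($path);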
More generally, if you want to hold a cache across multiple requests, the typical thing these days is to use memcached.
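For instance, a rough sketch using the Memcached extension (this assumes the extension is installed and a memcached server is running on localhost:11211; the key prefix is arbitrary):

<?php
$mc = new Memcached();
$mc->addServer("127.0.0.1", 11211);

$key  = "proxy_" . md5($_GET["target"]);
$body = $mc->get($key);

if ($body === false) {                       // not cached yet, or expired
    $body = file_get_contents($_GET["target"]);
    $mc->set($key, $body, 15);               // keep it for 15 seconds
}
echo $body;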
PHP works on a per-request basis, i.e. each request for a PHP file is seen as a new instance. Each instance ends, generally, when the connection is closed. You can, however, use sessions to save data between requests for a specific user.
For basic use of sessions look into:
session_start()
$_SESSION
session_destroy()
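A minimal sketch of how those fit together, applied to your "was there a request in the last n milliseconds" idea (the key name and the 500 ms threshold are made up, and note that a session only throttles one user, not all peers):

<?php
session_start();                                  // resumes (or creates) this user's session

$now  = microtime(true);
$last = isset($_SESSION["last_request"]) ? $_SESSION["last_request"] : 0;

if (($now - $last) * 1000 < 500) {                // refuse more than one request per 500 ms
    exit("Too many requests");
}
$_SESSION["last_request"] = $now;                 // persisted until this user's next request

// When the data is no longer needed:
// session_destroy();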
What is the best way to break up a recursive function that is using a ton of resources?
For example:
function do_a_lot() {
    // a lot of code and processing is done here
    // it takes a lot of execution time
    if ($true) {
        // if true we have to do all of that processing again
        do_a_lot();
    }
}
Is there any way to make the server only have to take the brunt of the first execution and then break up the recursion into separate processes? Or am I dreaming?
Honestly, if your function is using up that much of your system's resources, I'd most likely refactor the code. It's not truly multithreading, but you could perhaps look at using popen to spawn a separate process.
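Something along these lines (a rough sketch; worker.php is a hypothetical CLI script that does the heavy work, and the backgrounding trick assumes a Unix-like shell):

<?php
// Hand the heavy work to a separate CLI process so this request can return quickly.
$handle = popen("php worker.php > /dev/null 2>&1 &", "r");
if ($handle !== false) {
    pclose($handle);   // we don't wait for the worker's output
}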
One of the rules of PHP is "share nothing". That means every PHP process is independent and shares nothing with the others. So if you want to spread your execution over several PHP processes, you'll have to store the data somewhere: memcached storage, a database, or the session, as you prefer.
Then you'll need to 'fork' your PHP process. There are solutions available to get this done on the server side, but IMHO these are all hacks: dangerous and not really in the spirit of the PHP/web way, with the exception of 'work queue' tools.
I think the nicest way is to break your task up with AJAX. This will allow you a clean user interface and will avoid any long response timeouts in the web process. That is: show a 'working zone' to your user, then ask via AJAX for the first step of the job, get the response (storing intermediate state on the server side), then ask for the next step, store the new state and respond, and so on. You can even add a 'stop that stuff' button on the client side.
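A rough sketch of such a "one step per request" endpoint, driven by AJAX polling (do_one_step() and the step count are hypothetical placeholders for your own work):

<?php
session_start();

$totalSteps = 10;                                   // assumed number of chunks
$step = isset($_SESSION["step"]) ? $_SESSION["step"] : 0;

do_one_step($step);                                 // hypothetical: one chunk of the heavy work
$_SESSION["step"] = $step + 1;

header("Content-Type: application/json");
echo json_encode(array("done" => ($step + 1) >= $totalSteps));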
You can also search for 'php work queue' on Google.
If it's a long-running task, divide and conquer with Gearman.
I am developing a project with CodeIgniter and writing unit tests and web tests in SimpleTest. I've noticed that my tests are not deterministic, i.e. they produce different outputs over time. I mean test cases that should be strictly deterministic, not relying on random variables, etc.
The tests look like they are affecting each other. Quite often, when everything goes okay, I have, let's say, 100 passing tests, but when I write a new test method that fails, several other tests also fail. Often, after correcting the problem in my failing test case and re-running the whole test suite 2-3 times, the whole suite passes again.
This happens with WebTestCases generally.
Do you have any idea what could be the problem?
I do not modify any class variables that are shared etc.
I've glanced at the code of SimpleTest (more or less; it's too big to analyze the whole flow quickly) and it looks like the browser instance is re-created before launching different tests.
The strangest thing is that after re-running, some errors disappear, and finally all of them do. Is there some caching involved?
I'll be grateful for hints, as there is really not much documentation, nor many blog entries or forum posts, about SimpleTest on the web, except its API on the website.
Things it might be:
- Caching: are you caching bad results somewhere in the chain?
- Misunderstanding: are you sure you are testing the right things?
- Bad data: if you are testing on top of a database, and the failure corrupted the data in the database, you might see results like you mention.
(Edit: moved the answer into a separate post.)
Huh, I did a fairly thorough investigation and it seems that there is a bug in the SimpleTest library.
They use fsockopen to open the connection, then send the request via fwrite, and then incorrectly fetch the response from the socket. What I mean is: it can happen that we read 0 bytes from the socket but we're not done, as the code falsely assumes; the server can be busy and send data later, while we've prematurely ended reading. That way, we haven't read the whole response, and the tests run against only a partial response, causing them to fail.
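For comparison, a more robust read loop might look something like this (a hypothetical sketch against example.com, not SimpleTest's actual code):

<?php
$fp = fsockopen("example.com", 80, $errno, $errstr, 30);
fwrite($fp, "GET / HTTP/1.0\r\nHost: example.com\r\nConnection: close\r\n\r\n");

$response = "";
while (!feof($fp)) {              // keep going until the server closes the connection
    $chunk = fread($fp, 8192);
    if ($chunk === false) {
        break;                    // a real read error, not just "no data yet"
    }
    $response .= $chunk;          // an empty chunk does not mean we are finished
}
fclose($fp);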
Let's see if I can make myself clear. I have an old set of scripts that run well on PHP 4, and it's better not to touch them. I have to integrate new functionality implemented in PHP 5; I just need to invoke a script in the new app from the old one.
To avoid touching the old stuff, I'm thinking of somehow "kind of remotely" invoking the new one; I only need to pass the $_REQUEST[] data. I cannot include it, as that would require migrating to another PHP version (and would cause some name clashing). I don't need any output from the new one.
What would be the cleanest way to "call" that script, passing parameters? fopen("http://theserver.com/thescript.php"....) and then sending all the necessary headers to pass the parameters? Or is there something more direct?
Thanks!
If you need to pass POST data, you can use cURL; otherwise, you can just do file_get_contents('http://example.com/yourscript.php?param1=x&param2=y&param3=...'); and the HTTP wrapper will do the request for you (simplest way).
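A rough sketch of the cURL variant, forwarding the current request's POST data (the URL is the one from your question; adjust as needed):

<?php
$ch = curl_init("http://theserver.com/thescript.php");
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($_POST)); // forward the incoming POST data
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);                 // don't echo the response
curl_exec($ch);                                                 // output is ignored, as you said
curl_close($ch);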
You're going to give yourself nightmares with this.
But if you really need to do it, you're not going to be able to rely on fopen. I would recommend using cURL, as Piskvor suggests.
But please, make sure you're validating and escaping any data you're pushing across correctly, or you're in for a world of hurt - the fact that you're making a cURL request to the other part of the system means that in theory, anyone else can do exactly the same thing.
This is most definitely not a long-term solution; I would advise rewriting the old parts as a priority.
After considering what you suggested in previous answers, and considering safety, I thought of something: if both scripts are on the same server, the "called" one should be on the same IP as the caller, so if the IPs differ the invoked script should not run. Is that a good idea?
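Something like this at the top of the called script could express that idea (a hypothetical sketch; it assumes the caller connects via loopback or the server's own address, and that SERVER_ADDR is populated by your SAPI):

<?php
$allowed = array("127.0.0.1", $_SERVER["SERVER_ADDR"]);
if (!in_array($_SERVER["REMOTE_ADDR"], $allowed, true)) {
    header("HTTP/1.0 403 Forbidden");
    exit("Forbidden");
}
// ...the rest of the PHP 5 script runs only for local callers...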
I'm creating a script that makes use of the $GLOBALS array quite a lot. Is there such a thing as too much to put into a variable?
If I have a lot of information stored in the $GLOBALS array when the page loads, is this going to slow down the site much, or not really?
Is there a limit to how much information one should store in a variable? How does it work?
And would it be better to remove information from that variable when I am done with it?
Thanks for your help! I want to make sure I get this right before I go any further.
In PHP, there's a memory_limit configuration directive (in php.ini) that you should be aware of.
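If you want to see how close you are getting to that limit, a quick check looks like this (the actual values depend entirely on your setup):

<?php
echo ini_get("memory_limit"), "\n";            // e.g. "128M", taken from php.ini
echo memory_get_usage(true), " bytes\n";       // memory currently allocated to this script
echo memory_get_peak_usage(true), " bytes\n";  // the high-water mark so far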
As meder says, you should really be taking a step back and re-evaluating things. Do you actually use all of that data on each and every web server request?
In almost every case, you'd be better off loading only the data you need, when you need it.
For instance, even if you're reading all this data from some file, instead of a database, you're probably better off splitting that file up into logical groups, and loading the data you need (once!), just before using it (the first time).
Assuming you're running Apache/mod_php, loading everything on every request will balloon the size of your httpd processes, and when you scale with traffic, you'll just start swapping out (which means your app will slow to a crawl, or even worse, become deadlocked) that much faster.
If you really need all or most of the data available for all (or nearly all) requests, consider looking into something like memcache. You can devise ways to share (read-only) data between processes, instead of duplicating it for each and every request.
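For instance, a rough sketch with the Memcached extension (build_expensive_config() is a hypothetical stand-in for your own setup code, and it assumes a memcached server on localhost:11211):

<?php
$mc = new Memcached();
$mc->addServer("127.0.0.1", 11211);

$config = $mc->get("app_config");
if ($config === false) {                       // not in the shared cache yet
    $config = build_expensive_config();        // hypothetical heavy setup, done once
    $mc->set("app_config", $config, 300);      // shared by all processes for 5 minutes
}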
Some people use a "Registry" object to handle globals.
See how Kevin Waterson does it:
http://www.phpro.org/tutorials/Model-View-Controller-MVC.html (See "5. The Registry")
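A minimal sketch of the idea (in the spirit of that tutorial, not its exact code):

<?php
class Registry
{
    private static $items = array();

    public static function set($key, $value)
    {
        self::$items[$key] = $value;
    }

    public static function get($key)
    {
        return isset(self::$items[$key]) ? self::$items[$key] : null;
    }
}

// Usage example (assumes the PDO SQLite driver is available):
Registry::set("db", new PDO("sqlite::memory:"));
$db = Registry::get("db");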