I have an application on a server that monitors a log file, I've also added a view at the client side (in the form of a website). Now I would like to implement the following: Whenever a new entry has been added, the view should update as fast as possible.
First I have thought of two practical solutions:
1) Call an AJAX function that requests a PHP page every second, which checks for updates and, if there are any, shows them. (Disadvantages: lots of HTTP overhead, a lot of the time there may be no message, lots of SQL calls)
2) Call an AJAX function that requests a different PHP page every minute, which keeps checking for updates for up to 1 minute and returns as soon as it finds one, or otherwise after 1 minute. (Disadvantages: HTTP overhead, but less than option 1; there may still be times without a message, and still a lot of SQL calls)
Which of those would be better, or what alternative would you advise?
I have also thought of yet another solution, but I'm unsure of how to implement it. That would be that on every INSERT on a specific table in the MySQL database, the webpage would directly be notified, perhaps via a push connection, but I'm also unsure of how those work.
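For reference, option 2 above is essentially what is usually called "long polling". A minimal sketch of such an endpoint could look roughly like this; the log_entries table, the last_id parameter and the 55-second deadline are all made up for the example:

// poll.php - hypothetical long-polling endpoint (a sketch, not production code)
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');
$lastId = isset($_GET['last_id']) ? (int)$_GET['last_id'] : 0; // last entry the client has seen
$deadline = time() + 55; // stop just before the client's 60-second timeout

do {
    $stmt = $pdo->prepare('SELECT id, message FROM log_entries WHERE id > ? ORDER BY id');
    $stmt->execute(array($lastId));
    $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
    if ($rows) {
        header('Content-Type: application/json');
        echo json_encode($rows);
        exit;
    }
    sleep(1); // nothing new yet, wait a second before checking again
} while (time() < $deadline);

echo json_encode(array()); // nothing found within the minute, the client simply re-polls

Keep in mind that each waiting client holds one PHP worker for the duration of the loop, so this only scales so far.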
I've just set up my first remote connection with FileMaker Server using the PHP API and something a bit strange is happening.
The first connection and response takes around 5 seconds, if I hit reload immediately afterwards, I get a response within 0.5 second.
I can get a quick response for around 60 seconds or so (not timed it yet but it seems like at least a minute but less than 5 minutes) and then it goes back to taking 5 seconds to get a response. (after that it's quick again)
Is there any way of ensuring that it's always a quick response?
I can't give you an exact answer on where the speed difference may be coming from, but I'd agree with NATH's notion on caching. It's likely due to how FileMaker Server handles caching the results on the server side and when it clears that cache out.
In addition to that, a couple of things that are helpful to know about speed when using custom web publishing with FileMaker:
The fields on your layout will determine how much data is pulled
When you perform a find in the PHP API on a specific layout, e.g.:
$request = $fm->newFindCommand('myLayout');
$request->addFindCriterion('name', $myname);
$result = $request->execute();
What's being returned is data from all of the fields available on the myLayout layout.
In sql terms, the above query is equivalent to:
SELECT * FROM myLayout WHERE `name` = ?; -- the $myname variable is bound to ?
The FileMaker find will return every field/column available. You designate the returned columns by placing the fields you want on the layout. To get a true "SELECT *" from your table, you would include every field from the table on your layout.
All of that said, you can speed up your requests by only including fields on the layout that you want returned in the queries. If you only need data from 3 fields returned to your php to get the job done, only include those 3 fields on the layout the requests use.
Once you have the records, hold on to them so you can edit them
Taking the example from above, if you know you need to make changes to those records somewhere down the line in your php, store the records in a variable and use the setField and commit methods to edit them. e.g.:
$request = $fm->newFindCommand('myLayout');
$request->addFindCriterion('name', $myname);
$result = $request->execute();
$records = $result->getRecords();
...
// say we want to update a flag on each of the records later in our php code
foreach ($records as $record) {
    $record->setField('active', true);
    $record->commit();
}
Since you have the records already, you can act on them and commit them when needed.
I say this as opposed to grabbing them once for one purpose and then grabbing them again from the database later to make updates to the records.
It's not really an answer to your original question, but since FileMaker's API is a bit different than others and it doesn't have the greatest documentation, I thought I'd mention it.
There are some delays that you can remove.
Ensure that the layouts you are accessing via PHP are very simple, no unnecessary or slow calculations, few layout objects etc. When the PHP engine first accesses that layout it needs to load it up.
Also check for layout and file script triggers that may be run; IIRC the OnFirstWindowOpen script trigger is called when a connection is made.
I don't think that it's related to caching. Also, it's the same when accessing via XML. Haven't tested ODBC, but am assuming that it is an issue with this too.
Once the connection is established with FileMaker Server and your machine, FileMaker Server keeps this connection alive for about 3 minutes. You can see the connection in the client list in the FM Server Admin Console. The initial connection takes a few seconds to set up (depending on how many others are connected), and then ANY further queries are lightning fast. If you run your app again, it'll reuse that connection and give results in very little time.
You can do completely different queries (on different tables) in a different application, but as long as you execute the second one on the same machine and use the same credentials, FileMaker Server will reuse the existing connection and provide results instantly. This means that it is not due to caching, but it's just the time that it takes FMServer to initially establish a connection.
In our case, we're using a web server to make FileMaker PHP API calls. We have set up a cron every 2 minutes to keep that connection alive, which has pretty much eliminated all delays.
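As an illustration only (the layout, file and host names below are invented), the keep-alive can be as simple as a tiny script that performs a trivial find:

// keepalive.php - hypothetical script that keeps the FileMaker Server connection warm
require_once 'FileMaker.php';

$fm = new FileMaker('MyDatabase', 'fms.example.com', 'webuser', 'secret');
$request = $fm->newFindAnyCommand('tinyLayout'); // any cheap, nearly empty layout will do
$request->execute(); // the result is discarded on purpose

And the crontab entry to run it every 2 minutes:

*/2 * * * * php /path/to/keepalive.php > /dev/null 2>&1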
This is probably way late to answer this, but I'm posting here in case anyone else sees this.
I've seen this happen when using external authentication with FileMaker Server. The first query establishes a connection to Active Directory, which takes some time, and then subsequent queries are fast as FMS has got the authentication figured out. If you can, use local authentication in your FileMaker file for your PHP access and make sure it sits above any external authentication in your accounts list. FileMaker runs through the auth list from top to bottom, so this will make sure that FMS successfully authenticates your web query before it gets to attempt an external authentication request, making the authentication process very fast.
How does PHP handle multiple requests from users? Does it process them all at once or one at a time waiting for the first request to complete and then moving to the next.
Actually, I'm adding a bit of wiki functionality to a static site where users will be able to edit addresses of businesses if they find them inaccurate or if they can be improved. Only registered users may do so. When a user edits a business name, that name, along with its other occurrences, is changed in different rows in the table. I'm a little worried about what would happen if 10 users were doing this simultaneously; it'd be a real mishmash of things. So does PHP do things one at a time in the order received per script (update.php), or all at once?
Requests are handled in parallel by the web server (which runs the PHP script).
Updating data in the database is pretty fast, so any update will appear instantaneous, even if you need to update multiple tables.
Regarding the mishmash: for the DB, handling 10 requests within 1 second is the same as 10 requests within 10 seconds; it won't confuse them and will just execute them one after the other.
If you need to update 2 tables and absolutely need these 2 updates to run back to back without being interrupted by another update query, then you can use transactions.
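A minimal PDO sketch of that (the table and column names are invented for the example):

// $newName and $businessId come from the (validated) form submission
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->beginTransaction();
try {
    $pdo->prepare('UPDATE businesses SET name = ? WHERE id = ?')
        ->execute(array($newName, $businessId));
    $pdo->prepare('UPDATE listings SET business_name = ? WHERE business_id = ?')
        ->execute(array($newName, $businessId));
    $pdo->commit(); // both updates become visible together
} catch (Exception $e) {
    $pdo->rollBack(); // neither update is applied
    throw $e;
}

Note that with MySQL this requires a storage engine that supports transactions, such as InnoDB.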
EDIT:
If you don't want 2 users editing the same form at the same time, you have several options to prevent them. Here are a few ideas:
You can "lock" that record for edition whenever a user opens the page to edit it, and not let other users open it for edition. You might run into a few problems if a user doesn't "unlock" the record after they are done.
You can notify a user in real time (with AJAX) that the entry they are editing was modified, just like on Stack Overflow when a new answer or comment is posted as you are typing.
When a user submits an edit, you can check whether the record was edited between when they started editing and when they tried to submit, and show them the new version beside their own so that they can manually "merge" the 2 updates.
There probably are more solutions but these should get you started.
It depends on which version of Apache you are using and how it is configured, but a common default configuration uses multiple workers with multiple threads to handle simultaneous requests. See http://httpd.apache.org/docs/2.2/mod/worker.html for a rundown of how this works. The end result is that your PHP scripts may together have dozens of open database connections, possibly sending several queries at the exact same time.
However, your DBMS is designed to handle this. If you are only doing simple INSERT queries, then your code doesn't need to do anything special. Your DBMS will take care of the necessary locks on its own. Row-level locking will be fastest for multiple INSERTs, so if you use MySQL, you should consider the InnoDB storage engine.
Of course, your query can always fail, whether it's due to too many database connections, a conflict on a unique index, etc. Wrap your queries in try/catch blocks to handle this case.
If you have other application-layer concerns about concurrency, such as one user overwriting another user's changes, then you will need to handle these in the PHP script. One way to handle this is to use revision numbers stored along with your data, and refusing to execute the query if the revision number has changed, but how you handle it all depends on your application.
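One hypothetical way to implement the revision-number idea (the articles table and its columns are made up for the example):

// $pdo is an existing PDO connection; $newText, $id and $revisionSeenByUser come from the form
$stmt = $pdo->prepare(
    'UPDATE articles SET body = ?, revision = revision + 1
     WHERE id = ? AND revision = ?'
);
$stmt->execute(array($newText, $id, $revisionSeenByUser));

if ($stmt->rowCount() === 0) {
    // someone else saved a newer revision in the meantime;
    // reload the record and ask the user to merge their changes
}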
For example, if we have a certain PHP file on the server, getProducts.php, does it get interrupted when multiple users request it at the same time?
For example, if one user asks for details about product A, another user about product B, another about product C, and so on, will PHP be interrupted? Or is there some threading system that responds to each request separately?
Thank you!
This has, unexpectedly, little or nothing to do with PHP. It's not PHP that answers the user's request but the web server. For example Apache, NginX, IIS, and so on.
The web server then routes the call to a PHP instance that is usually independent of any other request being satisfied in that exact moment. The number of concurrent requests depends on the server configuration, architecture, and platform capabilities. So-called "C10K" servers are designed to handle up to ten thousand connections simultaneously.
But PHP is not the only factor in the process that goes from "GET /index.php" to a bunch of HTML; any active page (PHP or ASP or Python etc.) may request further resources from, say, a database. In that case a concurrency problem arises, and whenever two users need to acquire the same resource (a row in a data table, the whole table, a log file...), some sort of semaphore system makes it so that only one of them at a time can acquire a "lock" on that specific resource, and all others must wait for their turn, even if the overlying web server is capable of handling hundreds or thousands of concurrent connections.
Update on performance issues: the same happens within PHP for things such as sessions. Imagine you have a single user requesting a single page, and that page has code that triggers ten more calls (to images, pop-ups, ads, AJAX...). The first request opens a session, which is a bunch of data that must remain coherent. When the other ten calls come in, all bound to the same session, PHP has no way of knowing whether any one of them wants to modify the session data, so it has no choice but to prevent the second call from proceeding until the first has released the session lock; once it does, the second call will block the third, and so on. Take-away point: avoiding session_start() if it is not needed (e.g. replacing it with cryptographically strong GET tokens, or doing without altogether), or calling session_commit() as soon as you are finished modifying $_SESSION's values, will greatly improve performance. (So will using a faster session handler, or one that doesn't do coarse locking, e.g. Redis.)
For example, in a script that generates an image:
session_start();    // acquire the session (and its lock)
session_commit();   // this does the magic: release the lock right away
// We can still read the session, we just can't write to it anymore,
// and reading is all we needed the session for.
if (!isset($_SESSION['authorized'])) {
    header('HTTP/1.1 403 Forbidden');
    die();
}
// Here goes the code that generates the image *and sends* it. Had we not
// committed, the session lock would *not* be released until the request
// had been processed by the *client*, network slowness included. (Things
// go much better if you use the CGI interface instead of the module.)
In your example and seeing the "WAMP" tags, you have a Windows Apache serving data retrieved from MySQL by PHP, and serving requests on products.
The Apache server will receive hundreds of connections, spin up hundreds of instances of the PHP module (they'll share most of their code, so memory usage doesn't go up disastrously fast), and then all these instances will ask MySQL, "What about product XYZ?". In MySQL parlance they will try to obtain a READ LOCK. A read lock means something like, "I'm reading this thing, so please none of you dare write to it until I'm finished". But all of them are just reading, so they will all succeed - concurrently.
So no, there will be no stops - so far.
But suppose you also want to update a counter of product views. Then every PHP instance also needs a WRITE LOCK, which means, "I want to write on this thing, so none of you read until I'm finished or you'll risk reading half-baked data, and of course none of you write here while I'm going at it".
At this point, the table type counts. MyISAM tables have table locking: if the instance updating product A's statistics is writing on product_views, no other instance will be able to do anything with that whole table. They will all queue and wait. If the table is InnoDB, the lock is at row level - all instances updating product A will queue one after the other, parallel to those updating product B, C, D and so on. So if all instances are writing to different records, they'll run in parallel.
That's why you really want to use InnoDB tables in these cases.
Of course, if you have a record such as "page visits", and they are all updating the row for "product-page.php", you have a bottleneck right there, and in case of a high traffic site, you'd do well if you designed some other way of writing that information (one of many workarounds is to store it in a shared memory location; every now and then one of the many instances accessing it receives the task of saving the information to the database. The instances still compete for locking on the memory, but that's orders of magnitude faster than competing for a database transaction).
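One of many possible sketches of that workaround, here using APCu as the shared memory (the key names, the flush lock and the 10-second window are all invented for the example):

// Count a page view in shared memory; flush it to the database only occasionally.
// $pdo is assumed to be an existing PDO connection.
$key = 'views:product-page';
if (!apcu_add($key, 1)) { // create the counter on first use...
    apcu_inc($key);       // ...otherwise just increment it
}

// At most every 10 seconds, one request wins the right to flush the counter.
if (apcu_add('views:flush-lock', 1, 10)) {
    $views = (int) apcu_fetch($key);
    apcu_store($key, 0); // not perfectly atomic, but good enough for a view counter
    $pdo->prepare('UPDATE page_counters SET hits = hits + ? WHERE page = ?')
        ->execute(array($views, 'product-page.php'));
}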
If you are using Apache, it's a concurrent system: each request is handled in parallel, so your PHP script will not be interrupted.
This probably seems like a very simple question, and I would probably know if I had a more in depth knowledge of computer processes and the like, but anyway..
If two people request the same page from my server, is the PHP page processed once for the first person, and then a second time for the second person, or might these run along side each other at the same time?
Take this as an example. I have one stock item left in my PHP-driven online shop. A user adds this to their cart. The PHP script 1) checks to see if it is in stock (yup, it's in stock), so it 2) reserves it for him.
What if, in between checking whether it's in stock and reserving it, the same PHP page was loading for someone else? Just after user A checked that it was in stock, so did user B, before user A got a chance to reserve it, so they both end up reserving it!
Sorry if this seems silly, can't seem to find an answer on it, which is it?
Congratulations, you have identified a race condition! :-)
Whether PHP pages run in parallel or one after the other depends on the web server. Typically a web server allocates several threads to handle multiple incoming requests at once. So it may indeed happen that several instances of the same script are run in parallel if two or more users request the same page at the same time. Due to timing and scheduling differences it is unpredictable when each page will execute which action exactly.
Hence for such situations as you describe it is important to program actions in an atomic way, meaning that they either complete in their entirety or not at all. In your case you could use locks, transactions, cleverly formed UPDATE statements, UNIQUE indexes or a number of other techniques that avoid the possibility of two users reserving the same thing.
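For the one-item-in-stock example, a single conditional UPDATE is often enough; the table and column names below are assumptions:

// The database applies this statement atomically, so of two concurrent
// requests only one can succeed in decrementing the last unit.
$stmt = $pdo->prepare(
    'UPDATE products SET stock = stock - 1, reserved_by = ?
     WHERE id = ? AND stock > 0'
);
$stmt->execute(array($userId, $productId));

if ($stmt->rowCount() === 1) {
    // reservation succeeded
} else {
    // someone else got there first; tell the user it is out of stock
}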
Yes, in general, without getting into too much detail: PHP scripts are executed simultaneously, separately for each request.
To make sure the problem you mentioned does not occur, you should probably use a feature of your database management system called "transactions". This way, if you do something on the database layer and at the end find out that the reservation cannot happen, all the actions made within the transaction will be rolled back.
In addition to transactions, you should design your application keeping in mind that the problem you mentioned may occur. Thus you should design your database and application in a way that allows you to 1) shorten the time between "checking" and "reserving" as much as possible, 2) stop the action if you cannot make the reservation, and finally, in case of emergency, 3) identify which reservation came first and which should be revoked.
Another idea, falling into the category of "your application's design", may be something we could call a "temporary reservation". That means you can temporarily (e.g. for a couple of seconds) lock the reservation if you are about to make it. After that you can check whether you really can make that reservation and either turn it into a permanent one or revoke it. I believe some systems also make longer temporary reservations right after the customer begins the process of reserving his/her places. Then, if the process is successful, the reservation is changed into a permanent one, but if some specific amount of time passes without success, the reservation can simply be revoked, allowing another customer to begin the process.
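A rough sketch of such a temporary reservation (the seats table and the 5-minute hold are invented for the example); the row is claimed only if it is free or its previous hold has expired:

// Claim the seat for 5 minutes; an expired hold counts as free.
$stmt = $pdo->prepare(
    'UPDATE seats
        SET held_by = ?, held_until = NOW() + INTERVAL 5 MINUTE
      WHERE id = ? AND (held_by IS NULL OR held_until < NOW())'
);
$stmt->execute(array($userId, $seatId));

if ($stmt->rowCount() === 1) {
    // the customer now has 5 minutes to complete the purchase;
    // after that the row becomes claimable again
}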
Yes, definitely: they run in parallel as far as PHP is concerned, but where the database is involved you should learn about the transaction features of your database management system.
Yes and no. PHP may run in simultaneous processes depending on server setup, but on a small scale you'll only have one database. Database queries are handled sequentially, so you'll never have that kind of conflict (as long as you check whether an item is in stock immediately before you reserve it for someone).
Of course, users A and B might both see that it's in stock, and A might request it before B. But your code can realize that it's now out of stock and display an error to user B.
(You get into trouble with multiple database servers. If you have the same data stored across multiple servers, there's lag time before the data is fully replicated. But you won't have that issue; we're talking about top-1,000 sites here.)
We have a PHP application which selects a row from the database, works on it (calls an external API which uses a web service), and then inserts a new record based on the work done. There's an AJAX display which informs the user of how many records have been processed.
The data is mostly text, so it's rather heavy data.
The process handles thousands of records at a time. The user can choose how many records to start working on. The data is obtained from one table, where the records are marked as "done". There is no "WHERE" condition, except the optional "WHERE date BETWEEN date1 AND date2".
We had an argument over which approach is better:
Select one record, work on it, and insert the new data
Select all of the records, work on them in memory, and insert the results into the database after all the work is done.
Which approach do you consider the most efficient one for a web environment with PHP and PostgreSQL? Why?
It really depends on how much you care about your data (seriously):
Does reliability matter in this case? If the process dies, can you just re-process everything? Or can't you?
Typically when calling a remote web service, you don't want to be calling it twice for the same data item. Perhaps there are side effects (like credit card charges), or maybe it is not a free API...
Anyway, if you don't care about potential duplicate processing, then take the batch approach. It's easy, it's simple, and fast.
But if you do care about duplicate processing, then do this:
SELECT 1 record from the table FOR UPDATE (i.e. lock it in a transaction)
UPDATE that record with a status of "Processing"
Commit that transaction
And then
Process the record
Update the record contents, AND
SET the status to "Complete", or "Error" in case of errors.
You can run this code concurrently without fear of it running over itself. You will be able to have confidence that the same record will not be processed twice.
You will also be able to see any records that "didn't make it" (their status will still be "Processing"), as well as any errors.
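In PostgreSQL terms, the claim step might look roughly like this (the work_items table and the status values are placeholders):

// $pdo is assumed to be an existing PDO connection to PostgreSQL
$pdo->beginTransaction();
$row = $pdo->query(
    "SELECT id FROM work_items WHERE status = 'Pending'
     ORDER BY id LIMIT 1 FOR UPDATE"
)->fetch(PDO::FETCH_ASSOC);

if ($row) {
    $pdo->prepare("UPDATE work_items SET status = 'Processing' WHERE id = ?")
        ->execute(array($row['id']));
}
$pdo->commit();

if ($row) {
    // ...call the external web service here, outside the transaction...
    $pdo->prepare("UPDATE work_items SET status = 'Complete' WHERE id = ?")
        ->execute(array($row['id']));
}

Committing before calling the external service keeps the row lock short while still ensuring no other worker can claim the same record.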
If the data is heavy and so is the load, and considering the application is not real-time dependent, the best approach is most definitely to get all the needed data, work on it, and then put it back.
Efficiency-wise, regardless of language, if you are fetching single items and working on them individually, you are probably also closing the database connection each time. This means that if you have thousands of items, you will open and close thousands of connections. The overhead of this far outweighs the overhead of returning all of the items and working on them together.