Execute code before every function call in PHP

I want to introduce heartbeats to some of my scripts.
I have therefore added a service class that allows me to save and update a timestamp in the DB.
It also keeps track of the last heartbeat and only queries the DB every 10s to prevent excessive DB calls.
Currently I need to manually add the heartbeat function call throughout my whole codebase.
Is there some way to run it regularly and automatically?
I found this, which could be going in the right direction:
https://www.php.net/manual/en/function.uopz-set-hook.php
However, this requires Zend and installing an extension.
Is there some other way to do it?

A function like that should really only be used in DEV, never in prod, because it will have a massive impact on execution times.
I think you are searching for something like register_tick_function().
The PHP docs contain good examples of it.
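As a minimal sketch of the tick approach, assuming a hypothetical HeartbeatService (standing in for the service class from the question) with a made-up beat() method:

<?php
declare(ticks=1000); // run registered tick functions every 1000 tickable statements

require 'HeartbeatService.php'; // hypothetical: the service class from the question

$heartbeat = new HeartbeatService();

register_tick_function(function () use ($heartbeat) {
    // cheap call: the service itself throttles real DB writes to every 10s
    $heartbeat->beat(); // beat() is a made-up method name
});

// ... the rest of the script runs unchanged; note that declare(ticks=...)
// only covers the file it appears in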

Related

Good idea to run a PHP file for a few hours as cronjob?

I would like to run a PHP script as a cronjob every night. The PHP script will import an XML file with about 145,000 products. Each product contains a link to an image, which will be downloaded and saved on the server as well. I can imagine that this may cause some overload. So my question is: is it a better idea to split the PHP file? And if so, what would be a better solution? More cronjobs, with several minutes' pause between each other? Running another PHP file using exec (guess not, because I can't imagine that would make much of a difference), or something else? Or just use one script to import all products at once?
Thanks in advance.
It depends a lot on how you've written it, in terms of whether it leaks open files or database connections. It also depends on which version of PHP you're using. In PHP 5.3, a lot was done to address garbage collection:
http://www.php.net/manual/en/features.gc.performance-considerations.php
If it's not important that the operation is transactional, i.e. all or nothing (for example, if it fails half way through), then I would be tempted to tackle this in chunks, where each run of the script processes the next x items, where x can be a variable depending on how long it takes. So what you'll need to do then is keep on repeating the script until nothing is done.
To do this, I'd recommend using a tool called the Fat Controller:
http://fat-controller.sourceforge.net
It can keep on repeating the script and then stop once everything is done. You can tell the Fat Controller that there's more to do, or that everything is done using exit statuses from the php script. There are some use cases on the Fat Controller website, for example: http://fat-controller.sourceforge.net/use-cases.html#generating-newsletters
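For illustration, a rough sketch of what each chunked run could look like (table, columns, and connection details are invented, and the exit-status convention is only an example; check the Fat Controller docs for what it actually expects):

<?php
// process the next batch of unimported products, then report back via exit status
$batchSize = 500; // tune to keep each run comfortably short

$pdo = new PDO('mysql:host=localhost;dbname=shop', 'user', 'pass');

$stmt = $pdo->prepare('SELECT id, image_url FROM products WHERE imported = 0 LIMIT :n');
$stmt->bindValue(':n', $batchSize, PDO::PARAM_INT);
$stmt->execute();
$rows = $stmt->fetchAll(PDO::FETCH_ASSOC);

foreach ($rows as $row) {
    // download the image, save it, mark the product as imported...
}

// example convention: non-zero exit = more work left, zero = all done
exit(count($rows) === $batchSize ? 1 : 0);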
You can also use the Fat Controller to run processes in parallel to speed things up, just be careful you don't run too many in parallel and slow things down. If you're writing to a database, then ultimately you'll be limited by the hard disc, which unless you have something fancy will mean your optimum concurrency will be 1.
The final question would be how to trigger this - and you're probably best off triggering the Fat Controller from CRON.
There's plenty of documentation and examples on the Fat Controller website, but if you need any specific guidance then I'd be happy to help.
To complement the previous answer, the best solution is to optimize your scripts:
Prefer JSON to XML; parsing JSON is vastly faster.
Use one or a few concurrent connections to the database.
Alter multiple rows at a time (insert 10-30 rows in one query, select 100 rows, delete multiple; not more, to avoid overloading memory, and not less, to make your transactions profitable); see the sketch after this list.
Minimize the number of queries (following the previous point).
Definitively skip rows that are already up to date, using dates (timestamp, datetime).
You can also let the processor breathe with a usleep(30) call.
To run multiple PHP processes, use popen().
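To illustrate the multi-row point, a sketch of a batched insert with PDO (assuming an existing $pdo connection; table, columns, and $products are invented):

<?php
// one INSERT for a whole batch instead of one query per row
$batch = array_slice($products, 0, 30); // 10-30 rows per query, as suggested above

$placeholders = implode(', ', array_fill(0, count($batch), '(?, ?)'));
$stmt = $pdo->prepare("INSERT INTO products (sku, name) VALUES $placeholders");

$params = [];
foreach ($batch as $product) {
    $params[] = $product['sku'];
    $params[] = $product['name'];
}
$stmt->execute($params);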

Atomic/safe serving of single-use codes

I have a list of single-use discount codes for an ecommerce site I'm partnering with. I need to set up a page on my site where my users can fill out a form, and then will be given one of the codes. The codes are pre-determined and have been sent to me in a text file; I can't just generate them on the fly. I need to figure out the best way to get an unused code from the list, and then remove it from the list (or update a flag to mark it as used) at the same time, to avoid any possibility of giving two people the same code. In other words, something similar to a queue, where I can remove one item from the queue atomically.
This webapp will be running on AWS and the current code is Python (though I could potentially use something else if necessary; PHP would be easy). Ideally I'd use one of the AWS services or mysql to do this, but I'm open to other solutions if they're not a royal pain to get integrated. Since I thought "queue," SQS popped into my head, but this is clearly not what it's intended for (e.g. the 14 day limit on messages remaining in the queue will definitely not work for me). While I'm expecting very modest traffic (which means even really hacky solutions would probably work), I'd rather learn about the RIGHT way to do this even at scale.
I can't give actual code examples, but one of the easiest ways to do it would just be an incrementing counter at the top of the file, so something like:
0
code1
code2
code3
etc
and just skip that many lines every time a code is used, incrementing the counter as you go.
You could also do this pretty simply in a database.
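For the database route, one way to stay atomic is to claim a row in a single UPDATE and then read back what was claimed. A sketch assuming PDO/MySQL and an invented codes table (note that UPDATE ... LIMIT is MySQL-specific):

<?php
// each request claims at most one unclaimed row; the single-statement
// UPDATE is atomic, so two requests can never claim the same row
$token = bin2hex(random_bytes(16));

$pdo->prepare('UPDATE codes SET claimed_by = ? WHERE claimed_by IS NULL LIMIT 1')
    ->execute([$token]);

$stmt = $pdo->prepare('SELECT code FROM codes WHERE claimed_by = ?');
$stmt->execute([$token]);
$code = $stmt->fetchColumn(); // false if the list has run out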
Amazon DynamoDB is a fast NoSQL database from AWS, and it is potentially a good fit for this use case. Setting up a database table is easy, and you could load your codes into it. DynamoDB has a DeleteItem operation that also allows you to retrieve the data within the same atomic operation (by setting the ReturnValues parameter to ALL_OLD). This would allow you to get and delete a code in one shot, so no other request/process can get the same code. AWS publishes official SDKs to help you connect to and use their services, including both a Python and a PHP SDK (see http://aws.amazon.com/tools/).
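A rough sketch of that pattern with the AWS SDK for PHP (v3-style client; table name, key attribute, and region are invented, so treat this as an outline rather than a drop-in implementation):

<?php
require 'vendor/autoload.php';

use Aws\DynamoDb\DynamoDbClient;

$client = new DynamoDbClient(['region' => 'us-east-1', 'version' => 'latest']);

do {
    // grab any one candidate code
    $scan = $client->scan(['TableName' => 'discount_codes', 'Limit' => 1]);
    if (empty($scan['Items'])) {
        throw new RuntimeException('No codes left');
    }

    // atomic get-and-delete: if Attributes comes back empty, another
    // request deleted this code first, so retry with a fresh scan
    $result = $client->deleteItem([
        'TableName'    => 'discount_codes',
        'Key'          => ['code' => $scan['Items'][0]['code']],
        'ReturnValues' => 'ALL_OLD',
    ]);
} while (empty($result['Attributes']));

$code = $result['Attributes']['code']['S'];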

Execute php script only once on live site

I have a script that does an update function live. I would move it to a cron job, but due to some limitations I'd much rather have it live and called when the page loads.
The issue is that when there is a lot of traffic, it doesn't quite work, because it's using some random and weighted numbers, so if it's hit a bunch of times, the results aren't what we want.
So, the question is: is there a way to tell how many times a particular script is being accessed? And to limit it to running only once at a time?
Thank you!
The technique you are looking for is called locking.
The simplest way to do this is to create a temporary file, and remove it when the operation has completed. Other processes will look for that temporary file, see that it already exists and go away.
However, you also need to take care of the possibility of the lock's owner process crashing, and failing to remove the lock. This is where this simple task seems to become complicated.
File based locking solutions
PHP has a built-in flock() function that promises an OS-independent, file-based locking feature. This question has some practical hints on how to use it. However, the manual page warns that under some circumstances, flock() has problems with multiple instances of PHP scripts trying to get a lock simultaneously. This question seems to have more advanced answers on the issue, but none of them are trivial to implement.
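A minimal flock()-based sketch (the lock file path is arbitrary):

<?php
$fp = fopen('/tmp/myscript.lock', 'c'); // 'c' creates the file if it doesn't exist

if (!flock($fp, LOCK_EX | LOCK_NB)) {
    // another instance holds the lock; leave immediately instead of waiting
    exit(0);
}

// ... do the work that must only run once at a time ...

// the OS releases the lock automatically if the process crashes,
// which avoids the stale-lock problem of hand-rolled lock files
flock($fp, LOCK_UN);
fclose($fp);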
Database based locking
The author of this question - probably scared away by the complications surrounding flock() - asks for other, non-file-based locking techniques and comes up with MySQL's GET_LOCK(). I have never worked with it, but it looks pretty straightforward; if you use MySQL anyway, it may be worth a shot.
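For completeness, a sketch of the GET_LOCK() variant (lock name and timeout are arbitrary; the lock is tied to the connection and is released automatically if the connection drops):

<?php
// GET_LOCK returns 1 on success, 0 on timeout, NULL on error
$got = $pdo->query("SELECT GET_LOCK('my_update_script', 0)")->fetchColumn();

if ($got != 1) {
    exit(0); // another connection holds the lock
}

// ... do the work ...

$pdo->query("SELECT RELEASE_LOCK('my_update_script')");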
Damn, this issue is complicated if you want to do it right! Interested to see whether anything more elegant comes up.
You could do something like this (requires PHP 5):
if (file_get_contents("lock.txt") == "unlocked") {
    // no lock present, so place one
    file_put_contents("lock.txt", "locked", LOCK_EX);

    // do your processing
    // ...

    // remove the lock
    file_put_contents("lock.txt", "unlocked", LOCK_EX);
}
file_put_contents() overwrites the file (as opposed to appending) by default, so the contents of the file should only ever be "locked" or nothing. You'll want to specify the LOCK_EX flag to ensure that the file isn't currently open by another instance of the script when you're trying to write to it.
Obviously, as @Pekka mentioned in his answer, this can cause problems if a fatal error occurs (or PHP crashes, or the server crashes, etc.) between placing the lock and removing it, as the file will simply remain locked.
Start the script with an SQL query that tests whether a timestamp field from the database is over 1 day old.
If yes, write the current timestamp and execute the script.
Pseudo-SQL to show the idea (MySQL syntax):
UPDATE runs SET lastrun = NOW() WHERE lastrun < NOW() - INTERVAL 1 DAY
(different SQL servers will require different changes to the above)
Check how many rows were updated to see if this script run got the lock.
Do not do it with two queries (a SELECT followed by an UPDATE), because then it won't be atomic anymore.
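In PHP that check could look like this (a sketch assuming PDO and the runs table from the pseudo-SQL above):

<?php
$updated = $pdo->exec(
    "UPDATE runs SET lastrun = NOW() WHERE lastrun < NOW() - INTERVAL 1 DAY"
);

if ($updated === 1) {
    // this run won the lock: do the actual work
} else {
    // someone else already ran within the last day: skip
}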

Best approach for running an "endless" process monitoring MySQL?

I have a process that has to be run against certain things, and it isn't suitable to be run at the user's end (15+ seconds to process), so I considered using a cron job; but again, this is unsuitable because it will create a backlog. I have narrowed my options down to either running an endless process that monitors for MySQL changes, or configuring MySQL to trigger the script when it detects a change; the latter is not something I want to get into unless it's my only option, which leaves me with the "endless" monitoring option.
The sort of thing I'm considering with PHP is:
while (true) {
    $db->query('SELECT * FROM database');
    while ($row = $db->fetch_assoc()) {
        // do the stuff here
    }
    sleep(5);
}
and then running it via the command line. Now, this is theoretically sound, but in practice it isn't doing as well as I hoped, using more resources than I would expect (not out of my range, just not what I'm aiming for optimally). So my questions are as follows:
Is PHP the wrong language to do this in? PHP is what I work with, but I understand that there are times when it's the wrong choice, and I think maybe this is one of them. If it is, what language should I use?
Is there a better approach that I haven't considered and that isn't any of the ideas I have listed?
If PHP is the correct option, how can I optimise the code I posted? Is there a better method than sleeping for 5 seconds after each completed operation?
Thanks in advance! I'm open to any ideas as long as they're not too far out there; I'm running my own server with free rein, so there's no theoretical limit on what I can do.
I recommend moving the loop out into a shell script and then executing a new PHP process for every iteration. This way PHP will never use unbounded resources (even if there is a memory/connection leak somewhere) since the process is terminated on each iteration. Something like the following should be fine (Bash):
while true; do
    php /path/to/your/script.php 2>&1 | logger ...(logger options)
    sleep 5
done
I've found this approach to be far more robust for long-running scripts in PHP, probably because it is very much like the way PHP operates when run as a CGI script.
You should always work with the language you're most familiar with. If this is PHP, then it's not a wrong choice.
Disconnect from the database before sleeping. This way your script won't keep a connection reserved, and it will keep working even after a database restart.
Free the MySQL result after using it. Always check for error conditions in daemonized processes, and deal with them appropriately.
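A sketch of that pattern with mysqli (credentials and table name are placeholders):

<?php
while (true) {
    $db = new mysqli('localhost', 'user', 'pass', 'mydb');
    if ($db->connect_error) {
        sleep(5); // database restarting? just try again next round
        continue;
    }

    $result = $db->query('SELECT * FROM jobs');
    if ($result !== false) {
        while ($row = $result->fetch_assoc()) {
            // do the stuff here
        }
        $result->free(); // free the result after using it
    }

    $db->close(); // no connection held while sleeping

    sleep(5);
}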
PHP might be the wrong language as it's really designed for serving requests on an ad-hoc basis, rather than creating long-running daemons. (It was originally created as a preprocessor language, then later on came into general use as a web application language.)
Something like Python might work better for your needs; it's a little more naturally designed for "daemon-like" programs.
That said, it is possible to do what you want in PHP.
What kind of problems are you experiencing?
I don't know about the database class you have there in $db, but it could generate a memory leak.
Furthermore, I would suggest closing all your connections and unsetting all your variables, if necessary, at the end of the loop, and reopening them at the beginning.
If it's only a 5-second sleep, maybe only do this on every 10th iteration or something; you can keep a counter for that.
These points considered, there's nothing wrong with this approach.

Reusing MySQL results

I have a somewhat theoretical question: I'm designing my own CMS/app-framework (as many PHP programmers on various levels did before... and always will) to either make a production-ready solution or develop various modules/plugins that I'll use later.
Anyway, I'm thinking of gathering the SQL queries from the whole app and then running them in one place:
index.php:
<?php
include ('latestposts.php');
include ('sidebar.php');
?>
latestposts.php:
<?php
function gather_data($arg) { $sql = ""; }
function draw($data) { ... }
?>
sidebar.php:
<?php
function gather_data($arg) { $sql = ""; }
function draw($data) { ... }
?>
Now, while the whole module system is yet to be figured out, its idea is already floating somewhere in my brain. However, I'm wondering: if I'm able to first load all the gather_data functions, then run the SQL, and then run the draw functions, can I reuse the results?
If, for example, $sql is SELECT * FROM POSTS LIMIT 10 and $sql2 is SELECT * FROM POSTS LIMIT 5, is it possible to program PHP to see: "ah, it's the same SQL, I'll call it just once and reuse the first 5 rows"?
Or is it possible to add this behavior to some ORM?
However, as the tags say, this is still just an idea in progress. If it proves to be easy to accomplish, then I will post more questions about how :)
So, basically: Is it possible? Does it make sense? If both are yes, then... any ideas how?
Don't get me wrong, that sounds like a plausible idea and you can probably get it running. But I wonder if it is really going to be beneficial. Will it make the system faster? Give you more control? Make development easier?
I would just look into using (or building) a system using well-practiced MVC-style coding standards, build a good DB structure, and tweak the heck out of Apache (or use something like Lighttpd). You will have a lot more widespread acceptance of your code if you ever decide to make it open source, and if you ever need a hand with it, another developer could step right in and pick up the keyboard.
Also, check out query caching in MySQL; you will see a similar (though not one-to-one) benefit from caching your query results server-side with regard to your query example. Even better, it is stored in server memory, so PHP/MySQL overhead is dropped and you don't have to code it.
All of that aside, I do think it is possible. =)
Generally speaking, such a cache system can generate significant time savings, but at the cost of memory and complexity. The more results you want to keep, the more memory it will take; and there's no guarantee that your results will ever be used again, particularly the larger result sets.
Second, there are certain queries that should never be cached, or that should be run again even if they're in the cache. For the most part, only SELECT and SHOW queries can be cached effectively, but you need to worry about invalidating them when you modify the underlying data. Even in the same pageview, you might find yourself working around your own cache system on occasion.
Third, this kind of problem has already been solved several times. First, consider turning on the MySQL query cache. Most of the time, it will speed things up a bit without requiring any code changes on your end. However, it's a bit aggressive about invalidating entries, so you could gain some performance at a higher level.
If you need another level, consider memcached. You'll have to store and invalidate entries manually, but it can store results across page views (where you'll really find the performance benefit), and will let unused entries expire before running out of memory.
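If you go that route, a sketch with PHP's Memcached extension (the key scheme and TTL are arbitrary; assumes an existing $pdo connection):

<?php
$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);

function cached_query($mc, $pdo, $sql)
{
    $key = 'q:' . md5($sql);

    $rows = $mc->get($key);
    if ($rows !== false) {
        return $rows; // cache hit, possibly from an earlier page view
    }

    $rows = $pdo->query($sql)->fetchAll(PDO::FETCH_ASSOC);
    $mc->set($key, $rows, 60); // expire after 60s so unused entries age out

    return $rows;
}

$posts = cached_query($mc, $pdo, 'SELECT * FROM posts LIMIT 10');

// invalidate manually whenever the underlying table changes:
// $mc->delete('q:' . md5('SELECT * FROM posts LIMIT 10'));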
