I have an Ubuntu server that collects incoming SNMP traps. Currently these traps are handled and logged by a PHP script.
File /etc/snmp/snmptrapd.conf:
traphandle default /home/svr/00-VHOSTS/nagios/scripts/snmpTrap.php
This script is quite long and contains many database operations. The server usually receives thousands of traps per day, so the script consumes too much CPU time. My understanding is that this is due to the high start-up cost of launching the PHP script every time a trap is received.
I have been asked to rewrite this, and I was thinking of running the script as a daemon. I can create an Ubuntu daemon. My question is: how can I hand traps to this daemon from the snmptrapd.conf file?
Thank you in advance.
One suggestion is to use the MySQL support that's built into version 5.5 of snmptrapd. That way you can use MySQL as a queue and process the traps in bulk.
Details are on the snmptrapd page: http://www.net-snmp.org/wiki/index.php/Snmptrapd
If you are not using MySQL, another option is to use a named pipe.
Do mkfifo snmptrapd.log
Now change snmptrapd to write to this log. It's not a regular file, but it looks like one. You then write another daemon to watch the named pipe for new data.
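A minimal sketch of the watching daemon in PHP, assuming snmptrapd has been pointed at the FIFO and that the data can be handled one line at a time. The function names consumeLines and watchFifo, and the per-line handler, are hypothetical and only for illustration:

```php
<?php
// Sketch of a long-running reader for the named pipe created with mkfifo.
// The FIFO path and the per-line handler are assumptions for illustration.

/**
 * Read lines from an already-open stream and pass each non-empty line
 * to $handler. Returns the number of lines handled once the stream hits EOF.
 */
function consumeLines($stream, callable $handler): int
{
    $count = 0;
    while (($line = fgets($stream)) !== false) {
        $line = rtrim($line, "\r\n");
        if ($line === '') {
            continue; // skip blank lines
        }
        $handler($line);
        $count++;
    }
    return $count;
}

/**
 * Daemon loop: opening a FIFO for reading blocks until a writer connects,
 * and EOF arrives when the writer closes it, so we reopen and wait again.
 */
function watchFifo(string $fifoPath, callable $handler): void
{
    while (true) {
        $stream = fopen($fifoPath, 'r');
        if ($stream === false) {
            sleep(1); // FIFO missing or unreadable; retry shortly
            continue;
        }
        consumeLines($stream, $handler);
        fclose($stream);
    }
}
```

The daemon would then call something like watchFifo('/path/to/snmptrapd.log', $handler), where $handler keeps one database connection open for its whole lifetime instead of paying the start-up cost on every trap.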
You can probably use php-fpm / php-fcgi to minimize the PHP script start-up cost.
You would, however, probably need to write a wrapper shell script that forwards each request from snmptrapd over the FastCGI protocol.
But first I'd recommend checking the PHP script itself. PHP's start-up cost is not so high that a few requests per minute should raise CPU usage noticeably.
Related
I am programming a website on a Linux CentOS server (I am planning to upgrade to a VPS plan where I will have root access). Much of the website will rely on scripts that are automated.
I have 2 questions about starting automated processes.
Is there any way I can start a Daemon thread, or anything like that, which will constantly be running. I need to execute a script every time an email account gets a new e-mail. I am aware of cron jobs that can run every minute, but having a script that constantly runs would be ideal, so I can execute the script the moment a new e-mail arrives.
Is there any way from code (ideally PHP) to start a thread, which runs concurrently with the main program. In the script I am using, the imap_open is used to connect to an e-mail account, which takes a few seconds every time. However, if I could fire off multiple concurrent scripts at the same time, that would ideally reduce the program's time. Is there any way to do this?
Any help with these questions would be greatly appreciated.
You can certainly write a daemon / service that runs constantly. For a starting tutorial see
http://www.netzmafia.de/skripten/unix/linux-daemon-howto.html
Your daemon can use IMAP or POP3 (there are existing libraries available to support this) to periodically check the email account for new emails and act accordingly.
Here's a question with answers from SO that discusses how to accomplish all of this with Python
How to make a Python script run like a service or daemon in Linux
For the first part, there are two easy solutions:
Use the Vixie cron @reboot start specification to start your daemon at reboot as a standard user. This and every-minute cron jobs are the only mechanisms that make it easy to run a daemon-style service as a user.
Use procmail to start a new script on every email delivery. The downside here is that procmail will run and then start a new program on every email -- when you're getting a hundred emails per second, this could be a serious hindrance compared to a daemon that uses inotify(7) to alert a long-lived program about new emails.
For the second part, look for a wrapper for the fork(2) system call. It cleaves a program cleanly in half -- parent and child -- and allows each to continue independent execution from then on. If the child and parent need to communicate again in the future, then perhaps see if PHP supports threaded execution.
And what about incron? Maybe there is a way to use it in your case, but you must produce a filesystem event (for example, create a new file).
Question,
How can I spurn another process within a daemon?
I want to use the pear system daemon library to spurn a daemon and then spurn off processes within that daemon.
So daemon runs
and then a new process is spurn off and does calculation separately
then other processes are spurn off that runs separate from the daemon.
meanwhile, daemon keeps executing code and spurns off more processes
how can I accomplish this?
System_Daemon only handles startup/shutdown handling, general signal handling and logging.
If you want to spawn new processes from your PHP code, you need to use PHP's pcntl functions.
Spurn? I assume you mean spawn.
PHP has lots of functions for creating processes - however (AFAIK) they are all blocking (except for pcntl_exec which replaces the current process)
From a quick sift through the documentation, the PEAR System_Daemon only handles daemonizing the process, not running a server process and handling multiple clients. How you go about implementing this will have a big impact on how you handle starting up new processes.
One solution would be to fork an instance of the current process to handle an incoming connection - there's an example on the socket_accept() doc page. Then it doesn't matter if the process you start is via a blocking call or not.
But a much simpler solution would be not to bother with a daemon / forking / sockets and just invoke it via [x]inetd using stdio.
C.
I had the same problem before. The solution I did was to have one system_daemon calling another system_daemon through exec. You need to change the appPidLocation option to run a new instance of the same code.
To see the list of options I looked at the code of system_daemon.
Could someone explain to me in two words what a daemon is and what use it has in PHP?
I know that it is a process which is running all the time.
But I can't understand what use it has in a PHP app.
Can someone please give examples of use?
Can I use a daemon to lessen the memory usage of my app?
As I understand it, a daemon can hold data and give it out on request, so basically I can store the most-used data there to avoid getting it from MySQL for each visitor?
Or am I totally wrong? :)
Thanks ;)
A daemon is an endlessly running process which just waits for jobs. A web server ("HTTP daemon") waits for requests to handle, a printer daemon waits for something to print (and so on). On Windows systems it's called a "service".
Whether you can use one for your application depends heavily on your application and on what you want the daemon to do. However, I don't recommend PHP for that.
Could someone explain to me in two words what a daemon is and what use it has in PHP?
A CLI application or process.
I know that it is a process which is running all the time. But I can't understand what use it has in a PHP app.
You can use it for jobs that are not visible to the user or the interface, e.g. cleaning up stale database data, or a scheduled task that updates part of the database or a page in the background.
Can someone please give examples of use? Can I use a daemon to lessen the memory usage of my app?
I think Drupal has a cron script; perhaps checking it would help. Lessen memory? No, memory optimization always comes down to how the application is designed and how the script is coded.
As I understand it, a daemon can hold data and give it out on request, so basically I can store the most-used data there to avoid getting it from MySQL for each visitor?
No, a daemon is a script; however, you can create a JSON or XML data file that the daemon script can process.
Please see this answer regarding the use of PHP for a daemon. There are times when you might want to fork a child process in PHP, perhaps to execute some query while the parent does other work and then inform the parent that the job as a whole can be completed.
I would not, however use PHP to set up a socket server or similar, nor would I use PHP in any other instance where execution was measured in units greater than seconds.
I don't want to discourage you from exploring and experimenting, just caution you against putting too much trust in an implementation that exceeds the capabilities of the language.
Because a daemon is just a process that runs in an infinite loop, whether or not a daemon can be helpful for your particular app is entirely up to the daemon and the requirements of your app.
MySQL is itself run as a daemon, but a typical way of decreasing the number of calls to MySQL is to cache their output in Memcached (which not surprisingly also runs as a daemon). So the advantage of using Memcached isn't that it's a daemon, it's that it's a daemon more geared to a specific task (caching objects) than MySQLd (providing a SQL-queryable database).
If your app repeatedly needs to make the same SQL queries, then it's definitely worth considering using Memcache or another caching layer (which, yes, will most likely be provided by a daemon) in between the app and MySQL.
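A minimal sketch of that caching layer, assuming $cache is a connected instance of PHP's Memcached class (any object with compatible get() and set() methods would work). The function name fetchCached, the key, and the loader callback are hypothetical:

```php
<?php
// Cache-aside sketch. $cache is assumed to be a connected Memcached instance;
// the key name and the $loadFromDb callback are hypothetical.

function fetchCached($cache, string $key, callable $loadFromDb, int $ttl = 60)
{
    $value = $cache->get($key);
    if ($value !== false) {
        return $value;        // cache hit: no SQL query needed
    }
    $value = $loadFromDb();   // cache miss: run the real query once
    $cache->set($key, $value, $ttl);
    return $value;
}
```

Because get() reports a miss as false, this sketch cannot cache a value that is itself false; that is a simplification rather than a recommendation.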
I know PHP is not multithreaded, but I talked with a friend about this: if I have a large algorithmic problem that I want to solve with PHP, isn't the solution simply to use the "curl_multi_xxx" interface and start n HTTP requests on the same server? This is what I would call PHP-style multithreading.
Are there any problems with this in the typical web server environment? The master request, which is waiting on "curl_multi_exec", shouldn't count any time against its maximum runtime or memory limit.
I have never seen this promoted anywhere as a way to prevent a script from being killed by overly restrictive admin settings for PHP.
If I add this as a feature to a popular PHP system, will there be server admins hiring a Russian mafia hitman to get revenge for this hack?
No, but it's still a terrible idea, if for no other reason than that PHP is supposed to render web pages, not run big algorithms. I see people trying to do this in ASP.NET all the time. There are two proper solutions.
Have your PHP script spawn a process that runs independently of the web server and updates a common data store (probably a database) with information about the progress of the task that your PHP scripts can access.
Have a constantly running daemon that checks for jobs in a common data store that the PHP scripts can issue jobs to and view the progress on currently running jobs.
By using curl, you are adding a network timeout dependency into the mix. Ideally you would run everything from the command line to avoid timeout issues.
PHP does support forking (pcntl_fork). You can fork some processes and then monitor them with something like pcntl_waitpid. You end up with one "parent" process that monitors the children it spawned.
Keep in mind that while one process can start up, load everything, and then fork, you can't share things like database connections, so each forked process should establish its own. I've used forking for up to 50 processes.
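A minimal sketch of that fork-and-monitor pattern, assuming the pcntl extension is available (it normally is only on the command line). The function name runInChildren and the job callables are hypothetical:

```php
<?php
// Sketch of fork-and-wait using the pcntl extension (CLI only).
// Each job is a hypothetical callable returning an exit code; it should open
// its own database connection rather than reuse one inherited from the parent.

function runInChildren(array $jobs): array
{
    $pids = [];
    foreach ($jobs as $name => $job) {
        $pid = pcntl_fork();
        if ($pid === -1) {
            throw new RuntimeException("fork failed for job $name");
        }
        if ($pid === 0) {
            // Child: do the work, then exit so it never returns to the loop.
            exit((int) $job());
        }
        $pids[$pid] = $name; // Parent: remember which child runs which job
    }

    $exitCodes = [];
    foreach ($pids as $pid => $name) {
        pcntl_waitpid($pid, $status); // block until this child ends
        $exitCodes[$name] = pcntl_wexitstatus($status);
    }
    return $exitCodes;
}
```

The parent gets back one exit code per job, which is enough to tell which children failed and need to be retried.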
If forking isn't available for your install of PHP, you can spawn a process as Spencer mentioned. Just make sure you spawn the process in such a way that it doesn't stop processing of your main script. You also want to get the process ID so you can monitor the spawned processes.
exec("nohup /path/to/php.script > /dev/null 2>&1 & echo $!", $output);
$pid = $output[0];
You can also use the above exec() setup to spawn a process started from a web page and get control back immediately.
Out of curiosity - what is your "large algorithmic problem" attempting to accomplish?
You might be better to write it as an Amazon EC2 service, then sell access to the service rather than the package itself.
Edit: you now mention "mass emails". There are already services that do this, they're generally known as "spammers". Please don't.
Lothar,
As far as I know, PHP doesn't work with services the way its competitors do, so PHP has no way to know how much time has passed unless you're constantly interrupting the process to check the elapsed time. So, IMO, no, you can't do that in PHP :)
Greetings All!
I am having some trouble working out how to execute thousands upon thousands of requests to a web service (eBay). I have a limit of 5 million calls per day, so there are no problems on that end.
However, I'm trying to figure out how to process 1,000 - 10,000 requests every minute to every 5 minutes.
Basically the flow is:
1) Get list of items from database (1,000 to 10,000 items)
2) Make an API POST request for each item
3) Accept return data, process data, update database
Obviously a single PHP instance running this in a loop would be impossible.
I am aware that PHP is not a multithreaded language.
I tried the CURL solution, basically:
1) Get list of items from database
2) Initialize multi curl session
3) For each item add a curl session for the request
4) execute the multi curl session
So you can imagine 1,000-10,000 GET requests occurring...
This was OK: around 100-200 requests were occurring in about a minute or two. However, only 100-200 of the 1,000 items were actually processed. I am thinking that I'm hitting some sort of Apache or MySQL limit?
But this does add latency; it's almost like performing a DoS attack on myself.
I'm wondering how you would handle this problem. What if you had to make 10,000 web service requests and 10,000 MySQL updates from the data returned by the web service, and this needed to be done within 5 minutes?
I am using PHP and MySQL with the Zend Framework.
Thanks!
I've had to do something similar, but with Facebook, updating 300,000+ profiles every hour. As suggested by grossvogel, you need to use many processes to speed things up because the script is spending most of its time waiting for a response.
You can do this with forking, if your PHP install has support for forking, or you can just execute another PHP script via the command line.
exec('nohup /path/to/script.php >> /tmp/logfile 2>&1 & echo $!', $processId);
You can pass parameters (getopt) to the PHP script on the command line to tell it which "batch" to process. Note that exec() fills $processId with an array of output lines, so the PID itself is in $processId[0]. You can have the master script do a sleep/check cycle to see if the scripts are still running by checking for the process IDs. I've tested up to 100 scripts running at once in this manner, at which point the CPU load can get quite high.
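A minimal sketch of the sleep/check cycle, assuming the posix extension is available and that the PIDs were collected when the scripts were launched. The function names isRunning and waitForAll are hypothetical:

```php
<?php
// Sketch of the sleep/check cycle. Assumes the posix extension and that the
// PIDs were captured when the workers were started (for example via "echo $!").

function isRunning(int $pid): bool
{
    // Signal 0 delivers nothing; it only checks that the process exists
    // and that we are allowed to signal it.
    return posix_kill($pid, 0);
}

function waitForAll(array $pids, int $pollSeconds = 1): void
{
    while ($pids) {
        $pids = array_filter($pids, 'isRunning'); // drop finished workers
        if ($pids) {
            sleep($pollSeconds);
        }
    }
}
```

A PID can in principle be reused by an unrelated process, so a more careful master script might also record each worker's start time; this sketch ignores that.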
Combine multiple processes with multi-curl, and you should easily be able to do what you need.
My two suggestions are (a) do some benchmarking to find out where your real bottlenecks are and (b) use batching and caching wherever possible.
Mysqli allows multiple-statement queries, so you could definitely batch those database updates.
The http requests to the web service are more likely the culprit, though. Check the API you're using to see if you can get more info from a single call, maybe? To break up the work, maybe you want a single master script to shell out to a bunch of individual processes, each of which makes an api call and stores the results in a file or memcached. The master can periodically read the results and update the db. (Careful to rotate the data store for safe reading and writing by multiple processes.)
To understand your requirements better: must you implement your solution only in PHP, or can you interface a PHP part with another part written in another language?
If you cannot use another language, try to perform this update as a PHP script that runs in the background rather than through Apache.
You can follow Brent Baisley's advice for a simple use case.
If you want to build a robust solution, then you need to:
set up a representation of the actions in a database table that will be your process queue;
set up a script that pops actions off this queue and processes them;
set up the cron daemon to run this script every x.
This way you can have 1000 PHP scripts running, using your OS's parallelism capabilities and not hanging when eBay is taking too long to respond.
The real advantage of this system is that you can fully control the firepower you throw at your task by adjusting:
the number of requests one PHP script makes;
the order / number / type / priority of the actions in the queue;
the number of scripts the cron daemon runs.
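The queue-popping script could be sketched roughly as follows, using PDO. The table name action_queue, its columns (id, payload, status), and the function names are all hypothetical; the claim step is one possible way to stop two cron-started scripts from taking the same action:

```php
<?php
// Sketch of a cron-driven queue worker using PDO. The table name
// action_queue and its columns (id, payload, status) are hypothetical.

function claimNextAction(PDO $db): ?array
{
    $row = $db->query(
        "SELECT id, payload FROM action_queue WHERE status = 'pending' ORDER BY id LIMIT 1"
    )->fetch(PDO::FETCH_ASSOC);
    if ($row === false) {
        return null; // queue is empty
    }
    // Claim the row only if it is still pending, so that two workers
    // started at the same time cannot both take it.
    $claim = $db->prepare(
        "UPDATE action_queue SET status = 'processing' WHERE id = ? AND status = 'pending'"
    );
    $claim->execute([$row['id']]);
    // If another worker won the race, try the next pending row.
    return $claim->rowCount() === 1 ? $row : claimNextAction($db);
}

function processBatch(PDO $db, callable $handler, int $maxActions): int
{
    $done = 0;
    $finish = $db->prepare("UPDATE action_queue SET status = 'done' WHERE id = ?");
    while ($done < $maxActions && ($action = claimNextAction($db)) !== null) {
        $handler($action['payload']); // e.g. make the web service request
        $finish->execute([$action['id']]);
        $done++;
    }
    return $done;
}
```

Here $maxActions corresponds to the number of requests one script makes, and the ORDER BY clause is where a priority column could be introduced.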
Thanks everyone for the awesome and quick answers!
The advice from Brent Baisley and e-satis works nicely. Rather than executing the sub-processes using cURL like I did before, forking takes a massive load off, and it also nicely gets around the issue of maxing out my Apache connection limit.
Thanks again!
It is true that PHP is not multithreaded, but it can certainly be set up with multiple processes.
I have created a system that resembles the one that you are describing. It's running in a loop and is basically a background process. It uses up to 8 processes for batch processing and a single control process.
It is somewhat simplified because I do not need any communication between the processes. Everything resides in a database, so each process is spawned with the full context taken from the database.
Here is a basic description of the system.
1. Start control process
2. Check database for new jobs
3. Spawn child process with the job data as a parameter
4. Keep a table of the child processes to be able to control the number of simultaneous processes.
Unfortunately it does not appear to be a widespread idea to use PHP for this type of application, and I really had to write wrappers for the low-level functions.
The manual has a whole section on these functions, and it appears that there are methods for allowing IPC as well.
PCNTL has the functions to control forking/child processes, and Semaphore covers IPC.
The interesting part of this is that I'm able to fork off actual PHP code, not execute other programs.
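The control-process steps listed above can be sketched like this with the PCNTL functions. This is not the author's code: the function name runWithLimit is hypothetical, and the jobs arrive as an array where a real control process would poll the database:

```php
<?php
// Sketch of a control process that caps simultaneous children (pcntl, CLI only).
// $jobs and $worker are hypothetical; a real control loop would fetch new
// jobs from the database instead of taking an array.

function runWithLimit(array $jobs, callable $worker, int $maxChildren = 8): int
{
    $running = [];   // pid => job key: the "table of child processes"
    $finished = 0;

    // Block until one child exits, then free its slot.
    $reapOne = function () use (&$running, &$finished): bool {
        $pid = pcntl_wait($status);
        if ($pid <= 0) {
            return false; // no children left to wait for
        }
        unset($running[$pid]);
        $finished++;
        return true;
    };

    foreach ($jobs as $key => $job) {
        while (count($running) >= $maxChildren && $reapOne()) {
            // keep reaping until a slot is free
        }
        $pid = pcntl_fork();
        if ($pid === -1) {
            throw new RuntimeException('fork failed');
        }
        if ($pid === 0) {
            $worker($job); // child runs PHP code directly, with the job data
            exit(0);
        }
        $running[$pid] = $key;
    }

    while ($running && $reapOne()) {
        // drain the remaining children
    }
    return $finished;
}
```

The default of 8 children mirrors the figure mentioned above; the right cap depends on how much of each job is spent waiting rather than computing.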