I'm working on a small weekend project, which is basically an online IDE that lets you run PHP, Ruby or Python code from the browser. I have everything set up and working, but the way I built the system, if a user runs a badly written script, or a script with heavy calculations, the whole system can slow down for everyone until I hit the timeout (15 seconds).
My system does not pass the fibonacci test. How can I run each process in isolation, so that users can write:
while (true) { fibonacci() } // pseudo-code
Without crashing the server? I have considered the following courses of action:
Running each process inside a Docker (https://www.docker.io) container, but I'm not sure how Docker deals with slow containers
Running each process inside a VM
Running each process in an instantly-created EC2 instance (which is not really an option, since this is slow and expensive)
You should spawn another process using the multiprocessing module and run the user's code within that spawned process, thus keeping the submitted code "isolated". Keep in mind, however, that you should still run all of this inside a virtual machine, because running it outside of one is unsafe on many levels.
Using this method you can also lower the process's priority, since you are on Linux, and this should keep each process from slowing down the overall machine while the timeout runs. This assumes that you are indeed running a Linux system.
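Something like this, as a rough shell-level sketch of the same priority/timeout idea (not the multiprocessing module itself), assuming a Linux host and the submitted code saved to a hypothetical /tmp/user_code.py:

# run the submitted code in its own low-priority process with hard limits
# (the 15-second budget comes from the question; the path is made up)
(
  ulimit -t 15          # cap CPU time at 15 seconds
  ulimit -v 262144      # cap virtual memory at roughly 256 MB
  exec nice -n 19 timeout 15s python3 /tmp/user_code.py
)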
Try limiting the process to just one of your CPU cores.
You can use taskset to do that:
http://linux.die.net/man/1/taskset
You can also isolate one of your CPU cores using isolcpus (your system processes won't use that core), and use taskset to run the PHP/Ruby/Python code on that core (a sketch follows after the links).
Learn more about isolcpus:
Whole one core dedicated to single process
https://askubuntu.com/questions/165075/how-to-get-isolcpus-kernel-parameter-working-with-precise-12-04-amd64
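A rough sketch of how the two fit together; the core number and paths are only examples:

# pin an untrusted script to core 3 (assumes a 4-core Linux box)
taskset -c 3 php /tmp/user_code.php

# keep normal system processes off that core by booting with isolcpus,
# e.g. in /etc/default/grub (path assumed for Debian/Ubuntu):
#   GRUB_CMDLINE_LINUX_DEFAULT="quiet splash isolcpus=3"
# then: sudo update-grub && reboot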
We are working on a project that uses Node.js with a background socket to send continuous responses to a web application. Sometimes one of the processes stops on its own.
We would like to know how we can check all the processes running under forever.
We are using sudo forever list to list all processes. Is there any way to use this command (forever list) in a .sh (shell script) file to check whether a specific process, such as responsclient, is running or not? If that particular process is not running, we need to start it.
There are several solutions that will ensure that your service is always running.
One of them is even called forever. Express has prepared an overview of these tools.
However, for production services I recommend Passenger. The result is almost the same, but with much greater scalability; for example, you can configure it so that another instance is added automatically.
Almost, because it is designed to ensure the availability of HTTP, not the constant operation of the application.
BTW: your service stops because you have an uncaught exception.
Update
If you insist on forever (we are talking about the same forever?), then:
Make sure that forever is run by the same user; forever keeps a separate process list for each user.
Make sure you save your data in the same place (an automatic run, e.g. via cron, has a different environment from a manual startup - variables in env).
forever has --pidFile, which makes it very easy to check whether the process is working. A check script is sketched below.
Also, ps aux | grep node should be your big friend.
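A rough sketch of such a check in a .sh file, assuming the worker script is called responsclient.js (name taken from the question) and lives in a hypothetical /home/app:

#!/bin/sh
# restart the worker if forever no longer lists it
if ! sudo forever list | grep -q "responsclient.js"; then
    sudo forever start /home/app/responsclient.js
fi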
No, I did not combine them. When I started having problems, I switched to Passenger. In the end that worked out well, because I got professional monitoring up and running in less time than it would have taken to work out how to combine the points above.
I have about 35 cron jobs right now. Most of them are PHP scripts that either scrape or do some calculations. The scripts also loop over 10-20 different servers to do those scrapes. (They are in different countries, so they have to be separate calls.)
So we have 30 scripts, each of which loops over 20 servers and therefore takes about 5-15 minutes to run. I have the scripts spaced out right now.
But is it better to have 80 individual scripts run instead of 35 scripts that loop and take a while? Each script would take maybe 1-2 minutes instead of 10-15min.
That would of course spawn a ton more PHP processes. Is there any issue or limit with 10-15 or more PHP processes running at once?
I'm running a performance cloud server on Rackspace.
Personally, if the jobs need to complete in a certain order, I would make the process as linear as possible. It might take longer, but I always err on the side of data accuracy.
It depends.
If you create more processes that run at the same time, you are going to increase your overall memory footprint. Each process carries its own memory overhead just to run and to load the libraries it needs (aside from whatever memory its actual work requires). You will also have more than twice as many scripts to monitor to make sure they are running successfully.
However, by creating more processes you will be able to speed things up, since you are essentially multi-threading: one process can continue while another is blocked waiting on I/O.
If each script doesn't have a dependency on another, breaking them into smaller scripts should be fine. If you can handle monitoring more scripts, and the server can handle it, then I would do it.
If the scripts do have dependencies, or if you would have to run so many at the same time that your server usage maxes out, keep them together.
That being said, I would also try to optimize the scripts and make sure there isn't something you can do to make them faster without creating more processes.
Depending on how you have the servers set up, I would run them all at once. I would also run them at night, during off hours when the web servers aren't in use, and not during business operations unless your web app depends on it. If you're on a Cloud Server at Rackspace, I wouldn't worry about bandwidth, although needing more RAM could become an issue further down the road.
Spawning a ton more PHP processes shouldn't be a worry if you have a sufficient amount of RAM; there is no hard limit on the Linux side.
a) Figure out which cron jobs need to run in which order
b) Schedule the cron jobs to run at night, around midnight
c) Fire off the 80 scripts at once
It would also be a good idea to have cron email you the results, or a report that the whole batch went through successfully, rather than one email per individual job; a sketch of this is below.
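A rough sketch of steps (b) and (c) plus the single batch email; the paths and the address are made up, and it assumes a mail command is installed:

#!/bin/sh
# crontab entry (runs once, at midnight):  0 0 * * * /usr/local/bin/run_scrapers.sh
LOG=$(mktemp)
for script in /var/www/crons/*.php; do
    /usr/bin/php "$script" >> "$LOG" 2>&1 &    # fire off every script at once
done
wait                                           # let the whole batch finish
mail -s "Nightly scrape batch finished" ops@example.com < "$LOG"
rm -f "$LOG"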
We have a web app which allows us to monitor and control our server applications. The web pages start applications by executing a shell script to start them. The problem we have run into is that if we need to restart apache, it kills any of the processes that were started by the web app.
The web pages are PHP and use the exec() command to call the start scripts. The start scripts start Java apps and run them with something like this:
nohup java ... &
As mentioned, PHP is running in Apache on Linux. Is there some other switch or way to start these processes which would not have them be child processes of Apache (and killed when it stops)?
CLARIFICATION
I am more familiar with Windows than with Linux. In Windows, if you want to accomplish what we are trying to do, you add the start keyword in the shell, i.e.:
start <batchfile>
When you use start, the new shell/process can be unhooked from the one that started it. Is there a Linux equivalent to the start command?
Starting long-lasting processes from PHP sounds like asking for big trouble.
You will have problems like the one you describe, and you will face huge security implications.
A much better solution is to have your PHP pages record the intent that something needs to be run in batch mode in a database table (MySQL or PostgreSQL).
Another process (probably running under different, more privileged credentials than the apache www user) should run as a daemon, constantly check the database for new work, and execute the necessary tasks (it could also be fired by cron every few minutes).
This way, you will kill two birds with one stone.
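A rough sketch of that polling daemon; the database, table, column and script names are all made up, and it should run under a dedicated user rather than the apache user:

#!/bin/sh
# poll the queue table and launch the matching start script for each pending row
while true; do
    mysql -N -u worker appdb -e \
        "SELECT id, app_name FROM pending_starts WHERE started = 0" |
    while read -r id app_name; do
        /opt/scripts/start_"$app_name".sh      # the existing start script
        mysql -u worker appdb -e \
            "UPDATE pending_starts SET started = 1 WHERE id = $id"
    done
    sleep 30
done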
I wrote up how to create long-running processes with PHP on my blog; however, I have to agree with mvp that this approach is far from ideal for your purposes, and not just from the point of view of privilege separation (using a setuid program or sudo solves that easily enough).
Depending on what you're trying to achieve here, I suspect that the additional functionality in D. J. Bernstein's daemontools will be a better fit.
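For reference, a rough sketch of what a daemontools service for the Java app could look like; the /etc/service scan directory (Debian layout) and the jar path are assumptions:

# create the service directory and its run script
mkdir -p /etc/service/myapp
cat > /etc/service/myapp/run <<'EOF'
#!/bin/sh
exec java -jar /opt/myapp/yourjava.jar
EOF
chmod +x /etc/service/myapp/run
# svscan picks the new service up and supervise keeps it running; control it with:
#   svc -d /etc/service/myapp    # stop
#   svc -u /etc/service/myapp    # start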
You could use batch(1) to start your long-lasting server processes.
In shell, you could do
batch << END
java -jar yourjava.jar
END
If you have the commands in a shell script file, start it with
batch -f yourbatchfile
If you can improve the Java programs, you might have them call daemon(3) at startup, perhaps using the Apache Commons Daemon project.
You probably want to store each daemon's pid somewhere (e.g. in a file or database), to be able to stop it (first with kill -TERM, then with kill -QUIT, and as a last resort with kill -KILL).
Using the daemon function, or its Java equivalent, is probably better than using batch.
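A rough sketch of that pid bookkeeping and the escalating stop sequence (paths are made up):

# start detached from the web server and remember the pid
nohup java -jar yourjava.jar > /var/log/myapp.log 2>&1 &
echo $! > /var/run/myapp.pid

# stop it later, escalating only if the process is still alive
PID=$(cat /var/run/myapp.pid)
kill -TERM "$PID"; sleep 5
kill -0 "$PID" 2>/dev/null && kill -QUIT "$PID"; sleep 5
kill -0 "$PID" 2>/dev/null && kill -KILL "$PID"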
Part of my web application is a background script that polls a beanstalkd server and processes data.
This script needs to run continuously (like a daemon). If it crashes, it needs to be started again. It also can't be started twice (more precisely, it can't run twice).
As I want to ease the deployment and development process, I want to avoid using pcntl_fork. It's not available on Windows, and it requires recompiling PHP on Mac, and sometimes on Linux too...
Can I do this simply using a bash script to launch the PHP script in background?
# verify that the script is not already running
...
/usr/bin/php myScript.php &
If I execute this script from crontab every hour or so, my process should run continuously and be restarted within at most an hour if it crashes, right?
Assuming blindly that you control the server(s) on which your scripts run, Supervisor is probably a good solution for you.
It's a process control daemon, written in Python. You can configure it to start your PHP script and keep it running. The PHP script itself doesn't need to do anything special. No forking, no manual process control, nothing.
On the other hand, you've also expressed concern about pcntl_fork not being available on Windows. If you're really running this thing on Windows, Supervisor isn't going to work out for you, as it isn't Windows friendly. Keep in mind that Windows isn't really friendly to Unix-style daemonization either, as it would want to control the daemon as a Service. While that's possible, it's not exactly an easy or elegant solution.
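A rough sketch of a Supervisor setup for this worker, assuming a Debian-style /etc/supervisor/conf.d layout and the myScript.php path from the question:

sudo tee /etc/supervisor/conf.d/beanstalk-worker.conf > /dev/null <<'EOF'
[program:beanstalk-worker]
command=/usr/bin/php /var/www/myScript.php
autostart=true
autorestart=true
stdout_logfile=/var/log/beanstalk-worker.log
redirect_stderr=true
EOF

sudo supervisorctl reread
sudo supervisorctl update
sudo supervisorctl status beanstalk-worker

Since Supervisor starts exactly one copy of each configured program by default, this also covers the "can't be started twice" requirement.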
I built an app in PHP where one feature analyzes about 10000 text files, extracts things from them and puts them into a MySQL database. The code itself is just a for loop in which every file is loaded through file_get_contents() and, at the end of that iteration, unset() from memory. The file analysis is a cron job, and a single PHP file does all this processing.
The app was initially built entirely on a shared server and everything worked seamlessly. I didn't notice any delays or major lag, and neither did the users. However, so that it could handle more load, I moved everything to an EC2 server (a micro instance).
The problem I am having now is that every time I run the cronjob (process the files on hourly basis) it slows the entire server down so much that a normal page takes about 5-8 seconds to load, which sort of defeats the purpose of moving it to EC2.
The cron job itself is a very long process. Here are some test results from the script run (every hour):
SQL Insertion Time: 23.138303995132 seconds
Memory Used: 10.05 MB
Execution: 411.00507092476 seconds
But at the top of every hour the server slows down badly for about 7 minutes, despite having more dedicated hardware than a shared server (or so I thought). The graphs on the EC2 dashboard show that CPU usage is close to 100%, but I don't understand how it gets to that level.
Can anyone help me determine why this could be happening? I noticed not even the slightest lag when the cron ran on the shared server, but the case is completely different on EC2.
Please feel free to ask me anything I missed mentioning.
Micro instances are pretty slow. If you use a larger instance, it'll run a lot faster.
We use EC2 for all of our production boxes. I can't say enough good things about that platform. I'll never go back to another host.
Also, if you want to write your code in C++, it'll run A LOT faster. I wrote a simple MySQL insert with this code here. It's multi-threaded, so you can asynchronously run MySQL updates or inserts.
Please let me know if you need any help with it, but I'm sure you'll be able to just use a micro instance still and get great speeds.
Hope that helps...
PS. I'd be willing to help you write a C++ version for your uses... just because it's fun! :-)
Well, EC2 is designed to be scalable.
Since your code runs in one loop, opening each file one after another, it does not make for a scalable design.
Try breaking your code up so that the files are handled concurrently by different instances of the PHP script. That way, each copy of the script runs on its own. If you have multiple servers (or instances of servers in EC2), you can run them on different machines to speed it up even more.
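A rough shell-level sketch of that fan-out; the directory, the chunk count and process_chunk.php (a worker that reads a list of file paths) are all made up:

# split the list of text files into 4 chunks (GNU split) and run one
# low-priority worker per chunk
find /data/textfiles -name '*.txt' > /tmp/all_files.txt
split -n l/4 /tmp/all_files.txt /tmp/chunk_
for chunk in /tmp/chunk_*; do
    nice -n 19 ionice -c3 php process_chunk.php "$chunk" &
done
wait

Running the workers under nice/ionice should also keep the hourly job from starving page loads on the same box.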