PHP pthreads, in pool tasks, ob_flush and flush cause crash

$p = new Pool(10);
for ($i = 0; $i < 1000; $i++) {
    $tasks[$i] = new workerThread($i);
}
foreach ($tasks as $task) {
    $p->submit($task);
}
// shutdown will wait for the current queue to be completed
$p->shutdown();
// garbage collection check / read results
$p->collect(function($checkingTask){
    return $checkingTask->isGarbage;
});

class workerThread extends Collectable {
    public function __construct($i) {
        $this->i = $i;
    }

    public function run() {
        echo $this->i;
        ob_flush();
        flush();
    }
}
The code above is a simple example that can cause a crash. I'm trying to update the page in real time by putting ob_flush(); and flush(); in the Threaded object, and it mostly works as expected. The code above is not guaranteed to crash every time, but if you run it a few more times, the script sometimes stops and Apache restarts with the error message "httpd.exe Application error The instruction at "0x006fb17f" referenced memory at "0x028a1e20". The memory could not be "Written". Click on OK."
I think it's caused by a flushing conflict when multiple threads try to flush at about the same time? What can I do to work around it and still flush whenever there is new output?

Multiple threads should not write to standard output; there is no safe way to do this.
Zend provides no facility to make it safe; when it appears to work, it works by coincidence, and it will always be unsafe.


Cleanup console command on any termination

I have the following (simple) lock code for a Laravel 5.3 command:
private $hash = null;

public final function handle() {
    try {
        $this->hash = md5(serialize([ static::class, $this->arguments(), $this->options() ]));
        $this->info("Generated signature ".$this->hash, "v");
        if (Redis::exists($this->hash)) {
            $this->hash = null;
            throw new \Exception("Method ".$this->signature." is already running");
        }
        Redis::set($this->hash, true);
        $this->info("Running method", "vv");
        $this->runMutuallyExclusiveCommand(); // Actual command is not important here
        $this->cleanup();
    } catch (\Exception $e) {
        $this->error($e->getMessage());
    }
}

public function cleanup() {
    if (is_string($this->hash)) {
        Redis::del($this->hash);
    }
}
This works fine if the command is allowed to go through its execution cycle normally (including when a PHP exception is thrown). The problem arises when the command is interrupted by other means (e.g. Ctrl-C, or when the terminal window is closed). In that case the cleanup code is not run and the command is considered to be still "executing", so I need to manually remove the entry from the cache in order to restart it. I have tried running the cleanup code in a __destruct function, but that does not seem to be called either.
My question is, is there a way to set some code to be ran when a command is terminated regardless how it was terminated?
The short answer is no. When you kill the running process, either with Ctrl-C or by closing the terminal, you terminate it. You would need an interrupt handler in your shell that links to your cleanup code, but that is way out of scope.
There are other options, however. Cron jobs can be run at intermittent intervals to perform cleanup tasks and other helpful things. You could also create a startup routine that runs prior to your current code: when it executes, it does the cleanup for you, then calls your current routine. I believe your best bet is a cron job that runs at given intervals, looks for entries in the cache that are no longer appropriate, and cleans them. Here is a decent site to get you started with cron jobs: https://www.linux.com/learn/scheduling-magic-intro-cron-linux
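If the command runs on Linux, another option worth sketching is trapping the signals yourself before falling back to cron. This is only an illustration, not Laravel API: cleanupLock() is a hypothetical stand-in for the Redis::del call above, and it assumes the pcntl extension (present in most Linux CLI builds of PHP).

```php
<?php
// Sketch: trap SIGINT/SIGTERM so the lock is released on Ctrl-C.
// Assumes the pcntl extension; cleanupLock() is a hypothetical
// stand-in for the Redis::del() call in cleanup() above.
function cleanupLock(string $hash): void {
    // In the real command this would be Redis::del($hash).
    echo "releasing lock $hash\n";
}

$hash = md5('example-command');

if (function_exists('pcntl_signal')) {
    pcntl_async_signals(true);       // PHP 7.1+: no ticks needed
    $handler = function (int $signo) use ($hash) {
        cleanupLock($hash);
        exit(128 + $signo);          // conventional exit code for signals
    };
    pcntl_signal(SIGINT, $handler);  // Ctrl-C
    pcntl_signal(SIGTERM, $handler); // plain kill <pid>
}

// ... run the mutually exclusive command here ...
```

Closing the terminal sends SIGHUP, which can be trapped the same way; nothing can catch SIGKILL, though, so a cron sweeper remains a sensible backstop.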

run big loop with parallel threads in PHP CLI

I have a computation-expensive backend process in Symfony2 / PHP that I would like to run multi-threaded.
Since I iterate over thousands of objects, I think I shouldn't start one thread per object. I would like to have a $cores variable that defines how many threads I want in parallel, then iterate through the loop and keep that many threads running. So every time a thread finishes, a new one with the next object should be started, until all objects are done.
Looking at the pthreads documentation and doing some Google searches, I can't find a usable example for this situation. All the examples I found run a fixed number of threads once; none of them iterate over thousands of objects.
Can someone point me into the right direction to get started? I understand the basics of setting up a thread and joining it, etc. but not how to do it in a loop with a wait condition.
The answer to the question is to use the Worker and Pool abstraction.
The basic idea is that you ::submit Threaded objects to the Pool, which stacks them onto the next available Worker, distributing your Threaded objects (round robin) across all Workers.
What follows is super simple code for PHP 7 (pthreads v3):
<?php
$jobs = [];
while (count($jobs) < 2000) {
    $jobs[] = mt_rand(0, 1999);
}

$pool = new Pool(8);

foreach ($jobs as $job) {
    $pool->submit(new class($job) extends Threaded {
        public function __construct(int $job) {
            $this->job = $job;
        }

        public function run() {
            var_dump($this->job);
        }
    });
}

$pool->shutdown();
?>
The jobs are pointless, obviously. In the real world, I guess your $jobs array keeps growing, so you can just swap foreach for some do {} while, and keep calling ::submit for new jobs.
In the real world, you will want to collect garbage in the same loop (just call Pool::collect with no parameters for default behaviour).
Noteworthy, none of this would be possible if it really were the case that PHP wasn't intended to work in multi-threaded environments ... it definitely is.
That is the answer to the question, but it doesn't make it the best solution to your problem.
You have mentioned in comments that you assume 8 threads executing Symfony code will take up less memory than 8 processes. This is not the case: PHP is shared-nothing, all the time. You can expect 8 Symfony threads to take up as much memory as 8 Symfony processes, in fact a little more. The benefit of using threads over processes is that they can communicate, synchronize and (appear to) share with each other.
Just because you can, doesn't mean you should. The best solution for the task at hand is probably to use some ready made package or software intended to do what is required.
Studying this stuff well enough to implement a robust solution is something that will take a long time, and you wouldn't want to deploy that first solution ...
If you decide to ignore my advice, and give it a go, you can find many examples in the github repository for pthreads.
Joe has a good approach, but I found a different solution elsewhere that I am now using. Basically, I have two commands, one control and one worker command. The control command starts background processes and checks their results:
protected function process($worker, $entity, $timeout = 60) {
    $min = $this->em->createQuery('SELECT MIN(e.id) FROM BM2SiteBundle:'.$entity.' e')->getSingleScalarResult();
    $max = $this->em->createQuery('SELECT MAX(e.id) FROM BM2SiteBundle:'.$entity.' e')->getSingleScalarResult();
    $batch_size = ceil((($max - $min) + 1) / $this->parallel);
    $pool = array();
    for ($i = $min; $i <= $max; $i += $batch_size) {
        $builder = new ProcessBuilder();
        $builder->setPrefix($this->getApplication()->getKernel()->getRootDir().'/console');
        $builder->setArguments(array(
            '--env='.$this->getApplication()->getKernel()->getEnvironment(),
            'maf:worker:'.$worker,
            $i, $i + $batch_size - 1
        ));
        $builder->setTimeout($timeout);
        $process = $builder->getProcess();
        $process->start();
        $pool[] = $process;
    }
    $this->output->writeln($worker.": started ".count($pool)." jobs");
    $running = 99; // force at least one pass through the loop
    while ($running > 0) {
        $running = 0;
        foreach ($pool as $p) {
            if ($p->isRunning()) {
                $running++;
            }
        }
        usleep(250);
    }
    foreach ($pool as $p) {
        if (!$p->isSuccessful()) {
            $this->output->writeln('fail: '.$p->getExitCode().' / '.$p->getCommandLine());
            $this->output->writeln($p->getOutput());
        }
    }
}
where $this->parallel is a variable I set to 6 on my 8-core machine; it signifies the number of processes to start. Note that this method requires that I iterate over a specific entity (it splits by that), which is always true in my use cases.
It's not perfect, but it starts completely new processes instead of threads, which I consider the better solution.
The worker command takes min and max ID numbers and does the actual work for the set between those two.
This approach works as long as the data set is reasonably well distributed. If you have no data in the 1-1000 range but every ID between 1000 and 2000 is used, the first three processes would have nothing to do.
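The range-splitting itself can be sketched in isolation. batchRanges is a hypothetical helper name, with plain integers standing in for the two Doctrine min/max queries:

```php
<?php
// Splits [min, max] into ceil-sized batches, mirroring the for loop in
// process() above. Gaps in the ID space still fall into some batch;
// they just leave that worker with little or nothing to do.
function batchRanges(int $min, int $max, int $parallel): array {
    $batchSize = (int) ceil((($max - $min) + 1) / $parallel);
    $ranges = [];
    for ($i = $min; $i <= $max; $i += $batchSize) {
        $ranges[] = [$i, min($i + $batchSize - 1, $max)];
    }
    return $ranges;
}

// With 6 workers over IDs 1..1000 the batch size is 167 and the last
// batch is clipped to [836, 1000].
print_r(batchRanges(1, 1000, 6));
```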

PHP Cli key event listener function

I am making a PHP CLI application in which I need a key event listener. Let's say this is my code:
for ($i = 0; $i <= $n; $i++) {
    $message->send($user[$i]);
}
Once I'm done sending messages, I have to keep the connection alive with the following code in order to receive delivery receipts.
I use $command = fopen ("php://stdin","r"); to receive user commands.
while (true) {
    $connection->refresh();
}
The connection is automatically kept alive during any activity, but when idle I have to keep the above loop running.
How can I listen for a keypress that breaks this loop and executes some function?
PHP was not designed to handle this kind of problem. The magic word would be Thread.
The cleanest way I can imagine is taking a look at the pthreads extension. With it, you can use threads like in other languages:
class AsyncOperation extends Thread {
    public function __construct($arg) {
        $this->arg = $arg;
    }

    public function run() {
        if ($this->arg) {
            printf("Hello %s\n", $this->arg);
        }
    }
}

$thread = new AsyncOperation("World");
if ($thread->start()) {
    $thread->join();
}
The other way would be to run tasks via a queue in a different script. There are some queue servers out there, but it can also be done by simply calling your queue script via php.exe with shell_exec. On Linux you need something like ...script.php > /dev/null 2>/dev/null &, on Windows start /B php..., to stop waiting for the script to finish.
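A third option that needs no threads at all is to multiplex the idle loop and input with stream_select(). This is a sketch under stated assumptions: waitForKey is a hypothetical helper, $connection is the object from the question, and select() on STDIN works on Linux but not reliably on Windows.

```php
<?php
// Sketch: poll a stream with stream_select() so a keypress interrupts
// the idle loop while refresh() still runs on every timeout tick.
// For the question's case, pass fopen('php://stdin', 'r') and the
// real $connection->refresh() callback.
function waitForKey($stream, callable $refresh, int $timeoutSec = 1) {
    stream_set_blocking($stream, false);
    while (true) {
        $read = [$stream];
        $write = $except = [];
        // Block for at most $timeoutSec seconds waiting for input.
        if (stream_select($read, $write, $except, $timeoutSec) > 0) {
            return fgets($stream);  // input arrived: stop looping
        }
        $refresh();                 // idle tick: keep the connection alive
    }
}

// Usage against the question's loop:
// $key = waitForKey(fopen('php://stdin', 'r'), fn () => $connection->refresh());
```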

PHP pthreads memory issue

I am developing a networking application where I listen on a port and create a new socket and thread when a new connection request arrives; the architecture is working well, but we are facing severe memory issues.
The problem is that even if I create a single thread, it does the work but the memory keeps increasing.
To demonstrate the problem, please review the following code, in which we start one thread of a class whose duty is to print a thread ID and a random number infinitely.
class ThreadWorker extends Thread {
    public function run() {
        while (1) {
            echo $this->getThreadId()." => ".rand(1, 1000)."\r\n";
        }
    }
}

$th = new ThreadWorker();
$th->start();
I am developing on Windows, and when I open the Task Manager the php.exe memory usage keeps increasing until the system becomes unresponsive.
Please note that the PHP script is executed from command line:
PHP.exe pthreads-test.php
OK, I think the problem is that the thread loop is highly CPU-consuming. Avoid code like that. If you just want to echo a message, I recommend putting a sleep() instruction after it. Example:
class ThreadWorker extends Thread {
    public function run() {
        while (1) {
            echo $this->getThreadId()." => ".rand(1, 1000)."\r\n";
            sleep(1);
        }
    }
}
EDIT
It seems there's a way to force garbage collection in PHP. On the other hand, sleep() is not a proper way to stabilize CPU use. Normally threads do things like reading from files, sockets or pipes, i.e., they often perform I/O operations, which are normally blocking (they pause the thread until the I/O operation is possible). This behaviour inherently yields the CPU and other resources to other threads, thus stabilizing the whole system.
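The forced collection referred to above is gc_collect_cycles(). A minimal stand-alone illustration, with no pthreads involved (the Node class is invented for the example):

```php
<?php
// Circular references are what plain refcounting cannot reclaim; the
// cycle collector can. Long-running loops can call gc_collect_cycles()
// periodically to keep memory flat.
gc_enable();

class Node { public $next = null; }

for ($i = 0; $i < 10000; $i++) {
    $a = new Node();
    $b = new Node();
    $a->next = $b;  // $a and $b now reference each other,
    $b->next = $a;  // forming a cycle on every iteration
}

$collected = gc_collect_cycles();  // number of cycles collected
var_dump($collected >= 0);
```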

Why does flock occasionally take a long time on Windows / NTFS?

I use the file system to create an application-wide persistent singleton (the application does not use a database). Occasionally a page takes 1-2 minutes to load, and I have narrowed the problem down to the use of flock in the function that gets an instance of the singleton. Here is a simplified version of the code (edit: I left out the most important part of the code in my original post):
public static final function getInstance() {
    if (is_null(self::$instance)) {
        $fh = fopen($filename, 'ab+');
        if (flock($fh, LOCK_EX)) {
            $N = filesize($filename);
            if ($N > 0) {
                rewind($fh);
                $s = stream_get_contents($fh);
                $obj = unserialize($s);
            } else {
                $obj = new MyClass();
            }
            self::$instance = $obj;
            return $obj;
        } else {
            fclose($fh);
            trigger_error("could not create lock", E_USER_WARNING);
        }
    } else {
        return self::$instance;
    }
}
The code is currently being run on my development machine, which uses XP and NTFS.
The lock is always created (i.e. trigger_error is not called).
The delay is random but seems to happen more often when refresh is hit.
Getting rid of flock completely eliminates the problem, but it also makes the code unsafe.
Any advice? Does anyone know a better way of creating an application-wide persistent singleton?
Who closes the $fh in the if {} clause? Isn't it left open? In that case it might take a long time to unlock.
Otherwise it will stay open for at least the duration of the script.
You could try locking with the LOCK_SH parameter instead of LOCK_EX. You can still lock for writing if you find that you need to at a later time. I would further release the lock as soon as possible or other processes will block unnecessarily.
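One concrete variant of that advice is the non-blocking flag: LOCK_EX | LOCK_NB makes flock() fail immediately instead of stalling the request for minutes, so the caller can fall back or retry. A minimal sketch against a temp file, with the singleton logic elided:

```php
<?php
// Non-blocking exclusive lock: flock() returns false at once if another
// process holds the lock, rather than waiting for it.
$filename = tempnam(sys_get_temp_dir(), 'singleton');
$fh = fopen($filename, 'ab+');

if (flock($fh, LOCK_EX | LOCK_NB)) {
    // ... read or initialise the singleton as in getInstance() ...
    flock($fh, LOCK_UN);  // release as soon as possible
    fclose($fh);
    echo "lock acquired\n";
} else {
    // Another request holds the lock: fall back, retry, or report.
    fclose($fh);
    echo "lock busy\n";
}

unlink($filename);
```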
