I have a multithreaded script; the main idea is to run nmap commands in the console and deliver the results in order. For example, the results after shell_exec come back like this:
Command 4
Command 1
Command 2
Command 3
How can I get the results in order instead?
Command 1
Command 2
Command 3
Command 4
class NmapJob extends Thread {
    public function __construct($arg) {
        $this->arg = $arg;
    }

    public function run() {
        // run the nmap command and store its output on the job object
        $this->salida = shell_exec($this->arg);
    }
}
If you're launching them in separate threads, the jobs are unlikely to finish in the same order that they were started. You'll need to track them and wait until all are done. You didn't include much of your code, but here's a generic example:
// create jobs
$jobs[0] = new NmapJob(args0);
$jobs[1] = new NmapJob(args1);
...

// start jobs
foreach ($jobs as $job) {
    $job->start();
}

// wait for jobs to finish
foreach ($jobs as $job) {
    $job->join();
}

// display results (in the order the jobs were created)
foreach ($jobs as $job) {
    echo $job->salida;
}
But... I suggest using a different technique. Having a shell command dangle like that isn't the best of practices, especially if it can take a while to run (as nmap jobs often do). It's more complicated, but look into running the scans asynchronously. Spawn them as a separate process and have the results dumped into a directory. A different PHP script can be used to process the results in that directory once the scans are done.
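As a rough illustration of that async approach (the results/ directory, the target list, and the nmap flags here are my assumptions, not from the question):

$targets = ['192.168.0.1', '192.168.0.2']; // hypothetical scan targets
foreach ($targets as $i => $target) {
    // -oX writes the report as XML; nohup + & detach nmap so shell_exec returns immediately
    $out = escapeshellarg(__DIR__."/results/scan_$i.xml");
    shell_exec('nohup nmap -oX '.$out.' '.escapeshellarg($target).' > /dev/null 2>&1 &');
}
// A separate script (or cron job) can later parse results/scan_*.xml in order.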
I have a problem with terminating processes started from a queue job.
I use the yii2-queue extension to run some long-running system commands that have a total execution time limit controlled by the getTtr method of the RetryableJobInterface. The command may take anywhere from minutes to hours to fully complete, but I need to kill it after it hits the 60-minute mark.
<?php

use Symfony\Component\Process\Process;
use yii\base\BaseObject;
use yii\queue\RetryableJobInterface;

class TailJob extends BaseObject implements RetryableJobInterface
{
    public function getTtr()
    {
        return 10;
    }

    public function execute($queue)
    {
        $process = new Process('tail -f /var/log/dpkg.log');
        $process->setTimeout(60);
        $process->run();
    }

    public function canRetry($attempt, $error)
    {
        return false;
    }
}
Now, the problem that I face is that even when queue/listen kills the job, the tail command (it's just an example; in production I need to run a different command) keeps running in the background. Is there any way I can force the system to kill the tail command when the job is killed?
Your script needs to keep checking whether the timeout was reached. Note that this only works if you launch the process with start() rather than run(), because run() blocks until the command exits and the loop would never be reached; e.g.
$process->start();
while ($process->isRunning()) {
    $process->checkTimeout();
    usleep(200000);
}
Read more about "Process Timeout" here:
https://symfony.com/doc/current/components/process.html
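Putting it together, a self-contained version might look like this (a sketch; when the limit is exceeded, checkTimeout() terminates the child and throws a ProcessTimedOutException):

<?php

use Symfony\Component\Process\Exception\ProcessTimedOutException;
use Symfony\Component\Process\Process;

$process = new Process('tail -f /var/log/dpkg.log');
$process->setTimeout(3600); // 60 minutes
$process->start();          // start() returns immediately, unlike run()

try {
    while ($process->isRunning()) {
        $process->checkTimeout(); // throws once the timeout is exceeded
        usleep(200000);
    }
} catch (ProcessTimedOutException $e) {
    // the tail process has been terminated; log or clean up here
}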
Run the command with a timeout:
$process = new Process('timeout 3600 tail -f /var/log/dpkg.log');
This will limit the process to a maximum of 60 minutes (3600 seconds). If your script kills it first, that's fine; if it doesn't, the process will die when the timeout expires.
https://linux.die.net/man/1/timeout
I have created a really simple PHP crawler which I want to implement in a Laravel project, but I don't know where to put it. I want to start the script and just have it run while the application is up.
I know that it should not be in the controllers or in the cron schedule, so any suggestions where to set it up?
$homepage = 'https://example.com';
$already_crawled = [];
$crawling = [];

function follow_links($url) {
    global $already_crawled;
    global $crawling;

    $doc = new DOMDocument();
    $doc->loadHTML(file_get_contents($url));
    $linklist = $doc->getElementsByTagName('a');

    foreach ($linklist as $link) {
        $l = $link->getAttribute("href");
        $full_link = 'https://example.com'.$l;
        if (!in_array($full_link, $already_crawled)) {
            $already_crawled[] = $full_link;
            $crawling[] = $full_link;
            echo $full_link.PHP_EOL;
            // Insert data in the DB
        }
    }

    array_shift($crawling);
    foreach ($crawling as $link) {
        follow_links($link);
    }
}

follow_links($homepage);
I would recommend a combination of a Service class, Command, and possibly Jobs — and then running them from worker processes.
Your Service would be a class which contains all of the logic for crawling a page. The crawler service is then used either by an artisan command, a queued job, or a combination of both.
You are right that you don't want to run the crawler directly from the built-in Laravel scheduler (because it might run for a long time and prevent other scheduled tasks from running). However, one option is to use your Laravel schedule to run a task which checks for urls that need to be re-crawled and dispatches queued jobs to your worker processes, which are very easy to implement in Laravel.
Each new discovered url can be thought of as a separate task and queued individually for crawling, rather than running the process "continually" while the application is online.
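For illustration, a queued job wrapping the crawler service might look roughly like this (a sketch only; CrawlerService, its crawl() method, and the CrawlUrl class name are assumptions, not existing code):

<?php

namespace App\Jobs;

use App\Services\CrawlerService;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;

class CrawlUrl implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    private $url;

    public function __construct($url)
    {
        $this->url = $url;
    }

    public function handle(CrawlerService $crawler)
    {
        // crawl one page, then queue a new job for every newly discovered url
        foreach ($crawler->crawl($this->url) as $discovered) {
            self::dispatch($discovered);
        }
    }
}

Dispatching CrawlUrl::dispatch($homepage) once then lets your queue workers fan out across the site, one page per job.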
I have the following (simple) lock code for a Laravel 5.3 command:
private $hash = null;

public final function handle() {
    try {
        $this->hash = md5(serialize([ static::class, $this->arguments(), $this->options() ]));
        $this->info("Generated signature ".$this->hash, "v");
        if (Redis::exists($this->hash)) {
            $this->hash = null;
            throw new \Exception("Method ".$this->signature." is already running");
        }
        Redis::set($this->hash, true);
        $this->info("Running method", "vv");
        $this->runMutuallyExclusiveCommand(); //Actual command is not important here
        $this->cleanup();
    } catch (\Exception $e) {
        $this->error($e->getMessage());
    }
}

public function cleanup() {
    if (is_string($this->hash)) {
        Redis::del($this->hash);
    }
}
This works fine if the command is allowed to go through its execution cycle normally (including when a PHP exception is thrown). However, the problem arises when the command is interrupted by other means (e.g. Ctrl-C, or the terminal window being closed). In that case the cleanup code is not run and the command is considered to still be "executing", so I need to manually remove the entry from the cache in order to restart it. I have tried running the cleanup code in a __destruct function, but that does not seem to be called either.
My question is, is there a way to set some code to be ran when a command is terminated regardless how it was terminated?
Short answer is no. When you kill the running process, either by Ctrl-C or just closing the terminal, you terminate it. You would need to have an interrupt in your shell that links to your cleanup code, but that is way out of scope.
There are other options, however. Cron jobs can be run at intermittent intervals to perform cleanup tasks and other helpful things. You could also create a start-up routine that runs prior to your current code; when it executes, it could do the cleanup for you and then call your current routine. I believe your best bet is a cron job that runs at given intervals, looks for entries in the cache that are no longer appropriate, and cleans them up. Here is a decent site to get you started with cron jobs: https://www.linux.com/learn/scheduling-magic-intro-cron-linux
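One concrete variant of that start-up/cleanup idea (my suggestion, not part of the original code) is to give the Redis lock an expiry when it is created, so a killed command cannot leave a permanent stale lock behind:

// Set the lock only if it doesn't already exist (NX) and let it expire (EX).
// The 3600s TTL is an assumption and should exceed the command's expected runtime.
// This is the predis-style signature; phpredis takes an options array instead.
if (!Redis::set($this->hash, true, 'EX', 3600, 'NX')) {
    throw new \Exception("Method ".$this->signature." is already running");
}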
I have a computation-expensive backend process in Symfony2 / PHP that I would like to run multi-threaded.
Since I iterate over thousands of objects, I think I shouldn't start one thread per object. I would like to have a $cores variable that defines how many threads I want in parallel, then iterate through the loop and keep that many threads running. So every time a thread finishes, a new one with the next object should be started, until all objects are done.
Looking at the pthreads documentation and doing some Google searches, I can't find a usable example for this situation. All the examples I found run a fixed number of threads once; none of them iterate over thousands of objects.
Can someone point me into the right direction to get started? I understand the basics of setting up a thread and joining it, etc. but not how to do it in a loop with a wait condition.
The answer to the question is to use the Pool and Worker abstraction.
The basic idea is that you ::submit Threaded objects to the Pool, which stacks them onto the next available Worker, distributing your Threaded objects (round robin) across all Workers.
What follows is super simple code for PHP 7 (pthreads v3):
<?php
$jobs = [];

while (count($jobs) < 2000) {
    $jobs[] = mt_rand(0, 1999);
}

$pool = new Pool(8);

foreach ($jobs as $job) {
    $pool->submit(new class($job) extends Threaded {
        public function __construct(int $job) {
            $this->job = $job;
        }

        public function run() {
            var_dump($this->job);
        }
    });
}

$pool->shutdown();
?>
The jobs are pointless, obviously. In the real world, I guess your $jobs array keeps growing, so you can just swap foreach for some do {} while, and keep calling ::submit for new jobs.
In the real world, you will want to collect garbage in the same loop (just call Pool::collect with no parameters for default behaviour).
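Roughly, that loop could look like this (a sketch; fetchNewJobs(), the Task class, and $keepRunning are hypothetical stand-ins for your own work source and stop condition):

do {
    foreach (fetchNewJobs() as $job) {
        $pool->submit(new Task($job));
    }
    // reclaim finished Threaded objects so memory doesn't grow unbounded
    $pool->collect();
    usleep(100000);
} while ($keepRunning);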
Noteworthy, none of this would be possible if it really were the case that PHP wasn't intended to work in multi-threaded environments ... it definitely is.
That is the answer to the question, but it doesn't make it the best solution to your problem.
You have mentioned in comments that you assume 8 threads executing Symfony code will take up less memory than 8 processes. This is not the case; PHP is shared-nothing, all the time. You can expect 8 Symfony threads to take up as much memory as 8 Symfony processes, in fact, a little bit more. The benefit of using threads over processes is that they can communicate, synchronize and (appear to) share with each other.
Just because you can, doesn't mean you should. The best solution for the task at hand is probably to use some ready made package or software intended to do what is required.
Studying this stuff well enough to implement a robust solution is something that will take a long time, and you wouldn't want to deploy that first solution ...
If you decide to ignore my advice, and give it a go, you can find many examples in the github repository for pthreads.
Joe has a good approach, but I found a different solution elsewhere that I am now using. Basically, I have two commands, one control and one worker command. The control command starts background processes and checks their results:
protected function process($worker, $entity, $timeout=60) {
    $min = $this->em->createQuery('SELECT MIN(e.id) FROM BM2SiteBundle:'.$entity.' e')->getSingleScalarResult();
    $max = $this->em->createQuery('SELECT MAX(e.id) FROM BM2SiteBundle:'.$entity.' e')->getSingleScalarResult();
    $batch_size = ceil((($max - $min) + 1) / $this->parallel);

    // spawn one background worker process per ID batch
    $pool = array();
    for ($i = $min; $i <= $max; $i += $batch_size) {
        $builder = new ProcessBuilder();
        $builder->setPrefix($this->getApplication()->getKernel()->getRootDir().'/console');
        $builder->setArguments(array(
            '--env='.$this->getApplication()->getKernel()->getEnvironment(),
            'maf:worker:'.$worker,
            $i, $i + $batch_size - 1
        ));
        $builder->setTimeout($timeout);
        $process = $builder->getProcess();
        $process->start();
        $pool[] = $process;
    }
    $this->output->writeln($worker.": started ".count($pool)." jobs");

    // poll until all child processes have finished
    $running = 99;
    while ($running > 0) {
        $running = 0;
        foreach ($pool as $p) {
            if ($p->isRunning()) {
                $running++;
            }
        }
        usleep(250);
    }

    // report any workers that exited with an error
    foreach ($pool as $p) {
        if (!$p->isSuccessful()) {
            $this->output->writeln('fail: '.$p->getExitCode().' / '.$p->getCommandLine());
            $this->output->writeln($p->getOutput());
        }
    }
}
where $this->parallel is a variable I set to 6 on my 8-core machine; it defines the number of processes to start. Note that this method requires that I iterate over a specific entity (it splits the work by entity ID), which is always true in my use cases.
It's not perfect, but it starts completely new processes instead of threads, which I consider the better solution.
The worker command takes min and max ID numbers and does the actual work for the set between those two.
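For illustration, the worker side might look something like this (a sketch only; the command name, entity, and the em property are my assumptions based on the control command above, and the usual Symfony Console imports are omitted as in that snippet):

protected function configure() {
    $this->setName('maf:worker:example')
        ->addArgument('min', InputArgument::REQUIRED)
        ->addArgument('max', InputArgument::REQUIRED);
}

protected function execute(InputInterface $input, OutputInterface $output) {
    $min = $input->getArgument('min');
    $max = $input->getArgument('max');

    // process only the slice of entities assigned to this worker
    $entities = $this->em->createQuery(
        'SELECT e FROM BM2SiteBundle:Example e WHERE e.id BETWEEN :min AND :max'
    )->setParameters(['min' => $min, 'max' => $max])->getResult();

    foreach ($entities as $entity) {
        // ... the actual expensive work for one entity goes here
    }
}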
This approach works as long as the data set is reasonably well distributed. If you have no data in the 1-1000 range but every ID between 1000 and 2000 is used, the first three processes would have nothing to do.
I am making a PHP CLI application in which I need a key event listener.
Let's say this is my code:
// task to do
for ($i = 0; $i <= $n; $i++) {
    $message->send($user[$i]);
}
Once I'm done sending the messages, I have to keep the connection alive with the following code in order to receive delivery receipts.
I use $command = fopen("php://stdin","r"); to receive user commands.
while (true) {
    $connection->refresh();
}
The connection is automatically kept alive during any activity, but when idle I have to keep the above loop running.
How can I listen for a key press that breaks this loop and executes some function?
PHP was not designed to handle this kind of problem. The magic word here would be: threads.
The cleanest way I can imagine is to take a look at the pthreads extension. With it, you can use threads like in other languages:
class AsyncOperation extends Thread {
    public function __construct($arg) {
        $this->arg = $arg;
    }

    public function run() {
        if ($this->arg) {
            printf("Hello %s\n", $this->arg);
        }
    }
}

$thread = new AsyncOperation("World");
if ($thread->start()) {
    $thread->join();
}
The other way would be to run tasks via a queue in a different script. There are some queue servers out there, but it can also be done simply by launching your queue script via shell_exec and php.exe. On Linux you need something like ...script.php > /dev/null 2>/dev/null &, on Windows start /B php ..., so that the caller does not wait for the script to finish.
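As a rough sketch of that idea (worker.php is a placeholder name for your queue script):

// launch a queue worker in the background so this script doesn't block
if (DIRECTORY_SEPARATOR === '\\') {
    pclose(popen('start /B php worker.php', 'r')); // Windows
} else {
    shell_exec('php worker.php > /dev/null 2>/dev/null &'); // Linux
}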