BigQuery - Synchronous RunQuery - php

I am trying to create a job in Google BigQuery that returns a JobId instantly while the job continues to run, without making the user wait.
From reading the documentation for runQuery, this appears to be possible: maxRetries has been set, along with a very small timeoutMs.
The idea is that the user gets a JobId and an alert notifying them that the job is being processed, and they receive a further notification when it's complete.
Installed via Composer, version: google/cloud: ^0.53.0
Sample code included below.
runQuery
Runs a BigQuery SQL query in a synchronous fashion.
Unless $options.maxRetries is specified, this method will block until the query completes, at which time the result set will be returned.
http://googlecloudplatform.github.io/google-cloud-php/#/docs/google-cloud/v0.53.0/bigquery/bigqueryclient?method=runQuery
$client = new BigQueryClient([
    'projectId' => 'XXXX',
]);

$dataset = $client->dataset('XXXX');
if (!$dataset->exists()) {
    throw new \Exception('Dataset does not exist');
}

$options = [
    'timeoutMs'  => 1000,
    'maxRetries' => 2,
];

$queryJob = $client->queryConfig($sql, $options); // tried options here
$queryResult = $client->runQuery($queryJob, $options); // and here, together and individually

echo $queryResult->job()->id();
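For comparison, the non-blocking route in this library is to start the job rather than run it. Below is a minimal sketch: the polling helper is plain PHP, while the BigQuery-specific calls (startQuery, Job::reload(), Job::isComplete()) are shown only as comments, as assumptions based on the v0.53 docs linked above rather than verified code.

```php
<?php
// Generic "start now, check later" poller; the BigQuery-specific calls are
// sketched in the comments below and are assumptions based on the v0.53 docs.
function waitForJob(callable $isComplete, int $maxAttempts = 10, int $sleepSeconds = 1): bool
{
    for ($attempt = 0; $attempt < $maxAttempts; $attempt++) {
        if ($isComplete()) {
            return true;  // the job finished within the allotted checks
        }
        sleep($sleepSeconds);
    }
    return false; // still running; hand the JobId back and check again later
}

// Assumed usage (method names taken from the v0.53 docs, not verified here):
//   $job = $client->startQuery($client->queryConfig($sql)); // returns immediately
//   echo $job->id();                                        // the JobId for the user
//   $done = waitForJob(function () use ($job) {
//       $job->reload();            // refresh the job metadata
//       return $job->isComplete(); // true once the query has finished
//   });
```

This way the user gets the JobId right away, and a worker (or a later request) can poll for completion and send the follow-up notification.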

Related

How can I query AWS Batch for jobs matching multiple jobStatus values in AWS PHP SDK Version 3?

I am currently working on a PHP project using the AWS PHP SDK. I have a data import process that uses AWS Batch. The PHP application needs to be able to check AWS for jobs that are not complete before letting the user start a new job.
I am currently using the listJobs() call on the BatchClient like so, following an example from the documentation:
<?php
$client = new Aws\Batch\BatchClient([
    ...
]);
$jobs = $client->listJobs([
    'jobQueue'  => '...',
    'jobStatus' => 'RUNNING',
]);
However, I would like to get jobs matching the statuses of SUBMITTED, PENDING, RUNNABLE and STARTING as well as RUNNING.
The docs make it seem like I could submit the following value as a pipe-delimited list, but this syntax caused the request to fail:
<?php
$jobs = $client->listJobs([
    'jobQueue'  => '...',
    'jobStatus' => 'SUBMITTED|PENDING|RUNNABLE|STARTING|RUNNING',
]);
Error:
Error executing request, Exception : Invalid job status SUBMITTED|PENDING|RUNNABLE|STARTING|RUNNING. Valid statuses are [SUBMITTED, PENDING, RUNNABLE, STARTING, RUNNING, SUCCEEDED, FAILED]
Is there some kind of way that I can submit multiple values under the 'jobStatus' input?
If not, is there some other way I can do this utilizing the AWS PHP SDK?
Note:
It looks like there is a 'filters' feature listed under the "Parameter Details" and "Parameter Syntax" sections of the documentation example from before. This seems to suggest that something like this should work:
<?php
$jobs = $client->listJobs([
    'jobQueue' => '...',
    'filters' => [
        [
            'name'   => 'jobStatus',
            'values' => ['SUBMITTED', 'PENDING', 'RUNNABLE', 'STARTING', 'RUNNING'],
        ],
    ],
]);
"You can filter the results by job status with the jobStatus parameter. If you don't specify a status, only RUNNING jobs are returned."
"The job status used to filter jobs in the specified queue. If the filters parameter is specified, the jobStatus parameter is ignored and jobs with any status are returned. If you don't specify a status, only RUNNING jobs are returned."
However this seems to return blank result sets.
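Since jobStatus only accepts a single value per call, one workaround is to issue one listJobs call per status and merge the summaries client-side. A minimal sketch, with the client call injected as a callable (a stand-in for [$client, 'listJobs']) so the merging logic stands on its own; pagination via nextToken is left out for brevity:

```php
<?php
// One listJobs call per status, merged client-side. $listJobs stands in for
// [$client, 'listJobs']; each call is assumed to return an array-accessible
// result carrying a 'jobSummaryList' key, as the AWS PHP SDK does.
function listJobsForStatuses(callable $listJobs, string $jobQueue, array $statuses): array
{
    $all = [];
    foreach ($statuses as $status) {
        $result = $listJobs([
            'jobQueue'  => $jobQueue,
            'jobStatus' => $status,
        ]);
        $all = array_merge($all, $result['jobSummaryList'] ?? []);
    }
    return $all;
}

// Assumed usage with the real client:
//   $jobs = listJobsForStatuses(
//       [$client, 'listJobs'],
//       'my-queue',
//       ['SUBMITTED', 'PENDING', 'RUNNABLE', 'STARTING', 'RUNNING']
//   );
```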

Guzzle 6.3 Async Pool: Updating the list of Pool requests from inside a fulfilled promise callback?

I'm working on a process where I have a queue, and I start with a known unit of work. As I process that unit of work, it will result in zero or more (unknown) units of work that get added to the queue. I continue to process the queue until there's no more work to perform.
I'm working on a proof-of-concept using Guzzle where I accept a first URL to seed the queue, then process the body of the response which may result in more URLs that need to be processed. My goal is to add them to the queue and have Guzzle continue processing them until there's nothing left in the queue.
In other cases, I can define a variable as the queue, and pass it by-reference into a function so that it gets updated with new work. But in the case of Guzzle Async Pools (which I think is the most efficient way to handle this), there doesn't seem to be a clear way to update the queue in-process and have the Pool execute the requests.
Does Guzzle provide a built-in approach for updating the list of Pool requests from inside a fulfilled Promise callback?
use ArrayIterator;
use GuzzleHttp\Promise\EachPromise;
use GuzzleHttp\TransferStats;
use Psr\Http\Message\ResponseInterface;
// Reusable callback which prints the URL being requested
function onStats(TransferStats $stats) {
    echo sprintf(
        '%s (%s)' . PHP_EOL,
        $stats->getEffectiveUri(),
        $stats->getTransferTime()
    );
}

// The queue of work to be performed
$requests = new ArrayIterator([
    $client->getAsync('http://httpbin.org/anything', [
        'on_stats' => 'onStats',
    ])
]);

// Process the queue, which results in more work to be performed
$p = (new EachPromise($requests, [
    'concurrency' => 50,
    'fulfilled' => function (ResponseInterface $response) use ($client, &$requests) {
        $hash = bin2hex(random_bytes(10));
        $requests[] = $client->getAsync(sprintf('http://httpbin.org/anything/%s', $hash), [
            'on_stats' => 'onStats',
        ]);
    },
    'rejected' => function ($reason) {
        echo $reason . PHP_EOL;
    },
]))->promise();

// Wait for everything to finish
$p->wait(true);
My question appears to be similar to Incrementally add requests to a Guzzle 5.0 Pool (Rolling Requests), but is different in that these refer to different major versions of Guzzle.
After posting this, I was able to do more searching and found some more SO threads and GitHub Issues for Guzzle. I found this library, which appears to address the problem.
https://github.com/alexeyshockov/guzzle-dynamic-pool
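For reference, the core trick such a library relies on can be shown in plain PHP: feed EachPromise a queue-backed generator instead of a fixed iterator, and let the fulfilled callback push new work onto the queue by reference. This is only a sketch of the iteration pattern (no Guzzle calls), and it carries the caveat from the question: if the queue is momentarily empty while requests are still in flight, the generator ends early.

```php
<?php
// A queue-backed generator: keeps yielding as long as the queue is non-empty.
// When consumed by EachPromise, a fulfilled callback that appends to $queue
// (captured by reference) makes the new work visible on the next advance.
function queueIterator(array &$queue): \Generator
{
    while (!empty($queue)) {
        yield array_shift($queue);
    }
}

// Simulated consumption: handling item $n enqueues $n + 1 until we reach 3,
// mimicking a fulfilled callback that discovers more URLs to fetch.
$queue = [1];
$processed = [];
foreach (queueIterator($queue) as $n) {
    $processed[] = $n;
    if ($n < 3) {
        $queue[] = $n + 1;
    }
}
// $processed is now [1, 2, 3]: work added mid-iteration was picked up.
```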

Laravel 5.0 delayed execution

I need to delay execution of one method in Laravel 5.0, or, to be more specific, I need it to be executed at a specific given time. The method sends a notification through GCM to a mobile app, and I need to do this repeatedly, each time at a different time. As far as I have found out, there is no way to intentionally delay a notification in GCM. I know the basics of working with cron and scheduling in Laravel, but I can't find an answer to my problem.
The method I need to execute with delay is this:
public function pushAndroid($receiver, $message, $data)
{
    $pushManager = new PushManager(PushManager::ENVIRONMENT_DEV);
    $gcmAdapter = new GcmAdapter(array(
        'apiKey' => self::GCM_API_KEY
    ));
    $androidDevicesArray = array(new Device($receiver));
    $devices = new DeviceCollection($androidDevicesArray);
    $msg = new GcmMessage($message, $data);
    $push = new Push($gcmAdapter, $devices, $msg);
    $pushManager->add($push);
    $pushManager->push();
}
The information about when (date + time) it should be executed is stored in a database table. And every notification needs to be sent only once, not repeatedly.
If you take a look at https://laravel.com/docs/5.6/scheduling you can set up something that fits your needs.
Make something along the lines of:
$schedule->call(function () {
    // Here you get the collection for the current date and time
    $notifications = YourModel::whereDate('datecolumn', \Carbon\Carbon::now());
    ...
})->everyMinute();
You can also use Queues with Delayed Dispatching if that makes more sense, since you hinted you only need to do it once.
ProcessJobClassName::dispatch($podcast)->delay(now()->addMinutes(10));
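To combine delayed dispatching with the send time stored in the database, you can compute the remaining delay in plain PHP. A minimal sketch; the job class and column names below are made up for illustration:

```php
<?php
// Seconds from $now until the stored send time, never negative (a time
// already in the past dispatches immediately).
function delaySeconds(string $sendAt, ?\DateTimeInterface $now = null): int
{
    $now = $now ?? new \DateTimeImmutable('now');
    $target = new \DateTimeImmutable($sendAt);
    return max(0, $target->getTimestamp() - $now->getTimestamp());
}

// Assumed usage with the queue approach above (hypothetical job/column names):
//   SendPushNotification::dispatch($notification)
//       ->delay(now()->addSeconds(delaySeconds($notification->send_at)));
```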

Breaking out of a Gearman loop

I have a php application that gets requests for part numbers from our server. At that moment, we reach out to a third party API to gather pricing information to make sure we have the latest pricing for that particular request. Sometimes the third party API is slow or it might be down, so we have a database that stores the latest pricing requests for each particular part number that we can use as a fallback. I'd like to run the request to the third party API and the database in parallel using Gearman. Here is the idea:
Receive request
Through gearman, create two jobs:
Request to third party API
MySQL database lookup
Wait in a loop and return the results based on the following conditions:
If the third-party API has completed, return that result immediately
If an elapsed time has passed (e.g. 2 seconds) and the third-party API hasn't responded, return the MySQL lookup data
Using Gearman, my thought was to either run the two tasks in the foreground and break out of runTasks() within the setCompleteCallback() call, or to run them in the background, check in on the two tasks within a separate loop, and check on the tasks using jobStatus().
Unfortunately, I can't get either route to work while still getting access to the resulting data. Is there a better way, or are there existing examples of how someone has made this work?
I think you've described a single blocking problem, namely the result of a third-party API lookup. There are two ways you can handle this, from my point of view: either you abort the attempt altogether once you decide you've run out of time, or you report back to the client that you ran out of time but continue with the lookup anyway, just to update your local cache in case the API happens to respond slower than you would like. I'll describe how I would go about the former, because that is easier.
From the client side:
$request = array(
    'productId' => 5,
);

$client = new GearmanClient();
$client->addServer('127.0.0.1', 4730);
$results = json_decode($client->doNormal('apiPriceLookup', json_encode($request)));

if ($results && property_exists($results, 'success') && $results->success) {
    // Use the fresh data from the API
} else {
    // Fall back to the local data
}
This will create a job on the job server with a function name of 'apiPriceLookup' and pass it the workload data containing a product id of 5. It will wait for the results to come back, and check for a success property. If it exists and is true, then the api lookup was successful.
The idea is to set the timeout condition then in the worker task, which completely depends on how you're implementing the API lookup. If you're using cURL (or some wrapper around cURL), you can see the answer to how to detect a timeout here.
From the worker side:
$worker = new GearmanWorker();
$worker->addServer();
$worker->addFunction("apiPriceLookup", "apiPriceLookup");
while ($worker->work());

function apiPriceLookup($job) {
    $payload = json_decode($job->workload());
    try {
        $results = [
            'data'    => PerformApiLookupForProductId($payload->productId),
            'success' => true,
        ];
    } catch (Exception $e) {
        $results = ['success' => false];
    }
    return json_encode($results);
}
This just creates a GearmanWorker object and subscribes it to the function apiPriceLookup. The worker will call apiPriceLookup whenever a client submits a task to the job server. That function calls out to another function, PerformApiLookupForProductId, which should be written so as to throw an exception whenever a timeout condition occurs.
I don't think this would be considered using exceptions to control logic flow; timeout conditions generally are (or should be) exceptional events. For instance, Guzzle throws a GuzzleHttp\Exception\RequestException when it decides to time out.
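The background route from the question (run both jobs with doBackground(), poll with jobStatus(), fall back after two seconds) boils down to a deadline loop. A sketch in plain PHP, with the Gearman calls injected as callables since the exact wiring depends on your setup:

```php
<?php
// Poll $check until it returns a non-null result or the deadline passes,
// then return $fallback(). $check would wrap GearmanClient::jobStatus() plus
// a read of wherever the background worker stores its result; $fallback
// would be the MySQL price lookup.
function resultWithDeadline(callable $check, callable $fallback, float $timeoutSec, int $pollMicros = 100000)
{
    $deadline = microtime(true) + $timeoutSec;
    while (microtime(true) < $deadline) {
        $result = $check();
        if ($result !== null) {
            return $result;
        }
        usleep($pollMicros);
    }
    return $fallback();
}

// Assumed usage:
//   $price = resultWithDeadline($checkApiJob, $mysqlLookup, 2.0);
```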

GAE: Push Task Queues - How to push tasks to specific Queues? - using GAE-PHP

Hi, I am new to GAE task queues. I created a queue named anchorextractor, and it shows up in the queues list.
Then I created a task with the URL ('/worker/extractor/1'). After creating it, if I echo the name of the task, it shows a name (task3). But when I checked the queues list on the Task Queue page, the number of tasks under this queue is 0, even though 3 tasks were created. I have tried all the possibilities I can think of. Can anyone help me? (I am updating the question with code for reference; the following is the code):
require_once 'google/appengine/api/taskqueue/PushTask.php';
use google\appengine\api\taskqueue\PushTask;
require_once 'google/appengine/api/taskqueue/PushQueue.php';
use google\appengine\api\taskqueue\PushQueue;

$queue = new PushQueue('tagextractor');
$task = new PushTask('/worker/anchorextractor/1', ['content_id' => 'aa', 'content_type' => 'aa']);
echo "Task Name = " . $task_name = $task->add();
$queue->addTasks([$task]);
Try this syntax instead; it will log the new task's name to the App Engine logs as proof that the task was created:
require_once 'google/appengine/api/taskqueue/PushTask.php';
use \google\appengine\api\taskqueue\PushTask;

$task_name = (new PushTask('/worker/anchorextractor/1', array(
    'content_id'   => 'aa',
    'content_type' => 'aa'
)))->add("tagextractor");
syslog(LOG_INFO, "new task=" . $task_name);
Tasks get processed very quickly, so it is sometimes difficult to "see" them in the queue. You can, however, go to the queue in the admin console and pause it; the tasks will then build up until you either run them manually or resume the queue.
