Is it possible to set up Guzzle + Pool over HTTP/2? - php

Guzzle provides a mechanism to send concurrent requests: Pool. I used the example from the docs: http://docs.guzzlephp.org/en/stable/quickstart.html#concurrent-requests. It works quite fine, sends concurrent requests and everything is awesome except one thing: it seems Guzzle ignores HTTP/2 in this case.
I've prepared a simplified script that sends two requests to https://stackoverflow.com, the first one is using Pool, the second one is just a regular Guzzle request. Only the regular request connects via HTTP/2.
<?php
include_once 'vendor/autoload.php';
use GuzzleHttp\Client;
use GuzzleHttp\Pool;
use GuzzleHttp\Psr7\Request;
$client = new Client([
    'version' => 2.0,
    'debug' => true
]);
/************************/
$requests = function () {
    yield new Request('GET', 'https://stackoverflow.com');
};
$pool = new Pool($client, $requests());
$promise = $pool->promise();
$promise->wait();
/************************/
$client->get('https://stackoverflow.com', [
    'version' => 2.0,
    'debug' => true,
]);
Here is the output: https://pastebin.com/k0HaDWt6 (I highlighted the important parts with "!!!!!")
Does anybody know why Guzzle does this and how to make Pool work with HTTP/2?

Found what was wrong: the 'version' option passed to new Client() is not applied when the requests handed to Pool are created with new Request(). Either the protocol version must be set on every request, or the requests must be created with $client->getAsync() (or ->postAsync(), or whichever method applies).
See the corrected code:
...
$client = new Client([
    'debug' => true
]);
$requests = function () {
    // The fifth constructor argument is the protocol version
    yield new Request('GET', 'https://stackoverflow.com', [], null, '2.0');
};
/* OR
$client = new Client([
    'version' => 2.0,
    'debug' => true
]);
$requests = function () use ($client) {
    yield function () use ($client) {
        return $client->getAsync('https://stackoverflow.com');
    };
};
*/
$pool = new Pool($client, $requests());
$promise = $pool->promise();
$promise->wait();
...
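One more thing worth ruling out when debugging this: Guzzle's cURL handlers can only negotiate HTTP/2 if the underlying libcurl was built with HTTP/2 support. A minimal sketch for checking that, using only PHP's curl extension (the helper name curl_supports_http2 is made up for this example and is not part of Guzzle):

```php
<?php
// Hypothetical helper: reports whether the installed libcurl advertises HTTP/2.
function curl_supports_http2(): bool
{
    // Without ext-curl (or with a very old build) the answer is simply "no".
    if (!function_exists('curl_version') || !defined('CURL_VERSION_HTTP2')) {
        return false;
    }
    $info = curl_version();
    // 'features' is a bitmask; test the HTTP/2 bit.
    return ($info['features'] & CURL_VERSION_HTTP2) !== 0;
}

echo curl_supports_http2()
    ? "libcurl reports HTTP/2 support\n"
    : "libcurl does not report HTTP/2 support\n";
```

If this prints the second message, setting the protocol version on the requests will probably not be enough on its own.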

Related

Call to undefined method Goutte\Client::setClient()

I am stuck with this error, even though the client is defined.
My code looks like this:
use Goutte\Client;
use Illuminate\Http\Request;
use GuzzleHttp\Client as GuzzleClient;
class WebScrapingController extends Controller
{
public function doWebScraping()
{
$goutteClient = new Client();
$guzzleClient = new GuzzleClient(array(
'timeout' => 60,
'verify' => false
));
$goutteClient->setClient($guzzleClient);
$crawler = $goutteClient->request('GET', 'https://duckduckgo.com/html/?q=Laravel');
$crawler->filter('.result__title .result__a')->each(function ($node) {
dump($node->text());
});
}
}
I think the error comes from this line:
$goutteClient->setClient($guzzleClient);
goutte: "^4.0" guzzle: "7.0" Laravel Framework: "6.20.4"
This answer is about creating an instance of the Goutte client, a simple PHP web scraper.
For version >= 4.0.0
Pass the HTTP client instance directly to the Goutte Client constructor. As far as I can tell, Goutte 4 is built on Symfony's HttpBrowser, so the constructor expects a Symfony HttpClient instance rather than a Guzzle client, and setClient() no longer exists.
use Goutte\Client;
use Symfony\Component\HttpClient\HttpClient;
// 'verify_peer' => false is the Symfony HttpClient counterpart of Guzzle's 'verify' => false
$client = new Client(HttpClient::create(['timeout' => 60]));
For version < 4.0.0 (i.e. from 0.1.0 to 3.3.1)
use Goutte\Client;
use GuzzleHttp\Client as GuzzleClient;
$goutteClient = new Client();
$guzzleClient = new GuzzleClient(['timeout' => 60]);
$goutteClient->setClient($guzzleClient);

Google PHP Api Client - I keep getting Error 401: UNAUTHENTICATED

I've been struggling with this for hours now, if not days and can't seem to fix it.
My Requests to Cloud Functions are being denied with error code: 401: UNAUTHENTICATED.
My code is as follows:
putenv('GOOGLE_APPLICATION_CREDENTIALS=' . FIREBASE_SERIVCE_PATH);
$client = new Google_Client();
$client->useApplicationDefaultCredentials();
$client->addScope(Google_Service_CloudFunctions::CLOUD_PLATFORM);
$httpClient = $client->authorize();
$promise = $httpClient->requestAsync("POST", "<MyCloudFunctionExecutionUri>", ['json' => ['data' => []]]);
$promise->then(
function (ResponseInterface $res) {
echo "<pre>";
print_r($res->getStatusCode());
echo "</pre>";
},
function (RequestException $e) {
echo $e->getMessage() . "\n";
echo $e->getRequest()->getMethod();
}
);
$promise->wait();
I'm currently executing this from localhost as I'm still in the development phase.
My FIREBASE_SERIVCE_PATH constant points to my service account JSON file.
My Cloud Function index.js:
const functions = require('firebase-functions');
const admin = require('firebase-admin');
admin.initializeApp();
// CORS Express middleware to enable CORS Requests.
const cors = require('cors')({
origin: true,
});
exports.testFunction = functions.https.onCall((data, context) => {
return new Promise((resolve, reject) => {
resolve("Ok:)");
});
});
// [END all]
My Cloud Function Logs:
Function execution took 459 ms, finished with status code: 401
What am I doing wrong to get UNAUTHENTICATED?
PS: My testFunction works perfectly when invoked from my Flutter mobile app, which uses https://pub.dartlang.org/packages/cloud_functions
Update:
I have followed this guide: https://developers.google.com/api-client-library/php/auth/service-accounts but the "Delegating domain-wide authority to the service account" section only applies if my application runs in a Google Apps domain. I won't be using a Google Apps domain, and besides, I'm on localhost.
First of all, thanks to Doug Stevenson for his answer! It helped me get a working solution for callable functions (functions.https.onCall).
The main idea is that such functions expect the auth context of a Firebase user who is already logged in. That is not a service account; it is a user record in the Authentication section of your Firebase project. So, first, we have to sign in a user, get the ID token from the response, and then use this token in the request that calls the callable function.
Below is my working snippet (taken from a Drupal 8 project, actually).
use Exception;
use Google_Client;
use Google_Service_CloudFunctions;
use GuzzleHttp\Psr7;
use GuzzleHttp\Psr7\Request;
use GuzzleHttp\Promise;
use GuzzleHttp\RequestOptions;
$client = new Google_Client();
$config_path = <PATH TO SERVICE ACCOUNT JSON FILE>;
$json = file_get_contents($config_path);
$config = json_decode($json, TRUE);
$project_id = $config['project_id'];
$options = [RequestOptions::SYNCHRONOUS => TRUE];
$client->setAuthConfig($config_path);
$client->addScope(Google_Service_CloudFunctions::CLOUD_PLATFORM);
$httpClient = $client->authorize();
$handler = $httpClient->getConfig('handler');
/** @var \Psr\Http\Message\ResponseInterface $res */
$res = $httpClient->request('POST', "https://www.googleapis.com/identitytoolkit/v3/relyingparty/verifyPassword?key=<YOUR FIREBASE PROJECT API KEY>", [
'json' => [
'email' => <FIREBASE USER EMAIL>,
'password' => <FIREBASE USER PASSWORD>,
'returnSecureToken' => TRUE,
],
]);
$json = $res->getBody()->getContents();
$data = json_decode($json);
$id_token = $data->idToken;
$request = new Request('POST', "https://us-central1-$project_id.cloudfunctions.net/<YOUR CLOUD FUNCTION NAME>", [
'Content-Type' => 'application/json',
'Authorization' => "Bearer $id_token",
], Psr7\stream_for(json_encode([
'data' => [],
])));
try {
$promise = Promise\promise_for($handler($request, $options));
}
catch (Exception $e) {
$promise = Promise\rejection_for($e);
}
try {
/** @var \Psr\Http\Message\ResponseInterface $res */
$res = $promise->wait();
$json = $res->getBody()->getContents();
$data = json_decode($json);
...
}
catch (Exception $e) {
}
Callable functions impose a protocol on top of regular HTTP functions. Normally you invoke them using the Firebase client SDK. Since you don't have an SDK to work with that implements the protocol, you'll have to follow it yourself. You can't just invoke them like a normal HTTP function.
If you don't want to implement the protocol, you should instead use a regular HTTP function, and stop using the client SDK in your mobile app.
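As a rough sketch of what "following the protocol yourself" involves, based on my reading of the Firebase callable-function protocol: the request is a POST with Content-Type application/json, the arguments sit under a "data" key, an optional "Authorization: Bearer <ID token>" header carries the user, and the response JSON contains either a "result" or an "error" field. The helper names below are hypothetical and no HTTP request is made; the snippet only builds and parses the JSON.

```php
<?php
// Hypothetical helper: wrap the arguments the way a callable function expects them.
function build_callable_body(array $args): string
{
    return json_encode(['data' => $args]);
}

// Hypothetical helper: unwrap a callable-function response body.
function parse_callable_response(string $json)
{
    $decoded = json_decode($json, true);
    if (!is_array($decoded)) {
        throw new RuntimeException('Response is not a JSON object');
    }
    if (array_key_exists('error', $decoded)) {
        // Errors are reported as an object; fall back if it has no message.
        $message = $decoded['error']['message'] ?? 'unknown error';
        throw new RuntimeException('Callable function failed: ' . $message);
    }
    if (!array_key_exists('result', $decoded)) {
        throw new RuntimeException('Response has neither "result" nor "error"');
    }
    return $decoded['result'];
}

$body = build_callable_body(['name' => 'test']);
$result = parse_callable_response('{"result": "Ok:)"}');
```

The string in $body would be sent as the POST body, and $result holds whatever the function returned.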

Guzzle does not send a post request

I'm using PHP with Guzzle.
I have this code:
$client = new Client();
$request = new \GuzzleHttp\Psr7\Request('POST', 'http://localhost/async-post/tester.php',[
'headers' => ['Content-Type' => 'application/x-www-form-urlencoded'],
'form_params' => [
'action' => 'TestFunction'
],
]);
$promise = $client->sendAsync($request)->then(function ($response) {
echo 'I completed! ' . $response->getBody();
});
$promise->wait();
For some reason Guzzle doesn't send the POST parameters.
Any suggestion?
Thanks :)
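I see two things.
The parameters have to be passed as a string. Since the Content-Type is application/x-www-form-urlencoded, build that string with http_build_query() (json_encode() would only fit a JSON content type).
You were also passing them in the third argument of Request, which is the headers array, so they ended up in the headers, not the body. 'form_params' is a request option for the client (for example the second argument of sendAsync()), not a Request constructor argument.
Then I added callbacks that handle the response as a ResponseInterface:
$client = new Client();
$request = new Request('POST', 'https://google.com', ['Content-Type' => 'application/x-www-form-urlencoded'], http_build_query(['s' => 'abc']));
/** @var Promise\PromiseInterface $response */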
$response = $client->sendAsync($request);
$response->then(
function (ResponseInterface $res) {
echo $res->getStatusCode() . "\n";
},
function (RequestException $e) {
echo $e->getMessage() . "\n";
echo $e->getRequest()->getMethod();
}
);
$response->wait();
In this test Google responds with:
Client error: POST https://google.com resulted in a 405 Method Not Allowed
But that is fine. Google doesn't accept requests like this.
Guzzle isn't truly asynchronous in the sense of running in the background. It is closer to concurrent I/O: the transfers are driven from within your own PHP process, which is why you need the wait() line to keep the script running until all pending requests have finished. If you remove the wait() line, the script typically ends before the transfer is driven to completion, and your request is never sent.
Ergo, you need Guzzle (and cURL) for concurrent I/O, not for background asynchronous I/O. In your case you are processing a single request, so Guzzle promises are simply overkill.
To send a request with Guzzle, simply do this:
use GuzzleHttp\Client;
use GuzzleHttp\Psr7\Request;
$client = new Client();
$header = ['Content-Type' => 'application/x-www-form-urlencoded'];
$body = http_build_query(['id' => '2', 'name' => 'dan']); // form-encoded, to match the Content-Type header
$request = new Request('POST', 'http://localhost/async-post/tester.php', $header, $body);
$response = $client->send($request);
Also, it seems you are using the form action attribute rather than the actual form data in form-params.
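To illustrate why the body encoding has to match the Content-Type header, here is a small sketch using only built-in PHP functions. parse_str() stands in for what a server does when it fills $_POST from a form-encoded body; the parameter values are just examples.

```php
<?php
$params = ['action' => 'TestFunction', 'id' => '2'];

// What a form-encoded body should look like.
$formBody = http_build_query($params);

// What you get if you JSON-encode the same parameters instead.
$jsonBody = json_encode($params);

// Decode both the way a form-encoded body is decoded on the receiving side.
parse_str($formBody, $fromForm);
parse_str($jsonBody, $fromJson);

var_dump($fromForm === $params); // the form-encoded body round-trips
var_dump($fromJson === $params); // the JSON body does not
```

With a JSON body the receiving script would have to read and decode the raw request body itself, and the Content-Type should then be application/json.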
I'm posting this answer because I tried to achieve something really asynchronous with php - Schedule I/O processing as a background task, continue processing script and serve the page; I/O continues in background and completes without disrupting the client. Laravel Queues was the best thing I could find.

How do you log all API calls using Guzzle 6

I'm trying to use guzzle 6 which works fine but I'm lost when it comes to how to log all the api calls. I would like to simply log timing, logged in user from session, url and any other usual pertinent info that has to do with the API call. I can't seem to find any documentation for Guzzle 6 that refers to this, only guzzle 3 (Where they've changed the logging addSubscriber call). This is how my current API calls are:
$client = new GuzzleHttp\Client(['defaults' => ['verify' => false]]);
$res = $client->get($this->url . '/api/details', ['form_params' => ['file' => $file_id]]);
You can use any logger that implements the PSR-3 interface with Guzzle 6.
In the example below I used Monolog as the logger, together with Guzzle's built-in log middleware and MessageFormatter.
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Middleware;
use GuzzleHttp\MessageFormatter;
use Monolog\Logger;
$stack = HandlerStack::create();
$stack->push(
Middleware::log(
new Logger('Logger'),
new MessageFormatter('{req_body} - {res_body}')
)
);
$client = new \GuzzleHttp\Client(
[
'base_uri' => 'http://httpbin.org',
'handler' => $stack,
]
);
echo (string) $client->get('ip')->getBody();
The details of the log middleware and the message formatter are not well documented yet, but you can check the list of variables you can use in MessageFormatter.
Also there is a guzzle-logmiddleware which allows you to customize formatter etc.
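To give an idea of how such a template string turns into a log line, here is a simplified stand-alone sketch of placeholder substitution. It is not Guzzle's implementation; the function name render_template and the sample values are made up, and the real MessageFormatter fills the placeholders from the request and response objects.

```php
<?php
// Illustrative only: replace {name} placeholders with values from an array.
function render_template(string $template, array $values): string
{
    return preg_replace_callback(
        '/\{\s*([A-Za-z_\-\.0-9]+)\s*\}/',
        function (array $match) use ($values): string {
            // Unknown placeholders are rendered as an empty string in this sketch.
            return array_key_exists($match[1], $values)
                ? (string) $values[$match[1]]
                : '';
        },
        $template
    );
}

$line = render_template('{method} {uri} -> {code}', [
    'method' => 'GET',
    'uri' => 'http://httpbin.org/ip',
    'code' => 200,
]);
echo $line, "\n";
```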
@KingKongFrog This is the way to specify the name of the log file:
$logger = new Logger('MyLog');
// The log level is an argument of StreamHandler, not of pushHandler()
$logger->pushHandler(new StreamHandler(__DIR__ . '/test.log', Logger::DEBUG));
$stack->push(Middleware::log(
$logger,
new MessageFormatter('{req_body} - {res_body}')
));
For Guzzle 7 I did this:
require './guzzle_7.2.0.0/vendor/autoload.php';
require './monolog/vendor/autoload.php';
use GuzzleHttp\Client;
use GuzzleHttp\Exception\RequestException;
use GuzzleHttp\Pool;
use GuzzleHttp\Psr7\Request;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Middleware;
use GuzzleHttp\MessageFormatter;
use Monolog\Logger;
use Monolog\Handler\StreamHandler;
use GuzzleHttp\TransferStats;
//$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
$logger = null;
$messageFormat =
//['REQUEST: ', 'METHOD: {method}', 'URL: {uri}', 'HTTP/{version}', 'HEADERS: {req_headers}', 'Payload: {req_body}', 'RESPONSE: ', 'STATUS: {code}', 'BODY: {res_body}'];
// Only {placeholders} are substituted; a function call such as urldecode() inside the template stays literal text
'REQUEST: {req_body}';
$handlerStack = \GuzzleHttp\HandlerStack::create();
$handlerStack->push(createGuzzleLoggingMiddleware($messageFormat));
function getLogger() {
global $logger;
if ($logger==null) {
$logger = new Logger('api-consumer');
$logger->pushHandler(new \Monolog\Handler\RotatingFileHandler('./TataAigHealthErrorMiddlewarelog.txt'));
}
var_dump($logger);
return $logger;
}
function createGuzzleLoggingMiddleware(string $messageFormat){
return \GuzzleHttp\Middleware::log(getLogger(), new \GuzzleHttp\MessageFormatter($messageFormat));
}
function createLoggingHandlerStack(array $messageFormats){
global $logger;
$stack = \GuzzleHttp\HandlerStack::create();
var_dump($logger);
collect($messageFormats)->each(function ($messageFormat) use ($stack) {
// We'll use unshift instead of push, to add the middleware to the bottom of the stack, not the top
$stack->unshift(createGuzzleLoggingMiddleware($messageFormat) );
});
return $stack;
}
//$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
$client = new Client(['verify' => false, 'handler' => $handlerStack]);
WOW !!
unshift() is indeed better than push() in reverse order ...
$handlers = HandlerStack::create();
$logger = new Logger('Logger');
$templates = [
'{code} >> {req_headers}',
'{code} >> {req_body}',
'{code} << {res_headers}',
'{code} << {res_body}'
];
foreach ($templates as $template) {
$handlers->unshift($this->getMiddleware($logger, $template));
}
$client = new Client([
RequestOptions::DEBUG => false,
'handler' => $handlers
]);
Using this function to obtain the Middleware:
private function getMiddleware(Logger $logger, string $template): callable {
return Middleware::log($logger, new MessageFormatter($template));
}
Logger comes from "monolog/monolog": "^1.27.1".
And these are all supported variable substitutions.

How to dynamically add extra requests to a Guzzle Pool?

I'm using Guzzle (http://guzzlephp.org) to GET a large number of urls (~300k) . The urls are retrieved from an Elastic Search instance, and I would like to keep adding urls to a Pool so the Pool stays rather small instead of adding them all at once.
Is this possible? I looked at the Pool.php, but did not find a way to do this. Is there a way?
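Use a while loop and a generator (yield).
use GuzzleHttp\Client;
use GuzzleHttp\Pool;
use GuzzleHttp\Psr7\Request;
$client = new Client();
$uris = ['http://base_url'];
$visited_uris = []; // maybe database instead of array
// The generator and the callback share $uris by reference, so URIs pushed in
// 'fulfilled' are picked up the next time the pool asks for a request.
// Caveat: the generator ends as soon as it finds $uris empty, so keep it seeded.
$requests = function () use (&$uris) {
    while (count($uris) > 0) {
        yield new Request('GET', array_pop($uris));
    }
};
$pool = new Pool($client, $requests(), [
    'concurrency' => 5,
    'fulfilled' => function ($response, $index) use (&$uris, &$visited_uris) {
        $new_uri = get_new_uri(); // implement function to get new $uri
        if (!in_array($new_uri, $visited_uris)) {
            $uris[] = $new_uri;
            $visited_uris[] = $new_uri;
        }
    },
]);
$promise = $pool->promise();
$promise->wait();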
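The underlying idea, a generator that keeps reading from a queue while the consumer keeps adding to it, can be tried without Guzzle at all. A minimal sketch with made-up data: the $discover closure is a hypothetical stand-in for "fetch a page and extract new URLs", and a plain foreach plays the role of the pool.

```php
<?php
$queue = ['a'];          // work still to do
$seen = ['a' => true];   // everything ever queued, to avoid duplicates

// The generator shares $queue by reference, so items added later are still yielded.
$work = function () use (&$queue) {
    while (count($queue) > 0) {
        yield array_shift($queue);
    }
};

// Hypothetical discovery step: short items produce two longer ones.
$discover = function (string $item): array {
    return strlen($item) < 3 ? [$item . 'a', $item . 'b'] : [];
};

$processed = [];
foreach ($work() as $item) {
    $processed[] = $item;
    foreach ($discover($item) as $next) {
        if (!isset($seen[$next])) {
            $seen[$next] = true;
            $queue[] = $next;
        }
    }
}

echo implode(', ', $processed), "\n";
```

Here the consumer always refills the queue before the generator resumes, so it never runs dry early. With a concurrent pool several requests are pulled before any response arrives, which is why the queue should hold enough URLs up front.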
