I am stuck with this error...
even though the client is defined.
My code looks like this:
use Goutte\Client;
use Illuminate\Http\Request;
use GuzzleHttp\Client as GuzzleClient;
class WebScrapingController extends Controller
{
public function doWebScraping()
{
$goutteClient = new Client();
$guzzleClient = new GuzzleClient(array(
'timeout' => 60,
'verify' => false
));
$goutteClient->setClient($guzzleClient);
$crawler = $goutteClient->request('GET', 'https://duckduckgo.com/html/?q=Laravel');
$crawler->filter('.result__title .result__a')->each(function ($node) {
dump($node->text());
});
}
}
I think the error comes from this line:
$goutteClient->setClient($guzzleClient);
Versions: goutte "^4.0", guzzle "7.0", Laravel Framework "6.20.4"
This answer covers how to create an instance of the Goutte client (a simple PHP web scraper).
For version >= 4.0.0
Goutte 4 is built on Symfony's HttpBrowser, so setClient() is no longer available. Pass a Symfony HttpClient instance directly to the Goutte Client constructor; a Guzzle client is not accepted there.
use Goutte\Client;
use Symfony\Component\HttpClient\HttpClient;
$client = new Client(HttpClient::create(['timeout' => 60]));
// To skip TLS certificate checks (what 'verify' => false does in Guzzle),
// Symfony HttpClient uses the 'verify_peer' and 'verify_host' options instead.
For version < 4.0.0 (i.e. from 0.1.0 to 3.3.1)
use Goutte\Client;
use GuzzleHttp\Client as GuzzleClient;
$goutteClient = new Client();
$guzzleClient = new GuzzleClient(['timeout' => 60]);
$goutteClient->setClient($guzzleClient);
I'm trying to upgrade my website's code from Slim v2 to v4. I'm not a hardcore programmer so I'm facing issues. In Slim v2 I had some middleware where I was able to assign parameters to the Twig view before the route code executed. Now I'm trying to manage the same with Slim v4 but without success.
So this is a test code:
use Psr\Http\Message\ResponseInterface as Response;
use Psr\Http\Message\ServerRequestInterface as Request;
use Psr\Http\Server\RequestHandlerInterface as RequestHandler;
use Slim\Factory\AppFactory;
use Slim\Views\Twig;
use Slim\Routing\RouteContext;
require 'vendor/autoload.php';
require 'config.php';
lib\Cookie::init();
$container = new \DI\Container();
$container->set('view', function($container) {
return Twig::create(__DIR__ . '/views');
});
$container->set('flash', function ($container) {
return new \Slim\Flash\Messages();
});
$container->get('view')->getEnvironment()->addGlobal('flash', $container->get('flash'));
AppFactory::setContainer($container);
$app = AppFactory::create();
$app->addErrorMiddleware(true, false, false);
$fb = new Facebook\Facebook([
'app_id' => '...',
'app_secret' => '...',
'default_graph_version' => '...',
]);
$beforeMiddleware = function (Request $request, RequestHandler $handler) use ($fb) {
$response = $handler->handle($request);
if (!isset($_SESSION['fbuser'])) {
$helper = $fb->getRedirectLoginHelper();
$permissions = ['email'];
$loginUrl = $helper->getLoginUrl('...', $permissions);
$this->get('view')->offsetSet('fbloginurl', $loginUrl);
}
else {
$this->get('view')->offsetSet('fbuser', $_SESSION['fbuser']);
}
$uri = $request->getUri();
$this->get('view')->offsetSet('currenturl', $uri);
return $response;
};
$app->add($beforeMiddleware);
$app->get('/test', function (Request $request, Response $response, $args) {
$oViewParams = new \lib\ViewParams("home", "", "", "", "");
$oProfession = new \models\Profession();
$oBlogPost = new models\BlogPost();
$oBlogTopic = new models\BlogTopic();
$professions = $oProfession->getProfessionsWithLimit(14);
$posts = $oBlogPost->getMainPagePosts();
echo $this->get('view')->offsetGet('fbloginurl');
$params = array('professions' => $professions,
'posts' => $posts,
'viewp' => $oViewParams->getMassParams());
return $this->get('view')->render($response, 'index.html', $params);
});
$app->run();
When I use echo $this->get('view')->offsetGet('fbloginurl'); within the middleware, the value shows up. When I use the same statement within the route, nothing shows up...
The next code in the middleware chain (or your route) runs at the point where you call:
$response = $handler->handle($request);
Because this call comes before you set any of the values you want to use in Twig, they are not yet set when the route executes. Move that line to after the code that sets these values, and they will be available to the rest of the code.
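To see why the position of the handle() call matters, here is a minimal plain-PHP sketch without Slim. Everything in it is a stand-in made up for illustration: $view plays the role of the shared Twig view, $route plays the role of the route that handle() eventually invokes, and the login URL is a placeholder.

```php
<?php
// Plain-PHP sketch (no Slim): $view stands in for the shared Twig view,
// and $route stands in for the route that $handler->handle() would invoke.
$view = new ArrayObject();

// The "route": reads the value the middleware is supposed to provide.
$route = function () use ($view): string {
    return $view['fbloginurl'] ?? '(not set)';
};

// Middleware that calls the next handler first and sets the value afterwards (too late).
$setAfter = function (callable $next) use ($view): string {
    $response = $next();
    $view['fbloginurl'] = 'https://example.test/login'; // placeholder URL
    return $response;
};

// Middleware that sets the value first and only then calls the next handler.
$setBefore = function (callable $next) use ($view): string {
    $view['fbloginurl'] = 'https://example.test/login'; // placeholder URL
    return $next();
};

$late = $setAfter($route);

$view->exchangeArray([]); // reset the stand-in view between the two runs
$early = $setBefore($route);

echo $late, PHP_EOL;  // (not set)
echo $early, PHP_EOL; // https://example.test/login
```

The same reasoning applies to the Slim middleware: anything the route needs has to be stored before handle() is called, and anything that should modify the finished response goes after it.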
Guzzle provides a mechanism to send concurrent requests: Pool. I used the example from the docs: http://docs.guzzlephp.org/en/stable/quickstart.html#concurrent-requests. It works quite fine, sends concurrent requests and everything is awesome except one thing: it seems Guzzle ignores HTTP/2 in this case.
I've prepared a simplified script that sends two requests to https://stackoverflow.com, the first one is using Pool, the second one is just a regular Guzzle request. Only the regular request connects via HTTP/2.
<?php
include_once 'vendor/autoload.php';
use GuzzleHttp\Client;
use GuzzleHttp\Pool;
use GuzzleHttp\Psr7\Request;
$client = new Client([
'version' => 2.0,
'debug' => true
]);
/************************/
$requests = function () {
yield new Request('GET', 'https://stackoverflow.com');
};
$pool = new Pool($client, $requests());
$promise = $pool->promise();
$promise->wait();
/************************/
$client->get('https://stackoverflow.com', [
'version' => 2.0,
'debug' => true,
]);
Here is an output: https://pastebin.com/k0HaDWt6 (I highlighted important parts with "!!!!!")
Does anybody know why Guzzle does this and how to make Pool work with HTTP/2?
Found what was wrong: the 'version' option passed to new Client() is not applied when the requests given to Pool are created with new Request(). Either the protocol version must be set on every Request, or the requests must be created with $client->getAsync() (or ->postAsync(), etc.), which does pick up the client's options.
See the corrected code:
...
$client = new Client([
'debug' => true
]);
$requests = function () {
yield new Request('GET', 'https://stackoverflow.com', [], null, '2.0');
};
/* OR
$client = new Client([
'version' => 2.0,
'debug' => true
]);
$requests = function () use ($client) {
yield function () use ($client) {
return $client->getAsync('https://stackoverflow.com');
};
};
*/
$pool = new Pool($client, $requests());
$promise = $pool->promise();
$promise->wait();
...
I'm using Slim and bshaffer's OAuth2.0 library to build out an API. Currently I'm trying to extend the Pdo.php class included with the OAuth library to override a few functions. However, when I try to run my code, I'm getting an error saying:
PHP Fatal error: Interface 'OAuth2\Storage\AuthorizationCodeInterface' not found /vendor/bshaffer/oauth2-server-php/src/OAuth2/Storage/Pdo.php on line 21
Since that is the base class, and I don't want to modify it, I'm not sure how to address this issue.
Also, using the base Pdo class directly, like so, works fine; it is able to find the interfaces:
$storage = new OAuth2\Storage\Pdo(array('dsn' => $dsn, 'username' => $user, 'password' => $pw));
Here is my index.php file
<?php
use League\OAuth2\Server\Storage\SessionInterface;
require '/***/root/vendor/Slim/Slim.php';
require_once('/***/root/vendor/bshaffer/oauth2-server-php/src/OAuth2/Autoloader.php');
require_once('/***/root/vendor/bshaffer/oauth2-server-php/src/OAuth2/Storage/Pdo.php');
require_once('/***/root/custom_pdo.php');
OAuth2\Autoloader::register();
\Slim\Slim::registerAutoloader();
$app = new \Slim\Slim();
$app->post(
'/token',
function () {
$dsn = 'mysql:host='.MYSQL_HOST.';dbname=DB';
$user = MYSQL_USER;
$pw = MYSQL_PASS;
$storage = new custom_pdo(array('dsn' => $dsn, 'username' => $user, 'password' => $pw));
$server = new OAuth2\Server($storage);
$server->addGrantType(new OAuth2\GrantType\ClientCredentials($storage));
$response = $server->handleTokenRequest(OAuth2\Request::createFromGlobals())->send();
print_r($response);
}
);
$app->run();
And here is my custom_pdo.php class
<?php
namespace OAuth2\Storage;
use OAuth2\OpenID\Storage\UserClaimsInterface;
use OAuth2\OpenID\Storage\AuthorizationCodeInterface as OpenIDAuthorizationCodeInterface;
require_once('/***/root/vendor/bshaffer/oauth2-server-php/src/OAuth2/Autoloader.php');
require_once('/***/root/vendor/bshaffer/oauth2-server-php/src/OAuth2/Storage/AuthorizationCodeInterface.php');
class custom_pdo extends Pdo {
}
Bshaffer Pdo class: https://github.com/bshaffer/oauth2-server-php/blob/develop/src/OAuth2/Storage/Pdo.php
It looks like it was a namespace issue. I wasn't instantiating my class with its fully qualified (namespaced) name, and that was causing the various issues I was running into. Here is the final code:
index.php
<?php
use League\OAuth2\Server\Storage\SessionInterface;
require '/***/root/vendor/autoload.php';
require "/***/root/custom_pdo.php";
$app = new \Slim\Slim();
$app->post(
'/token',
function () {
$dsn = 'mysql:host='.MYSQL_HOST.';dbname=DB';
$user = MYSQL_USER;
$pw = MYSQL_PASS;
$storage = new OAuth2\Storage\custom_pdo(array('dsn' => $dsn, 'username' => $user, 'password' => $pw));
$server = new OAuth2\Server($storage);
$server->addGrantType(new OAuth2\GrantType\ClientCredentials($storage));
$response = $server->handleTokenRequest(OAuth2\Request::createFromGlobals())->send();
print_r($response);
}
);
$app->run();
custom_pdo.php
<?php
namespace OAuth2\Storage;
use OAuth2\OpenID\Storage\UserClaimsInterface;
use OAuth2\OpenID\Storage\AuthorizationCodeInterface as OpenIDAuthorizationCodeInterface;
class custom_pdo extends Pdo {
}
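As a rough illustration of the namespace point, here is a self-contained sketch. Demo\Storage and its two empty classes are hypothetical stand-ins, not the real OAuth2 library; the idea is only to show that a class declared inside a namespace does not exist under its bare name.

```php
<?php
// Demo\Storage is a hypothetical stand-in namespace, not the real OAuth2 library.
namespace Demo\Storage;

class Pdo {}
class custom_pdo extends Pdo {}

// Class names given as strings are always fully qualified, so these two checks
// mirror what code in the global namespace (such as index.php) would see.
$bareNameExists  = class_exists('custom_pdo');              // is there a global \custom_pdo?
$qualifiedExists = class_exists('Demo\Storage\custom_pdo'); // the namespaced class

// Instantiating with the fully qualified name works from any namespace.
$storage = new \Demo\Storage\custom_pdo();

var_dump($bareNameExists);         // bool(false)
var_dump($qualifiedExists);        // bool(true)
var_dump($storage instanceof Pdo); // bool(true)
```

An alternative to writing the full name at the call site is a use statement at the top of index.php that imports the namespaced class.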
I'm trying to use guzzle 6 which works fine but I'm lost when it comes to how to log all the api calls. I would like to simply log timing, logged in user from session, url and any other usual pertinent info that has to do with the API call. I can't seem to find any documentation for Guzzle 6 that refers to this, only guzzle 3 (Where they've changed the logging addSubscriber call). This is how my current API calls are:
$client = new GuzzleHttp\Client(['defaults' => ['verify' => false]]);
$res = $client->get($this->url . '/api/details', ['form_params' => ['file' => $file_id]]);
You can use any logger that implements the PSR-3 interface with Guzzle 6.
In the example below I used Monolog as the logger, together with Guzzle's built-in log middleware and MessageFormatter.
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Middleware;
use GuzzleHttp\MessageFormatter;
use Monolog\Logger;
$stack = HandlerStack::create();
$stack->push(
Middleware::log(
new Logger('Logger'),
new MessageFormatter('{req_body} - {res_body}')
)
);
$client = new \GuzzleHttp\Client(
[
'base_uri' => 'http://httpbin.org',
'handler' => $stack,
]
);
echo (string) $client->get('ip')->getBody();
The details of the log middleware and the message formatter are not well documented yet, but you can check the list of variables you can use in MessageFormatter.
There is also guzzle-logmiddleware, which lets you customize the formatter and more.
@KingKongFrog This is how to specify the name of the log file:
$logger = new Logger('MyLog');
// The log level is an argument of StreamHandler, not of pushHandler().
$logger->pushHandler(new StreamHandler(__DIR__ . '/test.log', Logger::DEBUG));
$stack->push(Middleware::log(
$logger,
new MessageFormatter('{req_body} - {res_body}')
));
For Guzzle 7 I did this:
require './guzzle_7.2.0.0/vendor/autoload.php';
require './monolog/vendor/autoload.php';
use GuzzleHttp\Client;
use GuzzleHttp\Exception\RequestException;
use GuzzleHttp\Pool;
use GuzzleHttp\Psr7\Request;
use GuzzleHttp\HandlerStack;
use GuzzleHttp\Middleware;
use GuzzleHttp\MessageFormatter;
use Monolog\Logger;
use Monolog\Handler\StreamHandler;
use GuzzleHttp\TransferStats;
//$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
$logger = null;
$messageFormat =
//['REQUEST: ', 'METHOD: {method}', 'URL: {uri}', 'HTTP/{version}', 'HEADERS: {req_headers}', 'Payload: {req_body}', 'RESPONSE: ', 'STATUS: {code}', 'BODY: {res_body}'];
// MessageFormatter only substitutes {placeholders}; it does not call functions such as urldecode().
'REQUEST: {req_body}';
$handlerStack = \GuzzleHttp\HandlerStack::create();
$handlerStack->push(createGuzzleLoggingMiddleware($messageFormat));
function getLogger() {
global $logger;
if ($logger==null) {
$logger = new Logger('api-consumer');
$logger->pushHandler(new \Monolog\Handler\RotatingFileHandler('./TataAigHealthErrorMiddlewarelog.txt'));
}
var_dump($logger);
return $logger;
}
function createGuzzleLoggingMiddleware(string $messageFormat){
return \GuzzleHttp\Middleware::log(getLogger(), new \GuzzleHttp\MessageFormatter($messageFormat));
}
function createLoggingHandlerStack(array $messageFormats){
global $logger;
$stack = \GuzzleHttp\HandlerStack::create();
var_dump($logger);
collect($messageFormats)->each(function ($messageFormat) use ($stack) {
// We'll use unshift instead of push, to add the middleware to the bottom of the stack, not the top
$stack->unshift(createGuzzleLoggingMiddleware($messageFormat) );
});
return $stack;
}
//$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
// $tapMiddleware was not defined anywhere in this snippet, so the stack is passed directly.
$client = new Client(['verify' => false, 'handler' => $handlerStack]);
Wow!
unshift() is indeed better than calling push() with the templates in reverse order:
$handlers = HandlerStack::create();
$logger = new Logger('Logger');
$templates = [
'{code} >> {req_headers}',
'{code} >> {req_body}',
'{code} << {res_headers}',
'{code} << {res_body}'
];
foreach ($templates as $template) {
$handlers->unshift($this->getMiddleware($logger, $template));
}
$client = new Client([
RequestOptions::DEBUG => false,
'handler' => $handlers
]);
Using this function to obtain the Middleware:
private function getMiddleware(Logger $logger, string $template): callable {
return Middleware::log($logger, new MessageFormatter($template));
}
Logger comes from "monolog/monolog": "^1.27.1".
And these are all supported variable substitutions.
I need to send multiple requests so I want to implement a batch request.
How can we do it in Guzzle6?
Using the old way:
$client->send(array(
$client->get($courses), //api url
$client->get($job_categories), //api url
));
is giving me the error:
GuzzleHttp\Client::send() must implement interface Psr\Http\Message\RequestInterface, array given
Try something like this:
$client = new Client();
$requests = [];
foreach ($links as $link) {
$requests[] = new Request('GET', $link);
}
$responses = Pool::batch($client, $requests, array(
'concurrency' => 15,
));
foreach ($responses as $response) {
//do something
}
Don't forget the imports:
use GuzzleHttp\Pool;
use GuzzleHttp\Client;
use GuzzleHttp\Psr7\Request;