Webscraping Symfony/Panther: Can't get HTML - php

I want to scrape a site with the symfony panther package within a Laravel application. According to the documentation https://github.com/symfony/panther#a-polymorphic-feline I cannot use the HttpBrowser nor the HttpClient classes because they do not support JS.
Therefore I am trying to use the ChromeClient, which uses a local Chrome executable and a chromedriver binary shipped with the Panther package.
use Symfony\Component\Panther\Client;
$client = Client::createChromeClient();
$crawler = $client->request('GET', 'http://example.com');
dd($crawler->html());
Unfortunately, I only receive the empty default chrome page as HTML:
<html><head></head><body></body></html>
Every attempt to do anything else with the $client or the $crawler instance fails with the error "no nodes available".
I also tried the basic example from the documentation (https://github.com/symfony/panther#basic-usage), with the same result.
I'm using Ubuntu 18.04 Server under WSL on Windows and installed the google-chrome-stable deb package. This seems to have worked, because after the installation the error "the binary was not found" no longer occurs.
I also tried to use the executable of the Windows host system manually, but this only opens an empty CMD window that reopens every time I close it. I have to kill the process via Task Manager.
Is this because the Ubuntu server does not have an X server available?
What can I do to receive any HTML?

So, I'm probably late, but I had the same problem and found a pretty easy solution: just create a plain DomCrawler from the response content.
Its methods differ somewhat from those of the Panther crawler, but it is safer for evaluating HTML structures.
$client = Client::createChromeClient();
$client->request('GET', 'http://example.com');
$html = $client->getInternalResponse()->getContent();
// leading backslash so the class also resolves inside a namespaced file
$crawler = new \Symfony\Component\DomCrawler\Crawler($html);
// use the following to get the whole HTML
$crawler->outerHtml();
// or specific parts
$crawler->filter('.some-class')->outerHtml();
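The same idea (take the HTML string you already hold, parse it, then select by class) can also be sketched without the DomCrawler component. The version below swaps in PHP's bundled DOMDocument and DOMXPath for the Crawler; the helper name outerHtmlByClass and the sample markup are made up for illustration, and the class name is assumed to contain no quote characters.

```php
<?php
// Hypothetical helper: return the outer HTML of every element that carries $class.
function outerHtmlByClass(string $html, string $class): array
{
    $doc = new DOMDocument();
    // Real pages are rarely valid markup; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Match the class as a whole word inside the class attribute.
    $query = sprintf(
        "//*[contains(concat(' ', normalize-space(@class), ' '), ' %s ')]",
        $class
    );

    $result = [];
    foreach ($xpath->query($query) as $node) {
        // saveHTML() with a node argument serializes just that node (outer HTML).
        $result[] = $doc->saveHTML($node);
    }
    return $result;
}

// Sample markup standing in for $client->getInternalResponse()->getContent().
$sample = '<html><body><div class="some-class other"><p>Hello</p></div></body></html>';
$matches = outerHtmlByClass($sample, 'some-class');
```

In the Panther case, $sample would be replaced by the response content fetched as shown above.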

$client = Client::createChromeClient();
$crawler = $client->request('GET', 'http://example.com');
/**
* Get all Html code of page
*/
$client->getCrawler()->html();
/**
* For example to filter field by ID = AuthenticationBlock and get text
*/
$loginUsername = $client->getCrawler()->filter('#AuthenticationBlock')->text();

Related

Getting ms exchange server version using php-ews

I am using the php-ews library (new version) to display users' calendar events in my project. Users can set their MS Exchange account details in their profile and then use the calendar within my project itself. The code looks like this:
$host = '{host_set_on_users_profile}';
$username = '{username_set_on_users_profile}';
$password = '{password_set_on_users_profile}';
$version = Client::VERSION_2016;
$client = new Client($host, $username, $password, $version);
$client->setTimezone($timezone);
// Build request
$request = new FindItemType();
// more request building code here
$response = $client->FindItem($request);
But I get the following error:
[faultstring] => The specified server version is invalid.
The reason is that I hard-coded VERSION_2016, while users can have any version of MS Exchange set up for their account.
So is there a way to first find out the server version based on host, username, and password, so that I can use it when creating the client object?
To answer your question: normally, when you send a request to Exchange, the response also contains the version of the responding Exchange server. See the header element in the documentation at https://learn.microsoft.com/en-us/exchange/client-developer/web-service-reference/expanddl-operation#successful-expanddl-response-example It is not a readable year-based version, but the version number corresponds to a specific release.
That said, you can, for example, first make a simple dummy request such as ExpandDL with a dummy mail address and then use the returned version for all subsequent requests. Or, even better, set the version of your very first FindItem request to 2007 and read the correct version from the response.
Another option is to always set your request version to 2007, which is the lowest possible. You can do this if you don't need functionality from later Exchange versions, such as GetInboxRule or GetPhoneCallInformation.
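As a rough sketch of the "read the version from the response" idea, the snippet below parses the ServerVersionInfo header of a raw SOAP response and maps its major/minor numbers to a request version string. The helper name, the sample envelope, and its build numbers are invented for illustration; the returned strings are assumed to match what your client expects for its version argument, service packs are ignored, and how you obtain the raw response XML from your client is not shown here.

```php
<?php
// Hypothetical helper: derive a request version from the ServerVersionInfo
// header of an EWS SOAP response. Returns null when it cannot decide.
function exchangeVersionFromResponse(string $soapXml): ?string
{
    if ($soapXml === '') {
        return null;
    }
    $doc = new DOMDocument();
    if (!@$doc->loadXML($soapXml)) {
        return null; // not well-formed XML
    }

    $nodes = $doc->getElementsByTagNameNS(
        'http://schemas.microsoft.com/exchange/services/2006/types',
        'ServerVersionInfo'
    );
    if ($nodes->length === 0) {
        return null; // header not present
    }

    $info  = $nodes->item(0);
    $major = (int) $info->getAttribute('MajorVersion');
    $minor = (int) $info->getAttribute('MinorVersion');

    // Assumed mapping of product version numbers to baseline releases.
    if ($major === 8) {
        return 'Exchange2007';
    }
    if ($major === 14) {
        return 'Exchange2010';
    }
    if ($major === 15) {
        return $minor === 0 ? 'Exchange2013' : 'Exchange2016';
    }
    return null;
}

// Invented sample response; only the header matters for this sketch.
$sampleResponse = <<<'XML'
<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
  <s:Header>
    <h:ServerVersionInfo MajorVersion="15" MinorVersion="1" MajorBuildNumber="225" MinorBuildNumber="42"
        xmlns:h="http://schemas.microsoft.com/exchange/services/2006/types"/>
  </s:Header>
  <s:Body/>
</s:Envelope>
XML;

$detected = exchangeVersionFromResponse($sampleResponse);
```

If the helper returns null, falling back to the 2007 request version keeps the behaviour described above.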

Dialogflow Crash in AWS Deployment

I deployed the same source code to an AWS EC2 Linux instance, but it fails to display the response text from Dialogflow.
I checked the conversation history in the Dialogflow console; it shows both the request and the response correctly. However, the Dialogflow client (PHP) seems to crash after calling the function "detectIntent".
Unfortunately, I have found no way to get any logs.
I have already reinstalled the Dialogflow client library.
$formattedSession = $sessionsClient->sessionName($agent, $agentSession->session_id);
// Set Text Input
$textInput = new TextInput();
$textInput->setText($text);
$textInput->setLanguageCode($lang);
// Set Parameters
$optionalArgs = array();
$queryInput = new QueryInput();
$queryInput->setText($textInput);
$response = $sessionsClient->detectIntent($formattedSession, $queryInput, $optionalArgs);
$action = $response->getQueryResult()->getAction(); //The action name from the matched intent.
I hope the following experience helps others:
In my case, the PHP version was not compatible with one of the Google API libraries, so it crashed somewhere we could not capture.
Solution: uninstall PHP and install a compatible version.
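A small guard like the following may help surface this kind of problem instead of failing silently. It is only a sketch: the helper name runGuarded is made up, the minimum version is a parameter you would take from the library's own requirements, and catching \Throwable only helps for errors that PHP 7+ raises as exceptions or Error objects.

```php
<?php
// Hypothetical helper: check the PHP version first, then run $call and
// convert any exception or engine error into a readable message.
function runGuarded(callable $call, string $minPhp): array
{
    if (version_compare(PHP_VERSION, $minPhp, '<')) {
        return [
            'ok'    => false,
            'error' => sprintf('PHP %s is older than the required %s', PHP_VERSION, $minPhp),
        ];
    }

    try {
        return ['ok' => true, 'value' => $call()];
    } catch (\Throwable $e) {
        // \Throwable covers both Exception and Error (e.g. TypeError, ParseError).
        return ['ok' => false, 'error' => get_class($e) . ': ' . $e->getMessage()];
    }
}

// Stand-ins for the real detectIntent call.
$good = runGuarded(function () {
    return 42;
}, '5.0.0');

$bad = runGuarded(function () {
    throw new RuntimeException('boom');
}, '5.0.0');
```

In the code above, the closure would wrap the $sessionsClient->detectIntent(...) call, and the error string could be written to a log file.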

PHP : Scrape data generated with javascript ( ES6 )

I am trying to scrape data from a URL with PhantomJS and php-phantomjs, but my target page generates some of its data with ES6, which PhantomJS doesn't support yet, and I get errors like this in the console log:
ReferenceError: Can't find variable: Set
My code is:
use JonnyW\PhantomJs\Client;
$client = Client::getInstance();
$client->getEngine()->setPath('C:\\Users\\XXX\\Desktop\\bin\\phantomjs.exe');
$request = $client->getMessageFactory()->createRequest('example.com', 'GET');
$response = $client->getMessageFactory()->createResponse();
$client->send($request, $response);
var_dump($response->getConsole());
I searched a lot and found that PhantomJS will support ES6 in a new version (v2.5), for which a beta has been released, but it doesn't work for me.
What can I do now? Is there any way to scrape this page?
While the future of PhantomJS is uncertain, may I suggest another headless browser: Puppeteer. It is based on headless Google Chrome and is backed by a dedicated team of Google engineers.
There are already projects to control it from PHP; the most notable at the moment is PuPHPeteer (notable in that it can not only take screenshots and create PDFs but also offers JavaScript evaluation).

How do I debug a php nusoap call requiring basic authentication that doesn't respond at all?

I am trying to re-write a Drupal module that has fallen behind the API of the gateway it connects to.
A stripped back version of the code I think is causing the problem is as follows:
$namespace = ($this->testing) ? 'https://api.sandbox.ewaypayments.com/' : 'https://api.ewaypayments.com/';
$endpoint = $this->url;
$httpUsername = $this->user_name;
$httpPassword = $this->password;
$client = new nusoap_client($endpoint, TRUE);
$client->setCredentials($httpUsername, $httpPassword, 'basic');
$client->response_timeout = 50;
$result = $client->call($operation, array('request' => $params), $namespace);
The $result is consistently false. If I put anything like this into my code it also consistently returns empty:
$error = $client->getError();
watchdog('connection_message', $error);
I'm a bit out of my depth and without any error messages in my Apache logs or in the Drupal watchdog I cannot see a way forward.
1. Turn on PHP error reporting if it's not already on.
Check that the error_reporting and display_errors settings in your php.ini file are set to E_ALL and On, respectively, when you are developing locally. You can also add these directives at the beginning of your PHP script to set them at run time:
error_reporting(E_ALL);
ini_set('display_errors', 'On');
2. Catch NuSOAP errors like this:
$result = $client->call($operation, array('request' => $params), $namespace);
if ($client->fault) {
    echo 'Error: ';
    print_r($result);
} else {
    // check result
    $err_msg = $client->getError();
    if ($err_msg) {
        // Print error msg
        echo 'Error: '.$err_msg;
    } else {
        // Print result
        echo 'Result: ';
        print_r($result);
    }
}
3. Verify you are using the correct API parameters and endpoint:
From the eWAY API reference, your endpoints are:
https://api.ewaypayments.com/soap.asmx (production)
https://api.sandbox.ewaypayments.com/soap.asmx (sandbox)
4. Similar eWAY API projects that you can reverse-engineer:
Commerce eWAY for Drupal (last version is Mar. 2014)
eWAY-RapidAPI (uses JSON and cURL)
eWay-PHP-API (uses XML and cURL)
eWay Payment Gateway (uses SOAPClient)
There are a couple of things I would like to say in this case.
First, why do you have to use that library? You could use Zend_Soap_Client instead; if you don't have it, you can install it using Composer:
http://framework.zend.com/downloads/composer (look for zendframework/zend-soap)
Then, you can download a trial version of PhpStorm. Its debugging tools, when used with http://xdebug.org, are really awesome: you can inspect the entire variable and environment space at runtime.
Finally, you can use a friendly error-management tool like http://raygun.io: you insert a few lines of code, create a trial account, and within minutes you see all errors that are happening in your application.
In your case, you could, for example, inspect the current value of $operation, which seems to be the function being called on the web service.
Here's the code for inspecting all functions being offered in a webservice using Zend_Soap_Client:
$endpoint = 'http://your.example.endpoint/?wsdl';
$soapClient = new Zend_Soap_Client($endpoint);
$functions = $soapClient->getFunctions();
var_dump($functions);
Since you are using SOAP requests, your endpoint is incorrect. It should be https://api.ewaypayments.com/soap.asmx or
https://api.sandbox.ewaypayments.com/soap.asmx
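One way to keep the sandbox and production endpoints from drifting apart is to derive both from a single flag, mirroring the $this->testing switch in the question. This is a minimal sketch; the function name is made up, and the two URLs are the ones quoted above.

```php
<?php
// Hypothetical helper: pick the eWAY SOAP endpoint from a testing flag.
function ewaySoapEndpoint(bool $testing): string
{
    $host = $testing ? 'api.sandbox.ewaypayments.com' : 'api.ewaypayments.com';
    return 'https://' . $host . '/soap.asmx';
}

$sandboxEndpoint    = ewaySoapEndpoint(true);
$productionEndpoint = ewaySoapEndpoint(false);
```

The result would then be passed as the first argument when constructing the SOAP client.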
For better performance, you may want to disable NuSOAP debugging.
To do so, edit the /include/nusoap/nusoap.php file and set the debug level to 0, like this:
['nusoap_base']->globalDebugLevel = 0;
Going one step further, you can remove all lines that start with:
$this->debug(
or
$this->appendDebug(
Source:
http://kb.omni-ts.com/entry/245/
You could give this module a try: https://www.drupal.org/project/eway_integration
We are currently working with eWay to test this module together. It works with Drupal Commerce and implements eWay's RAPID 3.1 API and is PCI compliant.

Jira Soap with a Php

I have seen little to no instruction on using PHP to develop a client website that makes remote calls to Jira.
Currently I'm trying to build a SOAP client using JSP/Java to connect to a local Jira instance. All I want to do is create and search issues. We are currently having problems using Maven2 and getting all the files we need from the repository, since we are behind a major firewall (yes, I've used the proxy).
I have a lot of experience with PHP and would like to know whether PHP SoapClient calls can get the job done.
http://php.net/manual/en/soapclient.soapclient.php
Yes, it can be done, using SOAP or XML-RPC.
Using the APIs is pretty straightforward: have a look at the API documentation to find the right functions for you. Your code should look something like this:
<?php
$soapClient = new SoapClient("https://your.jira/rpc/soap/jirasoapservice-v2?wsdl");
$token = $soapClient->login('user', 'password');
...
... # get/create/modify issues
...
?>
Example of adding a new comment:
$issueKey = "key-123";
$myComment = "your comment";
$soapClient = new SoapClient("https://your.jira/rpc/soap/jirasoapservice-v2?wsdl");
$token = $soapClient->login('user', 'password');
$soapClient->addComment($token, $issueKey, array('body' => $myComment));
Example of creating an issue:
$issue = array(
'type'=>'1',
'project'=>'TEST',
'description'=>'my description',
'summary'=>'my summary',
'priority'=>'1',
'assignee'=>'user',
'reporter'=>'user',
);
$soapClient = new SoapClient("https://your.jira/rpc/soap/jirasoapservice-v2?wsdl");
$token = $soapClient->login('user', 'password');
$soapClient->createIssue($token, $issue);
Note that you need to install php-soap on Linux (or its equivalent on Windows) to be able to use the SOAP library.
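Before constructing a SoapClient, a quick check along these lines can tell you whether the extension is actually loaded. It is only a sketch: the helper name and the printed message are made up for illustration.

```php
<?php
// Hypothetical helper: report whether PHP's SOAP extension is usable.
function soapAvailable(): bool
{
    return extension_loaded('soap') && class_exists('SoapClient');
}

$soapReady = soapAvailable();
if (!$soapReady) {
    // Message text is illustrative; adapt it to your own logging.
    echo "The SOAP extension is not loaded; install or enable php-soap first.\n";
}
```

Running the same check from the command line and from the web server can be useful, since the two may load different php.ini files.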
