How to bypass web site redirect screen? - php

I am trying to download the content of a web page by scraping it, but one of my main problems is that I cannot get past the site's redirect. For example, when I log in to the website and submit the login form, I see a waiting page and nothing else.
In the browser, however, I am redirected to the profile page after the waiting page.
I downloaded Goutte and wrote my script, but I have a problem when submitting the form: if I submit a wrong username or password I see "incorrect password", but if I enter the correct username and password I only see the waiting image that precedes the redirect.
First edit
Following the update in the answer, my code is now:
<?php
require_once 'vendor/autoload.php';
use Goutte\Client;
$client = new Client();
$url = 'https://egghead.io/users/sign_in';
$username = 'xxxx';
$password = 'xxxx';
$crawler = $client->request('GET', $url, [
    'allow_redirects' => true
]);
$form = $crawler->selectButton('Sign In')->form();
$crawler = $client->submit($form, array('user[email]' => $username, 'user[password]' => $password));
$crawler->filter('body')->each(function ($node) {
    print $node->html();
});

Goutte will automatically follow redirects unless you tell it not to. You can customize the redirect behavior using the allow_redirects request option.
Set to true to enable normal redirects with a maximum number of 5 redirects. This is the default setting.
Set to false to disable redirects.
Pass an associative array containing the 'max' key to specify the maximum number of redirects and optionally provide a 'strict' key value to specify whether or not to use strict RFC compliant redirects (meaning redirect POST requests with POST requests vs. doing what most browsers do which is redirect POST requests with GET requests).
Reference:
http://docs.guzzlephp.org/en/latest/quickstart.html#redirects
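To make the 'strict' option more concrete, here is a minimal sketch in plain PHP of how the method of the follow-up request is typically chosen after a redirect. The function name nextRedirectMethod is hypothetical, and this is a simplified model of the behavior quoted above, not Guzzle's or Goutte's actual code.

```php
<?php
// Hypothetical helper: models which HTTP method a client uses for the
// follow-up request after a redirect. Simplified illustration only.
function nextRedirectMethod(string $method, int $status, bool $strict): string
{
    $method = strtoupper($method);

    // Safe methods are never rewritten.
    if (in_array($method, ['GET', 'HEAD', 'OPTIONS'], true)) {
        return $method;
    }

    // 303 See Other always switches to GET.
    if ($status === 303) {
        return 'GET';
    }

    // For 301 and 302, browsers switch to GET; strict mode keeps the method.
    if (($status === 301 || $status === 302) && !$strict) {
        return 'GET';
    }

    // 307 and 308 (and strict 301/302) preserve the original method.
    return $method;
}

echo nextRedirectMethod('POST', 302, false), "\n"; // browser-like behavior
echo nextRedirectMethod('POST', 302, true), "\n";  // strict behavior
```

With a login form this matters because the form is submitted with POST: in the browser-like mode the page after the redirect is fetched with GET, while in strict mode the POST would be repeated against the new URL.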
Update:
$crawler = $client->request('GET', 'http://egghead.io', [
    'allow_redirects' => true
]);
$crawler = $client->click($crawler->selectLink('Sign in')->link());
$form = $crawler->selectButton('Sign in')->form();
$crawler = $client->submit($form, array('login' => 'fabpot', 'password' => 'xxxxxx'));
$crawler->filter('.flash-error')->each(function ($node) {
    print $node->text()."\n";
});

Related

Symfony 4 Functional Form Test with CSRF Token and Session

I am using Symfony 4.4 with the form bundle and want to write some functional tests.
My form has multiple steps, and I want to submit the complete form all the way to the success message. But with csrf_protection: true I can't even get to page 2. If I disable it for the test environment, I can get to page 2. If I dump my session on page 2, I can see that it is empty. Here is an example of my test:
$client = static::createClient();
$client->request('GET', '/test');
$crawler = $client->submitForm('Next', ['name' => 'Max Mustermann']); // => lands on step2
$crawler = $client->submitForm('Next', ['email' => 'asdf#ase.de']); // => lands back on step 1 with an empty form, so the session is empty
Does anyone have an idea what is wrong here?
This page says that it should work, but it doesn't:
https://symfony.com/doc/4.4/components/http_foundation/session_testing.html#functional-testing
I suggest you retrieve the form from the client's response and then submit it with the data, for example:
// Get the crawler
$crawler = $client->request('GET', '/test');
// Get the form
$form = $crawler->filter('form')->form();
// Fill the form. NB: double check your form structure/fields name
$form['name'] = 'Max Mustermann';
$form['email'] = 'asdf#ase.de';
// Submit the form.
$crawler = $client->submit($form);
$this->assertEquals(200, $client->getResponse()->getStatusCode());

Want to get redirect URL after sending post Curl request

I am sending a basic auth request to a site to log in a user. If the credentials are correct, I want to show the dashboard. For that I need the redirect URL that appears as "redirect_url" in the dump of the response, under "+curl: curl resource #212".
This is my code. I want to store the redirect URL in $redirectUrl:
$curl = new Curl\Curl();
$curl->post('https://xyz.ryver.com/application/login', array(
'username' => 'xyza#xyz.com',
'password' => 'mypass',
));
$redirectUrl= ""; //store redirecting URL here
return redirect($redirectUrl);
I want to get the value of this "redirect_url".
Try this code
return redirect()->away('https://www.your_url_here.com');
Hope this helps.
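The snippet above only shows how to redirect once the URL is known. As a minimal sketch of how the URL itself could be obtained, the following plain PHP reads it from the Location header of the raw response headers. The function name extractRedirectUrl and the sample headers are hypothetical, and how you get the raw headers out of the Curl\Curl wrapper depends on its version.

```php
<?php
// Hypothetical helper: returns the value of the last Location header found
// in a raw HTTP header block, or null if there is none.
function extractRedirectUrl(string $rawHeaders): ?string
{
    $location = null;
    foreach (preg_split('/\r\n|\n|\r/', $rawHeaders) as $line) {
        // Header names are case-insensitive.
        if (stripos($line, 'Location:') === 0) {
            $location = trim(substr($line, strlen('Location:')));
        }
    }
    return $location;
}

// Made-up sample response headers for illustration.
$sample = "HTTP/1.1 302 Found\r\n"
        . "Content-Type: text/html\r\n"
        . "Location: https://example.com/dashboard\r\n\r\n";

echo extractRedirectUrl($sample), "\n";
```

If you use PHP's cURL extension directly and do not follow redirects automatically, curl_getinfo($ch, CURLINFO_REDIRECT_URL) reports the same value without any manual parsing.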

Guzzle 6 live data streaming: How to?

My project consumes a URL API whose data is updated every second. Using Guzzle 6, how can I refresh the data in the browser without AJAX?
...
...
$un = 'admin';
$pa = 'password';
$base_uri = 'http://example.com:82';
$uri1 = 'api/instant/connectopc';
$uri2 = 'api/instant/displaydata?site=SITE';
$cookieFile = 'jar.txt';
$cookieJar = new FileCookieJar($cookieFile, true);
$client = new Client([
    'base_uri' => $base_uri,
    'auth' => [$un, $pa],
    'cookies' => $cookieJar, // the Guzzle request option is 'cookies', not 'cookie'
    'curl' => [
        CURLOPT_COOKIEJAR => 'jar.txt',
        CURLOPT_COOKIEFILE => 'jar.txt'
    ],
]);
$connect = $client->get($uri1);
// Live data should be refreshed every second. How can this be done?
$live = $client->get($uri2, ['cookies' => $cookieJar]);
...
How can I accomplish live data streaming?
You cannot do any live streaming from the same page once the browser has closed the connection. You have to open another connection, via Ajax or another technology such as WebSockets if you need real-time data exchange.
You can't do live streaming with an ordinary PHP request, because PHP ends the connection when the script finishes. You need something like Node.js for that.

Crawler + Guzzle: Accessing to form

I am using the PHP Guzzle client to fetch a website and then process it with the Symfony 2.1 crawler.
I am trying to access a form, for example this test form:
http://de.selfhtml.org/javascript/objekte/anzeige/forms_method.htm
$url = 'http://de.selfhtml.org/javascript/objekte/anzeige/forms_method.htm';
$client = new Client($url);
$request = $client->get();
$request->getCurlOptions()->set(CURLOPT_SSL_VERIFYHOST, false);
$request->getCurlOptions()->set(CURLOPT_SSL_VERIFYPEER, false);
$response = $request->send();
$body = $response->getBody(true);
$crawler = new Crawler($body);
$filter = $crawler->selectButton('submit')->form();
var_dump($filter);die();
But I get the exception:
The current node list is empty.
So I am somewhat lost as to how to access the form.
Try using Goutte. It is a screen scraping and web crawling library built on top of the tools that you are already using (Guzzle, Symfony2 Crawler). See the GitHub repo for more info.
Your code would look like this using Goutte:
<?php
use Goutte\Client;
$url = 'http://de.selfhtml.org/javascript/objekte/anzeige/forms_method.htm';
$client = new Client();
$crawler = $client->request('GET', $url);
$form = $crawler->selectButton('submit')->form();
$crawler = $client->submit($form, array(
    'username' => 'myuser', // assuming you are submitting a login form
    'password' => 'P#S5'
));
var_dump($crawler->count());
echo $crawler->html();
echo $crawler->text();
If you really need to setup the CURL options you can do it this way:
<?php
$url = 'http://de.selfhtml.org/javascript/objekte/anzeige/forms_method.htm';
$client = new Client();
$guzzle = $client->getClient();
$guzzle->setConfig(
    array(
        'curl.CURLOPT_SSL_VERIFYHOST' => false,
        'curl.CURLOPT_SSL_VERIFYPEER' => false,
    )
);
$client->setClient($guzzle);
// ...
UPDATE:
When using the DomCrawler I often get that same error. Most of the time it is because I'm not selecting the correct element in the page, or because the element doesn't exist. Instead of using:
$crawler->selectButton('submit')->form();
try the following:
$form = $crawler->filter('#signin_button')->form();
Here the filter method selects the element by its id, if it has one ('#signin_button'); you could also select it by class ('.signin_button').
The filter method requires the CssSelector component.
Also debug your form by printing out the HTML (echo $crawler->html();) and making sure that you are actually on the right page.
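As a further debugging aid, here is a minimal sketch that uses PHP's built-in DOM extension to list the submit controls present in the HTML you received, so you can see which labels, ids and names are actually there to select. The function name listSubmitControls and the sample markup are hypothetical, and this is not part of Goutte or the DomCrawler.

```php
<?php
// Hypothetical helper: lists the submit controls in an HTML string so you can
// see which labels, ids and names are available to select.
function listSubmitControls(string $html): array
{
    $doc = new DOMDocument();
    // Suppress warnings caused by imperfect real-world markup.
    @$doc->loadHTML($html);

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query(
        '//button | //input[@type="submit" or @type="button" or @type="image"]'
    );

    $controls = [];
    foreach ($nodes as $node) {
        $controls[] = [
            'tag'   => $node->nodeName,
            'id'    => $node->getAttribute('id'),
            'name'  => $node->getAttribute('name'),
            // Buttons are labelled by their text, inputs by their value.
            'label' => $node->nodeName === 'button'
                ? trim($node->textContent)
                : $node->getAttribute('value'),
        ];
    }
    return $controls;
}

// Made-up markup for illustration.
$html = '<html><body><form action="/login" method="post">'
      . '<input type="text" name="user">'
      . '<input type="submit" id="signin_button" value="Sign in">'
      . '</form></body></html>';

print_r(listSubmitControls($html));
```

If the list comes back empty for the page you fetched, the form is probably not in the HTML at all (for example because it is built by JavaScript), which would explain the empty node list.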

Change facebook redirect URL on login

I'm trying to change the redirect URL for Facebook login on my site, so that I can go to a page where I create a new user in my database if they don't already exist and THEN redirect them to my main page. However, when I try to log in, I get the following error on Facebook: An error occurred with PHP SDK Unit Tests. Please try again later.
My code (I want to redirect them to mysite.com/createfbuser.php):
public function getLoginUrl($params=array()) {
    $this->establishCSRFTokenState();
    $currentUrl = $_SERVER['SERVER_NAME'] . '/createfbuser.php';
    return $this->getUrl(
        'www',
        'dialog/oauth',
        array_merge(array(
            'client_id' => $this->getAppId(),
            'redirect_uri' => $currentUrl, // possibly overwritten
            'state' => $this->state,
            'scope' => 'email'),
        $params));
}
EDIT: For reference, the original code reads $currentUrl = $this->getCurrentUrl();
First of all, you don't have to edit the PHP SDK. Below is a sample that authenticates the user and then redirects to your landing page.
Make sure you replace:
YOUR-APP-ID-HERE with your Facebook application ID,
YOUR-APP-API-SECRET-HERE with your Facebook application secret key,
YOUR-REDIRECT-URL-HERE with your landing page URL.
<?php
// Requires Facebook PHP SDK 3.0.1: https://github.com/facebook/php-sdk/
require ('facebook.php');
define('FACEBOOK_APP_ID',"YOUR-APP-ID-HERE");
define('FACEBOOK_SECRET',"YOUR-APP-API-SECRET-HERE");
define('REDIRECT_URI',"YOUR-REDIRECT-URL-HERE");
$user = null;
$facebook = new Facebook(array(
'appId' => FACEBOOK_APP_ID,
'secret' => FACEBOOK_SECRET,
'cookie' => true
));
$user = $facebook->getUser(); // Get the UID of the connected user, or 0 if the Facebook user is not connected.
if($user == 0) {
    // If the user is not connected to your application, redirect the user to authentication page
    /**
     * Get a Login URL for use with redirects. By default, full page redirect is
     * assumed. If you are using the generated URL with a window.open() call in
     * JavaScript, you can pass in display=popup as part of the $params.
     *
     * The parameters:
     * - redirect_uri: the url to go to after a successful login
     * - scope: comma separated list of requested extended perms
     */
    $login_url = $facebook->getLoginUrl($params = array('redirect_uri' => REDIRECT_URI));
    echo ("<script> top.location.href='".$login_url."'</script>");
} else {
    // if the user is already connected, then redirect them to landing page or show some content
    echo ("<script> window.location.href='".REDIRECT_URI."'</script>");
}
?>
If you want to get extended permissions, simply add a "scope" parameter to the login URL, for example:
$login_url = $facebook->getLoginUrl($params = array('redirect_uri' => REDIRECT_URI,'scope' => 'comma-separated-list-of-requested-extended-perms'));
For more details on permissions, refer to the Facebook permissions docs.
The URL you redirect to needs to be on the same domain as the one you've configured for your app in the app settings. Beyond that there's no real restriction: you can set redirect_uri to any URL on your domain.
When you try to authorize the app as an admin, a more specific error message should be visible. It's quite likely Error 191 (see "Facebook API error 191" for another possible cause and solution).
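The same-domain idea can be sketched as a quick local check before sending the user to Facebook. The function name isRedirectOnAppDomain and the example domains are hypothetical, and the exact matching rules are Facebook's, so treat this only as a sanity check under that assumption.

```php
<?php
// Hypothetical helper: checks that a redirect URI is an absolute URL whose
// host equals the configured app domain or is a subdomain of it.
function isRedirectOnAppDomain(string $redirectUri, string $appDomain): bool
{
    $parts = parse_url($redirectUri);

    // A value without scheme and host is not an absolute URL.
    if ($parts === false || empty($parts['scheme']) || empty($parts['host'])) {
        return false;
    }

    $host = strtolower($parts['host']);
    $appDomain = strtolower($appDomain);
    $suffix = '.' . $appDomain;

    return $host === $appDomain
        || substr($host, -strlen($suffix)) === $suffix;
}

var_dump(isRedirectOnAppDomain('https://mysite.com/createfbuser.php', 'mysite.com'));
var_dump(isRedirectOnAppDomain('mysite.com/createfbuser.php', 'mysite.com'));
var_dump(isRedirectOnAppDomain('https://other.example/createfbuser.php', 'mysite.com'));
```

Note that the second call fails the check because the value has no scheme. The question builds its redirect_uri from $_SERVER['SERVER_NAME'] . '/createfbuser.php', which has the same shape, so that may be part of the problem.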
