How can scrape website via PHP that requires POST data? - php

I'm trying to scrape a website that takes in POST data to return the correct page (sans POST it returns 15 results, with POST data it returns all results).
Currently my code is looking like this:
$curl = curl_init();
curl_setopt($curl,CURLOPT_URL,"http://www.thisismyurl.com/awesome");
curl_setopt($curl, CURLOPT_POST, true);
curl_setopt($curl, CURLOPT_POSTFIELDS, XXXXXX);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
$result= curl_exec($curl);
I know that I need to put my postfields into the space filled with "XXXXXX", I just don't know where to dig up the post fields/values and how to structure them into the variable that I pass into there.
Any help would be greatly appreciated!

If it's a simple form, then just extract all the form fields and duplicate them in your script. If it's some dynamic form, like javascript building up a request and using ajax, then you can sniff the data using developer tools (e.g. Firefox's Firebug Net tab, HTTPfox, etc...) and extract the post data as it gets sent over.
Either way, once you know what fields/data are being sent, the rest should be (relatively) easy to duplicate/build.

I think someone may look for code to replace XXXXXX. I use the following piece of code.
$ch = curl_init();
$timeout=5;
$name=$_REQUEST['name'];
$pass=$_REQUEST['pass'];
$data = array('username' => '$name', 'password' => '$pass');
$data=http_build_query($data);
curl_setopt($ch,CURLOPT_URL,"superawsomesite.com");
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
curl_setopt($ch,CURLOPT_RETURNTRANSFER,1);
curl_setopt($ch,CURLOPT_CONNECTTIMEOUT,$timeout);
$data = curl_exec($ch);
curl_close($ch);

Related

How to get results from other website

I need to fetch data from this web page Sender score
.I try to use cURL but it renders white page.
Here is my code :
$ch = curl_init();
$keyword = "an-example.com";
curl_setopt($ch, CURLOPT_URL, 'https://www.senderscore.org/lookup.php?lookup='.$keyword.'&validLookup=true');
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$data = curl_exec($ch);
print_r($data);
curl_close($ch);
any idea ?
Regards.
You're getting a blank page because of the captcha which is needed to fill in. Perhaps senderscore has an API which you can use? Or maybe there's another website available doing the same thing. I thought this was about scoring email statusus right? Then maybe this site will help you out: http://www.reputationauthority.org/domain_lookup.php?ip=somedomain.nl&Submit.x=0&Submit.y=0&Submit=Search
I can use this site without the need of captcha or any other bot interference.

Header() substitute

Hi I am new to php and want to know some alternate function for the header('location:mysit.php');
I am in a scenario that I am sending the request like this:
header('Location: http://localhost/(some external site).php'&?var='test')
something like this but what I wanna do is that I want to send values of variables to the external site but I actually dont want that page to pop out.
I mean variables should be sent to some external site/page but on screen I want to be redirected to my login page. But seemingly I dont know any alternative please guide me. Thx.
You are searching for PHP cUrl:
$ch = curl_init();
// set URL and other appropriate options
curl_setopt($ch, CURLOPT_URL, "http://www.example.com/");
curl_setopt($ch, CURLOPT_HEADER, 0);
// grab URL and pass it to the browser
curl_exec($ch);
// close cURL resource, and free up system resources
curl_close($ch);
Set the location header to the place you actually want to redirect the browser to and use something like cURL to make an HTTP request to the remote site.
The way you usually would do that is by sending those parameters by cURL, parse the return values and use them however you need.
By using cURL you can pass POST and GET variables to any URL.
Like so:
$ch = curl_init('http://example.org/?aVariable=theValue');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$result = curl_exec($ch);
curl_close($ch);
Now, in $result you have the response from the URL passed to curl_init().
If you need to post data, the code needs a little more:
$ch = curl_init('http://example.org/page_to_post_to.php');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, 'variable1=value1&variable2=value2');
$result = curl_exec($ch);
curl_close($ch);
Again, the result from your POST reqeust is saved to $result.
You could connect to another URL in the background in numerous ways. There's cURL ( http://php.net/curl - already mentioned here in previous comments ), there's fopen ( http://php.net/manual/en/function.fopen.php ), there's fsockopen ( http://php.net/manual/en/function.fsockopen.php - little more advanced )

Can't get this content spinning API to work with PHP

Can you help me get this content spinning API working? It was wrote to work with C# but is there a way I can get it to work using PHP? I've been trying to post to the URL stated on that page using cURL, but all I'm getting back is a blank page. Here's the code I'm using:
$url = "http://api.spinnerchief.com/apikey=YourAPIKey&username=YourUsername&password=YourPassword";
// Some content to POST
$post_fields = "SpinnerChief is totally free. Not a 'lite' or 'trial' version, but a 100% full version and totally free. SpinnerChief may be free, but it is also one of the most powerful content creation software tools available. It's huge user-defined thesaurus means that its sysnonyms are words that YOU would normally use in real life, not some stuffy dictionary definition. And there's more...SpinnerChief can only get better, because we listen to our users, We take onboard your ideas and suggestions, and we update SpinnerChief so that it becomes the software YOU want!";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_PORT, 9001);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_fields);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$result = curl_exec($ch);
curl_close($ch);
echo $result;
Can anyone see something wrong I'm doing? Thanks a lot for the help.
The value for CURLOPT_POST should be 1, and the posted data should be set with CURLOPT_POSTFIELDS.

Authentication using JSON and PHP

Can anyone kindly tell me how to use JSON base Authentication using/in PHP?
I got some helping code (as given below), but can not understand that how to use it.
url = http://testerws.heroku.com/
POST url + 'user_session.json',{:username => "admin",:password => "admin"}, :accept => 'application/json'
I am not sure whether this code is correct (with respect to syntax) or not. And I can not understand how to use it in combination with PHP.
So kindly help me to solve this, I am in great need of it.
You can use curl of php to post the json data.
$json = json_encode($data);
$http_post_data = array("data" => $json);
$url = 'http://testerws.heroku.com/test.php';
$ch = curl_init();
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, 5);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_URL,$url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_POSTFIELDS, $http_post_data);
$result = curl_exec($ch);
curl_close($ch);
Then test.php will get the data which you post. You can do authentication your self and return to the calling script.
But the authentication is not that easy for a security system.
FYR.
http://blog.evernote.com/tech/2011/05/17/architectural-digest/
and read the comment.
What is the context of the authentification? If it's a webapplication you could probably fill in a form, post the data as JSON using JQuery and decoding the JSON on the PHP side and check the validation. Be sure to encrypt the information you send from the browser though. Username and password as clear text is not secure at all.

How do you do HTML form testing without real user input simulation?

this question is like this one, except it's for PHP testing via browser. It's about testing your form input.
Right now, i have a form on a single page. It has 12 input boxes. Every time i test the form, i have write those 12 input boxes in my browser.
i know it's not a specific coding question. This question is more about how to do direct testing on your form
So, how to do recursive testing without consuming too much of your time ?
I think Selenium Remote Control is one of the most popular names in the field of web interface testing. See for example this question.
If you don't want to use some big programs for test just one small form - you can use your own testing bike :)
$args = array(/* Your _POST params */)
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url); // Your local|remote url
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, $args);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);
// Parse the response here
// You may specify the loop with need args for your 12 checkboxes

Categories