Make cURL behave exactly like a form - PHP

I have a simple HTML form on my site which sends data to some remote site.
What I want to do is use the data the user enters into the form for statistical purposes.
So instead of sending the data straight to the remote page, I send it first to my own script, which resends it to the remote site.
The thing is, I need it to behave exactly the way the usual form would: taking the user to the remote site and displaying its resources.
When I use this code it kind of works, but not in the way I want it to:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $action);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $fields);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
$result = curl_exec($ch);
curl_close($ch);
The problem is that it displays the response inside my own script. For example, if $action is:
somesite.com/processform.php and my script is myscript.php, it displays the response of "somesite.com/processform.php" inside "myscript.php", so all the relative links are broken.
How do I make it send the user to "somesite.com/processform.php", the same thing that pressing the submit button would do?
Leonti

I think you will have to do this on your end, as translating relative paths is the client's job. It should be simple: Just take the base directory of the request you made
http://otherdomain.com/my/request/path.php
and add it in front of every outgoing link that does not begin with "/" or a protocol ("http://", "ftp://").
Detecting all the outgoing links is hard, but I am sure there are ready-made PHP classes that do it. Check, for example, this article and the getLinks() function in the user comments. I am not 100% sure whether this is exactly what you need, but it certainly goes in the right direction.
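A rough sketch of the prefixing step described above (a naive regex; it assumes double-quoted href attributes and ignores edge cases such as an existing <base> tag):
$base = 'http://otherdomain.com/my/request/'; // base directory of the request you made
$html = preg_replace('/href="(?!https?:\/\/|ftp:\/\/|\/)/i', 'href="' . $base, $html);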

Here are a couple of possible solutions, which I post separately so they don't get mixed up with the one I recommend:
1 - keep using cURL, parse the response and add a <base/> tag to it. It should work for pretty much everything on that page.
<base href="http://realsite.com/form_url.php" />
2 - do not alter the submit URL. Submit the form to the real URL, but capture its contents using some JavaScript library (YUI can do that) and send them to your script via XHR. It's still kind of hacky, though.

There are several ways to do that. Here's one of the easiest: just use a 307 redirect.
header('Location: http://realsite.com/form_url.php', true, 307);
You can do your logging and stuff either before or after calling header(), but if you do it after, you will need to start your script with
ignore_user_abort(true);
Note that browsers are supposed to notify the user that their form is being redirected.
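Putting that together, a minimal sketch (log_form_data() is a hypothetical logging helper, and the URL is the placeholder from above):
ignore_user_abort(true); // keep the script running even if the client disconnects after the redirect
header('Location: http://realsite.com/form_url.php', true, 307); // 307 makes the browser re-POST the form data
log_form_data($_POST); // hypothetical: record the submitted fields for your statistics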

Related

Page redirect not working in php when accessed without a browser

I am developing an application in which the input I receive comes through an SMS gateway (and not a browser). I need to process the data obtained through SMS and pass it on to another PHP file, which will finish the processing and send an SMS back to the SMS gateway.
However, when I try to redirect from page1.php to page2.php, it is not working with the following code:
page1.php:
$url = "location:http://www.iweavesolutions.com/$extra?sms=".$msg."&keyword=".$key."&num=".$msg_num."&src=".$source;
header($url);
page2.php:
$msg = $_GET['sms'];
$msg_num = $_GET['num'];
$keyword = $_GET['keyword'];
$src = $_GET['src'];
send_sms($msg,$msg_num);
However, the header call in the first page doesn't seem to work. The PHP documentation says that header() is used for browser-related activities. In my application there is no browser at all. So, do I need to change my mechanism for passing values across files? Please help.
Please refer to cURL:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.iweavesolutions.com");
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 2);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, 'variable1=abc&variable2=123');
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_MAXREDIRS, 1);
$buffer = curl_exec($ch);
curl_close($ch);
Something like this.
Sending a location:[someUrl] header as an answer to a request just tells the requesting client to do another request to that location. It is up to the client whether to follow this redirect or not. Browsers will usually do this; other clients may not.
If the client you're dealing with (the SMS gateway) does not follow Location header redirects, you need to check the client's documentation for some mechanism to make it do that. If there is no way to redirect the client, you need to change your server-side logic to get rid of the need for the redirect, i.e. you need to call the processing logic in your 'page2.php' directly from 'page1.php' without the indirection of the redirect (or bundle the whole logic in one file, etc.).
The SMS gateway probably does not implement HTTP properly. In my experience this is not uncommon.
As a side note: your first script (assuming it is complete) is written as if register_globals is enabled - which has been deprecated for a long time - and it does not url-encode the values, which may be the cause of the issue here (see the sketch after this list). If not, you'll need to either:
fix the SMS gateway
change the end point registered on the SMS gateway to eliminate the need for redirection
include the code from the redirected script into the current endpoint script
proxy the request from the gateway in the endpoint script.
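As for the url-encoding, a hedged sketch of rebuilding the redirect URL with the question's variables (http_build_query urlencodes each value):
$params = array(
    'sms'     => $msg,
    'keyword' => $key,
    'num'     => $msg_num,
    'src'     => $source,
);
header('Location: http://www.iweavesolutions.com/' . $extra . '?' . http_build_query($params));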

Trying to find the best method

I will set up a registration page using MSSQL.
The system must work like this:
The user submits data at something.com/register.php.
The data is sent to host-ip-address/regsecond.php, which is where my database will be. (For security reasons, this PHP page won't directly access the database.)
The PHP page at the host will start another PHP page or EXE file, which will reach the database directly and securely.
As my PHP level is not high, I wanted to learn if I could start PHP scripts which will do their job without showing up in the user's browser. Here is what I mean:
"I submit some data at x.php, and it starts another PHP script which will do the job with the data submitted from x.php, but the other PHP script won't show up in the user's browser."
I hope I was clear. In summary: should I use an EXE (which will be harder), or can I start a PHP script without it appearing in the browser? And how, of course.
You can do this using the curl extension. You can find info on it here:
http://php.net/manual/en/book.curl.php
You can do something like the following:
$postdata = array(
    'item1' => 'data'
);
$ch = curl_init("http://host-ip-address/regsecond.php");
curl_setopt ($ch, CURLOPT_POST, true);
curl_setopt ($ch, CURLOPT_POSTFIELDS, $postdata);
curl_exec($ch);
curl_close($ch);
This makes a call directly from your first script to your second script without exposing anything to the user. On the far side, the data will come in as regular post data ($_POST).
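On the receiving side, a minimal sketch of what regsecond.php might do with that data (the validation step is just a placeholder):
<?php
// regsecond.php: the forwarded data arrives as ordinary POST fields
$item1 = isset($_POST['item1']) ? $_POST['item1'] : null;
// ...validate $item1 here, then hand it off to the database layer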
You can't post data through PHP to a different website.
If you would like, you can configure your PHP script to connect to a different server for your MySQL, though I wouldn't say it's a huge amount safer. For example:
Instead of:
mysql_connect('localhost', 'username', 'password');
Try this:
mysql_connect('your-ip:portnumber', 'username', 'password');
I'm not sure I understand this correctly, but you may:
§1 use a "public" php script that invokes a private one:
<?php
// public register script
// now call the private one
// store data to a txt file or similar...
require('/path/outside/www-data/script_that_processes_further.php');
§2 request a script at another server,
<?php
file_get_contents('http://asdf.aspx?firstname=' . urlencode($theFirstName)); // simplistic; urlencode guards special characters
// other options would be cURL, XML/SOAP or whatever.
§1 may be used with §2.
regards,
/t

PHP cross domain requests

I am a green programmer and I was originally trying to make cross-domain requests in JS. I quickly learned that this is not allowed. Unlike similar questions posted on here, I would like to see if I can use PHP to make those requests for me instead of using JSONP. Is this possible?
Simple workflow...
BROWSER: POST to my PHP the request-payload & request-headers
PHP: POST to Other Domain's URL the request-payload & request-headers
Other Domain: Process Request and send response
PHP: Send the Response-Content and Response-Header Info back to the browser
Here is what I am trying to work with http://msdn.microsoft.com/en-us/library/bb969500%28v=office.12%29.aspx
My goal is to make a Communicator Web Access Client that is web based and mobile friendly.
A link to a working example would be awesome!
cURL would be your option in this case. Something as simple as:
<?php
$ch = curl_init('http://otherdomain.com/');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, false);
$result = curl_exec($ch);
var_dump($result);
?>
In this case, $result will contain the HTML code of the site. Please be aware that it is not going to execute any JavaScript, as would happen if you were visiting the site in a browser.
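If you need the POST workflow from the question, a rough sketch of the forwarding step (the target URL is a placeholder):
<?php
// Hypothetical proxy endpoint: forward the browser's POST fields to the other domain
$ch = curl_init('http://otherdomain.com/target');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($_POST)); // re-encode as a regular form post
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);
curl_close($ch);
echo $response; // relay the response body back to the browser
?>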
You are talking about web services, and it seems that the goal is to process payments. Any major payment gateway has APIs prepared for that. In any case, you can study on your own. Here's a good starting point: http://ajaxonomy.com/2008/xml/web-services-part-1-soap-vs-rest

How to display images when using cURL?

When scraping a page, I would like the images included with the text.
Currently I'm only able to scrape the text. For example, as a test script, I scraped Google's homepage and it only displayed the text, no images (Google logo).
I also created another test script using Redbox, with no success, same result.
Here's my attempt at scraping the Redbox 'Find a Movie' page:
<?php
$url = 'http://www.redbox.com/Titles/AvailableTitles.aspx';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$result = curl_exec($ch);
curl_close($ch);
echo $result;
?>
The page was broken: missing box art, missing scripts, etc.
Looking at FF Firebug's 'Net' tool (which allows me to check headers and file paths), I discovered that Redbox's images and CSS files were not loaded (404 not found). I noticed why: my browser was looking for Redbox's images and CSS files in the wrong place.
Apparently the Redbox images and CSS files are located relative to the domain, and likewise for Google's logo. So if my script above is using its own domain as the base for the file paths, how can I change this?
I tried altering the host and referer request headers with the script below, and I've googled extensively, but no luck.
My fix attempt:
<?php
$url = 'http://www.redbox.com/Titles/AvailableTitles.aspx';
$referer = 'http://www.redbox.com/Titles/AvailableTitles.aspx';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Host: www.redbox.com"));
curl_setopt($ch, CURLOPT_REFERER, $referer);
curl_setopt($ch, CURLOPT_NOBODY, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$result = curl_exec($ch);
curl_close($ch);
echo $result;
?>
I hope I made sense; if not, let me know and I'll try to explain it better.
Any help would be great! Thanks.
UPDATE
Thanks to everyone (especially Marc and Wyatt); your answers helped me figure out a method to implement.
I was able to successfully test by following the steps below:
Download the page and its requisites via Wget.
Add <base href="..." /> to downloaded page's header.
Upload the revised downloaded page and its original requisites via Wput to a temporary server.
Test uploaded page on temporary server via browser
If the uploaded page is not displayed properly, some of the requisites might still be missing (css, js, etc.). View which ones are missing via a tool that lets you view header responses (e.g. the 'Net' tool from FF's Firebug Addon). After locating the missing requisites, visit the original page that the uploaded page is based on, take note of the proper locations of the missing requisites, then revise the downloaded page from step 1 to accommodate the new proper locations and begin at step 3 again. Else, if the page is rendered properly, then success!
Note: When revising the downloaded page I manually edited the code. I'm sure you could use regex or a parsing library on cURL's response to automate the process.
When you scrape a URL, you're retrieving a single file, be it html, image, css, javascript, etc. The document you see displayed in a browser is almost always the result of MULTIPLE files: the original html, each separate image, each css file, each javascript file. You enter only a single address, but fully building/displaying the page will require many HTTP requests.
When you scrape the google home page via curl and output that HTML to the user, there's no way for the user to know that they're actually viewing Google-sourced HTML - it appears as if the HTML came from your server, and your server only. The user's browser will happily suck in this HTML, find the images, and request the images from YOUR server, not google's. Since you're not hosting any of google's images, your server responds with a proper 404 "not found" error.
To make the page work properly, you've got a few choices. The easiest is to parse the HTML of the page and insert a <base href="..." /> tag into the document's header block. This tells any viewing browser that relative links within the document should be fetched relative to this 'base' source (e.g. google).
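A naive sketch of that insertion (it assumes the scraped page has a literal <head> tag; the base URL is just an example):
$base = '<base href="http://www.google.com/" />';
$result = preg_replace('/<head([^>]*)>/i', '<head$1>' . $base, $result, 1);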
A harder option is to parse the document and rewrite any references to external files (images ,css, js, etc...) and put in the URL of the originating server, so the user's browser goes to the original site and fetches from there.
The hardest option is to essentially set up a proxy server, and if a request comes in for a file that doesn't exist on your server, to try and fetch the corresponding file from Google via curl and output it to the user.
If the site you're loading is using relative paths for its resource URLs (i.e. /images/whatever.gif instead of http://www.site.com/images/whatever.gif), you're going to need to do some rewriting of those URLs in the source you get back, since cURL won't do that itself. Wget (official site seems to be down) does (and will even download and mirror the resources for you), but it does not provide PHP bindings.
So, you need to come up with a methodology to scrape through the resulting source and change relative paths into absolute paths. A naive way would be something like this:
if (!preg_match('/src="https?:\/\//', $result)) {
    $result = preg_replace('/src="(.*?)"/', 'src="' . $MY_BASE_URL . '$1"', $result);
}
where $MY_BASE_URL is the base URL you want to rewrite, i.e. http://www.mydomain.com. That won't work for everything, but it should get you started. It's not an easy thing to do, and you might be better off just spawning off a wget command in the background and letting it mirror or rewrite the HTML for you.
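For the wget route, a hedged sketch of spawning it from PHP (paths are illustrative; -p fetches the page requisites, -k converts links for local viewing):
$cmd = 'wget -p -k -P /tmp/mirror ' . escapeshellarg($url);
shell_exec($cmd); // mirrors the page plus its images/css/js under /tmp/mirror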
Try obtaining the images by having the raw output returned, using the CURLOPT_BINARYTRANSFER option set to true, as below
curl_setopt($ch,CURLOPT_BINARYTRANSFER, true);
I've used this successfully to obtain images and audio from a webpage.

curl sending GET instead of POST

Actually, it's gotten so messy that I'm not even sure curl is the culprit. So, here's the PHP:
$creds = array(
    'pw'    => "xxxx",
    'login' => "user"
);
$login_url = "https://www.example.net/login-form"; //action value in real form.
$loginpage = curl_init();
curl_setopt($loginpage, CURLOPT_HEADER, 1);
curl_setopt($loginpage, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($loginpage, CURLOPT_URL, $login_url);
curl_setopt($loginpage, CURLOPT_POST, 1);
curl_setopt($loginpage, CURLOPT_POSTFIELDS, $creds);
$response = curl_exec($loginpage);
echo $response;
I get the headers (which match the headers of a normal, successful request), followed by the login page (I'm guessing curl captured this due to a redirect) which has an error to the effect of "Bad contact type".
I thought the problem was that the request had the host set to the requesting server, not the remote server, but then I noticed (in Firebug) that the request is sent as GET, not POST.
If I copy the login site's form, strip it down to just the form elements with values, and put the full URL for the action, it works just great. So I would think this isn't a security issue where the login request has to originate on the same server, etc. (I even get rid of the empty hidden values and all of the JS which set some of the other cookies).
Then again, I get confused pretty quickly.
Any ideas why it's showing up as GET, or why it's not working, for that matter?
When troubleshooting the entire class of PHP-cURL-related problems, you simply have to turn on CURLOPT_VERBOSE and give CURLOPT_STDERR a file handle.
tail -f your file, compare the headers and response to the ones you see in Firebug, and the problem should become clear.
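For example, a minimal sketch (the log path is arbitrary):
curl_setopt($loginpage, CURLOPT_VERBOSE, true);
$log = fopen('/tmp/curl_debug.log', 'w'); // arbitrary log file
curl_setopt($loginpage, CURLOPT_STDERR, $log);
// ...curl_exec() as before, then:
fclose($log);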
The request is made from the server, and will not show up in Firebug. (You probably confused it with another request by your browser). Use wireshark to find out what really happens. You are not setting CURLOPT_FOLLOWLOCATION; redirects should not be followed.
Summarizing: Guess less, post more. Link to a pcap dump, and we will be able to tell exactly what you're doing wrong; or post the exact output of the php script, and we might.
The shown code does a multipart formpost (since you pass a hash array to the POSTFIELDS option), which probably is not what the target server expects.
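To send an ordinary application/x-www-form-urlencoded post instead, you could pass a string rather than an array; a minimal sketch:
curl_setopt($loginpage, CURLOPT_POSTFIELDS, http_build_query($creds));
// A string here makes cURL send Content-Type: application/x-www-form-urlencoded
// instead of multipart/form-data.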
Try throwing in a print_r(curl_getinfo($loginpage)) at the end and see what header data it sent.
Also, if you're trying to fake that you're logging in from their site, you're going to want to make sure you're sending the correct referer with your post, so that they "think" you were on the website when you sent it.
