I need to store XML data sent over HTTP POST to my server. In the log files I see that the data is successfully sent to my server. But I have no idea how to get the data.
I tried to catch them with the php://input stream like in the code below. The problem I see is that php://input is just read when the file containing the code is called.
$xml = file_get_contents("php://input");
$var_str = var_export($xml, true);
file_put_contents('api-test/test.txt', $var_str);
Is there any way to set some kind of listener/watcher to the php://input stream? Maybe PHP is the wrong technology to realize this. Is there some other way like AJAX?
The problem I see is that php://input is just read when the file containing the code is called.
Yes.
That's how PHP (in a server-side programming context) works.
1. The client makes an HTTP request to a URL.
2. The server receives the HTTP request and determines that the URL is handled by a particular PHP program (typically by matching the path component of the URL to a directory and file name, unless the Front Controller Pattern is being used).
3. The PHP program is executed and has access to data from the request.
4. The server sends the output of the PHP program back.
Is there any way to set some kind of listener/watcher to the php://input stream?
You get a new stream every time a request is made. So the typical way to watch it is to put a PHP script at the URL that the request is being made to.
Then make sure each request is made to the same URL.
(If you need to support requests being made to different URLs, then look into the Front Controller Pattern).
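The "new stream per request" model is not specific to PHP. As a rough illustration only, here is a minimal Python sketch using the standard library's http.server: the handler method runs once for each incoming POST and reads that request's body, much as a PHP script reads php://input. The names XmlHandler, post_xml, demo and the in-memory list received are hypothetical, and the /api path is an assumption.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # hypothetical in-memory store; real code would write to a file or database


class XmlHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Called once per incoming POST; self.rfile is this request's own body stream.
        length = int(self.headers.get("Content-Length", 0))
        received.append(self.rfile.read(length).decode("utf-8"))
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"stored")

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


def post_xml(port, xml):
    # Plays the role of the remote client that sends the XML.
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("POST", "/api", body=xml.encode("utf-8"),
                     headers={"Content-Type": "application/xml"})
        return conn.getresponse().read().decode("utf-8")
    finally:
        conn.close()


def demo():
    server = HTTPServer(("127.0.0.1", 0), XmlHandler)  # port 0: let the OS pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return [post_xml(server.server_port, x) for x in ("<a/>", "<b/>")]
    finally:
        server.shutdown()
        server.server_close()
```

Calling demo() sends two separate requests; each one triggers its own do_POST call, so no listener or watcher is needed beyond the handler itself.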
Maybe PHP is the wrong technology to realize this.
It's a perfectly acceptable technology for handling HTTP requests.
Is there some other way like AJAX?
Ajax is a buzzword meaning "Make an HTTP request with JavaScript". Since you are receiving the requests and not making them, Ajax isn't helpful.
I have a GAE PHP script that accepts a POSTed message consisting of $_POST['version_name'], $_POST['version_comments'] and $_FILES['userfile']['tmp_name'][0].
It runs a file_get_contents against $_FILES['userfile']['tmp_name'][0] and stores the binary away in a CloudSQL DB.
This is the end point for a PHP-driven form, so users can upload new versions (with names / comments) through a friendly GUI from their browser. It works fine.
Now I want to be able to use the same handler as the end point for a Python script. I've written this:
r = requests.post('http://handler_url_here/',
                  data={'version_name': "foo", 'version_comments': "bar"},
                  files={'userfile': open('version_archive.tar.gz', 'rb')})
version_archive.tar.gz is a non-empty file, but file_get_contents($_FILES['userfile']['tmp_name'][0]) is returning null. Uploading files is a bit tricky with GAE, so I'd prefer to not change the listener - is there some way I can make Python send its payload in the same format the listener is expecting?
$_POST['version_name'] and $_POST['version_comments'] are working as expected.
I'd start by looking at the middle-man, which in this case is the HTTP request. Keep in mind, your Python script isn't posting directly to PHP; it's making an HTTP POST request, which is then getting interpreted by PHP into the $_POST variables and whatnot.
Figure out a way to "capture" or "dump" the HTTP request that Python is sending so you can inspect its contents. (You can find a number of free tools that help you do this in various ways. Reading the HTTP request should be pretty self-explanatory if you're familiar with working with $_GET and $_POST variables in PHP.) Then send a supposedly identical request from PHP, capture the HTTP request, and determine how and why they're different.
Good luck!
I'm trying to parse data from http://skytech.si/
I looked around a bit and found out that the site uses http://skytech.si/skytechsys/data.php?c=tabela to show data. When I open this file in my browser I get nothing. Is the file protected so that it can only be run from the server side, or something?
Is there any way to get data from it? If I could get HTML data (perhaps in a table?) I would probably know how to parse it.
If not, would it be still possible to parse website and how?
I had a look at the requests made;
http://skytech.si/skytechsys/?c=graf&l=bf0b3c12e9b2c2d65bd5ae8925886b57
http://skytech.si/skytechsys/?c=tabela
Forbidden
You don't have permission to access /skytechsys/ on this server.
This website doesn't allow 'outside' GET requests. You could try fetching the page with file_get_contents and parsing the result, but I don't think you will be able to get specific data tables (aside from those on the home page) because of the AJAX requests that need to be made. I believe the /data? endpoint is the controller that handles the data, which is not exposed via an API.
When you open this URL in your browser, you send a GET request. The data at this address is only returned in response to a POST request with the following parameters: c: tabela, l: undefined, x: undefined. Next time, analyze the headers and look at the Network log if you are using Chrome/Chromium.
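As a hedged sketch of what such a POST could look like from a script, the following Python builds the request with the three parameters named above using only the standard library. The URL and parameter values are taken from this thread; whether the site still responds this way is an assumption, and build_table_request is a hypothetical helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request

# URL and parameter values come from the thread; the site's current behaviour is an assumption.
URL = "http://skytech.si/skytechsys/data.php?c=tabela"


def build_table_request():
    """Build (but do not send) the POST request described above."""
    params = {"c": "tabela", "l": "undefined", "x": "undefined"}
    body = urlencode(params).encode("ascii")
    # Supplying data makes urllib issue a POST instead of a GET.
    return Request(URL, data=body,
                   headers={"Content-Type": "application/x-www-form-urlencoded"})
```

Passing the result to urllib.request.urlopen() would actually send it; the response body could then be handed to an HTML parser.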
If that website does not expose an API, it is not recommended to parse the data, as their HTML structure is prone to change.
See:
http://php.net/manual/en/function.file-get-contents.php
You can then interpret the result with an HTML-parsing engine or with a regular expression (not recommended).
In VB.NET, the language I'm most familiar with, there is an option to say myrequest.writebyte(ValueOfByte) instead of myrequest.write(). This can be done when writing to a file or to an HTTP request stream. I need it for writing to an HTTP request stream.
I was wondering if there is any way to do this in PHP, as I'm trying not to use the normal curl_exec, which writes everything at once.
Thanks in advance,
Spencer
I want to know what happens behind the scenes of an HTTP POST request.
For example, the browser sends an HTTP POST request to a server-side script written in PHP.
How does PHP's $_POST variable get the values from the client?
Could someone explain in detail or point me to a guide?
The HTTP protocol(*) specifies how the browser should send the request.
HTTP basically consists of a set of headers in plain text, separated by line feeds, followed by the data being transmitted. Inside the HTTP request, POST data is actually formatted pretty much the same as GET data; it's just in a different part of the request (the body, after the headers, rather than the URL).
You can use tools like Firebug or Fiddler to see exactly how the headers and data are formatted for incoming and outgoing HTTP requests. It's actually all quite simple to read, so you should be able to work it out just by looking.
Once it gets to the server, the PHP interpreter is responsible for translating the raw HTTP request data into its standard $_GET, $_POST, etc variables. This is something that PHP does for you.
Other languages (eg Perl) do not have this functionality built in, so a Perl programmer would have to have code in their program to parse the incoming request data into useful variables. Fortunately, even Perl has a standard library which can be included that does the job, so even Perl programmers don't generally have to write the code themselves any more.
The way PHP, and any other language, does it is simply string manipulation. As I said, the HTTP data is plain text and is received in simple string format, so it's just a case of breaking it down by splitting it on ampersand and equals-sign characters (the question mark only separates the path from the query string in the URL) and then URL-decoding the pieces.
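To make the string-manipulation point concrete, here is a minimal Python sketch of splitting a urlencoded body into names and values. It is an illustration, not PHP's actual implementation: parse_form_body is a hypothetical helper, later duplicate names simply overwrite earlier ones, and PHP's special handling of names containing [ and ] is not reproduced.

```python
from urllib.parse import unquote_plus


def parse_form_body(body):
    """Split an application/x-www-form-urlencoded string into a dict."""
    fields = {}
    for pair in body.split("&"):
        if not pair:
            continue
        name, _, value = pair.partition("=")
        # '+' stands for a space and %XX is a percent-encoded byte.
        fields[unquote_plus(name)] = unquote_plus(value)
    return fields
```

For example, parse_form_body("foo=bar&xxx=baz") gives a dict with foo mapped to bar and xxx mapped to baz. In real Python code, urllib.parse.parse_qs does the same job.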
As PHP does it all behind the scenes, you probably don't need to worry about the exact mechanisms it uses, but the PHP source code is available if you really want to find out.
I said it's all in plain text. HTTPS, of course, is encrypted. However by the time PHP gets hold of it, the Apache server has already done the decryption, so as far as PHP is concerned it's still plain text.
(*) Before anyone pulls me up on it, yes, I know that saying "HTTP protocol" is a redundancy, like "ATM machine" or "PIN number".
The browser encodes the data according to the content-type of the form, then transmits it as the body of a POST request. PHP then picks it up and populates $_POST with the names and values (performing special handling when the name includes the characters [ and ] or .).
I'd suggest to get a capturing proxy (e.g. Fiddler) or a network capture tool (e.g. Wireshark) and watch your own browsing traffic for a while; it will give you a nice view of the issue.
Other than that, POST is rather similar to GET, except that the data is sent in the body of the request instead of the URL, and there are two ways to encode it (multipart/form-data in addition to the application/x-www-form-urlencoded format that's shared with GET).
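The two encodings can be compared side by side with a small Python sketch. It is simplified on purpose: encode_urlencoded and encode_multipart are hypothetical helpers, only plain text fields are handled (no file parts), and the boundary is passed in by hand, whereas a real client generates one and announces it in the Content-Type header.

```python
from urllib.parse import urlencode


def encode_urlencoded(fields):
    """application/x-www-form-urlencoded: same syntax as a GET query string."""
    return urlencode(fields).encode("ascii")


def encode_multipart(fields, boundary):
    """multipart/form-data: one part per field, separated by the boundary."""
    lines = []
    for name, value in fields.items():
        lines.append("--" + boundary)
        lines.append('Content-Disposition: form-data; name="%s"' % name)
        lines.append("")  # blank line between part headers and part content
        lines.append(value)
    lines.append("--" + boundary + "--")  # closing boundary
    lines.append("")
    return "\r\n".join(lines).encode("utf-8")
```

The urlencoded form is compact and suits short text values; multipart is what browsers use for file uploads, since each part can carry its own headers and raw content.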
Well, let's illustrate step by step, starting with a page containing a [form action="foo.php" method="post"]
Once you click submit (or hit enter), the browser triggers an event named "submit". This event can be caught internally for processing with JavaScript/DOM, and this is what most sites do for validation or Ajax routines.
If those routines do not stop the flow with a return false, the browser continues to process the POST request (this process is the same as making a POST with the XMLHttpRequest object).
The browser first checks the method, action and content encoding, then parses the input values to determine the size of the data it will send, and encodes it.
Finally it sends something like this (raw values):
POST /foo.php HTTP/1.1
Host: example.org
Content-Type: application/x-www-form-urlencoded
Content-Length: 7

foo=bar
This is a POST request. Note that the browser declares a Content-Length and may then send the body in several chunks; both browser and server know this can happen. When a server receives a POST request, it keeps reading from the browser until the content received matches the declared content length.
Now the other side. The server receives the request, reads the content, parses it (foo = bar; xxx = baz), and makes it available in its environment for that specific request, so you can pick it up with PHP, Python, Java...
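The "keep reading until the content length is reached" step could look roughly like this in Python. This is only a sketch: read_body is a hypothetical helper, the tiny chunk_size is chosen to force several reads, and an in-memory io.BytesIO stands in for the network connection.

```python
import io


def read_body(stream, content_length, chunk_size=4):
    """Keep reading from the connection until Content-Length bytes have arrived."""
    received = b""
    while len(received) < content_length:
        chunk = stream.read(min(chunk_size, content_length - len(received)))
        if not chunk:  # the connection closed before the full body arrived
            raise ConnectionError("body shorter than Content-Length")
        received += chunk
    return received


# Stand-in for a socket: the 7-byte body from the request above.
example = read_body(io.BytesIO(b"foo=bar"), 7)
```

Here the body arrives in two reads (4 bytes, then 3), and the loop stops as soon as the declared length is reached, without touching any bytes that follow.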
That's it. Ah note you can pass both GET and POST variables in the same request!
Using a [form action="foo.php?someVar=123&anotherVar=TRUE" method="post"]
Will make the browser send the request as
POST /foo.php?someVar=123&anotherVar=TRUE HTTP/1.1
Host: example.org
Content-Type: application/x-www-form-urlencoded
Content-Length: 7

foo=bar
And server when parsing this request will make the following variables available:
GET[someVar] = 123
GET[anotherVar] = TRUE
POST[foo] = bar
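As a rough illustration of how a server might derive both sets of variables from that single request, here is a Python sketch that takes the query string from the request line and the form fields from the body. split_get_and_post and RAW are hypothetical names, and the sketch assumes a urlencoded body and CRLF line endings.

```python
from urllib.parse import parse_qsl, urlsplit


def split_get_and_post(raw_request):
    """Return (get_vars, post_vars) for a raw urlencoded POST request."""
    # Headers and body are separated by one blank line.
    head, _, body = raw_request.partition("\r\n\r\n")
    request_line = head.split("\r\n")[0]
    method, target, version = request_line.split(" ")
    get_vars = dict(parse_qsl(urlsplit(target).query))  # from the URL
    post_vars = dict(parse_qsl(body))                   # from the body
    return get_vars, post_vars


RAW = (
    "POST /foo.php?someVar=123&anotherVar=TRUE HTTP/1.1\r\n"
    "Host: example.org\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "Content-Length: 7\r\n"
    "\r\n"
    "foo=bar"
)
```

Calling split_get_and_post(RAW) yields someVar and anotherVar in the first dict and foo in the second, mirroring the GET and POST variables listed above.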
I ran into this problem when scraping sites that make heavy use of JavaScript to obfuscate their data.
For example,
<a href="javascript:void(0)" onClick="grabData(23)">VIEW DETAILS</a>
This href attribute reveals no information about the actual URL. You'd have to manually examine the grabData() JavaScript function to get a clue.
OR
The old-school way is to manually open the Live HTTP Headers add-on for Firefox and monitor the POST parameters, which reveals the actual URL being POSTed to.
So I'm wondering, is there a way to capture the outgoing and incoming POST parameters in a server-side script or JavaScript, as Live HTTP Headers does? This would make even the most JavaScript-obfuscated web pages easily scrapable.
Thanks.
I'm not sure I understand the question but...
In PHP, incoming POST parameters are stored in the $_POST array; you can display them with print_r($_POST);.