I'm really unsure if this is even possible, but we control an interface that has XML posted to it via HTTP POST in the form www.url.com/script.php?xml=<xmlgoeshere>. The XML is supposed to be URL encoded before it is passed in to us, and we decode and parse it.
Except I have one client who just refuses to URL encode their incoming XML, which works fine until the XML contains an ampersand, at which point everything after it is treated as the end of the xml variable.
www.url.com/script.php?xml=<xmlstart...foo&bar.../>
The end result being that I have XML being POST/GET'd into the xml variable as normal, and then I lose half of the incoming content because of the ampersand.
Now I know that's expected/proper behavior, my question is, is it possible to capture the &bar.../> segment of this code, so that if we hit a known error I can crowbar this into working anyways? I know this is non-ideal but I'm at my wit's end dealing with the outside party.
UPDATE
Ok so I was totally confused. After grabbing the server variables as mentioned below, it looks like I'm not getting the querystring, but that's because on the query they're submitting it has:
[CONTENT_TYPE] => application/x-www-form-urlencoded
[QUERY_STRING] =>
That being the case, is the above behavior still to be expected? Is there a way to get the raw form input in this case? Thanks to the posters below for their help.
You'd be hard pressed to do it, if it's even possible, because the fragments of a query string take the format foo=bar with the & character acting as the separator. This means that you'd get an unpredictable $_GET variable whose key is everything between the & and the next = (assuming there even is one) and whose value runs from that = to the next & or the end of the string.
It might be possible to attempt to parse the $_GET array in some way to recover the lost meaning, but it would never be all that reliable. You might have more luck trying to parse $_SERVER['QUERY_STRING'], but that's not guaranteed to succeed either, and it would be a hell of a lot of effort for a problem that can be avoided just by the client using the API properly.
And for me, that's the main point. If your client refuses to use your API in the way you tell them to use it, then it's ultimately their problem if it doesn't work, not yours. Of course you should accommodate your clients to a reasonable standard, but that doesn't mean bending over backwards for them just because they refuse to accommodate your needs or technical standards that have been laid down for the good of everyone.
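If the only parameter you use is xml=, and it's always at the front, and there are no other parameters, you can do something like this:
// Start with what PHP parsed; fall back to the raw query string if '&' split the XML.
libxml_use_internal_errors(true);
$xml = $_GET['xml'] ?? '';
if (count($_GET) > 1 || simplexml_load_string($xml) === false) {
    $xml = substr($_SERVER['QUERY_STRING'], 4); // drop the leading "xml="
    if (simplexml_load_string($xml) === false) {
        http_response_code(400); // still not well-formed: give up
        exit('Malformed XML');
    }
}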
However, you should tell the client to fix their code, especially since it's so easy for them to comply with the standard! You might still get trouble if the XML contains a ? or a #: PHP or the web server may get confused about where the query string starts (messing up your $_SERVER['QUERY_STRING']), and PHP, the client's code, or an intermediary proxy or web server may get confused about the #, because that usually marks the beginning of a fragment.
E.g., something like this might be impossible to transmit reliably in a query parameter:
<root><link href="http://example.org/?querystring#fragment"/></root>
So tell them to fix their code. It's almost certainly incredibly easy for them to do so!
UPDATE
There's some confusion about whether this is a GET or POST. If they send a POST with x-www-form-urlencoded body, you can substitute file_get_contents('php://input') for $_SERVER['QUERY_STRING'] in the code above.
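For the POST case, here is a minimal sketch of reading the raw body and stripping the xml= prefix. It assumes the body starts with xml= and carries no other fields, and that the SimpleXML extension is available. The function name extractXmlFromRawBody is made up for illustration.

```php
<?php
// Hypothetical helper: returns the XML that follows "xml=" in a raw,
// unencoded form body, or null if the prefix is missing or the XML is
// not well-formed.
function extractXmlFromRawBody(string $rawBody, string $prefix = 'xml='): ?string
{
    if (strncmp($rawBody, $prefix, strlen($prefix)) !== 0) {
        return null; // body does not start with "xml="
    }
    $xml = substr($rawBody, strlen($prefix));

    libxml_use_internal_errors(true); // collect parse errors instead of emitting warnings
    return simplexml_load_string($xml) !== false ? $xml : null;
}

// In the real script the input would come from the request body:
// $xml = extractXmlFromRawBody(file_get_contents('php://input'));

// Sample body containing an ampersand that PHP would otherwise split on.
$xml = extractXmlFromRawBody('xml=<root>foo&amp;bar</root>');
echo $xml === null ? 'rejected' : $xml, "\n";
```

Because the raw body is never run through PHP's parameter parsing, the & inside the XML survives intact.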
Yes, it's possible, using $_SERVER["QUERY_STRING"].
For your URL www.url.com/script.php?xml=<xmlstart...foo&bar.../>, $_SERVER["QUERY_STRING"] should contain xml=<xmlstart...foo&bar.../>.
The following code should extract the xml data.
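$prefix = 'xml=';
$pos = strpos($_SERVER["QUERY_STRING"], $prefix); // search for "xml=", not just "xml"
$xml = "";
if ($pos !== false) {
    // Everything after "xml=" is the XML, including any unencoded '&'.
    $xml = substr($_SERVER["QUERY_STRING"], $pos + strlen($prefix));
}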
The problem here is that the query string will be parsed for & and = characters. If you know where your = character will be after the "bar" key then you may be able to capture the value of the rest of the string. However if you hit more & you are going to need to know the full content of the incoming message body. If you do then you should be able to get the rest of the content.
Related
Today I found malware on one site. I deleted it, of course, and everything is OK, but to understand where it came from I would like to understand its logic. It is encoded, though in a fairly simple way. At the beginning of the file I see:
$i96="QU~T<`_YM82iAN>/v#s\"'q#tZFjJX6a\tcI)yS^boD.\$du|3\rWw=rC!;[4*P5LVkB?%19m:p7 -zK,gOl{Efx]0R}&h+\n\\(enGH";
This is then used throughout the rest of the file as a dictionary of characters; from then on, the file consists of assignments like this:
$GLOBALS['rpdxi45'] = $i96[94].$i96[51].$i96[51].$i96[39].$i96[51].$i96[6].$i96[51].$i96[94].$i96[70].$i96[39].$i96[51].$i96[23].$i96[11].$i96[95].$i96[77];
Does anyone have a clue how I can decode this (without infecting a server of mine, of course), or at least know the name of this type of encoding? Just so I know whether I can find something on the web.
If someone is interested, I can post the rest of the file, I found it odd.
Update: the file is actually a malicious shell hack. If you find it on your server, delete it and contact your sysadmin.
It is obfuscating the phrase "error_reporting"
<?php
$i96="QU~T<`_YM82iAN>/v#s\"'q#tZFjJX6a\tcI)yS^boD.\$du|3\rWw=rC!;[4*P5LVkB?%19m:p7 -zK,gOl{Efx]0R}&h+\n\\(enGH";
echo $i96[94].$i96[51].$i96[51].$i96[39].$i96[51].$i96[6].$i96[51].$i96[94].$i96[70].$i96[39].$i96[51].$i96[23].$i96[11].$i96[95].$i96[77];
$GLOBALS['rpdxi45'] is storing a string constructed from the characters of the string held in $i96.
Echoing $GLOBALS['rpdxi45'] will show you the string that has been constructed.
See here: http://ideone.com/Jy1uty
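To decode lines like this without running any of the malicious file, one option is to treat each assignment as plain text and look up the indices yourself. The sketch below assumes the dictionary variable is named $i96, as in the sample. The function name decodeLine is made up for illustration.

```php
<?php
// Dictionary string copied verbatim from the sample in the question.
$dict = "QU~T<`_YM82iAN>/v#s\"'q#tZFjJX6a\tcI)yS^boD.\$du|3\rWw=rC!;[4*P5LVkB?%19m:p7 -zK,gOl{Efx]0R}&h+\n\\(enGH";

// One obfuscated assignment held as text (single quotes, so nothing is evaluated).
$line = '$GLOBALS[\'rpdxi45\'] = $i96[94].$i96[51].$i96[51].$i96[39].$i96[51].$i96[6].$i96[51].$i96[94].$i96[70].$i96[39].$i96[51].$i96[23].$i96[11].$i96[95].$i96[77];';

// Hypothetical helper: finds every $i96[N] lookup in the line and
// concatenates the corresponding dictionary characters.
function decodeLine(string $line, string $dict, string $dictVar = 'i96'): string
{
    preg_match_all('/\$' . preg_quote($dictVar, '/') . '\[(\d+)\]/', $line, $matches);

    $decoded = '';
    foreach ($matches[1] as $index) {
        $decoded .= $dict[(int) $index] ?? ''; // skip out-of-range indices
    }
    return $decoded;
}

echo decodeLine($line, $dict), "\n"; // error_reporting
```

Running each assignment in the file through a function like this gives the list of function names and strings the script builds, and the malicious code itself is never executed.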
Following on from this question, I realised you can only use $_POST when using a form... d'oh.
Using jQuery or cURL when there's no form still wouldn't address the problem that I need to post a long string in the URL.
Problem
I need to send data to my database from a desktop app, so figured the best way is to use the following url format and append the data to the end, so it becomes:
www.mysite.com/myscript.php?testdata=somedata,moredata,123,xyz,etc,etc,thisgetslong
With my previous script, I was using $_GET to read the [testdata=] string, and my web host told me $_GET can only read 512 chars, so that was the problem.
Hack
Using the script below, I'm now able to write thousands of characters. My question: is this viable, or is there a better way?
<?php
include("connect.php"); //Connect to the database
//hack - read the url directly and search the string for the data i need
$actual_link = "http://$_SERVER[HTTP_HOST]$_SERVER[REQUEST_URI]";
$findme = '=';
$pos = strpos($actual_link, $findme) + 1; //find start of data to write
$data = substr($actual_link, $pos); //grab data from url
$safe = mysql_real_escape_string($data); //escape before building the SQL
$result = mysql_query("INSERT INTO test (testdata) VALUES ('$safe')");
// Check result
if ($result) {echo $data;}
else echo "Error ".mysql_error(); //mysql_* API, to match mysql_query above
mysql_close(); ?>
Edit:
Replaced image with PHP code.
I've learned how not to ask a question: don't use the word hack, as it ruffles people's feathers, and don't use an image for code.
I just don't get how to pass a long string to a formless PHP page, and whilst I appreciate people's responses, the answers about cURL don't make sense to me. From this page, it's not clear to me how you'd pass a string from a .NET app, for example. I clearly need to do lots of research, and I apologise for my asinine question(s).
The URL has a practical limit of about 2000 characters, so you should not be passing thousands of characters in the URL. The query portion of the URL is only meant to be used for a relatively short set of parameters.
Instead, you can build up a request body to send via cURL/jQuery/etc for POSTing. This is how a browser will submit form data, and you should probably do the same.
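As a rough sketch of what such a request body looks like, the snippet below builds one in PHP. The field name testdata and the URL come from the question. The helper name buildPostOptions is made up, and the actual send is commented out so the sketch runs offline. The same body format applies whatever language the client is written in.

```php
<?php
// Hypothetical helper: builds stream-context options for a form-style POST.
function buildPostOptions(array $fields): array
{
    return [
        'http' => [
            'method'  => 'POST',
            'header'  => "Content-Type: application/x-www-form-urlencoded\r\n",
            // http_build_query() URL-encodes each value, so commas, '&' and '='
            // inside the data cannot be mistaken for separators.
            'content' => http_build_query($fields),
        ],
    ];
}

$options = buildPostOptions(['testdata' => 'somedata,moredata,123,xyz']);
echo $options['http']['content'], "\n"; // testdata=somedata%2Cmoredata%2C123%2Cxyz

// Sending it (assumed URL from the question, not executed here):
// $response = file_get_contents(
//     'http://www.mysite.com/myscript.php',
//     false,
//     stream_context_create($options)
// );
```

On the receiving side the script would then read $_POST['testdata'] rather than slicing the URL, and the data is no longer subject to URL length limits.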
In your scenario, there are two important elements that you need to examine.
First, what is the client that is performing the http operation? I can't tell from your text if the client is going to be a browser, or an application. The client is whatever you have in your solution that is going to be invoking a GET or POST operation.
This is important. When you read about query string length limitations online, it's usually within the context of someone using a browser with a long URL. There is no standard across browsers for maximum URL length. But if you think about it in practical fashion, you'd never want to share an immensely large URL by posting it somewhere or sending it in an e-mail; having to do the cut-and-paste into a client browser would frustrate someone pretty quickly. On the other hand, if the client is an application, then it's just two machines exchanging data and there's really no human factor involved.
The second point to examine is the web server. The web server implementation may pose limitations on URL length, or maybe not. Again, there is no standard.
In any event, if you use a GET operation, your constraint will be the minimum of what both your client AND server allow (i.e. if both have no limit, you have no limit; if either has a limit of 200 bytes, your limit is 200 bytes; if one has a 200 byte limit and the other has a 400 byte limit, your limit is 200 bytes)
Taking a look back, you mentioned "desktop app" but have failed to tell us what language you're developing in, and what operating system. It matters -- that's your CLIENT.
Best of luck.
I am trying to get an HTML/PHP script to interpret data sent after the ? in a URL. I've seen sites that do this, YouTube being one of them. I thought this was called Post Data (not sure if it is). I've been searching for a few days, and all I can find is the PHP $_POST[''] with some HTML forms reading the data from a textbox, but I would like to read directly from the URL, e.g. www.example.com?ver=1
How would I go about doing this?
What you're looking for is called a query string. You can find that data in $_GET.
print_r($_GET);
If you need access to the raw data (and you probably don't, unless you need multiples for some variable names), check $_SERVER['QUERY_STRING'].
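The "multiples" case comes up when the same name appears more than once: $_GET keeps only the last value. Here is a minimal sketch of parsing the raw string yourself so that every value is kept. The function name parseQueryAllowingRepeats and the sample query are made up for illustration.

```php
<?php
// Hypothetical helper: parses a raw query string into name => list of values,
// keeping every occurrence of a repeated name.
function parseQueryAllowingRepeats(string $query): array
{
    $result = [];
    foreach (explode('&', $query) as $pair) {
        if ($pair === '') {
            continue; // ignore empty segments such as a trailing '&'
        }
        // Split on the first '=' only; a name without '=' gets an empty value.
        [$name, $value] = array_pad(explode('=', $pair, 2), 2, '');
        $result[urldecode($name)][] = urldecode($value);
    }
    return $result;
}

// In a real script: parseQueryAllowingRepeats($_SERVER['QUERY_STRING'] ?? '')
print_r(parseQueryAllowingRepeats('tag=php&tag=xml&ver=1'));
```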
You can't do that in HTML pages. In PHP pages, you can read (and process) the parameters using the $_GET array. This array contains everything that comes after the ? in the URL. Suppose we have a URL like
page.php?a=b&c=d
Then we can access the a and c parameters with $_GET['a'] and $_GET['c']. There is also $_POST, which works a bit differently. You can google it to find out more.
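For the example URL page.php?a=b&c=d, the splitting can be tried outside a web server with parse_str(), which parses a query string much as PHP does when it fills $_GET. This is only a small illustration; the variable names are arbitrary.

```php
<?php
// Everything after the ? in page.php?a=b&c=d
$queryString = 'a=b&c=d';

// parse_str() splits on & and = and stores the pairs in $params.
parse_str($queryString, $params);

echo $params['a'], "\n"; // b
echo $params['c'], "\n"; // d
```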
This isn't the best question ever, but since search engines feel the need to ignore symbols, I have to ask somewhere.
In a link, I'll sometimes see a ?, such as [link]/file.extension?some_type_of_info, or even +, &, =, etc. (the best example of what I mean is YouTube videos). What are these called and what do they do? A good site would be great too :)
I am mostly interested because I have a site that loads stuff into a page, and currently the way I allow 'bookmarking' a page (or more important to me, being able to go back a 'page') is use hash values to represent my 'page'.
Ultimately I would like to not have the page refresh, which is why hash values are good, but I'd like alternatives if any (not really what hashmarks are meant for, but mostly different browsers seem to treat assigning the hash values in jquery differently)
Again, sorry this is mostly just a "what is this" question, but if anyone could tell me pros/cons towards using the method in question versus hash values, that would be great also :)
See the URI specification (RFC 3986), in particular the section on syntax components:
URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
… and the definition of query.
The query component contains non-hierarchical data that, along with
data in the path component (Section 3.3), serves to identify a
resource within the scope of the URI's scheme and naming authority
(if any). The query component is indicated by the first question
mark ("?") character and terminated by a number sign ("#") character
or by the end of the URI.
Ultimately I would like to not have the page refresh
Use the history API. This is independent of the structure of the URL (other than having to be a URL on the same origin).
The part after the ? is called the query string. It's used to pass parameters to a web site. Parameters are separated using the & sign. For example, this would pass parameters to a site:
http://test.site.tld/index.php?parameter=value&another=anotherValue
This would pass the parameters "parameter" (with value "value") and the parameter "another" (with value "anotherValue") to the script index.php.
The + sign is sometimes used to represent a space. For example, "Hello World" could be represented as "Hello+World" or "Hello%20World".
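Both spellings of a space can be produced and read back in PHP: urlencode() emits the + form and rawurlencode() the %20 form, while urldecode() accepts either. A short illustration:

```php
<?php
$text = 'Hello World';

echo urlencode($text), "\n";    // Hello+World
echo rawurlencode($text), "\n"; // Hello%20World

// Decoding accepts both forms.
echo urldecode('Hello+World'), "\n";   // Hello World
echo urldecode('Hello%20World'), "\n"; // Hello World
```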
A # sign is used to jump directly to an anchor within the page. For example
http://test.site.tld/index.php#docs
Would jump to the anchor "docs" within the web site.
The ? in a URL introduces the query string, which is data provided to the server. Everything prior to the ? specifies the resource on the server (in theory), and everything after it is additional data.
So for example:
http://example.com/foo/bar/page.php?data=one
http://example.com/foo/bar/page.php?data=two
Both URLs cause the page.php page to be retrieved by the server, and since it's a PHP page, a properly-configured server will run the PHP code within it. That PHP code can access the query string data as one big string via $_SERVER['QUERY_STRING'], or as a series of name/value pairs (if that's what it is, it doesn't have to be) via $_GET['paramname']. Note that that's _GET because query string parameters are GET parameters; POST parameters are sent via a different mechanism (not available for just links; you need a form or similar).
The stuff at the end of the URL is a query string. The ? denotes the beginning of the query string, which uses key=value pairs separated by &.
To address your question of whether this can be used for bookmarking, I believe the approach you are currently using with URL hashes (#) is correct.
The ? part is a querystring, GET parameters are sent that way.
The more interesting part of your question is: how can I enable the back-button/history for users in a dynamic website? Check out this library: https://github.com/browserstate/History.js/
It enables you (for newer browsers) to get/set history states. Each dynamic page gets its own address. For older browsers, there is the hash-bang fallback (#/page/page).
I'm building a small search script for my website. I need to send data by the GET method because with POST it would get really messy, as I have to show many pages of search results.
So my question is: can I use the GET method directly? That is, do I need to encode the URL or do anything else?
I have checked it in modern browsers. It works just fine..
Thanks
Edit:
urlencode is used when putting variables in a URL.
I am submitting my search form with method='get'. Then I read the variable, perform the search query, and build the links to the other result pages from the variable's data.
- Length and size are not a problem.
Are you suggesting I should use the urlencode function only when making the new links?
You can and should use urlencode() on data that possibly contains spaces and other URL-unfriendly characters.
http://php.net/manual/en/function.urlencode.php
You need to URL encode the parameters in the URL, e.g. http://www.example.com/MyScript.php?MyVariable=%3FSome%20thing%3F.
Be aware that there's a limit to how much data can be sent via GET, and it is more restrictive on older browsers. Internet Explorer, for example, limits URLs to 2,083 characters, so if you think you're going to go over that, consider using POST or you may exclude some users.
You should use urlencode($variable) before putting the variable in a URL (even though the browser usually takes care of this for form submissions), so you can be sure special chars will be treated correctly. Note that PHP already decodes the values in $_GET, so calling urldecode($variable) on them again after receiving them is unnecessary and can corrupt data that contains % or +.
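For the paging links in the question, here is a minimal sketch of building each link from the submitted search term. The script name search.php, the parameter names q and page, and the function name buildSearchLink are all assumptions for illustration. http_build_query() applies urlencode() to every value.

```php
<?php
// Hypothetical helper: builds the link to one page of search results.
function buildSearchLink(string $query, int $page): string
{
    // http_build_query() URL-encodes names and values and joins them with '&'.
    return 'search.php?' . http_build_query(['q' => $query, 'page' => $page]);
}

// A search term containing characters that are special in URLs.
$link = buildSearchLink('c++ & php', 2);
echo $link, "\n"; // search.php?q=c%2B%2B+%26+php&page=2

// When printing the link into HTML, escape it for the attribute as well.
echo '<a href="' . htmlspecialchars($link) . '">Page 2</a>', "\n";
```

On the receiving page, $_GET['q'] already holds the decoded term, so no further decoding is needed before running the search.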