I am retrieving an encoded URL via the query string, and I need to pass it on again to the next page. When I retrieve it the first time, using $_REQUEST['url'], only the slashes are decoded, e.g.:
http://example.com/search~S10?/Xllamas&searchscope=10&SORT=D/Xllamas&searchscope=10&SORT=D&SUBKEY=llamas/51%2C64%2C64%2CB/browse
The php docs page for urldecode advises against decoding request data, and says that it will already be decoded. I need it either completely decoded, so I can encode it again without double-encoding some parts, or not decoded at all.
I'm not sure why my experience of this data is incongruous with the php docs. Appreciate any help or pointers to same!!
EDIT: attempt to post relevant code, which is scattered about:
the url is encoded and added to the querystring (in an html file using smarty template):
<a class="button" href="{$baseurl}search_nojs?searcharg={$searcharg|escape:'url'}&url={$next|escape:'url'}"><span>Next>></span></a>
if that link was followed, i'm grabbing the url back out of the querystring (in a php file):
if(array_key_exists('url', $_REQUEST)) {
$sm->assign("searchurl", $_REQUEST['url']);
}
Then I'd like to stick the url back into the querystring for the next link (in another html file):
href="{$baseurl}detail?bibid={$res.bibid}&searcharg={$searcharg}{if $searchurl}&searchurl={$searchurl}{/if}"
I'm also printing {$searchurl} straight onto the page, and getting the same half-escaped result.
Here is another example of the querystring vs. the data i get from $_REQUEST:
originally encoded url in querystring:
searcharg=mammals&url=http%3A%2F%2Fexample.com%2Fsearch%7ES10%3F%2FXmammals%26searchscope%3D10%26SORT%3DD%2FXmammals%26searchscope%3D10%26SORT%3DD%26SUBKEY%3Dmammals%2F51%252C1114%252C1114%252CB%2Fbrowse
data retrieved from $_REQUEST:
searcharg=mammals&searchurl=http://example.com/search~S10?/Xmammals&searchscope=10&SORT=D/Xmammals&searchscope=10&SORT=D&SUBKEY=mammals/51%2C1114%2C1114%2CB/browse
I know this method may seem curious -- I am trying to make a mobile display, working around a black-box database. Thanks again for any help!!
Here is another example of the querystring vs. the data i get from $_REQUEST:
originally encoded url in querystring:
searcharg=mammals&url=http%3A%2F%2Fexample.com%2Fsearch%7ES10%3F%2FXmammals%26searchscope%3D10%26SORT%3DD%2FXmammals%26searchscope%3D10%26SORT%3DD%26SUBKEY%3Dmammals%2F51%252C1114%252C1114%252CB%2Fbrowse
This is double encoded. For example: %252C -> %2C -> ,
So at the point that you encode the url parameter, you're introducing double encoding. Perhaps you should ensure that, before encoding parameters, you decode them until they can be decoded no more (aka canonicalisation). You could use urldecode in a loop for this.
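The decode-until-stable idea could be sketched roughly like this. It is only an illustration in JavaScript, using decodeURIComponent as a stand-in for PHP's urldecode (unlike urldecode, it does not turn + into a space). The helper name fullyDecode and the maxRounds cap are made up for this example. Note that repeated decoding will also alter values that legitimately contain a literal percent sequence, so it only suits data where that cannot happen.

```javascript
// Hypothetical helper: decode until the value stops changing.
function fullyDecode(value, maxRounds = 5) {
  let current = value;
  for (let round = 0; round < maxRounds; round++) {
    let decoded;
    try {
      decoded = decodeURIComponent(current);
    } catch (err) {
      // Malformed escape sequence (for example a lone "%"): keep what we have.
      break;
    }
    if (decoded === current) {
      break; // nothing left to decode
    }
    current = decoded;
  }
  return current;
}

// Tail of the double-encoded value from the question.
const doubleEncoded = "mammals%2F51%252C1114%252C1114%252CB%2Fbrowse";
const canonical = fullyDecode(doubleEncoded);
const encodedOnce = encodeURIComponent(canonical);

console.log(canonical);
console.log(encodedOnce);
```

After canonicalising, the value is encoded exactly once, so the next page only has to decode it once.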
You also want to ensure that when you put the url parameter back into html context (as a link) that you escape for HTML Attributes too. Otherwise you have an XSS vulnerability.
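As a rough sketch of the two layers (URL-encode the value first, then escape the finished link for the HTML attribute), again in JavaScript for illustration only. The helper escapeHtmlAttr, the bibid value and the attacker string are all invented for the example; in the real templates the templating engine's own escaping should be used.

```javascript
// Hypothetical helper: escape the characters that are special in HTML attributes.
function escapeHtmlAttr(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// A value an attacker might place in the url parameter.
const searchUrl = 'http://example.com/?a=1&b=2"><script>alert(1)</script>';

// Layer 1: URL-encode the parameter value.
const href = "detail?bibid=42&searchurl=" + encodeURIComponent(searchUrl);

// Layer 2: HTML-escape the whole href before writing it into the attribute.
const link = '<a href="' + escapeHtmlAttr(href) + '">Next</a>';

console.log(link);
```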
The comma (U+002C) is a reserved character in the query and thus must be encoded with %2C:
3.4. Query Component
The query component is a string of information to be interpreted by
the resource.
query = *uric
Within a query component, the characters ";", "/", "?", ":", "#",
"&", "=", "+", ",", and "$" are reserved.
I am sending the URL below with a query string. In the query string, the parameter "approverCmt" has a value containing a hash (#).
"/abc/efd/xyz.jas?approverCmt=Transaction Log #459505&batchNm=XS_10APR2015_082224&mfrNm=Timberland"
On the server side, when I try to retrieve the parameters from the request, I get:
approverCmt = Transaction Log -----> "#459505" is missing
batchNm = null
mfrNm = null
If I remove the hash (#) from the query string, or replace # with %23, everything works fine.
I don't understand why I get null for the other parameters when one parameter contains a hash (#) symbol.
I'd appreciate it if anyone can explain.
This is known as the "fragment identifier".
As mentioned in wikipedia:
The fragment identifier introduced by a hash mark # is the optional last part of a URL for a document. It is typically used to identify a portion of that document.
The part after the # is info for the client. It is not sent to the server. Put everything only the browser needs here.
You can use the encodeURIComponent() function in JavaScript to encode special characters in a parameter value. A # is then converted to %23, so you can be sure the whole URL is sent to the server.
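To see the effect, here is a small sketch using the standard URL class (available in browsers and Node.js). The host example.com is made up; the path and parameters are the ones from the question. It shows how a URL parser splits the unencoded version, and that the encoded version keeps all three parameters.

```javascript
// The host is an assumption; path and parameters come from the question.
const base = "http://example.com/abc/efd/xyz.jas";
const tail = "&batchNm=XS_10APR2015_082224&mfrNm=Timberland";

// Unencoded: everything from the first "#" onward becomes the fragment.
const rawUrl = new URL(base + "?approverCmt=Transaction Log #459505" + tail);
console.log(rawUrl.hash);
console.log(rawUrl.searchParams.get("approverCmt"));
console.log(rawUrl.searchParams.get("batchNm")); // null: it ended up in the fragment

// Encoded: "#" becomes %23 and stays inside the query string.
const fixedUrl = new URL(
  base + "?approverCmt=" + encodeURIComponent("Transaction Log #459505") + tail
);
console.log(fixedUrl.searchParams.get("approverCmt"));
console.log(fixedUrl.searchParams.get("batchNm"));
```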
The hash value is for the anchor, so it is client-side only. It is often used by client-side frameworks such as Angular for client-side routing.
The anchor is NOT available server-side.
In your case you don't want an anchor, but a parameter value containing a # breaks the query string. The value you intended is "Transaction Log #459505".
EDIT: Naive solution that doesn't work. I'm leaving it here for history; see the real solution below.
The solution is to encode client-side and decode server-side.
Encoding in javascript
encodeURI("Transaction Log #459505")
//result value "Transaction%20Log%20#459505"
Decode in Java
java.net.URLDecoder.decode("Transaction%20Log%20#459505");
//result "Transaction Log #459505"
EDIT: But JavaScript doesn't encode the same way Java does.
So the correct answer (I hope) is to manually replace all your # with %23, after which Java will decode it normally, or to use encodeURIComponent as suggested in the comments. For your needs the replace solution seems to be enough.
Encode in Javascript:
encodeURI("yourUrl/Transaction Log #459505").replace(/#/g,"%23") // the g flag replaces every #, not just the first
//result: yourUrl/Transaction%20Log%20%23459505
The decoding in Java doesn't change:
java.net.URLDecoder.decode("Transaction%20Log%20%23459505")
// result (java.lang.String) Transaction Log #459505
Sorry for the long post; I hadn't noticed the difference between Java and JavaScript URL encoding.
the hash is an anchor:
see wikipedia for more information
I have two links.
When both links are used in another page, the first link is decoded automatically by the GET method and the second is not.
The problem is that if there is a space in any parameter, GET doesn't decode the URL automatically; if there are no spaces, GET decodes the URL automatically, which is the correct behaviour.
Tip: the only encoded parameter is BodyStr, which is encoded with PHP's urlencode function.
Another tip: the only difference between the two links is the space in the Subjectstr parameter.
I want to know why spaces in the URL prevent the $_GET global variable from automatically decoding all the parameters.
$message=urlencode($message);
http://localhost/test4.php?me=ahmed&y=1&clientid=55&default=1&Subjectstr=**Email From Contactuspage**&BodyStr=$message
http://localhost/test4.php?me=ahmed&y=1&clientid=55&default=1&Subjectstr=**EmailFromContactuspage**&BodyStr=$message
Space isn't allowed in URL query strings. If you put an unencoded space in SubjectStr, the URL ends at that point, so the server never sees the BodyStr parameter.
You need to URL-encode SubjectStr. Replace the spaces with + or %20.
$message=urlencode($message);
$url = "http://localhost/test4.php?me=ahmed&y=1&clientid=55&default=1&Subjectstr=Email+From+Contactuspage&BodyStr=$message";
It stops at the space because of how the HTTP protocol works. The client sends:
GET <url> HTTP/1.1
This request line is parsed by looking for the space between the URL and the HTTP version token. If there's a space in the URL, that will be treated as the end of the URL.
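A deliberately simplified sketch of that parsing step, in JavaScript for illustration. parseRequestLine is a hypothetical helper, not how any particular server is implemented, and the BodyStr value "hello" is invented. Real clients and servers vary: many browsers encode spaces for you, and some servers reject such a request outright.

```javascript
// Hypothetical, simplified parser: METHOD SP REQUEST-TARGET SP HTTP-VERSION.
function parseRequestLine(line) {
  const [method, target, version] = line.split(" ");
  return { method, target, version };
}

// Unencoded spaces: the target is cut off at the first space.
const brokenLine = parseRequestLine(
  "GET /test4.php?Subjectstr=Email From Contactuspage&BodyStr=hello HTTP/1.1"
);
console.log(brokenLine.target);  // BodyStr never makes it into the target
console.log(brokenLine.version); // not an HTTP version at all

// Spaces encoded as "+": the whole query string survives.
const intactLine = parseRequestLine(
  "GET /test4.php?Subjectstr=Email+From+Contactuspage&BodyStr=hello HTTP/1.1"
);
console.log(intactLine.target);
console.log(intactLine.version);
```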
When connecting to PayPal I use a URL like this (I am using fake values here, but the structure is real):
https://www.paypal.com/cgi-bin/webscr?&business=ZDS346347&cmd=_xclick&amount=100&item_name=Test&no_note=1&no_shipping=1&rm=2&return=http://www.website.com/registration.php?paypal=1&classid=122&sessionid=264&studentid=2286
The problem is when I send this url, it truncates my return value query string from this:
paypal=1&classid=122&sessionid=264&studentid=2286
to this:
paypal=1
The ampersands in the return value are confusing it, but I need to use them so I can process those query string values on the return.
Is there some way I can pass that whole return string to PayPal so it won't truncate after the first ampersand it hits?
Thanks,
Chris
Wrap the passed URL with urlencode to turn the ampersands into PayPal-parsable characters, then when your URL gets called use urldecode to decode them.
This happens because PayPal's URL simply splits everything after the ? into chunks at each & symbol. It doesn't know whether a given & is part of your website's URL or not. So it's sending PayPal classid=122 as its own key/value pair, not as part of your URL. Encoding the URL this way should make it work correctly.
Edit: I referenced the wrong PHP functions at first. urlencode/urldecode are for GET parameter passing; htmlspecialchars is for escaping data for HTML output.
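The splitting behaviour can be reproduced with any generic query-string parser. The sketch below uses JavaScript's URL class with encodeURIComponent standing in for PHP's urlencode. It models a typical parser, not PayPal's actual code, and uses only a subset of the parameters from the question.

```javascript
const returnUrl =
  "http://www.website.com/registration.php?paypal=1&classid=122&sessionid=264&studentid=2286";
const payPalBase =
  "https://www.paypal.com/cgi-bin/webscr?business=ZDS346347&cmd=_xclick&return=";

// Unencoded: the inner ampersands are treated as separators of the outer query.
const naiveParams = new URL(payPalBase + returnUrl).searchParams;
console.log(naiveParams.get("return"));  // truncated after paypal=1
console.log(naiveParams.get("classid")); // leaked out as its own parameter

// Encoded: the whole return URL travels as a single value.
const safeParams = new URL(payPalBase + encodeURIComponent(returnUrl)).searchParams;
console.log(safeParams.get("return"));
console.log(safeParams.get("classid")); // null
```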
I am processing an activation link, which looks like this:
localhost/actvte/validate.php?type=activate&geo=define&
value=227755RYQBENU5G8WE7RFPO6CD6Z#MJ1H1FA#G#IZWZ53903
&target=loaded&resrc=G6MYMI2R67727229911380184297841084713071U8VUYIGR
&master=user#gmail.com
But when I use $_GET['value'] I get only "227755RYQBENU5G8WE7RFPO6CD6Z" as output, and the rest of the link becomes useless. If I do
echo $_GET['target']; or echo $_GET['master'];
it says undefined variable 'target' or 'master'.
So how can I process this long link?
What you should do is apply PHP's urlencode() function to the value before putting it in the query string. That way your string becomes 227755RYQBENU5G8WE7RFPO6CD6Z%23MJ1H1FA%23G%23IZWZ53903 instead of 227755RYQBENU5G8WE7RFPO6CD6Z#MJ1H1FA#G#IZWZ53903, because special characters such as # cannot be used unencoded in the query string.
A hash, for example, will never even be sent by the browser to the server, so nothing after it will reach you.
Please look at RFC 3986 for more information about URI syntax (including hashes).
By not using or properly encoding the fragment identifier (#).
You should not use # in the URL but encode it some way, or use another character.
The first # in a URL indicates the start of the fragment identifier. If you want to send it as data rather than as a separator component of a URL, then you need to express it as %23.
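One way to avoid encoding each value by hand is to build the query string from key/value pairs and let the library do the escaping. The sketch below uses JavaScript's URLSearchParams purely as an illustration (in PHP, http_build_query plays a similar role). The http:// scheme on the link is an assumption; the values are the ones from the question.

```javascript
// Raw values exactly as they should arrive on the server.
const activation = new URLSearchParams({
  type: "activate",
  geo: "define",
  value: "227755RYQBENU5G8WE7RFPO6CD6Z#MJ1H1FA#G#IZWZ53903",
  target: "loaded",
  resrc: "G6MYMI2R67727229911380184297841084713071U8VUYIGR",
  master: "user#gmail.com",
});

// toString() percent-encodes every "#" as %23.
const activationLink =
  "http://localhost/actvte/validate.php?" + activation.toString();
console.log(activationLink);

// Parsing the link again gives back the original values.
const receivedParams = new URL(activationLink).searchParams;
console.log(receivedParams.get("value"));
console.log(receivedParams.get("master"));
```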
I am working with an XML feed that has, as one of its nodes, a URL string similar to the following:
http://aflite.co.uk/track/?aid=13414&mid=32532&dl=http://www.google.com/&aref=chris
I understand that ampersands cause a lot of problems in XML and should be escaped by using &amp; instead of a naked &. I therefore changed the php to read as follows:
<node><?php echo ('http://aflite.co.uk/track/?aid=13414&amp;mid=32532&amp;dl=http://www.google.com/&amp;aref=chris'); ?></node>
However when this generates the XML feed, the string appears with the full &amp;
and so the actual URL does not work. Apologies if this is a very basic misunderstanding but some guidance would be great.
I've also tried using %26 instead of & but still getting the same problem.
If you are inserting something into XML/HTML, you should always use the htmlspecialchars function. It escapes your strings into correct XML syntax.
But you are running into a second problem: you have embedded a second URL inside the first one.
That one also needs to be escaped, this time into URL syntax.
For this you need to use urlencode.
<node><?php echo htmlspecialchars('http://aflite.co.uk/track/?aid=13414&mid=32532&aref=chris&dl='.urlencode('http://www.google.com/')); ?></node>
&amp; is correct for escaping ampersands in an XML document. The example you've given should work.
You state that it doesn't work, but you haven't stated what application you're using, or in what way it doesn't work. What exactly happens when you click the link? Do the &amp; strings end up in the browser's URL field? If that's the case, it sounds like a fault with the software you're viewing the XML with. Have you tried looking at the XML in another application to see if the problem is consistent?
To answer the final part of your question: %26 would definitely not work for you. It is what you'd use if the values of your URL parameters needed to contain ampersands. Say, for example, in aref=chris, the name chris were to contain an ampersand (let's say the username was chris&bob); then that ampersand would need to be escaped using %26 so that the URL parser didn't see it as starting a new URL parameter.
Hope that helps.
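The two kinds of escaping can be seen side by side in a short sketch. It is written in JavaScript for illustration only: escapeXml is a hypothetical helper playing the role of htmlspecialchars, encodeURIComponent plays the role of urlencode, and the username chris&bob is the made-up example from above.

```javascript
// Hypothetical helper: escape the characters that are special in XML text.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Layer 1 (URL): an ampersand inside a value becomes %26.
const trackUrl =
  "http://aflite.co.uk/track/?aid=13414&mid=32532" +
  "&aref=" + encodeURIComponent("chris&bob") +
  "&dl=" + encodeURIComponent("http://www.google.com/");

// Layer 2 (XML): the separators between parameters become &amp;.
const xmlNode = "<node>" + escapeXml(trackUrl) + "</node>";

console.log(trackUrl);
console.log(xmlNode);
```

An XML parser turns each &amp; back into & when it reads the node, so the consumer of the feed sees the URL from layer 1 again.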