$_SERVER['QUERY_STRING'] does not print unicode values as it is - php

http://localhost/fw/api/fw_api.php?rule=unicode&action=create&phrase=යුනිකෝඩ්
I am accessing the above URL. In fw_api.php, when I echo $_SERVER['QUERY_STRING'], it does not give the actual value of my Unicode phrase "යුනිකෝඩ්" as it appears in the URL. Is there a fix for this, or am I doing or expecting something wrong here? Need help.
header ('Content-type: text/html; charset=utf-8');
echo $_GET['phrase'];
echo $_SERVER['QUERY_STRING'];
die;
Actual Result:
යුනිකෝඩ්
rule=unicode&action=create&phrase=%E0%B6%BA%E0%B7%94%E0%B6%B1%E0%B7%92%E0%B6%9A%E0%B7%9D%E0%B6%A9%E0%B7%8A
What I expected
යුනිකෝඩ්
rule=unicode&action=create&phrase=යුනිකෝඩ්

The actual value is actually "%E0%B6%BA%E0%B7%94%E0..."!
URLs must consist of a subset of ASCII, they cannot contain other "Unicode characters". Your browser may be so nice as to let you input arbitrary Unicode characters and actually display them as characters, but behind the scenes the URL value is percent encoded. You'll have to decode it with rawurldecode.
The query string is automatically being parsed and decoded by PHP and placed in the $_GET array (and $_POST for the request body). But the raw query string you'll have to parse and decode yourself.
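As a minimal sketch of the two options described above, the snippet below decodes the raw query string from the question in two ways. The variable names are illustrative only. Note that calling rawurldecode() on the whole string is only suitable for display, since an encoded & or = inside a value would become indistinguishable from a separator.

```php
<?php
// Raw query string as the server receives it (copied from the question's output).
$rawQuery = 'rule=unicode&action=create&phrase='
    . '%E0%B6%BA%E0%B7%94%E0%B6%B1%E0%B7%92%E0%B6%9A%E0%B7%9D%E0%B6%A9%E0%B7%8A';

// Option 1: decode the whole string, for display purposes only.
$readable = rawurldecode($rawQuery);

// Option 2: parse it into an array; parse_str() decodes each value itself.
parse_str($rawQuery, $params);

echo $readable, PHP_EOL;          // query string with the phrase as readable UTF-8
echo $params['phrase'], PHP_EOL;  // the phrase on its own
```

With a UTF-8 Content-Type header (as in the question), both lines display the Sinhala text rather than percent escapes.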

Encode a value with special characters.
$token = "a{l#3a3s9a";
rawurlencode($token); // The encoded value would be "a%7Bl%233a3s9a"
Send the encoded value to the database
Receive the parameter value by URL
$body = file_get_contents("php://input");
if ($body == null && isset($_SERVER['QUERY_STRING'])) {
parse_str($_SERVER['QUERY_STRING'], $this->parameters);
return;
}
The parameter values are automatically decoded by parse_str(), without the need to call rawurldecode().
Use the value obtained by URL ("a{l#3a3s9a")
This encoding would be used to obtain special characters through a URL segment.
GL
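The steps of this answer can be sketched end to end as follows. The parameter name token and the variable names are assumptions made for the example; the answer's own code reads the query string from $_SERVER['QUERY_STRING'] instead of building it by hand.

```php
<?php
// Value containing characters that are not allowed literally in a URL.
$token = 'a{l#3a3s9a';

// Encode it before placing it in a URL: "{" becomes %7B and "#" becomes %23.
$encoded = rawurlencode($token);

// Hypothetical query string, as it would arrive in $_SERVER['QUERY_STRING'].
$queryString = 'token=' . $encoded;

// parse_str() splits the pairs and percent-decodes the values.
parse_str($queryString, $parameters);

echo $encoded, PHP_EOL;              // a%7Bl%233a3s9a
echo $parameters['token'], PHP_EOL;  // a{l#3a3s9a
```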

Related

PHP converting URL string to ASCII characters

I have a string of Characters that is passed in a URL.
The string happens to contain a group of characters that is equivalent to an ASCII code.
When I try to use the string on the page via $_GET, the part of the string that looks like an encoded character is converted to that character instead of being passed through as the actual text.
For example, the URL contains the string Name='%bert%'. But when I echo $_GET['Name'] I get '¾rt%' instead of '%bert%'. How can I get the actual text?
You're not escaping your data properly.
If you want to use %bert% in a URL, you need to encode your % as %25, making your query string value %25bert%25.
% in a URL means that the next two characters are going to be some encoded entity, so if you want to use it literally, it must be encoded this way.
You can read more information here: http://www.blooberry.com/indexdot/html/topics/urlencoding.htm
Try passing Name='%25bert%25' instead of Name='%bert%'.
Note: %25 is the escaped form of a literal % in a URL query string.
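A short sketch of why the unescaped value breaks, using only the example string from the question. In '%bert%' the sequence %be happens to be a valid percent escape, so it is decoded to the single byte 0xBE, while the trailing % has no two hex digits after it and is left alone.

```php
<?php
$name = '%bert%';

// Correct: escape the literal percent signs before putting the value in a URL.
$encoded = rawurlencode($name);          // "%25bert%25"
$roundTrip = rawurldecode($encoded);     // back to "%bert%"

// Incorrect: sending the value unescaped means "%be" is treated as an escape.
$broken = rawurldecode($name);

echo $encoded, PHP_EOL;                  // %25bert%25
echo $roundTrip, PHP_EOL;                // %bert%
echo bin2hex($broken), PHP_EOL;          // be727425 (byte 0xBE, then "rt%")
```

Byte 0xBE is ¾ in ISO-8859-1 and Windows-1252, which matches the output reported in the question.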

PHP - $_GET - decode utf-8

The documentation on this page http://ru2.php.net/manual/en/function.urldecode.php says that "The superglobals $_GET and $_REQUEST are already decoded".
But on my server this code
var_dump($_GET['str'])
returns
string(21) "&#1092;&#1092;&#1092;"
How can I make php decode strings in $_GET ?
You should set correct header content-type on pages with form:
header('Content-Type: text/html; charset="UTF-8"');
And you should get correct data from $_GET without any decoding operations.
As @deceze states, that string is already decoded. But if you want to transform it into readable characters, use html_entity_decode().
$string = '&#1092;&#1092;&#1092;';
echo html_entity_decode($string);
returns
ффф
Example: http://3v4l.org/eqDf3
That is decoded. The value is already decoded from its URL percent encoded form. The original was likely:
%26%231092%3B%26%231092%3B%26%231092%3B
It has now been decoded to:
&#1092;&#1092;&#1092;
The content of the string is escaped HTML. If you're sending escaped HTML, you'll get escaped HTML. If you don't like escaped HTML, don't send escaped HTML. PHP is not going to try every possible encoding format recursively on URL values until nothing more can be decoded.
The number after &# is a decimal unicode code-point which is unrelated to UTF-8.
According to http://www.utf8-chartable.de/unicode-utf8-table.pl?start=1024&number=1024&unicodeinhtml=dec, your character is:
U+0444 ф d1 84 &#1092; &#1092; CYRILLIC SMALL LETTER EF
Here, d1 84 is the UTF-8 representation for it.
As mentioned earlier, html_entity_decode("&#1092;&#1092;&#1092;", ENT_QUOTES, 'UTF-8') should do the trick.
It returns the following string:
'ÐäÐäÐä'
Its hexadecimal representation can be checked like this:
>> bin2hex($s)
'd184d184d184'
It is indeed correct according to the table quoted previously.
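The following sketch puts the pieces of this thread together, assuming the value received in $_GET['str'] is the 21-byte string of numeric HTML entities from the question. The variable names are illustrative.

```php
<?php
// What PHP hands over in $_GET['str']: already URL-decoded, but still
// HTML-escaped (three numeric character references, 7 bytes each).
$escaped = '&#1092;&#1092;&#1092;';

// Turn the numeric references into real characters, encoded as UTF-8.
$decoded = html_entity_decode($escaped, ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo strlen($escaped), PHP_EOL;   // 21
echo strlen($decoded), PHP_EOL;   // 6 (three characters, two bytes each)
echo bin2hex($decoded), PHP_EOL;  // d184d184d184
```

The byte count drops from 21 to 6 because each 7-byte reference becomes the 2-byte UTF-8 sequence d1 84.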

Translate URLENCODED data into UTF-8 in PHP

I've got a string that is in my database like 中华武魂 when I post my request to retrieve the data via my website I'm getting the data to the server in the format %E4%B8%AD%E5%8D%8E%E6%AD%A6%E9%AD%82
What decoding steps do I have to take in order to get it back into a usable form?
While also cleaning the user input to ensure they're not going to try an SQL injection attack?
(escape string before or after encoding?)
EDIT:
rawurldecode(); // returns "中åŽæ­¦é­‚"
urldecode(); // returns "中åŽæ­¦é­‚"
public function utf8_urldecode($str) {
$str = preg_replace("/%u([0-9a-f]{3,4})/i","&#x\\1;",urldecode($str));
return html_entity_decode($str,null,'UTF-8');
}
// returns "中åŽæ­¦é­‚"
... which actually works when I try and use it in an SQL statement.
I think this is because I was doing an echo and die(); without sending a UTF-8 header, so I guess the browser was displaying the output as Latin-1.
Thanks for the help!
When your data is actually that percent-encoded form, you just have to call rawurldecode:
$data = '%E4%B8%AD%E5%8D%8E%E6%AD%A6%E9%AD%82';
$str = rawurldecode($data);
This suffices as the data already is encoded in UTF-8: 中 (U+4E2D) is encoded with the byte sequence 0xE4B8AD in UTF-8 and that is encoded with %E4%B8%AD when using the percent-encoding.
That your output does not look as expected is probably because it is being interpreted with the wrong character encoding, probably Windows-1252 instead of UTF-8. In Windows-1252, 0xE4 represents ä, 0xB8 represents ¸, 0xAD is an invisible soft hyphen, 0xE5 represents å, and so on. So make sure to specify the output character encoding properly.
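To check that the decoded bytes really are the expected UTF-8 text, independent of how a browser or console renders them, one option is to inspect them with bin2hex(). This is only a sketch; the validity check via preg_match() with the u modifier is one possible approach, not something the answer prescribes.

```php
<?php
$data = '%E4%B8%AD%E5%8D%8E%E6%AD%A6%E9%AD%82';
$str = rawurldecode($data);

// The raw bytes: four characters, three bytes each in UTF-8.
echo bin2hex($str), PHP_EOL;   // e4b8ade58d8ee6ada6e9ad82

// preg_match() with the "u" modifier fails on invalid UTF-8, so a result of 1
// here means the decoded bytes form a valid UTF-8 string.
$isValidUtf8 = preg_match('//u', $str) === 1;
echo $isValidUtf8 ? 'valid UTF-8' : 'not UTF-8', PHP_EOL;

// When serving this to a browser, declare the encoding before any output:
// header('Content-Type: text/html; charset=utf-8');
echo $str, PHP_EOL;
```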
Use PHP's urldecode:
http://php.net/manual/en/function.urldecode.php
You have choices here: urldecode or rawurldecode.
If you encoded your string using urlencode, you must use urldecode because of the way spaces are handled: urlencode converts spaces to +, whereas rawurlencode converts them to %20 and rawurldecode leaves a literal + untouched.
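The difference in space handling can be seen in a small sketch. The sample value is made up for the illustration.

```php
<?php
$value = 'a b+c';

$form = urlencode($value);     // space -> "+",   "+" -> "%2B"
$raw  = rawurlencode($value);  // space -> "%20", "+" -> "%2B"

echo $form, PHP_EOL;               // a+b%2Bc
echo $raw, PHP_EOL;                // a%20b%2Bc

// Matching pairs round-trip cleanly.
echo urldecode($form), PHP_EOL;    // a b+c
echo rawurldecode($raw), PHP_EOL;  // a b+c

// Mismatched pair: rawurldecode() does not turn "+" back into a space,
// so the original space comes out as a literal plus sign.
$mismatch = rawurldecode($form);
echo $mismatch, PHP_EOL;           // a+b+c
```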

Pass text with special characters as get parameter in php

I want to pass any text as a GET parameter to a PHP script. For now I just append the text this way:
action.php?text=Hello+my+name+is+bob
This URL is composed by JavaScript, and I do an Ajax request with it.
In action.php I do
$encoded = array_map('rawurlencode', $_GET);
But this does not work for special chars like ÖÄüä.
Any idea how to solve this?
urlencode(string) returns the given string with special characters converted into %XX format.
http://us3.php.net/manual/en/function.urlencode.php
I know you can send special characters without encoding through $_POST, which is another alternative.
Try urlencode().
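The encoding has to happen where the URL is built, not in action.php, because $_GET is already decoded by the time the script runs. In the question the URL is built in JavaScript, where encodeURIComponent() plays this role. The sketch below shows the same idea in PHP under that assumption; the sample text is made up, and the \u{...} escapes are used only so the example does not depend on the file's own encoding.

```php
<?php
// Sample text with the umlauts from the question (Ö Ä ü ä), as UTF-8.
$text = "Hello my name is \u{D6}\u{C4}\u{FC}\u{E4}";

// Build the query string on the sending side; PHP_QUERY_RFC3986 makes
// http_build_query() encode values the way rawurlencode() does.
$query = http_build_query(['text' => $text], '', '&', PHP_QUERY_RFC3986);
$url = 'action.php?' . $query;

// On the receiving side PHP decodes automatically; parse_str() stands in
// for what ends up in $_GET.
parse_str($query, $received);

echo $url, PHP_EOL;
// action.php?text=Hello%20my%20name%20is%20%C3%96%C3%84%C3%BC%C3%A4
echo $received['text'], PHP_EOL;
```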

get values in url

I need to get the values in the URL exactly as they are.
For example:
http://www.example.com/index?url=1+LY2ePh1pjX4tjZ4+GS393Y2pjd16Cbq63T3tbfzMzd16CarA==
but the variable url gives me the value "1 LY2ePh1pjX4tjZ4 GS393Y2pjd16Cbq63T3tbfzMzd16CarA=="
even though I expected "1+LY2ePh1pjX4tjZ4+GS393Y2pjd16Cbq63T3tbfzMzd16CarA==".
Can anyone help me with this, or does anyone know the reason?
You need to encode certain characters if you want to send them in a URL. For further reference, I suggest reading this page. It seems that the URL you are getting isn't encoded properly. If the URL comes from your own site, I would suggest encoding it properly there.
In PHP, there is a function called urlencode, which may help you with this task.
A short explanation
URLs can only be sent over the Internet using the ASCII character set. If you want to send characters outside this set, you need to encode them. URL encoding replaces unsafe characters with % followed by two hexadecimal digits giving the byte value; a non-ASCII character is typically sent as one such escape per byte of its UTF-8 encoding.
The client sending the request apparently isn't URL encoding the value correctly. You can re-encode it after it has been decoded, like this:
urlencode($_GET["url"])
It converts %2B to a space.
The parameter you sent is wrong; it should have been encoded like so:
<?php
echo '<a href="http://www.example.com/index?url=', urlencode('1+LY2ePh1pjX4tjZ4+GS393Y2pjd16Cbq63T3tbfzMzd16CarA=='), '">';
?>
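A sketch of the round trip with the value from the question. With a single decode, %2B comes back as a plus sign. One possible cause of a plus sign still turning into a space (an assumption, not something confirmed in the thread) is decoding twice, for example calling urldecode() on a value taken from $_GET, which PHP has already decoded.

```php
<?php
$value = '1+LY2ePh1pjX4tjZ4+GS393Y2pjd16Cbq63T3tbfzMzd16CarA==';

// Sending side: "+" becomes %2B and "=" becomes %3D.
$encoded = urlencode($value);

// Receiving side: parse_str() stands in for PHP filling $_GET (one decode).
parse_str('url=' . $encoded, $params);
$once = $params['url'];

// Decoding a second time treats the restored "+" as an encoded space.
$twice = urldecode($once);

echo $encoded, PHP_EOL;
echo $once, PHP_EOL;   // identical to the original value
echo $twice, PHP_EOL;  // plus signs replaced by spaces
```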
I have added the encoding correctly now. It converts == correctly, and the + sign is encoded to %2B correctly, but in the decoding process it is converted to a space.
As it seems that you’re having a Base-64 value there: You can use the URL safe alphabet for Base-64 that uses - and _ instead of + and / respectively:
$base64 = "1+LY2ePh1pjX4tjZ4+GS393Y2pjd16Cbq63T3tbfzMzd16CarA==";
// plain Base-64 to URL safe Base-64
$base64_safe = strtr($base64, '+/', '-_');
// URL safe Base-64 to plain Base-64
$base64 = strtr($base64_safe, '-_', '+/');
And if you know the length of the data, you can also omit the = padding:
rtrim($base64, '=')
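The two conversions above can be wrapped into a pair of helpers. The function names are hypothetical, and restoring the padding from the string length is an addition made for this sketch so that the stripped = characters can be put back before decoding.

```php
<?php
// Hypothetical helper: plain Base-64 to the URL-safe alphabet, padding removed.
function toUrlSafeBase64(string $base64): string
{
    return rtrim(strtr($base64, '+/', '-_'), '=');
}

// Hypothetical helper: URL-safe Base-64 back to plain Base-64 with padding.
function fromUrlSafeBase64(string $safe): string
{
    $base64 = strtr($safe, '-_', '+/');
    // Base-64 text is always a multiple of 4 characters long once padded.
    $missing = (4 - strlen($base64) % 4) % 4;
    return $base64 . str_repeat('=', $missing);
}

$base64 = '1+LY2ePh1pjX4tjZ4+GS393Y2pjd16Cbq63T3tbfzMzd16CarA==';
$safe = toUrlSafeBase64($base64);
$restored = fromUrlSafeBase64($safe);

echo $safe, PHP_EOL;      // 1-LY2ePh1pjX4tjZ4-GS393Y2pjd16Cbq63T3tbfzMzd16CarA
echo $restored, PHP_EOL;  // same as $base64
```

Because the URL-safe form contains no +, / or =, it survives a query string unchanged even when it is not percent-encoded.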
