I have an issue with charsets and how they are encoded in a request I send. I have a test case where I want the code to end up with the exact same md5-hash on both sides. While still being the character 'å', obviously. (So not converted into some broken char or just '?')
The source input is UTF-8 and contains a Norwegian character, for example "båt".
This input will then be sent to an API that wants data to be latin1 / ISO-8859-1.
One goal is also to avoid having to add utf8_decode to the receiving end.
So this is the very simplified code of what I've sent until now:
$password_send = 'båt';
echo "Test 1: " . md5( $password_send ) . "\n";
$params = array('password' => $password_send);
$request = xmlrpc_encode_request($module, $params);
And this is how the receiving end treats it. It basically just converts it into an md5 hash and sends it to another method. No other conversion of the incoming data has been done.
$_hash = md5( $password_receive );
echo "Test 2: {$_hash}\n";
Member::updatePassword($member_id, $_hash);
When 'båt' is sent, I need $_hash to end up as 7e2cdd98fccee62723784a815a2ecdcb, since this is the md5-hash that 'båt' resolves into when the password 'båt' is saved on the site itself (and not through the API).
So when I send 'båt' in the API request, the receiving end ends up with fd9cac747daca144726dc579c32f48ae, which is wrong. When I check the md5() of 'båt' before I send it, it is also displayed as fd9cac747daca144726dc579c32f48ae.
I guess this is expected, since I don't use utf8_decode yet, but if I change what I send, like so: $password_send = utf8_decode('båt');
Then the receiving end still doesn't get the correct hash; it ends up with: b865deb1e3b0891a41c5444c00893a0f
However, if I also add utf8_decode on the receiving end, like so: $_hash = md5( utf8_decode( $password_receive ) ), then it ends up with the hash I need: 7e2cdd98fccee62723784a815a2ecdcb
But this seems very wrong... Having to do utf8_decode on both sides. And while this hash is now correct on the receiving end too, the issue is that I don't want to change any code on the receiving end. And it doesn't work to just do utf8_decode twice before I send the value, because then I just end up with the hash c2d1fbc45e123f65edd74401ef58dd6a on the receiving end (which is the equivalent of doing md5('b?t')). It only works when I do utf8_decode once before I send it, and once on the receiving end.
So I started to realize that xmlrpc_encode_request is probably the culprit, in that it may do some conversion on its own. First I checked what a var_dump of $request said in the case where the $password_send value has NOT been utf8_decoded. And that is:
<string>b&#195;&#165;t</string>
When I do utf8_decode on the value $password_send before it's made into an xmlrpc request, then it is:
<string>b&#229;t</string>
Then I read the documentation on xmlrpc_encode_request. And I've tried various combinations of output_options, but none of them seems to work. In every scenario I still have to do utf8_decode in the code on the input data on the receiver end to end up with the exact same md5 that I need.
I realize this might be somewhat confusing. I would really really appreciate it if someone is able to help me out here. By giving me some pointer on what I should do or try. Because I've gotten completely lost on this issue now :(
The problem seems to be the escaping done by the xmlrpc_encode_request function. I have the same problem, albeit with the Czech "ě" character. I believe it might be a bug in PHP; however, I found a simple workaround.
Just turn off escaping of non-print and non-ASCII characters.
echo xmlrpc_encode_request('test', 'å'); //Ã¥ - incorrect
echo xmlrpc_encode_request('test', 'å', ['escaping' => 'markup']); //å - correct
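To see why the two sides disagree, it can help to look at the raw bytes rather than the rendered characters. The following is only a minimal sketch and assumes the mbstring extension is loaded; the byte strings are written out explicitly so the result does not depend on how the source file is saved.

```php
<?php
// 'båt' as UTF-8: "å" is the two bytes C3 A5.
$utf8 = "b\xC3\xA5t";
// 'båt' as ISO-8859-1: "å" is the single byte E5.
$latin1 = "b\xE5t";

// Converting once from UTF-8 yields exactly the Latin-1 bytes.
$converted = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');

echo bin2hex($utf8), "\n";      // 62c3a574
echo bin2hex($converted), "\n"; // 62e574

// md5() hashes bytes, not characters, so the two encodings hash differently.
var_dump(md5($utf8) === md5($latin1));      // bool(false)
var_dump(md5($converted) === md5($latin1)); // bool(true)
```

Whatever sits between the two ends (here the XML-RPC encoding step) has to deliver the same bytes that were hashed when the password was first stored, otherwise the hashes cannot match.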
I have a site where anyone can leave comments.
When a comment is submitted, the browser makes an ajax request to a PHP script, sending encodeURIComponent-ed data to it.
Earlier, in the PHP script, I added
$data = str_replace("\n","\\n",str_replace("\"","\\\"",$_POST["text"]));
Now I’ve been testing by inputting random stuff and found an exploit: if I input %00, it is added to my comments file as a null terminator and corrupts my data. Other percent-encoded values are decoded as well.
I am sending data as a regular application/x-www-form-urlencoded.
How to fix that?
So, the solution I’ve made so far is:
$_POST["text"] = str_replace("\"","\\\"",$_POST["text"]);
for($i=0;$i<=40;$i++)
if(chr($i)!="\n"&&chr($i)!="\r"&&chr($i)!=" "&&chr($i)!="("&&chr($i)!="&"&&chr($i)!="!"&&chr($i)!="\""&&chr($i)!="'")
$_POST["text"] = str_replace(chr($i),"",$_POST["text"]);
$_POST["text"] = str_replace("\\","",$_POST["text"]);
It just removes all special and potentially malicious non-readable characters, with some exceptions (newlines, ampersands etc.).
The only issue with this solution is that it also removes backslashes (but it successfully writes the data).
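A narrower filter may avoid losing the backslash. The sketch below is one possible approach, not a complete sanitizer: it strips only ASCII control bytes (including the null byte produced by %00) and keeps tab, newline and carriage return. The function name strip_control_chars is made up for this example.

```php
<?php
// Hypothetical helper: remove ASCII control bytes except \t, \n and \r.
function strip_control_chars(string $text): string
{
    // \x00-\x08, \x0B, \x0C, \x0E-\x1F and \x7F are control bytes;
    // \x09 (tab), \x0A (newline) and \x0D (carriage return) are left alone.
    return preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/', '', $text);
}

$input = "first\x00 line with a \\ backslash\nsecond line";
echo strip_control_chars($input), "\n";
```

These byte values never occur inside a multi-byte UTF-8 sequence, so the filter should not damage non-ASCII text. Quotes and backslashes still need whatever escaping the comments file format requires.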
I have an api logging system which records logins but I do not want to store passwords in the logs.
This is an example of a request string to the log:
NOTE: the string will not be exactly the same and will contain the parameters in a different order, so I am thinking maybe some regex can handle this?
api.my.geatapim/live/?action=login_user&username=joe#bloggs.com&password=PassWord&session_length=10080
What I need to do, is:
Detect if the parameter "password=" is in the string
If it's in the string, replace the password part with OBFUSCATED so the result will be:
api.my.geatapim/live/?action=login_user&username=joe#bloggs.com&password=OBFUSCATED&session_length=10080
I have tried this but it does not work: $request_string = preg_replace("/password=\d+/", "password=OBFUSCATED", $request_string);
The Expression
\d+ is for digits ([0-9]). You'll want to include more character sets for the password, considering the one you provided is using [A-Za-z].
$request_string = preg_replace("/password=\w+/", "password=OBFUSCATED", $request_string);
However, a typical password will use a bigger character set than [a-zA-Z0-9_], including special characters. Since it's in a URL, it will possibly be urlencode()'d (for example, P&ssW0rd! will become P%26ssW0rd!), so it is safer to match everything up to the next & instead:
$request_string = preg_replace("/password=[^&]+/", "password=OBFUSCATED", $request_string);
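The last expression can be wrapped in a small helper and checked against both an unencoded and a URL-encoded password. This is just a sketch; the function name obfuscate_password is invented for the example, and the \b and the * quantifier are small additions to the pattern above.

```php
<?php
// Hypothetical helper: mask the value of the "password" query parameter.
function obfuscate_password(string $request): string
{
    // \b keeps names such as "old_password" from matching;
    // [^&]* consumes the value up to the next parameter or the end of the string.
    return preg_replace('/\bpassword=[^&]*/', 'password=OBFUSCATED', $request);
}

echo obfuscate_password('api.my.geatapim/live/?action=login_user&username=joe#bloggs.com&password=PassWord&session_length=10080'), "\n";
echo obfuscate_password('action=login_user&password=P%26ssW0rd!'), "\n";
```

Using * instead of + also masks an empty password value, which may or may not be what you want in the log.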
"I do not want to store passwords in the logs."
This logic won't modify what is put into your Apache/Nginx/Whatever access_log (unless you write these logs to /dev/null or another void place). You can also keep passwords out of the logs by changing the request from an HTTP GET to an HTTP POST (or HTTP PUT) with the credentials in the body, or by using HTTP Authentication headers.
Although your question is quite easy to solve, it has nothing to do with your actual problem. You simply should never transfer password data via $_GET - it's one of the big no-nos of handling credentials. - Franz Gleichmann
Try this code, it works
<?php
$request_string = "api.my.geatapim/live/?action=login_user&username=joe#bloggs.com&password=PassWord&session_length=10080";
echo $request_string = preg_replace("/password=\w+/", "password=OBFUSCATED", $request_string);
?>
Output : api.my.geatapim/live/?action=login_user&username=joe#bloggs.com&password=OBFUSCATED&session_length=10080
I'm having some trouble with my $_POST/$_REQUEST data; it still appears to be utf8-encoded.
I am sending conventional ajax post requests, in these conditions:
oXhr.setRequestHeader("Content-type", "application/x-www-form-urlencoded; charset=utf-8");
js file saved under utf8-nobom format
meta tags set up in the html <head> tag
php files saved under utf-8-nobom format as well
encodeURIComponent is used but I tried without and it gives the same result
Ok, so everything is fine: the database is also in utf8, and receives it this way, pages show well.
But when I'm receiving the character "º" for example (through $_REQUEST or $_POST), its binary representation is 11000010 10111010, while the binary representation of "º" hardcoded in php (utf8...) is 10111010 only.
wtf? I just don't know whether it is a good thing or not... for instance, if I use "#º#" as the delimiter for the explode php function, it won't get detected, and this is actually the problem which led me here.
Any help will be as usual greatly appreciated, thank you so much for your time.
Best rgds.
EDIT1: checking against mb_check_encoding
if (mb_check_encoding($_REQUEST[$i], 'UTF-8')) {
raise('$_REQUEST is encoded properly in utf8 at index ' . $i);
} else {
raise(false);
}
The encoding was confirmed; the message was raised properly.
Single-byte UTF-8 characters do not have bit 7 (the eighth bit) set, so 10111010 is not UTF-8; your file is probably encoded in ISO-8859-1.
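The mismatch can be reproduced with explicit byte strings, independent of how any file is saved. A small sketch, assuming the mbstring extension is available (as in the mb_check_encoding test above); the sample data "a#º#b" is made up:

```php
<?php
$utf8   = "\xC2\xBA"; // "º" encoded as UTF-8 (two bytes)
$latin1 = "\xBA";     // "º" encoded as ISO-8859-1 (one byte)

var_dump(mb_check_encoding($utf8, 'UTF-8'));   // bool(true)
var_dump(mb_check_encoding($latin1, 'UTF-8')); // bool(false)

// Data as it would arrive from a UTF-8 request.
$received = "a#" . $utf8 . "#b";

// A Latin-1 delimiter never matches the UTF-8 bytes, so nothing is split.
var_dump(count(explode("#" . $latin1 . "#", $received))); // int(1)
// A delimiter in the same encoding as the data splits as expected.
var_dump(count(explode("#" . $utf8 . "#", $received)));   // int(2)
```

So the received data is fine; it is the hardcoded delimiter that has to be stored as UTF-8 as well.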
I have an encrypted, base64-encoded array that I need to put into a URL and insert into emails we send to clients so that they can be identified (uniquely). The problem is that base64_encode() often appends an = symbol or two after its string of characters, which is disallowed by default in CI.
Here's an example:
http://example.com/cec/pay_invoice/VXpkUmJnMWxYRFZWTEZSd0RXZFRaMVZnQWowR2N3TTdEVzRDZGdCbkQycFFaZ0JpQmd4V09RRmdWbkVMYXdZbUJ6OEdZQVJ1QlNJTU9Bb3RWenNFSmxaaFVXcFZaMXQxQXpWV1BRQThVVEpUT0ZFZ0RRbGNabFV6VkNFTlpsTWxWV29DTmdackEzQU5Nd0lpQURNUGNGQS9BRFlHWTFacUFTWldOZ3M5QmpRSGJBWTlCREVGWkF4V0NtQlhiZ1IzVm1CUk9sVm5XMllEWlZaaEFHeFJZMU51VVdNTmJsdzNWVzlVT0EwZw==
Now I understand I can allow the = sign in config.php, but I don't fully understand the security implications in doing so (it must have been disabled for a reason right?)
Does anyone know why it might be a bad idea to allow the = symbol in URLs?
Thanks!
John.
Not sure why = is disallowed, but you could also leave off the equals signs.
$base_64 = base64_encode($data);
$url_param = rtrim($base_64, '=');
// and later:
$base_64 = $url_param . str_repeat('=', (4 - strlen($url_param) % 4) % 4);
$data = base64_decode($base_64);
The base64 spec only allows = signs at the end of the string, and they are used purely as padding, so there is no chance of data loss.
Edit: It's possible that it doesn't allow this as a compatibility option. There's no reason that I can think of from a security perspective, but there's a possibility that it may mess with query string parsing somewhere in the tool chain.
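Since standard Base64 output can also contain + and /, which have their own meaning in URLs, the same idea is often extended to a URL-safe alphabet. The helpers below are only a sketch; the names base64url_encode and base64url_decode are not built-in PHP functions.

```php
<?php
// Hypothetical helpers for a URL-safe Base64 variant ("base64url").
function base64url_encode(string $data): string
{
    // Swap "+" and "/" for "-" and "_", then drop the "=" padding.
    return rtrim(strtr(base64_encode($data), '+/', '-_'), '=');
}

function base64url_decode(string $param)
{
    // Restore the standard alphabet and pad back to a multiple of four.
    $base64 = strtr($param, '-_', '+/');
    $base64 .= str_repeat('=', (4 - strlen($base64) % 4) % 4);
    return base64_decode($base64, true); // false if the input is not valid Base64
}

$token = base64url_encode('test');
echo $token, "\n";                   // dGVzdA
echo base64url_decode($token), "\n"; // test
```

The resulting token contains only letters, digits, - and _, so it should pass a restrictive permitted-characters setting without changes.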
Please add the character "=" to $config['permitted_uri_chars'] in your config.php file. You can find that file in the application/config folder.
No character in a URL is harmful by itself. It is inexperienced developers or badly written software that turn some characters into a problem.
As for =, I don't see any issues with using it in URLs.
Instead of updating the config file, you can use PHP's native urlencode and urldecode functions.
$str=base64_encode('test');
$url_to_be_send=urlencode($str);
//send it via url
//now on the receiving side
//assuming value passed via get is stored in $encoded_str
$decoded_str=base64_decode(urldecode($encoded_str));
This question is related to my previous one; please have a look at it. I did not find any other way to unserialize the data, so I am falling back on string operations.
I am able to get the whole content of the file, but not a specific string from that content.
I want to search for a specific string in this content, but the functions stop working when they reach the first special character in the string. If I search for something located before the special character, it works properly.
PHP's string functions stop processing as soon as they encounter the first special character, so they do not give me the correct output.
Originally they look like this (^#):
:"Mage_Core_Model_Message_Collection":2:{s:12:"^#*^#_messages";a:0:{}s:20:"^#*^#_lastAddedMessage";N;}
but when I echo them, they are displayed as ?
Here is the code I tried:
$file='/var/www/html/products/var/session/sess_ciktos8icvk11grtpkj3u610o3';
$contents=file_get_contents($file);
$contents=htmlspecialchars($contents);
//$contents=htmlentities($contents);
echo $contents;
$restData=strstr($contents,'"id";s:4:"');
echo $restData;
$id=substr($restData,0,strpos($restData,'"'));
echo $id;
I changed the default_charset to iso-8859-1 and also to utf-8, but it does not work with either.
Please let me know how I can resolve this.
Thanks.
These characters that you see as ^# are actually null bytes. They have no proper display form, nor are they meant to be displayed: they are the engine's internal representation of protected properties. You're not supposed to mess with them.
As for resolving, it'd be nice to know what kind of resolution you seek - what result are you trying to achieve?
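The null bytes are easy to reproduce. In the sketch below, the class MessageCollection and the serialized "id" fragment are made up for illustration; they only mimic the shape of the session data in the question.

```php
<?php
// Hypothetical stand-in for a class with a protected property.
class MessageCollection
{
    protected $_messages = [];
}

$serialized = serialize(new MessageCollection());

// Protected property names are stored as "\0*\0name"; show the null bytes as ^@.
echo str_replace("\0", '^@', $serialized), "\n";
// O:17:"MessageCollection":1:{s:12:"^@*^@_messages";a:0:{}}

// Made-up data: the object above followed by a serialized array holding an id.
$raw = $serialized . 'a:1:{s:2:"id";s:4:"1234";}';

// PHP string functions are binary-safe, so searching the raw data works
// even though the match sits after the null bytes.
preg_match('/"id";s:\d+:"([^"]*)"/', $raw, $m);
echo $m[1], "\n"; // 1234

// After htmlspecialchars() the double quotes are &quot;, so the search fails.
var_dump(strpos(htmlspecialchars($raw), '"id"')); // bool(false)
```

If this applies to the code in the question, the search may be failing because of the htmlspecialchars() call rather than the null bytes, so it might be worth searching the raw contents first and escaping only for display.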