I'm playing around with encrypt/decrypt coding in PHP. Interesting stuff!
However, I'm coming across some issues involving what text gets encrypted into.
Here are two functions that encrypt and decrypt a string. They use an encryption key, which I set to something obscure.
I actually got this from a PHP book. I modified it slightly, but without changing its main goal.
I created a small example below that anyone can test.
But I notice that some characters show up in the "encrypted" string, such as "=" and "+".
Sometimes I pass this encrypted string via the URL, and it may not make it to my receiving scripts intact. I'm guessing the browser does something to the string when it sees certain characters, but I'm really only guessing.
Is there another function I can use to ensure the browser doesn't touch the string? Or does anyone know enough about PHP's base64_encode() to keep certain characters from being used? I don't really expect the latter to be possible, but I'm sure there's a workaround.
Enjoy the code, whoever needs it!
define('ENCRYPTION_KEY', "sjjx6a");
function encrypt($string) {
$result = '';
for($i=0; $i<strlen($string); $i++) {
$char = substr($string, $i, 1);
$keychar = substr(ENCRYPTION_KEY, ($i % strlen(ENCRYPTION_KEY))-1, 1);
$char = chr(ord($char)+ord($keychar));
$result.=$char;
}
return base64_encode($result)."/".rand();
}
function decrypt($string){
$exploded = explode("/",$string);
$string = $exploded[0];
$result = '';
$string = base64_decode($string);
for($i=0; $i<strlen($string); $i++) {
$char = substr($string, $i, 1);
$keychar = substr(ENCRYPTION_KEY, ($i % strlen(ENCRYPTION_KEY))-1, 1);
$char = chr(ord($char)-ord($keychar));
$result.=$char;
}
return $result;
}
echo $encrypted = encrypt("reaplussign.jpg");
echo "<br>";
echo decrypt($encrypted);
You could use PHP's urlencode and urldecode functions to make your encryption results safe for use in URLs, e.g.:
echo $encrypted = urlencode(encrypt("reaplussign.jpg"));
echo "<br>";
echo decrypt(urldecode($encrypted));
You should look at urlencode() to escape the string correctly for use in the query.
If you are worried about +, = and similar characters, you should have a look at http://php.net/manual/en/function.urlencode.php and its friends from the "See also" section. Encode in encrypt() and decode at the beginning of decrypt().
If this doesn't work for you, maybe some simple substitution? For example, replace each + with its percent-encoded form (%2B; note that %20 would be a space):
$text = str_replace('+','%2B',$text);
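Another way around the problem, as a rough sketch rather than anything from the book: make the Base64 output itself URL-safe by swapping "+" and "/" for "-" and "_" and dropping the "=" padding. The helper names base64url_encode() and base64url_decode() are made up for this example; they are not PHP built-ins. This may also matter for the code in the question, because standard Base64 output can contain "/", which decrypt() uses as its separator.

```php
<?php
// Hypothetical helpers (not PHP built-ins): URL-safe Base64.
function base64url_encode($data) {
    // Swap the two URL-unsafe symbols and strip the "=" padding.
    return rtrim(strtr(base64_encode($data), '+/', '-_'), '=');
}

function base64url_decode($data) {
    // Restore the standard alphabet and re-add the padding.
    $b64 = strtr($data, '-_', '+/');
    $pad = (4 - strlen($b64) % 4) % 4;
    return base64_decode($b64 . str_repeat('=', $pad));
}

$raw   = "\xfb\xff\xfe binary?";
$token = base64url_encode($raw);
echo $token, "\n";
echo base64url_decode($token) === $raw ? "round trip ok" : "mismatch", "\n";
```

With helpers like these, encrypt() could return the URL-safe token and would need a separator other than "/" only if you keep the standard alphabet.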
Related
I'm writing a PHP application that accepts a URL from the user, and then processes it by making some calls to binaries with system()*. However, to avoid many complications that arise with this, I'm trying to convert the URL, which may contain Unicode characters, into ASCII characters.
Let's say I have the following URL:
https://täst.de:8118/news/zh-cn/新闻动态/2015/
Here two parts need to be dealt with: the hostname and the path.
For the hostname, I can simply call idn_to_ascii().
However, I can't simply call urlencode() over the path, as each of the characters that need to remain unmodified will also be converted (e.g. news/zh-cn/新闻动态/2015/ -> news%2Fzh-cn%2F%E6%96%B0%E9%97%BB%E5%8A%A8%E6%80%81%2F2015%2F as opposed to news/zh-cn/%E6%96%B0%E9%97%BB%E5%8A%A8%E6%80%81/2015/).
How should I approach this problem?
*I'd rather not deal with system() calls and the resulting complexity, but given that the functionality is only available by calling binaries, I unfortunately have no choice.
Split the URL by /, urlencode() the parts that need it, then put it back together:
$url = explode("/", $url);
$url[2] = idn_to_ascii($url[2]);
$url[5] = urlencode($url[5]);
$url = join("/", $url);
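If you don't want to hard-code the segment indexes, the same idea can be applied to every path segment. This is only a sketch: encode_path() is a name invented for the example, it assumes the path is UTF-8 and not already percent-encoded, and it uses rawurlencode() so that spaces become %20 rather than +.

```php
<?php
// Hypothetical helper: percent-encode each path segment separately
// so that the "/" separators themselves are left untouched.
function encode_path($path) {
    return implode('/', array_map('rawurlencode', explode('/', $path)));
}

echo encode_path('news/zh-cn/新闻动态/2015/'), "\n";
// prints news/zh-cn/%E6%96%B0%E9%97%BB%E5%8A%A8%E6%80%81/2015/
```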
You could use PHP's iconv function:
iconv("UTF-8", "ASCII//TRANSLIT", $url);
The following can be used for this transformation:
function convertpath ($path) {
$path1 = '';
$len = strlen ($path);
for ($i = 0; $i < $len; $i++) {
if (preg_match ('/^[A-Za-z0-9\/?=+%_.~-]$/', $path[$i])) {
$path1 .= $path[$i];
}
else {
$path1 .= urlencode ($path[$i]);
}
}
return $path1;
}
I have a text field in my Drupal form, which I need to sanitise before saving into the database. The field is for a custom name, and I expect some users may want to write for example "Andy's" or "John's home".
The problem is that when I run the field value through the check_plain() function, the apostrophe gets converted into &#039; - which means Andy's code becomes Andy&#039;s code.
Can I somehow exclude the apostrophe from the check_plain() function, or otherwise deal with this problem? I have tried wrapping in the format_string() function, but it's not working:
$nickname = format_string(check_plain($form_state['values']['custom_name'], array('&#039;' => "'")));
Thanks.
No, you can't exclude some characters from check_plain(), because it simply passes your text to the PHP function htmlspecialchars() with the ENT_QUOTES flag:
function check_plain($text) {
return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}
ENT_QUOTES means that htmlspecialchars() will convert both double and single quotes to HTML entities.
Instead of check_plain() you could use htmlspecialchars() with ENT_COMPAT (so it will leave single-quotes alone):
htmlspecialchars($text, ENT_COMPAT, 'UTF-8');
but that can cause some security issues.
Another option is to write a custom regular expression to properly sanitize your input.
I've been a bit worried about the security issue T-34 mentioned, so I've tried writing a work-around function which seems to be working OK. The function strips out the apostrophes, then runs check_plain() on each part, and pieces it back together again, re-inserting the apostrophes.
The function is:
function my_sanitize ($text) {
$clean = '';
$no_apostrophes = explode("'", $text);
$length = count($no_apostrophes);
if($length > 1){
for ($i = 0; $i < $length; $i++){
$clean .= check_plain($no_apostrophes[$i]);
if($i < ($length-1)){
$clean .= "'";
}
}
}
else{
$clean = check_plain($text);
}
return $clean;
}
And an example call is:
$nickname = my_sanitize($nickname);
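For what it's worth, the same effect can probably be had without the loop: escape everything first, then turn the apostrophe entity back into a plain apostrophe. The sketch below calls htmlspecialchars() directly (which is what check_plain() wraps, per the answer above) so that it runs outside Drupal; escape_keep_apostrophes() is a made-up name. The assumption is that the result is only ever printed as text or inside double-quoted attributes, never inside a single-quoted attribute.

```php
<?php
// Hypothetical helper: escape like check_plain(), then restore apostrophes.
function escape_keep_apostrophes($text) {
    $escaped = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    // With these flags an apostrophe is emitted as &#039;. A literal
    // "&#039;" typed by the user has already become "&amp;#039;",
    // so it is not affected by this replacement.
    return str_replace('&#039;', "'", $escaped);
}

echo escape_keep_apostrophes("Andy's <b>home</b>"), "\n";
// prints Andy's &lt;b&gt;home&lt;/b&gt;
```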
I'm new to XOR encryption, and I'm having some trouble with the following code:
function xor_this($string) {
// Let's define our key here
$key = ('magic_key');
// Our plaintext/ciphertext
$text =$string;
// Our output text
$outText = '';
// Iterate through each character
for($i=0;$i<strlen($text);)
{
for($j=0;$j<strlen($key);$j++,$i++)
{
$outText .= $text[$i] ^ $key[$j];
//echo 'i='.$i.', '.'j='.$j.', '.$outText[$i].'<br />'; //for debugging
}
}
return $outText;
}
When I run this it works for normal strings, like 'dog' but it only partially works for strings containing numbers, like '12345'.
To demonstrate...
xor_this('dog') = 'UYV'
xor_this('123') = ''
It's also interesting to note that xor_this( xor_this('123') ) = '123', as I expect it to. I'm pretty sure the problem resides somewhere in my shaky understanding of bitwise operators, OR possibly the way PHP handles strings that contain numbers. I'm betting there's someone clever out there that knows exactly what's wrong here. Thanks.
EDIT #1: It's not truly 'encryption'. I guess obfuscation is the correct term, which is what I'm doing. I need to pass a code containing unimportant data from a user without them being able to easily tamper with it. They're completing a timed activity off-line and submitting their time to an online scoreboard via this code. The off-line activity will obfuscate their time (in milliseconds). I need to write a script to receive this code and turn it back into the string containing their time.
How I did it, might help someone ...
$msg = 'say hi!';
$key = 'whatever_123';
// print, and make unprintable chars available for a link or alike.
// using $_GET, php will urldecode it, if it was passed urlencoded
print "obfuscated, ready for url: " . urlencode(obfuscate($msg, $key)) . "\n";
print "deObfuscated: " . obfuscate(obfuscate($msg, $key), $key);
function obfuscate($msg, $key) {
if (empty($key)) return $msg;
return $msg ^ str_pad('', strlen($msg), $key);
}
I think you might have a few problems here; I've tried to outline how I think you can fix them:
You need to use ord(..) to get the ASCII value of a character so that you can represent it in binary. For example, try the following:
printf("%08b ", ord('A')); // outputs "01000001"
I'm not sure how you do an XOR cipher with a multi-byte key, as the Wikipedia page on XOR cipher doesn't specify. But I assume for a given key like "123", your key starts "left-aligned" and is repeated to the length of the text, like this:
function xor_this($text) {
$key = '123';
$i = 0;
$encrypted = '';
foreach (str_split($text) as $char) {
$encrypted .= chr(ord($char) ^ ord($key[$i++ % strlen($key)]));
}
return $encrypted;
}
print xor_this('hello'); // outputs "YW_]]"
Which encrypts 'hello' with the key '12312'.
There's no guarantee that the result of the XOR operation will produce a printable character. If you give us a better idea of the reason you're doing this, we can probably point you to something sensible to do instead.
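One sensible thing, if the obfuscated value just has to travel as text, is to hex-encode the XOR result so that every byte becomes two printable characters. A minimal sketch, assuming a non-empty key; xor_bytes(), obfuscate_hex() and deobfuscate_hex() are names invented for this example:

```php
<?php
// Repeating-key XOR; the same call obfuscates and restores.
// Assumes $key is a non-empty string.
function xor_bytes($text, $key) {
    $out    = '';
    $keyLen = strlen($key);
    for ($i = 0, $n = strlen($text); $i < $n; $i++) {
        $out .= chr(ord($text[$i]) ^ ord($key[$i % $keyLen]));
    }
    return $out;
}

// Hex-encode the raw bytes so the result is always printable.
function obfuscate_hex($text, $key) {
    return bin2hex(xor_bytes($text, $key));
}

// Reverse: hex back to raw bytes, then XOR with the same key.
function deobfuscate_hex($hex, $key) {
    return xor_bytes(hex2bin($hex), $key);
}

$code = obfuscate_hex('12345', 'magic_key');
echo $code, "\n";
echo deobfuscate_hex($code, 'magic_key'), "\n"; // prints 12345
```

The output is twice as long as the input, but it survives URLs, forms and copy/paste without further encoding.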
I believe you are facing a console output and encoding problem rather than anything XOR-related.
Try writing the result of the XOR function to a text file and looking at the generated characters. A hex editor would be the best way to inspect and compare them.
Basically, to turn the text back (even when it contains numbers) you can use the same function:
$textToObfuscate = "Some Text 12345";
$obfuscatedText = xor_this($textToObfuscate);
$restoredText = xor_this($obfuscatedText);
Based on the fact that you're getting xor_this( xor_this('123') ) = '123', I am willing to guess that this is merely an output issue. You're sending data to the browser, the browser is recognizing it as something which should be rendered in HTML (say, the first half dozen ASCII characters). Try looking at the page source to see what is really there. Better yet, iterate through the output and echo the ord of the value at each position.
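Something along these lines would show what is really in the string, whatever the browser makes of it. byte_values() is just an illustrative name, and the sample string is made up:

```php
<?php
// Hypothetical debugging helper: list the byte values of a string.
function byte_values($s) {
    return array_map('ord', str_split($s));
}

$out = "A\x00\x09"; // stand-in for the XOR output: one printable byte, two control bytes
echo implode(' ', byte_values($out)), "\n"; // prints 65 0 9
echo bin2hex($out), "\n";                   // prints 410009
```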
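Use this code, it works perfectly:
function scramble($inv) {
$key = 342244; // scramble key
$res = ''; // start with an empty result
$len = strlen($inv);
for ($index = 0; $index < $len; $index++) {
srand($key); // reseed so the key stream is reproducible
$var = rand(0, 255); // one pseudo-random key byte per position
$res .= chr($var) ^ $inv[$index]; // XOR the key byte with the input byte
$key++;
}
return $res;
}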
Try this:
$outText .= (string)$text[$i] ^ (string)$key[$j];
If one of the two operands is an integer, PHP casts the other to an integer and XORs them for a numeric result.
Alternatively, you could use this:
$outText .= chr(ord($text[$i]) ^ ord($key[$j]));
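A quick way to see the difference between the string and integer forms of ^ for yourself (a small sketch; the values are arbitrary):

```php
<?php
var_dump('1' ^ '2'); // both strings: XOR of the bytes 0x31 and 0x32, a one-byte string "\x03"
var_dump(1 ^ 2);     // both integers: int(3)
var_dump('12' ^ 3);  // mixed: '12' is converted to the integer 12, giving int(15)
```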
// Iterate through each character
for($i=0; $i<strlen($text); $i++)
{
$outText .= chr(ord($text[$i]) ^ ord($key[$i % strlen($key)]));
}
Note: this will probably produce some unprintable characters...
Despite all the wise suggestions, I solved this problem in a much simpler way:
I changed the key! It turns out that by changing the key to something more like this:
$key = 'ISINUS0478331006';
...it will generate an obfuscated output of printable characters.
Arrrgh. Does anyone know how to create a function that's the multibyte character equivalent of the PHP count_chars($string, 3) command?
Such that it will return a list of ONLY ONE INSTANCE of each unique character. If that was English and we had
"aaabggxxyxzxxgggghq xcccxxxzxxyx"
It would return "abcgh qxyz" (Note the space IS counted).
(The order isn't important in this case, can be anything).
If Japanese kanji (not sure browsers will all support this):
漢漢漢字漢字私私字私字漢字私漢字漢字私
And it will return just the 3 kanji used:
漢字私
It needs to work on any UTF-8 encoded string.
Hey Dave, you're never going to see this one coming.
php > $kanji = '漢漢漢字漢字私私字私字漢字私漢字漢字私';
php > $not_kanji = 'aaabcccbbc';
php > $pattern = '/(.)\1+/u';
php > echo preg_replace($pattern, '$1', $kanji);
漢字漢字私字私字漢字私漢字漢字私
php > echo preg_replace($pattern, '$1', $not_kanji);
abcbc
What, you thought I was going to use mb_substr again?
In regex-speak, it's looking for any one character, then one or more instances of that same character. The matched region is then replaced with the one character that matched.
The u modifier turns on UTF-8 mode in PCRE, in which it deals with UTF-8 sequences instead of 8-bit characters. As long as the string being processed is UTF-8 already and PCRE was compiled with Unicode support, this should work fine for you.
Hey, guess what!
$not_kanji = 'aaabbbbcdddbbbbccgggcdddeeedddaaaffff';
$l = mb_strlen($not_kanji);
$unique = array();
for($i = 0; $i < $l; $i++) {
$char = mb_substr($not_kanji, $i, 1);
if(!array_key_exists($char, $unique))
$unique[$char] = 0;
$unique[$char]++;
}
echo join('', array_keys($unique));
This uses the same general trick as the shuffle code. We grab the length of the string, then use mb_substr to extract it one character at a time. We then use that character as a key in an array. We're taking advantage of the fact that PHP arrays are ordered: keys stay in the order in which they were first added. Once we've gone through the string and identified all of the characters, we grab the keys and join 'em back together in the same order that they first appeared in the string. You also get a per-character count from this technique.
This would have been much easier if there was such a thing as mb_str_split to go along with str_split.
(No Kanji example here, I'm experiencing a copy/paste bug.)
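For the record, a split-then-dedupe version is possible even without a multibyte str_split: preg_split() with an empty pattern and the u modifier breaks a UTF-8 string into characters. (PHP 7.4 and later also ship mb_str_split(), which could replace the preg_split() call.) This is a sketch that assumes the input is valid UTF-8; mb_unique_chars() is a name made up here.

```php
<?php
// Hypothetical helper: unique characters of a UTF-8 string, in first-seen order.
// Assumes $input is valid UTF-8 (preg_split returns false otherwise).
function mb_unique_chars($input) {
    // With /u, the empty pattern matches between code points.
    $chars = preg_split('//u', $input, -1, PREG_SPLIT_NO_EMPTY);
    return implode('', array_unique($chars));
}

echo mb_unique_chars('漢漢漢字漢字私私字私字漢字私漢字漢字私'), "\n"; // prints 漢字私
echo mb_unique_chars('aaabggxxyxzxxgggghq xcccxxxzxxyx'), "\n";       // prints abgxyzhq c
```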
Here, try this on for size:
function mb_count_chars_kinda($input) {
$l = mb_strlen($input);
$unique = array();
for($i = 0; $i < $l; $i++) {
$char = mb_substr($input, $i, 1);
if(!array_key_exists($char, $unique))
$unique[$char] = 0;
$unique[$char]++;
}
return $unique;
}
function mb_string_chars_diff($one, $two) {
$left = array_keys(mb_count_chars_kinda($one));
$right = array_keys(mb_count_chars_kinda($two));
return array_diff($left, $right);
}
print_r(mb_string_chars_diff('aabbccddeeffgg', 'abcde'));
/* =>
Array
(
[5] => f
[6] => g
)
*/
You'll want to call this twice, the second time with the left string on the right, and the right string on the left. The output will be different -- array_diff just gives you the stuff in the left side that's missing from the right, so you have to do it twice to get the whole story.
Have a look at the iconv_strlen() function from PHP's iconv extension. I can't speak for East Asian encodings, but it works fine for Western and Eastern European languages. In any case it gives you some freedom!
$name = "My string";
$name_array = str_split($name);
$name_array_uniqued = array_unique($name_array);
print_r($name_array_uniqued);
Much easier. Use str_split to turn the phrase into an array with each character as an element, then use array_unique to remove duplicates. Pretty simple. Nothing complicated. I like it that way. (Note that str_split splits by byte, so this only works as-is for single-byte characters.)
I have the following code to generate a random password string:
<?php
$password = '';
for($i=0; $i<10; $i++) {
$chars = array('lower' => array('a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z'), 'upper' => array('A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z'), 'num' => array('1','2','3','4','5','6','7','8','9','0'), 'sym' => array('!','£','$','%','^','&','*','(',')','-','=','+','{','}','[',']',':','#','~',';','#','<','>','?',',','.','/'));
$set = rand(1, 4);
switch($set) {
case 1:
$set = 'lower';
break;
case 2:
$set = 'upper';
break;
case 3:
$set = 'num';
break;
case 4:
$set = 'sym';
break;
}
$count = count($chars[$set]);
$digit = rand(0, ($count-1));
$output = $chars[$set][$digit];
$password.= $output;
}
echo $password;
?>
However, every now and then one of the characters it outputs is a capital A with a ^ above it (Â). French or something. How is this possible? It can only pick what's in my arrays!
The only non-ascii character is the pound character, so my guess is that it has to do with this.
First off, it's probably a good idea to avoid that one, as not many people will be able to easily type it.
There's a good chance that the encoding of your PHP file (or the encoding set by your editor) is not the same as your output encoding.
Are you sure it is indeed a character not in your array, or is the browser just unable to output? For example your monetary pound sign. Ensure that both PHP, DB, and HTML output all use the same encoding.
On a separate note, your loop is slightly more complicated than it needs to be. I typically see password generators randomize a string versus several arrays. A quick example:
$chars = "abcdefghijkABCDEFG1289398$%#^&";
$pos = rand(0, strlen($chars) - 1);
$password .= $chars[$pos];
I think you are generating special HTML characters;
see, for example, here and the ISO 8859-1 table.
You may be seeing the byte sequence C2 A3, appearing as your capital A with a circumflex followed by a pound symbol. This is because C2A3 is the UTF-8 sequence for a pound sign. As such, if you've managed to enter the UTF-8 character in your PHP file (possibly without noticing it, depending on your editor and environment) you'd see the separate byte sequence as output if your environment is then ASCII / ISO8859-1 or similar.
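You can check this from PHP itself. The sketch below assumes the mbstring extension is loaded and uses the PHP 7 "\u{...}" escape so that the result doesn't depend on how the script file is saved:

```php
<?php
$pound = "\u{00A3}";        // the pound sign encoded as UTF-8
echo strlen($pound), "\n";  // prints 2 (two bytes, one character)
echo bin2hex($pound), "\n"; // prints c2a3

// Treat those two bytes as ISO-8859-1, which is what a mislabelled page does.
$misread = mb_convert_encoding($pound, 'UTF-8', 'ISO-8859-1');
echo bin2hex($misread), "\n"; // prints c382c2a3, i.e. U+00C2 followed by U+00A3
```

Viewed as UTF-8, those last four bytes are a capital A with a circumflex followed by a pound sign.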
As per Jason McCreary, I use this function for such Password Creation
function randomString($length) {
$characters = "0123456789abcdefghijklmnopqrstuvwxyz" .
"ABCDEFGHIJKLMNOPQRSTUVWXYZ$%#^&";
$string = '';
for ($p = 0; $p < $length; $p++)
$string .= $characters[mt_rand(0, strlen($characters) - 1)];
return $string;
}
The pound symbol (£) is what is breaking, since it is not part of the basic ASCII character set.
You need to do one of the following:
Drop the pound symbol (this will also help people using non-UK keyboards!)
Convert the pound symbol to an HTML entity when outputting it to the site (&pound;)
Set your site's character set encoding to UTF-8, which will allow extended characters to be displayed. This is probably the best option in the long run, and should be fairly quick and easy to achieve.
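If you go with the first option, a sketch of the generator restricted to plain ASCII could look like the following. It assumes PHP 7 or later for random_int(); random_password() and its alphabet are choices made for this example, not something taken from the question.

```php
<?php
// Hypothetical helper: random password drawn from printable ASCII only.
function random_password($length = 10) {
    // Single quotes, so "$" is not treated as the start of a variable.
    $alphabet = 'abcdefghijklmnopqrstuvwxyz'
              . 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
              . '0123456789'
              . '!$%^&*()-=+{}[]:#~;<>?,./';
    $max      = strlen($alphabet) - 1; // last valid index
    $password = '';
    for ($i = 0; $i < $length; $i++) {
        $password .= $alphabet[random_int(0, $max)];
    }
    return $password;
}

echo random_password(), "\n";
```

Since the alphabet contains <, > and &, pass the result through htmlspecialchars() when printing it into an HTML page.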