Detect base64 encoding in PHP?

Is there some way to detect if a string has been base64_encoded() in PHP?
We're converting some storage from plain text to base64 and part of it lives in a cookie that needs to be updated. I'd like to reset their cookie if the text has not yet been encoded, otherwise leave it alone.

Apologies for a late response to an already-answered question, but I don't think base64_decode($x,true) is a good enough solution for this problem. In fact, there may not be a very good solution that works against any given input. For example, I can put lots of bad values into $x and not get a false return value.
var_dump(base64_decode('wtf mate',true));
string(5) "���j�"
var_dump(base64_decode('This is definitely not base64 encoded',true));
string(24) "N���^~)��r��[jǺ��ܡם"
I think that in addition to the strict return value check, you'd also need to do post-decode validation. The most reliable way is if you could decode and then check against a known set of possible values.
A more general solution with less than 100% accuracy (closer with longer strings, inaccurate for short strings) is to check whether many bytes of the decoded output fall outside the normal range for UTF-8 text (or whatever encoding you use).
See this example:
<?php
$english = array();
foreach (str_split('az019AZ~~~!##$%^*()_+|}?><": Iñtërnâtiônàlizætiøn') as $char) {
echo ord($char) . "\n";
$english[] = ord($char);
}
echo "Max value english = " . max($english) . "\n";
$nonsense = array();
echo "\n\nbase64:\n";
foreach (str_split(base64_decode('Not base64 encoded',true)) as $char) {
echo ord($char) . "\n";
$nonsense[] = ord($char);
}
echo "Max nonsense = " . max($nonsense) . "\n";
?>
Results:
Max value english = 195
Max nonsense = 233
So you may do something like this:
if ( $maxDecodedValue > 200 ) {} //decoded string is Garbage - original string not base64 encoded
else {} //decoded string is useful - it was base64 encoded
You should probably use the mean of the decoded values instead of the max; I used max() in this example only because PHP sadly has no built-in mean(). Which measure you use (mean, max, etc.) and which threshold (e.g. 200) depends on your estimated usage profile.
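A rough sketch of the mean-based variant, in case it helps. The function names and the threshold of 127 are my own assumptions for illustration, not tested values; tune the threshold against your real data.

```php
<?php
// Hypothetical helper: average byte value of a string (0.0 for an empty string).
function meanByteValue(string $bytes): float
{
    if ($bytes === '') {
        return 0.0;
    }
    // unpack('C*') returns every byte as an unsigned integer.
    $values = unpack('C*', $bytes);
    return array_sum($values) / count($values);
}

// Hypothetical helper: the default threshold is an assumption, not a recommendation.
function looksLikeDecodedText(string $candidate, float $threshold = 127.0): bool
{
    $decoded = base64_decode($candidate, true);
    if ($decoded === false || $decoded === '') {
        return false;
    }
    return meanByteValue($decoded) < $threshold;
}

var_dump(looksLikeDecodedText(base64_encode('hello world'))); // bool(true)
var_dump(looksLikeDecodedText('#iu3498r'));                   // bool(false): fails strict decoding
```

Like the max() version, this is only a heuristic and will misjudge short strings.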
In conclusion, the only winning move is not to play. I'd try to avoid having to discern base64 in the first place.
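One way to avoid the guessing, sketched under the assumption that you control the stored format: tag newly written values with a marker that cannot appear in base64 output. The prefix 'b64:' and the function names below are hypothetical.

```php
<?php
// Hypothetical marker; ':' is outside the base64 alphabet.
const ENCODED_PREFIX = 'b64:';

function wrapValue(string $plain): string
{
    return ENCODED_PREFIX . base64_encode($plain);
}

// Returns the plain value for both the old (plain) and the new (prefixed) format.
function unwrapValue(string $stored): string
{
    if (strncmp($stored, ENCODED_PREFIX, strlen(ENCODED_PREFIX)) !== 0) {
        return $stored; // legacy plain-text value
    }
    $decoded = base64_decode(substr($stored, strlen(ENCODED_PREFIX)), true);
    return $decoded === false ? $stored : $decoded;
}

var_dump(wrapValue('hi'));                 // string(8) "b64:aGk="
var_dump(unwrapValue(wrapValue('hello'))); // string(5) "hello"
var_dump(unwrapValue('hello'));            // string(5) "hello"
```

The remaining ambiguity is a legacy plain value that happens to start with the marker, so pick a prefix your old data is known not to contain.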

function is_base64_encoded($data)
{
if (preg_match('%^[a-zA-Z0-9/+]*={0,2}$%', $data)) {
return TRUE;
} else {
return FALSE;
}
}
is_base64_encoded("iash21iawhdj98UH3"); // true
is_base64_encoded("#iu3498r"); // false
is_base64_encoded("asiudfh9w=8uihf"); // false
is_base64_encoded("a398UIhnj43f/1!+sadfh3w84hduihhjw=="); // false
http://php.net/manual/en/function.base64-decode.php#81425

I had the same problem, I ended up with this solution:
if ( base64_encode(base64_decode($data)) === $data){
echo '$data is valid';
} else {
echo '$data is NOT valid';
}
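A small sketch of where the round-trip check above helps and where it does not, assuming strict decoding. The function name is made up for illustration.

```php
<?php
// Round-trip check wrapped in a hypothetically named function, with strict decoding.
function roundTripsAsBase64(string $data): bool
{
    $decoded = base64_decode($data, true);
    return $decoded !== false && base64_encode($decoded) === $data;
}

var_dump(roundTripsAsBase64(base64_encode('hello'))); // bool(true)
var_dump(roundTripsAsBase64('hello world'));          // bool(false)
var_dump(roundTripsAsBase64('node'));                 // bool(true): a plain word that is also valid base64
```

So the check rejects most ordinary text, but any plain string that happens to consist of base64 characters with a length that is a multiple of 4 still passes.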

Better late than never: You could maybe use mb_detect_encoding() to find out whether the encoded string appears to have been some kind of text:
function is_base64_string($s) {
// first check if we're dealing with an actual valid base64 encoded string
if (($b = base64_decode($s, TRUE)) === FALSE) {
return FALSE;
}
// now check whether the decoded data could be actual text
$e = mb_detect_encoding($b);
if (in_array($e, array('UTF-8', 'ASCII'))) { // YMMV
return TRUE;
} else {
return FALSE;
}
}
UPDATE For those who like it short
function is_base64_string_s($str, $enc=array('UTF-8', 'ASCII')) {
return !(($b = base64_decode($str, TRUE)) === FALSE) && in_array(mb_detect_encoding($b), $enc);
}

We can combine three checks into one function to test whether a given string is valid base64 or not.
function validBase64($string)
{
$decoded = base64_decode($string, true);
// Assume the string is valid until one of the checks below fails
$result = true;
// Check that there is no invalid character in the string
if (!preg_match('/^[a-zA-Z0-9\/\r\n+]*={0,2}$/', $string)) {$result = false;}
// Strict-mode decoding returns false for invalid input
if ($decoded === false) {$result = false;}
// Encode again and compare with the original
if (base64_encode($decoded) != $string) {$result = false;}
return $result;
}

I was about to build a base64 toggle in php, this is what I did:
function base64Toggle($str) {
if (!preg_match('~[^0-9a-zA-Z+/=]~', $str)) {
$check = str_split(base64_decode($str));
$x = 0;
foreach ($check as $char) if (ord($char) > 126) $x++;
if ($x/count($check)*100 < 30) return base64_decode($str);
}
return base64_encode($str);
}
It works perfectly for me.
Here are my complete thoughts on it: http://www.albertmartin.de/blog/code.php/19/base64-detection
And here you can try it: http://www.albertmartin.de/tools

By default (without the strict parameter), base64_decode() will not return FALSE if the input is not valid base64 encoded data. Use imap_base64() instead; it returns FALSE if $text contains characters outside the Base64 alphabet.
imap_base64() Reference

Here's my solution:
if(empty(htmlspecialchars(base64_decode($string, true)))) {
return false;
}
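It will return false if the decoded $string is invalid, for example: "node", "123", " ", etc. This relies on htmlspecialchars() returning an empty string for input that is not valid in the target encoding, which only happens when neither ENT_SUBSTITUTE nor ENT_IGNORE is set. Since PHP 8.1 the default flags include ENT_SUBSTITUTE, so pass the flags explicitly (e.g. ENT_COMPAT) there.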

$is_base64 = function(string $string) : bool {
$zero_one = ['MA==', 'MQ=='];
if (in_array($string, $zero_one)) return TRUE;
if (empty(htmlspecialchars(base64_decode($string, TRUE))))
return FALSE;
return TRUE;
};
var_dump('*** These yield false ***');
var_dump($is_base64(''));
var_dump($is_base64('This is definitely not base64 encoded'));
var_dump($is_base64('node'));
var_dump($is_base64('node '));
var_dump($is_base64('123'));
var_dump($is_base64(0));
var_dump($is_base64(1));
var_dump($is_base64(123));
var_dump($is_base64(1.23));
var_dump('*** These yield true ***');
var_dump($is_base64(base64_encode('This is definitely base64 encoded')));
var_dump($is_base64(base64_encode('node')));
var_dump($is_base64(base64_encode('123')));
var_dump($is_base64(base64_encode(0)));
var_dump($is_base64(base64_encode(1)));
var_dump($is_base64(base64_encode(123)));
var_dump($is_base64(base64_encode(1.23)));
var_dump($is_base64(base64_encode(TRUE)));
var_dump('*** Should these yield true? Might be edge cases ***');
var_dump($is_base64(base64_encode('')));
var_dump($is_base64(base64_encode(FALSE)));
var_dump($is_base64(base64_encode(NULL)));

Maybe it's not exactly what you asked for, but I hope it will be useful for somebody.
In my case the solution was to encode all data with json_encode and then base64_encode.
$encoded=base64_encode(json_encode($data));
This value can be stored or used however you need.
Then, to check whether a value is your encoded data rather than just a text string, you simply use
function isData($test_string){
// True only if the string is valid base64 AND the decoded value is (truthy) JSON
if(base64_decode($test_string,true)&&json_decode(base64_decode($test_string))){
return true;
}else{
return false;
}
}
or alternatively
function isNotData($test_string){
// Inverse of isData(): true when the string is not base64-wrapped JSON
if(base64_decode($test_string,true)&&json_decode(base64_decode($test_string))){
return false;
}else{
return true;
}
}
Thanks to the authors of all the previous answers in this thread :)

Usually a text in base64 has no spaces.
I used this function, which worked fine for me. It treats the string as base64 if fewer than 1 in 20 of its characters are spaces,
i.e. ( spaces / strlen ) < 0.05
function normalizaBase64($data){
$spaces = substr_count ( $data ," ");
if (($spaces/strlen($data))<0.05)
{
return base64_decode($data);
}
return $data;
}

Your best option is:
$base64_test = mb_substr(trim($some_base64_data), 0, 76);
return (base64_decode($base64_test, true) === FALSE ? FALSE : TRUE);

Related

function for checking strlen of a password doesn't work in php

I was making a sign-up page and everything worked and got sent to the db, but you could enter a weak pwd. I wanted to make sure that the pwd had a minimum length of 8. I added these lines of code, but when I tested it, it skipped this code and you could enter any pwd you want. Does anyone know why this code is getting skipped and what a solution for this problem is?
function pwdTooShort($pwd) {
$result;
if (strlen($pwd) > 7) {
$result = true;
}
else{
$result = false;
}
return $result;
}
if (isset($_POST["submit"])) {
$pwd = $_POST["pwd"];
require_once 'functions.inc.php';
if(pwdTooShort($pwd) !== false) {
header("location: ../sign-in.php?error=passwordtooshort");
exit();
}
}
if (isset($_GET["error"])){
if($_GET["error"] == "passwordtooshort"){
echo "<p> password is too short </p>";
}
}
<form action="include/signup.inc.php" method = "post">
<input type="password" name = "pwd" />
</form>
You have some logic issues.
Your pwdTooShort() currently returns true if the password has more than 7 characters, which is backwards. You can change that function to:
function pwdTooShort($pwd)
{
// Return true if it is 7 characters or shorter
return mb_strlen($pwd) <= 7;
}
I also changed strlen() to mb_strlen() to account for multibyte characters, as @vee suggested in comments.
Improvement suggestion
The if-statement is technically correct, but is "over complicated".
You can change
if (pwdTooShort($pwd) !== false)
to either
if (pwdTooShort($pwd) === true)
or just
if (pwdTooShort($pwd))
to make it easier to read
Judging by its name, your function checks whether the password is too short, so I think it should return true if it is too short and false if it is not.
Here is the code. It just flips true and false.
/**
* Check if password is too short (less than 8 characters).
*
* @param string $pwd The raw password without hash.
* @return bool Return `true` if it is too short, return `false` if it is not.
*/
function pwdTooShort($pwd) {
$result;
if (mb_strlen($pwd) > 7) {
$result = false;
}
else{
$result = true;
}
return $result;
}
In the code above, I changed strlen() to mb_strlen() so that it supports multibyte characters (Unicode text) and counts the characters exactly.
I recommend that you do not restrict passwords to non-Unicode characters. This is what OWASP recommends:
Allow usage of all characters including unicode and whitespace. There should be no password composition rules limiting the type of characters permitted.
The whitespace they mention means whitespace between characters. You can still trim($password).
The rest of the code will work fine with this.
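To illustrate the difference between the two length functions, here is a minimal example. It assumes the mbstring extension is loaded and that the password is UTF-8; the sample password is made up.

```php
<?php
// "pässwörd" written with escapes: 8 characters, 10 bytes in UTF-8.
$pwd = "p\u{e4}ssw\u{f6}rd";

var_dump(strlen($pwd));             // int(10): counts bytes
var_dump(mb_strlen($pwd, 'UTF-8')); // int(8): counts characters
```

With a minimum length of 8, a byte count would also accept shorter passwords made of multibyte characters, while the character count would not.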

Reading an email's subject in Unicode out of an IMAP server

I'm using Zend Framework 1's IMAP server connector and I'm trying to fetch an email from server with Unicode characters in its subject. Here's how I do it:
$message = $imapServer->getMessage($message_number);
echo $message->getHeader('subject');
The problem is that it comes out encoded:
=?UTF-8?B?2KjYp9uM?=
I can find the encoding function within Zend_Mail class named _encodeHeader but I can not find the decoding pair! Does anyone know how to decode this string?
And here's the encoder function:
protected function _encodeHeader($value)
{
if (Zend_Mime::isPrintable($value) === false) {
if ($this->getHeaderEncoding() === Zend_Mime::ENCODING_QUOTEDPRINTABLE) {
$value = Zend_Mime::encodeQuotedPrintableHeader($value, $this->getCharset(), Zend_Mime::LINELENGTH, Zend_Mime::LINEEND);
} else {
$value = Zend_Mime::encodeBase64Header($value, $this->getCharset(), Zend_Mime::LINELENGTH, Zend_Mime::LINEEND);
}
}
return $value;
}
Search for a "RFC2047 decoder" and pick one of the existing libraries which does just that. If nothing is usable, roll your own.
Here's how I solved it:
switch (strtolower($encoding)) {
case \Zend_Mime::ENCODING_QUOTEDPRINTABLE:
if (preg_match('/^\s?=\?([^\?]+)\?Q\?/', $str, $matches) === 1) {
$str = preg_replace('/\s?=\?'.preg_quote($matches[1]).'\?Q\?/', ' ', $str);
$str = strtr($str, array('?=' => ''));
$str = trim($str);
}
return \Zend_Mime_Decode::decodeQuotedPrintable($str);
case \Zend_Mime::ENCODING_BASE64:
return base64_decode($encodedText);
case \Zend_Mime::ENCODING_7BIT:
case \Zend_Mime::ENCODING_8BIT:
default:
return $encodedText;
}
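For reference, PHP also ships an RFC 2047 decoder of its own. A minimal sketch, assuming the iconv extension is available and that you want the result in UTF-8:

```php
<?php
// The encoded subject from the question.
$raw = '=?UTF-8?B?2KjYp9uM?=';

// iconv_mime_decode() decodes a single MIME header field (requires ext/iconv).
// The third argument is the charset the result is returned in.
$subject = iconv_mime_decode($raw, 0, 'UTF-8');

if ($subject === false) {
    $subject = $raw; // fall back to the raw value if decoding fails
}

echo $subject, "\n";
```

This handles both the B (base64) and Q (quoted-printable) forms, so there is no need to switch on the encoding yourself.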

Avoid re-conversion of a UTF-8 String PHP

Well, I have a DB with a lot of ISO strings and another with UTF-8 (yes, I ruined everything), and now I'm writing a custom function that rewrites the whole DB so that everything is in UTF-8. The problem is the conversion of the strings that are already UTF-8... a ? appears:
$field = $fila['Field'];
$acon = mysql_fetch_array(mysql_query("SELECT `$field` as content FROM `$curfila` WHERE id='$i'"));
$content = $acon['content'];
if(!is_numeric($content)) {
if($content != null) {
if(ip2long($content) === false) {
mb_internal_encoding('UTF-8');
if(mb_detect_encoding($content) === "UTF-8") {
$sanitized = utf8_decode($content);
if($sanitized != $content) {
echo 'Fila [ID ('.$i.')] <b>'.$field.'</b> => '.$sanitized.'<br>';
//mysql_query("UPDATE `$curfila` SET `$field`='$sanitized' WHERE id='$i'");
}
}
}
}
}
PS: I check all the columns and rows of all the tables in the DB. (I display everything before changing anything.)
So, how can I detect that?
I tried mb_detect_encoding, but it reports all the strings as UTF-8... So, which function can I use now?
Thanks in advance.

Determining URL or String PHP

I'm making a link and text service, but I have a problem, which is: there is only 1 input text form, and the user could paste something like this:
http:// asdf .com - which would register as a link, or 'asdf http:// test .com' because of the http://, it would register as a url, or
asdf - which would register as a string, because it doesn't contain http://
BUT my problem arises when the user writes something like:
asdf http://asdf.com, which in my current program outputs a "url" value. I've been experimenting for about an hour now, and I've got 3 bits of code (they were all in the same document being commented, so forgive me if they give errors!)
<?
$str = $_POST['paste'];
if(stristr($str, "http://")) {
$type = "url";
}
if(stristr($str, "https://")) {
$type = "url";
}
if($type!="url") {
$type = "string";
}
?>
Next:
<?
$type = "url";
if($type=="url"){
$t = substr($str, 8);
if(stristr($t, "https://")==$t){
$type = "url";}
if(stristr($t, "https://")==$t){
$type = "url";}
if(stristr($t, "http://")!=$t){
$type = "string";}
if(stristr($t, "https://")!=$t){
$type = "string";}
}
echo $type;
?>
Next:
<?
$url = "hasttp://cake.com";
if(stristr($url, "http://")=="") {
$type = "string"; } else {
$type = "url";
$sus = 1;}
if(stristr($url, "http://")==$url) {
$type = "url"; }
if($sus==1) {
$r = substr($url, 7);
if(stristr($r,"http://")!="http://") {
$type = "url"; }
if($r=="") {
$type = "string";
}
}
echo $type;
?>
I have no clue how I could go about classifying a string like 'asdf http://asdf.com' as a string, whilst classifying 'asdf' as a string, and classifying 'http://asdf.com' as a url.. Another idea I haven't tried yet is strpos, but that's what I'm working on now.
Any ideas?
Thanks a lot! :)
Some parts of this question are getting cut off for some reason, apologies!
$type = '';
if (preg_match('%^https?://[^\s]+$%', $url)) {
$type = 'url';
} else {
$type = 'string';
}
This will match any value which starts with http:// or https://, and does not contain any space in it as type url. If the value does not start with http:// or https://, or it contains a space in it, it will be type string.
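Wrapped in a function and run against the three inputs from the question, it could look like this (the function name is just for illustration):

```php
<?php
// Hypothetical helper name; the pattern is the one from the answer above.
function classifyInput(string $input): string
{
    return preg_match('%^https?://[^\s]+$%', $input) === 1 ? 'url' : 'string';
}

var_dump(classifyInput('http://asdf.com'));      // string(3) "url"
var_dump(classifyInput('asdf http://asdf.com')); // string(6) "string"
var_dump(classifyInput('asdf'));                 // string(6) "string"
```

Note that $ also matches just before a trailing newline, so it may be worth calling trim() on pasted form input first.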
PHP parse_url is your function:
On seriously malformed URLs, parse_url() may return FALSE.
If the component parameter is omitted, an associative array is returned. At least one element will be present within the array. Potential keys within this array are:
scheme - e.g. http
host
port
user
pass
path
query - after the question mark ?
fragment - after the hashmark #
If the component parameter is specified, parse_url() returns a string (or an integer, in the case of PHP_URL_PORT) instead of an array. If the requested component doesn't exist within the given URL, NULL will be returned.
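A short illustration of both call styles described above. The URL is a made-up example.

```php
<?php
// Without a component argument, parse_url() returns an associative array.
$parts = parse_url('https://user@example.com:8080/path/page.php?x=1#top');

var_dump($parts['scheme']); // string(5) "https"
var_dump($parts['host']);   // string(11) "example.com"
var_dump($parts['port']);   // int(8080)

// With a component argument, a missing component comes back as NULL.
var_dump(parse_url('asdf', PHP_URL_SCHEME)); // NULL
```

Keep in mind that parse_url() splits a URL into parts; it does not validate it, so you still need your own rule for what counts as a url.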
If I'm understanding the problem correctly you want to detect when the user inputs both a string and a url and parse each of them correspondingly.
Try using explode(" ", $userInput);, this will return an array containing all strings separated by a space. Then you can check each element in the array and set its type.
$type = strpos($str, 'http') === 0 ? 'url' : 'string';
The strpos function returns the position of a match within a string, or FALSE if there is no match. The triple equals checks that the result is not merely something that evaluates to 0 (as FALSE would), but is in fact the integer 0 (i.e., the string begins with http).
You could also use something like
switch (true) {
case strpos(trim($str), 'http://') === 0:
case strpos(trim($str), 'https://') === 0:
$type = 'url';
break;
default:
$type = 'string';
break; // I know this is not needed, but it is pretty :-)
}
You should use a regular expression to check if the string starts with http
if(preg_match('/^http/',$string_to_check)){
//this is a url
}

IP Address Validation Help

I am using this IP validation function that I came across while browsing. It has been working well until today, when I ran into a problem.
For some reason the function won't validate this IP as valid: 203.81.192.26
I'm not too great with regular expressions, so would appreciate any help on what could be wrong.
If you have another function, I would appreciate if you could post that for me.
The code for the function is below:
public static function validateIpAddress($ip_addr)
{
global $errors;
$preg = '#^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}' .
'(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$#';
if(preg_match($preg, $ip_addr))
{
//now all the intger values are separated
$parts = explode(".", $ip_addr);
//now we need to check each part can range from 0-255
foreach($parts as $ip_parts)
{
if(intval($ip_parts) > 255 || intval($ip_parts) < 0)
{
$errors[] = "ip address is not valid.";
return false;
}
return true;
}
return true;
} else {
$errors[] = "please double check the ip address.";
return false;
}
}
I prefer the simplistic approach described here. It should be considered valid for security purposes, but make sure you get the address from $_SERVER['REMOTE_ADDR']; any other HTTP header can be spoofed.
function validateIpAddress($ip){
return long2ip(ip2long($ip)) == $ip;
}
There is already something built-in to do this : http://fr.php.net/manual/en/filter.examples.validation.php See example 2
<?php
if (filter_var($ip, FILTER_VALIDATE_IP)) {
// Valid
} else {
// Invalid
}
Have you tried using built-in functions to try and validate the address? For example, you can use ip2long and long2ip to convert the human-readable dotted IP address into the number it represents, then back. If the strings are identical, the IP is valid.
There's also the filter extension, which has an IP validation option. filter is included by default in PHP 5.2 and later.
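A minimal sketch of the filter-based check, restricted to IPv4 since that is what the original regex targets. The function name is hypothetical.

```php
<?php
// Hypothetical helper name. FILTER_FLAG_IPV4 limits validation to dotted IPv4 addresses.
function isValidIpv4(string $ip): bool
{
    return filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) !== false;
}

var_dump(isValidIpv4('203.81.192.26')); // bool(true)
var_dump(isValidIpv4('256.1.1.1'));     // bool(false)
var_dump(isValidIpv4('1.2.3'));         // bool(false)
var_dump(isValidIpv4('::1'));           // bool(false): IPv6 is excluded by the flag
```

Drop the flag if IPv6 addresses should be accepted as well.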
Well, why are you doing both regex and int comparisons? You are "double" checking the address. Also, your second check is not valid, as it will always return true if the first octet is valid (you have a return true inside of the foreach loop).
You could do:
$parts = explode('.', $ip_addr);
if (count($parts) == 4) {
foreach ($parts as $part) {
if ($part > 255 || $part < 0) {
return false; // octet out of range
}
}
return true;
} else {
return false;
}
But as others have suggested, ip2long/long2ip may suit your needs better...
