Validate session cookie value with regex - php

I'm trying to validate a session cookie value with a regex. I have run some tests, but without any success. Could someone write, and explain, a pattern that matches non-ASCII characters, whitespace (\s), commas and semicolons (basically all the characters forbidden in a session value)? If any one of those characters is found, the whole value is invalid.
At this moment my function is this:
session_name("RazorphynSupport");
if(isset($_COOKIE['RazorphynSupport']) && !is_string($_COOKIE['RazorphynSupport']) || !preg_match('/^[a-z0-9]{26,40}$/',$_COOKIE['RazorphynSupport'])){
setcookie(session_name(),'invalid',time()-3600);
//return error
exit();
}
session_start();
//Logout
if($_POST[$_SESSION['token']['act']]=='logout' && isset($_SESSION['status'])){
//Logout
}
//Session Check
if(isset($_SESSION['time']) && time()-$_SESSION['time']<=1800)
$_SESSION['time']=time();
else if(isset($_SESSION['id']) && !isset($_SESSION['time']) || isset($_SESSION['time']) && time()-$_SESSION['time']>1800){
//Destroy session; return error
exit();
}
else if(isset($_SESSION['ip']) && $_SESSION['ip']!=retrive_ip()){
//Destroy session; return error
exit();
}
else if(!isset($_POST[$_SESSION['token']['act']]) && !isset($_POST['act']) && $_POST['act']!='faq_rating' || $_POST['token']!=$_SESSION['token']['faq']){
//Destroy session; return error
exit();
}
This check runs before session_start(), but obviously it's wrong. My problem is that if the session cookie contains any illegal character, session_start() raises an error, so to prevent that I would like to check the "integrity" of the cookie first.
EDIT
From the Firebug cookie tab:
Cookie Name -> RazorphynSupport
Value -> ETpSx-T6VFuIYS3fejyaq0
I need to validate ETpSx-T6VFuIYS3fejyaq0 (which is a randomly generated string)

Instead of listing the things that you will deny, you should list the things that you will allow.
I am reminded of this TheDailyWTF story.
What do you expect in a session cookie? (By which I will assume you mean the value of a session cookie, which is a session id.) Is it a hexadecimal string? How long? 32 characters? 256? Something else entirely? Let's say your session ids are 32-character hexadecimal strings. Then the following regex will match them:
/^[a-f0-9]{32}$/
Anything not matching that regex is known to be invalid.
However, there's a much more important step to take: you should check whether you actually have a session with that id that has not been expired yet.
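Applied to the value in the question, an allow-list check might look like the sketch below. The function name is made up for illustration, and the character set (a-z, A-Z, 0-9, "-" and ",") and the 256-character cap are assumptions based on what PHP reports as valid for session ids; tighten both to whatever your own ids actually look like.

```php
<?php
// Hypothetical helper: accept only values that look like a PHP session id.
function looks_like_session_id($value)
{
    // A cookie sent as "name[]=x" arrives as an array, so check the type first.
    if (!is_string($value)) {
        return false;
    }
    // Assumed allow-list: letters, digits, "-" and ",", 1 to 256 characters.
    // The D modifier stops "$" from matching before a trailing newline.
    return preg_match('/^[A-Za-z0-9,-]{1,256}$/D', $value) === 1;
}

var_dump(looks_like_session_id('ETpSx-T6VFuIYS3fejyaq0')); // bool(true)
var_dump(looks_like_session_id('bad value;'));             // bool(false)
var_dump(looks_like_session_id(''));                       // bool(false)
```

A value that passes this check is only well-formed; it still has to correspond to a live, unexpired session.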

I think you can use this pattern:
if ( preg_match('/^[^:; ]+$/i', 'ETpSx-T6VFuIYS3fejyaq0') ) {
// valid
}
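This pattern accepts any non-empty string that contains no colon, semicolon, or space, so digits, letters a-z in uppercase and lowercase, and the hyphen - all pass (as does every other character outside that excluded set). The i modifier has no effect here, because the pattern contains no letters.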


What are the limits on session names in PHP?

The PHP docs on session_name() say:
It should contain only alphanumeric characters; it should be short and descriptive (i.e. for users with enabled cookie warnings). ... The session name can't consist of digits only, at least one letter must be present. Otherwise a new session id is generated every time.
So it's clear you must have something non-numeric in there, but it's not quite clear what characters you can't have. The cookie spec itself denies ()<>#,;:\"/[]?={}, but that still leaves others that might be permitted but are not strictly alphanumeric. This is important because cookie security prefixes use - and _ in names like __Secure-PHPSESSID. So I had a rummage in the PHP source code at the session_name function – but I can't see that it does anything other than check it's a string. In practice, it works fine, but I'd be more comfortable knowing precisely why! For example, this works:
session_name('__Secure-PHPSESSID');
session_start();
$_SESSION['test'] = ($_SESSION['test'] ?? '') . "\n" . rand(0,100); // ?? avoids an undefined-index notice on the first request
var_dump($_SESSION);
So what are the actual limits on PHP session names?
I got a bit further with this. The rules for a session name are defined in this validation function, which permits [a-zA-Z0-9,-]{1,256} (but not numeric-only). You can have commas and dashes in session names in addition to alphanumerics, so the docs are wrong on that. This function is called from an internal session_create_id function, which triggers a warning if the session name doesn't pass that validation.
Despite this, no warning is triggered when passing in a session name containing _. This is demonstrable:
<?php
ini_set('display_errors', true);
error_reporting(E_ALL);
session_name('__Secure-MySession');
session_start();
if (!array_key_exists('test', $_SESSION)) {
$_SESSION['test'] = '';
}
$_SESSION['test'] .= "\n" . rand(0,100);
var_dump($_SESSION);
echo session_name();
This works perfectly, triggering no errors or warnings, and shows a growing list of random numbers (showing that the session storage is working and therefore the cookies are too), and the second session_name call with no params shows the session name that we set:
__Secure-MySession
And the HTTP headers show that the script sets a cookie called __Secure-MySession:
I also tried naming the session My_Session, just in case PHP looks for explicit __Session- prefix, but that works just fine too. Other characters like # or ( do not trigger an error either; in those cases the session name is URL-encoded, which looks remarkably like this bug that was fixed quite a while ago. As expected, 123, works, but also URL-encodes the comma.
So while this demonstrates that having _ in session names works fine, I can't tell you why. I've asked elsewhere too, and if I find out, I will update this question!
Coincidentally, draft 06 of RFC6265bis expires today.

why are "+" characters missing from cookies or data strings passed to PHP, and how can I correct this

I thought I had a perfect scheme, using base64 encoded data for cookies in visitor pages, to identify the visitor. (Actually, the cookies hold RC4-encrypted data, re-processed with base64 to make a "cookie safe" result.) Since there are no characters output by base64 that are illegal for cookies in any browser, I was confident this would not pose a problem. I further hoped to check the cookie from a PHP script via the $_COOKIE array. All seemed to be going well until a particular cookie value ended up being base64 encoded as...
9xu3EhM5+6duW4feCL4aHuxOceo=
There was definitely no problem writing or reading this cookie value to my browser. If I create it using javascript and then examine it using the browser's privacy options, it is NOT corrupt. If I read the cookie via javascript and display it in an alert() or the console, it is also NOT corrupt. But upon "reading that" cookie from the PHP $_COOKIE array, what I got back was...
9xu3EhM5 6duW4feCL4aHuxOceo=
This is PHP 5.6 if it matters. Why is the "+" symbol missing? And sadly, the problem is not confined to the $_COOKIE array! Even writing a simple PHP program to respond back with what I send it (via a GET request), I still see the "+" sign missing in the response.
If this is a problem that is related to character encoding, I can't see how. Even if I just plug my PHP script URL into the browser's address bar, where no active page has set any character encoding, the "+" sign is lost en route to the script. And I've also verified that a simple script to do nothing but respond with the hard coded "non corrupt" string works fine.
So clearly the problem is confined to the passage of data FROM the browser TO the PHP. And even if I could come up with some crazy scheme to compensate for strings passed manually (like via a POST request), I don't see any way to control what the PHP script sees when data is pulled from the $_COOKIE array.
What can I do? I really have been counting on the script being able to do this seemingly simple task.
---EDIT---------------
Though I've found others complaining about this mysterious "+" character going missing since posting, I've seen no simple solution, and decided to implement my own. Since I've been doing all my base64 (encode and decode) from within my PHP scripts anyway, and since my code is the only place where these strings must be created, stored, and recovered, I've decided to run all base64 encoded strings through this routine (below) before using it to store a cookie. Likewise, I'll pass each cookie obtained (for example, via the $_COOKIE array) through it prior to base-64 decoding it.
// Used in both directions between browser and PHP: substitutes troublesome
// chars with other cookie-safe chars, or vice versa.
function fix64($inp) {
    $out = $inp;
    for ($i = 0; $i < strlen($inp); $i++) {
        $c = $inp[$i];
        switch ($c) {
            case '+': $c = '*'; break; // definitely won't transfer!
            case '*': $c = '+'; break;
            case '=': $c = ':'; break; // = symbol seems like a bad idea
            case ':': $c = '='; break;
            case '/': $c = '_'; break; // no good for dir name!!!
            case '_': $c = '/'; break;
            // "continue 2" moves on to the next character; a bare "continue"
            // inside a switch behaves like "break" in PHP.
            default: continue 2;
        }
        $out[$i] = $c;
    }
    return $out;
}
I'm simply substituting "+" (and I decided "=" as well) with other "cookie safe" characters, before returning the encoded value to the page, for use as a cookie.
EDIT-----
I added and altered the above a little to also remove/replace the "/" character, which is not a problem with the $_COOKIE array, but it is a troublesome character if, for example, you wanted to write a file or create a directory with the same name as the cookie.
Note that the length of the string being processed doesn't change. When the same (or another page on the site) runs my PHP script again, and I recover the cookie, I can then pass it back through the same fix64() call I created, knowing that from there I can decode it like normal base64.
I did not answer my own question, as I was hoping there would be some simple "official" PHP setting I could invoke that would change this behavior, and am still hopeful such a thing exists. But for my case, and for now, this is a reasonable approach, which can easily be reversed if I need to someday.
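For comparison, a different and more common substitution is the URL-safe base64 alphabet from RFC 4648, which replaces "+" and "/" with "-" and "_" and drops the "=" padding. This is not the fix64() routine above but a swapped-in technique; the two function names below are made up for this sketch.

```php
<?php
// Hypothetical helpers for the URL-safe base64 alphabet (RFC 4648, section 5).
function base64url_encode($data)
{
    // Swap "+" and "/" for "-" and "_", then drop the "=" padding.
    return rtrim(strtr(base64_encode($data), '+/', '-_'), '=');
}

function base64url_decode($text)
{
    // Restore the standard alphabet and the padding before decoding.
    $b64 = strtr($text, '-_', '+/');
    $pad = (4 - strlen($b64) % 4) % 4;
    return base64_decode($b64 . str_repeat('=', $pad), true);
}

$bytes = base64_decode('9xu3EhM5+6duW4feCL4aHuxOceo=');
echo base64url_encode($bytes), "\n"; // 9xu3EhM5-6duW4feCL4aHuxOceo
```

Unlike fix64(), the encoded length can shrink by up to two characters, because the padding is removed.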
setcookie() exists since PHP/4 and produces URL-encoded values:
setcookie('a', '9xu3EhM5+6duW4feCL4aHuxOceo=');
Set-Cookie: a=9xu3EhM5%2B6duW4feCL4aHuxOceo%3D
Accordingly, $_COOKIE URL-decodes the values:
Cookie: a=9xu3EhM5%2B6duW4feCL4aHuxOceo%3D
array(1) {
["a"]=>
string(28) "9xu3EhM5+6duW4feCL4aHuxOceo="
}
Since PHP/5 there's also setrawcookie() with the only purpose of not URL-encoding values:
setrawcookie('b', '9xu3EhM5+6duW4feCL4aHuxOceo=');
Set-Cookie: b=9xu3EhM5+6duW4feCL4aHuxOceo=
But $_COOKIE still assumes URL-encoded input and stuff breaks (+ is the legacy form encoding of U+0020 'SPACE', aka good old whitespace):
Cookie: b=9xu3EhM5+6duW4feCL4aHuxOceo=
array(1) {
["b"]=>
string(28) "9xu3EhM5 6duW4feCL4aHuxOceo="
}
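The same effect can be reproduced without any cookies. This is a minimal sketch that assumes $_COOKIE decodes values the way urldecode() does, which is consistent with the output above:

```php
<?php
$value = '9xu3EhM5+6duW4feCL4aHuxOceo=';

// urldecode() treats "+" as an encoded space.
var_dump(urldecode($value));    // string(28) "9xu3EhM5 6duW4feCL4aHuxOceo="

// rawurldecode() only decodes %XX sequences and leaves "+" alone.
var_dump(rawurldecode($value)); // string(28) "9xu3EhM5+6duW4feCL4aHuxOceo="

// A value that was percent-encoded first survives the round trip.
var_dump(urldecode(rawurlencode($value)) === $value); // bool(true)
```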
Interestingly, I couldn't find a counterpart for setrawcookie(). That leaves you in the situation of having to write your own parser :-! $_SERVER['HTTP_COOKIE'] contains the raw value of the HTTP header, which is a semicolon-separated list, e.g.:
a=9xu3EhM5%2B6duW4feCL4aHuxOceo%3D; b=9xu3EhM5+6duW4feCL4aHuxOceo=
For instance, the Slim microframework has a Cookies::parseHeader() method to do exactly that (not sure why, since they urldecode() everything anyway):
public static function parseHeader($header)
{
if (is_array($header) === true) {
$header = isset($header[0]) ? $header[0] : '';
}
if (is_string($header) === false) {
throw new InvalidArgumentException('Cannot parse Cookie data. Header value must be a string.');
}
$header = rtrim($header, "\r\n");
$pieces = preg_split('#[;]\s*#', $header);
$cookies = [];
foreach ($pieces as $cookie) {
$cookie = explode('=', $cookie, 2);
if (count($cookie) === 2) {
$key = urldecode($cookie[0]);
$value = urldecode($cookie[1]);
if (!isset($cookies[$key])) {
$cookies[$key] = $value;
}
}
}
return $cookies;
}
I guess you can use this code and skip the decoding part.
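A stripped-down variant without the decoding step could look like the sketch below. The function name is made up for illustration, and the parsing is deliberately simple: it assumes a well-formed header and keeps the first value when a name repeats, as the Slim code does.

```php
<?php
// Hypothetical helper: split a raw Cookie header into name => value pairs
// without URL-decoding anything.
function parse_raw_cookie_header($header)
{
    $cookies = [];
    foreach (preg_split('/;\s*/', trim($header)) as $pair) {
        // Split on the first "=" only, so "=" inside the value is preserved.
        $parts = explode('=', $pair, 2);
        if (count($parts) === 2 && !isset($cookies[$parts[0]])) {
            $cookies[$parts[0]] = $parts[1];
        }
    }
    return $cookies;
}

// HTTP_COOKIE is absent when the request carries no cookies (or on the CLI).
$header = isset($_SERVER['HTTP_COOKIE']) ? $_SERVER['HTTP_COOKIE'] : '';
$raw = parse_raw_cookie_header($header);
```

Values set with setcookie() still need a rawurldecode() afterwards; only values written with setrawcookie() (or by JavaScript) should be used as-is.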

PHP - preg_match confusion - not checking for correct character set - needs spaces

I'm currently building an application that will have addresses inside it and I'm using preg_match to do some character detection and throw an error the user's way if they use invalid characters to make sure it's secure.
My problem is using preg_match it seems to be behaving strangely and I don't think I'm using it correctly or not because of how it's acting.
My code below should allow letters A to Z, both upper and lowercase for the City, County and Country. The code below is an example of the city when that field is updated:
// The user wants to update the city so let's go through the update process
if ( isset($_POST['city']) ) {
// Check that the city is different before we try to update it
if ( $newcompanycity != $companycity ) {
// Check that the city is less than 50 characters and contains valid characters
if ( (strlen($newcompanycity) <= 50) && (preg_match('/[^A-Za-z]/', $newcompanycity)) ) {
// The city is fine so let's update it
$update_success = true;
mysqli_query($sql, "UPDATE clients SET city = '$newcompanycity' WHERE companyid = '$companyid'") or die(mysqli_error($sql));
} else {
// The city doesn't meet the requirements to be update so return an error
$update_fail = true;
$city_error = true;
}
}
}
The issue is that changing the value from "Sheffield" to "York" returns an error, because the $city_error variable becomes true. However, changing the value from "Sheffield" to "Sheffield 1" works and updates the database.
Am I missing something here? I thought that A-Za-z only checks for letters and if there are only letters then it should work. But this doesn't seem to be working at all.
Just a quick update before I posted this. I just realised I need to add a space at the end of string and then it works. I'm really confused with this. So without a space it returns an error as preg_match doesn't allow it but with a space even though it's not defined inside preg_match it allows it. Surely this isn't normal behavior?
Your regex is /[^A-Za-z]/ which means something like "everything that is not A-Z and a-z".
The preg_match function returns 1 if the pattern matches, 0 if it does not, and false on error, so if no invalid characters were found it returns 0.
So if you change (preg_match('/[^A-Za-z]/', $newcompanycity)) to (preg_match('/[^A-Za-z]/', $newcompanycity) === 0) it should work as expected, as it becomes true if no invalid characters were found.
To allow spaces as well, just add a space to the character class: /[^A-Za-z ]/.
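The same rule can also be written positively, as "the whole string consists of allowed characters". A minimal sketch follows; the function name is made up, and folding the 50-character limit into the pattern is a design choice for this example rather than something the original code does.

```php
<?php
// Hypothetical helper: true when $city is 1 to 50 ASCII letters or spaces.
function is_valid_city($city)
{
    // ^ and $ anchor the match to the whole string; D stops "$" from
    // matching before a trailing newline.
    return is_string($city) && preg_match('/^[A-Za-z ]{1,50}$/D', $city) === 1;
}

var_dump(is_valid_city('York'));        // bool(true)
var_dump(is_valid_city('New York'));    // bool(true)
var_dump(is_valid_city('Sheffield 1')); // bool(false)
```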
Sometimes regexp only further complicates things. PHP has a great function called ctype_alpha() that will check if a variable is only A-Za-z.
(ctype_alpha($newcompanycity))
Here's a working example for you

Recovering from an invalid session id in the most elegant manner

Lately, I've been getting the following error(s) in my log:
PHP Warning: session_start(): The session id is too long or contains illegal characters, valid characters are a-z, A-Z, 0-9 and '-,' in [...] on line [..]
PHP Warning: Unknown: The session id is too long or contains illegal characters, valid characters are a-z, A-Z, 0-9 and '-,' in Unknown on line 0
PHP Warning: Unknown: Failed to write session data (files). Please verify that the current setting of session.save_path is correct (/var/lib/php5) in Unknown on line 0
The answers to this previous question suffice for detecting and diverting such scenarios, so that no error is generated, but I'm interested in the most elegant recovery; that is, ending up with a valid new (empty) session.
First, though, a few refinements to the detection & diversion code in that previous question, which is now six years old:
These days, sessions are more likely to be handled exclusively through cookies. The session.use_only_cookies flag in my php.ini is enabled, and since I don't remember changing it I assume this is the default value.
Which characters are used in a valid session id, and how many of those characters, depends on the session.hash_function and session.hash_bits_per_character values in php.ini. My values are 0 and 5, respectively, which (unless I'm mistaken) means my own session ids should match the regular expression /^[a-v0-9]{26}$/. I assume these are default values as well.
The name of the cookie used to store the session can be customized using session.name in php.ini. The correct value can always be retrieved using the session_name() function.
Given these, the most elegant means of diversion would (probably) be as follows:
function my_session_start() {
if (!isset($_COOKIE[session_name()]) || preg_match('/^[a-v0-9]{26}$/', $_COOKIE[session_name()]) !== 0) {
return session_start(); // since 5.3, returns TRUE on success, FALSE on failure.
}
else {
return false;
}
}
As for recovery (which would replace the return false; in the else block), one answer to the previous question suggested the following:
session_id(uniqid());
session_start();
session_regenerate_id();
I'm concerned, however, that this means of recovery, having received only two upvotes and no comments, is insufficiently reviewed.
The answer to this question suggests that the internal session_start() function relies directly on the value of $_COOKIE[session_name()], rather than some other internal representation of that value. Is this the case? If so, a my_session_start() function with detection and recovery could be as simple as:
function my_session_start() {
if (isset($_COOKIE[session_name()]) && preg_match('/^[a-v0-9]{26}$/', $_COOKIE[session_name()]) === 0) {
unset($_COOKIE[session_name()]);
}
return session_start();
}
I've only seen this happen when $_COOKIE['PHPSESSID'] === ''. This happens when someone uses Firebug to "clear cookie". I want to only handle this specific scenario; if the session ID is invalid in some other way, I want to be warned. Dealing with my specific use case is simple:
$session_name = session_name();
if (isset($_COOKIE[$session_name]) and empty($_COOKIE[$session_name])) {
// This happens when someone does "clear cookie" in Firebug; it causes session_start()
// to trigger a warning. session_start() relies on $_COOKIE[$session_name], thus:
unset($_COOKIE[$session_name]);
}
session_start();
Out of curiosity, do you actually need to deal with any scenario other than it being an empty string? It could only be something else if done maliciously (in order to trigger a warning), but since you have display_errors disabled in production, nobody can gain any information from that.
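If the broader case does matter, the check can be factored into a small function that is easy to test on its own. This is only a sketch: the function name is made up, the pattern is the one from the question and depends on your php.ini settings, and it relies on the claim above that session_start() reads $_COOKIE[session_name()].

```php
<?php
// Hypothetical helper: return $cookies without the session entry when that
// entry does not look like a session id according to $pattern.
function drop_invalid_session_cookie(array $cookies, $name, $pattern)
{
    if (isset($cookies[$name])
        && (!is_string($cookies[$name]) || preg_match($pattern, $cookies[$name]) !== 1)) {
        unset($cookies[$name]);
    }
    return $cookies;
}

// Usage sketch, assuming the default 26-character ids described in the question:
// $_COOKIE = drop_invalid_session_cookie($_COOKIE, session_name(), '/^[a-v0-9]{26}$/D');
// session_start();
```

With the bad value removed, session_start() would then see no id and create a new, empty session.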

Prevent Whitespaces from Wysiwyg Input

I have CKEditor embedded in my page. I need to reject input that consists only of whitespace and line breaks with no other characters. There must be at least one actual visible character.
The following approach is not consistent at all: sometimes it works fine, and sometimes it does nothing and lets whitespace through:
if(!empty($_POST['rtxt_article']))
{
if (trim(strip_tags($_POST['rtxt_article']))) {
// do something
}
else
{
//ops! please fill in data
}
}
else
{
//ops! please fill in data
}
I also tried this:
$plainText = strip_tags($_POST['rtxt_offer']);
$isNotEmpty = trim($plainText);
if($isNotEmpty)
{
//do something
}
When the above snippet stops having any effect, I add a ! and the snippet works again. After a while, the snippet stops working until I remove the !, and vice versa. Totally inconsistent. This is how I add the !:
if(!$isNotEmpty) ...
if (!trim(strip_tags($_POST['rtxt_article']))) ...
Any idea? Any other solution?
Give this a shot. It first checks whether there is actually any input by using the empty() function. Adding the ! to empty() makes the if statement ask whether $_POST['rtxt_article'] is NOT empty, meaning that there is at least one character in it.
if (!empty($_POST['rtxt_article']) && trim(strip_tags($_POST['rtxt_article']))) {
// do something
}
If for some reason it is still being passed with the new line character, then you could scrub the $_POST var first.
edited:
$var = trim($_POST['rtxt_article']);
if (!empty($var)) {
// do something
}
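One possible cause of the inconsistency, offered as an assumption rather than a diagnosis: rich-text editors often emit an "empty" paragraph as <p>&nbsp;</p>, and neither strip_tags() nor trim() removes the &nbsp; entity. A sketch of a stricter check follows, with a made-up function name:

```php
<?php
// Hypothetical helper: true when $html contains at least one visible character.
function has_visible_text($html)
{
    // Remove the markup, then turn entities such as &nbsp; into real characters.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');
    // Delete ordinary whitespace and the non-breaking space (U+00A0).
    $text = preg_replace('/[\s\x{00A0}]+/u', '', $text);
    // preg_replace() returns null on invalid UTF-8; treat that as "no text".
    // Comparing with '' (instead of relying on truthiness) keeps "0" valid.
    return $text !== null && $text !== '';
}

var_dump(has_visible_text('<p>&nbsp;</p>')); // bool(false)
var_dump(has_visible_text('<p>Hello</p>'));  // bool(true)
var_dump(has_visible_text('<p>0</p>'));      // bool(true)
```

The last case is worth noting: both empty('0') and a bare if ('0') treat the string "0" as empty, so an article consisting only of "0" would be rejected by the truthiness-based checks above.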
