Stop CodeIgniter from incorrectly decoding URL-encoded content - php

By default, CodeIgniter blocks %27 (') from appearing in URLs. I have commented out the entire $config['permitted_uri_chars'] directive as a result. However, when I now parse part of the URL as a method argument that contains %27, or any other URL-encoded portion, CodeIgniter converts it into a plain ? before I can even run rawurldecode() on it. How can I stop it from doing this? We're using CI v1.7.x.
Here is some simple code to show it:
In the "Program" controller:
function test($parameter)
{
    echo $parameter;
}
Then we load http://example.com/program/test/o%27clock, and we get:
o?clock
I expected o'clock, or at least o%27clock, which I could then pass through rawurldecode() to get o'clock.
UPDATE: Unfortunately, I was wrong. It was not being caused by CodeIgniter. Rather, the presence of Suhosin was doing the substitution.

The problem wasn't CodeIgniter. It was suhosin.

Related

Codeigniter trying to send HTML content via ajax?

I'm trying to send an HTML string from the client to the server via ajax. I keep getting a "Disallowed Key Characters" error. So I took $config['permitted_uri_chars'] = 'a-z 0-9~%.:_\-'; and set it to nothing, $config['permitted_uri_chars'] = '';, since CodeIgniter says "Leave blank to allow all characters -- but only if you are insane." But I still get the Disallowed Key Characters error.
This is how I'm trying to send it:
var content = '<p class="MsoNormal">Hi {$first_name}</p>\n<p class="MsoNormal">My name is Bill, etc etc.</p>';
$.get('/task/preview_template', {content:content}, function(data) {
    console.log(data); // Disallowed Key Characters
});
_clean_input_keys is your likely culprit for what's throwing the error, and you have a large number of characters that fall outside of the allowed characters of "/^[a-z0-9:_\/-]+$/i".
There are a few ways that I can think of that might handle this:
Modify _clean_input_keys so that it accepts the extra characters. This, of course, is an internal function for a reason and shouldn't be changed unless you know what you're doing. (Alternatively, you may be able to modify it to allow the special characters for HTML encoding and HTML encode the string. This helps mitigate the compromise to security that comes with adding such characters to _clean_input_keys.)
Encode your string before sending it, then decode it on the server side. This is a little more work on both your part, and that of the computers involved, but it keeps _clean_input_keys intact, and should allow you to send your string up, if you can find an encoding that is reliable in both directions and doesn't produce any disallowed characters. Since you're using GET, you may also run into GET input limits on not only the server, but browser-side, as well.
Use POST instead of GET and send your content as a data object. Then just use the $_POST variable on the server, instead of $_GET. While this may work, it is a bit unorthodox and nonstandard usage of the REST verbs.
Store your template content on the server, and reference it by name, instead of storing it in the JavaScript. This, of course, only works if you're not generating your template content on the fly in the JavaScript. If you're using the same template(s) in all of your JavaScript calls, though, then there's really no reason to send that information from JavaScript to begin with.
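The second option (encode before sending, decode on the server) could look roughly like the sketch below. It assumes a base64url-style encoding, which only produces letters, digits, '-' and '_'. The helper names encodeForTransport and decodeFromTransport are hypothetical, and the client would have to produce the same encoding in JavaScript before sending.

```php
<?php
// Hypothetical helpers, not part of CodeIgniter.
// base64url output contains only A-Z, a-z, 0-9, '-' and '_'.
function encodeForTransport($raw)
{
    // Standard base64, then swap the two URL-unsafe characters and drop padding.
    return rtrim(strtr(base64_encode($raw), '+/', '-_'), '=');
}

function decodeFromTransport($encoded)
{
    // Undo the character swap and restore the '=' padding base64_decode expects.
    $base64  = strtr($encoded, '-_', '+/');
    $padding = (4 - strlen($base64) % 4) % 4;
    return base64_decode($base64 . str_repeat('=', $padding), true);
}

$html    = '<p class="MsoNormal">Hi {$first_name}</p>';
$encoded = encodeForTransport($html);
$decoded = decodeFromTransport($encoded);
```

Base64 grows the payload by about a third, so the GET length limits mentioned above still apply.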

Decode a byte encoded string via my URL

We have a PHP site on Zend Framework with a backend Postgresql database. Our primary character encoding is UTF-8.
I just checked our error log and found a strange entry. My URL is as follows:
www.mydomain.com/schuhe-für-breite-füsse
however someone (or maybe a bot) has tried to access this URL as follows:
www.mydomain.com/schuhe-f\xc3\xbcr-breite-f\xc3\xbcsse/
It's the first time I've seen something like the above. Two things are happening on my page:
1) The above URL is queried against our CMS. This works fine for some reason; I think Postgresql realises it is byte-encoded and converts it back when trying to find this SEF URL in our database.
2) An Ajax request is made on the page, passing the same SEF URL. This fails. I believe the slashes are causing a problem in JavaScript.
To avoid this I want to decode any URL that is encoded like this. However, a quick test of the following code did not decode anything for me :(
$landing_sef_url = $this->_getParam('landing_sef_url');
$utf8=html_entity_decode($landing_sef_url);
$iso8859=utf8_decode($utf8);
$test3 = html_entity_decode($landing_sef_url, 1, "ISO-8859-1");
$test4 = urldecode($landing_sef_url);
echo utf8_decode("$landing_sef_url");
echo "<br/><br/>";
die($landing_sef_url . " -- $utf8 -- $iso8859 <br/>$test3<br/>$test4");
I found the above via various posts online but they all print back the same result - schuhe-f\xc3\xbcr-breite-f\xc3\xbcsse
Any help would be MUCH appreciated. Many thanks!
This method seems to do what you're looking for:
http://li.php.net/manual/en/function.stripcslashes.php
But if you're just looking to unescape \x## sequences, you could also do this with a fairly simple regular expression.
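As a rough sketch of both suggestions, assuming the parameter really contains a literal backslash followed by x and two hex digits: stripcslashes() is a built-in PHP function, while decodeHexEscapes is a hypothetical helper name for the regular-expression variant.

```php
<?php
// Single quotes keep the backslashes literal, as they appear in the error log.
$raw = 'schuhe-f\xc3\xbcr-breite-f\xc3\xbcsse';

// Variant 1: stripcslashes() understands C-style escapes, including \x## hex.
$viaStripcslashes = stripcslashes($raw);

// Variant 2 (hypothetical helper): only unescape \x## sequences, nothing else.
function decodeHexEscapes($value)
{
    return preg_replace_callback(
        '/\\\\x([0-9a-fA-F]{2})/',
        function ($match) {
            // Convert the two hex digits into the corresponding byte.
            return chr(hexdec($match[1]));
        },
        $value
    );
}

$viaRegex = decodeHexEscapes($raw);
```

Both variables then hold the UTF-8 bytes for schuhe-für-breite-füsse. Note that stripcslashes() also converts other escapes such as \n and octal sequences, which is why the narrower regular expression may be preferable for untrusted URL input.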

PHP json_decode returns null

I'm writing PHP code that uses a database. To do so, I use an array as a hash-map.
Every time content is added or removed from my DB, I save it to file.
I'm forced by my DB structure to use this method and can't use mysql or any other standard DB (School project, so structure stays as is).
I built two functions:
function saveDB($db){
$json_db = json_encode($db);
file_put_contents("wordsDB.json", $json_db);
} // saveDB
function loadDB(){
$json_db = file_get_contents("wordsDB.json");
return json_decode($json_db, true);
} // loadDB
When echoing the string I get after the encoding or after loading from file, I get valid JSON (tested in a JSON viewer). Whenever I try to decode the string using json_decode(), I get null (tested with var_dump()).
The json string itself is very long (~200,000 characters, and that's just for testing).
I tried the following:
Replacing single/double-quotes with double/single-quotes (Without any backslashes, with one backslash and three backslashes. And any combination I could think of with a different number of backslashes in the original and replaced string), both manually and using str_replace().
Adding quotes before and after the json string.
Changing the page's encoding.
Decoding without saving to file (Right after encoding).
Checked for slashes and backslashes. None to be found.
Tried addslashes().
Tried using various "Escape String" variants.
json_last_error() doesn't work. I get no error number (Get null, not 0).
It's not my server, so I'm not sure what PHP version is used, and I can't upgrade/downgrade/install anything.
I believe the size has something to do with it, because small strings seem to work fine.
Thanks Everybody :)
In your JSON file change null to "null" and it will solve the problem.
Check if your file is UTF8 encoded. json_decode works with UTF8 encoded data only.
EDIT:
After I saw the uploaded JSON data, I did some digging and found that there is a 'null' key. Search for:
"exceeding":{"S01E01.html":{"2217":1}},null:{"S01E01.html":
Change that null to be valid property name and json_decode will do the job.
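A minimal sketch of how such a failure can be surfaced, assuming PHP 5.3 or later where json_last_error() is available. The helper name decodeJsonOrExplain is hypothetical, and the sample string is a shortened, closed-off version of the fragment above.

```php
<?php
// Hypothetical helper: decode JSON and report the error code when decoding fails.
function decodeJsonOrExplain($json)
{
    $data = json_decode($json, true);
    if ($data === null && json_last_error() !== JSON_ERROR_NONE) {
        // Decoding failed; hand back the error code instead of a bare null.
        return array('ok' => false, 'error' => json_last_error());
    }
    return array('ok' => true, 'data' => $data);
}

// An unquoted null used as a property name is not valid JSON.
$bad  = '{"exceeding":{"S01E01.html":{"2217":1}},null:{"S01E01.html":1}}';
$good = '{"exceeding":{"S01E01.html":{"2217":1}},"null":{"S01E01.html":1}}';

$badResult  = decodeJsonOrExplain($bad);
$goodResult = decodeJsonOrExplain($good);
```

Here $badResult carries JSON_ERROR_SYNTAX, while $goodResult contains the decoded array with a "null" string key.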
I had a similar problem last week. My JSON was valid according to jsonlint.com.
My JSON string contained a # and a &, and those two made json_decode fail and return null.
By using var_dump(json_decode($myvar)), which stops right where it fails, I managed to figure out where the problem was coming from.
I suggest var_dumping and using the find function to look for these kinds of characters.
Just on the off chance.. and more for anyone hitting this thread rather than the OP's issue...I missed the following, someone had htmlentities($json) way above me in the call stack. Just ensure you haven't been bitten by the same and check the html source.
Kickself #124

Remove double-quotes from a json_encoded string on the keys

I have a json_encoded array which is fine.
I need to strip the double-quotes on all of the keys of the json string on returning it from a function call.
How would I go about doing this and returning it successfully?
Thanks!
I do apologise, here is a snippet of the json code:
{"start_date":"2011-01-01 09:00","end_date":"2011-01-01 10:00","text":"test"}
Just to add a little more info:
I will be retrieving the JSON via an AJAX request, so if it would be easier, I am open to ideas in how to do this on the javascript side.
EDITED as per anubhava's comment
$str = '{"start_date":"2011-01-01 09:00","end_date":"2011-01-01 10:00","text":"test"}';
$str = preg_replace('/"([^"]+)"\s*:\s*/', '$1:', $str);
echo $str;
This certainly works for the above string, although there may be some edge cases that I haven't thought of for which this will not work. Whether this will suit your purposes depends on how static the format of the string and the elements/values it contains will be.
TL;DR: Missing quotes is how Chrome shows it is a JSON object instead of a string. Ensure that you have header('Content-Type: application/json; charset=utf-8'); in PHP's AJAX response to solve the real problem.
DETAILS:
A common reason for wanting to solve this problem is due to finding this difference while debugging the processing of returned AJAX data.
In my case I saw the difference using Chrome's debugging tools. When connected to the legacy system, upon success, Chrome showed that there were no quotes shown around keys in the response according to the debugger. This allowed the object to be immediately treated as an object without using a JSON.parse() call. Debugging my new AJAX destination, there were quotes shown in the response and variable was a string and not an object.
I finally realized the true issue when I tested the AJAX response externally and saw that the legacy system actually DID have quotes around the keys. This was not what the Chrome dev tools showed.
The only difference was that on the legacy system there was a header specifying the content type. I added this to the new (WordPress) system and the calls were now fully compatible with the original script and the success function could handle the response as an object without any parsing required. Now I can switch between the legacy and new system without any changes except the destination URL.
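A minimal sketch of such a response, assuming a plain PHP endpoint; the function name sendJson is hypothetical and the payload is taken from the question's snippet.

```php
<?php
// Hypothetical helper: send a JSON body with an explicit content type so the
// client library can parse it into an object instead of leaving it a string.
function sendJson(array $payload)
{
    if (!headers_sent()) {
        header('Content-Type: application/json; charset=utf-8');
    }
    $body = json_encode($payload);
    echo $body;
    return $body;
}

// Usage in the AJAX endpoint:
// sendJson(array('start_date' => '2011-01-01 09:00', 'text' => 'test'));
```

The keys stay quoted on the wire, as valid JSON requires; only the way the client interprets the response changes.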

PHP form auto escaping posted data?

I have an HTML form POSTing to a PHP page.
I can read in the data using the $_POST variable on the PHP.
However, all the data seems to be escaped.
So, for example
a comma (,) = %2C
a colon (:) = %3a
a slash (/) = %2F
so something like a simple URL such as http://example.com gets POSTed as http%3A%2F%2Fexample.com
Any ideas as to what is happening?
Actually you want urldecode. %xx is a URL encoding, not an HTML encoding. The real question is why you are getting these codes. PHP usually decodes URL-encoded input for you as it parses the request into the $_GET, $_POST and $_REQUEST variables, so POSTed values should not still be URL-encoded when you read them. Can you show us some of the code generating the form? Maybe your form is being encoded on the way out for some reason.
See the warning on this page: http://us2.php.net/manual/en/function.urldecode.php
Here is a simple PHP loop to decode all POST vars
foreach($_POST as $key=>$value) {
    $_POST[$key] = urldecode($value);
}
You can then access them as normal, but properly decoded. I, however, would use a different array to store them, as I don't like to pollute the superglobals (I believe they should always contain the exact data as set by PHP).
This shouldn't be happening, and though you can fix it by manually urldecode()ing, you will probably be hiding a basic bug elsewhere that might come round to bite you later.
Although the values in a form POSTed with the default content type 'application/x-www-form-urlencoded' are URL-encoded (%xx) on the wire, PHP undoes that for you when it makes the values available in the $_POST[] array.
If you are still getting unwanted %xx sequences afterwards, there must be another layer of manual URL-encoding going on that shouldn't be there. You need to find where that is. If it's a hidden field, maybe the page that generates it is accidentally encoding it using urlencode() instead of htmlspecialchars(), or something? Putting some example code online might help us find out.
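The double-encoding scenario described above can be illustrated with a small sketch. It assumes a hidden field named site (a hypothetical name) and uses parse_str() to stand in for the single decoding pass PHP applies to the request body.

```php
<?php
$url = 'http://example.com';

// Wrong: URL-encoding the value while writing it into the HTML form.
$wrongFieldValue = urlencode($url);
// Right: HTML-escaping the value for use in an attribute.
$rightFieldValue = htmlspecialchars($url);

// The browser URL-encodes whatever the field holds when it submits the form;
// parse_str() then mimics the one decoding pass PHP performs for $_POST.
parse_str('site=' . urlencode($wrongFieldValue), $postFromWrong);
parse_str('site=' . urlencode($rightFieldValue), $postFromRight);
```

With the urlencode() variant the received value still contains %3A and %2F, because one layer of encoding survives PHP's decoding; with htmlspecialchars() the original URL arrives intact.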
