Strip Apostrophes from URL - php

[EDIT] I am placing the comment I entered near the bottom of this post here to, hopefully, avoid further down votes.
This was a pretty basic question stemming from my misunderstanding of what exactly $_REQUEST is. My understanding was that it was an index that referenced $_POST and $_GET (and $_COOKIE). However, I found that $_REQUEST is, itself, an array, so I simply changed the variables in $_REQUEST. Not an optimal solution, but a solution, nonetheless. It has the added advantage that the $_GET variables, with the apostrophes still there, are available. Perhaps not the best practice, but please note before you down vote that I have very little control over this data - coming in from one API and going out to another.
I have an API currently in use. We have a problem with some customers sending apostrophes in the URL. My question is how best to strip the apostrophes within the URL array. Perhaps using array_walk or something similar?
So that $_REQUEST[Customer] == "O'Henry's"
Becomes $_REQUEST[Customer] == "OHenrys"
EDIT: Judging from some of the answers here, I believe I need to explain a little better. This is an API that is already written and is the preliminary interface for another AS400 API. I have nothing to do with building the URL. I am receiving it. All I am concerned about is removing the apostrophes, without changing any other code. So the best way is to go through the array. In the body of the code, the variable references are all using $_REQUEST[]. I COULD go in and change those to $_GET[] if absolutely necessary but would rather avoid that.
This Works
foreach($_REQUEST as $idx => $val)
{
$_REQUEST[$idx] = str_replace("'" , '' , $val);
}
However, I am a little leery of using $_REQUEST in that manner. Does anyone see a problem with that? (Replacing $_REQUEST with $_GET does not work.)

For some use cases, it might make sense to store a "clean" or "pretty" version of the name. In that case, you may want to standardize to a case and have a whitelist of characters rather than a blacklist consisting of just single quotes. Use a regex to enforce this, perhaps similar to this one:
preg_replace("/[^[:alnum:][:space:]]/u", '', $string);
If you do that, consider if it is necessary to differentiate between different customers named O'Henrys, O'Henry's, OHenrys, O'henry's, and so on. Make sure your constraints are enforced by the app and the database.
The array_walk_recursive function is a reasonable way to hit every item in an array:
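function sanitize(&$item, $key)
{
    if (is_string($item)) {
        // apply whitelist constraints: keep only letters, digits, and whitespace
        $item = preg_replace('/[^[:alnum:][:space:]]/u', '', $item);
    }
}
array_walk_recursive($array, 'sanitize');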
It's hard to tell without more context, but it seems possible you may be asking the wrong question / solving the wrong problem.
Remember that you can almost always escape "special" characters and render them a non-issue.
In an HTML context where a single quote might cause problems (such as an attribute value denoted by single quotes), escape for HTML using htmlspecialchars or a library-specific alternative:
<?php
// some stuff
$name = "O'Henry's";
?><a data-customer='<?=htmlspecialchars($name, ENT_QUOTES|ENT_HTML5);?>'>whatever</a><?php
// continue
For JavaScript, encode using json_encode:
<?php
// some stuff
$name = "O'Henry's";
?><script>
var a = <?=json_encode($name);?>
alert(a); // O'Henry's
</script>
For SQL, use PDO and a prepared statement:
$dbh = new PDO('mysql:host=localhost;dbname=whatever', $user, $pass);
$name = "O'Henry's";
$stmt = $dbh->prepare("INSERT INTO REGISTRY (name) VALUES (:name)");
$stmt->bindParam(':name', $name);
$stmt->execute();
For use in a URL query string, use urlencode:
<?php
// some stuff
$name = "O'Henry's";
?><a href="?name=<?=urlencode($name);?>">whatever</a><?php
// continue
For use in a URL query path use rawurlencode:
<?php
// some stuff
$name = "O'Henry's";
?><a href="/<?=rawurlencode($name);?>">whatever</a><?php
// continue
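The difference between the two functions only shows up for certain characters, such as spaces. Here is a minimal sketch; the value with a space in it and both URL shapes are made up for illustration:

```php
<?php
$label = "O'Henry's Diner"; // hypothetical value containing a space

// Query string: urlencode() encodes a space as "+"
$query = '?name=' . urlencode($label);        // ?name=O%27Henry%27s+Diner

// Path segment: rawurlencode() encodes a space as "%20"
$path = '/customers/' . rawurlencode($label); // /customers/O%27Henry%27s%20Diner

// The finished URL still needs HTML escaping when it goes into an attribute
echo '<a href="' . htmlspecialchars($path . $query, ENT_QUOTES, 'UTF-8') . '">whatever</a>', "\n";
```

Either way the apostrophe ends up as %27, so it never has to be stripped from the data itself.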
Libraries and frameworks will provide additional ways to escape things in those and other contexts.

If you want them removed altogether as an illegal character:
<?php foreach($myArray as $idx => $val){
$myArray[$idx] = str_replace("'" , '' , $val);
}
?>
However, this shouldn't be your solution for SQL inserts etc. You are better off using mysqli::real_escape_string or prepared statements.

This was a pretty basic question stemming from my misunderstanding of what exactly $_REQUEST is. My understanding was that it was an index that referenced $_POST and $_GET (and $_COOKIE). However, I found that $_REQUEST is, itself, an array, so I simply changed the variables in $_REQUEST. Not an optimal solution, but a solution, nonetheless. It has the added advantage that the $_GET variables, with the apostrophes still there, are available. Not the best practice, though.

EDIT:
Reading the edits you made to your question, the best solution for you is str_replace(). There is no need to loop through your array, though: the third parameter can be an array!
This will strip apostrophes from every item in $foo:
$foo = [
"O'Henry's",
"D'Angleterre"
];
$foo = str_replace("'", "", $foo);
If you really need to remove the apostrophes use str_replace():
$foo = "O'Henry's";
$foo = str_replace("'", "", $foo);
// OUTPUT: OHenrys
If you can keep them, you had better encode them. urlencode() is one way to do it:
$foo = urlencode($foo);
// OUTPUT: O%27Henry%27s
If you build this URL from an array you could use http_build_query():
$foo = [
'Customer' => "O'Henry's"
];
$foo = http_build_query($foo);
// OUTPUT: Customer=O%27Henry%27s

Related

Is a foreach loop on $_GET a good way to apply htmlspecialchars?

I'm wondering if there is a significant downside to using the following code:
if(isset($_GET)){
foreach($_GET as $v){
$v = htmlspecialchars($v);
}
}
I realize that it probably isn't necessary to use htmlspecialchars on each variable. Anyone know offhand if this is good to do?
UPDATE:
Because I don't think my above code would work, I'm updating this with the code that I'm using (despite the negativity towards the suggestions). :)
if(isset($_GET)){
foreach($_GET as $k=>$v){
$_GET[$k] = htmlspecialchars($v);
}
}
This totally depends on what you want to do.
In general, the answer is "no", and you should only escape data specifically for their intended purpose. Randomly escaping data without purpose isn't helping, and it just causes further confusion, as you have to keep track of what's been escaped and how.
In short, keep your data stored raw, and escape it specifically for its intended use when you use it:
for HTML output, use htmlentities().
for shell command names, use escapeshellcmd().
for shell arguments, use escapeshellarg().
for building a GET URL string, use urlencode() on the parameter values.
for database queries, use the respective database escape mechanism (or prepared statements).
This reasoning applies recursively. So if you want to write a link to a GET URL to the HTML output, it'd be something like this:
echo '<a href="index.php?foo=' . htmlentities(urlencode($var)) . '">click</a>';
It'd be terrible if at that point you'd have to remember if $var had already previously been escaped, and how.
Blanket escaping isn't necessary, and it's possibly harmful to the data. Don't do it.
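To make the "store raw, escape on use" idea concrete, here is a small sketch. The sample value and the index.php?customer= URL are assumptions for illustration only:

```php
<?php
$var = "O'Henry's & Sons"; // raw value, kept unescaped (example data)

// HTML output: escape for HTML at the moment of output
$html = htmlspecialchars($var, ENT_QUOTES, 'UTF-8');

// GET parameter value: URL-encode only the value
$url = 'index.php?customer=' . urlencode($var);

// Link inside HTML: URL-encode first, then HTML-escape the whole URL
$link = '<a href="' . htmlspecialchars($url, ENT_QUOTES, 'UTF-8') . '">click</a>';

// Shell argument: quote for the shell
$arg = escapeshellarg($var);

echo $html, "\n", $link, "\n";
```

The raw $var is never modified, so each consumer can apply exactly the escaping it needs.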
Apply htmlspecialchars() only to data that you are about to output in an HTML page - ideally immediately before, or directly when you output it.
It won't affect numbers, but it can backfire for string parameters which are not intended to be put in HTML code.
You have to treat each key differently depending on its meaning. The possibility of generalization also depends on your application.
The way you're doing it won't work. You need to make $v a reference, and it breaks for anything requiring recursion ($_GET['array'][0], for example).
if(isset($_GET)) {
foreach($_GET as &$v) {
$v = htmlspecialchars($v);
}
}
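If nested values really do have to be handled, array_walk_recursive() is one possible way around the recursion problem. This is only a sketch: $input is a made-up stand-in for $_GET, and the other answers' point that blanket escaping is usually the wrong approach still applies:

```php
<?php
// Hypothetical input standing in for $_GET, including a nested array
$input = [
    'name'  => '<b>Bob</b>',
    'array' => ['<i>x</i>', 'plain'],
];

// array_walk_recursive() visits only the leaf values, so nested arrays are covered
array_walk_recursive($input, function (&$value) {
    if (is_string($value)) {
        $value = htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
    }
});

print_r($input);
```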

How can I get all submitted form values in PHP and automatically assign them to variables?

I'm trying to migrate a website from one host to another. On the first host, when you submit a form, all of the form values are automatically stuck into variables with the input name (this is PHP). On the new host, these values are all null unless I do this:
$data = $_GET['data'];
Is there a PHP configuration setting that is causing this? If there isn't, is there an easy way to loop through all of the $_GET variables and automatically assign their values to a variable with the same name?
Thanks!
The setting is register_globals, but it is deprecated (and was removed entirely in PHP 5.4), and using it is strongly discouraged because it is a security risk. Anyone can set variables in your script, which might interact with your code in a negative or unexpected way.
If you absolutely must, you can do it like this:
foreach ($_GET as $key=>$value) {
$$key = $value;
}
or, more simply:
import_request_variables("g");
or, to make it a little safer:
import_request_variables("g", "myprefix_"); // This way forces you to use "myprefix_"
// in front of the variables, better ensuring you are not unaware
// of the fact that this can come from a user
extract($_GET) could also work, as someone else pointed out, and it also allows specification (via extra arguments) of adding a prefix or what to do if your extraction conflicts with an already existing variable (e.g., if you extracted after you defined some other variables).
Look at the extract function : http://www.php.net/manual/en/function.extract.php
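As a rough illustration of the prefix option, here is a sketch in which $input stands in for $_GET and the variable names are invented:

```php
<?php
// Hypothetical input standing in for $_GET
$input = ['data' => '42', 'password' => 'overwrite-attempt'];

$password = 'secret'; // an existing variable that must not be clobbered

// EXTR_PREFIX_ALL creates $get_data and $get_password instead of $data and $password
extract($input, EXTR_PREFIX_ALL, 'get');

echo $get_data, "\n"; // the request value
echo $password, "\n"; // still the original value
```

The prefix also makes it obvious, wherever the variable is used later, that the value came from the request.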
You could do something like this:
foreach ($_GET as $name => $value){
$$name = $value;
}
The issue with this is that it makes it easy for people to fiddle with the variables in your script. I could visit http://yoursite.com/?sql=DELETE+FROM...
I'd advise against doing this; just stick to using $_GET.
Your question implies you are not doing any filtering or validation when assigning $_GET['data'] to $data, unless you are doing these kinds of checks further down your script.
From what I have seen, most programmers would do this first, in an effort to fail early if the data did not match expectations, so that the above assignment, in the case of expecting a positive int, would become something like:
if (!isset($_GET['data']) || (int)$_GET['data'] <= 0) {
    // fail
} else {
    $data = (int)$_GET['data'];
}
So seeing just plain
$data = $_GET['data']
makes me wince.
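One way to get the same fail-early check without hand-rolled casts is the filter extension. This is a sketch only; positive_int_or_null() is a made-up helper name, and the arrays passed to it stand in for $_GET:

```php
<?php
// Hypothetical helper: returns a positive int, or null when the value is missing or invalid
function positive_int_or_null(array $source, string $key): ?int
{
    if (!isset($source[$key])) {
        return null;
    }
    // FILTER_VALIDATE_INT rejects strings such as "12abc"; min_range enforces "positive"
    $value = filter_var($source[$key], FILTER_VALIDATE_INT, ['options' => ['min_range' => 1]]);
    return $value === false ? null : $value;
}

$data = positive_int_or_null(['data' => '42'], 'data');
if ($data === null) {
    // fail
}
```

Unlike an (int) cast, this does not silently accept input with trailing garbage.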

Easiest and most efficient way to get data from URL using php?

Solution?
Apparently there isn't a faster way, I'm okay with that.
I am just learning php and I am trying to figure out some good tips and tricks so I don't get into a bad habit and waste time.
I am passing in values into a php script. I am using $_GET so the URL looks like this:
/poll_results.php?Sports=tennis&cat=Sports&question=Pick+your+favorite+sports
Now I know how to accept those values and place them into variables like so:
$sports = $_GET['Sports'];
$cat = $_GET['cat'];
$question = $_GET['question'];
Super simple, yet if I am passing 5 - 6 things it can get bothersome, and I don't like typing things out for every single variable; that's the only reason. I know there is a better way of doing this. I have tried list($var, $var, $var) = $_GET, but that doesn't work with an associative array, just indexed ones (I think).
I also tried variable variables like so:
foreach($_GET as $value) {
$$values = $value;
echo $$values;
}
But that gave me a Notice: Undefined variable: values in poll_results.php on line 14. Line 14 is the $$values = $value. I don't know if that's a big deal or not... but I'm not turning off error reporting as I am still in the process of building the script. It does do what I want it to do though...
Any answers will be copied and pasted into my question so the next person knows :D
Thanks guys!
Your second bit of code is wrong. It ought to be like
foreach ($_GET as $key => $value) {
$$key = $value;
}
if I understand your intent. However, you're basically reinventing register_globals, which....eh. That'll get ya hacked.
If you have certain variables you want to get, you could do like
foreach (array('Sports', 'cat', 'question') as $key)
{
$$key = $_GET[$key];
}
which is less likely to overwrite some important variable (whether by accident or because someone was messing around with URLs).
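A variation on the same whitelist idea that avoids variable variables altogether, as a sketch only ($input is a made-up stand-in for $_GET):

```php
<?php
// Hypothetical input standing in for $_GET
$input = ['Sports' => 'tennis', 'cat' => 'Sports', 'password' => 'injected'];

$allowed = ['Sports', 'cat', 'question'];

// Keep only the expected keys; unexpected ones such as "password" are dropped
$params = array_intersect_key($input, array_flip($allowed));

// Read values with a default so a missing key does not raise a notice
$sports   = $params['Sports'] ?? null;
$question = $params['question'] ?? null;
```

Nothing outside the whitelist can reach your variables, and a missing parameter simply becomes null.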
Use parse_url() to extract the query string from a URL you've got in a string, then parse_str() to extract the individual arguments of the query string.
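Roughly like this, using the path and query string from the question (the example.com host is a placeholder):

```php
<?php
// Path and query string are from the question; the host is a placeholder
$url = 'http://example.com/poll_results.php?Sports=tennis&cat=Sports&question=Pick+your+favorite+sports';

// Pull out just the query string
$query = parse_url($url, PHP_URL_QUERY);

// Pass the second argument so the values land in an array, not in local variables
parse_str($query, $args);

print_r($args);
```

parse_str() also decodes the values, so the "+" signs in the question parameter come back as spaces.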
If you want to pollute your script with the contents of the superglobals, then you can use extract(). However, be aware that this is basically replicating the hideous monstrosity known as "register_globals", and opens all kinds of security vulnerabilities.
For instance, what if one of the original query arguments was _GET=haha? You've now trashed the $_GET superglobal by overwriting it via extract().
I am just learning php and I am trying to figure out some good tips and tricks so I don't get into a bad habit and waste time.
If I am passing 5 - 6 things it can get bothersome and I don't like typing things out for every single variable, that's the only reason.
What you are trying to do will, unless curbed, become a bad habit and even before then is a waste of time.
Type out the variables: your digits like exercise and your brain can take it easy when it doesn't have to figure out which variables are available (or not, or maybe; which would be the case when you use variable variables).
You can use
foreach($_GET as $key => $value)
To preserve the key and value associativity.
Variable variables (the $$value) are a bad idea. With your loop above say you had a variable named $password that is already defined from some other source. Now I can send $_GET['password'] and overwrite your variable! All sorts of nastiness can result from this. It's the same reason why PHP abandoned register_globals which essentially does the same thing.
My advice: use $_POST when possible. It keeps your URLs much cleaner for one thing. Secondly there's no real reason to assign the array to variables anyway, just use them where you need them in the program.
One good reason for this, especially in a large program, is that you'll instantly know where they came from, and that their data should not be trusted.

A nice function to escape $_POST // what do you think about it

I'm using $_POST and, being aware of MySQL exploits, I decided to use this function at the top of my page so that all POST data will be safe:
Can you tell me if I miss something and this function will really do the job as I think it will?
function clean_post(){
if ( $_POST){
foreach ($_POST as $k => $v) {
$_POST[$k]=stripslashes($v);
$_POST[$k]=mysql_real_escape_string($v);
$_POST[$k]=preg_replace('/<.*>/', "", "$v");
}
}
if ( $_COOKIE){
foreach ($_COOKIE as $k => $v) {
$_COOKIE[$k]=stripslashes($v);
$_COOKIE[$k]=mysql_real_escape_string($v);
$_COOKIE[$k]=preg_replace('/<.*>/', "", "$v");
}
}
}
It will also remove all HTML tags. The safest option to output the result might be to use:
<pre>
$foo
</pre>
Cheers!
I think it's a bad idea to do this. It will corrupt the data your users enter even before it hits the database. This approach will also encourage you to use lazy coding where you consistently don't escape data because you believe that all your data is already "clean". This will come back to bite you one day when you do need to output some unsafe characters and you either forget to escape them or you aren't really sure which function you need to call so you just try something and hope that it works.
To do it properly you should ensure that magic quotes is disabled and only escape data when necessary, using precisely the correct escaping method - no more, no less.
There are some problems with it.
First, you apply functions to types that don't need them; your integers, for example, need only an (int) cast to be secure.
Second, you do not secure length. When you're requesting a '12 chars string', it would be a good idea to ensure you've got only 12 chars, and not 2048. Limiting size is really something your attackers will not like.
Third, in your foreach loop you have a $v variable, and you assign the result of a function applied to $v to $_POST[$k] three times. So the first two assignments are lost when the third occurs...
Then all the things previous people said are right :-)
The rule is: apply the filter at the right moment for the right output. HTML output needs an HTML filter (htmlspecialchars), but the database doesn't need it; it needs database escaping. Let's say you want to extract data from your database to build a CSV or a PDF: HTML escaping will make your life harder. You'll need CSV escaping at that time, or PDF escaping.
Finally, it really is hard to remember whether the data you are manipulating has already been escaped for your output. I recommend an excellent read on Joel on Software about Apps Hungarian. The text is quite long, but very good, and web escaping is used as an example of why Apps Hungarian is good (even if Systems Hungarian is bad).
Hi, this is my first answer to any question asked on the web, so please review it.
Put this code at the top of your script; there is then no need to assign the posted values to separate variables in order to make the input data safe for the database. Just use the $_POST values as they are in your query statements.
foreach ($_POST as $k => $v) {
if(!is_array($_POST[$k]) ) { //checks for a checkbox array & so if present do not escape it to protect data from being corrupted.
if (ini_get('magic_quotes_gpc')) {
$v = stripslashes($v);
}
$v = preg_replace('/<.*>/', "", "$v"); //replaces html chars
$_POST[$k]= mysql_real_escape_string(trim($v));
}
}
Don't forget $_GET[]
if ($_POST OR $_GET)
Also you can add strip_tags()
I don't know whether your function is correct or not, but the principle is certainly incorrect. You want to escape only where you need to, i.e. just before you pass things into MySQL (in fact you don't even want to do that, ideally; use bound parameters).
There are plenty of situations where you might want the raw data as passed in over the HTTP request. With your approach, there's no ability to do so.
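A minimal sketch of what bound parameters look like, so that the raw request value never needs manual escaping. The in-memory SQLite database and the registry table are assumptions made only to keep the example self-contained (it needs the pdo_sqlite extension); the same pattern applies with a MySQL DSN:

```php
<?php
// In-memory SQLite is used only so the sketch runs on its own
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec('CREATE TABLE registry (name TEXT)'); // hypothetical table

$raw = "O'Henry's"; // raw value, exactly as it arrived in the request

// The value is bound as a parameter, never concatenated into the SQL string
$stmt = $dbh->prepare('INSERT INTO registry (name) VALUES (:name)');
$stmt->execute([':name' => $raw]);

$stored = $dbh->query('SELECT name FROM registry')->fetchColumn();
echo $stored, "\n";
```

The apostrophes survive the round trip untouched, and the raw data stays available for any other use.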
In general, I don't think it's that good of an idea.
Not all post data necessarily goes into MySQL, so there is no need to escape it if it doesn't. That said, using something like PDO and prepared statements is a better way; the mysql_* functions are deprecated.
The regular expression could destroy a lot of potentially valid text. You should worry about things like HTML when outputting, not inputting. Furthermore, use a function like strip_tags or htmlspecialchars to handle this.
stripslashes is only necessary if magic quotes are enabled (which they shouldn't be, but always is possible)
When working with stripslashes I'd use get_magic_quotes_gpc():
if (get_magic_quotes_gpc()) {
$_POST[$k]=stripslashes($v);
}
Otherwise you'll over-strip.

What are ways to improve my PHP data sanitation class?

I'm putting together a site (we're already using JavaScript to prevalidate on the client side). However, after getting tired of writing mysql_real_escape_string every other line, I wrote this class, which has only two functions, mainly focused on sanitising data in user input and SQL. My question is: what are ways to make input sanitizing easier while improving code readability?
<?php
class Safe {
function userinput($string){
$string = strip_tags($string);
$string = htmlspecialchars($string);
return $string;
}
function sql ($string){
$sqlstuff = Array("union", "select", "update", "delete", "outfile", "create");
$string = Safe::str($string);
$string = mysql_escape_string($string);
$string = str_ireplace($sqlstuff, "", $string);
return $string;
}
}
?>
Sorry, this is going to sound harsh, but your class is completely broken.
You should not be using htmlspecialchars for sanitizing input; it is only useful for escaping output. You do not need to encode HTML for insertion into the database, nor should you. Only use htmlspecialchars when sending output to the browser.
You should not be stripping tags from your input. You should be leaving them alone and, again, using htmlspecialchars when you output that data later to ensure HTML tags are escaped and not interpreted by the browser.
You should not be using mysql_escape_string or mysql_real_escape_string, you should be using PDO. If you are writing a new site there is absolutely no reason not to start out correctly and use PDO. Do it.
You should not be filtering out "union", "select", etc, that's dumb. Those words can appear in regular English language, and they're harmless if you're properly escaping quotes which PDO will handle for you.
Again, sorry for the harsh tone of this answer, but scrap the entire thing and use PDO. There is literally nothing salvageable here.
It's a good idea to use a class like that, particularly if it simplifies input handling. There are, however, a few points I'd like to comment on:
You should use mysql_real_escape_string instead of the PHP3 mysql_escape_string.
The first function should be called html or something. userinput sounds too vague and misrepresentative.
HTML escaping needs more parameters htmlspecialchars($str, ENT_QUOTES, "UTF-8") to be perfectly safe
The blacklisting of dangerous SQL keywords is not a good idea. It hints at a wrong approach to using SQL queries (if you receive queries via HTTP requests, that's your problem).
Also you should not attempt to filter them. Instead, detect them, write to the error/security log, and die() immediately. If there is an attempt to circumvent security, there's no point in attempting any "cleaning" of the request.
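For the renaming and the extra htmlspecialchars parameters, something along these lines would do. It is a sketch; html() is just the suggested name, written here as a plain function rather than a class method:

```php
<?php
// Hypothetical helper: escapes a value for HTML output only, never for storage
function html(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

echo '<p>', html("O'Henry's <b>bar</b>"), "</p>\n";
// <p>O&#039;Henry&#039;s &lt;b&gt;bar&lt;/b&gt;</p>
```

With ENT_QUOTES both single and double quotes are encoded, so the result is safe inside either kind of quoted attribute.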
You can also use filter_* functions that are bundled with PHP and provide you with the mechanism to filter request parameters according to specific filtering rules.
With few extra tricks, you could even filter arrays of different types of data (thanks to erisco!).
class sanitizer {
public static function sanitizeValues($values, $filters) {
$defaultOptions=FILTER_FLAG_NO_ENCODE_QUOTES | FILTER_FLAG_STRIP_LOW | FILTER_NULL_ON_FAILURE;
$filters=(array)$filters;
$values=(array)$values;
foreach ($filters as $key => $filter) {
if($parts=explode('/', $key)){
$v=&$values;
foreach ($parts as $part){
$v=&$v[$part];
}
$filter=(array)$filter;
$filter[1]=isset($filter[1])?$filter[1]:$defaultOptions;
$v=filter_var($v, $filter[0], $filter[1]);
// consider if you really need this here instead of PDO
// $v=mysql_real_escape_string($v);
}
else{
$values[$key]=isset($values[$key]) ? filter_var($values[$key], $filter[0], $filter[1]) : null;
}
}
return $values;
}
}
$manager=sanitizer::sanitizeValues($_GET['manager'], array(
'manager/managerID'=>FILTER_VALIDATE_INT,
'manager/username'=>FILTER_SANITIZE_STRING,
'manager/name'=>FILTER_SANITIZE_STRING,
'manager/email'=>FILTER_SANITIZE_STRING,
'manager/phone'=>FILTER_SANITIZE_STRING,
'manager/bio'=>FILTER_SANITIZE_STRING,
'manager/enabled'=>FILTER_VALIDATE_BOOLEAN,
'manager/password'=>FILTER_SANITIZE_STRING));
This will produce an array complete with all the needed fields based on the 'manager' parameter in _GET, with all values filtered and, optionally, escaped.
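For flat input, the built-in filter_var_array() covers much of the same ground without a custom class. A small sketch follows; $input is a made-up stand-in for $_GET['manager'], and the choice of filter per field is an assumption:

```php
<?php
// Hypothetical input standing in for $_GET['manager']
$input = ['managerID' => '7', 'enabled' => 'yes', 'email' => 'not-an-email', 'extra' => 'dropped'];

$manager = filter_var_array($input, [
    'managerID' => FILTER_VALIDATE_INT,     // int on success, false on failure
    'enabled'   => FILTER_VALIDATE_BOOLEAN, // accepts "yes", "on", "1", "true"
    'email'     => FILTER_VALIDATE_EMAIL,   // false when the address is not valid
    'phone'     => FILTER_DEFAULT,          // missing in the input, so it comes back as null
]);

var_dump($manager);
```

Keys that are not in the definition (here "extra") are left out of the result, fields that fail validation become false, and missing fields become null.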
