PHP - htmlspecialchars performance and concerns. Alternatives?

I am making a website that is 99% based around user content. I have been reading a lot about security against XSS, CSRF, SQL injection and all that fun stuff. I understand it all well and have been incorporating proper security. What concerns me is performance and overuse, so I am looking for a better way.
I understand the idea: accept user input as is, filter and validate it before it goes into the database, and then sanitize on output with something like htmlspecialchars.
Now here is the thing. Every "entry" a user adds to the database can have around 30 different pieces of information attached to it.
So when they view a page, I would output around 30 htmlspecialchars calls on that page alone. That seems like overkill. A listing or search page might have 5 or more variables for each of those items, and at 20 listings per page I am easily hitting 100+ uses of htmlspecialchars. That seems insane.
Would this cause a strain on my cheap server? Is there a better way to do it?
My horrible ideas.
(1) How about using strip_tags when inputting into the database? I understand the vulnerability of outputting into attributes without htmlspecialchars, but I control where every variable is output, and the worst case would be variables going into things like <h4>$title</h4> or <li>$info</li>, never into an href or anything. Wouldn't this save a ton of server usage by doing the sanitization once, instead of on every page load? I could still call htmlspecialchars on a variable if I have to put it inside an attribute.
(2) I understand this is a horrible idea. But how about storing the htmlspecialchars-sanitized text directly in the database? I know that if I ever want to do something else with this data, like make an API or output as JSON or PDF, I would have to decode the htmlspecialchars output. But none of those situations are something I would ever do. This seems like it would save a TON of server resources, as I would be sanitizing only once instead of on every page load.
(3) Store the literal input, plus an htmlspecialchars-sanitized version of the text in another column. This way the user still sees their input as it was entered and I only have to call htmlspecialchars once on input to the database, instead of on every page load. Yes, more database storage, but otherwise what would be the problems?
Edit: Thanks, I now see this is micro-optimization.

My Opinion: You shouldn't have a big issue with performance. If anything, the cost will matter even less in the future, since technology keeps improving the speed of CPU cycles and other factors.
I recommend you keep using htmlspecialchars when echoing out the data. 30 function calls to htmlspecialchars is very little work for your server (give your server and PHP some credit xD), and for the reasons stated above it will be even less work in the future.
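To make the "escape when echoing" advice concrete, here is a minimal sketch. The helper name e() and the sample $entry array are hypothetical and not from the question; the flags shown are one reasonable choice rather than the only one.

```php
<?php
// Hypothetical helper: escape a value for an HTML text or quoted-attribute context.
function e(string $value): string
{
    // ENT_QUOTES also escapes single quotes, so the result is usable inside
    // quoted attributes; ENT_SUBSTITUTE replaces invalid UTF-8 sequences
    // instead of returning an empty string.
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Sample row as it might come back from the database (stored unescaped).
$entry = [
    'title' => 'Tom & Jerry <script>alert(1)</script>',
    'info'  => 'He said "hi"',
];

// Escape at the moment of output, once per variable.
$html = '<h4>' . e($entry['title']) . '</h4>'
      . '<li>' . e($entry['info']) . '</li>';

echo $html, PHP_EOL;
```

A short wrapper like this keeps the 30 or 100 calls per page readable, and the raw data in the database stays usable for any other output format.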

Use HTML Purifier (http://htmlpurifier.org/). It's an open source PHP library used by a lot of big forums to clean up user input.
You can save the cleaned-up HTML in your database.

Related

executing code from database

I have PHP code stored in the database, and I need to execute it when it is retrieved.
But my code is a mix of HTML and PHP, mainly used in echo "";
A sample that looks like my code:
echo "Some Text " . $var['something'] . " more text " . $anotherVar['something2'];
How can I execute code like this, whether I add it to the DB with echo ""; or without it?
Any ideas?
UPDATE:
I forgot to mention, I'm using this on a website that will be used on intranet and security will be enforced on the server to ensure data safety.

I have a PHP code stored in the database
STOP now.
Move the code out of the database.
And never mix your code with data again.

It's not only a bad idea but also an invitation to several types of hacking attempts.
You can do it with eval(), but never use it. eval() is very dangerous because it allows execution of arbitrary PHP code. Its use thus is discouraged. If you have carefully verified that there is no other option than to use this construct, pay special attention not to pass any user provided data into it without properly validating it beforehand.

See eval. It lets you pass a string containing PHP and run it as if you'd written it directly into your file.
It's not a common practice to store executable PHP in a database. Is the code you store really so different that it makes more sense to maintain many copies of it, rather than adapting it to do the same thing to static data in the database? The use of eval is often considered bad practice as it can lead to problems with maintenance; if there's a way of avoiding it, it's normally worth it.

You can execute code with eval():
$code_str = "echo 'Im executed';"; // the evaluated statement needs its own terminating semicolon
eval($code_str);
BUT PAY ATTENTION: this is not safe. If someone gets access to your database, they will be able to execute any code on your server.

Use the eval() function.
Here's some info:
http://www.php.net/manual/en/function.eval.php
Something along the lines of:
eval($yourcode);
If that is the last resort, you want it to be secure, as it will evaluate anything and hackers love that. Look into Suhosin or other ways to secure this in production.

As everyone has indicated, using eval() is a bad approach for your need. But you can get almost the same result by using a whitelist approach.
Make a PHP file, db_driven_functions.php for instance. Get your data from the DB and map it into an array as below:
//$sql_fn_parameters[0] = function name
//$sql_fn_parameters[1,2,3.....] = function parameters
Then define functions that contain your PHP code blocks, for instance:
function my_echo($sql_fn_parameters){
    echo $sql_fn_parameters[1]; // numbered or assoc..
}
Then pull the data that contains the function name.
After checking that the function is defined with
function_exists($sql_fn_parameters[0])
call the function with
call_user_func_array() or call_user_func()
(You may also check that the parameters array $sql_sourced_parameters_array does not contain any risky syntax, for more security.)
And have your code controlled from the DB without the risks of eval().
It seems like a bit of a long way round, but after implementing it, it's really a joy to use an admin-panel-driven PHP flow.
BUT building a structure like this with OOP is better in the long term (autoloading of classes etc.).
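A minimal sketch of the whitelist idea follows. One caveat worth stating: function_exists() is also true for built-in functions such as system(), so this sketch checks the name against an explicit list instead. The names $allowed, dispatch() and my_upper are hypothetical, chosen for illustration only.

```php
<?php
// Hypothetical allow-list: only the names listed here can ever be called,
// no matter what the database row contains.
$allowed = [
    'my_echo'  => function (array $args): string {
        return (string) ($args[0] ?? '');
    },
    'my_upper' => function (array $args): string {
        return strtoupper((string) ($args[0] ?? ''));
    },
];

// $row mimics a record fetched from the DB:
// element 0 is the function name, the rest are its parameters.
function dispatch(array $allowed, array $row): ?string
{
    $name = (string) array_shift($row);
    if (!array_key_exists($name, $allowed)) {
        return null; // unknown name: refuse instead of calling it
    }
    return $allowed[$name]($row);
}

echo dispatch($allowed, ['my_upper', 'hello']), PHP_EOL;
var_dump(dispatch($allowed, ['system', 'id'])); // not on the list, so nothing runs
```

Because the callables live in the PHP file and only their keys live in the database, a compromised row can at worst select another permitted action.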

Eval is not safe obviously.
The best route IMO
Save your data in a table
Run a stored procedure when you are ready to grab and process that data

You should not abuse the database this way. And in general, dynamic code execution is a bad idea. You could employ a more elegant solution to this problem using template engines like Smarty or XSLT.

There are a few ways to achieve this:
1) By using evil
eval($data);
That's not a typo, eval is usually considered evil and for good reasons. If you think you have fully validated user data to safely use eval, you are likely wrong, and have given a hacker full access to your system. Even if you only use eval for your own data, hacking the database is now enough to gain full access to everything else. It's also a nightmare to debug code used in eval.
2) Save the data to a file, then include it
file_put_contents($path, $data); include $path;
There are still the same security concerns as eval, but at least this time the code is easier to debug. You can even test the code before executing it, e.g.:
// include only when the lint run reports no syntax errors
if (strpos(exec('php -l ' . escapeshellarg($path)), 'No syntax errors detected') !== false)
{
include $path;
}
The downside to this method is the extra overhead involved in saving the code.
3) Execute the code straight from the database.
You'd need to use database software that allows this. As far as I am aware, this only includes database software that stores the content as text files. Having database software with "php eval" built in would not be a good thing. You could try txt-db-api. Alternatively, you could write your own. It would likely become very difficult to maintain if you do, but it is something to consider if you know exactly how you want your data to be structured and are unlikely to change your mind later.
This could save a lot of overhead and have many speed benefits. It likely won't though. Many types of queries run way faster using a traditional database because they are specifically designed for that purpose. If there's a possibility of trying to write to a file more than once at the same time, then you have to create a locking method to handle that.
4) Store PHP code as text files outside of the database
If your database contains a lot of data that isn't PHP code, why even store the PHP code in the database? This could save a lot of overhead, and if your database is hacked, then it may no longer be enough to gain full access to your system.
Some of the security considerations
Probably more than 99% of the time, you shouldn't even be attempting to do what you are doing. Maybe you have found an exception, but just being on an intranet isn't enough, and it certainly doesn't mean it's safe to ignore security practices. Unless everyone on the intranet needs full admin access, they shouldn't be able to get it. It's best for everyone to have the minimum privileges necessary. If one machine does get hacked, you don't want the hacker to have easy access to everything on the entire intranet. It's likely the hacker will hide what they are doing and will introduce exploits to later bypass your server security.
I certainly need to do this for the CMS I am developing. I'm designing it mainly to produce dynamic content, not static content. The data itself is mostly code. I started off with simple text files, however it slowly evolved into a complicated text file database. It's very fast and efficient, as the only queries I need are very simple and use indexing. I am now focusing on hiding the complexity from myself and making it easy to maintain with greater automation. Directly writing PHP code or performing admin tasks requires a separate environment with superuser access for only myself. This is only out of necessity though, as I manage my server from within, and I have produced my own debugging tools and made an environment for code structured in a specific way that hides complexity. Using a traditional code editor and then uploading via SSH would now be too complicated to be efficient. Clients will only be able to write PHP code indirectly though, and I have to go to extreme lengths to make that possible, just to avoid the obvious security risks. There are not so obvious ones too. I've had to create an entire framework called Jhp, and every piece of code is then parsed into PHP. Every function has to pass a whitelist and is renamed or throws an error, every variable is renamed, and more. Without writing my own parser, and with just a simple blacklist, it would never be even a tiny bit secure. Nothing whatsoever client-side can be trusted unless I can confirm on every request that it has come entirely from myself, and even then my code error-checks before saving so I don't accidentally break my system. Just in case I still do, I have another identical environment to fix it with, and detailed error information in the console that even works for fatal errors, whilst always being hidden from the public.
Conclusion
Unless you go to the same lengths I have (at minimum), then you will probably just get hacked. If you are sure that it is worth going to those lengths, then maybe you have found an exception. If your aim is to produce code with code, then the data is always going to be code and it cannot be separated. Just remember, there are a lot more security considerations than what I have put in this answer, and unless the entire purpose of what you are doing makes this a necessity, why bother mixing data with code at all?

is it okay to "repeatedly" xss-clean data in CodeIgniter?

The following are ways to XSS-clean data in Codeigniter:
set global_xss_filtering in config to TRUE
use xss_clean()
use xss_clean as a validation rule
set the second parameter to TRUE in $this->input->post('something', TRUE)
Is it okay to use all or more than one of them on one piece of data?
For example, would it be okay if I still used $this->input->post('something', TRUE) even if the data has already been cleaned by global_xss_filtering and xss_clean validation rule?

It's not going to hurt you, but it is definitely pointless.
There's a very good chance that eventually, you will reach a point where the global XSS filter is going to be cumbersome. Since it can't be disabled per controller without extensive hacks, and access to the raw $_REQUEST data will be impossible, you will need to disable it globally. This will happen the moment you want to process a single piece of trusted data, or data that isn't HTML output and must remain intact.
Using it as a form validation rule is pointless and potentially destructive as well. Imagine what this site would be like if every time you typed <script> it was replaced with [removed], with no way to revert it in the future. For another example, what if a user uses some "XSS" content in his password? Your application will end up altering the input silently.
Just use the XSS filter where you need it: on your HTML output, in places where JavaScript can be executed.

Yes. Assume your input is 'A'. Then, let's say you run xss_clean to get XSS-safe content:
B = xss_clean(A)
Now, let's say I do it again to get C:
C = xss_clean(B)
Now, if B and C differ, then it must mean that B had some XSS-unsafe content, which clearly means that xss_clean is broken as it did not clean A properly. So as long as you assume that the function returns XSS-safe content, you are good to go.
One argument that can be made is: what if the function modifies even XSS-safe content? Well, that would suck and it would still mean that the function is broken, but that is not the case (saying this just from my experience, as in I haven't seen it behave like this ever).
The only drawback I see is the additional processing overhead, but doing it twice is fine (once with global filtering, and once doing it explicitly, just in case global filtering is turned off sometime by someone), and is a pretty OK overhead cost for the security assurance.
Also, if I may add, CodeIgniter's xss_clean doesn't really parse the HTML and drop the tags and stuff. It just simply converts the < and > to &lt; and &gt;. So with that in mind, I don't see anything that could go wrong.

Using xss_clean even once is bad as far as I am concerned. This routine attempts to sanitise your data by removing parts or replacing parts. It is lossy and not guaranteed to return the same content when run multiple times. It is also hard to predict and will not always act appropriately. Given the amount of things it does to try to sanitise a string, there is a massive performance hit for using this on input. Even the tiniest bit of input such as a=b will cause a flurry of activity for xss_clean.
I would like to say that you should never use xss_clean but realistically I can't say that. This system is made for inexperienced developers who do not know how to safely manage user content. I'm an experienced developer so I can say that no project I am working on should ever use xss_clean. The fact is though, the corruption issues will be less problematic for inexperienced developers with simpler usage, and ultimately it probably will make their code more secure, even if they should be making their code more secure themselves rather than relying on quick, dirty and cheap hacks. On the other hand, xss_clean isn't guaranteed to make your code completely secure and can ultimately make things worse by giving a false sense of security. You are recommended to really study instead, to make sure you understand exactly everything your code does so you can make it truly secure. xss_clean does not compensate for code errors, it compensates for coder errors.
Ideally xss_clean wants to be done only on output (and wants to be replaced with htmlentities, etc.) but most people won't bother with this, as it's simpler for them to violate data purity by just filtering all input rather than filtering output (something can be input once but output ten times). Again, an undisciplined developer may not put xss_clean on one out of those ten cases of output.
Realistically however, the only real decent way is to properly encode everything in the view the moment it is to be displayed on a page. The problem with pre-emptive encoding is that you store data that might be incorrectly encoded, and you can double encode data if it is input, then output into a form, then input again. If you think of something like an edit box you can have some serious problems with data growth. Not all sanitisation removes content. For example, addslashes adds content: if you have a slash in your content, every time you run addslashes a new slash is added, causing it to grow. Although there is a good chance your data will end up embedded in HTML, you also can't always really know where data will end up. Suddenly you get a new requirement that applies to previous data and that's it, you're screwed because you applied a lossy filter to incoming data prior to storage. In this case, lossy might mean losing your job after corrupting all the user data in your database. Your data is usually the most valuable thing for a web application. This is a big problem with pre-emptive encoding. It is easier to work with if you always know your data is pure and can escape it according to the situation at hand, but if your data could be in any condition down the line this can be very problematic. The filtering can also cause some occasional logical breakages. As the sanitisation can remove content, for example, two strings that don't match can be made to match.
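The data-growth problem described above is easy to reproduce. The sketch below simulates a value that is escaped on input and then re-saved through an edit form three times; the sample strings are made up for illustration.

```php
<?php
// Simulate a value that is escaped on input, shown in an edit form,
// and saved again, three times in a row.
$html  = 'Fish & Chips';
$slash = 'C:\\temp';   // the 7-character string C:\temp (one backslash)

for ($i = 0; $i < 3; $i++) {
    $html  = htmlspecialchars($html, ENT_QUOTES, 'UTF-8'); // re-encodes existing "&amp;"
    $slash = addslashes($slash);                           // doubles every backslash
}

echo $html, PHP_EOL;   // Fish &amp;amp;amp; Chips
echo $slash, PHP_EOL;  // C: followed by 8 backslashes, then temp
```

Storing the raw value and escaping only at output avoids this, because each output starts again from the untouched original.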
Many of the problems with xss_clean on input are the same or similar to those for magic_quotes:
http://en.wikipedia.org/wiki/Magic_quotes
Summary: You should not use it, but instead block bad data on user input and escape properly on output. If you want to sanitise user data, it should happen in the client (browser, form validation) so that the user can see it. You should never have invisible data alteration. If you must run xss_clean, you should only run it once, on output. If you're going to use it for validation of input, check $posted_data !== xss_clean($posted_data) and reject the input when that is true.

Tricks to speed up page load time in PHP

I know of these two tricks for speeding page load time up some:
#ini_set('zlib.output_compression', 1);
which turns on compression
ob_implicit_flush(true);
which implicitly flushes the output buffer, meaning as soon as anything is output it is immediately sent to the user's browser. This one's a tad tricky, since it just creates the illusion that the page is loading quickly while in actuality it takes the same amount of time and the data is just being shown faster.
What other php tricks are there to make your pages load (or appear to load) faster?

It is always better to define a real bottleneck and then try to avoid it.
Following any trick that is supposed to make something faster, without understanding whether you have that problem or not, is always the wrong way.
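One way to find out whether something is a bottleneck is simply to time it. The sketch below is a rough measurement helper, not a full profiler; the name timeIt() and the sample string are hypothetical, and hrtime() assumes PHP 7.3 or later.

```php
<?php
// Hypothetical helper: run $fn $runs times and return the elapsed milliseconds.
function timeIt(callable $fn, int $runs = 1000): float
{
    $start = hrtime(true);            // monotonic clock, in nanoseconds
    for ($i = 0; $i < $runs; $i++) {
        $fn();
    }
    return (hrtime(true) - $start) / 1e6;
}

// Sample input: a few hundred characters containing characters that need escaping.
$sample = str_repeat('<p>Tom & "Jerry"</p>', 20);

$ms = timeIt(function () use ($sample) {
    htmlspecialchars($sample, ENT_QUOTES, 'UTF-8');
});

printf("1000 calls took %.3f ms\n", $ms);
```

Comparing a number like this against the time spent in database queries or network round trips usually shows where the real cost is.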

The best way is to ensure that your script isn't creating/destroying unnecessary variables and make everything as efficient as possible. After that, you can look into a caching service so that the server does not have to reparse specific parts of a page.
If all that doesn't make it as fast as you need it to be, you can even "compile" the php code. Facebook does this to support faster load times. They created something called "HipHop for PHP" and you can read about it at: https://developers.facebook.com/blog/post/358/
There are other PHP compilers you can use to help.
If all this fails, then I suggest you either recode the website in a different language, or figure out why it is taking so long (more specifically, WHAT is causing it to take so long) and change that part of the website.

There are some code customizations that can speed up your website:
1) If you’re looping through an array, for example, count() it beforehand, store the value in a variable, and use that for your test. This way, you avoid needlessly calling count() on every loop iteration.
2) Use built-in functions instead of custom functions
3) Put JavaScript functions and files at the bottom of the page
4) Use caching

Among the best tricks to speed up PHP page loads is to use as little PHP as possible, i.e. use a PHP cache/accelerator such as Zend or APC, or cache as much as you can yourself. PHP that does not need to be parsed again is faster, and PHP that does not run at all is still faster.
The same goes (maybe even more so) for database. Use as few queries as possible. If you can combine two queries into one, you save one round trip.

What is the best way to "clean" information to be stored in a SQL database?

Scenario:
I have a blog that I want to make a post to. I have a form set up where I can write out a blog post and submit it to a separate PHP page that then stores it in a database (after it confirms it is me posting), from where it will be read and displayed on the home page. How can I easily escape any quotes or anything that will interfere with it being stored in the database, but still allow it to be displayed properly (with all formatting intact)?
Thanks

Prepared statements in PHP will do a good job of taking care of sanitizing data as it goes into the database.
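A minimal sketch of what that looks like with PDO is below. The table and column names, the savePost() helper, and the use of an in-memory SQLite database are all assumptions made so the example can run without a server; with MySQL only the DSN and credentials would change.

```php
<?php
// Hypothetical helper: insert a post using bound parameters, so quotes in
// the text never become part of the SQL statement itself.
function savePost(PDO $pdo, string $title, string $body): void
{
    $stmt = $pdo->prepare('INSERT INTO posts (title, body) VALUES (:title, :body)');
    $stmt->execute([':title' => $title, ':body' => $body]);
}

// Demo only runs where the pdo_sqlite driver is available.
$stored = null;
if (extension_loaded('pdo_sqlite')) {
    $pdo = new PDO('sqlite::memory:');
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $pdo->exec('CREATE TABLE posts (title TEXT, body TEXT)');

    // Quotes and SQL-looking text are stored literally, not executed.
    $body = "It's a \"quoted\" post'); DROP TABLE posts; --";
    savePost($pdo, 'Hello', $body);

    $stored = $pdo->query('SELECT body FROM posts')->fetchColumn();

    // Escape for HTML only at display time.
    echo htmlspecialchars($stored, ENT_QUOTES, 'UTF-8'), PHP_EOL;
}
```

The text comes back from the database exactly as it was typed, so formatting is preserved and the HTML escaping stays a separate, output-time step.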

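The only things that will interfere with it being stored in a MySQL database can be easily escaped by mysql_real_escape_string() (note that the mysql extension was removed in PHP 7.0; mysqli_real_escape_string() is its replacement).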
When you pull it out of the database, everything will look the same as before it was escaped and put in. Before you display it on a web page, you'll want to run htmlspecialchars() on the text to prevent any malicious scripting from having an effect.
An optional command would be strip_tags() if you don't want the text to contain any HTML at all.

Prepared statements are always a really good idea. But, you might consider moving your database code to a stored procedure. This will increase security and performance (in most cases, depending on what database you use and how you cache results).
If you are not going with the stored procedures route, also make sure to disable multiple lines of commands per call to database. This should be in the database config files. It will disable the possibility of doing this:
your command;malicious command
Although there are other ways, this is definitely the most secure.

How do I execute PHP that is stored in a MySQL database?

I'm trying to write a page that calls PHP that's stored in a MySQL database. The page that is stored in the MySQL database contains PHP (and HTML) code which I want to run on page load.
How could I go about doing this?

You can use the eval command for this. I would recommend against this though, because there are a lot of pitfalls with this approach. Debugging is hard(er), and it implies some security risks (bad content in the DB gets executed, uh oh).
See When is eval evil in php? for instance. Google for Eval is Evil, and you'll find a lot of examples why you should find another solution.
Addition: Another good article with some references to exploits is this blogpost. Refers to past vBulletin and phpMyAdmin exploits which were caused by improper Eval usage.

Easy:
<?php
// $x is your variable with the data from the DB (mixed HTML and PHP)
eval("?>" . $x); // the leading "?>" makes eval start in HTML mode
?>
Let me know, works great for me in MANY applications, can't help but notice that everyone is quick to say how bad it is, but slow to actually help out with a straight answer...

The eval() function was covered in other responses here. I agree you should limit use of eval unless it is absolutely needed. Instead of having PHP code in the DB, you could store just a class name, where the class has a method called, say, execute(). Whenever you need to run your custom PHP code, just instantiate the class whose name you fetched from the DB and run ->execute() on it. It is a much cleaner solution, gives you a great deal of flexibility, and improves site security significantly.
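A rough sketch of that idea follows. The interface name Action, the class ShowGreeting and the function runAction() are hypothetical; the interface check is an added safeguard so that a class name from the DB can only select code written for this purpose.

```php
<?php
// Hypothetical contract that every DB-selectable action must implement.
interface Action
{
    public function execute(): string;
}

final class ShowGreeting implements Action
{
    public function execute(): string
    {
        return 'Hello from ShowGreeting';
    }
}

// Instantiate the class named in the DB row, but only if it exists and
// implements Action; anything else is refused.
function runAction(string $className): ?string
{
    if (!class_exists($className) || !is_subclass_of($className, Action::class)) {
        return null;
    }
    return (new $className())->execute();
}

echo runAction('ShowGreeting'), PHP_EOL;
var_dump(runAction('ArrayObject')); // a real class, but not an Action
```

The database then holds only a short identifier, while the executable code stays in version-controlled files.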

You can look at the eval function in PHP. It allows you to run arbitrary PHP code. It can be a huge security risk, though, and is best avoided.

Have you considered using your Source Control system to store different forks for the various installations (and the modules that differ among them)? That would be one of several best practices for application configuration I can think of. Yours is not an unusual requirement, so it's a problem that's been solved by others in the past; and storing code in a database is one I think you'd have a hard time finding reference to, or being advised as a best practice.
Good thing you posted the clarification. You've probably unintentionally posed an answer in search of a suitable question.

Read the PHP code from the database, save it to a file with a unique name, and then include the file.
This is an easy way to run PHP code and debug it.
$uniqid="tmp/".date("d-m-Y h-i-s").'_'.$Title."_".uniqid().".php";
$file = fopen($uniqid,"w");
fwrite($file,"<?php \r\n ".$R['Body']);
fclose($file);
// eval($R['Body']);
include $uniqid;

The way I did this was to have a field in the database that identifies something unique about the block of code needing to be executed. That one word is part of the file name of that code. I put the strings together to point to the PHP file to be included. Example:
$lookFor = $row['page'];
include("resources/" . $lookFor . "Codebase.php");
In this way, even if a hacker could access your DB, he couldn't put malicious code straight in there to be executed. He could perhaps change the reference word, but unless he could actually put a file directly onto the server it would do him no good. If he could put files directly onto the server, you're sunk anyway if he really wants to be nasty. Just my two cents' worth.
And yes, there are reasons you would want to execute stored code, but there are cons.
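Since the reference word comes from the database, it is worth validating before it is used in a path. The sketch below is one possible way to do that; resolveCodebase() is a hypothetical helper, and the "resources" directory and "Codebase.php" suffix follow the example above.

```php
<?php
// Hypothetical helper: turn the identifier stored in the DB into an include
// path, refusing anything that is not a plain word.
function resolveCodebase(string $page, string $baseDir = 'resources'): ?string
{
    // Letters, digits and underscores only: no dots or slashes, so a value
    // like "../../etc/passwd" cannot escape the base directory.
    if (preg_match('/\A[A-Za-z0-9_]+\z/', $page) !== 1) {
        return null;
    }
    return $baseDir . '/' . $page . 'Codebase.php';
}

$path = resolveCodebase('contact');
var_dump($path);
var_dump(resolveCodebase('../../etc/passwd'));

// In real use: if ($path !== null && is_file($path)) { include $path; }
```

With this check in place, a tampered reference word can only select another existing file that follows the same naming pattern.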
