How to prevent saving unusual HTML entities from PHP form? - php

Every now and then, I get unusual data saved to the database from my PHP form that looks like this:
Mr. Smith's
What could be causing this, and is there a better way to remove the entities than using preg_replace, since the PHP decode functions don't properly decode the entire thing?

I would suggest looking at the code that processes the form data before it is inserted into the database. If you are sanitising the data to be displayed on a web page, use htmlentities($var); if you are only sanitising it for security purposes, look into prepared statements / stored procedures or just mysql_real_escape_string($var). If all else fails, post the code and we'll have a look.
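As a rough sketch of the prepared-statement route mentioned above: the value is stored unmodified through a bound parameter and escaped only when it is written into HTML. This assumes PDO is available (the mysql_* functions were removed in PHP 7). The people table, the name column, and the save_name / render_name helpers are hypothetical names used only for illustration, and htmlspecialchars is used here in place of htmlentities.

```php
<?php
// Store the raw value with a prepared statement and a bound parameter.
// The table "people" and column "name" are assumptions for this sketch.
function save_name(PDO $pdo, string $name): void
{
    $stmt = $pdo->prepare('INSERT INTO people (name) VALUES (:name)');
    $stmt->execute([':name' => $name]);
}

// Escape only at output time, when the value is placed into an HTML page.
function render_name(string $name): string
{
    return htmlspecialchars($name, ENT_QUOTES, 'UTF-8');
}
```

With this split, the database holds the text exactly as the user typed it, and entities only ever appear in the generated HTML.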

This must be due to some technical problem. The best way is to decode the entities and, if you then still find something like:
/&([a-z]+|#[0-9]+);/
Do not accept the form, just alert the user about the invalid value.
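A minimal sketch of that check, assuming the value is decoded once with html_entity_decode and then tested against the pattern above. The helper name has_leftover_entities is hypothetical. A value that was entity-encoded twice still contains an entity after one decoding pass, which is one plausible cause of the stored data in the question.

```php
<?php
// Hypothetical helper: decode once, then report whether anything that
// still looks like an entity remains (using the pattern from the answer).
function has_leftover_entities(string $value): bool
{
    $decoded = html_entity_decode($value, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    return preg_match('/&([a-z]+|#[0-9]+);/', $decoded) === 1;
}

// Example inputs: the first is encoded once, the second twice.
$once  = 'Mr. Smith&#039;s';
$twice = 'Mr. Smith&amp;#039;s';

var_dump(has_leftover_entities($once));
var_dump(has_leftover_entities($twice));
```

Note that the pattern as given only covers lowercase named entities and decimal numeric ones; hexadecimal references such as &#x27; would need an extended pattern.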

You could take a look at CodeIgniter's source code where they remove entities at https://bitbucket.org/ellislab/codeigniter/src/c07dcadf094e/system/libraries/Security.php in the xss_clean method. This will give you a good idea of how to clean most of them more effectively.

Related

Editing and Saving user HTML with Javascript - how safe is it?

For example I have a Javascript-powered form creation tool. You use links to add html blocks of elements (like input fields) and TinyMCE to edit the text. These are saved via an autosave function that does an AJAX call in the background on specific events.
The save function being called does the database protection, but I'm wondering if a user can manipulate the DOM to add anything he wants (like custom HTML or an unwanted script).
How safe is this, if at all?
First thing that comes to mind is that I should probably search for, and remove any inline javascript from the received html code.
Using PHP, JQuery, Ajax.
Not safe at all. You can never trust the client. It's easy even for a novice to modify DOM on the client side (just install Firebug for Firefox, for example).
While it's fine to accept HTML from the client, make sure you validate and sanitize it properly with PHP on the server side.
Are you saving the full inline-html in your database?
If so, try to rework everything and only save the necessary data to your backend. ALL fields should also be checked to confirm they are received in the expected format.
All inline-js is easily removed.
You can never trust the user!
Absolutely unsafe, unless you take the steps to make it safe, of course. StackOverflow allows certain tags, filtered so that users can't do malicious things. You'll definitely need to do something similar.
I'd opt to sanitize input server side so that everyone gets their input sanitized, whether they've blocked scripts or not. Using something like this: http://www.phpclasses.org/package/3746-PHP-Remove-unsafe-tags-and-attributes-from-HTML-code.html or http://grom.zeminvaders.net/html-sanitizer implemented with AJAX would be a pretty good solution.

PHP sanitize and check input procedure for simplexml?

I am passing a textarea input box's contents via POST to my PHP file from HTML (no JavaScript allowed).
I then use simplexml to get the feed at the url the user entered.
Unfortunately, the user can enter anything into the textarea. Which I am told is dangerous.
What is the recommended way to clean and secure the POST contents using PHP to get them ready and safe for the simplexml procedure?
(basically, to be sure they are not malicious and check they are a valid url)
The contents of the $_POST array are strings, so there's nothing inherently unsafe there.
User enters PHP code? It surely won't be executed, so no problem here (this, among many others, is a reason not to use such things as eval()). So whatever PHP function or command he writes will be read as a simple string, and strings are not harmful, whatever they contain.
User enters malicious JavaScript? Still no problem, as JavaScript inside PHP, or inside a database for that matter, is pretty useless since it needs a browser to execute.
This leads to the real issue: user-supplied content needs to be "sanitized" only right before passing it to the target medium. If you're going to feed a database, use the escaping tools provided by your engine. If you're going to output it on a web page, that's when you need to sanitize against malicious XSS attacks.
Sanitizing a POST array per se, before actually doing anything with its content, is wrong, as you never know for sure when and where that content will be used; so don't even think of using strip_tags() or similar functions right after you get the POST value, but pass it as is and add the necessary escaping/sanitizing just when needed.
What you actually need to do, then, only you know, so act accordingly.
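For the specific case in the question, where the target medium is "a URL that will be fetched as a feed", a check might look like the sketch below. It assumes only http and https feeds should be accepted; the helper name is_acceptable_feed_url is hypothetical. filter_var with FILTER_VALIDATE_URL checks the general shape of the URL, and the explicit scheme check rejects things like file: or javascript: that should never be fetched.

```php
<?php
// Hypothetical helper: accept only syntactically valid http(s) URLs.
function is_acceptable_feed_url(string $input): bool
{
    $url = trim($input);

    // Reject anything that is not shaped like a URL at all.
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return false;
    }

    // Allow only the schemes a remote feed is expected to use.
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    return in_array($scheme, ['http', 'https'], true);
}
```

Only a value that passes this check would then be handed on to the feed-loading code. Validation of this kind says nothing about what the remote server returns, so the parsed feed content still needs escaping when it is printed.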
"Which I am told is dangerous."
That is wrong.
"What is the recommended way to clean and secure the POST contents"
I am afraid there is nothing to secure.

Cleaning up user-submitted text to be good and readable?

I have a classifieds website built in PHP+JS, and I'd like to automatically clean and format the text submitted by the users. The users are really unpredictable: they use all uppercase, wrong spacing around commas, extra tabs or spaces that even cause JSON errors, and any sort of style error you can imagine (and some I never imagined!).
I wonder whether there is any script or set of rules for cleaning up text so that it looks at least decent.
What you need to know is how to sanitise input and encode it appropriately. A quick google has turned up this. You can never trust anything from the browser, ever! Everything needs to be checked server side, so please learn from your question. This, in my opinion, is one of the failings of PHP, as other technologies provide a certain degree of protection automatically.
With regards to encoding take a look at this
There are various ways you can filter user input. Have a look at the lcfirst function, for example. It would be a good idea to consider the htmlentities and strip_tags functions too to fully sanitise your data.
NEVER TRUST USER INPUT is a good rule to live by
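For the purely cosmetic side of the question (uppercase text, stray tabs, comma spacing), a small normalising function could be a starting point. This is only a sketch under simple assumptions: the name tidy_text is hypothetical, the case handling is ASCII-only, and the comma rule would also change numbers such as 1,000, so it would need refining for real listings.

```php
<?php
// Hypothetical clean-up helper for user-submitted listing text.
function tidy_text(string $text): string
{
    // Collapse runs of spaces and tabs into one space, then trim the ends.
    $text = trim(preg_replace('/[ \t]+/', ' ', $text));

    // No space before a comma, exactly one space after it.
    $text = preg_replace('/[ \t]*,[ \t]*/', ', ', $text);

    // If the text has uppercase letters and no lowercase ones, treat it
    // as "shouting" and convert it to sentence-style capitalisation.
    if (preg_match('/[A-Z]/', $text) && !preg_match('/[a-z]/', $text)) {
        $text = ucfirst(strtolower($text));
    }

    // A trailing comma would otherwise leave a dangling space.
    return rtrim($text);
}
```

Formatting like this is separate from security: the tidied text still has to be escaped when it is written into HTML or JSON.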

Can a simple web form like this get hacked?

Hi I have a web form that sends a string to one php file which redirects them to a corresponding URL. I've searched about web form hacking and I've only received information about PHP and SQL... my site only uses a single PHP file, very basic etc. Would it be open to any exploits? I'm obviously not going to post the URL, but here is some code I was working on for the php file:
Newbie PHP coding problem: header function (maybe, I need someone to check my code)
Thanks
From that little snippet, I don't see anything dangerous. "Hackers" can enter pretty much anything they want into $_REQUEST['sport'] and thereby $searchsport, but the only place you use it is to access your array. If it's not found in your array.... nothing much will happen. I think you're safe in this limited scenario ;) Just be careful not to use $searchsport for...... just about anything else. Echoing it, or inserting it into a DB is dangerous.
Uh, it really depends. If you are inserting data into a MySQL DB without sanitizing, the answer is a huge yes. This is something you need to decide for yourself if you aren't going to show code.
The solution you've got in the linked question is pretty safe.
Every possible action is hardcoded in your script.
Nothing to worry about.
Though when asking about a "web form like this", you'd better provide the web form itself, not a link to a question containing code that can only be presumed to be this form's handler.
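The pattern these answers describe, where the submitted value is only ever used as a key into a hard-coded array, can be sketched as follows. The code from the linked question is not reproduced here, so the $targets map, its entries, and the resolve_target helper are assumptions made for illustration; only $_REQUEST['sport'] and $searchsport come from the discussion above.

```php
<?php
// Hypothetical map from accepted form values to redirect targets.
$targets = [
    'football' => '/football.php',
    'tennis'   => '/tennis.php',
];

// Returns the hard-coded target for a known value, or null otherwise.
// The user-supplied string is used only as an array key, never echoed,
// never put into a query, and never placed in the Location header itself.
function resolve_target(array $targets, string $searchsport): ?string
{
    return $targets[$searchsport] ?? null;
}

// Possible usage inside the form handler (not executed in this sketch):
// $url = resolve_target($targets, (string) ($_REQUEST['sport'] ?? ''));
// if ($url !== null) { header('Location: ' . $url); exit; }
```

Because every possible redirect target is written into the script, an unexpected value simply finds no match.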

How to safely allow embed content?

I run a website (sort of like a social network) that I wrote myself. I allow the members to send comments to each other. I take the comment and then call this line before saving it in the DB:
$com = htmlentities($com);
When I want to display it, I call this piece of code:
$com = html_entity_decode($com);
This works out well most of the time. It allows the users to copy/paste youtube/imeem embed code and send each other videos and songs. It also allows them to upload images to photobucket and copy/paste the embed code to send picture comments.
The problem I have is that some people are basically putting in javascript code there as well that tends to do nasty stuff such as open up alert boxes, change location of webpage and things like that.. I am trying to find a good solution to solving this problem once and for all.. How do other sites allow this kind of functionality?
Thanks for your feedback
First: htmlentities or just htmlspecialchars should be used for escaping strings that you embed into HTML. You shouldn't use it for escaping strings when you insert them into a SQL query. Use mysql_real_escape_string (for MySQL) or, better yet, use prepared statements, which have bound parameters. Make sure that magic_quotes are turned off or otherwise disabled when you manually escape strings.
Second: You don't unescape strings when you pull them out again. E.g. there is no mysql_real_unescape_string. And you shouldn't use stripslashes either. If you find that you need it, then you probably have magic_quotes turned on; turn them off instead, and fix the data in the database before proceeding.
Third: What you're doing with html_entity_decode completely nullifies the intended use of htmlentities. Right now, you have absolutely no protection against a malicious user injecting code into your site (you're vulnerable to cross-site scripting, aka XSS). Strings that you embed into an HTML context should be escaped with htmlspecialchars (or htmlentities). If you absolutely have to embed HTML into your page, you have to run it through a cleaning solution first. strip_tags does this in theory, but in practice it's very inadequate. The best solution I currently know of is HtmlPurifier. However, whatever you do, it is always a risk to let random users embed code into your site. If at all possible, try to design your application such that it isn't needed.
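To illustrate the third point, here is a small sketch comparing the two built-in functions on a hostile comment. The $comment value is made up for the example. htmlspecialchars turns the markup into inert text, while strip_tags with an allow-list removes the script tags but leaves the event-handler attribute on the allowed tag in place, which is why a dedicated cleaner such as HtmlPurifier is suggested for real HTML input.

```php
<?php
// Hypothetical hostile comment, for illustration only.
$comment = '<b onmouseover="alert(1)">hi</b><script>alert(2)</script>';

// Escaping for an HTML text context: nothing in the result is live markup.
$escaped = htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');

// Allowing only <b>: the <script> tags are dropped, but the onmouseover
// attribute on <b> survives, as does the text that was inside the script.
$stripped = strip_tags($comment, '<b>');

echo $escaped, "\n";
echo $stripped, "\n";
```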
I so hope you are scrubbing the data before you send it to the database. It sounds like you are a prime target for a SQL injection attack. I know this is not your question, but it is something that you need to be aware of.
Yes, this is a problem. A lot of sites solve it by only allowing their own custom markup in user fields.
But if you really want to allow HTML, you'll need to scrub out all "script" tags. I believe there are libraries available that do this. But that should be sufficient to prevent JS execution in user-entered code.
This is how Stack Overflow does it, I think, over at RefactorMyCode.
You may want to consider Zend Filter, it offers a lot more than strip_tags and you do not have to include the entire Zend Framework to use it.
