Encoding problem when using htmlentities method - php

I have a character encoding problem in PHP. This is the code:
$n_event=$_GET['ndlann'];
$nom_complet=htmlentities(stripslashes($_POST['nom']));
$email_comment=htmlentities(stripslashes($_POST['email']));
$titre_comment=htmlentities(stripslashes($_POST['titre']));
$texte_comment=htmlentities(stripslashes(nl2br($_POST['commentaire'])));
$pays_comment=$_POST['pays'];
$date_ajout=date('Y/m/d');
The data is inserted into a database table; as you can see, it comes from a comment form.
When a user enters a comment containing characters from Eastern languages (Arabic, Hebrew, etc.), the input data turns into something like:
Ø´Ù�را عÙ�Ù� اÙ�Ù�Ù�ضÙ�Ø
I tried removing the htmlentities call and that works fine, but it opens up a security problem in the comment form (JavaScript in a comment would be executed).
What can I do in this situation?
Thanks.

Do not use htmlentities(), ever.
This function became obsolete a long time ago.
Use htmlspecialchars() instead.
You also have a bunch of nonsense in your code:
doing htmlentities(nl2br(*)) makes no sense, because the <br /> tags that nl2br inserts get escaped as well.
Make stripslashes conditional: apply it only if magic quotes are turned on.
There is a possible problem with the pays field, which is used without any processing.
I am also afraid that you're treating htmlentities as some sort of SQL escaping function. Am I right?

In my opinion, and according to the PHP documentation, the accepted answer is not correct.
Nowhere does it say that this function has been deprecated.
If you correctly set the third argument of the function, $encoding, to the encoding your data actually uses (for example 'UTF-8'), it will solve your problem.
I hope this helps.
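A minimal sketch of the point about the encoding argument, assuming the form data arrives as UTF-8 (the sample string is made up for illustration). With the encoding stated explicitly, the Arabic text passes through untouched while the markup is neutralised:

```php
<?php
// Hypothetical comment text: Arabic followed by a script tag.
$input = "شكرا <script>alert(1)</script>";

// Third argument names the encoding of $input (assumed UTF-8 here).
$safe = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

echo $safe, "\n"; // شكرا &lt;script&gt;alert(1)&lt;/script&gt;
```

htmlentities($input, ENT_QUOTES, 'UTF-8') accepts the same arguments; for this input it gives the same result, because Arabic letters have no named HTML entities.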

Related

PHP - How to call method from a variable

I hope you can help me out.
I am building a scraping script using Simple HTML DOM.
For a few sites I need to get the thumbnail path, the name of the movie, and some other details. I have built an admin panel where I save, in plain text, the method chain required to find each item based on the matching pattern.
E.g.:
$movie_name = $result->children(0)->children(0)->innertext;
This works just as it is supposed to, but when I save children(0)->children(0)->innertext in the database and then load it back into a variable, e.g.,
$variable = "children(0)->children(0)->innertext";
$movie_name = $result->$variable;
it does not work.
I am pretty sure I am going about this horribly wrong, so please give me a hint on how I could save the methods in plain text and then call them.
They must be stored in plain text because the DOM changes frequently, and this lets me keep up with it.
Best regards.
You're looking for PHP's eval():
eval("\$movie_name = \$result->$variable;");
Having said that, be warned that eval is evil.
Instead, I would recommend XPath.
Hope this helps!
Got it, eval() was the answer. Since no user input reaches eval(), it's pretty safe in my particular case. I just had to do some escaping and reference the variable containing the method chain inside eval().
This piece of code works for me.
$res_mov_url_e = eval("\$res_mov_url = \$result->$movie_url;");
Anyway big thanks guys!
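For anyone who would rather avoid eval(), here is a rough sketch of walking the stored chain step by step. FakeNode and resolveChain() are hypothetical names invented for this illustration; FakeNode only imitates the children()/innertext shape used in the question and is not part of Simple HTML DOM. The sketch assumes PHP 7 or later and only supports steps of the form name, name() and name(integer).

```php
<?php
// Hypothetical stand-in for a DOM node, for demonstration only.
class FakeNode
{
    public $innertext;
    private $kids;

    public function __construct($innertext = '', array $kids = [])
    {
        $this->innertext = $innertext;
        $this->kids = $kids;
    }

    public function children($i)
    {
        return $this->kids[$i] ?? null;
    }
}

// Walks a chain such as "children(0)->children(0)->innertext" without eval().
function resolveChain($node, string $chain)
{
    foreach (explode('->', $chain) as $step) {
        if ($node === null) {
            return null; // an earlier step found nothing
        }
        if (preg_match('/^(\w+)\((\d*)\)$/', $step, $m)) {
            // Method call with an optional integer argument.
            $node = $m[2] === '' ? $node->{$m[1]}() : $node->{$m[1]}((int) $m[2]);
        } elseif (preg_match('/^\w+$/', $step)) {
            // Plain property access.
            $node = $node->$step;
        } else {
            throw new InvalidArgumentException("Unsupported step: $step");
        }
    }
    return $node;
}

$result = new FakeNode('', [new FakeNode('', [new FakeNode('Some Movie')])]);
echo resolveChain($result, 'children(0)->children(0)->innertext'), "\n"; // Some Movie
```

Because every step has to match one of the two patterns, a stored string can only call methods and read properties on the node, which is narrower than what eval() would allow.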

The secure way to use GET parameter in the DOM?

Here is a simplified version of my code:
$html = "<a class='myclass' href='/path?arg=" . $_GET['param'] . "'>mylink</a>";
Today I was reading about XSS attacks, and I think my code is vulnerable to XSS. I'm not sure, but it smells like it.
If I'm right, how can I avoid that? Based on some research, one way is to use strip_tags(). Can I rely on it? Is that good enough?
This is about encoding the data with the correct function.
Always look at what you want to produce, then choose the encoder!
Samples:
When you are building HTML, it's good to use htmlspecialchars and/or htmlentities.
When you are building SQL for MySQL, it's good to use PDO::quote or mysqli_real_escape_string.
Answer:
In your case, you are building a URL. For this you need to use urlencode.
In addition, you also need to escape the result with htmlentities, because you are building HTML in the next step.
See the sample in the PHP manual for urlencode (Example #2).
You should use htmlspecialchars() whenever you output a parameter that came from the user.
$variable = htmlspecialchars($_GET['param'], ENT_QUOTES, 'UTF-8');
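Putting the two answers together, a sketch of the link from the question could look like this. buildLink() is a hypothetical helper name, and the attack string is only an example: the parameter is first encoded for the URL, then the whole URL is escaped for the HTML attribute.

```php
<?php
// Hypothetical helper: URL-encode the value, then HTML-escape the URL.
function buildLink(string $param): string
{
    $url = '/path?arg=' . urlencode($param);

    return "<a class='myclass' href='"
        . htmlspecialchars($url, ENT_QUOTES, 'UTF-8')
        . "'>mylink</a>";
}

// A value that would break out of the attribute if inserted raw.
$html = buildLink("'><script>alert(1)</script>");
echo $html, "\n";
```

In a real page the argument would be $_GET['param']. strip_tags() is not a substitute here, since it leaves quote characters alone, and a quote is all it takes to escape the href attribute.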

stripslashes issue in php

I am using stripslashes in PHP, but I am not getting the result I expect. Below is what I used in my code.
For example, the value in the table is suresh\'s kuma\"r
I am trying to display the value in the following three formats, but none of them gives the exact value:
1) value=<?=stripslashes($row[1])?> //output is suresh's
2) value='<?=stripslashes($row[1])?>' //output is suresh
3) value="<?=stripslashes($row[1])?>" //output is suresh's kuma
But the exact output I need is suresh's kuma"r
How can I resolve this issue?
The issue has nothing to do with stripslashes. If I guess correctly, the problem is that in your examples the quotes break the HTML attribute.
I'll show you by manually writing out your $row content as your examples would print it:
value=suresh's kuma"r --> with no quotes around it, the value ends at the first space, so the browser reads value="suresh's" followed by a stray kuma"r
value='suresh's kuma"r' --> same story: the apostrophe closes the attribute, giving value='suresh' plus leftovers
value="suresh's kuma"r" --> the double quote closes the attribute, giving value="suresh's kuma" plus leftovers
Escaping the quotes with backslashes won't help, since a backslash has no special meaning in HTML.
Both value="suresh" and value="suresh\" work fine for the browser, but the rest of your name will always be interpreted as some unknown attribute, leaving only the first part inside the value.
What you can do instead is apply htmlentities($row[1], ENT_QUOTES) so that the quotes are converted to the equivalent entities (&quot;, for example) and no longer break your value attribute. See the manual.
Another issue is that you shouldn't have backslashes in your database in the first place; this might be due to magic_quotes being enabled by your provider, or to you manually calling addslashes() or some other wrong trickery. If you want to insert values containing quotes into a database, use the escaping mechanism provided by your database driver (mysql_real_escape_string() in mysql, for example), or better tools (prepared statements with bound parameters).
You should first get rid of all the slashes by running stripslashes over the stored content and saving it back; but slashes or not, the issue will appear again if you don't escape the value appropriately for your HTML, as shown above.
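As a small illustration of the escaping step, assuming the backslashes have already been removed from the stored value (the input element and its name attribute are made up for the example):

```php
<?php
// The value as it should be stored: no backslashes.
$name = "suresh's kuma\"r";

// ENT_QUOTES converts both ' and " so neither can end the attribute.
$escaped = htmlspecialchars($name, ENT_QUOTES, 'UTF-8');

echo '<input type="text" name="name" value="' . $escaped . '">', "\n";
// <input type="text" name="name" value="suresh&#039;s kuma&quot;r">
```

The browser decodes the entities again when it reads the attribute, so the field shows suresh's kuma"r.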
Are you sure you want stripslashes instead of addslashes? Is the purpose to quote the " characters?

A PHP Function that verify code language

I have a form with two textareas: the first lets the user submit HTML code, the second CSS code. I have to verify, with a PHP function, that the language is correct.
If the language is correct, I also have to check, for security, that there is no PHP code, SQL injection, or anything similar.
What do you think? Is there a way to do that?
Where can I find this kind of function?
Is "HTML Purifier" http://htmlpurifier.org/ a good solution?
If you have to validate the data before inserting it into the database, then you just have to use the mysql_real_escape_string() function before inserting it.
//Safe database insertion
mysql_query("INSERT INTO table(column) VALUES('".mysql_real_escape_string($_POST['field'])."')");
If you want to output the data to the end user as plain text, then you have to escape all HTML-sensitive characters with htmlspecialchars(). If you want to output it as HTML, then you have to use the HTML Purifier tool.
//Safe plain text output
echo htmlspecialchars($data, ENT_QUOTES);
//Safe HTML output
$data = purifyHtml($data); //Or however it is specified in the purifier documentation
echo $data; //Safe html output
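The mysql_* functions used above were removed in PHP 7, so here is a hedged sketch of the same two steps with a PDO prepared statement. It assumes the pdo_sqlite extension is available and uses an in-memory SQLite database purely so the example is self-contained; the table and column names are invented.

```php
<?php
// In-memory SQLite database, used only to keep the sketch runnable.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE snippets (id INTEGER PRIMARY KEY, html TEXT)');

// Hypothetical submitted value containing quotes and SQL.
$field = "<p>it's</p>'); DROP TABLE snippets; --";

// The bound parameter is sent as data, never spliced into the SQL text.
$stmt = $pdo->prepare('INSERT INTO snippets (html) VALUES (:html)');
$stmt->execute([':html' => $field]);

// Reading it back returns the original string unchanged.
$stored = $pdo->query('SELECT html FROM snippets')->fetchColumn();

// Escape only at output time, for the context being written to.
echo htmlspecialchars($stored, ENT_QUOTES, 'UTF-8'), "\n";
```

With MySQL only the DSN passed to the PDO constructor changes; the prepare/execute calls stay the same.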
For something primitive you can use a regex, BUT note that using a parser to cover all possibilities is recommended.
/(<\?(?:php)?(.*)\?>)/i
Example: http://regexr.com?2t3e5 (change the &lt; in the expression back to a < and it will work; for some reason regexr changes it to HTML formatting)
EDIT
/(<\?(?:php)?(.*)(?:\?>|$))/i
That's probably better, so they can't place PHP at the end of the document (PHP doesn't actually require a closing tag).
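A rough sketch of how such a check might be wired up. containsPhpTag() is a hypothetical name, and the pattern is a variant of the one above, not the same expression: it adds the s modifier so that . also matches newlines, and it accepts the short echo tag <?=.

```php
<?php
// Hypothetical helper: true if the text contains something that looks
// like a PHP opening tag, closed or not.
function containsPhpTag(string $code): bool
{
    return preg_match('/<\?(?:php|=)?.*?(?:\?>|$)/is', $code) === 1;
}

var_dump(containsPhpTag('<p>hello</p>'));               // bool(false)
var_dump(containsPhpTag("<p>\n<?php echo 1;\n"));       // bool(true)
var_dump(containsPhpTag('<?xml version="1.0"?><a/>'));  // bool(true)
```

The last call shows the limit of the approach: an XML declaration is flagged too, which is one reason a real parser is the safer choice.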
SHJS, a syntax highlighter for JavaScript, has files with regular expressions for the languages it highlights: http://shjs.sourceforge.net/lang/. You can check how SHJS parses code.
HTMLPurifier is the recommended tool for cleaning up HTML. And as luck would have it, it also includes CSSTidy and can sanitize CSS as well.
... that there is not PHP code or SQL Injection or whatever.
You are basing your question on a wrong premise. While HTML can be cleaned, this is no safeguard against other exploits. PHP "tags" are most likely to be filtered out. But if you are doing something weird with the content (include-ing or eval-ing parts of it), that's no real help.
And SQL exploits can only be prevented by meticulously using the proper database escape functions. There is no magic solution to that.
Yes, HTML Purifier is a good tool to remove malicious scripts and validate your HTML. I didn't think it handled CSS, but apparently it works with CSS too. Thanks, Briedis.
OK, thank you all.
Actually, I realize that I need human validation. Users can post HTML + CSS, and I can verify in PHP that the language and the syntax are correct, but that doesn't prevent people from posting an iframe, an HTML redirect, or a big black div that covers the whole screen.
:-)

How to save JavaScript and HTML in an option without it being auto-escaped?

And there I thought I knew WordPress well. It now seems that update_option() auto-escapes code. If I want to save some JavaScript or HTML code in an option, this behavior renders the code unusable.
I refuse to do a str_replace on the returned value to filter out every backslash. There has to be a better way.
Here's the PHP for the text box to enter some code:
$option = unserialize(get_option('option'));
<textarea name="option[box]"><?php echo $option['box']; ?></textarea>
This is what happens after submitting the form (in essence):
update_option('option', serialize($_POST));
Any ideas?
Edit: I now got it to work by using PHP's stripslashes() where the script has to be rendered, and htmlentities(stripslashes()) in the text box to display the stored code. While this does the job, I'd still like to know if there is a better solution.
It now seems that update_option() auto-escapes code.
It only sanitizes the value for database entry. You'll find the real troublemaker is around line 750 in wp-settings.php, and the WP function add_magic_quotes().
Yep, you read that right, add magic quotes!
For some reason, WordPress decided to enforce magic quotes, so you'll always need to stripslashes on GET and POST when writing plugins and the like.
That's true, @TheDeadMedic. stripslashes must be used like this:
echo stripslashes(get_option( 'option' ));
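Since the form in the question posts an array (option[box]), a plain stripslashes() call is not enough on its own. Below is a standalone sketch of a recursive version; unslashDeep() is a hypothetical name, and the sample data imitates what the slashed $_POST would contain. WordPress itself ships stripslashes_deep() and wp_unslash() for this purpose, so inside a plugin those would normally be used instead.

```php
<?php
// Hypothetical helper: strip slashes from strings at any array depth.
function unslashDeep($value)
{
    if (is_array($value)) {
        return array_map('unslashDeep', $value);
    }
    return is_string($value) ? stripslashes($value) : $value;
}

// Imitation of the slashed POST data from the question.
$posted = ['option' => ['box' => '<script>alert(\\"hi\\");</script>']];

$clean = unslashDeep($posted);
echo $clean['option']['box'], "\n"; // <script>alert("hi");</script>
```

Removing the slashes once, before the value is saved, avoids having to call stripslashes() every time the option is read.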
