I have a string that contains special letters like "á" and HTML code like "<input type='text' />". When I store this string in my DB I use: htmlentities($string, ENT_QUOTES);.
The problem is that when I output the text, I use html_entity_decode($string_from_db, ENT_QUOTES), and all the entities I have in the database, like "&aacute;" for the letters and "&lt;input type=&#039;text&#039; /&gt;" for the HTML code, get converted. So my output shows the letter "á" and an actual text field, which is not what I want. I want the letter to display as it is, but for the field I want to show the code "<input type='text' />", not the actual field.
I need this for a multilingual site with a lot of user input, so I need to be able to process the special letters properly but also protect against bad input. Any advice is greatly appreciated.
Well, it seems I figured it out... at least for now. Here's what I'm doing:
I sanitize the text submitted by the user with:
function sanitize_form_input($string) {
    $string = mysql_real_escape_string($string);
    return $string;
}
I set the page encoding, PHP encoding, HTML encoding, MySQL encoding, and every other encoding setting I could find to UTF-8.
I output the text with:
function sanitize_db_output($string) {
    return htmlentities(stripslashes($string), ENT_QUOTES, 'UTF-8');
}
Please let me know if this is a wrong way to do it.
You could just do an additional htmlspecialchars after html_entity_decode; that function will only convert the characters which have a special function in HTML to their entity:
htmlspecialchars(html_entity_decode($string_from_db, ENT_QUOTES), ENT_QUOTES)
That should ensure that the resulting string has no unencoded HTML characters. Of course, performance-wise, that's probably not the best solution, but it's simple!
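As a rough sketch of that decode-then-escape idea: the helper name escape_for_html is made up for illustration, and the sample string is only an assumption about what the database holds after the htmlentities() storage step. The 'UTF-8' argument is passed explicitly so the result does not depend on the PHP version's default charset.

```php
<?php
// Hypothetical helper (name is illustrative, not from the thread).
function escape_for_html(string $storedText): string
{
    // Undo the entities written by the earlier htmlentities() storage step...
    $decoded = html_entity_decode($storedText, ENT_QUOTES, 'UTF-8');
    // ...then escape only the characters that are special in HTML.
    return htmlspecialchars($decoded, ENT_QUOTES, 'UTF-8');
}

// Assumed sample of what the database contains.
$fromDb = "&aacute; &lt;input type=&#039;text&#039; /&gt;";
echo escape_for_html($fromDb), PHP_EOL;
```

The letter comes back as a plain "á", while the tag stays escaped, so the browser shows the code instead of rendering a field.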
I have an issue with special characters. For example, the database contains "A & A" (the database is set to utf8_unicode_ci).
I am retrieving the values for the autosuggest list correctly with:
while ($row = mysql_fetch_array($result)) {
    $keywords = htmlspecialchars($row['name']);
    echo "<keywords>". $keywords ."</keywords>";
}
When I click to select "A & A", the input field is filled with "A &amp; A".
The header is set to: <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
Can you please let me know how to display the special character?
If you want to convert from A &amp; A to A & A, use htmlspecialchars_decode on the text.
If you want to convert from A & A to A &amp; A, use htmlspecialchars.
In your case, removing the htmlspecialchars operation on the text pulled from your database will do.
Since your issue appears to be with the & character being replaced with &amp;, maybe running something like $text = str_replace("&amp;", "&", htmlspecialchars($text)); will work better for you, while still preventing XSS.
If you have "A & A" in the database, htmlspecialchars will do that.
Remove htmlspecialchars.
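To make the two directions concrete, here is a small sketch of how htmlspecialchars and htmlspecialchars_decode treat the "A & A" value. The variable names are illustrative only. It also shows that escaping an already escaped string doubles the entity, which is one common way a literal "&amp;" ends up visible to the user.

```php
<?php
$name = 'A & A';                                             // raw value as stored in the database

$escaped  = htmlspecialchars($name, ENT_QUOTES, 'UTF-8');    // "A &amp; A"
$restored = htmlspecialchars_decode($escaped, ENT_QUOTES);   // back to "A & A"

// Escaping a second time encodes the "&" of "&amp;" again.
$doubled  = htmlspecialchars($escaped, ENT_QUOTES, 'UTF-8'); // "A &amp;amp; A"

echo $escaped, PHP_EOL, $restored, PHP_EOL, $doubled, PHP_EOL;
```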
It's better to use rawurlencode before inserting the value into the database. Then, whenever you fetch the value, use rawurldecode. I think it may solve your problem.
rawurldecode($row['name']);
Check this http://php.net/manual/en/function.rawurlencode.php
I have a string which is encoded with a base36-based scheme: the characters 0-9a-z are kept as they are,
and any other character is encoded as follows: its Unicode code point, converted to base36, preceded by a capital letter 'A' and followed by the letter 'B'.
If multiple such characters appear in a row, only the last one is followed by 'B'.
Example:
zergme@wtfd-婴儿服饰.com
converted as:
zergmeA1sBwtfdA19Ahv8Ag1rAkctAub4A1aBcom
It was convenient to convert the data that way, but now I'm racking my brain over how to write an algorithm to decode it back.
I already have a function that converts character codes to Unicode characters; let it be called unichr($code),
...but I can't think of a good way of finding these characters.
I was trying to use regexp first, something like:
preg_replace('/A.*?B?(?=[AB])/',"$1",$mail);
But it didn't work the way I wanted... And I also couldn't figure out how to apply my custom conversion function, unichr(), to the matches.
Then I was also thinking about manually finding the characters with strpos(), but that also turned out to be messy.
Could you advise on a pattern? Should I elaborate on the regexp or try to use some loop? I'm kind of blank... Thanks :)
That's it. It looks like I figured it out, thanks to your contribution:
'/A(.*?)((?=A)|B)/'
Have you looked into using preg_replace_callback() instead? It takes a function instead of a string as the replace value, and will pass the matches to the function and use the function's return value as the replace string.
Loose example, you'll have to play around a bit
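<?php
$str = 'zergmeA1sBwtfdA19Ahv8Ag1rAkctAub4A1aBcom';

function convert_to_unicode_cb( $match )
{
    // $match[1] would be 1s, 19, hv8, etc. (base36 digits);
    // unichr() is the question's own helper, assumed to take an integer code point
    return unichr( intval( $match[1], 36 ) );
}

// Consume the trailing B as well, so it does not remain in the output
echo preg_replace_callback( '/A(.*?)(?:B|(?=A))/', 'convert_to_unicode_cb', $str );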
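For a version that runs on its own, the following sketch decodes the sample string end to end. Both function names are hypothetical: unichr_utf8 is a stand-in for the question's unichr() and simply builds the UTF-8 bytes for a code point by hand, and the pattern is a stricter variant of the one above that only accepts base36 digits. Working the first escape through by hand, "1s" in base36 is 1*36 + 28 = 64, which is the code point of "@".

```php
<?php
// Hypothetical stand-in for the question's unichr(): code point -> UTF-8 bytes.
function unichr_utf8(int $code): string
{
    if ($code < 0x80) {
        return chr($code);
    }
    if ($code < 0x800) {
        return chr(0xC0 | ($code >> 6)) . chr(0x80 | ($code & 0x3F));
    }
    if ($code < 0x10000) {
        return chr(0xE0 | ($code >> 12))
             . chr(0x80 | (($code >> 6) & 0x3F))
             . chr(0x80 | ($code & 0x3F));
    }
    return chr(0xF0 | ($code >> 18))
         . chr(0x80 | (($code >> 12) & 0x3F))
         . chr(0x80 | (($code >> 6) & 0x3F))
         . chr(0x80 | ($code & 0x3F));
}

// Hypothetical decoder for the A<base36>...B scheme described in the question.
function decode_base36_escapes(string $encoded): string
{
    // Each escape is "A" + base36 digits, ended by "B" or by the next "A".
    return preg_replace_callback(
        '/A([0-9a-z]+)(?:B|(?=A))/',
        function (array $m): string {
            return unichr_utf8(intval($m[1], 36));
        },
        $encoded
    );
}

echo decode_base36_escapes('zergmeA1sBwtfdA19Ahv8Ag1rAkctAub4A1aBcom'), PHP_EOL;
```

This relies on the encoded form using capital A and B only as markers, with everything else in lowercase, which is what the question describes.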
How about Base64 encoding (gzcompress) and decoding (gzuncompress)?
Save the following with the name "testBase64.php":
<?php
if(isset($_POST['text'])){
echo("<b>input:</b> ".$_POST['text']."<br/>");
$c = gzcompress($_POST['text']);
echo("<b>base64 encoding:</b> .".$c."<br/>");
echo("<b>base64 decoding:</b> " .gzuncompress($c));
exit;
}
?>
<html>
<body>
<form method=post action=testBase64.php>
<input type=text name=text />
<input type=submit />
</form>
</body>
</html>
Run and enter "zergme#wtfd-婴儿服饰.com" in the text field.
Output:
input: zergme#wtfd-婴儿服饰.com
base64 encoding: .xœ«J-JÏMu(/IKÑUS62645³Òæ–– ÚÌØÂH[YXë%ççG°#
base64 decoding: zergme#wtfd-婴儿服饰.com
Hope this helps.
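One caveat on the example above: gzcompress() returns zlib-compressed binary data rather than Base64 text, which is why the middle line of the output is unreadable. If an ASCII-only representation is the goal, a minimal sketch with PHP's actual Base64 functions could look like this (the sample text is just part of the address used above):

```php
<?php
$text = 'wtfd-婴儿服饰.com';

// base64_encode() yields only A-Z, a-z, 0-9, "+", "/" and "=" padding.
$encoded = base64_encode($text);

// Strict mode (second argument true) returns false on invalid input.
$decoded = base64_decode($encoded, true);

echo $encoded, PHP_EOL, $decoded, PHP_EOL;
```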
I'm inserting the following TEXT value into MySQL using:
$groupname = addslashes($_POST['groupname']);
When getting the value from MySQL I'm using:
$name = $row['groupname'];
echo $name;
And this shows correctly as "Mr. Davis's Group",
but when this value is added to a form as a hidden input,
and I then pass the value to another page and retrieve it as
$name = $_POST['groupname'];
echo $name;
it shows up as "Mr. Davis", keeping only what comes before the apostrophe.
I have no clue why. I've tried adding stripslashes($_POST['groupname']); and the same thing happens.
<input name='groupname' type='hidden' value='$groupname' />
Will generate:
<input name='groupname' type='hidden' value='Mr Davis's Group' />
                                                     ^----
At the indicated spot, the browser's parser will see the 'end' of the value=, followed by some unknown attribute s and a broken attribute Group '.
To embed this type of text in a form, you need to use htmlspecialchars(), which converts HTML metacharacters (&, <, >, ", ') into their character entity equivalents so they can be safely embedded in a form. Pass the ENT_QUOTES flag, since single quotes are only converted when it is set (it is part of the default flags from PHP 8.1 onward).
addslashes() is an outdated and unreliable method of "safely" adding something into a database. It will not make something safe to embed in HTML.
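A minimal sketch of the fix for the hidden field, assuming the value has already been read from the database into $groupname:

```php
<?php
$groupname = "Mr. Davis's Group";

// ENT_QUOTES makes htmlspecialchars() escape the apostrophe as well,
// so it can no longer terminate the single-quoted attribute value.
$safe = htmlspecialchars($groupname, ENT_QUOTES, 'UTF-8');

$html = "<input name='groupname' type='hidden' value='" . $safe . "' />";
echo $html, PHP_EOL;
```

The browser decodes the entity when it parses the attribute, so the next page still receives "Mr. Davis's Group" in $_POST['groupname'].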
Check the text encoding of your input webpage. Match your db charset - use utf-8.
I have used htmlentities for UTF-8 in PHP.
$input = "<div> 'Testing' </div>";
echo htmlentities($input,ENT_NOQUOTES,"UTF-8");
The above encoding works for normal input, but if I give the input below and encode it, I get blank output.
$input = "<div>Other 'user' is working on this line. Please contribute the next line.</div>";
echo htmlentities($input,ENT_NOQUOTES,"UTF-8");
I don't know why this gives blank output.
If I print $input, I get the value below:
<div>Other user working on this line.�Please contribute the next line.</div>
Is anything missing in the htmlentities code? Please provide your suggestions.
Thanks,
-Pravin.
Try passing $input to utf8_encode first, and then passing the data to htmlentities with only the ENT_NOQUOTES option set:
<?php
$input = "<div>Other 'user' is working on this line. Please contribute the next line.</div>";
echo htmlentities(utf8_encode($input),ENT_NOQUOTES);
?>
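The blank output is consistent with how htmlentities() handles bytes that are not valid in the declared encoding: it returns an empty string unless ENT_IGNORE or ENT_SUBSTITUTE is set. The sketch below assumes the stray character is a single 0xA0 byte (a non-breaking space in ISO-8859-1); that is a guess based on the "�" in the printed value, not something the question confirms.

```php
<?php
// Assumption: "\xA0" stands in for the unknown stray byte in the question's input.
$input = "<div>this line.\xA0Please contribute</div>";

// Invalid UTF-8 with no substitute flag: the whole result is an empty string.
$blank = htmlentities($input, ENT_NOQUOTES, 'UTF-8');

// ENT_SUBSTITUTE replaces the invalid byte with U+FFFD instead of giving up.
$kept = htmlentities($input, ENT_NOQUOTES | ENT_SUBSTITUTE, 'UTF-8');

var_dump($blank === '');
echo $kept, PHP_EOL;
```

Note that utf8_encode() is deprecated as of PHP 8.2; mb_convert_encoding($input, 'UTF-8', 'ISO-8859-1') from the mbstring extension performs the same conversion when the source really is ISO-8859-1.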
I am retrieving data from my SQL database.
The data exactly as it is in the DB: (21:48:26) <username> some text here. is it ok?
When I try echo $row['log']."<br>";
it displays as: (21:48:26) some text here. is it ok?
I assume this is due to the <> brackets making the browser think it's an HTML tag. Is that the case? If so, how would one echo a string that contains HTML?
Use htmlspecialchars() to translate HTML control characters into their entities:
echo htmlspecialchars($row['log'])."<br>";
You need to escape the characters so it is not recognized as an HTML element, but as text:
echo htmlentities( $row['log'] ) . '<br/>';
"i assume this is due to the <> brackets making it think its an HTML opener..."
Yes, any construction in <> brackets is treated by the web browser as an HTML tag. So you should use either htmlspecialchars() or htmlentities(), or some similar custom function, to convert the "<" and ">" symbols to the "&lt;" and "&gt;" strings, which are displayed to the user as brackets.
Some more comments:
ALL text data displayed to the user must be passed through the htmlspecialchars() function (or through another function with similar behavior), since "some text" may also contain tags, etc.
Probably it would be better to store date/time, username and "some text" in separate table columns in DB, in order to satisfy relational database constraints. This may require some additional input data parsing.
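Applied to the log line from the question, the escaping step might look like the sketch below; the $row array is filled in by hand here in place of a real database fetch.

```php
<?php
// Stand-in for a row fetched from the database.
$row = ['log' => '(21:48:26) <username> some text here. is it ok?'];

// Escape the stored text; the <br> is appended afterwards so it stays real markup.
$line = htmlspecialchars($row['log'], ENT_QUOTES, 'UTF-8');

echo $line, "<br>", PHP_EOL;
```

In the page source the name appears as &lt;username&gt;, which the browser renders as the literal text <username>.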