I'm keeping a database that is filled automatically by my users. But when there is an input like My Father's Will, it gets into the database as: My Father&#039;s Will.
This is not what I want. Can someone tell me how to enable these kinds of special characters, or possibly a workaround so that these ugly characters are not shown to my users?
I'm using PHP, a MySQL server and PHPMyAdmin as DB Management tool.
It looks like the ' is escaped like an HTML character. I guess you're using the wrong kind of escaping, like htmlentities instead of mysql_real_escape_string. If this info doesn't help, please post your code; without it we can only guess.
When you pull the values out of your database, use htmlspecialchars_decode(). This will convert HTML special characters back into regular text. Pass ENT_QUOTES so that the single-quote entity is decoded as well.
$str = 'My Father&#039;s Will';
echo htmlspecialchars_decode($str, ENT_QUOTES);
will output:
My Father's Will
I can't really figure out what you are asking, since "My Father's Will" and "My Father's Will" are exactly the same?
But it seems like a problem related to either string escaping in PHP or a conflicting encoding in the MySQL database. Try to have a look into both, and feel free to specify your question a bit more.
It sounds like you might be escaping your input (for example with PHP's htmlentities()) on its way to the database. The correct thing to do would be to escape it only on output back to the screen.
Most likely you have a call to htmlspecialchars(..., ENT_QUOTES) in your code somewhere, which would encode ' and " into character entities. If they're in the database in encoded form, and the end-user sees the character entities, then you're double-encoding and your script's output is something like &amp;#039;.
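To make the double-encoding concrete, here is a minimal sketch. The variable names are made up for illustration, and it assumes the value was run through htmlspecialchars() once before being stored and once more when printed, which is only a guess at what the asker's code does.

```php
<?php
// Hypothetical user input; the variable names are illustrative only.
$input = "My Father's Will";

// Problematic flow: encode before storing, then encode again when printing.
$stored = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');
$doubleEncoded = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');

// Suggested flow: store the raw text, encode once when printing.
$encodedOnce = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

echo $doubleEncoded, "\n"; // My Father&amp;#039;s Will (browser shows the entity text)
echo $encodedOnce, "\n";   // My Father&#039;s Will (browser shows the apostrophe)
```

If the database already holds encoded values, decoding them once and storing the raw text is probably cleaner than decoding on every read.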
Related
I'm trying to output the name of a project, e.g. "David's Project", in a form when a user has not correctly filled in all the data, to save the user having to input the name again.
If I var_dump $name I see David's project. But if I echo $name I see David&#39;s Project. I realise that ' (single quote) becomes &#39;, but I have tried using ENT_NOQUOTES and ENT_COMPAT to avoid encoding the single quote and neither works.
$name = trim(filter_input(INPUT_POST, 'name0', FILTER_SANITIZE_STRING));
<form method="post" class="form" />
Title: <input type="text" name="name0" value="<?php echo
htmlspecialchars($name, ENT_NOQUOTES); ?>">
Am I doing something wrong, or should ENT_NOQUOTES work? I tried using str_replace to replace ' with \' but this didn't work either.
The only way round this I have found is to use this:
htmlspecialchars_decode(htmlspecialchars($name, ENT_NOQUOTES));
Is that acceptable?
Sorry I realise this is probably a really stupid question but I just can't get my head around it.
Thanks for any replies.
You can accept a simple answer if it solves your problem, but you should really understand that what you have run into is a much larger issue that you or someone else has created for you.
Databases should not contain HTML encoded characters unless they are specifically meant for storing HTML. I highly doubt this is the case as it very rarely is.
Someone is inserting HTML into your database (HTML-encoding data on insert). This means that if you ever want to use a mobile app that is not HTML based, or a command line, or anything at all that uses the data and isn't HTML based, you are going to run into a weird problem where the HTML-encoded characters have to be removed on output. This is a backwards way to do it and can often cause issues.
You rarely need to "sanitize" your inputs. If anything, you should reject input that is not allowed OR simply escape it in the proper way while inserting it into the database. Sanitizing is only a thing in very special circumstances, which you don't appear to have right now. You're simply inputting and outputting text.
You should pretty much never change users' input.
My suggestion, if possible, is to fix your INSERT code first so it isn't HTML-encoding data. The HTML encoding should happen when you output the data TO AN HTML FORMAT. You would use htmlspecialchars() to do this.
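As a rough sketch of encoding at output time for the form in the question: the helper name render_title_input is hypothetical, and the hard-coded string stands in for the raw $_POST['name0'] value. It also leaves out FILTER_SANITIZE_STRING, which is what turned the quote into an entity in the first place (and is deprecated as of PHP 8.1).

```php
<?php
// Hypothetical helper: builds the title field from the raw, unmodified name.
function render_title_input(string $name): string
{
    // Encode once, at output time; ENT_QUOTES also covers the single quote.
    $value = htmlspecialchars($name, ENT_QUOTES, 'UTF-8');
    return '<input type="text" name="name0" value="' . $value . '">';
}

// In a real handler this would come from $_POST['name0'], trimmed but not "sanitized".
$name = trim("  David's Project ");
echo render_title_input($name), "\n";
// <input type="text" name="name0" value="David&#039;s Project">
```

The browser decodes the attribute value, so the user sees David's Project in the field and the same raw text is posted back.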
I have some text that I will be saving to my DB. Text may look something like this: Welcome & This is a test paragraph. When I save this text to my DB after processing it using htmlspecialchars() and htmlentities() in PHP, the sentence will look like this: Welcome &amp; This is a test paragraph.
When I retrieve and display the same text, I want it to be in the original format. How can I do that?
This is the code that I use:
$text= htmlspecialchars(htmlentities($_POST['text']));
$text= mysqli_real_escape_string($conn,$text);
There are two problems.
First, you are double-encoding HTML characters by using both htmlentities and htmlspecialchars. Both of those functions do the same thing, but htmlspecialchars only does it with a subset of characters that have HTML character entity equivalents (the special ones.) So with your example, the ampersand would be encoded twice (since it is a special character), so what you would actually get would be:
$example = 'Welcome & This is a test paragraph';
$example = htmlentities($example);
var_dump($example); // 'Welcome &amp; This is a test paragraph'
$example = htmlspecialchars($example);
var_dump($example); // 'Welcome &amp;amp; This is a test paragraph'
Decide which one of those functions you need to use (probably htmlspecialchars will be sufficient) and use only one of them.
Second, you are using these functions at the wrong time. htmlentities and htmlspecialchars will not do anything to "sanitize" your data for input into your database. (Not saying that's what you're intending, as you haven't mentioned this, but many people do seem to try to do this.) If you want to protect yourself from SQL injection, bind your values to prepared statements. Escaping it as you are currently doing with mysqli_real_escape_string is good, but it isn't really sufficient.
htmlspecialchars and htmlentities have specific purposes: to convert characters in strings that you are going to output into an HTML document. Just wait to use them until you are ready to do that.
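A minimal sketch of that order of operations follows. It uses PDO with an in-memory SQLite database purely so that it runs on its own (this assumes the pdo_sqlite extension is installed); the table and column names are invented, and with MySQL only the connection details would change.

```php
<?php
// Assumption: pdo_sqlite is available. With MySQL only the DSN and
// credentials passed to the PDO constructor would differ.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)'); // hypothetical table

// Stand-in for $_POST['text']; stored exactly as typed.
$text = 'Welcome & This is a test paragraph';

// The bound parameter keeps the value out of the SQL string itself.
$insert = $pdo->prepare('INSERT INTO posts (body) VALUES (:body)');
$insert->execute([':body' => $text]);

$stored = $pdo->query('SELECT body FROM posts')->fetchColumn();

// Encode only now, when the text goes into an HTML page.
$html = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');
echo $html, "\n"; // Welcome &amp; This is a test paragraph
```

The stored value stays in its original format, so nothing has to be decoded when it is read back.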
I'm using a 3rd party API that seems to return its data with the entity codes already in there, such as The Lion&#8217;s Pride.
If I print the string as-is from the API it renders just fine in the browser (in the example above it would put in an apostrophe). However, I can't trust that the API will always use the entities in the future, so I want to use something like htmlentities or htmlspecialchars myself before I print it. The problem with this is that it will encode the ampersand in the entity code again, and the end result will be The Lion&amp;#8217;s Pride in the HTML source, which doesn't render anything user friendly.
How can I use htmlentities or htmlspecialchars only if it hasn't already been used on the string? Is there a built-in way to detect if entities are already present in the string?
No one seems to be answering your actual question, so I will.
How can I use htmlentities or htmlspecialchars only if it hasn't already been used on the string? Is there a built-in way to detect if entities are already present in the string?
It's impossible. What if I'm making an educational post about HTML entities and I want to actually print this on the screen:
The Lion&#8217;s Pride
... it would need to be encoded as...
The Lion&amp;#8217;s Pride
But what if that was the actual string we wanted to print on the screen? ... and so on.
Bottom line is, you have to know what you've been given and work from there – which is where the advice from the other answers comes in – which is still just a workaround.
What if they give you double-encoded strings? What if they start wrapping the HTML-encoded strings in XML? And then wrap that in JSON? ... And then the JSON is converted to binary strings? The possibilities are endless.
It's not impossible for the API you depend on to suddenly switch the output type, but it's also a pretty big violation of the original contract with your users. To some extent, you have to put some trust in the API to do what it says it's going to do. Unit/Integration tests make up the rest of the trust.
And because you could never write a program that works for any possible change they could make, it's senseless to try to anticipate any change at all.
Decode the string, then re-encode the entities. (Using html_entity_decode())
$string = htmlspecialchars(html_entity_decode($string));
https://eval.in/662095
There is NO WAY to do what you ask for!
You must know what kind of data is the service giving back.
Anything else would be guessing.
Example:
What if the service is giving back &amp; but is not escaping?
You would guess it IS escaping, so you would wrongly interpret it as & while the correct value is &amp;
I think the best solution is to first decode all HTML entities/special chars in the original string, and then HTML-encode the string again.
That way you will end up with a correctly encoded string, whether or not the original string was encoded.
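A small sketch of this decode-then-encode approach, with a hypothetical helper name. It also shows the limit raised in the other answers: an encoded string and a raw string end up with the same output, so the function cannot tell which one it was given.

```php
<?php
// Hypothetical helper name; decodes any entities, then encodes exactly once.
function normalize_for_html(string $s): string
{
    $decoded = html_entity_decode($s, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    return htmlspecialchars($decoded, ENT_QUOTES, 'UTF-8');
}

echo normalize_for_html('Tom & Jerry'), "\n";     // Tom &amp; Jerry
echo normalize_for_html('Tom &amp; Jerry'), "\n"; // Tom &amp; Jerry

// Both inputs give the same output. If the second one was raw text that was
// meant to display "&amp;" literally, that intent is lost.
```

So this works as long as you know the source never sends raw text containing something that looks like an entity.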
You also have the option of using htmlspecialchars_decode();
$string = htmlspecialchars_decode($string);
It's already in htmlentities:
php > echo htmlentities('Hi&amp;mom', ENT_HTML5, ini_get('default_charset'), false);
Hi&amp;mom
php > echo htmlentities('Hi&amp;mom', ENT_HTML5, ini_get('default_charset'), true);
Hi&amp;amp;mom
Just use the (optional) 4th argument to NOT double-encode.
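Applied to the API example from the question, the same fourth argument (double_encode) on htmlspecialchars() might be used like this. The two sample strings are invented to stand in for API responses.

```php
<?php
// Hypothetical API responses: one already entity-encoded, one plain.
$encoded = 'The Lion&#8217;s Pride & Co';
$plain   = 'Fish & Chips <fresh>';

// Fourth argument (double_encode) set to false: existing entities are left alone.
$a = htmlspecialchars($encoded, ENT_QUOTES, 'UTF-8', false);
$b = htmlspecialchars($plain, ENT_QUOTES, 'UTF-8', false);

echo $a, "\n"; // The Lion&#8217;s Pride &amp; Co
echo $b, "\n"; // Fish &amp; Chips &lt;fresh&gt;
```

This has the same ambiguity as any other guess: text that is supposed to show an entity literally will still be treated as already encoded.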
In my PHP code, I'm setting up an area for people to enter their own info to be displayed. The info is stored in an array and I want to make it as flexible as possible.
If I have something like...
$myArray[]['Text'] = 'Don't want this to fail';
or
$myArray[]['Text'] = "This has to be "easy" to do";
How would I go about escaping the apostrophe or quote within the array value?
Thanks
Edit: Since there is only a one to one relationship, I changed my array to this structure...
$linksArray['Link Name'] ='/path/to/link';
$linksArray['Link Name2'] ='/path/to/link2';
$linksArray['Link Name3'] ='/path/to/link3';
The plan is I set up a template with an include file that has these links in a format someone else (a less technical person) can maintain. They will have direct access to the PHP and I'm afraid they may put a single or double quote in the "link name" area and break the system.
Thanks again.
POSSIBLE SOLUTION:
Thanks @Tim Cooper.
Here's a sample that worked for me...
$link = "http://www.google.com";
$text = <<<TEXT
Don't you loving "googling" things
TEXT;
$linksArray[$text] = $link;
Using a heredoc might be a good solution:
$myArray[]['Text'] = <<<TEXT
Place text here without escaping " or '
TEXT;
PHP will process these strings properly upon input.
If you are constructing the strings yourself as you have shown, you can alternate between quotation styles (single and double)...as in:
$myArray[]['Text'] = "Don't want this to fail";
$myArray[]['Text'] = 'This has to be "easy" to do';
Or, if you must escape the characters, you use the \ character before the quotation.
$myArray[]['Text'] = 'Don\'t want this to fail';
$myArray[]['Text'] = "This has to be \"easy\" to do";
If you really want to make it easy, use a separate configuration file in either INI or XML style. INI is usually the easiest for people to edit manually. XML is good if you have a really nested structure.
Unless you are letting users enter direct PHP code (you probably aren't), you don't have to worry about what they enter until you go to display it. When you actually display the info they enter, you will want to sanitize it using something like htmlentities().
Edit: I realize I may be misunderstanding your question. If so, ignore this! :)
You can use the addslashes($str) function to automatically escape quotes.
You can also try htmlentities, which will encode quotes and other special values into HTML entities: http://php.net/manual/en/function.htmlentities.php
I have a form into which I entered a newline character which looked correct when I entered it, but when the data is now pulled from the database, instead of the white space, I get the \n\r string showing up.
I try to do this:
$hike_description = nl2br($hike_description);
But it doesn't work. Does anyone know how this can be fixed? I am using PHP.
And here is the page where this is happening. See the description section of the page:
http://www.comehike.com/hikes/scheduled_hike.php?hike_id=130
Thanks,
Alex
Does anyone know how this can be fixed?
Sure.
Your code is doing unnecessary escaping, most likely before adding text to the database.
So, instead of replacing it back, you have to find that harmful code and get rid of it.
This means you probably have plain-text '\n\r' strings in the db.
Try to sanitize db output before display:
$sanitized_text = preg_replace('/\\\\[rn]/', '', $text_from_db);
(just a guess).
Addendum:
Of course, as Col. Shrapnel pointed out, there's something fundamentally wrong
with the contents of the database (or, it is used this way by convention and you don't know that).
For now, you have fixed a symptom partially
but it would be much better to look for the reason for these escaped characters
being in the database at all.
Regards
rbo
You can use str_replace to clean up the input.
$hike_description = nl2br(str_replace("\r\n", "\n", $hike_description));
$hike_description = str_replace(array('\n','\r'),'',$hike_description);
You may want to read up on the differences between the single quote and double quote in PHP as well: http://php.net/manual/en/language.types.string.php
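To illustrate the quoting difference, here is a short sketch. It assumes the database really contains the literal backslash sequences (which is what the symptom suggests), and the sample text is made up.

```php
<?php
// Single quotes: literal backslash sequences, as they might sit in the database.
$hike_description = 'Meet at the trailhead.\r\nBring water.';

// Single-quoted search strings match the literal text; the double-quoted
// replacement is a real newline character.
$fixed = str_replace(array('\r\n', '\n', '\r'), "\n", $hike_description);

// Escape for HTML first, then turn real newlines into <br /> tags.
$html = nl2br(htmlspecialchars($fixed, ENT_QUOTES, 'UTF-8'));
echo $html, "\n";
```

This only repairs the display; the longer-term fix is still to find where the extra escaping happens before the insert.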