I have a search form and use the following line to get the value. When the value comes back, plus signs are replaced with spaces and everything after a single or double quote is deleted. I want users to be able to search for any keywords they want. How can I get these characters to come through?
$title = trim(filter_input(INPUT_GET, 'title', FILTER_SANITIZE_SPECIAL_CHARS));
Then I redirect with GET:
header("Location:http://site.org/search/?title=$title");
I tried using urlencode(), which works for plus signs, but it didn't work for quotes. For example, c"s would come back as c&#34;s.
Thanks.
Those characters have a special meaning in a URL. For instance, a plus sign in a GET query string stands for a space.
Please see this link.
You will have to replace these characters before you redirect. You can do this with urlencode.
From the manual for FILTER_SANITIZE_SPECIAL_CHARS
HTML-escape '"<>& and characters with ASCII value less than 32,
optionally strip or encode other special characters.
If the ASCII values of the characters you are trying to send are below 32, you will probably need to go a different route.
You need to encode / escape your data for the medium you are outputting to:
use urlencode to send the variables via a url;
use prepared statements with bound variables to query the database, that way you are safe from sql injection and you can search for texts with quotes in the database;
use htmlspecialchars to output your results to html or something like json_encode to send it to javascript.
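As a rough sketch of the list above, the same raw value can be encoded differently for each destination. The helper names buildSearchUrl and escapeForHtml are made up for illustration; only the PHP built-ins are real. The database step is left out because it needs a live connection; there the raw value would be bound as a parameter of a prepared statement rather than escaped by hand.

```php
<?php
// Hypothetical helper: build the redirect URL from the RAW value.
// urlencode() turns '+' into %2B and '"' into %22 so they survive the round trip.
function buildSearchUrl(string $title): string
{
    return 'http://site.org/search/?title=' . urlencode($title);
}

// Hypothetical helper: escape only at the moment the value is printed into HTML.
function escapeForHtml(string $title): string
{
    return htmlspecialchars($title, ENT_QUOTES, 'UTF-8');
}

// Raw value, standing in for trim($_GET['title'] ?? '') without a sanitize filter.
$title = 'c"s+d';

echo buildSearchUrl($title), "\n";  // http://site.org/search/?title=c%22s%2Bd
echo escapeForHtml($title), "\n";   // c&quot;s+d
echo json_encode($title), "\n";     // "c\"s+d"
```

The point of the sketch is that the value is kept raw in PHP and encoded once per output, instead of being HTML-escaped on input and then sent through a URL.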
I have a password with special chars in my php file.
With the original string, the page returns an HTTP 500 error.
The following chars are the root cause.
()[]$
The line of code is:
private $password = "abc(de)fgh[ijk]lmn$opq$";
How can I correctly escape those chars?
I have tried to replace them with the HTML charset, as well as \\
Single quotes are the simplest way to make a string. They just display what they are given, no bells and whistles, no special "powers" like being able to show variable values.
Use Single quotes.
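A minimal sketch of that advice, using the password from the question. The class name Credentials and the getter are assumptions added only to make the snippet runnable. The likely culprit is $opq, which a double-quoted string treats as a variable; the brackets and parentheses are harmless on their own.

```php
<?php
// Hypothetical class wrapping the property from the question.
class Credentials
{
    // Single quotes: no interpolation, so "$opq" and the trailing "$" stay literal.
    private $password = 'abc(de)fgh[ijk]lmn$opq$';

    public function getPassword(): string
    {
        return $this->password;
    }
}

$credentials = new Credentials();
echo $credentials->getPassword(), "\n";  // abc(de)fgh[ijk]lmn$opq$

// Alternative: keep double quotes and escape the dollar sign that starts a variable name.
echo "abc(de)fgh[ijk]lmn\$opq$", "\n";   // abc(de)fgh[ijk]lmn$opq$
```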
So I have a string with \x codes such as "\xA3" and printing these in UTF-8 encoding will print out the correct symbols such as £ in this case.
But how can I keep the original representation so that it just prints out \xA3 instead?
I've tried using str_replace("\\x", "\\x", $xcodes) and I've also tried preg_replace('#\\\\x\w+#', '\\x', $xcodes) but neither works.
Edit 1:
Please note the strings are from external sources so using single/double quotes in PHP doesn't make a difference. As soon as I print them out directly as they are, these characters are converted to symbols which is why I want to find a way to preserve the \x codes and see the raw presentation.
Edit 2:
The strings are coming from the user agent with codes such as \xA3\xA9 (yes, Shellshock exploit tests) so when I run the following:
$ua = $_SERVER['HTTP_USER_AGENT'];
// Prepare query with PDO, "insert into user_agents (values) values (:ua)"
// Define placeholder :ua as string $ua.
// Execute insert query.
MySQL will return an "invalid string value, \xA3\xA9" error. This is partly why I want to retain the hex code representation but also so that I can see the non-printable characters.
So instead of "Chrome 70.45[Non-printable characters]", I'd see "Chrome 70.45\xA3\xA9".
I was wondering if anybody knew how to get around this problem.
I am gathering user input from a HTML form which is then posted using htmlspecialchars into PHP to avoid issues when using quotes/etc...
However, I also want to run server-side validation checks on the data being gathered through regular expressions - though I'm not sure how to go about this.
So far, I have thought of decoding the htmlspecialchars output, but because I am going to use the strings straight away, the code could break after I run this conversion. For example, let's say the user entered a single double-quote character, ", into a field. This would be converted to &quot;; if I then decode it and use it in a variable, it could end up like $string = """; which is going to give me issues.
Any advice on this would be greatly appreciated!
You seem to misunderstand the difference between data and how this data is altered to be parseable in a certain context.
A php string can contain any data. What is stored in this string is the "raw" form: the form in which we want to manipulate the data if needed.
In certain contexts, not all characters are valid. For example, in an HTML textarea, the < and > characters may not be used directly, because they are special characters. We still want to be able to use these characters, so we escape them. By escaping a special character it loses its special meaning. In the context of an HTML textarea, the < character is escaped as the sequence &lt;. Unlike the < character, this escaped sequence does not have a special meaning in HTML, so if we send the following to the browser, it knows how to parse it and display the right thing: <textarea>&lt;</textarea>. When we talk about the data this textarea contains, we do not say that it contains &lt;, but that it contains <.
As you said, in a php script, in a double quoted string, the " character has a special meaning. This has only to do with parsing. PHP simply does not know how to parse a sequence $str = """;. If we would want to have the double quote in such a double quoted string, we would need to escape it. We escape a double quote in a php double quoted string by prepending it with a \. To make a string containing a single double quote, using the double quoted notation, you would write $str = "\"";.
However, none of this matters here. You are taking input from an HTML form. When you click the submit button, the browser reads what is in the textarea (with any HTML escaping already decoded). The browser then encodes it as dictated by the form tag and sends it to the server. The server decodes that blob of text back into its raw form. That data is passed to PHP, and it is this form you will encounter in $_POST['myTextarea'].
In conclusion: If data is encoded, realize for which context it was encoded and decode it based on that context. You do not need to escape for php quoted strings, because you are working on internal strings. There is nothing to parse. Remind yourself that when you are going to use the data somewhere, that you should take care that all special characters in your data for that particular context are escaped.
I suppose the htmlspecialchars() function is called after the form is posted to PHP. The simplest solution is then to match against the regular expression first and call htmlspecialchars() afterwards.
Also, if you have a string encoded with htmlspecialchars(), then after decoding it with htmlspecialchars_decode() PHP's internal representation is the same as that of the literal "\"", so nothing breaks. There is a big difference between how you write strings by hand in a PHP file and how PHP handles them internally. You really don't need to worry about this.
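Here is one way the "validate first, escape last" order could look. The function isValidName and its rule (1 to 50 letters, spaces, quotes, or hyphens) are invented for this sketch, and $raw stands in for a value such as $_POST['myTextarea'].

```php
<?php
// Hypothetical validation rule, applied to the RAW submitted value.
function isValidName(string $raw): bool
{
    // Letters (any script), spaces, single quotes, double quotes and hyphens only.
    return preg_match('/^[\p{L} \'"-]{1,50}$/u', $raw) === 1;
}

// Stand-in for the raw form value; the quotes are ordinary data here.
$raw = 'Dwayne "The Rock" Johnson';

if (isValidName($raw)) {
    // Escape only when printing into HTML.
    echo htmlspecialchars($raw, ENT_QUOTES, 'UTF-8'), "\n";
    // Dwayne &quot;The Rock&quot; Johnson
}

// Decoding an entity yields a plain one-character string; nothing needs re-escaping for PHP.
var_dump(htmlspecialchars_decode('&quot;') === '"');  // bool(true)
```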
I have a string of Characters that is passed in a URL.
The string happens to contain a group of characters that is equivalent to an ASCII code.
When I try to use the string on the page using the $_GET command, it converts the part of the string that is equivalent to the ASCII code to the ASCII code instead of passing the actual string.
For example, the URL contains a string Name='%bert%'. But when I echo out $_GET['Name'] I get '¾rt%' instead of '%bert%'. How can I get the actual text?
You're not escaping your data properly.
If you want to use %bert% in a URL, you need to encode your % as %25, making your query string value %25bert%25.
% in a URL means that the next two characters are going to be some encoded entity, so if you want to use it literally, it must be encoded this way.
You can read more information here: http://www.blooberry.com/indexdot/html/topics/urlencoding.htm
Try passing Name='%25bert%25' instead of Name='%bert%'.
Note: %25 is the encoded form of a literal % in a URL query string.
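If the link is generated by PHP, the encoding does not have to be done by hand. A small sketch, assuming the parameter name Name from the question; parse_str is used here only to imitate what PHP does when it fills $_GET.

```php
<?php
// Value we want to arrive unchanged on the receiving page.
$name = '%bert%';

// urlencode() turns each literal % into %25.
$query = 'Name=' . urlencode($name);
echo $query, "\n";           // Name=%25bert%25

// Imitate the receiving side: PHP decodes the query string once, as it does for $_GET.
parse_str($query, $params);
echo $params['Name'], "\n";  // %bert%
```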
I'm running quoted_printable_decode() on HTML content that is stored in DB and has a lot of these types of characters =C5=DD= etc..
However, I also have this string in the HTML which I did not mean to replace:
link
Since it has =b in it, it replaces it as well.
Is there any way to avoid this?
Encode the = as =3D, which is the equivalent in Quoted Printable.
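A short sketch of why this works, using PHP's built-in quoted_printable_encode(). The sample string a=b is made up, since the original link markup is not shown. Content that was properly encoded before storage round-trips cleanly; content that was stored with a bare = cannot be told apart from an escape sequence afterwards.

```php
<?php
// Hypothetical sample containing a literal "=" followed by a hex-looking character.
$plain = 'a=b';

// Encoding escapes the "=" itself as =3D.
$encoded = quoted_printable_encode($plain);
echo $encoded, "\n";                            // a=3Db

// Decoding the properly encoded form restores the original text.
echo quoted_printable_decode($encoded), "\n";   // a=b
```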