XPATH compare strings with special chars - php

I'm trying to compare the contents of elements using xpath.
Sample code:
div[@class = "name" and . = "'.$data.'"]
Unfortunately, sometimes $data contains quotes or other special characters.
example:
My school is a "super"
In this case, I can not compare content. What can I do with this?

The other answers (up to now) all suffer from escaping too many characters (e.g., addslashes($string) also escapes double quotes).
Anyway, PHP only supports XPath 1.0, which has poor escaping capabilities. From XPath 2.0 on, one can escape the quote character used to delimit the string by doubling it (e.g., 'foo''bar' yields foo'bar). In XPath 1.0, there is no way to do so.
One way to get out of this would be to ignore single quotes in both the input and search value by using
$string = str_replace("'", "", $data);
$xpath = "div[@class = 'name' and translate(., \"'\", '') = '$string']";
The str_replace line removes all single quotes from the search token, and the translate call within XPath removes all single quotes from the string it is compared with.
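A minimal sketch of that workaround run through DOMXPath, assuming the dom extension is available. The helper name findNameDivs and the sample markup are made up for illustration; they are not from the question.

```php
<?php
// Hypothetical helper: find <div class="name"> elements whose text equals
// $data once apostrophes are ignored on both sides of the comparison.
function findNameDivs(DOMDocument $doc, string $data): DOMNodeList
{
    // Drop apostrophes from the search value so it can sit safely inside a
    // single-quoted XPath 1.0 literal (double quotes are harmless there).
    $needle = str_replace("'", '', $data);
    $expr = "//div[@class = 'name' and translate(., \"'\", '') = '$needle']";

    return (new DOMXPath($doc))->query($expr);
}

// Made-up sample document containing both kinds of quotes.
$doc = new DOMDocument();
$doc->loadXML('<root><div class="name">My school\'s a "super"</div><div class="name">Other</div></root>');

$hits = findNameDivs($doc, 'My school\'s a "super"');
echo $hits->length, "\n"; // number of matching div elements
```

The trade-off is that values differing only by apostrophes (school's vs. schools) are treated as equal.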

You can use htmlentities or html_entity_decode, whichever suits you better. I didn't understand whether the special characters are in your query or in the data.
Check php documentation.

PHP has a function called "addSlashes" to fix this problem. Its use:
addSlashes(your_php_variable);

A function called addslashes() escapes special characters such as quotes.
Try this:
div[@class = "name" and . = "'.<?php echo addslashes($data) ?>.'"]

Related

How to store escaped characters in MySQL and display them in php?

For example I want to store the String "That's all". MySQL automatically escapes the ' character. How do I echo that String from the database using php but remove the \ in front of escaped characters like \' ? I would also like to preserve other formatting like new lines and blank spaces.
Have you tried stripslashes()? For the line breaks, just use the nl2br() function.
Example:
$yourString = "That\'s all\n folks";
$yourString = stripslashes(nl2br($yourString));
echo $yourString;
Note: \\ double slashes will turn to \ single slashes
You should probably set up your own function, something like:
$yourString = "That\'s all\n folks";
function escapeString($string) {
return stripslashes(nl2br($string));
}
echo escapeString($yourString);
There are also several good examples in the nl2br() docs
Edit 2
The reason you are seeing these is that MySQL is escaping line breaks, etc. I am guessing you are using mysql_* functions. You should probably look into mysqli or PDO.
Here is an example:
$yourString = "That's all
folks";
echo mysql_escape_string($yourString);
Outputs:
That\'s all\r\n folks
If you use prepared statements, those characters will not be escaped on insert.
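As a rough illustration of the prepared-statement route, here is a sketch using PDO. It assumes the pdo_sqlite extension and an in-memory SQLite database purely so the example is self-contained; with MySQL only the DSN would differ. The table name notes and the helpers saveNote/loadNote are hypothetical.

```php
<?php
// Assumption: pdo_sqlite is installed; SQLite stands in for MySQL here.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)');

// Store the text through a bound parameter: no manual escaping needed.
function saveNote(PDO $pdo, string $body): int
{
    $stmt = $pdo->prepare('INSERT INTO notes (body) VALUES (:body)');
    $stmt->execute([':body' => $body]);

    return (int) $pdo->lastInsertId();
}

// Read the text back exactly as it was stored.
function loadNote(PDO $pdo, int $id): string
{
    $stmt = $pdo->prepare('SELECT body FROM notes WHERE id = :id');
    $stmt->execute([':id' => $id]);

    return (string) $stmt->fetchColumn();
}

$id = saveNote($pdo, "That's all\nfolks");

// Escape for HTML only at output time, then turn newlines into <br />.
echo nl2br(htmlspecialchars(loadNote($pdo, $id), ENT_QUOTES)), "\n";
```

Because the value never passes through an escaping function on the way in, there are no backslashes to strip on the way out.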
Use stripslashes() to remove slashes if you cannot avoid adding slashes on input.
First, magic_quotes_gpc escapes characters such as ' or " in incoming request data. You can disable it in your php.ini, but then you must escape values yourself so that no query can get "infected".
Look up SQL injection for more information.
Once the escaped string has been written to your database, the stored value does not contain the escape characters, so when you output it again you should see the result you want.
I personally prefer to store everything without escapes and to escape or format values only when I output them. You could also use str_replace("\n", "", $text) to prevent newlines from being displayed.
Greetings MRu

Apostrophe issue

I have built a search engine using php and mysql.
Problem:
When I submit a word with an apostrophe in it and return the value to the text field using $_GET the apostrophe has been replaced with a backslash and all characters after the apostrophe are missing.
Example:
Submitted Words: Just can't get enough
Returned Value (Using $_GET): Just can\
Also, the URL comes up like this: search=just+can%27t+get+enough
As you can see the ' has been replaced with a \ and get enough is missing.
Question:
Does anybody know what causes this to happen and what is the solution to fix this problem?
The code:
http://tinypaste.com/11d62
If you're running a PHP version older than 5.4.0, the slash might be added by Magic Quotes (deprecated in 5.3.0 and removed in 5.4.0), which you can turn off in the php.ini file.
From your description of "value to the text field" I speculate you have some output code like this:
Redisplay
<input value='<?=$_GET['search']?>'>
In that case the contained single quote will terminate the html attribute. And anything behind the single quote is simply garbage to the browser. In this case applying htmlspecialchars to the output helps.
(The backslash is likely due to magic_quotes or mysql_*_escape before outputting the text. I doubt the question describes a database error here.)
Update: It seems it's indeed an output problem here:
echo "<a href='searchmusic.php?search=$search&s=$next'>Next</a>";
Regardless of if you use single or double quotes you would need:
echo "<a href='searchmusic.php?search="
. htmlspecialchars(stripslashes($search))
. "&s=$next'>Next</a>";
(Notice that using stripslashes is a workaround here. You should preserve the original search text, or disable the magic_quotes rather.)
Okay I forgot something crucial. htmlspecialchars needs the ENT_QUOTES parameter - always, and in your case particularly:
// prepare for later output:
$search = $_GET['search'];
$html_search = htmlspecialchars(stripslashes($search), ENT_QUOTES);
And then use that wherever you wanted to display $search before:
echo "<a href='searchmusic.php?search=$html_search&s=$next'>Next</a>";
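Putting the pieces together, a small sketch of escaping at output time. It assumes magic quotes are off (so no stripslashes is needed), and the helper names searchFieldHtml and nextLinkHtml are hypothetical. The link helper additionally runs the value through urlencode, since it ends up inside a query string.

```php
<?php
// Hypothetical helper: redisplay the search text inside a form field.
function searchFieldHtml(string $search): string
{
    // ENT_QUOTES also converts single quotes, so the attribute is not cut short.
    return "<input name='search' value='" . htmlspecialchars($search, ENT_QUOTES) . "'>";
}

// Hypothetical helper: build the "Next" link for a given result offset.
function nextLinkHtml(string $search, int $next): string
{
    // URL-encode the value for the query string, then HTML-escape the whole URL.
    $url = 'searchmusic.php?search=' . urlencode($search) . '&s=' . $next;

    return "<a href='" . htmlspecialchars($url, ENT_QUOTES) . "'>Next</a>";
}

echo searchFieldHtml("Just can't get enough"), "\n";
// <input name='search' value='Just can&#039;t get enough'>
echo nextLinkHtml("Just can't get enough", 2), "\n";
// <a href='searchmusic.php?search=Just+can%27t+get+enough&amp;s=2'>Next</a>
```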
Single quotes are important in PHP and MySQL.
A single quote is a delimiter for a string in PHP, for example:
$str = 'my string';
If you want to include a literal quote inside a string you must tell PHP that the quote is not the end of the string. It is escaped with the backslash, for example:
$str = 'my string with a quote \' inside it';
See PHP Strings for more on this.
MySQL operates in a similar way. An example query might be:
$username = 'andyb';
$query = "SELECT * FROM users WHERE user_name = '$username'";
The single quote delimits the string parameter. If the $username included a single quote, this would cause the query to end prematurely. Correctly escaping parameters is an important concept to be familiar with as it is one attack vector for breaking into a database - see SQL Injection for more information.
One way to handle this escaping is with mysql_real_escape_string().

Strip out all single quotes

I am looking for the best way to strip single quotes as it keeps breaking my important.
so
The image’s emotiveness enables
only comes through as
The image
It breaks at the single quote ('). I need a good way to strip these characters out. Can someone help?
I have looked at stripslashes();
What's the best function to strip out , - £
any help please.
MANAGED TO FIX IT>
Thank you for your help, people. I managed to fix it using the following function.
string utf8_encode ( string $data )
I can't figure out why it was coming out in that format from the database; all I can think of is that it's a six-year-old website.
;)
I'm not 100% certain because PHP isn't my forte, but I think you need to look at something like urlencode(). This will encode all the special characters properly.
Note: This will remove all single quotes!
str_replace("'", "", $your_string);
example:
$your_string = "The image's emotiveness enables.";
echo str_replace("'", "", $your_string);
output
The images emotiveness enables.
If you want to keep single quotes in string you should consider using real escape functions (recommended).
It sounds like what you really want is to encode the single quotes, not remove them. On the assumption that you are inserting into the MySQL database, look into mysql_real_escape_string.
The best way to get rid of specific characters is using str_replace.
To remove all single quotes from a string:
$noQuotes = str_replace("'", '', $stringWithQuotes);
There are several ways, depending on what you are doing.
You could use addslashes to escape all single / double quotes. You can unescape it with stripslashes later.
If you are planning on saving those data into MySQL database, you should use mysql_real_escape_string.
If you want to output data on HTML page, use htmlspecialchars to convert all special characters into HTML entities.
The next way is to use str_replace to remove all quotes, as few other people in this thread already mentioned.
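One caveat worth sketching: the sample text in the question uses a typographic apostrophe (’, U+2019), which str_replace("'", ...) does not touch. A hedged variant that strips both forms, assuming PHP 7 or later and UTF-8 input; the function name stripApostrophes is made up:

```php
<?php
// Hypothetical helper: remove straight and typographic single quotes.
// Assumes $text is UTF-8; "\u{...}" escapes need PHP 7.0 or later.
function stripApostrophes(string $text): string
{
    $quotes = [
        "'",         // U+0027 straight apostrophe
        "\u{2018}",  // left single quotation mark
        "\u{2019}",  // right single quotation mark (typographic apostrophe)
    ];

    return str_replace($quotes, '', $text);
}

echo stripApostrophes("The image\u{2019}s emotiveness enables"), "\n";
// The images emotiveness enables
```

If the database content is in a legacy encoding rather than UTF-8, as the asker's own utf8_encode fix suggests, it needs converting first or the byte sequences above will not match.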

How to pass a string containing both single and double quotes as a parameter to XSLT in PHP?

I have a simple piece of PHP-based XSLT transform code that looks like this:
$xsl = new XSLTProcessor();
$xsl->registerPHPFunctions();
$xsl->setParameter("","searchterms", $searchterms);
$xsl->importStylesheet($xslDoc);
echo $xsl->transformToXML($doc);
The code passes the variable $searchterms, which contains a string, as a parameter to the XSLT style sheet, which in turn uses it as text:
<title>search feed for <xsl:value-of select="$searchterms"/></title>
This works fine until you try to pass a string with mixed quotes in it, say:
$searchterms = '"some"'." text's quotes are mixed.";
At that point the XSLT processor screams:
Cannot create XPath expression (string
contains both quote and double-quotes)
What is the correct way to safely pass arbitrary strings as input to XSLT? Note that these strings will be used as a text value in the resulting XML and not as an XPath parameter.
Thanks,
Boaz
This has been logged as a bug:
https://bugs.php.net/bug.php?id=64137
This comment:
This shortcoming comes from the fact that XPath 1.0 does not provide a mechanism to escape characters, so PHP does not have a straightforward way to express a string that contains both types of quotes. XPath 1.0 does, however, provide a function to concatenate strings. Using concat(), a string composed of the two characters "' can be expressed as concat('"',"'"). concat() takes 2 or more arguments.
Includes the following work-around:
So as long as you alternate the quoting style, you can express a string containing any number of quotes of both types.
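The concat() trick can be sketched as a small PHP helper for cases where you build the XPath expression yourself (DOMXPath::query, SimpleXMLElement::xpath). It does not help with XSLTProcessor::setParameter, which takes a plain string value rather than an expression. The name xpathLiteral is hypothetical.

```php
<?php
// Hypothetical helper: turn any PHP string into an XPath 1.0 string expression.
function xpathLiteral(string $value): string
{
    // No single quote inside: wrap in single quotes.
    if (strpos($value, "'") === false) {
        return "'" . $value . "'";
    }
    // No double quote inside: wrap in double quotes.
    if (strpos($value, '"') === false) {
        return '"' . $value . '"';
    }
    // Both kinds present: split on ' and glue the pieces back with "'".
    $parts = array_map(
        function ($part) {
            return "'" . $part . "'";
        },
        explode("'", $value)
    );

    return 'concat(' . implode(", \"'\", ", $parts) . ')';
}

echo xpathLiteral('"some" text\'s quotes are mixed.'), "\n";
// concat('"some" text', "'", 's quotes are mixed.')
```

A predicate can then be assembled as "//div[@class = 'name' and . = " . xpathLiteral($data) . "]" without removing any characters from $data.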
Another solution is to replace each double quote with two single quotes:
$t = str_replace( "\"", "''", $t );
$xsltEngine->setParameter( "some-text", $t );
If your final output is HTML, you could try HTML-encoding it. As long as the entities are handled in the stylesheet, it should be OK.
You could use &apos; to escape the single quotes:
$searchterms = '"some" text&apos;s quotes are mixed.';
This works for me to replace the single quotation marks with html special characters using str_replace
$t = str_replace("'", "&#39;", $t);
$xsltEngine->setParameter( "some-text", $t );
As indicated in one of the answers, this is a bug. Therefore, you can not pass both quotes.
But you can replace the double quote with another character, and then use translate to restore the original.
The main thing to choose is a character that will not appear on your text. For example \x7F.
$xsl = new XSLTProcessor();
$xsl->registerPHPFunctions();
$xsl->setParameter("","searchterms", strtr($searchterms, '"', "\x7F"));
$xsl->importStylesheet($xslDoc);
echo $xsl->transformToXML($doc);
and
<title>search feed for <xsl:value-of select="translate($searchterms, '&#x7F;', '&quot;')"/></title>
Also, you cannot use HTML entities, since they are being escaped.
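For reference, a self-contained sketch of the placeholder approach. It assumes the xsl and dom extensions are loaded and that \x7F never occurs in the real input. The function name renderSearchTitle and the text-only stylesheet are made up to keep the example short.

```php
<?php
// Hypothetical wrapper around the \x7F placeholder trick.
// Assumes the xsl and dom extensions are loaded.
function renderSearchTitle(string $searchterms): string
{
    // Nowdoc, so PHP does not try to interpolate $searchterms below.
    $stylesheet = <<<'XSL'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" encoding="UTF-8"/>
  <xsl:param name="searchterms"/>
  <xsl:template match="/">
    <xsl:text>search feed for </xsl:text>
    <xsl:value-of select="translate($searchterms, '&#x7F;', '&quot;')"/>
  </xsl:template>
</xsl:stylesheet>
XSL;

    $xslDoc = new DOMDocument();
    $xslDoc->loadXML($stylesheet);

    $doc = new DOMDocument();
    $doc->loadXML('<feed/>'); // dummy input document

    $xsl = new XSLTProcessor();
    $xsl->importStylesheet($xslDoc);
    // Swap every double quote for the placeholder before handing it to XSLT;
    // the stylesheet's translate() call swaps it back.
    $xsl->setParameter('', 'searchterms', strtr($searchterms, '"', "\x7F"));

    return (string) $xsl->transformToXml($doc);
}

echo renderSearchTitle('"some" text\'s quotes are mixed.'), "\n";
```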
Or use disable-output-escaping="yes":
<title>search feed for <xsl:value-of select="$searchterms" disable-output-escaping="yes"/></title>
with
$xsl->setParameter("","searchterms", htmlspecialchars($searchterms));
The first method also works inside attribute value templates. For example:
<title attr="foo {translate($searchterms, '&#x7F;', '&quot;')} bar">bazz</title>

How can I use XPath to perform a case-insensitive search and support non-english characters?

I am performing a search in an XML file, using the following code:
$result = $xml->xpath("//StopPoint[contains(StopName, '$query')]");
Where $query is the search query, and StopName is the name of a bus stop. The problem is, it's case sensitive.
And not only that, I would also like to be able to search with non-English characters like ÆØÅæøå to return Norwegian names.
How is this possible?
In XPath 1.0 (which is, I believe, the best you can get with PHP SimpleXML), you'd have to use the translate() function to produce all-lowercase output from mixed-case input.
For convenience, I would wrap it in a function like this:
function findStopPointByName($xml, $query) {
$upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZÆØÅ"; // add any characters...
$lower = "abcdefghijklmnopqrstuvwxyzæøå"; // ...that are missing
$arg_stopname = "translate(StopName, '$upper', '$lower')";
$arg_query = "translate('$query', '$upper', '$lower')";
return $xml->xpath("//StopPoint[contains($arg_stopname, $arg_query)]");
}
As a sanitizing measure I would either completely forbid or escape single quotes in $query, because they will break your XPath string if they are ignored.
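A usage sketch of the same idea with the quote check folded in. It assumes the source file and the XML are both UTF-8. The sample XML and the function name findStopPoints are invented for the example, and rejecting quotes outright is just one of the two options mentioned above.

```php
<?php
// Hypothetical variant of the function above that refuses single quotes.
// Assumes this file and the XML data are UTF-8 encoded.
function findStopPoints(SimpleXMLElement $xml, string $query): array
{
    if (strpos($query, "'") !== false) {
        throw new InvalidArgumentException('Single quotes are not supported in the query.');
    }

    $upper = 'ABCDEFGHIJKLMNOPQRSTUVWXYZÆØÅ';
    $lower = 'abcdefghijklmnopqrstuvwxyzæøå';

    // Lower-case both the element text and the query inside XPath 1.0.
    $expr = "//StopPoint[contains("
        . "translate(StopName, '$upper', '$lower'), "
        . "translate('$query', '$upper', '$lower'))]";

    $result = $xml->xpath($expr);

    return $result ?: []; // normalise false/null to an empty array
}

// Invented sample data.
$xml = simplexml_load_string(
    '<Stops>'
    . '<StopPoint><StopName>Ålesund sentrum</StopName></StopPoint>'
    . '<StopPoint><StopName>Oslo S</StopName></StopPoint>'
    . '</Stops>'
);

foreach (findStopPoints($xml, 'ÅLESUND') as $stop) {
    echo $stop->StopName, "\n";
}
```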
In XPath 2.0 you can use lower-case() function, which is unicode aware, so it'll handle non-ASCII characters fine.
contains(lower-case(StopName), lower-case('$query'))
To access XPath 2.0 you need an XSLT 2.0 processor, for example Saxon. You can access it from PHP via JavaBridge.
Non-English names should not be a problem. Just add them to your XPath. (XML is defined as using Unicode).
As for case-insensitivity, ...
XPath 1.0 includes the following statement:
Two strings are equal if and only if they consist of the same sequence of UCS characters.
So even using explicit predicates on the local-name will not help.
XPath 2 includes functions to map case. E.g. fn:upper-case
Additional: using XPath's translate function should allow case mapping to be faked in XPath 1, but the input will need to include every cased code point you and your users will ever need:
"TEST" = translate($inputString, "abcdefghijklmnopqrstuvwxyz", "ABCDEFGHIJKLMNOPQRSTUVWXYZ")
In addition:
$xml->xpath("//StopPoint[contains(StopName, '$query')]");
You will need to strip out any apostrophe characters from $query to avoid breaking your expression.
In XPath 2.0 you can double-up the quote being used in the delimiter to put that quote into a string literal, but in XPath 1.0 it's impossible to include the delimiter in the string.
