PHP variable into an XML request string - php

I have the code below, which extracts the artist name from an XML file using the artist's reference code.
<?php
$dom = new DOMDocument();
$dom->load('http://www.bookingassist.ro/test.xml');
$xpath = new DOMXPath($dom);
echo $xpath->evaluate('string(//Artist[ArtistCode = "COD Artist"] /ArtistName)');
?>
This is the code that outputs the artist code based on a search:
<?php echo $Artist->artistCode ?>
My question:
Can I insert the variable generated by the PHP code into the XML request string?
If so, could you please advise where I should start reading?
Thanks

You mean the XPath expression. Yes you can - it is "just a string".
$expression = 'string(//Artist[ArtistCode = "'.$Artist->artistCode.'"]/ArtistName)';
echo $xpath->evaluate($expression);
But you have to make sure that the result is valid XPath and that your value does not break the string literal. I wrote a function for a library some time ago that prepares a string this way.
The problem in XPath 1.0 is that there is no way to escape any special character. If your string contains the quote character you're using as the delimiter in XPath, it breaks the expression. The function uses the quote type not present in the string or, if both are present, splits the string and puts the parts into a concat() call.
function quoteXPathLiteral($string) {
$string = str_replace("\x00", '', $string);
$hasSingleQuote = FALSE !== strpos($string, "'");
if ($hasSingleQuote) {
$hasDoubleQuote = FALSE !== strpos($string, '"');
if ($hasDoubleQuote) {
$result = '';
preg_match_all('("[^\']*|[^"]+)', $string, $matches);
foreach ($matches[0] as $part) {
$quoteChar = (substr($part, 0, 1) == '"') ? "'" : '"';
$result .= ", ".$quoteChar.$part.$quoteChar;
}
return 'concat('.substr($result, 2).')';
} else {
return '"'.$string.'"';
}
} else {
return "'".$string."'";
}
}
The function generates the needed XPath.
$expression = 'string(//Artist[ArtistCode = '.quoteXPathLiteral($Artist->artistCode).']/ArtistName)';
echo $xpath->evaluate($expression);
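To see the quoting idea work end to end, here is a minimal sketch. The helper name xpathLiteral, the inline XML and the artist code containing both quote types are assumptions made up for illustration; it is a compact variant of the approach described above, not the library function itself.

```php
<?php
// Hypothetical compact helper: turns an arbitrary string into an XPath 1.0 literal.
function xpathLiteral(string $value): string
{
    if (strpos($value, '"') === false) {
        return '"' . $value . '"';   // no double quote inside: wrap in double quotes
    }
    if (strpos($value, "'") === false) {
        return "'" . $value . "'";   // no single quote inside: wrap in single quotes
    }
    // Both quote types present: double-quote every piece between the double
    // quotes and join the pieces with a single-quoted '"' inside concat().
    $parts = array_map(
        function ($piece) { return '"' . $piece . '"'; },
        explode('"', $value)
    );
    return 'concat(' . implode(", '\"', ", $parts) . ')';
}

// Made-up sample data; the second code contains both ' and ".
$xml = <<<'XML'
<Artists>
  <Artist><ArtistCode>A1</ArtistCode><ArtistName>First Artist</ArtistName></Artist>
  <Artist><ArtistCode>O'Neil "B2"</ArtistCode><ArtistName>Second Artist</ArtistName></Artist>
</Artists>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

$code = 'O\'Neil "B2"';
$expression = 'string(//Artist[ArtistCode = ' . xpathLiteral($code) . ']/ArtistName)';
echo $xpath->evaluate($expression), "\n";
```

The variable is still just concatenated into the expression; the only difference is that the literal is built so that no quote inside the value can terminate it early.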

Related

How to escape all invalid characters from DOM XPath Query?

I have the following function that finds values within an HTML DOM.
It works, but when I pass a $value such as: Levi's Baby Overall,
it breaks, because it does not escape the , and ' characters.
How can I escape all invalid characters in a DOM XPath query?
private function extract($file,$url,$value) {
$result = array();
$i = 0;
$dom = new DOMDocument();
@$dom->loadHTMLFile($file);
//use DOMXpath to navigate the html with the DOM
$dom_xpath = new DOMXpath($dom);
$elements = $dom_xpath->query("//*[text()[contains(., '" . $value . "')]]");
if (!is_null($elements)) {
foreach ($elements as $element) {
$nodes = $element->childNodes;
foreach ($nodes as $node) {
if (($node->nodeValue != null) && ($node->nodeValue === $value)) {
$xpath = preg_replace("/\/text\(\)/", "", $node->getNodePath());
$result[$i]['url'] = $url;
$result[$i]['value'] = $node->nodeValue;
$result[$i]['xpath'] = $xpath;
$i++;
}
}
}
}
return $result;
}
One shouldn't substitute placeholders in an XPath expression with arbitrary, user-provided strings -- because of the risk of (malicious) XPath injection.
To deal safely with such unknown strings, the solution is to use a pre-compiled XPath expression and to pass the user-provided string as a variable to it. This also completely eliminates the need to deal with nested quotes in the code.
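As far as I know, PHP's DOMXPath does not expose variable binding for expressions, so one way to get a similar effect is to keep the expression constant and do the comparison in PHP. The sketch below assumes that approach; the function name findElementsContainingText and the sample HTML are made up for illustration.

```php
<?php
// Hypothetical helper: returns the parent elements of all text nodes that
// contain $needle. The user-provided string never becomes part of the XPath.
function findElementsContainingText(DOMDocument $dom, string $needle): array
{
    $matches = [];
    if ($needle === '') {
        return $matches;
    }
    $xpath = new DOMXPath($dom);
    // Constant expression: nothing to escape, nothing to inject.
    foreach ($xpath->query('//text()') as $textNode) {
        if (strpos($textNode->nodeValue, $needle) !== false) {
            $matches[] = $textNode->parentNode;
        }
    }
    return $matches;
}

// Made-up sample document.
$dom = new DOMDocument();
$dom->loadHTML('<html><body><p>Levi\'s Baby Overall</p><p>Other "item"</p></body></html>');

$found = findElementsContainingText($dom, "Levi's Baby Overall");
foreach ($found as $element) {
    echo $element->getNodePath(), "\n";
}
```

The trade-off is that every text node is visited in PHP instead of being filtered by the XPath engine, which may matter on very large documents.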
PHP has no built-in function for escaping/quoting strings for XPath queries. Furthermore, escaping strings for XPath is surprisingly difficult to do; here's more information on why: https://stackoverflow.com/a/1352556/1067003. Here is a PHP port of the C# XPath quote function from that answer:
function xpath_quote(string $value):string{
if(false===strpos($value,'"')){
return '"'.$value.'"';
}
if(false===strpos($value,'\'')){
return '\''.$value.'\'';
}
// if the value contains both single and double quotes, construct an
// expression that concatenates all non-double-quote substrings with
// the quotes, e.g.:
//
// concat("'foo'", '"', "bar")
$sb='concat(';
$substrings=explode('"',$value);
for($i=0;$i<count($substrings);++$i){
$needComma=($i>0);
if($substrings[$i]!==''){
if($i>0){
$sb.=', ';
}
$sb.='"'.$substrings[$i].'"';
$needComma=true;
}
if($i < (count($substrings) -1)){
if($needComma){
$sb.=', ';
}
$sb.="'\"'";
}
}
$sb.=')';
return $sb;
}
Example usage:
$elements = $dom_xpath->query("//*[contains(text()," . xpath_quote($value) . ")]");
Notice that I did not add the quoting characters (") in the XPath itself, because the xpath_quote function adds them for me (or returns the concat() equivalent if needed).

php regular expression to match string if NOT in an HTML tag

I'm trying to solve this bug in Drupal's Hashtags module: http://drupal.org/node/1718154
I've got this function that matches every word in my text that is prefixed by "#", like #tag:
function hashtags_get_tags($text) {
$tags_list = array();
$pattern = "/#[0-9A-Za-z_]+/";
preg_match_all($pattern, $text, $tags_list);
$result = implode(',', $tags_list[0]);
return $result;
}
I need to ignore internal links in pages, such as link, or, more generally, any word prefixed by # that appears inside an HTML tag (so preceded by < and followed by >).
Any idea how I can achieve this?
Can you strip the tags first, before matching (using the strip_tags function)?
function hashtags_get_tags($text) {
$text = strip_tags($text);
$tags_list = array();
$pattern = "/#[0-9A-Za-z_]+/";
preg_match_all($pattern, $text, $tags_list);
$result = implode(',', $tags_list[0]);
return $result;
}
A regular expression is going to be tricky if you want to only match hashtags that are not inside an HTML tag.
You could throw out the tags beforehand using preg_replace:
function hashtags_get_tags($text) {
$tags_list = array();
$pattern = "/#[0-9A-Za-z_]+/";
$text=preg_replace("/<[^>]*>/","",$text);
preg_match_all($pattern, $text, $tags_list);
$result = implode(',', $tags_list[0]);
return $result;
}
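If the surrounding HTML has to stay untouched, a negative lookahead is one possible single-pattern alternative. This is only a sketch: the function name hashtags_outside_tags and the sample input are made up, and it assumes the text contains no stray > outside of tags and no numeric entities such as &#39; (both would confuse it).

```php
<?php
// Hypothetical variant of hashtags_get_tags() that skips anything inside a tag.
function hashtags_outside_tags(string $text): string
{
    // (?![^<>]*>) rejects a candidate when the next angle bracket after it
    // is a closing '>', which means the candidate sits inside a tag.
    $pattern = '/#[0-9A-Za-z_]+(?![^<>]*>)/';
    preg_match_all($pattern, $text, $matches);
    return implode(',', $matches[0]);
}

// The href fragment is ignored; the two hashtags in the visible text are kept.
echo hashtags_outside_tags('<a href="#top">see #tag</a> and #other'), "\n";
```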
I made this function using PHP DOM.
It returns all links that do not have # in the href.
If you want it to drop only internal links (href starting with #), replace this line:
if(strpos($link->getAttribute('href'), '#') === false) {
with this:
if(strpos($link->getAttribute('href'), '#') !== 0) {
This is the function:
function no_hashtags($text) {
$doc = new DOMDocument();
$doc->loadHTML($text);
$links = $doc->getElementsByTagName('a');
$nohashes = array();
foreach($links as $link) {
if(strpos($link->getAttribute('href'), '#') === false) {
$temp = new DOMDocument();
$elem = $temp->importNode($link->cloneNode(true), true);
$temp->appendChild($elem);
$nohashes[] = $temp->saveHTML();
}
}
// return $nohashes;
return implode('', $nohashes);
// return implode(',', $nohashes);
}

how to display data id, name?

My file xml:
<pasaz:Envelope>
<pasaz:Body>
<loadOffe>
<offe>
<off>
<id>120023</id>
<name>my name John</name>
<name>Test</name>
</off>
</offe>
</loadOffe>
</pasaz:Body>
</pasaz:Envelope>
How can I display the id and name in PHP?
If you're just looking for a simple way to extract the contents of a tag, but don't want to go to all the trouble of parsing the XML properly, you could do something like this:
$xml = ""; // your xml data as a string
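function get_tag_contents($xml, $tagName) {
$openTag = "<" . $tagName . ">";
// start right after the opening tag, not at it
$startPosition = strpos($xml, $openTag) + strlen($openTag);
$endPosition = strpos($xml, "</" . $tagName . ">");
// length of the text between the opening and the closing tag
$length = $endPosition - $startPosition;
return substr($xml, $startPosition, $length);
}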
$id = get_tag_contents($xml, "id");
$name = get_tag_contents($xml, "name");
This assumes you haven't assigned any attributes to your tags, and that each tag is unique (in the example you gave us I noted two "name" tags, and if you want both you'll need to make this solution a bit more robust or do proper XML parsing).
How do I get all items?
Example (does not work):
$pliks = simplexml_load_file("file.xml");
foreach ($pliks->children('pasaz', true) as $body)
{
foreach ($body->children() as $loadOffe)
{
if ($loadOffe->offe->off) {
echo "<p>id: $loadOffe->id</p>";
echo "$id->id";
echo "<p>name: <b>$name->name</b></p>";
}
}
// echo $loadOffe->offe->off->id;
}
As Marc B suggested in his comment, you should use DOM: either getElementsByTagName() or DOMXPath. An example with getElementsByTagName():
$dom = new DOMDocument;
$dom->loadXML($xml);
$ids = $dom->getElementsByTagName('id');
if (!$ids || !$ids->length) {
throw new Exception( 'Id not found');
}
return $ids->item(0);
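To get every id and name rather than only the first one, the same DOM calls can be used in a loop. This is a sketch under one assumption that is not in the question: the pasaz prefix is declared with a placeholder namespace URI (http://example.com/pasaz) so that the sample parses without namespace warnings.

```php
<?php
// Sample data from the question, with an assumed namespace declaration added.
$xml = <<<'XML'
<pasaz:Envelope xmlns:pasaz="http://example.com/pasaz">
  <pasaz:Body>
    <loadOffe><offe><off>
      <id>120023</id>
      <name>my name John</name>
      <name>Test</name>
    </off></offe></loadOffe>
  </pasaz:Body>
</pasaz:Envelope>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

$offers = [];
foreach ($dom->getElementsByTagName('off') as $off) {
    // Collect every <name> below this <off>, not just the first one.
    $names = [];
    foreach ($off->getElementsByTagName('name') as $name) {
        $names[] = $name->textContent;
    }
    $offers[] = [
        'id'    => $off->getElementsByTagName('id')->item(0)->textContent,
        'names' => $names,
    ];
}

foreach ($offers as $offer) {
    echo 'id: ', $offer['id'], ', names: ', implode(' | ', $offer['names']), "\n";
}
```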

How to remove php code from a string?

I have a string that has php code in it, I need to remove the php code from the string, for example:
<?php $db1 = new ps_DB() ?><p>Dummy</p>
Should return <p>Dummy</p>
And a string with no php for example <p>Dummy</p> should return the same string.
I know this can be done with a regular expression, but after 4h I haven't found a solution.
<?php
function filter_html_tokens($a){
return is_array($a) && $a[0] == T_INLINE_HTML ?
$a[1]:
'';
}
$htmlphpstring = '<a>foo</a> something <?php $db1 = new ps_DB() ?><p>Dummy</p>';
echo implode('',array_map('filter_html_tokens',token_get_all($htmlphpstring)));
?>
As ircmaxell pointed out: this would require valid PHP!
A regex route would be the following. It allows for short tags without 'php' and for a missing closing ?> at the end of the string or file (which Zend recommends, for some reason), and of course it uses an ungreedy and DOTALL pattern:
preg_replace('/<\\?.*(\\?>|$)/Us', '',$htmlphpstring);
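A small sketch of how that pattern behaves on a few inputs; the wrapper name stripPhpWithRegex and the sample strings are made up for illustration. Note that the pattern treats anything starting with <? as PHP, so an XML declaration such as <?xml ... ?> would be removed as well.

```php
<?php
// Hypothetical wrapper around the pattern above.
function stripPhpWithRegex(string $source): string
{
    // U makes .* lazy, s lets . match newlines, and (\?>|$) also accepts
    // a final block that is never closed.
    return preg_replace('/<\?.*(\?>|$)/Us', '', $source);
}

echo stripPhpWithRegex('<?php $db1 = new ps_DB() ?><p>Dummy</p>'), "\n";
echo stripPhpWithRegex('<p>Dummy</p>'), "\n";
echo stripPhpWithRegex('<p>x</p><?php echo 1;'), "\n";
```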
Well, you can use DomDocument to do it...
function stripPHPFromHTML($html) {
$dom = new DomDocument();
$dom->loadHtml($html);
removeProcessingInstructions($dom);
$simple = simplexml_import_dom($dom->getElementsByTagName('body')->item(0));
return $simple->children()->asXml();
}
function removeProcessingInstructions(DomNode &$node) {
foreach ($node->childNodes as $child) {
if ($child instanceof DOMProcessingInstruction) {
$node->removeChild($child);
} else {
removeProcessingInstructions($child);
}
}
}
Those two functions will turn $str into $html:
$str = '<?php echo "foo"; ?><b>Bar</b>';
$clean = stripPHPFromHTML($str);
$html = '<b>Bar</b>';
Edit: Actually, after looking at Wrikken's answer, I realized that both methods have a disadvantage... Mine requires somewhat valid HTML markup (Dom is decent, but it won't parse <b>foo</b><?php echo $bar). Wrikken's requires valid PHP (any syntax errors and it'll fail). So perhaps a combination of the two (try one first. If it fails, try the other. If both fail, there's really not much you can do without trying to figure out the exact reason they failed)...
A simple solution is to explode into arrays using the php tags to remove any content between and implode back to a string.
function strip_php($str) {
$newstr = '';
//split on opening tag
$parts = explode('<?',$str);
if(!empty($parts)) {
foreach($parts as $part) {
//split on closing tag
$partlings = explode('?>',$part);
if(count($partlings) > 1) {
//remove content before closing tag (a part without a closing tag is plain text, e.g. the text before the first opening tag)
$partlings[0] = '';
}
//append to string
$newstr .= implode('',$partlings);
}
}
return $newstr;
}
This is slower than regex but doesn't require valid html or php; it only requires all php tags to be closed.
For files which don't always include a final closing tag and for general error checking you could count the tags and append a closing tag if it's missing or notify if the opening and closing tags don't add up as expected, e.g. add the code below at the start of the function. This would slow it down a bit more though :)
$tag_diff = substr_count($str,'<?') - substr_count($str,'?>');
//Append if there's one less closing tag
if($tag_diff == 1) $str .= '?>';
//Parse error if the tags don't add up
if($tag_diff < 0 || $tag_diff > 1) die('Error: Tag mismatch.
(Opening minus closing tags = '.$tag_diff.')<br><br>
Dumping content:<br><hr><br>'.htmlentities($str));
This is an enhanced version of strip_php, suggested by @jon, that is able to replace the PHP part of the code with another string:
/**
* Remove PHP code part from a string.
*
* @param string $str String to clean
* @param string $replacewith String to use as replacement
* @return string Result string without php code
*/
function dolStripPhpCode($str, $replacewith='')
{
$newstr = '';
//split on each opening tag
$parts = explode('<?php',$str);
if (!empty($parts))
{
$i=0;
foreach($parts as $part)
{
if ($i == 0) // The first part is never php code
{
$i++;
$newstr .= $part;
continue;
}
//split on closing tag
$partlings = explode('?>', $part);
if (!empty($partlings))
{
//remove content before closing tag
if (count($partlings) > 1) $partlings[0] = '';
//append to out string
$newstr .= $replacewith.implode('',$partlings);
}
}
}
return $newstr;
}
If you are using PHP, you just need to use a regular expression to replace anything that matches PHP code.
The following statement will remove the PHP tag:
preg_replace('/^<\?php.*\?\>/', '', '<?php $db1 = new ps_DB() ?><p>Dummy</p>');
If it doesn't find any match, it won't replace anything.

highlight the word in the string, if it contains the keyword

How do I write a script that highlights the whole word if it contains the keyword? Example: keyword "fun", string "the bird is funny", result "the bird is *funny*". I do the following:
$str = "my bird is funny";
$keyword = "fun";
$str = preg_replace("/($keyword)/i","<b>$1</b>",$str);
but it highlights only the keyword: my bird is <b>fun</b>ny
Try this:
preg_replace("/\w*?$keyword\w*/i", "<b>$0</b>", $str)
\w*? matches any word characters before the keyword (as few as possible) and \w* any word characters after the keyword.
And I recommend using preg_quote to escape the keyword:
preg_replace("/\w*?".preg_quote($keyword)."\w*/i", "<b>$0</b>", $str)
For Unicode support, use the u flag and \p{L} instead of \w:
preg_replace("/\p{L}*?".preg_quote($keyword)."\p{L}*/ui", "<b>$0</b>", $str)
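Put together as a function, the Unicode variant could look like the sketch below. The function name highlightWords is made up for illustration; passing '/' as the second argument to preg_quote is an addition so that a keyword containing the delimiter cannot break the pattern, and the input is assumed to be valid UTF-8.

```php
<?php
// Hypothetical wrapper: wraps every word that contains $keyword in <b> tags.
function highlightWords(string $text, string $keyword): string
{
    if ($keyword === '') {
        return $text; // an empty keyword would otherwise match every word
    }
    // \p{L}*? takes as few letters as possible before the keyword,
    // \p{L}* takes the rest of the word after it.
    $pattern = '/\p{L}*?' . preg_quote($keyword, '/') . '\p{L}*/ui';
    return preg_replace($pattern, '<b>$0</b>', $text);
}

echo highlightWords('my bird is funny', 'fun'), "\n";
echo highlightWords('Its fun to be unfunny', 'fun'), "\n";
```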
You can do the following:
$str = preg_replace("/\b([a-z]*{$keyword}[a-z]*)\b/i","<b>$1</b>",$str);
Example:
$str = "Its fun to be funny and unfunny";
$keyword = 'fun';
$str = preg_replace("/\b([a-z]*{$keyword}[a-z]*)\b/i","<b>$1</b>",$str);
echo "$str"; // prints 'Its <b>fun</b> to be <b>funny</b> and <b>unfunny</b>'
<?php
$str = "my bird is funny";
$keyword = "fun";
$look = explode(' ',$str);
foreach($look as $find){
if(strpos($find, $keyword) !== false) {
if(!isset($highlight)){
$highlight[] = $find;
} else {
if(!in_array($find,$highlight)){
$highlight[] = $find;
}
}
}
}
if(isset($highlight)){
foreach($highlight as $replace){
$str = str_replace($replace,'<b>'.$replace.'</b>',$str);
}
}
echo $str;
?>
Here I have added a multi-keyword search in a string, for your reference:
$keyword = ".in#.com#dot.com#1#2#3#4#5#6#7#8#9#one#two#three#four#five#Six#seven#eight#nine#ten#dot.in#dot in#";
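// split first, drop empty entries, then quote each keyword on its own (preg_quote also escapes # since PHP 7.3)
$keyword = implode('|', array_map('preg_quote', array_filter(explode('#', $keyword), 'strlen')));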
$str = "PHP is dot .com the amazon.in 123455454546 dot in scripting language of choice.";
$str = preg_replace("/($keyword)/i","<b>$0</b>",$str);
echo $str;
Basically, since this is HTML, what you have to do is iterate over text nodes and split those containing the search string into up to three nodes (before match, after match and the highlighted match). If the "after match" node exists, it must be processed too. Here is a PHP7 example using the PHP DOM extension. The following function accepts a preg_quoted UTF-8 search string (or a regex-compatible expression like apple|orange). It will enclose every match in a given tag with a given class.
function highlightTextInHTML($regex_compatible_text, $html, $replacement_tag = 'span', $replacement_class = 'highlight') {
$d = new DOMDocument('1.0','utf-8');
$d->loadHTML('<head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"/></head>' . $html);
$xpath = new DOMXPath($d);
$process_node = function(&$node) use($regex_compatible_text, $replacement_tag, $replacement_class, &$d, &$process_node) {
$i = preg_match("~(?<before>.*?)(?<search>($regex_compatible_text)+)(?<after>.*)~ui", $node->textContent, $m);
if($i) {
$x = $d->createElement($replacement_tag);
$x->setAttribute('class', $replacement_class);
$x->textContent = $m['search'];
$parent_node = $node->parentNode;
$before = null;
$after = null;
if(!empty($m['after'])) {
$after = $d->createTextNode($m['after']);
$parent_node->replaceChild($after, $node);
$parent_node->insertBefore($x, $after);
} else {
$parent_node->replaceChild($x, $node);
}
if(!empty($m['before'])) {
$before = $d->createTextNode($m['before']);
$parent_node->insertBefore($before, $x);
}
if($after) {
$process_node($after);
}
}
};
$node_list = $xpath->query('//text()');
foreach ($node_list as $node) {
$process_node($node);
}
return preg_replace('~(^.*<body>)|(</body>.*$)~mis', '', $d->saveHTML());
}
Search and highlight the word in your string, text, body and paragraph:
<?php $body_text='This is simple code for highligh the word in a given body or text'; //this is the body of your page
$searh_letter = 'this'; //this is the string you want to search for
$result_body = do_Highlight($body_text,$searh_letter); // this is the result with highlight of your search word
echo $result_body; //for displaying the result
function do_Highlight($body_text,$searh_letter){ //function for highlight the word in body of your page or paragraph or string
$length= strlen($body_text); //this is length of your body
$pos = stripos($body_text, $searh_letter); // case-insensitive: finds the first occurrence of your search text and gives its position so that you can split the text and highlight it
if ($pos === false) { return $body_text; } // nothing to highlight
$lword = strlen($searh_letter); // this is the length of your search string so that you can add it to $pos and start with rest of your string
$split_search = $pos+$lword;
$string0 = substr($body_text, 0, $pos);
$string1 = substr($body_text,$pos,$lword);
$string2 = substr($body_text,$split_search,$length);
$body = $string0."<font style='color:#FF0000; background-color:white;'>".$string1." </font> ".$string2;
return $body;
} ?>
