Replace something inside CDATA with PHP - php

I want to replace chr(10) with
with PHP within
<!CDATA[[Text
test
test]]>
But I'm very poor in REGEX.

$xml = "cc\n<!CDATA[[Text\ntest\ntest]]>\naa\nbb\n";
$callback = function($m) {
return '<!CDATA[[' . preg_replace("~" . chr(10) . "~s", '
', $m[1]) . ']]>';
};
echo preg_replace_callback('~<!CDATA\[\[(.+?)\]\]>~s', $callback, $xml);
p.s. you can probably do it without preg_replace_callback, but it looks nicer than to put all logic into preg_replace...

Why use RegEx?
$final = str_replace( chr(10), '
', $cdata );

Related

Only replace words outside of HTML-tags

I would like to replace words outside of HTML-tags.
So if I got
Hello
and I want to replace "Hello" with "Bye" I would like to get this result:
Bye.
Well, I learned that I have to use a DOM-parser to achieve that.
So I used https://github.com/sunra/php-simple-html-dom-parser and included it.
Now I did
$test = $dom->find('text');
To get the text of the dom.
Now I can loop through the results:
foreach($test as $t) {
if (strpos($t->innertext,$word)!==false) {
$t->innertext = preg_replace(
'/\b' . preg_quote( $word, "/" ) . '\b/i',
"<a href='$url' target='$target' data-uk-tooltip title='$item->title'>\$0</a>",
$t->innertext,1
);
}
}
But unfortunately, if $item->title contains $word, the HTML-structure is smashed.
It looks like there was much confusion. According to the docs, $dom->find($tag) returns an array of all tags, but you are looking for a tag called text ?
Maybe you should try $test = $dom->find('a'); instead ?
Also in your code, it is not clear where the variables $url, $target and $item come from :
foreach($test as $t) {
if (strpos($t->innertext,$word)!==false) {
$t->innertext = preg_replace(
'/\b' . preg_quote( $word, "/" ) . '\b/i',
"<a href='$url' target='$target' data-uk-tooltip title='$item->title'>\$0</a>",
$t->innertext,1
);
}
}
This should work better:
foreach($test as $t) {
if (strpos($t->innertext,$word)!==false) {
$t->innertext = preg_replace(
'/\b' . preg_quote( $word, "/" ) . '\b/i',
"Replacement",
$t->innertext,1
);
}
}

PHP: Can I remove ‪ #‎ from string

I need to remove ‪#‎ from string. I found this method:
$string = preg_replace('/[\x00-\x1F\x80-\xFF]/', '', $string);
It doesn't work for the Thai language. I want to remove like this:
from
‪#‎Apple‬ ‪#‎ผลไม้‬
to
#Apple #ผลไม้
I can not understand why str_replace() did not work for you. This will do the job:
function cleanString($string) {
$search = array('‪', '‎', '‬');
$replace = array('', '', '');
return str_replace($search, $replace, $string);
}
$string = '‪#‎Apple‬ ‪#‎ผลไม้‬';
echo $string . "\n";
echo cleanString($string) . "\n";
Output is:
‪#‎Apple‬ ‪#‎ผลไม้‬
#Apple #ผลไม้
Working example can be found at http://sandbox.onlinephpfunctions.com/code/bbdbdf0758e5ea06faf32281021ae859b6d75a51

How can I avoid adding href to an overlapping keyword in string?

Using the following code:
$text = "أطلقت غوغل النسخة المخصصة للأجهزة الذكية العاملة بنظام أندرويد من الإصدار “25″ لمتصفحها الشهير كروم.ولم تحدث غوغل تطبيق كروم للأجهزة العاملة بأندرويد منذ شهر تشرين الثاني العام الماضي، وهو المتصفح الذي يستخدمه نسبة 2.02% من أصحاب الأجهزة الذكية حسب دراسة سابقة. ";
$tags = "غوغل, غوغل النسخة, كروم";
$tags = explode(",", $tags);
foreach($tags as $k=>$v) {
$text = preg_replace("/\b{$v}\b/u","$0",$text, 1);
}
echo $text;
Will give the following result:
I love PHP">love PHP</a>, but I am facing a problem
Note that my text is in Arabic.
The way is to do all in one pass. The idea is to build a pattern with an alternation of tags. To make this way work, you must before sort the tags because the regex engine will stop at the first alternative that succeeds (otherwise 'love' will always match even if it is followed by 'php' and 'love php' will never be matched).
To limit the replacement to the first occurence of each word you can remove tag from the array once it has been found and you test if it is always present in the array inside the replacement callback function:
$text = 'I love PHP, I love love but I am facing a problem';
$tagsCSV = 'love, love php, facing';
$tags = explode(', ', $tagsCSV);
rsort($tags);
$tags = array_map('preg_quote', $tags);
$pattern = '/\b(?:' . implode('|', $tags) . ')\b/iu';
$text = preg_replace_callback($pattern, function ($m) use (&$tags) {
$mLC = mb_strtolower($m[0], 'UTF-8');
if (false === $key = array_search($mLC, $tags))
return $m[0];
unset($tags[$key]);
return '<a href="index.php?s=news&tag=' . rawurlencode($mLC)
. '">' . $m[0] . '</a>';
}, $text);
Note: when you build an url you must encode special characters, this is the reason why I use preg_replace_callback instead of preg_replace to be able to use rawurlencode.
If you have to deal with an utf8 encoded string, you need to add the u modifier to the pattern and you need to replace strtolower with mb_strtolower)
the preg_split way
$tags = explode(', ', $tagsCSV);
rsort($tags);
$tags = array_map('preg_quote', $tags);
$pattern = '/\b(' . implode('|', $tags) . ')\b/iu';
$items = preg_split($pattern, $text, -1, PREG_SPLIT_DELIM_CAPTURE);
$itemsLength = count($items);
$i = 1;
while ($i<$itemsLength && count($tags)) {
if (false !== $key = array_search(mb_strtolower($items[$i], 'UTF-8'), $tags)) {
$items[$i] = '<a href="index.php?s=news&tag=' . rawurlencode($tags[$key])
. '">' . $items[$i] . '</a>';
unset($tags[$key]);
}
$i+=2;
}
$result = implode('', $items);
Instead of calling preg_replace multiple times, call it a single time with a regexp that matches any of the tags:
$tags = explode(",", tags);
$tags_re = '/\b(' . implode('|', $tags) . ')\b/u';
$text = preg_replace($tags_re, '$0', $text, 1);
This turns the list of tags into the regexp /\b(love|love php|facing)\b/u. x|y in a regexp means to match either x or y.

Using str_replace in preg_match_all result

I have this code:
preg_match_all('#href="/mp3/(.*?).html#', $content, $salida);
and I need to replace "_" to " " (space) in output (array), something like this
$salida = str_replace('_', ' ', $salida);
obviously that code does not work
I think what you're looking for is preg_replace_callback
$salida = preg_replace_callback(
'(href="/mp3/.*?\.html)',
function($m) {return str_replace("_","",$m[0]);},
$content);

New line to paragraph function

I have this interesting function that I'm using to create new lines into paragraphs. I'm using it instead of the nl2br() function, as it outputs better formatted text.
function nl2p($string, $line_breaks = true, $xml = true) {
$string = str_replace(array('<p>', '</p>', '<br>', '<br />'), '', $string);
// It is conceivable that people might still want single line-breaks
// without breaking into a new paragraph.
if ($line_breaks == true)
return '<p>'.preg_replace(array("/([\n]{2,})/i", "/([^>])\n([^<])/i"), array("</p>\n<p>", '<br'.($xml == true ? ' /' : '').'>'), trim($string)).'</p>';
else
return '<p>'.preg_replace(
array("/([\n]{2,})/i", "/([\r\n]{3,})/i","/([^>])\n([^<])/i"),
array("</p>\n<p>", "</p>\n<p>", '<br'.($xml == true ? ' /' : '').'>'),
trim($string)).'</p>';
}
The problem is that whenever I try to create a single line break, it inadvertently removes the first character of the paragraph below it. I'm not familiar enough with regex to understand what is causing the problem.
Here is another approach that doesn't use regular expressions. Note, this function will remove any single line-breaks.
function nl2p($string)
{
$paragraphs = '';
foreach (explode("\n", $string) as $line) {
if (trim($line)) {
$paragraphs .= '<p>' . $line . '</p>';
}
}
return $paragraphs;
}
If you only need to do this once in your app and don't want to create a function, it can easily be done inline:
<?php foreach (explode("\n", $string) as $line): ?>
<?php if (trim($line)): ?>
<p><?=$line?></p>
<?php endif ?>
<?php endforeach ?>
The problem is with your match for single line breaks. It matches the last character before the line break and the first after. Then you replace the match with <br>, so you lose those characters as well. You need to keep them in the replacement.
Try this:
function nl2p($string, $line_breaks = true, $xml = true) {
$string = str_replace(array('<p>', '</p>', '<br>', '<br />'), '', $string);
// It is conceivable that people might still want single line-breaks
// without breaking into a new paragraph.
if ($line_breaks == true)
return '<p>'.preg_replace(array("/([\n]{2,})/i", "/([^>])\n([^<])/i"), array("</p>\n<p>", '$1<br'.($xml == true ? ' /' : '').'>$2'), trim($string)).'</p>';
else
return '<p>'.preg_replace(
array("/([\n]{2,})/i", "/([\r\n]{3,})/i","/([^>])\n([^<])/i"),
array("</p>\n<p>", "</p>\n<p>", '$1<br'.($xml == true ? ' /' : '').'>$2'),
trim($string)).'</p>';
}
I also wrote a very simple version:
function nl2p($text)
{
return '<p>' . str_replace(['\r\n', '\r', '\n'], '</p><p>', $text) . '</p>';
}
#Laurent's answer wasn't working for me - the else statement was doing what the $line_breaks == true statement should have been doing, and it was making multiple line breaks into <br> tags, which PHP's native nl2br() already does.
Here's what I managed to get working with the expected behavior:
function nl2p( $string, $line_breaks = true, $xml = true ) {
// Remove current tags to avoid double-wrapping.
$string = str_replace( array( '<p>', '</p>', '<br>', '<br />' ), '', $string );
// Default: Use <br> for single line breaks, <p> for multiple line breaks.
if ( $line_breaks == true ) {
$string = '<p>' . preg_replace(
array( "/([\n]{2,})/i", "/([\r\n]{3,})/i", "/([^>])\n([^<])/i" ),
array( "</p>\n<p>", "</p>\n<p>", '$1<br' . ( $xml == true ? ' /' : '' ) . '>$2' ),
trim( $string ) ) . '</p>';
// Use <p> for all line breaks if $line_breaks is set to false.
} else {
$string = '<p>' . preg_replace(
array( "/([\n]{1,})/i", "/([\r]{1,})/i" ),
"</p>\n<p>",
trim( $string ) ) . '</p>';
}
// Remove empty paragraph tags.
$string = str_replace( '<p></p>', '', $string );
// Return string.
return $string;
}
Here's an approach that comes with a reverse method to replace paragraphs back to regular line breaks and vice versa.
These are useful to use when building a form input. When saving a users input you may want to convert line breaks to paragraph tags, however when editing the text in a form, you may not want the user to see any html characters. Then we would replace the paragraphs back to line breaks.
// This function will convert newlines to HTML paragraphs
// without paying attention to HTML tags. Feed it a raw string and it will
// simply return that string sectioned into HTML paragraphs
function nl2p($str) {
$arr=explode("\n",$str);
$out='';
for($i=0;$i<count($arr);$i++) {
if(strlen(trim($arr[$i]))>0)
$out.='<p>'.trim($arr[$i]).'</p>';
}
return $out;
}
// Return paragraph tags back to line breaks
function p2nl($str)
{
$str = preg_replace("/<p[^>]*?>/", "", $str);
$str = str_replace("</p>", "\r\n", $str);
return $str;
}
Expanding upon #NaturalBornCamper's solution:
function nl2p( $text, $class = '' ) {
$string = str_replace( array( "\r\n\r\n", "\n\n" ), '</p><p>', $text);
$string = str_replace( array( "\r\n", "\n" ), '<br />', $string);
return '<p' . ( $class ? ' class="' . $class . '"' : '' ) . '>' . $string . '</p>';
}
This takes care of both double line breaks by converting them to paragraphs, and single line breaks by converting them to <br />
Just type this between your lines:
echo '<br>';
This will give you a new line.

Categories