My code:
$input = "this text is for highlighting a text if it exists in a string. Let us check if it works or not";
$pattern ="/if/";
$replacement= "H1Fontbracket"."if"."H1BracketClose";
echo preg_replace($pattern, $replacement, $input);
Now the problem is that when I run this code, the output is split across multiple lines. What else do I need to do to get it on one line?
You could use str_replace rather than preg_replace, since this is a plain string search. Note that str_replace takes the literal search text, not a regex with delimiters, and that both functions return a string when the subject is a string:
echo str_replace("if", $replacement, $input);
What do you mean by multiple lines? Of course it'll show up as multiple lines on a webpage if you wrap the ifs in header tags. Headers are block elements. And more importantly, headers are headers. Not for highlighting text.
If you want to highlight something with HTML, you should probably use a span with a class, or you could use the HTML5 element mark:
$input = "this text is for highlighting a text if it exists in an iffy string.";
echo preg_replace('/\\bif\\b/', '<span class="highlighted">$0</span>', $input);
echo preg_replace('/\\bif\\b/', '<mark>$0</mark>', $input);
The \\b (which PHP passes to the regex engine as \b, a word boundary) makes the pattern match only the word if, and not the letters if inside another word such as iffy. Then in your CSS you can decide how the marked words should show up:
.highlighted { background: yellow }
mark { background: yellow }
Or whatever. I would recommend that you read up a bit on how HTML and CSS works if you're going to make web pages :)
Try this
$input = "this text is for highlighting a text if
it exists in a string. Let us check if it works or not";
$pattern="if";
$replacement="<h1>". $pattern. "</h1>";
$input= str_replace($pattern,$replacement,$input);
echo "$input";
function highlight($str,$search){
$patterns = array('/\//', '/\^/', '/\./', '/\$/', '/\|/',
'/\(/', '/\)/', '/\[/', '/\]/', '/\*/', '/\+/',
'/\?/', '/\{/', '/\}/', '/\,/');
$replace = array('\/', '\^', '\.', '\$', '\|', '\(', '\)',
'\[', '\]', '\*', '\+', '\?', '\{', '\}', '\,');
$search = preg_replace($patterns, $replace, $search);
$search = str_replace(" ","|",$search);
return @preg_replace("/(^|\s)($search)/i",'${1}<span class=highlight>${2}</span>',$str);
}
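The manual escaping in the function above can probably be left to preg_quote, which escapes regex metacharacters and, given a second argument, the delimiter too. Here is a minimal sketch of that idea; the name highlightWords is made up for illustration, and it assumes $str is plain text (it does not HTML-escape anything):

```php
<?php
// Hypothetical variant of highlight(): preg_quote() replaces the hand-written escape tables.
function highlightWords(string $str, string $search): string
{
    // Split the search phrase on whitespace and drop empty pieces.
    $words = preg_split('/\s+/', trim($search), -1, PREG_SPLIT_NO_EMPTY);
    if (!$words) {
        return $str; // nothing to highlight
    }
    // Escape every word for use inside a /.../ pattern.
    $quoted = array_map(function ($w) {
        return preg_quote($w, '/');
    }, $words);
    $pattern = '/(^|\s)(' . implode('|', $quoted) . ')/i';
    return preg_replace($pattern, '${1}<span class="highlight">${2}</span>', $str);
}

echo highlightWords('x a.b axb', 'a.b');
```

Because the dot is escaped, only the literal a.b is wrapped and axb is left alone.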
I am currently using preg_replace to replace hashtags with HTML links, as shown below. The issue is that the text being checked may also contain HTML, so some CSS such as color: #000000; will make it try to convert that hex code into a link.
I basically need my regex to skip the replacement if the last character of the word is ;. Here's what I currently have:
$str = preg_replace('/#([a-zA-Z0-9!_%]+)/', '#$1', $str);
Example input: 'I like #action movies!'
Expected output: I like #action movies!'
I cannot use the end of the string to check this, as chunks of text are checked at any given time, so the string supplied could be #computer text text text #computer, for instance.
Appreciate any assistance.
In your regex you can require that the hashtag is followed either by a character that is neither ; nor a word character, or by the end of the string:
/#([a-zA-Z0-9!_%]+)([^;\w]{1}|$)/
Then use $1 and $2 accordingly
'#$1$2'
Your code will look like
$str = preg_replace('/#([a-zA-Z0-9!_%]+)([^;\w]{1}|$)/', '#$1$2',$str);
Here you can see some tests: https://regex101.com/r/yN4tJ6/65
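A negative lookahead gives roughly the same result without capturing the following character. This is only a sketch: the function name linkHashtags and the https://example.com/tags/ base URL are placeholders, not anything from the question.

```php
<?php
// Hypothetical helper: link hashtags, but skip tokens such as "#000000;" in inline CSS.
function linkHashtags(string $text, string $baseUrl = 'https://example.com/tags/'): string
{
    return preg_replace_callback(
        // (?![;\w]) rejects a tag that is directly followed by ";" or another word character.
        '/#([a-zA-Z0-9!_%]+)(?![;\w])/',
        function (array $m) use ($baseUrl): string {
            return '<a href="' . $baseUrl . rawurlencode($m[1]) . '">#'
                . htmlspecialchars($m[1]) . '</a>';
        },
        $text
    );
}

echo linkHashtags('color: #000000; I like #action movies!');
// color: #000000; I like <a href="https://example.com/tags/action">#action</a> movies!
```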
Until a regex guru comes to your rescue (if ever...), and because you are in PHP, here is a solution with a few lines of code.
$str="hi #def; #abc #ghi"; // just a test case (the first one needs to be skipped)
if (preg_match_all('/#([a-zA-Z0-9!_%]+.?)/', $str,$m)){
foreach($m[1] as $k) if(substr($k,-1)!=';') {
$k=trim($k);
$str=str_replace("#$k","<a href='http://wxample.com/tags/$k'>#$k</a>",$str);
}
}
print "$str\n";
You can add a condition to check whether the last character of the string is ; and act accordingly.
Example:
if (substr($str, -1)==';'){
//do nothing
}
else {
$str = preg_replace('/#([a-zA-Z0-9!_%]+)/', '#$1', $str);
}
Hope this helps.
This regex should work:
#([\w!%]+(?=[\s,!?.\n]|$))
Demo: https://regex101.com/r/KrRiD3/2
Your PHP code:
$str = 'I like #strategy games #f1f1f1; #e2e2e2; #action games!';
$str = preg_replace('/#([\w!%]+(?=[\s,!?.\n]|$))/', '#$1', $str);
echo $str;
output:
I like #strategy games #f1f1f1; #e2e2e2; #action games!
Well, you can use the code below. I am new to regex, so it is not that professional, but it works. Here it is:
$data = "<p style='color:#00000;'>Heloo</p> #computer text text text #computer #say #goo1d #sd! #say_hello";
echo preg_replace("/(?<!\:)(\s+)\#([\w]+)(?!\;)/",'#$2',$data);
This is the expression I used:
/(?<!\:)(\s+)\#([\w]+)(?!\;)/
Output is
<p style='color:#00000;'>Heloo</p> #computer text text text #computer #say #goo1d #sd! #say_hello
I hope it helps someone.
I have this text: http://pastebin.com/2Zgbs7hi
I want to remove the HTML code from it and display just the plain text, but I want to keep at least one line break where there are currently several line breaks.
I have tried:
$ticket["summary"] = 'pastebin example';
$TicketSummaryDisplay = nl2br($ticket["summary"]);
$TicketSummaryDisplay = stripslashes($TicketSummaryDisplay);
$TicketSummaryDisplay = trim(strip_tags($TicketSummaryDisplay));
$TicketSummaryDisplay = preg_replace('/\n\s+$/m', '', $TicketSummaryDisplay);
echo $TicketSummaryDisplay;
That displays as plain text, but it shows everything as one big block of text with no line breaks at all.
Maybe this will save you some time.
<?php
libxml_use_internal_errors(true); //crazy o tags
$html = file_get_contents('http://pastebin.com/raw.php?i=2Zgbs7hi');
$dom = new DOMDocument;
$dom->loadHTML($html);
$result='';
foreach ($dom->getElementsByTagName('p') as $node) {
if (strstr($node->nodeValue, 'Legal Disclaimer:')){
break;
}
$result .= $node->nodeValue;
}
echo $result;
This example should successfully store text from html into an array of strings.
After stripping all the tags, you can use preg_split with the \R escape sequence (which matches any newline sequence) to convert the string into an array. That array will contain several blank values, and also a number of HTML non-breaking space entities (&nbsp;), so we filter it with array_filter(), which removes every item for which the callback returns a falsy value, in our case an empty string. There is a problem with the &nbsp; entity: a non-breaking space and an ordinary space are not the same character and have different character codes, so trim() will not remove it. Here are two possible solutions: the first, uncommented part only replaces &nbsp; and then trims whitespace, while the second, commented-out part decodes all HTML entities and then also removes the decoded non-breaking spaces.
PHP:
$text = file_get_contents( 'http://pastebin.com/raw.php?i=2Zgbs7hi' );
$text = strip_tags( $text );
$array = array_filter(
preg_split( '/\R/', $text ),
function( &$item ) {
$item = str_replace( '&nbsp;', ' ', $item );
return trim( $item );
// $item = html_entity_decode( $item );
// return trim( str_replace( "\xC2\xA0", ' ', $item ) );
}
);
foreach( $array as $value ) {
echo $value . '<br />';
}
Array output:
Array
(
[8] => Hi,
[11] => Ashley has explained that I need to ask for another line and broadband for the wifi to work, please can you arrange this.
[13] => Regards
[23] => Legal Disclaimer:
[24] => This email and its attachments are confidential. If you received it by mistake, please don’t share it. Let us know and then delete it. Its content does not necessarily represent the views of The Dragon Enterprise
[25] => Centre and we cannot guarantee the information it contains is complete. All emails are monitored and may be seen by another member of The Dragon Enterprise Centre's staff for internal use
)
Now you should have a clean array containing only the items that have a value. By the way, line breaks in HTML are expressed through <br />, not through \n. Your example, as a response in a web browser, still has the newlines, but they are only visible in the page source. I hope I did not miss the point of the question.
Try this to get text output with line breaks:
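The same steps could also be wrapped in one function that returns a string with single line breaks. This is a rough sketch, not a full HTML-to-text converter; the name htmlToText and the list of block tags are my own assumptions:

```php
<?php
// Hypothetical helper: reduce HTML to plain text, keeping one line break between blocks.
function htmlToText(string $html): string
{
    // Turn <br> and a few closing block tags into newlines before the tags are removed.
    $text = preg_replace('~<br\s*/?>|</(p|div|tr|li|h[1-6])>~i', "\n", $html);
    $text = strip_tags($text);
    // Decode entities, then turn non-breaking spaces (U+00A0) into ordinary spaces.
    $text = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    $text = str_replace("\u{00A0}", ' ', $text);
    // Trim every line and drop the ones that end up empty.
    $lines = array_filter(array_map('trim', preg_split('/\R/', $text)), 'strlen');
    return implode("\n", $lines);
}

echo nl2br(htmlToText("<p>Hi,</p>\n\n<p>&nbsp;</p><p>Regards</p>"));
```

nl2br() is only needed when the result is sent to a browser; for a plain-text target the "\n" separators are enough.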
<?php
$ticket["summary"] = file_get_contents('http://pastebin.com/raw.php?i=2Zgbs7hi');
$TicketSummaryDisplay = nl2br($ticket["summary"]);
echo strip_tags($TicketSummaryDisplay,'<br>');
?>
You are asking how to add line breaks to your "one big block of text with no line breaks at all".
Short answer
After you stripped the HTML tags, apply wordwrap with a desired text-block length
$text = wordwrap($text, 90, "<br />\n");
I really wonder why nobody suggested that function before.
There is also chunk_split, which doesn't take words into account and just splits after a certain number of characters, breaking words. But that's not what you want, I guess.
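To make the difference concrete, here is a tiny comparison on made-up input strings (the widths 10 and 4 are arbitrary):

```php
<?php
// wordwrap() breaks at spaces, so words stay whole.
$wrapped = wordwrap('The quick brown fox', 10, "\n");

// chunk_split() cuts after a fixed number of characters and appends
// the separator after every chunk, including the last one.
$chunked = chunk_split('abcdefgh', 4, "\n");

echo $wrapped, "\n---\n", $chunked;
// The quick
// brown fox
// ---
// abcd
// efgh
```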
PHP
<?php
$text = file_get_contents('http://pastebin.com/raw.php?i=2Zgbs7hi');
/**
* Returns the string without HTML tags; also
* takes control chars, spaces and "&nbsp;" into account.
*/
function dropHtmlTags($string) {
// remove html tags
//$string = preg_replace ('/<[^>]*>/', ' ', $string);
$string = strip_tags($string);
// control characters and "&nbsp;"
$string = str_replace("\r", '', $string); // remove
$string = str_replace("\n", ' ', $string); // replace with space
$string = str_replace("\t", ' ', $string); // replace with space
$string = str_replace("&nbsp;", ' ', $string); // replace with space
// remove multiple spaces
$string = preg_replace('/ {2,}/', ' ', $string);
$string = trim($string);
return $string;
}
$text = dropHtmlTags($text);
// The Answer: insert line breaks after 95 chars,
// to get rid of the "one big block of text with no line breaks at all"
$text = wordwrap($text, 95, "<br />\n");
// if you want to insert line-breaks before the legal disclaimer,
// uncomment the next line
//$text = str_replace("Regards Legal Disclaimer", "<br /><br />Regards Legal Disclaimer", $text);
echo $text;
?>
Result
first section shows your text block
second section shows the text with wordwrap applied (code from above)
Hello, it can be done as follows:
$abc= file_get_contents('http://pastebin.com/raw.php?i=2Zgbs7hi');
$abc = strip_tags($abc); // the string to strip is the first argument
echo $abc;
Please, let me know whether it works
you may use
<?php
$a= file_get_contents('a.txt');
echo nl2br(htmlspecialchars($a));
?>
<?php
$handle = @fopen("pastebin.html", "r");
if ($handle) {
while (!feof($handle)) {
// fgetss() reads a line and strips HTML tags (deprecated in PHP 7.3, removed in PHP 8.0)
$buffer = fgetss($handle, 4096);
echo $buffer;
}
fclose($handle);
}
?>
output is
Hi,
Ashley has explained that I need to ask for another line and broadband for the wifi to work, please can you arrange this.
Regards
Legal Disclaimer:
This email and its attachments are confidential. If you received it by mistake, please don’t share it. Let us know and then delete it. Its content does not necessarily represent the views of The Dragon Enterprise
Centre and we cannot guarantee the information it contains is complete. All emails are monitored and may be seen by another member of The Dragon Enterprise Centre's staff for internal use
You can probably write additional code to convert &nbsp; to spaces etc.
I'm not sure I understood everything correctly, but this seems to be your expected result:
$txt = file_get_contents('http://pastebin.com/raw.php?i=2Zgbs7hi');
var_dump(preg_replace("/(\&nbsp\;(\s{1,})?)+/", "\n", trim(strip_tags(preg_replace("/(\s){1,}/", " ", $txt)))));
//more readable
$txt = preg_replace("/(\s){1,}/", " ", $txt);
$txt = trim(strip_tags($txt));
$txt = preg_replace("/(\&nbsp\;(\s{1,})?)+/", "\n", $txt);
The strip_tags() function strips HTML and PHP tags from a string, if that is what you are trying to accomplish.
Examples from the docs:
<?php
$text = '<p>Test paragraph.</p><!-- Comment --> Other text';
echo strip_tags($text);
echo "\n";
// Allow <p> and <a>
echo strip_tags($text, '<p><a>');
?>
The above example will output:
Test paragraph. Other text
<p>Test paragraph.</p> Other text
I'm struggling on replacing text in each link.
$reg_ex = "/(http|https)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?/";
$text = '<br /><p>this is a content with a link we are supposed to click</p><p>another - this is a content with a link we are supposed to click</p><p>another - this is a content with a link we are supposed to click</p>';
if(preg_match_all($reg_ex, $text, $urls))
{
foreach($urls[0] as $url)
{
echo $replace = str_replace($url,'http://www.sometext'.$url, $text);
}
}
From the code above, I'm getting the same text three times, with the links changed one by one: each time only one link is replaced. I know that is because I use foreach.
But I don't know how to replace them all at once.
Your help would be great!
You don't use regexes on html. use DOM instead. That being said, your bug is here:
$replace = str_replace(...., $text);
^^^^^^^^--- ^^^^^---
you never update $text, so you continually trash the replacement on every iteration of the loop. You probably want
$text = str_replace(...., $text);
instead, so the changes "propagate"
If you want the final variable to contain all replacements, change it to something like this.
You are basically not passing the replaced string back in as the "subject". I assume that is what you are expecting, since the question is a bit difficult to understand.
$reg_ex = "/(http|https)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?/";
$text = '<br /><p>this is a content with a link we are supposed to click</p><p>another - this is a content with a link we are supposed to click</p><p>another - this is a content with a link we are supposed to click</p>';
if(preg_match_all($reg_ex, $text, $urls))
{
$replace = $text;
foreach($urls[0] as $url) {
$replace = str_replace($url,'http://www.sometext'.$url, $replace);
}
echo $replace;
}
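If the goal is simply to prefix every URL in one go, a single preg_replace call may be enough, since it replaces all matches and $0 in the replacement stands for the whole match. A sketch with a made-up input string and a slightly simplified version of the pattern from the question:

```php
<?php
// Simplified version of the question's pattern, using ~ as delimiter to avoid escaping slashes.
$regEx = '~https?://[a-zA-Z0-9.-]+\.[a-zA-Z]{2,3}(/\S*)?~';
$text  = 'see http://a.com and http://b.org/x';

// One call replaces every match; $0 is the full matched URL.
$replaced = preg_replace($regEx, 'http://www.sometext$0', $text);

echo $replaced;
// see http://www.sometexthttp://a.com and http://www.sometexthttp://b.org/x
```

This also avoids a side effect of str_replace in a loop: when the same URL occurs twice in the text, each iteration replaces every occurrence, so that URL would be prefixed more than once.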
Just a quick question about regular expressions: will this code work for any grooming I will need to do? (i.e. can the result be put into a database and be safe?)
function markdown2html($text) {
$text = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
// Strong Emphasis
$text = preg_replace('/__(.+?)__/s', '<strong>$1</strong>', $text);
$text = preg_replace('/\*\*(.+?)\*\*/s', '<strong>$1</strong>', $text);
// Underline
$text = preg_replace('/_([^_]+)_/', '<p style="text-decoration: underline;">$1</p>', $text);
//Italic
$text = preg_replace('/\*([^\*]+)\*/', '<em>$1</em>', $text);
// Windows to Unix
$text = str_replace("\r\n", "\n", $text); // double quotes, so that \r and \n are real control characters
// Macintosh to Unix
$text = str_replace("\r", "\n", $text);
//Paragraphs
$text = '<p>' . str_replace("\n\n", '</p><p>', $text) . '</p>';
$text = str_replace("\n", '<br />', $text);
// [Linked Text](Url)
$text = preg_replace('/\[([^\]]+)]\(([a-z0-9._~:\/?#@!$&\'()*+,;=%]+)\)/i', '$1', $text);
return $text;
}
No, absolutely not.
Your code has nothing to do with SQL -- it does not modify ' or \ characters at all. Commingling the formatting functionality of this function with SQL escaping is silly.
Your code may also introduce HTML injection in some situations -- I'm particularly suspicious of the URL linking regex. Without a proper parser involved, I would not trust it an inch.
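To illustrate the concern: the character class in the link regex also accepts something like javascript:alert(1) as a URL. One possible mitigation, sketched here with a made-up helper name, is to accept only absolute http(s) URLs before building the link:

```php
<?php
// Hypothetical check: accept only absolute http:// or https:// URLs as link targets.
function isSafeLinkTarget(string $url): bool
{
    return preg_match('~^https?://~i', $url) === 1;
}

var_dump(isSafeLinkTarget('http://example.com/page')); // bool(true)
var_dump(isSafeLinkTarget('javascript:alert(1)'));     // bool(false)
```

This is a narrow check on the scheme only; it is not a substitute for a real Markdown parser.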
No, the data cannot be assured to be safe after passing through that function.
You need to either escape SQL-sensitive characters or use prepared statements with PDO/MySQLi. Prepared statements are much handier anyway.
Don't use the old way of hacking together a query, ie:
$query = 'select * from table where col = '.$value;
You're just asking for trouble there.
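For comparison, a prepared statement with PDO might look roughly like this. The in-memory SQLite database and the posts table are just stand-ins so the snippet runs on its own (it assumes the pdo_sqlite extension is available); with MySQL only the DSN would change.

```php
<?php
// Throwaway in-memory database, used only so the example is self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)');

// User input with quotes and SQL in it; it is sent as data, never spliced into the query.
$value = "it's *bold*; DROP TABLE posts;--";

$stmt = $pdo->prepare('INSERT INTO posts (body) VALUES (:body)');
$stmt->execute([':body' => $value]);

// The value comes back exactly as it went in.
$stored = $pdo->query('SELECT body FROM posts')->fetchColumn();
var_dump($stored === $value); // bool(true)
```

The formatting function then only has to worry about HTML output, and the database layer only about SQL.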
A couple of things jumped out at me:
I believe that the first two regexes ('/__(.+?)__/s' and the corresponding one for *) handle ___word___ and ***word*** incorrectly: they will treat the third character as part of the word, so you will get *word* (where the first * is bold and the trailing one is not) instead of word.
On the third one ('/_([^_]+)_/'), is it really appropriate for
do _not_ do that
to turn into
do <p style="text-decoration: underline;">not</p> do that
?
Of course I’m not saying that it’s OK to use if you fix these issues.
<hr>I want to remove this text.<embed src="stuffinhere.html"/>
I tried using regex but nothing works.
Thanks in advance.
P.S. I tried this: $str = preg_replace('#(<hr>).*?(<embed)#', '$1$2', $str)
You'll get a lot of advice to use an HTML parser for this kind of thing. You should do that.
The rest of this answer is for when you've decided that the HTML parser is too slow, doesn't handle ill formed (i.e. standard in the wild) HTML, or is a pain in the ass to integrate into the system you don't control. I created the following small shell script
$str = '<hr>I want to remove this text.<embed src="stuffinhere.html"/>';
$str = preg_replace('#(<hr>).*?(<embed)#', '$1$2', $str);
var_dump($str);
//outputs
string(35) "<hr><embed src="stuffinhere.html"/>"
and it did remove the text, so I'd check your source documents and any other PHP code around your RegEx. You're not feeding preg_replace the string you think you are. My best guess is your source document has irregular case, or there's whitespace between the <hr /> and <embed>. Try the following regular expression instead.
$str = '<hr>I want to remove
this text.
<EMBED src="stuffinhere.html"/>';
$str = preg_replace('#(<hr>).*?(<embed)#si', '$1$2', $str);
var_dump($str);
//outputs
string(35) "<hr><EMBED src="stuffinhere.html"/>"
The "i" modifier says "make this search case insensitive". The "s" modifier says "the [.] character should also match my platform's line break/carriage return sequence"
But use a proper parser if you can. Seriously.
I think the code is self-explanatory and pretty easy to understand since it does not use regex (and it might be faster)...
$start='<hr>';
$end='<embed src="stuff...';
$str=' html here... ';
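function between($t1,$t2,$page) {
// find the start marker (case-insensitive)
$p1=stripos($page,$t1);
if($p1===false) {
return false;
}
// find the end marker after the start marker
$p2=stripos($page,$t2,$p1+strlen($t1));
if($p2===false) {
return false; // no end marker: avoid a bogus substr() length
}
return substr($page,$p1+strlen($t1),$p2-$p1-strlen($t1));
}
$found=between($start,$end,$str);
// stop once nothing is left between the markers, otherwise the loop never ends
while($found!==false && $found!=='') {
$str=str_replace($start.$found.$end,$start.$end,$str);
$found=between($start,$end,$str);
}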
// do something with $str here...
$text = '<hr>I want to remove this text.<embed src="stuffinhere.html"/>';
$text = preg_replace('#(<hr>).*?(<embed.*?>)#', '$1$2', $text);
echo $text;
If you want to hard code src in embed tag:
$text = '<hr>I want to remove this text.<embed src="stuffinhere.html"/>';
$text = preg_replace('#(<hr>).*?(<embed src="stuffinhere.html"/>)#', '$1$2', $text);
echo $text;