Manipulate HTML paragraphs in php [duplicate] - php

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Highlight keywords in a paragraph
Here is another question for you. I have a small problem in PHP, and before inventing some elaborate solution of my own I thought there might be an easier and faster way to solve it.
Assuming I have a string which contains HTML paragraph tags like:
$string="<p>Hello this is nick</p>
<p>i need some help over here</p>
<p></p><p>Does anyone know a solution</p>";
And an array of strings which contains some "clue" words:
$array = array("Hello", "nick", "help", "anyone", "solution");
I now would like to do the following:
Output the $string in a browser but the "clue" words should have a special format e.g. being bold or highlighted.
What makes this a bit difficult is that I want to keep the paragraphs as they are. In other words, I want the final output to look exactly like the original (including new lines/new paragraphs) but with some words in bold.
I thought I could use strip_tags to remove the <p> and </p> tags and then split the returned string on spaces to get an array of words. Then I would output each word individually, checking whether that word is contained in $array. If it is, it would be output in bold.
In this way I clearly lose the notion of new paragraphs and all the paragraphs will be merged in a single one.
Is there an easy way to fix that? For example, a way to know that the word "Hello" starts a new paragraph? Or is there something else I can do?

Just replace the words with formatted versions of themselves. The regex below maintains the case and replaces full words only (so that for example in the word "snicker" the word "nick" inside it isn't replaced).
$string = preg_replace( '/\b('.implode( '|', $array ).')\b/i', '<em>$1</em>', $string );
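As a minimal sketch of the same idea, the version below also runs each clue word through preg_quote so that a word containing regex metacharacters cannot break the pattern. The variable names $words, $quoted and $highlighted are made up for this illustration, and the sketch assumes that no clue word coincides with a tag name such as "p", since the pattern does not distinguish text from markup.

```php
<?php
// Input taken from the question; the paragraphs must survive unchanged.
$string = "<p>Hello this is nick</p>\n<p>i need some help over here</p>\n<p></p><p>Does anyone know a solution</p>";
$words  = array('Hello', 'nick', 'help', 'anyone', 'solution');

// Escape each word so it is treated literally inside the pattern.
$quoted = array_map(function ($w) {
    return preg_quote($w, '/');
}, $words);

// \b keeps the match to whole words; the i flag ignores case,
// and $1 re-inserts the word exactly as it appeared in the text.
$pattern     = '/\b(' . implode('|', $quoted) . ')\b/i';
$highlighted = preg_replace($pattern, '<strong>$1</strong>', $string);

echo $highlighted, "\n";
```

Because only the matched words are wrapped, the <p> tags and line breaks stay where they were.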

Why not just replace your clue words directly?
$string = str_ireplace(array('hello', 'nick'), array('<strong>hello</strong>', '<strong>nick</strong>'), $string);
(Of course, the second array passed to the function would be generated beforehand.)

Use str_replace to replace each word with itself wrapped in bold tags.

Related

How do you replace AND update in PHP using preg_replace (or similar)? [duplicate]

This question already has answers here:
What does the $1$2$4 mean in this preg_replace?
(3 answers)
Closed 4 years ago.
I want to loop through an array converting specific key/value pairs that contain markup to HTML.
So an example value for $comment['comment_text'] would be:
This has *bolded* text
And should become:
This has <strong>bolded</strong> text
Here's what I've tried:
$pattern = "/\*\b.*?\b\*/i";
$newComment = preg_replace($pattern, "<strong>$&</strong>", $comment['comment_text']);
And what I get:
This has $& text
I realize I'm mixing up JavaScript syntax with PHP, but reading about back references in PHP hasn't made things any clearer.
My strings may have multiple bolded (in markup) instances...
Any help appreciated.
UPDATE:
Apologies - I didn't realize that Stack Overflow was converting asterisks to italics. I converted the example to code.
Also, my confusion came down to the use of $0 vs. $1, which I still don't fully understand. I thought the numbers referred to the matches in the string... so if you had 5 instances you could refer to them by $0 through $4.
If you use $0 you get:
This has <strong>*bolded*</strong> text
But if you use $1 you get the desired result.
Do this.
$pattern = "/\*\b(.*?)\b\*/";
$newComment = preg_replace($pattern, "<strong>$1</strong>", $comment['comment_text']);
Here $1 refers to whatever capture group 1 (the parenthesized part of the pattern) matched. I'm assuming that you want the text between a pair of asterisks to be bolded.
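To make the difference between $0 and $1 concrete, here is a small sketch. The sample sentence and the variable names $whole and $group are made up for the illustration. The numbers refer to capture groups within a single match, not to successive matches in the string: $0 is the entire matched text and $1 is the first parenthesized group. preg_replace applies the replacement to every match on its own.

```php
<?php
$text    = 'This has *bolded* text and *more* of it';
$pattern = '/\*\b(.*?)\b\*/';

// $0 is the whole match, asterisks included.
$whole = preg_replace($pattern, '<strong>$0</strong>', $text);

// $1 is only what the parentheses captured, so the asterisks are dropped.
$group = preg_replace($pattern, '<strong>$1</strong>', $text);

echo $whole, "\n"; // This has <strong>*bolded*</strong> text and <strong>*more*</strong> of it
echo $group, "\n"; // This has <strong>bolded</strong> text and <strong>more</strong> of it
```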

preg_replace all links in file_get_contents not containing a word [duplicate]

This question already has answers here:
Regular expression to match a line that doesn't contain a word
(34 answers)
Closed 9 years ago.
I'm reading a page into a variable and I would like to disable all links that do not contain the word "remedy" in the address. The code I have so far grabs all the links including ones with "remedy". What am I doing wrong?
$page = preg_replace('~<a href=".*?(?!remedy).*?".*?>(.*?)</a>~i', '<font color="#808080">$1</font>', $page);
-- solution --
$page = preg_replace('~<a href="(.(?!remedy))*?".*?>(.*?)</a>~i', '<font color="#808080">$2</font>', $page);
Try ~<a href="(.(?!remedy))*?".*?>(.*?)</a>~i
As to what you are doing wrong: a regex matches whenever a match is possible at all, and for every URL (even one containing remedy) '~<a href=".*?(?!remedy).*?".*?>(.*?)</a>~i' can match. You did not specify that remedy must not appear anywhere in the attribute; you specified that there must be some position, after any number of characters (.*?), that is not immediately followed by remedy. Such a position exists in every URL, because the engine can always step one character further before applying the lookahead. I hope that makes sense.
I would probably use this:
<a href="(?:(?!remedy)[^"])*"[^>]*>([^<]*)</a>
The most interesting part is this:
"(?:(?!remedy)[^"])*"
Each time the [^"] is about to consume another character, it yields to the lookahead so it can confirm that the character is not the first letter of the word remedy. Using [^"] instead of . prevents it from looking at anything beyond the closing quote. I also took the liberty of replacing your .*?s with negated character classes. This serves the same purpose, keeping the match "corralled" in the area where you want it to match. It's also more efficient and more robust.
Of course, I'm assuming the <a> element's content is plain text, with no more elements nested inside it. In fact, that's just one of many simplifying assumptions I've made. You can't match HTML with regexes without them.
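A short sketch of that pattern in use, under the same simplifying assumptions (double-quoted href as the first attribute, plain-text link content). The sample markup in $page is invented for the illustration, and the <font> replacement is taken from the question. Note that the lookahead is tested before each character is consumed, so a URL that starts with remedy is left alone too.

```php
<?php
// Invented sample page: two links mention "remedy", one does not.
$page = '<a href="/remedy/list">Remedies</a> '
      . '<a href="/about">About</a> '
      . '<a href="remedy.html">Start</a>';

// (?:(?!remedy)[^"])* consumes the href one character at a time and
// refuses to step onto any position where "remedy" begins.
$pattern = '~<a href="(?:(?!remedy)[^"])*"[^>]*>([^<]*)</a>~i';

$page = preg_replace($pattern, '<font color="#808080">$1</font>', $page);

echo $page, "\n";
```

Only the "About" link is replaced with greyed-out text; both remedy links keep their anchors.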

Matching pricing from html - regex [duplicate]

This question already has an answer here:
Closed 10 years ago.
Possible Duplicate:
Matching Product Prices from an HTML text
I have a string which is usually, but not always, HTML page source.
I want to extract pricing from within the string. I know this is not an exact science, and the combinations of currency symbol placement etc. are endless, but anything is better than nothing.
example string:
$string = 'the price is <tag>£10.00</tag>';
So, I am starting with the following regex:
$price = preg_match('#(?:\$|\£|\€|\£|\&\#163;)(\d+(?:\.\d+)?)#', $string);
But of course this only returns the first character.
My question is: is there a way to keep going through $string until it finds a certain character, e.g. < or a space, and then return what was found, which in this case would be 10.00?
Is this a feasible way of doing this or is there a better way?
Here's the above in an example:
http://ideone.com/u8erb
Read the docs for preg_match: it does not return your match, it only returns whether there was a match (1 or 0). The matched text is written to the array you pass as the third argument.
Try this:
$string = 'the price is <tag>£10.00</tag>';
$price = preg_match_all('#(?:\$|\£|\€|\£|\&\#163;)(\d+(?:\.\d+)?)#', $string, $matches);
//This will contain your matches
var_dump($matches);
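For reference, a minimal sketch of how the return value and the $matches array relate. The second price in the sample string, the simplified symbol list and the variable names $found, $first, $count and $all are assumptions made for this illustration; the /u flag assumes both the script and the input are UTF-8.

```php
<?php
$string  = 'the price is <tag>£10.00</tag> or <b>$7</b>';
$pattern = '/(?:\$|£|€|&#163;)(\d+(?:\.\d+)?)/u';

// preg_match stops at the first match and returns 1 (found) or 0 (not found).
$found = preg_match($pattern, $string, $first);
// $first[0] is the whole match, $first[1] the captured number.

// preg_match_all returns the number of matches; $all[1] lists every captured number.
$count = preg_match_all($pattern, $string, $all);

echo $found, ' ', $first[1], "\n";            // 1 10.00
echo $count, ' ', implode(',', $all[1]), "\n"; // 2 10.00,7
```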
How about using preg_match_all with (\d+(?:\.\d+)?)(?=<\s*/\s*tag\s*>), since the currency may change? Any solution with regex will depend on a set of assumptions, so it's good to get those down first:
Where should you be looking, are these prices occurring within a given div?
What is the full set of possible values?
Try to make your regex as broad as possible, since a common reason it'll fail in the future is because something minor has changed which you haven't considered. If these prices are occurring in a tag with ids and classes, consider using an XHTML parser instead:
http://php.net/manual/en/book.dom.php
http://simplehtmldom.sourceforge.net/

Replace content between two words [duplicate]

This question already has answers here:
Get content between two strings PHP
(7 answers)
Closed 4 years ago.
I am trying to replace the content between two words using PHP. The content between the two words varies, so I can't use a plain str_replace. I want to replace the content between two words, for example:
I would like to replace **some string of text** between two words
change to:
I would like to replace between two words
You can see that I removed all the wording between "some" and "text". Again I cannot use regular str_replace because the text between the two words may differ. For example it may say:
I would like to replace **some words of text** between two words
change to:
I would like to replace between two words
The regex is simple: /some .*? text/
Just replace it with the empty string.
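A small sketch of that replacement, using the two example sentences from the question without the bold markers. The trailing \s* is an addition, not part of the pattern above; it swallows the space after "text" so the result does not end up with a double space. The names $inputs and $results are made up for the illustration.

```php
<?php
$inputs = array(
    'I would like to replace some string of text between two words',
    'I would like to replace some words of text between two words',
);

$results = array();
foreach ($inputs as $input) {
    // .*? is lazy, so the match ends at the first " text" after "some ".
    $results[] = preg_replace('/some .*? text\s*/', '', $input);
}

print_r($results);
```

Both inputs come out as "I would like to replace between two words".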
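According to your question, only the inner part of your string changes. If that is the case it's rather trivial, because you already have the solution: you do not need to replace anything, you just need to leave the middle part out when building the result (here $startlen is the length of the fixed text at the start and $endlen the length of the fixed text at the end):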
$result = substr($string, 0, $startlen) . substr($string, -$endlen);
Perhaps this helps you find a few more angles of attack for problems like this.
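A sketch of the substr approach with the lengths filled in. The $prefix and $suffix strings are hypothetical and stand for the fixed text at either end; the answer does not say how $startlen and $endlen are obtained, so deriving them with strlen is an assumption.

```php
<?php
$string = 'I would like to replace some string of text between two words';

// Hypothetical fixed parts; only the middle of the string is assumed to vary.
$prefix = 'I would like to replace ';
$suffix = 'between two words';

$startlen = strlen($prefix);
$endlen   = strlen($suffix);

// Keep the first $startlen bytes and the last $endlen bytes, drop the rest.
$result = substr($string, 0, $startlen) . substr($string, -$endlen);

echo $result, "\n"; // I would like to replace between two words
```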

Regex - Grab a specific word within specific tags

I don't consider myself a PHP "noob", but regular expressions are still new to me.
I'm making a cURL request that returns a list of comments. Every comment has this HTML structure:
<div class="comment-text">the comment</div>
What I want is simple: I want to get, from a preg_match_all, the comments that have the word "cool" in this specific DIV tag.
What I have so far:
preg_match_all("#<div class=\"comment-text\">\bcool\b</div>#Uis", $getcommentlist, $matchescomment);
Sadly, this doesn't work. But if the REGEX is simply #\bcool\b#Uis, it will work. But I really want to capture the word "cool" in those tags.
I know I could do 2 regular expressions (one that gets all the comments, the other that filters each of them to capture the word "cool"), but I was wondering how could I do this in one preg_match_all?
I don't think I'm far from the solution, but somehow I just can't find it. Something's definitely missing.
Thank you for your time.
This should give you what you're looking for, and provide some flexibility if you want to change things a bit:
$input = '<div class="comment-text">the comment</div><div class="comment-text">cool</div><div class="comment-text">this one is cool too</div><div class="comment-text">ool</div>';
$class="comment-text";
$text="cool";
$pattern = '#<div class="'.$class.'">([^<]*'.$text.'[^<]*)</div>#s';
preg_match_all($pattern, $input, $matches);
Obviously, you need to set your input as the value for $input. After this runs, $matches[0] will hold an array of the <div>s that matched, and $matches[1] an array of the text inside them.
You can change the class of div to match or the within-div text to require by changing the $class and $text values, respectively.
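One possible refinement, sketched below, is to add word boundaries around the search text and pass both values through preg_quote, so that "uncool" does not count as "cool" and special characters in $class or $text cannot alter the pattern. The extra "uncool stuff" comment and the case-insensitive i flag are assumptions added for the illustration.

```php
<?php
$input = '<div class="comment-text">the comment</div>'
       . '<div class="comment-text">cool</div>'
       . '<div class="comment-text">this one is cool too</div>'
       . '<div class="comment-text">uncool stuff</div>';

$class = 'comment-text';
$text  = 'cool';

// [^<]* keeps the match inside one div; \b requires "cool" as a whole word.
$pattern = '#<div class="' . preg_quote($class, '#') . '">'
         . '([^<]*\b' . preg_quote($text, '#') . '\b[^<]*)</div>#i';

preg_match_all($pattern, $input, $matches);

print_r($matches[1]); // the text of each matching comment
```

With this input, $matches[1] holds "cool" and "this one is cool too".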
