PHP replace string help - php

I am designing a site with a comment system, and I would like a Twitter-like reply system.
If the user puts #a_registered_username, I would like it to become a link to the user's profile.
I think preg_replace is the function needed for this.
$ALL_USERS_ROW *['USERNAME'] is the database query array for all the users and ['USERNAME'] is the username row.
$content is the comment containing the #username
I think this should not be very hard to solve for someone who is good at PHP.
Does anybody have any idea how to do it?

$content = preg_replace( "/\b#(\w+)\b/", "http://twitter.com/$1", $content );
should work, but I can't get the word-boundary matches to work in my test; it may depend on the regex library used in different versions of PHP.
$content = preg_replace( "/(^|\W)#(\w+)(\W|$)/", "$1http://twitter.com/$2$3", $content );
is tested and does work
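A small sketch of the difference between the two patterns above, using a made-up sample comment. A likely reason the first pattern fails is that \b only matches between a word character and a non-word character, and a space followed by # is two non-word characters in a row:

```php
<?php
// Sample comment, invented for illustration.
$content = 'hi #alice and #bob';

// \b before "#" needs a word character directly before it,
// so a mention that follows a space is never matched.
$a = preg_replace('/\b#(\w+)\b/', 'http://twitter.com/$1', $content);

// Capturing the neighbouring non-word characters (or the string edges) works.
$b = preg_replace('/(^|\W)#(\w+)(\W|$)/', '$1http://twitter.com/$2$3', $content);

echo $a, "\n"; // hi #alice and #bob
echo $b, "\n"; // hi http://twitter.com/alice and http://twitter.com/bob
```

Note that the second pattern consumes the characters around each mention, so two mentions separated by a single space would not both be replaced.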

You want to go through the text and pick out the username; here is a good starting point:
$txt='this is some text #seanja';
$re1='.*?'; # Non-greedy match on filler
$re2='(#)'; # Any Single Character 1
$re3='((?:[a-z][a-z]+))'; # Word 1
if ($c=preg_match_all ("/".$re1.$re2.$re3."/is", $txt, $matches))
{
    $c1=$matches[1][0];
    $word1=$matches[2][0]; //this is the one you want to replace with a link
    print "($c1) ($word1) \n";
}
Generated with:
http://www.txt2re.com/index-php.php3?s=this%20is%20some%20text%20#seanja&-40&1
[edit]
Actually, if you go here ( http://www.gskinner.com/RegExr/ ), and search for twitter in the community tab on the right, you will find a couple of really good solutions for this exact problem:
$mystring = 'hello #seanja #bilbobaggins sean#test.com and #slartibartfast';
$regex = '/(?<=#)((\w+))(\s)/'; // PHP has no "g" modifier; preg_replace already replaces every match
$replace = '$1$3';
$mystring = preg_replace($regex, $replace, $mystring);

$str = preg_replace('~(?<!\w)#(\w+)\b~', 'http://twitter.com/$1', $str);
Does not match emails. Does not match any spaces around it.
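To link only registered usernames, as the question asks, the same pattern can be combined with preg_replace_callback. This is a minimal sketch, not a definitive implementation: the $registered array stands in for the result of the users query, and the /profile/ URL is a made-up path.

```php
<?php
// Hypothetical list of registered usernames (would come from the users query).
$registered = ['seanja', 'bilbobaggins'];

$content = 'hello #seanja and #nobody, mail sean#test.com';

$linked = preg_replace_callback(
    '~(?<!\w)#(\w+)\b~',
    function ($m) use ($registered) {
        // Leave the text untouched when the name is not a registered user.
        if (!in_array($m[1], $registered, true)) {
            return $m[0];
        }
        $name = htmlspecialchars($m[1], ENT_QUOTES);
        // "/profile/" is an assumed URL scheme for the profile page.
        return '<a href="/profile/' . $name . '">#' . $name . '</a>';
    },
    $content
);

echo $linked, "\n";
// hello <a href="/profile/seanja">#seanja</a> and #nobody, mail sean#test.com
```

Anonymous functions require PHP 5.3 or later; on older versions a named callback function can be passed instead.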

Related

preg_replace words with #

Trying to use preg_replace to find words with # in them and replace the whole word with nothing.
<?php
$text = "This is a text #removethis little more text";
$textreplaced = preg_replace('/#*. /', '', $text);
echo $textreplaced;
Should output: This is a text little more text
I've been trying to google special characters and such, but I'm lost.
Use \w:
$textreplaced = preg_replace('/#[\w]+ /', '', $text);
echo $textreplaced;
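A quick check of the pattern above on the sample text from the question. The second call is an assumed variant, not part of the answer, that also handles a tag at the very end of the string, where there is no trailing space to match:

```php
<?php
$text = "This is a text #removethis little more text";

// "#" followed by one or more word characters and a single trailing space.
$textreplaced = preg_replace('/#\w+ /', '', $text);
echo $textreplaced, "\n"; // This is a text little more text

// Assumed variant: remove the leading whitespace instead, so a tag
// that ends the string is removed as well.
$atEnd = preg_replace('/\s*#\w+/', '', "some text #last");
echo $atEnd, "\n"; // some text
```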
You are only finding one character at a time.
I believe you are only finding the '#' to begin with. If you want to find the whole string, use \b around the regex, so your final regex should be something like /(#).{2,}?\b/.
The ? is important because quantifiers are greedy by default and grab as many characters as possible.
Just a tip: visit a tester like regexpal.

URL Regex issue, php

I have a URL regex I use (and have used quite frequently). It does me well for finding various URL formats and http protocols. That said, I wouldn't be writing here if all was dandy in Dandyland.
I've encountered a hiccup that my current regex below is causing.
When searching a string for URLs, if a string consists of something like example...see it will treat it as a URL. There can be any number of periods, however it only pulls the last 3 characters after the last period.
Any ideas how to resolve this?
Example:
$string = "Here's a url, hello.com. But this...shouldn't show.";
$url_regex = "/((https?|ftp)\:\/\/)?([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?#)?([a-z0-9-.]*)\.([a-z]{2,3})(\:[0-9]{2,5})?(\/([a-z0-9+\$_\-~#\(\)\%]\.?)+)*\/?(\?[a-z+&\$_.-][a-z0-9;:#&#%=+\/\$_.-]*)?(#[a-z_.-][a-z0-9+\$_.-]*)?/i";
preg_match_all($url_regex, $string, $urls);
return $urls;
The problem here was that you had added a period to the allowed characters, which meant there could be more than one consecutive period. Also, \b is important when you're dealing with inline searches.
\b((https?|ftp)\:\/\/)?([a-z0-9+!*(),;?&=\$_-]+(\:[a-z0-9+!*(),;?&=\$_-]+)?#)?([a-z0-9-]*)\.([a-z]+){2,3}(\:[0-9])?(\/([a-z0-9+\$_\-~#\(\)\%]?)+)*\/?(\?[a-z+&\$_-][a-z0-9;:#&#%=+\/\$_-]*)?(#[a-z_-][a-z0-9+\$_-]*)?\b
Edit: Updated the answer to ignore matches like example.c
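The effect of keeping the period out of the character class can be seen on a much smaller pattern. This is a simplified sketch that only matches bare host names (no scheme, port, path or query); it is not a replacement for the full regex:

```php
<?php
$string = "Here's a url, hello.com. But this...shouldn't show.";

// Each label is letters, digits or hyphens; a period is only allowed
// between two labels, so "this...shouldn" can never match.
$host_regex = '/\b[a-z0-9-]+(?:\.[a-z0-9-]+)*\.[a-z]{2,3}\b/i';

preg_match_all($host_regex, $string, $hosts);
print_r($hosts[0]); // contains only "hello.com"
```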
The following code solves your issue. I have tested it on my end.
$string = "Here's a url, hello.com. But this...shouldn't show.";
$url_regex = "/((https?|ftp)\:\/\/)?([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?#)?([a-z0-9-]+?)\.([a-z]{2,3})(\:[0-9]{2,5})?(\/([a-z0-9+\$_\-~#\(\)\%]\.?)+)*\/?(\?[a-z+&\$_.-][a-z0-9;:#&#%=+\/\$_.-]*)?(#[a-z_.-][a-z0-9+\$_.-]*)?/i";
preg_match_all($url_regex, $string, $urls);
Alternatively, match only URLs in the string that start with a scheme such as http or https.
$string = "this is my website http://example.com and this is my friend website https://pqr.com etc, this...shouldn't show";
$regex = '/\b(https?|ftp|file):\/\/[-A-Z0-9+&##\/%?=~_|$!:,.;]*[A-Z0-9+&##\/%=~_|$]/i';
preg_match_all($regex, $string, $matches);
print_r($matches[0]);

PHP preg_replace, split or match?

I need to parse a string and replace a specific format for tv show names that don't fit my normal format of my media player's queue.
Some examples
Show.Name.2x01.HDTV.x264 should be Show.Name.S02E01.HDTV.x264
Show.Name.10x05.HDTV.XviD should be Show.Name.S10E05.HDTV.XviD
After the show name, there may be 1 or 2 digits before the x, I want the output to always be an S with two digits so add a leading zero if needed. After the x it should always be an E with two digits.
I looked through the manual pages for the preg_replace, split and match functions but couldn't quite figure out what I should do here. I can match the part of the string I want with /\dx\d{2}/ so I was thinking first check if the string has that pattern, then try and figure out how to split the parts out of the match but I didn't get anywhere.
I work best with examples, so if you can point me in the right direction with one, that would be great. My only test area right now is a PHP 4 install, so please no PHP 5 specific directions. Once I understand what's happening I can probably update it later for PHP 5 if needed :)
A different approach, using sprintf, for PHP 4 and below:
$text = preg_replace('/([0-9]{1,2})x([0-9]{2})/ie',
'sprintf("S%02dE%02d", $1, $2)', $text);
Note: The e modifier is deprecated as of PHP 5.5 (and removed in PHP 7.0), so use preg_replace_callback() instead:
$text = preg_replace_callback('/([0-9]{1,2})x([0-9]{2})/',
function($m) {
return sprintf("S%02dE%02d", $m[1], $m[2]);
}, $text);
Output
Show.Name.S02E01.HDTV.x264
Show.Name.S10E05.HDTV.XviD
preg_replace is the function you are looking for.
You have to write a regex pattern that picks out the correct place.
<?php
$replaced_data = preg_replace("~([0-9]{1,2})x([0-9]{2})~s", "S$1E$2", $data); // season may be 1 or 2 digits
$replaced_data = preg_replace("~S([0-9])E~s", "S0$1E", $replaced_data); // pad a single-digit season with a leading zero
?>
Sorry I could not test it but it should work.
Another way, using the preg_replace_callback() function:
$subject = <<<'LOD'
Show.Name.2x01.HDTV.x264 should be Show.Name.S02E01.HDTV.x264
Show.Name.10x05.HDTV.XviD should be Show.Name.S10E05.HDTV.XviD
LOD;
$pattern = '~([0-9]++)x([0-9]++)~i';
$callback = function ($match) {
return sprintf("S%02sE%02s", $match[1], $match[2]);
};
$result = preg_replace_callback($pattern, $callback, $subject);
print_r($result);
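The patterns above match the season and episode digits anywhere in the string. A slightly stricter sketch, assuming the tag always sits between two dots as in the examples, wraps the callback approach in a helper. The function name normalizeEpisodeTag is made up for illustration, and the closure needs PHP 5.3 or later:

```php
<?php
// Hypothetical helper: rewrites a dot-delimited "NxNN" token to "SNNENN".
function normalizeEpisodeTag($name)
{
    return preg_replace_callback(
        // Lookarounds require a dot on both sides without consuming it.
        '/(?<=\.)(\d{1,2})x(\d{2})(?=\.)/i',
        function ($m) {
            // %02d pads both numbers to two digits.
            return sprintf('S%02dE%02d', $m[1], $m[2]);
        },
        $name
    );
}

echo normalizeEpisodeTag('Show.Name.2x01.HDTV.x264'), "\n";  // Show.Name.S02E01.HDTV.x264
echo normalizeEpisodeTag('Show.Name.10x05.HDTV.XviD'), "\n"; // Show.Name.S10E05.HDTV.XviD
```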

How to ignore regex matches wrapped by a particular string?

I had a great idea for some functionality on a project and I've tried to implement it to the best of my ability but I need a little help achieving the desired effect. The page in question is: http://dev.favorcollective.com/guidelines/ (just to provide some context)
I'm using php's preg_replace to go through a particular page's contents (giant string) and I'm having it search for glossary terms and then I wrap the terms with a bit of html that enables dynamic glossary definition tooltips.
Here is my current code:
function annotate($content)
{
    global $glossary_terms;
    $search = array();
    $replace = array();
    $count=1;
    foreach ($glossary_terms as $term):
        array_push($search,'/\b('.preg_quote($term['term'],'/').')[?=a-zA-Z]*/i');
        $id = "annotation-".$count;
        $replacement = ''.$term['term'].'<span id="'.$id.'" style="display:none;"><span class="term">'.$term['term'].'</span><span class="definition">'.$term['def'].'</span></span>';
        array_push($replace,(string)$replacement);
        $count++;
    endforeach;
    return preg_replace($search, $replace, $content);
}
• But what if I want to ignore matches inside of <h#> </h#> tags?
• I also have a particular string that I do not want a specific term to match within. For example, I want the word "proficiency" to match any time it is NOT used in the context of "ACTFL Proficiency Guidelines". How would I go about adding exceptions to my regular expression? Is that even an option?
• Finally, how can I return the matched text as a variable? Currently, when I match a term ending in 's' or 'ing' (on purpose), my script prints the glossary term rather than the original string that was matched (i.e. it's replacing "descriptions" with "description"). Is there any way to do that?
I'm not a PHP guy (C#), but here goes. I assume that
'/\b('.preg_quote($term['term'],'/').')[?=a-zA-Z]*/i' will map to this far more readable pattern:
/\b(ESCAPED_TERM)[?=a-zA-Z]*/i
As far as excluding <h#>-type tags goes, regex is OK only if you can assume your data will be the simple, non-nested case: <h#>TERM<h#>. If you can, you can use a negative lookahead assertion:
/\b(ESCAPED_TERM)(?!<h\d>)[?=a-zA-Z]*/i
You can use a lookahead with a lookbehind to handle your special case:
/\b(ESCAPED_TERM|(?<!ACTFL )Proficiency(?!\sGuidelines))(?!<h\d>)[?=a-zA-Z]*/i
Note: if you have a bunch of these special cases, PHP has an extended ("ignore whitespace") flag, x, which lets you put each token on its own line.
Regular expressions are awesome, wonderful, magical. But everything has its limits.
That's why it's nice to have a language like PHP to provide the extra functionality. :)
Can you strip out headers with a non-greedy regexp?
$content = preg_replace('/<h[1-6]>.*?<\/h[1-6]>/sim', "", $content);
If non-greedy evaluations aren't working, what about just assuming that there won't be any other HTML inside your headers?
$content = preg_replace('/<h[1-6]>[^<]*<\/h[1-6]>/im', "", $content);
Also, you might want to use sprintf to simplify your replacement:
/*
1 get_bloginfo('url')
2 preg_replace( '/\s+/', '', $term['term']).
3 $id
4 $term['term']
5 $term['def']
*/
$rfmt = '%4$s<span id="%3$s" style="display:none;"><span class="term">%4$s</span><span class="definition">%5$s</span></span>';
...
$replacement = sprintf($rfmt, get_bloginfo('url'), preg_replace( '/\s+/', '', $term['term']), $id, $term['term'], $term['def'] );
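Instead of stripping the headers, one option is to split the content around them and annotate only the parts in between. The sketch below also uses $0 in the replacement, which stands for the text that actually matched, so "descriptions" is not rewritten to "description". The function name and the simplified markup are assumptions for illustration, and it assumes headings are not nested:

```php
<?php
// Hypothetical helper: annotate one term everywhere except inside <h1>..<h6>.
function annotateOutsideHeadings($content, $term)
{
    // Split on whole heading elements; the capture group keeps them in the result.
    $parts = preg_split(
        '~(<h[1-6][^>]*>.*?</h[1-6]>)~is',
        $content,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );
    $pattern = '/\b' . preg_quote($term, '/') . '[a-z]*/i';

    foreach ($parts as $i => $part) {
        // With a single capture group, odd indexes hold the headings; skip them.
        if ($i % 2 === 1) {
            continue;
        }
        // $0 inserts the matched text itself, including any "s" or "ing" suffix.
        $parts[$i] = preg_replace($pattern, '<span class="term">$0</span>', $part);
    }
    return implode('', $parts);
}

echo annotateOutsideHeadings('<h2>Description</h2><p>Two descriptions.</p>', 'description');
// <h2>Description</h2><p>Two <span class="term">descriptions</span>.</p>
```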

How to extract a word using regex in php?

How do I extract foo from the following URL and store it in a variable, using regex in PHP?
http://example.com/pages/foo/inside.php
I googled quite a bit for an answer but most regex examples were too complex for me to understand.
preg_match('~pages/(.+?)/~', "http://example.com/pages/foo/inside.php", $matches);
echo $matches[1];
Well, there could be multiple solutions, based on what rule you want the foo to be extracted. As you didn't specify it yet, I'll just guess that you want to get the folder name of the current file (if that's wrong, please expand your question).
<?php
$str = 'http://example.com/pages/foo/inside.php';
preg_match( '#/([^/]+)/[^/]+$#', $str, $match );
print_r( $match );
?>
If the first part is invariant:
$s = 'http://example.com/pages/foo/inside.php';
preg_match('#^http://example.com/pages/([^/]+).*$#', $s, $matches);
$foo = $matches[1];
The main part is ([^/]+) which matches everything which is not a slash (/). That is, we're matching until finding the next slash or end of the string (if the "foo" part can be the last).
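A small sketch of the same idea that checks the return value of preg_match before reading the captured group, so a URL without a /pages/ segment leaves the variable as null. The variable names are only illustrative:

```php
<?php
$url = 'http://example.com/pages/foo/inside.php';

$foo = null;
// ([^/]+) captures everything up to the next slash (or the end of the string).
if (preg_match('#/pages/([^/]+)#', $url, $matches) === 1) {
    $foo = $matches[1];
}

var_dump($foo); // string(3) "foo"
```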
$str = 'http://example.com/pages/foo/inside.php';
$s=parse_url($str);
$whatiwant=explode("/",$s['path']);
print $whatiwant[2];
