I'm working on a commenting web application and I want to parse user mentions (#user) as links. Here is what I have so far:
$text = "#user is not #user1 but #user3 is #user4";
$pattern = "/\#(\w+)/";
preg_match_all($pattern,$text,$matches);
if($matches){
$sql = "SELECT *
FROM users
WHERE username IN ('" .implode("','",$matches[1]). "')
ORDER BY LENGTH(username) DESC";
$users = $this->getQuery($sql);
foreach($users as $i=>$u){
$text = str_replace("#{$u['username']}",
"<a href='#' class='ct-userLink' rel='{$u['user_id']}'>#{$u['username']}</a> ", $text);
}
echo $text;
}
The problem is that user links are being overlapped:
<a rel="11327" class="ct-userLink" href="#">
<a rel="21327" class="ct-userLink" href="#">#user</a>1
</a>
How can I avoid links overlapping?
Answer Update
Thanks to the accepted answer, this is what my new foreach loop looks like:
foreach($users as $i=>$u){
$text = preg_replace("/#".$u['username']."\b/",
"<a href='#' title='{$u['user_id']}'>#{$u['username']}</a> ", $text);
}
The problem seems to be that some usernames contain other usernames. You first replace #user1 properly with <a>#user1</a>. Then #user matches inside that link and is replaced again, giving <a><a>#user</a>1</a>. My suggestion is to change your string replace to a regex that requires a word boundary, \b, after the username.
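A minimal sketch of the word-boundary idea, using made-up user rows in place of the database result (the `$users` array, its IDs and the `linkMentions` name are assumptions for illustration). `preg_quote` is added so that a username containing regex metacharacters cannot break the pattern.

```php
<?php
// Hypothetical rows, longest username first, standing in for the query result.
$users = [
    ['username' => 'user1', 'user_id' => 2],
    ['username' => 'user',  'user_id' => 1],
];

function linkMentions(string $text, array $users): string
{
    foreach ($users as $u) {
        // \b requires the username to end here, so "#user" no longer matches inside "#user1".
        $pattern = '/#' . preg_quote($u['username'], '/') . '\b/';
        $link = "<a href='#' rel='{$u['user_id']}'>#{$u['username']}</a>";
        $text = preg_replace($pattern, $link, $text);
    }
    return $text;
}

echo linkMentions('#user is not #user1', $users), "\n";
```

Each mention ends up in exactly one link, because the second pattern cannot match the "#user" prefix of the already linked "#user1".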
The Twitter widget has JavaScript code to do this. I ported it to PHP in my WordPress plugin. Here's the relevant part:
function format_tweet($tweet) {
// add #reply links
$tweet_text = preg_replace("/\B[#@]([a-zA-Z0-9_]{1,20})/",
"#<a class='atreply' href='http://twitter.com/$1'>$1</a>",
$tweet);
// make other links clickable
$matches = array();
$link_info = preg_match_all("/\b(((https*\:\/\/)|www\.)[^\"\']+?)(([!?,.\)]+)?(\s|$))/",
$tweet_text, $matches, PREG_SET_ORDER);
if ($link_info) {
foreach ($matches as $match) {
$http = preg_match("/w/", $match[2]) ? 'http://' : '';
$tweet_text = str_replace($match[0],
"<a href='" . $http . $match[1] . "'>" . $match[1] . "</a>" . $match[4],
$tweet_text);
}
}
return $tweet_text;
}
Instead of parsing for '#user', parse for '#user ' (with a trailing space) or ' #user ' (with a space on both sides); the latter also avoids wrongly parsing email addresses (e.g. mailaddress#user.com). Maybe ' #user: ' should be allowed as well. This only works if usernames contain no whitespace.
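One way to express that idea without requiring literal spaces is a lookbehind that rejects a `#` preceded by a word character. This is a sketch under that assumption, not the original poster's code; the sample text is made up.

```php
<?php
// (?<!\w) means: the '#' must not directly follow a letter, digit or underscore,
// so "mailaddress#user.com" is skipped while "#user1:" and "(#user2)" still match.
$pattern = '/(?<!\w)#(\w+)/';

$text = 'mail mailaddress#user.com to #user1: or (#user2)';
preg_match_all($pattern, $text, $matches);

// $matches[1] holds the captured names without the leading '#'.
print_r($matches[1]);
```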
You can write a custom string replace function that stops after the first replacement. Something like:
function str_replace_once($needle , $replace , $haystack){
$pos = strpos($haystack, $needle);
if ($pos === false) {
// Nothing found
return $haystack;
}
return substr_replace($haystack, $replace, $pos, strlen($needle));
}
And use it like:
foreach($users as $i=>$u){
$text = str_replace_once("#{$u['username']}",
"<a href='#' class='ct-userLink' rel='{$u['user_id']}'>#{$u['username']}</a> ", $text);
}
You shouldn’t replace one certain user mention at a time but all at once. You could use preg_split to do that:
// split text at mention while retaining user name
$parts = preg_split("/#(\w+)/", $text, -1, PREG_SPLIT_DELIM_CAPTURE);
$n = count($parts);
// $n is always an odd number; 1 means no match found
if ($n > 1) {
// collect user names
$users = array();
for ($i=1; $i<$n; $i+=2) {
$users[$parts[$i]] = '';
}
// get corresponding user information
$sql = "SELECT *
FROM users
WHERE username IN ('" .implode("','", array_keys($users)). "')";
$users = array();
foreach ($this->getQuery($sql) as $user) {
$users[$user['username']] = $user;
}
// replace mentions
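for ($i=1; $i<$n; $i+=2) {
if (isset($users[$parts[$i]])) {
$u = $users[$parts[$i]];
$parts[$i] = "<a href='#' class='ct-userLink' rel='{$u['user_id']}'>#{$u['username']}</a>";
} else {
// no such user: restore the plain mention, since preg_split dropped the '#'
$parts[$i] = '#' . $parts[$i];
}
}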
// put everything back together
$text = implode('', $parts);
}
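The same "replace everything in one pass" idea can also be sketched with `preg_replace_callback`. The `$users` lookup table and the `linkMentionsOnce` name below are hypothetical stand-ins; in the real code the table would be built from the query result as above.

```php
<?php
// Hypothetical lookup table keyed by username, standing in for the query result.
$users = [
    'user'  => ['username' => 'user',  'user_id' => 1],
    'user1' => ['username' => 'user1', 'user_id' => 2],
];

function linkMentionsOnce(string $text, array $users): string
{
    // Each mention is visited exactly once, so a link can never end up inside another link.
    return preg_replace_callback('/#(\w+)/', function (array $m) use ($users) {
        if (!isset($users[$m[1]])) {
            return $m[0]; // unknown name: leave the text untouched
        }
        $u = $users[$m[1]];
        return "<a href='#' class='ct-userLink' rel='{$u['user_id']}'>#{$u['username']}</a>";
    }, $text);
}

echo linkMentionsOnce('#user is not #user1 but #nobody', $users), "\n";
```

Because `\w+` is greedy, "#user1" is matched as a whole and looked up as "user1", never as "user" followed by "1".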
I like dnl's solution of parsing ' #user', but it may not suit you.
Anyway, did you try using the strip_tags function to remove the anchor tags? That way you get the string without the links, and you can parse it again to rebuild them.
Related
The problem is related to PHP. I have some text in a variable, say $text, and some start and end positions. How do I inject substrings at those specific positions? The problem is that once I insert text at one position, the later position indexes shift, which makes it difficult to put the next substring in the right place. Thanks in advance.
function previewTxt($text,$topicid, $sectionid)
{
$dbConn = new DBConnUtil();
$queryString = "SELECT * from inline_topic_xref where TOPIC_ID=$topicid and SECTION_ID=$sectionid";
$result = $dbConn->run_query($queryString);
$newtext=$text;
$count=0;
while($row = $result->fetch_object())
{
$newtext = replace($newtext, "<a href='#'>" ,$row->REF_TEXT_START);
$newtext = replace($newtext, "</a>" ,$row->REF_TEXT_END);
}
return $newtext;
}
function replace($org_text,$str_rep, $position)
{
$length=strlen($org_text);
$temp1=substr($org_text,0,$position);
$temp2=substr($org_text,$position,$length);
$replaced =$temp1.$str_rep.$temp2;
return $replaced;
}
Have you tried
substr_replace($oldstr, $str_to_insert, $pos, 0)
?
Here is the documentation:
http://php.net/manual/en/function.substr-replace.php
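`substr_replace` alone does not stop the offsets from shifting. One common way around that is to apply the insertions from the highest offset down, so every remaining offset still refers to untouched text. A minimal sketch, assuming the positions are byte offsets into the original string; the function name and the offset => string input shape are made up for illustration.

```php
<?php
/**
 * Insert several strings at byte offsets that refer to the ORIGINAL text.
 * $inserts maps offset => string to insert (hypothetical input shape).
 */
function insertAtPositions(string $text, array $inserts): string
{
    // Work from the highest offset down, so earlier offsets stay valid.
    krsort($inserts);
    foreach ($inserts as $pos => $str) {
        // Length 0 means "insert here" instead of overwriting characters.
        $text = substr_replace($text, $str, $pos, 0);
    }
    return $text;
}

echo insertAtPositions('Hello world', [6 => "<a href='#'>", 11 => '</a>']), "\n";
```

Since the map is keyed by offset, two insertions at the same offset would need a different input shape, for example a list of pairs sorted by offset.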
I need to process a text before showing it on a page of a website. I need to turn all URLs of that same website (not URLs of other websites) into links by wrapping them in an <a> tag. The problem is the href attribute, which has to contain the correct URL. I am trying to go through the whole text and, when I find a URL, check whether it contains the substring "http://". If not, I must add it in the href attribute. I made some attempts, but none of them work yet :( . Any idea how I can do this?
My function is below:
$string = "This is a url from my website: http://www.mysite.com.br and I have a article interesting there, the link is http://www.mysite.com.br/articles/what-is-psychology/205967. I need that the secure url link works too https://www.mysite.com.br/articles/what-is-psychology/205967. the following urls must be valid too: www.mysite.com.br and mysite.com.br";
function urlMySite($string){
$verifyUrl = '';
$urls = array("mysite.com.br");
$text = explode(" ", $string);
$alltext = "";
for($i = 0; $i < count($text); $i++){
foreach ($urls as $value){
$pos = strpos($text[$i], $value);
if (!($pos === false)){
$verifyUrl = " <a href='".$text[$i]."' target='_blank'>".$text[$i]."</a> ";
if (strpos($verifyUrl, 'http://') !== true) {
$verifyUrl = " <a href='http://".$text[$i]."' target='_blank'>".$text[$i]."</a> ";
}
$alltext .= $verifyUrl;
} else {
$alltext .= " ".$text[$i]." ";
}
}
}
return $alltext;
}
You should use preg_match_all to find all occurrences of the URL and replace each of the matches with a clickable link.
You could use this function:
function augmentText($text){
$pattern = "~(https?|file|ftp)://[a-z0-9./&?:=%_-]*~i"; // "-" goes last so it is a literal hyphen, not a range
preg_match_all($pattern, $text, $matches);
if( count($matches[0]) > 0 ){
foreach($matches[0] as $match){
$text = str_replace($match, "<a href='" . $match . "' target='_blank'>" . $match . "</a>", $text);
}
}
return $text;
}
Change the regular expression pattern to match only the URLs you want to make clickable.
Good luck
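For the question as asked (link only URLs of one site, and add "http://" when it is missing), a single-pass version could look like the sketch below. The function name, the exact pattern and the choice to keep trailing punctuation outside the link are assumptions, not part of either post.

```php
<?php
// Sketch: link only URLs on one host, adding "http://" when no scheme is present.
// The host name comes from the question; the function name is made up.
function linkSiteUrls(string $text, string $host = 'mysite.com.br'): string
{
    // The lookbehind keeps the match from starting inside a longer host name or path.
    $pattern = '~(?<![\w./@-])(https?://)?(?:www\.)?' . preg_quote($host, '~') . '\b(?:/[^\s<]*)?~i';

    return preg_replace_callback($pattern, function (array $m) {
        // Keep sentence punctuation such as a trailing "." outside the link.
        $url = rtrim($m[0], '.,;:!?');
        $tail = substr($m[0], strlen($url));
        // $m[1] is the captured scheme; it is empty or missing when the text had none.
        $href = empty($m[1]) ? 'http://' . $url : $url;
        $href = htmlspecialchars($href, ENT_QUOTES);
        $label = htmlspecialchars($url, ENT_QUOTES);
        return "<a href='{$href}' target='_blank'>{$label}</a>{$tail}";
    }, $text);
}

echo linkSiteUrls('See www.mysite.com.br and https://www.mysite.com.br/articles/205967.'), "\n";
```

Because every URL is handled once inside the callback, a URL that appears twice in the text is not wrapped twice, which can happen when `str_replace` is run once per match.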
I have a site crawler which displays a list of urls, but the problem is I cannot for the life of me get the last regex quite right.
All URLs end up listed as:
http://www.website.org/page1.html&--EFTTIUGJ4ITCyh0Frzb_LFXe_eHw
http://website.net/page2/&--EyqBLeFeCkSfmvA7p0cLrsy1Zm1g
http://foobar.website.com/page3.php&--E5WRBxuTOQikDIyBczaVXveOdRFg
The URLs can all be different, and the only thing which seems static is the & symbol.
How would I go about getting rid of the & symbol and everything to the right of it?
Here is what I have tried with the above results:
function getresults($sterm) {
$html = file_get_html($sterm);
$result = "";
// find all h3 tags with class "r"
foreach($html->find('h3[class="r"]') as $ef)
{
$result .= $ef->outertext . '<br>';
}
return $result;
}
function geturl($url) {
$var = $url;
$result = "";
preg_match_all ("/a[\s]+[^>]*?href[\s]?=[\s\"\/url?q=\']+".
"(.*?)[\"\']+.*?>"."([^<]+|.*?)?<\/a>/",
$var, $matches);
$matches = $matches[1];
foreach($matches as $var)
{
$result .= $var."<br>";
}
echo preg_replace('/sa=U.*?usg=.*?AFQjCN/', "--" , $result);
}
If the URLs are ALWAYS in the same format, use explode:
<?php
$tmp = explode("&", "http://foobar.website.com/page3.php&--E5WRBxuTOQikDIyBczaVXveOdRFg");
?>
$tmp[0] should contain "http://foobar.website.com/page3.php" and
$tmp[1] should contain "--E5WRBxuTOQikDIyBczaVXveOdRFg"
A simple way to remove everything after the & character:
$result = substr($result, 0, strpos($result, '&'));
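One caveat with that one-liner: when the string contains no '&', `strpos` returns false and the `substr` call yields an empty string. In the question's code `$result` also holds several URLs joined by `<br>`, so the cut would have to be applied to each URL inside the loop. A guarded sketch (the helper name is made up):

```php
<?php
// Return everything before the first '&', or the whole string when there is no '&'.
function stripAfterAmp(string $url): string
{
    $pos = strpos($url, '&');
    // strpos() returns false when '&' is absent; in that case keep the URL as it is.
    return $pos === false ? $url : substr($url, 0, $pos);
}

echo stripAfterAmp('http://website.net/page2/&--EyqBLeFeCkSfmvA7p0cLrsy1Zm1g'), "\n";
echo stripAfterAmp('http://website.net/page2/'), "\n";
```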
I need some help with Twitter hashtags: I need to extract a certain hashtag as a string variable in PHP.
Until now I have this
$hash = preg_replace ("/#(\\w+)/", "#$1", $tweet_text);
but this just transforms the hashtag string into a link.
Use preg_match() to identify the hash and capture it to a variable, like so:
$string = 'Tweet #hashtag';
preg_match("/#(\\w+)/", $string, $matches);
$hash = $matches[1];
var_dump( $hash); // Outputs 'hashtag'
I think this function will help you:
echo get_hashtags($string);
function get_hashtags($string, $str = 1) {
preg_match_all('/#(\w+)/', $string, $matches);
// $matches[1] holds the tags without the leading '#'
if ($str) {
// return the tags as one comma-separated string
return implode(', ', $matches[1]);
}
// otherwise return the tags as an array
return $matches[1];
}
As I understand it, in the text/paragraph/post you want to show the tag with the hash sign (#), like this: #tag
In the URL you want to drop the # sign, because the part of a URL after # is not sent to the server in the request. I have edited your code accordingly; try this:
$string="www.funnenjoy.com is best #SocialNetworking #website";
$text=preg_replace('/#(\\w+)/','<a href=/hash/$1>$0</a>',$string);
echo $text; // output will be www.funnenjoy.com is best <a href=/hash/SocialNetworking>#SocialNetworking</a> <a href=/hash/website>#website</a>
Extract multiple hashtags into an array:
$body = 'My #name is #Eminem, I am rap #god, #Yoyoya check it #out';
$hashtag_set = [];
$array = explode('#', $body);
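foreach ($array as $key => $row) {
// element 0 is the text before the first '#', not a hashtag
if ($key === 0 || empty($row)) {
continue;
}
$hashtag = explode(' ', $row);
// trim trailing punctuation such as the comma in "#Eminem,"
$hashtag_set[] = '#' . rtrim($hashtag[0], ',.!?');
}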
print_r($hashtag_set);
You can use the preg_match_all() PHP function:
preg_match_all('/(?<!\w)#\w+/', $description, $allMatches);
will give you an array containing only the hashtags (with the # sign), while
preg_match_all('/#(\w+)/', $description, $allMatches);
will give you two arrays: the hashtags with the # sign and the same tags without it.
print_r($allMatches)
You can extract a value from a string with the preg_match function:
preg_match("/#(\w+)/", $tweet_text, $matches);
$hash = $matches[1];
preg_match will store matching results in an array. You should take a look at the doc to see how to play with it.
Here's a non-regex way to do it:
<?php
$tweet = "Foo bar #hashTag hello world";
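$hashPos = strpos($tweet,'#');
$hashTag = '';
// stop at the next space or at the end of the string; skip the loop if there is no '#'
while ($hashPos !== false && $hashPos < strlen($tweet) && $tweet[$hashPos] !== ' ') {
$hashTag .= $tweet[$hashPos++];
}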
echo $hashTag;
Note: This will only pick up the first hashtag in the tweet.
I have the following, simple code:
$text = str_replace($f,'<a href="'.$f.'" target="_blank">'.$u.'</a>',$text);
where $f is a URL, like http://google.ca, and $u is the name of the URL (my function names it 'Google').
My problem is that if I give my function a string like
http://google.ca http://google.ca
it returns
Google" target="_blank">Google</a> Google" target="_blank">Google</a>
Which obviously isn't what I want. I want my function to echo out two separate, clickable links. But str_replace is replacing the first occurrence (it's in a loop to loop through all the found URLs), and that first occurrence has already been replaced.
How can I tell str_replace to ignore that specific one, and move onto the next? The string given is user input, so I can't just give it a static offset or anything with substr, which I have tried.
Thank you!
One way, though it's a bit of a kludge: you can use a temporary marker that (hopefully) won't appear in the string:
$text = str_replace ($f, '<a href="XYZZYPLUGH" target="_blank">' . $u . '</a>',
$text);
That way, the first substitution won't be found again. Then at the end (after you've processed the entire line), simply change the markers back:
$text = str_replace ('XYZZYPLUGH', $f, $text);
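An alternative to a temporary marker is to collect all replacements first and apply them in one `strtr()` call, which does not rescan text it has already substituted. The `$found` list below is a hypothetical stand-in for whatever the asker's loop discovers; escaping with `htmlspecialchars` is an addition, not part of the original code.

```php
<?php
// Hypothetical list of URLs found in the text (duplicates allowed) and their display names.
$found = [
    ['http://google.ca', 'Google'],
    ['http://google.ca', 'Google'],
];

$map = [];
foreach ($found as [$f, $u]) {
    // Keying by URL collapses duplicates, so each distinct URL gets one map entry.
    $map[$f] = '<a href="' . htmlspecialchars($f, ENT_QUOTES) . '" target="_blank">'
             . htmlspecialchars($u, ENT_QUOTES) . '</a>';
}

// strtr() with an array makes a single pass and never rescans text it has already inserted.
$text = strtr('http://google.ca http://google.ca', $map);
echo $text, "\n";
```

Both occurrences become separate links, and the URL inside each new href is left alone.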
Why not pass your function an array of URLs, instead?
function makeLinks(array $urls) {
$links = array();
foreach ($urls as $url) {
list($desc, $href) = $url;
// If $href is based on user input, watch out for "javascript: foo;" and other XSS attacks here.
$links[] = '<a href="' . htmlentities($href) . '" target="_blank">'
. htmlentities($desc)
. '</a>';
}
return $links; // or implode('', $links) if you want a string instead
}
$urls = array(
array('Google', 'http://google.ca'),
array('Google', 'http://google.ca')
);
var_dump(makeLinks($urls));
If I understand your problem correctly, you can just use the function sprintf. I think something like this should work:
function urlize($name, $url)
{
// Make sure the url is formatted ok
if (!filter_var($url, FILTER_VALIDATE_URL))
return '';
$name = htmlspecialchars($name, ENT_QUOTES);
$url = htmlspecialchars($url, ENT_QUOTES);
return sprintf('<a href="%s">%s</a>', $url, $name);
}
echo urlize('my name', 'http://www.domain.com');
// <a href="http://www.domain.com">my name</a>
I haven't tested it though.
I suggest using preg_replace instead of str_replace here, as in this code:
$f = 'http://google.ca';
$u = 'Google';
$text='http://google.ca http://google.ca';
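$regex = '~(?<!<a href=")' . preg_quote($f, '~') . '~'; // negative lookbehind
$text = preg_replace($regex, '<a href="'.$f.'" target="_blank">'.$u.'</a>', $text);
echo $text . "\n";
$text = preg_replace($regex, '<a href="'.$f.'" target="_blank">'.$u.'</a>', $text);
echo $text . "\n";
OUTPUT:
<a href="http://google.ca" target="_blank">Google</a> <a href="http://google.ca" target="_blank">Google</a>
<a href="http://google.ca" target="_blank">Google</a> <a href="http://google.ca" target="_blank">Google</a>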