I am trying to create a mention system, and so far I've converted the #username into a link. I would like to know whether it can also recognise whitespace in names, for example #Marie Lee instead of #MarieLee.
Also, I'm trying to convert the name in the link to lowercase (like profile?id=marielee) while leaving the displayed mention with its original capitalisation, but I haven't been able to.
This is my code so far:
<?php
function convertHashtags($str) {
$regex = '/#+([a-zA-Z0-9_0]+)/';
$str = preg_replace($regex, strtolower('$0'), $str);
return($str);
}
$string = 'I am #Marie Lee, nice to meet you!';
$string = convertHashtags($string);
echo $string;
?>
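You may use this code with preg_replace_callback and an enhanced regex that will match all space-separated words:
define("REGEX", '/#\w+(?:\h+\w+)*/');
function convertHashtags($str) {
return preg_replace_callback(REGEX, function ($m) {
// $m[0] is the whole mention, e.g. "#Marie Lee"; '$0' is not expanded inside a callback, so build the result from $m
$id = strtolower(preg_replace('/\h+/', '', substr($m[0], 1)));
return '<a href="profile?id=' . $id . '">' . $m[0] . '</a>';
}, $str);
}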
If you want to allow only 2 words then you may use:
define("REGEX", '/#\w+(?:\h+\w+)?/');
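To tie the two-word pattern back to the original goal (lowercase id in the link, original capitalisation in the visible text), here is a minimal sketch. The name linkMentions is made up for illustration, the profile?id= URL shape is taken from the question, and building the id by stripping spaces and lowercasing is an assumption about how profile ids are stored.

```php
<?php
// Hypothetical helper: turns "#First Last" mentions into profile links.
function linkMentions(string $str): string
{
    // '#', one word, then optionally a second word after horizontal whitespace
    $regex = '/#(\w+(?:\h+\w+)?)/';

    return preg_replace_callback($regex, function (array $m): string {
        // Assumption: the profile id is the name without spaces, lowercased
        $id = strtolower(preg_replace('/\h+/', '', $m[1]));
        // $m[0] keeps the mention exactly as it was typed
        return '<a href="profile?id=' . $id . '">' . htmlspecialchars($m[0]) . '</a>';
    }, $str);
}

echo linkMentions('I am #Marie Lee, nice to meet you!');
// I am <a href="profile?id=marielee">#Marie Lee</a>, nice to meet you!
```

Keep in mind that the pattern cannot tell a surname from an ordinary word: in "#Marie is here" it would treat "Marie is" as the name.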
You can extract usernames made of alphanumeric characters, underscores and spaces, and nothing else. The pattern requires at least one word character right after the # before any spaces are allowed, so a lone # followed by a space is not matched. It is meant for usernames of at most two space-separated words and extracts them correctly only when the username is followed by a non-word character other than a space (a comma, for example); otherwise the match keeps consuming the words that follow.
<?php
function convertHashtags($str) {
$regex = '/#([a-zA-Z0-9_]+[\sa-zA-Z0-9_]*)/';
if(preg_match($regex,$str,$matches) === 1){
list($username,$name) = [$matches[0] , strtolower(str_replace(' ','',$matches[1]))];
return "<a href='profile?id=$name'>$username</a>";
}
throw new Exception('Unable to find username in the given string');
}
$string = 'I am #Marie Lee, nice to meet you!';
$string = convertHashtags($string);
echo $string;
Demo: https://3v4l.org/e2S8C
If you want the text to appear as is in the innerHTML of the anchor tag, you need to change
list($username,$name) = [$matches[0] , strtolower(str_replace(' ','',$matches[1]))];
to
list($username,$name) = [$str , strtolower(str_replace(' ','',$matches[1]))];
Demo: https://3v4l.org/dCQ4S
Related
I found this solution on stackoverflow for getting the first word from a sentence.
$myvalue = 'Test me more';
$arr = explode(' ',trim($myvalue));
echo $arr[0]; // will print Test
However, this case takes ' ' (a space) as the divider. Does anyone know how to get the first word from a string if you do not know what the divider is? It can be ' ' (space), '.' (full stop) or ',' (comma). Basically, how do you take anything that is a letter from a string up to the point where there is no letter?
E.g.:
'House, rest of sentence here' would give 'House'
'House.' would also give 'House'
'House thing' would also give 'House'
Thanks!
There is a string function (strtok) which can be used to split a string into smaller strings (tokens) based on some separator(s). For the purposes of this thread, the first word (defined as anything before the first space character) of Test me more can be obtained by tokenizing the string on the space character.
<?php
$value = "Test me more";
echo strtok($value, " "); // Test
?>
For more details and examples, see the strtok PHP manual page.
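For the case in the question, where the divider may be a space, a full stop or a comma, strtok can be given all three at once. A minimal sketch; the helper name firstToken is made up for illustration.

```php
<?php
// Hypothetical helper: returns the text before the first space, full stop or comma.
function firstToken(string $value)
{
    // Every character in the second argument acts as a delimiter on its own
    return strtok($value, " .,");
}

echo firstToken('House, rest of sentence here'), "\n"; // House
echo firstToken('House.'), "\n";                        // House
echo firstToken('House thing'), "\n";                   // House
```

Note that strtok returns false when the string contains no token at all (for example, an empty string).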
preg_split is what you're looking for.
$str = "bla1 bla2,bla3";
$words = preg_split("/[\s,]+/", $str);
This snippet splits the $str by space, \t, comma, \n.
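If only the first word is needed, the same idea can be limited to a single split. This is a sketch rather than the answerer's code: the helper name firstWordBySplit is hypothetical, and adding the full stop to the separator list is my assumption based on the question.

```php
<?php
// Hypothetical helper: returns the first word, splitting on whitespace, commas and full stops.
function firstWordBySplit(string $str): string
{
    // Limit 2: at most two pieces are returned; empty pieces are dropped
    $parts = preg_split('/[\s,.]+/', $str, 2, PREG_SPLIT_NO_EMPTY);
    return $parts[0] ?? '';
}

echo firstWordBySplit('bla1 bla2,bla3'), "\n";               // bla1
echo firstWordBySplit('House, rest of sentence here'), "\n"; // House
```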
Use the preg_match() function with a regular expression:
if (preg_match('/^\w+/', 'Your text here', $matches) > 0) { // \w+ rather than \w*, so that an empty match does not count as a word
echo $matches[0]; // $matches[0] will contain the first word of your sentence
} else {
// no match found
}
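Since the question asks for "anything that is a letter", a variant that matches letters only (no digits or underscores, which \w would accept) may be closer to what is wanted. A minimal sketch; the function name leadingLetters is made up, and the use of \p{L} with the u modifier assumes the input is UTF-8.

```php
<?php
// Hypothetical helper: returns the run of letters at the start of the string.
function leadingLetters(string $str): string
{
    // \p{L} matches any Unicode letter; the u modifier treats pattern and subject as UTF-8
    if (preg_match('/^\p{L}+/u', $str, $matches) === 1) {
        return $matches[0];
    }
    return ''; // the string does not start with a letter
}

echo leadingLetters('House, rest of sentence here'), "\n"; // House
echo leadingLetters('House.'), "\n";                        // House
```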
For example, if my sentence is $sent = 'how are you'; and if I search for $key = 'ho' using strstr($sent, $key) it will return true because my sentence has ho in it.
What I'm looking for is a way to return true if I only search for how, are or you. How can I do this?
You can use the function preg_match with a regex that uses word boundaries:
if(preg_match('/\byou\b/', $input)) {
echo $input.' has the word you';
}
If you want to check for multiple words in the same string, and you're dealing with large strings, then this is faster:
$text = explode(' ',$text);
$text = array_flip($text);
Then you can check for words with:
if (isset($text[$word])) doSomething();
Each check is then a hash-table lookup rather than a scan of the text, which makes it very fast.
For checking a couple of words in short strings, though, use preg_match.
UPDATE:
If you're actually going to use this I suggest you implement it like this to avoid problems:
$text = preg_replace('/[^a-z\s]/', '', strtolower($text));
$text = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY); // -1 means no limit; passing NULL here is deprecated since PHP 8.1
$text = array_flip($text);
$word = strtolower($word);
if (isset($text[$word])) doSomething();
Then double spaces, linebreaks, punctuation and capitals won't produce false negatives.
This method is much faster in checking for multiple words in large strings (i.e. entire documents of text), but it is more efficient to use preg_match if all you want to do is find if a single word exists in a normal size string.
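Put together, the lookup-table approach could look like the sketch below. The helper name buildWordIndex is made up for illustration, and stripping everything except a-z and whitespace assumes plain English text.

```php
<?php
// Hypothetical helper: builds a lookup table whose keys are the lowercased words of $text.
function buildWordIndex(string $text): array
{
    // Keep only ASCII letters and whitespace (assumption: English text)
    $clean = preg_replace('/[^a-z\s]/', '', strtolower($text));
    $words = preg_split('/\s+/', $clean, -1, PREG_SPLIT_NO_EMPTY);
    // Words become array keys, so isset() can test for them directly
    return array_flip($words);
}

$index = buildWordIndex("How are you?\nFine,  thanks.");

foreach (['how', 'ho', 'thanks'] as $word) {
    echo $word, ': ', isset($index[strtolower($word)]) ? 'yes' : 'no', "\n";
}
// how: yes
// ho: no
// thanks: yes
```

The index is built once and can then be queried for as many words as needed.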
One thing you can do is break up your sentence by spaces into an array.
Firstly, you would need to remove any unwanted punctuation marks.
The following code removes anything that isn't a letter, number, or space:
$sent = preg_replace("/[^a-zA-Z 0-9]+/", " ", $sent);
Now, all you have are the words, separated by spaces. To create an array that splits by space...
$sent_split = explode(" ", $sent);
Finally, you can do your check. Here are all the steps combined.
// The information you give
$sent = 'how are you';
$key = 'ho';
// Isolate only words and spaces
$sent = preg_replace("/[^a-zA-Z 0-9]+/", " ", $sent);
$sent_split = explode(" ", $sent);
// Do the check
if (in_array($key, $sent_split)) // search the array of words, not the original string
{
echo "Word found";
}
else
{
echo "Word not found";
}
// Outputs: Word not found
// because 'ho' isn't a word in 'how are you'
@codaddict's answer is technically correct, but if the word you are searching for is provided by the user, you need to escape any characters with special regular-expression meaning in the search word. For example:
$searchWord = $_GET['search'];
$searchWord = preg_quote($searchWord, '/'); // also escape the "/" delimiter
if (preg_match("/\b$searchWord\b/", $input)) {
echo "$input has the word $searchWord";
}
With credit to Abhi's answer, a couple of suggestions:
I added /i to the regex, since words in a sentence are probably meant to be matched case-insensitively
I added an explicit === 1 to the comparison, based on the documented preg_match return values
$needle = preg_quote($needle, '/');
return preg_match("/\b$needle\b/i", $haystack) === 1;
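As a complete function, the suggestions above could be sketched like this. The name containsWord is made up for illustration; passing '/' as the second argument to preg_quote makes sure the pattern delimiter is escaped as well.

```php
<?php
// Hypothetical helper: true if $needle occurs in $haystack as a whole word.
function containsWord(string $haystack, string $needle): bool
{
    // Escape regex metacharacters and the "/" delimiter in the user-supplied word
    $quoted = preg_quote($needle, '/');
    return preg_match('/\b' . $quoted . '\b/i', $haystack) === 1;
}

var_dump(containsWord('how are you', 'ho'));   // bool(false)
var_dump(containsWord('How are you', 'how'));  // bool(true)
var_dump(containsWord('see axb here', 'a.b')); // bool(false): the dot is matched literally
```

One caveat: \b only works as expected when the search word starts and ends with a word character.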
I'm attempting to create a bad word filter in PHP that will analyze the word and match against an array of known bad words, but keep the first letter of the word and replace the rest with asterisks. Example:
fook would become f***
shoot would become s****
The only part I don't know is how to keep the first letter in the string, and how to replace the remaining letters with something else while keeping the same string length.
$string = preg_replace("/\b(". $word .")\b/i", "***", $string);
Thanks!
$string = 'fook would become';
$word = 'fook';
$string = preg_replace("~\b". preg_quote($word, '~') ."\b~i", $word[0] . str_repeat('*', strlen($word) - 1), $string);
var_dump($string);
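// Capture the first letter so it can be kept, and replace the remaining letters with asterisks
$string = preg_replace("/\b(" . preg_quote($word[0], '/') . ")" . preg_quote(substr($word, 1), '/') . "\b/i", '$1' . str_repeat('*', strlen($word) - 1), $string);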
This can be done in many ways, with very weird auto-generated regexps...
But I believe using preg_replace_callback() would end up being more robust
<?php
# as already pointed out, your words *may* need sanitization
foreach($words as $k=>$v)
$words[$k]=preg_quote($v,'/');
# and to be collapsed into a **big regexpy goodness**
$words=implode('|',$words);
# after that, a single preg_replace_callback() would do
$string = preg_replace_callback('/\b('. $words .')\b/i', "my_beloved_callback", $string);
function my_beloved_callback($m)
{
$len=strlen($m[1])-1;
return $m[1][0].str_repeat('*',$len);
}
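For reference, here is a self-contained sketch of the same callback approach with the word list and input filled in. The function name maskWords and the sample sentence are made up for illustration; it assumes a non-empty list of single-byte (ASCII) words, since strlen and $m[1][0] work on bytes.

```php
<?php
// Hypothetical helper: masks every listed word, keeping its first character.
function maskWords(string $text, array $badWords): string
{
    // Escape each word, then join them into one alternation
    $alternatives = implode('|', array_map(function ($w) {
        return preg_quote($w, '/');
    }, $badWords));

    return preg_replace_callback('/\b(' . $alternatives . ')\b/i', function (array $m): string {
        // First character as typed, then one asterisk per remaining character
        return $m[1][0] . str_repeat('*', strlen($m[1]) - 1);
    }, $text);
}

echo maskWords('Fook, I said shoot!', ['fook', 'shoot']); // F***, I said s****!
```

Because the replacement is built from the matched text rather than from the list entry, the capitalisation of the first letter is preserved.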
Here is a Unicode-friendly regular expression for PHP:
function lowercase_except_first_letter($s) {
// Lowercase every word character that is NOT the first letter of a word.
// The lookbehind skips characters at the start of the string or after whitespace or any non-word character, so first letters inside quotes and brackets are kept too.
return preg_replace_callback('/(?<!^|\s|\W)(\w)/u', function($m) {
return mb_strtolower($m[1]);
}, $s);
}
I would like to replace just complete words using php
Example :
If I have
$text = "Hello hello1 hello, Helloz";
and I use
$newtext = str_replace("Hello",'NEW',$text);
The new text should look like
NEW hello1 hello, Helloz
PHP returns
NEW hello1 hello, NEWz
Thanks.
You want to use regular expressions. The \b matches a word boundary.
$text = preg_replace('/\bHello\b/', 'NEW', $text);
If $text contains UTF-8 text, you'll have to add the Unicode modifier "u", so that non-latin characters are not misinterpreted as word boundaries:
$text = preg_replace('/\bHello\b/u', 'NEW', $text);
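If the search word or the replacement comes from a variable, both need some care: the word should be escaped with preg_quote, and the replacement must not be interpreted as a backreference. A minimal sketch, with the made-up name replaceWholeWord; it uses preg_replace_callback so that the replacement is inserted verbatim.

```php
<?php
// Hypothetical helper: replaces $search only where it appears as a complete word.
function replaceWholeWord(string $text, string $search, string $replacement): string
{
    $pattern = '/\b' . preg_quote($search, '/') . '\b/u';
    // A callback returns the replacement verbatim, so "$1" or "\\" in it is not interpreted
    return preg_replace_callback($pattern, function () use ($replacement) {
        return $replacement;
    }, $text);
}

echo replaceWholeWord('Hello hello1 hello, Helloz', 'Hello', 'NEW'); // NEW hello1 hello, Helloz
```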
Multiple words in a string can be replaced like this:
$String = 'Team Members are committed to delivering quality service for all buyers and sellers.';
echo $String;
echo "<br>";
$String = preg_replace(array('/\bTeam\b/','/\bfor\b/','/\ball\b/'),array('Our','to','both'),$String);
echo $String;
Result: Our Members are committed to delivering quality service to both buyers and sellers.
Array replacement list: In case your replacement strings are substituting each other, you need preg_replace_callback.
$pairs = ["one"=>"two", "two"=>"three", "three"=>"one"];
$r = preg_replace_callback(
"/\w+/", # only match whole words
function($m) use ($pairs) {
if (isset($pairs[$m[0]])) { # optional: strtolower
return $pairs[$m[0]];
}
else {
return $m[0]; # keep unreplaced
}
},
$source
);
For efficiency, /\w+/ could be replaced with a key list such as /\b(one|two|three)\b/i.
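A sketch of that key-list variant, generating the pattern from the array keys instead of typing it by hand. The function name swapWords is made up for illustration, and the keys of $pairs are assumed to be lowercase so that the case-insensitive match can be looked up after strtolower.

```php
<?php
// Hypothetical helper: swaps whole words according to $pairs in a single pass.
function swapWords(string $source, array $pairs): string
{
    // Build /\b(one|two|three)\b/i from the array keys
    $keys = array_map(function ($k) {
        return preg_quote($k, '/');
    }, array_keys($pairs));
    $pattern = '/\b(' . implode('|', $keys) . ')\b/i';

    return preg_replace_callback($pattern, function (array $m) use ($pairs): string {
        // Keys are assumed to be lowercase, so normalise the match before the lookup
        return $pairs[strtolower($m[1])];
    }, $source);
}

echo swapWords('one two three four', ['one' => 'two', 'two' => 'three', 'three' => 'one']);
// two three one four
```

Because every match is replaced in one pass, a word that has just been substituted is not substituted again.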
You can also use the T-Regx library, which quotes $ and \ characters in the replacement for you:
<?php
$text = pattern('\bHello\b')->replace($text)->all()->with('NEW');
OK, I have a string:
'Hello^<php>World&*124><
i ju*st press enteR'
How do I turn it into the following (a function would be better)?
'Hello World123
i just press enter'
It should allow:
numbers
text
spaces, newlines, etc.
How do I do that with a regex? Do I have to use a regex, or is there another way?
Thanks
Adam Ramadhan
You can do this:
function removeBad($str)
{
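// Inside a character class, ( | ) are literal characters, so the allowed characters are listed directly
return preg_replace("/[^a-zA-Z0-9_ \r\n]+/", "", $str);
}
This will remove anything other than letters, digits, underscores, spaces and newlines.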
If you also want to remove any tags such as <php> in your text, you could do:
function removeBad($str)
{
$str = strip_tags($str);
return preg_replace("/[^a-zA-Z0-9_ \r\n]+/", "", $str); // ( | ) would be literal inside the class, so they are left out
}
Usage:
$str = removeBad('Hello^<php>World&*124><');
echo $str;
Result:
HelloWorld124
$str = removeBad('i ju*st press EnteR');
echo $str;
Result:
i just press EnteR
Regex substitution can do this for you. I think that you need two. The first one to remove everything between the < and > characters. The second one to remove any character that is NOT in your allowed character set. That is the safest way to do it.
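The two substitutions described above could be sketched as follows. The function name cleanText is made up, and the allowed character set (letters, digits, spaces, line breaks) is my reading of the question rather than something this answer specifies.

```php
<?php
// Hypothetical helper: strips tag-like text, then every character outside the allowed set.
function cleanText(string $str): string
{
    // Step 1: drop anything that looks like a tag, i.e. "<" up to the next ">"
    $str = preg_replace('/<[^>]*>/', '', $str);
    // Step 2: drop every character that is not a letter, digit, space or line break
    return preg_replace('/[^a-zA-Z0-9 \r\n]+/', '', $str);
}

echo cleanText("Hello^<php>World&*124><\ni ju*st press enteR");
// HelloWorld124
// i just press enteR
```

A stray "<" or ">" that is not part of a tag survives the first step and is removed by the second.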