PHP mention system for usernames containing spaces

I want to know whether it's possible to build a PHP mention system for usernames that contain spaces.
I tried this:
preg_replace_callback('~#([a-zA-Z0-9]+)~', 'mentionUser', htmlspecialchars_decode($r['content']));
My function:
function mentionUser($matches) {
    global $db;
    // Look up the captured name (without the leading #)
    $req = $db->prepare('SELECT id FROM members WHERE username = ?');
    $req->execute(array($matches[1]));
    if ($req->rowCount() == 1) {
        $idUser = $req->fetch()['id'];
        return '<a class="mention" href="members/profile.php?id='.$idUser.'">'.$matches[0].'</a>';
    }
    return $matches[0];
}
It works, but not for usernames with spaces.
I tried adding \s to the pattern. It matches, but not well: preg_replace_callback captures the username together with the following words of the message, so the lookup fails and the mention doesn't appear.
Is there any solution?
Thanks!

I know you said that you just removed the ability to add a space, but I still wanted to post a solution. To be clear, I don't necessarily think you should use this code, because it probably is just easier to keep things simple, but I think it should work still.
Your major problem is that almost every mention will incur two lookups, because "#bob johnson went to the store" could refer to either bob or bob johnson, and there's no way to determine that without going to the database. Luckily, caching will greatly reduce this problem.
Below is some code that generally does what you are looking for. I made a fake database using just an array for clarity and reproducibility. The inline code comments should hopefully make sense.
function mentionUser($matches)
{
    // This is our "database" of users
    $users = [
        'bob johnson',
        'edward',
    ];
    // First, grab the full match which might be 'name' or 'name name'
    $fullMatch = $matches['username'];
    // Create a search array where the key is the search term and the value is whether or not
    // the search term is a subset of the value found in the regex
    $names = [$fullMatch => false];
    // Next split on the space. If there isn't one, we'll have an array with just a single item
    $maybeTwoParts = explode(' ', $fullMatch);
    // Basically, if the string contained a space, also search only for the first item before the space,
    // and flag that we're using a subset
    if (count($maybeTwoParts) > 1) {
        $names[array_shift($maybeTwoParts)] = true;
    }
    foreach ($names as $name => $isSubset) {
        // Search our "database"
        if (in_array($name, $users, true)) {
            // If it was found, wrap in HTML
            $ret = '<span>#' . $name . '</span>';
            // If we're in a subset, we need to append back on the remaining string, joined with a space
            if ($isSubset) {
                $ret .= ' ' . array_shift($maybeTwoParts);
            }
            return $ret;
        }
    }
    // Nothing was found, return what was passed in
    return '#' . $fullMatch;
}
// Our search pattern with an explicitly named capture.
// The delimiter is ~ because # is a literal character in the pattern.
$pattern = '~#(?<username>\w+(?:\s\w+)?)~';
// Three tests
assert('hello <span>#bob johnson</span> test' === preg_replace_callback($pattern, 'mentionUser', 'hello #bob johnson test'));
assert('hello <span>#edward</span> test' === preg_replace_callback($pattern, 'mentionUser', 'hello #edward test'));
assert('hello #sally smith test' === preg_replace_callback($pattern, 'mentionUser', 'hello #sally smith test'));
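The caching mentioned above could look roughly like this. It is only a sketch: cachedUserId and the $lookup callable are hypothetical names, and the callable would wrap whatever database query you actually use.

```php
<?php
// Hypothetical helper: remember the result of each username lookup for the
// duration of the request, so "bob" and "bob johnson" each hit the database once.
function cachedUserId(string $username, callable $lookup): ?int
{
    static $cache = [];
    if (!array_key_exists($username, $cache)) {
        // $lookup is assumed to return the user id, or null if no such user exists
        $cache[$username] = $lookup($username);
    }
    return $cache[$username];
}

// Usage with a fake lookup that counts how often it is called
$calls = 0;
$fakeLookup = function (string $name) use (&$calls): ?int {
    $calls++;
    return $name === 'bob johnson' ? 42 : null;
};

cachedUserId('bob johnson', $fakeLookup);
cachedUserId('bob johnson', $fakeLookup); // served from the cache
cachedUserId('bob', $fakeLookup);
echo $calls; // number of real lookups performed
```

Negative results (null) are cached as well, which matters here because the shorter or longer candidate name usually does not exist.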

Try this RegEx:
/#[a-zA-Z0-9]+( *[a-zA-Z0-9]+)*/g
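It first matches the # sign, followed by one or more letters or digits. It then matches zero or more further groups, each made of optional spaces followed by one or more letters or digits. Note that the trailing g is JavaScript-style notation; in PHP, drop it and use preg_match_all or preg_replace_callback, which already process every match.
I am assuming the username only contains A-Za-z0-9 and spaces.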
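In PHP the pattern would be used without the g flag. A quick sketch (the sample sentence is made up) also shows a caveat: the pattern keeps consuming the words that follow the name, so the match still has to be checked against the user list.

```php
<?php
// Same pattern as above, minus the JavaScript-style g flag
$pattern = '/#[a-zA-Z0-9]+( *[a-zA-Z0-9]+)*/';

preg_match_all($pattern, 'hi #bob johnson went home', $m);

// $m[0] holds the full matches; here the single match runs to the end of the sentence
print_r($m[0]);
```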

Related

How can I str_replace partially in PHP in a dynamic string with unknown key content

Working in WordPress (PHP). I want to set strings to the database like below. The string is translatable, so it could be in any language keeping the template codes. For the possible variations, I presented 4 strings here:
<?php
$string = '%%AUTHOR%% changed status to %%STATUS_new%%';
$string = '%%AUTHOR%% changed status to %%STATUS_oldie%%';
$string = '%%AUTHOR%% changed priority to %%PRIORITY_high%%';
$string = '%%AUTHOR%% changed priority to %%PRIORITY_low%%';
To make the string human-readable, for the %%AUTHOR%% part I can change the string like below:
<?php
$username = 'Illigil Liosous'; // could be any unicode string
$content = str_replace('%%AUTHOR%%', $username, $string);
But for status and priority, I have different substrings of different lengths.
Question is:
How can I make those dynamic substring be replaced on-the-fly so that they could be human-readable like:
Illigil Liosous changed status to Newendotobulous;
Illigil Liosous changed status to Oldisticabulous;
Illigil Liosous changed priority to Highlistacolisticosso;
Illigil Liosous changed priority to Lowisdulousiannosso;
Those unsoundable words are to let you understand the nature of a translatable string, that could be anything other than known words.
I think I can proceed with something like below:
<?php
if( strpos($_content, '%%STATUS_') !== false ) {
// proceed to push the translatable status string
}
if( strpos($_content, '%%PRIORITY_') !== false ) {
// proceed to push the translatable priority string
}
But how can I fill inside those conditionals efficiently?
Edit
I might not have been fully clear with my question, hence this update. The issue is not related to using str_replace with arrays.
The issue is, the $string that I need to detect is not predefined. It would come like below:
if ($status_changed) :
    $string = "%%AUTHOR%% changed status to %%STATUS_{$status}%%";
elseif ($priority_changed) :
    $string = "%%AUTHOR%% changed priority to %%PRIORITY_{$priority}%%";
endif;
Where they will be filled dynamically with values in the $status and $priority.
So when it comes to str_replace() I will actually use functions to get their appropriate labels:
<?php
function human_readable($codified_string, $user_id) {
if( strpos($codified_string, '%%STATUS_') !== false ) {
// need a way to get the $status extracted from the $codified_string
// $_got_status = ???? // I don't know how.
get_status_label($_got_status);
// the status label replacement would take place here, I don't know how.
}
if( strpos($codified_string, '%%PRIORITY_') !== false ) {
// need a way to get the $priority extracted from the $codified_string
// $_got_priority = ???? // I don't know how.
get_priority_label($_got_priority);
// the priority label replacement would take place here, I don't know how.
}
// Author name replacement takes place now
$username = get_the_username($user_id);
$human_readable_string = str_replace('%%AUTHOR%%', $username, $codified_string);
return $human_readable_string;
}
The function has some missing points where I currently am stuck. :(
Can you guide me a way out?
It sounds like you need to use RegEx for this solution.
You can use the following code snippet to get the effect you want to achieve:
if (preg_match('/%%PRIORITY_(.*?)%%/', $codified_string, $matches)) {
    // $matches[0] is the whole tag, $matches[1] the captured key (e.g. "high")
    $replace = get_priority_label($matches[1]);
    $human_readable_string = str_replace($matches[0], $replace, $codified_string);
}
Of course, the above code needs to be changed for STATUS and any other replacements that you require.
A short explanation of the RegEx:
/
The opening delimiter of the regular expression.
%%PRIORITY_
A literal match of those characters.
(
Opens the capture group. Its content is stored in the third parameter of preg_match ($matches[1]).
.
Matches any character that isn't a newline.
*?
Matches the preceding token (here, any character) zero or more times. The ? makes the match lazy, which is needed because . would otherwise also consume the closing %%.
Check out the RegEx in action: https://regex101.com/r/qztLue/1
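Both tag types can also be handled in one pass with preg_replace_callback. The following is a sketch under assumptions: render_codified_string and the $labels table are hypothetical stand-ins for the asker's get_status_label() and get_priority_label(), and the labels are the sample words from the question.

```php
<?php
// Hypothetical label tables; in the question these values would come from
// get_status_label() and get_priority_label().
$labels = [
    'STATUS'   => ['new' => 'Newendotobulous', 'oldie' => 'Oldisticabulous'],
    'PRIORITY' => ['high' => 'Highlistacolisticosso', 'low' => 'Lowisdulousiannosso'],
];

function render_codified_string(string $codified, string $username, array $labels): string
{
    // One pattern covers both tag types: group 1 is the type, group 2 the key.
    $text = preg_replace_callback(
        '/%%(STATUS|PRIORITY)_(.*?)%%/',
        function (array $m) use ($labels) {
            // Fall back to the raw tag when no label is known.
            return $labels[$m[1]][$m[2]] ?? $m[0];
        },
        $codified
    );
    return str_replace('%%AUTHOR%%', $username, $text);
}

echo render_codified_string(
    '%%AUTHOR%% changed status to %%STATUS_new%%',
    'Illigil Liosous',
    $labels
);
```

Because the callback receives the captured key directly, no strpos checks are needed, and unknown keys leave the original tag in place instead of failing.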

How to detect usernames in a comment using preg_match?

I am trying to design a comment/reply system like the one on Stack Overflow, where a user receives a notification when #username is mentioned in a comment.
As an example take the comment
$comment="hello #myname and #my-name and #my+name and #my%name and #my&name and #my_name and #my name #my/name and #3535 and #12";
the problem is my code
if(preg_match('~#([^\s]+)~', $comment, $matches)){
print_r($matches);
}
only finds the username #myname. Is there a way to fix this so that it detects all usernames?
Also, which of the usernames in the comment above are valid usernames on Stack Overflow? For example, are my-name and my%name valid, and are they detected when they are mentioned in a Stack Overflow comment?
Finally, is it possible to replace every valid username in my comment example by <strong>username</strong>?
The problem with your code is that preg_match() stops at the first match and returns 1 or 0 without scanning the rest of the string, so it never reaches the later usernames.
Instead, you can split the comment into words and test each word in a loop.
This code should get it done!
$comment="hello #myname and #my-name and #my+name and #my%name and #my&name and #my_name and #my name #my/name and #3535 and #12";
$comment_arr = explode(' ', $comment);
// echo '<pre>';
// print_r($comment_arr);
// echo '</pre>';
$usernames = [];
$new_comment_arr = [];
for ($i=0; $i < count($comment_arr) ; $i++)
{
if( preg_match('/^#(.*)/', $comment_arr[$i]) )
{
array_push($usernames, $comment_arr[$i]); // push the usernames
array_push($new_comment_arr, '<strong>'.$comment_arr[$i].'</strong>'); // push the usernames with '<strong>' wrapped around in the new comments array
}
else
array_push($new_comment_arr, $comment_arr[$i]); // push the unmatched words(other words) in the new comments array
}
echo '<pre>';
print_r($new_comment_arr);
print_r($usernames);
echo '</pre>';
$new_comment = implode(' ', $new_comment_arr); // implode the new array
echo $new_comment; // the new comment with '<strong>' wrapped around the usernames
The username #my name shouldn't be allowed.
If the username ever appears in a URL, the space gets encoded and the name becomes my%20name.
Also, do not allow '/' in a username: if you rewrite URLs, it will be treated as a path separator and can lead to 404 errors.
As far as I'm concerned, you should allow only letters, numbers and underscores( '_' ) in a username.
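If usernames are restricted to letters, digits and underscores as suggested, the explode loop can be replaced by preg_match_all and preg_replace. A minimal sketch, using a made-up sample comment:

```php
<?php
$comment = 'hello #alice and #bob_99, see #carol!';

// \w matches letters, digits and underscores, so the match stops at the
// first character that is not allowed in a username.
preg_match_all('/#(\w+)/', $comment, $matches);
$usernames = $matches[1]; // names without the leading #

// Wrap every mention in <strong> in one call.
$highlighted = preg_replace('/#(\w+)/', '<strong>#$1</strong>', $comment);

print_r($usernames);
echo $highlighted;
```

Unlike the word-by-word loop, this also handles punctuation that directly follows a name, such as the comma after #bob_99.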
Why not try a social plugin for comments?
I suggest using the Facebook comments plugin.
For more details, see http://developers.facebook.com/docs/plugins/comments
I think you should collect all the names that start with the # character in an array. You can then use that array for whatever you need, for example looping over it to send a notification to each user.
I have written some code that does this.
<?php
$comment="hello #myname and #my-name and #my+name and #my%name and #my&name and #my_name and #my name #my/name and #3535 and #12";
$keywords = preg_split("/[\s]+/", $comment);
foreach($keywords as $row=>$value){
if(preg_match("/^#/",$value)==0){
unset($keywords[$row]);
}
}
print_r($keywords);
?>

PHP performant search a text for given usernames

I am currently dealing with a performance issue where I cannot find a way to fix it. I want to search a text for usernames mentioned with the # sign in front. The list of usernames is available as PHP array.
The problem is usernames may contain spaces or other special characters. There is no limitation for it. So I can't find a regex dealing with that.
Currently I am using a function which takes the whole line after the # and checks character by character which usernames could still match the mention, until just one username is left that fully matches it. But for a long text with 5 mentions it takes several seconds (!!!) to finish. For more than 20 mentions the script runs endlessly.
I have some ideas, but I don't know if they may work.
Going through username list (could be >1.000 names or more) and search for all #Username without regex, just string search. I would say this would be far more inefficient.
Checking with JavaScript, while the user is writing, whether the username contains a space or punctuation mark, and if so surrounding it with quotation marks, like #"User Name". I don't like that idea; it looks dirty to the user.
Don't start with one character, but maybe 4. and if no match, go back. So same principle like on sorting algorithms. Divide and Conquer. Could be difficult to implement and will maybe lead to nothing.
How does Facebook or twitter and any other site do this? Are they parsing the text directly while typing and saving the mentioned usernames directly in the stored text of the message?
This is my current function:
$regular_expression_match = '~(?:^|\\s)#(.+?)(?:\n|$)~'; // ~ as delimiter, because # is a literal character in the pattern
$matches = false;
$offset = 0;
while (preg_match($regular_expression_match, $post_text, $matches, PREG_OFFSET_CAPTURE, $offset))
{
$line = $matches[1][0];
$search_string = substr($line, 0, 1);
$filtered_usernames = array_keys($user_list);
$matched_username = false;
// Loop, make the search string one by one char longer and see if we have still usernames matching
while (count($filtered_usernames) > 1)
{
$filtered_usernames = array_filter($filtered_usernames, function ($username_clean) use ($search_string, &$matched_username) {
$search_string = utf8_clean_string($search_string);
if (strlen($username_clean) == strlen($search_string))
{
if ($username_clean == $search_string)
{
$matched_username = $username_clean;
}
return false;
}
return (substr($username_clean, 0, strlen($search_string)) == $search_string);
});
if ($search_string == $line)
{
// We have reached the end of the line, so stop
break;
}
$search_string = substr($line, 0, strlen($search_string) + 1);
}
// If there is still one in filter, we check if it is matching
$first_username = reset($filtered_usernames);
if (count($filtered_usernames) == 1 && utf8_clean_string(substr($line, 0, strlen($first_username))) == $first_username)
{
$matched_username = $first_username;
}
// We can assume that $matched_username is the longest matching username we have found due to iteration with growing search_string
// So we use it now as the only match (Even if there are maybe shorter usernames matching too. But this is nothing we can solve here,
// This needs to be handled by the user, honestly. There is a autocomplete popup which tells the other, longer fitting name if the user is still typing,
// and if he continues to enter the full name, I think it is okay to choose the longer name as the chosen one.)
if ($matched_username)
{
$startpos = $matches[1][1];
// We need to get the endpos, cause the username is cleaned and the real string might be longer
$full_username = substr($post_text, $startpos, strlen($matched_username));
while (utf8_clean_string($full_username) != $matched_username)
{
$full_username = substr($post_text, $startpos, strlen($full_username) + 1);
}
$length = strlen($full_username);
$user_data = $user_list[$matched_username];
$mentioned[] = array_merge($user_data, array(
'type' => self::MENTION_AT,
'start' => $startpos,
'length' => $length,
));
}
$offset = $matches[0][1] + strlen($search_string);
}
Which way would you go? The problem is the text will be displayed often and parsing it every time will consume a lot of time, but I don't want to heavily modify what the user had entered as text.
I can't find out what's the best way, and even why my function is so time consuming.
A sample text would be:
Okay, #Firstname Lastname, I mention you!
Listen #[TEAM] John, you are a team member.
#Test is a normal name, but #Thât♥ should be tracked too.
And see #Wolfs garden! I just mean the Wolf.
Usernames in that text would be
Firstname Lastname
[TEAM] John
Test
Thât♥
Wolf
So yes, there is no clear marker for where a name ends. The only certain boundary is the newline.
I think the main problem is that you can't distinguish usernames from the surrounding text. It's a bad idea to look up maybe thousands of usernames in a text, and it leads to further problems, for example that John is part of [TEAM] John or JohnFoo.
The usernames need to be separated from the other text. Assuming you're using UTF-8, you could wrap each username in an invisible zero-width space (U+200B, \xE2\x80\x8B) and zero-width non-joiner (U+200C, \xE2\x80\x8C).
The usernames can then be extracted quickly and with little effort and, if needed, still verified against the database.
$txt = "
Okay, #\xE2\x80\x8BFirstname Lastname\xE2\x80\x8C, I mention you!
Listen #\xE2\x80\x8B[TEAM] John\xE2\x80\x8C, you are a team member.
#\xE2\x80\x8BTest\xE2\x80\x8C is a normal name, but
#\xE2\x80\x8BThât♥\xE2\x80\x8C should be tracked too.
And see #\xE2\x80\x8BWolfs\xE2\x80\x8C garden! I just mean the Wolf.";
// extract usernames
if(preg_match_all('~#\xE2\x80\x8B\K.*?(?=\xE2\x80\x8C)~s', $txt, $out)){
print_r($out[0]);
}
Array
(
    [0] => Firstname Lastname
    [1] => [TEAM] John
    [2] => Test
    [3] => Thât♥
    [4] => Wolfs
)
echo $txt;
Okay, #​Firstname Lastname, I mention you!
Listen #​[TEAM] John‌, you are a team member.
#​Test‌ is a normal name, but
#​Thât♥‌ should be tracked too.
And see #​Wolfs‌ garden! I just mean the Wolf.
You could use any other delimiter characters you like, as long as they are unlikely to occur elsewhere in the text.
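The answer does not show how the invisible delimiters get into the text. One possible way, sketched here as an assumption rather than the answerer's method, is to mark known usernames once when the post is saved. markMentions is a hypothetical helper; it builds a single alternation from the username list (longest names first) instead of checking character by character.

```php
<?php
// Hypothetical helper: wrap each known username that follows a "#" in
// zero-width space (U+200B) and zero-width non-joiner (U+200C).
function markMentions(string $text, array $usernames): string
{
    if ($usernames === []) {
        return $text; // an empty alternation would match every "#"
    }

    // Longest names first, so a longer name wins over a shorter prefix.
    usort($usernames, fn($a, $b) => strlen($b) <=> strlen($a));

    // Quote every name so characters like "[" are matched literally.
    $alternatives = array_map(fn($name) => preg_quote($name, '~'), $usernames);
    $pattern = '~#(' . implode('|', $alternatives) . ')~u';

    return preg_replace_callback(
        $pattern,
        fn($m) => "#\u{200B}" . $m[1] . "\u{200C}",
        $text
    );
}

$users = ['Firstname Lastname', '[TEAM] John', 'Test', 'Wolf'];
$marked = markMentions("Listen #[TEAM] John, and see #Wolfs garden!", $users);

// Extract the marked names again with the delimiter-based pattern.
preg_match_all('~#\xE2\x80\x8B\K.*?(?=\xE2\x80\x8C)~s', $marked, $out);
print_r($out[0]);
```

The expensive matching then happens once per save, and every later display only needs the cheap delimiter-based extraction. The sketch requires PHP 7.4 or newer because of the arrow functions.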

Can I add variable name within a string?

I am creating an OpenCart extension where the admin can change his email templates using the user interface in the admin panel.
I would like the user to have the option to add variables to his custom email templates. For example he could put in:
Hello $order['customer_firstname'], your order has been processed.
At this point $order would be undefined; the user is simply defining the message that is to be sent. This would be stored in the database and retrieved when the email is to be sent.
The problem is, how do I get "$order['customer_firstname']" to be stored as a literal string and then be converted to a variable when necessary?
Thanks
Peter
If I understand your question correctly, you could do something like this:
The customer has a textarea or similar to input the template
Dear %NAME%, blah blah %SOMETHING%
Then you could have
$values = array('%SOMETHING%' => $order['something'], '%NAME%' => $order['name']);
$str = str_replace(array_keys($values), array_values($values), $str);
the user will be using around 40 variables. Is there a way I can set it to do that for each "%VARIABLE%"?
Yes, you can easily do so for each variable with the help of a callback function.
This allows you to process each match with a function of your choice, which returns the desired replacement.
$processed = preg_replace_callback("/%(\S+)%/", function($matches) {
$name = $matches[1]; // between the % signs
$replacement = get_replacement_if_valid($name);
return $replacement;
},
$text_to_replace_in
);
From here, you can do anything you like, dot notation, for example:
function get_replacement_if_valid($name) {
    // Split "order.name" into variable and key; pad so a missing dot doesn't raise a warning
    list($var, $key) = array_pad(explode(".", $name, 2), 2, null);
    if ($var === "order" && $key !== null) {
        $order = init_order(); // symbolic: load the order data here
        if (array_key_exists($key, $order)) {
            return $order[$key];
        }
    }
    return "<invalid key: $name>";
}
This simplistic implementation allows you to process replacements such as %order.name%, substituting them with $order['name'].
You could define your own simple template engine:
function template($text, $context) {
    // Group 1 is the context name, group 2 the key inside that context
    preg_match_all('~%([a-zA-Z0-9]+)\.([a-zA-Z0-9]+)%~', $text, $matches);
    for ($i = 0; $i < count($matches[0]); $i++) {
        $subject = $matches[0][$i];
        $ctx = $matches[1][$i];
        $key = $matches[2][$i];
        $value = $context[$ctx][$key];
        $text = str_replace($subject, $value, $text);
    }
    return $text;
}
This allows you to transform a string like this:
$text = 'Hello %order.name%. You have %order.percent%% discount. Pay a total ammount of %payment.ammount% using %payment.type%.';
$templated = template($text, array(
'order' => array(
'name' => 'Alex',
'percent' => 20
),
'payment' => array(
'type' => 'VISA',
'ammount' => '$299.9'
)
));
echo $templated;
Into this:
Hello Alex. You have 20% discount. Pay a total ammount of $299.9 using VISA.
This allows you to have any number of variables defined.
If you want to keep the PHP-syntax, then a regex would be appropriate to filter them:
$text = preg_replace(
"/ [$] (\w+) \[ '? (\w+) \'? \] /exi",
"$$1['$2']", # basically a constrained eval
$text
);
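Note that it needs to be executed in the same scope in which $order is defined. Also, the /e modifier was deprecated in PHP 5.5 and removed in PHP 7.0, so on current versions you have to use preg_replace_callback instead, which is preferable anyway for maximum flexibility.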
You could also allow another syntax this way. For example {order[customer]} or %order.customer% is more common and possibly easier to use than the PHP syntax.
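For PHP versions without the /e modifier, the same PHP-style placeholders can be handled with preg_replace_callback. This is a sketch, not the answerer's code: fill_php_style_template is a hypothetical name, and the variables are passed in explicitly so that nothing is evaluated.

```php
<?php
// Replace $name['key'] placeholders using an explicit whitelist of variables.
function fill_php_style_template(string $text, array $vars): string
{
    return preg_replace_callback(
        '/\$(\w+)\[\'?(\w+)\'?\]/',
        function (array $m) use ($vars) {
            [$placeholder, $name, $key] = $m;
            // Unknown variables or keys are left untouched.
            return isset($vars[$name][$key]) ? (string) $vars[$name][$key] : $placeholder;
        },
        $text
    );
}

$template = 'Hello $order[\'customer_firstname\'], your order has been processed.';
$result = fill_php_style_template($template, ['order' => ['customer_firstname' => 'Peter']]);
echo $result;
```

Only the arrays passed in $vars can be read, so a template cannot reach other variables in scope.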
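You can write it as Hello $order['customer_firstname'] and, when accessing it, make sure the string is in double quotes "" so that the variable is converted to its corresponding value. Inside double quotes an array element with a quoted key has to be wrapped in curly braces. Note that this interpolation only happens for string literals in PHP source code, not for text loaded from the database at runtime.
echo "Hello {$order['customer_firstname']}";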
Edit: As per the comments, a variation to Prash's answer,
str_replace('%CUSTOMERNAME%', $order['customer_name'], $str);
What you're looking for is:
eval("echo \"" . $input . "\";");
but please, PLEASE don't do that, because that lets the user run any code he wants.
A much better way would be a custom template-ish system, where you provide a list of available values for the user to drop in the code using something like %user_firstname%. Then, you can use str_replace and friends to swap those tags out with the actual values, but you can still scan for any sort of malicious code.
This is why Markdown and similar are popular; they give the user control over presentation of his content while still making it easy to scan for HTML/JS/PHP/SQL injection/anything else they might try to sneak in, because whitelisting is easier than blacklisting.
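A minimal sketch of such a whitelist-based template system, assuming %tag% placeholders as in the example above. render_template and the tag names are made up for illustration.

```php
<?php
// Only tags present in $values are replaced; everything else stays literal text.
function render_template(string $template, array $values): string
{
    // Build the %tag% => value map from the whitelist only
    $map = [];
    foreach ($values as $tag => $value) {
        $map['%' . $tag . '%'] = (string) $value;
    }
    // strtr replaces all tags in a single pass and never re-scans replaced text
    return strtr($template, $map);
}

$allowed = ['user_firstname' => 'Peter', 'order_id' => 1234];
$mail = render_template(
    'Hello %user_firstname%, order %order_id% has been processed. %unknown%',
    $allowed
);
echo $mail;
```

Since strtr does not re-scan its own output, a value that happens to contain %order_id% is not expanded a second time, which is one difference from chained str_replace calls.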
Perhaps you can have a template like this:
$tpl = "Hello {$order['customer_firstname']}, your order has been processed.";
If $order is defined and that specific key is not null, you can use echo $tpl directly and the text will show the content of the 'customer_firstname' key. The curly braces are the important part here.

Most Efficient Way to Search for "Bad Names" in a User's Name

I have an app that I'm developing, in it users can choose a name for themselves. I need to be able to filter out "bad" names, so I do this for now:
$error_count=0;
$bad_names="badname1badname2";
preg_match_all("/\b".$user_name."\b/i",$global['bad_names'],
$matches,PREG_OFFSET_CAPTURE);
if(count($matches[0])>0)
{
$error_count++;
}
This would tell me if the user's name was inside the bad names list, however, it doesn't tell me if the bad name itself is in the user's name. They could combine a bad word with something else and I wouldn't detect it.
What kind of regex (if I even use regex) would I use for this? I need to be able to take any bad name (preferably in an array like $bad_names), and search through the user's name to see whether that word is within their name. I'm not great with regex, and the only way I can think of is to put it all through a loop which seems highly inefficient. Anyone have a better idea? I guess I need to figure out how to search through a string with an array.
$badnames = array('name1', 'name2');
// you need to quote the names so they can be inserted into the
// regular expression safely
$badnames_quoted = array();
foreach ($badnames as $name) {
$badnames_quoted[] = preg_quote($name, '/');
}
// now construct a RE that will match any bad name
$badnames_re = '/\b('.implode('|', $badnames_quoted).')\b/Siu';
// no need to gather all matches, or even to see what matched
$hasbadname = preg_match($badnames_re, $thestring);
if ($hasbadname) {
// bad name found
}
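The \b anchors in the pattern above only match a bad name as a whole word. For the case from the question, where the bad name is embedded in a longer username, a plain case-insensitive substring check is one option. A sketch, with containsBadName as a hypothetical helper:

```php
<?php
// Returns true if any bad name occurs anywhere inside the username,
// ignoring case. No regular expression is needed for this.
function containsBadName(string $userName, array $badNames): bool
{
    foreach ($badNames as $bad) {
        // Skip empty entries, which would otherwise match every name
        if ($bad !== '' && stripos($userName, $bad) !== false) {
            return true;
        }
    }
    return false;
}

$badnames = array('name1', 'name2');
var_dump(containsBadName('xX_Name1_Xx', $badnames));
var_dump(containsBadName('alice', $badnames));
```

A loop over a few hundred short strings is cheap for a single username check; the trade-off is false positives when a bad word appears inside a harmless one.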
// Inside your class:
private static $bad_name = array("word1", "word2", "word3");
private static $forbidden_name = array("string1", "string2"); // unwanted character strings

private static function userNameValid($name_in) {
    // Whole-word match against the bad names
    $badFound = preg_match("/\b(" . implode("|", self::$bad_name) . ")\b/i", $name_in);
    // Substring match against the forbidden strings (i.e. "ass" would be found in "assassin")
    $forbiddenFound = preg_match("/(" . implode("|", self::$forbidden_name) . ")/i", $name_in);
    if ($badFound) {
        return FALSE;
    } elseif ($forbiddenFound) {
        return FALSE;
    } else {
        return TRUE;
    }
}
This works GREAT for me
