Regular Expressions, detecting a text pattern


I interact with my users via SMS. If they send me an SMS that follows this pattern, I need to perform an action:
Pattern:
*TEXT*TEXT*TEXT#
Any character is allowed in TEXT, so I wrote this regex:
if (preg_match('/^\*([^*]*)\*([^*]*)\*([^#]*)\#$/', $text)){
// perform the action...
}
The regex above works well, but it does not allow line breaks after the #. For example:
'*hello there*how you doing!?* and blah#' passes the regex, but:
'*hello there*how you doing!?* and blah#
'
does not pass (note the line break after the #).
So I tried:
$text = str_replace("\n\r", '', $text);
But the example above still fails :-(
How can I allow line breaks after the # in the regex, or get rid of them?
Thanks for your help

To allow for optional spaces and/or newlines after the hash:
if (preg_match('/^\*([^*]*)\*([^*]*)\*([^#]*)\#\s*$/', $text)){
I've added \s* as the expression just before the end-of-subject.
You can also use trim() beforehand:
if (preg_match('/^\*([^*]*)\*([^*]*)\*([^#]*)\#$/', trim($text))){
Update
To also require that the text between the stars is not empty:
if (preg_match('/^\*([^*]+)\*([^*]+)\*([^#]+)\#$/', trim($text))){

Oops, my mistake: as @Jack said, we must add \s* and drop the "m" modifier.
I changed * to + so that the match fails when there is nothing between the stars:
$text = "*fddsfdsf*dfdfd*5f8ssfdssf8#
";
if (preg_match('/^\*([^*]+)\*([^*]+)\*([^#]+)\#\s*$/', $text)){
echo "yes";
}else{
echo "no";
}
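
To tie the answers above together, here is a minimal sketch that wraps the final pattern in a function and returns the three captured fields. The name parse_sms is hypothetical and not part of the original posts; the "\r\n" ending is an assumption about what an SMS gateway might append.

```php
<?php
// Hypothetical helper: returns the three TEXT fields, or null if the
// message does not follow the *TEXT*TEXT*TEXT# pattern.
function parse_sms(string $text): ?array
{
    // \s*$ tolerates trailing spaces, "\n" and "\r\n" after the hash.
    if (preg_match('/^\*([^*]+)\*([^*]+)\*([^#]+)#\s*$/', $text, $m)) {
        return array_slice($m, 1); // drop the full match, keep the 3 groups
    }
    return null;
}

var_dump(parse_sms("*hello there*how you doing!?* and blah#\r\n")); // array of 3 strings
var_dump(parse_sms("**b*c#"));                                      // NULL (empty first field)
```

Note that str_replace("\n\r", ...) only removes a line feed followed by a carriage return, which is the reverse of the usual "\r\n" order; that is probably why the attempt in the question had no effect.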

Related

php regex remove inline comment only

I have some simple code that looks like this:
function session(){
return 1; // this default value for session
}
I need a regex or code to remove the comment // this default value for session. It should remove only this type of comment: one or more spaces, then //, then the comment text up to the newline.
All other types of comments and cases should be left alone.

UPDATED (1)
"And only remove this type of comment, which starts by a space or two or more, then //, then a newline after it"
Try this one:
/\s+\/\/[^\n]+/m
\s+ matches one or more whitespace characters
\/\/ matches the escaped //
[^\n]+ matches anything except a newline
UPDATE: to make it more likely that this is only applied to code lines, we can use a lookbehind (2) to check that there is a semicolon ; before the space(s) and the comment slashes //, so the regex becomes:
/(?<=;)\s+\/\/[^\n]+/m
where (?<=;) is the lookbehind, which tells the engine to match only if a ; comes immediately before.
-----------------------------------------------------------------------
(1) preg_replace replaces every match by default, so no g flag is needed
(2) Lookbehind was not supported in JavaScript before ES2018
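
As a rough illustration of the lookbehind pattern above, the following sketch applies it with preg_replace to the function from the question. The variable names are placeholders, and the m modifier is left out because the pattern has no ^ or $ anchors for it to affect.

```php
<?php
$source = "function session(){\n    return 1; // this default value for session\n}\n";

// Remove whitespace + "//" + rest of line, but only right after a semicolon.
$clean = preg_replace('/(?<=;)\s+\/\/[^\n]+/', '', $source);

echo $clean;
```

Because \s also matches newlines, a comment on its own line directly after a statement would be removed too; replacing \s+ with [ \t]+ is one way to restrict the match to the same line.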

A purely regex solution would look something like this:
$result = preg_replace('#^(.*?)\s+//.*$#m', '\1', $source);
but that would still be wrong because you could get trapped by something like this:
$str = "This is a string // that has a comment inside";
A more robust solution would be to rewrite the PHP code using token_get_all(), which parses the source into tokens that you can then selectively drop when you re-emit the code:
foreach(token_get_all($source) as $token)
{
    if(is_array($token))
    {
        // echo every token except comments that start with //
        if($token[0] != T_COMMENT || substr($token[1], 0, 2) != '//')
            echo $token[1];
    }
    else
        echo $token;
}
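
A self-contained variant of the tokenizer approach might look like the sketch below. The function name strip_line_comments and the sample source are assumptions made for illustration; the point is that the // inside the string literal survives, which the pure regex cannot guarantee.

```php
<?php
// Hypothetical helper: re-emits PHP source without "//" comments.
function strip_line_comments(string $source): string
{
    $out = '';
    foreach (token_get_all($source) as $token) {
        if (!is_array($token)) {
            $out .= $token;              // single-character token such as ";" or "{"
            continue;
        }
        [$id, $text] = $token;
        if ($id === T_COMMENT && strncmp($text, '//', 2) === 0) {
            // Older PHP versions keep the trailing newline inside the
            // comment token; preserve it so lines are not joined.
            if (substr($text, -1) === "\n") {
                $out .= "\n";
            }
            continue;
        }
        $out .= $text;
    }
    return $out;
}

$source = '<?php' . "\n"
        . '$s = "a // not a comment";' . "\n"
        . 'return 1; // default' . "\n";

$result = strip_line_comments($source);
echo $result;
```

This removes every // comment, not only trailing ones, and leaves the whitespace that preceded the comment in place; narrowing it further would mean inspecting the neighbouring tokens.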

Preg_match for different language URLs

I have some text like this :
$text = "Some thing is there http://example.com/جميع-وظائف-فى-السليمانية
http://www.example.com/جميع-وظائف-فى-السليمانية nothing is there
Check me http://example.com/test/for_me first
testing http://www.example.com/test/for_me the url
Should be test http://www.example.com/翻译-英语教师-中文教师-外贸跟单
simple text";
I need to match the URLs with preg_match, but they contain characters from different languages.
So I need to get the URL itself from each line.
This is what I was doing:
$text = preg_replace("/[\n]/", " <br>", $text);
$lines = explode("<br>", $text);
foreach($lines as $textLine){
if (preg_match("/(http\:\/\/(.*))/", $textLine, $match )) {
// some code
// Here I need the url
}
}
My current regex is /(http\:\/\/(.*))/. How can I make it work with URLs in different languages?

A regular expression like this may work for you.
It worked in my test with the example text you gave, although it is not very sophisticated. It simply selects all characters after http:// or https:// until a whitespace character occurs (space, newline, tab, etc.).
/(https?:\/\/[^\s]+)/i
The g flag used by regex testers is not a valid modifier in PHP; use preg_match_all() to get every match.
Here is a visual example of what would be matched from your sample string:
http://regex101.com/r/bR0yE9

You don't need to work line by line, you can search directly:
if (preg_match_all('~\bhttp://\S+~', $text, $matches))
print_r($matches);
Where \S means "any character that is not whitespace". There is no special internationalisation problem.
Note: if you want to replace all newlines with <br/> afterwards, I suggest using $text = preg_replace('~\R~', '<br/>', $text);, because \R handles several types of newline, whereas \n only matches Unix newlines.
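
For reference, a short sketch of the preg_match_all approach on a shortened version of the sample text. The optional s in https? and the u modifier are additions not present in the answer above; \S+ works on the non-Latin URLs either way.

```php
<?php
$text = "Some thing is there http://example.com/جميع-وظائف-فى-السليمانية\n"
      . "Check me http://example.com/test/for_me first\n"
      . "Should be test http://www.example.com/翻译-英语教师-中文教师-外贸跟单\n"
      . "simple text";

// \S+ consumes everything up to the next whitespace, whatever the script.
preg_match_all('~\bhttps?://\S+~u', $text, $matches);

print_r($matches[0]); // one entry per URL found
```

Punctuation that directly follows a URL in running text (a trailing comma or full stop) would be captured too, so some post-processing may be needed for prose input.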

preg issue with PHP

I have the following piece of PHP code:
$string = "Ouch!; Funny, these photos were taken with my own phone... … ";
echo preg_replace("[^A-Za-z0-9:\/.,;]", '', $string);
As far as I can tell, this should remove everything that is not alphanumeric or one of the characters : / . , ;
When I run it, I get:
Ouch!; Funny, these photos were taken with my own phone... â€¦
Instead of what I was expecting:
Ouch!; Funny, these photos were taken with my own phone...
These special characters are still making it in, even though I am excluding them. Any ideas?
Answer:
Summarized from the answers and comments below: this eliminates special characters but allows .',;?/\: and ensures that we don't end up with multiple blanks:
preg_replace("/[^A-Za-z0-9:\/.,;!##'?!\s+!]/",' ', $string)

PHP regular expression functions, including preg_replace, expect delimiters around the regular expression.
$string = "Ouch!; Funny, these photos were taken with my own phone... … ";
echo preg_replace("/[^A-Za-z0-9:\/.,;]/u", '', $string);
Note the / on either side of your expression. You'll also probably want the UTF-8 modifier u (thx @jon).
Now in this case, you're actually going to end up with:
Ouch;Funny,thesephotosweretakenwithmyownphone...
This isn't what you wrote out, however; to get that, you'll need slightly more complex code. You could simply replace with ' ' (a space), but you might end up with a lot of unwanted whitespace.
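
One way to get the output the question expected is sketched below, assuming the source file is saved as UTF-8: allow the space character in the class, remove everything else, then collapse the leftover runs of whitespace. The two-step approach is an illustration, not the accepted answer's exact code.

```php
<?php
$string = "Ouch!; Funny, these photos were taken with my own phone... … ";

// Step 1: drop every character that is not explicitly allowed.
// The u modifier makes the multibyte "…" count as one character.
$clean = preg_replace('/[^A-Za-z0-9:\/.,; ]/u', '', $string);

// Step 2: collapse runs of whitespace and trim the ends.
$clean = trim(preg_replace('/\s+/', ' ', $clean));

echo $clean; // Ouch; Funny, these photos were taken with my own phone...
```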

This works:
$string = "Ouch!; Funny, these photos were taken with my own phone... … ";
echo preg_replace("/[^A-Za-z0-9:\/\.,; ]/", '', $string);
http://3v4l.org/ne7Qu

Regular expression to replace broken email links

Problem: authors have added email addresses wrongly in a CMS - missing out the 'mailto:' text.
I need a regular expression, if possible, to do a search and replace on the stored MySQL content table.
Cases I need to cope with are:
No 'mailto:'
'mailto:' is already included (correct)
web address not email - no replace
multiple mailto: required (more than one in string)
Sample string would be: (line breaks added for readability)
add1@test.com and
add2@test.com and
real web link
second one to replace add3@test.com
Required output would be:
add1@test.com and
add2@test.com and
real web link
second one to replace add3@test.com
What I tried (in PHP) and issues:
pattern: /href="(.+?)(@)(.+?)(<\/a> )/iU
replacement: href="mailto:$1$2$3$4
This adds mailto: to links that already have a correct mailto:, and it matches greedily across the last two links.
Thanks for any help. I have looked about, but am running out of time on this as it was an unexpected content issue.
If you are able to save me time and give the SQL expression, that would be even better.

Try replacing
/href="(?!(mailto:|http:\/\/|www\.))/iU
with
href="mailto:
(?!...) is a negative lookahead; it loosely means "the next characters aren't these".
Alternative:
Replace
/(href=")(?!mailto:)([^"]+@)/iU
with
$1mailto:$2
[^"]+ means one or more characters that aren't ".
You'd probably need a more complex matching pattern for guaranteed correctness.
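
A minimal sketch of the alternative pattern in use follows. The HTML sample is hypothetical (the addresses and the web URL are placeholders covering the four cases from the question), and the character class is tightened to [^"@] and the U modifier dropped, both of which are changes from the pattern as posted.

```php
<?php
// Hypothetical sample covering: no mailto, correct mailto, web link, second fix.
$html = '<a href="add1@test.com">add1@test.com</a> and '
      . '<a href="mailto:add2@test.com">add2@test.com</a> and '
      . '<a href="http://www.test.com">real web link</a> '
      . 'second one to replace <a href="add3@test.com">add3@test.com</a>';

// Add "mailto:" only when the href holds an @ and does not already start with it.
$fixed = preg_replace('/(href=")(?!mailto:)([^"@]+@)/i', '$1mailto:$2', $html);

echo $fixed;
```

Excluding " from the class stops the match from running past the end of one href into the next link, which is what went wrong with the .+? attempt in the question.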

You need to apply a proper email pattern first (e.g. from "Using a regular expression to validate an email address"), then look for an optional mailto: before the address (e.g. (mailto:|)); preg_replace_callback is well suited to this.
This appears to work as you wish (it only looks for email addresses inside double quotes):
$s = 'add1@test.com and
add2@test.com and
real web link
second one to replace add3@test.com';
echo preg_replace_callback(
'~"(mailto:|)([_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,4}))"~i',
function($m) {
// print_r($m); #debug
return '"mailto:'. $m[2] .'"';
},
$s
);
Output as desired:
add1@test.com and
add2@test.com and
real web link
second one to replace add3@test.com

Use the following as the pattern:
/(href=")(?!mailto:)(.+?@.+?")/i
and replace it with
$1mailto:$2
(?!mailto:) is a negative lookahead that checks that mailto: does not follow; only then is the rest of the pattern tried. (.+?@.+?") matches one or more characters followed by an @, followed by one or more characters, followed by a ". Both + quantifiers are lazy; the U modifier is left out because it inverts greediness and would make +? greedy.
The match is replaced with the first capture group (href="), followed by mailto:, followed by the second capture group (up to the closing ").

Regex For PHP Code?

I have the following code
<?php
drupal_set_message("Your registration submission has been received.");
drupal_goto("/events-initiatives/events-listing");
?>
I want to remove everything except Your registration submission has been received. The message will change, so it needs to be a wildcard. It should also match, say:
<?php
drupal_set_message("Testing!!!");
drupal_goto("/events-initiatives/events-listing");
?>
But I can't figure out how to match the PHP code. My current attempt is
preg_replace('#(<?php drupal_set_message(").*?("); drupal_goto("/guidelines-resources/professionals/lending-library"); ?>)#', '$1$2', $string);
but that isn't working; it seems to have problems with the ( in it.
Any idea how I could do this?

From looking at your original post, (before your regex was changed into a PHP snippet) I'd suggest you are looking for a regex along these lines:
#<\?php\s+drupal_set_message\(".*?"\);\s+drupal_goto\("/guidelines-resources/professionals/lending-library"\);\s+\?>#
Note that this regex:
escapes all special characters (e.g., ?, ( and )) with preceding backslashes
replaces each single space with \s+, which matches one or more consecutive whitespace characters
EDIT
After rereading your question, if the only thing you want left is the text that is passed as an argument to drupal_set_message, then try this:
$pattern = '#\bdrupal_set_message\("(.*?)"\)#';
$found = preg_match($pattern, $subject, $matches);
// if found, $matches[1] will contain the argument to drupal_set_message
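
Put together as a runnable sketch, with the second snippet from the question as the subject (the variable names follow the answer above; the subject string is built by hand here purely for illustration):

```php
<?php
$subject = '<?php' . "\n"
         . 'drupal_set_message("Testing!!!");' . "\n"
         . 'drupal_goto("/events-initiatives/events-listing");' . "\n"
         . '?>';

// Capture whatever sits between the quotes of drupal_set_message("...").
$pattern = '#\bdrupal_set_message\("(.*?)"\)#';

if (preg_match($pattern, $subject, $matches)) {
    echo $matches[1]; // Testing!!!
}
```

The lazy (.*?) stops at the first "), so a message that itself contains that two-character sequence would be cut short.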

You can escape the special characters (though really, just the open and close parentheses) with backslashes. On a side note, if you have a decent IDE then it should have sophisticated regex-capable search-and-replace; use it (although if you do, you'll probably need to also escape the forward slashes, as those are the most likely delimiters that your IDE would use).
