I have this combination in a string:
"I am tagging #username1.blah. and #username2.test. and #username3."
I tried this:
preg_replace('/\#^|(\.+)/', '', 'I am tagging #username1.blah. and #username2.test. and #username3. in my status.');
But the result is:
"I am tagging #username1blah and #username2test and #username3 in my status"
The above result is not what I wanted.
This is what I want to achieve:
"I am tagging #username1.blah and #username2.test and #username3 in my status."
Could someone help me see what I have done wrong in the pattern?
Many thanks,
Jon
I don't like regex very much, but if you are sure that the dots you want to remove are always followed by a space, you could do something like this:
php > $a = "I am tagging #username1.blah. and #username2.test. and #username3.";
php > echo str_replace(". ", " ", $a." ");
I am tagging #username1.blah and #username2.test and #username3
Try this:
preg_replace('/\.(\s+|$)/', '\1', $r);
This removes the dot at the end of each "word" that starts with #:
$input = "I am tagging #username1.blah. and #username2.test. and #username3. in my status.";
echo preg_replace('/(#\S+)\.(?=\s|$)/', '$1', $input);
(#\S+)\.(?=\s|$) will match a dot at the end of a non whitespace (\S) series when the dot is followed by whitespace or the end of the string ((?=\s|$))
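To make the difference concrete, here is a small sketch that runs the pattern from the question and the lookahead pattern side by side on the sample sentence. The variable names are only illustrative.

```php
<?php
// Sample string from the question.
$input = 'I am tagging #username1.blah. and #username2.test. and #username3. in my status.';

// Original attempt: the alternative (\.+) matches every run of dots,
// so every dot in the string is removed.
$allDotsRemoved = preg_replace('/\#^|(\.+)/', '', $input);

// Lookahead version: only a dot that ends a #word and is followed by
// whitespace or the end of the string is dropped.
$trailingDotsRemoved = preg_replace('/(#\S+)\.(?=\s|$)/', '$1', $input);

echo $allDotsRemoved, "\n";
echo $trailingDotsRemoved, "\n";
```

Because the lookahead does not consume the whitespace, the spaces between words survive and nothing has to be put back in the replacement.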
preg_replace('/\.( |$)/', '\1', $string);
How about:
preg_replace('/(#\S+)\.(\s|$)/', '$1$2', $string);
/\#\w+(\.\w+)?(?<dot>\.)/
That will match each #word (with at most one ".suffix") together with the dot that follows it, and capture that trailing dot in the named group "dot".
The issue:
Basically, when the regex encounters a letter it doesn't allow (an accented one, for example), it breaks the link.
My PHP function that converts names read from the database into links:
function convertActor($str) {
$regex = "/([a-zA-Z-.' ])+/";
$str = preg_replace($regex, "<a href='/pretraga.php?q=$0' title='$0' class='actor'>$0</a>", $str);
return $str;
}
Also I want to allow spaces, dashes, dots and single quotes.
Thanks in advance.
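You could try this (PCRE, the engine behind PHP's preg_* functions, has no \uXXXX escape, so the range is written with \x{...} and the u modifier):
$regex = "/(?:[a-zA-Z.\-' ]|[^\\x{00}-\\x{7F},])+/u";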
Which converts régime, Coffehouse Coder, Apple into
'<a href='/pretraga.php?q=régime' title='régime' class='actor'>régime</a>,<a href='/pretraga.php?q= Coffehouse Coder' title=' Coffehouse Coder' class='actor'> Coffehouse Coder</a>,<a href='/pretraga.php?q= Apple' title=' Apple' class='actor'> Apple</a>',
here on regex101.com.
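The following regex should work for you:
/[^\u0000-\u007F ]|[a-zA-Z-.' ]/g
[^\u0000-\u007F ] : matches any character outside the ASCII range (accented letters, for example).
[a-zA-Z-.' ] : matches English letters, dashes, dots, single quotes and spaces.
For PHP, whose PCRE engine does not support \u, use this with the u modifier: [^\x{00}-\x{7F}]|[a-zA-Z-.']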
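Another way to allow non-English letters in PHP is the Unicode property \p{L} together with the u modifier. The sketch below is a hypothetical variant of convertActor() (the name convertActorUnicode is made up here) and assumes the names arrive as valid UTF-8.

```php
<?php
// Hypothetical Unicode-aware variant of convertActor(); assumes $str is valid UTF-8.
function convertActorUnicode(string $str): string
{
    // \p{L} matches a letter from any script. Each name must start with a letter
    // and may continue with letters, dots, single quotes, spaces or dashes.
    // The u modifier makes PCRE read the pattern and the subject as UTF-8.
    $regex = "/\\p{L}[\\p{L}.' -]*/u";

    // $0 is the whole match, as in the original function.
    // preg_replace() returns null on invalid UTF-8, so fall back to the input.
    return preg_replace(
        $regex,
        "<a href='/pretraga.php?q=$0' title='$0' class='actor'>$0</a>",
        $str
    ) ?? $str;
}

echo convertActorUnicode('régime, Coffehouse Coder, Apple'), "\n";
```

Requiring a letter at the start of each match keeps the space after a comma out of the link. The query string is left unencoded here only to mirror the original function; running the name through urlencode() and htmlspecialchars() would be the safer choice on a real page.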
For everyone who is looking for a solution, I found this temporary one:
$regex = "/([\w-.' \á-\ÿ])+/";
Without dots and the like, but with spaces:
$regex = "/([a-zA-Z \á-\ÿ])+/";
Cheers
I'm not very familiar with regular expressions, so I have to ask:
How can I find out with PHP whether a string contains a word starting with #?
E.g. I have a string like "This is for #codeworxx".
I'm sorry, but I have no starting point for that :(
Hope you can help.
Thanks,
Sascha
Okay, thanks for the answers, but I made a mistake: how do I implement this in eregi_replace?
$text = eregi_replace('/\B#[^\B]+/','\\1', $text);
does not work.
Why? Don't I have to use the same expression as the pattern?
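The ereg family uses POSIX syntax, which takes no /.../ delimiters and has no \B, and the '\\1' replacement needs a parenthesised group to refer to. The ereg functions were also removed in PHP 7, so a preg_replace() version is sketched below. The sample text and the <b> wrapper are assumptions made only for illustration.

```php
<?php
$text = 'This is for #codeworxx and #Sascha';

// preg_replace() takes the delimited PCRE pattern; the i modifier provides
// the case-insensitivity that eregi_* implied.
// The parentheses create group 1, so $1 has something to refer to.
// The <b> wrapper is a hypothetical replacement chosen for illustration.
$result = preg_replace('/\B#(\w+)/i', '<b>#$1</b>', $text);

echo $result, "\n";
```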
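Match any # that is not directly preceded by a word character (so it may follow whitespace or start the string) and is followed by at least one non-whitespace character:
$ cat 1812901.php
<?php
// \B cannot be negated inside a character class, so \S+ (one or more non-whitespace characters) is used after the #.
echo preg_match('/\B#\S+/', "This should #match it");
echo preg_match('/\B#\S+/', "This should not# match");
echo preg_match('/\B#\S+/', "This should match nothing and return 0");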
echo "\n";
?>
$ php 1812901.php
100
Break your string up like this:
$string = 'simple sentence with five words';
$words = explode(' ', $string );
Then you can loop through the array and check if the first character of each word equals "#":
if ($stringInTheArray[0] == "#")
Assuming you define a word as a sequence of characters with no whitespace between them, this should be a good starting point for you:
$subject = "This is for #codeworxx";
$pattern = '/(?:^|\s)#(\S+)/';
preg_match($pattern, $subject, $matches);
print_r($matches);
Explanation:
(?:^|\s)#(\S+) - look for a # at the start of the string or after whitespace, and capture the letters, numbers, and anything else that is not whitespace (space, tab, newline) up to the next whitespace or the end of the string.
See the output of the $matches array for accessing the inner groups and the regex results.
#OP, there is no need for a regex. Plain PHP string functions will do:
$mystr='This is for #codeworxx';
$str = explode(" ",$mystr);
foreach($str as $k=>$word){
if(substr($word,0,1)=="#"){
print $word;
}
}
Just in case this is helpful to someone in the future:
/((?<!\S)#\w+(?!\S))/
This will match any word made of word characters (letters, digits, underscore) that starts with "#". It will not match words with "#" anywhere but the start of the word.
Matching cases:
#username
foo #username bar
foo #username1 bar #username2
Failing cases:
foo#username
#username$
##username
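The matching and failing cases above can be checked with a short script. This is a minimal sketch; findTags() is a hypothetical helper name, not something from the answers.

```php
<?php
// Hypothetical helper: returns every #word that stands alone between
// whitespace or the ends of the string.
function findTags(string $text): array
{
    preg_match_all('/(?<!\S)#\w+(?!\S)/', $text, $matches);
    return $matches[0]; // full matches only
}

$cases = [
    '#username',
    'foo #username bar',
    'foo #username1 bar #username2',
    'foo#username',
    '#username$',
    '##username',
];

foreach ($cases as $case) {
    $found = findTags($case);
    echo str_pad($case, 32), ' => ', $found ? implode(', ', $found) : '(no match)', "\n";
}
```

preg_match_all() is used instead of preg_match() so that a string with several tags returns all of them.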
I have the following text file:
...
"somewords MYWORD";123123123123
"someother MYWORDOTHER";456456456456
"somedifferent MYWORDDIFFERENT";789789789
...
I need to match the words MYWORD, MYWORDOTHER and MYWORDDIFFERENT and then replace the space before each of them with ";".
Can someone figure out a regex?
I have tried something like this:
+[^ ][^ ][^ ][^ ][^ ][^ ][^ ]";
but this only works for one specific word length. I need it to work for words of any length.
Any help?
Why not use str_replace()?
$string = '"somewords MYWORD";123123123123
"someother MYWORDOTHER";456456456456
"somedifferent MYWORDDIFFERENT";789789789';
$replace = str_replace(' MYWORD', ';MYWORD', $string);
echo $replace;
Codepad Example
This is untested, but should work to replace the space before the last word in the quotes...
preg_replace('/(".+) (\w+";\d+)/',"$1;$2", $your_string);
preg_replace('/\s(\w+";\d)/', ';$1', $text);
Maybe this:
$result = preg_replace('/([ ])(\w+)";/im', ';$2";', $subject);
in:
"somewords MYWORD";123123123123
"someother MYWORDOTHER";456456456456
"somedifferent MYWORDDIFFERENT";789789789
out:
"somewords;MYWORD";123123123123
"someother;MYWORDOTHER";456456456456
"somedifferent;MYWORDDIFFERENT";789789789
Use this:
preg_replace('#"(\w+)\s+(\w+)"#', '"$1;$2"', $text);
while($line=fgets($file))
{
$str=preg_replace("/ (\w)/i",";$1",$line);//use this line if you want to replace every space
$str=preg_replace("/ (\w+)\";(\d)/i",";$1\";$2",$line);//use this line if you only want to replace the last space
echo $str;//or wherever you want to output
}
Edit:
Alright, I made a typo in the original answer.
Now corrected with a codepad:http://codepad.org/leQHTuFR
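For comparison, the same replacement of the space before the last quoted word can be sketched with a lookahead, which avoids backreferences altogether. The sample data is the three lines from the question; the variable names are illustrative.

```php
<?php
$subject = <<<'TXT'
"somewords MYWORD";123123123123
"someother MYWORDOTHER";456456456456
"somedifferent MYWORDDIFFERENT";789789789
TXT;

// Replace the single space that precedes the last word inside the quotes.
// The lookahead (?=\w+";) checks for 'word";' without consuming it,
// so only the space itself is replaced.
$result = preg_replace('/ (?=\w+";)/', ';', $subject);

echo $result, "\n";
```

A space earlier in the quoted part is left alone, because the word after it is not immediately followed by the closing quote and semicolon.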
I'm detecting #replies in a Twitter stream with the following PHP code using regexes.
$text = preg_replace('!^#([A-Za-z0-9_]+)!', '#$1', $text);
$text = preg_replace('! #([A-Za-z0-9_]+)!', ' #$1', $text);
How can I best combine these two rules without false flagging email#domain.com as a reply?
OK, on second thought: to avoid flagging whatever#email, the character before the # has to be a non-word character, because a # preceded by any word character could be part of an email address. That leads to:
!(^|\W)#([A-Za-z0-9_]+)!
but then you have to use $2 instead of $1.
Since the ^ does not have to stand at the beginning of the RE, you can use grouping and | to combine those REs.
If you don't want to re-insert the whitespace you captured, you have to use a "positive lookbehind":
$text = preg_replace('/(?<=^|\s)#(\w+)/',
'#$1', $text);
or "negative lookbehind":
$text = preg_replace('/(?<!\S)#(\w+)/',
'#$1', $text);
...whichever you find easier to understand.
Here's how I'd do the combination
$text = preg_replace('!(^| )#([A-Za-z0-9_]+)!', '$1#$2', $text);
$text = preg_replace('/(^|\W)#(\w+)/', '$1#$2', $text);
preg_replace('%(?<!\S)#([A-Za-z0-9_]+)%', '#$1', $text);
(?<!\S) is loosely translated to "no preceding non-whitespace character". Sort of a double-negation, but also works at the start of the string/line.
This won't consume the preceding character, doesn't need an extra capturing group for it, and won't match strings such as "foo-#host.com", which is a valid e-mail address.
Tested:
Input = 'foo bar baz-#qux.com bee #def goo#doo #woo'
Output = 'foo bar baz-#qux.com bee #def goo#doo #woo'
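Since the replacement above leaves the text looking unchanged, the following sketch wraps each match in square brackets to make it visible. The brackets are a hypothetical marker chosen for illustration. The (^|\W) variant is run on the same input for comparison.

```php
<?php
$text = 'foo bar baz-#qux.com bee #def goo#doo #woo';

// (?<!\S) succeeds only where the previous character is whitespace
// or where there is no previous character (start of the string).
$marked = preg_replace('/(?<!\S)#(\w+)/', '[#$1]', $text);

// (^|\W) accepts any non-word character before the #, including "-",
// so the address-like "baz-#qux.com" is flagged as well.
$naive = preg_replace('/(^|\W)#(\w+)/', '$1[#$2]', $text);

echo $marked, "\n";
echo $naive, "\n";
```

The lookbehind version only marks #def and #woo, while the \W version also marks the #qux inside baz-#qux.com.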
Hu, guys, don't push too far... Here it is :
!^\s*#([A-Za-z0-9_]+)!
I think you can use alternation, so look for the beginning of the string or a whitespace character:
'!(?:^|\s)#([A-Za-z0-9_]+)!'