For example:
$s1 = "Test Test the rest of string"
$s2 = "Test the rest of string"
I would like to match positively $s1 but not $s2, because first word in $s1 is the same as second. Word 'Test' is example, regular expression should work on any words.
if(preg_match('/^(\w+)\s+\1\b/',$input)) {
// $input has same first two words.
}
Explanation:
^ : Start anchor
( : Start of capturing group
\w+ : A word
) : End of capturing group
\s+ : One or more whitespace
\1 : Back reference to the first word
\b : Word boundary
~^(\w+)\s+\1(?:\W|$)~
~^(\pL+)\s+\1(?:\PL|$)~u // unicode variant
\1 is a back reference to the first capturing group.
This does not cause Test Testx to return true.
$string = "Test Test";
preg_match('/^(\w+)\s+\1(\b|$)/', $string);
Not working everywhere, see the comments...
^([^\b]+)\b\1\b
^(\B+)\b\1\b
Gets the first word, and matches if the same word is repeated again after a word boundary.
Related
I have a string like "some words 12345cm some more words"
and I want to extract the 12345cm bit from that string. So I get the position of the first number:
$position_of_first_number = strcspn( "some words 12345cm some more words" , '0123456789' );
Then the position of the first space after $position_of_first_number
$position_of_space_after_numbers = strpos("some words 12345cm some more words", " ", $position_of_first_number);
Then I want to have a function which return the portion of the string between $position_of_first_number and $position_of_space_after_numbers.
How do I do it?
You can use the substr function. Note that it takes a starting position and a length, which you can calculate as the difference between the start and end positions.
Since you are looking for a pattern like blank-digits-letters-blank, I would recommend a regular expression using preg_match:
$s = "some words 12345cm some more words";
preg_match("/\s(?P<result>\d+[^\W\d_]+)\s/", $s, $matches);
echo $matches["result"];
12345cm
Explaining the pattern:
"/.../" limits the pattern in PHP
\s matches any whitespace character
(?P<name>...) names the following pattern
\d+ matches 1 or more digits
[^\W\d_]+ matches 1 or more Unicode-letters (i.e. any character that is not a non-alphanumeric character; see this answer)
i'm not very firm with regular Expressions, so i have to ask you:
How to find out with PHP if a string contains a word starting with # ??
e.g. i have a string like "This is for #codeworxx" ???
I'm so sorry, but i have NO starting point for that :(
Hope you can help.
Thanks,
Sascha
okay thanks for the results - but i did a mistake - how to implement in eregi_replace ???
$text = eregi_replace('/\B#[^\B]+/','\\1', $text);
does not work??!?
why? do i not have to enter the same expression as pattern?
Match anything with has some whitespace in front of a # followed by something else than whitespace:
$ cat 1812901.php
<?php
echo preg_match("/\B#[^\B]+/", "This should #match it");
echo preg_match("/\B#[^\B]+/", "This should not# match");
echo preg_match("/\B#[^\B]+/", "This should match nothing and return 0");
echo "\n";
?>
$ php 1812901.php
100
break your string up like this:
$string = 'simple sentence with five words';
$words = explode(' ', $string );
Then you can loop trough the array and check if the first character of each word equals "#":
if ($stringInTheArray[0] == "#")
Assuming you define a word a sequence of letters with no white spaces between them, then this should be a good starting point for you:
$subject = "This is for #codeworxx";
$pattern = '/\s*#(.+?)\s/';
preg_match($pattern, $subject, $matches);
print_r($matches);
Explanation:
\s*#(.+?)\s - look for anything starting with #, group all the following letters, numbers, and anything which is not a whitespace (space, tab, newline), till the closest whitespace.
See the output of the $matches array for accessing the inner groups and the regex results.
#OP, no need regex. Just PHP string methods
$mystr='This is for #codeworxx';
$str = explode(" ",$mystr);
foreach($str as $k=>$word){
if(substr($word,0,1)=="#"){
print $word;
}
}
Just incase this is helpful to someone in the future
/((?<!\S)#\w+(?!\S))/
This will match any word containing alphanumeric characters, starting with "#." It will not match words with "#" anywhere but the start of the word.
Matching cases:
#username
foo #username bar
foo #username1 bar #username2
Failing cases:
foo#username
#username$
##username
im searching a paragrahp (string) for a certain word. and i want to replace that word with another word, but i want to replace on the second occurence of my find.
here is what i tried
$string = 'hello my name is hello';
$output = str_replace('hello', 'Gary', $string);
// desired output
//hello my name is Gary
It is very simple but i cant get it right. Please bare in mind my string is very long and has all types of characters in it
With this regex : /^.*?hello\b.*?\Khello/ :
^ assert position at start of the string
.*? matches any character (except newline)
\b assert position at a word boundary (^\w|\w$|\W\w|\w\W)
\K resets the starting point of the reported match. Any previously consumed characters are no longer included in the final match
Check this demo : https://regex101.com/r/lW2kK1/2
which gives you :
$re = "/^.*?hello\\b.*?\\Khello/";
$str = "hello my name is hello";
$subst = "Gary";
$result = preg_replace($re, $subst, $str);
How would I remove repeating characters (e.g. remove the letter k in cakkkke for it to be cake)?
One straightforward way to do this would be to loop through each character of the string and append each character of the string to a new string if the character isn't a repeat of the previous character.
Here is some code that can do this:
$newString = '';
$oldString = 'cakkkke';
$lastCharacter = '';
for ($i = 0; $i < strlen($oldString); $i++) {
if ($oldString[$i] !== $lastCharacter) {
$newString .= $oldString[$i];
}
$lastCharacter = $oldString[$i];
}
echo $newString;
Is there a way to do the same thing more concisely using regex or built-in functions?
Use backrefrences
echo preg_replace("/(.)\\1+/", "$1", "cakkke");
Output:
cake
Explanation:
(.) captures any character
\\1 is a backreferences to the first capture group. The . above in this case.
+ makes the backreference match atleast 1 (so that it matches aa, aaa, aaaa, but not a)
Replacing it with $1 replaces the complete matched text kkk in this case, with the first capture group, k in this case.
You want to first match a character, followed by that character repeated: (.)\1+. Replace that with the first character. The brackets create a backreference to the first character, which you use both to match the repeated instances and as the replacement text.
preg_replace('/(.)\1+/', '$1', $str);
i'm not very firm with regular Expressions, so i have to ask you:
How to find out with PHP if a string contains a word starting with # ??
e.g. i have a string like "This is for #codeworxx" ???
I'm so sorry, but i have NO starting point for that :(
Hope you can help.
Thanks,
Sascha
okay thanks for the results - but i did a mistake - how to implement in eregi_replace ???
$text = eregi_replace('/\B#[^\B]+/','\\1', $text);
does not work??!?
why? do i not have to enter the same expression as pattern?
Match anything with has some whitespace in front of a # followed by something else than whitespace:
$ cat 1812901.php
<?php
echo preg_match("/\B#[^\B]+/", "This should #match it");
echo preg_match("/\B#[^\B]+/", "This should not# match");
echo preg_match("/\B#[^\B]+/", "This should match nothing and return 0");
echo "\n";
?>
$ php 1812901.php
100
break your string up like this:
$string = 'simple sentence with five words';
$words = explode(' ', $string );
Then you can loop trough the array and check if the first character of each word equals "#":
if ($stringInTheArray[0] == "#")
Assuming you define a word a sequence of letters with no white spaces between them, then this should be a good starting point for you:
$subject = "This is for #codeworxx";
$pattern = '/\s*#(.+?)\s/';
preg_match($pattern, $subject, $matches);
print_r($matches);
Explanation:
\s*#(.+?)\s - look for anything starting with #, group all the following letters, numbers, and anything which is not a whitespace (space, tab, newline), till the closest whitespace.
See the output of the $matches array for accessing the inner groups and the regex results.
#OP, no need regex. Just PHP string methods
$mystr='This is for #codeworxx';
$str = explode(" ",$mystr);
foreach($str as $k=>$word){
if(substr($word,0,1)=="#"){
print $word;
}
}
Just incase this is helpful to someone in the future
/((?<!\S)#\w+(?!\S))/
This will match any word containing alphanumeric characters, starting with "#." It will not match words with "#" anywhere but the start of the word.
Matching cases:
#username
foo #username bar
foo #username1 bar #username2
Failing cases:
foo#username
#username$
##username