Search string for first word that has an exclamation-mark - php

I have a string like this:
$string = 'Hello k-on! Lorem Ipsum! Lorem.';
I want to get the first word that is followed by an exclamation-mark. So in the example above, it should be:
$word = 'k-on';
I'm lost as to what's the appropriate approach to take. Maybe a regex solution?

If you need to only support ASCII letter words, you can use
/\b[a-z]+(?:-[a-z]+)*!/i
See regex demo
If you plan to support Unicode, use \p{L}:
/\b\p{L}+(?:-\p{L}+)*!/u
See another regex demo
Here is the pattern explanation:
\b - a word boundary (the previous character must be a non-word character or the beginning of the string)
\p{L}+ - 1 or more Unicode letters (or ASCII letters if [a-zA-Z] is used)
(?:-\p{L}+)* - zero or more sequences of:
- - a literal hyphen
\p{L}+ - 1 or more Unicode letters (or ASCII letters if [a-zA-Z] is used)
! - a literal ! symbol
PHP demo:
$re = '/\b\p{L}+(?:-\p{L}+)*!/u';
$str = "Hello k-ąn! Lorem Ipsum! Lorem.";
preg_match($re, $str, $match);
print_r($match);
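Note that $match[0] in the demo above still includes the trailing !. As a minimal sketch (not part of the original answer), a lookahead (?=!) can assert the exclamation mark without consuming it, so the match is the bare word. The variable names are illustrative.

```php
<?php
// Same pattern as above, but the "!" is only asserted with a lookahead,
// so it does not become part of the match.
$re = '/\b\p{L}+(?:-\p{L}+)*(?=!)/u';
$str = 'Hello k-on! Lorem Ipsum! Lorem.';

$word = null;
if (preg_match($re, $str, $match)) {
    $word = $match[0];
}
echo $word; // k-on
```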

I think this might do what you're looking for. Basically, split the string into words, look for the first word that ends in '!', do whatever you need with it, then break out of the loop:
$string = 'Hello k-on! Lorem Ipsum! Lorem.';
$arry = explode(" ", $string);
foreach ($arry as $word) {
    if (substr($word, -1) == "!") {
        // do something ...
        break;
    }
}
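One possible way to fill in the loop body, as a sketch only: store the word with its trailing ! stripped by rtrim(). The $found variable is an assumption for illustration. This approach splits on single spaces, so a word followed by other punctuation (for example k-on!,) would not be detected.

```php
<?php
$string = 'Hello k-on! Lorem Ipsum! Lorem.';

$found = null; // stays null when no word ends in "!"
foreach (explode(' ', $string) as $word) {
    if (substr($word, -1) === '!') {
        $found = rtrim($word, '!'); // drop the trailing exclamation mark
        break;
    }
}
echo $found; // k-on
```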

$string = 'Hello k-on! Lorem Ipsum! Lorem.';
preg_match('/[A-Za-z0-9-]+!/', $string, $match);
$yourWord = str_replace("!", "", $match[0]); // $yourWord is now 'k-on'
Obviously, the solution for this requirement is a regular expression. Here I used a simple expression that allows alphanumeric characters and, as an exception, the hyphen (-). preg_match matches the pattern against the string and returns the first match, which in your case is k-on!, and str_replace then strips the exclamation mark from the returned string.
Learn more about preg_match: http://php.net/manual/en/function.preg-match.php

Related

Php select from string

Hi, I'm new to PHP and I need a little help.
I need to take the text that is between the * characters in a PHP string and put it inside an HTML tag.
$text = "this is an *example*";
But I really don't know how to do it.
Personally, I would use explode; you can then piece the sentence back together if the example appears in the middle of a sentence:
<?php
$text = "this is an *example*";
$pieces = explode("*", $text);
echo $pieces[0];
?>
Edit:
Since you're looking for what basically amounts to custom BB Code, use this:
$text = "this is an *example*";
$find = '~[\*](.*?)[\*]~s';
$replace = '<span style="color: green">$1</span>';
echo preg_replace($find,$replace,$text);
You can wrap this in a function and have it parse any text that gets passed to it. You can also turn the find and replace variables into arrays and add more codes to them.
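A rough sketch of that idea follows. The function name parseMarkup() and the second _text_ rule are hypothetical, added only to show how the arrays line up; preg_replace applies each pattern in order with the replacement at the same index.

```php
<?php
// Hypothetical helper: parseMarkup() and the "_..._" rule are illustrative only.
function parseMarkup(string $text): string
{
    $find = [
        '~\*(.*?)\*~s', // *text*
        '~_(.*?)_~s',   // _text_ (assumed second code)
    ];
    $replace = [
        '<span style="color: green">$1</span>',
        '<em>$1</em>',
    ];
    return preg_replace($find, $replace, $text);
}

$html = parseMarkup('this is an *example* with _emphasis_');
echo $html;
```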
You really should use a DOM parser for things like this, but if you can guarantee it will always be the * character, you can use a regex:
$text = "this is an *example*";
$regex = '/(?<=\*)(.*?)(?=\*)/';
$replacement = 'ostrich';
$new_text = preg_replace($regex, $replacement, $text);
echo $new_text;
Returns
this is an *ostrich*
Here is how the regex works:
Positive Lookbehind (?<=\*)
\* matches the character * literally (case sensitive)
1st Capturing Group (.*?)
.*? matches any character (except for line terminators)
*? Quantifier — Matches between zero and unlimited times, as few times as possible, expanding as needed (lazy)
Positive Lookahead (?=\*)
\* matches the character * literally (case sensitive)
This regex essentially starts and ends by looking at what is ahead of and behind the search character you specified and leaves those characters intact during the replacement with preg_replace().

Replace all the first character of words in a string using preg_replace()

I have a string as
This is a sample text. This text will be used as a dummy for "various" RegEx "operations" using PHP.
I want to select and replace the first character of each word (in the example: T,i,a,s,t,T,t,w,b,u,a,d,f,",R,",u,P). How do I do it?
I tried /\b.{1}\w+\b/. I read the expression as "select any character that has a length of 1, followed by a word of any length", but it didn't work.
You may try this regex as well:
(?<=\s|^)([a-zA-Z"])
Demo
Your regex - /\b.{1}\w+\b/ - matches a word boundary, then any single character (which can even be a whitespace if there is a letter/digit/underscore in front of it), then 1 or more word characters (\w) up to the next word boundary. So it consumes whole words rather than just their first character.
That \b. is the culprit here.
If you plan to match any non-whitespace preceded with a whitespace, you can just use
/(?<!\S)\S/
Or
/(?<=^|\s)\S/
See demo
Then, replace with any symbol you need.
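As a short sketch of the replacement step, using a shortened sample string and an underscore as the (assumed) replacement symbol:

```php
<?php
$string = 'This is a "sample" text.';

// (?<!\S) asserts that the previous character is not a non-whitespace one,
// i.e. we are at the start of the string or right after whitespace.
$result = preg_replace('/(?<!\S)\S/', '_', $string);
echo $result; // _his _s _ _sample" _ext.
```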
You may try to use the following regex:
(.)[^\s]*\s?
Use preg_match_all and implode the group 1 captures:
<?php
$string = 'This is a sample text. This text will be used as a dummy for '
. '"various" RegEx "operations" using PHP.';
$pattern = '/(.)[^\s]*\s?/';
$matches = [];
preg_match_all($pattern, $string, $matches);
$output = implode('', $matches[1]);
echo $output; //Output is TiastTtwbuaadf"R"uP
To replace, use something like preg_replace_callback:
$pattern = '/(.)([^\s]*\s?)/';
$output2 = preg_replace_callback($pattern,
function($match) { return '_' . $match[2]; }, $string);
//result: _his _s _ _ample _ext. _his _ext _ill _e _sed _s _ _ummy _or _various" _egEx _operations" _sing _HP.

PHP Regex: Remove words less than 3 characters

I'm trying to remove all words of less than 3 characters from a string, specifically with RegEx.
The following doesn't work because it is looking for double spaces. I suppose I could convert all spaces to double spaces beforehand and then convert them back after, but that doesn't seem very efficient. Any ideas?
$text='an of and then some an ee halved or or whenever';
$text=preg_replace('# [a-z]{1,2} #',' ',' '.$text.' ');
echo trim($text);
Removing the Short Words
You can use this:
$replaced = preg_replace('~\b[a-z]{1,2}\b~', '', $yourstring);
In the demo, see the substitutions at the bottom.
Explanation
\b is a word boundary that matches a position where one side is a word character (letter, digit, or underscore) and the other side is not (for instance a space character, or the beginning of the string)
[a-z]{1,2} matches one or two letters
\b another word boundary
Replace with the empty string.
Option 2: Also Remove Trailing Spaces
If you also want to remove the spaces after the words, we can add \s* at the end of the regex:
$replaced = preg_replace('~\b[a-z]{1,2}\b\s*~', '', $yourstring);
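Applied to the sample text from the question, Option 2 can be sketched like this. The trim() call is an addition, not part of the answer: it covers the case where a short word at the end of the string leaves a trailing space behind.

```php
<?php
$text = 'an of and then some an ee halved or or whenever';

// Remove every 1-2 letter word together with the whitespace that follows it.
$replaced = trim(preg_replace('~\b[a-z]{1,2}\b\s*~', '', $text));
echo $replaced; // and then some halved whenever
```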
Reference
Word Boundaries
You can use the word boundary tag: \b:
Replace: \b[a-z]{1,2}\b with ''
Use this
preg_replace('/(\b.{1,2}\s)/','',$your_string);
Although some of the solutions here worked, they had a problem with my language's multi-character letters, such as "ch". A simple explode and implode worked for me.
$maxWordLength = 3;
$string = "my super string";
$exploded = explode(" ", $string);
foreach($exploded as $key => $word) {
if(mb_strlen($word) < $maxWordLength) unset($exploded[$key]);
}
$string = implode(" ", $exploded);
echo $string;
// outputs "super string"
To me, it seems that this works fine with most PHP versions:
$string2 = preg_replace("/\b[a-zA-Z0-9]{1,2}\b/i", "", trim($string1));
Where [a-zA-Z0-9] is the accepted range of letters and digits.

Replace symbol if it is preceded and followed by a word character

I want to change a specific character only if both the previous and the following characters are English letters. In other words, the target character is inside a word and is not its first or last character.
For Example...
$string = "I am learn*ing *PHP today*";
I want this string to be converted as following.
$newString = "I am learn'ing *PHP today*";
$string = "I am learn*ing *PHP today*";
$newString = preg_replace('/(\w)\*(\w)/', '$1\'$2', $string);
// $newString = "I am learn'ing *PHP today*"
This will match an asterisk surrounded by word characters (letters, digits, underscores). If you only want to do alphabet characters you can do:
preg_replace('/([a-zA-Z])\*([a-zA-Z])/', '$1\'$2', 'I am learn*ing *PHP today*');
The most concise way would be to use word boundaries (\b) in your pattern -- they represent a zero-width position between a "word" character and a "non-word" character. Since * is a non-word character, the word boundaries require both neighboring characters to be word characters.
No capture groups, no references.
Code: (Demo)
$string = "I am learn*ing *PHP today*";
echo preg_replace('~\b\*\b~', "'", $string);
Output:
I am learn'ing *PHP today*
To replace only alphabetical characters, you need to use a [a-z] as a character range, and use the i flag to make the regex case-insensitive. Since the character you want to replace is an asterisk, you also need to escape it with a backslash, because an asterisk means "match zero or more times" in a regular expression.
$newstring = preg_replace('/([a-z])\*([a-z])/i', "$1'$2", $string);
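As a sketch of a letters-only variant without capture groups, lookarounds can check the neighboring characters without consuming them. This is an alternative formulation, not the code from the answers; the extra 2*3 in the sample string is added to show that digits are left alone.

```php
<?php
$string = "I am learn*ing *PHP today* 2*3";

// (?<=[a-z]) and (?=[a-z]) only assert that a letter sits on each side,
// so the replacement string needs no $1/$2 references.
$newString = preg_replace('/(?<=[a-z])\*(?=[a-z])/i', "'", $string);
echo $newString; // I am learn'ing *PHP today* 2*3
```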
To replace all occurrences of an asterisk surrounded by word characters:
$string = preg_replace('/(\w)\*(\w)/', '$1\'$2', $string);
And to replace the asterisks where they are the first and last characters of a word:
$string = preg_replace('/\*(\w+)\*/','\'$1\'', $string);

PHP regular expression find and append to string

I'm trying to use regular expressions (preg_match and preg_replace) to do the following:
Find a string like this:
{%title=append me to the title%}
Then extract out the title part and the append me to the title part. Which I can then use to perform a str_replace(), etc.
Given that I'm terrible at regular expressions, my code is failing...
preg_match('/\{\%title\=(\w+.)\%\}/', $string, $matches);
What pattern do I need? :/
I think it's because \w doesn't match spaces. Everything between the equals sign and the closing % has to match what is inside the parentheses, or else the entire expression fails to match.
This bit of code worked for me:
$str = '{%title=append me to the title%}';
preg_match('/{%title=([\w ]+)%}/', $str, $matches);
print_r($matches);
//gives:
//Array ([0] => {%title=append me to the title%} [1] => append me to the title )
Note that the use of + (one or more) means that an empty value, i.e. {%title=%}, won't match. Depending on what whitespace you expect, you might want to use \s next to \w inside the character class instead of a literal space character; \s will also match tabs, newlines, etc.
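A small sketch of both variations, assuming you do want to accept an empty value and any whitespace: the class becomes [\w\s] and the quantifier becomes * (zero or more). The variable names are illustrative.

```php
<?php
$pattern = '/{%title=([\w\s]*)%}/';

// A tab inside the value is accepted because \s is in the class.
preg_match($pattern, "{%title=append me\tto the title%}", $m1);

// An empty value is accepted because of * instead of +.
$emptyMatched = preg_match($pattern, '{%title=%}', $m2);

var_dump($m1[1], $emptyMatched, $m2[1]);
```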
You can try:
$str = '{%title=append me to the title%}';
// capture the thing between % and = as title
// and between = and % as the other part.
if(preg_match('#{%(\w+)\s*=\s*(.*?)%}#',$str,$matches)) {
$title = $matches[1]; // extract the title.
$append = $matches[2]; // extract the appending part.
}
// find these.
$find = array("/$append/","/$title/");
// replace the found things with these.
$replace = array('IS GOOD','TITLE');
// use preg_replace for replacement.
$str = preg_replace($find,$replace,$str);
var_dump($str);
Output:
string(17) "{%TITLE=IS GOOD%}"
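One caveat with building patterns from captured text, shown here as a hedged sketch: if the captured part contains regex metacharacters, interpolating it directly into "/$append/" changes the pattern's meaning. Escaping it with preg_quote() first avoids that. The sample string below is made up for illustration; for a purely literal replacement, str_replace() would also do.

```php
<?php
// Made-up input whose value contains regex metacharacters: ( ) ? +
$str = '{%title=price (USD)? 5+5%}';

$result = $str;
if (preg_match('#{%(\w+)\s*=\s*(.*?)%}#', $str, $matches)) {
    // preg_quote() escapes the metacharacters and the "/" delimiter.
    $find = '/' . preg_quote($matches[2], '/') . '/';
    $result = preg_replace($find, 'IS GOOD', $str);
}
var_dump($result);
```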
Note:
In your regex: /\{\%title\=(\w+.)\%\}/
There is no need to escape % as it's not a meta char.
There is no need to escape { and }. These are meta chars, but only when used as a quantifier in the form of {min,max} or {,max} or {min,} or {num}. So in your case they are treated literally.
Try this:
preg_match('/(title)\=(.*?)([%}])/s', $string, $matches);
$matches[1] has your title and $matches[2] has the other part.
