PHP Regex match one letter and one number

I'm trying to replace any occurrences when you find a single letter followed by a single number in a string.
$word = 'AB001J1'; //or ZR010F2 or ZQ10B5
echo str_replace('/^(?=.*\pL)(?=.*\p{Nd})/', '', $word);
Trying to get the result AB001 //or ZR010 or ZQ10

A regex splitting approach works well here:
$word = 'AB001J1';
$output = preg_split("/(?<=[0-9])(?=[A-Z])/", $word, 2)[0];
echo $output; // AB001
The above strategy is to split the input string at any point in between a digit and uppercase letter (in that order). This separates the various terms, and we retain only the first one.
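Note that str_replace() treats its first argument as a literal string, not as a pattern, which is why the attempt in the question changes nothing. If the goal is literally to delete a trailing letter+digit pair rather than split, a preg_replace() sketch along these lines may also do. It assumes the pair always sits at the very end of the string; the helper name stripTrailingPair is made up for illustration.

```php
<?php
// Hypothetical helper: removes one letter followed by one digit at the end of the string.
function stripTrailingPair(string $word): string
{
    // \pL = any Unicode letter, \d = one digit, $ = end of string, u = UTF-8 mode
    return preg_replace('/\pL\d$/u', '', $word);
}

foreach (['AB001J1', 'ZR010F2', 'ZQ10B5'] as $word) {
    echo stripTrailingPair($word), "\n"; // AB001, ZR010, ZQ10
}
```

Unlike the split, this leaves the string untouched when it does not end in a letter+digit pair.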

Related

How to get the index of last word with an uppercase letter in PHP

Considering this input string:
"this is a Test String to get the last index of word with an uppercase letter in PHP"
How can I get the position of the last word that starts with an uppercase letter (in this example, the position of the first "P", not the last "P", of the word "PHP")?
I think this regex works. Give it a try.
https://regex101.com/r/KkJeho/1
$pattern = "/.*\s([A-Z])/";
//$pattern = "/.*\s([A-Z])[A-Z]+/"; pattern to match only all caps word
Edit, to address what Wiktor wrote in the comments: I think you could use str_replace to replace all newlines in the input string with spaces before applying the regex.
That should make the regex treat the input as a single line and still give the correct output.
Not tested though.
To find the position of the letter/word:
$str = "this is a Test String to get the last index of word with an uppercase letter in PHP";
$pattern = "/.*\s([A-Z])(\w+)/";
//$pattern = "/.*\s([A-Z])([A-Z]+)/"; pattern to match only all caps word
preg_match($pattern, $str, $match);
$letter = $match[1];
$word = $match[1] . $match[2];
$position = strrpos($str, $match[1].$match[2]);
echo "Letter to find: " . $letter . "\nWord to find: " . $word . "\nPosition of letter: " . $position;
https://3v4l.org/sJilv
If you also want to consider a non-regex version: split the string on spaces, iterate over the resulting array backwards, and check whether the current word's first character is uppercase. Something like this (you may want to add index/null checks):
<?php
$str = "this is a Test String to get the last index of word with an uppercase letter in PHP";
$explodeStr = explode(" ", $str);
$i = count($explodeStr) - 1;
$characterCount = 0; // characters (including spaces) to the right of the current word
while ($i >= 0) {
    $firstChar = $explodeStr[$i][0];
    // ctype_upper() is true only for uppercase letters, so digits and punctuation do not count
    if (ctype_upper($firstChar)) {
        echo $explodeStr[$i] . ' at index: ';
        // offset = total length - length of this word - everything to its right
        $idx = strlen($str) - strlen($explodeStr[$i]) - $characterCount;
        echo $idx;
        break;
    }
    $characterCount += strlen($explodeStr[$i]) + 1; // +1 for whitespace
    $i--;
}
This prints "PHP at index: 80"; 80 is indeed the index of the first P in PHP (counting whitespace).
Andreas' pattern looks pretty solid, but this will find the position faster...
.* \K[A-Z]{2,}
Here is the PHP implementation:
$str='this is a Test String to get the last index of word with an uppercase letter in PHP test';
var_export(preg_match('/.* \K[A-Z]{2,}/',$str,$out,PREG_OFFSET_CAPTURE)?$out[0][1]:'fail');
// 80
If you want to see a condensed non-regex method, this will work:
Code:
$str='this is a Test String to get the last index of word with an uppercase letter in PHP test';
$allcaps=array_filter(explode(' ',$str),'ctype_upper');
echo "Position = ",strrpos($str,end($allcaps));
Output:
Position = 80
This assumes that there is an all caps word in the input string. If there is a possibility of no all-caps words, then a conditional would sort it out.
Edit, after re-reading the question, I am unsure what exactly makes PHP the targeted substring -- whether it is because it is all caps, or just the last word to start with a capitalized letter.
If just the last word starting with an uppercase letter then this pattern will do: /.* \K[A-Z]/
If the word needs to be all caps, then it is possible that \b word boundaries may be necessary.
Some more samples and explanation from the OP would be useful.
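To make the two readings concrete, here is a rough sketch of both with \b word boundaries added. The helper names lastCapitalizedOffset and lastAllCapsOffset are invented for this example, and it assumes ASCII input (offsets are byte offsets, and \b treats digits and underscores as word characters).

```php
<?php
// Hypothetical helpers; both return a byte offset, or null when nothing matches.
function lastCapitalizedOffset(string $str): ?int
{
    // Greedy .* backtracks to the last word boundary followed by A-Z;
    // \K discards everything matched so far, so the match starts at the letter.
    return preg_match('/.*\b\K[A-Z]/s', $str, $m, PREG_OFFSET_CAPTURE) ? $m[0][1] : null;
}

function lastAllCapsOffset(string $str): ?int
{
    // Same idea, but the word must consist of two or more uppercase letters only.
    return preg_match('/.*\b\K[A-Z]{2,}\b/s', $str, $m, PREG_OFFSET_CAPTURE) ? $m[0][1] : null;
}

$str = 'this is a Test String to get the last index of word with an uppercase letter in PHP test';
var_dump(lastCapitalizedOffset($str));       // int(80)
var_dump(lastAllCapsOffset($str));           // int(80)
var_dump(lastAllCapsOffset('no caps here')); // NULL
```

Returning null instead of 'fail' keeps the "no match" case distinguishable from a valid offset of 0.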
Another edit: you can declare a set of characters to exclude and use just two string functions. I am passing a-z and a space to rtrim(), then finding the right-most space and adding 1 to its position.
$str='this is a Test String to get the last index of word with an uppercase letter in PHP test';
echo strrpos(rtrim($str,'abcdefghijklmnopqrstuvwxyz '),' ')+1;
// 80

Regex Preg_match_all match all pattern

Here is my concern,
I have a string and I need to extract characters two by two.
$str = "abcdef" should return array('ab', 'bc', 'cd', 'de', 'ef'). I want to use preg_match_all instead of loops. Here is the pattern I am using.
$str = "abcdef";
preg_match_all('/[\w]{2}/', $str);
The thing is, it returns Array('ab', 'cd', 'ef'). It misses 'bc' and 'de'.
I have the same problem if I want to extract a certain number of words
$str = "ab cd ef gh ij";
preg_match_all('/([\w]+ ){2}/', $str); // returns array('ab cd', 'ef gh'), I'm also missing the last part
What am I missing? Or is it simply not possible to do so with preg_match_all?
For the first problem, what you want to do is match overlapping strings, and this requires a zero-width (non-consuming) lookahead with a capturing group inside it to grab the characters:
/(?=(\w{2}))/
The regex above will capture the match in the first capturing group.
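A minimal sketch of how that looks with preg_match_all(), using the sample string from the question. The overall match at each position is an empty string, so the pairs have to be read from group 1:

```php
<?php
$str = 'abcdef';
// The lookahead matches an empty string at each position;
// group 1 holds the two characters seen ahead of that position.
preg_match_all('/(?=(\w{2}))/', $str, $matches);
print_r($matches[1]); // ab, bc, cd, de, ef
```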
For the second problem, it seems that you also want overlapping strings. Using the same trick:
/(?=(\b\w+ \w+\b))/
Note that \b is added to check the boundary of the word. Since the match does not consume text, the next match will be attempted at the next index (which is in the middle of the first word), instead of at the end of the 2nd word. We don't want to capture from middle of a word, so we need the boundary check.
Note that \b's definition is based on \w, so if you ever change the definition of a word, you need to emulate the word boundary with look-ahead and look-behind with the corresponding character set.
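As a sketch of both points: the first call uses \b as above, and the second emulates the boundary with look-behind and look-ahead for a custom definition of a word. Treating the hyphen as a word character is only an assumption made for this example.

```php
<?php
// Overlapping word pairs with \b guarding both ends.
preg_match_all('/(?=(\b\w+ \w+\b))/', 'ab cd ef gh ij', $pairs);
print_r($pairs[1]); // ab cd, cd ef, ef gh, gh ij

// Custom "word" = letters, digits, underscore, or hyphen (assumed for illustration).
// (?<![\w-]) and (?![\w-]) stand in for \b with that character set.
preg_match_all('/(?=((?<![\w-])[\w-]+ [\w-]+(?![\w-])))/', 'ab-c d e-f', $custom);
print_r($custom[1]); // ab-c d, d e-f
```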
In case you need a non-regex solution, try this:
<?php
$str = "abcdef";
$len = strlen($str);
$arr = array();
for($count = 0; $count < ($len - 1); $count++)
{
$arr[] = $str[$count].$str[$count+1];
}
print_r($arr);
?>

Php replace exact word

Here is my problem:
Using preg_replace('#\b(word)\b#','****',$text);
Where in text I have word\word and word, the preg_replace above replaces both word\word and word so my resulting string is ***\word and ***.
I want my string to look like : word\word and ***.
Is this possible? What am I doing wrong???
LATER EDIT
I have an array with urls, I foreach that array and preg_replace the text where url is found, but it's not working.
For instance, I have http://www.link.com and http://www.link.com/something
If I have http://www.link.com it also replaces http://www.link.com/something.
You are effectively specifying that you don't want certain characters to count as word boundary. Therefore you need to specify the "boundaries" yourself, something like this:
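preg_replace('#(^|[^\w\\\\])(word)([^\w\\\\]|$)#', '$1***$3', $text);
What this does is search for the word surrounded by the start or end of the string, or by any non-word character except the backslash \. The backslash is written four times because PHP's single-quoted string reduces \\\\ to \\, which the regex engine then reads as one literal backslash; with only two, the character class would never be closed. The replacement puts the captured neighbours ($1 and $3) back so that only the word itself is masked. It will therefore match .word, but not .word\ and not \word. If you need to exclude other characters from matching, just add them inside the brackets.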
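The same custom boundaries can also be written with lookarounds, which do not consume the neighbouring characters, so nothing has to be put back in the replacement. The sketch below is one way to do it; the helper name maskExact is hypothetical, and adding the forward slash to the excluded set (to cover the URL case from the question's later edit) is an assumption.

```php
<?php
// Hypothetical helper: masks $needle only where it is not glued to a word
// character, a backslash, or a forward slash on either side.
function maskExact(string $needle, string $text, string $mask = '***'): string
{
    // '\\\\' in a single-quoted PHP string becomes \\ in the regex (one literal backslash).
    $pattern = '#(?<![\w\\\\/])' . preg_quote($needle, '#') . '(?![\w\\\\/])#';
    return preg_replace($pattern, $mask, $text);
}

echo maskExact('word', 'word\word and word'), "\n";
// word\word and ***
echo maskExact('http://www.link.com', 'http://www.link.com and http://www.link.com/something'), "\n";
// *** and http://www.link.com/something
```

preg_quote() keeps characters such as the dots in a URL from being read as regex metacharacters.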
You could just use str_replace("word\word", "word\word and"); I don't really see why you would need preg_replace in the case given above.
Here is a simple solution that doesn't use a regex. It will ONLY replace single occurrences of 'word' where it is a lone word.
<?php
$text = "word\word word cat dog";
$new_text = "";
$words = explode(" ",$text); // split the string into separate 'words'
$inc = 0; // loop counter
foreach($words as $word){
if($word == "word"){ // if the current word in the array of words matches the criteria, replace it
$words[$inc] = "***";
}
$new_text.= $words[$inc]." ";
$inc ++;
}
echo $new_text; // gives 'word\word *** cat dog'
?>

preg_split : Get first word in a line

Can you please help assemble a regex to be used in preg_split which will split a string on its first word - case insensitive (up until the first space).
This should work
$result = preg_split('/\s/', trim($subject));
$firstword = $result[0];
If the sentence uses spaces as word separators, you can do:
list($firstWord) = explode(' ',trim($input));
If you just need to split up until the first space character, your regex is essentially just a space character:
$output = preg_split('/ /', 'My name is Mansoor', 2);
echo $output[0]; // Will return 'My';
echo $output[1]; // will return 'name is Mansoor';
If you only need the first word, make sure you pass the optional argument (the 2) to specify that you want only two results in your $output array -- the first word, and the rest of the sentence. Otherwise, you'll spend time parsing text that you don't care about.
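Putting the pieces from these answers together (trim, a whitespace pattern, and the limit of 2) might look like the sketch below. The helper name firstWord is made up, and using \s+ so that tabs or repeated spaces count as one separator is an assumption about the input.

```php
<?php
// Hypothetical helper: first whitespace-delimited word of a string.
function firstWord(string $input): string
{
    // \s+ treats a run of spaces, tabs, or newlines as one separator;
    // the limit of 2 stops splitting after the first word.
    return preg_split('/\s+/', trim($input), 2)[0];
}

echo firstWord("  My\tname is Mansoor"); // My
```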

Split alphanumeric string between leading digits and trailing letters

I have a string like:
$Order_num = "0982asdlkj";
How can I split that into the 2 variables, with the number as one element and then another variable with the letter element?
The number element can be any length from 1 to 4 say and the letter element fills the rest to make every order_num 10 characters long in total.
I have found the php explode function...but don't know how to make it in my case because the number of numbers is between 1 and 4 and the letters are random after that, so no way to split at a particular letter.
You can use preg_split using lookahead and lookbehind:
print_r(preg_split('#(?<=\d)(?=[a-z])#i', "0982asdlkj"));
prints
Array
(
[0] => 0982
[1] => asdlkj
)
This only works if the letter part really only contains letters and no digits.
Update:
Just to clarify what is going on here:
The regular expression looks at every position, and if a digit is before that position ((?<=\d)) and a letter after it ((?=[a-z])), then it matches and the string gets split at this position. The whole thing is case-insensitive (i).
Use preg_match() with a regular expression of (\d+)([a-zA-Z]+). If you want to limit the number of digits to 1-4 and letters to 6-9, change it to (\d{1,4})([a-zA-Z]{6,9}).
preg_match("/(\\d+)([a-zA-Z]+)/", "0982asdlkj", $matches);
print("Integer component: " . $matches[1] . "\n");
print("Letter component: " . $matches[2] . "\n");
Outputs:
Integer component: 0982
Letter component: asdlkj
http://ideone.com/SKtKs
You can also do it using preg_split by splitting your input at the point between the digits and the letters:
list($num,$alpha) = preg_split('/(?<=\d)(?=[a-z]+)/i',$Order_num);
You can use a regex for that.
preg_match('/(\d{1,4})([a-z]+)/i', $str, $matches);
array_shift($matches);
list($num, $alpha) = $matches;
Check this out
<?php
$Order_num = "0982asdlkj";
$split=preg_split("/[0-9]/",$Order_num); // split() was removed in PHP 7; preg_split() is the replacement
$alpha=$split[(sizeof($split))-1];
$number=explode($alpha, $Order_num);
echo "Alpha -".$alpha."<br>";
echo "Number-".$number[0];
?>
My preferred approach would be sscanf() because it is concise, doesn't need regex, offers the ability to cast the numeric segment as integer type, and doesn't generate needless fullstring matches like preg_match(). %s does rely, though, on the fact that there will be no whitespaces in the letters segment of the string.
$Order_num = "0982asdlkj";
var_export (
sscanf($Order_num, '%d%s')
);
This can also be set up to declare individual variables.
sscanf($Order_num, '%d%s', $numbers, $letters);
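One thing to keep in mind with %d: the integer cast drops leading zeros, so "0982" comes back as 982. If the digits need to stay a string, a character-class placeholder is one option; the sketch below assumes the numeric part contains only 0-9.

```php
<?php
$Order_num = "0982asdlkj";

// %d casts the digits to an integer, so the leading zero is lost.
$asInt = sscanf($Order_num, '%d%s');
var_export($asInt);    // 982 and 'asdlkj'

// %[0-9] captures the same run of digits as a string instead.
$asString = sscanf($Order_num, '%[0-9]%s');
var_export($asString); // '0982' and 'asdlkj'
```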
If wanting to use a preg_ function, preg_split() is most appropriate, but I wouldn't use expensive lookarounds. Match the digits, then forget them (with \K). This will split the string without consuming any characters.
var_export (
preg_split('/\d+\K/', $Order_num)
);
To assign variables, use "symmetric array destructuring".
[$numbers, $letters] = preg_split('/\d+\K/', $Order_num);
Beyond these single function approaches, there will be MANY two function approaches like:
$numbers = rtrim($Order_num, 'a..z');
$letters = ltrim($Order_num, '0..9');
But I wouldn't use them in a professional script because they lack elegance.
