I am searching a line using preg_match_all, but I don't know in advance exactly what the line will look like. For example, it could look like this:
XXX012-013-015-######
Or it could look like this:
XXX012-013-015-XXX001-002-######
Where each X is any letter and each # is any digit.
This is the relevant portion of the preg_match_all code, which works exactly as expected if the line is always set up like the first example:
if (preg_match_all('#([A-Z]{3})((?:[0-9]{3}[->]{1}){1,32})([0-9]{2})([0-9]{2})([0-9]{2})...rest of code...#', $wwalist, $matches)) {
$wwaInfo['locationabbrev'][$wwanum] = $matches[2][$keys[$wwanum]];
}
$matches[2] then contains "012-013-015" as expected. Since the first part, XXX012-013-015, can repeat, I need $matches[2] to contain the following when the pattern is run on the second example:
012-013-015-001-002
This was my attempt, but it does not work:
if (preg_match_all('#([A-Z]{3})((?:[0-9]{3}[->]{1}){1,32})((?:[A-Z]{3}){0,1})(?:((?:[0-9]{3}[->]{1}){1,3}){0,3})([0-9]{2})([0-9]{2})([0-9]{2})...rest of code...#', $wwalist, $matches)) {
Hopefully this makes sense. Any help would be much appreciated! Thanks!
You aren't going to be able to match and join matches in the same step.
Will this work for you:
Code:
$strings=[
'ABC012-013-015-XYZ001-002-345435',
'ABC012-013-015-345453',
'XYZ013-014-015-016-EFG017-123456'
];
foreach($strings as $s){
if(preg_match('/[A-Z]{3}\d{3}/',$s)){ // check if string qualifies
echo 'Match found. Prepared string: ';
$s=preg_replace('/([A-Z]{3}|-\d{6})/','',$s); // remove unwanted substrings
echo "$s\n";
}
}
Output:
Match found. Prepared string: 012-013-015-001-002
Match found. Prepared string: 012-013-015
Match found. Prepared string: 013-014-015-016-017
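If you would rather keep a match step and do the joining afterwards, here is a minimal sketch of that two-step idea. The function name extractLocationCodes is hypothetical, and it assumes (as in the sample strings above) that the line ends with a six-digit block right after the location codes; adapt the lookahead if more text follows.

```php
<?php
// Hypothetical helper, not from the original answer.
// Assumption: the line ends with six digits directly after the codes.
function extractLocationCodes(string $line): ?string
{
    // Step 1: match the whole location section, e.g. "ABC012-013-015-XYZ001-002-".
    // The two alternatives cannot match the same text, so backtracking stays cheap.
    if (!preg_match('/^[A-Z]{3}(?:\d{3}-|[A-Z]{3})+(?=\d{6}$)/', $line, $m)) {
        return null; // line does not have the expected shape
    }

    // Step 2: collect every three-digit code from that section and join them.
    preg_match_all('/\d{3}/', $m[0], $codes);

    return implode('-', $codes[0]);
}

echo extractLocationCodes('ABC012-013-015-XYZ001-002-345435'), "\n";
```

This keeps the validation in one pattern and the joining in plain PHP, which is usually easier to read than one large regex.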
You could use a replace call and then output a new string with the matches. For example, with $str containing these lines:
ABC012-013-015-XYZ001-002-345435
ABC012-013-015-345453
XYZ013-014-015-016-EFG017-123456
$rep = preg_replace( '/(?mi-Us)([^0-9-\n]{3,})|-[0-9]{4,}/', '', $str) ;
echo ( $rep );
Should result in:
012-013-015-001-002
012-013-015
013-014-015-016-017
To output to an array:
$mat = preg_match_all( '/([0-9-]+)\n/', $rep, $res) ;
print_r( $res[1] ) ;
foreach( $res[1] as $result ) {
echo $result . "\n" ;
}
For the code you've shown you could probably do:
$rep = preg_replace( '/(?mi-Us)([^0-9-\n]{3,})|-[0-9]{4,}/', '', $wwalist ) ;
if (preg_match_all('/([0-9-]+)\n/', $rep, $matches)) {
$wwaInfo['locationabbrev'][$wwanum] = $matches[1][$keys[$wwanum]];
print_r( $wwaInfo['locationabbrev'][$wwanum] ); // comment out when done testing
}
Which should return the array:
Array
(
[0] => 012-013-015-001-002
[1] => 012-013-015
[2] => 013-014-015-016-017
)
I've spent the last 4 hours trying to figure out how to ... and now I have to ask for your help.
I'm trying to extract from a text multiple substrings that match my starting_words_array and ending_words_array.
$str = "Do you see that ? Indeed, I can see that, as well as this." ;
$starting_words_array = array('do','I');
$ending_words_array = array('?',',');
expected output : array ([0] => 'Do you see that ?' [1] => 'I can see that,')
I managed to write a first function that finds the first substring matching an item from each of the two arrays. But I can't work out how to loop it in order to get all the substrings matching my requirement.
function SearchString($str, $starting_words_array, $ending_words_array ) {
forEach($starting_words_array as $test) {
$pos = strpos($str, $test);
if ($pos===false) continue;
$found = [];
forEach($ending_words_array as $test2) {
$posStart = $pos+strlen($test);
$pos2 = strpos($str, $test2, $posStart);
$found[] = ($pos2!==false) ? $pos2 : INF;
}
$min = min($found);
if ($min !== INF)
return substr($str,$pos,$min-$pos) .$str[$min];
}
return '';
}
Do you guys have any idea about how to achieve such thing ?
I use preg_match for my solution. However, the start and end strings must be escaped with preg_quote. Without that, the solution will be wrong.
function searchString($str, $starting_words_array, $ending_words_array ) {
$resArr = [];
forEach($starting_words_array as $i => $start) {
$end = $ending_words_array[$i] ?? "";
$regEx = '~'.preg_quote($start,"~").".*".preg_quote($end,"~").'~iu';
if(preg_match($regEx,$str,$match)){
$resArr[] = $match[0];
}
}
return $resArr;
}
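For the sample string this finds "Do you see that ?" and "Indeed, I can see that,": because of the i flag, the start word I already matches the first letter of "Indeed", and the greedy .* runs to the last comma.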
If the expressions can occur more than once, preg_match_all must be used instead, and the regex must be modified to use a lazy quantifier (.*?).
function searchString($str, $starting_words_array, $ending_words_array ) {
$resArr = [];
forEach($starting_words_array as $i => $start) {
$end = $ending_words_array[$i] ?? "";
$regEx = '~'.preg_quote($start,"~").".*?".preg_quote($end,"~").'~iu';
if(preg_match_all($regEx,$str,$match)){
$resArr = array_merge($resArr,$match[0]);
}
}
return $resArr;
}
The result for the second variant:
array (
0 => "Do you see that ?",
1 => "Indeed,",
2 => "I can see that,",
)
I would definitely use regex and preg_match_all(). I won't give you a full working code example here but I will outline the necessary steps.
First, build a regex from your start-end-pairs like that:
$parts = array_map(
function($start, $end) {
return $start . '.+' . $end;
},
$starting_words_array,
$ending_words_array
);
$regex = '/' . join('|', $parts) . '/i';
The /i part means case-insensitive search. Some characters, like the ?, have a special meaning in regex, so you need to extend the above function to escape them properly (preg_quote does this).
You can test your final regex here
Then use preg_match_all() to extract your substrings:
preg_match_all($regex, $str, $matches); // $matches is passed by reference, no need to declare it first
print_r($matches);
The exact structure of your $matches array will be slightly different from what you asked for but you will be able to extract your desired data from it
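Putting the steps above together, a rough sketch could look like the following. Two things here are my own assumptions and not part of the outline: the lazy .+? (so each match stops at the first end marker) and the \b word boundaries around the start word (so that I does not match inside "Indeed"). The boundaries only make sense when the start words begin and end with word characters.

```php
<?php
$str = "Do you see that ? Indeed, I can see that, as well as this.";
$starting_words_array = array('do', 'I');
$ending_words_array = array('?', ',');

// Build one alternative per start/end pair, escaping regex metacharacters.
$parts = array_map(
    function ($start, $end) {
        // Assumption: \b and the lazy quantifier are additions for this sketch.
        return '\b' . preg_quote($start, '/') . '\b.+?' . preg_quote($end, '/');
    },
    $starting_words_array,
    $ending_words_array
);
$regex = '/' . join('|', $parts) . '/i';

preg_match_all($regex, $str, $matches);
print_r($matches[0]); // the full matches are in $matches[0]
```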
Benni's answer is the best way to go, but let me point out the problems in your code in case you want to fix it:
strpos is case sensitive and also finds parts of words, so you need to change your $starting_words_array = array('do','I'); to $starting_words_array = array('Do','I ');
When you find a substring you use return, which exits the function, so you won't find any other substring. To fix that, define $res = []; at the beginning of the function, replace return substr($str,$pos,... with $res[] = substr($str,$pos,..., and return $res at the end.
You can see an example on 3v4l; in that example you get the output you wanted.
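For reference, here is a sketch of the question's function with those two changes applied. It is my reading of the fixes described above, not a copy of the linked example, and it still only finds the first occurrence of each start word.

```php
<?php
function SearchString($str, $starting_words_array, $ending_words_array)
{
    $res = []; // collect every hit instead of returning on the first one
    foreach ($starting_words_array as $test) {
        $pos = strpos($str, $test); // case sensitive, first occurrence only
        if ($pos === false) {
            continue;
        }
        $found = [];
        foreach ($ending_words_array as $test2) {
            $posStart = $pos + strlen($test);
            $pos2 = strpos($str, $test2, $posStart);
            $found[] = ($pos2 !== false) ? $pos2 : INF;
        }
        $min = min($found); // nearest end marker after the start word
        if ($min !== INF) {
            $res[] = substr($str, $pos, $min - $pos) . $str[$min];
        }
    }
    return $res;
}

$str = "Do you see that ? Indeed, I can see that, as well as this.";
// Start words adjusted as suggested: exact case, and "I " with a trailing space.
print_r(SearchString($str, array('Do', 'I '), array('?', ',')));
```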
I'm having a really hard time understanding RegEx in general, so I have no clue how is it possible to use it in such an issue.
So here we have a tuple
$tuple = "(12342,43244)";
And what I try to do is get:
$value_one = 12342;
So from (value_one,value_two) get value_one.
I know it can be possible with explode( ',', $tuple ) and then delete the 1st character '(' out of the 1st element in exploded array, but that seems super sloppy, is there a way to pattern match in this manner in PHP?
Here is the simplest preg_match example with the \(([0-9]+) regex that matches a (, and captures into Group 1 one or more digits from 0 to 9 range:
$tuple = "(12342,43244)";
if (preg_match('~\(([0-9]+)~', $tuple, $m))
{
echo $m[1];
}
See the IDEONE demo
Wrapped into a function:
function retFirstDigitChunk($input) {
if (preg_match('~\(([0-9]+)~', $input, $m)) {
return $m[1];
} else {
return "";
}
}
See another demo
Or, to get both as an array:
function retValues($input) {
if (preg_match('~\((-?[0-9]+)\s*,\s*(-?[0-9]+)~', $input, $m)) {
return array('left'=>$m[1], 'right'=>$m[2]);
} else {
return "";
}
}
$tuple = "(12342,43244)";
print_r(retValues($tuple));
Output: Array( [left] => 12342 [right] => 43244 )
You have to search for the number preceded by an opening parenthesis and followed by a comma. The pattern is:
$value_one = preg_replace('/\((\d+),.*/', '$1', $tuple);
If you are looking for something efficient, try to avoid the use of regex when possible:
$result = explode(',', ltrim($tuple, '('))[0];
or
sscanf($tuple, '(%[^,]', $result);
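If both values are needed as integers, sscanf can read them in one call. This is a small sketch along the same lines; the variable names $value_two and $count are my own, and it assumes the tuple has no spaces around the comma.

```php
<?php
$tuple = "(12342,43244)";

// %d parses an integer; sscanf returns how many values were assigned.
$count = sscanf($tuple, '(%d,%d)', $value_one, $value_two);

if ($count === 2) {
    echo $value_one, "\n", $value_two, "\n";
}
```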
I am trying to get the integer on the left and right for an input from the $str variable using REGEX. But I keep getting the commas back along with the integer. I only want integers not the commas. I have also tried replacing the wildcard . with \d but still no resolution.
$str = "1,2,3,4,5,6";
function pagination()
{
global $str;
// Using number 4 as an input from the string
preg_match('/(.{2})(4)(.{2})/', $str, $matches);
echo $matches[0]."\n".$matches[1]."\n".$matches[1]."\n".$matches[1]."\n";
}
pagination();
How about using a CSV parser?
$str = "1,2,3,4,5,6";
$line = str_getcsv($str);
$target = 4;
foreach($line as $key => $value) {
if($value == $target) {
echo $line[($key-1)] . '<--low high-->' . $line[($key+1)];
}
}
Output:
3<--low high-->5
or a regex could be
$str = "1,2,3,4,5,6";
preg_match('/(\d+),4,(\d+)/', $str, $matches);
echo $matches[1]."<--low high->".$matches[2];
Output:
3<--low high->5
The only flaw with these approaches is that they fail if the number is at the start or end of the range. Would that ever be the case?
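If the target can sit at either end of the list, one way to cover that is to look the value up by index and treat a missing neighbour as null. A minimal sketch, with a hypothetical helper name (neighbours) and assuming a plain comma-separated list without spaces:

```php
<?php
// Hypothetical helper: returns the values left and right of $target,
// or null for a side that does not exist. Returns null if $target is absent.
function neighbours(string $str, string $target): ?array
{
    $items = explode(',', $str);
    $key = array_search($target, $items, true); // strict: compares strings
    if ($key === false) {
        return null;
    }
    return array(
        'low'  => $items[$key - 1] ?? null, // index -1 does not exist, so null
        'high' => $items[$key + 1] ?? null,
    );
}

print_r(neighbours("1,2,3,4,5,6", "4"));
```

Because it compares whole items, this also keeps working once the numbers reach double digits.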
I believe you're looking for Regex Non Capture Group
Here's what I did:
$regStr = "1,2,3,4,5,6";
$regex = "/(\d)(?:,)(4)(?:,)(\d)/";
preg_match($regex, $regStr, $results);
print_r($results);
Gives me the results:
Array ( [0] => 3,4,5 [1] => 3 [2] => 4 [3] => 5 )
Hope this helps!
Given your function name I am going to assume you need this for pagination.
The following solution might be easier:
$str = "1,2,3,4,5,6,7,8,9,10";
$str_parts = explode(',', $str);
// reset and end return the first and last element of an array respectively
$start = reset($str_parts);
$end = end($str_parts);
This prevents your regex from having to deal with your numbers getting into the double digits.
I have a PHP string with a list of items, and I would like to get the last one.
The reality is much more complex, but it boils down to:
$Line = 'First|Second|Third';
if ( preg_match( '#^.*|(?P<last>.+)$#', $Line, $Matches ) > 0 )
{
print_r($Matches);
}
I expect Matches['last'] to contain 'Third', but it does not work. Rather, I get Matches[0] to contain the full string and nothing else.
What am I doing wrong?
Please no workarounds, I can do it myself but I would really like to have this working with preg_match
You have this:
'#^.*|(?P<last>.+)$#'
     ^
... but I guess you are looking for a literal |:
'#^.*\|(?P<last>.+)$#'
     ^^
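To see the difference side by side, here is a short sketch that runs both patterns on the sample line. The variable names $unescaped and $escaped are only for illustration. Unescaped, | is alternation, so the first branch ^.* matches the whole line by itself and the named group never takes part.

```php
<?php
$Line = 'First|Second|Third';

// Unescaped: "^.*" OR "(?P<last>.+)$". The first branch wins.
preg_match('#^.*|(?P<last>.+)$#', $Line, $unescaped);

// Escaped: greedy ".*" runs up to the last literal "|", the group takes the rest.
preg_match('#^.*\|(?P<last>.+)$#', $Line, $escaped);

var_dump(isset($unescaped['last'])); // the group did not participate
echo $escaped['last'], "\n";
```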
Just use:
$Line = 'First|Second|Third' ;
$lastword = explode('|', $Line); // variable names are case sensitive: $Line, not $line
echo $lastword[2];
If your syntax is always kinda the same, and by that I mean that uses the | as separator, you could do the following, if you like it.
$Line = 'First|Second|Third' ;
$line_array = explode('|', $Line);
$line_count = count($line_array) - 1;
echo $line_array[$line_count];
or
$Line = 'First|Second|Third' ;
$line_array = explode('|', $Line);
end($line_array);
echo $line_array[key($line_array)];
Example of PHP preg_match to get the last match:
<?php
$mystring = "stuff://sometext/2010-01-01/foobar/2016-12-12.csv";
preg_match_all('/\d{4}\-\d{2}\-\d{2}/', $mystring, $matches);
print_r($matches);
print("\nlast match: \n");
print_r($matches[0][count($matches[0])-1]);
print("\n");
?>
Prints the whole object returned and the last match:
Array
(
[0] => Array
(
[0] => 2010-01-01
[1] => 2016-12-12
)
)
last match:
2016-12-12
$string = 'test check one two test3';
$result = mb_eregi_replace ( 'test|test2|test3' , '<$1>' ,$string ,'i');
echo $result;
This should deliver: <test> check one two <test3>
Is it possible to find out that test and test3 were matched, without using another match function?
You can use preg_replace_callback instead:
$string = 'test check one two test3';
$matches = array();
$result = preg_replace_callback('/test|test2|test3/i' , function($match) use (&$matches) { // by reference, so the outer $matches is filled
$matches[] = $match;
return '<'.$match[0].'>';
}, $string);
echo $result;
Here preg_replace_callback will call the passed callback function for each match of the pattern (note that its syntax differs from POSIX). In this case the callback function is an anonymous function that adds the match to the $matches array and returns the substitution string that the matches are to be replaced by.
Another approach would be to use preg_split to split the string at the matched delimiters while also capturing the delimiters:
$parts = preg_split('/(test|test2|test3)/i', $string, -1, PREG_SPLIT_DELIM_CAPTURE); // the capture group is required for the delimiters to be returned
The result is an array of alternating non-matching and matching parts.
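As a rough illustration of the preg_split approach: with a capture group around the alternation, the delimiters land at the odd indexes of the result. In this sketch I list the longest alternative first (my own choice, so that test3 is not cut short by test), and the names $found and $result are only for the example.

```php
<?php
$string = 'test check one two test3';

// The parentheses make PREG_SPLIT_DELIM_CAPTURE return the delimiters too.
$parts = preg_split('/(test3|test2|test)/i', $string, -1, PREG_SPLIT_DELIM_CAPTURE);

$found = array();
foreach ($parts as $i => $part) {
    if ($i % 2 === 1) {          // odd index: a matched delimiter
        $found[] = $part;
        $parts[$i] = "<$part>";  // wrap it for the replaced string
    }
}
$result = implode('', $parts);

print_r($found);
echo $result, "\n";
```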
As far as I know, eregi is deprecated.
You could do something like this:
<?php
$str = 'test check one two test3';
$to_match = array("test", "test2", "test3");
$rep = array();
foreach($to_match as $val){
$rep[$val] = "<$val>";
}
echo strtr($str, $rep);
?>
This too allows you to easily add more strings to replace.
Hi, the following function checks whether all of the given words are found in the string:
<?php
function searchword($string, $words)
{
$matchFound = count($words); // the number of words you want to search for
$tempMatch = 0;
foreach ( $words as $word )
{
preg_match('/'.preg_quote($word, '/').'/',$string,$matches); // escape regex metacharacters in the word
//print_r($matches);
if(!empty($matches))
{
$tempMatch++;
}
}
if($tempMatch==$matchFound)
{
return "found";
}
else
{
return "notFound";
}
}
$string = "test check one two test3";
/*** an array of words to highlight ***/
$words = array('test', 'test3');
$string = searchword($string, $words);
echo $string;
?>
If your string is UTF-8, you could use preg_replace instead:
$string = 'test check one two test3';
$result = preg_replace('/(test|test2|test3)/ui' , '<$1>' ,$string);
echo $result;
Obviously, with this kind of data to match, the result will be suboptimal:
<test> check one two <test>3
You'll need a longer approach than a direct search and replace with regular expressions (certainly if some of your patterns are prefixes of other patterns).
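One possible shape for that longer approach is to sort the words longest first before building the alternation, and to collect the matches in a callback. This is only a sketch; the variable names $words, $pattern and $found are assumptions for the example.

```php
<?php
$string = 'test check one two test3';
$words = array('test', 'test2', 'test3');

// Longest first, so "test3" is tried before its prefix "test".
usort($words, function ($a, $b) {
    return strlen($b) - strlen($a);
});

// Escape each word and join them into one alternation.
$pattern = '/' . implode('|', array_map(function ($w) {
    return preg_quote($w, '/');
}, $words)) . '/iu';

$found = array();
$result = preg_replace_callback($pattern, function ($m) use (&$found) {
    $found[] = $m[0];            // remember what was matched
    return '<' . $m[0] . '>';    // and wrap it
}, $string);

echo $result, "\n";
print_r($found);
```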
To begin with, the code you want to enhance does not seem to do what it is meant to (at least not on my computer). You can try something like this:
$string = 'test check one two test3';
$result = mb_eregi_replace('(test|test2|test3)', '<\1>', $string);
echo $result;
I've removed the i flag (which of course makes little sense here). You'd still need to make the expression greedy, though.
As for the original question, here's a little proof of concept:
function replace($match){
$GLOBALS['matches'][] = $match;
return "<$match>";
}
$string = 'test check one two test3';
$matches = array();
$result = mb_eregi_replace('(test|test2|test3)', 'replace(\'\1\')', $string, 'e');
var_dump($result, $matches);
Please note this code is horrible and potentially insecure; it also relies on the e option, which was deprecated in PHP 7.1 and removed in PHP 8.0. I'd honestly go with the preg_replace_callback() solution proposed by Gumbo.