preg_grep matching incorrectly?

preg_grep matching incorrectly? - php

My code:
$q = array('r%and_dy', 'cat09', '##$%%^');
$result = preg_grep('/[a-zA-Z0-9]+/', $q);
print_r($result);
Using the same regular expression with javascript will only match 'cat09', but in php this returns:
Array
(
[0] => r%and_dy
[1] => cat09
)
What do I have to write so that it only returns 'cat09'?
EDIT: you want to see the javascript. The javascript match function with the 'g' flag is the equivalent function of preg_grep in php, but it doesn't accept an array - here's a fiddle, which each item as a separate line. http://jsfiddle.net/64A5w/
EDIT: jsfiddle is down, so here is the the javascript equivalent. First I should mention, preg_grep only accepts arrays, and automatically returns global matches (it does not accept a g flag). Javascript match only accepts strings, and g must be specified.
var str = 'r%and_dy';
var result = str.match(/[a-zA-Z0-9]+/g);
document.write(result);
which displays: r,and,dy. The php equivalent would be passing preg_grep $str = array('r%and_dy'). It should return the same array But it returns r%and_dy as a single match (as shown above).

Your problem is that you are matching strings that contain one or more alphanumeric character. Try:
$q = array('r%and_dy', 'cat09', '##$%%^');
$result = preg_grep('/^[a-zA-Z0-9]+$/', $q);
print_r($result);
The ^ and $ mean the start and the end of the string respectively.
From the manual:
Returns the array consisting of the elements of the input array that match the given pattern.

Related

preg_replace passing the matched group to a function weired result

I have the following preg_replace not preg_replace_callback which uses arrays for search patterns and replacement not only a single value and it works fine:
preg_replace(['/\{/','/\}/','/"(.*?)"/'],['<span class=\'olive\'>{','}</span>','<span class=\'olive\'>${0}</span>'],FoxText::insertBr($model->TafseerText));
However, when I try to pass ${0} to function something like:
preg_replace(['/\{/','/\}/','/"(.*?)"/'],['<span class=\'olive\'>{','}</span>',FoxText::pattern2VerseId("\$0")],FoxText::insertBr($model->TafseerText));
In the FoxText::pattern2VerseId function I try print_r as follows:
public static function pattern2VerseId($txt, $pattern = '/\(((\d+)-(\w+))\)/u')
{
$parts = array_map('trim',explode('-', $txt));
print_r(explode('-', $parts[0]));
return $parts[0].' *'.$parts[0].'|';
}
It prints Array ( [0] => $0 ) while the return value is matched string from the previous call!
In other words, how could it able to return $parts[0] as a string and It could not able to explode this string. Or how could I pass the value correctly to the function to be processed there?
By the way, the string is something like (125-Verse)

Because when you call the function pattern2VerseId you call it with the string $0. And since string $0 doesn't contain any hyphen, the explode just returns an array with single element containing the string.
explode('-', '$0') // will return Array([0] => $0)
By "\$0" are you actually trying to get the first part of the matched regex, i.e. 125 in this case? Because you're not doing it right.

Since I have PHP < 7. i.e there is no preg_replace_callback_array, the only solution that I have able to use is replacing the first pattern(s) using preg_replace then passing the output to one preg_replace_callback
$p = preg_replace(['/\{/','/«/','/\(/','/\}/','/»/','/\)/','/"(.*?)"/'],['<span class=\'olive\'>{','<span class=\'olive\'>«','<span class=\'olive\'>(','}</span>','»</span>',')</span>','<span class=\'olive\'>${0}</span>'],FoxText::insertBr($model->TafseerText));
$callback = function($m){return FoxText::pattern2VerseId($m);};
echo preg_replace_callback('/\(((\d+)-(\w+))\)/u', $callback, $p);

preg_match_all in php showing result blank

How do i match this with REGEXP and PHP ?
"s:6:\"[\"50\"]\";",
"s:5:\"[\"1\"]\";"
I want to match numbers between : [\"50\"] this only or could be one or more.
I have a pattern and want to take only numbers from json_encode value also serialize() in php this is code :
$result = [];
foreach($impressions as $impression) {
preg_match_all('/\x5C/', $impression->subcategories, $result);
}
return $result;
if no preg_match then here is result :
"s:6:\"[\"50\"]\";",
"s:5:\"[\"1\"]\";"
I am using this to match only digit where \ is so i can take number only like 50 or 1
Any idea how i can pic number with regular expressions ? value hex not works '/\x5C/' showing me result blank but here : Works fine if i put result and test with same REGEXP.

First of all, you can not go through an array of strings that way with preg_match_all – your $result array gets overwritten in each loop iteration.
And then, you need to capture the numbers you want to see in your result set. To do that, you must mask the [, ] and \ characters each with another \ – and then capture the digits in the middle by putting them in ( and )
$impressions[] = "s:6:\"[\"50\"]\";";
$impressions[] = "s:5:\"[\"1\"]\";";
foreach($impressions as $impression) {
preg_match_all('#\[\\"([0-9]+)\\"\]#', $impression, $matches); // I chose # as delimiter
// here – with so many \ involved, we don’t need / around it to add to the confusion
$results[] = $matches; // $matches will be overwritten in each iteration, so we
// preserve its content here by putting it into the $results array
}
var_dump($results);

Efficient way to parse this string into array in PHP?

Background
I have an array which I create by splitting a string based on every occurrence of 0d0a using preg_split('/(?<=0d0a)(?!$)/').
For example:
$string = "78781110d0a78782220d0a";
will be split into:
Array ( [0] => 78781110d0a [1] => 78782220d0a )
A valid array element has to start with 7878 and end with 0d0a.
The Problem
But sometimes, there's an additional 0d0a in the string which splits into an extra and invalid array element, i.e., that doesn't begin with 7878.
Take this string for example:
$string = "78781110d0a2220d0a78783330d0a";
This is split into:
Array ( [0] => 78781110d0a [1] => 2220d0a [2] => 78783330d0a )
But it should actually be:
Array ( [0] => 78781110d0a2220d0a [1] => 78783330d0a)
My Solution
I've written the following (messy) code to get around this:
$data = Array('78781110d0a','2220d0a','78783330d0a');
$i = 0; //count for $data array;
$j = 0; //count for $dataFixed array;
$dataFixed = $data;
foreach($data as $packet) {
if (substr($packet,0,4) != "7878") { //if packet doesn't start with 7878, do some fixing
if ($i != 0) { //its the first packet, can't help it!
$j++;
if ((substr(strtolower($packet), -4, 4) == "0d0a")) { //if the packet doesn't end with 0d0a, its 'mostly' not valid, so discard it
$dataFixed[$i-$j] = $dataFixed[$i-$j] . $packet;
}
unset($dataFixed[$i-$j+1]);
$dataFixed = array_values($dataFixed);
}
}
$i++;
}
Description
I first copy the array to another array $dataFixed. In a foreach loop of the $data array, I check whether it starts with 7878. If it doesn't, I join it with the previous array in $data. I then unset the current array in $dataFixed and reset the array elements with array_values.
But I'm not very confident about this solution.. Is there a better, more efficient way?
UPDATE
What if the input string doesn't end in 0d0a like its supposed to? It will stick to the previous array element..
For e.g.: in the string 78781110d0a2220d0a78783330d0a0000, 0000 should be separated as another array element.

Use another positive lookahead (?=7878) to form:
preg_split('/(?<=0d0a)(?=7878)/',$string)
Note: I removed (?!$) because I wasn't sure what that was for, based on your example data.
For example, this code:
$string = "78781110d0a2220d0a78783330d0a";
$array = preg_split('/(?<=0d0a)(?=7878)(?!$)/',$string);
print_r($array);
Results in:
Array ( [0] => 78781110d0a2220d0a [1] => 78783330d0a )
UPDATE:
Based on your revised question of having possible random characters at the end of the input string, you can add three lines to make a complete program of:
$string = "78781110d0a2220d0a787830d0a330d0a0000";
$array = preg_split('/(?<=0d0a)(?=7878)/',$string);
$temp = preg_split('/(7878.*0d0a)/',$array[count($array)-1],null,PREG_SPLIT_NO_EMPTY|PREG_SPLIT_DELIM_CAPTURE);
$array[count($array)-1] = $temp[0];
if(count($temp)>1) { $array[] = $temp[1]; }
print_r($array);
We basically do the initial splitting, then split the last element of the resulting array by the expected data format, keeping the delimiter using PREG_SPLIT_DELIM_CAPTURE. The PREG_SPLIT_NO_EMPTY ensures we won't get an empty array element if the input string doesn't end in random characters.
UPDATE 2:
Based on your comment below where it seems you're implying there might be random characters between any of the desired matches, and you want these random characters preserved, you could do this:
$string = "0078781110d0a2220d0a2220d0a0000787830d0a330d0a000078781110d0a2220d0a0000787830d0a330d0a0000";
$split1 = preg_split('/(7878.*?0d0a)/',$string,null,PREG_SPLIT_NO_EMPTY|PREG_SPLIT_DELIM_CAPTURE);
$result = array();
foreach($split1 as $e){
$split2 = preg_split('/(.*0d0a)/',$e,null,PREG_SPLIT_NO_EMPTY|PREG_SPLIT_DELIM_CAPTURE);
foreach($split2 as $el){
// test if $el doesn't start with 7878 and ends with 0d0a
if(strpos($el,'7878') !== 0 && substr($el,-4) == '0d0a'){
//if(preg_match('/^(?!7878).*0d0a$/',$el) === 1){
$result[ count($result)-1 ] = $result[ count($result)-1 ] . $el;
} else {
$result[] = $el;
}
}
}
print_r($result);
The strategy employed here is different than above. First we split the input string based on the delimiter that matches your desired data, using the nongreedy regex .*?. At this point we have some strings that contain the ending of a desired value and some garbage at the end, so we split again based on the last occurrence of "0d0a" with the greedy regex .*0d0a. We then append any of those resulting values that don't start with "7878" but end with "0d0a" to the previous value, as this should repair the first and second halves that got split because it contained an extra "0d0a".
I provided two methods for the innermost if statement, one using regular expressions. The regex one is marginally slower in my testing, so I've left that one commented out.
I might still not have your full requirements, so you'll have to let me know if it works and perhaps provided your full dataset.

I think you are using a delimiter "0d0a" which also happens to be part of a content! Its not possible to avoid getting junk data as long as delimiter can also be part of content. Somehow delimiter must be unique.
Possible solutions.
Change the delimited to something else that doesn't occur as part of your data ( 000000, #!.;)
If you are definite about length of text that easy arrange item may have, use it. As per examples its not possible.
Solutions given in answers considering only sample data you have shared. If you are confidant about what will be the content of string, then these solutions given by others are pretty good to use. Otherwise these solutions wont assure you guarantee!
Best solution: Fix right delimiter then use regex or explode whatever you prefer.

Why don't you use preg_match_all instead? You can avoid all of the non-capturing groups (the look aheads, look behinds) in order to split the string (which without the non-capturing groups removes the matches), and just find the matches you're looking for:
Updated
<?php
$string = "00787817878110d0a22278780d0a78783330d0a00";
preg_match_all('/7878.*?0d0a(?=7878|[^(7878)]*?$)/', $string, $arr);
print_r($arr);
?>
Gives an array $arr[0] => ( [0] => 787817878110d0a22278780d0a, [1] => 78783330d0a ). Strips leading and trailing garbage characters (whatever doesn't start with 7878 or end with 7878 or 0d0a.
So $arr[0] would be the array of values that you are looking for.
See example on ideone
Works with multiple 7878 values and multiple 0d0a values (even though that's ridiculous).
Update
If splitting is more your style, why not avoid regular expressions altogether?
<?php
$string = "787817878110d0a22278780d0a78783330d0a";
$arr = explode('0d0a7878', $string);
$string = implode('0d0a,7878', $arr);
$arr = explode(',', $string);
print_r($arr);
?>
Here we split the string by the delimiter 0d0a7878, which is what #CharlieGorichanaz's solution is doing, and props to him for the quick, accurate solution. We then add a comma, because who doesn't love comma separated values? And we explode again on the commas for an array of desired values. Performance-wise, this ought to be faster than using regular expressions. See example.

Replace text within brackets with thus-named variable in PHP

I want to replace all strings in square brackets ([]) with a randomly chosen item from an array that's named that string.
It's very similar to this issue, but with a twist, in that I want to replace different brackets' contents with strings from arrays named that.
An example should make this a bit clearer.
So say I've got the string
"This is a very [adjective] [noun], and this is a [adjective] [noun]."
And the variables:
$adjective = array("big","small","good","bad");
$noun = array("house","dog","car");
And we want it to return "This is a very big house, and this is a good dog." or whatever, by choosing randomly. That is, I want to write a PHP function that will replace each [string] with a randomly chosen item from the array named $string. For now it doesn't matter if by randomly choosing it ends up repeating choices, but it must make a fresh choice for each [] item.
I hope I've explained this clearly. If you get what I'm trying to achieve and can think of a better way to do it I'd be very grateful.

Algorithm
Match for this regex: (\[.*?\])
For each match group pick an item from the related array.
Replace in string by order.
Implementation
$string = "This is a very [adjective] [noun], and this is a [adjective] [noun].";
$adjective = array("big","small","good","bad");
$noun = array("house","dog","car");
// find matches against the regex and replaces them the callback function.
$result = preg_replace_callback(
// Matches parts to be replaced: '[adjective]', '[noun]'
'/(\[.*?\])/',
// Callback function. Use 'use()' or define arrays as 'global'
function($matches) use ($adjective, $noun) {
// Remove square brackets from the match
// then use it as variable name
$array = ${trim($matches[1],"[]")};
// Pick an item from the related array whichever.
return $array[array_rand($array)];
},
// Input string to search in.
$string
);
print $result;
Explanation
preg_replace_callback function performs a regular expression search and replace using provided callback function.
First parameter is regular expression to match (enclosed between slashes): /(\[.*?\])/
Second parameter is callback function to call for each match. Takes the current match as parameter.
We have to use use() here to access the arrays from inside the function, or define the arrays as global: global $adjective = .... Namely, we have to do one of the followings:
a) Define arrays as global:
...
global $adjective = array("big","small","good","bad");
global $noun = array("house","dog","car");
...
function($matches) {
...
b) Use use:
...
$adjective = array("big","small","good","bad");
$noun = array("house","dog","car");
...
function($matches) use ($adjective, $noun) {
...
First line of the callback function:
trim: Removes square brackets ([]) from the match using trim function.
${}: Creates a variable to use as array name with the match name. For example, if the $match is [noun] then trim($matches[1],"[]") returns noun (without brackets) and ${noun} becomes the array name: $noun. For more information on the topic, see variable variables.
Second line randomly picks an index number available for the $array and then returns the element at this position.
Third parameter is the input string.

The code below will do the work:
$string = "This is a very [adjective] [noun], and this is a [adjective] [noun]."
function replace_word ( $matches )
{
$replaces = array(
'[adjective]' => array("big", "small", "good", "bad"),
'[noun]' => array("house", "dog", "car")
);
return $replaces[$matches[0]][array_rand($replaces[ $matches[0] ])];
}
echo preg_replace_callback("(\[.*?\])", "replace_word", $string);
First, we regular expression match on the [something] parts of the word, and call the replace_word() callback function on it with preg_replace_callback(). This function has an internal $replaces two dimension deep array defined inside, each row defined in a [word type] => array('rep1', 'rep2', ...) format.
The tricky and a bit obfuscated line is the return $replaces[$matches[0]][array_rand($replaces[ $matches[0] ])];. If I chunk it down a bit, it'll be a lot more parsable for you:
$random = array_rand( $replaces[ $matches[0] ] );
$matches[0] is the word type, this is the key in the $replaces array we are searching for. This was found by regular expression in the original string. array_rand() basically selects one element of the array, and returns its numerical index. So $random right now is an integer somewhere between 0 and the (number of elements - 1) of the array containing the replaces.
return $replaces[ $matches[0] ][$random];
This will return the $randomth element from the replace array. In the code snippet, these two lines are put together into one line.
Showing one element only once
If you want disjunct elements (no two adjective or noun repeated twice), then you will need to do another trick. We will set the $replaces array to be defined not inside the replace_word() function, but outside it.
$GLOBALS['replaces'] = array(
'[adjective]' => array("big", "small", "good", "bad"),
'[noun]' => array("house", "dog", "car")
);
Inside the function, we will set the local $replaces variable to be a reference to the newly set array, with calling $replaces = &$GLOBALS['replaces'];. (The & operator sets it a reference, so everything we do with $replaces (remove and add elements, for example) modifies the original array too. Without it, it would only be a copy.)
And before arriving on the return line, we call unset() on the currently to-be-returned key.
unset($replaces[$matches[0]][array_rand($replaces[ $matches[0] ])]);
The function put together now looks like this:
function replace_word ( $matches )
{
$replaces = &$GLOBALS['replaces'];
unset($replaces[$matches[0]][array_rand($replaces[ $matches[0] ])]);
return $replaces[$matches[0]][array_rand($replaces[ $matches[0] ])];
}
And because $replaces is a reference to the global, the unset() updates the original array too. The next calling of replace_word() will not find the same replace again.
Be careful with the size of the array!
Strings containing more replace variables than the amount of replace values present will throw an Undefined index E_NOTICE. The following string won't work:
$string = "This is a very [adjective] [noun], and this is a [adjective] [noun]. This is also an [adjective] [noun] with an [adjective] [noun].";
One of the outputs look like the following, showing that we ran out of possible replaces:
This is a very big house, and this is a big house. This is also an small with an .

Another good (easier) method of doing this (not my solution)
https://stackoverflow.com/a/15773754/2183699
Using a foreach to check on which variables you want to replace and replacing them with
str_replace();

You can use preg_match and str_replace function to achive this goal.
First find the matches using preg_match function and then create search & replace array from the result.
Call str_replace function by passing the previous arrays as parameters.

This is my minor update to mmdemirbas' answer above. It lets you set the variables outside of the function (i.e. use globals, as said).
$result = preg_replace_callback(
// Matches parts to be replaced: '[adjective]', '[noun]'
'/(\[.*?\])/',
// Callback function. Use 'use()' or define arrays as 'global'
function($matches) use ($adjective, $noun) {
// Remove square brackets from the match
// then use it as variable name
$arrayname = trim($matches[1],"[]");
$array = $GLOBALS[$arrayname];
// Pick an item from the related array whichever.
return $array[array_rand($array)];
},
// Input string to search in.
$string
);
print $result;

preg_math multiply responce

<?php
$string = "Movies and Stars I., 32. part";
$pattern = "((IX|IV|V?I{0,3}[\.]))";
if(preg_match($pattern, $string, $x) == false)
{
print "NAPAKA!";
}
else
{
print_r($x);
}
?>
And the response is:
Array ( [0] => I. [1] => I. )
I should get only 1 response... Why do I get multiple responses?

The element at index 0 is the whole matched string. The element at index 1 is the contents of the first capture group, i.e. the content inside the parenthesis. In this case, they just happen to be the same. Just use $x[0] to get the value you're looking for.

The nested parenthesis should, in this instance, be a "non-capturing" subpattern.
$pattern = "~((?:IX|IV|V?I{0,3}[\.]))~";
Try that. It will tell the regex compiler to not capture the results of those parenthesis into the array.
In fact, looking at your regex, you don't even need those parenthesis. Make your regex this:
$pattern = "~IX|IV|V?I{0,3}[\.]~";
That should also work.

Your pattern has multiple groups in it -> the () brackets tell you what to capture in your match.
Try this:
$pattern = "(IX|IV|V?I{0,3}[\.])";
If you have a hard time identifying the wanted groups in the result you can name them as specified in the php.net documentation.
That would look something like this:
$pattern = "(?P<groupname>IX|IV|V?I{0,3}[\.])";

You get 0-indexed for all mathced string and result for every paretness (). it's helpful to get groups i.e
preg_match('~([0-9]+)([a-z]+)','12abc',$x);
$x is ([0]=>12abc [1]=>12 [2]=>abc)
In your case you can simply delete () (1 pair ot them, 1 pair is used as delimiters)

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.

preg_grep matching incorrectly? - php

Related

preg_replace passing the matched group to a function weired result

preg_match_all in php showing result blank

Efficient way to parse this string into array in PHP?

Replace text within brackets with thus-named variable in PHP

preg_math multiply responce

Categories

Resources