Regex to get repeating matches within a match - php

I have this sample string in a source:
#include_plugin:PluginName param1=value1 param2=value2#
What I want is to find all occurrences of #include_plugin:*# in a source and get back the PluginName and each paramN=valueN.
At this moment I'm fiddling with something like this (and have tried many variants): /#include_plugin:(.*\b){1}(.*\=.*){0,}#/ (using this resource). Unfortunately I can't seem to define a pattern which is giving me the result I want. Any suggestions?
Update with example:
Say I have this string in a .tpl-file. #include_plugin:BestSellers limit=5 fromCategory=123#
I want it to return an array with:
0 => BestSellers,
1 => limit=5 fromCategory=123
Or even better (if possible):
0 => BestSellers,
1 => limit=5,
2 => fromCategory=123

You can do it in 2 steps. First capture the line with a regex, then explode the parameters into an array:
$subject = '#include_plugin:PluginName param1=value1 param2=value2#';
$pattern = '/#include_plugin:([a-z]+)( .*)?#/i';
preg_match($pattern, $subject, $matches);
$pluginName = $matches[1];
$pluginParams = isset($matches[2])?explode(' ', trim($matches[2])):array();

You can use this regex:
/#include_plugin:([a-zA-Z0-9]+)(.*?)#/
The PluginName is in the first capturing group, and the parameters are in the second capturing group. Note that the parameters, if any, have a leading space.
It is not possible to write a regex that extracts your "even better" case (one array element per parameter) unless the maximum number of parameters is known.
You can do extra processing by first trimming leading and trailing spaces, then split along /\s+/.
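A minimal sketch of that extra processing, assuming plugin names are alphanumeric as in the regex above. The function name parsePluginTag is made up for illustration and is not part of the question:

```php
<?php
// Hypothetical helper: returns [PluginName, param1=value1, ...] or null when the tag does not match.
function parsePluginTag(string $tag): ?array
{
    if (!preg_match('/#include_plugin:([a-zA-Z0-9]+)(.*?)#/', $tag, $m)) {
        return null;
    }
    // Group 2 carries a leading space, so trim before splitting on whitespace.
    $rawParams = trim($m[2]);
    $params = $rawParams === '' ? [] : preg_split('/\s+/', $rawParams);
    return array_merge([$m[1]], $params);
}

print_r(parsePluginTag('#include_plugin:BestSellers limit=5 fromCategory=123#'));
```

For the sample tag this should give BestSellers, limit=5 and fromCategory=123 as three separate elements; a tag without parameters yields just the plugin name.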

I'm not sure which characters your PluginName or the parameters/values can contain, but if they are limited to alphanumerics you can use the following regex:
/#include_plugin:((?:\w+)(?:\s+[a-zA-Z0-9]+=[a-zA-Z0-9]+)*)#/
This will capture the plugin name followed by any list of alpha-numeric parameters with their values. The output can be seen with:
<?php
$str = '#include_plugin:PluginName param1=value1 param2=value2#
#include_plugin:BestSellers limit=5 fromCategory=123#';
$regex = '/#include_plugin:((?:\w+)(?:\s+[a-zA-Z0-9]+=[a-zA-Z0-9]+)*)#/';
$matches = array();
preg_match_all($regex, $str, $matches);
print_r($matches);
?>
This will output:
Array
(
[0] => Array
(
[0] => #include_plugin:PluginName param1=value1 param2=value2#
[1] => #include_plugin:BestSellers limit=5 fromCategory=123#
)
[1] => Array
(
[0] => PluginName param1=value1 param2=value2
[1] => BestSellers limit=5 fromCategory=123
)
)
To get the array in the format you need, you can iterate through the results with:
$plugins = array();
foreach ($matches[1] as $match) {
$plugins[] = explode(' ', $match);
}
And now you'll have the following in $plugins:
Array
(
[0] => Array
(
[0] => PluginName
[1] => param1=value1
[2] => param2=value2
)
[1] => Array
(
[0] => BestSellers
[1] => limit=5
[2] => fromCategory=123
)
)
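If you would rather have the parameters as key => value pairs, a second preg_match_all over the captured parameter text can do it. This is only a sketch under the same assumption as above (names and values match \w+); the 'name' and 'params' array keys are my own choice, not something from the question:

```php
<?php
$str = "#include_plugin:PluginName param1=value1 param2=value2#\n"
     . "#include_plugin:BestSellers limit=5 fromCategory=123#";

// Group 1: plugin name. Group 2: the whole (possibly empty) parameter text.
preg_match_all('/#include_plugin:(\w+)((?:\s+\w+=\w+)*)#/', $str, $matches, PREG_SET_ORDER);

$plugins = [];
foreach ($matches as $m) {
    $params = [];
    // Second pass: pull each key=value pair out of the captured parameter text.
    preg_match_all('/(\w+)=(\w+)/', $m[2], $pairs, PREG_SET_ORDER);
    foreach ($pairs as $pair) {
        $params[$pair[1]] = $pair[2];
    }
    $plugins[] = ['name' => $m[1], 'params' => $params];
}

print_r($plugins);
```

Note that the values come back as strings ('5', '123'), so cast them yourself if you need integers.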

$string = "#include_plugin:PluginName1 param1=value1 param2=value2# #include_plugin:PluginName2#";
preg_match_all('/#include_plugin:([a-zA-Z0-9]+)\s?([^#]+)?/', $string, $matches);
var_dump($matches);
Is this what you are looking for?
array(3) {
[0]=>
array(2) {
[0]=>
string(55) "#include_plugin:PluginName1 param1=value1 param2=value2"
[1]=>
string(27) "#include_plugin:PluginName2"
}
[1]=>
array(2) {
[0]=>
string(11) "PluginName1"
[1]=>
string(11) "PluginName2"
}
[2]=>
array(2) {
[0]=>
string(27) "param1=value1 param2=value2"
[1]=>
string(0) ""
}
}

This Regex will give you multiple groups, one for each plugin.
((?<=#include_plugin:)(.+))

Related

explode multiple delimiters in php

My array looks like:
{flower},{animals},{food},{people},{trees}
I want to split on {, the comma, and }.
My output should contain only words inside curly brackets.
My code:
$array = explode("},{", $list);
After execution of this code $array will be
$array = Array (
[0] => {flower
[1] => animals
[2] => food
[3] => people
[4] => trees}
)
But output array should be:
$array = Array (
[0] => flower
[1] => animals
[2] => food
[3] => people
[4] => trees
)
Can anyone please tell me how can I modify my code to get this array?
I would go for preg_split like below
<?php
$list = "{flower},{animals},{food},{people},{trees}";
$array = preg_split('/[},{]/', $list, 0, PREG_SPLIT_NO_EMPTY);
print_r($array);
?>
The output is
Array
(
[0] => flower
[1] => animals
[2] => food
[3] => people
[4] => trees
)
You could try to extract the words using a RegEx instead of splitting the string:
$list = "{flower},{animals},{food},{people},{trees}";
// Match anything between curly brackets
// The "U" (ungreedy) flag stops the regex from making a single match that spans from the first to the last bracket
preg_match_all('~{(.+)}~U', $list, $result);
// Only keep the 1st capturing group
$words = $result[1];
var_dump($words);
Output:
array(5) {
[0]=>
string(6) "flower"
[1]=>
string(7) "animals"
[2]=>
string(4) "food"
[3]=>
string(6) "people"
[4]=>
string(5) "trees"
}
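For what it's worth, this can also be done without a regex. A small sketch, assuming the list always starts with { and ends with } and the words themselves contain no braces:

```php
<?php
$list = "{flower},{animals},{food},{people},{trees}";

// Strip the outermost braces first, then split on the "},{" separator.
$array = explode('},{', trim($list, '{}'));

print_r($array);
```

This keeps the original explode("},{", ...) call and only adds the trim, so the first and last elements no longer carry a stray brace.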

Regular expression clean results

I am trying to extract the words from this string with PHP :
$string= '\'banana\', "orange", "apple"';
Using this pattern :
/([\'"])(.*?)\1/
But it gives me this results :
array(3) {
[0] array(3) {
[0] "'banana'"
[1] ""orange""
[2] ""apple""
}
[1] array(3) {
[0] "'"
[1] ""
[2] ""
}
[2] array(3) {
[0] "banana"
[1] "orange"
[2] "apple"
}
}
Is there a way I can clean it up to just :
array(3) {
[0] "banana"
[1] "orange"
[2] "apple"
}
Thanks for your help.
With regex, you can just extract word characters using the \w+ matcher.
PHP has no global flag; preg_match_all finds every match instead, and I believe you'll get what you want :)
See this example
Use preg_split and describe all you want to remove and where you want to cut the string, then add the flag PREG_SPLIT_NO_EMPTY to filter empty parts.
$subject = '\'banana\', "orange", "apple"';
$pattern = '~["\'](?:,\s*["\']|\z)|\A["\']~';
$result = preg_split($pattern, $subject, -1, PREG_SPLIT_NO_EMPTY);
print_r($result);
Obviously this way doesn't check if each item is well enclosed between the same type of quotes, but is it really necessary?
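Another option is to keep the pattern from the question and simply read the second capturing group out of the result. A sketch using the question's own string and pattern:

```php
<?php
$string = '\'banana\', "orange", "apple"';

// Group 1 is the opening quote, group 2 is the text between matching quotes.
preg_match_all('/([\'"])(.*?)\1/', $string, $matches);

// With the default PREG_PATTERN_ORDER, $matches[2] holds group 2 of every match.
$words = $matches[2];

print_r($words);
```

Unlike the preg_split approach, this still checks that each word is closed by the same kind of quote that opened it.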

Explode a POST variable after a pattern REGEX

I'm terrible at regex, hard to understand for me so I need some help. I have a variable which looks something like this:
["...=", "...=", "...="]
Those are 3 values which I want to split into an array. The way I see it, I want to split it at the comma which comes after a quote ", ". Can someone please help me with the regex for preg_split?
You could try the below code to split the input string according to ", "
<?php
$yourstring = '["...=", "...=", "...="]';
$regex = '~", "~';
$splits = preg_split($regex, $yourstring);
print_r($splits);
?>
Output:
Array
(
[0] => ["...=
[1] => ...=
[2] => ...="]
)
If you don't want the brackets and quotes in the output, then you could try the code below.
<?php
$data = '["...=", "...=", "...="]';
$regex = '~(?<=\["|", ")[^"]*~';
preg_match_all($regex, $data, $matches);
print_r($matches);
?>
Output:
Array
(
[0] => Array
(
[0] => ...=
[1] => ...=
[2] => ...=
)
)
$string = '["...=", "...=", "...="]';
$parts = preg_split('/,\s/', $string);
var_dump($parts);
Program output:
array(3) {
[0]=>
string(34) ""...=""
[1]=>
string(36) ""...=""
[2]=>
string(37) ""...=""
}
So long as the double-quote symbol cannot occur within the double-quotes that contain the content, this pattern should validate and capture the three values:
^\["([^"]+)", "([^"]+)", "([^"]+)"\]$
If double-quotes can appear within the content, or the number of values is variable, then this pattern will not work.
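If the variable really is a JSON array (the sample looks like one, but that is an assumption on my part), json_decode avoids the regex entirely and copes with any number of values. The values below are placeholders, since the real content is elided in the question:

```php
<?php
// Placeholder data standing in for the elided "...=" values.
$raw = '["abc=", "def=", "ghi="]';

// Second argument true returns arrays instead of objects.
$values = json_decode($raw, true);

// json_decode returns null when the input is not valid JSON.
if (!is_array($values)) {
    $values = [];
}

print_r($values);
```

This also handles escaped double quotes inside the values, which the patterns above do not.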

How does one split a string on two delimiters into an associative array

The answer to this question pointed me in a possible direction, but it processes the string once, then loops through the result. Is there a way to do it in one pass?
My string is like this, but much longer:
954_adhesives
7_air fresheners
25_albums
236_stuffed_animial
819_antenna toppers
69_appliances
47_aprons
28_armbands
I'd like to split it on linebreaks, then on underscore so that the number before the underscore is the key and the phrase after the underscore is the value.
Just use a regular expression and array_combine:
preg_match_all('/^([0-9]+)_(.*)$/m', $input, $matches);
$result = array_combine($matches[1], array_map('trim', $matches[2]));
Sample output:
array(8) {
[954]=>
string(9) "adhesives"
[7]=>
string(14) "air fresheners"
[25]=>
string(6) "albums"
[236]=>
string(15) "stuffed_animial"
[819]=>
string(15) "antenna toppers"
[69]=>
string(10) "appliances"
[47]=>
string(6) "aprons"
[28]=>
string(8) "armbands"
}
Use ksort or asort (arsort for descending order) if you need the result sorted as well, by keys or values respectively.
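To make the sorting step concrete, here is the same approach run on three lines of the sample data, followed by ksort. Treat it as a sketch; the shortened $input is only there to keep it brief:

```php
<?php
$input = "954_adhesives\n7_air fresheners\n25_albums";

// The "m" modifier makes ^ and $ match at every line boundary.
preg_match_all('/^([0-9]+)_(.*)$/m', $input, $matches);
$result = array_combine($matches[1], array_map('trim', $matches[2]));

// Sort by key, ascending. PHP stores numeric string keys as integers, so this is a numeric sort.
ksort($result);

print_r($result);
```

After ksort the keys come out as 7, 25, 954. Swap in asort($result) to order by the phrases instead.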
You can do it in one line:
$result = preg_split('/_|\n/', $string);
Here is a handy-dandy tester: http://www.fullonrobotchubby.co.uk/random/preg_tester/preg_tester.php
EDIT:
For posterity, here's my solution. However, @Niels Keurentjes's answer is more appropriate, as it matches a number at the beginning.
If you wanted to do this with regular expressions, you could do something like:
preg_match_all("/^(.*?)_(.*)$/m", $content, $matches);
Should do the trick.
If you want the result to be a nested array like this:
Array
(
[0] => Array
(
[0] => 954
[1] => adhesives
)
[1] => Array
(
[0] => 7
[1] => air fresheners
)
[2] => Array
(
[0] => 25
[1] => albums
)
)
then you could use array_map, e.g.:
$str =
"954_adhesives
7_air fresheners
25_albums";
$arr = array_map(
function($s) {return explode('_', $s);},
explode("\n", $str)
);
print_r($arr);
I've just used the first three lines of your string for brevity but the same function works ok on the whole string.
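If you want the associative array from the question rather than a nested one, a plain loop with explode's limit argument is another way to go. A rough sketch (needs PHP 7.1+ for the [$key, $value] syntax), using a few lines of the sample data:

```php
<?php
$str = "954_adhesives\n7_air fresheners\n236_stuffed_animial";

$result = [];
// \R matches any line break sequence (\n, \r\n, ...).
foreach (preg_split('/\R/', $str, -1, PREG_SPLIT_NO_EMPTY) as $line) {
    if (strpos($line, '_') === false) {
        continue; // skip lines without a separator
    }
    // A limit of 2 splits on the first underscore only, so "stuffed_animial" stays intact.
    [$key, $value] = explode('_', $line, 2);
    $result[$key] = trim($value);
}

print_r($result);
```

The limit matters for entries like 236_stuffed_animial, where a plain explode('_', $line) would cut the phrase in two.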

php regular expression: how to correctly escape characters?

How do I write a preg_match call that matches the string *My* ?
This doesn't work:
$ptn = "/\*(.*)\*/";
$str = "*My*";
preg_match($ptn, $str, $matches);
print_r($matches);
because it outputs:
Array
(
[0] => *My*
[1] => *My*
)
instead of:
Array
(
[0] => *My*
[1] => My
)
Works fine here:
php > preg_match('/\*(.*)\*/', '*My*', $matches);
php > var_dump($matches);
array(2) {
[0]=>
string(4) "*My*"
[1]=>
string(2) "My"
}
Remember that the $matches array will ALWAYS contain the entire matched string in position 0, then the individual captured groups in slots 1+.
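On the escaping part of the question: if the delimiter comes from a variable, preg_quote can do the escaping for you. A small sketch; the $delimiter variable and the longer test string are my own additions for illustration:

```php
<?php
$delimiter = '*';

// preg_quote escapes regex metacharacters; the second argument also escapes the "/" pattern delimiter.
$quoted = preg_quote($delimiter, '/');

// (.*?) is lazy, so the match stops at the first closing delimiter.
$ptn = '/' . $quoted . '(.*?)' . $quoted . '/';

preg_match($ptn, '*My* and *more*', $matches);
print_r($matches);
```

Here $matches[0] is *My* and $matches[1] is My. With the greedy (.*) from the question, the same input would capture everything up to the last asterisk instead.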
