I'm trying to find a regex capable of capturing the content of shortcodes produced in WordPress.
My shortcodes have the following structure:
[shortcode name param1="value1" param2="value2" param3="value3"]
The number of parameters is variable.
I need to capture the shortcode name, the parameter name and its value.
The closest result I have achieved is with this:
/(?:\[(.*?)|\G(?!^))(?=[^][]*])\h+([^\s=]+)="([^\s"]+)"/
If I have the following content in the same string:
[specs product="test" category="body"]
[pricelist keyword="216"]
[specs product="test2" category="network"]
I get this:
0=>array(
0=>[specs product="test"
1=> category="body"
2=>[pricelist keyword="216"
3=>[specs product="test2"
4=> category="network")
1=>array(
0=>specs
1=>
2=>pricelist
3=>specs
4=>)
2=>array(
0=>product
1=>category
2=>keyword
3=>product
4=>category)
3=>array(
0=>test
1=>body
2=>216
3=>test2
4=>network)
)
I have tried different regex patterns, but I always end up with the same issue: if there is more than one parameter, it fails to detect it.
Do you have any idea of how I could achieve this?
Thanks
Laurent
You could make use of the \G anchor with 3 capture groups, where group 1 is the name of the shortcode and groups 2 and 3 are the key and the value of each pair.
Then you can remove the first entry of the array (the full matches) and filter out the empty entries in groups 1, 2 and 3.
This is a slightly updated pattern:
(?:\[(?=[^][]*])(\w+)|\G(?!^))\h+(\w+)="([^"]+)"
Example
$pattern = '/(?:\[(?=[^][]*])(\w+)|\G(?!^))\h+(\w+)="([^"]+)"/';
$strings = [
'[specs product="test" category="body"]',
'[pricelist keyword="216"]',
'[specs product="test2" category="network" key="value"]'
];
foreach($strings as $s) {
if (preg_match_all($pattern, $s, $matches)) {
unset($matches[0]);
$matches = array_map('array_filter', $matches);
print_r($matches);
}
}
Output
Array
(
[1] => Array
(
[0] => specs
)
[2] => Array
(
[0] => product
[1] => category
)
[3] => Array
(
[0] => test
[1] => body
)
)
Array
(
[1] => Array
(
[0] => pricelist
)
[2] => Array
(
[0] => keyword
)
[3] => Array
(
[0] => 216
)
)
Array
(
[1] => Array
(
[0] => specs
)
[2] => Array
(
[0] => product
[1] => category
[2] => key
)
[3] => Array
(
[0] => test2
[1] => network
[2] => value
)
)
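If all the shortcodes sit in one string, as in the question, the matches can also be grouped per shortcode. Here is a minimal sketch, assuming the same pattern as above. The function name parse_shortcodes and the 'name' / 'params' keys are made up for illustration and are not part of WordPress:

```php
<?php
// Hypothetical helper: returns one entry per shortcode with its parameters.
function parse_shortcodes(string $content): array
{
    $pattern = '/(?:\[(?=[^][]*])(\w+)|\G(?!^))\h+(\w+)="([^"]+)"/';
    $result = [];
    // PREG_SET_ORDER yields one array per match: [full, name, key, value].
    preg_match_all($pattern, $content, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        if ($m[1] !== '') {
            // A non-empty group 1 marks the start of a new shortcode.
            $result[] = ['name' => $m[1], 'params' => []];
        }
        // Attach the key/value pair to the most recent shortcode.
        $result[count($result) - 1]['params'][$m[2]] = $m[3];
    }
    return $result;
}

$content = '[specs product="test" category="body"]' . "\n"
         . '[pricelist keyword="216"]' . "\n"
         . '[specs product="test2" category="network"]';
print_r(parse_shortcodes($content));
```

Because group 1 is only filled on the first parameter of each shortcode, it doubles as the "new shortcode starts here" signal, so no post-filtering of empty entries is needed.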
I have a long string like this: I1:1;I2:2;I8:2;NA1:5;IA1:[1,2,3,4,5];S1:asadada;SA1:[1,2,3,4,5];SA1:[1,2,3,4,5]; Now I just want to get certain words like 'I1', 'I2', 'I8', 'NA1' and so on, i.e. only the words between ';' and ':', and store them in an array. How can I do that efficiently?
I have already tried preg_split(); it runs, but it gives me the wrong output, as shown below.
// $a is the string I want to extract words from
$str = preg_split("/[;:]/", $a);
print_r($str);
The output I am getting is this
Array
(
[0] => I8
[1] => 2
[2] => I1
[3] => 1
[4] => I2
[5] => 2
[6] => I3
[7] => 2
[8] => I4
[9] => 4
[10] =>
)
Array
(
[0] => NA1
[1] => 5
[2] =>
)
Array
(
[0] => IA1
[1] => [1,2,3,4,5]
[2] =>
)
Array
(
[0] => S1
[1] => asadada
[2] =>
)
Array
(
[0] => SA1
[1] => [1,2,3,4,5]
[2] =>
)
But I am expecting 'I8', 'I1', 'I2', 'I3', 'I4' each in a separate array at position [0] as well. Any help on how to do this?
You could try something like this:
<?php
$str = 'I1:1;I2:2;I8:2;NA1:5;IA1:[1,2,3,4,5];S1:asadada;SA1:[1,2,3,4,5];SA1:[1,2,3,4,5];';
// Match a word at the start of the string or after ';', but only if ':' follows
preg_match_all('/(?:^|;)(\w+):/', $str, $result);
print_r($result[1]); // Matches are here in $result[1]
You can perform a lazy (non-greedy) match to capture the items between ; and : using preg_match_all(). Allowing the start of the string as an alternative to ; also catches the first item:
<?php
$str = 'I1:1;I2:2;I8:2;NA1:5;IA1:[1,2,3,4,5];S1:asadada;SA1:[1,2,3,4,5];SA1:[1,2,3,4,5];';
preg_match_all('/(?:^|;)(.+?):/', $str, $matches);
print_r($matches[1]);
Live Demo: https://3v4l.org/eBsod
One possible approach is using a combination of explode() and implode(). The result is returned as a string, but you can easily put it into an array for example.
<?php
$input = "I1:1;I2:2;I8:2;NA1:5;IA1:[1,2,3,4,5];S1:asadada;SA1:[1,2,3,4,5];SA1:[1,2,3,4,5];";
$output = array();
// rtrim() drops the trailing ";" so explode() does not yield an empty last item
$array = explode(";", rtrim($input, ";"));
foreach ($array as $item) {
    // Keep only the part before the first ":"
    $output[] = explode(":", $item)[0];
}
echo implode(",", $output);
?>
Output:
I1,I2,I8,NA1,IA1,S1,SA1,SA1
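If the values are needed as well, both sides of each pair can be captured in one pass. This is a rough sketch, assuming keys consist of word characters and values never contain ';'. The variable names are only illustrative:

```php
<?php
$str = 'I1:1;I2:2;I8:2;NA1:5;IA1:[1,2,3,4,5];S1:asadada;SA1:[1,2,3,4,5];SA1:[1,2,3,4,5];';

// Group 1: a key at the start of the string or right after ';'.
// Group 2: everything up to the next ';'.
preg_match_all('/(?:^|;)(\w+):([^;]*)/', $str, $pairs, PREG_SET_ORDER);

$keys   = array_column($pairs, 1); // the words before each ':'
$values = array_column($pairs, 2); // the text after each ':'
print_r($keys);
print_r($values);
```

The pairs are kept as a list rather than as an associative array because the sample string contains the key SA1 twice.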
I'm trying to split a sentence at the .!? while keeping them, but for some reason it's not working correctly. What am I doing wrong?
$input = "hi i am1. hi i am2.";
$inputX = preg_split("~[.!?]+\K\b~", $input);
print_r($inputX);
Result:
Array ( [0] => hi i am1. hi i am2. )
Expected Result:
Array ( [0] => hi i am1. [1] => hi i am2. )
I am not sure if you need to do a preg_split() but try preg_match_all() if that is an option:
$input = "hi i am1. hi i am2.";
preg_match_all("/[^\.\?\!]+[\.\!\?]/", $input,$matched);
print_r($matched);
Gives you:
Array
(
[0] => Array
(
[0] => hi i am1.
[1] => hi i am2.
)
)
Try it without \b. There is no word boundary between the punctuation and the space that follows it, so \b prevents the match here.
$input = "hi i am1. hi i am2.?! hi i am2.?";
$inputX = preg_split("~(?>[.!?]+)\K(?!$)~", $input);
print_r($inputX);
The (?!$) avoids splitting on a match at the end of the string, so there is no additional empty result. The atomic group (?>...) avoids splitting inside a run of punctuation at the end of the string, like ?!. (without atomic grouping the pattern would backtrack, split after the !, and the last result would be the single character .). Output:
Array
(
[0] => hi i am1.
[1] => hi i am2.?!
[2] => hi i am2.?
)
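To see what the atomic group changes, the two variants can be run side by side. A small sketch using the same input. The trim() call is my addition to drop the leading space of each piece, and the variable names are illustrative:

```php
<?php
$input = "hi i am1. hi i am2.?! hi i am2.?";

// With the atomic group: the trailing ".?" cannot be split apart.
$atomic = array_map('trim', preg_split('~(?>[.!?]+)\K(?!$)~', $input));

// Without it: the engine backtracks into ".?" and splits after the ".".
$plain = array_map('trim', preg_split('~[.!?]+\K(?!$)~', $input));

print_r($atomic);
print_r($plain);
```

The first call should give three pieces, while the second should give four, with a lone ? as the last element.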
I hope this is what you are expecting:
$input = "hi i am1. hi i !am?2."; // I have added other ?! symbols as well
$inputX = preg_split("/(\.|\!|\?)/", $input, -1, PREG_SPLIT_DELIM_CAPTURE);
print_r($inputX);
output:
Array ( [0] => hi i am1 [1] => . [2] => hi i [3] => ! [4] => am [5] => ? [6] => 2 [7] => . [8] => )
I have a string:
xyz.com?username="test"&pwd="test"#score="score"#key="1234"
output format:
array (
[0] => username="test"
[1] => pwd="test"
[2] => score="score"
[3] => key="1234"
)
This should work for you:
Just use preg_split() with a character class with all delimiters in it. At the end just use array_shift() to remove the first element.
<?php
$str = 'xyz.com?username="test"&pwd="test"#score="score"#key="1234"';
$arr = preg_split("/[?&#]/", $str);
array_shift($arr);
print_r($arr);
?>
output:
Array
(
[0] => username="test"
[1] => pwd="test"
[2] => score="score"
[3] => key="1234"
)
You can use the preg_split() function with a regex pattern that includes all of those delimiting special characters. Then remove the first value of the array and reset the keys:
$s = 'xyz.com?username="test"&pwd="test"#score="score"#key="1234"';
$a = preg_split('/[?&#]/', $s);
unset($a[0]);
$a = array_values($a);
print_r($a);
Output:
Array (
[0] => username="test"
[1] => pwd="test"
[2] => score="score"
[3] => key="1234"
)
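As an alternative to splitting and then dropping the first element, the pairs can be captured directly. A minimal sketch, assuming every pair is introduced by one of ?, & or # (the variable names are illustrative):

```php
<?php
$s = 'xyz.com?username="test"&pwd="test"#score="score"#key="1234"';

// Match a delimiter, then capture everything up to the next delimiter.
preg_match_all('/[?&#]([^?&#]+)/', $s, $m);

print_r($m[1]);
```

Since the part before the first delimiter is never matched, $m[1] already starts at username="test" and needs no re-indexing.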
I have a number list with separators like these (note: quotes not included):
"1"
"1$^20"
"23$^100$^250"
I want to write a regex that both checks the syntax of the numbers and separators and returns all the numbers in the list. The best attempt I have in PHP is this code segment:
preg_match_all("/(\d+)(?:\\$\\^){0,1}/", $s2, $n);
print_r($n);
but it returns:
Array
(
[0] => Array
(
[0] => 1
[1] => 20
)
[1] => Array
(
[0] => 1
[1] => 20
)
)
What I need is:
Array
(
[0] => 1
[1] => 20
)
or at least:
Array
(
[0] => Array
(
[0] => 1
[1] => 20
)
)
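You can take the capture group (index 1) from your match array. Note the single-quoted pattern: inside double quotes PHP turns \$ into a plain $, which the regex engine then reads as an end-of-string anchor.
$s2 = '1$^20';
preg_match_all('/(\d+)(?:\$\^)?/', $s2, $n);
print_r($n[1]);
// Array ( [0] => 1 [1] => 20 )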
Or drop the group and just extract the numbers like this:
$s2 = "1$^20";
preg_match_all("/\d+/", $s2, $n);
print_r($n);
// Array ( [0] => Array ( [0] => 1 [1] => 20 ) )
Another alternative might be to use preg_split:
$s2 = "1$^20";
$n = preg_split('/\$\^/', $s2);
print_r($n);
// Array ( [0] => 1 [1] => 20 )
I thought about this question again. I need not only to split the list but also to check the syntax of the values. And what if it's a list of text items? A neat approach then came to mind, in PHP:
// Split a separated number list, matching only numeric values
$pattern1 = '/(\d+?)\$\^/';
$s1 = '1$^23';
$s1 .= '$^'; // Always append one trailing separator
preg_match_all($pattern1, $s1, $matches);
// $matches[1] now holds the values
Changing \d to . makes this work for text lists or mixed number-and-text lists, too.
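Another way to get both the syntax check and the numbers is to validate the whole string first and split only if it passes. A sketch under the assumption that the separator is exactly $^ and that an empty list is invalid. The function name parse_number_list is hypothetical:

```php
<?php
// Hypothetical helper: returns the numbers as strings, or null if malformed.
function parse_number_list(string $s): ?array
{
    // One number, then zero or more "$^number" groups, and nothing else.
    // The D modifier makes $ match only at the very end of the string.
    if (!preg_match('/^\d+(?:\$\^\d+)*$/D', $s)) {
        return null;
    }
    return explode('$^', $s);
}

var_dump(parse_number_list('23$^100$^250')); // three numeric strings
var_dump(parse_number_list('1$^'));          // malformed: trailing separator
```

Keeping validation and splitting as separate steps means a malformed list is rejected as a whole instead of silently yielding a partial result.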