get all preg_match_all matches recursively? - php

I am trying to parse some attributes from a string. The function below works, but it only parses the root-level matches. The goal is to parse all matches, including the ones nested inside others:
function get_all_attributes( $tag, $text ){
preg_match_all( '/\[(\[?)(embed|wp_caption|caption|gallery|playlist|audio|video|acf|page|row|column|loop\-grid|loop\-grid\-item|grid\-filters|page\-title|page\-section|header|header\-column|header\-menu|header\-logo)(?![\w-])([^\]\/]*(?:\/(?!\])[^\]\/]*)*?)(?:(\/)\]|\](?:([^\[]*+(?:\[(?!\/\2\])[^\[]*+)*+)\[\/\2\])?)(\]?)/s', $text, $matches );
$out = array();
if( isset( $matches[2] ) )
{
foreach( (array) $matches[2] as $key => $value )
{
if( $tag === $value ){
$out[] = array(
'name' => $tag,
'attributes' => shortcode_parse_atts( $matches[3][$key] ),
'content' => trim($matches[5][$key]),
);
}
}
}
return $out;
}
The following is the string being parsed. It contains WordPress shortcodes, which I am trying to put into an array so I can easily get the attributes later on:
[page key="2298"]
[page-title key="1446986321457"]aaaa[/page-title]
[page-title key="1446986418207"]bbbbb[/page-title]
[row key="1446893994674"]
[column key="1446893994674_1"]
[image key="1446893994674_1_logo"]ccc[/image]
[/column]
[/row]
Is it possible to get all matches, both parents and children, in the same array, perhaps using a recursive regex?
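One way to approach this, instead of a single recursive pattern, is to match one level at a time and call the same function again on each match's content. The sketch below is only an illustration under some assumptions: `collect_shortcodes` is a hypothetical name, the pattern accepts any tag name rather than the fixed list above, attributes are returned as a raw string (in WordPress they could be passed to shortcode_parse_atts), and a tag nested inside another tag of the same name is not handled.

```php
<?php
// Hypothetical helper: collect every [tag ...]...[/tag] pair, parents and children.
function collect_shortcodes( string $text ): array {
    $out = array();
    // Groups: 1 = tag name, 2 = raw attribute string, 3 = inner content.
    // The \1 backreference pairs an opening tag with the next closing tag of the same name.
    $pattern = '/\[([\w-]+)(?![\w-])([^\]]*)\](.*?)\[\/\1\]/s';
    if ( preg_match_all( $pattern, $text, $matches, PREG_SET_ORDER ) ) {
        foreach ( $matches as $m ) {
            $out[] = array(
                'name'       => $m[1],
                'attributes' => trim( $m[2] ),
                'content'    => trim( $m[3] ),
            );
            // Recurse into the content so nested shortcodes land in the same flat array.
            $out = array_merge( $out, collect_shortcodes( $m[3] ) );
        }
    }
    return $out;
}

$text = <<<'TXT'
[page key="2298"]
[page-title key="1446986321457"]aaaa[/page-title]
[page-title key="1446986418207"]bbbbb[/page-title]
[row key="1446893994674"]
[column key="1446893994674_1"]
[image key="1446893994674_1_logo"]ccc[/image]
[/column]
[/row]
TXT;

$found = collect_shortcodes( $text );
```

Note that [page key="2298"] has no closing tag in the sample, so this sketch skips it; self-closing or unclosed shortcodes would need an extra branch in the pattern.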

Related

PHP parse_str() function allowing passing shorthand array

We are accepting strings from templates to a markup engine, which allows for configuration to be passed in a "simple" form.
The engine parses the strings via PHP, using an adapted version of the parse_str() function - so we can parse any combination of the strings below:
config=posts_per_page:"5",default:"No questions yet -- once created they will appear here."&markup->template="{{ questions }}"
gives:
Array(
[config] => Array
(
[posts_per_page] => 5
[default] => No questions yet -- once created they will appear here.
)
[markup] => Array
(
[template] => {{ questions }}
)
)
OR:
config->default=all:"<p class='ml-3'>No members here yet...</p>"
To Get:
Array(
[config] => Array
(
[default] => Array
(
[all] => <p class='ml-3'>No members here yet...</p>
)
)
)
Another:
config=>handle:"medium"
Returns:
Array (
[config] => Array
(
[>handle] => medium
)
)
Strings can be passed with spaces ( and multi-line spaces ) and string parameters should be passed between "double quotes" to preserve natural spacing - we run the following preg_replace on the string before it is passed to the parse_str method:
// strip white spaces from data that is not passed inside double quotes ( "data" ) ##
$string = preg_replace( '~"[^"]*"(*SKIP)(*F)|\s+~', "", $string );
So far, so good - until we try to pass a delimiter character inside a string value. It is then treated as a real delimiter - for example, the following string returns a corrupt array:
config=posts_per_page:"5",default:"No questions yet -- once created, they will appear here."&markup->template="{{ questions }}"
Returns the following array:
Array (
[config] => Array
(
[posts_per_page] => 5
[default] => No questions yet -- once created
[ they will appear here."] =>
)
[markup] => Array
(
[template] => {{ questions }}
)
)
The "," inside the quoted value was treated as a delimiter, so the string was split into an extra array element.
One simple solution is to use delimiters and operators with a lower chance of conflicting with string values, for example changing "," to "###". However, an important property of the markup is that it is easy to write and read: its intended use case is for front-end developers to pass simple arguments to the template parser. This is one reason we have tried to avoid JSON, which is a good fit for passing data but is harder to read and write. Of course, that statement is subjective and open to opinion :)
Here is the parse_str method:
public static function parse_str( $string = null ) {
// h::log($string);
// delimiters ##
$operator_assign = '=';
$operator_array = '->';
$delimiter_key = ':';
$delimiter_and_property = ',';
$delimiter_and_key = '&';
// check for "=" delimiter ##
if( false === strpos( $string, $operator_assign ) ){
h::log( 'e:>Passed string format does not include assignment operator "'.$operator_assign.'" -- '.$string );
return false;
}
# result array
$array = [];
# split on outer delimiter
$pairs = explode( $delimiter_and_key, $string );
# loop through each pair
foreach ( $pairs as $i ) {
# split into name and value
list( $key, $value ) = explode( $operator_assign, $i, 2 );
// what about array values ##
// example -- sm:medium, lg:large
if( false !== strpos( $value, $delimiter_key ) ){
// temp array ##
$value_array = [];
// split value into an array at "," ##
$value_pairs = explode( $delimiter_and_property, $value );
// h::log( $value_pairs );
# loop through each pair
foreach ( $value_pairs as $v_pair ) {
// h::log( $v_pair ); // 'sm:medium'
# split into name and value
list( $value_key, $value_value ) = explode( $delimiter_key, $v_pair, 2 );
$value_array[ $value_key ] = $value_value;
}
// check if we have an array ##
if ( is_array( $value_array ) ){
$value = $value_array;
}
}
// $key might be in part->part format, so check ##
if( false !== strpos( $key, $operator_array ) ){
// explode, max 2 parts ##
$md_key = explode( $operator_array, $key, 2 );
# if name already exists
if( isset( $array[ $md_key[0] ][ $md_key[1] ] ) ) {
# stick multiple values into an array
if( is_array( $array[ $md_key[0] ][ $md_key[1] ] ) ) {
$array[ $md_key[0] ][ $md_key[1] ][] = $value;
} else {
$array[ $md_key[0] ][ $md_key[1] ] = array( $array[ $md_key[0] ][ $md_key[1] ], $value );
}
# otherwise, simply stick it in a scalar
} else {
$array[ $md_key[0] ][ $md_key[1] ] = $value;
}
} else {
# if name already exists
if( isset($array[$key]) ) {
# stick multiple values into an array
if( is_array($array[$key]) ) {
$array[$key][] = $value;
} else {
$array[$key] = array($array[$key], $value);
}
# otherwise, simply stick it in a scalar
} else {
$array[$key] = $value;
}
}
}
// h::log( $array );
# return result array
return $array;
}
I will try to skip splitting strings between "double quotes", probably via another regex, but perhaps there are other pitfalls waiting that might make this approach unviable long-term. Any help gladly accepted!
One solution is to change the following:
from:
$value_pairs = explode( $delimiter_and_property, $value );
to:
$value_pairs = self::quoted_explode( $value, $delimiter_and_property, '"' );
which calls a new method taken from another SO answer (linked in the comment block):
/**
* Regex Escape values
*/
public static function regex_escape( $subject ) {
return str_replace( array( '\\', '^', '-', ']' ), array( '\\\\', '\\^', '\\-', '\\]' ), $subject );
}
/**
* Explode string, while respecting delimiters
*
* @link https://stackoverflow.com/questions/3264775/an-explode-function-that-ignores-characters-inside-quotes/13755505#13755505
*/
public static function quoted_explode( $subject, $delimiter = ',', $quotes = '\"' )
{
$clauses[] = '[^'.self::regex_escape( $delimiter.$quotes ).']';
foreach( str_split( $quotes) as $quote ) {
$quote = self::regex_escape( $quote );
$clauses[] = "[$quote][^$quote]*[$quote]";
}
$regex = '(?:'.implode('|', $clauses).')+';
preg_match_all( '/'.str_replace('/', '\\/', $regex).'/', $subject, $matches );
return $matches[0];
}
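As an alternative sketch, the same (*SKIP)(*F) idiom already used above for stripping whitespace can also drive the split itself via preg_split. This is not the method from the linked answer; `split_outside_quotes` is a hypothetical helper name, and it assumes values are wrapped in plain double quotes with no escaped quotes inside them.

```php
<?php
// Hypothetical helper (not part of the original class): split on a delimiter,
// ignoring any delimiter that appears inside "double quotes".
function split_outside_quotes( string $subject, string $delimiter = ',' ): array {
    // First alternative consumes a quoted run and discards it as a split point;
    // second alternative is the real delimiter.
    $pattern = '~"[^"]*"(*SKIP)(*F)|' . preg_quote( $delimiter, '~' ) . '~';
    return preg_split( $pattern, $subject );
}

$value = 'posts_per_page:"5",default:"No questions yet -- once created, they will appear here."';
$parts = split_outside_quotes( $value );
// $parts has two elements; the comma inside the quoted sentence is left alone.
```

The same helper could plausibly replace the explode() calls on "&" and ":" as well, since those characters would cause the same problem when they appear inside a quoted value.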

Parsing (non-standard) json to array/object

I have string like this:
['key1':'value1', 2:'value2', 3:$var, 4:'with\' quotes', 5:'with, comma']
And I want to convert it to an array like this:
$parsed = [
'key1' => 'value1',
2 => 'value2',
3 => '$var',
4 => 'with\' quotes',
5 => 'with, comma',
];
How can I parse that?
Any tips or code will be appreciated.
What can't be done?
Using standard json parsers
eval()
explode() by , and explode() by :
As you cannot use any pre-built function such as json_decode, you'll have to identify the most likely quoting scenarios and replace them with known substrings.
Given that all of the values and/or keys in the input array are encapsulated in single quotes:
Please note: this code is untested
<?php
$input = "[ 'key1':'value1', 2:'value2', 3:\$var, 4:'with\' quotes', 5: '\$var', 'another_key': 'something not usual, like \'this\'' ]";
function extractKeysAndValuesFromNonStandardKeyValueString ( $string ) {
$input = str_replace ( Array ( "\\\'", "\'" ), Array ( "[DOUBLE_QUOTE]", "[QUOTE]" ), $string );
$input_clone = $input;
$return_array = Array ();
if ( preg_match_all ( '/\'?([^\':]+)\'?\s*\:\s*\'([^\']+)\'\s*,?\s*/', $input, $matches ) ) {
foreach ( $matches[0] as $i => $full_match ) {
$key = $matches[1][$i];
$value = $matches[2][$i];
if ( isset ( ${$value} ) ) $value = ${$value};
else $value = str_replace ( Array ( "[DOUBLE_QUOTE]", "[QUOTE]" ), Array ( "\\\'", "\'" ), $value );
$return_array[$key] = $value;
$input_clone = str_replace ( $full_match, '', $input_clone );
}
// process the rest of the string, if anything important is left inside of it
if ( preg_match_all ( '/\'?([^\':]+)\'?\s*\:\s*([^,]+)\s*,?\s*/', $input_clone, $matches ) ) {
foreach ( $matches[0] as $i => $full_match ) {
$key = $matches[1][$i];
$value = $matches[2][$i];
if ( isset ( ${$value} ) ) $value = ${$value};
$return_array[$key] = $value;
}
}
}
return $return_array;
}
The idea behind this function is to first replace all possible combinations of escaped quotes in the non-standard string with placeholders you can easily swap back, then run a standard regexp against the input, then rebuild everything, making sure to restore the previously replaced substrings.
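A different way to sketch this, without placeholder substitution, is a single pattern that treats a single-quoted string (including \' escapes) as one token. The function name `parse_loose_pairs` is hypothetical, and the sketch assumes keys and values are either single-quoted strings or bare tokens such as 2 or $var, with no nested arrays.

```php
<?php
// Hypothetical helper: parse key:value pairs where either side may be
// single-quoted (with \' escapes) or a bare token such as 2 or $var.
function parse_loose_pairs( string $input ): array {
    // Groups: 1 = quoted key, 2 = bare key, 3 = quoted value, 4 = bare value.
    $pattern = <<<'RE'
/(?:'((?:[^'\\]|\\.)*)'|([^\s,:'\[\]]+))\s*:\s*(?:'((?:[^'\\]|\\.)*)'|([^\s,'\[\]]+))/
RE;
    $out = array();
    preg_match_all( $pattern, $input, $matches, PREG_SET_ORDER );
    foreach ( $matches as $m ) {
        // A bare token is used as-is; a quoted token has its escapes removed.
        $key   = ( $m[2] !== '' ) ? $m[2] : stripslashes( $m[1] );
        $value = ( isset( $m[4] ) && $m[4] !== '' ) ? $m[4] : stripslashes( $m[3] );
        $out[ $key ] = $value;
    }
    return $out;
}

$input = <<<'TXT'
['key1':'value1', 2:'value2', 3:$var, 4:'with\' quotes', 5:'with, comma']
TXT;

$parsed = parse_loose_pairs( $input );
```

Because a quoted string is consumed as a whole, commas and colons inside quotes never act as separators. Numeric-looking keys such as '2' become integer keys through PHP's normal array key casting.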

PHP - Reorganize Query String Parameters

Suppose I have a query string like this:
?foo1bar1=a&foo1bar2=b&foo1bar3=c&foo2bar1=d&cats1dogs1=z
The parameters in this string could be arbitrary, and could have any number of indexes (so you could have just foo=, or foo1bar1=, or something like foo1bar1baz1=). However, the parameters and their relevant indexes will be known ahead of time.
I'd like to be able to take this query string, plus a configuration, and re-structure it... The configuration might look something like this:
$indexes = array('foodex', 'bardex');
$columns = array('foo<foodex>bar<bardex>', 'cats<foodex>dogs<bardex>');
And the desired output would be the "columns" reorganized into rows indexed by the appropriate indexes, ready for storing in database rows. Something like this...
array(
array(
'foodex' => 1,
'bardex' => 1,
'foo<foodex>bar<bardex>' => 'a',
'cats<foodex>dogs<bardex>' => 'z'
),
array(
'foodex' => 1,
'bardex' => 2,
'foo<foodex>bar<bardex>' => 'b',
'cats<foodex>dogs<bardex>' => null
),
etc.
)
I've thought of a couple ideas for solving this problem, but nothing seems terribly elegant... I could:
Write a recursive function that loops through all possible values of a known index, and then calls itself to loop through all possible values of the next known index, then records the results. This would be super slow... you might loop through thousands or millions of possible index values only to find a handful in the query string.
Loop through each actual value in the query string, do some sort of regex check to see if it matches one of the columns I'm looking for including wildcards for each index that's listed within it. Then I could build some sort of multi-dimensional array using the indexes and eventually flatten it for the output. This would run much faster, but seems awfully complex.
Is there an elegant solution staring me in the face? I'd love to hear suggestions.
Here is a quick sample you can start with:
// your configuration
$indexes = array ('foodex', 'bardex');
$columns = array ('foo<foodex>bar<bardex>', 'cats<foodex>dogs<bardex>');
// column names converted into regexps
$columns_re = array_map ( function ($v) {
global $indexes;
return '/^' . str_replace ( array_map ( function ($v) {
return '<' . $v . '>';
}, $indexes ), '(\d+)', $v ) . '$/';
}, $columns );
// output array
$array = array ();
foreach ( $_GET as $key => $value ) {
foreach ( $columns_re as $reIdx => $re ) {
$matches = array ();
if (preg_match_all ( $re, $key, $matches )) {
// generate unique row id as combination of all indexes
$rowIdx = '';
foreach ( $indexes as $i => $idxName )
$rowIdx .= $matches [$i + 1] [0] . '_';
// fill output row with default values
if (! isset ( $array [$rowIdx] )) {
$array [$rowIdx] = array ();
foreach ( $indexes as $i => $idxName )
$array [$rowIdx] [$idxName] = $matches [$i + 1] [0];
foreach ( $columns as $name )
$array [$rowIdx] [$name] = null;
}
// fill actually found value
$array [$rowIdx] [$columns [$reIdx]] = $value;
}
}
}
Tested with PHP 5.3; with some modifications it can run under any version.

How to get content of tags in array?

I'm using woocommerce and a function returns item data in the below format
<dl class="variation">
<dt>options:</dt><dd>redwood-120mm-x-28mm</dd>
<dt>length:</dt><dd>3.6</dd>
<dt>linear metres:</dt><dd>500</dd>
</dl>
I want to input this data into an array, like the following;
array("options:" => "redwood-120mm-x-28mm", "length:"=> "3.6", "linear metres:" => "500");
How do I do this?
This is the function:
global $woocommerce;
foreach ( $woocommerce->cart->get_cart() as $cart_item_key => $cart_item ) {
echo $woocommerce->cart->get_item_data( $cart_item );
}
}
You can do it with a regular expression, or you may want to give PHP's DOM extension a shot.
My explanation is in the code comments.
//String to parse
$string = '<dl class="variation">
<dt>options:</dt><dd>redwood-120mm-x-28mm</dd>
<dt>length:</dt><dd>3.6</dd>
<dt>linear metres:</dt><dd>500</dd>
</dl>';
//Keys, you want to find
$keys = array('options', 'length', 'linear metres');
//The result array
$result = array();
//Loop through the keys
foreach ($keys as $key) {
//Insert the result into the result array
$result[$key] = getValueByKey($key, $string);
}
//Show results
var_dump($result);
function getValueByKey($key, $string) {
//The pattern by key
$pattern = '/<dt>' . preg_quote($key, '/') . ':<\/dt><dd>(.*?)<\/dd>/i';
//Initialize a match array
$matches = array();
//Do the regular expression
preg_match($pattern, $string, $matches);
if (!empty($matches[1])) {
//If there is a match, return it
return $matches[1];
}
//Otherwise return with false
return false;
}
Output is:
array (size=3)
'options' => string 'redwood-120mm-x-28mm' (length=20)
'length' => string '3.6' (length=3)
'linear metres' => string '500' (length=3)
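For comparison, the DOM route mentioned above could look roughly like the sketch below. It assumes the DOM extension is available and that every dt is followed by exactly one dd, as in the sample markup. Unlike the regex version, it does not need the keys in advance and keeps the trailing colon in each key, as in the array the question asks for.

```php
<?php
$html = '<dl class="variation">
<dt>options:</dt><dd>redwood-120mm-x-28mm</dd>
<dt>length:</dt><dd>3.6</dd>
<dt>linear metres:</dt><dd>500</dd>
</dl>';

// Collect libxml parse warnings internally instead of emitting them.
libxml_use_internal_errors( true );

$doc = new DOMDocument();
$doc->loadHTML( $html );

$terms = $doc->getElementsByTagName( 'dt' );
$defs  = $doc->getElementsByTagName( 'dd' );

$result = array();
// Pair the n-th <dt> with the n-th <dd> (assumes a strict dt/dd alternation).
for ( $i = 0; $i < $terms->length; $i++ ) {
    $dd = $defs->item( $i );
    if ( $dd !== null ) {
        $result[ trim( $terms->item( $i )->textContent ) ] = trim( $dd->textContent );
    }
}
```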

Create shortcode with parameter in PHP Joomla

I've created a simple shortcode plugin on Joomla.
Actually I am trying to integrate Cleeng Video with Joomla, and will connect its users in the future (I hope).
I'm stuck on creating the shortcode's parameters. I don't know how to parse a parameter and its value.
My Shortcode is here (no parameter)
{cleengvideo}<iframe class="wistia_embed" src="http://fast.wistia.net/embed/iframe/5r8r9ib6di" name="wistia_embed" width="640" height="360" frameborder="0" scrolling="no" allowfullscreen=""></iframe>{/cleengvideo}
My code is here
public function onContentPrepare($content, $article, $params, $limit) {
    preg_match_all('/{cleengvideo}(.*?){\/cleengvideo}/is', $article->text, $matches);
    // $i indexes the captured content that belongs to each full match
    foreach ($matches[0] as $i => $match) {
        $videoCode = $matches[1][$i];
        $article->text = str_replace($match, $videoCode, $article->text);
    }
}
At a minimum, I want to be able to set the height, the width, and the video code (5r8r9ib6di) from the shortcode.
Can anyone help me with adding and parsing its parameters?
To get a parameter, you can simply use the following code:
$params->get('param_name', 'default_value');
So for example, in your XML file, if you had a field like so:
<field name="width" type="text" label="Width" default="60px" />
you would call the parameter like so:
$params->get('width', '60px');
Note that you don't have to pass the default value as the second argument; however, I find it good practice.
Hope this helps
I think I found a solution.
It's here https://github.com/Cleeng/cleeng-wp-plugin/blob/master/php/classes/Frontend.php
Code is
$expr = '/\[cleeng_content(.*?[^\\\])\](.*?[^\\\])\[\/cleeng_content\]/is';
preg_match_all( $expr, $post->post_content, $m );
foreach ( $m[0] as $key => $content ) {
$paramLine = $m[1][$key];
$expr = '/(\w+)\s*=\s*(?:\"|")(.*?)(?<!\\\)(?:\"|")/si';
preg_match_all( $expr, $paramLine, $mm );
if ( ! isset( $mm[0] ) || ! count( $mm[0] ) ) {
continue;
}
$params = array( );
foreach ( $mm[1] as $key => $paramName ) {
$params[$paramName] = $mm[2][$key];
}
if ( ! isset( $params['id'] ) ) {
continue;
}
$content = array(
'contentId' => $params['id'],
'shortDescription' => @$params['description'],
'price' => @$params['price'],
'itemType' => 'article',
'purchased' => false,
'shortUrl' => '',
'referred' => false,
'referralProgramEnabled' => false,
'referralRate' => 0,
'rated' => false,
'publisherId' => '000000000',
'publisherName' => '',
'averageRating' => 4,
'canVote' => false,
'currencySymbol' => '',
'sync' => false
);
if ( isset( $params['referral'] ) ) {
$content['referralProgramEnabled'] = true;
$content['referralRate'] = $params['referral'];
}
if ( isset( $params['ls'] ) && isset( $params['le'] ) ) {
$content['hasLayerDates'] = true;
$content['layerStartDate'] = $params['ls'];
$content['layerEndDate'] = $params['le'];
}
$this->cleeng_content[$params['id']] = $content;
}
Hope this helps someone searching for shortcode parameters. To capture parameters in a shortcode, we can use preg_match_all like this:
preg_match_all('/{cleengvideo(.*?)}(.*?){\/cleengvideo}/is', $article->text, $matches);
This gives an array with 3 elements; the second holds the parameter string, which you can then manipulate in code.
Hope this helps.
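To make the last step concrete, here is a minimal sketch that splits the captured parameter string into name/value pairs. The function name `parse_cleengvideo_tags` and the attribute names width, height and id are assumptions for illustration only, and values are assumed to be wrapped in double quotes without a closing brace inside them.

```php
<?php
// Hypothetical helper: find {cleengvideo ...}...{/cleengvideo} blocks and
// return their parameters and inner content.
function parse_cleengvideo_tags( string $text ): array {
    $found = array();
    // Groups: 1 = raw parameter string, 2 = content between the tags.
    preg_match_all( '/\{cleengvideo\b(.*?)\}(.*?)\{\/cleengvideo\}/is', $text, $matches, PREG_SET_ORDER );
    foreach ( $matches as $m ) {
        $params = array();
        // Each parameter is name="value".
        preg_match_all( '/(\w+)\s*=\s*"([^"]*)"/', $m[1], $attrs, PREG_SET_ORDER );
        foreach ( $attrs as $a ) {
            $params[ $a[1] ] = $a[2];
        }
        $found[] = array( 'params' => $params, 'content' => $m[2] );
    }
    return $found;
}

$text = 'Intro {cleengvideo width="640" height="360" id="5r8r9ib6di"}<iframe></iframe>{/cleengvideo} outro';
$tags = parse_cleengvideo_tags( $text );
```

Inside onContentPrepare, the resulting params could then be used to build the iframe markup before replacing the shortcode in $article->text.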
