Parsing attributes in PHP using regular expressions - php

Consider the following string:
$string = 'tag2 display="users" limit="5"';
Using the preg_match_all function, I need to get the following output.
Required output:
Array
(
[0] => Array
(
[0] => tag2
[1] => tag2
[2] =>
)
[1] => Array
(
[0] => display="users"
[1] => display
[2] => users
)
[2] => Array
(
[0] => limit="5"
[1] => limit
[2] => 5
)
)
I tried the pattern '/([^=\s]+)="([^"]+)"/', but it does not recognize a parameter with no value (in this case tag2). Instead, it gives the output below.
What I am getting:
Array
(
[0] => Array
(
[0] => display="users"
[1] => display
[2] => users
)
[1] => Array
(
[0] => limit="5"
[1] => limit
[2] => 5
)
)
What pattern will produce the required output?
EDIT 1: I also need to match attributes whose values are not wrapped in quotes, e.g. attr=val. Sorry for not mentioning this before.

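Try this:
<?php
$string = 'tag2 display="users" limit="5"';
// The value part (="...") is optional, so bare names like tag2 also match.
preg_match_all('/([^=\s]+)(="([^"]+)")?/', $string, $res);
$o = array();
foreach ($res[0] as $r => $v) {
    // $res[3] holds the value without quotes (empty for bare names).
    $o[] = array($res[0][$r], $res[1][$r], $res[3][$r]);
}
print_r($o);
?>
It outputs: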
Array
(
[0] => Array
(
[0] => tag2
[1] => tag2
[2] =>
)
[1] => Array
(
[0] => display="users"
[1] => display
[2] => users
)
[2] => Array
(
[0] => limit="5"
[1] => limit
[2] => 5
)
)

I don't think a single call can give you exactly what you're looking for, but this comes pretty close:
$string = 'tag2 display="users" limit=5';
preg_match_all('/([^=\s]+)(?:="?([^"]+)"?|())?/', $string, $res, PREG_SET_ORDER);
print_r($res);
Output:
Array
(
[0] => Array
(
[0] => tag2
[1] => tag2
[2] =>
[3] =>
)
[1] => Array
(
[0] => display="users"
[1] => display
[2] => users
)
[2] => Array
(
[0] => limit=5
[1] => limit
[2] => 5
)
)
As you can see, the first element has no value. I worked around that by adding an empty alternative, (), so that it still gets an empty value entry. This builds the array you asked for, but with one additional empty entry for the attribute without a value.
However, the main point is the PREG_SET_ORDER flag of preg_match_all. Maybe you can already live with this output.
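Regarding EDIT 1 in the question, one pattern can also cover unquoted values. The following is only a minimal sketch and is not taken from the answers here: the pattern and the names $pattern, $result and $value are illustrative assumptions, and it assumes that unquoted values contain no whitespace.

```php
<?php
// Sketch: matches bare names, name="value", name='value' and name=value.
$string = 'tag2 display="users" limit=5';
$pattern = '/([^=\s]+)(?:=(?:"([^"]*)"|\'([^\']*)\'|(\S+)))?/';
preg_match_all($pattern, $string, $matches, PREG_SET_ORDER);

$result = array();
foreach ($matches as $m) {
    // Groups 2 to 4 hold the double-quoted, single-quoted or unquoted value.
    // With PREG_SET_ORDER, trailing unmatched groups are missing, hence isset().
    $value = '';
    for ($i = 2; $i <= 4; $i++) {
        if (isset($m[$i]) && $m[$i] !== '') {
            $value = $m[$i];
            break;
        }
    }
    $result[] = array($m[0], $m[1], $value);
}
print_r($result);
```

Each entry of $result then has the same three-element shape as the required output, whichever quoting style the attribute used.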

Maybe you're interested in this little snippet that parses all sorts of attribute styles. <div class="hello" id=foobar style='display:none'> is valid HTML(5); not pretty, I know…
<?php
$string = '<tag2 display="users" limit="5">';
$attributes = array();
$pattern = "/\s+(?<name>[a-z0-9-]+)=(((?<quotes>['\"])(?<value>.*?)\k<quotes>)|(?<value2>[^'\" ]+))/i";
preg_match_all($pattern, $string, $matches, PREG_SET_ORDER);
foreach ($matches as $match) {
    // Quoted values land in 'value', unquoted ones in 'value2'.
    $attributes[$match['name']] = $match['value'] ?: $match['value2'];
}
var_dump($attributes);
var_dump($attributes);
will give you
$attributes = array(
'display' => 'users',
'limit' => '5',
);

Related

PHP add String to multidimensional Array, comma separated

I'm trying to add a string to a 3*x array. I have an input string with 150*3 values.
<?php
$myString = "5.1,3.5,Red,4.9,3,Blue,4.7,3.2,Red,4.6,3.1,Red,5,3.6,Red,"; //and so on
?>
The result should look like this:
Array
(
[0] => Array
(
[0] => 5.1
[1] => 3.5
[2] => Red
)
[1] => Array
(
[0] => 4.9
[1] => 3
[2] => Blue
)
//and so on
)
First, you will need to convert the comma separated string into an array. Then you can use the array_chunk() function.
$myString = "5.1,3.5,Red,4.9,3,Blue,4.7,3.2,Red,4.6,3.1,Red,5,3.6,Red";
$explodedStringToArray = explode(',', $myString);
$chunked_array = array_chunk($explodedStringToArray, 3);
print_r($chunked_array);
This will produce:
Array
(
[0] => Array
(
[0] => 5.1
[1] => 3.5
[2] => Red
)
[1] => Array
(
[0] => 4.9
[1] => 3
[2] => Blue
)
[2] => Array
(
[0] => 4.7
[1] => 3.2
[2] => Red
)
[3] => Array
(
[0] => 4.6
[1] => 3.1
[2] => Red
)
[4] => Array
(
[0] => 5
[1] => 3.6
[2] => Red
)
)
You can use explode() on the string and then array_chunk() to split the resulting array into chunks. Keep the chunk size in mind: if the number of values is not a multiple of 3, the last chunk will be shorter.
Working snippet: https://3v4l.org/qD1t0
<?php
$myString = "5.1,3.5,Red,4.9,3,Blue,4.7,3.2,Red,4.6,3.1,Red,5,3.6,Red"; //and so on
$arr = explode(",", $myString);
$chunks = array_chunk($arr, 3);
print_r($chunks);
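The input in the question ends with a trailing comma, which would make explode() return an extra empty element. A minimal sketch of one way to guard against that and to check the chunk size follows. The variable names $values and $rows and the choice to throw an exception are assumptions for illustration.

```php
<?php
$myString = "5.1,3.5,Red,4.9,3,Blue,4.7,3.2,Red,";

// Drop trailing commas so explode() does not yield an empty last element.
$values = explode(',', rtrim($myString, ','));

// Refuse input that does not divide evenly into groups of three.
if (count($values) % 3 !== 0) {
    throw new UnexpectedValueException('Expected a multiple of 3 values');
}

$rows = array_chunk($values, 3);
print_r($rows);
```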

Capture function name with brackets

I am trying to capture the whole function name for example:
player($var)
I tried this:
preg_match_all('/(function )(?P<name>\w+)/', $content, $matches, PREG_OFFSET_CAPTURE);
This returns:
[2] => Array
(
[0] => Array
(
[0] => __construct
[1] => 140
)
[1] => Array
(
[0] => player
[1] => 365
)
[2] => Array
(
[0] => creates
[1] => 13356
)
[3] => Array
(
[0] => onYouTubeIframeAPIReady
[1] => 13475
)
Why is this not returning the function variables?
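To capture the variables, change the regex to:
/(function )(?P<name>\w+\s*\([^)]*\))/
or
/(function )(?P<name>\w+)\s*\((?P<variables>[^)]*)\)/
In the first pattern, the named group name contains the whole signature, e.g. player($var). In the second pattern, the function name is in the named group name and the variables are in the named group variables.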
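As a quick check, the pattern with separate name and variables groups can be run against a small sample. The contents of $content below are a hypothetical stand-in, since the original source file is not shown in the question.

```php
<?php
// Hypothetical sample source; single quotes keep $var, $a and $b literal.
$content = 'function player($var) { } function creates($a, $b) { }';

$pattern = '/(function )(?P<name>\w+)\s*\((?P<variables>[^)]*)\)/';
preg_match_all($pattern, $content, $matches);

print_r($matches['name']);      // function names
print_r($matches['variables']); // raw parameter lists, without parentheses
```

Here $matches['name'] holds player and creates, and $matches['variables'] holds $var and $a, $b.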

preg_match_all and umlauts

I am using preg_match_all to filter out strings.
The string I pass to preg_match_all is
$text = "Friedric'h Wöhler";
After that I use
preg_match_all('/(\"[^"]+\"|[\\p{L}\\p{N}\\*\\-\\.\\?]+)/', $text, $arr, PREG_PATTERN_ORDER);
and the result I get when I print $arr is
Array
(
[0] => Array
(
[0] => friedric
[1] => h
[2] => w
[3] => ouml
[4] => hler
)
[1] => Array
(
[0] => friedric
[1] => h
[2] => w
[3] => ouml
[4] => hler
)
)
Somehow the ö character is replaced by ouml, and I am not sure how to fix this.
I am expecting the following result:
Array
(
[0] => Array
(
[0] => Friedric'h
[1] => Wöhler
)
)
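Per nhahtdh's comment, add the u modifier so that the pattern and subject are treated as UTF-8, and add the apostrophe to the character class: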
$text = "Friedric'h Wöhler";
preg_match_all('/"[^"]+"|[\p{L}\p{N}*.?\\\'-]+/u', $text, $arr, PREG_PATTERN_ORDER);
echo "<pre>";
print_r($arr);
echo "</pre>";
Gives
Array
(
[0] => Array
(
[0] => Friedric'h
[1] => Wöhler
)
)
If you think preg_match_all() is messy, you could take a look at pattern():
$p = '"[^"]+"|[\p{L}\p{N}*.?\\\'-]+'; // automatic delimiters
$text = "Friedric'h Wöhler";
$result = pattern($p)->match($text)->all();
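The ouml fragment in the question's output suggests that the text may have reached preg_match_all as an HTML entity (&ouml;) rather than as a literal ö. That is only a guess, since the question does not show where $text comes from. Under that assumption, a minimal sketch would decode the entities first and then match with the u modifier:

```php
<?php
// Assumption: the input arrives HTML-encoded, e.g. from form or page content.
$text = "Friedric'h W&ouml;hler";

// Turn entities such as &ouml; back into UTF-8 characters.
$decoded = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// Same pattern as in the answer; /u enables UTF-8 matching for \p{L}.
preg_match_all('/"[^"]+"|[\p{L}\p{N}*.?\'-]+/u', $decoded, $arr);
print_r($arr[0]);
```

If the input is already plain UTF-8, the decoding step is not needed.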

extract values from a query regex

I need to extract the values of a condition (WHERE) and wrote a regex, but I cannot get the values correctly.
//Patterns
$regex = "/([a-zA-Z_]+)\s([\<\=\>\s]{0,4})\s+(\".*\")/";
//values to be extracted
$string = 'idCidade >= "bla" OR idEstado="2" and idPais="3"';
//regex function
preg_match_all(
$regex,
$string,
$output
);
//displays the result
echo '<pre>';print_r($output);
//incorrect output
Array
(
[0] => Array
(
[0] => idCidade >= "bla" OR idEstado="2" and idPais="3"
)
[1] => Array
(
[0] => idCidade
)
[2] => Array
(
[0] => >=
)
[3] => Array
(
[0] => "bla" OR idEstado="2" and idPais="3"
)
)
I need the regular expression to export the values to an array like this:
//correct output
Array
(
[0] => Array
(
[0] => idCidade >= "bla" OR idEstado="2" and idPais="3"
)
[1] => Array
(
[0] => idCidade
[1] => idEstado
[2] => idPais
)
[2] => Array
(
[0] => >=
[1] => =
[2] => =
)
[3] => Array
(
[0] => "bla"
[1] => "2"
[2] => "3"
)
[4] => Array
(
[0] => "OR"
[1] => "AND"
[2] => ""
)
)
Your mistake was probably the .*, which matches too much. You'd need to make it "ungreedy" by appending a question mark: .*?
I would, however, suggest this regex:
'/(OR|AND)?\s*(\w+)\s*([<=>!]+)\s*("[^"]*"|\'[^\']*\'|\d+)/i'
This matches the boolean connector first, and optionally, so that you get:
[1] => Array
(
[0] =>
[1] => OR
[2] => and
)
[2] => Array
(
[0] => idCidade
[1] => idEstado
[2] => idPais
)
[3] => Array
(
[0] => >=
[1] => =
[2] => =
)
[4] => Array
(
[0] => "bla"
[1] => "2"
[2] => "3"
)
I've also made it work for single-quoted (SQL-style) strings and unquoted integers. But this is only borderline a job for regex; a real parser would be advisable (though I don't know your use case).
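If one record per condition is more convenient than one array per group, the same regex can be combined with PREG_SET_ORDER. This is a minimal sketch: the $conditions array and its key names are assumptions for illustration, not part of the question.

```php
<?php
$string = 'idCidade >= "bla" OR idEstado="2" and idPais="3"';
$regex = '/(OR|AND)?\s*(\w+)\s*([<=>!]+)\s*("[^"]*"|\'[^\']*\'|\d+)/i';

// PREG_SET_ORDER yields one array per matched condition.
preg_match_all($regex, $string, $matches, PREG_SET_ORDER);

$conditions = array();
foreach ($matches as $m) {
    $conditions[] = array(
        'connector' => strtoupper($m[1]), // empty string for the first condition
        'field'     => $m[2],
        'operator'  => $m[3],
        'value'     => $m[4],             // still includes the quotes
    );
}
print_r($conditions);
```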
Try this. It produces the four groups you need; note that the values in group 3 are captured without their surrounding quotes.
<?php //Patterns
$regex = '/([a-zA-Z_]+)\s*([>=<]*)\s*"([^"]*)"\s*(or|and)*/i';
//values to be extracted
$string = 'idCidade >= "bla" OR idEstado="2" and idPais="3"';
//regex function
preg_match_all(
$regex,
$string,
$output
);
//displays the result
echo '<pre>';print_r($output);

How can I split a list with multiple delimiters?

Basically, I want to enter text into a text area and then use the values. For example:
variable1:variable2#variable3
variable1:variable2#variable3
variable1:variable2#variable3
I know I could use explode to make each line into an array, and then use a foreach loop to use each line separately, but how would I separate the three variables to use?
Besides preg_split:
$line = 'variable11:variable12#variable13';
print_r(preg_split('/[:#]/', $line));
/*
Array
(
[0] => variable11
[1] => variable12
[2] => variable13
)
*/
you could do a preg_match_all:
$text = 'variable11:variable12#variable13
variable21:variable22#variable23
variable31:variable32#variable33';
preg_match_all('/([^\r\n:]+):([^\r\n#]+)#(.*)\s*/', $text, $matches, PREG_SET_ORDER);
print_r($matches);
/*
Array
(
[0] => Array
(
[0] => variable11:variable12#variable13
[1] => variable11
[2] => variable12
[3] => variable13
)
[1] => Array
(
[0] => variable21:variable22#variable23
[1] => variable21
[2] => variable22
[3] => variable23
)
[2] => Array
(
[0] => variable31:variable32#variable33
[1] => variable31
[2] => variable32
[3] => variable33
)
)
*/
Try preg_split: http://php.net/manual/en/function.preg-split.php
If necessary, you could also make several calls to explode:
http://jp.php.net/manual/en/function.explode.php
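Putting the two steps together, a minimal sketch might first split the text area content into lines and then split each line on both delimiters. The names $text, $lines and $rows are illustrative, and the sample input is made up.

```php
<?php
$text = "variable11:variable12#variable13\r\nvariable21:variable22#variable23\n";

// \R matches any line-ending style; PREG_SPLIT_NO_EMPTY skips blank lines.
$lines = preg_split('/\R/', $text, -1, PREG_SPLIT_NO_EMPTY);

$rows = array();
foreach ($lines as $line) {
    // Split each line on either ':' or '#'.
    $rows[] = preg_split('/[:#]/', trim($line));
}
print_r($rows);
```

Each element of $rows is then an array of the three values from one line.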
