explode string on multiple words - php

There is a string like this:
$string = 'connector:rtp-monthly direction:outbound message:error writing data: xxxx yyyy zzzz date:2015-11-02 10:20:30';
This string comes from user input, so its parts will not always be in the same order. It's an input field which I need to split to build a DB query.
Now I would like to split the string based on words given in an array(), which acts as a mapper containing the words I need to find in the string. It looks like this:
$mapper = array(
'connector' => array('type' => 'string'),
'direction' => array('type' => 'string'),
'message' => array('type' => 'string'),
'date' => array('type' => 'date'),
);
Only the keys of the $mapper will be relevant. I've tried with foreach and explode like:
$parts = explode(':', $string);
But the problem is that there can be colons elsewhere in the string, and I don't want to split there. I only want to split where a colon directly follows a mapper key. The mapper keys in this case are:
connector // in this case split if "connector:" is found
direction // until "direction:" is found
message // until "message:" is found
date // until "date:" is found
But remember, the user input can vary. The string will always change, and the order of the string and of the mapper array() will never be the same. So I'm not sure if explode is the right way to go, or if I should use a regex, and if so, how to do it.
The desired result should be an array looking something like this:
$desired_result = array(
'connector' => 'rtp-monthly',
'direction' => 'outbound',
'message' => 'error writing data: xxxx yyyy zzzz',
'date' => '2015-11-02 10:20:30',
);
Help is much appreciated.

The trickier part of this is matching the original string. You can do it with a regex, with the help of positive lookahead assertions:
$pattern = "/(connector|direction|message|date):(.+?)(?= connector:| direction:| message:| date:|$)/";
$subject = 'connector:rtp-monthly direction:outbound message:error writing data: xxxx yyyy zzzz date:2015-11-02 10:20:30';
preg_match_all($pattern, $subject, $matches, PREG_SET_ORDER );
$returnArray = array();
foreach($matches as $item)
{
$returnArray[$item[1]] = $item[2];
}
In this Regex /(connector|direction|message|date):(.+?)(?= connector:| direction:| message:| date:|$)/, you're matching:
(connector|direction|message|date) - find a keyword and capture it;
: - followed by a colon;
(.+?) - followed by one or more characters, matched non-greedily, and capture them;
(?= connector:| direction:| message:| date:|$) - up until the next keyword or the end of the string, using a positive lookahead assertion, which checks the text ahead without consuming or capturing it.
The result is:
Array
(
[connector] => rtp-monthly
[direction] => outbound
[message] => error writing data: xxxx yyyy zzzz
[date] => 2015-11-02 10:20:30
)
I didn't use the mapper array just to make the example clear, but you could use implode to put the keywords together.
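As a minimal sketch of that last remark, the pattern could be assembled from the keys of $mapper with implode(). The use of preg_quote() and the non-capturing group in the lookahead are additions for this sketch and are not part of the answer above; it assumes the keys are plain words like the ones in the question.

```php
<?php
// $mapper and $subject are copied from the question; only the keys of $mapper are used.
$mapper = array(
    'connector' => array('type' => 'string'),
    'direction' => array('type' => 'string'),
    'message'   => array('type' => 'string'),
    'date'      => array('type' => 'date'),
);
$subject = 'connector:rtp-monthly direction:outbound message:error writing data: xxxx yyyy zzzz date:2015-11-02 10:20:30';

// Escape each key so it is treated literally inside the pattern.
$quoted = array();
foreach (array_keys($mapper) as $key) {
    $quoted[] = preg_quote($key, '/');
}
$keys = implode('|', $quoted);

// Same structure as the hand-written pattern: keyword, colon, lazy value,
// then a lookahead for " keyword:" or the end of the string.
$pattern = '/(' . $keys . '):(.+?)(?= (?:' . $keys . '):|$)/';

preg_match_all($pattern, $subject, $matches, PREG_SET_ORDER);

$returnArray = array();
foreach ($matches as $item) {
    $returnArray[$item[1]] = $item[2];
}
print_r($returnArray);
```

Because the keyword list appears twice in the pattern, building it once in $keys keeps the capture group and the lookahead in sync when the mapper changes.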

Our aim is to make one array that contains the values of two arrays that we would extract from the string. It is necessary to have two arrays since there are two string delimiters we wish to consider.
Try this:
$parts = array();
$large_parts = explode(" ", $string);
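for($i=0; $i<count($large_parts); $i++){
// Limit to 2 so a value containing ":" stays in one piece
$small_parts = explode(":", $large_parts[$i], 2);
// Skip tokens that do not have a "key:value" shape
if(count($small_parts) < 2) continue;
$parts[$small_parts[0]] = $small_parts[1];
}
$parts will then contain key/value pairs, but this only gives the desired array when the values contain no spaces: a value such as 'error writing data: xxxx yyyy zzzz' is cut at the first space, and tokens such as 'data:' or '10:20:30' produce extra keys.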
Hope you get sorted out.

Here you are. The regex is there to "catch" the key (any sequence of characters, excluding blank space and ":"). Starting from there, I use "explode" to "recursively" split the string. Tested and works well.
$string = 'connector:rtp-monthly direction:outbound message:error writing data date:2015-11-02';
$element = "(.*?):";
preg_match_all( "/([^\s:]*?):/", $string, $matches);
$result = array();
$keys = array();
$values = array();
$counter = 0;
foreach( $matches[0] as $id => $match ) {
$exploded = explode( $matches[ 0 ][ $id ], $string );
$keys[ $counter ] = $matches[ 1 ][ $id ];
if( $counter > 0 ) {
$values[ $counter - 1 ] = $exploded[ 0 ];
}
$string = $exploded[ 1 ];
$counter++;
}
$values[] = $string;
$result = array();
foreach( $keys as $id => $key ) {
$result[ $key ] = $values[ $id ];
}
print_r( $result );

You could use a combination of a regular expression and explode(). Consider the following code:
$str = "connector:rtp-monthly direction:outbound message:error writing data date:2015-11-02";
$regex = "/([^:\s]+):(\S+)/i";
// first group: match any character except ':' and whitespaces
// delimiter: ':'
// second group: match any character which is not a whitespace
// will not match writing and data
preg_match_all($regex, $str, $matches);
$mapper = array();
foreach ($matches[0] as $match) {
list($key, $value) = explode(':', $match);
$mapper[$key][] = $value;
}
Additionally, you might want to think of a better way to store the strings in the first place (JSON? XML?).

Using preg_split() to explode() by multiple delimiters in PHP
Just a quick note here. To explode() a string using multiple delimiters in PHP, you have to use regular expressions. Use the pipe character to separate your delimiters.
$string = 'connector:rtp-monthly direction:outbound message:error writing data: xxxx yyyy zzzz date:2015-11-02 10:20:30';
$chunks = preg_split('/(connector|direction|message)/',$string,-1, PREG_SPLIT_NO_EMPTY);
// Print_r to check response output.
echo '<pre>';
print_r($chunks);
echo '</pre>';
PREG_SPLIT_NO_EMPTY – To return only non-empty pieces.
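If the keywords themselves are needed as array keys, one possible variation is to capture the delimiter with PREG_SPLIT_DELIM_CAPTURE and pair the pieces up afterwards. This is only a sketch: the pattern below (keyword plus colon, at the start of the string or after whitespace, with `date` included) is an assumption and differs from the pattern in the answer above.

```php
<?php
$string = 'connector:rtp-monthly direction:outbound message:error writing data: xxxx yyyy zzzz date:2015-11-02 10:20:30';

// Split on "keyword:" at the start of the string or after whitespace,
// and keep the captured keyword in the result.
$chunks = preg_split(
    '/(?:^|\s+)(connector|direction|message|date):/',
    $string,
    -1,
    PREG_SPLIT_DELIM_CAPTURE
);

// $chunks[0] is whatever precedes the first keyword (empty here);
// after that, keywords and values alternate.
$result = array();
for ($i = 1; $i + 1 < count($chunks); $i += 2) {
    $result[$chunks[$i]] = $chunks[$i + 1];
}
print_r($result);
```

PREG_SPLIT_NO_EMPTY is left out on purpose in this sketch, since dropping an empty value would shift the keyword/value pairing.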

Related

PHP Regex with recursively filter string and append sub-string to array

I'm trying to extract the parts of a string that follow each "/" character using PHP PCRE (regular expressions) rather than the substr() function. I would like to test whether the initial string contains multiple "/" characters, using a regular expression together with preg_match() or preg_match_all().
I am able to select for a SINGLE iteration using a regular expression.
<?php
$rules = array(
'dbl' => "/(?'d'[^/]+)/(?'p'[^/]+)", // '.../a/a' DOUBLE ITERATION
'single' => "/(?'d'[\w\-]+)",// '.../a' SINGLE ITERATION
'multiple' => "" //MULTIPLE ITERATION
);
$string = "a/b/c/d/e";
foreach ( $rules as $action => $rule ) {
if ( preg_match_all( '~^'.$rule.'$~i', $string, $params ) ) {
switch ($action) {
case 'multiple':
$arr = explode("/", $string);
print_r($arr);
//do something
...
}
}
}
?>
I know this is because of my lack of sufficient knowledge of Regular Expressions, however, I need a dynamic Regex code to match the condition that the initial string has multiple "/" characters and then recursively store these substrings to an array.
I would approach this differently: I would first explode $string on / and then apply logic based on the number of elements in the results.
<?php
$string = "a/b/c/d/e";
$arr = explode("/", $string);
if (count($arr) > 2) {
print_r($arr);
// do something knowing there were 2 or more slashes in $string
}
?>
If you need different actions for 0, 1 or 2 slashes, add elseif blocks testing for fewer elements in $arr and put the corresponding actions there.
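A minimal sketch of such branching could look like the following. The function name and the returned labels are hypothetical and only stand in for whatever actions are needed.

```php
<?php
// Hypothetical helper: the name and the returned labels are illustrative only.
function classifyBySlashes(string $string): string
{
    $arr = explode("/", $string);
    if (count($arr) > 2) {
        return 'multiple'; // 2 or more slashes
    } elseif (count($arr) === 2) {
        return 'single';   // exactly 1 slash
    }
    return 'none';         // no slash at all
}

echo classifyBySlashes("a/b/c/d/e"), "\n";
```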
To answer the question, using Wiktor Stribiżew's regex code:
<?php
$rules = array(
'dbl' => "/(?'d'[^/]+)/(?'p'[^/]+)", // '.../a/a' DOUBLE ITERATION
'single' => "/(?'d'[\w\-]+)",// '.../a' SINGLE ITERATION
'multiple' => "/[^/]+(?:/[^/]+){2,}/?" //MULTIPLE ITERATION
);
$string = "a/b/c/d/e";
foreach ( $rules as $action => $rule ) {
if ( preg_match_all( '~^'.$rule.'$~i', $string, $params ) ) {
switch ($action) {
case 'multiple':
$arr = explode("/", $string);
print_r($arr);
//do something
...
}
}
}
?>
For others who reference this resource, kindly upvote Wiktor Stribiżew's answer once (and if) he posts it.

(PHP) Replace string of array elements using regex

I have an array
Array
(
[0] => "http://example1.com"
[1] => "http://example2.com"
[2] => "http://example3.com"
...
)
And I want to replace the http with https of each elements using RegEx. I tried:
$Regex = "/http/";
$str_rpl = '${1}s';
...
foreach ($url_array as $key => $value) {
$value = preg_replace($Regex, $str_rpl, $value);
}
print_r($url_array);
But the result array is still the same. Any thought?
You actually print an array without changing it. Why do you need regex for this?
Edited with Casimir et Hippolyte's hint:
This is a solution using regex:
$url_array = array
(
0 => "http://example1.com",
1 => "http://example2.com",
2 => "http://example3.com",
);
$url_array = preg_replace("/^http:/i", "https:", $url_array);
print_r($url_array);
Without regex:
$url_array = array
(
0 => "http://example1.com",
1 => "http://example2.com",
2 => "http://example3.com",
);
$url_array = str_replace("http://", "https://", $url_array);
print_r($url_array);
First of all, you are not modifying the array values at all. In your example, you are operating on the copies of array values. To actually modify array elements:
use reference mark
foreach($foo as $key => &$value) {
$value = 'new value';
}
or use for instead of foreach loop
for($i = 0; $i < count($foo); $i++) {
$foo[$i] = 'new value';
}
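Applied to the loop from the question, the by-reference variant might look like the sketch below. The unset() call after the loop is an addition not mentioned above; it breaks the lingering reference so that a later write to $value cannot change the last element by accident.

```php
<?php
$url_array = array(
    "http://example1.com",
    "http://example2.com",
    "http://example3.com",
);

// &$value makes the loop variable an alias of the array element.
foreach ($url_array as &$value) {
    $value = preg_replace('/^http:/i', 'https:', $value);
}
unset($value); // drop the reference to the last element

print_r($url_array);
```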
Going back to your question, you can also solve your problem without using regex (whenever you can, it is always better to not use regex [fewer problems, simpler debugging, testing etc.])
$tmp = array_map(static function(string $value) {
return str_replace('http://', 'https://', $value);
}, $url_array);
print_r($tmp);
EDIT:
As Casimir pointed out, since str_replace can take array as third argument, you can just do:
$tmp = str_replace('http://', 'https://', $url_array);
This expression might also work:
^http\K(?=:)
to which we can add more boundaries, for instance to validate the URLs if necessary, such as:
^http\K(?=:\/\/[a-z0-9_-]+\.[a-z0-9_-]+)
Test
$re = '/^http\K(?=:\/\/[a-z0-9_-]+\.[a-z0-9_-]+)/si';
$str = ' http://example1.com ';
$subst = 's';
echo preg_replace($re, $subst, trim($str));
Output
https://example1.com
The expression is explained on the top right panel of regex101.com, if you wish to explore/simplify/modify it, and in this link, you can watch how it would match against some sample inputs, if you like.

Convert string with no delimiters into associative multidimensional array

I need to parse a string that has no delimiting character to form an associative array.
Here is an example string:
*01the title*35the author*A7other useless infos*AEother useful infos*AEsome delimiters can be there multiple times
Every "key" (which precedes its "value") is comprised of an asterisk (*) followed by two alphanumeric characters.
I use this regex pattern: /\*[A-Z0-9]{2}/
This is my preg_split() call:
$attributes = preg_split('/\*[A-Z0-9]{2}/', $line);
This works to isolate the "value", but I also need to extract the "key" to form my desired associative array.
What I get looks like this:
$matches = [
0 => 'the title',
1 => 'the author',
2 => 'other useless infos',
3 => 'other useful infos',
4 => 'some delimiters can be there multiple times'
];
My desired output is:
$matches = [
'*01' => 'the title',
'*35' => 'the author',
'*A7' => 'other useless infos',
'*AE' => [
'other useful infos',
'some delimiters can be there multiple times',
],
];
Use the PREG_SPLIT_DELIM_CAPTURE flag of the preg_split function to also get the captured delimiter (see documentation).
So in your case:
# The -1 is the limit parameter (no limit)
$attributes = preg_split('/(\*[A-Z0-9]{2})/', $line, -1, PREG_SPLIT_DELIM_CAPTURE);
Now you have element 0 of $attributes as everything before the first delimiter and then alternating the captured delimiter and the next group so you can build your $matches array like this (assuming that you do not want to keep the first group):
for($i=1; $i<sizeof($attributes)-1; $i+=2){
$matches[$attributes[$i]] = $attributes[$i+1];
}
In order to account for delimiters being present multiple times you can adjust the line inside the for loop to check whether this key already exists and in that case create an array.
Edit: a possibility to create an array if necessary is to use this code:
$matches = [];
for($i=1; $i<sizeof($attributes)-1; $i+=2){
$key = $attributes[$i];
if(array_key_exists($key, $matches)){
// Key seen before: wrap the existing string in an array, then append
if(!is_array($matches[$key])){
$matches[$key] = [$matches[$key]];
}
array_push($matches[$key], $attributes[$i+1]);
} else {
$matches[$key] = $attributes[$i+1];
}
}
The downstream code can certainly be simplified, especially if you put all values in (possibly single element) arrays.
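As a sketch of that simplification, every value can be stored in an array, so the loop body shrinks to a single append. $line is assumed to hold the example string from the question.

```php
<?php
$line = '*01the title*35the author*A7other useless infos*AEother useful infos*AEsome delimiters can be there multiple times';
$attributes = preg_split('/(\*[A-Z0-9]{2})/', $line, -1, PREG_SPLIT_DELIM_CAPTURE);

$matches = [];
for ($i = 1; $i < count($attributes) - 1; $i += 2) {
    // Appending with [] creates the inner array on first use.
    $matches[$attributes[$i]][] = $attributes[$i + 1];
}
print_r($matches);
```

The trade-off is that single values such as '*01' are then wrapped in a one-element array, so the consuming code always deals with arrays.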
You may match and capture the keys into Group 1 and all the text before the next delimiter into Group 2 where the delimiter is not the same as the first one captured. Then, in a loop, check all the keys and values and split those values with the delimiter pattern where it appears one or more times.
The regex is
(\*[A-Z0-9]{2})(.*?)(?=(?!\1)\*[A-Z0-9]{2}|$)
Details
(\*[A-Z0-9]{2}) - Delimiter, Group 1: a * and two uppercase letters or digits
(.*?) - Value, Group 2: any 0+ chars other than line break chars, as few as possible
(?=(?!\1)\*[A-Z0-9]{2}|$) - up to the delimiter pattern (\*[A-Z0-9]{2}) that is not equal to the text captured in Group 1 ((?!\1)) or end of string ($).
PHP code:
$re = '/(\*[A-Z0-9]{2})(.*?)(?=(?!\1)\*[A-Z0-9]{2}|$)/';
$str = '*01the title*35the author*A7other useless infos*AEother useful infos*AEsome delimiters can be there multiple times';
$res = [];
if (preg_match_all($re, $str, $m, PREG_SET_ORDER, 0)) {
foreach ($m as $kvp) {
// Use {2} so the split pattern matches exactly the two-character delimiter
$tmp = preg_split('~\*[A-Z0-9]{2}~', $kvp[2]);
if (count($tmp) > 1) {
$res[$kvp[1]] = $tmp;
} else {
$res[$kvp[1]] = $kvp[2];
}
}
print_r($res);
}
Output:
Array
(
[*01] => the title
[*35] => the author
[*A7] => other useless infos
[*AE] => Array
(
[0] => other useful infos
[1] => some delimiters can be there multiple times
)
)
Ok, I answer my own question on how to handle the multiple same delimiters.
Thanks to #markus-ankenbrand for the start:
$attributes = preg_split('/(\*[A-Z0-9]{2})/', $line, -1, PREG_SPLIT_DELIM_CAPTURE);
$matches = [];
for ($i = 1; $i < sizeof($attributes) - 1; $i += 2) {
if (isset($matches[$attributes[$i]]) && is_array($matches[$attributes[$i]])) {
$matches[$attributes[$i]][] = $attributes[$i + 1];
} elseif (isset($matches[$attributes[$i]]) && !is_array($matches[$attributes[$i]])) {
$currentValue = $matches[$attributes[$i]];
$matches[$attributes[$i]] = [$currentValue];
$matches[$attributes[$i]][] = $attributes[$i + 1];
} else {
$matches[$attributes[$i]] = $attributes[$i + 1];
}
}
The fat if/else statement does not look really nice, but it does what it needs to do.
Here is a functional-style approach that doesn't require duplicate-keyed values to be consecutively written in the input string.
Use preg_match_all() to isolate the two components of each subexpression in the input string.
Use array_map() to replace each row of indexed match values with a single, associative element.
Use the spread operator (...) to unpack the newly modified matches array as individual associative arrays and feed that to array_merge_recursive(). The native behavior of array_merge_recursive() is to only create subarray structures where necessary.
Code:
$str = '*01the title*35the author*A7other useless infos*AEother useful infos*AEsome delimiters can be there multiple times';
var_export(
array_merge_recursive(
...array_map(
fn($row) => [$row[1] => $row[2]],
preg_match_all(
'/(\*[A-Z\d]{2})(.+?)(?=$|\*[A-Z\d]{2})/',
$str,
$m,
PREG_SET_ORDER
) ? $m : []
)
)
);
Output:
array (
'*01' => 'the title',
'*35' => 'the author',
'*A7' => 'other useless infos',
'*AE' =>
array (
0 => 'other useful infos',
1 => 'some delimiters can be there multiple times',
),
)

PHP Regex for a specific numeric value inside a comma-delimited integer number string

I am trying to get the integer on the left and right for an input from the $str variable using REGEX. But I keep getting the commas back along with the integer. I only want integers not the commas. I have also tried replacing the wildcard . with \d but still no resolution.
$str = "1,2,3,4,5,6";
function pagination()
{
global $str;
// Using number 4 as an input from the string
preg_match('/(.{2})(4)(.{2})/', $str, $matches);
echo $matches[0]."\n".$matches[1]."\n".$matches[1]."\n".$matches[1]."\n";
}
pagination();
How about using a CSV parser?
$str = "1,2,3,4,5,6";
$line = str_getcsv($str);
$target = 4;
foreach($line as $key => $value) {
if($value == $target) {
echo $line[($key-1)] . '<--low high-->' . $line[($key+1)];
}
}
Output:
3<--low high-->5
or a regex could be
$str = "1,2,3,4,5,6";
preg_match('/(\d+),4,(\d+)/', $str, $matches);
echo $matches[1]."<--low high->".$matches[2];
Output:
3<--low high->5
The only flaw with these approaches is if the number is the start or end of range. Would that ever be the case?
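One way to cover those edge cases is to look the neighbours up by index and fall back to null when there is none. This is only a sketch: the function name is hypothetical, and it uses explode() instead of str_getcsv(), which is enough for a plain comma-separated list of integers.

```php
<?php
// Hypothetical helper: returns [lower neighbour, higher neighbour],
// with null where the target is at the start or end, or is not found.
function neighbours(string $csv, string $target): array
{
    $items = explode(',', $csv);
    $key = array_search($target, $items, true);
    if ($key === false) {
        return [null, null];
    }
    // ?? null covers the missing index before the first or after the last item.
    return [$items[$key - 1] ?? null, $items[$key + 1] ?? null];
}

list($low, $high) = neighbours("1,2,3,4,5,6", "4");
echo $low . '<--low high-->' . $high . "\n";
```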
I believe you're looking for Regex Non Capture Group
Here's what I did:
$regStr = "1,2,3,4,5,6";
$regex = "/(\d)(?:,)(4)(?:,)(\d)/";
preg_match($regex, $regStr, $results);
print_r($results);
Gives me the results:
Array ( [0] => 3,4,5 [1] => 3 [2] => 4 [3] => 5 )
Hope this helps!
Given your function name I am going to assume you need this for pagination.
The following solution might be easier:
$str = "1,2,3,4,5,6,7,8,9,10";
$str_parts = explode(',', $str);
// reset and end return the first and last element of an array respectively
$start = reset($str_parts);
$end = end($str_parts);
This prevents your regex from having to deal with your numbers getting into the double digits.

Error parsing regex pattern in php

I want to split a string such as the following (by a divider like '~##' (and only that)):
to=enquiry#test.com~##subject=test~##text=this is body/text~##date=date
into an array containing e.g.:
to => enquiry#test.com
subject => test
text => this is body/text
date => date
I'm using php5 and I've got the following regex, which almost works, but there are a couple of errors and there must be a way to do it in one go:
//Split the string in the url of $text at every ~##
$regexp = "/(?:|(?<=~##))(.*?=.*?)(?:~##|$|\/(?!.*~##))/";
preg_match_all($regexp, $text, $a);
//$a[1] is an array containing var1=content1 var2=content2 etc;
//Now create an array in the form [var1] = content, [var2] = content2
foreach($a[1] as $key => $value) {
//Get the two groups either side of the equals sign
$regexp = "/([^\/~##,= ]+)=([^~##,= ]+)/";
preg_match_all($regexp, $value, $r);
//Assign to array key = value
$val[$r[1][0]] = $r[2][0]; //e.g. $val['subject'] = 'hi'
}
print_r($val);
My queries are that:
It doesn't seem to capture more than 3 different sets of parameters
It is breaking on the # symbol and so not capturing email addresses e.g. returning:
to => enquiry
subject => test
text => this is body/text
I am doing multiple different regex searches where I suspect I would be able to do one.
Any help would be really appreciated.
Thanks
Why are you using regex when there is a much simpler way to do this with explode, like this:
$str = 'to=enquiry#test.com~##subject=test~##text=this is body/text~##date=date';
$array = explode('~##',$str);
$finalArr = array();
foreach($array as $val)
{
// Limit to 2 parts so a value that itself contains "=" is kept whole
$tmp = explode('=',$val,2);
$finalArr[$tmp[0]] = isset($tmp[1]) ? $tmp[1] : '';
}
echo '<pre>';
print_r($finalArr);
