Array(
[1] => put returns (between) paragraphs
[2] => (for) linebreak (add) 2 spaces at end
[3] => indent code by 4 (spaces!)
[4] => to make links
)
For each value, I want to extract the text inside parentheses:
take only the first match
remove this match from the value
write all matches to a new array
After the function runs, the arrays should look like:
Array(
[1] => put returns paragraphs
[2] => linebreak (add) 2 spaces at end
[3] => indent code by 4
[4] => to make links
)
Array(
[1] => between
[2] => for
[3] => spaces!
[4] =>
)
What is the solution?
I would use the regular expression /\((\([^()]*\)|[^()]*)\)/ (this will match one or two pairs of parentheses) together with preg_split:
$matches = array();
foreach ($arr as &$value) {
$parts = preg_split('/\((\([^()]*\)|[^()]*)\)/', $value, 2, PREG_SPLIT_DELIM_CAPTURE);
if (count($parts) > 1) {
$matches[] = current(array_splice($parts, 1, 1));
$value = implode('', $parts);
}
}
With the PREG_SPLIT_DELIM_CAPTURE flag set, preg_split includes the captured part of each delimiter in the result array. So if a match was found, there are at least three parts, and the second member is the one we are looking for. That member is removed with array_splice, which also returns an array of the removed members. To get the removed member itself, current is applied to the return value of array_splice. The remaining members are then joined back together.
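To make the "at least three parts" point concrete, here is a minimal sketch of what preg_split returns for one value from the question. The variable names $snapshot, $removed, $match and $rest are only for illustration, and the removed member is read by index rather than with current.

```php
<?php
$value = '(for) linebreak (add) 2 spaces at end';

// Limit 2: split on the first parenthesised group only.
$parts = preg_split('/\((\([^()]*\)|[^()]*)\)/', $value, 2, PREG_SPLIT_DELIM_CAPTURE);
$snapshot = $parts; // text before the match, the captured text, text after the match

// Take the captured text out of the array and glue the rest back together.
$removed = array_splice($parts, 1, 1);
$match = $removed[0];
$rest = implode('', $parts);
```

Note that $rest keeps the space that followed the removed group, so a trim() may be wanted if the exact output from the question is required.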
Assuming you meant (between) and not ((between))
$arr = array(
0 => 'put returns (between) paragraphs',
1 => '(for) linebreak (add) 2 spaces at end',
2 => 'indent code by 4 (spaces!)',
3 => 'to make links');
var_dump($arr);
$new_arr = array();
foreach($arr as $key => &$str) {
if(preg_match('/\((.*?)\)/',$str,$m)) {
$new_arr[] = $m[1]; // captured text without the surrounding parentheses
$str = preg_replace('/\(.*?\)/','',$str,1); // remove only the first match
}
else {
$new_arr[] = '';
}
}
var_dump($arr);
var_dump($new_arr);
I have this array:
$dataArr = Array(
[0] => Repper
[1] => Pavel
[2] => 7.1.1970
[3] => K.H.Máchy //start of address
[4] => 1203/2,
[5] => Bruntál // end of address
[6] => EM092884
[7] => 7.1.2019
);
I need to modify this array so that the address (indexes 3 to 5) is merged into a single value at index 3 and indexes 4 and 5 are removed. The modified array will then have indexes 0 to 5 (6 values). The address may span more values and end, for example, at index 9, but it always starts at index 3.
Expected result:
$dataArr= Array(
[0] => Repper
[1] => Pavel
[2] => 7.1.1970
[3] => K.H.Máchy 1203/2, Bruntál
[4] => EM092884
[5] => 7.1.2019
);
My idea was as follows:
I go through the array from index 3 and look for a regex match (the value just after the end of the address). Until an array value matches the regex, I append the values to a string.
$address = NULL; //empty variable for address from $dataArr
foreach($dataArr as $k => $val) {
if($k >= 3) {
if(! preg_match('/^\[A-Za-z]{2}\d{6}/', $val)) {
$address .= $val;
//Then put this variable $address in position $dataArr[3]
}
}
}
But it seems that the 'if' condition with preg_match is always true. I need the foreach loop to stop before index 6, but it keeps running to the last value of the array. Where's the mistake? This problem is keeping me from completing the script. Thank you very much.
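As for the mistake itself: in the pattern above, \[ escapes the opening bracket, so the regex looks for a literal "[A-Za-z" instead of two letters and never matches this data. The loop also has no break, so it would keep running even after a match. A minimal sketch of the loop with both points addressed follows; it assumes the value after the address always looks like two letters followed by six digits, and $head, $addressParts, $tail and $result are illustrative names.

```php
<?php
$dataArr = ['Repper', 'Pavel', '7.1.1970', 'K.H.Máchy', '1203/2,', 'Bruntál', 'EM092884', '7.1.2019'];

$head = array_slice($dataArr, 0, 3); // everything before the address
$addressParts = [];
$tail = [];

// array_slice reindexes from 0, so $i is the offset from index 3.
foreach (array_slice($dataArr, 3) as $i => $val) {
    if (preg_match('/^[A-Za-z]{2}\d{6}$/', $val)) {
        $tail = array_slice($dataArr, 3 + $i); // first value after the address, and the rest
        break;                                 // stop at the end of the address
    }
    $addressParts[] = $val;
}

$result = array_merge($head, [implode(' ', $addressParts)], $tail);
```

If no value matches the pattern, everything from index 3 onward ends up in the address, so real data may need an extra check.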
One other possibility, pop off the beginning and end
$first = array_splice($dataArr, 0, 3);
$last = array_splice($dataArr, -2);
Then implode the remaining part and put it all back together.
$dataArr = array_merge($first, [implode(' ', $dataArr)], $last);
// or, in PHP 7.4 and later, with the spread operator
$dataArr = [...$first, implode(' ', $dataArr), ...$last];
This should work regardless of the size of the address, but it does totally depend on the last two elements after the address always being present, so if there's any way those would be missing sometimes you'll need something a little more complicated to account for that.
Why overcomplicate things with regexp and loops? Just literally do what you describe: if your address runs from n to m, take the array slice from n to m, implode that to a string, set array[n] to that string, and then remove fields [n+1...m] from your array:
function collapse_address($arr, $start, $end) {
$len = $end - $start;
// collapse the address:
$arr[$start] = join(" ", array_slice($arr, $start, $len));
// remove the now-duplicate fields:
array_splice($arr, $start + 1, $len - 1);
// and we're done.
return $arr;
}
$arr = Array(
'Repper',
'Pavel',
'7.1.1970',
'K.H.Máchy', //start of address
'1203/2',
'Bruntál', // end of address
'EM092884',
'7.1.2019'
);
$arr = collapse_address($arr, 3, 6);
result:
Array
(
[0] => Repper
[1] => Pavel
[2] => 7.1.1970
[3] => K.H.Máchy 1203/2 Bruntál
[4] => EM092884
[5] => 7.1.2019
)
Of course, you might not want $end to be exclusive, but that's up to you.
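If an inclusive $end is preferred, a variant could look like the sketch below. The function name collapse_address_inclusive is hypothetical, and it uses the replacement argument of array_splice to do the collapse and the removal in one call.

```php
<?php
// Hypothetical variant: $end is the index of the LAST address field (inclusive).
function collapse_address_inclusive(array $arr, int $start, int $end): array
{
    $len = $end - $start + 1;
    $address = implode(' ', array_slice($arr, $start, $len));
    // Replace the $len address fields with the single joined string.
    array_splice($arr, $start, $len, [$address]);
    return $arr;
}

$arr = ['Repper', 'Pavel', '7.1.1970', 'K.H.Máchy', '1203/2', 'Bruntál', 'EM092884', '7.1.2019'];
$collapsed = collapse_address_inclusive($arr, 3, 5);
```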
I have an array variable which contain values like this:
$items = array(
"tbFrench",
"eaItaly1",
"discount21",
"kkMM5",
"NbndA",
"fcMNSS334"
);
I need to remove the last character of each string in this array if that character is a digit, for example:
$newItems = array();
foreach($items as $item){
$newItems[] = $this->removeLastCharacter($item);
}
print_r($newItems);
....
function removeLastCharacter($string){
// ????
}
I want the result to look like this when I print_r the $newItems variable:
Array ( [0] => tbFrench [1] => eaItaly [2] => discount2 [3] => kkMM [4] => NbndA [5] => fcMNSS33 )
You could use regular expressions to remove the last digit.
function removeLastCharacter($string){
return preg_replace('/\d$/', '', $string);
}
\d matches any digit and $ anchors the match at the end of the string, so a character is only removed if it is a digit in the last position.
You can do a RegEx replacement over all the items in an array by simply providing the array as the subject, like so:
$items = preg_replace('/\d$/', '', $items);
There's no need to put it into a function at all - print_r($items) outputs:
Array
(
[0] => tbFrench
[1] => eaItaly
[2] => discount2
[3] => kkMM
[4] => NbndA
[5] => fcMNSS33
)
If you want to remove all trailing digits, you can use /\d+$/ instead.
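The difference between removing one trailing digit and removing all of them can be seen side by side in this short sketch, which passes the question's array directly to preg_replace. The names $lastDigitRemoved and $allDigitsRemoved are illustrative.

```php
<?php
$items = ['tbFrench', 'eaItaly1', 'discount21', 'kkMM5', 'NbndA', 'fcMNSS334'];

// \d$ removes a single digit at the end of each string.
$lastDigitRemoved = preg_replace('/\d$/', '', $items);

// \d+$ removes the whole run of digits at the end of each string.
$allDigitsRemoved = preg_replace('/\d+$/', '', $items);
```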
Try the code below and see if it solves your problem:
$items = array(
"tbFrench",
"eaItaly1",
"discount21",
"kkMM5",
"NbndA",
"fcMNSS334"
);
$newArr=array();
foreach($items as $item){
$data = preg_replace('/\d$/','',$item);
array_push($newArr,$data);
}
print_r($newArr);
Why not simply?
print_r( preg_replace( '/\d+$/', "", $items ) ); // preg_replace accepts an array as argument, pass yours directly, no need for a loop.
Array
(
[0] => tbFrench
[1] => eaItaly
[2] => discount
[3] => kkMM
[4] => NbndA
[5] => fcMNSS
)
Regex Explanation:
\d - matches a digit (equal to [0-9])
+ Quantifier - matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
$ - asserts position at the end of the string (or just before a final newline)
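There is a quick trick using rtrim:
$result = rtrim($str,"0..9");
The second argument is a character range written with two dots (".."). Note that rtrim strips every trailing digit, not just the last one, so "discount21" becomes "discount".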
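For comparison, here is a sketch of a regex-free version that removes only the last digit, next to the rtrim call. It fills in the removeLastCharacter idea from the question as a plain function; the name remove_last_digit is an assumption for illustration, and ctype_digit requires the ctype extension (enabled by default).

```php
<?php
// Remove the final character only when it is a digit.
function remove_last_digit(string $string): string
{
    if ($string !== '' && ctype_digit(substr($string, -1))) {
        return substr($string, 0, -1);
    }
    return $string;
}

$items = ['tbFrench', 'eaItaly1', 'discount21', 'kkMM5', 'NbndA', 'fcMNSS334'];

$oneDigit = array_map('remove_last_digit', $items);

// rtrim with a "0..9" range strips the whole run of trailing digits.
$allDigits = array_map(function ($s) {
    return rtrim($s, '0..9');
}, $items);
```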
I have a string whose correct syntax is the regex ^([0-9]+[abc])+$. So examples of valid strings would be: '1a2b' or '00333b1119a555a0c'
For clarity, the string is a list of (value, letter) pairs and the order matters. I'm stuck with the input string so I can't change that. While testing for correct syntax seems easy in principle with the above regex, I'm trying to think of the most efficient way in PHP to transform a compliant string into a usable array something like this:
Input:
'00333b1119a555a0c'
Output:
array (
0 => array('num' => '00333', 'let' => 'b'),
1 => array('num' => '1119', 'let' => 'a'),
2 => array('num' => '555', 'let' => 'a'),
3 => array('num' => '0', 'let' => 'c')
)
I'm having difficulty using preg_match for this. For example this doesn't give the expected result, the intent being to greedy-match on EITHER \d+ (and save that) OR [abc] (and save that), repeated until end of string reached.
$text = '00b000b0b';
$out = array();
$x = preg_match("/^(?:(\d+|[abc]))+$/", $text, $out);
This didn't work either, the intent here being to greedy-match on \d+[abc] (and save these), repeated until end of string reached, and split them into numbers and letter afterwards.
$text = '00b000b0b';
$out = array();
$x = preg_match("/^(?:\d+[abc])+$/", $text, $out);
I'd planned to check syntax as part of the preg_match, then use the preg_match output to greedy-match the 'blocks' (or keep the delimiters if using preg_split), then if needed loop through the result 2 items at a time using for (...; i+=2) to extract value-letter in their pairs.
But I can't seem to even get that basic preg_split() or preg_match() approach to work smoothly, much less explore if there's a 'neater' or more efficient way.
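Your regex needs two capturing groups:
/([0-9]+?)([a-z])/i
This captures each run of digits in one group and the letter that follows it in another. preg_match_all then collects every such pair.
The +? is a lazy (non-greedy) quantifier, which matches as few characters as possible. Here a greedy + would give the same result, because the digits must be followed by a letter either way.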
match[0] is the whole match
match[1] is the first match group (the numbers)
match[2] is the second match group (the letter)
example below
<?php
$input = '00333b1119a555a0c';
$regex = '/([0-9]+?)([a-z])/i';
$out = [];
$parsed = [];
if (preg_match_all($regex, $input, $out)) {
foreach ($out[0] as $index => $value) {
$parsed[] = [
'num' => $out[1][$index],
'let' => $out[2][$index],
];
}
}
var_dump($parsed);
output
array(4) {
[0] =>
array(2) {
'num' =>
string(5) "00333"
'let' =>
string(1) "b"
}
[1] =>
array(2) {
'num' =>
string(4) "1119"
'let' =>
string(1) "a"
}
[2] =>
array(2) {
'num' =>
string(3) "555"
'let' =>
string(1) "a"
}
[3] =>
array(2) {
'num' =>
string(1) "0"
'let' =>
string(1) "c"
}
}
A simple solution with preg_match_all (with the PREG_SET_ORDER flag) and array_map:
$input = '00333b1119a555a0c';
preg_match_all('/([0-9]+?)([a-z]+?)/i', $input, $matches, PREG_SET_ORDER);
$result = array_map(function($v) {
return ['num' => $v[1], 'let' => $v[2]];
}, $matches);
print_r($result);
The output:
Array
(
[0] => Array
(
[num] => 00333
[let] => b
)
[1] => Array
(
[num] => 1119
[let] => a
)
[2] => Array
(
[num] => 555
[let] => a
)
[3] => Array
(
[num] => 0
[let] => c
)
)
You can use:
$str = '00333b1119a555a0c';
$arr=array();
if (preg_match_all('/(\d+)(\p{L}+)/', $str, $m)) {
array_walk( $m[1], function ($v, $k) use(&$arr, $m ) {
$arr[] = [ 'num'=>$v, 'let'=>$m[2][$k] ]; });
}
print_r($arr);
Output:
Array
(
[0] => Array
(
[num] => 00333
[let] => b
)
[1] => Array
(
[num] => 1119
[let] => a
)
[2] => Array
(
[num] => 555
[let] => a
)
[3] => Array
(
[num] => 0
[let] => c
)
)
All of the above work. But they didn't seem to have the elegance I wanted - they needed to loop, use array mapping, or (for preg_match_all()) they needed another almost identical regex as well, just to verify the string matched the regex.
I eventually found that preg_match_all() combined with named captures solved it for me. I hadn't used named captures for that purpose before and it looks powerful.
I also added an optional extra step to simplify the output if duplicate letters aren't expected (which wasn't in the question but may help someone).
$input = '00333b1119a555a0c';
preg_match_all("/(?P<num>\d+)(?P<let>[abc])/", $input, $raw_matches, PREG_SET_ORDER);
print_r($raw_matches);
// if dups not expected this is also worth doing
$matches = array_column($raw_matches, 'num', 'let');
print_r($matches);
More complete version with input+duplicate checking
$input = '00333b1119a555a0c';
if (!preg_match("/^(\d+[abc])+$/",$input)) {
// OPTIONAL: detected $input incorrectly formatted
}
preg_match_all("/(?P<num>\d+)(?P<let>[abc])/", $input, $raw_matches, PREG_SET_ORDER);
$matches = array_column($raw_matches, 'num', 'let');
if (count($matches) != count($raw_matches)) {
// OPTIONAL: detected duplicate letters in $input
}
print_r($matches);
Explanation:
This uses preg_match_all() as suggested by @RomanPerekhrest and @exussum to break out the individual groups and split the numbers and letters. I used named groups so that the resulting $raw_matches array is created with the correct names already.
If duplicates aren't expected, an extra step with array_column() extracts the data directly from the nested array of entries and creates the desired flat array, without any need for loops, mapping, walking, or assigning item by item: from
(group1 => (num1, let1), group2 => (num2, let2), ... )
to the "flat" array:
(let1 => num1, let2 => num2, ... )
If named groups feel too advanced, they can be left out: the groups are numbered anyway, so this works just as well. You then have to refer to the groups by number, which is harder to follow.
preg_match_all("/(\d+)([abc])/", $input, $raw_matches, PREG_SET_ORDER);
$matches = array_column($raw_matches, 1, 2);
If you need to check for duplicated letters (which wasn't in the question but could be useful), here's how: when array_column() is used, each letter becomes a key of the new array, and duplicate keys can't exist. So if the original matches contained more than one entry for a letter, only one entry for that letter is kept. We therefore just test whether the number of matches originally found is the same as the number of entries in the final array after array_column(). If not, there were duplicates.
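The duplicate check can be seen on the sample input, where the letter a occurs twice. This is a minimal sketch that assumes the letter class [abc] from the validation regex; $has_duplicates is an illustrative name.

```php
<?php
$input = '00333b1119a555a0c'; // the letter "a" appears twice

preg_match_all('/(?P<num>\d+)(?P<let>[abc])/', $input, $raw_matches, PREG_SET_ORDER);

// Letters become keys, so a later entry for the same letter overwrites the earlier one.
$matches = array_column($raw_matches, 'num', 'let');

// Fewer keys than raw matches means at least one letter was repeated.
$has_duplicates = count($matches) !== count($raw_matches);
```

Here the value for a ends up as 555, because the later pair overwrites 1119a.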
I'd like to parse a string like the following :
'serviceHits."test_server"."http_test.org" 31987'
into an array like :
[0] => serviceHits
[1] => test_server
[2] => http_test.org
[3] => 31987
Basically I want to split in dots and spaces, treating strings within quotes as a single value.
The format of this string is not fixed, this is just one example. It might contain different numbers of elements with quoted and numerical elements in different places.
Other strings might look like :
test.2 3 which should parse to [test|2|3]
test."342".cake.2 "cheese" which should parse to [test|342|cake|2|cheese]
test."red feet".3."green" 4 which should parse to [test|red feet|3|green|4]
And sometimes the oid string may contain a quote mark, which should be included if possible, but it's the least important part of the parser:
test."a \"b\" c" "cheese face" which should parse to [test|a "b" c|cheese face]
I'm trying to parse SNMP OID strings from agent written by people with quite varying ideas on what an OID should look like, in a generic manner.
Parsing off the oid string (the bit separated with dots) return value (the last value) into separate named arrays would be nice. Simply splitting on space before parsing the string wouldn't work, as both the OID and the value can contain spaces.
Thanks!
I agree that it can be hard to find a single regexp that resolves this issue.
Here's a complete solution:
$results = array();
$str = 'serviceHits."test_\"server"."http_test.org" 31987';
// Encode \" to something else temporary
$str_encoded_quotes = strtr($str,array('\\"'=>'####'));
// Split by strings between double-quotes
$str_arr = preg_split('/("[^"]*")/',$str_encoded_quotes,-1,PREG_SPLIT_DELIM_CAPTURE);
foreach ($str_arr as $substr) {
// If value is a dot or a space, do nothing
if (!preg_match('/^[\s\.]$/',$substr)) {
// If value is between double-quotes, it's a string
// Return as is
if (preg_match('/^"(.*)"$/',$substr)) {
$substr = preg_replace('/^"(.*)"$/','\1',$substr); // Remove double-quotes around
$results[] = strtr($substr,array('####'=>'"')); // Get escaped double-quotes back inside the string
// Else, it must be split
} else {
// Split by dot or space
$substr_arr = preg_split('/[\.\s]/',$substr,-1,PREG_SPLIT_NO_EMPTY);
foreach ($substr_arr as $subsubstr)
$results[] = strtr($subsubstr,array('####'=>'"')); // Get escaped double-quotes back inside string
}
}
// Else, it's an empty substring
}
var_dump($results);
Tested with all of your new string examples.
First attempt (OLD)
Using preg_split :
$str = 'serviceHits."test_server"."http_test.org" 31987';
// -1 : no limit
// PREG_SPLIT_NO_EMPTY : do not return empty results
preg_split('/[\.\s]?"[\.\s]?/',$str,-1,PREG_SPLIT_NO_EMPTY);
The easiest way is probably to replace dots and spaces inside strings with placeholders, split, then remove the placeholders. Something like this:
$in = 'serviceHits."test_server"."http_test.org" 31987';
$a = preg_replace_callback('!"([^"]*)"!', 'quote', $in);
$b = preg_split('![. ]!', $a);
foreach ($b as $k => $v) $b[$k] = unquote($v);
print_r($b);
# the functions that do the (un)quoting
function quote($m){
return str_replace(array('.',' '),
array('PLACEHOLDER-DOT', 'PLACEHOLDER-SPACE'), $m[1]);
}
function unquote($str){
return str_replace(array('PLACEHOLDER-DOT', 'PLACEHOLDER-SPACE'),
array('.',' '), $str);
}
Here is a solution that works with all of your test samples (plus one of my own) and allows you to escape quotes, dots, and spaces.
Due to the requirement of handling escape codes, a split is not really possible.
Although one can imagine a regex that matches the entire string with '()' to mark the separate elements, I was unable to get it working using preg_match or preg_match_all.
Instead I parsed the string incrementally, pulling off one element at a time. I then use stripslashes to unescape quotes, spaces, and dots.
<?php
$strings = array
(
'serviceHits."test_server"."http_test.org" 31987',
'test.2 3',
'test."342".cake.2 "cheese"',
'test."red feet".3."green" 4',
'test."a \\"b\\" c" "cheese face"',
'test\\.one."test\\"two".test\\ three',
);
foreach ($strings as $string)
{
print"'{$string}' => " . print_r(parse_oid($string), true) . "\n";
}
/**
* parse_oid parses an OID and returns an array of the parsed elements.
* This is an all-or-none function, and will return NULL if it cannot completely
* parse the string.
* @param string $string The OID to parse.
* @return array|NULL A list of OID elements, or null if error parsing.
*/
function parse_oid($string)
{
$result = array();
while (true)
{
$matches = array();
$match_count = preg_match('/^(?:((?:[^\\\\\\. "]|(?:\\\\.))+)|(?:"((?:[^\\\\"]|(?:\\\\.))+)"))((?:[\\. ])|$)/', $string, $matches);
if (null !== $match_count && $match_count > 0)
{
// [1] = unquoted, [2] = quoted
$value = strlen($matches[1]) > 0 ? $matches[1] : $matches[2];
$result[] = stripslashes($value);
// Are we expecting any more parts?
if (strlen($matches[3]) > 0)
{
// I do this (vs keeping track of offset) to use ^ in regex
$string = substr($string, strlen($matches[0]));
}
else
{
return $result;
}
}
else
{
// All or nothing
return null;
}
} // while
}
This generates the following output:
'serviceHits."test_server"."http_test.org" 31987' => Array
(
[0] => serviceHits
[1] => test_server
[2] => http_test.org
[3] => 31987
)
'test.2 3' => Array
(
[0] => test
[1] => 2
[2] => 3
)
'test."342".cake.2 "cheese"' => Array
(
[0] => test
[1] => 342
[2] => cake
[3] => 2
[4] => cheese
)
'test."red feet".3."green" 4' => Array
(
[0] => test
[1] => red feet
[2] => 3
[3] => green
[4] => 4
)
'test."a \"b\" c" "cheese face"' => Array
(
[0] => test
[1] => a "b" c
[2] => cheese face
)
'test\.one."test\"two".test\ three' => Array
(
[0] => test.one
[1] => test"two
[2] => test three
)
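For comparison, a shorter tokenizer built on preg_match_all is sketched below. It is a different technique from the incremental parser above: it does not validate the input, and it does not handle backslash-escaped dots or spaces outside quotes. The function name tokenize_oid is hypothetical.

```php
<?php
// Hypothetical helper: collect quoted strings and unquoted runs as tokens.
function tokenize_oid(string $string): array
{
    // Group 1: contents of a double-quoted string (backslash escapes allowed).
    // Group 2: a run of characters that are not dots, whitespace or quotes.
    $pattern = '/"((?:[^"\\\\]|\\\\.)*)"|([^.\s"]+)/';
    preg_match_all($pattern, $string, $matches, PREG_SET_ORDER);

    $tokens = [];
    foreach ($matches as $m) {
        if (($m[2] ?? '') !== '') {
            $tokens[] = $m[2];               // unquoted element, kept as is
        } else {
            $tokens[] = stripslashes($m[1]); // quoted element, unescaped
        }
    }
    return $tokens;
}

$tokens = tokenize_oid('test."red feet".3."green" 4');
```

Because separators are simply skipped, malformed input such as an unbalanced quote is not reported; the all-or-nothing parser is the safer choice when that matters.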
I'm trying to explode and separate the results in 2 different arrays
One One_x
Two Two_xx
Three Three_xxx
Four Four_xxxx
I first want to explode on the line break (\n),
then explode on the space, to end up with One, Two, Three, Four in one array
and One_x, Two_xx, Three_xxx, Four_xxxx in a different array.
I tried to explode on the line break:
$ex = explode("\n", $numbers);
then
foreach($ex as $number){
$ex = explode(" ", $number);
}
but it seems a bit confusing to me.
How can I solve this?
$array1 = array();
$array2 = array();
foreach($ex as $number){
$tmp = explode(" ", $number);
$array1[] = $tmp[0];
$array2[] = $tmp[1];
}
This is the simplest way, I think.
EDIT: be careful, your loop reassigns $ex while iterating over it. That's not a good idea; use a separate variable.
Instead of running explode once or twice, a regex can split the input in one pass and validate its structure at the same time:
preg_match_all('/
^ # line start
(\w+) # One, Two, ...
\s+ # spaces
(\w+) # One_x, Two_xx, ...
$ # line end
/smix',
$input,
$array
);
print_r($array);
Will give you:
[0] => Array
(
[0] => One One_x
[1] => Two Two_xx
[2] => Three Three_xxx
[3] => Four Four_xxxx
)
[1] => Array
(
[0] => One
[1] => Two
[2] => Three
[3] => Four
)
[2] => Array
(
[0] => One_x
[1] => Two_xx
[2] => Three_xxx
[3] => Four_xxxx
)
One could also use PREG_SET_ORDER to get one array per line, or even capture the trailing _xxxx suffixes separately.
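A minimal sketch of the PREG_SET_ORDER variant follows, with array_column used to pull the two columns apart. The sample $input is assumed to use \n line endings, and [ \t]+ is used instead of \s+ so that a match cannot run across a line break.

```php
<?php
$input = "One One_x\nTwo Two_xx\nThree Three_xxx\nFour Four_xxxx";

// With PREG_SET_ORDER each element of $rows is one line: [whole line, first word, second word].
preg_match_all('/^(\w+)[ \t]+(\w+)$/m', $input, $rows, PREG_SET_ORDER);

$first  = array_column($rows, 1); // One, Two, ...
$second = array_column($rows, 2); // One_x, Two_xx, ...
```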