I'd like to remove all parentheses from a set of strings running through a loop. The best way that I've seen this done is with the use of preg_replace(). However, I am having a hard time understanding the pattern parameter.
The following is the loop
$coords= explode (')(', $this->input->post('hide'));
foreach ($coords as $row)
{
$row = trim(preg_replace('/\*\([^)]*\)/', '', $row));
$row = explode(',',$row);
$lat = $row[0];
$lng = $row[1];
}
And this is the value of 'hide'.
(1.4956873362063747, 103.875732421875)(1.4862491569669245, 103.85856628417969)(1.4773257504016037, 103.87968063354492)
That pattern is wrong as far as I know. I got it from another thread; I tried to read about patterns but couldn't get it. I am rather short on time so I posted this here while also searching for other ways in other parts of the net. Can someone please supply me with the correct pattern for what I am trying to do? Or is there an easier way of doing this?
EDIT: Ah, just got how preg_replace() works. Apparently I misunderstood how it worked, thanks for the info.
I see you actually want to extract all the coordinates.
If so, better use preg_match_all:
$ php -r '
preg_match_all("~\(([\d\.]+), ?([\d\.]+)\)~", "(654,654)(654.321, 654.12)", $matches, PREG_SET_ORDER);
print_r($matches);
'
Array
(
[0] => Array
(
[0] => (654,654)
[1] => 654
[2] => 654
)
[1] => Array
(
[0] => (654.321, 654.12)
[1] => 654.321
[2] => 654.12
)
)
I don't entirely understand why you would need preg_replace. explode() removes the delimiters, so all you have to do is remove the opening and closing parentheses on the first and last string respectively. You can use substr() for that.
Get first and last elements of array:
$first = reset($array);
$last = end($array);
Hope that helps.
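For what it's worth, here is a minimal sketch of that idea, assuming the input always has the exact "(lat, lng)(lat, lng)" shape shown in the question. It uses trim() with a character list rather than substr() to drop the outer parentheses; $hide and $points are just illustrative names standing in for the posted value and the result.

```php
<?php
// Assumed input: the 'hide' value copied from the question.
$hide = '(1.4956873362063747, 103.875732421875)(1.4862491569669245, 103.85856628417969)(1.4773257504016037, 103.87968063354492)';

$points = array();
// trim() strips the leading "(" and trailing ")"; explode() consumes every inner ")(".
foreach (explode(')(', trim($hide, '()')) as $pair) {
    list($lat, $lng) = array_map('trim', explode(',', $pair));
    $points[] = array('lat' => (float) $lat, 'lng' => (float) $lng);
}

print_r($points); // three lat/lng pairs
```

No regular expression is involved, so there is no pattern to get wrong; the trade-off is that malformed input is not detected.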
"And this is the value of $coords."
If $coords is a string, your foreach makes no sense. If that string is your input, then:
$coords= explode (')(', $this->input->post('hide'));
This line removes the inner parentheses from your string, so your $coords array will be:
(1.4956873362063747, 103.875732421875
1.4862491569669245, 103.85856628417969
1.4773257504016037, 103.87968063354492)
The pattern parameter accepts a regular expression. The function returns a new string where all parts of the original that match the regex are replaced by the second argument, i.e. the replacement.
How about just using preg_replace on the original string?
preg_replace('#[()]#',"",$this->input->post('hide'))
To dissect your current regex, you are matching:
an asterisk character,
followed by an opening parenthesis,
followed by zero or more instances of
any character but a closing parenthesis
followed by a closing parenthesis
Of course, this will never match, since exploding the string removed the closing and opening parentheses from the chunks.
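A quick way to convince yourself is to run both patterns on one of the exploded chunks. This is only a sketch; $chunk is assumed to be the first element that explode(')(', ...) produces for the value in the question.

```php
<?php
// Assumed: first chunk after explode(')(', ...), so it has "(" but no ")".
$chunk = '(1.4956873362063747, 103.875732421875';

// The original pattern needs "*(" ... ")" and therefore leaves the chunk untouched.
$unchanged = preg_replace('/\*\([^)]*\)/', '', $chunk);

// A character class matches any single parenthesis, wherever it is.
$stripped = preg_replace('/[()]/', '', $chunk);

var_dump($unchanged === $chunk); // bool(true)
echo $stripped, "\n";            // 1.4956873362063747, 103.875732421875
```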
Related
I've looked around for solutions close to this but have not been successful in finding a solution. I'm looking to clean up some legacy code via php_codesniffer but the fixer doesn't fix comments or arrays that go past 80 cols just lets you know about them. I have a solution that works for the comments but I am getting stuck on the regex for the arrays.
A sample line I would like to fix is:
$line = "drupal_add_js(array('my_common' => array('my_code_validate' => variable_get('my_code_validate', FALSE), 'inner_index2 => 'inner_value2'), 'another_item' => 'another_value'), 'setting');";
$solution = preg_match('/array.*(\(.*?\))/', $line);
echo $solution;
I'd like
$solution = "'my_common' => array('my_code_validate' => variable_get('my_code_validate', FALSE), 'inner_index2 => 'inner_value2'), 'another_item' => 'another_value'";
but I am getting 1 instead. Notice that there is another array in there which is fairly common. I only want to capture the first array's values, and then I can split them up on separate lines from there. Ultimately I'd like to share my solutions to the php codesniffer project so bonus points for showing how to code a new fixer for squizlabs.
You may use
if (preg_match('~array(\(((?:[^()]++|(?1))*)\))~', $s, $matches)) {
echo $matches[2];
}
See this demo.
Details
array - a literal substring
(\(((?:[^()]++|(?1))*)\)) - Group 1:
\(
((?:[^()]++|(?1))*) - Group 2 (the required value):
(?:[^()]++|(?1))* - zero or more repetitions of 1+ chars other than ( and ) or the whole Group 1 pattern recursed
\) - a ) char
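To see the recursion at work, here is a small sketch that runs the same pattern on a shorter, made-up line with the same nesting shape as the sample (the $s and $inner names are only for illustration).

```php
<?php
// Simplified, invented input: an array() call containing a nested array() and a function call.
$s = "foo(array('a' => array('b' => bar('c', 1)), 'd' => 'e'), 'x');";

$inner = null;
// Group 1 is the balanced "( ... )" after "array"; group 2 is its content.
if (preg_match('~array(\(((?:[^()]++|(?1))*)\))~', $s, $matches)) {
    $inner = $matches[2];
}

echo $inner, "\n"; // 'a' => array('b' => bar('c', 1)), 'd' => 'e'
```

One caveat: the pattern counts every parenthesis, so a "(" or ")" inside a quoted string literal would throw the balancing off.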
Try this solution:
.*?array\(('.*?)\), [^\)]+'\);.*
Replace with:
$1
Demo: https://regex101.com/r/oV4nvT/4/
I want to extract the dimension from this given string.
$str = "enough for hitting practice. The dimension is 20'X10' *where";
I expect 20'X10' as the result.
I tried with the following code to get the number before and after the string 'X. But it is returning an empty array.
$regexForMinimumPattern ='/((?:\w+\W*){0,1})\'X\b((?:\W*\w+){0,1})/i';
preg_match_all ($regexForMinimumPattern, $str, $minimumPatternMatches);
print_r($minimumPatternMatches);
Can anyone please help me to fix this? Thanks in advance.
Just remove the \b from your pattern (and append a \' in the end if you want the trailing quote):
$regexForMinimumPattern ='/((?:\w+\W*){0,1})\'X((?:\W*\w+){0,1})\'/i';
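NB: \b is the metacharacter for word boundaries. Here X and the 1 that follows it are both word characters, so there is no boundary between them and the pattern can never match; you don't need it here.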
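The effect of the word boundary can be checked in isolation. The two patterns below are deliberately reduced versions written for this sketch, not the exact ones from the question.

```php
<?php
$str = "enough for hitting practice. The dimension is 20'X10' *where";

// X and the following 1 are both word characters, so \b cannot match between them.
$withBoundary = preg_match('/\'X\b\d+/', $str);

// Without \b the digits on both sides and the trailing quote are matched.
$withoutBoundary = preg_match('/\d+\'X\d+\'/', $str, $m);

var_dump($withBoundary);    // int(0)
var_dump($withoutBoundary); // int(1)
echo $m[0], "\n";           // 20'X10'
```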
Assuming that the format of the string we want is 00'X00 :
$regexForMinimumPattern ='/[0-9]{1,2}\'X[0-9]{1,2}/i';
this gives you a result like
Array ( [0] => Array ( [0] => 20'X10 ) )
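If the dimensions may have more than two digits or a decimal part, a slightly looser pattern could be used. This is a sketch under that assumption; it also keeps the trailing quote, which the question's expected result includes.

```php
<?php
$str = "enough for hitting practice. The dimension is 20'X10' *where";

// One or more digits with an optional decimal part on each side of 'X, then the closing quote.
$pattern = '/\d+(?:\.\d+)?\'X\d+(?:\.\d+)?\'/i';

$dimension = preg_match($pattern, $str, $m) ? $m[0] : null;
var_dump($dimension); // string(7) "20'X10'"
```

Using preg_match() instead of preg_match_all() returns only the first dimension found, which seems to be what is wanted here.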
So: can a simple preg_replace() do that? Perhaps...
<?php
$str = "enough for hitting practice. The dimension is 20'X10' *where";
$dim = preg_replace("#(.*?)(\d*?)(\.\d*)?(')(X)(\d*?)(\.\d*)?(')(.+)#i","$2$3$4$5$6$7", $str);
var_dump($dim); //<== YIELDS::: string '20'X10' (length=6)
You may try it out Here.
I have bunch of strings like this:
a#aax1aay222b#bbx4bby555bbz6c#mmm1d#ara1e#abc
And what I need to do is to split them up based on the hashtag position to something like this:
Array
(
[0] => A
[1] => AAX1AAY222
[2] => B
[3] => BBX4BBY555BBZ6
[4] => C
[5] => MMM1
[6] => D
[7] => ARA1
[8] => E
[9] => ABC
)
So, as you see the character right behind the hashtag is captured plus everything after the hashtag just right before the next char+hashtag.
I've the following RegEx which works fine only when I have a numeric value in the end of each part.
Here is the RegEx set up:
preg_split('/([A-Z])+#/', $text, 0, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
And it works fine with something like this:
C#mmm1D#ara1
But, if I change it to this (removing the numbers):
C#mmmD#ara
Then this is the result, which is not what I want:
Array
(
[0] => C
[1] => D
)
I've looked at this question and this one also, which are similar but none of them worked for me.
So, my question is: why does it work only if it is followed by a number, and how can I solve it?
Here you can see some of the sample strings I have:
a#123b#abcc#def456 // A:123, B:ABC, C:DEF456
a#abc1def2efg3b#abcdefc#8 // A:ABC1DEF2EFG3, B:ABCDEF, C:8
a#abcdef123b#5c#xyz789 // A:ABCDEF123, B:5, C:XYZ789
P.S. Strings are case-insensitive.
P.P.S. If you're ever wondering what the hell these strings are: they are user-submitted answers to a questionnaire, and I can't do anything to them like refactoring, as they are already stored and just need to be processed.
Why Not Using explode?
If you look at my examples you will see that I need to capture the character right before the # as well. If you think it's possible with explode() please post the output as well, thanks!
Update
Should we focus on why /([A-Z])+#/ works only if numbers included? thanks.
Instead of using preg_split(), decide what you want to match instead:
A set of "words" if followed by either <any-char># or <end-of-string>.
A character if immediately followed by #.
$str = 'a#aax1aay222b#bbx4bby555bbz6c#mmm1d#ara1e#abc';
preg_match_all('/\w+(?=.#|$)|\w(?=#)/', $str, $matches);
Demo
This expression uses two look-ahead assertions. The results are in $matches[0].
Update
Another way of looking at it would be this:
preg_match_all('/(\w)#(\w+)(?=\w#|$)/', $str, $matches);
print_r(array_combine($matches[1], $matches[2]));
Each entry starts with a single character, followed by a hash, followed by X characters until either the end of the string is encountered or the start of a next entry.
The output is this:
Array
(
[a] => aax1aay222
[b] => bbx4bby555bbz6
[c] => mmm1
[d] => ara1
[e] => abc
)
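If the flat, upper-cased list from the question is preferred over a keyed array, the same pattern can be wrapped like this. parseAnswers() is a hypothetical helper name, and upper-casing the input is an assumption based on the desired output shown in the question.

```php
<?php
// Hypothetical helper: returns [key, value, key, value, ...] in upper case.
function parseAnswers(string $text): array
{
    $flat = array();
    // Same pattern as above; PREG_SET_ORDER groups each key with its value.
    preg_match_all('/(\w)#(\w+)(?=\w#|$)/', strtoupper($text), $sets, PREG_SET_ORDER);
    foreach ($sets as $set) {
        $flat[] = $set[1]; // the single character before the hash
        $flat[] = $set[2]; // everything up to the next "<char>#" or the end of the string
    }
    return $flat;
}

print_r(parseAnswers('a#123b#abcc#def456'));
// A, 123, B, ABC, C, DEF456
```

Unlike array_combine(), this keeps repeated keys instead of letting a later one overwrite an earlier one.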
If you still want to use preg_split you can remove the + and it might work as expected:
'/([A-Z])#/i'
That way you only match the hash and ONE alpha character before it, and not all of them.
Example: http://codepad.viper-7.com/z1kFDb
Edit: Added a case-insensitive flag i in the pattern.
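A side-by-side run shows the difference, and also why digits seemed to matter: a digit interrupts the run of letters, so with "mmm1D#" the + can only reach back as far as D. The snippet below is a sketch using the case-insensitive flag on both patterns.

```php
<?php
$flags = PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE;
$text  = 'C#mmmD#ara';

// With "+", the second delimiter match swallows "mmmD#" and only the last letter (D) is captured.
$greedy = preg_split('/([A-Z])+#/i', $text, -1, $flags);

// Without "+", only the single letter right before "#" belongs to the delimiter.
$single = preg_split('/([A-Z])#/i', $text, -1, $flags);

print_r($greedy); // C, D, ara
print_r($single); // C, mmm, D, ara
```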
Use explode() rather than Regexp
$tmpArray = explode("#","a#aax1aay222b#bbx4bby555bbz6c#mmm1d#ara1e#abc");
$myArray = array();
for($i = 0; $i < count($tmpArray) - 1; $i++) {
if (substr($tmpArray[$i],0,-1)) $myArray[] = substr($tmpArray[$i],0,-1);
if (substr($tmpArray[$i],-1)) $myArray[] = substr($tmpArray[$i],-1);
}
if (count($tmpArray) && $tmpArray[count($tmpArray) - 1]) $myArray[] = $tmpArray[count($tmpArray) - 1];
edit: I updated my answer to reflect better reading the questions
You can use explode() function that will split the string except the hash signs, like stated in the answers given before.
$myArray = explode("#",$string);
For the string 'a#aax1aay222b#bbx4bby555bbz6c#mmm1d#ara1e#abc' this returns something like
$myArray = array('a', 'aax1aay222b', 'bbx4bby555bbz6c' ....);
All you need now is to take the last character of each string in array as another item.
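$copy = array();
$lastIndex = count($myArray) - 1;
foreach($myArray as $i => $item){
if ($i === $lastIndex) { $copy[] = $item; continue; } // the final chunk is a plain value with no trailing key character
$beginning = substr($item,0,strlen($item)-1); // this takes all characters except the last one
$ending = substr($item,-1); // this takes the last one
if (strlen($item) > 1) $copy[] = $beginning; // skip the empty beginning of the very first chunk
$copy[] = $ending;
} // end foreach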
This is an example, not tested.
EDIT
Instead of substr($item,0,strlen($item)-1); you might use substr($item,0,-1);.
Background
I have an array which I create by splitting a string based on every occurrence of 0d0a using preg_split('/(?<=0d0a)(?!$)/').
For example:
$string = "78781110d0a78782220d0a";
will be split into:
Array ( [0] => 78781110d0a [1] => 78782220d0a )
A valid array element has to start with 7878 and end with 0d0a.
The Problem
But sometimes, there's an additional 0d0a in the string which splits into an extra and invalid array element, i.e., that doesn't begin with 7878.
Take this string for example:
$string = "78781110d0a2220d0a78783330d0a";
This is split into:
Array ( [0] => 78781110d0a [1] => 2220d0a [2] => 78783330d0a )
But it should actually be:
Array ( [0] => 78781110d0a2220d0a [1] => 78783330d0a)
My Solution
I've written the following (messy) code to get around this:
$data = Array('78781110d0a','2220d0a','78783330d0a');
$i = 0; //count for $data array;
$j = 0; //count for $dataFixed array;
$dataFixed = $data;
foreach($data as $packet) {
if (substr($packet,0,4) != "7878") { //if packet doesn't start with 7878, do some fixing
if ($i != 0) { //its the first packet, can't help it!
$j++;
if ((substr(strtolower($packet), -4, 4) == "0d0a")) { //if the packet doesn't end with 0d0a, its 'mostly' not valid, so discard it
$dataFixed[$i-$j] = $dataFixed[$i-$j] . $packet;
}
unset($dataFixed[$i-$j+1]);
$dataFixed = array_values($dataFixed);
}
}
$i++;
}
Description
I first copy the array to another array $dataFixed. In a foreach loop of the $data array, I check whether it starts with 7878. If it doesn't, I join it with the previous array in $data. I then unset the current array in $dataFixed and reset the array elements with array_values.
But I'm not very confident about this solution.. Is there a better, more efficient way?
UPDATE
What if the input string doesn't end in 0d0a like it's supposed to? The leftover will stick to the previous array element.
For e.g.: in the string 78781110d0a2220d0a78783330d0a0000, 0000 should be separated as another array element.
Use another positive lookahead (?=7878) to form:
preg_split('/(?<=0d0a)(?=7878)/',$string)
Note: I removed (?!$) because I wasn't sure what that was for, based on your example data.
For example, this code:
$string = "78781110d0a2220d0a78783330d0a";
$array = preg_split('/(?<=0d0a)(?=7878)/',$string);
print_r($array);
Results in:
Array ( [0] => 78781110d0a2220d0a [1] => 78783330d0a )
UPDATE:
Based on your revised question of having possible random characters at the end of the input string, you can add three lines to make a complete program of:
$string = "78781110d0a2220d0a787830d0a330d0a0000";
$array = preg_split('/(?<=0d0a)(?=7878)/',$string);
$temp = preg_split('/(7878.*0d0a)/',$array[count($array)-1],-1,PREG_SPLIT_NO_EMPTY|PREG_SPLIT_DELIM_CAPTURE); // -1 = no limit (passing null is deprecated since PHP 8.1)
$array[count($array)-1] = $temp[0];
if(count($temp)>1) { $array[] = $temp[1]; }
print_r($array);
We basically do the initial splitting, then split the last element of the resulting array by the expected data format, keeping the delimiter using PREG_SPLIT_DELIM_CAPTURE. The PREG_SPLIT_NO_EMPTY ensures we won't get an empty array element if the input string doesn't end in random characters.
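The two flags are easier to see on a tiny input. This is only an illustrative sketch with a made-up string that ends exactly on the delimiter.

```php
<?php
// The capture group around the delimiter makes PREG_SPLIT_DELIM_CAPTURE return it as its own element.
$withDelim = preg_split('/(0d0a)/', '1110d0a', -1, PREG_SPLIT_DELIM_CAPTURE);

// PREG_SPLIT_NO_EMPTY additionally drops the empty piece after the final delimiter.
$noEmpty = preg_split('/(0d0a)/', '1110d0a', -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

var_dump($withDelim); // "111", "0d0a", ""
var_dump($noEmpty);   // "111", "0d0a"
```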
UPDATE 2:
Based on your comment below where it seems you're implying there might be random characters between any of the desired matches, and you want these random characters preserved, you could do this:
$string = "0078781110d0a2220d0a2220d0a0000787830d0a330d0a000078781110d0a2220d0a0000787830d0a330d0a0000";
$split1 = preg_split('/(7878.*?0d0a)/',$string,-1,PREG_SPLIT_NO_EMPTY|PREG_SPLIT_DELIM_CAPTURE);
$result = array();
foreach($split1 as $e){
$split2 = preg_split('/(.*0d0a)/',$e,-1,PREG_SPLIT_NO_EMPTY|PREG_SPLIT_DELIM_CAPTURE);
foreach($split2 as $el){
// test if $el doesn't start with 7878 and ends with 0d0a
if(strpos($el,'7878') !== 0 && substr($el,-4) == '0d0a'){
//if(preg_match('/^(?!7878).*0d0a$/',$el) === 1){
$result[ count($result)-1 ] = $result[ count($result)-1 ] . $el;
} else {
$result[] = $el;
}
}
}
print_r($result);
The strategy employed here is different than above. First we split the input string based on the delimiter that matches your desired data, using the nongreedy regex .*?. At this point we have some strings that contain the ending of a desired value and some garbage at the end, so we split again based on the last occurrence of "0d0a" with the greedy regex .*0d0a. We then append any of those resulting values that don't start with "7878" but end with "0d0a" to the previous value, as this should repair the first and second halves that got split because it contained an extra "0d0a".
I provided two methods for the innermost if statement, one using regular expressions. The regex one is marginally slower in my testing, so I've left that one commented out.
I might still not have your full requirements, so you'll have to let me know if it works and perhaps provide your full dataset.
I think you are using a delimiter, "0d0a", which also happens to occur inside the content! It's not possible to avoid junk data as long as the delimiter can also be part of the content. The delimiter must somehow be unique.
Possible solutions:
Change the delimiter to something that doesn't occur in your data (000000, #!.;).
If you know for certain what length each array item can have, use that. Judging by the examples, that's not possible.
The solutions given in the other answers only consider the sample data you have shared. If you are confident about what the string will contain, they are pretty good to use. Otherwise they give you no guarantee!
Best solution: fix the delimiter first, then use regex or explode, whichever you prefer.
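Whichever delimiter you end up with, a cheap sanity check on each element can at least flag junk after splitting. The sketch below assumes the 7878 prefix and 0d0a suffix from the question are the only validity rules; isValidPacket() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: a packet must start with 7878 and end with 0d0a.
function isValidPacket(string $packet): bool
{
    return strlen($packet) >= 8              // room for both markers without overlap
        && substr($packet, 0, 4) === '7878'
        && substr($packet, -4) === '0d0a';
}

$packets = array('78781110d0a2220d0a', '2220d0a', '78783330d0a', '0000');
$valid   = array_values(array_filter($packets, 'isValidPacket'));

print_r($valid); // only the two elements that start with 7878 and end with 0d0a
```

This does not repair anything; it only separates elements that look valid from those that don't.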
Why don't you use preg_match_all instead? Rather than relying on zero-width assertions (lookaheads and lookbehinds) to split the string without consuming the delimiters, you can just find the matches you're looking for:
Updated
<?php
$string = "00787817878110d0a22278780d0a78783330d0a00";
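// (?:(?!7878).)*$ means "no further 7878 before the end of the string"; [^(7878)] would only be a character class excluding (, 7, 8 and )
preg_match_all('/7878.*?0d0a(?=7878|(?:(?!7878).)*$)/', $string, $arr);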
print_r($arr);
?>
Gives an array $arr[0] => ( [0] => 787817878110d0a22278780d0a, [1] => 78783330d0a ). It strips leading and trailing garbage characters (whatever doesn't start with 7878 or end with 7878 or 0d0a).
So $arr[0] would be the array of values that you are looking for.
See example on ideone
Works with multiple 7878 values and multiple 0d0a values (even though that's ridiculous).
Update
If splitting is more your style, why not avoid regular expressions altogether?
<?php
$string = "787817878110d0a22278780d0a78783330d0a";
$arr = explode('0d0a7878', $string);
$string = implode('0d0a,7878', $arr);
$arr = explode(',', $string);
print_r($arr);
?>
Here we split the string by the delimiter 0d0a7878, which is what @CharlieGorichanaz's solution is doing, and props to him for the quick, accurate solution. We then add a comma, because who doesn't love comma separated values? And we explode again on the commas for an array of desired values. Performance-wise, this ought to be faster than using regular expressions. See example.
<?php
$string = "Movies and Stars I., 32. part";
$pattern = "((IX|IV|V?I{0,3}[\.]))";
if(preg_match($pattern, $string, $x) == false)
{
print "NAPAKA!";
}
else
{
print_r($x);
}
?>
And the response is:
Array ( [0] => I. [1] => I. )
I should get only 1 response... Why do I get multiple responses?
The element at index 0 is the whole matched string. The element at index 1 is the contents of the first capture group, i.e. the content inside the parentheses. In this case, they just happen to be the same. Just use $x[0] to get the value you're looking for.
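The difference is easy to reproduce. The sketch below uses ~ as the pattern delimiter (an arbitrary choice for this example) so that every pair of parentheses is clearly part of the pattern itself.

```php
<?php
$string = "Movies and Stars I., 32. part";

// One capturing group: index 0 is the whole match, index 1 is the group.
preg_match('~(IX|IV|V?I{0,3}\.)~', $string, $withGroup);

// Non-capturing group: only index 0 is filled.
preg_match('~(?:IX|IV|V?I{0,3}\.)~', $string, $noGroup);

print_r($withGroup); // [0] => I. [1] => I.
print_r($noGroup);   // [0] => I.
```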
The nested parentheses should, in this instance, be a "non-capturing" subpattern.
$pattern = "~(?:IX|IV|V?I{0,3}[\.])~";
Try that. It will tell the regex compiler not to capture the results of those parentheses into the array.
In fact, looking at your regex, you don't even need those parentheses. Make your regex this:
$pattern = "~IX|IV|V?I{0,3}[\.]~";
That should also work.
Your pattern has multiple groups in it -> the () brackets tell you what to capture in your match.
Try this:
$pattern = "(IX|IV|V?I{0,3}[\.])";
If you have a hard time identifying the wanted groups in the result you can name them as specified in the php.net documentation.
That would look something like this:
$pattern = "~(?P<groupname>IX|IV|V?I{0,3}[\.])~";
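As a rough illustration of named groups, the snippet below uses ~ as the delimiter so that the parentheses are free to form the group; the group name "numeral" is made up for this example.

```php
<?php
$string = "Movies and Stars I., 32. part";

// (?P<numeral>...) stores the group under the key 'numeral' as well as under index 1.
preg_match('~(?P<numeral>IX|IV|V?I{0,3}\.)~', $string, $x);

echo $x['numeral'], "\n"; // I.
print_r(array_keys($x));  // 0, numeral, 1
```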
Index 0 holds the whole matched string, and you get one more entry for every pair of parentheses (). That is helpful for getting groups, e.g.:
preg_match('~([0-9]+)([a-z]+)~','12abc',$x);
$x is ([0]=>12abc [1]=>12 [2]=>abc)
In your case you can simply delete one pair of () (the other pair is used as the delimiters).