I've got a group of strings which I need to chunk into an array.
The string needs to be split on either /, ,, with, or &.
Unfortunately, a single string can contain two or more of these delimiters, so I can't use split() or explode() with one delimiter.
For example, a string could say first past/ going beyond & then turn, so I am trying to get an array that would return:
array('first past', 'going beyond', 'then turn')
The code I am currently using is
$splittersArray=array('/', ',', ' with ','&');
foreach($splittersArray as $splitter){
if(strpos($string, $splitter)){
$splitString = split($splitter, $string);
foreach($splitString as $split){
I can't seem to find a function in PHP that allows me to do this.
Do I need to be passing the string back into the top of the funnel, and continue to go through the foreach() after the string has been split again and again?
This doesn't seem very efficient.
Use a regular expression and preg_split.
In the case you mention, you would get the split array with:
$splitString = preg_split('/(\/|,| with |&)/', $string);
To write the pattern concisely, use a character class for the single-character delimiters and add the "with" delimiter as an alternative after the pipe (the "or" character in regex). Allow an optional space on either side of the group of delimiters so that the values in the output don't need to be trimmed.
I am using the PREG_SPLIT_NO_EMPTY function flag in case a delimiter occurs at the start or end of the string and you don't want to have any empty elements generated.
Code: (Demo)
$string = 'first past/ going beyond & then turn with everyone';
var_export(
preg_split('~ ?([/,&]|with) ?~', $string, 0, PREG_SPLIT_NO_EMPTY)
);
Output:
array (
0 => 'first past',
1 => 'going beyond',
2 => 'then turn',
3 => 'everyone',
)
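One caveat with the pattern above: "with" also matches inside a longer word such as "without". A minimal sketch of one way around that, assuming word boundaries (\b) are acceptable for your data. The input string here is made up for illustration, and " *" is used instead of " ?" so that runs of spaces are absorbed too:

```php
<?php
// Hypothetical input: a delimiter at the very start, plus a word that merely contains "with".
$string = '/first past, going without & then turn with everyone';

// \bwith\b only matches "with" as a whole word; " *" swallows any surrounding spaces.
// PREG_SPLIT_NO_EMPTY drops the empty piece produced by the leading "/".
$parts = preg_split('~ *([/,&]|\bwith\b) *~', $string, -1, PREG_SPLIT_NO_EMPTY);

var_export($parts);
```

The leading "/" would normally produce an empty first element; the flag is what removes it.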
What is the best way to force a string into a specific format, like in Excel (##,# etc.)?
I want to set the format to 00:00.00; if someone enters 1:10.10, I want to add a zero in front.
If the input looks like 1.5.07 it must convert to 01:50.07. A harder one: 0.3 must convert to 00:00.30.
I think this is possible with some regex. I can write the expected format, but how can I reshape the input to match it?
$input = "00:13,40";
echo preg_replace("/(d){2}:(d){2}\.(d){2}/","???",$input);
The , in the input must be converted to a dot. The input can be "anything", like 13.40, which must be converted to 00:13.40. It must replace a wrong separator and add a missing 0 at the front or end (0:13.4).
I don't know if I have enough of a test battery from your question details, but the following technique will correctly zero-pad the strings that you have provided.
The pattern effectively parses a string with 2 optional sets of numbers, then a required number before the end of the string.
The pattern allows for any combination of colon, comma, or dots as delimiting characters. If you need more delimiting characters, add them to the character classes.
I added a negative lookahead ((?!\d+$)) to ensure that a number with only two sets of numbers does not use the first set as the hours (first number in the result).
sprintf() is an elegant way of enforcing the zero-padded format in each group of the replacement string.
Code: (Demo)
$tests = [
'1:10.10',
'1.5.07',
'0.3',
'00:13,40',
'15,17',
];
var_export(
preg_replace_callback(
'~(?:(\d{1,2})[:.,](?!\d+$))?(?:(\d{1,2})[:.,])?(\d{1,2})$~',
function($m) {
return sprintf('%02d:%02d.%02d', $m[1], $m[2], $m[3]);
},
$tests
)
);
Output:
array (
0 => '01:10.10',
1 => '01:05.07',
2 => '00:00.03',
3 => '00:13.40',
4 => '00:15.17',
)
Alternatively, you can achieve the same result by loosening the pattern to match digits and non-digits (in an alternating order) and give priority to values that are farther to the right side of the string.
Code: (Demo)
preg_replace_callback(
'~(?:(\d+)\D+)??(?:(\d+)\D+)?(\d+)$~',
function($m) {
unset($m[0]);
return vsprintf('%02d:%02d.%02d', $m);
},
$tests
)
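If you need this for single values rather than an array, the looser pattern can be wrapped in a small helper. This is only a sketch: normalizeTime() is a name I made up, and the explicit (int) casts are my addition so that unmatched leading groups (which arrive as empty strings) are padded as 0 without relying on implicit conversion.

```php
<?php
// Hypothetical helper, not part of the answer above: pads one time-like string to mm:ss.hh.
function normalizeTime(string $input): string
{
    return preg_replace_callback(
        '~(?:(\d+)\D+)??(?:(\d+)\D+)?(\d+)$~',
        function (array $m): string {
            // Groups that did not take part in the match are '' here, so cast before padding.
            return sprintf('%02d:%02d.%02d', (int) $m[1], (int) $m[2], (int) $m[3]);
        },
        $input
    );
}

echo normalizeTime('1:10.10'), "\n";
echo normalizeTime('15,17'), "\n";
```

The lazy ?? on the first group is what gives priority to the numbers on the right: with only two numbers present, the engine fills the second and third slots first.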
I have the word AK747 and I use a regex to detect whether a string (at least 2 chars, e.g. AK) is followed by a number (at least two digits, e.g. 747).
EDIT : (sorry that I wasn't clear on this guys)
I need to do this above because :
In some cases I need to split the keyword so it can match AK-747. When I search for the string 'AK-747' with the keyword 'AK747', it won't find a match unless I use levenshtein in the database, so I prefer splitting AK747 into AK and 747.
My code:
$strNumMatch = preg_match('/^[a-zA-Z]{2,}[0-9]{2,}$/',
$value, $match);
if(isset($match[0]))
echo $match[0];
How do I split to array ['AK', '747'] for example with preg_split() or any other way?
$input = 'AK-747';
if (preg_match('/^([a-z]{2,})-?([0-9]{2,})$/i', $input, $result)) {
unset($result[0]);
}
print_r($result);
The output:
Array
(
[1] => AK
[2] => 747
)
You may try this:
preg_match('/[0-9]{2,}/', $value, $matches, PREG_OFFSET_CAPTURE);
$position = $matches[0][1];
$letters = substr($value, 0, $position);
$numbers = substr($value, $position);
This way you get the position of the first number and split there.
EDIT:
Starting from your original approach this could look somewhat like this:
$strNumMatch = preg_match('/^([a-zA-Z]{2,})([0-9]{2,})$/', $value, $match, PREG_OFFSET_CAPTURE);
if($strNumMatch){
$position = $match[2][1];
$letters = substr($value, 0, $position);
$numbers = substr($value, $position);
$alternative = $letters.'-'.$numbers;
}
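For reference, here is a self-contained sketch of the offset-based approach, assuming $value is 'AK747' (my sample value). With PREG_OFFSET_CAPTURE every entry in the match array is a pair of matched text and byte offset, which is why the position is read from index [1].

```php
<?php
$value = 'AK747'; // sample input, assumed for illustration
$letters = $numbers = null;

if (preg_match('/[0-9]{2,}/', $value, $matches, PREG_OFFSET_CAPTURE)) {
    // $matches[0] is ['747', 2]: the matched digits and where they start.
    $position = $matches[0][1];
    $letters = substr($value, 0, $position);
    $numbers = substr($value, $position);
}

var_export([$letters, $numbers]);
```

Checking the return value of preg_match() first avoids reading $matches[0] when no run of digits is found.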
preg_split() is a very sensible and direct call since you desire an indexed array containing the two substrings.
Code: (Demo)
$input = 'AK-747';
var_export(preg_split('/[a-z]{2,}\K-?/i',$input));
Output:
array (
0 => 'AK',
1 => '747',
)
The \K means "restart the fullstring match". Effectively, everything to the left of \K is retained as the first element in the result array and everything to the right (the optional hyphen) is omitted because it is considered the delimiter. Pattern Demo
Code: (Demo)
I process a small battery of inputs to show what can be done and explain after the snippet.
$inputs=['AK747','AK-747','AK-','AK']; // variations as I understand them
foreach($inputs as $input){
echo "$input returns: ";
var_export(preg_split('/[a-z]{2,}\K-?/i',$input,2,PREG_SPLIT_NO_EMPTY));
echo "\n";
}
Output:
AK747 returns: array (
0 => 'AK',
1 => '747',
)
AK-747 returns: array (
0 => 'AK',
1 => '747',
)
AK- returns: array (
0 => 'AK',
)
AK returns: array (
0 => 'AK',
)
preg_split() takes a pattern that will match a variable substring and uses it as a delimiter. If - were present in every input string then explode('-',$input) would be most appropriate. However, - is optional in this task, so the pattern must allow - to be optional (this is what the ? quantifier does in all of the patterns on this page).
Now, you couldn't just use a pattern like /-?/, that would split the string on every character. To overcome this, you need to tell the regex engine the exact expected location for the optional -. You do this by referencing [a-z]{2,} before the -? (single intended delimiter).
The pattern /[a-z]{2,}-?/i does a fair job of finding the correct location for the optional hyphen, but now the trouble is, the leading letters in the string are included as part of the delimiting substring.
Sometimes, "lookarounds" can be used in regex patterns to match but not consume substrings. A "positive lookbehind" is used to match a preceding substring, however "variable length lookbehinds" are not permitted in php (and most other regex flavors). This is what the invalid pattern would look like: /(?<=[a-z]{2,})-?/i.
The way around this technicality is to "restart the fullstring match" using the \K token (aka a lookbehind alternative) just before the optional hyphen. To correctly target only the intended delimiter, the leading letters must be "matched/consumed" then "discarded" -- that's what \K does.
As for the inclusion of the 3rd and 4th parameter of preg_split()...
I've set the 3rd parameter to 2. This is just like the limit parameter that explode() has. It instructs the function to not make more than 2 output elements. For this case, I could have used NULL or -1 to mean "unlimited", but I could NOT leave the parameter empty -- it must be assigned to allow for the declaration of the 4th parameter.
I've set the 4th parameter to PREG_SPLIT_NO_EMPTY which instructs the function to not generate empty output elements.
Ta-Da!
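To make the effect of \K concrete, here is a small side-by-side sketch (the variable names are mine). The only difference between the two calls is the \K token:

```php
<?php
$input = 'AK-747';

// Without \K the letters are consumed as part of the delimiter and are lost.
$withoutK = preg_split('/[a-z]{2,}-?/i', $input);

// With \K the match restarts after the letters, so only the optional hyphen is the delimiter.
$withK = preg_split('/[a-z]{2,}\K-?/i', $input);

var_export($withoutK);
echo "\n";
var_export($withK);
```

In the first call the whole of "AK-" is treated as the delimiter, so the first element comes back empty and the letters are gone.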
p.s. a preg_match_all() solution is as easy as using a pipe and two anchors:
$inputs=['AK747','AK-747','AK-','AK']; // variations as I understand them
foreach($inputs as $input){
echo "$input returns: ";
var_export(preg_match_all('/^[a-z]{2,}|\d{2,}$/i',$input,$out)?$out[0]:[]);
echo "\n";
}
// same outputs as above
You can make the - optional with ?.
/([A-Za-z]{2,}-?[0-9]{2,})/
https://regex101.com/r/tIgM4F/1
I have a string with some numbers and text and I'm trying to split the string at the first non-numeric character.
For Example, I have a few strings like
$value = '150px';
$value = '50em';
$value = '25%';
I've been trying to split the string using preg_split and a little regex.
$value_split = preg_split( '/[a-zA-Z]/' , $fd['yks-mc-form-padding'] );
I'm able to get the first part of the string using $value_split[0], for example I can store 150, or 50 or 25. I need to return the second part of the string as well (px, em or %).
How can I split the string using preg_split or something similar to return both parts of the array??
Thanks!
If you want to use regex and you haven't already, you should play with RegExr.
To do what you're wanting with regex, assuming all the strings will be all numeric together, followed by all non-numeric, you could do:
$matches = array();
preg_match('/([0-9]+)([^0-9]+)/',$value,$matches);
Then $matches[1] will be the numeric part and $matches[2] will be the rest
To break it down,
[0-9] matches any numeric character, so [0-9]+ matches 1 or more numeric characters in a row, so per the docs $matches[1] will have the (numeric) text matched in by the first set of parentheses
and [^0-9] matches any non-numeric character, so [^0-9]+ matches 1 or more non-numeric characters in a row and fills $matches[2] because it's in the 2nd set of parentheses
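Applied to the three sample values from the question, that looks roughly like the sketch below. The $result array and the loop are my own scaffolding, not part of the answer; the pattern is unchanged.

```php
<?php
$values = ['150px', '50em', '25%'];
$result = [];

foreach ($values as $value) {
    // Group 1 collects the leading digits, group 2 everything non-numeric after them.
    if (preg_match('/([0-9]+)([^0-9]+)/', $value, $matches)) {
        $result[$value] = [$matches[1], $matches[2]];
    }
}

var_export($result);
```

Because [^0-9] is a negated class, it picks up % as well as letters, so no special case is needed for percentages.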
With preg_split() and this pattern you cannot achieve what you are trying to do. It discards the part of the string that acts as the separator (in this case any character in [a-zA-Z]). Use the preg_match() (or preg_match_all()) function instead.
You can use this pattern:
/([0-9]+)([a-zA-Z%]+)/
See demo.
Use the PREG_SPLIT_OFFSET_CAPTURE flag - it causes each returned piece to be an array, with item [0] being the piece itself and item [1] its starting position in the original string.
You can then use that info to extract the rest of the string by using ordinary sub-string functionality.
Something along the lines of:
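$input = $fd['yks-mc-form-padding'];
// "%" is added to the class so values like 25% are split too
$values_split = preg_split('/[a-zA-Z%]/', $input, -1, PREG_SPLIT_OFFSET_CAPTURE);
$number = $values_split[0][0];                     // first piece, e.g. '150'
$startPos = $values_split[0][1] + strlen($number); // offset just past the number
$remainder = substr($input, $startPos);            // the unit, e.g. 'px'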
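Since the question asked for preg_split() specifically, another option is to split on a zero-width position instead of on a character, so nothing is removed. This is a sketch of that idea, not something taken from the answers above: the lookbehind requires a digit to the left, the lookahead a non-digit to the right, and the limit of 2 stops after the first such boundary.

```php
<?php
$values = ['150px', '50em', '25%'];
$split = [];

foreach ($values as $value) {
    // (?<=\d)(?=\D) matches the empty position between the last digit and the first non-digit.
    $split[] = preg_split('/(?<=\d)(?=\D)/', $value, 2);
}

var_export($split);
```

If a value has no unit at all, the pattern never matches and preg_split() returns the whole value as a single element.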
I need help with a regular expression in PHP.
I have one string containing a lot of data and the format could be like this.
key=value,e4354ahj\,=awet3,asdfa\=asdfa=23f23
So I have 2 delimiters, , and =, where , separates the key/value pairs. The thing is that the key and value can contain the same symbols , and =, but they will always be escaped. So I can't use explode. I need to use preg_split but I am no good at regular expressions.
Could someone give me a hand with this one?
You need to use negative lookbehind:
// 4 backslashes because they are in a PHP string, so PHP translates them to \\
// and then the regex engine translates the \\ to a literal \
$keyValuePairs = preg_split('/(?<!\\\\),/', $input);
This will split on every , that is not escaped, so you get key-value pairs. You can do the same for each pair to separate the key and value:
list($key, $value) = preg_split('/(?<!\\\\)=/', $pair);
See it in action.
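Putting the two splits together on the sample string from the question might look like the sketch below. The $result array and the limit of 2 on the inner split are my additions; the limit is there so that only the first unescaped = separates key from value.

```php
<?php
$input = 'key=value,e4354ahj\,=awet3,asdfa\=asdfa=23f23';
$result = [];

// Outer split: every comma that is not preceded by a backslash.
foreach (preg_split('/(?<!\\\\),/', $input) as $pair) {
    // Inner split: the first unescaped "=" (assumes every pair contains one).
    [$key, $value] = preg_split('/(?<!\\\\)=/', $pair, 2);
    $result[$key] = $value;
}

var_export($result);
```

Note that the escaping backslashes stay in the keys and values; stripping them would be a separate step.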
@Jon's answer is awesome. I thought of providing a solution by matching the string:
preg_match_all('#(.*?)(?<!\\\\)=(.*?)(?:(?<!\\\\),|$)#', $string, $m);
// You'll find the keys in $m[1] and the values in $m[2]
$array = array_combine($m[1], $m[2]);
print_r($array);
Output:
Array
(
[key] => value
[e4354ahj\,] => awet3
[asdfa\=asdfa] => 23f23
)
Explanation:
(.*?)(?<!\\\\)= : match anything and group it until = not preceded by \
(.*?)(?:(?<!\\\\),|$) : match anything and group it until , not preceded by \ or end of line.
I want to split a string on several chars (being +, ~, > and #), but I want those chars to be part of the returned parts.
I tried:
$parts = preg_split('/\+|>|~|#/', $input, PREG_SPLIT_DELIM_CAPTURE);
The result is only 2 parts where there should be 5 and the split-char isn't part of part [1].
I also tried:
$parts = preg_split('/\+|>|~|#/', $input, PREG_SPLIT_OFFSET_CAPTURE);
The result is then 1 part too few (4 instead of 5) and the last part contains a split-char.
Without flags in preg_split, the result is almost perfect (as many parts as there should be) but all the split-chars are gone.
Example:
$input = 'oele>boele#4 + key:type:id + *~the end'; // spaces should be ignored
$output /* should be: */
array( 'oele', '>boele', ' #4 ', '+ key:type:id ', '+ *', '~the end' );
Is there a spl function or flag to do this or do I have to make one myself =(
$parts = preg_split('/(?=[+>~#])/', $input);
See it
Since you want to have the delimiters to be part of the next split piece, your split point is right before the delimiter and this can be easily done using positive look ahead.
(?= : Start of positive lookahead
[+>~#] : character class to match any of your delimiters.
) : End of look ahead assertion.
Effectively you are asking preg_split to split the input string at points just before delimiters.
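Run against the example input from the question, the lookahead split can be checked with a short sketch like this (only the var_export() call is added):

```php
<?php
$input = 'oele>boele#4 + key:type:id + *~the end';

// Split at every position that is immediately followed by one of the delimiters.
$parts = preg_split('/(?=[+>~#])/', $input);

var_export($parts);
```

Each delimiter stays at the front of the piece it introduces, and the surrounding spaces are kept as they are in the input.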
You're missing a value for the limit parameter, which is why it returns fewer parts than you expected. Try:
$parts = preg_split('/\+|>|~|#/', $input, -1, PREG_SPLIT_OFFSET_CAPTURE);
I had the same problem in the past. You have to wrap your regexp in parentheses and pass the flag as the fourth argument, after the limit, and then it should work:
$parts = preg_split('/(\+|>|~|#)/', $input, -1, PREG_SPLIT_DELIM_CAPTURE);
and here is it explained: http://www.php.net/manual/en/function.preg-split.php#94238
Ben is correct.
Just to add to his answer, PREG_SPLIT_DELIM_CAPTURE is a constant with value of 2 so you get 2 splits, similarly PREG_SPLIT_OFFSET_CAPTURE has a value of 4.
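A quick sketch of that mix-up, using a short made-up input. In the first call the constant lands in the limit slot and is read as the number 2; in the second it is passed as the flag, with -1 meaning "no limit":

```php
<?php
$input = 'a+b>c~d'; // made-up input for illustration

// Flag in the limit slot: PREG_SPLIT_DELIM_CAPTURE (2) is read as "at most 2 pieces".
$wrong = preg_split('/(\+|>|~|#)/', $input, PREG_SPLIT_DELIM_CAPTURE);

// Limit of -1 first, then the flag: the captured delimiters are returned as their own elements.
$right = preg_split('/(\+|>|~|#)/', $input, -1, PREG_SPLIT_DELIM_CAPTURE);

var_export($wrong);
echo "\n";
var_export($right);
```

PHP does not complain about the first call because an integer is a valid limit, which is what makes the mistake easy to miss.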