Regexp PHP Split number and string - php

I would like to split a string that contains some numbers and letters, like these:
ABCd Abhe123
123ABCd Abhe
ABCd Abhe 123
123 ABCd Abhe
I tried this:
<?php preg_split('#(?<=\d)(?=[a-z])#i', "ABCd Abhe 123"); ?>
But it doesn't work: the array has only one cell, containing "ABCd Abhe 123".
For example, I would like the numbers in cell 0 and the string in cell 1:
[0] => "123",
[1] => "ABCd Abhe"
Thank you for your help! ;)

Use preg_match_all instead
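preg_match_all("/(\d+)*\s?([A-Za-z]+)*/", "ABCd Abhe 123", $match);
For every match $i (with the default flags, results are grouped by capture group):
$match[0][$i] contains the matched segment
$match[1][$i] contains numbers
$match[2][$i] contains letters
(See here for regex test)
Then put them in an array
$numbers = array();
$letters = array();
for ($i = 0; $i < count($match[0]); $i++)
{
    // skip the empty entries left by matches where a group did not take part
    if ($match[1][$i] != "")
        $numbers[] = $match[1][$i];
    if ($match[2][$i] != "")
        $letters[] = $match[2][$i];
}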
EDIT1
I've updated the regex. It now looks for either numbers or letters, with or without a whitespace.
EDIT2
The regex is correct, but the array handling wasn't. Use preg_match_all, then $match is an array containing arrays, like:
Array
(
[0] => Array
(
[0] => Abc
[1] => aaa
[2] => 25
)
[1] => Array
(
[0] =>
[1] =>
[2] => 25
)
[2] => Array
(
[0] => Abc
[1] => aaa
[2] =>
)
)
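If you would still rather use preg_split(), here is a minimal sketch. The original pattern only splits where a digit is immediately followed by a letter, so this version also covers the letter-then-digit order and optional whitespace at the boundary. The helper name splitDigitsAndText() is made up for illustration, and it assumes each input holds exactly one run of digits and one run of text:

```php
<?php
// Hypothetical helper (not from the question): returns [digits, text].
function splitDigitsAndText(string $input): array
{
    // Split where a digit meets a letter, in either order,
    // swallowing any whitespace that sits between the two.
    $parts = preg_split('/(?<=\d)\s*(?=[a-z])|(?<=[a-z])\s*(?=\d)/i', $input);

    // Put the all-digit piece first, whichever side it came from.
    if (count($parts) === 2 && !ctype_digit($parts[0])) {
        $parts = array_reverse($parts);
    }

    return $parts;
}

foreach (['ABCd Abhe123', '123ABCd Abhe', 'ABCd Abhe 123', '123 ABCd Abhe'] as $s) {
    var_export(splitDigitsAndText($s));
    echo "\n";
}
```

Each of the four sample strings should come back with the digits in cell 0 and the text in cell 1.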

Maybe something like this?
$numbers = preg_replace('/[^\d]/', '', $input);
$letters = preg_replace('/\d/', '', $input);
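Wrapped in a function, the two replacements above could look roughly like this. The function name is an assumption of mine, and the trim() is added only to drop the space left behind when the digits sit at either end of the string:

```php
<?php
// Hypothetical wrapper around the two preg_replace() calls.
function extractDigitsAndText(string $input): array
{
    // Delete everything that is not a digit.
    $numbers = preg_replace('/\D+/', '', $input);

    // Delete the digits, then trim the leftover boundary whitespace.
    $letters = trim(preg_replace('/\d+/', '', $input));

    return [$numbers, $letters];
}

var_export(extractDigitsAndText('123 ABCd Abhe'));
```

Note that this glues separate digit groups together ("12ab34" gives "1234"), so it only fits inputs with a single number.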

Related

Convert string to array at different character occurence

Consider that I have this string: 'aaaabbbaaaaaabbbb'. I want to convert it to an array so that I get the following result:
$array = [
'aaaa',
'bbb',
'aaaaaa',
'bbbb'
]
How to go about this in PHP?
PHP code demo
Regex: (.)\1{1,}
(.): Match and capture a single character.
\1: Backreference to the character captured by the first group.
\1{1,}: Match that same character one or more additional times.
<?php
ini_set("display_errors", 1);
$string="aaaabbbaaaaaabbbb";
preg_match_all('/(.)\1{1,}/', $string,$matches);
print_r($matches);
Output:
Array
(
[0] => Array
(
[0] => aaaa
[1] => bbb
[2] => aaaaaa
[3] => bbbb
)
[1] => Array
(
[0] => a
[1] => b
[2] => a
[3] => b
)
)
Or:
PHP code demo
<?php
$string="aaaabbbaaaaaabbbb";
$array=str_split($string);
$start=0;
$end= strlen($string);
$indexValue=$array[0];
$result=array();
$resultantArray=array();
while($start!=$end)
{
if($indexValue==$array[$start])
{
$result[]=$array[$start];
}
else
{
$resultantArray[]=implode("", $result);
$result=array();
$result[]=$indexValue=$array[$start];
}
$start++;
}
$resultantArray[]=implode("", $result);
print_r($resultantArray);
Output:
Array
(
[0] => aaaa
[1] => bbb
[2] => aaaaaa
[3] => bbbb
)
I have written a one-liner using only preg_split() that generates the expected result with no wasted memory (no array bloat):
Code (Demo):
$string = 'aaaabbbaaaaaabbbb';
var_export(preg_split('/(.)\1*\K/', $string, 0, PREG_SPLIT_NO_EMPTY));
Output:
array (
0 => 'aaaa',
1 => 'bbb',
2 => 'aaaaaa',
3 => 'bbbb',
)
Pattern:
(.) #match any single character
\1* #match the same character zero or more times
\K #keep what is matched so far out of the overall regex match
The real magic happens with the \K, for more reading go here.
The 0 parameter in preg_split() is the limit, and 0 means "no limit". This is the default behavior, but it needs to hold its place in the function call so that the next parameter is read as the flags argument.
The final parameter is PREG_SPLIT_NO_EMPTY, which removes any empty elements from the result.
Sahil's preg_match_all() method preg_match_all('/(.)\1{1,}/', $string,$matches); is a good attempt but it is not perfect for two reasons:
The first issue is that his use of preg_match_all() returns two subarrays which is double the necessary result.
The second issue is revealed when $string="abbbaaaaaabbbb";. His method will ignore the first lone character. Here is its output:
Array (
[0] => Array
(
[0] => bbb
[1] => aaaaaa
[2] => bbbb
)
[1] => Array
(
[0] => b
[1] => a
[2] => b
)
)
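For what it's worth, both issues can probably be worked around while staying with preg_match_all(): quantify the backreference with * instead of {1,} so a lone character still matches, and keep only the full-match subarray. This is a sketch, not Sahil's code; splitRuns() is a name I made up, and the s modifier is my addition so that . also matches newlines:

```php
<?php
// Hypothetical helper: split a string into runs of identical characters.
function splitRuns(string $string): array
{
    // (.) captures one character, \1* takes zero or more repeats of it,
    // so single characters form a run of length one.
    preg_match_all('/(.)\1*/s', $string, $matches);

    // $matches[0] holds the full matches; $matches[1] is not needed.
    return $matches[0];
}

var_export(splitRuns('abbbaaaaaabbbb'));
```

The capture-group subarray is still built internally, so this does not remove the extra memory use, it only hides it from the caller.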
Sahil's second attempt produces the correct output, but requires much more code. A more concise non-regex solution could look like this:
$array = str_split($string);
$result = array();
$last = "";
foreach ($array as $v) {
    // compare with "" explicitly: a run of "0" is falsy, so !$last would misfire
    if ($last === "" || $last[0] === $v) {
        $last .= $v;
    } else {
        $result[] = $last;
        $last = $v;
    }
}
$result[] = $last;
var_export($result);

How to get a particular string using preg_replace?

I want to get a particular value from a string in PHP. The string is:
$string = 'users://data01=[1,2]/data02=[2,3]/*';
preg_replace('/(.*)\[(.*)\](.*)\[(.*)\](.*)/', '$2', $str);
I want to get the value of data01, i.e. [1,2].
How can I achieve this using preg_replace?
preg_replace() is the wrong tool. I have used preg_match_all() in case you need the other item later, and trimmed down your regex to capture only the part of the string you are looking for.
<?php
$string = 'users://data01=[1,2]/data02=[2,3]/*';
preg_match_all('/\[([0-9,]+)\]/',$string,$match);
print_r($match);
/*
print_r($match) output:
Array
(
[0] => Array
(
[0] => [1,2]
[1] => [2,3]
)
[1] => Array
(
[0] => 1,2
[1] => 2,3
)
)
*/
echo "Your match: " . $match[1][0];
?>
This gives you both the matched pattern and the captured characters, so you can have [1,2] or just 1,2.
preg_replace is used to replace by regular expression!
I think you want to use preg_match_all() to get each data attribute from the string.
The regex you want is:
$string = 'users://data01=[1,2]/data02=[2,3]/*';
preg_match_all('#data[0-9]{2}=(\[[0-9,]+\])#',$string,$matches);
print_r($matches);
Array
(
[0] => Array
(
[0] => data01=[1,2]
[1] => data02=[2,3]
)
[1] => Array
(
[0] => [1,2]
[1] => [2,3]
)
)
I have tested this as working.
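If you only need one attribute by name, a small lookup along these lines might be enough. This is a sketch under my own assumptions: the function name getDataValue() is invented, and it assumes the value never contains a closing bracket:

```php
<?php
// Hypothetical helper: return the bracketed value for one key, or null.
function getDataValue(string $input, string $key): ?string
{
    // preg_quote() escapes the key, including the # delimiter used here.
    $pattern = '#' . preg_quote($key, '#') . '=(\[[^\]]*\])#';

    return preg_match($pattern, $input, $m) === 1 ? $m[1] : null;
}

$string = 'users://data01=[1,2]/data02=[2,3]/*';
echo getDataValue($string, 'data01');
```

preg_match() stops at the first hit, which is fine here because each key appears once in the sample string.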
preg_replace is for replacing stuff. preg_match is for extracting stuff.
So you want:
preg_match('/(.*?)\[(.*?)\](.*?)\[(.*?)\](.*)/', $str, $match);
var_dump($match);
See what you get, and work from there.

php regex split string by [%%%]

Hi I need a preg_split regex that will split a string at substrings in square brackets.
This example input:
$string = 'I have a string containing [substrings] in [brackets].';
should provide this array output:
[0]= 'I have a string containing '
[1]= '[substrings]'
[2]= ' in '
[3]= '[brackets]'
[4]= '.'
After reading your revised question:
This might be what you want:
$string = 'I have a string containing [substrings] in [brackets].';
preg_split('/(\[.*?\])/', $string, -1, PREG_SPLIT_DELIM_CAPTURE);
You should get:
Array
(
[0] => I have a string containing
[1] => [substrings]
[2] => in
[3] => [brackets]
[4] => .
)
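One thing to watch: if the string starts or ends with a bracketed part, the split leaves an empty string at that end of the array. A possible variation (my own sketch, with a made-up function name) combines the flag with PREG_SPLIT_NO_EMPTY:

```php
<?php
// Hypothetical helper: split around [bracketed] parts and keep them.
function splitOnBrackets(string $string): array
{
    return preg_split(
        '/(\[.*?\])/',
        $string,
        -1, // no limit
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );
}

var_export(splitOnBrackets('I have a string containing [substrings] in [brackets].'));
var_export(splitOnBrackets('[a] b [c]'));
```

For the second input this should return only '[a]', ' b ' and '[c]', without the empty strings at either end.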
Original answer:
preg_split('/%+/i', 'ot limited to 3 %%% so it can be %%%% or % or %%%%%, etc Tha');
You should get:
Array
(
[0] => ot limited to 3
[1] => so it can be
[2] => or
[3] => or
[4] => , etc Tha
)
Or if you want a minimum of 3 then try:
preg_split('/%%%+/i', 'Not limited to 3 %%% so it can be %%%% or % or %%%%%, etc Tha');
Have a go at http://regex.larsolavtorvik.com/
I think this is what you are looking for:
$array = preg_split('/(\[.*?\])/', $string, -1, PREG_SPLIT_DELIM_CAPTURE);

How to parse a file in PHP

I have a string with the following content
[text1] some content [text2] some
content some content [text3] some
content
The "[textn]" are finite and also have specific names. I want to get the content into an array. Any idea?
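If you don't want to use regular expressions, then strtok() would do the trick here:
$result = array();
// strtok() skips leading delimiters, so pad the text in case it starts with [
strtok(" " . $txt, "["); // search for first [
while (($id = strtok("]")) !== false) { // alternate ] and [
    $result[$id] = strtok("["); // add token
}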
PHP has functions for working with strings via regular expressions, such as preg_match and preg_match_all; look them up.
If you have a word list, you can split the string like this (obviously, one could write it much nicer):
$words = array('[text1]','[text2]','[text3]');
$str = "[text1] some content [text2] some content some content [text3] some content3";
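$del = '';          // initialised so the first pass does not read an undefined variable
$matches = array();
for ($i=0; $i<sizeof($words) ; $i++) {
    $olddel = $del;
    $del = $words[$i];
    list($match,$str) = explode($del,$str);
    if ($i-1 >= 0) { $matches[$i-1] = $olddel.' '.$match; }
}
$matches[] =$del." ".$str;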
print_r($matches);
This will output: Array ( [0] => [text1] some content [1] => [text2] some content some content [2] => [text3] some content3 )
preg_match or preg_match_all, you need to give us an example if you want regex.
$string = "[text1] some content [text2] some content some content [text3] some content";
preg_match_all("#\[([^\[\]]+)\]#is", $string, $matches);
print_r($matches); //Array ( [0] => Array ( [0] => [text1] [1] => [text2] [2] => [text3] ) [1] => Array ( [0] => text1 [1] => text2 [2] => text3 ) )
Non-recursive.
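If the goal is an array keyed by the names in brackets, the same idea can be extended to capture the text that follows each tag. A rough sketch, assuming the content itself never contains a [ character (parseSections() is a name I picked, not something from the question):

```php
<?php
// Hypothetical helper: map each [name] to the text that follows it.
function parseSections(string $text): array
{
    $sections = [];

    // Group 1: the name inside the brackets.
    // Group 2: everything up to the next [ or the end of the string.
    preg_match_all('/\[([^\[\]]+)\]([^\[]*)/', $text, $matches, PREG_SET_ORDER);

    foreach ($matches as $m) {
        // Collapse line breaks and repeated spaces, then trim the ends.
        $sections[$m[1]] = trim(preg_replace('/\s+/', ' ', $m[2]));
    }

    return $sections;
}

$string = "[text1] some content [text2] some\ncontent some content [text3] some\ncontent";
print_r(parseSections($string));
```

PREG_SET_ORDER is used so that each $m holds one tag together with its content, which makes the loop simpler.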
Are [ and ] part of the string, or did you just use them to highlight the part that you want to extract? If they are not, then you can use
if (preg_match_all("/\b(text1|text2|text3|foo|bar)\b/i", $string, $matches)) {
print_r($matches);
}

Regex for splitting on all unescaped semi-colons

I'm using php's preg_split to split up a string based on semi-colons, but I need it to only split on non-escaped semi-colons.
<?
$str = "abc;def\\;abc;def";
$arr = preg_split("/;/", $str);
print_r($arr);
?>
Produces:
Array
(
[0] => abc
[1] => def\
[2] => abc
[3] => def
)
When I want it to produce:
Array
(
[0] => abc
[1] => def\;abc
[2] => def
)
I've tried "/(^\\)?;/" or "/[^\\]?;/" but they both produce errors. Any ideas?
This works.
<?
$str = "abc;def\;abc;def";
$arr = preg_split('/(?<!\\\);/', $str);
print_r($arr);
?>
It outputs:
Array
(
[0] => abc
[1] => def\;abc
[2] => def
)
You need to make use of a negative lookbehind (read about lookarounds). Think of it as "match every ';' unless it is preceded by a '\'".
I am not really proficient with PHP regexes, but try this one:
/(?<!\\);/
Since Bart asks: Of course you can also use regex to split on unescaped ; and take escaped escape characters into account. It just gets a bit messy:
<?
$str = "abc;def\;abc\\\\;def";
preg_match_all('/((?:[^\\\\;]|\\\.)*)(?:;|$)/', $str, $arr);
print_r($arr);
?>
Array
(
[0] => Array
(
[0] => abc;
[1] => def\;abc\\;
[2] => def
)
[1] => Array
(
[0] => abc
[1] => def\;abc\\
[2] => def
)
)
What this does is to take a regular expression for “(any character except \ and ;) or (\ followed by any character)” and allow any number of those, followed by a ; or the end of the string.
I'm not sure how php handles $ and end-of-line characters within a string, you may need to set some regex options to get exactly what you want for those.
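If you want to stay with preg_split() and still treat an escaped backslash correctly, one option is to use PCRE's backtracking verbs. This is a different technique from the answers above, offered as a sketch: the pattern first consumes any backslash together with the character after it and discards that pair with (*SKIP)(*FAIL), so only a ; that is not part of an escape pair is left as a delimiter. The function name splitUnescaped() is made up:

```php
<?php
// Hypothetical helper: split on ; unless it is escaped with a backslash.
function splitUnescaped(string $input): array
{
    // \\\\. in PHP source is the regex \\. (a backslash plus any character).
    // (*SKIP)(*FAIL) rejects that pair and resumes scanning after it.
    return preg_split('/\\\\.(*SKIP)(*FAIL)|;/s', $input);
}

// Single-quoted so the backslashes are easy to count.
print_r(splitUnescaped('abc;def\;abc;def'));

// An escaped backslash followed by a real separator:
print_r(splitUnescaped('abc\\\\;def'));

// The plain lookbehind does not split that second string.
print_r(preg_split('/(?<!\\\\);/', 'abc\\\\;def'));
```

In the second call the string is abc\\;def, and the split should happen after the two backslashes, whereas the lookbehind version leaves the string in one piece.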
