use preg_split to split chords and words - php

I'm working on a little piece of code that handles song tabs, but I'm stuck on a problem.
I need to parse each song tab line and split it into chunks, with chords on one side and words on the other.
Each chunk would look like this:
$line_chunk = array(
0 => //part of line containing one or several chords
1 => //part of line containing words
);
They should stay "grouped": the line should be split only where the function reaches the boundary between chords and words.
I guess I should use preg_split to achieve this. I ran some tests, but I have only been able to split on individual chords, not on "groups" of chords:
$line_chunks = preg_split('/(\[[^]]*\])/', $line, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
These examples show what I would like to get:
on a line containing no chords :
$input = '{intro}';
$results = array(
array(
0 => null,
1 => '{intro}'
)
);
on a line containing only chords :
$input = '[C#] [Fm] [C#] [Fm] [C#] [Fm]';
$results = array(
array(
0 => '[C#] [Fm] [C#] [Fm] [C#] [Fm]',
1 => null
)
);
on a line containing both :
$input = '[C#]I’m looking for [Fm]you [G#]';
$results = array(
array(
0 => '[C#]',
1 => 'I’m looking for'
),
array(
0 => '[Fm]',
1 => 'you '
),
array(
0 => '[G#]',
1 => null
),
);
Any ideas of how to do this ?
Thanks !

preg_split isn't the way to go. Most of the time, when you have a complicated split task, it is easier to match what you are interested in than to split on a separator that is hard to define.
A preg_match_all approach:
$pattern = '~ \h*
(?| # open a "branch reset group"
( \[ [^]]+ ] (?: \h* \[ [^]]+ ] )*+ ) # one or more chords in capture group 1
\h*
( [^[\n]* (?<=\S) ) # optional lyrics (group 2)
| # OR
() # no chords (group 1)
( [^[\n]* [^\s[] ) # lyrics (group 2)
) # close the "branch reset group"
~x';
if (preg_match_all($pattern, $input, $matches, PREG_SET_ORDER)) {
$result = array_map(function($i) { return [$i[1], $i[2]]; }, $matches);
print_r($result);
}
A branch reset group preserves the same group numbering for each branch.
Note: feel free to add:
if (empty($i[1])) $i[1] = null;
if (empty($i[2])) $i[2] = null;
in the map function if you want to obtain null items instead of empty items.
Note2: if you work line by line, you can remove the \n from the pattern.
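The pattern and the null handling from the notes can be combined into one helper. This is only a sketch: the function name parseTabLine is made up for illustration, and it compares against '' instead of calling empty(), so that a lyric consisting of "0" is not turned into null.

```php
<?php
// Hypothetical helper (not part of the answer above): returns a list of
// [chords, lyrics] pairs for one tab line, with null for a missing part.
function parseTabLine(string $line): array
{
    $pattern = '~ \h*
    (?|                                        # branch reset group
        ( \[ [^]]+ ] (?: \h* \[ [^]]+ ] )*+ )  # one or more chords (group 1)
        \h*
        ( [^[\n]* (?<=\S) )                    # optional lyrics (group 2)
      |
        ()                                     # no chords (group 1)
        ( [^[\n]* [^\s[] )                     # lyrics (group 2)
    )
    ~x';

    if (!preg_match_all($pattern, $line, $matches, PREG_SET_ORDER)) {
        return [];
    }

    return array_map(function ($m) {
        // Turn empty captures into null, as requested in the question
        return [
            $m[1] === '' ? null : $m[1],
            $m[2] === '' ? null : $m[2],
        ];
    }, $matches);
}

print_r(parseTabLine('[C#]I’m looking for [Fm]you [G#]'));
```

Note that the lyrics come back without trailing whitespace, so the second chunk holds 'you' rather than 'you '.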

I would go with PHP explode:
/*
 * Process data
 */
$input = '[C#]I’m looking for [Fm]you [G#]';
$parts = explode("[", $input);
$results = array();
foreach ($parts as $item)
{
    // A line that starts with "[" yields an empty first piece; skip it
    if ($item === "")
    {
        continue;
    }
    // Split at the first "]" only, so a "]" inside the lyrics is kept
    $pieces = explode("]", $item, 2);
    if (count($pieces) < 2)
    {
        // No "]" means this is text before the first chord
        $arrayitem = array( "Chord" => "",
                            "Lyric" => $pieces[0]);
    }
    else
    {
        $arrayitem = array( "Chord" => $pieces[0],
                            "Lyric" => $pieces[1]);
    }
    $results[] = $arrayitem;
}
/*
 * Echo results
 */
foreach ($results as $str)
{
    echo "Chord: " . $str["Chord"] . "\n";
    echo "Lyric: " . $str["Lyric"] . "\n";
}
Boundaries are not tested in the code, and leftover whitespace is not trimmed, but it is a base to work on.

Related

REGEX for first two characters from a set of values?

I have a set of two-letter prefixes:
$arr = ['AB', 'DC', 'LF'];
Problem: I need a regex (in PHP and TypeScript) that accepts only strings that start with one of the above values.
Example:
Valid:
ABwerty45^&*jk
ABwerrtty
LF%$^erftgt5234
Invalid:
TABYR56H
ab7877
Abtyu7
Any help is appreciated.
You could join() the array and compose a regex with an alternation like this:
<?php
$strings = <<<DATA
ABwerty45^&*jk
ABwerrtty
LF%$^erftgt5234
TABYR56H
ab7877
Abtyu7
DATA;
$arr = ['AB', 'DC', 'LF'];
$regex = '~^(?:' . join('|', $arr) . ').*~m';
if (preg_match_all($regex, $strings, $matches)) {
print_r($matches);
}
?>
Which yields
Array
(
[0] => Array
(
[0] => ABwerty45^&*jk
[1] => ABwerrtty
[2] => LF%$^erftgt5234
)
)
Basically, this says:
^ # match the start of a line (the m flag makes ^ match after each newline too)
(?:AB|DC|LF) # AB or DC or LF
.* # 0+ characters in that line
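If the prefixes may ever contain regex metacharacters, it is safer to escape them before joining. A minimal sketch, assuming a made-up helper name startsWithAny and testing one string at a time rather than a multi-line block:

```php
<?php
// Hypothetical helper: true when $s starts with one of $prefixes (case-sensitive).
function startsWithAny(string $s, array $prefixes): bool
{
    // Escape each prefix, including the "~" delimiter used below
    $escaped = array_map(function ($p) {
        return preg_quote($p, '~');
    }, $prefixes);

    $regex = '~^(?:' . implode('|', $escaped) . ')~';

    return preg_match($regex, $s) === 1;
}

$arr = ['AB', 'DC', 'LF'];
foreach (['ABwerty45^&*jk', 'LF%$^erftgt5234', 'TABYR56H', 'ab7877'] as $s) {
    echo $s, ': ', startsWithAny($s, $arr) ? 'valid' : 'invalid', "\n";
}
```

The sketch assumes a non-empty prefix list; with an empty list the alternation is empty and every string would be accepted.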
Instead of a regex you could check if the first 2 characters are present in the array:
$arr = ['AB', 'DC', 'LF'];
if (in_array(substr("ABwerty45^&*jk",0, 2), $arr)) {
// ...
}
The same check in JavaScript:
const strings = [
"ABwerty45^&*jk",
"ABwerrtty",
"LF%$^erftgt5234",
"TABYR56H",
"ab7877",
"Abtyu7"
];
const arr = ['AB', 'DC', 'LF'];
strings.forEach((s) => {
let match = arr.includes(s.substring(0, 2));
match ? console.log("Match : ", s) : console.log("No match: ", s);
});

How do I apply a replace to each array element in PHP?

I have an array with a list of all controllers in my application:
$controllerlist = glob("../controllers/*_controller.php");
How do I strip ../controllers/ at the start and _controller.php at the end of each array element with one PHP command?
As preg_replace can act on an array, you could do:
$array = array(
"../controllers/test_controller.php",
"../controllers/hello_controller.php",
"../controllers/user_controller.php"
);
$array = preg_replace('~^\.\./controllers/(.+)_controller\.php$~', '$1', $array);
print_r($array);
output:
Array
(
[0] => test
[1] => hello
[2] => user
)
Mapping one array to another:
$files = array(
'../controllers/test_controller.php',
'../controllers/hello_controller.php'
);
$start = strlen('../controllers/');
$end = strlen('_controller.php') * -1;
$controllers = array_map(
function($value) use ($start, $end) {
return substr($value, $start, $end);
},
$files
);
var_dump($controllers);
I'm not sure how you define "command", but I doubt there is a way to do that with a single simple function call.
However, if you simply want it to be compact, here's one way of doing it:
$controllerlist = explode('|||', str_replace(array('../controllers/', '_controller.php'), '', implode('|||', glob("../controllers/*_controller.php"))));
It's a bit dirty, but it gets the job done in a single line.
One command without searching and replacing? Yes you can!
Unless I'm missing something big, why not keep it simple and chop 15 characters from the start and the end using the substr function:
substr ( $x , 15 , -15 )
This works because glob will always give you strings with that pattern.
Example:
// test array (thanks FruityP)
$array = array(
"../controllers/test_controller.php",
"../controllers/hello_controller.php",
"../controllers/user_controller.php" );
foreach($array as $x){
$y=substr($x,15,-15); // Chop 15 characters from the start and end
print("$y\n");
}
Output:
test
hello
user
No need for regex in this case unless there can be variations of what you mentioned.
$array = array(
"../controllers/test_controller.php",
"../controllers/hello_controller.php",
"../controllers/user_controller.php"
);
// Actual one-liner
$list = str_replace(array('../controllers/', '_controller.php'), "", $array);
var_dump($list);
This will output
array (size=3)
0 => string 'test' (length=4)
1 => string 'hello' (length=5)
2 => string 'user' (length=4)
Which is (I think) what you asked for.
If you have an array like this :
$array = array( "../controllers/*_controller.php",
"../controllers/*_controller.php");
Then array_map() helps you trim the unnecessary string.
function trimmer( $string ){
return str_replace( "../controllers/", "", $string );
}
$array = array( "../controllers/*_controller.php",
"../controllers/*_controller.php");
print_r( array_map( "trimmer", $array ) );
http://codepad.org/VO6kyVOa
To strip 15 characters from the start and 15 from the end of each array element in one command:
$controllerlist = substr_replace(
    substr_replace(
        glob("../controllers/*_controller.php"), '', -15
    ), '', 0, 15
);
preg_replace accepts an array as argument too:
$before = '../controllers/';
$after = "_controller.php";
$preg_str = preg_quote($before,"/").'(.*)'.preg_quote($after,"/");
$controllerlist = preg_replace('/^'.$preg_str.'$/', '\1', glob("$before*$after"));
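Another option, not mentioned in the answers above, is basename(), which drops the directory part and can also remove a known suffix. A short sketch, using a hard-coded array in place of the glob() result:

```php
<?php
// Stand-in for glob("../controllers/*_controller.php")
$files = array(
    '../controllers/test_controller.php',
    '../controllers/hello_controller.php',
    '../controllers/user_controller.php',
);

$controllers = array_map(function ($path) {
    // basename() removes the directory and, if present, the given suffix
    return basename($path, '_controller.php');
}, $files);

print_r($controllers);
```

Unlike the substr approach, this does not depend on the directory prefix being exactly 15 characters long.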

Regex Multiple Capture of Group

I'm using regex to capture the dimensions of ads.
The source content is an HTML file, and I'm trying to capture content that looks like:
size[200x400,300x1200] (could be 1-4 different sizes)
I'm trying to build an array with the different sizes in it.
My capture code looks like this:
$size_declaration = array();
$sizes = array();
$declaration_pattern = "/size\[(\d{2,4}x\d{2,4}|\d{2,4}x\d{2,4},){1,4}\]/";
$sizes_pattern = "/\d{2,4}x\d{2,4}/";
$result = preg_match($declaration_pattern, $html, $size_declaration);
if( $result ) {
$result = preg_match_all($sizes_pattern, $size_declaration[0], $sizes);
var_dump($sizes);
}
The code above produces usable results:
$sizes = array(
[0] => array (
[0] => '200x400',
[1] => '300x1200'
)
)
but it takes quite a bit of code. I was thinking it was possible to collect the results with a single regex, but I couldn't find a result that works. Is there a way to clean this up a bit?
It's not very practical to turn it into a single expression; it is better to keep them separate. The first expression finds the boundaries and does rudimentary checks on the inner contents; the second expression breaks the contents down into individual pieces:
if (preg_match_all('/size\[([\dx,]+)\]/', $html, $matches)) {
foreach ($matches[0] as $size_declaration) {
if (preg_match_all('/\d+x\d+/', $size_declaration, $sizes)) {
print_r($sizes[0]);
}
}
}
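The two-step approach can be wrapped in a small function that returns one array of sizes per declaration. This is a sketch only: the name extractSizes is invented here, and it reuses the loose patterns from the snippet above rather than the stricter \d{2,4} limits from the question.

```php
<?php
// Hypothetical helper: one list of "WxH" strings per size[...] declaration.
function extractSizes(string $html): array
{
    $result = array();
    // Step 1: find each declaration and capture what is inside the brackets
    if (preg_match_all('/size\[([\dx,]+)\]/', $html, $declarations)) {
        foreach ($declarations[1] as $inner) {
            // Step 2: break the inner part into individual sizes
            if (preg_match_all('/\d+x\d+/', $inner, $sizes)) {
                $result[] = $sizes[0];
            }
        }
    }
    return $result;
}

print_r(extractSizes('<p>size[200x400,300x1200]</p> <p>size[523x800]</p>'));
```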
This one is a little simpler:
$html = "size[200x400,300x600,300x100]";
if (($result = preg_match_all("/(\d{2,4}x\d{2,4}){1,4}/", $html, $matches)) > 0)
var_dump($matches);
//
// $matches =>
// array(
// (int) 0 => array(
// (int) 0 => '200x400',
// (int) 1 => '300x600',
// (int) 2 => '300x100'
// ),
// (int) 1 => array(
// (int) 0 => '200x400',
// (int) 1 => '300x600',
// (int) 2 => '300x100'
// )
// )
//
The only way is to repeat the 4 possible sizes in the pattern:
$subject = <<<LOD
size[523x800]
size[200x400,300x1200]
size[201x300,352x1200,123x456]
size[142x396,1444x32,143x89,231x456]
LOD;
$pattern = '`size\[(\d{2,4}x\d{2,4})(?:,(\d{2,4}x\d{2,4}))?(?:,(\d{2,4}x\d{2,4}))?(?:,(\d{2,4}x\d{2,4}))?]`';
preg_match_all($pattern, $subject, $matches, PREG_SET_ORDER);
foreach ($matches as &$match) { array_shift($match); }
unset($match); // break the reference left over from the loop
print_r($matches);
The pattern can also be shortened by reusing the subpattern of capture group 1 with (?1):
$pattern = '`size\[(\d{2,4}x\d{2,4})(?:,((?1)))?(?:,((?1)))?(?:,((?1)))?]`';
or with the Oniguruma syntax:
$pattern = '`size\[(\d{2,4}x\d{2,4})(?:,(\g<1>))?(?:,(\g<1>))?(?:,(\g<1>))?]`';

PHP preg_match creating empty arrays

I am using the following code:
foreach ($_POST as $key => $value) {
(preg_match("/^._box_(\d+)_5$/", $key, $matches));
//$prodid[] = $matches[1];
$firephp->log($matches, 'matches');
};
This code is working on the following information being $_POSTed from the previous page:
array(
['clin'] =>
['clinmail'] =>
['quest_3'] =>
['quest_7'] =>
['quest_8'] =>
['quest_9'] =>
['quest_10'] =>
['quest_15'] =>
['quest_16'] =>
['hosp'] => 8
['user'] => 16
['a_box_15_5'] => 2
['a_box_16_5'] => 2
['b_box_1_5'] => '$0.00'
['b_box_29_5'] => 1
)
The problem is I get the following result:
matches: array()
matches: array()
matches: array()
matches: array()
matches: array()
matches: array()
matches: array()
matches: array()
matches: array()
matches: array()
matches: array()
matches: array('0'=>'a_box_15_5', '1'=>'15')
matches: array('0'=>'a_box_16_5', '1'=>'16')
matches: array('0'=>'b_box_1_5', '1'=>'1')
matches: array('0'=>'b_box_29_5', '1'=>'29')
I don't want it matching the first 11 positions. I only want the results that actually match what I'm looking for, which in this case is the last four $_POST entries. Isn't that what preg_match is supposed to do? How can I limit it to just the matches?
preg_match() works correctly: if there are no matches, $matches will be empty. But you're not doing anything different if there are no matches, you're always calling $firephp->log(), matches or not.
preg_match() returns 1 if the pattern matches, 0 if it does not, and false if an error occurred, so you can use the return value to see whether there is a match, and only then call $firephp->log():
foreach ($_POST as $key => $value) {
if (preg_match('/^._box_(\d+)_5$/', $key, $matches)) {
$firephp->log($matches, 'matches');
}
}
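The same check also makes it safe to use the captured number, as in the commented-out $prodid line from the question. A small sketch, with a hard-coded $post array standing in for $_POST and without the FirePHP logger:

```php
<?php
// Stand-in for $_POST (a subset of the keys shown in the question)
$post = array(
    'clin'       => '',
    'quest_3'    => '',
    'hosp'       => '8',
    'a_box_15_5' => '2',
    'b_box_1_5'  => '$0.00',
);

$prodid = array();
foreach ($post as $key => $value) {
    // $matches[1] is only read when the pattern actually matched
    if (preg_match('/^._box_(\d+)_5$/', $key, $matches) === 1) {
        $prodid[] = $matches[1];
    }
}

print_r($prodid); // the captured IDs, as strings
```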
Add a check before logging it:
foreach ($_POST as $key => $value) {
(preg_match("/^._box_(\d+)_5$/", $key, $matches));
//$prodid[] = $matches[1];
if(!empty($matches)){
$firephp->log($matches, 'matches');
}
};

Turn text inside brackets to an array PHP

If I have a string that looks like this:
$myString = "[sometext][moretext][993][112]This is a long text";
I want it to be turned into:
$string = "This is a long text";
$arrayDigits[0] = 993;
$arrayDigits[1] = 112;
$arrayText[0] = "sometext";
$arrayText[1] = "moretext";
How can I do this with PHP?
I understand Regular Expressions is the solution. Please notice that $myString was just an example. There can be several brackets, not just two of each, as in my example.
Thanks for your help!
This is what I came up with.
<?php
#For better display
header("Content-Type: text/plain");
#The String
$myString = "[sometext][moretext][993][112]This is a long text";
#Initialize the array
$matches = array();
#Fill it with matches. It would populate $matches[1].
preg_match_all("|\[(.+?)\]|", $myString, $matches);
#Remove anything inside of square brackets, and assign to $string.
$string = preg_replace("|\[.+?\]|", "", $myString);
#Display the results.
print_r($matches[1]);
print_r($string);
After that, you can iterate over the $matches array and check each value to assign it to a new array.
Try this:
$s = '[sometext][moretext][993][112]This is a long text';
preg_match_all('/\[(\w+)\]/', $s, $m);
$m[1] will contain all the texts in the brackets; after this you can check the type of each value. Alternatively, you can use two preg_match_all calls: the first with the pattern /\[(\d+)\]/ (returns the array of digits), the second with the pattern /\[([a-zA-Z]+)\]/ (returns the words):
$s = '[sometext][moretext][993][112]This is a long text';
preg_match_all('/\[(\d+)\]/', $s, $matches);
$arrayOfDigits = $matches[1];
preg_match_all('/\[([a-zA-Z]+)\]/', $s, $matches);
$arrayOfWords = $matches[1];
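If the bracketed values should end up in exactly the variables from the question, one pass over the captured tokens is enough. A minimal sketch, assuming that "digits" means tokens made only of 0-9 and that everything else counts as text:

```php
<?php
$myString = '[sometext][moretext][993][112]This is a long text';

$arrayDigits = array();
$arrayText   = array();

// Capture the content of every [...] group
preg_match_all('/\[([^\]]*)\]/', $myString, $m);
foreach ($m[1] as $token) {
    if (preg_match('/^\d+$/D', $token)) {
        $arrayDigits[] = (int) $token;  // all digits: store as an integer
    } else {
        $arrayText[] = $token;          // anything else: store as text
    }
}

// What remains after removing the bracket groups is the plain string
$string = preg_replace('/\[[^\]]*\]/', '', $myString);

var_dump($string, $arrayDigits, $arrayText);
```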
For cases like yours you can use named subpatterns to "tokenize" your string. With a little code, this can be made easily configurable with an array of tokens:
$subject = "[sometext][moretext][993][112]This is a long text";
$groups = array(
'digit' => '\[\d+]',
'text' => '\[\w+]',
'free' => '.+'
);
Each group contains the subpattern and its name. They are tried in order, so if the group digit matches, text does not get a chance (which is necessary here because \d+ matches a subset of what \w+ matches). This array can then be turned into a full pattern:
foreach($groups as $name => &$subpattern)
$subpattern = sprintf('(?<%s>%s)', $name, $subpattern);
unset($subpattern);
$pattern = sprintf('/(?:%s)/', implode('|', $groups));
The pattern looks like this:
/(?:(?<digit>\[\d+])|(?<text>\[\w+])|(?<free>.+))/
All that is left to do is to execute it against your string, capture the matches and filter them for some normalized output:
if (preg_match_all($pattern, $subject, $matches))
{
$matches = array_intersect_key($matches, $groups);
$matches = array_map('array_filter', $matches);
$matches = array_map('array_values', $matches);
print_r($matches);
}
The matches are now nicely accessible in an array:
Array
(
[digit] => Array
(
[0] => [993]
[1] => [112]
)
[text] => Array
(
[0] => [sometext]
[1] => [moretext]
)
[free] => Array
(
[0] => This is a long text
)
)
The full example at once:
$subject = "[sometext][moretext][993][112]This is a long text";
$groups = array(
'digit' => '\[\d+]',
'text' => '\[\w+]',
'free' => '.+'
);
foreach($groups as $name => &$subpattern)
$subpattern = sprintf('(?<%s>%s)', $name, $subpattern);
unset($subpattern);
$pattern = sprintf('/(?:%s)/', implode('|', $groups));
if (preg_match_all($pattern, $subject, $matches))
{
$matches = array_intersect_key($matches, $groups);
$matches = array_map('array_filter', $matches);
$matches = array_map('array_values', $matches);
print_r($matches);
}
You could try something along the lines of:
<?php
function parseString($string) {
// identify data in brackets
static $pattern = '#(?:\[)([^\[\]]+)(?:\])#';
// result container
$t = array(
'string' => null,
'digits' => array(),
'text' => array(),
);
$t['string'] = preg_replace_callback($pattern, function($m) use(&$t) {
// shove matched string into digits/text groups
$t[is_numeric($m[1]) ? 'digits' : 'text'][] = $m[1];
// remove the brackets from the text
return '';
}, $string);
return $t;
}
$string = "[sometext][moretext][993][112]This is a long text";
$result = parseString($string);
var_dump($result);
/*
$result === array(
"string" => "This is a long text",
"digits" => array(
"993",
"112",
),
"text" => array(
"sometext",
"moretext",
),
);
*/
(PHP5.3 - using closures)
