Having problems splitting a long string with preg_match_all


I have a text variable that gets a new sentence appended on every update. When I display it, it comes out as one long sentence, and I want to break it up into single sentences for readability.
<?php
$pattern = '~\\d+-\\d+-\\d{4} // \\w+: ~ ';
$subject = '01-02-2015 // john: info text goes here 10-12-2015 // peter: some more info
';
$matches = array();
$result = preg_match_all ($pattern, $subject, $matches);
?>
Which gives this output:
$matches:
array (
0 =>
array (
0 => '01-02-2015 // john: ',
1 => '10-12-2015 // peter: ',
),
)
I'd like the output to be:
$matches:
array (
0 =>
array (
0 => '01-02-2015 // john: info text goes here',
1 => '10-12-2015 // peter: some more info',
),
)
I need the output to be like this so I can use a foreach loop to print each sentence.
PS: I'd like to get it working this way first, because otherwise I'd need to change a lot of entries in the database.
PPS: I'm no hero with regex, as you can see, so I hope someone can help me out!

Just change your regex as shown below:
$pattern = '~\d+-\d+-\d{4} // \w+: .*?(?=\s\d+|$)~';
.*? does a non-greedy match of zero or more characters until it reaches either a whitespace character followed by digits or the end of the string.
DEMO
$str = "01-02-2015 // john: info text goes here 10-12-2015 // peter: some more info";
preg_match_all('~\d+-\d+-\d{4} // \w+: .*?(?=\s\d+|$)~', $str, $matches);
print_r($matches);
Output:
Array
(
[0] => Array
(
[0] => 01-02-2015 // john: info text goes here
[1] => 10-12-2015 // peter: some more info
)
)
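One caveat with the lookahead above: it stops at any whitespace followed by digits, so an entry whose text contains a number (for example "ordered 3 items") would be cut short. A minimal sketch of a stricter variant, assuming every entry starts with the same "dd-mm-yyyy // name: " prefix, is to look ahead for the whole date prefix instead. The sample subject below is made up for illustration.

```php
<?php
// Hypothetical sample: the first entry contains a number in its text.
$subject = '01-02-2015 // john: ordered 3 items 10-12-2015 // peter: some more info';

// Stop only before the next full "dd-mm-yyyy // " prefix, or before
// trailing whitespace at the end of the string.
$pattern = '~\d+-\d+-\d{4} // \w+: .*?(?=\s+\d+-\d+-\d{4} // |\s*$)~s';

preg_match_all($pattern, $subject, $matches);

// Print one entry per line.
foreach ($matches[0] as $entry) {
    echo $entry, PHP_EOL;
}
```

The s modifier lets the dot match newlines, in case an entry spans more than one line.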

Related

Strange behavior of preg_match_all php

I have a very long string of HTML. From this string I want to parse pairs of Russian and English city names. An example of this string is:
$html = '
Абакан
Хакасия республика
Абан
Красноярский край
Абатский
Тюменская область
';
My code is:
$subject = $this->html;
$pattern = '/<a href="([\/a-zA-Z0-9-"]*)">([а-яА-Я]*)/';
preg_match_all($pattern, $subject, $matches);
For testing I use RegExr; you can see it here: http://regexr.com/399co
The test there uses the global modifier, /g.
Because PHP has no /g modifier, I use the preg_match_all function. But the result of preg_match_all is very strange:
Array
(
[0] => Array
(
[0] => <a href="/forecasts5000/russia/republic-khakassia/abakan">Абакан
[1] => <a href="/forecasts5000/russia/krasnoyarsk-territory/aban">Абан
[2] => <a href="/forecasts5000/russia/tyumen-area/abatskij">Аба�
[3] => <a href="/forecasts5000/russia/arkhangelsk-area/abramovskij-ma">Аб�
)
[1] => Array
(
[0] => /forecasts5000/russia/republic-khakassia/abakan
[1] => /forecasts5000/russia/krasnoyarsk-territory/aban
[2] => /forecasts5000/russia/tyumen-area/abatskij
[3] => /forecasts5000/russia/arkhangelsk-area/abramovskij-ma
)
[2] => Array
(
[0] => Абакан
[1] => Абан
[2] => Аба�
[3] => Аб�
)
)
First, it seems to find only the first match (but I need an array with all matches).
Second, the result looks very strange to me. I want to get the following result:
pairs such as /forecasts5000/russia/republic-khakassia/abakan and Абакан
What am I doing wrong?

Element 0 of the result is an array of each of the full matches of the regexp. Element 1 is an array of all the matches for capture group 1, element 2 contains capture group 2, and so on.
You can invert this by using the PREG_SET_ORDER flag. Then element 0 will contain all the results from the first match, element 1 will contain all the results from the second match, and so on. Within each of these, [0] will be the full match, and the remaining elements will be the capture groups.
If you use this option, you can then get the information you want with:
foreach ($matches as $match) {
$url = $match[1];
$text = $match[2];
// Do something with $url and $text
}
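A minimal sketch of the PREG_SET_ORDER approach follows. The sample markup is an assumption, rebuilt from the hrefs in the question's output, since the original HTML was not preserved. The sketch also adds the u modifier, which the answer above does not mention: without it PCRE treats the pattern and subject as single bytes, so a class like [а-яА-Я] can stop in the middle of a multi-byte UTF-8 character, which would explain the truncated names in the question.

```php
<?php
// Assumed sample input, reconstructed from the hrefs in the question's output.
$html = '<a href="/forecasts5000/russia/republic-khakassia/abakan">Абакан</a>
<a href="/forecasts5000/russia/krasnoyarsk-territory/aban">Абан</a>
<a href="/forecasts5000/russia/tyumen-area/abatskij">Абатский</a>';

// The trailing "u" makes PCRE treat pattern and subject as UTF-8.
$pattern = '/<a href="([\/a-zA-Z0-9-]*)">([а-яА-Я]*)/u';

// PREG_SET_ORDER groups the results per match instead of per capture group.
preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

$pairs = [];
foreach ($matches as $match) {
    // $match[0] is the full match, $match[1] the URL, $match[2] the name.
    $pairs[] = [$match[1], $match[2]];
}
print_r($pairs);
```

This assumes the source file and the input are both UTF-8 encoded.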

You can also use the T-Regx library, which has separate methods for each case :)
pattern('<a href="([/a-zA-Z0-9-"]*)">([а-яА-Я]*)')
->match($this->html)
->forEach(function (Match $match) {
$text = $match->text();
$group = $match->group(1);
echo "Match $text with group $group";
});
It also has automatic delimiters.

Pattern for preg_match

I have a string that contains the pattern "[link:activate/$id/$test_code]". I need to get the word activate, $id and $test_code out of it whenever the pattern [link.....] occurs.
I also tried getting the inner items by grouping, but that only gets activate and $test_code, not $id. Please help me get the action name and all the parameters in an array.
Below is my code and its output.
Code
function match_test()
{
$string = "Sample string contains [link:activate/\$id/\$test_code] again [link:anotheraction/\$key/\$second_param]]] also how the other ationc like [link:action] works";
$pattern = '/\[link:([a-z\_]+)(\/\$[a-z\_]+)+\]/i';
preg_match_all($pattern,$string,$matches);
print_r($matches);
}
Output
Array
(
[0] => Array
(
[0] => [link:activate/$id/$test_code]
[1] => [link:anotheraction/$key/$second_param]
)
[1] => Array
(
[0] => activate
[1] => anotheraction
)
[2] => Array
(
[0] => /$test_code
[1] => /$second_param
)
)

Try this:
$subject = <<<'LOD'
Sample string contains [link:activate/$id/$test_code] again [link:anotheraction/$key/$second_param]]] also how the other ationc like [link:action] works
LOD;
$pattern = '~\[link:([a-z_]+)((?:/\$[a-z_]+)*)]~i';
preg_match_all($pattern, $subject, $matches);
print_r($matches);
If you need $id and $test_code in separate groups, you can use this instead:
$pattern = '~\[link:([a-z_]+)(/\$[a-z_]+)?(/\$[a-z_]+)?]~i';
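If the number of parameters is not fixed, one possible approach (a sketch, not part of the answer above) is to keep the first pattern, which captures the whole parameter list in group 2, and split that group on "/" afterwards. The $links array and its 'action' and 'params' keys are names invented for this illustration.

```php
<?php
$subject = 'Sample string contains [link:activate/$id/$test_code] again '
         . '[link:anotheraction/$key/$second_param]]] also [link:action] works';

// Group 1: action name. Group 2: the whole "/$a/$b..." parameter list (may be empty).
$pattern = '~\[link:([a-z_]+)((?:/\$[a-z_]+)*)]~i';
preg_match_all($pattern, $subject, $matches, PREG_SET_ORDER);

$links = [];
foreach ($matches as $m) {
    // Drop the leading "/" and split the rest; an empty list stays empty.
    $params = $m[2] === '' ? [] : explode('/', ltrim($m[2], '/'));
    $links[] = ['action' => $m[1], 'params' => $params];
}
print_r($links);
```

Each entry then holds the action name together with however many parameters that link carried.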

Is this what you are looking for?
/\[link:([\w\d]+)\/(\$[\w\d]+)\/(\$[\w\d]+)\]/
Edit:
Also the problem with your expression is this part:
(\/\$[a-z\_]+)+
Although you have repeated the group, the match returns only one value for it, because it is still a single group declaration: each repetition overwrites the previous capture, so only the last one is kept. The regex won't invent extra group numbers for you (not that I've ever seen, anyway).
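The behaviour described above can be seen in isolation with a small sketch. The digit-list input is a made-up example, not taken from the question.

```php
<?php
// A repeated capture group keeps only the text of its last iteration.
preg_match('/^(?:(\d),?)+$/', '1,2,3', $m);
echo $m[1], PHP_EOL; // only the digit from the final repetition survives

// Matching each item on its own returns all of them.
preg_match_all('/\d/', '1,2,3', $all);
print_r($all[0]);
```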

Regular expression not returning proper matches

I am currently experiencing some problems with my regular expression on my PHP server.
This is my current regular expression:
/\{content="(?:([^"\|]*)\|?)+"\}/
And I want it to match:
{content="default|test|content|text"}
And then return this in the matches:
default
test
content
text
But when I currently execute it I get back this in my matches:
array (
0 => '{content="default|test|content|text"}',
1 => '',
)
Does any of you have an idea what I am doing wrong?
With kind regards,
Youri Arktesteijn

You can use positive lookaheads and positive lookbehinds.
The pattern has three phases:
1. Match the opening quote without capturing it in the output, then match anything that is not a pipe, then match a pipe without capturing it.
2. Non-pipes between pipes.
3. Non-pipes and non-quotes between a pipe and a quote.
Here's the code.
<?php
$string = '{content="default|test|content|text"}';
$my_matches = preg_match_all('!((?<=")([^|]+)(?=[|])|(?<=[|])([^|]+)(?=[|])|(?<=[|])([^|"]+)(?="))!',$string,$matches);
print_r($matches[0]);
?>
Output
Array
(
[0] => default
[1] => test
[2] => content
[3] => text
)
Once you have the logic working, then you can pair the look ahead and look behind characters to shorten the match string.
$my_matches = preg_match_all('!(?<=["|])([^|"]+)(?=[|"])!',$string,$matches);
Output
Array
(
[0] => default
[1] => test
[2] => content
[3] => text
)

I don't know whether this is possible with a single regular expression. Anyway, try the following code:
<?php
// Sample input from the question
$sContent = '{content="default|test|content|text"}';
if (preg_match('/\{content="(?:([^\"]+))"\}/', $sContent, $matches) > 0) {
$result = explode('|', $matches[1]);
} else {
$result = array();
}
echo '<pre>' . print_r($result, true) . '</pre>';
?>

Match String and Get Variable PHP

I would like to use a placeholder such as '{is-id-1}' to pull a certain piece of data from a database. Basically, if '{is-id-*}' is found in the text, I get the ID (e.g. for {is-id-1} the ID is 1) and can then run some code with that ID.
I've got as far as getting the ID from the braces, but I'm not sure how to replace the placeholder within the text.
<?php
$text = "test 1 dfhjsdh sdjkfhksdhfkj skjh {is-id-1} sdfhskdfh sdfsdjfhksd fjksdfhksd {is-id-2}";
preg_match_all('/{is-id-+(.*?)}/',$text, $matches);
print_r ($matches);
$replacewiththis = "this has been replaced, it was id: " . $idhere;
$text = preg_replace('/{is-id-+(.*?)}/', $replacewiththis, $text);
echo $text;
?>
The Array for the matches outputs:
Array (
[0] => Array (
[0] => {is-id-1}
[1] => {is-id-2}
)
[1] => Array (
[0] => 1
[1] => 2
)
)
I'm stuck now and not sure how I can process each of the braces. Can anyone give me a hand?
Thanks.

I am not sure I understood well what you want, but I think this is it:
foreach($matches[1] as $match){
$replacewiththis = "this has been replaced, it was id: $match";
$text=str_replace('{is-id-'.$match.'}', $replacewiththis, $text);
}
echo $text;
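An alternative sketch, not part of the answer above, is preg_replace_callback(), which finds and replaces in a single pass and hands the captured ID to a callback. The pattern here is narrowed to digits only, which is an assumption about what the IDs look like.

```php
<?php
$text = "test 1 dfhjsdh {is-id-1} sdfhskdfh {is-id-2}";

// The callback receives the match array: $m[0] is the whole placeholder,
// $m[1] is the captured ID. Its return value replaces the placeholder.
$text = preg_replace_callback('/\{is-id-(\d+)\}/', function (array $m): string {
    return "this has been replaced, it was id: " . $m[1];
}, $text);

echo $text;
```

Inside the callback you could equally look the ID up in the database and return whatever should appear in its place.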

Explode with regexp

I have a string like {ASK(Value, Value, 'Sentence', Some_Char)} and I need to get the values inside the () as separate items. What am I doing wrong?
preg_match_all('/\{ASK\((.*?),\)\}/', '{ASK(Value, Value, \'Sentence\', X)}', $matches);
print_r($matches);

Take out the comma from your regular expression, and it matches.
preg_match_all('/\{ASK\((.*?)\)\}/', '{ASK(Value, Value, \'Sentence\', X)}', $matches);
print_r($matches);
//Explode the captured group of the first match
$exploded = explode(',', $matches[1][0]);
print_r($exploded);
/*
* Note that we used $matches[1] instead of $matches[0],
* since the first element contains the entire matched
* expression, and each subsequent element contains the matching groups.
* With preg_match_all each of those is itself an array with one entry
* per match, hence the extra [0].
*/

$s = "{ASK(Value, Value, 'Sentence', Some_Char)}";
$p = '#\{ASK\((.*?)\)\}#';
preg_match_all($p, $s, $matches);
print_r($matches);

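Simply split and explode (split() was removed in PHP 7, so preg_split() is used here):
$Myval = "{ASK(Value, Value, 'Sentence', Some_Char)}";
// Split on either parenthesis; index 1 is the text between them
$splitedVal = preg_split('/[()]/', $Myval);
$explodedVal = explode(",", $splitedVal[1]);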
print_r($explodedVal);
// output
Array ( [0] => Value [1] => Value [2] => 'Sentence' [3] => Some_Char )

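An easy way to do this (though not entirely contained within the regex) might be:
// Capture everything between the parentheses in group 1
preg_match_all('/\{ASK\(([^)]*)\)\}/', '{ASK(Value, Value, \'Sentence\', X)}', $matches);
// $matches[1][0] is group 1 of the first match
$values = explode(',', $matches[1][0]);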

So long as your Values, Sentences, and Chars do not contain , or ), then this single regex pattern will deliver without the extra explode() call.
Pattern: ~(?:\G, |ASK\()\K[^,)]+~
Code:
$string="{ASK(Value, Value, 'Sentence', Some_Char)}";
print_r(preg_match_all('~(?:\G, |ASK\()\K[^,)]+~',$string,$out)?$out[0]:[]);
Output:
Array
(
[0] => Value
[1] => Value
[2] => 'Sentence'
[3] => Some_Char
)
The "magic" is in the \G. This tells regex to continue matching at the start of the string or just after the previous match. Here is a similar answer that I've posted: https://stackoverflow.com/a/48373347/2943403
