Regex with unknown character length - php

Simple question for you folks.
Sorry that I have to ask it.
On my website, I want to use signatures at "random" places in my text. The problem is that there could be multiple DIFFERENT signatures in the given string.
The signature code is ~~USERNAME~~
So anything like
~~timtj~~
~~foobar~~
~~totallylongusername~~
~~I-d0n't-us3-pr0p3r-ch#r#ct3r5~~
I have tried using preg_match for this, with no success. I understand that the third parameter is used to store the matches, but I cannot get a proper match because of the format.
Should I not use preg_match, or am I just not able to use signatures in this manner?

You could use preg_match_all with this modified regex:
preg_match_all('/~~(.*?)~~/', $str, $matches);
The code...
<?php
$str="~~I-d0n't-us3-pr0p3r-ch#r#ct3r5~~";
preg_match_all('/~~(.*?)~~/', $str, $matches);
print_r($matches[1]);
OUTPUT :
Array
(
[0] => I-d0n't-us3-pr0p3r-ch#r#ct3r5
)
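The lazy quantifier (the `?` after `.*`) is what keeps each signature separate when a string holds more than one. A minimal sketch of the difference, using a made-up sample sentence (`$text` is an assumption, not from the question):

```php
<?php
// Hypothetical sample text containing two signatures.
$text = "Posted by ~~timtj~~ and edited by ~~foobar~~";

// Greedy: .* runs to the last ~~, so both signatures collapse into one match.
preg_match_all('/~~(.*)~~/', $text, $greedy);

// Lazy: .*? stops at the first closing ~~, giving one match per signature.
preg_match_all('/~~(.*?)~~/', $text, $lazy);

print_r($greedy[1]); // one element: "timtj~~ and edited by ~~foobar"
print_r($lazy[1]);   // two elements: "timtj" and "foobar"
```

Note that `.` does not match newlines by default, so a signature split across lines would not be picked up by either pattern.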

This should work, but usernames mustn't contain ~~
preg_match_all('!~~(.*?)~~!', $str, $matches);
Output:
Array
(
[0] => Array
(
[0] => ~~timtj~~
[1] => ~~foobar~~
[2] => ~~totallylongusername~~
[3] => ~~I-d0n't-us3-pr0p3r-ch#r#ct3r5~~
)
[1] => Array
(
[0] => timtj
[1] => foobar
[2] => totallylongusername
[3] => I-d0n't-us3-pr0p3r-ch#r#ct3r5
)
)
The first sub array contains the complete matched strings and the other sub arrays contain the matched groups.
You could change the order by using the flag PREG_SET_ORDER, see http://php.net/preg_match_all#refsect1-function.preg-match-all-parameters
<?php
$str = "~~timtj~~ ~~foobar~~ ~~totallylongusername~~ ~~I-d0n't-us3-pr0p3r-ch#r#ct3r5~~";
preg_match_all("!~~(.*?)~~!", $str, $matches, PREG_SET_ORDER);
print_r($matches);
This code produces the following output
Array
(
[0] => Array
(
[0] => ~~timtj~~
[1] => timtj
)
[1] => Array
(
[0] => ~~foobar~~
[1] => foobar
)
[2] => Array
(
[0] => ~~totallylongusername~~
[1] => totallylongusername
)
[3] => Array
(
[0] => ~~I-d0n't-us3-pr0p3r-ch#r#ct3r5~~
[1] => I-d0n't-us3-pr0p3r-ch#r#ct3r5
)
)
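With PREG_SET_ORDER each element of the result is one complete match, which makes it convenient to loop over. A small sketch of collecting just the usernames (the `$names` variable is only for illustration):

```php
<?php
$str = "~~timtj~~ ~~foobar~~ ~~totallylongusername~~";
preg_match_all('!~~(.*?)~~!', $str, $matches, PREG_SET_ORDER);

$names = [];
foreach ($matches as $match) {
    // $match[0] is the full signature, $match[1] the captured username.
    $names[] = $match[1];
}
print_r($names);
```

Without the flag, the same list is available directly as `$matches[1]`, so the set order is mostly useful when the full match and the group are needed side by side.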

Related

split string and make array, regular expression

$content = "[2][6][11]";
I would like to split this into an array with the values [2], [6] and [11].
preg_split("/\[*\]/i", $content);
Wrong output: Array ( [0] => [2 [1] => [5 [2] => )
Any idea what's wrong with the regular expression?
thanks.
You can use lookarounds for this split:
$content = "[2][6][11]";
print_r(preg_split('/(?<=\])(?=\[)/', $content));
Output:
Array
(
[0] => [2]
[1] => [6]
[2] => [11]
)
You can use lookarounds to test which characters surround the position you want to find, without including them in the match.
print_r(preg_split('~(?<=])(?=\[)~', $content));
Note that if you already know how your string is formatted, you can also use preg_match_all with a simpler pattern: ~\[\d+]~
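As a quick check, here is a sketch that runs both approaches on the sample string and compares them. It assumes the brackets only ever contain digits, which is what the `\d+` pattern requires:

```php
<?php
$content = "[2][6][11]";

// Match each bracketed number directly.
preg_match_all('~\[\d+]~', $content, $matches);
$parts = $matches[0];

// Split at the zero-width position between "]" and "[".
$split = preg_split('~(?<=])(?=\[)~', $content);

print_r($parts);
var_dump($parts === $split); // both approaches agree on this input
```

The split version keeps whatever is between the brackets, while the match version silently skips groups that are not purely numeric, so the choice depends on how strict the input is.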
You can also do it with preg_match_all :
$content = "[2][6][11]";
preg_match_all("/\[.*\]/Ui", $content, $matches);
$result = $matches[0];
print_r($result);
Output:
Array
(
[0] => [2]
[1] => [6]
[2] => [11]
)

Obtain specific data with preg_match_all

I have different texts which aren't well formatted, therefore I need a pattern which works with all of them and return some specific elements (text) from it. Let's say I have this text:
"AL TEST232 KW 12*/13*/17 TEST kw16TEST123 kw 15*"
and I want my preg_match_all() to return something like this:
Array
(
[0] => Array
(
[0] => AL TEST232
[1] => 12/13/17
)
[1] => Array
(
[0] => TEST
[1] => 16
)
[2] => Array
(
[0] => TEST123
[1] => 15
)
)
Is this possible with a single pattern?
You can use:
preg_match_all('~(\w[\s\w]*?\w)\s*kw\s*([\d/*]+)~i', $input, $matches);
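To get closer to the array shape asked for, the matches can be post-processed. This is only a sketch: it assumes the `i` flag is wanted (the sample text contains both "KW" and "kw") and that the asterisks should simply be stripped from the second group; `$rows` is a name introduced here for illustration.

```php
<?php
$input = "AL TEST232 KW 12*/13*/17 TEST kw16TEST123 kw 15*";

// PREG_SET_ORDER groups the captures per match instead of per group.
preg_match_all('~(\w[\s\w]*?\w)\s*kw\s*([\d/*]+)~i', $input, $matches, PREG_SET_ORDER);

$rows = [];
foreach ($matches as $m) {
    // $m[1] is the label before "kw", $m[2] the numbers (still with "*").
    $rows[] = [$m[1], str_replace('*', '', $m[2])];
}
print_r($rows);
```

The pattern itself cannot drop the asterisks from inside a capture, which is why the cleanup happens in a second step.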

Using REGEX with escaped quotes inside quotes

I have a PHP preg_match_all and REGEX question.
I have the following code:
<?php
$string= 'attribute1="some_value" attribute2="<h1 class=\"title\">Blahhhh</h1>"';
preg_match_all('/(.*?)\s*=\s*(\'|"|&#?\w+;)(.*?)\2/s', trim($string), $matches);
print_r($matches);
?>
That does not seem to pickup escaped quotes for the instance that I want to pass in HTML with quotes. I have tried numerous solutions for this with the basic quotes inside quotes REGEX fixes, but none seem to be working for me. I can't seem to place them correctly inside this pre-existing REGEX.
I am not a REGEX master, can someone please point me in the right direction?
The result I am trying to achieve is this:
Array
(
[0] => Array
(
[0] => attribute1="some_value"
[1] => attribute2="<h1 class=\"title\">Blahhhh</h1>"
)
[1] => Array
(
[0] => attribute1
[1] => attribute2
)
[2] => Array
(
[0] => "
[1] => "
)
[3] => Array
(
[0] => some_value
[1] => <h1 class=\"title\">Blahhhh</h1>
)
)
Thanks.
You can solve this with a negative lookbehind assertion:
'/(.*?)\s*=\s*(\'|"|&#?\w+;)(.*?)(?<!\\\\)\2/'
                                 ^^^^^^^^^
The closing quote must not be preceded by \. This gives me:
Array
(
[0] => Array
(
[0] => attribute1="some_value"
[1] => attribute2="<h1 class=\"title\">Blahhhh</h1>"
)
[1] => Array
(
[0] => attribute1
[1] => attribute2
)
[2] => Array
(
[0] => "
[1] => "
)
[3] => Array
(
[0] => some_value
[1] => <h1 class=\"title\">Blahhhh</h1>
)
)
This regex isn't perfect because of the entity you put in there as a delimiter: like the quotes, it can be escaped with \ as well. No idea if that is really intended.
See also this great question/answer: Split string by delimiter, but not if it is escaped.
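A different technique from the lookbehind, sketched here as an alternative, is to consume escape sequences as pairs inside the value. That also copes with an escaped backslash right before the closing quote. The sketch assumes attribute names are plain word characters and values are always double-quoted, which is narrower than the original pattern:

```php
<?php
$string = 'attribute1="some_value" attribute2="<h1 class=\"title\">Blahhhh</h1>"';

// Inside the quotes, accept either a character that is neither a quote nor
// a backslash, or a backslash followed by any single character.
$pattern = '/(\w+)\s*=\s*"((?:[^"\\\\]|\\\\.)*)"/s';

preg_match_all($pattern, $string, $matches);
print_r($matches[1]); // attribute names
print_r($matches[2]); // attribute values, escapes left as written
```

The values still contain the backslashes; something like `stripslashes()` could be applied afterwards if the unescaped text is needed.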

What is the regex for the text between quotes?

Ok, I have tried looking at other answers, but couldn't get mine solved. So here is the code:
{"chg":"-0.71","vol":"40700","time":"11.08.2011 12:29:09","high":"1.417","low":"1.360","last":"1.400","pcl":"1.410","turnover":"56,560.25"}
I need to get every second value in the quotes (as the "name" values are constant). I actually worked out that I need to get the text between :" and " but I can't manage to write a regex for that.
EDIT: I'm doing preg_match_all in PHP. And it's between :" and ", not " and " as someone else edited.
Why on earth would you attempt to parse JSON with regular expressions? PHP already parses JSON properly, with built-in functionality.
Code:
<?php
$input = '{"chg":"-0.71","vol":"40700","time":"11.08.2011 12:29:09","high":"1.417","low":"1.360","last":"1.400","pcl":"1.410","turnover":"56,560.25"}';
print_r(json_decode($input, true));
?>
Output:
Array
(
[chg] => -0.71
[vol] => 40700
[time] => 11.08.2011 12:29:09
[high] => 1.417
[low] => 1.360
[last] => 1.400
[pcl] => 1.410
[turnover] => 56,560.25
)
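If only the values are wanted, as in the question, they can be pulled out of the decoded array. A minimal sketch; the error check and the `$values` name are additions for illustration:

```php
<?php
$input = '{"chg":"-0.71","vol":"40700","time":"11.08.2011 12:29:09","high":"1.417","low":"1.360","last":"1.400","pcl":"1.410","turnover":"56,560.25"}';

$data = json_decode($input, true);
if (!is_array($data)) {
    // json_decode returns null when the input is not valid JSON.
    throw new RuntimeException('Invalid JSON: ' . json_last_error_msg());
}

// Drop the keys and keep the values in their original order.
$values = array_values($data);
print_r($values);
```

Individual fields are also available by name, for example `$data['last']`.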
You may need to escape characters or add a forward slash to the front or back depending on your language. But it's basically:
:"([^"].*?)"
or
/:"([^"].*?)"/
I've tested this in Groovy as below and it works.
import java.util.regex.*;
String test='{"chg":"-0.71","vol":"40700","time":"11.08.2011 12:29:09","high":"1.417","low":"1.360","last":"1.400","pcl":"1.410","turnover":"56,560.25"}'
// Compile a pattern that captures the text between :" and "
Pattern p = Pattern.compile(':"([^"]*)"');
// Find each match in turn and print its captured group;
// group(1) leaves colons inside a value (such as the time) untouched
Matcher m = p.matcher(test);
while (m.find())
    System.out.println("Found comment: " + m.group(1));
Output was:
Found comment: -0.71
Found comment: 40700
Found comment: 11.08.2011 12:29:09
Found comment: 1.417
Found comment: 1.360
Found comment: 1.400
Found comment: 1.410
Found comment: 56,560.25
PHP Example
<?php
$subject = '{"chg":"-0.71","vol":"40700","time":"11.08.2011 12:29:09","high":"1.417","low":"1.360","last":"1.400","pcl":"1.410","turnover":"56,560.25"}';
$pattern = '/(?<=:")[^"]*/';
preg_match_all($pattern, $subject, $matches, PREG_OFFSET_CAPTURE);
print_r($matches);
?>
Output is:
Array ( [0] => Array ( [0] => Array ( [0] => -0.71 [1] => 8 ) [1] => Array ( [0] => 40700 [1] => 22 ) [2] => Array ( [0] => 11.08.2011 12:29:09 [1] => 37 ) [3] => Array ( [0] => 1.417 [1] => 66 ) [4] => Array ( [0] => 1.360 [1] => 80 ) [5] => Array ( [0] => 1.400 [1] => 95 ) [6] => Array ( [0] => 1.410 [1] => 109 ) [7] => Array ( [0] => 56,560.25 [1] => 128 ) ) )

preg_match return all parts in array

I've got following php code:
$match = array();
if (preg_match("%^(/\d+)(/test)(/\w+)*$%", "/25/test/t1/t2/t3/t4", $match))
print_r($match);
I'm getting this result:
Array ( [0] => /25/test/t1/t2/t3/t4 [1] => /25 [2] => /test [3] => /t4 )
What do I need to change in my regexp to get this result:
Array ( [0] => /25/test/t1/t2/t3/t4 [1] => /25 [2] => /test [3] => /t1 [4] => /t2 [5] => /t3 [6] => /t4)
You need preg_match_all:
preg_match_all( '~(/\w+)~', $str, $matches );
In your situation you can use explode too.
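A sketch of the explode variant, assuming the path always starts with a slash and the leading slash should be kept on every part as in the desired output:

```php
<?php
$str = '/25/test/t1/t2/t3/t4';

// Drop the leading slash so explode does not produce an empty first element.
$segments = explode('/', ltrim($str, '/'));

// Put the slash back in front of each segment.
$parts = array_map(function ($segment) {
    return '/' . $segment;
}, $segments);

print_r($parts);
```

Unlike the regex, this does no validation of the segments, so it only fits when the input format is already trusted.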
<?php
$str = '/a/b/1/2/3/4';
if(preg_match('/^(\/\w+)*$/', $str) && preg_match_all('/\/\w+/', $str, $matches)) {
$matches = $matches[0];
print_r($matches);
}
?>
Prints:
Array
(
[0] => /a
[1] => /b
[2] => /1
[3] => /2
[4] => /3
[5] => /4
)
Using your original example, you could use a recursive expression:
"%(/\w+)(?>[^(/\w+)]?|(?R))%"
This works by matching (/\w+) subexpressions in turn. Therefore the match for
"/a/b/1/2/3/4"
Would be:
Array
(
[0] => Array
(
[0] => /a [1] => /b [2] => /1 [3] => /2 [4] => /3 [5] => /4
)
...
However your later examples complicate things. A simple 0 or more match will only return the last (greedy) or first (ungreedy) match - not all submatches. preg_match_all won't be able to handle your dynamic expression.
You will have to clarify what you're trying to achieve in more detail before a suitable solution can be provided.
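One way around the "only the last repetition is captured" behaviour is to do it in two steps: validate and capture the whole repeated tail as one group, then break that tail up with preg_match_all. This is only a sketch under the assumption that the path always has the /number/test prefix from the question; `$tail` and `$result` are names made up for the example.

```php
<?php
$path = '/25/test/t1/t2/t3/t4';

// Step 1: validate the shape and capture the repeated part as a single group.
if (preg_match('%^(/\d+)(/test)((?:/\w+)*)$%', $path, $m)) {
    // Step 2: split the captured tail into its individual segments.
    preg_match_all('%/\w+%', $m[3], $tail);

    // Rebuild the layout asked for: full match, prefix groups, then segments.
    $result = array_merge([$m[0], $m[1], $m[2]], $tail[0]);
    print_r($result);
}
```

The first pattern keeps the strict validation of the original expression, while the second does the enumeration that a single repeated group cannot.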
