Regex is dividing a single statement? - php

I am having a difficult time with regex: I am getting a separate "revenue" match that lacks the preceding part of the pattern.
$string = "FY2013 EPS, FQ 2012 revenue";
preg_match_all("/F[Y|Q]\s?\d{4}\sEPS|revenue/", $string, $matches);
print_r($matches);
Result:
Array ( [0] => Array ( [0] => FY2013 EPS [1] => revenue ) )
What I was expecting:
Array ( [0] => Array ( [0] => FY2013 EPS [1] => FQ 2012 revenue ) )

Try this:
$string = "FY2013 EPS, FQ 2012 revenue";
preg_match_all("/F[YQ]\s?\d{4}\s(?:EPS|revenue)/", $string, $matches);
print_r($matches);

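Your pattern reads as "F[Y|Q]\s?\d{4}\sEPS" OR "revenue", because | applies to everything on either side of it. You want to use:
"/F[YQ]\s?\d{4}\s(?:EPS|revenue)/"
where (?:...) denotes a non-capturing group. Note that [Y|Q] also matches a literal |, so [YQ] is enough.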

You need the alternation to apply to EPS|revenue specifically, not to revenue versus everything else. All together:
/F[YQ]\s?\d{4}\s(?:EPS|revenue)/
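The difference between the ungrouped and grouped alternation can be seen side by side. This is a minimal sketch using the string from the question; the variable names $ungrouped and $grouped are only for illustration, and [YQ] is used instead of [Y|Q] because | is a literal character inside a character class.

```php
<?php
$string = "FY2013 EPS, FQ 2012 revenue";

// Ungrouped: "|" splits the whole pattern in two, so "revenue" matches on its own.
preg_match_all('/F[YQ]\s?\d{4}\sEPS|revenue/', $string, $ungrouped);

// Grouped: the alternation only covers the trailing word.
preg_match_all('/F[YQ]\s?\d{4}\s(?:EPS|revenue)/', $string, $grouped);

print_r($ungrouped[0]); // "FY2013 EPS" and "revenue"
print_r($grouped[0]);   // "FY2013 EPS" and "FQ 2012 revenue"
```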

Try this:
preg_match_all("/(F[YQ]\s?\d{4}\s(EPS|revenue))/", $string, $matches);
This produces
Array
(
[0] => Array
(
[0] => FY2013 EPS
[1] => FQ 2012 revenue
)
[1] => Array
(
[0] => FY2013 EPS
[1] => FQ 2012 revenue
)
[2] => Array
(
[0] => EPS
[1] => revenue
)
)
for me

Related

Parsing digit which has three letters with preg_match

I have a string from which I need to parse the numbers that are followed by three letters, and I want to do it with a single pattern using preg_match.
Here is my code; can anybody help me out?
$string=" AMOUNT - 10.00CAD 0.50XGA 1.00XQA";
if(preg_match('/^\s+AMOUNT\s+\-\s+\d+[.]\d+[A-Z]{3}\s+((?J)(?<amount>\d+[.]\d+)(XGA)?(?J)\s+(?<amount>\d+[.]\d+)(XQA))/',$string,$m))
{
print_r($m);
}
I'd use preg_match_all like this:
$string=" AMOUNT - 10.00CAD 0.50XGA 1.00XQA";
if(preg_match_all('/^\s+AMOUNT\s+-\s+(*SKIP)(*F)|(?<amount>\d+\.\d+)[A-Z]{3}\b/', $string, $m)) {
    print_r($m);
}
Output:
Array
(
[0] => Array
(
[0] => 10.00CAD
[1] => 0.50XGA
[2] => 1.00XQA
)
[amount] => Array
(
[0] => 10.00
[1] => 0.50
[2] => 1.00
)
[1] => Array
(
[0] => 10.00
[1] => 0.50
[2] => 1.00
)
)
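If the three-letter code is needed alongside each amount, a second named group can capture it. The following is only a sketch under that assumption: the group name code and the $amounts array are not part of the original answer.

```php
<?php
$string = " AMOUNT - 10.00CAD 0.50XGA 1.00XQA";

// PREG_SET_ORDER yields one array per match, so the named groups stay together.
preg_match_all('/(?<amount>\d+\.\d+)(?<code>[A-Z]{3})\b/', $string, $m, PREG_SET_ORDER);

// Index the amounts by their three-letter code.
$amounts = [];
foreach ($m as $match) {
    $amounts[$match['code']] = $match['amount'];
}
print_r($amounts); // CAD => 10.00, XGA => 0.50, XQA => 1.00
```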

Obtain specific data with preg_match_all

I have different texts which aren't well formatted, so I need a pattern that works with all of them and returns some specific elements (text) from each. Let's say I have this text:
"AL TEST232 KW 12*/13*/17 TEST kw16TEST123 kw 15*"
and I want my preg_match_all() to return something like this:
Array
(
[0] => Array
(
[0] => AL TEST232
[1] => 12/13/17
)
[1] => Array
(
[0] => TEST
[1] => 16
)
[2] => Array
(
[0] => TEST123
[1] => 15
)
)
Is this possible with a single pattern?
You can use:
preg_match_all('~(\w[\s\w]*?\w)\s*kw\s*([\d/*]+)~i', $input, $matches);
The i flag makes the match case-insensitive, since the text contains both "KW" and "kw".
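To get the per-match layout shown in the question, the pattern can be combined with PREG_SET_ORDER and the asterisks stripped afterwards. This is a sketch, not the answerer's code: the i flag is assumed so that "KW" matches as well, and $rows is an illustrative name.

```php
<?php
$input = "AL TEST232 KW 12*/13*/17 TEST kw16TEST123 kw 15*";

// One array per match: [full match, name, numbers].
preg_match_all('~(\w[\s\w]*?\w)\s*kw\s*([\d/*]+)~i', $input, $matches, PREG_SET_ORDER);

$rows = [];
foreach ($matches as $match) {
    // Keep only the two captured groups and drop the "*" markers.
    $rows[] = [$match[1], str_replace('*', '', $match[2])];
}
print_r($rows);
```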

Regex with unknown character length

Simple question for you folks.
Sorry that I have to ask it.
On my website, I want to use signatures at "random" places in my text. The problem is that there could be multiple different signatures in a given string.
The signature code is ~~USERNAME~~
So anything like
~~timtj~~
~~foobar~~
~~totallylongusername~~
~~I-d0n't-us3-pr0p3r-ch#r#ct3r5~~
I have tried using preg_match for this, with no success. I understand that the third parameter is used to store the matches, but I cannot get a proper match because of the format.
Should I not use preg_match, or is it simply not possible to use signatures in this manner?
You could use preg_match_all with this modified regex:
preg_match_all('/~~(.*?)~~/', $str, $matches);
The code:
<?php
$str="~~I-d0n't-us3-pr0p3r-ch#r#ct3r5~~";
preg_match_all('/~~(.*?)~~/', $str, $matches);
print_r($matches[1]);
OUTPUT :
Array
(
[0] => I-d0n't-us3-pr0p3r-ch#r#ct3r5
)
This should work, as long as usernames don't contain ~~:
preg_match_all('!~~(.*?)~~!', $str, $matches);
Output:
Array
(
[0] => Array
(
[0] => ~~timtj~~
[1] => ~~foobar~~
[2] => ~~totallylongusername~~
[3] => ~~I-d0n't-us3-pr0p3r-ch#r#ct3r5~~
)
[1] => Array
(
[0] => timtj
[1] => foobar
[2] => totallylongusername
[3] => I-d0n't-us3-pr0p3r-ch#r#ct3r5
)
)
The first sub array contains the complete matched strings and the other sub arrays contain the matched groups.
You could change the order by using the flag PREG_SET_ORDER, see http://php.net/preg_match_all#refsect1-function.preg-match-all-parameters
<?php
$str = "~~timtj~~ ~~foobar~~ ~~totallylongusername~~ ~~I-d0n't-us3-pr0p3r-ch#r#ct3r5~~";
preg_match_all("!~~(.*?)~~!", $str, $matches, PREG_SET_ORDER);
print_r($matches);
This code produces the following output
Array
(
[0] => Array
(
[0] => ~~timtj~~
[1] => timtj
)
[1] => Array
(
[0] => ~~foobar~~
[1] => foobar
)
[2] => Array
(
[0] => ~~totallylongusername~~
[1] => totallylongusername
)
[3] => Array
(
[0] => ~~I-d0n't-us3-pr0p3r-ch#r#ct3r5~~
[1] => I-d0n't-us3-pr0p3r-ch#r#ct3r5
)
)
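If the goal is to replace each signature inside the surrounding text instead of only collecting them, preg_replace_callback may fit. This is a rough sketch: renderSignature() and its HTML output are hypothetical, since the question does not say how a signature should be displayed.

```php
<?php
// Hypothetical helper: the markup produced here is an assumption.
function renderSignature(string $username): string
{
    return '<span class="signature">' . htmlspecialchars($username, ENT_QUOTES) . '</span>';
}

$text = "Posted by ~~timtj~~ and reviewed by ~~foobar~~.";

// Each ~~USERNAME~~ occurrence is replaced by the helper's return value.
$html = preg_replace_callback(
    '/~~(.*?)~~/',
    function (array $m): string {
        return renderSignature($m[1]);
    },
    $text
);
echo $html, PHP_EOL;
```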

What is the regex for the text between quotes?

Ok, I have tried looking at other answers, but couldn't get mine solved. So here is the code:
{"chg":"-0.71","vol":"40700","time":"11.08.2011 12:29:09","high":"1.417","low":"1.360","last":"1.400","pcl":"1.410","turnover":"56,560.25"}
I need to get every second value in the quotes (as the "name" values are constant). I worked out that I need to get the text between :" and ", but I can't manage to write a regex for that.
EDIT: I'm doing preg_match_all in PHP. And it's between :" and ", not " and " as someone else edited.
Why on earth would you attempt to parse JSON with regular expressions? PHP already parses JSON properly, with built-in functionality.
Code:
<?php
$input = '{"chg":"-0.71","vol":"40700","time":"11.08.2011 12:29:09","high":"1.417","low":"1.360","last":"1.400","pcl":"1.410","turnover":"56,560.25"}';
print_r(json_decode($input, true));
?>
Output:
Array
(
[chg] => -0.71
[vol] => 40700
[time] => 11.08.2011 12:29:09
[high] => 1.417
[low] => 1.360
[last] => 1.400
[pcl] => 1.410
[turnover] => 56,560.25
)
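Since the question asks only for the values, the decoded array can be reduced with array_values. A minimal sketch, assuming the input is always valid JSON with string values ($values is an illustrative name):

```php
<?php
$input = '{"chg":"-0.71","vol":"40700","time":"11.08.2011 12:29:09","high":"1.417","low":"1.360","last":"1.400","pcl":"1.410","turnover":"56,560.25"}';

// Decode to an associative array, then drop the keys.
$values = array_values(json_decode($input, true));
print_r($values);
```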
You may need to escape characters or add a forward slash to the front and back depending on your language, but it's basically:
:"([^"]*)"
or
/:"([^"]*)"/
I've tested this in Groovy as below and it works.
import java.util.regex.*;
String test='{"chg":"-0.71","vol":"40700","time":"11.08.2011 12:29:09","high":"1.417","low":"1.360","last":"1.400","pcl":"1.410","turnover":"56,560.25"}'
// Compile a pattern that captures the text between :" and the next "
Pattern p = Pattern.compile(':"([^"]*)"');
// Find every match in the input
Matcher m = p.matcher(test);
while (m.find())
    // group(1) is the captured value, so colons inside the value are kept
    System.out.println("Found comment: "+m.group(1));
Output was:
Found comment: -0.71
Found comment: 40700
Found comment: 11.08.2011 12:29:09
Found comment: 1.417
Found comment: 1.360
Found comment: 1.400
Found comment: 1.410
Found comment: 56,560.25
PHP Example
<?php
$subject = '{"chg":"-0.71","vol":"40700","time":"11.08.2011 12:29:09","high":"1.417","low":"1.360","last":"1.400","pcl":"1.410","turnover":"56,560.25"}';
$pattern = '/(?<=:")[^"]*/';
preg_match_all($pattern, $subject, $matches, PREG_OFFSET_CAPTURE);
print_r($matches);
?>
Output is:
Array ( [0] => Array ( [0] => Array ( [0] => -0.71 [1] => 8 ) [1] => Array ( [0] => 40700 [1] => 22 ) [2] => Array ( [0] => 11.08.2011 12:29:09 [1] => 37 ) [3] => Array ( [0] => 1.417 [1] => 66 ) [4] => Array ( [0] => 1.360 [1] => 80 ) [5] => Array ( [0] => 1.400 [1] => 95 ) [6] => Array ( [0] => 1.410 [1] => 109 ) [7] => Array ( [0] => 56,560.25 [1] => 128 ) ) )

preg_match return all parts in array

I've got following php code:
$match = array();
if (preg_match("%^(/\d+)(/test)(/\w+)*$%", "/25/test/t1/t2/t3/t4", $match))
print_r($match);
I'm getting this result:
Array ( [0] => /25/test/t1/t2/t3/t4 [1] => /25 [2] => /test [3] => /t4 )
What do I need to change in my regexp to get this result:
Array ( [0] => /25/test/t1/t2/t3/t4 [1] => /25 [2] => /test [3] => /t1 [4] => /t2 [5] => /t3 [6] => /t4)
You need preg_match_all:
preg_match_all( '~(/\w+)~', $str, $matches );
In your situation you can use explode too.
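The explode approach could look like the sketch below. It assumes the path always starts with a single leading slash, which is stripped first so that no empty first element appears; $parts is an illustrative name.

```php
<?php
$str = '/25/test/t1/t2/t3/t4';

// Remove the leading slash, then split on the remaining ones.
$parts = explode('/', ltrim($str, '/'));
print_r($parts); // 25, test, t1, t2, t3, t4
```

Unlike the regex version, this does not validate the segments or keep the slashes.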
<?php
$str = '/a/b/1/2/3/4';
if(preg_match('/^(\/\w+)*$/', $str) && preg_match_all('/\/\w+/', $str, $matches)) {
$matches = $matches[0];
print_r($matches);
}
?>
Prints:
Array
(
[0] => /a
[1] => /b
[2] => /1
[3] => /2
[4] => /3
[5] => /4
)
Using your original example, you could use a recursive expression:
"%(/\w+)(?>[^(/\w+)]?|(?R))%"
This works by matching (/\w+) subexpressions in turn. Therefore the match for
"/a/b/1/2/3/4"
would be:
Array
(
[0] => Array
(
[0] => /a [1] => /b [2] => /1 [3] => /2 [4] => /3 [5] => /4
)
...
However your later examples complicate things. A simple 0 or more match will only return the last (greedy) or first (ungreedy) match - not all submatches. preg_match_all won't be able to handle your dynamic expression.
You will have to clarify what you're trying to achieve in more detail before a suitable solution can be provided.
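One way to get the array layout asked for in the question is to validate the whole path first and then collect the segments in a second pass. This is a sketch under the assumption that the path format is exactly the one in the question; $path, $parts and $result are illustrative names.

```php
<?php
$path = '/25/test/t1/t2/t3/t4';
$result = [];

// Step 1: validate the overall shape (a repeated group would only keep its last iteration).
if (preg_match('%^/\d+/test(?:/\w+)*$%', $path)) {
    // Step 2: collect every "/segment" separately.
    preg_match_all('%/\w+%', $path, $parts);
    // Put the full string first, as preg_match would.
    $result = array_merge([$path], $parts[0]);
}
print_r($result);
```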
