Trying to parse numbers with regex - php

I'm trying to parse all numbers from a text:
text 2030 text 2,5 text 2.000.000 2,000,000 -200 +31600000000. 200. 2.5 200? 1:200
Based on this regex:
(?<!\S)(\-?|\+?)(\d*\.?\,?\d+|\d{1,3}(,?.?\d{3})*(\.\,\d+)?)(?!\S)
But numbers followed directly by ., ?, ! or , are not matched. I only want full matches with preg_match_all.
I guess the problem is in the last part of my regex, (?!\S). I tried different things but I can't figure out how to solve this.

If we don't need to validate the numbers, we could start with a simple expression, something like:
(?:^|\s)([+-]?[\d:.,]*\d)\b
Test
$re = '/(?:^|\s)([+-]?[\d:.,]*\d)\b/s';
$str = 'text 2030 text 2,5 text 2.000.000 2,000,000 -200 +31600000000. 200. 2.5 200? 1:200
';
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
var_dump($matches);
EDITS:
Another expression would be:
(?:^|\s)\K[+-]?(?:\d+:\d+|\d+(?:[.,]\d{1,3})+|\d+)\b
which still would not validate the numbers; it just collects the forms listed, including some invalid numbers.
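On the lookahead from the question: (?!\S) requires whitespace or the end of the string right after the number, so a number followed by "." or "?" is rejected. One possible adjustment, assuming a single trailing ., ,, ? or ! should be tolerated but not captured, is to allow it inside the lookahead. The simplified pattern and variable names below are illustrative only and are not the original expression:

```php
<?php
// Sample text from the question.
$str = 'text 2030 text 2,5 text 2.000.000 2,000,000 -200 +31600000000. 200. 2.5 200? 1:200';

// Assumed, simplified pattern (it does not validate number formats):
//   (?<!\S)                the number starts at the beginning of a token
//   [+-]?\d+(?:[.,:]\d+)*  optional sign, digits, then separator+digits groups
//   (?=[.,?!]?(?!\S))      optionally one punctuation mark, then whitespace or end
$re = '/(?<!\S)[+-]?\d+(?:[.,:]\d+)*(?=[.,?!]?(?!\S))/';

preg_match_all($re, $str, $matches);
$numbers = $matches[0];
print_r($numbers);
```

Because the punctuation sits inside a lookahead, it is checked but never becomes part of the match, so "200." yields 200.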

Related

Regex in conditional or with exact number of digits

I've been struggling to write a regex that uses the or operator.
For example, given the following strings:
Allowed numbers: 1, 2, 5, 6, 20
"/path/item/1"
"/path/item/2"
"/path/item/5"
etc
The regex that I have been testing is:
"/\/path\/item\/(1|2|5|6|20)/"
What I want is for the regex to return true only if the number is 1, 2, 5, 6, etc.
But for the number 20, the regex returns true for 2 and not for 20.
How can I validate each value independently? That is, it should be true for 2 only when the value is 2 and not 20, and true for 20 only when the value is 20 and not 2.
What would the regex for this validation look like?
Example
You need to restrict the search such that the matched digits bring you to the end of the string:
"/\/path\/item\/(1|2|5|6|20)$/"
This means the digits must match exactly, and it does not require re-ordering the permitted values in your regex.
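A minimal sketch of that anchored pattern in use. The sample paths are hypothetical and only serve to show which inputs pass:

```php
<?php
// Pattern from the answer above: the trailing $ forces the digits to end the string.
$pattern = '/\/path\/item\/(1|2|5|6|20)$/';

// Hypothetical sample paths for illustration.
$paths = ['/path/item/2', '/path/item/20', '/path/item/200', '/path/item/3'];

$results = [];
foreach ($paths as $path) {
    // preg_match returns 1 on a match and 0 when nothing matches.
    $results[$path] = preg_match($pattern, $path) === 1;
}
var_dump($results);
```

For /path/item/20 the engine first tries 2, fails at $ because a 0 follows, and then backtracks to the 20 alternative, which is why the order of the alternatives does not matter here.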
The key is to list the longer numbers first in the capturing or non-capturing group, such as:
^\/path\/item\/(20|1|2|5|6)$
or
^\/path\/item\/(?:20|1|2|5|6)$
or
\/path\/item\/(?:20|1|2|5|6)
Test
$re = '/^\/path\/item\/(20|1|2|5|6)$/s';
$str = '/path/item/20';
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
var_dump($matches);
The problem with your pattern is that, for 20, the alternative 2 matches first and the match ends there, so the trailing 0 is simply ignored. This can be resolved by listing 20 first, like this:
\/path\/item\/(20|1|2|5|6)\/
View Here: https://regex101.com/r/aJf1Q8/1
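To make the effect of the ordering visible, here is a small sketch comparing the captured group of the original pattern with a reordered one, both without anchors. The variable names are illustrative, and the /path/item/200 input is an assumed extra case:

```php
<?php
$unordered = '/\/path\/item\/(1|2|5|6|20)/'; // pattern from the question
$ordered   = '/\/path\/item\/(20|1|2|5|6)/'; // 20 listed first, still unanchored

preg_match($unordered, '/path/item/20', $a);
preg_match($ordered, '/path/item/20', $b);
preg_match($ordered, '/path/item/200', $c);

echo $a[1], "\n"; // group captured by the original ordering
echo $b[1], "\n"; // group captured once 20 comes first
echo $c[1], "\n"; // without an end anchor, a longer number still matches
```

Reordering fixes which alternative is captured, but only an end anchor (or a required delimiter after the digits) rejects inputs such as /path/item/200.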

PHP Regex to get text between 2 words with numbers

I'm trying to get the string between two words in an entire string:
Ex.:
My string:
...'Total a Facturar 123,061 221,063 26,860161,16080,580310,760 358,297 Recepcionado'...
I'm using
/(?<=Total a Facturar )(.*?) Recepcionado/
I need the highlighted characters (26,860161,16080,580310,760),
but I get 123,061 221,063 26,860161,16080,580310,760 358,297 Recepcionado with my pattern.
The numbers in the string are always different; I need the numbers that are joined together without a space.
Thanks
EDIT:
Here is the entire string: eval.in/802292
I hope this will be helpful
Regex: (?:\d+(?:\,\d+){2,})
For the string above you can also use a fixed repetition count: (?:\d+(?:\,\d+){4})
1. (?:\d+) matches one or more digits.
2. (?:\,\d+){2,} matches a comma followed by digits, repeated two or more times.
PHP code:
<?php
ini_set('display_errors', 1);
$string = "Total a Facturar 123,061 221,063 26,860161,16080,580310,760 358,297 Recepcionado";
preg_match("#(?:\d+(?:\,\d+){2,})#", $string, $matches);
print_r($matches);
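If the run of comma-joined numbers should only be accepted when it sits between the two words, one possible variant is to keep both words in the pattern and capture the run in a group. This is a sketch under that assumption; the pattern and the $run variable are not part of the original answer:

```php
<?php
$string = "Total a Facturar 123,061 221,063 26,860161,16080,580310,760 358,297 Recepcionado";

// Lazy .*? skips ahead to the first token with at least two ",digits" groups,
// which is captured; the rest is consumed up to " Recepcionado".
$re = '/Total a Facturar .*?(\d+(?:,\d+){2,}).*? Recepcionado/';

// $run holds the captured run, or null when the pattern does not match.
$run = preg_match($re, $string, $m) === 1 ? $m[1] : null;
echo $run, "\n";
```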

Grabbing number next to a dollar sign with optional thousands and decimals

I am trying to grab a number that can be in the format $5,000.23 as well as say, $22.43 or $3,000
Here's my regular expression, this is in PHP.
preg_match('/\$([0-9]+)([\.,]*)?([0-9]*)?([\.])?([0-9]*)?/', $blah, $blah2);
It seems to match numbers in the format $5,500.23 perfectly fine, however it doesn't seem to match any other numbers well, like $0.
How do I make everything optional? Shouldn't grouping () and using a question mark do that?
This should do the trick:
\$[\d,.]*[\d]
Specific PHP Example:
$re = "/\\$[\\d,.]*[\\d]/";
$str = "\$1 klsjdfgsjdfg \$100 kjdfhglsjdfg \$1,000 jljsdfg \$1,000.00 ldfjhsdf";
preg_match_all($re, $str, $matches);
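As a quick check against the formats from the question, the same expression can be given a capture group so that only the number (without the dollar sign) is returned. The sample sentence and the added group are assumptions for illustration:

```php
<?php
// Hypothetical sample text containing the formats mentioned in the question.
$str = 'Prices: $5,000.23, $22.43, $3,000 and $0.';

// Same idea as above, with a capture group around the number itself.
// Requiring a final \d keeps trailing punctuation out of the match.
$re = '/\$([\d,.]*\d)/';

preg_match_all($re, $str, $matches);
$amounts = $matches[1];
print_r($amounts);
```

Note that this only extracts the text of each amount; it does not check that the commas and dots are in sensible positions.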

PHP Regex to extract phone number from + to first non-number

EDIT: I'm doing this because the data I've been provided has hundreds of newline-separated entries in this format, and I need to incorporate microformats into the address data. Thus if the string provided is as below, I need to output:
<p><span itemprop="telephone">+1 800 123 456</span> (toll free) from overseas</p>
--
I need a regex to extract a phone number from the format below:
+1 800 123 456 (toll free) from overseas
The data I have been provided has consistently been entered in this format, so effectively I need a regex to get everything from and including a "+" up to the first non-numerical character.
If you want to use a regex you can use something like this:
\+[\d\s]+
$re = '/(\+[\d\s]+)/';
$str = "+1 800 123 456 (toll free) from overseas\n";
preg_match_all($re, $str, $matches);
Alternatively, as castis suggested, you can use preg_replace to replace the characters you don't want with an empty string and keep the rest, like:
preg_replace('/[\D\+]/', '', $phone_number);
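For the microformat output described in the question, a possible sketch is to wrap the matched number with preg_replace and a backreference. The stricter pattern here (digit groups separated by single whitespace characters, anchored at the start of the line) is an assumption chosen so that the trailing space is not captured:

```php
<?php
$line = '+1 800 123 456 (toll free) from overseas';

// Assumed pattern: "+" and digits, followed by whitespace-separated digit groups.
// $1 in the replacement refers to the captured phone number.
$html = '<p>' . preg_replace(
    '/^(\+\d+(?:\s\d+)*)/',
    '<span itemprop="telephone">$1</span>',
    $line
) . '</p>';

echo $html, "\n";
```

For newline-separated entries, the same call can be applied to each line in a loop.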

Regular expression in php to find roman numerals

I use PHP to highlight all the Roman numerals in a string.
For example:
Protocol XXXIV/14 from session...
Protocol XXIX/13 from session...
Protocol XXXV/13 from session...
So I've found a perfect example on http://regexr.com/2uhln. It works well for the above examples, but when I try to use it in PHP, it stops working.
My PHP code is
$subject = "Protocol XXXV/13 from session...";
$pattern ='/(?:XL|L|L?(?:IX|X{1,3}|X{0,3}(?:IX|IV|V|V?I{1,3})))/';
preg_match($pattern,$subject,$matches);
It outputs just 1-3 characters of the Roman numeral, so
XXXIV - gives XXX
XXIX - gives XX
XXXV - gives XXX
I have two questions:
What is wrong, and how do I fix it?
How can I modify the regular expression from http://regexr.com/2uhln to work for all Roman numerals up to one hundred (Roman C)? It doesn't work for e.g. XLVII, XLVI, XLV.
Change the order of your pattern: place the longest alternative first, then the medium one, and finally the shortest, i.e. long|medium|short, so that the longest string is tried first.
$re = "~L?(?:X{0,3}(?:IX|IV|V|V?I{1,3})|IX|X{1,3})|XL|L~m";
$str = "Protocol XXXIV/14 from session...\nProtocol XXIX/13 from session...\nProtocol XXXV/13 from session...";
preg_match_all($re, $str, $matches);
print_r($matches);
Update:
\b(?:X?L?(?:X{0,3}(?:IX|IV|V|V?I{1,3})|IX|X{1,3})|XL|L)\b
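Since the goal in the question is highlighting, the updated expression can be combined with preg_replace, where $0 stands for the whole match. This is a sketch: the <b> tag and the second sample line (using XLVII from the question) are assumptions:

```php
<?php
// Updated expression from the answer above.
$re = '/\b(?:X?L?(?:X{0,3}(?:IX|IV|V|V?I{1,3})|IX|X{1,3})|XL|L)\b/';

// The second subject is a hypothetical line built from a numeral in the question.
$subjects = [
    'Protocol XXXIV/14 from session...',
    'Protocol XLVII/15 from session...',
];

$highlighted = [];
foreach ($subjects as $subject) {
    // Wrap each matched numeral; $0 is the full match.
    $highlighted[] = preg_replace($re, '<b>$0</b>', $subject);
}
print_r($highlighted);
```

Because the pattern is case-sensitive and bounded by \b, it also matches standalone uppercase words such as "I" or "V", which may need extra handling in ordinary English text.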
