I am trying to split a string into terms in PHP using preg_split. I need to extract normal words (\w), but also currency amounts (including the currency symbol) and numeric terms (including commas and decimal points). I cannot seem to create a valid regex for preg_split that achieves this. Can anyone help? Thanks
Why not use preg_match_all() instead of preg_split() ?
$str = '"1.545" "$143" "$13.43" "1.5b" "hello" "G9"'
. ' This is a test sentence, with some. 123. numbers'
. ' 456.78 and punctuation! signs.';
$digitsPattern = '\$?\d+(\.\d+)?';
$wordsPattern = '[[:alnum:]]+';
preg_match_all('/('.$digitsPattern.'|'.$wordsPattern.')/i', $str, $matches);
print_r($matches[0]);
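The question also asks for commas inside numbers, which the pattern above does not cover. A minimal sketch of one way to extend it, assuming that commas only appear as thousands separators and that any Unicode currency symbol (\p{Sc}) may prefix a number; the helper name extract_terms is hypothetical:

```php
<?php
// Hypothetical helper; the pattern is an illustrative assumption,
// not the answerer's original.
function extract_terms(string $text): array
{
    // Number: optional currency symbol, digits, optional comma-separated
    // thousands groups, optional decimal part.
    $number = '\p{Sc}?\d+(?:,\d{3})*(?:\.\d+)?';
    // Word: letters, digits and underscore.
    $word = '\w+';

    // Numbers are tried first so that "$1,234.56" is kept in one piece.
    preg_match_all('/' . $number . '|' . $word . '/u', $text, $matches);

    return $matches[0];
}

print_r(extract_terms('Paid $1,234.56 for 3 G9 units, total 1.5b.'));
```

As with the answer above, a term such as 1.5b is still split into 1.5 and b.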
What about using preg_match_all() with the pattern [\S]+\b? You then get an array with the words in it.
For example, running it on Big brown fox - $20.25:
$str = 'Big brown fox - $20.25';
preg_match_all('/[\S]+\b/', $str, $matches);
print_r($matches[0]);
will print
Array
(
[0] => Big
[1] => brown
[2] => fox
[3] => $20.25
)
Does it solve your problem to split on whitespace? "/\s+/"
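A small sketch of what the whitespace split would look like, assuming the same sample sentence as the previous answer. Note that punctuation stays attached to the terms and a lone dash becomes a term of its own:

```php
<?php
$str = 'Big brown fox - $20.25';

// PREG_SPLIT_NO_EMPTY drops the empty pieces that leading or trailing
// whitespace would otherwise produce.
$terms = preg_split('/\s+/', $str, -1, PREG_SPLIT_NO_EMPTY);

print_r($terms);
```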
I have a problem with RegEX. I have output like this.
Number of rooms
2
Price
120000
Square in meter
60
I’m trying to achieve this: I want to remove all text except “Number of rooms 2”. The value “2” changes. So far I have an expression like this:
<?php
$str = get_field('all');
preg_match('/ Number of rooms \s*(\d+)/' , $str, $matches);
echo $matches[1];
?>
Remove the space before the word Number:
preg_match('/Number of rooms \s*(\d+)/' , $str, $matches);
Remove the space before Number and the space after rooms in your regex:
$str = 'Number of rooms
2
Price
120000
Square in meter
60';
preg_match('/Number of rooms\s*(\d+)/' , $str, $matches);
print_r($matches);
output:
Array
(
[0] => Number of rooms
2
[1] => 2
)
As the others have said, it's the space. You can solve it by removing the space or by making it optional with *.
I would also advise using the regex modifiers i and m: i makes the match case-insensitive, and m treats the string as multiline (it changes how ^ and $ match, so it matters only if you add those anchors).
preg_match('/number of rooms\s*(\d+)/im', $str, $m);
var_dump($m);
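The same idea can be wrapped so that a missing label does not trigger an undefined-index notice. This is only a sketch: the function name field_value is hypothetical, and it assumes every value is a run of digits following its label:

```php
<?php
// Hypothetical helper: returns the number that follows $label, or null
// when the label is not found.
function field_value(string $text, string $label): ?int
{
    // preg_quote() escapes any regex metacharacters in the label.
    $pattern = '/' . preg_quote($label, '/') . '\s*(\d+)/i';

    if (preg_match($pattern, $text, $m) === 1) {
        return (int) $m[1];
    }

    return null;
}

$str = "Number of rooms\n2\nPrice\n120000\nSquare in meter\n60";
echo 'Number of rooms ' . field_value($str, 'Number of rooms'), "\n";
```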
You can try it this way:
$str = get_field('all');
$str_array = explode("\n",$str);
$new_str=$str_array[0]." ".$str_array[1];
echo $new_str;
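One caveat with explode("\n", ...) is that Windows line endings leave a stray \r on each line. A hedged variant, assuming the field may contain either kind of line ending; the function name first_two_lines is hypothetical:

```php
<?php
// Hypothetical helper: joins the first two lines of $text with a space.
function first_two_lines(string $text): string
{
    // \R matches \n, \r\n and \r, so both Unix and Windows endings work.
    $lines = preg_split('/\R/', trim($text));

    // The second line may be missing, so fall back to an empty string.
    return trim($lines[0] . ' ' . ($lines[1] ?? ''));
}

echo first_two_lines("Number of rooms\r\n2\r\nPrice\r\n120000"), "\n";
```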
I want to separate my sentence(s) into two parts, because they are made of English letters and non-English letters. I am using a regex in preg_split to get the normal letters and characters. However, it does the opposite: I am left with only the Japanese and not the English.
String I work with:
すぐに諦めて昼寝をするかも知れない。 I may give up soon and just nap instead.
My attempt:
$parts = preg_split("/[ -~]+$/", $cleanline); // $cleanline is the string above
print_r($parts);
My result
Array ( [0] => すぐに諦めて昼寝をするかも知れない。 [1] => )
As you can see, I do get an empty second value. How can I get both the English and the non-English text into two different strings? Why is the English text not returning even if I use correct regex (from what I've been testing)?
You could use lookarounds to split on the boundary between non-alphabetic characters and alphabetic characters or spaces:
$str = 'すぐに諦めて昼寝をするかも知れない。 I may give up soon and just nap instead.';
$parts = preg_split("/(?<=[^a-z])(?=[a-z\h])|(?<=[a-z\h])(?=[^a-z])/i", $str, 2);
print_r($parts);
Output:
Array
(
[0] => すぐに諦めて昼寝をするかも知れない。
[1] => I may give up soon and just nap instead.
)
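On why the original attempt returned an empty second element: preg_split() returns the pieces between matches and discards the matched text itself, so the English run, which is exactly what /[ -~]+$/ matches, is thrown away. A minimal sketch that keeps it, assuming the English part is always a trailing run of printable ASCII:

```php
<?php
$line = 'すぐに諦めて昼寝をするかも知れない。 I may give up soon and just nap instead.';

// The parentheses make the trailing ASCII run a captured delimiter, which
// PREG_SPLIT_DELIM_CAPTURE returns as its own element.
// PREG_SPLIT_NO_EMPTY drops the empty piece after the delimiter.
$parts = preg_split(
    '/([ -~]+)$/',
    $line,
    -1,
    PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
);

// The captured run starts with the separating space, so trim both parts.
$parts = array_map('trim', $parts);

print_r($parts);
```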
Try mb_split instead of the preg_split function. Note that mb_split takes its pattern without delimiters.
mb_regex_encoding('UTF-8');
mb_internal_encoding("UTF-8");
$parts = mb_split('[ -~]+$', $cleanline);
If you have two spaces between the two strings as shown in your example, you can split them easily with a simple \s{2} :
<?php
$s = "すぐに諦めて昼寝をするかも知れない。  I may give up soon and just nap instead.";
$s = preg_split("/\s{2}/", $s);
print_r($s);
?>
Output:
Array
(
[0] => すぐに諦めて昼寝をするかも知れない。
[1] => I may give up soon and just nap instead.
)
Demo: http://ideone.com/uD2W1Q
I want to find a phone number in a whole sentence using PHP. It can be any number with a pattern like (122) 221-2172, 122-221-2172 or (122)-221-2172. I don't know in which part of the sentence the number appears; otherwise I could use substr.
$text = 'foofoo 122-221-2172 barbar 122 2212172 foofoo ';
$text .= ' 122 221 2172 barbar 1222212172 foofoo 122-221-2172';
$matches = array();
// returns all results in array $matches
preg_match_all('/[0-9]{3}[\-][0-9]{7}|[0-9]{3}[\s][0-9]{7}|[0-9]{3}[\s][0-9]{3}[\s][0-9]{4}|[0-9]{10}|[0-9]{3}[\-][0-9]{3}[\-][0-9]{4}/', $text, $matches);
$matches = $matches[0];
var_dump($matches);
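The alternatives above do not cover the parenthesised forms from the question. A more compact sketch, assuming 10-digit numbers grouped 3-3-4 with an optional dash or whitespace between groups; the sample sentence is made up for illustration:

```php
<?php
$text = 'Call (122) 221-2172 or 122-221-2172, or (122)-221-2172 after 5pm.';

// Optional parentheses around the first group, an optional dash or
// whitespace between groups, and lookarounds so that a longer run of
// digits is not matched in part.
$pattern = '/(?<!\d)\(?\d{3}\)?[-\s]?\d{3}[-\s]?\d{4}(?!\d)/';

preg_match_all($pattern, $text, $matches);

print_r($matches[0]);
```

This pattern is deliberately loose: it would also accept an unbalanced parenthesis such as (122-221-2172, so tighten it if that matters.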
You can use regular expressions to solve this. I'm not 100% sure of the PHP syntax, but I imagine it would look something like:
$pattern = '/^\(?\d{3}\)?-\d{3}-\d{4}/';
^ says "begins with" (remove it if the number can appear anywhere in the sentence)
\( escapes the (
\(? says 0 or 1 (
\d{x} says exactly x digits
You may also want to check out Using Regular Expressions with PHP
I've got a comma delimited string of id's coming in and I need some quick way to split them into an array.
I know that I could hardcode it, but that's just gross and pointless.
I know nothing about regex at all and I can't find a SIMPLE example anywhere on the internet, only huge tutorials trying to teach me how to master regular expressions in 2 hours or something.
fgetcsv is only applicable for a file and str_getcsv is only available in PHP 5.3 and greater.
So, am I going to have to write this by hand or is there something out there that will do it for me?
I would prefer a simple regex solution with a little explanation as to why it does what it does.
$string = "1,3,5,9,11";
$array = explode(',', $string);
See explode()
Returns an array of strings, each of which is a substring of string formed by splitting it on boundaries formed by the string delimiter.
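Any problem with the normal split function? (Note that split() was deprecated in PHP 5.3 and removed in PHP 7.0, so on current versions use explode() or preg_split() instead.)
$array = explode(',', 'One,Two,Three');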
will give you
Array
(
[0] => One
[1] => Two
[2] => Three
)
If you want to just split on commas:
$values = explode(",", $string);
If you also want to get rid of whitespace around the commas (e.g. your string is 1, 3, 5):
$values = preg_split('/\s*,\s*/', $string);
If you want to be able to have commas in your string when surrounded by quotes, (eg: first, "se,cond", third)
$regex = <<<ENDOFREGEX
/ " ( (?:[^"\\\\]++|\\\\.)*+ ) \"
| ' ( (?:[^'\\\\]++|\\\\.)*+ ) \'
| ,+
/x
ENDOFREGEX;
$values = preg_split($regex, $string, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
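Where PHP 5.3 or later is available, str_getcsv() (mentioned in the question) handles quoted fields without a hand-written regex. A minimal sketch, assuming double quotes as the enclosure and no whitespace around the commas:

```php
<?php
$line = 'first,"se,cond",third';

// Delimiter, enclosure and escape character are passed explicitly.
$values = str_getcsv($line, ',', '"', '\\');

print_r($values);
```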
A simple regular expression should do the trick.
$a_ids = preg_split('%,%', $ids);
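Since the input is a list of ids, it may help to convert the pieces to integers as well. A sketch under the assumption that every id is a plain integer; the function name parse_ids is hypothetical:

```php
<?php
// Hypothetical helper: turns "1, 3,5" into [1, 3, 5].
function parse_ids(string $csv): array
{
    // Split on commas with optional surrounding whitespace and skip
    // empty pieces (for example from an empty string or ",,").
    $parts = preg_split('/\s*,\s*/', trim($csv), -1, PREG_SPLIT_NO_EMPTY);

    return array_map('intval', $parts);
}

print_r(parse_ids('1,3,5,9,11'));
```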
I have:
stackoverflow.com/.../link/Eee_666/9_uUU/66_99U
What regex would match /Eee_666/9_uUU/66_99U?
Eee_666, 9_uUU, and 66_99U are random values.
How can I solve it?
As simple as that:
$link = "stackoverflow.com/.../link/Eee_666/9_uUU/66_99U";
$regex = '~link/([^/]+)/([^/]+)/([^/]+)~';
# captures anything that is not a / in three different groups
preg_match_all($regex, $link, $matches);
print_r($matches);
Be aware though that it eats up any character except the / (including newlines), so you either want to exclude other characters as well or feed the engine only strings in your format.
See a demo on regex101.com.
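For a single URL, preg_match() is enough, and whitespace can be excluded from the segments as the caveat suggests. A sketch, assuming exactly three segments follow link/; the function name link_segments is hypothetical:

```php
<?php
// Hypothetical helper: returns the three segments after "link/",
// or null when the URL does not have that shape.
function link_segments(string $url): ?array
{
    // [^/\s]+ stops at a slash and also refuses whitespace and newlines.
    if (preg_match('~link/([^/\s]+)/([^/\s]+)/([^/\s]+)~', $url, $m) !== 1) {
        return null;
    }

    // Drop element 0 (the whole match) and keep the three groups.
    return array_slice($m, 1);
}

print_r(link_segments('stackoverflow.com/.../link/Eee_666/9_uUU/66_99U'));
```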
You can use \K here to make it more thorough.
stackoverflow\.com/.*?/link/\K([^/\s]+)/([^/\s]+)/([^/\s]+)
See demo.
https://regex101.com/r/jC8mZ4/2
In case you don't know the length of the string:
$string = 'stackoverflow.com/.../link/Eee_666/9_uUU/66_99U';
$regexp = '~([^/]+$)~';
preg_match($regexp, $string, $m);
result:
group 1 ($m[1]) = 66_99U
Be careful: it may also capture the end-of-line character.
For this kind of requirement, it's simpler to use preg_split combined with array_slice:
$url = 'stackoverflow.com/.../link/Eee_666/9_uUU/66_99U';
$elem = array_slice(preg_split('~/~', $url), -3);
print_r($elem);
Output:
Array
(
[0] => Eee_666
[1] => 9_uUU
[2] => 66_99U
)
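Because the delimiter here is a fixed character, explode() would do the same job without a regex. A small sketch, assuming the URL may carry a trailing slash that should be ignored:

```php
<?php
$url = 'stackoverflow.com/.../link/Eee_666/9_uUU/66_99U';

// rtrim() removes a trailing slash so the last element is not empty;
// array_slice() with -3 keeps the last three segments.
$elem = array_slice(explode('/', rtrim($url, '/')), -3);

print_r($elem);
```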