Regex for matching phrase between '' in PHP

I want to match the content inside the ' and ' (single quotes). For example: 'for example' should return for and example. It's only a part of the sentence I have to analyze, I used preg_split(\s) for the whole sentence, so the 'for example' will become 'for and example'.
Right now I've tried /^'(.*)|(.*)'$/ and it only returns for but not the example, if I put it like /^(.*)'|'(.*)$/, it only returns example but not for. How should I fix this?

You can avoid double handling of the string by leveraging the \G metacharacter to continue matching an unlimited number of space-delimited strings inside of single quotes.
Code:
$string = "text 'for an example of the \"continue\" metacharacter' text";
var_export(preg_match_all("~(?|'|\G(?!^) )\K[^ ']+~", $string, $out) ? $out[0] : []);
Output:
array (
0 => 'for',
1 => 'an',
2 => 'example',
3 => 'of',
4 => 'the',
5 => '"continue"',
6 => 'metacharacter',
)

To get the quoted phrases (which you then want to split), you can use preg_match_all() to capture anything between two single quotes.
preg_match_all("~'([^']+)'~", $text, $matches);
$string = $matches[1][0]; // first captured phrase; $matches[1] holds all of them
$string now contains something like "example string with words".
Now if you want to split a string according to a specific sequence / character, you can make use of explode():
$string = "example string with words";
$result = explode(" ", $string);
print_r($result);
gives you:
Array
(
[0] => example
[1] => string
[2] => with
[3] => words
)
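The two steps above can be combined into one helper. This is only a sketch: the function name quotedWords() is made up for illustration, and it assumes the quotes are balanced and that a phrase never contains a single quote itself.

```php
<?php
// Hypothetical helper: returns every word found inside single-quoted phrases.
function quotedWords(string $text): array
{
    // Capture everything between each pair of single quotes.
    preg_match_all("~'([^']+)'~", $text, $matches);

    $words = [];
    foreach ($matches[1] as $phrase) {
        // Split each captured phrase on runs of whitespace.
        foreach (preg_split('~\s+~', $phrase, -1, PREG_SPLIT_NO_EMPTY) as $word) {
            $words[] = $word;
        }
    }
    return $words;
}

print_r(quotedWords("text 'for example' more text"));
```

For the input shown, the result holds the two words for and example.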

Related

Php splitting a sentence

I'm trying to split a string of sentences by "." to get each sentence in an array. Like below:
$Text = "Hello, Mr. James. How are you today.";
$split= explode(".", $Text);
As you can see, $Text contains 2 sentences, so I should only have 2 elements in the array. The issue I'm having is that sometimes $Text contains words like "Mr." or some other word with a "." in the middle of a sentence. This results in the sentences being split in the middle and placed separately in the array, like below:
Array ( [0] => Hello, Mr [1] => James [2] => How are you today [3] => )
You can avoid a lot of exception handling and general misery, if you can ensure that all English sentences are properly spaced at the end of each sentence -- 2 consecutive spaces. This can be difficult when dealing with some digitized strings because sometimes multi-spacing gets condensed to a single space.
This is what I mean:
$Text = "Hello, Mr. James.  How are you today.";
$split = explode("  ", $Text); // delimiter is two spaces
var_export($split);
// array ( 0 => 'Hello, Mr. James.', 1 => 'How are you today.', )
Exploding on each space-space will give you a reliable result.
If you want good output, you'll need to use good input.
If you want to blacklist a few predictable substrings that should not be used to split the string, then you can use (*SKIP)(*FAIL) for that.
Code:
$text = "Hello, Mr. James. How are you today.";
var_export(
preg_split('~(?:Mrs?|Miss|Ms|Prof|Rev|Col|Dr)[.?!:](*SKIP)(*F)|[.?!:]+\K\s+~', $text, 0, PREG_SPLIT_NO_EMPTY)
);
Output:
array (
0 => 'Hello, Mr. James.',
1 => 'How are you today.',
)

How to extract substrings with delimiters from a string in php

I would like to remove substrings from a string that have delimiters.
Example:
$string = "Hi, I want to buy an [apple] and a [banana].";
How do I get "apple" and "banana" out of this string and in an array? And the other parts of the string "Hi, I want to buy an" and "and a" in another array.
I apologize if this question has already been answered. I searched this site and couldn't find anything that would help me. Every situation was just a little different.
You could use preg_split() thus:
<?php
$pattern = '/[\[\]]/'; // Split on either [ or ]
$string = "Hi, I want to buy an [apple] and a [banana].";
echo print_r(preg_split($pattern, $string), true);
which outputs:
Array
(
[0] => Hi, I want to buy an
[1] => apple
[2] => and a
[3] => banana
[4] => .
)
You can trim the whitespace if you like and/or ignore the final fullstop.
preg_match_all('/(?<=\[)[a-z]+(?=\])/', $string, $matches);
Should do what you want. $matches[0] will be an array with each match.
I assume you want words as values in the array:
$words = explode(' ', $string);
$result = preg_grep('/\[[^\]]+\]/', $words);
$others = array_diff($words, $result);
Create an array of words using explode() on a space
Use a regex to find [somethings] using preg_grep()
Find the difference of all words and [somethings] using array_diff(), which will be the "other" parts of the string
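If the goal is the text inside the brackets in one array and the remaining fragments in another, a combination of preg_match_all() and preg_split() is one possible approach. This is a sketch that assumes the brackets are balanced and never nested; the variable names $inside and $outside are illustrative.

```php
<?php
$string = "Hi, I want to buy an [apple] and a [banana].";

// Inside the brackets: group 1 holds the contents without [ and ].
preg_match_all('/\[([^\]]+)\]/', $string, $matches);
$inside = $matches[1];

// Outside the brackets: split on each whole [...] token, then trim the pieces.
$outside = array_map('trim', preg_split('/\[[^\]]+\]/', $string));

print_r($inside);
print_r($outside);
```

Note that the trailing "." survives as its own element of $outside; filter it out if it is not wanted.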

Split string on spaces except words in quotes

I have a string like
$string = 'Some of "this string is" in quotes';
I want to get an array of all the words in the string which I can get by doing
$words = explode(' ', $string);
However I don't want to split up the words in quotes so ideally the end array will be
array ('Some', 'of', '"this string is"', 'in', 'quotes');
Does anyone know how I can do this?
You can use:
$string = 'Some of "this string is" in quotes';
$arr = preg_split('/("[^"]*")|\h+/', $string, -1,
PREG_SPLIT_NO_EMPTY|PREG_SPLIT_DELIM_CAPTURE);
print_r ( $arr );
Output:
Array
(
[0] => Some
[1] => of
[2] => "this string is"
[3] => in
[4] => quotes
)
RegEx Breakup
("[^"]*") # match quoted text and group it so that it can be used in output using
# PREG_SPLIT_DELIM_CAPTURE option
| # regex alternation
\h+ # match 1 or more horizontal whitespace
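Instead of splitting, you can approach it the other way round and match. Matching is a lot easier than splitting here.
So use the regex /".*?"|[^\s]+/ in conjunction with preg_match_all. The quoted alternative has to come first; otherwise [^\s]+ consumes the opening quote together with the first word.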
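A minimal sketch of the matching approach, trying the quoted alternative first so that a quoted phrase is not consumed word by word. It assumes double quotes are balanced and not escaped inside a phrase.

```php
<?php
$string = 'Some of "this string is" in quotes';

// Either a whole double-quoted phrase, or a run of non-whitespace characters.
preg_match_all('/"[^"]*"|\S+/', $string, $matches);

// $matches[0] holds the full matches in order of appearance.
print_r($matches[0]);
```

The quoted phrase keeps its surrounding quotes, as in the desired array from the question.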
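You can get the values by matching rather than splitting, using preg_match_all() with the regex:
/"[^"]+"|\w+/
(PHP has no g flag; preg_match_all() already returns every match.) It will match:
"[^"]+" - characters between double quotes,
\w+ - runs of word characters (A-Za-z0-9_).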
I think you can use a regex like this with preg_match_all():
/("[^"]*")|(\S+)/
Quoted phrases are captured in group 1 and bare words in group 2.

Split string into associative array (while maintaining characters)

I'm trying to figure out how to split a string that looks like this :
a20r51fx500fy3000
into an associative array that will look like this :
array(
'a' => 20,
'r' => 51,
'fx' => 500,
'fy' => 3000,
);
I don't think I can use preg_split as this will drop the character I'm splitting on (I tried /[a-zA-Z]/ but obviously that didn't do what I wanted it to). I'd prefer if I could do it using some kind of built-in function, but I don't really mind looping if that's required.
Any help would be much appreciated!
Multiple Matches and PREG_SET_ORDER
Do this:
$yourstring = "a20r51fx500fy3000";
$regex = '~([a-z]+)(\d+)~';
preg_match_all($regex,$yourstring,$matches,PREG_SET_ORDER);
$yourarray=array();
foreach($matches as $m) {
$yourarray[$m[1]] = $m[2];
}
print_r($yourarray);
Output:
Array ( [a] => 20 [r] => 51 [fx] => 500 [fy] => 3000 )
If your string can contain upper-case letters, make the regex case-insensitive by adding the i flag after the closing delimiter: $regex = '~([a-z]+)(\d+)~i';
Explanation
([a-z]+) captures letters to Group 1
(\d+) captures digits to Group 2
$yourarray[$m[1]] = $m[2]; creates an index for the letters, and assigns the digits
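The loop stores the digits as strings. If integer values are wanted, as in the array shown in the question, a loop-free variant with array_combine() and intval() is one option. This is a sketch of a different technique than the PREG_SET_ORDER loop, and the variable names are illustrative.

```php
<?php
$input = 'a20r51fx500fy3000';

// Default PREG_PATTERN_ORDER: $m[1] lists all letter runs, $m[2] all digit runs.
preg_match_all('~([a-z]+)(\d+)~i', $input, $m);

// Pair each letter run with its digits, cast to int.
$result = array_combine($m[1], array_map('intval', $m[2]));

var_export($result);
```

As with the loop, a repeated key keeps only its last value.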

Intelligent split of string into an array

This code will split the string into an array that contains test and string:
$str = 'test string';
$arr = preg_split('/\s+/', $str);
But I also want to detect quotes and ignore the text between them when splitting, for example:
$str = 'test "Two words"';
This should also return an array with two elements, test and Two words.
And another form, if possible:
$str = 'test=Two Words';
So if the equal sign is present before any spaces, the string should be split by =, otherwise the other rules from above should apply.
So how can I do this with preg_split?
Try str_getcsv:
print_r(str_getcsv('test string'," "));
print_r(str_getcsv('test "Two words"'," "));
print_r(str_getcsv('test=Two Words',"="));
Outputs
Array
(
[0] => test
[1] => string
)
Array
(
[0] => test
[1] => Two words
)
Array
(
[0] => test
[1] => Two Words
)
You can use something like preg_match() to check whether an equals sign appears before the first space, and then decide which delimiter to use.
Works only in PHP>=5.3 though.
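The delimiter check described above might look like the following sketch. The function name smartSplit() is hypothetical, and the enclosure and escape arguments are passed explicitly only to avoid relying on defaults.

```php
<?php
// Hypothetical helper: split on "=" when it appears before any whitespace,
// otherwise split on spaces while keeping double-quoted phrases together.
function smartSplit(string $str): array
{
    // Matches only if "=" is reached before any whitespace character.
    $delimiter = preg_match('/^[^\s=]*=/', $str) ? '=' : ' ';

    return str_getcsv($str, $delimiter, '"', '\\');
}

print_r(smartSplit('test string'));
print_r(smartSplit('test "Two words"'));
print_r(smartSplit('test=Two Words'));
```

Each call returns two elements, matching the three outputs listed above.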
I'm sure this could be done with regex, but how about just splitting the string by quotation marks, then by spaces, using explode?
Given the string 'I am a string "with an embedded" string', you could first split by quotation marks, giving you ['I am a string', 'with an embedded', 'string'], then you go over every other element in the array and split by spaces, resulting in ['I', 'am', 'a', 'string', 'with an embedded', 'string'].
The exact code to do this you can probably write yourself. If not, let me know and I'll help you.
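For reference, the two-pass idea can be sketched as follows. The function name splitKeepingQuotes() is made up for illustration, and the code assumes the double quotes are balanced.

```php
<?php
// Hypothetical helper: split by quotation marks first, then by spaces.
function splitKeepingQuotes(string $str): array
{
    $result = [];
    // Even-indexed chunks lie outside quotes; odd-indexed chunks were quoted.
    foreach (explode('"', $str) as $i => $chunk) {
        if ($i % 2 === 1) {
            $result[] = $chunk; // keep the quoted text whole
            continue;
        }
        foreach (explode(' ', $chunk) as $word) {
            if ($word !== '') { // skip empties left by leading/trailing spaces
                $result[] = $word;
            }
        }
    }
    return $result;
}

print_r(splitKeepingQuotes('I am a string "with an embedded" string'));
```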
In your last example, just split by the equals symbol:
$str = 'test=Two Words';
print_r(explode('=', $str));
