I want to match the content inside single quotes. For example, 'for example' should return for and example. This is only part of the sentence I have to analyze; I used preg_split() on \s for the whole sentence, so 'for example' becomes 'for and example'.
Right now I've tried /^'(.*)|(.*)'$/, which only returns for but not example. If I write it as /^(.*)'|'(.*)$/, it only returns example but not for. How should I fix this?
You can avoid double handling of the string by leveraging the \G metacharacter to continue matching an unlimited number of space-delimited strings inside of single quotes.
Code:
$string = "text 'for an example of the \"continue\" metacharacter' text";
var_export(preg_match_all("~(?|'|\G(?!^) )\K[^ ']+~", $string, $out) ? $out[0] : []);
Output:
array (
0 => 'for',
1 => 'an',
2 => 'example',
3 => 'of',
4 => 'the',
5 => '"continue"',
6 => 'metacharacter',
)
To get the single sentences (which you then want to split) you can use preg_match_all() to capture anything between two single quotes.
preg_match_all("~'([^']+)'~", $text, $matches);
$string = $matches[1][0]; // first quoted segment; $matches[1] is the array of all quoted segments
$string now contains something like "example string with words".
Now if you want to split a string according to a specific sequence / character, you can make use of explode():
$string = "example string with words";
$result = explode(" ", $string);
print_r($result);
gives you:
Array
(
[0] => example
[1] => string
[2] => with
[3] => words
)
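The two steps above can be combined. The following is a minimal sketch, assuming the input may contain several quoted segments and that words inside them are separated by whitespace. The sample $text is made up for illustration, and preg_split() on \s+ is swapped in for explode() so that repeated spaces do not produce empty elements.

```php
<?php
// Hypothetical input; only the quoted part should be split into words.
$text = "He said 'for example' and left";

$words = [];
if (preg_match_all("~'([^']+)'~", $text, $matches)) {
    // $matches[1] holds the text of every quoted segment, without the quotes.
    foreach ($matches[1] as $quoted) {
        // PREG_SPLIT_NO_EMPTY drops empty pieces caused by repeated whitespace.
        $words = array_merge($words, preg_split('~\s+~', $quoted, -1, PREG_SPLIT_NO_EMPTY));
    }
}

print_r($words);
```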
Related
I'm trying to split a string of sentences by "." to get each sentence in an array. Like below:
$Text = "Hello, Mr. James. How are you today.";
$split= explode(".", $Text);
As you can see, $Text contains 2 sentences, therefore I should only have 2 elements in the array. The issue I'm having is that sometimes my $Text can contain words like "Mr." or any other word which contains a "." in the middle of a sentence. This results in the sentences being split in the middle and placed separately in the array like below:
Array ( [0] => Hello, Mr [1] => James [2] => How are you today [3] => )
You can avoid a lot of exception handling and general misery, if you can ensure that all English sentences are properly spaced at the end of each sentence -- 2 consecutive spaces. This can be difficult when dealing with some digitized strings because sometimes multi-spacing gets condensed to a single space.
This is what I mean:
$Text = "Hello, Mr. James.  How are you today."; // two spaces after "James."
$split = explode("  ", $Text);                   // the delimiter is two spaces
var_export($split);
// array ( 0 => 'Hello, Mr. James.', 1 => 'How are you today.', )
Exploding on each space-space will give you a reliable result.
If you want good output, you'll need to use good input.
If you want to blacklist a few predictable substrings that should not be used to split the string, then you can use (*SKIP)(*FAIL) for that.
Code:
$text = "Hello, Mr. James. How are you today.";
var_export(
preg_split('~(?:Mrs?|Miss|Ms|Prof|Rev|Col|Dr)[.?!:](*SKIP)(*F)|[.?!:]+\K\s+~', $text, 0, PREG_SPLIT_NO_EMPTY)
);
Output:
array (
0 => 'Hello, Mr. James.',
1 => 'How are you today.',
)
I would like to remove substrings from a string that have delimiters.
Example:
$string = "Hi, I want to buy an [apple] and a [banana].";
How do I get "apple" and "banana" out of this string and in an array? And the other parts of the string "Hi, I want to buy an" and "and a" in another array.
I apologize if this question has already been answered. I searched this site and couldn't find anything that would help me. Every situation was just a little different.
You could use preg_split() thus:
<?php
$pattern = '/[\[\]]/'; // Split on either [ or ]
$string = "Hi, I want to buy an [apple] and a [banana].";
echo print_r(preg_split($pattern, $string), true);
which outputs:
Array
(
[0] => Hi, I want to buy an
[1] => apple
[2] => and a
[3] => banana
[4] => .
)
You can trim the whitespace if you like and/or ignore the final fullstop.
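To get the two arrays the question asks for, the split result can be sorted by position. This is a minimal sketch, assuming the brackets are balanced and never nested, so that the pieces alternate between text outside and text inside the brackets. The variable names $inside and $outside are only illustrative.

```php
<?php
$string = "Hi, I want to buy an [apple] and a [banana].";

// Splitting on [ or ] makes the pieces alternate: outside, inside, outside, ...
$parts = preg_split('/[\[\]]/', $string);

$inside = [];
$outside = [];
foreach ($parts as $i => $part) {
    $part = trim($part);
    if ($i % 2 === 1) {
        $inside[] = $part;      // odd index: text that was between [ and ]
    } elseif ($part !== '' && $part !== '.') {
        $outside[] = $part;     // even index: surrounding text, minus the final full stop
    }
}

print_r($inside);
print_r($outside);
```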
preg_match_all('/(?<=\[)[a-z]+(?=\])/i', $string, $matches);
Should do what you want. $matches[0] will be an array containing each match.
I assume you want words as values in the array:
$words = explode(' ', $string);
$result = preg_grep('/\[[^\]]+\]/', $words);
$others = array_diff($words, $result);
Create an array of words using explode() on a space
Use a regex to find [somethings] using preg_grep()
Find the difference of all words and [somethings] using array_diff(), which will be the "other" parts of the string
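Note that the values found by preg_grep() still carry their brackets and any attached punctuation, and keep their original keys. A small sketch of one way to clean that up, assuming each bracketed word contains exactly one [..] pair; the $fruits name and the preg_replace() step are additions for illustration, not part of the answer above.

```php
<?php
$string = "Hi, I want to buy an [apple] and a [banana].";

$words  = explode(' ', $string);
$result = preg_grep('/\[[^\]]+\]/', $words);   // keys are preserved: 6 => '[apple]', 9 => '[banana].'
$others = array_diff($words, $result);

// Keep only the text between the brackets and reindex from 0.
$fruits = array_values(preg_replace('/^.*\[([^\]]+)\].*$/', '$1', $result));

print_r($fruits);
echo implode(' ', $others), "\n";
```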
I have a string like
$string = 'Some of "this string is" in quotes';
I want to get an array of all the words in the string which I can get by doing
$words = explode(' ', $string);
However I don't want to split up the words in quotes so ideally the end array will be
array ('Some', 'of', '"this string is"', 'in', 'quotes');
Does anyone know how I can do this?
You can use:
$string = 'Some of "this string is" in quotes';
$arr = preg_split('/("[^"]*")|\h+/', $string, -1,
PREG_SPLIT_NO_EMPTY|PREG_SPLIT_DELIM_CAPTURE);
print_r ( $arr );
Output:
Array
(
[0] => Some
[1] => of
[2] => "this string is"
[3] => in
[4] => quotes
)
RegEx Breakup
("[^"]*") # match quoted text and group it so that it can be used in output using
# PREG_SPLIT_DELIM_CAPTURE option
| # regex alternation
\h+ # match 1 or more horizontal whitespace
Instead of splitting, you can approach this by matching, which is a lot easier here.
So use the regex /".*?"|[^\s]+/ in conjunction with preg_match_all(). The quoted alternative has to come first; otherwise [^\s]+ consumes the opening quote together with the first word.
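A minimal sketch of the matching approach with preg_match_all(), using the string from the question. It assumes quotes are balanced and lists the quoted alternative first, so that a quoted phrase is tried before a plain run of non-whitespace characters.

```php
<?php
$string = 'Some of "this string is" in quotes';

// Quoted phrase first; otherwise \S+ would grab the opening quote plus the first word only.
preg_match_all('/"[^"]*"|\S+/', $string, $matches);

// $matches[0] holds the full matches in order of appearance.
print_r($matches[0]);
```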
You can get the values by matching rather than splitting, with the regex:
/"[^"]+"|\w+/
Used with preg_match_all() (PHP has no g flag), it will match:
"[^"]+" - characters between quote signs ",
\w+ - sets of word characters (A-Za-z_0-9),
I think you can use a regex like this:
/("[^"]*")|(\S+)/g
And you can use substitution $2
I'm trying to figure out how to split a string that looks like this :
a20r51fx500fy3000
into an associative array that will look like this :
array(
'a' => 20,
'r' => 51,
'fx' => 500,
'fy' => 3000,
);
I don't think I can use preg_split as this will drop the character I'm splitting on (I tried /[a-zA-Z]/ but obviously that didn't do what I wanted it to). I'd prefer if I could do it using some kind of built-in function, but I don't really mind looping if that's required.
Any help would be much appreciated!
Multiple Matches and PREG_SET_ORDER
Do this:
$yourstring = "a20r51fx500fy3000";
$regex = '~([a-z]+)(\d+)~';
preg_match_all($regex,$yourstring,$matches,PREG_SET_ORDER);
$yourarray=array();
foreach($matches as $m) {
$yourarray[$m[1]] = $m[2];
}
print_r($yourarray);
Output:
Array ( [a] => 20 [r] => 51 [fx] => 500 [fy] => 3000 )
If your string can contain upper-case letters, make the regex case-insensitive by adding the i flag after the closing delimiter: $regex = '~([a-z]+)(\d+)~i';
Explanation
([a-z]+) captures letters to Group 1
(\d+) captures digits to Group 2
$yourarray[$m[1]] = $m[2]; creates an index for the letters, and assigns the digits
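The loop stores the digits as strings. If integer values are wanted, as in the array shown in the question, one possible variant is sketched below. It drops PREG_SET_ORDER so the two groups arrive as parallel lists, and it assumes no letter key repeats (a repeated key would be overwritten by its last value).

```php
<?php
$yourstring = "a20r51fx500fy3000";

// Without PREG_SET_ORDER, $matches[1] lists all letter groups and $matches[2] all digit groups.
preg_match_all('~([a-z]+)(\d+)~', $yourstring, $matches);

// Pair keys with values; intval turns the digit strings into integers.
$yourarray = array_combine($matches[1], array_map('intval', $matches[2]));

var_export($yourarray);
```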
This code will split the string into an array that contains test and string:
$str = 'test string';
$arr = preg_split('/\s+/', $str);
But I also want to detect quotes and ignore the text between them when splitting, for example:
$str = 'test "Two words"';
This should also return an array with two elements, test and Two words.
And another form, if possible:
$str = 'test=Two Words';
So if the equal sign is present before any spaces, the string should be split by =, otherwise the other rules from above should apply.
So how can I do this with preg_split?
Try str_getcsv:
print_r(str_getcsv('test string'," "));
print_r(str_getcsv('test "Two words"'," "));
print_r(str_getcsv('test=Two Words',"="));
Outputs
Array
(
[0] => test
[1] => string
)
Array
(
[0] => test
[1] => Two words
)
Array
(
[0] => test
[1] => Two Words
)
You can use something like preg_match() to check whether an equal sign appears before any space, and then decide which delimiter to use.
Works only in PHP>=5.3 though.
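The delimiter check could look roughly like this. It is only a sketch: splitInput() is a hypothetical helper name, not a PHP function, and it assumes that "=" counts only when it appears before the first whitespace character. Note that str_getcsv() splits on every "=", not just the first one.

```php
<?php
// Hypothetical helper, not part of PHP: picks "=" when it appears before any whitespace.
function splitInput($str)
{
    // ^[^\s=]*= succeeds only if an equal sign is reached before any whitespace.
    $delimiter = preg_match('/^[^\s=]*=/', $str) ? '=' : ' ';
    return str_getcsv($str, $delimiter, '"', '\\');
}

print_r(splitInput('test string'));
print_r(splitInput('test "Two words"'));
print_r(splitInput('test=Two Words'));
```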
I'm sure this could be done with regex, but how about just splitting the string by quotation marks, then by spaces, using explode?
Given the string 'I am a string "with an embedded" string', you could first split by quotation marks, giving you ['I am a string', 'with an embedded', 'string'], then you go over every other element in the array and split by spaces, resulting in ['I', 'am', 'a', 'string', 'with an embedded', 'string'].
The exact code to do this you can probably write yourself. If not, let me know and I'll help you.
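The approach described above can be sketched as follows, assuming the quotation marks are balanced. After splitting on the quote character, chunks at odd indexes were inside quotes and are kept whole. For the unquoted chunks this sketch uses preg_split() on whitespace rather than explode(), so that leading and trailing spaces do not produce empty elements.

```php
<?php
$str = 'I am a string "with an embedded" string';

$result = [];
// After splitting on the quote character, even indexes are outside quotes, odd ones inside.
foreach (explode('"', $str) as $i => $chunk) {
    if ($i % 2 === 1) {
        $result[] = $chunk;     // quoted text stays in one piece
    } else {
        // Unquoted text is split on whitespace; empty pieces are dropped.
        $result = array_merge($result, preg_split('/\s+/', $chunk, -1, PREG_SPLIT_NO_EMPTY));
    }
}

print_r($result);
```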
In your last example, just split by the equals symbol:
$str = 'test=Two Words';
print_r(explode('=', $str));