PHP preg_split is returning empty strings


I am trying to split a string of combined lowercase letters into separate words, with the first letter of each word capitalized. I am trying to use PHP's preg_split(), but I'm not sure that I'm using it correctly, because the words aren't delimiters. The options for words are:
1. Burger
2. Fries
3. Chicken
4. Pizza
5. Sandwich
6. Onionrings
7. Milkshake
8. Coke
The below code returns blank array elements:
<?php
$input = 'milkshakepizzachickenfriescokeburgerpizzasandwichmilkshakepizza';
$split = preg_split("/(burger|fries|chicken|pizza|sandwich|onionrings|milkshake|coke)/", $input);
var_dump($split);
All the var_dumps and the echos are for debugging purposes only. The expected output is to have one long string with space-separated menu items. For example:
Burger Coke Fries

preg_split() splits the string on whatever the pattern matches and discards the matched text, just like most split()-style functions. Your pattern matches every word in the input, so only the empty gaps between the words are left, and you get an array of blanks. If you split the string "-----" by the character -, for instance, every character is counted as a delimiter and gets scooped out of the string.
What you want is preg_match_all().
preg_match_all — Perform a global regular expression match
Store the matches in some $matches variable as I do below...
$input = 'milkshakepizzachickenfriescokeburgerpizzasandwichmilkshakepizza';
$split = preg_match_all("/(burger|fries|chicken|pizza|sandwich|onionrings|milkshake|coke)/", $input, $matches);
print_r($matches);
Working Demo.
Results:
[0] => Array
(
[0] => milkshake
[1] => pizza
[2] => chicken
[3] => fries
[4] => coke
[5] => burger
[6] => pizza
[7] => sandwich
[8] => milkshake
[9] => pizza
)
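As a minimal sketch of the difference described above (the variable names $blanks and $menu are only illustrative), the following first shows the "-----" case and then builds the space-separated, capitalized string the question asks for from the full matches in $matches[0]:

```php
<?php
// Splitting on '-' removes every character, leaving only the empty gaps.
$blanks = preg_split('/-/', '-----');
var_dump($blanks === ['', '', '', '', '', '']); // bool(true)

// Matching keeps the words themselves instead of discarding them.
$input = 'milkshakepizzachickenfriescokeburgerpizzasandwichmilkshakepizza';
preg_match_all(
    '/burger|fries|chicken|pizza|sandwich|onionrings|milkshake|coke/',
    $input,
    $matches
);

// $matches[0] holds the full matches in the order they were found.
$menu = ucwords(implode(' ', $matches[0]));
echo $menu, PHP_EOL;
// Milkshake Pizza Chicken Fries Coke Burger Pizza Sandwich Milkshake Pizza
```

No capture group is needed here, because only the full matches are used.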

Try this:
<?php
$input ="burger|fries|chicken|pizza|sandwich|onionrings|milkshake|coke";
$pattern = "/[|\s:]/";
$split = preg_split($pattern,$input);
print_r ($split);

You can capture your delimiters with PREG_SPLIT_DELIM_CAPTURE. The pieces between the delimiters are still empty strings, but PREG_SPLIT_NO_EMPTY discards them.
<?php
$input = 'milkshakepizzachickenfriescokeburgerpizzasandwichmilkshakepizza';
$split = preg_split("/(burger|fries|chicken|pizza|sandwich|onionrings|milkshake|coke)/", $input, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
print ucwords(implode(' ', $split));
Output:
Milkshake Pizza Chicken Fries Coke Burger Pizza Sandwich Milkshake Pizza
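To see what each flag contributes, here is a small sketch on a shortened input (the two-word pattern and the variable names are assumptions made for brevity):

```php
<?php
$pattern = '/(pizza|coke)/';

// DELIM_CAPTURE alone returns the captured delimiters, but also the empty
// pieces before, between, and after them.
$withEmpties = preg_split($pattern, 'pizzacoke', -1, PREG_SPLIT_DELIM_CAPTURE);
var_export($withEmpties === ['', 'pizza', '', 'coke', '']); // true
echo PHP_EOL;

// Adding NO_EMPTY drops the empty pieces.
$flags = PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY;
$clean = preg_split($pattern, 'pizzacoke', -1, $flags);
var_export($clean === ['pizza', 'coke']); // true
echo PHP_EOL;

// Caveat: text that is not a known word is a non-empty piece, so it is kept.
$mixed = preg_split($pattern, 'pizzaXcoke', -1, $flags);
var_export($mixed === ['pizza', 'X', 'coke']); // true
```

The last case is worth keeping in mind: with this approach, unrecognized input ends up in the result, whereas preg_match_all() would silently skip it.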

Related

Finding sentences between characters

I am trying to find sentences between pipe | and dot ., e.g.
| This is one. This is two.
The regex pattern I use :
preg_match_all('/(:\s|\|+)(.*?)(\.|!|\?)/s', $file0, $matches);
So far I could not manage to capture both sentences. The regex I use captures only the first sentence.
How can I solve this problem?
EDIT: as may be seen from the regex, I am trying to find the sentences BETWEEN (: or |) AND (. or ! or ?)
A colon or pipe indicates the starting point for sentences.
The sentences might be:
: Sentence one. Sentence two. Sentence three.
| Sentence one. Sentence two?
| Sentence one. Sentence two! Sentence three?

I would keep it simple and just match on:
\s*[^.|]+\s*
This matches any run of characters that contains no pipes or full stops. Because the optional whitespace (\s*) is part of the match, each result can carry leading whitespace, so trim the matches afterwards.
$input = "| This is one. This is two.";
preg_match_all('/\s*[^.|]+\s*/s', $input, $matches);
print_r(array_map('trim', $matches[0]));
This prints:
Array
(
[0] => This is one
[1] => This is two
)

This does the job:
$str = '| This is one. This is two.';
preg_match_all('/(?:\s|\|)+(.*?)(?=[.!?])/', $str, $m);
print_r($m);
Output:
Array
(
[0] => Array
(
[0] => | This is one
[1] => This is two
)
[1] => Array
(
[0] => This is one
[1] => This is two
)
)
Demo & explanation
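For the three sample lines in the question, a sketch along the same lines might look like this. It keeps the terminating punctuation and assumes that sentences never contain a colon or pipe themselves. The variable names are illustrative only.

```php
<?php
$lines = [
    ': Sentence one. Sentence two. Sentence three.',
    '| Sentence one. Sentence two?',
    '| Sentence one. Sentence two! Sentence three?',
];

$result = [];
foreach ($lines as $line) {
    // Only process lines that start with a colon or pipe marker.
    if (!preg_match('/^\s*[:|]/', $line)) {
        continue;
    }
    // A sentence is a run of characters that are neither markers nor
    // terminators, followed by exactly one terminator (., ! or ?).
    preg_match_all('/[^.!?:|]+[.!?]/', $line, $m);
    // The matches carry the whitespace that follows the previous sentence.
    $result[] = array_map('trim', $m[0]);
}

print_r($result);
```

Each element of $result is the list of sentences for one input line, for example ['Sentence one.', 'Sentence two?'] for the second line.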

Another option is to use \G to get consecutive matches. \G asserts the position at the end of the previous match, so each match must start either at a pipe or exactly where the last one ended. The sentence is captured in group 1, and the pattern then consumes the dot and any horizontal whitespace after it.
(?:\|\h*|\G(?!^))([^.\r\n]+)\.\h*
In parts
(?: Non capturing group
\|\h* Match | and 0+ horizontal whitespace chars
| Or
\G(?!^) Assert position at the end of previous match
) Close group
( Capture group 1
- [^.\r\n]+ Match 1+ times any char other than . or a newline
) Close group
\.\h* Match 1 . and 0+ horizontal whitespace chars
Regex demo | Php demo
For example
$re = '/(?:\|\h*|\G(?!^))([^.\r\n]+)\.\h*/';
$str = '| This is one. This is two.
John loves Mary.| This is one. This is two.';
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
print_r($matches);
Output
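Array
(
[0] => Array
(
[0] => | This is one. 
[1] => This is one
)
[1] => Array
(
[0] => This is two.
[1] => This is two
)
[2] => Array
(
[0] => | This is one. 
[1] => This is one
)
[3] => Array
(
[0] => This is two.
[1] => This is two
)
)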
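The effect of \G is easiest to see on input where a sentence appears before the first pipe. In this sketch (the input string and the $sentences variable are assumptions for illustration), "John loves Mary" is skipped because no match can start there: it is neither preceded by a pipe nor directly after a previous match.

```php
<?php
$re = '/(?:\|\h*|\G(?!^))([^.\r\n]+)\.\h*/';
$str = 'John loves Mary. | This is one. This is two.';

// PREG_SET_ORDER groups the results per match: [full match, group 1].
preg_match_all($re, $str, $matches, PREG_SET_ORDER);

// Collect only the captured sentences (group 1 of every match).
$sentences = array_column($matches, 1);
print_r($sentences);
```

This prints an array with the two elements "This is one" and "This is two".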

To keep it simple, find everything between | and . and then split:
$input = "John loves Mary. | This is one. This is two. | Sentence 1. Sentence 2.";
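// preg_match_all() returns the number of matches, so test that directly;
// $matches itself is never empty once the call has run.
if (preg_match_all('/\|\s*([^|]+)\./', $input, $matches)) {
    foreach ($matches[1] as $match) {
        print_r(preg_split('/\.\s*/', $match));
    }
}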
Prints:
Array
(
[0] => This is one
[1] => This is two
)
Array
(
[0] => Sentence 1
[1] => Sentence 2
)

Split string after each number

I have a database full of strings that I'd like to split into an array. Each string contains a list of directions that begin with a letter (U, D, L, R for Up, Down, Left, Right) and a number to tell how far to go in that direction.
Here is an example of one string.
$string = "U29R45U2L5D2L16";
My desired result:
['U29', 'R45', 'U2', 'L5', 'D2', 'L16']
I thought I could just loop through the string, but I don't know how to tell whether the number is one or more digits long.

You can use preg_split to break up the string, splitting on something which looks like a U,L,D or R followed by numbers and using the PREG_SPLIT_DELIM_CAPTURE to keep the split text:
$string = "U29R45U2L5D2L16";
print_r(preg_split('/([UDLR]\d+)/', $string, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY));
Output:
Array (
[0] => U29
[1] => R45
[2] => U2
[3] => L5
[4] => D2
[5] => L16
)
Demo on 3v4l.org

A regular expression should help you:
<?php
$string = "U29R45U2L5D2L16";
preg_match_all("/[A-Z]\d+/", $string, $matches);
var_dump($matches);
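If the direction and the distance are needed separately, a sketch with two capture groups could look like the following. Restricting the letters to [UDLR] and the $moves structure are assumptions based on the question, not part of the answer above.

```php
<?php
$string = 'U29R45U2L5D2L16';

// Group 1 is the direction letter, group 2 the run of digits after it.
preg_match_all('/([UDLR])(\d+)/', $string, $matches, PREG_SET_ORDER);

$moves = [];
foreach ($matches as [, $direction, $distance]) {
    // Cast the digits to an integer so they can be used in arithmetic.
    $moves[] = [$direction, (int) $distance];
}

print_r($moves);
```

Here $moves[0] is ['U', 29] and $moves[5] is ['L', 16].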

Because this task is about text extraction and not about text validation, you can merely split on the zero-width position after one or more digits. In other words, match one or more digits, then reset the start of the match with \K so that the digits are not part of the delimiter and are not removed while splitting.
Code: (Demo)
$string = "U29R45U2L5D2L16";
var_export(
preg_split(
'/\d+\K/',
$string,
0,
PREG_SPLIT_NO_EMPTY
)
);
Output:
array (
0 => 'U29',
1 => 'R45',
2 => 'U2',
3 => 'L5',
4 => 'D2',
5 => 'L16',
)
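A comparable zero-width split can also be sketched with lookarounds instead of \K. This is an alternative technique, not the one used above: it splits at every position that has a digit behind it and one of the letters U, D, L, R ahead of it.

```php
<?php
$string = 'U29R45U2L5D2L16';

// (?<=\d)   the previous character is a digit
// (?=[UDLR]) the next character is a direction letter
// Nothing is consumed, so no text is lost in the split.
$parts = preg_split('/(?<=\d)(?=[UDLR])/', $string);

var_export($parts);
```

Because the split point is never at the start or end of the string, no empty elements are produced and PREG_SPLIT_NO_EMPTY is not needed here.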

REGEX Pattern for Validation that check all string is integer and split into single integers

I have tried several times to write a pattern that validates that a given string is a natural number and splits it into single digits.
With my limited understanding of regex, the closest I could come up with is:
^([1-9])([0-9])*$ or ^([1-9])([0-9])([0-9])*$, or something like that.
These only capture the first digit and the last digit (or the first, second, and last digits).
What do I need to know to solve this problem? Thanks.

You may use a two step solution like
if (preg_match('~\A\d+\z~', $s)) { // if a string is all digits
    print_r(str_split($s)); // Split it into chars
}
See a PHP demo.
A one step regex solution:
(?:\G(?!\A)|\A(?=\d+\z))\d
See the regex demo
Details
(?:\G(?!\A)|\A(?=\d+\z)) - either the end of the previous match (\G(?!\A)) or (|) the start of string (\A) that is followed with 1 or more digits up to the end of the string ((?=\d+\z))
\d - a digit.
PHP demo:
$re = '/(?:\G(?!\A)|\A(?=\d+\z))\d/';
$str = '1234567890';
if (preg_match_all($re, $str, $matches)) {
    print_r($matches[0]);
}
Output:
Array
(
[0] => 1
[1] => 2
[2] => 3
[3] => 4
[4] => 5
[5] => 6
[6] => 7
[7] => 8
[8] => 9
[9] => 0
)
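The two-step approach can be wrapped in a small helper. This is only a sketch: the function name splitNaturalNumber is hypothetical, and it assumes that "natural number" means no leading zero (as the [1-9] in the question's own pattern suggests), which is stricter than the \d+ check above.

```php
<?php
/**
 * Returns the digits of $s as integers, or null if $s is not a natural
 * number written without a leading zero. (Hypothetical helper.)
 */
function splitNaturalNumber(string $s): ?array
{
    // \A and \z anchor to the very start and end of the string.
    if (!preg_match('/\A[1-9][0-9]*\z/', $s)) {
        return null;
    }
    // str_split() with the default length of 1 yields single characters.
    return array_map('intval', str_split($s));
}

var_export(splitNaturalNumber('1203')); // digits 1, 2, 0, 3
echo PHP_EOL;
var_export(splitNaturalNumber('0123')); // NULL (leading zero)
echo PHP_EOL;
var_export(splitNaturalNumber('12a3')); // NULL (not all digits)
```

Keeping validation and splitting as separate steps makes each one easy to change, for example to allow leading zeros again by switching the pattern back to \d+.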

PHP regex to extract special string

I am trying to use regex to extract a certain syntax, in my case something like "10.100" or "20.111", in which 2 numbers are separated by dot(.) . So if I provide "a 10.100", it will extract 10.100 from the string. If I provide "a 10.100 20.101", it will extract 10.100 and 20.101.
Until now I have tried to use
preg_match('/^.*([0-9]{1,2})[^\.]([0-9]{1,4}).*$/', $message, $array);
but still no luck. Please provide any suggestion because I don't have strong regex knowledge. Thanks.

You may use
\b[0-9]{1,2}\.[0-9]{1,4}\b
See the regex demo.
Details:
\b - a leading word boundary
[0-9]{1,2} - 1 or 2 digits
\. - a dot
[0-9]{1,4} - 1 to 4 digits
\b - a trailing word boundary.
If you do not care about the whole word option, just remove \b. Also, to match just 1 or more digits, you may use + instead of the limiting quantifiers. So, perhaps
[0-9]+\.[0-9]+
will also work for you.
See a PHP demo:
$re = '/[0-9]+\.[0-9]+/';
$str = 'I am trying to use regex to extract a certain syntax, in my case something like "10.100" or "20.111", in which 2 numbers are separated by dot(.) . So if I provide "a 10.100", it will extract 10.100 from the string. If I provide "a 10.100 20.101", it will extract 10.100 and 20.101.';
preg_match_all($re, $str, $matches);
print_r($matches[0]);
Output:
Array
(
[0] => 10.100
[1] => 20.111
[2] => 10.100
[3] => 10.100
[4] => 10.100
[5] => 20.101
[6] => 10.100
[7] => 20.101
)
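The difference the word boundaries make shows up when a number has more digits than the quantifiers allow. In this sketch (the input string is an assumption chosen for illustration), the bounded pattern without \b matches a fragment from the middle of the longer number, while the version with \b rejects it.

```php
<?php
$str = 'a 10.100 123.45678';

// With word boundaries, 123.45678 is rejected as a whole.
preg_match_all('/\b[0-9]{1,2}\.[0-9]{1,4}\b/', $str, $withBoundaries);
print_r($withBoundaries[0]);

// Without them, the same quantifiers pick "23.4567" out of 123.45678.
preg_match_all('/[0-9]{1,2}\.[0-9]{1,4}/', $str, $withoutBoundaries);
print_r($withoutBoundaries[0]);
```

The first call finds only 10.100. The second finds 10.100 and 23.4567.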

Regex: /\d+(?:\.\d+)/
1. \d+ matches one or more digits.
2. (?:\.\d+) matches a dot followed by one or more digits, like .1234
Try this code snippet here
<?php
ini_set('display_errors', 1);
$string='a 10.100 20.101';
preg_match_all('/\d+(?:\.\d+)/', $string, $array);
print_r($array);
Output:
Array
(
[0] => Array
(
[0] => 10.100
[1] => 20.101
)
)

$decimals = "10.5 100.50 10.250";
preg_match_all('/\b[\d]{2}\.\d+\b/', $decimals, $output);
print_r($output[0]);
Output:
Array
(
[0] => 10.5
[1] => 10.250
)
Regex Demo | Php Demo

ignoring upper case words with explode() in PHP

I'm new to PHP and I'm trying to explode data in a text file and put it into an array, then a table. The data in the text file looks like this:
THE MAN IN THE HIGH CASTLE by Philip K. Dick published 1965 born 1922
Assume that you cannot alter the original data. If I write:
$dataArray = explode(" ",$book);
that works for most of the data, but splits every word of the book title into a different element. Is there a way I can tell it not to split upper case words?

Instead of explode, you may want to try using preg_split for this. It splits strings using a regular expression:
$book = 'THE MAN IN THE HIGH CASTLE by Philip K. Dick published 1965 born 1922';
// Split on all-lowercase words
print_r(preg_split('/\b\s*[a-z]+\s*\b/', $book));
Output:
Array
(
[0] => THE MAN IN THE HIGH CASTLE
[1] => Philip K. Dick
[2] => 1965
[3] => 1922
)
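If every line follows the same "TITLE by AUTHOR published YEAR born YEAR" layout, which is an assumption based on the single sample line, named groups give labelled fields instead of positional ones. The pattern and the $record array are an illustrative sketch:

```php
<?php
$book = 'THE MAN IN THE HIGH CASTLE by Philip K. Dick published 1965 born 1922';

// Lazy .+? stops at the first " by " and the first " published ".
$pattern = '/^(?<title>.+?) by (?<author>.+?) published (?<published>\d{4}) born (?<born>\d{4})$/';

$record = null;
if (preg_match($pattern, $book, $m)) {
    $record = [
        'title'     => $m['title'],
        'author'    => $m['author'],
        'published' => (int) $m['published'],
        'born'      => (int) $m['born'],
    ];
}

print_r($record);
```

Lines that do not fit the layout leave $record as null, so they can be reported rather than silently split in the wrong place.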

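// Split on " by " with its surrounding spaces, and only once, so that a "by"
// inside a name or word does not cause an extra split.
$input = explode(" by ", $book, 2);
$title = $input[0];
$stuff = $input[1];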
