Check string words start from # - php

I have a string and need to get the list of all words that start with #.
$string = "Hello #bablu, This is my friend #roshan. Say hi to all. Also, I introduce 1 friend that is amit#gmail.com.";
In this string, I need to get only bablu and roshan. I do not want anything from amit#gmail.com, because that is an address and not a # word. I tried explode() on #, but that splits the address too.
$explode = explode('#',$string);
print_r($explode);
How can I get only the # words in PHP? This is what I get now:
[
0 => "",
1 => "bablu",
2 => "",
3 => "roshan",
4 => "amit",
5 => "gmail.com"
]
My expected answer would be:
[
0 => "bablu",
1 => "roshan"
]

explode() is not the right tool here; all you need is preg_match_all:
$string = "Hello #bablu, This is my friend #roshan. Say hi to all. Also, I introduce 1 friend that is amit#gmail.com.";
preg_match_all('/\B#([a-zA-Z]+)/', $string, $matches);
print_r($matches[1]);
Output:
Array
(
[0] => bablu
[1] => roshan
)
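\B matches at a position that is not a word boundary. Between the space and # in " #bablu" there is no boundary (both are non-word characters), so the match succeeds. Between t and # in amit#gmail.com there is a boundary, so that address is skipped.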
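A minimal sketch of the same idea wrapped in a function. The helper name extractHashWords is hypothetical, and it assumes that digits and underscores should also count as part of a # word (\w+ instead of [a-zA-Z]+):

```php
<?php
// Hypothetical helper, not part of the original answer.
function extractHashWords(string $text): array
{
    // \B requires that the position before # is NOT a word boundary,
    // so a # that directly follows a letter (amit#gmail) is ignored.
    // \w+ is an assumption: it also accepts digits and underscores.
    preg_match_all('/\B#(\w+)/', $text, $matches);

    // Index 1 holds the text captured by the first group (without the #).
    return $matches[1];
}

$string = "Hello #bablu, This is my friend #roshan. Say hi to all. Also, I introduce 1 friend that is amit#gmail.com.";
print_r(extractHashWords($string));
```

If only letters should be accepted, swap \w+ back to [a-zA-Z]+ as in the answer above.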

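It can also be approached like this:
$explode = explode(' #',$string);
That is, by adding a space before #, so that amit#gmail.com is not split. Note that each element after the first still carries the text that follows the word (for example "bablu, This is my friend"), so it needs further trimming.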

Related

Split string after each number

I have a database full of strings that I'd like to split into an array. Each string contains a list of directions that begin with a letter (U, D, L, R for Up, Down, Left, Right) and a number to tell how far to go in that direction.
Here is an example of one string.
$string = "U29R45U2L5D2L16";
My desired result:
['U29', 'R45', 'U2', 'L5', 'D2', 'L16']
I thought I could just loop through the string, but I don't know how to tell whether the number is one or more digits in length.
You can use preg_split to break up the string, splitting on something that looks like U, D, L or R followed by digits, and using the PREG_SPLIT_DELIM_CAPTURE flag to keep the matched text (PREG_SPLIT_NO_EMPTY drops the empty strings left between matches):
$string = "U29R45U2L5D2L16";
print_r(preg_split('/([UDLR]\d+)/', $string, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY));
Output:
Array (
[0] => U29
[1] => R45
[2] => U2
[3] => L5
[4] => D2
[5] => L16
)
Demo on 3v4l.org
A regular expression should help you:
<?php
$string = "U29R45U2L5D2L16";
preg_match_all("/[A-Z]\d+/", $string, $matches);
var_dump($matches);
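If the letter and the number are needed separately afterwards, capture groups can do that in the same pass. This is only a sketch: the array keys token, direction and distance are made up for illustration, and the character class is narrowed to [UDLR] on the assumption that no other letters occur.

```php
<?php
$string = "U29R45U2L5D2L16";

// PREG_SET_ORDER returns one entry per match: [full match, group 1, group 2].
preg_match_all('/([UDLR])(\d+)/', $string, $matches, PREG_SET_ORDER);

$steps = [];
foreach ($matches as [$token, $direction, $distance]) {
    $steps[] = [
        'token'     => $token,          // e.g. "U29"
        'direction' => $direction,      // e.g. "U"
        'distance'  => (int) $distance, // e.g. 29
    ];
}

print_r(array_column($steps, 'token'));
```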
Because this task is about text extraction and not about text validation, you can merely split on the zero-width position after one or more digits. In other words, match one or more digits, then forget them with \K so that they are not consumed while splitting.
Code: (Demo)
$string = "U29R45U2L5D2L16";
var_export(
preg_split(
'/\d+\K/',
$string,
0,
PREG_SPLIT_NO_EMPTY
)
);
Output:
array (
0 => 'U29',
1 => 'R45',
2 => 'U2',
3 => 'L5',
4 => 'D2',
5 => 'L16',
)

How to preg_split without losing a character?

I have a string like this
$string = "Hello; how are you;Hey, I am fine";
$new = preg_split("/;\w/", $string);
print_r($new);
I am trying to split the string only when there is no white-space between the words and ";". But when I do this, I lose the H from Hey. It's probably because the split happens through the recognition of ;H. Could someone tell me how to prevent this?
My output:
$array = [
0 => [
0 => 'Hello; how are you ',
1 => 0,
],
1 => [
0 => 'ey, I am fine',
1 => 21,
],
]
You might use a word boundary \b:
\b;\b
$string = "Hello; how are you;Hey, I am fine";
$new = preg_split("/\b;\b/", $string);
print_r($new);
Demo
Or a negative lookahead and negative lookbehind
(?<! );(?! )
Demo
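A short sketch of the lookaround pattern applied to the sample string, next to the word boundary version. The second input, "one;(two)", is made up to show where the two patterns differ: \b needs a word character next to the semicolon, while the lookarounds only reject spaces.

```php
<?php
$string = "Hello; how are you;Hey, I am fine";

// Split on ";" only when it has no space on either side.
$parts = preg_split('/(?<! );(?! )/', $string);
print_r($parts);

// Made-up edge case: the semicolon is followed by "(" (a non-word character).
$edge = "one;(two)";
$boundary   = preg_split('/\b;\b/', $edge);        // no split: no boundary between ";" and "("
$lookaround = preg_split('/(?<! );(?! )/', $edge); // split: no space on either side
print_r($boundary);
print_r($lookaround);
```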
Lookarounds cost more steps. In terms of pattern efficiency, a word boundary is better, and it is still a zero-width assertion, so no characters are consumed.
In well-formed English, you won't ever have to check for a space before a semi-colon, so only 1 word boundary seems sufficient (I don't know if malformed English is possible because it is not represented in your sample string).
If you want to acquire the offset value, preg_split() has a flag for that.
Code: (Demo)
$string = "Hello; how are you;Hey, I am fine";
$new = preg_split("/;\b/", $string, -1, PREG_SPLIT_OFFSET_CAPTURE);
var_export($new);
Output:
array (
0 =>
array (
0 => 'Hello; how are you',
1 => 0,
),
1 =>
array (
0 => 'Hey, I am fine',
1 => 19,
),
)
Use preg_split with this regex: ;(?=\w). Then you will not lose the H.
You are consuming the \w in your regex, and you don't want that. Therefore, do this:
$new = preg_split("/;(?=\w)/", $string);
Parentheses normally define a capture group, but (?=...) is a lookahead: it asserts that a word character follows without consuming it.
Check it out here https://3v4l.org/Q77LZ
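To see the difference side by side, here is a small comparison of the original pattern and the lookahead version on the string from the question (the variable names are only for illustration):

```php
<?php
$string = "Hello; how are you;Hey, I am fine";

// ";\w" consumes the word character, so the "H" becomes part of the delimiter.
$consumed = preg_split('/;\w/', $string);

// ";(?=\w)" only checks that a word character follows; the "H" stays in the result.
$kept = preg_split('/;(?=\w)/', $string);

print_r($consumed);
print_r($kept);
```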

Extract urls from string without spaces between

Let's say I have a string like this:
$urlsString = "http://foo.com/barhttps://bar.com//foo.com/foo/bar";
and I want to get an array like this:
array(
[0] => "http://foo.com/bar",
[1] => "https://bar.com",
[2] => "//foo.com/foo/bar"
);
I'm trying something like:
preg_split("~((https?:)?//)~", $urlsString, PREG_SPLIT_NO_EMPTY|PREG_SPLIT_DELIM_CAPTURE);
Where PREG_SPLIT_DELIM_CAPTURE definition is:
If this flag is set, parenthesized expression in the delimiter pattern will be captured and returned as well.
That said, the above preg_split returns:
array (size=3)
0 => string '' (length=0)
1 => string 'foo.com/bar' (length=11)
2 => string 'bar.com//foo.com/foo/bar' (length=24)
Any idea of what I'm doing wrong or any other idea?
PS: I was using this regex until I've realized that it doesn't cover this case.
Edit:
As @sidyll pointed out, I'm missing the $limit in the preg_split parameters. Anyway, there is something wrong with my regex, so I will use @WiktorStribiżew's suggestion.
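A brief sketch of why the missing $limit produces exactly the three elements shown above: the flag constants are plain integers (PREG_SPLIT_NO_EMPTY is 1, PREG_SPLIT_DELIM_CAPTURE is 2), so in the third position their combination is read as a limit of 3 and no flags are applied. The variable names are only for illustration.

```php
<?php
$urlsString = "http://foo.com/barhttps://bar.com//foo.com/foo/bar";
$pattern = '~((https?:)?//)~';

$flags = PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE; // 1 | 2 = 3

// The call from the question: the flags land in the $limit slot.
$asWritten = preg_split($pattern, $urlsString, $flags);

// The same call written with an explicit limit of 3 and no flags.
$explicitLimit = preg_split($pattern, $urlsString, 3);

var_dump($asWritten === $explicitLimit);
print_r($asWritten);
```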
You may use a preg_match_all with the following regex:
'~(?:https?:)?//.*?(?=$|(?:https?:)?//)~'
See the regex demo.
Details:
(?:https?:)? - https: or http:, optional (1 or 0 times)
// - double /
.*? - any 0+ chars other than line break as few as possible up to the first
(?=$|(?:https?:)?//) - either of the two:
$ - end of string
(?:https?:)?// - https: or http:, optional (1 or 0 times), followed with a double /
Below is a PHP demo:
$urlsString = "http://foo.com/barhttps://bar.com//foo.com/foo/bar";
preg_match_all('~(?:https?:)?//.*?(?=$|(?:https?:)?//)~', $urlsString, $urls);
print_r($urls[0]);
// => Array ( [0] => http://foo.com/bar [1] => https://bar.com [2] => //foo.com/foo/bar )

PCRE regex for movie data

I have a string like this:
<14> south.park.s14e01.locdog.avi [190713856]
I need a PHP regexp to get an array like this:
array(14, 'south.park.s14e01.locdog.avi', 190713856)
Please help.
preg_match('/^<(\d+)> \s+ (\S+) \s+ \[(\d+)\]$/x', $input, $your_array);
Where your desired results are in $your_array starting at index 1.
$test = '<14> south.park.s14e01.locdog.avi [190713856]';
preg_match('/<(\d{2})>\s(.+)\s\[(\d{9})\]/',$test,$m);
print_r($m);//[1] => 14 [2] => south.park.s14e01.locdog.avi [3] => 190713856
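Since the desired array holds integers and preg_match returns strings, a small wrapper can do the casting. This is a sketch only: the function name parseMovieLine and the choice to return null on a non-matching line are assumptions, and the pattern is the first answer's pattern written without the x modifier.

```php
<?php
// Hypothetical helper; returns null when the line does not match (an assumption).
function parseMovieLine(string $line): ?array
{
    // <digits>, whitespace, a run of non-space characters, whitespace, [digits]
    $pattern = '/^<(\d+)>\s+(\S+)\s+\[(\d+)\]$/';

    if (preg_match($pattern, $line, $m) !== 1) {
        return null;
    }

    // $m[0] is the full match; the captured groups start at index 1.
    return [(int) $m[1], $m[2], (int) $m[3]];
}

var_export(parseMovieLine('<14> south.park.s14e01.locdog.avi [190713856]'));
```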

Retain Delimiters when Splitting String

Edit: OK, I can't read, thanks to Col. Shrapnel for the help. If anyone comes here looking for the same thing to be answered...
print_r(preg_split('/([\!|\?|\.|\!\?])/', $string, null, PREG_SPLIT_DELIM_CAPTURE));
Is there any way to split a string on a set of delimiters, and retain the position and character(s) of the delimiter after the split?
For example, using delimiters of ! ? . !? turning this:
$string = 'Hello. A question? How strange! Maybe even surreal!? Who knows.';
into this
array('Hello', '.', 'A question', '?', 'How strange', '!', 'Maybe even surreal', '!?', 'Who knows', '.');
Currently I'm trying to use print_r(preg_split('/([\!|\?|\.|\!\?])/', $string)); to capture the delimiters as a subpattern, but I'm not having much luck.
Your comment sounds like you've found the relevant flag, but your regex was a little off, so I'm going to add this anyway:
preg_split('/(!\?|[!?.])/', $string, null, PREG_SPLIT_DELIM_CAPTURE);
Note that this will leave spaces at the beginning of every string after the first, so you'll probably want to run them all through trim() as well.
Results:
$string = 'Hello. A question? How strange! Maybe even surreal!? Who knows.';
print_r(preg_split('/(!\?|[!?.])/', $string, null, PREG_SPLIT_DELIM_CAPTURE));
Array
(
[0] => Hello
[1] => .
[2] => A question
[3] => ?
[4] => How strange
[5] => !
[6] => Maybe even surreal
[7] => !?
[8] => Who knows
[9] => .
[10] =>
)
From PHP 8.1, passing null as the limit parameter of preg_split() is deprecated because an integer is expected. When seeking unlimited output elements from the return value, it is acceptable to use 0 or -1. (Demo)
To avoid empty elements in the returned array, I recommend PREG_SPLIT_NO_EMPTY as an additional flag. (Demo)
var_export(
preg_split(
'/(!\?|[!?.])/',
$string,
0,
PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
)
);
Since PHP8, it is technically possible to omit the limit parameter and declare flags by using named parameters.
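A minimal sketch of the named-parameter form, assuming PHP 8.0 or later. The parameter names pattern, subject and flags are the ones in the preg_split() signature; the array_map('trim', ...) step is an addition here to remove the leading spaces mentioned earlier.

```php
<?php
$string = 'Hello. A question? How strange! Maybe even surreal!? Who knows.';

// Named arguments (PHP 8.0+): $limit is skipped and keeps its default of -1.
$parts = preg_split(
    pattern: '/(!\?|[!?.])/',
    subject: $string,
    flags: PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
);

// Remove the space that follows each delimiter in the source sentence.
$parts = array_map('trim', $parts);

var_export($parts);
```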
Simply add the PREG_SPLIT_DELIM_CAPTURE to the preg_split function:
$str = 'Hello. A question? How strange!';
$var = preg_split('/([!?.])/', $str, 0, PREG_SPLIT_DELIM_CAPTURE);
$var = array(
0 => "Hello",
1 => ".",
2 => " A question",
3 => "?",
4 => " How strange",
5 => "!",
6 => "",
);
You can also split on the space after a ., !, ? or !?. But this can only be used if you can guarantee that there is a space after such a character.
You can do this by matching a space preceded by a positive lookbehind, (?<=\.|\?|!), which makes the regex
'/(?<=\.|\?|!) /'
Then you'll have to check whether each resulting string ends with !?: if so, cut off the last two characters as the delimiter. If not, cut off the last character.
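The two steps could be sketched as follows. This assumes, as stated above, that every delimiter except the last is followed by a single space and that every sentence ends with one of the delimiters; the lookbehind is written as the character class [.?!], which should be equivalent to the alternation.

```php
<?php
$string = 'Hello. A question? How strange! Maybe even surreal!? Who knows.';

$result = [];

// Step 1: split on a space that directly follows ".", "?" or "!".
foreach (preg_split('/(?<=[.?!]) /', $string) as $sentence) {
    // Step 2: the delimiter is the last two characters for "!?", otherwise the last one.
    $length = (substr($sentence, -2) === '!?') ? 2 : 1;

    $result[] = substr($sentence, 0, -$length); // the text
    $result[] = substr($sentence, -$length);    // the delimiter
}

var_export($result);
```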
