How to split string into an alphabetic string and a numeric string? - php

I need to use the preg_split() function to split a string into its alphabetic and numeric parts.
Ex: ABC10000 into ABC and 10000
GSQ39800 into GSQ and 39800
WERTYI67888 into WERTYI and 67888
The alphabetic characters (any number of them) always come first, followed by the numeric characters (any number of them).

Using preg_match:
$keywords = "ABC10000";
preg_match("#([a-zA-Z]+)(\d+)#", $keywords, $matches);
print_r($matches);
Output:
Array
(
[0] => ABC10000
[1] => ABC
[2] => 10000
)

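This is a tiny task. Match one or more capital letters with a character class and a + quantifier, then use \K to reset the start of the match so that the split happens at the zero-width position right after the letters: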
Code:
$in='WERTYI67888';
var_export(preg_split('/[A-Z]+\K/',$in));
Output:
array (
0 => 'WERTYI',
1 => '67888',
)
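If the input might not always have the letters-then-digits shape, an anchored preg_match makes it possible to detect that instead of silently getting a partial split. A minimal sketch; the helper name splitAlphaNumeric is made up for illustration and is not a PHP built-in:

```php
<?php
// Hypothetical helper: returns [letters, digits], or null when the input is
// not "one or more ASCII letters followed by one or more digits".
function splitAlphaNumeric(string $input): ?array
{
    // ^ and \z anchor the pattern so the whole string has to fit the shape.
    if (preg_match('/^([A-Za-z]+)(\d+)\z/', $input, $m) !== 1) {
        return null;
    }
    return [$m[1], $m[2]];
}

foreach (['ABC10000', 'GSQ39800', 'WERTYI67888', '123ABC'] as $s) {
    var_export(splitAlphaNumeric($s));
    echo PHP_EOL;
}
```

The last input, 123ABC, does not fit the shape, so the helper returns null for it.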

Related

Explode a string where the explode condition is bunch of specific characters

I'm looking for a way to explode a string. For example, I have the following string: (we don't count the beginning - 0x)
0xa9059xbb000000000000000000000000fc7a5f48a1a1b3f48e7dcb1f23a1ea24199af4d00000000000000000000000000000000000000000000000000000000000054368
which is actually an ETH transaction input. I need to explode this string into 3 parts. Treat each run of zeros as a single space; those spaces mark the points where the string should be exploded.
How can I do that?
preg_split()
This function uses a regular expression to split a string.
In this example, the string is split wherever two or more zeros occur in a row:
$arr = preg_split('/[0]{2,}/', $string);
print_r($arr);
echo PHP_EOL;
This will output the following:
Array
(
[0] => a9059xbb
[1] => fc7a5f48a1a1b3f48e7dcb1f23a1ea24199af4d
[2] => 54368
)
Be aware that you will have problems if a message itself has 00 in it; assuming 00 is used as a null byte for "end of string", this will not happen, though. Also, a field whose last digit is 0 loses that digit to the delimiter when zero padding follows it.
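Two pitfalls of splitting on runs of zeros are easy to see with small inputs: a field ending in 0 loses that digit, and a field containing 00 is cut in two. The strings below are made up purely for this sketch and are not real transaction data:

```php
<?php
// Made-up sample: field "a1d0", then zero padding, then field "54".
$trailingZero = 'a1d0' . '0000' . '54';
// Made-up sample: field "a100b2" (contains "00"), padding, then field "54".
$innerZeros = 'a100b2' . '0000' . '54';

// The final 0 of "a1d0" is swallowed by the delimiter run.
$first = preg_split('/0{2,}/', $trailingZero);
print_r($first);

// "a100b2" is split into two pieces, giving three parts instead of two.
$second = preg_split('/0{2,}/', $innerZeros);
print_r($second);
```

When the field widths are known in advance, splitting by position avoids both problems.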
preg_match()
This is an example using regular expressions. You can split at arbitrary points.
$string = 'a9059xbb000000000000000000000000fc7a5f48a1a1b3f48e7dcb1f23a1ea24199af4d00000000000000000000000000000000000000000000000000000000000054368';
print_r($string);
echo PHP_EOL;
$res = preg_match('/(.{4})(.{32})(.{32})/', $string, $matches);
print_r($matches);
echo PHP_EOL;
This outputs:
a9059xbb000000000000000000000000fc7a5f48a1a1b3f48e7dcb1f23a1ea24199af4d00000000000000000000000000000000000000000000000000000000000054368
Array
(
[0] => a9059xbb000000000000000000000000fc7a5f48a1a1b3f48e7dcb1f23a1ea24199a
[1] => a905
[2] => 9xbb000000000000000000000000fc7a
[3] => 5f48a1a1b3f48e7dcb1f23a1ea24199a
)
As you can see, /(.{4})(.{32})(.{32})/ captures 4 characters, then 32, and then 32 again (each . matches one character, which here is one hexadecimal digit, not one byte). Capturing groups are made with () around what you want to find. They appear in the $matches array (index 0 is always the whole matched string).
In case you want to ignore certain parts you can express that as well:
/(.{4})9x(.{32}).{4}(.{32})/
This changes the found string:
Array
(
[0] => a9059xbb000000000000000000000000fc7a5f48a1a1b3f48e7dcb1f23a1ea24199af4d000
[1] => a905
[2] => bb000000000000000000000000fc7a5f
[3] => a1b3f48e7dcb1f23a1ea24199af4d000
)
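Numeric indexes get hard to follow once there are several groups, and PCRE named groups can help. A small sketch using the same 4/32/32 widths; the group names first, second and third and the sample string are assumptions made for the example:

```php
<?php
// Made-up input with known widths: 4 + 32 + 32 characters.
$string = 'abcd' . str_repeat('1', 32) . str_repeat('2', 32);

// (?<name>...) creates a named capturing group.
$ok = preg_match(
    '/^(?<first>.{4})(?<second>.{32})(?<third>.{32})/',
    $string,
    $m
);

if ($ok === 1) {
    // Named groups appear under their name (and also under a numeric index).
    echo $m['first'], PHP_EOL;
    echo $m['second'], PHP_EOL;
    echo $m['third'], PHP_EOL;
}
```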
Links
PHP documentation for the mentioned functions:
https://www.php.net/manual/en/function.preg-split.php
https://www.php.net/manual/en/book.pcre.php
Play around with the second regular expression using this demo:
https://regex101.com/r/pfZtH8/1
If you will always explode them at the same points (4 bytes (8 hexadecimal digits), 32 bytes (64 hexadecimal digits), 32 bytes (64 hexadecimal digits)), you could use substr().
$input = "0xa9059xbb000000000000000000000000fc7a5f48a1a1b3f48e7dcb1f23a1ea24199af4d00000000000000000000000000000000000000000000000000000000000054368";
$first = substr($input,2,8);
$second = substr($input,10,64);
$third = substr($input,74,64);
print_r($first);
print "<br>";
print_r($second);
print "<br>";
print_r($third);
print "<br>";
This outputs:
a9059xbb
000000000000000000000000fc7a5f48a1a1b3f48e7dcb1f23a1ea24199af4d0
0000000000000000000000000000000000000000000000000000000000054368
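If the zero padding itself is not wanted, the fixed-width fields can be trimmed afterwards with ltrim(), which removes only leading zeros and therefore keeps a trailing 0. A sketch with a made-up, shortened payload built to the same 8/64/64 layout:

```php
<?php
// Made-up input: "0x" + 8 characters + two fields left-padded to 64 with zeros.
$input = '0x'
    . 'a9059xbb'
    . str_pad('fc7ad0', 64, '0', STR_PAD_LEFT)
    . str_pad('54368', 64, '0', STR_PAD_LEFT);

$first  = substr($input, 2, 8);
// ltrim with '0' strips the padding on the left only.
$second = ltrim(substr($input, 10, 64), '0');
$third  = ltrim(substr($input, 74, 64), '0');

echo $first, PHP_EOL, $second, PHP_EOL, $third, PHP_EOL;
```

Note that a field consisting only of zeros becomes an empty string with this approach.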

PHP split string into integer, string and special character

I need to split strings of the format CF12:10 into an array like below,
[0] => CF, [1] => 12, [2] => 10
Numbers and String of the provided string can be any length. I have found the php preg_match function but don't know how to make regular expression for my case. Any solution would be highly appreciated.
You could use this regex to match the individual parts:
^(\D+)(\d+):(.*)$
It matches the start of the string, one or more non-digit characters (\D+), one or more digits (\d+), a colon, and then whatever follows the colon up to the end of the string. In PHP you can use preg_match to capture these groups; array_shift then drops the full match at index 0:
$input = 'CF12:10';
preg_match('/^(\D+)(\d+):(.*)$/', $input, $matches);
array_shift($matches);
print_r($matches);
Output:
Array
(
[0] => CF
[1] => 12
[2] => 10
)
Demo on 3v4l.org
Try the following code:
$str = 'C12:10';
// preg_match returns 1 or 0; the captured groups end up in $matches
preg_match('~^(.*?)(\d+):(.*)~m', $str, $matches);
array_shift($matches); // drop the full match at index 0
echo '<pre>';print_r($matches);
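Since the original question mentions splitting, the same result can also be sketched with preg_split and lookarounds: split at the zero-width boundary between a letter and a digit, or on the colon. This assumes the prefix consists of ASCII letters only, which the question does not state explicitly:

```php
<?php
// (?<=[A-Za-z])(?=\d) is a zero-width position between a letter and a digit;
// the second alternative consumes the colon itself.
$pattern = '/(?<=[A-Za-z])(?=\d)|:/';

$parts = preg_split($pattern, 'CF12:10');
print_r($parts);

// Made-up second input to show that the lengths may vary.
$longer = preg_split($pattern, 'ABCD1:2345');
print_r($longer);
```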

Match all regex start with but not end with characters

I have an array of words.
I want to match all words starting with '___',
but some words also have '___' at the end,
and I do not want to match those.
Here is my word list:
___apis
___db_tables
___groups
___inbox_messages
___sent_messages
___todo
___users
___users_groups
____4underscorestarting
sinan
sssssssssss
test_______dfg
testttttt
tet____
tttttttttt
uuuuuuuu
vvvvvvvvvvvv
wwwwwwww
zzzzzzzzzz
I want to match only these words:
___apis
___db_tables
___groups
___inbox_messages
___sent_messages
___todo
___users
___users_groups
I do not want to match these words:
tet____
test_______dfg
____4underscorestarting
The solution using preg_grep function:
// $arr is your initial array of words
$matched = preg_grep("/^_{3}[^_].*/", $arr);
print_r($matched);
The output:
Array
(
[0] => ___apis
[1] => ___db_tables
[2] => ___groups
[3] => ___inbox_messages
[4] => ___sent_messages
[5] => ___todo
[6] => ___users
[7] => ___users_groups
)
Update: To get the opposite matches, use one of the following:
the regex pattern /^(?!_{3}[^_])/ (the lookahead has to include [^_], otherwise words such as ____4underscorestarting are skipped)
the third argument of the preg_grep function set to PREG_GREP_INVERT: preg_grep("/^_{3}[^_].*/", $arr, PREG_GREP_INVERT)
http://php.net/manual/en/function.preg-grep.php
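A short sketch of both directions on a shortened version of the word list. preg_grep keeps the original array keys, so array_values() is used here to reindex the result; the variable names are only illustrative:

```php
<?php
$arr = [
    '___apis',
    '___users_groups',
    '____4underscorestarting',
    'sinan',
    'test_______dfg',
    'tet____',
];
$pattern = '/^_{3}[^_].*/';

// Words with exactly three leading underscores.
$matched = preg_grep($pattern, $arr);
// Everything else; keys 2..5 of $arr are preserved in the result.
$rest = preg_grep($pattern, $arr, PREG_GREP_INVERT);

print_r(array_values($matched));
print_r(array_values($rest));
```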
^___[a-z].*
This should do it for you. See demo.
https://regex101.com/r/hHRg8d/1
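^_{3}.*[^(_{3})]$
Starts (^) with 3 '_': _{3}
Can contain anything in the middle: .*
Ends ($) with [^(_{3})]. Note that this is a character class, so it only requires the last character not to be one of ( _ { 3 } ); it rejects words ending in an underscore but does not test for a run of three, and the pattern still matches ____4underscorestarting.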
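The "starts with exactly three underscores and does not end with three underscores" condition can also be expressed with lookarounds rather than a character class. A minimal sketch; the word ___tmp___ is made up to exercise the end-of-string check:

```php
<?php
$words = [
    '___apis',
    '___db_tables',
    '____4underscorestarting',
    'tet____',
    'test_______dfg',
    '___tmp___',
];

// ^_{3}(?!_)  : exactly three leading underscores (no fourth one)
// (?<!___)$   : the string must not end with three underscores
$pattern = '/^_{3}(?!_).*(?<!___)$/';

$result = array_values(preg_grep($pattern, $words));
print_r($result);
```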

Using regex to not match periods between numbers

I have a regex code that splits strings between [.!?], and it works, but I'm trying to add something else to the regex code. I'm trying to make it so that it doesn't match [.] that's between numbers. Is that possible? So, like the example below:
$input = "one.two!three?4.000.";
$inputX = preg_split("~(?>[.!?]+)\K(?!$)~", $input);
print_r($inputX);
Result:
Array ( [0] => one. [1] => two! [2] => three? [3] => 4. [4] => 000. )
Need Result:
Array ( [0] => one. [1] => two! [2] => three? [3] => 4.000. )
You should be able to split on this:
(?<=(?<!\d(?=[.!?]+\d))[.!?])(?![.!?]|$)
https://regex101.com/r/kQ6zO4/1
It uses lookarounds to determine where to split. It looks behind to try to match anything in the set [.!?] one or more times as long as it isn't preceded by and succeeded by a digit.
It also won't return the last empty match by ensuring the last set isn't the end of the string.
UPDATE:
This should be much more efficient actually:
(?!\d+\.\d+).+?[.!?]+\K(?!$)
https://regex101.com/r/eN7rS8/1
Here is another possibility using preg_split flags:
$input = "one.two!three???4.000.";
$inputX = preg_split("~(\d+\.\d+[.!?]+|.*?[.!?]+)~", $input, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
print_r($inputX);
It includes the delimiter in the split and ignores empty matches. The regex can be simplified to ((?:\d+\.\d+|.*?)[.!?]+), but I think what is in the code sample above is more efficient.
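For reference, a quick check of the \K-based pattern and the flag-based pattern against the inputs used in this thread. This is only a sketch covering these two inputs, not a general sentence splitter:

```php
<?php
// \K-based pattern: split after each run of [.!?] unless it is the end of the
// string; the negative lookahead keeps a match from starting at "4.000".
$viaK = preg_split('~(?!\d+\.\d+).+?[.!?]+\K(?!$)~', 'one.two!three?4.000.');
print_r($viaK);

// Flag-based variant: every chunk is a captured delimiter, and the empty
// strings between the chunks are dropped.
$viaFlags = preg_split(
    '~(\d+\.\d+[.!?]+|.*?[.!?]+)~',
    'one.two!three???4.000.',
    -1,
    PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
);
print_r($viaFlags);
```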

Match rest of string with regex

I have strings like these:
ch:keyword
ch:test
ch:some_text
I need a regular expression which will match all of the strings, however, it must not match the following:
ch: (ch: is followed by a space, or any number of spaces)
ch: (ch: is followed by nothing)
I am able to deduce the length of the string with the 'ch:' in it.
Any help would be appreciated; I am using PHP's preg_match()
Edit: I have tried this:
preg_match("/^ch:[A-Za-z_0-9]/", $str, $matches)
However, this only matches 1 character after "ch:". I tried putting a * after the closing square bracket, but this matches spaces, which I don't want.
preg_match('/^ch:(\S+)/', $string, $matches);
print_r($matches);
\S+ is for matching 1 or more non-space characters. This should work for you.
Try this regular expression:
^ch:\S.*$
$str = <<<TEXT
ch:keyword
ch:test
ch:
ch:some_text
ch: red
TEXT;
preg_match_all('|ch\:(\S+)|', $str, $matches);
echo '<pre>'; print_r($matches); echo '</pre>';
Output:
Array
(
[0] => Array
(
[0] => ch:keyword
[1] => ch:test
[2] => ch:some_text
)
[1] => Array
(
[0] => keyword
[1] => test
[2] => some_text
)
)
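Try using this (a PCRE lookbehind cannot contain an unbounded quantifier such as +, so it checks only the single preceding character, which is enough to rule out leading spaces):
preg_match('/(?<! )ch:[^ ].*/', $str);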
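To validate a single string and get the keyword in one step, the pattern can be anchored at both ends. A minimal sketch; extractKeyword is a hypothetical helper name, and it assumes the keyword may not contain any whitespace:

```php
<?php
// Hypothetical helper: returns the text after "ch:", or null when "ch:" is
// followed by nothing or by whitespace.
function extractKeyword(string $str): ?string
{
    // \S+ needs at least one non-whitespace character; \z is the very end.
    if (preg_match('/^ch:(\S+)\z/', $str, $m) === 1) {
        return $m[1];
    }
    return null;
}

foreach (['ch:keyword', 'ch:some_text', 'ch:', 'ch:   ', 'ch: red'] as $s) {
    var_export(extractKeyword($s));
    echo PHP_EOL;
}
```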
