PHP: Get value from string with preg_match, limited characters

Sorry for the title, I don't know how to explain it better.
I must get 354607 from the following string:
...jLHoiAAD1037354607Ij0Ij1Ij2...
The "354607" is dynamic, but it has the "1037" in any case before, and is in any case exactly 6 characters long.
The problem is, the string is about 50.000 up to 1.000.000 characters long. So I want a resource-friendly solution.
I tried it with:
preg_match_all("/1037(.*?){0,5}/", $new, $search1037);
and:
preg_match_all("/1037(.*?{0,5})/", $new, $search1037);
but, I don't know how to use regular expressions correctly.
I hope someone could help me!
Thanks a lot!

Use \d{6}, which matches exactly 6 digits:
preg_match_all("/1037(\d{6})/", $new, $search1037);
returns an array with
array(
0 => array(
0 => 1037354607
),
1 => array(
0 => 354607
)
)
Check this demo
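Since only a single value is needed, a minimal sketch with preg_match() may be enough: unlike preg_match_all(), it stops at the first match. The sample string is just the excerpt from the question, and $value is a name chosen here for illustration.

```php
<?php
// Excerpt from the question; the real string is far longer.
$new = '...jLHoiAAD1037354607Ij0Ij1Ij2...';

// preg_match() returns 1 on a match, 0 on no match, false on error.
if (preg_match('/1037(\d{6})/', $new, $m) === 1) {
    $value = $m[1];   // the six digits captured after the marker
} else {
    $value = null;    // marker not found
}

echo $value, PHP_EOL;
```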

Since you're concerned with finding a resource-friendly solution, you may be better off not using preg_match. Regular expressions tend to require more overhead in general, as discussed in this SO question.
Instead, you could use strstr():
$string = strstr($string,'1037');
This returns the part of $string that starts at the first occurrence of '1037' and runs to the end of the string (or false if '1037' does not occur). Then, use substr():
$string = substr($string,4,6);
This returns the 6 characters of $string starting at offset 4, i.e. immediately after the 4-character marker '1037', which occupies offsets 0 to 3.
For fun, in one line:
$string = substr(strstr($string,'1037'),4,6);
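Note that strstr() copies the whole remainder of the string before substr() trims it. A possible variant that only looks up the offset with strpos() is sketched below. The function name extractAfterMarker and its parameters are hypothetical, and whether it is measurably faster on your data is something to benchmark rather than assume.

```php
<?php
// Hypothetical helper: returns the $length characters that follow the
// first occurrence of $marker, or null if they cannot be found.
function extractAfterMarker(string $haystack, string $marker = '1037', int $length = 6): ?string
{
    $pos = strpos($haystack, $marker);
    if ($pos === false) {
        return null; // marker does not occur at all
    }

    // Cast because substr() may return false on older PHP versions.
    $value = (string) substr($haystack, $pos + strlen($marker), $length);

    // Reject a truncated value at the very end of the input.
    return strlen($value) === $length ? $value : null;
}

echo extractAfterMarker('...jLHoiAAD1037354607Ij0Ij1Ij2...'), PHP_EOL;
```

Unlike the one-liner, this sketch also handles the case where the marker is missing, instead of passing false on to substr().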

Related

How to split combined string

I have a string which looks like this:
21/04/2014,16:57:28,19,0,2021/04/2014,16:57:48,19,0,20
I would like to split it so that I get something like the following:
21/04/2014,16:57:28,19,0,20
21/04/2014,16:57:48,19,0,20
I have tried using php's substr which I thought was giving results but it duplicated this '21/04/2014,16:57:48,19,0,20' twice.
$data3 = array(substr($data1, -27), substr($data1, 27));
Even tried a regex with no luck.
If the length of the parts you want to get is constant, you can use the str_split function with its second parameter.
$data = str_split($string, 27);
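As a quick check against the string from the question, here is a small sketch. It assumes every record is exactly 27 characters long; $recordLength is a name introduced here for illustration.

```php
<?php
$string = '21/04/2014,16:57:28,19,0,2021/04/2014,16:57:48,19,0,20';

// Assumption: each record has a fixed length of 27 characters.
$recordLength = 27;

// str_split() cuts the string into consecutive chunks of that length.
$data = str_split($string, $recordLength);

print_r($data);
```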
Elon Than's answer is the perfect solution if, as he states, the length is constant. However, I just thought I'd add this solution in case (for example) the '19' could also be '3' (or whatever):
preg_match_all("/\d{2}\/\d{2}\/\d{4}\,\d{2}:\d{2}:\d{2},\d{1,2},\d{1},\d{2}/", $string, $matches);
var_dump($matches[0]);
Notice the \d{1,2} will include any number that is 1 or 2 digits.
$data3 = array(substr($data1, 0, 27), substr($data1, 27));

Adding Character between numbers

I have an integer $client_version=1000. I need to add dots between every digit of this integer so that it looks like 1.0.0.0, and save it in a new variable as a string.
How can I do this?
Easy enough:
$client_version = 1000;
$dotted = join(".",str_split($client_version));
Note that this will always split it so that there is only one character between the dots. If you want something like 1.00.0, you'll need to change your question to explain more about what you're trying to do and what patterns you need.
PHP offers the function array str_split ( string $string [, int $split_length = 1 ] ) to convert a string to a character-array or blocks of characters.
In your case, invoking str_split((string)1000, 1) or str_split((string)1000) will result in:
Array
(
[0] => 1
[1] => 0
[2] => 0
[3] => 0
)
Code:
implode('.',str_split((string)1000))
Result: 1.0.0.0
For a more general, yet less well known approach, based on Regular Expression see this gist and this tangentially related topic on SO.
Code:
preg_match_all('/(.{1})/', (string)1000, $matches);
echo implode('.', $matches[0]);
Result: 1.0.0.0
Use str_split to get an array of chars and then implode them.
$client_version = 1000;
$client_version_chars = str_split($client_version);
$client_version_with_dots = implode('.', $client_version_chars);

RegEx for hashtag separated string

I have bunch of strings like this:
a#aax1aay222b#bbx4bby555bbz6c#mmm1d#ara1e#abc
And what I need to do is to split them up based on the hashtag position to something like this:
Array
(
[0] => A
[1] => AAX1AAY222
[2] => B
[3] => BBX4BBY555BBZ6
[4] => C
[5] => MMM1
[6] => D
[7] => ARA1
[8] => E
[9] => ABC
)
So, as you see the character right behind the hashtag is captured plus everything after the hashtag just right before the next char+hashtag.
I've the following RegEx which works fine only when I have a numeric value in the end of each part.
Here is the RegEx set up:
preg_split('/([A-Z])+#/', $text, 0, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
And it works fine with something like this:
C#mmm1D#ara1
But, if I change it to this (removing the numbers):
C#mmmD#ara
Then it will be the result, which is not good:
Array
(
[0] => C
[1] => D
)
I've looked at this question and this one also, which are similar but none of them worked for me.
So, my question is: why does it work only if it is followed by a number, and how can I solve it?
Here you can see some of the sample strings which I have:
a#123b#abcc#def456 // A:123, B:ABC, C:DEF456
a#abc1def2efg3b#abcdefc#8 // A:ABC1DEF2EFG3, B:ABCDEF, C:8
a#abcdef123b#5c#xyz789 // A:ABCDEF123, B:5, C:XYZ789
P.S. Strings are case-insensitive.
P.P.S. If you're wondering what the hell these strings are: they are user-submitted answers to a questionnaire, and I can't change them (e.g. by refactoring) as they are already stored and just need to be processed.
Why Not Using explode?
If you look at my examples you will see that I need to capture the character right before the # as well. If you think it's possible with explode() please post the output as well, thanks!
Update
Should we focus on why /([A-Z])+#/ works only if numbers included? thanks.
Instead of using preg_split(), decide what you want to match instead:
A set of "words" if followed by either <any-char># or <end-of-string>.
A character if immediately followed by #.
$str = 'a#aax1aay222b#bbx4bby555bbz6c#mmm1d#ara1e#abc';
preg_match_all('/\w+(?=.#|$)|\w(?=#)/', $str, $matches);
Demo
This expression uses two look-ahead assertions. The results are in $matches[0].
Update
Another way of looking at it would be this:
preg_match_all('/(\w)#(\w+)(?=\w#|$)/', $str, $matches);
print_r(array_combine($matches[1], $matches[2]));
Each entry starts with a single character, followed by a hash, followed by X characters until either the end of the string is encountered or the start of a next entry.
The output is this:
Array
(
[a] => aax1aay222
[b] => bbx4bby555bbz6
[c] => mmm1
[d] => ara1
[e] => abc
)
If you still want to use preg_split you can remove the + and it might work as expected:
'/([A-Z])#/i'
That way you only match the hashtag and ONE alpha character before it, and not all of them.
Example: http://codepad.viper-7.com/z1kFDb
Edit: Added a case-insensitive flag i in the pattern.
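A short sketch of how that pattern behaves with the flags from the question, run against the failing sample and the long sample string. The variable names are chosen here for illustration.

```php
<?php
// Same flags as in the question: drop empty pieces, keep the captured key.
$flags = PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE;

// Without the "+", exactly one letter before each "#" is treated as the key.
$pattern = '/([A-Z])#/i';

$short = preg_split($pattern, 'C#mmmD#ara', -1, $flags);
$long  = preg_split($pattern, 'a#aax1aay222b#bbx4bby555bbz6c#mmm1d#ara1e#abc', -1, $flags);

print_r($short);
print_r($long);
```

Keys and values alternate in the result, so even indexes hold the keys and odd indexes hold the values.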
Use explode() rather than Regexp
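$tmpArray = explode("#","a#aax1aay222b#bbx4bby555bbz6c#mmm1d#ara1e#abc");
$myArray = array();
$last = count($tmpArray) - 1;
for($i = 0; $i < $last; $i++) {
// Everything but the final character is the value of the previous key
// (checking the length instead of truthiness keeps a value of "0").
if (strlen($tmpArray[$i]) > 1) $myArray[] = substr($tmpArray[$i],0,-1);
// The final character is the key that precedes the next "#".
if ($tmpArray[$i] !== '') $myArray[] = substr($tmpArray[$i],-1);
}
// The last piece has no trailing key, so it is kept whole.
if ($last >= 0 && $tmpArray[$last] !== '') $myArray[] = $tmpArray[$last];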
edit: I updated my answer to reflect better reading the questions
You can use explode() function that will split the string except the hash signs, like stated in the answers given before.
$myArray = explode("#",$string);
For the string 'a#aax1aay222b#bbx4bby555bbz6c#mmm1d#ara1e#abc' this returns something like
$myArray = array('a', 'aax1aay222b', 'bbx4bby555bbz6c' ....);
All you need now is to take the last character of each string in array as another item.
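$copy = array();
$lastIndex = count($myArray) - 1;
foreach($myArray as $index => $item){
if ($index === $lastIndex) { // the final piece has no trailing key character, so keep it whole
$copy[] = $item;
break;
}
$beginning = substr($item,0,strlen($item)-1); // this takes all characters except the last one
$ending = substr($item,-1); // this takes the last one
if ($beginning !== '') $copy[] = $beginning; // the very first piece consists of a key only
$copy[] = $ending;
} // end foreach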
This is an example, not tested.
EDIT
Instead of substr($item,0,strlen($item)-1); you might use substr($item,0,-1);.

Extract first integer in a string with PHP

Consider the following strings:
$strings = array(
"8.-10. stage",
"8. stage"
);
I would like to extract the first integer of each string, so it would return
8
8
I tried to filter out numbers with preg_replace but it returns all integers and I only want the first.
foreach($strings as $string)
{
echo preg_replace("/[^0-9]/", '',$string);
}
Any suggestions?
A convenient (although not record-breaking in performance) solution using regular expressions would be:
$string = "3rd time's a charm";
$filteredNumbers = array_filter(preg_split("/\D+/", $string));
$firstOccurence = reset($filteredNumbers);
echo $firstOccurence; // 3
Assuming that there is at least one number in the input, this is going to print the first one.
Non-digit characters will be completely ignored apart from the fact that they are considered to delimit numbers, which means that the first number can occur at any place inside the input (not necessarily at the beginning).
If you want to only consider a number that occurs at the beginning of the string, regex is not necessary:
echo substr($string, 0, strspn($string, "0123456789"));
preg_match('/\d+/',$id,$matches);
$id=$matches[0];
If the integer is always at the start of the string:
(int) $string;
If not, then strpbrk is useful for extracting the first non-negative number:
(int) strpbrk($string, "0123456789");
Alternatives
These one-liners are based on preg_split, preg_replace and preg_match:
preg_split("/\D+/", " $string")[1];
(int) preg_replace("/^\D+/", "", $string);
preg_match('/\d+/', "$string 0", $m) ? $m[0] : null;
Two of these append extra character(s) to the string so empty strings or strings without numbers do not cause problems.
Note that these alternative solutions are for extracting non-negative integers only.
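If a missing number should be distinguishable from 0, one option is to wrap preg_match() in a small helper. The function name firstInteger is hypothetical and, like the one-liners above, the sketch only handles non-negative integers.

```php
<?php
// Hypothetical helper: the first run of digits as an int,
// or null when the text contains no digit at all.
function firstInteger(string $text): ?int
{
    return preg_match('/\d+/', $text, $m) === 1 ? (int) $m[0] : null;
}

$strings = array("8.-10. stage", "8. stage", "Test 2020", "stage");
foreach ($strings as $string) {
    var_dump(firstInteger($string));
}
```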
Try this:
$strings = array(
"8.-10. stage",
"8. stage"
);
$res = array();
foreach($strings as $key=>$string){
preg_match('/^(?P<number>\d+)/',$string,$match); // \d+ so that multi-digit numbers such as 10 are captured whole
$res[$key] = $match['number'];
}
echo "<pre>";
print_r($res);
foreach($strings as $string){
if(preg_match("/^(\d+)/",$string,$res)) { // greedy \d+; the lazy \d+? would stop after one digit
echo $res[1].PHP_EOL;
}
}
If you get this notice in PHP 7+:
Notice: Only variables should be passed by reference in YOUR_DIRECTORY_FILE.php on line LINE_NUMBER
when using this code:
echo reset(array_filter(preg_split("/\D+/", $string)));
Change code to
$var = array_filter(preg_split("/\D+/", $string));
return reset($var);
And enjoy! Best Regards Ovasapov
How to filter out all characters except for the first occurring whole integer:
It is possible that the target integer is not at the start of the string (even if the OP's question only provides samples that start with an integer -- other researchers are likely to require more utility ...like the pages that I closed today using this page). It is also possible that the input contains no integers, or no leading / no trailing non-numeric characters.
The following regex has two checks:
It targets all non-numeric characters from the start of the string -- it stops immediately before the first encountered digit, if there is one at all.
It matches/consumes the first encountered whole integer, then immediately forgets/releases it (using \K) before matching/consuming ANY encountered characters in the remainder of the string.
My snippet will make 0, 1, or 2 replacements depending on the quality of the string.
Code: (Demo)
$strings = [
'stage', // expect empty string
'8.-10. stage', // expect 8
'8. stage', // expect 8
'8.-10. stage 1st', // expect 8
'Test 8. stage 2020', // expect 8
'Test 8.-10. stage - 2020 test', // expect 8
'A1B2C3D4D5E6F7G8', // expect 1
'1000', // expect 1000
'Test 2020', // expect 2020
];
var_export(
preg_replace('/^\D+|\d+\K.*/', '', $strings)
);
Or: (Demo)
preg_replace('/^\D*(\d+).*/', '$1', $strings)
Output:
array (
0 => '',
1 => '8',
2 => '8',
3 => '8',
4 => '8',
5 => '8',
6 => '1',
7 => '1000',
8 => '2020',
)

How to write regex to return only certain parts of this string?

So I'm working on a project that will allow users to enter poker hand histories from sites like PokerStars and then display the hand to them.
It seems that regex would be a great tool for this, however I rank my regex knowledge at "slim to none".
So I'm using PHP and looping through this block of text line by line and on lines like this:
Seat 1: fabulous29 (835 in chips)
Seat 2: Nioreh_21 (6465 in chips)
Seat 3: Big Loads (3465 in chips)
Seat 4: Sauchie (2060 in chips)
I want to extract seat number, name, & chip count so the format is
Seat [number]: [letters&numbers&characters] ([number] in chips)
I have NO IDEA where to start or what commands I should even be using to optimize this.
Any advice is greatly appreciated - even if it is just a link to a tutorial on PHP regex or the name of the command(s) I should be using.
I'm not entirely sure what exactly to use for that without trying it, but a great tool I use all the time to validate my RegEx is RegExr which gives a great flash interface for trying out your regex, including real time matching and a library of predefined snippets to use. Definitely a great time saver :)
Something like this might do the trick:
/Seat (\d+): ([^\(]+) \((\d+) in chips\)/
And some basic explanation on how Regex works:
\d = digit.
\<character> = escapes character, if not part of any character class or subexpression. for example:
\t
would render a tab, while \\t would render "\t" (since the backslash is escaped).
+ = one or more of the preceding element.
* = zero or more of the preceding element.
[ ] = bracket expression. Matches any of the characters within the bracket. Also works with ranges (ex. A-Z).
[^ ] = Matches any character that is NOT within the bracket.
( ) = Marked subexpression. The data matched within this can be recalled later.
Anyway, I chose to use
([^\(]+)
since the example provides a name containing spaces (Seat 3 in the example). What this does is match any character up to the point where it encounters an opening parenthesis.
On its own, this subexpression would leave you with a blank space at the end of the captured name (using the data provided in the example). In the full pattern above, the literal space before \( makes the regex engine give that space back; if you leave it out, the blank can easily be stripped away using the trim() function in PHP.
If you do not want to match spaces, only alphanumerical characters, you could do something like this:
([A-Za-z0-9-_]+)
Which would match any letter (within A-Z, both upper- & lower-case), number as well as hyphens and underscores.
Or the same variant, with spaces:
([A-Za-z0-9-_\s]+)
Where "\s" matches any whitespace character, including a space.
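To see the first pattern at work on the sample lines, here is a rough sketch. It assumes the chip count is written as "(835 in chips)" with a space after the number, as in the question; the ^ and $ anchors and the array keys seat, name and chips are additions made here for illustration.

```php
<?php
$lines = array(
    'Seat 1: fabulous29 (835 in chips)',
    'Seat 2: Nioreh_21 (6465 in chips)',
    'Seat 3: Big Loads (3465 in chips)',
    'Seat 4: Sauchie (2060 in chips)',
);

// [^(]+ takes everything up to the opening parenthesis, spaces included.
$pattern = '/^Seat (\d+): ([^(]+) \((\d+) in chips\)$/';

$players = array();
foreach ($lines as $line) {
    if (preg_match($pattern, $line, $m)) {
        $players[] = array(
            'seat'  => (int) $m[1],
            'name'  => trim($m[2]), // trim() is harmless if there is no stray space
            'chips' => (int) $m[3],
        );
    }
}

print_r($players);
```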
Hope this helps :)
Look at the PCRE section in the PHP Manual. Also, http://www.regular-expressions.info/ is a great site for learning regex. Disclaimer: Regex is very addictive once you learn it.
I always use the preg_ set of functions for regex in PHP because the Perl-compatible expressions have much more capability. That extra capability doesn't necessarily come into play here, but they are also supposed to be faster, so why not use them anyway, right?
For an expression, try this:
/Seat (\d+): ([^ ]+) \((\d+)/
You can use preg_match() on each line, storing the results in an array. You can then get at those results and manipulate them as you like.
EDIT:
Btw, you could also run preg_match_all on the entire block of text (instead of looping through line-by-line) and get the results that way, too.
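A sketch of that whole-block variant, assuming the lines are separated by "\n". The name group here is (.+) rather than the [^ ]+ used above, so that names with spaces are matched, and PREG_SET_ORDER groups the result by line; $history and $rows are names chosen for illustration.

```php
<?php
// Assumed input: the seat lines of one hand history, joined by newlines.
$history = implode("\n", array(
    'Seat 1: fabulous29 (835 in chips)',
    'Seat 2: Nioreh_21 (6465 in chips)',
    'Seat 3: Big Loads (3465 in chips)',
    'Seat 4: Sauchie (2060 in chips)',
));

// The m flag lets ^ and $ match at the start and end of every line.
$pattern = '/^Seat (\d+): (.+) \((\d+) in chips\)$/m';

// PREG_SET_ORDER yields one array per matched line: [whole, seat, name, chips].
preg_match_all($pattern, $history, $rows, PREG_SET_ORDER);

foreach ($rows as $row) {
    printf("Seat %d: %s has %d chips\n", $row[1], $row[2], $row[3]);
}
```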
Check out preg_match.
Probably looking for something like...
<?php
$str = 'Seat 1: fabulous29 (835 in chips)';
preg_match('/Seat (?<seatNo>\d+): (?<name>\w+) \((?<chipCnt>\d+) in chips\)/', $str, $matches);
print_r($matches);
?>
*It's been a while since I did php, so this could be a little or a lot off.*
This may be a very late answer, but I wanted to add one anyway:
Seat\s(\d):\s([\w\s]+)\s\((\d+).*\)
http://regex101.com/r/cU7yD7/1
Here's what I'm currently using:
preg_match("/(Seat \d+: [A-Za-z0-9 _-]+) \((\d+) in chips\)/",$line)
To process the whole input string at once, use preg_match_all()
preg_match_all('/Seat (\d+): \w+ \((\d+) in chips\)/', $preg_match_all, $matches);
For your input string, var_dump of $matches will look like this (Seat 3 is missing because \w+ does not match the space in "Big Loads"):
array
0 =>
array
0 => string 'Seat 1: fabulous29 (835 in chips)' (length=33)
1 => string 'Seat 2: Nioreh_21 (6465 in chips)' (length=33)
2 => string 'Seat 4: Sauchie (2060 in chips)' (length=31)
1 =>
array
0 => string '1' (length=1)
1 => string '2' (length=1)
2 => string '4' (length=1)
2 =>
array
0 => string '835' (length=3)
1 => string '6465' (length=4)
2 => string '2060' (length=4)
On learning regex: Get Mastering Regular Expressions, 3rd Edition. Nothing else comes close to this book if you really want to learn regex. Despite being the definitive guide to regex, the book is very beginner friendly.
Try this code. It works for me.
Let's say that you have the lines below:
$string1 = "Seat 1: fabulous29 (835 in chips)";
$string2 = "Seat 2: Nioreh_21 (6465 in chips)";
$string3 = "Seat 3: Big Loads (3465 in chips)";
$string4 = "Seat 4: Sauchie (2060 in chips)";
Add to array
$lines = array($string1,$string2,$string3,$string4);
foreach($lines as $line )
{
$seatArray = explode(":", $line);
$seat = explode(" ",$seatArray[0]);
$seatNumber = $seat[1];
$usernameArray = explode("(",$seatArray[1]);
$username = trim($usernameArray[0]);
$chipArray = explode(" ",$usernameArray[1]);
$chipNumber = $chipArray[0];
echo "<br>"."Seat [".$seatNumber."]: [". $username."] ([".$chipNumber."] in chips)";
}
You'll have to split the file by line breaks,
then loop through each line and apply the following logic:
// $matches[0] holds the whole match, so the capture groups start at index 1
$seat = 1;
$name = 2;
$chips = 3;
foreach ($file as $string) {
// the space inside the name class allows names such as "Big Loads"
if (preg_match('/Seat ([0-9]+): ([A-Za-z_0-9 ]*) \(([0-9]+) in chips\)/', $string, $matches)) {
echo "Seat: " . $matches[$seat] . "<br>";
echo "Name: " . $matches[$name] . "<br>";
echo "Chips: " . $matches[$chips] . "<br>";
}
}
I haven't run this code, so you may have to fix some errors...
Seat [number]: [letters&numbers&characters] ([number] in chips)
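Your regex should look something like this:
Seat (\d+): ([a-zA-Z0-9_ ]+) \((\d+) in chips\)
The parentheses let you capture the seat number, name and number of chips in groups. The underscore and the space in the name class are needed for names such as Nioreh_21 and Big Loads.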
