How to extract a collection of numbers from a string? - php

I need to extract a project number out of a string. If the project number was fixed it would have been easy, however it can be either P.XXXXX, P XXXXX or PXXXXX.
Is there a simple function like preg_match that I could use? If so, what would my regular expression be?

There is indeed - if this is part of a larger string e.g. "The project (P.12345) is nearly done", you can use:
preg_match('/P[. ]?(\d{5})/',$str,$match);
$pnumber = $match[1];
Otherwise, if the string will always just be the P.12345 string, you can use:
preg_match('/\d{5}$/',$str,$match);
$pnumber = $match[0];
Though you may prefer the more explicit match of the top example.
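As a minimal sketch of the first pattern, the helper below wraps the match in a function and checks the return value of preg_match before reading the capture group. The name extractProjectNumber is hypothetical, and the nullable return type assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper: returns the five digits after "P", or null if none are found.
function extractProjectNumber(string $text): ?string
{
    // "P", then an optional dot or space, then exactly five digits (captured).
    if (preg_match('/P[. ]?(\d{5})/', $text, $match) === 1) {
        return $match[1];
    }
    return null; // no project number in this string
}

// The three formats from the question, plus a string without a project number.
foreach (['P.12345', 'P 12345', 'P12345', 'no number here'] as $sample) {
    var_dump(extractProjectNumber($sample));
}
```

Checking the return value avoids reading $match[1] when nothing matched.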

Try this:
if (preg_match('#P[. ]?(\d{5})#', $project_number, $matches)) {
$project_version = $matches[1];
}

You said that the project number is 4 or 5 digits long, so:
preg_match('/P[. ]?(\d{4,5})/', $string, $m);
$project_number = $m[1];

Assuming you want to extract the XXXXX from the string and XXXXX are all integers, you can use the following.
$number = preg_replace("/[^0-9]/", "", $string);
You can use the ^ (caret) character inside square brackets to negate the character class. So in this instance it will replace anything that isn't a digit with nothing.
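A short sketch of how this behaves, using made-up sample strings: because every non-digit is removed, digits from any other number in the string end up joined to the project number.

```php
<?php
// Stripping all non-digits works when the project number is the only number present.
$clean = preg_replace('/[^0-9]/', '', 'P.12345');

// With a second number in the string, the digits run together.
$mixed = preg_replace('/[^0-9]/', '', 'Project P.12345, phase 2');

var_dump($clean); // string(5) "12345"
var_dump($mixed); // string(6) "123452"
```

So this approach is probably only safe when the input is known to contain nothing but the project number.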

I would use this kind of regex : /.*P[ .]?(\d+).*/
Here are a few test lines:
$string = 'This is the P123 project, with another useless number 456.';
$project = preg_replace('/.*P[ .]?(\d+).*/', '$1', $string);
var_dump($project);
$string = 'This is the P.123 project, with another useless number 456.';
$project = preg_replace('/.*P[ .]?(\d+).*/', '$1', $string);
var_dump($project);
$string = 'This is the P 123 project, with another useless number 456.';
$project = preg_replace('/.*P[ .]?(\d+).*/', '$1', $string);
var_dump($project);

Use the explode() function to split those.

Related

Replace multiple items in a string

I've scraped an HTML string from a website. It contains multiple strings like color:#0269D2. How can I write str_replace code that replaces this string with another color?
For instance, something like this, just looping through every color:#0269D in the $fulltext string variable?
str_replace("color:#0269D","color:#000000",$fulltext);
You can pass arrays to the str_replace function, so there is no need for a loop:
$search = array("color:#0269D");
$replace = array("color:#000000");
$str = str_replace($search, $replace, $string);
You have the right syntax. I would add a check:
$newText = str_replace("color:#0269D", "color:#000000", $fulltext, $count);
if($count){
echo "Replaced $count occurrences of 'color'.";
}
This code might be too greedy for what you're looking to do. Careful. Also if the string differs at all, for example color: #0269D, this replacement will not happen.
str_replace already replaces all occurrences of the search string with the replacement string.
If you want to replace all colors but aren't sure which hexcodes you'll find you could use preg_replace to match multiple occurrences of a pattern with a regular expression and replace it.
In your case:
$str = "String with loads of color:#000000";
$pattern = '/color ?: ?#[0-9a-f]{3,6}/i';
$replacement = "color:#FFFFFF";
$result = preg_replace($pattern, $replacement, $str);
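To illustrate, here is a sketch that runs the same pattern over a made-up HTML fragment and uses the optional fifth argument of preg_replace to count the replacements. The sample markup is an assumption, not taken from the scraped page.

```php
<?php
// Sample input (hypothetical): two colour declarations with different spacing and length.
$html = '<p style="color:#0269D2">a</p><span style="color: #abc">b</span>';

// Optional space around the colon, then a 3- to 6-character hex code, case-insensitive.
$pattern = '/color ?: ?#[0-9a-f]{3,6}/i';

// -1 means "no limit"; $count receives the number of replacements made.
$result = preg_replace($pattern, 'color:#000000', $html, -1, $count);

echo $result, "\n";
echo $count, "\n";
```

Note that the pattern is not anchored, so it would also match the tail of a declaration such as background-color:#fff. If that matters, the pattern needs a stricter prefix.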

How to remove the last occurrence of an underscore in a string

I have a string that contains several words separated by underscores, e.g. "Field_4_txtbox". I need to find the last underscore in the string and remove it along with everything after it, so the result would be "Field_4". This has to work for endings of different lengths, so I can't just trim a fixed length.
I know I can do an If statement that checks for certain endings like
if(strstr($key,'chkbox')) {
$string= rtrim($key, '_chkbox');
}
but I would like to do this in one go with a regex pattern, how can I accomplish this?
The matching regex would be:
/_[^_]*$/
Just replace that with '':
$string = preg_replace('/_[^_]*$/', '', $string);
There is no need for a costly regex here; a simple strrpos() will do the job:
$string=substr($key,0,strrpos($key,"_"));
strrpos — Find the position of the last occurrence of a substring in a string
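One detail worth guarding against: strrpos() returns false when the string contains no underscore, and substr($key, 0, false) then yields an empty string. A minimal sketch with that check added (stripLastSegment is a hypothetical name):

```php
<?php
// Hypothetical helper: drop the last underscore and everything after it.
function stripLastSegment(string $key): string
{
    $pos = strrpos($key, '_');
    // No underscore at all: return the string unchanged instead of an empty string.
    return $pos === false ? $key : substr($key, 0, $pos);
}

echo stripLastSegment('Field_4_txtbox'), "\n"; // Field_4
echo stripLastSegment('Field'), "\n";          // Field
```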
You can also just use explode():
$string = 'Field_4_txtbox';
$temp = explode('_', strrev($string), 2);
$string = strrev($temp[1]);
echo $string;
As of PHP 5.4, you can index the result of explode() directly:
$string = 'Field_4_txtbox';
$string = strrev(explode('_', strrev($string), 2)[1]);
echo $string;

Trim all characters before an integer in a string in PHP?

I have an alpha numeric string say for example,
abc123bcd , bdfnd567, dfd89ds.
I want to trim everything from the first appearance of a digit onward, keeping only the characters before it.
My result should look like,
abc , bdfnd, dfd.
I am thinking of using substr. But not sure how to check for a string before first appearance of an integer.
You can easily remove the characters you don't want with preg_replace and a regular expression:
$str = preg_replace('#\d.*$#', '', $str);
\d matches a digit and .*$ matches any character until the end of the string.
Learn more about regular expressions: http://www.regular-expressions.info/.
A possible non-Regex solution would be:
strcspn — Find length of initial segment not matching mask
substr — Return part of a string
Example:
$string = 'foo1bar';
echo substr($string, 0, strcspn($string, '1234567890')); // gives foo
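Applied to the three strings from the question, the same idea can be sketched as a small function. The name prefixBeforeDigit is hypothetical.

```php
<?php
// Hypothetical helper: keep only the characters before the first digit.
function prefixBeforeDigit(string $s): string
{
    // strcspn() returns the length of the leading run of characters NOT in the mask,
    // which is exactly the offset of the first digit (or the full length if none).
    return substr($s, 0, strcspn($s, '0123456789'));
}

foreach (['abc123bcd', 'bdfnd567', 'dfd89ds'] as $sample) {
    echo prefixBeforeDigit($sample), "\n";
}
```

A string without any digit comes back unchanged, since strcspn() then returns its full length.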
$string = 'abc123bcd';
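$result = preg_replace("/[0-9]/", "", $string); // removes every digit: 'abcbcd'
or
$result = trim($string, '0123456789'); // strips digits from the ends only: 'abc123bcd' is unchanged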
I believe you are looking for this?
$matches = array();
preg_match("/^[a-z]+/", "dfd89ds", $matches);
echo $matches[0]; // returns dfd
You can use a regex for this:
$string = 'abc123bcd';
preg_match('/^[a-zA-Z]*/i', $string, $matches);
var_dump($matches[0]);
will produce:
abc
To remove the +/- sign, you can simply use:
abs($number)
and get the absolute value.
e.g
$abs = abs($signed_integer);

Identifying a random repeating pattern in a structured text string

I have a string that has the following structure:
ABC_ABC_PQR_XYZ
Where PQR has the structure:
ABC+JKL
and
ABC itself is a string that can contain alphanumeric characters and a few other characters like "_", "-", "+", "." and follows no set structure:
eg.qWe_rtY-asdf or pkl123
so, in effect, the string can look like this:
qWe_rtY-asdf_qWe_rtY-asdf_qWe_rtY-asdf+JKL_XYZ
My goal is to find out what string constitutes ABC.
I was initially just using
$arrString = explode("_",$string);
to return $arrString[0] before I was made aware that ABC ($arrString[0]) itself can contain underscores, thus rendering it incorrect.
My next attempt was exploding it on "_" anyway and then comparing each of the exploded string parts with the first string part until I get a semblance of a pattern:
function getPatternABC($string)
{
    $count = 0;
    $pattern = "";
    $arrString = explode("_", $string);
    foreach ($arrString as $expString)
    {
        if (strcmp($expString, $arrString[0]) !== 0 || $count == 0)
        {
            $pattern = $pattern . "_" . $arrString[$count];
            $count++;
        }
        else break;
    }
    return substr($pattern, 1);
}
This works great - but I wanted to know if there was a more elegant way of doing this using regular expressions?
Here is the regex solution:
'^([a-zA-Z0-9_+-]+)_\1_\1\+'
What this does is match (starting from the beginning of the string) the longest possible sequence consisting of the characters inside the square brackets (edit that per your spec). The sequence must appear exactly twice, each time followed by an underscore, and then must appear once more followed by a plus sign (this is actually the first half of PQR with the delimiter before JKL). The rest of the input is ignored.
You will find ABC captured as capture group 1.
So:
$input = 'qWe_rtY-asdf_qWe_rtY-asdf_qWe_rtY-asdf+JKL_XYZ';
$result = preg_match('/^([a-zA-Z0-9_+-]+)_\1_\1\+/', $input, $matches);
if ($result) {
echo $matches[1];
}
Sure, just make a regular expression that matches your pattern. In this case, something like this:
preg_match('/^([a-zA-Z0-9_+.-]+)_\1_\1\+JKL_XYZ$/', $string, $match);
Your ABC is in $match[1].
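As a rough sketch, the backreference approach can be wrapped in a function and tried on the two sample values of ABC from the question. The name findAbc is hypothetical, and the character class is the one used above; adjust it to the characters ABC may actually contain.

```php
<?php
// Hypothetical helper: returns ABC, or null when the input does not start with ABC_ABC_ABC+.
function findAbc(string $input): ?string
{
    // Group 1 captures a candidate ABC; each \1 must repeat exactly that text.
    if (preg_match('/^([a-zA-Z0-9_+.-]+)_\1_\1\+/', $input, $m) === 1) {
        return $m[1];
    }
    return null;
}

var_dump(findAbc('qWe_rtY-asdf_qWe_rtY-asdf_qWe_rtY-asdf+JKL_XYZ'));
var_dump(findAbc('pkl123_pkl123_pkl123+JKL_XYZ'));
var_dump(findAbc('abc_def'));
```

The group is greedy, so the engine tries the longest candidate first and backtracks until the two repetitions line up.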
If the presence of underscores in these strings has a low frequency, it may be worth checking to see if a simple explode() will do it before bothering with regex.
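<?php
$str = 'ABC_ABC_PQR_XYZ';
if (substr_count($str, '_') == 3) {
    // Exactly three underscores: ABC contains none, so the first piece is ABC.
    // reset() needs a variable, so store the explode() result first.
    $parts = explode('_', $str);
    $abc = $parts[0];
} else {
    // regexy_function() is a placeholder for one of the regex approaches.
    $abc = regexy_function($str);
}
?>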

Replace all characters in string apart from PHP

I have the string Trade Card Catalogue 1988 Edition and I wish to remove everything apart from 1988.
I could have an array of all letters and do a str_replace and trim, but I wondered if there was a better solution?
$string = 'Trade Card Catalogue 1988 Edition';
$letters = array('a','b','c'....'x','y','z');
$string = strtolower($string);
$string = str_replace($letters, '', $string);
$string = trim($string);
Thanks in advance
Regular expression?
So assuming you want the number (and not the 4th word or something like that):
$str = preg_replace('#\D#', '', $str);
\D means every character that is not a digit. The same as [^0-9].
If there could be more numbers but you only want a four-digit number (a year), this will also work (but it obviously fails if there are several four-digit numbers and you want a specific one):
$str = preg_replace('#.*?(\d{4,4}).*#', '\1', $str);
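An alternative sketch, swapping preg_replace for preg_match: capture the year directly and use word boundaries so the match cannot land inside a longer number. The function name extractYear is hypothetical.

```php
<?php
// Hypothetical helper: returns the first standalone four-digit number, or null.
function extractYear(string $text): ?string
{
    // \b on both sides keeps "12345" from yielding "1234" or "2345".
    if (preg_match('/\b(\d{4})\b/', $text, $m) === 1) {
        return $m[1];
    }
    return null;
}

var_dump(extractYear('Trade Card Catalogue 1988 Edition'));
```

Unlike the replace-based version, this leaves the original string untouched and makes the no-match case explicit.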
You can actually just pass the entire set of characters to be trimmed as a parameter to trim:
$string = trim($string, 'abc...zABC...Z ' /* don't forget the space */);
