Ensure a string contains specific characters, each at most once - php

I would like to test if a string is empty or if it only contains specific characters, each at most once.
Example:
Given $valid = 'ABCDE', the following strings are:
$a = ''; // valid, empty
$b = 'CE'; // valid, only contains C and E, each once
$c = 'AZ'; // invalid, contains Z
$d = 'DAA'; // invalid, contains A twice
Any quick way of doing this, (possibly) using regex?

We can try using the following regex pattern:
^(?!.*(.).*\1)[ABCDE]{0,5}$
Here is an explanation of the regex:
^ from the start of the string
(?!.*(.).*\1) assert that the same letter does not repeat
[ABCDE]{0,5} then match 0-5 letters
$ end of the string
Sample PHP script:
$input = "ABCDE";
// Single quotes matter here: in a double-quoted PHP string, \1 is read as an octal escape.
if (preg_match('/^(?!.*(.).*\1)[ABCDE]{0,5}$/', $input)) {
echo "MATCH";
}
The negative lookahead (?!.*(.).*\1) works by checking whether it can capture any single character and then find it again later in the string. Take the OP's invalid input DAA: the lookahead fails once its inner pattern captures the first A and then sees A again. Note that lookarounds can have their own capture groups.
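If the allowed characters live in a variable such as $valid, the same pattern can be built at runtime. The following is only a sketch: the function name hasOnlyUniqueAllowed is made up for illustration, and it assumes $valid is a non-empty string of single-byte characters.

```php
<?php
// Hypothetical helper (not part of the original answer): builds the pattern
// from $valid instead of hard-coding ABCDE.
// Assumption: $valid is non-empty and contains only single-byte characters.
function hasOnlyUniqueAllowed(string $input, string $valid): bool
{
    // preg_quote() escapes characters that are special in a regex,
    // so they are safe to place inside the character class.
    $class = preg_quote($valid, '/');
    $max = strlen($valid);

    // Single-quoted pieces keep \1 as a backreference.
    $pattern = '/^(?!.*(.).*\1)[' . $class . ']{0,' . $max . '}$/';

    return preg_match($pattern, $input) === 1;
}

// The four strings from the question.
foreach (array('', 'CE', 'AZ', 'DAA') as $s) {
    $verdict = hasOnlyUniqueAllowed($s, 'ABCDE') ? 'valid' : 'invalid';
    echo var_export($s, true), ' => ', $verdict, "\n";
}
```

With the question's data this reports the empty string and CE as valid, and AZ and DAA as invalid.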

Related

Regular expression 'not match numbers' in PHP

I would like to check whether a number appears in a comma-separated string. I've written some code and it works like this...
$find_number = '3';
$string_to_search = '1,3,124,12';
preg_match('/[^0-9]'.$find_number.'[^0-9]/', $string_to_search);
//match
$find_number = '4';
$string_to_search = '1,3,124,12';
preg_match('/[^0-9]'.$find_number.'[^0-9]/', $string_to_search);
//not match
Which is what I expected. The problem is that the first and last entries in the string are not recognized by this expression. What did I do wrong?
You don't need regular expressions for such a simple task. If $find_number doesn't contain the separator, you can enclose both $find_number and $string_to_search in separators and use the function strpos() to find out whether $find_number is present in $string_to_search. strpos() is much faster than preg_match().
$find_number = '3';
$string_to_search = '1,3,124,12';
if (strpos(",{$string_to_search},", ",{$find_number},") !== FALSE) {
echo("{$find_number} is present in {$string_to_search}");
} else {
echo("No.");
}
Wrapping $find_number in separators is needed to avoid it finding partial values, wrapping $string_to_search in separators is needed to let it find the $find_number when it is the first or the last entry in $string_to_search.
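The wrapping trick can be put into a small reusable function. This is a sketch only: the name listContains and the $sep parameter are assumptions for illustration, not something from the question.

```php
<?php
// Hypothetical helper: true if $item is a whole entry of the separated list.
// Assumption: $item itself does not contain the separator.
function listContains(string $list, string $item, string $sep = ','): bool
{
    // Wrap both sides in separators so only whole entries can match,
    // including the first and the last one.
    return strpos($sep . $list . $sep, $sep . $item . $sep) !== false;
}

var_dump(listContains('1,3,124,12', '3'));   // a whole entry
var_dump(listContains('1,3,124,12', '4'));   // only appears inside 124
var_dump(listContains('1,3,124,12', '1'));   // first entry
var_dump(listContains('1,3,124,12', '12'));  // last entry
```

Only the second call returns false, because 4 occurs only as part of 124.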
You need to make sure there are no digits on either side of $find_number, which is why you need the (?<!\d) and (?!\d) lookarounds: they do not consume text before or after the number, so the first and last items can be matched too. The [^0-9] in your pattern are negated character classes, which require an actual character other than a digit before and after $find_number. The lookarounds simply fail the match if there is a digit before (negative lookbehind (?<!\d)) or after (negative lookahead (?!\d)) $find_number. See below:
$find_number = '3';
$string_to_search = '1,3,124,12';
if (preg_match('/(?<!\d)'.$find_number.'(?!\d)/', $string_to_search)) {
echo "match!";
}
See the PHP demo.
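If the value being searched for might ever contain regex metacharacters, it is safer to pass it through preg_quote() before concatenating it into the pattern. A minimal sketch of the lookaround approach with that added (containsNumber is a made-up name):

```php
<?php
// Hypothetical wrapper around the lookaround pattern shown above.
function containsNumber(string $haystack, string $number): bool
{
    // preg_quote() escapes regex metacharacters (and the / delimiter)
    // in case $number is not purely digits.
    $pattern = '/(?<!\d)' . preg_quote($number, '/') . '(?!\d)/';

    return preg_match($pattern, $haystack) === 1;
}

var_dump(containsNumber('1,3,124,12', '3'));   // whole entry
var_dump(containsNumber('1,3,124,12', '2'));   // only inside 124 and 12
var_dump(containsNumber('1,3,124,12', '12'));  // last entry
```

The call with '2' returns false because every 2 in the string has a digit next to it.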
An alternative is to use explode with in_array:
$find_number = '3';
$string_to_search = '1,3,124,12';
if (in_array($find_number, explode(",", $string_to_search))) {
echo "match!";
}
See another PHP demo

Is it possible to match all attributes in a preg_match with empty or missing attributes?

I'm having a little bit of an issue with preg_match.
I have a string that can come with attributes in any order (eg. [foobar a="b" c="d" f="g"] or [foobar c="d" a="b" f="g"] or [foobar f="g" a="b" c="d"] etc.)
These are the patterns I have tried:
// Matches when all searched for attributes are present
// doesn't match if one of them is missing
// http://www.phpliveregex.com/p/dHi
$pattern = '\[foobar\b(?=\s)(?=(?:(?!\]).)*\s\ba=(["|'])((?:(?!\1).)*)\1)(?=(?:(?!\]).)*\s\bc=(["'])((?:(?!\3).)*)\3)(?:(?!\]).)*]'
// Matches only when attributes are in the right order
// http://www.phpliveregex.com/p/dHj
$pattern = '\[foobar\s+a=["\'](?<a>[^"\']*)["\']\s+c=["\'](?<c>[^"\']*).*?\]'
I'm trying to figure it out, but can't seem to get it right.
Is there a way to match all the attributes, even when other ones are missing or empty (a='')?
I've even toyed with explode at the spaces between the attributes and then str_replace, but that seemed too overkill and not the right way to go about this.
In the links I've only matched for a="b" and c="d" but I also want to match these cases even if there is an e="f" or a z="x"
If you have the [...] strings as separate strings, not inside larger text, it is easy to use a \G based regex to mark a starting boundary ([some_text) and then match any key-value pair with some basic regex subpatterns using negated character classes.
Here is the regex:
(?:\[foobar\b|(?!^)\G)\s+\K(?<key>[^=]+)="(?<val>[^"]*)"(?=\s+[^=]+="|])
Here is what it matches in human words:
(?:\[foobar\b|(?!^)\G) - a leading boundary, the regex engine should find it first before proceeding, and it matches literal [foobar or the end of the previous successful match (\G matches the string start or position right after the last successful match, and since we need the latter only, the negative lookahead (?!^) excludes the beginning of the string)
\s+ - 1 or more whitespaces (they are necessary to delimit tag name with attribute values)
\K - regex operator that forces the regex engine to omit all the matched characters grabbed so far. A cool alternative to a positive lookbehind in PCRE.
(?<key>[^=]+) - Named capture group "key" matching 1 or more characters other than a =.
=" - matches a literal =" sequence
(?<val>[^"]*) - Named capture group "val" matching 0 or more characters (due to * quantifier) other than a "
" - a literal " that is a closing delimiter for a value substring.
(?=\s+[^=]+="|]) - a positive lookahead making sure there is a next attribute or the end of the [tag xx="yy"...] entity.
PHP code:
$re = '/(?:\[foobar\b|(?!^)\G)\s+\K(?<key>[^=]+)="(?<val>[^"]*)"(?=\s+[^=]+="|])/';
$str = "[foobar a=\"b\" c=\"d\" f=\"g\"]";
preg_match_all($re, $str, $matches);
print_r(array_combine($matches["key"], $matches["val"]));
Output: [a] => b, [c] => d, [f] => g.
You could use the following function:
function toAssociativeArray($str) {
// Single key/pair extraction pattern:
$pattern = '(\w+)\s*=\s*"([^"]*)"';
$res = array();
// Valid string?
if (preg_match("/\[foobar((\s+$pattern)*)\]/", $str, $matches)) {
// Yes, extract key/value pairs:
preg_match_all("/$pattern/", $matches[1], $matches);
for ($i = 0; $i < count($matches[1]); $i += 1) {
$res[$matches[1][$i]] = $matches[2][$i];
}
};
return $res;
}
This is how you could use it:
// Some test data:
$testData = array('[foobar a="b" c="d" f="g"]',
'[foobar a="b" f="g" a="d"]',
'[foobar f="g" a="b" c="d"]',
'[foobar f="g" a="b"]',
'[foobar f="g" c="d" f="x"]');
// Properties I am interested in, with a default value:
$base = array("a" => "null", "c" => "nothing", "f" => "");
// Loop through the test data:
foreach ($testData as $str) {
// get the key/value pairs and merge with defaults:
$res = array_merge($base, toAssociativeArray($str));
// print value of the "a" property
echo "value of a is {$res['a']} <br>";
}
This script outputs:
value of a is b
value of a is d
value of a is b
value of a is b
value of a is null

modify values in variable string with php

Consider example:
$mystring = "us100ch121jp23uk12";
I) I want to change the value of jp by adding 1, so that the string becomes
us100ch121jp24uk12
II) Is there a way to separate the numeric part and alphabetic part in the above string into:
[us,100]
[ch,121]
[jp,24]
[uk,12]
my code:
$string = "us100ch121jp23uk12";
$search_for = "us";
$pairs = explode("[]", $string); // I dont know the parameters.
foreach ($pairs as $index=>$pair)
{
$numbers = explode(',',$pair);
if ($numbers[0] == $search_for){
$numbers[1] += 1; // 23 + 1 = 24
$pairs[$index] = implode(',',$numbers); //push them back
break;
}
}
$new_string = implode('|',$pairs);
Using Evan's suggestion:
$mystring = "us100ch121jp22uk12";
preg_match_all("/([A-Za-z]+)(\d+)/", $mystring, $output);
//echo $output[0][4];
foreach($output[0] as $key=>$value) {
// echo "[".$value."]";
echo "[".substr($value, 0, 2).",".substr($value, 2, strlen($value) - 2)."]"."<br>";
}
If you use preg_match_all("/([A-Za-z]+)(\d+)/", $string, $output);, it will fill $output with three arrays. The first contains the full matches (e.g. 'us100'), the second the country strings (e.g. 'us'), and the third the numbers (e.g. '100'). Use [A-Za-z] rather than [A-z] here, because the range A-z also covers the punctuation characters that sit between Z and a.
Since the second and third arrays will have matching indexes ($output[1][0] will be 'us' and $output[2][0] will be '100'), you could just cycle through those and do whatever you'd like to them.
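For part I of the question, one way to do the increment in place is preg_replace_callback(). The sketch below is not from the answer above; the function names bumpCode and splitPairs are invented, and it assumes codes are ASCII letters followed by decimal digits.

```php
<?php
// Hypothetical helper: adds $delta to the number that follows $code.
function bumpCode(string $input, string $code, int $delta = 1): string
{
    // (?<![A-Za-z]) ensures $code is not the tail of a longer letter run.
    $pattern = '/(?<![A-Za-z])(' . preg_quote($code, '/') . ')(\d+)/';

    return preg_replace_callback(
        $pattern,
        function (array $m) use ($delta) {
            // $m[1] is the code, $m[2] the digits that follow it.
            return $m[1] . ((int) $m[2] + $delta);
        },
        $input
    );
}

// Hypothetical helper for part II: returns [code, number] pairs.
function splitPairs(string $input): array
{
    preg_match_all('/([A-Za-z]+)(\d+)/', $input, $matches, PREG_SET_ORDER);
    $pairs = array();
    foreach ($matches as $m) {
        $pairs[] = array($m[1], (int) $m[2]);
    }
    return $pairs;
}

$updated = bumpCode('us100ch121jp23uk12', 'jp');
echo $updated, "\n";               // us100ch121jp24uk12
print_r(splitPairs($updated));
```

PREG_SET_ORDER groups the results per match, which avoids indexing two parallel arrays.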
Here is more information about using regular expressions in PHP. The site also contains information about regular expressions in general, which are a useful tool for any programmer!
You can do it using regular expressions in PHP. See tutorial:
http://w3school.in/w3schools-php-tutorial/php-regular-expression/
Function - Description
ereg_replace() - finds the string specified by pattern and replaces it with replacement if found. Deprecated in PHP 5.3 and removed in PHP 7.0; use preg_replace() instead.
eregi_replace() - works like ereg_replace(), except that the search is not case sensitive. Also deprecated in PHP 5.3 and removed in PHP 7.0.
preg_replace() - performs a regular expression search and replace using Perl-compatible (PCRE) syntax.
preg_match() - searches a string for a pattern and returns 1 if the pattern matches, 0 if it does not, and false on error.
Expression - Description
[0-9] - matches any decimal digit from 0 through 9.
[a-z] - matches any character from lowercase a through lowercase z.
[A-Z] - matches any character from uppercase A through uppercase Z.
[a-zA-Z] - matches any letter from a through z in either case (a range written as [a-Z] is not valid).
p+ - matches one or more p.
p* - matches zero or more p.
p? - matches zero or one p.
p{N} - matches a sequence of exactly N p.
p{2,3} - matches a sequence of two or three p.
p{2,} - matches a sequence of at least two p.
p$ - matches p at the end of the string.
^p - matches p at the beginning of the string.
[^a-zA-Z] - matches any single character that is not a letter from a through z or A through Z.
p.p - matches p, followed by any character, in turn followed by another p.
^.{2}$ - matches any string containing exactly two characters.
<b>(.*)</b> - matches text enclosed within <b> and </b> and captures what is between them.
p(hp)* - matches a p followed by zero or more instances of the sequence hp.
you also can use JavaScript:
http://www.w3schools.com/jsref/jsref_obj_regexp.asp

Find string with single quote

I have an array of strings. I need all the strings which do not contain any special character. Only a to z is allowed. Is there any method using regex, or is there any string function?
You can use the regex ^[a-zA-Z]*$ which matches strings that only contain A to Z and a to z. (It will also match an empty string).
Explanation:
^ is an anchor that anchors the regex at the start of the string (So the regex starts matching from the start of the string)
[a-zA-Z] is a character class that contains the characters we want to match
* indicates that it should be matched zero or more times (use + for one or more times)
$ is an anchor for the end of the string, so the regex has to stop matching at the end of the string or it won't be a match.
You use preg_match to check whether a single string matches a pattern (preg_match returns 1 on a match and 0 if there is no match, so we just check for 1):
if (preg_match('/^[a-zA-Z]*$/', $subject) === 1) {
//match
}
You can then iterate over the array of strings and create a new array of those that match the pattern:
$array = array(/* your strings */);
$output_array = array();
foreach ($array as $elem) {
if ( preg_match('/^[a-zA-Z]{1,}$/', $elem)) {
$output_array[] = $elem;
}
}
$output_array will then contain the matching strings.
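PHP also has preg_grep(), which applies a pattern to every element of an array and returns the ones that match, so the loop can be shortened. A small sketch with made-up sample data:

```php
<?php
// Example data, assumed for illustration only.
$strings = array('apple', "it's", 'Zed', 'foo-bar', '');

// Keep only non-empty strings made of the letters a-z and A-Z.
$lettersOnly = preg_grep('/^[a-zA-Z]+$/', $strings);

// preg_grep() preserves the original keys; array_values() reindexes from 0.
$lettersOnly = array_values($lettersOnly);

print_r($lettersOnly);
```

Here only apple and Zed survive; swap + for * in the pattern if empty strings should be kept as well.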

Identifying a random repeating pattern in a structured text string

I have a string that has the following structure:
ABC_ABC_PQR_XYZ
Where PQR has the structure:
ABC+JKL
and
ABC itself is a string that can contain alphanumeric characters and a few other characters like "_", "-", "+", "." and follows no set structure:
eg.qWe_rtY-asdf or pkl123
so, in effect, the string can look like this:
qWe_rtY-asdf_qWe_rtY-asdf_qWe_rtY-asdf+JKL_XYZ
My goal is to find out what string constitutes ABC.
I was initially just using
$arrString = explode("_",$string);
to return $arrString[0] before I was made aware that ABC ($arrString[0]) itself can contain underscores, thus rendering it incorrect.
My next attempt was exploding it on "_" anyway and then comparing each of the exploded string parts with the first string part until I get a semblance of a pattern:
function getPatternABC($string)
{
$count = 0;
$pattern ="";
$arrString = explode("_", $string);
foreach($arrString as $expString)
{
if(strcmp($expString,$arrString[0])!==0 || $count==0)
{
$pattern = $pattern ."_". $arrString[$count];
$count++;
}
else break;
}
return substr($pattern,1);
}
This works great - but I wanted to know if there was a more elegant way of doing this using regular expressions?
Here is the regex solution:
'^([a-zA-Z0-9_+-]+)_\1_\1\+'
What this does is match (starting from the beginning of the string) the longest possible sequence consisting of the characters inside the square brackets (edit that per your spec). The sequence must appear exactly twice, each time followed by an underscore, and then must appear once more followed by a plus sign (this is actually the first half of PQR with the delimiter before JKL). The rest of the input is ignored.
You will find ABC captured as capture group 1.
So:
$input = 'qWe_rtY-asdf_qWe_rtY-asdf_qWe_rtY-asdf+JKL_XYZ';
$result = preg_match('/^([a-zA-Z0-9_+-]+)_\1_\1\+/', $input, $matches);
if ($result) {
echo $matches[1];
}
See it in action.
Sure, just make a regular expression that matches your pattern. In this case, something like this:
preg_match('/^([a-zA-Z0-9_+.-]+)_\1_\1\+JKL_XYZ$/', $string, $match);
Your ABC is in $match[1].
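To check the backreference approach against the examples from the question, the pattern can be wrapped in a function. This is only a sketch: extractAbc is an invented name, and it uses the shorter pattern that stops after the plus sign, with "." added to the character class as the question allows it.

```php
<?php
// Hypothetical wrapper: returns ABC, or null if the structure is not found.
function extractAbc(string $input): ?string
{
    // Group 1 must repeat twice more: X_X_X+ at the start of the string.
    if (preg_match('/^([a-zA-Z0-9_+.-]+)_\1_\1\+/', $input, $m) === 1) {
        return $m[1];
    }
    return null;
}

var_dump(extractAbc('qWe_rtY-asdf_qWe_rtY-asdf_qWe_rtY-asdf+JKL_XYZ'));
var_dump(extractAbc('pkl123_pkl123_pkl123+JKL_XYZ'));
var_dump(extractAbc('no-repeat_here'));
```

The first two calls return qWe_rtY-asdf and pkl123; the last returns null because nothing repeats.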
If the presence of underscores in these strings has a low frequency, it may be worth checking to see if a simple explode() will do it before bothering with regex.
<?php
$str = 'ABC_ABC_PQR_XYZ';
if(substr_count($str, '_') == 3)
$abc = explode('_', $str)[0]; // reset() needs a variable, not a function result
else
$abc = regexy_function($str);
?>
