I want to filter strings that I have in a CSV file, and I'm looking for a correct regexp that matches these strings:
PLP_LES_HALLES.VOLUME_POMPE
Newyork:Flow(m3/h)
In fact, the string should not contain any characters like : ç & é # ! ? “ ' ³ = + etc.
I tried this one:
([a-zA-Z0-9_:.(\/)]*) but when I tested it, I found that it matches everything. Please help me find the correct one.
Here is my code to test:
while (($line = fgetcsv($handle, 1024, ";")) !== FALSE) {
$total = count( $line );
$keys = array('mesure', 'timestamp', 'value');
$args=array(
'mesure' => array('filter' => FILTER_VALIDATE_REGEXP,
'options' => array('regexp' => '([a-zA-Z0-9_:.(\/)]*)')),
'timestamp' => array( 'filter' => FILTER_VALIDATE_INT,
'options' => array('min_range' => 20000000000000, 'length' => 14)),
'value' => FILTER_VALIDATE_FLOAT);
$testing = filter_var_array(array_combine($keys, $line), $args);
var_dump($testing);
}
EDIT
These strings should not match:
PLP_LES_HALLéS.VOLUME_POMPE
PLP_LES_HàLLES.VOLUME_POMPE
Newyork:Flow(m³/h)
To sum up, all strings that contain any characters from the list ç & é # ! ? “ ' ³ = + etc. should not match.
Your regex does not match the whole string, and you are using ambiguous regex delimiters (the outer parentheses are treated as the delimiters). It is recommended to use more common symbols as regex delimiters:
'/^[a-zA-Z0-9_:.()\/]*$/'
 ^^                   ^^
The ^ will match the start of the string, and $ will match its end, requiring a whole string match.
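Also, [a-zA-Z0-9_] can be written as \w, so you can use it to shorten the pattern. This is recommended only if you do not want to match Unicode strings: without the u modifier and in the default locale, \w stays limited to ASCII letters, digits and the underscore.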
'/^[\w:.()\/]*$/'
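To check the anchored pattern against the strings from the question, here is a minimal sketch. The helper name isValidMeasureName is hypothetical (it is not in the question), and the sample list is just the strings quoted above.

```php
<?php
// Hypothetical helper (not from the question): true when the whole name
// consists only of the allowed ASCII characters.
function isValidMeasureName(string $name): bool
{
    $options = ['options' => ['regexp' => '/^[a-zA-Z0-9_:.()\/]*$/']];
    // filter_var() returns the input string on a match and false otherwise.
    return filter_var($name, FILTER_VALIDATE_REGEXP, $options) !== false;
}

// Sample strings taken from the question.
$samples = [
    'PLP_LES_HALLES.VOLUME_POMPE',
    'Newyork:Flow(m3/h)',
    'PLP_LES_HALLéS.VOLUME_POMPE',
    'Newyork:Flow(m³/h)',
];

foreach ($samples as $sample) {
    echo $sample, ' => ', (isValidMeasureName($sample) ? 'match' : 'no match'), PHP_EOL;
}
```

Note that because of the * quantifier an empty string also passes; replace it with + if at least one character is required.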
Related
I have a string that consists of repeated words. I want to replace a substring 'OK' located between 'L3' and 'L4'. Below you can find my code:
$search = "/(?<=L3).*(OK).*(?=L4)/";
$replace = "REPLACEMENT";
$subject = "'L1' => ('Vanessa', 'Prague', 'OK'), 'L2' => ('Alex', 'Paris', 'OK'), 'L3' => ('Paul', 'Paris', 'OK'), 'L4' => ('John', 'Madrid', 'OK')";
$str = preg_replace($search, $replace, $str);
If I use that pattern with preg_match, it finds the correct substring (the third 'OK'). However, when I apply that pattern to preg_replace, it replaces the substring that matches the full pattern instead of the parenthesized subpattern.
So could you please advise me on what I should change in my code? I know that there are plenty of similar questions about regex, but as far as I understand my pattern is correct and I'm only confused by the preg_replace function.
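It is true that your regex matches a place in the string that is preceded by L3, then consumes 0+ chars other than linebreak symbols up to the last OK that can still be followed by L4, and then matches any 0+ chars up to the place followed by L4. However, preg_replace replaces the whole match, not just the captured group, so everything between L3 and L4 gets replaced.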
A possible solution is to use 2 capturing groups around the subpatterns before and after the OK, and use backreferences in the replacement pattern:
$search = "/(L3.*?)OK(.*?L4)/";
$replace = "REPLACEMENT";
$subject = "'L1' => ('Vanessa', 'Prague', 'OK'), 'L2' => ('Alex', 'Paris', 'OK'), 'L3' => ('Paul', 'Paris', 'OK'), 'L4' => ('John', 'Madrid', 'OK')";
$str = preg_replace($search, '$1'.$replace.'$2', $subject);
echo $str; // => 'L1' => ('Vanessa', 'Prague', 'OK'), 'L2' => ('Alex', 'Paris', 'OK'), 'L3' => ('Paul', 'Paris', 'REPLACEMENT'), 'L4' => ('John', 'Madrid', 'OK')
If there cannot be any L3.5 in between L3 and L4, the (L3.*?)OK(.*?L4) pattern is safe to use. It will match and capture L3 and then 0+ chars other than a linebreak up to the first OK, then will match OK, and then will match and capture 0+ chars up to the first L4.
If L4 may be missing, use a (?:(?!L4).)* tempered greedy token, which matches any symbol other than a linebreak symbol that does not start an L4 sequence:
'~(L3(?:(?!L4).)*)OK~'
NOTE: If you want to make the regexps safer, add ' around L# inside the patterns.
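As a rough sketch of how the tempered greedy token and the NOTE about quotes could be combined, the snippet below wraps L3 and L4 in ' inside the pattern and keeps everything before OK through a backreference. The variable name $result is only for illustration.

```php
<?php
$subject = "'L1' => ('Vanessa', 'Prague', 'OK'), 'L2' => ('Alex', 'Paris', 'OK'), 'L3' => ('Paul', 'Paris', 'OK'), 'L4' => ('John', 'Madrid', 'OK')";

// Group 1: 'L3' plus any chars that do not start 'L4', backtracking
// to the last OK before 'L4' (or before the end if 'L4' is absent).
$search = "~('L3'(?:(?!'L4').)*)OK~";

// ${1} keeps the captured prefix; only OK itself is replaced.
$result = preg_replace($search, '${1}REPLACEMENT', $subject);

echo $result, PHP_EOL;
// => 'L1' => ('Vanessa', 'Prague', 'OK'), 'L2' => ('Alex', 'Paris', 'OK'), 'L3' => ('Paul', 'Paris', 'REPLACEMENT'), 'L4' => ('John', 'Madrid', 'OK')
```

The ${1} form is used instead of $1 so the group number cannot run into the text that follows it.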
$string = '/start info#example.com';
$pattern = '/{command} {name}#{domain}';
I want to get an array of params in PHP from the string and the pattern, like the example below:
['command' => 'start', 'name' => 'info', 'domain' => 'example.com']
and
$string = '/start info#example.com';
$pattern = '/{command} {email}';
['command' => 'start', 'email' => 'info#example.com']
and
$string = '/start info#example.com';
$pattern = '{command} {email}';
['command' => '/start', 'email' => 'info#example.com']
If it's a single-line string, you can use preg_match and a regular expression such as this:
preg_match('/^\/(?P<command>\w+)\s(?P<name>[^#]+)\#(?P<domain>.+?)$/', '/start info#example.com', $match );
Depending on the variation in the data, you may have to adjust the regex a bit. This outputs
command [1-6] start
name [7-11] info
domain [12-23] example.com
but the $match array will also contain the numeric indexes.
https://regex101.com/r/jN8gP7/1
Just to break this down a bit, in English.
The leading ^ is the start of the line, followed by a literal slash \/, then a named capture of \w+ (any of a-z A-Z 0-9 _), then a whitespace character \s, then a named capture of anything but the # sign [^#]+, then the # sign itself \#, then a named capture of anything .+? up to the end $.
This will capture anything in this format,
(abc123_ ) space (anything but #)#(anything)
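If the pattern itself varies, as in the three examples from the question, one option is to build the regex from the {placeholder} template. The following is only a sketch under assumptions: the function name parseByTemplate is made up, every placeholder is assumed to be a valid group name (letters, digits, underscore, not starting with a digit), and each placeholder matches lazily with .+?.

```php
<?php
// Hypothetical helper: turns a template such as '/{command} {name}#{domain}'
// into a regex with named groups and returns the named matches, or null.
function parseByTemplate(string $template, string $input): ?array
{
    // Split the template into literal text and {placeholder} tokens.
    $parts = preg_split(
        '/(\{\w+\})/',
        $template,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );

    $regex = '';
    foreach ($parts as $part) {
        if (preg_match('/^\{(\w+)\}$/', $part, $m)) {
            // Placeholder: lazy named group.
            $regex .= '(?P<' . $m[1] . '>.+?)';
        } else {
            // Literal text: escape it for use between ~ delimiters.
            $regex .= preg_quote($part, '~');
        }
    }

    if (!preg_match('~^' . $regex . '$~', $input, $match)) {
        return null;
    }

    // Drop the numeric indexes, keep only the named keys.
    return array_filter($match, 'is_string', ARRAY_FILTER_USE_KEY);
}

print_r(parseByTemplate('/{command} {name}#{domain}', '/start info#example.com'));
print_r(parseByTemplate('{command} {email}', '/start info#example.com'));
```

Because every group is lazy, each placeholder stops at the first occurrence of the literal text that follows it in the template; the last one runs to the end of the string.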
I currently have code that displays data like so:
1
11 Title Here
2
21 Guns
A
Awesome
Using this:
foreach($animes as $currentAnime){
$thisLetter = strtoupper($currentAnime->title[0]);
$sorted[$thisLetter][] = array('title' => $currentAnime->title, 'id' => $currentAnime->id);
unset($thisLetter);
}
How do I group all numbers to a #, and all Symbols to a ~?
Like so:
#
11 Title Here
21 Guns
~
.ahaha
A
Awesome
Thank you for the advice.
You can check with is_numeric() if this is a number and with preg_match() if this is a symbol.
foreach($animes as $currentAnime){
$thisLetter = strtoupper($currentAnime->title[0]);
if(is_numeric($thisLetter))
{
$sorted['#'][] = array('title' => $currentAnime->title, 'id' => $currentAnime->id);
}
else if(preg_match('/[^a-zA-Z0-9]+/', $thisLetter))
{
$sorted['~'][] = array('title' => $currentAnime->title, 'id' => $currentAnime->id);
}
else
{
$sorted[$thisLetter][] = array('title' => $currentAnime->title, 'id' => $currentAnime->id);
}
unset($thisLetter);
}
[^a-zA-Z0-9]+ matches the first character when it is not an a-z letter, an A-Z letter or a 0-9 digit. You didn't specify exactly which characters count as symbols, so I've put only the basics in the regex; you can add other characters to the class if they should not count as symbols either. Alternatively, you can check whether the character is a letter with [a-zA-Z]+, put those titles into the letter groups, and let the last "else" branch handle everything that is neither a number nor a letter.
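The alternative described above (test for a letter explicitly and let everything else fall through to ~) could look roughly like this. The function name bucketFor is hypothetical, the ctype extension is assumed to be available, and putting an empty title under ~ is my own assumption.

```php
<?php
// Hypothetical helper: returns the group key for a title.
function bucketFor(string $title): string
{
    $first = strtoupper(substr($title, 0, 1));

    if ($first === '') {
        return '~';            // assumption: empty titles go with the symbols
    }
    if (ctype_digit($first)) {
        return '#';            // 0-9
    }
    if (ctype_alpha($first)) {
        return $first;         // A-Z (already upper-cased)
    }
    return '~';                // anything else counts as a symbol
}

// Sample titles from the question.
$sorted = [];
foreach (['11 Title Here', '21 Guns', '.ahaha', 'Awesome'] as $title) {
    $sorted[bucketFor($title)][] = $title;
}
print_r($sorted);
```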
I would like to add the dot . and the slash / characters to a regular expression.
'numericString' => array(
'pattern' => '^[a-zćęłńóśźżA-Z0-9\s]+$',
)
How can I do this?
Add \. and \/:
'numericString' => array(
'pattern' => '^[a-zćęłńóśźżA-Z0-9\s\.\/]+$',
)
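Note: outside a character class you must escape the characters "^.[$()|*+?{\" with a backslash ('\'), as they have special meaning. Inside a character class such as the one above, the dot is already literal, so escaping it is optional.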
Use the code below:
'numericString' => array(
'pattern' => '^[a-zćęłńóśźżA-Z0-9\s\.\/]+$',
)
A literal . is expressed in a regex as \.
A literal / is expressed as \/
Note: not all regex flavours require escaping the /, only the ones that use it for delimiting the regex.
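For a quick check outside the framework, here is a small sketch. It assumes that the 'pattern' value ends up between / delimiters, that the source file is saved as UTF-8, and that the u modifier is wanted so the Polish letters are treated as single characters; the function name matchesNumericString is made up for the example.

```php
<?php
// Hypothetical helper: applies the extended pattern with / delimiters.
// The u modifier makes PCRE treat pattern and subject as UTF-8.
function matchesNumericString(string $value): bool
{
    return preg_match('/^[a-zćęłńóśźżA-Z0-9\s.\/]+$/u', $value) === 1;
}

var_dump(matchesNumericString('ul. żółta 3/4')); // dot and slash are accepted
var_dump(matchesNumericString('a#b'));           // # is not in the class
```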
So, I have a text field that can contain only letters, numbers, hyphens, dots and underscores. I would like to validate it using Zend_Validate_Regex but this pattern does not work. Why?
/[a-z][A-Z][0-9]-_./
Here is my text element:
$titleSlug = new Zend_Form_Element_Text('title_slug', array(
'label' => 'Title Slug',
'required' => FALSE,
'filters' => array(
'StringTrim',
'Null'
),
'validators' => array(
array('StringLength', FALSE, array(3, 255)),
array('Regex', FALSE, array('pattern' => '/[a-z][A-Z][0-9]-_./'))
)
));
Your regex matches a string that contains a lowercase letter, an uppercase letter, a digit, a dash, an underscore and any other character, in that order. You need this:
/^[\w.-]*$/
^ and $ anchor the match at the start and end of the string.
\w matches letters, digits and underscore; together with the dot and dash they form a character class ([...]) which is repeated zero or more times (*).
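A minimal way to try the anchored pattern outside the form, using plain preg_match. The function name isValidSlug is hypothetical and is not part of Zend_Validate_Regex.

```php
<?php
// Hypothetical helper: true when the slug contains only letters, digits,
// underscores, dots and hyphens (the empty string also passes because of *).
function isValidSlug(string $slug): bool
{
    return preg_match('/^[\w.-]*$/', $slug) === 1;
}

var_dump(isValidSlug('my-title_1.0')); // only allowed characters
var_dump(isValidSlug('bad slug!'));    // contains a space and !
```

In the form from the question, the StringLength validator already enforces the 3 to 255 length, so the * quantifier does not need to change.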
How about this (anchored, with the alternatives merged into one character class so that the whole string is checked):
/^[a-zA-Z\d._-]*$/