PHP regex: each word must end with dot - php

Can someone help me write a pattern for the preg_match function?
Every word in the string must end with a dot
The first character of the string must be [a-zA-Z]
After each dot there can be a space
There can't be two spaces next to each other
The last character must be a dot (logically, after a word)
Examples:
"Ing" -> false
"Ing." -> true
".Ing." -> false
"Xx Yy." -> false
"XX. YY." -> true
"XX.YY." -> true
Can you please help me test the string? My pattern is
/^(([a-zA-Z]+)(?! ) \.)+\.$/
I know it's wrong, but I can't figure it out. Thanks

Check how this fits your needs.
/^(?:[A-Z]+\. ?)+$/i
^ matches start
(?: opens a non-capture group for repetition
[A-Z]+ with i flag matches one or more alphas (lower & upper)
\. ? matches a literal dot followed by an optional space
)+ all this once or more until $ end
Here's a demo at regex101
If you want to disallow space at the end, add negative lookbehind: /^(?:[A-Z]+\. ?)+$(?<! )/i
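A minimal sketch for checking the pattern above against the examples from the question. The helper name `endsEachWordWithDot` is made up for illustration, and the lookbehind variant is used on the assumption that a trailing space should be rejected because the last character must be a dot.

```php
<?php
// Hypothetical helper: true when every word in $s ends with a dot.
function endsEachWordWithDot(string $s): bool
{
    // Pattern from the answer above, including the negative lookbehind
    // that rejects a space at the very end of the string.
    return preg_match('/^(?:[A-Z]+\. ?)+$(?<! )/i', $s) === 1;
}

// Examples taken from the question.
$examples = ['Ing', 'Ing.', '.Ing.', 'Xx Yy.', 'XX. YY.', 'XX.YY.'];
foreach ($examples as $example) {
    printf("%-10s %s\n", '"' . $example . '"', var_export(endsEachWordWithDot($example), true));
}
```

Only "Ing.", "XX. YY." and "XX.YY." should be reported as true. A string with two consecutive spaces, such as "XX.  YY.", is rejected because the pattern allows at most one space after each dot.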

Try this:
$string = "Ing
Ing.
.Ing.
Xx Yy.
XX. YY.
XX.YY.";
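// Note: the trailing * lets the group repeat zero times, so this pattern also
// matches an empty string and preg_match() returns 1 for every input.
if (preg_match('/^([A-Za-z]{1,}\.[ ]{0,})*/m', $string)) {
// Successful match
} else {
// Match attempt failed
}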
The Regex in detail:
^ Assert position at the beginning of a line (at beginning of the string or after a line break character)
( Match the regular expression below and capture its match into backreference number 1
[A-Za-z] Match a single character present in the list below
A character in the range between “A” and “Z”
A character in the range between “a” and “z”
{1,} Between one and unlimited times, as many times as possible, giving back as needed (greedy)
\. Match the character “.” literally
[ ] Match the character “ ”
{0,} Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
)* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)

Related

Regular expression to find empty functions

I would like to use a regular expression that finds only functions that are empty in php files
For example
function name_not_important()
{
}
Regex can be function\s[^\(]+\([^)]*\)(\n)*{(\n)*}
From https://regex101.com/:
function matches the characters function literally (case sensitive)
\s matches any whitespace character (equivalent to [\r\n\t\f\v ])
[^\(]+ matches a single character not present in the list, between one and unlimited times, as many times as possible, giving back as needed (greedy)
\( matches the character ( literally (case sensitive)
[^)]* matches a single character not present in the list, between zero and unlimited times, as many times as possible, giving back as needed (greedy)
\) matches the character ) literally (case sensitive)
1st Capturing Group (\n)* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy). A repeated capturing group will only capture the last iteration. Put a capturing group around the repeated group to capture all iterations or use a non-capturing group instead if you're not interested in the data
\n matches a line-feed (newline) character (ASCII 10)
{ matches the character { literally (case sensitive)
2nd Capturing Group (\n)* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy). A repeated capturing group will only capture the last iteration. Put a capturing group around the repeated group to capture all iterations or use a non-capturing group instead if you're not interested in the data
\n matches a line-feed (newline) character (ASCII 10)
} matches the character } literally (case sensitive)
Global pattern flags
g modifier: global. All matches (don't return after first match)
m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
Note: This regex assumes the braces are not indented, i.e. only newlines separate ), { and }.
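A minimal sketch of how such a pattern could be applied with preg_match_all. The helper name `findEmptyFunctions` is hypothetical, and replacing (\n)* with \s* is an assumption made here so that indentation and \r\n line endings are tolerated; it is not part of the answer above.

```php
<?php
// Hypothetical helper: returns the source text of every empty function in $code.
function findEmptyFunctions(string $code): array
{
    // Same idea as the pattern above, but \s* replaces (\n)* so that spaces,
    // tabs and \r\n between ")", "{" and "}" are accepted as well.
    preg_match_all('/function\s+[^(]+\([^)]*\)\s*\{\s*\}/', $code, $matches);
    return $matches[0];
}

$code = <<<'PHP'
function name_not_important()
{
}

function has_body($x)
{
    return $x;
}

function indented_empty() {
    }
PHP;

print_r(findEmptyFunctions($code));
```

The sample should yield two matches, name_not_important and indented_empty. Like the original, this is a textual check only: it does not understand comments, strings, or closures, so a tokenizer-based approach would be more robust for real code bases.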

Preg_match which only accept the website address with or with out www. and http://

I am listing website addresses on my overview page. Before that, I have to validate each address against all possible cases.
After some research I found the regex below, but it does not give the exact result I need.
/((http|https)\:\/\/)?[a-zA-Z0-9\.\/\?\:#\-_=#]+\.([a-zA-Z0-9\.\/\?\:#\-_=#])*/
My possible test cases are:
'test.com',
'http://www.google.com',
'www.google.com',
'https://google.com',
'https://www.google.com',
'testetst',
'<img src="/test/test" >',
'<img src="/test/test.png" alt="page" title="page">'
I want only the domain name. The first five cases should return true and the rest should return false.
Try this:
Code:
<?php
$input = 'test.com
http://www.google.com
www.google.com
https://google.com
https://www.google.com
testetst
img src="/test/test" >
<img src="/test/test.png" alt="page" title="page">';
echo '<h3>Input</h3><pre>'.htmlentities($input).'</pre><h3>Output</h3>';
preg_match_all('%(http[s]{0,1}://)*([A-Za-z0-9-]*?\.){0,1}([A-Za-z0-9-]*?\.[A-Za-z0-9-]*?)[\s]*(\r\n|\n\r|\r|\n|$)%', $input, $regs, PREG_PATTERN_ORDER);
for ($i = 0; $i < count($regs[0]); $i++) {
// $regs[3][$i] contains domain name
echo $regs[3][$i] . '<br />';
}
Result:
Input:
test.com
http://www.google.com
www.google.com
https://google.com
https://www.google.com
testetst
img src="/test/test" >
<img src="/test/test.png" alt="page" title="page">
Output:
test.com
google.com
google.com
google.com
google.com
The Regex in detail:
( Match the regular expression below and capture its match into backreference number 1
http Match the characters “http” literally
[s] Match the character “s”
{0,1} Between zero and one times, as many times as possible, giving back as needed (greedy)
:// Match the characters “://” literally
)* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
( Match the regular expression below and capture its match into backreference number 2
[A-Za-z0-9-] Match a single character present in the list below
A character in the range between “A” and “Z”
A character in the range between “a” and “z”
A character in the range between “0” and “9”
The character “-”
*? Between zero and unlimited times, as few times as possible, expanding as needed (lazy)
\. Match the character “.” literally
){0,1} Between zero and one times, as many times as possible, giving back as needed (greedy)
( Match the regular expression below and capture its match into backreference number 3
[A-Za-z0-9-] Match a single character present in the list below
A character in the range between “A” and “Z”
A character in the range between “a” and “z”
A character in the range between “0” and “9”
The character “-”
*? Between zero and unlimited times, as few times as possible, expanding as needed (lazy)
\. Match the character “.” literally
[A-Za-z0-9-] Match a single character present in the list below
A character in the range between “A” and “Z”
A character in the range between “a” and “z”
A character in the range between “0” and “9”
The character “-”
*? Between zero and unlimited times, as few times as possible, expanding as needed (lazy)
)
[\s] Match a single character that is a “whitespace character” (spaces, tabs, and line breaks)
* Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
( Match the regular expression below and capture its match into backreference number 4
Match either the regular expression below (attempting the next alternative only if this one fails)
\r Match a carriage return character
\n Match a line feed character
| Or match regular expression number 2 below (attempting the next alternative only if this one fails)
\n Match a line feed character
\r Match a carriage return character
| Or match regular expression number 3 below (attempting the next alternative only if this one fails)
\r Match a carriage return character
| Or match regular expression number 4 below (attempting the next alternative only if this one fails)
\n Match a line feed character
| Or match regular expression number 5 below (the entire group fails if this one fails to match)
$ Assert position at the end of the string (or before the line break at the end of the string, if any)
)
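The answer above extracts domains from one multi-line block. If each address is instead validated on its own, as in the question's list of test cases, an anchored pattern is one possible alternative. This is a sketch under assumptions: the helper name `looksLikeWebAddress` is made up, and the pattern only checks the general shape (optional http:// or https://, then dot-separated labels), not whether the host or TLD really exists.

```php
<?php
// Hypothetical helper: true when $s is an optional scheme followed by a
// dotted host name and nothing else (no path, port, or query string).
function looksLikeWebAddress(string $s): bool
{
    return preg_match('~^(?:https?://)?(?:[a-z0-9-]+\.)+[a-z0-9-]+$~i', $s) === 1;
}

// Test cases taken from the question.
$cases = [
    'test.com',
    'http://www.google.com',
    'www.google.com',
    'https://google.com',
    'https://www.google.com',
    'testetst',
    '<img src="/test/test" >',
    '<img src="/test/test.png" alt="page" title="page">',
];
foreach ($cases as $case) {
    echo var_export(looksLikeWebAddress($case), true), "\t", $case, "\n";
}
```

With these inputs the first five cases should print true and the last three false, which is the result the question asks for.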

preg_match lookbehind after second slash

This is my string:
stringa/stringb/123456789,abc,cde
and after preg_match:
preg_match('/(?<=\/).*?(?=,)/',$array,$matches);
output is:
stringb/123456789
How can I change my preg_match to extract the string after second slash (or after last slash)?
Desired output:
123456789
You can match anything other than a / as
/(?<=\/)[^\/,]*(?=,)/
[^\/,]* Negated character class matches anything other than , or /
Regex Demo
Example
preg_match('/(?<=\/)[^\/,]*(?=,)/',$array,$matches);
// $matches[0]
// => 123456789
This should do it.
<?php
$array = 'stringa/stringb/123456789,abc,cde';
preg_match('~.*/(.*?),~',$array,$matches);
echo $matches[1];
?>
Disregard everything until the last forward slash (.*/). Once the last forward slash is found, keep all the data until the first comma ((.*?),).
You don't need to use lookbehind, i.e.:
$string = "stringa/stringb/123456789,abc,cde";
$string = preg_replace('%.*/(.*?),.*%', '$1', $string );
echo $string;
//123456789
Demo:
http://ideone.com/IxdNbZ
Regex Explanation:
.*/(.*?),.*
Match any single character that is NOT a line break character «.*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match the character “/” literally «/»
Match the regex below and capture its match into backreference number 1 «(.*?)»
Match any single character that is NOT a line break character «.*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the character “,” literally «,»
Match any single character that is NOT a line break character «.*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
$1
Insert the text that was last matched by capturing group number 1 «$1»
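The three approaches from the answers above can be compared side by side. This is only a sketch on the sample string from the question; the variable names $m1, $m2 and $replaced are chosen here for illustration.

```php
<?php
$string = 'stringa/stringb/123456789,abc,cde';

// 1) Lookbehind plus a negated character class: the match may contain
//    neither "/" nor ",", so it has to start right after the last slash.
preg_match('/(?<=\/)[^\/,]*(?=,)/', $string, $m1);

// 2) Greedy .*/ runs to the last slash; the lazy group captures up to the next comma.
preg_match('~.*/(.*?),~', $string, $m2);

// 3) Replace the whole string with the captured group.
$replaced = preg_replace('%.*/(.*?),.*%', '$1', $string);

var_dump($m1[0], $m2[1], $replaced);
```

All three should produce 123456789 for this input. They can differ on other inputs, for example when a slash appears after the first comma, so the choice depends on which slash is meant to count.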

php - regex - how to extract a number with decimal (dot and comma) from a string (e.g. 1,120.01)?

How do I extract a number with a decimal part (dot and comma) from a string (e.g. 1,120.01)?
I have a regex, but it doesn't seem to play well with commas:
preg_match('/([0-9]+\.[0-9]+)/', $s, $matches);
The correct regex for matching numbers with commas and decimals is as follows (The first two will validate that the number is correctly formatted):
decimal optional (two decimal places)
^[+-]?[0-9]{1,3}(?:,?[0-9]{3})*(?:\.[0-9]{2})?$
Explained:
number (decimal optional)
^[+-]?[0-9]{1,3}(?:,?[0-9]{3})*(?:\.[0-9]{2})?$
Options: case insensitive
Assert position at the beginning of the string «^»
Match a single character present in the list below «[+-]?»
Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
The character “+” «+»
The character “-” «-»
Match a single character in the range between “0” and “9” «[0-9]{1,3}»
Between one and 3 times, as many times as possible, giving back as needed (greedy) «{1,3}»
Match the regular expression below «(?:,?[0-9]{3})*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match the character “,” literally «,?»
Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
Match a single character in the range between “0” and “9” «[0-9]{3}»
Exactly 3 times «{3}»
Match the regular expression below «(?:\.[0-9]{2})?»
Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
Match the character “.” literally «\.»
Match a single character in the range between “0” and “9” «[0-9]{2}»
Exactly 2 times «{2}»
Assert position at the end of the string (or before the line break at the end of the string, if any) «$»
Will Match:
1,432.01
456.56
654,246.43
432
321,543
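Will not Match
324,123.432
,,,312,.32
123,.23
Note that 454325234.31 does match this pattern, because the comma in ,?[0-9]{3} is optional. Replace ,? with , to require the separators.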
decimal mandatory (two decimal places)
^[+-]?[0-9]{1,3}(?:,?[0-9]{3})*\.[0-9]{2}$
Explained:
number (decimal required)
^[+-]?[0-9]{1,3}(?:,?[0-9]{3})*\.[0-9]{2}$
Options: case insensitive
Assert position at the beginning of the string «^»
Match a single character present in the list below «[+-]?»
Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
The character “+” «+»
The character “-” «-»
Match a single character in the range between “0” and “9” «[0-9]{1,3}»
Between one and 3 times, as many times as possible, giving back as needed (greedy) «{1,3}»
Match the regular expression below «(?:,?[0-9]{3})*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match the character “,” literally «,?»
Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
Match a single character in the range between “0” and “9” «[0-9]{3}»
Exactly 3 times «{3}»
Match the character “.” literally «\.»
Match a single character in the range between “0” and “9” «[0-9]{2}»
Exactly 2 times «{2}»
Assert position at the end of the string (or before the line break at the end of the string, if any) «$»
Will Match:
1,432.01
456.56
654,246.43
324.75
Will Not Match:
1,43,2.01
456,
654,246
324.7523
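A short sketch that runs the two validation patterns above against a few of the listed values. The helper name `matches` is hypothetical; the patterns are copied unchanged from the answer.

```php
<?php
// Patterns copied from the answer above.
$optionalDecimal  = '/^[+-]?[0-9]{1,3}(?:,?[0-9]{3})*(?:\.[0-9]{2})?$/';
$mandatoryDecimal = '/^[+-]?[0-9]{1,3}(?:,?[0-9]{3})*\.[0-9]{2}$/';

// Hypothetical helper that turns preg_match()'s 1/0 result into a bool.
function matches(string $pattern, string $value): bool
{
    return preg_match($pattern, $value) === 1;
}

foreach (['1,432.01', '432', '654,246', '324.7523', '123,.23'] as $value) {
    printf(
        "%-9s optional=%s mandatory=%s\n",
        $value,
        var_export(matches($optionalDecimal, $value), true),
        var_export(matches($mandatoryDecimal, $value), true)
    );
}
```

Values without a fractional part, such as 432 and 654,246, should pass only the optional-decimal pattern, while 324.7523 and 123,.23 should fail both.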
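Matches numbers separated by commas or decimals indiscriminately (note that the unescaped . in this pattern matches any character, so use [.,] in place of (.|,) if only dots and commas should be accepted):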
^(\d+(.|,))+(\d)+$
Explained:
Matches Numbers Separated by , or .
^(\d+(.|,))+(\d)+$
Options: case insensitive
Match the regular expression below and capture its match into backreference number 1 «(\d+(.|,))+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Note: You repeated the capturing group itself. The group will capture only the last iteration. Put a capturing group around the repeated group to capture all iterations. «+»
Match a single digit 0..9 «\d+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the regular expression below and capture its match into backreference number 2 «(.|,)»
Match either the regular expression below (attempting the next alternative only if this one fails) «.»
Match any single character that is not a line break character «.»
Or match regular expression number 2 below (the entire group fails if this one fails to match) «,»
Match the character “,” literally «,»
Match the regular expression below and capture its match into backreference number 3 «(\d)+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Note: You repeated the capturing group itself. The group will capture only the last iteration. Put a capturing group around the repeated group to capture all iterations. «+»
Match a single digit 0..9 «\d»
Will Match:
1,32.543,2
5456.35,3.2,6.1
2,7
1.6
Will Not Match:
1,.2 // two ., side by side
1234,12345.5467. // ends in a .
,125 // begins in a ,
,.234 // begins in a , and two symbols side by side
123,.1245. // ends in a . and two symbols side by side
Note: wrap either in a group and then just pull the group, let me know if you need more specifics.
Description: This type of RegEx works with any language really (PHP, Python, C, C++, C#, JavaScript, jQuery, etc). These Regular Expressions are good for currency mainly.
You can use this regex: -
/((?:[0-9]+,)*[0-9]+(?:\.[0-9]+)?)/
Explanation: -
/(
(?:[0-9]+,)* # Match 1 or more repetition of digit followed by a `comma`.
# Zero or more repetition of the above pattern.
[0-9]+ # Match one or more digits before `.`
(?: # A non-capturing group
\. # A dot
[0-9]+ # Digits after `.`
)? # Make the fractional part optional.
)/
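The commented layout above can be used almost verbatim in PHP by adding the x modifier, which makes the engine ignore whitespace in the pattern and treat # as the start of a comment. A minimal sketch, with the sample sentence invented for illustration:

```php
<?php
// The x modifier allows whitespace and # comments inside the pattern.
$pattern = '/(
    (?:[0-9]+,)*   # zero or more groups of digits, each followed by a comma
    [0-9]+         # one or more digits before the dot
    (?:            # a non-capturing group
        \.         # a dot
        [0-9]+     # digits after the dot
    )?             # the fractional part is optional
)/x';

// Hypothetical input string.
if (preg_match($pattern, 'Total 1,120.01 due', $m) === 1) {
    echo $m[1], "\n";
}
```

Comments inside such a pattern must not contain the delimiter character (here /) or an unescaped single quote, since either would end the pattern or the PHP string early.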
Add the comma to the range that can be in front of the dot:
/([0-9,]+\.[0-9]+)/
# ^ Comma
And this regex:
/((?:\d,?)+\d\.[0-9]*)/
Will only match
1,067120.01
121,34,120.01
But not
,,,.01
,,1,.01
12,,,.01
# /(
# (?:\d,?) Matches a digit followed by an optional comma
# + One or more of the previous
# \d Followed by a digit (to prevent it from matching `1234,.123`)
# \. Followed by a dot
# if the fraction should be optional, add a `?` after the `\.`.
# [0-9]* Followed by any number of digits; if the fraction is mandatory, replace the `*` with a `+`
# )/
The locale-aware float (%f) might be used with sscanf.
$result = sscanf($s, '%f');
That doesn't split the parts into an array though. It simply parses a float.
See also: http://php.net/manual/en/function.sprintf.php
A regex approach:
/([0-9]{1,3}(?:,[0-9]{3})*\.[0-9]+)/
This should work
preg_match('/\d{1,3}(,\d{3})*(\.\d+)?/', $s, $matches);
Here is a great working regex. This accepts numbers with commas and decimals.
/^-?(?:\d+|\d{1,3}(?:,\d{3})+)?(?:\.\d+)?$/
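Putting the pieces together, one way to pull the first number out of a longer string and convert it to a float might look like this. It is a sketch under assumptions: the helper name `extractNumber` is made up, commas are taken to be thousands separators and the dot the decimal separator, and the pattern is a variation on the ones above rather than any single answer's regex.

```php
<?php
// Hypothetical helper: returns the first number found in $s as a float, or null.
function extractNumber(string $s): ?float
{
    // Either comma-grouped thousands (1,120) or a plain run of digits (1120),
    // followed by an optional fractional part.
    $pattern = '/(?:[0-9]{1,3}(?:,[0-9]{3})+|[0-9]+)(?:\.[0-9]+)?/';
    if (preg_match($pattern, $s, $m) !== 1) {
        return null;
    }
    // Drop the thousands separators before casting to float.
    return (float) str_replace(',', '', $m[0]);
}

var_dump(extractNumber('Total: 1,120.01 USD')); // float(1120.01)
var_dump(extractNumber('no digits here'));      // NULL
```

The alternation tries the comma-grouped form first, so an input like 1120.01 without separators falls through to the plain-digits branch instead of being cut off after three digits.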

Regular Expression (preg_match)

This is the code that does not work:
<?php
$matchWith = " http://videosite.com/ID123 ";
preg_match_all('/\S\/videosite\.com\/(\w+)\S/i', $matchWith, $matches);
foreach($matches[1] as $value)
{
print 'Hyperlink';
}
?>
What I want is that it should not display the link if it has a whitespace before or after.
So now it should display nothing. But it still displays the link.
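The original pattern still matches, because \S only requires one non-whitespace character on each side: the first / of http:// satisfies the leading \S, and the group captures ID12 while the trailing \S consumes the 3. You can try: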
preg_match_all('/^\S*\/videosite\.com\/(\w+)\S*$/i', $matchWith, $matches);
So you don't want it to display if there is whitespace. Something like this should work, though I didn't test it.
preg_match_all('/^\S+?videosite\.com\/(\w+)\S+?$/i', $matchWith, $matches);
You can try this. It works:
if (preg_match('%^\S*?/videosite\.com/(\w+)(?!\S+)$%i', $subject, $regs)) {
#$result = $regs[0];
}
But i am positive that after I post this, you will update your question :)
Explanation:
"
^ # Assert position at the beginning of the string
\S # Match a single character that is a “non-whitespace character”
*? # Between zero and unlimited times, as few times as possible, expanding as needed (lazy)
\/ # Match the character “/” literally
videosite # Match the characters “videosite” literally
\. # Match the character “.” literally
com # Match the characters “com” literally
\/ # Match the character “/” literally
( # Match the regular expression below and capture its match into backreference number 1
\w # Match a single character that is a “word character” (letters, digits, etc.)
+ # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
)
(?! # Assert that it is impossible to match the regex below starting at this position (negative lookahead)
\S # Match a single character that is a “non-whitespace character”
+ # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
)
$ # Assert position at the end of the string (or before the line break at the end of the string, if any)
"
It would probably be simpler to use this regex:
'/^http:\/\/videosite\.com\/(\w+)$/i'
I believe you are referring to the white space before http, and the white space after the directory. So, you should use the ^ character to indicate that the string must start with http, and use the $ character at the end to indicate that the string must end with a word character.
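A minimal sketch of the anchored approach, wrapped in a helper. The function name `extractVideoId` is hypothetical, and the D modifier is an addition made here so that $ does not accept a trailing newline; the rest of the pattern is the one suggested above.

```php
<?php
// Hypothetical helper: returns the video ID, or null when the input is not
// exactly a videosite.com link (for example because of surrounding whitespace).
function extractVideoId(string $url): ?string
{
    // ^ and $ anchor the whole string; D makes $ match only at the very end.
    if (preg_match('/^http:\/\/videosite\.com\/(\w+)$/iD', $url, $m) === 1) {
        return $m[1];
    }
    return null;
}

var_dump(extractVideoId(' http://videosite.com/ID123 ')); // NULL
var_dump(extractVideoId('http://videosite.com/ID123'));   // string(5) "ID123"
```

If input with stray whitespace should be accepted instead of rejected, calling trim() on the string before matching is the usual alternative.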
