regular expression to match words with space or no space - php

I am trying to find a PHP regular expression that matches a phrase like "Hello World" with a space and also matches "HelloWorld" without a space.

You could use:
/^Hello ?World$/
Or, if you don't care about the number of spaces:
/^Hello *World$/
Or, if the separator could also be other blank characters such as a tab, use \s instead of a space.
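As a rough sketch of the \s variant in PHP (the helper name isHelloWorld is made up for this illustration):

```php
<?php
// Hypothetical helper: true when $s is "Hello" and "World" separated by
// zero or more whitespace characters (space, tab, ...).
function isHelloWorld(string $s): bool
{
    return preg_match('/^Hello\s*World$/', $s) === 1;
}

var_dump(isHelloWorld('HelloWorld'));    // bool(true)
var_dump(isHelloWorld('Hello World'));   // bool(true)
var_dump(isHelloWorld("Hello\tWorld"));  // bool(true)
var_dump(isHelloWorld('Hello-World'));   // bool(false)
```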

Generally that would be:
/[a-zA-Z ]+/
If you want numbers, too:
/[a-zA-Z0-9 ]+/
You would need to set some sort of boundary. If the string should contain nothing else, you can use start and end anchors:
/^[a-zA-Z0-9 ]+$/
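To see what the anchors change, here is a small sketch comparing the unanchored and anchored patterns (the helper name patternMatches is only for illustration):

```php
<?php
// Hypothetical helper wrapping preg_match into a boolean.
function patternMatches(string $pattern, string $s): bool
{
    return preg_match($pattern, $s) === 1;
}

$unanchored = '/[a-zA-Z0-9 ]+/';
$anchored   = '/^[a-zA-Z0-9 ]+$/';

// Unanchored: succeeds because the substring "Hello World" matches.
var_dump(patternMatches($unanchored, 'Hello World!'));   // bool(true)
// Anchored: the "!" is not in the class, so the whole string is rejected.
var_dump(patternMatches($anchored, 'Hello World!'));     // bool(false)
var_dump(patternMatches($anchored, 'Hello World 42'));   // bool(true)
```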

Related

How to check if string contains specific special characters or starting with a space? [duplicate]

I have the following requirements for validating an input field:
It should contain only letters, with spaces allowed between them.
It cannot contain spaces at the beginning or end of the string.
It cannot contain any other special character.
I am using the following regex for this:
^(?!\s*$)[-a-zA-Z ]*$
But this is allowing spaces at the beginning. Any help is appreciated.
For me the only logical way to do this is:
^\p{L}+(?: \p{L}+)*$
At the start of the string there must be at least one letter. (I replaced your [a-zA-Z] with the Unicode property for letters, \p{L}.) Then there can be a space followed by at least one letter, and this part can be repeated.
\p{L}: any kind of letter from any language. See regular-expressions.info
The problem with ^(?!\s*$) in your expression is that the lookahead only fails when there is nothing but whitespace up to the end of the string. If you want to disallow leading whitespace, just remove the end-of-string anchor inside the lookahead: ^(?!\s)[-a-zA-Z ]*$. But this still allows the string to end with whitespace. To avoid that, add a lookbehind at the end of the string: ^(?!\s)[-a-zA-Z ]*(?<!\s)$. But I think a lookaround is not needed for this task.
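For readers trying the \p{L} pattern from PHP, a minimal sketch follows. It assumes UTF-8 input and adds the u modifier so that \p{L} works on multi-byte letters; the function name is hypothetical.

```php
<?php
// Hypothetical validator: letters only, words separated by exactly one
// space, no leading or trailing space. Assumes $s is UTF-8.
function isLetterWords(string $s): bool
{
    return preg_match('/^\p{L}+(?: \p{L}+)*$/u', $s) === 1;
}

var_dump(isLetterWords('John Smith'));   // bool(true)
var_dump(isLetterWords('José Núñez'));   // bool(true)
var_dump(isLetterWords(' John'));        // bool(false): leading space
var_dump(isLetterWords('John '));        // bool(false): trailing space
var_dump(isLetterWords('John  Smith'));  // bool(false): two spaces
```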
This should work if you use it with the String.matches method. I assume you want the English alphabet.
"[a-zA-Z]+(\\s+[a-zA-Z]+)*"
Note that \s will allow all kinds of whitespace characters. In Java, it would be equivalent to
[ \t\n\x0B\f\r]
This includes horizontal tab (09), line feed (10), vertical tab (11), form feed (12), carriage return (13), and space (32).
If you want to specifically allow only space (32):
"[a-zA-Z]+( +[a-zA-Z]+)*"
You can further optimize the regex above by making the capturing group ( +[a-zA-Z]+) non-capturing (with String.matches you are not going to be able to get the words individually anyway). It is also possible to change the quantifiers to make them possessive, since there is no point in backtracking here.
"[a-zA-Z]++(?: ++[a-zA-Z]++)*+"
Try this:
^(((?<!^)\s(?!$)|[-a-zA-Z])*)$
This expression uses a negative lookahead and a negative lookbehind to disallow spaces at the beginning or at the end of the string, and requires the entire string to match.
The ? in (?!\s*$) is part of the negative-lookahead syntax rather than an optional quantifier, so that lookahead only rejects strings that consist entirely of whitespace; it does nothing about a leading space.
This should work:
[a-zA-Z]{1}([a-zA-Z\s]*[a-zA-Z]{1})?
That is: at least one letter, then an optional run of letters and whitespace that always ends with a letter.
I don't know if words in your accepted string can be separated by more than one space. If they can:
^[a-zA-Z]+(( )+[a-zA-Z]+)*$
If they can't:
^[a-zA-Z]+( [a-zA-Z]+)*$
The string must start with a letter (or several letters), not a space.
The string can contain several words, but every word except the first must have a space before it.
Hope I helped.
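A quick way to check the single-space pattern in PHP is sketched below (the function name is invented for the example). The last two lines illustrate why the range should be typed as A-Z: a range written A-z also covers the ASCII characters between Z and a, such as the underscore.

```php
<?php
// Hypothetical validator for letter-only words separated by one space.
function hasOnlyLetterWords(string $s): bool
{
    return preg_match('/^[a-zA-Z]+( [a-zA-Z]+)*$/', $s) === 1;
}

var_dump(hasOnlyLetterWords('foo bar baz'));  // bool(true)
var_dump(hasOnlyLetterWords(' foo'));         // bool(false)
var_dump(hasOnlyLetterWords('foo  bar'));     // bool(false)

// [A-z] spans ASCII 65..122, which includes [ \ ] ^ _ and the backtick.
var_dump(preg_match('/^[A-z]+$/', 'foo_bar'));     // int(1)
var_dump(preg_match('/^[a-zA-Z]+$/', 'foo_bar'));  // int(0)
```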

What is the difference between 2 regex patterns?

I want users to enter a username containing only alphanumeric characters and dots.
So I wrote a regex pattern as following:
'/([a-zA-Z0-9\.]+)/'
But I want to know whether it is the same as:
'/([a-zA-Z0-9.]+)/'
Are the two patterns above the same? Thank you for your help! :-)
You don't need to escape a dot inside a character class. Inside a character class, both the dot . and the escaped dot \. match a literal dot, so the two regexes are the same.
Also, for validation purposes, I would suggest adding anchors, as in '/^[a-zA-Z0-9.]+$/'. Anchors force the whole string to match. That is, the regex /[a-zA-Z0-9.]+/ would match the substring foo in the input string ()foo, but with start and end anchors, /^[a-zA-Z0-9.]+$/ won't match even a single character of that string. The anchored regex matches only when the string consists entirely of one or more alphanumeric or dot characters; if the string contains any other character, the match fails.
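Both points can be checked with a short sketch (isValidUsername is a hypothetical name used only here):

```php
<?php
// Both character classes treat the dot as a literal, escaped or not.
$escaped = '/([a-zA-Z0-9\.]+)/';
$plain   = '/([a-zA-Z0-9.]+)/';

preg_match($escaped, '()foo', $m1);
preg_match($plain, '()foo', $m2);
var_dump($m1[1] === $m2[1]);  // bool(true): both capture "foo"

// Hypothetical validator using the anchored form suggested above.
function isValidUsername(string $name): bool
{
    return preg_match('/^[a-zA-Z0-9.]+$/', $name) === 1;
}

var_dump(isValidUsername('john.doe42'));  // bool(true)
var_dump(isValidUsername('()foo'));       // bool(false)
```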

Explode and/or regex text to HTML link in PHP

I have a database of texts that contains this kind of syntax in the middle of English sentences that I need to turn into HTML links using PHP
"text1(text1)":http://www.example.com/mypage
Notes:
text1 is always identical to the text in parentheses.
The whole string always has the quotation marks, parentheses, and colon, so the syntax is the same for each.
Sometimes there is a space at the end of the string, but other times there is a question mark, comma, or other punctuation mark.
I need to turn these into basic links, like
<a href="http://www.example.com/mypage">text1</a>
How do I do this? Do I need explode or regex or both?
"(.*?)\(\1\)":(.*\/[a-zA-Z0-9]+)(?=\?|\,|\.|$)
You can use this.
See Demo.
http://regex101.com/r/zF6xM2/2
You can use this replacement:
$pattern = '~"([^("]+)\(\1\)":(http://\S+)(?=[\s\pP]|\z)~';
$replacement = '<a href="\2">\1</a>';
$result = preg_replace($pattern, $replacement, $text);
pattern details:
([^("]+) captures text1 in group 1. Using a negated character class (one that excludes the double quote and the opening parenthesis) has two advantages:
it allows a greedy quantifier, which is faster
since the class excludes the opening parenthesis and is immediately followed by a parenthesis in the pattern, if another part of the text has content between double quotes but no parenthesis inside, the regex engine skips that substring without backtracking to test other possibilities. (This is because the PCRE regex engine automatically converts [^a]+a into [^a]++a before processing the string.)
\S+ matches one or more non-whitespace characters
(?=[\s\pP]|\z) is a lookahead assertion that checks that the URL is followed by whitespace, a punctuation character (\pP), or the end of the string.
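Here is one possible end-to-end sketch of the conversion. It is a variation on the pattern above, not the same pattern: because a greedy \S+ also swallows a comma or question mark that directly follows the URL, this version requires the URL to end in a character that is neither whitespace nor punctuation. The function name linkify and the use of htmlspecialchars are my own additions. A URL that legitimately ends in punctuation, such as a trailing slash, would lose that character.

```php
<?php
// Hypothetical helper: converts "label(label)":http://... markers to <a> tags.
function linkify(string $text): string
{
    // Group 1: the label (repeated inside the parentheses).
    // Group 2: the URL, forced to end in a non-space, non-punctuation
    // character so that a trailing "," or "?" stays outside the link.
    $pattern = '~"([^("]+)\(\1\)":(http://\S*[^\s\pP])~u';

    $result = preg_replace_callback($pattern, function (array $m): string {
        return '<a href="' . htmlspecialchars($m[2]) . '">'
            . htmlspecialchars($m[1]) . '</a>';
    }, $text);

    // preg_replace_callback returns null on error (e.g. invalid UTF-8).
    return $result ?? $text;
}

echo linkify('See "text1(text1)":http://www.example.com/mypage, please.');
// See <a href="http://www.example.com/mypage">text1</a>, please.
```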
You can use this regex:
"(.*?)\(.*?:(.*)
An appropriate Regular Expression could be:
$str = '"text1(text1)":http://www.example.com/mypage';
preg_match('#^"([^\(]+)' .
'\(([^\)]+)\)[^"]*":(.+)#', $str, $m);
print '<a href="' . $m[3] . '">' . $m[2] . '</a>' . PHP_EOL;

php: strip everything except alphanumeric unicode and two characters

I am trying to strip all punctuation from a text, but since the text is in Spanish I can't use [A-Za-z0-9].
I have found this regex:
trim(preg_replace('#[^\p{L}\p{N}]+#u', ' ', $str))
which seems to do the job, but I would like to keep two special characters, @ and #. How can I achieve that?
Extra question: How can I delete all strings that are just numbers? e.g. 123 would be deleted but not as5623.
Thanks in advance!
You can simply add those characters to your negated class to retain them. And be sure to change your pattern delimiters to something other than # as well.
~[^\p{L}\p{N}@#]+~u
To remove all strings that are numbers, you can place word boundaries \b around your pattern.
\b\d+\b
Note: A word boundary does not consume any characters. It asserts that on one side there is a word character, and on the other side there is not.
You can use POSIX character classes too.
/[^[:alnum:]@#]+/
For the two special characters, you just have to add them inside the character class.
To delete all words that contain only digits, the following regex would work.
/\b[[:digit:]]+\b/
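Putting the two steps together might look like the sketch below. It assumes the two characters to keep are @ and #, that the input is UTF-8, and that collapsing the leftover spaces is acceptable; the function name is made up for the example.

```php
<?php
// Hypothetical helper combining both steps.
function stripPunctuation(string $str): string
{
    // 1. Replace runs of anything that is not a letter, digit, @ or #.
    $clean = preg_replace('~[^\p{L}\p{N}@#]+~u', ' ', $str);
    // 2. Remove tokens that consist only of digits.
    $clean = preg_replace('~\b\d+\b~u', '', $clean);
    // 3. Collapse the spaces left behind and trim the ends.
    return trim(preg_replace('~ +~', ' ', $clean));
}

echo stripPunctuation('¡Hola, @señor! 123 as5623 #tag');
// Hola @señor as5623 #tag
```

Note that \b sees a boundary between # and a digit, so an input like "#123" would be reduced to "#" by step 2.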

PHP find words which are not in regex

I have written the following regular expression, /^[A-Za-z0-9-_\s]*$/, in PHP, which allows numbers, letters, spaces, hyphens, and underscores. I want to display the parts of the input that are not valid against the regex, i.e., "My Name is Blahblah!!!" should give me "!!!" as output.
Use the caret symbol inside the character class to invert the match and remove the start (^) and end ($) characters:
/[^A-Za-z0-9-_\s]+/
http://php-regex.blogspot.com/2008/01/how-to-negate-character-class.html
If you replace all the matches with the empty string then you'll get the non-matching parts back:
preg_replace('/[A-Za-z0-9-_\s]+/', '', $string)
This will work for any arbitrary regex, but for your specific regex @Andy's solution is simpler.
Notice that I removed the anchors ^ and $ to make this work.
preg_replace("/[A-Za-z0-9-_\s]+/", "", "My Name is Blahblah!!!") // Output: "!!!"
Or, if you want all the groupings of them
preg_split("/[A-Za-z0-9-_\s]+/", "My Name is Blahblah!!!", -1, PREG_SPLIT_NO_EMPTY)
You have to put the hyphen - at the beginning or at the end of the character class, or escape it, so your regex would be:
/[^-A-Za-z0-9_\s]+/
or
/[^A-Za-z0-9_\s-]+/
or
/[^A-Za-z0-9\-_\s]+/
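As a small sketch, the negated class can be combined with preg_match_all to list every run of disallowed characters (the function name invalidRuns is hypothetical):

```php
<?php
// Hypothetical helper: returns each run of characters that is not a
// letter, digit, underscore, whitespace, or hyphen.
function invalidRuns(string $s): array
{
    preg_match_all('/[^A-Za-z0-9_\s-]+/', $s, $m);
    return $m[0];
}

print_r(invalidRuns('My Name is Blahblah!!!'));  // one element: "!!!"
print_r(invalidRuns('a+b = c?'));                // "+", "=", "?"
```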
