I have this code:
preg_match("/[^-+*%0-9]+/", $your_string, $matches)
It works great but I would like to be able to add the "/" character and I don't know how.
You can use a different pattern delimiter, such as a hash, instead of a forward slash, and then just match the forward slash like any other character:
preg_match('#^/#', $subject);
Alternatively, the slash needs to be escaped when it is also the pattern delimiter, so you can use it like the following:
preg_match("/[^-+\/*%0-9]+/", $your_string, $matches)
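As a rough sketch, the two approaches above (escaping the slash, or switching the delimiter) can be compared side by side. The sample input is made up for illustration and is not from the question.

```php
<?php
// Hypothetical sample input; not from the original question.
$your_string = "12+3/4 abc 5*6";

// Option 1: keep "/" as the delimiter and escape the slash inside the class.
preg_match('/[^-+\/*%0-9]+/', $your_string, $escaped);

// Option 2: switch the delimiter to "#", so the slash needs no escaping.
preg_match('#[^-+/*%0-9]+#', $your_string, $hashed);

// Both patterns find the same first run of characters outside the allowed set.
var_dump($escaped[0] === $hashed[0]);
```

Either form should behave the same; the alternative delimiter is mostly a readability choice when the pattern contains slashes.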
I am trying to use PHP's preg_match() to retrieve everything between the www. and .com of a URL.
e.g.:
www.example.com will return example
www.example-website.com will return example-website
I'm lucky in that the URLs I'm working with always start www. and always end .com, so it doesn't need to be particularly complex, accounting for many use cases.
However, my Regex knowledge is minimal to none.
My try:
preg_match("/.([^.]*)./", $string, $matches);
According to RegExr, the second match ($matches[1]?) should contain what I need, but it doesn't seem to be working.
Thanks.
(?<=www\.)(.+?)(?=\.com)
Try this. Grab the capture. See the demo:
http://regex101.com/r/iZ9sO5/10
You need to escape the dots in the regex.
preg_match("/www\.([^.]*)\.com/", $string, $matches);
. in a regex can match (nearly) any character, whereas \. matches only the literal dot within the URL.
Anchoring the pattern on www and com delimits the part of the URL to capture, which gives extra safety.
Example : http://regex101.com/r/aA5eC5/2
The first capture group (\1) will contain
example
example-website
EDIT
If the regex is to match strings with other . in it, something like www.example.somesite.com, then the regex can be modified as
preg_match("/www\.(.+)\.com/", $string, $matches);
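A minimal sketch of the escaped-dot pattern in use. The helper name extractHostLabel is hypothetical and only wraps the preg_match call shown above.

```php
<?php
// Hypothetical helper name; returns the part between "www." and ".com", or null.
function extractHostLabel(string $url): ?string
{
    // Escaped dots match a literal "."; [^.]* stops at the next dot.
    if (preg_match('/www\.([^.]*)\.com/', $url, $matches)) {
        return $matches[1];
    }
    return null;
}

echo extractHostLabel('www.example.com'), "\n";
echo extractHostLabel('www.example-website.com'), "\n";
```

With [^.]* the helper returns null for a host such as www.example.somesite.com, which is the case the (.+) variant is meant to cover.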
I have this text string:
$text="::tower_unit::7::/tower_unit::<br/>::tower_unit::8::/tower_unit::<br/>::tower_unit::9::/tower_unit::";
Now I want to get the values 7, 8, and 9.
How can I do that with preg_match_all?
I've tried this:
$pattern="/::tower_unit::(.*)::\/tower_unit::/i";
preg_match($pattern,$text,$matches);
print_r($matches);
but it's still all wrong...
You forgot to escape the slash in your pattern. Since your pattern includes slashes, it's easier to use a different regex delimiter, as suggested in the comments:
$pattern="#::tower_unit::(\d+)::/tower_unit::#";
preg_match_all($pattern,$text,$matches);
I also converted (.*) to (\d+), which is better if the token you're looking for will always be a number. Plus, you might want to drop the i modifier if the text is always lowercase.
Your regex is "greedy".
Use the following one
$pattern="#::tower_unit::(.*?)::/tower_unit::#i";
or
$pattern="#::tower_unit::(.*)::/tower_unit::#iU";
You can also use \d+ instead of .*? or .* if you wish.
Also, the function should be preg_match_all, not preg_match.
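To make the greedy versus lazy difference concrete, here is a small sketch that runs the three variants discussed above against the string from the question.

```php
<?php
$text = "::tower_unit::7::/tower_unit::<br/>::tower_unit::8::/tower_unit::<br/>::tower_unit::9::/tower_unit::";

// Greedy: (.*) runs to the last closing tag, so only one match is found.
preg_match_all('#::tower_unit::(.*)::/tower_unit::#', $text, $greedy);

// Lazy: (.*?) stops at the first closing tag after each opening tag.
preg_match_all('#::tower_unit::(.*?)::/tower_unit::#', $text, $lazy);

// Digits only: (\d+) cannot run past the "::" separator at all.
preg_match_all('#::tower_unit::(\d+)::/tower_unit::#', $text, $digits);

print_r($lazy[1]);   // the captured values, one per tag pair
```

The captured values live in index 1 of the result array because preg_match_all groups results by capture group by default.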
There are quite a few questions on removing multiple slashes using regex in PHP. However, I have a special case I would like to exclude.
I have a full URL as my input: http://localhost/path/to/whatever
I have written a regex to convert backslashes to forward slashes, and then remove multiple consecutive slashes:
$cleaned = preg_replace('/(\\\+)|(\/+)/', "/", trim($input));
This works fine for the most part. However, I need to be able to exclude the :// case, otherwise the expression produces the following, which is not the intended result:
http:/localhost/path/to/whatever
I have tried using /(\\\+)|^[:](\/+)/, but this doesn't seem to work.
How can I exclude the :// case in my expression?
$cleaned = preg_replace('~(?<!https:|http:)[/\\\\]+~', "/", trim($input));
The subexpression inside the lookbehind can't use quantifiers, so the obvious approach - (?<!https?:) - won't work. But it can be made up of two or more fixed-length alternatives with different lengths. For example:
(?<!https:|http:) # OK
Be aware that the alternation has to be at the top level of the lookbehind, so this won't work:
(?<!(https:|http:)) # error
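A short sketch of the lookbehind-based replacement applied to a messy URL. The helper name cleanSlashes and the sample inputs are hypothetical.

```php
<?php
// Hypothetical helper name wrapping the lookbehind-based replacement.
function cleanSlashes(string $input): string
{
    // Replace any run of "/" or "\" with one "/", unless the run
    // directly follows "http:" or "https:".
    return preg_replace('~(?<!https:|http:)[/\\\\]+~', '/', trim($input));
}

echo cleanSlashes('http://localhost//path\\to///whatever'), "\n";
```

The scheme separator survives because the first slash after the colon is rejected by the lookbehind, and the run that starts at the second slash is collapsed back to a single slash.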
There is something called a "negative lookbehind" (positive lookbehinds, and lookaheads in both forms, are also available):
http://www.phpro.org/tutorials/Introduction-to-PHP-Regex.html
With this you could add an exception with something like
(?<!http:|https:)
Then your expression will only match in places NOT preceded by "http:" or "https:".
Simply a negative look-behind for a colon, preceding two or more forward or backward slashes:
$cleaned = preg_replace('/(?<!:)(?:\\/|\\\\){2,}/', "/", trim($input));
I want to find URL like following with preg_match.
http://www.website.com/THE_ID_WHICH_I_WANT/RANDOM_CHARACTERS_AND_NUMBERS.RANDOM_SOMETHING.html
This is how far I got:
preg_match_all('%http://www.website\.com\/(\w+)%', $string, $matches);
But I also want it to get the random characters.
Thank you.
For matching anything it's customary to use .+ or the non-greedy .*?
You might want to use \S+ which matches anything that isn't a space character. And even then it might be too much. But you didn't really elaborate about the context in which you want to use it.
preg_match_all('%http://www\.website\.com/(\w+)/(.*)\.html%', $string, $matches);
The above is assuming that you want to separate "THE_ID_WHICH_I_WANT" from the other random characters.
Example: http://regexr.com?2v9t7
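As an illustration of the two capture groups, here is a sketch with a made-up URL in the shape described in the question. The ID and file name are assumptions, not values from the original.

```php
<?php
// Hypothetical sample; the ID and file name are made up for illustration.
$string = 'See http://www.website.com/abc123/x9y8z7.something.html for details';

// Group 1: the ID segment. Group 2: everything up to the final ".html".
preg_match_all('%http://www\.website\.com/(\w+)/(.*)\.html%', $string, $matches);

echo $matches[1][0], "\n";  // the ID
echo $matches[2][0], "\n";  // the remaining characters before ".html"
```

Because (.*) is greedy, a string containing several such URLs on one line would be swallowed into a single match; (.*?) or \S+? may be a safer choice there.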
I've created this regex
(www|http://)[^ ]+
that matches every http://... or www.... but I don't know how to write a preg_replace that would work. I've tried
preg_replace('/((www|http://)[^ ]+)/', '\1', $str);
but it doesn't work, the result is empty string.
You need to escape the slashes in the regex because you are using slashes as the delimiter. You could also use another symbol as the delimiter.
// escaped
preg_replace('/((www|http:\/\/)[^ ]+)/', '\1', $str);
// another delimiter, '#'
preg_replace('#((www|http://)[^ ]+)#', '\1', $str);
When using the regex codes provided by the other users, be sure to add the "i" flag to enable case-insensitivity, so it'll work with both HTTP:// and http://. For example, using chaos's code:
preg_replace('!((?:www|http://)[^ ]+)!i', '\1', $str);
First of all, you need to escape (or, even better, replace) the delimiters, as explained in the other answers.
preg_replace('~((www|http://)[^ ]+)~', '\1', $str);
Secondly, to further improve the regex, the $n replacement reference syntax is preferred over \\n, as stated in the manual.
preg_replace('~((www|http://)[^ ]+)~', '$1', $str);
Thirdly, you are needlessly using capturing parentheses, which only slows things down. Get rid of them. Don't forget to update $1 to $0. In case you are wondering, these are non-capturing parentheses: (?: ).
preg_replace('~(?:www|http://)[^ ]+~', '$0', $str);
Finally, I would replace [^ ]+ with the shorter and more accurate \S+, where \S is the opposite of \s. Note that [^ ]+ does not allow spaces, but accepts newlines and tabs! \S does not.
preg_replace('~(?:www|http://)\S+~', '$0', $str);
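A sketch of the final pattern with a visible replacement. The goal of wrapping each URL in an anchor tag is an assumption made for illustration, since the replacement strings in this thread only echo the match back unchanged.

```php
<?php
// Hypothetical input with one space-separated and one newline-separated URL.
$str = "Visit www.example.com or\nhttp://example.org/page today";

// Assumed goal: wrap each URL in an anchor tag. $0 is the whole match.
$linked = preg_replace('~(?:www|http://)\S+~', '<a href="$0">$0</a>', $str);

echo $linked, "\n";
```

Using \S+ means the match for the second URL starts cleanly after the newline, which [^ ]+ would not guarantee.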
Your main problem seems to be that you are putting everything in parentheses, so it doesn't know what "\1" is. Also, you need to escape the "/". So try this:
preg_replace('/((?:www|http:\/\/)[^ ]+)/', '\1', $str);
Edit: It actually seems the parentheses were not an issue, I misread it. The escaping was still an issue as others also pointed out. Either solution should work.
preg_replace('!((?:www|http://)[^ ]+)!', '\1', $str);
When you use / as your pattern delimiter, having / inside your pattern will not work out well. I solved this by using ! as the pattern delimiter, but you could escape your slashes with backslashes instead.
I also didn't see any reason why you were doing two paren captures, so I removed one of them.
Part of the trouble in your situation is that you're running with warnings suppressed; if you had error_reporting(E_ALL) on, you'd have seen the messages PHP is trying to generate about your delimiter problem in your regex.
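To see the delimiter problem surface, the sketch below collects the warnings PHP raises for the original pattern. The sample string and the error-handler approach are assumptions for illustration; simply enabling error_reporting(E_ALL) with display_errors on would print the same message.

```php
<?php
error_reporting(E_ALL);

// Hypothetical sample input.
$str = 'see http://example.com now';

// Collect warnings in an array instead of printing them.
$messages = [];
set_error_handler(function (int $severity, string $message) use (&$messages): bool {
    $messages[] = $message;
    return true;
});

// "/" is the delimiter, so the pattern ends at the first "/" of "http://"
// and the remainder is parsed as modifiers, which PHP rejects.
$result = preg_replace('/((www|http://)[^ ]+)/', '\1', $str);

restore_error_handler();

var_dump($result);    // null on failure, which can look like an empty string
print_r($messages);   // the warning text describing the delimiter problem
```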
If a string contains multiple URLs separated by line breaks instead of spaces, you have to use \S+:
preg_replace('/((www|http:\/\/)\S+)/', '$1', $val);