PHP: couldn't replace using preg_replace()

I'm just trying to replace the object tag in a given string:
$matches = preg_replace("/<object(.+?)</object>/","replacing string",$str);
but it produces this error:
Warning: preg_replace() [function.preg-replace]: Unknown modifier 'o'
What went wrong?

The slash in </object> has to be quoted: <\/object>, or else it is interpreted as the end of your regex since you're delimiting it with slashes. The whole line should read:
$matches = preg_replace("/<object(.+?)<\\/object>/","replacing string",$str);

In your regex the forward slash is the regex delimiter. As you are dealing with tags, better use another delimiter (instead of escaping it with a backslash):
$matches = preg_replace("#<object(.+?)</object>#", "replacing string", $str);
There are other delimiters, too. You can use any non-alphanumeric, non-backslash, non-whitespace character. However, certain delimiters are best avoided: |, +, * and parentheses/brackets, for example, because they are used so often inside regular expressions that they would just confuse people and make them hate you.
Btw, using regular expressions for HTML is a Bad Thing!
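To make the two fixes above concrete, here is a minimal sketch that runs both variants against the same input. The sample $str is made up for illustration; it is not from the question.

```php
<?php
// Sample input: an assumption for illustration only.
$str = 'before <object data="movie.swf"><param name="x"></object> after';

// Option 1: keep "/" as the delimiter and escape the "/" inside the pattern.
$escaped = preg_replace('/<object(.+?)<\/object>/', 'replacing string', $str);

// Option 2: switch to "#" so the "/" in </object> needs no escaping.
$hashed = preg_replace('#<object(.+?)</object>#', 'replacing string', $str);

var_dump($escaped === $hashed); // both variants give the same result
echo $hashed, PHP_EOL;
```

Single-quoted strings are used here so the backslash in \/ does not have to be doubled.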

The first character is taken as delimiter char to separate the expression from the flags. Thus this:
"/[a-z]+/i"
... is internally split into this:
- Pattern: [a-z]+
- Flags: i
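A quick way to see that split in action is to run the same pattern with and without the trailing flag. This is only a sketch; the subject string 'ABC' is an arbitrary example.

```php
<?php
// With the "i" flag after the closing delimiter, [a-z]+ also matches uppercase letters.
$withFlag = preg_match('/[a-z]+/i', 'ABC');

// Without the flag, the same pattern finds nothing in an all-uppercase string.
$withoutFlag = preg_match('/[a-z]+/', 'ABC');

var_dump($withFlag);    // int(1)
var_dump($withoutFlag); // int(0)
```

Anything that follows the second delimiter is read as flags, which is why a stray delimiter inside the pattern leads to an "Unknown modifier" warning.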
So this:
"/<object(.+?)</object>/"
... is not a valid regexp. Try this:
"#<object(.+?)</object>#"

Related

Explode and/or regex text to HTML link in PHP

I have a database of texts that contains this kind of syntax in the middle of English sentences that I need to turn into HTML links using PHP
"text1(text1)":http://www.example.com/mypage
Notes:
text1 is always identical to the text in parenthesis
The whole string always has the quotation marks, parentheses, and colon, so the syntax is the same for each.
Sometimes there is a space at the end of the string, but other times there is a question mark or comma or other punctuation mark.
I need to turn these into basic links, like
<a href="http://www.example.com/mypage">text1</a>
How do I do this? Do I need explode or regex or both?
"(.*?)\(\1\)":(.*\/[a-zA-Z0-9]+)(?=\?|\,|\.|$)
You can use this.
See Demo.
http://regex101.com/r/zF6xM2/2
You can use this replacement:
$pattern = '~"([^("]+)\(\1\)":(http://\S+)(?=[\s\pP]|\z)~';
$replacement = '<a href="\2">\1</a>';
$result = preg_replace($pattern, $replacement, $text);
pattern details:
([^("]+) captures text1 in group 1. Using a negated character class (one that excludes the double quote and the opening parenthesis) has several advantages:
it allows a greedy quantifier, which is faster
since the class excludes the opening parenthesis and is immediately followed by a parenthesis in the pattern, quoted content elsewhere in the text that contains no parenthesis is skipped without backtracking: the regex engine does not go back to test other possibilities. (This is because the PCRE regex engine automatically converts [^a]+a into [^a]++a before processing the string.)
\S+ matches one or more non-whitespace characters
(?=[\s\pP]|\z) is a lookahead assertion that checks that the URL is followed by a whitespace character, a punctuation character (\pP), or the end of the string.
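Here is a small sketch of that pattern at work. The sample sentence and the exact anchor markup in $replacement are assumptions about what the asker wants, not something stated in the question.

```php
<?php
// Sample sentence: made up for illustration.
$text = 'Please read "text1(text1)":http://www.example.com/mypage before you reply.';

$pattern = '~"([^("]+)\(\1\)":(http://\S+)(?=[\s\pP]|\z)~';

// Assumption: the wanted output is a plain anchor element.
// $1 is the link text, $2 is the URL.
$replacement = '<a href="$2">$1</a>';

$result = preg_replace($pattern, $replacement, $text);
echo $result, PHP_EOL;
```

One thing to watch: because \S+ is greedy, a comma or question mark sitting directly before a space ends up inside the captured URL, so the URL may need trimming afterwards (rtrim() with a list of punctuation characters would be one option).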
You can use this regex:
"(.*?)\(.*?:(.*)
An appropriate Regular Expression could be:
$str = '"text1(text1)":http://www.example.com/mypage';
preg_match('#^"([^\(]+)' .
'\(([^\)]+)\)[^"]*":(.+)#', $str, $m);
print '<a href="'.$m[3].'">'.$m[2].'</a>' . PHP_EOL;

Regex to detect the colon and sides of it?

First see my string please:
$a = "[ child : parent ]";
How can I detect that the pattern is:
[(optional space)word or character(optional space) : (optional space)word or character(optional space)]
You can catch this as follows in PHP:
Your regular expression is /\[ *\w+ *: *\w+ *]/
You would write code that would look like this to see if it matched.
if (preg_match('/regex/', $string)) {
// do things
}
Explanation of the Regular Expression
There is a backslash (\) before the open bracket because
[ has special meaning in regular expressions. The backslash
prevents its special meaning from being used.
The asterisk (*) matches 0 or more of the previous character expression. In this
case, it matches 0 or more spaces. If you instead used the
expression \s*, it would match 0 or more white-space characters
(space, tab, line break). Finally, if you wanted it to match 0 or 1
of the previous character, you would use ? instead of *.
The plus (+) matches 1 or more of the previous character expression. The \w character expression matches a letter, digit, or underscore. If you don't want underscores to match, you should instead use a character class. For example, you could use [A-Za-z0-9].
You can find more information on regular expressions at http://www.regular-expressions.info and http://www.regular-expressions.info/php.html
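Putting the explanation above together, a rough sketch could look like this. The sample strings are invented for the test, and $strict is just the [A-Za-z0-9] variant mentioned above.

```php
<?php
$pattern = '/\[ *\w+ *: *\w+ *]/';

// Sample inputs: assumptions for illustration.
$samples = ['[ child : parent ]', '[child:parent]', '[ child parent ]', '[a_b : c1]'];

$results = [];
foreach ($samples as $s) {
    // preg_match() returns 1 on a match and 0 otherwise.
    $results[$s] = preg_match($pattern, $s);
    printf("%-20s %s\n", $s, $results[$s] ? 'match' : 'no match');
}

// Variant that rejects underscores by using a character class instead of \w.
$strict = '/\[ *[A-Za-z0-9]+ *: *[A-Za-z0-9]+ *]/';
$strictResult = preg_match($strict, '[a_b : c1]');
var_dump($strictResult); // int(0): the underscore is not allowed here
```

Only the string without a colon fails the first pattern; the underscore case passes with \w but fails with the stricter class.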
From your sample text I'd say you mean a human word (letters only) and not a \w regex word:
preg_match('/\[ ?([a-z]+) ?: ?([a-z]+) ?\]/i', $a, $matches);
Explained demo: http://regex101.com/r/hB2oV9
$matches will save both values, test with var_dump($matches);
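As a sketch of what ends up in $matches, using the string from the question (the variable names $left and $right are made up here):

```php
<?php
$a = "[ child : parent ]";

$left = $right = null;
if (preg_match('/\[ ?([a-z]+) ?: ?([a-z]+) ?\]/i', $a, $matches)) {
    // $matches[0] is the whole match, [1] and [2] are the two captured words.
    $left  = $matches[1];
    $right = $matches[2];
    echo "left: $left, right: $right", PHP_EOL;
}
```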
I'm not sure on the php-specific version of regex, but this should work:
\[ ?\w+ ? : ?\w+ ?\]
Here is a site that I've used in the past to find regular expressions for my needed patterns.
use this regex \[\s*\w+\s*:\s*\w+\s*\]
I would probably do it like this
preg_match('/^\[\s?\w+\s+:\s+\w+\s?\]$/', $string)

How to use regex to match this html tag?

I can't seem to figure out what I'm doing wrong...
I'm trying to find matches of
<cite>stuffhere</cite>
Is this right?
preg_match_all('<cite>(.*?)</cite>/ms', $str, $matches)
Add the missing opening delimiter and escape the / inside the pattern:
preg_match_all('/<cite>(.*?)<\/cite>/ms', $str, $matches);
Your confusion is not your fault; PHP is notoriously weird in this area.
In most programming languages, you create a regex object one of two ways. If the language supports regexes as a first-class language element, you can use a regex literal:
var re = /<b>"\w+"<\/b>/; // JavaScript
Here, the forward-slash (/) is the regex delimiter; if you want to match a literal /, you have to escape it with a backslash: \/.
In other languages, you have to write the regex in the form of a string literal, which you then pass to a constructor or a factory method:
Pattern p = Pattern.compile("<b>\"\\w+\"</b>"); // Java
The forward-slash doesn't need to be escaped, but both the double-quote (") and backslash (\) do, because of their special meanings in string literals.
But PHP is unique: it doesn't support regex literals, so you have to write the regex as a string, but the string has to look like a regex literal! That is, it has to have string delimiters (quotes) and regex delimiters. For example:
$re = '/<b>"\w+"<\/b>/';
It isn't all bad; as you can see, you can use PHP's single-quoted strings instead of double-quoted, so you don't have to escape all backslashes and double-quotes. You can also choose different regex delimiters, so you don't have to escape (for example) literal forward-slashes in your regex:
$re = '~<cite>(.*?)</cite>~s';
The modifiers ('s' for single-line, 'i' for ignore-case, etc.) go after the trailing regex delimiter, as in Perl or JavaScript. Almost any ASCII punctuation character can be used as a regex delimiter; ~ and # are popular choices.
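For what it's worth, here is a short sketch of the ~ delimiter version in use. The sample $str is an assumption; the second cite spans two lines to show what the s modifier buys you.

```php
<?php
// Sample input: made up for illustration.
$str = "<cite>first</cite>\n<cite>second\nline</cite>";

// "~" is the regex delimiter, so the "/" in </cite> needs no escaping.
// The "s" modifier lets "." match newlines as well.
preg_match_all('~<cite>(.*?)</cite>~s', $str, $matches);

// $matches[1] holds the text captured by the first group for every match.
print_r($matches[1]);
```

Without the s modifier the second cite would not be found, because . would stop at the line break.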
You should use an HTML Parser to parse html, or you will end up with unexpected errors. However, this is what your regex should be:
'#<cite>(.*?)</cite>#s'
Try this:
preg_match_all('/<cite>(.*?)<\/cite>/ms', $str, $matches);

What's wrong with this php/regex query?

preg_replace("/(/s|^)(php|ajax|c\+\+|javascript|c#)(/s|$)/i", '$1#$2$3', $somevar);
It's meant to turn, for example, PHP into #PHP.
Warning: preg_replace(): Unknown modifier '|'
It's because you are using the forward slash (/) as your delimiter. When the regex engine gets to /s (the slash is the 3rd character) it thinks the regex is over and treats everything after it as modifiers. s is a valid modifier, but no such modifier as | exists, hence the error.
Next time, you can either:
Change your delimiters to something you won't use in your regex, e.g.:
preg_replace("!(/s|^)(php|ajax|c\+\+|javascript|c#)(/s|$)!i", '$1#$2$3', $somevar);
Or escape those characters with a backslash, e.g. "/something\/else/".
I also suspect you didn't intend to use /s, but rather the escape sequence \s, which matches whitespace characters.
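If \s really is what was meant, a corrected version might look like the sketch below. The sample sentence is made up, and ~ is just one possible delimiter.

```php
<?php
// Sample input: an assumption for illustration.
$somevar = 'I like PHP and c++';

// "~" as delimiter, and \s (whitespace) instead of the literal "/s".
$pattern = '~(\s|^)(php|ajax|c\+\+|javascript|c#)(\s|$)~i';

// $1 and $3 put back the surrounding whitespace, $2 is the keyword itself.
$tagged = preg_replace($pattern, '$1#$2$3', $somevar);

echo $tagged, PHP_EOL; // I like #PHP and #c++
```

The pattern is in a single-quoted string so that \s and \+ reach the regex engine unchanged.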
The first character in the regular expression is the delimiter. If you need to use this inside your regular expression then you need to escape it:
"/(\/s|^)...
^
Or alternatively, choose another delimiter that isn't used anywhere in your regular expression so that you don't need to escape:
"~(/s|^)...(/s|$)~i"
I prefer to do the latter as it makes the regular expression more readable.
(Although as NullUserException points out, the actual error is that you should have used a backslash instead of a slash).

Why does this PHP regex give me error?

Need Some Help With Regex:
I want to replace
[url=http://youtube.com]YouTube.com[/url]
with
YouTube.com
the regex
preg_replace("/[url=(.*?)](.*?)[/url]/is", '$2', $text);
why does this give me:
Warning: preg_replace() [function.preg-replace]: Unknown modifier 'r' in C:\Programa\wamp\www\func.php on line 18
You should escape special characters in your regular expression:
preg_replace('/\[url=(.*?)](.*?)\[\/url]/is', '$2', $text);
I have escaped the [ characters (they specify the start of a character class) and the / character (it specifies the boundaries of the regular expression.)
Alternatively (for the / character) you can use another boundary character:
preg_replace('#\[url=(.*?)](.*?)\[/url]#is', '$2', $text);
You still have to escape the [ characters, though.
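A quick sketch of the # delimiter version, plus preg_quote() as a way to let PHP do the escaping for a literal piece of text. The sample $text is an assumption for illustration.

```php
<?php
// Sample input: made up for illustration.
$text = 'Watch it on [url=http://youtube.com]YouTube.com[/url] tonight.';

// "#" as delimiter: the "/" in [/url] is fine as-is, but "[" still needs a backslash.
$plain = preg_replace('#\[url=(.*?)](.*?)\[/url]#is', '$2', $text);
echo $plain, PHP_EOL;

// preg_quote() escapes regex special characters in a literal string;
// the second argument names the delimiter that should be escaped too.
$closing = preg_quote('[/url]', '/');
echo $closing, PHP_EOL; // \[\/url\]
```

preg_quote() is handy when part of the pattern comes from a variable; for a fixed pattern like this one, escaping by hand is just as good.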
PHP is interpreting the '/' in /url as being the end of the regex and the start of the regex options. Insert a '\' before it to make it a literal '/'.
You need to escape the '['s in the same way (otherwise they will be interpreted as introducing a character class).
preg_replace("/\[url=(.*?)](.*?)\[\/url]/is", '$2', $text);
Both the slash (because it is the delimiter here) and the square brackets have a special meaning in the pattern, so you will need to escape them:
\/ \[ \]
The 2nd '/' in a regex string ends the regex. You need to escape it. Also, preg_replace will interpret the '[url=(.*?)]' as a character class, so you need to escape those as well.
preg_replace('/\[url=(.*?)\](.*?)\[\/url\]/is', '$2', $text);
You seem to be just starting out with regular expressions. If that is the case - or maybe even if it isn't - you will find the Regex Coach to be a very helpful tool. It provides a sandbox for us to test our pattern matches and our replace strings too. If you had been using that it would have highlighted the need to escape the special characters.
