I'm finishing BBCode support for my CMS. I'm using regex to convert the BBCode to HTML and vice versa. However, I have a small security problem. For example, I have this regular expression:
~\[img=(.*?\.(?:jpg|jpeg|png))\|(.*?)\[/img\]~s
to match, for example, this:
[img=somewhere.com\image\08-09-2014\cat.png|This is a cat[/img]
But it also matches strings like this, which I really don't want:
[img=somewhere.com" onclick="someBadJSCode()" src="\image\08-09-2014\cat.png|This is a cat[/img]
I thought that this change to the regex would help:
~\[img=([^"]+.*?\.(?:jpg|jpeg|png))\|(.*?)\[/img\]~s
But it didn't, and I don't know why. Any ideas?
Thanks to Casimir et Hippolyte, I got a regex that works for the URL, alt and class parts without the JS injection problem above.
Converts this:
[img=somewhere.com/img.jpg|left]cat[/img]
to this:
<img src="somewhere.com/img.jpg" class="left" alt="cat" >
Pattern for preg_replace (first parameter):
~\[img=([^"']+\.(?:jpg|jpeg|png))\|([^"']+)\]([^"']+)\[/img\]~s
Replacement for preg_replace (second parameter):
'<img src="$1" class="'.$this->GetElementClass('img').' $2" alt="$3" >'
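As a rough illustration, here is the same pattern ported to JavaScript so it can be tried in a console. The function name bbImgToHtml is made up for this sketch, and the GetElementClass('img') part of the replacement is left out. It assumes the [img=url|class]alt[/img] form shown above.

```javascript
// Hypothetical helper: JavaScript port of the [img] pattern above.
// Quotes are excluded from every captured part, so a captured value
// cannot close the attribute it is placed in.
const IMG_TAG = /\[img=([^"']+\.(?:jpg|jpeg|png))\|([^"']+)\]([^"']+)\[\/img\]/g;

function bbImgToHtml(text) {
  return text.replace(IMG_TAG, '<img src="$1" class="$2" alt="$3" >');
}

// A well-formed tag is converted.
console.log(bbImgToHtml('[img=somewhere.com/img.jpg|left]cat[/img]'));

// A tag with quotes in the URL does not match and is left as plain text.
console.log(bbImgToHtml('[img=somewhere.com" onclick="bad()" src="/cat.png|left]cat[/img]'));
```

Note that this only blocks quote-based injection; it is not a full sanitizer. Because [^"']+ is greedy and still accepts | and ], two tags in the same string can be captured as one, so excluding those characters as well would probably be a sensible tightening.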
Hey, I have a hard time understanding regex, but I think it's what suits my needs best. I currently have this line:
$str = preg_replace('/https:\/\/clips.twitch.tv\/(.*?)/', '<iframe src="https://clips.twitch.tv/embed?autoplay=false&clip=$1&tt_content=embed&tt_medium=clips_embed" width="640" height="360" frameborder="0" scrolling="no" allowfullscreen="true"></iframe>', $text);
What I want is to replace, for example, this link:
https://clips.twitch.tv/GleamingHelpfulOxNotLikeThis
with the HTML in the replacement part. But the last part (e.g. GleamingHelpfulOxNotLikeThis) ends up after the iframe instead of after clip=, where I put $1 and thought it would work.
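The problem is the lazy quantifier. .*? matches as few characters as possible, and since nothing follows it in the pattern, it is satisfied by matching nothing at all. The match therefore ends right after the domain name, $1 is empty, and the rest of the URL is left in the text after the iframe. Dropping the ? is not a fix either: a greedy .* would swallow the entire rest of the line.
If the last part of the URL is just an alphanumeric string, use \w+ instead of .*?.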
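The difference is easy to see side by side. This is a small JavaScript sketch rather than the asker's PHP (the embed helper and the shortened iframe markup are made up for the example), but the quantifiers behave the same way in preg_replace.

```javascript
// Hypothetical demo: the lazy group captures nothing, the \w+ group
// captures the clip name.
const LAZY = /https:\/\/clips\.twitch\.tv\/(.*?)/;
const WORD = /https:\/\/clips\.twitch\.tv\/(\w+)/;

function embed(text, pattern) {
  return text.replace(pattern, '<iframe src="https://clips.twitch.tv/embed?clip=$1"></iframe>');
}

const sample = 'Watch https://clips.twitch.tv/GleamingHelpfulOxNotLikeThis now';

// $1 is empty here, so the clip name is left behind after the iframe.
console.log(embed(sample, LAZY));

// The clip name ends up after clip= as intended.
console.log(embed(sample, WORD));
```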
If that's not how these URLs are constructed, you need to find some other way of telling where the URL ends in the text. This is not trivial, and often requires heuristics that misbehave. URLs can use most characters without requiring any escaping, including most punctuation, but when people type them into free-flowing text they'll often put punctuation after them that is not meant to be part of the URL. For instance, someone might write:
Is the URL http://www.foo.com/foobarbaz?
They intend the ? to mark it as a question, so that the URL ends at foobarbaz, but ? is a valid (and common) character in a URL, so there's no reason why http://www.foo.com/foobarbaz? couldn't be the intended URL. As humans we generally have no problem figuring this out from context, but a simple regular expression would have little hope. I've seen many automatic URL recognizers mess up like this.
So you should be prepared for the possibility that no matter what you use as the regexp, it might not parse all URLs correctly unless there are restrictions on the URLs that can be used.
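To make the trade-off concrete, here is one such heuristic sketched in JavaScript. The function name findUrls and the set of punctuation it strips are assumptions chosen for illustration, not a recommended recognizer.

```javascript
// Hypothetical heuristic: take every run of non-space characters after
// the scheme, then drop punctuation that commonly follows a URL in prose.
function findUrls(text) {
  const candidates = text.match(/https?:\/\/\S+/g) || [];
  return candidates.map((url) => url.replace(/[.,;:!?)]+$/, ''));
}

// The trailing ? is treated as sentence punctuation and removed.
console.log(findUrls('Is the URL http://www.foo.com/foobarbaz?'));

// The same rule also removes a ? that the writer may have meant as part
// of the URL, which is exactly the kind of misbehavior described above.
console.log(findUrls('Open http://www.foo.com/search? and type a query'));
```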
I have long struggled with languages such as PHP, JavaScript, HTML, etc., but my most troubling weakness is still regex.
Previously I was comfortable without understanding it, but now I have reached the point where I have to use a regex function.
I want to replace an HTML tag created by a rich text editor (RTE): when I type [code] in the box and hit Enter, the RTE turns it into <div>[code]</div>
What I need is to change <div>[code]</div> into an opening HTML tag, <div class="code">
I have tried using PHP's str_replace() function as below:
$content = str_replace(
'<div>[code]</div>',
'<div class="code">',
$_POST['content']
);
but it doesn't work. I think maybe I need to use the preg_replace() function, but I can't work out how.
Can someone show me sample code to do that?
In preg_replace(), you need to escape the [ and ] characters so that they match literal brackets.
Regex:
<(div)>\[([^\]]*)\]<\/\1>
Replacement string:
<\1 class="\2">
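For a quick check outside PHP, the same regex can be run in JavaScript. This is only a sketch with a made-up function name; note that JavaScript writes group references in the replacement as $1 and $2 rather than \1 and \2.

```javascript
// Hypothetical helper: turns <div>[name]</div> into <div class="name">.
// \[ and \] match literal brackets; \1 in the pattern re-matches "div".
function bracketDivToClass(html) {
  return html.replace(/<(div)>\[([^\]]*)\]<\/\1>/g, '<$1 class="$2">');
}

console.log(bracketDivToClass('<div>[code]</div>'));
```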
I have seen lots of similar queries to this, but am struggling to get them to work in my application because I still don't fully understand regular expressions!
I'm using the old FCKEditor WYSIWYG to upload an image, but need to store the src as the full URL rather than the relative path.
At the time I need to do the replace, I've already replaced quotes with \", so the pattern I'm looking for needs to be:
src=\"/userfiles/
This needs to be replaced with
src=\"http://mydomain.com/userfiles/
Thanks for your suggestions!!
You can actually do this with str_replace, and it'd be simpler, but here's a preg version:
$html = preg_replace('!src="/userfiles/!', 'src="http://mydomain.com/userfiles/', $html);
Here's the str_replace version:
$html = str_replace('src="/userfiles/', 'src="http://mydomain.com/userfiles/', $html);
If there may be spaces here and there, you'll need the preg version, with
\s* added in the places that can have spaces.
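A whitespace-tolerant version might look like the following JavaScript sketch (the function name is made up, and it assumes plain double quotes around the attribute value). The same pattern should carry over to preg_replace.

```javascript
// Hypothetical helper: \s* allows optional spaces around the = sign.
function absolutizeUserfiles(html) {
  return html.replace(/src\s*=\s*"\/userfiles\//g, 'src="http://mydomain.com/userfiles/');
}

console.log(absolutizeUserfiles('<img src = "/userfiles/cat.png">'));
```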
I'm working in WordPress and need to remove images and empty paragraphs. So far, I've found out how to remove images without a problem, but I then need to remove the empty paragraph tags left behind. I'm using PHP's preg_replace to handle the regex.
So, as an example, I have the string:
<p style="text-align:center;"><img src="http://www.blah.com/image.jpg" alt="Blah Image" /></p><p>Some text</p>
I run this regex on it:
/<img.*?(>)/
And I end up with this string:
<p style="text-align:center;"></p><p>Some text</p>
I then need to be able to remove the empty paragraph. I tried this, but it removes all paragraphs and the contents of the paragraphs:
/<p[^>]*><\/p[^>]*>/
Any help or suggestions are greatly appreciated!
The correct regex is no regex. Use an HTML/DOM Parser instead. They're simple to use. Regex is for regular languages (which HTML is not).
/<p[^>]*><\/p[^>]*>/ (the regex you gave) should work fine. If it's giving you trouble, you could try doubling the backslash before the /, like this: /<p[^>]*><\\/p[^>]*>/
PHP is funny about quoting and escape characters. For example "\n" is not equal to '\n'. The first is a line break, the second is a literal backslash followed by an 'n'. The PHP manual entry on string literals is probably worth a quick look.
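To check the two-step idea in isolation, here is a JavaScript sketch of it (the function name is made up, and a DOM parser is still likely to be the more robust route for real-world HTML). The paragraph pattern only matches a <p> whose content is empty or whitespace, so paragraphs with text are kept.

```javascript
// Hypothetical helper: strip <img> tags first, then remove paragraphs
// that are left with nothing but whitespace inside them.
function stripImagesAndEmptyParagraphs(html) {
  const withoutImages = html.replace(/<img\b[^>]*>/gi, '');
  return withoutImages.replace(/<p\b[^>]*>\s*<\/p>/gi, '');
}

const post = '<p style="text-align:center;"><img src="http://www.blah.com/image.jpg" alt="Blah Image" /></p><p>Some text</p>';
console.log(stripImagesAndEmptyParagraphs(post));
```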
I'm having a lot of difficulty matching an image url with spaces.
I need to make this
http://site.com/site.com/files/images/img 2 (5).jpg
into a div like this:
.replace(/(http:\/\/([^\s]+\.(jpg|png|gif)))/ig, "<div style=\"background: url($1)\"></div>")
Here's the thread about that:
regex matching image url with spaces
Now I've decided to first make the spaces into entities so that the above regex will work.
But I'm really having a lot of difficulty doing so.
Something like this:
.replace(/http:\/\/(.*)\/([^\<\>?:;]*?) ([^\<\>?:;]*)(\.(jpe?g|png|gif))/ig, "http://$1/$2%20$3$4")
Replaces one space, but all the rest are still spaces.
I need to write a regex that says, make all spaces between http:// and an image extension (png|jpg|gif) into %20.
At this point, frankly not sure if it's even possible. Any help is appreciated, thanks.
Trying Paolo's escape:
.escape(/http:\/\/(.*)\/([^\<\>?:;]*?) ([^\<\>?:;]*)(\.(jpe?g|png|gif))/)
Another way I can do this is to escape serverside in PHP, and in PHP I can directly mess with the file name without having to match it in regex.
But as far as I know, something like htmlentities does not apply to spaces. Any hints in this direction would be great as well.
Try the escape function:
>>> escape("test you");
test%20you
If you want to control the replacement character but don't want to use a regular expression, a simple...
$destName = str_replace(' ', '-', $sourceName);
..would probably be the more efficient solution.
Let's say you have the string variable urlWithSpaces, which is set to a URL that contains spaces.
Simply do:
urlWithoutSpaces = escape(urlWithSpaces);
What about urlencode() - that may do what you want.
On the JS side you should be using encodeURI(), with escape() only as a fallback. The reason to use encodeURI() is that it encodes using UTF-8, while escape() uses ISO Latin. The same problem applies to decoding.
var encode = typeof encodeURI === 'function' ? encodeURI : escape;
alert(encode('image name.png'));
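For the original goal of encoding only the spaces inside image URLs that sit in a larger piece of text, one option is to pass a function to replace() so every space within each match is handled, not just one. This is a sketch under assumptions: the function name is made up, and the character class is borrowed from the question's own attempt.

```javascript
// Hypothetical helper: find each image URL (spaces allowed inside it),
// then encode every space within that match.
const IMAGE_URL = /http:\/\/[^<>?:;]*?\.(?:jpe?g|png|gif)/gi;

function encodeSpacesInImageUrls(text) {
  return text.replace(IMAGE_URL, (url) => url.replace(/ /g, '%20'));
}

console.log(encodeSpacesInImageUrls('See http://site.com/site.com/files/images/img 2 (5).jpg'));
```

Because spaces are allowed inside the match, a non-image http:// link earlier in the text could be joined with the words that follow it up to the next image extension, so this is only safe when the input is known to contain image URLs.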