Regex for editing files - php

I have to replace the following in my PHP code:
assert('is_array($myArray)');
assert('my_function_call($myVariable)');
to make it read like:
assert(is_array($myArray));
assert(my_function_call($myVariable));
The problem is that this occurs many times across my code files, and I would have to open each one and make the change by hand.
I use NetBeans, which has a find-and-replace feature that supports regex. What regex should I use for this?

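Use these regex search terms in find and replace:
assert\(\'is_array\(\$myArray\)\'\);
assert\(\'my_function_call\(\$myVariable\)\'\);
Escaping the regex metacharacters (including $, which would otherwise act as an end-of-line anchor) makes each term match as literal text.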
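Escaping each call by hand does not scale to many different assert strings. A more general approach is a single pattern with a capture group: search for assert\('([^']*)'\); and replace with assert($1); (editors with Java-style regex, such as NetBeans, typically write the backreference as $1). The same idea as a minimal PHP sketch, assuming PHP 7 or later, one assert per statement, and no single quotes inside the asserted expression. The sample input is just the two lines from the question:

```php
<?php
// Sample input: the two lines from the question.
$source = <<<'TXT'
assert('is_array($myArray)');
assert('my_function_call($myVariable)');
TXT;

// Capture everything between the single quotes, then re-emit it unquoted.
$pattern = "/assert\\('([^']*)'\\);/";
$result = preg_replace($pattern, 'assert($1);', $source);

echo $result, "\n";
```

Expressions that themselves contain single quotes would need a more careful pattern, so review the replacements before saving.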

Related

Easy way to find file with ?> at the end?

I am getting this error:
Getting Warning - Cannot modify header information
I'm 99% sure it's because of a file ending with ?> and then some white space after that.
My problem is, I have looked at 15 possible files, but there are hundreds more to check. Is there an easy Linux command to find the files ending with ?> and some whitespace after it? Or is there perhaps another way you guys solve this?
You are facing an end-of-file (EOF) problem.
The whitespace at the end of the file is breaking your program; you need to find every file that ends with ?> followed by whitespace.
You can use a regex with a project-wide find tool. The regex would be: \?>\s+\z
The \z anchor matches only at the very end of the file, so this finds ?> followed by whitespace only at EOF.
I recommend Sublime Text 3 because it can apply a regex in search and replace; there are Sublime Text find and replace examples if you want to learn how.
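If you would rather script the search than use an editor, the same end-of-file check can be sketched in PHP. This is a minimal sketch assuming PHP 7 or later; the function names are made up for illustration and are not part of any library:

```php
<?php
// Returns true when the contents end with a closing PHP tag followed by whitespace.
function hasTrailingWhitespaceAfterClosingTag(string $contents): bool
{
    return preg_match('/\?>\s+\z/', $contents) === 1;
}

// Recursively collects the .php files under $root that fail the check above.
function findSuspectFiles(string $root): array
{
    $suspects = [];
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $file) {
        if ($file->isFile() && $file->getExtension() === 'php') {
            $contents = file_get_contents($file->getPathname());
            if ($contents !== false && hasTrailingWhitespaceAfterClosingTag($contents)) {
                $suspects[] = $file->getPathname();
            }
        }
    }
    return $suspects;
}
```

Calling findSuspectFiles() with your project root returns the paths to inspect. Note that PHP itself swallows a single line break directly after the closing tag, so the files that actually produce output are the ones with more than that; this check flags them all and leaves the decision to you.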

Writing A Regular Expression For Search And Replace Function In Dreamweaver

I am trying to update hundreds of lines of comments in my PHP files. My editor allows me to use regular expressions to perform a search and replace. However, I don't know enough about regular expressions to write one. Please refer to the example below.
Dump($Data1, 'Library_reports.php - Get_Filtered_InventoryReport() - $Data1');
Dump($Data2, 'Library_reports.php - Get_Filtered2InventoryReport() - $Data2');
Dump($Data3, 'Library_reports.php - GetFilteredInventoryReport() - $Data3');
to be replaced with
Dump($Data1, __METHOD__.' - $Data1');
Dump($Data2, __METHOD__.' - $Data2');
Dump($Data3, __METHOD__.' - $Data3');
So basically, I want to search for
'Some_Alphanumeric_string()
and then replace it with a
__METHOD__.'
Give this a try: [A-Za-z0-9_]() There's nothing complicated here.
Edit:
[A-Za-z0-9_]+\(\)
StackOverflow eats my backslashes :)
Search with:
([a-zA-Z0-9]+)\(\)
Replace with:
(intentionally left blank)
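For reference, the replacement the question describes has to consume the opening quote and the file name as well as the function name. In an editor that would be a search for '[^']*?\b[A-Za-z0-9_]+\(\) with __METHOD__.' as the replacement. A minimal PHP sketch of the same substitution, assuming each Dump call sits on one line and the quoted text contains no other single quote before the function name:

```php
<?php
// Sample lines taken from the question.
$source = <<<'TXT'
Dump($Data1, 'Library_reports.php - Get_Filtered_InventoryReport() - $Data1');
Dump($Data2, 'Library_reports.php - Get_Filtered2InventoryReport() - $Data2');
TXT;

// Match the opening quote, any quote-free text on the same line,
// then an identifier (underscores allowed) followed by "()".
$pattern = '/\'[^\'\n]*?\b[A-Za-z0-9_]+\(\)/';

$result = preg_replace($pattern, "__METHOD__.'", $source);
echo $result, "\n";
```

The underscore is part of the character class here because the function names in the question use it.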
Based on your description, this search regex will do the trick:
\b[a-z0-9_]+\b\(\)
...assuming you do case insensitive search. (It's an option in the Dreamweaver search/replace tool).
Otherwise:
\b[A-Za-z0-9_]+\b\(\)
Note: I've included the underscore in the character class based on your use of them in:
"Some_Alphanumeric_string()"

Trying to stop regex at a tag

I know there are other posts with a similar name but I've looked through them and they haven't helped me resolve this.
I'm trying to get my head around regex and preg_match. I am going through a body of text and each time a link exists I want it to be extracted. I'm currently using the following:
$reg_exUrl = "/(http|https|ftp|ftps)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?/";
which works fine until it finds one that has <br after it. Then I get the url plus the <br which means it doesn't work correctly. How can I have it so that it stops at the < without including it?
Also, I have been looking everywhere for a clear explanation of using regex and I'm still confused by it. Has anyone any good guides on it for future reference?
\S* is too broad. In particular, I could inject into your code with a URL like:
http://hax.hax/"><script>alert('HAAAAAAAX!');</script>
You should only allow characters that are allowed in URLs:
[-A-Za-z0-9._~:/?#[\]@!$&'()*+,;=]*
Some of these characters are only allowed in specific places (such as ?), so if you want better validation you will need more cleverness.
Instead of \S exclude the open tag char from the class:
$reg_exUrl = "/(http|https|ftp|ftps)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/[^<]*)?/";
You might even want to be more restrictive by only allowing characters valid in URLs:
$reg_exUrl = "/(http|https|ftp|ftps)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/[a-zA-Z_\-\.%\?&]*)?/";
(or some more characters)
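To see the effect of restricting the path characters, here is a minimal sketch using preg_match_all. It assumes the RFC 3986 unreserved and reserved characters plus "%" for the path, keeps the host part of the original pattern, and uses made-up sample text:

```php
<?php
// Path characters limited to the RFC 3986 unreserved and reserved sets plus "%",
// so a match ends before "<", a double quote, or whitespace.
$reg_exUrl = '/(?:https?|ftps?):\/\/[a-zA-Z0-9.-]+\.[a-zA-Z]{2,3}'
           . '(?:\/[-A-Za-z0-9._~:\/?#\[\]@!$&\'()*+,;=%]*)?/';

// Hypothetical sample text with a <br> tag directly after each link.
$text = 'Visit http://example.com/page?id=1<br />and https://example.org/a/b.html<br>done';

preg_match_all($reg_exUrl, $text, $matches);
print_r($matches[0]);
```

Because "<" is not in the class, each match stops right before the tag. Whatever you extract should still be escaped (for example with htmlspecialchars) before it is echoed back into HTML.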
You could use this one, as presented at:
http://regex101.com/r/zV1uI7
At the bottom of that page it is explained step by step.

Complex PHP/Perl regular expression for emoticons

I've checked google for help on this subject but all the answers keep overlooking a fatal flaw in the replacement method.
Essentially I have a set of emoticons such as :) LocK :eek and so on and need to replace them with image tags. The problem I'm having is identifying that a particular emoticon is not part of a word and is alone on a line. For example, on our site we allow 'quick links', which are not included in the smiley replacement and take the format go:forum, user:Username and so on. Pretty much all answers I've read don't allow for this possibility and as such break these links (i.e. go<img src="image.gif" />orum). I've tried experimenting with different ways to get around this, checking for the start of the line, spaces/newline characters and so on, but I've not had much luck.
Any help with this problem would be greatly appreciated. Oh also I'm using PHP 5 and the preg_% functions.
Thanks,
Rupert S.
Edit 18/04/2011:
Thanks for your help peeps :) I have created the final regex, which I thought I'd share with everyone. I had a couple of problems with special space characters, including newline, but it's now working like a dream. The final regex is:
(?<=\s|\A|\n|\r|\t|\v|\<br \/\>|\<br\>)(:S)(?=\s|\Z|$|\n|\r|\t|\v|\<br \/\>|\<br\>)
To complete the comment into an answer: The simplest workaround would be to assert that the emoticons are always surrounded by whitespace.
(?<=\s|^)[<:-}]+(?=\s|$)
The \s covers normal spaces and line breaks. Just to be safe ^ and $ cover occurrences at the start or very end of the text subject. The assertions themselves do not match, so can be ignored in the replacement string/callback.
If you want to do all the replace in one single preg_replace, try this:
preg_replace('/(?<=^|\s)(:\)|:eek)(?=$|\s)/e'
,"'$1'==':)'?'<img src=\"smile.gif\"/>':('$1'==':eek'?'<img src=\"eek.gif\"/>':'$1')"
,$input);
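Note that the /e modifier used above was deprecated in PHP 5.5 and removed in PHP 7. On newer versions the same single-pass replacement can be written with preg_replace_callback. A minimal sketch, assuming PHP 7 or later; the function name and the image file names are placeholders:

```php
<?php
// Hypothetical emoticon-to-image map; the file names are placeholders.
$emoticons = [
    ':)'   => '<img src="smile.gif" />',
    ':eek' => '<img src="eek.gif" />',
];

function replaceEmoticons(string $input, array $emoticons): string
{
    // Quote each emoticon so characters like ")" are matched literally.
    $alternatives = implode('|', array_map(
        function ($emoticon) { return preg_quote($emoticon, '/'); },
        array_keys($emoticons)
    ));
    // Same whitespace assertions as above: only match emoticons standing alone.
    $pattern = '/(?<=^|\s)(?:' . $alternatives . ')(?=$|\s)/';

    return preg_replace_callback($pattern, function ($match) use ($emoticons) {
        return $emoticons[$match[0]];
    }, $input);
}

echo replaceEmoticons('hello :) see go:forum and :eek', $emoticons), "\n";
```

Keeping the emoticons in an array means a new one only needs a new map entry, not a longer nested ternary.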

PHP regex for filtering out urls from specific domains for use in a vBulletin plug-in

I'm trying to put together a plug-in for vBulletin to filter out links to filesharing sites. But, as I'm sure you often hear, I'm a newb to php let alone regexes.
Basically, I'm trying to put together a regex and use a preg_replace to find any urls that are from these domains and replace the entire link with a message that they aren't allowed. I'd want it to find the link whether it's hyperlinked, posted as plain text, or enclosed in [CODE] bb tags.
As for regex, I would need it to find URLS with the following, I think:
1. Starts with http or an anchor tag. I believe that the URLs in [CODE] tags could be processed the same as the plain-text URLs, and it's fine if the replacement ends up inside the [CODE] tag afterward.
2. Could contain any number of any characters before the domain/word.
3. Has the domain somewhere in the middle.
4. Could contain any number of any characters after the domain.
5. Ends with one of a number of extensions such as (html|htm|rar|zip|001) or with a closing anchor tag.
I have a feeling that it's numbers 2 and 4 that are tripping me up (if not much more). I found a similar question on here and tried to pick apart the code a bit (even though I didn't really understand it). I now have this which I thought might work, but it doesn't:
<?php
$filterthese = array('domain1', 'domain2', 'domain3');
$replacement = 'LINKS HAVE BEEN FILTERED MESSAGE';
$regex = array('!^http+([a-z0-9-]+\.)*$filterthese+([a-z0-9-]+\.)*(html|htm|rar|zip|001)$!',
'!^<a+([a-z0-9-]+\.)*$filterthese+([a-z0-9-]+\.)*</a>$!');
$this->post['message'] = preg_replace($regex, $replacement, $this->post['message']);
?>
I have a feeling that I'm way off base here, and I admit that I don't fully understand php let alone regexes. I'm open to any suggestions on how to do this better, how to just make it work, or links to RTM (though I've read up a bit and I'm going to continue).
Thanks.
You can use parse_url on the URLs and inspect the associative array it returns. That lets you filter by domain or apply even finer-grained control.
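A minimal sketch of the parse_url approach, assuming PHP 7 or later and that each URL has already been extracted from the post. The helper name isBlockedUrl and the example domains are hypothetical:

```php
<?php
// Hypothetical helper: true when the URL's host is a blocked domain
// or a subdomain of one.
function isBlockedUrl(string $url, array $blockedDomains): bool
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return false; // no host component, or a malformed URL
    }
    $host = strtolower($host);
    foreach ($blockedDomains as $domain) {
        $domain = strtolower($domain);
        $suffix = '.' . $domain;
        // Exact match, or the host ends with ".<domain>".
        if ($host === $domain || substr($host, -strlen($suffix)) === $suffix) {
            return true;
        }
    }
    return false;
}

var_dump(isBlockedUrl('http://www.domain1.com/file.zip', ['domain1.com']));
```

Comparing against the parsed host avoids false positives on look-alike names such as notdomain1.com, which a loose substring test would catch.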
I think you can avoid the overhead of this by using the built-in filter_var function.
It has been available since PHP 5.2.0.
$good_url = filter_var( filter_var( $raw_url, FILTER_SANITIZE_URL), FILTER_VALIDATE_URL);
Hmm, my first guess: you put $filterthese directly inside a single-quoted string, and single quotes don't allow variable substitution. Also, $filterthese is an array, which should first be joined:
$filterthese = implode("|", $filterthese);
Maybe I'm way off, because I don't know anything about vBulletin plugins and their embedded magic, but those points seem worth a check to me.
Edit: OK, on re-checking your provided source, I think the regexp line should read like this:
$regex = '!(?#
possible "a" tag [start]: )(<a[^>]+href=["\']?)?(?#
offending link: )https?://(?#
possible subdomains: )([a-z0-9-]+\.)*(?#
domains to block: )('.implode("|", $filterthese).')(?#
possible path: )(/[^ "\'>]*)?(?#
possible "a" tag [end]: )(["\']?[^>]*>)?!';
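The same pattern can also be assembled piece by piece, which makes it easier to test outside vBulletin. This is a minimal sketch, not the plug-in's actual code. It assumes PHP 7 or later and that $filterthese holds complete host names (the values are placeholders). The function name buildFilterRegex is made up, and the final group is a PCRE conditional so that the rest of an <a> tag is consumed only when one was actually opened:

```php
<?php
// Assumption: entries are complete host names; these values are placeholders.
$filterthese = ['domain1.com', 'domain2.com', 'domain3.com'];
$replacement = 'LINKS HAVE BEEN FILTERED MESSAGE';

function buildFilterRegex(array $domains): string
{
    // preg_quote keeps the dots in the host names from acting as wildcards.
    $alternatives = implode('|', array_map(
        function ($domain) { return preg_quote($domain, '!'); },
        $domains
    ));

    return '!(<a[^>]+href=["\']?)?'      // optional start of an <a> tag
         . 'https?://'                   // offending link
         . '([a-z0-9-]+\.)*'             // optional subdomains
         . '(' . $alternatives . ')'     // domains to block
         . '(/[^\s"\'<>]*)?'             // optional path
         . '(?(1)["\']?[^>]*>)!i';       // rest of the <a> tag, only if one was opened
}

$message = 'See http://www.domain1.com/file.zip <b>now</b>';
echo preg_replace(buildFilterRegex($filterthese), $replacement, $message), "\n";
```

As in the pattern above, only the opening <a ...> tag is replaced, so the link text and the closing </a> stay in the post.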
