Targeting specific PHP tag with regex

All my WordPress websites have recently been hacked, and a very long line of PHP has been added at the top of every PHP file.
It looks like this (just a sample of the entire code):
<?php $gqmtlkp = '~ x24<!%o:!>! x242178}527}88:}35csboe))1/35.)1/14+9**-)1/2986+7452]88]5]48]32M3]317]445]212]445]43]321]y]252]18y]#>q%
The problem is that the code is generated and differs from file to file. But I noticed that every copy contains
explode(chr((729-609))
Can someone help me build a regex that targets the first PHP tag (optional) containing the following, where the numbers vary:
explode(chr((xxx-xxx))
so that I can automatically remove it from every file?
Thanks a lot for your help

Based on my understanding of your request, you're looking to match the following format: <?php (optional) explode(chr((xxx-xxx))) <- your sample was missing a third closing parenthesis for the explode() call, so I added it. If that's not right, just remove the last \) portion.
Try this: /(<\?php)? explode\(chr\(\([0-9]{3}-[0-9]{3}\)\)\)/
I'm not sure whether the space after the optional opening PHP tag is necessary. You can adjust it from there.
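A minimal PHP sketch of the cleanup itself, under two assumptions that the question does not confirm: the injected code sits in its own PHP block at the very start of each file, and the payload does not itself contain a closing tag. The name stripInjectedBlock is hypothetical, and the marker is matched exactly as quoted in the question (two closing parentheses, any number of digits). Back up the files before running anything like this over a whole site.

```php
<?php
/**
 * Remove a leading injected PHP block that contains the
 * explode(chr((NNN-NNN)) marker. Hypothetical helper, not a malware scanner.
 */
function stripInjectedBlock(string $source): string
{
    $pattern = '/\A<\?php\b'              // opening tag at the very start of the file
        . '(?:(?!\?>).)*?'                // text that stays inside this first block
        . 'explode\(chr\(\(\d+-\d+\)\)'   // the marker, with varying numbers
        . '(?:(?!\?>).)*?\?>'             // rest of the block, up to its closing tag
        . '/s';

    // Limit 1: only the first block is touched. On a regex error keep the source.
    return preg_replace($pattern, '', $source, 1) ?? $source;
}

$infected = "<?php \$gqmtlkp = 'x'; \$a = explode(chr((729-609)), 'y'); ?><?php\necho 'hello';\n";
$cleaned = stripInjectedBlock($infected);
```

A file that does not start with such a block is returned unchanged, so the function can be run over clean files as well.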

Related

PhpStorm 2016.2 find and replace multiline text

In PhpStorm 2016.2 I have a new project that has been inherited and [badly] needs updating.
There are many pages each with opening line like so (example):
<?
include ("/inc/db.php");
I need to replace this line with several lines such as:
<?php
include "siteheader.php";
require "class.myclass.inc.php";
$dataBase = new DbObj();
I have previously simply copy and pasted multiline code into the PhpStorm search/replace function and that's (usually but not always) returned the correct changes, although they're all squished into single lines, making them harder to read (EOL characters are removed).
In this instance I am looking specifically at the "replace in path" function, as I need to apply this change to many pages.
I have read the manual but can see no option for this. I think I could possibly use a regular expression, but this would not be ideal (escaping etc.).
I have also looked but not found a suitable plugin from the PhpStorm Plugin Repository.
Is there a way of searching and/or replacing multiline text in path in PhpStorm 2016.2?
Cheers

There is no easy to use multi-line search or replace across multiple files (Find/Replace in Path functionality) unfortunately.
Right now you have to use Regex option for that -- that's the only option that works.
Watch these tickets (star/vote/comment) to get notified on any progress in this regard.
https://youtrack.jetbrains.com/issue/IDEA-69435
https://youtrack.jetbrains.com/issue/IDEA-61925
https://youtrack.jetbrains.com/issue/IDEA-145720
Manually making text regex-compatible can be quite error-prone, so you might use this multi-step trick:
1. Type your new text in one file to start with.
2. Select that text and invoke the Replace in Path... dialog. With the Regex option pre-selected, it should automatically escape your selection to be regex-compatible.
3. Copy that already-escaped text somewhere (the clipboard should be enough).
4. Close the dialog and go back to the original file.
5. Select the text you want to replace and invoke the Replace in Path... dialog again. It will have your initial text already filled in and regex-compatible.
6. Paste the previously copied escaped text into the Replace field.
7. Execute the find/replacement.
On a related note: https://stackoverflow.com/a/38672886/783119

You can do multiline Find&Replace with Regex option turned on
Find:
<\?\ninclude \("/inc/db\.php"\);
Replace:
<?php\ninclude "siteheader.php"; \nrequire "class.myclass.inc.php"; \n\$dataBase = new DbObj();
As you can see you need to do some additional work to escape some special characters and put \n instead of new lines, but it works. I've just checked.
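Outside the IDE, the same Find/Replace pair can be tried in a few lines of PHP. This is only a sketch: the function name upgradeHeader is hypothetical, and the optional \r in the pattern is an added assumption so that files with Windows line endings also match.

```php
<?php
// Sketch: apply the Find/Replace pair above to one file's contents.
function upgradeHeader(string $source): string
{
    // Same Find pattern as above, plus an optional \r (assumption).
    $find = '~<\?\r?\ninclude \("/inc/db\.php"\);~';

    $replacement = "<?php\n"
        . "include \"siteheader.php\";\n"
        . "require \"class.myclass.inc.php\";\n"
        . "\$dataBase = new DbObj();";

    // A callback returns its text literally, so the "$" in the replacement
    // needs no extra escaping for the regex engine.
    $result = preg_replace_callback(
        $find,
        static function (array $m) use ($replacement): string {
            return $replacement;
        },
        $source
    );

    return $result ?? $source;
}

$old = "<?\ninclude (\"/inc/db.php\");\necho 1;\n";
$new = upgradeHeader($old);
```

Looping this over the project files with file_get_contents() and file_put_contents() would give a scripted equivalent of Replace in Path.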
P.S.
Indeed, it was possible to simply paste multiline text in previous versions, but it's not possible anymore. ;-(

Type Alt+Enter to add a new line in either the "search" or the "replace" field.

On a Mac:
open the 'Find' or 'Replace' tool, click into the text area and press the following keys once for every new line you want to create:
⌘ + 'Shift' + 'Enter'

Besides the suggestions on how to use regex for multiline, in case you want to match two pieces of code with arbitrary lines in between, you can use [\s\S]* instead of [\n.]* (which doesn't have the expected result, because inside a character class the dot is a literal . rather than a wildcard). Example:
//you can match the $result-related code using `\$result([\s\S]*)while`
$result = DB::exec($query);
//blabla
//something else
while ($row = $result->fetch()) {
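The difference between the two character classes can be checked quickly in PHP. This sketch runs both patterns against the snippet above using PCRE; PhpStorm uses a different regex engine, but the character-class behaviour shown here is common to both.

```php
<?php
$code = "\$result = DB::exec(\$query);\n//blabla\n//something else\nwhile (\$row = \$result->fetch()) {";

// [\s\S] matches any character, including newlines.
$matchesAny = preg_match('/\$result([\s\S]*)while/', $code, $m);

// Inside [...] the dot is a literal ".", so [\n.] only matches newlines and dots.
$matchesDotClass = preg_match('/\$result([\n.]*)while/', $code);

var_dump($matchesAny, $matchesDotClass);
```

The first call finds a match whose group 1 spans all lines between $result and while; the second finds nothing.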

\s works as expected to match all whitespaces and newlines.
In my case I wanted to find switch ... case ... continue; syntax, so switch(\s|.)*continue worked as expected

Regex on File Names

I have a function called getContents(), which accepts a regex for the file names it finds.
I scan the js folder for javascript files, with the following two regex patterns:
$js['head'] = "/(\.head\.js\.php)|(\.head\.js)|(\.h.js)/";
$js['foot'] = "/(\.foot\.js\.php)|(\.foot\.js)|(\.f.js)|(\.js)^(\.head\.js)/";
I have a naming system that determines where a JavaScript file gets loaded: in the <head> tag or in the footer of the HTML page. All files are loaded at the bottom of the page by default, unless the name says otherwise (.head.js, for example).
A few days ago I noticed that the $js['foot'] array was also including .head.js files, causing them to be loaded twice. So I added the ^(\.head\.js) part and it worked: it stopped the .head.js files being added to the footer array. I was quite pleased with myself, because I suck at regex. However, it now seems that standard .js files (any normal .js files) aren't being loaded into the $js['foot'] array. Why is this? If I remove the ^(\.head\.js) part it loads them.
To be clear, I want the $js['foot'] array to load files ending with:
.foot.js.php
.foot.js
.f.js
.js
And IGNORE all:
.head.js.php
.head.js
.h.js
Can someone correct my regex above to do this? I thought the ^ operator meant NOT, but I was wrong!

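^(\.head\.js) in the middle of the pattern makes that alternative impossible to match, because outside a character class ^ is an anchor that matches the start of the string (or line). It only means NOT inside [^...]. That is why plain .js files stopped matching.
You actually need a negative lookbehind assertion to stop matching .head.js in the footer regex:
$js['head'] = '/\.head\.js(?:\.php)?|\.h\.js/';
$js['foot'] = '/\.foot\.js(?:\.php)?|(?<!\.head|\.h)\.js/';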
RegEx Demo
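A small sketch of how such patterns sort file names into the two slots. The patterns here are a variant, not the answer's exact regex: every dot is escaped, the lookbehind includes the leading dot (so that a name like search.js is not rejected just because it ends in h), and each alternative is anchored to the end of the name. The helper scriptSlot and the sample file names are hypothetical.

```php
<?php
// Variant patterns (an assumption, see above): escaped dots, end-anchored.
const JS_PATTERNS = [
    'head' => '/\.head\.js(?:\.php)?$|\.h\.js$/',
    'foot' => '/\.foot\.js(?:\.php)?$|(?<!\.head|\.h)\.js$/',
];

// Hypothetical helper: returns 'head', 'foot', or null for a file name.
function scriptSlot(string $fileName): ?string
{
    foreach (JS_PATTERNS as $slot => $pattern) {
        if (preg_match($pattern, $fileName) === 1) {
            return $slot;
        }
    }
    return null;
}

$names = ['app.js', 'search.js', 'menu.f.js', 'menu.foot.js.php', 'init.head.js', 'init.h.js'];
foreach ($names as $name) {
    echo $name, ' => ', scriptSlot($name) ?? 'none', PHP_EOL;
}
```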

PHP Regex URL parsing issues preg_replace

I have a custom markup parsing function that has been working very well for many years. I recently discovered a bug that I hadn't noticed before and I haven't been able to fix it. If anyone can help me with this, that'd be awesome. I have a custom-built forum and text-based MMORPG, and every input is sanitized and parsed for BBCode-like markup. It also parses out URLs and turns them into proper links that go to an exit page with a disclaimer that you're leaving the site. The issue is that when a user posts multiple URLs in a text box (let's say \n delimited), only every other URL is converted into a link. Here's the parser for URLs:
$markup = preg_replace("/(^|[^=\"\/])\b((\w+:\/\/|www\.)[^\s<]+)" . "((\W+|\b)([\s<]|$))/ei", '"$1".shortURL("$2")."$4"', $markup);
As you can see it calls a PHP function, but that's not the issue here. The entire text block is passed into this preg_replace at once rather than line by line or by any other means.
If there's a simpler way of writing this preg_replace, please let me know
If you can figure out why this is only parsing every other URL, that's my ultimate goal here
Example INPUT:
http://skylnk.co/tRRTnb
http://skylnk.co/hkIJBT
http://skylnk.co/vUMGQo
http://skylnk.co/USOLfW
http://skylnk.co/BPlaJl
http://skylnk.co/tqcPbL
http://skylnk.co/jJTjRs
http://skylnk.co/itmhJs
http://skylnk.co/llUBAR
http://skylnk.co/XDJZxD
Example OUTPUT:
http://skylnk.co/tRRTnb
<br>http://skylnk.co/hkIJBT
<br>http://skylnk.co/vUMGQo
<br>http://skylnk.co/USOLfW
<br>http://skylnk.co/BPlaJl
<br>http://skylnk.co/tqcPbL
<br>http://skylnk.co/jJTjRs
<br>http://skylnk.co/itmhJs
<br>http://skylnk.co/llUBAR
<br>http://skylnk.co/XDJZxD
<br>

The e flag in preg_replace is deprecated (since PHP 5.5) and was removed in PHP 7.0. You can use preg_replace_callback to get the same functionality.
The i flag is of little use here, since \w already matches both upper and lower case and there is no backreference in your pattern (the only literal it affects is www).
I set the m flag, which makes ^ and $ match at the beginning and end of each line rather than of the entire string. This should fix your problem of matching only every other line: the trailing group of each match consumes the newline that follows the URL, so without m the next URL has neither a string start nor a preceding character left for (^|[^="\/]) to match.
I also made some of the groups non-capturing, (?:pattern), since the bigger capturing groups already capture that text.
The code below is not tested. I only tested the regex in a regex tester.
$markup = preg_replace_callback(
    "/(^|[^=\"\/])\b((?:\w+:\/\/|www\.)[^\s<]+)((?:\W+|\b)(?:[\s<]|$))/m",
    function ($m) {
        return $m[1] . shortURL($m[2]) . $m[3];
    },
    $markup
);
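The effect of the m flag can be reproduced with a short, self-contained sketch. The shortURL() below is a hypothetical stand-in that only wraps the matched URL in brackets (the asker's real function is not shown in the question), and the example.com URLs are placeholders.

```php
<?php
// Hypothetical stand-in for the asker's shortURL(); it only marks what matched.
function shortURL(string $url): string
{
    return '[' . $url . ']';
}

// Runs the answer's pattern with the given flags ('' or 'm').
function linkify(string $markup, string $flags): string
{
    $pattern = '/(^|[^="\/])\b((?:\w+:\/\/|www\.)[^\s<]+)((?:\W+|\b)(?:[\s<]|$))/' . $flags;

    return preg_replace_callback($pattern, static function (array $m): string {
        return $m[1] . shortURL($m[2]) . $m[3];
    }, $markup);
}

$input = "http://example.com/a\nhttp://example.com/b\nhttp://example.com/c";

$withoutM = linkify($input, '');  // the second URL is skipped
$withM    = linkify($input, 'm'); // every URL is wrapped

echo $withoutM, PHP_EOL, PHP_EOL, $withM, PHP_EOL;
```

An alternative design would be to turn the trailing group into a lookahead so that it no longer consumes the separator; the sketch keeps the answer's approach instead.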

Matching all three kinds of PHP comments with a regular expression

I need to match all three types of comments that PHP might have:
# Single line comment
// Single line comment
/* Multi-line comments */
 
/**
* And all of its possible variations
*/
Something I should mention: I am doing this in order to recognize whether a PHP closing tag (?>) is inside a comment or not. If it is, ignore it; if not, count it as a closing tag. This is going to be used inside an XML document in order to improve Sublime Text's recognition of the closing tag (because it's driving me nuts!). I spent a couple of hours trying to achieve this, but I wasn't able to. How can I translate it to work with XML?
So if you could also include the if-then-else logic I would really appreciate it. BTW, I really need it to be a pure regular expression, no language features or anything. :)
Like Eicon reminded me, I need all of them to be able to match at the start of the line, or at the end of a piece of code, so I also need the following with all of them:
<?php
echo 'something'; # this is a comment
?>

Parsing a programming language seems too much for regexes to do. You should probably look for a PHP parser.
But these would be the regexes you are looking for. I assume for all of them that you use the DOTALL or SINGLELINE option (although the first two would work without it as well):
~#[^\r\n]*~
~//[^\r\n]*~
~/\*.*?\*/~s
Note that any of these will cause problems if the comment-delimiting characters appear in a string or anywhere else where they do not actually open a comment.
You can also combine all of these into one regex:
~(?:#|//)[^\r\n]*|/\*.*?\*/~s
If you use some tool or language that does not require delimiters (like Java or C#), remove those ~. In this case you will also have to apply the DOTALL option differently. But without knowing where you are going to use this, I cannot tell you how.
If you cannot/do not want to set the DOTALL option, this would be equivalent (I also left out the delimiters to give an example):
(?:#|//)[^\r\n]*|/\*[\s\S]*?\*/
See here for a working demo.
Now if you also want to capture the contents of the comments in a group, then you could do this
(?|(?:#|//)([^\r\n]*)|/\*([\s\S]*?)\*/)
Regardless of the type of comment, the comment's content (without the syntax delimiters) will be found in capture group 1, because the (?|...) branch reset group makes both alternatives number their group as 1.
Another working demo.
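For anyone testing the capturing variant in PHP itself, here is a brief sketch. The sample input is made up for illustration; the pattern is the one given above with ~ delimiters added.

```php
<?php
// The capturing pattern from above; (?|...) is PCRE's branch reset group.
$pattern = '~(?|(?:#|//)([^\r\n]*)|/\*([\s\S]*?)\*/)~';

// Made-up sample containing all three comment styles.
$code = "echo 1; # first\n// second\n/* third\nspans lines */";

preg_match_all($pattern, $code, $matches);

// Group 1 holds the comment text whichever alternative matched.
$comments = array_map('trim', $matches[1]);

print_r($comments);
```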

Single-line comments
singleLineComment = /'[^']*'|"[^"]*"|((?:#|\/\/).*$)/gm
With this regex you have to replace (or remove) everything that was captured by ((?:#|\/\/).*$). This regex will ignore contents of strings that would look like comments (e.g. $x = "You are the #1"; or $y = "You can start comments with // or # in PHP, but I'm a code string";)
Multiline comments
multilineComment = /^\s*\/\*\*?[^!][.\s\t\S\n\r]*?\*\//gm
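Where running PHP is an option, the built-in tokenizer avoids the string-literal pitfalls mentioned in the answers above. This does not meet the pure-regex constraint from the question, so treat it as a cross-check rather than a solution. The function name extractComments and the sample source are hypothetical.

```php
<?php
// Sketch using PHP's built-in tokenizer instead of a regex.
function extractComments(string $source): array
{
    $comments = [];
    foreach (token_get_all($source) as $token) {
        // Comment tokens are arrays: [token id, text, line number].
        if (is_array($token) && in_array($token[0], [T_COMMENT, T_DOC_COMMENT], true)) {
            // trim() hides the trailing newline that older PHP versions
            // include in single-line comment tokens.
            $comments[] = trim($token[1]);
        }
    }
    return $comments;
}

$source = "<?php\n\$x = 'You are the #1'; # a real comment\n/* block */\n";
$found = extractComments($source);

print_r($found);
```

The # inside the quoted string is not reported, because the tokenizer sees it as part of a string token.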

Complex PHP/Perl regular expression for emoticons

I've checked google for help on this subject but all the answers keep overlooking a fatal flaw in the replacement method.
Essentially I have a set of emoticons such as :) LocK :eek and so on, and need to replace them with image tags. The problem I'm having is identifying that a particular emoticon is not part of a word and stands alone. For example, on our site we allow 'quick links', which are not included in the smiley replacement and take the format go:forum, user:Username and so on. Pretty much all answers I've read don't allow for this possibility and as such break these links (i.e. go<img src="image.gif" />orum). I've tried experimenting with different ways to get around this, checking for the start of the line, spaces/newline characters and so on, but I've not had much luck.
Any help with this problem would be greatly appreciated. Oh also I'm using PHP 5 and the preg_% functions.
Thanks,
Rupert S.
Edit 18/04/2011:
Thanks for your help peeps :) I have created the final regex and thought I'd share it with everyone. I had a couple of problems with special space characters, including newline, but it's now working like a dream. The final regex is:
(?<=\s|\A|\n|\r|\t|\v|\<br \/\>|\<br\>)(:S)(?=\s|\Z|$|\n|\r|\t|\v|\<br \/\>|\<br\>)

To complete the comment into an answer: The simplest workaround would be to assert that the emoticons are always surrounded by whitespace.
(?<=\s|^)[<:\-}]+(?=\s|$)
The \s covers normal spaces and line breaks. Just to be safe ^ and $ cover occurrences at the start or very end of the text subject. The assertions themselves do not match, so can be ignored in the replacement string/callback.

If you want to do all the replacements in a single preg_replace, try this (it relies on the /e modifier, which PHP 5.5 deprecated and PHP 7.0 removed):
preg_replace('/(?<=^|\s)(:\)|:eek)(?=$|\s)/e'
,"'$1'==':)'?'<img src=\"smile.gif\"/>':('$1'==':eek'?'<img src=\"eek.gif\"/>':'$1')"
,$input);
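A callback-based sketch of the same idea that does not need the /e modifier. The file names smile.gif and eek.gif come from the snippet above; the :f entry, the constant EMOTICONS and the function replaceEmoticons are hypothetical and only there to show that go:forum stays intact. It assumes PHP 7.1 or later and treats only whitespace or the string boundaries as separators.

```php
<?php
// Emoticon => image file. The ":f" entry is hypothetical (see the lead-in).
const EMOTICONS = [
    ':)'   => 'smile.gif',
    ':eek' => 'eek.gif',
    ':f'   => 'f.gif',
];

function replaceEmoticons(string $text): string
{
    // Escape each emoticon so characters like ")" are matched literally.
    $alternatives = array_map(
        static function (string $emoticon): string {
            return preg_quote($emoticon, '/');
        },
        array_keys(EMOTICONS)
    );

    // (?<!\S) and (?!\S): no non-whitespace character directly before or after.
    $pattern = '/(?<!\S)(?:' . implode('|', $alternatives) . ')(?!\S)/';

    return preg_replace_callback($pattern, static function (array $m): string {
        return '<img src="' . EMOTICONS[$m[0]] . '" />';
    }, $text);
}

$html = replaceEmoticons('go:forum :) hi :eek');
echo $html, PHP_EOL;
```

Emoticons that sit directly next to a <br> tag are not handled here; the asker's final regex above adds those cases to the lookarounds.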
