Regex PHP Function - php

Yes, I know that people don't like parsing PHP, use a tokenizer they said, it will be great they said... I want you to know it isn't great, it isn't even fine.
I am working in .NET and using PCRE-NET and want to parse some PHP Functions to see if I can do some PHP tree shaking.
I tried using CodeParser which uses Antlr4 to tokenize, the results I got back were horrible to navigate. Yes it is all there technically, but it is so convoluted that really, Regex is better for what I am looking for.
I have the following regex working:
(?<functionScope>\w+)\s*function\s+(?<functionName>\w+)\s*\((?<functionArguments>(?:[^()]+)*)?\s*\)[\s:]*.*(?<functionBody>{(?:[^{}]+|(?-1))*+})
Try it out: https://regex101.com/r/yU6K45/1
This will break up a PHP File into the individual scopes, functions, arguments and function body. I am now looking at the functionBody and wanting to find all functions used inside that function, which I have here:
(?=[^\=\s])((?<functionClass>[$?\w[\w\d]*)?(?<ClassOperator>::|->|\\)?){0,3}?(?<functionName>\w[\w\d]*)\((?<Arguments>.*)?\)
See it at: https://regex101.com/r/3JzPR5/1
An issue I am having is with named groups. When there is a lot of namespacing, the named groups don't work out well. I am wondering if you have any ideas how to split up the line:
$uri = ExtraLevel\Psr7\UriResolver::resolve(Psr7\Utils::uriFor($config['base_uri']), $uri);
To where I would have something like:
Full match ExtraLevel\Psr7\UriResolver::resolve(Psr7\Utils::uriFor($config['base_uri']), $uri)
Group `functionClass` ExtraLevel\
Group `functionClass2` Psr7\
Group `functionClass3` UriResolver::
Group `functionName` resolve
Group `Arguments` Psr7\Utils::uriFor($config['base_uri']), $uri
Would love to match in a way that won't break when there aren't 3-4 levels.
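One way to avoid numbering the groups is to capture the whole qualifier chain in a single group and split it in a second pass. A minimal sketch follows, written in PHP for illustration; the pattern is plain PCRE, so it may carry over to PCRE.NET, but that is an assumption and has not been checked here. The helper name splitCall and the group name qualifier are made up for this example, and the Arguments group uses recursion so that nested parentheses stay balanced.

```php
<?php
// Hypothetical helper: splits the first call expression found in $line.
function splitCall(string $line): ?array
{
    // qualifier: zero or more levels such as "Name\", "Name::" or "$obj->"
    $pattern = '/(?<qualifier>(?:\$?\w+(?:::|->|\\\\))*)'
             . '(?<functionName>\w+)'
             // Arguments: balanced parentheses via recursion into the same group
             . '\((?<Arguments>(?:[^()]++|\((?&Arguments)\))*)\)/';

    if (preg_match($pattern, $line, $m) !== 1) {
        return null; // no call expression found
    }

    // Second pass: cut the qualifier chain into its levels, however many there are.
    preg_match_all('/\$?\w+(?:::|->|\\\\)/', $m['qualifier'], $levels);

    return [
        'match'         => $m[0],
        'functionClass' => $levels[0],
        'functionName'  => $m['functionName'],
        'Arguments'     => $m['Arguments'],
    ];
}
```

Because functionClass comes back as a list, it holds zero, one or several levels without the pattern having to know the depth in advance.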

Related

Exclude strings from search PhpStorm

This question might have an answer somewhere on the internet but I can't seem to find it. Of course if you have a link to give me I will gladly accept it as an answer if it is what I am looking for.
Here goes:
How can I exclude some strings from my global search?
Basically I use Ctrl + Shift + F to find all occurrences of a nameOfTheFile.php string to find all the patterns that call this file. A not so very smart developer created multiple nameOfTheFile.php everywhere and so the path to include them always changes. I need something fixed, so I need to change every single call. There are a lot of calls (1533 occurrences according to PhpStorm), so doing them one by one is NOT an option.
So my plan is to write all the patterns down (there shouldn't be more than 50 so it is doable) and replace all of them later. To do that I could use a filter to exclude the patterns that I have already found.
At the moment the pattern list would be something like:
include_once("folderY/nameOfTheFile.php");
include_once(PATH . "folderY/nameOfTheFile.php");
include_once (PATH . "folderY/nameOfTheFile.php");
include_once ("../../../folderX/nameOfTheFile.php");
include_once ("../../folderX/nameOfTheFile.php");
include_once ("../folderX/nameOfTheFile.php");
require_once($settings['siteFilepath'] . "folderY/nameOfTheFile.php");
How can I exclude those strings from the search? I thought of using a Regex but as I am not an expert (Junior Dev here :/) I can't really come up with one. Also I would think that maybe there is something built into PhpStorm that could work better.
Have I missed something? Is there a Regex to help me? Bonus point: if there is a Regex please explain how it works (remember I am far from being an expert).
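In PhpStorm, open Replace in Files (Ctrl+Shift+R), select the 'Regex' option and enter:
include_once.*nameOfTheFile\.php"\);
This will select the offending entries for replacement. Here .* matches any run of characters on the line, and the backslashes make the dot and the closing parenthesis literal. The pattern as written skips the require_once line; starting it with (?:include|require)_once covers both.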
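To see what the pattern catches, it can be run over the sample lines from the question. A small PHP sketch, assuming the seven lines listed in the question are representative; the widened pattern with (?:include|require) is a suggestion added here for comparison.

```php
<?php
// Sample lines copied from the question.
$lines = [
    'include_once("folderY/nameOfTheFile.php");',
    'include_once(PATH . "folderY/nameOfTheFile.php");',
    'include_once (PATH . "folderY/nameOfTheFile.php");',
    'include_once ("../../../folderX/nameOfTheFile.php");',
    'include_once ("../../folderX/nameOfTheFile.php");',
    'include_once ("../folderX/nameOfTheFile.php");',
    'require_once($settings[\'siteFilepath\'] . "folderY/nameOfTheFile.php");',
];

// include_once          literal text
// .*                    any run of characters on the same line
// nameOfTheFile\.php    the file name, with the dot escaped
// "\);                  closing quote, literal ")" and ";"
$original = '/include_once.*nameOfTheFile\.php"\);/';

// Assumption: require_once calls should be replaced as well.
$widened = '/(?:include|require)_once.*nameOfTheFile\.php"\);/';

$hitsOriginal = preg_grep($original, $lines); // entries matched by the original pattern
$hitsWidened  = preg_grep($widened, $lines);  // entries matched by the widened pattern

printf("original: %d of %d, widened: %d of %d\n",
    count($hitsOriginal), count($lines), count($hitsWidened), count($lines));
```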

Using regex pattern match in xhprof ignore function

I am trying to profile a CodeIgniter application with xhprof. I am getting a report like the following...
Now I am trying to ignore some functions during xhprof report generation. This is what I did:
$ignore = array(
'???_op',
'???_op#1',
'???_op#2',
'???_op#3',
'???_op#4',
'???_op#5'
);
xhprof_enable(XHPROF_FLAGS_NO_BUILTINS | XHPROF_FLAGS_CPU | XHPROF_FLAGS_MEMORY, array('ignored_functions' => $ignore));
Now if I want to ignore all the CI-related functions (i.e. the functions starting with CI_), it seems I have to insert them one by one into the array.
Is there any way where I can pattern match with regex and ignore functions according to my requirement?
Unfortunately, PHP's xhprof_enable() does not support regex patterns in the ignored_functions element of the options parameter.
I reckon the simplest way to manually generate the blacklist would be to copy-paste the rendered output from the function into your favorite IDE.
Once the text is in your IDE use the regex find/replace functionality to isolate your desired function names such as:
^(?:\?{3}_op|CI_)\S*
Then just copy the matches into your blacklist array.
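If the function names are already available as a PHP array, the same filtering can be done in code instead of in the IDE. A rough sketch, assuming a list of names collected from an earlier report; the $seen values below are invented sample data, not real profiler output.

```php
<?php
// Hypothetical sample of function names, e.g. copied from an earlier report.
$seen = [
    '???_op',
    '???_op#1',
    'CI_Loader::view',
    'CI_Controller::__construct',
    'Welcome::index',
    'load_class',
    'main()',
];

// Same pattern as above, wrapped in PHP regex delimiters.
// preg_grep() keeps the entries that match; array_values() reindexes them.
$ignore = array_values(preg_grep('/^(?:\?{3}_op|CI_)\S*/', $seen));

// $ignore can then be passed exactly like the hand-written list:
// xhprof_enable($flags, array('ignored_functions' => $ignore));
```

The list still has to be rebuilt whenever new CI_ functions appear, since xhprof only receives the expanded names.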

Change url variable using php

Say I have a url like this in a php variable:
$url = "http://mywebsite.extension/names/level/etc/page/x";
How would I automatically remove everything after the .com (or other extension) and before /page/x?
Basically I would like every url that could be in $url to become http://mywebsite.extension/page/x
Is there a way to do this in php? :s
thanks for your help guys!
I think parse_url() is the function you're looking for. You can use it to break down a URL into its component parts, and then put it back together however you want, adding in your own things as needed.
As PeeHaa noted, explode() will be useful for dividing up the path.
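Put together, the parse_url() and explode() suggestion might look like the sketch below. The helper name keepFromPage is made up, and it assumes the part to keep always starts at the first path segment named "page"; port, query string and fragment are dropped for brevity.

```php
<?php
// Hypothetical helper: keeps scheme and host, drops path segments before "page".
function keepFromPage(string $url): string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'], $parts['path'])) {
        return $url; // not a URL we can rebuild; leave it untouched
    }

    // "/names/level/etc/page/x" becomes ['names', 'level', 'etc', 'page', 'x']
    $segments = explode('/', trim($parts['path'], '/'));

    $pos = array_search('page', $segments, true);
    if ($pos === false) {
        return $url; // no "page" segment found
    }

    // Rebuild the path from "page" onwards.
    $path = '/' . implode('/', array_slice($segments, $pos));

    return $parts['scheme'] . '://' . $parts['host'] . $path;
}

$url = "http://mywebsite.extension/names/level/etc/page/x";
echo keepFromPage($url), "\n"; // http://mywebsite.extension/page/x
```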

Search through multiple hosted XML files for a string

I'm working on a website that uses a lot of XML-files as data (150 in total and probably growing). Each page is an XML-file.
What I'm looking for is a way to look for a string through the XML-files. I'm not sure what programming language to use for this XML search engine.
I'm familiar with PHP, JavaScript, JQuery. So I'd prefer using those languages.
Thanks a bunch!
UPDATE: I'm looking for a solution that works quickly.
Ideally, the function returns the tagname that contains the searchstring.
If, for instance, the XML is as follows:
<article-1>This is a great story.</article-1>
If one would search for 'story', it would return 'article-1'.
I'm not quite sure on how to do this with a regular expression.
PHP can do this. Here's an example:
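foreach (glob("{foldera/*.xml,folderb/*.xml}", GLOB_BRACE) as $filename) {
    $xml = simplexml_load_file($filename);
    if ($xml === false) {
        continue; // skip files that fail to parse
    }
    // use regular expressions (or strpos) on the contents of $xml to find your string
}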
You simply iterate through each file on your server using glob() with a foreach loop.
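For the per-file step, one possible sketch is below: it returns the names of the elements whose own text contains the search string, which is what the question asks for. The helper name findTagsContaining and the <site> wrapper element are assumptions for the example, and plain substring matching stands in for a regular expression.

```php
<?php
// Hypothetical helper: returns the names of elements whose own text contains $needle.
function findTagsContaining(string $xmlSource, string $needle): array
{
    if ($needle === '') {
        return []; // nothing to search for
    }

    $root = simplexml_load_string($xmlSource);
    if ($root === false) {
        return []; // unparsable XML
    }

    // descendant-or-self::* selects the root element and everything below it.
    $elements = $root->xpath('descendant-or-self::*') ?: [];

    $hits = [];
    foreach ($elements as $element) {
        // Casting to string yields the text directly inside the element,
        // not the text of its child elements.
        if (strpos((string) $element, $needle) !== false) {
            $hits[] = $element->getName();
        }
    }
    return $hits;
}

$xml = '<site><article-1>This is a great story.</article-1>'
     . '<article-2>Nothing here.</article-2></site>';
$tags = findTagsContaining($xml, 'story'); // ['article-1']
```

Inside the glob() loop the same function can be fed file_get_contents($filename). With 150 files this rereads everything on each search, so caching the results may be worth considering if speed matters.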
Sounds like a problem that could be solved with grep and regular expressions. Without knowing what string you're looking for it's not possible to say exactly what you should do, but reading some documentation on grep should get you started down the right path.

Changing/deleting html from file_get_contents

I'm currently using this code:
$blog= file_get_contents("http://powback.tumblr.com/post/" . $post);
echo $blog;
And it works. But tumblr has added a script that activates each time you enter a password-field. So my question is:
Can I remove certain parts with file_get_contents? Or just remove everything above the <html> tag? Could I possibly kill a whole div so it won't load at all? And if so, how?
edit:
I managed to do it the simple way, by skipping 766 characters. The script now works as intended!
$blog = file_get_contents("http://powback.tumblr.com/post/" . $post, false, null, 766);
After file_get_contents returns, you have in your hands a string. You can do anything you want to it, including cutting out parts of it.
There are two ways to actually do the cutting:
Using string functions like str_replace, preg_replace and others; the exact recipe depends on what you need to do. This approach is kind of frowned upon because you are working at the wrong level of abstraction, but in some cases it has an unmatched performance to time spent ratio.
Parsing the HTML into a DOM tree, modifying it appropriately (this time working at the appropriate level of abstraction) and then turning it back into a string and echoing it. This can be more convenient to work with if your requirements are not dead simple and is easier to maintain, but it typically requires more code to be written.
If you want to do something that's most naturally expressed in HTML document terms ("cutting out this <div>") then don't be tempted by the first approach; go with the second.
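A minimal sketch of the second approach, using PHP's bundled DOM extension. The helper name removeElementsByTag is made up, and removing by tag name is just one possible criterion; the same loop works for nodes selected some other way.

```php
<?php
// Hypothetical helper: removes every element with the given tag name.
function removeElementsByTag(string $html, string $tag): string
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // The node list is live, so copy it to an array before removing nodes.
    foreach (iterator_to_array($doc->getElementsByTagName($tag)) as $node) {
        $node->parentNode->removeChild($node);
    }

    // Serialize the modified tree back into a string.
    return $doc->saveHTML();
}

// Usage, assuming $blog holds the fetched page:
// echo removeElementsByTag($blog, 'script');
```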
At that point, $blog is just a string, so you can use normal PHP functions to alter it. Look into these 2:
http://php.net/manual/en/function.str-replace.php
http://us2.php.net/manual/en/function.preg-replace.php
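For the string-level route, a short sketch of two cuts mentioned in the question: dropping everything above the <html> tag, and removing script blocks. Both helper names are made up, and the regex is only adequate for simple markup.

```php
<?php
// Hypothetical helper: drops everything that precedes the opening <html tag.
function dropBeforeHtmlTag(string $page): string
{
    $pos = stripos($page, '<html'); // case-insensitive search
    return $pos === false ? $page : substr($page, $pos);
}

// Hypothetical helper: removes <script>...</script> blocks.
// "i" ignores case, "s" lets the dot match newlines, ".*?" stops at the first closing tag.
function stripScripts(string $page): string
{
    // preg_replace() returns null on error; fall back to the input in that case.
    return preg_replace('#<script\b[^>]*>.*?</script>#is', '', $page) ?? $page;
}

// Usage, assuming $blog holds the fetched page:
// echo stripScripts(dropBeforeHtmlTag($blog));
```

Unlike skipping a fixed number of characters, this keeps working if the length of the leading part changes.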
You can parse your output using Simple HTML DOM Parser and display only the contents that you really want to display.
