Regex to get a specific function code from php file - php

I'm editing a PHP file to replace a function's code with other code that is provided.
My problem is that the function's code may change, so I don't know exactly how to write the regex to find it.
Here's an example:
function lol()
{
echo('omg');
$num=0;
while($num<100)
{
echo('ok');
}
}
function teste()
{
echo('ola');
while($num<100)
{
echo('ok');
}
}
function test()
{
echo('oi');
while($num<100)
{
echo('ok');
}
}
What I need is to get the code of function teste(), which would be:
echo('ola');
while($num<100)
{
echo('ok');
}
BUT: I do not know how many whiles, foreaches, or brackets are inside the functions, nor do I know their order. The code above is just an example.
How would I be able to get the function's code?
Thank you in advance.

Disclaimer
As other users stated, this is probably not a good approach. If you decide to use it anyway, double-check your result, as regex can be treacherous.
Ugly solution
You can use something like this, which matches even if the function is the last one (credit to @Nippey):
(?:function teste\(\))[^{]*{(.*?)}[^}]*?(?:function|\Z)
Note: I'm using (?:xyz), which is a non-capturing group. Check that your regex engine supports it. The pattern also needs the s (dot-all) modifier so that . matches newlines.
Output
echo('ola');
while($num<100)
{
echo('ok');
}

Using recursive pattern (?R) you can match nested brackets:
$result = preg_replace_callback('#(.*?)\{((?:[^{}]|(?R))*)\}|.*#si', function($m){
if(isset($m[1], $m[2]) && preg_match('#function\s+teste\b#i', $m[1])){
return $m[2];
}
}, $code); // Anonymous function is used here which means PHP 5.3 is required
echo $result;
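A narrower variant of the recursive idea, as a minimal sketch: anchor the pattern on the function name and recurse only over the braces. The helper name extract_function_body is hypothetical, and the sketch assumes the parameter list contains no nested parentheses and that no braces appear inside strings or comments in the body.

```php
<?php
// Hypothetical helper (not from the answers above): returns the body of the
// named function, or null when no match is found.
function extract_function_body($code, $name) {
    // Group 1 is a balanced {...} block; (?1) recurses into group 1 for nested
    // blocks. Group 2 is everything between the outermost braces.
    $pattern = '/function\s+' . preg_quote($name, '/')
             . '\s*\([^)]*\)\s*(\{((?:[^{}]++|(?1))*)\})/';
    return preg_match($pattern, $code, $m) ? trim($m[2]) : null;
}

$code = "function lol()\n{\necho('omg');\n}\n"
      . "function teste()\n{\necho('ola');\nwhile(\$num<100)\n{\necho('ok');\n}\n}\n";
echo extract_function_body($code, 'teste');
```

Because the name is followed by \s*\(, asking for test will not accidentally match teste.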

Related

preg_match_all multine code scanning matches with a first word match

First time posting a question here. I am currently writing a tool to reverse engineer PHP 7 code in order to create UML class diagrams, and for that purpose I'm using preg_match_all to extract sections of code from the source files. So far so good, but I must admit I don't yet fully understand how regular expressions work. Still, I was able to create complex patterns, but this one beats me.
What I want is to match the use clause in a class body in order to get the trait names. It can be in one of the following formats (as mentioned in https://www.php.net/manual/en/language.oop5.traits.php):
1)
use trait;
use othertrait;
2)
use trait, othertrait, someothertrait;
or
use trait, othertrait, someothertrait { "conflict_resolutions" }
I don't care about conflict resolutions yet, so I can drop these.
So far I have the following regex pattern:
class usetrait_finder {
use finder;
function __construct( string $source ){
$this->source = $source;
$this->pattern = "/";
$this->pattern .= "(?:(use\s+|,)\s*(?<traitname>[a-zA-Z0-9_]*))";
$this->pattern .= "";
$this->pattern .= "/ms";
$this->matches($source);
}
function get_trait_name(): string {
return $this->matches["traitname"][$this->current_key];
}
}
which matches the mentioned cases, but I know it's a cheat because the word "use" must appear at least once, at the start. I wrote a PHPUnit test to check every normal case, and the following test doesn't pass:
// tag "use" must be present at least once
function test_invalid_source_2(){
$source = "function sarasa(
sometrait
,
someothertrait )
{ anything }
function test() {}
}";
$finder = new usetrait_finder( $source );
var_dump( $finder->matches($source)[0] );
$this->assertEquals( false, $finder->more_elements() );
}
the var_dump output is:
array(1) {
[0]=>
string(16) ",
someothertrait"
}
My expected output should be an empty string or null, since the word "use" is not the first thing matched on the first line. Of course the other tests must still pass, and each entry in $matches["traitname"] should hold only one trait name.
usetrait_finder is here:
https://github.com/rudymartinb/classtree/blob/master/src/usetrait_finder.php
"finder" trait section of above source is here (not really important but it won't hurt mentioning):
https://github.com/rudymartinb/classtree/blob/master/src/traits/finder.php
full test case is here: https://github.com/rudymartinb/classtree/blob/master/tests/usetrait_finder_Test.php
thank you in advance
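One possible way to avoid the stray match, as a sketch under assumptions rather than a definitive answer: first require a whole statement that starts with the word "use", then split its comma-separated list in a second step. The function name find_trait_names is hypothetical, and the sketch assumes each use statement starts on its own line inside the class body (so closures written as function() use ($x) on one line are not picked up).

```php
<?php
// Hypothetical helper: collects trait names from "use" statements.
function find_trait_names($source) {
    $names = array();
    // Step 1: a statement starts a line with "use" and runs to ";" or "{".
    if (preg_match_all('/^\s*use\s+([^;{]+)[;{]/m', $source, $statements)) {
        foreach ($statements[1] as $list) {
            // Step 2: split the list on commas and drop surrounding whitespace.
            foreach (explode(',', $list) as $name) {
                $name = trim($name);
                if ($name !== '') {
                    $names[] = $name;
                }
            }
        }
    }
    return $names;
}

print_r(find_trait_names("use trait;\nuse othertrait;"));
print_r(find_trait_names("use trait, othertrait, someothertrait { conflict }"));
```

With this split, a comma-separated list that is not introduced by "use" (as in the failing test source) yields an empty array.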

In PHP is there a better way to do dynamic templating

I'm creating a module for an application which will have simple custom templating, with tags that will be replaced with data from a database. The field names will be different in each instance of this module. I want to know if there is a better way to do this.
The code below is what I've come up with, but I believe there must be a better way. I struggled with preg_split and preg_match_all and just hit my limit, so I did it the dumb way.
<?php
$customTemplate = "
<div>
<<This>>
<<that>>
</div>
";
function process_template ($template, $begin = '<<', $end = '>>') {
$begin_exploded = explode($begin, $template);
if (is_array($begin_exploded)) {
foreach ($begin_exploded as $key1 => $value1) {
$end_exploded = explode($end, $value1);
if (is_array($end_exploded)) {
foreach ($end_exploded as $key2 => $value2) {
$tag = $begin.$value2.$end;
$variable = trim($value2);
$find_it = strpos($template,$tag);
if ($find_it !== false) {
//str_replace ($tag, $MyClass->get($variable), $template );
$template = str_replace ($tag, $variable, $template);
}
}
}
}
}
return $template;
}
echo(process_template($customTemplate));
/* Will Echo
<div>
This
that
</div>
*/
?>
In the future I will connect $MyClass->get() to replace the tag with the proper data. And the custom template will be built by the user.
Rather than preg_split or preg_match, I would use preg_replace_callback, since you are doing replacements and the replacement value comes from what looks like it will end up being a method in another class.
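function process_template($template, $MyClass, $begin = '<<', $end = '>>') {
// $MyClass is passed in as a parameter so the callback can reach it.
// preg_quote() keeps custom delimiters from being read as regex syntax.
$pattern = '/' . preg_quote($begin, '/') . '(\w+)' . preg_quote($end, '/') . '/';
return preg_replace_callback($pattern, function($var) use ($MyClass) {
return $MyClass->get($var[1]);
}, $template);
}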
Here's an example to play with: https://3v4l.org/N1p03
I assume this is just for fun/learning. If I really needed to use a template for something, I would rather start with composer require "twig/twig:^2.0" instead. In fact, if you're interested in learning more about how it works, you could check out how a well-established system like Twig or Blade does it. (Better than I've done it in this answer.)
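To try the callback approach without a database, here is a minimal runnable sketch. The class ArrayTagSource and the function name render_template are hypothetical stand-ins for $MyClass and process_template; the only thing assumed about the real class is that it exposes a get() method.

```php
<?php
// Hypothetical stand-in for $MyClass; the real one would read from the database.
class ArrayTagSource {
    private $data;
    public function __construct(array $data) { $this->data = $data; }
    public function get($name) {
        // In this sketch, unknown tags are replaced with an empty string.
        return isset($this->data[$name]) ? $this->data[$name] : '';
    }
}

function render_template($template, $source, $begin = '<<', $end = '>>') {
    // \s* tolerates spaces inside the tag, e.g. "<< that >>".
    $pattern = '/' . preg_quote($begin, '/') . '\s*(\w+)\s*' . preg_quote($end, '/') . '/';
    return preg_replace_callback($pattern, function ($m) use ($source) {
        return $source->get($m[1]);
    }, $template);
}

$source = new ArrayTagSource(array('This' => 'A', 'that' => 'B'));
$out = render_template('<div><<This>> and << that >></div>', $source);
echo $out; // <div>A and B</div>
```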
There are tons of templating engines around, but sometimes they just add complexity and dependencies for what may be a simple thing. This is a modified sample of what I used to make some JavaScript corrections. It works for your template.
function process_template($html,$b='<<',$e='>>'){
$replace=['this'=>'<input name="this" />','that'=>'<input name="that" />'];
if(preg_match_all('/('.$b.')(.*?)('.$e.')/is',$html,$matches,PREG_SET_ORDER|PREG_OFFSET_CAPTURE)){
$t='';$o=0;
foreach($matches as $m){
//for reference: $m[1][0] contains $b, $m[2][0] contains the tag name, $m[3][0] contains $e
$t.=substr($html,$o,$m[0][1]-$o);
$t.=$replace[$m[2][0]];
$o=$m[3][1]+strlen($m[3][0]);
}
$t.=substr($html,$o);
$html=$t;
}
return $html;
}
$html="
<div>
<<this>>
<<that>>
</div>
";
$new=process_template($html);
echo $new;
For demo purposes I included the array $replace, which handles the substitutions. Replace it with your own function that handles the replacement.
Here is a working snippet: https://3v4l.org/MBnbR
I like this function because you have control over what to replace and what to put back in the final result. By the way, using PREG_OFFSET_CAPTURE also returns the position where each regex group matched. The positions are in $m[x][1]; the captured text is in $m[x][0].
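A small illustration of what PREG_OFFSET_CAPTURE returns, using a made-up input string (the string 'ab<<this>>' is only an example, not taken from the answer):

```php
<?php
$html = 'ab<<this>>';
preg_match('/(<<)(.*?)(>>)/', $html, $m, PREG_OFFSET_CAPTURE);
// Each entry of $m is array(matched text, byte offset within $html).
echo $m[0][0], ' at ', $m[0][1], "\n"; // <<this>> at 2
echo $m[2][0], ' at ', $m[2][1], "\n"; // this at 4
echo $m[3][0], ' at ', $m[3][1], "\n"; // >> at 8
```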

PHP Regular expression with arrows (>>)

I need a little help with my regular expression.
Here is what I've got:
function formatLink($post) {
if(preg_match('/^\>\>[0-9]{+}$/', $post)) {
return "<font color=\"red\">".$post."</font>";
} else {
return "<font color=\"#b7b7b7\">".$post."</font>";
}
}
echo formatLink(">>86721678");
And honestly I don't know why it doesn't work. It should work for any string like this:
>>1
>>87759
Very similar to imageboard-like post ref.
Remove the curly braces. They are not needed. You also need to add the m modifier to allow it to match on any line, not just the entire post.
Also note that this will only work if there is literally nothing else on the line, not even a space. You might want to relax it like so:
/^\s*>>\s*\d+\s*$/m
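As a rough sketch of how the m modifier changes things, the relaxed pattern can be run over a multi-line post with preg_match_all. The sample post text is made up, and a capture group around \d+ is added here only to collect the referenced numbers:

```php
<?php
$post = "hello\n>>87759\n>> 12 trailing";
// With the m modifier, ^ and $ match at the start and end of every line,
// so the reference on the second line is found.
preg_match_all('/^\s*>>\s*(\d+)\s*$/m', $post, $m);
print_r($m[1]); // only "87759": the third line has extra text after the number

// Without m, ^ and $ anchor to the whole string, so nothing matches here.
$found = preg_match('/^\s*>>\s*\d+\s*$/', $post);
```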
You forgot to escape!
<?php
function formatLink($post) {
if(preg_match('/^\>\>[0-9]+$/', $post))
{
return "<font color=\"red\">".htmlentities($post)."</font>";
}
else
{
return "<font color=\"#b7b7b7\">".htmlentities($post)."</font>";
}
}
echo formatLink(">>86721678");
I think your problem is in your regular expression.
Use this instead:
if(preg_match('/^\>\>([0-9]+)$/', $post)) {
See that I removed the curly brackets from your regular expression.
Try changing the regex to
/^\>\>[0-9]*$/

Returning a top level domain with a period at the end in php

Basically, the problem I am having is that I need to write a function that can take a URL like www.stackoverflow.com and return just the "com". But I need to return the same value even if the URL has a period at the end, like "www.stackoverflow.com."
This is what I have so far. The if statement is my attempt to return the element of the array before the period, but I don't think I am using the if statement correctly. Otherwise the rest of the code does exactly what it is supposed to do.
<?php
function getTLD($domain)
{
$domainArray = explode("." , $domain);
$topDomain = end($domainArray);
if ($topDomain == " ")
$changedDomain = prev(end($domainArray));
return $changedDomain;
return $topDomain;
}
?>
Don't use a regex for simple cases like that, it is cpu costly and unreadable. Just remove the final dot if it exists:
function getTLD($domain) {
$domain = rtrim($domain, '.');
$parts = explode('.', $domain);
return end($parts); // end() takes its argument by reference, so it needs a variable
}
The end function is returning an empty string "" (without any spaces). You are comparing $topDomain to a single space character, so the if is not evaluating to true.
Also, the prev function requires an array as input, but end($domainArray) returns a string, so $changedDomain = prev(end($domainArray)) should raise an E_WARNING.
Since end updates the internal pointer of the array $domainArray, which is already updated when you called $topDomain = end($domainArray), you do not need to call end on $domainArray inside the if block.
Try:
if ($topDomain == "") {
$changedDomain = prev($domainArray);
return $changedDomain; // Will output com
}
Here is the phpfiddle for it.
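The pointer behaviour described above can be seen in isolation with a short sketch (the variable names are illustrative only):

```php
<?php
$parts = explode('.', 'www.stackoverflow.com.');
// $parts is array('www', 'stackoverflow', 'com', '').
$last = end($parts);    // '' and the internal pointer now sits on the last element
$before = prev($parts); // moves the pointer back one step and returns 'com'
echo $before;
```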
Use regular expressions for something like this. Try this:
function getTLD($domain) {
return preg_replace("/.*\.([a-z]+)\.?$/i", "$1", $domain );
}
A live example: http://codepad.org/km0vCkLz
Read more about regular expressions and about how to use them: http://www.regular-expressions.info/

Reading php files with special tags in php

I have a file which reads as follows
<<row>> 1|test|20110404<</row>>
<<row>> 1|test|20110404<</row>>
<<row>> and <</row>> indicate the start and end of a line. I want to read the line between these tags and also check whether the tags are present.
The first thing you need to do is locate the position of this "tag". The strpos() function does just that.
$tag_pos=strpos('<> 1|test|20110404<> <> 1|test|20110404<>', '<>');
if ($tag_pos===false) {
//The tag was not found!
} else {
//$tag_pos equals the numeric position of the first character of your tag
}
If these are truly lines, an efficient way to get them all is just to split on <>.
$lines=explode('<>', '<> 1|test|20110404<> <> 1|test|20110404<>');
$lines=array_filter($lines); //Removes blank strings from array
You could improve this by adding a callback function to the array_filter() call that uses trim() to remove any whitespace and then see if it is blank or not.
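A minimal sketch of that improvement, using the same sample string; the array_map and array_values calls are extra tidying steps added here, not part of the suggestion above:

```php
<?php
$lines = explode('<>', '<> 1|test|20110404<> <> 1|test|20110404<>');
// Keep only entries that still contain something after trimming whitespace.
$lines = array_filter($lines, function ($line) {
    return trim($line) !== '';
});
// Trim what is left and renumber the keys from 0.
$lines = array_values(array_map('trim', $lines));
print_r($lines);
```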
Edit: Great, I see that your "tags" were missing from your post. Since your start and end tags do not match, the code above will be of little use to you. Let me try again...
function strbetweenstrs($source, $tag1, $tag2, $casesensitive=true) {
$results=array();
$whatsleft=$source;
while ($whatsleft<>'') {
// Find the opening tag; strpos() is case-sensitive, stripos() is not
$pos1=$casesensitive ? strpos($whatsleft, $tag1) : stripos($whatsleft, $tag1);
if ($pos1===false) {
break;
}
// The content starts right after the opening tag
$start=$pos1+strlen($tag1);
$pos2=$casesensitive ? strpos($whatsleft, $tag2, $start) : stripos($whatsleft, $tag2, $start);
if ($pos2===false) {
break;
}
array_push($results, substr($whatsleft, $start, $pos2-$start));
// Continue searching after the closing tag
$whatsleft=substr($whatsleft, $pos2+strlen($tag2));
}
return $results;
}
Note that I haven't tested this... but you get the general idea. There is probably a much more efficient way to go about doing it.
Creating your own format is not so hard, but creating a script to read it can be difficult.
The advantage of using standardized formats is that most programming languages already have support for them. For example:
XML: You can use the simplexml_load_string() function, which lets you navigate your content easily.
$str = '<?xml version="1.0" encoding="utf-8"?>
<data>
<row>1|test|20110404</row>
<row>1|test|20110404</row>
</data>';
$xml = simplexml_load_string($str);
Now you can access your data
echo $xml->row[0];
echo $xml->row[1];
I'm sure you get the idea.
There is also very good support for JSON (JavaScript Object Notation) through the json_decode() function.
Check php.net for more details.
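For comparison, the same rows could be stored as JSON and read back like this. The "rows" key and the overall layout are an assumed format chosen for illustration, not something from the question:

```php
<?php
// Assumed JSON layout: one "rows" array holding the pipe-delimited lines.
$json = '{"rows": ["1|test|20110404", "1|test|20110404"]}';
$data = json_decode($json, true); // true returns associative arrays
if ($data === null) {
    echo 'Invalid JSON';
} else {
    echo $data['rows'][0];
}
```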
I would suggest using preg_match:
preg_match( '#<<row>>(.*?)<</row>>#', $line, $matches);
if( ! empty($matches))
{
// line was found
print_r( $matches[1] ); // will contain the content between the start and end row tags
}
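To process a whole file at once, a sketch along the same lines could use preg_match_all and then split each row on the pipe character. The sample content mirrors the question; the variable names and the decision to split into fields are assumptions:

```php
<?php
$content = "<<row>> 1|test|20110404<</row>>\n<<row>> 1|test|20110404<</row>>";
$rows = array();
// The lazy (.*?) stops at the first closing tag; s lets a row span lines.
if (preg_match_all('#<<row>>(.*?)<</row>>#s', $content, $matches)) {
    foreach ($matches[1] as $row) {
        // Each row becomes an array of its pipe-separated fields.
        $rows[] = explode('|', trim($row));
    }
}
print_r($rows);
```

An empty $rows array after the call means no complete pair of tags was found.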
