Basically I want to find a function from a php string source content. I'm trying to parse a php file and read its content into string. I want to find something like:
function_name(paras) or function_name() or function_name(params, params)
for example if source contains:
echo 'Greetings'.greet("I'm Johan");
$age = date_of_birth(date());
echo 'I am ' .$age . 'years old';
it would then find greet, date_of_birth, date because these are the functions used.
If you want to get the parameters including nested brackets, like your date_of_birth(date()), its maybe not impossible with regex but very difficult.
If you say its enough to find the name of the function then you can try this:
\w+(?=\()
See it here on Regexr
That will match at least one word character that is followed by an opening bracket.
\w contains letters, digits and the underscore
(?=\() is a positive look ahead that checks if a ( is following
you need a regular expression:
maybe something like this
preg_match("[a-zA-Z_][a-zA-Z_0-9]*(.*)",$stringToLookIn);
preg_match_all('/([a-zA-Z_]\w+)\s*\(/', $source, $match);
var_dump($match[1]);
Related
I'm trying to retrieve the followed by count on my instagram page. I can't seem to get the Regex right and would very much appreciate some help.
Here's what I'm looking for:
y":{"count":
That's the beginning of the string, and I want the 4 numbers after that.
$string = preg_replace("{y"\"count":([0-9]+)\}","",$code);
Someone suggested this ^ but I can't get the formatting right...
You haven't posted your strings so it is a guess to what the regex should be... so I'll answer on why your codes fail.
preg_replace('"followed_by":{"count":\d')
This is very far from the correct preg_replace usage. You need to give it the replacement string and the string to search on. See http://php.net/manual/en/function.preg-replace.php
Your second usage:
$string = preg_replace(/^y":{"count[0-9]/","",$code);
Is closer but preg_replace is global so this is searching your whole file (or it would if not for the anchor) and will replace the found value with nothing. What your really want (I think) is to use preg_match.
$string = preg_match('/y":\{"count(\d{4})/"', $code, $match);
$counted = $match[1];
This presumes your regex was kind of correct already.
Per your update:
Demo: https://regex101.com/r/aR2iU2/1
$code = 'y":{"count:1234';
$string = preg_match('/y":\{"count:(\d{4})/', $code, $match);
$counted = $match[1];
echo $counted;
PHP Demo: https://eval.in/489436
I removed the ^ which requires the regex starts at the start of your string, escaped the { and made the\d be 4 characters long. The () is a capture group and stores whatever is found inside of it, in this case the 4 numbers.
Also if this isn't just for learning you should be prepared for this to stop working at some point as the service provider may change the format. The API is a safer route to go.
This regexp should capture value you're looking for in the first group:
\{"count":([0-9]+)\}
Use it with preg_match_all function to easily capture what you want into array (you're using preg_replace which isn't for retrieving data but for... well replacing it).
Your regexp isn't working because you didn't escaped curly brackets. And also you didn't put count quantifier (plus sign in my example) so it would only capture first digit anyway.
I have a big string like this:
[/az_column_text][/vc_column_inner][vc_column_inner width="3/4"]
[az_latest_posts post_layout="listed-layout" post_columns_count="2clm" post_categories="assemblea-soci-2015"]
[/vc_column_inner][/vc_row_inner][/vc_column]
What I need to extract:
assemblea-soci-2015
Of course this value can change, and also the big string can change too. I need a regex or something else to extract this value (it will be always from post_categories="my-value-to-extract") from this big string.
I think to take post_categories=" as the beginning of a possible substring and the next char " as the end of my portion, but no idea how to do this.
Is there an elegant way to do this also for future values with, of course, different length?
You can use this regex in PHP:
post_categories="\K[^"]+
RegEx Demo
You can use this regex:
(?<=post_categories=")[^"]+(?=")
?<= (lookbehind) looks for post_categories=" before the desired match, and (?=) (lookahead) looks for " after the desired match.
[^"] gets the match (which is assumed not to contain any ")
Demo
Example PHP code:
$text='[/az_column_text][/vc_column_inner][vc_column_inner width="3/4"]
[az_latest_posts post_layout="listed-layout" post_columns_count="2clm" post_categories="assemblea-soci-2015"]
[/vc_column_inner][/vc_row_inner][/vc_column]';
preg_match ("/(?<=post_categories=\")[^\"]+(?=\")/", $text,$matches);
echo $matches[0];
Output:
assemblea-soci-2015
This should extract what you want.
preg_match ("/post_categories=\"(.*)\"\[\]/", $text_you_want_to_use)
I am trying to pull the anchor text from a link that is formatted this way:
<h3><b>File</b> : i_want_this</h3>
I want only the anchor text for the link : "i_want_this"
"variable_text" varies according to the filename so I need to ignore that.
I am using this regex:
<a href=\"\/en\/browse\/file\/variable_text\">(.*?)<\/a>
This is matching of course the complete link.
PHP uses a pretty close version to PCRE (PERL Regex). If you want to know a lot about regex, visit perlretut.org. Also, look into Regex generators like exspresso.
For your use, know that regex is greedy. That means that when you specify that you want something, follwed by anything (any repetitions) followed by something, it will keep on going until that second something is reached.
to be more clear, what you want is this:
<a href="
any character, any number of times (regex = .* )
">
any character, any number of times (regex = .* )
</a>
beyond that, you want to capture the second group of "any character, any number of times". You can do that using what are called capture groups (capture anything inside of parenthesis as a group for reference later, also called back references).
I would also look into named subpatterns, too - with those, you can reference your choice with a human readable string rather than an array index. Syntax for those in PHP are (?P<name>pattern) where name is the name you want and pattern is the actual regex. I'll use that below.
So all that being said, here's the "lazy web" for your regex:
<?php
$str = '<h3><b>File</b> : i_want_this</h3>';
$regex = '/(<a href\=".*">)(?P<target>.*)(<\/a>)/';
preg_match($regex, $str, $matches);
print $matches['target'];
?>
//This should output "i_want_this"
Oh, and one final thought. Depending on what you are doing exactly, you may want to look into SimpleXML instead of using regex for this. This would probably require that the tags that we see are just snippits of a larger whole as SimpleXML requires well-formed XML (or XHTML).
I'm sure someone will probably have a more elegant solution, but I think this will do what you want to done.
Where:
$subject = "<h3><b>File</b> : i_want_this</h3>";
Option 1:
$pattern1 = '/(<a href=")(.*)(">)(.*)(<\/a>)/i';
preg_match($pattern1, $subject, $matches1);
print($matches1[4]);
Option 2:
$pattern2 = '()(.*)()';
ereg($pattern2, $subject, $matches2);
print($matches2[4]);
Do not use regex to parse HTML. Use a DOM parser. Specify the language you're using, too.
Since it's in a captured group and since you claim it's matching, you should be able to reference it through $1 or \1 depending on the language.
$blah = preg_match( $pattern, $subject, $matches );
print_r($matches);
The thing to remember is that regex's return everything you searched for if it matches. You need to specify that only care about the part you've surrounded in parenthesis (the anchor text). I'm not sure what language you're using the regex in, but here's an example in Ruby:
string = 'i_want_this'
data = string.match(/<a href=\"\/en\/browse\/file\/variable_text\">(.*?)<\/a>/)
puts data # => outputs 'i_want_this'
If you specify what you want in parenthesis, you can reference it:
string = 'i_want_this'
data = string.match(/<a href=\"\/en\/browse\/file\/variable_text\">(.*?)<\/a>/)[1]
puts data # => outputs 'i_want_this'
Perl will have you use $1 instead of [1] like this:
$string = 'i_want_this';
$string =~ m/<a href=\"\/en\/browse\/file\/variable_text\">(.*?)<\/a>/;
$data = $1;
print $data . "\n";
Hope that helps.
I'm not 100% sure if I understand what you want. This will match the content between the anchor tags. The URL must start with /en/browse/file/, but may end with anything.
#(.*?)#
I used # as a delimiter as it made it clearer. It'll also help if you put them in single quotes instead of double quotes so you don't have to escape anything at all.
If you want to limit to numbers instead, you can use:
#(.*?)#
If it should have just 5 numbers:
#(.*?)#
If it should have between 3 and 6 numbers:
#(.*?)#
If it should have more than 2 numbers:
#(.*?)#
This should work:
<a href="[^"]*">([^<]*)
this says that take EVERYTHING you find until you meet "
[^"]*
same! take everything with you till you meet <
[^<]*
The paratese around [^<]*
([^<]*)
group it! so you can collect that data in PHP! If you look in the PHP manual om preg_match you will se many fine examples there!
Good luck!
And for your concrete example:
<a href="/en/browse/file/variable_text">([^<]*)
I use
[^<]*
because in some examples...
.*?
can be extremely slow! Shoudln't use that if you can use
[^<]*
You should use the tool Expresso for creating regular expression... Pretty handy..
http://www.ultrapico.com/Expresso.htm
I have strings in my application that users can send via a form, and they can optionally replace words in that string with replacements that they also specify. For example if one of my users entered this string:
I am a user string and I need to be parsed.
And chose to replace and with foo the resulting string should be:
I am a user string foo I need to be parsed.
I need to somehow find the starting position of what they want to replace, replace it with the word they want and then tie it all together.
Could anyone write this up or at least provide an algorithm? My PHP skills aren't really up to the task :(
Thanks. :)
$result = preg_replace('/\band\b/i', 'foo', $subject);
will find all occurences of and where it's a word on its own and replace it with foo. \b ensures that there is a word boundary before and after and.
use preg_replace. You don't need to think so hard about this though you will have to learn a little bit about regexes. :)
Read up on str_replace, or for more complex replacements on Regular Expressions and preg_replace.
Examples for both:
<?php
$str = 'I am a user string and I need to be parsed.';
echo str_replace( 'and', 'foo', $str ) . "\n";
echo preg_replace( '/and/', 'foo', $str ) . "\n";
?>
In response to the comments of this answer, note that both examples above will replace every occurrence of the search string (and), even when it happens to be within another word.
To take care of that you either have to add the word separators to the str_replace call (see the comment of an example), but this will get quite complicated when you want to take care of all common word separators (space, commas, dots, exclamation marks, question marks etc.).
An easier to way to fix this problem is to use the power of regular expressions and make sure, the actual search string is not found within another word. See Tim Pietzcker's example below for a possible solution.
Here is the subject:
http://www.mysite.com/files/get/937IPiztQG/the-blah-blah-text-i-dont-need.mov
What I need using regex is only the bit before the last / (including that last / too)
The 937IPiztQG string may change; it will contain a-z A-Z 0-9 - _
Here's what I tried:
$code = strstr($url, '/http:\/\/www\.mysite\.com\/files\/get\/([A-Za-z0-9]+)./');
EDIT: I need to use regex because I don't actually know the URL. I have string like this...
a song
more text
oh and here goes some more blah blah
I need it to read that string and cut off filename part of the URLs.
You really don't need a regexp here. Here is a simple solution:
echo basename(dirname('http://www.mysite.com/files/get/937IPiztQG/the-blah-blah-text-i-dont-need.mov'));
// echoes "937IPiztQG"
Also, I'd like to quote Jamie Zawinski:
"Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems."
This seems far too simple to use regex. Use something similar to strrpos to look for the last occurrence of the '/' character, and then use substr to trim the string.
/http:\/\/www.mysite.com\/files\/get\/([^/]+)\/
How about something like this? Which should capture anything that's not a /, 1 or more times before a /.
The greediness of regexp will assure this works fine ^.*/
The strstr() function does not use a regular expression for any of its arguments it's the wrong function for regex replacement.
Are you thinking of preg_replace()?
But a function like basename() would be more appropriate.
Try this
$ok=preg_match('#mysite\.com/files/get/([^/]*)#i',$url,$m);
if($ok) $code=$m[1];
Then give a good read to these pages
http://www.php.net/preg_match
preg_replace
Note
the use of "#" as a delimiter to avoid getting trapped into escaping too many "/"
the "i" flag making match insensitive
(allowing more liberal spellings of the MySite.com domain name)
the $m array of captured results