first time posting a question here. I am currently writing a tool for reverse engineer PHP7 code in order to create UML class diagrams and for that purpose I'm using preg_match_all to extract sections of code from the source files. So far so good but I must admit I don't fully understand yet how regular expressions work. Still I was able to create complex patterns but this one beats me.
what I want is to match the use clause from classes body in order to get the traits names. It can be in one of these following formats *(as mentioned in https://www.php.net/manual/en/language.oop5.traits.php):
1)
use trait;
use othertrait;
2)
use trait, othertrait, someothertrait;
or
use trait, othertrait, someothertrait { "conflict_resolutions" }
I don't care about conflict resolutions yet so I can drop these.
so far I have the following regex pattern:
class usetrait_finder {
use finder;
function __construct( string $source ){
$this->source = $source;
$this->pattern = "/";
$this->pattern .= "(?:(use\s+|,)\s*(?<traitname>[a-zA-Z0-9_]*))";
$this->pattern .= "";
$this->pattern .= "/ms";
$this->matches($source);
}
function get_trait_name(): string {
return $this->matches["traitname"][$this->current_key];
}
}
which matches the mentioned cases, but I know it's a cheat because the "use" word must appear at least one at first. I wrote a PHPUnit test to check every normal case and the following test doesn't pass:
// tag "use" must be preset at least once
function test_invalid_source_2(){
$source = "function sarasa(
sometrait
,
someothertrait )
{ anything }
function test() {}
}";
$finder = new usetrait_finder( $source );
var_dump( $finder->matches($source)[0] );
$this->assertEquals( false, $finder->more_elements() );
}
the var_dump output is:
array(1) {
[0]=>
string(16) ",
someothertrait"
}
my expected output should be an empty string or null since the word "use" is not the first thing matched on the first line. Of course others tests must pass, and in the $matches["traitname"] should be only one trait name.
usetrait_finder is here:
https://github.com/rudymartinb/classtree/blob/master/src/usetrait_finder.php
"finder" trait section of above source is here (not really important but it won't hurt mentioning):
https://github.com/rudymartinb/classtree/blob/master/src/traits/finder.php
full test case is here: https://github.com/rudymartinb/classtree/blob/master/tests/usetrait_finder_Test.php
thank you in advance
In the following code, from Beginning PHP and mySQL 5e, if acronym function is called without mentioning $matches, how ,in the definition of acronym, $matches is never linked to anything, but rather used in the isset($acronym[$matches[1]])) ?, How isset knows what is $matches in the first place?
The following is the code and I have tested that it is working.
I just cannot follow up with the use of an arbitrary term; $matches, and its use.
// This function will add the acronym's long form
// directly after any acronyms found in $matches
function acronym($matches) {
$acronyms = array(
'WWW' => 'World Wide Web',
'IRS' => 'Internal Revenue Service',
'PDF' => 'Portable Document Format');
if (isset($acronyms[$matches[1]]))
return $acronyms[$matches[1]] . " (" . $matches[1] . ")";
else
return $matches[1];
}
// The target text
$text = "The <acronym>IRS</acronym> offers tax forms in
<acronym>PDF</acronym> format on the <acronym>WWW</acronym>.";
// Add the acronyms' long forms to the target text
$newtext = preg_replace_callback("/<acronym>(.*)<\/acronym>/U", 'acronym',
$text);
print_r($newtext);
The output is:
The Internal Revenue Service (IRS) offers tax forms inPortable Document Format (PDF) format on the World Wide Web (WWW).
Reminder: The input, for the function preg_replace_callback is:
The <acronym>IRS</acronym> offers tax forms in <acronym>PDF</acronym> format on the <acronym>WWW</acronym>.
The preg_replace_callback() function is written in that way, that it calls the function with a well defined argument. See the manual of this function:
A callback that will be called and passed an array of matched elements in the subject string. The callback should return the replacement string. This is the callback signature:
handler ( array $matches ) : string
So your function acronym() will get an array with the matches from the regex. Keep in mind that you are not calling the acronym() function by yourself, the function preg_replace_callback() does that for you (with the argument defined in the documentation).
I have latex + html code somewhere in the following form:
...some text1.... \[latex-code1\]....some text2....\[latex-code2\]....etc
Firstly I want to obtain the latex codes in an array codes[] to be able to send them to a server for rendering, so that
code[0]=latex-code1, code[1]=latex-code2, etc
Secondly, I want to modify this text so that it looks like:
...some text1.... <img src="root/1.png">....some text2....<img src="root/2.png">....etc
i.e, the i-th latex code fragment is replaced by the link to the i-th rendered image.
I have been trying to do this with preg_replace_callback and preg_match_all but being new to PHP haven't been able to make it work. Please advise.
If you're looking for codez:
$html = '...some text1.... \[latex-code1\]....some text2....\[latex-code2\]....etc';
$codes = array();
$count = 0;
$replace = function($matches) use (&$codes, &$count) {
list(, $codes[]) = $matches;
return sprintf('<img src="root/%d.png">', ++$count);
};
$changed = preg_replace_callback('~\\\\\\[(.+?)\\\\\\]~', $replace, $html);
echo "Original: $html\n";
echo "Changed : $changed\n\nLatex Codes: ", print_r($codes, 1), "Count: ", $count;
I don't know at which part you've got the problems, if it's the regex pattern, you use characters inside your markers that needs heavy escaping: For PHP and PCRE, that's why there are so many slashes.
Another tricky part is the callback function because it needs to collect the codes as well as having a counter. It's done in the example with an anonymous function that has variable aliases / references in it's use clause. This makes the variables $codes and $count available inside the callback.
I'm working on a simple templating system. Basically I'm setting it up such that a user would enter text populated with special tags of the form: <== variableName ==>
When the system would display the text it would search for all tags of the form mentioned and replace the variableName with its corresponding value from a database result.
I think this would require a regular expression but I'm really messed up in REGEX here. I'm using php btw.
Thanks for the help guys.
A rather quick and dirty hack here:
<?php
$teststring = "Hello <== tag ==>";
$values = array();
$values['tag'] = "world";
function replaceTag($name)
{
global $values;
return $values[$name];
}
echo preg_replace('/<== ([a-z]*) ==>/e','replaceTag(\'$1\')',$teststring);
Output:
Hello world
Simply place your 'variables' in the variable array and they will be replaced.
The e modifier to the regular expression tells it to eval the replacement, the [a-z] lets you name the "variables" using the characters a-z (you could use [a-z0-9] if you wanted to include numbers). Other than that its pretty much standard PHP.
Very useful - Pointed me to what I was looking for...
Replacing tags in a template e.g.
<<page_title>>, <<meta_description>>
with corresponding request variables e,g,
$_REQUEST['page_title'], $_REQUEST['meta_description'],
using a modified version of the code posted:
$html_output=preg_replace('/<<(\w+)>>/e', '$_REQUEST[\'$1\']', $template);
Easy to change this to replace template tags with values from a DB etc...
If you are doing a simple replace, then you don't need to use a regexp. You can just use str_replace() which is quicker.
(I'm assuming your '<== ' and ' ==>' are delimiting your template var and are replaced with your value?)
$subject = str_replace('<== '.$varName.' ==>', $varValue, $subject);
And to cycle through all your template vars...
$tplVars = array();
$tplVars['ONE'] = 'This is One';
$tplVars['TWO'] = 'This is Two';
// etc.
// $subject is your original document
foreach ($tplVars as $varName => $varValue) {
$subject = str_replace('<== '.$varName.' ==>', $varValue, $subject);
}
UPDATE:
Thank you all for your input. Some additional information.
It's really just a small chunk of markup (20 lines) I'm working with and had aimed to to leverage a regex to do the work.
I also do have the ability to hack up the script (an ecommerce one) to insert the classes as the navigation is built. I wanted to limit the number of hacks I have in place to keep things easier on myself when I go to update to the latest version of the software.
With that said, I'm pretty aware of my situation and the various options available to me. The first part of my regex works as expected. I posted really more or less to see if someone would say, "hey dummy, this is easy just change this....."
After coming close with a few of my efforts, it's more of the principle at this point. To just know (and learn) a solution exists for this problem. I also hate being beaten by a piece of code.
ORIGINAL:
I'm trying to leverage regular expressions to add a CSS a class to the first and last list items within an ordered list. I've tried a bunch of different ways but can't produce the results I'm looking for.
I've got a regular expression for the first list item but can't seem to figure a correct one out for the last. Here is what I'm working with:
$patterns = array('/<ul+([^<]*)<li/m', '/<([^<]*)(?<=<li)(.*)<\/ul>/s');
$replace = array('<ul$1<li class="first"','<li class="last"$2$3</ul>');
$navigation = preg_replace($patterns, $replace, $navigation);
Any help would be greatly appreciated.
Jamie Zawinski would have something to say about this...
Do you have a proper HTML parser? I don't know if there's anything like hpricot available for PHP, but that's the right way to deal with it. You could at least employ hpricot to do the first cleanup for you.
If you're actually generating the HTML -- do it there. It looks like you want to generate some navigation and have a .first and .last kind of thing on it. Take a step back and try that.
+1 to generating the right html as the best option.
But a completely different approach, which may or may not be acceptable to you: you could use javascript.
This uses jquery to make it easy ...
$(document).ready(
function() {
$('#id-of-ul:firstChild').addClass('first');
$('#id-of-ul:lastChild').addClass('last');
}
);
As I say, may or may not be any use in this case, but I think its a valid solution to the problem in some cases.
PS: You say ordered list, then give ul in your example. ol = ordered list, ul = unordered list
You wrote:
$patterns = array('/<ul+([^<]*)<li/m','/<([^<]*)(?<=<li)(.*)<\/ul>/s');
First pattern:
ul+ => you search something like ullll...
The m modifier is useless here, since you don't use ^ nor $.
Second pattern:
Using .* along with s is "dangerous", because you might select the whole document up to the last /ul of the page...
And well, I would just drop s modifier and use: (<li\s)(.*?</li>\s*</ul>) with replace: '$1class="last" $2'
In view of above remarks, I would write the first expression: <ul.*?>\s*<li
Although I am tired of seeing the Jamie Zawinski quote each time there is a regex question, Dustin is right in pointing you to a HTML parser (or just generating the right HTML from the start!): regexes and HTML doesn't mix well, because HTML syntax is complex, and unless you act on a well known machine generated output with very predictable result, you are prone to get something breaking in some cases.
I don't know if anyone cares any longer, but I have a solution that works in my simple test case (and I believe it should work in the general case).
First, let me point out two things: While PhiLho is right in that the s is "dangerous", since dots may match everything up to the final of the document, this may very well be what you want. It only becomes a problem with not well formed pages. Be careful with any such regex on large, manually written pages.
Second, php has a special meaning of backslashes, even in single quotes. Most regexen will perform well either way, but you should always double-escape them, just in case.
Now, here's my code:
<?php
$navigation='<ul>
<li>Coffee</li>
<li>Tea</li>
<li>Milk</li>
<li>Beer</li>
<li>Water</li>
</ul>';
$patterns = array('/<ul.*?>\\s*<li/',
'/<li((.(?<!<li))*?<\\/ul>)/s');
$replace = array('$0 class="first"',
'<li class="last"$1');
$navigation = preg_replace($patterns, $replace, $navigation);
echo $navigation;
?>
This will output
<ul>
<li class="first">Coffee</li>
<li>Tea</li>
<li>Milk</li>
<li>Beer</li>
<li class="last">Water</li>
</ul>
This assumes no line feeds inside the opening <ul...> tag. If there are any, use the s modifier on the first expression too.
The magic happens in (.(?<!<li))*?. This will match any character (the dot) that is not the beginning of the string <li, repeated any amount of times (the *) in a non-greedy fashion (the ?).
Of course, the whole thing would have to be expanded if there is a chance the list items already have the class attribute set. Also, if there is only one list item, it will match twice, giving it two such attributes. At least for xhtml, this would break validation.
You could load the navigation in a SimpleXML object and work with that. This prevents you from breaking your markup with some crazy regex :)
As a preface .. this is waaay over-complicating things in most use-cases. Please see other answers for more sanity :)
Here is a little PHP class I wrote to solve a similar problem. It adds 'first', 'last' and any other classes you want. It will handle li's with no "class" attribute as well as those that already have some class(es).
<?php
/**
* Modify list items in pre-rendered html.
*
* Usage Example:
* $replaced_text = ListAlter::addClasses($original_html, array('cool', 'awsome'));
*/
class ListAlter {
private $classes = array();
private $classes_found = FALSE;
private $count = 0;
private $total = 0;
// No public instances.
private function __construct() {}
/**
* Adds 'first', 'last', and any extra classes you want.
*/
static function addClasses($html, $extra_classes = array()) {
$instance = new self();
$instance->classes = $extra_classes;
$total = preg_match_all('~<li([^>]*?)>~', $html, $matches);
$instance->total = $total ? $total : 0;
return preg_replace_callback('~<li([^>]*?)>~', array($instance, 'processListItem'), $html);
}
private function processListItem($matches) {
$this->count++;
$this->classes_found = FALSE;
$processed = preg_replace_callback('~(\w+)="(.*?)"~', array($this, 'appendClasses'), $matches[0]);
if (!$this->classes_found) {
$classes = $this->classes;
if ($this->count == 1) {
$classes[] = 'first';
}
if ($this->count == $this->total) {
$classes[] = 'last';
}
if (!empty($classes)) {
$processed = rtrim($matches[0], '>') . ' class="' . implode(' ', $classes) . '">';
}
}
return $processed;
}
private function appendClasses($matches) {
array_shift($matches);
list($name, $value) = $matches;
if ($name == 'class') {
$value = array_filter(explode(' ', $value));
$value = array_merge($value, $this->classes);
if ($this->count == 1) {
$value[] = 'first';
}
if ($this->count == $this->total) {
$value[] = 'last';
}
$value = implode(' ', $value);
$this->classes_found = TRUE;
}
return sprintf('%s="%s"', $name, $value);
}
}