I need some help with creating a regex for my php script. Basically, I have an associative array containing my data, and I want to use preg_replace to replace some place-holders with real data. The input would be something like this:
<td>{{address}}</td><td>{{fixDate}}</td><td>{{measureDate}}</td><td>{{builder}}</td>
I don't want to use str_replace, because the array may hold many more items than I need.
If I understand correctly, preg_replace is able to take the text that it finds from the regex, and replace it with the value of that key in the array, e.g.
<td>{{address}}</td>
gets replaced with the value of $replace['address']. Is this true, or did I misread the PHP docs?
If it is true, could someone please help show me a regex that will parse this for me (would appreciate it if you also explain how it works, since I am not very good with regexes yet).
Many thanks.
Use preg_replace_callback(). It's incredibly useful for this kind of thing.
$replace_values = array(
'test' => 'test two',
);
$result = preg_replace_callback('!\{\{(\w+)\}\}!', 'replace_value', $input);
function replace_value($matches) {
global $replace_values;
return $replace_values[$matches[1]];
}
Basically this says find all occurrences of {{...}} containing word characters and replace that value with the value from a lookup table (being the global $replace_values).
For well-formed HTML/XML parsing, consider using the Document Object Model (DOM) in conjunction with XPath. It's much more fun to use than regexes for that sort of thing.
To avoid global variables and handle missing keys gracefully, you can use:
function render($template, $vars) {
return \preg_replace_callback("!{{\s*(?P<key>[a-zA-Z0-9_-]+?)\s*}}!", function($match) use($vars){
return isset($vars[$match["key"]]) ? $vars[$match["key"]] : $match[0];
}, $template);
}
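As a rough usage sketch of the callback approach, here it is applied to a row like the one in the question. The function name render_template and the sample data are made up for illustration; the pattern is a slightly simplified variant of the one above, and placeholders with no matching key are assumed to stay as they are.

```php
<?php
// Hypothetical helper, repeated here so the snippet runs on its own.
function render_template($template, array $vars) {
    return preg_replace_callback('/\{\{\s*([\w-]+)\s*\}\}/', function ($m) use ($vars) {
        // $m[0] is the whole placeholder, $m[1] the key between the braces.
        // Unknown keys are left untouched rather than replaced with nothing.
        return array_key_exists($m[1], $vars) ? $vars[$m[1]] : $m[0];
    }, $template);
}

// Sample data (invented); 'unused' shows that extra array items are simply ignored.
$row  = '<td>{{address}}</td><td>{{ fixDate }}</td><td>{{builder}}</td>';
$data = array('address' => '1 Main St', 'fixDate' => '2014-01-02', 'unused' => 'x');

$html = render_template($row, $data);
echo $html, "\n";
```

If the values can come from user input, wrapping the replacement in htmlspecialchars() before returning it would be a sensible addition.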
I like how StackOverflow allows you to search for tags by specifying [tagname] in the search field. How could I go about writing a parser that would help me separate out tags from normal text. I can think of the manual way which would be to use some combination of substring and/or regex to get the position of opening and closing square brackets, and then extract out those strings, but I'm curious if there's a better way (and my regex skill is subpar at best)
// example
$query = 'How to use [jQuery] [selector] selectors';
$tags = getTags($query); // $tags == 'jQuery, selector'
$text = getText($query); // $text == 'How to use selectors'
Regular expressions are probably the way to go. The more you can specify about how the tags are formed, the easier it will be to capture the right ones. In the expression below I limit tags to word characters (\w, which matches letters, digits, and underscores). The function uses a capture group (enclosed in parentheses) to pull out the relevant tags.
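function getTags($query) {
    // $matches[0] holds the full "[tag]" matches, $matches[1] the text captured inside the brackets
    preg_match_all('/\[(\w+)\]/', $query, $matches);
    return implode(', ', $matches[1]);
}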
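The question also asks for a getText() counterpart. A minimal sketch, assuming tags consist only of word characters and that the whitespace left behind by removed tags should be collapsed to single spaces:

```php
<?php
// Sketch of the getText() named in the question; the whitespace handling is an assumption.
function getText($query) {
    // Drop every [tag], then squeeze the runs of spaces the tags leave behind.
    $text = preg_replace('/\[\w+\]/', '', $query);
    return trim(preg_replace('/\s+/', ' ', $text));
}

$query = 'How to use [jQuery] [selector] selectors';
echo getText($query), "\n";
```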
Regex would probably work best, just don't try to parse HTML.
https://www.debuggex.com/
That is a really good site for seeing visually what your regex is doing. I would recommend reading up on the PHP regex functions as well; there is a cheat sheet at the bottom of the site.
\[(\w+)\]
would work to get the tags, using a capture group (the square brackets have to be escaped, otherwise they define a character class). The preg_match_all function is well suited to working with multiple results; just make sure to read the official documentation to get it working how you need it.
For parsing more complex or irregular input (like HTML, which is extremely difficult to parse reliably with regexes), it is better to do it manually. Regex has worked for all my non-HTML parsing needs in the past.
I am trying to work out the optimal way to replace all PHP variables within a string of code with a call to an array instead as shown below.
E.g. source code string
$random_var_name + $random_var_name2 * $diff_var_name3
Transformed into
$varArray["random_var_name"] + $varArray["random_var_name2"] * $varArray["diff_var_name3"]
I had thought that preg_replace() was the optimal solution, but the difficulty comes with the need to perform the replacement with a sub-part of the search pattern.
Perhaps it is better to just retrieve all the variables with a preg_match, edit/wrap them, then perform a single str_replace() for each variable?
However this is probably considerably slower.
The following regex should do what you're asking:
preg_replace('/\$([a-zA-Z_0-9]+)/', '$varArray["$1"]', $input_string);
To avoid changing $var['foo'] into $varArray["var"]['foo'], you have to check that no [ character follows the variable name. Use a negative lookahead for this:
$string = preg_replace('/\$(\w+)(?![\w\[])/', '$varArray["$1"]', $string);
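To see what the lookahead changes, here is a small sketch that runs the pattern over two sample inputs (both invented for illustration). The \w inside the lookahead matters too: without it the engine could backtrack and match only part of a variable name.

```php
<?php
$pattern     = '/\$(\w+)(?![\w\[])/';
$replacement = '$varArray["$1"]';   // single quotes: PHP does not interpolate, $1 is the captured name

// Plain variables are all rewritten.
$plain = preg_replace($pattern, $replacement, '$a + $b2 * $c');

// $var is followed by "[", so it is skipped; $x is still rewritten.
$indexed = preg_replace($pattern, $replacement, '$var[\'foo\'] + $x');

echo $plain, "\n", $indexed, "\n";
```

Note that the pattern treats every $name the same way, so $this and superglobals such as $_GET would be rewritten as well unless they are excluded.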
I've got a problem with regexp function, preg_replace(), in PHP.
I want to get viewstate from html's input, but it doesn't work properly.
This code:
$viewstate = preg_replace('/^(.*)(<input\s+id="__VIEWSTATE"\s+type="hidden"\s+value=")(.*[^"])("\s+name="__VIEWSTATE">)(.*)$/u','^\${3}$',$html);
Returns this:
%0D%0A%0D%0A%3C%21DOCTYPE+html+PUBLIC+%22-%2F%2FW3C%2F%2FDTD+XHTML+1.0+Transitional%2F%2FEN%22+%22http%3A%2F%2Fwww.w3.org%2FTR%2Fxhtml1%2FDTD%2Fxhtml1-transitional.dtd%22%3E%0D%0A%0D%0A%3Chtml+xmlns%3D%22http%3A%2F%2Fwww.w3.org%2F1999%2Fxhtml%22+%3E%0D%0A%3Chead%3E%3Ctitle%3E%0D%0A%09Strava.cz%0D%0A%3C%2Ftitle%3E%3Clink+rel%3D%22shortcut+icon%22+href%3D%22..%2FGrafika%2Ffavicon.ico%22+type%3D%22image%2Fx-icon%22+%2F%3E%3Clink+rel%3D%22stylesheet%22+type%3D%22text%2Fcss%22+media%3D%22screen%22+href%3D%22..%2FStyly%2FZaklad.css%22+%2F%3E%0D%0A++++%3Cstyle+type%3D%22text%2Fcss%22%3E%0D%0A++++++++.style1%0D%0A++++++++%7B%0D%0A++++++++++++width%3A+47px%3B%0D%0A++++++++%7D%0D%0A++++++++.style2%0D%0A++++++++%7B%0D%0A++++++++++++width%3A+64px%3B%0D%0A++++++++%7D%0D%0A++++%3C%2Fstyle%3E%0D%0A%0D%0A%3Cscript+type%3D%22text%2Fjavascript%22%3E%0D%0A%0D%0A++var+_gaq+%3D+_gaq+%7C%7C+%5B%5D%3B%0D%0A++_gaq.push%28%5B
EDIT: Sorry, I left this question for a long time. Finally I used DOMDocument.
To be sure, I'd split this match into two phases, because you cannot be certain what order the attributes will appear in:
1. Find the relevant input element
2. Get the value
if(preg_match('/<input[^>]+name="__VIEWSTATE"[^>]*>/i', $input, $match))
$value = preg_replace('/.*value="([^"]*)".*/i', '$1', $match[0]);
And, of course, always consider DOM and DOMXpath over regex for parsing html/xml.
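For comparison, a minimal sketch of the DOM route using DOMDocument and DOMXPath. The sample markup and its value are invented; it assumes the dom extension is available and that the field is identified by its name attribute.

```php
<?php
// Invented sample input standing in for the downloaded page.
$html = '<html><body><form>'
      . '<input id="__VIEWSTATE" type="hidden" value="dDwtMTIz" name="__VIEWSTATE">'
      . '</form></body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // real-world HTML is rarely valid; collect warnings silently
$doc->loadHTML($html);
libxml_clear_errors();

// The attribute order in the source no longer matters.
$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//input[@name="__VIEWSTATE"]');
$viewstate = $nodes->length > 0 ? $nodes->item(0)->getAttribute('value') : null;

echo $viewstate, "\n";
```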
You should only capture when you're planning on using the data, so most of the () in that pattern are unnecessary. That's not the cause of the failure, but I thought I'd mention it.
Instead of using [^"] to exclude that character, you could use the non-greedy modifier, ?. It makes the pattern match as little as it can. Since you have name="__VIEWSTATE" following the value, this should be safe.
Let's put this in practice and simplify the pattern some. This works as you want:
'/.*<input\s+id="__VIEWSTATE"\s+type="hidden"\s+value="(.+?)"\s+name="__VIEWSTATE">.*/'
I would strongly recommend looking at a DOM parser as an alternative to regexes for this kind of operation. It keeps your code working even if the attributes change order, and it's much nicer to work with.
The main mistake was using the function preg_replace, which returns the subject string (with the replacements applied), neither the matched text nor the replacement on its own. Thank you for your ideas and for the recommendation of DOMDocument. m93a
http://www.php.net/manual/en/function.preg-replace.php#refsect1-function.preg-replace-returnvalues
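The difference in return values can be sketched as follows. The input and pattern are simplified stand-ins (not the original page): preg_replace hands back the whole subject with the matched part swapped out, while preg_match fills an array with the captured groups.

```php
<?php
// Invented multi-line input; the value "abc123" is a placeholder.
$html = "<html>\n<input id=\"__VIEWSTATE\" type=\"hidden\" value=\"abc123\" name=\"__VIEWSTATE\">\n</html>";
$pattern = '/value="([^"]*)"\s+name="__VIEWSTATE"/';

// preg_replace: the entire document comes back, only the matched span is replaced.
$replaced = preg_replace($pattern, 'X', $html);

// preg_match: returns 1 on a match and stores the capture groups in $m.
$found = preg_match($pattern, $html, $m) ? $m[1] : null;

echo $found, "\n";
```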
I have a string in which I replace all occurrences of [CODE]...[/CODE]. With preg_replace_callback I can call a function which handles the content of those tags. But how can I manipulate the text around those occurrences?
Example:
$str = "Hello, I am a string with [CODE]some code[/CODE] in it";
Now, with preg_replace_callback I manipulate the content of [CODE], in this case some code. But I'd like to do something different with all the other text in this string, i.e. Hello, I am a string with and in it. What is the best way to do this?
Thank you for your help!
Flo
It'd be simpler if I could see the regex, but the gist is that I think you want capture groups.
You should be able to access those regions separately by placing them into parenthesis-wrapped groups. Each section will be available to your callback. So (crudely) something like ~(.*)(\[CODE\].*\[/CODE\])(.*)~ should pass an array of matches to your callback. (A delimiter other than / is used here so that the / in [/CODE] does not end the pattern early.)
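An alternative that copes with several [CODE] blocks in one string is to split on the tags with preg_split and PREG_SPLIT_DELIM_CAPTURE. This is only a sketch: the function name is hypothetical, and strtoupper and the <pre> wrapper are placeholders for whatever the two treatments really are.

```php
<?php
// Hypothetical helper: treats text inside and outside [CODE]...[/CODE] differently.
function transformAroundCode($str) {
    // With DELIM_CAPTURE the captured code contents land at odd indexes,
    // and the surrounding text at even indexes (empty pieces are kept, so parity holds).
    $parts = preg_split('~\[CODE\](.*?)\[/CODE\]~s', $str, -1, PREG_SPLIT_DELIM_CAPTURE);

    $out = '';
    foreach ($parts as $i => $part) {
        $out .= ($i % 2 === 1)
            ? '<pre>' . $part . '</pre>'   // placeholder treatment for code
            : strtoupper($part);           // placeholder treatment for everything else
    }
    return $out;
}

$str = "Hello, I am a string with [CODE]some code[/CODE] in it";
echo transformAroundCode($str), "\n";
```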
Calling all the PHP helpers out there.
So basically I would like to give the function preg_match a variable (which can contain a couple of thousand lines of code) and have it search using a wildcard with fixed strings on either side of the wildcard.
For example I would like to search for strings that look like this <a href="*.pdf">
I would then like the function to return every match (along with the html shiz around the wildcard, this is to catch any directory structures too) in an array that I can loop through using a foreach(){} loop.
I'm guessing this is possible, would anyone have the time to help me with this?
I've checked through all the preg_match lit' and through the answers on here, but I can't seem to get the patterns correct. Thanks in advance.
Peace out.
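unset($matches);
// PREG_SET_ORDER makes each $match one complete match, so $match[0] is the full matched text
preg_match_all('/<a href="[^"]+\.pdf">/', $text, $matches, PREG_SET_ORDER);
foreach ($matches as $match)
{
    $shiz = $match[0];
    // Your code here ...
}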
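If only the path is needed rather than the whole tag, a capture group around it makes it available in the loop as well. A small sketch with invented sample HTML, assuming the links look exactly like <a href="...pdf"> with no other attributes:

```php
<?php
// Invented sample input; only two of the three links point at PDFs.
$text = '<p><a href="docs/a.pdf">A</a> <a href="b.html">B</a> <a href="/files/2013/c.pdf">C</a></p>';

// Group 1 captures the path, including any directories in front of the file name.
preg_match_all('/<a href="([^"]+\.pdf)">/', $text, $matches, PREG_SET_ORDER);

$paths = array();
foreach ($matches as $match) {
    // $match[0] is the full tag, $match[1] just the path.
    $paths[] = $match[1];
}

print_r($paths);
```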