Making a simple templating engine in PHP - php

I need to write a simple PHP function to replace text between {{ }} characters with their respective data.
Example:
String: "and with strange aeons even {{noun}} may {{verb}}"
$data = ['noun' => 'bird', 'verb' => 'fly'];
Result:
"and with strange aeons even bird may fly"
I have it almost working with the following code based on preg_replace_callback
function compile($str,$data){
foreach ($data as $k => $v) {
$pattern = '/\b(?<!\-)(' . $k . ')\b(?!-)/i';
$str = preg_replace_callback($pattern, function($m) use($v){
return $v;
}, $str);
}
return $str;
}
But I can't seem to account for the {{ }}.
The result looks like this:
"and with strange aeons even {{bird}} may {{fly}}"
How can I adjust the regex and/or code to account for the double curly brackets?
Also, before anyone asks why I'm trying to do this manually rather than use PHP itself or the Smarty plugin: it's too narrow a use case to install a plugin, and I cannot use PHP itself because the input string is coming in as raw text from a database. I need to compile that raw text with data stored in a PHP array.

Since you're looping anyway, keep it simple:
foreach ($data as $k => $v) {
$str = str_ireplace('{{'.$k.'}}', $v, $str);
}
You can add a space before {{ and after }} if needed.

You can use
$str = "and with strange aeons even {{noun}} may {{verb}}";
$data = ['noun' => 'bird', 'verb' => 'fly'];
$pattern = '/{{(' . implode('|', array_keys($data)) . ')}}/i';
echo preg_replace_callback($pattern, function($m) use($data){
return $data[strtolower($m[1])];
}, $str);
// => and with strange aeons even bird may fly
The $pattern will look like /{{(noun|verb)}}/i, and will match noun or verb inside double braces while capturing the word itself. The replacement will be the corresponding key value of the $data array. Turning the Group 1 value to lower case with strtolower($m[1]) is required since the keys in the $data array are all lowercase, and the $pattern can match uppercase variants, too.
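As a variation on the approach above, here is a minimal sketch that matches any {{word}} placeholder rather than building the alternation from the keys, so the keys never need escaping. The function name renderTemplate() and the choice to leave unknown placeholders untouched are assumptions for illustration, not part of the original answer.

```php
<?php
// Hypothetical helper (not from the original answer): replaces {{key}}
// placeholders with values from $data, matching keys case-insensitively.
function renderTemplate(string $template, array $data): string
{
    // Normalise keys to lower case so {{Noun}} and {{noun}} both resolve.
    $data = array_change_key_case($data, CASE_LOWER);

    return preg_replace_callback(
        '/\{\{\s*(\w+)\s*\}\}/',
        function (array $m) use ($data) {
            $key = strtolower($m[1]);
            // Assumption: unknown placeholders are left as-is rather than removed.
            return array_key_exists($key, $data) ? (string) $data[$key] : $m[0];
        },
        $template
    );
}

echo renderTemplate(
    'and with strange aeons even {{noun}} may {{verb}}',
    ['noun' => 'bird', 'verb' => 'fly']
);
// prints: and with strange aeons even bird may fly
```

Because the pattern captures only \w+ between the braces, nothing from $data ever ends up inside the regex.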

Make use of strtr() and call it a day:
$string = 'and with strange aeons even {{noun}} may {{verb}}';
$data = ['{{noun}}' => 'bird', '{{verb}}' => 'fly'];
echo strtr( $string, $data );
produces:
and with strange aeons even bird may fly
strtr() is nice because it won't mess up the string in the event of:
$data = ['{{noun}}' => 'bi{{verb}}rd', '{{verb}}' => 'fly'];
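To make the difference concrete, the following sketch compares strtr() with an array-based str_replace() on that same data. The variable names $single and $sequential are only for illustration.

```php
<?php
$string = '{{noun}} may {{verb}}';
$data = ['{{noun}}' => 'bi{{verb}}rd', '{{verb}}' => 'fly'];

// strtr() scans the input once; text it has already inserted is not searched again.
$single = strtr($string, $data);

// str_replace() with arrays applies each pair in turn to the running result,
// so the {{verb}} inserted by the first replacement is replaced by the second.
$sequential = str_replace(array_keys($data), array_values($data), $string);

echo $single, "\n";     // bi{{verb}}rd may fly
echo $sequential, "\n"; // biflyrd may fly
```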

Related

PHP Regex - match attribute and value in string

I have an array of possible attributes:
$attributes = ['color','size'];
Part of my URL looks like this:
color-light-grey-size-xs
I would need to get an array of attributes and their values, ie:
$values = [
'color' => 'light-grey',
'size' => 'xs'
]
Is that doable with regex?
A Regular Expression which you have to feed its cluster of attributes ORed:
(\w++)(?>-(\w+-?(?(?!color|size)(?-1))*))
^^^^^^^^^^
PHP code:
$str = "color-light-grey-test-size-xs";
$attrs = ['color', 'size'];
$array = [];
preg_replace_callback(
"/(\w++)(?>-(\w+-?(?(?!" . implode("|", $attrs) . ")(?-1))*))/",
function($matches) use (&$array) {
$array[$matches[1]] = rtrim($matches[2], '-');
},
$str
);
print_r($array);
Note: Order is not important at all.
If you know that color always comes first and size second, you can match the values between them like this:
\bcolor\-([\w\d\-]+)-size-([\w\d\-]+)\b
This gives you two capture groups: $1 for color and $2 for size.
If you don't know what the URL will look like, though, this approach breaks down: you would have to know every combination you expect and write a pattern for each one.
Here is live example: https://regex101.com/r/3daBXx/1
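For comparison, here is a sketch of an order-independent alternative that uses a lookahead instead of a recursive pattern. The function name parseAttributes() is hypothetical, and the sketch assumes a value never contains "-attribute-" for one of the known attribute names.

```php
<?php
// Hypothetical helper: splits a slug such as "color-light-grey-size-xs"
// into attribute => value pairs, given the list of known attribute names.
function parseAttributes(string $slug, array $attributes): array
{
    // Escape each attribute name before placing it in the pattern.
    $names = implode('|', array_map(function ($a) {
        return preg_quote($a, '~');
    }, $attributes));

    // A value runs lazily until the next "-attribute-" or the end of the string.
    $pattern = '~(' . $names . ')-(.+?)(?=-(?:' . $names . ')-|$)~';

    preg_match_all($pattern, $slug, $matches, PREG_SET_ORDER);

    $values = [];
    foreach ($matches as $m) {
        $values[$m[1]] = $m[2];
    }
    return $values;
}

print_r(parseAttributes('color-light-grey-size-xs', ['color', 'size']));
```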

PHP - How to search an associative array by matching the key against a regexp

I am currently working on a small script to convert data coming from an external source. Depending on the content I need to map this data to something that makes sense to my application.
A sample input could be:
$input = 'We need to buy paper towels.';
Currently I have the following approach:
// Setup an assoc_array what regexp match should be mapped to which itemId
private $itemIdMap = [ '/paper\stowels/' => '3746473294' ];
// Match the $input ($key) against the $map and return the first match
private function getValueByRegexp($key, $map) {
$match = preg_grep($key, $map);
if (count($match) > 0) {
return $match[0];
} else {
return '';
}
}
This raises the following error on execution:
Warning: preg_grep(): Delimiter must not be alphanumeric or backslash
What am I doing wrong and how could this be solved?
According to the manual, the argument order of preg_grep is:
string $pattern , array $input
In your code, $match = preg_grep($key, $map); passes $key (the input string) as the pattern and $map as the input.
So your call is effectively:
$match = preg_grep(
'We need to buy paper towels.',
[ '/paper\stowels/' => '3746473294' ]
);
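So you are really trying to find the pattern "We need to buy paper towels." in the value 3746473294, and because that "pattern" starts with a letter, you get the delimiter warning.
The first fix would be to swap the arguments and wrap the second one in an array: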
$match = preg_grep($map, array($key));
But then comes the second error: $itemIdMap is an array, and you can't use an array as a regexp. Only scalar values (strictly speaking, strings) can be used. This leads you to:
$match = preg_grep($map['/paper\stowels/'], $key);
Which is definitely not what you want, right?
The solution:
$input = 'We need to buy paper towels.';
$itemIdMap = [
'/paper\stowels/' => '3746473294',
'/other\sstuff/' => '234432',
'/to\sbuy/' => '111222',
];
foreach ($itemIdMap as $k => $v) {
if (preg_match($k, $input)) {
echo $v . PHP_EOL;
}
}
Your mistaken assumption is that preg_grep can test a single string against an array of regexps. It does the opposite: it returns the elements of an array that match one single regexp. So you simply used the wrong function.
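If only the first matching value is needed, as in the question's getValueByRegexp(), the loop can be wrapped in a function that returns early. This is a minimal sketch; the name findFirstMatch() and the empty-string fallback are assumptions carried over from the question's intent.

```php
<?php
// Hypothetical helper: returns the value of the first pattern in
// $patternMap that matches $input, or '' when nothing matches.
function findFirstMatch(string $input, array $patternMap): string
{
    foreach ($patternMap as $pattern => $value) {
        // preg_match() returns 1 on a match, 0 on no match, false on error.
        if (preg_match($pattern, $input) === 1) {
            return $value;
        }
    }
    return '';
}

$itemIdMap = [
    '/other\sstuff/' => '234432',
    '/paper\stowels/' => '3746473294',
];

echo findFirstMatch('We need to buy paper towels.', $itemIdMap); // 3746473294
```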

PHP word censor with keeping the original caps

We want to censor certain words on our site but each word has different censored output.
For example:
PHP => P*P, javascript => j*vascript
(However not always the second letter.)
So we want a simple "one star" censor system that keeps the original capitalization. The data coming from the database is uncensored, so we need the fastest approach possible.
$data="Javascript and php are awesome!";
$word[]="PHP";
$censor[]="H";//the letter we want to replace
$word[]="javascript";
$censor[]="a";//but only once (j*v*script would look weird)
//Of course, if needed, we can use the full censored word in the $censor variables
Expected value:
J*vascript and p*p are awesome!
Thanks for all the answers!
You can put your censored words in a keyed array, where each value is the position of the character to be replaced with * (see the $censor array example below).
$string = 'JavaSCRIPT and pHp are testing test-ground for TEST ŠĐČĆŽ ŠĐčćŽ!';
$censor = [
'php' => 2,
'javascript' => 2,
'test' => 3,
'šđčćž' => 4,
];
function stringCensorSlow($string, array $censor) {
foreach ($censor as $word => $position) {
while (($pos = mb_stripos($string, $word)) !== false) {
$string =
mb_substr($string, 0, $pos + $position - 1) .
'*' .
mb_substr($string, $pos + $position);
}
}
return $string;
}
function stringCensorFast($string, array $censor) {
$pattern = [];
foreach ($censor as $word => $position) {
$word = '~(' . mb_substr($word, 0, $position - 1) . ')' . mb_substr($word, $position - 1, 1) . '(' . mb_substr($word, $position) . ')~iu';
$pattern[$word] = '$1*$2';
}
return preg_replace(array_keys($pattern), array_values($pattern), $string);
}
Use example :
echo stringCensorSlow($string, $censor);
# J*vaSCRIPT and p*p are te*ting te*t-ground for TE*T ŠĐČ*Ž ŠĐč*Ž!
echo stringCensorFast($string, $censor) . "\n";
# J*vaSCRIPT and p*p are te*ting te*t-ground for TE*T ŠĐČ*Ž ŠĐč*Ž!
Speed test :
foreach (['stringCensorSlow', 'stringCensorFast'] as $func) {
$time = microtime(true);
for ($i = 0; $i < 10000; $i++) {
$func($string, $censor);
}
$time = microtime(true) - $time;
echo "{$func}() took $time\n";
}
output on my localhost was :
stringCensorSlow() took 1.9752140045166
stringCensorFast() took 0.11587309837341
Upgrade #1: made the code multibyte-character safe.
Upgrade #2: added example for preg_replace, which is faster than mb_substr. Tnx to AbsoluteƵERØ
Upgrade #3: added speed test loop and result on my local PC machine.
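One thing the regex version above does not do is escape the words before putting them into a pattern. The following sketch shows one way to add that with preg_quote() while keeping the same word => position input. The function name censorWords() is hypothetical, and it assumes each position is a 1-based integer no larger than the word's length.

```php
<?php
// Hypothetical variant: $censor maps each word to the 1-based position
// of the character that should become '*'. Original caps are preserved
// because only that one character of the matched text is changed.
function censorWords(string $text, array $censor): string
{
    foreach ($censor as $word => $position) {
        // preg_quote() keeps characters like '.' or '+' in a word literal.
        $pattern = '~' . preg_quote($word, '~') . '~iu';

        $text = preg_replace_callback(
            $pattern,
            function (array $m) use ($position) {
                // Keep the first ($position - 1) characters, star the next one.
                return preg_replace('~^(.{' . ($position - 1) . '}).~us', '$1*', $m[0]);
            },
            $text
        );
    }
    return $text;
}

echo censorWords('Javascript and php are awesome!', ['php' => 2, 'javascript' => 2]);
// J*vascript and p*p are awesome!
```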
Make an array of words and replacements. This should be your fastest option in terms of processing, but it takes a little more care to set up. When you're setting up your patterns, remember to use the i modifier to make each pattern case insensitive. You could ultimately pull these from a database into the arrays. I've hard-coded the arrays here for the example.
<!DOCTYPE html>
<html>
<meta content="text/html; charset=UTF-8" http-equiv="content-type">
<?php
$word_to_alter = array(
'!(j)a(v)a(script)(s|ing|ed)?!i',
'!(p)h(p)!i',
'!(m)y(sql)!i',
'!(p)(yth)o(n)!i',
'!(r)u(by)!i',
'!(ВЗЛ)О(М)!iu',
);
$alteration = array(
'$1*$2*$3$4',
'$1*$2',
'$1*$2',
'$1$2*$3',
'$1*$2',
'$1*$2',
);
$string = "Welcome to the world of programming. You can learn PHP, MySQL, Python, Ruby, and Javascript all at your own pace. If you know someone who uses javascripting in their daily routine you can ask them about becoming a programmer who writes JavaScripts. взлом прохладно";
$newstring = preg_replace($word_to_alter,$alteration,$string);
echo $newstring;
?>
</html>
Output
Welcome to the world of programming. You can learn P*P, M*SQL, Pyth*n,
R*by, and J*v*script all at your own pace. If you know someone who
uses j*v*scripting in their daily routine you can ask them about
becoming a programmer who writes J*v*Scripts. взл*м прохладно
Update
It works the same with UTF-8 characters. Note that you have to specify the u modifier for the pattern to be treated as UTF-8.
u (PCRE_UTF8)
This modifier turns on additional functionality of PCRE that is incompatible with Perl. Pattern strings are treated as UTF-8. This modifier is available from PHP 4.1.0 or greater on Unix and from PHP 4.2.3 on win32. UTF-8 validity of the pattern is checked since PHP 4.3.5.
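A small sketch of what the u modifier changes in practice: without it, the dot matches a single byte, so a two-byte UTF-8 character does not count as "one character". The variable names are only for illustration.

```php
<?php
$char = "\u{00E9}"; // "é", encoded as two bytes in UTF-8

// Without u, '.' consumes one byte, so ^.$ cannot span the whole string.
$withoutU = preg_match('/^.$/', $char);

// With u, '.' consumes one UTF-8 character.
$withU = preg_match('/^.$/u', $char);

var_dump($withoutU); // int(0)
var_dump($withU);    // int(1)
```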
Why not just use a little helper function and pass it a word and the desired censor?
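function censorWord($word, $censor) {
// strpos() can return 0, so compare against false explicitly
if (strpos($word, $censor) !== false) {
// the limit of 1 replaces only the first occurrence
return preg_replace('/' . preg_quote($censor, '/') . '/', '*', $word, 1);
}
// nothing to censor: return the word unchanged
return $word;
}
echo censorWord("Javascript", "a"); // returns J*vascript
echo censorWord("PHP", "H"); // returns P*P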
Then you can check the word against your wordlist and if it is a word that should be censored, you can pass it to the function. Then, you also always have the original word as well as the censored one to play with or put back in your sentence.
This would also make it easy to change the number of letters censored by just changing the limit argument (the final 1) in the preg_replace call. All you have to do is keep an array of words, explode the sentence on spaces or something similar, and then check in_array. If the word is in the array, send it to censorWord().
And here's a more complete example doing exactly what you said in the OP.
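function censorWord($word, $censor) {
// strpos() can return 0, so compare against false explicitly
if (strpos($word, $censor) !== false) {
return preg_replace('/' . preg_quote($censor, '/') . '/', '*', $word, 1);
}
return $word;
}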
$word_list = ['php','javascript'];
$data = "Javascript and php are awesome!";
$words = explode(" ", $data);
// pass each word by reference so it can be modified inside our array
foreach($words as &$word) {
if(in_array(strtolower($word), $word_list)) {
// this just passes the second letter of the word
// as the $censor argument
$word = censorWord($word, $word[1]);
}
}
echo implode(" ", $words); // returns J*vascript and p*p are awesome!
You could store a lowercase list of the censored words somewhere, and if you're okay with starring the second letter every time, do something like this:
if (in_array(strtolower($word), $censored_words)) {
$word = substr($word, 0, 1) . "*" . substr($word, 2);
}
If you want to change the first occurrence of a letter, you could do something like:
$censored_words = array('javascript' => 'a', 'php' => 'h', 'ruby' => 'b');
$lword = strtolower($word);
if (in_array($lword, array_keys($censored_words))) {
$ind = strpos($lword, $censored_words[$lword]);
$word = substr($word, 0, $ind) . "*" . substr($word, $ind + 1);
}
This is what I would do:
Create a simple database (text file) and make a "table" of all your censored words and expected censored results. E.G.:
PHP --- P*P
javascript --- j*vascript
HTML --- HT*L
Write PHP code to compare the database information to the text you want to censor. You will have to use explode() to split that text into an array of words. Something like this:
/* Opening database of censored words */
$filename = "/files/censored_words.txt";
$file = fopen( $filename, "r" );
if( $file == false )
{
echo ( "Error in opening file" );
exit();
}
/* Creating an array of words from string*/
$data = explode(" ", $data); // What was "Javascript and PHP are awesome!" has
// become "Javascript", "and", "PHP", "are",
// "awesome!". This is useful.
If your script finds matching words, replace the word in your data with the censored word from your list. You would have to delimit the file first by \r\n and finally by ---. (Or whatever you choose for separating your table with.)
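The delimiting step can be sketched as follows, assuming the table format shown earlier ("word --- censored form", one pair per line). The function name parseCensorTable() is hypothetical, and the sketch takes the file contents as a string rather than reading a file. Note that replacing a word with the stored form does not keep the original capitalization.

```php
<?php
// Hypothetical helper: turns "PHP --- P*P" lines into a lookup table
// keyed by the lower-cased word.
function parseCensorTable(string $table): array
{
    $map = [];
    // \R matches any line ending (\r\n, \n or \r).
    foreach (preg_split('/\R/', $table, -1, PREG_SPLIT_NO_EMPTY) as $line) {
        $parts = explode('---', $line, 2);
        if (count($parts) !== 2) {
            continue; // skip lines that are not in "word --- censored" form
        }
        $map[strtolower(trim($parts[0]))] = trim($parts[1]);
    }
    return $map;
}

$map = parseCensorTable("PHP --- P*P\r\njavascript --- j*vascript\r\nHTML --- HT*L");
print_r($map);
```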
Hope this helped!

A bit lost with preg_match regular expression

I'm a beginner in regular expression so it didn't take long for me to get totally lost :]
What I need to do:
I've got a string of values 'a:b,a2:b2,a3:b3,a4:b4' where I need to search for a specific pair of values (ie: a2:b2) by the second value of the pair given (b2) and get the first value of the pair as an output (a2).
All characters are allowed (except ',' which separates each pair of values) and each of the second values (b,b2,b3,b4) is unique (it can't be present more than once in the string).
Let me show a better example as the previous may not be clear:
This is a string: 2 minutes:2,5 minutes:5,10 minutes:10,15 minutes:15,never:0
Searched pattern is: 5
I thought, the best way was to use function called preg_match with subpattern feature.
So I tried the following:
$str = '2 minutes:2,5 minutes:5,10 minutes:10,15 minutes:15,20 minutes:20,30 minutes:30, never:0';
$re = '/(?P<name>\w+):5$/';
preg_match($re, $str, $matches);
echo $matches['name'];
Wanted output was '5 minutes' but it didn't work.
I would also like to stick with Perl-Compatible reg. expressions as the code above is included in a PHP script.
Can anyone help me out? I'm getting a little bit desperate now, as I've spent most of the day on this ...
Thanks to all of you guys.
$str = '2 minutes:2,51 seconds:51,5 minutes:5,10 minutes:10,15 minutes:51,never:0';
$search = 5;
preg_match("~([^,\:]+?)\:".preg_quote($search)."(?:,|$)~", $str, $m);
echo '<pre>'; print_r($m); echo '</pre>';
Output:
Array
(
[0] => 5 minutes:5
[1] => 5 minutes
)
$re = '/(?:^|,)(?P<name>[^:]*):5(?:,|$)/';
Your expression has several problems. It requires $ right after 5, which only works if 5 is the last element. You also need to make sure that 5 is followed either by nothing or by another pair, and that the first element of the pair is preceded by either another pair or the beginning of the string. Finally, the first element of the pair has to match more than \w, since it can contain spaces.
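Wrapped in a function, that pattern might be used as in the sketch below. The name findLabel() is hypothetical, and two small changes are assumptions on my part: the searched value is escaped with preg_quote(), and the name group also excludes commas.

```php
<?php
// Hypothetical helper: returns the label paired with $value in a
// "label:value,label:value" string, or null if the value is not present.
function findLabel(string $pairs, string $value): ?string
{
    $pattern = '/(?:^|,)(?P<name>[^:,]*):' . preg_quote($value, '/') . '(?:,|$)/';

    return preg_match($pattern, $pairs, $m) === 1 ? $m['name'] : null;
}

$str = '2 minutes:2,5 minutes:5,10 minutes:10,15 minutes:15,never:0';
echo findLabel($str, '5'); // 5 minutes
```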
A preg_match call will be shorter for certain, but I think I wouldn't bother with regular expressions, and instead just use string and array manipulations.
$pairstring = '2 minutes:2,5 minutes:5,10 minutes:10,15 minutes:15,20 minutes:20,30 minutes:30, never:0';
function match_pair($searchval, $pairstring) {
$pairs = explode(",", $pairstring);
foreach ($pairs as $pair) {
$each = explode(":", $pair);
if ($each[1] == $searchval) {
echo $each[0];
}
}
}
// Call as:
match_pair(5, $pairstring);
Almost the same as @Michael's. It doesn't search for an element but builds an array from the string. You say that the values are unique, so they are used as keys in my array:
$str = '2 minutes:2,5 minutes:5,10 minutes:10,15 minutes:15,20 minutes:20,30 minutes:30, never:0';
$a = array();
foreach(explode(',', $str) as $elem){
list($key, $val) = explode(':', $elem);
$a[$val] = $key;
}
Then accessing an element is very simple:
echo $a[5];

extracting multiple fields from a text file using php

What is the best way of extracting multiple values (~40) from a text file using PHP?
The data is more or less like:
NAMEA valuea
NAMEB valueb
I'm looking for a proper* approach to extracting this data into a data structure, because I will need to specify regexes for all of them (all 40).
Did I make myself clear?
*Meaning, the default/painful method would be for me to do:
$namea = extractfunction("regexa", $textfilevalue);
$nameb = extractfunction("regeb", $textfilevalue);
... 40 times!
The lines may not be in the same order, or be present in each file. Every NAMEA is text like "Registration Number:" or "Applicant Name:" (i.e., there are spaces in what I was calling NAMEA).
Response to the Col.
I'm looking for a sensible "way" of writing my code, so it's readable, modifiable, builds an object/array that's easily callable, etc... "good coding style!" :)
@Adam - They do actually... and contain slashes as well...
@Alix - Freaking marvelous man! That was GOOD! Would you also happen to have any insights on how I can "truncate" the resultant array by removing everything from "key_x" and beyond? Should I open that as a new question?
Here is my take at it:
somefile.txt:
NAMEA valuea
NAMEB valueb
PHP Code:
$file = file_get_contents('./somefile.txt');
$string = preg_replace('~^(.+?)\s+(.+?)$~m', '$1=$2', $file);
$string = str_replace(array("\r\n", "\r", "\n"), '&', $string);
$result = array();
parse_str($string, $result);
echo '<pre>';
print_r($result);
echo '</pre>';
Output:
Array
(
[NAMEA] => valuea
[NAMEB] => valueb
)
You may also be able to further simplify this by using str_getcsv() on PHP 5.3+.
EDIT: My previous version fails for keys that have spaces, as @Col. Shrapnel noticed. I didn't read the question carefully enough. Since your keys always seem to have : appended, a possible solution is this:
$string = preg_replace('~^(.+?):\s+(.+?)$~m', '$1=$2', $file);
To remove everything from key_x to the end of the file you can do something like this:
$string = substr($string, 0, strpos($string, 'key_x'));
So the whole thing would look like this:
somefile.txt:
Registration Number: valuea
Applicant Name: valueb
PHP Code:
$file = file_get_contents('./somefile.txt');
$string = substr($file, 0, strpos($file, 'key_x'));
$string = preg_replace('~^(.+?):\s+(.+?)$~m', '$1=$2', $string);
$string = str_replace(array("\r\n", "\r", "\n"), '&', $string);
$result = array();
parse_str($string, $result);
echo '<pre>';
print_r($result);
echo '</pre>';
Output:
Array
(
[Registration_Number] => valuea
[Applicant_Name] => valueb
)
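One caveat with the substr()/strpos() truncation above: when 'key_x' is not present, strpos() returns false, which substr() treats as a length of 0, so the whole string is dropped. A guarded version could look like this sketch; the function name truncateBefore() is hypothetical.

```php
<?php
// Hypothetical helper: returns everything before the first occurrence
// of $marker, or the full text when the marker is absent.
function truncateBefore(string $text, string $marker): string
{
    $pos = strpos($text, $marker);

    // strpos() returns false (not 0) when the marker is missing.
    return $pos === false ? $text : substr($text, 0, $pos);
}

echo truncateBefore("Applicant Name: valueb\nkey_x: ignored\n", 'key_x');
```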
As far as I understand it, you can use file() to get an array of lines and then parse those lines with a regexp.
If you add an = sign between names and values, you'll be able to get the whole thing at once using parse_ini_file().
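As a rough illustration of the INI idea, the sketch below uses parse_ini_string() (the string counterpart of parse_ini_file()) on data that already has = signs. The sample content is made up, and INI_SCANNER_RAW is used here as an assumption so that values are returned as plain strings.

```php
<?php
// Sample content in "NAME = value" form (illustrative only).
$text = "NAMEA = valuea\nNAMEB = valueb\n";

// The second argument disables section processing; INI_SCANNER_RAW
// leaves the values unparsed.
$result = parse_ini_string($text, false, INI_SCANNER_RAW);

print_r($result);
```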
Assuming your keys (namea, nameb) never have spaces in them:
$contents = file('some_file.txt'); // read file as array
$data = array();
foreach($contents as $line) { // iterate over file
preg_match('/^([^\s]+)\s+(.*)/', $line, $matches); // pull out key and value into $matches
$key = $matches[1];
$value = $matches[2];
$data[$key] = $value; // store key/value pairs in $data array
}
var_dump($data); // what did we get?
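For keys that do contain spaces and end in a colon, as described in the question, the same loop can split on the first colon instead. This is a sketch under that assumption; the function name parseFields() is hypothetical, it takes the file contents as a string, and lines without a colon are skipped.

```php
<?php
// Hypothetical helper: parses "Key Name: value" lines into an array.
function parseFields(string $text): array
{
    $data = [];
    foreach (preg_split('/\R/', $text) as $line) {
        // The key is everything up to the first colon; the rest is the value.
        if (preg_match('/^([^:]+):\s*(.*)$/', $line, $m) !== 1) {
            continue; // not a "key: value" line
        }
        $data[trim($m[1])] = trim($m[2]);
    }
    return $data;
}

print_r(parseFields("Registration Number: 12/345\nApplicant Name: Jane Doe"));
```

Unlike parse_str(), this keeps the spaces in the keys instead of converting them to underscores.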
