I'm building a PHP script to minify CSS/JavaScript, which (obviously) involves stripping comments from the file. Any ideas how to do this? (Ideally it should remove both /* */ and // comments.)
Pattern for removing comments in JS:
$pattern = '/((?:\/\*(?:[^*]|(?:\*+[^*\/]))*\*+\/)|(?:\/\/.*))/';
Pattern for removing comments in CSS:
$pattern = '!/\*[^*]*\*+([^/][^*]*\*+)*/!';
$str = preg_replace($pattern, '', $str);
I hope this helps someone.
Reference: http://castlesblog.com/2010/august/14/php-javascript-css-minification
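As a quick check of the JS pattern above, here is a minimal sketch that runs it on two small inputs. The sample strings and variable names are made up for illustration. The second case shows a limitation worth knowing about: the pattern has no notion of string literals, so a `//` inside a quoted URL is treated as the start of a comment.

```php
<?php
// JS comment pattern copied from the answer above.
$jsPattern = '/((?:\/\*(?:[^*]|(?:\*+[^*\/]))*\*+\/)|(?:\/\/.*))/';

// Hypothetical sample input: one line comment and one block comment.
$js = "var a = 1; // note\n/* block */var b = 2;";
$stripped = preg_replace($jsPattern, '', $js);
echo $stripped, "\n";

// Limitation: "//" inside a string literal is also stripped.
$url = 'var u = "http://example.com";';
$broken = preg_replace($jsPattern, '', $url);
echo $broken, "\n"; // the URL is cut off after "http:"
```

If the input may contain URLs or regex literals, a tokenizing minifier is probably the safer choice.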
That wheel has been invented -- https://github.com/mrclay/minify.
PLEASE NOTE - the following approach will not work in all possible scenarios. Test before using in production.
Without preg patterns or anything similar, this can be done with PHP's built-in tokenizer. All three languages (PHP, JS and CSS) represent comments in source files in much the same way, and PHP's native token_get_all() function (without the TOKEN_PARSE flag) can do the dirty work even if the input string isn't well-formed PHP code, which is exactly what is needed here. All it asks for is <?php at the start of the string, and the magic happens. :)
<?php
function no_comments (string $tokens)
{ // Remove all block and line comments in css/js files with PHP tokenizer.
$remove = [];
$suspects = ['T_COMMENT', 'T_DOC_COMMENT'];
$iterate = token_get_all ('<?php '. PHP_EOL . $tokens);
foreach ($iterate as $token)
{
if (is_array ($token))
{
$name = token_name ($token[0]);
$chr = substr($token[1],0,1);
if (in_array ($name, $suspects)
&& $chr !== '#') $remove[] = $token[1];
}
}
return str_replace ($remove, '', $tokens);
}
The usage goes something like this:
echo no_comments ($myCSSorJsStringWithComments);
Take a look at minify, a "heavy regex-based removal of whitespace, unnecessary comments and tokens."
$text = "
<tag>
<html>
HTML
</html>
</tag>
";
I want to replace all the text present inside the tags with htmlspecialchars(). I tried this:
$regex = '/<tag>(.*?)<\/tag>/s';
$code = preg_replace($regex,htmlspecialchars($regex),$text);
But it doesn't work.
I am getting the output as htmlspecialchars of the regex pattern. I want to replace it with htmlspecialchars of the data matching with the regex pattern.
What should I do?
You're replacing the match with the pattern itself; you're not using back-references or the e flag. In this case, though, preg_replace_callback is the way to go:
$code = preg_replace_callback($regex,'htmlspecialchars',$text);
This passes the matched groups to htmlspecialchars and uses its return value as the replacement. The callback receives the groups as an array, which htmlspecialchars cannot handle directly, so wrap it in a function instead. You can try either:
function replaceCallback($matches)
{
if (is_array($matches))
{
$matches = implode ('', array_slice($matches, 1));//first element is full string
}
return htmlspecialchars($matches);
}
Or, if your PHP version permits it:
preg_replace_callback($regex, function($matches)
{
$return = '';
for ($i=1, $j = count($matches); $i<$j;$i++)
{//loop like this, skips first index, and allows for any number of groups
$return .= htmlspecialchars($matches[$i]);
}
return $return;
}, $text);
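For reference, a minimal self-contained sketch of the callback approach, using the sample text and pattern from the question. It assumes a single capture group, so only `$m[1]` is escaped; note that the `<tag>` wrapper itself is dropped because the whole match is replaced.

```php
<?php
// Sample input and pattern taken from the question.
$text  = "<tag>\n<html>\nHTML\n</html>\n</tag>";
$regex = '/<tag>(.*?)<\/tag>/s';

// The callback always receives an array: $m[0] is the full match,
// $m[1] is the text captured between <tag> and </tag>.
$code = preg_replace_callback($regex, function (array $m): string {
    return htmlspecialchars($m[1]);
}, $text);

echo $code;
```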
Try any of the above until you find something that works. Incidentally, if all you want to remove is <tag> and </tag>, why not go for the much faster:
echo htmlspecialchars(str_replace(array('<tag>','</tag>'), '', $text));
That's just keeping it simple, and it'll almost certainly be faster, too.
See the quickest, easiest way in action here
If you want to isolate the actual contents as defined by your pattern, you could use preg_match($regex,$text,$hits);. This gives you an array of hits: the bits that were between the parentheses in the pattern start at $hits[1], while $hits[0] contains the whole matched string. You can then manipulate these matches, possibly using htmlspecialchars, and combine them again into $code.
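A rough sketch of that preg_match route, assuming only the first `<tag>` block needs handling (the sample string is made up for illustration):

```php
<?php
// Hypothetical sample input with one <tag> block.
$text  = "before <tag><b>bold</b></tag> after";
$regex = '/<tag>(.*?)<\/tag>/s';

$code = $text; // fall back to the untouched text if nothing matches
if (preg_match($regex, $text, $hits)) {
    // $hits[0] is the whole match, $hits[1] the captured contents.
    $code = str_replace($hits[0], htmlspecialchars($hits[1]), $text);
}

echo $code;
```

For several `<tag>` blocks, preg_match_all or preg_replace_callback would be a better fit.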
I'm struggling to find the best way to do this. Basically I am provided strings that are like this with the task of printing out the string with the math parsed.
Jack has a [0.8*100]% chance of passing the test. Katie has a [(0.25 + 0.1)*100]% chance.
The mathematical equations are always encapsulated by square brackets. Why I'm dealing with strings like this is a long story, but I'd really appreciate the help!
There are plenty of math evaluation libraries for PHP. A quick web search turns up this one.
Writing your own parser is also an option, and if it's just basic arithmetic it shouldn't be too difficult. With the resources out there, I'd stay away from this.
You could take a simpler approach and use eval. Be careful to sanitize your input first. On the eval documentation page there are user comments with code to do that. Here's one example:
Disclaimer: I know eval is just a misspelling of evil, and it's a horrible horrible thing, and all that. If used right, it has uses, though.
<?php
$test = '2+3*pi';
// Remove whitespaces
$test = preg_replace('/\s+/', '', $test);
$number = '(?:\d+(?:[,.]\d+)?|pi|π)'; // What is a number
$functions = '(?:sinh?|cosh?|tanh?|abs|acosh?|asinh?|atanh?|exp|log10|deg2rad|rad2deg|sqrt|ceil|floor|round)'; // Allowed PHP functions
$operators = '[+\/*\^%-]'; // Allowed math operators
$regexp = '/^(('.$number.'|'.$functions.'\s*\((?1)+\)|\((?1)+\))(?:'.$operators.'(?2))?)+$/'; // Final regexp, heavily using recursive patterns
if (preg_match($regexp, $test)) // validate the expression in $test before evaluating it
{
$test = preg_replace('!pi|π!', 'pi()', $test); // Replace pi with pi function
eval('$result = '.$test.';');
}
else
{
$result = false;
}
?>
preg_match_all('/\[(.*?)\]/', $string, $out);
foreach ($out[1] as $k => $v)
{
eval("\$result = $v;");
$string = str_replace($out[0][$k], $result, $string);
}
This code is highly dangerous if the strings come from user input, because it allows arbitrary code to be executed.
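If eval is off the table, a small hand-written parser is one alternative. The following is only a sketch under stated assumptions: it supports +, -, *, /, parentheses, unary minus and plain decimal numbers, nothing else. The function names (parseSum, parseProduct, parseFactor, evalExpr, renderMath) are hypothetical and chosen for illustration.

```php
<?php
// Recursive-descent evaluator: sum -> product -> factor.
function parseSum(string $s, int &$i): float
{
    $value = parseProduct($s, $i);
    while ($i < strlen($s) && ($s[$i] === '+' || $s[$i] === '-')) {
        $op  = $s[$i++];
        $rhs = parseProduct($s, $i);
        $value = $op === '+' ? $value + $rhs : $value - $rhs;
    }
    return $value;
}

function parseProduct(string $s, int &$i): float
{
    $value = parseFactor($s, $i);
    while ($i < strlen($s) && ($s[$i] === '*' || $s[$i] === '/')) {
        $op  = $s[$i++];
        $rhs = parseFactor($s, $i);
        if ($op === '/' && $rhs == 0) {
            throw new InvalidArgumentException('Division by zero');
        }
        $value = $op === '*' ? $value * $rhs : $value / $rhs;
    }
    return $value;
}

function parseFactor(string $s, int &$i): float
{
    if ($i < strlen($s) && $s[$i] === '-') {          // unary minus
        $i++;
        return -parseFactor($s, $i);
    }
    if ($i < strlen($s) && $s[$i] === '(') {          // parenthesised group
        $i++;
        $value = parseSum($s, $i);
        if ($i >= strlen($s) || $s[$i] !== ')') {
            throw new InvalidArgumentException('Missing closing parenthesis');
        }
        $i++;
        return $value;
    }
    if (preg_match('/\G\d+(?:\.\d+)?/', $s, $m, 0, $i)) { // number at offset $i
        $i += strlen($m[0]);
        return (float) $m[0];
    }
    throw new InvalidArgumentException("Unexpected input at offset $i");
}

function evalExpr(string $expr): float
{
    $s = preg_replace('/\s+/', '', $expr);            // ignore whitespace
    $i = 0;
    $value = parseSum($s, $i);
    if ($i !== strlen($s)) {
        throw new InvalidArgumentException("Unexpected input at offset $i");
    }
    return $value;
}

// Replace every [expression] in the text with its computed value.
function renderMath(string $text): string
{
    return preg_replace_callback('/\[([^\[\]]+)\]/', function (array $m): string {
        return (string) round(evalExpr($m[1]), 10);   // round away float noise
    }, $text);
}

echo renderMath('Jack has a [0.8*100]% chance of passing the test. Katie has a [(0.25 + 0.1)*100]% chance.');
```

Anything the grammar does not recognise raises an exception instead of being executed, which is the main point of avoiding eval here.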
The eval approach updated from PHP doc examples.
<?php
function calc($equation)
{
// Remove whitespaces
$equation = preg_replace('/\s+/', '', $equation);
echo "$equation\n";
$number = '((?:0|[1-9]\d*)(?:\.\d*)?(?:[eE][+\-]?\d+)?|pi|π)'; // What is a number
$functions = '(?:sinh?|cosh?|tanh?|acosh?|asinh?|atanh?|exp|log(10)?|deg2rad|rad2deg|sqrt|pow|abs|intval|ceil|floor|round|(mt_)?rand|gmp_fact)'; // Allowed PHP functions
$operators = '[\/*\^\+-,]'; // Allowed math operators
$regexp = '/^([+-]?('.$number.'|'.$functions.'\s*\((?1)+\)|\((?1)+\))(?:'.$operators.'(?1))?)+$/'; // Final regexp, heavily using recursive patterns
if (preg_match($regexp, $equation))
{
$equation = preg_replace('!pi|π!', 'pi()', $equation); // Replace pi with pi function
echo "$equation\n";
eval('$result = '.$equation.';');
}
else
{
$result = false;
}
return $result;
}
?>
Sounds like your homework, but whatever.
You need to use string manipulation. PHP has a lot of built-in functions for that, so you're in luck. Check out the explode() function for sure, and str_split().
Here is a full list of functions specifically related to strings: http://www.w3schools.com/php/php_ref_string.asp
Good Luck.
I'm trying to develop a function that can sort through a string that looks like this:
Donny went to the {park|store|{beach with friends|beach alone}} so he could get a breath of fresh air.
What I intend to do is search the text recursively for {} patterns that contain no { or } inside them, so that only the innermost text is selected. I will then split the contents into an array in PHP and select one item at random, repeating the process until the whole string has been parsed and a complete sentence is produced.
I just cannot wrap my head around regular expressions though.
Appreciate any help!
Don't know about maths theory behind this ;-/ but in practice that's quite easy. Try
$text = "Donny went to the {park|store|{beach with friends|beach alone}} so he could get a breath of fresh air. ";
function rnd($matches) {
$words = explode('|', $matches[1]);
return $words[rand() % count($words)];
}
do {
$text = preg_replace_callback('~{([^{}]+)}~', 'rnd', $text, -1, $count);
} while($count > 0);
echo $text;
Regexes are not capable of counting and therefore cannot find matching brackets reliably.
What you need is a grammar.
See this related question.
$str="Donny went to the {park|store|{beach {with friends}|beach alone}} so he could get a breath of fresh air. ";
$s = explode("}",$str);
foreach($s as $v){
if(strpos($v,"{")!==FALSE){
$t=explode("{",$v);
print end($t)."\n";
}
}
output
$ php test.php
with friends
Regular expressions don't deal well with recursive stuff, but PHP does:
$str = 'Donny went to the {park|store|{beach with friends|beach alone}} so he could get a breath of fresh air.';
echo parse_string($str), "\n";
function parse_string($string) {
if ( preg_match('/\{([^{}]+)\}/', $string, $matches) ) {
$inner_elements = explode('|', $matches[1]);
$random_element = $inner_elements[array_rand($inner_elements)];
$string = str_replace($matches[0], $random_element, $string);
$string = parse_string($string);
}
return $string;
}
You could do this with a lexer/parser. I don't know of any options in PHP (but since there are XML parsers in PHP, there are no doubt generic parsers). On the other hand, what you're asking to do is not too complicated. Using strings in PHP (substring, etc.) you could probably do this in a few recursive functions.
You will then finally have created a MadLibz generator in PHP with a simple grammar. Pretty cool.
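For what it's worth, here is one way the "few recursive functions" idea could look without any regex. This is only a sketch: the names spin and expandSpin are hypothetical, and the chooser is passed in as a callable (an assumption made here so the behaviour can be checked deterministically). Unbalanced braces are handled leniently rather than reported.

```php
<?php
// Scans the string once; each "{" starts a recursive call that collects
// the "|"-separated alternatives up to the matching "}" and picks one.
function expandSpin(string $s, int &$i, callable $pick, bool $inGroup): string
{
    $options = [''];                        // alternatives at this nesting level
    $n = strlen($s);
    while ($i < $n) {
        $ch = $s[$i++];
        if ($ch === '{') {
            // Nested group: append its chosen text to the current alternative.
            $options[count($options) - 1] .= expandSpin($s, $i, $pick, true);
        } elseif ($ch === '|' && $inGroup) {
            $options[] = '';                // start the next alternative
        } elseif ($ch === '}' && $inGroup) {
            return $pick($options);         // group closed: choose one
        } else {
            $options[count($options) - 1] .= $ch;
        }
    }
    // End of input: at top level there is exactly one "alternative".
    return $inGroup ? $pick($options) : $options[0];
}

function spin(string $text, callable $pick): string
{
    $i = 0;
    return expandSpin($text, $i, $pick, false);
}

$str = 'Donny went to the {park|store|{beach with friends|beach alone}} so he could get a breath of fresh air.';

// Random choice, as in the question.
echo spin($str, function (array $o) { return $o[array_rand($o)]; }), "\n";
```

Passing a fixed chooser (always first, always last) makes it easy to test each branch of the template.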
$html = file_get_contents("1.html");
eval("print \"" . addcslashes(preg_replace("/(---(.+?)---)/", "\\2", $html), '"') . "\";");
This searches a string and replaces ---$variable--- with $variable.
How can I rewrite the script so that it searches for ---$_SESSION['variable']--- and replaces with $_SESSION['variable']?
You could just change the replacement to:
preg_replace("/(---\\\$_SESSION\\['(.+?)'\\]---)/", "\${\$_SESSION['\\2']}", $html)
but I wouldn't at all recommend it. As always, eval is a big clue you're doing something wrong.
Non-templating uses of $ in 1.html or the session variable will cause errors. Arbitrary code in 1.html or the session variable can be executed via the ${...} syntax, potentially compromising your server. Less-than signs or ampersands in the session variable will be output as-is, leading to cross-site-scripting attacks.
A better strategy is to keep the string as just a string, not a PHP command. Find the ---...--- sections and replace those separately:
$parts= preg_split('/---(.+?)---/', $html, -1, PREG_SPLIT_DELIM_CAPTURE); // -1 means no limit
for ($i= 1; $i<count($parts); $i+= 2) {
$part= trim($parts[$i]);
if (strpos($part, "\$_SESSION['")===0) { // strict check: strpos returns false when not found
$key= stripcslashes(substr($part, 11, -2));
$parts[$i]= htmlspecialchars($_SESSION[$key], ENT_QUOTES);
}
}
$html= implode('', $parts);
(Not tested, but should be along the right lines. You may not want htmlspecialchars if you really want your variables to contain active HTML; this is not usually the case.)
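The same idea can also be sketched with preg_replace_callback. Assumptions in this sketch: the function name fillPlaceholders is hypothetical, the session data is passed in as a plain array (so the snippet runs without session_start()), and unknown keys are left untouched.

```php
<?php
// Replaces ---$_SESSION['key']--- with the escaped value of $session['key'].
function fillPlaceholders(string $html, array $session): string
{
    return preg_replace_callback(
        '/---\s*\$_SESSION\[\'([^\']+)\'\]\s*---/',
        function (array $m) use ($session): string {
            $key = $m[1];
            if (!array_key_exists($key, $session)) {
                return $m[0];               // leave unknown placeholders as-is
            }
            return htmlspecialchars((string) $session[$key], ENT_QUOTES);
        },
        $html
    );
}

// Hypothetical sample data standing in for 1.html and $_SESSION.
$html = '<p>Hello ---$_SESSION[\'name\']---</p>';
echo fillPlaceholders($html, ['name' => '<b>Al</b>']);
```

Nothing from the template or the session is ever executed as PHP, and the value is HTML-escaped on the way out.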
The function you need is preg_quote(). But before I post any code here: Are you really really really sure your $html or your $_SESSION['variable'] contains no malicious strings like $(cat /etc/passwd)? If you are, double-check. If you still are, go ahead using this:
preg_replace("/(---" . preg_quote($_SESSION['variable'], '/') . "---)/", "\\2", $html)
Would it be possible to make a regex that reads {variable} like <?php echo $variable ?> in PHP files?
Thanks
Remy
The PHP manual already provides a regular expression for variable names:
[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*
You just have to alter it to this:
\{[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*\}
And you’re done.
Edit: You should be aware that a simple sequential replacement of such occurrences, as Ross proposed, can cause unwanted behavior, for example when a substituted value itself contains such variables.
So it is better to parse the code and replace those variables separately. An example:
$tokens = preg_split('/(\{[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*\})/', $string, -1, PREG_SPLIT_DELIM_CAPTURE);
for ($i=1, $n=count($tokens); $i<$n; $i+=2) {
$name = substr($tokens[$i], 1, -1);
if (isset($variables[$name])) {
$tokens[$i] = $variables[$name];
} else {
// Error: variable missing
}
}
$string = implode('', $tokens);
It sounds like you're trying to do some template variable replacement ;)
I'd advise collecting your variables first, in an array for example, and then use something like:
// Variables are stored in $vars which is an array
foreach ($vars as $name => $value) {
$str = str_replace('{' . $name . '}', $value, $str);
}
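To make the difference between the sequential loop and a single-pass replacement concrete, here is a small sketch. The template and values are made up for illustration: the value of `a` deliberately contains the text `{b}`.

```php
<?php
// Hypothetical data: the value of "a" looks like a placeholder itself.
$vars     = ['a' => '{b}', 'b' => 'B'];
$template = '{a} and {b}';

// Sequential replacement: the "{b}" inserted for {a} is replaced again.
$seq = $template;
foreach ($vars as $name => $value) {
    $seq = str_replace('{' . $name . '}', $value, $seq);
}

// Single pass: each placeholder in the original template is replaced once,
// and inserted values are never rescanned.
$single = preg_replace_callback(
    '/\{([a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*)\}/',
    function (array $m) use ($vars): string {
        return array_key_exists($m[1], $vars) ? $vars[$m[1]] : $m[0];
    },
    $template
);

echo $seq, "\n";    // B and B
echo $single, "\n"; // {b} and B
```

Which behaviour is wanted depends on the use case, but with user-supplied values the single-pass form is usually the less surprising one.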
{Not actually an answer, but need clarification}
Could you expand your question? Are you wanting to apply a regex to the contents of $variable?
The following line should replace all occurrences of the string '{variable}' with the value of the global variable $variable:
$mystring = preg_replace_callback(
'/\{([a-zA-Z]\w*)\}/',
function ($matches) { return $GLOBALS[$matches[1]]; }, // closure instead of create_function(), which was removed in PHP 8
$mystring);
Edit: Replace the regex used here by the one mentioned by Gumbo to precisely catch all possible PHP variable names.
(in comments) I want to be able to type {variable}
instead of <?php echo $variable ?>
Primitive approach: You could use an external program (e.g. a Python script) to preprocess your files, making the following regex substitution:
"{([a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*)}"
with
"<?php echo $\g<1> ?>"
Better approach: Write a macro in your IDE or code editor to automatically make the substitution for you.
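The preprocessing step could also be done in PHP itself. A minimal sketch, assuming the template source is available as a string; the function name compileTemplate is hypothetical, and the output is not HTML-escaped, just as in the original `<?php echo $variable ?>`.

```php
<?php
// Rewrites {name} into a PHP echo tag for the variable $name.
function compileTemplate(string $source): string
{
    return preg_replace_callback(
        '/\{([a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*)\}/',
        function (array $m): string {
            return '<?php echo $' . $m[1] . ' ?>';
        },
        $source
    );
}

echo compileTemplate('<h1>{title}</h1>');
```

The compiled output would then be saved and included like any other PHP file.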