$html = file_get_contents("1.html");
eval("print \"" . addcslashes(preg_replace("/(---(.+?)---)/", "\\2", $html), '"') . "\";");
This searches a string and replaces ---$variable--- with the value of $variable.
How can I rewrite the script so that it searches for ---$_SESSION['variable']--- and replaces with $_SESSION['variable']?
You could just change the replacement to:
preg_replace("/(---\\\$_SESSION\\['(.+?)'\\]---)/", "\${\$_SESSION['\\2']}", $html)
but I wouldn't at all recommend it. As always, eval is a big clue you're doing something wrong.
Non-templating uses of $ in 1.html or the session variable will cause errors. Arbitrary code in 1.html or the session variable can be executed via the ${...} syntax, potentially compromising your server. Less-than signs or ampersands in the session variable will be output as-is, leading to cross-site-scripting attacks.
A better strategy is to keep the string as just a string, not a PHP command. Find the ---...--- sections and replace those separately:
$parts= preg_split('/---(.+?)---/', $html, -1, PREG_SPLIT_DELIM_CAPTURE);
for ($i= 1; $i<count($parts); $i+= 2) {
    $part= trim($parts[$i]);
    // strpos() returns false when nothing is found, so compare with === rather than ==
    if (strpos($part, "\$_SESSION['")===0) {
        $key= stripcslashes(substr($part, 11, -2));
        $parts[$i]= htmlspecialchars($_SESSION[$key], ENT_QUOTES);
    }
}
$html= implode('', $parts);
(Not tested, but should be along the right lines. You may not want htmlspecialchars if you really want your variables to contain active HTML; this is not usually the case.)
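The same idea can also be written with preg_replace_callback, which keeps the lookup and the escaping in one place. This is only a minimal sketch: the function name fill_session_placeholders is made up, and the session data is passed in as a plain array (an assumption, so that the example runs without an active session). Unknown keys are left untouched rather than replaced with an empty string.

```php
<?php
// Hypothetical helper: $session stands in for $_SESSION.
function fill_session_placeholders(string $html, array $session): string
{
    return preg_replace_callback(
        // Matches ---$_SESSION['key']--- and captures the key.
        '/---\s*\$_SESSION\[\'([^\']+)\'\]\s*---/',
        function (array $m) use ($session): string {
            if (!array_key_exists($m[1], $session)) {
                return $m[0]; // leave unknown placeholders exactly as written
            }
            // Escape the value so it cannot inject markup into the page.
            return htmlspecialchars((string) $session[$m[1]], ENT_QUOTES);
        },
        $html
    );
}

echo fill_session_placeholders("Hello ---\$_SESSION['user']---", ['user' => 'Tom & Jerry']), "\n";
```

Nothing from the file or the session is ever executed as PHP here, so the eval problems described above do not apply.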
The function you need is preg_quote(). But before I post any code here: Are you really really really sure your $html or your $_SESSION['variable'] contains no malicious strings like $(cat /etc/passwd)? If you are, double-check. If you still are, go ahead using this:
preg_replace("/(---" . preg_quote($_SESSION['variable'], '/') . "---)/", "\\2", $html)
I'm localizing a website that I've built. I'm doing this by having a .lang file read and each line (syntax: key=string) is placed in a variable depending on the chosen language.
This array is then used to place the strings in the correct places.
The problem I'm having is that certain strings need to have hyperlinks in the middle of them for example someplace I've put my name that links to my contact page. Or a lot of the readouts of the website need to be in the strings.
To solve this I've defined a variable that holds the html + Forecaster + html,
and the localization file contains the $Forecaster variable in the string.
The problem with this as I promptly discovered is that it stubbornly refuses to parse the inline variables in the strings from the file.
Instead it prints the string and variable name as it looks in the file.
And I have yet to find a way to make it parse the variables.
For example "Heating up took $str_time" would be printed on the page exactly like that, instead of inputting the previously defined value of $str_time.
I currently use fopen() and fgets() to open and read the lines. I then explode them to separate the key and the string and then place these into the array.
Is there a way to make it parse the variables, or alternatively is there another way of reading the lines that allows for parsing the inline variables?
The code that gets the line and converts it to the array looks like this:
(It obviously loops through the lines)
#list($key, $string) = explode('=', $line);
$key = strtok($line, '=');
$string = strtok('=');
$local[$key] = $string;
$counter++;
echo $local[$key] . "<br>";
The counter is unused and the echo is for testing.
A line from the .lang file looks like this:
fuel.results.heatup.timeused=Heating up took $str_time
I would call the array where I want the string like this:
$local['fuel.results.heatup.timeused']
As you can see I've tried both explode and strtok but it hasn't made a difference.
Personally I'd write your text file in JSON format to make it easier to pull data out.
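As a rough sketch of what that could look like: the JSON content and the {str_time} placeholder style below are assumptions for illustration, not something taken from the question. The file is decoded once, and the placeholder is filled in at the point where the string is used.

```php
<?php
// Hypothetical contents of a JSON language file (inlined here so the sketch is runnable).
$json = '{"fuel.results.heatup.timeused": "Heating up took {str_time}"}';

$local = json_decode($json, true);
if (!is_array($local)) {
    throw new RuntimeException('Language file is not valid JSON: ' . json_last_error_msg());
}

// Fill the placeholder when the string is needed; strtr() does a plain text substitution.
$message = strtr($local['fuel.results.heatup.timeused'], ['{str_time}' => '12 minutes']);
echo $message, "\n";
```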
Here is a solution directly from the php manual: http://nz2.php.net/manual/en/function.eval.php
$string = 'cup';
$name = 'coffee';
$str = 'This is a $string with my $name in it.';
echo $str. "\n";
eval("\$str = \"$str\";");
echo $str. "\n";
It is worth noting that eval() can be very dangerous when used in the wrong way, so make sure your code is very secure. For example, if someone altered your txt file to contain real PHP code, it would be executed directly on the server.
Another approach would require you to know all your variable names and could then do something like:
$str = 'Heating up took $str_time';
echo 'str=' . str_replace('$str_time', $str_time, $str);
Or do this via an array:
$str = 'Heating up took $str_time as well as $other_value';
$vars = Array('str_time', 'other_value');
foreach($vars as $varName) {
$str = str_replace('$' . $varName, $$varName, $str);
}
echo 'str=' . $str;
If you do not know all the variable names, you can use this example, which works without eval(). Avoiding eval() is recommended.
$str = 'fuel.results.heatup.timeused=Heating up took $str_time';
$str_time = 'value';
if(preg_match('/\$([a-z0-9_]+)/i', $str, $v)) {
$vname = $v[1];
$str = str_replace('$'.$vname, $$vname, $str);
}
echo $str; // fuel.results.heatup.timeused=Heating up took value
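A variation on this, offered as a sketch only: preg_replace_callback handles every $name in the string instead of just the first one, and the values come from an explicit array rather than from variable variables, so a language file can only reach the values you choose to expose. The function name interpolate is hypothetical.

```php
<?php
// Hypothetical helper: replaces each $name in $template with $values['name'].
function interpolate(string $template, array $values): string
{
    return preg_replace_callback(
        '/\$([a-zA-Z_][a-zA-Z0-9_]*)/',
        function (array $m) use ($values): string {
            // Names that are not in the array are left as they were.
            return array_key_exists($m[1], $values) ? (string) $values[$m[1]] : $m[0];
        },
        $template
    );
}

echo interpolate('Heating up took $str_time as well as $other_value', ['str_time' => '5 min']), "\n";
```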
Basically I need a regular expression to match all double-quoted strings inside PHP tags that do not contain a variable.
Here's what I have so far:
"([^\$\n\r]*?)"(?![\w ]*')
and replace with:
'$1'
However, this would match things outside PHP tags as well, e.g HTML attributes.
Example case:
Here's my "dog's website"
<?php
$somevar = "someval";
$somevar2 = "someval's got a quote inside";
?>
<?php
$somevar3 = "someval with a $var inside";
$somevar4 = "someval " . $var . 'with concatenated' . $variables . "inside";
$somevar5 = "this php tag doesn't close, as it's the end of the file...";
it should match and replace all places where the " should be replaced with a ', this means that html attributes should ideally be left alone.
Example output after replace:
Here's my "dog's website"
<?php
$somevar = 'someval';
$somevar2 = 'someval\'s got a quote inside';
?>
<?php
$somevar3 = "someval with a $var inside";
$somevar4 = 'someval ' . $var . 'with concatenated' . $variables . 'inside';
$somevar5 = 'this php tag doesn\'t close, as it\'s the end of the file...';
It would also be great to be able to match inside script tags too...but that might be pushing it for one regex replace.
I need a regex approach, not a PHP approach. Let's say I'm using regex-replace in a text editor or JavaScript to clean up the PHP source code.
tl;dr
This is really too complex to be done with regex. Especially not a simple regex. You might have better luck with nested regex, but you really need to lex/parse to find your strings, and then you could operate on them with a regex.
Explanation
You can probably manage to do this.
You can probably even manage to do this well, maybe even perfectly.
But it's not going to be easy.
It's going to be very very difficult.
Consider this:
Welcome to my php file. We're not "in" yet.
<?php
/* Ok. now we're "in" php. */
echo "this is \"stringa\"";
$string = 'this is \"stringb\"';
echo "$string";
echo "\$string";
echo "this is still ?> php.";
/* This is also still ?> php. */
?> We're back <?="out"?> of php. <?php
// Here we are again, "in" php.
echo <<<STRING
How do "you" want to \""deal"\" with this STRING;
STRING;
echo <<<'STRING'
Apparently this is \\"Nowdoc\\". I've never used it.
STRING;
echo "And what about \\" . "this? Was that a tricky '\"' to catch?";
// etc...
Forget matching variable names in double quoted strings.
Can you just match all of the string in this example?
It looks like a nightmare to me.
SO's syntax highlighting certainly won't know what to do with it.
Did you consider that variables may appear in heredoc strings as well?
I don't want to think about the regex to check if:
Inside <?php or <?= code
Not in a comment
Inside a quoted quote
What type of quoted quote?
Is it a quote of that type?
Is it preceded by \ (escaped)?
Is the \ escaped??
etc...
Summary
You can probably write a regex for this.
You can probably manage with some backreferences and lots of time and care.
It's going to be hard and you're probably going to waste a lot of time, and if you ever need to fix it, you aren't going to understand the regex you wrote.
See also
This answer. It's worth it.
Here's a function that utilizes the tokenizer extension to apply preg_replace to PHP strings only:
function preg_replace_php_string($pattern, $replacement, $source) {
$replaced = '';
foreach (token_get_all($source) as $token) {
if (is_string($token)){
$replaced .= $token;
continue;
}
list($id, $text) = $token;
if ($id === T_CONSTANT_ENCAPSED_STRING) {
$replaced .= preg_replace($pattern, $replacement, $text);
} else {
$replaced .= $text;
}
}
return $replaced;
}
In order to achieve what you want, you can call it like this:
<?php
$filepath = "script.php";
$file = file_get_contents($filepath);
$replaced = preg_replace_php_string('/^"([^$\{\n<>\']+?)"$/', '\'$1\'', $file);
echo $replaced;
The regular expression that's passed as the first argument is the key here. It tells the function to only transform strings to their single-quoted equivalents if they do not contain $ (embedded variable "$a"), { (embedded variable type 2 "{$a[0]}"), a new line, < or > (HTML tag end/open symbols). It also checks if the string contains a single-quote, and prevents the replacement to avoid situations where it would need to be escaped.
While this is a PHP solution, it's the most accurate one. The closest you can get with any other language would require you to build your own PHP parser in that language to some degree in order for your solution to be accurate.
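For the case in the question where the string contains an apostrophe that needs escaping, a tokenizer-based sketch could look like the following. The function name requote_simple_strings is made up, and the rule it applies is a deliberately conservative assumption: a double-quoted string is only converted when it contains no backslash at all, so escape sequences such as \n are never altered. Strings with embedded variables are not T_CONSTANT_ENCAPSED_STRING tokens, so they are skipped automatically.

```php
<?php
// Hypothetical helper: turns "plain" double-quoted PHP strings into single-quoted ones.
function requote_simple_strings(string $source): string
{
    $out = '';
    foreach (token_get_all($source) as $token) {
        if (is_string($token)) {      // single-character tokens such as ; or =
            $out .= $token;
            continue;
        }
        [$id, $text] = $token;
        if ($id === T_CONSTANT_ENCAPSED_STRING
            && $text[0] === '"'
            && strpos($text, '\\') === false) {
            // No escape sequences inside, so only apostrophes need escaping.
            $text = "'" . str_replace("'", "\\'", substr($text, 1, -1)) . "'";
        }
        $out .= $text;
    }
    return $out;
}
```

Text outside the PHP tags arrives as a single inline-HTML token and is copied through unchanged, so HTML attributes are left alone.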
I'm building a PHP script to minify CSS/Javascript, which (obviously) involves getting rid of comments from the file. Any ideas how to do this? (Preferably, I need to get rid of /**/ and // comments)
Pattern for removing comments in JS
$pattern = '/((?:\/\*(?:[^*]|(?:\*+[^*\/]))*\*+\/)|(?:\/\/.*))/';
Pattern for removing comments in CSS
$pattern = '!/\*[^*]*\*+([^/][^*]*\*+)*/!';
$str = preg_replace($pattern, '', $str);
I hope the above helps someone.
Reference: http://castlesblog.com/2010/august/14/php-javascript-css-minification
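One caveat with patterns like these: they do not know about string literals, so a // inside a quoted URL is treated as the start of a comment. A small character scanner avoids that particular problem. The sketch below is only an illustration (strip_comments is a made-up name). It copies quoted strings verbatim and drops /* */ and // comments, but it does not understand JavaScript regex literals or template literals, and an unquoted url(http://...) in CSS would still be damaged.

```php
<?php
// Hypothetical helper: removes /* */ and // comments while leaving quoted strings intact.
function strip_comments(string $src): string
{
    $out = '';
    $n = strlen($src);
    $i = 0;
    while ($i < $n) {
        $c = $src[$i];
        $next = $i + 1 < $n ? $src[$i + 1] : '';
        if ($c === '"' || $c === "'") {
            // Copy the whole string literal, honouring backslash escapes.
            $j = $i + 1;
            while ($j < $n && $src[$j] !== $c) {
                if ($src[$j] === '\\') {
                    $j++;
                }
                $j++;
            }
            $out .= substr($src, $i, $j - $i + 1);
            $i = $j + 1;
        } elseif ($c === '/' && $next === '*') {
            // Block comment: skip to just after the closing */ (or to the end if unterminated).
            $end = strpos($src, '*/', $i + 2);
            $i = $end === false ? $n : $end + 2;
        } elseif ($c === '/' && $next === '/') {
            // Line comment: skip to the newline, which is kept.
            $end = strpos($src, "\n", $i);
            $i = $end === false ? $n : $end;
        } else {
            $out .= $c;
            $i++;
        }
    }
    return $out;
}
```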
That wheel has been invented -- https://github.com/mrclay/minify.
PLEASE NOTE - the following approach will not work in all possible scenarios. Test before using in production.
Without preg patterns or anything similar, this can be done with PHP's built-in tokenizer. All three languages (PHP, JS and CSS) share the same way of representing comments in source files, and PHP's native token_get_all() function (without the TOKEN_PARSE flag) can do the dirty trick even if the input string isn't well-formed PHP code, which is exactly what one might need. All it asks for is <?php at the start of the string, and the magic happens. :)
<?php
function no_comments (string $tokens)
{ // Remove all block and line comments in css/js files with PHP tokenizer.
$remove = [];
$suspects = ['T_COMMENT', 'T_DOC_COMMENT'];
$iterate = token_get_all ('<?php '. PHP_EOL . $tokens);
foreach ($iterate as $token)
{
if (is_array ($token))
{
$name = token_name ($token[0]);
$chr = substr($token[1],0,1);
if (in_array ($name, $suspects)
&& $chr !== '#') $remove[] = $token[1];
}
}
return str_replace ($remove, '', $tokens);
}
The usage goes something like this:
echo no_comments ($myCSSorJsStringWithComments);
Take a look at minify, a "heavy regex-based removal of whitespace, unnecessary comments and tokens."
I have fooled around with regex but can't seem to get it to work. I have a file called includes/header.php. I am converting the file into one big string so that I can pull out a certain portion of the code to paste into the HTML of my document.
$str = file_get_contents('includes/header.php');
From here I am trying to return only the string that starts with <ul class="home"> and ends with </ul>.
Try as I may to figure out an expression, I am still confused.
Once I trim down the string I can just print it on my page, but I can't figure out the trimming part.
If you need something really hardcore, http://www.php.net/manual/en/book.xmlreader.php.
If you just want to rip out the text that fits that pattern try something like this.
$string = "stuff<ul class=\"home\">alsdkjflaskdvlsakmdf<another></another></ul>stuff";
if( preg_match( '/<ul class="home">(.*)<\/ul>/', $string, $match ) ) {
//do stuff with $match[0]
}
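One thing to watch with a pattern like this is greediness: .* runs to the last </ul> it can find, and without the s modifier it will not cross line breaks. The short sketch below (the sample markup is made up) shows the difference between the greedy and the lazy form when two lists follow each other.

```php
<?php
// Made-up sample input with two lists in a row.
$html = '<ul class="home"><li>a</li></ul><ul class="other"><li>b</li></ul>';

// Greedy: .* extends to the LAST </ul> in the input.
preg_match('/<ul class="home">.*<\/ul>/s', $html, $greedy);

// Lazy: .*? stops at the FIRST </ul> after the opening tag.
preg_match('/<ul class="home">.*?<\/ul>/s', $html, $lazy);

echo $greedy[0], "\n";
echo $lazy[0], "\n";
```

Neither form copes with a nested <ul> inside the home list: the lazy one stops too early and the greedy one may run too far.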
I'm assuming that the difficulty you're having has to do with escaping the regex special characters in the string(s) you're using as a delimiter. If so, try using the preg_quote() function:
$start = preg_quote('<ul class="home">');
$end = preg_quote('</ul>', '/');
preg_match("/" . $start. '.*' . $end . "/", $str, $matching_html_snippets);
The html you want should be in $matching_html_snippets[0]
You probably want an XML parser such as the built in one. Here is an example you might want to take a look at.
http://www.php.net/manual/en/function.xml-parse.php#90733
If you want to use regex then something along the lines of
$str = file_get_contents('includes/header.php');
$matchedstr = preg_match("<place your pattern here>", $str, $matches);
You probably want the pattern
'/<ul class="home">.*?<\/ul>/s'
Where $matches will contain an array of the matches it found ($matchedstr only holds the number of matches, 0 or 1), so you can grab whatever element you want from the array with
$matches[0];
which will return the full matched text. And then output that.
But I'd be a bit wary: regular expressions tend to match surprising edge cases, and you need to feed them actual data to find out reliably when they fail. However, if you are just passing templates it should be OK; just do some tests and see if it all works. If not, I'd still recommend using the PHP XML Parser.
Hope that helps.
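For the parser route, a minimal sketch using PHP's DOM extension might look like this. It swaps in DOMDocument and DOMXPath rather than the xml_parse functions linked above, because they tolerate ordinary HTML. The function name extract_home_list is hypothetical, and the XPath test assumes the class attribute is exactly "home". Note that a header file containing PHP code would need to be rendered to HTML first.

```php
<?php
// Hypothetical helper: returns the markup of the first <ul class="home">, or null if there is none.
function extract_home_list(string $html): ?string
{
    $doc = new DOMDocument();
    // Real-world markup is rarely valid; collect parse warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('//ul[@class="home"]');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    // Serialise just that element, including any nested lists.
    return $doc->saveHTML($nodes->item(0));
}
```

Unlike the regex versions, this handles a <ul> nested inside the home list correctly.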
If you feel like not using regexes you could use string finding, which I think the PHP manual implies is quicker:
function substrstr($orig, $startText, $endText) {
//get first occurrence of the start string
$start = strpos($orig, $startText);
//get last occurrence of the end string
$end = strrpos($orig, $endText);
if($start === FALSE || $end === FALSE)
return $orig;
//skip past the whole start string so it is excluded from the result
$start += strlen($startText);
$length = $end - $start;
return substr($orig, $start, $length);
}
$substr = substrstr($string, '<ul class="home">', '</ul>');
You'll need to make some adjustments if you want to include the terminating strings in the output, but that should get you started!
Here's a novel way to do it; I make no guarantees about this technique's robustness or performance, other than it does work for the example given:
$prefix = '<ul class="home">';
$suffix = '</ul>';
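// explode() results go into variables because array_pop()/array_shift() take their argument by reference
$afterPrefix = explode($prefix, $str);
$pieces = explode($suffix, array_pop($afterPrefix));
$result = $prefix . array_shift($pieces) . $suffix;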
Would it be possible to make a regex that reads {variable} like <?php echo $variable ?> in PHP files?
Thanks
Remy
The PHP manual already provides a regular expression for variable names:
[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*
You just have to alter it to this:
\{[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*\}
And you’re done.
Edit: You should be aware that a simple sequential replacement of such occurrences, as Ross proposed, can cause some unwanted behavior when, for example, a substitution also contains such variables.
So it is better to parse the code and replace those variables separately. An example:
$tokens = preg_split('/(\{[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*\})/', $string, -1, PREG_SPLIT_DELIM_CAPTURE);
for ($i=1, $n=count($tokens); $i<$n; $i+=2) {
$name = substr($tokens[$i], 1, -1);
if (isset($variables[$name])) {
$tokens[$i] = $variables[$name];
} else {
// Error: variable missing
}
}
$string = implode('', $tokens);
It sounds like you're trying to do some template variable replacement ;)
I'd advise collecting your variables first, in an array for example, and then use something like:
// Variables are stored in $vars which is an array
foreach ($vars as $name => $value) {
$str = str_replace('{' . $name . '}', $value, $str);
}
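If a value can itself contain something that looks like {name}, a loop of str_replace calls will substitute it again on a later iteration. strtr with an array makes a single pass over the string and never rescans text it has already inserted. A small sketch (the function name render_template is made up):

```php
<?php
// Hypothetical helper: replaces {name} placeholders in one pass.
function render_template(string $template, array $vars): string
{
    $pairs = [];
    foreach ($vars as $name => $value) {
        $pairs['{' . $name . '}'] = (string) $value;
    }
    // strtr() with an array does not re-examine replaced text.
    return strtr($template, $pairs);
}

echo render_template('{a} and {b}', ['a' => '{b}', 'b' => 'B']), "\n";
```

With the str_replace loop the same input would come out as "B and B", because the {b} inserted for {a} gets replaced in the next iteration.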
{Not actually an answer, but need clarification}
Could you expand your question? Are you wanting to apply a regex to the contents of $variable?
The following line should replace all occurrences of the string '{variable}' with the value of the global variable $variable:
$mystring = preg_replace_callback(
'/\{([a-zA-Z][\w\d]+)\}/',
function ($matches) { return $GLOBALS[$matches[1]]; }, // create_function() was removed in PHP 8
$mystring);
Edit: Replace the regex used here by the one mentioned by Gumbo to precisely catch all possible PHP variable names.
(in comments) i want to be able to type {variable}
instead of <?php echo $variable ?>
Primitive approach: You could use an external program (e.g. a Python script) to preprocess your files, making the following regex substitution:
"{([a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*)}"
with
"<?php echo $\g<1> ?>"
Better approach: Write a macro in your IDE or code editor to automatically make the substitution for you.
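The preprocessing step does not need an external program; it can be sketched in PHP itself. The function name compile_template below is hypothetical. It uses the variable-name pattern from the manual quoted earlier and builds the replacement in a callback, so there is no need to worry about how $ is interpreted inside a replacement string.

```php
<?php
// Hypothetical helper: rewrites {variable} into a PHP echo tag for that variable.
function compile_template(string $source): string
{
    return preg_replace_callback(
        '/\{([a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*)\}/',
        function (array $m): string {
            return '<?php echo $' . $m[1] . ' ?>';
        },
        $source
    );
}

echo compile_template('Hello {name}!'), "\n";
```

The output would then be saved and included as a normal PHP file. Braces that do not enclose a valid variable name, such as those in inline CSS or JavaScript with spaces inside, are left alone.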