I'm creating a very simple PHP templating system for a custom CMS/news system. Each article has the intro and the full content (if applicable). I want to have a conditional of the form:
{continue}click to continue{/continue}
So if it's possible to continue, the text "click to continue" will display, otherwise, don't display this whole string. So far, it works like this: if we can continue, remove the {continue} tags; otherwise, remove the whole thing.
I'm currently using preg_match_all to match the string in the second case. What is the best function for removing the matched text? A simple str_replace, or something else?
Or is there a better way to implement this overall?
Why not use preg_replace_callback?
Specifically:
preg_replace_callback('!\{continue\}(.*)\{/continue\}!Us', 'replace_continue', $html);
function replace_continue($matches) {
if (/* can continue */) {
return $matches[1];
} else {
return '';
}
}
I find preg_replace_callback to be incredibly useful.
Sorry I know people have been ridiculing this kind of answer, but just use Smarty. Its simple, stable, clever, free, small and cheap. Spend an hour learning how to use it and you will never look back.
Go to www.smarty.net
For PHP templating systems the proper way is to parse the template (using state machine or at least preg_split), generate PHP code from it, and then use that PHP code only. Then you'll be able to use normal if for conditional expressions.
Regular expressions aren't good idea for implementing templates. It will be PITA to handle nesting and enforce proper syntax beyond basic cases. Performance will be very poor (you have to be careful not to create backtracking expression and even then you'll end up scanning and copying KBs of templates several times).
Related
I am keeping record of every request made to my website. I am very aware of the security measurements that need to be taken before executing any MySQL query that contains data coming from query strings. I clean it as much as possible from injections and so far all tests have been successful using:
htmlspecialchars, strip_tags, mysqli_real_escape_string.
But on the logs of pages visited I find query strings of failed hack attempts that contain a lot of php code:
?1=%40ini_set%28"display_errors"%2C"0"%29%3B%40set_time_limit%280%29%3B%40set_magic_quotes_runtime%280%29%3Becho%20%27->%7C%27%3Bfile_put_contents%28%24_SERVER%5B%27DOCUMENT_ROOT%27%5D.%27/webconfig.txt.php%27%2Cbase64_decode%28%27PD9waHAgZXZhb
In the previous example we can see:
display_errors, set_time_limit, set_magic_quotes_runtime, file_put_contents
Another example:
/?s=/index/%5Cthink%5Capp/invokefunction&function=call_user_func_array&vars[0]=file_put_contents&vars[1][]=ctlpy.php&vars[1][]=<?php #assert($_REQUEST["ysy"]);?>ysydjsjxbei37$
This one is worst, there is even some <?php and $_REQUEST["ysy"] stuff in there. Although I am able to sanitize it, strip tags and encode < or > when I decode the string I can see the type of requests that are being sent.
Is there any way to detect a string that contains php code like:
filter_var($var, FILTER_SANITIZE_PHP);
FYI: This is not a real function, I am trying to give an idea of what I am looking for.
or some sort of function:
function findCode($var){
return ($var contains PHP) ? true : false
}
Again, not real
No need to sanitize, that has been taken care of, just to detect PHP code in a string. I need this because I want to detect them and save them in other logs.
NOTE: NEVER EXECUTE OR EVAL CODE COMING FROM QUERY STRINGS
After reading lots of comments #KIKO Software came up with an ingenious idea by using PHP tokenizer, but it ended up being extremely difficult because the string that is to be analyzed needed to have almost prefect syntax or it would fail.
So the best solution that I came up with is a simple function that tries to find commonly used PHP statements, In my case, especially on query strings with code injection. Another advantage of this solution is that we can modify and add to the list as many PHP statements as we want. Keep in mind that making the list bigger will considerably slow down your script. this functions uses strpos instead of preg_match (regex ) as its proven to perform faster.
This will not find 100% PHP code inside a string, but you can customize it to find as much as is required, never include terms that could be used in regular English, like 'echo' or 'if'
function findInStr($string, $findarray){
$found=false;
for($i=0;$i<sizeof($findarray);$i++){
$res=strpos($string,$findarray[$i]);
if($res !== false){
$found=true;
break;
}
}
return $found;
}
Simply use:
$search_line=array(
'file_put_contents',
'<?=',
'<?php',
'?>',
'eval(',
'$_REQUEST',
'$_POST',
'$_GET',
'$_SESSION',
'$_SERVER',
'exec(',
'shell_exec(',
'invokefunction',
'call_user_func_array',
'display_errors',
'ini_set',
'set_time_limit',
'set_magic_quotes_runtime',
'DOCUMENT_ROOT',
'include(',
'include_once(',
'require(',
'require_once(',
'base64_decode',
'file_get_contents',
'sizeof',
'array('
);
if(findInStr("this has some <?php echo 'PHP CODE' ?>",$search_line)){
echo "PHP found";
}
For a new project I've just started, I thought it might be a good idea to adhere to a styleguide for my code. I had sort of settled on a style for myself, but could use a little more structure, since there were a few things that varied between my projects and even sometimes within projects.
Now I've settled on this styleguide: http://www.php-fig.org/psr/psr-2/
Question is though, it says to limit lines of code to 80 characters, so I set up a ruler at that limit in ST3. I'm just not sure what a good practice is in regard to splitting code over multiple lines. How would you split the code below (it's already indented by 8 spaces)?
$this->errorMessage = (isset($this->errorDefinitions[$errorNo])) ? $this->errorDefinitions[$errorNo] : $errorMessage;
Or would it be better to conform to the guide by abandoning the shorthand expression and just writing:
if (isset($this->errorDefinitions[$errorNo])) {
$this->errorMessage = $this->errorDefinitions[$errorNo];
} else {
$this->errorMessage = $errorMessage;
}
There's nothing in the styleguide on this subject. Can anyone point me in the right direction or just tell me where I can find more information on the 'proper' way of doing it. I realize there might not be consensus on the way to do it, but I'd like to read your opinions.
You could break up your shorthand across lines:
$this->errorMessage = isset($this->errorDefinitions[$errorNo])
? $this->errorDefinitions[$errorNo]
: $errorMessage;
Just keep in mind that shorthand ternary statements are awesome... when they are short and simple! But once your conditions become longer you are just making your code harder to read, understand, and ultimately maintain. That's why PSR-2 has the line limit (in part).
Your code is a bit in the fuzzy area from my perspective. Breaking it up like above is ok, but if it was any more complicated at all (called a function, etc) I'd say abandon the shorthand and go with the if.
Ultimately it's your call, as you pointed out PSR-2 doesn't land on this issue.
I don't know if it's just me or not, but I am allergic to one line ifs in any c like language, I always like to see curly brackets after an if, so instead of
if($a==1)
$b = 2;
or
if($a==1) $b = 2;
I'd like to see
if($a==1){
$b = 2;
}
I guess I can support my preference by arguing that the first one is more prone to errors, and it has less readability.
My problem right now is that I'm working on a code that is packed with these one line ifs, I was wondering if there is some sort of utility that will help me correct these ifs, some sort of php code beautifier that would do this.
I was also thinking of developing some sort of regex that could be used with linux's sed command, to accomplish this, but I'm not sure if that's even possible given that the regex should match one line ifs, and wrap them with curley brackets, so the logic would be to find an if and the conditional statement, and then look for { following the condition, if not found then wrap the next line, or the next set of characters before a line break and then wrap it with { and }
What do you think?
Your best bet is probably to use the built-in PHP tokenizer, and to parse the resulting token stream.
See this answer for more information about the PHP Tokenizer: https://stackoverflow.com/a/5642653/1005039
You can also take a look at a script I wrote to parse PHP source files to fix another common problem in legacy code, namely to fix unquoted array indexes:
https://github.com/GustavBertram/php-array-index-fixer/blob/master/aif.php
The script uses a state machine instead of a generalized parser, but in your case it might be good enough.
I want to use PHP to replace javascript functions in HTML documents. For example:
original:
function my_function(hey) {
do stuff
}
new:
function new_function(hi) {
do different stuff
}
I was thinking of using regular expressions with the ereg_replace function, but I'm not sure if this is the best approach. The code in each function is pretty long... plus regex looks very daunting to me at the moment. Any suggestions?
This is the kind of thing that perl is built for. Spend a day learning how to slurp in files and take a good look at the many regex tools out there, and you'll have yourself a script that you can modify to use in the future, if need be, but is otherwise throwaway, which is good, since you probably wrote it crappily, being your first one and all.
Looks like you trying to do something pointless. Put a new function and call it. Just do no use the old one, once you anyway want to replace the entire body with it's name.
I'm looking to improve my PHP coding and am wondering what PHP-specific techniques other programmers use to improve productivity or workaround PHP limitations.
Some examples:
Class naming convention to handle namespaces: Part1_Part2_ClassName maps to file Part1/Part2/ClassName.php
if ( count($arrayName) ) // handles $arrayName being unset or empty
Variable function names, e.g. $func = 'foo'; $func($bar); // calls foo($bar);
Ultimately, you'll get the most out of PHP first by learning generally good programming practices, before focusing on anything PHP-specific. Having said that...
Apply liberally for fun and profit:
Iterators in foreach loops. There's almost never a wrong time.
Design around class autoloading. Use spl_autoload_register(), not __autoload(). For bonus points, have it scan a directory tree recursively, then feel free to reorganize your classes into a more logical directory structure.
Typehint everywhere. Use assertions for scalars.
function f(SomeClass $x, array $y, $z) {
assert(is_bool($z))
}
Output something other than HTML.
header('Content-type: text/xml'); // or text/css, application/pdf, or...
Learn to use exceptions. Write an error handler that converts errors into exceptions.
Replace your define() global constants with class constants.
Replace your Unix timestamps with a proper Date class.
In long functions, unset() variables when you're done with them.
Use with guilty pleasure:
Loop over an object's data members like an array. Feel guilty that they aren't declared private. This isn't some heathen language like Python or Lisp.
Use output buffers for assembling long strings.
ob_start();
echo "whatever\n";
debug_print_backtrace();
$s = ob_get_clean();
Avoid unless absolutely necessary, and probably not even then, unless you really hate maintenance programmers, and yourself:
Magic methods (__get, __set, __call)
extract()
Structured arrays -- use an object
My experience with PHP has taught me a few things. To name a few:
Always output errors. These are the first two lines of my typical project (in development mode):
ini_set('display_errors', '1');
error_reporting(E_ALL);
Never use automagic. Stuff like autoLoad may bite you in the future.
Always require dependent classes using require_once. That way you can be sure you'll have your dependencies straight.
Use if(isset($array[$key])) instead of if($array[$key]). The second will raise a warning if the key isn't defined.
When defining variables (even with for cycles) give them verbose names ($listIndex instead of $j)
Comment, comment, comment. If a particular snippet of code doesn't seem obvious, leave a comment. Later on you might need to review it and might not remember what it's purpose is.
Other than that, class, function and variable naming conventions are up to you and your team. Lately I've been using Zend Framework's naming conventions because they feel right to me.
Also, and when in development mode, I set an error handler that will output an error page at the slightest error (even warnings), giving me the full backtrace.
Fortunately, namespaces are in 5.3 and 6. I would highly recommend against using the Path_To_ClassName idiom. It makes messy code, and you can never change your library structure... ever.
The SPL's autoload is great. If you're organized, it can save you the typical 20-line block of includes and requires at the top of every file. You can also change things around in your code library, and as long as PHP can include from those directories, nothing breaks.
Make liberal use of === over ==. For instance:
if (array_search('needle',$array) == false) {
// it's not there, i think...
}
will give a false negative if 'needle' is at key zero. Instead:
if (array_search('needle',$array) === false) {
// it's not there!
}
will always be accurate.
See this question: Hidden Features of PHP. It has a lot of really useful PHP tips, the best of which have bubbled up to the top of the list.
There are a few things I do in PHP that tend to be PHP-specific.
Assemble strings with an array.
A lot of string manipulation is expensive in PHP, so I tend to write algorithms that reduce the discrete number of string manipulations I do. The classic example is building a string with a loop. Start with an array(), instead, and do array concatenation in the loop. Then implode() it at the end. (This also neatly solves the trailing-comma problem.)
Array constants are nifty for implementing named parameters to functions.
Enable NOTICE, and if you realy want to STRICT error reporting. It prevents a lot of errors and code smell: ini_set('display_errors', 1); error_reporting(E_ALL && $_STRICT);
Stay away from global variables
Keep as many functions as possible short. It reads easier, and is easy to maintain. Some people say that you should be able to see the whole function on your screen, or, at least, that the beginning and end curly brackets of loops and structures in the function should both be on your screen
Don't trust user input!
I've been developing with PHP (and MySQL) for the last 5 years. Most recently I started using a framework (Zend) with a solid javascript library (Dojo) and it's changed the way I work forever (in a good way, I think).
The thing that made me think of this was your first bullet: Zend framework does exactly this as it's standard way of accessing 'controllers' and 'actions'.
In terms of encapsulating and abstracting issues with different databases, Zend_Db this very well. Dojo does an excellent job of ironing out javascript inconsistencies between different browsers.
Overall, it's worth getting into good OOP techniques and using (and READING ABOUT!) frameworks has been a very hands-on way of getting to understand OOP issues.
For some standalone tools worth using, see also:
Smarty (template engine)
ADODB (database access abstraction)
Declare variables before using them!
Get to know the different types and the === operator, it's essential for some functions like strpos() and you'll start to use return false yourself.