Problem:
I'm looking for a PHP function to easily and efficiently normalise CSV content in a string (not in a file). I have written a function for that and provide it in an answer, because it is a possible solution. Unfortunately, it doesn't work when the separator is included in incoming string values.
Can anyone provide a better solution?
Why not use fputcsv / fgetcsv?
Because:
fputcsv requires at least PHP 5.1.0 (which is sometimes not available)
they only work on files, not on a string, and sometimes the input is not a file (e.g. if you fetch the CSV from an email)
putting the content into a temporary file might not be possible due to security policies.
Why / what kind of normalisation?
Normalise so that the encloser encloses every field, because the encloser can be optional and can differ per line and per field. This can happen when implementing unclean or incomplete specifications and/or using CSV content from different sources, programs, or developers.
Example function call:
$csvContent = "'a a',\"b\",c,1, 2 ,3 \n a a,'bb',cc, 1, 2, 3 ";
echo "BEFORE:\n$csvContent\n";
normaliseCSV($csvContent);
echo "AFTER:\n$csvContent\n";
Output:
BEFORE:
'a a',"b",c,1, 2 ,3
a a,'bb',cc, 1, 2, 3
AFTER:
"a a","b","c","1","2","3"
"a a","bb","cc","1","2","3"
To specifically address your concern regarding f*csv working only with files:
Since PHP 5.3 there's str_getcsv.
For at least PHP >= 5.1 (and I really hope that's the oldest you'll have to deal with these days), you can use stream wrappers:
$buffer = fopen('php://memory', 'r+');
fwrite($buffer, $string);
rewind($buffer);
fgetcsv($buffer) ..
Or obviously the reverse if you want to use fputcsv.
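To tie this back to the normalisation in the question, here is a minimal sketch of the php://memory idea. The function name normaliseCsvViaStream and its parameters are made up for illustration, and it assumes PHP 5.3+ (for the escape argument of fgetcsv) and input that only uses one kind of encloser. It is not a drop-in replacement for the mixed single/double quote case.

```php
<?php
// Hypothetical helper (not from the answer above): parse a CSV string with
// fgetcsv() through an in-memory stream, then re-emit every field enclosed.
function normaliseCsvViaStream($csv, $delimiter = ',', $enclosure = '"')
{
    $buffer = fopen('php://memory', 'r+');
    fwrite($buffer, $csv);
    rewind($buffer);

    $lines = array();
    while (($fields = fgetcsv($buffer, 0, $delimiter, $enclosure, '\\')) !== false) {
        if ($fields === array(null)) {
            continue; // fgetcsv() reports a blank line as array(null)
        }
        $quoted = array();
        foreach ($fields as $field) {
            // Trim surrounding whitespace and double any embedded encloser
            $field = str_replace($enclosure, $enclosure . $enclosure, trim($field));
            $quoted[] = $enclosure . $field . $enclosure;
        }
        $lines[] = implode($delimiter, $quoted);
    }
    fclose($buffer);

    return implode("\n", $lines);
}
```

Because fgetcsv() does the parsing, a separator inside an enclosed field (such as "b,c") stays inside that field.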
This is a possible solution. But it doesn't consider the case that the separator (,) might be included in incoming strings.
function normaliseCSV(&$csv,$lineseperator = "\n", $fieldseperator = ',', $encloser = '"')
{
$csvArray = explode ($lineseperator,$csv);
foreach ($csvArray as &$line)
{
$lineArray = explode ($fieldseperator,$line);
foreach ($lineArray as &$field)
{
$field = $encloser.trim($field,"\0\t\n\x0B\r \"'").$encloser;
}
$line = implode ($fieldseperator,$lineArray);
}
$csv = implode ($lineseperator,$csvArray);
}
It is a simple chain of explode -> explode -> trim -> implode -> implode .
Although I agree with @deceze that you could expect at least 5.1 these days, I'm sure there are some internal company servers somewhere whose owners don't want to update.
I altered your method so that field and line separators can appear between double quotes, or in your case between $encloser values.
<?php
/*
In regards to the specs on http://tools.ietf.org/html/rfc4180 I use the following rules:
- "Fields containing line breaks (CRLF), double quotes, and commas should be enclosed in double-quotes."
- "If double-quotes are used to enclose fields, then a double-quote appearing inside a field must be escaped by preceding it with another double quote."
Exception:
Even though the spec says to use double quotes, I'm using your $encloser variable
*/
echo normaliseCSV('a,b,\'c\',"d,e","f","g""h""i","""j"""' . "\n" . "\"k\nl\nm\"");
function normaliseCSV($csv,$lineseperator = "\n", $fieldseperator = ',', $encloser = '"')
{
//We need 3 temporary replacement values:
//escaped (doubled) encloser, line separator, field separator
$keys = array();
while (count($keys)<3) {
$tmp = "##".md5(rand().rand().microtime())."##";
if (strpos($csv, $tmp)===false) {
$keys[] = $tmp;
}
}
//first we exchange "" (double $encloser) and """ to make sure its not exploded
$csv = str_replace($encloser.$encloser.$encloser, $keys[0], $csv);
$csv = str_replace($encloser.$encloser, $keys[0], $csv);
//Explode on $encloser
//Every odd index is within quotes
//Exchange line and field seperators for something not used.
$content = explode($encloser,$csv);
$len = count($content);
if ($len>1) {
for ($x=1;$x<$len;$x=$x+2) {
$content[$x] = str_replace($lineseperator,$keys[1], $content[$x]);
$content[$x] = str_replace($fieldseperator,$keys[2], $content[$x]);
}
}
$csv = implode('',$content);
$csvArray = explode ($lineseperator,$csv);
foreach ($csvArray as &$line)
{
$lineArray = explode ($fieldseperator,$line);
foreach ($lineArray as &$field)
{
$val = trim($field,"\0\t\n\x0B\r '");
//put back the exchanged values
$val = str_replace($keys[0],$encloser.$encloser,$val);
$val = str_replace($keys[1],$lineseperator,$val);
$val = str_replace($keys[2],$fieldseperator,$val);
$val = $encloser.$val.$encloser;
$field = $val;
}
$line = implode ($fieldseperator,$lineArray);
}
$csv = implode ($lineseperator,$csvArray);
return $csv;
}
?>
Output would be:
"a","b","c","d,e","f","g""h""i","""j"""
"k
l
m"
When I first read this question I wasn't sure whether it should be solved at all, since environments older than PHP 5.1 should have died out long ago. Still, it is an interesting problem, so it is worth thinking about which approach to take, and my guess is that it should be a char-by-char examination.
I have separated the logic into three main scenarios:
A: CHAR is a separator
B: CHAR is a quotation mark
C: CHAR is a value
The result is this class (including logging) for our arsenal:
<?php
Class CSVParser
{
#basic requirements
public $input;
public $separator;
public $currentQuote;
public $insideQuote;
public $result;
public $field;
public $quotation = array();
public $parsedArray = array();
# for logging purposes only
public $logging = TRUE;
public $log = array();
function __construct($input, $separator, $quotation=array())
{
$this->separator = $separator;
$this->input = $input;
$this->quotation = $quotation;
}
/**
* The main idea is to go through the input string char by char and analyse it;
* when a complete field is detected it is quoted accordingly and added to an array
*/
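public function parse()
{
for($i = 0; $i < strlen($this->input); $i++){
$this->processStream($i);
}
#flush a trailing unquoted field, which is not followed by a separator
if(!is_null($this->field)){
$this->saveField($this->field);
$this->field = NULL;
}
foreach($this->parsedArray as $value)
{
if(!is_null($value))
$this->result .= '"'.addslashes($value).'",';
}
return rtrim($this->result, ',');
}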
private function processStream($i)
{
#A case (its a separator)
if($this->input[$i]===$this->separator){
$this->log("A", $this->input[$i]);
if($this->insideQuote){
$this->field .= $this->input[$i];
}else
{
$this->saveField($this->field);
$this->field = NULL;
}
}
#B case (it's a quote)
if(in_array($this->input[$i], $this->quotation)){
$this->log("B", $this->input[$i]);
if(!$this->insideQuote){
$this->insideQuote = TRUE;
$this->currentQuote = $this->input[$i];
}
else{
if($this->currentQuote===$this->input[$i]){
$this->insideQuote = FALSE;
$this->currentQuote ='';
$this->saveField($this->field);
$this->field = NULL;
}else{
$this->field .= $this->input[$i];
}
}
}
#C case (its a value :-) )
if(!in_array($this->input[$i], array_merge(array($this->separator), $this->quotation))){
$this->log("C", $this->input[$i]);
$this->field .= $this->input[$i];
}
}
private function saveField($field)
{
$this->parsedArray[] = $field;
}
private function log($type, $value)
{
if($this->logging){
$this->log[] = "CASE ".$type." WITH ".$value." AS VALUE";
}
}
}
An example of how to use it would be:
$original = 'a,"ab",\'ab\'';
$test = new CSVParser($original, ',', array('"', "'"));
echo "<PRE>ORIGINAL: ".$original."</PRE>";
echo "<PRE>PARSED: ".$test->parse()."</PRE>";
echo "<pre>";
print_r($test->log);
echo "</pre>";
and here are the results:
ORIGINAL: a,"ab",'ab'
PARSED: "a","ab","ab"
Array
(
[0] => CASE C WITH a AS VALUE
[1] => CASE A WITH , AS VALUE
[2] => CASE B WITH " AS VALUE
[3] => CASE C WITH a AS VALUE
[4] => CASE C WITH b AS VALUE
[5] => CASE B WITH " AS VALUE
[6] => CASE A WITH , AS VALUE
[7] => CASE B WITH ' AS VALUE
[8] => CASE C WITH a AS VALUE
[9] => CASE C WITH b AS VALUE
[10] => CASE B WITH ' AS VALUE
)
There may be mistakes, since I only spent 25 minutes on it, so any comments will be appreciated and edited in.
I need to turn each end-point in a multi-dimensional array (of any dimension) into a row containing all of its ancestor nodes, using PHP. In other words, I want to resolve each complete branch in the array. I am not sure how to state this more clearly, so maybe the best way is to give an example.
If I start with an array like:
$arr = array(
'A'=>array(
'a'=>array(
'i'=>1,
'j'=>2),
'b'=>3
),
'B'=>array(
'a'=>array(
'm'=>4,
'n'=>5),
'b'=>6
)
);
There are 6 end points, namely the numbers 1 to 6, in the array and I would like to generate the 6 rows as:
A,a,i,1
A,a,j,2
A,b,3
B,a,m,4
B,a,n,5
B,b,6
Each row contains full path of descendants to the end-point. As the array can have any number of dimensions, this suggested a recursive PHP function and I tried:
function array2Rows($arr, $str='', $out='') {
if (is_array($arr)) {
foreach ($arr as $att => $arr1) {
$str .= ((strlen($str)? ',': '')) . $att;
$out = array2Rows($arr1, $str, $out);
}
echo '<hr />';
} else {
$str .= ((strlen($str)? ',': '')) . $arr;
$out .= ((strlen($out)? '<br />': '')) . $str;
}
return $out;
}
The function was called as follows:
echo '<p>'.array2Rows($arr, '', '').'</p>';
The output from this function is:
A,a,i,1
A,a,i,j,2
A,a,b,3
A,B,a,m,4
A,B,a,m,n,5
A,B,a,b,6
Which apart from the first value is incorrect because values on some of the nodes are repeated. I have tried a number of variations of the recursive function and this is the closest I can get.
I will welcome any suggestions for how I can get a solution to this problem and apologize if the statement of the problem is not very clear.
You were so close with your function... I took your function and modified it slightly as follows:
function array2Rows($arr, $str='', $csv='') {
$tmp = $str;
if (is_array($arr)) {
foreach ($arr as $att => $arr1) {
$tmp = $str . ((strlen($str)? ', ': '')) . $att;
$csv = array2Rows($arr1, $tmp, $csv);
}
} else {
$tmp .= ((strlen($str)? ', ': '')) . $arr;
$csv .= ((strlen($csv)? '<br />': '')) . $tmp;
}
return $csv;
}
The only difference is the introduction of a temporary variable $tmp to ensure that you don't change the $str value before the recursion function is run each time.
The output from the function becomes:
A, a, i, 1
A, a, j, 2
A, b, 3
B, a, m, 4
B, a, n, 5
B, b, 6
This is a nice function, I can think of a few applications for it.
The reason that keys are repeated is that, in your loop, you append the key to the string before running the function on the next array, and that modified string is then carried over to the following siblings. Something like this would work better:
function array2Rows($arr, &$out=[], $row = []) {
if (is_array($arr)) {
foreach ($arr as $key => $newArray) {
if (is_array($newArray)) {
array2Rows($newArray, $out, array_merge($row, [$key])); //If the current value is an array, process it with its key appended to a copy of the row, so that sibling keys do not pile up
} else { //The current value is not an array
$out[] = implode(',',array_merge($row,[$key,$newArray])); //Add the current key and value to the row and write to the output
}
}
}
return $out;
}
This is lightly optimized and utilizes a reference to hold the full output. I've also changed this to use and return an array rather than strings. I find both of those changes to make the function more readable.
If you wanted this to return a string formatted similarly to the one that you have in your function, replace the last line with
return implode('<br>', $out);
Alternatively, you could do that when calling, which would be what I would call "best practice" for something like this; e.g.
$result = array2Rows($arr);
echo implode('<br>', $result);
Note, since this uses a reference for the output, this also works:
array2Rows($arr, $result);
echo implode('<br>', $result);
I have a string and I need to add an HTML tag at certain indexes of the string.
$comment_text = 'neethu and Dilnaz Patel check this';
Array ( [start_index_key] => 0 [string_length] => 6 )
Array ( [start_index_key] => 11 [string_length] => 12 )
I need to split at each start_index_key, taking the number of characters given in string_length.
The expected final output is
$formattedText = '<span>#neethu</span> and <span>#Dilnaz Patel</span> check this';
What should I do?
This is a very strict method that will break at the first change.
Do you have control over the creation of the string? If so, you can create a string with placeholders and fill in the values.
Even so, you can do this with a regex:
$pattern = '/(.+[^ ])\s+and (.+[^ ])\s+check this/i';
$string = 'neethu and Dilnaz Patel check this';
$replace = preg_replace($pattern, '<b>#\1</b> and <b>#\2</b> check this', $string);
But this is still a very rigid solution.
If you can, try creating a string with placeholders for the names. This will be much easier to manage and change in the future.
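The placeholder idea could look roughly like the sketch below. The {first}/{second} placeholder names and the fillPlaceholders function are invented for illustration; the only assumption is that you control the template string.

```php
<?php
// Hypothetical sketch: the template carries named placeholders, and each
// name is wrapped in a <span> before being substituted with strtr().
function fillPlaceholders($template, array $names)
{
    $wrapped = array();
    foreach ($names as $placeholder => $name) {
        // Escape the name so it cannot inject markup of its own
        $wrapped[$placeholder] = '<span>#' . htmlspecialchars($name) . '</span>';
    }
    return strtr($template, $wrapped);
}

$formattedText = fillPlaceholders(
    '{first} and {second} check this',
    array('{first}' => 'neethu', '{second}' => 'Dilnaz Patel')
);
echo $formattedText, "\n";
```

With this layout the surrounding text can change freely without touching any offsets or patterns.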
<?php
function my_replace($string,$array_break)
{
$break_open = array();
$break_close = array();
$start = 0;
foreach($array_break as $key => $val)
{
// for tag <span>
if($key % 2 == 0)
{
$start = $val;
$break_open[] = $val;
}
else
{
// for tag </span>
$break_close[] = $start + $val;
}
}
$result = array();
for($i=0;$i<strlen($string);$i++)
{
$current_char = $string[$i];
if(in_array($i,$break_open))
{
$result[] = "<span>".$current_char;
}
else if(in_array($i,$break_close))
{
$result[] = $current_char."</span>";
}
else
{
$result[] = $current_char;
}
}
return implode("",$result);
}
$comment_text = 'neethu and Dilnaz Patel check this';
$my_result = my_replace($comment_text,array(0,6,11,12));
var_dump($my_result);
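Explanation:
Pass an array parameter in which the even indexes (0, 2, 4, 6, 8, ...) hold start_index_key and the odd indexes (1, 3, 5, 7, 9, ...) hold string_length.
Read every break point and store it in $break_open or $break_close.
Create the array $result for the result.
Loop over the string and, depending on the break points, prepend <span>, append </span>, or add the character unchanged.
Note that each closing tag lands after the character at start + length, so the space following each name ends up inside the span; use $start + $val - 1 as the closing index to exclude it.
Result:
string '<span>neethu </span>and <span>Dilnaz Patel </span>check this' (length=60)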
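An alternative to walking the string character by character is substr_replace(). The sketch below is only an illustration: the wrapMentions name is made up, it takes the start_index_key / string_length arrays from the question as input, and it assumes the offsets are byte offsets into a single-byte string and do not overlap.

```php
<?php
// Hypothetical helper: wrap each (start, length) span in <span>#...</span>.
// Spans are processed from the end of the string backwards, so inserting
// tags never shifts the offsets of spans that are still to be handled.
function wrapMentions($text, array $mentions)
{
    usort($mentions, function ($a, $b) {
        return $b['start_index_key'] - $a['start_index_key'];
    });
    foreach ($mentions as $m) {
        $name = substr($text, $m['start_index_key'], $m['string_length']);
        $text = substr_replace(
            $text,
            '<span>#' . $name . '</span>',
            $m['start_index_key'],
            $m['string_length']
        );
    }
    return $text;
}

$formattedText = wrapMentions('neethu and Dilnaz Patel check this', array(
    array('start_index_key' => 0, 'string_length' => 6),
    array('start_index_key' => 11, 'string_length' => 12),
));
echo $formattedText, "\n";
```

For multibyte text the mb_* string functions would be needed instead, which this sketch does not cover.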
My function using preg_replace is working perfectly on a dev server, but not at all on the production server. The problem might have something to do with encoding. Is there a way to make this expression so that it works regardless of the encoding?
The $config looks like this:
class JConfig {
public $mighty = array("0" => array("0" => "/`?\\#__mightysites[` \\n]+/u"), "1" => array("0" => "`hhd_mightysites` "));
public $mighty_enable = '0';
public $mighty_language = '';
public $mighty_template = '9';
public $mighty_home = '';
public $mighty_langoverride = '0';......
I put the variables associated with the lines I would like to strip in an array called strips like
$strips = array(
'mighty',
'mighty_enable',
'mighty_sync',
'mighty_language',
'mighty_template',.....
Then use a loop to strip out the lines:
foreach ($strips as $var) {
if (JString::strpos($config, 'public $' . $var . ' =') !== false) {
$config = preg_replace('/\tpublic \$' . $var . ' \= ([^\;]*)\;\n/u', '', $config);
$tempvar .= $var . ", ";
}
}
Again, it works perfectly on our dev server. It does not do anything to any lines on the production server. I also know that it passes the strpos check and reaches the line with preg_replace. Can I make preg_replace environment-proof?
I appreciate the help, since it is happening only on a production server it is very difficult to test!
The safest bet would be to not "trust" any of the literal spaces/tabs that you expect to match.
Instead of using \t and a literal space, I recommend \s+ where you expect a tab and \s where you expect a space.
Furthermore, to cover cases where the operating system may use \r\n or \n at the end of each line, you can use \R to match both variations.
I'm going to include a start of line character check via ^ at the beginning of the pattern and m as a pattern modifier. This ensures that we match and only match where you expect a \t at the start of the line.
Finally, preg_replace() has an optional 5th parameter that counts how many replacements were made. If $found is a non-zero value, then store the current $var value.
Code:
$config = <<<'CONFIG'
class JConfig {
    public $mighty = array("0" => array("0" => "/`?\\#__mightysites[` \\n]+/u"), "1" => array("0" => "`hhd_mightysites` "));
    public $mighty_enable = '0';
    public $mighty_language = '';
    public $mighty_template = '9';
    public $mighty_home = '';
    public $mighty_langoverride = '0';......
CONFIG;
$strips = [
'mighty',
'mighty_enable',
'mighty_sync',
'mighty_language',
'mighty_template'
];
$tempvar = '';
foreach ($strips as $var) {
$config = preg_replace('~^\s+public\s\$' . $var . '\s=\s[^;]*;\R~um', '', $config, -1, $found);
if ($found) {
$tempvar .= $var . ", ";
}
}
echo "\$tempvar = $tempvar\n\n";
echo $config;
Output:
$tempvar = mighty, mighty_enable, mighty_language, mighty_template,
class JConfig {
    public $mighty_home = '';
    public $mighty_langoverride = '0';......
P.S. One final suggested refinement: if you don't actually need the $tempvar variable for your project (meaning you are only using it during debugging), then you can avoid the loop entirely: implode('|', $strips), wrap the generated string in ( and ), save it as $var, and call preg_replace() just once. This will be more efficient, and your sample $strips data does not need to be prepared with preg_quote() because there are no special characters to escape.
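The single-call refinement might be sketched as follows. The stripConfigVars name and the shortened sample config are assumptions made for illustration; preg_quote() is applied anyway, which is harmless for the sample names and safer if the list ever changes.

```php
<?php
// Hypothetical helper: remove all listed "public $name = ...;" lines with
// one preg_replace() call by joining the names into an alternation.
function stripConfigVars($config, array $strips, &$count = 0)
{
    $quoted = array_map(function ($name) {
        return preg_quote($name, '~'); // '~' is the pattern delimiter used below
    }, $strips);
    $pattern = '~^\s+public\s\$(?:' . implode('|', $quoted) . ')\s=\s[^;]*;\R~um';
    return preg_replace($pattern, '', $config, -1, $count);
}

// Shortened sample input, with mixed \n and \r\n line endings
$config = "class JConfig {\n"
    . "\tpublic \$mighty_enable = '0';\n"
    . "\tpublic \$mighty_home = '';\n"
    . "\tpublic \$mighty_template = '9';\r\n"
    . "}\n";
$strips = array('mighty', 'mighty_enable', 'mighty_sync', 'mighty_language', 'mighty_template');

$stripped = stripConfigVars($config, $strips, $removed);
echo $removed, " line(s) removed\n", $stripped;
```

The \s that follows the alternation keeps the shorter name mighty from matching the start of mighty_home.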
I am creating a script that will locate a field in a text file and get the value that I need.
First I use the file() function to load my txt file into an array, line by line.
Then I use explode() to create an array of the strings on a selected line.
I assign labels to the array elements to describe a $key and a $val.
$line = file($myFile);
$arg = 3;
$c = explode(" ", $line[$arg]);
$key = strtolower($c[0]);
if (strpos($c[2], '~') !== false) {
$val = str_replace('~', '.', $c[2]);
}else{
$val = $c[2];
}
This works fine but that is a lot of code to have to do over and over again for everything I want to get out of the txt file. So I wanted to create a function that I could call with an argument that would return the value of $key and $val. And this is where I am failing:
<?php
/**
* @author Jason Moore
* @copyright 2014
*/
global $line;
$key = '';
$val = '';
$myFile = "player.txt";
$line = file($myFile); //file in to an array
$arg = 3;
$Character_Name = 3;
function get_plr_data2($arg){
global $key;
global $val;
$c = explode(" ", $line[$arg]);
$key = strtolower($c[0]);
if (strpos($c[2], '~') !== false) {
$val = str_replace('~', '.', $c[2]);
}else{
$val = $c[2];
}
return;
}
get_plr_data2($Character_Name);
echo "This character's ",$key,' is ',$val;
?>
I thought that I had covered the scope by setting the values in the main script and then declaring them global within the function. I feel like I am close, but I am missing something.
I feel like there should be something like return $key,$val; but that doesn't work. I could return an array, but then I would end up typing just as much code to get the info out of the array.
I am missing something with the function and the function argument too. I would like to pass an argument, for example get_plr_data2($Character_Name); where the argument identifies the line that we are getting the data from.
Any help with this would be more than appreciated.
::Updated::
Thanks to the answers I got past returning the array.
But my problem is that, depending on the argument I pass to get_plr_data2($arg), the number of values differs.
I figured that I could just set the maximum number of values I could get, but of course this doesn't work at all, because I end up with undefined offsets instead.
$a = $cdata[0];$b = $cdata[1];$c = $cdata[2];
$d = $cdata[3];$e = $cdata[4];$f = $cdata[5];
$g = $cdata[6];$h = $cdata[7];$i = $cdata[8];
$j = $cdata[9];$k = $cdata[10];$l = $cdata[11];
return array($a,$b,$c,$d,$e,$f,$g,$h,$i,$j,$k,$l);
Now I am thinking that I can use the count function, $myCount = count($c);, to either amend or add more values, creating the offsets I need. A better option would be a way to generate the return array() so that it counts the number of values given and returns all the values needed. I think that maybe I am just making this a whole lot more difficult than it is.
Thanks again for all the help and suggestions
function get_plr_data2($arg){
$myFile = "player.txt";
$line = file($myFile); //file in to an array
$c = explode(" ", $line[$arg]);
$key = strtolower($c[0]);
if (strpos($c[2], '~') !== false) {
$val = str_replace('~', '.', $c[2]);
}else{
$val = $c[2];
}
return array($key,$val);
}
Using:
list($key,$val) = get_plr_data2(SOME_ARG);
You can do this in two ways.
First, you can return both values in an array:
function get_plr_data2($arg){
/* do what you have to do */
$output=array();
$output['key'] =$key;
$output['value'] = $val;
return $output;
}
and use the array in your main script.
Second, you can use references, so that the function can hand back multiple values:
function get_plr_data2($arg,&$key,&$val){
/* do job */
}
//use the function as
$key='';
$val='';
get_plr_data2($arg,$key,$val);
Whatever you do to $key inside the function will affect the $key of the calling code.
I was overthinking it. Thanks for all the help, guys. This is what I finally came up with, thanks to your guidance:
<?php
$ch_file = "Thor";
$ch_name = 3;
$ch_lvl = 4;
$ch_clss = 15;
list($a,$b) = get_char($ch_file,$ch_name);
echo $a,': ',$b; // Outputs the first two values from the $cdata array.
function get_char($file,$data){
$myFile = $file.".txt";
$line = file($myFile);
$cdata = preg_split('/\s+/', trim($line[$data]));
return $cdata;
}
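For the varying number of values mentioned in the update, one possible approach is to pad the split result to a fixed length with array_pad(), so that list() never hits an undefined offset. This is only a sketch: the splitLine name, the sample line, and the expected count of 4 are assumptions made for illustration.

```php
<?php
// Hypothetical helper: split a line on whitespace and pad the result with
// null up to $expected entries, so that list() always finds every offset.
function splitLine($line, $expected)
{
    $parts = preg_split('/\s+/', trim($line));
    return array_pad($parts, $expected, null);
}

// Only three values are present on this sample line; $extra becomes null.
list($key, $separator, $val, $extra) = splitLine("Name : Thor\n", 4);
echo strtolower($key), ' is ', $val, "\n";
```

Lines with more values than expected are returned in full, so the caller can still inspect everything with count() if needed.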
Brand new to this community, thanks for all the patience.
I've the following method which allows me to protect MySQL entities:
public function Tick($string)
{
$string = explode('.', str_replace('`', '', $string));
foreach ($string as $key => $value)
{
if ($value != '*')
{
$string[$key] = '`' . trim($value) . '`';
}
}
return implode('.', $string);
}
This works fairly well for the use that I make of it.
It protects database, table, field names and even the * operator, however now I also want it to protect function calls, ie:
AVG(database.employees.salary)
Should become:
AVG(`database`.`employees`.`salary`) and not `AVG(database`.`employees`.`salary)`
How should I go about this? Should I use regular expressions?
Also, how can I support more advanced stuff, from:
MAX(AVG(database.table.field1), MAX(database.table.field2))
To:
MAX(AVG(`database`.`table`.`field1`), MAX(`database`.`table`.`field2`))
Please keep in mind that I want to keep this method as simple/fast as possible, since it pretty much iterates over all the entity names in my database.
If this is quoting parts of an SQL statement, and they have only the complexity that you describe, a RegEx is a great approach. On the other hand, if you need to do this to full SQL statements, or simply to more complicated components of statements (such as "MAX(AVG(val),MAX(val2))"), you will need to tokenize or parse the string and have a more sophisticated understanding of it to do this quoting accurately.
Given the regular expression approach, you may find it easier to break the function name out as one step, and then use your current code to quote the database/table/column names. This can be done in one RE, but it will be trickier to get right.
Either way, I'd highly recommend writing a few unit test cases. In fact, this is an ideal situation for this approach: it's easy to write the tests, you have some existing cases that work (which you don't want to break), and you have just one more case to add.
Your test can start as simply as:
assert('`ticked`' == Tick('ticked'));
assert('`table`.`ticked`' == Tick('table.ticked'));
assert('`db`.`table`.`ticked`' == Tick('db.table.ticked'));
And then add:
assert('FN(`ticked`)' == Tick('FN(ticked)'));
etc.
Using the test cases ndp gave, I created a regex to do the hard work for you. The following regex matches every whole word that is not immediately followed by an opening parenthesis, so that it can be wrapped in backticks.
\b(\w+)\b(?!\()
The Tick() functionality would then be implemented in PHP as follows:
function Tick($string)
{
return preg_replace( '/\b(\w+)\b(?!\()/', '`\1`', $string );
}
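A quick way to see what this regex does is to run it against the cases from the question. In the sketch below the function is renamed tickIdentifiers (a made-up name, to avoid clashing with the Tick() method), but the pattern is the one given above.

```php
<?php
// Same pattern as above under a hypothetical name: wrap every whole word
// in backticks unless it is immediately followed by "(".
function tickIdentifiers($string)
{
    return preg_replace('/\b(\w+)\b(?!\()/', '`\1`', $string);
}

$cases = array(
    'database.employees.salary',
    'AVG(database.employees.salary)',
    'MAX(AVG(database.table.field1), MAX(database.table.field2))',
    'table.*',
);
foreach ($cases as $case) {
    echo $case, ' => ', tickIdentifiers($case), "\n";
}
```

Two limits worth keeping in mind: a function name separated from its parenthesis by a space (for example COUNT (x)) would be ticked, and so would numeric literals, since \w also matches digits.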
It's generally a bad idea to pass the whole SQL to the function. That way, you'll always find a case where it doesn't work, unless you fully parse the SQL syntax.
Add the ticks to the names at an earlier abstraction level, the one that builds the SQL.
Before you explode your string on periods, check if the last character is a parenthesis. If so, this call is a function.
<?php
$string = str_replace('`', '', $string);
$function = "";
if (substr($string,-1) == ")") {
// Strip off function call first
$opening = strpos($string, "(");
$function = substr($string, 0, $opening+1);
$string = substr($string, $opening+1, -1);
}
// Do your existing parsing to $string
if ($function != "") {
// Put function back on string
$string = $function . $string . ")";
}
?>
If you need to cover more advanced situations, like using nested functions, or multiple functions in sequence in one "$string" variable, this would become a much more advanced function, and you'd best ask yourself why these elements aren't being properly ticked in the first place, and not need any further parsing.
EDIT: Updating for nested functions, as per original post edit
To have the above function deal with multiple nested functions, you likely need something that will 'unwrap' your nested functions. I haven't tested this, but the following function might get you on the right track.
<?php
function unwrap($str) {
$pos = strpos($str, "(");
if ($pos === false) return $str; // There's no function call here
$last_close = 0;
$offset = 0; // Start at the beginning
while ($offset <= strlen($str)) {
$first_close = strpos($str, ")", $offset); // Find first deep function
if ($first_close === false) break; // No closing parenthesis left
$pos = strrpos($str, "(", $first_close-1); // Find associated opening
if ($pos > $last_close) {
// This function is entirely after the previous function
$ticked = Tick(substr($str, $pos+1, $first_close-$pos)); // Tick the string inside
$str = substr($str, 0, $pos)."{".$ticked."}".substr($str,$first_close); // Replace parenthesis by curly braces temporarily
$first_close += strlen($ticked)-($first_close-$pos); // Shift parenthesis location due to new ticks being added
} else {
// This function wraps other functions; don't tick it
$str = substr($str, 0, $pos)."{".substr($str,$pos+1, $first_close-$pos)."}".substr($str,$first_close);
}
$last_close = $first_close;
$offset = $first_close+1;
}
// Replace the curly braces with parenthesis again
$str = str_replace(array("{","}"), array("(",")"), $str);
return $str;
}
If you are adding the function calls in your code, as opposed to passing them in through a string-only interface, you can replace the string parsing with type checking:
function Tick($value) {
if (is_object($value)) {
$result = $value->value;
} else {
$result = '`'.str_replace(array('`', '.'), array('', '`.`'), $value).'`';
}
return $result;
}
class SqlFunction {
public $value;
// __construct() replaces the PHP 4 style constructor, which PHP 8 no longer supports
function __construct($function, $params) {
$sane = implode(', ', array_map('Tick', $params));
$this->value = "$function($sane)";
}
}
function Maximum($column) {
return new SqlFunction('MAX', array($column));
}
function Avg($column) {
return new SqlFunction('AVG', array($column));
}
function Greatest() {
$params = func_get_args();
return new SqlFunction('GREATEST', $params);
}
$cases = array(
"'simple'" => Tick('simple'),
"'table.field'" => Tick('table.field'),
"'table.*'" => Tick('table.*'),
"'evil`hack'" => Tick('evil`hack'),
"Avg('database.table.field')" => Tick(Avg('database.table.field')),
"Greatest(Avg('table.field1'), Maximum('table.field2'))" => Tick(Greatest(Avg('table.field1'), Maximum('table.field2'))),
);
echo "<table>";
foreach ($cases as $case => $result) {
echo "<tr><td>$case</td><td>$result</td></tr>";
}
echo "</table>";
This keeps backticks in identifier names from breaking out of the quoting, while remaining legible to future readers of your code.
You could use preg_replace_callback() in conjunction with your Tick() method to skip at least one level of parens:
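public function tick($str)
{
// [^()]++ is possessive, so a name directly followed by "(" (a function name) is skipped as a whole
return preg_replace_callback('/[^()]++(?!\()/', array($this, '_tick_replace_callback'), $str);
}
protected function _tick_replace_callback($matches) {
// preg_replace_callback() passes an array of matches; $matches[0] is the matched text
$string = explode('.', str_replace('`', '', $matches[0]));
foreach ($string as $key => $value)
{
if ($value != '*')
{
$string[$key] = '`' . trim($value) . '`';
}
}
return implode('.', $string);
}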
Are you generating the SQL query or is it being passed to you? If you are generating the query, I wouldn't pass the whole query string, just the params/values you want to wrap in backticks or whatever else you need.
EXAMPLE:
function addTick($var) {
return '`' . $var . '`';
}
$condition = addTick($condition);
$SQL = 'SELECT ' . $what . '
FROM ' . $table . '
WHERE ' . $condition . ' = ' . $constraint;
This is just a mock-up, but you get the idea: you can loop through your inputs and build the query string, rather than parsing the finished query string to add your backticks.