please dont close this question as repeated one..........
I am new to php.
I am developing one tree grid in extjs.were i need to display field in tree format.so for front end i need to send the encoded data.
I want 2 functions for encoding and decoding a string variable,multidimensional array,variables with a set of delimiters.
For example if i am having an array.........
array(
array( "task" => "rose",
"duration" => 1.25,
"user" => 15
),
array( "task" => "daisy",
"duration" => 0.75,
"user" => 25,
),
array( "task" => "orchid",
"duration" => 1.15,
"user" => 7
),
array( "task" => "sunflower",
"duration" => 1.50,
"user" => 70
)
);
i want to encode all the array fields or single field as..........
array(
array( "task" => "rose",
"duration" => 1.25,
"user" => 15
),
array( "task" => "daisy",
"duration" => 0.75,
"user" => 25$sbaa,
),
array( "task" => "orchid",
"duration" => 1.15,
"user" => 7$!ass,
),
array( "task" => "sunflower",
"duration" => 1.50,
"user" => 70$!abc
)
);
So like this only i need to encode string,variables with delimiters.........
later all the encoded values to be decoded before its taken back to back end.....for this i have to use this plugin..........
encode.class.php...........
<?php
/*-------------------------
Author: Jonathan Pulice
Date: July 26th, 2005
Name: JPEncodeClass v1
Desc: Encoder and decoder using patterns.
-------------------------*/
class Protector
{
var $Pattern = "";
var $PatternFlip = "";
var $ToEncode = "";
var $ToDecode = "";
var $Decoded = "";
var $Encoded = "";
var $Bug = false;
var $DecodePattern = "";
function Debug($on = true)
{
$this->Bug = $on;
}
function Encode()
{
$ar = explode(":", $this->Pattern);
$enc = $this->ToEncode;
if ($this->Bug) echo "<!-- BEGIN ENCODING -->\n";
foreach ($ar as $num => $ltr)
{
switch ($ltr)
{
case "E":
$enc = base64_encode($enc);
break;
case "D":
$enc = base64_decode($enc);
break;
case "R":
$enc = strrev($enc);
break;
case "I":
$enc = $this->InvertCase($enc);
break;
}
if ($this->Bug) echo "<!-- {$ltr}: {$enc} -->\n";
}
if ($this->Bug) echo "<!-------------------->\n\n";
#$this->Encoded = ($enc == $this->Str) ? "<font color='red'>No Encoding/Decoding Pattern Detected!</font>" : $enc;
return $this->Encoded;
}
function Decode()
{
$pattern = ($this->DecodePattern != "") ? $this->DecodePattern : $this->Pattern;
//Reverse the pattern
$this->PatternFlip($pattern);
//make into an array
$ar = explode(":", $this->PatternFlip);
$t = ($this->Encoded == "") ? $this->ToDecode : $this->Encoded;
if ($this->Bug) echo "<!-- BEGIN DECODING -->\n";
foreach ($ar as $num => $ltr)
{
switch ($ltr)
{
case "E":
$t = base64_encode($t);
break;
case "D":
$t = base64_decode($t);
break;
case "R":
$t = strrev($t);
break;
case "I":
$t = $this->InvertCase($t);
break;
}
if ($this->Bug) echo "<!-- {$ltr}: {$t} -->\n";
}
if ($this->Bug) echo "<!-------------------->\n\n";
$this->Decoded = ($t == $this->Encoded) ? "<font color='red'>No Encoding/Decoding Pattern Detected!</font>" : $t;
return $this->Decoded;
}
function MakePattern($len = 10)
{
//possible letters
// E - Base64 Encode
// R - Reverse String
// I - Inverse Case
$poss = array('E','R', 'I');
//generate a string
for ( $i = 0 ; $i < $len ; $i++ )
{
$tmp[] = $poss[ rand(0,2) ];
}
//echo $str. "<br>";
//fix useless pattern section RR II
$str = implode(":", $tmp);
//fix
$str = str_replace( 'R:R:R:R:R:R' , 'R:E:R:E:R:E' , $str );
$str = str_replace( 'R:R:R:R:R' , 'R:E:R:E:R' , $str );
$str = str_replace( 'R:R:R:R' , 'R:E:R:E' , $str );
$str = str_replace( 'R:R:R' , 'R:E:R' , $str );
$str = str_replace( 'R:R' , 'R:E' , $str );
//fix
$str = str_replace( 'I:I:I:I:I:I' , 'I:E:I:E:I:E' , $str );
$str = str_replace( 'I:I:I:I:I' , 'I:E:I:E:I' , $str );
$str = str_replace( 'I:I:I:I' , 'I:E:I:E' , $str );
$str = str_replace( 'I:I:I' , 'I:E:I' , $str );
$str = str_replace( 'I:I' , 'I:E' , $str );
//string is good, set as pattern
$this->Pattern = $str;
return $this->Pattern; //if we need it
}
function PatternFlip($pattern)
{
//reverse the pattern
$str = strrev($pattern);
$ar = explode(":", $str);
foreach ($ar as $num => $ltr)
{
switch ($ltr)
{
case "E":
$tmp[] = "D";
break;
case "D":
$tmp[] = "E";
break;
case "R":
$tmp[] = "R";
break;
case "I":
$tmp[] = "I";
break;
}
}
$rev = implode(":", $tmp);
$this->PatternFlip = $rev;
return $this->PatternFlip;
}
// This is my custom Case Invertor!
// if you would like to use this in a script, please credit it to me, thank you
function InvertCase($str)
{
//Do initial conversion
$new = strtoupper( $str );
//spluit into arrays
$s = str_split( $str );
$n = str_split( $new );
//now we step through each letter, and if its the same as before, we swap it out
for ($i = 0; $i < count($s); $i++)
{
if ( $s[$i] === $n[$i] ) //SWAP THE LETTER
{
//ge the letter
$num = ord( $n[$i] );
//see if the ord is in the alpha ranges ( 65 - 90 | 97 - 122 )
if ( ( $num >= 65 AND $num <= 90 ) OR ( $num >= 97 AND $num <= 122 ) )
{
if ($num < 97 ) { $num = $num + 32; }
else { $num = $num - 32; }
$newchr = chr($num);
$n[$i] = $newchr;
}
}
}
//join the new string back together
$newstr = implode("", $n);
return $newstr;
}
}
?>
............
from this plugin i need to use encode and decode functions for my functions.........
if anyone can help me on this.......it will be very much useful for me.......
Why don't you use json_encode? Just do
$str=json_encode($array);
Then, send the data, and at the other end do
$array=json_decode($str);
Okay, let's break down the problem.
You have an array. Each element in the array is a hash. One (or more) of the values in that hash has to be encoded using that horrible abomination of a library. But the library can't process arrays.
We'll have to process the array ourselves.
<rant>
Before we begin, I'd just like to again express how horribly that "Protector" code is designed. It's written for PHP4 and is effectively spaghetti code wrapped in a class. It mis-uses properties, almost as if the user had some sort of mis-remembering about how Java's instance variables work and somehow thought it would be appropriate or sane to use PHP in the same way. If the author of the code doesn't look back on that revolting chunk of bytes now with utter disdain, something is severely wrong with him.
</rant>
I'm going to base my knowledge of this class on this copy of it, as the one you've provided is formatted even worse than the original.
First, let's create a list of inner array keys that we need to encode.
$keys_to_encode = array( 'user' );
Your example encoding lists only the user key as encodable. If you need to encode others, just add more elements to that array.
Now, let's prepare our "Protector." It seems to want you to either specify a pattern, or use the MakePattern method to have it create one. We're going to manually-specify one because MakePattern can come up with effectively useless combinations.
$stupid = new Protector();
$stupid->Pattern = 'E:I:E:R:D:I:E';
This will base64-encode, flip the case, base64 again, reverse it, un-base64, flip the case, and then re-base64. Be aware that PHP's base64 decoder "correctly" ignores bad padding and unexpected characters, which is the only reason the base64-reverse-unbase64 thing will work. The resulting string will look like gibberish to people that don't know what base64 looks like, and un-base64 to gibberish for people that to know what base64 looks like.
If you need to encode the values in a certain way, you'd do so by just changing the pattern. It's important that you either hard-code the pattern or store it somewhere with the data, because without it, you'll have a hard time doing the decode. (I mean, it can be done by hand, but you don't want to have to do that.) I expect that your boss is going to give you specific instructions on the pattern, given that you don't have an option in using this utter failure at a class.
Now it's time to process our data. Let's pretend that $in contains your original array.
$out = array();
foreach($in as $k => $target) {
foreach($keys_to_encode as $target_key) {
$stupid->ToEncode = $target[ $target_key ];
$target[ $target_key ] = $stupid->Encode();
}
$out[$k] = $target;
}
This loops through the array of hashes, then inside each hash, only the keys we want to encode are encoded. The encoded hash is placed in a new array, $out. This is what you'd pass to your tree widget.
Decoding is just as easy. You've stated that your goal is letting users edit certain data in the tree widget, and you seem to want to protect the underlying keys. This tells me that you probably only need to deal with one of the hashes at a time. Therefore, I'm going to write this decode sample using only one value, $whatever.
$stupidest = new Protector();
$stupidest->Pattern = 'E:I:E:R:D:I:E'; // SAME PATTERN!
$stupidest->ToDecode = $whatever;
$decoded = $stupidest->Decode();
Again, it's critical that you use the same decode pattern here.
As for the actual interaction with the tree widget, you're on your own there. I know only enough about ExtJS to know that it exists and that the GUI creators for it are really fun to play with.
Related
i have a string and i need to add some html tag at certain index of the string.
$comment_text = 'neethu and Dilnaz Patel check this'
Array ( [start_index_key] => 0 [string_length] => 6 )
Array ( [start_index_key] => 11 [string_length] => 12 )
i need to split at start index key with long mentioned in string_length
expected final output is
$formattedText = '<span>#neethu</span> and <span>#Dilnaz Patel</span> check this'
what should i do?
This is a very strict method that will break at the first change.
Do you have control over the creation of the string? If so, you can create a string with placeholders and fill the values.
Even though you can do this with regex:
$pattern = '/(.+[^ ])\s+and (.+[^ ])\s+check this/i';
$string = 'neehu and Dilnaz Patel check this';
$replace = preg_replace($pattern, '<b>#$\1</b> and <b>#$\2</b> check this', $string);
But this is still a very rigid solution.
If you can try creating a string with placeholders for the names. this will be much easier to manage and change in the future.
<?php
function my_replace($string,$array_break)
{
$break_open = array();
$break_close = array();
$start = 0;
foreach($array_break as $key => $val)
{
// for tag <span>
if($key % 2 == 0)
{
$start = $val;
$break_open[] = $val;
}
else
{
// for tag </span>
$break_close[] = $start + $val;
}
}
$result = array();
for($i=0;$i<strlen($string);$i++)
{
$current_char = $string[$i];
if(in_array($i,$break_open))
{
$result[] = "<span>".$current_char;
}
else if(in_array($i,$break_close))
{
$result[] = $current_char."</span>";
}
else
{
$result[] = $current_char;
}
}
return implode("",$result);
}
$comment_text = 'neethu and Dilnaz Patel check this';
$my_result = my_replace($comment_text,array(0,6,11,12));
var_dump($my_result);
Explaination:
Create array parameter with: The even index (0,2,4,6,8,...) would be start_index_key and The odd index (1,3,5,7,9,...) would be string_length
read every break point , and store it in $break_open and $break_close
create array $result for result.
Loop your string, add , add or dont add spann with break_point
Result:
string '<span>neethu </span>and <span>Dilnaz Patel </span> check this' (length=61)
Hello dear programmers! I'm having a problem with the speed of the function preg_replace().
When I had little values (words) in the $patterns and $replacements arrays, the problem was not with the speed of searching and replacing in text from arrays, and when the number of the value in the arrays increased by 1.000.000 then the function preg_replace() was repeatedly slowed down. How can I search and replace in the text if I have more than 1,000,000 values (words) in the arrays? How to make a replacement as quickly as possible? Can the solution of the problem be buffered or cached? What advise, how correctly to me to act?
Here is an example of my array:
$patterns =
array
(
0 => "/\bмувосокори\b/u",
1 => "/\bмунаггас\b/u",
2 => "/\bмангит\b/u",
3 => "/\bмангития\b/u",
4 => "/\bмунфачир\b/u",
5 => "/\bмунфачира\b/u",
6 => "/\bманфиатпарасти\b/u",
7 => "/\bманфиатчу\b/u",
8 => "/\bманфиатчуи\b/u",
9 => "/\bманфиатхох\b/u",
10 => "/\bманфи\b/u",
...........................
1000000 => "/\bмусби\b/u"
)
$replacements =
array
(
0 => "мувосокорӣ",
1 => "мунағғас",
2 => "манғит",
3 => "манғития",
4 => "мунфаҷир",
5 => "мунфаҷира",
6 => "манфиатпарастӣ",
7 => "манфиатҷӯ",
8 => "манфиатҷӯӣ",
9 => "манфиатхоҳ",
10 => "манфӣ",
.....................
1000000 => "мусбӣ"
);
$text = "мувосокори мунаггас мангит мангития мунфачир манфиатпарасти...";
$result = preg_replace($patterns, $replacements, $text);
I use this javascript function in index.html file:
<script>
function response(str) {
if (str.length == 0) {
document.getElementById("text").innerHTML = "";
return;
} else {
var xmlhttp = new XMLHttpRequest();
xmlhttp.onreadystatechange = function() {
if (this.readyState == 4 && this.status == 200) {
document.getElementById("text").innerHTML = this.responseText;
}
};
xmlhttp.open("GET", "response.php?request=" + str, true);
xmlhttp.send();
}
}
</script>
PHP file response.php source:
<?php
$patterns = array();
$replacements = array();
$request = $_REQUEST["request"];
$response = "";
if ($request !== "") {
$start = microtime(true);
$response = preg_replace($patterns, $replacements, $request);
$stop = microtime(true);
$time_replace = $stop - $start;
}
echo $response === "" ? "" : $response."<br>Time: $time_replace";
?>
The time complexity of your algorithm is roughly O(nm) where n is the number of words in your replacement array, and m the number of words in the request.
Since all the patterns seem to look for words (\b before and after), and don't use any other regular expression syntax (only literal characters) you will get better performance by splitting the request into words and looking them up in an associative array, without the need to use regular expressions at all.
So define your pattern/replacement data as an associative array, like this:
$dict = array(
"мувосокори" => "мувосокорӣ",
"мунаггас" => "мунағғас",
"мангит" => "манғит",
"мангития" => "манғития",
"мунфачир" => "мунфаҷир",
"мунфачира" => "мунфаҷира",
"манфиатпарасти" => "манфиатпарастӣ",
"манфиатчу" => "манфиатҷӯ",
"манфиатчуи" => "манфиатҷӯӣ",
"манфиатхох" => "манфиатхоҳ",
"манфи" => "манфӣ",
...........................
"мусби" => "мусбӣ"
);
Then use preg_replace_callback to find each word in the request and look it up in the above dictionary:
$response = preg_replace_callback("/\pL+/u", function ($m) use ($dict) {
return isset($dict[$m[0]]) ? $dict[$m[0]] : $m[0];
}, $request);
The time complexity will be linear to the number of words in the request.
Handling upper/lower case
If you need to match also any variation in the capitalisation of the words, then it would be too much to store any such variation in the dictionary. Instead, you would keep the dictionary in all lowercase letters, and then use the code below. When there is a match with the dictionary, it checks what the capitalisation is of the original word, and applies the same to the replacement word:
$response = preg_replace_callback("/\pL+/u", function ($m) use ($dict) {
$word = mb_strtolower($m[0]);
if (isset($dict[$word])) {
$repl = $dict[$word];
// Check for some common ways of upper/lower case
// 1. all lower case
if ($word === $m[0]) return $repl;
// 2. all upper case
if (mb_strtoupper($word) === $m[0]) return mb_strtoupper($repl);
// 3. Only first letters are upper case
if (mb_convert_case($word, MB_CASE_TITLE) === $m[0]) return mb_convert_case($repl, MB_CASE_TITLE);
// Otherwise: check each character whether it should be upper or lower case
for ($i = 0, $len = mb_strlen($word); $i < $len; ++$i) {
$mixed[] = mb_substr($word, $i, 1) === mb_substr($m[0], $i, 1)
? mb_substr($repl, $i, 1)
: mb_strtoupper(mb_substr($repl, $i, 1));
}
return implode("", $mixed);
}
return $m[0]; // Nothing changes
}, $request);
See it run on repl.it
Converting your existing arrays
You can use a little piece of code to transform your current $patterns and $replacements arrays to the new data structure, to avoid you have to do it "manually":
foreach ($patterns as $i => $pattern) {
$dict[explode("\b", $pattern)[1]] = $replacements[$i];
}
Of course, you should not include this conversion in your code, but just run it once to produce the new array structure and then put that array literal in your code.
I know many of the users have asked this type of question but I am stuck in an odd situation.
I am trying a logic where multiple occurance of a specific pattern having unique identifier will be replaced with some conditional database content if there match is found.
My regex pattern is
'/{code#(\d+)}/'
where the 'd+' will be my unique identifier of the above mentioned pattern.
My Php code is:
<?php
$text="The old version is {code#1}, The new version is {code#2}, The stable version is {code#3}";
$newsld=preg_match_all('/{code#(\d+)}/',$text,$arr);
$data = array("first Replace","Second Replace", "Third Replace");
echo $data=str_replace($arr[0], $data, $text);
?>
This works but it is not at all dynamic, the numbers after #tag from pattern are ids i.e 1,2 & 3 and their respective data is stored in database.
how could I access the content from DB of respective ID mentioned in the pattern and would replace the entire pattern with respective content.
I am really not getting a way of it. Thank you in advance
It's not that difficult if you think about it. I'll be using PDO with prepared statements. So let's set it up:
$db = new PDO( // New PDO object
'mysql:host=localhost;dbname=projectn;charset=utf8', // Important: utf8 all the way through
'username',
'password',
array(
PDO::ATTR_EMULATE_PREPARES => false, // Turn off prepare emulation
PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION
)
);
This is the most basic setup for our DB. Check out this thread to learn more about emulated prepared statements and this external link to get started with PDO.
We got our input from somewhere, for the sake of simplicity we'll define it:
$text = 'The old version is {code#1}, The new version is {code#2}, The stable version {code#3}';
Now there are several ways to achieve our goal. I'll show you two:
1. Using preg_replace_callback():
$output = preg_replace_callback('/{code#(\d+)}/', function($m) use($db) {
$stmt = $db->prepare('SELECT `content` FROM `footable` WHERE `id`=?');
$stmt->execute(array($m[1]));
$row = $stmt->fetch(PDO::FETCH_ASSOC);
if($row === false){
return $m[0]; // Default value is the code we captured if there's no match in de DB
}else{
return $row['content'];
}
}, $text);
echo $output;
Note how we use use() to get $db inside the scope of the anonymous function. global is evil
Now the downside is that this code is going to query the database for every single code it encounters to replace it. The advantage would be setting a default value in case there's no match in the database. If you don't have that many codes to replace, I would go for this solution.
2. Using preg_match_all():
if(preg_match_all('/{code#(\d+)}/', $text, $m)){
$codes = $m[1]; // For sanity/tracking purposes
$inQuery = implode(',', array_fill(0, count($codes), '?')); // Nice common trick: https://stackoverflow.com/a/10722827
$stmt = $db->prepare('SELECT `content` FROM `footable` WHERE `id` IN(' . $inQuery . ')');
$stmt->execute($codes);
$rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
$contents = array_map(function($v){
return $v['content'];
}, $rows); // Get the content in a nice (numbered) array
$patterns = array_fill(0, count($codes), '/{code#(\d+)}/'); // Create an array of the same pattern N times (N = the amount of codes we have)
$text = preg_replace($patterns, $contents, $text, 1); // Do not forget to limit a replace to 1 (for each code)
echo $text;
}else{
echo 'no match';
}
The problem with the code above is that it replaces the code with an empty value if there's no match in the database. This could also shift up the values and thus could result in a shifted replacement. Example (code#2 doesn't exist in db):
Input: foo {code#1}, bar {code#2}, baz {code#3}
Output: foo AAA, bar CCC, baz
Expected output: foo AAA, bar , baz CCC
The preg_replace_callback() works as expected. Maybe you could think of a hybrid solution. I'll let that as a homework for you :)
Here is another variant on how to solve the problem: As access to the database is most expensive, I would choose a design that allows you to query the database once for all codes used.
The text you've got could be represented with various segments, that is any combination of <TEXT> and <CODE> tokens:
The old version is {code#1}, The new version is {code#2}, ...
<TEXT_____________><CODE__><TEXT_______________><CODE__><TEXT_ ...
Tokenizing your string buffer into such a sequence allows you to obtain the codes used in the document and index which segments a code relates to.
You can then fetch the replacements for each code and then replace all segments of that code with the replacement.
Let's set this up and defined the input text, your pattern and the token-types:
$input = <<<BUFFER
The old version is {code#1}, The new version is {code#2}, The stable version is {code#3}
BUFFER;
$regex = '/{code#(\d+)}/';
const TOKEN_TEXT = 1;
const TOKEN_CODE = 2;
Next is the part to put the input apart into the tokens, I use two arrays for that. One is to store the type of the token ($tokens; text or code) and the other array contains the string data ($segments). The input is copied into a buffer and the buffer is consumed until it is empty:
$tokens = [];
$segments = [];
$buffer = $input;
while (preg_match($regex, $buffer, $matches, PREG_OFFSET_CAPTURE, 0)) {
if ($matches[0][1]) {
$tokens[] = TOKEN_TEXT;
$segments[] = substr($buffer, 0, $matches[0][1]);
}
$tokens[] = TOKEN_CODE;
$segments[] = $matches[0][0];
$buffer = substr($buffer, $matches[0][1] + strlen($matches[0][0]));
}
if (strlen($buffer)) {
$tokens[] = TOKEN_TEXT;
$segments[] = $buffer;
$buffer = "";
}
Now all the input has been processed and is turned into tokens and segments.
Now this "token-stream" can be used to obtain all codes used. Additionally all code-tokens are indexed so that with the number of the code it's possible to say which segments need to be replaced. The indexing is done in the $patterns array:
$patterns = [];
foreach ($tokens as $index => $token) {
if ($token !== TOKEN_CODE) {
continue;
}
preg_match($regex, $segments[$index], $matches);
$code = (int)$matches[1];
$patterns[$code][] = $index;
}
Now as all codes have been obtained from the string, a database query could be formulated to obtain the replacement values. I mock that functionality by creating a result array of rows. That should do it for the example. Technically you'll fire a a SELECT ... FROM ... WHERE code IN (12, 44, ...) query that allows to fetch all results at once. I fake this by calculating a result:
$result = [];
foreach (array_keys($patterns) as $code) {
$result[] = [
'id' => $code,
'text' => sprintf('v%d.%d.%d%s', $code * 2 % 5 + $code % 2, 7 - 2 * $code % 5, 13 + $code, $code === 3 ? '' : '-beta'),
];
}
Then it's only left to process the database result and replace those segments the result has codes for:
foreach ($result as $row) {
foreach ($patterns[$row['id']] as $index) {
$segments[$index] = $row['text'];
}
}
And then do the output:
echo implode("", $segments);
And that's it then. The output for this example:
The old version is v3.5.14-beta, The new version is v4.3.15-beta, The stable version is v2.6.16
The whole example in full:
<?php
/**
* Simultaneous Preg_replace operation in php and regex
*
* #link http://stackoverflow.com/a/29474371/367456
*/
$input = <<<BUFFER
The old version is {code#1}, The new version is {code#2}, The stable version is {code#3}
BUFFER;
$regex = '/{code#(\d+)}/';
const TOKEN_TEXT = 1;
const TOKEN_CODE = 2;
// convert the input into a stream of tokens - normal text or fields for replacement
$tokens = [];
$segments = [];
$buffer = $input;
while (preg_match($regex, $buffer, $matches, PREG_OFFSET_CAPTURE, 0)) {
if ($matches[0][1]) {
$tokens[] = TOKEN_TEXT;
$segments[] = substr($buffer, 0, $matches[0][1]);
}
$tokens[] = TOKEN_CODE;
$segments[] = $matches[0][0];
$buffer = substr($buffer, $matches[0][1] + strlen($matches[0][0]));
}
if (strlen($buffer)) {
$tokens[] = TOKEN_TEXT;
$segments[] = $buffer;
$buffer = "";
}
// index which tokens represent which codes
$patterns = [];
foreach ($tokens as $index => $token) {
if ($token !== TOKEN_CODE) {
continue;
}
preg_match($regex, $segments[$index], $matches);
$code = (int)$matches[1];
$patterns[$code][] = $index;
}
// lookup all codes in a database at once (simulated)
// SELECT id, text FROM replacements_table WHERE id IN (array_keys($patterns))
$result = [];
foreach (array_keys($patterns) as $code) {
$result[] = [
'id' => $code,
'text' => sprintf('v%d.%d.%d%s', $code * 2 % 5 + $code % 2, 7 - 2 * $code % 5, 13 + $code, $code === 3 ? '' : '-beta'),
];
}
// process the database result
foreach ($result as $row) {
foreach ($patterns[$row['id']] as $index) {
$segments[$index] = $row['text'];
}
}
// output the replacement result
echo implode("", $segments);
Problem:
I'm looking for a PHP function to easily and efficiently normalise CSV content in a string (not in a file). I have made a function for that. I provide it in an answer, because it is a possible solution. Unfortuanately it doesn't work when the separator is included in incomming string values.
Can anyone provide a better solution?
Why not using fputcsv / fgetcsv ?
Because:
it requires at least PHP 5.1.0 (which is sometimes not available)
it can only read from files, but not from a string. even though, sometimes the input is not a file (eg. if you fetch the CSV from an email)
putting the content into a temporary file might be unavailable due to security policies.
Why / what kind of normalisation?
Normalise in a way, that the encloser encloses every field. Because the encloser can be optional and different per line and per field. This can happen if one is implementing unclean/incomplete specifications and/or using CSV content from different sources/programs/developers.
Example function call:
$csvContent = "'a a',\"b\",c,1, 2 ,3 \n a a,'bb',cc, 1, 2, 3 ";
echo "BEFORE:\n$csvContent\n";
normaliseCSV($csvContent);
echo "AFTER:\n$csvContent\n";
Output:
BEFORE:
'a a',"b",c,1, 2 ,3
a a,'bb',cc, 1, 2, 3
AFTER:
"a a","b","c","1","2","3"
"a a","bb","cc","1","2","3"
To specifically address your concern regarding f*csv working only with files:
Since PHP 5.3 there's str_getcsv.
For at least PHP >= 5.1 (and I really hope that's the oldest you'll have to deal with these days), you can use stream wrappers:
$buffer = fopen('php://memory', 'r+');
fwrite($buffer, $string);
rewind($buffer);
fgetcsv($buffer) ..
Or obviously the reverse if you want to use fputcsv.
This is a possible solution. But it doesn't consider the case that the separator (,) might be included in incoming strings.
function normaliseCSV(&$csv,$lineseperator = "\n", $fieldseperator = ',', $encloser = '"')
{
$csvArray = explode ($lineseperator,$csv);
foreach ($csvArray as &$line)
{
$lineArray = explode ($fieldseperator,$line);
foreach ($lineArray as &$field)
{
$field = $encloser.trim($field,"\0\t\n\x0B\r \"'").$encloser;
}
$line = implode ($fieldseperator,$lineArray);
}
$csv = implode ($lineseperator,$csvArray);
}
It is a simple chain of explode -> explode -> trim -> implode -> implode .
Although I agree with #deceze that you could expect atleast 5.1 these days, i'm sure there are some internal company servers somewhere who don't want to update.
I altered your method to be able to use field and line separators between double quotes, or in your case the $encloser value.
<?php
/*
In regards to the specs on http://tools.ietf.org/html/rfc4180 I use the following rules:
- "Fields containing line breaks (CRLF), double quotes, and commas should be enclosed in double-quotes."
- "If double-quotes are used to enclose fields, then a double-quote appearing inside a field must be escaped by preceding it with another double quote."
Exception:
Even though the specs says use double quotes, I 'm using your $encloser variable
*/
echo normaliseCSV('a,b,\'c\',"d,e","f","g""h""i","""j"""' . "\n" . "\"k\nl\nm\"");
function normaliseCSV($csv,$lineseperator = "\n", $fieldseperator = ',', $encloser = '"')
{
//We need 4 temporary replacement values
//line seperator, fieldseperator, double qoutes, triple qoutes
$keys = array();
while (count($keys)<3) {
$tmp = "##".md5(rand().rand().microtime())."##";
if (strpos($csv, $tmp)===false) {
$keys[] = $tmp;
}
}
//first we exchange "" (double $encloser) and """ to make sure its not exploded
$csv = str_replace($encloser.$encloser.$encloser, $keys[0], $csv);
$csv = str_replace($encloser.$encloser, $keys[0], $csv);
//Explode on $encloser
//Every odd index is within quotes
//Exchange line and field seperators for something not used.
$content = explode($encloser,$csv);
$len = count($content);
if ($len>1) {
for ($x=1;$x<$len;$x=$x+2) {
$content[$x] = str_replace($lineseperator,$keys[1], $content[$x]);
$content[$x] = str_replace($fieldseperator,$keys[2], $content[$x]);
}
}
$csv = implode('',$content);
$csvArray = explode ($lineseperator,$csv);
foreach ($csvArray as &$line)
{
$lineArray = explode ($fieldseperator,$line);
foreach ($lineArray as &$field)
{
$val = trim($field,"\0\t\n\x0B\r '");
//put back the exchanged values
$val = str_replace($keys[0],$encloser.$encloser,$val);
$val = str_replace($keys[1],$lineseperator,$val);
$val = str_replace($keys[2],$fieldseperator,$val);
$val = $encloser.$val.$encloser;
$field = $val;
}
$line = implode ($fieldseperator,$lineArray);
}
$csv = implode ($lineseperator,$csvArray);
return $csv;
}
?>
Output would be:
"a","b","c","d,e","f","g""h""i","""j"""
"k
l
m"
Codepad example
when i first read this question wasn´t sure if it should be solved or not, since <5.1 environments should be extinguished a long time ago, dispite of that is a hell of a question how to solve this so we should be thinking wich approach to take... and my guess is it should be char by char examination.
I have separated logic in three main scenarios:
A: CHAR is a separator
B: CHAR is a Fuc$€/& quotation
C: CHAR is a Value
Obtaining as a reulst this weapon class (including log for it) for our arsenal:
<?php
Class CSVParser
{
#basic requirements
public $input;
public $separator;
public $currentQuote;
public $insideQuote;
public $result;
public $field;
public $quotation = array();
public $parsedArray = array();
# for logging purposes only
public $logging = TRUE;
public $log = array();
function __construct($input, $separator, $quotation=array())
{
$this->separator = $separator;
$this->input = $input;
$this->quotation = $quotation;
}
/**
* The main idea is to go through the string to parse char by char to analize
* when a complete field is detected it´ll be quoted according and added to an array
*/
public function parse()
{
for($i = 0; $i < strlen($this->input); $i++){
$this->processStream($i);
}
foreach($this->parsedArray as $value)
{
if(!is_null($value))
$this->result .= '"'.addslashes($value).'",';
}
return rtrim($this->result, ',');
}
private function processStream($i)
{
#A case (its a separator)
if($this->input[$i]===$this->separator){
$this->log("A", $this->input[$i]);
if($this->insideQuote){
$this->field .= $this->input[$i];
}else
{
$this->saveField($this->field);
$this->field = NULL;
}
}
#B case (its a f"·%$% quote)
if(in_array($this->input[$i], $this->quotation)){
$this->log("B", $this->input[$i]);
if(!$this->insideQuote){
$this->insideQuote = TRUE;
$this->currentQuote = $this->input[$i];
}
else{
if($this->currentQuote===$this->input[$i]){
$this->insideQuote = FALSE;
$this->currentQuote ='';
$this->saveField($this->field);
$this->field = NULL;
}else{
$this->field .= $this->input[$i];
}
}
}
#C case (its a value :-) )
if(!in_array($this->input[$i], array_merge(array($this->separator), $this->quotation))){
$this->log("C", $this->input[$i]);
$this->field .= $this->input[$i];
}
}
private function saveField($field)
{
$this->parsedArray[] = $field;
}
private function log($type, $value)
{
if($this->logging){
$this->log[] = "CASE ".$type." WITH ".$value." AS VALUE";
}
}
}
and example of how to use it would be:
$original = 'a,"ab",\'ab\'';
$test = new CSVParser($original, ',', array('"', "'"));
echo "<PRE>ORIGINAL: ".$original."</PRE>";
echo "<PRE>PARSED: ".$test->parse()."</PRE>";
echo "<pre>";
print_r($test->log);
echo "</pre>";
and here are the results:
ORIGINAL: a,"ab",'ab'
PARSED: "a","ab","ab"
Array
(
[0] => CASE C WITH a AS VALUE
[1] => CASE A WITH , AS VALUE
[2] => CASE B WITH " AS VALUE
[3] => CASE C WITH a AS VALUE
[4] => CASE C WITH b AS VALUE
[5] => CASE B WITH " AS VALUE
[6] => CASE A WITH , AS VALUE
[7] => CASE B WITH ' AS VALUE
[8] => CASE C WITH a AS VALUE
[9] => CASE C WITH b AS VALUE
[10] => CASE B WITH ' AS VALUE
)
I might have mistakes since i only dedicated 25 mins to it, so any comment will be appreciated an edited.
Two days ago I started working on a code parser and I'm stuck.
How can I split a string by commas that are not inside brackets, let me show you what I mean:
I have this string to parse:
one, two, three, (four, (five, six), (ten)), seven
I would like to get this result:
array(
"one";
"two";
"three";
"(four, (five, six), (ten))";
"seven"
)
but instead I get:
array(
"one";
"two";
"three";
"(four";
"(five";
"six)";
"(ten))";
"seven"
)
How can I do this in PHP RegEx.
Thank you in advance !
You can do that easier:
preg_match_all('/[^(,\s]+|\([^)]+\)/', $str, $matches)
But it would be better if you use a real parser. Maybe something like this:
$str = 'one, two, three, (four, (five, six), (ten)), seven';
$buffer = '';
$stack = array();
$depth = 0;
$len = strlen($str);
for ($i=0; $i<$len; $i++) {
$char = $str[$i];
switch ($char) {
case '(':
$depth++;
break;
case ',':
if (!$depth) {
if ($buffer !== '') {
$stack[] = $buffer;
$buffer = '';
}
continue 2;
}
break;
case ' ':
if (!$depth) {
continue 2;
}
break;
case ')':
if ($depth) {
$depth--;
} else {
$stack[] = $buffer.$char;
$buffer = '';
continue 2;
}
break;
}
$buffer .= $char;
}
if ($buffer !== '') {
$stack[] = $buffer;
}
var_dump($stack);
Hm... OK already marked as answered, but since you asked for an easy solution I will try nevertheless:
$test = "one, two, three, , , ,(four, five, six), seven, (eight, nine)";
$split = "/([(].*?[)])|(\w)+/";
preg_match_all($split, $test, $out);
print_r($out[0]);
Output
Array
(
[0] => one
[1] => two
[2] => three
[3] => (four, five, six)
[4] => seven
[5] => (eight, nine)
)
You can't, directly. You'd need, at minimum, variable-width lookbehind, and last I knew PHP's PCRE only has fixed-width lookbehind.
My first recommendation would be to first extract parenthesized expressions from the string. I don't know anything about your actual problem, though, so I don't know if that will be feasible.
I can't think of a way to do it using a single regex, but it's quite easy to hack together something that works:
function process($data)
{
$entries = array();
$filteredData = $data;
if (preg_match_all("/\(([^)]*)\)/", $data, $matches)) {
$entries = $matches[0];
$filteredData = preg_replace("/\(([^)]*)\)/", "-placeholder-", $data);
}
$arr = array_map("trim", explode(",", $filteredData));
if (!$entries) {
return $arr;
}
$j = 0;
foreach ($arr as $i => $entry) {
if ($entry != "-placeholder-") {
continue;
}
$arr[$i] = $entries[$j];
$j++;
}
return $arr;
}
If you invoke it like this:
$data = "one, two, three, (four, five, six), seven, (eight, nine)";
print_r(process($data));
It outputs:
Array
(
[0] => one
[1] => two
[2] => three
[3] => (four, five, six)
[4] => seven
[5] => (eight, nine)
)
Clumsy, but it does the job...
<?php
function split_by_commas($string) {
preg_match_all("/\(.+?\)/", $string, $result);
$problem_children = $result[0];
$i = 0;
$temp = array();
foreach ($problem_children as $submatch) {
$marker = '__'.$i++.'__';
$temp[$marker] = $submatch;
$string = str_replace($submatch, $marker, $string);
}
$result = explode(",", $string);
foreach ($result as $key => $item) {
$item = trim($item);
$result[$key] = isset($temp[$item])?$temp[$item]:$item;
}
return $result;
}
$test = "one, two, three, (four, five, six), seven, (eight, nine), ten";
print_r(split_by_commas($test));
?>
I feel that its worth noting, that you should always avoid regular expressions when you possibly can. To that end, you should know that for PHP 5.3+ you could use str_getcsv(). However, if you're working with files (or file streams), such as CSV files, then the function fgetcsv() might be what you need, and its been available since PHP4.
Lastly, I'm surprised nobody used preg_split(), or did it not work as needed?
Maybe a bit late but I've made a solution without regex which also supports nesting inside brackets. Anyone let me know what you guys think:
$str = "Some text, Some other text with ((95,3%) MSC)";
$arr = explode(",",$str);
$parts = [];
$currentPart = "";
$bracketsOpened = 0;
foreach ($arr as $part){
$currentPart .= ($bracketsOpened > 0 ? ',' : '').$part;
if (stristr($part,"(")){
$bracketsOpened ++;
}
if (stristr($part,")")){
$bracketsOpened --;
}
if (!$bracketsOpened){
$parts[] = $currentPart;
$currentPart = '';
}
}
Gives me the output:
Array
(
[0] => Some text
[1] => Some other text with ((95,3%) MSC)
)
I am afraid that it could be very difficult to parse nested brackets like
one, two, (three, (four, five))
only with RegExp.