I’m trying to modify a string of the following form where each field is delimited by a tab except for the first which is followed by two or more tabs.
"$str1 $str2 $str3 $str4 $str5 $str6"
The modified string should have each field wrapped in HTML table tags, each on its own indented line, like so:
"<tr>
<td class="title">$str1</td>
<td sorttable_customkey="$str2"></td>
<td sorttable_customkey="$str3"></td>
<td sorttable_customkey="$str4"></td>
<td sorttable_customkey="$str5"></td>
<td sorttable_customkey="$str6"></td>
</tr>
"
I tried using code like the following to do it.
$patterns = array();
$patterns[0]='/^/';
$patterns[1]='/\t\t+/';
$patterns[2]='/\t/';
$patterns[3]='/$/';
$replacements = array();
$replacements[0]='\t\t<tr>\r\n\t\t\t<td class="title">';
$replacements[1]='</td>\r\n\t\t\t<td sorttable_customkey="';
$replacements[2]='"></td>\r\n\t\t\t<td sorttable_customkey="';
$replacements[3]='"></td>\r\n\t\t</tr>\r\n';
for ($i=0; $i<count($lines); $i++) {
$lines[$i] = preg_replace($patterns, $replacements, $lines[$i]);
}
The problem is that the escaped characters (tabs and newlines) in the replacement array remain escaped in the destination string and I get the following string.
"\t\t<tr>\r\n\t\t\t<td class="title">$str</td>\r\n\t\t\t<td sorttable_customkey="$str2"></td>\r\n\t\t\t<td sorttable_customkey="$str3"></td>\r\n\t\t\t<td sorttable_customkey="$str4"></td>\r\n\t\t\t<td sorttable_customkey="$str5"></td>\r\n\t\t\t<td sorttable_customkey="$str6"></td>\r\n\t\t</tr>\r\n"
Strangely, this line I tried earlier on does work:
$data=preg_replace("/\t+/", "\t", $data);
Am I missing something? Any idea how to fix it?
You need double quotes or heredocs for the replacement string - PCRE only parses those escape characters in the search string.
In your working example preg_replace("/\t+/", "\t", $data) those are both literal tab characters because they're in double quotes.
If you changed it to preg_replace('/\t+/', '\t', $data) you can observe your main problem - PCRE understands that the \t in the search string represents a tab, but doesn't for the one in the replacement string.
So by using double quotes for the replacement, e.g. preg_replace('/\t+/', "\t", $data), you let PHP parse the \t and you get the expected result.
It is slightly incongruous, just something to remember.
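A minimal sketch of the difference, using a made-up `$data` value (not from the question), which compares the lengths of the two results:

```php
<?php
$data = "a\t\t\tb";   // three real tabs between a and b

// Single-quoted replacement: PHP passes backslash + t through unchanged,
// and preg_replace() does not treat \t in a replacement as a tab.
$single = preg_replace('/\t+/', '\t', $data);

// Double-quoted replacement: PHP has already turned \t into a real tab.
$double = preg_replace('/\t+/', "\t", $data);

var_dump(strlen($single)); // int(4): a, backslash, t, b
var_dump(strlen($double)); // int(3): a, tab, b
```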
Your $replacements array has all its strings declared as single-quoted strings.
That means that escape sequences such as \t and \n won't be interpreted (only \' and \\ are).
It is not related directly to PCRE regular expressions, but to how PHP handles strings.
Basically you can type strings like these:
<?php # String test
$value = "substitution";
$str1 = 'this is a $value that does not get substituted';
$str2 = "this is a $value that does not remember the variable"; # this is a substitution that does not remember the variable
$str3 = "you can also type \$value = $value"; # you can also type $value = substitution
$bigstr =<<< MARKER
you can type
very long stuff here
provided you end it with the single
value MARKER you had put earlier in the beginning of a line
just like this:
MARKER;
tl;dr version: the problem is the single quotes in $replacements, which should be double quotes. $patterns can stay single-quoted, because PCRE interprets \t itself.
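One caveat when the replacements become double-quoted: preg_replace() applies an array of patterns one after another, so the real tabs inserted by the first replacement would then be matched by the later tab patterns. The sketch below swaps in a different technique that avoids this: it splits the line with preg_split() and builds the row by concatenation. The helper name `line_to_row` and the sample line are hypothetical, and it assumes no field is empty.

```php
<?php
// Hypothetical helper (not from the original posts): builds one table row
// from a tab-delimited line without running regexes over generated markup.
function line_to_row($line) {
    // Split on runs of tabs, so the double tab after the first field is handled too.
    $fields = preg_split('/\t+/', rtrim($line, "\r\n"));
    $title  = array_shift($fields);

    // Double quotes make \t, \r and \n real control characters.
    $row = "\t\t<tr>\r\n\t\t\t<td class=\"title\">" . $title . "</td>\r\n";
    foreach ($fields as $field) {
        $row .= "\t\t\t<td sorttable_customkey=\"" . $field . "\"></td>\r\n";
    }
    return $row . "\t\t</tr>\r\n";
}

$row = line_to_row("Name\t\t10\t20");
echo $row;
```

If the field values can contain characters such as < or ", passing them through htmlspecialchars() before concatenation would be a sensible addition.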
Related
When running a function which returns a string, I end up with backslashes before quotation marks, like this:
$string = get_string();
// returns: Example
I suspect some type of escaping is happening somewhere. I know I can string-replace the backslash, but I suppose there is some kind of unescape function to run in these cases?
You only need to escape quotes when it matches your starting/ending delimiter. This code should work properly:
$string = 'Example';
If your string is enclosed in single quotes ', then " doesn't need to be escaped. Likewise, the opposite is true.
Avoid using stripslashes(), as it strips every escaping backslash and could cause issues if the string needs to contain backslashes of its own. A simple find/replace should work for you:
$string = 'Example';
$string = str_replace('\"', '"', $string);
echo $string; //echos Example
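To illustrate the trade-off between the two approaches, here is a small sketch with a made-up input string (an assumption, not the asker's data) that contains both escaped quotes and a backslash that belongs to the data:

```php
<?php
// Hypothetical input: escaped quotes plus a backslash that is part of the data.
$string = 'say \"hi\" from C:\temp';

// stripslashes() removes every escaping backslash, so the path is damaged.
$stripped = stripslashes($string);
echo $stripped, "\n";   // say "hi" from C:temp

// A targeted replace only touches backslash + double quote.
$replaced = str_replace('\"', '"', $string);
echo $replaced, "\n";   // say "hi" from C:\temp
```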
<?php
$string = 'Example';
echo stripslashes($string);
?>
I'm trying to remove all quote characters from a string but not those that are escaped.
Example:
#TEST string "quoted part\" which escapes" other "quoted string"
Should result in:
#TEST string quoted part\" which escapes other quoted string
I tried to achieve this using
$string = '#TEST string "quoted part\" which escapes" other "quoted string"';
preg_replace("/(?>=\\)([\"])/","", $string);
But I can't seem to find a matching pattern.
Any help or tips on another approach?
A very good example for (*SKIP)(*FAIL):
\\['"](*SKIP)(*FAIL)|["']
Replace this with an empty string and you're fine. See a demo on regex101.com.
In PHP this would be (you need to escape the backslash as well):
<?php
$string = <<<DATA
#TEST string "quoted part\" witch escape" other "quoted string"
DATA;
$regex = '~\\\\[\'"](*SKIP)(*FAIL)|["\']~';
$string = preg_replace($regex, '', $string);
echo $string;
?>
See a demo on ideone.com.
While (*SKIP)(*F) is a good technique all in all, it seems you may use a mere negative lookbehind in this case, where no other escape entities may appear but escaped quotes:
preg_replace("/(?<!\\\\)[\"']/","", $string);
See the regex demo.
Here, the regex matches...
(?<!\\\\) - a position inside the string that is not immediately preceded with a literal backslash (note that in PHP string literals, you need two backslashes to define a literal backslash, and to match a literal backslash with a regex pattern, the literal backslash in the string literal must be doubled since the backslash is a special regex metacharacter)
[\"'] - a double or single quote.
PHP demo:
$str = '#TEST string "quoted part\\" witch escape" other "quoted string"';
$res = preg_replace('/(?<!\\\\)[\'"]/', '', $str);
echo $res;
// => #TEST string quoted part\" witch escape other quoted string
In case backslashes may also be escaped in the input, you need to make sure you do not match a " that comes after two \\ (since in that case, a " is not escaped):
preg_replace("/(?<!\\\\)((?:\\\\{2})*)[\"']/",'$1', $string);
The ((?:\\\\{2})*) part will capture paired \s before " or ' and will put them back with the help of the $1 backreference.
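A short sketch that exercises the pattern above on three made-up inputs (assumptions for illustration): a plain quote, an escaped quote, and a quote preceded by an escaped backslash. The regex is written in single quotes here, so \\\\ still reaches PCRE as \\, one literal backslash.

```php
<?php
// Single-quoted PHP: \\\\ becomes \\ in the regex, i.e. one literal backslash.
$regex = '/(?<!\\\\)((?:\\\\{2})*)["\']/';

// Plain quotes are removed.
$plain = preg_replace($regex, '$1', 'say "hi"');

// A quote preceded by a single backslash is escaped, so it is kept.
$escaped = preg_replace($regex, '$1', 'a\"b');

// Two backslashes escape each other, so the quote after them is removed
// and the pair of backslashes is put back via $1.
$paired = preg_replace($regex, '$1', 'a\\\\"b');

echo $plain, "\n", $escaped, "\n", $paired, "\n";
```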
Maybe this:
$str = '#TEST string "quoted part\" witch escape" other "quoted string"';
echo preg_replace("#([^\\\])\"#", "$1", $str);
To get a double-quoted string (which I cannot change) correctly parsed, I have to do the following:
$string = '15 Rose Avenue\n Irlam\n Manchester';
$string = str_replace('\n', "\n", $string);
print nl2br($string); // demonstrates that the \n's are now linebreak characters
So far, so good.
But in my given string there are characters like \xC3\xA4. There are many characters like this (beginning with \x..)
How can I get them correctly parsed as shown above with the linebreak?
You can use
$str = stripcslashes($str);
You can escape a \ in single quotes:
$string = str_replace('\\n', "\n", $string);
But you're going to need a lot of replacements if you also have to handle \\xC3 and the like. It is better to use preg_replace_callback() with a callback function that translates them to bytes.
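A minimal sketch of the callback approach, assuming the input only uses two-digit \xNN sequences plus \n. The sample string is made up for illustration.

```php
<?php
// Single quotes: \x and \n stay as literal backslash sequences here.
$string = 'K\xC3\xA4se\nBrot';

// Replace each literal \xNN with the single byte it names.
$string = preg_replace_callback(
    '/\\\\x([0-9A-Fa-f]{2})/',
    function ($m) {
        return chr(hexdec($m[1]));
    },
    $string
);

// Then turn the literal \n into a real newline, as in the question.
$string = str_replace('\n', "\n", $string);

echo $string;
```

The two bytes 0xC3 0xA4 form the UTF-8 encoding of "ä", so the result displays correctly only if the output is treated as UTF-8.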
I want to make a CSV file importer on my website. I want the user to choose the delimiter.
The problem is that when the form is submitted, the delimiter field is stored as '\t', for example, so when I'm parsing the file, I search for the string '\t' instead of a real TAB. The same thing happens with every special character, like \r, \n, etc.
I want to know the way or the function to use to convert these characters to their true representation without using an array like:
't' => "\t"
'r' => "\r"
...
You should probably decide which special characters you will allow and create a function like this one:
function translate_quoted($string) {
$search = array("\\t", "\\n", "\\r");
$replace = array( "\t", "\n", "\r");
return str_replace($search, $replace, $string);
}
echo str_replace("\\t", "\t", $string);
View an example here: http://ideone.com/IVFZk
The PHP interpreter automatically interprets escape sequences in double-quoted strings found in PHP source files, so echo "\t" actually outputs a TAB character.
On the contrary, when you read a string from any external source, the backslash assumes its literal value: a backslash and a 't'. You would express it in a PHP source as "\\t" (double quotes) or '\t' (single quotes), which is not what you want.
Sebastián's solution works, but PHP provides a native function for that.
stripcslashes() recognises C-like sequences (\a, \b, \f, \n, \r, \t and \v), as well as octal and hexadecimal representation, converting them to their actual meaning.
// C-like escape sequence
stripcslashes('\t') === "\t"; // true;
// Hexadecimal escape sequence
stripcslashes('\x09') === "\t"; // true;
// Octal escape sequence
stripcslashes('\011') === "\t"; // true;
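Applied to the delimiter case, this could look like the sketch below. The `$submitted` value stands in for the form field and is an assumption, and explode() is used in place of a full CSV parser to keep the example short.

```php
<?php
// Hypothetical form value: the user typed backslash + t into the delimiter field.
$submitted = '\t';

// Convert the two-character sequence into a real TAB.
$delimiter = stripcslashes($submitted);

// Split a sample tab-separated line with the converted delimiter.
$fields = explode($delimiter, "a\tb\tc");

print_r($fields); // three elements: a, b, c
```

Since stripcslashes() accepts any C-style sequence, it may be worth checking that the converted delimiter is one of the characters you intend to allow.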
It doesn't look like SO is preserving the tab inside the quotes, but typing a tab in any text editor and then copying it into the quotes should work.
$data = str_replace("\t", " ", $data);
I have a string like this: foo($bar1, $bar2)
How do I replace each variable with <span>$variable</span> using a regexp?
This is my try (not working):
$row['name'] = preg_replace("/\$\w+/S", "<span>$1</span>", $row['name']);
I only want the variables to be replaced and have a span around them, I don't want commas or spaces to be replaced.
What I want is to have my string foo($bar1, $bar2) to be replaced with foo(<span>$bar1</span>, <span>$bar2</span>) ($bar1 and $bar2 are not variables, it's plain text).
Here are some problems I can see:
Since you are using double quotes for the regex, you need to escape the $ with two backslashes, as \\$. Alternatively, you can use single quotes and write \$.
You are using $1 in the replacement, but there is no capturing group in the regex. Put ( ) around \$\w+.
So try:
$str = preg_replace('/(\$\w+)/', "<span>$1</span>", $str);
or
$str = preg_replace("/(\\$\w+)/", "<span>$1</span>", $str);
See it.
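For reference, a self-contained sketch using the string from the question. It also shows a variant without a capturing group, where $0 stands for the whole match. The variable names are chosen for illustration.

```php
<?php
$name = 'foo($bar1, $bar2)';   // single quotes: $bar1 stays plain text

// With a capturing group, $1 refers to the matched variable name.
$grouped = preg_replace('/(\$\w+)/', '<span>$1</span>', $name);

// Without a group, $0 refers to the whole match.
$whole = preg_replace('/\$\w+/', '<span>$0</span>', $name);

echo $grouped, "\n"; // foo(<span>$bar1</span>, <span>$bar2</span>)
```

Writing both the pattern and the replacement in single quotes sidesteps the question of how PHP itself treats $ and \ inside double quotes.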