Stop regex splitting on whitespace - php

I'm writing a parser, trying to automate a way that I can pass any argument as a param like follows:
$content = '{loop for=products showPagination="true" paginationPosition="both" wrapLoop="true" returnDefaultNoResults="true" noResultsHeading="Nothing Found" noResultsHeadingSize="2" noResultsParagraph="We have not found any products in this category, please try another."}{/loop}';
preg_match_all('/([a-zA-Z]+)=([\/\.\"a-zA-Z0-9&;,_-]+)/', str_replace('&quot;', '"', $content), $attr);
if (!is_array($attr)) return array();
for ($z = 0; $z < count($attr[1]); $z++) if (isset($attr['1'][$z])) $attrs[$attr['1'][$z]] = trim($attr['2'][$z], '"');
echo json_encode($attrs);
My issue is that my loop and regex split on whitespace, and I can't figure out how to alter them so that they don't.
I've tried adding \w into the right hand side of the = sign, but no luck.
RESULT
{"for":"products","showPagination":"true","paginationPosition":"both","wrapLoop":"true","returnDefaultNoResults":"true","noResultsHeading":"Nothing","noResultsHeadingSize":"2","noResultsParagraph":"We"}
You'll notice that the last two params both stop after the first word.

I suggest changing the preg_match_all call as shown below.
preg_match_all('/([a-zA-Z]+)=("[^"]*"|\S+)/', str_replace('&quot;', '"', $content), $attr);
It first tries to match a whole double-quoted block, which may contain spaces. If the value is not quoted, it matches one or more non-space characters instead.
Output:
{"for":"products","showPagination":"true","paginationPosition":"both","wrapLoop":"true","returnDefaultNoResults":"true","noResultsHeading":"Nothing Found","noResultsHeadingSize":"2","noResultsParagraph":"We have not found any products in this category, please try another."}
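For reference, here is a minimal, self-contained sketch of the same idea. The function name parse_tag_attributes is made up for illustration, and the sample tag is a shortened version of the one in the question; only the pattern comes from the answer above.

```php
<?php
// Hypothetical helper; the name is an assumption, the pattern is from the answer.
function parse_tag_attributes(string $content): array
{
    $attrs = [];
    // A value is either a double-quoted block or a run of non-space characters.
    preg_match_all('/([a-zA-Z]+)=("[^"]*"|\S+)/', $content, $matches, PREG_SET_ORDER);
    foreach ($matches as $match) {
        // Strip the surrounding quotes, if any, from the captured value.
        $attrs[$match[1]] = trim($match[2], '"');
    }
    return $attrs;
}

$content = '{loop for=products noResultsHeading="Nothing Found" noResultsHeadingSize="2"}{/loop}';
echo json_encode(parse_tag_attributes($content)), "\n";
// {"for":"products","noResultsHeading":"Nothing Found","noResultsHeadingSize":"2"}
```

Note that an unquoted value directly followed by } would swallow the brace, because \S+ stops only at whitespace. Quoting the last attribute, as the question does, avoids that.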

Related

preg_replace does not replace the value as required

Assume we have a PHP array $row_mid, which contains strings like 'reaction_l0', 'reaction_l1', 'reaction_r0', 'reaction_r1' (in each case the number goes from 0 to 4). These strings are enclosed in <div> tags. I want to run a loop and remove these strings with preg_replace():
$i = 0;
while ($i < count ($row_mid)){
$row_mid [$i] = preg_replace ("~^reaction_.[0-9]$~", "", $row_mid [$i]);
$i++;
}
The regexp ^reaction_.[0-9]$ was developed with the help of https://regex101.com/ and tested successfully with strings <div>reaction_r1</div> (no match, I need the tags stay where they are) and reaction_r1 (match). It doesn't work, however.
Get rid of the anchors, because they only allow the regexp to match the entire string, not when it's enclosed in tags.
$row_mid [$i] = preg_replace ("~reaction_.[0-9]~", "", $row_mid [$i]);
Just remove the two symbols ^ and $.
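A small sketch of the difference, using made-up sample data and a hypothetical function name (strip_reaction_ids). It relies on preg_replace accepting an array as the subject, so the explicit loop is not needed.

```php
<?php
// Sample rows are assumed for illustration only.
$row_mid = ['<div>reaction_l0</div>', '<div>reaction_r4</div>', 'reaction_r1'];

function strip_reaction_ids(array $rows): array
{
    // No ^ or $ anchors, so the pattern can match anywhere inside each string.
    return preg_replace('~reaction_.[0-9]~', '', $rows);
}

print_r(strip_reaction_ids($row_mid));

// With the anchors the tagged string is left untouched,
// because the whole string would have to be "reaction_xN".
var_dump(preg_replace('~^reaction_.[0-9]$~', '', '<div>reaction_r1</div>'));
```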

How to remove commas between double quotes in PHP

Hopefully, this is an easy one. I have an array with lines that contain output from a CSV file. What I need to do is simply remove any commas that appear between double-quotes.
I'm stumbling through regular expressions and having trouble. Here's my sad-looking code:
<?php
$csv_input = '"herp","derp","hey, get rid of these commas, man",1234';
$pattern = '(?<=\")/\,/(?=\")'; //this doesn't work
$revised_input = preg_replace ( $pattern , '' , $csv_input);
echo $revised_input;
//would like revised input to echo: "herp","derp","hey get rid of these commas man",1234
?>
Thanks VERY much, everyone.
Original Answer
You can use str_getcsv() for this, as it is designed specifically for processing CSV strings:
$out = array();
$array = str_getcsv($csv_input);
foreach($array as $item) {
$out[] = str_replace(',', '', $item);
}
$out is now an array of elements without any commas in them, which you can then just implode as the quotes will no longer be required once the commas are removed:
$revised_input = implode(',', $out);
Update for comments
If the quotes are important to you then you can just add them back in like so:
$revised_input = '"' . implode('","', $out) . '"';
Another option is to use one of the str_putcsv() implementations floating about on the web (it is not a standard PHP function).
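Put together, the str_getcsv() approach might look like the sketch below. The function name strip_commas_in_fields is hypothetical, and the separator, enclosure and escape arguments are passed explicitly only to spell out the usual defaults.

```php
<?php
// Hypothetical helper wrapping the steps from the answer above.
function strip_commas_in_fields(string $line, bool $requote = false): string
{
    // Parse one CSV line into its fields; the enclosing quotes are dropped.
    $fields = str_getcsv($line, ',', '"', '\\');
    // str_replace accepts an array subject and returns the cleaned array.
    $clean = str_replace(',', '', $fields);

    return $requote
        ? '"' . implode('","', $clean) . '"'
        : implode(',', $clean);
}

$csv_input = '"herp","derp","hey, get rid of these commas, man",1234';
echo strip_commas_in_fields($csv_input), "\n";
// herp,derp,hey get rid of these commas man,1234
```

With $requote set to true every field is wrapped in quotes, including the numeric one, which may or may not be what you want.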
This is a very naive approach that will work only if 'valid' commas are those that are between quotes with nothing else but maybe whitespace between.
<?php
$csv_input = '"herp","derp","hey, get rid of these commas, man",1234';
$pattern = '/([^"])\,([^"])/'; //a comma with no quote directly on either side
$revised_input = preg_replace ( $pattern , "$1$2" , $csv_input);
echo $revised_input;
//output for this is: "herp","derp","hey get rid of these commas man",1234
It should definitely be tested more, but it works in this case.
One case where it will not work is when the string has no quotes at all:
one,two,three,four -> onetwothreefour
EDIT : Corrected the issues with deleting spaces and neighboring letters.
Well, I wasn't lazy and wrote a small function to do exactly what you need:
function clean_csv_commas($csv){
$len = strlen($csv);
$inside_block = FALSE;
$out='';
for($i=0;$i<$len;$i++){
if($csv[$i]=='"'){
if($inside_block){
$inside_block=FALSE;
}else{
$inside_block=TRUE;
}
}
if($csv[$i]==',' && $inside_block){
// do nothing
}else{
$out.=$csv[$i];
}
}
return $out;
}
You might be coming at this from the wrong angle.
Instead of removing the commas from the text (presumably so you can then split the string on the commas to get the separate elements), how about writing something that works on the quotes?
Once you've found an opening quote, you can check the rest of the string; anything before the next quote is part of this element. You can add some checking here to look for escaped quotes, too, so things like:
"this is a \"quote\""
will still be read properly.
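A rough sketch of that quote-driven approach follows. The function name split_quoted_line is made up, and backslash-escaped quotes are assumed, as in the example above; real CSV files often double the quote instead, which this sketch does not handle.

```php
<?php
// Hypothetical helper: splits one line into fields by tracking quote state.
function split_quoted_line(string $line): array
{
    $fields = [];
    $current = '';
    $inQuotes = false;
    $len = strlen($line);

    for ($i = 0; $i < $len; $i++) {
        $ch = $line[$i];
        if ($ch === '\\' && $i + 1 < $len && $line[$i + 1] === '"') {
            // Escaped quote: keep a literal quote and skip the backslash.
            $current .= '"';
            $i++;
        } elseif ($ch === '"') {
            // Unescaped quote: enter or leave a quoted element.
            $inQuotes = !$inQuotes;
        } elseif ($ch === ',' && !$inQuotes) {
            // A comma outside quotes ends the current element.
            $fields[] = $current;
            $current = '';
        } else {
            $current .= $ch;
        }
    }
    $fields[] = $current;

    return $fields;
}

print_r(split_quoted_line('"herp","hey, get rid of these commas, man",1234'));
```

Commas inside quotes stay part of their element, so they can be removed per field afterwards if that is still needed.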
Not exactly the answer you were looking for, but I've used this for cleaning commas out of numbers in CSV.
$csv = preg_replace('%\"([^\"]*)(,)([^\"]*)\"%i','$1$3',$csv);
"3,120", 123, 345, 567 ==> 3120, 123, 345, 567

Regex to add line breaks before and after a string?

The following code removes comments, line breaks, and extra space from HTML and PHP files, but a problem I have is when the original file has <<<EOT; in it. What regex rule would I use to add a linebreak before and after <<<EOT; from $pre6?
//a bit messy, but this is the core of the program. removes whitespaces, line breaks, and comments. sometimes makes EOT error.
$pre1 = preg_replace('#<!--[^\[<>].*?(?<!!)-->#s', '', preg_replace('~>\s+<~', '><', trim(preg_replace('/\s\s+/', ' ', php_strip_whitespace(stripslashes(htmlspecialchars($uploadfile)))))));
$pre2 = preg_replace("/(^[\r\n]*|[\r\n]+)[\s\t]*[\r\n]+/", "\n", $pre1);
$pre3 = str_replace(array("\r\n", "\r"), "\n", $pre2);
$pre4 = explode("\r\n", $pre3);
$pre5 = array();
foreach ($pre4 as $i => $line) {
if(!empty($line))
$pre5[] = trim($line);
}
$pre6 = implode($pre5);
echo $pre6;
To match <<<EOT, you could use <{3}[A-Z]{3}, or several other patterns, depending on how strictly you want to match that exact text.
Oh, I see what you're after now. I'm not great with PHP, but in regular expressions, you can capture a named group and then refer to that group in a replacement operation. You could use the following to capture <<<EOT into a group named Capture:
(?<Capture><{3}[A-Z]{3})
I think in PHP you can refer to it using something like:
$regs['Capture']
So maybe you're after a replacement parameter value of something like:
"\r\n".$regs['Capture']."\r\n"
...if $regs was the parameter passed to the replace operation.
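In PHP this could be sketched as below. As far as I know, preg_replace replacement strings only take numbered references such as $0 or $1, so the named group is read inside a preg_replace_callback callback instead. Both function names are hypothetical, and the pattern is the one suggested above.

```php
<?php
// Variant 1: numbered reference. $0 stands for the whole match.
function break_around_heredoc(string $code): string
{
    return preg_replace('/<{3}[A-Z]{3}/', "\n" . '$0' . "\n", $code);
}

// Variant 2: named group, read from the match array passed to the callback.
function break_around_heredoc_named(string $code): string
{
    return preg_replace_callback(
        '/(?<Capture><{3}[A-Z]{3})/',
        function (array $m): string {
            return "\n" . $m['Capture'] . "\n";
        },
        $code
    );
}

echo break_around_heredoc('$html = <<<EOT some text'), "\n";
```

Both variants only handle the opening marker. The closing EOT; of a heredoc also has to sit on its own line, which would need a separate rule.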

Preg_match, Replace and back to string

Sorry, but I can't solve my problem. You know, I'm a noob.
I need to find something in a string with preg_match, then replace it with a new word using preg_replace. That part is OK, but I don't understand how to put the replaced word back into the string.
This is what I got
$text ='zda i "zda"';
preg_match('/"(\w*)"/', $text, $match);
$najit = '/zda/';
$nahradit = 'zda';
$o = '/zda/';
$a = 'if';
$ahoj = preg_replace($najit, $nahradit, $match[1]);
Please, can you help me once again?
You can use e.g. the following code utilizing negative lookarounds to accomplish what you want:
$newtext = preg_replace('/(?<!")zda|zda(?!")/', 'if', $text);
It will replace any occurrence of zda that is not enclosed in quotes on both sides (e.g. in U"Vzda"W the zda will still be replaced, because the quotes are not directly adjacent to it).
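A quick, self-contained check of that pattern against the string from the question (the function name replace_unquoted_zda is made up for illustration):

```php
<?php
// Hypothetical wrapper around the lookaround pattern from the answer.
function replace_unquoted_zda(string $text): string
{
    // Matches zda unless it has a quote directly before AND directly after it.
    return preg_replace('/(?<!")zda|zda(?!")/', 'if', $text);
}

echo replace_unquoted_zda('zda i "zda"'), "\n"; // if i "zda"
```

Because the replacement is applied to the whole string, there is no need to call preg_match first and then put the word back.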

preg_replace with arrays

My database has a table with 1000 terms and their definitions.
I want to print those definitions and add a span tag to every word that is already a term.
I use this to create the two arrays (patterns and replacements):
while($row = mysql_fetch_array($rsd)){
$patterns[$i] = '/'.$row['term'].'/';
$patterns[$i+1] = '/<span class="linkedterm"><span class="linkedterm">'.$row['term'].'</span>/';
$replacements[$i+1] = '<span class="linkedterm">'.$row['term'].'</span>';
$replacements[$i] = '<span class="linkedterm">'.$row['term'];
$i = $i + 2;
}
And this to echo the definitions:
echo preg_replace($patterns, $replacements, $row['definition']);
With this code I get an error because of the / character in the closing span tag. So I want a solution that lets me pass a value containing the / character, or any other solution that I may have missed.
Thanks
You might want to look at preg_quote
Quote regular expression characters
The / is also your delimiter character (to point out the start and end of your regex). So if you want to search for a literal /, make sure you escape it with a backslash, like so:
$patterns[$i+1] = '/<span class="linkedterm"><span class="linkedterm">'.$row['term'].'<\/span>/';
$patterns[$i] = '/'.preg_quote($row['term'], '/').'/';
$patterns[$i+1] = '/'.preg_quote('<span class="linkedterm"><span class="linkedterm">'.$row['term'].'</span>', '/').'/';
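As a hedged sketch of the preg_quote idea on its own, with sample terms standing in for the database rows: the function name link_terms is made up, and a callback is used so that characters such as $ or \ in a term are not interpreted in the replacement.

```php
<?php
// Hypothetical helper; $terms stands in for the rows fetched from the database.
function link_terms(string $definition, array $terms): string
{
    foreach ($terms as $term) {
        // The second argument makes preg_quote escape the / delimiter as well.
        $pattern = '/' . preg_quote($term, '/') . '/';
        $definition = preg_replace_callback(
            $pattern,
            function (array $m): string {
                return '<span class="linkedterm">' . $m[0] . '</span>';
            },
            $definition
        );
    }
    return $definition;
}

echo link_terms('Use TCP/IP or a C++ compiler.', ['TCP/IP', 'C++']), "\n";
// Use <span class="linkedterm">TCP/IP</span> or a <span class="linkedterm">C++</span> compiler.
```

This does not guard against a term that also occurs inside the inserted markup (for example a term like span) or inside another term, so the double-wrapping problem from the question can still appear in those cases.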
