Saving strings in quotes (") using regex - php

I've got a simple string that looks like a:104:{i:143;a:5:{s:5:"naz";s:7:"Alb";s:10:"base"}} and I'd like to save all text in quotation mark cleaning it of things like s:5 and stuff using regex. Is this possible?

Want to get everything between quotes? use: ".*" as your search string (escape " characters as required)
..also you can check out http://www.zytrax.com/tech/web/regex.htm for more help with regex. (It's got a great tool where you can test input text, RE, and see what you get out)

As long as the double quotes are matched, the following call
preg_match_all('/"([^"]*)"/',$input_string,$matches);
will give you all the text between the quotes as array of strings in $matches[1]

function session_raw_decode ($data) {
$vars = preg_split('/([a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff^|]*)\|/', $data, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
$result = array();
for($i = 0; isset($vars[$i]); $i++)
$result[$vars[$i++]] = unserialize($vars[$i]);
return $result;
}
I have this snippet somewhere found on my server... (no idea from where it is or if I have it written myself)
You can use it and do:
$json = json_encode(session_raw_decode($string));
This should do the job.

Related

Get html or text from inside quotes including escape quotes with RegEx

What I want to do is to get the attribute value from a simple text I'm parsing. I want to be able to contain HTML as well inside the quotes, so that's what got me stalling right now.
$line = 'attribute = "<p class=\"qwerty\">Hello World</p>" attribute2 = "value2"'
I've gotten to the point (substring) where I'm getting the value
$line = '"<p class=\"qwerty\">Hello World</p>" attribute2 = "value2"'
My current regex works if there are no escaped quotes inside the text. However, when I try to escape the HTML quotes, it doesn't work at all. Also, using .* is going to the end of the second attribute.
What I'm trying to obtain from the string above is
$result = '<p class=\"qwerty\">Hello World</p>'
This is how far I've gotten with my trial and error regex-ing.
$value_regex = "/^\"(.+?)\"/"
if (preg_match($value_regex, $line, $matches))
$result = $matches[1];
Thank you very much in advance!
You can use negative lookbehind to avoid matching escaped quotes:
(?<!\\)"(.+?)(?<!\\)"
RegEx Demo
Here (?<!\\) is negative lookbehind that will avoid matching \".
However I would caution you on using regex to parse HTML, better to use DOM for that.
PHP Code:
$value_regex = '~(?<!\\\\)"(.+?)(?<!\\\\)"~';
if (preg_match($value_regex, $line, $matches))
$result = $matches[1];

I need to get a value in PHP "preg_match_all" - Basic (I think)

I am trying to use a License PHP System…
I will like to show the status of their license to the users.
The license Server gives me this:
name=Service_Name;nextduedate=2013-02-25;status=Active
I need to have separated the data like this:
$name = “Service_Name”;
$nextduedate = “2013-02-25”;
$status = “Active”;
I have 2 days tring to resolve this problem with preg_match_all but i cant :(
This is basically a query string if you replace ; with &. You can try parse_str() like this:
$string = 'name=Service_Name;nextduedate=2013-02-25;status=Active';
parse_str(str_replace(';', '&', $string));
echo $name; // Service_Name
echo $nextduedate; // 2013-02-25
echo $status; // Active
This can rather simply be solved without regex. The use of explode() will help you.
$str = "name=Service_Name;nextduedate=2013-02-25;status=Active";
$split = explode(";", $str);
$structure = array();
foreach ($split as $element) {
$element = explode("=", $element);
$$element[0] = $element[1];
}
var_dump($name);
Though I urge you to use an array instead. Far more readable than inventing variables that didn't exist and are not explicitly declared.
It sounds like you just want to break the text down into separate lines along the semicolons, add a dollar sign at the front and then add spaces and quotes. I'm not sure you can do that in one step with a regular expression (or at least I don't want to think about what that regular expression would look like), but you can do it over multiple steps.
Use preg_split() to split the string into an array along the
semicolons.
Loop over the array.
Use str_replace to replace each '=' with ' = "'.
Use string concatenation to add a $ to the front and a "; to the end of each string.
That should work, assuming your data doesn't include quotes, equal signs, semicolons, etc. within the data. If it does, you'll have to figure out the parsing rules for that.

How to remove commas between double quotes in PHP

Hopefully, this is an easy one. I have an array with lines that contain output from a CSV file. What I need to do is simply remove any commas that appear between double-quotes.
I'm stumbling through regular expressions and having trouble. Here's my sad-looking code:
<?php
$csv_input = '"herp","derp","hey, get rid of these commas, man",1234';
$pattern = '(?<=\")/\,/(?=\")'; //this doesn't work
$revised_input = preg_replace ( $pattern , '' , $csv_input);
echo $revised_input;
//would like revised input to echo: "herp","derp,"hey get rid of these commas man",1234
?>
Thanks VERY much, everyone.
Original Answer
You can use str_getcsv() for this as it is purposely designed for process CSV strings:
$out = array();
$array = str_getcsv($csv_input);
foreach($array as $item) {
$out[] = str_replace(',', '', $item);
}
$out is now an array of elements without any commas in them, which you can then just implode as the quotes will no longer be required once the commas are removed:
$revised_input = implode(',', $out);
Update for comments
If the quotes are important to you then you can just add them back in like so:
$revised_input = '"' . implode('","', $out) . '"';
Another option is to use one of the str_putcsv() (not a standard PHP function) implementations floating about out there on the web such as this one.
This is a very naive approach that will work only if 'valid' commas are those that are between quotes with nothing else but maybe whitespace between.
<?php
$csv_input = '"herp","derp","hey, get rid of these commas, man",1234';
$pattern = '/([^"])\,([^"])/'; //this doesn't work
$revised_input = preg_replace ( $pattern , "$1$2" , $csv_input);
echo $revised_input;
//ouput for this is: "herp","derp","hey get rid of these commas man",1234
It should def be tested more but it works in this case.
Cases where it might not work is where you don't have quotes in the string.
one,two,three,four -> onetwothreefour
EDIT : Corrected the issues with deleting spaces and neighboring letters.
Well, I haven't been lazy and written a small function to do exactly what you need:
function clean_csv_commas($csv){
$len = strlen($csv);
$inside_block = FALSE;
$out='';
for($i=0;$i<$len;$i++){
if($csv[$i]=='"'){
if($inside_block){
$inside_block=FALSE;
}else{
$inside_block=TRUE;
}
}
if($csv[$i]==',' && $inside_block){
// do nothing
}else{
$out.=$csv[$i];
}
}
return $out;
}
You might be coming at this from the wrong angle.
Instead of removing the commas from the text (presumably so you can then split the string on the commas to get the separate elements), how about writing something that works on the quotes?
Once you've found an opening quote, you can check the rest of the string; anything before the next quote is part of this element. You can add some checking here to look for escaped quotes, too, so things like:
"this is a \"quote\""
will still be read properly.
Not exactly an answer you've been looking for - But I've used it for cleaning commas in numbers in CSV.
$csv = preg_replace('%\"([^\"]*)(,)([^\"]*)\"%i','$1$3',$csv);
"3,120", 123, 345, 567 ==> 3120, 123, 345, 567

regex regular expression to match most of URLs needs improvement

I need a function which will check for the existing URLs in a string.
function linkcleaner($url) {
$regex="(?i)\b((?:[a-z][\w-]+:(?:/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))";
if(preg_match($regex, $url, $matches)) {
echo $matches[0];
}
}
The regular expression is taken from the John Gruber's blog, where he addressed the problem of creating a regex matching all the URLs.
Unfortunately, I can't make it work. It seems the problem is coming from the double quotes inside the regex or the other punct symbols at the end of the expression.
Any help is appreciated.
Thank you!
You need to escape the " with a \
Apart from #tandu's answer, you also need delimiters for a regex in php.
The easiest would be to start and end your pattern with an # as that character does not appear in it:
$regex="#(?i)\b((?:[a-z][\w-]+:(?:/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'\".,<>?«»“”‘’]))#";
Jack Maney's comment...EPIC :D
On a more serious note, it does not work because you terminated the string literal right in the middle.
To include a double quote (") in a string, you need to escape it using a \
So, the line will be
$regex="/(?i)\b((?:[a-z][\w-]+:(?:/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'\".,<>?«»“”‘’]))/";
Notice I've escaped the (') as well. That is for when you define a string between 2 single quotes.
I am not sure how you guys read this regex, cause it's a real pain to read/modify... ;)
try this (this is not a one-liner, yes, but it is easy to understand and modify if needed):
<?php
$re_proto = "(?:https?|ftp|gopher|irc|whateverprotoyoulike)://";
$re_ipv4_segment = "[12]?[0-9]{1,2}";
$re_ipv4 = "(?:{$re_ipv4_segment}[.]){3}".$re_ipv4_segment;
$re_hostname = "[a-z0-9_]+(?:[.-][a-z0-9_]+){0,}";
$re_hostname_fqdn = "[a-z0-9_](?:[a-z0-9_-]*[.][a-z0-9]+){1,}";
$re_host = "(?:{$re_ipv4}|{$re_hostname})";
$re_host_fqdn = "(?:{$re_ipv4}|{$re_hostname_fqdn})";
$re_port = ":[0-9]+";
$re_uri = "(?:/[a-z0-9_.%-]*){0,}";
$re_querystring = "[?][a-z0-9_.%&=-]*";
$re_anchor = "#[a-z0-9_.%-]*";
$re_url = "(?:(?:{$re_proto})(?:{$re_host})|{$re_host_fqdn})(?:{$re_port})?(?:{$re_uri})?(?:{$re_querystring})?(?:{$re_anchor})?";
$text = <<<TEXT
http://www.example.com
http://www.example.com/some/path/to/file.php?f1=v1&f2=v2#foo
http://localhost.localdomain/
http://localhost/docs/???
www....wwhat?
www.example.com
ftp://ftp.mozilla.org/pub/firefox/latest/
Some new Mary-Kate Olsen pictures I found: the splendor of the Steiner Street Picture of href… http://t.co/tJ2NJjnf
TEXT;
$count = preg_match_all("\01{$re_url}\01is", $text, $matches);
var_dump($count);
var_dump($matches);
?>

Preg_match, Replace and back to string

sorry but i cant solve my problem, you know , Im a noob.
I need to find something in string with preg_match.. then replace it with new word using preg_replace, that's ok, but I don't understand how to put replaced word back to that string.
This is what I got
$text ='zda i "zda"';
preg_match('/"(\w*)"/', $text);
$najit = '/zda/';
$nahradit = 'zda';
$o = '/zda/';
$a = 'if';
$ahoj = preg_replace($najit, $nahradit, $match[1]);
Please, can you help me once again?
You can use e.g. the following code utilizing negative lookarounds to accomplish what you want:
$newtext = preg_replace('/(?<!")zda|zda(?!")/', 'if', $text)
It will replace any occurence of zda which is not enclosed in quotes on both sides (i.e. in U"Vzda"W the zda will be replaced because it is not enclosed directly into quotes).

Categories