Error parsing regex pattern in php - php

I want to split a string such as the following (by a divider like '~##' (and only that)):
to=enquiry#test.com~##subject=test~##text=this is body/text~##date=date
into an array containing e.g.:
to => enquiry#test.com
subject => test
text => this is body/text
date => date
I'm using php5 and I've got the following regex, which almost works, but there are a couple of errors and there must be a way to do it in one go:
//Split the string in the url of $text at every ~##
$regexp = "/(?:|(?<=~##))(.*?=.*?)(?:~##|$|\/(?!.*~##))/";
preg_match_all($regexp, $text, $a);
//$a[1] is an array containing var1=content1 var2=content2 etc;
//Now create an array in the form [var1] = content, [var2] = content2
foreach($a[1] as $key => $value) {
//Get the two groups either side of the equals sign
$regexp = "/([^\/~##,= ]+)=([^~##,= ]+)/";
preg_match_all($regexp, $value, $r);
//Assign to array key = value
$val[$r[1][0]] = $r[2][0]; //e.g. $val['subject'] = 'hi'
}
print_r($val);
My queries are that:
It doesn't seem to capture more than 3 different sets of parameters
It is breaking on the # symbol and so not capturing email addresses e.g. returning:
to => enquiry
subject => test
text => this is body/text
I am doing multiple different regex searches where I suspect I would be able to do one.
Any help would be really appreciated.
Thanks

Why use a regex when there is a much simpler way to do this with explode()? For example:
$str = 'to=enquiry#test.com~##subject=test~##text=this is body/text~##date=date';
$array = explode('~##',$str);
$finalArr = array();
foreach($array as $val)
{
$tmp = explode('=', $val, 2); // limit of 2 so a value may itself contain '='
$finalArr[$tmp[0]] = isset($tmp[1]) ? $tmp[1] : '';
}
echo '<pre>';
print_r($finalArr);
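If a single regex pass is still preferred, as the question asks, something along these lines might work. This is only a sketch: the function name parse_pairs_sketch is made up for illustration, and it assumes that keys never contain '=' and that every segment between '~##' dividers has the key=value form.

```php
<?php
// Hypothetical helper, not part of the original answer.
function parse_pairs_sketch($text)
{
    // Each pair starts at the beginning of the string or right after "~##";
    // the value runs lazily up to the next "~##" or the end of the string.
    $pattern = '/(?:^|~##)([^=]+)=(.*?)(?=~##|$)/';
    if (!preg_match_all($pattern, $text, $m)) {
        return array(); // nothing matched
    }
    // $m[1] holds the keys and $m[2] the values, in matching order.
    return array_combine($m[1], $m[2]);
}

print_r(parse_pairs_sketch('to=enquiry#test.com~##subject=test~##text=this is body/text~##date=date'));
```

Because the value is matched with .*? rather than a negated character class, a '#' inside an email address is not treated as part of the divider.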

Related

Shortcode style parsing

Looking at how WP uses shortcodes, I thought I could implement the same structure in a project. I assumed this would be available somewhere, but I have yet to track it down.
I started parsing it myself, beginning with a preg_match_all:
preg_match_all('/\[[^\]]*\]/', $content, $match);
That returns an array with all the shortcodes inside the content, as expected, but once I start parsing the name, variables, or array keys with values, the parsing gets really heavy.
My current thought is to split on spaces and then parse each piece, but then I run into spaces inside the values, even though they are in quotes. Parsing the quoted data first and then the spaces, only to reconstruct it afterwards, seems very wasteful. I don't want to reinvent the wheel here, so any input would be fantastic.
example
[shortcodename key1="this is a value" key2="34"]
would like to have
Array
(
[shortcodename] => Array
(
[key1] => this is a value
[key2] => 34
)
)
Here is the complete function that is working, in case anyone else is looking to do the same. Obviously this is not meant to run on user content; the called function should do any checks, as this only replaces the shortcode if the function has a return value.
function processShortCodes($content){ // locate data inside [ ] and
//process the output, place back into content and returns
preg_match_all('/\[[^\]]*\]/', $content, $match);
$regex = '~"[^"]*"(*SKIP)(*F)|\s+~';
foreach ($match[0] as $key => $val){
$valOrig = $val; // keep uncleaned value to replace later
$val = trim(substr($val, 1, -1));
$replaced = preg_replace($regex,":",$val);
$exploded = explode(':',$replaced);
if (is_array($exploded)){
$fCall = array(); // attributes collected for this shortcode
$fcallName = array_shift($exploded); // function name
if (function_exists($fcallName)){ // only proceed if the function exists
foreach ($exploded as $aKey => $aVal){
$arr = explode("=", $aVal);
if (substr($arr[1], 0, 1) == '&'){
$fCall[$arr[0]]=substr($arr[1], 6, -6); // quotes can be &quot; (6 characters each)
}else{
$fCall[$arr[0]]=substr($arr[1], 1, -1);
}
}
if ( is_array($fCall) && $fcallName ){
$replace = call_user_func($fcallName, $fCall);
if ($replace){
$content = str_replace($valOrig,$replace,$content);
}
}
}
}
}
return $content;
}
You can try this: change all whitespace not wrapped in quotes to, say, a semicolon, then explode on the semicolon:
$regex = '~"[^"]*"(*SKIP)(*F)|\s+~';
$subject = 'hola hola "pepsi cola" yay';
$replaced = preg_replace($regex,";",$subject);
$exploded = explode(';', $replaced);
Credits
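For the array shape asked for in the question, another option might be to skip the whitespace replacement and pull out the key="value" pairs directly. The following is a rough sketch only: parse_shortcode_sketch is a hypothetical name, and it assumes that values are always wrapped in plain double quotes and never contain a double quote themselves.

```php
<?php
// Hypothetical helper, not taken from WordPress or from the answers above.
function parse_shortcode_sketch($tag)
{
    // Split "[name rest-of-attributes]" into the name and the attribute text.
    if (!preg_match('/^\[(\w+)(.*)\]$/s', trim($tag), $m)) {
        return array(); // not a shortcode
    }
    // Collect every key="value" pair; spaces inside the quotes are kept.
    preg_match_all('/(\w+)="([^"]*)"/', $m[2], $pairs, PREG_SET_ORDER);
    $attrs = array();
    foreach ($pairs as $pair) {
        $attrs[$pair[1]] = $pair[2];
    }
    return array($m[1] => $attrs);
}

print_r(parse_shortcode_sketch('[shortcodename key1="this is a value" key2="34"]'));
```

Matching the quoted values as a unit avoids the problem of a ':' or ';' placeholder colliding with the same character inside a value.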

PHP Regex for a specific numeric value inside a comma-delimited integer number string

I am trying to get the integers to the left and right of a given input in the $str variable using a regex, but I keep getting the commas back along with the integers. I only want the integers, not the commas. I have also tried replacing the wildcard . with \d, but that did not resolve it.
$str = "1,2,3,4,5,6";
function pagination()
{
global $str;
// Using number 4 as an input from the string
preg_match('/(.{2})(4)(.{2})/', $str, $matches);
echo $matches[0]."\n".$matches[1]."\n".$matches[1]."\n".$matches[1]."\n";
}
pagination();
How about using a CSV parser?
$str = "1,2,3,4,5,6";
$line = str_getcsv($str);
$target = 4;
foreach($line as $key => $value) {
if($value == $target) {
echo $line[($key-1)] . '<--low high-->' . $line[($key+1)];
}
}
Output:
3<--low high-->5
or a regex could be
$str = "1,2,3,4,5,6";
preg_match('/(\d+),4,(\d+)/', $str, $matches);
echo $matches[1]."<--low high->".$matches[2];
Output:
3<--low high->5
The only flaw with these approaches is when the number is at the start or end of the range. Would that ever be the case?
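If the target can sit at either end of the list, a bounds-checked lookup might be safer. A minimal sketch, assuming a plain comma-separated list with no duplicate values; neighbours_sketch and the 'prev'/'next' keys are names invented for this example:

```php
<?php
// Hypothetical helper: returns the values on either side of $target,
// or null for a side that does not exist.
function neighbours_sketch($csv, $target)
{
    $parts = array_map('trim', explode(',', $csv));
    // Strict search on strings, so "4" never matches "14" or "40".
    $idx = array_search((string) $target, $parts, true);
    if ($idx === false) {
        return null; // target not in the list
    }
    return array(
        'prev' => $idx > 0 ? $parts[$idx - 1] : null,
        'next' => $idx < count($parts) - 1 ? $parts[$idx + 1] : null,
    );
}

print_r(neighbours_sketch('1,2,3,4,5,6', 4));
```

Working on whole array elements also sidesteps the multi-digit issue, since no pattern has to guess how many digits a number has.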
I believe you're looking for a regex non-capturing group.
Here's what I did:
$regStr = "1,2,3,4,5,6";
$regex = "/(\d)(?:,)(4)(?:,)(\d)/";
preg_match($regex, $regStr, $results);
print_r($results);
Gives me the results:
Array ( [0] => 3,4,5 [1] => 3 [2] => 4 [3] => 5 )
Hope this helps!
Given your function name I am going to assume you need this for pagination.
The following solution might be easier:
$str = "1,2,3,4,5,6,7,8,9,10";
$str_parts = explode(',', $str);
// reset and end return the first and last element of an array respectively
$start = reset($str_parts);
$end = end($str_parts);
This prevents your regex from having to deal with your numbers getting into the double digits.

Use regular expression to extract attribute value for custom tag

Thanks for taking a look at this. I'm using PHP. I have a string like so:
[QUOTE="name: Max-Fischer, post: 486662533, member: 123"]I don't so much dance as rhythmically convulse.[/QUOTE]
And I want to pull out the values in the quotes and create an associative array like so:
["name" => "Max-Fischer", "post" => "486662533", "member" => "123"]
Then, I would like to remove the opening and closing [QUOTE] tags and replace them with custom HTML like so:
<blockquote>Max-Fischer wrote: I don't so much dance as rhythmically convulse.</blockquote>
So the main problem is creating the preg_match() or preg_replace() to handle first: grabbing the values out in an array, and second: removing the tags and replacing them with my custom content. I can figure out how to use the array to create the custom HTML, I just can't figure how to use regular expressions well enough to achieve it.
I tried a match like this to get the attribute values:
/(\S+)=[\"\']?((?:.(?![\"\']?\s+(?:\S+)=|[>\"\']))+.)[\"\']?/
But this only returns:
[QUOTE
And that's not even addressing how to put the values (if I can get them) into an array.
Thanks in advance for your time.
Cheers.
If the tag you're looking for is always going to be quote, then perhaps something a little simpler is possible:
$s ='[QUOTE="name: Max-Fischer, post: 486662533, member: 123"]I don\'t so much dance as rhythmically convulse.[/QUOTE]';
$r = '/\[QUOTE="(.*?)"\](.*)\[\/QUOTE\]/';
$m = array();
$arr = array();
preg_match($r, $s, $m);
// m[0] = the initial string
// m[1] = the string of attributes
// m[2] = the quote itself
foreach(explode(',', $m[1]) as $valuepair) { // split the attributes on the comma
preg_match('/\s*(.*): (.*)/', $valuepair, $mm);
// mm[0] = the attribute pairing
// mm[1] = the attribute name
// mm[2] = the attribute value
$arr[$mm[1]] = $mm[2];
}
print_r($arr);
print $m[2] . "\n";
this gives the following output:
Array
(
[name] => Max-Fischer
[post] => 486662533
[member] => 123
)
I don't so much dance as rhythmically convulse.
If you want to handle the case where there is more than one quote in the string, we can do this by modifying the regex to be slightly less greedy, and then using preg_match_all, instead of preg_match
$s ='[QUOTE="name: Max-Fischer, post: 486662533, member: 123"]I don\'t so much dance as rhythmically convulse.[/QUOTE]';
$s .='[QUOTE="name: Some-Guy, post: 486562533, member: 1234"]Quidquid latine dictum sit, altum videtur[/QUOTE]';
$r = '/\[QUOTE="(.*?)"\](.*?)\[\/QUOTE\]/';
// the ? after the second (.*) was added to make it lazy (less greedy)
$m = array();
$arr = array();
preg_match_all($r, $s, $m, PREG_SET_ORDER);
// m[0] = the first quote
// m[1] = the second quote
// m[0][0] = the initial string
// m[0][1] = the string of attributes
// m[0][2] = the quote itself
// element for each quote found in the string
foreach($m as $match) { // since there is more than one quote, we loop and operate on them individually
$quote = array();
foreach(explode(',', $match[1]) as $valuepair) { // split the attributes on the comma
preg_match('/\s*(.*): (.*)/', $valuepair, $mm);
// mm[0] = the attribute pairing
// mm[1] = the attribute name
// mm[2] = the attribute value
$quote[$mm[1]] = $mm[2];
}
$arr[] = $quote; // we now build a parent array, to hold each individual quote
}
print_r($arr);
This gives output like:
Array
(
[0] => Array
(
[name] => Max-Fischer
[post] => 486662533
[member] => 123
)
[1] => Array
(
[name] => Some-Guy
[post] => 486562533
[member] => 1234
)
)
I managed to resolve your problem of getting an associative array. I hope it helps.
Here is the code:
$str = <<< PP
[QUOTE=" name : Max-Fischer,post : 486662533,member : 123 "]I don't so much dance as rhythmically convulse.[/QUOTE]
PP;
preg_match_all('/^\[QUOTE=\"(.*?)\"\](?:.*?)]$/', $str, $matches);
preg_match_all('/([a-zA-Z0-9]+)\s*:\s*([a-zA-Z0-9-]+)/', $matches[1][0], $result); // the hyphen in the value class keeps "Max-Fischer" whole
$your_data = array_combine($result[1],$result[2]);
echo "<pre>";
print_r($your_data);
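Neither answer covers the second half of the question, replacing the tags with custom HTML. One possible approach is preg_replace_callback, which hands each match to a function that builds the replacement. This is a sketch under assumptions: quotes_to_html_sketch and the 'Someone' fallback are invented for illustration, quotes are assumed not to be nested, and the anonymous function needs PHP 5.3 or later.

```php
<?php
// Hypothetical helper, not from the answers above.
function quotes_to_html_sketch($text)
{
    return preg_replace_callback(
        '/\[QUOTE="(.*?)"\](.*?)\[\/QUOTE\]/s',
        function ($m) {
            // Turn "name: X, post: Y" into an associative array.
            $attrs = array();
            foreach (explode(',', $m[1]) as $pair) {
                $kv = explode(':', $pair, 2);
                if (count($kv) === 2) {
                    $attrs[trim($kv[0])] = trim($kv[1]);
                }
            }
            // 'Someone' is an assumed fallback when no name is given.
            $name = isset($attrs['name']) ? $attrs['name'] : 'Someone';
            // ENT_NOQUOTES escapes <, > and & but leaves quotes alone.
            return '<blockquote>' . htmlspecialchars($name, ENT_NOQUOTES)
                . ' wrote: ' . htmlspecialchars($m[2], ENT_NOQUOTES)
                . '</blockquote>';
        },
        $text
    );
}

echo quotes_to_html_sketch('[QUOTE="name: Max-Fischer, post: 486662533, member: 123"]I don\'t so much dance as rhythmically convulse.[/QUOTE]');
```

The attribute array is available inside the callback, so the post and member values could be worked into the HTML in the same way.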

Tricky php string matching

I have a string that looks like this:
[2005]
one
two
three
[2004]
six
What would be the smoothest way to get an array from it that would look like this:
array(
['2005'] => "one \n two \n three",
['2004'] => "six",
)
... or maybe even get the inner array sliced into lines array...
I tried doing it with preg_split, which worked but didn't give associative array keys so I didn't have the year numbers as keys.
Is there any cool way of doing this without iterating through all the lines ?
/(\[[0-9]{4}\])([^\[]*)/ will give you the date and whatever is after until the next one.
Use the groups to create your array: With preg_match_all() you get a $matches array where $matches[1] is the date and $matches[2] is the data following it.
Using Sylverdrag's regex as a guide:
<?php
$test = "[2005]
one
two
three
[2004]
six";
$r = "/(\[[0-9]{4}\])([^\[]*)/";
preg_match_all($r, $test, $m);
$output = array();
foreach ($m[1] as $key => $name)
{
$name = str_replace(array('[',']'), array('',''), $name);
$output[ $name ] = $m[2][$key];
}
print_r($output);
?>
Output (PHP 5.2.12):
Array
(
[2005] =>
one
two
three
[2004] =>
six
)
That's slightly more complex:
preg_match_all('/\[(\d+)\]\n((?:(?!\[).+\n?)+)/', $ini, $matches, PREG_SET_ORDER);
(Could be simplified with knowing the real format constraints.)
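For the variant where each year maps to an array of lines, something like the following might do. It is a sketch only: sections_by_year_sketch is a hypothetical name, and it assumes four-digit years in brackets and no '[' inside the section text. Note that PHP turns numeric string keys such as '2005' into integer keys.

```php
<?php
// Hypothetical helper building on the regex from the answers above.
function sections_by_year_sketch($text)
{
    // Group 1: the year without brackets; group 2: everything up to the next "[".
    preg_match_all('/\[(\d{4})\]([^\[]*)/', $text, $m);
    $out = array();
    foreach ($m[1] as $i => $year) {
        // Split the section body on any newline style and trim each line.
        $lines = preg_split('/\R/', trim($m[2][$i]));
        $out[$year] = array_map('trim', $lines);
    }
    return $out;
}

print_r(sections_by_year_sketch("[2005]\none\ntwo\nthree\n[2004]\nsix"));
```

An empty section comes back as an array holding one empty string, which may or may not be what is wanted.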

preg_replace all characters up to a certain one

I have a string
&168491968426|mobile|3|100|1&185601651932|mobile|3|120|1&114192088691|mobile|3|555|5&
and I have to delete, say, this part: &185601651932|mobile|3|120|1& (starting with an ampersand and ending with an ampersand), knowing only the first number up to the vertical bar (185601651932),
so that the result would be
&168491968426|mobile|3|100|1&114192088691|mobile|3|555|5&
How could I do that with PHP's preg_replace function? The number of pipe-separated (|) values will always be the same, but I'd still like a flexible pattern that doesn't depend on the number of pipes between the & signs.
Thanks.
P.S. Also, I would be grateful for a link to a good, simply written resource on regular expressions in PHP. There are plenty of them on Google :) but maybe you happen to have a really great link.
preg_replace("/&185601651932\\|[^&]+&/", ...)
Generalized,
$i = 185601651932;
preg_replace("/&$i\\|[^&]+&/", ...);
If you want real flexibility, use preg_replace_callback: http://php.net/manual/en/function.preg-replace-callback.php
Important: don't forget to escape your number using preg_quote():
$string = '&168491968426|mobile|3|100|1&185601651932|mobile|3|120|1&114192088691|mobile|3|555|5&';
$number = 185601651932;
if (preg_match('/&' . preg_quote($number, '/') . '.*?&/', $string, $matches)) {
// $matches[0] contains the captured string
}
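Putting the pieces together, a complete removal could look roughly like this. It is a sketch, with remove_record_sketch as an invented name; it assumes every record is delimited by '&' on both sides, as in the question's string. The lookahead leaves the trailing '&' in place so the neighbouring records stay separated.

```php
<?php
// Hypothetical helper combining preg_quote() with preg_replace().
function remove_record_sketch($string, $id)
{
    // Match "&<id>|...": the leading "&", the id, a pipe, then everything
    // up to (but not including) the next "&".
    $pattern = '/&' . preg_quote((string) $id, '/') . '\|[^&]*(?=&)/';
    return preg_replace($pattern, '', $string);
}

$string = '&168491968426|mobile|3|100|1&185601651932|mobile|3|120|1&114192088691|mobile|3|555|5&';
echo remove_record_sketch($string, '185601651932');
```

Passing the id as a string avoids integer overflow for large numbers on 32-bit builds.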
It seems to me you ought to be using another data structure than a string to manipulate this data.
I'd want this data in a structure like
Array(
[id] => Array(
[field_1] => value_1
[field_2] => value_2
)
)
Your massive string can be massaged into such a structure by doing something like this:
$data_str = '168491968426|mobile|3|100|1&185601651932|mobile|3|120|1&114192088691|mobile|3|555|5&';
$remove_num = '185601651932';
/* Enter a descriptive name for each of the numbers here
- these will be field names in the data structure */
$field_names = array(
'number',
'phone_type',
'some_num1',
'some_num2',
'some_num3'
);
/* split the string into its parts, and place them into the $data array */
$data = array();
$tmp = explode('&', trim($data_str, '&'));
foreach($tmp as $record) {
$fields = explode('|', trim($record, '|'));
$data[$fields[0]] = array_combine($field_names, $fields);
}
echo "<h2>Data structure:</h2><pre>"; print_r($data); echo "</pre>\n";
/* Now to remove our number */
unset($data[$remove_num]);
echo "<h2>Data after removal:</h2><pre>"; print_r($data); echo "</pre>\n";
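If the data has to go back into the original string format afterwards, the structure can be flattened again. A small sketch, assuming the '&'-delimited, '|'-separated layout from the question; records_to_string_sketch is a hypothetical name:

```php
<?php
// Hypothetical helper: the inverse of the explode() steps above.
function records_to_string_sketch(array $data)
{
    $records = array();
    foreach ($data as $fields) {
        // Each record's fields are joined with "|" in their stored order.
        $records[] = implode('|', $fields);
    }
    // Records are joined with "&" and wrapped in "&" on both ends.
    return '&' . implode('&', $records) . '&';
}

$data = array(
    '168491968426' => array('number' => '168491968426', 'phone_type' => 'mobile',
        'some_num1' => '3', 'some_num2' => '100', 'some_num3' => '1'),
);
echo records_to_string_sketch($data);
```

How an empty data set should be written out is not specified in the question; as written, this sketch returns '&&' in that case.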
