I would like to parse an e-mail address list (like the one in a To header) with preg_match_all to get the display name (if present) and the e-mail address. Something similar to mailparse_rfc822_parse_addresses or Mail_RFC822::parseAddressList() from PEAR, but in plain PHP.
Input :
"DOE, John \(ACME\)" <john.doe#somewhere.com>, "DOE, Jane" <jane.doe#somewhere.com>
Output :
array(
array(
'name' => 'DOE, John (ACME)',
'email' => 'john.doe#somewhere.com'
),
array(
'name' => 'DOE, Jane',
'email' => 'jane.doe#somewhere.com'
)
)
I don't need to support unusual e-mail formats (/[a-z0-9._%-]+#[a-z0-9.-]+\.[a-z]{2,4}/i for the e-mail part is OK).
I can't use explode because the comma can appear in the name. str_getcsv doesn't work, because I can have:
DOE, John \(ACME\) <john.doe#somewhere.com>
as input.
Update:
For the moment, I've got this :
public static function parseAddressList($addressList)
{
    $pattern = '/^(?:"?([^<"]+)"?\s)?<?([^>]+#[^>]+)>?$/';
    if (preg_match($pattern, $addressList, $matches)) {
        return array(
            array(
                'name' => stripcslashes($matches[1]),
                'email' => $matches[2]
            )
        );
    } else {
        $parts = str_getcsv($addressList);
        $result = array();
        foreach($parts as $part) {
            if (preg_match($pattern, $part, $matches)) {
                $result[] = array(
                    'name' => stripcslashes($matches[1]),
                    'email' => $matches[2]
                );
            }
        }
        return $result;
    }
}
but it fails on:
"DOE, \"John\"" <john.doe#somewhere.com>
I need the pattern to allow the escaped quote \" inside the name, but I don't remember how to do this.
Finally I did it:
public static function parseAddressList($addressList)
{
    $pattern = '/^(?:"?((?:[^"\\\\]|\\\\.)+)"?\s)?<?([a-z0-9._%-]+#[a-z0-9.-]+\\.[a-z]{2,4})>?$/i';
    if (($addressList[0] != '<') and preg_match($pattern, $addressList, $matches)) {
        return array(
            array(
                'name' => stripcslashes($matches[1]),
                'email' => $matches[2]
            )
        );
    } else {
        $parts = str_getcsv($addressList);
        $result = array();
        foreach($parts as $part) {
            if (preg_match($pattern, $part, $matches)) {
                $item = array();
                if ($matches[1] != '') $item['name'] = stripcslashes($matches[1]);
                $item['email'] = $matches[2];
                $result[] = $item;
            }
        }
        return $result;
    }
}
But I'm not sure it works for all cases.
I don't know that RFC, but if the format is always as you showed then you can try something like:
preg_match_all("/\"([^\"]*)\"\\s+<([^<>]*)>/", $string, $matches);
print_r($matches);
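A rough sketch of how a single preg_match_all pass could also cope with backslash-escaped characters in the name, in case it helps. It is not an RFC 822 parser. It assumes every address is wrapped in angle brackets, and the function name parseAddressListSketch is made up for this example.

```php
<?php
// Hypothetical helper: splits an address list into name/email pairs.
// Assumption: every address is written as <address>; bare addresses are skipped.
function parseAddressListSketch(string $list): array
{
    // Group 1: quoted name, where a backslash escapes the next character.
    // Group 2: unquoted name (may contain commas, but no quotes or angle brackets).
    // Group 3: the address between < and >.
    $pattern = '/(?:"((?:[^"\\\\]|\\\\.)*)"|([^<>"]+?))?\s*<([^<>\s]+)>/';

    $result = array();
    if (preg_match_all($pattern, $list, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $raw = $m[1] !== '' ? $m[1] : $m[2];
            // Drop the escaping backslashes, then trim separators left over
            // from the previous entry (spaces and commas).
            $name = trim(preg_replace('/\\\\(.)/', '$1', $raw), " \t,");
            $result[] = array('name' => $name, 'email' => $m[3]);
        }
    }
    return $result;
}

print_r(parseAddressListSketch(
    '"DOE, John \(ACME\)" <john.doe#somewhere.com>, "DOE, Jane" <jane.doe#somewhere.com>'
));
```

The escapes are removed with preg_replace rather than stripcslashes, because stripcslashes would also turn sequences such as \n into control characters.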
Related
I'm wondering if anyone can help me with the following regular expression: I can't match the multiline block CF.{Coordonnees Abonne}: when it is used in PHP's preg_match function.
What is weird is that when I test the regex online it seems to work, even though the block ends up in another group (regex101 example).
Here is the code:
<?php
$response = array(
1 => 'CF.{Temps}: 1',
2 => 'CF.{Etat}: return',
3 => 'CF.{Code}: 2',
4 => 'CF.{Values}: plaque',
5 => '',
6 => 'CF.{Coordonnees}: LA PERSONNE',
7 => ' ',
8 => ' 10000 LA VILLE',
9 => ' ',
10 => ' 0500235689',
11 => ' 0645788923',
12 => ' Login : test#mail.com',
13 => ' Password : PassWord!',
14 => '',
15 => 'CF.{Groupe}: 3',
16 => 'CF.{Date}: 4',
);
print_r(parseResponseBody($response));
function parseResponseBody(array $response, $delimiter = ':')
{
$responseArray = array();
$lastkey = null;
foreach ($response as $line) {
if(preg_match('/^([a-zA-Z0-9]+|CF\.{[^}]+})' . $delimiter . '\s(.*)|([a-zA-Z0-9].*)$/', $line, $matches)) {
$lastkey = $matches[1];
$responseArray[$lastkey] = $matches[2];
}
}
return $responseArray;
}
?>
Output :
Array
(
[CF.{Temps}] => 1
[CF.{Etat}] => return
[CF.{Code}] => 2
[CF.{Values}] => plaque
[CF.{Coordonnees}] => LA PERSONNE
[] =>
[CF.{Groupe}] => 3
[CF.{Date}] => 4
)
And here is the final result that I need to extract:
Array
(
[CF.{Temps}] => 1
[CF.{Etat}] => return
[CF.{Code}] => 2
[CF.{Values}] => plaque
[CF.{Coordonnees}] => LA PERSONNE
10000 LA VILLE
0500235689
0645788923
Login : test#mail.com
Password : PassWord!
[CF.{Groupe}] => 3
[CF.{Date}] => 4
)
On each iteration you have to check whether the current line starts a new block or continues the previous one. It can't be both at the same time:
function parseResponseBody(array $response, $delimiter = ':') {
    $array = [];
    $lastIndex = null;
    foreach ($response as $line) {
        // First branch: the line starts a new "CF.{...}: value" block.
        // Second branch: a non-blank line continues the last block seen.
        if (preg_match('~^\s*(CF\.{[^}]*})' . preg_quote($delimiter, '~') . '\s+(.*)~', $line, $matches))
            $array[$lastIndex = $matches[1]] = $matches[2];
        elseif ($lastIndex !== null && trim($line) !== '')
            $array[$lastIndex] .= PHP_EOL . $line;
    }
    return $array;
}
I would do that this way:
function parse($response, $del=':', $nl="\n") {
    $pattern = sprintf('~(CF\.{[^}]+})%s \K.*~A', preg_quote($del, '~'));
    $result = [];
    $key = null; // no block seen yet
    foreach ($response as $line) {
        if ( preg_match($pattern, $line, $m) ) {
            if ( $key !== null )
                $result[$key] = rtrim($result[$key]);
            $key = $m[1];
            $result[$key] = $m[0];
        } elseif ( $key !== null ) {
            $result[$key] .= $nl . $line;
        }
    }
    return $result;
}
var_export(parse($response));
The key is stored in the capture group 1 $m[1] but the whole match $m[0] returns only the value part (the \K feature discards all matched characters on its left from the match result). When the pattern fails, the current line is appended for the last key.
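To see the \K behaviour in isolation, here is a minimal sketch on a single line. The sample line is taken from the question's data and the delimiter is hard-coded as ':' for brevity.

```php
<?php
// One line in the same "CF.{Key}: value" shape as the question's input.
$line = 'CF.{Etat}: return';

// Group 1 captures the key. \K resets the start of the reported match,
// so $m[0] only contains what comes after ": ".
if (preg_match('~(CF\.\{[^}]+\}): \K.*~A', $line, $m)) {
    $key = $m[1];   // the captured key
    $value = $m[0]; // the whole match, minus everything left of \K
    var_dump($key, $value);
}
```

Capture groups are not affected by \K, which is why the key is still available in $m[1].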
The regex is fine, you just need to handle the case when there is no key:
function parseResponseBody(array $response, $delimiter = ':')
{
    $responseArray = array();
    $lastKey = null;
    foreach ($response as $line) {
        if(preg_match('/^([a-zA-Z0-9]+|CF\.{[^}]+})' . $delimiter . '\s(.*)|([a-zA-Z0-9].*)$/', $line, $matches)) {
            $key = $matches[1];
            if(empty($key)){
                // No key on this line: append it to the previous key's value
                $key = $lastKey;
                $responseArray[$key] .= PHP_EOL . $matches[3];
            }else{
                $responseArray[$key] = $matches[2];
            }
            $lastKey = $key;
        }
    }
    return $responseArray;
}
https://3v4l.org/rFIbk
The following PHP function replaces bad words with stars, but I need one additional return value that tells whether any bad words were found or not.
$badwords = array('dog', 'dala', 'bad3', 'ass');
$text = 'This is a dog. . Grass. is good but ass is bad.';
print_r( filterBadwords($text,$badwords));
function filterBadwords($text, array $badwords, $replaceChar = '*') {
$repu = preg_replace_callback(array_map(function($w) { return '/\b' . preg_quote($w, '/') . '\b/i'; }, $badwords),
function($match) use ($replaceChar) {
return str_repeat($replaceChar, strlen($match[0])); },
$text
);
return array('error' =>'Match/No Match', 'text' => $repu );
}// Func
Output if badwords found should be like
Array ( [error] => Match[text] => Bad word dog match. )
If no badwords found then
Array ( [error] => No Match[text] => Bad word match. )
You can use the following:
function filterBadwords($text, array $badwords, $replaceChar = '*') {
    //new bool var to see if there was any match
    $matched = false;
    $repu = preg_replace_callback(array_map(
        function($w)
        {
            return '/\b' . preg_quote($w, '/') . '\b/i';
        }, $badwords),
        //pass the $matched by reference
        function($match) use ($replaceChar, &$matched)
        {
            //if the $match array is not empty update $matched to true
            if(!empty($match))
            {
                $matched = true;
            }
            return str_repeat($replaceChar, strlen($match[0]));
        }, $text);
    //return response based on the bool value of $matched
    if($matched)
    {
        $return = array('error' =>'Match', 'text' => $repu );
    }
    else
    {
        $return = array('error' =>'No Match', 'text' => $repu );
    }
    return $return;
}
This uses a variable passed by reference and an if condition to record whether there were any matches, and then builds the response from that.
Output(if matched):
array (size=2)
'error' => string 'Match' (length=5)
'text' => string 'This is a ***. . Grass. is good but *** is bad.'
Output(if none matched):
array (size=2)
'error' => string 'No Match' (length=8)
'text' => string 'This is a . . Grass. is good but is bad.'
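As a possible alternative to the by-reference flag, preg_replace_callback can report the number of replacements itself through its fifth parameter. A small sketch, with the function name filterBadwordsWithCount invented for this example:

```php
<?php
// Hypothetical variant of filterBadwords() that relies on the $count
// out-parameter of preg_replace_callback instead of a flag set in the callback.
function filterBadwordsWithCount(string $text, array $badwords, string $replaceChar = '*'): array
{
    // One case-insensitive, whole-word pattern per bad word.
    $patterns = array_map(
        function ($w) {
            return '/\b' . preg_quote($w, '/') . '\b/i';
        },
        $badwords
    );

    $count = 0;
    $clean = preg_replace_callback(
        $patterns,
        function ($match) use ($replaceChar) {
            return str_repeat($replaceChar, strlen($match[0]));
        },
        $text,
        -1,     // no limit on replacements
        $count  // filled with the total number of replacements made
    );

    return array(
        'error' => $count > 0 ? 'Match' : 'No Match',
        'text'  => $clean,
    );
}

print_r(filterBadwordsWithCount(
    'This is a dog. . Grass. is good but ass is bad.',
    array('dog', 'dala', 'bad3', 'ass')
));
```

Because of the \b word boundaries, "Grass" is left alone even though it contains "ass".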
<?php
$badwords = array('dog', 'dala', 'bad3', 'ass');
$text = 'This is a dog. . Grass. is good but ass is bad.';
$res=is_badword($badwords,$text);
echo "<pre>"; print_r($res);
function is_badword($badwords, $text)
{
$res=array('No Error','No Match');
foreach ($badwords as $name) {
if (stripos($text, $name) !== FALSE) {
$res=array($name,'Match');
return $res;
}
}
return $res;
}
?>
Output:
Array
(
[0] => dog
[1] => Match
)
Is there an easy way to parse a string for search terms including negative terms?
'this -that "the other thing" -"but not this" "-positive"'
would change to
array(
"positive" => array(
"this",
"the other thing",
"-positive"
),
"negative" => array(
"that",
"but not this"
)
)
so those terms could be used to search.
The code below will parse your query string and split it up into positive and negative search terms.
// parse the query string
$query = 'this -that "-that" "the other thing" -"but not this" ';
preg_match_all('/-*"[^"]+"|\S+/', $query, $matches);
// sort the terms
$terms = array(
'positive' => array(),
'negative' => array(),
);
foreach ($matches[0] as $match) {
if ('-' == $match[0]) {
$terms['negative'][] = trim(ltrim($match, '-'), '"');
} else {
$terms['positive'][] = trim($match, '"');
}
}
print_r($terms);
Output
Array
(
[positive] => Array
(
[0] => this
[1] => -that
[2] => the other thing
)
[negative] => Array
(
[0] => that
[1] => but not this
)
)
For those looking for the same thing I have created a gist for PHP and JavaScript
https://gist.github.com/UziTech/8877a79ebffe8b3de9a2
function getSearchTerms($search) {
$matches = null;
preg_match_all("/-?\"[^\"]+\"|-?'[^']+'|\S+/", $search, $matches);
// sort the terms
$terms = [
"positive" => [],
"negative" => []
];
foreach ($matches[0] as $i => $match) {
$negative = ("-" === $match[0]);
if ($negative) {
$match = substr($match, 1);
}
if (($match[0] === '"' && substr($match, -1) === '"') || ($match[0] === "'" && substr($match, -1) === "'")) {
$match = substr($match, 1, strlen($match) - 2);
}
if ($negative) {
$terms["negative"][] = $match;
} else {
$terms["positive"][] = $match;
}
}
return $terms;
}
I have this file, but I can't figure out how to parse it.
type = 10
version = 1.2
PART
{
part = foobie
partName = foobie
EVENTS
{
MakeReference
{
active = True
}
}
ACTIONS
{
}
}
PART
{
part = bazer
partName = bazer
}
I want this to be an array, which should look like:
$array = array(
'type' => 10,
'version' => 1.2,
'PART' => array(
'part' => 'foobie',
'partName' => 'foobie',
'EVENTS' => array(
'MakeReference' => array(
'active' => 'True'
)
),
'ACTIONS' => array(
)
),
'PART' => array(
'part' => 'bazer',
'partName' => 'bazer'
)
);
I tried with preg_match but that was not a success.
Any ideas?
Why not use a format that PHP can decode natively, like JSON?
http://php.net/json_decode
$json = file_get_contents("filename.txt");
$array = json_decode($json);
print_r($array);
Here is my approach.
First of all, I move each { up to the end of the line before it.
That's pretty easy:
$lines = explode("\r\n", $this->file);
foreach ($lines as $num => $line) {
if (preg_match('/\{/', $line) === 1) {
$lines[$num - 1] .= ' {';
unset($lines[$num]);
}
}
Now the input looks like this
PART {
part = foobie
Now we can turn the whole thing into XML instead. I know that I asked for a PHP array in the question, but an XML object is fine too.
foreach ($lines as $line) {
if (preg_match('/(.*?)\{$/', $line, $matches)) {
$xml->addElement(trim($matches[1]));
continue;
}
if (preg_match('/\}/', $line)) {
$xml->endElement();
continue;
}
if (strpos($line, ' = ') !== false) {
list($key, $value) = explode(' = ', $line);
$xml->addElementAndContent(trim($key), trim($value));
}
}
addElement just appends an opening tag to a string, endElement appends the matching closing tag, and addElementAndContent does both and also puts the content between them.
To turn the whole content into an XML object, I'm using:
$xml = simplexml_load_string($xml->getXMLasText());
And now $xml is an XML object, which is so much easier to work with :)
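If a plain PHP array is still preferred, a small recursive function is one way to get there. The sketch below is only an illustration: parseBlocks is a made-up name, the input is assumed to be well formed, and because a PHP array cannot hold the key PART twice, every block is stored in a list under its name.

```php
<?php
// Hypothetical helper: turns "key = value" lines and "NAME { ... }" blocks
// into nested arrays. Blocks with the same name are collected in a list.
function parseBlocks(array $lines, int &$i = 0): array
{
    $node = array();
    $pending = null; // block name read, but its "{" not seen yet
    for (; $i < count($lines); $i++) {
        $line = trim($lines[$i]);
        if ($line === '') {
            continue;
        }
        if ($line === '}') {
            return $node; // end of the current block, $i stays on the "}"
        }
        if ($line === '{') {
            $i++; // step past "{" and parse the block body
            $node[$pending][] = parseBlocks($lines, $i);
            $pending = null;
            continue; // the loop's $i++ then steps past the closing "}"
        }
        if (strpos($line, '=') !== false) {
            list($key, $value) = explode('=', $line, 2);
            $node[trim($key)] = trim($value);
        } else {
            $pending = $line; // a bare word names the block that follows
        }
    }
    return $node;
}

$text = <<<'TXT'
type = 10
version = 1.2
PART
{
part = foobie
partName = foobie
EVENTS
{
MakeReference
{
active = True
}
}
ACTIONS
{
}
}
PART
{
part = bazer
partName = bazer
}
TXT;

$tree = parseBlocks(preg_split('/\R/', $text));
print_r($tree);
```

All values come back as strings (for example '10' and '1.2'); casting them is left to the caller.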
I have a url formatted like this:
http://www.example.com/detail/state-1/county-2/street-3
What I'd like to do is parse out the values for state (1), county (2), and street (3).
I could do this using a combination of substr(), strpos(), and a loop, but I think a regex would be faster. However, I'm not sure what my regex would be.
$pieces = parse_url($input_url);
$path = trim($pieces['path'], '/');
$segments = explode('/', $path);
$key_to_val = array();
foreach($segments as $segment) {
    // Only segments of the form "key-value" are kept
    $keyval = explode('-', $segment);
    if(count($keyval) == 2) {
        $key_to_val[$keyval[0]] = $keyval[1];
    }
}
/*
$key_to_val: Array (
[state] => 1,
[county] => 2,
[street] => 3
)
*/
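Could just do this:
<?php
$url = "http://www.example.com/detail/state-1/county-2/street-3";
// The path explodes to ['', 'detail', 'state-1', 'county-2', 'street-3'],
// so the first two elements are skipped; each variable holds e.g. 'state-1'.
list( , , $state, $county, $street) = explode('/', parse_url($url, PHP_URL_PATH));
?>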
if (preg_match_all('#([a-z0-9_]*)-(\\d+)#i', $url, $matches, PREG_SET_ORDER)) {
$matches = array(
array(
'state-1',
'state',
'1',
),
array(
'county-2',
'county',
'2',
),
array(
'street-3',
'street',
'3',
)
);
}
Note that the array assigned above only illustrates the structure of $matches after the call (what it would look like if you var_dump'd it).
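If a key-to-value map is more convenient than the set-order structure, array_column can fold the matches into one. A short sketch, using the same URL as the question:

```php
<?php
$url = 'http://www.example.com/detail/state-1/county-2/street-3';

$map = array();
if (preg_match_all('#([a-z0-9_]+)-(\d+)#i', $url, $matches, PREG_SET_ORDER)) {
    // Each match is array(full, name, digits): take index 2 as the value
    // and index 1 as the key.
    $map = array_column($matches, 2, 1);
}

print_r($map);
```

The values stay strings; cast them with (int) if numbers are needed.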
This regex pattern should work:
$subject = 'http://www.example.com/detail/state-1/county-2/street-3';
// preg_match needs pattern delimiters; # is used here because the pattern contains /
$pattern = '#state-(\\d+)/county-(\\d+)/street-(\\d+)#';
preg_match($pattern, $subject, $matches);
// $matches[1], [2], [3] now store the values 1, 2, 3 (respectively)
print_r($matches);