I'm trying to make the function below return only one email per domain.
Example: if I feed the function:
email1#domain.com email2#domain.com email1#domain.com
email1#domain.com email3#test.co.uk
I want it to return
email1#domain.com email3#test.co.uk
Here is the current function:
function remove_duplicates($str) {
# match all email addresses using a regular expression and store them
# in an array called $results
preg_match_all("([\w-]+(?:\.[\w-]+)*#(?:[\w-]+\.)+[a-zA-Z]{2,7})",$str,$results);
# sort the results alphabetically
sort($results[0]);
# remove duplicate results by comparing it to the previous value
$prev="";
foreach ($results[0] as $key => $val) { # each() was removed in PHP 8
if($val==$prev) unset($results[0][$key]);
else $prev=$val;
}
# process the array and return the remaining email addresses
$str = "";
foreach ($results[0] as $value) {
$str .= "<br />".$value;
}
return $str;
};
Any ideas how to achieve this?
Something along these lines:
$emails = array('email1#domain.com', 'email2#domain.com', 'email1#domain.com', 'email1#domain.com', 'email3#test.co.uk');
$grouped = array();
foreach ($emails as $email) {
preg_match('/(?<=#)[^#]+$/', $email, $match);
$grouped[$match[0]] = $email;
}
var_dump($grouped);
This keeps the last occurrence for each domain; it's not hard to modify it to keep the first instead if you need that.
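A minimal sketch of the "keep the first" variant, assuming the same '#'-separated addresses as in the examples. The function name first_per_domain and the lowercasing of the domain are my own additions for illustration, not part of the answer above.

```php
<?php
// Hypothetical helper: keep the FIRST address seen for each domain.
function first_per_domain(array $emails): array
{
    $grouped = array();
    foreach ($emails as $email) {
        // Capture everything after the last '#' as the domain.
        if (!preg_match('/(?<=#)[^#]+$/', $email, $match)) {
            continue; // skip strings without a domain part
        }
        // Assumption: domains are compared case-insensitively.
        $domain = strtolower($match[0]);
        if (!isset($grouped[$domain])) {
            $grouped[$domain] = $email; // only the first hit is stored
        }
    }
    return $grouped;
}

$emails = array('email1#domain.com', 'email2#domain.com', 'email1#domain.com', 'email3#test.co.uk');
print_r(first_per_domain($emails));
```

The only change from the loop above is the isset() guard, which stops later addresses from overwriting the first one stored for a domain.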
You could simply use the array_unique function to remove exact duplicates (note that on its own it does not reduce the list to one address per domain):
$emails = explode(' ', $emailString);
$emails = array_unique($emails);
The $prev approach is not reliable unless all equal hostnames form one contiguous run. It would work if you sorted by hostname, using a custom comparison function, but that is a bit of overkill.
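For illustration only, the sort-by-hostname variant could be sketched roughly like this. The helper names host_of and unique_by_sorted_host are hypothetical, and '#' is assumed to separate the local part from the hostname as in the question.

```php
<?php
// Hypothetical helper: the part after the last '#', lowercased ('' if none).
function host_of(string $email): string
{
    $pos = strrpos($email, '#');
    return $pos === false ? '' : strtolower(substr($email, $pos + 1));
}

// Hypothetical helper: sort by hostname, then drop entries whose hostname
// equals that of the previous entry.
function unique_by_sorted_host(array $emails): array
{
    // After sorting, equal hostnames sit next to each other.
    usort($emails, function ($a, $b) {
        return strcmp(host_of($a), host_of($b));
    });

    $result = array();
    $prev = null;
    foreach ($emails as $email) {
        $host = host_of($email);
        if ($host !== $prev) {
            $result[] = $email;
            $prev = $host;
        }
    }
    return $result;
}

print_r(unique_by_sorted_host(array('email1#domain.com', 'email3#test.co.uk', 'email2#domain.com')));
```

Which address survives for a given hostname depends on the sort order, and usort() was not a stable sort before PHP 8.0, so this sketch makes no promise about keeping the first occurrence.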
Build an array with the hostnames, drop entries for which there is already a hostname in the array.
I'd suggest the following trick/procedure:
1. Change from one string to an array of addresses. You do this with preg_match_all, others might do it with explode; both are valid. So you have this already.
2. Extract the domain from the address. You could do this with a regular expression again or by some other means; I'd say it's trivial.
3. Check whether the domain has already been used, and if not, pick that email address.
The last point can easily be done by using an array with the domain as key. You can then use isset to see whether it is already in use.
Edit: As deceze opted for a similar answer (he overwrites the matches per domain), the following code example is a small variation. As you have string input, I chose to iterate over it step by step, which spares the temporary array of addresses and parses the address and domain at once. To do that, you need to take care of the offsets, which preg_match supports. Something similar is possible with preg_match_all; however, you would then have the array again.
This code will pick the first and ignore the other addresses per domain:
$str = 'email1#domain.com email2#domain.com email1#domain.com email1#domain.com email3#test.co.uk';
$addresses = array();
$pattern = '/[\w-]+(?:\.[\w-]+)*#((?:[\w-]+\.)+[a-zA-Z]{2,7})/';
$offset = 0;
while (preg_match($pattern, $str, $matches, PREG_OFFSET_CAPTURE, $offset)) {
list(list($address, $pos), list($domain)) = $matches;
isset($addresses[$domain]) || $addresses[$domain] = $address;
$offset = $pos + strlen($address);
}
I'm currently building a Slack bot using Laravel, and one of the features is that it can receive an email address and send a message to it.
The issue is that email addresses (e.g. bob#example.com) come through as <mailto:bob#example.com|bob#example.com> from Slack.
I currently have a function that retrieves the email from this:
public function getEmail($string)
{
$pattern = '/[a-z0-9_\-\+]+#[a-z0-9\-]+\.([a-z]{2,3})(?:\.[a-z]{2})?/i';
preg_match_all($pattern, $string, $matches);
$matches = array_filter($matches);
return $matches[0][0];
}
This seemed to work fine with email addresses like bob#example.com, but it fails with addresses like bob.jones#example.com (which come through as <mailto:bob.jones#example.com|bob.jones#example.com>).
In these cases, the function is returning jones#example.com as the email address.
I'm not great with regex, but is there something else I could use/change in my pattern, or a better way to fetch the email address from the string provided by Slack?
Could always take regex out of the equation if you know that's always the format it'll be in:
$testString = '<mailto:bob#example.com|bob#example.com>';
$testString = str_replace(['<mailto:', '>'], '', $testString);
$addresses = explode('|', $testString);
echo $addresses[0];
This method does the job without regular expressions, and it makes sure the returned value is a real email address by validating it with PHP's filter_var() function.
function getEmailAddress($string)
{
$string = trim($string, '<>');
$args = explode('|', $string);
foreach ($args as $_ => $val) {
if(filter_var($val, FILTER_VALIDATE_EMAIL) !== false) {
return $val;
}
}
return null;
}
echo getEmailAddress('<mailto:bob#example.com|bob#example.com>');
Output
bob#example.com
You know the strings containing the e-mail address will always be of the form <mailto:bob#example.com|bob#example.com>, so use that. Specifically, you know the string will start with <mailto:, will contain a |, and will end with >.
An added difficulty though, is that the local part of an e-mail address may contain a pipe character as well, but the domain may not; see the following question.
What characters are allowed in an email address?
public function getEmail($string)
{
    // \| matches a literal pipe; an unescaped | would act as regex alternation
    $pattern = '/^<mailto:([^#]+#[^|]+)\|(.*)>$/i';
    preg_match_all($pattern, $string, $matches);
    $matches = array_filter($matches);
    return $matches[1][0];
}
This matches the full line from beginning to end, but we capture the e-mail address within the first set of parentheses. $matches[1] contains all matches from the first capturing parentheses. You could use preg_match instead, since you're not looking for all matches, just the first one.
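The preg_match variant mentioned here might look like the sketch below. The function name getSlackEmail and the null return for unexpected input are my own assumptions, and '#' stands in for the address separator exactly as in the examples.

```php
<?php
// Hypothetical helper: extract the address from Slack's <mailto:...|...> form.
function getSlackEmail(string $string): ?string
{
    // The pipe is escaped so it matches a literal '|' rather than acting as
    // alternation. The local part ([^#]+) may itself contain a pipe.
    $pattern = '/^<mailto:([^#]+#[^|>]+)\|.*>$/i';
    if (preg_match($pattern, $string, $matches) === 1) {
        return $matches[1];
    }
    return null; // input was not in the expected format
}

echo getSlackEmail('<mailto:bob.jones#example.com|bob.jones#example.com>');
```

With preg_match the first capture group is available directly as $matches[1], so the array_filter() step and the extra [0] index are no longer needed.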
I'm trying to figure out how I can compare values from an array against a particular string.
Basically my values look like chrisx001, chrisx002, chrisx003, chrisx004, bob001
I was looking at fnmatch(), but I'm not sure it is the right choice. What I want is to keep chrisx--- but ignore bob---, so I need to wildcard the last part. Is there a way to write something like
if($value == "chrisx%"){/*do something*/}
And if that's possible, can I also check that the % part is an int (or similar) in other cases?
Regex can tell you if a string starts with chrisx:
if (preg_match('/^chrisx/', $subject)) {
// Starts with chrisx
}
You can also capture the bit after chrisx:
preg_match('/^chrisx(.*)/', $subject, $matches);
echo $matches[1];
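To also check that the part after chrisx is an integer, as the question asks, the capture can be restricted to digits. This is a small sketch; the function name chrisxNumber is hypothetical.

```php
<?php
// Hypothetical helper: returns the numeric suffix as an int, or null when
// $value is not "chrisx" followed only by digits.
function chrisxNumber(string $value): ?int
{
    // \d+ requires at least one digit; \z anchors at the very end of the string.
    if (preg_match('/^chrisx(\d+)\z/', $value, $matches) === 1) {
        return (int) $matches[1];
    }
    return null;
}

var_dump(chrisxNumber('chrisx003')); // int(3)
var_dump(chrisxNumber('bob001'));    // NULL
```

Note that the cast drops leading zeros, so keep $matches[1] as a string if the zero padding matters.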
You could filter your array to return a second array of only those entries beginning with 'chris' and then process that filtered array:
$testData = array ( 'chrisx001', 'chrisx002', 'chrisx003', 'chrisx004', 'bob001');
$testNeedle = 'chris';
$filtered = array_filter( $testData,
function($arrayEntry) use ($testNeedle) {
return (strpos($arrayEntry,$testNeedle) === 0);
}
);
var_dump($filtered);
I have a text file, ban.txt, with this content:
a:5:{i:14528;s:15:" 118.71.102.176";i:6048;s:15:" 113.22.109.137";i:16731;s:3:" 118.71.102.76";i:2269;s:12:" 1.52.251.63";i:9050;s:14:"123.21.100.174";}
I wrote a script to ban the IPs listed in this file:
<?php
$banlist = file("ban.txt");
foreach($banlist as $ips ) {
if($_SERVER["REMOTE_ADDR"] == $ips) {
die("Your IP is banned!");
}
}
?>
Can you help me list the IPs in this content? I'm new to PHP. Thanks very much.
Look, this is an admittedly crude solution based on an unclear question.
Regex rarely feels like a great solution, but I don't have much detail on how consistent the file is.
1. Isolate "s" segments in your ban.txt
My regex isn't fantastic, but the following should match the "s" segments that appear to hold the banned IPs (although your comment stating "The IP always in "ip"" confuses this a little). The dots are escaped so that they match a literal "." rather than any character.
Regex: s:[0-9]+:"[ ]*[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+";
2. Isolate the IPs within each "s" segment
Once we have these segments, we can strip the start bit up to the actual IP (i.e. turn s:123:"192.168.0.0"; into 192.168.0.0";), and afterwards trim the end quotation mark and semi-colon (i.e. 192.168.0.0"; to 192.168.0.0):
Regex for start junk (still need to trim end): s:[0-9]+:"[ ]*
Regex for end junk: [";]+
3. Example Code
This would give us this PHP code:
$banText = file_get_contents("ban.txt");
/* Evil, evil regexes */
$sSegmentsRegex = '/s:[0-9]+:"[ ]*[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+"/'; /* dots escaped to match a literal "." */
$removeStartJunkRegex = '/s:[0-9]+:"[ ]*/';
$removeEndJunkRegex = '/[";]+/'; /* Could use rtrim on each if wanted */
$matches = array();
/* Find all 's' bits */
preg_match_all($sSegmentsRegex, $banText, $matches);
$matches = $matches[0]; /* preg_match_all changes $matches to array of arrays */
/* Remove start junk of each 's' bit */
$matches = preg_replace($removeStartJunkRegex, "", $matches);
$matches = preg_replace($removeEndJunkRegex, "", $matches);
foreach($matches as $ip) {
if($_SERVER["REMOTE_ADDR"] == $ip) {
die("Your IP is banned!");
}
}
print_r($matches); /* Shows the list of IP bans, remove this in your app */
Example: http://codepad.viper-7.com/S9rTQe
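As an alternative sketch, a single pattern with a capture group can pull out the IPs in one step, without the two clean-up replacements. The text looks like the output of serialize(), but at least one entry's declared length (s:3) does not match its string, so unserialize() would probably reject it as shown; the sketch therefore stays with a regex. The function name extractBannedIps and the stand-in visitor address are assumptions for illustration.

```php
<?php
// Hypothetical helper: returns the dotted-quad strings found in s:N:"..." segments.
// It does not verify that each octet is <= 255.
function extractBannedIps(string $banText): array
{
    // Group 1 captures the IP; \s* tolerates the leading spaces seen in ban.txt.
    preg_match_all('/s:\d+:"\s*(\d{1,3}(?:\.\d{1,3}){3})\s*";/', $banText, $matches);
    return $matches[1];
}

// Shortened sample; the real script would use file_get_contents("ban.txt").
$banText = 'a:2:{i:14528;s:15:" 118.71.102.176";i:9050;s:14:"123.21.100.174";}';
$ips = extractBannedIps($banText);
print_r($ips);

// Stand-in for $_SERVER["REMOTE_ADDR"], which is not set on the command line.
$visitor = '123.21.100.174';
if (in_array($visitor, $ips, true)) {
    echo "Your IP is banned!\n";
}
```

Using in_array() with strict comparison avoids the manual loop and any loose type juggling between strings.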
$var="UseCountry=1
UseCountryDefault=1
UseState=1
UseStateDefault=1
UseLocality=1
UseLocalityDefault=1
cantidad_productos=5
expireDays=5
apikey=ABQIAAAAFHktBEXrHnX108wOdzd3aBTupK1kJuoJNBHuh0laPBvYXhjzZxR0qkeXcGC_0Dxf4UMhkR7ZNb04dQ
distancia=15
AutoCoord=1
user_add_locality=0
SaveContactForm=0
ShowVoteRating=0
Listlayout=0
WidthThumbs=100
HeightThumbs=75
WidthImage=640
HeightImage=480
ShowImagesSystem=1
ShowOrderBy=0
ShowOrderByDefault=0
ShowOrderDefault=DESC
SimbolPrice=$
PositionPrice=0
FormatPrice=0
ShowLogoAgent=1
ShowReferenceInList=1
ShowCategoryInList=1
ShowTypeInList=1
ShowAddressInList=1
ShowContactLink=1
ShowMapLink=1
ShowAddShortListLink=1
ShowViewPropertiesAgentLink=1
ThumbsInAccordion=5
WidthThumbsAccordion=100
HeightThumbsAccordion=75
ShowFeaturesInList=1
ShowAllParentCategory=0
AmountPanel=
AmountForRegistered=5
RegisteredAutoPublish=1
AmountForAuthor=5
AmountForEditor=5
AmountForPublisher=5
AmountForManager=5
AmountForAdministrator=5
AutoPublish=1
MailAdminPublish=1
DetailLayout=0
ActivarTabs=0
ActivarDescripcion=1
ActivarDetails=1
ActivarVideo=1
ActivarPanoramica=1
ActivarContactar=1
ContactMailFormat=1
ActivarReservas=1
ActivarMapa=1
ShowImagesSystemDetail=1
WidthThumbsDetail=120
HeightThumbsDetail=90
idCountryDefault=1
idStateDefault=1
ms_country=1
ms_state=1
ms_locality=1
ms_category=1
ms_Subcategory=1
ms_type=1
ms_price=1
ms_bedrooms=1
ms_bathrooms=1
ms_parking=1
ShowTextSearch=1
minprice=
maxprice=
ms_catradius=1
idcatradius1=
idcatradius2=
ShowTotalResult=1
md_country=1
md_state=1
md_locality=1
md_category=1
md_type=1
showComments=0
useComment2=0
useComment3=0
useComment4=0
useComment5=0
AmountMonthsCalendar=3
StartYearCalendar=2009
StartMonthCalendar=1
PeriodOnlyWeeks=0
PeriodAmount=3
PeriodStartDay=1
apikey=ABQIAAAAJ879Hg7OSEKVrRKc2YHjixSmyv5A3ewe40XW2YiIN-ybtu7KLRQiVUIEW3WsL8vOtIeTFIVUXDOAcQ
";
From that string I only want "api==ABQIAAAAJ879Hg7OSEKVrRKc2YHjixSmyv5A3ewe40XW2YiIN-ybtu7KLRQiVUIEW3WsL8vOtIeTFIVUXDOAcQ".
Please guide me.
EDIT
As shamittomar pointed out, parse_str() will not work in this situation; the proper regex is posted below.
Given this seems to be a QUERY STRING, use the parse_str() function PHP provides.
UPDATE
If you want to do it with regex using preg_match() as powertieke pointed out:
preg_match('/apikey=(.*)/', $var, $matches);
echo $matches[1];
Should do the trick.
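One caveat: the sample string contains an apikey line twice, and a single preg_match() returns the first one, while the question asks for the value on the last line. A possible sketch for picking the last occurrence follows; the function name lastApiKey is hypothetical.

```php
<?php
// Hypothetical helper: returns the value of the LAST "apikey=" line, or null.
function lastApiKey(string $text): ?string
{
    // The m flag lets ^ match at the start of every line. \r and \n are
    // excluded from the value so either line-ending style works.
    if (!preg_match_all('/^apikey=([^\r\n]*)/m', $text, $matches)) {
        return null; // no apikey line found
    }
    return end($matches[1]); // last captured value
}

$var = "expireDays=5\napikey=FIRSTKEY\ndistancia=15\napikey=SECONDKEY\n";
echo lastApiKey($var);
```

Anchoring with ^ also keeps the pattern from matching a longer key name that merely ends in "apikey".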
preg_match(); should be right up your alley
People are quick to jump to preg_match when this can be done with regular string functions, which are faster.
$string = '
expireDays=5
apikey=ABQIAAAAFHktBEXrHnX108wOdzd3aBTupK1kJuoJNBHuh0laPBvYXhjzZxR0qkeXcGC_0Dxf4UMhkR7ZNb04dQ
distancia=15
AutoCoord=1';
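//test to see what type of line break it is and explode by that.
$parts = (strstr($string,"\r\n") ? explode("\r\n",$string) : explode("\n",$string));
$data = array();
foreach($parts as $part)
{
    // a limit of 2 keeps any '=' inside the value intact
    $sub = explode("=",trim($part),2);
    // skip blank lines and lines that are not key=value pairs
    if(count($sub) === 2 && $sub[0] !== '')
    {
        $data[$sub[0]] = $sub[1];
    }
}
Then use $data['apikey'] for your API key. Because later keys overwrite earlier ones, a repeated apikey line leaves the last value in $data. I would also advise you to wrap this in a function.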
I can bet this is a better way to parse the string and much faster.
function ParsemyString($string)
{
    $parts = (strstr($string,"\r\n") ? explode("\r\n",$string) : explode("\n",$string));
    $data = array();
    foreach($parts as $part)
    {
        // a limit of 2 keeps any '=' inside the value intact
        $sub = explode("=",trim($part),2);
        // skip blank lines and lines that are not key=value pairs
        if(count($sub) === 2 && $sub[0] !== '')
        {
            $data[$sub[0]] = $sub[1];
        }
    }
    return $data;
}
$data = ParsemyString($string);
First of all, you are not looking for
api==ABQIAAAAJ879Hg7OSEKVrRKc2YHjixSmyv5A3ewe40XW2YiIN-ybtu7KLRQiVUIEW3WsL8vOtIeTFIVUXDOAcQ
but you are looking for
apikey=ABQIAAAAJ879Hg7OSEKVrRKc2YHjixSmyv5A3ewe40XW2YiIN-ybtu7KLRQiVUIEW3WsL8vOtIeTFIVUXDOAcQ
It is important to know whether the api-key property always occurs at the end and whether the length of the api-key value is always the same. If this is the case, you could use the PHP substr() function, which would be easiest.
If not, you would most probably need a regular expression that you can feed to PHP's preg_match() function. Something along the lines of apikey=([a-zA-Z0-9_\-]+), which matches an api-key made of lowercase and uppercase letters, digits, underscores and dashes. With preg_match() you can retrieve the matches (and thus your api-key value) through its third argument.
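If the key is known to sit on the last apikey line, the substr() idea can be sketched without assuming a fixed key length by locating the last "apikey=" with strrpos(). The function name apiKeyFromEnd is hypothetical, and the sketch assumes no other key name ends in "apikey".

```php
<?php
// Hypothetical helper: returns the text after the last "apikey=" up to the
// end of that line, or null when the marker is missing.
function apiKeyFromEnd(string $text): ?string
{
    $needle = 'apikey=';
    $pos = strrpos($text, $needle); // position of the LAST occurrence
    if ($pos === false) {
        return null;
    }
    $rest = substr($text, $pos + strlen($needle));
    // strcspn() gives the length of the run before the first line break.
    return trim(substr($rest, 0, strcspn($rest, "\r\n")));
}

echo apiKeyFromEnd("expireDays=5\napikey=FIRSTKEY\napikey=SECONDKEY\n");
```

This avoids regular expressions entirely, at the cost of being less strict about where the marker may appear.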
Given a list of emails, formatted:
"FirstName Last" <email#address.com>, "NewFirst NewLast" <email2#address.com>
How can I build this into a string array of only email addresses? (I don't need the names.)
PHP’s Mailparse extension has a mailparse_rfc822_parse_addresses function you might want to try. Otherwise you should build your own address parser.
You could use preg_match_all (docs):
preg_match_all('/<([^>]+)>/', $s, $matches);
print_r($matches); // inspect the resulting array
Provided that all addresses are enclosed in < ... > there is no need to explode() the string $s.
EDIT In response to comments, the regex could be rewritten as '/<([^#]+#[^>]+)>/'. Not sure whether this is fail-safe, though :)
EDIT #2 Use a parser for any non-trivial data (see the comments below - email address parsing is a bitch). Some errors could, however, be prevented by removing duplicate addresses.
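Putting the preg_match_all idea together with duplicate removal, a rough sketch could look like this. It is no substitute for a real address parser, the function name extractAddresses is hypothetical, and it assumes every address is wrapped in < ... >.

```php
<?php
// Hypothetical helper: collects the text inside <...> and removes exact duplicates.
function extractAddresses(string $list): array
{
    // Group 1 is whatever sits between an opening '<' and the next '>'.
    preg_match_all('/<([^<>]+)>/', $list, $matches);
    // array_values() re-indexes after array_unique() drops repeated entries.
    return array_values(array_unique($matches[1]));
}

$s = '"FirstName Last" <email#address.com>, "NewFirst NewLast" <email2#address.com>';
print_r(extractAddresses($s));
```

A display name that itself contains angle brackets would confuse this pattern, which is one reason a dedicated parser is the safer choice for untrusted input.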
<?php
$s = "\"FirstName Last\" <email#address.com>, \"NewFirst NewLast\" <email2#address.com>";
$emails = array();
foreach (explode(",", $s) as $full) // split() was removed in PHP 7
{
preg_match("/.*<([^>]+)/", $full, $email);
$emails[] = $email[1];
}
print_r($emails);
?>