Regular Expression with Names and Emails

I am having a problem with regular expressions at the moment.
For each line in the iteration, I want to check for this pattern: Lastname, Firstname
If the name is found, the script should take the first letter of the first name and the first six letters of the last name and combine them into an email address.
I have the following:
$checklast = "[A-z],";
$checkfirst = "[A-z]";
if (ereg($checklast, $parts[1])||ereg($checkfirst, $parts[2])){
$first = preg_replace($checkfirst, $checkfirst{1,1}, $parts[2]);
print "<a href='mailto:$first.$last@email.com;'> $parts[$i] </a>";
}
This obviously broke the code. I was trying to take only the first letter of the first name, followed by the first six letters of the last name and then @email.com, but it didn't work out. I'm not sure what to do at this point.
Any help is much appreciated.

How about something like this:
$name = 'Smith, John';
$email = preg_replace('/([a-z]{1,6})[a-z]*?,[\\s]([a-z])[a-z]*/i',
'\\2.\\1@email.com', $name);
echo $email; // J.Smith@email.com
Cheers
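For the per-line check described in the question, the same idea can be sketched as a small function. This is only an illustration: the name name_to_email() and the default domain are assumptions, and the pattern only accepts plain ASCII letters.

```php
<?php
// Hypothetical helper (not from the original answer): turns "Lastname, Firstname"
// into first-initial + "." + first six letters of the last name + "@domain".
function name_to_email(string $line, string $domain = 'email.com'): ?string
{
    // Capture the last name before the comma and the first name after it.
    if (!preg_match('/^\s*([a-z]+)\s*,\s*([a-z]+)/i', $line, $m)) {
        return null; // the line is not in "Lastname, Firstname" form
    }
    $last  = substr($m[1], 0, 6); // first six letters of the last name
    $first = substr($m[2], 0, 1); // first letter of the first name
    return $first . '.' . $last . '@' . $domain;
}

echo name_to_email('Smith, John'), "\n";        // J.Smith@email.com
echo name_to_email('Washington, George'), "\n"; // G.Washin@email.com
```

Returning null for lines that do not match lets the calling loop skip them instead of printing a broken mailto link.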

Related

Explode surname and name by lower/upper

How can I separate firstname and surname from a string like this:
Pietro DE GIOVANNI
(Pietro being the firstname and DE GIOVANNI the surname)
I used to do it with an explode() on the spaces, but obviously it doesn't work on a person like that.
Thanks in advance.

You can explode on the names by spaces as before, then loop the result as individual pieces of the name. Check with ctype_upper() if the string is purely uppercase or not, and append it to the proper variable.
Putting it into a function, it may look like this
function split_name($fullname) {
    $firstname = "";
    $surname = "";
    $pieces = explode(" ", $fullname);
    foreach ($pieces as $name) {
        // Pieces written entirely in capitals belong to the surname
        if (ctype_upper($name))
            $surname .= $name." ";
        else
            $firstname .= $name." ";
    }
    // trim() drops the trailing space left by the concatenation
    return array("firstname" => trim($firstname), "surname" => trim($surname));
}
You can then use it as such
$name = "Pietro DE GIOVANNI";
$split = split_name($name);
echo "Firstname: ".$split['firstname']."\nSurname: ".$split['surname'];
Note
This doesn't work for names such as James O'RILEY, John-Paul JOHNSON or John F. KENNEDY. The first two can be handled by stripping any characters that are not a-zA-Z before comparing with ctype_upper(). For the last one, there is not enough data to tell whether the initial belongs to the first name or the surname. You can assume that it is always part of the first name (for instance), and/or check whether it comes after the surname has started (that is, whether a name in capital letters has already been found). You can take care of the first two cases by checking for
if (ctype_upper(preg_replace('/[^a-zA-Z]/', '', $name)))
instead of using the if statement in the original code block. This removes quotes and any other non-a-zA-Z characters before the uppercase check.
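Putting the note together, a variant could look like the sketch below. The function name split_name_by_case() is hypothetical, and the rule that a single-letter initial stays with the first name is an assumption, not something the data can confirm. Only ASCII letters are considered, so accented names would need a different check.

```php
<?php
// Sketch under assumptions: surnames are written in capitals, and a lone
// initial such as "F." is treated as part of the first name.
function split_name_by_case(string $fullname): array
{
    $firstname = [];
    $surname = [];
    foreach (preg_split('/\s+/', trim($fullname)) as $piece) {
        // Compare only the letters, so O'RILEY or SMITH-JONES still count as uppercase.
        $letters = preg_replace('/[^a-zA-Z]/', '', $piece);
        // Require at least two letters so that an initial is not taken for a surname.
        if (strlen($letters) > 1 && ctype_upper($letters)) {
            $surname[] = $piece;
        } else {
            $firstname[] = $piece;
        }
    }
    return [
        'firstname' => implode(' ', $firstname),
        'surname'   => implode(' ', $surname),
    ];
}

print_r(split_name_by_case("John F. KENNEDY"));
```

Collecting the pieces in arrays and joining them with implode() avoids the trailing spaces that string concatenation would leave behind.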

Regex to separate concatenated phone number and email

I have strings where unfortunately phone number and email are concatenated like this:
$phone_email = "617.651.3123mya123@some-site.com";
I'd like to be able to split between the last number and first letter.
Desired output
$phone = "617.651.3123";
$email = "mya123@some-site.com";
I'm using php but hopefully the strategy would be straightforward in any language.
EDIT
I've tried many things, including simply grabbing the email by removing the digits: $email = preg_replace('#^\d+#', '', $phone_email); That removes only the leading 617 and stops at the first dot.

As others have pointed out, splitting between "the last number and first letter" may not be a good strategy, since email addresses can start with numbers. That said, I believe this does roughly what you asked for:
$phone_email = "617.651.3123mya123@some-site.com";
$matches = [];
preg_match("/([^a-zA-Z]*)(.*)/", $phone_email, $matches);
$phone = $matches[1];
$email = $matches[2];
echo "Phone: $phone\n";
echo "Email: $email\n";
// Output:
// Phone: 617.651.3123
// Email: mya123@some-site.com
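If every phone number in the data has the same shape, the caveat about emails that start with a digit can be sidestepped by matching the phone format itself rather than the digit/letter boundary. The sketch below assumes a fixed 3-3-4 layout separated by dots or dashes; split_phone_email() is a hypothetical name.

```php
<?php
// Sketch: assumes each phone number is 3-3-4 digits separated by dots or
// dashes, so whatever follows is taken as the email, even if it starts with a digit.
function split_phone_email(string $input): ?array
{
    if (!preg_match('/^(\d{3}[.\-]\d{3}[.\-]\d{4})(.+)$/', $input, $m)) {
        return null; // input does not start with a phone number in that format
    }
    return ['phone' => $m[1], 'email' => $m[2]];
}

print_r(split_phone_email("617.651.3123mya123@some-site.com"));
```

The trade-off is that any number written in a different format is rejected instead of being split in the wrong place.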

A simple solution is:
$phone_email = "617.651.3123mya123@some-site.com";
$ms = array();
preg_match("/(.*\d)([a-z].*)/", $phone_email, $ms);
print_r($ms);
Of course, cases where the email starts with a number are not handled.

Split after the repeated groups of digits and dots:
print_r(preg_split('/(\d+\.?)+\K/', $phone_email,2));

Making one line Regular Expression

I'm making cover letters for mailing books and magazines. All recipient data is in a database, and a PHP script fetches that data and generates the cover letters. The user who writes a cover letter uses special tags to mark where the name and other fields should be inserted.
For example, to compose a cover letter a user writes:
Dear [[last_name]],
please find attached book...
The PHP script then parses it and replaces the [[last_name]] tag with a real name. When 1000 addresses are selected for mailing, the script produces 1000 cover letters, each with a different name.
In Russian, the word for "Dear" has a different ending for male and female recipients, much like "Dear Mr." and "Dear Mrs." in English.
To mark that in the cover letter, the user writes the possible endings for the word:
Dear[[oy/aya]] [[last_name]]
or it could be something like:
Dear[[ie/oe]]... etc.
I'm trying to figure out the regular expression and replacement command my PHP script needs to parse these kinds of lines.
For the last_name tags I use:
$text = ...//this is text of the cover letter with all the tags.
$res = $mysqli->query("SELECT * FROM `addresses` WHERE `flag` = 1");
while ($row=$res->fetch_assoc()) {
    // keep $text untouched so the tag is still there for the next row
    $letter = str_replace('[[last_name]]', $row['lname'], $text);
    echo $letter;
}
For the word endings, as I understand it, it should be something like:
$text = preg_replace('/\w{2-3}\//\w{2-3}/', ($row['gender']==1)
? 'regexp(first_half)'
: 'regext(second_half)', $text);
I could do all of this by looping over each tag, parsing it and replacing it, but that would take 5-10 lines of code. I'm sure it can be done with just the line above, but I can't figure out how.

See http://www.phpliveregex.com/p/2BC; you then replace with $1 for male and $2 for female.
...
preg_replace('~\[\[(.+?)/(.+?)\]\]~', $row['gender']==1?'$1':'$2', $text);

$gender = ($row['gender'] == 1) ? 1 : 2;
$letter = preg_replace_callback('#\[\[([^\]/]+)(?:/([^\]/]+))?\]\]#',
    function($match) use ($row, $gender) {
        // $match will contain the current match info
        // $match[1] will contain a field name, or the first part of a he/she pair
        // $match[2] will be non-empty only in cases of he/she etc
        return (empty($match[2])) ? $row[$match[1]] : $match[$gender];
    },
    $text // the cover letter template is the subject of the replacement
);
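As a self-contained illustration of the callback approach, the sketch below renders one letter per recipient from an unchanged template. The function name render_letter(), the sample rows, and the convention that gender 1 means male are assumptions made for the example; the real column names come from the addresses table.

```php
<?php
// Sketch: fills [[field]] and [[male/female]] tags for a single recipient.
// The $row keys and the "1 = male" convention are assumptions for this example.
function render_letter(string $template, array $row): string
{
    $index = ($row['gender'] == 1) ? 1 : 2; // which half of a pair to keep
    return preg_replace_callback(
        '#\[\[([^\]/]+)(?:/([^\]/]+))?\]\]#',
        function (array $m) use ($row, $index) {
            if (isset($m[2])) {
                return $m[$index];       // [[oy/aya]] style pair
            }
            return $row[$m[1]] ?? $m[0]; // [[field]] tag; unknown fields are left as-is
        },
        $template
    );
}

$template = 'Dear[[oy/aya]] [[last_name]],';
echo render_letter($template, ['gender' => 1, 'last_name' => 'Ivanov']), "\n";  // Dearoy Ivanov,
echo render_letter($template, ['gender' => 2, 'last_name' => 'Ivanova']), "\n"; // Dearaya Ivanova,
```

Because the template is passed in and never overwritten, each row in the query loop starts from the original tags.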

PHP Regular extract parts from a string

I've created a regular expression in C# but now I'm struggling when trying to run it in PHP. I presumed they'd work the same but obviously not. Does anyone know what needs to be changed below to get it working?
The idea is to make sure that the string is in the format "Firstname Lastname (Company Name)" and then to extract the various parts of the string.
C# code:
string patternName = @"(\w+\s*)(\w+\s+)+";
string patternCompany = @"\((.+\s*)+\)";
string data = "Firstname Lastname (Company Name)";
Match name = Regex.Match(data, patternName);
Match company = Regex.Match(data, patternCompany);
Console.WriteLine(name.ToString());
Console.WriteLine(company.ToString());
Console.ReadLine();
PHP code (not working as expected):
$patternName = "/(\w+\s*)(\w+\s+)+/";
$patternCompany = "/\((.+\s*)+\)/";
$str = "Firstname Lastname (Company Name)";
preg_match($patternName, $str, $nameMatches);
preg_match($patternCompany, $str, $companyMatches);
print_r($nameMatches);
print_r($companyMatches);

Seems to work here. Keep in mind that when you capture matches in a regex, the array PHP produces contains both the full string matched by the pattern as a whole and each individual capture group.
For your name/company name, you'd need to use
$nameMatches[1] -> Firstname
$nameMatches[2] -> Lastname
and
$companyMatches[1] -> Company Name
which is what the capture group matched. The [0] element of both arrays is the entire match.

It could be because you're using double-quotes. PHP might be intercepting your escape sequences and removing them since they are not recognized.

Your patterns do appear to extract the information you want. Try replacing the two print_r() lines with:
print "Firstname: " . $nameMatches[1] . "\n";
print "Lastname: " . $nameMatches[2] . "\n";
print "Company Name: " . $companyMatches[1] . "\n";
Is there anything wrong with this output?
Firstname: Firstname
Lastname: Lastname
Company Name: Company Name
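Since the stated goal is also to make sure the string has the format "Firstname Lastname (Company Name)", one option is a single anchored pattern with named groups. The sketch below is an assumption about what "valid" means here (exactly two words, then a parenthesised company); parse_contact() is a hypothetical name.

```php
<?php
// Sketch: validates and extracts in one step. Assumes exactly two name words
// followed by a company name in parentheses, with nothing before or after.
function parse_contact(string $str): ?array
{
    // Single quotes keep the backslashes exactly as written for the regex engine.
    $pattern = '/^(?<first>\w+)\s+(?<last>\w+)\s+\((?<company>[^)]+)\)$/';
    if (!preg_match($pattern, $str, $m)) {
        return null; // string is not in the expected format
    }
    return ['first' => $m['first'], 'last' => $m['last'], 'company' => $m['company']];
}

print_r(parse_contact("Firstname Lastname (Company Name)"));
```

Unlike the two separate patterns, the captured names carry no trailing whitespace, and a string that is missing a part is rejected as a whole.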

Using regex to fix phone numbers in a CSV with PHP

My new phone does not recognize a phone number unless its area code matches the incoming call. Since I live in Idaho where an area code is not needed for in-state calls, many of my contacts were saved without an area code. Since I have thousands of contacts stored in my phone, it would not be practical to manually update them. I decided to write the following PHP script to handle the problem. It seems to work well, except that I'm finding duplicate area codes at the beginning of random contacts.
<?php
//the script can take a while to complete
set_time_limit(200);
function validate_area_code($number) {
    //digits are taken one by one out of $number and appended to $numString
    $numString = "";
    for ($i = 0; $i < strlen($number); $i++) {
        $curr = substr($number,$i,1);
        //only copy from $number to $numString when the character is numeric
        if (is_numeric($curr)) {
            $numString = $numString . $curr;
        }
    }
    //add area code "208" to the beginning of any phone number of length 7
    if (strlen($numString) == 7) {
        return "208" . $numString;
    //remove country code (none of the contacts are outside the U.S.)
    } else if (strlen($numString) == 11) {
        return preg_replace("/^1/","",$numString);
    } else {
        return $numString;
    }
}
//matches any phone number in the csv
$pattern = "/((1? ?\(?[2-9]\d\d\)? *)? ?\d\d\d-?\d\d\d\d)/";
$csv = file_get_contents("contacts2.CSV");
preg_match_all($pattern,$csv,$matches);
foreach ($matches[0] as $key1 => $value) {
    /*create a pattern that matches the specific phone number by adding slashes before possible special characters*/
    $pattern = preg_replace("/\(|\)|\-/","\\\\$0",$value);
    //create the replacement phone number
    $replacement = validate_area_code($value);
    //add delimiters
    $pattern = "/" . $pattern . "/";
    $csv = preg_replace($pattern,$replacement,$csv);
}
echo $csv;
?>
Is there a better approach to modifying the CSV? Also, is there a way to minimize the number of passes over the CSV? In the script above, preg_replace is called thousands of times on a very large String.

If I understand you correctly, you just need to prepend the area code to any 7-digit phone number anywhere in this file, right? I have no idea what kind of system you're on, but if you have some decent tools, here are a couple options. And of course, the approaches they take can presumably be implemented in PHP; that's just not one of my languages.
So, how about a sed one-liner? Just look for 7-digit phone numbers, bounded by either beginning of line or comma on the left, and comma or end of line on the right.
sed -r 's/(^|,)([0-9]{3}-[0-9]{4})(,|$)/\1208-\2\3/g' contacts.csv
Or if you want to only apply it to certain fields, perl (or awk) would be easier. Suppose it's the second field:
perl -F, -ane '$"=","; $F[1]=~s/^[0-9]{3}-[0-9]{4}$/208-$&/; print "@F";' contacts.csv
The -F, sets the field separator; $" is the list separator used when the array is interpolated into the printed string (yes, it gets assigned once per loop, oh well); arrays are zero-indexed, so the second field is $F[1]; then there's a run-of-the-mill substitution, and you print the result.
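The same single-pass idea can be sketched in PHP with preg_replace_callback(), which rewrites each number where it is found instead of building one replacement pattern per match. The function names are hypothetical, the area code 208 comes from the question, and the sketch assumes that every 7-, 10- or 11-digit sequence in the file is a phone number.

```php
<?php
// Sketch: keep only the digits, then normalise to ten digits
// (assumes area code 208 for seven-digit numbers, as in the question).
function normalize_phone(string $number): string
{
    $digits = preg_replace('/\D/', '', $number);
    if (strlen($digits) === 7) {
        return '208' . $digits;    // add the missing area code
    }
    if (strlen($digits) === 11 && $digits[0] === '1') {
        return substr($digits, 1); // drop the leading country code
    }
    return $digits;
}

// One pass over the CSV text: each match is replaced in place, so a number
// that has already been fixed is never matched a second time.
function fix_phone_numbers(string $csv): string
{
    $pattern = '/(?<!\d)(?:(?:1 ?)?\(?[2-9]\d\d\)?[ -]?)?\d{3}-?\d{4}(?!\d)/';
    return preg_replace_callback(
        $pattern,
        function (array $m) {
            return normalize_phone($m[0]);
        },
        $csv
    );
}

echo fix_phone_numbers("(208) 555-5555, 555-5555"), "\n"; // 2085555555, 2085555555
```

The lookarounds (?<!\d) and (?!\d) keep the pattern from matching inside a longer run of digits, which is what allowed an area code to be added twice.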

Ah programs... sometimes a 10-min hack is better.
If it were me... I'd import the CSV into Excel, sort it by something - maybe the length of the phone number or something. Make a new col for the fixed phone number. When you have a group of similarly-fouled numbers, make a formula to fix. Same for the next group. Should be pretty quick, no? Then export to .csv again, omitting the bad col.

A little more digging on my own revealed the issues with the regex in my question. The problem is with duplicate contacts in the csv.
Example:
(208) 555-5555, 555-5555
After the first pass this becomes:
2085555555, 2085555555
and after the second pass it becomes:
2082085555555, 2082085555555
I worked around this by changing the replacement regex to:
//add escapes for special characters
$pattern = preg_replace("/\(|\)|\-|\./","\\\\$0",$value);
//add delimiters, and optional area code
$pattern = "/(\(?[0-9]{3}\)?)? ?" . $pattern . "/";
