Extract a string between two characters - php

I have an output string in this format Fname Lname<fname#urmail.com>. I want to extract the email from here. How can I do that?

If you can be sure that the string format is consistent, a simple regular expression will do the trick:
$input = 'Fname Lname<fname#urmail.com>';
preg_match('~<(.*?)>~', $input, $output);
$email = $output[1];
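If the format might not always be consistent, it can help to check the return value of preg_match() before reading the capture group. A minimal sketch, assuming a hypothetical helper named text_between() (it is not part of PHP or of the answer above):

```php
<?php
// Hypothetical helper: returns the text between the first $open and the
// next $close, or null when no such pair exists in $input.
function text_between(string $input, string $open, string $close): ?string
{
    // preg_quote() escapes the delimiters so they are matched literally.
    $pattern = '~' . preg_quote($open, '~') . '(.*?)' . preg_quote($close, '~') . '~s';

    return preg_match($pattern, $input, $m) === 1 ? $m[1] : null;
}

$email = text_between('Fname Lname<fname#urmail.com>', '<', '>');
echo $email ?? 'no match', "\n";
```

Returning null instead of reading $m[1] unconditionally avoids an "undefined offset" notice when the input has no angle brackets.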

Don't reinvent the wheel. Instead, use a parser. mailparse_rfc822_parse_addresses() is made for this specific task by professionals with an in-depth knowledge of the subject (and the possible quirks that you may run into).
Example #1 from the docs:
$to = 'Wez Furlong <wez#example.com>, doe#example.com';
var_dump(mailparse_rfc822_parse_addresses($to));
Gives (gentle formatting applied):
array(2) {
[0] => array(3) {
["display"] => string(11) "Wez Furlong"
["address"] => string(15) "wez#example.com"
["is_group"] => bool(false)
}
[1] => array(3) {
["display"] => string(15) "doe#example.com"
["address"] => string(15) "doe#example.com"
["is_group"] => bool(false)
}
}
See also: imap_rfc822_parse_adrlist() and Full name with valid email.

Use functions like substr() and explode() (an easier method than regular expressions, and it will do the trick):
<?php
$text = 'Fname Lname<fname#urmail.com>';
$pieces = explode('<',$text);
$mail=substr($pieces[1],0,-1);
echo $mail;
?>

This should print the e-mail address:
if (preg_match("/<(\S*)>/", $subject, $matches)) {
echo "E-Mail address: ".$matches[1];
}

Related

Extract email address from string - php

I want to extract the email address from a string, for example:
<?php // code
$string = 'Ruchika <ruchika#example.com>';
?>
From the above string I only want to get the email address ruchika#example.com.
Please recommend how to achieve this.
Try this
<?php
$string = 'Ruchika < ruchika#example.com >';
$pattern = '/[a-z0-9_\-\+\.]+#[a-z0-9\-]+\.([a-z]{2,4})(?:\.[a-z]{2})?/i';
preg_match_all($pattern, $string, $matches);
var_dump($matches[0]);
?>
Second method
<?php
$text = 'Ruchika < ruchika#example.com >';
preg_match_all("/[\._a-zA-Z0-9-]+#[\._a-zA-Z0-9-]+/i", $text, $matches);
print_r($matches[0]);
?>
Parsing e-mail addresses is insanely hard and leads to a very complicated regular expression. For example, consider this regular expression for matching an RFC 822 e-mail address: http://www.ex-parrot.com/pdw/Mail-RFC822-Address.html
Amazing right?
Instead, PHP has a function for this, mailparse_rfc822_parse_addresses(), provided by the mailparse extension and documented in the PHP manual.
It takes a string as its argument and returns an array of associative arrays with the keys display, address and is_group.
So,
$to = 'Wez Furlong <wez#example.com>, doe#example.com';
var_dump(mailparse_rfc822_parse_addresses($to));
would yield:
array(2) {
[0]=>
array(3) {
["display"]=>
string(11) "Wez Furlong"
["address"]=>
string(15) "wez#example.com"
["is_group"]=>
bool(false)
}
[1]=>
array(3) {
["display"]=>
string(15) "doe#example.com"
["address"]=>
string(15) "doe#example.com"
["is_group"]=>
bool(false)
}
}
try this code.
<?php
function extract_emails_from($string){
preg_match_all("/[\._a-zA-Z0-9-]+#[\._a-zA-Z0-9-]+/i", $string, $matches);
return $matches[0];
}
$text = "blah blah blah blah blah blah email2#address.com";
$emails = extract_emails_from($text);
print(implode("\n", $emails));
?>
This will work.
Thanks.
This is based on Niranjan's response, assuming the input email is enclosed within < and > characters. Instead of using a regular expression to match the email address itself, I grab the text between the < and > characters. If those characters are missing, I fall back to the entire string. I don't validate the email address; whether you need to depends on your scenario.
<?php
$string = 'Ruchika <ruchika#example.com>';
$pattern = '/<(.*?)>/i';
preg_match_all($pattern, $string, $matches);
var_dump($matches);
$email = $matches[1][0] ?? $string;
echo $email;
?>
Of course, if my assumption isn't correct, then this approach will fail. But based on your input, I believe you wanted to extract emails enclosed within < and > chars.
This function extracts all email addresses from a string and returns them in an array.
function extract_emails_from($string){
preg_match_all( '/([\w+\.]*\w+#[\w+\.]*\w+[\w+\-\w+]*\.\w+)/is', $string, $matches );
return $matches[0];
}
This works great and it's minimal:
$email = strpos($from, '<') !== false ? substr($from, strpos($from, '<') + 1, -1) : $from;
Use (my) function getEmailArrayFromString to easily extract email addresses from a given string.
<?php
function getEmailArrayFromString($sString = '')
{
$sPattern = '/[\._\p{L}\p{M}\p{N}-]+#[\._\p{L}\p{M}\p{N}-]+/u';
preg_match_all($sPattern, $sString, $aMatch);
$aMatch = array_keys(array_flip(current($aMatch)));
return $aMatch;
}
// Example
$sString = 'foo#example.com XXX bar#example.com XXX <baz#example.com>';
$aEmail = getEmailArrayFromString($sString);
/**
* array(3) {
[0]=>
string(15) "foo#example.com"
[1]=>
string(15) "bar#example.com"
[2]=>
string(15) "baz#example.com"
}
*/
var_dump($aEmail);
Based on Priya Rajaram's code, I have optimised the function a little so that each email address appears only once.
If, for example, an HTML document is parsed, you usually get every address twice, because the address is also used in the mailto link.
function extract_emails_from($string){
preg_match_all("/[\._a-zA-Z0-9-]+#[\._a-zA-Z0-9-]+/i", $string, $matches);
return array_values(array_unique($matches[0]));
}
This will work even on subdomains. It extracts all emails from text.
$pattern = "/[a-zA-Z0-9-_]{1,}#[a-zA-Z0-9-_]{1,}(\.[a-zA-Z]{1,}){1,}/";
preg_match_all ($pattern , $string, $matches);
print_r($matches);
$matches[0] has all emails:
Array
(
[0] => Array
(
[0] => clotdesormakilgehr#prednisonecy.com
[1] => **********#******.co.za.com
[2] => clotdesormakilgehr#prednisonecy.com
[3] => clotdesormakilgehr#prednisonecy.mikedomain.com
[4] => clotdesormakilgehr#prednisonecy.com
)
[1] => Array
(
[0] => .com
[1] => .com
[2] => .com
[3] => .com
[4] => .com
)
)
A relatively straightforward approach is to use PHP's built-in functions for splitting text into words and validating e-mail addresses:
function fetchEmails($text) {
$words = str_word_count($text, 1, '.#-_1234567890');
return array_filter($words, function($word) {return filter_var($word, FILTER_VALIDATE_EMAIL);});
}
Will return the e-mail addresses within the text variable.

Get a list of domains from a table via regex

I have a list of domains in a table, along with other data:
<td>example1.com</td>
<td>example2.org</td>
<td>example3.com</td>
<td>example4.com</td>
I need to get the .com domains using a regex. I tried something like:
'<td>(.............).com'
But what can I write instead of the dots? What do I need to use?
I need to get the data between the tags: <td>domain.com</td> -> domain.com
'<td>([^<]+\.com)</td>'
- that's better, but I need the result without the tags
<?php
$html = '<td>example1.com</td>
<td>example2.org</td>
<td>example3.com</td>
<td>example4.com</td>';
$matches = array();
preg_match_all('/<td>(.*?\.com)<\/td>/i', $html, $matches);
var_dump($matches[1]);
prints:
array(3) {
[0]=>
string(12) "example1.com"
[1]=>
string(12) "example3.com"
[2]=>
string(12) "example4.com"
}
Something like that:
'<td>([^<]+\.com)</td>'
but you shouldn't use regular expressions to parse html.
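As one way to avoid regular expressions here, PHP's DOM extension can read the cells directly. This is only a sketch: it assumes the DOM extension is enabled and wraps the cells from the question in a hypothetical <table><tr> so the fragment is a complete table.

```php
<?php
// Assumed input: the <td> cells from the question, wrapped in a table row.
$html = '<table><tr>
<td>example1.com</td>
<td>example2.org</td>
<td>example3.com</td>
<td>example4.com</td>
</tr></table>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // keep parser warnings out of the output
$doc->loadHTML($html);
libxml_clear_errors();

$domains = [];
foreach ($doc->getElementsByTagName('td') as $cell) {
    $text = trim($cell->textContent);
    // Keep only cells whose text ends in ".com".
    if (substr($text, -4) === '.com') {
        $domains[] = $text;
    }
}

print_r($domains);
```

Because the text comes from the parsed cell, the result never contains the surrounding tags.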
You can use lookaheads and lookbehinds if you want to capture something but make sure it's surrounded by something else. Here we're capturing only the .com domains.
<?php
$html = '<td>example1.com</td>
<td>example2.org</td>
<td>example3.com</td>
<td>example4.com</td>';
$pattern = "!(?<=<td>)[^<]*\.com(?=</td>)!";
preg_match_all($pattern,$html,$matches);
$urls = $matches[0];
print_r($urls);
?>
Output
Array
(
[0] => example1.com
[1] => example3.com
[2] => example4.com
)

Mysqli array failing?

I have a block of text and a preg_match_all sequence to create an array ($matches) from certain elements in the text.
I then look up a corresponding entry for each string in the first array using mysqli and receive a second array - ($replacement).
I want to replace the first array's position in the original text with the second array, re-finding the first array and naming it $arraytoreplace. This is the code I use:
$replacement = array();
$myq = "SELECT code,title FROM messages WHERE ID=?";
if ($stmt = $mysqli2->prepare($myq)) {
foreach($matches[1] as $value) {
$stmt->bind_param("s", $value);
$stmt->execute();
// bind result variables
$stmt->bind_result($d,$cc);
if($stmt->fetch()) {
$replacement[] = '' . $cc . '';
}
}
$stmt->close();
}
If I use var_dump on the arrays before the str_replace like so:
var_dump($arraytoreplace);
var_dump($replacement);
I get:
array(4) {
[0]=> string(3) "111"
[1]=> string(2) "12"
[2]=> string(4) "1234"
[3]=> string(1) "0"
}
array(4) {
[0]=> string(5) "hello"
[1]=> string(2) "hi"
[2]=> string(3) "foo"
[3]=> string(3) "bar"
}
I then use str_replace to drop the second array into the first array's place in the original text.
Usually this is fine, but everything breaks once an array reaches around 10 strings.
Instead of Text hello text hi I'll get Text 11foo text foo1 or something equally bizarre.
Any ideas?
Edit: The code used for replacing the arrays as follows:
$messageprep = str_replace($arraytoreplace, $replacement, $messagebody);
$messagepostprep = str_replace('#', '', $messageprep);
echo '<div class="messagebody">' . $messagepostprep . '</div>';
It looks like you're getting partial replacements when a string of numbers is contained inside a longer string, e.g. 12 inside 1234.
You need to do your replacements with a regular expression on the boundary of the search string. Something like...
$text = preg_replace("/\b" . $replace . "\b/", $value, $text);
Another possible solution would be to consider changing the values to replace so that they are padded with zeros...
Array(
[0] => string(4) "0111"
[1] => string(4) "0012"
[2] => string(4) "1234"
[3] => string(4) "0000"
)
...and make sure that your search strings are also padded with zeros, because 0012 will never be confused with 12 and accidentally found in 0123.
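The word-boundary suggestion can also be applied to all search strings in a single pass. The sketch below is an assumption-laden illustration, not the asker's code: $search, $replacement and $text are hypothetical values modelled on the var_dump output in the question, and it needs PHP 7.4+ for arrow functions.

```php
<?php
// Hypothetical data mirroring the var_dump output in the question.
$search      = ['111', '12', '1234', '0'];
$replacement = ['hello', 'hi', 'foo', 'bar'];
$text        = 'Text 111 text 12 and 1234 then 0';

// Map each search string to its replacement.
$map = array_combine($search, $replacement);

// Build one pattern such as /\b(?:111|12|1234|0)\b/ with every entry escaped.
$alternatives = array_map(fn($s) => preg_quote($s, '/'), $search);
$pattern = '/\b(?:' . implode('|', $alternatives) . ')\b/';

// Each whole-word match is looked up in the map exactly once.
$result = preg_replace_callback($pattern, fn($m) => $map[$m[0]], $text);

echo $result, "\n";
```

Unlike str_replace() with arrays, which applies the replacements one after another, this scans the text once, so already replaced text is never searched again and 12 cannot match inside 1234.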

Why preg_match fails to get the result?

I have the below text displayed on the browser and trying to get the URL from the string.
string 1 = voice-to-text from #switzerland: http://bit.ly/lnpDC12D
When I try to use preg_match to get the URL, it fails:
$urlstr = "";
preg_match('/\b((?#protocol)https?|ftp):\/\/((?#domain)[-A-Z0-9.]+)((?#file)\/[-A-Z0-9+&##\/%=~_|!:,.;]*)?((?#parameters)\?[A-Z0-9+&##\/%=~_|!:,.;]*)?/i', $urlstr, $match);
echo $match[0];
I think #switzerland: contains one more http:// ... will that be a problem?
The above works perfectly for the string below:
voice-to-text: http://bit.ly/jDcXrZg
In this case I think parse_url will be a better choice than regex-based code. Something like this may work (assuming your URL always starts with http):
$str = "voice-to-text from #switzerland: http://bit.ly/lnpDC12D";
$pos = strrpos($str, "http://");
if ($pos !== false) {
var_dump(parse_url(substr($str, $pos)));
}
OUTPUT
array(3) {
["scheme"]=>
string(4) "http"
["host"]=>
string(6) "bit.ly"
["path"]=>
string(9) "/lnpDC12D"
}
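If the URL may also start with https, a small pattern can locate it before parse_url() takes over. This is a sketch under the assumption that the URL runs up to the next whitespace character; the variable names are illustrative only.

```php
<?php
$str = 'voice-to-text from #switzerland: http://bit.ly/lnpDC12D';

$parts = null;
// Find the first http:// or https:// URL, up to the next whitespace.
if (preg_match('~https?://\S+~i', $str, $m) === 1) {
    // Let parse_url() split the match into scheme, host and path.
    $parts = parse_url($m[0]);
}

var_dump($parts);
```

Keeping the regular expression this small leaves the actual URL dissection to parse_url(), which is the part that is easy to get wrong by hand.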
As far as I understand your request, here is a way to do it :
$str = 'voice-to-text from <a href="search.twitter.com/…;: http://bit.ly/lnpDC12D';
preg_match("~(bit\.ly/\S+)~", $str, $m);
print_r($m);
output:
Array
(
[0] => bit.ly/lnpDC12D
[1] => bit.ly/lnpDC12D
)

PHP json-like-string split

I have this $str value :
[{\"firstname\":\"guest1\",\"lastname\":\"one\",\"age\":\"22\",\"gender\":\"Male\"},{\"firstname\":\"guest2\",\"lastname\":\"two\",\"age\":\"22\",\"gender\":\"Female\"}]
I want to split it into the following:
firstname:guest1,lastname:one,age:22
firstname:guest2,lastname:two,age:22
I tried explode(",", $str), but it splits on every comma and I don't get what I want.
Can anyone help me?
As Josh K points out, that looks suspiciously like a JSON string. Maybe you should do a json_decode() on it to get the actual data you're looking for, all organized nicely into an array of objects.
EDIT: it seems your string is itself wrapped in double quotes ", so you'll have to trim those away before you'll be able to decode it as valid JSON:
$str_json = trim($str, '"');
$guests = json_decode($str_json);
var_dump($guests);
I get this output with the var_dump(), so it's definitely valid JSON here:
array(2) {
[0]=>
object(stdClass)#1 (4) {
["firstname"]=>
string(6) "guest1"
["lastname"]=>
string(3) "one"
["age"]=>
string(2) "22"
["gender"]=>
string(4) "Male"
}
[1]=>
object(stdClass)#2 (4) {
["firstname"]=>
string(6) "guest2"
["lastname"]=>
string(3) "two"
["age"]=>
string(2) "22"
["gender"]=>
string(6) "Female"
}
}
JSON (JavaScript Object Notation) is not CSV (comma-separated values). They're two vastly different data formats, so you can't parse one like the other.
To get your two strings, use a loop to get the keys and values of each object, and then build the strings with those values:
foreach ($guests as $guest) {
$s = array();
foreach ($guest as $k => $v) {
if ($k == 'gender') break;
$s[] = "$k:$v";
}
echo implode(',', $s) . "\n";
}
Output:
firstname:guest1,lastname:one,age:22
firstname:guest2,lastname:two,age:22
(Assuming you do want to exclude the genders for whatever reason; if not, delete the if ($k == 'gender') break; line.)
If you split on ,'s then you will get all the other crap that surrounds it. You would then have to strip that off.
Looks a lot like JSON data to me, where is this string coming from?
If that is valid json, just run it through json_decode() to get a native php array...
Note that you may need to run it through stripslashes() first, as it appears you may have magic_quotes_gpc set... You can conditionally call it by checking with the function get_magic_quotes_gpc (deprecated in PHP 7.4 and removed in PHP 8.0, so this check only applies to older versions):
if (get_magic_quotes_gpc()) {
$_POST['foo'] = stripslashes($_POST['foo']);
}
$array = json_decode($_POST['foo']);
You need to use the preg_replace function.
$ptn = '/,"gender":"\w+"\}\]?|"|\[?\{/';
$str = "[{\"firstname\":\"guest1\",\"lastname\":\"one\",\"age\":\"22\",\"gender\":\"Male\"},{\"firstname\":\"guest2\",\"lastname\":\"two\",\"age\":\"22\",\"gender\":\"Female\"}]";
$rpltxt = "";
echo preg_replace($ptn, $rpltxt, $str);
You can use a PHP regular expression tester to check the result.
Or use preg_match_all:
$ptn = '/(firstname)":"(\w+)","(lastname)":"(\w+)","(age)":"(\d+)/';
$str = "[{\"firstname\":\"guest1\",\"lastname\":\"one\",\"age\":\"22\",\"gender\":\"Male\"},{\"firstname\":\"guest2\",\"lastname\":\"two\",\"age\":\"22\",\"gender\":\"Female\"}]";
preg_match_all($ptn, $str, $matches);
print_r($matches);
I still haven't managed to retrieve the JSON.
I var_dump the trimmed value like this:
$str_json = trim($userdetails->other_guests, '"');
$guests = json_decode($str_json);
var_dump($str_json,$guests);
where $userdetails->other_guests is the $str value I had before.
I get the following output:
string(169) "[{\"firstname\":\"guest1\",\"lastname\":\"one\",\"age\":\"22\",\"gender\":\"Male\"},{\"firstname\":\"guest2\",\"lastname\":\"two\",\"age\":\"23\",\"gender\":\"Female\"}]"
NULL
This means the decoded JSON is NULL... strange.
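A likely explanation, in line with the stripslashes() suggestion in the earlier answer, is that the stored value contains literal backslashes before each quote, which is not valid JSON. The sketch below assumes that is the case; $raw is a hypothetical, shortened stand-in for $userdetails->other_guests.

```php
<?php
// Assumed raw value: JSON text with a literal backslash before every quote.
$raw = '[{\"firstname\":\"guest1\",\"lastname\":\"one\",\"age\":\"22\",\"gender\":\"Male\"}]';

// Decoding the raw value fails because \" is not valid outside a JSON string.
var_dump(json_decode($raw));   // NULL

// Remove the backslashes first, then decode into associative arrays.
$guests = json_decode(stripslashes($raw), true);
if ($guests === null) {
    echo 'Decode failed: ', json_last_error_msg(), "\n";
}

$lines = [];
foreach ($guests ?? [] as $guest) {
    unset($guest['gender']);           // the question leaves gender out
    $pairs = [];
    foreach ($guest as $key => $value) {
        $pairs[] = "$key:$value";
    }
    $lines[] = implode(',', $pairs);
}

echo implode("\n", $lines), "\n";
```

Note that stripslashes() also removes backslashes that are legitimate JSON escapes inside values, so fixing the place where the extra escaping is added would be the cleaner long-term solution.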
