Why does preg_match fail to get the result? - php

I have the text below displayed in the browser and am trying to get the URL out of the string.
string 1 = voice-to-text from #switzerland: http://bit.ly/lnpDC12D
When I try to extract the URL with preg_match, it fails:
$urlstr = "";
preg_match('/\b((?#protocol)https?|ftp):\/\/((?#domain)[-A-Z0-9.]+)((?#file)\/[-A-Z0-9+&@#\/%=~_|!:,.;]*)?((?#parameters)\?[A-Z0-9+&@#\/%=~_|!:,.;]*)?/i', $urlstr, $match);
echo $match[0];
I think the #switzerland part contains one more http:// ... will that be a problem?
The pattern above works perfectly for the string below:
voice-to-text: http://bit.ly/jDcXrZg

In this case I think parse_url is a better choice than regex-based code. Something like this may work (assuming your URL always starts with http):
$str = "voice-to-text from #switzerland: http://bit.ly/lnpDC12D";
$pos = strrpos($str, "http://");
if ($pos !== false) { // strrpos() returns false when "http://" is not found
var_dump(parse_url(substr($str, $pos)));
}
OUTPUT
array(3) {
["scheme"]=>
string(4) "http"
["host"]=>
string(6) "bit.ly"
["path"]=>
string(9) "/lnpDC12D"
}

As far as I understand your request, here is a way to do it:
$str = 'voice-to-text from <a href="search.twitter.com/…;: http://bit.ly/lnpDC12D';
preg_match("~(bit\.ly/\S+)~", $str, $m); // escape the dot so it only matches a literal "."
print_r($m);
output:
Array
(
[0] => bit.ly/lnpDC12D
[1] => bit.ly/lnpDC12D
)
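If the short link is not always on bit.ly, a more general pattern may help. The following is only a minimal sketch, assuming PHP 7.1 or later and assuming the URL is delimited by whitespace; first_url() is a hypothetical helper name, not something from the answers above.

```php
<?php
// Hypothetical helper (not from the original answers): returns the first
// http/https URL found in $text, or null when there is none.
function first_url(string $text): ?string
{
    // \S+ consumes everything up to the next whitespace character.
    if (preg_match('~https?://\S+~i', $text, $m) === 1) {
        // Trim punctuation that commonly trails a URL in running text.
        return rtrim($m[0], '.,;:!?)');
    }
    return null;
}

$str = 'voice-to-text from #switzerland: http://bit.ly/lnpDC12D';
var_dump(first_url($str));
```

Note that if the text still contains an HTML anchor with its own href, the first match would start at that href rather than at the short link. In that case it may be better to collect all matches with preg_match_all and pick the last one.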

Related

remove part of string after 4th slash in php

I have an array that contains links, and I am trying to cut each link off after the 4th slash.
[0]=>
string(97) "https://www.nowhere.com./downtoalley/?iad2=sumai-pickup&argument=CH4fRVnN&dmai=shimokita4040/outline"
[1]=>
string(105) "https://www.example.com./wowar-waseda/?iad2=sumai-pickup&argument=CH4fRVnN&dmai=shinjuku-w25861/outline"
[2]=>
string(91) "https://www.hey.com./gotoashbourn/?iad2=sumai-pickup&argument=CH4fRVnN&dmai=kinuta7429/outline"
expected output is like this:
[0]=>
string(97) "https://www.nowhere.com./downtoalley/"
[1]=>
string(105) "https://www.example.com./wowar-waseda/"
[2]=>
string(91) "https://www.hey.com./gotoashbourn/"
The lengths are different, so I can't use strtok. Are there any other options for this?
Try the following code:
<?php
$arr = array(
0 => "https://www.nowhere.com./downtoalley/?iad2=sumai-pickup&argument=CH4fRVnN&dmai=shimokita4040/outline",
1 => "https://www.example.com./wowar-waseda/?iad2=sumai-pickup&argument=CH4fRVnN&dmai=shinjuku-w25861/outline",
2 => "https://www.hey.com./gotoashbourn/?iad2=sumai-pickup&argument=CH4fRVnN&dmai=kinuta7429/outline");
$resultArray = array();
foreach($arr as $str) {
array_push($resultArray, current(explode("?",$str)));
}
print_r($resultArray);
?>
You can use preg_replace to remove everything after the fourth / in each string, using this regex:
^(([^/]*/){4}).*$
It matches four runs of non-/ characters, each followed by a /, and collects that text in capture group 1. Replacing the whole match with $1 then leaves only the text up to and including the 4th /:
$strings = array("https://www.nowhere.com./downtoalley/?iad2=sumai-pickup&argument=CH4fRVnN&dmai=shimokita4040/outline",
"https://www.example.com./wowar-waseda/?iad2=sumai-pickup&argument=CH4fRVnN&dmai=shinjuku-w25861/outline",
"https://www.hey.com./gotoashbourn/?iad2=sumai-pickup&argument=CH4fRVnN&dmai=kinuta7429/outline");
print_r(array_map(function ($v) { return preg_replace('#^(([^/]*/){4}).*$#', '$1', $v); }, $strings));
Output:
Array (
[0] => https://www.nowhere.com./downtoalley/
[1] => https://www.example.com./wowar-waseda/
[2] => https://www.hey.com./gotoashbourn/
)
There is no built-in function that does this directly. You can use PHP code like the following:
$explodingLimit = 4;
$string = "https://www.nowhere.com./downtoalley/?iad2=sumai-pickup&argument=CH4fRVnN&dmai=shimokita4040/outline";
$stringArray = explode ("/", $string);
$neededElements = array_slice($stringArray, 0, $explodingLimit);
echo implode("/", $neededElements);
I wrote this for a single string; you can apply it to each element of your array. Note that the result has no trailing '/', so append one if you need it. Hope it helps.
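For completeness, the same cut can be done without a regex and without losing the trailing slash by walking the string with strpos. This is only a sketch; cut_after_nth_slash() is a made-up name, and it assumes that a string with fewer than four slashes should be returned unchanged.

```php
<?php
// Hypothetical helper: keep everything up to and including the $n-th "/".
// Returns the input unchanged when it contains fewer than $n slashes.
function cut_after_nth_slash(string $url, int $n = 4): string
{
    $pos = -1;
    for ($i = 0; $i < $n; $i++) {
        // Search for the next slash after the previous one.
        $pos = strpos($url, '/', $pos + 1);
        if ($pos === false) {
            return $url;
        }
    }
    return substr($url, 0, $pos + 1);
}

$links = [
    'https://www.nowhere.com./downtoalley/?iad2=sumai-pickup&argument=CH4fRVnN&dmai=shimokita4040/outline',
    'https://www.hey.com./gotoashbourn/?iad2=sumai-pickup&argument=CH4fRVnN&dmai=kinuta7429/outline',
];
print_r(array_map('cut_after_nth_slash', $links));
```

Because the function has a default for $n, it can be passed to array_map by name as shown.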

php5 copy sub-strings from string separated by space?

I am using PHP 5 and have a script that returns the IP addresses of the clients.
I execute the script with shell_exec(), and the output looks like this: "192.168.10.40 192.168.10.41 ".
Now I need to store these addresses in an array. I tried preg_match(), but it is not working.
Here is the code using preg_match():
$test = shell_exec("/www/dhcp.sh");
$pattern='/([^ ]*) /';
preg_match($pattern, $test, $new);
preg_match() returns 0.
Here is the version using explode():
$test = shell_exec("/www/dhcp.sh");
var_dump( explode(' ', $test ) );
With explode() I get this result:
array(1) { [0]=> string(28) "192.168.10.40 192.168.10.41 " }
Can anyone tell me how I can split the string into an array?
Regards,
Sowmya
You can use explode to split your string:
var_dump(explode(' ', '192.168.10.40 192.168.10.41'));
which gives you
array(2) {
[0]=>
string(13) "192.168.10.40"
[1]=>
string(13) "192.168.10.41"
}
http://php.net/manual/fr/function.explode.php
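The question's explode(' ', ...) returned a single 28-character element, which suggests the separators in the script output are not plain spaces. A browser shows newlines and tabs as spaces, so that is one possible cause. The sketch below assumes newline separators, which is a guess about what /www/dhcp.sh prints, and splits on any whitespace instead.

```php
<?php
// Stand-in for shell_exec("/www/dhcp.sh"); the newline separators are an
// assumption about what the script really emits.
$test = "192.168.10.40\n192.168.10.41\n";

// Split on runs of any whitespace (spaces, tabs, newlines) and drop the
// empty piece that the trailing separator would otherwise produce.
$ips = preg_split('/\s+/', $test, -1, PREG_SPLIT_NO_EMPTY);
print_r($ips);
```

This also works when the separators really are spaces, so it should be safe to use either way.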

Extract email address from string - php

I want to extract the email address from a string, for example:
<?php // code
$string = 'Ruchika <ruchika@example.com>';
?>
From the above string I only want to get the email address ruchika@example.com.
Please recommend how to achieve this.
Try this:
<?php
$string = 'Ruchika < ruchika@example.com >';
$pattern = '/[a-z0-9_\-\+\.]+@[a-z0-9\-]+\.([a-z]{2,4})(?:\.[a-z]{2})?/i';
preg_match_all($pattern, $string, $matches);
var_dump($matches[0]);
?>
Second method:
<?php
$text = 'Ruchika < ruchika@example.com >';
preg_match_all("/[\._a-zA-Z0-9-]+@[\._a-zA-Z0-9-]+/i", $text, $matches);
print_r($matches[0]);
?>
Parsing e-mail addresses properly is a huge job and results in a very complicated regular expression. For example, consider this regular expression for matching an RFC 822 e-mail address: http://www.ex-parrot.com/pdw/Mail-RFC822-Address.html
Amazing, right?
Instead, there is a PHP function for this, mailparse_rfc822_parse_addresses(), provided by the mailparse PECL extension.
It takes a string as its argument and returns an array of associative arrays with the keys display, address and is_group.
So,
$to = 'Wez Furlong <wez@example.com>, doe@example.com';
var_dump(mailparse_rfc822_parse_addresses($to));
would yield:
array(2) {
[0]=>
array(3) {
["display"]=>
string(11) "Wez Furlong"
["address"]=>
string(15) "wez@example.com"
["is_group"]=>
bool(false)
}
[1]=>
array(3) {
["display"]=>
string(15) "doe@example.com"
["address"]=>
string(15) "doe@example.com"
["is_group"]=>
bool(false)
}
}
Try this code:
<?php
function extract_emails_from($string){
preg_match_all("/[\._a-zA-Z0-9-]+@[\._a-zA-Z0-9-]+/i", $string, $matches);
return $matches[0];
}
$text = "blah blah blah blah blah blah email2@address.com";
$emails = extract_emails_from($text);
print(implode("\n", $emails));
?>
This will work.
Thanks.
This is based on Niranjan's response and assumes the email address is enclosed within < and > characters. Instead of using a regular expression to match the address itself, I grab the text between the < and > characters; if there is none, I fall back to the whole string. I do no validation of the email address here; whether you need it depends on your scenario.
<?php
$string = 'Ruchika <ruchika@example.com>';
$pattern = '/<(.*?)>/i';
preg_match_all($pattern, $string, $matches);
var_dump($matches);
$email = $matches[1][0] ?? $string;
echo $email;
?>
Of course, if my assumption isn't correct, then this approach will fail. But based on your input, I believe you wanted to extract emails enclosed within < and > chars.
This function extracts all email addresses from a string and returns them in an array.
function extract_emails_from($string){
preg_match_all( '/([\w+\.]*\w+@[\w+\.]*\w+[\w+\-\w+]*\.\w+)/is', $string, $matches );
return $matches[0];
}
This works well and is minimal, assuming that when a < is present the string ends with >:
$email = strpos($from, '<') !== false ? substr($from, strpos($from, '<') + 1, -1) : $from;
Use (my) function getEmailArrayFromString to easily extract email addresses from a given string.
<?php
function getEmailArrayFromString($sString = '')
{
$sPattern = '/[\._\p{L}\p{M}\p{N}-]+@[\._\p{L}\p{M}\p{N}-]+/u';
preg_match_all($sPattern, $sString, $aMatch);
$aMatch = array_keys(array_flip(current($aMatch)));
return $aMatch;
}
// Example
$sString = 'foo@example.com XXX bar@example.com XXX <baz@example.com>';
$aEmail = getEmailArrayFromString($sString);
/**
* array(3) {
[0]=>
string(15) "foo@example.com"
[1]=>
string(15) "bar@example.com"
[2]=>
string(15) "baz@example.com"
}
*/
var_dump($aEmail);
Based on Priya Rajaram's code, I have tweaked the function a little so that each email address appears only once.
If, for example, an HTML document is parsed, you usually get every address twice, because it also appears in the mailto link.
function extract_emails_from($string){
preg_match_all("/[\._a-zA-Z0-9-]+@[\._a-zA-Z0-9-]+/i", $string, $matches);
return array_values(array_unique($matches[0]));
}
This will work even with subdomains. It extracts all emails from the text.
$pattern = "/[a-zA-Z0-9-_]{1,}@[a-zA-Z0-9-_]{1,}(\.[a-zA-Z]{1,}){1,}/";
preg_match_all ($pattern , $string, $matches);
print_r($matches);
$matches[0] has all emails.
Array
(
[0] => Array
(
[0] => clotdesormakilgehr@prednisonecy.com
[1] => **********@******.co.za.com
[2] => clotdesormakilgehr@prednisonecy.com
[3] => clotdesormakilgehr@prednisonecy.mikedomain.com
[4] => clotdesormakilgehr@prednisonecy.com
)
[1] => Array
(
[0] => .com
[1] => .com
[2] => .com
[3] => .com
[4] => .com
)
)
A relatively straightforward approach is to use PHP's built-in functions for splitting text into words and validating e-mail addresses:
function fetchEmails($text) {
$words = str_word_count($text, 1, '.@-_1234567890');
return array_filter($words, function($word) {return filter_var($word, FILTER_VALIDATE_EMAIL);});
}
This returns the e-mail addresses found in $text.
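Several of the answers above can be combined: collect candidates with a deliberately loose pattern, let filter_var decide which ones are valid, then remove duplicates. This is only a sketch using real `@` addresses; find_valid_emails() is a hypothetical name, and the choice of delimiter characters in the pattern is an assumption.

```php
<?php
// Hypothetical helper combining ideas from the answers above: collect
// candidates with a loose regex, validate each one, drop duplicates.
function find_valid_emails(string $text): array
{
    // A candidate is any run of characters around an "@" that contains no
    // whitespace, angle brackets, quotes, commas or semicolons.
    preg_match_all('/[^\s<>"\',;]+@[^\s<>"\',;]+/', $text, $m);

    $valid = array_filter($m[0], function ($candidate) {
        return filter_var($candidate, FILTER_VALIDATE_EMAIL) !== false;
    });

    // array_filter and array_unique keep the original keys, so reindex.
    return array_values(array_unique($valid));
}

$text = 'Ruchika <ruchika@example.com>, foo@example.com and again foo@example.com; not-an-email@';
print_r(find_valid_emails($text));
```

One limitation of this sketch: an address directly followed by a sentence-ending period is captured with the period attached and then rejected by filter_var, so such text would need an extra rtrim step.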

Get a list of domains from a table via regex

I have a list of domains in a table, along with more info:
<td>example1.com</td>
<td>example2.org</td>
<td>example3.com</td>
<td>example4.com</td>
I need to get the .com domains using a regex. I tried something like:
'<td>(.............).com'
But what can I write instead of the dots? What do I need to use?
I need to get the data between the tags: <td>domain.com</td> -> domain.com
'<td>([^<]+\.com)</td>'
- this is better, but I need the result without the tags
<?php
$html = '<td>example1.com</td>
<td>example2.org</td>
<td>example3.com</td>
<td>example4.com</td>';
$matches = array();
preg_match_all('/<td>(.*?\.com)<\/td>/i', $html, $matches); // \. matches a literal dot
var_dump($matches[1]);
prints:
array(3) {
[0]=>
string(12) "example1.com"
[1]=>
string(12) "example3.com"
[2]=>
string(12) "example4.com"
}
Something like this:
'<td>([^<]+\.com)</td>'
but you shouldn't use regular expressions to parse HTML.
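As an illustration of the non-regex route, here is a minimal sketch using PHP's DOM extension. It assumes the extension is installed and that the cells sit inside a proper table. The table and tr wrappers are my addition, since the question only shows the td fragments.

```php
<?php
// The <table>/<tr> wrapper is an assumption; the question shows bare cells.
$html = '<table><tr><td>example1.com</td><td>example2.org</td>'
      . '<td>example3.com</td><td>example4.com</td></tr></table>';

$doc = new DOMDocument();
// Suppress warnings that libxml raises for imperfect real-world markup.
@$doc->loadHTML($html);

$domains = [];
foreach ($doc->getElementsByTagName('td') as $cell) {
    $value = trim($cell->textContent);
    // Keep only cells whose text ends in ".com".
    if (substr($value, -4) === '.com') {
        $domains[] = $value;
    }
}
print_r($domains);
```

The values come back without the surrounding tags, which is what the question asked for.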
You can use lookaheads and lookbehinds when you want to match something only if it is surrounded by something else, without including the surroundings in the match. Here we match only the .com domains.
<?php
$html = '<td>example1.com</td>
<td>example2.org</td>
<td>example3.com</td>
<td>example4.com</td>';
$pattern = "!(?<=<td>)[^<]*\.com(?=</td>)!";
preg_match_all($pattern,$html,$matches);
$urls = $matches[0];
print_r($urls);
?>
Output
Array
(
[0] => example1.com
[1] => example3.com
[2] => example4.com
)

preg_match url get parameter parsing

I am trying to extract the latitude and longitude from a Google Maps URL.
Such a URL could look like this:
$url = 'http://maps.google.com/?ie=UTF8&ll=39.811856,11.309322&spn=0,0.485802&t=h&z=12&layer=c&cbll=39.311856,11.519322&panoid=1hltSSOoqv5H1dVKZWFkaA&cbp=13,117.95,,1,16.71';
As you can see, there are several location sets in the URL. The cbll variable seems to be the right one in my case.
This is what I came up with:
preg_match("~&cbll=(-?\d+\.?\d*),(-?\d+\.?\d*)~", $url, $matches);
The problem: The preg_match seems to match the first '&ll=' in the url, and not the cbll part. I get the "&ll=" part of the url as result.
It works for me. If I var_dump($matches), I see this:
array(3) {
[0]=>
string(25) "&cbll=39.311856,11.519322"
[1]=>
string(9) "39.311856"
[2]=>
string(9) "11.519322"
}
parse_str(parse_url($url, PHP_URL_QUERY), $vars);
list($a, $b) = explode(",", $vars['cbll']);
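The two lines above skip the regex entirely: parse_url pulls out the query string and parse_str turns it into an array keyed by parameter name, so cbll can never be confused with ll. A slightly more defensive sketch follows. It assumes PHP 7.1 or later, and coords_from_url() is a made-up helper name.

```php
<?php
$url = 'http://maps.google.com/?ie=UTF8&ll=39.811856,11.309322&spn=0,0.485802&t=h&z=12&layer=c&cbll=39.311856,11.519322&panoid=1hltSSOoqv5H1dVKZWFkaA&cbp=13,117.95,,1,16.71';

// Hypothetical helper: returns [lat, lng] as floats, or null when the
// parameter is absent or is not a pair of numbers.
function coords_from_url(string $url, string $param = 'cbll'): ?array
{
    $query = parse_url($url, PHP_URL_QUERY);
    if (!is_string($query)) {
        return null; // no query string at all
    }
    parse_str($query, $vars);
    if (!isset($vars[$param]) || !is_string($vars[$param])) {
        return null;
    }
    $parts = explode(',', $vars[$param]);
    if (count($parts) !== 2 || !is_numeric($parts[0]) || !is_numeric($parts[1])) {
        return null;
    }
    return [(float) $parts[0], (float) $parts[1]];
}

var_dump(coords_from_url($url));
```

Passing 'll' as the second argument would return the other coordinate pair from the same URL.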
