PHP extract one part of a string - php

I have to extract the email from the following string:
$string = 'other_text_here to=<my.email#domain.fr> other_text_here <my.email#domain.fr> other_text_here';
The server sends me logs in this kind of format. How can I get the email into a variable without the "to=<" and ">"?
Update: I've updated the question. The email can appear several times in the string, and the regular expression doesn't work well in that case.

You can try with a more restrictive Regex.
$string = 'other_text_here to=<my.email#domain.fr> other_text_here';
preg_match('/to=<([A-Z0-9._%+-]+#[A-Z0-9.-]+\.[A-Z]{2,4})>/i', $string, $matches);
echo $matches[1];

A simple regular expression should be able to do it:
$string = 'other_text_here to=<my.email#domain.fr> other_text_here';
preg_match('/<([^>]*)>/', $string, $r); // [^>]* stops at the first ">", even if more <...> groups follow
$email = $r[1];
When you echo $email, you get "my.email#domain.fr".

Try this:
<?php
$str = "The day is <tag> beautiful </tag> isn't it? ";
preg_match("'<tag>(.*?)</tag>'si", $str, $match);
$output = array_pop($match);
echo $output;
?>
output:
beautiful

A regular expression is easy if you are certain that < and > aren't used anywhere else in the string:
$emails = array();
if (preg_match_all('/<(.*?)>/', $string, $matches)) {
$emails = $matches[1]; // Keep only the captured groups, not the full matches in $matches[0]
}
// $emails is now an array of emails if any exist in the string
The parentheses tell it to capture for the $matches array. The .* picks up any characters and the ? makes it non-greedy, so the match stops at the first > instead of running on to the last one.
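For the updated question, where the same address shows up more than once in the line, one option is to anchor on the literal to=< prefix so that only the recipient field is captured. This is a minimal sketch that assumes the log format shown in the question; the helper name extract_recipient is hypothetical.

```php
<?php
// Hypothetical helper: returns the address that follows "to=<", or null if absent.
function extract_recipient(string $line): ?string
{
    // [^>]+ stops at the first ">", so later <...> groups in the line are ignored.
    if (preg_match('/to=<([^>]+)>/', $line, $m) === 1) {
        return $m[1];
    }
    return null;
}

$string = 'other_text_here to=<my.email#domain.fr> other_text_here <my.email#domain.fr> other_text_here';
echo extract_recipient($string), "\n";
```

Returning null rather than an empty string makes a log line with no recipient easy to tell apart from one with an empty to=<> field.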

Related

PHP replace string for other string

I have a question. It may be something simple, but I don't know how to solve it.
I get a string in PHP:
$string = "[link=someUrl] Text [/link]";
And I would like to turn this string into:
"<a href='someUrl'> Text </a>"
How do I make this change, and how can I do the opposite?
Keep in mind that the string is part of a longer text containing more strings of this kind.
Short preg_replace solution:
$s = "[link=someUrl] Text [/link]";
$result = preg_replace('#\[[^=]+=([^]]+)\]([^[]+).*#', '<a href=\'$1\'>$2</a>', $s);
print_r($result);
The output (as web page source code):
<a href='someUrl'> Text </a>
You can use the following code
function transformText($string) {
preg_match("/\[link\=([^\]]*)\](.*?)\[\/link]/", $string, $matches);
$someUrl = $matches[1];
$text = $matches[2];
$newString = "<a href='$someUrl'>$text</a>";
return $newString;
}
$string = "[link=someUrl] Text [/link]"; // Test string
echo (transformText($string));
Live demo for the regex used : https://regex101.com/r/tzVfmH/4
Note: The above code works only if there's a single [link], [/link] pair.
If multiple occurrences need to be handled, it's better to use a regex search and replace with PHP's preg_replace, as suggested in RomanPerekhrest's answer.
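To handle several [link] pairs in one text, and to go back the other way as the question asks, something along these lines could work. It is only a sketch: the function names are hypothetical, the reverse direction assumes the exact <a href='...'> shape produced by the forward direction, and no HTML escaping is done, so it is not safe for untrusted input as written.

```php
<?php
// Hypothetical helpers; the names are illustrative only.
function bbcode_to_html(string $text): string
{
    // The lazy (.*?) ends at the nearest [/link], so each pair is replaced on its own.
    return preg_replace('#\[link=([^\]]+)\](.*?)\[/link\]#s', '<a href=\'$1\'>$2</a>', $text);
}

function html_to_bbcode(string $text): string
{
    // Reverse direction, assuming anchors written exactly as bbcode_to_html() writes them.
    return preg_replace("#<a href='([^']+)'>(.*?)</a>#s", '[link=$1]$2[/link]', $text);
}

$s = "[link=someUrl] Text [/link] and [link=otherUrl] More [/link]";
$html = bbcode_to_html($s);
echo $html, "\n";
echo html_to_bbcode($html), "\n";
```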

PHP preg_match and regular expression

I'm new to PHP.
I am trying to get the topic number from a link, but it doesn't work.
echo $topicsave prints nothing.
This is my code:
$data = '
test_curl
';
preg_match_all('/\<a[^\?]+\/([^\"]+)\.\s*\>test_curl\<\/a\>/', $data, $match);
echo '<pre>',htmlspecialchars(print_r($match, true)),'</pre>';
if( count($match[0])){
foreach($match[1] as $vl){
preg_match_all('/topic\,([0-9]+\.[0-9]+)/', $vl, $m1);
if(count($m1[1]))
$topicsave = $m1[1][0];
echo $topicsave;
}
}
I want to get the topic number 40500. Please help me. The topic number varies, for example 120, 2536 or 12456.
Thank you.
To extract the topic number from the link, you can use the following regex.
Regex: topic,(\d+(\.\d+)*)\.html
Explanation: The link is fed to the regex, which captures the number between "topic," and ".html".
Regex101 Demo
PHP demo on Ideone
You can do it with this:
$re = "/topic,(?'topic'\\d+)/";
$str = "test_curl";
preg_match($re, $str, $matches);
echo $matches['topic'];
Which will output:
40500
What I used here (?'topic'\\d+) is a named group. It allows you to retrieve data from your matches with the name you used (here topic).
If you need to do live tests, Regex 101 is great.
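The named-group approach can be tried end to end as follows. The original link markup is not shown in the question, so the URL below is a made-up stand-in; only the "topic,NUMBER" part is taken from the answers above.

```php
<?php
// Hypothetical link: the real markup was not shown, so this URL shape is an assumption.
$data = '<a href="http://example.com/index.php/topic,40500.0.html">test_curl</a>';

// (?<topic>\d+) captures the digits after "topic," under the name "topic".
if (preg_match('/topic,(?<topic>\d+)/', $data, $matches) === 1) {
    echo $matches['topic'], "\n"; // prints 40500
}
```

Checking the return value of preg_match() before reading $matches avoids an undefined index notice when the link has no topic number.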
Try this solution:
$data = 'test_curl';
preg_match_all('/topic,(.*?)\..*\.html/s', $data, $match);
echo $match[1][0]; // Output: 40500

Simple str_replace() making things wrong - WordPress

I need some special filtering to certain text all over my website, like below:
function special_text( $content ) {
$search_for = 'specialtext';
$replace_with = '<span class="special-text"><strong>special</strong>text</span>';
return str_replace( $search_for, $replace_with, $content );
}
add_filter('the_content', 'special_text', 99);
It works very well, BUT...
if the content contains a link like <a title="specialtext" href="http://specialtext.com">specialtext</a>, then the title and href texts are also changed and the link breaks.
How can I make an exception there?
Is there a way I can put some exceptions in an array and have str_replace() simply skip them?
You should use regular expression and use function preg_replace() to replace matched string. Here is the full implementation of your special_text() function.
function special_text( $content ) {
$search_for = 'specialtext';
$replace_with = '<span class="special-text"><strong>special</strong>text</span>';
return preg_replace( '/<a.*?>(*SKIP)(*F)|'.$search_for.'/m', $replace_with, $content );
}
In this regular expression, <a.*?> first matches an opening <a ...> tag, including its attributes. The (*SKIP)(*F) verbs then discard that match and stop the engine from retrying inside it. The alternative after the | matches $search_for everywhere else (in your case, specialtext).
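A variation on the same idea is sketched below. It differs from the answer above in two ways, both of which are my assumptions rather than part of the original: it skips every tag (<[^>]*>) instead of only <a ...> tags, and it passes the search term through preg_quote() so that special characters in it are treated literally. The function name is hypothetical.

```php
<?php
// Hypothetical variant of special_text(): skips the inside of every tag, not just <a ...>.
function special_text_sketch(string $content): string
{
    $search_for   = 'specialtext';
    $replace_with = '<span class="special-text"><strong>special</strong>text</span>';

    // Left branch: match a whole tag, then (*SKIP)(*F) throws it away without retrying inside it.
    // Right branch: the literal search term, escaped for use inside /.../ delimiters.
    $pattern = '/<[^>]*>(*SKIP)(*F)|' . preg_quote($search_for, '/') . '/';

    return preg_replace($pattern, $replace_with, $content);
}

echo special_text_sketch('<a title="specialtext" href="http://specialtext.com">specialtext</a>'), "\n";
```

The title and href attributes are left alone, while the visible link text is still wrapped in the span.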
Jezzabeanz nearly got it, but you can simplify it further with:
return preg_replace("/^def/", $replace_with, $content);
If you just want to change the text between the a tags, then a regular expression works wonders.
Here is something I used when I was pulling data from emails sent to me:
(?<=">)(.*?\w)(?=<\/a)
returns "specialtext"
It also returns "specialtext test" if there is whitespace.
Regular expressions are definitely the way to go.
<?php
$subject = "abcdef";
$pattern = '/^def/';
preg_match($pattern, $subject, $matches, PREG_OFFSET_CAPTURE, 3);
print_r($matches);
?>
Source
And then do a replace on the returned matches.

php preg_match_all preg_replace array issue

I'm working on a bb-code replacement function when a user wants to post a smiley.
The problem is that if someone uses a bb-code smiley that doesn't exist, the result is an empty post, because the browser cannot display the non-existent emoticon.
Here's my code so far:
// DO [:smiley:]
$convert_smiley = preg_match_all('/\[:(.*?):\]/i', $string, $matches);
if( $convert_smiley )
{
$string = preg_replace('/\[:(.*?):\]/i', "<i class='icon-smiley-$1'></i>", $string, $convert_smiley);
}
return $string;
The bb-code for a smiley usually looks like [:smile:] or like [:sad:] or like [:happy:] and so on.
The code above works well until someone posts a bb-code that doesn't exist, so what I am asking for is a fix for non-existent smileys.
Is it possible, for example, to create an array like array('smile', 'sad', 'happy') so that only bb-codes matching an entry in this array are converted?
So, after the fix, posting [:test:] or just [::] should not be converted and should be posted as the original text, while [:happy:] will be converted.
Any ideas? Thanks!
I put your possible smileys in non-capturing groups, separated by the alternation symbol |, in a regexp:
<?php
$string = 'looks like [:smile:] or like [:sad:] or like [:happy:] [:bad-smiley:]';
$string = preg_replace('/\[:((?:smile)|(?:sad)|(?:happy)):\]/i', "<i class='icon-smiley-$1'></i>", $string);
print $string;
Output:
looks like <i class='icon-smiley-smile'></i> or like <i class='icon-smiley-sad'></i> or like <i class='icon-smiley-happy'></i> [:bad-smiley:]
[:bad-smiley:] is ignored.
A simple workaround:
$string ="[:clap:]";
$convert_smiley = preg_match_all('/\[:(.*?):\]/i', $string, $matches);
$emoticons = array("smile","clap","sad"); //array of supported smileys
// Note: only the first smiley in the string is checked, so this suits single-smiley input
if($convert_smiley && in_array($matches[1][0],$emoticons)){
//smiley exists
$string = preg_replace('/\[:(.*?):\]/i', "<i class='icon-smiley-$1'></i>", $string, $convert_smiley);
}
else{
//smiley doesn't exist
}
Well, the first issue is that you are setting $convert_smiley to the return value of preg_match_all() (the number of matches, or false on error) instead of parsing the results. Here is how I reworked your code:
// Test string.
$string = ' [:happy:] [:frown:] [:smile:] [:foobar:]';
// Set a list of valid smileys.
$valid_smileys = array('smile', 'sad', 'happy');
// Run `preg_match_all` to collect every smiley code.
preg_match_all('/\[:(.*?):\]/i', $string, $matches);
// Check if there are matches ($matches itself is never empty, so test the capture group).
if (!empty($matches[1])) {
    // Loop through the results
    foreach ($matches[1] as $smiley_value) {
        // Validate them against the valid smiley list.
        $pattern = $replacement = '';
        if (in_array($smiley_value, $valid_smileys)) {
            $pattern = sprintf('/\[:%s:\]/i', preg_quote($smiley_value, '/'));
            $replacement = sprintf("<i class='icon-smiley-%s'></i>", $smiley_value);
            $string = preg_replace($pattern, $replacement, $string);
        }
    }
}
echo 'Test Output:';
echo htmlentities($string);
Just note that I chose to use sprintf() for the formatting of content & set $pattern and $replacement as variables. I also chose to use htmlentities() so the HTML DOM elements can easily be read for debugging.
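The whitelist check can also be done in a single pass with preg_replace_callback(), which lets each match be accepted or left untouched individually. This is a sketch under the assumptions of the question (codes look like [:name:] and the icon class is icon-smiley-name); the function name convert_smileys is hypothetical.

```php
<?php
// Hypothetical helper: converts only whitelisted [:name:] codes, leaves the rest as typed.
function convert_smileys(string $text, array $validSmileys): string
{
    return preg_replace_callback(
        '/\[:(.*?):\]/',
        function (array $m) use ($validSmileys): string {
            // $m[0] is the full "[:name:]" text, $m[1] is just "name".
            if (in_array($m[1], $validSmileys, true)) {
                return "<i class='icon-smiley-{$m[1]}'></i>";
            }
            return $m[0]; // unknown or empty code: keep the original text
        },
        $text
    );
}

echo convert_smileys('[:happy:] [:test:] [::]', array('smile', 'sad', 'happy')), "\n";
```

Because the callback decides per match, an invalid code elsewhere in the post does not affect the valid ones.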

In PHP, how do I extract multiple e-mail addresses from a block of text and put them into an array?

I have a block of text from which I want to extract the valid e-mail addresses and put them into an array. So far I have...
$string = file_get_contents("example.txt"); // Load text file contents
$matches = array(); //create array
$pattern = '/[A-Za-z0-9_-]+#[A-Za-z0-9_-]+\.([A-Za-z0-9_-][A-Za-z0-9_]+)/'; //regex for pattern of e-mail address
preg_match($pattern, $string, $matches); //find matching pattern
However, I am getting an array with only one address. Therefore, I am guessing I need to cycle through this process somehow. How do I do that?
You're pretty close, but the regex wouldn't catch all email formats, and you don't need to specify A-Za-z, you can just use the "i" flag to mark the entire expression as case insensitive. There are email format cases that are missed (especially subdomains), but this catches the ones I tested.
$string = file_get_contents("example.txt"); // Load text file contents
// don't need to preassign $matches, it's created dynamically
// this regex handles more email address formats like a+b#google.com.sg, and the i makes it case insensitive
$pattern = '/[a-z0-9_\-\+]+#[a-z0-9\-]+\.([a-z]{2,3})(?:\.[a-z]{2})?/i';
// preg_match_all returns the number of matches and fills $matches with a numerically indexed array
preg_match_all($pattern, $string, $matches);
// the data you want is in $matches[0], dump it with var_export() to see it
var_export($matches[0]);
output:
array (
0 => 'test1+2#gmail.com',
1 => 'test-2#yahoo.co.jp',
2 => 'test#test.com',
3 => 'test#test.co.uk',
4 => 'test#google.com.sg',
)
I know this is not the question you asked but I noticed that your regex is not accepting any address like 'myemail#office21.company.com' or any address with a subdomain. You could replace it with something like :
/[A-Za-z0-9._%+-]+#[A-Za-z0-9.-]+\.[A-Za-z]{2,4}/
which will reject fewer valid e-mail addresses (although it is not perfect).
I also suggest you read this article on e-mail validation, it is pretty good and informative.
Your code is almost perfect, you just need to replace preg_match(...) with preg_match_all(...)
http://www.php.net/manual/en/function.preg-match.php
http://www.php.net/manual/en/function.preg-match-all.php
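The difference between the two functions can be seen side by side in a small sketch. The sample text is made up, and the pattern is the one suggested in the answer above, written in the same address format as the other examples on this page.

```php
<?php
// Made-up sample text, using the address format from the examples above.
$text    = 'Write to mymail#yahoo.com or my-e.mail#goog.com.';
$pattern = '/[A-Za-z0-9._%+-]+#[A-Za-z0-9.-]+\.[A-Za-z]{2,4}/';

// preg_match() stops after the first hit: $first[0] is a single string.
preg_match($pattern, $text, $first);

// preg_match_all() keeps going: $all[0] is an array of every hit, $count is how many.
$count = preg_match_all($pattern, $text, $all);

echo $first[0], "\n";
echo $count, "\n";
print_r($all[0]);
```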
This detects all mail addresses:
$sourceeee= 'Here are examples mymail#yahoo.com and my-e.mail#goog.com or something more';
preg_match_all('/[A-Za-z0-9._%+-]+#[A-Za-z0-9.-]+\.[A-Za-z]{2,4}/i', $sourceeee, $found_mails);
then you can use $found_mails[0] array.
This code extracts all unique email addresses from a URL or file and outputs each on a new line. The regex handles subdomains and most prefix and suffix cases. Feel free to use it.
<?php
$url="http://example.com/";
$text=file_get_contents($url);
$res = preg_match_all(
"/[a-z0-9]+[_a-z0-9\.-]*[a-z0-9]+#[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,4})/i",
$text,
$matches
);
if ($res) {
foreach(array_unique($matches[0]) as $email) {
echo $email . "<br />";
}
}
else {
echo "No emails found.";
}
?>
check here for more reference : http://www.php.net/manual/en/function.preg-match-all.php
It worked better for me:
<?php
$content = "Hi my name is Joe, I can be contacted at joe#mysite.com.";
preg_match("/[_a-z0-9-]+(\.[_a-z0-9-]+)*#[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})/i", $content, $matches);
print $matches[0];
?>
Some of the others didn't accept domains like: name#example.com.sv
I found it on: http://snipplr.com/view/63938/
This function avoids a hand-written email regex: it splits the text into tokens and validates each one with filter_var(). It still uses preg_replace(), but only to normalize whitespace.
<?php
function extract_email_addresses($str){
    $emails = array();
    $str = strip_tags( $str );
    $str = preg_replace('/\s+/', ' ', $str);
    $str = preg_replace("/[\n\r]/", "", $str);
    $remove_chars = array (',', "<", ">", ";", "'", ". ");
    $str = str_replace( $remove_chars, ' ', $str );
    $parts = explode(' ', $str);
    if(count($parts) > 0){
        foreach($parts as $part){
            $part = trim($part);
            if( $part != '' ) {
                if( filter_var($part, FILTER_VALIDATE_EMAIL) !== false){
                    $emails[] = $part;
                }
            }
        }
    }
    if(count($emails) > 0){
        return $emails;
    }
    else{
        return null;
    }
}
$string = "Guys, please help me to extract valid sam-ple.1990@gmail.co.uk email addresses from some text content using php
example , i have below text content in mysql database ' Life is more beautiful, and i like to explore lot please email me to sample@gmail.com. Learn new things every day. 'from the above text content i want to extract email address 'sample-x@gmail.com' using php regular expressions or other method.";
$matches = extract_email_addresses( $string );
print_r($matches);
?>
