PHP Regex to convert text before colon to link - php

I need to find the first occurrence of a colon ':', take the complete string before it, and turn that string into a link.
e.g.
username: #twitter nice site! RT www.google.com : visited!
needs to be converted to:
username: nice site! RT www.google.com : visited!
I've already got the following regex that converts the string #twitter to a clickable URL:
E.g.
$description = preg_replace("/#(\w+)/", "#\\1", $description);
Any ideas : )

I'd use string manipulation for this, rather than regex, using strstr, substr and strlen:
$username = strstr($description, ':', true);
$description = '' . $username . ''
. substr($description, strlen($username));
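A minimal sketch of the string-function approach above, with the link markup written out. The base URL `https://twitter.com/` and the helper name `linkUsername` are assumptions for illustration; the question does not say where the link should point.

```php
<?php
// Hypothetical base URL; the original post does not say where the link points.
const PROFILE_BASE = 'https://twitter.com/';

// Hypothetical helper name, not from the original answer.
function linkUsername(string $description): string
{
    // strstr() with $before_needle = true returns the part before the first ':',
    // or false when the string contains no colon at all.
    $username = strstr($description, ':', true);
    if ($username === false || $username === '') {
        return $description; // nothing to link
    }

    $link = '<a href="' . PROFILE_BASE . rawurlencode($username) . '">'
          . htmlspecialchars($username) . '</a>';

    // Keep everything from the first colon onwards untouched.
    return $link . substr($description, strlen($username));
}

echo linkUsername('username: nice site! RT www.google.com : visited!'), "\n";
```

Only the first colon matters here, so the later " : visited!" is left alone.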

$regEx = "/^([^:\s]*)(.*?:)/";
// Single quotes, so \1 and \2 reach preg_replace() as backreferences
$replacement = '\1\2';

I have not tested the code, but it should work as is. Basically you need to capture after #twitter too.
$description = preg_replace("%([^:]+): #twitter (.+)%i",
"#\\1: \\2",
$description);

The following should work -
$description = preg_replace("/^(.+?):\s#twitter\s(.+?)$/", "#\\1: \\2", $description);

Direct answer to your question:
$string = preg_replace('/^(.*?):/', '$1:', $string);
But I assume that you are parsing twitter RSS or something similar. So you can just use /^(\w+)/.
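For comparison, here is a sketch of the anchored-regex variant suggested above. The `$base` URL is an assumption, since the original markup is not shown.

```php
<?php
// Hypothetical link target; not given in the original post.
$base = 'https://twitter.com/';

$string = 'username: nice site! RT www.google.com : visited!';

// ^ anchors the match at the start of the string, so only the word
// before the first colon is linked. The limit of 1 is an extra safeguard.
$linked = preg_replace('/^(\w+):/', '<a href="' . $base . '$1">$1</a>:', $string, 1);

echo $linked, "\n";
```

Because `\w+` cannot match a colon or a space, a string that does not start with a plain username followed by a colon is returned unchanged.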

Related

How to preserve single and double quotes in a link using REGEX?

I have a regex code which finds all URLs and replaces them with a HTML link. Here is my code:
// initializing
$str = "this is a good website www.example.com/classname/methodname/arg";
$rexProtocol = '(https?://)?';
$rexDomain = '((?:[-a-zA-Z0-9]{1,63}\.)+[-a-zA-Z0-9]{2,63}|(?:[0-9]{1,3}\.){3}[0-9]{1,3})';
$rexPort = '(:[0-9]{1,5})?';
$rexPath = '(/[!$-/0-9:;=#_\':;!a-zA-Z\x7f-\xff]*?)?';
$rexQuery = '(\?[!$-/0-9:;=#_\':;!a-zA-Z\x7f-\xff]+?)?';
$rexFragment = '(#[!$-/0-9:;=#_\':;!a-zA-Z\x7f-\xff]+?)?';
function callback($match){
// Prepend http:// if no protocol specified
$completeUrl = $match[1] ? $match[0] : "http://{$match[0]}";
$DetectProperName = strlen($match[2].$match[3].$match[4]) > 20 ? "...".substr($match[2].$match[3].$match[4],0,20) : $match[2].$match[3].$match[4];
return ''.$DetectProperName. '';
}
echo $str = preg_replace_callback("&\\b$rexProtocol$rexDomain$rexPort$rexPath$rexQuery$rexFragment(?=[?.!,;:\"]?(\s|$))&",'callback', htmlspecialchars($str));
Also here is the output:
this is a good website ...www.example.com/clas
Well, that's OK and it works for links. My question is about input that contains a quote (' or "): a \ gets added next to it. How can I fix that? I want the regex to be insensitive to quotes.
Here is an example:
Input:
$str = 'this is a " (quote)';
Current Output:
this is a \" (quote)
What I want:
this is a " (quote)
How can I do that?
Edit: According to some tests, I figured out that single/double quotes get changed to their ASCII codes. How can I prevent that?
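One likely cause, assuming the conversion comes from the htmlspecialchars() call in the code above rather than from the regex: by default that function turns quotes into entities. A small sketch of the difference the ENT_NOQUOTES flag makes:

```php
<?php
$str = 'this is a " (quote)';

// Default flags: the double quote becomes an entity.
echo htmlspecialchars($str), "\n";               // this is a &quot; (quote)

// ENT_NOQUOTES leaves both kinds of quote alone but still escapes <, > and &.
echo htmlspecialchars($str, ENT_NOQUOTES), "\n"; // this is a " (quote)
```

Unescaped quotes are fine in element text, but the result should not then be placed inside an HTML attribute value.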

Replace url strings in PHP

I have a string for example : I am a boy
I want to show this on my url for example in this way : index.php?string=I-am-a-boy
My program :
$title = "I am a boy";
$number_wrds = str_word_count($title);
if($number_wrds > 1){
$url = str_replace(' ','-',$title);
}else{
$url = $title;
}
What if I have a string : Destination - Silicon Valley
If I implement the same logic my url will be : index.php?string=Destination---Silicon-Valley
But I want to show only 1 hyphen.
I want to show a hyphen instead of a plus sign.
urlencode() inserts plus symbols, so it's not helping here.
If I use a hyphen and the actual string is Destination - Silicon Valley, the URL should look like
Destination-Silicon-Valley and not
Destination---Silicon-Valley
Check this Stack Overflow question's title and its URL, and you will see what I mean.
Use urlencode() to send strings along with an url:
$url = 'http://your.server.com/?string=' . urlencode($string);
In the comments you said that you don't want urlencode and would rather just replace spaces with - characters.
First, you should "just do it": the if conditional and str_word_count() are just overhead. Basically your example should look like this:
$title = "I am a boy";
$url = str_replace(' ','-', $title);
That's it.
You also said that this causes problems if the original string already contains a -. I would use preg_replace() instead of str_replace() to solve that problem, like this:
$string = 'Destination - Silicon Valley';
// replace spaces by hyphen and
// group multiple hyphens into a single one
$string = preg_replace('/[ -]+/', '-', $string);
echo $string; // Destination-Silicon-Valley
Use preg_replace instead:
$url = preg_replace('/\s+/', '-', $title);
\s+ means "one or more whitespace characters" (space, tab \t, carriage return \r, line feed \n, form feed \f).
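The two preg_replace ideas above can be combined into one small helper. This is only a sketch; the name `slugify` is hypothetical and not part of the original answers.

```php
<?php
// Hypothetical helper name; not part of the original answers.
function slugify(string $title): string
{
    // Collapse every run of whitespace and/or hyphens into one hyphen...
    $slug = preg_replace('/[\s-]+/', '-', $title);
    // ...and drop hyphens left over at either end.
    return trim($slug, '-');
}

echo slugify('I am a boy'), "\n";                    // I-am-a-boy
echo slugify('Destination - Silicon Valley'), "\n";  // Destination-Silicon-Valley
```

Other characters that are unsafe in a URL are not handled here and would still need encoding or stripping.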
use urlencode:
<?php
$s = "i am a boy";
echo urlencode($s) . "\n";
$s = "Destination - Silicon Valley";
echo urlencode($s);
?>
return:
i+am+a+boy
Destination+-+Silicon+Valley
and urldecode:
<?php
$s = "i+am+a+boy";
echo urldecode($s)."\n";
$s = "Destination+-+Silicon Valley";
echo urldecode($s);
?>
return:
i am a boy
Destination - Silicon Valley
Just use urlencode() and urldecode(). They are meant for sending data via GET in the URL.
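If the plus signs are the only objection to urlencode(), it may help to see the built-in alternatives side by side. A short sketch using only standard PHP functions:

```php
<?php
$s = 'Destination - Silicon Valley';

// urlencode() encodes spaces as "+" (form encoding).
echo urlencode($s), "\n";     // Destination+-+Silicon+Valley

// rawurlencode() encodes spaces as "%20" (RFC 3986 style).
echo rawurlencode($s), "\n";  // Destination%20-%20Silicon%20Valley

// http_build_query() builds the whole query string, using "+" by default.
echo 'index.php?' . http_build_query(['string' => $s]), "\n";
// index.php?string=Destination+-+Silicon+Valley
```

None of these produces hyphens, so a readable URL like the one in the question still needs an explicit replacement step.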

How to ignore single quotes in regex using preg_replace function in PHP?

I am basically trying to transform any hash-tagged word in a string into a link:
Here is what my code looks like:
public function linkify($text)
{
// ... generating $url
$text = preg_replace("/\B#(\w+)/", "<a href=" . $url . "/$1>#$1</a>", $text);
return $text;
}
It works pretty well except when $text contains a single quote. Here are two examples:
Example1:
"What is your #name ?"
Result: "What is your #name?" Works fine.
Example2:
"What's your #name ?"
Result: "What's your #name?" Does not work, I want
this result: "What's your #name?"
Any idea about how I can get rid of that single quote problem using PHP ?
EDIT1:
Just for info, before or after html_entity_decode($text) I got
"What's your #name?"
Something like this.
$string = "' \'' '";
$string = preg_replace("#[\\\\']#", "\'", $string);
Something is protecting your HTML entities. This can save your life if the string comes from a GET/POST request, but if it's from a trusted source, just use html_entity_decode to convert it back. This 39-thing (&#39;) is a way to express the single quote, as you might have realized.
If the problem is HTML entities, then maybe you only need to html_entity_decode your $text before running the replacement:
$text = html_entity_decode($text, ENT_QUOTES);
$text = preg_replace("/\B#(\w+)/", "<a href=" . $url . "/$1>#$1</a>", $text);
Thanks all for your suggestions, I've finally sorted this out with this :
html_entity_decode($str, ENT_QUOTES);
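To see why decoding first helps, here is a sketch that assumes the apostrophe arrives already encoded as the entity `&#039;` (the exact entity in the asker's data is not shown). The variable names are illustrative.

```php
<?php
// Assumption: the apostrophe arrives already encoded as the entity &#039;
$text = 'What&#039;s your #name ?';
$pattern = '/\B#(\w+)/';

// On the encoded text the pattern also matches "#039" inside the entity.
preg_match_all($pattern, $text, $encodedMatches);
print_r($encodedMatches[0]); // contains "#039" and "#name"

// Decode first (ENT_QUOTES makes sure &#039; is converted too),
// then only the real hashtag matches.
$decoded = html_entity_decode($text, ENT_QUOTES);
preg_match_all($pattern, $decoded, $decodedMatches);
print_r($decodedMatches[0]); // contains only "#name"
```

If the text comes from users, the decoded string needs escaping again before it is written into the page.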

Removal of bad hyperlinks and the content inside of them

Ok, basically I have an array of bad urls and I would like to search through a string and strip them out. I want to strip everything from the opening tag to the closing tag, but only if the url in the hyperlink is in the array of bad urls. Here is how I would picture it working but I don't understand regular expressions well.
foreach($bad_urls as $bad_url){
$pattern = "/<a*$bad_url*</a>/";
$replacement = ' ';
preg_replace($pattern, $replacement, $content);
}
Thanks in advance.
Assuming that your 'bad urls' are properly formatted URLs, I would suggest doing something like this:
foreach ($bad_urls as $bad_url) {
    // i: case-insensitive; s: "." also matches newlines; U: quantifiers are lazy by default
    $pattern = '/<a\s[^>]*href="' . convert_to_pattern($bad_url) . '"[^>]*>.*<\/a>/isU';
    $replacement = ' ';
    $content = preg_replace($pattern, $replacement, $content);
}
and separately
function convert_to_pattern($url)
{
    // preg_quote() escapes all regex metacharacters; the second argument also escapes the "/" delimiter
    return preg_quote($url, '/');
}
Please do not try to parse HTML using regular expressions. Just load up the HTML in a DOM, find all the <a> tags and check the href property. Much simpler and fool-proof.
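A rough sketch of that DOM approach using PHP's bundled DOMDocument. The helper name `removeBadLinks` is hypothetical, and it assumes the bad URLs must match the href value exactly.

```php
<?php
// Hypothetical helper; $badUrls is assumed to hold exact href values.
function removeBadLinks(string $html, array $badUrls): string
{
    // Keep libxml warnings about imperfect markup out of the output.
    libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    // Wrap the fragment in a <div> so it has a single root element.
    $doc->loadHTML('<div>' . $html . '</div>', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
    libxml_clear_errors();

    // Copy the live node list into an array first: removing nodes while
    // iterating the live list would skip elements.
    foreach (iterator_to_array($doc->getElementsByTagName('a')) as $a) {
        if (in_array($a->getAttribute('href'), $badUrls, true)) {
            $a->parentNode->removeChild($a);
        }
    }

    // Serialize only the children of the wrapper <div>.
    $out = '';
    foreach ($doc->documentElement->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo removeBadLinks(
    'Hi <a href="http://bad.example/">spam</a> and <a href="http://good.example/">ok</a>',
    ['http://bad.example/']
), "\n";
```

The whole `<a>` element goes, including its text. Non-ASCII input would need an explicit encoding hint, which this sketch leaves out.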

Remove special chars from URL

I have a product database and I am trying to display the products with clean URLs. Below are example product names:
PAUL MITCHELL FOAMING POMADE (150ml)
American Crew Classic Gents Pomade 85g
Tigi Catwalk Texturizing Pomade 50ml
What I need to do is display like below in the URL structure:
www.example.com/products/paul-mitchell-foaming-gel(150ml)
The problem I have is I want to do the following:
1. Remove anything inside parentheses (and the parentheses)
2. Remove any numbers next to g or ml e.g. 400ml, 10g etc...
I have been banging my head trying different string replacements but can't get it right. I would really appreciate some help.
Cheers
function makeFriendly($string)
{
$string = strtolower(trim($string));
$string = str_replace("'", '', $string);
$string = preg_replace('#[^a-z\-]+#', '_', $string);
$string = preg_replace('#_{2,}#', '_', $string);
$string = preg_replace('#_-_#', '-', $string);
return preg_replace('#(^_+|_+$)#D', '', $string);
}
This function helps you clean a URL (it also strips numbers).
try this,
<?php
$url = 'http%3A%2F%2Fdemo.com';
$decodedurl= urldecode($url);
echo $decodedurl;
?>
$from = array('/\(|\)/','/\d+ml|\d+g/','/\s+/');
$to = array('','','-');
$sample = 'PAUL MITCHELL FOAMING POMADE (150ml)';
$sample = strtolower(trim(preg_replace($from,$to,$sample),'-'));
echo $sample; // prints paul-mitchell-foaming-pomade
Try this:
trim(preg_replace('/\s\s+/', ' ', preg_replace("/(?:\(.*?\)|\d+\s*(?:g|ml))/", "", $input)));
// "abc (def) 50g 500 ml 3m(ghi)" --> "abc 3m"
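Putting the two requested steps together with slug formatting might look like the sketch below. The function name `productSlug` is hypothetical, and the regexes are one possible reading of "numbers next to g or ml".

```php
<?php
// Hypothetical helper combining the two requested steps with slug formatting.
function productSlug(string $name): string
{
    // 1. Drop parenthesised parts, parentheses included.
    $name = preg_replace('/\([^)]*\)/', '', $name);
    // 2. Drop quantities such as "85g", "50ml" or "500 ml".
    $name = preg_replace('/\b\d+\s*(?:ml|g)\b/i', '', $name);
    // 3. Lowercase, turn runs of other characters into single hyphens, trim.
    $name = preg_replace('/[^a-z0-9]+/', '-', strtolower($name));
    return trim($name, '-');
}

echo productSlug('PAUL MITCHELL FOAMING POMADE (150ml)'), "\n";   // paul-mitchell-foaming-pomade
echo productSlug('American Crew Classic Gents Pomade 85g'), "\n"; // american-crew-classic-gents-pomade
echo productSlug('Tigi Catwalk Texturizing Pomade 50ml'), "\n";   // tigi-catwalk-texturizing-pomade
```

The word boundaries in step 2 keep digits that are part of a name (for example "3m") from being stripped.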
