Remove all urls from text with php [duplicate] - php

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Remove urls using PHP
I'm trying to figure out the best way to remove URLs from text with PHP. I've looked at a bunch of different sites and questions on here but can't quite piece it all together.
I would like to remove all URLs like the following:
www.website.com
http://www.website.com
website.com
website.com/test
<tag>www.website.com</tag> (where <tag> is any html tag)
(www.website.com)
I've tried a few solutions I found on here, but I couldn't figure out how to match web addresses that are bordered by characters that aren't part of the address, e.g. parentheses or an HTML tag like <strong>.
Any help is much appreciated.
Thanks

Perhaps...
// http(s)://, with or without www.
$txt = preg_replace('|https?://[a-z0-9./-]+|i', '', $txt);
// only www.
$txt = preg_replace('|www\.[a-z0-9./-]+|i', '', $txt);
Or:
$Var = str_replace("itemtoreplace", "replacewith", $variabletoremovefrom);
PHP Str_replace
Use:
Remove urls using PHP
For reference.
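A minimal sketch of one way to cover the cases listed in the question, including addresses wrapped in parentheses or HTML tags. The function name remove_urls_sketch and the pattern are illustrative assumptions, not code from the answers above. The pattern treats anything shaped like name.tld as a URL, so it will also remove things like file names (notes.txt), and a trailing period after a path is swallowed with the path.

```php
<?php
// Hypothetical helper: strips URL-like strings, with or without a scheme or "www.".
function remove_urls_sketch(string $text): string
{
    // Optional scheme or "www.", then host labels, a TLD of 2+ letters, and an optional path.
    // The path stops at whitespace, angle brackets, parentheses and quotes,
    // so surrounding tags and brackets are left in place.
    $pattern = '~\b(?:https?://|www\.)?[a-z0-9-]+(?:\.[a-z0-9-]+)*\.[a-z]{2,}(?:/[^\s<>()"\']*)?~i';
    return preg_replace($pattern, '', $text);
}

echo remove_urls_sketch('<strong>www.website.com</strong>'), "\n"; // <strong></strong>
echo remove_urls_sketch('(www.website.com)'), "\n";                // ()
```

Because bare domains are matched on shape alone, it is worth running this against a sample of the real text before using it in bulk.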

Related

Find characters in string and make them a link in php [duplicate]

This question already has an answer here:
PHP regex. Convert [text](url) to text [duplicate]
(1 answer)
Closed 3 years ago.
I have 1000s of posts in Wordpress that have this weird code for a hyperlink in the body copy. For example, I want to find all instances of this:
[Website Name](http://www.website.com)
and turn it into
Website Name
What is the best way to achieve this in php?
$string = "This is a blog post hey check out this website [Website Name](http://www.website.com). It is a real good domain.";
// do some magic
You can use preg_replace with this regex:
\[([^]]+)]\((http[^)]+)\)
It looks for a [, followed by some non-] characters, a ] and (http, then some non-) characters until a ).
This is then replaced with $1. For example:
$string = "This is a blog post hey check out this website [Website Name](http://www.website.com). It is a real good domain.";
echo preg_replace('/\[([^]]+)]\((http[^)]+)\)/', '$1', $string);
Output:
This is a blog post hey check out this website Website Name. It is a real good domain.
This weird code is Markdown (used for example here in SO).
If you want to convert it to HTML using PHP, you could use this library: https://parsedown.org/
The advantage is that you will convert any other markdown tags and other forms of markdown links present in the posts.
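If only this one link form needs to become an HTML link and a full Markdown library is more than you need, the same regex can be reused with preg_replace_callback. This is a rough sketch under that assumption; the function name markdown_links_to_html is made up for illustration, and it handles only [text](http...) links, not other Markdown syntax.

```php
<?php
// Hypothetical helper: turns [label](http...) into an HTML anchor.
function markdown_links_to_html(string $text): string
{
    return preg_replace_callback(
        '/\[([^]]+)]\((http[^)]+)\)/',
        function (array $m): string {
            // Escape both parts so quotes or angle brackets cannot break the markup.
            $label = htmlspecialchars($m[1], ENT_QUOTES, 'UTF-8');
            $url   = htmlspecialchars($m[2], ENT_QUOTES, 'UTF-8');
            return '<a href="' . $url . '">' . $label . '</a>';
        },
        $text
    );
}

echo markdown_links_to_html('check out [Website Name](http://www.website.com).');
// check out <a href="http://www.website.com">Website Name</a>.
```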

PHP to remove xml [duplicate]

This question already has answers here:
How to get the value of an attribute from XML file in PHP?
(4 answers)
Closed 6 years ago.
I need some help with a problem if possible.
<IndexFile index="dlc:blahblahblahblah.zip" version="1.19.0" />
How would I use php to remove everything in the above line of code except
blahblahblahblah.zip
Note: blahblah isn't actual name, the name changes; I just need to remove the xml on either side of the .zip file.
I've tried a few things like strip_tags() but nothing works up to now.
Your test string was
<IndexFile index="dlc:blahblahblahblah.zip" version="1.19.0" />
Your regex pattern should be: index="dlc:(.+?)".
Your answer is:
blahblahblahblah.zip
Try it out at https://regex101.com/
See this answer for greedy vs nongreedy matching: java-pattern-does-not-return-leftmost-match
Ideally, you have to use an XML parser to parse the file and use XPaths/looping to get that item.
If that is the only line, why can't you just use a regex to extract the value? index="[a-zA-Z0-9_.:-]*" (or [a-zA-Z0-9_.:-]*\.zip)?
Again, you need to think about the future impacts.
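The parser route mentioned above could look roughly like this with SimpleXML. It is a sketch that assumes the line is well-formed XML on its own and that the value always carries the dlc: prefix shown in the question; the function name index_file_name is hypothetical.

```php
<?php
// Hypothetical helper: returns the file name from the index attribute, or null on failure.
function index_file_name(string $xmlLine): ?string
{
    $previous = libxml_use_internal_errors(true); // collect parse errors instead of printing warnings
    $element = simplexml_load_string($xmlLine);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($element === false || !isset($element['index'])) {
        return null;
    }

    $index = (string) $element['index'];
    // Drop the "dlc:" prefix when it is present; otherwise return the value unchanged.
    return strncmp($index, 'dlc:', 4) === 0 ? substr($index, 4) : $index;
}

echo index_file_name('<IndexFile index="dlc:blahblahblahblah.zip" version="1.19.0" />'), "\n";
// blahblahblahblah.zip
```

Unlike the regex, this keeps working if the attributes are reordered or the quoting style changes.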

detect URL in textarea and replace it to a link in PHP [duplicate]

This question already has answers here:
Replace URLs in text with HTML links
(17 answers)
Closed 8 years ago.
How can I detect a URL in a textarea when it doesn't include http://? Here is an example input:
Hi bro! Look at my new website: www.example.com. I learned to build websites from http://another.example.net and from example.net.
Is there a way to convert it to this code?
Hi bro! Look at my new website: <a href="http://www.example.com">www.example.com</a>.
I learned to build websites from <a href="http://another.example.net">http://another.example.net</a>
and from <a href="http://example.net">example.net</a>.
As you can see, the code detects a URL even if it doesn't start with http:// or www, and adds the http:// to the a tag.
See this basic example. It matches across multi-line text and is not too restrictive. You could have links to an internal network where the machine hostname is used, like http://myspecialserver. That is a valid link, even though it might be accessible only from certain networks.
The answer uses regular expressions. You can read more about them here: http://www.tutorialspoint.com/php/php_regular_expression.htm
With them we match the protocol and any text after it that is consistent with a URL, meaning it contains no space characters, tabs, carriage returns or line feeds.
<?php
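function linkify($text) {
    // $1$2 is the protocol, $3 the rest of the URL, $4 the whitespace that ended it
    return preg_replace('#\b(http|ftp)(s)?\://([^ \s\t\r\n]+?)([\s\t\r\n])+#smui', '<a href="$1$2://$3">$1$2://$3</a>$4', $text);
}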
echo nl2br(linkify('
Hello, visit https://www.domain.com
We are not partners of http://microsoft.com/ :)
Download source from: ftp://new.sourceforge.com
'));
?>
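The answer above only links addresses that already carry a protocol. For the cases in the question that lack one (www.example.com, example.net), one possible approach is to match on the domain shape and add http:// when no scheme was given. This is a sketch under that assumption; linkify_sketch is a hypothetical name, hostnames without a dot such as http://myspecialserver are not matched, and e-mail addresses or file names like notes.txt would be linked as well.

```php
<?php
// Hypothetical helper: links URLs with or without a scheme, defaulting to http://.
function linkify_sketch(string $text): string
{
    // Group 1: optional scheme. Group 2: host labels, TLD of 2+ letters, optional path.
    $pattern = '~\b(?:(https?|ftp)://)?((?:[a-z0-9-]+\.)+[a-z]{2,}(?:/[^\s<>"\']*)?)~i';

    return preg_replace_callback($pattern, function (array $m): string {
        $scheme = $m[1] !== '' ? $m[1] : 'http'; // assume http when none was written
        $href   = htmlspecialchars($scheme . '://' . $m[2], ENT_QUOTES, 'UTF-8');
        $label  = htmlspecialchars($m[0], ENT_QUOTES, 'UTF-8');
        return '<a href="' . $href . '">' . $label . '</a>';
    }, $text);
}

echo linkify_sketch('Look at www.example.com. I learned from http://another.example.net and from example.net.');
```

The sentence-ending period stays outside the link when the URL has no path; a period directly after a path would be taken as part of the path.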

Strip RTF strings with PHP - Regex [duplicate]

This question already has answers here:
Regular Expression for extracting text from an RTF string
(11 answers)
Closed 9 years ago.
A column in the database I work with contains RTF strings, I would like to strip these out using PHP, leaving just the sentence between.
It is a MS SQL database 2005 if I recall correctly.
An example of the kind of strings pulled from the database (need any more let me know, all the rest are similar):
{\rtf1\ansi\ansicpg1252\deff0\deflang2057{\fonttbl{\f0\fnil\fcharset0 Tahoma;}}
\viewkind4\uc1\pard\lang1033\f0\fs17 ASSEMBLE COMPONENTS AS DETAILED ON DRAWING.\lang2057\fs17\par
}
I would like this to be stripped to only return:
ASSEMBLE COMPONENTS AS DETAILED ON DRAWING.
Now, I have successfully managed to strip the characters in ASP.NET for a previous project, however I would like to do so using PHP. Here is the regular expression I used in ASP.NET, which works flawlessly may I add:
"(\{.*\})|}|(\\\S+)"
However when I try to use the same expression in PHP with a preg_replace it does not strip half of the characters.
Any regex gurus out there?
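Use this code:
$string = preg_replace('/(\{.*\})|}|(\\\\\S+)/', '', $string);
Note that I added a '/' delimiter at the beginning and at the end of the regex. The pattern is also written in single quotes with four backslashes before \S: PHP turns each \\ in the string into one backslash, so the regex engine receives \\\S+ (a literal backslash followed by non-space characters). With only \\\S+ in the PHP string, the engine would see \\S+ and look for a backslash followed by the letter S.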
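A small runnable check of that pattern against the sample from the question. This is only a sketch: the function name strip_rtf_sketch is hypothetical, the backslash in the pattern is written as four backslashes so the regex engine receives an escaped literal backslash, and the regex is tuned to this simple kind of RTF (it will not cope with nested groups that span lines, escaped braces, or hex-encoded characters).

```php
<?php
// Hypothetical helper: removes RTF groups and control words, leaving the plain sentence.
function strip_rtf_sketch(string $rtf): string
{
    // {...} groups on one line, stray closing braces, then control words such as \pard or \fs17.
    $plain = preg_replace('/(\{.*\})|}|(\\\\\S+)/', '', $rtf);
    return trim($plain);
}

// Single-quoted strings keep the RTF backslashes literal.
$rtf = '{\rtf1\ansi\ansicpg1252\deff0\deflang2057{\fonttbl{\f0\fnil\fcharset0 Tahoma;}}' . "\n"
     . '\viewkind4\uc1\pard\lang1033\f0\fs17 ASSEMBLE COMPONENTS AS DETAILED ON DRAWING.\lang2057\fs17\par' . "\n"
     . '}';

echo strip_rtf_sketch($rtf), "\n"; // ASSEMBLE COMPONENTS AS DETAILED ON DRAWING.
```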

php regex for parsing html [duplicate]

This question already has answers here:
How do you parse and process HTML/XML in PHP?
(31 answers)
Closed 3 years ago.
I need some help parsing HTML: I want to extract everything that starts with http:// and contains "abc", up to the first occurrence of ", ' or a blank space.
I have a regex like this, /http:\/\/abc(.*)\"/, but it's not working well :\
Are there any ideas? :)
P.S. Sorry for my bad English, it's not my native language ;)
StackOverflow tends to prefer an HTML Document Parser over Regular Expressions for parsing HTML.
However, with that said, if you just want URLs from a string that happens to be HTML, I still believe a Regex is fine for the job.
Try preg_match_all:
preg_match_all("/http:\/\/[^\s'\"]*abc[^\s'\"]*/", $string, $matches);
Use a parser instead of a regex.
RegEx match open tags except XHTML self-contained tags
If all you want to do is extract URLs, regexen are a good choice. You don't need to get into the parser world.
If you have unix-like command tools you could approximate it very simply (assuming one url per line) with two passes:
grep http myfile.html | grep abc
You can use preg_grep() similarly.
preg_match_all('/http:[^"\' ]+/', $html, $urls);
// $urls[0] contains all the URLs matched in your document
$abc_urls = preg_grep('/abc/', $urls[0]);
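For comparison, the parser approach suggested in some of the answers could be sketched with PHP's DOM extension. This assumes the URLs of interest sit in href attributes of a tags, which the question does not actually state; the function name hrefs_containing is made up for illustration.

```php
<?php
// Hypothetical helper: returns http:// hrefs from <a> tags that contain $needle.
function hrefs_containing(string $html, string $needle): array
{
    if ($html === '') {
        return []; // loadHTML() rejects empty input
    }

    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // real-world HTML is rarely valid; keep warnings quiet
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $found = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = $a->getAttribute('href');
        if (strpos($href, 'http://') === 0 && strpos($href, $needle) !== false) {
            $found[] = $href;
        }
    }
    return $found;
}

$html = '<p><a href="http://abc.example.com/x">one</a> '
      . '<a href="http://example.org/">two</a> '
      . "<a href='http://example.net/abc'>three</a></p>";

print_r(hrefs_containing($html, 'abc'));
```

The parser handles both quoting styles for free, but it only sees URLs that are attribute values, whereas the regex versions also pick up URLs in plain text.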
