I've been looking for an example on how to use preg_replace or RegEX to
search HTML for either http://www.mydomain.co.uk/page.html or /page.html
and replace .html with .app
I only want to replace the .html on pages that belong to this site
I have an app that uses pages from our website. When I link page from the app I put .app so it removes/ changes the formatting. Which works, I just want to dynamically edit the links on the pages.
Cheers
OK for those interested, I've cracked it.
It searches html for links then checks if the link belongs to site.
Find ".html" replace with ".app" within the link
Find old links replace with new links within html.
Thanks guys you pointed me in the right direction
function searchHTML($html){
// if the page is used by the app replace all domain pages .html with .app
$domain ='mydomain.com';
$regEX = '/<a[^>]+href=([\'"])(.+?)\1[^>]*>/I';
preg_match_all($regEX, $html, $match); //check page for links
for($l=0; $l<=count($match[2]); $l++){ //loop through links found
$link = $match[2][$l];//url within href
if ((strpos($link, $domain) !== false) || ($link[0]=="/")) { //check links are domain links(domain name or the first char is /)
$updateLink = str_replace(".html", ".app", $link);//replace .html ext with .app ext
$html = str_replace($link, $updateLink, $html); //update html with new links
}
}
return $html;
}
function changeURL($url) {
$url = str_replace('html', 'app', $url);
return $url;
}
echo changeURL('http://www.mydomain.co.uk/page.html');
Result: http://www.mydomain.co.uk/page.app
Try this, works with PHP 4 >= 4.0.5
<?php
$string = "http://www.mydomain.co.uk/page.html";
$string = preg_replace_callback("/mydomain.co.uk\/.+?(html)$/", "replace", $string);
function replace($matches) {
return str_replace("html", "app", $matches[0]);
}
echo $string;
<?php
// it's important to use the $ in regex to be sure to replace only the suffix and not a part of the url
function change_url($url, $find, $replace){
return preg_replace("#\\.".preg_quote($find, "#")."$#uim", ".".$replace, $url);
}
echo change_url("http://www.mydomain.co.uk/page.html", "html", "app");
//retuns http://www.mydomain.co.uk/page.app
?>
Related
i'm trying to do something in PHP
I'm trying to get the link of an image -> store it to my DB, but I'd like the user to be able to store text before it, and after it, I've gotten my hands on a similar function for links, but the image part is missing.
As you can see the turnUrlIntoHyperlink does a regex check over the entire arg passed, turning the text that contains it to the url, so users can post something like
Hey check this cool site "https://stackoverflow.com" its dope!
And the entire argument posting to my database.
However i can't seem to get the same function working for the Convert Image, as it simply won't post and removed text before/after it before when i made the attempt.
How would i do this in a correct way, and can i combine these 2 functions in to 1 function?
function convertImg($string) {
return preg_replace('/((https?):\/\/(\S*)\.(jpg|gif|png)(\?(\S*))?(?=\s|$|\pP))/i', '<img src="$1" />', $string);
}
function turnUrlIntoHyperlink($string){
//The Regular Expression filter
$reg_exUrl = "/(?i)\b((?:https?:\/\/|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'\".,<>?«»“”‘’]))/";
// Check if there is a url in the text
if(preg_match_all($reg_exUrl, $string, $url)) {
// Loop through all matches
foreach($url[0] as $newLinks){
if(strstr( $newLinks, ":" ) === false){
$link = 'http://'.$newLinks;
}else{
$link = $newLinks;
}
// Create Search and Replace strings
$search = $newLinks;
$replace = ''.$link.'';
$string = str_replace($search, $replace, $string);
}
}
//Return result
return $string;
}
more explained in detail :
When i post a link like https://google.com/ I'd like it to be a href,
But if i post an image like https://image.shutterstock.com/image-photo/duck-on-white-background-260nw-1037486431.jpg , i'd like it to be a img src,
Currently, i'm storing it in my db and echoing it to a little debug panel,
Do you mean that you want to make an <img> inside <a> element?
Your turnUrlIntoHyperlink function have captured the url successfully, so we can just use explode to get string before and after the link.
$exploded = explode($link, $string);
$string_before = $exploded[0];
$string_after = $exploded[1];
Code example:
<?php
function turnUrlIntoHyperlink($string){
//The Regular Expression filter
$reg_exUrl = "/(?i)\b((?:https?:\/\/|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'\".,<>?«»“”‘’]))/";
// Check if there is a url in the text
if(preg_match_all($reg_exUrl, $string, $url)) {
// add http protocol if the url does not already contain it
$newLinks = $url[0][0];
if(strstr( $newLinks, ":" ) === false){
$link = 'http://'.$newLinks;
}else{
$link = $newLinks;
}
$exploded = explode($link, $string);
$string_before = $exploded[0];
$string_after = $exploded[1];
return $string_before.'<img src="'.$link.'">'.$string_after;
}
return $string;
}
echo turnUrlIntoHyperlink('Hey check this cool site https://stackoverflow.com/img/myimage.png its dope!');
Output:
Hey check this cool site <img src="https://stackoverflow.com/img/myimage.png"> its dope!
Edit: the question has been edited
Since an image URL is just another kind of link/URL, your logic should go like this pseudocode:
if link is image and link is url
print <img src=link> tag
else if link is url and link is not image
print <a href=link> tag
else
print link
So you can just write a new function to "merge" those two function:
function convertToImgOrHyperlink($string) {
$result = convertImg($string);
if($result != $string) return $result;
$result = turnUrlIntoHyperlink($string);
if($result != $string) return $result;
return $string;
}
echo convertToImgOrHyperlink('Hey check this cool site https://stackoverflow.com/img/myimage.png its dope!');
echo "\r\n\r\n";
echo convertToImgOrHyperlink('Hey check this cool site https://stackoverflow.com/ its dope!');
echo "\r\n\r\n";
Output:
Hey check this cool site <img src="https://stackoverflow.com/img/myimage.png" /> its dope!
Hey check this cool site https://stackoverflow.com/ its dope!
The basic idea is that since image url is also a link, such check must be done first. Then if it's effective (input and return is different), then do <img> convertion. Otherwise do <a> convertion.
I want to rewrite/mask all external url in my article and also add nofollow and target="_blank". So that original link to the external site is get encrypted/ masked/ rewritten.
For example:
original link: www.google.com
rewrite it to: www.mydomain.com?goto=google.com
There is a plugin for joomla which rewrite external link: rewrite plugin.
But I am not using joomla. Please have a look at above plugin, It does exactly what I am looking for.
What I want?
$article = "hello this is example article I want to replace all external link http://google.com";
$host = substr($_SERVER['HTTP_HOST'], 0, 4) == 'www.' ? substr($_SERVER['HTTP_HOST'], 0) : $_SERVER['HTTP_HOST'];
if (thisIsNotMyWebsite){
replace external url
}
You can use DOMDocument to parse and traverse the document.
function rewriteExternal($html) {
// The url for your redirection script
$prefix = 'http://www.example.com?goto=';
// a regular expression to determine if
// this link is within your site, edit for your
// domain or other needs
$is_internal = '/(?:^\/|^\.\.\/)|example\.com/';
$dom = new DOMDocument();
// Parse the HTML into a DOM
$dom->loadHTML($html);
$links = $dom->getElementsByTagName('a');
foreach ($links as $link) {
$href = $link->getAttribute('href');
if (!preg_match($is_internal, $href)) {
$link->getAttributeNode('href')->value = $prefix . $href;
$link->setAttributeNode(new DOMAttr('rel', 'nofollow'));
$link->setAttributeNode(new DOMAttr('target', '_blank'));
}
}
// returns the updated HTML or false if there was an error
return $dom->saveHTML();
}
This approach will be much more reliable than using a regular expression based solution since it actually parses the DOM for you instead of relying on a often-fragile regex.
something like:
<?php
$html ='1224 google 567';
$tracking_string = 'http://example.com/track.php?url=';
$html = preg_replace('#(<a[^>]+href=")(http|https)([^>" ]+)("?[^>]*>)#is','\\1'.$tracking_string.'\\2\\3\\4',$html);
echo $html;
in action here: http://codepad.viper-7.com/7BYkoc
--my last update
<?php
$html =' 1224 google 567';
$tracking_string = 'http://example.com/track.php?url=';
$html = preg_replace('#(<a[^>]+)(href=")(http|https)([^>" ]+)("?[^>]*>)#is','\\1 nofollow target="_blank" \\2'.$tracking_string.'\\3\\4\\5',$html);
echo $html;
http://codepad.viper-7.com/JP8sUk
I have searched, and searched for 3+ hours this morning and tried over 10 different setups for how to grab and display a list of images from a url, and none of them worked correctly. I would either end up with no info displaying, or a 500 error. Can someone point me to an example or help me out here on how to do this properly. file_get_contents is not a viable option.
Example Directory: http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/
Files i know that are in that directory:
001.jpg,
002.jpg,
003.jpg
I would like the output to be the exact url to the file.
Let me know if more info is needed, i'm not 100% sure exactly how to explain it right lol.
Edit:
ok so what I guess i actually want to do is check the url for all the image tags and display a list with the full url to that image.
New to working with this url+images+php stuff so please don't hit me too hard with your downvote hammer with no comments lol.
Code I Tried:
<?php
/*
Credits: Bit Repository
URL: http://www.bitrepository.com/
*/
$url = $location;
// Fetch page
$string = FetchPage($url);
// Regex that extracts the images (full tag)
$image_regex_src_url = '/<img[^>]*'.
'src=[\"|\'](.*)[\"|\']/Ui';
preg_match_all($image_regex, $string, $out, PREG_PATTERN_ORDER);
$img_tag_array = $out[0];
echo "<pre>"; print_r($img_tag_array); echo "</pre>";
// Regex for SRC Value
$image_regex_src_url = '/<img[^>]*'.
'src=[\"|\'](.*)[\"|\']/Ui';
preg_match_all($image_regex_src_url, $string, $out, PREG_PATTERN_ORDER);
$images_url_array = $out[1];
echo "<pre>"; print_r($images_url_array); echo "</pre>";
// Fetch Page Function
function FetchPage($path)
{
$file = fopen($path, "r");
if (!$file)
{
exit("The was a connection error!");
}
$data = '';
while (!feof($file))
{
// Extract the data from the file / url
$data .= fgets($file, 1024);
}
return $data;
}
?>
and it returned a blank page
Based loosely on the code you already tried (but was riddled with problems). This grabs the full contents of the URL $url, parses out the <img> src attributes, and then outputs them.
Because this particular web host uses <base href=""/> tag to reset the base part of all URLs on the page, I've added a $base variable which you should set to the contents of the base tag.
Additionally, it looks like this particular web host has some pretty smart anti-hotlinking in place, so not all images may be visible.
But! Give it a whirl, let me know if it does what you need it to, and any questions.
<?php
$url = 'http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/';
$base = 'http://www.webtoonlive.com/';
// Pull in the external HTML contents
$contents = file_get_contents( $url );
// Use Regular Expressions to match all <img src="???" />
preg_match_all( '/<img[^>]*src=[\"|\'](.*)[\"|\']/Ui', $contents, $out, PREG_PATTERN_ORDER);
foreach ( $out[1] as $k=>$v ){ // Step through all SRC's
// Prepend the URL with the $base URL (if needed)
if ( strpos( $v, 'http://' ) !== true ) $v = $base . $v;
// Output a link to the URL
echo '' . $v . '<br/>';
}
Sample output:
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/000.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/001.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/002.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/003.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/004.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/005.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/006.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/007.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/008.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/009.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/010.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/011.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/012.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/013.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/014.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/015.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/016.jpg
I would like to implement the following feature onto my site. When a user posts something, he is also allowed to include one link, which is a link to a picture. Imagine a user posts something like this:
Hello look at this awesome picture. It is hilarious isn't it?
http://www.google.com/image.jpg
Then that text should be converted to:
Hello look at this awesome picture. It is hilarious isn't it?
<a target="_blank" href="http://www.google.com/image.jpg">
<img src="http://www.google.com/image.jpg" alt=""/>
</a>
So I need some php script that searches through the text for links and if it finds a link, checks that it links to a picture. It also needs to be able to recognize links that do not start with http, and also links that start with https.
How would you do that?
Thanks a lot :)
Dennis
how about these two links combined:
best way to determine if a URL is an image in PHP
PHP Regular Expression Text URL to HTML Link
$url="http://google.com/image.jpg";
function isImage( $url ){
$pos = strrpos( $url, ".");
if ($pos === false)
return false;
$ext = strtolower(trim(substr( $url, $pos)));
$imgExts = array(".gif", ".jpg", ".jpeg", ".png", ".tiff", ".tif"); // this is far from complete but that's always going to be the case...
if ( in_array($ext, $imgExts) )
return true;
return false;
}
$test=isImage($url);
if($test){
$pattern = '/((?:[\w\d]+\:\/\/)?(?:[\w\-\d]+\.)+[\w\-\d]+(?:\/[\w\-\d]+)*(?:\/|\.[\w\-\d]+)?(?:\?[\w\-\d]+\=[\w\-\d]+\&?)?(?:\#[\w\-\d]*)?)/';
$replace = '$1';
$msg = preg_replace( $pattern , $replace , $msg );
return stripslashes( utf8_encode( $msg ) );
}
This is the working code for this:
<?php
$sad222="somthing text bla bla bla ...... Https://cdn.fileinfo.com/img/ss/lg/jpg_44.JPG this is my picture.";
$d11="";$cs11 = array();$i=-1;
$sad111 = explode(" ",$sad222);
foreach ($sad111 as $sad)
{
if(strtolower(substr($sad,0,7))=="http://"||strtolower(substr($sad,0,7))=="ftps://"||strtolower(substr($sad,0,8))=="https://"||strtolower(substr($sad,0,6))=="ftp://"){
if(strtolower(substr($sad,strlen($sad)-4,4))==".jpg"||strtolower(substr($sad,strlen($sad)-4,4))==".jpe"||strtolower(substr($sad,strlen($sad)-4,4))==".jif"||strtolower(substr($sad,strlen($sad)-4,4))==".jfi"||strtolower(substr($sad,strlen($sad)-4,4))==".gif"||strtolower(substr($sad,strlen($sad)-4,4))==".png"||strtolower(substr($sad,strlen($sad)-4,4))==".bmp"||strtolower(substr($sad,strlen($sad)-4,4))==".dib"||strtolower(substr($sad,strlen($sad)-4,4))==".ico"||strtolower(substr($sad,strlen($sad)-5,5))==".jpeg"||strtolower(substr($sad,strlen($sad)-5,5))==".jfif"||strtolower(substr($sad,strlen($sad)-5,5))==".apng"||strtolower(substr($sad,strlen($sad)-5,5))==".tiff"||strtolower(substr($sad,strlen($sad)-4,4))==".tif"){
$d11="<img src='".$sad."' width='500' height='600'>";
$sad=$d11;}}$i++;
$cs11[$i]=$sad." ";
}
foreach ($cs11 as $dimz)
{
echo $dimz;
}
?>
I have a function that will add the <a href> tag before a link and </a> after the link. However, it breaks for some webpages. How would you improve this function? Thanks!
function processString($s)
{
// check if there is a link
if(preg_match("/http:\/\//",$s))
{
print preg_match("/http:\/\//",$s);
$startUrl = stripos($s,"http://");
// if the link is in between text
if(stripos($s," ",$startUrl)){
$endUrl = stripos($s," ",$startUrl);
}
// if link is at the end of string
else {$endUrl = strlen($s);}
$beforeUrl = substr($s,0,$startUrl);
$url = substr($s,$startUrl,$endUrl-$startUrl);
$afterUrl = substr($s,$endUrl);
$newString = $beforeUrl."".$url."".$afterUrl;
return $newString;
}
return $s;
}
function processString($s) {
return preg_replace('/https?:\/\/[\w\-\.!~#?&=+\*\'"(),\/]+/','$0',$s);
}
It breaks for all URLs that contain "special" HTML characters. To be safe, pass the three string components through htmlspecialchars() before concatenating them together (unless you want to allow HTML outside the URL).
function processString($s){
return preg_replace('#((https?://)?([-\w]+\.[-\w\.]+)+\w(:\d+)?(/([-\w/_\.]*(\?\S+)?)?)*)#', '$1', $s);
}
Found it here