Replace all URLs in an HTML file with regex in PHP

I want to save an external HTML file using PHP.
How can I search and replace all URLs with a regex in PHP? For example:
href="/web/***/http://blog.domain.com/site/styles-site.css"
where *** is a dynamic code that is not known in advance, should be replaced with
href="site/styles-site.css"
My code:
$html=$url;
$content = file_get_contents($html);
$newhtml = preg_replace( 'web/-[^-.]*\./' , '.' , $content);
file_put_contents('post1.html', $newhtml);
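A minimal sketch of one way to do this, assuming the unknown part never contains a slash and the archived URL always has a path after the host name. The function name strip_wayback_prefix and the sample string are made up for illustration. Note that preg_replace() needs pattern delimiters (~ here), which the attempt above is missing.

```php
<?php
// Hypothetical helper name; not from the original question.
function strip_wayback_prefix(string $html): string
{
    // ~ is the delimiter, so the slashes in the pattern need no escaping.
    //   /web/       literal prefix
    //   [^/"']+     the dynamic code (anything up to the next slash)
    //   https?://   scheme of the archived URL
    //   [^/"'\s]+   host name
    //   /           slash that ends the host part
    $pattern = '~/web/[^/"\']+/https?://[^/"\'\s]+/~i';

    // preg_replace() returns null on error; fall back to the input in that case.
    return preg_replace($pattern, '', $html) ?? $html;
}

$sample = '<link href="/web/20160312123456/http://blog.domain.com/site/styles-site.css">';
echo strip_wayback_prefix($sample), "\n";
// prints: <link href="site/styles-site.css">
```

In the script from the question, this would sit between file_get_contents() and file_put_contents(). A URL that consists of the host name only (no trailing slash) is left unchanged by this pattern.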

Related

PHP convert a URL into clickable link but without image src

I want to convert a string in PHP. This string may contain URL or image tags or other tags, but I don't want to convert image tags src value into a link. For example:
We have a link https://youtube.com/watch/8374h87shdv which needs to be converted but this is not to be <image class="emoji" alt="emoji" src="https://icloud.com/png/sdsdv234f.png"
The above string needs to be converted but without src URL.
I am using this currently:
function convert_strings( $content ){
    $url = '~(?:(https?)://([^\s<]+)|(www\.[^\s<]+?\.[^\s<]+))(?<![\.,:])~i';
    $content = preg_replace($url, '$0', $content);
    $content = preg_replace('/(?<!\S)#([0-9a-zA-Z]+)/', '#$1', $content);
    $content = convert_smilies( $content );
    return $content;
}
But this converts every URL, including the one in the src attribute. How can I achieve this?
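One possible way to skip URLs inside tags, sketched under assumptions: the PCRE verbs (*SKIP)(*FAIL) make the engine consume every complete tag (anything from < to >) and then discard it, so only URLs in plain text reach the second alternative. The function name linkify_outside_tags and the anchor markup in the replacement are assumptions, and the sample tag is closed with > so that it counts as a tag.

```php
<?php
// Hypothetical helper; the <a> markup in the replacement is an assumption.
function linkify_outside_tags(string $content): string
{
    // First alternative: a whole tag, matched and then thrown away via (*SKIP)(*FAIL).
    // Second alternative: a URL in plain text that does not end in . , or :
    $pattern = '~<[^>]*>(*SKIP)(*FAIL)|https?://[^\s<]+(?<![.,:])~i';

    return preg_replace($pattern, '<a href="$0">$0</a>', $content) ?? $content;
}

$text = 'Link https://youtube.com/watch/8374h87shdv and '
      . '<img class="emoji" alt="emoji" src="https://icloud.com/png/sdsdv234f.png">';
echo linkify_outside_tags($text), "\n";
```

Only the YouTube URL is wrapped in a link; the src value is untouched because it lies inside a tag. A URL that already appears as the text of an existing link would be wrapped again, so this sketch suits input that has not been linkified yet.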

how to replace url path in html file

How to find and replace all URL paths in an HTML file? I have an HTML file with links from Wayback Machine, like these:
"/web/2016***/http://blog.mydomain.com/archive/img.jpg"
"/web/2016***/http://blog.mydomain.com/archive/img2.jpg"
"/web/2016***/http://blog.mydomain.com/archive/page2.html"
The 2016*** part is dynamic. How do I extract these elements:
"/archive/img.jpg"
"/archive/img2.jpg"
"/archive/page2.html"
I have tried:
$html = $url;
$content = file_get_contents($html);
$newhtml = preg_replace( 'web/-[^-.]*\./' , '/' , $content);
file_put_contents('post1.html', $newhtml);

Try this regular expression: \/web.*blog\.mydomain\.com(.*)
In PHP the pattern also needs delimiters, and the result has to be assigned:
$newhtml = preg_replace('/\/web.*blog\.mydomain\.com(.*)/', '\1', $content);
Check it out in action: https://regex101.com/r/m5ZaRo/3
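Since the question asks how to extract the paths, here is a small sketch using preg_match_all(), assuming each URL sits inside double quotes and the dynamic code contains no slash. The sample markup and the "2016abc"-style codes are invented for the example.

```php
<?php
// Sample markup modeled on the question; the codes after /web/ are invented.
$content = <<<'HTML'
<img src="/web/2016abc/http://blog.mydomain.com/archive/img.jpg">
<img src="/web/2016def/http://blog.mydomain.com/archive/img2.jpg">
<a href="/web/2016ghi/http://blog.mydomain.com/archive/page2.html">next</a>
HTML;

// Group 1 captures the path after the host name, up to the closing quote.
$pattern = '~/web/[^/"]+/https?://[^/"]+(/[^"]*)~';

// Extract the paths only.
preg_match_all($pattern, $content, $matches);
print_r($matches[1]);

// Or rewrite the document so that each URL becomes just its path.
$newhtml = preg_replace($pattern, '$1', $content);
echo $newhtml, "\n";
```

Unlike a greedy .* pattern, the negated character classes stop at the next slash or quote, so several URLs on one line are handled separately.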

Replace path and quotes with new url in file

I have a text file with the following data:
"K:\data\etrdtCfhbr6MUkAAFuVw.jpg"
"K:\data\rgtdrCfhbr6OUYAE5lNR.jpg"
"K:\data\Cfhbr6VVIrdtdAAmRPr.jpg"
"K:\data\Cffh-EyWsersetsAQ8eIz.jpg"
I want to replace the quotes and part of the path to get an output like this:
http://myweb.com/dat/etrdtCfhbr6MUkAAFuVw.jpg
http://myweb.com/dat/rgtdrCfhbr6OUYAE5lNR.jpg
http://myweb.com/dat/Cfhbr6VVIrdtdAAmRPr.jpg
http://myweb.com/dat/Cffh-EyWsersetsAQ8eIz.jpg
Right now I have some (pseudo)code where I don't know how to get it working correctly:
$filecontents = file_get_contents('/path/to/file.txt'); //Load the file contents
$newcontent = preg_replace('.....', $filecontents); //Use a regex to replace the stuff as I want
file_put_contents('/path/to/file.txt', $newcontent);
I highlighted the code where I'm stuck right now.

You can get your file as a string with file_get_contents(). Then you can use preg_replace() to replace each path as you want. And with file_put_contents() you simply save the file back.
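The regex /\".*\\\\(.*?)\"/m breaks down as follows (shown as written inside a double-quoted PHP string):
\".*\\\\ matches a double quote (\"), then everything (.*) up to the last backslash (\\\\) on the line
(.*?)\" captures everything ((.*?)) up to the closing double quote (\")
m is the multiline modifier; it only changes how ^ and $ behave, so it has no effect on this pattern. The regex already works line by line because . does not match a newline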
Code
<?php
$file = file_get_contents("test.txt");
$file = preg_replace("/\".*\\\\(.*?)\"/m", "http://myweb.com/dat/$1", $file);
file_put_contents("test.txt", $file);
?>
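To try the pattern without touching a file, the following sketch runs it on an in-memory string. It uses two of the sample lines from the question, writes the pattern in single quotes so that the quotes need no escaping, and leaves out the m modifier, since . stops at line breaks anyway. The variable names are arbitrary.

```php
<?php
// Two sample lines from the question, kept in a nowdoc so backslashes stay literal.
$text = <<<'TXT'
"K:\data\etrdtCfhbr6MUkAAFuVw.jpg"
"K:\data\Cffh-EyWsersetsAQ8eIz.jpg"
TXT;

// In a single-quoted PHP string, \\\\ reaches the regex engine as \\,
// which matches one literal backslash.
$result = preg_replace('/".*\\\\(.*?)"/', 'http://myweb.com/dat/$1', $text);
echo $result, "\n";
// prints:
// http://myweb.com/dat/etrdtCfhbr6MUkAAFuVw.jpg
// http://myweb.com/dat/Cffh-EyWsersetsAQ8eIz.jpg
```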

file_get_contents() skipping text between <> tag

I am trying to read a .tsv file using PHP. I am using the simplest method of file_get_contents() but it is skipping any text between <> tags.
Following is the format of my .tsv file
<id_svyx35_88c_avbfa5> <Kuldeep_Raval> rdf:type <wikicat_Delhi_Daredevils_cricketers>
Following is the code I am using
$filename = "access_s.tsv";
$content = file_get_contents($filename);
//Split file into lines
$lines = explode("\n", $content);
echo $content;
On reading it, the output is just
rdf:type
How can I read and display each line exactly as it is?

Try to apply htmlspecialchars() to $content:
$filename = "access_s.tsv";
$content = htmlspecialchars(file_get_contents($filename));
//Split file into lines
$lines = explode("\n", $content);
echo $content;
See the htmlspecialchars() reference on php.net.
The tags have always been there; the browser just does not display them, because it treats anything between < and > as an HTML tag. As with any HTML tag, you can see them by viewing the page source.
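A short sketch of the same idea that escapes only at output time, so the raw values stay usable for further processing. It assumes the fields are separated by tabs, as the .tsv extension suggests; the sample line is taken from the question.

```php
<?php
// Sample line from the question; in practice it would come from the .tsv file.
$line = "<id_svyx35_88c_avbfa5>\t<Kuldeep_Raval>\trdf:type\t<wikicat_Delhi_Daredevils_cricketers>";

// The raw string still holds all four fields, angle brackets included.
$fields = explode("\t", $line);
echo count($fields), "\n"; // prints 4

// Escape each field only when printing it into an HTML page.
foreach ($fields as $field) {
    echo htmlspecialchars($field, ENT_QUOTES, 'UTF-8'), "\n";
}
```

Each bracketed field is printed in the form &lt;Kuldeep_Raval&gt;, which the browser renders as <Kuldeep_Raval>.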

php strip_tags to allow comment

I need to strip all html tags but retain comment lines to extract for info.
Is it even possible?
$content = strip_tags($content, '<!-->');
This doesn't work, and I have tried a few different variants.

You can protect the comments before stripping the tags, using the following code:
// create a random string for using in replace strings
$random = strtoupper(dechex(rand(0,10000000000)));
// replace comment starts
$html = preg_replace('/<!--/', '#MARKER-START-'. $random.'#', $html);
// replace comment ends
$html = preg_replace('/-->/', '#MARKER-END-'. $random.'#', $html);
// strip all html tags
$html = strip_tags($html);
// replace back comment starts
$html = preg_replace('/#MARKER-START-'. $random.'#/', '<!--', $html);
// replace back comment ends
$html = preg_replace('/#MARKER-END-'. $random.'#/', '-->', $html);

Instead of using strip_tags() use this regular expression:
$szRetVal = preg_replace( '%</?[a-z][a-z0-9]*[^<>]*>%sim','',$szHTML );
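A quick way to check that regex, as a sketch: the wrapper name strip_tags_keep_comments and the sample markup are made up. The pattern requires a letter right after < or </, so <!-- ... --> never matches.

```php
<?php
// Hypothetical wrapper around the regex from the answer above.
function strip_tags_keep_comments(string $html): string
{
    // A tag must start with a letter after < or </, so comments are left alone.
    return preg_replace('%</?[a-z][a-z0-9]*[^<>]*>%sim', '', $html) ?? $html;
}

echo strip_tags_keep_comments('<p>Hello <!-- keep me --> <b>world</b></p>'), "\n";
// prints: Hello <!-- keep me --> world
```

One caveat: tags written inside a comment are stripped as well, because the pattern does not know where a comment starts or ends.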
