I am attempting to crawl through a file and insert "../../" at the beginning of every image path. Unfortunately, the script is timing out, and since it took only a few seconds to run before this line was added, I suspect it is not doing what I think it should. This is how I'm doing it:
$filedata = substr_replace(substr($filedata,$imageBeginning,1),"../../",$imageBeginning);
I am crawling entire HTML files to accomplish this, so I need an efficient solution. Any help is appreciated.
This is completely untested, but something like this:
$filedata = preg_replace('/(<img\s+.*?src=")([^"]*\.(?:jpg|png|bmp|gif).*?>)/', '$1../../$2', $filedata);
Explanation: You are making 2 captures in the regular expression. The first is everything from the start of the img tag to the start of the src attribute. The second is the src attribute value and everything after it. Then you just insert "../../" in the middle in the replacement.
http://php.net/manual/en/function.preg-replace.php
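For what it's worth, here is a minimal sketch of the two-capture idea wrapped in a function so it can be tried on a small string first. The function name prefix_image_paths and the sample markup are made up for illustration, and the pattern is a slightly stricter variant of the one above (it uses [^"]* so the path cannot run past its closing quote). It assumes double-quoted src attributes and a prefix without "$" or "\" in it.

```php
<?php
// Hypothetical helper: prepends $prefix to the src of every <img> whose path
// ends in a known image extension. Assumes double-quoted src attributes.
function prefix_image_paths(string $html, string $prefix = '../../'): string
{
    // Group 1: from "<img" up to and including src="
    // Group 2: the path and its closing quote; [^"]* stops at that quote
    $pattern = '/(<img\s[^>]*?src=")([^"]*\.(?:jpg|png|bmp|gif)")/i';

    return preg_replace($pattern, '${1}' . $prefix . '${2}', $html);
}

// Made-up sample input standing in for the contents of an HTML file
$filedata = '<p><img class="a" src="img/one.jpg" alt=""> <img src="img/two.png"></p>';
echo prefix_image_paths($filedata), PHP_EOL;
```

Because preg_replace makes a single pass over the string, this should stay fast even on whole HTML files.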
Related
I have a function called getContents(), which accepts a regex for the file names it finds.
I scan the js folder for javascript files, with the following two regex patterns:
$js['head'] = "/(\.head\.js\.php)|(\.head\.js)|(\.h.js)/";
$js['foot'] = "/(\.foot\.js\.php)|(\.foot\.js)|(\.f.js)|(\.js)^(\.head\.js)/";
I have a naming system whereby the file name determines where the JavaScript file gets loaded: in the <head> tag or in the footer of the HTML page. All files are loaded at the bottom of the page unless you specify otherwise (.head.js, for example).
A few days ago I noticed that the $js['foot'] array was also including .head.js files, causing them to be loaded twice. So I added ^(\.head\.js) and it worked: the .head.js files were no longer added to the footer array. I was quite pleased with myself, because I suck at regex. However, it now seems that standard .js files (any normal .js files) aren't being loaded into the $js['foot'] array. Why is this? If I remove the ^(\.head\.js) part, it loads them.
To be clear, I want the $js['foot'] array to load files ending with:
.foot.js.php
.foot.js
.f.js
.js
And IGNORE all:
.head.js.php
.head.js
.h.js
Can someone correct my regex above to do this? I thought the ^ operator was NOT, but I was wrong!
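Outside a character class, ^ is an anchor that matches the start of the string (or line), not a negation. With ^(\.head\.js) placed after (\.js), that alternative can never match, which is why plain .js files stopped being picked up.
You actually need a negative lookbehind assertion to stop the footer regex from matching .head.js: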
$js['head'] = '/\.head\.js(?:\.php)?|\.h\.js/';
$js['foot'] = '/\.foot\.js(?:\.php)?|(?<!\.head|\.h)\.js/';
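To check the lookbehind idea against the file names listed in the question, something like the sketch below can help. The classify_script function and the sample names are hypothetical, and I have added $ end anchors and escaped dots in the lookbehind (my own assumptions, since the question says the files "end with" those suffixes). Without the escaped dot, a name such as search.js would be rejected just because it ends in "h.js".

```php
<?php
// Hypothetical patterns: dots escaped, anchored at the end of the file name
$js = [
    'head' => '/\.head\.js(?:\.php)?$|\.h\.js$/',
    // (?<!\.head|\.h) rejects ".js" only when ".head" or ".h" sits right before it
    'foot' => '/\.foot\.js(?:\.php)?$|(?<!\.head|\.h)\.js$/',
];

// Returns 'head', 'foot', or 'none' for a given file name
function classify_script(string $name, array $patterns): string
{
    foreach (['head', 'foot'] as $slot) {
        if (preg_match($patterns[$slot], $name) === 1) {
            return $slot;
        }
    }
    return 'none';
}

// Made-up file names for illustration
foreach (['plugin.js', 'search.js', 'menu.f.js', 'menu.foot.js.php', 'init.head.js', 'init.h.js'] as $file) {
    echo $file, ' => ', classify_script($file, $js), PHP_EOL;
}
```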
I have spent the entire day trying to figure out how to get this code to only affect the first instance it runs across. Eventually, I learned about a negative lookback and tried to implement that.
I have tried every possible arrangement except, of course, the correct one. I discovered regex101, which is really cool, but ultimately didn't help me find the solution.
$content = preg_replace('/<img[^>]+./','', get_the_content_with_format());
This will be used in wordpress to strip out the first image on a page (moving it above the written content), but leave the rest in so that there can be images used in the post description.
Be easy on me, please. This is my first question here and I really am not a programmer.
Update: Because l’L'l asked, this is the entire chunk of relevant code.
<?php
//this will remove the images from the content editor
// it will not remove links from images, so if an image has a link, you will end up with an empty line.
$content = preg_replace('/<img[^>]+./','', get_the_content_with_format());
//this IF statement checks if $content has any value left after the images were removed
// If so, it will echo the div below it; if not, it won't do anything.
if($content != ""):?>
<div class="portfolio-box">
<?php echo do_shortcode( $content ) ?>
</div>
<?php endif; ?>
I’ve tried both of the solutions offered here but, for whatever reason, they didn’t work.
And, thank you guys very much for helping, by the way.
You could just anchor it at the beginning of the string (with ^), capture everything up to the first image (with (.*?)), and replace all of that with the content before the image:
$content = preg_replace('/^(.*?)<img[^>]+>/s','$1', get_the_content_with_format());
Note I also added the modifier s so that dot (.) matches newlines.
If you just want to replace the first occurrence of the regex match, pass 1 as the fourth parameter ($limit), which tells preg_replace to replace at most one match.
See http://php.net/manual/de/function.preg-replace.php
In your example, this would look like:
$content = preg_replace('/<img[^>]+./','', get_the_content_with_format(), 1);
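Here is a small self-contained sketch of the $limit approach, run on a literal string instead of get_the_content_with_format() so it can be tested outside WordPress. The function name strip_first_image and the sample HTML are made up, and the pattern ends in an explicit > rather than the "." from the question.

```php
<?php
// Hypothetical helper: removes only the first <img ...> tag from $html.
function strip_first_image(string $html): string
{
    // Fourth argument (limit) = 1: stop after the first replacement
    return preg_replace('/<img[^>]+>/', '', $html, 1);
}

// Made-up content standing in for the post body
$html = '<p><img src="a.jpg"> intro</p><p><img src="b.jpg"> more</p>';
echo strip_first_image($html), PHP_EOL;
```

If the first image also needs to be printed above the content, preg_match with the same pattern should return it in its match array before it is stripped.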
I've been fooling around for a while with regex. A few days ago I started modifying a regex pattern I found some time ago. It detects all hyperlinks, my version should only detect hyperlinks and not images.
http://domain.com/someimage.jpg
shouldn't be detected. But it does detect part of an image URL, and I don't know how to solve this.
The original regex:
/(https?)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,10}(\/\S*)?/i
Link to my version:
http://regexr.com/38rv9
Please help. Thanks!
You just need to match a whitespace character at the end.
/((https?)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,10}(\/(?:(\S(?!jpg|jpeg|png|gif))*))?)\s/ig
I would accomplish this by making sure what is clicked by the user does NOT end with an image file extension. You mention you are using php; have ONE condition statement that matches your original regex:
/(https?)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,10}(\/\S*)?/i
but does not match any common image file extension at the END of the expression:
/\.(?:jpg|jpeg|gif|png|tif)$/i
This would work for any text string that precedes the image file extension; preg_match will be useful to accomplish this.
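A rough sketch of that two-step check in PHP follows. The function name is_non_image_link is hypothetical, and I have added ^ and $ anchors to the link pattern so it has to match the whole string, which is an assumption on my part. Note that an image URL followed by a query string (for example ?size=large) would not be caught by the extension test.

```php
<?php
// Hypothetical helper: true when $url looks like a link and does not end
// in a common image file extension.
function is_non_image_link(string $url): bool
{
    // The original link pattern, anchored to the whole string (assumption)
    $linkPattern = '/^https?:\/\/[a-zA-Z0-9\-.]+\.[a-zA-Z]{2,10}(\/\S*)?$/i';
    // Image extension at the very end of the string
    $imagePattern = '/\.(?:jpg|jpeg|gif|png|tif)$/i';

    return preg_match($linkPattern, $url) === 1
        && preg_match($imagePattern, $url) !== 1;
}

var_dump(is_non_image_link('http://domain.com/someimage.jpg'));
var_dump(is_non_image_link('http://domain.com/somepage'));
```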
I'm stuck trying to make a regex in PHP that catches the link and its content from a html page (which I have no control over) and replaces it with a link of mine.
i.e.:
<a style="position:absolute;more_styles:more;" href="http://www.google.co.il/" class="something">This is the content</a>
Becomes:
<a style="position:absolute;more_styles:more;" href="my_function('http://www.google.co.il/')" class="something">This is the content</a>
This is the regex that I wrote:
$content = preg_replace('|<a(.*?)href=[\"\'](.*?)[\"\'][^>]*>(.*?)</a>|i','$3',$content);
This works well with all the links except links like:
<a href="http://google.co.il" onclick="if(MSIE_VER()>=4){this.style.behavior='url(#default#homepage)';this.setHomePage('http://www.google.co.il')}" class='brightgrey rightbar' style='font-size:12px'><b>Make me the home page!</b></a>
Obviously, the regexp stops at "MSIE_VER()>" because of the "[^>]*" part, and I get the wrong content when I use "$3".
I tried almost every option to make this work but no luck.
Any thoughts?
Thank you all in advance..
First of all, your code does something different from adding my_function: it removes the opening tag and replaces the whole link with the captured content ($3) only. There are several ways to achieve your declared goal (i.e. wrapping every href in my_function); the most pragmatic would be:
$content = preg_replace('|href=[\"\'](.*?)[\"\']|i',"href=\"my_function('$1')\"",$content);
If you need a more cautious approach, then I would use:
$content = preg_replace('|(<a.*?)href=[\"\'](.*?)[\"\'](.*?</a>)|i',"$1href=\"my_function('$2')\"$3",$content);
Last but not least, if you do need to remove the tag rather than what you have written, let me know; there are a million ways to do it.
By default .* will take everything it can, e.g. it takes the onclick argument because the regex is still valid. Replace "." with [^\"]: it tells the regex to take everything except " (which cannot appear in a URL).
$content = preg_replace('|<a(.*?)href=[\"\']([^"]*?)[\"\'][^>]*>(.*?)</a>|i','$3',$content);
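As far as I can tell, the "[^>]*" part mentioned in the question would still stop at a ">" inside an onclick value. One way around that, sketched below under my own assumptions, is to treat each quoted attribute value as a single unit, so that a ">" inside quotes does not end the tag. The function name link_contents is hypothetical, and the sample link is shortened from the one in the question.

```php
<?php
// Hypothetical helper: returns the inner content of every <a>...</a> in $html.
function link_contents(string $html): array
{
    // Inside the opening tag, consume either a complete "..." or '...' value,
    // or a single character that is neither a quote nor ">".
    $pattern = '~<a\b(?:[^>"\']|"[^"]*"|\'[^\']*\')*>(.*?)</a>~is';

    preg_match_all($pattern, $html, $matches);
    return $matches[1];
}

// Shortened version of the problematic link from the question
$html = '<a href="http://google.co.il" onclick="if(MSIE_VER()>=4){this.setHomePage(\'http://www.google.co.il\')}" '
      . "class='brightgrey rightbar'><b>Make me the home page!</b></a>";
print_r(link_contents($html));
```

For HTML you have no control over, a real parser such as PHP's DOMDocument is usually more robust than any regex.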
function has_thumbnail_image(&$post) {
$content = $post->post_content;
return preg_match('/<img[^>]+src="(.*?)"[^>]*>/', $content, $results);
}
I need a function that goes through a block of dynamically returned text and puts all of the images contained within into an array (or more specifically the image source of each image). The function above only gives me the first image and I cannot work out how to make this loop keep happening until all of the images are in the array. Any help on this would be much appreciated. Thanks
You may want to investigate preg_match_all. If I recall correctly, preg_match only searches for the first match and then stops.
You are very close! You just need preg_match_all instead of preg_match.
I don't know how well you know your source, but you might want to allow single quotes for the src attribute.
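Putting both suggestions together, a sketch could look like the following. The function name get_image_sources is made up, and it takes the content string directly rather than the $post object from the question. The (["\']) group with the \1 backreference accepts either quote style as long as the closing quote matches the opening one.

```php
<?php
// Hypothetical helper: returns the src value of every <img> tag in $content.
function get_image_sources(string $content): array
{
    // Group 1: the opening quote, group 2: the src value, \1: the same quote again
    preg_match_all('/<img[^>]+src=(["\'])(.*?)\1[^>]*>/i', $content, $matches);

    return $matches[2];
}

// Made-up sample content
$content = '<p><img src="a.jpg" alt=""><img class="x" src=\'b.png\'></p>';
print_r(get_image_sources($content));
```

With the default flags, preg_match_all groups results by capture group, so $matches[2] holds all src values in document order.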