I use this code:
<?php
$texthtml = '<p>test</p><br><p><img src="1.jpeg" alt=""><br></p><p><img src="2.png" alt=""><br><img src="3.png" alt=""></p>';
preg_match('/<img.+src=[\'"](?P<src>.+?)[\'"].*>/i', $texthtml, $image);
echo $image['src'];
?>
However, when I test it, I get the last image (3.png) from the string.
How can I get the first image (1.jpeg) instead?
Try:
preg_match('/<img(?: [^<>]*?)?src=([\'"])(.*?)\1/', $texthtml, $image);
// Group 1 captures the quote character; the src value is in group 2.
echo isset($image[2]) ? $image[2] : 'default.png';
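For context, the pattern in the question returns 3.png because the .+ right after <img is greedy: it runs to the end of the string and then backtracks only as far as the last src=. A minimal sketch of the difference follows. The second pattern is an illustrative alternative, not taken from the answers here, and it assumes no attribute value contains a > character.

```php
<?php
$texthtml = '<p>test</p><br><p><img src="1.jpeg" alt=""><br></p><p><img src="2.png" alt=""><br><img src="3.png" alt=""></p>';

// Greedy .+ skips ahead to the last src= in the string.
preg_match('/<img.+src=[\'"](?P<src>.+?)[\'"].*>/i', $texthtml, $greedy);

// [^>]+? is lazy and cannot leave the current tag, so the first src= wins.
preg_match('/<img[^>]+?src=[\'"](?P<src>.+?)[\'"]/i', $texthtml, $lazy);

echo $greedy['src'], "\n"; // 3.png
echo $lazy['src'], "\n";   // 1.jpeg
```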
Regex is not well suited to parsing HTML tags.
You can read about it here: RegEx match open tags except XHTML self-contained tags
I suggest DOMDocument if the markup is more complex than what you have shown here.
If it is not more complex than this, I suggest strpos to find the marker and substr to cut out the value.
$texthtml = '<p>test</p><br><p><img src="1.jpeg" alt=""><br></p><p><img src="2.png" alt=""><br><img src="3.png" alt=""></p>';
$search = 'img src="';
$pos = strpos($texthtml, $search) + strlen($search); // find the position of img src=" and add its length
$length = strpos($texthtml, '"', $pos) - $pos; // find the closing " and subtract $pos to get the length of the image name
echo substr($texthtml, $pos, $length); // 1.jpeg
https://3v4l.org/48iiI
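The DOMDocument route suggested above could look roughly like this. It is only a sketch: firstImageSrc() is a hypothetical helper name, and wrapping the fragment in a <body> element is an assumption about how the input should be fed to the parser.

```php
<?php
// Hypothetical helper: returns the src of the first <img>, or null if there is none.
function firstImageSrc(string $html): ?string
{
    $dom = new DOMDocument();

    // Collect parser warnings internally instead of emitting them for fragment markup.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML('<body>' . $html . '</body>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Elements are returned in document order, so item(0) is the first image.
    $img = $dom->getElementsByTagName('img')->item(0);

    return $img !== null ? $img->getAttribute('src') : null;
}

$texthtml = '<p>test</p><br><p><img src="1.jpeg" alt=""><br></p><p><img src="2.png" alt=""><br><img src="3.png" alt=""></p>';
echo firstImageSrc($texthtml) ?? 'default.png'; // 1.jpeg
```

Unlike the strpos version, this does not depend on src being the first attribute or on the exact spacing inside the tag.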
I have a string $content which looks like this:
<h1>Or Any Other tags except img or nothing</h1>
...
<img src="{{media url="image_name.png"}}" alt="image_test" />
...
<h1>Or Any Other tags except img or nothing</h1>
So the minimal content of the string is:
<img src="{{media url="dynamic_image_name.png"}}" alt="dynamic_image_test_alt" />
What I want is a way to extract this specific line, alter it, and replace it with a new one.
As a first step I wrote this:
protected function getStringBetween($str,$from,$to)
{
$sub = substr($str, strpos($str,$from)+strlen($from),strlen($str));
return substr($sub,0,strpos($sub,$to));
}
Using " as both the from and to arguments gets me the filename, which is enough to generate what I want.
I would like to do something like this:
$generatedContent = "<b>Hi test</b>";
$newContent = alterateContent($content,$generatedContent);
And the $newContent output needs to be:
<h1>Or Any Other tags except img or nothing</h1>
...
<b>Hi test</b>
...
<h1>Or Any Other tags except img or nothing</h1>
I would rarely recommend using regular expressions to parse HTML. In your case, though, the goal is to alter something stored in the database, and parsing the HTML and saving it again might accidentally change other things you want left untouched, such as the formatting.
So here's a simple solution using regex:
function alterateContent(string $html, string $imageFileName, string $replacement): string
{
$imageFileName = preg_quote($imageFileName, '/');
return preg_replace(
"/<img\h+src=\"{{media url=\"{$imageFileName}\"}}\".*?\/>/",
$replacement,
$html
);
}
Usage:
$newContent = alterateContent($yourHtmlString, 'image_name.png', '<b>Hi test</b>');
Note: this assumes the src attribute is always the first attribute of the image.
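If src is not guaranteed to be the first attribute, the pattern can be relaxed. The following is a sketch under assumptions, not the answer's own code: replaceMediaImage() is a hypothetical name, the markup is assumed to contain literal double quotes inside {{media url="..."}} as shown in the question, and no attribute value may contain a > character.

```php
<?php
// Hypothetical variant of alterateContent(): src may appear anywhere in the tag.
function replaceMediaImage(string $html, string $imageFileName, string $replacement): string
{
    $pattern = '/<img\b[^>]*?\bsrc="\{\{media url="'
        . preg_quote($imageFileName, '/')
        . '"\}\}"[^>]*>/';

    // A callback returns the replacement verbatim, so "$1" or "\1" inside it
    // is not interpreted as a backreference.
    return preg_replace_callback(
        $pattern,
        function (array $match) use ($replacement): string {
            return $replacement;
        },
        $html
    );
}

$content = "<h1>Title</h1>\n<img alt=\"image_test\" src=\"{{media url=\"image_name.png\"}}\" />\n<h1>Footer</h1>";
echo replaceMediaImage($content, 'image_name.png', '<b>Hi test</b>');
```

Everything outside the matched tag is left byte-for-byte unchanged, which is the reason for using a regex here in the first place.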
You can simply use preg_replace() for that, like this:
$newstring = preg_replace('~<img.*~','<b>Hi test</b>',$oldstring);
Without the s modifier, . does not match newline characters, so the match stops at the end of the line. This makes it work fine for replacing a single line in place.
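A small illustration of that point, using a made-up three-line string: without the s modifier only the rest of the img line is replaced, while with it everything up to the end of the string is swallowed.

```php
<?php
$oldstring = "<h1>a</h1>\n<img src=\"x.png\" alt=\"t\" />\n<h1>b</h1>";

// Default: . stops at the newline, so only the img line is replaced.
$lineOnly = preg_replace('~<img.*~', '<b>Hi test</b>', $oldstring);

// With s: . also matches newlines, so the trailing <h1> is lost too.
$toTheEnd = preg_replace('~<img.*~s', '<b>Hi test</b>', $oldstring);

echo $lineOnly, "\n---\n", $toTheEnd, "\n";
```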
If you need to replace the img with an exact src, you can do it like this (preg_quote() escapes characters such as . that are special in a regex):
$newstring = preg_replace('~<img src="'.preg_quote($img_source, '~').'".*~','<b>Hi test</b>',$oldstring);
If your source is only a filename without a path, and the img tag includes the path, you can use this:
$newstring = preg_replace('~<img src=".*?'.preg_quote($img_file, '~').'".*~','<b>Hi test</b>',$oldstring);
I want to replace all anchor tags within a text with their href value, but my pattern does not work right.
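$str = 'This is a text with multiple anchor tags. This is the first one: <a class="funky-style" href="https://www.link1.com/" title="Link 1">Link 1</a> and this one the second: <a class="funky-style" href="https://www.link2.com/" title="Link 2">Link 2</a> after that a lot of other text. And here the 3rd one: <a class="funky-style" href="https://www.link3.com/" title="Link 3">Link 3</a> Some other text.';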
$test = preg_replace("/<a\s.+href=['|\"]([^\"\']*)['|\"].*>[^<]*<\/a>/i",'\1', $str);
echo $test;
At the end the text should look like this:
This is a text with multiple anchor tags. This is the first one: https://www.link1.com/ and this one the second: https://www.link2.com/ after that a lot of other text. And here the 3rd one: https://www.link3.com/ Some other text.
Thank you very much!
Just don't.
Use a parser instead.
$dom = new DOMDocument();
// since you have a fragment, wrap it in a <body>
$dom->loadHTML("<body>".$str."</body>");
$links = $dom->getElementsByTagName("a");
while($link = $links[0]) {
$link->parentNode->insertBefore(new DOMText($link->getAttribute("href")),$link);
$link->parentNode->removeChild($link);
}
$result = $dom->saveHTML($dom->getElementsByTagName("body")[0]);
// remove <body>..</body> wrapper
$output = substr($result, strlen("<body>"), -strlen("</body>"));
In case you're still set on regex, this should work:
preg_replace("/<a\s[^\>]*?href=['\"]([^'\"]+)['\"][^\>]*>[^<]+<\/a>/i",'$1', $str);
But you're probably better off with a solution like what Andreas posted.
FYI: the reason your previous regex didn't work was this little number:
.*>
Because . matches any character and * is greedy, you ended up matching everything past the URL to be replaced, all the way to the last closing anchor tag. This is why it appeared to select and replace only the first anchor tag it found and cut off the rest.
Changing that to
[^\>]*
ensures that this part of the match is confined to the text between the URL and the closing bracket of the opening a tag.
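The effect is easy to reproduce on a short made-up string. This sketch uses simplified patterns that assume href is the first attribute and is double-quoted; only the part after the URL differs between the two.

```php
<?php
$html = 'x <a href="u1">A</a> y <a href="u2">B</a> z';

// Greedy .*> runs to the last > that is still followed by link text and </a>.
$greedy = preg_replace('/<a\s+href="([^"]+)".*>[^<]*<\/a>/i', '$1', $html);

// [^>]*> cannot cross the end of the opening tag.
$bounded = preg_replace('/<a\s+href="([^"]+)"[^>]*>[^<]*<\/a>/i', '$1', $html);

echo $greedy, "\n";  // x u1 z
echo $bounded, "\n"; // x u1 y u2 z
```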
Perhaps not simpler, but safer: loop over the string with strpos to find each anchor tag, then cut the surrounding HTML away with substr.
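$str = 'This is a text with multiple anchor tags. This is the first one: <a class="funky-style" href="https://www.link1.com/" title="Link 1">Link 1</a> and this one the second: <a class="funky-style" href="https://www.link2.com/" title="Link 2">Link 2</a> after that a lot of other text. And here the 3rd one: <a class="funky-style" href="https://www.link3.com/" title="Link 3">Link 3</a> Some other text.';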
$pos = strpos($str, '<a');
while($pos !== false){
// Find start of html and remove up to link (<a href=")
$str = substr($str, 0, $pos) . substr($str, strpos($str, 'href="', $pos)+6);
// Find end of link and remove that.(" title="Link 1">Link 1</a>)
$str = substr($str, 0, strpos($str,'"', $pos)) . substr($str, strpos($str, '</a>', $pos)+4);
// Find next link if possible
$pos = strpos($str, '<a');
}
echo $str;
https://3v4l.org/vdN7E
Edited to handle a different attribute order in the a tag.
If you want to replace a tags with href values you can do:
$post = preg_replace("/<a.*?href=\"(.*?)\".*?>(.*?)<\/a>/","$1",$post);
If you want to replace with text values:
$post = preg_replace("/<a.*?href=\"(.*?)\".*?>(.*?)<\/a>/","$2",$post);
So I'm trying to hack off a few characters at the end of a URL I'm getting from a preg_replace function. However it doesn't seem to be working. I'm not familiar with using these variables in preg_replace (it was just something I found that "mostly" worked).
Here's my attempt:
function addlink_replace($string) {
$pattern = '/<ul(.*?)class="slides"(.*?)<img(.*?)src="(.*?)"(.*?)>(.*?)<\/ul>/is';
$URL = substr($4, 0, -8);;
$replacement = '<ul$1class="slides"$2<a rel=\'shadowbox\' href="'.$URL.'"><img$3src="$4"$5></a>$6</ul>';
return preg_replace($pattern, $replacement, $string);
}
add_filter('the_content', 'addlink_replace', 9999);
Basically I need to remove the last bit of my .jpg file name, so I can show the LARGE image rather than the THUMBNAIL it's generating, but the "$4" doesn't seem to want to be manipulated.
This answer is based on what you're trying to accomplish in this question, using the HTML structure from your other question. The regex posted in your question will only match the first set of <li> and <img> tags, and you've indicated that you need to match all <li> and <img> tags within a <ul>, so I've written a larger function to do so.
It wraps every <img> tag that sits inside an <li> within a <ul> with the class slides in an <a>. The link's href is the image's URL with the -110x110 string removed, while the thumbnail source in the <img> tag is preserved.
function addlink_replace($string) {
$new_ul_block = '';
$ul_pattern = '/<ul(.*?)class="slides"(.*?)>(.*?)<\/ul>/is';
$img_pattern = '/<li(.*?)>(.*?)<img(.*?)src="(.*?)"(.*?)>(.*?)<\/li>/is';
preg_match($ul_pattern, $string, $ul_matches);
if (!empty($ul_matches[3]))
{
preg_match_all($img_pattern, $ul_matches[3], $img_matches);
if (!empty($img_matches[0]))
{
$new_ul_block .= "<ul{$ul_matches[1]}class=\"slides\"{$ul_matches[2]}>";
foreach ($img_matches[0] as $id => $img)
{
$new_img = str_replace('-110x110', '', $img_matches[4][$id]);
$new_ul_block .= "<li{$img_matches[1][$id]}>{$img_matches[2][$id]}<a href=\"{$new_img}\">";
$new_ul_block .= "<img{$img_matches[3][$id]}src=\"{$img_matches[4][$id]}\"{$img_matches[5][$id]}></a>{$img_matches[6][$id]}</li>";
}
$new_ul_block .= "</ul>";
}
}
if (!empty($new_ul_block))
{
$replace_pattern = '/<ul.*?class="slides".*?>.*?<\/ul>/is';
return preg_replace($replace_pattern, $new_ul_block, $string);
}
else
{
return $string;
}
}
The <a>'s href is derived from the image's src specifically on the line
$new_img = str_replace('-110x110', '', $img_matches[4][$id]);
if you would like to modify it. If you need to remove anything other than -110x110 from the URL, you may need to change it from str_replace to preg_replace. If you want to remove a specific number of characters from the end of the URL, you could use substr:
$new_img = substr($img_matches[4][$id], 0, -12);
Where -12 is the number of characters you want to remove from the end of the string (it's negative because it counts from the end).
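The preg_replace option mentioned above could be sketched like this. It assumes the thumbnail suffix always has the form -WIDTHxHEIGHT directly before the file extension, which is a guess based on the -110x110 in the question. stripSizeSuffix() and the example.com URLs are hypothetical.

```php
<?php
// Hypothetical helper: removes a trailing "-<width>x<height>" before the extension.
function stripSizeSuffix(string $url): string
{
    // The lookahead keeps the extension in place and anchors the match to the end.
    return preg_replace('/-\d+x\d+(?=\.[a-z0-9]+$)/i', '', $url);
}

echo stripSizeSuffix('http://example.com/uploads/photo-110x110.jpg'), "\n"; // http://example.com/uploads/photo.jpg
echo stripSizeSuffix('http://example.com/uploads/photo.jpg'), "\n";         // unchanged
```

Unlike substr with a fixed negative length, this keeps the extension and leaves URLs without a size suffix alone.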
I've posted a working example of this function here.
You may want to consider modifying the source of what is generating this code block, rather than using this regex, as this regex may be hard to maintain in the future if the HTML structure changes.
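As for why substr($4, 0, -8) in the question cannot work: $4 only has meaning inside the replacement string that preg_replace itself processes, so PHP code never sees the captured value. preg_replace_callback passes the captures to a function instead. A reduced sketch of that idea follows. wrapThumbnails() is a hypothetical name, and for brevity it wraps every img in the input rather than only those inside ul.slides.

```php
<?php
// Hypothetical helper: wraps each <img> in a link to the image without "-110x110".
function wrapThumbnails(string $html): string
{
    return preg_replace_callback(
        '/<img\b[^>]*?\bsrc="([^"]+)"[^>]*>/i',
        function (array $m): string {
            // $m[1] is the captured src, available here as an ordinary PHP string.
            $large = str_replace('-110x110', '', $m[1]);
            // $m[0] is the whole original tag, kept unchanged inside the link.
            return '<a rel="shadowbox" href="' . $large . '">' . $m[0] . '</a>';
        },
        $html
    );
}

echo wrapThumbnails('<li><img src="a-110x110.jpg" alt=""></li>');
// <li><a rel="shadowbox" href="a.jpg"><img src="a-110x110.jpg" alt=""></a></li>
```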
I have the results from a database that contains images and text. I would like to remove the first image.
Example:
$string = 'This is some text and images
<img src="http://linktoimage" border="0">,
more text and <img src="http://linktoanotherimage"> and so on';
I would like to remove the first image, it is not always the same url.
Thanks for any help.
$string = 'This is some text and images
<img src="http://linktoimage" border="0">,
more text and <img src="http://linktoanotherimage"> and so on';
print preg_replace('/<img(.*)>/i','',$string,1);
The above should return
This is some text and images
,
more text and <img src="http://linktoanotherimage"> and so on
Assuming you know it'll be prefixed by spaces and a line break, and suffixed by a comma and line break (and you want to remove these, too), you can do
print preg_replace("/\n <img(.*)>\,\n /i",'',$string,1);
Which will give you
This is some text and images more text and <img src="http://linktoanotherimage"> and so on
There was a great answer on another Thread
function get_first_image($html){
require_once('SimpleHTML.class.php');
$post_dom = str_get_dom($html);
$first_img = $post_dom->find('img', 0);
if($first_img !== null) {
return $first_img->src;
}
return null;
}
You can do it with a regular expression, although regex isn't really suited for this.
$var = 'This is some text and images
<img src="http://linktoimage" border="0">,
more text and <img src="http://linktoanotherimage"> and so on';
echo preg_replace('/<img.*?>/', '123', $var, 1);
This should do it. The ? after .* makes the quantifier lazy (ungreedy), so the match ends at the first > instead of the last one.
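For comparison, a parser-based version that avoids regex altogether might look like the sketch below. removeFirstImage() is a hypothetical name, and the fragment is wrapped in a <body> element as an assumption about the input. Because the markup is re-serialized, whitespace and entity encoding may come back slightly different from the original.

```php
<?php
// Hypothetical helper: removes the first <img> element from an HTML fragment.
function removeFirstImage(string $html): string
{
    $dom = new DOMDocument();

    // Collect parser warnings internally instead of emitting them for fragment markup.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML('<body>' . $html . '</body>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $img = $dom->getElementsByTagName('img')->item(0);
    if ($img !== null) {
        $img->parentNode->removeChild($img);
    }

    // Serialize only the children of <body> to get the fragment back.
    $out = '';
    foreach ($dom->getElementsByTagName('body')->item(0)->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

$string = 'Some text <img src="http://linktoimage" border="0">, more text and <img src="http://linktoanotherimage"> and so on';
echo removeFirstImage($string);
```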
Being a RegEx newbie, I tend to avoid it and use something like:
$i = strpos($string,'<img');
if ($i !== false) {
$j = strpos($string, '>', $i);
if ($j !== false) $string = substr($string,0,$i) . substr($string,$j+1);
}
$i is the offset of the < and serves as the cut-off point as is; $j+1 skips past the closing >.
Finally nailed it down:
$txt = preg_replace('/<img[\S\s]*?>/i','',$txt,1);
This one also works for cases when you might have:
$string = 'This is some text and images
<img
src="http://linktoimage" border="0">,
more text and <img src="http://linktoanotherimage"> and so on';
(a new line character in the tag.)
This code works for me. $html is the variable holding the HTML content from which you want to remove the first image tag.
$str = preg_replace('/(<img[^>]+>)/i','',$html,1);
For instance I have a string:
$string = '<div class="ImageRight" style="width:150px">';
which I want to transform into this:
$string = '<div class="ImageRight">';
I want to remove the portion style="width:150px" with preg_replace(), where the size 150 can vary, so the width can be 500px etc. as well.
Also, the last part of the class name varies as well, so the class can be ImageRight, ImageLeft, ImageTop etc.
So, how can I remove the style attribute completely from a string with the above-mentioned structure, where the only things that vary are the last portion of the class name and the width value?
EDIT: The ACTUAL string I have is an entire html document and I don't want to remove the style attribute from the entire html, only from the tags which match the string I've shown above.
I think this is what you're after...
$modifiedHtml = preg_replace('/<(div class="Image[^"]+") style="[^"]+">/i', '<$1>', $html);
Remove completely (for any pixel width):
$string = preg_replace("/ style=\"width:\d+px\"/", "", $string);
Replace:
$string = preg_replace("/style=\"width:\d+px\"/", "style=\"width:500px\"", $string);
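If you want to change the class suffix and the width rather than remove them, you can do it in two steps with
$place = 'Left';
$size = 500;
$string = preg_replace('/(?<=class="Image)\w+(?=")/',$place,$string);
$string = preg_replace('/(?<=style="width:)[0-9]+(?=px")/',$size,$string);
Note: (?<=...) is called a lookbehind and (?=...) a lookahead. Neither consumes characters, so only the text between them is replaced.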
How about:
$string = preg_replace('/(div class="Image.+?") style="width:.+?"/', "$1", $string);
Simple:
$string = preg_replace('/<div class="Image(.*?)".*?>/i', '<div class="Image$1">', $string);
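Since the actual input is an entire HTML document, a parser-based alternative may also be worth considering. The sketch below uses DOMDocument with an XPath query; removeImageDivStyles() is a hypothetical name, and the sample document is made up. Note that saveHTML() re-serializes the whole document, so formatting outside the matched tags can change, which the regex answers above avoid.

```php
<?php
// Hypothetical helper: strips style from every div whose class starts with "Image".
function removeImageDivStyles(string $html): string
{
    $dom = new DOMDocument();

    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    // Only divs with a class beginning with "Image" and a style attribute are touched.
    foreach ($xpath->query('//div[starts-with(@class, "Image")][@style]') as $div) {
        $div->removeAttribute('style');
    }

    return $dom->saveHTML();
}

$html = '<html><body><p style="color:red">text</p><div class="ImageRight" style="width:150px">a</div></body></html>';
$result = removeImageDivStyles($html);
echo $result;
```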