I have this piece of code:
if (!$thumbdone) {
if (intval($fb_image_use_content)==1) {
$imgreg = '/<img .*src=["\']([^ ^"^\']*)["\']/';
preg_match_all($imgreg, trim($post->post_content), $matches);
if (isset($matches[1][0])) {
//There's an image on the content
$image=$matches[1][0];
$pos = strpos($image, site_url());
if ($pos === false) {
if (stristr($image, 'http://') || stristr($image, 'https://')) {
//Complete URL - offsite
$fb_image=$image;
} else {
$fb_image=site_url().$image;
}
} else {
//Complete URL - onsite
$fb_image=$image;
}
$thumbdone=true;
}
}
}
It finds the first image within the post content.
The problem is that I want to find the first image inside a div structured like this:
<div style="display:block" class="ogimage">
<img class="aligncenter wp-image-1030 size-full" src="http:/site.com/image.jpg" alt="image" width="600" height="315">
</div>
I searched Google for this:
https://www.google.it/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=php%20get%20image%20from%20div
I also experimented with http://www.phpliveregex.com/
but got nowhere. Can anyone help?
Use the right tool for the job instead of trying to parse HTML using a regular expression.
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
$img = $xpath->query('//div[@class="ogimage"]/img');
echo $img->item(0)->getAttribute('src');
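A slightly more defensive variant of the same XPath idea, as a sketch only: it keeps libxml from printing warnings about imperfect markup and returns null when no matching image exists. The helper name first_ogimage_src is made up for illustration, and the sample markup is a shortened version of the div from the question.

```php
<?php
// Hypothetical helper (the name is an assumption): returns the src of the first
// <img> inside <div class="ogimage">, or null if there is none.
function first_ogimage_src(string $html): ?string
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Match the class attribute exactly as it appears in the question's markup.
    $nodes = $xpath->query('//div[@class="ogimage"]//img');

    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return $nodes->item(0)->getAttribute('src');
}

$html = '<div style="display:block" class="ogimage">'
      . '<img class="aligncenter" src="http://site.com/image.jpg" alt="image">'
      . '</div>';

var_dump(first_ogimage_src($html));         // the image URL
var_dump(first_ogimage_src('<p>none</p>')); // NULL
```

In the original snippet the result could then be assigned to $fb_image only when it is not null.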
You can use explode like this:
$html = '<div style="display:block" class="ogimage"><img class="aligncenter wp-image-1030 size-full" src="http:/site.com/image.jpg" alt="image" width="600" height="315"></div>';
// Split on double quotes; with this exact markup the src value is the 8th piece (index 7)
$result = explode('"', $html)[7];
I have a string for a blog post that could contain any number of images. What I am trying to do is get every src="", add a prefix to the URL, and use it as the hyperlink.
My string:
$test = 'Hello world
<img src="images/image1.jpg" />Line 2
<img src="images/image2.jpg" />Some text
<img src="images/image3.jpg" />';
I am able to add the href. So far I can produce this:
<a href="images/image1.jpg"><img src="images/image1.jpg" /></a>
<a href="images/image2.jpg"><img src="images/image2.jpg" /></a>
<a href="images/image3.jpg"><img src="images/image3.jpg" /></a>
This is my code so far:
$new = preg_replace('/(<img[^>]+src="([^\\"]+)"[^>]+\\/>)/','<a href="\\2">\\1</a>',$test);
echo $new;
I need to add foldername/ as a prefix to every image src. What I'm trying to turn this into is the following:
<img src="foldername/images/image1.jpg" />
<img src="foldername/images/image2.jpg" />
<img src="foldername/images/image3.jpg" />
How can I do that?
You can do this using DOMDocument rather than regex (a good reason is https://stackoverflow.com/a/1732454/1213708).
The code loads the HTML and then looks for all of the <img> tags. For each one it takes the value of src and adds the extra part to it. Then it creates a new <a> tag and sets the (original) src value as its href attribute. Finally it replaces the <img> tag with the <a> and appends the <img> back inside the <a>.
$dom = new DOMDocument();
$dom->loadHTML($test, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
foreach ( $dom->getElementsByTagName("img") as $image ) {
$src = $image->getAttribute("src");
$image->setAttribute("src", "foldername/".$src);
$anchor = $dom->createElement("a");
$anchor->setAttribute("href", $src);
$image->parentNode->replaceChild($anchor, $image);
$anchor->appendChild($image);
}
echo $dom->saveHTML();
Not that it matters now, but here is a parsing solution without regex. I actually prefer @NigelRen's solution now that it's posted. However, here is a way to build a list of image URLs without regex that could be used to solve your issue and that leaves room for further exploration. I haven't tested the code, but I'm fairly confident it works.
<?php
$html = 'some html here';
$sentinel = '<img src="';
$quoteSentinel = '"';
$done = false;
$offset = 0;
$imageURLS = array();
while (!$done) {
$nextImageTagPos = strpos($html, $sentinel, $offset);
if ($nextImageTagPos !== false) {
//*** Parsing
// Find first quote
$quoteBegins = false;
for ($i = $nextImageTagPos; $i < $nextImageTagPos + 100; $i++) {
if (substr($html, $i, 1) == $quoteSentinel) {
$quoteBegins = $i;
break;
}
}
// Find ending quote
$quoteEnds = false;
if ($quoteBegins !== false) {
for ($i = $quoteBegins + 1; $i < $quoteBegins + 1000; $i++) {
if (substr($html, $i, 1) == $quoteSentinel) {
$quoteEnds = $i;
break;
}
}
if ($quoteEnds !== false) {
// Hooray! now we are getting somewhere
$imgUrlLength = ($quoteEnds - $quoteBegins) - 1;
$imgUrl = substr($html, $quoteBegins + 1, $imgUrlLength);
// ***Requirements
/*
I have a string that is for a blog, potentially it could have an unlimited number of images in this string, what i am trying to do is get all of the src="" and add a prefix to the url and use it as the hyperlink.
*/
// But for brevity lets skip requirements and build an array of URLs.
$imageURLS[] = $imgUrl;
$offset = $quoteEnds + 1;
}
}
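// Guard: if no quoted URL was found, skip past this tag so the loop cannot run forever
if ($quoteBegins === false || $quoteEnds === false) {
$offset = $nextImageTagPos + strlen($sentinel);
}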
}
else {
$done = true;
}
}
I need to convert an img like this:
<img src="https://techcrunch.com/wp-content/uploads/2015/04/codecode.jpg" style="height:404px; width:602px" />
to this:
<amp-img src="https://techcrunch.com/wp-content/uploads/2015/04/codecode.jpg" height="404" width="602"></amp-img>
Keep in mind that this tag will sit inside a larger piece of HTML with more tags and, importantly, I can't use any library.
Thanks in advance for any answers!
You can achieve your goal with the following code:
function html5toampImage($string) {
preg_match('/src="(.+?)"/', $string, $src);
$srcAttribute = $src[0];
preg_match('/style="(.+?)"/', $string, $styles);
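$style = $styles[1];
// Default to empty strings so a missing height or width does not raise a notice
$heightAttribute = $widthAttribute = '';
$allData = explode(";",$style);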
foreach($allData as $data) {
if($data) {
list($key,$value) = explode(":",$data);
if(trim($key)=="height") {
$heightAttribute = trim($key).'="'.trim(str_replace("px","",$value)).'"';
}
if(trim($key)=="width") {
$widthAttribute = trim($key).'="'.trim(str_replace("px","",$value)).'"';
}
}
}
$ampImageTag = '<amp-img '.$srcAttribute.' '.$heightAttribute.' '.$widthAttribute.' layout="responsive"></amp-img>';
return $ampImageTag;
}
$html5Tag = '<img alt="aa" src="https://techcrunch.com/wp-content/uploads/2015/04/codecode.jpg" style="height:404px; width:602px; color:red" />';
echo htmlentities(html5toampImage($html5Tag));
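Since the question mentions that the tag sits among other HTML tags, here is a rough sketch of applying the same idea to a whole string with preg_replace_callback, which is built into PHP and needs no library. The function name img_to_amp and the surrounding <p> tags are assumptions for illustration. The sketch only handles double-quoted attributes and pixel sizes in the inline style, and leaves any other <img> tag untouched.

```php
<?php
// Hypothetical callback (the name is an assumption): converts one matched <img> tag.
function img_to_amp(array $m): string
{
    $tag = $m[0];
    if (!preg_match('/\bsrc="([^"]*)"/', $tag, $src)) {
        return $tag; // no src attribute: leave the tag untouched
    }
    $dims = [];
    if (preg_match('/\bstyle="([^"]*)"/', $tag, $style)) {
        foreach (['height', 'width'] as $prop) {
            // Only accept the property at the start of the style or right after a semicolon.
            if (preg_match('/(?:^|;)\s*' . $prop . '\s*:\s*(\d+)px/', $style[1], $d)) {
                $dims[$prop] = $d[1];
            }
        }
    }
    if (count($dims) < 2) {
        return $tag; // this sketch requires both dimensions
    }
    return '<amp-img src="' . $src[1] . '" height="' . $dims['height']
         . '" width="' . $dims['width'] . '"></amp-img>';
}

$html = '<p>Intro</p>'
      . '<img src="https://techcrunch.com/wp-content/uploads/2015/04/codecode.jpg" style="height:404px; width:602px" />'
      . '<p>Outro</p>';

// Run the callback on every <img ...> tag found in the string.
$converted = preg_replace_callback('/<img\b[^>]*>/i', 'img_to_amp', $html);
echo $converted;
```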
Hi guys, I have this HTML code:
<div class="post-thumbnail2">
<a href="http://example.com" title="Title">
<img src="http://linkimgexample/image.png" alt="Title"/>
</a>
</div>
I want to get the value of the image src (http://linkimgexample/image.png) and the value of the link href (http://example.com) using PHP's DOMDocument.
What I did to get the link was something like this:
$divs = $dom->getElementsByTagName("div");
foreach($divs as $div) {
$cl = $div->getAttribute("class");
if ($cl == "post-thumbnail2") {
$links = $div->getElementsByTagName("a");
foreach ($links as $link)
echo $link->getAttribute("href")."<br/>";
}
}
I could do the same for the image src:
$imgs = $div->getElementsByTagName("img");
foreach ($imgs as $img)
echo $img->getAttribute("src")."<br/>";
But sometimes the page has no image and the HTML looks like this:
<div class="post-thumbnail2">
<a href="http://example.com" title="Title"></a>
</div>
So my question is: how can I get the two values at the same time, showing a message when there is no image?
To be clearer, here is an example:
<div class="post-thumbnail2">
<a href="http://example1.com" title="Title">
<img src="http://linkimgexample/image1.png" alt="Title"/>
</a>
</div>
<div class="post-thumbnail2">
<a href="http://example2.com" title="Title"></a>
</div>
<div class="post-thumbnail2">
<a href="http://example3.com" title="Title">
<img src="http://linkimgexample/image2.png" alt="Title"/>
</a>
</div>
I want the result to be:
http://example1.com - http://linkimgexample/image1.png
http://example2.com - there is no image here !
http://example3.com - http://linkimgexample/image2.png
DOMElement::getElementsByTagName returns a DOMNodeList, which means you can find out whether an img element was found by checking its length property.
$imgs = $div->getElementsByTagName("img");
if($imgs->length > 0) {
foreach ($imgs as $img)
echo $img->getAttribute("src")."<br/>";
} else {
echo "there is no image here!<br/>";
}
You should think about using XPath - it makes your life traversing the DOM a bit easier:
$doc = new DOMDocument();
if($doc->loadHtml($xmlData)) {
$xpath = new DOMXPath($doc);
$postThumbLinks = $xpath->query("//div[@class='post-thumbnail2']/a");
foreach($postThumbLinks as $link) {
$imgList = $xpath->query("./img", $link);
$imageLink = "there is no image here!";
if($imgList->length > 0) {
$imageLink = $imgList->item(0)->getAttribute('src');
}
echo $link->getAttribute('href'), " - ", $link->getAttribute('title'),
" - ", $imageLink, "<br/>", PHP_EOL;
}
} else {
echo "can't load HTML document!", PHP_EOL;
}
I am creating a WordPress function and need to determine whether an image in the content is wrapped in an <a> tag that links to a PDF or DOC file, e.g.
<img src="../images/image.jpg" />
How would I go about doing this with PHP?
Thanks
I would very strongly advise against using a regular expression for this. Besides being more error prone and less readable, it also does not give you the ability to manipulate the content easily.
You would be better off loading the content into a DOMDocument, retrieving all <img> elements and checking whether their parents are <a> elements. All you would have to do then is check whether the value of the href attribute ends with the desired extension.
A very crude implementation would look a bit like this:
<?php
$sHtml = <<<HTML
<html>
<body>
<img src="../images/image.jpg" />
<img src="../images/image.jpg" />
<img src="../images/image.jpg" />
<p>this is some text <a href="site.com/doc.pdf"> more text</p>
</body>
</html>
HTML;
$oDoc = new DOMDocument();
$oDoc->loadHTML($sHtml);
$oNodeList = $oDoc->getElementsByTagName('img');
foreach($oNodeList as $t_oNode)
{
if($t_oNode->parentNode->nodeName === 'a')
{
$sLinkValue = $t_oNode->parentNode->getAttribute('href');
$sExtension = substr($sLinkValue, strrpos($sLinkValue, '.'));
echo '<li>I am wrapped in an anchor tag '
. 'and I link to a ' . $sExtension . ' file '
;
}
}
?>
I'll leave an exact implementation as an exercise for the reader ;-)
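One possible way to fill in that exercise, as a sketch rather than a definitive implementation: it collects the href of every link that directly wraps an <img> and points at a .pdf or .doc file. The function name linked_document_images and the sample markup are assumptions for illustration.

```php
<?php
// Hypothetical helper (the name is an assumption): returns the hrefs of
// PDF/DOC links whose <a> element is the direct parent of an <img>.
function linked_document_images(string $html): array
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence warnings about sloppy markup
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $found = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        $parent = $img->parentNode;
        if ($parent === null || $parent->nodeName !== 'a') {
            continue; // image is not directly wrapped in a link
        }
        $href = $parent->getAttribute('href');
        // Look only at the path so a query string cannot hide the extension.
        $path = (string) parse_url($href, PHP_URL_PATH);
        $ext  = strtolower(pathinfo($path, PATHINFO_EXTENSION));
        if (in_array($ext, ['pdf', 'doc'], true)) {
            $found[] = $href;
        }
    }
    return $found;
}

$html = '<p><a href="docs/report.pdf"><img src="a.jpg"></a>'
      . '<a href="page.html"><img src="b.jpg"></a>'
      . '<img src="c.jpg"></p>';

print_r(linked_document_images($html));
```

Only the first image in the sample is reported, because the second links to an HTML page and the third has no link at all.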
Here is a DOM-parser-based solution that you can use:
$html = <<< EOF
<img src="../images/image.jpg" />
<img src="../images/image1.jpg" />
<IMG src="../images/image2.jpg" />
<img src="../images/image3.jpg" />
My PDF
EOF;
$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($html); // loads your html
$nodeList = $doc->getElementsByTagName('a');
for($i=0; $i < $nodeList->length; $i++) {
$node = $nodeList->item($i);
$children = $node->childNodes;
$hasImage = false;
foreach ($children as $child) {
if ($child->nodeName == 'img') {
$hasImage = true;
break;
}
}
if (!$hasImage)
continue;
if ($node->hasAttributes())
foreach ($node->attributes as $attr) {
$name = $attr->nodeName;
$value = $attr->nodeValue;
if ($attr->nodeName == 'href' &&
preg_match('/\.(doc|pdf)$/i', $attr->nodeValue)) {
echo $attr->nodeValue .
" - Image is wrapped in a link to a PDF or DOC file\n";
break;
}
}
}
Live Demo: http://ideone.com/dwJNAj
I'm trying to use regex to replace the src attribute (of an image or any other tag) in PHP.
I have a string like this:
$string2 = "<html><body><img src = 'images/test.jpg' /><img src = 'http://test.com/images/test3.jpg'/><video controls="controls" src='../videos/movie.ogg'></video></body></html>";
And I would like to turn it into:
$string2 = "<html><body><img src = 'test.jpg' /><img src = 'test3.jpg'/><video controls="controls" src='movie.ogg'></video></body></html>";
Here's what I tried:
$string2 = preg_replace("/src=["']([/])(.*)?["'] /", "'src=' . convert_url('$1') . ')'" , $string2);
echo htmlentities ($string2);
It didn't change anything and gave me a warning about an unescaped string.
Doesn't $1 pass the captured content to the function? What is wrong here?
The convert_url function comes from an example I posted here before:
function convert_url($url)
{
if (preg_match('#^https?://#', $url)) {
$url = parse_url($url, PHP_URL_PATH);
}
return basename($url);
}
It's supposed to strip out url paths and just return the filename.
Don't use regular expressions on HTML - use the DOMDocument class.
$html = "<html>
<body>
<img src='images/test.jpg' />
<img src='http://test.com/images/test3.jpg'/>
<video controls='controls' src='../videos/movie.ogg'></video>
</body>
</html>";
$dom = new DOMDocument;
libxml_use_internal_errors(true);
$dom->loadHTML( $html );
$xpath = new DOMXPath( $dom );
libxml_clear_errors();
$doc = $dom->getElementsByTagName("html")->item(0);
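$src = $xpath->query(".//@src");
foreach ( $src as $s ) {
// array_pop() takes its argument by reference, so store the pieces in a variable first
$parts = explode( "/", $s->nodeValue );
$s->nodeValue = array_pop( $parts );
}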
$output = $dom->saveXML( $doc );
echo $output;
Which outputs the following:
<html>
<body>
<img src="test.jpg">
<img src="test3.jpg">
<video controls="controls" src="movie.ogg"></video>
</body>
</html>
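You have to use the e modifier. Note that this only works on older PHP versions: the modifier was deprecated in PHP 5.5 and removed in PHP 7.0, where preg_replace_callback takes its place.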
$string = "<html><body><img src='images/test.jpg' /><img src='http://test.com/images/test3.jpg'/><video controls=\"controls\" src='../videos/movie.ogg'></video></body></html>";
$string2 = preg_replace("~src=[']([^']+)[']~e", '"src=\'" . convert_url("$1") . "\'"', $string);
Note that when using the e modifier, the replacement script fragment needs to be a string to prevent it from being interpreted before the call to preg_replace.
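On PHP 7 and later, where the e modifier no longer exists, the same replacement can be sketched with preg_replace_callback. This is an illustration under a few assumptions: the pattern also tolerates spaces around the equals sign and either quote style, the sample string is shortened, and convert_url is copied from the question so the block runs on its own.

```php
<?php
// convert_url() as given in the question: strips the path and keeps the file name.
function convert_url($url)
{
    if (preg_match('#^https?://#', $url)) {
        $url = parse_url($url, PHP_URL_PATH);
    }
    return basename($url);
}

$string = "<html><body><img src='images/test.jpg' /><img src='http://test.com/images/test3.jpg'/></body></html>";

// Group 1 captures the quote character, group 2 the URL; \1 requires the same closing quote.
$string2 = preg_replace_callback(
    '~src\s*=\s*([\'"])(.*?)\1~',
    function (array $m) {
        return 'src=' . $m[1] . convert_url($m[2]) . $m[1];
    },
    $string
);

echo $string2;
```

The callback receives the match array directly, so there is no code string to quote or escape as there was with the e modifier.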
function replace_img_src($img_tag) {
$doc = new DOMDocument();
$doc->loadHTML($img_tag);
$tags = $doc->getElementsByTagName('img');
foreach ($tags as $tag) {
$old_src = $tag->getAttribute('src');
$new_src_url = 'website.com/assets/'.$old_src;
$tag->setAttribute('src', $new_src_url);
}
return $doc->saveHTML();
}