PHP get image url from html-string using regular expression - php

I'm trying to get all image URLs from an HTML string with PHP,
both from img tags and from inline CSS (background-image).
<?php
$html = '
<div style="background-image : url(https://exampel.com/media/logo.svg);"></div>
<img src="https://exampel.com/media/my-photo.jpg" />
<div style="background-image:url('https://exampel.com/media/icon.png');"></div>
';
preg_match('/<img.+src=[\'"](?P<src>.+?)[\'"].*>|background-image[ ]?:[ ]?url\([ ]?[\']?["]?(.*?\.(?:png|jpg|jpeg|gif|svg))/i', $html, $image);
echo('<pre>'.print_r($image, true).'</pre>');
?>
The output from this is:
Array
(
[0] => background-image : url(https://exampel.com/media/logo.svg
[src] =>
[1] =>
[2] => https://exampel.com/media/logo.svg
)
Preferred output would be:
Array
(
[0] => https://exampel.com/media/logo.svg
[1] => https://exampel.com/media/my-photo.jpg
[2] => https://exampel.com/media/icon.png
)
I'm missing something here, but I can't figure out what.

Use preg_match_all() and rearrange your result:
<?php
$html = <<<EOT
<div style="background-image : url(https://exampel.com/media/logo.svg);"></div>
<img src="https://exampel.com/media/my-photo.jpg" />
<div style="background-image:url('https://exampel.com/media/icon.png');"></div>
EOT;
preg_match_all(
'/<img.+src=[\'"](.+?)[\'"].*>|background-image ?: ?url\([\'" ]?(.*?\.(?:png|jpg|jpeg|gif|svg))/i',
$html,
$matches,
PREG_SET_ORDER
);
$image = [];
foreach ($matches as $set) {
unset($set[0]);
foreach ($set as $url) {
if ($url) {
$image[] = $url;
}
}
}
echo '<pre>' . print_r($image, true) . '</pre>' . PHP_EOL;
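As a variation on the answer above, PCRE's branch-reset group (?|...) lets both alternatives capture into group 1, so the rearranging loop is not needed. This is only a minimal sketch: the function name extract_image_urls is made up for illustration, it assumes the URLs contain no quotes or parentheses, and a regex like this is not a full HTML or CSS parser.

```php
<?php
// Hypothetical helper: collect image URLs from <img src="..."> and from
// inline background-image declarations, in document order.
function extract_image_urls(string $html): array
{
    // (?| ... ) is a branch-reset group: each alternative numbers its
    // capture groups from 1, so the URL always lands in $matches[1].
    $pattern = '~(?|<img[^>]+src=[\'"]([^\'"]+)[\'"]'
             . '|background-image\s*:\s*url\(\s*[\'"]?([^\'")]+))~i';
    preg_match_all($pattern, $html, $matches);

    // Trim stray whitespace that an unquoted url( ... ) may leave behind.
    return array_map('trim', $matches[1]);
}

$html = <<<EOT
<div style="background-image : url(https://exampel.com/media/logo.svg);"></div>
<img src="https://exampel.com/media/my-photo.jpg" />
<div style="background-image:url('https://exampel.com/media/icon.png');"></div>
EOT;

print_r(extract_image_urls($html));
```

Unlike the original pattern, this one does not restrict the file extension, so it would also pick up URLs without a .png/.jpg/.svg suffix.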

Related

Parsing html using php to an array

I have the below html
<p>text1</p>
<ul>
<li>list-a1</li>
<li>list-a2</li>
<li>list-a3</li>
</ul>
<p>text2</p>
<ul>
<li>list-b1</li>
<li>list-b2</li>
<li>list-b3</li>
</ul>
<p>text3</p>
Does anyone know how to parse this HTML with PHP into a nested array like the one below?
The first level holds the text of each "p" tag,
and the second level holds the items of the "ul" tag, because each "p" tag above is followed by a "ul" tag.
Array
(
[0] => Array
(
[value] => text1
(
[il] => list-a1
[il] => list-a2
[il] => list-a3
)
)
[1] => Array
(
[value] => text2
(
[il] => list-b1
[il] => list-b2
[il] => list-b3
)
)
)
I can't just replace or strip all the tags, because I use:
foreach ($doc->getElementsByTagName('p') as $link)
{
$dont = $link->textContent;
if (strpos($dont, 'document.') === false) {
$links2[] = array(
'value' => $link->textContent, );
}
$er=0;
foreach ($doc->getElementsByTagName('ul') as $link)
{
$dont2 = $link->nodeValue;
//echo $dont2;
if (strpos($dont2, 'favorisContribuer') === false) {
$links3[]= array(
'il' => $link->nodeValue, );
}
You could use the DOMDocument class (http://php.net/manual/en/class.domdocument.php)
You can see an example below.
<?php
$html = '
<p>text1</p>
<ul>
<li>list-a1</li>
<li>list-a2</li>
<li>list-a3</li>
</ul>
<p>text2</p>
<ul>
<li>list-b1</li>
<li>list-b2</li>
<li>list-b3</li>
</ul>
<p>text3</p>
';
$doc = new DOMDocument();
$doc->loadHTML($html);
$textContent = $doc->textContent;
$textContent = trim(preg_replace('/\t+/', '<br>', $textContent));
echo '
<!DOCTYPE html>
<html>
<head>
<title></title>
</head>
<body>
' . $textContent . '
</body>
</html>
';
?>
However, I would suggest using JavaScript to find the content and send it to PHP instead.
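To get closer to the nested array the question asks for, one option is to walk from each p element to the next element sibling and collect its li items. This is a rough sketch, assuming every list directly follows its paragraph; the result keys 'value' and 'li' are chosen for illustration (a PHP array cannot hold several entries under the same [il] key, so the items go into a sub-array).

```php
<?php
$html = '
<p>text1</p>
<ul>
<li>list-a1</li>
<li>list-a2</li>
<li>list-a3</li>
</ul>
<p>text2</p>
<ul>
<li>list-b1</li>
<li>list-b2</li>
<li>list-b3</li>
</ul>
<p>text3</p>
';

$doc = new DOMDocument();
$doc->loadHTML($html);

$result = [];
foreach ($doc->getElementsByTagName('p') as $p) {
    // Skip whitespace text nodes until the next element sibling.
    $next = $p->nextSibling;
    while ($next !== null && $next->nodeType !== XML_ELEMENT_NODE) {
        $next = $next->nextSibling;
    }
    // Only keep paragraphs that are directly followed by a <ul>.
    if ($next === null || $next->nodeName !== 'ul') {
        continue;
    }
    $items = [];
    foreach ($next->getElementsByTagName('li') as $li) {
        $items[] = trim($li->textContent);
    }
    $result[] = ['value' => trim($p->textContent), 'li' => $items];
}

print_r($result);
```

With the sample input, text3 is left out because no list follows it, which matches the two-entry output shown in the question.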

PHP move bold text to array key

I have a details array:
[info_details] => Array
(
[0] => <b>title:</b> this is title
[1] => <b>name:</b> this is name
[2] => <b>created</b> this is date
)
and I need to convert it to this:
[info_details] => Array
(
[title] => this is title
[name] => this is name
[created] => this is date
)
So what is the best way to split off the bold text and use it as the key?
My code so far:
foreach ( $array as $key => $value ) {
$this->__tmp_data['keep'][] = preg_split('/<b[^>]*>/', $value);
}
But it doesn't work.
PHP has the built-in function strip_tags() to strip HTML tags.
foreach ( $array as $key => $value ) {
$this->__tmp_data['keep'][] = strip_tags($value);
}
UPDATE
<?php
$info_details = array
(
'<b>title:</b> this is title',
'<b>name:</b> this is name',
'<b>created:</b> this is date'
);
$tmp_data = [];
foreach ( $info_details as $key => $value ) {
list($key,$value)=explode('</b>', $value);
$tmp_data['keep'][str_replace(array(':','<b>'),'',$key)] = $value;
}
echo '<pre>';
print_r($tmp_data);
?>
OUTPUT
Array
(
[keep] => Array
(
[title] => this is title
[name] => this is name
[created] => this is date
)
)
You can try this regex with preg_match() and str_replace():
$pattern = "/<b>.+:<\/b>\s?/";
$arr['info_details'] = [
'<b>title:</b> this is title',
'<b>name:</b> this is name',
'<b>created:</b> this is date',
];
$new_arr['info_details'] = [];
foreach($arr['info_details'] as $val){
preg_match($pattern, $val, $m);
$new_arr['info_details'][trim(strip_tags($m[0]), ': ')] = str_replace($m[0], '', $val);
}
print '<pre>';
print_r($new_arr);
print '</pre>';
Output
Array
(
[info_details] => Array
(
[title] => this is title
[name] => this is name
[created] => this is date
)
)
Assuming that the colon will always be present you can use strip_tags and explode to get what you want.
<?php
$info_details = array(
"<b>title:</b> this is title",
"<b>name:</b> this is name",
"<b>created:</b> this is date"
);
$return = array();
foreach($info_details as $val){
list ($key, $value) = explode(":",strip_tags($val), 2);
$return[$key] = $value;
}
print_r($return);
See it live here. Also worth noting that this implementation will remove the : from the array key and strip any html content from the trailing portion of each array element.
If you can't rely on the delimiter to be there you can instead use the close bold tag as your delimiter.
<?php
$info_details = array(
"<b>title:</b> this is title",
"<b>name:</b> this is name",
"<b>created</b> this is date"
);
$return = array();
foreach($info_details as $val){
list ($key, $value) = explode("</b>",$val, 2);
$key = strip_tags($key);
$return[$key] = $value;
}
print_r($return);
or run it here
So I found a solution, but it's hard-coded...
foreach ( $array as $key => $value ) {
$this->__tmp_data['keep'][strtolower(str_replace(':', '', strip_tags(@reset(explode('</b>', $value)))))] = trim(@end(explode('</b>', $value)));
}
Any other solutions are welcome, even regex ones!
Reading the comments above, I guess this is what you need.
<?php
$info_details = array
(
'<b>title:</b> this is title',
'<b>name:</b> this is name',
'<b>created:</b> this is date'
);
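$result = array();
foreach ($info_details as $value)
{
$temp = explode("</b>", $value, 2);
// Build the key from the bold part; add to $result instead of overwriting the input array.
$result[strip_tags(str_replace(":", "", $temp[0]))] = trim($temp[1]);
}
print_r($result);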
?>
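Since regex solutions were also requested, here is one possible single-pattern version that treats the trailing colon as optional, so it also handles the "<b>created</b>" entry from the question. It is a sketch only; the function name bold_labels_to_keys is invented for this example, and it assumes each element starts with exactly one <b>...</b> label.

```php
<?php
// Hypothetical helper: turn "<b>label:</b> text" strings into label => text pairs.
function bold_labels_to_keys(array $rows): array
{
    $result = [];
    foreach ($rows as $row) {
        // Group 1: the label inside <b>...</b>, without the optional colon.
        // Group 2: everything after the closing </b>.
        if (preg_match('~^\s*<b\b[^>]*>\s*(.+?)\s*:?\s*</b>\s*(.*)$~s', $row, $m)) {
            $result[$m[1]] = trim($m[2]);
        }
    }
    return $result;
}

$info_details = [
    '<b>title:</b> this is title',
    '<b>name:</b> this is name',
    '<b>created</b> this is date',
];

print_r(bold_labels_to_keys($info_details));
```

Rows that do not match the pattern are skipped silently; whether that or an error is preferable depends on the data.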

How to get the normal src of an encoded image like this in PHP?

Hi there. I'm working with a simple PHP parser that saves images from an external server, so I need the normal src of a picture,
but the img element below seems to have an unusual src.
Is there any way to turn this into a normal src, or at least to save the image on my server first?
Note: the text in src is very long (more than 170,000 characters), so I removed most of it before posting here.
<img style="display: block; margin-left: auto; margin-right: auto;" src="" alt="">
Copy all of the src text after "base64," and use PHP's base64_decode() function to decode it. Once decoded, you can write it to a jpg file if you want.
<?php
echo base64_encode(file_get_contents("../images/folder16.gif"))
?>
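The decoding step described above could look roughly like this. It is a sketch under assumptions: the helper name decode_data_uri is made up, the src is assumed to have the form data:image/<type>;base64,<payload>, and the payload used here is a short placeholder rather than a real image.

```php
<?php
// Hypothetical helper: split a base64 image data URI into its type and raw bytes.
function decode_data_uri(string $src): ?array
{
    if (!preg_match('~^data:image/([a-z0-9.+-]+);base64,(.*)$~is', $src, $m)) {
        return null; // not a base64-encoded image data URI
    }
    $bytes = base64_decode($m[2], true); // strict mode: returns false on invalid input
    if ($bytes === false) {
        return null;
    }
    return ['type' => strtolower($m[1]), 'bytes' => $bytes];
}

// Placeholder payload standing in for the real (very long) src attribute.
$src = 'data:image/png;base64,' . base64_encode('not really a PNG');

$image = decode_data_uri($src);
if ($image !== null) {
    echo $image['type'], ': ', strlen($image['bytes']), " bytes\n";
    // To store it, something like:
    // file_put_contents('/path/to/dir/picture.' . $image['type'], $image['bytes']);
}
```

The target path in the comment is a placeholder; the type taken from the data URI is only a hint and may not be a usable file extension for every format.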
You can use PHP's base64_encode() function to produce the encoded image.
See this working example:
<?php
$img = base64_encode(file_get_contents("https://www.google.co.in/images/srpr/logo11w.png"));
echo "<img src='data:image/png;base64,".$img."' />";
?>
Follow these steps:
<?php
// [1] Prepare your page HTML content
$html = '<img src="data:image/png;base64,image_[1]_valid_base_64_encoded_string">';
$html .= '<img src="data:image/gif;base64,image_[2]_valid_base_64_encoded_string">';
$html .= '<img src="data:image/jpeg;base64,image_[3]_valid_base_64_encoded_string">';
$html .= '<img src="data:image/jpg;base64,image_[4]_valid_base_64_encoded_string">';
// [2] Get all src attributes
$doc = new DOMDocument();
@$doc->loadHTML($html); // @ suppresses libxml warnings on imperfect markup
$xpath = new DOMXPath($doc);
$src = $xpath->evaluate("//img/@src");
// [3] Loop src attributes and push image info to $images array
$images = array();
foreach ($src as $attr)
{
$data = explode('/', $attr->value);
$data = str_replace(';', ',', $data[1]);
list($extension, $type, $encoded_string) = explode(',', $data);
// push to images array
$images[] = array(
'extension' => strtolower($extension),
'image_base64' => $encoded_string,
);
}
// results
echo '<pre>';
print_r($images);
echo '</pre>';
// [4] Move images to directory
// @file_put_contents("path/to/dir/image_name.$extension", base64_decode($encoded_string));
// print_r($images) output
Array
(
[0] => Array
(
[extension] => png
[image_base64] => image_[1]_valid_base_64_encoded_string
)
[1] => Array
(
[extension] => gif
[image_base64] => image_[2]_valid_base_64_encoded_string
)
[2] => Array
(
[extension] => jpeg
[image_base64] => image_[3]_valid_base_64_encoded_string
)
[3] => Array
(
[extension] => jpg
[image_base64] => image_[4]_valid_base_64_encoded_string
)
)

Loop in subset of an array matching key text in PHP

I have this array (using PHP):
Array
(
[dummy_value_01] => 10293
[other_dummy_value_01] => Text
[top_story_check] => 1
[top_story_hp] => 1
[top_story] => 248637
[top_story_id] => 100
[top_story_text] => 2010
[menu_trend_01] => 248714
[menu_trend_01_txt] => Text 01
[menu_trend_02] => 248680
[menu_trend_02_txt] => Text 02
[menu_trend_03] => 248680
[menu_trend_03_txt] => Text 03
[menu_trend_04] => 248680
[menu_trend_04_txt] => Text 04
[menu_trend_05] => 248680
)
I would like to loop only the menu_trend_* values and obtain a list like this:
<ul>
<li>Text 01: 248714</li>
<li>Text 02: 248680</li>
<li>[...]</li>
</ul>
Could you suggest the best way?
You can use this, it will try to match menu_trend_(DIGIT) and if it does, will echo the needed text.
echo '<ul>';
foreach ($array as $key => $val) {
$matches = array();
if (!preg_match('/^menu_trend_(\d+)$/', $key, $matches)) {
continue;
}
echo sprintf('<li>Text %s: %s</li>', $matches[1], $val);
}
echo '</ul>';
I'm not certain that this is the best way, but it will work:
$output = array();
foreach ($array as $k => $a) {
if (stristr($k, 'menu_trend_') && !empty($array[$k . '_txt'])) {
$output[] = $array[$k . '_txt'] . ': ' . $a;
}
}
echo "<ul>\n<li>" . implode("</li>\n<li>", $output) . "</li>\n</ul>";
Here's a working example
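Combining the two ideas above (match only the numbered menu_trend_NN keys, and take the label from the matching _txt entry) might look like the following sketch. The function name menu_trend_list is invented for illustration, and the sample array is a shortened copy of the one in the question.

```php
<?php
// Hypothetical helper: build the <ul> from menu_trend_NN / menu_trend_NN_txt pairs.
function menu_trend_list(array $array): string
{
    $items = [];
    foreach ($array as $key => $value) {
        $labelKey = $key . '_txt';
        // Anchored pattern: matches menu_trend_01 but not menu_trend_01_txt.
        if (preg_match('/^menu_trend_\d+$/', (string) $key) && isset($array[$labelKey])) {
            $items[] = '<li>' . htmlspecialchars((string) $array[$labelKey]) . ': '
                     . htmlspecialchars((string) $value) . '</li>';
        }
    }
    return "<ul>\n" . implode("\n", $items) . "\n</ul>";
}

$array = [
    'top_story'         => 248637,
    'menu_trend_01'     => 248714,
    'menu_trend_01_txt' => 'Text 01',
    'menu_trend_02'     => 248680,
    'menu_trend_02_txt' => 'Text 02',
    'menu_trend_05'     => 248680, // no matching _txt entry, so it is skipped
];

echo menu_trend_list($array), "\n";
```

Entries without a label are dropped here; printing them with a fallback label would be an equally reasonable choice.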

PHP - How to format this output using given array?

Right now I have an array named $socialMeta, containing:
Array (
[0] => Array (
[socialFacebook] => Array (
[0] => http://www.facebook.com/someUsername
)
)
[1] => Array (
[socialYoutube] => Array (
[0] => http://www.youtube.com/user/someUsername
)
)
[2] => Array (
[socialSoundcloud] => Array (
[0] => http://www.soundcloud.com/someUsername
)
)
)
From this array I need to create the following output:
<div class="social">
Add us on <span>Facebook</span>
Visit us on <span>Youtube</span>
Visit us on <span>Soundcloud</span>
</div>
Please note that the anchor text for the first link is different.
For the anchor classes I can use the $socialMeta keys to make the whole process a bit easier.
<?php if (!empty($socialMeta)) { ?>
<div class="social">
<?php foreach ($socialMeta as $rows) {?>
<?php foreach ($rows as $key => $val) {?>
<?php
switch ($key) {
case "socialFacebook":
$title = "Facebook";
$class = "fb";
break;
case "socialYoutube":
$title = "Youtube";
$class = "yt";
break;
case "socialSoundcloud":
$title = "Soundcloud";
$class = "sc";
break;
}
?>
Add us on <span><?php echo $title; ?></span>
<?php }?>
<?php }?>
</div>
<?php }?>
Start by identifying the network for each element in the array (I assume the name is $array in the following examples):
function add_network($array) {
static $networks = array('Facebook', 'Youtube', 'Soundcloud');
foreach($networks as $network)
if (isset($array['social' . $network])) {
$array['network'] = $network;
return $array;
}
//None found
$array['network'] = false;
return $array;
}
$array = array_map('add_network', $array);
Then transform the array (you should find a better name for this function):
function transform_array($a) {
static $classes = array('Youtube' => 'yt', 'Facebook' => 'fb', 'Soundcloud' => 'sc');
$network = $a['network'];
$class = $classes[$network];
$url = $a['social' . $network][0];
return array('network' => $network,
'url' => $url,
'class' => $class);
}
$array = array_map('transform_array', $array);
And now just loop over the elements of $array:
foreach($array as $row) {
$network = $row['network'];
$url = $row['url'];
$class = $row['class'];
if ($network === 'Facebook')
$link_text = 'Add us on <span>%s</span>';
else
$link_text = 'Visit us on <span>%s</span>';
$link_text = sprintf($link_text, $network);
printf('<a href="%s" class="%s">%s</a>',
$url, $class, $link_text);
}
<?php
function flattenArray(array $input){
$nu = array();
foreach($input as $k => $v){
if(is_array($v) && count($v) == 1){
$nu[key($v)] = current($v);
if(is_array($nu[key($v)]) && count($v) == 1)
$nu[key($v)] = current($nu[key($v)]);
}
else
$nu[$k] = $v;
}
return $nu;
}
// here you can maintain the sortorder of the output and add more social networks with the corresponding URL-text...
$urlData = array(
'socialFacebook' => 'Add us on <span>Facebook</span>',
'socialYoutube' => 'Visit us on <span>Youtube</span>',
'socialSoundcloud' => 'Visit us on <span>Soundcloud</span>',
);
$testArray = array(
array('socialFacebook' => array('http.asdfsadf')),
array('socialYoutube' => array('http.asdfsadf')),
array('socialSoundcloud' => array('http.asdfsadf'))
);
$output = flattenArray($testArray);
// Here we go:
echo '<div class="social">';
foreach($urlData as $network => $linkText){
if(!empty($output[$network]))
echo sprintf('<a href="%s">%s</a>', $output[$network], $linkText);
}
echo '</div>';
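Both answers boil down to a lookup table keyed by the $socialMeta keys. A compact sketch of that idea follows; the function name render_social_links and the CSS classes fb, yt and sc are assumptions (the classes are taken from the switch in the question), as is the exact anchor markup.

```php
<?php
// Hypothetical helper: render the social links from the nested $socialMeta array.
function render_social_links(array $socialMeta): string
{
    // key => [CSS class, link prefix, network name]
    $networks = [
        'socialFacebook'   => ['fb', 'Add us on', 'Facebook'],
        'socialYoutube'    => ['yt', 'Visit us on', 'Youtube'],
        'socialSoundcloud' => ['sc', 'Visit us on', 'Soundcloud'],
    ];

    $links = [];
    foreach ($socialMeta as $row) {
        foreach ($row as $key => $urls) {
            // Skip unknown networks and entries without a URL.
            if (!isset($networks[$key]) || empty($urls[0])) {
                continue;
            }
            [$class, $prefix, $name] = $networks[$key];
            $links[] = sprintf(
                '<a href="%s" class="%s">%s <span>%s</span></a>',
                htmlspecialchars($urls[0]),
                $class,
                $prefix,
                $name
            );
        }
    }
    return "<div class=\"social\">\n" . implode("\n", $links) . "\n</div>";
}

$socialMeta = [
    ['socialFacebook'   => ['http://www.facebook.com/someUsername']],
    ['socialYoutube'    => ['http://www.youtube.com/user/someUsername']],
    ['socialSoundcloud' => ['http://www.soundcloud.com/someUsername']],
];

echo render_social_links($socialMeta), "\n";
```

Adding a network then only means adding one row to the table, and the links keep the order of $socialMeta.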
