PHP: get a specific image path from an <img> inside a URL

I am trying to get a specific image out of a URL.
For example, if www.domain.com contains
<img id="image100" src="images/dog.jpg">
I want to get the path from that specific img tag.
I tried two different ways:
$matches = array();
preg_match_all('/<img id="image100" (.*?)src="(.*?)"\/>/i', file_get_contents($url), $matches);
echo $matches[1];
error:
Notice: Array to string conversion
$dom = new DOMDocument;
$dom->loadHTMLFile($url);
$DOMxpath = new DOMXPath($dom);
$image = $DOMxpath->query("//*[#class='image100]");
echo $image->item(0)->getAttribute('src');
error:
Fatal error: Call to a member function item() on a non-object

The regex should look something like this:
/<img id="image100".*?src="(.*?)"/
A quick explanation of what is going on here:
. matches any single character except a newline
*? repeats the previous token zero or more times, as few times as possible (lazy)
() delimits a capture group, which is what you want to extract
\/ is an escaped /, needed wherever a literal / appears inside a /-delimited pattern
? after a token makes it optional; after a quantifier such as * it makes that quantifier lazy
Basically, what this says is: look in the string for a substring that starts with
<img id="image100"
contains any number of characters afterwards, then continues with src=", and capture whatever is between the quotes.
A great tool to test your regex is: https://regex101.com/r/eB8rU8/1
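A minimal sketch of that pattern in use, assuming the page markup has already been fetched into a string ($html here is a made-up sample, not the real page). It uses preg_match rather than preg_match_all because only one tag is wanted; with preg_match_all, $matches[1] is itself an array, which is what triggers the "Array to string conversion" notice. Note also that the pattern in the question requires "/> right after the src value, so it cannot match a tag that ends in ">.

```php
<?php
// Sample markup standing in for file_get_contents($url) (an assumption for this sketch).
$html = '<div><img id="image100" src="images/dog.jpg"></div>';

$src = null;
// preg_match stops at the first match; $matches[1] is then a plain string.
if (preg_match('/<img id="image100".*?src="(.*?)"/', $html, $matches) === 1) {
    $src = $matches[1]; // $matches[0] holds the whole match, $matches[1] the capture group
}

echo $src ?? 'no matching <img> found', PHP_EOL;
```

The pattern still depends on id coming before src and on the exact quoting, so it is only as robust as the markup is predictable.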

Assuming file_get_contents returns something like $img:
$img = '<img id="image100" src="images/dog.jpg">';
$resp = preg_match_all('/\<img\sid="image100"\ssrc="(.*?)"\/?\>/',$img,$result);
var_dump($result);
/* response:
array (size=2)
0 =>
array (size=1)
0 => string '<img id="image100" src="images/dog.jpg">' (length=40)
1 =>
array (size=1)
0 => string 'images/dog.jpg' (length=14)
*/
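For comparison, the DOM attempt from the question can be made to work as well. The XPath there, //*[#class='image100], is not valid XPath (# is not part of the syntax and the quote is never closed), so query() returns false and item() is then called on a non-object. The sketch below assumes the markup is available as a string ($html is a made-up sample) and selects by the id attribute with @id.

```php
<?php
// Sample markup standing in for the fetched page (an assumption for this sketch).
$html = '<html><body><img id="image100" src="images/dog.jpg"></body></html>';

$dom = new DOMDocument();
// Collect parser warnings internally instead of printing them for imperfect HTML.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
// @id selects the id attribute; the quotes around image100 must be balanced.
$nodes = $xpath->query('//img[@id="image100"]');

// query() returns false on an invalid expression, so check before calling item().
$src = ($nodes !== false && $nodes->length > 0)
    ? $nodes->item(0)->getAttribute('src')
    : null;

echo $src ?? 'not found', PHP_EOL;
```

Unlike the regex, this does not care about attribute order or whether the tag is self-closing.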

Related

Place or add character(s) on preg_match

For example, I have a string
foo-bar/baz 123
and a pattern
#foo-(bar/baz)#
This pattern gives me the captured bar/baz,
but I would like to replace (sanitize) the / with a - to receive bar-baz.
Reason: I have a method that gets a string as a parameter and the regex pattern via config.
The tool around this is dynamic and will look up the returned match as an id.
But the target entity to find uses - instead of / in the id.
So I could now hard-code an exception like "if it is this adapter, then replace this and that",
but I wonder if I could do that with regex.
Test code:
// Somewhere in a loop ...
$string = "foo-bar/baz 123"; // Would get dynamically as parameter.
$pattern = "#foo-(bar/baz)#"; // Would get dynamically from config.
// ...
if (preg_match($pattern, $string, $matches) === 1) {
// return $matches[1]; Would return captured id.
echo var_export($matches, true) . PHP_EOL;
}
Returns
array (
0 => 'foo-bar/baz',
1 => 'bar/baz',
)
Expected|Need
array (
0 => 'foo-bar/baz',
1 => 'bar-baz', // <-- "-" instead of "/"
)
So I'm searching for a way to "match-replace".
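preg_match only captures text and cannot rewrite what it captured, so some post-processing step is needed. One possible approach, sketched here under the assumption that replacing every / with - in the captured group is acceptable (the search and replacement characters could equally come from the same config as the pattern, which is hypothetical), is to run str_replace on the capture:

```php
<?php
$string  = "foo-bar/baz 123"; // Would come in dynamically as a parameter.
$pattern = "#foo-(bar/baz)#"; // Would come in dynamically from config.

$matches = [];
if (preg_match($pattern, $string, $matches) === 1) {
    // Sanitize only the captured group; the full match in $matches[0] is left alone.
    $matches[1] = str_replace('/', '-', $matches[1]);
}

echo var_export($matches, true) . PHP_EOL;
```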

preg_replace_callback() returns wrong arrays when multiple instances

I have the following script which replaces [amazon region=com asin=1234567] in my string with getAffiliateLink("com","1234567") to trigger a PHP function and show a specific div:
$content = preg_replace_callback('/\[amazon region=([^\b]+) asin=([^\b]+)\]/', 'callback', $content);
function callback ($matches) {
print_r($matches);
return getAffiliateLink("$matches[1]","$matches[2]");
}
echo $content;
Now this works perfectly, until I have multiple instances of [amazon region=com asin=1234567] in my string, each with a different asin number.
Then I get the following print:
Array (
[0] => [amazon region=com asin=B004QJ9458] [amazon region=com asin=B0080KWRI0]
[1] => com asin=B004QJ9458] [amazon region=com
[2] => B0080KWRI0
)
Looking at the output above, something is obviously going wrong. How can I change my code so it works for multiple instances of the tag?

The [^\b] character class matches any char but a \x08 char (BACKSPACE).
You may fix the pattern using
$content = preg_replace_callback('/\[amazon region=(\S+) asin=(\S+)\]/', 'callback', $content);
The \S+ matches 1 or more non-whitespace chars.
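A self-contained sketch of the corrected pattern with two tags in one string. getAffiliateLink() is not shown in the question, so the version below is a hypothetical stand-in that just builds a placeholder div.

```php
<?php
// Hypothetical stand-in for the asker's getAffiliateLink(); the real one is not shown.
function getAffiliateLink(string $region, string $asin): string
{
    return "<div data-region=\"$region\" data-asin=\"$asin\"></div>";
}

$content = 'A [amazon region=com asin=B004QJ9458] and B [amazon region=com asin=B0080KWRI0].';

// \S+ cannot cross the space between attributes, so each tag is matched on its own.
$content = preg_replace_callback(
    '/\[amazon region=(\S+) asin=(\S+)\]/',
    function (array $m): string {
        return getAffiliateLink($m[1], $m[2]);
    },
    $content
);

echo $content, PHP_EOL;
```

The callback now runs once per tag, with $m[1] and $m[2] holding that tag's region and asin.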

How to remove a substring from text and assign a nested substring to a variable?

I'm working in Joomla developing a Module where I need to strip this snippet from the $article->text and extract the part number to have its contents stored in $part_number.
{myplugin}ABCDEF1234,"Flux Capacitor"{/myplugin}
I've been trying to work something out, but I can't get it working:
$re = '/\{myplugin\}(\w+),[^{}]+\{\/myplugin\}/';
$subst = '';
$result = preg_replace($re, $subst, $article->text);
$article->text = $result;
But this doesn't return the part number, so I can't put it in $part_number. Can this be done in one regular expression operation, or should it be one to extract the part number and a second to remove the snippet from $article->text?
The intention is to have {myplugin}ABCDEF1234,"Flux Capacitor"{/myplugin} removed from $article->text and have its part number such as ABCDEF1234 copied from this snippet and stored in PHP variable $part_number.

I would recommend using preg_match:
$s='{myplugin}ABCDEF1234,"Flux Capacitor"{/myplugin}';
preg_match('/{myplugin}(\w+)\,"(.+)"{\/myplugin}/', $s, $result);
$result will be:
array (size=3)
0 => string '{myplugin}ABCDEF1234,"Flux Capacitor"{/myplugin}' (length=48)
1 => string 'ABCDEF1234' (length=10)
2 => string 'Flux Capacitor' (length=14)
UPD:
$article->text = str_replace($result[0], '', $article->text);
$part_number = $result[1];

Yes, you can remove the plugin substring from $article->text and set $partnumber in one hit.
Code:
$article=(object)['text'=>'Some leading text {myplugin}ABCDEF1234,"Flux Capacitor"{/myplugin} some trailing text'];
$re = '~\{myplugin\}([^,]+),[^{]+\{/myplugin\}~';
$subst = '';
$article->text=preg_replace_callback($re,function($m)use(&$partnumber){ $partnumber=$m[1]; return '';},$article->text,1);
echo $article->text;
echo "\n";
echo $partnumber;
Output:
Some leading text some trailing text
ABCDEF1234
By switching from preg_replace() to preg_replace_callback() you can call an anonymous function to carry out the two tasks. First, the captured part number is assigned to $partnumber, then the empty string replaces the whole plugin substring (the full-string match).
use(&$partnumber) binds $partnumber into the callback by reference, so the value assigned inside the callback is still available in the calling scope afterwards.
My method assumes that there will only ever be one $partnumber found (this is why the 4th parameter is 1). If there are two or more, the 4th parameter must be removed and the $partnumber assignment must be written as an array push, $partnumber[]=$m[1], so that earlier matches aren't overwritten by later ones.
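The multi-match variant described above might look like the following sketch. The sample text and the second snippet (XYZ789, "Hoverboard") are made up for illustration, and a plain string is used in place of $article->text.

```php
<?php
// Made-up sample text with two plugin snippets.
$text = 'Intro {myplugin}ABCDEF1234,"Flux Capacitor"{/myplugin} middle '
      . '{myplugin}XYZ789,"Hoverboard"{/myplugin} end';

$partNumbers = [];
// No limit argument, so every snippet is processed.
$clean = preg_replace_callback(
    '~\{myplugin\}([^,]+),[^{]+\{/myplugin\}~',
    function (array $m) use (&$partNumbers): string {
        $partNumbers[] = $m[1]; // append, so earlier matches are kept
        return '';              // remove the whole snippet from the text
    },
    $text
);

echo $clean, PHP_EOL;
print_r($partNumbers);
```

Removing a snippet leaves the spaces that surrounded it, so the cleaned text may need its whitespace collapsed afterwards.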

How to get file_contents from each url of an urls array

I am trying to figure out how I can return the file contents of each URL in an array (urls_array). So far the following code, using simplehtmldom, gives me just one result and then fails inside the foreach loop.
$urlsall = 'http://php.net,
http://php.net/downloads,
http://php.net/docs.php,
http://php.net/get-involved,
http://php.net/support,
http://php.net/manual/en/getting-started.php,
http://php.net/manual/en/introduction.php,
http://php.net/manual/en/tutorial.php,
http://php.net/manual/en/langref.php,
http://php.net/manual/en/language.basic-syntax.php,
http://php.net/manual/en/language.types.php,
http://php.net/manual/en/language.variables.php,
http://php.net/manual/en/language.constants.php,
http://php.net/manual/en/language.expressions.php,
http://php.net/manual/en/language.operators.php,
http://php.net/manual/en/language.control-structures.php,
http://php.net/manual/en/language.functions.php,
http://php.net/manual/en/language.oop5.php,
http://php.net/manual/en/language.namespaces.php,
http://php.net/manual/en/language.errors.php,
http://php.net/manual/en/language.exceptions.php,
http://php.net/manual/en/language.generators.php,
http://php.net/manual/en/language.references.php,
http://php.net/manual/en/reserved.variables.php,
http://php.net/manual/en/reserved.exceptions.php,
http://php.net/manual/en/reserved.interfaces.php,
http://php.net/manual/en/context.php';
$urls_array = explode(',', $urlsall);
//var_dump ($urls_array);
foreach ($urls_array as $url)
{
$html = SimpleHtmlDom::file_get_html($url);
$title = $html->find('title',0);
echo $title->plaintext;
}
results : PHP: Hypertext Preprocessor
ERROR: An error occured, The error has been reported.
Error on Dec 18, 2015 17:16PM - file_get_contents( http://php.net/downloads): failed to open stream: Invalid argument in E:\xampp\htdocs\sitename\SimpleHtmlDom.php on line 81
What I want to do is get the titles of all the URLs in the above foreach loop.

Like I said in my comment: by the looks of things, the most likely cause of the problem is that you are using explode on a string with commas as delimiters. However, your string also contains a lot of whitespace, which you're not trimming. That would explain why the first URL passes without fault but the second one fails (that URL starts with a newline character).
I'd suggest you either define an array of URLs instead of a string you explode, or trim all the URLs:
$urls = array_map('trim', explode(',', $urlsall));
This calls trim for each value in the array that explode returns. However, that's a bit silly. You're hard-coding the urls to begin with, so why not write an array instead of a long string?
$urls = array(
'http://php.net',
'http://php.net/downloads',
'http://php.net/docs.php',
'http://php.net/get-involved',
'http://php.net/support',
'http://php.net/manual/en/getting-started.php',
//rest of the urls here
);
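To see the effect of the trim step in isolation, here is a small sketch with a shortened version of the URL string; no HTTP requests are made, it only compares the exploded values before and after trimming.

```php
<?php
// Shortened version of the URL string from the question, line breaks included.
$urlsall = 'http://php.net,
http://php.net/downloads,
http://php.net/docs.php';

$raw   = explode(',', $urlsall);  // entries after the first keep their leading line break
$clean = array_map('trim', $raw); // trim() strips that whitespace from every entry

var_dump($raw[1] === 'http://php.net/downloads');   // bool(false)
var_dump($clean[1] === 'http://php.net/downloads'); // bool(true)
```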

You get this error because some entries in your array contain line breaks.
A var_dump of your array gives:
array (size=27)
0 => string 'http://php.net' (length=14)
1 => string '
http://php.net/downloads' (length=26)
2 => string '
http://php.net/docs.php' (length=25)
3 => string '
http://php.net/get-involved' (length=29)
Why did you use explode?
Define the array directly instead:
$urlsall = array(
'http://php.net',
'http://php.net/downloads',
'http://php.net/docs.php',
'http://php.net/get-involved',
'http://php.net/support',
'http://php.net/manual/en/getting-started.php',
'http://php.net/manual/en/introduction.php',
'http://php.net/manual/en/tutorial.php',
'http://php.net/manual/en/langref.php',
'http://php.net/manual/en/language.basic-syntax.php',
'http://php.net/manual/en/language.types.php',
'http://php.net/manual/en/language.variables.php',
'http://php.net/manual/en/language.constants.php',
'http://php.net/manual/en/language.expressions.php',
'http://php.net/manual/en/language.operators.php',
'http://php.net/manual/en/language.control-structures.php',
'http://php.net/manual/en/language.functions.php',
'http://php.net/manual/en/language.oop5.php',
'http://php.net/manual/en/language.namespaces.php',
'http://php.net/manual/en/language.errors.php',
'http://php.net/manual/en/language.exceptions.php',
'http://php.net/manual/en/language.generators.php',
'http://php.net/manual/en/language.references.php',
'http://php.net/manual/en/reserved.variables.php',
'http://php.net/manual/en/reserved.exceptions.php',
'http://php.net/manual/en/reserved.interfaces.php',
'http://php.net/manual/en/context.php'
);

php convert string with new lines into array?

I am getting data from an API and the resulting string is
[RESPONSE]
PROPERTY[STATUS][0]=ACTIVE
PROPERTY[REGISTRATIONEXPIRATIONDATE][0]=2012-04-04 19:48:48
DESCRIPTION=Command completed successfully
QUEUETIME=0
CODE=200
RUNTIME=0.352
QUEUETIME=0
RUNTIME=0.8
EOF
I am trying to convert this into an array like
Array(
['PROPERTY[STATUS][0]'] => ACTIVE,
['CODE'] => 200,
...
);
So I am trying to split the result of file_get_contents with an explode like
$output = explode('=',file_get_contents($url));
But the problem is that the values are not always returned in the same order, so I need to have it like $array['CODE'] = 200 and $array['RUNTIME'] = 0.352. However, there do not seem to be any newline characters: I tried \r\n, \n, <br> and \r\n\r\n in the explode function to no avail, even though there are new lines in both Notepad and the browser.
So my question is: is there some way to determine whether a string is on a new line, or to determine which character is forcing the new line? If not, is there some other way I could read this into an array?

To find out what the breaking character is, you could do this (if $data contains the string example you've posted):
echo ord($data[strlen('[RESPONSE]')]) . PHP_EOL;
echo ord($data[strlen('[RESPONSE]')+1]); // if there's a second char
Then take a look in the ASCII table to see what it is.
EDIT: Then you could explode the data using that newly found character (chr() turns the ASCII value back into the character):
explode(chr($ascii_value), $data);
Btw, does file() return a correct array?

Explode on "\n" with double quotes so PHP understands this is a line feed and not a backslashed n ;-) then explode each item on =
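A sketch of that two-step split, using a shortened, made-up copy of the response as $data. It assumes the line endings may be \r\n or \r and normalizes them to \n first, and it skips lines without an = such as [RESPONSE] and EOF.

```php
<?php
// Shortened sample of the API response (an assumption for this sketch).
$data = "[RESPONSE]\nPROPERTY[STATUS][0]=ACTIVE\nCODE=200\nRUNTIME=0.352\nEOF\n";

// Normalize Windows and old Mac line endings so a single "\n" split is enough.
$data = str_replace(["\r\n", "\r"], "\n", $data);

$result = [];
foreach (explode("\n", $data) as $line) {
    if (strpos($line, '=') === false) {
        continue; // not a key=value line
    }
    // Limit of 2 keeps any further = characters inside the value.
    [$key, $value] = explode('=', $line, 2);
    $result[trim($key)] = trim($value);
}

print_r($result);
```

If a key appears twice, as QUEUETIME and RUNTIME do in the full response, the later value overwrites the earlier one.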

Why not just use parse_ini_file() or parse_ini_string()?
It should do everything you need (build an array) in one easy step.

Try
preg_split("/$/m", $str)
or
preg_split("/$\n?/m", $str)
for the split

The lazy solution would be:
$response = strtr($response, "\r", "\n");
preg_match_all('#^(.+)=(.+)\s*$#m', $response, $parts);
$parts = array_combine($parts[1], $parts[2]);
Gives you:
Array (
[PROPERTY[STATUS][0]] => ACTIVE
[PROPERTY[REGISTRATIONEXPIRATIONDATE][0]] => 2012-04-04 19:48:48
[DESCRIPTION] => Command completed successfully
[QUEUETIME] => 0
[CODE] => 200
[RUNTIME] => 0.8
)
