PHP file_get_contents not showing url link - php

I'm having an issue with PHP's file_get_contents(). I have a text file of links, and I wrote a foreach loop that should display the content of every link on the same web page, but it isn't working. Please take a look at the code:
<?php
$urls = file("links.txt");
foreach($urls as $url) {
file_get_contents($url);
echo $url;
}
The content of links.txt is: https://www.google.com
Result: Only a String displaying "https://www.google.com"
Another code that works is :
$url1 = file_get_contents('https://google.com');
echo $url1;
This code returns Google's homepage, but I need to use the first method, with a loop, to handle multiple links.
Any idea?

Here's one way of combining the things you already had implemented:
$urls = file("links.txt");
foreach($urls as $url) {
$contents = file_get_contents($url);
echo $contents;
}
Both file and file_get_contents are functions that return a value. What you needed to do was store the return value of the latter in a variable and then output that variable with echo; your original code discarded the fetched content and echoed $url instead.
In fact, you don't even need the variable: this...
$urls = file("links.txt");
foreach($urls as $url) {
echo file_get_contents($url);
}
... should have been sufficient too.
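One detail can still trip up the loop above: file() keeps the line ending on every element unless FILE_IGNORE_NEW_LINES is passed, so each URL may reach file_get_contents() with a trailing newline. Below is a minimal sketch of a more defensive reader. The helper name readUrlList() is hypothetical, and a temporary file stands in for links.txt:

```php
<?php
// Hypothetical helper: read one URL per line, without line endings or blank lines.
function readUrlList(string $path): array
{
    $lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false) {
        return []; // the file could not be read
    }
    // trim() also drops stray spaces or a "\r" left by Windows line endings
    return array_values(array_filter(array_map('trim', $lines), 'strlen'));
}

// Demo with a temporary file standing in for links.txt
$tmp = tempnam(sys_get_temp_dir(), 'links');
file_put_contents($tmp, "https://www.google.com\r\n\nhttps://www.php.net\n");
$urls = readUrlList($tmp); // cleaned list
$raw = file($tmp);         // default behaviour, line endings kept
unlink($tmp);
```

In the original loop this would be used as foreach (readUrlList("links.txt") as $url) { echo file_get_contents($url); }. Fetching remote URLs this way also requires allow_url_fopen to be enabled.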

Related

PHP get meta tags function doesn't work with non http

I was hoping I can get help with a problem I am having.
I'm using PHP's get_meta_tags() function to check whether a tag exists on a list of websites. The problem occurs whenever a domain is listed without http://.
Ideally I would add http:// if it's missing, and I would also need a workaround for domains that use HTTPS. Here is the code I'm using.
I get this error when the loop reaches a site without http:// in the domain:
Warning: get_meta_tags(www.drhugopavon.com/): failed to open stream: No such file or directory in C:\xampp\htdocs\webresp\index.php on line 16
$urls = array(
'https://www.smilesbycarroll.com/',
'https://hurstbournedentalcare.com/',
'https://www.dentalhc.com/',
'https://www.springhurstdentistry.com/',
'https://www.smilesbycarroll.com/',
'www.drhugopavon.com/'
);
foreach ($urls as $url) {
$tags = get_meta_tags($url);
if (isset($tags['viewport'])) {
echo "$url tag exist" . "</br>";
}
if (!isset($tags['viewport'])) {
echo "$url tag doesnt exist" . "</br>";
}
}
You could use parse_url() to check whether the scheme element exists. If it doesn't, add it:
$urls = array(
'https://www.smilesbycarroll.com/',
'https://hurstbournedentalcare.com/',
'https://www.dentalhc.com/',
'https://www.springhurstdentistry.com/',
'https://www.smilesbycarroll.com/',
'www.drhugopavon.com/'
);
$urls = array_map(function($url) {
$data = parse_url($url);
if (!isset($data['scheme'])) $url = 'http://' . $url ;
return $url;
}, $urls);
print_r($urls);
You can use this to check whether the URL starts with http:
foreach($urls as $url){
if(strpos($url, "http") !== 0) // prepend http:// when the URL does not start with http (https URLs pass too)
$url = "http://" . $url;
$tags = get_meta_tags($url);
}
Another, simpler option is to check for :// in the URL:
foreach($urls as $url){
if(strpos($url, "://") === FALSE) // prepend http:// when the URL contains no scheme separator
$url = "http://" . $url;
$tags = get_meta_tags($url);
}
Or you can use regex like Wild Beard suggested
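The regex variant isn't shown in the thread, so here is a rough sketch of what it might look like. The helper name ensureScheme() and the choice of http as the default scheme are assumptions made for illustration:

```php
<?php
// Hypothetical helper: add a default scheme when the URL has none.
function ensureScheme(string $url, string $default = 'http'): string
{
    // A scheme is a letter followed by letters, digits, "+", "-" or ".", then "://"
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $url)) {
        return $url; // already has http://, https://, or another scheme
    }
    return $default . '://' . ltrim($url, '/');
}

$fixed = array_map('ensureScheme', [
    'https://www.dentalhc.com/',
    'www.drhugopavon.com/',
]);
```

Anchoring the pattern at the start of the string avoids treating a URL as complete just because "http" or "://" appears somewhere later in it, for example inside a query string.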
You know what's funny? I thought it was because it had http, but I put error_reporting(0); in my original code and it worked as I wanted it to, haha.
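Note that error_reporting(0) only hides the warning; get_meta_tags() still fails for that URL and returns false, so the site is reported as having no tag even though it was never read. A small sketch that separates "tag missing" from "page unreadable", using a hypothetical helper hasViewportTag() and a local temporary file so that it runs without network access:

```php
<?php
// Hypothetical helper: true/false for the viewport tag, null when the page could not be read.
function hasViewportTag(string $location): ?bool
{
    $tags = @get_meta_tags($location); // @ silences the warning; the result is checked instead
    if ($tags === false) {
        return null;
    }
    return isset($tags['viewport']);
}

// Demo on a local file standing in for a website
$tmp = tempnam(sys_get_temp_dir(), 'meta');
file_put_contents($tmp, '<html><head><meta name="viewport" content="width=device-width"></head><body></body></html>');
$found = hasViewportTag($tmp);
unlink($tmp);
$missing = hasViewportTag($tmp); // the file no longer exists, so the read fails
```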

PHP Why is my code not entering the foreach loop on line 20? The file_get_contents() doesnt seem to be working

The URL does get stored in $url.
But after I add echo $html to the code, PHP reports "undefined variable $url" and "file_get_contents(): filename cannot be empty".
Also, I have tried almost everything there is on Stack Overflow, including file_get_html() and cURL. Nothing seems to work. Please tell me where I'm going wrong here.
<?php
include_once('simple_html_dom.php');
$base_url = "https://www.instagram.com/";
$html = "";
if ( isset($_POST['username']) ) {
$url = $base_url.htmlspecialchars($_POST['username'])."/";//concatenate $base_url to username to generate full URL
}
$html = file_get_contents($url); //access the URL in $url
$doc = new DOMDocument;
$doc->loadHTML($html); //get HTML of the webpage given by file_get_contents
$tags = $doc->getElementsByTagName('img');
$arr = (array)$tags;
if (empty($arr)) {
echo 'emptyarray';
}
foreach ($tags as $tag) {
echo $tag->getAttribute('src');
}
?>
Edit:
If 'http://stackoverflow.com/questions' is used instead of 'https://www.instagram.com/its_kushal_here', file_get_contents() works fine and does not fail.
When you refreshed the page, did you make sure your POST parameters carried through to the new request?
The issue seems to be here:
if ( isset($_POST['username']) ) {
$url = $base_url.htmlspecialchars($_POST['username'])."/";
}
If $_POST['username'] is not set, then $url will not be defined, and file_get_contents() is called with an empty filename. Also remove the @ from @$doc->loadHTML($html); so you can see the errors it outputs. That will help you work out what fails after that point.
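One way to make that guard explicit is sketched below. The helper name buildProfileUrl() is hypothetical, and swapping htmlspecialchars() for rawurlencode() is a suggestion rather than something from the original code:

```php
<?php
// Hypothetical helper: build the profile URL, or return null when no username was posted.
function buildProfileUrl(array $post, string $baseUrl = 'https://www.instagram.com/'): ?string
{
    $username = trim($post['username'] ?? '');
    if ($username === '') {
        return null; // nothing to fetch
    }
    // rawurlencode() escapes for a URL path; htmlspecialchars() is meant for HTML output
    return $baseUrl . rawurlencode($username) . '/';
}

$url = buildProfileUrl(['username' => 'its_kushal_here']);
$none = buildProfileUrl([]); // simulates a request without POST data
```

In the script itself, this would be called as $url = buildProfileUrl($_POST); and the fetching and parsing would run only when $url is not null.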

Use a regex to get text from html source code

I have some PHP code that stores the HTML source of a site in a variable, and I want to extract just two links from that source.
The first link is in the content attribute of a meta tag:
<meta property="og:image" content="http://img.xxx.xx/vid/xxx/b7950d611f934f0eef95c1cd010348e3.jpg"/>
And the second is in a script call:
jw.load([{ file: 'http://vrbx105.xxx.xx/U7yvQnLiA_m5mhE9MUHf3w/1477628604/vl107aeb2d7db53f91fc6ad2e76fe11e49.mp4', provider: 'http' }]);
I need to get only those two links, they change every time a page is reloaded:
http://img.xxx.xx/vid/xxx/b7950d611f934f0eef95c1cd010348e3.jpg
http://vrbx105.xxx.xx/U7yvQnLiA_m5mhE9MUHf3w/1477628604/vl107aeb2d7db53f91fc6ad2e76fe11e49.mp4
If you insist on regex, here's one for the first link: https://regex101.com/r/CHpfDY/1
And here's the second: https://regex101.com/r/VVF0Gf/1
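The patterns behind those two links aren't reproduced in the thread, so the following is only a sketch of how such patterns could be applied with preg_match(). Both expressions are assumptions written against the two sample snippets in the question, not the ones from regex101:

```php
<?php
// Sample source, copied from the two snippets in the question
$html = <<<'HTML'
<meta property="og:image" content="http://img.xxx.xx/vid/xxx/b7950d611f934f0eef95c1cd010348e3.jpg"/>
<script>
jw.load([{ file: 'http://vrbx105.xxx.xx/U7yvQnLiA_m5mhE9MUHf3w/1477628604/vl107aeb2d7db53f91fc6ad2e76fe11e49.mp4', provider: 'http' }]);
</script>
HTML;

$image = $video = null; // stay null when nothing matches

// First link: content attribute of the og:image meta tag
if (preg_match('/<meta\s+property="og:image"\s+content="([^"]+)"/i', $html, $m)) {
    $image = $m[1];
}
// Second link: the file value passed to jw.load()
if (preg_match('/jw\.load\(\[\{\s*file:\s*\'([^\']+)\'/', $html, $m)) {
    $video = $m[1];
}
```

Both patterns depend on the exact attribute order and quoting shown in the question, so they would need adjusting if the page's markup changes.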
Unless you have a PHP JavaScript parser handy, you can at least get rid of the regular expression for the HTML search. Something like this should work, though it's hard to test without the URL...
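<?php
$url1 = $url2 = null; // stay null if nothing is found
libxml_use_internal_errors(true); // tolerate imperfect real-world HTML
$dom = new DOMDocument();
$dom->loadHTMLFile("http://example.com/example.html");
$xpath = new DOMXPath($dom);
// The content attribute of the og:image meta tag
$metanode = $xpath->query("//meta[@property='og:image']/@content");
if ($metanode->length) {
$url1 = $metanode->item(0)->value;
}
// Scan each script body line by line for the jw.load() call
$scriptnode = $xpath->query("//script");
foreach ($scriptnode as $script) {
$array = explode("\n", $script->nodeValue);
foreach ($array as $line) {
if (preg_match("/jw\.load\(\[\{ file: '(.*?)'/", $line, $matches)) {
$url2 = $matches[1];
break 2; // leave both loops at the first match
}
}
}
echo $url1;
echo $url2;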

Simple DOM file_get_html returns empty page

I'm having trouble passing a complex URL to file_get_html. When I try this code:
<?php
require_once("$_SERVER[DOCUMENT_ROOT]/dom/simple_html_dom.php");
$base = $_GET['url'];
//file_get_contents() reads remote webpage content
$html_base = file_get_html("http://www.realestateinvestar.com.au/ME2/dirmod.asp?sid=1A0FFDB3E8CD48909120C118D03F6016&nm=&type=news&mod=News&mid=9A02E3B96F2A415ABC72CB5F516B4C10&tier=3&nid=C67A9DD2C0144B9EB41DB58365C05927");
foreach($html_base->find('p') as $td) {
echo $td;
}
?>
It works.
But if I try to pass the url as a variable via mysite.com/goget.php?url=http://www.realestateinvestar.com.au/ME2/dirmod.asp?sid=1A0FFDB3E8CD48909120C118D03F6016&nm=&type=news&mod=News&mid=9A02E3B96F2A415ABC72CB5F516B4C10&tier=3&nid=C67A9DD2C0144B9EB41DB58365C05927
<?php
require_once("$_SERVER[DOCUMENT_ROOT]/dom/simple_html_dom.php");
$base = $_GET['url'];
//file_get_contents() reads remote webpage content
$html_base = file_get_html($base);
foreach($html_base->find('p') as $td) {
echo $td;
}
?>
It returns a blank page.
Any help?
Use urlencode():
"mysite.com/goget.php?url="
.urlencode("http://www.realestateinvestar.com.au/ME2/dirmod.asp?sid=1A0FFDB3E8CD48909120C118D03F6016&nm=&type=news&mod=News&mid=9A02E3B96F2A415ABC72CB5F516B4C10&tier=3&nid=C67A9DD2C0144B9EB41DB58365C05927")
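The reason this helps: without encoding, every & in the target URL is read as a separator in the outer query string, so $_GET['url'] ends at the first &. The sketch below demonstrates this with parse_str(), which splits a query string the same way PHP fills $_GET. The shortened target URL is a hypothetical stand-in for the long one above:

```php
<?php
// Hypothetical shortened target; any URL containing "&" behaves the same way
$target = 'http://example.com/dirmod.asp?sid=1A0F&nm=&type=news';

$plain = 'http://mysite.com/goget.php?url=' . $target;
$encoded = 'http://mysite.com/goget.php?url=' . urlencode($target);

// Split each query string the way PHP would when populating $_GET
parse_str(parse_url($plain, PHP_URL_QUERY), $getPlain);
parse_str(parse_url($encoded, PHP_URL_QUERY), $getEncoded);

// $getPlain['url'] is cut off at the first "&"; nm and type become separate parameters
// $getEncoded['url'] holds the complete target URL
```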

php get all files from a remote directory

I have searched for 3+ hours this morning and tried over 10 different setups for grabbing and displaying a list of images from a URL, and none of them worked correctly. I would end up with either no info displayed or a 500 error. Can someone point me to an example or help me do this properly? file_get_contents is not a viable option.
Example Directory: http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/
Files I know are in that directory:
001.jpg,
002.jpg,
003.jpg
I would like the output to be the exact URL of each file.
Let me know if more info is needed; I'm not 100% sure how to explain it right, lol.
Edit:
OK, so what I guess I actually want to do is check the URL for all the image tags and display a list with the full URL of each image.
I'm new to working with this URL + images + PHP stuff, so please don't hit me too hard with your downvote hammer with no comments, lol.
Code I Tried:
<?php
/*
Credits: Bit Repository
URL: http://www.bitrepository.com/
*/
$url = $location;
// Fetch page
$string = FetchPage($url);
// Regex that extracts the images (full tag)
$image_regex_src_url = '/<img[^>]*'.
'src=[\"|\'](.*)[\"|\']/Ui';
preg_match_all($image_regex, $string, $out, PREG_PATTERN_ORDER);
$img_tag_array = $out[0];
echo "<pre>"; print_r($img_tag_array); echo "</pre>";
// Regex for SRC Value
$image_regex_src_url = '/<img[^>]*'.
'src=[\"|\'](.*)[\"|\']/Ui';
preg_match_all($image_regex_src_url, $string, $out, PREG_PATTERN_ORDER);
$images_url_array = $out[1];
echo "<pre>"; print_r($images_url_array); echo "</pre>";
// Fetch Page Function
function FetchPage($path)
{
$file = fopen($path, "r");
if (!$file)
{
exit("The was a connection error!");
}
$data = '';
while (!feof($file))
{
// Extract the data from the file / url
$data .= fgets($file, 1024);
}
return $data;
}
?>
and it returned a blank page
This is based loosely on the code you already tried (which was riddled with problems). It grabs the full contents of the URL $url, parses out the <img> src attributes, and then outputs them.
Because this particular web host uses a <base href=""/> tag to reset the base part of all URLs on the page, I've added a $base variable, which you should set to the contents of the base tag.
Additionally, it looks like this particular web host has some pretty smart anti-hotlinking in place, so not all images may be visible.
But! Give it a whirl, let me know if it does what you need it to, and any questions.
<?php
$url = 'http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/';
$base = 'http://www.webtoonlive.com/';
// Pull in the external HTML contents
$contents = file_get_contents( $url );
// Use a regular expression to match every <img ... src="???">
preg_match_all( '/<img[^>]*src=["\'](.*)["\']/Ui', $contents, $out, PREG_PATTERN_ORDER);
foreach ( $out[1] as $k=>$v ){ // Step through all SRC's
// Prepend the $base URL unless the SRC is already absolute
if ( strpos( $v, 'http' ) !== 0 ) $v = $base . $v;
// Output a link to the URL
echo '<a href="' . $v . '">' . $v . '</a><br/>';
}
Sample output:
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/000.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/001.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/002.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/003.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/004.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/005.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/006.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/007.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/008.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/009.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/010.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/011.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/012.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/013.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/014.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/015.jpg
http://www.webtoonlive.com/webtoon/fantasy_world_survival/ch02/016.jpg
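For comparison, the same extraction can be done with DOMDocument instead of a regular expression, which tends to cope better with unusual attribute order or quoting. This is a sketch only: the helper name extractImageUrls() and the sample markup are made up for illustration, and the helper takes the HTML as a string so that it doesn't matter whether the page was fetched with file_get_contents() or cURL:

```php
<?php
// Hypothetical helper: collect absolute image URLs from an HTML string.
function extractImageUrls(string $html, string $base): array
{
    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // real-world HTML is rarely valid
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = [];
    foreach ($dom->getElementsByTagName('img') as $img) {
        $src = trim($img->getAttribute('src'));
        if ($src === '') {
            continue; // skip <img> tags without a src
        }
        // Prepend the base only when the src has no http(s) scheme of its own
        if (!preg_match('#^https?://#i', $src)) {
            $src = rtrim($base, '/') . '/' . ltrim($src, '/');
        }
        $urls[] = $src;
    }
    return $urls;
}

// Made-up sample markup: one relative src, one absolute src, one tag without src
$sample = '<html><body><img src="webtoon/ch02/001.jpg"><img src="http://example.com/a.png"><img alt="no src"></body></html>';
$found = extractImageUrls($sample, 'http://www.webtoonlive.com/');
```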
