XML/API cannot be retrieved by PHP/curl

Yeah, I'm stumped. I'm getting nothing. curl_exec is returning no content. I've tried file_get_contents, but that completely times out. I'm attempting to get an API XML from my Subsonic media server and display it on my web server (different servers). The end result would be that I can have people log in to my web server with the media server account. I can deal with the actual parsing later, but I can't even grab the XML right now. I've tried their forums, but haven't gotten much help since they're not really PHP inclined. Figure I'd ask here.
$url = "http://{$subserver}/rest/getUser.view?u={$username}&p={$password}&username={$username}&v=1.8.0&c={$appID}";
$c = curl_init($url);
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
curl_setopt($c, CURLOPT_HEADER, 0);
$result = curl_exec($c);
curl_close($c);
echo $result;
This returns nothing. The variables are defined correctly, and I get the same response as if I typed in the whole URL. Here is their API page: http://www.subsonic.org/pages/api.jsp I've even tried with their "ping" function - still empty
The url itself looks fine. In the web browser, it returns:
This XML file does not appear to have any style information associated with it. The document tree is shown below.
<subsonic-response xmlns="http://subsonic.org/restapi" status="ok" version="1.8.0">
<user username="xxxxxx" email="xxxxxx@xxxxxx.com" scrobblingEnabled="false" adminRole="true" settingsRole="true" downloadRole="true" uploadRole="true" playlistRole="true" coverArtRole="true" commentRole="true" podcastRole="true" streamRole="true" jukeboxRole="true" shareRole="true"/>
</subsonic-response>
I admit I've never used XML, but according to everything I've read... this should work. And it does work, with other random XML files I found on the web.
It might have something to do with the fact that it's not a static ".xml" file but XML generated on request by a URL, since this exact same code works with a random XML file I found ( http://www.w3schools.com/xml/note.xml ).
Any thoughts?
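A minimal debugging sketch for the snippet above, not a confirmed fix. The helper names buildSubsonicUrl and fetchWithDiagnostics and the example host are made up for illustration. It URL-encodes the query parameters (a password containing & or a space would otherwise break the URL) and reports cURL's error and HTTP status instead of echoing a possibly empty result.

```php
<?php
// Hypothetical helper: build the getUser.view URL with encoded parameters.
function buildSubsonicUrl(string $server, string $user, string $pass, string $appId): string
{
    $query = http_build_query([
        'u'        => $user,
        'p'        => $pass,
        'username' => $user,
        'v'        => '1.8.0',
        'c'        => $appId,
    ], '', '&');

    return "http://{$server}/rest/getUser.view?{$query}";
}

// Hypothetical helper: fetch a URL and return what cURL saw,
// so a silent failure becomes visible.
function fetchWithDiagnostics(string $url): array
{
    $c = curl_init($url);
    curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($c, CURLOPT_CONNECTTIMEOUT, 10); // give up connecting after 10 seconds
    curl_setopt($c, CURLOPT_TIMEOUT, 30);        // give up on the whole transfer after 30 seconds

    $body = curl_exec($c); // false on failure, response body otherwise

    return [
        'body'   => $body,
        'errno'  => curl_errno($c),
        'error'  => curl_error($c),
        'status' => curl_getinfo($c, CURLINFO_HTTP_CODE),
    ];
}
```

To use it, call fetchWithDiagnostics(buildSubsonicUrl(...)) and var_dump the result. One thing worth ruling out: the Subsonic response shown above contains only tags and attributes, with no text between the tags, so a plain echo into an HTML page can look empty in the browser even when the request succeeded (note.xml, by contrast, has text content). Viewing the page source, or printing with echo htmlspecialchars($info['body']), would show whether the XML is actually there.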

Related

How to get json data without api

I'm trying to get the viewer count so I can check if a streamer is online on https://www.dlive.tv/. If you view the page source on a streamer's page (https://www.dlive.tv/thelongestchain), there's a bunch of json and "watchingCount" is there.
Basically, I want to have the streamer appear on the "Live Now" section of my site if their viewer count is 1 or more, but I can't figure out any way to get the viewer count. I know I could use something like Selenium if I was using Python and could run it from my PC, but I need my site to get this information on its own.
DLive doesn't have an api yet, so I don't know how to make a call(or request I don't know the terminology) to get this info. When I look in the network tab on chrome I see that there's a call (https://graphigo.prd.dlive.tv/) that provides stream info I think. Would I also need my authkey?
I realize this question is broad and all over the place but it's because so am I with me trying to solve this the last couple days. If I had the viewercount as a variable, I know how to display the streamer on the "Live Now" section of my site, I just don't know how to get the necessary data.
If there's another way I should be checking if a streamer is online or offline other than getting the viewercount, that would work too. If anyone could help me out I would greatly appreciate it, thanks.
I tried scraping the page but I don't think you can scrape dynamic content. When I tried to use SimpleHTMLDom it just returned static elements.
<?php
require 'simple_html_dom.php';
$html = file_get_html('https://www.dlive.tv/thelongestchain');
// file_get_html() returns false on failure, so check before calling find()
if ($html && $html->find('video', 0)) {
    echo 'online';
} else {
    echo 'offline';
}
/* The video element is only on the page if the streamer is live, but it is never found, presumably because it is not part of the static HTML */
?>
I have no idea at all how to go about making a call/request to get the json data for the viewer count, or how to get any other data that could check if a streamer is online. All the scraping I've done did not return any elements that weren't static (the same no matter if the streamer was online or offline).
Try cURL, it's like a magic wand.
This will return the entire page, and I think including the JSON you're looking for:
<?php
$curl = curl_init('https://www.dlive.tv/thelongestchain');
curl_setopt($curl, CURLOPT_FAILONERROR, true);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
$result = curl_exec($curl);
var_dump($result); // you do as you need to here
curl_close($curl);
?>
Here's the <script> containing the data I believe you need. Assuming "watchingCount" is the same thing you're looking for?
<script>window.__INITIAL_STATE__={"version":"1.2.0","snackbar":{"snackbar":null},"accessToken":{"accessToken":null},"userMeta":{"userMeta":{"fingerprint":null,"referrer":{"isFirstTime":true,"streamer":null,"happyHour":null,"user":null},"ipStats":null,"ip":"34.216.252.62","langCode":"en","trackingInfo":{"rank":"not available","prevPage":"not available","postStatus":"not available"},"darkMode":true,"NSFW":false,"prefetchID":"e278e744-f522-480e-a290-8eed0fe83b07","cashinEmail":""}},"me":{"me":null},"dialog":{"dialog":{"login":false,"subscribe":false,"cashIn":false,"chest":false,"chestWinners":false,"downloadApp":false}},"tabs":{"tabs":{"livestreamMobileActiveTab":"tab-chat"}},"globalInfo":{"globalInfo":{"languages":[],"communityUpdates":[],"recommendChannels":[]}},"ui":{"ui":{"viewPointWidth":1920,"mq":4,"isMobile":false}}};(function(){var s;(s=document.currentScript||document.scripts[document.scripts.length-1]).parentNode.removeChild(s);}());</script> <script>window.__APOLLO_STATE__={"defaultClient":{"user:dlive-00431789":{"id":"user:dlive-00431789","avatar":"https:\u002F\u002Fimages.prd.dlivecdn.com\u002Favatar\u002F5c1330d8-5bc8-11e9-ab17-865634f95b6b","__typename":"User","displayname":"thelongestchain","partnerStatus":"NONE","username":"dlive-00431789","canSubscribe":false,"subSetting":null,"followers":{"type":"id","generated":true,"id":"$user:dlive-00431789.followers","typename":"UserConnection"},"livestream":{"type":"id","generated":false,"id":"livestream:dlive-00431789+i7rCywMWg","typename":"Livestream"},"hostingLivestream":null,"offlineImage":"https:\u002F\u002Fimages.prd.dlivecdn.com\u002Fofflineimage\u002Fvideo-placeholder.png","banStatus":"NO_BAN","deactivated":false,"about":"#lovejonah\n\nJonah's NEW FRIENDS:\nhttps:\u002F\u002Fdlive.tv\u002FFlamenco https:\u002F\u002Fi.gyazo.com\u002F88416fca5047381105da289faba60e7c.png\nhttps:\u002F\u002Fdlive.tv\u002FHamsterSamster 
https:\u002F\u002Fi.gyazo.com\u002F984b19f77a1de5e3028e42ccd71052a0.png\nhttps:\u002F\u002Fdlive.tv\u002Fjayis4justice \nhttps:\u002F\u002Fdlive.tv\u002FDenomic\nhttps:\u002F\u002Fdlive.tv\u002FCutie\nhttps:\u002F\u002Fdlive.tv\u002FTruly_A_No_Life\n\n\n\n\n\n\n\n\n\n\n\n\nOur Socials:\nhttps:\u002F\u002Fwww.twitch.tv\u002Fthelongestchain\nhttps:\u002F\u002Fdiscord.gg\u002Fsagd68Z\n\nLINO website: https:\u002F\u002Flino.network\nLINO Whitepaper: https:\u002F\u002Fdocsend.com\u002Fview\u002Fy9qtwb6\nLINO Tracker : https:\u002F\u002Ftracker.lino.network\u002F#\u002F\nLINO Discord : https:\u002F\u002Fdiscord.gg\u002FTUxp3ww\n\nThe Legend of Lemon's: https:\u002F\u002Fbubbl.us\u002FNTE1OTA4MS85ODY3ODQwL2M0Y2NjNjRlYmI0ZGNkNDllOTljNDMxODExNjFmZDRk-X?utm_source=shared-link&utm_medium=link&s=9867840\n\nPC:\nAMD FX 6core 3.0ghz ddr3\n12GB RAM HyperFury X Blue ddr3\nCooler Master Hyper 6heatpipe cpu cooler\nGigabyte MB\n2 x EVGA 1070 FTW\nKingston SSD 120gb\nKingston SSD 240GB\nREDDRAGON Keyboard\nREDDRAGON Mouse\nBlack Out Blue Yeti Microphone\nLogitech C922\n\nApps Used:\nBig Trades Tracker: https:\u002F\u002Ftucsky.github.io\u002FSignificantTrades\u002F#\nMultiple Charts: 
\nhttps:\u002F\u002Fcryptotrading.toys\u002Fcrypto-panel\u002F\nhttps:\u002F\u002Fcryptowatch.net\n\n\n\n","treasureChest":{"type":"id","generated":true,"id":"$user:dlive-00431789.treasureChest","typename":"TreasureChest"},"videos":{"type":"id","generated":true,"id":"$user:dlive-00431789.videos","typename":"VideoConnection"},"pastBroadcasts":{"type":"id","generated":true,"id":"$user:dlive-00431789.pastBroadcasts","typename":"PastBroadcastConnection"},"following":{"type":"id","generated":true,"id":"$user:dlive-00431789.following","typename":"UserConnection"}},"$user:dlive-00431789.followers":{"totalCount":1000,"__typename":"UserConnection"},"livestream:dlive-00431789+i7rCywMWg":{"id":"livestream:dlive-00431789+i7rCywMWg","totalReward":"3243600","watchingCount":5,"permlink":"dlive-00431789+i7rCywMWg","title":"bybit 0.1eth HIGH LEVERAGE","content":"","category":{"type":"id","generated":false,"id":"category:11455","typename":"Category"},"creator":{"type":"id","generated":false,"id":"user:dlive-00431789","typename":"User"},"__typename":"Livestream","language":{"type":"id","generated":false,"id":"language:1","typename":"Language"},"watchTime({\"add\":false})":true,"disableAlert":false},"category:11455":{"id":"category:11455","backendID":11455,"title":"Cryptocurrency","__typename":"Category","imgUrl":"https:\u002F\u002Fimages.prd.dlivecdn.com\u002Fcategory\u002FCBAOENLDK"},"language:1":{"id":"language:1","language":"English","__typename":"Language"},"$user:dlive-00431789.treasureChest":{"value":"2144482","state":"COLLECTING","ongoingGiveaway":null,"__typename":"TreasureChest","expireAt":"1560400949000","buffs":[],"startGiveawayValueThreshold":"500000"},"$user:dlive-00431789.videos":{"totalCount":0,"__typename":"VideoConnection"},"$user:dlive-00431789.pastBroadcasts":{"totalCount":13,"__typename":"PastBroadcastConnection"},"$user:dlive-00431789.following":{"totalCount":41,"__typename":"UserConnection"},"ROOT_QUERY":{"userByDisplayName({\"displayname\":\"thelongestchain\"})"
:{"type":"id","generated":false,"id":"user:dlive-00431789","typename":"User"}}}};(function(){var s;(s=document.currentScript||document.scripts[document.scripts.length-1]).parentNode.removeChild(s);}());</script>
I assume you'll then just have to throw in a loop and make the url dynamic to get through whatever streamers you're monitoring with your site.
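To pull the number out of the returned page, something like the following might do. It is only a sketch: it assumes the page source still contains a "watchingCount":N pair as in the dump above, and the function name extractWatchingCount is made up for illustration. It uses a plain regular expression rather than decoding the whole JSON blob.

```php
<?php
// Hypothetical helper: return the first watchingCount found in the page
// source, or null if the key is not present (e.g. streamer offline or
// the page layout changed).
function extractWatchingCount(string $html): ?int
{
    if (preg_match('/"watchingCount"\s*:\s*(\d+)/', $html, $m) === 1) {
        return (int) $m[1];
    }
    return null;
}

// Usage with the $result string returned by curl_exec() above:
// $count = extractWatchingCount($result);
// echo ($count !== null && $count >= 1) ? 'online' : 'offline';
```

Whether a missing key reliably means "offline" is a guess on my part; it would need checking against an offline streamer's page.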

How can I print PHP code by calling a URL? I tried by file name, but I want to print by URL

I am trying to print PHP code on a web page by using its URL. I know I can print PHP code by file name using "show_source('filename.php');", but I want to print the code by URL, not by file.
I tried:-
<?php
show_source("http://URL.com/index.php");
?>
I also tried this code:-
<?php
$c = curl_init('http://URL.com');
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
//curl_setopt(... other options you want...)
$html = curl_exec($c);
if (curl_error($c))
die(curl_error($c));
// Get the status code
$status = curl_getinfo($c, CURLINFO_HTTP_CODE);
curl_close($c);
?>
I also tried this code:-
<?php
$html = file_get_contents('https://www.URl.com');
print_r ($html) ;
?>
Short answer: If the web server is configured correctly, it should be impossible to do what you are trying to do.
A correctly configured web server will only send content after PHP has processed it. If the web server is sending raw PHP when a .php file is requested, it is misconfigured. If you are trying to view your own PHP files from a server you control, you can try making a copy of the PHP files and changing the extension to .phps, which the server should send as raw PHP code. Note that this will expose the PHP source to the web, which could present a security risk.
As Mr. Squidward already mentioned, this should not be possible. Otherwise this would be a major security breach since you can store passwords for databases in the PHP files.
A possible solution for your problem would be to create a REST API on the second server with a function that reads the content of a specific file and returns it as JSON.
But make sure you don't expose any critical data, such as passwords or user data, through it.
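A rough sketch of what such an endpoint's core could look like, under the assumption that only an explicit whitelist of files may ever be returned. The function name sourceAsJson and the $allowedFiles map are hypothetical, not part of any existing API.

```php
<?php
// Hypothetical helper: return the source of a whitelisted file as JSON.
// $allowedFiles maps a public name to a path on disk; anything not in the
// map is refused, so config files with passwords cannot be requested.
function sourceAsJson(string $name, array $allowedFiles): string
{
    if (!isset($allowedFiles[$name])) {
        return json_encode(['error' => 'file not allowed']);
    }

    $path = $allowedFiles[$name];
    if (!is_readable($path)) {
        return json_encode(['error' => 'file not readable']);
    }

    return json_encode([
        'file'   => $name,
        'source' => file_get_contents($path),
    ]);
}

// Usage on the second server (names are examples only):
// header('Content-Type: application/json');
// echo sourceAsJson($_GET['file'] ?? '', ['index' => __DIR__ . '/index.php']);
```

The first server can then fetch that JSON with cURL or file_get_contents and print the 'source' field with htmlspecialchars or highlight_string.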

How to collect HTML source response from a remote server?

From within the HTML code in one of my server pages I need to address a search of a specific item on a database placed in another remote server that I don’t own myself.
Example of the search type that performs my request: http://www.remoteserver.com/items/search.php?search_size=XXL
The remote server provides to me - as client - the response displaying a page with several items that match my search criteria.
I don’t want to have this page displayed. What I want is to collect into a string (or local file) the full contents of the remote server HTML response (the code we have access when we click on ‘View Source’ in my IE browser client).
If I collect that data (it could easily reach 50000 bytes) I can then filter out the parts in which I am interested (substrings) and assemble a new request to the remote server for only one of the specific items in the response provided.
Is there any way through which I can get HTML from the response provided by the remote server with Javascript or PHP, and also avoid the display of the response in the browser itself?
I hope I have not confused your minds …
Thanks for any help you may provide.
As @mario mentioned, there are several different ways to do it.
Using file_get_contents():
$txt = file_get_contents('http://www.example.com/');
echo $txt;
Using php's curl functions:
$url = 'http://www.mysite.com';
$ch = curl_init($url);
// Tell curl_exec to return the text instead of sending it to STDOUT
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
// Don't include return header in output
curl_setopt($ch, CURLOPT_HEADER, 0);
$txt = curl_exec($ch);
curl_close($ch);
echo $txt;
curl is probably the most robust option because it gives you more control over the exact request parameters and lets you handle errors when things don't go as planned.
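Once the response is in a string, filtering it is ordinary string work. Here is a naive sketch, assuming the items you care about appear as double-quoted href attributes; the function name findLinksContaining and the sample markup are invented for illustration. A real page may need a proper HTML parser instead of a regular expression.

```php
<?php
// Hypothetical helper: collect every href value in $html that contains $needle.
function findLinksContaining(string $html, string $needle): array
{
    // Grab the value of every double-quoted href attribute.
    preg_match_all('/href="([^"]*)"/i', $html, $matches);

    $hits = [];
    foreach ($matches[1] as $href) {
        if (strpos($href, $needle) !== false) {
            // Turn &amp; back into & so the value can be used in a new request.
            $hits[] = html_entity_decode($href);
        }
    }
    return $hits;
}

// Stand-in for the $txt string fetched above.
$sample = '<a href="item.php?id=17&amp;size=XXL">Coat</a> <a href="about.php">About</a>';
print_r(findLinksContaining($sample, 'item.php'));
```

Nothing is echoed to the browser unless you print it yourself, so the remote page is never displayed.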

How to display images when using cURL?

When scraping page, I would like the images included with the text.
Currently I'm only able to scrape the text. For example, as a test script, I scraped Google's homepage and it only displayed the text, no images(Google logo).
I also created another test script using Redbox, with no success, same result.
Here's my attempt at scraping the Redbox 'Find a Movie' page:
<?php
$url = 'http://www.redbox.com/Titles/AvailableTitles.aspx';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,$url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$result= curl_exec ($ch);
curl_close ($ch);
echo $result;
?>
The page was broken: missing box art, missing scripts, etc.
Looking at the 'Net' tool of Firefox's Firebug extension (which lets me check headers and file paths), I discovered that Redbox's images and CSS files were not loaded (404 not found). The reason was that my browser was looking for Redbox's images and CSS files in the wrong place.
Apparently the Redbox images and CSS files are located relative to the domain, likewise for Google's logo. So if the page served by my script above uses my domain as the base for the file paths, how could I change this?
I tried altering the host and referer request headers with the script below, and I've googled extensively, but no luck.
My fix attempt:
<?php
$url = 'http://www.redbox.com/Titles/AvailableTitles.aspx';
$referer = 'http://www.redbox.com/Titles/AvailableTitles.aspx';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,$url);
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Host: www.redbox.com") );
curl_setopt ($ch, CURLOPT_REFERER, $referer);
curl_setopt($ch, CURLOPT_NOBODY, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$result= curl_exec ($ch);
curl_close ($ch);
echo $result;
?>
I hope I made sense, if not, let me know and I'll try to explain it better.
Any help would be great! Thanks.
UPDATE
Thanks to everyone(especially Marc, and Wyatt), your answers helped me figure out a method to implement.
I was able to successfully test by following the steps below:
1. Download the page and its requisites via Wget.
2. Add <base href="..." /> to the downloaded page's header.
3. Upload the revised downloaded page and its original requisites via Wput to a temporary server.
4. Test the uploaded page on the temporary server via a browser.
5. If the uploaded page is not displayed properly, some of the requisites might still be missing (CSS, JS, etc.). View which are missing via a tool that lets you view header responses (e.g. the 'Net' tool from Firefox's Firebug add-on). After locating the missing requisites, visit the original page that the uploaded page is based on, take note of the proper locations of the missing requisites, then revise the downloaded page from step 1 to accommodate the proper locations and begin at step 3 again. Otherwise, if the page is rendered properly, then success!
Note: When revising the downloaded page I manually edited the code. I'm sure you could use a regex or a parsing library on cURL's response to automate the process.
When you scrape a URL, you're retrieving a single file, be it HTML, image, CSS, JavaScript, etc... The document you see displayed in a browser is almost always the result of MULTIPLE files: the original HTML, each separate image, each CSS file, each JavaScript file. You enter only a single address, but fully building/displaying the page will require many HTTP requests.
When you scrape the Google home page via curl and output that HTML to the user, there's no way for the user's browser to know that it is actually viewing Google-sourced HTML: it appears as if the HTML came from your server, and your server only. The user's browser will happily take in this HTML, find the image references, and request the images from YOUR server, not Google's. Since you're not hosting any of Google's images, your server properly responds with a 404 "not found" error.
To make the page work properly, you've got a few choices. The easiest is to parse the HTML of the page and insert a <base href="..." /> tag into the document's header block. This will tell any viewing browsers that "relative" links within the document should be fetched from this 'base' source (e.g. Google).
A harder option is to parse the document and rewrite any references to external files (images ,css, js, etc...) and put in the URL of the originating server, so the user's browser goes to the original site and fetches from there.
The hardest option is to essentially set up a proxy server, and if a request comes in for a file that doesn't exist on your server, to try and fetch the corresponding file from Google via curl and output it to the user.
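The first (easiest) choice could be sketched roughly like this. It is an illustration only: the function name insertBaseHref is made up, and it assumes the fetched page has an ordinary opening <head> tag; if none is found, the tag is simply prepended.

```php
<?php
// Hypothetical helper: insert a <base> tag right after the opening <head>
// tag so that relative links resolve against $baseUrl.
function insertBaseHref(string $html, string $baseUrl): string
{
    $tag = '<base href="' . htmlspecialchars($baseUrl, ENT_QUOTES) . '" />';

    // A callback is used so characters such as "$" in the URL are never
    // interpreted as backreferences. Only the first <head> is touched.
    $result = preg_replace_callback(
        '/<head\b[^>]*>/i',
        function (array $m) use ($tag): string {
            return $m[0] . $tag;
        },
        $html,
        1,
        $count
    );

    // No <head> found: fall back to putting the tag at the very top.
    return $count === 1 ? $result : $tag . $html;
}

// Usage with the $result string from curl_exec():
// echo insertBaseHref($result, 'http://www.redbox.com/');
```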
If the site you're loading is using relative paths for its resource URLs (i.e. /images/whatever.gif instead of http://www.site.com/images/whatever.gif), you're going to need to do some rewriting of those URLs in the source you get back, since cURL won't do that itself, though Wget (official site seems to be down) does (and will even download and mirror the resources for you), but does not provide PHP bindings.
So, you need to come up with a methodology to scrape through the resulting source and change relative paths into absolute paths. A naive way would be something like this:
// Prefix $MY_BASE_URL to every src attribute that is not already an absolute http(s) URL
$result = preg_replace('/src="(?!https?:\/\/)([^"]*)"/', 'src="' . $MY_BASE_URL . '$1"', $result);
where $MY_BASE_URL is the base URL you want to rewrite, i.e. http://www.mydomain.com. That won't work for everything, but it should get you started. It's not an easy thing to do, and you might be better off just spawning off a wget command in the background and letting it mirror or rewrite the HTML for you.
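A slightly fuller sketch of the same idea, covering both src and href attributes. The function name absolutizeUrls is hypothetical, and it makes a simplifying assumption: every relative path is treated as relative to the site root, which is not true for pages in subdirectories that use paths like "css/site.css".

```php
<?php
// Hypothetical helper: rewrite relative src/href values to absolute URLs.
// Assumption: all relative paths are resolved against the site root.
function absolutizeUrls(string $html, string $baseUrl): string
{
    $base = rtrim($baseUrl, '/');

    $rewritten = preg_replace_callback(
        '/\b(src|href)="([^"]*)"/i',
        function (array $m) use ($base): string {
            $url = $m[2];
            // Leave empty values, absolute URLs (scheme:), protocol-relative
            // URLs (//host/...) and in-page anchors (#...) untouched.
            if ($url === '' || preg_match('/^(?:[a-z][a-z0-9+.-]*:|\/\/|#)/i', $url)) {
                return $m[0];
            }
            return $m[1] . '="' . $base . '/' . ltrim($url, '/') . '"';
        },
        $html
    );

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $rewritten ?? $html;
}

// Usage with the $result string from curl_exec():
// echo absolutizeUrls($result, 'http://www.redbox.com');
```

Like the one-liner above, this only handles double-quoted attributes and ignores URLs inside CSS or scripts.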
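Try obtaining the images by having the raw output returned, using the CURLOPT_BINARYTRANSFER option set to true (note that, per the PHP manual, this option has had no effect since PHP 5.1.3, because the raw output is always returned when CURLOPT_RETURNTRANSFER is used), as below: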
curl_setopt($ch,CURLOPT_BINARYTRANSFER, true);
I've used this successfully to obtain images and audio from a webpage.

file_get_contents() GET request not showing up on my webserver log

I've got a simple php script to ping some of my domains using file_get_contents(), however I have checked my logs and they are not recording any get requests.
I have
$result = file_get_contents($url);
echo $url . " pinged ok\n";
where $url for each of the domains is just a simple string of the form http://mydomain.com/, echo verifies this. Manual requests made by myself are showing.
Why would the get requests not be showing in my logs?
Actually I've got it to register the hit when I send $result to the browser. I guess this means the webserver only records browser requests? Is there any way to mimic such in php?
ok tried curl php:
// create curl resource
$ch = curl_init();
// set url
curl_setopt($ch, CURLOPT_URL, "getcorporate.co.nr");
//return the transfer as a string
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// $output contains the output string
$output = curl_exec($ch);
// close curl resource to free up system resources
curl_close($ch);
same effect though - no hit registered in logs. So far it only registers when I feed the http response back from my script to the browser. Obviously this will only work for a single request and not a bunch as is the purpose of my script.
If something else is going wrong, what debugging output can I look at?
Edit: D'oh! See comments below accepted answer for explanation of my erroneous thinking.
If the request is actually being made, it would be in the logs.
Your example code could be failing silently.
What happens if you do:
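<?php
$result = file_get_contents($url);
// Compare against false explicitly: an empty response body is still a successful request
if ($result !== false) {
    echo "Success";
} else {
    echo "Epic Fail!";
}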
If that's failing, you'll want to turn on some error reporting or logging and try to figure out why.
Note: if the allow_url_fopen setting is disabled (or you're on an old PHP version running in safe mode), file_get_contents() will not grab a remote page. This is the most likely reason things would be failing (assuming there's not a typo in the contents of $url).
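To see why a request fails instead of guessing, something along these lines may help. It is a sketch only; the function name fetchOrExplain is made up. It checks allow_url_fopen for http(s) URLs and otherwise reports the warning that file_get_contents() raised.

```php
<?php
// Hypothetical helper: fetch $url and return either the body or a reason
// for the failure.
function fetchOrExplain(string $url): array
{
    $isRemote = preg_match('#^https?://#i', $url) === 1;

    // Remote URLs need the allow_url_fopen ini setting to be enabled.
    if ($isRemote && !filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN)) {
        return ['ok' => false, 'error' => 'allow_url_fopen is disabled'];
    }

    // Suppress the warning here; it is still available via error_get_last().
    $body = @file_get_contents($url);
    if ($body === false) {
        $last = error_get_last();
        return ['ok' => false, 'error' => $last['message'] ?? 'unknown error'];
    }

    return ['ok' => true, 'body' => $body];
}

// Usage inside the ping loop:
// $r = fetchOrExplain($url);
// echo $url . ($r['ok'] ? " pinged ok\n" : " failed: {$r['error']}\n");
```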
Use curl instead?
That's odd. Maybe there is some caching afoot? Have you tried changing the URL dynamically ($url = $url."?timestamp=".time() for example)?
