Grabbing content of external site CSS class (Steam store) - PHP

I have been playing around with this code for a while but can't get it to work properly.
My goal is to display, or maybe even create, a table with IDs of data grabbed from the Steam store for my own website and game library. The class is 'game_area_description'.
This is a study project of mine.
So I tried to get the table using the following code:
@section('selectedGame')
<?php
$url = 'https://store.steampowered.com/app/'.$game->appID."/";
header("Access-Control-Allow-Origin: ${url}");
$dom = new DOMDocument();
@$dom->loadHTMLFile($url);
$xpath = new DOMXpath($dom);
$elements = $xpath->query('//div[@class="game_area_description"]/a');
$link = $dom->saveHTML($elements->item(0));
echo $link;
?>
@endsection
I am using Laravel, by the way.
In some other cases I can get another piece of the website:
$url = 'https://store.steampowered.com/app/'.$game->appID."/";
$content = file_get_contents($url);
$first_step = explode( '<div class="game_description_snippet">' , $content );
$second_step = explode("</div>" , $first_step[1] );
echo "<p>${second_step[0]}</p>";
Here it just takes the excerpt of the webpage, which works in some cases.
Here is the biggest issue: other than not being able to get all the information, I get an error that $first_step[1] is not valid.
It seems to be some CORS issue.
The webpage loads an age check in some cases, like "Batman: Arkham Knight": the user needs to either log in or verify their age first,
which keeps me from using the second block of code.
The first block gives me all kinds of errors, as the screenshot shows.
Does anyone know of a way to grab the part of the page where the description of the game is?

The answer to my question was in the comments.
Apparently Steam has some undocumented APIs.
Here is the code (with Bootstrap CSS) that I used and am going to implement in my migration tables and seeder:
@section('selectedGame')
<div class="container border">
<!-- Content here -->
<?php
$url = "http://store.steampowered.com/api/appdetails?appids=".$game->appID;
$jsondata = file_get_contents($url);
$parsed = json_decode($jsondata,true);
$gameID = $game->appID;
$gameDescr = $parsed[$gameID]['data']['about_the_game'];
echo $gameDescr;
?>
</div>
@endsection
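One thing worth adding: the appdetails response carries a `success` flag per app ID, and the `data` key can be missing for some apps, so guarding on those before echoing avoids undefined-index notices. A minimal sketch of that check, run here against a hard-coded sample payload rather than a live request (the description text is made up for illustration):

```php
<?php
// Return the 'about_the_game' text from a decoded appdetails response,
// or null when the lookup failed or the keys are missing.
function gameDescription(array $parsed, int $appID): ?string
{
    $entry = $parsed[$appID] ?? null;
    if ($entry === null || empty($entry['success'])) {
        return null;
    }
    return $entry['data']['about_the_game'] ?? null;
}

// Hard-coded payload shaped like the appdetails response (text made up).
$jsondata = '{"620":{"success":true,"data":{"about_the_game":"Sample description text."}},"999999":{"success":false}}';
$parsed = json_decode($jsondata, true);

echo gameDescription($parsed, 620);
```

In the Blade section above, the same guard would mean echoing a fallback message instead of crashing when Steam returns `success: false`.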

Related

Web Scrape - not Working with a link, no result

I've got a problem. I wrote this PHP script to obtain the price from a website:
<?php
require('simple_html_dom.php'); //Library
$url = "https://www.exito.com/products/MP00550000000204/Televisor+LG+Led+43+Pulgadas+Full+HD+Smart+TV+43LJ550T"; //Link
$html = new simple_html_dom();
$html->load_file($url);
$post = $html->find('p[class=price offer]', 0)->plaintext;
$resultado = str_replace(".0", '', $post);
echo $resultado;
?>
So, if I test with the link that is in the code, it works and shows me the price of the article: 1186900.
But when I change the link to another one from the same site (this one):
https://www.exito.com/products/0000225183192526/Carne+Res+Molida+En+Bandeja
I test the script and it does not show me anything.
I don't understand, because it is the same site and the price is in the same <p></p> as in all the articles.
What am I doing wrong?
NOTE: remember that if you want to test the script you need to download simple_html_dom.php here.
I appreciate your help.
Thanks
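One likely cause worth checking: on some product pages the price markup is injected by JavaScript and simply isn't in the raw HTML, so `find()` matches nothing and calling `->plaintext` on the result fails. A defensive sketch of the same lookup using PHP's built-in DOM extension instead of simple_html_dom, run here against hard-coded snippets standing in for the two product pages:

```php
<?php
// Return the text of the first <p class="price offer">, or null when
// the element is not present in the markup at all.
function findPrice(string $html): ?string
{
    $dom = new DOMDocument();
    @$dom->loadHTML($html);      // @ suppresses warnings from messy markup
    $xpath = new DOMXPath($dom);
    $node = $xpath->query('//p[@class="price offer"]')->item(0);
    return $node === null ? null : trim($node->textContent);
}

// Hard-coded snippets standing in for the two product pages.
$withPrice    = '<html><body><p class="price offer">1186900.0</p></body></html>';
$withoutPrice = '<html><body><p>price rendered by JavaScript</p></body></html>';

var_dump(findPrice($withPrice));     // string(9) "1186900.0"
var_dump(findPrice($withoutPrice));  // NULL
```

If `findPrice()` returns null for the second URL, the price really isn't in the HTML the server sends, and no parser on the raw response will find it.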

PHP get external data, but if content's longer than x do this?

There is probably a better way of doing this, and if so please let me know, but I am trying to get data from an external website (this is just for a personal website I wanted to make for myself for ease of tracking stuff, plus I like to learn).
The issue I am having is that I am using XPath to get data from a specific div. The problem arises, in a very small number of cases, when the div is not there. I will use a URL below for which this is not working:
<?php
// exchange line 1
$url = "https://coinmarketcap.com/currencies/farstcoin/";
$dom = new DOMDocument();
@$dom->loadHTMLFile($url);
$xpath = new DOMXpath($dom);
// exchange line 1 name
$elementsname1 = $xpath->query('//*[@id="markets-table"]/tbody/tr[1]/td[2]/a/text()');
$exchange = $dom->saveHTML($elementsname1->item(0));
$exchangelower = strtolower($exchange);
$exchangelowerspace = str_replace(' ',"-",$exchangelower);
// exchange line 1 pair
$elementpair = $xpath->query('//*[@id="markets-table"]/tbody/tr[1]/td[3]/a/text()');
$pairing = $dom->saveHTML($elementpair->item(0));
// exchange line 1 volume
$elementvolume = $xpath->query('//*[@id="markets-table"]/tbody/tr[1]/td[4]');
$volume = $dom->saveHTML($elementvolume->item(0));
// exchange line 1 price
$elementprice = $xpath->query('//*[@id="markets-table"]/tbody/tr[1]/td[5]');
$exchangeprice = $dom->saveHTML($elementprice->item(0));
As I am doing this for a load of pages this worked best for me, but this one issue, while really rare, is annoying. I am using the below to display the data:
<div class="col-md-12">Price:<?php echo $exchangeprice; ?></div>
<div class="col-md-12">Volume:<?php echo $volume; ?></div>
<div class="col-md-12">Pair:<?php echo $pairing; ?></div>
For this one example it's pulling in the whole of the coinmarketcap.com website and displaying it on my page; in other examples it pulls the data as needed.
Is there a way of saying that if the string is longer than xxx characters, it has failed?
Thanks!
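Rather than measuring the string length afterwards, it may be simpler to test whether the query matched anything at all: `item(0)` returns `null` when the row is missing, and passing `null` to `saveHTML()` serialises the whole document, which is exactly the symptom described. A small guard helper, sketched here against a hard-coded table instead of the live page:

```php
<?php
// Serialize the first node matched by $query, or return a fallback
// when nothing matched (saveHTML(null) would dump the whole page).
function firstMatch(DOMDocument $dom, DOMXPath $xpath, string $query, string $fallback = 'n/a'): string
{
    $node = $xpath->query($query)->item(0);
    return $node === null ? $fallback : $dom->saveHTML($node);
}

// Hard-coded stand-in for the markets table.
$dom = new DOMDocument();
@$dom->loadHTML('<table id="markets-table"><tbody><tr><td>1</td><td><a>Binance</a></td></tr></tbody></table>');
$xpath = new DOMXPath($dom);

echo firstMatch($dom, $xpath, '//*[@id="markets-table"]/tbody/tr[1]/td[2]/a/text()'), "\n"; // Binance
echo firstMatch($dom, $xpath, '//*[@id="markets-table"]/tbody/tr[1]/td[9]'), "\n";          // n/a
```

Each of the four lookups in the question could go through a helper like this, so a missing table yields "n/a" instead of the entire page.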

Different output when parsing website

I'm trying to scrape the table containing "Institution Name:", "Institution Type:", etc. from the following website:
http://cricos.deewr.gov.au/Institution/InstitutionDetails.aspx?ProviderID=00001
Using PHP and simple_html_dom I've written the following code:
<?php
require 'simple_html_dom.php';
$html = file_get_html('http://cricos.deewr.gov.au/Institution/InstitutionDetails.aspx?ProviderID=00001');
$element = $html->find("table");
echo $element[1];
?>
I've iterated through the tables (e.g. $element[0], $element[1], $element[2], etc.)
but do not find the correct table.
I then tried the same code on a local copy of the website:
<?php
require 'simple_html_dom.php';
$html = file_get_html('Institution Details.htm');
$element = $html->find("table");
echo $element[1];
?>
And it works: I can see the table I am looking for under $element[1].
Why does it work on a local copy but not when using the URL? How do I make the first set of code work?
Thank you
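One common explanation for a difference like this is that the server returns different markup to clients that don't look like a browser (ASP.NET pages in particular may depend on request headers or session cookies that file_get_html never sends). A first thing to try is supplying browser-like headers through a stream context; this is a sketch, and the header values are only examples:

```php
<?php
// Build a stream context that sends browser-like headers;
// file_get_contents() will use it for the HTTP request.
$context = stream_context_create([
    'http' => [
        'method' => 'GET',
        'header' => implode("\r\n", [
            'User-Agent: Mozilla/5.0 (X11; Linux x86_64) Gecko/20100101 Firefox/115.0',
            'Accept: text/html',
        ]),
    ],
]);

// Fetch with the context, then hand the string to the parser, e.g.:
// $raw  = file_get_contents($url, false, $context);
// $html = str_get_html($raw);   // simple_html_dom's string loader
```

Comparing the fetched `$raw` with the local copy (e.g. by saving it to a file) would show whether the server is serving different HTML.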

How to create a sitemap with page relationships

I'm currently trying to figure out a way to write a script (preferably PHP) that would crawl through a site and create a sitemap. In addition to the traditional standard listing of pages, I'd like the script to keep track of which pages link to other pages.
Example pages
A
B
C
D
I'd like the output to give me something like the following:
Page Name: A
Pages linking to Page A:
B
C
D
Page Name: B
Pages linking to Page B:
A
C
etc...
I've come across multiple standard sitemap scripts, but nothing that really accomplishes what I am looking for.
EDIT
It seems I didn't give enough info; sorry about my lack of clarity there. Here is the code I currently have. I've used simple_html_dom.php to take care of parsing and searching through the HTML for me.
<?php
include("simple_html_dom.php");

$url = 'page_url';
$html = new simple_html_dom();
$html->load_file($url);
$linkmap = array();
foreach($html->find('a') as $link):
    // contains() is a custom helper that checks for a substring in the link
    if(contains("cms/education", $link)):
        if(!in_array($link, $linkmap)):
            $linkmap[$link->href] = array();
        endif;
    endif;
endforeach;
?>
Note: My little foreach loop just filters based on a specific substring in the url.
So, I have the necessary first-level pages. Where I am stuck is in creating a loop that will not run indefinitely, while keeping track of the pages I have already visited.
Basically, you need two arrays to control the flow here. The first will keep track of the pages you need to look at and the second will track the pages you have already looked at. Then you just run your existing code on each page until there are none left:
<?php
include("simple_html_dom.php");

$urlsToCheck = array();
$urlsToCheck[] = 'page_url';
$urlsChecked = array();
while(count($urlsToCheck) > 0)
{
    $url = array_pop($urlsToCheck);
    if (!in_array($url, $urlsChecked))
    {
        $urlsChecked[] = $url;
        $html = new simple_html_dom();
        $html->load_file($url);
        $linkmap = array();
        foreach($html->find('a') as $link):
            if(contains("cms/education", $link)):
                if((!in_array($link, $urlsToCheck)) && (!in_array($link, $urlsChecked)))
                    $urlsToCheck[] = $link;
                if(!in_array($link, $linkmap)):
                    $linkmap[$link->href] = array();
                endif;
            endif;
        endforeach;
    }
}
?>
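The loop above decides which pages to visit, but the question also asked for the relationships ("Pages linking to Page A"). That only needs one extra step inside the link loop: record the current page under each target it links to. A self-contained sketch of that bookkeeping, where a hard-coded link graph stands in for the fetched pages and getLinks() is a stand-in for the simple_html_dom parsing:

```php
<?php
// Stand-in for fetching a page and extracting its outbound links.
function getLinks(string $url, array $fakeSite): array
{
    return $fakeSite[$url] ?? [];
}

// Crawl from $start, returning target => list of pages linking to it.
function buildLinkMap(string $start, array $fakeSite): array
{
    $toCheck = [$start];
    $checked = [];
    $linkmap = [];

    while ($toCheck) {
        $url = array_pop($toCheck);
        if (in_array($url, $checked)) {
            continue;
        }
        $checked[] = $url;
        foreach (getLinks($url, $fakeSite) as $target) {
            $linkmap[$target][] = $url;   // record "this page links to $target"
            if (!in_array($target, $checked) && !in_array($target, $toCheck)) {
                $toCheck[] = $target;
            }
        }
    }
    return $linkmap;
}

// A -> B, C ;  B -> A ;  C -> B
$site = ['A' => ['B', 'C'], 'B' => ['A'], 'C' => ['B']];
print_r(buildLinkMap('A', $site));
```

Replacing getLinks() with the simple_html_dom code from the answer turns this into the real crawler; the visited/pending bookkeeping is the same as above.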

Only show certain ID with PHP web scrape?

I'm working on a personal project that gets the content of my local weather station's school/business closings and displays the results on my personal site. Since the site doesn't use an RSS feed (sadly), I was thinking of using a PHP scrape to get the contents of the page, but I only want to show a certain ID element. Is this possible?
My PHP code is:
<?php
$url = 'http://website.com';
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$output = curl_exec($ch);
curl_close($ch);
echo $output;
?>
I was thinking of using preg_match, but I'm not sure of the syntax or whether that's even the right function. The ID element I want to show is #LeftColumnContent_closings_dg.
Here's an example using DOMDocument. It pulls the text from the first <h1> element with id="test":
$html = '
<html>
<body>
<h1 id="test">test element text</h1>
<h1>test two</h1>
</body>
</html>
';
$dom = new DOMDocument;
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$res = $xpath->query('//h1[@id="test"]');
if ($res->item(0) !== NULL) {
    $test = $res->item(0)->nodeValue;
}
A library I've used with great success for this sort of thing is phpQuery: http://code.google.com/p/phpquery/ .
You basically get your website into a string (like you have above), then do:
phpQuery::newDocument($output);
$titleElement = pq('title');
$title = $titleElement->html();
For instance, that would get the contents of the <title> element. The benefit is that all the methods are named after the jQuery ones, making it pretty easy to learn if you already know jQuery.
