I want to show the top 10 players on my server from gametracker.com on my webpage.
I looked up the source code of the gametracker.com page that shows the top 10 players, and the relevant part looks like this:
<div class="blocknew blocknew666">
<div class="blocknewhdr">
TOP 10 PLAYERS <span class="item_text_12">(Online & Offline)</span>
</div>
<table class="table_lst table_lst_stp">
<tr>
<td class="col_h c01">
Rank
</td>
<td class="col_h c02">
Name
</td>
<td class="col_h c03">
Score
</td>
<td class="col_h c04">
Time Played
</td>
</tr>
.
.
.
.
</table>
<div class="item_h10">
</div>
<a class="fbutton" href="/server_info/*.*.*.*:27015/top_players/">
View All Players & Stats
</a>
</div>
As you can see, the content I want is within class="blocknew blocknew666". I could have easily pulled it out if it were within an id, but I don't know how to handle it when the content is within a class. I looked around on the internet a bit and came across this:
// Create DOM from URL or file
$html = file_get_html('http://www.google.com/');
// Find all images
foreach($html->find('img') as $element)
echo $element->src . '<br>';
// Find all links
foreach($html->find('a') as $element)
echo $element->href . '<br>';
Is it possible to use this code to do what I want? If so, please write the line of code I would need, or give me some suggestions on how to tackle this issue.
I'm only going to post a partial answer, because I believe that doing this might be a violation of the terms of use for the GameTracker service: what you are asking for is essentially a method to scrape proprietary content from another website. You should most definitely get permission from GameTracker before you do this.
To do this I would use strstr. http://php.net/manual/en/function.strstr.php
$html = file_get_html('http://www.gametracker.com/server_info/someip/');
$topten = strstr($html, 'TOP 10 PLAYERS');
echo $topten; // this will print everything after the string you searched for.
Now I will leave it up to you to figure out how to chop off the unneeded content that comes after the top ten, and to get permission from GameTracker to use this.
Based on tremor's suggestion, this is the working code for the above problem:
<?php
function rstrstr($haystack,$needle)
{
return substr($haystack, 0,strpos($haystack, $needle));
}
$html = file_get_contents('http://www.gametracker.com/server_info/*.*.*.*:27015/');
$topten = strstr($html, 'TOP 10 PLAYERS'); // keep everything after the string searched for
$topten = strstr($topten, '<table class="table_lst table_lst_stp">');
$topten = rstrstr($topten, '<div class="item_h10">'); // trim off the trailing content that is not needed
echo $topten;
?>
The code you provided is part of the Simple HTML DOM library:
http://simplehtmldom.sourceforge.net/
You need to download and include the library for that code to work.
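Since the library is already in play, here is a minimal sketch of how find() could target the block directly by its class; the selector is assumed from the markup pasted in the question, and the URL placeholder is kept as-is:

```php
<?php
// Requires the Simple HTML DOM library to be downloaded and included.
include 'simple_html_dom.php';

$html = file_get_html('http://www.gametracker.com/server_info/*.*.*.*:27015/');

// find() accepts CSS-style selectors, so a class works as easily as an id:
$block = $html->find('div.blocknew666', 0);
if ($block !== null) {
    echo $block->outertext; // the whole TOP 10 PLAYERS block, markup included
}
```

This avoids the string-chopping entirely, at the cost of depending on the library.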
Related
I need to load a 3rd-party widget onto my website. The only way they distribute it is by means of a clumsy old <iframe>.
I don't have much choice, so what I do is get the iframe HTML code using a proxy page on my website, like so:
$iframe = file_get_contents('http://example.com/page_with_iframe_html.php');
Then I have to remove some specific parts of the iframe, like this:
$iframe = preg_replace('~<div class="someclass">[\s\S]*<\/div>~ix', '', $iframe);
This is how I intend to remove the unwanted section. In the end I simply output the iframe like so:
echo ($iframe);
The iframe gets output alright; however, the unwanted section is still there. The regex itself was tested on regex101, but it doesn't work.
Try it this way; hope this helps you out. Here I am using sample HTML to remove the div with the given class name: first I load the document, then query for the node and remove it from its parent.
Try this code snippet here
<?php
ini_set('display_errors', 1);
//sample HTML content
$string1='<html>'
. '<body>'
. '<div>This is div 1</div>'
. '<div class="someclass"> <span class="hot-line-text"> hotline: </span> <a id="hot-line-tel" class="hot-line-link" href="tel:0000" target="_parent"> <button class="hot-line-button"></button> <span class="hot-line-number">0000</span> </a> </div>'
. '</body>'
. '</html>';
$object= new DOMDocument();
$object->loadHTML($string1);
$xpathObj= new DOMXPath($object);
$result = $xpathObj->query('//div[@class="someclass"]');
foreach($result as $node)
{
$node->parentNode->removeChild($node);
}
echo $object->saveHTML();
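As a side note on why the regex route can misbehave: [\s\S]* is greedy, so the pattern can swallow everything up to the last </div> on the page rather than stopping at the section's own closing tag. A lazy quantifier limits the damage, though it is still fragile for nested divs; this standalone sketch (with made-up sample markup) shows the difference:

```php
<?php
// Made-up sample markup; the class name matches the one from the question.
$html = '<p>keep</p><div class="someclass">drop</div><p>keep too</p>';

// [\s\S]*? is lazy: it stops at the first </div> instead of the last one.
$cleaned = preg_replace('~<div class="someclass">[\s\S]*?</div>~i', '', $html);

echo $cleaned; // <p>keep</p><p>keep too</p>
```

The DOM approach remains the safer choice whenever the unwanted div can contain nested divs.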
I'd like to get the content (CSS, children, etc.) of an element to display on an HTML page, but this element is on an external page. When I use:
$page = new DOMDocument();
$page->loadHTMLFile('about.php');
$text = $page->getElementById('text');
echo $text->nodeValue;
I only get the text, but #text also has an image as a child and some CSS. Can I get (and echo) those too, kind of like with an iframe, but then with an element? If so, how?
Thanks a lot.
Maybe what you're looking for is DOMDocument::saveHTML().
If you pass it the optional node argument, it outputs only that particular node.
$elm = $page->getElementById('text');
echo $elm->ownerDocument->saveHTML($elm);
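A self-contained sketch of that approach; the inline HTML stands in for about.php (loadHTMLFile() on the real file works the same way), and the id and content are made up:

```php
<?php
$page = new DOMDocument();
// Inline stand-in for about.php; use loadHTMLFile('about.php') for a real file.
$page->loadHTML('<div id="text"><img src="img/dummy.png" alt=""/><span>text</span></div>');

$elm = $page->getElementById('text');
// Passing the node to saveHTML() serializes that node and its children,
// markup included, instead of the whole document:
echo $elm->ownerDocument->saveHTML($elm);
```

Unlike nodeValue, this keeps the child tags such as the image, which is exactly what the question was missing.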
I have found a solution. It doesn't retrieve the CSS, but if you only need the element and its children, this is my best bet.
Use simple_html_dom.php to do all the hard stuff.
My external page:
<div id='text'>
<img src='img/dummy.png' align='left' alt='Image not available. Our apologies.'/>
<span>text</span><br/>
<p>
text
</p>
<p>
text
</p>
<p>
text
</p>
</div>
Now, my page that I'd like to show the contents of my external page:
<?php include('../includes/simple_html_dom.php'); ?>
....
<?php
$html = file_get_html('about.php');
$ret = $html->find('div#text', 0);
echo $ret;
?>
What this does is echo the element with its children, though without the CSS, unfortunately.
I'd like to be able to grab data such as a list of articles from Yahoo Finance. At the moment I have a locally hosted webpage that searches Yahoo Finance for stock symbols (e.g. NOK); it then returns the opening price, the current price, and how far up or down the price has gone.
What I'd like to do is grab the related links that Yahoo has on the page. These links point to articles related to the share price, e.g. https://au.finance.yahoo.com/q?s=nok&ql=1 (scroll down to Headlines; I'd like to grab those links).
At the moment I'm working from a book (PHP Advanced for the World Wide Web; I know it's old, but I found it lying around yesterday and it's quite interesting :) ). The book says 'It's important when accessing web pages to know exactly where the data is'. I would think by now there would be a way around this, maybe the ability to search for links that contain a particular keyword or something like that!
I'm wondering if there's a special trick I can use to grab particular bits of data on a webpage, like crawlers do when they grab links related to something.
It would be great to know how to do this; then I'd be able to apply it to other subjects in the future.
I'll add the code I have at the moment. This is purely for practice, as I'm learning PHP in my course :)
##getquote.php
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Get Stock Quotes</title>
<link href='css/style.css' type="text/css" rel="stylesheet">
</head>
<body>
<h1>Stock Reader</h1>
<?php
//$read[1] = current price
//$read[5] = opening price
//$read[4] = change (down or up) from the opening price
//Step one
//Begin the PHP section by checking if the form has been submitted
if(isset($_POST['submit'])){
//Step two
//Check if a stock symbol was entered.
if(isset($_POST['symbol'])){
//Define the url to be opened
$url = 'http://quote.yahoo.com/d/quotes.csv?s=' . $_POST['symbol'] . '&f=sl1d1t1c1ohgv&e=.csv';
//Open the url; if we can't, stop the script and print a message
$fp = fopen($url, 'r') or die('Cannot Access YAHOO!.');
//This will get the first 30 characters from the file located in $fp
$read = fgetcsv ($fp, 30);
//Close the file handle.
fclose($fp);
include("php/displayDetails.php");
}
else{
echo "<div style='color:red'>Please enter a SYMBOL before submitting the form</div>";
}
}
?>
<form action='getquote.php' method='post'>
<p>Symbol: </p><input type='text' name='symbol'>
<br />
<input type="submit" value='Fetch Quote' name="submit">
</form>
<br />
<br />
##displayDetails.php
<div class='display-contents'>
<?php
echo "<div>Today's date: " . $read[2] . "</div>";
//Current price
echo "<div>The current value for " . $_POST["symbol"] . " is <strong>$ " . $read[1] . "</strong></div>";
//Opening Price
echo "<div>The opening value for " . $_POST["symbol"] . " is <strong>$ " . $read[5] . "</strong></div>";
if($read[1] < $read[5])
{
//Down or Up depending on opening.
echo "<div>" .strtoupper($_POST['symbol']) ."<span style='color:red'> <em>IS DOWN</em> </span><strong>$" . $read[4] . "</strong></div>";
}
else{
echo "<div>" . strtoupper($_POST['symbol']) ."<span style='color:green'> <em>IS UP</em> </span><strong>$" . $read[4] . "</strong></div>";
}
Code added to displayDetails.php:
function getLinks($url){
$siteContent = file_get_contents($url);
// everything after the opening of the headlines div is the content you want
$div = explode('class="yfi_headlines">', $siteContent)[1];
// everything before the closing "ft" div is the inner content of that div
$innerContent = explode('<div class="ft">', $div)[0];
$list = explode("<ul>", $innerContent)[1];
$list = explode("</ul>", $list)[0];
echo $list;
}
?>
</div>
I just put the same code in; I didn't really know what I should do with it!
I don't know about fgetcsv, but with file_get_contents you can grab the whole content of a page into a string variable.
Then you can search for links in the string (do not use regex to search HTML content: Link regex).
I briefly looked at Yahoo's source code, so you can do the following:
-yfi_headlines is a div class which wraps the desired links
$siteContent = file_get_contents($url);
$div = explode('class="yfi_headlines">',$siteContent)[1]; // every thing inside is a content you want
-the last class inside the searched div is: ft
$innerContent = explode('<div class="ft">',$div)[0]; //now you have inner content of your div;
Repeat to get the <ul> inner content:
$list = explode("<ul>",$innerContent)[1];
$list = explode("</ul>",$list)[0];
Now you have a list of links in the format: <li>text</li>
There are more efficient ways to parse a web page, like using DOMDocument:
Example
For getting content of a page you can also look at this answer
https://stackoverflow.com/a/15706743/2656311
[Additionally] If it is a large website, at the beginning of the function do ini_set("memory_limit","1024M"); so you can store more data in memory!
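The DOMDocument route just mentioned also covers the "links with a particular keyword" idea from the question; this is a minimal sketch on made-up markup (not Yahoo's actual structure), filtering anchors by a keyword in the href:

```php
<?php
// Made-up markup, not Yahoo's actual page structure.
$html = '<html><body>
    <a href="/news/nok-earnings">Nokia earnings beat estimates</a>
    <a href="/news/weather">Weather update</a>
    <a href="/news/nok-deal">Nokia signs network deal</a>
</body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Select only anchors whose href contains the keyword "nok":
foreach ($xpath->query('//a[contains(@href, "nok")]') as $link) {
    echo $link->getAttribute('href') . ' => ' . $link->textContent . "\n";
}
// Prints the two Nokia links and skips the weather one.
```

The same contains() filter can be applied to the link text instead of the href by querying //a[contains(., "keyword")].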
As I want to understand Simple HTML DOM a bit, I am playing around with it to test options on my localhost.
Basically I want to take the titles and intro's of this website and display them on my page.
The title as <h2> and the intro as <p>.
What am I doing wrong?
<?php
include 'simple_html_dom.php';
// Create DOM from URL
$html = file_get_html('http://www.nu.nl/algemeen');
foreach($html->find('div[class=list-overlay]') as $article){
$title['intro'] = $article->find('span[class=title]', 0)->innertext;
$intro['details'] = $article->find('span[class=excerpt]', 0)->innertext;
echo '<h2>'. $articles . '</h2>
<p>'. $title .'</p>';
}
?>
edit: There was a double line in there.
Your solution is almost right; you only have a few typos in variable names. Here is my edit of your code. I have also added a few comments to help you understand.
<?php
include 'simple_html_dom.php';
// Create DOM from URL
$html = file_get_html('http://www.nu.nl/algemeen');
// exctract all elements matching selector div[class=...]
foreach($html->find('div[class=list-overlay]') as $article){
// and for each extract first (0) element that matches to span[class=title]
$title = $article->find('span[class=title]', 0)->innertext;
// and do the same for intro, extract first element that belongs to selector
$intro = $article->find('span[class=excerpt]', 0)->innertext;
// and write it down simply
echo '<h2>'. $title . '</h2>';
echo '<p>' . $intro . '</p>';
}
?>
This solution isn't great, though. They have a bad HTML structure, so it is not easy to select only the articles, because they aren't kept in a div with an id like "articles", for example. You are a lucky man anyway, because they provide an XML feed of their articles that is much easier to parse (also less data to transfer, and so on). You can find it here and extract the titles and intros for your website.
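Parsing such a feed is only a few lines with SimpleXML. A sketch assuming a standard RSS 2.0 structure; the inline string stands in for the real feed (simplexml_load_file() on the feed URL works the same way), and the titles are made up:

```php
<?php
// Inline stand-in for the feed; for the real thing use simplexml_load_file($feedUrl).
$rss = simplexml_load_string('<rss version="2.0"><channel>
    <item><title>First article</title><description>Intro one</description></item>
    <item><title>Second article</title><description>Intro two</description></item>
</channel></rss>');

// Each <item> becomes a title as <h2> and an intro as <p>:
foreach ($rss->channel->item as $item) {
    echo '<h2>' . htmlspecialchars($item->title) . '</h2>';
    echo '<p>' . htmlspecialchars($item->description) . '</p>';
}
```

No screen-scraping, and the markup of their site can change without breaking anything.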
I'm using the Simple HTML DOM parser and I'm trying to get the company name from the following string:
<a data-omniture="LIST:COMPANYNAME" href="/b/Elite+Heating+and+Plumbing+Services-Plumbers-Reading-RG46RA-901196207/index.html" itemprop="name"> Elite Heating & Plumbing Services </a>
So, the bit between the a tags.
I have the following code:
<?php include_once('simple_html_dom.php');
$html = file_get_html('http://www.thelink.com/');
foreach($html->find('a') as $element)
    echo $element->href . '<br>';
echo '</ul>';
?>
This brings all the links back, but obviously I don't want all of them.
Also, is it possible to search for HTML5 data attributes, like the data-whatever info in the link?
Yes, you would just do:
$name = $html->find('a[itemprop=name]', 0)->text();
This isn't XPath, by the way; it's a CSS selector.
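For completeness on the data-attribute part of the question: Simple HTML DOM's selectors also support attribute matching, so something along these lines should work (a sketch assuming the library is included; the URL is the placeholder from the question):

```php
<?php
include_once 'simple_html_dom.php';

$html = file_get_html('http://www.thelink.com/');

// First anchor whose itemprop attribute equals "name":
$company = $html->find('a[itemprop=name]', 0);
echo trim($company->plaintext) . '<br>'; // the text between the a tags

// An attribute-existence selector picks up HTML5 data-* attributes too:
foreach ($html->find('a[data-omniture]') as $link) {
    echo $link->attr['data-omniture'] . '<br>';
}
```

plaintext strips the inner markup, which is why it returns just the company name rather than the whole anchor.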