Loading results into the buffer without echoing to the browser - PHP

How can I load loop results into a buffer using the PHP output control functions without echoing the results to the browser? In essence, what I'm trying to do is call results from the buffer as opposed to echoing my way through the loop "as it goes". Is it possible to do this? Any help appreciated. Thanks!

Use ob_get_contents to get the buffer contents without sending them.
To clean out the buffer, call ob_end_clean.
To do both in one step, call ob_get_clean.
An example would be:
ob_start();
foreach ($results as $result) {
    include("template/to/render/a/result.php");
}
$resultHTML = ob_get_clean();
Then later:
<div class='left-rail'><?= $resultHTML ?></div>

It's still not clear what you mean, but here's an answer based on past experience with PHP programmers: PHP is a full programming language, so you can build complex data structures without producing any output. If you're reading rows from a database, you can read them into an array, do whatever you want with the array, and then produce output when you're ready.
If you're generating output by scanning through the array in several steps, you can gradually build up a string (or several strings, if your case requires it) and output them once you know what you want to do.
Something along these lines:
$output = "";
foreach ($my_array as $row) {
$output .= "<li>".$row."</li>\n";
// plus various checks etc.
}
Am I getting the idea of what you're after?

Related

PHP Concatenate file output to string

I am rewriting a piece of a program to change it from using "echos" throughout the script to building one large output variable using heredocs instead, which is output at the end of the file.
A piece of the script includes another PHP file that directly outputs HTML and has PHP logic within the HTML it outputs. This file is used by other parts of the overall program that are not yet being rewritten (due to time constraints).
Is it possible to append the output of another file to an $output variable? I've tried the following, but neither works for string appending:
$output .= include 'foo.php';
$output .= file_get_contents('foo.php');
The file_get_contents call wrote all the PHP logic directly into the HTML, as I suspected it would, and the plain include echoed the HTML, as I also expected.
Is there a method to get the output buffer of the file and append it to a string?
EDIT: Never mind the question, I completely forgot about output buffering. Added an answer with my solution, no need to answer this one.
I feel stupid. I found the answer 5 minutes after posting; I completely forgot about output buffering:
ob_start();
include('./foo.php');
$output .= ob_get_contents();
ob_end_clean();
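For what it's worth, ob_get_clean (mentioned in the first answer on this page) collapses those last two calls into one, so the same solution could read:
ob_start();
include('./foo.php');
$output .= ob_get_clean(); // grabs the buffer contents and discards the buffer in one step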

Parse this data with PHP

Going to this website:
http://steamcommunity.com/market/priceoverview/?currency=3&appid=730&market_hash_name=AK-47%20%7C%20Redline%20%28Field-Tested%29
Yields this result:
{"success":true,"lowest_price":"5,59€ ","volume":"5,688","median_price":"5,92€ "}
The result is updated every time the page is refreshed. Using PHP, how would I be able to save the result line and split it up so I can use it for other things in my code? Would it be viable/possible to do this about 3000-5000 times from a loop in my code, or would it be too much and crash it? I won't be using all the data from it in my code, just saving it into a database and moving on to the next result.
That result is JSON and can be parsed with json_decode:
$data = file_get_contents('http://steamcommunity.com/market/priceoverview/?currency=3&appid=730&market_hash_name=AK-47%20%7C%20Redline%20%28Field-Tested%29');
$json = json_decode($data);
echo $json->volume;
As far as looping... why? You would simply be loading the page 3000+ times in a row. What would be the benefit of that? Perhaps you should consider a cron job instead, which could fetch the data at regular intervals (and not hammer the Steam servers).
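If the end goal is just to store each result in a database, here is a hedged sketch of the per-item step using PDO (the prices table, its columns, and the $url/$hashName variables are all assumptions standing in for your own loop):
$pdo  = new PDO('mysql:host=localhost;dbname=steam', 'user', 'pass');
$stmt = $pdo->prepare('INSERT INTO prices (name, lowest, median, volume) VALUES (?, ?, ?, ?)');
$data = file_get_contents($url); // one priceoverview request per item
$json = json_decode($data);
if ($json !== null && $json->success) {
    // lowest_price, median_price and volume come straight from the JSON shown above
    $stmt->execute(array($hashName, $json->lowest_price, $json->median_price, $json->volume));
}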
The response is in JSON format. So:
$content = file_get_contents('http://steamcommunity.com/market/priceoverview/?currency=3&appid=730&market_hash_name=AK-47%20%7C%20Redline%20%28Field-Tested%29');
$json = json_decode($content);
should work correctly. Note that json_decode returns the decoded value, so you need to capture it in a variable.
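One hedged addition worth making in practice: json_decode returns null when the payload cannot be parsed, so a small guard avoids confusing failures later (json_last_error_msg needs PHP 5.5+):
$json = json_decode($content);
if ($json === null) {
    // the response was empty or not valid JSON
    die('Could not decode response: ' . json_last_error_msg());
}
echo $json->median_price;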

Why isn't PHP continuing to run through these urls?

I have the following PHP. Basically, I'm getting similar data from multiple pages of a website (the current number of homeruns from a website that has a bunch of baseball player profiles). The JSON that I'm bringing in has all of the URLs to all of the different profiles that I'm looking to grab from, and so I need PHP to run through the URLs and grab the data. However, the following PHP only gets the info from the very first URL. I'm probably making a stupid mistake. Can anyone see why it's not going through all the URLs?
include('simple_html_dom.php');
$json = file_get_contents("http://example.com/homeruns.json");
$elements = json_decode($json);
foreach ($elements as $element) {
    $html = new simple_html_dom();
    $html->load_file($element->profileurl);
    $currenthomeruns = $html->find('.homeruns .current', 0);
    echo $element->name, " currently has the following number of homeruns: ", strip_tags($currenthomeruns);
    return $html;
}
Wait... you are using return $html. Why? return is going to break out of your function, thus stopping your foreach.
If you are indeed trying to get the $html out of your function for ALL of the elements, you should push each $html into an array and then return that array after the loop; a sketch follows.
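A minimal sketch of that fix, assuming the snippet lives inside a function (names carried over from the question; fetchAllProfiles is hypothetical):
function fetchAllProfiles($elements) {
    $pages = array();
    foreach ($elements as $element) {
        $html = new simple_html_dom();
        $html->load_file($element->profileurl);
        $currenthomeruns = $html->find('.homeruns .current', 0);
        echo $element->name, " currently has the following number of homeruns: ", strip_tags($currenthomeruns);
        $pages[] = $html; // collect each page instead of returning early
    }
    return $pages; // return once, after the loop has finished
}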
Because you return. return leaves the current function, method, or script, and that includes breaking out of every loop. With PHP 5.5 you can use yield to make the function behave like a generator, but that is definitely out of scope for now.
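For completeness, a minimal sketch of that generator idea (PHP 5.5+; profilePages is a hypothetical name):
function profilePages($elements) {
    foreach ($elements as $element) {
        $html = new simple_html_dom();
        $html->load_file($element->profileurl);
        yield $html; // hands back one page per iteration without ending the loop
    }
}
foreach (profilePages($elements) as $html) {
    // consume each page as it is produced
}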
Unless your braces are off, you return at the very end of the loop, so the loop will only ever run its first iteration.

Alternative to PHP preg_match to pull data from an external website?

I want to extract the content of a specific div on an external webpage; the div looks like this:
<dt>Win rate</dt><dd><div>50%</div></dd>
My target is the "50%". I'm currently using this PHP code to extract the content:
function getvalue($parameter, $content) {
    preg_match($parameter, $content, $match);
    return $match[1];
}
$parameter = '#<dt>Win rate</dt><dd><div>(.*)</div></dd>#';
$content = file_get_contents('https://somewebpage.com');
Everything works fine; the problem is that this method takes too much time, especially if I have to use it several times with different $content.
I would like to know if there's a better (faster, simpler, etc.) way to accomplish the same thing. Thx!
You may use DOMDocument::loadHTML and navigate your way to the given node.
$content = file_get_contents('https://somewebpage.com');
$doc = new DOMDocument();
$doc->loadHTML($content);
Now, to get to the desired node, you may use the method DOMDocument::getElementsByTagName, e.g.:
$dds = $doc->getElementsByTagName('dd');
foreach ($dds as $dd) {
    // process each <dd> element here, extract the inner <div> and its inner HTML...
}
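To home in on the exact node instead of walking every <dd>, DOMXPath works on the same loaded document; a minimal sketch against the markup from the question:
$xpath = new DOMXPath($doc);
// the <div> inside the <dd> that immediately follows the "Win rate" <dt>
$nodes = $xpath->query('//dt[.="Win rate"]/following-sibling::dd[1]/div');
if ($nodes->length > 0) {
    echo trim($nodes->item(0)->textContent); // e.g. "50%"
}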
Edit: I see the point @pebbl has made about DOMDocument being slower. Indeed it is; however, parsing HTML with preg_match is asking for trouble. In that case, I'd also recommend looking at an event-driven SAX XML parser. It is much more lightweight, faster, and less memory-intensive, as it does not build a tree. You may take a look at XML_HTMLSax for such a parser.
There are basically three main things you can do to improve the speed of your code:
Offload the external page load to another time (i.e. use cron)
On a Linux-based server I would suggest cron; seeing as you use Windows, I'm not sure what the exact equivalent would be, but cron allows you to fire off scripts at scheduled time offsets, in the background, without a browser. Basically, I would recommend that you create a script whose sole purpose is to go and fetch the website pages at a particular time offset (depending on how frequently you need to update your data) and then write those webpages to files on your local system.
$listOfSites = array(
    'http://www.something.com/page.htm',
    'http://www.something-else.co.uk/index.php',
);
$dirToContainSites = getcwd() . '/sites';
foreach ($listOfSites as $site) {
    $content = file_get_contents($site);
    /// i've just simply converted the URL into a filename here; there are
    /// better ways of handling this, but this at least keeps things simple.
    /// the following just converts any non-letter or non-number into an
    /// underscore... so, http___www_something_com_page_htm
    $file_name = preg_replace('/[^a-z0-9]/i', '_', $site);
    file_put_contents($dirToContainSites . '/' . $file_name, $content);
}
Once you've created this script, you then need to set the server up to execute it as regularly as you need. Then you can modify your front-end script that displays the stats to read from local files; this will give a significant speed increase.
You can find out how to read files from a directory here:
http://uk.php.net/manual/en/function.dir.php
Or the simpler method (though prone to possible problems) is just to re-step your array of sites, convert the URLs to file names using the preg_replace above, and then check for each file's existence in the folder, as sketched below.
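A minimal sketch of that read-back step, reusing the names from the fetch script above:
$dirToContainSites = getcwd() . '/sites';
foreach ($listOfSites as $site) {
    $file_name = preg_replace('/[^a-z0-9]/i', '_', $site);
    $path = $dirToContainSites . '/' . $file_name;
    if (file_exists($path)) {
        $html = file_get_contents($path); // read the cached copy, not the live site
        // ... run the extraction logic on $html here ...
    }
}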
Cache the result of calculating your statistics
It's quite likely, this being a stats page, that you'll want to visit it fairly frequently (not as frequently as a public page, but still). If the same page is visited more often than the cron-based script is executed, there is no reason to do all the calculation again. So basically, all you have to do to cache your output is something similar to the following:
$cachedVersion = getcwd() . '/cached/stats.html';
/// check to see if there is a cached version of this page
if ( file_exists($cachedVersion) ) {
    /// if so, load it and echo it to the browser
    echo file_get_contents($cachedVersion);
}
else {
    /// start output buffering so we can catch what we send to the browser
    ob_start();
    /// DO YOUR STATS CALCULATION HERE AND ECHO IT TO THE BROWSER LIKE NORMAL
    /// end output buffering and grab the contents so we now have a string
    /// of the page we've just generated
    $content = ob_get_contents(); ob_end_clean();
    /// write the content to the cached file for next time
    file_put_contents($cachedVersion, $content);
    echo $content;
}
Once you start caching things you need to be aware of when you should delete or clear your cache - otherwise, if you don't, your stats output will never change. With regard to this situation, the best time to clear your cache is at the point you go and fetch the external web pages again. So you should add these lines to the bottom of your "cron" script.
$cachedVersion = getcwd() . '/cached/stats.html';
unlink( $cachedVersion ); /// will delete the file
There are other speed improvements you could make to the caching system (you could even record the modified times of the external webpages and load only when they have been updated) but I've tried to keep things easy to explain.
Don't use an HTML parser for this situation
Scanning an HTML file for one particular unique value does not require a full-blown or even lightweight HTML parser. Using RegExp incorrectly is one of those traps that lots of starting-out programmers fall into, and it's a question that is always being asked. This has led to lots of automatic knee-jerk reactions from more experienced coders who adhere to the following logic:
if ( $askedAboutUsingRegExpForHTML ) {
    $automatically->orderTheSillyPersonToUse( $HTMLParser );
} else {
    $soundAdvice = $think->about( $theSituation );
    print $soundAdvice;
}
HTML parsers should be used when the target within the markup is not so unique, or when your pattern relies on such flimsy rules that it'll break the second an extra tag or character occurs. They should be used to make your code more reliable, not to speed things up. Even parsers that do not build a tree of all the elements will still use some form of string searching or regular expression notation, so unless the library code you are using has been compiled in an extremely optimised manner, it will not beat well-coded strpos/preg_match logic.
Considering I have not seen the HTML you are hoping to parse, I could be way off, but from what I've seen of your snippet it should be quite easy to find the value using a combination of strpos and preg_match. Obviously, if your HTML is more complex and might have multiple random occurrences of <dt>Win rate</dt><dd><div>50%</div></dd> it will cause problems - but even so, an HTML parser would have the same problem.
$offset = 0;
/// loop through the occurrences of 'Win rate'
while ( ($p = stripos($html, 'win rate', $offset)) !== FALSE ) {
    /// grab out a snippet of the surrounding HTML to speed up the RegExp
    /// (note: substr takes a length, not an end position)
    $snippet = substr($html, $p, 50);
    /// I've extended your RegExp to try and account for 'white space' that could
    /// occur around the elements. The following won't take into account any random
    /// attributes that may appear, so if you find some pages aren't working - echo
    /// out the $snippet var using something like "echo '<xmp>'.$snippet.'</xmp>';"
    /// and that should show you what is appearing that is breaking the RegExp.
    if ( preg_match('#^win\s+rate\s*</dt>\s*<dd>\s*<div>\s*([0-9]+%)\s*<#i', $snippet, $regs) ) {
        /// once you are here your % value will be in $regs[1];
        break; /// exit the while loop as we have found our 'Win rate'
    }
    /// advance the offset past this match for the next loop
    /// (leaving it at $p would find the same occurrence forever)
    $offset = $p + 1;
}
Gotchas to be aware of
If you are new to PHP, as you state in a comment above, then the above may seem rather complicated - which it is. What you are trying to do is quite complex, especially if you want to do it optimally and fast. However, if you follow through the code I've given and research any bits you aren't sure of / haven't heard of (php.net is your friend), it should give you a better understanding of a good way to achieve what you are doing.
Guessing ahead however, here are some of the problems you might face with the above:
File permission errors - in order to read and write files on the local operating system you will need the correct permissions to do so. If you find you cannot write files to a particular directory, it might be that your host won't allow you to. If this is the case, you can either contact them to ask about getting write permission to a folder, or if that isn't possible you can easily change the code above to use a database instead.
I can't see my content - when using output buffering, the echo and print commands do not get sent to the browser; they instead get saved up in memory. PHP should automatically output all the stored content when the script exits, but if you use a command like ob_end_clean() this actually wipes the 'buffer', so all the content is erased. This can lead to confusing situations when you know you are echoing something... but it just isn't appearing.
(Mini Disclaimer :) I've typed all the above manually so you may find there are PHP errors, if so, and they are baffling, just write them back here and StackOverflow can help you out)
Instead of trying not to use preg_match, why not just trim your document contents down in size? For example, you could dump everything before <body> and everything after </body>; then preg_match will have less content to search through.
Also, you could try to run each of these fetches as a pseudo-separate thread, so that they aren't happening one at a time; in PHP the usual way to get that effect is the curl_multi family, sketched below.
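A minimal sketch of parallel fetching with curl_multi (this is an assumption about how you'd structure it; $listOfSites carries over from the answer above):
$mh = curl_multi_init();
$handles = array();
foreach ($listOfSites as $i => $site) {
    $ch = curl_init($site);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_multi_add_handle($mh, $ch);
    $handles[$i] = $ch;
}
$running = null;
do {
    curl_multi_exec($mh, $running); // drive all transfers concurrently
    curl_multi_select($mh);         // wait for activity instead of busy-looping
} while ($running > 0);
$content = array();
foreach ($handles as $i => $ch) {
    $content[$i] = curl_multi_getcontent($ch);
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);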

PHP - To echo or not to echo?

What is more efficient and/or what is better practice, to echo the HTML or have many open and close php tags?
Obviously for big areas of HTML it is sensible to open and close the php tags. What about when dealing with something like generating XML? Should you open and close the php tags with a single echo for each piece of data or use a single echo with the XML tags included in quotations?
From a maintenance perspective, one should keep the HTML / XML as separate from the code as possible, IMO, so that minor changes can be made easily even by a non-technical person.
The more the markup stays in one homogeneous block, the cleaner the work.
One way to achieve this is to prepare as much as possible in variables and use the heredoc syntax:
// Preparation
$var1 = get_value("yxyz");
$var2 = get_url ("abc");
$var3 = ($count == 0 ? "Count is zero" : "Count is not zero");
$var4 = htmlentities(get_value("def"));
// Output
echo <<<EOT
<fieldset title="$var4">
    <ul class="$var1">
        <li>
            $var2
        </li>
    </ul>
</fieldset>
EOT;
You will want to use more sensible variable names, of course.
Edit: The link pointed out by @stesch in the comments provides some good arguments towards using a serializer when producing XML, and by extension even HTML, instead of printing it out as shown above. I don't think a serializer is necessary in every situation, especially from a maintenance standpoint where templates are so much easier to edit, but the link is well worth a read: HOWTO Avoid Being Called a Bozo When Producing XML
Another big advantage of the separation between logic and content is that if a transition to a templating engine, or the introduction of caching, becomes necessary one day, it's almost painless to implement because logic and markup are already separated.
PHP solves this problem with what is known as heredoc syntax. Check it out.
Example:
echo <<<EOD
<td class="itemname">{$k}s</td>
<td class="price">{$v}/kg</td>
EOD;
Note: The closing heredoc identifier (EOD in this example) must not be preceded by any spaces or indentation.
Whichever makes sense to you. The performance difference is marginal, even if a large echo is faster.
But an echo of a big string is hard to read, and many <?php echo $this->that; ?> tags tell a story :)
echo sends its argument further down the request-processing chain, and eventually the string is sent to the client through, say, a network socket. Depending on how echo works in conjunction with the underlying software layers (e.g. the webserver), your script may sometimes be able to execute faster than it can push data to the client. Without output buffering, that is. With output buffering you trade memory for speed: your echos are faster because they accumulate in a memory buffer, but only if there is no implicit buffering already going on. One would have to inspect the Apache source code to see how it treats PHP's stdout data.
That said, everything below is true only for scripts with output buffering enabled, since without it, the more data you attempt to push at once, the longer you have to wait (the client has to receive and acknowledge it, by way of TCP!).
It is more efficient to send a large string at once than to do N echos concatenating output. By similar logic, it is more efficient for the interpreter to enter the PHP code block (the PHP processing instruction in SGML/XML markup) once than to enter and exit it many times.
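For what it's worth, echo also accepts several comma-separated arguments, which sends the pieces without first building one concatenated string; a small illustration (reusing $k and $v from the heredoc example above):
// one statement, several arguments: no temporary concatenated string is built
echo '<td class="itemname">', $k, 's</td>',
     '<td class="price">', $v, '/kg</td>';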
As for me, I assemble my markup not with echo but with the XML DOM API; a sketch follows. This is also in accordance with the article linked above (I repeat the link: http://hsivonen.iki.fi/producing-xml/). This also answers the question of whether to use one or many PHP tags: use one tag that is your entire script, let it assemble the resulting markup, and send it to the client.
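A minimal sketch of that approach, building a list through the DOM API instead of echoing strings (assuming some $my_array of plain strings, as in an earlier answer):
$doc = new DOMDocument('1.0', 'UTF-8');
$ul  = $doc->createElement('ul');
$doc->appendChild($ul);
foreach ($my_array as $row) {
    $li = $doc->createElement('li');
    $li->appendChild($doc->createTextNode($row)); // text nodes are escaped for us
    $ul->appendChild($li);
}
echo $doc->saveXML(); // or saveHTML() for HTML serialization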
Personally, I tend to prefer whatever looks best, as code readability is very important, particularly in a team environment. In terms of best practice I'm afraid I'm not certain; however, it is usually best practice to optimize last, meaning you should write for readability first and then, if you encounter speed issues, do some refactoring.
Any issues you have with efficiency are likely to be elsewhere in your code unless you are doing millions of echos.
Another thing to consider is the use of an MVC structure to separate your "views" from all of your business logic, which is a very clean way to code. Using a template framework such as Smarty can take this one step further, leading to epic win.
Whatever you do, don't print XML!
See HOWTO Avoid Being Called a Bozo When Producing XML
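In the spirit of that article, a serializer keeps escaping and well-formedness out of your hands. A minimal sketch using PHP's built-in XMLWriter (the element and attribute names here are just an assumption):
$w = new XMLWriter();
$w->openMemory();
$w->startDocument('1.0', 'UTF-8');
$w->startElement('players');
$w->startElement('player');
$w->writeAttribute('name', 'example');
$w->text('50%'); // text and attribute values are escaped for us
$w->endElement(); // player
$w->endElement(); // players
$w->endDocument();
echo $w->outputMemory();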
I asked myself the same question a long time ago and came up with the same answer: it's not a considerable difference. I reached this conclusion with this test:
<?php
header('content-type: text/plain');

for ($i = 0; $i < 10; $i++) {
    $r = benchmark_functions(
        array('output_embed', 'output_single_quote', 'output_double_quote'),
        10000);
    var_dump($r);
}

function output_embed($i) {
?>test <?php echo $i; ?> :)<?php
}

function output_single_quote($i) {
    echo 'test '.$i.' :)';
}

function output_double_quote($i) {
    echo "test $i :)";
}

function benchmark_functions($functions, $amount = 1000) {
    if (!is_array($functions) || !$functions)
        return false;
    $result = array();
    foreach ($functions as $function)
        if (!function_exists($function))
            return false;
    ob_start();
    foreach ($functions as $idx => $function) {
        $start = microtime(true);
        for ($i = 0; $i < $amount; $i++) {
            $function($idx);
        }
        $time = microtime(true) - $start;
        $result[$idx.'_'.$function] = $time;
    }
    ob_end_clean();
    return $result;
}
?>
