PHP file_get_contents optimization - php

I use a simple file_get_contents() call to grab data from another site and place it on mine.
<?php
$mic1link = "https://example.com/yyy.html";
$mic2link = "https://example.com/zzz.html";
$mic3link...
$mic4link...
$mic5link...
$mic6link...
?>
<?php
$content = file_get_contents($mic1link);
preg_match('#<span id="our_price_displays" class="price" itemprop="price" content=".*">(.*)</span>#Uis', $content, $mic1);
$mic1 = $mic1[1];
?>
<?php
$content = file_get_contents($mic2link);
preg_match('#<span id="our_price_displays" class="price" itemprop="price" content=".*">(.*)</span>#Uis', $content, $mic2);
$mic2 = $mic2[1];
?>
And the values are displayed with
<?php echo "$mic1";?> and <?php echo "$mic2";?>
It works, but it hurts performance (there is a noticeable delay).
Is there any way to optimize this script or maybe another way to achieve this?

As others have said, the first step is to use the Guzzle library for this instead of file_get_contents(). It lets you send the HTTP requests concurrently instead of one after another, which will help, although ultimately you will always be constrained by the performance of the remote sites.
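As a rough illustration (a sketch only, assuming Guzzle 7 is installed via Composer; the URLs are just the ones from the question), Guzzle can fire both requests concurrently so the total wait is roughly the slowest response rather than the sum of all of them:
<?php
require 'vendor/autoload.php';

use GuzzleHttp\Client;
use GuzzleHttp\Promise\Utils;

$client = new Client(['timeout' => 5]);

// Fire both requests at the same time instead of one after the other.
$promises = [
    'mic1' => $client->getAsync('https://example.com/yyy.html'),
    'mic2' => $client->getAsync('https://example.com/zzz.html'),
];

// Wait for all of them to finish.
$responses = Utils::unwrap($promises);

$prices = [];
foreach ($responses as $key => $response) {
    $html = (string) $response->getBody();
    if (preg_match('#<span id="our_price_displays" class="price" itemprop="price" content=".*">(.*)</span>#Uis', $html, $m)) {
        $prices[$key] = $m[1];
    }
}

echo ($prices['mic1'] ?? ''), ' and ', ($prices['mic2'] ?? '');
?>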
If at all possible, try to reduce the number of HTTP requests you have to make: can the remote site aggregate the data from all the requests into a single one? Or are you able to obtain the data via other means (e.g. direct requests to a remote database)? The answers here will depend on what the data is and where you're getting it from, but look for ways to achieve this, as those requests are going to be a bottleneck to your system no matter what.
If the resources are static (i.e. they don't change from one request to another), then you should cache them locally and read the local content rather than the remote content on every page load.
Caching can be done either the first time the page loads (in which case that first page load will still have the performance hit, but subsequent loads won't), or done by a separate background task (in which case your page needs to take account of the possibility of the content not being available in the cache if the page is loaded before the task runs). Either way, once the cache is populated, your page loads will be much faster.
If the resources are dynamic then you could still cache them as above, but you'll need to expire the cache more often, depending on how often the data are updated.
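As a rough illustration, a minimal file-based cache with a time-to-live could look like this (a sketch only; the helper name cached_get, the cache location and the 5-minute TTL are arbitrary choices, and real code would want locking and error handling):
<?php
// Return the remote content, refreshing a local cache file when it is
// older than $ttl seconds.
function cached_get(string $url, int $ttl = 300): string
{
    $cacheFile = sys_get_temp_dir() . '/remote_' . md5($url) . '.cache';

    if (is_file($cacheFile) && (time() - filemtime($cacheFile)) < $ttl) {
        return file_get_contents($cacheFile);   // fresh enough, use the local copy
    }

    $content = file_get_contents($url);         // slow remote fetch
    if ($content !== false) {
        file_put_contents($cacheFile, $content);
    }

    return $content !== false ? $content : '';
}

// Only the first call within the TTL pays the remote-request cost.
$content = cached_get('https://example.com/yyy.html', 300);
?>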
Finally, if the resources are specific to the individual page load (eg time-based data, or session- or user-specific) then you'll need to use different tactics to avoid the performance hit. Caching still has its place, but won't be anything like as useful in this scenario.
In this scenario, your best approach is to limit the amount of data being loaded in a single page load. You can do this in a number of ways. One option is a tabbed user interface, where the user has to click between tabs to see each bit of data. Each tab would be a different page load, so you'd be splitting the performance hit between multiple pages, making it less noticeable to the user, especially if you've used caching to make it seamless when they flip back to a tab they previously loaded.
Alternatively, if it all needs to be on the same page, you could use AJAX techniques to populate the different bits of data directly into the page. You might even be able to call the remote resources directly from the JavaScript in the browser rather than loading them in your back-end PHP code. This would remove the dog-leg of the data having to go via your server to get to the end user.
Lots to think about there. You'll probably want to mix and match bits of this with other ideas. I hope I've given you some useful tips though.

Related

Does PHP restrict transferring data between files?

In one file I make a query and get a result.
index.php (outputs the links to browser)
foreach ($conn->query($sql) as $info) {
    // $title_ is assumed to be set elsewhere in the original script
    $output_html = '<a href="addsearch.php?id=' . $info['id'] . '" target="_blank">' . $title_ . '</a>';
    print("$output_html<br>");
}
$_SESSION['info'] = $info; // this is how it works at the moment
In the other file I would like to get a copy of the result without using the GET, POST, or SESSION mechanisms. (I don't need GET/POST because the data stays on the server, ideally in nearby RAM. I'd also rather not use the SESSION variable because it uses the HDD.)
addsearch.php (launched only when the user clicks on the link)
session_start();
print_r($_SESSION['info']); // works now
...
Are there other methods to get the data? Any global in-RAM variables, a cache, or a shared resource between files?
I tried first example from PHP manual:
<?php
$a = 1;
include 'b.inc';
?>
but it doesn't work :-) because I launch the files separately, so they run as different processes.
PHP isn't restricting anything; it simply doesn't know about that data in the first place. There are two parts to understanding this:
PHP is a shared-nothing architecture.
HTTP is stateless.
The first is a design decision of PHP: every request receives a completely new environment, so data isn't held in memory between HTTP requests from users. This makes the language much more predictable, because actions on one page have very few side-effects on another.
The second, however, is more fundamental: even if PHP stored data between requests, it would be storing it in one pot for every user that accessed your site. That's because HTTP doesn't have any native tracking of "state", so two requests from the same user look fundamentally the same as two requests from different users.
That's where cookies and sessions come in: you send a cookie to the user's browser with an ID, and you tie some data to that ID, stored somewhere on your server. That somewhere doesn't need to be on disk - it could be in a memory store like memcache or Redis, in a database, etc - but because of PHP's "shared nothing" model, it can't just be in a PHP variable.
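For example (a sketch only, assuming the phpredis extension is installed and a Redis server is running locally), PHP's session handler can be pointed at Redis so that $_SESSION data lives in memory on the Redis server instead of in session files on disk:
<?php
// Store session data in Redis instead of on disk.
// Must be set before session_start().
ini_set('session.save_handler', 'redis');
ini_set('session.save_path', 'tcp://127.0.0.1:6379');

session_start();

// The $_SESSION API is unchanged - only the storage backend differs.
$_SESSION['info'] = ['id' => 42, 'title' => 'example'];
?>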
Another relevant concept is caching: storing (again, on disk, in a memory store, etc) the results of slow computations, so that when asked to do the same computation again you can just look up the answer. Whereas a session is good for remembering what the customer puts in their shopping cart, caching is good for displaying the same set of search results to every customer that enters the same search.

How to parallelize requests without memcache in PHP?

The page really needs to load fast, but the DB is slow, so we split the work into two DB calls, one faster and one slower. The first, faster one runs and lets us serve a part of the page that is quite usable by itself.
But then we want the second request to go off, and we know that it will ALWAYS be needed whenever the first request goes off. So now the first part of the page contains a script which fires off an HTTP request, which then makes the second DB call, and finally the rest of the page loads.
But this is a serial operation, which means the first part of the page load needs to finish its DB call, be returned over HTTP, render in the browser, run the script, send another HTTP request, wait for the second DB call, and only then return the whole page to us.
How do you go about solving this in PHP? We don't have memcache, and I looked into FIFOs but we don't have the posix_mkfifo function either.
I want to make both DB calls on the first request, serve the first part of the page, and let the second DB call continue running. When it's finished I want to keep the result in /tmp/ or a buffer or wherever is fast - in memory - and when the script asks for it, perhaps its HTTP request will need to wait for it a bit longer, or perhaps it's lucky and will get it served from memory straight away.
But where in memory do you keep it, across requests and PHP instances? Not in a global, not in the session, not in memcached. Where? Sockets?? Should I fork and pipe?
EDIT: Thanks, everybody. I went with the two-async-http-requests route.
I think you could use AJAX.
On the first request, send the HTML page with two JavaScript AJAX calls, one for each SQL query, triggered by the page load.
Then fill in the page asynchronously with those results.
The problem is that your requirement is too complex to solve without an extra component like memcache. Directly in PHP you can keep short-lived data in SHM (shared memory), but that's not the best solution.
The better long-term solution is to improve your database structure so you get a better result and a faster response from your database.
For better performance in your database you can look at MySQL MEMORY tables. But be careful: the tables are cleared after a restart, so only fill them with data you can regenerate, i.e. use them for caching.
And you can send more than one request at a time with AJAX.
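For completeness, here is a rough sketch of stashing a query result in System V shared memory with PHP's shmop extension (assuming the extension is enabled; the key, segment size and JSON format are arbitrary, $slowQueryResult is a hypothetical variable, and real code would need locking and error handling):
<?php
// Writer side: run after the slow DB call finishes.
$key  = ftok(__FILE__, 'q');                  // derive a shared-memory key
$data = json_encode(['rows' => $slowQueryResult ?? []]);

$shm = shmop_open($key, 'c', 0644, 65536);    // create/open a 64 KB segment
shmop_write($shm, str_pad($data, 65536, "\0"), 0);

// Reader side: run by the later request that needs the result.
$shm  = shmop_open($key, 'a', 0, 0);          // attach read-only
$raw  = rtrim(shmop_read($shm, 0, shmop_size($shm)), "\0");
$rows = json_decode($raw, true);
?>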

large select slowing down page load - caching php

I'm building a web app. To get started quickly for testing purposes, I load lots of data into session arrays from my database so I can use the values easily throughout the pages. I have one page that has numerous selects on it, and each time the PHP page loops through all the variables, chooses the selected one, and outputs the dropdown. One of my arrays, though, has just under 3000 values, and loading this dropdown slows the page down from about 300ms to 1-1.2s. Not terrible, but easy to tell that it is less responsive. So I'd like to know if there is any way for me to improve the load speed, or any thoughts on a substitute for the dropdown.
What I have tried so far:
Session arrays hold all the values; when the page is loaded through jQuery's AJAX method, the PHP page loops through these values and echoes the dropdowns.
PHP include - create PHP or HTML pages with all the values pre-written as selects; this produces a ~100kb page for the problem dropdown, which is then pulled in with include. It takes roughly the same amount of time, plus I'd then have to use JavaScript to set the selected value, but I'd do this if it could be improved. I thought perhaps some caching could provide improvements here. There seemed to be no significant difference between HTML and PHP pages for include, but I'd assume HTML would be better. I'm also assuming that I cannot use regular browser caching because I am using a PHP function to include these pages.
I have tried just loading the HTML page on its own and it takes about 1 sec on first load; after browser caching it is back down to 100-350ms, so I imagine caching could provide a huge boost in performance.
I have considered:
Creating a cached version of the whole page, but this would be quite a pain to implement, so I'd only do it if people thought it was the right way to go. I would also have to use AJAX to retrieve some data for the inputs, which I was originally doing with PHP echos.
Just removing the problem dropdown.
Just to clarify something I've never had clarified: am I correct in thinking that PHP pages can never be cached by the browser, and so by extension any PHP-included files can't be either? But then how come a JavaScript file linked to from a PHP page can be cached - because it is fetched as a separate request?
The data being returned and parsed into a dropdown is probably your bottleneck. However, if the bottleneck is actually the PHP code, you could try installing an opcode cache like APC (http://php.net/manual/en/book.apc.php). It will speed up your PHP. (Zend Optimizer is also available at http://www.zend.com/en/products/guard/runtime-decoders.)
If your bottleneck is the database that the items in the dropdown are coming from, you may want to try setting MySQL to cache the results.
You may also want to try an alternative dropdown that uses AJAX to populate itself as the user scrolls down, a few records at a time. You could also make it a text field that suggests possible matches as the user types. These approaches may be faster.
I suspect the problem is the raw size of the data you're transmitting, based on the results of number 2 in "What I have tried so far." I don't think you can rely on browser caching, and server-side caching won't change the size of the data transmitted.
Here are a couple of ideas to reduce the amount of data transmitted during page load:
1. Load the select box separately, after the main page has been delivered, using an asynchronous JavaScript call (a rough server-side sketch of this is below).
2. Break the choice into a hierarchical series of choices. The user chooses the top-level category, then another select box is populated with matching sub-categories. When they choose a sub-category, the third box fills with the actual options in that sub-category. Something like this.
Of course, the second idea only works if those 2nd and 3rd controls are filled in using an async JavaScript call.
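As a rough server-side sketch of the first idea (the file name options.php and the table and column names are hypothetical), the main page can render without the big list and fetch it afterwards from a small endpoint that returns JSON:
<?php
// options.php - returns the dropdown options as JSON so the main page
// stays small and can fetch this list asynchronously after it has loaded.
header('Content-Type: application/json');
header('Cache-Control: public, max-age=3600');   // let the browser cache it

$pdo  = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
$rows = $pdo->query('SELECT id, label FROM dropdown_options ORDER BY label')
            ->fetchAll(PDO::FETCH_ASSOC);

echo json_encode($rows);
?>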
Either way, make sure gzip compression is enabled on your server.
Edit: More on browser caching
The browser caches individual files, and you typically don't ask it to cache PHP pages because they may be different next time. (Individual PHP includes are invisible to the browser, because PHP combines their contents into the HTML stream.) If you use a browser's developer console (press F12 in Chrome and go to the Network tab, for example), you can see that most pages cause multiple requests from the browser to the server, and you may even see that some of those files (JS, CSS, images) are coming from the cache.
What the browser caches and for how long is controlled by various HTTP response headers, like Cache-Control and Expires. If you don't override these in php by calling the header function, they are controlled by the web server (Apache) configuration.
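For instance, a PHP page whose output genuinely doesn't change often can opt in to browser caching itself (a sketch only; the one-hour lifetime is arbitrary and render_page() is a stand-in for whatever builds the HTML):
<?php
// Tell the browser it may reuse this response for up to one hour
// before asking the server again. Headers must be sent before any output.
header('Cache-Control: public, max-age=3600');
header('Expires: ' . gmdate('D, d M Y H:i:s', time() + 3600) . ' GMT');

echo render_page();   // hypothetical function that builds the HTML
?>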

Efficiency of using php to load scripts?

I have a website that's about 10-12 pages strong, using jQuery/JavaScript throughout. Since not all scripts are necessary on each and every page, I'm currently using a switch statement to output only the needed JS on any given page, so as to reduce the number of requests.
My question is, how efficient is that, performance-wise? If it is not, is there any other way to selectively load only the needed JS on a page?
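(For context, the approach described presumably looks something like the following - this is only an assumption about what the question means, and the page names and script files are made up:)
<?php
// Decide which page-specific scripts to emit, on top of a common bundle.
$page = basename($_SERVER['SCRIPT_NAME'], '.php');

$scripts = ['js/common.min.js'];          // always needed

switch ($page) {
    case 'gallery':
        $scripts[] = 'js/lightbox.js';
        break;
    case 'contact':
        $scripts[] = 'js/form-validation.js';
        break;
    // other pages just get the common bundle
}

foreach ($scripts as $src) {
    echo '<script src="' . htmlspecialchars($src) . '"></script>' . "\n";
}
?>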
This may not be necessary at all.
Bear in mind that if your caching is properly set up, embedding a JavaScript file will take time only on first load - every subsequent request will come from the cache.
Unless you have big exceptions (like, a specific page using a huge JS library), I would consider embedding everything at all times, maybe using minification so everything is in one small file.
I don't see any performance issues with the method you are using, though. After all, it's about deciding whether to output a line of code or not. Use whichever method is most readable and maintainable in the long term.
Since you're using JS already, you can go with a purely JS solution - for example you could use yepnope instead of PHP. I don't know the structure of your website, how you determine which page needs what, or at what point something is included (on load, or after some remote call has finished delivering data); however, if you use $.ajax extensively, you could also use yepnope to pull in additional JS that's needed once $.ajax is done with what it was supposed to do.
You can safely assume the JavaScript is properly cached on the client side.
As I also assume you serve a minified file, then given the size of your website I'd say the performance impact is negligible.
It is much better to place ALL your JavaScript in a single separate ".js" file and reference this file in your pages.
The reason is that the browser will cache this file efficiently and it will only be downloaded once per session (or less!).
The only downside is you need to "refresh" a couple of times if you change your script.
So, after tinkering a bit, I decided to give LABjs a try. It does work well, and my code is much less bloated as a result. No noticeable increase in performance given the size of my site, but the code is much, much more maintainable now.
Funny thing is, I had a Facebook Like button in my header. After analyzing the requests in Firebug I decided to remove it, and gained an astounding 2 seconds in page loading time. Holy crap is that thing inefficient...
Thanks for the answers all !

Should I use AJAX or fetch all the data beforehand?

I have a web app where I need to change a drop down list dynamically depending on another drop down list.
I have two options:
Get all the data beforehand with PHP and "manage" it later with JavaScript.
Or get the data the user wants through AJAX.
The thing is that the page loads with all the data by default, and the user can later select a sub-category to narrow the drop-downs.
Which of the two options are better (faster, less resource intensive)?
The less resource intensive option is clearly AJAX since you're only transferring the required information and no more.
However, AJAX can make the page less responsive if the latency is high for the client (having to wait for connections to fetch data between drop-down choices).
So: Load everything up front if latency is a bigger issue, and use AJAX if bandwidth is more of an issue.
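As a sketch of the "load everything up front" option (the category names and data are made up; in practice the map would come from the database, and this only makes sense while the map stays reasonably small), PHP can embed the whole category-to-options map as JSON once, and the client-side script rebuilds the second drop-down with no further requests:
<?php
// Build the full category => options map once, server-side.
$optionsByCategory = [
    'guitars' => ['Stratocaster', 'Les Paul', 'Telecaster'],
    'drums'   => ['Snare', 'Kick', 'Hi-hat'],
];

// Embed it in the page so the dependent drop-down can be repopulated
// client-side when the first drop-down changes.
echo '<script>var optionsByCategory = '
   . json_encode($optionsByCategory)
   . ';</script>';
?>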
It depends on your main goal:
1.
With AJAX you'll be able to get the data you want without a page refresh, and only as it's needed, so your app will feel faster...
It also lets you keep a single block of code in an independent file that is "called by AJAX" when needed, so you can reuse that code across your app without loading it constantly!
2.
With PHP, you will have to prepare all the data beforehand, which means writing a little more code and making the initial load slower...
The performance difference is nothing a user will notice unless we are talking about a large amount of data.
In conclusion, AJAX is the better choice when talking about performance and code effectiveness!
PS: Personal opinion, of course!
If there is a considerable number of possible select options, I would use AJAX to get them dynamically. If you only have a very small set of select options, it would be worth considering embedding them in the page. Embedding in the page means no latency, and a snappier interface.
However, as stated previously, dynamic retrieval is very useful if you have a large set of options, or if the options are subject to change dynamically.
As with any ajax request, remember to display some form of visual feedback while the request is underway.
