PHP persistent service? [duplicate]

Possible duplicate: Cache Object in PHP without using serialize
So I have built a rather large data structure which cannot easily be turned into a relational database format. Using this data structure, the requests I make are very fast, but it takes about 4-5 seconds to load it into memory. What I want is to load it into memory once, have it sit there, and have it quickly answer individual requests, which of course is not the normal flow of the PHP scripts I usually write. Is there any good way to do this in PHP? (Again, no using a database; it has to use this specialized precomputed structure, which takes a long time to load into memory.)
EDIT: This tutorial roughly gives what I want, but it is pretty complicated, and I was hoping someone would have a more elegant solution. As the author says in the tutorial, the whole problem is that PHP is naturally stateless.

You absolutely must do something like what your linked tutorial proposes.
No PHP state persists between requests. This is by design.
Thus you will need some kind of separate long-running process, and with it some kind of IPC method, or else a better data structure that you can load piecemeal.
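To give a concrete flavor of that, here is a bare-bones sketch of a long-running PHP socket daemon; the port, the line-based protocol, and build_expensive_structure() are all invented for the example, and a real daemon would need error handling, concurrency, and a supervisor:

// daemon.php - hypothetical: load the structure once, answer lookups over a socket.
$structure = build_expensive_structure(); // your 4-5 second build, done once

$server = stream_socket_server('tcp://127.0.0.1:9000', $errno, $errstr);
if ($server === false) {
    die("Could not listen: $errstr\n");
}

while ($conn = stream_socket_accept($server, -1)) {
    $key = trim(fgets($conn));              // protocol: one key per line
    $answer = isset($structure[$key]) ? $structure[$key] : 'NOT FOUND';
    fwrite($conn, $answer . "\n");          // reply, then hang up
    fclose($conn);
}

A normal web request would then fsockopen() to 127.0.0.1:9000, write the key, and read back the answer.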
If you really can't put this into a relational database (such as SQLite; it doesn't have to be a separate server process), explore using some other kind of database, such as a file-based key-value store.
Note that it is extremely unlikely that any long-running process you write, in any language, will be faster, easier, or better than getting this data structure of yours into a real database, relational or otherwise! Get your data structure into a database! It's the easiest of your possible paths.
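To make the "SQLite as a file-based key-value store" idea concrete, here is a minimal sketch; the table layout and file path are made up for the example:

// Build once, offline; then each request does a cheap indexed lookup.
$db = new PDO('sqlite:/path/to/precomputed.db');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v BLOB)');

// One-time import of the precomputed structure:
$ins = $db->prepare('INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)');
foreach ($bigStructure as $key => $value) {
    $ins->execute(array($key, serialize($value)));
}

// Per-request lookup - no 4-5 second load:
$sel = $db->prepare('SELECT v FROM kv WHERE k = ?');
$sel->execute(array('some-key'));
$blob = $sel->fetchColumn();
$value = ($blob !== false) ? unserialize($blob) : null;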
Another thing you can do is simply make loading your data structure as fast as possible: serialize it to a file once, then unserialize that file on each request. If that is not fast enough, try igbinary, a much-faster-than-standard-PHP serializer.
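A minimal sketch of that, assuming the igbinary extension is installed; the cache path and build function are illustrative:

// Deserialize from a cache file if present; otherwise build and cache.
$cacheFile = '/tmp/structure.bin';

if (is_file($cacheFile)) {
    $structure = igbinary_unserialize(file_get_contents($cacheFile));
} else {
    $structure = build_expensive_structure(); // hypothetical 4-5 second build
    file_put_contents($cacheFile, igbinary_serialize($structure));
}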


Thoughts on doing MySQL queries vs using SESSION variables? [closed]

Just curious how other people feel about this. Will appreciate opinions or facts, whatever you got :)
I am working on an application where a lot of info is pulled from MySQL and needed on multiple pages.
Would it make more sense to...
1. Pull all the data ONCE and store it in SESSION variables to use on other pages
2. Pull the data from the database on each new page that needs it
I assume the preferred method is #1, but maybe there is some downside to using SESSION variables "too much"?
Side question that's kind of related: As far as URLs, is it preferable to have data stored in them (i.e. domain.com/somepage.php?somedata=something&otherdata=thisdata) or use SESSION variables to store that data so the URLs can stay general/clean (i.e. domain.com/somepage.php)?
Both are probably loaded questions but any possible insight would be appreciated.
Thanks!
Your question can't be answered in a way that applies everywhere.
Here's why: many web server architectures have the HTTP server (Apache, Nginx), the server-side language (PHP, Ruby, Python), and the RDBMS (MySQL, PostgreSQL) on one and the same machine.
That's one of the most common setups you can find.
Now, this is what happens in your scenario:
You connect to MySQL - you establish a connection from PHP to MySQL, and that "costs" a little
You request the data, so MySQL reads it from the hard drive (unless it's cached in RAM)
PHP gets the data and allocates some memory to hold the information
Now you save that to a session. But by default, sessions are disk-based, so you just issued a write operation and spent at least one I/O operation on your hard drive
But let's look at what happened - you moved some data from disk (MySQL) to RAM (a PHP variable), which then gets saved to disk again.
You really didn't help yourself or your system in that case; what actually happened is that you made things slower.
On the other hand, PHP (and other languages) can maintain persistent connections to MySQL (and other databases), so they minimize the cost of opening a new connection (which is really inexpensive in the grand scheme of things).
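For illustration, with PDO a persistent connection is a single flag; the host, credentials, and database name below are placeholders:

// PDO::ATTR_PERSISTENT reuses an already-open MySQL connection between
// requests instead of paying the connection cost every time.
$pdo = new PDO(
    'mysql:host=localhost;dbname=app;charset=utf8mb4',
    'user',
    'password',
    array(PDO::ATTR_PERSISTENT => true)
);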
As you can see, this is one scenario. There's another scenario where you have your HTTP server on a dedicated machine, PHP on a dedicated machine, and MySQL on a dedicated machine. The question is, again: is it cheaper to move data from MySQL to a PHP session? Is that session disk-based, Redis-based, Memcache-based, database-based? What's the cost of establishing the connection to MySQL?
What you need to ask, in any scenario you can imagine, is: what are you trading off, and for what?
So, if you are running the most common setup (PHP and your database on the same machine) - the answer is NO, it's not better to store some MySQL data in a session.
If you use InnoDB (and you probably are) and it's optimized properly, saving some data to a session to avoid the apparent overhead of querying the db for reads won't yield benefits. It's most likely going to be quite the opposite.
Putting it into the session is almost always a terrible idea. It's not even worth considering unless you've exhausted all other options.
Here's how you tackle these problems:
Evaluate whether there's anything you can do to simplify the query you're running, like trimming down the columns you fetch. Instead of SELECT * try SELECT x, y where those are the only columns you need.
Use EXPLAIN to find out why the query is taking so long. Look for any easy wins like adding indexes.
Check that your MySQL server is properly tuned. The default configuration is terrible and some simple one-line fixes can boost performance dramatically.
If, and only if, you've tried all these things and you can't squeeze out any more performance, you want to try and cache the results.
You only pull the pin on caching last because caching is one of the hardest things to get right.
You can use something like Memcached or Redis to act as a faster store for pre-fetched results. They're designed to automatically expire cached data that's no longer used.
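A minimal cache-aside sketch with the Memcached extension; the server address, cache key, query, and TTL are all illustrative:

// Try the cache first; on a miss, query MySQL and cache the rows.
$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);

$key  = 'report:widgets';
$rows = $mc->get($key);

if ($rows === false) {
    $pdo  = new PDO('mysql:host=localhost;dbname=app', 'user', 'password');
    $stmt = $pdo->query('SELECT x, y FROM widgets');
    $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
    $mc->set($key, $rows, 300); // expire after 5 minutes
}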
The reason using $_SESSION is a bad idea is that once data is put in there, very few developers take the time to properly expunge it later, leading to an ever-growing session. If you're concerned about performance, keep your sessions as small as possible.
Just think about your users (the client PC). A session takes up some space for each user, and a session can get lost, e.g. after closing the page or after copying the link and pasting it into another browser. Good practice, I think, is to just use queries; but note one thing: try as much as possible to reduce the number of queries per page, as they will slow down your site.

Using XML instead of MySQL

I run a large website and I'm looking into the advantages of using XML to retrieve the data rather than querying the database so much, in the hope of speeding things up a little.
The problem is, I have lots of AJAX requests etc. that go on across the website.
Are there any great advantages to using XML over MySQL, and how reliable is it? I.e. can I update the whole XML file on every update, or will that cause other users to lose access to the XML file for a few seconds while PHP writes the new file? Or should I use PHP to look for and update just that one field in the XML document (although that would still require rewriting the whole file)?
Any ideas and best-practice suggestions would be great here. How does Stack Overflow do it?
Using a database sounds ideal for the scenario you describe (many concurrent accesses, many small queries and updates, etc.)
Reading and writing an XML file is definitely not going to be faster - in fact, it's likely to be much, much slower. XML is not a choice you would make to improve performance.
If you are having performance problems, look at optimizing your database first.
Please don't do this.
Working with XML (file system) will never be faster than querying a database. Databases are used for a reason...
In answer to your last question (how does Stack Overflow do it): they use a database, namely SQL Server 2008: https://data.stackexchange.com/
Short and simple - databases were created to get past painfully slow file-based systems.
How does Stack Overflow do it?
An article from the site's founder: Back to basics

Speed/Memory Issue with PHP [duplicate]

Possible duplicate: Least memory intensive way to read a file in PHP
I have a problem with speed vs. memory usage.
I have a script which needs to run very quickly. All it does is load multiple files of 1-100MB each, consisting of a list of values, and check how many of them exist in another list.
My preferred way of doing this is to load the values from the file into an array (explode), and then loop through this array and check whether the value exists or not using isset.
The problem I have is that there are too many values; it uses up >10GB of memory (I don't know why it uses so much). So I have resorted to loading the values from the file into memory a few at a time, instead of just exploding the whole file. This cuts the memory usage right down, but is VERY slow.
Is there a better method?
Code Example:
$check = array('lots', 'of', 'values', 'here');
$check = array_flip($check); // flip so values become keys, testable with isset()
$values = explode('|', file_get_contents('bigfile.txt'));
$matches = 0;
foreach ($values as $value) {
    if (isset($check[$value])) {
        $matches++;
    }
}
Maybe you could code your own C extension for PHP (see e.g. this question), or write a small utility program in C and have PHP run it (perhaps using popen)?
This seems like a classic use case for some form of key/value-oriented NoSQL datastore (MongoDB, CouchDB, Riak), or maybe even just a large Memcache instance.
Assuming you can load the large data files into the datastore ahead of when you need to do the searching, and that you'll be using the data from the loaded files more than once, you should see some impressive gains (as long as your queries, map/reduce, etc. aren't awful). Judging by the size of your data, you may want to look at a datastore which doesn't need to hold everything in memory to be quick.
There are plenty of PHP drivers (and tutorials) for each of the datastores I mentioned above.
Open the files and read through them piecewise. Maybe use MySQL, either for the import (LOAD DATA INFILE), for the resulting data, or both.
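A sketch of that streaming idea, adapted to the '|'-delimited file from the question; it reads a bounded chunk at a time and carries any partial token over to the next chunk (chunk size and filenames are illustrative):

// Count matches without ever holding the whole file in memory.
$check   = array_flip(array('lots', 'of', 'values', 'here'));
$matches = 0;
$buffer  = '';

$fh = fopen('bigfile.txt', 'rb');
while (!feof($fh)) {
    $buffer .= fread($fh, 1048576);   // 1 MB per read
    $parts   = explode('|', $buffer);
    $buffer  = array_pop($parts);     // keep the possibly incomplete last token
    foreach ($parts as $value) {
        if (isset($check[$value])) {
            $matches++;
        }
    }
}
if ($buffer !== '' && isset($check[$buffer])) {
    $matches++;                       // don't forget the final token
}
fclose($fh);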
It seems you need an improved search engine.
The Sphinx search server can be used to search your values really fast.

Storing Website Data - JSON vs SQL

I'm managing a website that is built from the ground-up in PHP and uses AJAX extensively to load pages within the site.
Currently, the 'admin' interface I created stores the HTML for each page, the page's title, custom CSS for that page, and some other attributes in JSON in a file directory (e.g. example.com/hi would have its data stored in example.com/json/hi/data.json).
Is this more efficient (performance-wise) than using a database (MySQL) to store all of the content?
EDIT: I understand that direct filesystem access is more efficient than using a database, but I'm not just getting the contents of a file; I need to use json_decode to parse the JSON into an object/array first to work with it. Is the performance hit from that negligible?
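For reference, the read path in question looks roughly like this (the field names are simplified examples):

// Load and parse the stored page data for example.com/hi.
$raw  = file_get_contents(__DIR__ . '/json/hi/data.json');
$page = json_decode($raw, true); // true => decode to associative array

echo $page['title']; // e.g. the page title
echo $page['html'];  // e.g. the stored page HTML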
Your approach doesn't sound bad; in fact I used a not-so-different solution before, and it did the job pretty well. But you might consider more powerful storage as soon as your admin interface grows or the need to separate things arises. There are two ways:
SQL: using a relational database like PostgreSQL or MySQL
NoSQL: which seems to be closer to your interest; here is why.
Considering you're using AJAX for communication and JSON for storage, there's MongoDB: a NoSQL approach to storage that uses a JSON-like syntax for querying. There is even an interactive tutorial for it!
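A rough sketch of what that might look like with the official mongodb/mongodb library; the database, collection, and field names are invented for the example:

require 'vendor/autoload.php'; // composer require mongodb/mongodb

$client = new MongoDB\Client('mongodb://127.0.0.1:27017');
$pages  = $client->site->pages; // database "site", collection "pages"

// Upsert one page document - JSON-like structures all the way down.
$pages->replaceOne(
    array('slug' => 'hi'),
    array('slug' => 'hi', 'title' => 'Hi', 'html' => '<p>Hello</p>'),
    array('upsert' => true)
);

// Fetch it back for rendering.
$page = $pages->findOne(array('slug' => 'hi'));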
When it comes to performance, neither sounds faster; both engines are usually written in C or C++, that is, for best performance. Nevertheless, for a structure as simple as you describe, there's no faster way than direct file access.
That depends on what you use that data for.
If you only serve 'static' files, with almost the same data all the time, then even static HTML files are recommended for caching.
If, on the other hand, you process data and display it in multiple forms (searches, custom statistics, etc) then it is much better to store it in some kind of DB

How to improve on PHP's XML loading time?

Dropping my lurker status to finally ask a question...
I need to know how I can improve on the performance of a PHP script that draws its data from XML files.
Some background:
I've already mapped the bottleneck to CPU - but want to optimize the script's performance before taking a hit on processor costs. Specifically, the most CPU-consuming part of the script is the XML loading.
The reason I'm using XML to store object data is that the data needs to be accessible via a browser Flash interface, and we want to provide fast user access in that area. The project is still in its early stages though, so if best practice would be to abandon XML altogether, that would be a good answer too.
Lots of data: Currently plotting for roughly 100k objects, albeit usually small ones - and they must ALL be taken up into the script, with perhaps a few rare exceptions. The data set will only grow with time.
Frequent runs: Ideally, we'd run the script ~50k times an hour; realistically, we'd settle for ~1k/h runs. This coupled with data size makes performance optimization completely imperative.
I've already taken the optimization step of making several runs on the same data rather than loading it for each run, but it's still taking too long. The runs should generally use "fresh" data with the modifications done by users.
Just to clarify: is the data you're loading coming from XML files for processing in its current state and is it being modified before being sent to the Flash application?
It looks like you'd be better off using a database to store your data and pushing out XML as needed, rather than reading it in as XML first; if building the XML files gets slow, you could cache files as they're generated in order to avoid redundant generation of the same file.
If the XML stays relatively static, you could cache it as a PHP array, something like this:
<xml><foo>bar</foo></xml>
is cached in a file as
<?php return array('foo' => 'bar');
It should be faster for PHP to just include the arrayified version of the XML.
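A minimal sketch of generating such a cache file; simplexml_load_file plus a json_encode/json_decode round-trip is one common way to get the XML into an array, and var_export writes it back out as parseable PHP (file names are illustrative):

// One-time (or on-change) conversion of the XML into an includable PHP file.
$xml  = simplexml_load_file('data.xml');
$data = json_decode(json_encode($xml), true); // quick XML -> array conversion

file_put_contents(
    'data.cache.php',
    '<?php return ' . var_export($data, true) . ';'
);

// On each run thereafter - no XML parsing at all:
$data = include 'data.cache.php';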
~1k/hour against 3600 seconds per hour means a run roughly every 3.6 seconds (let alone the ~14 runs a second at the hoped-for 50k/hour)...
There are many questions. Some of them are:
Does your PHP script need to read/process all records of the data source for each single run? If not, what kind of subset does it need (~size, criteria, ...)?
Same question for the Flash application + who's sending the data? The PHP script? A "direct" request for the complete, static XML file?
What operations are performed on the data source?
Do you need some kind of concurrency mechanism?
...
And just because you want to deliver XML data to the Flash clients, it doesn't necessarily mean that you have to store XML data on the server. If, e.g., the clients only need a tiny little subset of the available records, it is probably a lot faster not to store the data as XML but as something more suited to speed and "searchability", and then create the XML output for the subset on the fly, maybe assisted by some caching, depending on what data the clients request and how/how much the data changes.
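A hedged sketch of that on-the-fly approach, assuming the records were moved into SQLite; the table, columns, and request parameter are invented for the example:

// Pull only the subset the client asked for, then emit it as XML.
$db   = new PDO('sqlite:objects.db');
$stmt = $db->prepare('SELECT id, x, y FROM objects WHERE region = ?');
$stmt->execute(array(isset($_GET['region']) ? $_GET['region'] : 'default'));

$doc = new SimpleXMLElement('<objects/>');
foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
    $node = $doc->addChild('object');
    $node->addAttribute('id', $row['id']);
    $node->addAttribute('x', $row['x']);
    $node->addAttribute('y', $row['y']);
}

header('Content-Type: text/xml');
echo $doc->asXML();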
edit: Let's assume that you really, really need the whole dataset and need a continuous simulation. Then you might want to consider a continuous process that keeps the complete "world model" in memory and operates on this model on each run (world tick). That way at least you wouldn't have to load the data on each tick. But such a process is usually written in something other than PHP.
