I am having a memory problem in PHP. I need to produce a huge report, a table with a lot of information.
This information is also produced by some complex functions, which means it cannot be calculated as the user requests it: he/she would have to wait around 5 minutes for the information to be displayed.
I was caching the result, but as the information grew, the browser now just crashes and shows nothing. What are possible solutions for this?
I was thinking of storing the data in MySQL instead of the cache, and then running a couple of SELECTs when the user requests the information. What do you think of that solution? Are there any better options?
Update
It looks like the problem was not understood, so I am adding more detail.
A search is already being used. There are a few points to keep in mind:
1) The information itself has to be calculated. I have a cron job that builds it (it takes about 5 minutes) and stores it in the cache. The browser just renders from the cache, and the search runs over this cached data. The information cannot be obtained in real time.
2) That is why I was thinking of storing the calculated results in MySQL, so that the search can query the MySQL table instead of the cached data (which is huge and now impossible for the browser to handle).
I hope the problem is more clear now!
I think storing the data in MySQL is a good idea, especially if you build some indexes on frequently queried columns; caching will consume a lot of memory (it usually stores data in RAM).
Another thing you should consider: look at the server logs (access and error), because you could find clues to your problem there.
Finally, I hope you solve your problem.
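To sketch what that could look like (the table, column names and the calculateReport() helper are invented here; calculateReport() stands in for your slow 5-minute job), the cron would rewrite an indexed table and the search would then run ordinary queries against it:

$pdo = new PDO('mysql:host=localhost;dbname=app;charset=utf8mb4', 'user', 'pass');

// cron side: replace the old snapshot with the freshly calculated rows
$pdo->beginTransaction();
$pdo->exec('DELETE FROM report_cache');
$insert = $pdo->prepare('INSERT INTO report_cache (name, total, generated_at) VALUES (?, ?, NOW())');
foreach (calculateReport() as $row) {   // calculateReport() is hypothetical: the slow build step
    $insert->execute([$row['name'], $row['total']]);
}
$pdo->commit();

// request side: the search hits only the precalculated, indexed table
$term = $_GET['search'] ?? '';
$stmt = $pdo->prepare('SELECT * FROM report_cache WHERE name LIKE ? LIMIT 50');
$stmt->execute([$term . '%']);
$rows = $stmt->fetchAll(PDO::FETCH_ASSOC);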
Maybe you should paginate the output, if that is possible. I think storing the data in MySQL is not a great idea...
Check this parameter in php.ini:
memory_limit = 512M
and increase it to a higher value.
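If you cannot edit php.ini (on shared hosting, for instance), the same limit can usually be raised at runtime for just the one script that builds the report:

ini_set('memory_limit', '512M');   // affects only the current request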
So, I have a situation and I need a second opinion. I have a database and it's working great with all foreign keys, indexes and so on, but when I reach a certain number of visitors, around 700-800 concurrent visitors, my server hits a bottleneck and displays "Service temporarily unavailable." So I had an idea: what if I pull data from JSON instead of the database? I mean, I would still update the database, but on each update I would regenerate a JSON file and pull data from it to show on my homepage. That way I would not push my CPU too hard, and I would be able to have some kind of cache on the user end.
What you are describing is caching.
Yes, it's a common optimization to avoid over-burdening your database with query load.
The idea is you store a copy of data you had fetched from the database, and you hold it in some form that is quick to access on the application end. You could store it in RAM, or in a JSON file. Some people operate a Memcached or Redis in-memory database as a shared resource, so your app can run many processes or threads that access the same copy of data in RAM.
It's typical that your app reads some given data many times for every single time it updates the data. The greater this ratio of reads to writes, the better the savings in terms of lightening the load on your database.
It can be tricky, however, to keep the data in cache in sync with the most recent changes in the database. In other words, how do all the cache copies know when they should re-fetch the data from the database?
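A minimal read-through sketch with the PHP Memcached extension (the key name, TTL and the fetchFromDatabase() helper are assumptions): try the cache first, fall back to the database on a miss, and let a short expiration bound how stale the copy can get:

$cache = new Memcached();
$cache->addServer('127.0.0.1', 11211);

$data = $cache->get('homepage_data');
if ($cache->getResultCode() === Memcached::RES_NOTFOUND) {
    $data = fetchFromDatabase();              // hypothetical helper running the real queries
    $cache->set('homepage_data', $data, 60);  // cached copy expires after 60 seconds
}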
There's an old joke about this:
There are only two hard things in Computer Science: cache invalidation and naming things.
— Phil Karlton
So, after another few days of exploring and trying to get the right answer, this is what I have done. I decided to create another table, instead of a JSON file, and put all the data that was supposed to go into the JSON file into that table.
WHY?
Number one reason is that MySQL can lock tables while they're being updated; JSON cannot.
Number two is that I go down from a few dozen queries to just one, the simplest query: SELECT * FROM table.
Number three is that I have better control over the content this way.
Number four is that, while I was searching for an answer, I found out that some people had issues with JSON availability when a lot of concurrent connections requested the same JSON file; with a table, I would never have a problem with availability.
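A rough sketch of that setup (connection details, table and column names are invented): regenerate the cache table inside a transaction on every update, so readers never see a half-built copy, and let the homepage run the one simple query:

$pdo = new PDO('mysql:host=localhost;dbname=app;charset=utf8mb4', 'user', 'pass');

// on every update: rebuild the snapshot atomically
$pdo->beginTransaction();
$pdo->exec('DELETE FROM homepage_cache');
$pdo->exec('INSERT INTO homepage_cache (title, hits, updated_at)
            SELECT title, hits, NOW() FROM articles ORDER BY hits DESC LIMIT 100');
$pdo->commit();

// on the homepage: one query, no joins
$rows = $pdo->query('SELECT * FROM homepage_cache')->fetchAll(PDO::FETCH_ASSOC);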
I've recently implemented Redis in one of my Laravel projects. It's currently more of a technical exercise than a production feature, as I want to see what it's capable of.
What I've done is created a list of payment transactions. What I'm pushing to the list is the payload which I receive from a webhook every time a transaction is processed. The payload is essentially an object containing all the information to do with that particular transaction.
I've created a VueJS frontend that displays all the data in a table, with pagination so it shows 10 rows at a time.
Initially this was working super fast, but now that the list contains 30,000 rows, which is about 11MB worth of data, the request takes about 11 seconds.
I think the issue here is that I'm using a list and am fetching all the rows from the list using LRANGE.
The reason I used a list was because it has the LPUSH command so that latest transactions go to the start of the list.
I decided to do a test where I got all the data from the list and outputted the value to a blank page and this took about the same time so it's not an issue with Vue, Axios, etc.
Firstly, is this read speed normal? I've always heard that Redis is blazing fast.
Secondly, is there a better way to increase read performance when using Redis?
Thirdly, am I using the wrong data type?
In time I need to be able to store 1m rows of data.
As I understand it, you fetch all 30,000 rows on every request and then paginate them in the frontend. In my opinion, the better strategy is to fetch a lighter chunk of data in each request.
For example, use Laravel pagination in response to your request.
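A sketch with Laravel's Redis facade (the list key 'transactions' and the page size are assumptions): instead of pulling the whole list, only fetch the slice for the requested page with LRANGE. Since LPUSH keeps the newest items at the head, page 1 is always the latest transactions:

use Illuminate\Support\Facades\Redis;

$perPage = 10;
$page    = max(1, (int) request('page', 1));
$start   = ($page - 1) * $perPage;

// fetch only the 10 entries for this page instead of all 30,000
$rows = Redis::lrange('transactions', $start, $start + $perPage - 1);

// each entry is the JSON payload that was LPUSHed by the webhook
$transactions = array_map(static function ($row) {
    return json_decode($row, true);
}, $rows);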
In my opinion:
Firstly: Redis really is blazing fast, because Redis data always lives in memory. If reading 11MB of data takes about 11 seconds, you should check your bandwidth.
Secondly: I'm sorry, I don't know how to increase read performance in this environment.
Thirdly: I think your choice of data type is fine.
So, check the bandwidth to your Redis server first.
I'm currently working on a project which is like an e-commerce site. There are hundreds of thousands of records in the database tables, and I have to use joins to fetch data, because there is a query builder in the project for selecting criteria. It takes too much time to fetch the data, so I'm limiting the output to a number of records (e.g. 10) per page. Now I have come across the concept of memcached, and I thought I could use it for my project, since the expensive work would then only have to be done once. But I still have some doubts.
1) Will too many cache files be a problem? There will be one cache file for each page of each module, so the count will reach roughly 10,000 cache files.
2) Let's assume the number of files is not a problem. But what about updating the cached data using replace() when a row is added to or deleted from the middle of a table? The table is updated roughly every week.
So I'm in a dilemma about whether I should go for memcached or not. Any advice, with an explanation, would be appreciated.
If your website executes many of the same MySQL queries that frequently return the same data, then yes, there is probably some benefit to running memcached.
Problem:
"There are hundreds of thousands of records...It takes too much time to fetch the data".
This probably indicates a problem with your schema. Properly indexed, even when using JOINs, the queries should be able to execute quickly (< 0.1 seconds). Run an EXPLAIN query on the queries that are taking a long time to run and see if they can be improved.
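For instance (table names here are placeholders), prefix the slow query with EXPLAIN and look at the key and rows columns; key = NULL usually means a missing index:

$pdo  = new PDO('mysql:host=localhost;dbname=shop;charset=utf8mb4', 'user', 'pass');
$stmt = $pdo->query(
    'EXPLAIN SELECT p.* FROM products p
     JOIN categories c ON c.id = p.category_id
     WHERE c.slug = "shoes"'
);
foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
    print_r($row);   // inspect "key", "rows" and "Extra" for each table in the join
}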
Answer to Question 1
There won't be an issue with too many cache files. Memcached stores all cached information in memory (hence the name), so no disk files are used. Cached objects are stored in RAM and accessed directly from RAM.
Answer to Question 2
Not exactly sure what you are asking here, but if your application updates or deletes information from the database, then it is critical that the cache items affected by those updates and deletes are deleted as well. If the application doesn't remove the affected cached items, then the next time the data is queried, cached results which are no longer valid may be returned. Make sure any cached data either has an appropriate expiration time set, or that the application removes it from the cache when the data in the database changes.
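A sketch of that pattern with the Memcached extension (the key name and the queryProductsPage() helper are made up): cached reads get a TTL as a safety net, and every write deletes the affected keys straight away:

$cache = new Memcached();
$cache->addServer('127.0.0.1', 11211);

// read path: the TTL limits how long stale data can survive a missed invalidation
$products = $cache->get('products_page_1');
if ($products === false) {
    $products = queryProductsPage(1);                 // hypothetical DB helper
    $cache->set('products_page_1', $products, 300);   // 5 minute expiry
}

// write path: after any UPDATE or DELETE that touches the products table
$cache->delete('products_page_1');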
Hope that helps.
I would start not from memcached but from figuring out what the bottleneck is. Your tables have roughly one million rows. I don't know the size of a row, but my educated guess is that it is less than 1K, based on the fact that a browser window accommodates the information from one record.
So there is probably about 1GB of information in your database; correct me if I'm wrong. If that's true, then the whole database should be automatically cached in RAM by MySQL.
Once your database is entirely in RAM, with a proper organization of indexes the cost of a query should be roughly proportional to the size of the result set, which is measured in kilobytes because it fits in the browser window.
So my advice is to determine the size of the database and look at the output of the "top" command to see how much memory MySQL is consuming. If you confirm that your database sits entirely in memory, run EXPLAIN against your most popular queries and add indexes according to the results. Even if your database is bigger than the available RAM, I still recommend looking at the output of EXPLAIN, because it really helps a lot.
Currently I'm using shared hosting for my site, but we now have about 1,100,000 rows in one of the tables, so the webpage takes a long time to load. We want to implement a database caching technique like APC or memcached for our site, but on the shared host we don't have those facilities available; we only have eAccelerator, and eAccelerator does not cache DB calls, if I'm not wrong. Considering all these points, we want to move to a VPS. In that case, which caching technique should we use to decrease the page load time, APC or memcached? Please advise on the VPS and on the better of the two caching techniques.
We have a similar website and we use APC.
APC will cache the opcode as well as the HTML that is generated. This helps avoid unnecessary hits to the page.
You should also enable query caching in MySQL to cache the results of your queries.
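If the server is running an older MySQL (the query cache was removed entirely in MySQL 8.0), query caching is enabled in my.cnf; treat the size below as a starting point rather than a recommendation:

query_cache_type = 1
query_cache_size = 64M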
I had a task where I needed to fetch rows from a database table that had more than 100,000 records, on a scrollable page. So what I did was fetch the first 50 records and cache the next 50 on the first call. On scroll-down events, an AJAX request checks whether the data is already in the cache; if not, it is fetched from the database and the following 50 are cached as well. It worked pretty well and solved the inconvenient load time.
If you have a similar scenario, you might benefit from this approach.
PS: I used memcache.
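Roughly what that AJAX endpoint could look like with the old Memcache extension (the key scheme and the fetchRows() helper are simplified stand-ins):

$memcache = new Memcache();
$memcache->connect('127.0.0.1', 11211);

$offset = (int) ($_GET['offset'] ?? 0);     // a multiple of 50, sent by the scroll handler
$rows   = $memcache->get('rows_' . $offset);

if ($rows === false) {
    $rows = fetchRows($offset, 50);         // hypothetical DB helper: SELECT ... LIMIT $offset, 50
}

// pre-warm the cache for the next scroll event
$memcache->set('rows_' . ($offset + 50), fetchRows($offset + 50, 50), 0, 600);

echo json_encode($rows);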
From your comment I take it you're doing a LIKE %..% query and want to paginate the result. First of all, investigate whether FULLTEXT indices are an option for you, as they should perform better. If that's not an option, you can add a simple cache like so:
Treat each unique search term as an id, i.e. if in your URL you have ..?search=foobar, then "foobar" is the id of the result set. Keep that in all your links, e.g. ..?search=foobar&page=2.
If the result set does not yet exist (see below), create it:
Query the database with your slow query.
Get all the results into an array. Don't overdo it, you don't want to be storing hundreds of megabytes.
Create a unique filename per query, e.g. sha1($query), or maybe sha1(strtolower($query)).
serialize the data and store it in the file.
Get the data from the file, unserialize it, display the portion of the array corresponding to the requested page.
Occasionally, delete old cached results. You can do that with something like if (rand(0, 100) == 1) .., which will run the cleanup job every 100 queries on average. Strike a balance between server load and data freshness. Cache invalidation is a topic whole books can be written about, BTW.
That's a simple poor man's cache implementation. It's not great, but if you have absolutely nothing else to work with, it's better than running slow queries over and over.
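A compact sketch of that file cache (the temp directory, one-hour lifetime, page size and the runSlowSearchQuery() helper are all arbitrary choices):

$query = strtolower(trim($_GET['search'] ?? ''));
$page  = max(1, (int) ($_GET['page'] ?? 1));
$file  = sys_get_temp_dir() . '/search_' . sha1($query) . '.cache';

if (!is_file($file) || filemtime($file) < time() - 3600) {
    $results = runSlowSearchQuery($query);                    // hypothetical: the slow LIKE %...% query
    file_put_contents($file, serialize($results), LOCK_EX);
} else {
    $results = unserialize(file_get_contents($file));
}

$perPage = 20;
echo json_encode(array_slice($results, ($page - 1) * $perPage, $perPage));

// roughly every 100th request, purge cache files older than an hour
if (rand(0, 100) === 1) {
    foreach (glob(sys_get_temp_dir() . '/search_*.cache') as $old) {
        if (filemtime($old) < time() - 3600) {
            unlink($old);
        }
    }
}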
APC is the Alternative PHP Cache and works only with PHP, whereas Memcached works independently with any language.
I'm hoping to develop a LAMP application that will centre around a small table, probably less than 100 rows, maybe 5 fields per row. This table will need to have the data stored within accessed rapidly, maybe up to once a second per user (though this is the 'ideal', in practice, this could probably drop slightly). There will be a number of updates made to this table, but SELECTs will far outstrip UPDATES.
The available hardware isn't massively powerful (it'll be launched on a VPS with perhaps 512MB of RAM) and it needs to be scalable: there may only be 10 concurrent users at launch, but this could rise to the thousands (and, as we all hope with these things, maybe the tens of thousands, but at that level there will be more powerful hardware available).
As such, I was wondering if anyone could point me in the right direction for a starting point. All the data retrieved will be the same for all users, so I'm trying to find out if there is any way of sharing this data across all users, rather than performing 10,000 identical selects a second. Soooo:
1) Would the mysql_query_cache cache these results and allow access to the data, WITHOUT requiring a re-select for each user?
2) (Apologies for how broad this question is, I'd appreciate even the briefest of responses greatly!) I've been looking into the APC cache, as we already use this as an opcode cache. Is there a method of caching the data in the APC cache, doing just one MySQL select per second to update that cache, and then accessing APC for each user? Or perhaps an alternative cache?
Failing all of this, I may look into having a separate script which handles the queries and outputs the data, and somehow just piping this one script's data to all users. This isn't a fully formed thought and I'm not sure of the implementation, but perhaps a combination of AJAX to pull the outputted data from... "Somewhere"... :)
Once again, apologies for the breadth of these question - a couple of brief pointers from anyone would be very, very greatly appreciated.
Thanks again in advance
If you're doing something like an AJAX chat which polls the server constantly, you may want to look at node.js instead, which keeps an open connection between server and browser. This way, you can have changes pushed to the user when they happen and you won't need to do all that redundant checking once per second. This can scale very well to thousands of users and is written in javascript on the server-side, so not too difficult.
The problem with using the MySQL cache is that the entire table cache gets invalidated on any write to that table. You're better off using a caching solution like memcached or APC if you're trying to control that behavior more precisely. And yes, APC would be able to cache that information.
One other thing to keep in mind is that you need to know when to invalidate the cache as well, so you don't have stale data.
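A sketch using the APC user cache (apcu_* functions on current installs, apc_* on the legacy extension; the key and table names are invented): with a one-second TTL, at most one SELECT per second refreshes the shared copy, however many users are polling:

$rows = apcu_fetch('status_rows', $hit);
if (!$hit) {
    $pdo  = new PDO('mysql:host=localhost;dbname=app;charset=utf8mb4', 'user', 'pass');
    $rows = $pdo->query('SELECT * FROM status_table')->fetchAll(PDO::FETCH_ASSOC);
    apcu_store('status_rows', $rows, 1);   // shared by all requests for one second
}
echo json_encode($rows);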
You can use APC, XCache or memcache for database query caching, or you can use Varnish or Squid for gateway caching...