I am using the cache to store our carts. As more carts are added, the system becomes extremely slow when fetching a single cart.
I use the code below to access a single cart, and I suspect the problem is that a large amount of information is stored under a single cache key.
$carts = Cache::get("carts");
$cart = $carts[$this->reference];
I understand we could use unique keys such as carts:uniqueId, but I need to be able to access all of the carts and loop through them, so I feel this solution doesn't work.
I'd appreciate any advice on how I can solve this issue.
Edit:
As per ka_lin's comment: I need to loop through the carts and their bookings so that I can check whether each booking date is still available. If a date is no longer available, I need to clear the selected date and also emit an event to alert the user.
I would prefer to keep the data on the server side, as this is built behind an API to make integration easier for partners.
If you need to save the carts on the server for some reason, you could always use the database for that.
If you are storing them only so users don't lose their items when leaving the page, I would recommend using JavaScript and storing the cart with localStorage.
EDIT:
It sounds like you are using the cache for something that should be handled by a database.
What happens behind the scenes is this: a session-based cart in Laravel is stored in the filesystem, and without Memcached/APC installed, Laravel's default cache driver is the filesystem as well. So ultimately you are doing the same thing either way.
Personally, I don't think that simply using the cart in many places on the site slows down the whole website. There must be another factor that is slowing it down.
To resolve the issue I created a table called CacheKeys, where I save the unique cart IDs. I can then loop through those keys and fetch all of the carts/bookings.
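Roughly, the approach looks like the sketch below; it assumes a CacheKeys Eloquent model with a key column and carts stored under carts:{reference} keys (those names are illustrative, not my exact code).

use Illuminate\Support\Facades\Cache;

// Store each cart under its own key and record that key in the CacheKeys table.
Cache::forever("carts:{$cart->reference}", $cart);
CacheKeys::firstOrCreate(['key' => "carts:{$cart->reference}"]);

// Fetch a single cart without loading every cart into memory.
$cart = Cache::get("carts:{$this->reference}");

// Loop over all carts (e.g. to validate booking dates) using the key index.
foreach (CacheKeys::pluck('key') as $key) {
    $cart = Cache::get($key);
    // ...check the booking date, clear it and emit an event if it is gone...
}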
The mindset with which you approached the problem is correct. The implementation, however, brings lots of pain.
Enter: Lada cache package
In such systems, what's truly slow is the mountain of SQL queries hitting the database. Lada cache helps by automatically saving your query results in Redis and invalidating them when the underlying data has been updated.
We've seen queries drop from 2s to 0.5s in production environments, an effective 4x gain.
RTFM: read the manual carefully, especially the part that says all your models must include the Spiritix\LadaCache\Database\LadaCacheTrait trait, and enjoy a hassle-free cache system.
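For illustration, wiring up the trait looks roughly like this; the Booking model name is just an example drawn from the question's domain.

use Illuminate\Database\Eloquent\Model;
use Spiritix\LadaCache\Database\LadaCacheTrait;

class Booking extends Model
{
    // Required so Lada cache can cache and invalidate queries for this model.
    use LadaCacheTrait;
}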
NOTE: Your backend should be the only system handling your queries against your database.
Related
I am working on a project with a custom HTML5 front end and a backend I've designed from experience. The backend is composed of a message queue and a cache; currently I've chosen Beanstalk and Memcache because I'm familiar with them, but I am open to suggestions.
My question, though, comes from how my coder is interfacing with the MySQL DB we are using to store the data. The idea is to pre-cache most or all of the DB so the site runs really fast. It's not a huge DB, so RAM for Memcache shouldn't be an issue. However, my coder is using CodeIgniter with GreenBean. I've never heard of GreenBean before, and when I google it I get almost nothing that isn't related to green beans the food. What little I could find suggested it was an ORM, which fits with what my coder has told me.
The problem is this. With raw PDO my pre-caching scheme is simple: I would grab each row from each table and store it in the cache under a key. Then every time I needed that data I would look in the cache first and fall back to the DB. If something is changed on the backend, I only need to update that row in the DB and the associated key in the cache.
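A minimal sketch of that per-row scheme, assuming Memcached and an employees table with an integer id primary key (the employees:{id} key format is just my illustration):

$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
$cache = new Memcached();
$cache->addServer('127.0.0.1', 11211);

// Pre-cache: one cache entry per row, keyed by table name and primary key.
foreach ($pdo->query('SELECT * FROM employees') as $row) {
    $cache->set('employees:' . $row['id'], $row);
}

// Read path: look in the cache first, fall back to the DB on a miss.
function getEmployee($id, PDO $pdo, Memcached $cache) {
    $row = $cache->get('employees:' . $id);
    if ($row === false) {
        $stmt = $pdo->prepare('SELECT * FROM employees WHERE id = ?');
        $stmt->execute([$id]);
        $row = $stmt->fetch(PDO::FETCH_ASSOC);
        $cache->set('employees:' . $id, $row);
    }
    return $row;
}

// Write path: update the row in MySQL, then refresh (or delete) only that key.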
With an ORM, if I store the entire ORM object serialized into the cache then it holds a bunch of related data. Data that could be incorrect if something were changed. For example, you have a DB of employees that is linked to the office they work in and the dept they work in. The ORM grabs the office and the dept and we store all of that in the cache. But if the office address changes the ORM object for every employee in that office is now stale/incorrect.
In that example, just letting the cache expire probably isn't an issue most of the time. But in my application, that data should really get updated immediately. So in a simple PDO scheme you flush the cache keys related to the data that changed and every future page call gets the updated data. But with an ORM you have lots and lots of cached object instances that might be incorrect and no good way of finding them. So it seems to me you are now left with some form of indexing of your cached objects and when you change something simple you could be flushing and refilling a big chunk of the cache. The site gets really slow then.
Typically I would just cache a DB result after the first time I needed it, but in this case I think that could end up being really slow for the users who make the first requests for a particular set of data. Additionally, there are some search features that could require a lot of data from the DB. Hence my desire to pre-cache.
So in this case I'm thinking an ORM would hurt the site's performance. I'm thinking I'm not the first person to have this issue though. Is there an ORM out there that would handle this scenario well? Is there a better backend architecture I'm missing?
Thanks
Imagine a local Groupon clone. Now imagine a deal that attracted 10x the normal visitors, and because visitors were trying to buy the deal in parallel, the MySQL database went down and the deal's maximum purchase limit was exceeded.
I'm looking for best practices for payment processing on highly loaded websites that have to handle payments for a limited quantity of products in parallel.
For now the simplest option seems to be to lock/unlock the deal while a customer is trying to purchase it on a third-party payment processor's page.
Any thoughts?
I was with you until you started to talk about a 3rd party payment processor's page. It's hard to control your user's experience while dishing them off to a 3rd party site, because you have no idea what they're doing while they're there, if they got side-tracked, how long they're going to take to finish the transaction, IF they finished the transaction, etc.
If processing payments locally is not an option, that's not necessarily a problem - it just presents an issue with how you have to actually think about handling your transactions.
So, if it were me, not thinking about the 3rd party right now - we'll set that aside for a minute. Obviously, I'd #1 make sure my MySQL database was resilient enough to not go down, because that creates a huge problem for reconciling transactions. But, things happen, so you need a backup.
My suggestion would be to utilize a caching system which kept track of the product, and the current # of products available. Memcache could be good for this, as it's just a single record which will be pretty easy to grab. You wouldn't have to hit the database at all to get info on your product (availability) and if it went down, your users/application would be none the wiser, as you'd be getting info straight from Memcache about your item (no mysql required).
This presents an issue (when the database goes down) with storing payment records. When you collect money, you obviously need that transaction information in your database, and if your database is down - well, that's a problem. Memcache is not such a great solution for this, because you're limited to the size of your value and you must know about every key you care about. On top of that, Memcache doesn't have sets or set operations, so you can't append to a value without fear of nuking some data.
So, lets add a different piece of technology, Redis.
A solution for the transaction problem would be to write them to redis in the event that your MySQL server is not available (or write to both if you really want to, but you don't really need to do that). Then have a background process that knows how to go get the transaction details from redis and write them to your MySQL table(s) when it comes back online. Redis is pretty resilient to crashing, and is capable of operating at huge volumes. It also has set operations so you can easily append data to a set without fear of a race condition during your read/change/write operations.
So, you could store all your transactions in a redis key as a single set (store them as json strings if you like, that'd be pretty easy), then when your DB crashes you can just go get that data from Redis and write it to MySQL when it comes back online.
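As a rough sketch of that fallback (using the phpredis extension; the pending_transactions key and table layout are my own illustration):

$redis = new Redis();
$redis->connect('127.0.0.1', 6379);

// Try MySQL first; if it is down, queue the transaction in a Redis set.
function recordTransaction(array $txn, Redis $redis, PDO $pdo = null) {
    try {
        if ($pdo === null) {
            throw new RuntimeException('MySQL unavailable');
        }
        $pdo->prepare('INSERT INTO transactions (payload) VALUES (?)')
            ->execute([json_encode($txn)]);
    } catch (Exception $e) {
        $redis->sAdd('pending_transactions', json_encode($txn));
    }
}

// Background worker: drain the set back into MySQL once it comes back online.
function drainPendingTransactions(PDO $pdo, Redis $redis) {
    while (($json = $redis->sPop('pending_transactions')) !== false) {
        $pdo->prepare('INSERT INTO transactions (payload) VALUES (?)')
            ->execute([$json]);
    }
}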
To keep things simple, if you were going use redis to store transactions, you may as well also use it to store your product cache, instead of memcache - keep the stack simple.
This takes care of not accessing the database for your Product details, and also keeping track of your (potentially) missed transactions, should MySQL crash. But it doesn't handle the problem of keeping track of product inventory while new transactions come in while MySQL is down, and ensuring that you don't over-sell product.
To handle this case, when a transaction is saved, you can decrement the # of products available (keep it as a flat number, so you're not constantly re-calculating it on page-load). This will tell you instantly if the product is oversold or not. However, what this does not do is protect the time that the "product is in the cart." Once the user puts the product in the cart (which you've allowed because you said you have the inventory), you have the problem of making sure it doesn't sell out before they check out.
The solution to this problem also doubles as your solution to the 3rd party transaction problem. So you're using a caching mechanism for your products, and a fall-back mechanism for your transactions. What you should do now is, when a user tries to buy a product (either puts it in the cart, or is shot off to the 3rd party processor), create a "product reservation" for them. It's probably easiest to make a redis entry for each of these. Give product reservations an expiry time, say 5 or 10, maybe even 15 minutes if you like. Every time you see a user on your site, refresh the timeout to make sure they don't run out of time (you can put more logic into this if you desire, obviously). When a transaction is completed and changes from pending to paid, you'd create your transaction record (mysql or redis, depending on database availability), decrement your available quantity, and delete your reservation record.
You'd then use your available quantity information, in addition to your un-expired reservation information, to determine the quantity available for sale. If this number ever drops to zero, then you are effectively sold out; but if a certain number of your users don't convert it frees up the inventory that they didn't buy, allowing you to rinse and repeat that process until you're in fact, sold out.
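As a rough illustration of the reservation idea (key names like stock:{id} and reservation:{id}:{user} are mine, not from any particular library):

$redis = new Redis();
$redis->connect('127.0.0.1', 6379);

// Count un-expired reservations for a product.
// (A Redis set of reservation ids would scale better than KEYS in production.)
function countReservations(Redis $redis, $productId) {
    return count($redis->keys("reservation:$productId:*"));
}

// When a user puts the deal in their cart, hold one unit for 10 minutes.
function reserve(Redis $redis, $productId, $userId, $ttl = 600) {
    $available = (int) $redis->get("stock:$productId")
        - countReservations($redis, $productId);
    if ($available <= 0) {
        return false; // effectively sold out
    }
    $redis->setex("reservation:$productId:$userId", $ttl, 1);
    return true;
}

// When the payment completes: decrement stock and drop the reservation.
function completeSale(Redis $redis, $productId, $userId) {
    $redis->decr("stock:$productId");
    $redis->del("reservation:$productId:$userId");
}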
This is a pretty long explanation of a fairly robust system, and if you ever run into the situation where your MySQL server crashed, AND redis crashed, you'd be kind of screwed; so it makes sense to have a failover of both of those systems here (which is entirely feasible and possible). It should make for a pretty rock solid checkout/inventory management process.
Hope it helps.
Use a master-slave MySQL configuration with separate read/write connections.
Use caching as much as possible (Redis is a good idea).
Try to push some logic into Redis, so it doesn't make extra connections to MySQL; it will also be faster.
For transactions it may be wise to use some kind of message queuing system (RabbitMQ). It will allow you to move some tasks into the background.
Despite all this optimization, you will still have big problems if the DB, the cache engine, or the MQ fails. But using a master-slave setup for all of these services keeps you somewhat on the safe side, i.e. multiple machines that can continue to work if another machine fails.
And that brings me to the next idea: cloud services with auto-scaling (like AWS).
Have you considered a Compensating Service Transaction?
I'm hoping to develop a LAMP application that will centre around a small table, probably less than 100 rows, maybe 5 fields per row. This table will need to have the data stored within accessed rapidly, maybe up to once a second per user (though this is the 'ideal', in practice, this could probably drop slightly). There will be a number of updates made to this table, but SELECTs will far outstrip UPDATES.
Available hardware isn't massively powerful (it'll be launched on a VPS with perhaps 512MB RAM) and it needs to be scalable; there may only be 10 concurrent users at launch, but this could rise to the thousands (and, as we all hope with these things, maybe 10,000s, but at that level more powerful hardware will be available).
As such I was wondering if anyone could point me in the right direction for a starting point. All the data retrieved will be the same for all users, so I'm trying to investigate whether there is any way of sharing this data across all users rather than performing 10,000 identical selects a second. Soooo:
1) Would the mysql_query_cache cache these results and allow access to the data, WITHOUT requiring a re-select for each user?
2) (Apologies for how broad this question is; I'd appreciate even the briefest of responses greatly!) I've been looking into the APC cache, as we already use it for an opcode cache. Is there a method of caching the data in the APC cache, doing just one MySQL select per second to update that cache, and then accessing APC for each user? Or perhaps an alternative cache?
Failing all of this, I may look into having a separate script which handles the queries and outputs the data, and somehow just piping this one script's output to all users. This isn't a fully formed thought and I'm not sure of the implementation, but perhaps a combo of AJAX to pull the outputted data from... "Somewhere"... :)
Once again, apologies for the breadth of these question - a couple of brief pointers from anyone would be very, very greatly appreciated.
Thanks again in advance
If you're doing something like an AJAX chat which polls the server constantly, you may want to look at node.js instead, which keeps an open connection between server and browser. This way, you can have changes pushed to the user when they happen and you won't need to do all that redundant checking once per second. This can scale very well to thousands of users and is written in javascript on the server-side, so not too difficult.
The problem with using the MySQL query cache is that every cached query touching a table is invalidated on any write to that table. You're better off using a caching solution like memcached or APC if you're trying to control that behavior more precisely. And yes, APC would be able to cache that information.
One other thing to keep in mind is that you need to know when to invalidate the cache as well, so you don't have stale data.
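As a rough sketch of the APC approach the question asks about (the one-second TTL matches the "one MySQL select per second" idea; the cache key and function name are just illustrative):

// Read path: serve the table from APC and only hit MySQL once the entry expires.
function getSharedTable(PDO $pdo) {
    $rows = apc_fetch('shared_table');
    if ($rows === false) {
        $rows = $pdo->query('SELECT * FROM small_table')
                    ->fetchAll(PDO::FETCH_ASSOC);
        // A TTL of 1 second means at most one SELECT per second per web server.
        apc_store('shared_table', $rows, 1);
    }
    return $rows;
}

Note that APC's user cache is per server, so each web server refreshes its own copy.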
You can use APC, XCache, or Memcache for database query caching, or you can use Varnish or Squid for gateway caching...
I am thinking about using a NoSQL database (MongoDB) paired with memcached to store sessions within my web app. The idea is that upon each page load, the user data is compared to the data in memcached, and if something has changed, the data would be written to both memcached and MySQL. This way the reads would be greatly reduced and memcached utilized to do what it does best.
However I am a bit concerned about using a non-ACID database for session storage, especially with the memcached layer. Let's say something goes wrong while updating the session to the DB, and our users get an instant headache wondering why the product they put in the cart doesn't show up...
What's an appropriate approach to this? Should we go for MySQL session storage, or is it fine to keep a non-ACID database for sessions?
Thanks!
I'm using MongoDB as session storage currently. It is possible to avoid race conditions mentioned by pilif. I found a class that implements a session handler for MongoDB (http://www.jqueryin.com/projects/mongo-session/) and forked it on github to suit my needs (http://github.com/halfdan/MongoSession).
If you don't want to lose your data, stick with ACID tested databases.
What's the payoff you're looking for?
If you want a secure system, you can't trust anything from the user, save for perhaps selected integers, so letting them store the information is typically a really bad idea.
I don't see the payoff for storing sessions outside of your MySQL database. You can cron cleanup on the tables if that's your concern, but why bother? Some users will shop on a site and then get distracted for a while. They would then come back a day or two later.
If you use cookies or something really temporary to store their session info, there is a really good chance their shopping time was wasted. Users really value their time... so if you stored their session info in the database, you can write something sexy to manage that data.
Plus, the nice side effect of this is that you'll generate a lot of residual information about what people like on your website that wouldn't perhaps be available to you later on. Like you could even consider some of it to be like a poll or something where the items people are adding to their cart could impact how you manage your business, order inventory or focus your marketing.
If you go with something really temporary then you lose out on getting residual benefits.
Without any locking on the session, be really, really careful of what you are storing. Never ever store anything that is dependent on what you have read before as the data might change between you reading and writing - especially in case of ajax where multiple requests can go out at once.
An example of what you must not store in a non-locked session is a shopping cart: to add a product, you have to read, unserialize, add the product, and then serialize again. If any other request does the same thing between the first request's read and write, one request's changes are lost.
Have a look at this article for detail: http://thwartedefforts.org/2006/11/11/race-conditions-with-ajax-and-php-sessions/
Keep Sessions on your filesystem (where PHP locks them for you), in your database (where you have to do manual locking) or never, ever, write anything of value to your session if that value is derived of a previous read.
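For the database case, the "manual locking" roughly amounts to the sketch below (the table and column names are my own illustration): SELECT ... FOR UPDATE holds the session row so a parallel request blocks until the first one has written the cart back.

$pdo->beginTransaction();

// Lock the session row so a concurrent request cannot read a stale cart.
$stmt = $pdo->prepare('SELECT data FROM sessions WHERE id = ? FOR UPDATE');
$stmt->execute([$sessionId]);
$data = unserialize($stmt->fetchColumn());

// Safe read-modify-write: add the product, then write the cart back.
$data['cart'][] = $productId;

$pdo->prepare('UPDATE sessions SET data = ? WHERE id = ?')
    ->execute([serialize($data), $sessionId]);

$pdo->commit();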
While using memcached as a cache in front of the database, it is the user who has to ensure data consistency between the database and the cache. If you want to scale up and add more servers, there is a chance of getting out of sync with the database even if everything seems OK.
Instead you may consider Hazelcast. As of 1.9 it also supports the memcache protocol. Compared to memcached, Hazelcast wants you to implement a map persister, and it then updates the database itself for the updated entries. This way you don't have to handle the "check the cache, and if the data changed, update the database" kind of logic yourself.
If you write your app so that the user stores all session information client side, then you just verify that information as needed, you won't need to worry about sessions on the server side. This is one of the principles in REST style architecture. For instance, if the user is requesting adding an item to their shopping cart, just store the itemID list and count on the client side. When you hit the cart page, you can easily look up the item information from the list of itemIDs they are telling you are in their cart.
During checkout, go directly against the database with transactions to ensure you aren't getting any race conditions, and check your live inventory. If inventory isn't there when they go to check out, just say, "sorry, we just sold out". Of course, at that point you should go update any caches you have out there that are telling people you have inventory.
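A minimal sketch of that checkout step, assuming an inventory table with product_id and quantity columns (my own illustrative schema):

$pdo->beginTransaction();

// Lock the inventory row and check live stock inside the transaction.
$stmt = $pdo->prepare('SELECT quantity FROM inventory WHERE product_id = ? FOR UPDATE');
$stmt->execute([$productId]);
$quantity = (int) $stmt->fetchColumn();

if ($quantity < $requestedCount) {
    $pdo->rollBack();
    echo 'Sorry, we just sold out.';
} else {
    $pdo->prepare('UPDATE inventory SET quantity = quantity - ? WHERE product_id = ?')
        ->execute([$requestedCount, $productId]);
    // ...insert the order rows here...
    $pdo->commit();
}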
I would look at how much a user costs to acquire and then ask what the cost is of implementing a really good system. Keep in mind that users are a biological retry method: "I'm bored... press reload again..." While this isn't the most perfect solution, it is sometimes acceptable versus the cost of "never lose anything - ever".
If you want additional security, you can have your sessions cached to a separate set of memcache servers so there are no accidental flushes. :)
There are a number of other systems membase.org, and some other persistent memcache solutions (java implementations) that will persist storage to disk. If you want to modify your client somewhat, or how you access memcache, you can do your own replication of memcache session objects.
-daniel
I would like to create an interface for manipulating invoices in a transaction-like manner.
The database consists of an invoices table, which holds billing information, and an invoice_lines table, which holds line items for the invoices. The website is a set of scripts which allow the addition, modification, and removal of invoices and their corresponding lines.
The problem I have is this: I would like the ACID properties of the database to be reflected in the web application.
Atomic: When the user hits save, either the entire invoice is modified or the entire invoice is not changed at all.
Consistent: The application code already ensures consistency, lines cannot be added to non-existent invoices. Invoice IDs cannot be duplicated.
Isolated: If a user is in the middle of a set of changes to an invoice, I would like to hide those changes from other users until the user clicks save.
Durable: If the web site dies, the data should be safe. This already works.
If I were writing a desktop application, it would maintain a connection to the MySQL database at all times, allowing me to simply use the BEGIN TRANSACTION and COMMIT at the beginning and end of the edit.
From what I understand you cannot BEGIN TRANSACTION on one PHP page and COMMIT on a different page because the connection is closed between pages.
Is there a way to make this possible without extensions? From what I have found, only SQL Relay does this (but it is an extension).
You don't want to have long-running transactions, because that will limit concurrency. Have a look at the Command pattern: http://en.wikipedia.org/wiki/Command_pattern
The usual translation of this kind of processing to the web is to use session data, or data stored in the page itself. Typically, after each web page is completed the data is stored in the session (or in the page itself), and when all of the pages have been completed (via data entry) and a "Process" (or "Save") button is hit, the data is converted into database form and saved, including the relational aspects of the data you mentioned. There are many ways to do this, but I would say most developers use an architecture similar to what I described (session data or state within the page) to satisfy what you are talking about.
You'll get much advice here on different architectures, but I can say that the Zend Framework (http://framework.zend.com) and the use of Doctrine (http://www.doctrine-project.org/) make this fairly easy, since Zend provides much of the MVC architecture and session management, and Doctrine provides the basic CRUD (create, retrieve, update, delete) you are looking for, plus all of the other aspects (uniqueness, commit, rollback, etc.). Keeping the connection open to MySQL may cause timeouts and a lack of available connections.
Database transactions aren't really intended for this purpose - if you did use them, you'd probably run into other problems.
But also you can't use them as each page request uses its own connection (potentially) so cannot share a transaction with any others.
Keep the modifications to the invoice somewhere else while the user is editing them, then apply them when she hits save; you can do this final apply step in a transaction (albeit quite a short-lived one).
Long-lived transactions are usually bad.
The solution is not to open the transaction during the GET phase. Do all aspects of the transaction (BEGIN TRANSACTION, processing, and COMMIT) during the POST triggered by the "save" button.
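A minimal sketch of that, assuming the pending edits were buffered in $_SESSION on the earlier requests (the session key and the column names are my own illustration):

// Earlier requests only buffer the user's edits:
// $_SESSION['invoice_edits'][$invoiceId]['lines'][] = ['description' => ..., 'amount' => ...];

// The "save" POST applies everything atomically.
$edits = $_SESSION['invoice_edits'][$invoiceId];

$pdo->beginTransaction();
try {
    $pdo->prepare('UPDATE invoices SET billing_name = ? WHERE id = ?')
        ->execute([$edits['billing_name'], $invoiceId]);

    $pdo->prepare('DELETE FROM invoice_lines WHERE invoice_id = ?')
        ->execute([$invoiceId]);

    $lineStmt = $pdo->prepare(
        'INSERT INTO invoice_lines (invoice_id, description, amount) VALUES (?, ?, ?)'
    );
    foreach ($edits['lines'] as $line) {
        $lineStmt->execute([$invoiceId, $line['description'], $line['amount']]);
    }

    $pdo->commit();
    unset($_SESSION['invoice_edits'][$invoiceId]);
} catch (Exception $e) {
    $pdo->rollBack();
    throw $e;
}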
Persistent connections may help you:
http://php.net/manual/en/features.persistent-connections.php
"Another is that when using transactions, a transaction block will also carry over to the next script which uses that connection if script execution ends before the transaction block does."
But I recommend that you find another approach to the problem.
For example: create a cache table.
When you need to "commit", transfer the records from the cache table to the "real" tables.
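A rough sketch of that idea, assuming a pending_invoice_lines staging table that mirrors invoice_lines and is keyed by the PHP session id (the table and column names are my own illustration):

// While the user edits, rows go into the staging table, keyed by their session.
$pdo->prepare(
    'INSERT INTO pending_invoice_lines (session_id, invoice_id, description, amount)
     VALUES (?, ?, ?, ?)'
)->execute([session_id(), $invoiceId, $description, $amount]);

// On "save", move the staged rows into the real table in one short transaction.
$pdo->beginTransaction();
$pdo->prepare(
    'INSERT INTO invoice_lines (invoice_id, description, amount)
     SELECT invoice_id, description, amount
     FROM pending_invoice_lines WHERE session_id = ?'
)->execute([session_id()]);
$pdo->prepare('DELETE FROM pending_invoice_lines WHERE session_id = ?')
    ->execute([session_id()]);
$pdo->commit();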
Although there are some good answers, I think I found a good response to your question, which I was stuck on as well. I think the best approach is using a framework like Doctrine (O/R mapping) that has this kind of approach implemented. Here you have a link to what I'm talking about.