I built a web system that serves as an online ordering portal for our customers. We store their stock, and they place orders for it through this portal.
We run a duplicate check on the customer reference number so that the same order cannot come through twice. However, we have been experiencing issues where, if a customer sends an order to our API multiple times (within milliseconds of each other, if that), our system doesn't have enough time to mark the order as received, and so it allows duplicates.
I am trying to decide on ways to combat this. I don't want to use database constraints here, as I see this as an application issue rather than a database issue and don't believe they are a good solution.
Any design ideas on how to combat this? One solution I thought of was to use a mutex keyed on the reference number, so that if the mutex for that reference number is already locked, the request could retry a second later. My understanding is that mutexes are almost foolproof because they are enforced by the operating system?
Any ideas would be appreciated
You could try a nonce strategy. The idea is to put a random number in a hidden form field and store it in the session, then verify the id on POST. The user has to deliberately refresh the page to obtain a new id before they can post a second time.
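Here is a minimal sketch of that idea, assuming a plain PHP page posting to itself; the field name, session key, and response code are arbitrary choices, not a fixed API:

```php
<?php
// Minimal sketch of the nonce idea, assuming a plain PHP page posting to
// itself. The field name, session key, and response code are arbitrary.
session_start();

if ($_SERVER['REQUEST_METHOD'] === 'POST') {
    // Verify and immediately consume the nonce, so replaying the same
    // form submission is rejected.
    if (!isset($_POST['nonce'], $_SESSION['form_nonce'])
        || !hash_equals($_SESSION['form_nonce'], $_POST['nonce'])) {
        http_response_code(409);
        exit('Duplicate or invalid submission.');
    }
    unset($_SESSION['form_nonce']); // one-shot: a second POST now fails
    // ... process the order here ...
}

// Issue a fresh nonce for the next form render.
$_SESSION['form_nonce'] = bin2hex(random_bytes(16));
?>
<form method="post">
    <input type="hidden" name="nonce"
           value="<?= htmlspecialchars($_SESSION['form_nonce']) ?>">
    <button type="submit">Place order</button>
</form>
```

As a bonus, PHP's default file-based session handler locks the session for the duration of each request, so two near-simultaneous POSTs from the same session are serialized and the second one sees the already-consumed nonce.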
Update:
Since the orders arrive through an API, you could use a batch system: incoming orders land in a holding area, and a cron job then runs through the batch and performs the necessary pruning operations.
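A rough sketch of what that cron job could look like; the `orders_holding` and `orders` tables, their columns, and the connection details are all assumptions:

```php
<?php
// Hypothetical cron job for the holding-area idea. Incoming orders are
// written to an `orders_holding` table as fast as they arrive; this job
// promotes the first row per reference number into `orders` and prunes
// the duplicates. All table/column names and credentials are assumptions.
$pdo = new PDO('mysql:host=localhost;dbname=shop', 'user', 'pass', [
    PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
]);

$pdo->beginTransaction();

// References already promoted on a previous run also count as "seen".
$seen = array_fill_keys(
    $pdo->query("SELECT reference_no FROM orders")->fetchAll(PDO::FETCH_COLUMN),
    true
);

$rows = $pdo->query(
    "SELECT id, reference_no, payload
     FROM orders_holding
     ORDER BY received_at, id"
)->fetchAll(PDO::FETCH_ASSOC);

$insert = $pdo->prepare("INSERT INTO orders (reference_no, payload) VALUES (?, ?)");
$delete = $pdo->prepare("DELETE FROM orders_holding WHERE id = ?");

foreach ($rows as $row) {
    if (!isset($seen[$row['reference_no']])) {
        $seen[$row['reference_no']] = true; // first arrival wins
        $insert->execute([$row['reference_no'], $row['payload']]);
    }
    // Promoted or duplicate, the row leaves the holding area either way.
    $delete->execute([$row['id']]);
}

$pdo->commit();
```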
Related
On my website, users can currently compare multiple companies for a service.
For each company, I calculate the price of the service.
To calculate this price, I run a very big SQL query that filters companies based on the user's previous input and fetches every parameter I need.
Once the query finishes, I loop through the company list and calculate the price for each of them and for each additional service the company offers. Then I display those values to the user in HTML.
The user can then add or remove an option and place an order with the company.
When a user chooses a price, I send the company_id along with the user's options (the different services he chose) to the server and retrieve the previously calculated price from the user's SESSION.
Prices are stored in the user's session to avoid repeating the calculation, but I have about 6 prices per company and usually around 30 companies per user request. That means I store in the session an array of around 180 different prices for one user.
I find it quite wasteful to store this many variables in the session, and I was wondering: is there a better way to store those values? Should I store them in the database?
By the way, on the server side I'm using PHP, with MySQL for the database.
What you're effectively doing is a very primitive form of caching. Sadly, the session is not the best place to do so, for a variety of reasons:
The session can never be shared between users, yet some of the cached values may be the same for every user. It's good to have a cache that lets you take the "unique" or "shared" route at will.
Is your session-cached data used on every page? If it is, then forget this point. If it isn't, then on every request you're still incurring the cost of fetching the data (which, depending on your server configuration, may involve a few filesystem calls, network calls, or a combination) and deserializing it. If your session payload is large, this can make a significant difference to load times.
Another point to consider is the simple fact that, if you are running a straight-out-of-the-box LAMP stack and have not configured a shared session driver, you're going to find a very nasty surprise when you scale out :-)
Before we go any further, ask yourself these questions:
Do the values in the session change on a user-by-user basis? And if they do, is it by a fixed amount or percentage?
Do the values in the session change often?
How are the values calculated?
If #1 is "No"
You are better off caching one copy and using it for every user (see the sketch after this list).
If #2 is "No"
You are better off denormalizing (i.e. pre-calculating) the values in the database
If #3 is a complicated formula
See #2
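As an illustration of the shared route, here is a minimal sketch assuming the APCu extension is available; the key, TTL, and query are placeholders:

```php
<?php
// A minimal sketch of the "one shared copy" route (point #1), assuming
// the APCu extension. The key, TTL, and query are arbitrary placeholders.
function getCompanyPrices(PDO $pdo): array
{
    $key = 'company_prices_v1';

    $prices = apcu_fetch($key, $found);
    if ($found) {
        return $prices; // every user shares this copy
    }

    // Cache miss: run the big query once, then share the result.
    $prices = $pdo->query("SELECT company_id, base_price FROM companies")
                  ->fetchAll(PDO::FETCH_KEY_PAIR);

    apcu_store($key, $prices, 300); // recompute at most every 5 minutes
    return $prices;
}
```

Note that APCu is per-server; if you run several web servers, a shared store such as Memcached or Redis plays the same role.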
In every case
MySQL is a very poor cache driver and should be avoided if you can. Most people replace it with Redis, but that is not a complete answer either, due to its inability to scale by itself (you need nutcracker/twemproxy to make it shard properly). Either way, if the data to be cached is relatively small, MySQL will do. However, plan for the future.
If you do plan on going the proper cache route, consider a key-value store such as Riak, or a document store such as Aerospike.
Imagine a local Groupon clone. Now imagine a deal that attracted 10x the normal visitors, and because those visitors were trying to buy the deal in parallel, the MySQL database went down and the deal's maximum purchase limit was exceeded.
I'm looking for payment-processing best practices for highly loaded websites that must handle payments for a limited amount of product in parallel.
For now, the simplest option seems to be to lock/unlock the deal while a customer is trying to purchase it on a third-party payment processor's page.
Any thoughts?
I was with you until you started to talk about a third-party payment processor's page. It's hard to control your users' experience while handing them off to a third-party site, because you have no idea what they're doing while they're there, whether they got side-tracked, how long they're going to take to finish the transaction, IF they finished the transaction, etc.
If processing payments locally is not an option, that's not necessarily a problem - it just presents an issue with how you have to actually think about handling your transactions.
So, if it were me, not thinking about the third party right now (we'll set that aside for a minute): obviously, I'd first make sure my MySQL database was resilient enough not to go down, because that creates a huge problem for reconciling transactions. But things happen, so you need a backup.
My suggestion would be to utilize a caching system which kept track of the product, and the current # of products available. Memcache could be good for this, as it's just a single record which will be pretty easy to grab. You wouldn't have to hit the database at all to get info on your product (availability) and if it went down, your users/application would be none the wiser, as you'd be getting info straight from Memcache about your item (no mysql required).
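For the product-cache side, a sketch using the Memcached extension might look like this; the key name, product id, and the helper function are invented for illustration:

```php
<?php
// Sketch of serving product availability straight from Memcached so the
// page never touches MySQL for it. The key name, product id, and the
// fetchAvailabilityFromDb() helper are invented for illustration.
$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);

$available = $mc->get('product:42:available');
if ($available === false && $mc->getResultCode() === Memcached::RES_NOTFOUND) {
    // Cold cache: fall back to the database once and repopulate.
    $available = fetchAvailabilityFromDb(42); // hypothetical helper
    $mc->set('product:42:available', $available);
}

echo $available > 0 ? 'In stock' : 'Sold out';
```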
This presents an issue (when the database goes down) with storing payment records. When you collect money, you obviously need that transaction information in your database, and if your database is down, well, that's a problem. Memcache is not a great solution here, because you're limited in the size of each value and you must know about every key you care about. On top of that, Memcache doesn't have sets or set operations, so you can't append to a value without the risk of nuking some data.
So, lets add a different piece of technology, Redis.
A solution to the transaction problem would be to write transactions to Redis whenever your MySQL server is not available (or write to both if you really want to, but you don't really need to). Then have a background process that knows how to fetch the transaction details from Redis and write them to your MySQL table(s) once it comes back online. Redis is pretty resilient to crashing and is capable of operating at huge volumes. It also has set operations, so you can easily append data to a set without fear of a race condition during your read/change/write operations.
So, you could store all your transactions in a redis key as a single set (store them as json strings if you like, that'd be pretty easy), then when your DB crashes you can just go get that data from Redis and write it to MySQL when it comes back online.
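A sketch of that fallback, assuming the phpredis extension; key names, the transaction shape, and the helper are illustrative only:

```php
<?php
// Sketch of the Redis fallback, assuming the phpredis extension. Key
// names, the transaction shape, and writeTransactionToMysql() are
// illustrative only.
$redis = new Redis();
$redis->connect('127.0.0.1', 6379);

// While MySQL is down, park each completed transaction in a Redis set.
$tx = json_encode([
    'order_id'   => 1001,
    'product_id' => 42,
    'amount'     => 19.99,
    'paid_at'    => time(),
]);
$redis->sAdd('pending_transactions', $tx);

// Later, a background worker drains the set back into MySQL.
while (($item = $redis->sPop('pending_transactions')) !== false) {
    $row = json_decode($item, true);
    writeTransactionToMysql($row); // hypothetical helper; on failure, sAdd() it back
}
```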
To keep things simple, if you're going to use Redis to store transactions, you may as well also use it for your product cache instead of Memcache, and keep the stack simple.
This takes care of not touching the database for your product details, and of keeping track of your (potentially) missed transactions should MySQL crash. But it doesn't handle keeping track of product inventory as new transactions come in while MySQL is down, or ensuring that you don't oversell the product.
To handle this case, when a transaction is saved you can decrement the number of products available (keep it as a flat number, so you're not constantly recalculating it on page load). This tells you instantly whether the product is oversold or not. However, what it does not do is protect the time the product is "in the cart." Once the user puts the product in the cart (which you allowed because you said you had the inventory), you still have to make sure it doesn't sell out before they check out.
The solution to this problem also doubles as your solution to the third-party transaction problem. You're already using a caching mechanism for your products and a fallback mechanism for your transactions. What you should do now is, when a user tries to buy a product (either puts it in the cart or is shot off to the third-party processor), create a "product reservation" for them. It's probably easiest to make a Redis entry for each of these. Give product reservations an expiry time: say 5 or 10, maybe even 15 minutes if you like. Every time you see the user on your site, refresh the timeout to make sure they don't run out of time (you can put more logic into this if you wish, obviously). When a transaction completes and changes from pending to paid, create your transaction record (MySQL or Redis, depending on database availability), decrement your available quantity, and delete your reservation record.
You'd then use your available-quantity information, together with your unexpired reservations, to determine the quantity available for sale. If this number ever drops to zero, you are effectively sold out; but if a certain number of your users don't convert, the inventory they didn't buy is freed up, letting you rinse and repeat that process until you are, in fact, sold out.
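One way to model those reservations is a Redis sorted set whose scores are expiry timestamps, so expired holds can be purged in a single call. A sketch with phpredis; all key names, ids, and timings are assumptions:

```php
<?php
// Sketch of reservations as a Redis sorted set whose scores are expiry
// timestamps, so stale holds are purged in one call (phpredis assumed;
// key names, ids, and timings are made up).
$redis = new Redis();
$redis->connect('127.0.0.1', 6379);

$productId = 42;
$userId    = 'user:1337';
$holdFor   = 600; // 10-minute hold

// Create (or refresh) this user's reservation.
$redis->zAdd("reservations:$productId", time() + $holdFor, $userId);

// Sellable quantity = stock counter minus live (unexpired) reservations.
$redis->zRemRangeByScore("reservations:$productId", 0, time());
$reserved = $redis->zCard("reservations:$productId");
$stock    = (int) $redis->get("product:$productId:available");
$sellable = $stock - $reserved;

// When the payment processor confirms the sale:
$redis->decr("product:$productId:available");     // record the sale
$redis->zRem("reservations:$productId", $userId); // release the hold
```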
This is a pretty long explanation of a fairly robust system, and if you ever hit the situation where your MySQL server crashed AND Redis crashed, you'd be kind of screwed; so it makes sense to have a failover for both of those systems (which is entirely feasible). It should make for a pretty rock-solid checkout/inventory-management process.
Hope it helps.
Use a master-slave MySQL configuration with separate read/write connections.
Use caching as much as possible (Redis is a good idea).
Try to put some logic into Redis itself, so it avoids extra connections to MySQL and runs faster.
For transactions, it may be wise to use some kind of message-queuing system (RabbitMQ) that lets you push some tasks into the background; see the sketch after this list.
Despite all this optimization, you will have big problems if the DB, the cache engine, or the MQ fails. But running master-slave setups for all of these services keeps you reasonably safe, i.e. using multiple machines that can continue working if another machine fails.
And that brings me to the next idea: cloud services with auto-scaling (like AWS).
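A hedged sketch of pushing payment bookkeeping into the background with RabbitMQ via the php-amqplib Composer package; the queue name and message shape are assumptions:

```php
<?php
// Sketch: publish payment bookkeeping to RabbitMQ so a background worker
// can process it. Assumes the php-amqplib Composer package; queue name
// and message shape are made up.
require __DIR__ . '/vendor/autoload.php';

use PhpAmqpLib\Connection\AMQPStreamConnection;
use PhpAmqpLib\Message\AMQPMessage;

$connection = new AMQPStreamConnection('localhost', 5672, 'guest', 'guest');
$channel    = $connection->channel();
$channel->queue_declare('payment_tasks', false, true, false, false); // durable queue

$payload = json_encode(['order_id' => 1001, 'amount' => 19.99]);
$msg = new AMQPMessage($payload, [
    'delivery_mode' => AMQPMessage::DELIVERY_MODE_PERSISTENT, // survive broker restart
]);
$channel->basic_publish($msg, '', 'payment_tasks');

$channel->close();
$connection->close();
```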
Have you considered a Compensating Service Transaction?
This probably seems like a very simple question, and I would probably know the answer if I had more in-depth knowledge of computer processes and the like, but anyway...
If two people request the same page from my server, is the PHP page processed once for the first person and then a second time for the second person, or might these run alongside each other at the same time?
Take this as an example. I have one stock item left in my PHP-driven online shop. A user adds it to their cart. The PHP script 1) checks whether it is in stock; yup, it's in stock, so it 2) reserves it for him.
What if, in between checking that it's in stock and reserving it, the same PHP page was loading for someone else? Just after user A checked that it was in stock, so did user B, before user A got a chance to reserve it, so they both end up reserving it!
Sorry if this seems silly; I can't seem to find an answer on it. Which is it?
Congratulations, you have identified a race condition! :-)
Whether PHP pages run in parallel or one after the other depends on the web server. Typically a web server allocates several threads to handle multiple incoming requests at once. So it may indeed happen that several instances of the same script are run in parallel if two or more users request the same page at the same time. Due to timing and scheduling differences it is unpredictable when each page will execute which action exactly.
Hence for such situations as you describe it is important to program actions in an atomic way, meaning that they either complete in their entirety or not at all. In your case you could use locks, transactions, cleverly formed UPDATE statements, UNIQUE indexes or a number of other techniques that avoid the possibility of two users reserving the same thing.
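For example, a cleverly formed UPDATE can collapse the check and the reservation into one atomic statement. A sketch with PDO; the table, columns, and ids are illustrative:

```php
<?php
// Example of a "cleverly formed UPDATE": the stock check and the
// reservation happen in one atomic statement, and the affected-row count
// tells you whether you won the race. Table/column names are illustrative.
$pdo = new PDO('mysql:host=localhost;dbname=shop', 'user', 'pass');

$userId = 7;  // example ids
$itemId = 42;

$stmt = $pdo->prepare(
    "UPDATE items
     SET stock = stock - 1, reserved_by = :user
     WHERE id = :item AND stock > 0"
);
$stmt->execute([':user' => $userId, ':item' => $itemId]);

if ($stmt->rowCount() === 1) {
    echo 'Reserved: we won the race.';
} else {
    echo 'Sold out: someone else got the last one.';
}
```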
Yes. In general, without getting into too much detail: PHP scripts are executed simultaneously, each request separately.
To make sure the problem you mentioned does not occur, you should probably use a feature of your database management system called "transactions". This way, if you do something on the database layer and at the end find out the reservation cannot happen, all the actions made within the transaction will be rolled back.
In addition to transactions, you should design your application keeping in mind that the problem you mentioned may occur. You should design your database and application in a way that allows you to 1) shorten the time between "checking" and "reserving" as much as possible, 2) stop the action if you cannot make the reservation, and finally, in case of emergency, 3) identify which reservation came first and which should be revoked.
Another idea, falling into the category of application design, is something we could call a "temporary reservation". You temporarily (e.g. for a couple of seconds) lock the reservation when you are about to make it. After that, you check whether you really can make the reservation, and either turn it into a permanent one or revoke it. I believe some systems also take longer temporary reservations as soon as the customer begins the process of reserving his/her places. Then, if the process succeeds, the reservation is changed into a permanent one, but if a specific amount of time passes without success, the reservation is simply revoked, allowing another customer to begin the process.
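A sketch of that temporary-reservation idea, kept entirely in MySQL with an expiry column; the schema, timings, and ids are assumptions:

```php
<?php
// Sketch of the "temporary reservation" idea, kept entirely in MySQL with
// an expiry column. Schema, timings, and ids are assumptions.
$pdo = new PDO('mysql:host=localhost;dbname=booking', 'user', 'pass');

$userId = 7;
$seatId = 42;

// Step 1: take a short hold. This atomic UPDATE succeeds only if the seat
// is free or a previous hold has already expired.
$stmt = $pdo->prepare(
    "UPDATE seats
     SET held_by = :user, held_until = NOW() + INTERVAL 5 MINUTE
     WHERE id = :seat AND (held_by IS NULL OR held_until < NOW())"
);
$stmt->execute([':user' => $userId, ':seat' => $seatId]);
$gotHold = ($stmt->rowCount() === 1);

// Step 2a: the customer completes the process, so make the hold permanent:
//   UPDATE seats SET reserved_by = :user, held_by = NULL
//   WHERE id = :seat AND held_by = :user AND held_until >= NOW()
// Step 2b: otherwise do nothing; the hold lapses when held_until passes,
// and another customer's step 1 can claim the seat.
```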
Yes, definitely: they run in parallel as far as PHP is concerned, but where the database is involved you should learn about the transaction features of your database management system.
Yes and no. PHP may run in simultaneous processes depending on the server setup, but on a small scale you'll only have one database, and each individual query executes atomically, so you'll never get that kind of conflict (as long as you check that an item is in stock in the same statement that reserves it).
Of course, users A and B might both see that it's in stock, and A might request it before B. But your code can recognize that it's now out of stock and display an error to user B.
(You get into trouble with multiple database servers: if the same data is stored across multiple servers, there's lag before it is fully replicated. But you won't have that issue; we're talking top-1,000 sites here.)
Users registered on my website will have the ability to send invitations to their friends. I want to add a daily limit on the number of invitations a user may send.
Initially I just hard-coded a limit (40) in the PHP file, but then I thought it would be better to give administrators the ability to change this limit, so I moved it into the database. But now the database will be hit every time a user wants to send invitations. Will this affect performance?
How would you configure this feature?
TL;DR: just put it in the database. :)
Complete story: it should not be a performance hit. Everything (the user itself, the usernames of the recipients, loads of stuff on your page) already comes from your database; you shouldn't worry about it.
If you have a REALLY big user base and it becomes an issue, I'm sure there are other places to make performance improvements first (like using memcached for all sorts of things). But if you want to "cache" it, you could retrieve the limit once at login and put it in the session. Use that value to subtract from and check against, then ALSO check once against the database (in the background) to make sure nothing freakish is going on for this user. That check can be async, and it doesn't have as big an impact on the user experience.
In the rare case where the session says it's OK but the database says it isn't, just show the user an error. The other way around might require the user to log in again. But it will be rare, or even impossible, if you implement it correctly :)
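A rough sketch of that session-cached allowance, with the database staying the authority; the helpers and names are made up:

```php
<?php
// Rough sketch of caching the allowance in the session, with the database
// as the authority. The helper functions and names are made up.
session_start();

$userId      = 7;                      // example values
$friendEmail = 'friend@example.com';

if (!isset($_SESSION['invites_left'])) {
    // Fetched once at login: admin-configured limit minus today's sends.
    $_SESSION['invites_left'] = fetchDailyInviteAllowance($userId); // hypothetical
}

if ($_SESSION['invites_left'] <= 0) {
    exit('Daily invitation limit reached.');
}

$_SESSION['invites_left']--;            // cheap check on every send
sendInvitation($userId, $friendEmail);  // hypothetical

// A background/async check against the real count in MySQL can correct
// the session value if it ever drifts.
```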
It depends a bit on hardware, but a 400,000-row table (10,000 users * 40 invitations) isn't that huge by MySQL standards. I think you'll be fine.
Just make sure that you've built it sensibly and from how you've described it that there's an index on the column that stores the unique invite code.
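The limit check itself is a single cheap query if it can use an index. A sketch, where the table/column names are assumptions and $pdo/$userId are presumed to exist in context:

```php
<?php
// Sketch of the per-day limit check; with a suitable index it is one
// cheap query. Table/column names are assumptions, and $pdo/$userId are
// presumed to exist in context.
$stmt = $pdo->prepare(
    "SELECT COUNT(*) FROM invitations
     WHERE sender_id = :user AND sent_at >= CURDATE()"
);
$stmt->execute([':user' => $userId]);
$sentToday = (int) $stmt->fetchColumn();

// The admin-editable limit lives in a hypothetical settings table.
$limit = (int) $pdo->query("SELECT daily_limit FROM settings LIMIT 1")->fetchColumn();

if ($sentToday >= $limit) {
    exit('Daily invitation limit reached.');
}
// A composite index on (sender_id, sent_at) keeps the COUNT(*) fast even
// at hundreds of thousands of rows.
```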
I would like to create an interface for manipulating invoices in a transaction-like manner.
The database consists of an invoices table, which holds billing information, and an invoice_lines table, which holds line items for the invoices. The website is a set of scripts which allow the addition, modification, and removal of invoices and their corresponding lines.
The problem I have is this: I would like the ACID properties of the database to be reflected in the web application.
Atomic: When the user hits save, either the entire invoice is modified or the entire invoice is not changed at all.
Consistent: The application code already ensures consistency: lines cannot be added to non-existent invoices, and invoice IDs cannot be duplicated.
Isolated: If a user is in the middle of a set of changes to an invoice, I would like to hide those changes from other users until the user clicks save.
Durable: If the web site dies, the data should be safe. This already works.
If I were writing a desktop application, it would maintain a connection to the MySQL database at all times, allowing me to simply use the BEGIN TRANSACTION and COMMIT at the beginning and end of the edit.
From what I understand you cannot BEGIN TRANSACTION on one PHP page and COMMIT on a different page because the connection is closed between pages.
Is there a way to make this possible without extensions? From what I have found, only SQL Relay does this (but it is an extension).
You don't want long-running transactions, because that will limit concurrency. Consider the Command pattern: http://en.wikipedia.org/wiki/Command_pattern
The web translation of this type of processing is to use session data, or data stored in the page itself. Typically, after each web page is completed, the data is stored in the session (or in the page itself); once all of the pages have been completed (via data entry) and a "Process" (or "Save") button is hit, the data is converted into its database form and saved, relational aspects included, like you mentioned. There are many ways to do this, but I'd say most developers use an architecture similar to this (session data or state within the page) to satisfy what you are talking about.
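A sketch of the session-held draft; the structure is an assumption:

```php
<?php
// Sketch of the session-held draft: edit pages only mutate the session
// copy, so nothing hits MySQL (and nothing is visible to other users)
// until Save. The draft structure is an assumption.
session_start();

// Each edit page appends/modifies lines in the draft:
$_SESSION['invoice_draft']['lines'][] = [
    'description' => 'Widget',
    'qty'         => 3,
    'unit_price'  => 9.99,
];

// The real write happens in one short transaction when the user hits
// "Save" (see the POST-phase sketch further below).
```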
You'll get much advice here on different architectures, but I can say that the Zend Framework (http://framework.zend.com) and Doctrine (http://www.doctrine-project.org/) make this fairly easy, since Zend provides much of the MVC architecture and session management, and Doctrine provides the basic CRUD (create, retrieve, update, delete) you are looking for, plus all of the other aspects (uniqueness, commit, rollback, etc.). Keeping the connection to MySQL open may cause timeouts and a shortage of available connections.
Database transactions aren't really intended for this purpose - if you did use them, you'd probably run into other problems.
But also you can't use them as each page request uses its own connection (potentially) so cannot share a transaction with any others.
Keep the modifications to the invoice somewhere else while the user is editing them, then apply them when she hits save; you can do this final apply step in a transaction (albeit quite a short-lived one).
Long-lived transactions are usually bad.
The solution is not to open the transaction during the GET phase. Do all aspects of the transaction (BEGIN TRANSACTION, processing, and COMMIT) during the POST triggered by the "save" button.
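A minimal sketch of keeping the whole transaction inside the POST handler; the table names and the draft shape carried over from the session are assumptions:

```php
<?php
// Minimal sketch of doing the whole transaction inside the POST handler.
// Table names and the session-held draft shape are assumptions.
session_start();
$pdo = new PDO('mysql:host=localhost;dbname=billing', 'user', 'pass', [
    PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
]);

if ($_SERVER['REQUEST_METHOD'] === 'POST') {
    $draft = $_SESSION['invoice_draft'];

    $pdo->beginTransaction();
    try {
        // Replace the invoice's lines wholesale: all or nothing.
        $pdo->prepare("DELETE FROM invoice_lines WHERE invoice_id = ?")
            ->execute([$draft['id']]);

        $line = $pdo->prepare(
            "INSERT INTO invoice_lines (invoice_id, description, qty, unit_price)
             VALUES (?, ?, ?, ?)"
        );
        foreach ($draft['lines'] as $l) {
            $line->execute([$draft['id'], $l['description'], $l['qty'], $l['unit_price']]);
        }

        $pdo->commit();               // atomic: every line or none
        unset($_SESSION['invoice_draft']);
    } catch (Exception $e) {
        $pdo->rollBack();             // invoice left untouched
        throw $e;
    }
}
```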
Persistent connections may help you:
http://php.net/manual/en/features.persistent-connections.php
"Another is that when using transactions, a transaction block will also carry over to the next script which uses that connection if script execution ends before the transaction block does."
But I recommend that you find another approach to the problem.
For example: create a cache table.
When you need to "commit", transfer the records from the cache table to the "real" tables.
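A sketch of that "commit" step, where the invoice_cache table keyed by session id is hypothetical and $pdo is presumed to exist in context:

```php
<?php
// Sketch of the cache-table "commit": edits accumulate in a hypothetical
// invoice_cache table keyed by session id, then move to the real tables
// in one short transaction. $pdo is presumed to exist in context.
$pdo->beginTransaction();

$pdo->prepare(
    "INSERT INTO invoice_lines (invoice_id, description, qty, unit_price)
     SELECT invoice_id, description, qty, unit_price
     FROM invoice_cache
     WHERE session_id = :sid"
)->execute([':sid' => session_id()]);

$pdo->prepare("DELETE FROM invoice_cache WHERE session_id = :sid")
    ->execute([':sid' => session_id()]);

$pdo->commit();
```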
Although there are already some good answers, I think I found a good response to this question, which I was stuck on as well. I think the best approach is to use a framework like Doctrine (O/R mapping), which has this kind of approach implemented somehow. Here you have a link to what I'm talking about.