Atomic/safe serving of single-use codes

Atomic/safe serving of single-use codes - php

I have a list of single-use discount codes for an ecommerce site I'm partnering with. I need to set up a page on my site where my users can fill out a form, and then will be given one of the codes. The codes are pre-determined and have been sent to me in a text file; I can't just generate them on the fly. I need to figure out the best way to get an unused code from the list, and then remove it from the list (or update a flag to mark it as used) at the same time, to avoid any possibility of giving two people the same code. In other words, something similar to a queue, where I can remove one item from the queue atomically.
This webapp will be running on AWS and the current code is Python (though I could potentially use something else if necessary; PHP would be easy). Ideally I'd use one of the AWS services or mysql to do this, but I'm open to other solutions if they're not a royal pain to get integrated. Since I thought "queue," SQS popped into my head, but this is clearly not what it's intended for (e.g. the 14 day limit on messages remaining in the queue will definitely not work for me). While I'm expecting very modest traffic (which means even really hacky solutions would probably work), I'd rather learn about the RIGHT way to do this even at scale.

I cant give actual code examples, but one of the easiest ways to do it would just be an increment counter in the file, so something like
0
code1
code2
code3
etc
and just skipping that many lines every time a code is used.
You could also do this pretty simply in a database

Amazon DynamoDB is a fast, NoSQL database from AWS, and it is potentially a good fit for this use case. Setting up a database table is easy, and you could load your codes into there. DynamoDB has a DeleteItem operation that also allows you to retrieve the data within the same, atomic operation (by setting the ReturnValues parameter to ALL_OLD). This would allow you to get and delete a code in one shot, so no other requests/processes can get the same code. AWS publishes official SDKs to help you connect to and use their services, including both a Python and PHP SDK (see http://aws.amazon.com/tools/).

Related

How to deal with External API latency

I have an application that is fetching several e-commerce websites using Curl, looking for the best price.
This process returns a table comparing the prices of all searched websites.
But now we have a problem, the number of stores are starting to increase, and the loading time actually is unacceptable at the user experience side. (actually 10s pageload)
So, we decided to create a database, and start to inject all Curl filtered result inside this database, in order to reduce the DNS calls, and increase Pageload.
I want to know, despite of all our efforts, is still an advantage implement a Memcache module?
I mean, will it help even more or it is just a waste of time?
The Memcache idea was inspired by this topic, of a guy that had a similar problem: Memcache to deal with high latency web services APIs - good idea?

Memcache could be helpful, but (in my opinion) it's kind of a weird way to approach the issue. If it was me, I'd go about it this way:
Firstly, I would indeed cache everything I could in my database. When the user searches, or whatever interaction triggers this, I'd show them a "searching" page with whatever results the server currently has, and a progress bar that fills up as the asynchronous searches complete.
I'd use AJAX to add additional results as they become available. I'm imagining that the search takes about ten seconds - it might take longer, and that's fine. As long as you've got a progress bar, your users will appreciate and understand that Stuff Is Going On.
Obviously, the more searches go through your system, the more up-to-date data you'll have in your database. I'd use cached results that are under a half-hour old, and I'd also record search terms and make sure I kept the top 100 (or so) searches cached at all times.
Know your customers and have what they want available. This doesn't have much to do with any specific technology, but it is all about your ability to predict what they want (or write software that predicts for you!)
Oh, and there's absolutely no reason why PHP can't handle the job. Tying together a bunch of unrelated interfaces is one of the things PHP is best at.

Your result is found outside the bounds of only PHP. Do not bother hacking together a result in PHP when a cronjob could easily be used to populate your database and your PHP script can simply query your database.
If you plan to only stick with PHP then I suggest you change your script to index your database from the results you have populated it with. To populate the results, have a cronjob ping a PHP script that is not accessible to the users which will perform all of your curl functionality.

API autocomplete call (performance, etc...)

I'm just starting to plan out a web app which allows users to save information about movies. This relies on the TMDb API. Now, i'd like to include an autocomplete feature for when they are searching for a movie. Do you think it's wise to make an API call onKeyUp? Or wait for a certain amount of time after a keyUp? Overall, is this the best way to carry this out?
I will be using PHP, jQuery and saving/retrieving user data with MySQL

Delay after key up unless you (or the server you are hitting) has the speed to be able to handle it. You'll have to account for race conditions anyway, but having that many calls isn't going to be very helpful. Your speeds to query the API are going to be slower than most user's typing speed, which means you'll be making unnecessary calls to your api, using both yours and your users' bandwidth.
Also, I would set a minimum number of character entered before you query (probably ~3 is good). This will also help lower number of queries, and you won't be running a query for 'a' or even 'ap' which could both be a lot of things. Once you get to 3 ('app') you can get a much smaller list of results, which is more helpful for a user.

You could use the TypeWatch jQuery plugin or something similar to only call the API after the user has stopped typing for a certain amount of time. Stack Overflow's Tags and Users pages, for example, use TypeWatch to only process the input after the user has stopped typing for 500ms.

What are the number of ways in which my approach to a news-feed is wrong?

This question has been asked a THOUSAND times... so it's not unfair if you decide to skip reading/answering it, but I still thought people would like to see and comment on my approach...
I'm building a site which requires an activity feed, like FourSquare.
But my site has this feature for the eye-candy's sake, and doesn't need the stuff to be saved forever.
So, I write the event_type and user_id to a MySQL table. Before writing new events to the table, I delete all the older, unnecessary rows (by counting the total number of rows, getting the event_id lesser than which everything is redundant, and deleting those rows). I prune the table, and write a new row every time an event happens. There's another user_text column which is NULL if there is no user-generated text...
In the front-end, I have jQuery that checks with a PHP file via GET every x seconds the user has the site open. The jQuery sends a request with the last update "id" it received. The <div> tags generated by my backend have the "id" attribute set as the MySQL row id. This way, I don't have to save the last_received_id in memory, though I guess there's absolutely no performance impact from storing one variable with a very small int value in memory...
I have a function that generates an "update text" depending on the event_type and user_id I pass it from the jQuery, and whether the user_text column is empty. The update text is passed back to jQuery, which appends the freshly received event <div> to the feed with some effects, while simultaneously getting rid of the "tail end" event <div> with an effect.
If I (more importantly, the client) want to, I can have an "event archive" table in my database (or a different one) that saves up all those redundant rows before deleting. This way, event information will be saved forever, while not impacting the performance of the live site...
I'm using CodeIgniter, so there's no question of repeated code anywhere. All the pertinent functions go into a LiveUpdates class in the library and model respectively.
I'm rather happy with the way I'm doing it because it solves the problem at hand while sticking to the KISS ideology... but still, can anyone please point me to some resources, that show a better way to do it? A Google search on this subject reveals too many articles/SO questions, and I would like to benefit from the experience any other developer that has already trawled through them and found out the best approach...

If you use proper indexes there's no reason you couldn't keep all the events in one table without affecting performance.
If you craft your polling correctly to return nothing when there is nothing new you can minimize the load each client has on the server. If you also look into push notification (the hybrid delayed-connection-closing method) this will further help you scale big successfully.
Finally, it is completely unnecessary to worry about variable storage in the client. This is premature optimization. The performance issues are going to be in the avalanche of connections to the web server from many users, and in the DB, tables without proper indexes.
About indexes: An index is "proper" when the most common query against a table can be performed with a seek and a minimal number of reads (like 1-5). In your case, this could be an incrementing id or a date (if it has enough precision). If you design it right, the operation to find the most recent update_id should be a single read. Then when your client submits its ajax request to see if there is updated content, first do a query to see if the value submitted (id or time) is less than the current value. If so, respond immediately with the new content via a second query. Keeping the "ping" action as lightweight as possible is your goal, even if this incurs a slightly greater cost for when there is new content.
Using a push would be far better, though, so please explore Comet.
If you don't know how many reads are going on with your queries then I encourage you to explore this aspect of the database so you can find it out and assess it properly.
Update: offering the idea of clients getting a "yes there's new content" answer and then actually requesting the content was perhaps not the best. Please see Why the Fat Pings Win for some very interesting related material.

How to "upgrade" the database in real world?

My company have develop a web application using php + mysql. The system can display a product's original price and discount price to the user. If you haven't logined, you get the original price, if you loginned , you get the discount price. It is pretty easy to understand.
But my company want more features in the system, it want to display different prices base on different user. For example, user A is a golden parnter, he can get 50% off. User B is a silver parnter, only have 30 % off. But this logic is not prepare in the original system, so I need to add some attribute in the database, at least a user type in this example. Is there any recommendation on how to merge current database to my new version of database. Also, all the data should preserver, and the server should works 24/7. (within stop the database)
Is it possible to do so? Also , any recommend for future maintaince advice? Thz u.

I would recommend writing a tool to run SQL queries to your databases incrementally. Much like Rails migrations.
In the system I am currently working on, we have such tool written in python, we name our scripts something like 000000_somename.sql, where the 0s is the revision number in our SCM (subversion), and the tool is run as part of development/testing and finally deploying to production.
This has the benefit of being able to go back in time in terms of database changes, much like in code (if you use a source code version control tool) too.

http://dev.mysql.com/doc/refman/5.1/en/alter-table.html
Here are more concrete examples of ALTER TABLE.
http://php.about.com/od/learnmysql/p/alter_table.htm
You can add the necessary columns to your table with ALTER TABLE, then set the user type for each user with UPDATE. Then deploy the new version of your app. that uses the new column.

Did you use an ORM for data access layer ? I know Doctrine comes with a migration API which allow version switch up and down (in case something went wrong with new version).
Outside any framework or ORM consideration, a fast script will minimize slowdown (or downtime if process is too long).
To my opinion, I'd rather prefer a 30sec website access interruption with an information page, than getting shorter interuption time but getting visible bugs or no display at all. If interruption times matters, it's best doing this at night or when lesser traffic.
This can all be done in one script (or at least launched by one commande line), when we'd to do such scripts we include in a shell script :
putting application in standby (temporary static page) : you can use .htaccess redirect or whatever applicable to your app/server environment.
svn udpate (or switch) for source code and assets upgrade
empty caches, cleaning up temp files, etc.
rebuild generated classes (symfony specific)
upgrade DB structure with ALTER / CREATE TABLE querys
if needed, migrate data from old structure to new : depending on what you changed on structure, it may require fetching data before altering DB structure, or use tmp tables.
if all went well, remove temporary page. Upgrade done
if something went wrong display a red message to the operator so it can see what happened, try to fix it and then remove waiting page by hand.
The script should do checks at each steps and stop a first error, and it should be verbose (but concise) about what it does at all steps, thus you can fix the app faster if something has to went wrong.
The best would be a recoverable script (error at step 2 - stop process - manual fix - recover at step 3), I never took the time to implement it this way.
If works pretty well but these kind of script have to be intensively tested, on an environnement as closest as possible to the production one.
In general we develop such scripts locally, and test them on the same platform tha the production env (just different paths and DB)
If the waiting page is not an option, you can go whithout but you need to ensure data and users session integrity. As an example, use LOCK on tables during upgrade/data transfer and use exclusive locks on modified files (SVN does I think)
There could other better solutions, but it's basically what I use and it do the job for us. The major drawback is that kind of script had to be rewritten at each major release, this incitate me to search for other options to do this, but which one ??? I would be glad if someone here had better and simpler alternative.

Determine locations mentioned in shortish (500 to 1000 words) piece of text using PHP

I'd like to find a way to take a piece of user supplied text and determine what addresses on the map are mentioned within the text. I'd be happy to use a free web service if it exists or use a script which will not consume too many resources.
One way I can imagine doing this is taking a gigantic database of addressing and searching for each of them individually in the text, but this does not seem efficient. Is there a better algorithm or technique one can suggest?
My basic idea is to take the location information and turn it into markers on a Google Map. If it is too difficult or CPU intensive to determine the locations automatically, I could require users to add information in a location field if necessary but I would prefer not to do this as some of the users are going to be quite young students.
This needs to be done in PHP as that is the scripting language available on my school hosted server.
Note this whole set-up will happen within the context of a Drupal node, and I plan on using a filter to collect the necessary location information from the individual node, so this parsing would only happen once (when the new text enters the database).

You could get something like opencalais to tag your text. One of the catigories which it returns is "city" you coud then use another third party module to show the location of the city.

If you did have a gigantic list of locations in a relational database, and you're only concerned about 500 to 1000 words, then you could definitely just pass the SQL command to find matches for the 500-1000 words and it would be quite efficient.
But even if you did have to call a slow API, you could feasibly request for 500 words one by one. If you kept a cache of the matches, then the cache would probably quickly fill up with all the stop words (you know, like "the", "if", "and") and then using the cache, it'd be likely that you would be searching much less than 500 words each time.
I think you might be surprised at how fast the brute force approach would work.

For future reference I would just like to mention the Yahoo API called Placemaker and the service GeoMaker that is built on top of it.
Those tools can be used to parse out locations from a text as requested here. Unfortunately no Drupal module seems to exists right now- but a custom solution seems easy to code.

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.