I'll try to be simple, clear and direct. My problem is the following: I have a project where I need to
generate codes for scratch cards. The scratch cards are printed like the ones you use for charging your
mobile phone.
The system works like this: people buy the cards, get the codes printed on them, then call a TOIP server (Asterisk) and enter the code to access a service. Each caller gets three attempts to enter the right code.
I thought of writing a PHP program to generate these codes, so I presumably need a PRNG (Pseudo Random Number Generator). My constraints are:
-Since people enter the code over the phone, it shouldn't be too long, but it must be long enough to ensure security.
-The system must be fast enough when comparing the code entered
with the one stored in the database (needed for statistics purposes).
So my questions are:
-Is it right to use a PRNG?
-If yes, do you know one strong enough to generate good random numbers?
-What standards are used by the industry?
-How can I make the comparison fast enough when it runs against millions of codes?
Thanks for your time and answers.
Yes, a PRNG will work fine after tweaking it a little bit.
http://en.wikipedia.org/wiki/Random_password_generator
You can refer to the password generator code in the link above. You have to make sure the first digit is not 0 and use only digits, not letters.
Once a number is generated, you have to check whether it already exists in the DB before you insert it.
Normally, the industry uses 16 characters/digits. You can also generate 20-digit numbers, which can make the whole process faster: with a larger code space, a freshly generated number collides with an existing one less often, so fewer regenerations are needed.
To make matching faster, you have to index the field in the database. Most probably it will be a char(16) or char(20).
Note: as the length is fixed, there is no need for varchar here; char is the best option.
Keep the MySQL table engine as MyISAM for fast comparison.
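The generate-then-check loop described above can be sketched as follows. This is a minimal illustration in Python rather than PHP, under a few assumptions that are not in the answer: it draws digits from the operating system's cryptographic generator (Python's secrets module) instead of a plain PRNG, it uses an in-memory SQLite table as a stand-in for MySQL, and the table name cards and the helper names are hypothetical. The primary key doubles as the index used for lookups and rejects duplicates, so no separate "does it exist" query is needed.

```python
import secrets
import sqlite3

CODE_LENGTH = 16  # length suggested in the answer; treat it as an assumption


def generate_code(length: int = CODE_LENGTH) -> str:
    """Return a digit-only code whose first digit is never 0."""
    first = str(secrets.randbelow(9) + 1)  # 1..9
    rest = "".join(str(secrets.randbelow(10)) for _ in range(length - 1))
    return first + rest


def issue_code(conn: sqlite3.Connection) -> str:
    """Insert a fresh code, drawing again on the rare duplicate."""
    while True:
        code = generate_code()
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute("INSERT INTO cards (code) VALUES (?)", (code,))
            return code
        except sqlite3.IntegrityError:
            continue  # the primary key rejected a duplicate


# Hypothetical table; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cards (code CHAR(16) PRIMARY KEY)")
codes = [issue_code(conn) for _ in range(1000)]
```

Checking a code entered by a caller is then a single indexed lookup on the code column.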
Intro:
I want a system where a person can purchase a code from my store and use that code or password on my PHP website. Each time the code is used, the number of uses remaining on it is decreased, so once it has been used up the code is no longer usable.
Example:
My shop is selling a 5 use forum un-ban code.
The person goes and buys that code and is given a randomly generated digit code, e.g. 91259102
The person goes to the redemption site and enters that purchased code.
Once the code has been used that one time the code will only have 4 uses remaining on it.
Once the user has used the code 4 more times the code is no longer valid.
What I want to be able to do:
Generate codes via a PHP website. (no need to be automatically generated)
Store the given codes on a .txt document on the server or using a MySQL database.
Have the user be able to use the codes on my website.
Final:
Thanks for any input I can get; it is all much appreciated.
Thanks, Code Squishy.
Do not store it in a .txt file. That's extremely insecure and just asking for someone to steal all of the codes. If you forget to blacklist that file type, anyone could simply request it and read the contents. It would also be much more difficult to deal with a TXT file than with a database such as MySQL. This is the kind of thing databases are written for, so luckily you're covered![1]
Additionally, generating the codes randomly would be much easier, and likely safer, than making them up yourself. You could use any standard number generator, and even just hash the result with the md5[2] hashing algorithm to get standard-length codes.
When a new code is generated, create a new entry in your database. All the table has to contain is two columns: the code and the number of uses left. Each time the code is used, just subtract one from that stored value. Once it reaches zero, remove the entry from the table in your database.
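The two-column table and the decrement-then-delete step could look roughly like the sketch below. It is written in Python with an in-memory SQLite database standing in for MySQL; the table name codes, the column uses_left and the function redeem are hypothetical names chosen for illustration. The decrement is done in a single UPDATE guarded by uses_left > 0, so two simultaneous redemptions cannot both consume the last use.

```python
import sqlite3

# Hypothetical schema: the code itself plus the number of uses left.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE codes (code TEXT PRIMARY KEY, uses_left INTEGER NOT NULL)"
)
conn.execute("INSERT INTO codes (code, uses_left) VALUES ('91259102', 5)")
conn.commit()


def redeem(conn: sqlite3.Connection, code: str) -> bool:
    """Consume one use of the code; return False if it is unknown or used up."""
    with conn:  # one transaction for the decrement and the cleanup
        cur = conn.execute(
            "UPDATE codes SET uses_left = uses_left - 1 "
            "WHERE code = ? AND uses_left > 0",
            (code,),
        )
        if cur.rowcount == 0:
            return False
        # Remove the row once the last use has been consumed.
        conn.execute(
            "DELETE FROM codes WHERE code = ? AND uses_left = 0", (code,)
        )
    return True


results = [redeem(conn, "91259102") for _ in range(6)]
```

With a 5-use code, the first five calls succeed and the sixth is rejected because the row is gone.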
PHP has many constructs for accomplishing all of these. This is where I recommend you start:
[1]MySQL and PHP
[2]PHP Hashing
I hope this helps!
Evening all, I've recently been reading the following blog post about sharding at Pinterest and I think there's some great stuff in there https://engineering.pinterest.com/blog/sharding-pinterest-how-we-scaled-our-mysql-fleet
What I'm unsure on though, is how best to decide where a brand new user should be inserted.
So for those that don't know or haven't bothered to read the above article, Pinterest have a number of shards, each with a number of databases on it. They generate IDs for objects using 64-bit shifting that encodes a shard, the type of object (user, pin, etc.) to determine a table, and the local auto-increment id for the object in question. Now they try to put pins etc. on the same database as the 'board' they are on. But for a brand new object, what would be the best way of determining the 'shard' it lives on?
For users that sign in via Facebook they use a modulus, e.g.
shard = md5("1.2.3.4") % 4096 // 4096 is the number of shards
But if I had a simple email/password registration form, do you think a similar approach applied to the email address would work for choosing an initial shard? I'd assume it would have to be the email in this case, otherwise there would be no way of knowing which database to validate the login credentials against. Also, I know that post is from 2015, so not too old, though computing power moves quickly: would there be a better option than md5 here? I know the chance of a collision is minor, especially as we're only hashing the email address, but would it be worth using a different algorithm? I'm basically interested in the best way to determine a shard and to work out how to get back to it (hence why I think it has to be the email address).
Hope this all makes sense!
(P.S. I didn't tag this with the Pinterest tag as it looks like that's just for API dev, but if someone thinks it might get better 'eyes' on the question then feel free to add it.)
When using MD5 to determine the shard, there is no risk from collisions: if two keys collide, they simply end up in the same shard. The MD5 is not the key within that shard, which is where the collision risk is removed.
The main issue with this sharding method is that the number of shards is fixed, so performance may eventually become an issue (redistributing a running environment is not easy, so with this design you still depend on faster machines if there is more growth than expected).
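As a rough illustration of hashing an email address to a shard, here is a small Python sketch. The function name shard_for is hypothetical, and normalising the address (trimming and lower-casing) before hashing is an assumption added here so that the same user always maps to the same shard at login; it is not something the blog post or the answer specifies.

```python
import hashlib

NUM_SHARDS = 4096  # fixed shard count, as in the question's example


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a key (e.g. an email address) to a shard number in [0, num_shards)."""
    normalised = key.strip().lower()  # assumption: treat case variants as one user
    digest = hashlib.md5(normalised.encode("utf-8")).digest()
    # Interpret the 128-bit digest as an integer and reduce it modulo the shard count.
    return int.from_bytes(digest, "big") % num_shards


signup_shard = shard_for("Alice@Example.com")
login_shard = shard_for("alice@example.com ")
```

Two different addresses that happen to give the same result simply share a shard, which is the point made above: the hash only selects the shard, it is not the row's key.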
I'm making a little web application which needs to randomize stuff.
Just a little example of what's it gonna have: returns a random number between and 10 to the user.
I'm planning to do it using Javascript and jQuery (the calculator itself).
My question is: How can I make its functions truly random and not pseudo-random? Is the PHP function perhaps more random than the Javascript ones?
For the sake of the question, let's say what I want is a true random number between X and Y.
No function that you call will be "truly random". They are all PRNGs; most PRNGs are, in fact, quite good and I'd be surprised if they were inadequate for your application. Also, while a few PRNGs are notorious for their small period (Java's Random comes to mind), every modern language that I know of, including JavaScript, PHP, and (with its crypto packages) Java, has very good PRNGs.
The best way to collect "more random" data is to obtain it from an external random source. One possibility is to ask the user to move the mouse around in a window for a certain period of time, collecting the mouse coordinates at regular intervals. Some military, banking, and other high-security systems use hardware like thermal noise sensors to obtain data that is as close to random as one can hope; however, such hardware support is not going to be available to a web app.
Note that hacks like using the system clock are no better than PRNGs; most PRNGs initialize their seed with a combination of such data.
You have not understood the Matrix movies. ;) One function is not "more random" than one in another language. All functions are equally pseudo random. "Pseudo" means that the computer cannot, by definition, pull a random number out of thin air. It just can't. A computer computes, strictly, based on rules, accurately. There's no randomness anywhere in the system. For all its supposed power, randomness is the one thing a computer simply cannot do (that, and making coffee).
For true randomness, you need an outside, natural source. Like measuring atomic decay, or some such thing which is not predictable and truly random. Anything else is just pseudo-randomness, which may be of varying quality.
Good PRNGs try to collect "outside interference" in an entropy pool; e.g. Linux' /dev/random takes into account system driver "noise" which may be based on "random" packets hitting the ethernet port, or the user's mouse movements. How truly random that is is debatable, but it's very very hard to predict at least and suitably random for most purposes.
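For a number between X and Y drawn from the operating system's entropy-fed generator, a minimal sketch in Python might look like this. It uses the standard secrets module, which reads from the OS random source; the function name random_between is hypothetical. This is still not "true" randomness in the sense discussed above, only output that is very hard to predict.

```python
import secrets


def random_between(low: int, high: int) -> int:
    """Return an integer n with low <= n <= high from the OS random source."""
    if low > high:
        raise ValueError("low must not exceed high")
    # randbelow(k) returns a value in [0, k), so shift it into [low, high].
    return low + secrets.randbelow(high - low + 1)


samples = [random_between(1, 10) for _ in range(1000)]
```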
I don't think there's any way to remove the deterministic aspect of randomness from a program completely. You can do all you want to minimize, mitigate, and obfuscate whatever process you're using to "pull numbers from a hat", but you can never truly make it perfectly random.
You can fashion out a process with sufficient detail to make it practically random, but true random may be impossible.
While you can't achieve true randomness with your own code in PHP, you can use the random.org API. You could connect through curl in PHP or through Ajax in JavaScript.
They use atmospheric noise as their source of randomness, as far as I know.
It is not possible to generate truly random variables on a computer. However, you can improve on standard generators. Suppose you have two basic generators. You create a table and fill it with values from the first generator. Then, when you want a number, the second generator produces an index and you return the corresponding value from the table. That table entry is then replaced with a new value from the first generator... I forgot what this generator is called... Hope it helps.
P.S. Sorry for my English.
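The two-generator table scheme described in this answer matches what Knuth presents as Algorithm M (attributed to MacLaren and Marsaglia). A small Python sketch follows; the class name ShuffledGenerator, the table size of 64 and the use of two random.Random instances as the "basic generators" are assumptions made for illustration. The result is still a deterministic PRNG: the same pair of seeds reproduces the same sequence.

```python
import random


class ShuffledGenerator:
    """Combine two generators: one fills a table, the other picks the slot."""

    def __init__(self, seed_values: int, seed_index: int, table_size: int = 64):
        self._values = random.Random(seed_values)  # first generator: table contents
        self._index = random.Random(seed_index)    # second generator: slot choice
        self._table = [self._values.random() for _ in range(table_size)]

    def next(self) -> float:
        j = self._index.randrange(len(self._table))  # pick a slot
        out = self._table[j]                         # return the value stored there
        self._table[j] = self._values.random()       # refill the slot
        return out


gen_a = ShuffledGenerator(seed_values=1, seed_index=2)
gen_b = ShuffledGenerator(seed_values=1, seed_index=2)
seq_a = [gen_a.next() for _ in range(100)]
seq_b = [gen_b.next() for _ in range(100)]
```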
My suggestion is that you generate a binary random string by encrypting the local time and date with an encryption algorithm. In this case, try to gather all possible sources of "random" data, and load these both as the input message and as the input key.
As you have seen from the previous answers above, your use of the random data and your requirements for it are important. A number is "random" if it is difficult or impossible for your application to guess its value in advance. Note that the same number source may be considered random for some applications while it is not random for others. Evidently you will have serious problems if you need high-quality random numbers for a demanding application.
TRNG98 True Random Numbers
I recently bought myself a domain for personal URL shortening.
And I created a function to generate alphanumeric strings of 4 characters as reference.
BUT
How do I check if they are already used or not? I can't check for every URL if it exists in the database, or is this just the way it works and I have to do it?
If so, what if I have 13.000.000 URLs generated (out of 14.776.336). Do I need to keep generating strings until I have found one that is not in the DB yet?
This just doesn't look like the right way to do it. Can anyone give me some advice?
One memory-efficient and fast approach I can think of is the following. This problem can be solved without using the database at all. The idea is that instead of storing used URLs in the database, you store them in memory. Since storing the URLs themselves would take a lot of memory, we use a bit set (an array of bits) and need only one bit per URL.
For each random string you generate, compute a hash code that lies between 0 and some maximum number K.
Create a bit set (basically a bit array). Whenever you use a URL, set the bit for its hash code to 1.
Whenever you generate a new URL, check whether its hash code bit is set. If it is, discard that URL and generate a new one. Repeat until you get an unused one.
This way you avoid the DB for these checks, your lookups are extremely fast, and it takes very little memory.
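A sketch of the bit-set idea in Python is shown below. One deliberate change from the description: instead of a general hash code, it uses the string's value as a base-62 number as the bit index, so two different 4-character strings can never share a bit (a general hash could collide and wrongly report a free string as used). The names ALPHABET, to_index and new_short_code are hypothetical, and the bit set here lives only in process memory, so a real service would still have to persist or rebuild it.

```python
import secrets
import string

ALPHABET = string.digits + string.ascii_letters  # 62 symbols
LENGTH = 4
TOTAL = len(ALPHABET) ** LENGTH                  # 14,776,336 possible strings

used = bytearray((TOTAL + 7) // 8)               # one bit per string, under 2 MB


def to_index(code: str) -> int:
    """Read the string as a base-62 number: a unique index in [0, TOTAL)."""
    n = 0
    for ch in code:
        n = n * len(ALPHABET) + ALPHABET.index(ch)
    return n


def is_used(i: int) -> bool:
    return bool(used[i >> 3] & (1 << (i & 7)))


def mark_used(i: int) -> None:
    used[i >> 3] |= 1 << (i & 7)


def new_short_code() -> str:
    """Draw random strings until one with an unset bit is found, then claim it."""
    while True:
        code = "".join(secrets.choice(ALPHABET) for _ in range(LENGTH))
        i = to_index(code)
        if not is_used(i):
            mark_used(i)
            return code


batch = [new_short_code() for _ in range(1000)]
```

As the question anticipates, the retry loop slows down as the space fills up, because most random draws then hit bits that are already set.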
I borrowed the idea from this place
A compromise solution is to generate a random id, and if it is already in the database, find the first empty id that is bigger than it. (Wrapping around if you can't find any empty space in the range above.)
If you don't need the ids to be unguessable (and you probably don't if you only use 4 characters), this approach works fine and is quick.
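The "random start, then take the next free id, wrapping around" idea can be sketched like this in Python. A plain set stands in for the database lookup, and the function name allocate_id is hypothetical; in a real system the scan for the next free id would be a query against the indexed id column.

```python
import secrets


def allocate_id(used: set, total: int) -> int:
    """Pick a random id in [0, total); if taken, walk upward (with wrap-around)."""
    if len(used) >= total:
        raise RuntimeError("id space exhausted")
    candidate = secrets.randbelow(total)
    while candidate in used:
        candidate = (candidate + 1) % total  # wrap around past the top
    used.add(candidate)
    return candidate


taken = set()
ids = [allocate_id(taken, 10) for _ in range(10)]
```

Unlike pure retry, this always terminates after at most one pass over the id space, even when almost every id is taken.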
One algorithm is to try a few times to find a free URL of N characters; if none is found, increase N. Start with N=4.
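A minimal Python sketch of this "try a few times, then grow N" approach is given below. The set of used codes stands in for the database check, and the function name allocate_code, the default of 5 tries per length and the alphabet parameter are assumptions for illustration.

```python
import secrets
import string

BASE62 = string.digits + string.ascii_letters


def allocate_code(used: set, start_length: int = 4,
                  tries_per_length: int = 5, alphabet: str = BASE62) -> str:
    """Try a few random codes of length N; if all are taken, move on to N + 1."""
    n = start_length
    while True:
        for _ in range(tries_per_length):
            code = "".join(secrets.choice(alphabet) for _ in range(n))
            if code not in used:
                used.add(code)
                return code
        n += 1  # the current length looks crowded, so lengthen the code


# Tiny alphabet so that length 1 is already full and the code must grow.
crowded = {"a", "b"}
grown = allocate_code(crowded, start_length=1, alphabet="ab")
fresh = allocate_code(set())
```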
I have a database which holds URLs in a table (along with many other details about each URL). I have another table which stores strings that I'm going to use to perform searches on each and every link. My database will be big; I'm expecting at least 5 million entries in the links table.
The application which communicates with the user is written in PHP. I need some suggestions about how I can search over all the links with all the patterns (n x m searches) without causing a high load on the server and without losing speed. I want it to operate at high speed with low resource usage. If you have any hints or suggestions in pseudo-code, they are all welcome.
Right now I don't know whether to use SQL commands to perform these searches and have some help from PHP also or completely do it in PHP.
First, I'd suggest that you rethink the layout. It seems unnecessary to run this query for every user; instead, create a result table into which you insert the results of a query that runs once, and again every time the patterns change.
Otherwise, make sure you have (full-text) indexes set on the fields you need. For the query itself you could join the tables:
SELECT
yourFieldsHere
FROM
theUrlTable AS tu
JOIN
thePatternTable AS tp ON tu.link LIKE CONCAT('%', tp.pattern, '%');
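The result-table suggestion could be sketched as follows in Python, with in-memory SQLite standing in for MySQL. The table names theUrlTable and thePatternTable come from the query above; the id columns, the matches table and the function rebuild_matches are hypothetical. SQLite has no CONCAT function in older versions, so the sketch uses the || operator for the same LIKE condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE theUrlTable     (id INTEGER PRIMARY KEY, link    TEXT NOT NULL);
    CREATE TABLE thePatternTable (id INTEGER PRIMARY KEY, pattern TEXT NOT NULL);
    -- Hypothetical result table: one row per (url, pattern) match.
    CREATE TABLE matches (
        url_id     INTEGER NOT NULL,
        pattern_id INTEGER NOT NULL,
        PRIMARY KEY (url_id, pattern_id)
    );
""")


def rebuild_matches(conn: sqlite3.Connection) -> None:
    """Recompute all matches; run once and again whenever the patterns change."""
    with conn:
        conn.execute("DELETE FROM matches")
        conn.execute("""
            INSERT INTO matches (url_id, pattern_id)
            SELECT tu.id, tp.id
            FROM theUrlTable AS tu
            JOIN thePatternTable AS tp
              ON tu.link LIKE '%' || tp.pattern || '%'
        """)


conn.executemany("INSERT INTO theUrlTable (link) VALUES (?)",
                 [("http://example.com/shoes",), ("http://example.org/hats",)])
conn.executemany("INSERT INTO thePatternTable (pattern) VALUES (?)",
                 [("shoes",), ("example",)])
rebuild_matches(conn)
rows = conn.execute(
    "SELECT url_id, pattern_id FROM matches ORDER BY url_id, pattern_id"
).fetchall()
```

User-facing requests then read from matches with ordinary indexed lookups instead of repeating the n x m LIKE scan. Note that a pattern containing % or _ is treated as a wildcard by LIKE.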
I would say that you pretty definitely want to do that in the SQL code, not the PHP code. Also, searching on the strings of the URLs is going to be a long operation, so perhaps some form of hashing would be good. I have seen someone use a variant of a Zobrist hash for this before (Google will bring back a load of results).
Hope this helps,
Dan.
Do as much searching as you practically can within the database. If you're ending up with an n x m result set, and start with at least 5 million hits, that's a LOT of data to be repeatedly slurping across the wire (or socket, however you're connecting to the DB) just to end up throwing away most (a lot?) of it each time. Even if the DB's native search capabilities ('like' matches, regexp, full-text, etc.) aren't up to the task, culling unwanted rows BEFORE they get sent to the client (your code) will still be useful.
You should optimize your tables in the DB. Use an MD5 hash: add a new column holding the MD5 of the text and index it, so that exact-match lookups use the index and find the text faster.
But it doesn't help if you use LIKE '%text%'.
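The indexed MD5 column might be set up as in the Python sketch below, again with in-memory SQLite standing in for MySQL. The table name links, the column url_md5, the index name and the helper functions are all hypothetical. As noted above, this only speeds up exact matches on the whole URL; it does nothing for substring searches.

```python
import hashlib
import sqlite3


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE links ("
    " id INTEGER PRIMARY KEY,"
    " url TEXT NOT NULL,"
    " url_md5 CHAR(32) NOT NULL)"
)
# Short fixed-width index instead of an index on the full URL text.
conn.execute("CREATE INDEX idx_links_url_md5 ON links (url_md5)")


def add_link(url: str) -> None:
    with conn:
        conn.execute(
            "INSERT INTO links (url, url_md5) VALUES (?, ?)", (url, md5_hex(url))
        )


def find_exact(url: str) -> list:
    """Look up by hash first, then confirm on the URL itself."""
    return conn.execute(
        "SELECT id, url FROM links WHERE url_md5 = ? AND url = ?",
        (md5_hex(url), url),
    ).fetchall()


add_link("http://example.com/a")
add_link("http://example.com/b")
hit = find_exact("http://example.com/b")
miss = find_exact("http://example.com/c")
```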
You can use Sphinx or Lucene.