I have two strings and I would like to mix the characters from each string into one bigger string, how can I do this in PHP? I can swap chars over but I want something more complicated since it could be guessed.
And please don't say md5() is enough and irreversible. :)
$string1 = '9cb5jplgvsiedji9mi9o6a8qq1';//session_id()
$string2 = '5d41402abc4b2a76b9719d911017c592';//md5()
Thank you for any help.
EDIT: Ah sorry Rob. It would be great if there is a solution where it was just a function I could pass two strings to, and it returned a string.
The returned string must contain both of the previous strings. Not just a concatenation, but the characters of each string mingled into one bigger one.
If you want to make a tamper-proof string which is human readable, add a secure hash to it. MD5 is indeed falling out of favour, so try sha1. For example
$salt="secret";
$hash=sha1($string1.$string2.$salt);
$separator="_";
$str=$string1.$separator.$string2.$separator.$hash;
If you want a string which cannot be read by humans, encrypt it - check out the mcrypt extension which offers a variety of options.
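A minimal sketch of how the signed string above could be checked when it comes back. The function names sign_pair() and verify_pair() are made up for illustration, and it assumes neither input string contains the separator.

```php
<?php
// Hypothetical helpers. $salt is the server-side secret and "_" the
// separator, both taken from the example above.
function sign_pair(string $string1, string $string2, string $salt, string $separator = '_'): string
{
    $hash = sha1($string1 . $string2 . $salt);
    return $string1 . $separator . $string2 . $separator . $hash;
}

function verify_pair(string $signed, string $salt, string $separator = '_'): ?array
{
    $parts = explode($separator, $signed);
    if (count($parts) !== 3) {
        return null; // malformed, or an input contained the separator
    }
    [$string1, $string2, $hash] = $parts;
    $expected = sha1($string1 . $string2 . $salt);
    // hash_equals() compares the two hashes in constant time.
    return hash_equals($expected, $hash) ? [$string1, $string2] : null;
}
```

verify_pair() returns the two original strings when the hash matches and null when anything was changed.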
Use one of the SHA variants of the hash() function. SHA-256 (a member of the SHA-2 family, named 'sha256' in hash()) should be sufficient and certainly much better than anything you could come up with.
Unless I am missing something, if you want to combine those values into a unique value, why not do sha1($string1 . $string2)?
I'm guessing you want something reversible, so you can get these values back out. A quick-and-dirty technique for obscuring these two strings further would be to base64-encode them:
base64_encode($string1 . $string2);
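To get both values back out, the decoder needs to know where one string ends and the other begins. A rough sketch, assuming a "|" separator that never occurs in either input (pack_pair() and unpack_pair() are hypothetical names). Note that base64 only obscures the data, it does not protect it.

```php
<?php
// Join the two strings with a separator, then base64-encode the result.
function pack_pair(string $string1, string $string2, string $separator = '|'): string
{
    return base64_encode($string1 . $separator . $string2);
}

// Reverse of pack_pair(); returns null if the input is not valid.
function unpack_pair(string $packed, string $separator = '|'): ?array
{
    $decoded = base64_decode($packed, true); // strict mode: false on bad input
    if ($decoded === false) {
        return null;
    }
    $parts = explode($separator, $decoded, 2);
    return count($parts) === 2 ? $parts : null;
}
```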
Thank you everyone. I completely forgot about the SHA1 - got too into solving a problem that I forgot what else was out there. :)
Well, if not md5(), then sha1(). :)
Anyway, the possibilities to mangle are endless, pick your poison.
What I would do, if I really wanted to do something like that (which can be useful occasionally), is add another element, chosen at random, shuffle the md5 string by it, and write the random element into the result too.
For example, let us add a random 2-digit number to each md5 character, split that number into its digits, then append the 1st digit to the resulting string and prepend the 2nd digit to it.
I stumbled upon someplace where something of that kind was done today. I was trying to find some reference to a particular phone number - whether it appears anywhere on the country-local inet or not.
I visited a popular classified ads site, which gives phone numbers of advertisers and you have the option, when you are looking at a particular ad, to find all ads with the same phone number. Now, what they did, however, was that they encoded search string, so you are not searching for ?phone=123123, but something like ?phone==FFYx23=.
If they hadn't done that, I would be able to find out for my own purposes, rather than checking on ads, IF user with phone 123123 has posted any ads on the site.
If you are looking to verify message integrity and authenticity with hashing - you might want to look at HMAC - there are plenty of implementations in PHP using both SHA1 and MD5:
http://en.wikipedia.org/wiki/HMAC
EDIT: In fact, PHP now has a function for this:
http://us3.php.net/manual/en/function.hash-hmac.php
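A short sketch of hash_hmac() in use. The helper names and the choice of sha256 are assumptions for illustration; $key stands for a secret that stays on the server.

```php
<?php
// Compute an HMAC tag for a message with a server-side secret key.
function hmac_tag(string $message, string $key): string
{
    return hash_hmac('sha256', $message, $key);
}

// Check a tag received alongside a message; hash_equals() avoids
// leaking information through comparison timing.
function hmac_verify(string $message, string $tag, string $key): bool
{
    return hash_equals(hmac_tag($message, $key), $tag);
}
```

The tag travels with the message, and the receiver recomputes it with the same key to detect tampering.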
I'm trying to create an unique invoice id in PHP and currently doing this as following:
md5(time().$userId);
I may have concurrent users, so I'm adding the user id as well to make sure it is unique, but the md5 hash is 32 characters long. Is there any way to shorten the output (e.g. 8-10 characters, if possible) while ensuring uniqueness?
NB: The output length has to be the same every time, so just concatenating the user id with the time is not what I'm looking for, since the user id can vary in length, e.g. 5, 20 or 100.
There are a few ways to ensure uniqueness.
time() works fine for your purpose, since you're concatenating it with $userId.
You can use substr to take only parts of a string.
With that said, there are other ways to get unique strings in php
The one I find myself using more often than not is openssl_random_pseudo_bytes(). Notice the pseudo, it's not entirely random.
You can use it like this: bin2hex(openssl_random_pseudo_bytes(2)), where 2 is the number of bytes, so the result is 4 hex characters.
You can also read from /dev/urandom if you're on Linux, via exec or, even better, with fread.
While it's better than the pseudo approach it's limited to the OS.
uniqid() also works fine.
If you really want something truly random, I suggest https://www.random.org/.
The randomness comes from atmospheric noise
They have an API you can use.
random_bytes() (as suggested by @deceze) also works fine. Do note that it's only available in PHP 7 and later.
Pick your poison.
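As a rough sketch of the random_bytes() route for a fixed-length id (random_invoice_id() is a made-up name, and PHP 7+ is assumed). A random id of this length is only very likely to be unique, not guaranteed, so it would still need a unique index in the database and a retry on conflict.

```php
<?php
// Build a fixed-length hexadecimal id from cryptographically secure bytes.
function random_invoice_id(int $length = 10): string
{
    // Each byte becomes two hex characters, so round up and trim.
    $bytes = random_bytes(intdiv($length + 1, 2));
    return substr(bin2hex($bytes), 0, $length);
}
```

The output length does not depend on the user id, which addresses the fixed-length requirement in the question.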
If I wanted to do this, I would use $userId.time()
I think this is unique because a user can't submit more than one order at the same moment.
What is the best way to validate a string as not gibberish using PHP?
For example, if I get a string input from a user that must be at least 250 characters long, how can I tell whether they entered legitimate text (e.g. real words) or just gibberish to comply with the minimum characters (e.g. asdlfkjefksjlfkjldskfjelkef)?
I've thought about counting the number of words as one option, but the user could still space out their gibberish (e.g. asdlf kjef ksjlf kjl dskfje lkef), so it needs another kind of check on top of that.
Is there any way to check that at least half of a string contains real dictionary words, or something to that effect?
What is the best solution to this problem?
Thanks.
You cannot do that properly, because "Colorless green ideas sleep furiously" is made of real words and still means nothing.
You could try a Bloom filter
You can walk through your dictionary and delete all dictionary words from user input and then check the length of the rest
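One possible reading of the dictionary idea, as a sketch: split the input into words, drop every word found in the dictionary, and measure how much is left. The function name is hypothetical and $dictionary stands in for a real word list loaded from somewhere.

```php
<?php
// Fraction of letters that belong to words NOT found in the dictionary.
// 0.0 means every word was recognised, 1.0 means none was.
function leftover_fraction(string $text, array $dictionary): float
{
    $lookup = array_flip(array_map('strtolower', $dictionary));
    $words = preg_split('/[^a-z]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    $total = 0;
    $leftover = 0;
    foreach ($words as $word) {
        $total += strlen($word);
        if (!isset($lookup[$word])) {
            $leftover += strlen($word);
        }
    }
    // Input with no letters at all is treated as entirely unrecognised.
    return $total === 0 ? 1.0 : $leftover / $total;
}
```

Where to draw the line (for example rejecting input whose leftover fraction is above 0.5) is a judgment call and would need tuning on real submissions.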
You could look at Markov chains. Simply put, the idea is that the algorithm determines whether sequences of characters look like they belong together. It won't necessarily tell you it's not gibberish, but it should catch out things like "ksjhglah etc".
See Markov text generators
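A very small sketch of the Markov idea at character level, assuming you have some sample text in the expected language to train on. Both function names are hypothetical, and the score only means something relative to a threshold you pick by testing.

```php
<?php
// Count how often each character follows another in a training text.
function train_bigrams(string $corpus): array
{
    $text = preg_replace('/[^a-z ]+/', '', strtolower($corpus));
    $counts = [];
    for ($i = 0; $i < strlen($text) - 1; $i++) {
        $a = $text[$i];
        $b = $text[$i + 1];
        $counts[$a][$b] = ($counts[$a][$b] ?? 0) + 1;
    }
    return $counts;
}

// Average log-probability per character transition; higher means the text
// looks more like the training corpus. Add-one smoothing over 27 symbols
// (a-z plus space) keeps unseen pairs from producing log(0).
function bigram_score(string $input, array $counts): float
{
    $text = preg_replace('/[^a-z ]+/', '', strtolower($input));
    $pairs = strlen($text) - 1;
    if ($pairs < 1) {
        return -INF; // too short to judge
    }
    $total = 0.0;
    for ($i = 0; $i < $pairs; $i++) {
        $a = $text[$i];
        $b = $text[$i + 1];
        $seen = $counts[$a][$b] ?? 0;
        $rowSum = array_sum($counts[$a] ?? []);
        $total += log(($seen + 1) / ($rowSum + 27));
    }
    return $total / $pairs;
}
```

Text made of character pairs that are common in the training data scores higher than keyboard mashing, but short or unusual real text can still score low.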
We use UUIDs for our primary keys in our db (generated by php, stored in mysql). The problem is that when someone wants to edit something or view their profile, they have this huge, scary, ugly uuid string at the end of the url. (edit?id=.....)
Would it be safe (read: still unique) if we only used the first 8 characters, everything before the first hyphen?
If it is NOT safe, is there some way to translate it into something else shorter for use in the url that could be translated back into the hex to use as a lookup? I know that I can base64 encode it to bring it down to 22 characters, but is there something even shorter?
EDIT
I have read this question and it said to use base64. again, anything shorter?
Shortening the UUID increases the probability of a collision. You can do it, but it's a bad idea. Using only 8 characters means just 4 bytes of data, so you'd expect a collision once you have about 2^16 IDs - far from ideal.
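The 2^16 figure comes from the birthday bound. As a quick check, here is a sketch using the standard approximation p ≈ 1 - exp(-n(n-1) / 2N) for n ids drawn uniformly from N possible values (the function name is made up, and real UUID prefixes may not be uniformly distributed):

```php
<?php
// Approximate probability that at least two of $ids random values collide
// when each value has $bits bits.
function collision_probability(int $ids, int $bits): float
{
    $space = pow(2.0, $bits); // number of possible values, as a float
    return 1.0 - exp(-($ids * ($ids - 1)) / (2.0 * $space));
}
```

With 32 bits and 65536 ids this comes out at roughly 39%, which is why truncating to 8 hex characters stops being safe quite early.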
Your best option is to take the raw bytes of the UUID (not the hex representation) and encode it using base64. Or, just don't worry much, because I seriously doubt your users care what's in the URL.
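A sketch of the raw-bytes-to-base64 conversion, using the URL-safe alphabet and dropping the padding so the result is 22 characters. The function names are hypothetical and the input is assumed to be a well-formed UUID string.

```php
<?php
// 36-character UUID string -> 22-character URL-safe token.
function uuid_to_short(string $uuid): string
{
    $bytes = hex2bin(str_replace('-', '', $uuid)); // 16 raw bytes
    return rtrim(strtr(base64_encode($bytes), '+/', '-_'), '=');
}

// 22-character token -> canonical lowercase UUID string.
function short_to_uuid(string $short): string
{
    $hex = bin2hex(base64_decode(strtr($short, '-_', '+/')));
    return sprintf(
        '%s-%s-%s-%s-%s',
        substr($hex, 0, 8),
        substr($hex, 8, 4),
        substr($hex, 12, 4),
        substr($hex, 16, 4),
        substr($hex, 20, 12)
    );
}
```

Nothing is lost in the conversion, so the token can be turned back into the full UUID for the database lookup.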
Don't cut a single bit out of that UUID: you have no control over the algorithm that produced it, there are multiple possible implementations, and the implementation is subject to change (for example, it may change with the version of PHP you're using).
If you ask me, a UUID in the address bar doesn't look scary or difficult at all. Even a simple Google search for "UUID" produces worse-looking URLs, and everybody's used to looking at Google URLs!
If you want nicer-looking URLs, take a look at the address bar of this stackoverflow.com article. They're using the article ID followed by the title of the question. Only the ID part is relevant, everything else is there to make it easy on the eyes of readers (go ahead and try it, you can delete anything after the ID, you can replace it with junk - doesn't matter).
It is not safe to truncate UUIDs. Also, they are designed to be globally unique, so you aren't going to have luck shortening them. Your best bet is to either assign each user a unique number, or let users pick a custom (unique) string (like a username or nickname) that can be decoded. So you could have edit?id=.... or edit?name=blah and you then decode name into the uuid in your script.
It depends on how you're generating the UUID - if you're using PHP's uniqid then it's the right-most digits that are more "unique". However, if you're going to truncate the data, then there's no real guarantee that it'll be unique anyway.
Irrespective, I'd say that this is a somewhat sub-optimal approach - is there no way you can use a unique (and ideally meaningful) textual reference string instead of an ID in the query string? (Hard to know without more knowledge of the problem domain, but it's always a better approach in my opinion, even if SEO, etc. isn't a factor.)
If you were using this approach, you could also let MySQL generate the unique IDs, which is probably a considerably more sane approach than attempting to handle this in PHP.
If you're worried about scaring users with the UUID in the URL, why not write it out to a hidden form field instead?
I am building a string to detect whether filename makes sense or if they are completely random with PHP. I'm using regular expressions.
A valid filename = sample-image-25.jpg
A random filename = 46347sdga467234626.jpg
I want to check if the filename makes sense or not, if not, I want to alert the user to fix the filename before continuing.
Any help?
I'm not really sure that's possible because I'm not sure it's possible to define "random" in a way the computer will understand sufficiently well.
"umiarkowany" looks random, but it's a perfectly valid word I pulled off the Polish Wikipedia page for South Korea.
My advice is to think more deeply about why this design detail is important, and look for a more feasible solution to the underlying problem.
You need way too much work for that. You would have to build a huge array of the most-used words (like a dictionary) and check whether most of the words in the filename (maybe separated by - or _) are in it, and it will still have huge bugs.
Basically you will need:
explode()
implode()
array_search() or in_array()
Take the string and look for a piece glue like "_" or "-" with preg_match(); if there are some, explode the string into an array and compare that array with the dictionary array.
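A rough sketch of that approach. It uses preg_split() to do the splitting on "-" and "_" in one step, and filename_looks_meaningful(), $dictionary and the 0.5 threshold are all assumptions for illustration. Arrow functions need PHP 7.4 or later.

```php
<?php
// True when enough of the words in the filename appear in the dictionary.
function filename_looks_meaningful(string $filename, array $dictionary, float $threshold = 0.5): bool
{
    // Strip the directory and extension, e.g. "sample-image-25".
    $name = pathinfo($filename, PATHINFO_FILENAME);
    $parts = preg_split('/[-_\s]+/', strtolower($name), -1, PREG_SPLIT_NO_EMPTY);
    // Ignore purely numeric parts such as "25".
    $words = array_filter($parts, fn($part) => !ctype_digit($part));
    if (count($words) === 0) {
        return false;
    }
    $known = count(array_filter($words, fn($word) => in_array($word, $dictionary, true)));
    return $known / count($words) >= $threshold;
}
```

With a dictionary containing "sample" and "image", sample-image-25.jpg passes and 46347sdga467234626.jpg does not, but the result is only as good as the word list.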
Or, since almost every word alternates vowels and consonants, you could write a huge script that checks whether most of the words in the filename look "not random". But the problem will be the same: why do you need that? Look for a more flexible solution.
Notice:
Consider that even a simple-and-friendly-file.png could be the result of a string generator.
Good luck with that.
I'm not sure what this is called, which is why I'm having trouble searching for it.
What I'm looking to do is to take numbers and convert them to some alphanumeric base so that the number, say 5000, wouldn't read as '5000' but as 'G4u', or something like that. The idea is to save space and also not make it obvious how many records there are in a given system. I'm using php, so if there is something like this built into php even better, but even a name for this method would be helpful at this point.
Again, sorry for not being able to be more clear, I'm just not sure what this is called.
You want to change the base of the number to something other than base 10 (I think you want base 36 as it uses the entire alphabet and numbers 0 - 9).
The inbuilt base_convert function may help, although it does have the limitation it can only convert between bases 2 and 36
$number = '5000';
echo base_convert($number, 10, 36); //3uw
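If base 36 is not compact enough, a hand-rolled base 62 (digits, lowercase, uppercase) gets past the base_convert() limit. This is only a sketch: the function names and the alphabet order are arbitrary choices, and it handles non-negative integers only.

```php
<?php
const BASE62_ALPHABET = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';

// Non-negative integer -> base 62 string.
function base62_encode(int $number): string
{
    if ($number < 0) {
        throw new InvalidArgumentException('Only non-negative integers are supported.');
    }
    $alphabet = BASE62_ALPHABET;
    $base = strlen($alphabet);
    $out = '';
    do {
        $out = $alphabet[$number % $base] . $out;
        $number = intdiv($number, $base);
    } while ($number > 0);
    return $out;
}

// Base 62 string -> integer.
function base62_decode(string $encoded): int
{
    if ($encoded === '') {
        throw new InvalidArgumentException('Empty input.');
    }
    $alphabet = BASE62_ALPHABET;
    $base = strlen($alphabet);
    $number = 0;
    foreach (str_split($encoded) as $char) {
        $pos = strpos($alphabet, $char);
        if ($pos === false) {
            throw new InvalidArgumentException("Invalid character: $char");
        }
        $number = $number * $base + $pos;
    }
    return $number;
}

echo base62_encode(5000); // 1iE
```

Like base_convert(), this only changes the notation. Anyone who recognises the scheme can still decode the number.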
Funnily enough, I asked the exact opposite question yesterday.
The first thing that comes to mind is converting your decimal number into hexadecimal. 5000 would turn into 1388, 10000 into 2710. Will save a few bytes here and there.
You could also use a higher base that utilizes the full alphabet (0-Z instead of 0-F) or even all 256 possible byte values. As @Yacoby points out, you can use base_convert() for bases up to 36.
As I said in the comment, keep in mind that this is not an efficient way to mask IDs. If you have a security problem when people can guess the next or previous ID to a record, this is very poor protection.
dechex will convert a number to hex for you. It won't obfuscate how many records are in a given system, however. I don't think it will make it any more efficient to store or save space, either.
You'd probably want to use a 2 way crypt function if obfuscation is needed. That won't save space, either.
Please state your goals more clearly and give more background, because this seems a bit pointless as it is.
This might confuse more people than simply converting the base of the numbers ...
Try using signed digits to represent your numbers. For example, instead of using digits 0..9 for decimal numbers, use digits -5..5. This Wikipedia article gives an example for the binary representation of numbers, but the approach can be used for any numeric base.
Using this together with, say, base-36 arithmetic might satisfy you.
EDIT: This answer is not really a solution to the question, so ignore it unless you are trying to hash a number.
My first thought would be to hash it using e.g. md5 or sha1. (You'd probably not save any space though...)
To prevent people from using rainbow-tables or brute force to guess which number you hashed, you can always add a salt. It can be as simple as a string prepended to your number before hashing it.
md5 would return an alphanumeric string of exactly 32 chars and sha1 would return one of exactly 40 chars.
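A small sketch of the salted hash, with hash_id() and $salt as made-up names. Since a hash cannot be reversed, you would have to store the hash next to the record to look it up again.

```php
<?php
// Hash a numeric id with a secret salt prepended to it.
function hash_id(int $id, string $salt, string $algo = 'sha1'): string
{
    return hash($algo, $salt . $id);
}
```

The output length depends only on the algorithm, not on the size of the number.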