MongoDB _id Field Auto Generation? - php

I am trying to set up MongoDB to test its speed and am running into an issue with _id duplication. I am not setting the id; I am letting MongoDB do it, as I don't care what it is. I have the following PHP code:
<?php
$mongo = new Mongo();
$db = $mongo->selectDB("scrap_fighters");
$collection = $db->selectCollection('scores');
$data = array(
    'user_id' => 1,
    'name' => 'John Doe',
    'score' => 120
);
$start = microtime(true);
for ($x = 0; $x < 1000; $x++) {
    $data['unique'] = microtime();
    $result = $collection->insert($data, array('safe' => true));
}
?>
What does MongoDB use to generate its "unique" ids? I even tried replacing unique with:
$data['unique'] = rand(1, 1000000);
To be 100% sure it was working, but it still failed after the first write. Can I not insert records with the same data without specifically generating a unique id myself?

MongoDB uses _id as the primary key, so it has to be unique. Since you don't specify it, it is generated automatically. The generated ObjectId consists of a timestamp (in seconds), a value identifying the host and process, and an incrementing counter, so even with multiple hosts inserting simultaneously the probability of a collision is extremely low.
What is this unique field you are using? If you wanted it to be a primary key, just don't set it.
About failing on duplicates: the only reason I can think of for this happening is that you previously set up an index on this collection that requires some field (or combination of fields) to be unique. If it is not needed, remove it. If it is valid (e.g. the user_id has to be unique), then insert unique records.
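To make the structure of a generated _id concrete, here is a minimal sketch that reads the creation time out of an ObjectId's 24-character hex form. It assumes the standard layout in which the first 4 bytes are a big-endian Unix timestamp in seconds. The function name objectIdTimestamp and the sample id are illustrative only and are not part of any driver API.

```php
<?php
// Hypothetical helper (not a driver function): extract the creation time
// from the hex string form of an ObjectId.
function objectIdTimestamp(string $hexId): int
{
    if (!preg_match('/^[0-9a-f]{24}$/i', $hexId)) {
        throw new InvalidArgumentException('Expected 24 hex characters');
    }
    // The first 8 hex characters (4 bytes) hold seconds since the Unix epoch.
    return (int) hexdec(substr($hexId, 0, 8));
}

// Sample id chosen for illustration only.
$seconds = objectIdTimestamp('507f1f77bcf86cd799439011');
echo gmdate('Y-m-d H:i:s', $seconds), " UTC\n";
```

Because the remaining bytes vary per process and per insert, two ids generated within the same second still differ.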

Related

Should I use encryption or hashes for generating custom reference ids?

I need to generate unique custom reference ids for each user based on their user id. Right now I'm using the md5 method for this and have limited the length to 12 digits/characters.
$user_id = '120';
$ref_id = substr(md5($user_id), 0, 12);
I know there are many ways to generate a string from another string, but what would be the best way to generate a simple but unique user ID with a relatively short length (max. 16 chr/digits)?
The ID is used to preserve and mask the true user ID or the name of the user in publications.
Why not just use their ID number?
Because user ids do not follow the same scheme and are sequential. I want all of my users to have, let's say, a 16 chr/digit but unique ref ID. It's not really about security; uniqueness is the priority here. I just selected MD5 for the staging server. I am open to any other advice.
From the conversation in comments, a good solution would be to generate a unique random key and associate it with the value (such as the database row).
This random key is neither an encryption nor a hash. It is a unique reference value.
So for each user you have their database membership row, and one column would be "reference_key"; this can be populated with a unique value associated only with this account.
For example:
The code below generates a unique key and saves it to the array value $saver['reference_key']; you can then insert this into your database when you save your other customer data.
The MySQL column should be UNIQUE indexed, with the utf8mb4 character set and collation.
The code checks whether the value already exists because it is easier to re-roll the value up front than to handle the MySQL error that is triggered when an identical value is inserted into the UNIQUE column.
function nonce_generator($length = 40)
{
    // Source can be any valid characters you want to use.
    $source = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789!-=+';
    $max = strlen($source) - 1;
    $i = 0;
    $output = "";
    do {
        $output .= $source[random_int(0, $max)];
        $i++;
    } while ($i < $length);
    return $output;
}
...
do {
    $found = false;
    $saver['reference_key'] = nonce_generator(12); // see function above to generate a key.
    $check = $dataBase->getSelect("SELECT COUNT(*) as numb FROM customer WHERE reference_key = ? ", $saver['reference_key']);
    if ($check['numb'] > 0) {
        // The key already exists in the database, so roll again.
        $found = true;
    }
    unset($check);
} while ($found === true);
// Once a unique key has been found, save it to the database
// along with all other user details.
$dataBase->arrayToSQLInsert("customer", $saver);
Please note that this code uses customised database interactions and is for illustration purposes ONLY.
Advantages:
Unique key does not reference any other customer data.
Unique key is not a hash or encryption so can not be 'compromised'.
Key is assured to be unique.
Disadvantages:
On very large data sets, the do/while process of finding a unique value may cause a slight slowdown.
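To get a feel for how often the do/while loop above would actually need to re-roll, the following sketch estimates the chance of any collision using the birthday approximation, assuming keys are drawn uniformly at random. The function name collisionProbability is hypothetical; the figures 66 and 12 correspond to the 66 characters in $source and the 12-character keys from nonce_generator(12).

```php
<?php
// Birthday-bound estimate: probability that at least two of $count random
// keys collide when each key is drawn uniformly from $alphabetSize ** $length
// possibilities. This is an approximation, not an exact figure.
function collisionProbability(int $count, int $alphabetSize, int $length): float
{
    $space = pow((float) $alphabetSize, $length);   // total number of possible keys
    $exponent = ($count * ($count - 1.0)) / (2.0 * $space);
    // 1 - exp(-x), computed with expm1() to stay accurate for tiny x.
    return -expm1(-$exponent);
}

// 66 source characters and 12-character keys, as in nonce_generator(12).
foreach ([1000, 1000000, 100000000] as $users) {
    printf("%d keys: p(collision) ~ %.3e\n", $users, collisionProbability($users, 66, 12));
}
```

Shorter keys or a smaller character set shrink the key space quickly, so it is worth re-running the estimate if either is changed.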

PHP MySQL - Update 6.5m rows performance issues

I am working with a MySQL table and I need to increment a value in one column for each row, of which there are over 6.5m.
The column type is varchar and can contain an integer or a string (e.g. +1). The table type is MyISAM.
I have attempted this with PHP:
$adjust_by = 1;
$adjusted = array();
foreach ($options as $option) {
    $original_turnaround = $option['turnaround'];
    $adjusted_turnaround = $option['turnaround'];
    if (preg_match('/\+/i', $original_turnaround)) {
        $tmp = intval($original_turnaround);
        $tmp += $adjust_by;
        $adjusted_turnaround = '+'.$tmp;
    } else {
        $adjusted_turnaround += $adjust_by;
    }
    if (!array_key_exists($option['optionid'], $adjusted)) {
        $adjusted[$option['optionid']] = array();
    }
    $adjusted[$option['optionid']][] = array(
        'original_turn' => $original_turnaround,
        'adjusted_turn' => $adjusted_turnaround
    );
}//end fe options
//update turnarounds:
if (!empty($adjusted)) {
    foreach ($adjusted as $opt_id => $turnarounds) {
        foreach ($turnarounds as $turn) {
            $update = "UPDATE options SET turnaround = '".$turn['adjusted_turn']."' WHERE optionid = '".$opt_id."' and turnaround = '".$turn['original_turn']."'";
            run_query($update);
        }
    }
}
For obvious reasons there are serious performance issues with this approach. Running this in my local dev environment leads to numerous errors and eventually the server crashing.
Another thing I need to consider is when this is run in a production environment. This is for an ecommerce store, and I cannot have a huge update like this lock the database or cause any other issues.
One possible solution I have found is this: Fastest way to update 120 Million records
But creating another table comes with its own issues. The codebase is not in a good state, and similar queries are run on this table in loads of places, so I would have to modify a large number of queries and files to make this approach work.
What are my options (if there are any)?
You can do this task in SQL alone.
With CAST you can convert a string into an integer.
With IF and SUBSTR you can check whether the string starts with +.
With CONCAT you can prepend + to the calculated result when necessary.
Just try this SQL:
"UPDATE `options` SET `turnaround` = CONCAT(IF(SUBSTR(`turnaround`, 1, 1) = '+', '+', ''), CAST(`turnaround` AS SIGNED) + " . (int) $adjust_by . ") WHERE 1";
Can't you just say
UPDATE whatevertable SET whatever = whatever + 1?
Try it and see, I'm pretty sure it will work!
EDIT: You have strings OR integers? Your DB design is flawed, this probably won't work, but would have been the correct answer had your DB design been more strict.
You probably don't have, but need, this 'composite' index (in either order):
INDEX(optionid, turnaround)
Please provide SHOW CREATE TABLE.
Another slight performance boost is to explicitly run LOCK TABLES options WRITE before that update loop, and UNLOCK TABLES afterwards. Caution: this only applies to MyISAM.
You would be much better off with InnoDB.
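As a way to sanity-check the CONCAT/IF/CAST expression from the first answer before running it against 6.5m rows, here is a small pure-PHP mirror of the same logic. It is only a sketch for comparing expected values on a few samples; adjustTurnaround is a hypothetical name, and it assumes the column holds either a plain integer or a '+'-prefixed integer as described in the question.

```php
<?php
// Pure-PHP mirror of the SQL expression, for checking expected values only.
// adjustTurnaround() is a hypothetical name, not part of the original code.
function adjustTurnaround(string $turnaround, int $adjustBy): string
{
    // IF(SUBSTR(turnaround, 1, 1) = '+', '+', '')
    $prefix = (substr($turnaround, 0, 1) === '+') ? '+' : '';
    // CAST(turnaround AS SIGNED) + $adjustBy; intval('+3') is 3, like the SQL cast.
    $value = intval($turnaround) + $adjustBy;
    // CONCAT(prefix, value)
    return $prefix . $value;
}

foreach (['3', '+3', '10'] as $sample) {
    echo $sample, ' => ', adjustTurnaround($sample, 1), "\n";
}
```

Comparing this output with a SELECT of the same expression on a handful of rows is a cheap check before issuing the real UPDATE.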

PHP PDO Unique Random Number Generator

Hello. First of all, please consider this a newbie question, because I could simply have set an ID field with zerofill so it would look like 000001, 000002 and so forth. But what I did was wrong, and the system is already big, so please bear with my question. I have a table named accounts which has an id field and a sponsorID field. What I did looks like this (by the way, I am using the Slim framework):
$db = new db();
$sponsorIDrandom = mt_rand(100000, 999999);
$bindCheck = array(
    ":sponsorID" => $sponsorIDrandom
);
$sponsorIDChecker = $db->select("accounts", "sponsorID = :sponsorID", $bindCheck);
$generate_SID = null;
do {
    $generate_SID = mt_rand(100000, 999999);
} while (in_array($generate_SID, array_column($sponsorIDChecker, 'sponsorID')));
$db->insert("accounts", array(
    "sponsorID" => $generate_SID
));
The code above is meant to check whether a number already exists in the accounts table and, if it does, to generate a random number again until it is unique in the accounts table. I made the sponsorID field unique so that it won't accept duplicate values.
Now the problem is the code I posted. I thought it would make $generate_SID unique, because I used in_array to check whether a value already exists in the array and generate a number again until it is unique. Luckily, I received an error showing that it tried to insert a random number that already exists and did not generate a new one.
Can anyone tell me if there's a solution for this? Or should I modify the code above so it does not insert an already existing sponsorID? Thank you in advance.
From what I understood, you are trying to insert a unique id into a table, but the generator only runs once and the database tells you that the number already exists.
I've never used Slim, but in your code you SELECT at most a single record, because you generate a random number and then ask the database for exactly that number:
$sponsorIDrandom = mt_rand(100000, 999999);
$bindCheck = array(
    ":sponsorID" => $sponsorIDrandom
);
$sponsorIDChecker = $db->select("accounts", "sponsorID = :sponsorID", $bindCheck);
This returns one row or none if, as you say, sponsorID is UNIQUE.
You then generate another random number and check that it is not a duplicate based on this single (or empty) result.
$generate_SID = null;
do {
    $generate_SID = mt_rand(100000, 999999);
} while (in_array($generate_SID, array_column($sponsorIDChecker, 'sponsorID')));
This loop only executes once, because the probability of the second random number matching that one record (if there is a record at all) is almost zero. And if the database is as big as you say, the probability that the new number collides with some other existing row is high.
For this code to work, you need either to load every existing sponsorID or to ask the database whether the newly generated number exists each time one is generated. Neither alternative is ideal, but they may be unavoidable since the database is already (almost) full.
$sponsorIDChecker = $db->select(...); // use the equivalent of "SELECT sponsorID FROM accounts" without the WHERE clause; it is better to ask for a single column.
$generate_SID = null;
do {
    $generate_SID = mt_rand(100000, 999999);
} while (in_array($generate_SID, ...)); // here you put the result of the query above.
$db->insert("accounts", array(
    "sponsorID" => $generate_SID
));
Now, something that may help: if you set sponsorID as zerofill in the database, as you said:
I can just set an ID field and add zerofill so it would look like 000001, 000002 and so forth.
then you can lower the minimum value of mt_rand to 0 and gain 100000 more IDs to try.
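The "load every existing sponsorID, then re-roll until free" approach described above can be sketched without a database as follows. The function name generateUniqueSponsorId and the sample data are hypothetical; $existing stands in for the result of the "SELECT sponsorID FROM accounts" query, and the guard assumes every existing ID lies inside the requested range.

```php
<?php
// Hypothetical helper: pick a random ID in [$min, $max] that is not in $existing.
// $existing stands in for the result of "SELECT sponsorID FROM accounts".
function generateUniqueSponsorId(array $existing, int $min = 100000, int $max = 999999): int
{
    // Flip values to keys so each membership test is a hash lookup,
    // instead of in_array() scanning the whole list on every attempt.
    $taken = array_flip($existing);
    // Assumes all existing IDs are within [$min, $max].
    if (count($taken) >= $max - $min + 1) {
        throw new RuntimeException('No free IDs left in the range');
    }
    do {
        $candidate = random_int($min, $max);
    } while (isset($taken[$candidate]));
    return $candidate;
}

$existingIds = [100001, 100002, 555555];   // sample data for illustration
$newId = generateUniqueSponsorId($existingIds);
echo $newId, "\n";
```

The UNIQUE index on sponsorID remains the final safeguard, since another request can insert the same number between this check and the INSERT.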

Caching MySQL results with Memcache and sorting/filtering cached results

To help everyone understand what I'm asking I put forward a scenario:
I have user A on my web app.
There is a particular page which has a table that contains information that is unique to that user. Let's say it is a list of customers that only show for user A because user A and these customers are in region 5.
Other users are assigned to different regions and see different lists of customers.
What I would like to do is cache all of the results for each user's list. This isn't a problem as I can use:
$MC = new Memcache;
$MC->addServer('localhost');
$data = $MC->get('customers');
if (!$data) {
    // Cache miss: run the query and store the result.
    $data = $this->model->customersGrid($take, $skip, $page, $pageSize, $sortColumn, $sortDirection, $filterSQL, $PDOFilterParams);
    $MC->set('customers', $data);
}
header('Content-Type: application/json');
return $data;
The challenge now is to somehow convert the SQL filter syntax that comes from my users table into a function that can filter and sort an array ($data is a JSON string that I would turn into an array if that's the right way to go).
Just for reference, here is the array of aliases I use for building the WHERE clause in my statements:
$KF = new KendoFilter;
$KF->columnAliases = array(
'theName' => 'name',
'dimensions' => 'COALESCE((SELECT CONCAT_WS(" x ", height, width, CONCAT(length, unit)) FROM products_dimensions,
system_prefs, units_measurement
WHERE products_dimensions.productId = product.id
AND units_measurement.id = system_prefs.defaultMeasurementId), "-")',
'gridSearch' => array('theName', 'basePrice')
);
$filterSQL = $KF->buildFilter();
My question is what is a good way to filter and sort memcache data as if it was an SQL query? Or does memcache have something already built in?
Memcache cannot do this - you can't replace your database with memcache (that is not what it is for), you can only store key => value pairs.
I think a better approach is to store the data for each user under its own Memcache key.
So for example if user A with $user_id = 123 visits the page:
$data = $MC->get('customers_for_'.$user_id);
This way you only get the customers for user 123.
A more generic approach is to generate a hash for each SQL query together with its params (but that might be overkill in most cases). For example, if you have a query select ... from ... where a = #a and b = #b with variables $a and $b, you could do the following (you must adapt this for Kendo of course, but it gives the idea):
$query = "select ... from ... where a = #a and b = #b";
# crc32 because it is fast and the mem key does not get too long
$sql_crc = crc32($query.$a.$b);
$data = $MC->get("customers_".$sql_crc);
To rule out (unlikely) hash collisions for different users, you could mix in the user id in the key, too:
$data = $MC->get("customers_for_".$user_id."_".$sql_crc);
BUT: If you start doing this all over the place in your app because otherwise it is too slow, then maybe the problem lies in your database (missing/wrong indexes, bad column definitions, complicated relations, etc.) and time should better be invested in fixing the DB than working around the issue like this.
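The per-user, per-query key described above can be wrapped in one small function. This is a sketch under assumptions: customersCacheKey is a hypothetical name, and json_encode is used here instead of plain concatenation so that parameter boundaries are preserved; neither is part of Memcache itself.

```php
<?php
// Hypothetical helper: build a Memcache key from the user, the SQL text and
// its bound parameters, so each distinct filter/sort gets its own entry.
function customersCacheKey(int $userId, string $query, array $params): string
{
    // json_encode keeps parameter boundaries, so ['ab', 'c'] and ['a', 'bc']
    // are hashed from different input strings.
    $fingerprint = sprintf('%u', crc32($query . '|' . json_encode($params)));
    return 'customers_for_' . $userId . '_' . $fingerprint;
}

$query = 'select ... from ... where a = #a and b = #b';
echo customersCacheKey(123, $query, ['a' => 5, 'b' => 'north']), "\n";
```

The resulting string would be passed to $MC->get() and $MC->set() in place of the fixed 'customers' key, so each combination of user, filter and sort is cached separately.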

create 6 alphanumeric code for unique field

I need to create a 6-character alphanumeric primary key for a table. My idea is to have an int field with auto-increment, get the max value of that field, and process it with the code below so that it becomes alphanumeric and is stored in a different field. Does this idea and code meet the requirement? Is it good? Will it always be unique?
<?php
// $code = the max value retrieved from the auto-increment int field
function getNextAlphaNumeric($code) {
    $base_ten = base_convert($code, 36, 10);
    $result = base_convert($base_ten + 1, 10, 36);
    $result = str_pad($result, 6, '0', STR_PAD_LEFT);
    $result = strtoupper($result);
    return $result;
}
Here's something to think about.
Create a 6-character primary key.
Generate the code with PHP.
Insert it using INSERT IGNORE; if no rows are affected (not very likely in the beginning), try again.
Done. A unique code was added.
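The "generate code with PHP" step above could look like the following minimal sketch. The function name randomCode and the choice of a 0-9/A-Z alphabet are assumptions for illustration; the database side (INSERT IGNORE and checking the affected-row count) is left out so the example runs on its own.

```php
<?php
// Hypothetical helper: random code drawn from 0-9 and A-Z (36 symbols).
function randomCode(int $length = 6): string
{
    $alphabet = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $max = strlen($alphabet) - 1;
    $code = '';
    for ($i = 0; $i < $length; $i++) {
        $code .= $alphabet[random_int(0, $max)];
    }
    return $code;
}

echo randomCode(), "\n";
```

With 36 symbols and 6 positions there are 36^6 = 2,176,782,336 possible codes, so retries stay rare until the table holds a sizeable fraction of that number.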
