MySQL / PHP - ID numbers reached 9999 and started counting from 0001

I have a PHP website which takes customer applications. Each application is given an incremental ID number.
Recently the site reached application ID number 9999, but instead of continuing on to 10000 it reverted back to 0001.
Any ideas why this happened - perhaps some kind of PHP or MySQL setting, range or limit?!
Thanks in advance.

It is likely the size of the INT allowed for the incrementing primary key.
Check the id column in your table structure and give it a length of 11, i.e. INT(11) (for example ALTER TABLE applications MODIFY id INT(11) NOT NULL AUTO_INCREMENT; with the table name assumed).

Yes, there was some formatting applied in the PHP to limit the ID to 4 digits (0000), so when it reached 10000 it was cutting off the first 1. This was causing it to appear as 0001 etc. in the backend and other places.
As a consequence it was also sending out the cut version of the ID number in emails to applicants and causing all sorts of mess lol!
Took a while to find the right file because the original coder is not available.
Thanks to all.
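For anyone hitting the same symptom: the failure mode was presumably something like the following hypothetical reconstruction (not the actual code), where formatting the ID to a fixed 4 digits works up to 9999 and then silently drops digits.

<?php
// Hypothetical reconstruction of the bug: the zero-padding is harmless,
// but the substr() caps the output at 4 characters.
function formatApplicationId(int $id): string
{
    return substr(str_pad((string) $id, 4, '0', STR_PAD_LEFT), -4);
}

echo formatApplicationId(42), "\n";    // "0042"
echo formatApplicationId(9999), "\n";  // "9999"
echo formatApplicationId(10001), "\n"; // "0001" - the leading "1" is cut off

The fix is simply to drop the truncation (and ideally the padding) and use the raw integer ID.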

Related

Performance of MySQL uuid_to_bin in a PHP JSON import script

I have a PHP script which iterates through a JSON file line by line (using JsonMachine) and checks each line against criteria (in a foreach); if the criteria are met, it checks whether the row is already in the database, and then imports or updates it (MySQL 8.0.26). As an example, the last time this script ran it iterated through 65,000 rows and imported 54,000 of them in 24 seconds.
Each JSON row has a UUID as its unique key, and I am importing this as a VARCHAR(36).
I read that it can be advantageous to store UUIDs as BINARY(16) using uuid_to_bin and bin_to_uuid, so I coded the import script to store the UUID as binary, the read-side PHP scripts to decode back to the 36-char UUID, and the database field as BINARY(16).
This worked functionally, but the import time went from 24 seconds to 30 minutes. The server was not CPU-bound during that time, running at 25-30% (normally <5%).
Without the UUID conversion the script runs at about 3,000 lines per second; with the conversion it runs at about 30 lines per second.
The question: can anyone with experience of bulk importing using uuid_to_bin comment on performance?
I've reverted to native UUID storage, but I'm interested to hear others' experience.
EDIT with extra info from comments and replies:
The UUID is the primary key
The server is a VM with 12GB and 4 x assigned cores
The table is 54,000 rows (from the import), and is 70MB in size
InnoDB buffer pool size is unchanged from the default of 128MB: 134,217,728 bytes
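For reference, the BINARY(16) variant was wired up roughly like this (a minimal sketch, assuming PDO and a hypothetical items table with a BINARY(16) primary key uuid; connection details and names are placeholders, not the actual script):

<?php
$pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');

// Write side: let MySQL convert the 36-char UUID to 16 bytes.
$insert = $pdo->prepare(
    'INSERT INTO items (uuid, payload) VALUES (UUID_TO_BIN(:uuid), :payload)'
);
$insert->execute([
    'uuid'    => 'f81d4fae-7dec-11d0-a765-00a0c91e6bf6',
    'payload' => 'example row',
]);

// Read side: convert the binary key back to the 36-char form for display.
foreach ($pdo->query('SELECT BIN_TO_UUID(uuid) AS uuid, payload FROM items') as $row) {
    echo $row['uuid'], ' ', $row['payload'], "\n";
}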
Oh, bother. UUID_TO_BIN changes UUID values from being scattered to being roughly chronologically ordered -- for type 1 UUIDs, when called with the swap flag as UUID_TO_BIN(uuid, 1). This helps performance by clustering rows on disk better.
First, let's check the type. Please display one (any one) of the 36-char UUIDs, or HEX(binary) if you are using the 16-byte binary version. After that, I will continue this answer depending on whether it is type 1 or some other type.
Meanwhile, some other questions (to help me focus on the root cause):
What is the value of innodb_buffer_pool_size?
How much RAM?
How big is the table?
Were the incoming uuids in some particular order?
A tip: use IODKU (INSERT ... ON DUPLICATE KEY UPDATE) instead of SELECT + (UPDATE or INSERT). That will double the speed.
Then batch the rows 100 at a time. That may give another 10x speedup. A sketch of both combined is below.
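A minimal sketch of both tips combined, assuming PDO and a hypothetical items table with a BINARY(16) uuid primary key (names, data, and DSN are placeholders, not the original script):

<?php
$pdo  = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');
$rows = [['uuid' => 'f81d4fae-7dec-11d0-a765-00a0c91e6bf6', 'payload' => '...']];

// One multi-row IODKU statement per batch of up to 100 rows.
function flushBatch(PDO $pdo, array $batch): void
{
    $places = implode(',', array_fill(0, count($batch), '(UUID_TO_BIN(?), ?)'));
    $sql = "INSERT INTO items (uuid, payload) VALUES $places
            ON DUPLICATE KEY UPDATE payload = VALUES(payload)";
    $params = [];
    foreach ($batch as $row) {
        $params[] = $row['uuid'];
        $params[] = $row['payload'];
    }
    $pdo->prepare($sql)->execute($params);
}

$batch = [];
foreach ($rows as $row) {            // in the real script: the JsonMachine loop
    $batch[] = $row;
    if (count($batch) === 100) {     // flush every 100 rows
        flushBatch($pdo, $batch);
        $batch = [];
    }
}
if ($batch) {                        // flush the final partial batch
    flushBatch($pdo, $batch);
}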
More
Your UUIDs are type 4 -- random. UUID_TO_BIN() merely changes one random order into another. (Dropping from 36 bytes to 16 is still beneficial.)
innodb_buffer_pool_size -- 128M is an old, too-small default. If you have more than 4GB of RAM, set it to about 70% of RAM. Your VM has 12GB, so change the setting to 8G (innodb_buffer_pool_size = 8G in my.cnf). This change should help performance significantly: it eliminates most of the I/O, which is the slow part of SQL.

Auto-increment 5 numbers off?

I'm using a database for a project where I'm inserting rows, and the table has an auto-increment ID. However, at some point the auto-increment started counting "wrong".
For example, the last row in the table has an ID of 227. When I insert another row, it should be auto-assigned 228, but instead it jumps to 232. How do I fix this?
Auto-increment doesn't mean "use the next highest number in the table".
There are a few reasons why auto-increment numbers can be non-contiguous:
The auto-increment step may not be 1.
Previous rows may have been deleted.
Transactions inserting into the table may have been rolled back (demonstrated below).
A record may have updated the auto-increment column.
The auto-increment start value may have been changed by a DDL statement.
There are probably a few other scenarios which would cause this, too.
To answer the question of "How do I fix this?": don't bother. The ID is supposed to be unique and nothing else; having contiguous IDs is usually not that useful (unless you're doing paging that assumes they are contiguous, which is a bad assumption to make).
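A quick way to see the rollback case in action, sketched with PDO against a hypothetical demo table (a rolled-back INSERT still consumes a value):

<?php
$pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');
$pdo->exec('CREATE TABLE IF NOT EXISTS demo (
    id INT AUTO_INCREMENT PRIMARY KEY, note VARCHAR(20)) ENGINE=InnoDB');

$pdo->beginTransaction();
$pdo->exec("INSERT INTO demo (note) VALUES ('will vanish')");
$pdo->rollBack();                  // the row is gone, but the ID is spent

$pdo->exec("INSERT INTO demo (note) VALUES ('kept')");
echo $pdo->lastInsertId();         // 2 on a fresh table - id 1 was burned by the rollback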
AutoIncrement numbers can be collected by the DB Server ahead of time. Instead of getting just the one asked for, a DB will sometimes get say 5 (or 10 or 100) and update the long term store of the next number to be assigned with the next + 5 (or 10 or 100). Then the next time the next one is assigned, it's got it in memory which is much faster, especially as it does not have to store the next to assign to the disk.
Timetable:
1. Asked for the next number. Hasn't got one in memory, so:
   - go to disk to get the next (call it a)
   - update disk with a + 100
   - give the asker a
   - store a+1 in memory as the next to give out
2. Asked for the next number: give out a+1 and store a+2 in memory
...and so on.
However, if the DB server is stopped before it has given out all the numbers in the block, then when it is restarted it will look for the next number and find a+100 - hence the gap. Autonumbers are generally guaranteed to be given out as ever-increasing values and should always be unique (watch what happens when the maximum value is reached, though). They are usually, but not always, sequential.
(Oracle does the same with sequences. It's a while since I used it, but with a sequence you can specify a cache size. The only way to guarantee gap-free sequential assignment is a cache size of zero, which is slower.)
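A toy simulation of that block-allocation scheme (a sketch of the idea, not how any particular DB is actually implemented) makes the gap after a restart easy to see:

<?php
// The "disk" value only changes once per block of 100 IDs.
class IdAllocator
{
    private int $next = 0;      // next value cached in memory
    private int $ceiling = 0;   // first value NOT covered by the cached block
    private int $disk = 1;      // persisted "next block starts here"

    public function nextId(): int
    {
        if ($this->next >= $this->ceiling) {  // cache exhausted
            $this->next = $this->disk;        // fetch the next block start
            $this->disk += 100;               // persist the block after that
            $this->ceiling = $this->disk;
        }
        return $this->next++;
    }

    public function restart(): void
    {
        // Crash/restart: in-memory state is lost, only "disk" survives.
        $this->next = $this->ceiling = 0;
    }
}

$a = new IdAllocator();
echo $a->nextId(), ' ', $a->nextId(), "\n"; // 1 2
$a->restart();
echo $a->nextId(), "\n";                    // 101 - hence the gap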

Detect points of change in sensor data

I have lots of sensor data from which I need to be able to detect changes reliably. Basically it comes from a water level sensor in a remote client, which uses an accelerometer and a float to get the water level. My problem is that the data can be noisy at times (it varies by 2-5 units per measurement) and sometimes I need to detect changes as small as 7-9 units.
When I graph the data it's quite obvious to the human eye that there's a change, but how would I go about it programmatically? At the moment I'm just trying to detect changes bigger than x, but that's not very reliable. I've attached a sample graph and pointed out the changes with arrows. The huge changes at the beginning are just testing, so they're not normal behaviour for the data.
The data is in a MySQL database and the code is in PHP, so if you could point me in the right direction I'd highly appreciate it!
EDIT: There can also be some spikes in the data which are not considered valid but rather a mistake in the data.
EDIT: Example data can be found at http://pastebin.com/x8C9AtAk
The algorithm would need to run every 30 minutes or so and should be able to detect changes within the last 2-4 pings. Pings arrive at 3-5 minute intervals.
I made some awk that you, or someone else, might like to experiment with. I average the last 10 samples (m), excluding the current one, and also average the last 2 samples (n), then calculate the difference between the two and output a message if the absolute difference exceeds a threshold.
#!/bin/bash
awk -F, '
# j will count number of samples
# we will average last m samples and last n samples
BEGIN {j=0;m=10;n=2}
{d[j]=$3;id[j++]=$1" "$2} # Store this point in array d[]
END { # Do this at end after reading all samples
for(i=m;i<j;i++){ # Iterate over all samples once we have m of history (starting at m-1 would read d[-1])
totlastm=0 # Calculate average over last m not incl current
for(k=m;k>0;k--)totlastm+=d[i-k]
avelastm=totlastm/m # Average = total/m
totlastn=0 # Calculate average over last n
for(k=n-1;k>=0;k--)totlastn+=d[i-k]
avelastn=totlastn/n # Average = total/n
dif=avelastm-avelastn # Calculate difference between ave last m and ave last n
if(dif<0)dif=-dif # Make absolute
mesg="";
if(dif>4)mesg="<-Change detected"; # Make message if change large
printf "%s: Sample[%d]=%d,ave(%d)=%.2f,ave(%d)=%.2f,dif=%.2f%s\n",id[i],i,d[i],m,avelastm,n,avelastn,dif,mesg;
}
}
' <(tr -d '"' < levels.txt)
The last bit <(tr...) just removes the double quotes before sending the file levels.txt to awk.
Here is an excerpt from the output:
18393344 2014-03-01 14:08:34: Sample[1319]=343,ave(10)=342.00,ave(2)=342.00,dif=0.00
18393576 2014-03-01 14:13:37: Sample[1320]=343,ave(10)=342.10,ave(2)=343.00,dif=0.90
18393808 2014-03-01 14:18:39: Sample[1321]=343,ave(10)=342.10,ave(2)=343.00,dif=0.90
18394036 2014-03-01 14:23:45: Sample[1322]=342,ave(10)=342.30,ave(2)=342.50,dif=0.20
18394266 2014-03-01 14:28:47: Sample[1323]=341,ave(10)=342.20,ave(2)=341.50,dif=0.70
18394683 2014-03-01 14:38:16: Sample[1324]=346,ave(10)=342.20,ave(2)=343.50,dif=1.30
18394923 2014-03-01 14:43:17: Sample[1325]=348,ave(10)=342.70,ave(2)=347.00,dif=4.30<-Change detected
18395167 2014-03-01 14:48:25: Sample[1326]=345,ave(10)=343.20,ave(2)=346.50,dif=3.30
18395409 2014-03-01 14:53:28: Sample[1327]=347,ave(10)=343.60,ave(2)=346.00,dif=2.40
18395645 2014-03-01 14:58:30: Sample[1328]=347,ave(10)=343.90,ave(2)=347.00,dif=3.10
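Since the question says the code is in PHP, here is a rough PHP port of the same moving-average comparison (the $samples array of [id, timestamp, level] rows is a hypothetical stand-in for rows fetched from the MySQL table):

<?php
// Compare the average of the previous $m samples against the average of the
// last $n samples; flag a change when they differ by more than $threshold.
function detectChanges(array $samples, int $m = 10, int $n = 2, float $threshold = 4.0): array
{
    $changes = [];
    for ($i = $m; $i < count($samples); $i++) {
        $aveM = 0.0;                 // average over last m, excluding current
        for ($k = 1; $k <= $m; $k++) {
            $aveM += $samples[$i - $k][2];
        }
        $aveM /= $m;

        $aveN = 0.0;                 // average over last n, including current
        for ($k = 0; $k < $n; $k++) {
            $aveN += $samples[$i - $k][2];
        }
        $aveN /= $n;

        if (abs($aveM - $aveN) > $threshold) {
            $changes[] = $samples[$i];   // change detected at this sample
        }
    }
    return $changes;
}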
The right way to go about problems of this kind is to build a model of the phenomenon of interest and also a model of the noise process, and then make inferences about the phenomenon given some data. These inferences are necessarily probabilistic. The general computation you need to carry out is a generalized form of Bayes' rule:
P(H_k | data) = P(data | H_k) P(H_k) / sum_k' [ P(data | H_k') P(H_k') ]
where the H_k are all the hypotheses of interest, such as "step of magnitude m at time t" or "noise of magnitude m". In this case there might be a large number of plausible hypotheses, covering all possible magnitudes and times. You might need to limit the range of hypotheses considered in order to make the problem tractable, e.g. only looking back a certain number of time steps.
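To make that concrete, here is a minimal sketch for one family of hypotheses -- "a single step at time k" -- assuming Gaussian noise with known standard deviation $sigma and a flat prior over k ($y is a hypothetical array of level readings):

<?php
// Posterior over change times: with a flat prior, P(H_k | data) is
// proportional to P(data | H_k), normalised over all k.
function changePointPosterior(array $y, float $sigma): array
{
    $n = count($y);
    $logLik = [];
    for ($k = 1; $k < $n; $k++) {
        // Best-fit means before and after the hypothesised step at $k.
        $before = array_slice($y, 0, $k);
        $after  = array_slice($y, $k);
        $m1 = array_sum($before) / count($before);
        $m2 = array_sum($after) / count($after);
        // Gaussian log-likelihood is -SSE / (2 sigma^2), up to a constant.
        $sse = 0.0;
        foreach ($before as $v) { $sse += ($v - $m1) ** 2; }
        foreach ($after as $v)  { $sse += ($v - $m2) ** 2; }
        $logLik[$k] = -$sse / (2 * $sigma * $sigma);
    }
    // Normalise into a posterior (subtract the max first for stability).
    $max  = max($logLik);
    $post = [];
    $z    = 0.0;
    foreach ($logLik as $k => $ll) { $post[$k] = exp($ll - $max); $z += $post[$k]; }
    foreach ($post as $k => $p)    { $post[$k] = $p / $z; }
    return $post; // a sharp peak at some k suggests a real change there
}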

MySQL unlimited primary key auto-increment

I have a question which I'm really unsure about.
First, feel free to downvote me if you must, but I would really like to hear a more experienced developer's opinion.
I am building a site where I would like to implement functionality similar to Google Circles.
My logic would be this:
Every user will have circles attached to them after signup.
For example, when a user signs up, the form is filled in and the following rows will be inserted into the database:
id | circle_name  | user_id
---+--------------+---------
 1 | circle one   | 1
 2 | circle two   | 1
 3 | circle three | 1
Every circle will have a primary key.
But this is what I'm unsure about: I'm a bit scared that after a while the table will "break", by which I mean that once the ID column reaches its maximum value it will actually stop generating more.
When you specify an INT column in the database, the default length is 11. Yes, I know I can increase it or set it to whatever value I want, but is giving it a higher value a good idea?
Or is there any possibility to make an auto-increment primary key unlimited?
Thank you for the opinions and help.
Or is there any possibility to make an auto-increment primary key unlimited?
You can use a BIGINT.
Strictly speaking it's not unlimited, but the range is so incredibly huge (a signed BIGINT tops out at 9,223,372,036,854,775,807) that you wouldn't be able to use up all the values even if you tried really hard.
Just run the maths and you'll get the answer yourself (see the back-of-envelope calculation below). If the type can store billions of billions of values and you don't expect to have a million new registrations every week, then getting to a point where it breaks is "practically" impossible, even if "theoretically" possible.
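Running those maths as a back-of-envelope check (assuming, generously, a hypothetical million new rows every week):

<?php
$max     = 2 ** 64 - 1;   // unsigned BIGINT maximum: 18,446,744,073,709,551,615
$perWeek = 1000000;       // hypothetical insert rate
$weeks   = $max / $perWeek;
printf("Exhausted after %.1e weeks, roughly %.1e years\n", $weeks, $weeks / 52);
// => about 1.8e13 weeks, i.e. some 3.5e11 years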

Subset-sum problem in PHP with MySQL

I have the following problem:
I have a MySQL database with songs in it. The table has the following structure:
id INT(11)(PRIMARY)
title VARCHAR(255)
album VARCHAR(255)
track INT(11)
duration INT(11)
The user should be able to enter a specific time into a PHP form, and the PHP function should give him a list of all possible combinations of songs which add up to the given time ±X minutes.
So if the user wants to listen to 1 hour of music ±5 minutes, he would enter 60 minutes with a 5-minute threshold into the form and would receive all possible song sets which add up to a total of 55 to 65 minutes. It should not print out duplicates.
I have already found several approaches to this problem, but they only gave me back the durations which add up to X, not the song names etc. So my problem is how to solve this in a way that gives me back the IDs of the songs which add up to the desired time, so that I can print the list with the corresponding song names.
This seems to be one of the best answers I found, but I am not able to adapt it to my database.
What you are describing is a knapsack-style problem (specifically subset-sum). When I was in college, we used dynamic programming to attack problems like this.
The brute-force method (just try every combination until one or more works) is O(2^n) -- exponential -- and extremely lengthy to iterate through and calculate.
If your track times are stored as an INT (seconds seems the easiest unit to me), then your knapsack size is 3,300-3,900 (3,600 seconds = 1 hour, ±300 for the threshold).
Is your goal to always return the first set that matches, or to always return a random set?
Note: by accepting a range of sack sizes rather than one exact total, you greatly expand the number of possible answers. A sketch of the DP, including recovering the song IDs, follows below.
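A minimal sketch of the dynamic-programming approach, assuming durations in seconds and a hypothetical $songs array of rows shaped like ['id' => 1, 'duration' => 247] fetched from the table. It finds one subset whose total lands inside [$target - $slack, $target + $slack] and backtracks to recover the song IDs:

<?php
// Subset-sum DP with backtracking. reachable[s] records which song was used
// to reach total s; prev[s] records the total before adding that song.
function findPlaylist(array $songs, int $target, int $slack): ?array
{
    $hi = $target + $slack;
    $reachable = [0 => -1];
    $prev = [0 => 0];
    foreach ($songs as $i => $song) {
        // Snapshot the keys so each song is used at most once.
        foreach (array_keys($reachable) as $s) {
            $t = $s + $song['duration'];
            if ($t <= $hi && !isset($reachable[$t])) {
                $reachable[$t] = $i;
                $prev[$t] = $s;
            }
        }
    }
    // Pick any achievable total inside the window and walk back to the IDs.
    for ($s = $target - $slack; $s <= $hi; $s++) {
        if (isset($reachable[$s])) {
            $ids = [];
            while ($s > 0) {
                $ids[] = $songs[$reachable[$s]]['id'];
                $s = $prev[$s];
            }
            return $ids;
        }
    }
    return null; // no combination lands in the window
}

In practice you would fetch $songs with SELECT id, duration FROM songs, convert the user's minutes to seconds, and look up the titles for the returned IDs. Bear in mind that enumerating every qualifying set, as the question asks, can still explode combinatorially, so capping the number of sets returned is advisable.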
