Display huge data in batches of 100 every hour in MySQL/PHP

I have a database with more than 600 rows but I can only retrieve/display 100 every hour. So I use
select * from table ORDER BY id DESC LIMIT 100
to retrieve the first 100. How do I write a script that retrieves the data in batches of 100 every hour, so that I can run it from a cron job?

Possible solution.
1. Add a field to mark whether the record was already shown.
ALTER TABLE tablename
ADD COLUMN shown TINYINT NULL DEFAULT NULL;
NULL means that the record has not been selected yet, 1 means that the record is marked for selection, and 0 means that the record was already selected.
2. When you need to select up to 100 records:
2.1. Mark records to be shown
UPDATE tablename
SET shown = 1
WHERE shown = 1
OR shown IS NULL
ORDER BY shown = 1 DESC, id ASC
LIMIT 100;
The shown = 1 condition in WHERE covers records that were marked earlier but were not selected because of some error. Ordering by shown = 1 DESC re-marks such records before the non-marked ones.
If there are 100 or fewer records that were not selected yet, all of them will be marked; otherwise only the 100 records with the lowest ids (the oldest) will be marked.
2.2. Select marked records.
SELECT *
FROM tablename
WHERE shown = 1
ORDER BY id
LIMIT 100;
2.3. Mark selected records.
UPDATE tablename
SET shown = 0
WHERE shown = 1
ORDER BY id
LIMIT 100;
This is applicable when only one client selects the records.
If many clients may work in parallel, and each record must be selected by only one client, then mark a record for selection with a client number (unique across all clients) instead of 1.
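The single-client flow in steps 2.1 to 2.3 can be sketched as follows. This is only an illustration under assumptions: it uses Python's built-in sqlite3 with an in-memory table as a stand-in for MySQL, and because stock SQLite has no UPDATE ... ORDER BY ... LIMIT, the marking step goes through an id subquery. The function name next_batch is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (id INTEGER PRIMARY KEY, shown INTEGER DEFAULT NULL)")
conn.executemany("INSERT INTO tablename (id) VALUES (?)", [(i,) for i in range(1, 251)])

def next_batch(conn, size=100):
    """Mark up to `size` unshown rows, read them, then flag them as shown."""
    with conn:  # one transaction: commit on success, roll back on error
        # 2.1 mark: rows left at shown = 1 by a failed run first, then the oldest NULL rows
        conn.execute(
            "UPDATE tablename SET shown = 1 WHERE id IN ("
            " SELECT id FROM tablename WHERE shown = 1 OR shown IS NULL"
            " ORDER BY (shown = 1) DESC, id ASC LIMIT ?)",
            (size,),
        )
        # 2.2 select the marked rows
        ids = [r[0] for r in conn.execute(
            "SELECT id FROM tablename WHERE shown = 1 ORDER BY id LIMIT ?", (size,))]
        # 2.3 flag exactly those rows as already shown
        conn.executemany("UPDATE tablename SET shown = 0 WHERE id = ?", [(i,) for i in ids])
    return ids

# Four hourly runs over 250 rows: 100, 100, 50, then nothing left
batches = [next_batch(conn) for _ in range(4)]
```

Each call corresponds to one cron run; once every row has shown = 0, further calls return an empty list.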
Of course if there is only one client, and you guarantee that selection will not fail, you may simply store last shown ID somewhere (on the client side, or in some service table on the MySQL side) and simply select "next 100" starting from this stored ID:
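SELECT *
FROM tablename
WHERE id > @stored_id
ORDER BY id
LIMIT 100;
and
SELECT MAX(id)
FROM (
SELECT id
FROM tablename
WHERE id > @stored_id
ORDER BY id
LIMIT 100
) AS batch;
to get the value to store in place of the previous @stored_id. (The LIMIT has to sit in a subquery: applied directly to SELECT MAX(id), it would limit the single aggregated row, and MAX(id) would cover every remaining record rather than just the next 100.)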
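A rough sketch of the "stored last ID" variant, again with Python's sqlite3 standing in for MySQL. The batch_state service table and the next_batch function are hypothetical names chosen for the example; the ids are deliberately non-contiguous to show that the cursor follows id values rather than row positions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (id INTEGER PRIMARY KEY)")
# Hypothetical one-row service table holding the last id that was shown
conn.execute("CREATE TABLE batch_state (last_id INTEGER NOT NULL)")
conn.execute("INSERT INTO batch_state VALUES (0)")
# Odd ids only (1, 3, ..., 499): 250 rows with gaps between them
conn.executemany("INSERT INTO tablename VALUES (?)", [(i,) for i in range(1, 500, 2)])

def next_batch(conn, size=100):
    """Return the next `size` ids after the stored id and advance the stored id."""
    (stored_id,) = conn.execute("SELECT last_id FROM batch_state").fetchone()
    ids = [r[0] for r in conn.execute(
        "SELECT id FROM tablename WHERE id > ? ORDER BY id LIMIT ?", (stored_id, size))]
    if ids:
        with conn:
            # The last id of the batch is the MAX(id) of the batch
            conn.execute("UPDATE batch_state SET last_id = ?", (ids[-1],))
    return ids

first = next_batch(conn)    # ids 1..199
second = next_batch(conn)   # ids 201..399
third = next_batch(conn)    # ids 401..499 (only 50 rows left)
fourth = next_batch(conn)   # empty
```

Because the batch is already in memory, its last element serves as the new stored id, so the separate MAX(id) query is not needed in this sketch.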

Thank you @Akina and @Vivek_23 for your contributions. I was able to figure out an easier way to go about it.
Add a new field to the table, e.g. shownstatus.
Create a cron job that runs every hour, displays 100 records (LIMIT 100) whose shownstatus is not marked as shown, and then updates each displayed record's shownstatus to shown. NB: if the cron job runs every hour for the whole day, all records get displayed and marked as shown by close of day.
Create a second cron job to reset every record's shownstatus to notshown.
The downside is that you can only display a total of 2,400 records a day, i.e. 100 records every hour times 24 hours. So if your table grows to about 10,000 records, you will need to let the cron job run for at least 5 days to display all of them.
I'm still open to a better approach if there is one, but until then I will stick to this.
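The arithmetic behind that estimate, as a small Python sketch (the constant and function names are only illustrative):

```python
import math

BATCH_SIZE = 100     # records shown per cron run
RUNS_PER_DAY = 24    # one run every hour

def days_to_show_all(total_rows):
    """Whole days needed to show every record once."""
    per_day = BATCH_SIZE * RUNS_PER_DAY   # 2,400 records a day
    return math.ceil(total_rows / per_day)

days_for_600 = days_to_show_all(600)        # 1
days_for_10000 = days_to_show_all(10_000)   # 5, since 10,000 / 2,400 is about 4.17
```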

Let's say you made a cron that hits a URL something like
http://yourdomain.com/fetch-rows
or a script for instance, like
your_project_folder/fetch-rows.php
Let's say you have a DB table in place that looks something like this:
| id | offset | created_at |
|----|--------|---------------------|
| 1 | 100 | 2019-01-08 03:15:00 |
| 2 | 200 | 2019-01-08 04:15:00 |
Your script:
<?php
define('FETCH_LIMIT', 100);
$conn = mysqli_connect(....); // connect to DB
// select the last record to get the latest offset
$result = mysqli_query($conn, "select * from cron_hit_table where id = (select max(id) from cron_hit_table)");
$offset = 0; // initial default offset
if (mysqli_num_rows($result) > 0) {
    $offset = intval(mysqli_fetch_assoc($result)['offset']);
}
// Now, hit your query with $offset included ($offset is an int and FETCH_LIMIT a constant)
$result = mysqli_query($conn, "select * from table ORDER BY id DESC LIMIT $offset, " . FETCH_LIMIT);
while ($row = mysqli_fetch_assoc($result)) {
    // your data processing
}
// insert new row to store next offset for next cron hit
$offset += FETCH_LIMIT; // increment current offset
// ID is auto increment and created_at defaults to current_timestamp
mysqli_query($conn, "insert into cron_hit_table(offset) values($offset)");
mysqli_close($conn);
Whenever cron hits, you fetch last row from your hit table to get the offset. Hit the query with that offset and store the next offset for next hit in your table.
Update:
As pointed out by @Dharman in the comments, you can use PDO for a more abstracted way of dealing with different types of databases (but make sure you have the appropriate driver for yours; see the list of drivers PDO supports to be sure), along with minor checks of the query syntax.
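For comparison, here is the same offset-tracking idea as a self-contained sketch in Python with sqlite3 in place of MySQL. The items table stands in for the real data table, and the offset column is renamed next_offset here purely to sidestep keyword quoting; both names are assumptions of the example.

```python
import sqlite3

FETCH_LIMIT = 100

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")  # stand-in data table
conn.execute(
    "CREATE TABLE cron_hit_table ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " next_offset INTEGER NOT NULL,"
    " created_at TEXT DEFAULT CURRENT_TIMESTAMP)"
)
conn.executemany("INSERT INTO items VALUES (?)", [(i,) for i in range(1, 251)])

def fetch_next_batch(conn):
    """One cron hit: read the latest stored offset, fetch a batch, store the next offset."""
    row = conn.execute(
        "SELECT next_offset FROM cron_hit_table ORDER BY id DESC LIMIT 1").fetchone()
    offset = row[0] if row else 0  # no hits recorded yet: start at 0
    ids = [r[0] for r in conn.execute(
        "SELECT id FROM items ORDER BY id DESC LIMIT ? OFFSET ?", (FETCH_LIMIT, offset))]
    with conn:
        conn.execute("INSERT INTO cron_hit_table (next_offset) VALUES (?)",
                     (offset + FETCH_LIMIT,))
    return ids

first = fetch_next_batch(conn)    # ids 250 down to 151
second = fetch_next_batch(conn)   # ids 150 down to 51
third = fetch_next_batch(conn)    # ids 50 down to 1
fourth = fetch_next_batch(conn)   # empty: the offset is past the end
```

Note that an offset counts positions, so rows inserted or deleted between cron hits shift what each batch contains.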

Related

Mariadb: Pagination using OFFSET and LIMIT is skipping one row

I have this MariaDB table:
id, name
The id column has these attributes: Primary, auto_increment, unique.
The table has 40,000 rows.
I'm using this PHP & MariaDB to load rows from this table.
This is the PHP code:
$get_rows = $conn->prepare("SELECT * FROM my_table where id> 0 ORDER BY id ASC LIMIT 30 OFFSET ?");
$get_rows->bind_param('i', $offset);
//etc.
The query returned everything correctly at the first time, but in the next query (made through AJAX), I received the next 30 rows with a gap of one row between the current result and the next one. And this goes on and on.
In the table, the row #1 had been deleted. So, I restored it, and now the query works. However, I will definitely have to delete more rows in the future. (I don't have the option of soft-deleting).
Is there any way I can keep deleting rows, and have these queries return correct results (without skipping any row)?
EDIT
Here's an example of the range of the ids in the first 2 queries:
Query 1:
247--276
Query 2:
278--307
(277 is missing)
NB I asked ChatGPT, but it couldn't help. :')
LIMIT and OFFSET query rows by position, not by value. So if you deleted a row in the first "page," then the position of all subsequent rows moves down by one.
One solution to ensure you don't miss a row is to define pages by the greatest id value on the preceding page, instead of by the offset.
$get_rows = $conn->prepare("
SELECT * FROM my_table where id> ?
ORDER BY id ASC LIMIT 30");
$get_rows->bind_param('i', $lastId);
This only works if your previous query viewed the preceding page, so you can save the value of the last id in that page.
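A small sketch of the difference, using Python's sqlite3 and an in-memory copy of a table shaped like my_table (100 rows instead of 40,000). The helper names page_by_offset and page_after are made up for the example.

```python
import sqlite3

PAGE_SIZE = 30

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO my_table VALUES (?, ?)",
                 [(i, f"row {i}") for i in range(1, 101)])

def page_by_offset(conn, offset):
    """Page by position, as in the question."""
    rows = conn.execute(
        "SELECT id FROM my_table WHERE id > 0 ORDER BY id ASC LIMIT ? OFFSET ?",
        (PAGE_SIZE, offset))
    return [r[0] for r in rows]

def page_after(conn, last_id):
    """Page by value: everything after the last id already seen."""
    rows = conn.execute(
        "SELECT id FROM my_table WHERE id > ? ORDER BY id ASC LIMIT ?",
        (last_id, PAGE_SIZE))
    return [r[0] for r in rows]

page1 = page_by_offset(conn, 0)                     # ids 1..30
conn.execute("DELETE FROM my_table WHERE id = 5")   # a row on page 1 disappears

offset_page2 = page_by_offset(conn, PAGE_SIZE)  # starts at 32: id 31 is skipped
keyset_page2 = page_after(conn, page1[-1])      # starts at 31: nothing is skipped
```

In the AJAX setup from the question, the client would send back the last id it received instead of an offset.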

SQL: How to select rows that sum up to certain value

I want to select rows that sum up to a certain value.
My SQL (SQL Fiddle):
id user_id storage
1 1 1983349
2 1 42552
3 1 367225
4 1 1357899
37 1 9314493
It should calculate the sum up to 410000 and get the rows. Meanwhile it should get something like this:
id user_id storage
2 1 42552
3 1 367225
As you can see, 42552 + 367225 = 409777. It selected two rows that are nearly 410000.
I have tried everything but it didn't work :(
Sorry for my language, I am German.
You can use a correlated subquery to get the running total and retrieve those rows whose running total is below a specified number. (Note that I changed the storage column to int. If it is a varchar, the comparison would return the wrong result.)
select id,user_id,storage
from uploads t
where storage+coalesce((select sum(storage) from uploads
where storage<t.storage),0) < 410000
order by storage
SQL Fiddle
Edit: When there are duplicate values in the storage column, it has to be accounted for in the running sum by including a condition for the id column. (in this case < condition has been used, so the smallest id for a duplicate storage value gets picked up)
select id,user_id,storage
from uploads t
where storage+coalesce((select sum(storage) from uploads
where storage<t.storage
or (storage=t.storage and id < t.id)),0) < 410000
order by storage
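To see the running-total query at work on the sample rows from the question, here is a sketch using Python's sqlite3 (an assumption of the example; the query itself is the one above with the tie-break on id, plus an alias on the inner table).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE uploads (id INTEGER PRIMARY KEY, user_id INTEGER, storage INTEGER)")
conn.executemany("INSERT INTO uploads VALUES (?, ?, ?)", [
    (1, 1, 1983349),
    (2, 1, 42552),
    (3, 1, 367225),
    (4, 1, 1357899),
    (37, 1, 9314493),
])

# For each row, add up all smaller storage values (ties broken by id) and
# keep the row only if the running total including itself stays below the limit.
rows = conn.execute("""
    SELECT id, user_id, storage
    FROM uploads t
    WHERE storage + COALESCE((SELECT SUM(u.storage) FROM uploads u
                              WHERE u.storage < t.storage
                                 OR (u.storage = t.storage AND u.id < t.id)), 0) < ?
    ORDER BY storage
""", (410000,)).fetchall()

total = sum(r[2] for r in rows)   # 42552 + 367225 = 409777
```

The running total is built from the smallest storage value upwards, so this finds the rows that fit when filling in ascending order, not the combination closest to the limit in general.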
This is what you need:
SET @suma = 0;
SELECT @suma:=@suma+`storage`, id, storage FROM table
WHERE @suma<=410000
ORDER BY storage ASC;
I added "ORDER BY storage ASC" to skip rows that have too much storage.

MySQL update IDs on update, insert and delete

In my current application I am making a menu structure that can recursively create submenus within itself. However, because of this I am finding it difficult to also allow some sort of reordering. Most applications would just order by an "ordering" column; in this case that does not seem impossible, just a little harder.
What I want to do is use the ID column. So if I update id 10 to be id 1, then the id 1 that was there previously becomes 2.
At a friend's suggestion I was thinking of using cascades. However, after a little more research, that does not seem to work the way I thought it might.
So my question is: is there a way to do this natively in MySQL? If so, how might I do that? And if not, what would you suggest to reach the same end result?
Columns:
id title alias icon parent
Parents have a lower id than their children, which makes sure the script creates the parent's array before putting the children inside it. That part works. However, if I want to use an ordering column I will have to make a numbering system that ensures a child element is never higher than its parent in the results. That is possible, but if I update a parent then I must also update all of its children, resulting in more MySQL queries than I would want.
I am no MySQL expert, which is why I brought up this question. I feel there might be a solution that keeps the overhead low when it comes to the speed of the application.
Doing it on the ID column would be tough because you can't ever have 2 rows with the same ID so you can't set row 10 to row 1 until after you've set row 1 to row 2 but you can't set row 1 to row 2 until you set row 2 to row 3, etc. You'd have to delete row 10 and then do an update ID += 1 WHERE ID < 10... but you'd also have to tell MySQL to start from the highest number and go down....
You'd have to do it in separate queries like this:
Move ID 10 to ID 2
DELETE FROM table WHERE id = 10;
UPDATE table SET id = id + 1 WHERE id >= 2 AND id < 10 ORDER BY id DESC
INSERT INTO table (id, ...) VALUES (2, ...);
Another option, if you don't want to delete and reinsert, would be to set the id for row 10 to MAX(id) + 1 and then set it to 1 afterwards.
Also, if you want to move row 2 to row 10 you'd have to subtract from the id instead, and shift in ascending order so that each row moves into an id that was just freed:
Move ID 2 to ID 10
DELETE FROM table WHERE id = 2;
UPDATE table SET id = id - 1 WHERE id > 2 AND id <= 10 ORDER BY id ASC
INSERT INTO table (id, ...) VALUES (10, ...);
If you don't have your ID column set as UNSIGNED, you could temporarily turn all the IDs you want to shift into negative IDs, since AUTO_INCREMENT doesn't generate negative numbers. Still, this is pretty hacky and I wouldn't recommend it. You also probably need to lock the table so no other writes happen while this is running:
Move ID 2 to ID 10
UPDATE table SET id = id * -1 WHERE id > 2 AND id <= 10;
UPDATE table SET id = 10 WHERE id = 2;
UPDATE table SET id = id * -1 - 1 WHERE id < -2 AND id >= -10;
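The negative-ID variant can be tried out with the sketch below. It uses Python's sqlite3 and a hypothetical menu table with twelve rows titled a to l; the function name move_down is also made up. Locking and foreign keys such as the parent column are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE menu (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO menu VALUES (?, ?)",
                 list(enumerate("abcdefghijkl", start=1)))

def move_down(conn, src, dst):
    """Move row `src` to the higher id `dst`, shifting rows src+1..dst down by one."""
    if src >= dst:
        raise ValueError("this sketch only handles src < dst")
    with conn:  # all three updates commit together or not at all
        # 1. park the rows in between on negative ids so nothing collides
        conn.execute("UPDATE menu SET id = -id WHERE id > ? AND id <= ?", (src, dst))
        # 2. move the row itself into the freed target id
        conn.execute("UPDATE menu SET id = ? WHERE id = ?", (dst, src))
        # 3. bring the parked rows back, each one lower than before
        conn.execute("UPDATE menu SET id = -id - 1 WHERE id < ? AND id >= ?", (-src, -dst))
    return conn.execute("SELECT id, title FROM menu ORDER BY id").fetchall()

result = move_down(conn, 2, 10)
order = "".join(title for _, title in result)   # "acdefghijbkl"
```

Row b ends up at id 10 and rows c to j each move down by one, while ids 1, 11 and 12 are untouched.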

How can I get the offset of a particular row in MySQL?

I'm trying to make an image database which does not keep a consistent record of IDs. For example it might go 1,2,6,7,12, which, as you can see, is only 5 rows.
Inside the table I have fileid and filename. I created a PHP script to show me the image when I give the fileid. But if I give it the ID 5 which does not exist I get an error. That's fine as I want an error for that, but not for users who will browse through these images using forward and back buttons. The forward and back buttons would need to retrieve the true fileid which comes after the given ID. Hopefully that makes sense.
This is how I imagine the code to look like:
SELECT offset( WHERE fileid=4 )
That would give me the offset of the row where fileid is equal to 4. I think this is easy enough to understand. The reasons I need this are for creating the forward and back button. So I planned to add 1 or take 1 from the offset which gives me the new ID, and the new filename. That way when users browse it will skip the dead ID values automatically, but it will give an error when giving a false ID.
Going up:
SELECT * FROM table WHERE id > 'your_current_id' ORDER BY id LIMIT 1;
Going down:
SELECT * FROM table WHERE id < 'your_current_id' ORDER BY id DESC LIMIT 1;
ps: it is better to make LIMIT 2, so that you can see that you are at the first or at the last records in the database when only one record is returned.
If your results are ordered by x, ascending, the following will give you your current offset in the table:
SELECT COUNT(*) FROM tablename WHERE x < x_of_your_current_item;
If you just want to SELECT the next or previous row, you can skip having to do two queries by just directly selecting one row:
SELECT * FROM tablename WHERE x > x_of_your_current_item ORDER BY x LIMIT 1;
will give you the next item (and similarly < and adding DESC to the order-by would give you previous).
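Both ideas, sketched with Python's sqlite3 on the id sequence from the question (1, 2, 6, 7, 12). The files table and the helper names neighbour and offset_of are assumptions of the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (fileid INTEGER PRIMARY KEY, filename TEXT)")
conn.executemany("INSERT INTO files VALUES (?, ?)",
                 [(i, f"img{i}.png") for i in (1, 2, 6, 7, 12)])

def neighbour(conn, fileid, forward=True):
    """Return the next (or previous) existing fileid, or None at either end."""
    # op and order come from the fixed pairs below, never from user input
    op, order = (">", "ASC") if forward else ("<", "DESC")
    row = conn.execute(
        f"SELECT fileid FROM files WHERE fileid {op} ? ORDER BY fileid {order} LIMIT 1",
        (fileid,)).fetchone()
    return row[0] if row else None

def offset_of(conn, fileid):
    """Zero-based position of `fileid` when rows are ordered by fileid."""
    return conn.execute(
        "SELECT COUNT(*) FROM files WHERE fileid < ?", (fileid,)).fetchone()[0]

after_2 = neighbour(conn, 2)                    # 6: ids 3, 4 and 5 are skipped
before_6 = neighbour(conn, 6, forward=False)    # 2
after_12 = neighbour(conn, 12)                  # None: already at the last row
offset_6 = offset_of(conn, 6)                   # 2
```

A None result is the cue to disable the corresponding button; a direct lookup of a missing id such as 5 can still return an error as before.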
You can use offset. Initially set offset as zero.
First time your query will be
SELECT * FROM TABLE order by table_id LIMIT 0,1
and next
SELECT * FROM TABLE order by table_id LIMIT 1,1
..
and so on
This way one will get records from the beginning till the end.
Now about back and forward buttons.
Back Button: First time or whenever offset is zero disable back button
Forward Button:
When you query for the current record, check for the next record too. That is, after a query like
SELECT * FROM TABLE order by table_id LIMIT 0,1
fire a query like
SELECT * FROM TABLE order by table_id LIMIT current_offset+1,1
and check whether it produces any result. If it does, set a boolean, say next = TRUE; otherwise set next = FALSE.
Use this boolean to enable or disable the Forward button.
One more thing: on click of the Back button send the offset as current_offset - 1, and for the Forward button send current_offset + 1.
I hope this helps. I just came across this and thought of this solution.
SELECT * FROM table WHERE ... ORDER BY id DESC LIMIT 50,10;
SELECT * FROM table WHERE ... ORDER BY id DESC LIMIT 60,10;
SELECT * FROM table WHERE ... ORDER BY id DESC LIMIT 70,10;

how to synchronize mysql database requests?

I have a lot of entries in a table that are fetched for performing jobs. This is scaled to several servers.
When a server fetches a bunch of rows to add to its own job queue, they should be "locked" so that no other server fetches them.
When the update is performed, a timestamp is increased and they are "unlocked".
I currently do this by updating a field called "jobserver" in the table, which defaults to null, with the id of the job server.
A job server only selects rows where the field is null.
When all rows are processed, their timestamp is updated and finally the jobserver field is set to null again.
So I need to synchronize this:
$jobs = mysql_query("
SELECT itemId
FROM items
WHERE
jobserver IS NULL
AND
DATE_ADD(updated_at, INTERVAL 1 DAY) < NOW()
LIMIT 100
");
mysql_query("UPDATE items SET jobserver = 'current_job_server' WHERE itemId IN (".join(',',mysql_fetch_assoc($jobs)).")");
// do the update process in foreach loop
// update updated_at for each item and set jobserver to null
Every server executes the above in an infinite loop. If no rows are returned, everything is up to date (the last update is no more than 24 hours ago) and the server sleeps for 10 minutes.
I currently use MyISAM and I would like to stay with it because it had far better performance than InnoDB in my case, but I heard that InnoDB has ACID transactions.
So I could execute the select and update as one. But how would that look and work?
The problem is that I cannot afford to lock the table, because other processes need to read/write and cannot be blocked.
I am also open to a higher-level solution like a shared semaphore etc. The problem is that the synchronization needs to work across several servers.
Is the approach generally sane? Would you do it differently?
How can I synchronize the job selection to ensure that two servers don't update the same rows?
You can run the UPDATE first but with the WHERE and LIMIT that you had on the SELECT. You then SELECT the rows that have the jobserver field set to your server.
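A minimal sketch of that "UPDATE first, then SELECT what you claimed" order, using Python's sqlite3 in place of MySQL. Stock SQLite has no UPDATE ... LIMIT, so the limit goes through an id subquery; the claim_jobs function and the server names are invented for the example. The two claims run one after the other on a single connection, so this shows the shape of the queries rather than proving anything about concurrent behaviour on MyISAM.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (itemId INTEGER PRIMARY KEY, jobserver TEXT, updated_at TEXT)")
conn.executemany(
    "INSERT INTO items (itemId, jobserver, updated_at) VALUES (?, NULL, ?)",
    [(i, "2019-01-01 00:00:00") for i in range(1, 151)])

def claim_jobs(conn, server, cutoff, limit=100):
    """Claim up to `limit` stale, unclaimed rows for `server`, then list its claims."""
    with conn:
        # Claim first: only rows that are still unclaimed and older than the cutoff
        conn.execute(
            "UPDATE items SET jobserver = ? WHERE itemId IN ("
            " SELECT itemId FROM items"
            " WHERE jobserver IS NULL AND updated_at < ?"
            " ORDER BY itemId LIMIT ?)",
            (server, cutoff, limit))
    # Then read back whatever this server actually holds
    return [r[0] for r in conn.execute(
        "SELECT itemId FROM items WHERE jobserver = ? ORDER BY itemId", (server,))]

cutoff = "2019-01-08 00:00:00"
jobs_a = claim_jobs(conn, "server-a", cutoff)   # items 1..100
jobs_b = claim_jobs(conn, "server-b", cutoff)   # items 101..150, no overlap with server-a
```

Because the claim is a single UPDATE statement, a server only ever works on rows whose jobserver field it managed to set.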
If you can't afford to lock the tables, then I would make the update conditional on the row not being modified. Something like:
// Compute the cutoff once so every query compares against the same value
$result = mysql_query("SELECT DATE_SUB(NOW(), INTERVAL 1 DAY)");
$timestamp = mysql_result($result, 0);
$jobs = mysql_query("
SELECT itemId
FROM items
WHERE
jobserver IS NULL
AND
updated_at < '".$timestamp."'
LIMIT 100
");
// mysql_fetch_assoc() returns one row per call, so collect the IDs in a loop
$ids = array();
while ($row = mysql_fetch_assoc($jobs)) {
$ids[] = (int) $row['itemId'];
}
// Claim only those rows which are still unclaimed and haven't been updated in the meantime
mysql_query("UPDATE items SET jobserver = 'current_job_server' WHERE itemId IN (".join(',', $ids).") AND jobserver IS NULL AND updated_at < '".$timestamp."'");
// Now get a list of jobs which were actually claimed
$actual_jobs_to_do = mysql_query("
SELECT itemId
FROM items
WHERE jobserver = 'current_job_server'
");
// Continue processing, with the actual list of jobs
You could even combine the select and update queries, like this:
mysql_query("
UPDATE items
SET jobserver = 'current_job_server'
WHERE jobserver IS NULL
AND updated_at < '".$timestamp."'
LIMIT 100
");
