Trying to Select and Update the Same Rows Quickly - php

I have a MySQL table that's being updated very frequently. In essence, I'm trying to grab 500 rows with multiple PHP scripts at once, and I don't want the PHP scripts to grab the same rows. I don't want to use ORDER BY RAND() because of the server load it causes with thousands of rows.
So, I thought of simply having each script set every row's status as "1" (so it wouldn't be grabbed again). So, I want to grab 500 rows where status = 0 (I use SELECT order by asc), and then have those exact 500 rows set to status "1" so that another script doesn't grab those.
Since the table is being updated all the time, I can't select 500 rows in ascending order and then update 500 rows in ascending order, because in the time it takes to start the script and run the SELECT, more rows might have been added.
Therefore, I need a way to SELECT 500 rows and then somehow "remember" which rows I selected and update them.
How would I go about doing SELECT and UPDATE quickly like I described?

Generate a unique ID for each script (just a random integer usually works fine for these purposes).
Run an UPDATE table SET status = <my random id> WHERE status = 0 LIMIT 500 query from each process.
Then have each process run a SELECT ... FROM table WHERE status = <my random id> to actually get the rows.
In essence, you "lock" the rows to your current script first, and then go retrieve the ones you successfully locked.
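The claim-then-select idea above can be sketched as follows. This is a minimal illustration under stated assumptions, not the answerer's code: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table name `jobs`, its columns, and the helper names `claim_rows` and `make_demo_db` are hypothetical. SQLite does not accept `UPDATE ... LIMIT` by default, so the sketch limits through a subquery; in MySQL the `UPDATE ... WHERE status = 0 LIMIT 500` form from the answer can be used directly.

```python
import random
import sqlite3


def claim_rows(conn, batch_size=500):
    """Tag up to batch_size unclaimed rows with a random claim id, then read them back."""
    # Never 0, because 0 is the value that means "unclaimed".
    claim_id = random.randint(1, 2**31 - 1)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE jobs SET status = ? WHERE id IN "
            "(SELECT id FROM jobs WHERE status = 0 ORDER BY id LIMIT ?)",
            (claim_id, batch_size),
        )
    # Only rows this caller tagged carry its claim id.
    rows = conn.execute(
        "SELECT id, payload FROM jobs WHERE status = ? ORDER BY id",
        (claim_id,),
    ).fetchall()
    return claim_id, rows


def make_demo_db(n_rows):
    """Build a throwaway in-memory table (hypothetical schema) for trying the sketch."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs ("
        "id INTEGER PRIMARY KEY, payload TEXT, status INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO jobs (payload) VALUES (?)",
        [(f"row {i}",) for i in range(n_rows)],
    )
    conn.commit()
    return conn
```

Calling `claim_rows(conn)` repeatedly on the same table returns disjoint batches, because a row's status stops being 0 as soon as one caller has tagged it. Two callers drawing the same random id is very unlikely but possible; a process id or a UUID would rule it out.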

Related

mysql - working with big table 200k rows

I'm using this code to generate an HTML table from a MySQL table. The table has 200k rows.
$view->ach = $db->Query("SELECT from_unixtime(`date`), `aav_id`, `aav_domain`, `aav_word`, `aav_referer`, `aav_ip`, `aav_country`
FROM aav_views
where aav_user_id=$USER_ID
ORDER BY date DESC
");
But it's not working. The web browser says:
"The page isn’t working
www.mysite.com is currently unable to handle this request.
HTTP ERROR 500 "
(not the 500 internal server error)
I added a LIMIT to the SQL query like this:
$view->ach = $db->Query("SELECT from_unixtime(`date`), `aav_id`, `aav_domain`, `aav_word`, `aav_referer`, `aav_ip`, `aav_country`
FROM aav_views
where aav_user_id=$USER_ID
ORDER BY date DESC
LIMIT 1000
");
Now it works fine, but I need to run it without the limit. I need to query all 200k rows.
The way to handle such a large result set from MySQL is to use something called pagination. With pagination, you might only retrieve records 100 or 1000 at a time. This eliminates the problem of crashing your web server or web page with too much information.
MySQL has two keywords which are well suited to handle this problem. The first one, LIMIT, you already know, and it controls how many total records appear in the result set. The second one is OFFSET, and it specifies the position in the result set from which to begin taking records.
To give you an example, if you wanted to return the second 100 records from your table, you would issue the following query:
SELECT from_unixtime(date), aav_id, aav_domain, aav_word, aav_referer, aav_ip, aav_country
FROM aav_views
where aav_user_id=$USER_ID
ORDER BY date DESC
LIMIT 100 OFFSET 100
Typically, the user controls the offset by paging through a UI containing the results from the query.
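A small sketch of how a page number maps to LIMIT and OFFSET may help here. It uses Python's sqlite3 module purely as a runnable stand-in for MySQL; the table `views`, its columns, and the function `fetch_page` are hypothetical names, not part of the original code. The user id is passed as a bound parameter rather than interpolated into the SQL string.

```python
import sqlite3


def fetch_page(conn, user_id, page, page_size=100):
    """Return one page of a user's rows, newest first. Pages are numbered from 1."""
    offset = (page - 1) * page_size  # page 2 with size 100 skips the first 100 rows
    return conn.execute(
        "SELECT id, date, domain FROM views "
        "WHERE user_id = ? "
        "ORDER BY date DESC, id DESC "  # id breaks ties so pages do not overlap
        "LIMIT ? OFFSET ?",
        (user_id, page_size, offset),
    ).fetchall()
```

With this shape, `fetch_page(conn, user_id, 2)` corresponds to the `LIMIT 100 OFFSET 100` query shown above.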

Using PHP to delete many rows in a large table from MySQL

I am having trouble deleting many rows in a large table. I am trying to delete 200-300k rows from a 2m rows table.
My PHP script is something like this
for ($i = 0; $i < 1000; $i++) {
    $query = "delete from record_table limit 100";
    $queryres = mysql_query($query) or die(mysql_error());
}
this is just an example of my script where I will delete 100 rows at a time running for 1000 times to delete 100k records.
However, the PHP script just seems to keep running forever and not returning anything.
But when I tried to run the query from command line, it seems to delete just fine, although it takes about 5-6 minutes to delete.
Could there be something else that is preventing the PHP script from executing the query? I tried deleting 100k in one query and the result is the same too.
The query that I really wanted to run is "DELETE FROM table WHERE (timeinlong BETWEEN 'timefrom' AND 'timeto')"
The timeinlong column is indexed.
Hopefully you have an ID field so you can just do something like this:
$delete = mysql_query("DELETE FROM records WHERE id > 1000");
That would keep the rows with an id of 1000 or less and remove everything else.
Perhaps adding a field to track deleted items will work for you. Then, rather than actually deleting the rows, you update 'deleted' to TRUE. Obviously your other queries need to be modified to select where deleted equals FALSE. But it's fast. Then you can trim the db via a script at some other time.
Why are you deleting in a loop so many times? That's not a good way to delete. If you have an auto-incremented id, use it in the WHERE clause
(e.g. delete from record_table where id < some id).
Or, if you do want to delete in a loop that runs for a long time, you should also call set_time_limit(0) so that PHP does not stop the script when it hits its execution time limit.
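If a looped delete is still wanted, one common shape is to delete a bounded batch per statement and stop when a batch removes nothing. The sketch below is an illustration under assumptions, using Python's sqlite3 module as a stand-in for MySQL; `record_table` and `timeinlong` come from the question, while `delete_in_batches` and its parameters are hypothetical. SQLite does not accept `DELETE ... LIMIT` by default, hence the rowid subquery; in MySQL, `DELETE FROM record_table WHERE timeinlong BETWEEN ? AND ? LIMIT ?` can be written directly.

```python
import sqlite3


def delete_in_batches(conn, time_from, time_to, batch_size=1000):
    """Delete rows in the given time range a batch at a time; return how many were removed."""
    total = 0
    while True:
        with conn:  # one short transaction per batch
            cur = conn.execute(
                "DELETE FROM record_table WHERE rowid IN ("
                "SELECT rowid FROM record_table "
                "WHERE timeinlong BETWEEN ? AND ? LIMIT ?)",
                (time_from, time_to, batch_size),
            )
        if cur.rowcount == 0:  # nothing left in the range
            break
        total += cur.rowcount
    return total
```

Keeping each batch in its own short transaction means locks are held briefly, and the loop ends based on the data rather than on a fixed iteration count.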

Server-side Pagination: total row count for expensive query?

I have a simple query using server-side pagination. The issue is the WHERE Clause makes a call to an expensive function and the functions argument is the user input, eg. what the user is searching for.
SELECT *
FROM (
    SELECT /*+ FIRST_ROWS(numberOfRows) */
        query.*,
        ROWNUM rn
    FROM (
        SELECT myColumns
        FROM myTable
        WHERE expensiveFunction(:userInput) = 1
        ORDER BY id ASC
    ) query
)
WHERE rn >= :startIndex
AND ROWNUM <= :numberOfRows
This works and is quick assuming numberOfRows is small. However I would also like to have the total row count of the query. Depending on the user input and database size the query can take up to minutes. My current approach is to cache this value but that still means the user needs to wait minutes to see first result.
The results should be displayed in the jQuery DataTables plugin, which greatly helps with things like server-side paging. However, it requires the server to return a value for the total number of records in order to display the paging controls correctly.
What would be the best approach? (Note: PHP)
I thought of returning the first page immediately with a fake (or, better, an estimated) row count. After the page has loaded, an ajax call would go to a method that determines the total row count of the query (what happens if the user pages during that time?) and then update the fake/estimated total row count.
However I have no clue how to do an estimate. I tried count(*) * 1000 with SAMPLE (0.1) but for whatever reason that actually takes longer than the full count query. Also just returning a fake/random value seems a bit hacky too. It would need to be bigger than 1 page size so that the "Next" button is enabled.
Other ideas?
One way to do it is as I said in the comments, to use a 'countless' approach. Modify the client side script in such a way that the Next button is always enabled and fetch the rows until there are none, then disable the Next button. You can always add a notification message to say that there are no more rows so it will be more user friendly.
Considering that you are expecting a significant amount of records, I doubt that the user will paginate through all the results.
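One way to implement the 'countless' approach without an extra round trip is to request one row more than the page size and use its presence to decide whether Next stays enabled. This is a variation on the idea above, not something the answer spells out. The sketch uses Python's sqlite3 module as a runnable stand-in for the Oracle query; the table `items` and the function `fetch_page_with_next` are hypothetical.

```python
import sqlite3


def fetch_page_with_next(conn, start_index, page_size):
    """Return (rows, has_next) without ever counting the full result set."""
    rows = conn.execute(
        "SELECT id, name FROM items ORDER BY id LIMIT ? OFFSET ?",
        (page_size + 1, start_index),  # ask for one extra row as a look-ahead
    ).fetchall()
    has_next = len(rows) > page_size
    return rows[:page_size], has_next
```

The client would enable the Next button while `has_next` is true and show a "no more rows" message once it turns false.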
Another way is to schedule a cron job that counts the records in the background and stores the result in a table called totals. The job's interval should be chosen based on the frequency of the inserts / deletions.
Then, in the frontend, just use the count previously stored in totals. It should give a decent approximation of the amount.
Depends on your DB engine.
In MySQL, the solution looks like this:
mysql> SELECT SQL_CALC_FOUND_ROWS * FROM tbl_name
-> WHERE id > 100 LIMIT 10;
mysql> SELECT FOUND_ROWS();
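Basically, you add a modifier to your SELECT (SQL_CALC_FOUND_ROWS) which tells MySQL to count the rows as if the LIMIT clause were not present while it executes the query, and FOUND_ROWS() then retrieves that number. Note that both SQL_CALC_FOUND_ROWS and FOUND_ROWS() are deprecated as of MySQL 8.0.17; the MySQL manual recommends running a separate SELECT COUNT(*) query with the same WHERE clause instead.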
For Oracle, see this article:
How can I perform this query in oracle
Other DBMSs might have something similar, but I don't know.

MYSQL rotate through rows by date

The query selects the oldest row from a records table that's not older than a given date. The given date is the last row queried which I grab from a records_queue table. The goal of the query is to rotate through the rows from old to new, returning 1 row at a time for each user.
SELECT `records`.`record_id`, MIN(records.date_created) as date_created
FROM (`records`)
JOIN `records_queue` ON `records_queue`.`user_id` = `records`.`user_id`
AND record_created > records_queue.record_date
GROUP BY `records_queue`.`user_id`
So on each query I'm selecting the oldest row, min(date_created), from records that is newer than the given date from records_queue. The query keeps returning rows until it reaches the newest record. At that point the same row is returned. Once the newest row has been reached, I want to return the oldest (start again from the bottom, one full rotation). How is that possible using 1 query?
From the code you have posted, one of two things is happening. Either this query returns a full recordset that your application then traverses using its own logic (this could be some variant of JavaScript if the page isn't reloading, or parameters passed to the PHP code that select which record to display if the page does reload each time), or the application updates records_queue.record_date to bring back the next record. That said, I can't see anything in the query you posted that limits it to fetching a single record.
Either way, you will need to modify the application logic, not this query to achieve the outcome you are asking for.
Edit: In the section of code that updates the queue, check whether the value in records_queue.record_date equals the date of the newest record. If it does, run something like update records_queue set record_date = (select min(theDateColumn) from records) instead of the current logic, which just updates it with the date currently being looked at.
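The wrap-around can also live entirely in application code: look for the next record after the queue's date and, if there is none, fall back to the oldest one. The following is a rough sketch under assumptions, using Python's sqlite3 module as a stand-in for MySQL. The table names come from the question, but the exact columns, the function `next_record`, the assumption that each user already has a row in records_queue, and the assumption that a user's date_created values are distinct are all mine.

```python
import sqlite3


def next_record(conn, user_id):
    """Return the user's next (record_id, date_created), wrapping to the oldest at the end."""
    # Assumes the user already has a row in records_queue.
    (last_date,) = conn.execute(
        "SELECT record_date FROM records_queue WHERE user_id = ?", (user_id,)
    ).fetchone()
    row = conn.execute(
        "SELECT record_id, date_created FROM records "
        "WHERE user_id = ? AND date_created > ? "
        "ORDER BY date_created LIMIT 1",
        (user_id, last_date),
    ).fetchone()
    if row is None:
        # Past the newest record: start again from the oldest one.
        row = conn.execute(
            "SELECT record_id, date_created FROM records "
            "WHERE user_id = ? ORDER BY date_created LIMIT 1",
            (user_id,),
        ).fetchone()
    if row is not None:
        with conn:  # remember where this user's rotation now stands
            conn.execute(
                "UPDATE records_queue SET record_date = ? WHERE user_id = ?",
                (row[1], user_id),
            )
    return row
```

Because the fallback returns the oldest record itself and then stores that record's date, the oldest row is not skipped on the next pass.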

MySQL: selecting rows one batch at a time using PHP

I have a table that keeps user information (one row per user), and I run a PHP script daily to fill in information I get from users. For one column, say column A, if I find the information I fill it in; otherwise I don't touch it, so it remains NULL. The reason is to allow it to be filled in by a later update, when the information might have become available.
The problem is that I have too many rows to update. If I blindly SELECT all rows where column A is NULL, the result won't fit into memory. If I SELECT 5000 at a time, then the next SELECT of 5000 could return the same rows that didn't get updated last time, which would be an infinite loop...
Does anyone have any idea of how to do this? I don't have ID columns so I can't just say SELECT WHERE ID > X... Is there a solution (either on the MySQL side or on the php side) without modifying the table?
You'll want to use the LIMIT and OFFSET keywords.
SELECT [stuff] LIMIT 5000 OFFSET 5000;
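LIMIT indicates the maximum number of rows to return, and OFFSET indicates how many rows of the result to skip before the returned rows begin. Without an ORDER BY, the order of the rows is not guaranteed, so the same query may skip different rows from one run to the next.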
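One subtlety with OFFSET in this scenario: rows that get filled in drop out of the `A IS NULL` result set, so the offset should advance only past rows that were left NULL. The sketch below illustrates that bookkeeping, using Python's sqlite3 module as a stand-in for MySQL. It assumes the table has some unique column to order by (a hypothetical `username` here, since the question mentions one row per user), that no other process changes the table during the run, and that `lookup` is a caller-supplied function returning the new value or None.

```python
import sqlite3


def fill_in_batches(conn, lookup, batch_size=5000):
    """Fill column a where it is NULL, one batch at a time; return the number of rows filled."""
    offset = 0  # counts rows already examined that are still NULL
    filled = 0
    while True:
        rows = conn.execute(
            "SELECT username FROM users WHERE a IS NULL "
            "ORDER BY username LIMIT ? OFFSET ?",
            (batch_size, offset),
        ).fetchall()
        if not rows:
            break
        for (username,) in rows:
            value = lookup(username)
            if value is None:
                offset += 1  # row stays in the NULL set, so skip it next time
            else:
                conn.execute(
                    "UPDATE users SET a = ? WHERE username = ?",
                    (value, username),
                )
                filled += 1  # row leaves the NULL set, so no offset change
        conn.commit()
    return filled
```

Rows left NULL always sort before the rows not yet examined, so skipping exactly `offset` rows lands on fresh ones, and the loop terminates once every NULL row has been looked at once.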
