I'm no SQL magician, so I'm venturing to ask for help. I have 4 tables to insert into a 5th one while checking a 6th table to ensure there are no duplicates; for example, no names that appear in the 6th table may be inserted into the 5th one. I could probably try to figure out the best SQL query for the job, but I can't get my head around the right method. The final table is small for now (5,000 contact names) but will grow every month, so I want to start off right. I plan to use a PHP script with a MySQL connection to the database; this script will only run on my server (CentOS 5).
Without seeing the schema: in general, if you're going to prevent rows from entering a table based on the contents of other tables, in MySQL you'll want to use foreign keys. Beyond that, all of this should happen inside a database transaction, so that whatever logic you write in PHP to insert rows into the various tables either succeeds completely or fails and rolls back to the prior state.
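As a rough sketch of the duplicate check itself (the table and column names below are made up, since the schema isn't shown, and the tables are assumed to be InnoDB so the transaction actually applies), each source table can be copied with an INSERT ... SELECT that skips names already present in the 6th table:
START TRANSACTION;

-- contacts_src1 stands in for one of the four source tables,
-- contacts_final for the 5th table, excluded_names for the 6th
INSERT INTO contacts_final (name, email)
SELECT s.name, s.email
FROM contacts_src1 AS s
WHERE s.name NOT IN (SELECT e.name FROM excluded_names AS e);

-- repeat the same INSERT ... SELECT for the other three source tables

COMMIT;   -- or ROLLBACK if any statement failed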
Short story: I wrote code in a loop that inserts rows into a MySQL table, and accidentally kept it running for 3 hours. I tried to empty the table, but it automatically starts to fill with rows again. I even tried to drop and re-create it, but rows are still being inserted automatically (with the id in increasing order).
Full story:
2 days ago, I wrote code that calls an API and inserts the data into a MySQL table (in a loop). To measure how many records I had added, I made a table named "lastrequestdone" and inserted the id number after every record that was inserted successfully.
But the code inserted all the IDs within an hour, and after that the API started to return 404 errors and the loop started to run very fast.
So assume I ran a loop for 3 hours that inserts ids in increasing order. Once I realized it, I stopped the loop and tried to empty the table, but whenever I empty it, more rows automatically start getting inserted.
Then I dropped the table and created it again today, but the same thing is happening: rows are getting inserted automatically.
Look for the running processes on your MySQL server using a UI tool such as MySQL Workbench (or the like), find the process of the query you left running, and stop that particular process. That should resolve it.
This might also help: https://dba.stackexchange.com/questions/63302/how-to-stop-the-execution-of-a-long-running-insert-query
By "kill / stop the MySQL instance", do you mean the Linux command kill -9 ...?
Yes, this might lead to corrupt data.
What should not leave corrupt data, by contrast, is the built-in KILL command of MySQL. See the lower part of that page, which states that it might take the thread some time to actually notice the kill flag, and that updates are not rolled back when transactions are not used (which implies that they are rolled back if transactions are used), etc. I therefore assume this does not corrupt any data (aside from interrupting REPAIR and OPTIMIZE operations, as noted there).
You can find the thread id you need for this command with the command SHOW PROCESSLIST.
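For example, from the mysql client (the thread id below is only a placeholder):
SHOW PROCESSLIST;   -- lists the running threads with their Id, User and current statement
KILL 12345;         -- replace 12345 with the Id of the runaway INSERT thread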
I have a MySQL database that is becoming really large. I can feel the site becoming slower because of this.
Now, on a lot of pages I only need a certain part of the data. For example, I store information about users every 5 minutes for history purposes. But on one page I only need the information that is the newest (not the whole history of data). I achieve this by a simple MAX(date) in my query.
Now I'm wondering whether it wouldn't be better to make a separate table that stores just the latest data, so that the query doesn't have to search for a specific user's latest data among millions of rows, but instead reads a table that holds only the latest data for every user.
The con here would be that I have to run 2 queries to insert the latest history in my database every 5 minutes, i.e. insert the new data in the history table and update the data in the latest history table.
The pro would be that MySQL has a lot less data to go through.
What are common ways to handle this kind of issue?
There are a number of ways to handle slow queries in large tables. The three most basic ways are:
1: Use indexes, and use them correctly. It is important to avoid table scans on large tables; this is almost always your most significant performance hit with single queries.
For example, if you're querying something like: select max(active_date) from activity where user_id=?, then create an index on the activity table for the user_id column. You can have multiple columns in an index, and multiple indexes on a table.
CREATE INDEX idx_user ON activity (user_id)
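Going one step further (an assumption about your schema, but it matches the example query above), a composite index on (user_id, active_date) lets MySQL resolve the MAX(active_date) lookup from the index alone:
-- covers "SELECT MAX(active_date) FROM activity WHERE user_id = ?"
-- so MySQL can typically read the answer from the index without touching the table rows
CREATE INDEX idx_user_date ON activity (user_id, active_date)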
2: Use summary/"cache" tables. This is what you have suggested. In your case, you could apply an insert trigger to your activity table, which will update your summary table whenever a new row gets inserted. This means you won't need your code to execute two queries. For example:
CREATE TRIGGER update_summary
AFTER INSERT ON activity
FOR EACH ROW
UPDATE activity_summary SET last_active_date=new.active_date WHERE user_id=new.user_id
You can change that to check whether a row already exists for the user and do an insert if it is their first activity. Or you can insert a row into the summary table when a user registers... or whatever.
3: Review the query! Use MySQL's EXPLAIN command to grab the query plan and see what the optimizer does with your query. Use it to ensure that the optimizer is avoiding table scans on large tables (and either create or force an index if necessary).
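For instance, sticking with the activity table from the example above:
-- the "type", "key" and "rows" columns of the output show whether an index
-- is used and roughly how many rows MySQL expects to examine
EXPLAIN SELECT MAX(active_date) FROM activity WHERE user_id = 42;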
I have a PHP project where I have to insert more than 10,000 rows into an SQL table. The data is taken from one table, checked against some simple conditions, and inserted into a second table at the end of every month.
How should I do this?
I think I need to clarify: I currently transfer the data in small batches (250 inserts) using a PHP cron job, and it works fine, but I want to do this in the most appropriate way.
Which of these would be the most appropriate:
Cronjob with PHP as I currently use
Exporting to a file and BULK import method
Some sort of Stored procedure to transfer directly
or any other.
Use the INSERT SQL statement. :^)
It adds one or more rows to a table or a view in SQL Server 2012; see the documentation for examples.
Example using the mssql_* extension:
$server = 'KALLESPC\SQLEXPRESS';
$link = mssql_connect($server, 'sa', 'phpfi');
mssql_query("INSERT INTO STUFF(id, value) VALUES ('".intval($id)."','".intval($value)."')");
Since the data set is large, process it in batches of 500 records.
Check the conditions for a batch of 500 and insert it while the next batch of 500 is being prepared, and so on.
This keeps the load on your SQL server low.
This is the way I process 40k records daily.
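A single batch might look roughly like this in T-SQL (the table and column names and the condition are made up, and the source table is assumed to have an increasing id column that the cron job remembers between runs):
DECLARE @last_transferred_id INT = 0;   -- the PHP cron job would persist this between batches

INSERT INTO target_table (id, name, amount)
SELECT TOP (500) s.id, s.name, s.amount
FROM source_table AS s
WHERE s.id > @last_transferred_id
  AND s.amount > 0                      -- placeholder for "some simple conditions"
ORDER BY s.id;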
Use BULK INSERT - it is designed for exactly what you are asking and significantly increases the speed of inserts.
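For reference, a minimal BULK INSERT sketch (the file path, table name and format options are assumptions and depend on how you export the data):
BULK INSERT dbo.target_table
FROM 'C:\exports\monthly_rows.csv'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    FIRSTROW = 2        -- skip the header line
);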
Also (just in case you really do have no indexes at all), you may want to consider adding some - certain indexes (above all one on the primary key) may improve the performance of the inserts.
The actual rate at which you should be able to insert records will depend on the exact data, the table structure and also on the hardware / configuration of the SQL server itself, so I can't really give you any numbers.
SQL Server will not insert more than 1000 rows from a single VALUES list, so you have to create separate batches for the insertion. Here I am suggesting an alternative which may help you.
Create one stored procedure. Create two temporary tables, one for valid data and the other for invalid data. Check all your rules and validations one by one and, based on that, insert the data into these two tables.
If the data is valid, insert it into the valid temp table; otherwise insert it into the invalid temp table.
Next, using a MERGE statement, you can insert all of that data into your target table as per your requirements; a sketch follows below.
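A rough T-SQL sketch of that MERGE step (the temp table, target table and column names are hypothetical):
-- #valid_rows holds the data that passed validation,
-- dbo.monthly_target stands in for the destination table
MERGE dbo.monthly_target AS t
USING #valid_rows AS s
    ON t.id = s.id
WHEN MATCHED THEN
    UPDATE SET t.name = s.name, t.amount = s.amount
WHEN NOT MATCHED BY TARGET THEN
    INSERT (id, name, amount) VALUES (s.id, s.name, s.amount);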
You can transfer any number of records between tables this way, so I hope this works for you.
Thanks.
It's quite simple - you can do it with a simple while loop, since 10,000 rows is not a huge amount of data:
$query1 = mssql_query("SELECT TOP 10000 * FROM tblSource");
while ($sourcerow = mssql_fetch_object($query1)) {
    // quote the values so that non-numeric fields don't break the statement
    mssql_query("INSERT INTO tblTarget (field1, field2, fieldn) VALUES ('{$sourcerow->field1}', '{$sourcerow->field2}', '{$sourcerow->fieldn}')");
}
This should work fine.
I have 2 SQL tables belonging to a script that I need to keep in sync with one another; this can be done with a PHP cron job (that was my plan), except for one row.
Table 1              Table 2
row 1    <---->      row 1
row 2    <---->      row 2
row 3    no sync     row 3
Both databases are on the same server,
and the same user has full rights on both.
I am looking for PHP code to do this via a cPanel cron job.
As an afterthought, would it be best to merge the two so that both are updated with new data?
The issue is that, in the example above, I need row 3 not to change in either database.
I am very much a noob, so please be nice lol. Thanks in advance.
Update:
I should learn how to explain a bit better.
Both databases are control panels for sites. One of the table rows has the system URL in it, so if I share the database, "site 2" links refer back to "site 1". This is a complex problem for me as I am very new to this.
What I need is to keep both databases up to date except for that single row, which in turn needs to be different in each database.
I have not tried anything just yet as I wouldn't know where to start :( lol
You don't have to use cron. Current versions of MySQL support TRIGGERS and EVENTS.
You can use a TRIGGER to copy data to another table. The copy (or any other operation) is triggered by some event (like an insert, update or delete on a table). A trigger may run any query or any procedural SQL code.
The other option is an EVENT. This is something like an internal task scheduler built into MySQL. It can also run queries or any procedural SQL code, but it is triggered by the system time (like Linux cron). It has many advantages compared to cron.
Procedural SQL is SQL with loops, variables and more.
If you think you are a "noob", I have a cure for you: read books about MySQL or, if you are lazy, watch some tutorials (http://thenewboston.org, http://phpacademy.org).
Nobody here will write the code for you; we can only fix a bug, give advice, etc. :)
EDIT.
Example of an EVENT:
-- this is a comment in SQL (the line starts with --)
CREATE EVENT event_daily_copy_something
ON SCHEDULE
EVERY 1 DAY
COMMENT 'This text will appear in MySQL Workbench as description of event'
DO
BEGIN
INSERT INTO your_db_name.target_table_name(id, field)
SELECT id, something
FROM your_db_name.source_table_name
WHERE id = 3;
END
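One practical note on running this: when creating an event with a BEGIN ... END body from the mysql command-line client, you have to change the statement delimiter around the CREATE EVENT (for example with DELIMITER $$ ... $$), and the event scheduler must be switched on (SET GLOBAL event_scheduler = ON) before events actually run.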
Synchronization of tables is quite complicated. I think you need a few operations in the event:
Check for new rows and copy them.
Check for deleted rows and delete them in the "copy" table.
Check for changed rows (here a trigger on the source table would be very useful, because the trigger "knows" which row is being edited, and you can access the new field values in table 1 and use them to update table 2); a sketch of such a trigger follows below.
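A minimal sketch of that trigger, using made-up database, table and column names, with the id = 3 row skipped so it can stay different in each database:
DELIMITER $$
CREATE TRIGGER trg_sync_update
AFTER UPDATE ON site1_db.settings
FOR EACH ROW
BEGIN
    -- skip the one row (e.g. the system URL) that must differ between the sites
    IF NEW.id <> 3 THEN
        UPDATE site2_db.settings
        SET value = NEW.value
        WHERE id = NEW.id;
    END IF;
END$$
DELIMITER ;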
One MySQL tutorial series: thenewboston on YouTube.
I have a scraper which visits many sites and finds upcoming events, and another script which is actually supposed to put them in the database. Currently, inserting into the database is my bottleneck, and I need a faster way to batch the queries than what I have now.
What makes this tricky is that a single event has data across three tables which have keys to each other. To insert a single event I insert the location or get the already existing id of that location, then insert the actual event text and other data or get the event id if it already exists (some are repeating weekly etc.), and finally insert the date with the location and event ids.
I can't use REPLACE INTO because it would orphan older data with those same keys. I asked about this in Tricky MySQL Batch Query, but the TL;DR outcome was that I have to check which keys already exist, preallocate those that don't, and then make a single insert for each of the tables (i.e. do most of the work in PHP). That's great, but the problem is that if more than one batch is processing at a time, they could both choose to preallocate the same keys and then overwrite each other. Is there any way around this? Then I could go back to that solution. The batches have to be able to work in parallel.
What I have right now is that I simply turn off indexing for the duration of the batch and insert each of the events separately, but I need something faster. Any ideas would be helpful on this rather tricky problem. (The tables are InnoDB now... could transactions help solve any of this?)
I'd recommend starting with MySQL LOCK TABLES, which you can use to prevent other sessions from writing to the tables while you insert your data.
For example you might do something similar to this
mysql_connect("localhost","root","password");
mysql_select_db("EventsDB");
mysql_query("LOCK TABLE events WRITE");
$firstEntryIndex = mysql_insert_id() + 1;
/*Do stuff*/
...
mysql_query("UNLOCK TABLES);
The above does two things. Firstly, it locks the table, preventing other sessions from writing to it until the point where you're finished and the unlock statement is run. Secondly, $firstEntryIndex is the first key value that will be used in any subsequent insert queries.