Hello, this is my first post, so hopefully I won't mess up too much.
Basically I'm trying to copy two tables into a new table. The data in tables 2 and 3 is temp data that I update from two CSV files. It's just basic data that shares the same ID, so that's the primary key, and I want the two to be combined into a new table. This is supposed to be done just once a day, handling about 2000 lines. Below is a better description of what I'm looking for.
3 tables, Core, temp_data1, temp_data2
temp_data1 has id, name, product
temp_data2 has id, description
id is unique, since it's the product_nr of the product.
First, copy the data from temp_data1 to Core: insert a new row if the product does not exist; if it does exist, update the row with the new information.
Next, update Core with the description where id = id, and do not insert if the id does not exist (that should not happen).
I'm looking for something that can be done with one push of a button: first I upload the CSV files into the two different tables (two different files), then I push a button to merge the two tables into the Core one. I know you can do this right away with the two CSV files and skip the two tables, but I feel like that is so far over my head it's not even funny.
I can handle programming in PHP; it's all the MySQL stuff that's messing with my head.
Hopefully you guys can help me, and in return I will help out anywhere else I can.
Thanks in advance.
I'm not sure I understand it correctly, but this can be done in SQL alone, using INSERT INTO ... SELECT ... ON DUPLICATE KEY UPDATE ... - see http://dev.mysql.com/doc/refman/5.6/en/insert-select.html
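To make the two steps concrete, here is a minimal sketch of the merge. It is only an illustration under assumptions: it uses Python with an in-memory SQLite database so it can be run anywhere, and SQLite spells the upsert as ON CONFLICT ... DO UPDATE where MySQL uses ON DUPLICATE KEY UPDATE. The table and column names follow the question; the function name merge_into_core and the sample rows are made up.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL schema in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE core (id INTEGER PRIMARY KEY, name TEXT, product TEXT, description TEXT);
    CREATE TABLE temp_data1 (id INTEGER PRIMARY KEY, name TEXT, product TEXT);
    CREATE TABLE temp_data2 (id INTEGER PRIMARY KEY, description TEXT);
    INSERT INTO core VALUES (1, 'old name', 'old product', NULL);
    INSERT INTO temp_data1 VALUES (1, 'new name', 'new product'), (2, 'second', 'widget');
    INSERT INTO temp_data2 VALUES (1, 'first description'), (3, 'orphan, must not be inserted');
""")

def merge_into_core(conn):
    """Hypothetical helper: run both merge steps in one transaction."""
    with conn:
        # Step 1: insert new products, update the ones that already exist.
        # (SQLite needs the WHERE clause here to parse INSERT ... SELECT ... ON CONFLICT.)
        conn.execute("""
            INSERT INTO core (id, name, product)
            SELECT id, name, product FROM temp_data1 WHERE true
            ON CONFLICT(id) DO UPDATE SET name = excluded.name, product = excluded.product
        """)
        # Step 2: fill in descriptions for ids that exist in core; never insert.
        conn.execute("""
            UPDATE core
            SET description = (SELECT description FROM temp_data2 WHERE temp_data2.id = core.id)
            WHERE id IN (SELECT id FROM temp_data2)
        """)

merge_into_core(conn)
rows = conn.execute("SELECT id, name, product, description FROM core ORDER BY id").fetchall()
print(rows)
```

Because the second step is a plain UPDATE, an id that appears only in temp_data2 is simply ignored, which matches the "do not insert" requirement.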
Related
I have the following problem:
I have got a dataset inside a text file (not xml or csv encoded or something, just field values separated by \t and \n) which is updated every 2 minutes. I need to put the data from the file into a MariaDB Database, which itself is not very difficult to do.
What I am unsure about however, is how I would go about updating the table if the file's contents change. I thought about truncating the table and then filling it again, but doing that every 2 minutes with about 1000 datasets would mean some nasty problems with the database being incomplete during those updates, which makes it not a usable solution (which it wouldn't have been with fewer datasets either :D)
Another solution I thought about was to append the new data to the existing table and use a delimiter on the unique column (e.g. use ids 1-1000 before the update, append the data, then use ids 1001-2000 after the update and remove 1-1000; after two or so updates, start at id 1 again).
Updating the changing fields is not an option, because the raw data format would make that really difficult to keep track of the column that has changed (or hasn't)
I am, however unsure about best practices, as I am relatively new to SQL and stuff, and would like to hear your opinion, maybe I am just overlooking something obvious...
Even better...
CREATE TABLE `new` LIKE `real`; -- permanent, not TEMPORARY
-- load `new` from the incoming data
RENAME TABLE `real` TO `old`, `new` TO `real`;
DROP TABLE `old`;
Advantages:
The table real is never invisible, nor empty, to the application.
The RENAME is "instantaneous" and "atomic".
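A rough, runnable sketch of the same load-then-swap idea follows. It is an assumption-laden stand-in, not the MySQL/MariaDB statements themselves: it uses Python with SQLite, where there is no CREATE TABLE ... LIKE and no multi-table RENAME TABLE, so the two renames are wrapped in one transaction instead. The table name live_data and the helper reload_by_swap are hypothetical.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode,
# so the BEGIN/COMMIT around the swap is issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE live_data (id INTEGER PRIMARY KEY, value TEXT)")
conn.execute("INSERT INTO live_data VALUES (1, 'stale')")

def reload_by_swap(conn, fresh_rows):
    """Hypothetical helper: load a shadow table, then swap it in."""
    # Build and fill the shadow table while readers still see the old one.
    conn.execute("DROP TABLE IF EXISTS live_data_new")
    conn.execute("CREATE TABLE live_data_new (id INTEGER PRIMARY KEY, value TEXT)")
    conn.executemany("INSERT INTO live_data_new VALUES (?, ?)", fresh_rows)
    # Swap: both renames and the drop succeed or fail together.
    conn.execute("BEGIN")
    try:
        conn.execute("ALTER TABLE live_data RENAME TO live_data_old")
        conn.execute("ALTER TABLE live_data_new RENAME TO live_data")
        conn.execute("DROP TABLE live_data_old")
        conn.execute("COMMIT")
    except sqlite3.Error:
        conn.execute("ROLLBACK")
        raise

reload_by_swap(conn, [(1, "fresh"), (2, "also fresh")])
rows = conn.execute("SELECT id, value FROM live_data ORDER BY id").fetchall()
tables = [r[0] for r in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(rows, tables)
```

The slow part (loading the file) happens entirely on the shadow table, so the application only ever sees a complete old table or a complete new one.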
As suggested by Alex, I will create a temporary table, insert my data into the temporary table, truncate the production table and then insert from the temporary table. Works like a charm!
I'm not a database expert, so I'm not sure how to ask this question briefly and succinctly. I am trying to copy data with the following characteristics: many of the tables with data being copied contain references to other tables with data being copied; i.e., a patient might attend a class where their weight is recorded, so I need to copy both the class attendance row as well as the weight value stored in another table, which is referenced by the class attendance row. There are other, even more complex, examples in this database, but it seems that I need to perform some kind of recursive copy of these inter-referenced items so I can maintain the cross-references in the copied data.
So, is there any kind of standard approach to this problem? If there isn't a direct answer, could someone share the terminology of what I'm trying to do so that I can look it up on my own? I'm certain this problem has been tackled many times before, but I don't know how to find the solution. I understand the basic concepts of JOINs and FKs, but this solution seems to require a way to copy the rows from various tables while also going back and updating the cross-references (in some cases, these are FKs, and in other cases, they are not; I'm stuck with the schema as it is).
PS: If it's such an obvious solution, why won't anyone just provide it or characterize it below so we can move on? Most of humanity is capable of asking the occasional dumb question, and this may very well be one of mine, but I'm seriously stuck on this one and would appreciate some assistance.
Here's a sketch of a small part of the schema to try to illustrate the issue:
When we copy a patient's data record, we need to 1) create a new row in patient; 2) create a corresponding new row in edclass_session_labs; 3) create a new row in patient_lab_weight; and (here's what I see as the tricky part) 4) also update the reference in edclass_session_labs to the new row in patient_lab_weight. What I'm looking for is a way to do this programmatically and algorithmically. I'm sure problems like this have been tackled before, so that's why I'm asking for advice here.
I didn't fully understand what you mean by "copy patient data", so there are two options:
1) If you want to "copy" the data to a report, you need to link many tables with related information, so you have to study the concept of JOINs and FOREIGN KEYs. This is what we do when we need to convert relational data into a flat table that can be easily read by non-IT people.
2) If you need to copy specific data from database tables to other database tables, you also have to study FOREIGN KEYs and table relationships. You need to understand how table rows relate to rows in other tables (one to many, many to one, many to many), so you can create INSERT statements based on SELECTs that will filter the exact data you need.
This is very general, but I think it's sufficient to point you in the right direction.
EDIT:
Since the issue is related to creating a merged structure of patient data, let's say we have patient 1 and patient 2. They are duplicates of the same person, and need to be merged. I would do this, in this order:
a) Create a patient 3, this one will be the target of our merging. Simply copy each field from patients 1 or 2 to this new record.
b) Create as many new records as needed in table "patient_lab_weight". For example: if patient 1 has 2 records there, and patient 2 has 4 records, you will have to create 6 records, which are copies of the records related to patients 1 and 2, but with patient_id 3. After creating each record here, obtain the auto_increment value generated for field "patient_lab_weight_id", and insert a new record in "edclass_session_labs" with patient_id = 3 and "patient_lab_weight_id" = the obtained ID. Do that for each insert on "patient_lab_weight".
c) after all that, disable patients 1 and 2 in your application.
If you use this approach, you will slowly build up your new structure, linked in a consistent way.
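Steps a) and b) could be sketched roughly like this. Everything here is an assumption made for illustration: the column lists are invented (the real schema was not shown), Python with SQLite stands in for the actual database, and the helper merge_patients is hypothetical. The key idea is to remember which new auto-generated ID each old row was copied to, and use that mapping when copying the rows that reference it. The sketch assumes every copied session row references a weight row of one of the source patients.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified columns; only the table names come from the question.
conn.executescript("""
    CREATE TABLE patient (patient_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE patient_lab_weight (
        patient_lab_weight_id INTEGER PRIMARY KEY, patient_id INTEGER, weight REAL);
    CREATE TABLE edclass_session_labs (
        id INTEGER PRIMARY KEY, patient_id INTEGER, patient_lab_weight_id INTEGER);
    INSERT INTO patient (name) VALUES ('Pat (record A)'), ('Pat (record B)');
    INSERT INTO patient_lab_weight (patient_id, weight) VALUES (1, 70.5), (2, 71.0), (2, 69.8);
    INSERT INTO edclass_session_labs (patient_id, patient_lab_weight_id)
        VALUES (1, 1), (2, 2), (2, 3);
""")

def merge_patients(conn, source_ids, merged_name):
    """Copy the rows of source_ids to a new patient, remapping cross-references."""
    marks = ",".join("?" * len(source_ids))
    with conn:
        # a) create the target patient and keep its generated id
        new_patient_id = conn.execute(
            "INSERT INTO patient (name) VALUES (?)", (merged_name,)).lastrowid
        # b) copy each weight row, recording old id -> new id
        old_weights = conn.execute(
            f"SELECT patient_lab_weight_id, weight FROM patient_lab_weight "
            f"WHERE patient_id IN ({marks}) ORDER BY patient_lab_weight_id",
            source_ids).fetchall()
        id_map = {}
        for old_id, weight in old_weights:
            cur = conn.execute(
                "INSERT INTO patient_lab_weight (patient_id, weight) VALUES (?, ?)",
                (new_patient_id, weight))
            id_map[old_id] = cur.lastrowid
        # copy the referencing rows, pointing them at the new weight ids
        sessions = conn.execute(
            f"SELECT patient_lab_weight_id FROM edclass_session_labs "
            f"WHERE patient_id IN ({marks})", source_ids).fetchall()
        for (old_weight_id,) in sessions:
            conn.execute(
                "INSERT INTO edclass_session_labs (patient_id, patient_lab_weight_id) "
                "VALUES (?, ?)", (new_patient_id, id_map[old_weight_id]))
    return new_patient_id, id_map

new_patient_id, id_map = merge_patients(conn, [1, 2], "Pat (merged)")
copied = conn.execute(
    "SELECT patient_lab_weight_id FROM edclass_session_labs "
    "WHERE patient_id = ? ORDER BY patient_lab_weight_id", (new_patient_id,)).fetchall()
print(new_patient_id, id_map, copied)
```

The old-to-new ID map is the part that generalizes: with more tables, copy parents first, keep one map per table, and translate every reference through the matching map.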
I have a question to which I have been unable to find the answer.
I can create an extra column in a PHP recordset by using an existing column and duplicating it:
SELECT
id_tst,
name_tst,
age_tst,
price_tst,
price_tst AS newprice_tst
FROM test_tst
From what I can work out, AS will only duplicate an existing column or rename a column in the recordset.
I want to add two extra columns to the recordset, but with no values.
I know a lot of people will ask what the point of that is; it seems pointless to have 2 columns with no data.
The reason is that I am building a price-updating module for a CMS, where the user can download a .csv file containing prices, modify the prices in a spreadsheet, then re-upload the CSV to update the prices.
The two extra columns would hold the new prices while keeping the old ones, so a rollback from the CSV file could be performed if necessary.
I could just get the client to add the two new columns to the spreadsheet, but I would prefer to have the exported CSV with the columns already in place.
Is it possible to create blank columns when creating a recordset?
You can create empty "dummy" columns by aliasing a blank string:
SELECT '' AS emptyColumn, column1, column2 FROM table
This will produce another column in your query with all blank values.
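As a rough illustration of how the blank aliased columns end up in the exported file, here is a small sketch. It assumes Python with an in-memory SQLite table in place of the PHP/MySQL setup; the table and the first column names come from the question, while newprice2_tst, the sample rows and the helper export_prices_csv are made up.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test_tst (id_tst INTEGER PRIMARY KEY, name_tst TEXT, age_tst INTEGER, price_tst REAL);
    INSERT INTO test_tst VALUES (1, 'Widget', 3, 9.99), (2, 'Gadget', 5, 19.5);
""")

def export_prices_csv(conn):
    """Hypothetical helper: export prices plus two empty columns for the new prices."""
    cur = conn.execute("""
        SELECT id_tst, name_tst, price_tst,
               '' AS newprice_tst,      -- blank column for the user to fill in
               '' AS newprice2_tst      -- second blank column (name is an assumption)
        FROM test_tst
        ORDER BY id_tst
    """)
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow([col[0] for col in cur.description])  # header row from the aliases
    writer.writerows(cur)
    return buf.getvalue()

csv_text = export_prices_csv(conn)
print(csv_text)
```

The aliases become the CSV headers, so the spreadsheet opens with the empty columns already in place.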
I've got a PHP script pulling a file from a server and plugging the values in it into a Database every 4 hours.
This file can, and most likely will, change within the 4 hours (or whatever timeframe I finally choose). It's a list of properties and their owners.
Would it be better to check the file and compare it to each DB entry and update any if they need it, or create a temp table and then compare the two using an SQL query?
Neither.
What I'd personally do is run the INSERT command using ON DUPLICATE KEY UPDATE (assuming your table is properly designed and that you are using at least one piece of information from your file as UNIQUE key which you should based on your comment).
Reasons
Creating temp table is a hassle.
Comparing is a hassle too. You need to select a record, compare a record, if not equal update the record and so on - it's just a giant waste of time to compare a piece of info and there's a better way to do it.
It would be so much easier if you just insert everything you find and if a clash occurs - that means the record exists and most likely needs updating.
That way you take care of everything with one query and your data integrity is preserved as well, so you can just keep filling your table with new records or updating existing ones.
I think it would be best to download the file and update the existing table, maybe using REPLACE or REPLACE INTO. "REPLACE works exactly like INSERT, except that if an old row in the table has the same value as a new row for a PRIMARY KEY or a UNIQUE index, the old row is deleted before the new row is inserted." http://dev.mysql.com/doc/refman/5.0/en/replace.html
Presumably you have a list of columns that will have to match in order for you to decide that the two things match.
If you create a UNIQUE index over those columns then you can use either INSERT ... ON DUPLICATE KEY UPDATE (manual) or REPLACE INTO ... (manual)
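The two suggestions behave differently when the table has columns that the file does not supply, so here is a small comparison sketch. Assumptions: Python with SQLite as a stand-in (SQLite writes the upsert as ON CONFLICT ... DO UPDATE rather than ON DUPLICATE KEY UPDATE, and also accepts REPLACE INTO); the properties table, its columns and both helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: 'notes' is a column the imported file knows nothing about.
conn.execute("""
    CREATE TABLE properties (
        property_ref TEXT PRIMARY KEY,
        owner TEXT,
        notes TEXT
    )
""")
conn.execute("INSERT INTO properties VALUES ('P-1', 'Alice', 'keep me')")
conn.commit()

# Rows as they might come out of the parsed file.
file_rows = [("P-1", "Bob"), ("P-2", "Carol")]

def upsert_rows(conn, rows):
    """Insert new rows; on a key clash update only the supplied column."""
    with conn:
        conn.executemany("""
            INSERT INTO properties (property_ref, owner) VALUES (?, ?)
            ON CONFLICT(property_ref) DO UPDATE SET owner = excluded.owner
        """, rows)

def replace_rows(conn, rows):
    """On a key clash the old row is deleted, then the new one is inserted."""
    with conn:
        conn.executemany(
            "REPLACE INTO properties (property_ref, owner) VALUES (?, ?)", rows)

query = "SELECT property_ref, owner, notes FROM properties ORDER BY property_ref"
upsert_rows(conn, file_rows)
after_upsert = conn.execute(query).fetchall()
replace_rows(conn, file_rows)
after_replace = conn.execute(query).fetchall()
print(after_upsert)
print(after_replace)
```

After the upsert the existing notes value survives; after the replace it is gone, because the old row was deleted first. That is worth keeping in mind if other tables or columns depend on the existing rows.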
I'm working on a basic php/mysql CMS and have a few questions regarding performance.
When viewing a blog page (or other sortable data) from the front-end, I want to allow a simple 'sort' variable to be added to the querystring, allowing posts to be sorted by any column. Obviously I can't accept anything from the querystring, and need to make sure the column exists on the table.
At the moment I'm using
SHOW TABLES;
to get a list of all of the tables in the database, then looping the array of table names and performing
SHOW COLUMNS;
on each.
My worry is that my CMS might take a performance hit here. I thought about using a static array of the table names but need to keep this flexible as I'm implementing a plugin system.
Does anybody have any suggestions on how I can keep this more concise?
Thank you
If you are using MySQL 5+ then you'll find the information_schema database useful for your task. In this database you can access information about tables, columns and references with simple SQL queries. For example, you can check whether a specific column exists in a table:
SELECT COUNT(*) FROM information_schema.COLUMNS
WHERE
TABLE_SCHEMA='your_database_name' AND
TABLE_NAME='your_table' AND
COLUMN_NAME='your_column';
Here is a list of the tables that have a specific column:
SELECT TABLE_SCHEMA, TABLE_NAME FROM information_schema.COLUMNS WHERE COLUMN_NAME='your_column';
Since you're currently hitting the db twice before you do your actual query, you might want to consider just wrapping the actual query in a try{} block. Then if the query works you've only done one operation instead of 3. And if the query fails, you've still only wasted one query instead of potentially two.
The important caveat (as usual!) is that any user input be cleaned before doing this.
You could query the table up front and store the columns in a cache layer (e.g. memcache or APC). You could then set the expiry time on the cache entry to infinite and only delete and re-create the entry when a plugin has been newly added, updated, etc.
I guess the best bet is to put all the stuff you're getting from SHOW TABLES etc. in a file already and just include it, instead of running that every time. Or implement some sort of caching if the project is still in development and you think the fields will change.
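Putting the whitelist and the caching suggestions together might look something like the sketch below. It is only an illustration under assumptions: Python with SQLite instead of PHP/MySQL, SQLite's pragma_table_info standing in for information_schema.COLUMNS, an in-process lru_cache standing in for memcache/APC, and a hypothetical posts table with helpers table_columns and sorted_posts.

```python
import sqlite3
from functools import lru_cache

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO posts (title, created_at) VALUES (?, ?)",
    [("b post", "2020-01-02"), ("a post", "2020-01-03")],
)

@lru_cache(maxsize=None)
def table_columns(table):
    """Look up a table's column names once and cache the result."""
    rows = conn.execute("SELECT name FROM pragma_table_info(?)", (table,)).fetchall()
    return frozenset(name for (name,) in rows)

def sorted_posts(sort_param, default="id"):
    """Sort by the requested column only if it is a real column of posts."""
    column = sort_param if sort_param in table_columns("posts") else default
    # Safe to interpolate: 'column' is either the default or a name read from the schema.
    return conn.execute(f'SELECT id, title FROM posts ORDER BY "{column}"').fetchall()

by_title = sorted_posts("title")
by_bogus = sorted_posts("title; DROP TABLE posts")  # not a column, falls back to id
cache_hits = table_columns.cache_info().hits
print(by_title, by_bogus, cache_hits)
```

When a plugin adds or changes tables, calling table_columns.cache_clear() would force the column list to be read again on the next request.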