How to compare the query structure of two SQL files in PHP

I have two SQL files (t1.sql and t2.sql). They are two different versions and both contain CREATE queries. I want to compare the queries in the two files to check whether the table structures have changed.
For example:
t1.sql file:
CREATE TABLE `static_ids` (
`access_id` int(11) unsigned NOT NULL DEFAULT '0',
`group_id` int(11) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`access_id`,`group_id`)
) ENGINE=MyISAM;
t2.sql file:
CREATE TABLE `static_ids` (
`access_id` int(11) unsigned NOT NULL DEFAULT '0',
`group_id` int(11) unsigned NOT NULL DEFAULT '0',
`group_name` varchar(50) NOT NULL DEFAULT '',
PRIMARY KEY (`access_id`,`group_id`)
) ENGINE=MyISAM;
In this example, the structure of the CREATE query for the table static_ids is different in the two files.
Could you give me any idea how to compare the queries in two SQL files with PHP? Thanks in advance.

To reliably check if the structure is the same using only PHP, you would basically have to rewrite the MySQL parser. It would have to understand, for instance, that
id int;
is equivalent to
`id` int(11) signed DEFAULT NULL;
The much easier solution is to just run the CREATE TABLE command on a blank MySQL database, then do DESCRIBE table_name to get a list of all the columns that were created. You can then run through the lists of columns for the two tables in a PHP for loop and compare them.
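As a rough sketch of that approach in PHP (the PDO DSN, the credentials and the assumption that each .sql file holds a single CREATE TABLE statement are placeholders, not something from the answer above):
<?php
// Build each table on a scratch database, DESCRIBE it, then drop it again.
$pdo = new PDO('mysql:host=localhost;dbname=scratch', 'user', 'password');

function describeFromCreate(PDO $pdo, string $createSql, string $table): array
{
    $pdo->exec("DROP TABLE IF EXISTS `$table`");
    $pdo->exec($createSql);
    $columns = $pdo->query("DESCRIBE `$table`")->fetchAll(PDO::FETCH_ASSOC);
    $pdo->exec("DROP TABLE IF EXISTS `$table`");
    return $columns;
}

$cols1 = describeFromCreate($pdo, file_get_contents('t1.sql'), 'static_ids');
$cols2 = describeFromCreate($pdo, file_get_contents('t2.sql'), 'static_ids');

// Compare the normalized column definitions returned by DESCRIBE.
if (count($cols1) !== count($cols2)) {
    echo "Structures differ: different number of columns\n";
} else {
    foreach ($cols1 as $i => $col) {
        if ($col != $cols2[$i]) {
            echo "Column `{$col['Field']}` differs\n";
        }
    }
}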

Related

MySQL INSERT ... OR UPDATE Explanation

I know there are many examples of how to use this, as well as links to the MySQL documentation. Unfortunately, I am still in need of clarification on how it actually works.
For instance, the following table structure (SQL code) is one example of what I need to use INSERT ... OR UPDATE on:
CREATE TABLE IF NOT EXISTS `occt_category` (
`category_id` int(11) NOT NULL,
`image` varchar(255) DEFAULT NULL,
`parent_id` int(11) NOT NULL DEFAULT '0',
`top` tinyint(1) NOT NULL,
`column` int(3) NOT NULL,
`sort_order` int(3) NOT NULL DEFAULT '0',
`status` tinyint(1) NOT NULL,
`date_added` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`date_modified` datetime NOT NULL DEFAULT '0000-00-00 00:00:00'
) ENGINE=MyISAM AUTO_INCREMENT=0 DEFAULT CHARSET=utf8;
ALTER TABLE `occt_category` ADD PRIMARY KEY (`category_id`);
ALTER TABLE `occt_category` MODIFY `category_id` int(11) NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=0;
What I am attempting to insert into this mess are new categories from an API source, so there are definitely duplicates.
What I am getting from the API is the following:
[
{
"categoryID": 81,
"name": "3/4 Sleeve",
"url": "3-4sleeve",
"image": "Images/Categories/81_fm.jpg"
}
]
So given the above information: do I need to change my table structure to check for duplicates coming in from the API?
In MSSQL I would simply do an IF EXISTS ... statement to check for duplicates. Unfortunately, this is MySQL :(.
If you intend to make use of the INSERT ... ON DUPLICATE KEY UPDATE MySQL syntax (which is what I understand from your question, as INSERT ... OR UPDATE is not a real MySQL command), then your current table structure is fine and you will NOT have to check for duplicate records.
The way this works is that before writing a new record into your table, MySQL will first check whether any existing record has the same value in a PRIMARY or UNIQUE key field (in your case category_id) as the corresponding field of the incoming record. If it finds one, it will simply update that record instead of writing a new one.
You can read more about this syntax here.
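For illustration, a minimal sketch of such a statement for the occt_category table above (the values for parent_id, top, column, sort_order and status are placeholders, since the API response does not include them):
INSERT INTO `occt_category` (`category_id`, `image`, `parent_id`, `top`, `column`, `sort_order`, `status`, `date_added`)
VALUES (81, 'Images/Categories/81_fm.jpg', 0, 1, 1, 0, 1, NOW())
ON DUPLICATE KEY UPDATE
  `image` = VALUES(`image`),
  `date_modified` = NOW();
Because category_id is the PRIMARY KEY, inserting categoryID 81 a second time updates the existing row instead of creating a duplicate.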

Dynamically update sql columns based on number of entries

I want to create a table like below:
id | timestamp | neighbour1_id | neighbour1_email | neighbour2_id | neighbour2_email
and so on upto max neighbour 20.
I have two questions:
Should I create the columns statically, or is there a way to create columns dynamically using PHP based on the count of the JSON array?
In either case, how would I refer to the columns dynamically and assign values to them based on the JSON array?
My jsonArray would look something like:
{id:123, email_id:abc, neighbours: [{neighbour1_id:234, neighbour1_email: bcd}, {neighbour2_id:345, neighbour2_email:dsf}, {}, {}...]}
Please advise. Thanks.
It looks like you need to rethink your database structure a bit. To me it seems you need a single users (or whatever they are) table:
CREATE TABLE `users` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`email` varchar(255) NOT NULL,
`created_at` timestamp NOT NULL,
PRIMARY KEY (`id`)
);
And another table that defines relations between those users:
CREATE TABLE `neighbors` (
`parent` int(11) unsigned NOT NULL,
`child` int(11) unsigned NOT NULL,
PRIMARY KEY (`parent`,`child`)
);
Now you can add as many neighbors to each user as you want. Fetching them is as easy as:
SELECT * FROM `users`
LEFT JOIN `neighbors` ON `users`.`id` = `neighbors`.`child`
WHERE `neighbors`.`parent` = ?
Here the question mark becomes the id of the user whose neighbors you are fetching, preferably bound through a prepared statement.
If it is all JSON you will be working with, and querying isn't much of an issue, you could consider working with a NoSQL database or document store (like Redis or MongoDB), but that is an entirely different story.
Just repeating a bunch of columns x times is definitely not the way to go. Vertical size (number of rows) of tables in relational databases is no big issue; they are designed for that. Horizontal size (number of columns), however, is something to be careful with, as it may make your db unnecessarily large and decrease performance.
Just consider what you would have to do if you wanted to find a user that has a neighbour with email address [x]. You would have to repeat your WHERE condition 20 times, once for each possible email column. And that is just one example...
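As a rough sketch of the prepared-statement approach mentioned above (the PDO DSN and credentials are placeholders):
<?php
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'password');
$stmt = $pdo->prepare(
    'SELECT `users`.* FROM `users`
     LEFT JOIN `neighbors` ON `users`.`id` = `neighbors`.`child`
     WHERE `neighbors`.`parent` = :parent'
);
$stmt->execute([':parent' => 123]); // id of the user whose neighbours we are fetching
$neighbours = $stmt->fetchAll(PDO::FETCH_ASSOC);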
Well, the answer I was working on while pevara was posting theirs is almost the same...
CREATE TABLE `neighbours` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`neighbour_email` char(64) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=0 DEFAULT CHARSET=utf8;
CREATE TABLE `neighbour_email_collections` (
`id` int(10) unsigned NOT NULL,
`email_id` char(64) NOT NULL,
`neighbour_id` int(10) unsigned NOT NULL,
PRIMARY KEY (`id`,`neighbour_id`)
) ENGINE=InnoDB AUTO_INCREMENT=0 DEFAULT CHARSET=utf8;
insert into neighbours values (234, "bcd");
insert into neighbours values (345, "dsf");
insert into neighbour_email_collections values(123, "abc", 234);
insert into neighbour_email_collections values(123, "abc", 345);
select *
from neighbours
left join neighbour_email_collections
on neighbour_email_collections.neighbour_id=neighbours.id
where neighbour_email_collections.id=123;

MySQL INSERT SELECT and get primary key values

I want to know if it's possible to INSERT records from a SELECT statement on a source table into a destination table, get the insert IDs, and UPDATE a field on all the corresponding records in the source table.
Take for example, the destination table 'payments':
CREATE TABLE `payments` (
`id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`txid` TEXT NULL,
`amount` DECIMAL(16,8) NOT NULL DEFAULT '0.00000000',
`worker` INT(10) UNSIGNED NOT NULL,
PRIMARY KEY (`id`)
)
The source table 'log':
CREATE TABLE `log` (
`id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`solution` VARCHAR(80) NOT NULL,
`worker` INT(11) NOT NULL,
`amount` DECIMAL(16,8) NOT NULL DEFAULT '0.00000000',
`pstatus` VARCHAR(50) NOT NULL DEFAULT 'pending',
`payment_id` INT(10) UNSIGNED NULL DEFAULT NULL,
PRIMARY KEY (`id`)
)
The "log" table contains multiple "micro-payments" for a completed task. The purpose of the "payments" table is to consolidate the micro-payments into one larger payment:
INSERT INTO payments (amount, worker)
SELECT SUM(l.amount) AS total, l.worker FROM log l
WHERE l.pstatus = "ready"
AND l.payment_id IS NULL
AND l.amount > 0
GROUP BY l.worker
I'm not sure if it's clear from the code above, but I would like the field "payment_id" to be given the value of the insert id, so that it's possible to trace a micro-payment back to the larger consolidated payment.
I could do it all client-side (PHP), but I was wondering if there is some magical SQL query that would do it for me? Or maybe I am going about it all wrong.
You can use mysql_insert_id() to get the id of the inserted record.
See mysql_insert_id().
Note, however, that this function is deprecated.
If you're using PDO, use PDO::lastInsertId.
If you're using Mysqli, use mysqli::$insert_id.
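For example, with PDO (connection details and values are placeholders), a single insert followed by lastInsertId() might look like:
<?php
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'password');
$stmt = $pdo->prepare('INSERT INTO `payments` (`amount`, `worker`) VALUES (:amount, :worker)');
$stmt->execute([':amount' => '0.00500000', ':worker' => 42]);
$paymentId = $pdo->lastInsertId(); // id of the row that was just inserted
Keep in mind that after a multi-row INSERT ... SELECT, lastInsertId() only reports the first automatically generated id, so it cannot on its own map every consolidated payment back to its log rows.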
Well, the linking column between the tables is the column worker. After you have inserted your values, just do
UPDATE log l
INNER JOIN payments p ON l.worker = p.worker
SET l.payment_id = p.id;
and that's it. Or did I get the question wrong? Note that the columns differ in their signed/unsigned attribute. You might want to change that.
I think you should use an ORM in PHP, as follows:
Look into Doctrine.
Doctrine 1.2 implements Active Record. Doctrine 2+ is a DataMapper ORM.
Also, check out Xyster. It's based on the Data Mapper pattern.
Also, take a look at DataMapper vs. Active Record.

Comparison time for 2 large MySQL database tables

I have imported 2 .csv files that I wanted to compare into MySQL tables. Now I want to compare both of them using a join.
However, whenever I include both tables in my query, I get no response from phpMyAdmin (sometimes it shows 'max execution time exceeded').
The record count in both db tables is 73k max. I don't think that's a huge amount of data. Even a simple query like
SELECT *
FROM abc456, xyz456
seems to hang. I did an EXPLAIN and got the output below. I don't know what to make of it.
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE abc456 ALL NULL NULL NULL NULL 73017
1 SIMPLE xyz456 ALL NULL NULL NULL NULL 73403 Using join buffer
Can someone please help?
UPDATE: added the structure of the table, which has a composite key. There are around 100,000+ records that would be inserted into this table.
CREATE TABLE IF NOT EXISTS `abc456` (
`Col1` varchar(4) DEFAULT NULL,
`Col2` varchar(12) DEFAULT NULL,
`Col3` varchar(9) DEFAULT NULL,
`Col4` varchar(3) DEFAULT NULL,
`Col5` varchar(3) DEFAULT NULL,
`Col6` varchar(40) DEFAULT NULL,
`Col7` varchar(200) DEFAULT NULL,
`Col8` varchar(40) DEFAULT NULL,
`Col9` varchar(40) DEFAULT NULL,
`Col10` varchar(40) DEFAULT NULL,
`Col11` varchar(40) DEFAULT NULL,
`Col12` varchar(40) DEFAULT NULL,
`Col13` varchar(40) DEFAULT NULL,
`Col14` varchar(20) DEFAULT NULL,
KEY `Col1` (`Col1`,`Col2`,`Col3`,`Col4`,`Col5`,`Col6`,`Col7`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
It looks like you are doing a pure Cartesian join in your query.
Shouldn't you be joining the tables on certain fields? If you do that and the query still takes a long time to execute, you should add appropriate indexes to speed it up.
The reason that it is taking so long is that it is trying to join every single row of the first table to every single row of the second table.
You need a join condition, some way of identifying which rows should be matched up:
SELECT * FROM abc456, xyz456 WHERE abc456.id = xyz456.id
Add indexes on joining columns. That should help with performance.
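For instance, assuming the two tables share a common key column (Col2 here is only a placeholder for whatever that column really is):
ALTER TABLE `abc456` ADD INDEX `idx_col2` (`Col2`);
ALTER TABLE `xyz456` ADD INDEX `idx_col2` (`Col2`);
SELECT *
FROM abc456
INNER JOIN xyz456 ON abc456.Col2 = xyz456.Col2;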
Use MySQL Workbench or MySQL Client (console) for long queries. phpmyadmin is not designed to display queries that return 100k rows :)
If you REALLY have to use phpMyAdmin and you need to run long queries, you can use a Firefox extension that prevents the phpMyAdmin timeout: phpMyAdmin Timeout Preventer (direct link!)
There is a direct link because I couldn't find an English description.

Indexing 2.5 million items with assorted other information

I have a table with a list of 2.5 million doctors. I also have tables for accepted insurance, languages spoken, and for specialties (taxonomy) provided. The doctor table is like:
CREATE TABLE `doctors` (
`doctor_id` int(10) NOT NULL AUTO_INCREMENT,
`city_id` int(10) NOT NULL DEFAULT '0',
`d_gender` char(1) NOT NULL DEFAULT 'U',
`s_insurance` int(6) NOT NULL DEFAULT '0',
`s_languages` int(6) NOT NULL DEFAULT '0',
`s_taxonomy` int(6) NOT NULL DEFAULT '0',
PRIMARY KEY (`doctor_id`)
) ENGINE=InnoDB;
The other information is stored as such:
CREATE TABLE `doctors_insurance` (
`assoc_id` int(10) NOT NULL AUTO_INCREMENT,
`doctor_id` int(10) NOT NULL DEFAULT '0',
`insurance_id` int(10) NOT NULL DEFAULT '0',
PRIMARY KEY (`assoc_id`)
) ENGINE=InnoDB;
CREATE TABLE `doctors_languages` (
`assoc_id` int(10) NOT NULL AUTO_INCREMENT,
`doctor_id` int(10) NOT NULL DEFAULT '0',
`language_id` int(10) NOT NULL DEFAULT '0',
PRIMARY KEY (`assoc_id`)
) ENGINE=InnoDB;
CREATE TABLE `doctors_taxonomy` (
`assoc_id` int(10) NOT NULL AUTO_INCREMENT,
`doctor_id` int(10) NOT NULL DEFAULT '0',
`taxonomy_id` int(10) NOT NULL DEFAULT '0',
PRIMARY KEY (`assoc_id`)
) ENGINE=InnoDB;
Naturally each doctor supports various insurance plans, maybe speaks multiple languages, and some doctors have several different specialties (taxonomy). So I opted to have separate tables for indexing; this way, should I need to add new indices or drop old ones, I can simply remove the tables and not have to wait the long time it takes to do it the old-fashioned way.
Also because of other scaling techniques to consider in the future, classic JOINs make no difference to me right now, so I'm not worried about it.
Indexing by name was easy:
CREATE TABLE `indices_doctors_names` (
`ref_id` int(10) NOT NULL AUTO_INCREMENT,
`doctor_id` int(10) NOT NULL DEFAULT '0',
`practice_id` int(10) NOT NULL DEFAULT '0',
`name` varchar(120) NOT NULL DEFAULT '',
PRIMARY KEY (`ref_id`),
KEY `name` (`name`)
) ENGINE=InnoDB;
However, when I wanted to allow people to search by city, specialties, insurance, language, gender and other demographics, I created this:
CREATE TABLE `indices_doctors_demos` (
`ref_id` int(10) NOT NULL AUTO_INCREMENT,
`doctor_id` int(10) NOT NULL DEFAULT '0',
`city_id` int(10) NOT NULL DEFAULT '0',
`taxonomy_id` int(6) NOT NULL DEFAULT '0',
`insurance_id` int(6) NOT NULL DEFAULT '0',
`language_id` int(6) NOT NULL DEFAULT '0',
`gender_id` char(1) NOT NULL DEFAULT 'U',
PRIMARY KEY (`ref_id`),
KEY `index` (`city_id`,`taxonomy_id`,`insurance_id`,`language_id`,`gender_id`)
) ENGINE=InnoDB;
The idea is that there will be an entry for each combination of specialty, insurance, and language primarily, while the other fields stay the same. This creates an obvious problem. If a doctor has 3 specialties, supports 3 insurance providers, and speaks 3 languages, this alone means this specific doctor has 27 entries. So 2.5 million entries easily balloons into far more.
There has to be a better approach, but how can it be done? Again, I'm not interested in moving to classic indexing techniques and using JOINs, because they will quickly become too slow; I need a method that can scale out easily.
I know this is not the answer you're looking for, but you've now taken the things that an RDBMS does well and tried implementing them yourself, using the same mechanisms the RDBMS could use to actually make sense of your data and optimize both retrieval and querying. In practice you've decided to drop proper indexes and create your own halfway-there solution, which tries to implement indexes by itself (while actually using the indexing capability of the RDBMS through the KEY).
I'd suggest actually trying to use the database the way you've already structured it. 2.5 million rows isn't that many, and you should be able to make it work fast and within your constraints using both JOINs and indexes. Use EXPLAIN and add proper indexes to support the queries you want answered. If you ever run into an issue (and I doubt you will, given the amount of data you're querying here), solve the bottleneck when you actually know what the issue is, instead of trying to solve a problem you've only imagined so far. There might be other technologies than MySQL that can be helpful, but you'll need to know what's actually hurting your performance first.
The normal way to deal with the explosion of rows in a denormalized table like "indices_doctors_demos" is to normalize to 5NF. Try to keep in mind that normalizing has nothing at all to do with the decision to use id numbers as surrogate keys.
In the scenario you described, normalizing to 5NF seems practical. You wouldn't have any table with more than about 7 million rows. The table "indices_doctors_demos" vanishes entirely, the four "doctors" tables all become narrower, and all of them would end up with highly selective indexes.
If you worked for me, I'd require you to prove that 5NF can't work before I'd let you take a different approach.
Since you already have all the data, it makes sense to build it and test it, paying close attention to the query plans. It shouldn't take you more than one afternoon. Guessing at some table names, I'd suggest you load data into these tables.
-- You're missing foreign keys throughout. I've added some of them,
-- but not all of them. I'm also assuming you have a way to identify
-- doctors besides a bare integer.
CREATE TABLE `doctors` (
`doctor_id` int(10) NOT NULL AUTO_INCREMENT,
`city_id` int(10) NOT NULL DEFAULT '0',
`d_gender` char(1) NOT NULL DEFAULT 'U',
PRIMARY KEY (`doctor_id`)
) ENGINE=InnoDB;
CREATE TABLE `doctors_insurance` (
`doctor_id` int(10) NOT NULL DEFAULT '0',
`insurance_id` int(10) NOT NULL DEFAULT '0',
PRIMARY KEY (`doctor_id`, `insurance_id`),
FOREIGN KEY (`doctor_id`) REFERENCES `doctors` (`doctor_id`),
FOREIGN KEY (`insurance_id`) REFERENCES `insurance` (`insurance_id`)
) ENGINE=InnoDB;
CREATE TABLE `doctors_languages` (
`doctor_id` int(10) NOT NULL DEFAULT '0',
`language_id` int(10) NOT NULL DEFAULT '0',
PRIMARY KEY (`doctor_id`, `language_id`),
FOREIGN KEY (`doctor_id`) REFERENCES `doctors` (`doctor_id`),
FOREIGN KEY (`language_id`) REFERENCES `languages` (`language_id`)
) ENGINE=InnoDB;
CREATE TABLE `doctors_taxonomy` (
`doctor_id` int(10) NOT NULL DEFAULT '0',
`taxonomy_id` int(10) NOT NULL DEFAULT '0',
PRIMARY KEY (`doctor_id`, `taxonomy_id`),
FOREIGN KEY (`doctor_id`) REFERENCES `doctors` (`doctor_id`),
FOREIGN KEY (`taxonomy_id`) REFERENCES `taxonomies` (`taxonomy_id`)
) ENGINE=InnoDB;
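To check whether this layout supports the searches described in the question, a query along these lines (the city, insurance, language and taxonomy ids are placeholders) can be prefixed with EXPLAIN to confirm that the primary keys are being used:
SELECT d.doctor_id
FROM doctors d
JOIN doctors_insurance di ON di.doctor_id = d.doctor_id AND di.insurance_id = 12
JOIN doctors_languages dl ON dl.doctor_id = d.doctor_id AND dl.language_id = 3
JOIN doctors_taxonomy dt ON dt.doctor_id = d.doctor_id AND dt.taxonomy_id = 7
WHERE d.city_id = 1001 AND d.d_gender = 'F';
An additional index on doctors (city_id) would support the WHERE clause for city-based searches.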
