Simultaneous mysql queries + PHP


I need to retrieve data from 2 tables at the same time; the tables are not linked by foreign keys or such.
$query1 = "select idemployee from employee where address like 'Park Avenue, 23421'";
$query2 = "select idcompany from company where bossName like 'Peter'";
How can I do this with some kind of thread in PHP? I've heard that threads are not safe in PHP.
UPDATED:
I have an input field that needs to look up data in both tables. It is like a search on both tables that shows the possible results based on the employee's address or the boss's name, so you can type an address or just the boss's name. The queries above are just a representation of what I need.

Either use a single query, or look into something like Gearman to have workers perform jobs asynchronously. (I assume the current code is only an example: if the queries you have there perform so badly that you want to run them asynchronously, then you most likely have a database problem.) Having some daemon processes ready to perform tasks is relatively simple.
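The "single query" option could look roughly like the sketch below, which searches both tables in one round trip with UNION ALL. This is only an illustration: searchBoth() is a hypothetical helper, the 'source' column is added for the example, and an in-memory SQLite database stands in for MySQL so that the snippet is self-contained.

```php
<?php
// Sketch only: SQLite in memory stands in for the MySQL server from the question.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec("CREATE TABLE employee (idemployee INTEGER PRIMARY KEY, address TEXT)");
$pdo->exec("CREATE TABLE company (idcompany INTEGER PRIMARY KEY, bossName TEXT)");
$pdo->exec("INSERT INTO employee (idemployee, address) VALUES (1, 'Park Avenue, 23421'), (2, 'Main Street, 5')");
$pdo->exec("INSERT INTO company (idcompany, bossName) VALUES (10, 'Peter'), (11, 'Paula')");

// Hypothetical helper: one statement, two unrelated tables.
function searchBoth(PDO $pdo, string $term): array
{
    // Two distinct placeholders are used because a named placeholder
    // cannot always be reused when prepares are not emulated.
    $sql = "SELECT 'employee' AS source, idemployee AS id FROM employee WHERE address LIKE :t1
            UNION ALL
            SELECT 'company' AS source, idcompany AS id FROM company WHERE bossName LIKE :t2";
    $stmt = $pdo->prepare($sql);
    $like = '%' . $term . '%';
    $stmt->execute([':t1' => $like, ':t2' => $like]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

$byAddress = searchBoth($pdo, 'Park Avenue');
$byBoss = searchBoth($pdo, 'Peter');
```

Each returned row carries a 'source' value, so the page can tell employee hits from company hits without a second query.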

Um...
$query1 = "select idemployee from employee where address like ?";
$query2 = "select idcompany from company where bossName like ?";
$stmt1 = $pdo->prepare($query1);
$stmt1->execute(array('Park Avenue, 23421'));
$employee = $stmt1->fetch();
$stmt2 = $pdo->prepare($query2);
$stmt2->execute(array('Peter'));
$company = $stmt2->fetch();
What am I missing?

You could use MYSQLI_ASYNC and http://docs.php.net/mysqli.poll (both only available with php 5.3+ and mysqlnd).
But then you'll need a separate connection to the MySQL server for each query.

Depends on what do you want to do with those queries. If, for example, you are using an AJAX form and can make two requests, you should create separate scripts, where each returns the results for each query. That is effectively running them in separate processes, so they execute simultaneously.
There is no such thing as threading per se in PHP, you can see a hack around it here (using full fledged processes.)

Counter answer to my previous:
Create a new table
CREATE TABLE EmployeeBossXref (
id INT auto_increment,
employee_id INT,
boss_id INT,
company_id INT,
FOREIGN KEY (employee_id) REFERENCES Employee(id),
FOREIGN KEY (boss_id) REFERENCES Employee(id),
FOREIGN KEY (company_id) REFERENCES Company(id)
) ENGINE=InnoDB;
Then change SQL to:
select Employee.name, Boss.name, Company.name FROM Employee
JOIN EmployeeBossXref ebx ON ebx.employee_id=Employee.id
JOIN Employee Boss ON Boss.id=ebx.boss_id
JOIN Company ON Company.id=ebx.company_id
WHERE Employee.address LIKE 'Park Avenue, 23421'
AND Boss.name LIKE 'Peter';
With this system, all bosses are employees (which they logically are!), and an employee can have more than one boss, or none.

You don't. Do you have an engineering reason you need to do this?
Making two queries simultaneously is still going to hit the same database, and the database is going to do the same amount of work. It's not going to make anything faster, and you'll have the overhead of the additional threads/processes being created.
If you really need better concurrency, consider a 2nd (or 3rd or 4th) real-time replicated database for SELECT queries, to offload some of the work from the main database.

Related

MySQL using COUNT as total and avoiding exceeding a maximum

I was tasked to create this organization registration system. I decided to use MySQL and PHP to do it. Each organization in table orgs has a max_members column and has a unique id org_id. Each student in table students has an org column. Every time a student joins an organization, his org column is equated to the org_id of that organization.
When someone clicks join on an organization page, a PHP file executes.
In the PHP file, a query retrieves the total number of students whose org is equal to the org_id of the organization being joined.
$query = "SELECT COUNT(student_id) FROM students WHERE org = '$org_id'";
The maximum members is also retrieved from the orgs table.
$query = "SELECT max_members FROM orgs WHERE org_id = '$org_id'";
So I have variables $total_members and $max_members. A basic if statement checks if $total_members < $max_members, then updates the student's org equal to the org_id. If not, then it does nothing and notifies the student that the organization is full.
My main concern is: what if this situation happened?
Org A only has one slot left. 29/30 members.
Student A clicks join on Org A (and at the same time)
Student B clicks join on Org A
Student A retrieves data: There is one slot left
Student B retrieves data: There is one slot left
Student A's org = Org A's org_id
Student B's org = Org A's org_id
After the scripts have executed, Org A will show up with 31/30 members
Can this happen? If yes, how can I avoid it?
I've thought about using MySQL variables like this:
SET @org_id = 'what-ever-org';
SELECT @total_members := COUNT(student_id) FROM students WHERE org_main = @org_id;
SELECT @max_members := max_members FROM orgs WHERE org_id = @org_id;
UPDATE students SET org_main = IF(@total_members < @max_members, @org_id, '') WHERE student_id = 99999;
But I don't know if it would make a difference.
Row locking does not apply in my case. I think. I'd love to be proven wrong though.
The code I've written above is a simplified version of the original code. The original code included checking registration dates, org days, etc, however, these things are not related to the question.

What you're describing is usually called a race condition. It occurs because the check and the update are two separate operations on your database that are not atomic as a whole. To avoid this you need to use transactions, which ensure that the database server prevents this kind of interference. Another approach would be to use a "before update trigger".
Transaction
As you're using MySQL you have to make sure that the DB engine your tables are running on is InnoDB, because MyISAM just doesn't have transactions. Before you do your SELECT you need to start a transaction. Either send START TRANSACTION manually to the database or use a proper PHP implementation for it, e.g. PDO (PDO::beginTransaction()).
In a running transaction you can then use the suffix FOR UPDATE in your SELECT statement, which will lock the rows that have been selected by the query:
SELECT COUNT(student_id) FROM students WHERE org = :orgId FOR UPDATE
After your UPDATE statement you can commit the transaction, which will write the changes permanently to the database and remove the locks.
If you expect a lot of these simultaneous requests to happen, be aware that locking can cause some delay in the response, because the database might wait for a lock to be released.
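The transaction approach can be sketched in PHP as follows. Everything here is an assumption made for illustration: joinOrg() is a hypothetical function, the table layout is guessed from the question, and the lock is taken on the organisation's row (rather than on the counted student rows) so that concurrent joins for the same organisation queue up. FOR UPDATE is only appended when the driver is MySQL; the demo at the bottom uses in-memory SQLite purely so that it runs on its own.

```php
<?php
// Sketch only: hypothetical helper, schema assumed from the question.
function joinOrg(PDO $pdo, int $studentId, string $orgId): bool
{
    // FOR UPDATE is MySQL/InnoDB syntax; it is skipped for the SQLite demo below.
    $lock = $pdo->getAttribute(PDO::ATTR_DRIVER_NAME) === 'mysql' ? ' FOR UPDATE' : '';
    $pdo->beginTransaction();
    try {
        // On MySQL, a second request for the same org waits here until the first commits.
        $stmt = $pdo->prepare('SELECT max_members FROM orgs WHERE org_id = ?' . $lock);
        $stmt->execute([$orgId]);
        $max = $stmt->fetchColumn();

        $stmt = $pdo->prepare('SELECT COUNT(student_id) FROM students WHERE org = ?');
        $stmt->execute([$orgId]);
        $total = (int) $stmt->fetchColumn();

        // Unknown organisation or no free slot: undo and report failure.
        if ($max === false || $total >= (int) $max) {
            $pdo->rollBack();
            return false;
        }
        $stmt = $pdo->prepare('UPDATE students SET org = ? WHERE student_id = ?');
        $stmt->execute([$orgId, $studentId]);
        $pdo->commit();
        return true;
    } catch (Throwable $e) {
        if ($pdo->inTransaction()) {
            $pdo->rollBack();
        }
        throw $e;
    }
}

// Demo data: an organisation with two slots and three students.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec("CREATE TABLE orgs (org_id TEXT PRIMARY KEY, max_members INTEGER)");
$pdo->exec("CREATE TABLE students (student_id INTEGER PRIMARY KEY, org TEXT)");
$pdo->exec("INSERT INTO orgs VALUES ('chess', 2)");
$pdo->exec("INSERT INTO students VALUES (1, NULL), (2, NULL), (3, NULL)");

$first = joinOrg($pdo, 1, 'chess');
$second = joinOrg($pdo, 2, 'chess');
$third = joinOrg($pdo, 3, 'chess');
$members = (int) $pdo->query("SELECT COUNT(*) FROM students WHERE org = 'chess'")->fetchColumn();
```

The single-process demo cannot reproduce the race itself; it only shows that the third join is refused once the organisation is full.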
Trigger
Instead of using transactions you can also create a trigger on the database, which will run before an update is executed on the database. In this trigger you could test if the maximum number of students has been exceeded and throw an error if that's the case. However, this can be a very challenging approach, especially if the value to be checked depends on something in the UPDATE statement. Also it is debatable if it's a good idea to implement this kind of logic on the database level.

There are two ways. One is to use a synchronized function in PHP that performs this operation. But if you want to implement all the logic in MySQL (the method I prefer), use a stored procedure.
Create a stored procedure like this (not exactly):
DELIMITER //
-- Parameters are prefixed with p_ so they cannot be confused with the org_id column.
CREATE PROCEDURE join_org(IN p_stu_id INT, IN p_org_id INT, OUT success INT)
BEGIN
DECLARE total_members INT;
DECLARE max_members_allowed INT;
SELECT COUNT(student_id) INTO total_members FROM students WHERE org = p_org_id;
SELECT max_members INTO max_members_allowed FROM orgs WHERE org_id = p_org_id;
IF max_members_allowed > total_members THEN
-- Only the joining student is updated.
UPDATE students SET org = p_org_id WHERE student_id = p_stu_id;
SET success = 1;
ELSE
SET success = 0;
END IF;
END //
DELIMITER ;
Then read the OUT variable named 'success' in your PHP code to detect success or failure. Call this procedure when the user clicks join.

Is this the correct way to structure this SQL query?

I am currently working with PHP and SQL on a website. There is a database containing users (accounts), organisations, and a relational table to link organisations to accounts (a many to many relationship)
When I delete an account from the database, the SQL query should also delete any organisations the account is linked to if the account being deleted is the only account linked to an organisation.
I am relatively new to SQL and have constructed a query which should delete an organisation from the organisations table under the conditions described above.
Here is my query:
'DELETE FROM TBL_ORGANISATIONS WHERE id = (
SELECT org_id FROM TBL_AFFILIATIONS WHERE account_email = :email AND (
SELECT COUNT(*) FROM TBL_AFFILIATIONS WHERE org_id IN (
SELECT org_id FROM TBL_AFFILIATIONS WHERE account_email = :email
)
) = 1
)'
Is this the correct way to structure this query or is there a clearer / more efficient way to do this? As I previously mentioned I am fairly new to SQL and have not yet grasped the concept of all the SQL keywords which can be useful in constructing queries such as this (JOIN etc.)
I thank you all in advance for any advice you can provide.
By the way:
I am using PDO hence the :email for those of you wondering.

If you have foreign key constraints, like I think you should, then this statement will fail, because the affiliation record still points to the organisation record to be deleted.
You can use ON DELETE CASCADE to delete the organisation, like Mihai suggested in his (now deleted) comment, but to do that, you will still have to check whether there's only one affiliation linked to the organisation.
In this case, I'd rather query the organisation's ID first. You will probably have that at hand anyway, because you'll know the details of the account you are deleting. Then first delete the account and next delete the organisation if you need to, with a statement that looks like this:
DELETE FROM TBL_ORGANISATIONS o
WHERE
o.id = :ThatIdYouQueriedBefore AND
NOT EXISTS (SELECT 'x' FROM TBL_AFFILIATIONS a WHERE a.org_id = o.id);
Personally I'm not a big fan of cascaded deletes, since a seemingly small mistake might cost you a lot of data, but even if you do use them, I don't think it makes this particular case much easier.

I think I would do two logical queries. Feel free to wrap them in a transaction if you need to guarantee that the deletes always happen together, with a rollback if they don't.
First, delete both the account and the account-to-organization affiliation with a single query (here I am assuming your account table name):
DELETE tbl_account,tbl_affiliations
FROM tbl_account INNER JOIN tbl_affiliations
ON tbl_account.account_email = tbl_affiliations.account_email /* I am assuming join condition here, perhaps there is an id to be used instead */
WHERE tbl_account.account_email = ?
Then, delete any orphaned organizations:
DELETE tbl_organizations
FROM tbl_organizations LEFT JOIN tbl_affiliations
ON tbl_organizations.org_id = tbl_affiliations.org_id
WHERE tbl_affiliations.org_id IS NULL
Note that since that last query is not dependent on any account-specific information, you could also consider running an asynchronous process to clean up orphaned organizations if you don't need the organization deletion to happen synchronously with the account deletion.
The benefit of this approach is that you can potentially always delete user accounts in the same way, using the first query, as this works regardless as to whether there are multiple accounts associated with the organization. So you don't need any extra application logic or SQL SELECT subqueries to look for cases where there is a single account associated with an organization.
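A rough PHP sketch of this "delete the account, then clean up orphans" flow is given below. It is a variation, not the exact queries above: deleteAccount() is a hypothetical helper, the table and column names are assumptions, and it uses plain single-table DELETE statements (with NOT EXISTS for the orphan check) instead of MySQL's multi-table DELETE so that it also runs on the in-memory SQLite database used for the demo.

```php
<?php
// Sketch only: hypothetical helper with assumed table and column names.
function deleteAccount(PDO $pdo, string $email): void
{
    $pdo->beginTransaction();
    try {
        // 1. Remove the account's links, then the account itself.
        $pdo->prepare('DELETE FROM tbl_affiliations WHERE account_email = ?')->execute([$email]);
        $pdo->prepare('DELETE FROM tbl_account WHERE account_email = ?')->execute([$email]);
        // 2. Remove every organisation that no longer has any affiliation.
        $pdo->exec(
            'DELETE FROM tbl_organisations
             WHERE NOT EXISTS (
                 SELECT 1 FROM tbl_affiliations a WHERE a.org_id = tbl_organisations.id
             )'
        );
        $pdo->commit();
    } catch (Throwable $e) {
        if ($pdo->inTransaction()) {
            $pdo->rollBack();
        }
        throw $e;
    }
}

// Demo data: org 1 has two accounts, org 2 has only a@example.com.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec("CREATE TABLE tbl_account (account_email TEXT PRIMARY KEY)");
$pdo->exec("CREATE TABLE tbl_organisations (id INTEGER PRIMARY KEY, name TEXT)");
$pdo->exec("CREATE TABLE tbl_affiliations (account_email TEXT, org_id INTEGER)");
$pdo->exec("INSERT INTO tbl_account VALUES ('a@example.com'), ('b@example.com')");
$pdo->exec("INSERT INTO tbl_organisations VALUES (1, 'Shared'), (2, 'Solo')");
$pdo->exec("INSERT INTO tbl_affiliations VALUES ('a@example.com', 1), ('b@example.com', 1), ('a@example.com', 2)");

deleteAccount($pdo, 'a@example.com');
$remainingOrgs = $pdo->query('SELECT id FROM tbl_organisations ORDER BY id')->fetchAll(PDO::FETCH_COLUMN);
$remainingAccounts = $pdo->query('SELECT account_email FROM tbl_account')->fetchAll(PDO::FETCH_COLUMN);
```

The orphan cleanup does not mention the deleted account at all, which is what makes it possible to move it into a separate background job later.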

As you are trying to delete the data from two tables at a time, you can try this.
DELETE o, a FROM TBL_ORGANISATIONS o JOIN TBL_AFFILIATIONS a
ON (o.id = a.org_id) WHERE a.account_email = :email;
Here, we delete the rows from the two tables TBL_ORGANISATIONS and TBL_AFFILIATIONS, joined on the primary key of TBL_ORGANISATIONS (id) and the foreign key of TBL_AFFILIATIONS (org_id), with a WHERE clause that restricts the rows to a.account_email = :email.

Show relationship using two table JOIN, or use PHP functions?

I'm making a micro-blogging website. The users can follow each other. I have to build a stream of posts (an activity stream) for the current user ( $userid ) based on the users the current user is following, like in Twitter. I know two ways of implementing this. Which one is better?
Tables:
Table: posts
Columns: PostID, AuthorID, TimeStamp, Content
Table: follow
Columns: poster, follower
The first way, by joining these two tables:
select `posts`.* from `posts`,`follow` where `follow`.`follower`='$userid' and
`posts`.`AuthorID`=`follow`.`poster` order by `posts`.`postid` desc
The second way is by making an array of users the $userid is following (posters), then doing php implode on this array, and then doing where in:
One thing I'd like to mention here is that I'm storing the number of users a user is following in the `following` field of the `users` table, so I'll use this number as a limit when extracting the list of posters, the 'followingList':
function followingList($userid){
$listArray=array();
$limit="select `following` from `users` where `userid`='$userid' limit 1";
$limit=mysql_query($limit);
$limit=mysql_fetch_row($limit);
$limit= (int) $limit[0];
$sql="select `poster` from `follow` where `follower`='$userid' limit $limit";
$result=mysql_query($sql);
while($data = mysql_fetch_row($result)){
$listArray[] = $data[0];
}
$posters=implode("','",$listArray);
return $posters;
}
Now I have a comma-separated list of user IDs the current $userid is following. And now I select the posts to make the activity stream:
$posters=followingList($userid);
$sql = "select * from `posts` where (`AuthorID` in ('$posters'))
order by `postid` desc";
Which of the two methods is better?
And can knowing the total number of following (number of users the current user is following), make things faster in the first method as it's doing in the second method?
Any other better method?

You should go all the way with the first option. Always try as much as possible to process the data on the mysql server instead of in your PHP code. PHP will not implicitly cache the results of the operations while MySQL will do it.
The most important thing is to make sure you index your data correctly. Try using "EXPLAIN" statements to make sure you have optimized your database as much as possible and use #1 to link your data together.
http://dev.mysql.com/doc/refman/5.0/en/explain.html
This will allow you later to compute statistics also, while the second method requires you to process a part of the statistics.

The first important point is that PHP is good at building pages but very bad at managing data. Everything manipulated by PHP fills the memory, and nothing special can be done in PHP to prevent it from using too much memory, except crashing.
On the other side, the database's job is to analyse the relations between the tables and the real numbers involved in the query (in fact, the cardinality of indexes and statistics on rows and index usage), and the engine can choose among a lot of different mechanisms depending on the size of the data (merge joins, temporary tables, etc.). That means you could have 256.278.242 posts and 145.268 users, with 5.684 followers on average, and the database's job would be to find the fastest way to give you an answer. Well, when you hit really big numbers you'll see that not all databases are equal, but that's another problem.
On the PHP side, retrieving the list of users from the first query could become very slow with a big number of followed users, let's say 15.000. Simply building the query string with 15.000 identifiers inside would take quite a big amount of memory. Transferring this new query to the SQL server would also be slow. It's definitely the wrong way.
Now be careful of the way you build your SQL request. A request is something you should be able to read from the top to the end, explaining what you really want. This will help the SQL (good) engine in choosing the right solution.
select `posts`.*
from `posts`
INNER JOIN `follow` ON `posts`.`AuthorID`=`follow`.`poster`
where `follow`.`follower`='#userid'
order by `posts`.`postid` desc
LIMIT 15
Several remarks:
I have used an INNER JOIN. I want an INNER JOIN, so let's write it; it will be easier for me to read later and it should be the same for the query analyser.
If #userid is an int, do not use quotes. Please use ints for identifiers (this is really faster than strings). On the PHP side, cast to int, as in "SELECT ..." . (int) $user_id . " ORDER ...", or use a query with parameters (this is for security).
I have used a LIMIT 15; maybe an offset could be used as well, if you want to show some pagination control around the posts. Let's say this query would retrieve 15.263 documents from my 5.642 followed users: you do not want, and the user does not want, to show these 15.263 documents on a web page. And knowing with $limit that the number is 15.263 is a good thing, but certainly not as a request limit. You know this number, but the database may know it as well if it has a good query analyser and some good internal statistics.
The request limit has several goals:
1. Limit the size of the data transferred from the database to your PHP script
2. Limit the memory usage of your PHP script (an array with 15.263 documents containing some HTML stuff... ouch)
3. Limit the size of the final user output (and get a faster response)
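The remarks above (explicit INNER JOIN, integer identifiers passed as parameters, a small LIMIT with an optional offset) could be combined as in the sketch below. activityStream() is a hypothetical function name, the page size of 15 is simply taken from the query above, and in-memory SQLite stands in for MySQL so the example runs on its own.

```php
<?php
// Sketch only: hypothetical helper; table and column names follow the question.
function activityStream(PDO $pdo, int $userId, int $limit = 15, int $offset = 0): array
{
    $stmt = $pdo->prepare(
        'SELECT posts.*
         FROM posts
         INNER JOIN follow ON posts.AuthorID = follow.poster
         WHERE follow.follower = :uid
         ORDER BY posts.PostID DESC
         LIMIT :lim OFFSET :off'
    );
    // Bind as integers so the values are never quoted as strings.
    $stmt->bindValue(':uid', $userId, PDO::PARAM_INT);
    $stmt->bindValue(':lim', $limit, PDO::PARAM_INT);
    $stmt->bindValue(':off', $offset, PDO::PARAM_INT);
    $stmt->execute();
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

// Demo data: user 1 follows users 2 and 3, but not user 4.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec("CREATE TABLE posts (PostID INTEGER PRIMARY KEY, AuthorID INTEGER, TimeStamp TEXT, Content TEXT)");
$pdo->exec("CREATE TABLE follow (poster INTEGER, follower INTEGER)");
$pdo->exec("INSERT INTO follow VALUES (2, 1), (3, 1)");
$pdo->exec("INSERT INTO posts VALUES (1, 2, '2012-01-01', 'a'), (2, 3, '2012-01-02', 'b'), (3, 4, '2012-01-03', 'c'), (4, 2, '2012-01-04', 'd')");

$page1 = activityStream($pdo, 1, 2, 0);
$page2 = activityStream($pdo, 1, 2, 2);
$page1Ids = array_map('intval', array_column($page1, 'PostID'));
$page2Ids = array_map('intval', array_column($page2, 'PostID'));
```

With a page size of 2, the first page holds the two newest posts from followed users and the second page holds the remaining one; the post by the unfollowed user 4 never appears.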

Is it considered bad form to encode object-oriented data directly into single rows in a relational database?

I'm relatively new to databases so I apologize if there's an obvious way to approach this or if there is some fundamental process I'm missing. I'm using PHP and MySQL in a web application involving patient medical records. One requirement is that users be able to view and edit the medical records from a web page.
As I envisage it, a single Patient object has basic attributes like id, name, and address, and then each Patient also has an array of Medication objects (med_name, dose, reason), Condition objects (cond_name, date, notes), and other such objects (allergies, family history, etc.). My first thought was to have a database schema with tables as follows:
patients (id, name, address, ...)
medications ( patient_id, med_name, dose, reason)
conditions ( patient_id, cond_name, date, notes)
...
However, this seems wrong to me. Adding new medications or conditions is easy enough, but deleting or editing existing medications or conditions seems ridiculously inefficient - I'd have to, say, search through the medications table for a row matching patient_id with the old med_name, dose, and reason fields, and then delete/edit it with the new data. I could add some primary key to the medications and conditions tables to make it more efficient to find the row to edit, but that would seem like an arbitrary piece of data.
So what if I just had a single table with the following schema?
patients (id, name, address, meds, conds, ...)
Where meds and conds are simply representations (say, binary) of arrays of Medication and Condition objects? PHP can interpret this data and fetch and update it in the database as needed.
Any thoughts on best practices here would be welcome. I'm also considering switching to Ruby on Rails, so if that affects any decisions I should make I'm interested to hear that as well. Thanks a lot folks.

The 'badness' or 'goodness' of encoding your data like that depends on your needs. If you NEVER need to refer to individual smaller chunks of data in those 'meds' and 'conds' tables, then there's no problem.
However, then you're essentially reducing your database to a slightly-smarter-than-dumb storage system, and lose the benefits of the 'relational' part of SQL databases.
e.g. if you ever need to run a query for "find all patients who are taking viagra and have heart conditions", then the DBMS won't be able to run that query directly, as it has no idea how you've "hidden" the viagra/heart condition data inside those two fields, whereas with a properly normalized database you'd have:
SELECT ...
FROM patients
LEFT JOIN conditions ON patients.id = conditions.patient_id
LEFT JOIN meds ON patients.id = meds.patient_id
WHERE (meds.name = 'Viagra') AND (conditions.name = 'Heart Disease')
and the DBMS handles everything automatically. If you're encoding everything into a single field, then you're stuck with substring operations (assuming the data's in some readable ASCII format), or at worst, having to suck the entire database across to your client app, decode each field, check its contents, then throw away everything that doesn't contain viagra or heart disease, which is highly inefficient.
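To make the contrast concrete, here is a small sketch of that kind of query against the normalized layout from the question (patients, medications, conditions). The function name patientsWith(), the sample rows, and the use of INNER JOIN with DISTINCT are assumptions for the example, and in-memory SQLite stands in for MySQL.

```php
<?php
// Sketch only: hypothetical helper over the normalized tables from the question.
function patientsWith(PDO $pdo, string $medName, string $condName): array
{
    $stmt = $pdo->prepare(
        'SELECT DISTINCT p.id, p.name
         FROM patients p
         INNER JOIN medications m ON m.patient_id = p.id
         INNER JOIN conditions c ON c.patient_id = p.id
         WHERE m.med_name = :med AND c.cond_name = :cond
         ORDER BY p.id'
    );
    $stmt->execute([':med' => $medName, ':cond' => $condName]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

// Demo data (made up for the example).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec("CREATE TABLE patients (id INTEGER PRIMARY KEY, name TEXT)");
$pdo->exec("CREATE TABLE medications (patient_id INTEGER, med_name TEXT, dose TEXT, reason TEXT)");
$pdo->exec("CREATE TABLE conditions (patient_id INTEGER, cond_name TEXT, date TEXT, notes TEXT)");
$pdo->exec("INSERT INTO patients VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy')");
$pdo->exec("INSERT INTO medications (patient_id, med_name) VALUES (1, 'Aspirin'), (2, 'Aspirin'), (3, 'Ibuprofen')");
$pdo->exec("INSERT INTO conditions (patient_id, cond_name) VALUES (1, 'Heart Disease'), (2, 'Asthma'), (3, 'Heart Disease')");

$matches = patientsWith($pdo, 'Aspirin', 'Heart Disease');
```

The database does the filtering; nothing has to be decoded in PHP, which is exactly what is lost when the medications and conditions are packed into a single column.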

This breaks first normal form. You can never query on object attributes that way.
I'd recommend either an ORM solution, if you have objects, or an object database.

I'd have to, say, search through the medications table for a row
matching patient_id with the old med_name, dose, and reason fields,
and then delete/edit it with the new data.
Assuming the key was {patient_id, med_name, start_date}, you'd just do a single update. No searching.
update medications
set reason = 'Your newly edited reason, for example.'
where patient_id = ?
and med_name = ?
and start_date = ?
Your app will already know the patient id, med name, and start date, because the user will have to somehow "select" the row those are in before any change will make sense.
If you're going to change the dosage, you need two changes, an update and an insert, in order to make sense.
update medications
set stop_date = '2012-01-12'
where patient_id = ?
and med_name = ?
and start_date = ?
-- I'm using fake data in this one.
insert into medications (patient_id, med_name, start_date, stop_date, dosage)
values (1, 'that same med', '2012-01-12', '2012-01-22', '40mg bid')

Suitable design for a database application

I have a question related to a web app that I developed in PHP, MYSQL.
basically part 1 is :
I display results in the form of table say for software testing.
ID  Prod_Name  Set   Date      Result  Platform
1   Alpha1     Pro1  01.01.01  PASS    2.3.1.2_OS
Now, I have divided the tables accordingly
Table Name: Results
ID, Name, Date, Result
Table Name : Set
ID, Set_Name, Prod_name
Table Name : Platform
ID, Platform_Name, Set_Name
Now, ID in each table is an incremented value and does not relate to anything else.
My PHP app starts by fetching the results from the 'Results' table. Since I want the Set to be displayed for every row, I am making another connection to the database and using the query
select Set_name
from Set
where Prod_name = row['Name'] // row['Name'] is fetched from the results table.
Now I also want to display the platform, which I extract from the Platform table using the same method, i.e. making another connection and passing Set_Name = row['Set_Name'] from the Set table.
Is there any other way to achieve the same result for my application?
Typically, for large web-based applications where the data comes from a database server, is making multiple connections to the DB server a feasible option?
Please set aside the fact that with MySQL declaring a connection statement once will do the job; what about MSSQL Server? Do we need to write a long SQL statement with several joins/self-joins/unions and use those variables all over the application?
What would the application design look like in this case?
Can anyone give me some ideas, please?
Thanks.

For pretty much any flavour of database, a single SELECT statement which joins three tables will perform better than three separate statements querying a table apiece. Joining is what relational databases do.

I may not have understood everything, but here is something similar. First, let's make an ER model.
Now, because you don't seem to like joins, create a view in the database.
CREATE VIEW v_test AS
SELECT TestID, ProductName, TestName, Date, Result, PlatformName
FROM Product AS p
JOIN Test AS t ON t.ProductID = p.ProductID
JOIN Platform AS f ON f.PlatformID = t.PlatformID;
With this in place, you can simply use:
SELECT * FROM v_test WHERE ProductName = 'Alpha1'
You may also take a look at this question/answer with a similar scenario.
