I have been given access to a third party's database and wish to create a tool using their information. The database, designed for their original purpose, is very large and segregated.
From the schema below, I need to complete the following task:
Look up the item in invTypes, then check both invTypeMaterials and ramTypeRequirements to see if any materials are needed to build the item. If so, look up each of those materials in invTypes and repeat the process to see whether they in turn need components. This loop keeps going until the check on both invTypeMaterials and ramTypeRequirements is false. That can be 5 or 6 loops, with 5 or 6 items to check per loop, so it could be 1561 loops, assuming 1 loop for the original item, then 5 loops per material, of which there are 5, 5 times.
I tried to write the code and came up with the following:
$materialList = array();
function getList($dbc, $item) {
global $materialList;
// Obtain initial material list
$materials = materialList($dbc, $item);
// For each row in the database
while ($material = mysqli_fetch_array($materials)) {
// Check if there are any sub materials required
if (subList($dbc, $material['ID'])) {
// If so then recurse over the list the given quantity (it has already done it once)
for ($i = 0; $i < $material['Qty'] - 1; $i++) {
if (!subList($dbc, $material['ID'])) {
break;
}
}
} else {
// If there are no further materials then this is the base material so add to the array.
$materialList[] = array(
"Name" => $material['Name'],
"Qty" => $material['Qty'],
"ID" => $material['ID']
);
}
}
return $materialList;
}
function subList($dbc, $item) {
global $materialList;
// Query the material in case it requires further building
$mMaterials = materialList($dbc, $item['ID']);
// If the database returns any rows, then it must have more sub-materials required
if (mysqli_num_rows($mMaterials) > 0) {
// Check the sub-materials to see if they in turn require further materials
if (subList($dbc, $material['ID'])) {
// If the function returns true then iterate over the list the given quantity (its already done it once before)
for ($i = 0; $i < $material['Qty'] - 1; $i++) {
if (!subList($dbc, $material['ID'])) {
break;
}
}
} else {
// if the database returns 0 rows then this object is the base material so add to array.
$materialList .= array(
"Name" => $mMaterial['Name'],
"Qty" => $mMaterial['Qty'],
"ID" => $material['ID']
);
return true;
}
} else {
return false;
}
}
function materialList($dbc, $item) {
// Query
$query = " SELECT i.typeID AS ID, i.typeName AS Name, m.Quantity AS Qty
FROM invTypes AS i
LEFT JOIN invTypeMaterials AS m
ON m.materialTypeID = i.typeID
LEFT JOIN ramTypeRequirements AS r
ON r.typeID = i.typeID
WHERE groupID NOT IN(278,269,278,270,268) AND m.typeID = $item";
$snippets = mysqli_query($dbc, $query) or die('Error: ' . mysqli_error($dbc));
return $snippets;
}
As I'm sure you have all noticed, this code breaks about every programming law there is when it comes to recursive database calls. It is not really practical, especially since subList() keeps calling itself until it returns false. SQL isn't my strong suit, but I cannot for the life of me work out how to get around this problem.
Any pointers would be very helpful, I'm certainly not asking any of you to re-write my entire code for me, but if you have any ideas as to what I should consider I would be grateful.
As a generic solution I would do the following:
For every typeID, gather from both invTypeMaterials and ramTypeRequirements
From the gathered data, you create a new SELECT query and continue the cycle
Initial query
SELECT t.*, m.materialTypeID, m.quantity AS m_quantity, r.requiredTypeID, r.quantity AS r_quantity
FROM invTypes t
LEFT JOIN invTypeMaterials m USING (typeID)
LEFT JOIN ramTypeRequirements r USING (typeID)
WHERE <conditions to select the types>
I've just made a guess at which data from the extra tables are required to load; expand where necessary.
The materialTypeID and requiredTypeID will be non-null for matched rows and null otherwise.
Keep a table of the types you have already loaded, for faster reference. Then for the second query you replace the condition with something like `WHERE t.typeID IN (<the IDs collected so far>)`
Let me know if this makes sense and whether it's even close to what's useful to you :)
It looks like recursion is unavoidable here. I agree with Jack's answer and will just extend it with PHP code :)
I must warn you that I never executed it, so it will need debugging, but I hope you will get the idea. :)
$checked_dependencies = array();
$materials = array();
function materialList($dbc, $ids) {
    // the result arrays live outside the function
    global $checked_dependencies, $materials;
    // if we have an array of IDs, condition is ".. in (...)"
    if(is_array($ids)) {
        $ids = array_map('intval', $ids);
        $condition = 'IN ('.implode(',',$ids).')';
        // add all to checked dependencies
        foreach($ids as $id) { $checked_dependencies[] = $id; }
    }else{
        // otherwise, checking for particular ID
        $ids = (int) $ids;
        $condition = "= {$ids}";
        // add to checked dependencies
        $checked_dependencies[] = $ids;
    }
    $query = "SELECT t.*,
        m.materialTypeID, m.quantity AS m_quantity,
        r.requiredTypeID,
        r.quantity AS r_quantity
        FROM invTypes t
        LEFT JOIN invTypeMaterials m ON t.typeID = m.typeID
        LEFT JOIN ramTypeRequirements r ON t.typeID = r.typeID
        WHERE t.typeID {$condition}";
    $res = mysqli_query($dbc, $query) or die('Error: ' . mysqli_error($dbc));
    // this will be the list of IDs which we need to get
    $ids_to_check = array();
    while($material = mysqli_fetch_assoc($res)) {
        $materials[] = $material; // you can get only needed fields
        // if we didn't check the dependencies already, adding them to the list
        // (if they aren't there yet); array keys are case-sensitive, so they
        // must match the column names in the SELECT exactly
        if(!is_null($material['materialTypeID'])
            && !in_array($material['materialTypeID'], $checked_dependencies)
            && !in_array($material['materialTypeID'], $ids_to_check)) {
            $ids_to_check[] = $material['materialTypeID'];
        }
        if(!is_null($material['requiredTypeID'])
            && !in_array($material['requiredTypeID'], $checked_dependencies)
            && !in_array($material['requiredTypeID'], $ids_to_check)) {
            $ids_to_check[] = $material['requiredTypeID'];
        }
    }
    // if the result array isn't empty, recursively calling same func
    if(!empty($ids_to_check)) { materialList($dbc, $ids_to_check); }
}
I used a global array here, but it's easy to re-write the func to return data.
Also we can put some depth limit here to avoid too much recursion.
Generally, I'd say it is not a very convenient (for this task) organization of DB data. It's kinda comfortable to store data recursively like that, but, as you see, it results in an unknown amount of iterations and requests to database to get all the dependencies. And that might be expensive (PHP <-> MySQL <-> PHP <->...), on each iteration we lose time, especially if the DB is on remote server as in your case.
Of course, it would be great to rearrange the data structure so that all requirements can be fetched at once, but as I understand it you have read-only access to the database. The second solution that comes to mind is a recursive MySQL stored procedure, which is also impossible here.
In some cases (not generally) it is good to get as much data as possible in one query, and operate with it locally, to lessen the iterations number. It is hard to say if it is possible here, because I don't know the size of DB and the structure, etc, but e.g. if all required dependencies are stored in one group, and the groups aren't enormously large, maybe it might be faster to get all the group info in one request to a PHP array and then collect the info from that array locally. But - it is only a guess and it needs testing and checking.
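To make the "fetch once, resolve locally" idea concrete, here is a minimal sketch. It assumes the requirement rows have already been loaded into a PHP array shaped as typeID => [materialTypeID => quantity]; that shape, the function name baseMaterials() and the sample IDs are all made up for illustration and are not part of the original schema or code.

```php
<?php
// Hypothetical in-memory copy of the requirement rows:
// typeID => [materialTypeID => quantity per unit].
// In practice this would be filled from one or two bulk SELECTs.
function baseMaterials(array $requirements, int $typeId, int $quantity = 1, array &$totals = [], array $path = []): array
{
    // Guard against cyclic data so the recursion always terminates.
    if (in_array($typeId, $path, true)) {
        throw new RuntimeException("Cyclic requirement at type $typeId");
    }
    // No requirement rows: this is a base material, so add it to the totals.
    if (empty($requirements[$typeId])) {
        $totals[$typeId] = ($totals[$typeId] ?? 0) + $quantity;
        return $totals;
    }
    $path[] = $typeId;
    foreach ($requirements[$typeId] as $materialId => $perUnit) {
        // Multiply quantities on the way down instead of looping Qty times.
        baseMaterials($requirements, $materialId, $quantity * $perUnit, $totals, $path);
    }
    return $totals;
}

// Made-up sample data: item 100 needs 2 x 200 and 1 x 300; 200 needs 3 x 300 and 5 x 400.
$requirements = [
    100 => [200 => 2, 300 => 1],
    200 => [300 => 3, 400 => 5],
];
$totals = baseMaterials($requirements, 100);
print_r($totals);
```

With the data in memory, no database round trip happens inside the recursion, and multiplying the quantity replaces the "repeat Qty times" loop from the question.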
I store the array in the session so that it is easy to retrieve and work with.
$responses = session('get_all_response');
$responses contains 30 records at most.
I am aiming to make pushing data into the array faster. With 10 records in $responses it takes 30 seconds to load all the information for each element of the array (and in practice the array is more likely to hold the maximum of 30 records).
I loop over the array:
foreach($responses as $res)
{
$bo_images = DB::select('SELECT
image.bo_hotel_code,
image.bo_image_type_code,
image.bo_path,
imagetypes.bo_content_imagetype_description
FROM
bo_images AS image
RIGHT JOIN bo_content_imagetypes AS imagetypes
ON imagetypes.bo_content_imagetype_code = image.bo_image_type_code
WHERE image.bo_hotel_code = "'.$res['code'].'" AND image.bo_image_type_code = "COM" LIMIT 1');
if($bo_images != null)
{
foreach($bo_images as $row)
{
$responses[$res['code']]['information']['bo_images'] = array(
'image_type_code' => $row->bo_image_type_code,
'image_path' => 'http://photos.hotelbeds.com/giata/'.$row->bo_path,
'image_type_description' => $row->bo_content_imagetype_description,
);
}
}
$bo_categories = DB::select('SELECT
a.category_code,
b.bo_content_category_description
FROM
bo_hotel_contents AS a
RIGHT JOIN bo_content_categories AS b
ON b.bo_content_category_code = a.category_code
WHERE a.hotel_code= "'.$res['code'].'"');
if($bo_categories != null)
{
foreach($bo_categories as $row)
{
$responses[$res['code']]['information']['rating'] = array(
'description' => $row->bo_content_category_description,
);
}
}
}
On every iteration, the code uses the element's key to fetch the matching content from the database,
and then pushes that content into the array at the index that matches that key.
Apart from the speed, it works. But I know this is not the proper way of doing it, and that there is a much better way.
Any help is much appreciated.
I'm not familiar with Laravel, so I don't know if prepared statements work with it, but you should do something to clean &/or verify the $res['code'] to make sure it is an integer, assuming that's what it's supposed to be.
First, prepare a string for a WHERE IN clause.
$str = "";
foreach ($responses as $res){
$str .= ','.$res['code'];
}
$str = substr($str,1); // to remove the comma
Then you'll need to change your query to use the IN statement.
WHERE a.hotel_code IN({$str})
I'm guessing image.bo_hotel_code refers to $res['code']. But in case it doesn't, you could modify your SELECT statement (if memory serves):
$code = $res['code'];
SELECT {$code} as code,
image.bo_hotel_code,
image.bo_image_type_code,
...
Then you'll loop over the results and put them into the array in the same manner, where $row['code'] would refer to the code used to select it. It should be MUCH faster than running repeated queries, and there should be one row for each code in the IN statement.
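As a rough sketch of the same idea with bound parameters instead of string concatenation: the helper names inPlaceholders() and groupRowsBy() and the sample codes are made up for illustration. Laravel's DB::select() does accept an array of bindings as its second argument, but that call is left commented out because it needs a live database.

```php
<?php
// Build "?,?,?" for a prepared IN (...) clause; one placeholder per value.
function inPlaceholders(array $values): string
{
    if (count($values) === 0) {
        throw new InvalidArgumentException('IN () needs at least one value');
    }
    return implode(',', array_fill(0, count($values), '?'));
}

// Index result rows by one column so each row can be matched back to its hotel.
// Rows may be arrays or objects (DB::select returns objects), hence the cast.
function groupRowsBy(array $rows, string $column): array
{
    $grouped = [];
    foreach ($rows as $row) {
        $row = (array) $row;
        $grouped[$row[$column]][] = $row;
    }
    return $grouped;
}

$codes = ['1001', '1002', '1003']; // made-up values standing in for the codes in $responses
// The WHERE on a.hotel_code already makes the original RIGHT JOIN behave like an inner join.
$sql = 'SELECT a.hotel_code, a.category_code, b.bo_content_category_description
        FROM bo_hotel_contents AS a
        JOIN bo_content_categories AS b ON b.bo_content_category_code = a.category_code
        WHERE a.hotel_code IN (' . inPlaceholders($codes) . ')';
// $rows = DB::select($sql, $codes);             // one round trip instead of one per hotel
// $byHotel = groupRowsBy($rows, 'hotel_code');  // then fill $responses from $byHotel
```

Selecting a.hotel_code in the result is what lets each row be attached to the right entry afterwards.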
I'm wondering whether this kind of logic would improve query performance: for example, rather than checking whether a user likes a post for each element in an array and firing a query for each one,
I could instead push the primary IDs into an array and then perform a single IN query on them. This would replace about 15 per-item queries with 2 queries in total, including the initial one.
I'm using PHP PDO and MySQL.
Any advice? Am I on the right track, people? :D
$items is the result set from the database. In this case they are questions that users are asking. I get a response in about 140ms, and I've set a limit on how many items are loaded at once with pagination.
$questionIds = [];
foreach ($items as $item) {
array_push($questionIds, $item->question_id);
}
$items = loggedInUserLikesQuestions($questionIds, $items, $user_id);
The IN clause is definitely faster in terms of executing the SQL. However, you will only see significant wall-clock benefits once the number of items in your IN clause gets high (on average).
The reason there is a speed difference, even though each individual update may be lightning-fast, is the setup, execution, tear-down, and response of each query, plus the send/receive round trip to the server. When you are doing thousands (or millions) of these as fast as you can, I've seen throughput go from 500/sec to 200,000/sec. This may give you some idea.
However, with the IN-clause method, you need to make sure your IN clause does not become too big and hit the maximum query size (see the max_allowed_packet variable).
Here is a simple set of functions that will automatically batch up into IN clauses of 1000 items each:
<?php
$db = new PDO('...');
$__q = [];
$flushQueue = function() use ($db, &$__q) {
if ( count($__q) > 0 ) {
$sanitized_ids = [];
foreach ( $__q as $id ) { $sanitized_ids[] = (int) $id; }
$db->query("UPDATE question SET linked = 1 WHERE id IN (". join(',',$sanitized_ids) .")");
$__q = [];
}
};
$queuedUpdate = function($question_id) use (&$__q, $flushQueue){
$__q[] = $question_id;
if ( count( $__q) >= 1000 ) { $flushQueue(); }
};
// Then your code...
foreach ($items as $item) {
$queuedUpdate($item->question_id);
}
$flushQueue();
Obviously, you don't have to use anon functions, if you are in a class. But the above will work anywhere (assuming you are on >= PHP 5.3).
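If all the IDs are already available up front, the same batching can be sketched with PHP's built-in array_chunk() instead of a queue. The function name inClauseBatches() and the sample IDs are invented for this example; the table and column names are the ones used above.

```php
<?php
// Split a list of ids into IN (...) lists of at most $size integers each.
function inClauseBatches(array $ids, int $size = 1000): array
{
    $batches = [];
    // intval() keeps anything that is not a number out of the SQL text.
    foreach (array_chunk(array_map('intval', $ids), $size) as $chunk) {
        $batches[] = implode(',', $chunk);
    }
    return $batches;
}

// Tiny batch size here only to make the splitting visible.
foreach (inClauseBatches([3, 5, 8, 13, 21], 2) as $list) {
    echo "UPDATE question SET linked = 1 WHERE id IN ($list)\n";
}
```

Each generated statement would then be sent with $db->query() as in the code above; the queue version remains the better fit when the IDs arrive one at a time.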
I have a bunch of photos on a page and using jQuery UI's Sortable plugin, to allow for them to be reordered.
When my sortable function fires, it writes a new order sequence:
1030:0,1031:1,1032:2,1040:3,1033:4
Each item of the comma delimited string, consists of the photo ID and the order position, separated by a colon. When the user has completely finished their reordering, I'm posting this order sequence to a PHP page via AJAX, to store the changes in the database. Here's where I get into trouble.
I have no problem getting my script to work, but I'm pretty sure it's the incorrect way to achieve what I want, and will suffer hugely in performance and resources - I'm hoping somebody could advise me as to what would be the best approach.
This is my PHP script that deals with the sequence:
if ($sorted_order) {
$exploded_order = explode(',',$sorted_order);
foreach ($exploded_order as $order_part) {
$exploded_part = explode(':',$order_part);
$part_count = 0;
foreach ($exploded_part as $part) {
$part_count++;
if ($part_count == 1) {
$photo_id = $part;
} elseif ($part_count == 2) {
$order = $part;
}
$SQL = "UPDATE article_photos ";
$SQL .= "SET order_pos = :order_pos ";
$SQL .= "WHERE photo_id = :photo_id;";
... rest of PDO stuff ...
}
}
}
My concerns arise from the nested foreach functions and also running so many database updates. If a given sequence contained 150 items, would this script cry for help? If it will, how could I improve it?
** This is for an admin page, so it won't be heavily abused **
You can use one update, with some clever code like so:
Create the array $data['order'] (position => photo ID) in the loop, then:
$q = "UPDATE article_photos SET order_pos = (CASE photo_id ";
foreach($data['order'] as $sort => $id){
$q .= " WHEN {$id} THEN {$sort}";
}
$q .= " END ) WHERE photo_id IN (".implode(",",$data['order']).")";
The generated query, a little clearer perhaps:
UPDATE article_photos SET order_pos = (CASE photo_id
WHEN 1 THEN 999
WHEN 2 THEN 1000
WHEN 3 THEN 1001
END)
WHERE photo_id IN (1,2,3)
I use this approach for exactly what you're doing: updating sort orders.
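A parameterised variant of the same single-UPDATE idea might look like the sketch below. It parses the posted "id:position" string directly; the function name buildOrderUpdate() is made up, and the final execute call is commented out because it needs a real PDO connection.

```php
<?php
// Turn "1030:0,1031:1" into one parameterised UPDATE ... CASE statement.
// Returns [sql, params] ready for PDO::prepare() and execute().
function buildOrderUpdate(string $sortedOrder): array
{
    $cases = [];
    $caseParams = [];
    $ids = [];
    foreach (explode(',', $sortedOrder) as $pair) {
        $parts = explode(':', $pair);
        // Reject anything that is not exactly "digits:digits".
        if (count($parts) !== 2 || !ctype_digit($parts[0]) || !ctype_digit($parts[1])) {
            throw new InvalidArgumentException("Malformed pair: $pair");
        }
        $cases[] = 'WHEN ? THEN ?';
        $caseParams[] = (int) $parts[0]; // photo_id
        $caseParams[] = (int) $parts[1]; // order_pos
        $ids[] = (int) $parts[0];
    }
    $sql = 'UPDATE article_photos SET order_pos = CASE photo_id '
         . implode(' ', $cases)
         . ' END WHERE photo_id IN (' . implode(',', array_fill(0, count($ids), '?')) . ')';
    // CASE parameters come first in the SQL text, then the IN list.
    return [$sql, array_merge($caseParams, $ids)];
}

list($sql, $params) = buildOrderUpdate('1030:0,1031:1,1032:2');
// $pdo->prepare($sql)->execute($params);
```

For 150 photos this produces one statement with 450 bound values, which is still a single round trip to the database.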
No need for the second foreach: you know it's going to be two parts if your data passes validation (I'm assuming you validated this. If not: you should =) so just do:
if (count($exploded_part) == 2) {
$id = $exploded_part[0];
$seq = $exploded_part[1];
/* rest of code */
} else {
/* error - data does not conform despite validation */
}
As for update hammering: do your DB updates in a transaction. Your db will queue the ops, but not commit them to the main DB until you commit the transaction, at which point it'll happily do the update "for real" at lightning speed.
I suggest making your script even simpler and renaming the variables so the code is more readable.
$parts = explode(',',$sorted_order);
foreach ($parts as $part) {
list($id, $position) = explode(':',$part);
//Now you can work with $id and $position ;
}
More info about list: http://php.net/manual/en/function.list.php
Also, about performance and your data structure:
The way you store your data is not perfect, but it will not cause any performance issues: you send less data, so there is less overhead overall.
However the drawback of your data structure is that most probably you will be unable to establish relationships between tables and make joins or alter table structure in a correct way.
I'm coding a quest system for my site that pretty much works like any you find in an MMORPG. I've got the whole thing working, but I really need to speed things up since I coded it inefficiently. Not because I can't code, but because I just wasn't sure how to go about it.
Basically I want to display all the quests that are available to the user. There are quests with no requirements, and some with. To see if a quest has already been completed, and a quest is available you would check the questuser table for the value of questuserCompleted. If it's 1, then it's complete.
Here are the tables I have. I left irrelevant things out.
quest table - Holds all the quest data
questID
questNPC
Where they get the Quest from
questPreReq
refers to a quest they would have needed to complete to get this one
questuser table - Holds all the quests the user has accepted, complete or not
questuserID
questuserUser
User's ID
questuserQuest
refers to the ID of the quest from the quest table
questuserCompleted
0 is in progress, 1 is complete
There's definitely a better way to do this than what I have now. I'm usually more efficient at things like this, but since I've never coded something like this before, I'm not really sure how to go about it.
Basically it just loops through every single quest, and with an if statement, it checks for questuserCompleted. Once there starts to be a lot of quests, this would get pretty slow for each load of the page.
function displayAvailable($npc = 0){
$completed[] = 0;
$notCompleted = array();
$query=mysql_query("
SELECT a.questID, a.questPreReq, a.questTitle, b.questuserCompleted, a.questText , a.questNPC
FROM quest a
LEFT JOIN questuser b ON a.questID = b.questuserQuest AND b.questuserUser = '".$this->userID."'
ORDER BY a.questID");
$comments = $this->ProcessRowSet($query);
$num = mysql_num_rows($query);
if($num){
foreach ($comments as $c){
if($c['questuserCompleted']){
$completed[] = $c['questID'];
}else{
$notCompleted[] = $c['questID'];
}
if(in_array($c['questPreReq'], $completed) && !in_array($c['questID'], $completed) && $c['questuserCompleted'] != '0'){
if($npc == 0 || $c['questNPC'] == $npc){
$count++;
$return .= "<p>".$c['questTitle']."</p>";
}
}
}
}
if(!$count){
$return = "You have no available quests";
}
return $return;
}
Thanks for any help.
Subqueries to the rescue
SELECT a.questID, a.questPreReq, a.questTitle, a.questText , a.questNPC
FROM quest a
WHERE a.questPreReq IS NULL
OR a.questPreReq IN (SELECT questuserQuest FROM questuser WHERE questuserUser = 'UserID' AND questuserCompleted = 1)
ORDER BY a.questID
So you let the database sort it out for you, this should be superfast.
The query is missing the exclusion of quests the user already did, I leave this as an exercise as I am writing on my smartphone ; )
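One possible way to fill in that exclusion is sketched below. It makes several assumptions: "no prerequisite" is stored as NULL or 0, any quest the user has already accepted (finished or not) should be hidden, and the function name availableQuestsQuery() is invented. Parentheses around the OR conditions matter, otherwise the AND would bind only to the last one.

```php
<?php
// Build the "available quests" query plus its bound parameters.
// Returns [sql, params] for a prepared statement (PDO or mysqli).
function availableQuestsQuery(int $userId, int $npc = 0): array
{
    $sql = "SELECT q.questID, q.questPreReq, q.questTitle, q.questText, q.questNPC
            FROM quest q
            WHERE (q.questPreReq IS NULL OR q.questPreReq = 0
                   OR q.questPreReq IN (SELECT questuserQuest FROM questuser
                                        WHERE questuserUser = ? AND questuserCompleted = 1))
              AND NOT EXISTS (SELECT 1 FROM questuser u
                              WHERE u.questuserQuest = q.questID
                                AND u.questuserUser = ?)";
    $params = [$userId, $userId];
    // Optional NPC filter, mirroring the $npc argument in the question.
    if ($npc !== 0) {
        $sql .= " AND q.questNPC = ?";
        $params[] = $npc;
    }
    return [$sql . " ORDER BY q.questID", $params];
}

list($sql, $params) = availableQuestsQuery(42);
// $stmt = $pdo->prepare($sql); $stmt->execute($params);
```

To keep in-progress quests in the list, the NOT EXISTS subquery could additionally require u.questuserCompleted = 1.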
I am trying to figure out a script to take a MySQL query and turn it into individual queries, i.e. denormalizing the query dynamically.
As a test I have built a simple article system that has 4 tables:
articles
article_id
article_format_id
article_title
article_body
article_date
article_categories
article_id
category_id
categories
category_id
category_title
formats
format_id
format_title
An article can be in more than one category but only have one format. I feel this is a good example of a real-life situation.
On the category page which lists all of the articles (pulling in the format_title as well) this could be easily achieved with the following query:
SELECT articles.*, formats.format_title
FROM articles
INNER JOIN formats ON articles.article_format_id = formats.format_id
INNER JOIN article_categories ON articles.article_id = article_categories.article_id
WHERE article_categories.category_id = 2
ORDER BY articles.article_date DESC
However the script I am trying to build would receive this query, parse it and run the queries individually.
So in this category page example the script would effectively run this (worked out dynamically):
// Select article_categories
$sql = "SELECT * FROM article_categories WHERE category_id = 2";
$query = mysql_query($sql);
while ($row_article_categories = mysql_fetch_array($query, MYSQL_ASSOC)) {
// Select articles
$sql2 = "SELECT * FROM articles WHERE article_id = " . $row_article_categories['article_id'];
$query2 = mysql_query($sql2);
while ($row_articles = mysql_fetch_array($query2, MYSQL_ASSOC)) {
// Select formats
$sql3 = "SELECT * FROM formats WHERE format_id = " . $row_articles['article_format_id'];
$query3 = mysql_query($sql3);
$row_formats = mysql_fetch_array($query3, MYSQL_ASSOC);
// Merge articles and formats
$row_articles = array_merge($row_articles, $row_formats);
// Add to array
$out[] = $row_articles;
}
}
// Sort articles by date
foreach ($out as $key => $row) {
$arr[$key] = $row['article_date'];
}
array_multisort($arr, SORT_DESC, $out);
// Output articles - this would not be part of the script obviously it should just return the $out array
foreach ($out as $row) {
echo '<p>'.$row['article_title'].' <i>('.$row['format_title'].')</i><br />'.$row['article_body'].'<br /><span class="date">'.date("F jS Y", strtotime($row['article_date'])).'</span></p>';
}
The challenges of this are working out the correct queries in the right order, as you can put column names for SELECT and JOIN's in any order in the query (this is what MySQL and other SQL databases translate so well) and working out the information logic in PHP.
I am currently parsing the query using SQL_Parser which works well in splitting up the query into a multi-dimensional array, but working out the stuff mentioned above is the headache.
Any help or suggestions would be much appreciated.
From what I gather you're trying to put a layer between a 3rd-party forum application that you can't modify (obfuscated code perhaps?) and MySQL. This layer will intercept queries, re-write them to be executable individually, and generate PHP code to execute them against the database and return the aggregate result. This is a very bad idea.
It seems strange that you imply the impossibility of adding code and simultaneously suggest generating code to be added. Hopefully you're not planning on using something like funcall to inject code. This is a very bad idea.
The calls from others to avoid your initial approach and focus on the database are very sound advice. I'll add my voice to that hopefully growing chorus.
We'll assume some constraints:
You're running MySQL 5.0 or greater.
The queries cannot change.
The database tables cannot be changed.
You already have appropriate indexes in place for the tables the troublesome queries are referencing.
You have triple-checked the slow queries (and run EXPLAIN) hitting your DB and have attempted to setup indexes that would help them run faster.
The load the inner joins are placing on your MySQL install is unacceptable.
Three possible solutions:
You could deal with this problem easily by investing money into your current database by upgrading the hardware it runs on to something with more cores, more (as much as you can afford) RAM, and faster disks. If you've got the money, Fusion-io's products come highly recommended for this sort of thing. This is probably the simplest of the three options I'll offer.
Setup a second master MySQL database and pair it with the first. Make sure you have the ability to force AUTO_INCREMENT id alternation (one DB uses even id's, the other odd). This doesn't scale forever, but it does offer you some breathing room for the price of the hardware and rack space. Again, beef up the hardware. You may have already done this, but if not it's worth consideration.
Use something like dbShards. You still need to throw more hardware at this, but you have the added benefit of being able to scale beyond two machines and you can buy lower cost hardware over time.
To improve database performance you typically look for ways to:
Reduce the number of database calls
Make each database call as efficient as possible (via good design)
Reduce the amount of data to be transferred
...and you are doing the exact opposite? Deliberately?
On what grounds?
I'm sorry, you are doing this entirely wrong, and every single problem you encounter down this road will all be consequences of that first decision to implement a database engine outside of the database engine. You will be forced to work around work-arounds all the way to delivery date. (if you get there).
Also, we are talking about a forum? I mean, come on! Even on the most "web-scale-awesome-sauce" forums we're talking about less than what, 100 tps on average? You could do that on your laptop!
My advice is to forget about all this and implement things the most simple possible way. Then cache the aggregates (most recent, popular, statistics, whatever) in the application layer. Everything else in a forum is already primary key lookups.
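As a very small illustration of caching an aggregate in the application layer, here is a sketch of a time-limited memoiser. The function name cached() and the cache key are made up, and the static array only lives for the current PHP process; a real deployment would more likely keep the values in a shared store such as APCu or Memcached.

```php
<?php
// Hypothetical helper: remember the result of an expensive callable for $ttl seconds.
function cached(string $key, int $ttl, callable $compute)
{
    static $store = []; // per-process only; stands in for a shared cache
    $now = time();
    if (isset($store[$key]) && $store[$key]['expires'] > $now) {
        return $store[$key]['value'];
    }
    $value = $compute();
    $store[$key] = ['value' => $value, 'expires' => $now + $ttl];
    return $value;
}

// The closure stands in for the joined "most recent articles" query.
$calls = 0;
$loader = function () use (&$calls) {
    $calls++;
    return ['article 1', 'article 2'];
};
$first = cached('recent_articles', 60, $loader);
$second = cached('recent_articles', 60, $loader); // served from the cache
```

The query itself stays a plain JOIN; only its result is reused between calls.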
I agree it sounds like a bad choice, but I can think of some situations where splitting a query could be useful.
I would try something similar to this, relying heavily on regular expressions for parsing the query. It would work in a very limited set of cases, but its support could be expanded progressively when needed.
<?php
/**
* That's a weird problem, but an interesting challenge!
* @link http://stackoverflow.com/questions/5019467/problem-writing-a-mysql-parser-to-split-joins-and-run-them-as-individual-query
*/
// Taken from the given example:
$sql = "SELECT articles.*, formats.format_title
FROM articles
INNER JOIN formats ON articles.article_format_id = formats.format_id
INNER JOIN article_categories ON articles.article_id = article_categories.article_id
WHERE article_categories.category_id = 2
ORDER BY articles.article_date DESC";
// Parse query
// (Limited to the clauses that are present in the example...)
// Edit: Made WHERE optional
if(!preg_match('/^\s*'.
'SELECT\s+(?P<select_rows>.*[^\s])'.
'\s+FROM\s+(?P<from>.*[^\s])'.
'(?:\s+WHERE\s+(?P<where>.*[^\s]))?'.
'(?:\s+ORDER\s+BY\s+(?P<order_by>.*[^\s]))?'.
'(?:\s+(?P<desc>DESC))?'.
'(.*)$/is',$sql,$query)
) {
trigger_error('Error parsing SQL!',E_USER_ERROR);
return false;
}
## Dump matches
#foreach($query as $key => $value) if(!is_int($key)) echo "\"$key\" => \"$value\"<br/>\n";
/* We get the following matches:
"select_rows" => "articles.*, formats.format_title"
"from" => "articles INNER JOIN formats ON articles.article_format_id = formats.format_id INNER JOIN article_categories ON articles.article_id = article_categories.article_id"
"where" => "article_categories.category_id = 2"
"order_by" => "articles.article_date"
"desc" => "DESC"
/**/
// Will only support WHERE conditions separated by AND that are to be
// tested on a single individual table.
if(@$query['where']) // Edit: Made WHERE optional
$where_conditions = preg_split('/\s+AND\s+/is',$query['where']);
// Retrieve individual table information & data
$tables = array();
$from_conditions = array();
$from_tables = preg_split('/\s+INNER\s+JOIN\s+/is',$query['from']);
foreach($from_tables as $from_table) {
if(!preg_match('/^(?P<table_name>[^\s]*)'.
'(?P<on_clause>\s+ON\s+(?P<table_a>.*)\.(?P<column_a>.*)\s*'.
'=\s*(?P<table_b>.*)\.(?P<column_b>.*))?$/im',$from_table,$matches)
) {
trigger_error("Error parsing SQL! Unexpected format in FROM clause: $from_table", E_USER_ERROR);
return false;
}
## Dump matches
#foreach($matches as $key => $value) if(!is_int($key)) echo "\"$key\" => \"$value\"<br/>\n";
// Remember on_clause for later jointure
// We do assume each INNER JOIN's ON clause compares left table to
// right table. Forget about parsing more complex conditions in the
// ON clause...
if(@$matches['on_clause'])
$from_conditions[$matches['table_name']] = array(
'column_a' => $matches['column_a'],
'column_b' => $matches['column_b']
);
// Match applicable WHERE conditions
$where = array();
if(@$query['where']) // Edit: Made WHERE optional
foreach($where_conditions as $where_condition)
if(preg_match("/^$matches[table_name]\.(.*)$/",$where_condition,$matched))
$where[] = $matched[1];
$where_clause = empty($where) ? null : implode(' AND ',$where);
// We simply ignore $query[select_rows] and use '*' everywhere...
// (kept in its own variable so the parsed $query array is not overwritten)
$table_sql = "SELECT * FROM $matches[table_name]".($where_clause? " WHERE $where_clause" : '');
echo "$table_sql<br/>\n";
// Retrieve table's data
// Fetching the entire table data right away avoids multiplying MySQL
// queries exponentially...
$table = array();
if($results = mysql_query($table_sql))
    while($row = mysql_fetch_array($results, MYSQL_ASSOC))
        $table[] = $row;
// Sort table if applicable
if(!empty($query['order_by']) && preg_match("/^$matches[table_name]\.(.*)$/",$query['order_by'],$matched)) {
$sort_key = $matched[1];
// @todo Do your bubble sort here!
if(@$query['desc']) $table = array_reverse($table);
}
$tables[$matches['table_name']] = $table;
}
// From here, all data is fetched.
// All left to do is the actual jointure.
/**
* Equijoin/Theta-join.
* Joins relation $R and $S where $a from $R compares to $b from $S.
* @param array $R A relation (set of tuples).
* @param array $S A relation (set of tuples).
* @param string $a Attribute from $R to compare.
* @param string $b Attribute from $S to compare.
* @return array A relation resulting from the equijoin/theta-join.
*/
function equijoin($R,$S,$a,$b) {
$T = array();
if(empty($R) or empty($S)) return $T;
foreach($R as $tupleR) foreach($S as $tupleS)
if($tupleR[$a] == @$tupleS[$b])
$T[] = array_merge($tupleR,$tupleS);
return $T;
}
$jointure = array_shift($tables);
if(!empty($tables)) foreach($tables as $table_name => $table)
$jointure = equijoin($jointure, $table,
$from_conditions[$table_name]['column_a'],
$from_conditions[$table_name]['column_b']);
return $jointure;
?>
Good night, and Good luck!
Instead of rewriting the SQL, I think you should create a denormalized articles table and update it at each article insert/delete/update. It will be MUCH simpler and cheaper.
Create and populate it:
create table articles_denormalized
...
insert into articles_denormalized
SELECT articles.*, formats.format_title
FROM articles
INNER JOIN formats ON articles.article_format_id = formats.format_id
INNER JOIN article_categories ON articles.article_id = article_categories.article_id
Now issue the appropriate article insert/update/delete against it and you will have a denormalized table always ready to be queried.