Getting faster results from 2 databases to form 1 resultset - php

So here is my scenario...
The bug_tracker table is on one server and task_tracker is on another.
I want to show a combined result but can't, since they are in two separate databases on remote servers.
So I am calling the task tracker first and then fetching the bug details on each iteration.
$task = oci_parse($task_conn, "select * from task_table where ....");
oci_execute($task);
while ($task_row = oci_fetch_array($task, OCI_ASSOC+OCI_RETURN_NULLS)) {
    $bug = oci_parse($bug_conn, "select * from bug_table where id = " . $task_row['BUGID']);
    oci_execute($bug);
    while ($bug_row = oci_fetch_array($bug, OCI_ASSOC+OCI_RETURN_NULLS)) {
        ... //output
    }
    ... //output
}
But this entire process is very slow, since there is a large number of records and columns.
Is there any way to make it even slightly faster? Note: I don't have access to set up Oracle database links.

You could improve it using an IN clause:
<?php
$bugs = array();
$users = array();
$status = array();
$task = oci_parse($task_conn, "select * from task_table where ....");
oci_execute($task);
while ($task_row = oci_fetch_array($task, OCI_ASSOC+OCI_RETURN_NULLS)) {
    $bugs[] = $task_row['BUGID'];
    $users[] = $task_row['USER'];
    $status[] = $task_row['TASK_STATUS'];
}
// Note: no trailing semicolon inside the statement - OCI8 rejects it
$bug = oci_parse($bug_conn, "select * from bug_table where id IN (" . implode(',', $bugs) . ")");
oci_execute($bug);
while ($bug_row = oci_fetch_array($bug, OCI_ASSOC+OCI_RETURN_NULLS)) {
    // ...
}
?>
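One caveat: Oracle limits a plain IN list to 1000 expressions (ORA-01795), so if the task query can return more IDs than that you would need to chunk the list. A minimal sketch under that assumption, reusing the $bugs array collected above:
<?php
// Oracle raises ORA-01795 for IN lists longer than 1000 expressions,
// so issue one query per chunk of at most 1000 IDs.
foreach (array_chunk($bugs, 1000) as $chunk) {
    $sql = "select * from bug_table where id IN (" . implode(',', $chunk) . ")";
    $bug = oci_parse($bug_conn, $sql);
    oci_execute($bug);
    while ($bug_row = oci_fetch_array($bug, OCI_ASSOC+OCI_RETURN_NULLS)) {
        // ... collect/output as before
    }
}
?>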
On a side note, why are you not using PDO? I believe using it will already give you a performance boost.

PHP is not meant for this kind of operation, nor should you try to write your own join function.
One proper way of solving this issue is to dump the data from both databases into a local database, and do the join there.
You do not need anything fancy for the local database; SQLite3 is probably enough.
Just dump the data from each database into CSV files using a bash script that you put into cron. After the dump, (re)create each table in your SQLite3 database and load the CSVs into those tables. After this you can do the join once, push the result into a new table, and query that freely.
This is what the data warehouse world often refers to as an ETL process, just in this case very, very simplified.
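For illustration, a minimal sketch of the load-and-join step, assuming the cron job has already produced tasks.csv and bugs.csv (the file names and columns here are hypothetical):
<?php
// Hypothetical file and column names - adjust to your real dumps.
$db = new SQLite3('combined.db');
$db->exec('DROP TABLE IF EXISTS tasks');
$db->exec('CREATE TABLE tasks (id INTEGER, bugid INTEGER, status TEXT)');
$db->exec('DROP TABLE IF EXISTS bugs');
$db->exec('CREATE TABLE bugs (id INTEGER, title TEXT)');
$ins = $db->prepare('INSERT INTO tasks VALUES (:id, :bugid, :status)');
$fh = fopen('tasks.csv', 'r');
while (($row = fgetcsv($fh)) !== false) {
    $ins->bindValue(':id', $row[0], SQLITE3_INTEGER);
    $ins->bindValue(':bugid', $row[1], SQLITE3_INTEGER);
    $ins->bindValue(':status', $row[2], SQLITE3_TEXT);
    $ins->execute();
}
fclose($fh);
// ... load bugs.csv into the bugs table the same way ...
// Join once, materialize the result, then query it as often as needed.
$db->exec('DROP TABLE IF EXISTS task_bugs');
$db->exec('CREATE TABLE task_bugs AS SELECT t.id, t.status, b.title FROM tasks t JOIN bugs b ON b.id = t.bugid');
?>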

Related

Execute multiple queries in a single database connection using oracle 10g and php

I want to run multiple SQL queries in a single database connection using Oracle 10g and PHP. At the moment I have to create a database connection for every SQL query. Is there any way to run multiple SQL queries over a single database connection, or do we have to fetch data this way only? Because when we have to run 50 queries, we have to write the block below 50 times.
<?php
include("mydb.php");
// run query
$sql6 = "select * from dat where to_char(WD,'dd/mm')='19/08'";
$stid6 = oci_parse($conn, $sql6);
// set array
$arr6 = array();
if (!$stid6) {
    $e = oci_error($conn);
    trigger_error(htmlentities($e['message'], ENT_QUOTES), E_USER_ERROR);
}
$r6 = oci_execute($stid6);
if (!$r6) {
    $e = oci_error($stid6);
    trigger_error(htmlentities($e['message'], ENT_QUOTES), E_USER_ERROR);
}
// loop through query
while ($row = oci_fetch_array($stid6, OCI_ASSOC)) {
    // add each row returned into an array
    $arr6[] = array($row['WD'], (float)$row['DATA']);
}
oci_free_statement($stid6);
oci_close($conn);
?>
<?php
include("mydb.php");
// run query
$sql7 = "select * from dat where to_char(WD,'dd/mm')='11/03'";
$stid7 = oci_parse($conn, $sql7);
// set array
$arr7 = array();
if (!$stid7) {
    $e = oci_error($conn);
    trigger_error(htmlentities($e['message'], ENT_QUOTES), E_USER_ERROR);
}
$r7 = oci_execute($stid7);
if (!$r7) {
    $e = oci_error($stid7);
    trigger_error(htmlentities($e['message'], ENT_QUOTES), E_USER_ERROR);
}
// loop through query
while ($row = oci_fetch_array($stid7, OCI_ASSOC)) {
    // add each row returned into an array
    $arr7[] = array($row['WD'], (float)$row['DATA']);
}
oci_free_statement($stid7);
oci_close($conn);
?>
................
................
*Pardon me, I forgot to mention that we have to store the day-wise data in different arrays. I mean to say that 11/03's data will be stored in arr1 and 19/08's data will be stored in arr2, not in the same array.
(this should be a comment, but it's a bit long)
I don't want to be disparaging here, but your question is alarmingly naive - so much so that it should be closed as off-topic.
Your code exhibits a lack of understanding of modular programming and variable scope - things that would be covered by day 2 of a programming-from-scratch course - yet it oddly includes some more sophisticated PHP-specific programming alongside appalling SQL. It looks like someone else wrote the code as a quick hack and now you are trying to extend its capabilities.
That you are using an Oracle database raises all sorts of questions about why you are attempting this (Oracle is expensive; how can someone afford that but not afford to provision you with the skills you need?).
The solution to the problem as you have described it is to re-implement the script as a function that takes the OCI connection and SQL statement as parameters, then simply....
<?php
include("mydb.php");
$queries = array(
    "select * from dat where to_char(WD,'dd/mm')='11/03'",
    "select * from dat where to_char(WD,'dd/mm')='19/08'"
);
$results = array();
foreach ($queries as $sql) {
    $results[] = run_query($sql, $conn);
}
oci_close($conn);
exit;

function run_query($sql, $conn)
{
    $stid = oci_parse($conn, $sql);
    // set array
    $arr = array();
    if (!$stid) {
        $e = oci_error($conn);
        trigger_error(htmlentities($e['message'], ENT_QUOTES), E_USER_ERROR);
    }
    ...
    oci_free_statement($stid);
    return $arr;
}
However, since the 2 example queries have exactly the same structure, there are other ways to get the results of what are multiple queries - merging the SQL statements into a single select using OR or UNION, or using parameterized queries. Since the code you've shown us simply throws away the results, it's hard to say how you should approach the task.
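For illustration, a hedged sketch of the parameterized variant - one parsed statement re-executed with a bind variable per day value (the day list here is made up):
<?php
include("mydb.php");
// Parse once; bind and re-execute for each day value.
$stid = oci_parse($conn, "select * from dat where to_char(WD,'dd/mm') = :day");
$days = array('11/03', '19/08'); // hypothetical list
$results = array();
foreach ($days as $day) {
    oci_bind_by_name($stid, ':day', $day);
    oci_execute($stid);
    $arr = array();
    while ($row = oci_fetch_array($stid, OCI_ASSOC)) {
        $arr[] = array($row['WD'], (float)$row['DATA']);
    }
    $results[$day] = $arr; // day-wise arrays, as the question asks for
}
oci_free_statement($stid);
oci_close($conn);
?>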

Query for multiple SQLite tables with PHP

Forgive me if my question sounds stupid, as I'm a beginner with SQLite, but I'm looking for the simplest SQLite query PHP solution that will give me full-text results from at least three separate SQLite databases.
Google seems to give me links to articles without examples and I have to start from somewhere.
I have three databases:
domains.db (url_table, title_table, date_added_table)
extras.db (same tables as the first db)
admin.db (url_table, admin_notes_table)
Now I need a PHP query script that will execute a query and give me results from domains.db, but also any matches from extras.db and admin.db.
I'm just trying to grasp the basics of it and am looking for a starting point where I can at least study and learn from the first working code.
First, you connect to 'domains.db' and query what you need, saving the result however you want; then, if the first query returned rows, you connect to the others and query them.
$db1 = new SQLite3('domains.db');
$results1 = $db1->query('SELECT bar FROM foo');
if ($results1->numColumns() && $results1->columnType(0) != SQLITE3_NULL) {
    // have rows
    // so, again, $result2 = $db2->query('query');
    // ....
} else {
    // zero rows
}
// You can work with the data like this:
//while ($row = $results1->fetchArray()) {
//    var_dump($row);
//}
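A hedged end-to-end sketch of that flow, guessing at the schema (it assumes each file has a url_table with a url column - adjust to your real tables):
$needle = '%example%';
$db1 = new SQLite3('domains.db');
$stmt = $db1->prepare('SELECT * FROM url_table WHERE url LIKE :q');
$stmt->bindValue(':q', $needle, SQLITE3_TEXT);
$res = $stmt->execute();
$rows = array();
while ($row = $res->fetchArray(SQLITE3_ASSOC)) {
    $rows[] = $row;
}
if (!empty($rows)) {
    // Matches in domains.db, so also check the other two files.
    foreach (array('extras.db', 'admin.db') as $file) {
        $db = new SQLite3($file);
        $stmt = $db->prepare('SELECT * FROM url_table WHERE url LIKE :q');
        $stmt->bindValue(':q', $needle, SQLITE3_TEXT);
        $res = $stmt->execute();
        while ($row = $res->fetchArray(SQLITE3_ASSOC)) {
            $rows[] = $row;
        }
        $db->close();
    }
}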
Source:
http://php.net/manual/en/sqlite3.query.php
http://php.net/manual/en/class.sqlite3result.php
Edit: A better approach would be to use PDO; you can find a lot of tutorials and help on using it.
$db = new PDO('sqlite:mydatabase.db');
$result = $db->query('SELECT * FROM MyTable');
foreach ($result as $row) {
    echo 'Example content: ' . $row['column1'];
}
You can also check the row count (note: sqlite_num_rows() belongs to the legacy SQLite2 extension and does not work with PDO; counting the fetched rows is one way):
$row_count = count($db->query('SELECT * FROM MyTable')->fetchAll());
Source: http://blog.digitalneurosurgeon.com/?p=947

Splitting a string of values like 1030:0,1031:1,1032:2 and storing data in database

I have a bunch of photos on a page and using jQuery UI's Sortable plugin, to allow for them to be reordered.
When my sortable function fires, it writes a new order sequence:
1030:0,1031:1,1032:2,1040:3,1033:4
Each item of the comma-delimited string consists of the photo ID and the order position, separated by a colon. When the user has completely finished their reordering, I'm posting this order sequence to a PHP page via AJAX, to store the changes in the database. Here's where I get into trouble.
I have no problem getting my script to work, but I'm pretty sure it's the incorrect way to achieve what I want, and will suffer hugely in performance and resources - I'm hoping somebody could advise me as to the best approach.
This is my PHP script that deals with the sequence:
if ($sorted_order) {
    $exploded_order = explode(',', $sorted_order);
    foreach ($exploded_order as $order_part) {
        $exploded_part = explode(':', $order_part);
        $part_count = 0;
        foreach ($exploded_part as $part) {
            $part_count++;
            if ($part_count == 1) {
                $photo_id = $part;
            } elseif ($part_count == 2) {
                $order = $part;
            }
        }
        // Update runs once per photo, after both parts are known
        $SQL  = "UPDATE article_photos ";
        $SQL .= "SET order_pos = :order_pos ";
        $SQL .= "WHERE photo_id = :photo_id;";
        ... rest of PDO stuff ...
    }
}
My concerns arise from the nested foreach loops and also from running so many database updates. If a given sequence contained 150 items, would this script cry for help? If so, how could I improve it?
** This is for an admin page, so it won't be heavily abused **
you can use one update, with some clever code like so:
create the array $data['order'] in the loop, then:
$q = "UPDATE article_photos SET order_pos = (CASE photo_id ";
foreach ($data['order'] as $sort => $id) {
    $q .= " WHEN {$id} THEN {$sort}";
}
$q .= " END ) WHERE photo_id IN (" . implode(",", $data['order']) . ")";
a little clearer perhaps:
UPDATE article_photos SET order_pos = (CASE photo_id
    WHEN 1 THEN 999
    WHEN 2 THEN 1000
    WHEN 3 THEN 1001
END)
WHERE photo_id IN (1,2,3)
I use this approach for exactly what you're doing: updating sort orders.
No need for the second foreach: you know it's going to be two parts if your data passes validation (I'm assuming you validated this. If not: you should =) so just do:
if (count($exploded_part) == 2) {
    $id = $exploded_part[0];
    $seq = $exploded_part[1];
    /* rest of code */
} else {
    /* error - data does not conform despite validation */
}
As for update hammering: do your DB updates in a transaction. Your db will queue the ops, but not commit them to the main DB until you commit the transaction, at which point it'll happily do the update "for real" at lightning speed.
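A hedged sketch of that, assuming a PDO connection in $db and the id:position pairs parsed as above:
$db->beginTransaction();
try {
    $stmt = $db->prepare('UPDATE article_photos SET order_pos = :order_pos WHERE photo_id = :photo_id');
    foreach (explode(',', $sorted_order) as $pair) {
        list($photo_id, $order) = explode(':', $pair);
        $stmt->execute(array(':order_pos' => $order, ':photo_id' => $photo_id));
    }
    $db->commit(); // all updates land together
} catch (Exception $e) {
    $db->rollBack(); // undo everything if any single update fails
    throw $e;
}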
I suggest making your script even simpler and changing the names of the variables, so the code is much more readable:
$parts = explode(',', $sorted_order);
foreach ($parts as $part) {
    list($id, $position) = explode(':', $part);
    // Now you can work with $id and $position
}
More info about list: http://php.net/manual/en/function.list.php
Also, about performance and your data structure:
The way you store your data is not perfect, but you will not suffer any performance issues that way; you send less data, so less overhead overall.
However, the drawback of your data structure is that most probably you will be unable to establish relationships between tables, make joins, or alter the table structure in a correct way.

Query on large mysql database

I've got a script which is supposed to run through a MySQL database and perform a certain 'test' on the cases. Simplified, the database contains records which represent trips that have been made by persons. Each record is a single trip. But I want to use only round-way trips, so I need to search the database and match two trips to each other: the trip to and the trip from a certain location.
The script is working fine. The problem is that the database contains more than 600,000 cases. I know this should be avoided if possible, but for the purpose of this script and the use of the database records later on, everything has to stick together.
Executing the script takes hours right now, when executing on my iMac using MAMP. Of course I made sure that it can use a lot of memory etcetera.
My question is how I could speed things up; what's the best approach to do this?
Here's the script I have right now:
$table = $_GET['table'];
$output = '';
// Select all cases that have not been marked as invalid in a previous test
$query = "SELECT persid, ritid, vertpc, aankpc, jaar, maand, dag FROM MON.$table WHERE reasonInvalid != '1' OR reasonInvalid IS NULL";
$result = mysql_query($query) or die($output .= mysql_error());
$totalCountValid = 0;
$totalCountInvalid = 0;
$totalCount = 0;
// For each record:
while ($row = mysql_fetch_array($result)) {
    $totalCount += 1;
    // Do another query: get all rows for this person ID that share postal codes.
    // Postal codes are reversed between the two trips.
    $persid = $row['persid'];
    $ritid = $row['ritid'];
    $pcD = $row['vertpc'];
    $pcA = $row['aankpc'];
    $jaar = $row['jaar'];
    $maand = $row['maand'];
    $dag = $row['dag'];
    $thecountquery = "SELECT * FROM MON.$table WHERE persid=$persid AND vertpc=$pcA AND aankpc=$pcD AND jaar = $jaar AND maand = $maand AND dag = $dag";
    $thecount = mysql_num_rows(mysql_query($thecountquery));
    if ($thecount >= 1) {
        // No worries, this person ID has multiple trips attached
        $totalCountValid += 1;
    } else {
        // Ow my, the case is invalid!
        $totalCountInvalid += 1;
        // Call markInvalid from functions.php
        markInvalid($table, '2', 'ritid', $ritid);
    }
}
// Echo the result
$output .= 'Total cases: ' . $totalCount . '<br>Valid: ' . $totalCountValid . '<br>Invalid: ' . $totalCountInvalid;
echo $output;
Your basic problem is that you are doing the following:
1) Getting all cases that haven't been marked as invalid.
2) Looping through the cases obtained in step 1) and running a second query per case.
What you can easily do is combine the queries written for 1) and 2) into a single query and loop over the data. This will speed things up quite a bit.
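For illustration, a hedged sketch of such a combined query - a single self LEFT JOIN that flags each trip missing its return trip in one pass (untested against your schema, and duplicate rows are possible if several return trips match):
<?php
// One round trip instead of 600,000+1: LEFT JOIN each trip to its reverse trip.
$query = "SELECT a.ritid, b.ritid IS NULL AS isInvalid
          FROM MON.$table a
          LEFT JOIN MON.$table b
            ON b.persid = a.persid
           AND b.vertpc = a.aankpc AND b.aankpc = a.vertpc
           AND b.jaar = a.jaar AND b.maand = a.maand AND b.dag = a.dag
          WHERE a.reasonInvalid != '1' OR a.reasonInvalid IS NULL";
$result = mysql_query($query) or die(mysql_error());
while ($row = mysql_fetch_array($result)) {
    if ($row['isInvalid']) {
        markInvalid($table, '2', 'ritid', $row['ritid']);
    }
}
?>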
Also bear in mind the following tips.
1) Selecting all columns is not at all a good thing to do. It takes an ample amount of time for the data to traverse the network. I would recommend replacing the wildcard with only the columns you really need:
SELECT <only_the_columns_you_need> -- rather than SELECT *
2) Use indexes - sparingly, efficiently and appropriately. Understand when to use them and when not to.
3) Use views if you can.
4) Enable MySQL slow query log to understand which queries you need to work on and optimize.
log_slow_queries = /var/log/mysql/mysql-slow.log
long_query_time = 1
log-queries-not-using-indexes
5) Use correct MySQL field types and the storage engine (Very very important)
6) Use EXPLAIN to analyze your query - EXPLAIN is a useful command in MySQL which can provide you some great details about how a query is run, which index is used, how many rows it needs to check through, and whether it needs to do file sorts, temporary tables and other nasty things you want to avoid.
Good luck.

Problem: Writing a MySQL parser to split JOIN's and run them as individual queries (denormalizing the query dynamically)

I am trying to figure out a script to take a MySQL query and turn it into individual queries, i.e. denormalizing the query dynamically.
As a test I have built a simple article system that has 4 tables:
articles (article_id, article_format_id, article_title, article_body, article_date)
article_categories (article_id, category_id)
categories (category_id, category_title)
formats (format_id, format_title)
An article can be in more than one category but only have one format. I feel this is a good example of a real-life situation.
On the category page, which lists all of the articles (pulling in the format_title as well), this could easily be achieved with the following query:
SELECT articles.*, formats.format_title
FROM articles
INNER JOIN formats ON articles.article_format_id = formats.format_id
INNER JOIN article_categories ON articles.article_id = article_categories.article_id
WHERE article_categories.category_id = 2
ORDER BY articles.article_date DESC
However the script I am trying to build would receive this query, parse it and run the queries individually.
So in this category page example the script would effectively run this (worked out dynamically):
// Select article_categories
$sql = "SELECT * FROM article_categories WHERE category_id = 2";
$query = mysql_query($sql);
while ($row_article_categories = mysql_fetch_array($query, MYSQL_ASSOC)) {
    // Select articles
    $sql2 = "SELECT * FROM articles WHERE article_id = " . $row_article_categories['article_id'];
    $query2 = mysql_query($sql2);
    while ($row_articles = mysql_fetch_array($query2, MYSQL_ASSOC)) {
        // Select formats
        $sql3 = "SELECT * FROM formats WHERE format_id = " . $row_articles['article_format_id'];
        $query3 = mysql_query($sql3);
        $row_formats = mysql_fetch_array($query3, MYSQL_ASSOC);
        // Merge articles and formats
        $row_articles = array_merge($row_articles, $row_formats);
        // Add to array
        $out[] = $row_articles;
    }
}
// Sort articles by date
foreach ($out as $key => $row) {
    $arr[$key] = $row['article_date'];
}
array_multisort($arr, SORT_DESC, $out);
// Output articles - this would not be part of the script obviously, it should just return the $out array
foreach ($out as $row) {
    echo '<p>' . $row['article_title'] . ' <i>(' . $row['format_title'] . ')</i><br />' . $row['article_body'] . '<br /><span class="date">' . date("F jS Y", strtotime($row['article_date'])) . '</span></p>';
}
The challenge here is working out the correct queries in the right order, since column names for SELECT and JOINs can be put in any order in the query (this is what MySQL and other SQL databases translate so well), and then working out the information logic in PHP.
I am currently parsing the query using SQL_Parser, which works well at splitting up the query into a multi-dimensional array, but working out the stuff mentioned above is the headache.
Any help or suggestions would be much appreciated.
From what I gather you're trying to put a layer between a 3rd-party forum application that you can't modify (obfuscated code perhaps?) and MySQL. This layer will intercept queries, re-write them to be executable individually, and generate PHP code to execute them against the database and return the aggregate result. This is a very bad idea.
It seems strange that you imply the impossibility of adding code and simultaneously suggest generating code to be added. Hopefully you're not planning on using something like funcall to inject code. This is a very bad idea.
The calls from others to avoid your initial approach and focus on the database are very sound advice. I'll add my voice to that hopefully growing chorus.
We'll assume some constraints:
You're running MySQL 5.0 or greater.
The queries cannot change.
The database tables cannot be changed.
You already have appropriate indexes in place for the tables the troublesome queries are referencing.
You have triple-checked the slow queries (and run EXPLAIN) hitting your DB and have attempted to setup indexes that would help them run faster.
The load the inner joins are placing on your MySQL install is unacceptable.
Three possible solutions:
You could deal with this problem easily by investing money into your current database: upgrade the hardware it runs on to something with more cores, more RAM (as much as you can afford), and faster disks. If you've got the money, Fusion-io's products come highly recommended for this sort of thing. This is probably the simplest of the three options I'll offer.
Setup a second master MySQL database and pair it with the first. Make sure you have the ability to force AUTO_INCREMENT id alternation (one DB uses even id's, the other odd). This doesn't scale forever, but it does offer you some breathing room for the price of the hardware and rack space. Again, beef up the hardware. You may have already done this, but if not it's worth consideration.
Use something like dbShards. You still need to throw more hardware at this, but you have the added benefit of being able to scale beyond two machines and you can buy lower cost hardware over time.
To improve database performance you typically look for ways to:
Reduce the number of database calls
Making each database call as efficient as possible (via good design)
Reduce the amount of data to be transferred
...and you are doing the exact opposite? Deliberately?
On what grounds?
I'm sorry, but you are doing this entirely wrong, and every single problem you encounter down this road will be a consequence of that first decision to implement a database engine outside of the database engine. You will be forced to work around work-arounds all the way to the delivery date (if you get there).
Also, we are talking about a forum? I mean, come on! Even on the most "web-scale-awesome-sauce" forums we're talking about less than what, 100 tps on average? You could do that on your laptop!
My advice is to forget about all this and implement things the simplest possible way. Then cache the aggregates (most recent, popular, statistics, whatever) in the application layer. Everything else in a forum is already primary-key lookups.
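For illustration only, a hedged sketch of that aggregate caching using APCu (the key, query, and TTL are all made up):
<?php
// Serve an expensive aggregate from cache for 60 seconds instead of recomputing per request.
function get_most_recent_articles(PDO $db) {
    $rows = apcu_fetch('most_recent_articles', $ok);
    if ($ok) {
        return $rows;
    }
    $rows = $db->query('SELECT article_id, article_title FROM articles ORDER BY article_date DESC LIMIT 10')->fetchAll();
    apcu_store('most_recent_articles', $rows, 60); // TTL in seconds
    return $rows;
}
?>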
I agree it sounds like a bad choice, but I can think of some situations where splitting a query could be useful.
I would try something similar to this, relying heavily on regular expressions to parse the query. It would work in a very limited set of cases, but its support could be expanded progressively when needed.
<?php
/**
* That's a weird problem, but an interesting challenge!
* @link http://stackoverflow.com/questions/5019467/problem-writing-a-mysql-parser-to-split-joins-and-run-them-as-individual-query
*/
// Taken from the given example:
$sql = "SELECT articles.*, formats.format_title
FROM articles
INNER JOIN formats ON articles.article_format_id = formats.format_id
INNER JOIN article_categories ON articles.article_id = article_categories.article_id
WHERE article_categories.category_id = 2
ORDER BY articles.article_date DESC";
// Parse query
// (Limited to the clauses that are present in the example...)
// Edit: Made WHERE optional
if(!preg_match('/^\s*'.
'SELECT\s+(?P<select_rows>.*[^\s])'.
'\s+FROM\s+(?P<from>.*[^\s])'.
'(?:\s+WHERE\s+(?P<where>.*[^\s]))?'.
'(?:\s+ORDER\s+BY\s+(?P<order_by>.*[^\s]))?'.
'(?:\s+(?P<desc>DESC))?'.
'(.*)$/is',$sql,$query)
) {
trigger_error('Error parsing SQL!',E_USER_ERROR);
return false;
}
## Dump matches
#foreach($query as $key => $value) if(!is_int($key)) echo "\"$key\" => \"$value\"<br/>\n";
/* We get the following matches:
"select_rows" => "articles.*, formats.format_title"
"from" => "articles INNER JOIN formats ON articles.article_format_id = formats.format_id INNER JOIN article_categories ON articles.article_id = article_categories.article_id"
"where" => "article_categories.category_id = 2"
"order_by" => "articles.article_date"
"desc" => "DESC"
/**/
// Will only support WHERE conditions separated by AND that are to be
// tested on a single individual table.
if(@$query['where']) // Edit: Made WHERE optional
$where_conditions = preg_split('/\s+AND\s+/is',$query['where']);
// Retrieve individual table information & data
$tables = array();
$from_conditions = array();
$from_tables = preg_split('/\s+INNER\s+JOIN\s+/is',$query['from']);
foreach($from_tables as $from_table) {
if(!preg_match('/^(?P<table_name>[^\s]*)'.
'(?P<on_clause>\s+ON\s+(?P<table_a>.*)\.(?P<column_a>.*)\s*'.
'=\s*(?P<table_b>.*)\.(?P<column_b>.*))?$/im',$from_table,$matches)
) {
trigger_error("Error parsing SQL! Unexpected format in FROM clause: $from_table", E_USER_ERROR);
return false;
}
## Dump matches
#foreach($matches as $key => $value) if(!is_int($key)) echo "\"$key\" => \"$value\"<br/>\n";
// Remember on_clause for later jointure
// We do assume each INNER JOIN's ON clause compares left table to
// right table. Forget about parsing more complex conditions in the
// ON clause...
if(@$matches['on_clause'])
$from_conditions[$matches['table_name']] = array(
'column_a' => $matches['column_a'],
'column_b' => $matches['column_b']
);
// Match applicable WHERE conditions
$where = array();
if(@$query['where']) // Edit: Made WHERE optional
foreach($where_conditions as $where_condition)
if(preg_match("/^$matches[table_name]\.(.*)$/",$where_condition,$matched))
$where[] = $matched[1];
$where_clause = empty($where) ? null : implode(' AND ',$where);
// We simply ignore $query[select_rows] and use '*' everywhere...
// (Use a separate variable so the parsed $query array is not clobbered.)
$table_sql = "SELECT * FROM $matches[table_name]".($where_clause? " WHERE $where_clause" : '');
echo "$table_sql<br/>\n";
// Retrieve table's data
// Fetching the entire table data right away avoids multiplying MySQL
// queries exponentially...
$table = array();
if($results = mysql_query($table_sql))
while($row = mysql_fetch_array($results, MYSQL_ASSOC))
$table[] = $row;
// Sort table if applicable
if(preg_match("/^$matches[table_name]\.(.*)$/",$query['order_by'],$matched)) {
$sort_key = $matched[1];
// @todo Do your bubble sort here!
if(@$query['desc']) $table = array_reverse($table);
}
$tables[$matches['table_name']] = $table;
}
// From here, all data is fetched.
// All left to do is the actual jointure.
/**
* Equijoin/Theta-join.
* Joins relation $R and $S where $a from $R compares to $b from $S.
* @param array $R A relation (set of tuples).
* @param array $S A relation (set of tuples).
* @param string $a Attribute from $R to compare.
* @param string $b Attribute from $S to compare.
* @return array A relation resulting from the equijoin/theta-join.
*/
function equijoin($R,$S,$a,$b) {
$T = array();
if(empty($R) or empty($S)) return $T;
foreach($R as $tupleR) foreach($S as $tupleS)
if($tupleR[$a] == @$tupleS[$b])
$T[] = array_merge($tupleR,$tupleS);
return $T;
}
$jointure = array_shift($tables);
if(!empty($tables)) foreach($tables as $table_name => $table)
$jointure = equijoin($jointure, $table,
$from_conditions[$table_name]['column_a'],
$from_conditions[$table_name]['column_b']);
return $jointure;
?>
Good night, and Good luck!
Instead of the SQL rewriting, I think you should create a denormalized articles table and update it on each article insert/delete/update. It will be MUCH simpler and cheaper.
First create and populate it:
create table articles_denormalized
...
insert into articles_denormalized
SELECT articles.*, formats.format_title
FROM articles
INNER JOIN formats ON articles.article_format_id = formats.format_id
INNER JOIN article_categories ON articles.article_id = article_categories.article_id
Now issue the appropriate article insert/update/delete statements against it and you will have a denormalized table that is always ready to be queried.
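A hedged sketch of the insert-side upkeep (delete/update are analogous; this assumes the application owns all writes, and $db is a hypothetical PDO connection):
<?php
// After inserting a new article, mirror its joined rows into the denormalized table.
function insert_article_denormalized(PDO $db, $article_id) {
    $stmt = $db->prepare(
        'INSERT INTO articles_denormalized
         SELECT articles.*, formats.format_title
         FROM articles
         INNER JOIN formats ON articles.article_format_id = formats.format_id
         INNER JOIN article_categories ON articles.article_id = article_categories.article_id
         WHERE articles.article_id = :id'
    );
    $stmt->execute(array(':id' => $article_id));
}
?>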
