Can PHP arrays do this? - php

Let's say I have a $friends array with 2,000 different friendID numbers, and a $bulletins array with 10,000 bulletinID numbers. Each entry in the $bulletins array also carries the userID of whoever posted the bulletin.
Is it possible to get all the bulletinID numbers whose userID matches a userID in the $friends array? And if it is possible, would it be fast, slow, or just generally a bad method? I am trying to add bulletin-type posts to my site and only show the ones posted by a friend of the user. Some users have a few thousand friends, and there could be thousands of bulletins, but a user is only allowed to view some of them.
Also, if this is possible, could I limit it to only the first 50 bulletin IDs that match a friendID?

Where are you getting these arrays of thousands of friends/bulletins? If the answer is a relational database (MySQL, PostgreSQL), then this should be done using a SQL query as it is quite trivial, and much more efficient than anything you could do in PHP.
Here is an example of how this could be done in SQL:
SELECT
    posts.id
FROM posts
-- friend_id is an assumed column name: the user on the other side of the friendship
JOIN user_friends ON user_friends.friend_id = posts.user_id
WHERE posts.type = 'bulletin'
    AND user_friends.user_id = 7
LIMIT 50;
Obviously, this was written with no knowledge of your actual database structure (if any), so it will not work as-is, but it should put you on the right path.

OK, so it sounds like you have an array of friend ids that is not associative, i.e. array(0 => 'userid0', 1 => 'userid1', etc.), and an array of bulletin ids that is associative, i.e. array('bulletin1' => 'userid1', 'bulletin2' => 'userid2', etc.).
Going with that assumption, you can get all the matching bulletins using array_intersect(). You can then take the first fifty bulletin keys with array_slice():
$matchingBulletins = array_intersect($bulletins, $friends);
$first50 = array_slice(array_keys($matchingBulletins),0,50);
It sounds like you might be getting this data out of a database, however, in which case it would be much more prudent to filter the results in the query and avoid returning 10,000 ids each time. You could do the sorting and filtering using JOINs and WHEREs on the right tables.

I'm assuming here that your $friends array is just an array of ints and that each item in $bulletins is an array with a userId and some extra fields.
$len = count($bulletins);
$matchedBulletins = array();
for ($i = 0; $i < $len; $i++) {
    if (in_array($bulletins[$i]['userId'], $friends)) {
        $matchedBulletins[] = $bulletins[$i];
    }
}
If you want to limit this array to only the first 50 records, just add a condition inside the loop:
$len = count($bulletins);
$matchedBulletins = array();
$bulletinsCount = 0;
for ($i = 0; $i < $len; $i++) {
    if (in_array($bulletins[$i]['userId'], $friends)) {
        $matchedBulletins[] = $bulletins[$i];
        $bulletinsCount++;
        if ($bulletinsCount == 50) {
            break;
        }
    }
}
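Since in_array() scans $friends linearly on every iteration, the loop above can do up to 2,000 x 10,000 comparisons in the worst case. A minimal sketch of one common alternative, assuming the same array shapes as above (the sample data and the function name firstFriendBulletins are made up for illustration), flips $friends once so each membership test becomes a hash lookup with isset():

```php
<?php
// Hypothetical helper: returns up to $max bulletins posted by a friend.
function firstFriendBulletins(array $bulletins, array $friends, $max = 50)
{
    // Flip once: friend IDs become keys, so isset() is a constant-time lookup.
    $friendLookup = array_flip($friends);
    $matched = array();
    foreach ($bulletins as $bulletin) {
        if (isset($friendLookup[$bulletin['userId']])) {
            $matched[] = $bulletin;
            if (count($matched) === $max) {
                break; // stop as soon as we have enough
            }
        }
    }
    return $matched;
}

// Made-up sample data.
$friends = array(3, 7, 42);
$bulletins = array(
    array('bulletinId' => 100, 'userId' => 7),
    array('bulletinId' => 101, 'userId' => 8),
    array('bulletinId' => 102, 'userId' => 42),
    array('bulletinId' => 103, 'userId' => 3),
);
$result = firstFriendBulletins($bulletins, $friends, 2);
foreach ($result as $b) {
    echo $b['bulletinId'], "\n";
}
```

This still walks $bulletins in PHP, so if the data lives in a database the SQL approach remains the better fit; the sketch only removes the repeated scan of $friends.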

If you'd post a bit of each array (not all 10,000 items, the first 10 would do) you may get more bites.
Check out array_search() in the meantime.

Related

How to select random rows from a table in MySQL

I am creating a project which involves getting some questions from mysql database. For instance, if I have 200 questions in my database, I want to randomly choose 20 questions in such a way that no one question will be repeated twice. That is, I want to be able to have an array of 20 different questions from the 200 I have every time the user tries to get the list of questions to answer. I will really appreciate your help.
SELECT * FROM questions ORDER BY RAND() LIMIT 20;
PS: This method is not practical for very big tables, because ORDER BY RAND() has to generate a random value for every row and sort them all.
Use Google to find a function to create an array with 20 unique numbers, with a minimum and a maximum. Use this array to prepare an SQL query such as:
expression IN (value1, value2, .... value_n);
More on the SQL here.
Possible array filling function here too.
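As a rough sketch of that idea, assuming the question IDs are contiguous from 1 to 200 and the column is called id (both are assumptions, as is the helper name randomUniqueIds), the unique numbers and the IN list could be built like this:

```php
<?php
// Hypothetical helper: pick $count distinct integers between $min and $max.
function randomUniqueIds($min, $max, $count)
{
    $all = range($min, $max);
    shuffle($all);                       // random order, no repeats
    return array_slice($all, 0, $count); // keep the first $count
}

$ids = randomUniqueIds(1, 200, 20);
// The IDs are integers generated here, so they are safe to inline.
$sql = 'SELECT * FROM questions WHERE id IN (' . implode(', ', $ids) . ')';
echo $sql, "\n";
```

If rows have been deleted and the IDs have gaps, some of the generated numbers will not match anything, so fewer than 20 rows may come back.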
Assuming you have contiguously numbered questions in your database, you just need a list of 20 random numbers. Also assuming you want the user to be able to take more than one test and get another 20 questions without duplicates, you could start with a randomised array of 200 numbers and select blocks of 20 sequentially from that set, i.e.
$maxQuestion = 200;
$startQuestion = 0; // zero-based offset into $numberlist
$numberlist = range(1, $maxQuestion);
shuffle($numberlist);
function getQuestionSet( $noOfQuestions )
{
    global $numberlist, $maxQuestion, $startQuestion;
    // Not enough unused numbers left for a full block: reshuffle and start over
    if( ($startQuestion + $noOfQuestions) > $maxQuestion )
    {
        echo "shuffle...\n";
        shuffle($numberlist);
        $startQuestion = 0;
    }
    $start = $startQuestion;
    $startQuestion += $noOfQuestions;
    return array_slice($numberlist, $start, $noOfQuestions);
}
// debug...
for($i=0; $i<42; $i++)
{
$questionset= getQuestionSet( 20 );
foreach( $questionset as $num )
echo $num." ";
echo "\n";
}
then use $questionset to retrieve your questions
If you know how many rows there are in the table, you could use LIMIT to your advantage by giving it a random offset. The syntax is LIMIT offset,count. Example:
<?php
$totalRows = 200; // get this value dynamically or whatever...
$limit = 2; // number of rows to select
$rand = mt_rand(0, $totalRows - $limit); // largest offset that still leaves $limit rows
$query = 'SELECT * FROM `table` LIMIT '.$rand.','.$limit;
// execute query
?>
This should be safe for big tables; however, it will select adjacent rows. You could then mix up the result set via array_rand or shuffle:
<?php
// ... continued
$resultSet = $pdoStm->fetchAll();
// array_rand returns random keys, so it only helps if you fetched more rows than $limit
$randResultKeys = array_rand($resultSet, $limit);
shuffle($resultSet); // or randomize the order of the fetched rows in place
?>

Best way to show large amount of data

What is the best way to handle a large amount of data entries on a web page?
Let's assume I have a database table with 5000 records containing song_name, author_name, song_id, and posted_by. I want to build a playlist with all the songs on a single page. That page also has a player that plays songs according to the playlist entries shown on the page.
I have tried pulling all 5000 entries from that table and building a JavaScript object with them; from that object I built the playlist, the playlist search, and so forth. But that takes a very large amount of resources (on the user's end) and a lot of page loading time (because there are a lot of entries!), and the page is very slow.
Is it better to load all the data into an object and paginate in JavaScript, 100 playlist records at a time, or is it better to get the results paginated from the database and just update the playlist? (Bear in mind that if the player has the shuffle button activated, it may shuffle to ANY song in the user's database, not only the songs currently visible in the playlist.)
I think pagination is your best option. Just create a limit of 100 (for example) and use AJAX to extract the next 100. If the client turns on shuffle, just send another request to the server and let it call a function that does the following:
Count total rows in database
Use a randomize function to get 100 random numbers
Now create a slightly tricky query to get records from the db based on their row number:
function getRandomTracks($limit) {
    $total = $this->db->count_all('table_tracks');
    // Never ask for more unique row numbers than there are rows
    $limit = min($limit, $total);
    if ($limit < 1) {
        return array();
    }
    // Get random values. Speed optimization: predetermine random row numbers in PHP
    $arr = array();
    while (count($arr) < $limit) {
        $x = mt_rand(1, $total); // row numbers run from 1 to $total
        if (!isset($arr[$x])) { // random value must be unique
            // using the random value as key and checking with isset is faster than in_array
            $arr[$x] = true;
        }
    }
    // Create IN string
    $in = implode(',', array_keys($arr));
    // Selection based on random row numbers
    $query = $this->db->query('SELECT * FROM
        (SELECT @row := @row + 1 AS `row`, t.*
        FROM `table_tracks` t, (SELECT @row := 0) r) AS tracks
        WHERE `row` IN(' . $in . ')');
    return $query->result();
}
I'm using a similar function, also to deal with a large number of tracks (over 300,000), so I'm sure this will work!
It is very hard to load the entire data set into the client, whether or not you use jQuery or another library: the limiting factor is not the code or SDK you use but the browser itself!
By the way, Chrome is the fastest and IE (before version 10) is the slowest.
You can refer to the links below:
http://www.infoq.com/news/2010/09/IE-Subsystems-Spends-Time
http://www.nczonline.net/blog/2009/01/05/what-determines-that-a-script-is-long-running/
http://www.zdnet.com/browser-benchmarks-ie-firefox-opera-and-safari-3039420732/
http://msdn.microsoft.com/en-us/library/Bb250448
http://support.microsoft.com/kb/175500/en-us
So what you should do is move your client logic to the server side, just as other people are suggesting.
As for paginating all your data with JavaScript alone: that is essentially the same as not paginating at all.
Use AJAX to load the data in steps of 100 (or more, just try it out).
Loop over your record sets and advance the offset each time:
<?php
$step = 100;  // rows per chunk
$offset = 0;  // where the current chunk starts
// mysql_* was removed in PHP 7; $dblk is assumed to be a mysqli connection here
$sql = "SELECT COUNT(*) FROM $MySQL_DB.songs";
$data = mysqli_query($dblk, $sql);
$row = mysqli_fetch_row($data);
$number_of_entries = (int) $row[0];
while ($offset < $number_of_entries)
{
    // LIMIT takes offset,count, so the count stays at $step
    $sql = "SELECT * FROM $MySQL_DB.songs ORDER BY id DESC LIMIT $offset, $step";
    $data = mysqli_query($dblk, $sql);
    while ($row = mysqli_fetch_array($data))
    {
        // build an array and return this to AJAX
    }
    $offset += $step;
}
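For the AJAX side, each request only needs a single chunk rather than the whole loop. A minimal sketch, assuming the client sends a 1-based page number (the function name limitClauseForPage and the default page size are illustrative, not taken from the answers above):

```php
<?php
// Hypothetical helper: build the LIMIT clause for a 1-based page number.
function limitClauseForPage($page, $perPage = 100)
{
    $page = max(1, (int) $page);       // clamp bad input to the first page
    $perPage = max(1, (int) $perPage);
    $offset = ($page - 1) * $perPage;  // rows to skip
    return "LIMIT $offset, $perPage";
}

echo limitClauseForPage(1), "\n"; // LIMIT 0, 100
echo limitClauseForPage(3), "\n"; // LIMIT 200, 100
```

Casting to int before building the clause also keeps request parameters from being inlined into the SQL as raw text.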
Try this: http://www.datatables.net/
I'm not sure, but it may work for you.

mySQL & PHP Looping

I have been given access to a third party's database and wish to create a tool using their information. The database, designed for their original purpose, is very large and segregated.
From the schema below, I need to complete the following task:
Look up the item in invTypes, then check both invTypeMaterials and ramTypeRequirements to see if any materials are needed to build the item. If so, look up each of those materials in invTypes and repeat the process to see whether those in turn need components. This loop keeps going until the check on both invTypeMaterials and ramTypeRequirements is false. That can be 5 or 6 levels deep, with 5 or 6 items to check per level, so it could be 1561 loops, assuming 1 loop for the original item, then 5 loops per material, of which there are 5, 5 times.
Now I tried to write the code and came up with the following:
$materialList = array();
function getList($dbc, $item) {
global $materialList;
// Obtain initial material list
$materials = materialList($dbc, $item);
// For each row in the database
while ($material == mysqli_fetch_array($materials)) {
// Check if there are any sub materials required
if (subList($dbc, $material['ID'])) {
// If so then recurse over the list the given quantity (it has already done it once)
for ($i = 0; $i < $material['Qty'] - 1; $i++) {
if (!subList($dbc, $material['ID'])) {
break;
}
}
} else {
// If there are no further materials then this is the base material so add to the array.
$materialList .= array(
"Name" => $mMaterial['Name'],
"Qty" => $mMaterial['Qty'],
"ID" => $material['ID']
);
}
}
return $materialList;
}
function subList($dbc, $item) {
global $materialList;
// Query the material incase it require further building
$mMaterials = materialList($dbc, $item['ID']);
// If the database returns any rows, then it must have more sub-materials required
if (mysqli_num_rows($mMaterials) > 0) {
// Check the sub-materials to see if they intern require futher materials
if (subList($dbc, $material['ID'])) {
// If the function returns true then iterate over the list the given quantity (its already done it once before)
for ($i = 0; $i < $material['Qty'] - 1; $i++) {
if (!subList($dbc, $material['ID'])) {
break;
}
}
} else {
// if the database returns 0 rows then this object is the base material so add to array.
$materialList .= array(
"Name" => $mMaterial['Name'],
"Qty" => $mMaterial['Qty'],
"ID" => $material['ID']
);
return true;
}
} else {
return false;
}
}
function materialList($dbc, $item) {
// Query
$query = " SELECT i.typeID AS ID, i.typeName AS Name, m.Quantity AS Qty
FROM invTypes AS i
LEFT JOIN invTypeMaterials AS m
ON m.materialTypeID = i.typeID
LEFT JOIN ramTypeRequirements AS r
ON r.typeID = i.typeID
WHERE groupID NOT IN(278,269,278,270,268) AND m.typeID = $item";
$snippets = mysqli_query($dbc, $query) or die('Error: ' . mysqli_error($dbc));
return $snippets;
}
As I'm sure you have all noticed, this code breaks about every programming law there is when it comes to recursive database calls. It is not really practical, especially in that subList() calls itself continually until it returns false. SQL isn't my strong suit, and I cannot for the life of me work out how to get around this problem.
Any pointers would be very helpful. I'm certainly not asking any of you to rewrite my entire code for me, but if you have any ideas about what I should consider I would be grateful.
As a generic solution I would do the following:
For every typeID, gather from both invTypeMaterials and ramTypeRequirements
From the gathered data, you create a new SELECT query and continue the cycle
Initial query
SELECT t.*, m.materialTypeID, m.quantity AS m_quantity, r.requiredTypeID, r.quantity AS r_quantity
FROM invTypes t
LEFT JOIN invTypeMaterials m USING (typeID)
LEFT JOIN ramTypeRequirements r USING (typeID)
WHERE <conditions to select the types>
I've just made a guess at which data from the extra tables are required to load; expand where necessary.
The materialTypeID and requiredTypeID will be non-null for matching rows and null otherwise.
Keep a table of the types you have already loaded, for faster reference. Then for the second query you replace the condition with something like `WHERE t.typeID IN (<type IDs gathered from the previous result>)`
Let me know if this makes sense and whether it's even close to what's useful to you :)
It looks like recursion is unavoidable here. I agree with Jack's answer and will just extend it with PHP code :)
I must warn you that I never executed it, so it will need debugging, but I hope you will get the idea. :)
$checked_dependencies = array();
$materialList = array();
function materialList( $ids ) {
    // $dbc is assumed to be an open mysqli connection defined elsewhere
    global $dbc, $checked_dependencies, $materialList;
    // if we have an array of IDs, condition is ".. in (...)"
    if(is_array($ids)) {
        $condition = 'IN ('.implode(',', array_map('intval', $ids)).')';
        // add all to checked dependencies
        foreach($ids as $id) { $checked_dependencies[] = $id; }
    }else{
        // otherwise, checking for particular ID
        $condition = '= '.(int) $ids;
        // add to checked dependencies
        $checked_dependencies[] = $ids;
    }
    $query = "SELECT t.*,
        m.materialTypeID, m.quantity AS m_quantity,
        r.requiredTypeID,
        r.quantity AS r_quantity
        FROM invTypes t
        LEFT JOIN invTypeMaterials m ON t.typeID = m.typeID
        LEFT JOIN ramTypeRequirements r ON t.typeID = r.typeID
        WHERE t.typeID {$condition}";
    $res = mysqli_query($dbc, $query);
    // this will be the list of IDs which we need to get
    $ids_to_check = array();
    while($material = mysqli_fetch_assoc($res)) {
        $materialList[] = $material; // you can get only needed fields
        // if we didn't check the dependencies already, adding them to the list
        // (if they aren't there yet); keys match the column names in the query
        foreach(array('materialTypeID', 'requiredTypeID') as $column) {
            $dependency = $material[$column];
            if(!is_null($dependency)
                && !in_array($dependency, $checked_dependencies)
                && !in_array($dependency, $ids_to_check)) {
                $ids_to_check[] = $dependency;
            }
        }
    }
    // if the result array isn't empty, recursively calling same func
    if(!empty($ids_to_check)) { materialList($ids_to_check); }
}
I used a global array here, but it's easy to re-write the func to return data.
Also we can put some depth limit here to avoid too much recursion.
Generally, I'd say it is not a very convenient (for this task) organization of DB data. It's kinda comfortable to store data recursively like that, but, as you see, it results in an unknown amount of iterations and requests to database to get all the dependencies. And that might be expensive (PHP <-> MySQL <-> PHP <->...), on each iteration we lose time, especially if the DB is on remote server as in your case.
Of course, it would be great to rearrange the data structure so that all requirements can be fetched at once, but as I understand it you have read-only access to the database. The second solution that comes to mind is a recursive MySQL stored procedure, which is also impossible here.
In some cases (not generally) it is good to get as much data as possible in one query, and operate with it locally, to lessen the iterations number. It is hard to say if it is possible here, because I don't know the size of DB and the structure, etc, but e.g. if all required dependencies are stored in one group, and the groups aren't enormously large, maybe it might be faster to get all the group info in one request to a PHP array and then collect the info from that array locally. But - it is only a guess and it needs testing and checking.
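To illustrate that last idea, here is a rough sketch of resolving the tree locally once the rows are already in a PHP array. The $requirements map (typeID => list of materialTypeID/quantity pairs) and its sample values are invented for illustration; in practice it would be filled from one or a few bulk queries. The sketch walks the tree with an explicit stack instead of recursion, multiplies quantities down each branch, and assumes the data contains no cycles:

```php
<?php
// Hypothetical data: typeID => list of array(materialTypeID, quantity).
// Types without an entry are treated as base materials.
$requirements = array(
    1 => array(array(2, 2), array(3, 1)),
    2 => array(array(4, 5)),
    3 => array(array(4, 1), array(5, 3)),
);

// Returns base materialTypeID => total quantity needed to build one $typeId.
function baseMaterials(array $requirements, $typeId)
{
    $totals = array();
    $stack = array(array($typeId, 1)); // pairs of (typeID, multiplier)
    while (!empty($stack)) {
        list($id, $multiplier) = array_pop($stack);
        if (empty($requirements[$id])) {
            // Nothing further to build: this is a base material.
            if (!isset($totals[$id])) {
                $totals[$id] = 0;
            }
            $totals[$id] += $multiplier;
            continue;
        }
        foreach ($requirements[$id] as $component) {
            list($materialId, $quantity) = $component;
            $stack[] = array($materialId, $multiplier * $quantity);
        }
    }
    ksort($totals);
    return $totals;
}

print_r(baseMaterials($requirements, 1));
```

With the sample data, item 1 needs two of item 2 (five of material 4 each) plus one of item 3 (one of material 4 and three of material 5), so no database round trip is needed per level once the map is loaded.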

Problem: Writing a MySQL parser to split JOIN's and run them as individual queries (denormalizing the query dynamically)

I am trying to figure out a script to take a MySQL query and turn it into individual queries, i.e. denormalizing the query dynamically.
As a test I have built a simple article system that has 4 tables:
articles
article_id
article_format_id
article_title
article_body
article_date
article_categories
article_id
category_id
categories
category_id
category_title
formats
format_id
format_title
An article can be in more than one category but only have one format. I feel this is a good example of a real-life situation.
On the category page which lists all of the articles (pulling in the format_title as well) this could be easily achieved with the following query:
SELECT articles.*, formats.format_title
FROM articles
INNER JOIN formats ON articles.article_format_id = formats.format_id
INNER JOIN article_categories ON articles.article_id = article_categories.article_id
WHERE article_categories.category_id = 2
ORDER BY articles.article_date DESC
However the script I am trying to build would receive this query, parse it and run the queries individually.
So in this category page example the script would effectively run this (worked out dynamically):
// Select article_categories
$sql = "SELECT * FROM article_categories WHERE category_id = 2";
$query = mysql_query($sql);
while ($row_article_categories = mysql_fetch_array($query, MYSQL_ASSOC)) {
// Select articles
$sql2 = "SELECT * FROM articles WHERE article_id = " . $row_article_categories['article_id'];
$query2 = mysql_query($sql2);
while ($row_articles = mysql_fetch_array($query2, MYSQL_ASSOC)) {
// Select formats
$sql3 = "SELECT * FROM formats WHERE format_id = " . $row_articles['article_format_id'];
$query3 = mysql_query($sql3);
$row_formats = mysql_fetch_array($query3, MYSQL_ASSOC);
// Merge articles and formats
$row_articles = array_merge($row_articles, $row_formats);
// Add to array
$out[] = $row_articles;
}
}
// Sort articles by date
foreach ($out as $key => $row) {
$arr[$key] = $row['article_date'];
}
array_multisort($arr, SORT_DESC, $out);
// Output articles - this would not be part of the script obviously it should just return the $out array
foreach ($out as $row) {
echo '<p>'.$row['article_title'].' <i>('.$row['format_title'].')</i><br />'.$row['article_body'].'<br /><span class="date">'.date("F jS Y", strtotime($row['article_date'])).'</span></p>';
}
The challenges of this are working out the correct queries in the right order, as you can put column names for SELECT and JOIN's in any order in the query (this is what MySQL and other SQL databases translate so well) and working out the information logic in PHP.
I am currently parsing the query using SQL_Parser which works well in splitting up the query into a multi-dimensional array, but working out the stuff mentioned above is the headache.
Any help or suggestions would be much appreciated.
From what I gather you're trying to put a layer between a 3rd-party forum application that you can't modify (obfuscated code perhaps?) and MySQL. This layer will intercept queries, re-write them to be executable individually, and generate PHP code to execute them against the database and return the aggregate result. This is a very bad idea.
It seems strange that you imply the impossibility of adding code and simultaneously suggest generating code to be added. Hopefully you're not planning on using something like funcall to inject code. This is a very bad idea.
The calls from others to avoid your initial approach and focus on the database are very sound advice. I'll add my voice to that hopefully growing chorus.
We'll assume some constraints:
You're running MySQL 5.0 or greater.
The queries cannot change.
The database tables cannot be changed.
You already have appropriate indexes in place for the tables the troublesome queries are referencing.
You have triple-checked the slow queries (and run EXPLAIN) hitting your DB and have attempted to setup indexes that would help them run faster.
The load the inner joins are placing on your MySQL install is unacceptable.
Three possible solutions:
You could deal with this problem easily by investing money in your current database: upgrade the hardware it runs on to something with more cores, more RAM (as much as you can afford), and faster disks. If you've got the money, Fusion-io's products come highly recommended for this sort of thing. This is probably the simplest of the three options I'll offer.
Setup a second master MySQL database and pair it with the first. Make sure you have the ability to force AUTO_INCREMENT id alternation (one DB uses even id's, the other odd). This doesn't scale forever, but it does offer you some breathing room for the price of the hardware and rack space. Again, beef up the hardware. You may have already done this, but if not it's worth consideration.
Use something like dbShards. You still need to throw more hardware at this, but you have the added benefit of being able to scale beyond two machines and you can buy lower cost hardware over time.
To improve database performance you typically look for ways to:
Reduce the number of database calls
Make each database call as efficient as possible (via good design)
Reduce the amount of data to be transferred
...and you are doing the exact opposite? Deliberately?
On what grounds?
I'm sorry, you are doing this entirely wrong, and every single problem you encounter down this road will all be consequences of that first decision to implement a database engine outside of the database engine. You will be forced to work around work-arounds all the way to delivery date. (if you get there).
Also, we are talking about a forum? I mean, come on! Even on the most "web-scale-awesome-sauce" forums we're talking about less than what, 100 tps on average? You could do that on your laptop!
My advice is to forget about all this and implement things in the simplest possible way. Then cache the aggregates (most recent, popular, statistics, whatever) in the application layer. Everything else in a forum is already primary key lookups.
I agree it sounds like a bad choice, but I can think of some situations where splitting a query could be useful.
I would try something similar to this, relying heavily on regular expressions for parsing the query. It would work in a very limited set of cases, but its support could be expanded progressively when needed.
<?php
/**
* That's a weird problem, but an interesting challenge!
* @link http://stackoverflow.com/questions/5019467/problem-writing-a-mysql-parser-to-split-joins-and-run-them-as-individual-query
*/
// Taken from the given example:
$sql = "SELECT articles.*, formats.format_title
FROM articles
INNER JOIN formats ON articles.article_format_id = formats.format_id
INNER JOIN article_categories ON articles.article_id = article_categories.article_id
WHERE article_categories.category_id = 2
ORDER BY articles.article_date DESC";
// Parse query
// (Limited to the clauses that are present in the example...)
// Edit: Made WHERE optional
// Lazy quantifiers (.*?) keep each clause from swallowing the ones after it
if(!preg_match('/^\s*'.
'SELECT\s+(?P<select_rows>.*?[^\s])'.
'\s+FROM\s+(?P<from>.*?[^\s])'.
'(?:\s+WHERE\s+(?P<where>.*?[^\s]))?'.
'(?:\s+ORDER\s+BY\s+(?P<order_by>.*?[^\s]))?'.
'(?:\s+(?P<desc>DESC))?'.
'\s*$/is',$sql,$query)
) {
trigger_error('Error parsing SQL!',E_USER_ERROR);
return false;
}
## Dump matches
#foreach($query as $key => $value) if(!is_int($key)) echo "\"$key\" => \"$value\"<br/>\n";
/* We get the following matches:
"select_rows" => "articles.*, formats.format_title"
"from" => "articles INNER JOIN formats ON articles.article_format_id = formats.format_id INNER JOIN article_categories ON articles.article_id = article_categories.article_id"
"where" => "article_categories.category_id = 2"
"order_by" => "articles.article_date"
"desc" => "DESC"
/**/
// Will only support WHERE conditions separated by AND that are to be
// tested on a single individual table.
if(!empty($query['where'])) // Edit: Made WHERE optional
$where_conditions = preg_split('/\s+AND\s+/is',$query['where']);
// Retrieve individual table information & data
$tables = array();
$from_conditions = array();
$from_tables = preg_split('/\s+INNER\s+JOIN\s+/is',$query['from']);
foreach($from_tables as $from_table) {
// Names are matched as runs without whitespace so no stray spaces end up in the column names
if(!preg_match('/^(?P<table_name>[^\s]+)'.
'(?P<on_clause>\s+ON\s+(?P<table_a>[^\s.]+)\.(?P<column_a>[^\s=]+)\s*'.
'=\s*(?P<table_b>[^\s.]+)\.(?P<column_b>[^\s]+))?\s*$/i',$from_table,$matches)
) {
trigger_error("Error parsing SQL! Unexpected format in FROM clause: $from_table", E_USER_ERROR);
return false;
}
## Dump matches
#foreach($matches as $key => $value) if(!is_int($key)) echo "\"$key\" => \"$value\"<br/>\n";
// Remember on_clause for later jointure
// We do assume each INNER JOIN's ON clause compares left table to
// right table. Forget about parsing more complex conditions in the
// ON clause...
if(!empty($matches['on_clause']))
    $from_conditions[$matches['table_name']] = array(
        'column_a' => $matches['column_a'],
        'column_b' => $matches['column_b']
    );
// Match applicable WHERE conditions
$where = array();
if(!empty($query['where'])) // Edit: Made WHERE optional
    foreach($where_conditions as $where_condition)
        if(preg_match("/^$matches[table_name]\.(.*)$/",$where_condition,$matched))
            $where[] = $matched[1];
$where_clause = empty($where) ? null : implode(' AND ',$where);
// We simply ignore $query[select_rows] and use '*' everywhere...
// (kept in its own variable so the parsed $query array is not overwritten)
$table_sql = "SELECT * FROM $matches[table_name]".($where_clause? " WHERE $where_clause" : '');
echo "$table_sql<br/>\n";
// Retrieve table's data
// Fetching the entire table data right away avoids multiplying MySQL
// queries exponentially...
$table = array();
if($results = mysql_query($table_sql))
    while($row = mysql_fetch_array($results, MYSQL_ASSOC))
        $table[] = $row;
// Sort table if applicable
if(!empty($query['order_by'])
    && preg_match("/^$matches[table_name]\.(.*)$/",$query['order_by'],$matched)) {
    $sort_key = $matched[1];
    // Simple string comparison on the sort column
    usort($table, function($x, $y) use ($sort_key) {
        return strcmp($x[$sort_key], $y[$sort_key]);
    });
    if(!empty($query['desc'])) $table = array_reverse($table);
}
$tables[$matches['table_name']] = $table;
}
// From here, all data is fetched.
// All left to do is the actual jointure.
/**
* Equijoin/Theta-join.
* Joins relation $R and $S where $a from $R compares to $b from $S.
* @param array $R A relation (set of tuples).
* @param array $S A relation (set of tuples).
* @param string $a Attribute from $R to compare.
* @param string $b Attribute from $S to compare.
* @return array A relation resulting from the equijoin/theta-join.
*/
function equijoin($R,$S,$a,$b) {
    $T = array();
    if(empty($R) or empty($S)) return $T;
    foreach($R as $tupleR) foreach($S as $tupleS)
        // Rows with a missing or NULL join column never match
        if(isset($tupleR[$a], $tupleS[$b]) && $tupleR[$a] == $tupleS[$b])
            $T[] = array_merge($tupleR,$tupleS);
    return $T;
}
$jointure = array_shift($tables);
if(!empty($tables)) foreach($tables as $table_name => $table)
$jointure = equijoin($jointure, $table,
$from_conditions[$table_name]['column_a'],
$from_conditions[$table_name]['column_b']);
return $jointure;
?>
Good night, and Good luck!
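The nested-loop equijoin() above compares every row of one table with every row of the other. If this approach were pursued anyway, a hash-based variant is one way to cut that down; the sketch below is an illustration, not part of the original answer. The name hashEquijoin and the sample rows are made up, and it assumes the join values are integers or strings (which is what database rows normally contain), since they are used as array keys:

```php
<?php
// Hash-based equijoin: index $S by column $b once, then probe it for each row of $R.
function hashEquijoin(array $R, array $S, $a, $b)
{
    $index = array();
    foreach ($S as $tupleS) {
        if (isset($tupleS[$b])) {
            $index[$tupleS[$b]][] = $tupleS; // group rows sharing a join value
        }
    }
    $T = array();
    foreach ($R as $tupleR) {
        if (!isset($tupleR[$a]) || !isset($index[$tupleR[$a]])) {
            continue; // no partner row in $S
        }
        foreach ($index[$tupleR[$a]] as $tupleS) {
            $T[] = array_merge($tupleR, $tupleS);
        }
    }
    return $T;
}

// Made-up rows shaped like the articles/formats tables from the question.
$articles = array(
    array('article_id' => 1, 'article_format_id' => 10, 'article_title' => 'First'),
    array('article_id' => 2, 'article_format_id' => 20, 'article_title' => 'Second'),
    array('article_id' => 3, 'article_format_id' => 99, 'article_title' => 'Orphan'),
);
$formats = array(
    array('format_id' => 10, 'format_title' => 'Review'),
    array('format_id' => 20, 'format_title' => 'News'),
);
$joined = hashEquijoin($articles, $formats, 'article_format_id', 'format_id');
foreach ($joined as $row) {
    echo $row['article_title'], ' (', $row['format_title'], ")\n";
}
```

The work is then roughly proportional to the size of both tables plus the number of result rows, which is much the same trick a database engine applies internally when it picks a hash join.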
Instead of rewriting the SQL, I think you should create a denormalized articles table and update it on each article insert/delete/update. It will be MUCH simpler and cheaper.
Create and populate it:
create table articles_denormalized
...
insert into articles_denormalized
SELECT articles.*, formats.format_title
FROM articles
INNER JOIN formats ON articles.article_format_id = formats.format_id
INNER JOIN article_categories ON articles.article_id = article_categories.article_id
Now issue the appropriate article insert/update/delete against it and you will have a denormalized table always ready to be queried.

Whats the best way to retrieve information from Sphinx (in PHP)?

I'm new to Sphinx, and I'm setting it up on a new website.
It's working fine, and when I search with the search tool in the console, everything works.
Using the PHP API and searchd gives me the same results as well. But it gives me only the ids and weights of the rows found. Is there some way to bring some text fields along with the 'matches' hash, for example?
If there is no way to do this, does anyone have a good idea about how to retrieve the records from the database (SQL) in the Sphinx weight sort order (fetching them all at the same time)?
Yeah, Sphinx doesn't return the rows themselves.
But I found a simple way to reorder the query results using the IN() clause, to bring it all together.
Query something like
SELECT * FROM table WHERE id IN(id_list... )
then index the result rows by their id in the table ($query_result stands for the result of that query):
while ($row = mysql_fetch_object($query_result))
    $result[$row->id] = $row;
With the matching results from Sphinx, it's very easy to reorder:
$ordered_result = array();
foreach ($sphinxs_results['matches'] as $id => $content)
    if (isset($result[$id])) $ordered_result[] = $result[$id];
This works as long as your $sphinxs_results are in the correct order.
It's almost pat's answer, but with one less loop. It can make some difference with big results, I guess.
You can use a mysql FIELD() function call in your ORDER BY to ensure everything is in the order sphinx specified.
$idlist = array();
foreach ( $sphinx_result["matches"] as $id => $idinfo ) {
$idlist[] = "$id";
}
$ids = implode(", ", $idlist);
SELECT * FROM table WHERE id IN ($ids) ORDER BY FIELD(id, $ids)
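Put together as one function, that might look like the sketch below. The helper name buildOrderedQuery, the table name documents, and the sample matches are assumptions for illustration; the IDs are cast to integers before being inlined, and the table name is expected to be a trusted, hard-coded value:

```php
<?php
// Hypothetical helper: build a query that returns rows in Sphinx's order.
// $table must be a trusted, hard-coded name; only the IDs are sanitized here.
function buildOrderedQuery($table, array $matches)
{
    // Sphinx match IDs are the array keys; cast to int before inlining them.
    $ids = array_map('intval', array_keys($matches));
    if (empty($ids)) {
        return null; // IN () with no values is a syntax error in MySQL
    }
    $idList = implode(', ', $ids);
    return "SELECT * FROM `$table` WHERE id IN ($idList) ORDER BY FIELD(id, $idList)";
}

// Made-up result shaped like the PHP API's 'matches' array (id => info).
$sphinx_result = array('matches' => array(
    42 => array('weight' => 3),
    7  => array('weight' => 2),
    19 => array('weight' => 1),
));
echo buildOrderedQuery('documents', $sphinx_result['matches']), "\n";
```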
Unfortunately, Sphinx doesn't return the matched fields, only their ids (the Sphinx index doesn't contain the data, only hashes of it).
You can find a post about this issue on the sphinxsearch.com forum.
As Alex says, Sphinx doesn't return that information. You will have to use the IDs to query the database yourself: just loop through each ID and get your relevant data out, keeping the results in weighting order. To do it all in one query, you could try something like the following (pseudo-code, since PHP ain't my language of choice):
results = db.query("SELECT * FROM table WHERE id IN (%s)", matches.join(", "));
ordered_results = [];
for (match in matches) {
for (result in results) {
if (result["id"] == match) {
ordered_results << result;
}
}
}
return ordered_results;
