In my program I launch an SQL query and get back a result resource. I then iterate through the rows of this result resource using the mysql_fetch_array() function and use the contents of the fields of each row to construct a further SQL query.
The result of launching this second query is the first set of results that I want. However, because this produces only a few results, I want to make the search less specific by dropping the last record used to build the query.
For example, the query which produces the first set of results I want could be:
SELECT uid FROM users WHERE (gender='male' AND relationship_status='single'
AND shoe_size=10)
I would then want to drop the last record so that my query became:
SELECT uid FROM users WHERE (gender='male' AND relationship_status='single')
I have already written code to produce the first query, but as I mentioned above I use the mysql_fetch_array function to iterate through ALL of the records. In subsequent "rounds" I only want to iterate through successively fewer records so that my query is less specific. How can I do this?
This seems like a very inefficient method too, so I'd welcome any simple ideas which might make it more efficient.
EDIT: Thanks for the reply. Yes, I am actually doing this in my program. I am basically trying to implement a basic search algorithm by taking all the preferences a user has specified in the DB and using them to form a query that looks for people with those preferences. So the first time I search using all the criteria, then on successive attempts I search using one fewer criterion and exclude the user ids which were previously returned. At the moment I am constructing the query from scratch for each "round", but I want to find a way to do this using the last query.
Using the queries above, you could do:
SELECT uid
FROM users
WHERE uid NOT IN (
SELECT uid
FROM users
WHERE
(gender='male'
AND relationship_status='single'
AND shoe_size=10)
)
This will essentially turn your first query into a sub-query, and use that to negate the results returned, i.e. it will return all the rows NOT IN the first query.
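To tie this back to the "rounds" described in the question, here is a minimal sketch of building each round from one fewer condition while excluding uids already returned. It is only an illustration under assumptions: it uses PDO with an in-memory SQLite database instead of the mysql_* functions, the helper buildRoundQuery() and the sample rows are hypothetical, and the column names are assumed to come from trusted code rather than from user input.

```php
<?php
// Hypothetical helper: builds the query for one round from the first $n
// conditions, excluding any uids returned by earlier rounds.
// Column names are interpolated, so they must come from trusted code.
function buildRoundQuery(array $conditions, int $n, array $seenUids): array
{
    $clauses = [];
    $params = [];
    foreach (array_slice($conditions, 0, $n, true) as $column => $value) {
        $clauses[] = "$column = ?";
        $params[] = $value;
    }
    $sql = 'SELECT uid FROM users WHERE (' . implode(' AND ', $clauses) . ')';
    if ($seenUids) {
        $placeholders = implode(', ', array_fill(0, count($seenUids), '?'));
        $sql .= " AND uid NOT IN ($placeholders)";
        $params = array_merge($params, $seenUids);
    }
    return [$sql, $params];
}

// Conditions ordered from least to most specific (assumed ordering).
$conditions = [
    'gender' => 'male',
    'relationship_status' => 'single',
    'shoe_size' => 10,
];

// Throwaway in-memory database with made-up sample rows.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (uid INTEGER PRIMARY KEY, gender TEXT, relationship_status TEXT, shoe_size INTEGER)');
$pdo->exec("INSERT INTO users VALUES (1,'male','single',10),(2,'male','single',9),(3,'male','married',10),(4,'female','single',10)");

$seen = [];
$rounds = [];
for ($n = count($conditions); $n >= 1; $n--) {
    [$sql, $params] = buildRoundQuery($conditions, $n, $seen);
    $stmt = $pdo->prepare($sql);
    $stmt->execute($params);
    $uids = array_map('intval', $stmt->fetchAll(PDO::FETCH_COLUMN));
    $rounds[$n] = $uids;               // new matches found in this round
    $seen = array_merge($seen, $uids); // excluded from later rounds
}
```

Keeping the conditions in an array means each round only changes how many of them are used, so the query does not have to be assembled by hand each time.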
Related
I have Sphinx Search running on production, performing search with keywords, accessed through the official sphinxapi.php. Now I need to output a sum of an attribute called price along with the search results, similar to the SQL query "SELECT SUM(t.price) from table_name t WHERE condition". This data is supposed to be displayed on a web page like "Showing 1 - 10 out of 12345 results, total cost is $67890". As the documentation says, the SUM() function is available when used with GROUP BY. However, the documentation does not provide enough details on implementation, and googling and searching Stack Overflow has not helped much either.
Questions:
How should I group the search result?
Can it be performed with 1 Sphinx request, or do I have to get the search results first and then query Sphinx again to get the sum of found documents?
Please advise. An example will be really helpful. Thank you.
You will need to run a second query. The sum is wanted over the WHOLE result set, whereas with normal grouping the aggregation is run per group. In your SQL example, there is an implicit GROUP BY '1' which aggregates all rows.
So you would need to use grouping to do the same in Sphinx.
http://sphinxsearch.com/docs/current.html#clustering
Using the aggregation function is relatively easy with SetSelect, but I am not sure SetGroupBy has a syntax to group all rows, so you will have to emulate it.
// all the normal setup needed for a normal query goes here
$cl->SetLimits($offset, $limit);
$cl->AddQuery($query, $index);
// add the group query
$cl->SetSelect("1 as one, SUM(price) as sum_price");
$cl->SetGroupBy("one", SPH_GROUPBY_ATTR); // don't care about sorting
$cl->SetRankingMode(SPH_RANK_NONE); // no point actually ranking results
$cl->SetLimits(0, 1);
$cl->AddQuery($query, $index);
// run both queries at once...
$results = $cl->RunQueries();
var_dump($results);
// $results[0] contains the normal text query results; use its total_found
// $results[1] contains just the SUM() data
This also shows setting up as Multi-Queries!
http://sphinxsearch.com/docs/current.html#multi-queries
I'm using Laravel's built-in paginate method in a query where I need a full-text search against a large dataset (about 100K rows, each with a huge amount of text).
Everything works fine, except I do not understand the logic of how Laravel counts the results: why must it execute the same query twice (the select count() as aggregate) to retrieve the total count of results, instead of using the PHP function count(), which works great in this scenario?
Because with that approach I could literally halve the time of this search, which can sometimes take up to 10 seconds!
Is it really necessary to use 2 queries, or is it possible in some way to override this logic?
Or maybe I'm missing something behind this logic?
The query is executed twice, once to get the total number of records returned by the final query, and the second time to return only the required dataset.
So, if you have a table of 100,000 records, the first query will count the records returned by the SQL query, let's say 8,900 records match your requirement, it will return an integer of 8,900.
The second query then takes the page number you want, multiplies (page - 1) by the count per page to get the offset, and returns just the relevant 15 or so records for that page; these are the LIMIT and OFFSET values within your SQL query.
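The two-query pattern described above can be sketched roughly as follows. This is not Laravel's actual implementation, just an illustration of the idea using PDO with an in-memory SQLite database; the paginate() function, the posts table and its contents are all made up for the example.

```php
<?php
// Sketch of count-then-slice pagination (hypothetical helper, not Laravel code).
function paginate(PDO $pdo, int $page, int $perPage): array
{
    // Query 1: a tiny response holding only the total number of matches.
    $total = (int) $pdo->query('SELECT COUNT(*) FROM posts')->fetchColumn();

    // Query 2: only the rows for the requested page.
    $stmt = $pdo->prepare('SELECT id, title FROM posts ORDER BY id LIMIT :limit OFFSET :offset');
    $stmt->bindValue(':limit', $perPage, PDO::PARAM_INT);
    $stmt->bindValue(':offset', ($page - 1) * $perPage, PDO::PARAM_INT);
    $stmt->execute();

    return ['total' => $total, 'rows' => $stmt->fetchAll(PDO::FETCH_ASSOC)];
}

// Throwaway in-memory table with 40 sample rows.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)');
$insert = $pdo->prepare('INSERT INTO posts (title) VALUES (?)');
for ($i = 1; $i <= 40; $i++) {
    $insert->execute(["Post $i"]);
}

$result = paginate($pdo, 2, 15); // second page, 15 rows per page
```

In a real search, both queries would carry the same WHERE clause, which is why the expensive part of the search runs twice.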
It is worth noting that GROUP BY paginated results are not handled in the same way. If you add GROUP BY to the end of any SQL statement within Eloquent, it runs a single SQL query. This query grabs all the relevant records, counts the number of rows returned, and then slices the array to return just the 15 or so records you require.
The difference between these two methods is that the first returns 2 tiny query responses: first the total count, and then 15 or so records from your table.
The GROUP option returns a dataset for EVERY record which matches your SQL requirements. If this is a total of 8,900 records, it will be a total of 8,900 eloquent model objects.
As you can see, if you have a database with a good number of records in it, the second method, while it may execute the SQL statement quicker, will tie up a lot more resources.
If your SQL statements are taking too long to execute twice, you may need to consider optimising your table, or adding a further INDEX. Just a thought.
Can someone shed some light on how to get the number of rows (using PHP) in a table without actually having to read all the rows? I am using SQLite to log data periodically and need to know the table row count before I actually access specific data.
Apart from reading all rows and incrementing a counter, I cannot seem to work out how to do this quickly (it's a large database) for what seems a rather simple requirement. I have tried the following PHP code, but it only returns a boolean response rather than the actual number of rows:
$result = $db->query('SELECT count(*) FROM mdata');
Normally the SELECT statement will also return the data object (if there is any?)
Just use
LIMIT 1
That should work!! It Limits the result to ONLY look at 1 row!!
If you have a record set, you can get the number of records with mysql_num_rows(#Record Set#);
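For the SQLite case in the question, the count is not the return value of query() itself: it is the first column of the single row the query yields. A minimal sketch, assuming $db is a PDO connection (the question does not say which SQLite interface is used) and using a made-up mdata table:

```php
<?php
// Assumption: $db is a PDO connection; an in-memory database stands in for the real file.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE mdata (id INTEGER PRIMARY KEY, reading REAL)');
$db->exec('INSERT INTO mdata (reading) VALUES (1.5), (2.5), (3.5)');

// query() returns a statement object, not the number itself.
$result = $db->query('SELECT count(*) FROM mdata');

// The count is the first column of the only row in the result.
$rowCount = (int) $result->fetchColumn();
```

If the SQLite3 class is used instead of PDO, its querySingle() method plays a similar role for a one-value query like this.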
Apologies if I am not wording this correctly. What I have at the moment is a search form where the user can type in how many results they want returned, but I would also like to show the user the total number of matches in the database.
e.g. User selects their search criteria with a limit of 10 but there are 50 matches in the database.
What I would like to tell the user is that there are 50 possible results.
One thing I am not sure of is if this can be combined with my existing query or if a separate query will have to be submitted?
The PHP code to generate my queries is quite complicated but if you would like me to post it just let me know.
Thanks.
Have a look at the SQL_CALC_FOUND_ROWS option for mysql.
http://dev.mysql.com/doc/refman/5.1/en/information-functions.html#function_found-rows
And yes, you'll need a second query like:
SELECT FOUND_ROWS()
To get the "real" number of 'would have been' results.
I have used MySQL a lot, but I always wondered exactly how does it work - when I get a positive result, where is the data stored exactly? For example, I write like this:
$sql = "SELECT * FROM TABLE";
$result = mysql_query($sql);
while ($row = mysql_fetch_object($result)) {
echo $row->column_name;
}
When a result is returned, does it hold all of the result data, or does it return fragments and only fetch data where it is asked for, like $row->column_name?
Or does it really return every single row of data even if you only wanted one column in $result?
Also, if I paginate using LIMIT, does it hold THAT original (old) result even if the database is updated?
The details are implementation dependent but generally speaking, results are buffered. Executing a query against a database will return some result set. If it's sufficiently small all the results may be returned with the initial call or some might be and more results are returned as you iterate over the result object.
Think of the sequence this way:
1. You open a connection to the database;
2. There is possibly a second call to select a database or it might be done as part of (1);
3. That authentication and connection step is (at least) one round trip to the server (ignoring persistent connections);
4. You execute a query on the client;
5. That query is sent to the server;
6. The server has to determine how to execute the query;
7. If the server has previously executed the query the execution plan may still be in the query cache. If not a new plan must be created;
8. The server executes the query as given and returns a result to the client;
9. That result will contain some buffer of rows that is implementation dependent. It might be 100 rows or more or less. All columns are returned for each row;
10. As you fetch more rows eventually the client will ask the server for more rows. This may be when the client runs out or it may be done preemptively. Again this is implementation dependent.
The idea of all this is to minimize roundtrips to the server without sending back too much unnecessary data, which is why if you ask for a million rows you won't get them all back at once.
LIMIT clauses--or any clause in fact--will modify the result set.
Lastly, (7) is important because SELECT * FROM table WHERE a = 'foo' and SELECT * FROM table WHERE a = 'bar' are two different queries as far as the database optimizer is concerned so an execution plan must be determined for each separately. But a parameterized query (SELECT * FROM table WHERE a = :param) with different parameters is one query and only needs to be planned once (at least until it falls out of the query cache).
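As an illustration of the parameterized form, the sketch below prepares one statement and executes it with two different values. It assumes PDO with an in-memory SQLite database rather than MySQL, and the items table is a hypothetical stand-in for "table" (which is a reserved word).

```php
<?php
// Throwaway in-memory database with made-up rows.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE items (id INTEGER PRIMARY KEY, a TEXT)');
$pdo->exec("INSERT INTO items (a) VALUES ('foo'), ('bar'), ('foo')");

// Prepared once; only the bound value changes between executions.
$stmt = $pdo->prepare('SELECT id FROM items WHERE a = :param');

$counts = [];
foreach (['foo', 'bar'] as $value) {
    $stmt->execute([':param' => $value]);
    $counts[$value] = count($stmt->fetchAll(PDO::FETCH_COLUMN));
}
```

Whether the server actually reuses a plan between executions depends on the database and driver; the point here is only that the query text stays identical.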
I think you are confusing the two types of variables you're dealing with, and neither answer really clarifies that so far.
$result is a MySQL result object. It does not "contain any rows." When you say $result = mysql_query($sql), MySQL executes the query, and knows what rows will match, but the data has not been transferred over to the PHP side. $result can be thought of as a pointer to a query that you asked MySQL to execute.
When you say $row = mysql_fetch_object($result), that's when PHP's MySQL interface retrieves a row for you. Only that row is put into $row (as a plain old PHP object, but you can use a different fetch function to ask for an associative array, or specific column(s) from each row.)
Rows may be buffered with the expectation that you will be retrieving all of the rows in a tight loop (which is usually the case), but in general, rows are retrieved when you ask for them with one of the mysql_fetch_* functions.
If you only want one column from the database, then you should SELECT that_column FROM .... Using a LIMIT clause is also a good idea whenever possible, because MySQL can usually perform significant optimizations if it knows that you only want a certain group of rows.
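A small sketch of that advice, again assuming PDO with an in-memory SQLite database instead of the mysql_* functions, and a made-up people table: only the name column is selected, LIMIT caps the rows, and each row arrives only when fetch() is called.

```php
<?php
// Throwaway in-memory database with made-up rows.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, bio TEXT)');
$pdo->exec("INSERT INTO people (name, bio) VALUES ('Ann', 'long text'), ('Bob', 'long text'), ('Cy', 'long text')");

// Ask only for the column that is needed, and only for the rows that are needed.
$stmt = $pdo->query('SELECT name FROM people ORDER BY id LIMIT 2');

$names = [];
$columnsSeen = [];
while ($row = $stmt->fetch(PDO::FETCH_OBJ)) { // one row per call, false when done
    $names[] = $row->name;
    $columnsSeen = array_keys(get_object_vars($row)); // columns present on the row object
}
```

Each row object carries only the name property here, whereas SELECT * would have carried id and bio as well.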
The first question can be answered by reading up on resources
Since you are SELECTing "*", every column is returned for each mysql_fetch_object call. Just look at print_r($row) to see.
In simple words, the resource returned is like an ID that the MySQL library associates with other data. I think of it like the identification card in your wallet: it's just a number and some information, but it is associated with a lot more information if you give it to the government, your cell-phone company, etc.