I have developed a fairly simple script that searches a database and then sorts the results by the search terms used, so that the most relevant results come first.
It ran fine on my local machine, and before I added the sorting it also ran fine on the web server I rent. Since the sorting went in, search times on the web server have increased greatly.
The code below is optimized as far as I know how, so I'm looking for a better sort algorithm and perhaps a better way of querying the database; anything that speeds up the sort helps.
Some background: I need to allow searches of three letters or more (for example cat or car), and I can't change the minimum word length for MySQL's natural-language search on this server, so I can't use it. That is why the queries look the way they do.
An average search can easily return anywhere from 100 to 15,000 results, with the database tables holding around 20,000 entries.
Any help will be greatly appreciated.
<?php
require_once 'config.php';
$bRingtone = true;
$aSearchStrings = $_POST["searchStrings"];
$cConnection = new mysqli($dbhost, $dbuser, $dbpass, $dbname);
if (mysqli_connect_errno())
{
exit();
}
$sTables = array("natural", "artificial", "musical", "created");
$aQueries = array();
foreach ($sTables as $sTable)
{
$sQuery = "SELECT filename, downloadPath, description, imageFilePath, keywords FROM `$sTable` WHERE";
$sParamTypes = "";
$aParams = array();
$iCount = 0;
foreach ($aSearchStrings as $sString)
{
$sParamTypes .= "ss";
$aParams[] = "%,$sString%";
$aParams[] = "$sString%";
$sQuery .= $iCount++ == 0 ? " (keywords LIKE ? OR keywords LIKE ?)" : " AND (keywords LIKE ? OR keywords LIKE ?)";
}
array_unshift($aParams, $sParamTypes);
$aQueries[$sQuery] = $aParams;
}
$aResults = array();
foreach ($aQueries as $sQuery => $aParams)
{
if ($cStmt = $cConnection->prepare($sQuery))
{
$aQueryResults = array();
call_user_func_array(array($cStmt, 'bind_param'), $aParams);
$cStmt->execute();
$cStmt->bind_result($sFileName, $sDownloadPath, $sDescription, $sImageFilePath, $sKeywords);
while($cStmt->fetch())
{
if ($bRingtone)
{
$sFileName = $_SERVER['DOCUMENT_ROOT'] . "/m4r/" . str_replace(".WAV", ".M4R", $sFileName);
if (file_exists($sFileName))
{
$sDownloadPath = str_replace("Sounds", "m4r", str_replace(".WAV", ".M4R", $sDownloadPath));
$aResults[$sDownloadPath] = array($sDownloadPath, $sDescription, $sImageFilePath, $sKeywords, $aSearchStrings);
}
}
}
$aResults = array_merge($aResults, $aQueryResults);
$cStmt->close();
}
}
$cConnection->close();
$aResults = array_values($aResults);
function in_arrayi($needle, $haystack) {
return in_array(strtolower($needle), array_map('strtolower', $haystack));
}
function keywordSort($a, $b)
{
if ($a[0] === $b[0]) return 0;
$aKeywords = explode(",", $a[3]);
$bKeywords = explode(",", $b[3]);
foreach ($a[4] as $sSearchString)
{
$aFound = in_arrayi($sSearchString, $aKeywords);
$bFound = in_arrayi($sSearchString, $bKeywords);
if ($aFound && !$bFound)
{
return -1;
}
else if ($bFound && !$aFound)
{
return 1;
}
}
return 0;
}
usort($aResults, "keywordSort");
foreach ($aResults as &$aResult)
{
unset($aResult[3]);
unset($aResult[4]);
}
echo json_encode($aResults);
?>
Sorting large quantities of data while having to split the field code-side will be slow. Rather than optimizing, I'd seriously recommend another way of doing it, such as full-text indexing. It's really quite neat once it's working.
If full-text really isn't an option, I'd recommend splitting the keywords off into a separate table. That way, you can sort based on a count after grouping. For example ...
SELECT d.*, COUNT(k.id) AS keywordcount
FROM data d
INNER JOIN keywords k ON (d.id = k.dataid)
WHERE k.value IN ('keyword1', 'keyword2', 'keyword3')
GROUP BY d.id
ORDER BY keywordcount DESC
As a side note, you can probably speed things up by combining the SELECTs with UNION and ordering the combined result, rather than running each query independently.
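If the sort has to stay in PHP for now, one thing that may help is to split each row's keywords once, before sorting, instead of inside the comparison callback (which runs many times per row). Below is a minimal sketch of that idea. The helper names relevanceKey and sortByRelevance and the 'keywords' array key are made up for illustration; the ordering is meant to match keywordSort from the question (rows matching earlier search terms come first).

```php
<?php
// Hypothetical helper: builds one sort key per row, so the keyword
// string is split and lower-cased only once per row.
function relevanceKey($sKeywords, array $aSearchStrings)
{
    // Lower-cased keywords as array keys give constant-time lookups.
    $aSet = array_flip(array_map('strtolower', explode(',', $sKeywords)));
    $sKey = '';
    foreach ($aSearchStrings as $sSearch) {
        // '0' sorts before '1', so rows containing the term come first.
        $sKey .= isset($aSet[strtolower($sSearch)]) ? '0' : '1';
    }
    return $sKey;
}

// Hypothetical helper: sorts rows (each with a 'keywords' entry) by relevance.
function sortByRelevance(array $aRows, array $aSearchStrings)
{
    $aRows = array_values($aRows);
    if (count($aRows) < 2) {
        return $aRows;
    }
    $aKeys = array();
    foreach ($aRows as $aRow) {
        $aKeys[] = relevanceKey($aRow['keywords'], $aSearchStrings);
    }
    // The original position is the tie-breaker, so equal rows keep their order.
    $aOrder = range(0, count($aRows) - 1);
    array_multisort($aKeys, SORT_ASC, SORT_STRING, $aOrder, SORT_ASC, SORT_NUMERIC, $aRows);
    return $aRows;
}
```

Whether this is fast enough for 15,000 rows on a given server is something to measure; it only removes the repeated explode/strtolower work, not the sort itself.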
I have a column of comma-separated values in my database, like category1[obj1,obj2,obj3..]
View
<?php echo $row_company->category1;?>
Controller
$row_company = $this->employers_model->get_company_details_by_slug($company_name);
The employers_model then calls a procedure, which in turn executes the query, and the contents of the category1 column are displayed like this:
obj1,obj2,obj3...
I want to show this result without the commas, as a list like this:
obj1
obj2
obj3..
I'm using the CodeIgniter MVC framework. I have a vague idea of what the implode function does and came across preg_split too, but I don't know where to start: in the view or in the controller.
Any direction towards a solution would be appreciated.
Edit: row_company in detail.
$row_company = $this->employers_model->get_company_details_by_slug($company_name);
if(!$row_company){
redirect(base_url(),'');
exit;
}
$company_website = ($row_company->company_website!='')?validate_company_url($row_company->company_website):'';
$data['row_company'] = $row_company;
$data['company_logo'] = $company_logo;
$data['company_join'] = $company_join;
$data['company_website'] = $company_website;
$data['company_location'] = $company_location;
$data['title'] = $row_company->company_name.' jobs in '.$row_company->city;
$this->load->view('company_view',$data);
}
employers_model content
public function get_company_details_by_slug($slug) {
$Q = $this->db->query('CALL get_company_by_slug("'.$slug.'")');
if ($Q->num_rows > 0) {
$return = $Q->row();
} else {
$return = 0;
}
$Q->next_result();
$Q->free_result();
return $return;
}
the procedure itself get_company_details_by_slug
BEGIN
SELECT emp.ID AS empID, emp.sts1, pc.ID, emp.country, emp.city, pc.company_name, pc.company_description, pc.company_location, pc.company_website, pc.no_of_employees, pc.established_in, pc.company_logo, pc.company_slug, pc.category1, pc.category2, pc.category3, pc.company_join
FROM `pp_employers` AS emp
INNER JOIN pp_companies AS pc
WHERE pc.company_slug=slug AND emp.sts1 ='active';END
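Figured it out after going through several CodeIgniter/PHP forums.
It's pretty simple, actually:
<ul>
<?php
$array = explode(',', $row_company->category1);
foreach ($array as $item)
{
    // Trim and escape each item so stray spaces or markup in the data cannot break the page.
    echo '<li>' . htmlspecialchars(trim($item)) . '</li>';
}
?>
</ul>
Thank you, Nigel, for pointing me to the explode function.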
Cheers
We have a PHP script that loops through many XML / CSV files from different websites. So far we have managed to build a good XML / CSV parser.
The script loops through some BIG XML or CSV files, which contain barcodes for different products.
Before the script starts, I fill an array with the product ID and barcode of every product from MySQL, like this:
function Barcodes_Array() {
$sql = "SELECT ProductId, Barcode FROM Products WHERE (Barcode <> '') ";
$res = mysql_query($sql);
while ($rijen = mysql_fetch_assoc($res)) {
$GLOBALS['arrBarcodes'][] = $rijen;
}
}
Each time we loop through the XML (or CSV) files, we have to check whether the barcode exists in the array and return the product ID.
The lookup call:
$ProductId = SearchBarcodeProduct($EanNr, 'Barcode');
And the function itself:
function SearchBarcodeProduct($elem, $field)
{
$top = sizeof($GLOBALS['arrBarcodes']) - 1;
$bottom = 0;
$ProductId = 0;
while($bottom <= $top)
{
if($GLOBALS['arrBarcodes'][$bottom][$field] == $elem) {
return $GLOBALS['arrBarcodes'][$bottom]['ProductId'];
}
else {
if (is_array($GLOBALS['arrBarcodes'][$bottom][$field])) {
if (in_multiarray($elem, ($GLOBALS['arrBarcodes'][$bottom][$field]))) {
return $GLOBALS['arrBarcodes'][$bottom]['ProductId'];
}
}
}
$bottom++;
}
return $ProductId;
}
We fill the array up front because querying the MySQL Products table for every barcode took forever.
My question:
Looping through the barcode array each time still takes a VERY long time. Is there a faster way, perhaps something other than an array?
Can someone please help? I have been working on this stupid :) thing for weeks!
Why do you need 2 functions?
Try just one
function itemBarcode($id) {
    $id = intval($id);
    // Look up the barcode of a single product by its ID.
    $sql = "SELECT ProductId, Barcode FROM Products WHERE ProductId = $id AND Barcode <> ''";
    $res = mysql_query($sql);
    if ($row = mysql_fetch_assoc($res)) {
        // Array keys are case-sensitive and follow the column name.
        return $row['Barcode'];
    } else {
        return 0;
    }
}
Update: if you need to search by barcode, you can create another function:
function itemProduct($barcode) {
    // Escape and quote the barcode so it is compared as a string.
    $barcode = mysql_real_escape_string($barcode);
    $sql = "SELECT ProductId, Barcode FROM Products WHERE Barcode = '$barcode'";
    $res = mysql_query($sql);
    if ($row = mysql_fetch_assoc($res)) {
        return $row['ProductId'];
    } else {
        return 0;
    }
}
Sounds like you are missing an index on the Barcode column in your database. A single-row lookup on a presumably unique, indexed column should be blisteringly fast.
CREATE INDEX Barcode_Index ON Products (Barcode)
Then simply:
SELECT ProductId FROM Products WHERE Barcode = *INPUT*
You could also make the index UNIQUE; if more than one row currently has Barcode = '', set those to NULL first.
Another option is keying the array you have with the Barcode:
while ($rijen = mysql_fetch_assoc($res)) {
$GLOBALS['arrBarcodes'][$rijen['Barcode']] = $rijen;
}
or even just:
while ($rijen = mysql_fetch_assoc($res)) {
$GLOBALS['arrBarcodes'][$rijen['Barcode']] = $rijen['ProductId'];
}
Then you can do a straight look up:
$ProductId = isset($GLOBALS['arrBarcodes'][$Barcode])
?$GLOBALS['arrBarcodes'][$Barcode]['ProductId']
:0;
or:
$ProductId = isset($GLOBALS['arrBarcodes'][$Barcode])
?$GLOBALS['arrBarcodes'][$Barcode]
:0;
N.B Please read the warnings in the comments about use of $GLOBALS and mysql_query.
If you need it, store the barcodes array in an object or variable instead.
PDO is pretty handy, and it can also key the returned array for you on fetch (for example with the PDO::FETCH_KEY_PAIR fetch mode).
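To illustrate the PDO remark, here is a rough sketch of building the barcode => ProductId map in one fetch. It assumes the pdo_sqlite extension is available and uses an in-memory SQLite database with made-up rows purely as a stand-in for the real MySQL Products table; findProductId is a hypothetical helper name.

```php
<?php
// Stand-in database (assumption): in-memory SQLite instead of MySQL.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec("CREATE TABLE Products (ProductId INTEGER PRIMARY KEY, Barcode TEXT)");
// Made-up sample rows for illustration only.
$pdo->exec("INSERT INTO Products (ProductId, Barcode) VALUES (1, '8711000000012')");
$pdo->exec("INSERT INTO Products (ProductId, Barcode) VALUES (2, '')");
$pdo->exec("INSERT INTO Products (ProductId, Barcode) VALUES (3, '5012345678900')");

// FETCH_KEY_PAIR needs exactly two columns: the first becomes the
// array key, the second the value.
$stmt = $pdo->query("SELECT Barcode, ProductId FROM Products WHERE Barcode <> ''");
$barcodeToProductId = $stmt->fetchAll(PDO::FETCH_KEY_PAIR);

// Hypothetical helper: constant-time lookup, 0 when the barcode is unknown.
function findProductId(array $map, $barcode)
{
    return isset($map[$barcode]) ? (int) $map[$barcode] : 0;
}
```

With the map built once before the XML/CSV loop, each lookup is a single findProductId($barcodeToProductId, $EanNr) call instead of a scan over the whole array.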
I need to check whether a value exists in my database.
I have a table where every user has a unique code, for example 5h27f.
These values and users add up very quickly, so very soon I might have 2000+ unique codes. What's the best, fastest and most efficient way to check whether a value is unique?
foreach ($users as $user) {
$is_unique = FALSE;
while ($is_unique == FALSE) {
$code = unique_code();
$query = "SELECT * FROM unique_code_table WHERE code='$code';";
$res = $mysqli->query($query);
if ($res->num_rows > 0) {
} else {
$is_unique = TRUE;
}
}
}
OR
$query = "SELECT code FROM unique_code_table;";
$res = $mysqli->query($query);
$codes = array();
$i = 0;
while ($row = $res->fetch_object()) {
$codes[$i] = $row->code;
$i++;
}
$code = unique_code();
while (in_array($code, $codes)) {
$code = unique_code();
}
(this code might not be 100% accurate, I've written this just to explain the purpose of the question.)
I'd say that one round trip to the database instead of potentially 2000+ is significantly better; the second script will be significantly faster.
In the first version, adding LIMIT 1 would help a lot, but in benchmarks it will still pale next to the second.
Put the following at the bottom of your script to benchmark and fine-tune:
PHP 5.4+
$sParseTime = microtime(true) - $_SERVER["REQUEST_TIME_FLOAT"];
echo $sParseTime;
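Building on the second script, a small refinement is to store the existing codes as array keys, so each check is an isset() lookup rather than an in_array() scan. This is only a sketch: the body of unique_code() here is invented (the question does not show it), nextUnusedCode is a hypothetical helper, and $existing stands in for the rows read from unique_code_table.

```php
<?php
// Invented generator for illustration: 5 random characters from a-z0-9.
function unique_code($length = 5)
{
    $alphabet = 'abcdefghijklmnopqrstuvwxyz0123456789';
    $code = '';
    for ($i = 0; $i < $length; $i++) {
        $code .= $alphabet[mt_rand(0, strlen($alphabet) - 1)];
    }
    return $code;
}

// Hypothetical helper: draws codes until one is not in the set,
// then records it so later calls cannot return it again.
function nextUnusedCode(array &$usedCodes)
{
    do {
        $code = unique_code();
    } while (isset($usedCodes[$code]));
    $usedCodes[$code] = true;
    return $code;
}

// Stand-in for the result of "SELECT code FROM unique_code_table".
$existing = array('5h27f', 'ab12c');
// Codes become array keys, which makes membership checks constant-time.
$usedCodes = array_fill_keys($existing, true);
```

Calling nextUnusedCode($usedCodes) once per user then gives a fresh code each time. A UNIQUE index on the code column is still worth having as the final safeguard, since two requests running at the same time could otherwise pick the same code.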
Using PHP 5.3 and mysqli, I get a result set from a query that contains only usernames, something like:
$query_username = "SELECT username FROM some_table WHERE param = 1";
$username = $mysqliObject->query($query_username);
while($row_username = $username->fetch_object()){
print "<br>Username: $row_username->username";
}
All fine, but here is my problem: there are repeated usernames, and I don't know beforehand which names will be in the result. It could be bob, sue, james, or it could be tom, dick, harry, tom. I need to print each username and how many times it shows up in the result. For very strange reasons I CANNOT use neat stuff like GROUP BY and COUNT(*) in the query (don't ask, it is truly weird). So my question is: what is the fastest way to loop through the returned object (or associative array, if need be) to get each unique name and how many times it appears? Thanks for your help, and I apologize if this is a freshman CS question; I'm self-taught and always filling in the gaps!
If you really must do it on the PHP side instead of using a GROUP BY clause:
$usernames = array();
while ($row_username = $username->fetch_object())
{
    // fetch_object() returns an object, so read the column as a property.
    $name = $row_username->username;
    if (isset($usernames[$name]))
    {
        $usernames[$name]++;
    }
    else
    {
        $usernames[$name] = 1;
    }
}
asort($usernames);
// use ksort() to sort by username instead of the count
// print out the usernames
foreach ($usernames as $name => $count)
{
    echo $name . ", count: " . $count;
}
e.g.
$users = array();
// fetch_assoc() returns NULL once the rows run out, which ends the loop.
while ($row = $result->fetch_assoc()) {
    if (isset($users[$row['username']])) {
        $users[$row['username']] += 1;
    }
    else {
        $users[$row['username']] = 1;
    }
}
asort($users);
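The same counting can also be left to PHP's built-in array_count_values() once the names are in a plain array. A minimal sketch, where countUsernames is a hypothetical helper and the sample names come from the question:

```php
<?php
// Hypothetical helper: returns name => count, most frequent first.
function countUsernames(array $names)
{
    // array_count_values() tallies how often each value occurs.
    $counts = array_count_values($names);
    // arsort() orders by count, highest first, keeping the name keys.
    arsort($counts);
    return $counts;
}

// With mysqli, the list would be collected from the result set, e.g.:
//   while ($row = $result->fetch_object()) { $names[] = $row->username; }
$names = array('tom', 'dick', 'harry', 'tom');

foreach (countUsernames($names) as $name => $count) {
    echo $name . ', count: ' . $count . "\n";
}
```

This needs one extra pass to collect the names, so it is mostly a readability choice rather than a guaranteed speed-up.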
I have a game script set up, and when it creates a new character I want it to find an empty address for that player's house.
The two relevant table fields are 'city' and 'number'. The 'city' is a random number from 1 to 10, and the 'number' can be 1-250.
It needs to make sure there isn't already an entry in the 'HOUSES' table with the two random numbers it picked and, if there is, pick new numbers. Repeat until it finds an 'address' not in use, then insert it.
I have a method set up to do this, but I know it's shoddy; there's probably a more logical and easier way. Any ideas?
UPDATE
Here's my current code:
$found = 0;
while ($found == 0) {
$num = (rand()%250)+1; $city = (rand()%10)+1;
$sql_result2 = mysql_query("SELECT * FROM houses WHERE city='$city' AND number='$num'", $db);
if (mysql_num_rows($sql_result2) == 0) { $found = 1; }
}
You can either do this in PHP as you do or by using a MySQL trigger.
If you stick to the PHP way, then instead of generating a number every time, do something like this
$found = 0;
$cityarr = array();
$numberarr = array();
//create the cityarr
for ($i = 1; $i <= 10; $i++)
    $cityarr[] = $i;
//create the numberarr
for ($i = 1; $i <= 250; $i++)
    $numberarr[] = $i;
//shuffle the arrays
shuffle($cityarr);
shuffle($numberarr);
//iterate until you find an unused one
foreach ($cityarr as $city) {
    foreach ($numberarr as $num) {
        $sql_result2 = mysql_query("SELECT * FROM houses
            WHERE city='$city' AND number='$num'", $db);
        if (mysql_num_rows($sql_result2) == 0) {
            $found = 1;
            break;
        }
    }
    if ($found) break;
}
This way you never check the same value more than once, and you still check in random order.
But you should really consider fetching all your records before the loops, so you only run one query. That would also improve performance a lot.
Like this:
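$found = 0;
$cityarr = array();
$numberarr = array();
$taken = array();
for ($i = 1; $i <= 10; $i++)
    $taken[$i] = array();
//one query: collect the taken numbers per city
$records = mysql_query("SELECT * FROM houses", $db);
while ($rec = mysql_fetch_assoc($records)) {
    $taken[$rec['city']][] = $rec['number'];
}
for ($i = 1; $i <= 10; $i++)
    $cityarr[] = $i;
for ($i = 1; $i <= 250; $i++)
    $numberarr[] = $i;
//shuffle so the free address is picked at random
shuffle($cityarr);
shuffle($numberarr);
foreach ($cityarr as $city) {
    foreach ($numberarr as $num) {
        //an address is free when the number is not in this city's taken list
        if (!in_array($num, $taken[$city])) {
            $cityNotTaken = $city;
            $numberNotTaken = $num;
            $found = 1;
            break;
        }
    }
    if ($found) break;
}
if ($found)
    echo 'City ' . $cityNotTaken . ' number ' . $numberNotTaken . ' is not taken!';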
I would go with this method :-)
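A variation on the single-query approach above, sketched under a few assumptions: pickFreeAddress is a hypothetical helper, $takenRows stands for the rows fetched from the houses table (each with 'city' and 'number'), and the 10 x 250 grid comes from the question. It lists every free address first and then picks one at random, so it never loops over taken addresses more than once.

```php
<?php
// Hypothetical helper: returns a random free address as
// array('city' => ..., 'number' => ...), or null when everything is taken.
function pickFreeAddress(array $takenRows, $cities = 10, $numbers = 250)
{
    // Index taken addresses by "city:number" for constant-time checks.
    $taken = array();
    foreach ($takenRows as $row) {
        $taken[$row['city'] . ':' . $row['number']] = true;
    }

    // Collect every address that is not in the taken set.
    $free = array();
    for ($city = 1; $city <= $cities; $city++) {
        for ($num = 1; $num <= $numbers; $num++) {
            if (!isset($taken[$city . ':' . $num])) {
                $free[] = array('city' => $city, 'number' => $num);
            }
        }
    }

    if (count($free) === 0) {
        return null;
    }
    // array_rand() returns a random key of the free list.
    return $free[array_rand($free)];
}
```

If two characters can be created at the same moment, both could still be handed the same address, so a unique index on (city, number) in the houses table is a sensible backstop.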
Doing it the way you describe can cause problems when only a couple of houses (or even just one) are left: it could take ages for the script to find an empty one.
What I recommend is inserting all 2500 records into the database (every combination of 1-10 and 1-250) and marking each one as empty or not (or creating a link table between user and house), then matching on that.
With MySQL you can then select a random empty entry from the database in no time!
Because it's only 2500 records, you can use ORDER BY RAND() LIMIT 1 to get a random row. I don't recommend this when you have many more records.