I have a website where my database is set up with artists and song titles in the same row, so it might look like this:
artist: The Monkees, title: I'm A Believer
artist: The Monkees, title: Daydream Believer
artist: The Hollies, title: The Air That I Breathe
artist: The Hollies, title: Bus Stop
artist: The Beatles, title: Hello, Goodbye
artist: The Beatles, title: Yellow Submarine
And I have an autocomplete widget set up with my site's search form that is fed a json_encoded array filled with 'artist' values.
The first problem is that if a user were to begin typing "the" into the search form, values would come up like this:
The Monkees
The Monkees
The Hollies
The Hollies
The Beatles
The Beatles
So I used the array_unique function to remove duplicate values, but it seems that even if a value has one duplicate word (this case being "the"), it is removed entirely, so only the first value is returned:
The Monkees
Where the output I would like to have would be:
The Monkees
The Hollies
The Beatles
So, what might be another way I can remove these duplicate values and display them the way I would like?
EDIT:
Here is my source code:
<?php
include 'includes/config.php';
$return_arr = array();
$term = ($_GET['term']);
if ($con)
{
$artist = mysql_query("SELECT * FROM songs WHERE artist LIKE '%$term%' LIMIT 0, 5");
while ($row = mysql_fetch_array($artist, MYSQL_ASSOC)) {
$row_array['value'] = strtolower($row['artist']);
array_push($return_arr,$row_array);
}
}
mysql_close($con);
echo json_encode(array_unique($return_arr));
?>
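array_unique compares the string representation of each element (by default two elements are equal when (string) $a === (string) $b), so differences in case and whitespace are taken into consideration. In your code, though, $return_arr is an array of arrays: each $row_array is cast to the string "Array", so every element looks like a duplicate of the first one and only one value survives. That is the likely reason array_unique is not working the way you would expect. Push the plain artist string instead of $row_array and the duplicates will be removed properly.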
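A minimal sketch of deduplicating the artist names, assuming they are collected as plain strings rather than as nested arrays. The $rows array is a hypothetical stand-in for the rows fetched from the database:

```php
<?php
// Hypothetical rows, standing in for what the query in the question returns.
$rows = array(
    array('artist' => 'The Monkees'),
    array('artist' => 'The Monkees'),
    array('artist' => 'The Hollies'),
    array('artist' => 'The Hollies'),
    array('artist' => 'The Beatles'),
);

$artists = array();
foreach ($rows as $row) {
    // Collect plain strings; normalise case and whitespace before comparing.
    $artists[] = strtolower(trim($row['artist']));
}

// array_unique() keeps the original keys, so reindex with array_values()
// to make json_encode() emit a JSON list instead of an object.
$unique = array_values(array_unique($artists));

echo json_encode($unique);
```

Adding DISTINCT to the SELECT (and selecting only the artist column) would probably remove the duplicates before they ever reach PHP.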
Your database structure makes it pretty difficult to weed out duplicates. I would suggest refactoring it into a table of artists and a table of songs, where songs simply reference the id of artist. This will give you a better chance of being able to keep your artist list unique.
Also, one thing I would do for your autocomplete is set it up to ignore certain strings. ('a', 'an', 'the') These are known as stopwords, and help search results be more relevant by not performing a search on common words.
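Something like the following could strip stopwords from the search term before it is used in the query. This is only a sketch: the stripStopwords() name and the three-word list are assumptions taken from the examples above, not a complete stopword list.

```php
<?php
// Remove stopwords from a search term (hypothetical helper).
function stripStopwords($term, array $stopwords = array('a', 'an', 'the'))
{
    // Split on whitespace, dropping empty pieces.
    $words = preg_split('/\s+/', trim($term), -1, PREG_SPLIT_NO_EMPTY);

    $kept = array();
    foreach ($words as $word) {
        // Compare case-insensitively against the stopword list.
        if (!in_array(strtolower($word), $stopwords, true)) {
            $kept[] = $word;
        }
    }

    return implode(' ', $kept);
}

echo stripStopwords('The Monkees');
```

If the result is an empty string (the user has only typed "the" so far), you can skip the query entirely.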
I have the following street names and house numbers in a text file:
Albert Dr: 4116-4230, 4510, 4513-4516
Bergundy Pl: 1300, 1340-1450
David Ln: 3400, 4918, 4928, 4825
Garfield Av: 5000, 5002, 5004, 5006, 8619-8627, 9104-9113
....
This data represents the boundary data for a local neighborhood (i.e., what houses are inside the community).
I want to make a PHP script that will take a user's input (in the form of something like "4918 David Lane" or "3000 Bergundy") search this list, and return a yes/no response whether that house exists within the boundaries.
What would be an efficient way to parse the input (regex?) and compare it to the text list?
Thanks for the help!
It's better to store this info in a database so that you don't have to parse out the data from a text file. Regexes are also not generally applicable to find a number in a range so a general purpose language is advised as well.
But... if you want to do it with regexes (and see why it's not a good idea)
To lookup the numbers for a street use
David Ln:(.*)
To then get the numbers use
[^,]*
You could simply import the file into a string. After this is done, break each line of the file into an array, so Array(Line 1 => array(), Line 2 => array(), etc.). Then you can explode each line using ":". After that, you'll simply need to search in the array. Not the fastest way, but it may be faster than regex.
You should seriously consider using a database or rethinking how your file is structured.
Try something like this: put your street names inside test.txt. Once you are able to get the details out of the text file, just compare them with the values that you submit in your form.
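$filename = 'test.txt';
$name = array();
if (file_exists($filename) && ($handle = fopen($filename, 'r'))) {
    while (($line = fgets($handle)) !== FALSE) {
        // Split "Street: numbers" at the colon; skip lines that don't match.
        if (!preg_match('#^([^:]+):(.*)$#', trim($line), $match)) {
            continue;
        }
        foreach (explode(',', $match[2]) as $val) {
            // Ranges such as "4116-4230" are stored as-is and still need expanding.
            $name[trim($match[1])][] = trim($val);
        }
    }
    fclose($handle);
}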
As mentioned, using a database to store street numbers that are related to your street names would be ideal. I think a way you could implement this with your text file, though, is to create a 2D array, storing the street names in the first array and the valid street numbers in their respective arrays.
Parse the file line by line in a loop. Parse the street name and store in array, then use a nested loop to parse all of the numbers (for ones in a range like 1414-1420, you can use an additional loop to get each number in the range) and build the next array in the initial street name array element. When you have your 2D array, you can do a simple nested loop to check it for a match.
I will try to make a little pseudo-code for you:
pseudocode:
$addresses = array();
$line = file->readline
while(!file->eof)
{
    // key the outer array by street name, the inner array holds its numbers
    $street = parse_street_name($line);
    $addresses[$street] = array();
    $numbers_array = parse_street_numbers($line);
    foreach($numbers_array as $num)
        $addresses[$street][] = $num;
    $line = file->readline
}
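To make the idea concrete, here is a runnable sketch of the same approach. It stores each entry as a [min, max] pair instead of expanding every range, and it matches the user's street on its first word only, so that "David Lane" finds "David Ln". Both choices, and the function names parseBoundaries() and houseInBoundary(), are assumptions of this sketch rather than anything the text file format requires.

```php
<?php
// Parse "Street: 100-200, 300" lines into street => list of [min, max] pairs.
function parseBoundaries($text)
{
    $streets = array();
    foreach (preg_split('/\R/', trim($text)) as $line) {
        if (strpos($line, ':') === false) {
            continue; // not a "street: numbers" line
        }
        list($street, $numbers) = explode(':', $line, 2);
        $key = strtolower(trim($street));
        foreach (explode(',', $numbers) as $part) {
            $bounds = array_map('intval', explode('-', trim($part)));
            $min = $bounds[0];
            // A single number is treated as a range of one.
            $max = isset($bounds[1]) ? $bounds[1] : $min;
            $streets[$key][] = array($min, $max);
        }
    }
    return $streets;
}

// Check input such as "4918 David Lane" against the parsed boundaries.
function houseInBoundary(array $streets, $input)
{
    // Expect a house number followed by at least one street word.
    if (!preg_match('/^\s*(\d+)\s+(\S+)/', $input, $m)) {
        return false;
    }
    $number = (int) $m[1];
    $firstWord = strtolower($m[2]);

    foreach ($streets as $street => $ranges) {
        // Assumption: the first word is enough to identify the street.
        if (strtok($street, ' ') !== $firstWord) {
            continue;
        }
        foreach ($ranges as $range) {
            if ($number >= $range[0] && $number <= $range[1]) {
                return true;
            }
        }
    }
    return false;
}

$boundaries = parseBoundaries(
    "Albert Dr: 4116-4230, 4510, 4513-4516\n" .
    "Bergundy Pl: 1300, 1340-1450\n" .
    "David Ln: 3400, 4918, 4928, 4825"
);
var_dump(houseInBoundary($boundaries, '4918 David Lane'));
var_dump(houseInBoundary($boundaries, '3000 Bergundy'));
```

Matching on the first word would break if two streets shared it (say "David Ln" and "David Ct"), so a real version would need a stricter comparison.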
It's better if you store your streets in a separate table with IDs, and store numbers in separate table one row for each range or number and street id.
For example:
streets:
ID, street
-----------
1, Albert Dr
2, Bergundy Pl
3, David Ln
4, Garfield Av
...
houses:
street_id, house_min, house_max
-----------------
1, 4116, 4230
1, 4510, 4510
1, 4513, 4516
2, 1300, 1300
2, 1340, 1450
...
For rows that hold a single house number rather than a range, set both min and max to the same value.
You can write a script that parses your txt file and saves all the data to the db. That should take no more than a few loops, explode() with different delimiters, and some insert queries.
Then, with the first query, you get the street id:
SELECT id FROM streets WHERE street LIKE '%[street name]%'
After that, you run a second query to find out whether that house number exists on that street:
SELECT COUNT(*)
FROM houses
WHERE street_id = [street_id]
AND [house_num] BETWEEN house_min AND house_max
Inside [...] you put the real values; don't forget to escape them to prevent SQL injection.
Or you can even run just one query using JOIN.
Also, you should make sure that the given house number is an integer, not a float.
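For illustration, the single JOIN query could look roughly like this with PDO prepared statements, which take care of the escaping. The sketch uses an in-memory SQLite database purely so that it runs on its own (an assumption; with MySQL only the connection string would change), and houseExists() is a hypothetical helper name.

```php
<?php
// In-memory SQLite stands in for the real database (assumption of this sketch).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE streets (id INTEGER PRIMARY KEY, street TEXT)');
$pdo->exec('CREATE TABLE houses (street_id INTEGER, house_min INTEGER, house_max INTEGER)');
$pdo->exec("INSERT INTO streets (id, street) VALUES (1, 'Albert Dr')");
$pdo->exec("INSERT INTO streets (id, street) VALUES (3, 'David Ln')");
$pdo->exec('INSERT INTO houses VALUES (1, 4116, 4230)');
$pdo->exec('INSERT INTO houses VALUES (1, 4510, 4510)');
$pdo->exec('INSERT INTO houses VALUES (3, 4918, 4918)');

// Returns true if the house number falls inside any range for a matching street.
function houseExists(PDO $pdo, $street, $houseNum)
{
    $stmt = $pdo->prepare(
        'SELECT COUNT(*)
         FROM houses h
         JOIN streets s ON s.id = h.street_id
         WHERE s.street LIKE :street
           AND :num BETWEEN h.house_min AND h.house_max'
    );
    // Bound parameters are never interpolated into the SQL text.
    $stmt->bindValue(':street', '%' . $street . '%', PDO::PARAM_STR);
    $stmt->bindValue(':num', (int) $houseNum, PDO::PARAM_INT);
    $stmt->execute();

    return $stmt->fetchColumn() > 0;
}

var_dump(houseExists($pdo, 'Albert', 4200));
var_dump(houseExists($pdo, 'David', 3000));
```

The (int) cast also covers the last point: a float or non-numeric house number is reduced to an integer before it reaches the query.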
I am just starting with Sphinx. So far I got it installed successfully, got a table called profiles on my MySQL database indexed and am able to get the correct results back using the PHP API. I am using CodeIgniter so I wrapped the default PHP API as a CodeIgniter library.
Anyway this is how my code looks like:
$query = $_GET['q'];
$this->load->library('sphinxclient');
$this->sphinxclient->setMatchMode(SPH_MATCH_ANY);
$result = $this->sphinxclient->query($query);
$to_fetch = array();
foreach($result['matches'] as $key => $match) {
array_push($to_fetch, $key);
}
The array $to_fetch contains the ids of the matched table rows. Now I can use a typical MySQL query to get all the relevant users to display on the search page like so:
$query = 'SELECT * FROM profiles WHERE id IN('. join(',', $to_fetch) . ')';
My questions are:
Is this the right way to go about it, or is there a default "Sphinx way of doing it" that would be better for performance?
Secondly, all I get back at the moment is the ids of the matched table rows. I also want the part of the text in the column that matched. For example, if someone searches for the keyword dog and a user in the profiles table has the following text in their about column:
I like dogs. I also like ice cream.
I would like Sphinx to return:
I like <strong>dogs</strong>. I also like ice cream.
How can I do that? I tried to play around with the buildExcerpts() function but can't get it to work.
EDIT
This is how I am getting excerpts now:
// get matched user ids
$to_fetch = array();
foreach($result['matches'] as $key => $match) {
array_push($to_fetch, $key);
}
// get user details of matched ids
$members = $this->search_m->get_users_by_id($to_fetch);
// build excerpts
$excerpts = array();
foreach($members as $member) {
$fields = array(
$member['about'],
$member['likes'],
$member['dislikes'],
$member['occupation']
);
$options = array(
'before_match' => '<strong class="match">',
'after_match' => '</strong>',
'chunk_separator' => ' ... ',
'limit' => 60,
'around' => 3,
);
$excerpt_result = $this->sphinxclient->BuildExcerpts($fields, 'profiles', $query, $options);
$excerpts[$member['user_id']] = $excerpt_result;
}
$excerpts_to_return = array();
foreach($excerpts as $key => $excerpt) {
foreach($excerpt as $v) {
if(strpos($v, '<strong class="match">') !== false) {
$excerpts_to_return[$key] = $v;
}
}
}
As you can see I am searching each query across 4 different mysql columns:
about
likes
dislikes
occupation
Because of this I don't know which of the 4 columns contains the matched keyword. It could be any of them or even more than one. So I have no choice but to run the contents of all 4 columns through the BuildExcerpts() function.
Even then I don't know which one BuildExcerpts() returned with the <strong class="match"> tags. So I run a strpos check on all values returned by BuildExcerpts() to finally get the proper excerpt and map it to the user whose profile it belongs to.
Do you see a better way than this given my situation where I need to match against the contents of 4 different columns?
Yes, that looks like a good way. One thing to remember: the rows coming back from MySQL probably won't be in the order Sphinx returned them.
See the FAQ on the Sphinx site for how to use FIELD(), but personally I like to put the rows from MySQL into an associative array keyed by id, then just loop through the Sphinx ID list and get each row from the array. That avoids a sorting phase altogether, at the expense of memory!
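A small sketch of that reordering step. The ids and rows below are made up for illustration; in the real code $sphinxIds would be the keys of $result['matches'] and $rows would come from the WHERE id IN (...) query.

```php
<?php
// Ids in the order Sphinx ranked them (hypothetical values).
$sphinxIds = array(42, 7, 19);

// Rows as MySQL might return them: same ids, arbitrary order.
$rows = array(
    array('id' => 7,  'name' => 'bob'),
    array('id' => 19, 'name' => 'carol'),
    array('id' => 42, 'name' => 'alice'),
);

// Index the rows by id so each lookup is a single array access.
$byId = array();
foreach ($rows as $row) {
    $byId[$row['id']] = $row;
}

// Walk the Sphinx order and pull each row out of the index.
$ordered = array();
foreach ($sphinxIds as $id) {
    if (isset($byId[$id])) { // a row may have been deleted since indexing
        $ordered[] = $byId[$id];
    }
}

print_r($ordered);
```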
As for highlighting, yes, do persevere with buildExcerpts; that is the way to do it.
edit to add, this demo
http://nearby.org.uk/sphinx/search-example5-withcomments.phps
demonstrates both getting rows from mysql and "sorting" in the app. And buildExcerpts.
Below is a link to my original question:
PHP: How to display a variable (a) within another variable(b) when variable (b) contains text
Ok here's more to the problem, all your suggestions work but now I'm looking for the most efficient method to my specific problem.
In my database I have several blocks of text. When a user(described as $teamName) logs in to the site, they are randomly assigned one of these blocks of text. Each block of text is different and may have different variables in it.
The problem is I don't have knowledge of which block of text is assigned to the user without actually viewing the database or running a query. So at the moment I have to query the database and select the $newsID that corresponds to the block of text that the user has been assigned.
Because I have preset the blocks of text, I know what they contain, so I can now do a switch($newsID) and, depending on the value of $newsID, pass the correct values to the sprintf() function.
There is however, many many blocks of text so there will be many instances of case "": and break;. I wish to have the site working so that if at any stage I change a block of text to something different, then the variables within sprintf() are automatically updated, rather than me manually updating sprintf() within the switch() case:.
Sorry for the long post, hope it makes sense.
EDIT:
I have these predetermined blocks of text in my database in my teamNews table:
For $newsID = 1:
"$teamName is the name of a recently formed company hoping to take over the lucrative hairdryer design
$sector"
For $newsID = 2:
"The government is excited about the potential of $teamName, after they made an announcement that they have hired $HoM"
For $newsID = 3:
"It is rumored that $teamName are valuing their hairdryer at $salePrice. People are getting excited."
When a user($teamName) logs into the game they are randomly assigned one of these blocks of text with $newsID of 1,2 or 3.
Lets say the user is assigned the block of text with $newsID = 2. So now their username($teamName) is inserted into the database into the same row as their selected text.
Now I want to display the text corresponding to this user so I do the following:
list($news, $ID) = news($currentStage, $teamName);
switch ($ID)
{
    case "1":
        $news = sprintf($news, $teamName, $sector);
        break;
    case "2":
        $news = sprintf($news, $teamName, $HoM);
        break;
    case "3":
        $news = sprintf($news, $teamName, $salePrice);
        break;
}
echo $news."<br/><hr/>";
$currentStage--;
} // end of the enclosing loop (not shown)
With the function
function news($period,$teamName)
{
    $result = mysql_query("
    SELECT `content`,`newsID` FROM `teamnews` WHERE `period` = '$period' && `teamName` = '$teamName'
    ") or die(mysql_error());
    $row = mysql_fetch_assoc($result);
    // PHP cannot return two comma-separated values; return them as an array.
    return array($row['content'], $row['newsID']);
}
The problem is that in reality there are about 20 different blocks of text that the user could be assigned to. So I will have many case:'s.
Also, if I want to change all the text blocks in the database, I would also have to manually change all the variables in the sprintf calls in each case:.
I am wondering whether there is a better way to do this, so that if I change the text in the database, the parameters passed to sprintf change accordingly.
So if I use
$replaces = array(
'teamName' => 'Bob the team',
'sector' => 'murdering',
'anotherSector' => 'giving fluffy bunnies to children'
);
is it possible to do this:
$replaces = array(
'$teamName' => '$teamName',
'$sector' => '$sector',
'$anotherSector' => '$anothersector'
);
I suggest you have fixed set of named placeholders, and use either the str_replace() or eval() (evil) methods of substitution.
So you would (for example) always have a $teamName and a $sector - and you might only sometimes use $anotherSector. And you have these two strings:
1 - $teamName, is the name of a recently formed company hoping to take over the lucrative $sector.
2 - The people at $teamName hate working in $sector, they would much rather work in $anotherSector
If you were to do:
$replaces = array(
'$teamName' => 'Bob the team',
'$sector' => 'murdering',
'$anotherSector' => 'giving fluffy bunnies to children'
);
$news = str_replace(array_keys($replaces),array_values($replaces),$news);
You would get
1 - Bob the team, is the name of a recently formed company hoping to take over the lucrative murdering.
2 - The people at Bob the team hate working in murdering, they would much rather work in giving fluffy bunnies to children
As long as your placeholders have known names, they don't all have to be present in the string - only the relevant ones will be replaced.
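One caveat worth sketching: str_replace() applies the replacements one after another, so a placeholder whose name is a prefix of another one can clobber it. strtr() with an array looks for the longest matching key first and does not have that problem. The $sectorName placeholder below is hypothetical, added only to show the difference.

```php
<?php
// Single quotes matter here: PHP must not interpolate the $names itself.
$replaces = array(
    '$sector'     => 'murdering',
    '$sectorName' => 'Crime', // hypothetical placeholder sharing a prefix
);
$news = 'They work in $sectorName.';

// Sequential replacement: '$sector' is handled first and eats part of '$sectorName'.
$viaStrReplace = str_replace(array_keys($replaces), array_values($replaces), $news);

// Longest-key-first replacement in a single pass.
$viaStrtr = strtr($news, $replaces);

echo $viaStrReplace, "\n"; // They work in murderingName.
echo $viaStrtr, "\n";      // They work in Crime.
```

If none of your placeholder names overlap like this, either function will give the same result.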
You could create a simple template language, and store templates in your database.
You can use strtr for this.
function replaceTemplateVars($str, $data) {
// change the key format to correspond to the template replacement format
$replacepairs = array();
foreach($data as $key => $value) {
$replacepairs["{{{$key}}}"] = $value;
}
// do the replacement in bulk
return strtr($str, $replacepairs);
}
// store your teamNews table text in this format
// double curly braces is easier to spot and less ambiguous to parse than `$name`.
$exampletemplate = '{{teamName}} is {{sector}} the {{otherteam}}!!';
// get $values out of your database for the user
$values = array(
'teamName' => 'Bob the team',
'sector' => 'murdering',
'otherteam' => 'fluffy bunnies'
);
echo replaceTemplateVars($exampletemplate, $values);
// this will echo "Bob the team is murdering the fluffy bunnies!!"
If you have needs more ambitious than this, such as looping or filters, you should find a third-party php template language and use it.
What about function eval?
http://php.net/eval
the problem in short,
Field:ProfileItems = "action,Search,Work,Flow,pictures";
Mysql query = "SELECT ProfileItems FROM addUsers";
then I explode with , making array e.g.: array('action','search',...etc)
and create a form field for each item.
Result:
<form>
action : <input type=text name=action>
search : <input type=text name=search>
...etc
<input type=submit>
</form>
My problem is how can I replace names in the database with more user friendly ones (add description) to fields without using an IF statement??
//created asoc array with Key = search item and value = user friendly value
$prase = array("ABS" => "ABS (Anti lock braking System)"
,"DriverAirBag" => "Air bags");
$string= "ABS,DriverAirbag,GOGO,abs";
foreach($prase as $db=>$eu){
echo "if $db will be $eu<br>";
echo str_ireplace($eu,$db,$string);
}
echo $string;
Tried above but was an epic fail :D !.. can you please help me out ?
Having a map inside PHP is not an unreasonable approach, but you're doing the str_ireplace() backwards: it's search, replace, subject, so in your case str_ireplace($db, $eu, $string);
But just doing a str_ireplace() on a comma-separated list of strings is not ideal anyway. For one thing, imagine if after you did the substitution for ABS you then encountered another profile item that matched lock (which just so happens to appear in "Anti-lock braking system"). Oops. Now you've overwritten your earlier replacement!
How about something like this:
$phrase = array("ABS" => "ABS (Anti lock braking System)"
,"DriverAirbag" => "Air bags");
$string= "ABS,DriverAirbag,GOGO,abs";
$fields = explode(',', $string);
foreach($fields as $field) {
    // fall back to the raw name when there is no friendly one
    $friendly = $field;
    if (isset($phrase[$field]))
        $friendly = $phrase[$field];
    echo htmlspecialchars($friendly) . ': <input type="text" name="' . htmlspecialchars($field) . '" />';
}
The key here is that you're handling each field separately. And you're never just doing a replacement; you're looking specifically for the keywords "ABS" or "DriverAirbag". If there's not an exact match, you don't have a human-friendly name for that item, and there's no point doing any replacement.
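Note that an array lookup like this is case-sensitive, so "abs" would not pick up the name stored under "ABS". If you want the case-insensitive behaviour you were after with str_ireplace(), one possible sketch is to lowercase the map keys once and lowercase each field before looking it up. The friendlyName() helper is a hypothetical name:

```php
<?php
$phrase = array(
    'ABS'          => 'ABS (Anti lock braking System)',
    'DriverAirBag' => 'Air bags',
);

// Lowercase every key once, up front.
$lookup = array_change_key_case($phrase, CASE_LOWER);

// Return the friendly name, or the raw field when none is defined.
function friendlyName(array $lookup, $field)
{
    $key = strtolower($field);
    return isset($lookup[$key]) ? $lookup[$key] : $field;
}

$labels = array();
foreach (explode(',', 'ABS,DriverAirbag,GOGO,abs') as $field) {
    $labels[] = friendlyName($lookup, $field);
}

print_r($labels);
```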
All this can be improved even further if you have the ability to change the database schema. Storing a comma-separated list is never desirable. You should have a table with a schema something like:
field_id (e.g., "ABS")
name (e.g., "Anti-lock Braking System")
And another table like:
user_id (I'm inferring a little here from the name addUsers; use whatever field/s you have in addUsers now identifying the person)
field_id (i.e., foreign key to the above field table)
Note that you may end up with many rows in this table for each person (1, 'ABS'), (1, 'DriverAirbag')
But then your query can become
SELECT field_id, name
FROM user_field
INNER JOIN field USING (field_id)
Now you get back one row for each field (no explode required!) and each row includes both the computer-friendly and human-friendly name.
Hey, basically what I am trying to do is automatically assign tags to a user input string. I have 5 tags to assign, and each tag will have around 10 keywords. A string can only be assigned one tag. In order to assign a tag to a string, I need to search it for words matching the keywords of all five tags.
Example:
TAGS: Keywords
Drink: Beer, whiskey, drinks, drink, pint, peg.....
Fitness: gym, yoga, massage, exercise......
Apparels: men's shirt, shirt, dress......
Music: classical, western, sing, salsa.....
Food: meal, grilled, baked, delicious.......
User String: Take first step to reach your fitness goals, Pay Rs 199 for Aerobics, Yoga, Kick Boxing, Bollywood Dance and more worth Rs 1000 at The very Premium F Chisel Bounce, Koramangala.
Now I need to decide upon a tag for the above string, and I need a time-efficient algorithm for this problem. I don't know how to go about matching keywords against strings, but I do have a thought about deciding the tag. I was thinking of maintaining a count for each tag, and as a keyword is matched the count for the respective tag is increased. If at any time the count for any tag reaches 5, we can stop and decide on that tag, which saves us from searching the whole thing.
Please give any advice you have on this. I will be using PHP, just so you know.
thanks
Interesting topic! What you are looking for is something similar to latent semantic indexing. There is a question about it here.
If the number of tags and keywords is small, I would save myself writing a complex algorithm and simply do:
$tags = array(
'drink' => array('beer', 'whiskey', ...),
...
);
$string = 'Take first step ...';
$bestTag = '';
$bestTagCount = 0;
foreach ($tags as $tag => $keywords) {
$count = 0;
foreach ($keywords as $keyword) {
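// substr_count() is case-sensitive, so lowercase the input (keywords are assumed lowercase)
$count += substr_count(strtolower($string), $keyword);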
}
if ($count > $bestTagCount) {
$bestTagCount = $count;
$bestTag = $tag;
}
}
var_dump($bestTag);
The algorithm is pretty obvious, but only suited for a small number of tags/keywords.
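If you want the early exit described in the question (stop as soon as one tag reaches a fixed number of hits), a variation could look like the sketch below. It also matches whole words case-insensitively, so "peg" would not count inside a longer word. The bestTag() name, the threshold default of 5 and the shortened keyword lists are assumptions for illustration.

```php
<?php
// Return the tag with the most keyword hits, or null if nothing matches.
// Stops early as soon as any tag reaches $threshold hits.
function bestTag($string, array $tags, $threshold = 5)
{
    $best = null;
    $bestCount = 0;

    foreach ($tags as $tag => $keywords) {
        $count = 0;
        foreach ($keywords as $keyword) {
            // \b anchors the keyword to word boundaries; /i ignores case.
            $pattern = '/\b' . preg_quote($keyword, '/') . '\b/i';
            $count += preg_match_all($pattern, $string);
            if ($count >= $threshold) {
                return $tag; // enough evidence, skip the remaining tags
            }
        }
        if ($count > $bestCount) {
            $bestCount = $count;
            $best = $tag;
        }
    }

    return $best;
}

$tags = array(
    'drink'   => array('beer', 'whiskey', 'drink', 'pint', 'peg'),
    'fitness' => array('gym', 'yoga', 'massage', 'exercise'),
);

var_dump(bestTag('Pay Rs 199 for Aerobics, Yoga, Kick Boxing and gym access', $tags));
```

With only 5 tags of about 10 keywords each this is at most around 50 regex scans per string, so the early exit is unlikely to matter much until the lists grow.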
If you dont mind using an external API, you should try one of these:
http://www.zemanta.com/
http://www.opencalais.com/
Benjamin Nowack: Linked Data Entity Extraction with Zemanta and OpenCalais
To give an example, Zemanta will return the following tags (among other things) for your User String:
Bollywood, Kickboxing, Koramangala, Aerobics, Boxing, Sports, India, Asia
Open Calais will return
Sports, Hospitality Recreation, Health, Recreation, Human behavior, Kick, Yoga, Chisel
Aerobics, Meditation, Indian philosophy, Combat sports, Aerobic exercise, Exercise