Autocomplete movie names - PHP

I have jQuery autocomplete currently implemented for searching movie names. It starts at 2 characters, with a 150ms delay between requests.
Behind it, PHP and a MySQL database do a LIKE '%term%' search to return the results.
I find this is pretty slow and database intensive.
I tried using MySQL's full-text search, but didn't have much luck - perhaps I wasn't using the right match type.
Can someone suggest tweaks to the MySQL full-text search, or whether I should go straight to an indexing solution like Lucene or Sphinx, and whether they work well on partial matches of only 1-3 characters?
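For context: prefix matching with MySQL full-text search is usually done in BOOLEAN MODE with a trailing wildcard, and the default minimum indexed word length often has to be lowered before 2-3 character prefixes match at all. A minimal sketch, assuming a movies table with a FULLTEXT index on title (all names are illustrative):
// Assumes: ALTER TABLE movies ADD FULLTEXT(title);
$mysqli = new mysqli('localhost', 'user', 'pass', 'moviedb');
$term   = $mysqli->real_escape_string($_GET['term']);
// The trailing * turns the typed text into a prefix match.
$sql = "SELECT title FROM movies
        WHERE MATCH(title) AGAINST ('$term*' IN BOOLEAN MODE)
        LIMIT 10";
foreach ($mysqli->query($sql) as $row) {
    echo $row['title'], "\n";
}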

You are limited by the speed at which your database returns results. I can suggest the following to speed things up.
Don't make a new connection to MySQL from PHP for every request. Enable database connection pooling; it improves performance quite a lot. I don't know how to do connection pooling in PHP. This might help.
If possible, cache the results in PHP so that you don't hit the database every time (see the sketch below).
Use an external service to serve the autocomplete data. Look at Autocomplete as a Service. This relieves you of writing the autocomplete backend and produces faster results.
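A rough sketch of the first two suggestions, using mysqli's persistent connections (the "p:" host prefix) together with APCu for short-lived result caching; table, column and credential names are placeholders:
// "p:" makes PHP reuse an already-open connection instead of creating a new one.
$mysqli = new mysqli('p:localhost', 'user', 'pass', 'moviedb');
$term = $_GET['term'];
$key  = 'ac_' . md5($term);
// Serve repeated prefixes from the APCu cache instead of hitting MySQL.
$titles = apcu_fetch($key, $hit);
if (!$hit) {
    $stmt = $mysqli->prepare("SELECT title FROM movies WHERE title LIKE CONCAT(?, '%') LIMIT 10");
    $stmt->bind_param('s', $term);
    $stmt->execute();
    $titles = array_column($stmt->get_result()->fetch_all(MYSQLI_ASSOC), 'title');
    apcu_store($key, $titles, 60);   // keep for 60 seconds
}
echo json_encode($titles);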

Related

Best solution for custom live search task

I'm going to add a simple live search to a website (suggestions while entering text in an input box).
Main task:
39k plain text lines to search in (~500 characters per line, 4MB total size)
1k online users may be typing something in the input box simultaneously
In some cases 2k-3k results can match a user's request
I'm worried about the following questions:
Database vs. text file?
Are there any general rules or best practices related to my task, aimed at decreasing db/server memory load? (caching/indexing/etc.)
Are Sphinx/Solr appropriate for such a task?
Any links/advice will be extremely helpful.
Thanks
P.S. Maybe this is the best solution? PHP to search within txt file and echo the whole line
Put your data in a database (SQLite should do just fine, but you can also use a more heavy-duty RDBMS like MySQL or Postgres), and put an index on the column or columns that will be searched.
Only do the absolute minimum, which means that you should not use a framework, an ORM, etc. They will just slow down your code.
Create a PHP file, grab the search text and do a SELECT query using a native PHP driver, such as SQLite, MySQLi, PDO or similar.
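A bare-bones version of such a file might look like this (a sketch assuming an SQLite file search.db with a lines table and an indexed text column; all names are illustrative):
// search.php - no framework, no ORM, just the native PDO driver.
$pdo = new PDO('sqlite:search.db');
$q   = isset($_GET['q']) ? $_GET['q'] : '';
// Prefix match (no leading wildcard) against the indexed column.
$stmt = $pdo->prepare('SELECT text FROM lines WHERE text LIKE ? LIMIT 20');
$stmt->execute(array($q . '%'));
echo json_encode($stmt->fetchAll(PDO::FETCH_COLUMN));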
Also, think about how the search box will work. You can prevent many requests if you e.g. set a minimum character limit (it does not make sense to search for only one or two characters), add a short delay before sending requests (so that you do not send requests that are never used), and so on.
Whether or not to use an extension such as Solr depends on your circumstances. If you have a lot of data, and a lot of requests, then maybe you should look into it. But if the problem can be solved using a simple solution then you should probably try it out before making it more complicated.
I have implemented 'live search' many times, always using AJAX to query the database (MySQL), and haven't had or observed any speed or heavy load issues yet.
I have also seen implementations using Solr, but cannot say whether they were quicker or consumed fewer resources.
It completely depends on the hardware the server runs on, IMO. As I wrote elsewhere, I have seen a server with such a slow filesystem that implementing live search by reading and parsing txt files (or using Solr) could be slower than querying the database. On the other hand, you may be on poor shared webhosting with a slow DB connection (which gets even slower with more concurrent connections), so that won't be the best solution either.
My suggestion: use MySQL with AJAX (look at this jQuery plugin or this article), set proper INDEXes on the searched columns, and if that still turns out to be slow you can move to a txt file.
In the past, I have used Zend Search Lucene with great success.
It is a general-purpose text search engine written entirely in PHP 5. It manages the indexing of your sources and is quite fast (in my experience). It supports many query types, search fields and search ranking.
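A rough sketch of indexing and querying with it (the index path, data file and field name are placeholders):
require_once 'Zend/Search/Lucene.php';
// Build the index once, e.g. from the plain text file.
$index = Zend_Search_Lucene::create('/path/to/index');
foreach (file('data.txt') as $line) {
    $doc = new Zend_Search_Lucene_Document();
    $doc->addField(Zend_Search_Lucene_Field::Text('line', $line));
    $index->addDocument($doc);
}
$index->commit();
// Query it from the live-search endpoint.
$index = Zend_Search_Lucene::open('/path/to/index');
foreach ($index->find($_GET['q']) as $hit) {
    echo $hit->line, "\n";
}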

Best way to find similar posts in PHP/MySQL

I want code that does not cause load on the server, to find similar posts in PHP/MySQL.
I tried
MATCH (post) AGAINST ('string string')
but it caused a lot of load on the server, so much that it stopped my server for a while.
I have over 4,125,274 posts in my database.
Please help me.
While a fulltext index will help, it will still be really slow if you need to load similar items many times. We have an implementation with about 7 million post records with fulltext, and searching can take up to a minute if we rely on MySQL alone.
A good alternative is a search server like Sphinx (http://sphinxsearch.com/), which does its own indexing and caching and is much, much faster.
It is simple and efficient and is used by many big sites like Urban Dictionary, Craigslist, Mozilla, etc.
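A quick sketch of querying it from PHP, assuming the sphinxapi.php client that ships with Sphinx and an index named posts_index built from your posts table:
require_once 'sphinxapi.php';
$cl = new SphinxClient();
$cl->SetServer('localhost', 9312);   // searchd host and port
$cl->SetMatchMode(SPH_MATCH_ANY);    // match any of the given words
$cl->SetLimits(0, 10);               // only a handful of similar posts is needed
$result = $cl->Query('string string', 'posts_index');
if (!empty($result['matches'])) {
    // Matches are keyed by document (post) id.
    $ids = array_keys($result['matches']);
}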
If you want to do it with MySQL queries only, and don't want to run the same search over and over, try caching the returned IDs in memcached.
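A minimal sketch of that caching idea with the Memcached extension (key scheme, connection details and column names are made up):
$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);
$pdo      = new PDO('mysql:host=localhost;dbname=site', 'user', 'pass');
$postId   = 123;              // the post we want similar posts for
$keywords = 'string string';  // e.g. words taken from its title
$key = 'similar_' . $postId;
$ids = $mc->get($key);
if ($ids === false) {
    // Cache miss: run the expensive fulltext query once...
    $stmt = $pdo->prepare('SELECT id FROM posts WHERE MATCH(post) AGAINST (?) LIMIT 10');
    $stmt->execute(array($keywords));
    $ids = $stmt->fetchAll(PDO::FETCH_COLUMN);
    // ...and keep the matching ids around for an hour.
    $mc->set($key, $ids, 3600);
}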
Supposing you already have a fulltext index on post and this doesn't help, you should consider incorporating a dedicated search engine for your posts, such as Lucene (not necessarily the PHP implementation, though).

Good alternatives/practices to "LIKE" with PostgreSQL and PHP?

I'm working with a Postgres database that I have no control over the administration of. I'm building a calendar that deals with seeing if resources (physical items) were online or offline on a specific day. Unfortunately, if they're offline I can only confirm this by finding the resource name in a text field.
I've been using
select * from log WHERE log_text LIKE 'Resource Kit 06%'
The problem is that when we're building a calendar, using LIKE 180+ times (at least 6 resources per day) is as slow as can be. Does anybody know of a way to speed this up (keep in mind I can't modify the database)? Also, if there's nothing I can do on the database end, is there anything I can do on the PHP end?
I think that some form of cache will be required for this. As you cannot change anything in the database, your only option is to pull data from it and store it in some more accessible and faster form. How well this works depends heavily on how often data is inserted into the table. If there are more inserts than selects, it will probably not help much; otherwise there is at least a slight chance of improved performance.
Maybe you can consider using the Lucene search engine, which is capable of fulltext indexing. There is an implementation from Zend, and Apache even offers it as an HTTP service. I haven't had the opportunity to test it, however.
If you don't use something that robust, you can write your own caching mechanism in PHP. It will not be as fast as Postgres, but probably faster than unindexed LIKE queries. If your queries need to be more sophisticated (conditions, grouping, ordering...), you can use an SQLite database, which is file based and doesn't need an extra service running on the server.
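A rough sketch of that idea: a cron script copies the relevant rows from Postgres into a local SQLite file with an index, and the calendar then queries the local copy (connection details and column names are placeholders):
// refresh_cache.php - run from cron every few minutes.
$pg   = new PDO('pgsql:host=dbhost;dbname=logs', 'readonly_user', 'pass');
$lite = new PDO('sqlite:/tmp/log_cache.db');
$lite->exec('DROP TABLE IF EXISTS log_cache');
$lite->exec('CREATE TABLE log_cache (log_date TEXT, log_text TEXT)');
$lite->exec('CREATE INDEX idx_text ON log_cache (log_text)');
$lite->beginTransaction();
$insert = $lite->prepare('INSERT INTO log_cache VALUES (?, ?)');
foreach ($pg->query('SELECT log_date, log_text FROM log') as $row) {
    $insert->execute(array($row['log_date'], $row['log_text']));
}
$lite->commit();
// The calendar page then runs its LIKE queries against the local copy:
//   SELECT * FROM log_cache WHERE log_text LIKE 'Resource Kit 06%'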
Another way could be using triggers in the database, which could, on insert, store the required information in some other, better-indexed table. But without the rights to administer the database, that is probably a dead end.
Please be more specific with your question if you want more specific information.

Approaches to making custom site search

I'm making a social website with lots of different sections, like blogs, galleries, multimedia, etc. Now the time has come to implement the search functionality. The customer refused to use Google search and insisted on a custom one, where results are shown for each section individually.
For example, if user enters 'art', the result should be displayed like this:
3 found in blogs
1 ...
2 ...
3 ...
2 found in galleries
1 ...
2 ...
None found in multimedia
I'm planning to use MySQL fulltext search for this. So the question is: how do I build such a search so that it won't kill the server if very many records match the query? I don't really see how to implement paging in this case.
I would highly recommend NOT using MySQL for full text search, it is slow both in index creation and in performing searches.
Take a look at Sphinx or Lucene, both of which are significantly faster than MySQL and which bind quite readily to PHP applications.
You won't kill a MySQL server with such a thing. Even if your app is huge (we are talking about thousands of queries/sec here), you would just have to set up a replica of your MySQL server dedicated to search. You may want to build a cache of "popular keyword results" to speed things up a bit, but an appliance like a Google Mini is still the best for that...
If you can run a Java servlet container (like Tomcat or Jetty), then I recommend Solr (http://lucene.apache.org/solr/). It sits on top of Lucene and is very powerful. Solr was started at CNET and is used by big sites like Netflix and Zappos. Stack Overflow uses a .NET implementation of Lucene. I'm not familiar with Sphinx, so I can't tell you how it compares to Solr.
If you use Solr, look into faceting. This allows you to perform a search and then get a count of how many documents were in "blogs", "galleries", "multimedia", etc.
Here is a PHP client to interface with Solr (http://code.google.com/p/solr-php-client/).
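With that client, a faceted query might look roughly like this (the host, port and the section field are assumptions about your setup):
require_once 'Apache/Solr/Service.php';
$solr = new Apache_Solr_Service('localhost', 8983, '/solr');
// Ask for matching documents plus a count per section in one request.
$response = $solr->search('art', 0, 10, array(
    'facet'       => 'true',
    'facet.field' => 'section',   // e.g. blogs / galleries / multimedia
));
foreach ($response->facet_counts->facet_fields->section as $section => $count) {
    echo $count . ' found in ' . $section . "\n";
}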
Maybe a better choice is to use Sphinx.
I've done this before on some sites I created. What I did was run one query against each module to find the results. You want to run a MySQL query and then fetch rows in a while loop rather than using a fetch-all. This will make sure you don't over-consume memory.
For example:
while($row = mysql_fetch_array($result)){ echo $row['item_name']; }
You will most likely find that MySQL can handle much larger searches than you think.
Pagination is best done with a paging class, like the one from CodeIgniter or similar.
Are you using a web framework?
Yes, Sphinx or Lucene. Both are good, significantly faster than MySQL, and bind quite readily to PHP applications.

Is there a C++ SOLR library?

I have a Solr box which is fed by a PHP cronjob right now.
I want to speed things up and save some memory by switching to a C++ process.
I don't want to reinvent the wheel by creating a new library.
The only thing is I can't find a library for Solr in C++.
Otherwise I will probably have to create one using cURL.
Does anyone know of a library for Solr written in C++?
Thanks.
There is an attempt to create a C++ API for Solr.
Take a look at this project:
http://code.google.com/p/solcpp/
By "fed", do you mean documents are passed for indexing? You'll probably find that the process doing the "feeding" is not the bottleneck, but rather how quickly Solr can ingest documents.
I'd also recommend some profiling before you do a lot of work, because the process is usually not CPU bound, so the speed increase you'll get by moving to C++ will be disappointing.
Have you optimised your schema as much as possible? The two obvious first steps are:
1. Don't store data that is not needed for display (field IDs, metadata, etc.)
...and the opposite of that...
2. Don't index data that is ONLY used for display and never searched on (supplementary data).
And a quirky thing to try, which sometimes works and sometimes doesn't, is changing the add/overwrite attribute to false:
<add overwrite="false">
This disables the unique id check (I think). So if you are doing a full wipe/replace of the index, and you are certain you are only adding unique documents, this can speed up the import. It really depends on the size of the index, though. If you have over 2,000,000 documents, then every time the indexer adds a new one you gain a bit of speed by not forcing it to check whether that document already exists. Not the most eloquent explanation, but I hope it makes sense.
Personally, I use the Data Import Handler, which removes the need for an intermediary script. It just hooks up to the DB and pulls out the info it needs with a single query.
