MySQL - Full vs partial row retrieval - php

I'm programming a website using PHP/MySQL to allow visitors to search for real estate listings.
The main page shows the list of advertised apartments, displaying just a small subset of the attributes stored in the MySQL table that contains the listed apartments. The full set of attributes for each apartment is only shown on a secondary webpage, once the user selects a result from the list on the main page. So if, for example, the table's columns are price, location, number of rooms and surface area, the main page displays only price and location in the results list, and the remaining attributes are displayed only when the user selects a specific result from the list.
I'm wondering what the best strategy is to ensure fast responses from the database and support the highest possible number of concurrent users: should I retrieve ALL the columns from the table when showing the results list and avoid querying the database when the user selects a given result (since I already have all the data I need to show), or should I extract only the minimum set of columns needed for the results list (price and location, following the example above) and fetch the remaining columns for a specific record only when the user selects it?
I'm querying a single table (no joins or complex queries, although I do use a WHERE clause) and the results list is expected to show around 30 to 50 records at a time. I don't have any data on how many of the listed results a user selects to see additional info, but I would say it's reasonable to assume that users will select around 60% of them.
Thanks in advance for your help!

I'd fetch the first few rows and then use endless scrolling techniques via Ajax. Be sure to have a static list of all entries (cached, so it may sometimes be a little outdated) linked to from every page. That way Google can index every single-object view page.
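To make the two options from the question concrete, here is a minimal sketch of the "narrow list query, then one detail query per click" approach with paging. It is written in Python with an in-memory SQLite database standing in for MySQL; the table name `apartments`, the column names, the sample rows and the helper names `fetch_list_page` and `fetch_details` are all assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE apartments ("
    "id INTEGER PRIMARY KEY, price INTEGER, location TEXT, "
    "rooms INTEGER, surface_area REAL)"
)
# Hypothetical sample data: 120 listings with rising prices.
conn.executemany(
    "INSERT INTO apartments (price, location, rooms, surface_area) "
    "VALUES (?, ?, ?, ?)",
    [(900 + 10 * n, "District %d" % (n % 5), 1 + n % 4, 40.0 + n)
     for n in range(120)],
)


def fetch_list_page(conn, max_price, page, page_size=30):
    """Return only the columns the results list displays, one page at a time."""
    return conn.execute(
        "SELECT id, price, location FROM apartments "
        "WHERE price <= ? ORDER BY price, id LIMIT ? OFFSET ?",
        (max_price, page_size, page * page_size),
    ).fetchall()


def fetch_details(conn, apartment_id):
    """Return the full row for the one listing the user clicked on."""
    return conn.execute(
        "SELECT id, price, location, rooms, surface_area "
        "FROM apartments WHERE id = ?",
        (apartment_id,),
    ).fetchone()


first_page = fetch_list_page(conn, max_price=2000, page=0)
details = fetch_details(conn, first_page[0][0])
```

The detail query is a primary-key lookup, so it should stay cheap even if most listed results end up being opened. Whether that beats fetching every column up front depends on how wide the rows really are, which only measurement on the real table can tell.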

Related

PHP SQL: Dynamically Creating and Deleting Columns in Database

I'm creating a carousel/image slider plugin for WordPress and I've hit a wall. There's going to be an indefinite amount of user input and I need to know how to handle this.
I currently have six static inputs: transition_time, loop_carousel, stop_on_hover, reverse_order, navigation_arrows, and show_pagination and the variable amount of info will come from the images the user wants to use. So this could be anywhere from zero to infinite.
I want to be able to create/delete X amount of columns in the DB.
So starting out there will be zero images, meaning six columns. If a user adds two images I want to have eight columns, two created. If the user deletes them then I want to go back to my original six.
I'm guessing this is possible but how and is this a good idea or should I just have a set amount of images?
You Are Doing It Wrong™.
Changing a table definition should be an exceptional event.
Use two tables, one to model the Carousel, one to store image information, then link them.
Table Carousel:
id
[more fields here]
Table Image:
id
carousel_id (reference to the containing Carousel)
[more fields]
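A rough sketch of that two-table layout, in Python with in-memory SQLite standing in for the WordPress MySQL database. The image columns `url` and `position`, the sample values and the helper names are assumptions for illustration only; the six settings columns are the ones named in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce references
conn.executescript("""
CREATE TABLE carousel (
    id INTEGER PRIMARY KEY,
    transition_time INTEGER,
    loop_carousel INTEGER,
    stop_on_hover INTEGER,
    reverse_order INTEGER,
    navigation_arrows INTEGER,
    show_pagination INTEGER
);
CREATE TABLE image (
    id INTEGER PRIMARY KEY,
    carousel_id INTEGER NOT NULL REFERENCES carousel(id) ON DELETE CASCADE,
    url TEXT NOT NULL,          -- hypothetical column
    position INTEGER NOT NULL   -- hypothetical column: display order
);
""")


def add_image(conn, carousel_id, url, position):
    """Adding an image is an INSERT of one row, not a new column."""
    cur = conn.execute(
        "INSERT INTO image (carousel_id, url, position) VALUES (?, ?, ?)",
        (carousel_id, url, position),
    )
    return cur.lastrowid


def remove_image(conn, image_id):
    """Removing an image is a DELETE of one row."""
    conn.execute("DELETE FROM image WHERE id = ?", (image_id,))


def image_count(conn, carousel_id):
    return conn.execute(
        "SELECT COUNT(*) FROM image WHERE carousel_id = ?", (carousel_id,)
    ).fetchone()[0]


carousel_id = conn.execute(
    "INSERT INTO carousel (transition_time, loop_carousel, stop_on_hover, "
    "reverse_order, navigation_arrows, show_pagination) "
    "VALUES (500, 1, 1, 0, 1, 1)"
).lastrowid

first = add_image(conn, carousel_id, "a.jpg", 0)
second = add_image(conn, carousel_id, "b.jpg", 1)
count_after_add = image_count(conn, carousel_id)

remove_image(conn, first)
remove_image(conn, second)
count_after_remove = image_count(conn, carousel_id)
```

The carousel table keeps its six settings columns no matter how many images come and go; only the number of rows in the image table changes.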

How should I design the database structure for this problem?

I am rebuilding the background system of a site with a lot of traffic.
This is the core of the application and the way I build this part of the database is critical for a big chunk of code and upcoming work. The system described below will have to run millions of times each day. I would appreciate any input on the issue.
The background is that a user can add what he or she has been eating during the day.
Simplified, the process is more or less this:
The user arrives at the site and the site lists his/her choices for the day (if entered before, as the steps below describe).
The user can add a meal (consisting of 1 to unlimited different items of food and their quantity). The meal is added through a search field and is organized in different types (like 'Breakfast', 'Lunch').
During the meal building process a list of the most commonly used food items (primarily by this user, but secondly also by all users) will be shown for quick selection.
The meals will be stored in a FoodLog table that consists of something like this: id, user_id, date, type, food_data.
What I currently have is a huge database with food items from which the search will be performed. The food items are stored with information on both the common name (like "pork cutlets") and on producer (like "coca cola"), along with other detailed information needed.
Question summary:
My problem is that I do not know the best way to store the data so that it is easily accessible in the way I need it, without the database getting out of hand.
Consider 1 million users adding 1 to 7 meals each day. To store each food item for each meal, each day and each user would potentially create (1*avg_num_meals*avg_num_food_items) million rows each day.
Storing the data in some compressed way (e.g. with food_data as a json_encoded string) would reduce the number of rows significantly, but would at the same time make it hard to create the 'most used food items' list and other statistics on the fly.
Should the table be split into several tables? If this is the case, how would they interact?
The site is currently hosted on a mid-range CDN and is using a LAMP (Linux, Apache, MySQL, PHP) backbone.
Roughly, you want a fully normalized data structure for this. You want to have one table for Users, one table for Meals (one entry per meal, with a reference to User; you probably also want to have a time / date of the meal in this table), and a table for MealItems, which is simply an association table between Meal and the Food Items table.
So when a User comes in and creates an account, you make an entry in the Users table. When a user reports a Meal they've eaten, you create a record in the Meals table, and a record in the MealItems table for every item they reported.
This structure makes it straightforward to have a variable number of items with every meal, without wasting a lot of space. You can determine the representation of items in meals with a relatively simple query, as well as determining just what the total set of items any one user has consumed in any given timespan.
This normalized table structure will support a VERY large number of records and support a large number of queries against the database.
First, "Storing the data in some compressed way (like the food_data is an json_encoded string)" is not a recommended idea. This will cause you countless headaches in the future as new requirements are added.
You should definitely have a few tables here.
Users
id, etc
Food Items
id, name, description, etc
Meals
id, user_id, category, etc
Meal Items
id, food_item_id, meal_id
The Meal Items would tie the Meals to the Food Items using ids. The Meals would be tied to Users using ids. This makes it simple to use joins in order to get detailed lists of data- totals, averages, etc. If the fields are properly indexed, this should be a great model to support a large number of records.
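As a sketch of how the four tables fit together, including the "most used food items" list the question asks about: the code below uses Python with in-memory SQLite in place of MySQL. The table and column names follow the outline above where it gives them; `meal_date`, `quantity`, the index names, the sample data and the helper functions are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE food_items (id INTEGER PRIMARY KEY, name TEXT, description TEXT);
CREATE TABLE meals (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    meal_date TEXT NOT NULL,    -- assumed column
    category TEXT NOT NULL      -- e.g. 'Breakfast', 'Lunch'
);
CREATE TABLE meal_items (
    id INTEGER PRIMARY KEY,
    meal_id INTEGER NOT NULL REFERENCES meals(id),
    food_item_id INTEGER NOT NULL REFERENCES food_items(id),
    quantity REAL NOT NULL      -- assumed column
);
-- Indexes on the columns used in joins and WHERE clauses.
CREATE INDEX idx_meals_user_date ON meals (user_id, meal_date);
CREATE INDEX idx_meal_items_meal ON meal_items (meal_id);
CREATE INDEX idx_meal_items_food ON meal_items (food_item_id);
""")

conn.executemany("INSERT INTO users (name) VALUES (?)", [("ann",), ("bob",)])
conn.executemany(
    "INSERT INTO food_items (name) VALUES (?)",
    [("oatmeal",), ("coffee",), ("pork cutlets",)],
)


def log_meal(conn, user_id, meal_date, category, items):
    """Store one meal row plus one meal_items row per (food_item_id, quantity)."""
    meal_id = conn.execute(
        "INSERT INTO meals (user_id, meal_date, category) VALUES (?, ?, ?)",
        (user_id, meal_date, category),
    ).lastrowid
    conn.executemany(
        "INSERT INTO meal_items (meal_id, food_item_id, quantity) VALUES (?, ?, ?)",
        [(meal_id, food_id, qty) for food_id, qty in items],
    )
    return meal_id


def most_used_items(conn, user_id, limit=10):
    """Food items this user has logged most often, most frequent first."""
    return conn.execute(
        "SELECT fi.name, COUNT(*) AS uses "
        "FROM meal_items mi "
        "JOIN meals m ON m.id = mi.meal_id "
        "JOIN food_items fi ON fi.id = mi.food_item_id "
        "WHERE m.user_id = ? "
        "GROUP BY fi.id ORDER BY uses DESC, fi.name LIMIT ?",
        (user_id, limit),
    ).fetchall()


log_meal(conn, 1, "2024-01-01", "Breakfast", [(1, 1.0), (2, 1.0)])
log_meal(conn, 1, "2024-01-01", "Lunch", [(3, 2.0), (2, 1.0)])
log_meal(conn, 2, "2024-01-01", "Breakfast", [(2, 2.0)])

top_for_ann = most_used_items(conn, user_id=1, limit=2)
```

Dropping the `WHERE m.user_id = ?` condition gives the same ranking across all users. With a JSON blob in `food_data`, the equivalent statistic would mean decoding every stored meal in application code.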
In addition to what's been said:
be judicious in your use of indexes. Properly applying these to your database could significantly speed up read access to your tables.
Consider using database-specific features to minimize space. You mention that you're using MySQL; consider using ENUM when appropriate (food types, meal types) to minimize database size and to simplify management.
I would split your meal table into two tables: one stores a single row for each meal, and the second stores one row for each food item used in a meal, with a foreign key reference to the meal it was used in.
After that, just make sure you have indices on any table columns used in joins or WHERE clauses.

Weighing search results

PHP / MySQL backend. I've got a database full of movies YouTube-style. Each video has a name and category. Videos and categories have a m:n relationship.
I'd like my visitors to be able to search for videos by entering all their search terms in one search field. I can't figure out how to return the best search results based on category matches and occurrences of the terms in the name.
What's the best way to go about something like this? Scoring? => Check for each search term whether it occurs in the name of the video; if so, award the video a point; check if the video is in categories that are also contained in the search query; if so, award it a point. Sort it by number points received? That sounds very expensive in terms of CPU usage.
Using Full-Text Search may help: http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html#function_match
You can test several columns at once against an expression.
First, use full-text search. It can be either MySQL full-text search or some kind of external full-text search engine. I recommend Sphinx. It is very fast and simple, and it can even be integrated with MySQL using SphinxSE (so search indexes look like tables in MySQL). However, you have to install and configure it.
Second, think about splitting search results by search type. Any kind of full-text search will return a list of matched items sorted by relevancy. You can search across all fields and get a single list. This is a bad idea because hits by name and hits by category will be mixed. To solve this you can run multiple searches: search by name first, then search by category.
As a result you'll have two matching sets, and you have a lot of options for how to display them. Some ideas:
Merge the 2 sets based on the relevancy rate returned by the search engine. This looks like the result of one single query, but you know what each item is (name hit or category hit), so you can highlight this
Do the same merge as above but assign different weights to different sets, for example relevancy = 0.7*name_relevancy+0.3*category_relevancy. This will make the search results more natural
Split results into tabs/groups (e.g. 'There are N titles and M categories matching your query')
Use bands when displaying results. For each page (assuming you are splitting search results using a paginator) display N items from the first set and M items from the second set (you can display the sets one after the other or shuffle the items). If there are not enough items in one of the sets, then just take more items from the other set, so there are always M+N items per page
Any other way you can imagine
And you can use this method for any kind of field: name, category, actor, director, etc. However, the more fields you use, the more search queries you have to execute
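The weighted merge from the list above (relevancy = 0.7*name_relevancy+0.3*category_relevancy) could look roughly like this in Python. The input format (a dict of video id to relevancy per search), the function name `merge_weighted` and the sample scores are assumptions; a real search engine returns its own relevancy scale, which may need normalising before the two sets are comparable.

```python
def merge_weighted(name_hits, category_hits, name_weight=0.7, category_weight=0.3):
    """Combine two {video_id: relevancy} result sets into one ranked list.

    A video missing from one set contributes 0 for that set.
    Returns (video_id, combined_relevancy) pairs, best match first.
    """
    video_ids = set(name_hits) | set(category_hits)
    scored = [
        (
            video_id,
            name_weight * name_hits.get(video_id, 0.0)
            + category_weight * category_hits.get(video_id, 0.0),
        )
        for video_id in video_ids
    ]
    # Highest combined relevancy first; ties broken by id for a stable order.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


# Hypothetical relevancy values as two separate searches might return them.
name_hits = {1: 0.9, 2: 0.4}
category_hits = {2: 1.0, 3: 0.5}
ranked = merge_weighted(name_hits, category_hits)
ranked_ids = [video_id for video_id, _ in ranked]
```

Because each hit keeps its origin in `name_hits` or `category_hits`, the page can still highlight whether a result matched by name, by category, or both.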
I don't think you can avoid looking at the title and category of every movie for each search. So the CPU usage for that is a given. If you are concerned about the CPU usage of the sort, it would be negligible in most cases, since you would only be sorting the items that have more than zero points.
Having said that, what you probably want is a system that is partially rule-based and partially point-based. For instance, if you have a title that is equal to the search term, it should come first, regardless of points. Architect your search such that you can easily add rules and tweak points as you see fit to yield the best results.
Edit: In the event of an exact title match, you can take advantage of a DB index and not search the whole table. Optionally, the same goes for category.
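A small in-memory sketch of the "partly rule-based, partly point-based" idea, in Python. It only illustrates the ordering logic; the data layout (dicts with `name` and `categories`), the function names and the sample videos are made up, and on a real site the exact-title rule would be an indexed database lookup rather than a scan.

```python
def points(video, terms):
    """One point per search term found in the name, one per term that is a category."""
    name_words = video["name"].lower().split()
    categories = {category.lower() for category in video["categories"]}
    in_name = sum(1 for term in terms if term in name_words)
    in_category = sum(1 for term in terms if term in categories)
    return in_name + in_category


def rank(videos, query):
    """Exact title matches first (rule), then everything else by points."""
    normalized = query.lower().strip()
    terms = normalized.split()
    scored = []
    for video in videos:
        exact = video["name"].lower() == normalized
        score = points(video, terms)
        if exact or score > 0:  # only items that matched something get sorted
            scored.append((not exact, -score, video["name"], video))
    scored.sort(key=lambda entry: entry[:3])
    return [entry[3] for entry in scored]


# Hypothetical sample data.
videos = [
    {"name": "Cats playing", "categories": ["cats", "animals"]},
    {"name": "Cats", "categories": ["musical"]},
    {"name": "Dog tricks", "categories": ["animals"]},
]
result_names = [video["name"] for video in rank(videos, "cats")]
```

Here "Cats playing" collects two points and "Cats" only one, yet the exact-title rule still puts "Cats" first. New rules can be added as further leading elements of the sort key without touching the point calculation.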

MySQL query result split

This is somewhat of a multipart question, but..
I am looking to query a MySQL table to get fields from an event category table.
Each category has a specific calendar assigned to it, in the "calendar" field in the category table.
I am planning to have an HTML list box for each of the different types of calendars (only 4, and they won't change).
Is there a way to query the category table once, and split the results into different arrays?
Ex.
Sports (only categories assigned to the sports calendar appear here):
(in list box):
Basketball
Baseball
Golf
etc.
then,
General:
(only categories assigned to the general calendar appear here)
etc.
etc.
etc.
I thought to do this in one query, instead of querying the whole table for each calendar type, but will there be that much difference in speed?
I am using PHP, by the way.
Thanks for the help.
You can query the table once and use mysql_data_seek (mysqli_data_seek in the newer mysqli extension) to reset the rowset pointer back to the beginning after reading through it, i.e. iterate over the rowset for calendar 1, reset the pointer, iterate over it for calendar 2, and so on. You need only query once, and iterating over the results is very fast compared with querying.
Alternatively, have four strings, each containing the HTML for the content of one of the list boxes, and iterate over the rowset once, appending to the relevant string based on the calendar of the current record.
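The single-pass alternative is essentially a group-by in application code. The question is about PHP, but the idea is language-independent; here is a sketch in Python where `rows` stands in for the result of the one query, as (category name, calendar) pairs. The sample rows beyond the sports examples from the question, and the function names, are made up.

```python
import html
from collections import defaultdict


def group_by_calendar(rows):
    """Walk the result set once, collecting category names per calendar."""
    groups = defaultdict(list)
    for category_name, calendar in rows:
        groups[calendar].append(category_name)
    return dict(groups)


def render_options(category_names):
    """Build the inner HTML of one list box."""
    return "".join(
        "<option>%s</option>" % html.escape(name) for name in category_names
    )


# Stand-in for the rows returned by the single query on the category table.
rows = [
    ("Basketball", "Sports"),
    ("Town hall", "General"),
    ("Baseball", "Sports"),
    ("Golf", "Sports"),
]
groups = group_by_calendar(rows)
sports_box = render_options(groups["Sports"])
```

With only four calendars and a small category table, the speed difference from four separate queries is likely to be minor, but one pass keeps the page down to a single round trip to the database.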

How should I store product and product image data for an online store?

I'm working on a storefront application in PHP with MySQL. I'm currently storing my data in 4 tables: one for product lines, one for specific products within those product lines, one for product images, and one which specifies which images are linked to which specific products. Each specific product belongs to one product line, each product image can belong to several specific products, and each specific product can have several images. My tables look like this:
ProductLines
id, name, description, etc.
SpecificProducts
productLineID
id, color, size, etc.
ProductImageLinks
specificProductID
imageID
Images
id, imageFileLocation, name, etc.
It's working fine this way, but it seems like it's not very efficient for retrieval purposes.
For example, I have a page that lists each product line along with a thumbnail of a randomly chosen image from that product line. To do that I have to first query the database for a list of all product lines, then perform a separate query for each product line to get all of the specific products that have associated images, pick one of those, and then query again to get the image.
Another possibility I considered would be to use one query to get all the product lines I'm looking at, a second query to get all the specific products for all of those product lines, a third query to get all of the image links which specify which images are linked to which specific products, and a fourth query to get all those images. I imagine this would be a bit faster because of the reduced number of queries, but it would leave a lot of work for PHP to do figuring out the connections between product lines and products and images, which could be just as slow.
So my question is, is there a better way to store this data? Or a better way to retrieve it based on the database I already have in place? Or is one of the two options I've identified really my best bet?
Edit: I'm not actually storing image files in the database. The image files are stored in the file system. My "Images" table in the database just stores the location of the image file along with useful info like the image title, alt text, etc.
Yes - just write a single query that will retrieve all that information in one shot.
I'm a little rusty on this, but you can look up the queries in the MySQL reference.
Create a query that joins these tables on the appropriate keys.
For the thumbnail, use a subquery that retrieves the images for a specific product line, order it by rand(), and select the first row.
This can definitely be done in a single query. Even if it couldn't, you can always create views, which is sometimes a better way to organize your queries so that they are more readable. In other words, instead of returning the result of your first query, create a view corresponding to it. Then create a second view that corresponds to running your second query on the result of the first, operating on the first view. And so on. Your actual query can then be done in one shot by selecting from the final view.
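One possible shape for such a single query, sketched in Python against in-memory SQLite using the table names from the question. SQLite spells the random function RANDOM() where MySQL has RAND(), and the sample rows are invented, so treat this as an outline of the join rather than a tested MySQL statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ProductLines (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE SpecificProducts (
    id INTEGER PRIMARY KEY, productLineID INTEGER, color TEXT);
CREATE TABLE Images (id INTEGER PRIMARY KEY, imageFileLocation TEXT);
CREATE TABLE ProductImageLinks (specificProductID INTEGER, imageID INTEGER);

-- Hypothetical sample data.
INSERT INTO ProductLines VALUES (1, 'Shirts'), (2, 'Hats'), (3, 'Socks');
INSERT INTO SpecificProducts VALUES
    (1, 1, 'red'), (2, 1, 'blue'), (3, 2, 'black'), (4, 3, 'white');
INSERT INTO Images VALUES
    (1, 'shirt_red.jpg'), (2, 'shirt_group.jpg'), (3, 'hat.jpg');
INSERT INTO ProductImageLinks VALUES (1, 1), (1, 2), (2, 2), (3, 3);
""")

# One round trip: every product line plus one randomly chosen image
# from any of its specific products (NULL when the line has no images).
THUMBNAIL_QUERY = """
SELECT pl.id, pl.name,
       (SELECT i.imageFileLocation
          FROM SpecificProducts sp
          JOIN ProductImageLinks l ON l.specificProductID = sp.id
          JOIN Images i ON i.id = l.imageID
         WHERE sp.productLineID = pl.id
         ORDER BY RANDOM()   -- RAND() in MySQL
         LIMIT 1) AS thumbnail
  FROM ProductLines pl
 ORDER BY pl.id
"""

thumbnails = conn.execute(THUMBNAIL_QUERY).fetchall()
```

Ordering by a random value sorts all candidate images for each line, which is fine for a handful of images per line but can get slow on large sets.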
As far as the database design goes, you have a fairly solid (and standard) design. You could combine your ProductImageLinks and Images tables, as long as it's a 1:1 relationship, to save some queries.
As for your product line image retrieval, you have a couple of options that would drastically reduce the number of queries required:
Create a new table in your database called ProductLineImages. Instead of picking the image randomly from the associated products, load a set of images in there that you can choose randomly from. It won't be as dynamic this way, but this is the most efficient method.
You can do all of what you described in a single (but less efficient than #1) query.
Are you set on storing the images in the MySQL database?
In a similar application of mine, I simply stored the images in /images/productimages/imagesize/productid.jpg, where imagesize is "small", "large", etc. for the different thumbnail sizes, and productid.jpg is named after the id from the SpecificProducts table.
