Which one makes the script slower (I have two choices) - PHP

I have a table in my database named ads; this table contains data about each ad.
I want to get that data from the table to display the ads.
Now, I have two choices:
Either get all the data from the table in one go, store it in an array, and then work with this array to display each ad in its position using loops.
Or access the table directly and get each ad's data individually to display it; note that this way will use more database queries.
Which one is the better way, i.e. which one will not make the script slower?

In most cases #1 is better,
because if you can select the data (the smallest needed set) in one query,
you have fewer round trips to the database server.
Accessing array or object properties (from memory) is usually faster than DB queries.
You could also consider preparing your data first and not mixing fetching with view output.
The second option, "select on demand", can make sense if you need to lazy-load,
for example because you can or want to react to client properties, like the viewport.
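For illustration, here is a minimal sketch of option #1, assuming PDO and a hypothetical ads table with id, position and html columns (connection details and schema are assumptions, not the asker's actual setup):

<?php
// Sketch of choice #1: one query, then display from the array.
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');

// Single round trip: select only the columns the view needs.
$ads = $pdo->query('SELECT id, position, html FROM ads')->fetchAll(PDO::FETCH_ASSOC);

// Group the ads by position so the view can place them with simple loops.
$byPosition = array();
foreach ($ads as $ad) {
    $byPosition[$ad['position']][] = $ad;
}

// Later, in the view layer:
foreach ($byPosition['sidebar'] ?? array() as $ad) {
    echo $ad['html'];
}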

I'd like to highlight the following part:
get all data from table and store it in array
You do not need to store all rows in an array. You could also take an iterator that represents the result set and use that instead.
Depending on the database object you use, this is often the less memory-intensive variant. You would also run only one query here, which is preferable.
The iterator is actually common with modern database result objects.
Additionally, this helps to decouple the view code from the actual database interaction, and it also lets you defer the SQL query.
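A minimal sketch of the iterator idea, assuming PDO (a PDOStatement is itself traversable, so no intermediate array is built; the table and column names are illustrative):

<?php
// The result set is consumed row by row; memory use stays flat.
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');
$stmt = $pdo->query('SELECT id, html FROM ads');

// The view can iterate the statement directly.
foreach ($stmt as $row) {
    echo $row['html'];
}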

You should minimize the number of queries, but you should also try to minimize the amount of data you actually get from the database.
So: get only those ads that you are actually displaying. You could, for example, use columnPK IN (1, 2, 3, 4) to get those ads (see the sketch at the end of this answer).
A notable exception: if your application is centered around "ads" and you need them pretty much everywhere, and/or they don't consume much memory, and/or there aren't too many ads, it might be better performance-wise to store all (or a subset) of your ads in an array.
Above all: Measure, measure, measure!
It is very, very hard to predict which algorithm will be most efficient. Often you implement something "because it will be more efficient" only to find out later that your optimization is actually slowing down your application.
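A hedged sketch of the IN (...) idea with a prepared statement; the connection details and the ads/id/html names are assumptions:

<?php
// Fetch only the ads that are actually being displayed.
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');
$wantedIds = array(1, 2, 3, 4);

// Build one placeholder per id, then bind the whole list.
$placeholders = implode(',', array_fill(0, count($wantedIds), '?'));
$stmt = $pdo->prepare("SELECT id, html FROM ads WHERE id IN ($placeholders)");
$stmt->execute($wantedIds);

$ads = $stmt->fetchAll(PDO::FETCH_ASSOC);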

You should always try to run a PHP script with the least number of database queries possible. Whenever you query the database, a request must be sent to the database (usually) over the network, and your script will idle until the response comes back.
You should, however, make sure not to request any more data from the database than necessary. So try to filter as much in the WHERE clause as possible instead of requesting the whole table and then picking individual rows on the PHP layer.
We could help with writing that SQL query if you tell us what your table looks like and how you want to select which ads to display.

Related

PHP array VS MySQL table

I have a program that creates logs, and these logs are used to calculate balances, trends, etc. for each individual client. Currently, I store everything in separate MySQL tables. I link all the logs to a specific client by joining the two tables. When I access a client, it pulls all the logs from the log_table and generates a report. The report varies depending on what filters are in place, mostly date- and category-specific.
My concern is the performance of my program as we accumulate more logs and clients. My intuition tells me to store the log information in the user_table in the form of a serialized array so only one query is used for the entire session. I can then take that log array and filter it using PHP, whereas before it was filtered in a MySQL query (using multiple methods, such as BETWEEN for dates and other comparisons).
My question is: do you think performance would be improved if I used serialized arrays to store the logs, as opposed to using a MySQL table to store each individual log? We are estimating about 500-1000 logs per client, with around 50,000 clients (and growing).
It sounds like you don't understand what makes databases powerful. It's not about "storing data", it's about "storing data in a way that can be indexed, optimized, and filtered". You don't store serialized arrays, because the database can't do anything with that. All it sees is a single string without any structure that it can meaningfully work with. Using it that way voids the entire reason to even use a database.
Instead, figure out the schema for your array data, and then insert your data properly, with one field per dedicated table column so that you can actually use the database as a database, allowing it to optimize its storage, retrieval, and database algebra (selecting, joining and filtering).
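As a purely hypothetical example (the names and types are guesses based on the question's description), the serialized log array could become a proper table that the database can index and filter:

-- Hypothetical schema sketch: one column per field instead of a serialized blob.
CREATE TABLE logs (
    log_id    INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    client_id INT UNSIGNED NOT NULL,
    category  VARCHAR(50) NOT NULL,
    amount    DECIMAL(12,2) NOT NULL,
    logged_at DATE NOT NULL,
    INDEX idx_client_date (client_id, logged_at) -- supports the date-range filters
);

-- Now the database does the filtering instead of PHP:
SELECT category, SUM(amount)
FROM logs
WHERE client_id = 42
  AND logged_at BETWEEN '2013-01-01' AND '2013-12-31'
GROUP BY category;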
Are serialized arrays in a DB faster than native PHP? No, of course not. You've forced the database to act as a flat file, with the extra DBMS overhead.
Is using the database properly faster than native PHP? Usually, yes, by a lot.
Plus, and this part is important, it means that your database can live "anywhere", including on a faster machine next to your web server, so that your database can return results in 0.1s rather than PHP jacking 100% CPU to filter your data and preventing users of your website from getting page results because you blocked all the threads. In fact, for that very reason it makes absolutely no sense to keep this task in PHP, even if you're bad at implementing your schema and queries, forget to cache results and do subsequent searches inside those cached results, forget to index the tables on columns for extremely fast retrieval, etc., etc.
PHP is not for doing all the heavy lifting. It should ask other things for the data it needs, and act as the glue between "a request comes in", "response base data is obtained" and "response is sent back to the client". It should start up, make the calls, generate the result, and die as fast as it can again.
It really depends on how you need to use the data. You might want to look into storing it with MongoDB if you don't need to search that data. If you do, leave it in individual rows and create your indexes in a way that makes lookups fast.
If you have 10 billion rows, and need to look up 100 of them to do a calculation, it should still be fast if you have your indexes done right.
Now if you have 10 billion rows and you want to do a sum on 10,000 of them, it would probably be more efficient to save that total somewhere. Whenever a new row is added, removed or updated that would affect that total, you can change that total as well. Consider a bank, where all items in the ledger are stored in a table, but the balance is stored on the user account and is not calculated based on all the transactions every time the user wants to check his balance.
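A sketch of that running-total idea in SQL; the ledger and accounts tables and their columns are hypothetical:

-- Insert the ledger row and adjust the stored balance atomically.
START TRANSACTION;

INSERT INTO ledger (account_id, amount, created_at)
VALUES (42, -19.99, NOW());

UPDATE accounts
SET balance = balance - 19.99
WHERE account_id = 42;

COMMIT;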

PHP array efficiency vs MySQL query

I have a MySQL table with about 9.5K rows; these won't change much, but I may slowly add to them.
I have a process where, if someone scans a barcode, I have to check whether that barcode matches a value in this table. What would be the fastest way to accomplish this? I should mention there is no pattern to these values.
Here are some thoughts:
AJAX call to a PHP file that queries the MySQL table (my guess is this would be the slowest).
Load this MySQL table into an array at login. Then, when scanning, make an AJAX call to a PHP file that checks the array.
Load this table into an array at login. When viewing the scanning page, somehow load that array into a JavaScript array and check it with JavaScript. (This seems to me to be the fastest because it eliminates the AJAX call and the MySQL query. Would it be efficient to split it into smaller arrays so I don't lag the server and browser?)
Honestly, I'd never load the entire table for anything. All I'd do is make an AJAX request back to a PHP gateway that then queries the database, and returns the result (or nothing). It can be very fast (as it only depends on the latency) and you can cache that result heavily (via memcached, or something like it).
There's really no reason to ever load the entire array for "validation"...
It's much faster to use a well-indexed MySQL table than to look through an array for something.
But in the end it all depends on what you really want to do with the data.
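A minimal sketch of such a gateway, assuming PDO and a hypothetical barcodes table; the file name and column names are illustrative:

<?php
// barcode_check.php -- the client POSTs a barcode; we answer with tiny JSON.
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');

$barcode = $_POST['barcode'] ?? '';

// Prepared statement: a fast lookup if the barcode column is indexed.
$stmt = $pdo->prepare('SELECT 1 FROM barcodes WHERE barcode = ? LIMIT 1');
$stmt->execute(array($barcode));

header('Content-Type: application/json');
echo json_encode(array('valid' => (bool) $stmt->fetchColumn()));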
As you mention, your table contains around 9.5K rows of data. There is no reason to load the data at login or on the scanning page.
Better to index your table and make an AJAX call whenever required.
Best of luck!!
While 9.5K rows are not that many, the related amount of data would still need some time to transfer.
Therefore, and in general, I'd propose running the validation of values on the server side. AJAX is the right technology to do this quite easily.
Loading all 9.5K rows only to find one specific row is definitely a waste of resources. Run a SELECT query for the single value.
Exposing PHP-functionality at the client-side / AJAX
Have a look at the xajax project, which allows you to expose whole PHP classes or single methods as AJAX methods on the client side. Moreover, xajax helps with the exchange of parameters between client and server.
Indexing to be searched attributes
Please ensure that the column which holds the barcode value is indexed. If the verification process tends to be slow, look out for MySQL table scans.
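For example (table and column names assumed), add the index and then check the query plan for table scans:

-- Index the searched column, then verify it is actually used.
ALTER TABLE barcodes ADD INDEX idx_barcode (barcode);

EXPLAIN SELECT 1 FROM barcodes WHERE barcode = '4006381333931';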
Avoiding table scans
To avoid table scans and keep your queries running fast, use fixed-size fields. VARCHAR, among other types, makes queries slower, since rows no longer have a fixed size; without fixed-size rows the database cannot easily predict the location of the next row of the result set. Therefore, use e.g. CHAR(20) instead of VARCHAR(20).
Finally: Security!
Don't forget that any data transferred to the client side may expose sensitive information. While your 9.5K rows may not get rendered by the client's browser, the rows do exist in the generated HTML page. Using "View source", any user would be able to figure out all valid numbers.
Exposing valid barcode values may or may not be a security problem in your project context.
PS: While not related to your question, I'd propose using PHPExcel for reading or writing spreadsheet data. Unlike other solutions, e.g. a PEAR-based framework, PHPExcel has no external dependencies.

JS, PHP, and MySQL to get large data

I am using AJAX to send a query to a PHP server, which then runs the SQL query to get the data. Because the query involves three tables (two large ones), joining the three tables is very slow.
So I split the SQL query into three queries. That improves the efficiency (for a small dataset). But for a large dataset, because the PHP program runs the three queries one by one and processes the result after each, there will be a 30-second timeout (by default). I don't want to remove this default setting.
To avoid the timeout, I am also considering running the three queries, returning the results to JS, and letting the client side do the processing.
Is there another way to do this?
Update:
Basically, I want three outputs, title, extviews and allviews, for each item WHERE extviews > somevalue. title comes from one small table; extviews and allviews are aggregated from two different large tables. I have all the fields indexed, but joining the two big tables still takes a long time.
So I first aggregate one table to get extviews for each item, along with a list of item ids. The results are organized as an array for JSON output to JS. Then, using the list of ids, I get the title for each item and aggregate the other table to get allviews. Then I update the array with the new results.
Unless your MySQL server is really overloaded, it's usually quicker to use joins. I guess you've already defined indexes on your tables (for the fields used in join conditions and WHERE clauses)?
Doing the processing on the client side might also be a problem, since you'll have to send a lot of data in order to do the join...
Edit:
If all "easy" optimisation is done, then you have 2 choices... The one you just described (doing it on client size, if it's possible - what is the size (in bytes) of the json arrays you send to the client?)
Your other choice is to do the processing in the background (via cron) & cache somehow the results.
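For what it's worth, here is a hedged sketch of a single query for the case the question describes: aggregate each large table in a derived table first, then join the small items table. Every table and column name here is an assumption:

SELECT i.id,
       i.title,
       e.extviews,
       COALESCE(a.allviews, 0) AS allviews
FROM items i
JOIN (SELECT item_id, COUNT(*) AS extviews
      FROM ext_views
      GROUP BY item_id) e ON e.item_id = i.id
LEFT JOIN (SELECT item_id, COUNT(*) AS allviews
           FROM all_views
           GROUP BY item_id) a ON a.item_id = i.id
WHERE e.extviews > 100;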
As already indicated by other people responding to your post, you should give us an idea of the structure of your three tables and the intent of each. Based upon that information, you may be able to get significant performance improvements by optimizing your database structure. To make it easier to understand, let's assume that someone had a website running off an intelligently designed database. I could easily make that application perform ten times worse solely by modifying the structure of the database.
Now, maybe there's some reason why you need to have three distinct tables, but I can't make that judgment without knowing what the fields in the database are, what you're aggregating, and what your web application is doing in the first place. Is it read heavy or write heavy? The solution may be as simple as denormalizing your database so that you don't need to use any joins.
I can say from a cursory glance at your description of what you're doing that this application can't possibly scale efficiently, and that you really need to reconsider your design. The first warning sign for me is the fact that you stated that one of the joins is just to link the title to two other tables. To me, being forced to do a join just to get the title of an object seems indicative of over-normalization. Some data redundancy is not necessarily a bad thing, and in some situations it's absolutely mandatory. Also, you say that you have two large tables that you use aggregate functions on and then join everything together. I can tell you right now that you're going to run into some serious performance issues if every hit to your application involves a triple join and two aggregate functions (COUNT(), I assume).
Ultimately, we'll be able to give you a better response once you provide more information as to what you're trying to accomplish, and the general structure of the database you set up for it.

performance issue on displaying records

I have a table with just 3,000 records.
I render these 3,000 records on the home page without pagination; my client is not interested in pagination...
As it is, it takes around 1 min 15 sec for the page to display completely. What can be done to make the page load more quickly?
My table structure:

customer table:
    customer id
    customer name
    guider id
    and a few columns

guider table:
    guider id
    guider name
    and a few columns
Where's the slowdown? The query or the serving?
If the former, see the comments above. If the latter:
Enable gzip on the server. Otherwise, capture the [HTML?] output to a file, compress it (zip), then serve it as a download. The same goes for any other format if you think something else can render it better than a browser (CSV and Open Office).
If you're outputting the data into an HTML table, you may have an issue where the browser waits for the end of the table before rendering it. You can either break it into multiple table chunks, e.g. one per 500 records/rows, or try CSS "table-layout: fixed;".
Check these to-dos:
SQL connection: don't open the connection inside a loop; one connection should be opened per request.
Check your queries and analyse them; if you are using some complex logic, see whether it can be replaced.
Use a standard class for the SQL connection and queries; e.g. use ezSQL.
Follow SQL query best practices.
While you could implement a cache to do this, you don't necessarily need to; introducing unnecessary cache structures can often cause problems of its own. Depending on where the bottleneck is, it may not even help you much, or at all.
You need to look in two places for your analysis:
1) The query you're using to get your data. Take a look at its plan, or if you're not comfortable doing that, run it in your favorite query tool and see how long it takes to come back. If it doesn't take too long, you've got a pretty good idea that your bottleneck isn't the query. If the query itself takes a long time, that's where you should focus your efforts.
2) How your page is rendering. What is the size of your page, in bytes? It may be too big. Can you cut the size down by formatting? Can you more effectively use CSS to eliminate duplicate styling on the page? Are you using a fixed or dynamic table layout? Dynamic is generally going to be quite a bit slower, especially for large tables. Try to avoid nesting tables. Do everything you can to make the page as small as possible, and keep testing!
while displaying records I want to display the guider name, so I wrote a function that returns the guider name
Sounds like you need to use a JOIN. Here's a simple example:
SELECT * FROM customer JOIN guider ON guider.id = customer.guider_id
This will change your page from using N + 1 (3001) queries to just one.
Make sure both guider.id and customer.guider_id are indexed and of appropriate data types (such as integers).
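A minimal sketch of rendering the joined result in one pass, assuming PDO and illustrative column names:

<?php
// One query instead of N + 1; each row already carries the guider name.
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');

$sql = 'SELECT customer.name AS customer_name, guider.name AS guider_name
        FROM customer
        JOIN guider ON guider.id = customer.guider_id';

foreach ($pdo->query($sql) as $row) {
    echo '<tr><td>', htmlspecialchars($row['customer_name']),
         '</td><td>', htmlspecialchars($row['guider_name']), "</td></tr>\n";
}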
Here is a little list of things you should think about to improve performance. The importance of each point is relative, so the first is not necessarily the most important one for you; that depends on the details of your project.
Check your database structure. If there are just these two tables, there might be little you can do. But keep in mind that there is stuff like indexes, and that with an increasing number of records a second, denormalized table structure will improve the speed of retrieving results.
Rather use one query to select your data than iterate through ids and do selects repeatedly.
Run a separate query for the guiders; I assume there are only a few of them. Save all guiders in a data structure, e.g. a dictionary, first, and use the foreign key to apply the correct one to the current record; this might save a lot of data which has to be transmitted from the database to your web server (see the sketch after this list).
Get your result set by using something like mysqli_result::fetch_all(), which returns a 2-dimensional array with all results. This should be faster than iterating through each row with fetch_row().
Tidy up your HTML output and use (external) CSS. This will save a lot of output space if you currently format your stuff with style="... a lot of formatting code ..." attributes on each line. If you use one large table, split it up into multiple tables (some browsers wait for the complete table to load before rendering it).
Very important in a lot of languages: use a string builder for concatenating your results into the output string!
Caching: think about generating the output once a day or once an hour. Write it to a cache file which is served instead of querying the database and building the same page on every request. Maybe you want to offer this generated file as a download, rather than displaying it as a plain HTML site on the web.
Last but not least, check the connections to the web server and the database, the server load, and the number of requests. If your servers are running under heavy load, everything else here might help reduce the load, or you may just have to upgrade the hardware.
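Here is a minimal sketch of the guider-dictionary point from the list above, assuming PDO and hypothetical column names:

<?php
// Fetch the few guiders once into a dictionary keyed by id.
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');

$guiders = array();
foreach ($pdo->query('SELECT id, name FROM guider') as $g) {
    $guiders[$g['id']] = $g['name'];
}

// Apply the dictionary while looping over the customers: no join, no N + 1.
foreach ($pdo->query('SELECT name, guider_id FROM customer') as $customer) {
    echo $customer['name'], ' - ', $guiders[$customer['guider_id']] ?? '', "\n";
}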
LOL
Everyone is talking about big boys' toys, like database structure, caching and stuff,
while the problem most likely lies in mere HTML and browsers.
Just splitting the whole HTML table into chunks will help the first chunk show up immediately while the others eventually follow.
The only ones who were right are those who said to profile the whole thing first.
Trying to answer without profiling results is shooting in the dark.

Read multiple records given array of record IDs from the database efficiently

If you have an array of record IDs in your application code, what is the best way to read the records from the database?
$idNumsIWant = array(2, 4, 5, 7, 9, 23, 56);
Obviously, looping over each ID is bad because you run N queries:
foreach ($idNumsIWant as $memID) {
    $DBinfo = mysql_fetch_assoc(mysql_query("SELECT * FROM members WHERE mem_id = '$memID'"));
    echo "{$DBinfo['fname']}\n";
}
So, perhaps it is better to use a single query?
$sqlResult = mysql_query("SELECT * FROM members WHERE mem_id IN (" . join(",", $idNumsIWant) . ")");
while ($DBinfo = mysql_fetch_assoc($sqlResult)) {
    echo "{$DBinfo['fname']}\n";
}
But does this method scale when the array has 30,000 elements?
How do you tackle this problem efficiently?
The best approach depends eventually on the number of IDs you have in your array (you obviously don't want to send a 50MB SQL query to your server, even though technically it might be capable of dealing with it without too much trouble), but mostly on how you're going to deal with the resulting rows.
If the number of IDs is very low (let's say a few thousand tops), a single query with a WHERE clause using the IN syntax will be perfect. Your SQL query will be short enough for it to be transferred reliably, efficiently and quickly to the DB server. This method is perfect for a single thread looping through the resulting records.
If the number of IDs is really big, I would suggest you split the IDs array into several groups and run more than one query, each with a group of IDs. It may be a little heavier for the DB server, but on the application side you can spawn several threads and deal with the multiple record sets as soon as they arrive, in parallel.
Both methods will work.
Cliffnotes: for that kind of situation, focus on data usage, as long as data extraction isn't too big of a bottleneck. And profile your app!
My thoughts:
The first method is too costly in terms of processing and disk reads.
The second method is more efficient, and you don't have to worry much about the query size limit (but check it anyway).
When I have to deal with that kind of situation, I see at least three or four possible solutions:
one request per id; as you said, this is not really good: lots of requests; I generally don't do that
use the solution you proposed: one request for many ids
but you can't do that with a very long list of ids: some database engines have a limit on the number of values you can pass in an IN()
and a very big list in IN() might not be good performance-wise
So I generally do something like one request for X ids, and repeat this (see the sketch at the end of this answer). For instance, to fetch data corresponding to 1000 ids, I could do 20 requests, each getting data for 50 ids (that's just an example: benchmarking your DB/table could be interesting for your particular case, as it might depend on several factors).
in some cases, you could also rethink your requests: maybe you could avoid passing such a long list of ids by using some kind of join? (this really depends on what you need, your tables' schema, ...)
Also, to facilitate modifications of the fetching logic, I would write a function that takes the list of ids and returns the corresponding list of data.
This way, you just call this function the same way and always get the same data, without having to worry about how that data is fetched; this allows you to change the fetching method if needed (if you find a better way some day) without breaking anything: HOW the function works will change, but as its interface (input/output) remains the same, it will not change a thing for the rest of your code :-)
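A minimal sketch of that batching idea, assuming PDO; the chunk size and the helper name are purely illustrative:

<?php
// Fetch rows for a (possibly huge) list of ids in fixed-size batches.
function fetchMembersByIds(PDO $pdo, array $ids, $chunkSize = 500) {
    $rows = array();
    foreach (array_chunk($ids, $chunkSize) as $chunk) {
        $placeholders = implode(',', array_fill(0, count($chunk), '?'));
        $stmt = $pdo->prepare("SELECT * FROM members WHERE mem_id IN ($placeholders)");
        $stmt->execute($chunk);
        $rows = array_merge($rows, $stmt->fetchAll(PDO::FETCH_ASSOC));
    }
    return $rows;
}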
If it were me and I had that large a list of values for the IN clause, I would use a stored procedure with a variable containing the values I wanted, and use a function in it to send them into a temp table and then join to it (see the sketch below). Depending on the size of the values you want to send, you might need to split them up into multiple input variables to process. Is there any way the values could be permanently stored in the database (if they are often queried on)? And how is the user going to pick out 30,000 values; surely he or she isn't going to type them all in? So there is probably a better way to query the table, based on a join and a WHERE clause.
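A simplified sketch of the temp-table idea in MySQL syntax (without the stored-procedure plumbing; table and column names assumed):

-- Load the wanted ids into a temporary table, then join instead of IN (...).
CREATE TEMPORARY TABLE wanted_ids (mem_id INT PRIMARY KEY);

INSERT INTO wanted_ids (mem_id) VALUES (2), (4), (5), (7), (9), (23), (56);

SELECT m.*
FROM members m
JOIN wanted_ids w ON w.mem_id = m.mem_id;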
Using a string tokenizer to separate your string into tokens would make it easier for you to handle retrieving data for multiple values.
