I think I'm probably looking at this completely the wrong way. I have a stored procedure that returns a (potentially large, but usually not) result set. That set gets put into a table on a web page via PHP. I'm going to implement some AJAX for things like dynamic reordering. The stored procedure takes one to two seconds to run, so it would be nice if I could store that final table somewhere I can access faster once it's been run. More specifically, the SP is a search function, so I want the user to be able to do the search, but then run an ORDER BY on the returned data without having to redo the whole search to get that data again.
What comes to mind is whether there is a way to get results from the stored procedure without it terminating, so I could use a temp table. I know I could use a permanent table, but then I'd run into trouble if two people were trying to use it at the same time.
A short and simple answer to the question 'is there a way to get results from the stored procedure without it terminating?': No, there isn't. How else would the SP return the result set?
Two seconds does sound like an awfully long time; perhaps you could post the SP code so we can look at ways to speed up the queries you use. It might also prove useful to give some more info on your tables (indexes, primary keys...).
If all else fails, you might consider looking into JavaScript table sorters... but again: some code might help here
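If the result set is small enough, another option is to cache the finished results server-side and re-sort them there instead of re-running the SP. This is only a rough sketch; the runSearchProcedure() helper, the search_results session key and the simplistic string sort are all made up for illustration:

<?php
session_start();

// Run the (hypothetical) stored procedure only when a new search is submitted.
if (isset($_GET['q'])) {
    $_SESSION['search_results'] = runSearchProcedure($_GET['q']); // returns an array of rows
}

$rows = isset($_SESSION['search_results']) ? $_SESSION['search_results'] : array();

// Re-sorting an already-fetched result set is cheap compared to re-running the SP.
if (isset($_GET['order_by']) && $rows) {
    $col = $_GET['order_by'];
    usort($rows, function ($a, $b) use ($col) {
        return strcmp((string) $a[$col], (string) $b[$col]);
    });
}
// $rows is now ordered and ready to be rendered into the HTML table.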
Related
Use Case:
I'm building a site where users can search records - with SQL. BUT - they should also be able to save their search and be notified when a newly submitted record meets the criteria.
It's not a car buying site, but for example: The user searches for a 1967 Ford Mustang with a 289 V8 engine, within the 90291 ZIP code. Can't find the right one, but they want to be notified if a matching car is submitted 2 weeks later.
So of course, every time a new car is added to the DB, I can retrieve all the user search queries, and run all of them over all the cars in the DB. But that is not scalable.
Rather than search the entire "car" table with every "search" query every time a new car is submitted, I would like to just check that single "car" object/array in memory, with the existing user queries.
I'm doing this in PHP with Laravel and Eloquent, but I am implementation agnostic and welcome any theoretical approaches.
Thanks,
Chris
I would rather run the saved searches in batches at scheduled intervals and not run them every time a record is appended to the tables.
It comes down to how you structure your in-memory cache.
Whatever cache it is, it usually relies on key-value pairs. It will be the same for the cache you are using:
http://laravel.com/docs/4.2/cache
So in the end it is all about using the right key. If you want to update the cached objects based on a car, then you would need to build the key in such a way that you can retrieve all objects from the cache using the car as (part of) the key. Usually you would concatenate multiple things for the key, like userId+carId+xyz, and then make an MD5 checksum of that.
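As a rough sketch of that (the key parts and the 60-minute lifetime are just placeholders), composing such a key with the Laravel cache could look like this:

// Hypothetical example: compose the cache key from user, car and search criteria.
$key = md5($userId . '|' . $carId . '|' . serialize($searchCriteria));

// Store the search result for an hour (Laravel 4.x takes minutes here).
Cache::put($key, $results, 60);

// Later, rebuild the same key from the same parts to read it back.
$cached = Cache::get($key);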
So that would be the answer to your question. However, generally I would not recommend this approach. It sounds like your search results are long-lived, persistent data. So you would probably want to store them somewhere more permanent, like a simple table. Then you can use standard SQL tools to join against that table and find out what is needed.
My approach would be to use a MySQL stored procedure together with the event scheduler (https://dev.mysql.com/doc/refman/5.1/en/event-scheduler.html) to review the saved searches for possible changes, flag them with some kind of dirty indicator, and then have a PHP script check that indicator on demand or periodically from cron.
You could use a trigger to simply flag that the event scheduler has work to do. However you approach it, there are a number of state variables involved, which starts to get ugly; still, this use case doesn't seem to map neatly onto a queuing architecture as far as I can see.
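A minimal sketch of the cron side of that idea, assuming a hypothetical is_dirty flag on a saved_searches table and a hypothetical runSavedSearch() helper that re-runs the search and notifies the user:

<?php
// e.g. run from cron: php check_saved_searches.php
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret');

// Pick up every saved search that the trigger / event scheduler has flagged.
$stmt = $pdo->query("SELECT id, criteria FROM saved_searches WHERE is_dirty = 1");

foreach ($stmt as $search) {
    runSavedSearch($search['criteria']); // hypothetical: re-run the search and notify

    // Clear the flag so this search is not processed again until re-flagged.
    $clear = $pdo->prepare("UPDATE saved_searches SET is_dirty = 0 WHERE id = ?");
    $clear->execute(array($search['id']));
}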
A possible approach would be to use a trigger in SQL to send a notification. Here is something related to it: 1st link or 2nd link.
I have a MySQL table with about 9.5K rows; these won't change much, but I may slowly add to them.
I have a process where, if someone scans a barcode, I have to check whether that barcode matches a value in this table. What would be the fastest way to accomplish this? I must mention there is no pattern to these values.
Here Are Some Thoughts
Ajax call to a PHP file to query the MySQL table (my thought is this would be the slowest).
Load this MySQL table into an array on login. Then, when scanning, make an Ajax call to a PHP file to check the array.
Load this table into an array on login. When viewing the scanning page, somehow load that array into a JavaScript array and check with JavaScript. (This seems to me to be the fastest because it eliminates the Ajax call and the MySQL query. Would it be efficient to split it into smaller arrays so I don't lag the server and browser?)
Honestly, I'd never load the entire table for anything. All I'd do is make an AJAX request back to a PHP gateway that then queries the database, and returns the result (or nothing). It can be very fast (as it only depends on the latency) and you can cache that result heavily (via memcached, or something like it).
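A hedged sketch of such a gateway script, using APCu as the cache (memcached would work the same way); the barcodes table, its barcode column and the 5-minute TTL are assumptions:

<?php
// barcode_check.php - called via AJAX with ?code=...
$code     = isset($_GET['code']) ? $_GET['code'] : '';
$cacheKey = 'barcode:' . $code;

// Serve from the cache when possible; only hit MySQL on a miss.
$found = apcu_fetch($cacheKey, $hit);
if (!$hit) {
    $pdo  = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret');
    $stmt = $pdo->prepare("SELECT 1 FROM barcodes WHERE barcode = ? LIMIT 1");
    $stmt->execute(array($code));
    $found = (bool) $stmt->fetchColumn();
    apcu_store($cacheKey, $found, 300); // remember the answer for 5 minutes
}

header('Content-Type: application/json');
echo json_encode(array('valid' => $found));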
There's really no reason to ever load the entire array for "validation"...
It's much faster to use a well-indexed MySQL table than to look through an array for something.
But in the end it all depends on what you really want to do with the data.
As you mention, your table contains around 9.5K rows. There is no sense in loading that data on the login or scanning page.
Better to index your table and do an Ajax call whenever required.
Best of Luck!!
While 9.5K rows are not that much, the corresponding amount of data would still need some time to transfer.
Therefore - and in general - I'd propose to run validation of values on the server side. AJAX is the right technology to do this quite easily.
Loading all 9.5K rows only to find one specific row is definitely a waste of resources. Run a SELECT query for the single value.
Exposing PHP functionality on the client side / AJAX
Have a look at the xajax project, which allows you to expose whole PHP classes or single methods as AJAX methods on the client side. Moreover, xajax helps with the exchange of parameters between client and server.
Indexing the attributes to be searched
Please ensure that the column which holds the barcode value is indexed. In case the verification process tends to be slow, look out for MySQL table scans.
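As a hedged example (the barcodes table and column names are assumptions), adding the index and then checking the query plan could look like this:

<?php
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret');

// One-time: index the barcode column so lookups don't have to scan the table.
$pdo->exec("ALTER TABLE barcodes ADD INDEX idx_barcode (barcode)");

// Verify the index is used: 'type' should not be ALL (ALL means a full table scan).
$plan = $pdo->query("EXPLAIN SELECT 1 FROM barcodes WHERE barcode = 'ABC123'")
            ->fetch(PDO::FETCH_ASSOC);
var_dump($plan['type'], $plan['key']);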
Avoiding table scans
To avoid table scans and keep your queries running fast, use fixed-size fields. E.g. VARCHAR(), among other types, makes queries slower, since rows no longer have a fixed size. Without fixed-size rows, the database cannot easily predict the location of the next row of the result set. Therefore, use e.g. CHAR(20) instead of VARCHAR(20).
Finally: Security!
Don't forget that any data transferred to the client side may expose sensitive data. While your 9.5K rows may not get rendered by the client's browser, the rows do exist in the generated HTML page. Using "Show source", any user would be able to figure out all valid numbers.
Exposing valid barcode values may or may not be a security problem in your project context.
PS: While not related to your question, I'd propose using PHPExcel for reading or writing spreadsheet data. Unlike other solutions, e.g. a PEAR-based framework, PHPExcel depends on nothing.
I have a table in my database named ads; this table contains data about each ad.
I want to get that data from the table in order to display the ads.
Now, I have two choices:
Either get all the data from the table and store it in an array, and then work with this array to display each ad in its position using loops.
Or access the table directly and get each ad's data to display it; note this way will use more queries to the database.
Which one is the best way that doesn't make the script slower?
In most cases #1 is better.
Because, if you can select the data (the smallest needed set) in one query, then you have fewer roundtrips to the database server.
Accessing array or object properties (from memory) is usually faster than DB queries.
You could also consider preparing your data and not mixing fetching with view output.
The second option, "select on demand", could make sense if you need to "lazy load", maybe because you can or want to recognize client properties, like the viewport.
I'd like to highlight the following part:
get all data from table and store it in array
You do not need to store all rows into an array. You could also take an iterator that represents the resultset and then use that one.
Depending on the database object you use, this is often the less memory-intensive variant. Also, you would run only one query here, which is preferable.
The iterator is actually common with modern database result objects.
Additionally this is helpful to decouple the view code from the actual database interaction and you can also defer to do the SQL query.
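As a hedged sketch of that iterator idea (assuming PDO and a hypothetical ads table): a PDOStatement can be iterated with foreach directly, so rows are consumed one at a time instead of being copied into one big array first:

<?php
$pdo  = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret');
$stmt = $pdo->query("SELECT id, title, image_url FROM ads WHERE active = 1");

// PDOStatement implements Traversable: rows are fetched as the loop runs,
// so the whole result set never has to sit in a PHP array at once.
foreach ($stmt as $ad) {
    renderAd($ad); // hypothetical view helper
}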
You should minimize the amount of queries but you should also try to minimize the amount of data you actually get from the database.
So: Get only those ads that you are actually displaying. You could for example use columnPK IN (1, 2, 3, 4) to get those ads.
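A small hedged sketch of that IN (...) approach with a prepared statement (the ads table and the list of IDs are assumptions):

<?php
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret');

$idsToShow    = array(1, 2, 3, 4); // only the ads this page will actually render
$placeholders = implode(',', array_fill(0, count($idsToShow), '?'));

// One query, restricted to the rows that are really needed.
$stmt = $pdo->prepare("SELECT * FROM ads WHERE id IN ($placeholders)");
$stmt->execute($idsToShow);
$ads = $stmt->fetchAll(PDO::FETCH_ASSOC);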
A notable exception: if your application is centered around "ads" and you need them pretty much everywhere, and/or they don't consume much memory, and/or there aren't too many ads, it might be better performance-wise to store all (or a subset) of your ads in an array.
Above all: Measure, measure, measure!
It is very, very hard to predict which algorithm will be most efficient. Often you implement something "because it will be more efficient" only to find out later that your optimization is actually slowing down your application.
You should always try to run a PHP script with the least amount of database queries possible. Whenever you query the database, a request must be sent to the database (usually) over the network, and your script will idle until the response comes back.
You should, however, make sure not to request any more data from the database than necessary. So try to filter as much in the WHERE clause as possible instead of requesting the whole table and then picking individual rows on the PHP layer.
We could help with writing that SQL query when you tell us how your table looks and how you want to select which ads to display.
I have a database which holds URLs in a table (along with many other details about each URL). I have another table which stores strings that I'm going to use to perform searches on each and every link. My database will be big; I'm expecting at least 5 million entries in the links table.
The application which communicates with the user is written in PHP. I need some suggestions about how I can search over all the links with all the patterns (n x m searches) without causing a high load on the server and without losing speed. I want it to operate at high speed and low resource usage. If you have any hints or suggestions, in pseudo-code or otherwise, they are all welcome.
Right now I don't know whether to use SQL commands to perform these searches with some help from PHP, or do it completely in PHP.
First I'd suggest that you rethink the layout. It seems a little unnecessary to run this query for every user; try instead to create a result table, into which you just insert the results from that query, which runs once and again every time the patterns change.
Otherwise, make sure you have indexes (full text) set on the fields you need. For the query itself you could join the tables:
SELECT yourFieldsHere
FROM theUrlTable AS tu
JOIN thePatternTable AS tp
  ON tu.link LIKE CONCAT('%', tp.pattern, '%');
I would say that you pretty definitely want to do that in the SQL code, not the PHP code. Also, searching on the strings of the URLs is going to be a long operation, so perhaps some form of hashing would be good. I have seen someone use a variant of a Zobrist hash for this before (Google will bring a load of results back).
Hope this helps,
Dan.
Do as much searching as you practically can within the database. If you're ending up with an n x m result set, and start with at least 5 million hits, that's a LOT of data to be repeatedly slurping across the wire (or socket, however you're connecting to the DB) just to end up throwing away most (a lot?) of it each time. Even if the DB's native search capabilities ('like' matches, regexp, full-text, etc...) aren't up to the task, culling unwanted rows BEFORE they get sent to the client (your code) will still be useful.
You should optimize your tables in the DB. Use an MD5 hash: a new column holding the MD5 of the text can use an index, so the text will be found faster.
But it doesn't help if you use LIKE '%text%'.
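A rough sketch of that MD5 idea (the links table and column names are assumptions): store the hash next to the text, index it, and match on the hash for exact lookups only:

<?php
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret');

// One-time schema change: an indexed hash column next to the URL text.
$pdo->exec("ALTER TABLE links ADD COLUMN url_md5 CHAR(32), ADD INDEX idx_url_md5 (url_md5)");
$pdo->exec("UPDATE links SET url_md5 = MD5(url)");

// Exact-match lookups now go through the index instead of scanning millions of rows.
// (As noted above, this does not help with LIKE '%text%' pattern searches.)
$stmt = $pdo->prepare("SELECT * FROM links WHERE url_md5 = MD5(?)");
$stmt->execute(array('http://example.com/some/exact/url'));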
You can use Sphinx or Lucene.
Is it possible to do a simple COUNT(*) query in a PHP script while another PHP script is doing an INSERT...SELECT query?
The situation is that I need to create a table with ~1M or more rows from another table, and while inserting, I do not want the user to feel the page is freezing, so I am trying to keep updating the count. But by using a SELECT COUNT(*) FROM table while the insert runs in the background, I get only 0 until the insert is completed.
So is there any way to ask MySQL to return partial results first? Or is there a fast way to do a series of inserts with data fetched from a previous SELECT query while having about the same performance as an INSERT...SELECT query?
The environment is PHP 4.3 and MySQL 4.1.
Without reducing performance? Not likely. With a little performance loss, maybe...
But why are you regularly creating tables and inserting millions of rows? If you do this only very seldom, can't you just warn the admin (presumably the only one allowed to do such a thing) that this takes a long time? If you're doing this all the time, are you really sure you're not doing it wrong?
I agree with Stein's comment that this is a red flag if you're copying 1 million rows at a time during a PHP request.
I believe that in a majority of cases where people are trying to micro-optimize SQL, they could get much greater performance and throughput by approaching the problem in a different way. SQL shouldn't be your bottleneck.
If you're doing a single INSERT...SELECT, then no, you won't be able to get intermediate results. In fact this would be a Bad Thing, as users should never see a database in an intermediate state showing only a partial result of a statement or transaction. For more information, read up on ACID compliance.
That said, the MyISAM engine may play fast and loose with this. I'm pretty sure I've seen MyISAM commit some but not all of the rows from an INSERT...SELECT when I've aborted it part of the way through. You haven't said which engine your table is using, though.
The other users can't see the insertion until it's committed. That's normally a good thing, since it makes sure they can't see half-done data. However, if you want them to see intermediate data, you could throw in an occasional call to "commit" while you're inserting.
By the way - don't let anybody tell you to turn autocommit on. That's a HUGE time waster. I have a "delete and re-insert" job on my database that takes 1/3rd as long when I turn off autocommit.
Just to be clear, MySQL 4 isn't configured by default to use transactions. It uses the MyISAM table type which locks the entire table for each insert, if I remember correctly.
Your best bet would be to use one of the MySQL bulk insertion mechanisms, such as LOAD DATA INFILE, as these are dramatically faster at inserting large amounts of data. As for the counting, well, you could break the inserts into N groups of 1000 (or Y), then divide your progress meter into N sections and update it as each group completes.
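A hedged sketch of that chunked approach (the table names, the copy_progress bookkeeping table and the chunk size are all assumptions):

<?php
$pdo    = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret');
$chunk  = 1000;
$offset = 0;

do {
    // Copy one slice at a time instead of one huge INSERT...SELECT.
    $sql = sprintf(
        "INSERT INTO new_table (id, data)
         SELECT id, data FROM old_table ORDER BY id LIMIT %d OFFSET %d",
        $chunk, $offset
    );
    $copied  = $pdo->exec($sql); // returns the number of rows inserted
    $offset += $copied;

    // Record progress where the page (or a polling script) can read it.
    $pdo->prepare("UPDATE copy_progress SET rows_done = ? WHERE job_id = 1")
        ->execute(array($offset));
} while ($copied === $chunk);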
Edit: Another thing to consider is, if this is static data for a template, then you could use a "select into" to create a new table with the same data. Not sure what your application is, or the intended functionality, but that could work as well.
If you can get to the console, you can ask various status questions that will give you the information you are looking for. There's a command that goes something like SHOW PROCESSLIST.
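If you'd rather poll that from PHP than from the console, a minimal sketch (using a separate connection from the one doing the insert):

<?php
$monitor = new PDO('mysql:host=localhost', 'user', 'secret');

// Each row shows the running statement and how long it has been executing.
foreach ($monitor->query("SHOW FULL PROCESSLIST") as $proc) {
    printf("%s | %ss | %s\n", $proc['Command'], $proc['Time'], $proc['Info']);
}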