PHP's MySQL Cursor implementations and how they manage memory - php

How do the different MySQL Cursors within PHP manage memory? What I mean is, when I make a MySQL query that retrieves a large result set, and get the MySQL resource back, how much of the data that the query retrieved is stored in local memory, and how are more results retrieved? Does the cursor automatically fetch all the results, and give them to me as I iterate through the resource with fetch_array or is it a buffered system?
Finally, are the cursors for the different drivers within mysql implemented differently? There's several MySQL drivers for PHP available, mysql, mysqli, pdo, etc. Do they all follow the same practices?

That depends on what you ask PHP to do. For instance, mysql_query() fetches the entire result set into client memory (if that's 500 megabytes, goodbye); if you don't want that, you can use mysql_unbuffered_query() instead:
http://php.net/manual/en/function.mysql-unbuffered-query.php
PDO and MySQLi have their own ways of doing the same thing.
Depending on your query, the result set may be materialized on the database side (if you need a sort, then the sort must be done entirely before you even get the first row).
For result sets that are not too large, it's usually better to fetch everything at once, so the server can free the resources it used as soon as possible.
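To illustrate the buffered/unbuffered distinction, here is a minimal sketch using mysqli; the function name and the generator approach are my own, and the connection is assumed to exist:

```php
<?php
// Sketch: an unbuffered query with mysqli (MYSQLI_USE_RESULT), so rows are
// streamed from the server instead of being copied into PHP memory first.
function streamLargeResult(mysqli $db, string $sql): Generator
{
    // MYSQLI_USE_RESULT = unbuffered: rows stay on the server until fetched.
    $result = $db->query($sql, MYSQLI_USE_RESULT);
    while ($row = $result->fetch_assoc()) {
        yield $row;          // hand back one row at a time
    }
    $result->free();         // must be freed before issuing the next query
}

// PDO equivalent: disable buffering on the connection:
// $pdo->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);
```

With MYSQLI_STORE_RESULT (the default) the whole result set is copied into PHP memory up front; with MYSQLI_USE_RESULT each row is pulled from the server as you fetch it, so memory use stays flat, but the connection is tied up until you finish reading.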

Related

Difference between mysql cache and memcached

I have one MySQL table which is controlled strictly by an admin. Data entry is very low, but the table is queried heavily. Since the table's content will not change much, I was thinking of using the MySQL query cache with PHP, but got confused (when I googled it) between that and memcached.
What is the basic difference between memcached and mysqlnd_qc ?
Which is most suitable for me as per below condition ?
I also intend to extend the same for an autocomplete box; which will be suitable in that case?
My queries will return less than 30 rows mostly of very few bytes data and will have same SELECT queries. I am on a single server and no load sharing will be done. Thankyou in advance.
If your query is always the same, i.e. you do SELECT title, stock FROM books WHERE stock > 5 and your condition never changes to stock > 6 etc., I would suggest using MySQL Query Cache.
Memcached is a key-value store. Basically it can cache anything if you can turn it into key => value. There are a lot of ways you can implement caching with it. You could query your 30 rows from database, then cache it row by row but I don't see a reason to do that here if you're returning the same set of rows over and over. The most basic example I can think of for memcached is:
// Run the query
$result = mysqli_query($con, "SELECT title, stock FROM books WHERE stock > 5");
// Fetch the result into one array
$rows = mysqli_fetch_all($result, MYSQLI_ASSOC);
// Put the result into memcache for 30 seconds.
$memcache_obj->add('my_books', serialize($rows), false, 30);
Then do a $memcache_obj->get('my_books'); and unserialize it to get the same results.
But since you're using the same query over and over, why add the complication when you can let MySQL handle all the caching for you? Remember that if you go with the memcached option, you need to set up a memcached server as well as implement logic to check whether the result is already in the cache and whether the records have changed in the database.
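A sketch of that check-the-cache-first logic, assuming the Memcache extension and a mysqli connection (the function name is mine; the key and query are the ones from the example):

```php
<?php
// Sketch: cache-aside pattern with the Memcache extension.
// Returns the rows from cache when present, otherwise queries the DB
// and caches the result for 30 seconds.
function getBooks(Memcache $cache, mysqli $db): array
{
    $cached = $cache->get('my_books');
    if ($cached !== false) {
        return unserialize($cached);               // cache hit
    }
    // Cache miss: query the database and store the result.
    $result = $db->query("SELECT title, stock FROM books WHERE stock > 5");
    $rows   = $result->fetch_all(MYSQLI_ASSOC);
    $cache->add('my_books', serialize($rows), 0, 30);
    return $rows;
}
```

Note that this sketch does nothing about invalidation: if the table changes inside the 30-second window, stale rows are served, which is exactly the extra logic you avoid with the MySQL query cache.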
I would recommend using MySQL query cache over memcached in this case.
One thing you need to be careful about with the MySQL query cache, though, is that your query must be exactly the same: no extra blank spaces, no comments, nothing. This is because MySQL does no parsing when comparing the query string against the cache; any extra character anywhere in the query makes it a different query.
Peter Zaitsev explained the MySQL Query Cache very well at http://www.mysqlperformanceblog.com/2006/07/27/mysql-query-cache/; it's worth taking a look. Make sure you don't need anything that the MySQL Query Cache does not support, as Peter Zaitsev mentions there.
If the queries run fast enough and do not really slow your application down, do not cache them. With a table this small, MySQL will keep it in its own cache. If your application and database are on the same server, the benefit will be very small, maybe not even measurable.
So, for your third question, it also depends on how you query the underlying tables. Most of the time it is sufficient to let MySQL cache internally. Another approach is to generate all the possible combinations and store those, so MySQL does not need to compute the matching rows and can return the right one straight away.
As a general rule: build your application without caching, and only add caches for things that do not change often if a) computing the result set is complex and time-consuming, or b) you have multiple application instances calling the database over a network. In those cases caching yields better performance.
Also, if you run PHP in a web server like Apache, caching inside your program does not add much benefit, because the cache only lives for the current request. An external cache (like memcache) is then needed to share cached results across requests.
What is the basic difference between memcached and mysqlnd_qc ?
There is practically nothing in common between them.
Which is most suitable for me as per below condition ?
mysql query cache
I also intend to extend the same for an autocomplete box; which will be suitable in that case?
Sphinx Search

What are the benefits of creating Stored Procedures in SQL and MySQL?

I have a theoretical question.
I can't see any difference between declaring a function within a PHP file and creating a stored procedure in a database that does the same thing.
Why would I want to create a stored procedure to, for example, return a list of all the Cities for a specific Country, when I can do that with a PHP function to query the database and it will have the same result?
What are the benefits of using stored procedures in this case? Or which is better? To use functions in PHP or stored procedures within the database? And what are the differences between the two?
Thank you.
Some benefits include:
Maintainability: you can change the logic in the procedure without needing to edit app1, app2 and app3 calls.
Security/Access Control: it's easier to worry about who can call a predefined procedure than it is to control who can access which tables or which table rows.
Performance: if your app is not situated on the same server as your DB, and what you're doing involves multiple queries, using a procedure reduces the network overhead by involving a single call to the database, rather than as many calls as there are queries.
Performance (2): a procedure's query plan is typically cached, allowing you to reuse it again and again without needing to re-prepare it.
(In the case of your particular example, the benefits are admittedly nil.)
The short answer: if you want your code to be portable, don't use stored procedures, because if at some point you want to change databases (for example, from MySQL to PostgreSQL), you will have to update/port all the stored procedures you have written.
On the other hand, you can sometimes achieve better performance with stored procedures, because all that code runs inside the database engine. You can also make the situation worse if stored procedures are used improperly.
I don't think that selecting a country is a very expensive operation, so I guess you don't need stored procedures in this case.
Most of the others have already explained this, but I'll try to reiterate it in my own way.
Stored Procedures :
Logic resides in the database.
Say there is some query we need to execute. We can handle it either by:
Sending the query from the client to the database server, where it is parsed, compiled, and then executed.
Or by stationing the query at the database server and creating an alias for it, which the client uses to send the request; when it is received at the server, the query is executed.
So we have :
Client ----------------------------------------------------------> Server
Conventional:
Query created at client ---------- propagated to server ---------- Query reaches server: parsed, compiled, executed.
Stored procedures:
Alias created and used by client ---------------- propagated to server -------- Alias reaches server: parsed, compiled, cached (the first time).
The next time the same alias comes up, the cached executable is run directly.
Advantages :
Reduced network traffic: if the client sends a big query, and perhaps uses that same query very frequently, every byte of it crosses the network each time, which can increase traffic and unnecessarily increase network usage.
Faster query execution: stored procedures are parsed and compiled once, and the executable is cached in the database. If the same query is repeated many times, the database runs the cached executable directly, saving the parse and compile time. This is good if the query is used frequently; if it is not, caching the executable just takes space, so why put load on the database unnecessarily?
Modularity: if multiple applications want to use the same query, the traditional way duplicates code unnecessarily across the applications; the better way is to put the code close to the database, so the duplication is easily avoided.
Security: stored procedures are also developed with authorization in mind (who is privileged to run the query and who is not). As a DBA you can grant permission to a specific user and revoke it from others, so you know exactly who has access. But this is less popular now; you can design your application and database so that only authorized people can access them in the first place. So if security/authorization is your only reason to prefer stored procedures over the conventional way of doing things, stored procedures might not be appropriate.
OK, this may be a little oversimplified (and possibly incomplete):
With a stored procedure:
you do not need to transmit the query to the database
the DBMS does not need to validate the query every time (validate in a sense of syntax, etc)
the DBMS does not need to optimize the query every time (remember, SQL is declarative, therefore, the DBMS has to generate an optimized query execution plan)
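For a concrete version of the cities-for-a-country example from the question: a procedure might be created once on the server and then called from PHP with a single round trip. All names here (the procedure, tables, and columns) are made up for illustration:

```php
<?php
// Hypothetical stored procedure for "all cities of a country".
// Created once on the server (shown here as a string for reference):
$createSql = "
CREATE PROCEDURE get_cities(IN p_country_id INT)
BEGIN
    SELECT id, name FROM cities WHERE country_id = p_country_id;
END";

// Calling it from PHP: one network round trip; the parsed/compiled
// plan lives on the server side.
function citiesForCountry(mysqli $db, int $countryId): array
{
    $stmt = $db->prepare("CALL get_cities(?)");
    $stmt->bind_param('i', $countryId);
    $stmt->execute();
    return $stmt->get_result()->fetch_all(MYSQLI_ASSOC);
}
```

As the answers above note, for a query this trivial the benefit is essentially nil; the pattern matters when the procedure wraps multiple statements or non-trivial logic.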

Which is better way to manipulate data in DB?

In MySQL there are also functions, for example DAY(), ASCII(), DATEDIFF(), and so on; a lot of functions. So it is possible to build complicated queries using them, and all the computation/manipulation of the data is done inside MySQL by MySQL functions. I can then get the result of my query in PHP already prepared by MySQL, as refined as possible.
But it's also possible to compute/manipulate the data from the DB using PHP functions. I mean, I retrieve the data using as simple an SQL query as possible, fetch it into an array, and only then, using PHP functions, perform the manipulations needed to refine the data into what I need.
So my question is: which way is better for working with the DB? Should I use the MySQL server's strength as fully as possible to get data that is as ready-to-use as possible, or should I use more PHP functions to manipulate raw data from the DB?
If you use MySQL (or another non-distributed database), you might consider doing this in code. It lightens the load on the database server, and web servers are perfectly scalable these days: you only have one MySQL server, but you can scale up to X web servers.
The best would be to create a helper class that wraps the database native functions.
If you leave the performance issues for what they are, it would always be better to use available functions instead of creating them yourself.
Assuming that you'd be executing the same statements against the database in either case, it will always be faster to run your statements directly in MySQL (e.g. functions, stored procedures) than it will be to use any of PHP's MySQL extensions.
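As an illustration of the trade-off, MySQL's DATEDIFF() (e.g. SELECT DATEDIFF('2024-03-10', '2024-03-01') returns 9) can be reproduced on the PHP side; the function below is my own rough equivalent, not a library API:

```php
<?php
// Rough PHP equivalent of MySQL's DATEDIFF(a, b): number of days in a - b,
// using only the date parts, sign included.
function dateDiffDays(string $a, string $b): int
{
    // %r prepends "-" for negative intervals, %a is the total day count.
    return (int) (new DateTimeImmutable($b))
        ->diff(new DateTimeImmutable($a))
        ->format('%r%a');
}
```

Either side can do the arithmetic; the point of the answer above is that when the computation also *filters or reduces* the rows (WHERE DATEDIFF(...) > 30, aggregates, sorts), doing it in the query avoids shipping raw rows to PHP at all.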
Have a look at my question Text processing via MySQL query.
It's better to store simple data in the database, one item per row, then fetch it into PHP and process it there.

how to store a huge amount of data in php?

I have to search over a huge amount of data from the DB through PHP code. I don't want to make many DB hits, so I selected all the data to be searched from the DB and tried to store it in an array, to do further searching on the array rather than the DB; the problem is that the data exceeds the array's size limit.
What to do?
Don't do that.
Databases are designed specifically to handle large amounts of data. Arrays are not.
Your best bet would be to properly index your db, and then write your optimized query that will get the data you need from the database. You can use PHP to construct the query. You can get almost anything from a db through a good query, no need for PHP array processing.
If you gave a specific example, we could help you construct that SQL query.
Databases are there to filter the data for you. Use the most accurate query you can, and only filter in code if it's too hard (or impossible) to do in SQL.
A full table selection can be much more expensive (especially for I/O on the db server, and it can have dire effects on the server's cache) than a correctly indexed select with the appropriate where clause(s).
There is communication overhead involved when obtaining records from a database to PHP, so not only is it a good idea to reduce the number of calls from PHP to the database, but it is also ideal to minimize the number of elements returned by the database and processed in your PHP code. You should structure your query (depending on the type of database) to return just the entries you need or as few entries as possible for whatever you need to do. There are a lot of databases that support fairly complex operations directly within the database query, and typically the database will do it way faster than PHP.
Two simple steps:
Increase the amount of memory php can use via the memory_limit setting
Install more RAM
Seriously, you'll be better off optimizing your database in a way that you can quickly pull the data you need to work on.
If you are actually running into problems, then run a query analyzer to see which queries are taking too much time. Fix them. Repeat the process.
You do not need to store your data in an array, it makes no sense. Structure your query accordingly your purpose and then fetch the data with PHP.
If you need to increase your memory limit, you can change memory_limit in php.ini (or update .htaccess with the desired limit: php_value memory_limit 1024M).
Last but not least: use pagination rather than loading all the data at once.
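A minimal pagination sketch with LIMIT/OFFSET, assuming mysqli (table, column, and function names are placeholders):

```php
<?php
// Translate a 1-based page number into a row offset.
function pageOffset(int $page, int $perPage): int
{
    return ($page - 1) * $perPage;
}

// Fetch one page of rows instead of the whole table.
function fetchPage(mysqli $db, int $page, int $perPage = 100): array
{
    $stmt = $db->prepare(
        "SELECT id, payload FROM big_table ORDER BY id LIMIT ? OFFSET ?"
    );
    $offset = pageOffset($page, $perPage);
    $stmt->bind_param('ii', $perPage, $offset);
    $stmt->execute();
    return $stmt->get_result()->fetch_all(MYSQLI_ASSOC);
}
```

The ORDER BY matters: without a stable ordering, LIMIT/OFFSET pages can overlap or skip rows between requests.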

Are there OCI8 replacements for ADO MoveFirst and EOF, BOF?

I am researching the possibility of porting an application written in classic ASP with ADO record sets and an Oracle database to PHP5 and OCI8. We have lots of stored procedures and queries with bind variables for performance.
My problem is that we have become lazy from using the ADO classes and the EOF and BOF indicators along with MoveFirst, MoveNext and MovePrevious.
I cannot find any similar functionality in the OCI8 module. Is there any hope?
This is outside my area of expertise, but I think the equivalent functionality outside of ADO would be to retrieve the dataset into an array, then use standard array navigation techniques, rather than functionality that is specific to the database API.
If you're dealing with datasets that are large enough that you don't want to load the whole thing at one time, you should try to find a way to narrow the result set in the query before you start navigating the results. For instance, if you find yourself loading a result set, then just going to the last row, it's easy enough to make the query just return the last row in the first place. If you find yourself retrieving a result set, then looping through it (or filtering) for a specific row (or set of rows), I think you'll find that letting Oracle do that for you will show significantly better performance.
The reason that you need to use arrays to do this kind of navigation with Oracle is that Oracle cursors are always forward-only (whereas with ADO, you have dynamic, keyset, and static cursors as well). If you really need to be able to navigate an entire large result set, loading the whole thing into an array is about your only choice.
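A sketch of the load-into-an-array approach with OCI8; the function name is mine, and the connection and query are placeholders:

```php
<?php
// Sketch: load an entire OCI8 result set into a PHP array so it can be
// navigated freely (Oracle cursors themselves are forward-only).
function loadRows($connection, string $sql): array
{
    $stmt = oci_parse($connection, $sql);
    oci_execute($stmt);
    // One associative array per row, in order.
    oci_fetch_all($stmt, $rows, 0, -1, OCI_FETCHSTATEMENT_BY_ROW);
    return $rows;
}

// Array navigation then stands in for the ADO calls:
// $rows = loadRows($conn, 'SELECT * FROM employees');
// $i = 0;                       // MoveFirst
// $i++;                        // MoveNext
// $eof = $i >= count($rows);   // EOF
// $bof = $i < 0;               // BOF
```

This only makes sense for result sets you were going to read in full anyway; as the answer above says, for large sets it is better to narrow the query first.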
