PHP problem with selecting from Oracle global temporary table

I have an Oracle global temporary table which is "ON COMMIT DELETE ROWS".
I have a loop in which I:
Insert to global temporary table
Select from global temporary table (post-processing)
Commit, so that the table is purged before next iteration of the loop
Insertion is done with a call to oci_execute($stmt, OCI_DEFAULT). Retrieval is made through a call to oci_fetch_all($stmt, $result, 0, -1, OCI_FETCHSTATEMENT_BY_ROW | OCI_ASSOC). After that, a commit is made: oci_commit().
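In simplified form, the loop looks roughly like this (a sketch only; table, column, and bind names are placeholders):
<?php
// Simplified sketch of the loop described above.
$conn = oci_connect('user', 'pass', '//dbhost/ORCL');

foreach ($chunks as $chunk) {
    // 1. Insert into the global temporary table without committing yet.
    $ins = oci_parse($conn, 'INSERT INTO gtt_work (id, payload) VALUES (:id, :payload)');
    foreach ($chunk as $row) {
        oci_bind_by_name($ins, ':id', $row['id']);
        oci_bind_by_name($ins, ':payload', $row['payload']);
        oci_execute($ins, OCI_DEFAULT);   // OCI_DEFAULT = do not auto-commit
    }

    // 2. Select the rows back for post-processing.
    $sel = oci_parse($conn, 'SELECT id, payload FROM gtt_work');
    oci_execute($sel, OCI_DEFAULT);
    oci_fetch_all($sel, $result, 0, -1, OCI_FETCHSTATEMENT_BY_ROW | OCI_ASSOC);

    // 3. Commit so the ON COMMIT DELETE ROWS table is purged before the next iteration.
    oci_commit($conn);
}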
The problem is that retrieval sometimes works, and sometimes I get one of the following errors:
ORA-08103: object no longer exists
ORA-01410: invalid ROWID
As if the session cannot "see" the records that it previously inserted.
Do you have any idea what could be causing this?
Thanks.

Are you using connection pooling? If so then it could be that different calls are executing in separate sessions.
A better solution would be to have a single PL/SQL procedure which populates the temporary table and returns a result set in a single call. Which then suggests an even better solution: do away with the temporary table altogether.
There are few situations in Oracle which demand the use of temporary tables. Most things are solvable with pure SQL or perhaps bulk collecting into nested tables. What actual manipulation of the data in the temporary table do you undertake between the insert and the subsequent select?
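To illustrate the single-call idea, here is a minimal OCI8 sketch, assuming a hypothetical procedure get_report(p_id IN NUMBER, p_rc OUT SYS_REFCURSOR) that does the work server-side and returns a ref cursor in one round trip (procedure name, parameters, and connection details are assumptions):
<?php
// Sketch only: one call to the database, then fetch the ref cursor like a query.
$conn = oci_connect('user', 'pass', '//dbhost/ORCL');

$stmt   = oci_parse($conn, 'BEGIN get_report(:id, :rc); END;');
$cursor = oci_new_cursor($conn);

$id = 203;                                   // example input value
oci_bind_by_name($stmt, ':id', $id);
oci_bind_by_name($stmt, ':rc', $cursor, -1, OCI_B_CURSOR);

oci_execute($stmt);                          // the procedure populates the cursor server-side
oci_execute($cursor);                        // then the ref cursor is executed and fetched
oci_fetch_all($cursor, $rows, 0, -1, OCI_FETCHSTATEMENT_BY_ROW | OCI_ASSOC);

oci_free_statement($cursor);
oci_free_statement($stmt);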
edit
Temporary tables have a performance hit - the rows are written to disk. PL/SQL collections remain in (session) memory and so are faster. Of course, because they are in session memory they won't solve the problem you have with connection pooling.
Is the reason you need to chunk up the data because you don't want to pass 200,000 rows to your PHP in one fell swoop? I think I need a little more context if I am to help you any further.

Related

Archiving mysql data throwing memory limit issue

I have multiple tables, like table1, table2, table3, etc.
What is required:
1. Fetch a specific row from table1 (for example: id = 203).
2. Fetch all values related to id 203 from table2 (ex: 1,2,3,4,5,6,7....500).
3. Again fetch all values for the ids from step 2 from table3, table4, etc., which have a foreign key relation to table2 (millions of rows).
4. Build insert statements from the results of the three steps above.
5. Run the insert queries from step 4 against the respective tables in the archive DB, which have the same table names. In short: archive part of the data to the archive DB.
How I am doing:
For each table, whenever I get the rows, I create an insert statement and store it in a separate array per table. Once all values up to step 3 are fetched, I build the insert statements and store them in the arrays. Then I loop over each array and execute those queries against the archive DB. Once the queries have executed successfully, I delete all fetched rows from the main DB and then commit the transaction.
Result:
So far the above approach has worked very well with a small DB of around 10-20 MB of data.
Issue:
For a larger number of rows (say more than 5 GB of data), PHP throws a memory exhausted error while fetching rows, and hence this does not work in production. I have even increased the memory limit to 3 GB; I don't want to increase it further.
The alternative solution I am considering is, instead of using arrays to store the queries, to store them in files and then internally use an infile command to execute the queries against the archive DB.
Please suggest how to solve this issue. Once data has been moved to the archive DB, there is also a requirement to move it back to the main DB with similar functionality.
There are two keys to handling large result sets.
The first is to stream the result set row by row. Unless you specify this explicitly, the PHP APIs for MySQL immediately attempt to read the entire result set from the MySQL server into client memory, then navigate through that row by row. If your result set has tens or hundreds of thousands of rows, this can make PHP run out of memory.
If you're using the mysql_ interface, use mysql_unbuffered_query(). You should not be using that interface, though. It's deprecated because, well, it sucks.
If you're using the mysqli_ interface, call mysqli_real_query() instead of mysqli_query(). Then call mysqli_use_result() to initiate retrieval of the result set. You can then fetch each row with one of the fetch() variants. Don't forget to use mysqli_free_result() to close the result set when you have fetched all its rows. mysqli_ also has object-oriented methods; you can use those as well.
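For instance, a hedged sketch of the unbuffered pattern with the procedural mysqli API (connection details, table, and query are placeholders):
<?php
// Stream a large result set row by row instead of buffering it all in PHP memory.
$db = mysqli_connect('localhost', 'user', 'pass', 'main_db');

mysqli_real_query($db, 'SELECT * FROM table3 WHERE table2_id = 203');
$result = mysqli_use_result($db);            // unbuffered: rows stay on the server until fetched

while ($row = mysqli_fetch_assoc($result)) {
    // handle one row at a time here (e.g. write it out or forward it to the archive DB)
}

mysqli_free_result($result);                 // always free the streamed result set
mysqli_close($db);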
PDO has a similar way of streaming result sets from server to client.
The second key to handling large result sets is to use a second connection to your MySQL server to perform the INSERT and UPDATE operations so you don't have to accumulate them in memory. The same goes if you choose to write information to a file in the file system: write it out a row at a time so you don't have to hold it in RAM.
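Inside the row loop of a streaming sketch like the one above, the archive writes might look like this (second connection and prepared statement reused per row; table and column names are assumptions):
<?php
// Second connection dedicated to the archive DB, so nothing accumulates in PHP memory.
$archive = mysqli_connect('localhost', 'user', 'pass', 'archive_db');
$ins = mysqli_prepare($archive, 'INSERT INTO table3 (id, table2_id, payload) VALUES (?, ?, ?)');

// ... inside the while loop that streams rows from the main DB:
mysqli_stmt_bind_param($ins, 'iis', $row['id'], $row['table2_id'], $row['payload']);
mysqli_stmt_execute($ins);                   // the row goes straight to the archive DB

// ... after the loop:
mysqli_stmt_close($ins);
mysqli_close($archive);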
The trick is to handle one or a few rows at a time, not tens of thousands.
It has to be said: many people prefer to use command-line programs written in a number-crunching language like Java, C#, or Perl for this kind of database maintenance.

Which insert query runs faster and is more accurate?

I have to insert data into a MySQL database (approx. 200,000 rows). I am a little confused about the insert query. I have two options for inserting data into MySQL:
INSERT INTO paper VALUES('a','b','c','d');
INSERT INTO paper VALUES('e','f','g','h');
INSERT INTO paper VALUES('k','l','m','n');
and
INSERT INTO paper VALUES('a','b','c','d'),('e','f','g','h'),('k','l','m','n');
Which insert query performs faster? What is the difference between the queries?
TL;DR
The second query will be faster. Why? Read below...
Basically, a query is executed in various steps:
Connecting: Both versions of your code have to do this
Sending query to server: Applies to both versions, but the second version sends only one query
Parsing query: Same as above; both versions need their queries parsed, but the second version needs only one query to be parsed
Inserting rows: Same in both cases
Updating indexes: Again, the same in both cases in theory. I'd expect MySQL to update the index after the bulk insert in the second case, making it potentially faster.
Closing: Same in both cases
Of course, this doesn't tell the whole story: table locks have an impact on performance, and the MySQL config, the use of prepared statements, and transactions might result in better (or worse) performance, too. And of course, the way your DB server is set up makes a difference as well.
So we return to the age-old mantra:
When in doubt: test!
Depending on what your tests tell you, you might want to change some configuration, and test again until you find the best config.
In case of a big data-set, the ideal compromise will probably be a combination of both versions:
LOCK TABLES paper WRITE;
/* chunked insert, with lock, probably add transaction here, too */
INSERT INTO paper VALUES ('a', 'z'), ('b','c');
INSERT INTO paper VALUES ('a', 'z'), ('b','c');
UNLOCK TABLES;
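On the PHP side, a hedged sketch of that chunked, transactional approach using PDO (chunk size, DSN, table, and columns are assumptions; $rows is the full data set as arrays of four values):
<?php
// Insert ~200,000 rows in multi-row chunks inside one transaction.
$pdo = new PDO('mysql:host=localhost;dbname=papers_db', 'user', 'pass');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->beginTransaction();
foreach (array_chunk($rows, 1000) as $batch) {
    // One placeholder group per row: (?,?,?,?),(?,?,?,?),...
    $placeholders = implode(',', array_fill(0, count($batch), '(?,?,?,?)'));
    $stmt = $pdo->prepare("INSERT INTO paper (a, b, c, d) VALUES $placeholders");
    $stmt->execute(array_merge(...$batch));  // flatten the chunk into one parameter list
}
$pdo->commit();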
Just RTM - MySQL insert speed:
If you are inserting many rows from the same client at the same time, use INSERT statements with multiple VALUES lists to insert several rows at a time. This is considerably faster (many times faster in some cases) than using separate single-row INSERT statements. If you are adding data to a nonempty table, you can tune the bulk_insert_buffer_size variable to make data insertion even faster. See Section 5.1.4, “Server System Variables”.
If you can't use multiple values, then locking is an easy way to speed up the inserts too, as explained on the same page:
To speed up INSERT operations that are performed with multiple statements for nontransactional tables, lock your tables:
LOCK TABLES a WRITE;
INSERT INTO a VALUES (1,23),(2,34),(4,33);
INSERT INTO a VALUES (8,26),(6,29);
/* ... */
UNLOCK TABLES;
This benefits performance because the index buffer is flushed to disk only once, after all INSERT statements have completed. Normally, there would be as many index buffer flushes as there are INSERT statements. Explicit locking statements are not needed if you can insert all rows with a single INSERT.
Read through the entire page for details.
I'm not sure which is faster purely on the database side. But when you call the database from your PHP scripts, the second way should be much faster, as you save the overhead of multiple calls.
Anyway. There is just one way to know. TEST IT.

How to run multiple SQL queries using PHP without putting load on the MySQL server?

I have a script that reads an Excel sheet containing a list of products. There are almost 10,000 products. The script reads these products and compares them with the products in the MySQL database, and checks:
if the product is not available, then ADD IT (so I have an insert query for that)
if the product is already available, then UPDATE IT (so I have an update query for that)
Now the problem is that it creates a very heavy load on the MySQL server and it shows the message "MySQL server has gone away".
I want to know: is there a better method to do this Excel sheet processing without putting such a load on the MySQL server?
I am not sure if this is the case, but judging from your post, I assume that for every check you initialize a new connection to the MySQL server. If that is indeed the case, you can simply connect once before you do these checks and run all subsequent queries through that connection.
Next to that, a good optimization would be to introduce indexes in MySQL that significantly speed up the product search: add indexes on the product table columns that you reference most in your PHP search function.
Next to that, you could increase the MySQL buffer size to something above 256 MB in order to cache most of the results, and also use InnoDB so you do not need to lock the whole table every time you do the check or the insert.
I'm not sure why PHP has come into the mix. Excel can connect directly to a MySQL database, and you should be able to do a WHERE NOT IN query to add items and UPDATE statements for the ones that have changed, using Excel VBA.
http://helpdeskgeek.com/office-tips/excel-to-mysql/
You could try and condense your code somewhat (you might have already done this though) but if you think it can be whittled down more, post it and we can have a look.
Cache data you know already exists, so if a product's attributes don't change regularly you might not need to check them so often. You can cache the data for quick retrieval/changes later (see Memcached; other caching alternatives are available). You could end up reducing your workload dramatically.
Have you separated your MySQL server? Try running the product checks on a different sub-system, and merge the databases into your main one hourly, daily, or whatever suits you.
OK, here is a quick thought.
Instead of running a query after every check of whether the product is present or not, keep appending to your SQL until you reach the end, and then finally execute it.
Example
$query = ""; //creat a query container
if($present) {
$query .= "UPDATE ....;"; //Remember the delimeter ";" symbol
} else {
$query .= "INSERT ....;";
}
//Now, finally run it
$result = mysql_query($query);
Now you make only one call at the end.
Update: approach this another way.
Let the query handle it:
INSERT INTO table (a,b,c) VALUES (1,2,3)
ON DUPLICATE KEY UPDATE c=c+1;
If a row with a=1 already exists, that statement is equivalent to:
UPDATE table SET c=c+1 WHERE a=1;
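Applied to the product sync above, a hedged sketch with one prepared upsert reused per product (table and column names are assumptions; 'sku' is assumed to have a UNIQUE key):
<?php
// One prepared INSERT ... ON DUPLICATE KEY UPDATE reused for every product from the sheet.
$db = new mysqli('localhost', 'user', 'pass', 'shop');
$stmt = $db->prepare(
    'INSERT INTO products (sku, name, price) VALUES (?, ?, ?)
     ON DUPLICATE KEY UPDATE name = VALUES(name), price = VALUES(price)'
);

foreach ($excelRows as $row) {               // $excelRows: rows read from the spreadsheet
    $stmt->bind_param('ssd', $row['sku'], $row['name'], $row['price']);
    $stmt->execute();
}
$stmt->close();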
Reference

MYSQL table locking with PHP

I have a MySQL table fg_stock. Most of the time there is concurrent access to this table. I used this code but it doesn't work:
<?php
mysql_query("LOCK TABLES fg_stock READ");
$select = mysql_query("SELECT stock FROM fg_stock WHERE Item='$item'");
while ($res = mysql_fetch_array($select))
{
    $stock       = $res['stock'];
    $close_stock = $stock + $qty_in;
    $update      = mysql_query("UPDATE fg_stock SET stock='$close_stock' WHERE Item='$item' LIMIT 1");
}
mysql_query("UNLOCK TABLES");
?>
Is this okay?
"Most of the time concurrent access is happening in this table"
So why would you want to lock the ENTIRE table when it's clear you are attempting to access a specific row from the table (WHERE Item='$item')? Chances are you are running the MyISAM storage engine for the table in question; you should look into using the InnoDB engine instead, as one of its strong points is that it supports row-level locking, so you don't need to lock the entire table.
Why do you need to lock your table anyway?????
mysql_query("UPDATE fg_stock SET stock=stock+$qty_in WHERE Item='$item'");
That's it! No need to lock the table and no need for an unnecessary loop with a set of queries. Just try to avoid SQL injection, for example by using PHP's intval() function on $qty_in (if it is an integer, of course).
And the concurrent-access trouble probably only happens because of non-optimized work with the database, with an excessive number of queries.
PS: moreover, your example does not make much sense, as MySQL could update the same record every time through the loop. You did not tell MySQL exactly which record you want to update; you only told it to update one record with Item='$item'. On the next iteration the SAME record could be updated again, because MySQL does not know the difference between records it has already updated and those it has not touched yet.
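For reference, a hedged prepared-statement version of that single atomic UPDATE (connection details are placeholders):
<?php
// Atomic increment: no table lock, no read-modify-write loop, and no string interpolation.
$db = new mysqli('localhost', 'user', 'pass', 'stock_db');
$stmt = $db->prepare('UPDATE fg_stock SET stock = stock + ? WHERE Item = ?');
$stmt->bind_param('is', $qty_in, $item);     // $qty_in and $item come from the application
$stmt->execute();
$stmt->close();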
http://dev.mysql.com/doc/refman/5.0/en/internal-locking.html
mysql> LOCK TABLES real_table WRITE, temp_table WRITE;
mysql> INSERT INTO real_table SELECT * FROM temp_table;
mysql> DELETE FROM temp_table;
mysql> UNLOCK TABLES;
So your syntax is correct.
Also from another question:
Troubleshooting: You can test for table lock success by trying to work
with another table that is not locked. If you obtained the lock,
trying to write to a table that was not included in the lock statement
should generate an error.
You may want to consider an alternative solution. Instead of locking,
perform an update that includes the changed elements as part of the
where clause. If the data that you are changing has changed since you
read it, the update will "fail" and return zero rows modified. This
eliminates the table lock, and all the messy horrors that may come
with it, including deadlocks.
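A hedged sketch of that optimistic approach, reusing the column names from the question ($db is assumed to be an existing mysqli connection):
<?php
// Optimistic update: only succeeds if the row still holds the value we originally read.
$new_stock = $old_stock + $qty_in;
$stmt = $db->prepare('UPDATE fg_stock SET stock = ? WHERE Item = ? AND stock = ?');
$stmt->bind_param('isi', $new_stock, $item, $old_stock);
$stmt->execute();

if ($stmt->affected_rows === 0) {
    // Someone else changed the row since we read it: re-read and retry.
}
$stmt->close();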
PHP, mysqli, and table locks?

Tricky MySQL Batch Design

I have a scraper which visits many sites and finds upcoming events and another script which is actually supposed to put them in the database. Currently the inserting into the database is my bottleneck and I need a faster way to batch the queries than what I have now.
What makes this tricky is that a single event has data across three tables which have keys to each other. To insert a single event I insert the location or get the already existing id of that location, then insert the actual event text and other data or get the event id if it already exists (some are repeating weekly etc.), and finally insert the date with the location and event ids.
I can't use REPLACE INTO because it will orphan older data with those same keys. I asked about this in Tricky MySQL Batch Query; the TL;DR outcome was that I have to check which keys already exist, preallocate those that don't, and then make a single insert for each of the tables (i.e. do most of the work in PHP). That's great, but the problem is that if more than one batch is processing at a time, they could both choose to preallocate the same keys and then overwrite each other. Is there any way around this? If so, I could go back to that solution. The batches have to be able to work in parallel.
What I have right now is that I simply turn off the indexing for the duration of the batch and insert each of the events separately but I need something faster. Any ideas would be helpful on this rather tricky problem. (The tables are InnoDB now... could transactions help solve any of this?)
I'd recommend starting with MySQL LOCK TABLES, which you can use to prevent other sessions from writing to the tables while you insert your data.
For example you might do something similar to this
mysql_connect("localhost","root","password");
mysql_select_db("EventsDB");
mysql_query("LOCK TABLE events WRITE");
$firstEntryIndex = mysql_insert_id() + 1;
/*Do stuff*/
...
mysql_query("UNLOCK TABLES);
The above does two things. Firstly, it locks the table, preventing other sessions from writing to it until the point where you're finished and the unlock statement is run. The second thing is $firstEntryIndex, which is the first key value that will be used in any subsequent insert queries.
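To flesh out the /*Do stuff*/ part, a hedged sketch of the three related inserts carried out under the lock, carrying the generated keys forward with mysql_insert_id() (table and column names are assumptions):
<?php
// Inside the lock: insert the location, then the event, then the date row that references both.
mysql_query("LOCK TABLES locations WRITE, events WRITE, event_dates WRITE");

mysql_query("INSERT INTO locations (name) VALUES ('Town Hall')");
$locationId = mysql_insert_id();

mysql_query("INSERT INTO events (location_id, title) VALUES ($locationId, 'Weekly market')");
$eventId = mysql_insert_id();

mysql_query("INSERT INTO event_dates (event_id, location_id, starts_at)
             VALUES ($eventId, $locationId, '2013-06-01 09:00:00')");

mysql_query("UNLOCK TABLES");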
