How can I solve this Solr/MySQL race condition?

I'm experiencing a very strange problem whereby my Solr index is not able to see a change just written to a MySQL database on another connection.
Here is the chain of events:
The user initiates an action on the website that causes a row to be added to a table in MySQL.
The row is added via mysql_query() (no transactions). If I query the database again from the same connection I can naturally see the change I just made.*
A call is immediately sent to a Solr instance via curl to tell it to do a partial update of its index using the Data Import Handler.
Solr connects to the MySQL database via a separate JDBC connection (same credentials and everything) and executes a query for all records updated since its last update.
At this point, however, the results returned to Solr do not include the last-added row, unless I insert a sleep() call immediately after making the change to the database and before sending the message to Solr.
*Note: if I actually do query the database again at this point, that query takes enough time that the change is picked up by Solr. The same occurs if I simply sleep(1) (for one second).
What I'm looking for is some reliable solution that can allow me to make sure the change will be seen by Solr before sending it the refresh message. According to all documentation I've found, however, the call to mysql_query() should already be atomic and synchronous and should not return control to PHP until the database has been updated. Therefore there doesn't appear to be any function I can call to force this.
Does anyone have any advice/ideas? I'm banging my head over this one.

Check what auto-commit is set to on the connection that inserts the record. Chances are the newly inserted row is visible within its own session (but not yet committed); some later event then causes the commit, at which point another thread/session can "see" the record. Also check the transaction isolation level settings.
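As a minimal sketch with the old mysql_* functions the question uses (the items table and its column are made up for illustration):

// Check whether the inserting session is really auto-committing:
$row = mysql_fetch_row(mysql_query("SELECT @@autocommit"));
// $row[0] == 1 means each statement is committed before mysql_query() returns.

// If it is not, commit explicitly before telling Solr to re-index:
mysql_query("INSERT INTO items (title) VALUES ('example')");
mysql_query("COMMIT"); // harmless no-op under autocommit; otherwise makes the row visible to other sessions
// ...only now fire the curl request that triggers Solr's delta-import.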

I typically do not use the Data Import Handler; instead I have the update on the website trigger a mechanism (either internal or external) that pushes the record into Solr using the appropriate Solr client for the programming language being used. I have personally not had much luck with the Data Import Handler in the past and as a result prefer custom code for synchronizing Solr with the corresponding data storage platform.
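From PHP, even without a dedicated client library, a sketch of pushing the row straight to Solr's JSON update handler over curl might look like this (the core name, URL and field names are placeholders, and this assumes a Solr version with the JSON update handler enabled):

$doc = array('id' => $rowId, 'title' => $title); // field names must match your Solr schema
$ch = curl_init('http://localhost:8983/solr/mycore/update?commit=true');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/json'));
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode(array($doc)));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);
curl_close($ch);

Because the data sent to Solr never goes back through a second MySQL connection, this sidesteps the race condition entirely.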

Related

FileMaker PHP API - why is the initial connection so slow?

I've just set up my first remote connection with FileMaker Server using the PHP API and something a bit strange is happening.
The first connection and response takes around 5 seconds; if I hit reload immediately afterwards, I get a response within 0.5 seconds.
I keep getting quick responses for around 60 seconds or so (I haven't timed it exactly, but it seems like at least a minute and less than 5 minutes), and then it goes back to taking 5 seconds to get a response (after which it's quick again).
Is there any way of ensuring that it's always a quick response?
I can't give you an exact answer on where the speed difference may be coming from, but I'd agree with NATH's notion on caching. It's likely due to how FileMaker Server handles caching the results on the server side and when it clears that cache out.
In addition to that, a couple of things that are helpful to know when using custom web publishing with FileMaker when it comes to speed:
The fields on your layout will determine how much data is pulled
When you perform a find in the PHP API on a specific layout, e.g.:
$request = $fm->newFindCommand('myLayout');
$request->addFindCriterion('name', $myname);
$result = $request->execute();
What's being returned is data from all of the fields available on the myLayout layout.
In SQL terms, the above query is roughly equivalent to:
SELECT * FROM myLayout WHERE `name` = ?; -- with $myname bound to ?
The FileMaker find will return every field/column available. You designate the returned columns by placing the fields you want on the layout. To get a true SELECT * from your table, you would include every field from the table on your layout.
All of that said, you can speed up your requests by only including fields on the layout that you want returned in the queries. If you only need data from 3 fields returned to your PHP to get the job done, only include those 3 fields on the layout the requests use.
Once you have the records, hold on to them so you can edit them
Taking the example from above, if you know you need to make changes to those records somewhere down the line in your php, store the records in a variable and use the setField and commit methods to edit them. e.g.:
$request = $fm->newFindCommand('myLayout');
$request->addFindCriterion('name', $myname);
$result = $request->execute();
$records = $result->getRecords();
...
// say we want to update a flag on each of the records later in our PHP code
foreach ($records as $record) {
    $record->setField('active', true);
    $record->commit();
}
Since you have the records already, you can act on them and commit them when needed.
I say this as opposed to grabbing them once for one purpose and then querying the database again later to make updates to the records.
It's not really an answer to your original question, but since FileMaker's API is a bit different from others and it doesn't have the greatest documentation, I thought I'd mention it.
There are some delays that you can remove.
Ensure that the layouts you are accessing via PHP are very simple: no unnecessary or slow calculations, few layout objects, etc. When the PHP engine first accesses that layout it needs to load it up.
Also check for layout and file script triggers that may be run; IIRC the OnFirstWindowOpen script trigger is called when a connection is made.
I don't think that it's related to caching. Also, it's the same when accessing via XML. Haven't tested ODBC, but am assuming that it is an issue with this too.
Once the connection is established with FileMaker Server and your machine, FileMaker Server keeps this connection alive for about 3 minutes. You can see the connection in the client list in the FM Server Admin Console. The initial connection takes a few seconds to set up (depending on how many others are connected), and then ANY further queries are lightning fast. If you run your app again, it'll reuse that connection and give results in very little time.
You can do completely different queries (on different tables) in a different application, but as long as you execute the second one on the same machine and use the same credentials, FileMaker Server will reuse the existing connection and provide results instantly. This means that it is not due to caching, but it's just the time that it takes FMServer to initially establish a connection.
In our case, we're using a web server to make FileMaker PHP API calls. We have set up a cron job that runs every 2 minutes to keep that connection alive, which has pretty much eliminated all delays.
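A rough sketch of such a keep-alive script, assuming the standard FileMaker PHP API (the database name, host, credentials and path are placeholders):

// keepalive.php -- run from cron, e.g.: */2 * * * * php /path/to/keepalive.php
require_once 'FileMaker.php';
$fm = new FileMaker('MyDatabase', 'fms.example.com', 'webuser', 'secret');
// Any cheap request keeps the FileMaker Server connection warm; listing layouts is enough.
$layouts = $fm->listLayouts();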
This is probably way late to answer this, but I'm posting here in case anyone else sees this.
I've seen this happen when using external authentication with FileMaker Server. The first query establishes a connection to Active Directory, which takes some time, and then subsequent queries are fast as FMS has got the authentication figured out. If you can, use local authentication in your FileMaker file for your PHP access and make sure it sits above any external authentication in your accounts list. FileMaker runs through the auth list from top to bottom, so this will make sure that FMS successfully authenticates your web query before it gets to attempt an external authentication request, making the authentication process very fast.

Force MySQL to write back

I have an issue where an instance of Solr is querying my MySQL database to refresh its index immediately after an update is made to that database, but the Solr query is not seeing the change made immediately prior.
I imagine the problem is something like this: Solr is using a different database connection, and somehow the change has not been "committed" (I'm not using transactions, just a call to mysql_query()) by the time the other connection queries for it. If I throw a sufficiently long sleep() call in there, it works most of the time, but obviously this is not acceptable.
Is there a PHP or MySQL function that I can call to force a write/update/flush of the database before continuing?
You might make Solr's connection use READ COMMITTED isolation (SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED) to get a more prompt view of updated data.
You should be able to do this through the JDBC connection configuration, for example via the transactionIsolation attribute of the Data Import Handler's data source.
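As a sketch of what that dataSource entry in the DIH data-config.xml might look like (the connection details are placeholders; check the Solr DIH documentation for the exact attribute names supported by your version):

<dataSource type="JdbcDataSource"
            driver="com.mysql.jdbc.Driver"
            url="jdbc:mysql://localhost:3306/mydb"
            user="solr_reader"
            password="secret"
            transactionIsolation="TRANSACTION_READ_COMMITTED"/>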

Can I use a database value right after I insert it?

Can I insert something into a MySQL database using PHP and then immediately make a call to access that, or is the insert asynchronous (in which case the possibility exists that the database has not finished inserting the value before I query it)?
What I think the OP is asking is this:
<?php
$id = $db->insert(...);
// in this case, $row will always have the data you just inserted!
$row = $db->select(...where id=$id...);
?>
In this case, if you do an insert, you will always be able to access the last inserted row with a select on the same connection. That doesn't change even if a transaction is used here.
If the value is inserted in a transaction, it won't be accessible to any other transaction until your original transaction is committed. Other than that it ought to be accessible at least "very soon" after the time you commit it.
There are normally two ways of using MySQL (and most other SQL databases, for that matter):
Transactional. You start a transaction (either implicitly or by issuing something like 'BEGIN'), issue commands, and then either explicitly commit the transaction, or roll it back (failing to take any action before cutting off the database connection will result in automatic rollback).
Auto-commit. Each statement is automatically committed to the database as it's issued.
The default mode may vary, but even if you're in auto-commit mode, you can "switch" to transactional just by issuing a BEGIN.
If you're operating transactionally, any changes you make to the database will be local to your db connection/instance until you issue a commit. Issuing a commit should block until the transaction is fully committed, so once it returns without error, you can assume the data is there.
If you're operating in auto-commit (and your database library isn't doing something really strange), you can rely on data you've just entered to be available as soon as the call that inserts the data returns.
Note that best practice is to always operate transactionally. Even if you're only issuing a single atomic statement, it's good to be in the habit of properly BEGINing and COMMITing a transaction. It also saves you from trouble when a new version of your database library switches to transactional mode by default and suddenly all your one-line SQL statements never get committed. :)
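As a rough illustration with the old mysql_* functions used elsewhere in this thread (table and column names are invented):

mysql_query("BEGIN");
mysql_query("INSERT INTO orders (customer_id, total) VALUES (42, 19.99)");
if (mysql_query("COMMIT")) {
    // COMMIT has returned, so the row is now visible to every other connection.
} else {
    mysql_query("ROLLBACK");
}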
Mostly the answer is yes. You would have to do some special work to force a database call to be asynchronous in the way you describe, and as long as you're doing it all in the same thread, you should be fine.
What is the context in which you're asking the question?

Live update notification on database changes MYSQL PHP

I was wondering how to trigger a notification if a new record is inserted into a database, using PHP and MySQL.
You can create a trigger that runs when an update happens. It's possible to run/notify an external process using a UDF (user-defined function). There aren't any built-in methods of doing so, so it's a case of loading a UDF plugin that'll do it for you.
Google for 'mysql udf sys_exec' or 'mysql udf ipc'.
The simplest thing is probably to poll the DB every few seconds and see if new records have been inserted. Due to query caching in the DB this shouldn't affect DB performance substantially.
MySQL does now have triggers and stored procedures, but I don't believe they have any way of notifying an external process, so as far as I know it's not possible. You'd have to poll the database every second or so to look for new records.
Even if it were, this assumes that your PHP process is long-lived, such that it can afford to hang around for a record to appear. Given that most PHP is used for web sites where the code runs and then exits as quickly as possible it's unclear whether that's compatible with what you have.
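For completeness, a bare-bones polling loop along those lines might look like this. It is only a sketch: the events table, its auto-increment id column, and the notify() helper are hypothetical, and (as noted above) it assumes a long-running PHP process such as a CLI script rather than a normal web request:

$lastSeenId = 0;
while (true) {
    $result = mysql_query("SELECT * FROM events WHERE id > $lastSeenId ORDER BY id");
    while ($row = mysql_fetch_assoc($result)) {
        notify($row);            // hypothetical helper: send an email, push a message, etc.
        $lastSeenId = $row['id'];
    }
    sleep(5);                    // poll every few seconds
}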
If all your database changes are made by PHP I would create a wrapper function for mysql_query and if the query type was INSERT, REPLACE, UPDATE or DELETE I would call a function to send the respective email.
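Something along these lines, as a sketch (the query-type check and the notify_change() helper are purely illustrative):

function my_query($sql, $connection)
{
    $result = mysql_query($sql, $connection);
    if ($result && preg_match('/^\s*(INSERT|REPLACE|UPDATE|DELETE)\b/i', $sql)) {
        notify_change($sql); // hypothetical helper that sends the notification email
    }
    return $result;
}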
EDIT: I forgot to mention, but you could also do something like the following:
if (mysql_affected_rows($this->connection) > 0)
{
    // mail(...)
}
One day I asked in the MySQL forum whether events like those in Firebird or InterBase exist in MySQL, and someone answered yes (though I'm really not sure).
Check this: http://forums.mysql.com/read.php?84,3629,175177#msg-175177
This can be done relatively easily using stored procedures and triggers. I have created a 'Live View' screen with a scrolling display that is updated with new events from my events table. It can be a bit fiddly, but once it's running it's quick.

Queue Oracle transactions using PHP oci_pconnect function in a webservice

I have written a webservice using the PHP SOAP classes. It has functions to return XML data from an Oracle database, or to perform insert/update/delete on the database.
However, at the moment it is using autocommit, so any operation is instantly committed.
I'm looking at how to queue up the transactions, and then commit the whole lot only when a user presses a button to "save". I'm having difficulty in finding out if this is possible. I can't maintain a consistent connection easily, as of course the webservice is called for separate operations.
I've tried using the PHP oci_pconnect function, but even when I connect each time with the same parameters, the session appears to have ended, and my changes aren't committed when I finally call oci_commit.
Any ideas?
Reusing the same uncommitted database session between PHP requests is not possible. You have no way to pin a user to a particular PHP process or DB connection, as the web server will send each request to any one of many of them at random. Therefore you cannot hold uncommitted data in the Oracle session between requests.
The best way to do this really depends on your requirements. My feeling is that you want some sort of session store (perhaps a database table, keyed on user_id) that can hold all the pending transactions between requests. When the user hits save, extract out all the pending requests and insert them into their final destination table and then commit.
An alternative would be to insert all the transactions with a flag that says they are not yet completed. Upon clicking save, update the flag to say they are completed.
Either way, you need somewhere to stage your pending requests until that save button is pressed.
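A rough sketch of that staging-table approach using the OCI8 functions (the pending_changes and orders tables and their columns are invented for illustration):

// Each webservice call stages its change and commits it immediately:
$conn = oci_connect('user', 'pass', 'localhost/XE');
$stmt = oci_parse($conn, 'INSERT INTO pending_changes (user_id, payload) VALUES (:uid, :payload)');
oci_bind_by_name($stmt, ':uid', $userId);
oci_bind_by_name($stmt, ':payload', $payload);
oci_execute($stmt); // commits on success by default

// When the user finally presses "save", move the staged rows to their real table in one transaction:
$stmt = oci_parse($conn, 'INSERT INTO orders (user_id, data) SELECT user_id, payload FROM pending_changes WHERE user_id = :uid');
oci_bind_by_name($stmt, ':uid', $userId);
oci_execute($stmt, OCI_NO_AUTO_COMMIT);
$stmt = oci_parse($conn, 'DELETE FROM pending_changes WHERE user_id = :uid');
oci_bind_by_name($stmt, ':uid', $userId);
oci_execute($stmt, OCI_NO_AUTO_COMMIT);
oci_commit($conn);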
DBMS_XA allows you to share transactions across sessions.
