I've managed to get stable, load-balanced front-end servers that scale horizontally quite well; however, the next bottleneck will be the DB. There was a blog post discussing scaling DBs horizontally, but it had very little detail. I'm currently using PostgreSQL, so the only plugin I've found wouldn't work.
Are my only options creating my own HAProxy setup or rewriting the PostgreSQL plugin to allow connections to read replicas?
I'm using AWS for all my hosting.
Firstly - I'd love to be corrected on this!
Having only had a quick look through some of the ORM classes in a SilverStripe 3.5 site, it looks like the ORM does support multiple database connections (see DB::get_conn with an argument for the name), but it was designed with specific use cases in mind. That is to say, you may have a module that needs to write to a specific database, so this would allow it to.
What you want is native and automatic support for this within the framework, so that all reads go to your slave(s) and writes go to your master. Unfortunately, it doesn't look like this comes out of the box. You might be able to achieve it by overloading a couple of the core SQL classes using the injector.
If you were to try it, this answer outlines how you could separate select statements out from the rest and run them through a different database connector.
As a quick example of how you might go at achieving this with SQLSelect, you will notice that it is injectable, which means you can easily overload it.
File: mysite/_config/injector.yml
Injector:
  SQLSelect:
    class: ReadOnlySQLSelect
You need to register a new database connection with the DB class:
File: mysite/_config.php
$readDatabaseConfig = array(/** define your DB credentials here, as with the default $databaseConfig **/);
if (!DB::connect($readDatabaseConfig, 'default_read')) {
    user_error('Failed to connect to read replica DB!', E_USER_ERROR);
}
Now, overload the SQLSelect class and replace the parts of it that call the DB class methods. This class inherits from SQLExpression, which is the class that contains the methods you actually care about in this instance:
File: mysite/code/ReadOnlySQLSelect.php
class ReadOnlySQLSelect extends SQLSelect
{
    public function sql(&$parameters = array())
    {
        // Changed from SQLExpression: third parameter passed as connection name
        $sql = DB::build_sql($this, $parameters, 'default_read');

        if (empty($sql)) {
            return null;
        }

        if ($this->replacementsOld) {
            $sql = str_replace($this->replacementsOld, $this->replacementsNew, $sql);
        }

        return $sql;
    }

    public function execute()
    {
        $sql = $this->sql($parameters);
        // Changed from SQLExpression: skip DB::prepared_query since it doesn't allow
        // you to provide the connection name - replace it with its contents instead.
        $conn = DB::get_conn('default_read');

        return $conn->preparedQuery($sql, $parameters);
    }
}
Note: SQLSelect::unlimitedRowCount should technically be replaced where it calls DB::prepared_query, since that method calls DB::get_conn with no arguments and so will always return the default connection. You could replace the DB::prepared_query line in the same way as above:
$conn = DB::get_conn('default_read');
$result = $conn->preparedQuery($sql, $innerParameters);
If you implement the above method, also change new SQLSelect() to SQLSelect::create(), otherwise you'll end up with some queries that still hit the master server, because creating the object with new bypasses the injector and therefore your class.
There's also an instance in SQLConditionalExpression that you should replace too (::toSelect) but that is likely to affect query transformations from other child implementations of that class, and you won't be able to do much about it without either (A) PRing a fix to the framework or (B) overloading all the other SQL* classes.
At this point you should have everything you need to route select queries to your default_read connection.
Infrastructure
On the infrastructure side, you should be able to set up read replicas through the RDS console. When you do so it will provide you with a DNS endpoint for your replica node(s), which you can use in your _config.php to configure the connection to the read replica database.
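To make that concrete, the read-replica connection in mysite/_config.php might look like the sketch below. The endpoint, credentials and database name are placeholders for your own values, and the 'type' key assumes the stock PostgreSQLDatabase adapter:
// hypothetical values - substitute the endpoint shown in the RDS console
$readDatabaseConfig = array(
    'type'     => 'PostgreSQLDatabase',
    'server'   => 'myapp-replica.xxxxxxxx.eu-west-1.rds.amazonaws.com',
    'port'     => 5432,
    'username' => 'ssuser',
    'password' => 'secret',
    'database' => 'myapp',
);

if (!DB::connect($readDatabaseConfig, 'default_read')) {
    user_error('Failed to connect to read replica DB!', E_USER_ERROR);
}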
If this works for you, you should create a module for it and put it up on GitHub - this would definitely be useful for others in future!
You may also consider making pull requests to the framework to add additional arguments to methods like DB::prepared_query to accept a connection name.
Also worth noting: if you're using the mysqlnd database adapter you may be able to take advantage of read/write splitting (via the mysqlnd_ms plugin), implemented with some sort of injector overloading but handled at a lower level than the application layer.
Related
I am writing fresh code, as part of refactoring an older legacy codebase.
Specifically, I am writing a Device class that will be used to compute various specifications of a device.
The Device class depends on the device's model number and particle count, and I can call it as $device = new Device($modelNumber, $particleCount);
Problem: since this class will go into existing legacy code, I have no direct influence on whether it will be called properly. For Device to work, it needs the correct model number and the correct particle count. If it does not receive proper configuration data, the device will not be initialized internally and the class will not work. I think I need to find a way to let the caller know that there was an error in case invalid configuration data was supplied. How do I structure this to be in line with object-oriented principles?
Or, alternatively, do I need to concern myself with this at all? There is a principle that if you supply garbage, you get garbage back, i.e. my class only needs to work properly with proper data. If improper data is supplied, it can bake a cake instead, or do nothing (and possibly fail silently). Well, I am not sure that principle is great here. I do need something to complain if the supplied configuration data is bad.
Here is some code of what I am thinking:
$device = new Device($x, $y);
$device->getData();
The above will fail or produce bad or no data if $x or $y are outside of the device specs. I don't know how to handle this failure. I also want to be able to assume that $device is valid when I call the getData() method, but I can't make that assumption.
or
$device = new Device($x, $y);
if ($device->isValid()) {
    $device->getData();
} else {
    blow_up("invalid device configuration supplied");
}
The above is better, but the caller now has to know they are supposed to call the isValid() function. This also "waters down" my class: it has to do two things, 1) create the device, and 2) verify the device configuration is valid.
I could create a DeviceChecker class that deals with configuration verification. And maybe that's a solution. It bothers me a little that DeviceChecker will have to contain some of the logic that is already in the Device class.
Questions
What problem am I trying to solve here? Am I actually trying to design an error handling system in addition to my "simple class" issue? I think I probably am... Well, I don't have the luxury of doing that at the moment (the legacy code base is huge). Is there anything I can do now that is localized to the pieces of code I touch? That something is what I am looking for with this question.
I think you need code like the below to verify the passed arguments in the constructor:
class Device {
    public function __construct($modelNumber, $particleCount) {
        if (!$this->isValid($modelNumber, $particleCount)) {
            // a constructor cannot return false to prevent object creation;
            // throw instead so the caller is forced to handle the error
            throw new InvalidArgumentException('Invalid device configuration supplied');
        }
    }
}
This validates the passed params during construction; by throwing an exception when they are invalid, the caller can never end up holding a half-initialized Device object.
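At the call site the failure then becomes explicit. A minimal usage sketch; the catch block is just one option for how the legacy code might react:
try {
    $device = new Device($x, $y);
    $data = $device->getData();
} catch (InvalidArgumentException $e) {
    // the legacy call site decides how loudly to fail
    error_log($e->getMessage());
}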
I have developed a project in which several SQL queries are used. Now I want to monitor some queries in order to increase security, so I want every query to be passed through a function first. As there are too many queries, I cannot go back and edit every file and query. Is there a way to trap queries before they are sent to the MySQL server?
There are four ways to accomplish this depending on what you are using, the last being by far the most reliable.
The General Query Log
MySQL provides a mechanism to log just about everything that the mysqld process is doing, via the general query log. As you described in your question you probably do not have persistent connections, so you will need to either:
Enable the MySQL general query log when the mysqld process is started, with the --log[=file_name] option, or
Set a global/session variable with SET GLOBAL general_log = 'ON'.
For more information about the general query log, see the MySQL 5.1 reference manual.
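If you'd rather flip it on from PHP than from the server configuration, here is a minimal sketch (this assumes a MySQL account with privileges to set global variables; the host, credentials and log path are placeholders):
$mysqli = new mysqli('localhost', 'admin_user', 'secret');
$mysqli->query("SET GLOBAL general_log_file = '/var/log/mysql/general.log'");
$mysqli->query("SET GLOBAL general_log = 'ON'");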
Using sed (or manually!)
This technique involves creating a new function, and renaming all of the mysqli_* function calls in your codebase so that they call it instead.
Presuming your newly created function is named proxy_query(), you can use sed to traverse through all files and change them automatically:
sed -i '.bck' 's/mysqli_query/proxy_query/g' *.php
The -i parameter specifies that the files should be edited in place, with a copy made of each original file with a .bck extension appended.
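The proxy function itself only needs to mirror mysqli_query's signature and pass the call through; a minimal sketch (the error_log() call is just an example of what the monitoring might do):
function proxy_query(mysqli $link, $query, $resultmode = MYSQLI_STORE_RESULT)
{
    // inspect, log or rewrite the query before it reaches the server
    error_log('SQL: ' . $query);

    return mysqli_query($link, $query, $resultmode);
}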
The runkit extension
I must admit that I'm being naive here, and that I haven't used this particular extension before - but it is possible to rename functions with this PECL extension.
The requirements for this extension can be found here, and note that it is not bundled with PHP.
As above, you can create a proxy function through which all calls will go; rather than renaming the calls in your source, this time we rename the built-in function itself out of the way. Usage would go something like this:
// rename the function (a very bad idea, really!)
runkit_function_rename('mysqli_query', 'proxy_super');

function mysqli_query($link, $query, $resultmode = MYSQLI_STORE_RESULT)
{
    // do something with the SQL in $query
    // .. and call the original mysqli_query, now renamed to proxy_super
    return proxy_super($link, $query, $resultmode);
}
I have to note here that this method is highly discouraged. You shouldn't ever need to redefine built-in PHP functions.
Using Pdo/OO-mysqli
This is the simplest technique, and probably the most reliable as well. If you're using PDO already, you can simply extend the \PDO class. A similar approach could be used with MySQL Improved (mysqli):
class MyPdo extends \PDO
{
    public function query($query, ...$args)
    {
        // do something with $query (inspect or log it) before executing
        return parent::query($query, ...$args);
    }
}
Also note that this will only work if you are using PDO, and only if you are able to change the instantiation of the PDO object so that it uses your own class, MyPdo.
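Swapping it in is then a one-line change at the point of instantiation; the DSN and credentials below are placeholders:
$db = new MyPdo('mysql:host=localhost;dbname=mydb', 'user', 'pass');
$result = $db->query('SELECT * FROM users');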
For more information about the \PDO class and its children, see the manual.
If you want to monitor incoming queries, using a SQL profiler can be an excellent way to gather information on what's going on inside SQL without passing it all through a single function or procedure.
PHP's Mongo driver lacks a renameCommand function. There is a reference to doing this through the admin database, but it seems more recent versions of the Mongo driver don't let you just "use" the admin database if you don't have login privileges on it, so this method no longer works. I've also read that this doesn't work in sharded environments, although that isn't a concern for me currently.
The other suggestion people seem to have is to iterate through the "from" collection and insert into the "to" collection. With the proper WriteConcern (fire and forget) this could be fairly fast. But it still means pulling down each record over the network into the PHP process and then uploading it back over the network back into the database.
Ideally, I want a way to do it all server-side, sort of like an INSERT INTO ... SELECT ... in SQL. That way it is fast, network efficient, and a low load on PHP.
I have just tested this, it works as designed ( http://docs.mongodb.org/manual/reference/command/renameCollection/ ):
$mongo->admin->command(array('renameCollection'=>'ns.user','to'=>'ns.e'));
That is how you rename an unsharded collection. One problem with map/reduce (MR) is that it will change the shape of the output from the original collection, so it is not very good at copying a collection. You would be better off copying it manually if your collection is sharded.
As an added note, I upgraded to 1.4.2 (which for some reason comes out of the pecl channel reporting itself in phpinfo() as 1.4.3dev :S) and it still works.
Updates:
- Removed my old map/reduce method since I found out (and Sammaye pointed out) that it changes the structure.
- Made my exec version secondary since I found out how to do it with renameCollection.
I believe I have found a solution. It appears some versions of the PHP driver will auth against the admin database even though they don't need to, but there is a workaround: the authSource connection param changes this behavior so the driver auths against the database of your choice instead of admin. So now my renameCollection function is just a wrapper around the renameCollection command again.
The key is to add authSource when connecting. In the below code $_ENV['MONGO_URI'] holds my connection string and default_database_name() returns the name of the database I want to auth against.
$class = 'MongoClient';
if( !class_exists($class) ) $class = 'Mongo';
$db_server = new $class($_ENV['MONGO_URI'].'?authSource='.default_database_name());
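For completeness, the wrapper mentioned above might look something like this sketch; it reuses default_database_name() from the snippet above, and error handling is elided:
function renameCollection($db_server, $old_name, $new_name) {
    $ns = default_database_name();
    $db_server->admin->command(array(
        'renameCollection' => $ns . '.' . $old_name,
        'to'               => $ns . '.' . $new_name,
    ));
}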
Here is my older version, which used eval and should also work, although some environments don't allow you to eval (MongoLab gives you a crippled setup unless you have a dedicated system). If you are running in a sharded environment, this seems like a reasonable solution.
function renameCollection($old_name, $new_name) {
    db()->$new_name->drop();
    $copy = "function() {db.$old_name.find().forEach(function(d) {db.$new_name.insert(d)})}";
    db()->execute($copy);
    db()->$old_name->drop();
}
You can use this. If the "dropTarget" flag is true, an existing target collection is deleted first:
$mongo = new MongoClient('_MONGODB_HOST_URL_');
$query = array("renameCollection" => "Database.OldName", "to" => "Database.NewName", "dropTarget" => true);
$mongo->admin->command($query);
I want to be able to call the CakeS3 plugin from the Cake Shell. However, as I understand it, components cannot be loaded from the shell. I have read this post outlining strategies for overcoming that: using components in Cakephp 2+ Shell - however, I have had no success. The CakeS3 code here is similar to perfectly functioning CakeS3 code in the rest of my app.
<?php
App::uses('Folder','Utility');
App::uses('File','Utility');
App::uses('CakeS3.CakeS3','Controller/Component');
class S3Shell extends AppShell {

    public $uses = array('Upload', 'User', 'Comment');

    public function main() {
        $this->CakeS3 = new CakeS3.CakeS3(
            array(
                's3Key'    => 'key',
                's3Secret' => 'key',
                'bucket'   => 'bucket'
            )
        );
        $this->out('Hello world.');
        $this->CakeS3->permission('private');
        $response = $this->CakeS3->putObject(WWW_ROOT . '/file.type', 'file.type', $this->CakeS3->permission('private'));
        if ($response == false) {
            echo "it failed";
        } else {
            echo "it worked";
        }
    }
}
This returns an error of "Fatal error: Class 'CakeS3' not found in /home/app/Console/Command/S3Shell.php". The main reason I am trying to get this to work is so I can automate some uploads with a cron job. Of course, if there is a better way, I am all ears.
Forgive me this "advertising"... ;) but my plugin is probably better written and has a better architecture than the CakeS3 plugin, which uses a component for what should really be a model, behaviour or task. Also, it was made for exactly the use case you have. Plus, it supports a few more storage systems than just S3.
You could do that for example in your shell:
StorageManager::adapter('S3')->write($key, StorageManager::adapter('Local')->read($key));
A file should be handled as an entity of its own that is associated with whatever it needs to be associated with. Every uploaded file (if you use or extend the models that come with the plugin; if not, you have to take care of that) is stored as a single database entry containing the name of the config that was used and some meta data for that file. If you run the line of code above in your shell, you will have to keep a record in the table if you want to access the file this way later. Just check out the examples in the readme.md. You don't have to use the database table as a reference to your files, but I really recommend the system the plugin implements.
Also, you might not be aware that WWW_ROOT is publicly accessible, so if you store sensitive data there it can be accessed by anyone.
And finally in a shell you should not use echo but $this->out() for proper shell output.
I think the App::uses should look like:
App::uses('CakeS3', 'CakeS3.Controller/Component');
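If that loads the class correctly, instantiation inside the shell would then look something like the sketch below - assuming the component class really is named CakeS3 and follows CakePHP 2's standard Component constructor (a ComponentCollection plus a settings array):
App::uses('ComponentCollection', 'Controller');
App::uses('CakeS3', 'CakeS3.Controller/Component');

$collection = new ComponentCollection();
$this->CakeS3 = new CakeS3($collection, array(
    's3Key'    => 'key',
    's3Secret' => 'key',
    'bucket'   => 'bucket',
));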
I'm the author of CakeS3, and no I'm afraid there is no "supported" way to do this as when we built this plugin, we didn't need to run uploads from shell and just needed a simple interface to S3 from our controllers. We then open sourced the plugin as a simple S3 connector.
If you'd like to have a go at modifying it to support shell access, I'd welcome a PR.
I don't have a particular road map for the plugin, so I've tagged your issue on GitHub as an enhancement and will certainly consider it in future development, but I can't guarantee it would fit your time requirements, which is why I suggest you do a PR.
How can I detect, using php, if the machine has oracle (oci8 and/or pdo_oci) installed?
I'm working on a PHP project where some developers, such as myself, have it installed, but there's little need for the themers to have it. How can I write a quick function to use in the code so that my themers are able to work on the look of the site without having it crash on them?
If the OCI extension isn't installed, you'll get a fatal error with farside.myopenid.com's answer. You can use function_exists('oci_connect') or extension_loaded('oci8') (or whatever the extension is actually called) instead.
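A minimal sketch of such a guard; the fallback branch is just illustrative:
if (extension_loaded('oci8')) {
    $conn = oci_connect('username', 'password', 'mydb');
} else {
    // no Oracle support on this machine - skip the DB work so themers don't crash
    $conn = null;
}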
The folks here have pieces of the solution, but let's roll it all together.
For just a single instance of an Oracle function, testing with function_exists() is good enough; but if OCI calls are sprinkled throughout the code, it's going to be a huge pain in the ass to wrap every one in a function_exists() test.
Therefore, I think the simplest solution would be to create a file called nodatabase.php that might look something like this:
<?php
// nodatabase.php
// explicitly override database functions with empty stubs. Only include this file
// when you want to run the code without an actual database backend. Any database-
// related functions used in the codebase must be included below.
function oci_connect($user, $password, $db = '', $charset='UTF-8', $session_mode=null)
{
}
function oci_execute($statement, $mode=0)
{
}
// and so on...
Then, conditionally include this file if a global (say, THEME_TESTING) is defined just ahead of where the database code is called. Such an include might look like this:
// define("THEME_TESTING", true) // uncomment this line to disable database usage
if( defined(THEME_TESTING) )
include('nodatabase.php'); // override oracle API with stub functions for the artists.
Now, when you hand the project over to the artists, they simply need to make that one modification and they're good to go.
I don't know if I fully understand your question, but a simple way would be to do this:
<?php
$connection = oci_connect('username', 'password', 'connection_string');
if (!$connection) {
// no OCI connection.
}
?>
As mentioned above by Greg, programmatically you can use the function_exists() method. Don't forget you can also see all the environment specifics of your PHP install with the following:
<?php
phpinfo();
?>