I have a PHP class that creates a row with a CURRENT_TIMESTAMP in a MySQL database when I create a new instance of it. This timestamp is then used to sort these records (and primarily to get the last one).
In this precise case, the timestamp is the automatically generated created_at column of an Eloquent (Laravel) model. If possible I'd like a solution that is not Laravel-specific, given that I could encounter the same problem in other projects.
I had written tests for this class on another PC and everything was working fine; the resulting order was always correct.
Now I have copied this project to my other PC, which has an SSD and is (though not by much) more powerful than the other. And now all my tests fail, because all records end up with the exact same timestamp.
In production it would not be an issue if two records had the same timestamp (which is very unlikely), but that's not the situation I'm trying to replicate in my tests.
To quickly address the issue without digging into the code, I added some sleep(1) calls before instantiating a new object.
But now my question is: what is the best way to deal with that?
Here are a few possibilities I thought of, but I'm not sure which one is the best:
Hard-coding the dates in the tests: no risk of failure, but my objects currently offer no way to set a date manually. I would need to add methods for that, and they wouldn't be used for anything else.
Emptying the database before doing the next operation. The problem is that some tests need to insert two objects one after the other, so it wouldn't solve all the problems.
Making the code wait. But that sounds very stupid to me, because I lose all the benefits of having a powerful machine, and of course the tests take much longer to run.
Manipulating the system or database current time. I have no idea what horrible things could happen if I try that! Maybe it is possible to isolate a process that does this without impacting anything else on the PC?
So which method do you recommend? Is there a better one?
More system information: all my dev machines run Ubuntu 14.04 with PHP 5.5 and MySQL 5.5. My current project uses the Laravel 4 framework and PHPUnit for the tests.
Additional information
I'd like to add here what I said in the comments below.
Ideally my tests should make their assertions with dates that are several hours apart, but for the moment I relied on the fact that two records were at least one second apart, which is no longer the case with this other PC's configuration.
Also, in production, older records normally get deleted before a new one is inserted or the DB is queried (say, 90% of the time), so even if two were created in quick succession, there is a good chance one would be removed before I run my logic on the database. But for my tests I need at least two records in the database, separated by some amount of time; how much doesn't matter, as long as it is greater than zero.
I've found a good solution that applies to Eloquent only.
It is the same technique that is used for database seeds.
We can add Eloquent::unguard() before instantiating the object in the test:
Eloquent::unguard();
Model::create(array('created_at' => '2014-12-17'));
This way we can manually set the created_at field of the model without exposing it to "everybody" who uses the class.
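For example, a test could look roughly like this (just a sketch: the Order model, its data and the assertion are made up for illustration, and it assumes the same test database your existing tests already use):
class LatestRecordTest extends PHPUnit_Framework_TestCase
{
    public function testLatestReturnsTheMostRecentRecord()
    {
        Eloquent::unguard(); // allow mass assignment of created_at

        // Two records whose timestamps are days apart, without any sleep()
        Order::create(array('created_at' => '2014-12-15 08:00:00'));
        $recent = Order::create(array('created_at' => '2014-12-17 08:00:00'));

        $latest = Order::orderBy('created_at', 'desc')->first();
        $this->assertEquals($recent->id, $latest->id);
    }
}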
You can use Carbon::setTestNow($now) to mock the current time:
foreach ($list as $index => $test) {
    // Shift the mocked "now" so each created model gets a distinct timestamp
    Carbon::setTestNow(now()->addDays($index + 1));
    // functionCreateModel();
}
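One caveat (assuming Carbon's usual behaviour): the mocked time stays in effect until you clear it, so it's worth resetting it after each test, for example:
protected function tearDown()
{
    Carbon::setTestNow(); // clear the mocked "now" so other tests use the real clock
    parent::tearDown();
}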
Related
When running our suite of acceptance tests we would like to execute every single test on a defined database state. If one of the tests writes to the database (like creating users or something else), it must of course not affect later tests.
We have thought about several options to achieve that, but copying the whole database before every single test does not seem like the best solution (thinking of possible performance issues).
One more idea was to use MySQL transactions, but some of our tests cause many HTTP requests, so different PHP processes are spawned and they would lose the transaction too early for a clean rollback after the full test is done.
Are there better ways to guarantee a defined database state for every one of our acceptance tests? We would like to keep it simpler than solutions like aufs or btrfs that tackle it at the system level.
You could approach this problem using PHPUnit.
It is used for automated testing in PHP. It's not the only library, but it is one of the most widely used ones.
You can use it for database testing as well (https://phpunit.de/manual/current/en/database.html). Basically, it lets you accomplish exactly what you are looking for: import the whole database initially, and then in each test suite load what you need and restore the previous state afterwards. For example, you could temporarily save the current state of table A and, after you are done with all the tests of the suite, simply restore it instead of reloading the whole database.
By the way, having a minimal database with only the information required for testing helps a lot as well. That way you don't have to deal with big performance issues, and you can simply restore it after each test suite.
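A rough sketch with PHPUnit's database extension (the DSN, the table name and the cars.xml fixture are placeholders for whatever your project uses):
class ReportDatabaseTest extends PHPUnit_Extensions_Database_TestCase
{
    protected function getConnection()
    {
        $pdo = new PDO('mysql:host=localhost;dbname=myapp_test', 'user', 'secret');
        return $this->createDefaultDBConnection($pdo, 'myapp_test');
    }

    protected function getDataSet()
    {
        // The tables listed in this XML are truncated and re-seeded before every test
        return $this->createFlatXMLDataSet(__DIR__ . '/fixtures/cars.xml');
    }

    public function testSeededRowsArePresent()
    {
        $this->assertSame(5, $this->getConnection()->getRowCount('cars'));
    }
}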
I'm having a problem retrieving documents from a MongoDB collection immediately after inserting them. I'm creating documents, then running a query (that cannot be determined in advance) to get a subset of all documents in the collection. The problem is that some or all of the documents I inserted aren't included in the results.
The process is:
Find the timestamp of the most recent record
Find transactions that have taken place since that time
Generate records for those transactions and insert() each one (this can and will become a single bulk insert)
find() some of the records
The documents are always written successfully, but more often than not the new documents aren't included when I run the find(). They are available after a few seconds.
I believe the new documents haven't propagated to all members of the replica set by the time I try to retrieve them, though I am suspicious that this may not be the case as I'm using the same connection to insert() and find().
I believe this can be solved with a write concern, but I'm not sure what value to specify to ensure that the documents have propagated to all members of the replica set, or at least the member that will be used for the find() operation if it's possible to know that in advance.
I don't want to hard code the total number of members, as this will break when another member is added. It doesn't matter if the insert() operations are slow.
Read preference
When you write to a collection, it's a good practice to set the readPreference to "primary" to make sure you're reading from the same MongoDB server that you've written to.
You do that with the MongoCollection::setReadPreference() method.
$db->mycollection->setReadPreference(MongoClient::RP_PRIMARY);
$db->mycollection->insert(['foo' => 'bar']);
$result = $db->mycollection->find([]);
Write concern (don't do it!)
You might be tempted to use write concern to wait for the data to be replicated to all secondaries by using w=3 (for a 3 server setup). However this is not the way to go.
One of the nice things about MongoDB replication is that it will do automatic failover. In that case you might have fewer than 3 servers that can accept the data, causing your script to wait forever.
There is no w=all to write to all servers that are up, and using such a write concern wouldn't be a good idea anyway. A secondary that has just recovered from a failover might be hours behind and take a long time to catch up. Your script would wait (hang) until all secondaries have caught up.
A good practice is never to use w=N with N > majority outside of administrative tasks.
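If you do want to wait for replication at all, "majority" is the sensible upper bound, because it adapts automatically when members are added or fail over. A sketch with the same legacy driver as above:
// Wait for a majority of the replica set members instead of a fixed number
$db->mycollection->insert(['foo' => 'bar'], ['w' => 'majority']);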
Basically you are looking for a write concern, which (in layman's terms) allows you to specify when an insert is considered finished.
In PHP this is done by providing an option in the insert statement, so you need something like:
w=N (Replica Set Acknowledged): the write will be acknowledged by the primary server, and replicated to N-1 secondaries.
or, if you do not want to hard-code N:
w=<tag set> (Replica Set Tag Set Acknowledged): the write will be acknowledged by members of the entire tag set.
$collection->insert($someDoc, ["w" => 3]);
The server is a shared Windows hosting server with Hostgator. We are allowed "unlimited" MS SQL databases and each is allowed "unlimited" space. I'm writing the website in PHP. The data (not the DB schema, but the data) needs to be versioned such that (ideally) my client can select the DB version he wants from a select box when he logs in to the website, and then (roughly once a month) tag the current data, also through a simple form on the website. I've thought of several theoretical ways to do this and I'm not excited about any of them.
1) Put a VersionNumber column on every table; have a master Version table that lists all versions for the select box at login. When tagged, every row without a version number in every table in the db would be duplicated, and the original would be given a version number.
This seems like the easiest idea for both me and my client, but I'm concerned the db would be awfully slow in just a few months, since every table will grow by (at least) its original size every month. There's not a whole lot of data, and there probably never will be, in any one version. But multiplying versions in the same table just scares me.
2) Duplicate the DB every time we tag.
It looks like this would have to be done manually by my client since the server is shared, so I already dislike the idea. But in addition, the old DBs would have to be able to work with the current website code, and as changes are made to the DB structure over time (which is inevitable) the old DBs will no longer work with the new website code.
3) Create duplicate tables (with the version in their name) inside the same database every time we tag. Like [v27_Employee].
The benefit here over idea (1) would be that no table would get humongous in size, allowing the queries to keep up their speed, and over idea (2) that it could theoretically be done easily through the simple website tag form rather than manually by my client. The problems are that the queries in my PHP code are going to get all discombobulated as I try to explain which Employee table is joining with which Address table depending upon which version is selected, since they all have the same base name but different prefixes; and also that as the code changes, the old DB tables will no longer match it, which is the same problem as (2).
So, finally, does anyone have any good recommendations? Best practices? Things they did that worked in the past?
Thanks guys.
Option 1 is the most obvious solution because it has the lowest maintenance overhead and it's the easiest to work with: you can view any version at any time simply by adding #VersionNumber to your queries. If you want or need to, this means you could also implement option 3 at the same time by creating views for each version number instead of real tables. If your application only queries one version at a time, consider making the VersionNumber the first column of a clustered primary key, so that all the data for one version is physically stored together.
And it isn't clear how much data you have anyway. You say it's "not a whole lot", but that means nothing. If you really have a lot of data (say, into hundreds of millions of rows) and if you have Enterprise Edition (you didn't say what edition you're using), you can use table partitioning to 'split' very large tables for better performance.
My conclusion would be to do the simplest, easiest thing to maintain right now. If it works fine then you're done. If it doesn't, you will at least be able to rework your design from a simple, stable starting point. If you do something more complicated now, you will have much more work to do if you ever need to redesign it.
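To make option 1 concrete, the monthly "tag" step could look roughly like this (a sketch only: the Employee table, its columns and the PDO connection are hypothetical, and every versionable table would need the same treatment):
// A NULL VersionNumber marks the current working copy of a row.
function tagCurrentData(PDO $db, $newVersion)
{
    $db->beginTransaction();

    // 1. Freeze the current rows under the new version number
    $freeze = $db->prepare('UPDATE Employee SET VersionNumber = :v WHERE VersionNumber IS NULL');
    $freeze->execute(array(':v' => $newVersion));

    // 2. Copy the frozen rows back (with no version) as the new working copy
    $copy = $db->prepare(
        'INSERT INTO Employee (Name, Salary, VersionNumber)
         SELECT Name, Salary, NULL FROM Employee WHERE VersionNumber = :v'
    );
    $copy->execute(array(':v' => $newVersion));

    $db->commit();
}
Reading an older version is then just a matter of filtering on VersionNumber in the queries.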
You could copy your versionable tables into a new database every month. If you need to do a join between a versionable table and a non-versionable table, you'd need to do a cross-schema join - which is supported in SQL Server. This approach is a bit cleaner than duplicating tables in a single schema, since your database explorer will start getting unwieldy with all the old tables.
What I finally wound up doing was creating a new schema for each version and duplicating the tables and triggers and keys each time the DB is versioned. So, for example, I had this table:
[dbo].[TableWithData]
And I duplicated it into this table in the same DB:
[v1].[TableWithData]
Then, when the user wants to view old tables, they select which version and my code automatically changes every instance of [dbo] in every query to [v1]. It's conceptually fairly simple and the user doesn't have to do anything complicated to version -- just type in "v1" to a form and hit a submit button. My PHP and SQL does the rest.
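The query rewriting itself can stay very small. A sketch of that step (the helper name is mine, and it assumes every query writes table references as [dbo].[TableName]):
function applyVersionSchema($sql, $version)
{
    if ($version === null || $version === 'dbo') {
        return $sql; // current data lives in the [dbo] schema
    }
    return str_replace('[dbo].', '[' . $version . '].', $sql);
}

// e.g. applyVersionSchema('SELECT * FROM [dbo].[TableWithData]', 'v1')
// becomes 'SELECT * FROM [v1].[TableWithData]'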
I did find that some tables had to remain separate -- I made a different schema called [ctrl] into which I put tables that will not be versioned, like the username / password table for example. That way I just duplicate the [dbo] tables.
It's been operational for a year or so and seems to work well at the moment. They've only versioned maybe 4 times so far. The only problem I seem to have consistently, and that I can't figure out, is that triggers seem to get lost somehow. That's probably a problem with my very complex PHP rather than with the DB versioning concept itself, though.
I'm currently trying to use PHPUnit to learn about Test Driven Development (TDD) and I have a question about writing reports using TDD.
First off: I understand the basic process of TDD.
But my question is this: How do you use TDD to write a report?
Say you've been tasked to write a report about the number of cars that pass by a given intersection by color, type, and weight. Now, all of the above data has been captured in a database table but you're being asked to correlate it.
How do you go about writing tests for a method whose outcome you don't know in advance? The outcome of the method that correlates this data is going to change based on the date range and other limiting criteria that the user may provide when running the report. How do you work within the confines of TDD in this situation, using a framework like PHPUnit?
You create test data beforehand that represents the type of data you will receive in production, then test your code against that, refreshing the table each time you run the test (i.e. in your setUp() function).
You can't test against the actual data you will receive in production no matter what you're testing. You're only testing that the code works as expected for a given scenario. For example, if you load your testing table with five rows of blue cars, then you want your report to show five blue cars when you test it. You're testing the parts of the report, so that when you're done you will have tested the whole of the report automatically.
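In PHPUnit that idea might look roughly like this (a sketch: CarReport and countByColor() are hypothetical names for the code under test, and an in-memory SQLite table stands in for the real one):
class CarReportTest extends PHPUnit_Framework_TestCase
{
    private $pdo;

    protected function setUp()
    {
        // Known data, rebuilt for every test run
        $this->pdo = new PDO('sqlite::memory:');
        $this->pdo->exec('CREATE TABLE cars (color TEXT, type TEXT, weight INTEGER)');
        $insert = $this->pdo->prepare('INSERT INTO cars VALUES (?, ?, ?)');
        for ($i = 0; $i < 5; $i++) {
            $insert->execute(array('blue', 'sedan', 1200));
        }
    }

    public function testCountsFiveBlueCars()
    {
        $report = new CarReport($this->pdo); // hypothetical class under test
        $this->assertEquals(5, $report->countByColor('blue'));
    }
}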
As a comparison, if you were testing a function that expected a positive integer between 1 and 100, would you write 100 tests to test each individual integer? No, you would test something within the range, then something on and around the boundaries (e.g. -1, 0, 1, 50, 99, 100, and 101). You don't test, for example, 55, because that test will go down the same code path as 50.
Identify your code paths and requirements, then create suitable tests for each one of them. Your tests will become a reflection of your requirements. If the tests pass, then the code will be an accurate representation of your requirements (and if your requirements are wrong, TDD can't save you from that anyway).
You don't use the same data when running the test suites and when running your script. You use test data. So if you want to interact with a database, a good solution is to create an SQLite database stored in RAM.
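For example (nothing framework-specific here):
// An in-memory SQLite database: created fresh for the test, gone when the test ends
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE cars (color TEXT NOT NULL)');
$pdo->exec("INSERT INTO cars (color) VALUES ('blue')");
$pdo->exec("INSERT INTO cars (color) VALUES ('red')");
// hand $pdo to the code under test instead of the production connection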
Similarly, if your function interacts with a filesystem, you can use a virtual filesystem.
And if you have to interact with objects, you can mock them too.
The good thing is you can test with all the vicious edge-case data you can think of while writing the code (hey, what if the data contains unescaped quotes?).
It is very difficult, and often unwise, to test directly against your production server, so your best bet is to fake it.
First, you create a stub, a special object which stands in for the database and allows your unit tests to pretend that some value came from the DB when it really came from you. If need be, you have something which is capable of generating values that are not knowable to you in advance but are still accessible to the tests.
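With PHPUnit that stub could look roughly like this (CarRepository and findByColor() are hypothetical stand-ins for whatever object talks to your database):
class ReportStubTest extends PHPUnit_Framework_TestCase
{
    public function testReportUsesWhateverTheRepositoryReturns()
    {
        // The "database result" really comes from the test itself
        $repository = $this->getMockBuilder('CarRepository')
                           ->disableOriginalConstructor()
                           ->setMethods(array('findByColor'))
                           ->getMock();
        $repository->expects($this->any())
                   ->method('findByColor')
                   ->will($this->returnValue(array(
                       array('color' => 'blue', 'type' => 'sedan', 'weight' => 1200),
                   )));

        $report = new CarReport($repository); // hypothetical class under test
        $this->assertEquals(1, $report->countByColor('blue'));
    }
}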
Once everything is working there, you can have a data set in the DB itself in some testing schema -- basically, you connect with different parameters so that while the code thinks it is looking at PRODUCTION.CAR_TABLE it is really looking at TESTING.CAR_TABLE. You may even want to have the test drop/create the table each time (though that might be a bit much, it does result in more reliable tests).
I've arrived at the point where I realise that I must start versioning my database schemata and changes. I consequently read the existing posts on SO about that topic but I'm not sure how to proceed.
I'm basically a one man company and not long ago I didn't even use version control for my code. I'm on a windows environment, using Aptana (IDE) and SVN (with Tortoise). I work on PHP/mysql projects.
What's an efficient and sufficient (no overkill) way to version my database schemata?
I do have a freelancer or two in some projects but I don't expect a lot of branching and merging going on. So basically I would like to keep track of concurrent schemata to my code revisions.
[edit] Temporary solution: for the moment I've decided I will just make a schema dump plus one with the necessary initial data whenever I commit a tag (stable version). That seems to be just enough for me at the current stage. [/edit]
[edit2] Plus, I'm now also using a third file called increments.sql where I put all the changes with dates, etc., to make it easy to trace the change history in one file. From time to time I integrate the changes into the two other files and empty increments.sql. [/edit2]
Simple way for a small company: dump your database to SQL and add it to your repository. Then every time you change something, add the changes to the dump file.
You can then use diff to see changes between versions, not to mention have comments explaining your changes. This will also make you virtually immune to MySQL upgrades.
The one downside I've seen to this is that you have to remember to manually add the SQL to your dumpfile. You can train yourself to always remember, but be careful if you work with others. Missing an update could be a pain later on.
This could be mitigated by creating some elaborate script to do it for you when submitting to subversion but it's a bit much for a one man show.
Edit: In the year that's gone by since this answer, I've had to implement a versioning scheme for MySQL for a small team. Manually adding each change was seen as a cumbersome solution, much like it was mentioned in the comments, so we went with dumping the database and adding that file to version control.
What we found was that test data was ending up in the dump and was making it quite difficult to figure out what had changed. This could be solved by dumping the schema only, but this was impossible for our projects since our applications depended on certain data in the database to function. Eventually we returned to manually adding changes to the database dump.
Not only was this the simplest solution, but it also solved certain issues that some versions of MySQL have with exporting/importing. Normally we would have to dump the development database, remove any test data, log entries, etc, remove/change certain names where applicable and only then be able to create the production database. By manually adding changes we could control exactly what would end up in production, a little at a time, so that in the end everything was ready and moving to the production environment was as painless as possible.
How about versioning a file generated by doing this:
mysqldump --no-data database > database.sql
Where I work we have an install script for each new version of the app which has the SQL we need to run for the upgrade. This works well enough for 6 devs with some branching for maintenance releases. We're considering moving to Auto Patch (http://autopatch.sourceforge.net/), which handles working out what patches to apply to any database you are upgrading. It looks like there may be some small complications handling branching with Auto Patch, but it doesn't sound like that'll be an issue for you.
I'd guess a batch file like this should do the job (didn't try it though)...
mysqldump --no-data -ufoo -pbar dbname > path/to/app/schema.sql
svn commit path/to/app/schema.sql
Just run the batch file after changing the schema, or let a cron job/scheduler do it (but I don't know... I think commits work if just the timestamps changed, even if the contents are the same; I don't know whether that would be a problem).
The main idea is to have a folder with this structure in your project base path:
/__DB
    /changesets
        /1123
    /data
    /tables
Now, how the whole thing works is that you have 3 folders:
Tables
Holds the table create query. I recommend using the naming “table_name.sql”.
Data
Holds the table insert data query. I recommend using the same naming “table_name.sql”.
Note: Not all tables need a data file, you would only add the ones that need this initial data on project install.
Changesets
This is the main folder you will work with.
This holds the changesets made to the initial structure. It actually contains folders, one per changeset.
For example, I added a folder 1123 which will contain the modifications made in revision 1123 (the number comes from your source code control) and may contain one or more SQL files.
I like to add them grouped by table with the naming xx_tablename.sql - the xx is a number that indicates the order they need to be run in, since sometimes you need the modifications run in a certain order.
Note:
When you modify a table, you also add those modifications to the table and data files, since those are the files that will be used to do a fresh install.
This is the main idea.
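A small script can then apply the changesets in order on an existing installation, something like this sketch (connection details are placeholders; it assumes one statement per .sql file and doesn't track which changesets were already applied, which you would want to add):
$pdo = new PDO('mysql:host=localhost;dbname=myapp', 'user', 'secret');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Changeset folders (e.g. 1123) in ascending order
$changesets = glob(__DIR__ . '/__DB/changesets/*', GLOB_ONLYDIR);
sort($changesets, SORT_NATURAL);

foreach ($changesets as $dir) {
    // Files named xx_tablename.sql, run in their numeric order
    $files = glob($dir . '/*.sql');
    sort($files, SORT_NATURAL);
    foreach ($files as $file) {
        echo 'Applying ' . basename($dir) . '/' . basename($file) . PHP_EOL;
        $pdo->exec(file_get_contents($file));
    }
}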
For more details you could check this blog post.
Take a look at SchemaSync. It will generate the patch and revert scripts (.sql files) needed to migrate and version your database schema over time. It's a command line utility for MySQL that is language and framework independent.
Some months ago I searched for a tool for versioning MySQL schemas. I found many useful tools, like Doctrine migrations, RoR migrations, and some tools written in Java and Python.
But none of them satisfied my requirements.
My requirements:
No requirements other than PHP and MySQL
No schema configuration files, like schema.yml in Doctrine
Able to read the current schema from a connection and create a new migration script that reproduces an identical schema in other installations of the application.
I started writing my own migration tool, and today I have a beta version.
Please try it if you have an interest in this topic.
Please send me feature requests and bug reports.
Source code: bitbucket.org/idler/mmp/src
Overview in English: bitbucket.org/idler/mmp/wiki/Home
Overview in Russian: antonoff.info/development/mysql-migration-with-php-project
Our solution is MySQL Workbench. We regularly reverse-engineer the existing Database into a Model with the appropriate version number. It is then possible to easily perform Diffs between versions as needed. Plus, we get nice EER Diagrams, etc.
At our company we did it this way:
We put all tables / db objects in their own file, like tbl_Foo.sql. The files contain several "parts" that are delimited with
-- part: create
where create is just a descriptive identifier for a given part. The file looks like:
-- part: create
IF not exists ...
CREATE TABLE tbl_Foo ...
-- part: addtimestamp
IF not exists ...
BEGIN
ALTER TABLE ...
END
Then we have an XML file that references every single part that we want executed when we update the database to a new schema.
It looks pretty much like this:
<playlist>
    <classes>
        <class name="table" desc="Table creation" />
        <class name="schema" desc="Table optimization" />
    </classes>
    <dbschema>
        <steps db="a_database">
            <step file="tbl_Foo.sql" part="create" class="table" />
            <step file="tbl_Bar.sql" part="create" class="table" />
        </steps>
        <steps db="a_database">
            <step file="tbl_Foo.sql" part="addtimestamp" class="schema" />
        </steps>
    </dbschema>
</playlist>
The <classes/> part is for the GUI, and <dbschema/> with <steps/> is there to partition the changes. The <step/>s are executed sequentially. We have some other entities, like sqlclr, to do different things like deploying binary files, but that's pretty much it.
Of course we have a component that takes that playlist file and a resource / filesystem object, cross-references the playlist, takes out the wanted parts and then runs them as admin on the database.
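In PHP such a component could be sketched roughly like this (the original is not necessarily PHP; the "-- part:" convention and the playlist layout are taken from the example above, while the connection details and error handling are assumptions):
// Pull a single "-- part: name" section out of a .sql file
function extractPart($sqlFile, $partName)
{
    $pieces = preg_split('/^--\s*part:\s*(\w+)\s*$/m',
        file_get_contents($sqlFile), -1, PREG_SPLIT_DELIM_CAPTURE);
    // $pieces is: preamble, name1, body1, name2, body2, ...
    for ($i = 1; $i < count($pieces); $i += 2) {
        if ($pieces[$i] === $partName) {
            return trim($pieces[$i + 1]);
        }
    }
    throw new RuntimeException("Part '$partName' not found in $sqlFile");
}

// Run every <step/> of the playlist in document order, as admin
$playlist = simplexml_load_file('playlist.xml');
foreach ($playlist->dbschema->steps as $steps) {
    $pdo = new PDO('sqlsrv:Server=localhost;Database=' . (string) $steps['db'], 'admin', 'secret');
    foreach ($steps->step as $step) {
        $pdo->exec(extractPart((string) $step['file'], (string) $step['part']));
    }
}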
Since the "parts" in .sql's are written so they can be executed on any version of DB, we can run all parts on every previous/older version of DB and modify it to be current.
Of course there are some cases where SQL server parses column names "early" and we have to later modify part's to become exec_sqls, but it doesn't happen often.
I think this question deserves a modern answer, so I'm going to give it myself. When I wrote the question in 2009 I don't think Phinx existed yet, and Laravel most definitely didn't.
Today, the answer to this question is very clear: write incremental DB migration scripts, each with an up and a down method, and run all these scripts (or a delta of them) when installing or updating your app. And obviously add the migration scripts to your VCS.
As mentioned in the beginning, there are excellent tools in the PHP world today which help you manage your migrations easily. Laravel has DB migrations built in, including the respective shell commands. Everyone else has a similarly powerful, framework-agnostic solution in Phinx.
Both Artisan migrations (Laravel) and Phinx work the same way: for every change in the DB, create a new migration, use plain SQL or the built-in query builder to write the up and down methods, and run artisan migrate resp. phinx migrate in the console.
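To illustrate, a Phinx migration looks roughly like this (the table and column are made up; phinx migrate applies up(), phinx rollback applies down()):
use Phinx\Migration\AbstractMigration;

class AddEmailToUsers extends AbstractMigration
{
    public function up()
    {
        $this->table('users')
             ->addColumn('email', 'string', array('limit' => 255, 'null' => true))
             ->update();
    }

    public function down()
    {
        $this->table('users')
             ->removeColumn('email')
             ->update();
    }
}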
I do something similar to Manos, except I have a 'master' file (master.sql) that I update with some regularity (once every 2 months). Then, for each change, I build a version-named .sql file with the changes. This way I can start off with master.sql and add each version-named .sql file until I get up to the current version, and I can update clients using the version-named .sql files to keep things simpler.