I've noticed while developing my first site that the smallest changes in database columns, connection options, and various other components cause the website to fail until I correct the problem, which may or may not take a lot of time (woe is me). I'm wondering what steps I can take now to prevent these headaches, as I've delayed the website launch in order to keep upgrading. Once I'm done implementing the changes I would like to see, I know I won't truly be done, but at some point I have to move on to the next stage.
Yes, I know there is probably no one good solution, and ultimately a self-correcting design is more trouble than it's worth at this point. But if any greybeards have tips they could offer based on their own experience with web development, particularly with LAMP stacks, I would greatly appreciate it.
Specifically, I would like to know what to look out for when modifying databases and website code once customer information is in active use, how to prevent errors, and how to roll out the changes.
EDIT 1:
Yes, so the answer seems to be that I need to copy the live site to my testing environment. I'm looking into some of the already suggested development solutions. Regular backups are crucial, but I can just see inserting new columns and modifying queries as a cause of mis-ordered tables and such. "That's where being a good programmer and testing diligently comes in handy," someone in the corner said. As I look into the proposed solutions, I welcome all others in the meantime. Being able to create a real-time copy of the live site on the fly during testing would be nice.
The above answers are all very valid and in the end, they represent your target solution.
In the meantime, you may already do a lot for your website, even with a gradual migration to those practices.
In order to do so, I suggest you install PHPUnit (or whatever xUnit variant exists for the web languages you use). There are also "graphical" versions of it, like VisualPHPUnit, if that's more to your taste.
These tools are not the permanent solution. You should actually aim to add them to your permanent solution, that is, setting up a development server, etc.
However, even as an interim solution they help you reach a fairly stable degree of quality for your software components and avoid 80-90% of the surprises that come with coding on a live server.
You can develop your code in a separate directory and test it before you move it into production. You can create mock objects which your code under test may freely interact with, without fear of repercussions. Your tests may load their own alternate configuration, so they work on a second, throwaway copy of the database.
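For instance, a minimal PHPUnit sketch along those lines could look like this (the class, table, and database names are all made up, and it assumes a recent PHPUnit plus a separate throwaway test database):

    <?php
    // tests/UserRepositoryTest.php -- hypothetical example, not part of any real project.
    require_once __DIR__ . '/../src/UserRepository.php';

    use PHPUnit\Framework\TestCase;

    class UserRepositoryTest extends TestCase
    {
        private $pdo;

        protected function setUp(): void
        {
            // Connect to the copy database, never the live one.
            $this->pdo = new PDO('mysql:host=localhost;dbname=myapp_test', 'test', 'test');
            $this->pdo->exec('TRUNCATE TABLE users');
        }

        public function testFindByEmailReturnsInsertedUser()
        {
            $this->pdo->exec("INSERT INTO users (email, name) VALUES ('a@example.com', 'Alice')");

            $repo = new UserRepository($this->pdo);   // the class under test
            $user = $repo->findByEmail('a@example.com');

            $this->assertSame('Alice', $user['name']);
        }
    }

Run it with something like phpunit tests/ from the project root; if it goes green, you can promote the code from your separate directory into production with a lot more confidence.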
Moving even further, you may include the website itself in your tests. There are several applications, like Selenium, that allow you both to automate and to test your production website, so that you can reliably know that your latest changes did not negatively affect your website's behavior.
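As a rough illustration, with the php-webdriver bindings and a Selenium server listening on port 4444, a smoke test of a login page might look like this (the URL, field names, and expected heading are purely hypothetical):

    <?php
    // selenium_smoke_test.php -- hypothetical example.
    require_once __DIR__ . '/vendor/autoload.php';

    use Facebook\WebDriver\Remote\RemoteWebDriver;
    use Facebook\WebDriver\Remote\DesiredCapabilities;
    use Facebook\WebDriver\WebDriverBy;

    $driver = RemoteWebDriver::create('http://localhost:4444/wd/hub', DesiredCapabilities::firefox());

    $driver->get('https://www.example.com/login');
    $driver->findElement(WebDriverBy::name('email'))->sendKeys('test@example.com');
    $driver->findElement(WebDriverBy::name('password'))->sendKeys('secret');
    $driver->findElement(WebDriverBy::cssSelector('button[type=submit]'))->click();

    // If the page after login does not show the expected heading, something broke.
    $heading = $driver->findElement(WebDriverBy::cssSelector('h1'))->getText();
    echo ($heading === 'Dashboard') ? "OK\n" : "FAILED: got '$heading'\n";

    $driver->quit();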
In short, while you should certainly aim at getting a proper development environment running, you can do something very good even today, with a few hours of study.
Start using some (maybe simplified) sort of release management:
Maintain a development environment. Either locally, or in a second .htaccess-protected folder online. Make it use its own db (see the config sketch after this list).
Test every change in this dev env. Once you are satisfied, move it to the production env.
Use git or svn (svn might be simpler to learn, but opinions vary; check out "Tortoise") to save a snapshot ("commit") of every change you make. That way you can diff through your latest commits if anything goes wrong.
Maintain regular backups.
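For the "own db" part, a minimal sketch of an environment-aware config might look like this (host names, database names, and credentials are all made up):

    <?php
    // config.php -- minimal sketch; include this from every page.
    // Decide which environment we are in. Here it is keyed on the host name,
    // but a constant set in the vhost or .htaccess works just as well.
    $isDev = isset($_SERVER['HTTP_HOST']) && $_SERVER['HTTP_HOST'] === 'dev.example.com';

    if ($isDev) {
        define('DB_DSN',  'mysql:host=localhost;dbname=mysite_dev');
        define('DB_USER', 'dev');
        define('DB_PASS', 'devpass');
    } else {
        define('DB_DSN',  'mysql:host=localhost;dbname=mysite');
        define('DB_USER', 'prod');
        define('DB_PASS', 'prodpass');
    }

    $db = new PDO(DB_DSN, DB_USER, DB_PASS);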
I've checked other threads about this subject, but nothing really answers my exact question. I'm currently developing a site that updates data on a live web server (usually photos) multiple times daily using a cron job and some external data feeds. Additionally, we're about to allow user-submitted data. Now, I want to start using SVN because the project is getting big enough that it's starting to get relatively complicated, and people are making edits to our code on the live server. I've installed SVN for Apache, and all of the subdomains mentioned below are set up on one server using vhosts. We haven't started using this workflow yet, so if I could get some advice before we start it would be greatly appreciated.
Okay, so I've set up a repository under a subdomain (http://svn.website.com) and checked it out to a development subdomain (http://dev.website.com). I also plan to check out a copy to the live environment (http://www.website.com). The idea is to first develop on the dev subdomain, commit when it's ready to go live, and then svn up to the live environment. I've read the pros and cons of exporting versus updating, and updating seems best for this project because we often commit small tweaks/code changes, and exporting seems like overkill. I have also set up the appropriate directives to avoid .svn directory access. Does it sound like I'm on the right track? Anything I should be aware of?
If that's a good way to take care of everything, then we can move on to the second question. The user-submitted data (mainly photos) are being uploaded to the live environment. I have to somehow manage this between the live and dev environments so we can develop with up-to-date user-submitted files. I don't really know the best way to do this... I feel like there's a serious flaw if I were to svn add and svn ci every time a file is submitted. What is the best way to accomplish this?
Sounds like you're growing!
First, I'd question whether SVN is really the best choice. My guess is that you know it already and have worked with it, as have others on your team, and you want to stick with it. That's fine, but just as an FYI, I've used both SVN and Git, and I found Git generally easier to use, and I would argue it is becoming more and more commonly used by application developers.
As for your workflow, I used to go through an almost identical process with a large distributed team and it worked fine. Depending on the amount of traffic you get, I might suggest a different production deployment strategy. When you're ready for a new production push, clone into a new directory, labeled perhaps by date or milestone names, and create a new virtual host for that directory. That way, you get a couple benefits:
You don't run the risk of having someone access your application and requesting a certain file which is in the process of being updated
You can easily revert back to an earlier milestone by simply changing the active production virtual host. Bugs and problems make it out of development regardless of how thoroughly you test, and sometimes they're serious enough to warrant just reverting back to a previous codebase. Of course, schema changes would add complexity to this, but if we're just talking about code this should avoid frantic commit reverts at 2am.
This was suggested by Rasmus Lerdorf at a conference I attended when he was posed nearly the same question you're asking.
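To make the idea concrete, here is a rough sketch of that release flow (the paths and repository URL are made up; it flips a symlink rather than editing the virtual host, which achieves the same switch and rollback):

    <?php
    // deploy.php -- run from the command line; a sketch, not a hardened tool.
    $stamp  = date('Ymd-His');
    $target = "/var/www/releases/$stamp";
    $repo   = 'http://svn.website.com/trunk';

    // Export a clean copy (no .svn folders) into a fresh, dated directory.
    exec('svn export ' . escapeshellarg($repo) . ' ' . escapeshellarg($target), $out, $rc);
    if ($rc !== 0) {
        exit("svn export failed\n");
    }

    // The vhost's document root points at /var/www/current; swap it atomically.
    @unlink('/var/www/current.tmp');               // clean up any stale temp link
    symlink($target, '/var/www/current.tmp');
    rename('/var/www/current.tmp', '/var/www/current');

    echo "Deployed release $stamp\n";
    // Rolling back = pointing /var/www/current at the previous release directory.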
As for ensuring developers are working with the most recent fileset, do you really want to do this? If you're processing a serious quantity of images, that's going to take up a lot of hard disk space. Sure, with a CDN or other mass storage it's no big deal, but do you want your developers to have to download 1000 new photos every time they svn up?
If you're dead set on doing this, I'd suggest adding some code within your app to automatically do an svn add and svn ci when svn status has a certain result (i.e., a script you could call with cron, or every time new media gets added). You could also look into using svn hooks (git has these as well) to do things like notify you that the repo has been updated.
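A cron-able sketch of that auto-commit idea might look like this (the uploads path and commit message are made up; svn add --force skips files that are already versioned):

    <?php
    // commit_uploads.php -- hypothetical cron script, e.g. run every 15 minutes.
    $uploads = '/var/www/www.website.com/uploads';

    $status = shell_exec('svn status ' . escapeshellarg($uploads));
    if (trim((string) $status) === '') {
        exit(0);   // nothing new to commit
    }

    exec('svn add --force ' . escapeshellarg($uploads));
    exec('svn commit -m "Auto-commit of new user uploads" ' . escapeshellarg($uploads));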
Hope this helps. I'd strongly discourage including all of the files users upload unless you have a really, really strong reason for doing so. The complexity that's gonna add is gonna be a pain = )
Good luck and welcome to StackOverflow!
I've been reading this site here and there, and it appears as though you guys have a wonderful community.
As for my background, I am a sophomore at university familiar with SQL, C++, Visual Basic, and some PHP. One of my school projects for the summer term involves building a web application that allows users to log in and schedule specific timeslots over the internet. Typically, I have been the only person working on a project, but in this case I will be part of a group. Since we're all relatively new to working as a team, I would like to set up source control for my group so we're not all working off a shared drive somewhere. Additionally, I would like to make sure that all of us are able to test our changes in some sort of development server that hosts an instance of our website.
My actual question is in regards to the toolset that we should use to achieve this. As a group, we are most familiar with PHP and MySQL so we'll end up using that for the code and database. I have used SVN in the past for my own personal use, but my group members aren't very familiar with source control. We'll probably stick with something simple like Excel for the project management and bug tracking side of things. Ideally, we would like the tools to be free and open source.
How as a group should we manage the construction of the actual application? Are there methods out there that I can use that will allow any one of us to move the files to our development machine and keep track of who did it so we don't end up overwriting each other's changes? If this is not possible, one of us will write some scripts to handle it - but I would like to avoid building basically a separate software application that will only be used to manage our project. Another issue I foresee will be updating the database running on the development machine. Are there any standardised methods that we can use to manage our SQL scripts among the four of us?
I do not expect a really long winded answer here (after all, this is our project!), but any helpful tips would be greatly appreciated. Once I return from holiday I am looking forward to getting started! Thanks!
I recommend your group use source control to synchronize your code. You can either set up your own server or just use a free provider such as GitHub, Google Code, or Bitbucket.
If you do decide to use one of these sites, a nice feature is that they provide free issue tracking as well, so you can use that instead of Excel.
The best way to manage the SQL scripts is to break them out into separate files and place them under source control as well. You can either create .sql files or use a tool to manage these changes - for example, have a look at Ruby on Rails' Migrations. This may take some effort to set up, but you'll thank yourself later if you are working on a project of any size...
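If a full migrations framework is overkill, even a tiny home-grown runner over numbered .sql files gets you most of the benefit. A sketch (the schema_version table, file layout, and credentials are assumptions, and it keeps things simple by assuming one statement per file):

    <?php
    // migrate.php -- apply migrations/NNN_*.sql files that haven't been run yet.
    $pdo = new PDO('mysql:host=localhost;dbname=scheduler_dev', 'dev', 'dev');
    $pdo->exec('CREATE TABLE IF NOT EXISTS schema_version (version INT NOT NULL)');

    $current = (int) $pdo->query('SELECT MAX(version) FROM schema_version')->fetchColumn();

    foreach (glob(__DIR__ . '/migrations/*.sql') as $file) {
        $version = (int) basename($file);   // leading digits of e.g. 002_add_timeslots.sql
        if ($version > $current) {
            $pdo->exec(file_get_contents($file));
            $pdo->exec("INSERT INTO schema_version (version) VALUES ($version)");
            echo "Applied " . basename($file) . "\n";
        }
    }

Everyone runs the same script against their own dev database after updating from source control, so the four of you stay in sync.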
Draw up a plan for how you would do it if it were just you.
Split the plan up into tasks that take around 3-4 hours to complete. Make sure each task has a measurable objective.
Divvy out the tasks. Try to sort them, if possible, to maximize developer efficiency.
Teach them to use source control. Explain to them that they will use this (maybe not svn, but SOMETHING) in a few years, so they might as well learn how now. Additionally, this will help in every group project they do down the road.
Make a script for building and running your tests. Also script your deployment. This ensures you use the same mechanism going to live as you do going to test, which increases the number of defects found in testing (as opposed to letting them exist but go unfound until production).
You mentioned updating the development database. It would be entirely reasonable to dump the development database often and refresh it from live. You may want to make 3 environments: development, staging, and production. The development database would contain fabricated test data, the staging database would have a copy of live (recent to within a few days, maybe), and of course live is live. A rough refresh sketch follows this list.
Excel works fine as a "bug database." Consider putting it in source control too, so you edit it locally and commit the changes. This will give you a good idea of what happened over time, and you can correct mistakes quicker.
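For the staging-refresh point above, a rough sketch might be (database names and credentials are made up):

    <?php
    // refresh_staging.php -- copy the live database into staging.
    $dumpFile = '/tmp/live.sql';

    exec('mysqldump -u backup -pSECRET live_db > ' . escapeshellarg($dumpFile), $out, $rc);
    if ($rc !== 0) {
        exit("mysqldump failed\n");
    }

    // Load the dump into staging. Development keeps its fabricated test data,
    // and this must never, ever be pointed at the live database itself.
    exec('mysql -u backup -pSECRET staging_db < ' . escapeshellarg($dumpFile), $out, $rc);
    echo $rc === 0 ? "Staging refreshed from live\n" : "Import failed\n";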
As far as source/version control goes, I would recommend Subversion. There are some GUI tools they might use, or even WebDAV to access the SVN repository. This will allow users to edit files collaboratively and also give you details as to who edited what, when, and why... SVN will also do a pretty good job of merging files that happen to be saved at the same time.
It's not the easiest concept to wrap your head around, but it's not very complicated once you get running.
I suggest having everyone read the first chapter from http://svnbook.red-bean.com/en/1.5/ and they should have a good idea of what's happening.
I am also curious to see what people have to say about the database.
How as a group should we manage the construction of the actual application? Are there methods out there that I can use that will allow any one of us to move the files to our development machine and keep track of who did it so we don't end up overwriting each other's changes?
It sounds like you're looking for build management. In the case of PHP, a true "build" is as simple as a collection of source files because the language is interpreted; there is no compilation.
It just so happens that I am one of the developers for BuildMaster, a tool which basically solves every problem you have listed in your question... and it also sounds like it would be free in your case under the Community Edition license. I'll try to address some of your individual pain points and how BuildMaster could be used as a solution.
Source Control
As suggested by others, you must use it. The trick when it comes to deployment is to set up some form of continuous integration so that every time someone checks in, a new "build" is created. In BuildMaster, you can set this up for any source control provider you want.
Issue/Bug Tracking
Excel will work, but it's not an optimal solution. There are plenty of free issue tracking tools you can use to manage your bugs and features. With BuildMaster, you can link your bugs and features list with the application by their release number so you could view them within the tool at any time. It can also modify issue statuses and add descriptions automatically if you want.
Deployments
Using BuildMaster, you can create automated deployment plans for your development environment, e.g.:
Get Latest Source Code
Create Artifact
Copy Files To Development Machine
Deploy Configuration Files
Update Database
The best part is, once you set these up for other environments (glowcoder's point #6), pushing all of your code and database updates is as simple as clicking a button.
Another issue I foresee will be updating the database running on the development machine. Are there any standardised methods that we can use to manage our SQL scripts among the four of us?
Database Updates
Not surprisingly, BuildMaster handles these as well, using its change scripts module. When a member of your team creates a script (e.g. ALTER TABLE [SomeTable] ADD [Blah] INT NOT NULL), he can upload it into BuildMaster, then run it on any environment you have created.
The best part is that you can add a step in your automated deployment and never worry about it again. As Justin mentions, you can use .sql files for your object code (stored procedures, views, triggers, etc.) and have those executed on every build since they are essentially code anyway. You can keep those in source control.
Configuration Files
One aspect of all this you may have neglected (but will inevitably run into) is dealing with configuration files. With PHP, you may have an .htaccess file, a php.ini file, a prepend.php, or your own custom config file. Since, by definition, configuration files need to change between your personal machine and the development machine, grabbing them from source control wouldn't necessarily work without some bit of hacking a la:
    // DEV and PROD would be constants defined differently on each machine
    if (DEV) {
        // do one thing, e.g. point at the local test database
    }
    else if (PROD) {
        // do another, e.g. use the production credentials
    }
With BuildMaster, you can templatize your configuration files and associate them with an environment so they can be deployed automatically. It will also maintain a history of changes for you.
Automated Testing
If you want the full ALM effect, you can automatically unit test your code during an automated build and be notified if anything fails, so you know as soon as possible that something is broken.
Apologies for the "long winded" response, but I feel you're already ahead of the game by anticipating the problems you might run into, and I really believe BuildMaster will make all of this deployment stuff simple for your team so you can focus on the fun part: coding!
Unfortunately, I've never had a senior developer or mentor to show me some of the best practices. So I develop sites (PHP/MySQL) on my Windows machine with WAMP, I test in hidden (password-restricted) folders on the production server, and finally I move the sites to the production folder.
I would like to have a more fluid/practical/error-proof setup, so that from development > test > production there are no hiccups.
The important points/questions are (you probably can come up with a lot more):
Ease of use
Easy to dev/test modifications after the site is live (to avoid tests on the production site)
No server differences between local/test/prod (error reporting, Apache settings, etc.)
Avoid problems with DB differences (e.g. if columns were added, how do you add them to the prod DB?)
Do you skip the test environment, or do you do dev AND test on the same one?
etc...
How do you guys develop PHP/MySQL websites?
Do you use SVN? Do you use IDEs? Do you use VMs?
Thanks.
This is a fairly frequent kind of question - which is why most seasoned devs do not reply to it - and it generally ends up in a flame war of heated opinions. So, be careful about that.
But you seem to be a nice guy intent on getting onto the right path and seeking some really productive practices. And I recognize a little of myself in this from a few years ago.
OK, the first thing to keep in mind is: do not blindly follow anyone on anything. Anyone can claim to be a great master, but you can find at least 10,000 guys far better and completely anonymous. So, for anything you hear about, do the following: listen, test, and draw your own conclusions. If there is just one golden rule, this is it. Everything else is crap until your own conclusions appear. You are your final judge.
That said, let me begin with one of the most common questions: the IDE. Which one should you use? You should use the one that makes you most productive and most comfortable. Netbeans, Eclipse, VIM, Notepad++, Notepad, gedit, Kate, Quanta Plus... You have many options, and each person has their own opinion. Test whatever you find interesting and go ahead with the one you choose.
This is also true for any methodology, framework, or tool. Use it, learn it, and get critical about it. Stick with the one that makes you most comfortable and productive.
Same thing for the development environment. It does not matter that much whether you develop on Windows, Mac, or Linux. The important thing is to have the resources you need available, and those resources can and generally do change from one project to another.
So the best environment for developing a given project is one that reflects the real environment where production will run. What if you develop with PHP 5.3 OOP features and in the end you have to deploy on PHP 5.1? That's the point. The production environment is what tells you the best environment to develop in, not the other way around.
For testing, you should map out a strategy. I say this as someone who spent 5 years as a Test Team Lead inside IBM. There are a LOT of tests you can perform, but not all of them are really relevant to the current project.
First decide, according to the project's needs, what you are going to test: security, performance, UI display, UI effects, error handling, load and balancing, usability, accessibility...
Take notes on what you are going to test (what, when, where, success criteria), and make a report of successes and failures.
As I said before, the project's needs are what guide you at every step, and testing is no different. If you just need to check the display in different browsers, feel free to use different machines or VMs.
Generally this is sufficient. But if the project requires performance or load testing, then you will need specific load-testing software. I will not go deep into this subject, as it is very extensive.
It takes some time to find the ideal match of process and tools, and even after you achieve that, you will always discover a new tool to try or a process tweak that saves you a little time. That's IT.
Here are my recommendations:
Have a dev environment purely for development. Keep a staging and/or a live environment based on the resources you have at your disposal. The staging environment is where you test and ensure there are no serious issues with your application; the live environment is basically your production setup. In fact, staging and live should ALWAYS be the same. This is useful for reproducing issues on staging and doing a bit of troubleshooting without modifying the code. Bear in mind, this also holds true for any associated databases.
Use SVN or some form of version control. This way you will have the ability to fall back to any stable version of the application if someday the world falls apart!
If you are using Linux environments, you can write simple scripts to synchronize a setup with your latest (STABLE) development environment. Ideally, you do your development and run unit tests to ensure everything works as designed. Run a script, and the staging environment is updated with the latest codebase. Conduct functional tests on staging and ensure that everything works as per the specs. Run another script, and your latest changes are moved into the live/production environment.
My development process is still a bit rough, and I am looking forward to the answers also.
What I do for large projects is set up a git repo on my Linux desktop and my Windows desktop. I test locally if possible. As components are finished, I push my changes to the centrally hosted git repo (usually a private GitHub account), or pull them to dev (I set up dev as a repo and pull over SSH). All MySQL updates are stored in update files, and I use NetBeans for development (although I have used Eclipse and others, NetBeans just works for me).
I think you hit on all the important points. Personally:
I run the same OS and server software on the production server and my development system, with the same versions of PHP, Python, MySQL, Django, etc.
I don't change DB structure often. I set up the DB tables on the dev system, then use mysqldump to produce the table creation SQL and install it on the server with mysql < name_of_sql.file (a small schema-dump sketch follows this list). When I do make changes, I back up the DB and then just do it through the command-line interface. For PHP, I use Doctrine, just for the table structure/migration support.
I write everything in Kate (Linux), Komodo Editor (Mac), or Notepad++ (Windows). I don't like IDEs very much; I much prefer to-the-point text editors.
I upload files to a staging dir, and check them with diff before copying them to the actual location.
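The dump/restore step mentioned above is small enough to script. A sketch (database names and credentials are made up; --no-data keeps only the CREATE TABLE statements):

    <?php
    // dump_schema.php -- export the dev schema so it can be loaded on the server.
    $file = 'schema-' . date('Ymd') . '.sql';

    exec('mysqldump --no-data -u dev -pDEVPASS myapp_dev > ' . escapeshellarg($file), $out, $rc);
    echo $rc === 0 ? "Wrote $file\n" : "Dump failed\n";

    // On the server, load it with something like:  mysql myapp < schema-YYYYMMDD.sql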
This isn't the most sophisticated setup, but it's worked quite well for me. I'm the only dev working on the projects I'm involved with, which probably explains a lot. The most important part, which isn't really just personal preference, is the first one: match your dev/test system as closely to the production system as possible, including the OS.
Our relatively small development team is getting a bit sick of Dreamweaver. The only functionality we're reliant on is its file check-in system. As the team is likely to grow over the next few months, we need to address these issues.
Subversion has come to our attention, but we are unsure if it will suit our requirements.
All we need is to be made aware of whether or not someone is working on a particular file before we request it from the server, and to block any overwriting of a checked-out file.
Any recommendations or advice on general best development practices would be much appreciated.
Thanks in advance.
The way this is generally done with SVN is not to "block" people the way you ask for: on the contrary, SVN allows several developers to work on the same file, and the modifications each one made are then "merged".
This merge is mostly automatic; but when it is "too complex" (like when two developers modified the same portion of a file), a human being has to resolve the "conflicts", indicating which modifications from which developer have to be kept.
This way of working with merges, instead of locks, seems a bit unusual at first, but once you get it (you'll need to take some time to explain it to your team, and how to use it efficiently, of course), it works really nicely: I've used SVN on projects with more than 10 developers with absolutely no problems (a few conflicts once in a while, but you resolve them and that's it).
On the other side, locking files so only one developer can work on them can block the whole team: what if a file is locked by a guy who then goes for a coffee break? And at that same moment, someone else needs to modify the same file to be able to work?
For more information about SVN, you can take a look at the online SVN book. It gives lots of useful information :-) (you will probably not need all of it, but a quick look can do no harm ^^)
As a side note, if you are developing a big application in PHP, an IDE like Eclipse PDT can help a lot; and there are plugins, like Subversive, that can be used to integrate SVN access into Eclipse.
There are a number of version control solutions you can look into. Are you sure you want to restrict people from working on similar files? Users can fork the development branch and have their own; when they check in, any conflicts will need to be merged together. Some version control systems come with tools to handle most of the conflicts for you.
It's not always a good idea to lock files that another developer may need to modify. One problem you can run into is if the person with the file checked out is out of the office and their machine is inaccessible.
If that is functionality that you MUST have, most version control systems will allow you to configure your branch to work that way.
A few of the version control systems out there are: CVS, SVN, Git, SourceSafe, and ClearCase.
If you decide to use SVN with Dreamweaver, there's an extension, Subweaver, that integrates some of the commands into the IDE.
Subversion is OK for our team. We're trying to decide if we should lock files or use the default copy-modify-merge model. Could you give me some examples of when you would have multiple developers on the same file? It sounds a little bizarre, as I can't see why it can't be a one-man job.
It is pretty standard practice now for desktop applications to be self-updating. On the Mac, every non-Apple program that uses Sparkle is, in my book, an instant win. For Windows developers, this has already been discussed at length. I have not yet found information on self-updating web applications, and I hope you can help.
I am building a web application that is meant to be installed like Wordpress or Drupal - unzip it in a directory, hit some install page, and it's ready to go. In order to have broad server compatibility, I've been asked to use PHP and MySQL -- is that **MP? In any event, it has to be broadly cross-platform. For context, this is basically a unified web messaging application for small businesses. It's not another CMS platform, think webmail.
I want to know about self-updating web applications. First of all, (1) is this a bad idea? As of Wordpress 2.7 the automatic update is a single button, which seems easy, and yet I can imagine so many ways this could go terribly, terribly wrong. Also, isn't the idea that the web files are writable by the web process a security hole?
(2) Is it worth the development time? There are probably millions of WP installs in the world, so it's probably worth the time it took the WP team to make it easy, saving millions of man hours worldwide. I can only imagine a few thousand installs of my software -- is building self-upgrade worth the time investment, or can I assume that users sophisticated enough to download and install web software in the first place could go through an upgrade checklist?
If it's not a security disaster or waste of time, then (3) I'm looking for suggestions from anyone who has done it before. Do you keep a version table in your database? How do you manage DB upgrades? What method do you use for rolling back a partial upgrade in the context of a self-updating web application? Did using an ORM layer make it easier or harder? Do you keep a delta of version changes or do you just blow out the whole thing every time?
I appreciate your thoughts on this.
Frankly, it really does depend on your userbase. There are tons of PHP applications that don't automatically upgrade themselves. Their users are either technical enough to handle the upgrade process, or just don't upgrade.
I propose two steps:
1) Seriously ask yourself what your users are likely to really need. Will self-updating provide enough of a boost to adoption to justify the additional work? If you're confident the answer is yes, just do it.
Since you're asking here, I'd guess that you don't know yet. In that case, I propose step 2:
2) Release version 1.0 without the feature. Wait for user feedback. Your users may immediately cry for a simpler upgrade process, in which case you should prioritize it. Alternately, you may find that your users are much more concerned with some other feature.
Guessing at what your users want without asking them is a good way to waste a lot of development time on things people don't actually need.
I've been thinking about this lately in regards to database schema changes. At the moment I'm digging into WordPress to see how they've handled database changes between revisions. Here's what I've found so far:
$wp_db_version is loaded from wp-includes/version.php. This variable corresponds to a Subversion revision number, and is updated when wp-admin/includes/schema.php is changed. (Possibly through a hook? I'm not sure.) When wp-admin/admin.php is loaded, the WordPress option named db_version is read from the database. If this number is not equal to $wp_db_version, wp-admin/upgrade.php is loaded.
wp-admin/includes/upgrade.php includes a function called dbDelta(). dbDelta() scans $wp_queries (a string of SQL queries that will create the most recent database schema from scratch) and compares it to the schema in the database, altering the tables as necessary so that the schema is brought up-to-date.
upgrade.php then runs a function called upgrade_all(), which runs specific upgrade_NNN() functions if $wp_db_version is less than the target values. (E.g. upgrade_250(), the WordPress 2.5.0 upgrade, will be run if the database version is less than 7499.) Each of these functions runs its own data migration and population procedures, some of which are also called during the initial database setup script. This nicely cuts down on duplicate code.
So, that's one way to do it.
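A stripped-down sketch of the same pattern, outside WordPress, might look like this (the options table, version numbers, and upgrade functions are all hypothetical):

    <?php
    // upgrade.php -- run when the code's schema version is ahead of the database's.
    $app_db_version = 3;   // bump this constant whenever the schema changes

    $pdo = new PDO('mysql:host=localhost;dbname=myapp', 'app', 'secret');
    $installed = (int) $pdo->query(
        "SELECT value FROM options WHERE name = 'db_version'"
    )->fetchColumn();

    if ($installed < 2) { upgrade_002($pdo); }   // e.g. add a column
    if ($installed < 3) { upgrade_003($pdo); }   // e.g. backfill the new column

    if ($installed < $app_db_version) {
        $stmt = $pdo->prepare("UPDATE options SET value = ? WHERE name = 'db_version'");
        $stmt->execute(array($app_db_version));
    }

    function upgrade_002(PDO $pdo) {
        $pdo->exec('ALTER TABLE messages ADD archived TINYINT NOT NULL DEFAULT 0');
    }

    function upgrade_003(PDO $pdo) {
        $pdo->exec("UPDATE messages SET archived = 1 WHERE created_at < '2010-01-01'");
    }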
Yes, it would be a security hole if PHP went and overwrote its files from some place on the internet with no warning. There's no guarantee that the server is connecting to the genuine update server (it might download code crafted by someone else if DNS poisoning occurred), giving someone else access to your clients' data. Therefore digital signing would be important.
The user could control updates by setting permissions on the web directory so that PHP only has read access to the files - this procedure could simply be documented with your program.
One question remains (I really don't know the answer to): can PHP overwrite files if it's currently using them (e.g. if the update.php file itself needed to be updated)? Worth testing.
I suppose you've already ruled this out, but you could host it as a service. (Think wordpress.com)
I'd suggest that you package your application with PEAR and set up a channel. Your users can then upgrade the application through a standard interface (pear). It's not entirely automatic (unless the users have some kind of automation running on top of PEAR), but it's standard, so any sysadmin can maintain it.
I think your best option is an update-checking mechanism that alerts the administrator when there are updates.
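A minimal sketch of such a check, assuming a hypothetical URL that serves the latest version number as plain text:

    <?php
    // check_updates.php -- call this from the admin dashboard or a daily cron.
    define('APP_VERSION', '1.2.3');

    $latest = trim((string) @file_get_contents('https://updates.example.com/latest-version.txt'));

    if ($latest !== '' && version_compare($latest, APP_VERSION, '>')) {
        // Flag it in the admin UI, or send a mail -- but let a human run the upgrade.
        mail('admin@example.com', 'Update available',
             "Version $latest is available; this install is running " . APP_VERSION . '.');
    }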
As you mention, there are a number of potential security problems. Due to those alone, I would suggest not doing this. Instead, try creating a fairly smart upgrading script.
Just my 2 cents: I'd consider an automatically self-updating application within my CMS a security hole, so if you decide to code this feature, you should consider implementing different levels of this behavior:
Automatically update
Check for updates and notify
Disable