Wordpress onstart(?) Adding unexpected crash handling

I'm new to WordPress, and I'm building some backend logic to it.
I want the admin to have as smooth experience with it as possible.
I want for him to be able to run the website with "a click of a button".
I'm used to Java and nodeJS environments, where I have life cycle,
where I can specify logic to happen when the server starts, but I'm having trouble understanding how it's done in WordPress (or PHP, for that matter).
I want the website to check the database and see if it has the tables it needs to function, and if not, to create them and fill them with relevant data, as well as to check whether the database is up to date (in case of a long crash)
and update it if necessary.
Right now I'm thinking about running a Cron script to check it, every few minutes but it's heavy on resources. A better solution might be to run it on the first interaction with a user, but it seems not ideal, as it will slow him down.
Is there a life cycle in WordPress?
Should I be worried about it crashing during an important operation and then starting on its own?
Can I specify logic for it to run on its boot/restart?

I am not completely sure what you have in mind when you say "fill them with relevant data", but I would recommend either leveraging WordPress' own logic or completing your task with other tools.
Just like any PHP script, WordPress is stateless and has index.php as its starting point. Files are then loaded in order, and where you end up depends on the contents of your request and the environment.
This is just how PHP works. The key difference from a Node.js or Java server is that there is no long-running application process: PHP is a set of server-side scripts that run from the top for each request and produce some sort of output that is sent back to your browser, just like when you call a REST API.
You might want to take a look at the following things:
wp-load.php: the file that looks for the constants defined in wp-config.php. When that file is not yet present, it redirects you to the "famous" WordPress site setup (after loading a bunch of stuff related to database connections and request data). You could follow the logic, but I would advise against that: the WordPress core is very old, it shows what PHP applications from the early 2000s used to look like, and it will most likely cause headaches.
Existing tools
Not only at the server level, but also things like WP-CLI or maybe a Composer-based solution like roots/bedrock or even roots/wordpress.
To answer your question about lifecycles directly
Yes, WordPress offers an old-timey hook system, but this is just during the request lifecycle for an active install, so this wouldn't be exactly what you seem to be looking for.
Finally, it is good to have some understanding of the internal workings of WordPress, but the whole reason that WordPress is easy to run and compatible with many setups, is just because they "strive for forever backwards compatibility" (which is also why they don't use semantic versioning). Which in turn means that the core is very outdated and unreadable, so I wouldn't bother trying to figure it out yourself.
Even more so, I wouldn't want you to think that this is a fair representation of the PHP world: since the initial release of WordPress the language has evolved completely, and most of the key components that make it a nice developer experience were still a long way off back then.
In short, I'd look for existing solutions which are built for your specific server setup and if that is not possible for some reason, try to find some sort of CLI tool in php, or other languages.
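For the narrower "create the tables if they are missing, then keep them up to date" requirement, one common pattern inside the request lifecycle is to store a schema version in the options table and compare it on an early hook. The following is only a minimal sketch of that idea, not the approach recommended above. The `myplugin_` prefix, the option name, and the table with its columns are hypothetical; `get_option()`, `update_option()`, `dbDelta()` and `add_action()` are WordPress functions, and the hook is only registered when WordPress is actually loaded.

```php
<?php
// Hypothetical schema version for this sketch; bump it whenever the schema changes.
const MYPLUGIN_DB_VERSION = 3;

// Pure helper: does the stored version lag behind the version the code expects?
function myplugin_needs_upgrade($installed, int $target = MYPLUGIN_DB_VERSION): bool
{
    return (int) $installed < $target;
}

// Runs on every request, but does real work only when the versions differ.
function myplugin_maybe_upgrade(): void
{
    global $wpdb;

    $installed = get_option('myplugin_db_version', 0); // hypothetical option name
    if (!myplugin_needs_upgrade($installed)) {
        return;
    }

    require_once ABSPATH . 'wp-admin/includes/upgrade.php';

    $table   = $wpdb->prefix . 'myplugin_items'; // hypothetical table
    $charset = $wpdb->get_charset_collate();

    // dbDelta() creates the table if it is missing and alters it if it differs.
    dbDelta("CREATE TABLE $table (
        id bigint(20) unsigned NOT NULL AUTO_INCREMENT,
        name varchar(191) NOT NULL,
        PRIMARY KEY  (id)
    ) $charset;");

    update_option('myplugin_db_version', MYPLUGIN_DB_VERSION);
}

// Register the check only inside WordPress, so this file is harmless on its own.
if (function_exists('add_action')) {
    add_action('plugins_loaded', 'myplugin_maybe_upgrade');
}
```

Because the expensive part runs only when the stored version is behind, the usual cost per request is a single option lookup, which avoids both the cron job and a slow first visit.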

Related

eval Remote Code From My Server

I am building a site platform similar to Wordpress that allows my users to download a .zip file, upload it onto their server, and be good to go.
I know everyone says eval() is evil - but the code will not include any user or variable input.
The benefit here is that updates will occur automatically. I can just change the code being grabbed on my server.
My clients using the code will have pretty low traffic sites - so I'm not worried about overloading their server. Most of the heavy lifting will be done by us.
Here's the basic code concept:
$code=file_get_contents("http://myserver.com/code.txt");
eval($code);
Is this a realistic option? What security holes do I need to worry about?

It's "realistic" in the sense that it will work, but at the same time it sounds like a sysadmin's nightmare. If you are meaning to have a client download and execute remote code every time a request is made, your clients are at your whim if the master server goes down or is unreachable at any point. It's now a mission-critical service you'll have to keep running forever for as long as your clients need it.
You list automatic updates as a benefit, but is it? In nearly every software platform, the features users depend on can change over time; function signatures can change, or functionality may be dropped entirely in favour of a more refined alternative. Since it sounds like you're writing some form of framework, can you guarantee that future versions will always be backwards-compatible? Not everyone is using the cutting-edge version of a piece of software in production for a reason -- they want what they are using to be stable. If an upgraded version of your platform rolls out overnight, and it breaks some custom code written by the client (at least one of them will try doing this, even if you don't want them to) or even, old, standard functionality that was deprecated but still worked with the previous release, how are they going to roll it back to a version that works?
It just sounds like something that will eventually incur a ton of technical debt.

How do I manage development and deployment of my website as part of a group?

I've been reading this site here and there and appears as though you guys have a wonderful community.
As for my background, I am a sophomore at university familiar with SQL, C++, Visual Basic, and some PHP. One of my school projects for the summer term involves building a web application that allows users to log in and schedule specific timeslots over the internet. Typically, I have been the only person working on a project, but in this case I will be part of a group. Since we're all relatively new to working as a team, I would like to set up source control for my group so we're not all working off a shared drive somewhere. Additionally, I would like to make sure that all of us are able to test our changes in some sort of development server that hosts an instance of our website.
My actual question is in regards to the toolset that we should use to achieve this. As a group, we are most familiar with PHP and MySQL so we'll end up using that for the code and database. I have used SVN in the past for my own personal use, but my group members aren't very familiar with source control. We'll probably stick with something simple like Excel for the project management and bug tracking side of things. Ideally, we would like the tools to be free and open source.
How as a group should we manage the construction of the actual application? Are there methods out there that I can use that will allow any one of us to move the files to our development machine and keep track of who did it so we don't end up overwriting each other's changes? If this is not possible, one of us will write some scripts to handle it - but I would like to avoid building basically a separate software application that will only be used to manage our project. Another issue I foresee will be updating the database running on the development machine. Are there any standardised methods that we can use to manage our SQL scripts among the four of us?
I do not expect a really long winded answer here (after all, this is our project!), but any helpful tips would be greatly appreciated. Once I return from holiday I am looking forward to getting started! Thanks!

I recommend your group use source control to synchronize your code. You can either set up your own server or just use a free provider such as GitHub, Google Code, or Bitbucket.
If you do decide to use one of these sites, a nice feature is that they provide free issue tracking as well, so you can use that instead of Excel.
The best way to manage the SQL scripts is to break them out into separate files and place them under source control as well. You can either create .sql files, or use a tool to manage these changes - for example, have a look at Ruby on Rails' Migrations. This may take some effort to set up, but you'll thank yourself later if you are working on a project of any size.
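If you go the plain .sql route, the bookkeeping can stay very small. As a rough sketch, assuming each script gets a numbered file name and the names of scripts that have already been run are recorded somewhere (for instance in a table in the database), working out what still has to run is a set difference plus a sort. The function name and the file names below are made up for illustration.

```php
<?php
/**
 * Return the migration files that have not been applied yet, in run order.
 *
 * @param string[] $available File names found in the migrations directory.
 * @param string[] $applied   File names already recorded as applied.
 * @return string[]
 */
function pendingMigrations(array $available, array $applied): array
{
    $pending = array_values(array_diff($available, $applied));
    // Natural sort so that 2_... runs before 10_...
    sort($pending, SORT_NATURAL);
    return $pending;
}

// Hypothetical file names, deliberately out of order.
$available = ['002_add_timeslots.sql', '001_create_users.sql', '010_add_index.sql'];
$applied   = ['001_create_users.sql'];

foreach (pendingMigrations($available, $applied) as $file) {
    echo "would run: $file\n";
}
```

Each team member then runs the same pending list against their own database after pulling, so nobody has to remember which scripts they have already applied.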

Draw up a plan for how you would do it if it were just you.
Split the plan up into tasks that take around 3-4 hours to complete. Make sure each task has a measurable objective.
Divvy out the tasks. Try to sort them if possible to maximize developer efficiency.
Teach them to use source control. Explain to them that they will use this (maybe not svn, but SOMETHING) in a few years, so they might as well learn how now. Additionally, this will help in every group project they do down the road.
Make a script for building and running your tests. Also script your deployment. This will ensure you have the same mechanism going to live as you do going to test, which increases the number of defects found in testing. (This is as opposed to letting them exist but not found in testing.)
You mentioned updating the development database. It would be entirely reasonable to dump the development database often with a refresh from live. You may want to make 3 environments. Development, staging, and production. The development database would contain fabricated test data. The staging database would have a copy of live (recent to within a few days maybe.) And of course live is live.
Excel works fine as a "bug database." Consider putting it in source control that you manipulate and commit. This will give you a good idea of what happened over time, and you can correct mistakes quicker.

As far as source/version control goes, I would recommend Subversion. There are some GUI tools they might use, or even WebDAV to access the SVN repository. This will allow users to edit files collaboratively and also give you details as to who edited what, when, and why... SVN will also do a pretty good job of merging files that happen to be saved at the same time.
It's not the easiest concept to wrap your head around, but it's not very complicated once you get running.
I suggest having everyone read the first chapter from: http://svnbook.red-bean.com/en/1.5/
and they should have a good idea of what's happening.
I am also curious to see what people have to say about the database

How as a group should we manage the construction of the actual application? Are there methods out there that I can use that will allow any one of us to move the files to our development machine and keep track of who did it so we don't end up overwriting each other's changes?
It sounds like you're looking for build management. In the case of PHP, a true "build" is as simple as a collection of source files because the language is interpreted; there is no compilation.
It just so happens that I am one of the developers for BuildMaster, a tool which basically solves every problem you have listed in your question... and it also sounds like it would be free in your case under the Community Edition license. I'll try to address some of your individual pain points and how BuildMaster could be used as a solution.
Source Control
As suggested by others, you must use it. The trick when it comes to deployment is to set up some form of continuous integration so that every time someone checks in, a new "build" is created. In BuildMaster, you can set this up for any source control provider you want.
Issue/Bug Tracking
Excel will work, but it's not an optimal solution. There are plenty of free issue tracking tools you can use to manage your bugs and features. With BuildMaster, you can link your bugs and features list with the application by their release number so you could view them within the tool at any time. It can also modify issue statuses and add descriptions automatically if you want.
Deployments
Using BuildMaster, you can create automated deployment plans for your development environment, e.g.:
Get Latest Source Code
Create Artifact
Copy Files To Development Machine
Deploy Configuration Files
Update Database
The best part is, once you set these up for other environments (glowcoder's point #6), pushing all of your code and database updates is as simple as clicking a button.
Another issue I foresee will be updating the database running on the development machine. Are there any standardised methods that we can use to manage our SQL scripts among the four of us?
Database Updates
Not surprisingly, BuildMaster handles these as well by using the change scripts module. When a member of your team creates a script (e.g. ALTER TABLE ADD [Blah] INT NOT NULL) he can upload it into BuildMaster, then run it on any environment you have created.
The best part is that you can add a step in your automated deployment and never worry about it again. As Justin mentions, you can use .sql files for your object code (stored procedures, views, triggers, etc.) and have those executed on every build since they are essentially code anyway. You can keep those in source control.
Configuration Files
One aspect of all this you may have neglected (but will inevitably run into) is dealing with configuration files. With PHP, you may have an .htaccess file, a php.ini file, a prepend.php, or roll your own custom config file. Since by definition configuration files need to change between your personal machine and the development machine, grabbing them from source control wouldn't necessarily work without some bit of hacking a la:
if (DEV) {
// do one thing
}
else if (PROD) {
// do another
}
With BuildMaster, you can templatize your configuration files and associate them with an environment so they can be deployed automatically. It will also maintain a history of changes for you.
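Without a deployment tool, a similar separation can be approximated in plain PHP by keeping shared defaults and per-environment overrides in arrays and merging them in one place, rather than scattering if/else checks through the code. This is only a sketch under assumptions: the function name, the environment names and the settings are invented for illustration.

```php
<?php
/**
 * Build the effective configuration for one environment.
 * Keys in the environment-specific array override the shared defaults.
 */
function configFor(string $env, array $defaults, array $perEnvironment): array
{
    if (!isset($perEnvironment[$env])) {
        throw new InvalidArgumentException("Unknown environment: $env");
    }
    return array_replace_recursive($defaults, $perEnvironment[$env]);
}

// Hypothetical settings shared by every environment.
$defaults = [
    'db'    => ['host' => 'localhost', 'name' => 'scheduler'],
    'debug' => false,
];

// Hypothetical overrides; only the values that differ are listed.
$perEnvironment = [
    'development' => ['debug' => true],
    'production'  => ['db' => ['host' => 'db.internal.example']],
];

// In practice the name would come from outside source control,
// for example an environment variable set once per machine.
$config = configFor('development', $defaults, $perEnvironment);
```

The defaults and overrides can live in source control; only the choice of environment has to differ between machines.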
Automated Testing
If you want the full ALM effect, you can automatically unit test your code during an automated build, and notify you if anything fails so you know as soon as possible that something is broken.
Apologies for the "long winded" response, but I feel like you're already ahead of the game by observing the problems you might run into in the future and really believe BuildMaster will make all of this deployment stuff simple for your team so you can focus on the fun part, coding!

Is it possible to get a <200ms response with Drupal (without caching)?

The question, simply put, is the one in the title. Is it possible?
So far, my experience with scripting languages is that, to increase performance, you need to cache everything and later just serve the generated HTML files.
That's ok for some use cases, but when you really need to generate a new page in realtime, it's just impossible.
Drupal can take up to 3 seconds (or more!) to render some web pages (PHP execution time, not DB). That's crazy. Completely crazy.
If many projects (like Facebook) are using PHP, obviously the problem is mine. But googling for this problem shows that it's common. Too common.
(Of course I installed APC for PHP. It certainly helps, but PHP is still ultra-slow).
Must I assume this is the reality for Drupal / PHP?
Thanks.

Short answer is no. But why would you not want to cache?
What do you mean by 'generate a new page in realtime'? Authenticated users (anyone logged in) can see new content right away. Anonymous users may have to wait a little bit (if you are using Boost, for example), BUT, you can always control it, or flush it when new content is added. You should cache as much as you can.
You can install Boost (static HTML files), Memcache, and enable Drupal cache. It's encouraged, especially the last one. You can also run nginx on the server.
You can also try using Pressflow, a drop-in replacement for Drupal that will give you better performance.
http://pressflow.org/
It's been discussed many times. You can make Drupal extremely fast if you want to. Check out some of the 2bits articles:
http://2bits.com/contents/articles
Using the available caching methods will help you keep your hosting costs low, instead of throwing more hardware at an unoptimized site.

As you say, Facebook uses PHP, and they clearly have reason to need good performance. Their solution was to write their own compiler for PHP called HipHop, which they released as open source. If you're worried about PHP's performance, you should give it a try as it will definitely improve things.
The downside is that it doesn't (yet) cover 100% of the PHP function set, so some PHP programs may not compile. I don't know where Drupal fits into this, but it would be worth trying it out - there's nothing to be lost by doing a test compilation; if it's not going to work, you won't have lost anything.
In a similar vein, there is a project in the Drupal community to convert parts of the Drupal Core into a PHP Extension, meaning that some key Drupal functions are then built in to the PHP runtime as compiled code. See the project page here. But note that this is still in a fairly early stage of development: it's still listed as experimental, and only covers a small number of functions. It might be worth keeping an eye on the project, though.

According to http://groups.drupal.org/node/34076, yes you can get a < 200ms response time with Drupal without caching.

The tip I've received from some friends regarding Drupal load performance is to install fewer than 40 modules.
With more than 40, especially if those contrib modules use too many hooks and too much memory, performance will decrease.
Other tips:
remove imagecache ui and views ui on production site
if possible, move the .htaccess rules into vhost.conf so that they are read only once, when Apache starts
use throttle module
use gzip for all html, css and js files
use cdn module and amazon server solution
use ajax for some parts or blocks of your site
last and if there is enough budget, migrate to oracle

What are best practices for self-updating PHP+MySQL applications?

It is pretty standard practice now for desktop applications to be self-updating. On the Mac, every non-Apple program that uses Sparkle in my book is an instant win. For Windows developers, this has already been discussed at length. I have not yet found information on self-updating web applications, and I hope you can help.
I am building a web application that is meant to be installed like Wordpress or Drupal - unzip it in a directory, hit some install page, and it's ready to go. In order to have broad server compatibility, I've been asked to use PHP and MySQL -- is that **MP? In any event, it has to be broadly cross-platform. For context, this is basically a unified web messaging application for small businesses. It's not another CMS platform, think webmail.
I want to know about self-updating web applications. First of all, (1) is this a bad idea? As of Wordpress 2.7 the automatic update is a single button, which seems easy, and yet I can imagine so many ways this could go terribly, terribly wrong. Also, isn't the idea that the web files are writable by the web process a security hole?
(2) Is it worth the development time? There are probably millions of WP installs in the world, so it's probably worth the time it took the WP team to make it easy, saving millions of man hours worldwide. I can only imagine a few thousand installs of my software -- is building self-upgrade worth the time investment, or can I assume that users sophisticated enough to download and install web software in the first place could go through an upgrade checklist?
If it's not a security disaster or waste of time, then (3) I'm looking for suggestions from anyone who has done it before. Do you keep a version table in your database? How do you manage DB upgrades? What method do you use for rolling back a partial upgrade in the context of a self-updating web application? Did using an ORM layer make it easier or harder? Do you keep a delta of version changes or do you just blow out the whole thing every time?
I appreciate your thoughts on this.

Frankly, it really does depend on your userbase. There are tons of PHP applications that don't automatically upgrade themselves. Their users are either technical enough to handle the upgrade process, or just don't upgrade.
I propose two steps:
1) Seriously ask yourself what your users are likely to really need. Will self-updating provide enough of a boost to adoption to justify the additional work? If you're confident the answer is yes, just do it.
Since you're asking here, I'd guess that you don't know yet. In that case, I propose step 2:
2) Release version 1.0 without the feature. Wait for user feedback. Your users may immediately cry for a simpler upgrade process, in which case you should prioritize it. Alternately, you may find that your users are much more concerned with some other feature.
Guessing at what your users want without asking them is a good way to waste a lot of development time on things people don't actually need.

I've been thinking about this lately in regards to database schema changes. At the moment I'm digging into WordPress to see how they've handled database changes between revisions. Here's what I've found so far:
$wp_db_version is loaded from wp-includes/version.php. This variable corresponds to a Subversion revision number, and is updated when wp-admin/includes/schema.php is changed. (Possibly through a hook? I'm not sure.) When wp-admin/admin.php is loaded, the WordPress option named db_version is read from the database. If this number is not equal to $wp_db_version, wp-admin/upgrade.php is loaded.
wp-admin/includes/upgrade.php includes a function called dbDelta(). dbDelta() scans $wp_queries (a string of SQL queries that will create the most recent database schema from scratch) and compares it to the schema in the database, altering the tables as necessary so that the schema is brought up-to-date.
upgrade.php then runs a function called upgrade_all(), which runs specific upgrade_NNN() functions if the version stored in the database is less than their target values (i.e. upgrade_250(), the WordPress 2.5.0 upgrade, will be run if the database version is less than 7499). Each of these functions runs its own data migration and population procedures, some of which are also called during the initial database setup script. This nicely cuts down on duplicate code.
So, that's one way to do it.
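Stripped of the WordPress specifics, the scheme described above comes down to a map from target version to upgrade routine, applied in ascending order. The sketch below illustrates that idea and is not WordPress's actual code: `runUpgrades()` and the step names are hypothetical, and apart from 7499 (taken from the description above) the version numbers are invented.

```php
<?php
/**
 * Run every upgrade step whose target version is newer than the installed one.
 *
 * @param int                  $installed Version currently stored in the database.
 * @param array<int, callable> $steps     Map of target version => upgrade routine.
 * @return int The version the database is at afterwards.
 */
function runUpgrades(int $installed, array $steps): int
{
    ksort($steps); // apply the oldest targets first
    foreach ($steps as $target => $step) {
        if ($installed < $target) {
            $step();
            // A real application would persist this after each step, so that
            // an interrupted upgrade resumes where it stopped.
            $installed = $target;
        }
    }
    return $installed;
}

$log = [];
$steps = [
    7499 => function () use (&$log) { $log[] = 'upgrade_250'; },
    8000 => function () use (&$log) { $log[] = 'upgrade_next'; }, // invented step
];

// A database already at 7499 skips the first step and runs only the second.
$newVersion = runUpgrades(7499, $steps);
```

Recording the version after each step, rather than once at the end, is one way to approach the partial-upgrade question raised in the original post.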

Yes, it would be a security risk if PHP went and overwrote its files from some place on the internet with no warning. There's no guarantee that the server is connecting to your real update server (it might download code crafted by someone else if DNS poisoning occurred), which would give someone else access to your client's data. Therefore digital signing would be important.
The user could control updates by setting permissions on the web directory so that PHP only has read access to the files - this procedure could simply be documented with your program.
One question remains (I really don't know the answer to): can PHP overwrite files if it's currently using them (e.g. if the update.php file itself needed to be updated)? Worth testing.
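To make the signing point concrete, here is a minimal sketch of checking a downloaded package against a detached signature before anything is written to disk. It assumes PHP's sodium extension is available (it has shipped with PHP since 7.2, though it may need to be enabled) and that the vendor's public key is distributed with the application itself rather than fetched from the update server. The function name is hypothetical.

```php
<?php
/**
 * Check a downloaded update against a detached Ed25519 signature.
 *
 * @param string $package   Raw bytes of the downloaded archive.
 * @param string $signature Raw detached signature published with the archive.
 * @param string $publicKey Vendor public key shipped inside the application.
 */
function isTrustedUpdate(string $package, string $signature, string $publicKey): bool
{
    // Reject malformed input up front; the sodium call throws on wrong lengths.
    if (strlen($signature) !== SODIUM_CRYPTO_SIGN_BYTES
        || strlen($publicKey) !== SODIUM_CRYPTO_SIGN_PUBLICKEYBYTES) {
        return false;
    }
    return sodium_crypto_sign_verify_detached($signature, $package, $publicKey);
}
```

On the vendor side, the matching signature would be produced with `sodium_crypto_sign_detached()` using a secret key that never leaves the build machine. An attacker who redirects the download can then serve a different file, but cannot produce a signature that passes this check.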

I suppose you've already ruled this out, but you could host it as a service. (Think wordpress.com)

I'd suggest that you package your application with pear and set up a channel. Your users can then upgrade the application through a standard interface (pear). It's not entirely automatic (unless the users have some kind of automation running on top of pear), but it's standard, so any sysadmin can maintain it.

I think your best option is an update checking mechanism that will alert the administrator when there are update(s).
As you mention, there are a number of potential security problems. Due to those alone, I would suggest not doing this. Instead, try creating a fairly smart upgrading script.

Just my 2 cents: I'd consider an automatically self-updating application within my CMS a security hole, so if you decide to code this feature, you should consider implementing different levels of this behavior:
Automatically update
Check for updates and notify
Disable
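Those three levels map naturally onto a single setting that is checked whenever a newer release is detected. As a small sketch (the setting values and function name are hypothetical; `version_compare()` is PHP's built-in version comparison):

```php
<?php
/**
 * Decide what to do about a newer release, given the site's update policy.
 *
 * @param string $policy    'auto', 'notify' or 'disabled' (hypothetical names);
 *                          any other value is treated like 'notify'.
 * @param string $installed Version currently running, e.g. "1.2.0".
 * @param string $latest    Newest version announced by the vendor.
 * @return string 'install', 'notify' or 'none'
 */
function updateAction(string $policy, string $installed, string $latest): string
{
    if ($policy === 'disabled' || !version_compare($installed, $latest, '<')) {
        return 'none';
    }
    return $policy === 'auto' ? 'install' : 'notify';
}
```

Using `version_compare()` rather than a plain string comparison matters here, because "1.10.0" is newer than "1.2.0" even though it sorts earlier as text.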

Is it necessary in any circumstance to modify Wordpress other than writing plugins and themes?

I recently had to work on a project where the previous developer modified the wp-admin directory. It seems like a bad idea to me, since Wordpress is constantly updated. Am I just not at that level of expertise with modifying Wordpress?

Being open source, I think it's a common thing for software like WordPress to be modified and extended at any point.
To modify or not to modify is a choice between trade-offs. New features can be encapsulated as modules, which may, perhaps, cause their functionality to be less integrated than desired. However, fully integrating changes may hinder easily updating the software as new versions are released.
It does require that someone be very familiar with the software to modify the software directly, but this isn't necessarily a bad idea.
On a side note, I think modifying WordPress is almost a necessity, especially if you want it to have a decent architecture or to actually be secure (ok, that was a jab, sue me).

Well, it is a bad idea only in that it means you are now responsible for maintaining an internal de facto fork ... every time WordPress releases an update, you have to do a three-way diff to merge your changes into the new "real" WordPress. (Three-way diff means you do a diff between your fork of the old version and the standard old version to build a patch set, then apply that patch set to the new version.) You should also be using a VCS yourself to keep yourself sane.
If you aren't up to this, then you aren't up to it; there's nothing wrong with following the KISS principle and not mucking up the application code.
If you can write a plugin that does the same thing and does it just as efficiently, then you should do that so you don't have to maintain your own fork.
However, there are a lot of things WordPress is terrible at (efficiency, security) that you can ameliorate (sometimes without much work, just by disabling code you don't need) only by hacking the application code. WordPress is dirty legacy spaghetti code originally written by people with virtually zero knowledge of software or database design, and it does a lot of tremendously stupid things like querying the database on every request to see what its own siteurl is, when this never changes -- there's nothing wrong with taking 5 minutes to change 2 lines of code so it doesn't do this any more.
I worked as tech lead on a then-top-20 Technorati-ranked blog and did a lot of work to scale WordPress on a single server and then onto a cluster (with separate servers for admin vs. public access). We had upstream reverse proxies (i.e. Varnish or Squid) acting as HTTP accelerators and an internal object/page fragment cache system that plugged into memcached with failover to filesystem caching using PEAR::Cache_Lite. We had to modify WordPress to do things like send sane, cache-friendly HTTP headers, to disable a lot of unnecessary SQL and processing.
I modified WP to run using MySQL's memory-only NDB cluster storage engine, which meant specifying indexes in a lot of queries (in the end we opted for a replicated cluster instead, however). In modifying it to run with separate servers for admin vs. public access, we locked down the public-side version so it ran with much reduced MySQL privileges allowing only reads (a third MySQL user got commenting privileges).
If you have a serious comment spam problem (i.e. 10K/hour), then you have to do something beyond plugins. Spam will DOS you because WordPress just initializing its core is something like half a second on a standalone P4 with no concurrency, and since WP is a code hairball there's no way to do anything without initializing the core first.
"WP-Cron" is braindead and should be disabled if you have access to an actual crontab to perform these functions. Not hard to do.
In short, I could go on forever listing reasons why you might want to make modifications.
Throughout this it was of course a goal for maintainability reasons to keep these modifications to a minimum and document them as clearly as possible, and we implemented many as plugins when it made sense.
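On the WP-Cron point above: in current WordPress versions this particular change does not need a core modification, since the `DISABLE_WP_CRON` constant can be set in wp-config.php. A minimal configuration sketch, with example.com standing in for the real site URL and the five-minute interval chosen arbitrarily:

```php
<?php
// In wp-config.php, above the "That's all, stop editing!" line:
// stop WordPress from spawning wp-cron.php during normal page loads.
define('DISABLE_WP_CRON', true);

// Then trigger the scheduler from the system crontab instead, for example:
//   */5 * * * * wget -q -O - https://example.com/wp-cron.php >/dev/null 2>&1
```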

On one blog/forum combination, we hacked together the signup procedure so that people filled in one form to sign up to both WordPress and phpBB at the same time. I'm sure there's a better way to do that with plugins, but it did have one unexpected benefit - it really confuses the spambots. Despite having several of them register each day, we've had about two spam posts in the life of the forum.
Not something I'd recommend, of course - it stops us from upgrading either software.

I tend to strongly advocate against modifying core code if at all possible, especially in a project that updates like WordPress does. If WordPress can't be made to do what you need it to with plugins and the like, you're probably better off with a more extensible/generic system like Drupal. Hacking a blogging-oriented CMS into something else might not be worth it.

In the older versions of WordPress (1.0 and even the early 2.0s), I wouldn't bat an eye to modifying WordPress itself.
However, WordPress' architecture has matured. Sidebars no longer need to be manually coded. Instead, you can port your theme to use widgets and just create widgets (what a godsend!). Don't like how something is displayed - just modify the theme! Don't like how WordPress handles something? Create a plug-in. I'm hard pressed to think of a reason to modify the WordPress code itself that cannot be handled via WordPress' contemporary modular components (widgets, plug-ins, themes) instead.
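As an illustration of how little is needed for the plug-in route, here is a sketch of a complete single-file plugin that changes how post content is displayed without touching core. The plugin name, function name and the 200 words-per-minute figure are all made up for the example; `add_filter()` and the `the_content` filter are part of WordPress.

```php
<?php
/**
 * Plugin Name: Example Reading Time (hypothetical)
 */

// Prepend an estimated reading time to the post content.
function example_add_reading_time(string $content): string
{
    $words   = str_word_count(strip_tags($content));
    $minutes = max(1, (int) ceil($words / 200)); // 200 words/minute is an assumption
    return '<p class="reading-time">' . $minutes . ' min read</p>' . $content;
}

// Register with WordPress only when it is loaded, so the file is safe to run alone.
if (function_exists('add_filter')) {
    add_filter('the_content', 'example_add_reading_time');
}
```

Dropped into wp-content/plugins and activated, a file like this survives core updates untouched, which is the maintainability argument being made here.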
I'm the type of person to always get "under the hood" in open source apps like WordPress. However, nowadays, there's really no good reason to modify the core WordPress code.
