Collaboration script - track user contributions - php

I am developing a collaboration tool in PHP and MySQL, and I wanted to ask what would be the most efficient way to do the following: say I have a block of text that will get edited by different users. I need to record each change, and when the changed text is viewed, the text changed by a particular user should be highlighted (possibly with CSS and/or jQuery).
I am not looking for a particular code snippet (and you can see that my question is fairly vague), but I was hoping to get an idea of how to go about this particular problem.
As always, cheers for all suggestions.

One way to do this, if you're using git for version control, would be to use the git blame command. It will show you, line by line, who changed what, at what time, and in which commit. Here's some documentation, and here's a GUI. I prefer to run this from the command line with
git blame path/to/filename.m
If you don't use git, and want to learn a bit more, you might check out the Git Community Book.

If you are going to build it from scratch, then I have an idea. You can create a table 'Event', in which you record every change made to your document.
The table covers three concepts: "modified time", "who modified" and "what changed". The "what changed" part is the main problem here. In my opinion, since you will very likely need to support a "revert" ability, you should save every version of the document. So for "what changed" you only need two columns: "before_change_text_link", which refers to the file before the change, and "after_change_text_link", which refers to the file after the change. That way, you can record all the changes.
You can then highlight the changes made by different users with jQuery/CSS, using some text-comparison procedure on the server.
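For illustration, a minimal sketch of what such an event table could look like (the column names here are assumptions, not prescribed by the answer above):
<?php
// Hypothetical sketch: one row per edit, pointing at the stored before/after versions.
$pdo = new PDO('mysql:host=localhost;dbname=collab', 'user', 'pass');
$pdo->exec("
    CREATE TABLE IF NOT EXISTS event (
        id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
        document_id INT UNSIGNED NOT NULL,
        user_id INT UNSIGNED NOT NULL,          -- who modified
        modified_at DATETIME NOT NULL,          -- modified time
        before_change_text_link VARCHAR(255),   -- reference to the version before the change
        after_change_text_link VARCHAR(255)     -- reference to the version after the change
    )
");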

OK, so I've come up with a solution: when a user submits the new text, run diff on it and register the lines that have been changed. Those will be stored in a MySQL table with the user id, the string that diff returns and its respective line number. This way I can return the original text, return the changed strings for a particular user, and use regex to highlight them.
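A rough sketch of that approach (the text_change table and the naive line-by-line comparison are assumptions for illustration, not the actual implementation):
<?php
// Hypothetical sketch: compare old and new text line by line and store the lines that
// changed, keyed by user and line number, so they can be highlighted later.
$pdo = new PDO('mysql:host=localhost;dbname=collab', 'user', 'pass');

function recordChanges(PDO $pdo, int $docId, int $userId, string $oldText, string $newText): void
{
    $old = explode("\n", $oldText);
    $new = explode("\n", $newText);

    $stmt = $pdo->prepare(
        'INSERT INTO text_change (document_id, user_id, line_number, new_line) VALUES (?, ?, ?, ?)'
    );

    foreach ($new as $i => $line) {
        // Naive comparison; a real diff library would also track moved or deleted lines.
        if (!isset($old[$i]) || $old[$i] !== $line) {
            $stmt->execute([$docId, $userId, $i + 1, $line]);
        }
    }
}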

Related

How to merge local and live databases?

We've been developing for Wordpress for several years and whilst our workflow has been upgraded at several points there's one thing that we've never solved... merging a local Wordpress database with a live database.
So I'm talking about having a local version of the site where files and data are changed, whilst the data on the live site is also changing at the same time.
All I can find is the perfect world scenario of pulling the site down, nobody (even customers) touching the live site, then pushing the local site back up. I.e copying one thing over the other.
How can this be done without running a tonne of mysql commands? (It feels like they could fall over if they're not properly checked!) Can this be done via Gulp (I've seen it mentioned) or a plugin?
Just to be clear, I'm not talking about pushing/pulling data back and forth via something like WP Migrate DB Pro, BackupBuddy or anything similar - this is a merge, not replacing one database with another.
I would love to know how other developers get around this!
File changes are fairly simple to get around, it's when there's data changes that it causes the nightmare.
WP Stagecoach does do a merge but you can't work locally, it creates a staging site from the live site that you're supposed to work on. The merge works great but it's a killer blow not to be able to work locally.
I've also been told by the developers that datahawk.io will do what I want but there's no release date on that.
It sounds like VersionPress might do what you need:
VersionPress staging
A couple of caveats: I haven't used it, so can't vouch for its effectiveness; and it's currently in early access.
Important: take a backup of the Live database before merging Local data into it.
Following these steps might help in migrating a large percentage of the data and merging it into Live:
Go to the WP back-end of the Local site, Tools -> Export.
Select the All content radio button (if not selected by default).
This will produce an XML file containing all the local data, comprising all default post types and custom post types.
Open this XML file in Notepad++ or any other editor and find-and-replace the Local URL with the Live URL.
Now visit the Live site and import the XML under Tools -> Import.
Upload the files (images) manually.
This will bring a large percentage of the data from Local to Live.
For the rest of the data you will have to write custom scripts.
Risk factors are:
When uploading images from Local to Live, images with the same name will be overridden.
WordPress saves the images in post_meta, generating serialized data for the images; that should be taken care of when uploading the database (see the sketch after this list).
Serialized data in post_meta for post_type="attachment" stores serialized data for 3 or 4 sizes of each image.
Usernames or email ids of users can already exist when importing the data (WP checks for unique usernames and emails), so those users might not be imported.
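As a hedge against the serialized-data risk above, here is a rough sketch (not part of the original steps; all names are illustrative) of replacing the Local URL inside post_meta without corrupting serialized values:
<?php
// Hypothetical sketch: replace the Local URL inside post_meta values, re-serializing
// where needed so the stored string lengths stay correct after the replacement.
function replaceUrl($value, string $from, string $to)
{
    if (is_array($value)) {
        return array_map(fn($v) => replaceUrl($v, $from, $to), $value);
    }
    if (is_string($value)) {
        $unserialized = @unserialize($value);
        if ($unserialized !== false || $value === 'b:0;') {
            // The value was serialized: fix it recursively, then serialize again.
            return serialize(replaceUrl($unserialized, $from, $to));
        }
        return str_replace($from, $to, $value);
    }
    return $value; // ints, floats, bools and null pass through unchanged
}
WP-CLI's search-replace command handles the same serialization issue, so using it instead of a hand-rolled script may be simpler.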
If I were you I'd do the following (slow, but it affords you the greatest chance of success):
First off, set up a third database somewhere. Cloud services would probably be ideal, since you could get a powerful server with an SSD for a couple of hours. You'll need that horsepower.
Second, we're going to mysqldump the first DB and pipe the output into our cloud DB.
mysqldump -u user -ppassword dbname | mysql -u root -ppass -h somecloud.db.internet
Now we have a full copy of DB #1. If your cloud supports snapshotting data, be sure to take one now.
The last step is to write a PHP script that, slowly but surely, selects the data from the second DB and writes it to the third. We want to do this one record at a time. Why? Well, we need to maintain the relationships between records. So let's take comments and posts. When we pull post #1 from DB #2, it won't be able to keep ID #1, because DB #1 already had one. So now post #1 becomes post #132. That means all the comments for post #1 now need to be written as belonging to post #132. You'll also have to pull the records for the users who made those posts, because their user IDs will also change.
There's no easy fix for this, but the WP structure isn't terribly complex. Building a simple loop to pull the data and translate it shouldn't be more than a couple of hours of work.
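A rough sketch of that loop, assuming two PDO connections ($src for DB #2 and $dst for the merged copy on the cloud server); wp_posts and wp_comments are the standard WordPress tables, everything else is illustrative:
<?php
// Hypothetical sketch: copy posts one at a time, let the target assign new IDs,
// remember the old->new mapping, then rewrite comment foreign keys with it.
$postIdMap = [];

$posts = $src->query('SELECT * FROM wp_posts')->fetchAll(PDO::FETCH_ASSOC);
foreach ($posts as $post) {
    $oldId = $post['ID'];
    unset($post['ID']); // let AUTO_INCREMENT pick a new primary key

    $cols  = implode(', ', array_keys($post));
    $marks = implode(', ', array_fill(0, count($post), '?'));
    $dst->prepare("INSERT INTO wp_posts ($cols) VALUES ($marks)")
        ->execute(array_values($post));

    $postIdMap[$oldId] = (int) $dst->lastInsertId();
}

$comments = $src->query('SELECT * FROM wp_comments')->fetchAll(PDO::FETCH_ASSOC);
foreach ($comments as $comment) {
    unset($comment['comment_ID']);
    $comment['comment_post_ID'] = $postIdMap[$comment['comment_post_ID']] ?? 0;

    $cols  = implode(', ', array_keys($comment));
    $marks = implode(', ', array_fill(0, count($comment), '?'));
    $dst->prepare("INSERT INTO wp_comments ($cols) VALUES ($marks)")
        ->execute(array_values($comment));
}
User IDs would need the same treatment, as the answer notes; this sketch only covers posts and comments.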
If I understand you correctly: to merge local and live databases, up until now I have been using other software such as Navicat Premium, which has a Data Sync feature.
This can be achieved live using spring-xd: create a JDBC stream to pull data from one DB and insert it into the other. (This acts as streaming, so you don't have to disturb either environment.)
The first thing you need to do is assess whether it would be easier to do some copy-paste data entry instead of a migration script. Sometimes the best answer is to suck it up and do it manually using the CMS interface. This avoids any potential conflicts with merging primary keys, but you may need to watch for references like the creator of a post or similar data.
If it's just outright too much to manually migrate, you're stuck with writing a script or finding one that is already written for you. Assuming there's nothing out there, here's what you do...
ALWAYS MAKE A BACKUP BEFORE RUNNING MIGRATIONS!
1) Make a list of what you need to transfer. Do you need users, posts, etc.? Find the database tables and add them to the list.
2) Make a note of all possible foreign keys in the database tables being merged into the new database. For example, wp_posts has post_author referencing wp_users. These will need specific attention during the migration. Use this documentation to help find them.
3) Once you know what tables you need and what they reference, you need to write the script. Start by figuring out what content is new for the other database. The safest way is to do this manually with some kind of side-by-side list. However, you can come up with your own rules for automatically matching table rows. Maybe check for $post1->post_content === $post2->post_content in cases where the text needs to be the same. The only catch here is that the primary/foreign keys are off limits for these rules.
4) How do you merge new content? The general idea is that all primary keys will need to be changed for any new content. You want to take everything except the id of the post and insert that into the new database. The auto-increment will create the new id, so you won't need the previous id (unless you want it for script output/debugging).
5) The tricky part is handling the foreign keys. This process is going to vary wildly depending on what you plan on migrating. What you need to know is which foreign key goes to which (possibly new) primary key. If you're only migrating posts, you may need to hard-code a user id to user id mapping for the post_author column, then use this to replace the values.
But what if I don't know the user ids for the mapping because some users also need to be migrated?
This is where it gets tricky. You will need to first define the merge rules to see if a user already exists. For new users, you need to record the id of the newly inserted user. Then, after all users are migrated, the post_author value will need to be replaced wherever it references a newly merged user (a sketch of this remapping follows this list).
6) Write and test the script! Test it on dummy databases first. And again, make backups before using it on your databases!
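To make the post_author remapping of step 5 concrete, here is a hedged sketch (the variable names and example IDs are assumptions):
<?php
// Hypothetical sketch: $userIdMap was filled while merging users
// (old user ID from the source DB => new user ID in the target DB).
$userIdMap = [
    3 => 41,   // example: source user 3 became user 41 after the merge
    7 => 42,
];

// $mergedPosts holds [new post ID in the target DB => old author ID from the source DB].
$update = $dst->prepare('UPDATE wp_posts SET post_author = ? WHERE ID = ?');

foreach ($mergedPosts as $newPostId => $oldAuthorId) {
    if (isset($userIdMap[$oldAuthorId])) {
        $update->execute([$userIdMap[$oldAuthorId], $newPostId]);
    }
}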
I've done something similar with an ETL (Extract, Transform, Load) process when I was moving data from one CMS to another.
Rather than writing a script, I used the Pentaho Data Integration (Kettle) tool.
The idea of ETL is pretty much straightforward:
Extract the data (for instance from one database)
Transform it to suit your needs
Load it to the final destination (your second database).
The tool is easy to use and it allows you to experiment with various steps and outputs to investigate the data. When you have designed the right ETL process, you are ready to merge those databases of yours.
How can this be done without running a tonne of mysql commands?
No way. If both the local and live sites are running at the same time, how can you prevent ending up with the same IDs pointing to different content?
So if you want to do this, you could look at MySQL replication; I think it will help you merge the different MySQL databases.

How to create a versioning/history/revision system for contents published by users?

After reading a lot of SO questions about keeping page change history or how to version control a record in a database (for example), I can't find a really elegant solution to do the work.
Now, let's try to explain as clearly as possible what we need for this simple revision system: it allows registered users to post some articles, other users to submit revisions of those articles, and then some moderator users to check those revisions.
MySQL database
The database contains an articles table with the following simplified fields:
ARTICLE(id, id_user, title, content, date);
To implement the revision/history versions, I guess that we'll have the following table:
REVISION(id, id_article, revision_id_user, id_moderator, revision_date,
revision_title, revision_content, revision_description, revision_check);
With the relation : ARTICLE 0,n <---> 1,1 REVISION
Workflow
A user creates an ARTICLE, which is inserted into the ARTICLE table (terrific!)
Another user makes an update to this ARTICLE; the update is recorded in the REVISION table and queued for the moderator users (revision_check=0).
A moderator user validates the REVISION (revision_check=1), then ARTICLE(content) gets the REVISION(revision_content) value.
My questions
Does this workflow seem like a good way to do it? Because I can see a problem if there are several REVISIONs for an ARTICLE:
Should we take the content of the last submitted REVISION or the original ARTICLE?
Or should we block revisions, so that no other REVISION can be submitted while the last one hasn't been checked?
Is there a way to record light versioning? That is, is it possible to insert into the REVISION table only the updated content, through an SQL, PHP or JS compare function? And how to display it like SO does? Because I'm afraid the REVISION table will become very heavy.
Bonus: how does SO do it?
Any idea, link, source, plugin (MySQL, PHP 5 and JS/jQuery) would be greatly appreciated.
I don't see a single answer with a plugin for your question, because of your personalized workflow.
About the revision workflow
It is your own vision of it, and it doesn't seem too bad for your use. But I'm sure some of the use cases will have to evolve along the way.
First point that I can see: you must lock the revisions while a revision is in progress AND until it is validated by a moderator. When one is in progress, set ARTICLE(revision=progress), for example, to lock it, and prevent other users from editing the article at the same time by displaying a message.
Second point: be careful, because I believe the author of the article could otherwise update it without any moderation process. For this reason, you'll have to set ARTICLE(revision=progress) as well while the author updates his own article.
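As an illustration of the approval step in that workflow (purely a sketch: the table and column names follow the question's schema plus the lock column suggested above, everything else is assumed):
<?php
// Hypothetical sketch: a moderator accepts revision $revisionId; the article text is
// replaced by the revision content and the "progress" lock is released, in one transaction.
$pdo->beginTransaction();

$pdo->prepare('UPDATE revision SET revision_check = 1, id_moderator = ? WHERE id = ?')
    ->execute([$moderatorId, $revisionId]);

$pdo->prepare(
    'UPDATE article a
       JOIN revision r ON r.id_article = a.id
        SET a.content  = r.revision_content,
            a.title    = r.revision_title,
            a.revision = NULL        -- clear the lock discussed above
      WHERE r.id = ?'
)->execute([$revisionId]);

$pdo->commit();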
About recording a light version of the revisions in db
You could write a crazy function in PHP (or another language) that creates an array for each change, like the following:
array(
    '1' => array('char_pos' => '250', 'type' => 'delete', 'length' => '25', 'change' => ''),
    '2' => array('char_pos' => '450', 'type' => 'insert', 'length' => '16', 'change' => 'some text change'),
    ...
);
As you can see, creating, formatting and recording this in the database could be quite awful and difficult to manage.
I think there's no built-in way to do versioning with MySQL. You could do something for versioning with an ORM like Propel, but I don't think the result will be what you expect...
Here, the better way seems to be recording the entire updated article for each revision, even if it grows your database. With your workflow you won't read the REVISION table a lot, so MySQL won't take a heavy load from it.
About the comparison display
You could use the Diff-Match-Patch plugin to highlight the updates between two contents ("differences" demo here). I think that SO uses Beyond Compare (or similar) to highlight changes between revisions.
To read more about SO technologies, you can have a look at this page.

How to "upgrade" the database in real world?

My company has developed a web application using PHP + MySQL. The system can display a product's original price and discount price to the user. If you haven't logged in, you get the original price; if you have logged in, you get the discount price. It is pretty easy to understand.
But my company wants more features in the system: it wants to display different prices based on different users. For example, user A is a golden partner, so he gets 50% off. User B is a silver partner and only gets 30% off. But this logic was not prepared for in the original system, so I need to add some attributes to the database, at least a user type in this example. Is there any recommendation on how to merge the current database into my new version of the database? Also, all the data should be preserved, and the server should keep working 24/7 (without stopping the database).
Is it possible to do so? Also, any recommendations for future maintenance? Thank you.
I would recommend writing a tool to run SQL queries against your databases incrementally, much like Rails migrations.
In the system I am currently working on, we have such a tool written in Python. We name our scripts something like 000000_somename.sql, where the 0s are the revision number in our SCM (Subversion), and the tool is run as part of development/testing and finally when deploying to production.
This has the benefit of letting you go back in time in terms of database changes, much like you can in code (if you use a source code version control tool).
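A minimal sketch of such a runner in PHP (the migrations folder, file naming and bookkeeping table are assumptions for illustration):
<?php
// Hypothetical sketch: apply numbered .sql files in order and remember which ones have run.
$pdo = new PDO('mysql:host=localhost;dbname=shop', 'user', 'pass');
$pdo->exec('CREATE TABLE IF NOT EXISTS schema_version (revision VARCHAR(20) PRIMARY KEY)');

$applied = $pdo->query('SELECT revision FROM schema_version')->fetchAll(PDO::FETCH_COLUMN);

$files = glob(__DIR__ . '/migrations/*.sql');
sort($files); // filenames start with the revision number, so sorting gives the run order

foreach ($files as $file) {
    $revision = substr(basename($file), 0, 6); // e.g. "000123" from 000123_add_user_type.sql
    if (in_array($revision, $applied, true)) {
        continue;
    }
    $pdo->exec(file_get_contents($file)); // assumes one statement per migration file
    $pdo->prepare('INSERT INTO schema_version (revision) VALUES (?)')->execute([$revision]);
    echo "applied $revision\n";
}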
http://dev.mysql.com/doc/refman/5.1/en/alter-table.html
Here are more concrete examples of ALTER TABLE.
http://php.about.com/od/learnmysql/p/alter_table.htm
You can add the necessary columns to your table with ALTER TABLE, then set the user type for each user with UPDATE. Then deploy the new version of your app that uses the new column.
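For instance, a hedged sketch using the question's example (the table and column names are assumptions):
<?php
// Hypothetical sketch: add the partner-level column without downtime, then backfill it.
$pdo = new PDO('mysql:host=localhost;dbname=shop', 'user', 'pass');

// Existing rows simply receive the default value, so the running app keeps working.
$pdo->exec("ALTER TABLE users ADD COLUMN user_type VARCHAR(20) NOT NULL DEFAULT 'standard'");

// Backfill the known partners; the IDs here are just examples.
$update = $pdo->prepare('UPDATE users SET user_type = ? WHERE id = ?');
$partners = ['gold' => [12, 57], 'silver' => [9, 101]];
foreach ($partners as $type => $userIds) {
    foreach ($userIds as $id) {
        $update->execute([$type, $id]);
    }
}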
Did you use an ORM for the data access layer? I know Doctrine comes with a migration API which allows switching versions up and down (in case something goes wrong with the new version).
Outside of any framework or ORM considerations, a fast script will minimize slowdown (or downtime if the process is too long).
In my opinion, I'd rather have a 30-second website access interruption with an information page than a shorter interruption with visible bugs or no display at all. If interruption time matters, it's best to do this at night or when there is less traffic.
This can all be done in one script (or at least launched by one command line). When we had to do such scripts, we included the following in a shell script:
putting the application in standby (temporary static page): you can use a .htaccess redirect or whatever is applicable to your app/server environment.
svn update (or switch) for the source code and asset upgrade
empty caches, clean up temp files, etc.
rebuild generated classes (Symfony specific)
upgrade the DB structure with ALTER / CREATE TABLE queries
if needed, migrate data from the old structure to the new one: depending on what you changed in the structure, it may require fetching data before altering the DB structure, or using temp tables.
if all went well, remove the temporary page. Upgrade done.
if something went wrong, display a red message to the operator so they can see what happened, try to fix it and then remove the waiting page by hand.
The script should do checks at each step and stop at the first error, and it should be verbose (but concise) about what it does at every step, so you can fix the app faster if something does go wrong.
The best would be a recoverable script (error at step 2 - stop process - manual fix - resume at step 3); I never took the time to implement it that way.
It works pretty well, but this kind of script has to be tested intensively, on an environment as close as possible to the production one.
In general we develop such scripts locally and test them on the same platform as the production environment (just different paths and DB).
If the waiting page is not an option, you can go without it, but you need to ensure data and user-session integrity. As an example, use LOCK on tables during the upgrade/data transfer and use exclusive locks on modified files (SVN does, I think).
There could be other, better solutions, but this is basically what I use and it does the job for us. The major drawback is that this kind of script has to be rewritten at each major release, which pushes me to look for other options, but which ones??? I would be glad if someone here had a better and simpler alternative.

How to create text diff web app

idea
I would like to create a little app for myself to store ideas (the thing is - I want it done MY WAY).
database
I'm thinking going simple:
id - unique id of revision in database
text_id - identification number of text
rev_id - number of revision
flags - various purposes - expl. later
title - self expl.
desc - description
text - self expl.
flags - if I add, for example, the flag rb;65 then, instead of storing the whole text, I just note that whenever I ask for the latest revision I should go back to the DB and read revision 65.
Question: Is this setup the best? Is it better to store the diff, or the whole text (I know, space is cheap...)? Does that revision flag make sense (wouldn't it be better to just copy the text - more disk space, but less DB and PHP processing)?
php
I'm thinking that I'll go with PEAR here. Although the main point is open-edit-save, the possibility to view revisions can't be that hard to program and can be a life-saver in certain situations (good ideas got deleted, the wrong version got saved, etc...).
However, I've never used PEAR in a long-term or full-project relationship, and brief encounters in my previous experience left a rather bad feeling - as I remember, it was too difficult to implement, and slow and humongous to play with, so I don't know if there's anything better.
Update: It seems that there are more pre-made text diff libraries, some even more lightweight than PEAR, so I'll probably have to dig into it.
why?
Although there are bazillions of various time/project/idea management tools, every one of them lacks something for me, whether it's sharing with users, syncing on more PCs, time-tracking, project management... And I believe that this text diff web app will be for internal use with various other tools later. So if you know any good project management app with a nice UI and support for text-heavy usage, just let me know, so I can save my time for something better than reinventing the wheel.
I think your question boils down to this one line (if there's something else, let me know, and I'll add on):
Is it better to store the diff, or whole text (i know, space is cheap...)?
It's definitely better to store the whole text, unless you really need to save space. Viewing the text will be a much more common action than checking a diff, and if something has a lot of revisions it could be a significant process to "build" the text for the latest one. Imagine a heavily-used page where you've done thousands of revisions, and the "whole text" is only stored with the original. Then you have to process thousands of diffs just to view the latest text, instead of just pulling it straight out of the database.
If you want to compromise, every time you calculate a diff between any two revisions, store it in a separate table. Then you only have to calculate any given diff once, so it'll be instant the next time you view the same diff. If necessary, this table could be pruned every once in a while to remove diffs that haven't been accessed in a long time.
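One possible shape of that compromise, sketched with made-up names (computeDiff and loadRevisionText stand in for whatever diff library and loader end up being used):
<?php
// Hypothetical sketch: look up a cached diff between two revisions, computing and
// storing it only the first time it is requested.
function getDiff(PDO $pdo, int $fromRev, int $toRev): string
{
    $stmt = $pdo->prepare('SELECT diff FROM revision_diff WHERE from_rev = ? AND to_rev = ?');
    $stmt->execute([$fromRev, $toRev]);
    $cached = $stmt->fetchColumn();
    if ($cached !== false) {
        return $cached;
    }

    $diff = computeDiff(loadRevisionText($pdo, $fromRev), loadRevisionText($pdo, $toRev));

    $pdo->prepare('INSERT INTO revision_diff (from_rev, to_rev, diff, accessed_at) VALUES (?, ?, ?, NOW())')
        ->execute([$fromRev, $toRev, $diff]);

    return $diff;
}
The accessed_at column is what a pruning job could later use to drop diffs that haven't been viewed in a long time.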
Here is a PHP diff function: http://paulbutler.org/archives/a-simple-diff-algorithm-in-php/
And here is another: holomind.de/phpnet/diff.php
If you're storing a lot of different versions of files, git can help you quite a lot.

Starting with versioning mysql schemata without overkill. Good solutions?

I've arrived at the point where I realise that I must start versioning my database schemata and changes. I consequently read the existing posts on SO about that topic but I'm not sure how to proceed.
I'm basically a one man company and not long ago I didn't even use version control for my code. I'm on a windows environment, using Aptana (IDE) and SVN (with Tortoise). I work on PHP/mysql projects.
What's an efficient and sufficient (no overkill) way to version my database schemata?
I do have a freelancer or two in some projects but I don't expect a lot of branching and merging going on. So basically I would like to keep track of concurrent schemata to my code revisions.
[edit] Momentary solution: for the moment I've decided I will just make a schema dump plus one with the necessary initial data whenever I commit a tag (stable version). That seems to be just enough for me at the current stage. [/edit]
[edit2] Plus I'm now also using a third file called increments.sql where I put all the changes with dates, etc., to make it easy to trace the change history in one file. From time to time I integrate the changes into the two other files and empty increments.sql. [/edit2]
Simple way for a small company: dump your database to SQL and add it to your repository. Then every time you change something, add the changes in the dump file.
You can then use diff to see changes between versions, not to mention have comments explaining your changes. This will also make you virtually immune to MySQL upgrades.
The one downside I've seen to this is that you have to remember to manually add the SQL to your dumpfile. You can train yourself to always remember, but be careful if you work with others. Missing an update could be a pain later on.
This could be mitigated by creating some elaborate script to do it for you when submitting to subversion but it's a bit much for a one man show.
Edit: In the year that's gone by since this answer, I've had to implement a versioning scheme for MySQL for a small team. Manually adding each change was seen as a cumbersome solution, much like it was mentioned in the comments, so we went with dumping the database and adding that file to version control.
What we found was that test data was ending up in the dump and was making it quite difficult to figure out what had changed. This could be solved by dumping the schema only, but this was impossible for our projects since our applications depended on certain data in the database to function. Eventually we returned to manually adding changes to the database dump.
Not only was this the simplest solution, but it also solved certain issues that some versions of MySQL have with exporting/importing. Normally we would have to dump the development database, remove any test data, log entries, etc, remove/change certain names where applicable and only then be able to create the production database. By manually adding changes we could control exactly what would end up in production, a little at a time, so that in the end everything was ready and moving to the production environment was as painless as possible.
How about versioning the file generated by doing this:
mysqldump --no-data database > database.sql
Where I work we have an install script for each new version of the app which has the sql we need to run for the upgrade. This works well enough for 6 devs with some branching for maintenance releases. We're considering moving to Auto Patch http://autopatch.sourceforge.net/ which handles working out what patches to apply to any database you are upgrading. It looks like there may be some small complication handling branching with auto Patch, but it doesn't sound like that'll be an issue for you.
I'd guess a batch file like this should do the job (didn't try it though)...
mysqldump --no-data -ufoo -pbar dbname > path/to/app/schema.sql
svn commit path/to/app/schema.sql
Just run the batch file after changing the schema, or let a cron job/scheduler do it (but I don't know... I think commits go through if just the timestamps changed, even if the contents are the same. Don't know if that would be a problem.)
The main idea is to have a folder with this structure in your project base path:
/__DB
    /changesets
        /1123
    /data
    /tables
Now, how the whole thing works is that you have 3 folders:
Tables
Holds the table create query. I recommend using the naming “table_name.sql”.
Data
Holds the table insert data query. I recommend using the same naming “table_name.sql”.
Note: Not all tables need a data file; you would only add the ones that need this initial data on project install.
Changesets
This is the main folder you will work with.
This holds the changesets made to the initial structure. It actually contains folders of changesets.
For example, I added a folder 1123 which will contain the modifications made in revision 1123 (the number is from your source code control) and may contain one or more SQL files.
I like to add them grouped by table with the naming xx_tablename.sql - the xx is a number that tells the order they need to be run in, since sometimes you need the modifications run in a certain order.
Note:
When you modify a table, you also add those modifications to the table and data files... since those are the files that will be used to do a fresh install.
This is the main idea.
For more details you could check this blog post.
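A hedged sketch of how the changesets folder could be applied automatically (everything here, including the bookkeeping, is an assumption layered on top of the structure above):
<?php
// Hypothetical sketch: apply every changeset folder newer than the last one recorded,
// running its xx_*.sql files in filename order.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
$lastApplied = 1100; // e.g. read from a small bookkeeping table or a config file

$changesets = glob(__DIR__ . '/__DB/changesets/*', GLOB_ONLYDIR);
sort($changesets, SORT_NATURAL);

foreach ($changesets as $dir) {
    $revision = (int) basename($dir);
    if ($revision <= $lastApplied) {
        continue;
    }
    foreach (glob("$dir/*.sql") as $sqlFile) {   // the xx_ prefix keeps these in run order
        $pdo->exec(file_get_contents($sqlFile)); // assumes one statement per file
    }
    $lastApplied = $revision;
}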
Take a look at SchemaSync. It will generate the patch and revert scripts (.sql files) needed to migrate and version your database schema over time. It's a command line utility for MySQL that is language and framework independent.
Some months ago I searched for a tool for versioning MySQL schemas. I found many useful tools, like Doctrine migrations, RoR migrations, and some tools written in Java and Python.
But none of them satisfied my requirements.
My requirements:
No requirements except PHP and MySQL
No schema configuration files, like schema.yml in Doctrine
Able to read the current schema from a connection and create a new migration script that reproduces an identical schema in other installations of the application.
I started writing my own migration tool, and today I have a beta version.
Please try it if you have an interest in this topic.
Please send me feature requests and bug reports.
Source code: bitbucket.org/idler/mmp/src
Overview in English: bitbucket.org/idler/mmp/wiki/Home
Overview in Russian: antonoff.info/development/mysql-migration-with-php-project
Our solution is MySQL Workbench. We regularly reverse-engineer the existing Database into a Model with the appropriate version number. It is then possible to easily perform Diffs between versions as needed. Plus, we get nice EER Diagrams, etc.
At our company we did it this way:
We put all tables / db objects in their own file, like tbl_Foo.sql. The files contain several "parts" that are delimited with
-- part: create
where create is just a descriptive identification for a given part, the file looks like:
-- part: create
IF not exists ...
CREATE TABLE tbl_Foo ...
-- part: addtimestamp
IF not exists ...
BEGIN
ALTER TABLE ...
END
Then we have an XML file that references every single part that we want executed when we update the database to a new schema.
It looks pretty much like this:
<playlist>
<classes>
<class name="table" desc="Table creation" />
<class name="schema" desc="Table optimization" />
</classes>
<dbschema>
<steps db="a_database">
<step file="tbl_Foo.sql" part="create" class="table" />
<step file="tbl_Bar.sql" part="create" class="table" />
</steps>
<steps db="a_database">
<step file="tbl_Foo.sql" part="addtimestamp" class="schema" />
</steps>
</dbschema>
</playlist>
The <classes/> part is for the GUI, and <dbschema/> with <steps/> is to partition the changes. The <step/>s are executed sequentially. We have some other entities, like sqlclr, to do different things like deploying binary files, but that's pretty much it.
Of course we have a component that takes that playlist file and a resource/filesystem object, cross-references the playlist, picks out the wanted parts and then runs them as admin on the database.
Since the "parts" in the .sql files are written so that they can be executed on any version of the DB, we can run all parts on every previous/older version of the DB and bring it up to current.
Of course there are some cases where SQL Server parses column names "early" and we have to later modify parts to become exec_sqls, but it doesn't happen often.
I think this question deserves a modern answer, so I'm going to give it myself. When I wrote the question in 2009, I don't think Phinx existed yet, and Laravel most definitely didn't.
Today, the answer to this question is very clear: write incremental DB migration scripts, each with an up and a down method, and run all these scripts (or a delta of them) when installing or updating your app. And obviously add the migration scripts to your VCS.
As mentioned at the beginning, there are excellent tools in the PHP world today which help you manage your migrations easily. Laravel has DB migrations built in, including the respective shell commands. Everyone else has a similarly powerful, framework-agnostic solution in Phinx.
Both Artisan migrations (Laravel) and Phinx work the same way. For every change in the DB, create a new migration, use plain SQL or the built-in query builder to write the up and down methods, and run artisan migrate resp. phinx migrate in the console.
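As a hedged illustration of what such a migration looks like with Phinx (the class name and column are invented for this example):
<?php
use Phinx\Migration\AbstractMigration;

// Hypothetical example migration: adds a user_type column and can undo it again.
class AddUserTypeToUsers extends AbstractMigration
{
    public function up(): void
    {
        $this->table('users')
             ->addColumn('user_type', 'string', ['default' => 'standard'])
             ->update();
    }

    public function down(): void
    {
        $this->table('users')
             ->removeColumn('user_type')
             ->update();
    }
}
Running phinx migrate applies the pending up() methods in order, and phinx rollback runs the matching down() methods.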
I do something similar to Manos, except I have a 'master' file (master.sql) that I update with some regularity (once every 2 months). Then, for each change, I build a version-named .sql file with the changes. This way I can start off with master.sql, apply each version-named .sql file until I get up to the current version, and update clients using the version-named .sql files to keep things simpler.
