Technique for multiple users on same datasets - php

This is more a learning question than coding, but I'm certain it's a common issue for anyone developing administration systems or applications in php/mysql/js etc.
I've developed quite a complex application that lets users upload images, and define hotspots in them with associated actions. The images are stored in a table, and the actions in another, with json data for every action in a text field. It's a magazine style format that is used by a custom reading application. However, like I say, the problem is generic.
Basically, my fear is that if two people are editing the same image and set of actions at the same time and both submit changes, or if the data was edited by someone else in the meantime, then there's a whole series of structures that could fail on submission.
I don't want to implement a locking system, as the system is very wide ranging (links to other images, etc), and I think it's a bit ugly. I saw this link (MSDN Multi-tenant architecture article) in another question, but it seems a little overwhelming and specialised for sql server.
So - what are the terms for data and system architecture here that I can investigate, or are there some good articles to do with this topic that people can recommend? Specifically for php/web world would be great!
--
I'm still looking for good responses on this question. Found out meanwhile that the general term is 'Concurrency', but technique is the important thing :)

First
ALTER TABLE tablename ADD COLUMN changecount BIGINT NOT NULL DEFAULT 0;
for all relevant tables. Then whenever you want to submit a change, use not only
UPDATE tablename SET whatever WHERE id=whatever
but
UPDATE tablename SET whatever, changecount=changecount+1 WHERE id=whatever AND changecount=the_changecount_you_remembered_from_loading_the_object
Now if a user submits a change, it will increment the changecount. Another user submitting a change to the same object, but loaded from an older state, will match zero rows in the UPDATE (the affected-row count is 0), so they can be told "another user has just changed blah blah".
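To make the changecount idea concrete, here is a minimal sketch. It uses Python with an in-memory SQLite database only so it can run on its own; the `images` table, its `title` column and the `save_image_title` helper are made up for illustration, and the same UPDATE works unchanged against MySQL.

```python
import sqlite3


def save_image_title(conn, image_id, new_title, remembered_changecount):
    """Apply the edit only if nobody else changed the row since it was loaded.

    Returns True when the update went through, False when the row's
    changecount no longer matches the one remembered at load time.
    """
    cur = conn.execute(
        "UPDATE images SET title = ?, changecount = changecount + 1 "
        "WHERE id = ? AND changecount = ?",
        (new_title, image_id, remembered_changecount),
    )
    conn.commit()
    # rowcount is the number of rows the UPDATE actually modified.
    return cur.rowcount == 1


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE images ("
        " id INTEGER PRIMARY KEY, title TEXT,"
        " changecount INTEGER NOT NULL DEFAULT 0)"
    )
    conn.execute("INSERT INTO images (id, title) VALUES (1, 'cover')")

    # Two users load the same row and both remember changecount = 0.
    seen = conn.execute("SELECT changecount FROM images WHERE id = 1").fetchone()[0]
    first_ok = save_image_title(conn, 1, "cover v2", seen)
    second_ok = save_image_title(conn, 1, "cover (other user)", seen)
    return first_ok, second_ok
```

In PHP the equivalent check is the affected-row count of the statement (for example `PDOStatement::rowCount()`); when it comes back as 0, reload the object and show the "someone else changed this" message instead of saving.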

Related

Automated Process to Add New Data to Table

I am working on a website which utilizes a table for presenting old and newly passed laws. As such it requires that we have a large volume of data in the tables, and we constantly have to add more data to the tables.
I know already how to construct the table through CSS and HTML; however, due to the sheer volume of data which we are dealing with, I would like to know if there is a way to create a separate admin page where we can just plug in the law information and have it automatically added to the table rather than having to physically code in all of the information through HTML.
I also have a second question: I would like to add some tabs at the top of the table which allow users to sort laws based on the year they were passed. An example of this can be seen at this site: CT Legislation | 2014 | General Assembly | Passed | LegiScan. It has several tabs at the top which allow users to sort legislation. My question is: what coding language is required to add this to a table?
A CMS may or may not do it for you. What really would be good is to use Parse to hold all your data. Take a look at Data Storage and Cloud Code. You can add new laws whenever you want, and you would configure Parse to dynamically add the data to your table for you.
You can use a variety of languages for this type of solution. If it were all web based, it would likely use PHP for building a password-protected admin page. PHP would also be used along with SQL to send/receive data from the database, and then CSS/HTML could be used for the table and content styling.
For the database you could use MySQL (a type of database).
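As a rough sketch of that database-backed approach: the admin page only has to insert a row, and the public page builds the table from a query, with the year tabs mapping to a WHERE clause. Python and SQLite are used here only to keep the example runnable on its own; the `laws` table and the function names are assumptions, and a PHP/MySQL version would follow the same shape.

```python
import sqlite3
from html import escape


def create_schema(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS laws ("
        " id INTEGER PRIMARY KEY,"
        " title TEXT NOT NULL,"
        " year_passed INTEGER NOT NULL)"
    )


def add_law(conn, title, year_passed):
    """What the admin form would call when a new law is submitted."""
    conn.execute(
        "INSERT INTO laws (title, year_passed) VALUES (?, ?)",
        (title, year_passed),
    )
    conn.commit()


def render_laws_table(conn, year=None):
    """Build the HTML table; passing a year acts like clicking a year tab."""
    if year is None:
        rows = conn.execute(
            "SELECT title, year_passed FROM laws ORDER BY year_passed DESC, title"
        ).fetchall()
    else:
        rows = conn.execute(
            "SELECT title, year_passed FROM laws WHERE year_passed = ? ORDER BY title",
            (year,),
        ).fetchall()
    # Escape the title so stored text cannot inject markup into the page.
    body = "".join(
        f"<tr><td>{escape(title)}</td><td>{yr}</td></tr>" for title, yr in rows
    )
    return f"<table><tr><th>Law</th><th>Year</th></tr>{body}</table>"
```

Each tab would then just be a link such as `?year=2014` that feeds the `year` argument, so no HTML has to be edited by hand when a law is added.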
If this is beyond what you are comfortable with, a content management system (CMS) would be a great option. They set up the entire backend for the user and have an interface that will allow someone who knows zero code or html/CSS to put a pretty decent website together.
The great part about using WordPress is that it lends itself well to someone looking to learn more about development. You can see how things can be set up with code to achieve certain outcomes, and you can learn more and more as you work with the system on increasingly deeper levels.
Another option is using Google Drive. There are tabs and a table, and it is cloud based, so you can share it with the people you want to have access to it. Anyone you choose can add/delete, and it keeps very good track of what is changed and who made the changes. It is really easy to go back and fix things if they have been messed up.

Logging user activities in applications

The problem I'm here to talk about (and ask about, of course) is not new. I searched the web and Stack Overflow and found ideas for many parts of this problem (pros and cons), but some parts are still missing in my mind. So I thought it would be a good idea to share everything in one place (of course it will be more complete with others' ideas) and ask about the rest.
The problem is clear: "We want to log every single action of the user". Once we solve the big problem, smaller ones (like logging only one action) should be a piece of cake.
First, from what I read over the web and Stack Overflow:
Use DB instead of File: That's good advice, although it always depends on the situation. But because of the many benefits of a DB, in the long term and in general, it's the better solution.
DB Layer or Application Layer: It depends. For example, if you really want to monitor everything (I mean every single row that changes in the database), it seems we have one choice: database triggers. There are many discussions around MySQL, though, that say triggers slow down the DB and advise against using them. So depending on the level of detail you need, you can put your logging system in the DB layer or the application layer (for example, some common function call like $logClass->logThis()).
Use Observers: Clean code is always better. If you are familiar with observers, you can use them to do things for you when an action happens, so you don't have to add $logClass->logThis() every time a CRUD operation happens in your application.
What To Log: Simple and short answer is: Based on your needs, but there are some common fields you will need:
user_id (if a unique user ID is available)
timestamp (unix maybe)
ip (not everyone knows how to fake it in the first place, so use it; even a faked one gives you some insight into user behavior)
action_id (should be predefined actions, for more uniform queries and reports)
object_id (the unique row ID of the record that was changed)
action (this is the part my question is about)
and so on...
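Putting the observer idea and the field list above together, a minimal sketch could look like this. It is written in Python with SQLite purely so it runs on its own; `ActivityLogger`, `ProductStore` and the `activity_log` table are hypothetical names, not part of any framework.

```python
import json
import sqlite3
import time


class ActivityLogger:
    """Observer that writes one activity_log row per event it is told about."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS activity_log ("
            " id INTEGER PRIMARY KEY, user_id INTEGER, timestamp INTEGER,"
            " ip TEXT, action_id TEXT, object_id INTEGER, action TEXT)"
        )

    def notify(self, action_id, object_id, details, user_id, ip):
        self.conn.execute(
            "INSERT INTO activity_log"
            " (user_id, timestamp, ip, action_id, object_id, action)"
            " VALUES (?, ?, ?, ?, ?, ?)",
            (user_id, int(time.time()), ip, action_id, object_id, json.dumps(details)),
        )


class ProductStore:
    """CRUD code only announces events; it does not know how they are logged."""

    def __init__(self, conn, observers):
        self.conn = conn
        self.observers = observers
        conn.execute(
            "CREATE TABLE IF NOT EXISTS product (id INTEGER PRIMARY KEY, name TEXT)"
        )

    def insert(self, name, user_id, ip):
        cur = self.conn.execute("INSERT INTO product (name) VALUES (?)", (name,))
        product_id = cur.lastrowid
        for observer in self.observers:
            observer.notify("product.insert", product_id, {"name": name}, user_id, ip)
        # One commit covers the data change and its log row together.
        self.conn.commit()
        return product_id
```

Because the log row is written in the same transaction as the change, a failed insert leaves no orphan log entry; whether that coupling is wanted is a design choice.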
I would appreciate it if anyone corrects me where I made a mistake or adds other useful information to this post, so it can become a good reference for other users.
And now my question: how to store actions? For better understanding, consider the following scenario.
I have a table named "product" and a table named "companies". From the business logic we want to assign products to companies, so we ended up with a table "company_product". Now when a user inserts a new product and simultaneously assigns its companies, 2 tables will be affected (the same goes for delete and update): "product" and "company_product", and we want to know:
what's inserted?
what's deleted?
what's updated to what?
For performance reasons, and because I don't have enough knowledge about triggers, I want to do the logging in the application layer, so I ended up with the idea of saving the action field of the database as an array or JSON structure. But as I developed my solution I encountered a problem: how to make this log understandable for non-technical users? For example, I want to save something like this in the action field of the database when deleting (or inserting) the product with id 20:
action : [{id: 20, product_id:2, company_id: 1},{id: 21, product_id:2, company_id: 2}]
And this is not something that is easy for everyone to read and understand. I could make this JSON more readable and turn it into something like this:
action : {'Product A Deleted From Company X', 'Product A Deleted From Company Y'}
and save the previous action in a technical_action field for further diagnosis. But that needs additional work and more queries to run for something (the log) that is not always needed.
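One possible way around the extra write-time cost, sketched here only as an illustration and not as something taken from the post: store just the technical JSON, and build the readable text when somebody actually opens the log. The `describe_action` helper, the action ids and the name lookups are all hypothetical, and the example is in Python so it can run standalone.

```python
import json

# Hypothetical mapping from predefined action ids to readable verbs.
ACTION_VERBS = {
    "company_product.insert": "Added To",
    "company_product.delete": "Deleted From",
}


def describe_action(action_id, technical_action, product_names, company_names):
    """Turn the stored JSON rows into sentences for non-technical readers.

    product_names and company_names are id -> name lookups that the log
    viewer would load once per page, not once per log entry.
    """
    verb = ACTION_VERBS[action_id]
    return [
        f"{product_names[row['product_id']]} {verb} {company_names[row['company_id']]}"
        for row in json.loads(technical_action)
    ]
```

The trade-off is that names are resolved at read time, so a product renamed later shows up under its new name in old log entries. If the log must show the name as it was at the time of the action, the readable text has to be stored when the action happens.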
I would appreciate any additional information on this article (I'm definitely sure that there exist other criteria that can be discussed), and answer to my question.
What you are gathering here is really data for analytics.
You would be better off with flat tables than with relational tables.
If you want to do more analysis later, relational tables will not be a good choice, because they perform poorly for that kind of work.

Database structure for "flag as spam" functionality

I have created a webapp with php/mysql.
In my application I have different sections where users submit content, like photos, news, stories, videos etc.
All of these are separate sections with their own story details pages. I want to add a "Flag as Spam" function to all sections, but I am confused about the database design. Should I create a separate table for every section, such as video_spam or photo_spam, or should I go with one table, spam_contents, which will contain the following columns?
SpamId - unique id for the table
ByUserId - Who marked it as spam
SectionName - will be 'news', 'video', 'stories' etc.
Reason - Reason for which user marked it as spam
ContentId - This will contain photoid or videoid or newsid
Date - The day user marked content as spam.
If I need to fetch all content of video section, which is marked as spam by users then I can get it on the basis of SectionName and ContentId.
Is this a good approach, or does anyone have a better solution for this scenario?
Please help, Thanks!
Unless there's something unique to "video spam", or something unique to "photo spam", etc., you're almost certainly better off with a single table.
Your situation is similar to this supertype/subtype issue. See my reply to that question, too.
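For illustration, the single-table design from the question could be sketched like this. Python and SQLite are used only to make it runnable; the column names come from the question, while the UNIQUE constraint (one flag per user per item) and the helper functions are my own assumptions.

```python
import sqlite3


def create_spam_table(conn):
    conn.execute(
        "CREATE TABLE spam_contents ("
        " SpamId INTEGER PRIMARY KEY,"
        " ByUserId INTEGER NOT NULL,"
        " SectionName TEXT NOT NULL,"
        " Reason TEXT,"
        " ContentId INTEGER NOT NULL,"
        " Date TEXT NOT NULL DEFAULT CURRENT_DATE,"
        # Assumption: a user can flag a given item only once.
        " UNIQUE (ByUserId, SectionName, ContentId))"
    )


def flag_as_spam(conn, user_id, section, content_id, reason):
    # OR IGNORE silently drops a repeat flag from the same user.
    conn.execute(
        "INSERT OR IGNORE INTO spam_contents"
        " (ByUserId, SectionName, ContentId, Reason) VALUES (?, ?, ?, ?)",
        (user_id, section, content_id, reason),
    )
    conn.commit()


def flagged_content(conn, section):
    """Return (ContentId, number_of_flags) for one section."""
    return conn.execute(
        "SELECT ContentId, COUNT(*) FROM spam_contents"
        " WHERE SectionName = ? GROUP BY ContentId ORDER BY ContentId",
        (section,),
    ).fetchall()
```

Note that ContentId cannot be a real foreign key here, since it points at a different table depending on SectionName; that is the usual price of the single-table layout.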
I believe this looks like the best way. Having a centralized collector with a single purpose is a design plus, imho. You can surely add some more fields to each table instead (e.g. video_table also gets 'spam_flag', 'flag_by', 'flag_date' and so on along these lines), but I think this, apart from multiplying your work at creation time, may have significant drawbacks whenever you need to make adjustments or changes to the system.
And, by the way, I've seen this structure implemented in a couple of well-known open source bulletin boards for reported messages and similar, so I believe it's a valid and optimized design.
Alternatively, if you are in a good mood, you could also do both: something 'detailed' pertaining to each table, and a centralized structure as a sort of 'admin panel report'.

Multi-language social website - Database driven?

There is a lot of multi-language content to store. Should it be stored in the database or in files? And what is the basic way to approach this? We have page content, reference tables, page title bars, metadata, etc. So will every table have additional columns for each language? So if there are 50 languages (number will keep growing as this is a worldwide social site, so eventual goal is to have as many languages as possible) then 50 extra columns per table? Or is there a better way?
There is a mixture of dynamic system and user content + static content.
Scalability and performance are important. Being developed in PHP and MySQL.
User will be able to change language on any page from the footer. Language can be either session based or preference based. Not sure what is a better route?
If you have a variable number of languages, essentially unknown today, then this definitely should NOT be multiple columns in a record. Basically the search key on this table should be something like message id plus language id, or maybe screen id plus message id plus language id. Then you have a separate record for each language for each message.
If you try to cram all the languages into one record, your maintenance will become a nightmare. Every time you add another language to the app, you will have to go through every program to add "else if language=='Tagalog' then text=column62" or whatever. Make it part of the search key and then you're just reading "where messageId='Foobar' and language=current_language", and you pass the current language around. If you have a new language, nothing should have to change except adding the new language to the list of valid language codes some place.
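A small sketch of that keyed layout, in Python with SQLite so it can run by itself. The `translations` table and the `translate` helper are hypothetical, and the fallback to English is my own addition rather than part of the answer.

```python
import sqlite3


def create_translations(conn):
    conn.execute(
        "CREATE TABLE translations ("
        " message_id TEXT NOT NULL,"
        " language TEXT NOT NULL,"
        " text TEXT NOT NULL,"
        # The language is part of the key, not a column per language.
        " PRIMARY KEY (message_id, language))"
    )


def translate(conn, message_id, language, fallback="en"):
    """Look up one message; fall back to another language, then to the id."""
    query = "SELECT text FROM translations WHERE message_id = ? AND language = ?"
    row = conn.execute(query, (message_id, language)).fetchone()
    if row is None and language != fallback:
        row = conn.execute(query, (message_id, fallback)).fetchone()
    return row[0] if row is not None else message_id
```

Adding a new language is then only a matter of inserting rows; neither the schema nor the lookup code changes.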
So really the question is:
blah blah blah. Should I keep my data in flat files or a database?
Short answer is whichever you find easier to work with. Depending on how you structure it, the file based approach can be faster than the database approach. OTOH, get it wrong and performance impact will be huge. The database approach enforces more consistent structure from the start. So if you make it up as you go along, then the database approach will probably pay off in the long run.
eventual goal is to have as many languages as possible) then 50 extra columns per table?
No.
If you need to change your database schema (or the file structure) every time you add a new language (or new content) then your schema is wrong. If you don't understand how to model data properly then I'd strongly recommend the database approach for the reasons given.
You should also learn how to normalize your data - even if you ultimately choose to use a non-relational database for keeping the data in.
You may find this useful:
PHP UTF-8 cheatsheet
The article describes how to design the database for multi-lingual website and the php functions to be used.
Definitely start with a well-defined model so your design doesn't care whether data comes from a file, a db or even memcache or something like that. It is probably best to do a single call per page to get an object that contains all the fields for that single page, rather than multiple calls. Then you can just reference that single returned object to get each localised field. Behind the scenes you could then code the repository access and test it. Personally I'd probably go with the DB approach over files: you don't have to worry about concurrent file access, and it's probably easier to deploy changes. Again, you don't have to worry about files being locked by reads when you're deploying new files, it's just a db update.
See this link about php ioc; it might help you, as it would allow you to abstract away from your code what type of repository is used to hold the data. That way, if you go with one approach and later want to change it, you won't have to do so much rework.
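The "single call per page" idea could look roughly like this. It is a Python/SQLite sketch so that it runs standalone; the `page_texts` table and the `load_page_texts` function are assumed names.

```python
import sqlite3


def create_page_texts(conn):
    conn.execute(
        "CREATE TABLE page_texts ("
        " screen_id TEXT NOT NULL,"
        " message_id TEXT NOT NULL,"
        " language TEXT NOT NULL,"
        " text TEXT NOT NULL,"
        " PRIMARY KEY (screen_id, message_id, language))"
    )


def load_page_texts(conn, screen_id, language):
    """Fetch every localised string for one page in a single query."""
    rows = conn.execute(
        "SELECT message_id, text FROM page_texts"
        " WHERE screen_id = ? AND language = ?",
        (screen_id, language),
    ).fetchall()
    # The page template only ever sees this dict, not the data source.
    return dict(rows)
```

Since callers only receive a plain dict, the storage behind `load_page_texts` could later be swapped for files or a cache without touching the page code.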
There's no reason you need to stick with one data source for all "content". There is dynamic content that will be regularly added to or updated, and then there is relatively static content that only rarely gets modified. Then there is peripheral content, like system messages and menu text, vs. primary content—what users are actually here to see. You will rarely need to search or index your peripheral content, whereas you probably do want to be able to run queries on your primary content.
Dynamic content and primary content should be placed in the database in most cases. Static peripheral content can be placed in the database or not. There's no point in putting it in the database if the site is being maintained by a professional web developer who will likely find it more convenient to just edit a .pot or .po file directly using command-line tools.
Search SO for the tags i18n and l10n for more info on implementing internationalization/localization. As for how to design a database schema, that is a subject deserving of its own question. I would search for questions on normalization as suggested by symcbean as well as look up some tutorials on database design.

resources for designing a good content publishing system

The CMS I'm currently working with only supports live editing of data (news, events, blogs, files, etc), and I've been asked to build a system that supports drafting (with moderation) plus a revision history system. The CMS I'm using was developed in house, so I'll probably have to code it from scratch.
At every save of an item it would create a snapshot of the data in a "timeline". The same would go for drafts. Automated functionality would pull the timeline draft into the originating record when required.
The timeline table would store the data type and primary key, a serialised version of the data, created/modified dates, and a drafting date (if in the future).
I've had a quick look around at other systems, but I've yet to improve from my current idea.
I'm sure someone has already built a system like this and I would like to improve on my design before I start building. Any good articles/resources would help as well.
Thanks
I think using serialize() to encode each row into a single string, then saving that to a central database may be a solution.
You'd have your 'live' database with relevant tables etc., but when you edit or create something (without clicking publish) it would instead of being saved in your main table go into a table like:
id - PRI INT
date - DATETIME
table - VARCHAR
table_id - INT
type - ENUM('UNPUBLISHED','ARCHIVED','DELETED');
data - TEXT/BLOB
...with the type set to 'UNPUBLISHED' and the table and table_id stored so it knows where it came from. Clicking publish would then serialize the current table's contents, store it in the above table with the type set to 'ARCHIVED', then read out the latest change (marked as 'UNPUBLISHED') and place this in the database. The same could also apply to deleting rows: place them in and mark them as 'DELETED' for potential undelete/rollback functionality.
It'll require quite a lot of legwork to get it all working, but should provide full publish/unpublish and rollback facilities. Integrated correctly into custom database functions it may also be possible to do all this transparently (from a SQL point of view).
I have been planning on implementing this as a solution to the same problem you appear to have. It's still theoretical from my point of view, but I reckon the idea is sound.
This sounds very wiki-like to me. You may want to look at MediaWiki, the system used by Wikipedia, which also uses PHP and MySQL.
DotNetNuke is a good open source CMS; you could read the source for that system to get ideas. Or you could simply use DotNetNuke.
http://www.dotnetnuke.com/
I think that there are many systems out there that would support this functionality out of the box. Although I don't know all your considerations for doing a custom build, consider looking at some of these. It is very likely that they will be able to support what you need, and then some.
Consider having a look at Drupal, I think still the leading CMS for publishing. Drupal in combination with the workflow module contains all that you need:
http://drupal.org
http://drupal.org/project/workflow
And add save draft for usability:
http://drupal.org/project/save_draft
