Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 8 years ago.
I finished creating an accounting web application for an organization using CodeIgniter and a MySQL database, and I have just submitted it to them. They liked the work, but they asked how they would transfer their old manual data to the new online system, so that their members would be able to see their account balances and contribution history.
This is a major problem for me because most of my tables make use of 'referential integrity' to ensure data synchronization and would not support the style of their manual accounting.
I know a lot of people here have faced cases like this, and I would love to know the best way to collect users' history. I also know this might be flagged as not a real question, but I really have to ask people with experience.
I would appreciate all answers. Thanks (and downvotes too).
No matter what the case is, data conversions are very challenging and almost always time consuming. Depending on the consistency of the data in question, it could be a case that about 80% of the data will transfer over neatly if you create a conversion program using PHP. That conversion code in and of itself may be more time consuming than it is worth. If you are talking hundreds of thousands of records and beyond, it is probably a good idea to make that conversion program work. Anyone who might suggest there is a silver bullet is certainly not correct.
Here are a couple of suggested steps:
(Optional) Export your Excel spreadsheets to Access. Access can help you to standardize data and has tools in place to help you locate records which have failed in some way. You can also create filters in Access if you need to. The benefit of taking this step, if you are familiar with Access, is that you have already begun the conversion process to a database. As a matter of fact, if you so desire, you can import your MySQL database information into Access as well. The benefit of this is pretty obvious: You can create a query and merge your two separate tables together to form one table, which could save you a great deal of coding.
Export your Access table/query into a CSV file (note, if you find it is overkill or if you don't have Access, you can skip step 1 and simply save your .xls or .xlsx file to type .csv. This may require more legwork for your PHP conversion code but that is probably a matter of preference. Some people prefer to avoid Access as much as possible, and if you don't normally use it you will be wasting time trying to learn it just to save yourself a little bit of time).
Use PHP's built-in CSV functions. str_getcsv parses a CSV string (such as a single line) into a PHP array; fgetcsv does the same while reading rows directly from an open file.
Create your automated program to parse through each record. Based on the column and its requirements, you can either accept or reject records. You can then export your data, such as was done in this SO answer, back to CSV. You can save two different CSV files, one with accepted records, and one with rejected records.
With rejected records, which are all but inevitable when transferring from a spreadsheet, you will need to have a course of action. The simplest way for your clients is probably to give them a procedure to either manually import records into the database, if you've given them an interface to do so, or - probably simpler but requiring more back-and-forth - to update the records in Excel to be compliant with the new system.
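The accept/reject pass described in the steps above could look roughly like the sketch below. It is written in Python rather than PHP purely for illustration, and the column names (member_id, date, amount) and validation rules are assumptions, since the post does not describe the real spreadsheet layout.

```python
import csv
import io
from decimal import Decimal, InvalidOperation

# Hypothetical required columns; the real layout is not given in the post.
REQUIRED = ("member_id", "date", "amount")


def split_rows(csv_text):
    """Return (accepted, rejected) lists of row dicts parsed from CSV text."""
    accepted, rejected = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Reject rows where a required column is missing or blank.
        if any(not (row.get(col) or "").strip() for col in REQUIRED):
            rejected.append(row)
            continue
        # Reject rows whose amount is not a valid decimal number.
        try:
            Decimal(row["amount"].strip())
        except InvalidOperation:
            rejected.append(row)
            continue
        accepted.append(row)
    return accepted, rejected


def to_csv(rows, fieldnames):
    """Serialize row dicts back to CSV text, e.g. to return rejects to the client."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

Writing the rejected rows back out with to_csv gives the client a file they can correct in Excel and resubmit, which matches the back-and-forth procedure described above.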
Edit
Based on the thread under your question which sheds more light on what you are trying to do (i.e., have a parent for each transaction that is an accepted loan), you should be able to contrive a parent field, even if it is not complete, by creating a parent record for each set of transactions based around an account. You can do this via Access, PHP, or, more likely, a combination.
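As a rough illustration of contriving a parent, the sketch below creates one placeholder loan per account and hangs each legacy transaction off it, so the foreign key is satisfied. It uses Python with SQLite only to stay self-contained; the table and column names (loans, transactions, account_no, is_legacy) are hypothetical, and it assumes one loan per account, which may not hold for the real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce referential integrity
conn.executescript("""
CREATE TABLE loans (
    id INTEGER PRIMARY KEY,
    account_no TEXT NOT NULL UNIQUE,
    is_legacy INTEGER NOT NULL DEFAULT 0   -- marks contrived parent records
);
CREATE TABLE transactions (
    id INTEGER PRIMARY KEY,
    loan_id INTEGER NOT NULL REFERENCES loans(id),
    amount NUMERIC NOT NULL
);
""")


def import_legacy(conn, rows):
    """rows: iterable of (account_no, amount) taken from the cleaned CSV."""
    with conn:  # one transaction: all rows are imported, or none
        for account_no, amount in rows:
            # Create the placeholder parent only if the account has none yet.
            conn.execute(
                "INSERT OR IGNORE INTO loans (account_no, is_legacy) VALUES (?, 1)",
                (account_no,),
            )
            loan_id = conn.execute(
                "SELECT id FROM loans WHERE account_no = ?", (account_no,)
            ).fetchone()[0]
            conn.execute(
                "INSERT INTO transactions (loan_id, amount) VALUES (?, ?)",
                (loan_id, amount),
            )


import_legacy(conn, [("A-1", 100), ("A-1", 50), ("B-7", 20)])
```

Flagging the contrived parents makes it possible to tell them apart later from loans that were actually approved through the new system.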
Conclusion
Long story short, data conversions take time. If you put the time in up front, it will be far easier to maintain a standardized series of information in the long run. If you find something which takes less time in the beginning, it will mean additional work for you in the long run in order to make this "simple" fix work over time.
Similarly, the closer you can get legacy data to conform to your new data, the easier it will be for your clients to perform queries etc. While this may mean that some manual entry will be required on the part of you or your client, it is better to inform the client of the pros and cons of each method fully and let them decide. My recommendation would always be to put extra work in at the front-end because it almost always ends up cheaper than having to deal with a quick fix in the long run, but that is not always practical given real world constraints.
Closed last year.
Currently, I am working on a website and have just started studying backend development. I wonder why nobody uses JSON as a database. Also, I don't quite get the utility of PHP and SQL. Since I could easily get data from a JSON file and use it, why do I need PHP and SQL?
OK, let's assume you put the data in a JSON variable and store it in a file for all your projects.
Obviously, you need a subsystem for taking backups, so you will write it.
You must improve performance for handling very large amounts of data with things like indexing and hash algorithms; assume you handle that too.
If you need APIs for connecting from a variety of programming languages, you need to write them.
What about functionality? What if you need triggers, stored procedures, views, full-text search, and so on? OK, you will spend your time and add them.
Good job, but your system will grow and you will need to scale it. Can you do it? You will write support for clustering across servers, sharding, and so on.
Now you need to guarantee that your system complies with the ACID properties: Atomicity, Consistency, Isolation, and Durability.
Can you always handle all querying techniques (such as Map/Reduce) and respond quickly with a standard structure?
Now it's time to offer very fast writes, which brings serious issues of its own.
Now prepare your solutions for race conditions, isolation levels, locking, relations, and so on.
After you do all this work, plus a thousand other things, you will probably have a DBMS a little bit like MongoDB or other relational and non-relational databases!
So it's better to use them. Obviously, you can choose not to use them too. I admit that sometimes saving data in a single file performs better, but only sometimes, in some cases, with some data, for some purposes! If you know exactly what you are doing, then it's OK to save data in a JSON file.
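To make the atomicity point a bit more concrete, here is a minimal sketch of what an existing database gives you for free. It uses Python's bundled SQLite module as a stand-in for any DBMS, and the accounts table and transfer function are made up for the example: a two-step update either applies completely or is rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
with conn:
    conn.executemany(
        "INSERT INTO accounts (id, balance) VALUES (?, ?)", [(1, 100), (2, 50)]
    )


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates apply, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            # Violates the CHECK constraint if src cannot cover the amount.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return [row[0] for row in conn.execute("SELECT balance FROM accounts ORDER BY id")]
```

With a plain JSON file, the equivalent guarantee (the credit never survives without the debit) is something you would have to design and test yourself.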
Closed 1 year ago.
All the answers about this question assume you're storing all of your user's data in one big file - and so they talk about how that is too slow.
Let's say I have thousands of users and store their data in JSON format in separate files (which I am currently doing). What is the downside to that, as opposed to setting up a proper database like PostgreSQL, which seems like overkill?
The speed is great on my current setup, but I am advised against doing this.
Since each user has their own separate file, there isn't really an issue of hundreds of people writing to the file at the same time (isolation).
Maybe it only matters for sites with millions of users?
In most systems, the users don't merely have to exist, they have to do stuff. And that stuff would generally be represented in a database. So you want the users to exist in the same system where the things they interact with exist.
What happens if your system crashes (power failure, for example) when a json file is half-way written out? Will you be left with a broken JSON file for that user? With databases, that should be taken care of automatically (you find either the old record, or the new one, not some truncation or mishmash). If you roll your own database, you will have to go some way out of your way to verify that you do this in a safe manner.
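If you do stay with one file per user, one common way to avoid half-written files is to write to a temporary file in the same directory and then rename it over the original. The sketch below shows the idea in Python; save_user is a hypothetical helper, and it only covers crash safety for a single writer, not two writers racing each other.

```python
import json
import os
import tempfile


def save_user(path, data):
    """Write data as JSON so readers see either the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(data, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # atomically swap in the complete file
    except BaseException:
        # Clean up the partial temp file; the original is left untouched.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

This is one of the things a database already does for you, which is the point being made above.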
How do you name your user files? By the user's name? What if different people have the same name? What if their name has characters that can't be represented in file names? By an account number you assign? What happens if they forget their account number and need to look it up by their human name? Do you then need to read and parse every user file to identify the correct one? Not that a database will magically make this free, but at least with a database you can just build an index without first having to invent and implement one.
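For comparison, here is roughly what the database version of that lookup looks like, sketched with Python's bundled SQLite module. The users table, the profile_json column, and the helper names are assumptions for the example. Names need not be unique or safe for file names, because the assigned id is the key and the index handles lookup by name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id INTEGER PRIMARY KEY,          -- assigned account number
    full_name TEXT NOT NULL,         -- any characters, duplicates allowed
    profile_json TEXT NOT NULL       -- the per-user JSON can still live here
);
CREATE INDEX idx_users_full_name ON users (full_name);
""")


def add_user(conn, full_name, profile_json="{}"):
    """Insert a user and return the assigned account number."""
    with conn:
        cur = conn.execute(
            "INSERT INTO users (full_name, profile_json) VALUES (?, ?)",
            (full_name, profile_json),
        )
    return cur.lastrowid


def find_ids_by_name(conn, full_name):
    """Look up account numbers by human name without scanning every record."""
    rows = conn.execute(
        "SELECT id FROM users WHERE full_name = ? ORDER BY id", (full_name,)
    )
    return [row[0] for row in rows]
```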
You are basically reimplementing a database system from scratch, one feature at a time, as you discover the need for that feature. You can do it, sure. But why not use one that already exists?
Since each user has their own separate file, there isn't really an issue of hundreds of people writing to the file at the same time (isolation).
What if one person writes to one file at the same time from two different browsers (or tabs)?
There is no absolute right or wrong.
If you will never need to take care of concurrent access to the same record (file) or you don't need to search through your records or scale to multiple servers, the solution is fine and even faster than accessing a database.
I would just recommend to properly escape the user provided data, as JSON
Closed 9 years ago.
My boss asked me to create an easier system for finding points by having points associated with the user table in our MySQL database. The old system just had events and their point values, then another table with events completed for a user, and then another table just for admin-given points. So my job was to add these all together and put them in a column. Now he says the problem is that all the existing queries are still adding points in the old tables. Instead of changing them to simply add points to the user's column upon task completion, they suggested I use a trigger to add points to the user's column whenever one of the other tables has points added to it.
To me this sounds like using a workaround and creating technical debt. Am I wrong?
I'm new to the system, and I don't know exactly where all the queries are in the PHP pages, but if this is creating technical debt, what would be the appropriate way to fix this?
I'm new and am probably going to just use SQL triggers so as not to go against my boss's suggestions, but I want to at least know the smart/best way to do things.
Doing my best to provide not actual, but near actual db schema
EVENT: ID, point value, Desc
User-Events: USERID, EVENTID, COMPLETION-STATUS
GIVEN-POINTS: USERID, POINTS_GIVEN, DESC (a row is added each time points are given, so it's more of a log than a running total)
I added a Points column to the basic USER TABLE
The trigger would fire when the User-Events completion status becomes done: find the event's point value and add it to the user's points, instead of changing the queries to do that.
Triggers are a perfectly valid way to accomplish what you are trying to do, as long as the business rules are fairly simple.
There are lots of ways to accomplish moving data from one table to another. You can use triggers, some sort of synchronous PHP process or an asynchronous process using some sort of message queue.
Triggers have the benefit of being simple and fast to code, maintain, and run. The upside is that you only have to do the code once, which is especially nice since you don't know where all the queries that touch these tables are. The downside is that you could be putting business logic into the database, which is where you might start getting into technical debt. The other downside is simply that you've added another business layer, which might not be obvious to the next developer, so they might spend a lot of time trying to figure out how and why the summary table is being updated. Comments are a good thing, in this case.
Synchronous PHP processes are nice in that it's very obvious where the code is being executed. The other upside is that you have access to the whole PHP application context and can create more complex business rules. The downside is that you will have to put the function or method call into each place where the table is potentially being touched.
Asynchronous PHP processes have the same up and downsides as the synchronous PHP processes, with the added benefit that they aren't going to slow down the user experience. They are also a little more complex to create; you have to handle cases where the messages aren't received, or aren't received in the correct order.
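For the trigger option, a minimal sketch of the idea follows. It is run through Python's bundled SQLite module so it is self-contained, and the table and column names are guesses based on the schema in the question. SQLite's trigger syntax stands in for MySQL's here: MySQL requires FOR EACH ROW and has no WHEN clause, so the condition would go in an IF inside the trigger body instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, points INTEGER NOT NULL DEFAULT 0);
CREATE TABLE events (id INTEGER PRIMARY KEY, point_value INTEGER NOT NULL);
CREATE TABLE user_events (
    user_id INTEGER NOT NULL,
    event_id INTEGER NOT NULL,
    completion_status TEXT NOT NULL DEFAULT 'pending'
);
CREATE TABLE given_points (user_id INTEGER NOT NULL, points_given INTEGER NOT NULL);

-- Add the event's points once, when the status changes to 'done'.
CREATE TRIGGER trg_event_done
AFTER UPDATE OF completion_status ON user_events
WHEN NEW.completion_status = 'done' AND OLD.completion_status <> 'done'
BEGIN
    UPDATE users
    SET points = points + (SELECT point_value FROM events WHERE id = NEW.event_id)
    WHERE id = NEW.user_id;
END;

-- Add admin-given points whenever a log row is inserted.
CREATE TRIGGER trg_points_given
AFTER INSERT ON given_points
BEGIN
    UPDATE users SET points = points + NEW.points_given WHERE id = NEW.user_id;
END;

INSERT INTO users (id) VALUES (1);
INSERT INTO events (id, point_value) VALUES (10, 25);
INSERT INTO user_events (user_id, event_id) VALUES (1, 10);
""")


def points(conn, user_id):
    """Return the running total kept up to date by the triggers."""
    return conn.execute(
        "SELECT points FROM users WHERE id = ?", (user_id,)
    ).fetchone()[0]
```

The existing queries keep writing to user_events and given_points as before; the total in users follows along. A comment next to the triggers saying so would help the next developer, as noted above.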
Need some ideas/help on best way to approach a new data system design. Basically, the way this will work is there will be a bunch of different database/tables that will need to be updated on a regular (daily/weekly/monthly) basis with new records.
The people that will be imputing the data will be proficient in Excel. The input process will be done via a simple upload form. Then the system needs to add what was imported to the existing data in the databases. There needs to be a "rollback" process that will reset the database to any day within the last week.
There will be approximately 30 to 50 different data sources. The primary interface will be an online search area, so all of the records need to be indexed and searchable.
Ideas/thoughts on how to best approach this? It needs to be built mostly out of php/mysql.
imputing the data
Typo?
What you are asking takes people with several years formal training to do. Conventionally, the approach would be to draw up a set of requirements, then a set of formal specifications, then the architecture of the system would be designed, then the data design, then the code implementation. There are other approaches which tend to shortcut this. However even in the case of a single table (although it does not necessarily follow that one "simple upload form" corresponds to one table), with a single developer there's a couple of days work before any part of the design could be finalised, the majority of which is finding out what the system is supposed to do. But you've given no indication of the usage nor data complexity of the system.
Also what do you mean by upload? That implies they'll be manipulating the data elsewhere and uploading files rather than inputting values directly.
You can't adequately describe the functionality of a complete system in a 9 line SO post.
You're unlikely to find people here to do your work for free.
You're not going to get the information you're asking for in a S.O. answer.
You seem to be struggling to use the right language to describe the facts you know.
Your question is very vague.
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 3 years ago.
I want to keep logs of some things that people do in my app, in some cases so that it can be undone if needed.
Is it best to store such logs in a file or a database? I'm completely at a loss as to what the pros and cons are except that it's another table to setup.
Is there a third (or fourth etc) option that I'm not aware of that I should look into and learn about?
There is at least one definite reason to store logs in the database. You can use INSERT DELAYED in MySQL (or similar constructs in other databases), which returns immediately. You won't get any return data from the database with these kinds of queries, and they are not guaranteed to be applied. Note that INSERT DELAYED was deprecated in MySQL 5.6 and is accepted but ignored from MySQL 5.7 onward, and it never worked with InnoDB tables, so this advice only fits older setups using engines such as MyISAM.
By using INSERT DELAYED, you won't slow down your app too much because of the logging. The database is free to write the INSERTs to disk at any time, so it can bundle a bunch of inserts together.
You need to watch out when using MySQL's built-in time functions (like CURRENT_TIMESTAMP or CURDATE()), because they will be evaluated whenever the query is actually executed. So you should make sure that any time data is generated in your programming language, and not by the database. (This paragraph might be MySQL-specific.)
You will almost certainly want to use a database for flexible, record based access and to take advantage of the database's ability to handle concurrent data access. If you need to track information that may need to be undone, having it in a structured format is a benefit, as is having the ability to update a row indicating when and by whom a given transaction has been undone.
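A structured log with undo tracking might look something like the sketch below. It uses Python's bundled SQLite module to stay self-contained; the action_log table, its columns, and the helper functions are hypothetical. The timestamp is generated in application code rather than by the database.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE action_log (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    action TEXT NOT NULL,
    payload TEXT,                -- whatever is needed to reverse the action
    created_at TEXT NOT NULL,
    undone_at TEXT,              -- NULL until the action is undone
    undone_by INTEGER
)
""")


def now():
    """Timestamp produced by the application, in UTC ISO 8601 form."""
    return datetime.now(timezone.utc).isoformat()


def log_action(conn, user_id, action, payload=None):
    """Record an action and return the id of its log row."""
    with conn:
        cur = conn.execute(
            "INSERT INTO action_log (user_id, action, payload, created_at)"
            " VALUES (?, ?, ?, ?)",
            (user_id, action, payload, now()),
        )
    return cur.lastrowid


def mark_undone(conn, log_id, undone_by):
    """Flag a log row as undone; returns False if it was already undone."""
    with conn:
        cur = conn.execute(
            "UPDATE action_log SET undone_at = ?, undone_by = ?"
            " WHERE id = ? AND undone_at IS NULL",
            (now(), undone_by, log_id),
        )
    return cur.rowcount == 1
```

The `undone_at IS NULL` condition means two people trying to undo the same action at once cannot both succeed, which is the kind of concurrent-access handling that is awkward to get right with a flat file.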
You likely only want to write to a file if very high performance is an issue, or if you have very unstructured or large amounts of data per record that might be unwieldy to store in a database. Note that unless your application has a very large number of transactions, database speed is unlikely to be an issue. Also note that if you are working with a file you'll need to handle concurrent access (read / write / locking) very carefully, which is likely not something you want to have to deal with.
I'm a big fan of log4php. It gives you a standard interface for logging actions. It's based on log4j. The library loads a central config file, so you never need to change your code to change logging. It also offers several log targets, like files, syslog, databases, etc.
I'd use a database simply for maintainability. Also, multiple simultaneous edits to a file may cause some entries to be missed.
I will second both of the above suggestions and add that file locking on a flat file log may cause issues when there are a lot of users.