Currently, I am working on a website and have just started studying backend development. I wonder why nobody uses JSON as a database. I also don't quite get the utility of PHP and SQL: since I could easily read data from a JSON file and use it, why do I need PHP and SQL?
OK, let's assume you put the data in a JSON structure and store it in a file for all your projects.
Obviously, you need a subsystem for taking backups, so you write one.
You must improve performance for handling very large amounts of data (indexing, hash-based lookups, and so on), so assume you handle that too.
If you need APIs for working with and connecting from a variety of programming languages, you need to write them.
What about functionality? What if you need to add triggers, stored procedures, views, full-text search and so on? OK, you invest the time and add them.
OK, good job, but your system will grow and you will need to scale it. Can you? You will have to build support for clustering across servers, sharding, and more.
Now you need to guarantee that your system complies with the ACID properties: Atomicity, Consistency, Isolation, and Durability.
Can you always handle every querying technique (e.g. Map/Reduce) and respond quickly with a standard result structure?
Now it's time to offer very fast write speeds, which brings serious issues of its own.
Next, prepare your solutions for race conditions, isolation levels, locking, relations, and so on.
After you have done all this work, plus thousands of other things, you will probably have a DBMS a little bit like MongoDB or one of the other relational or non-relational databases!
So it's better to use them. Obviously, you can also choose not to use them; I admit that sometimes saving data in a single file performs better, but only sometimes, in some cases, with some data, for some purposes. If you know exactly what you are doing, it is OK to save data in a JSON file.
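To make the concurrency point concrete, here is a minimal sketch (not from the answer above; the data.json path and the helper name are illustrative) of the read-modify-write cycle a JSON "database" forces you to implement yourself, with file locking as the only defence against two requests overwriting each other:

```php
<?php
// Naive JSON-file "database": every write rewrites the whole file, and
// flock() is the only thing preventing two PHP requests from losing each
// other's changes. File name and function name are illustrative.

function json_db_update(string $path, callable $mutator): void
{
    $fp = fopen($path, 'c+');            // create if missing, read/write
    if ($fp === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        flock($fp, LOCK_EX);             // exclusive lock for the whole cycle
        $raw  = stream_get_contents($fp);
        $data = $raw === '' ? [] : json_decode($raw, true);

        $data = $mutator($data);         // apply the change in memory

        ftruncate($fp, 0);               // rewrite the entire file every time
        rewind($fp);
        fwrite($fp, json_encode($data, JSON_PRETTY_PRINT));
        fflush($fp);
    } finally {
        flock($fp, LOCK_UN);
        fclose($fp);
    }
}

// Usage: append one record. There is no indexing, no partial update and no
// crash recovery beyond whatever the filesystem happens to give you.
json_db_update('data.json', function (array $data): array {
    $data[] = ['user' => 'alice', 'balance' => 100];
    return $data;
});
```

Every concern listed above (backups, indexing, scaling, crash recovery, concurrent access) is something you would have to bolt onto this by hand, and something a DBMS already provides.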
I finished creating an accounting web application for an organization using CodeIgniter and a MySQL database, and I have just submitted it to them. They liked the work, but they asked me how they would transfer their old manual data to the new online system, so that their members would be able to see their account balances and contribution history.
This is a major problem for me because most of my tables use referential integrity to keep the data consistent, and they would not support the style of their manual accounting.
I know a lot of people here have faced cases like this, and I would love to know the best way to collect the users' history. I also know this might be flagged as not a real question, but I really have to ask people with experience.
I would appreciate all answers. Thanks (and downvotes too)..
No matter what the case is, data conversions are very challenging and almost always time consuming. Depending on the consistency of the data in question, it could be the case that about 80% of the data will transfer over neatly if you create a conversion program using PHP. That conversion code, in and of itself, may be more time consuming than it is worth. If you are talking about hundreds of thousands of records and beyond, it is probably a good idea to make that conversion program work. Anyone who suggests there is a silver bullet is certainly not correct.
Here are a couple of suggested steps:
1. (Optional) Export your Excel spreadsheets to Access. Access can help you standardize data and has tools for locating records which have failed in some way. You can also create filters in Access if you need to. The benefit of taking this step, if you are familiar with Access, is that you have already begun the conversion process to a database. As a matter of fact, you can import your MySQL database information into Access as well; the benefit of this is pretty obvious: you can create a query and merge your two separate tables into one, which could save you a great deal of coding.
2. Export your Access table/query into a CSV file. (If you find step 1 overkill, or if you don't have Access, you can skip it and simply save your .xls or .xlsx file as .csv. This may require more legwork in your PHP conversion code, but that is probably a matter of preference. Some people prefer to avoid Access as much as possible, and if you don't normally use it you will waste time trying to learn it just to save yourself a little effort.)
3. Use PHP's built-in str_getcsv function, which parses a line of CSV text into a PHP array (or fgetcsv, which reads and parses the file line by line).
4. Create your automated program to parse through each record. Based on the column and its requirements, you can either accept or reject records. You can then export your data, such as was done in this SO answer, back to CSV. You can save two different CSV files, one with accepted records and one with rejected records. (A sketch of steps 3 and 4 appears after this list.)
5. With rejected records, which are all but inevitable when transferring from a spreadsheet, you will need to have a course of action. The simplest way for your clients is probably to give them a procedure to either manually import records into the database, if you've given them an interface to do so, or - probably simpler but requiring more back-and-forth - to update the records in Excel to be compliant with the new system.
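As a rough illustration of steps 3 and 4 (not part of the original answer; the file names, column order and validation rules are invented for the example), the conversion script might look something like this:

```php
<?php
// Minimal conversion sketch: read legacy rows from a CSV export, validate
// each one, and split the output into accepted/rejected files.
// The column order (member, date, amount) and the rules are only examples.

$in       = fopen('legacy_export.csv', 'r');
$accepted = fopen('accepted.csv', 'w');
$rejected = fopen('rejected.csv', 'w');

$header = fgetcsv($in);                              // copy the header row
fputcsv($accepted, $header);
fputcsv($rejected, array_merge($header, ['reason']));

while (($row = fgetcsv($in)) !== false) {
    [$member, $date, $amount] = array_pad($row, 3, null);

    $errors = [];
    if ($member === null || trim($member) === '') {
        $errors[] = 'missing member';
    }
    if (!strtotime((string) $date)) {
        $errors[] = 'unparseable date';
    }
    if (!is_numeric($amount)) {
        $errors[] = 'non-numeric amount';
    }

    if ($errors === []) {
        fputcsv($accepted, $row);
    } else {
        fputcsv($rejected, array_merge($row, [implode('; ', $errors)]));
    }
}

fclose($in);
fclose($accepted);
fclose($rejected);
```

The rejected file then becomes the work list for step 5, whether the records are fixed in Excel or re-entered through the application.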
Edit
Based on the thread under your question which sheds more light on what you are trying to do (i.e., have a parent for each transaction that is an accepted loan), you should be able to contrive a parent field, even if it is not complete, by creating a parent record for each set of transactions based around an account. You can do this via Access, PHP, or, more likely, a combination.
Conclusion
Long story short, data conversions take time. If you put the time in up front, it will be far easier to maintain a standardized series of information in the long run. If you find something which takes less time in the beginning, it will mean additional work for you in the long run in order to make this "simple" fix work over time.
Similarly, the closer you can get legacy data to conform to your new data, the easier it will be for your clients to perform queries etc. While this may mean that some manual entry will be required on the part of you or your client, it is better to inform the client of the pros and cons of each method fully and let them decide. My recommendation would always be to put extra work in at the front-end because it almost always ends up cheaper than having to deal with a quick fix in the long run, but that is not always practical given real world constraints.
I designed my Model-View-Controller application. What I want to do is store MySQL query results in a PHP file, or in localStorage or sessionStorage, so that the stored information can be reused without hitting the database over and over. I could save this information in a PHP session as well, but storing a lot of data in sessions could slow the server down.
Is it good practice, in this particular case, to store the information in any of these storages?
The concept of what you're trying to do is called "caching". Here's an article on a simple way to do it, kind of like what you're talking about: http://www.xeweb.net/2010/01/15/simple-php-caching/.
Whether or not it's "good practice" depends largely on your application. If it's a SQL query whose results don't change often (you might want to re-think why you're running the query in those cases) and a complicated query that is very time consuming to run, then go for it. I would advise against doing it unless those criteria apply, because there's a reason you're using a database: you want it to be dynamic. Most databases can handle a lot of queries, and your queries and data can be optimized for the queries you'll be running most often.
Do you notice it taking a long time to run the query? Have you isolated the problem to the database? Most importantly: does the data change very often and do your users care that their views won't be up to date?
I should add, it's VERY rarely a good idea to store anything like this in session storage. The only things you'd use session storage for are things like a user name or an anonymous user's shopping cart. Never for query caching.
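For reference, a simple file-based query cache along the lines of the linked article might look like the sketch below (the cache key, TTL and the table in the usage example are illustrative; the callback stands in for whatever database layer you use):

```php
<?php
// Minimal file-based query cache in the spirit of the article linked above.
// Cache location, key, TTL and the example query are illustrative only.

function cached_query(string $key, int $ttl, callable $runQuery): array
{
    $file = sys_get_temp_dir() . '/cache_' . md5($key) . '.json';

    // Serve from cache while the file is fresh enough.
    if (is_file($file) && (time() - filemtime($file)) < $ttl) {
        return json_decode(file_get_contents($file), true);
    }

    // Otherwise run the real query and refresh the cache.
    $rows = $runQuery();
    file_put_contents($file, json_encode($rows), LOCK_EX);
    return $rows;
}

// Usage: cache an expensive report for 10 minutes.
// $pdo is assumed to be an existing PDO connection; the table is hypothetical.
$rows = cached_query('monthly_report', 600, function () use ($pdo) {
    return $pdo->query('SELECT member_id, SUM(amount) AS total
                        FROM contributions GROUP BY member_id')
               ->fetchAll(PDO::FETCH_ASSOC);
});
```

Note that this only makes sense for the slow, rarely changing queries described above; everything else should keep going straight to the database.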
No, not at all. Your database is the fastest and most efficient tool for these tasks, so using something like a session to store data is a bad idea. You can look into persistent objects, though, with an ORM like Doctrine.
Recently I started an eCommerce project and I need to use data mining. Simply put, my question is which solution I should use in development:
MySQL with PHP
SQL Server with ASP
Actually, MySQL is a good solution and suitable for my project for many reasons, but is it good and optimal for data mining? I'm a beginner in data mining and I'll develop this as part of my project. Are there good supporting tools for it?
SQL databases play little role in data mining. (That is, unless you consider computing various business reports involving averages as "data mining", IMHO these should at most be called "business analytics").
The reason is that the advanced statistics performed for data mining can't be accelerated by the database indexes. And usually, they also take much longer than interactive users would be willing to wait.
So in the end, most actual data mining happens "offline", outside of a database. The database may serve as the initial data storage, but the actual data mining process then usually is: 1. load data from the database, 2. preprocess the data, 3. analyze the data, 4. present the results.
I know that SQL extensions such as DMX ("Data Mining Extensions") exist. But seriously, that isn't really data mining. It is an interface for invoking some basic prediction functionality, but nothing general. Any good data mining will require customization of the process, and you can't do that with a DMX one-liner.
Fact is, the most important tools for data mining are R and SciPy. Followed by the specialized tools such as RapidMiner, Weka and ELKI. Why? Because R and Python are best for scripting. It's ALL about customization of the process. Forget any push-button solution, they just don't work reasonably well yet.
You just can't reasonably train, say, a support vector machine "inside" a SQL database (and even less inside a NoSQL database, which usually is not much more than a key-value store). Also, don't underestimate the need to preprocess your data. So in fact, you will be training on a copy of the data set. You might as well put this copy into the data format that is most efficient for your actual data mining process, instead of keeping it in a random-access, general-purpose database store.
I would say pick the language you and your team feel more comfortable with; there are pros and cons on both sides. I reckon you should do a bit of research before you pick a path, keeping your business needs in mind.
I'm actually developing a system for bus ticket reservations. The provider has many routes and different trips. I've set up a rather comprehensive database that maps all of this together, but I'm having trouble getting the pathing algorithm working when it comes to cross-route reservations.
For example, if the user wants to go from Montreal to Sherbrooke, he'll only take what we call here Route #47. But in the event he goes to Sutton instead of Sherbrooke, he now has to transfer to Route #53 at some point.
Now, it isn't too hard to detect one and only one transfer. But when it comes to detecting the options he has for crossing multiple routes, I'm kinda scared. I've devised a cute and relatively efficient way to do 1-3 hops using only SQL, but I'm wondering how I should organize all this in a much broader spectrum, as the client will probably not stay with 2 routes for the rest of his life.
Example of what I've thought of so far:
StartingStop
joins to Route
joins to StopsOfTheRoute
joins to TransfersOnThatStop
joins to TargetStopOfThatTransfer
joins to RouteOfThatStop
joins to StopsOfThatNewRoute
[wash rince repeat for more hops]
where StopsOfThatNewRoute = EndingStop
Problem is, if I have more than 3 hops, I'm sure my SQL server will choke rather fast under the pressure. Even if I index my database correctly, I can easily predict a major failure eventually...
Thank you
My understanding of your problem: You are looking for an algorithm that will help you identify a suitable path (covering segments of one or more routes).
This is, as Jason correctly points out, a pathfinding problem. For this kind of problem, you should probably start by having a look at Wikipedia's article on pathfinding, and then dig into the details of Dijkstra's algorithm. That will probably get you started.
You will, however, most likely soon realise that your data model might pose a problem, both from a structure and a performance perspective. A typical example would be if you need to manage time constraints: which path has the shortest travel time, assuming you find several? One path might be shortest but only provide one ride per day, while another path might be longer but provide several rides per day.
A possible way of handling this is to create a graph where each node corresponds to a particular stop at a particular time. An edge would connect this stop in space-time both to the other geographical stops and to itself at the next point in time.
My suggestion would be to start by reading up on the pathfinding algorithms, then revisit your data model with regard to any constraints you might have. Then focus on the structure for storing the data, and the structure for searching for paths.
A suggestion (not efficient, but it could work if you have a sufficient amount of RAM to spare): use the relational database server for storing the basics (stops, which routes are connected to which stops, and so on); it seems you have this covered already. Then build an in-memory representation of the graph, given the constraints that you have. You could probably build your own small library for this pretty quickly (I am not aware of any such libraries for PHP).
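As a rough sketch of that in-memory approach (not from the original answer; the adjacency list, stop names and travel times below are invented for illustration), a plain-PHP Dijkstra over such a graph could look like this:

```php
<?php
// Dijkstra's algorithm over an in-memory adjacency list, as suggested above.
// Stop names and travel times are invented; in the real system the $graph
// array would be built from the stops/routes/transfers tables.

function shortest_path(array $graph, string $start, string $end): ?array
{
    $dist  = [$start => 0];
    $prev  = [];
    $queue = new SplPriorityQueue();          // max-heap, so negate the costs
    $queue->insert($start, 0);

    while (!$queue->isEmpty()) {
        $node = $queue->extract();
        if ($node === $end) {
            break;                            // best path to $end is settled
        }
        foreach ($graph[$node] ?? [] as $neighbour => $cost) {
            $alt = $dist[$node] + $cost;
            if (!isset($dist[$neighbour]) || $alt < $dist[$neighbour]) {
                $dist[$neighbour] = $alt;
                $prev[$neighbour] = $node;
                $queue->insert($neighbour, -$alt);
            }
        }
    }

    if (!isset($dist[$end])) {
        return null;                          // no connection at all
    }

    // Walk back from the destination to rebuild the path.
    $path = [$end];
    while (end($path) !== $start) {
        $path[] = $prev[end($path)];
    }
    return array_reverse($path);
}

// Hypothetical fragment: travel minutes between stops on routes #47 and #53.
$graph = [
    'Montreal'   => ['Granby' => 60],
    'Granby'     => ['Sherbrooke' => 55, 'Sutton' => 40],
    'Sherbrooke' => [],
    'Sutton'     => [],
];

print_r(shortest_path($graph, 'Montreal', 'Sutton'));
```

The same code covers the time-constraint idea above if each node is a (stop, departure time) pair instead of just a stop.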
Another alternative could be to use a graph database such as Neo4j and its REST interface. I guess this would require some significant redesign of your application.
Hope this gives you some helpful pointers.
I want to keep logs of some things that people do in my app, in some cases so that they can be undone if needed.
Is it best to store such logs in a file or in a database? I'm completely at a loss as to what the pros and cons are, except that a database means another table to set up.
Is there a third (or fourth etc) option that I'm not aware of that I should look into and learn about?
There is at least one definite reason to go for storing in the database: you can use INSERT DELAYED in MySQL (or similar constructs in other databases), which returns immediately. You won't get any return data from the database with these kinds of queries, and they are not guaranteed to be applied.
By using INSERT DELAYED, you won't slow down your app too much because of the logging. The database is free to write the INSERTs to disk at any time, so it can bundle a bunch of inserts together.
You need to watch out when using MySQL's built-in timestamp functions (like CURRENT_TIMESTAMP or CURDATE()), because they will be evaluated whenever the query is actually executed. So you should make sure that any time data is generated in your programming language, and not by the database. (This paragraph might be MySQL-specific.)
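A minimal sketch of that advice (the table and column names are made up, and $pdo is assumed to be an existing PDO connection; also note that DELAYED only ever applied to certain storage engines such as MyISAM and is deprecated or ignored in newer MySQL versions, so verify it against your server):

```php
<?php
// Fire-and-forget audit logging along the lines of the answer above.
// Table/column names are hypothetical. The timestamp is generated in PHP,
// not by MySQL, because a delayed insert may run later than it was issued.

function log_action(PDO $pdo, int $userId, string $action): void
{
    $stmt = $pdo->prepare(
        'INSERT DELAYED INTO action_log (user_id, action, created_at)
         VALUES (:user_id, :action, :created_at)'
    );
    $stmt->execute([
        ':user_id'    => $userId,
        ':action'     => $action,
        ':created_at' => date('Y-m-d H:i:s'),   // app-generated time
    ]);
}

// Usage:
// log_action($pdo, 42, 'deleted invoice #1001');
```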
You will almost certainly want to use a database for flexible, record based access and to take advantage of the database's ability to handle concurrent data access. If you need to track information that may need to be undone, having it in a structured format is a benefit, as is having the ability to update a row indicating when and by whom a given transaction has been undone.
You likely only want to write to a file if very high performance is an issue, or if you have very unstructured or large amounts of data per record that might be unwieldy to store in a database. Note that unless your application has a very large number of transactions, database speed is unlikely to be an issue. Also note that if you are working with a file you'll need to handle concurrent access (read/write/locking) very carefully, which is likely not something you want to have to deal with.
I'm a big fan of log4php. It gives you a standard interface for logging actions. It's based on log4j. The library loads a central configuration file, so you never need to change your code to change the logging behaviour. It also offers several log targets, like files, syslog, databases, etc.
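For context, typical log4php usage looks roughly like this (the include path, config file name and logger name are illustrative; the appenders are defined in the configuration, not in code):

```php
<?php
// Rough log4php usage sketch; paths and names are illustrative.
// The config file decides where messages go (file, syslog, database, ...),
// so changing the log target requires no code changes.
require_once 'log4php/Logger.php';

Logger::configure('log4php-config.xml');      // central configuration
$log = Logger::getLogger('app.actions');

$log->info('user 42 deleted invoice #1001');  // routed per the config
$log->warn('undo requested for a missing record');
```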
I'd use a database, simply for maintainability; also, multiple edits on a file may cause some entries to be missed.
I will second both of the above suggestions and add that file locking on a flat file log may cause issues when there are a lot of users.