Storing user data in JSON files on server - php

I am building a web application that uses PHP and MySQL on the backend. I want to store some user data -- basically a set of objects in JSON format that detail the user's "favorites" info for the application. I don't want to store this JSON data in a single MySQL field in my user database table because it doesn't seem efficient.
So, I am thinking of just storing the JSON data in a flat file on the server, with a unique identifier that tells me which user the file belongs to. My question is: would this be a scalable solution for upwards of 10,000 users?

This is likely to cause you lots of headaches, both in terms of technical aspects and in terms of security. And no, it's not very scalable. Think about the problems it will cause: What happens when you need to add a server? How will you sync the files? What if you want to do something involving multiple users, like seeing how many people have XYZ as a favorite?
A much better option is to do one of the following:
- Normalize your database (do this part regardless) and put the favorites in their own table, or
- Save the favorites in a JSON column (probably the wrong answer, but it makes sense in some contexts).
If you're worried about speed, you can implement some caching using Redis, memcached, or some other system. But do not do this yet - that's premature optimization. Do it when you need it.
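A minimal sketch of the normalized option, in PHP with PDO. Everything named here is an assumption for illustration: the `users` and `favorites` tables, their columns, and the sample rows are hypothetical, and an in-memory SQLite database stands in for MySQL so the snippet runs on its own.

```php
<?php
// Hypothetical schema: one row per (user, favorite) pair instead of a JSON blob.
// SQLite in memory stands in for MySQL; the SQL is kept deliberately portable.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)');
$pdo->exec('CREATE TABLE favorites (
    user_id INTEGER NOT NULL REFERENCES users(id),
    item    TEXT    NOT NULL,
    PRIMARY KEY (user_id, item)
)');

// Made-up sample data.
$pdo->exec("INSERT INTO users (id, name) VALUES (1, 'alice'), (2, 'bob')");
$add = $pdo->prepare('INSERT INTO favorites (user_id, item) VALUES (?, ?)');
foreach ([[1, 'XYZ'], [1, 'ABC'], [2, 'XYZ']] as $row) {
    $add->execute($row);
}

// Favorites for a single user.
function favoritesFor(PDO $pdo, int $userId): array
{
    $stmt = $pdo->prepare('SELECT item FROM favorites WHERE user_id = ? ORDER BY item');
    $stmt->execute([$userId]);
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}

// The cross-user question: how many people have a given item as a favorite?
function countUsersWithFavorite(PDO $pdo, string $item): int
{
    $stmt = $pdo->prepare('SELECT COUNT(*) FROM favorites WHERE item = ?');
    $stmt->execute([$item]);
    return (int) $stmt->fetchColumn();
}
```

With one file per user, the second function would mean opening and parsing every file; with a table it is a single indexed query.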

Related

PHP - MySQL call or JSON static file for infrequently updated information

I've got a read-heavy website backed by a MySQL database. I also have a little "auxiliary" information (it fits in an array of 30-40 elements as of now) that is hierarchically organized and gets updated only 4-5 times per year. It's not really a configuration file, since this information is about the subject of the website and not about its functioning, but it behaves like one. Until now, I just used a static PHP file containing an array of info, but now I need a way to update it via a backend CMS from my admin panel.
I thought of a simple CMS that allows the admin to create/edit/delete entries (a rare, periodic job) and then generates a static JSON file for the page-building scripts to use instead of pulling this information from the db.
The question is: given the heavy-read nature of the website, is it better to read a rarely updated JSON file on the server when building pages or just retrieve raw info from the database for every request?
"I just used a static PHP"
This sounds like a contradiction to me: it's either static or PHP.
"given the heavy-read nature of the website, is it better to read a rarely updated JSON file on the server when building pages or just retrieve raw info from the database for every request?"
Caches were invented for a reason :) The same goes for your case: it all depends on how often the data changes versus how often it is read. If the data changes once a day and stays static for 100k downloads during that day, then not caching it or not serving it from a flat file would simply be stupid. If the data changes once a day and you average 20 reads per day, then returning the data from code on each request would be less stupid, but on the other hand, 19 of those requests could be served from cache anyway, so... If you can, serve from a flat file.
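One way the flat-file approach could look in PHP, as a rough sketch rather than a finished implementation. The function names `writeAuxCache()` and `readAuxCache()`, the file location, and the sample data are all hypothetical; the loader closure is a stand-in for whatever query would fetch the data from MySQL.

```php
<?php
// Hypothetical helpers: the CMS would call writeAuxCache() after each edit,
// and page-building scripts would call readAuxCache() on every request.

function writeAuxCache(string $path, array $data): void
{
    // Write to a temp file, then rename, so readers never see a half-written file.
    $tmp = $path . '.' . uniqid('', true) . '.tmp';
    if (file_put_contents($tmp, json_encode($data)) === false) {
        throw new RuntimeException("Could not write $tmp");
    }
    if (!rename($tmp, $path)) {
        throw new RuntimeException("Could not replace $path");
    }
}

function readAuxCache(string $path, callable $loadFromDb): array
{
    if (is_file($path)) {
        $data = json_decode((string) file_get_contents($path), true);
        if (is_array($data)) {
            return $data;
        }
    }
    // Cache missing or unreadable: fall back to the source of truth and rebuild it.
    $data = $loadFromDb();
    writeAuxCache($path, $data);
    return $data;
}

// Usage with a stand-in loader; a real one would query the database.
$path = sys_get_temp_dir() . '/' . uniqid('aux_info_', true) . '.json';
$calls = 0;
$loader = function () use (&$calls): array {
    $calls++;
    return ['regions' => ['north', 'south']];
};
$first  = readAuxCache($path, $loader); // file missing: calls the loader
$second = readAuxCache($path, $loader); // served from the flat file
unlink($path);
```

Because the file is rewritten only when the admin saves a change, the database is touched a handful of times per year for this data, no matter how many pages are built.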
Caching is your best option; Redis and Memcached are common, excellent choices. As for flat file versus database, it's hard to say without knowing the SQL schema you're using (how many columns, what the datatype definitions are, how many foreign keys and indexes, etc.).
SQL is about relational data; if you have non-relational data, you don't really have a reason to use SQL. Many people are now switching to NoSQL databases to handle this, since modifying SQL databases after the fact is a huge pain.

Should I store an Access Control List serialized as JSON or in database tables

I've been researching on creating an Access Control List and there are a few things I've found. However, I'm not sure if one way is extremely overboard and another is far too simplistic.
So here it is:
Right now, I have it set up so that the users table has a permissions field. This field contains JSON of all the permissions the user has. I was curious whether there was a better way to do this, and I have found structured databases that use separate tables for roles and permissions.
e.g.

role
-------------
id | name

permissions
-----------------
id | role_id | name

user_role
---------------
user_id | role_id

That's very basic, but the general idea.
My question is: which method is better? The tables approach seems a bit heavy, with the joins and everything needed to get the permissions. However, when I look up the user, I can just pull out the JSON field and cache it... Or is there something fundamental that I'm missing?
You should be using native tables in place of non-native data structures when working with relational databases. Riding a bike down the interstate is an option, but is it the best option?
Efficiency:
Many database servers keep frequently read data in memory, and some cache the results of repeated queries, so why waste resources in your app when the caching is already done for you? Where a query cache is available, the same query issued moments apart returns the same result from memory (assuming the underlying values have not changed) instead of requiring a DB read.
The larger your field set becomes, the more space a JSON string takes up on the file system, the slower that data is to parse, and the more memory the cached result consumes. With tables, the field set consumes far fewer resources on the file system, and you can request just the value(s) you need at that moment, already in a form your application understands. Whereas with JSON, you retrieve a string that still needs further processing to be understood, containing not only the values you need at that moment but possibly quite a few you do not.
Scalability:
With a stored JSON string, if you wish to delete a field that is no longer needed or add a required field, you will have to spend quite a bit of processing power adjusting each item's data set, whereas with a table this can be done with a short query. After that, each time the server encounters an item with a field that is supposed to be deleted, or one missing a required field, that item will be adjusted in memory and the update scheduled for a batched write to the file system.
--
You state you are worried about joins being slow or costly, but assuming you are only requesting the data you need AT THAT MOMENT, from ONLY the tables that hold those values, the cost of the joins should be minimal compared to the alternatives.
Your JSON approach looks like WordPress serialization. WP uses serialization to store options; take a look at this post for a quick example:
Working with serialized data in Wordpress
But it is used for options, which IMHO are more or less irrelevant: options that have no need to be filtered or queried, whose only function is to provide the configuration of certain features. Let's say, for unimportant things.
Permissions and roles are the kind of thing I would consider vital for any application, and the best way to handle them is the standard table approach. You may need to insert new permissions, which is much easier with tables, or to query who has certain permissions, and that's the magic of relational tables.
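To make the table approach concrete, here is a small sketch using the three tables from the question. The table and column names come from the question; the sample roles, permission names, and user IDs are made up, and an in-memory SQLite database stands in for MySQL so the snippet is self-contained.

```php
<?php
// Tables follow the layout in the question; the sample rows are invented.
// SQLite in memory stands in for MySQL here.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE role (id INTEGER PRIMARY KEY, name TEXT NOT NULL)');
$pdo->exec('CREATE TABLE permissions (
    id INTEGER PRIMARY KEY, role_id INTEGER NOT NULL, name TEXT NOT NULL
)');
$pdo->exec('CREATE TABLE user_role (
    user_id INTEGER NOT NULL, role_id INTEGER NOT NULL,
    PRIMARY KEY (user_id, role_id)
)');

$pdo->exec("INSERT INTO role (id, name) VALUES (1, 'editor'), (2, 'admin')");
$pdo->exec("INSERT INTO permissions (role_id, name) VALUES
    (1, 'post.edit'), (2, 'post.edit'), (2, 'user.delete')");
$pdo->exec('INSERT INTO user_role (user_id, role_id) VALUES (10, 1), (11, 1), (11, 2)');

// All permissions a user holds through any of their roles.
function permissionsFor(PDO $pdo, int $userId): array
{
    $stmt = $pdo->prepare(
        'SELECT DISTINCT p.name
           FROM user_role ur
           JOIN permissions p ON p.role_id = ur.role_id
          WHERE ur.user_id = ?
          ORDER BY p.name'
    );
    $stmt->execute([$userId]);
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}

// The reverse lookup that is awkward with a JSON field: who holds a permission?
function usersWith(PDO $pdo, string $permission): array
{
    $stmt = $pdo->prepare(
        'SELECT DISTINCT ur.user_id
           FROM user_role ur
           JOIN permissions p ON p.role_id = ur.role_id
          WHERE p.name = ?
          ORDER BY ur.user_id'
    );
    $stmt->execute([$permission]);
    return array_map('intval', $stmt->fetchAll(PDO::FETCH_COLUMN));
}
```

The result of `permissionsFor()` can still be cached per user, as the question suggests; the difference is that the tables remain the source of truth and stay queryable.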

How should I mix XML, JSON with a MySQL DB with performance in mind

I'm developing a PHP site where users select data resources from a variety of categories. The resources come from varied sources: some RSS, some XML, some JSON, some hosted internally, some hosted externally. The user can edit which resources they see, and that information will be stored in the database as well as in cookies, sessions and caches, to lower the load on the server when the user is not actively selecting resources. Some of the tables in the database will be largish (for me anyhow), ranging from 20,000 to 50,000,000 entries. Other tables will be quite small, ranging from 51 to 200 entries. These smaller tables are mainly name tables: state names, category names and other similar things.
Because this is a relatively large project for me, I want to focus on optimizing server resources, and I'm asking myself whether hosting some of these small tables as XML (JSON or includes would work just as well) and integrating them via AJAX might be a more efficient use of resources. I'd also like to know what the potential performance gain or penalty of using resources that way might be, and what the best practices for mixing data like that are. I should note that on the front end the site will be pretty lean, so I don't mind passing off work to the browser.
As a simple example:
I'm going to need to store a table of US state names, acronyms and IDs somewhere. It's a small but essential list, and I really don't want to have to query for a state name every time I need to use a state somewhere. Will I suffer a performance penalty if I just toss a state table into XML and use a function to access it by ID as required, or would I be better served keeping it in my DB and running queries? Or should I just cache the results of the query somewhere and access it that way?
Your question as stated is really too broad for protocol here, but let me try to get you headed in the right direction...
Use XML for exchange where:
- Industry standard schemas already exist, or you must coordinate format agreement among partners and can benefit from the definitional maturity of XSD or the transformational flexibility of XSLT.
- Your data is naturally document-based, especially where mixed content is required.
Use JSON for exchange where:
- The above reasons for XML do not apply.
- Easy programmatic access to simple, light-weight data structures is helpful.
Use a database for storage where:
- ACID properties are important.
Use code-based static data for performance where:
- The data never changes and access speed is paramount.
So, your given example of a very small static look-up table, where you're concerned about performance, probably fits best in code. Do avoid premature optimization, though.
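As an illustration of "in code", a lookup table like the one in the question could be a plain PHP constant. This is only a sketch: the IDs are hypothetical, the `US_STATES` constant and `stateById()` function are invented names, and only three states are listed to keep it short.

```php
<?php
// Hypothetical IDs; only a few states are listed to keep the sketch short.
const US_STATES = [
    1 => ['abbr' => 'AL', 'name' => 'Alabama'],
    2 => ['abbr' => 'AK', 'name' => 'Alaska'],
    3 => ['abbr' => 'AZ', 'name' => 'Arizona'],
];

// Look a state up by ID without touching the database; null if the ID is unknown.
function stateById(int $id): ?array
{
    return US_STATES[$id] ?? null;
}
```

A call such as `stateById(2)['name']` then costs an array lookup rather than a query or an XML parse.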

Database or file record keeping - php

I am creating a record system for my site which will track users and how they interact with my site's pages. This system will record button clicks, page view times, and the method used to navigate away from a page (among other things). I am considering one of two options:
- Create a log file and append a string to it for each action.
- Create a database table and save entries based on user interaction.
Although I am sure that both methods could easily fill my needs, which would be better in the long run? Other considerations:
- General page viewing will never cause this data to be read (only added to).
- Old data should be archived, but still accessible.
- Data will be viewed and searched via a web app.
As with most performance questions, the answer is 'It depends.'
I would expect it depends on the file system, media type, and operating system of your server.
I don't believe I've ever experienced performance differences INSERTing data into a large, or a small MySQL database. The performance differences manifest when you retrieve that data. The database will almost always outperform queries to files, especially when you want complex or statistical data.
If you are only concerned with the speed of inserting/appending data, and expect a large amount of traffic, build a mock environment and benchmark each approach. If you want to have any amount of speed retrieving that data in a structured way, go with the database.
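A rough sketch of the database option, to show what structured retrieval buys you. The `interaction_log` table, its columns, the action strings, and the timestamps are all hypothetical sample values, and an in-memory SQLite database stands in for MySQL so the snippet runs on its own.

```php
<?php
// Hypothetical table and column names; SQLite in memory stands in for MySQL.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE interaction_log (
    id         INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL,
    page       TEXT    NOT NULL,
    action     TEXT    NOT NULL,
    created_at TEXT    NOT NULL
)');
// An index on the columns the reporting app filters by helps searches as rows pile up.
$pdo->exec('CREATE INDEX idx_log_page_action ON interaction_log (page, action)');

// Appending an entry is a single INSERT, comparable to appending a line to a file.
function logAction(PDO $pdo, int $userId, string $page, string $action, string $when): void
{
    $stmt = $pdo->prepare(
        'INSERT INTO interaction_log (user_id, page, action, created_at) VALUES (?, ?, ?, ?)'
    );
    $stmt->execute([$userId, $page, $action, $when]);
}

// Made-up sample events.
logAction($pdo, 1, '/home', 'click:signup', '2024-01-01 10:00:00');
logAction($pdo, 2, '/home', 'click:signup', '2024-01-01 10:05:00');
logAction($pdo, 1, '/pricing', 'leave:back-button', '2024-01-01 10:06:00');

// A statistical question that is painful to answer from an append-only text file.
function actionCountsForPage(PDO $pdo, string $page): array
{
    $stmt = $pdo->prepare(
        'SELECT action, COUNT(*) AS n
           FROM interaction_log
          WHERE page = ?
          GROUP BY action
          ORDER BY action'
    );
    $stmt->execute([$page]);
    return $stmt->fetchAll(PDO::FETCH_KEY_PAIR); // action => count
}
```

Archiving old data then becomes a matter of moving rows older than some cutoff into a second table with the same layout, which keeps them searchable.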
If you want performance, you should inspect the server logs instead of trying to build your own logging system...

Best way to store chat messages and files

I would like to know what you think about storing chat messages in a database.
I need to be able to bind other stuff to them (like files or contacts), and using a database is the best way I see for now.
The same question applies to files: because they can be bound to chat messages, I would have to store them in the database too.
With thousands of messages and files I wonder about performance drops and database size.
What do you think considering I'm using PHP with MySQL/Doctrine?
I think that it would be OK to store any textual information in the database (names, message history, etc.) provided that you structure your database properly. I have worked for big websites (thousands of visits a day) and telecom companies that store information about their users (including their traffic statistics) in databases that have grown to hundreds of gigabytes, and the applications were working fine.
But for binary information like images and files, it would be better to store them on the file system and store only their paths in the database, because it is cheaper to read them off the disk than to tie up a database process reading a multi-megabyte file.
As I said, it is important that you do several things:
- Structure your information properly. It is very important to design your database properly, dividing it into tables and the tables into fields with your performance goals in mind, because this will form the basis for your application and queries. Get that wrong and your queries will be slow.
- Make proper decisions on the table engine for every table. This is an important step because it will greatly affect the performance of your queries. For example, MyISAM blocks read access to a table while it is being updated. That will be a problem for a web application like a social networking or news site, because in many situations your users will basically have to wait for an information update to complete before they see a generated page.
- Create proper indexes. This is very important for performance, especially for applications with big, rapidly growing databases.
- Measure the performance of your queries as the data grows and look for ways to improve it. You will always find bottlenecks that have to be removed; this is an ongoing, non-stop process that every popular web application has to go through.
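The split between text in the database and files on disk could be sketched as follows. This uses plain PDO rather than Doctrine to stay self-contained, with an in-memory SQLite database standing in for MySQL; the `messages` and `attachments` tables, their columns, the `attachFile()` helper, and the sample message are all hypothetical.

```php
<?php
// Hypothetical names throughout; SQLite in memory stands in for MySQL.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE messages (
    id INTEGER PRIMARY KEY, sender_id INTEGER NOT NULL, body TEXT NOT NULL
)');
$pdo->exec('CREATE TABLE attachments (
    id            INTEGER PRIMARY KEY,
    message_id    INTEGER NOT NULL REFERENCES messages(id),
    original_name TEXT    NOT NULL,
    stored_path   TEXT    NOT NULL
)');
// Index the column used to find a message's files.
$pdo->exec('CREATE INDEX idx_attachments_message ON attachments (message_id)');

// Save the bytes to disk under a generated name; record only the path in the database.
function attachFile(PDO $pdo, string $dir, int $messageId, string $originalName, string $bytes): string
{
    $path = $dir . '/' . bin2hex(random_bytes(16));
    if (file_put_contents($path, $bytes) === false) {
        throw new RuntimeException("Could not write $path");
    }
    $stmt = $pdo->prepare(
        'INSERT INTO attachments (message_id, original_name, stored_path) VALUES (?, ?, ?)'
    );
    $stmt->execute([$messageId, $originalName, $path]);
    return $path;
}

// Made-up sample message with one attached file.
$pdo->prepare('INSERT INTO messages (sender_id, body) VALUES (?, ?)')
    ->execute([1, 'see attached']);
$messageId = (int) $pdo->lastInsertId();
$stored = attachFile($pdo, sys_get_temp_dir(), $messageId, 'notes.txt', 'hello');

// Reading back: the query returns a path, and the file itself comes from the file system.
$stmt = $pdo->prepare('SELECT stored_path FROM attachments WHERE message_id = ?');
$stmt->execute([$messageId]);
$pathFromDb = $stmt->fetchColumn();
$contents = file_get_contents($pathFromDb);
unlink($pathFromDb); // clean up the sample file
```

Storing the file under a generated name and keeping the original name in a column avoids collisions and keeps user-supplied names out of the file system path.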
I think a NoSQL database like CouchDB or MongoDB is an option. You can also store the files separately and link them via a known filename, but it depends on your system architecture.
