dealing with large amounts of data - php

I'm building the front end of a website that'll be holding data for users. Type of data is name, email, ethnicity, income, pets etc etc. Each person can have a partner (with the same questions) and an infinite number of children (names, dob, gender etc). The user can sign up, and then must be able to log in in the future to update their details if necessary.
The problem I'm having is that things are just getting really messy. It all starts with validation: I have loops to check how many children there are, then redisplay the fields and set up validators if there is an error. Inserting all the data is easy enough, but my insert_user function has 30 parameters so far.
Everything's getting annoying and frustrating. Is there an established way to deal with data like this? I'm thinking Propel or Doctrine may help, and I've had a play with PEAR's HTML_QuickForm with limited success (it can't handle things like a "select your ethnicity" list with an input for "other", or an unlimited number of children).
I'm sure I'm not the first to have this trouble, so what do others do?

Have a look at Symfony; it will make your life a lot easier. Your data model here is pretty simple, but be prepared to learn how Symfony works.
http://www.symfony-project.org/
This is the simplest tutorial I know of: http://articles.sitepoint.com/print/symfony-beginners-tutorial - it should get you up and running in a couple of hours.

When I deal with something like this I start by trying to come up with a good model to make it less complex.
For example, if you put your server functions in PHP, then you can use AJAX calls in JavaScript to deal with the display. That separates the concerns, so you can focus on having each side do what it does best.
If you want to keep everything in PHP and just use form submission, then again, split the two parts so that the code that deals with display is separate from the API that deals with the database.
This is basically just an MVC structure.
The best way to start is to go back to your design phase, decide which languages you want to use, and separate the work.
Either way you end up writing an API to get to the database, and the controller code (which handles incoming requests and displays the results) doesn't care what type of database you use, whether there is a database at all, or any of those details. It wants the children for a particular person, so that request is given to the API and an array is returned.
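For instance, a minimal sketch of that separation might look like this (the PersonRepository class, the children table and its columns are hypothetical names, not anything from the question):

<?php
// A minimal sketch of the idea above, assuming a hypothetical `children`
// table with `person_id`, `name`, `dob` and `gender` columns.
class PersonRepository
{
    private $pdo;

    public function __construct(PDO $pdo)
    {
        $this->pdo = $pdo;
    }

    // The controller only knows it gets an array of children back;
    // it never sees SQL or cares how the data is stored.
    public function getChildren($personId)
    {
        $stmt = $this->pdo->prepare(
            'SELECT name, dob, gender FROM children WHERE person_id = ?'
        );
        $stmt->execute(array($personId));
        return $stmt->fetchAll(PDO::FETCH_ASSOC);
    }
}

// Controller side: fetch and hand off to the view (or echo as JSON for AJAX).
// $repo = new PersonRepository($pdo);
// $children = $repo->getChildren($_GET['person_id']);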


What is the best way to wait for an administrator to validate something before committing it?

I'm building a web application where several groups have their own page, but if they want to modify it, an administrator has to validate the change first.
For example, a group can change its logo, post a new photo, or change its phone number, name, location, etc. Basically they can edit a value in the database, but only if the administrator accepts it. The administrator has to validate every modification because... our customer asked us to.
That's why we have to create a system that could be called "pending queries" management.
At the beginning I thought that keeping the query in the database and executing it when an administrator validates it was a good idea, but if we choose this option we can't use PDO prepared statements, since we have to concatenate strings to build our own statement, which has obvious security issues.
Then we thought that we should keep, in our database, PHP code that calls the right methods (which use PDO), and execute it with eval() when the administrator validates it. But again, it seems that using eval() is a very bad idea. As Rasmus Lerdorf's quote says: "If eval() is the answer, you're almost certainly asking the wrong question".
I thought about using eval() because I want to call methods that use PDO to deal with the database.
So, what is the best way to solve this problem? It seems that there is no safe way to implement it.
Both your ideas are, to be frank, simply weird.
Add a field in the table to tell approved content from unapproved content.
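A rough sketch of that approach, assuming a hypothetical page_changes table holding the proposed values alongside an is_approved flag (all names here are made up for illustration):

<?php
// Visitors only ever see data where is_approved = 1; the admin panel lists
// rows with is_approved = 0 and flips the flag on approval.

// A group submits a change: it is stored, but not yet visible.
function submit_change(PDO $pdo, $pageId, $fieldName, $newValue)
{
    $stmt = $pdo->prepare(
        'INSERT INTO page_changes (page_id, field_name, new_value, is_approved)
         VALUES (?, ?, ?, 0)'
    );
    $stmt->execute(array($pageId, $fieldName, $newValue));
}

// An admin approves it later: just an UPDATE, no eval() and no stored SQL.
function approve_change(PDO $pdo, $changeId)
{
    $stmt = $pdo->prepare('UPDATE page_changes SET is_approved = 1 WHERE id = ?');
    $stmt->execute(array($changeId));
}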
Here's one possible approach, with an attempt to keep the things organised to an extent, as the system begins to scale:
Create a table called PendingRequests. This will have to have most of the following fields and maybe quite a few more:
(id, request_type, request_contents, request_made_by, request_made_timestamp,
request_approved_by, request_approved_timestamp, ....)
Request_contents is a broad term and it may not just be confined to one column alone. How you gather the data for this column will depend on the front-end environment you provide to the users (WYSIWYG, etc).
Request_approved_by will be NULL when the data is first inserted in the table (i.e. user has made an initial request). This way, you'll know which requests to present in the administration panel. Once an admin approves it, this column will be updated to reflect the id of the admin that approved it and the approved changes could eventually go live.
So far, we've only talked about managing the requests. Once that process is established, the next question is how to finally map the approved requests back to users. That would actually require a bit of study of the currently proposed system and its workflow. In short, though, there are two schools of thought:
Method 1:
Create a new table for each thing (logo, phone number, name, etc.) that is customisable.
Or
Method 2:
Simply add them as columns in one of your tables (which would essentially be in a 1:1 relationship with the user table, as far as attributes such as logo, name, etc. are concerned).
This brings us to Request_type. This is the field that holds the value / flag the system uses to determine which field or table (depending on Method 1 or Method 2) the changes should be applied to once an admin has approved them.
Whatever the requirements or the approach to database management, PHP and PDO are both flexible enough to help you write customisable and secure queries.
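For example, here is a hedged sketch of applying an approved request with PDO under Method 2; the groups table, its column names and the whitelist entries are assumptions for illustration, not part of the asker's schema:

<?php
// Column names cannot be bound as parameters, so request_type is mapped
// through a whitelist instead of being concatenated straight into the SQL.
function apply_approved_request(PDO $pdo, array $request, $adminId)
{
    $allowedTypes = array(            // request_type => column it maps to
        'logo'  => 'logo_path',
        'phone' => 'phone_number',
        'name'  => 'display_name',
    );

    if (!isset($allowedTypes[$request['request_type']])) {
        throw new InvalidArgumentException('Unknown request type');
    }
    $column = $allowedTypes[$request['request_type']];

    // The value and ids are still bound normally, so no user data is concatenated.
    $stmt = $pdo->prepare("UPDATE groups SET $column = ? WHERE id = ?");
    $stmt->execute(array($request['request_contents'], $request['group_id']));

    // Record who approved the request and when.
    $stmt = $pdo->prepare(
        'UPDATE PendingRequests
            SET request_approved_by = ?, request_approved_timestamp = NOW()
          WHERE id = ?'
    );
    $stmt->execute(array($adminId, $request['id']));
}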
As an aside, it might be a good idea to maintain a table for history of all the changes / updates made. By now, it should probably be apparent that the number of history tables will once again depend on Method 1 or Method 2.
Hope that helps.

What's the best way to store a lot of form information in MySQL?

I am creating a very simple web site, where you can go and register your workouts, so other people can see it and comment.
The problem is that, for example, if you train 6 days a week and do 10 exercises, I will have to store 60 exercises in MySQL.
I was thinking about creating a table with 60 rows and storing all that info in there, but for some reason this does not seem to be the best way.
So what should I do? I was looking here in Stack Overflow, and I saw something about storing this using an array and serializing it using PHP, but I'm not really sure about that.
You should consider what a database is before worrying about this kind of problem. It is common for MySQL to handle millions of entries. It is not likely you will exceed this in a normal scenario. If you did get a lot of traffic and your database grew to a point where expansion and upgrading made sense, you could revisit the design then. For now, MySQL is going to be your champion.
Serialization has its uses, but when you want to individually examine results, you will want to go with a relational design. That way you can store every bit of info on a specific workout session and give better stats to the user. After all, stats are addictive. Many users on this site keep coming back to build stats.
The way I read your question, you thought storing 60 workouts individually was a lot of data: you have a form with many fields and you want to know how to store those fields in a database in a way that makes sense, not just for you, but for anyone who comes in and registers. Nevertheless, this is still a relatively small task for any database. A relational design is definitely the way to go either way.
You should first decide logically what you want to store, and not how you will store it.
Once you know what to store, you should normalize the data to remove redundancy. This is usually done by converting the data to 3NF. See http://en.m.wikipedia.org/wiki/Database_normalization
That way you can be sure that all the required data is captured.
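As an illustration, a minimal sketch of a normalized layout for the workout data might look like this; all table and column names are assumptions, not the asker's actual design:

<?php
// One row per exercise performed, rather than 60 columns or a serialized array.
function create_workout_schema(PDO $pdo)
{
    $pdo->exec('CREATE TABLE users (
        id   INT AUTO_INCREMENT PRIMARY KEY,
        name VARCHAR(100) NOT NULL
    )');

    $pdo->exec('CREATE TABLE workouts (
        id           INT AUTO_INCREMENT PRIMARY KEY,
        user_id      INT NOT NULL,
        workout_date DATE NOT NULL,
        FOREIGN KEY (user_id) REFERENCES users (id)
    )');

    $pdo->exec('CREATE TABLE workout_exercises (
        id         INT AUTO_INCREMENT PRIMARY KEY,
        workout_id INT NOT NULL,
        exercise   VARCHAR(100) NOT NULL,
        sets       INT,
        reps       INT,
        weight     DECIMAL(6,2),
        FOREIGN KEY (workout_id) REFERENCES workouts (id)
    )');
}
// Six sessions of ten exercises is then just 60 rows in workout_exercises,
// which is trivial for MySQL and easy to aggregate for per-user stats.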
I suggest using the database in any case, simply because you are "storing" data. Array serialization is not really storing data; it's more like storing a piece of a process for later use.
With the database you can do many more things; you might not imagine them right now, but if one day you have a new plan for using the data you collected, you will have more room for expansion.

How to structure database calls in a page?

OK, so I am new to bringing a complex database structure into a page. I have a basic people table with a few categories: students, teachers, parents and mods. There are further tables, one each for parents, students and teachers/mods. It's basically a school's website.
Now take, for example, a profile page where a student's info is shown to parents: info like who the teachers are, subjects, attendance, homework, etc. This will query a lot of tables. So what is my best bet here? I plan to do it in a web-app way. I was thinking maybe I can send JSON data to the page with AJAX and let JavaScript do the heavy lifting of calculation. Each table would only be queried once.
So is it even OK to do so? Or will I face some hidden problems once I'm in too deep? Maybe I can go one level further and make a huge JSON with the entire database being sent to the user and then cached in the browser. I don't have to support really old browsers :)
Also, things like attendance and marks/results will need to be calculated every time. Is there any way to cache the result per student? Like when a parent views a student's result, it is calculated once and then cached for x number of days on the server somehow.
EDIT:
Another non-JSON approach I can think of is the traditional way: I do everything server-side with PHP. This means I won't have to worry about browser compatibility. Maybe I can query all the needed tables at the beginning and store the results in an array. That way a table only gets queried once too.
EDIT 2
I read somewhere about the Battlefield 3 Battlelog website. It is made the same way: all the data is pulled from the server as JSON and then calculated on the client side. If that helps put my idea in perspective.
Probably just to clear up some misconceptions; this is more a lengthy comment than an answer:
Your database has tables and fields. That's perfectly valid. It's the job of a database server to store the data and handle the querying for you. Your own code will never be better at joining tables than the database's code, because the database server is optimized for the task.
So I think the idea of querying all the data and putting it into your web app via JSON is just a bad idea. Instead, contact your server when you need specific data; the server will build the related SQL query, fire it at the database server, get the result back, convert the result into JSON and send it back to the browser.
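A rough sketch of that flow on the PHP side, using hypothetical table and column names loosely based on the school example (not anything from the actual schema):

<?php
// The browser asks for one specific thing (here, a student's teachers and
// subjects); the server queries and returns only that as JSON.
// $pdo is assumed to be an existing PDO connection.
$studentId = (int) $_GET['student_id'];

$stmt = $pdo->prepare(
    'SELECT t.name AS teacher, s.name AS subject
       FROM student_subjects ss
       JOIN subjects s ON s.id = ss.subject_id
       JOIN teachers t ON t.id = s.teacher_id
      WHERE ss.student_id = ?'
);
$stmt->execute(array($studentId));

header('Content-Type: application/json');
echo json_encode($stmt->fetchAll(PDO::FETCH_ASSOC));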
AJAX (Asynchronous JavaScript and XML) just allows you to fetch data on the fly after the main portion of the page is painted. All things being equal, it might be easier for you to design it as a standard (fetch data, then paint the page) layout. If you design the application correctly, the calls for the individual components (attendance, teachers, grades, etc.) can be gathered pre-page-render or post-page-render. There are obvious security concerns (URL hacking) in going the AJAX route, and I personally design non-AJAX as it's less flaky when things start going weird.

MVC Multi User Authentication/Security

I've been working on a web application for a company that assists them with quoting, managing inventory, and running jobs. We believe the app will be useful to other companies in the industry, but there's no way I want to roll out separate instances of the app, so we're making it multi-user (or multi-company might be a better term, as each company has multiple users).
It's built in CodeIgniter (wish I'd done it in Rails, too late now though), and I've tried to follow the skinny-controller, fat-model approach. I just want to make sure I do the authorisation side of things properly. When a user logs in I'd store the companyID along with the userID in the session. I'm thinking that every table the user interfaces with should have an additional companyID field (tables accessed indirectly via relationships probably wouldn't need to store the companyID too, tell me if I'm wrong though). Retrieving data seems pretty straightforward: just have an additional where clause in AR to add the company ID to the select, e.g. $this->db->where('companyID', $companyID). I'm OK with this.
However, what I'd like to know is how to ensure users can only modify data within their own company (in case they send, say, a delete request for a random quoteID using Firebug or a similar tool). One way I thought of is to add the same where clause above to every update and delete method in the models as well. This would technically work, but I just wanted to know whether it's the correct way to go about it, or if anyone has any other ideas.
Another option would be to check to see if the user's company owned the record prior to modification, but that seems like a double-up on database requests, and I don't really know if there's any benefit to doing it this way.
I'm surprised I couldn't find an answer to this question, I must be searching for the wrong terms :p. But I would appreciate any answers on this topic.
Thanks in advance,
Christian
I'd say you're going about this the correct way. Keeping all of the items in the same tables will allow you to run global statistics as well as localized statistics - so I think this is the better way to go.
I would also say that it would be best to add the where clause you mention to each query (whether it's a get, update, or delete). However, I'm not sure you'd want to manually go in and do that for all of your queries. I would suggest you override those methods in your models to add the relevant where clauses. That way, when you call $this->model->get(), you will automatically get the companyID where clause added to the query.
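A hedged sketch of that idea in CodeIgniter; MY_Model follows CI's base-class extension convention, but the table names, session key and method names here are assumptions, not the asker's code:

<?php
// Assumes the database and session libraries are loaded, and that the
// logged-in user's companyID is stored in the session at login.
class MY_Model extends CI_Model
{
    protected $table;

    protected function company_id()
    {
        return $this->session->userdata('companyID');
    }

    public function get($where = array())
    {
        $this->db->where('companyID', $this->company_id());
        return $this->db->get_where($this->table, $where)->result();
    }

    public function update($where, $data)
    {
        $this->db->where('companyID', $this->company_id());
        $this->db->where($where);
        return $this->db->update($this->table, $data);
    }

    public function delete($where)
    {
        $this->db->where('companyID', $this->company_id());
        $this->db->where($where);
        return $this->db->delete($this->table);
    }
}

// A concrete model then just sets its table name:
// class Quote_model extends MY_Model { protected $table = 'quotes'; }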
From the looks of things, this might be more of an API-type system (otherwise it's simply a normal user authentication system).
Simple Authentication
Anyway, the best bet I can see for an API is to have two tables, companies and users.
In the companies table have a companyID and a password; in the users table, link each user to a company.
Then, when a user makes a request, have them send the companyID and password along with every request.
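A minimal sketch of that per-request check; the table and column names are assumptions, and the password should of course be stored hashed rather than in plain text:

<?php
// Returns true if the supplied companyID and password match a stored hash.
function authenticate_company(PDO $pdo, $companyId, $password)
{
    $stmt = $pdo->prepare('SELECT password_hash FROM companies WHERE id = ?');
    $stmt->execute(array($companyId));
    $row = $stmt->fetch(PDO::FETCH_ASSOC);

    // crypt()-style comparison against the stored hash; password_verify()
    // is the nicer equivalent if PHP 5.5+ is available.
    return $row && crypt($password, $row['password_hash']) === $row['password_hash'];
}

// At the top of every API request:
// if (!authenticate_company($pdo, $_POST['companyID'], $_POST['password'])) {
//     header('HTTP/1.1 401 Unauthorized');
//     exit;
// }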
OAuth
The next option is OAuth. It's slightly harder to implement, and it means the other end must also set up OAuth authentication.
But, in my opinion, it's much nicer overall to use and a bit more secure.
One way to do it would be with table prefixes. However, if you have a lot of tables already, duplicating them will obviously grow the size of the db rapidly. If you don't have many tables, this should scale. You can set the prefix based on user credentials. See the prefixes section of this page: http://codeigniter.com/user_guide/database/queries.html for more on working with them.
Another option is to not roll out separate instances of the application, but to use separate databases. Here is a post on the CI forum discussing multiple DBs: http://codeigniter.com/forums/viewthread/145901/ Here again you can select the proper DB based on user credentials.
The only other option I see is the one you proposed where you add an identifier to the data designating ownership. This should work, but seems kinda scary.

Realtime MySQL search results on an advanced search page

I'm a hobbyist, and started learning PHP last September solely to build a hobby website that I had always wished and dreamed another more competent person might make.
I enjoy programming, but I have little free time and enjoy a wide range of other interests and activities.
I feel learning PHP alone can probably allow me to create 98% of the desired features for my site, but that last 2% is awfully appealing:
The most powerful tool of the site is an advanced search page that picks through a 1000+ record game scenario database. Users can data-mine to tremendous depths - this advanced page has upwards of 50 different potential variables. It's designed to allow the hardcore user to search on almost any possible combination of data in our database and it works well. Those who aren't interested in wading through the sea of options may use the Basic Search, which is comprised of the most popular parts of the Advanced search.
Because the advanced search is so comprehensive, and because the database is rather small (less than 1,200 potential hits maximum), with each variable you choose to include the likelihood of getting any qualifying results at all drops dramatically.
In my fantasy land where I can wield AJAX as if it were Excalibur, my users would have a realtime Total Results counter in the corner of their screen as they used this page, which would automatically update its query structure and report how many results will be displayed with the addition of each variable. In this way it would be effortless to know just how many variables are enough, and when you've gone and added one that zeroes out the results set.
A somewhat similar implementation, at least visually, would be the Subtotal sidebar when building a new custom computer on IBuyPower.com
For those of you actually still reading this, my question is really rather simple:
Given the time & ability constraints outlined above, would I be able to learn just enough AJAX (or whatever) to pull this one feature off without too much trouble? Would I be able to more or less drop in a pre-written code snippet and tweak it to fit? Or should I consider opening my code up to a trusted & capable individual in the future for this implementation? (Assuming I can find one...)
Thank you.
This is a great project for a beginner to tackle.
First I'd say look into using a library like jQuery (jquery.com). It will simplify the JavaScript part of this, and the manual is very good.
What you're looking to do can be broken down into a few steps:
1. The user changes a field on the advanced search page.
2. The user's browser collects all the field values and sends them back to the server.
3. The server performs a search with the values and returns the number of results.
4. The user's browser receives the number of results and updates the display.
Now for implementation details:
This can be accomplished with JavaScript events such as onchange and onfocus.
You could collect the field values into a JavaScript object, serialize the object to JSON and send it using AJAX to a PHP page on your server.
The server page (in PHP) will read the JSON object and use the data to search, then send back the result as markup or text.
You can then display the result directly in the browser.
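The PHP side of that flow might look roughly like this; the scenarios table and the whitelisted field names are made-up examples, not the asker's actual schema:

<?php
// Reads the posted filter values, builds a parameterized WHERE clause from a
// whitelist of known fields, and returns the matching row count as JSON.
// $pdo is assumed to be an existing PDO connection.
$filters = json_decode(file_get_contents('php://input'), true);

$allowed = array('year', 'theatre', 'side', 'scenario_type'); // example fields
$where   = array();
$params  = array();

foreach ($allowed as $field) {
    if (isset($filters[$field]) && $filters[$field] !== '') {
        $where[]  = "$field = ?";
        $params[] = $filters[$field];
    }
}

$sql = 'SELECT COUNT(*) FROM scenarios';
if ($where) {
    $sql .= ' WHERE ' . implode(' AND ', $where);
}

$stmt = $pdo->prepare($sql);
$stmt->execute($params);

header('Content-Type: application/json');
echo json_encode(array('total' => (int) $stmt->fetchColumn()));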
This may seem like a lot to take in but you can break each step down further and learn about the details bit by bit.
Hard to answer your question without knowing your level of expertise, but check out this short description of AJAX: http://blog.coderlab.us/rasmus-30-second-ajax-tutorial
If this makes some sense then your feature may be within reach "without too much trouble". If it seems impenetrable, then probably not.
