Should I check a unique constraint with php? - php

Maybe this question has already been asked, but I don't really know how to search for it:
I have the postgres-table "customers", and each customer has its own unique name.
In order to achieve this, I added a unique constraint to this column.
I access the table with php.
When the user now tries to create a new customer, with a name that has already been taken, the database says "Integrity Constraint Violation", and php throws an error.
What I want to do is to show an error in the html-input-field: "Customer-Name already taken" when this happens.
My question is how I should do this.
Should I catch the PDO exception, check whether the error code is "UNIQUE VIOLATION", and then display a message based on the exception, or should I check for duplicate names with an additional statement before I even try to insert a new row?
What is better practice: issuing an additional SQL statement, or catching and analyzing error codes?
EDIT:
I'm using transactions, and I'm catching any exception in order to rollback.
The question is whether I should filter out unique violations so they don't lead to a rollback.
EDIT2:
If I'm using the exception-method, I would have to analyse the exception-message in order to ensure that the unique-constraint really belongs to the "name"-column.
This is everything I get from the exception:
["23505",7,"FEHLER: doppelter Schlüsselwert verletzt Unique-Constraint <customers_name_unique>\nDETAIL: Schlüssel <(name)=(test)> existiert bereits."]
The only way to get information about the column is to check if "customers_name_unique" exists (it's the name of the unique-constraint).
But as you can also see, the message is in German, so the output depends on the system and might change.
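For illustration, the constraint-name check described above might look roughly like the sketch below. It assumes the errorInfo array has the shape quoted above and relies on the fact that the constraint identifier itself is not translated, even when the surrounding message text is. The map and the function name `violatedField` are made up for this example; only "customers_name_unique" comes from the real schema.

```php
<?php
// Hypothetical map from constraint name to form field. Only
// "customers_name_unique" is taken from the question.
const CONSTRAINT_TO_FIELD = [
    'customers_name_unique' => 'name',
];

/**
 * Returns the form field whose unique constraint was violated, or null.
 * $errorInfo is expected to look like PDOException::$errorInfo:
 * [SQLSTATE, driver-specific code, driver-specific message].
 */
function violatedField(array $errorInfo): ?string
{
    // "23505" is PostgreSQL's SQLSTATE for a unique violation.
    if (($errorInfo[0] ?? null) !== '23505') {
        return null;
    }

    // Look only at the first line, so that user data echoed in the
    // DETAIL line cannot accidentally match a constraint name.
    $message = (string) ($errorInfo[2] ?? '');
    $firstLine = explode("\n", $message)[0];

    foreach (CONSTRAINT_TO_FIELD as $constraint => $field) {
        if (strpos($firstLine, $constraint) !== false) {
            return $field;
        }
    }
    return null;
}

// The errorInfo array quoted in the question.
$errorInfo = [
    '23505',
    7,
    "FEHLER: doppelter Schlüsselwert verletzt Unique-Constraint <customers_name_unique>\nDETAIL: Schlüssel <(name)=(test)> existiert bereits.",
];

$field = violatedField($errorInfo);
echo $field === 'name'
    ? "Customer name already taken\n"
    : "Unexpected database error\n";
```

This still depends on the message containing the constraint name, so it is a workaround rather than a proper API.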

You should catch the PDO exception.
It is quicker to let the database fail than to look up whether the record already exists.
This also makes the application "less aware" of the business logic in the database. Declaring the unique index in the database is itself business logic, and since the database already enforces that rule, it's better to skip the same check in the other layers (the application).
Also, when the database layer enforces the constraint you avoid race conditions. If your application checks for consistency itself, you risk that another user adds the same record after the first request has checked that the name is available.
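A minimal sketch of the catch-the-exception approach, including the rollback mentioned in the question. To keep it runnable without a server it uses an in-memory SQLite database, which is an assumption of this example; with PostgreSQL the SQLSTATE for a unique violation is "23505", while SQLite through PDO reports the generic "23000". The helper name `insertCustomer` is made up.

```php
<?php
// "23505" = PostgreSQL unique_violation. "23000" = generic integrity
// constraint violation, which is what PDO's SQLite driver reports, so on
// SQLite this check is coarser than on PostgreSQL.
const UNIQUE_VIOLATION_STATES = ['23505', '23000'];

/**
 * Hypothetical helper: returns true on success and false if the name is
 * already taken. Any other database error is rethrown after the rollback.
 */
function insertCustomer(PDO $pdo, string $name): bool
{
    $pdo->beginTransaction();
    try {
        $stmt = $pdo->prepare('INSERT INTO customers (name) VALUES (:name)');
        $stmt->execute([':name' => $name]);
        $pdo->commit();
        return true;
    } catch (PDOException $e) {
        // Roll back in every case; the failed transaction is of no use.
        $pdo->rollBack();
        if (in_array((string) $e->getCode(), UNIQUE_VIOLATION_STATES, true)) {
            return false;
        }
        throw $e;
    }
}

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)');

$first = insertCustomer($pdo, 'test');   // name is free
$second = insertCustomer($pdo, 'test');  // same name again

echo $second ? "Customer created\n" : "Customer name already taken\n";
```

Because the insert itself is the check, there is no window between "check" and "insert" in which another request could take the name.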

The question doesn't really belong here but I'll answer you.
Exceptions are situations when something exceptional happens. It means that you shouldn't use them to handle situations that may happen often. If you do, it's like GOTO code. The better solution is to check beforehand whether there is a duplicate row. However, the solution with exceptions is easier, so you need to decide whether you want something that merely works or something that works and is written as it should be.

I would catch the exception, because (thanks to concurrency) that can happen anyway, even if you check with an extra query beforehand.

Errors are bad; I'd rather check that the name does not exist before adding it. You should still check for errors on insert, though, to handle the situation where concurrent scripts try to insert the same name (there is a short window between checking for existence and the insert, since it's not a transaction).

On save, check whether the value already exists (by a simple field; in your case, the constrained column).
If it does, show a notification to the user about the duplicate. But don't force the DB server to return exceptions to you.

Related

Should I rely on MySQL errors in PHP code?

I was wondering if logic duplication can be reduced on this one. Let's say I have a users table and email column, which should be unique per record. What I normally do, is having a unique index on the column and validation code that checks if the value is already used:
SELECT EXISTS (SELECT * FROM `users` WHERE `email` = 'foo@bar.com')
Is it possible and practical to skip the explicit check and just rely on the database error when trying to insert a non-unique value? If we repeat the uniqueness logic in two layers (database and application code), it's not really DRY.
I do that pretty often. In my custom database class I throw a specific exception for violated restrictions (this can be easily deduced from the numeric error code returned by MySQL) and I catch such exception upon insert.
It has the benefit of simplicity and it also prevents race conditions—MySQL takes care of data integrity in both variants, data itself and concurrent accesses.
The only drawback is that it can be tricky to figure out which index is failing when you have more than one (for instance, you may want to have a unique email and a unique username). MySQL drivers only report the violated key name in the text of the error message, there's no specific API for it.
In any case, you may want to inform the user about available names in an earlier stage, so a prior check can also make sense.
It makes sense to enforce the uniqueness of the email address in the database. Only that way you can be sure it is really unique.
If you do it only in the PHP code then any error in that code may corrupt the database.
Doing it in both is not needed but does, in my opinion, not offend against the DRY rule. For instance, you might want to check the presence of an email address during registration of a new user, and not only rely on the database reporting an error.
I assume by "DRY" you mean Don't Repeat Yourself. Applying the same logic in more than one place is not intrinsically bad - there's the old adage "measure twice, cut once".
In a more general case, I usually follow the pattern of applying the insert and catching the constraint violation, but for users with email addresses it's a much more complicated story. If your email is the only attribute required to be unique, then we can skip over a lot of discussion about a person having more than one account and working out which attribute is not unique when a constraint violation is reported. That the email is the only unique attribute is implied in your question, but not stated.
Based on your comments you appear to be sending this SQL from your PHP code. Given that, there are 2 real issues with polling the record first:
1) Performance and Capacity: it's an extra round trip, parse and read operation on the database
2) Security: Giving your application user direct access (particularly to tables controlling access) is not good for security. It is much safer to encapsulate this as a stored procedure/function running with definer privileges and returning messages more aligned to the application logic. Even if you still go down the route of implementing poll first / insert if absent, you are eliminating most of the overhead in issue 1.
You might also spend a moment considering the difference between your query and...
SELECT 1 FROM `users` WHERE `email` = 'foo@bar.com'
On top of the database constraint, you should check if the email given already exists in it before trying to insert. Handling it that way is cleaner and allows for better validation and response for the client, without throwing an error.
The same goes for classic constraints such as MIN / MAX (note that MySQL versions before 8.0.16 parse such CHECK constraints but ignore them). You should check, validate and return a validation error message to the client before committing any change to the database.

Is it redundant to check if a value is unique in both the application and database?

So let's say a user registers for an account, I would like to check if the email being used is already associated with another account...
On the database side, I put a unique constraint on the email column. Now on the application side, should I run a query to check whether that email is already in use and then if it isn't, run another query to insert the user? Or should I ignore that step and since I already have a unique constraint in the database column, I should just attempt to insert the user and if I get an error, I know the email is already in use?
Is running a query just to check for the email being redundant or is it a necessary step and why?
I am using PHP and MySQL.
Yes, it's redundant, but you might want to do it.
You have two choices, really:
1) Check in the app, then add to the database. There is a race condition there, so you need to wrap the check-and-set in an application-level mutex, or
2) Just push into the database, catch any exception raised from the database layer and handle it (taking care to distinguish a constraint violation on the column from all other kinds of possible runtime exceptions).
Depending upon the relative ease of these two approaches, decide which works for you.

mysql surpress dupe key error

When I need to know if something is unique before it gets inserted, I usually just attempt to insert it and then, if it fails, check if the mysql_errno() is 1062. If it is, I know it failed as a duplicate key and I can do whatever I need to do.
The most common place for this is in a user table. I set the email as unique, as that's the "username" for logging in. Instead of running additional queries to check uniqueness when processing registration forms, I just compile the query, execute it and check for the 1062 error number. If it fails with 1062, I tell the user nicely that the email is registered and all is good.
However i recently set up a very basic MITM sql query function which gives myself and other developers on the system access to query times, a log of all the sql queries at the bottom of the page, and most importantly, a function which establishes the mysql connection to the correct database on demand (rather than having to do the connect and pass link identifiers manually).
The SQL error log this function creates on disk contains all my duplicate entries. This obviously doesn't look good to other people seeing errors (even though they're handled and expected). Is there a way of suppressing these errors somehow while still being able to check the mysql_errno()?
Whilst doing a bit of housework on my account here at SO, I thought it best to answer this with my findings so i can close it. This is basically a conclusion from my last comment above.
If you (like me) use certain MySQL error codes in your application to reduce validation queries or code (duplicate key being the most common, I find), the only way to stop an error being thrown is to catch the error inside MySQL and handle it. I won't go into the how-to here, but a good place to get started is:
http://dev.mysql.com/doc/refman/5.0/en/declare-handler.html
Note: just for the new devs out there, also don't forget to check out "ON DUPLICATE KEY" (google it). It was something blindly suggested to me elsewhere. It doesn't fit in this example, but I've used it for years to save checking for duplicate records before insertion (it does not return a failure on duplicate entries, so it's only good if you were thinking of using a duplicate error handler to perform an update instead... hence finding your way here).

Can I use foreign key restrictions to return meaningful UI errors with PHP

I want to start by saying that I am a big fan of using foreign keys and have a tendency to use them even on small projects to keep my database from being filled with orphaned data. On larger projects I end up with gobs of keys which end up covering upwards of 8 - 10 layers of data.
I want to know if anyone could suggest a graceful way of handling 'expected errors' from the MySQL database in a way that I can construct meaningful messages for the end user. I will explain 'expected errors' with an example.
Let's say I have a set of tables used for basic discussions:
discussion
questions
responses
users
Hierarchically they would probably look something like this:
-users
--discussion
---questions
----responses
When I attempt to delete a user, the FKs will check discussions, and if any discussion exists the deletion is restricted; deleting a discussion checks questions, and deleting a question checks responses. An 'expected error' in this case would be attempting to delete a user: unless they are newly created, I can anticipate that one or more foreign keys will fail, causing an error.
What I WANT to do is to catch that error on deletion and be able to tell the end user something like 'We're sorry, but all discussions must be removed before you can delete this user...'.
Now I know I can keep and maintain matching arrays in PHP and map specific errors to messages, but that is messy and prone to becoming stale, or I could manually run a set of selects prior to attempting the deletion, but then I am doing just as much work as without using FKs.
Any help here would be greatly appreciated, or if I am just looking at this completely wrong then please let me know.
On a side note I generally use CodeIgniter for my application development, so if that would open up an avenue through that framework please consider that in your answers.
Thanks in Advance
Sadly, MySQL does not expose the ability to define a custom error like you would with SQL Server or Oracle.
Bug/Feature Request #16999
Worklog #2110 spec's the behavior for v5.5
Workaround
Check this blog post about using a UDF to be able to define custom errors.
Sounds like you need to define your foreign keys with ON DELETE CASCADE. This will delete any referenced data in other tables.
You shouldn't be relying on the database to create errors for your application code. The FKs are there for when your app code messes up and tries to delete something it shouldn't.
If you really want to give the user a nice error message you will have to run the selects first, and build the appropriate error message.
edit
You can check for foreign keys in one select. If you are using an ORM like doctrine, you don't even have to specify the join, just tell it what fields to select, then check each table for nonzero rows.
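As a rough sketch of "run the select first, then build the message": the example below counts referencing rows before deleting, and still catches the foreign key error in case a row appears in between. It runs against an in-memory SQLite database so it needs no server. The column names (`user_id`) and the helper `deleteUser` are assumptions; only the table names come from the question.

```php
<?php
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('PRAGMA foreign_keys = ON'); // SQLite enforces FKs only when asked
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
$pdo->exec('CREATE TABLE discussion (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE RESTRICT
)');
$pdo->exec("INSERT INTO users (id, name) VALUES (1, 'alice')");
$pdo->exec("INSERT INTO users (id, name) VALUES (2, 'bob')");
$pdo->exec('INSERT INTO discussion (id, user_id) VALUES (10, 1)');

/**
 * Hypothetical helper: returns null if the user was deleted,
 * otherwise a message suitable for the UI.
 */
function deleteUser(PDO $pdo, int $userId): ?string
{
    $blocked = "We're sorry, but all discussions must be removed before you can delete this user.";

    // One select that counts the rows referencing this user.
    $check = $pdo->prepare('SELECT COUNT(*) FROM discussion WHERE user_id = ?');
    $check->execute([$userId]);
    if ((int) $check->fetchColumn() > 0) {
        return $blocked;
    }

    try {
        $pdo->prepare('DELETE FROM users WHERE id = ?')->execute([$userId]);
        return null;
    } catch (PDOException $e) {
        // SQLSTATE class 23 = integrity constraint violation, e.g. a
        // discussion was created between the check and the delete.
        if (substr((string) $e->getCode(), 0, 2) === '23') {
            return $blocked;
        }
        throw $e;
    }
}

$aliceResult = deleteUser($pdo, 1); // has a discussion
$bobResult = deleteUser($pdo, 2);   // has none
```

The foreign key stays in place as the safety net; the select only exists to produce a friendlier message.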

Should a PHP application perform error handling on incorrect database values?

Imagine this... I have a field in the database titled 'current_round'. This may only be in the range of 0,1,2,3.
Through the application logic, it is impossible to get a number above 3 or less than 0 into the database.
Should there be error checking in place to see if the value is malformed (i.e. not in the range 0-3)? Or is this just unnecessary overhead? Is it OK to assume values in a database are correctly formatted/ranged etc. (assuming you correctly sanitise/evaluate all user input)?
I generally don't validate all data from the database. Instead I try to enforce constraints on the database. In your case, depending on the meaning of 0, 1, 2, 3, I might use a lookup table with a foreign key constraint, or if they are just numeric values I might use a check constraint (support differs from one DB vendor to the next).
This helps protect against changes made to the DB by someone with direct access and/or future applications that may use the same DB but not share your input validation process.
Wherever you decide to place validation prior to insertion in the database is where you should catch these things.
The process of validation should take place in one place and one place only. Depending on how your application is structured:
Is it procedural or object oriented?
If it is object oriented, then are you using an Active Record pattern, Gateway pattern or Data Mapper pattern to handle your database mapping?
Do you have domain objects that are separate from your database abstraction layer?
Then you will need to decide on where to place this logic in your application.
In my case, domain objects contain the validation logic and functions with data mappers that actually perform the insert and update functions to the database. So before I ever attempt to save information to the database, I confirm that there are valid values.
Get the database to do this for you. Most advanced DBMS (check out free DB2 Express-C at http://FreeDB2.com) allow you to define constraints. This way you are getting the database to ensure semantic integrity of your data. Getting this done in application code will work at the beginning but you will invariably find down the line that it will stop working for various reasons. You may have additional applications populate data in to the database or you may get a bug creeping in to existing app. The thing that happens most often is you get new people to work on the application and they will add code that will fail to perform the same level of checking that you have done.
In general, you should check for what you're expecting, either value or type. And act appropriately. Only after it fails all checks should maybe some code think about working out what to do with the 'wrong' value and how to fix things. This applies with a state value, like what you have, or with an input type that needs to be the correct type.
The constraints should be put on the database; just remember to catch any exceptions thrown if your application should by any chance try to insert/update an invalid value.
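A small sketch of that combination: a check constraint on `current_round` plus a catch around the insert. It uses an in-memory SQLite database so it can run as-is; the table name `games` and the helper `saveRound` are invented for the example. On PostgreSQL a check violation has SQLSTATE "23514", while SQLite reports the generic "23000", so the sketch tests only for class 23.

```php
<?php
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// "games" is a hypothetical table; current_round and its 0-3 range
// come from the question.
$pdo->exec('CREATE TABLE games (
    id INTEGER PRIMARY KEY,
    current_round INTEGER NOT NULL CHECK (current_round BETWEEN 0 AND 3)
)');

/**
 * Hypothetical helper: returns true if the row was stored and false if
 * the database rejected it because of an integrity constraint.
 */
function saveRound(PDO $pdo, int $round): bool
{
    try {
        $stmt = $pdo->prepare('INSERT INTO games (current_round) VALUES (?)');
        $stmt->bindValue(1, $round, PDO::PARAM_INT);
        $stmt->execute();
        return true;
    } catch (PDOException $e) {
        // SQLSTATE class 23 = integrity constraint violation.
        if (substr((string) $e->getCode(), 0, 2) === '23') {
            return false;
        }
        throw $e;
    }
}

$accepted = saveRound($pdo, 2); // inside the allowed range
$rejected = saveRound($pdo, 7); // outside the allowed range
```

The application can still validate earlier for nicer messages, but the constraint also covers other applications and manual changes to the database.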
