We are discussing some design issues in our company and, as usual, different people have different opinions about database design. This is a j2ee based website and performance is a big problem. We are changing several aspects to make a better application including database.
I would like to get your point of view on some database issues. I have following questions, please give your opinion about them.
Foreign Keys - My view is that although we face problems with foreign keys when we TRUNCATE data or in some other scenarios, but still we should stick to using them for referential integrity. This lets us throw the referential integrity problem off our shoulders to RDBMS. But a different opinion of some developers in our company is that FKs are totally un-required and create many problems including performance issues. And instead referential integrity should be managed in application code using object oriented concepts. What is your opinion on this?
Different tables for pieces of data that look same but are logically different - We have many kinds of properties in our project - Component Properties, BOM properties, LineItem properties are some examples. Since each type of property has SAME columns (fields), so we have a single table for storing all properties. In addition, we have a column - parentid, which contains componentid for component properties, bomid for BOM properties and lineitemid for lineitem properties. parentid column is not mapped as a foreign key from any table. What I am suggesting is that either
a) we should have different table for different set of properties so that tables could be simply linked to their parent master table, OR
b) we should remove parentid from Property table, and create multiple 2-column tables for each type of property to link parent to property - that means create component_property_relation table for (componentid, propertyid) ,
bom_property_relation table for (bomid, propertyid) and similarly for lineitem properties.
But a different opinion here is that existing table is fine because all properties are having same number of columns and it is simple to manage single table instead of multiple. What is your opinion about it?
- Surrogate Keys - Right now our company has text values acting as PKs. I am suggesting to change all of them to numeric surrogate keys to have small sized tables and fast retrieval. But a different opinion here is that text keys are descriptive and we want to keep them. What do you think about it?
1 - Use Foreign Keys. R.I is critical and letting the database do what it is good at is a sound strategy. Seems like the descenting developers aren’t looking at the problem from a DBA perspective If you use FK’s then you can comfortably reverse engineer your database along with the relationships giving a nice diagram by which the data can be understood WITHOUT reverting to code. As code matures, well defined relationships between entities often becomes the only source of documentation!
2 - C.J Date - Database Primer - Parent/Child
3 - The ONLY time I would EVER consider using Alpha keys as PK’s would be for ISO standard data like Currency Code and Country Code - but I’ve never done it that way ! Are these same developers discussing performance in point 1 also advocating the use of the slower performing alphanumeric keys?
- Foreign Keys are required to establish business rules in RDBMS using its Codds Rule relational theory. It allows normalition and thus keeps all anolamies(Insertion, Deletion & Updation) away. This is required to allow the stored information/data to be used not only by any web application but also via any batch process or any other external systems. One cannot imagine any database that is so much focussed to be used only by said application which controls problems like data inconcistencies etc. Also in such case ,if the said web application is modified fot some reasons whole database can get corrupted.
One must have all refrerential integrity imposed in his database
- I see that you need TYPE-GROUP modeling pattern. In such a pattern, all similar types are clubbed as one Group and kept in one table. Different TYPES in each GROUP are further stored in child table. Both these tables are referentially integrated where GOUP is master and TYPE table is child.
You can create a table with all common properties and one additional column which is TYPE_CD and is Foreogn Key to a Master table that has all possible Types e.g. Component properties(COM), BOM, LI -Line Items etc.
This should solve all the problems.
- Usage of Surrogate / Business keys is based on customer needs. Using numeric/surrogate keys definetely helps in developing applications easily and faster in performance but at some places they deny customers of defining their data. Say… in a manufacturing firm if an engineer wants to define each part number as unique and the only identifier where Part Numbers follow some naming convention of the company and they just have to use it. You cannot imagine a system that forces them to use system defines part numbers.
Also there are many problems seen in integrating two applications if they use same surrogate keys. Say 1,2,3… is used to define ‘USERS’ in one application and very obviously the same for ‘USERS’ in another application. Now if the two applications are planned to be integerated , how would you intergate them together ? PKs in both the table have to be moved together in one USERS table.
I would say , based on requirement if customer input is required use Business Keys(alpha) else surrogate.