Performance issues are like a plague. Everyone seems determined to treat the symptom when they should be focused on the cause. There is a sense that any performance issue can be resolved by increasing the hardware spec that it is running on. I believe this is a fallacy and an expensive one at that. By scaling up the hardware you are most likely scaling up and obfuscating the problem.
Prevention is better than cure
With a bit of planning and by following application design best practices you can save an inordinate amount of time troubleshooting performance issues. Issues that would just not exist in a well designed application. There are really two types of application performance issues that you will find. One is an issue in a third party piece of software which you have absolutely no control over. It should, however, be logged with that third party and if you are lucky they will re-write part of the app which will then be pushed out in the next set of patches.
The second type is an internally written app that you have full control over. You can actually redesign and rewrite the app in order to remove the problem. This will no doubt require a lot of testing and a roll out to whoever uses the app. This makes it sound like its easier to just add more resources. What is most likely to happen is you get past one hump and land in the next.
Weeks of design can save you months of troubleshooting
At some point someone is going to blame the database. In some cases I absolutely agree. The DB is only ever going to do what you have designed it to do. It just might not be doing it as efficiently as you thought it would. By telling the DB to do something inefficient it will do it badly and make it look like the DB is the problem. What you have control over is what you are telling it to do not how it will do it. So tell it to do something more efficiently and you will find the problem evaporates. Other considerations are as follows
Make sure you choose the right data types. This will save storage space on the disk and will use up less physical memory which can cause performance issues. Do NOT use the default datatype that maybe selected for you. There are millions of tables out there with NVARCHAR(255) datatypes storing single character or digit data. The N itself allows you to store non unicode data and it does this by adding an extra BYTE to every single row of data. You add that up for millions of rows and across thousands of tables and that’s a lot of wasted space. So think about what type of data you are storing and what would best suit that data.
There are thousands of wide tables (500 columns plus) out there that should be multiple thinner tables. Is normalisation a difficult process? Not really, as it’s a logical process to help you make an efficient design to store your data in the least amount of space whilst removing duplicates. It will also allow you to maintain accurate data by not storing different values for the same customer or order in multiple locations. Most applications will not even need more than 3rd normal form.
The problem does come about when applications get added to over time. An extra column here and there starts to add up until someone thinks maybe the table is a bit large let’s just split it randomly in two. This is not how to do normalisation. You need to look at the relationship of the data you are modelling.
You need to think of situations that you might end up in. For example should you store the customers address on the invoice or just store it as part of the customer table. If you do not store it on the order form and you want to go back to find out what address was used you will be frustrated.
As the address used would be the customers address. If that address has been updated after a company move the order will appear with the new address. In order to maintain the order address history you will need to store the address that it was sent to with the order or you will need an address history table with from and to dates so you can re generate the order with the correct address for the order date. These are things that need to be thought out (designed) first before you write code.
Primary and Foreign keys
Some frameworks like to keep control of referential integrity. Personally and maybe this is a bit old school but I prefer SQL Server to handle it. If it is handled at the SQL Server end there is no getting around it. If it is handled by an abstraction layer it is possible people can ignore it and bypass it by hitting the tables direct. It might just be the DBA in me but I access to data locked down and the data protected.
Actually it is about size
The more data you have the longer it will take to sort, search and return the data you are looking for. We can speed data access up by using indexes but there are still limits like memory size, network speed etc. Sooner or later if you have a very large DB as your OLTP system you are going to need to look at archiving data off or removing it completely into another reporting focused system.
Best thing about having a reporting system is that you can change the index plan. By optimising indexes on a reporting DB data retrieval is prioritised over inserting and updating your data. Whilst your OLTP system needs to be able to update, insert and perhaps delete data. Your reporting system is 100% focused on retrieving data for your reports.
Both systems require different index plans and by separating these two functions into two separate DBs you can have the best of both worlds. If you need to be able to offer reporting as well as OLTP use in the front end app. You have two choices. You can use two different connections strings one for the reports to be shown and the other for the actual live system. Second option is to split it into two DBs. Then all you need to worry about is how accurate do your reports need to be. You will need a separate process to sync the data in the OLTP and reporting systems but this can either be built in or caveat it out.
Performance issues are a pain but they happen in all systems at some points. The way to minimise the issue is to have a decent application design from the start. Second is to choose correct datatypes to minimise memory and storage requirements. Finally the size of the system can be a problem and in most cases splitting out OLTP and reporting workloads can reduce the strain on the live system. Performance issues are rarely solved by one person. Team work is required to track down the root cause which can be design, storage tech, network issues, application servers etc. Starting with a well designed app will help reduce the number of places to look in to resolve it.