We’ve already talked about batch processing in SQL in Cursor and Batching, but batching helps with more than saving human labor. One of the best ways to optimize any kind of code is to limit the number of hits it makes against storage (physical or otherwise). That can come at the expense of more CPU cycles, so proper compromises need to be made to ensure a true performance improvement.
In the case of SQL, this is usually achieved by limiting the number of separate calls to the databases/tables and, ultimately, by limiting the number of requests themselves (select/insert/update/delete), wrapping the work into as few statements as possible… Oh, how many times have I seen very simple procedures built around cursors, or around a preliminary check (i.e. IF EXISTS) followed by a possible change request, followed yet again by a closing check or pull request. Why would you do that when you can easily do all of it at once and let the engine optimize properly out of the box? The harsh cultural weight of our 20th-century coding backgrounds, perhaps?
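To make the anti-pattern concrete, here is a minimal sketch (the table, column, and variable names are hypothetical, purely for illustration):

```sql
-- Chatty version: two separate requests, plus a race window between them
IF EXISTS (SELECT 1 FROM Accounts WHERE AccountID = @ID)
    UPDATE Accounts
    SET Balance = Balance + @Amount
    WHERE AccountID = @ID;

-- Set-based version: one request, the engine handles the existence check itself
UPDATE Accounts
SET Balance = Balance + @Amount
WHERE AccountID = @ID;
```

The single UPDATE simply affects zero rows when nothing matches, and @@ROWCOUNT tells you afterwards whether anything actually changed.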
Let’s start with a simple multi-row insert. Say you want to copy records over a month old into some archive table. Don’t use a cursor for that (or at least not one that walks every single row); simply use insert…select (or the analogous set-based forms of update and delete). Simple and straightforward, and the server does all the work for you in one simple swoosh:
INSERT INTO Archive ( OldID, OldValue, OldDate, AddDate )
SELECT ID, Value, AddDate, GETDATE()
FROM OriginalTable
WHERE AddDate < GETDATE() - 30
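The same set-based idea carries over to updates and deletes. In T-SQL these take the join-based FROM form rather than a literal update…select; a sketch, reusing the archive scenario above (the IsArchived column is a hypothetical addition):

```sql
-- Flag the copied rows in the original table in one pass (UPDATE with a join)
UPDATE o
SET o.IsArchived = 1
FROM OriginalTable AS o
INNER JOIN Archive AS a ON a.OldID = o.ID;

-- Or remove the copied rows outright (DELETE with a join)
DELETE o
FROM OriginalTable AS o
INNER JOIN Archive AS a ON a.OldID = o.ID;
```

Either way it is still one statement per logical operation, and the optimizer gets the whole picture at once instead of a row-by-row trickle.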
Want to know what to do if you not only need to process a large number of rows, but also need to use information about which rows were touched? The post on the great OUTPUT command is out already…