Test Driven Development in the world of SQL

Posted on August 22, 2012

7


Any software/web developer worth hiring would tell you that Test Driven Development is important. Although it takes time to do properly,  It can increase productivity as features tend to be developed with a clearer understanding of the problem and make changing business requirements much easier to implement. Importantly, your code quality will improve and there will be less bugs.

In the world of SQL development we are a little behind the curve on this but as SQL Developers, DBAs, or anyone writing SQL code (yes, it is code) we need to embrace Test Driven Development (TDD) and learn to love the power this gives.

What’s important to test? Everything! The more test coverage you have the more confidence you will have in making change. Its not a silver bullet but it will change the way you think about your work and make it cleaner and better architected. From my experiences there’s three types of tests you’ll need to learn about and I’ll cover them before moving onto the some examples; Contract, Data Quality and Process.

The Contract

What do I mean by the contract? Well, the contract between the application code and the data; every Stored Procedure or SQL statement has an implicit contract with the code. The code expects certain metadata when it calls SQL Server , either directly or via a framework. So the contract is simple, the columns and data types expected by the calling code should not change unexpectedly once created.

This contract is not one written in stone, in fact its very fragile and this is why creating a test based on expectations is a good start before testing data or process.

Data Quality

This is related to the expected values when calling a statement or procedure; if you call X with parameters Y and Z you expect data to be 1, 2 and 3. You really need to think about proper coverage and writing tests to test a single value or parameter where possible. Its as simple as this but up until now not that easy to implement.

Process

Sometimes you will want to test that a number of processes have worked as you expect them to,  in Test Driven Development this is often but not exclusively an integration test. An example is you might want to run a test that calls a procedure that inserts some data and then calls another procedure where you test the output. You can assert that individual steps work and that the final output or result is as expected.

So as a guiding rule, Statements and Procedures are not always participating in a complex process but they should have at least one Contract test and one or more Data Quality tests.

Test Driven Development

So you know of the types of tests we need to think about but what is Test Driven Development? I think the best way to think about it is that instead of just going ahead and writing SQL, you need to think of how you can test what you are writing before you write the code. This is kind of a crazy concept to get your head around until you start doing it.  How can I write a test for something that doesn’t exist?

In the land of application and web development, there are many tools to help you with this – Resharper from JetBrains being a personal favourite of mine. However, the basic principles are the same regardless of the product or technology. When testing SQL, specifically a stored procedure, it would look something like this:

  • Create a test to check for the existence of the stored procedure.
  • Run it and it will fail as it doesn’t exist.
  • Do the minimal amount to pass the test; you can create an empty (stub) stored procedure.
  • Run the test and it will pass.

This pattern is known as red, green and refactor.  To repeat; you write your test and it will fail – this is important as the failure will help give you confidence when the test passes. Creating the test first means you are really thinking about your expectations.  Then you make it pass by writing the code and then you refactor (make changes) to improve your code.

Often this cycle runs many times, extracting and refactoring until you are happy with the quality.

You’d then create more tests to test the Contract and Data Quality etc. Each iteration builds up your suite of tests until you have the good coverage and fully functioning code.

How we can achieve this in the world of SQL

Whether you have been developing with SQL for years, new to SQL or have come from a development background it has until now been a frustrating experience. In this article I’m concentrating on a specific product and framework but there are some alternatives. For reference, I’ve summarised my thoughts on those here.

This year RedGate introduced a product called SQL Test that allows you to apply TDD to SQL.  SQL Test is a user interface that runs in SQL Server Management Studio, making it an easy choice for most SQL developers. Internally, it uses an open source library called tSQLt.  What SQL Test really does over and above the free library is offer an intuitive interface and set of command line tools that can run as part of your continuous build and deployment process.

I’ll cover creating the three types of tests I mentioned earlier to give you an overview and understanding of how it works using SQL Test and the tSQLt library.

Let’s get started.

You can download a 28 day trial of RedGate’s SQL Test at http://www.red-gate.com/products/sql-development/sql-test/. Once installed, you can add an existing database to SQL Test by clicking the link in the SQL  Test window.

Add Database to SQL Test

When you add a database to SQL Test, it will install the tSQLt framework into the database you select, set TRUSTWORTHY ON for and enable SQL CLR. This is a requirement of the framework.

Let’s create a Contract test. For this we’ll create a simple table called Property, it’s going to hold details of properties for a fictitious estate agent. For the purposes of this test we will assume this to be an existing table, if it was a new table you could create a test to check it’s existence first.


CREATE TABLE Property

(

PropertyID INT NOT NULL,
EstateAgentId INT NOT NULL,
Longitude FLOAT NOT NULL,
Latitude FLOAT NOT NULL

)

Now, lets assume we want to write a stored procedure to return all the properties for a specific Estate Agent. As our first test is going to be Contract type test we just want to verify that the output contains the correct columns. Later, we’ll write other tests to confirm the Data Quality. You might create a test to check the existence of the stored procedure first but I’ll cover that later. For now, this is our initial stub procedure:


CREATE PROCEDURE up_Property_select_by_EstateAgentId
    @EstateAgentId INT
AS
    SELECT 1
GO

As you can see, this doesn’t actually do much. Its not meant to. We are going to create the test now that will check the contract (metadata) returned by this procedure.

Using SQL Test, click the New Test button. You will get a dialog that allows you to create the test.

new test

The test name should be as descriptive as possible. It will display when you run the tests. Names can contain spaces but avoid special characters. SQL Test will prefix your test name with the word test.

The Test Class is a way to group related tests, SQL Test will put these tests into a schema named the same as the class you specify. Tests are just procedures, once you click Create Test it will open in a new query window ready for you to type the details of the test. SQL Test provides a handy template.


ALTER PROCEDURE [TDD].[test the contract for the property stored proc is correct]
AS
BEGIN
 --Assemble
 -- This section is for code that sets up the environment. It often
 -- contains calls to methods such as tSQLt.FakeTable and tSQLt.SpyProcedure
 -- along with INSERTs of relevant data.
 -- For more information, see http://tsqlt.org/user-guide/isolating-dependencies/

 --Act
 -- Execute the code under test like a stored procedure, function or view
 -- and capture the results in variables or tables.

 --Assert
 -- Compare the expected and actual values, or call tSQLt.Fail in an IF statement.
 -- Available Asserts: tSQLt.AssertEquals, tSQLt.AssertEqualsString, tSQLt.AssertEqualsTable
 -- For a complete list, see: http://tsqlt.org/user-guide/assertions/
 EXEC tSQLt.Fail 'TODO:Implement this test.'

END;

As you can see, it provides a lot of comments that contain guidance. I’ve included the links in the resources at the end of this article. We are going to make some changes so we can run our first test.

</pre>
ALTER PROCEDURE [TDD].[test the contract for the property stored proc is correct]
AS
BEGIN

--Assemble
IF OBJECT_ID(N'_expected', 'U') > 0 DROP TABLE _expected;
CREATE TABLE _expected (PropertyId INT, Point GEOGRAPHY);
INSERT _expected VALUES(999, GEOGRAPHY::Point(5.6 , 8.4, 4326))

EXEC tSQLt.FakeTable @TableName = N'Property'
INSERT dbo.Property VALUES(1, 1, 51.3, 1.37)

DECLARE @expected NVARCHAR(MAX) = 'SELECT * FROM _expected'
DECLARE @actual NVARCHAR(MAX) = 'EXEC [up_Property_select_by_EstateAgentId] 1'

--Assert
EXEC tSQLt.AssertResultSetsHaveSameMetaData @expected, @actual

END
<pre>

So, what’s going on here? Well, we define the expected schema as a command statement. The expectation is that the execution of the procedure will return a resultset with two columns; PropertyId (INT) and Point (GEOGRAPHY). Next we define the actual command. This statement executes the stored procedure with the required parameter.

Whoa! what’s this FakeTable thing? I know we created the table earlier but when writing tests we should make sure the test is self contained and not contaminated by existing data. As the stored procedure uses the table Property, we need a way to protect this table and make our test contained. SQL Test has the concept of a FakeTable. When you use the FakeTable method it renames your real table and creates an empty table with the same schema minus any constraints. All this is done within a transaction, as is everything when you run a test.

So we Fake the Property table and then run the assertion. The assertion method AssertResultSetsHaveSameMetaData will run and compare the two commands. Thanks to Greg Lucus for pointing out that when using AssertResultSetsHaveSameMetaData, there needs to be at least one row returned. Also if using a GEOGRAPHY data type then your comparison tables need to be in the same schema.

If the two have the same contract (metadata), then the test passes otherwise it will fail.

Let’s run the test.

run test

Yes! This is what we wanted. We wrote the test and it included our expectation. When the stored procedure ran it didn’t match what we expected. It Failed. The detailed output shows all the column information we need to resolve this.  Now we need to update our code (the stored procedure)  so that we can pass the test.


ALTER PROCEDURE up_Property_select_by_EstateAgentId
 @EstateAgentId INT
AS
 SELECT PropertyId,
 GEOGRAPHY::Point(Longitude , Latitude, 4326) AS Point
 FROM PROPERTY
GO

When we run the test again, we see that it passes.

run test

Now if the contract changes which is any of the column names or data types, the test will fail again. We now have some very valuable test coverage.

Next we are going to look at a Data Quality test. Given an Estate Agent, return the property associated with that Estate Agent. In our test we are going to prove that the procedure returns the property for the Estate Agent we specify only by Faking the Property table and inserting two rows, one for the Estate Agent we are testing and one for another. We expect that the procedure will just return one. This is our test.

ALTER PROCEDURE [TDD].[test returns a valid property for an agent]
AS
BEGIN
 --Assemble
 EXEC tSQLt.FakeTable @TableName = N'Property'

CREATE TABLE [TDD].[expected]
 (PropertyId INT, Point GEOGRAPHY)
 CREATE TABLE [TDD].[actual]
 (PropertyId INT, Point GEOGRAPHY)

 INSERT INTO Property (PropertyId, EstateAgentId, Longitude, Latitude)
 VALUES( 1, 1, 1.0,1.0), ( 2, 2, 1.0,1.0)

--Act
 INSERT INTO [TDD].[expected] (PropertyId, Point)
 VALUES (1, GEOGRAPHY::Point(1.0,1.0,4326))

INSERT INTO [TDD].[actual]
 EXEC [up_Property_select_by_EstateAgentId] @EstateAgentId =1

 --Assert
 EXEC tSQLt.AssertEqualsTable @Expected = N'[TDD].[expected]',
 @Actual = N'[TDD].[actual]'

END;

As you can see, we fake the Property table again and we create two tables to hold the expected and actual values within the TDD class (schema) so that they are dropped when the test completes. We insert two rows into our faked table, with the expectation that our procedure will return one row. Finally, we insert the results of the procedure into the actual table and then assert that the actual and expected tables match.

When we run this, it fails. Again, as expected as we haven’t added the functionality yet to the procedure to filter the data.

When you start Test Driven Development it is so tempting to write too much code. However, its very important to iterate and only add functionality when you have a covering test.

Now, we will refactor the procedure.

ALTER PROCEDURE up_Property_select_by_EstateAgentId
 @EstateAgentId INT
AS
 SELECT PropertyId,
 GEOGRAPHY::Point(Longitude , Latitude, 4326) AS Point
 FROM PROPERTY
 WHERE EstateAgentId = @EstateAgentId
GO

We run the tests and they all pass.

run all tests

An integration test or process test can be used to make sure a series of events occur for a given scenario. You may have a procedure that calls another which then returns data but it only does this under a specific condition or parameter.  Testing these kind of interactions can be complex but SQL Test and tSQLt offer some handy methods to help you handle many of these complexities.

I’m going to use a method called SpyProcedure which is neat way to determine if a procedure has been called by another procedure or you want to replace functionality of a stored procedure or if you just want to verify parameters that have been passed. What this does is replace your procedure with a doppelganger, it has the same name and signature but basically does nothing apart from log the parameters passed to it. SpyProcedure also allows you to optionally call a command at the same time, useful for insert or updating data as part of your test.

We are going to create a test that checks that an audit has been logged when someone calls the up_Property_select_by_EstateAgentId procedure we have already created. The auditing will be done by a new stored procedure that we will apply the spy to.

Before we can do that we’ll create a test to verify the auditing procedure exists and a test to verify the auditing procedure inserts data when called. This is  a good example of needing to think about what needs to be tested before you write the code.

Checking for the existence of an object is done using the SQL Test method AssertObjectExists. We can write a very simple test using this method:

ALTER PROCEDURE [TDD].[test that the Audit procedure exists]

AS
BEGIN

EXEC tSQLt.AssertObjectExists 'up_Audit_insert'

END;

As expected this test will fail. Let’s create the stub for the procedure.

CREATE PROCEDURE up_Audit_insert
    @Event VARCHAR (50),
    @Date DATETIME
AS
    SELECT 1
GO

Now it passes, lets move onto the data quality test. We need to test that when this is called it inserts data into an Audit table. I’ve include the schema for the Audit table below, see if you can write a test to check it’s existence before you create the table.

CREATE TABLE Audit
(
    AuditId INT NOT NULL IDENTITY(1,1),
    Event VARCHAR (50) NOT NULL,
    EventDate DATETIME NOT NULL
)

Now we can create the test that checks that calling the procedure inserts the expected data by faking the Audit table.

ALTER PROCEDURE [TDD].[test that audit insert proc inserts data]
AS
BEGIN

--Assemble
 DECLARE @expectedEventDate DATETIME = GETDATE()
 DECLARE @expectedEvent VARCHAR(50) = 'Test'

CREATE TABLE [TDD].[expected]
 (
 AuditId INT NOT NULL ,
 Event VARCHAR (50) NOT NULL,
 EventDate DATETIME NOT NULL
 )

 INSERT INTO [TDD].[expected]
 VALUES (1,@expectedEvent, @expectedEventDate)

EXEC tSQLt.FakeTable @TableName = N'Audit',@Identity = 1

 --Act
 EXEC dbo.up_Audit_insert @Event = @expectedEvent,
 @Date = @expectedEventDate

--Assert

 EXEC tSQLt.AssertEqualsTable @Expected = N'[TDD].[expected]',
 @Actual = N'Audit'

END;

This time we have used FakeTable with a new parameter called @Identity, this fakes the table and retains the identity information. Without this, the faked table will not have an identity and the column would be NULL if data was inserted.

You can probably guess the outcome of this test. It fails. This is because we stubbed out the Audit insert procedure and it actually isn’t doing an insert.  Refactoring the stored procedure to include the functionality will make the test pass, try this yourself.

We are are ready to write our final test. This will call our original up_Property_select_by_EstateAgentId procedure and test that the audit procedure was called. When you apply SpyProcedure to a procedure, the log will be created that will store the parameters that got passed. The log is named [Procedure]_SpyProcedureLog. It will contain an _id_ column and a column for every parameter. Our final test looks like this:


ALTER PROCEDURE [TDD].[test that audit is called when property proc is run]
AS
BEGIN
 --Assemble

 EXEC tSQLt.FakeTable @TableName = N'Property'
 EXEC tSQLt.SpyProcedure @ProcedureName = N'up_Audit_insert'

 --Act
 EXEC dbo.up_Property_select_by_EstateAgentId @EstateAgentId = 1

 --Assert
 IF NOT EXISTS (SELECT 1 FROM up_Audit_insert_spyprocedurelog
 BEGIN
 EXEC tSQLt.Fail @Message0 = N'Audit was not called'
 END

END;

We Fake the Property table because we don’t want real data making our test brittle. We apply the spy to the up_Audit_insert procedure so that we can just track it got called. We execute the up_Property_select_by_EstateAgentId procedure and then assert that the log is not empty. If it’s empty, we fail the test with a custom message.

Not surprisingly, the test fails.

Simply refactor your up_Property_select_by_EstateAgentId to include the following code that calls the up_Audit_insert procedure.


DECLARE @eventDate DATETIME = GETDATE()

EXEC dbo.up_Audit_insert @Event = 'Property selected',
 @Date = @eventDate

Now, run all yours tests.

run all tests

We have covered a lot of ground. We have a created Contract (metadata) test, Data Quality tests and a Process test. Hopefully by now you have a good understanding of the basics behind TDD in SQL with SQL Test and tSQLt.

So, how much of this can be done with just tSQLt? Well, as SQL Test is powered by tSQLt maybe not surprisingly apart from not having the user interface to create and run tests it is identical. This makes it very transferable to people within your company or team that don’t have a SQL Test license or if you don’t mind not having the graphical interface. To run tests manually without or without SQL Test installed (you must install tSQLt manually if you have not installed SQL Test) you simply use the Run method of the tSQt framework:

-- runs all tests
tSQLt.Run

-- runs all tests for specific class
tSQLt.Run 'TDD'

-- run specific test
tSQLt.Run @testname = 'TDD.[test the contract for the property stored proc is correct]'

There are currently some limitations that might cause a problem for some environments, most notably as FakeTable renames your existing table it cannot be used when the table you want to fake is replicated or has any indexed views against it. I believe the tSQLt team are working on enhancements to resolve this.

There seems to be active development on tSQLt and now RedGate have incorporated this framework in their commercial product it has by the far the best chance of support in the future.

Conclusion and future reading

And finally, I hope this article has helped you understand how Test Driven Development can and should be applied to SQL development and how easy this can be.

As SQL developers and professionals clean and testable code is often overlooked. Although not directly related to SQL I highly recommend The Clean Coder and Clean Code as great reads to make you a better and cleaner coder of SQL.

These are some useful links covering the subject of this post:

TDD is now accessible to SQL development and we should all learn to love the power that this gives.

Advertisements