SQL Coding Standards
- Stephen Fuqua
- Ian Christopher (Deactivated)
For other important guidance, see also:
Naming Conventions
Many projects have more detailed and specific naming conventions. Please review existing files and consult with the product owner when in doubt. For Analytics Middle Tier, please consult the AMT Standards and Guidelines.
SQL object names should use Pascal case.
-- Good create view dbo.MyViewName -- Bad create view dbo.my_view_name ... create view dbo.myViewName ... create view dbo.[my view name] ...
Do not use quoted-strings to enforce the Pascal case naming convention in PostgreSQL code
-- Good create view dbo.MyView -- (!) it is acceptable that PostgreSQL will treat this as 'dbo.myview' -- Bad create view dbo."MyView"
- Object names must be restricted to 63 characters or fewer due to the name length restriction in PostgreSQL.
Use singular nouns, as a table by definition already implies that there are (or could be) multiple records.
-- Good create table dbo.User -- Bad create table dbo.Users
Language Conventions
Data Types
Conceptual Type | SQL Server Type | PostgreSQL Type | Notes |
---|---|---|---|
String | NVARCHAR(size) | VARCHAR(size) | Don't just default to MAX /unbounded strings - use an appropriate length for the expected data. |
Dates | DATETIME or DATETIME2 as needed | TIMESTAMP | Avoid TIMESTAMPTZ on PostgreSQL because the source data generally do not include time zone. |
Guids |
|
| |
Booleans |
| BOOL | |
Auto-incrementing numbers | INT IDENTITY or LONG IDENTITY | serial or BIGSERIAL |
Readability Preferences
- Joins should include the word "INNER" or "OUTER" for maximum clarity.
- Avoid aliasing, where possible. Aliases make the reader work harder to understand the query. However, there will be times when aliases are required, for example when needing to join twice to the same table. Very long table names can sometimes benefit from a reasonable alias as well, to keep line lengths from becoming excessive.
SQL Language
- Where possible, utilize ANSI-SQL for maximum portability between SQL Server and PostgreSQL:
- Do not use identifier quoting — either
[...]
or"..."
. - Prefer the
COALESCE
function overISNULL
when replacing a null value with a concrete value.
- Do not use identifier quoting — either
Make sure that search terms are SARGable in joins and WHERE clauses: don't use put a search term inside a function. Examples:
-- Bad WHERE coalesce(table1.Column1, '') = '' -- Good WHERE table1.Column1 IS NULL or table1.Column1 = '' -- Bad (SQL Server) WHERE YEAR(table1.Column1) = 2019 -- Bad (PostgreSQL) WHERE EXTRACT(YEAR from table1.Column1) = 2019 -- Good (both engines) WHERE table.Column1 BETWEEN '2019-01-01' AND '2019-12-31
PostgreSQL Scripting
When writing scripts for PostgreSQL, including stored procedures, use the PL/pgSQL language.
Formatting
Formatting need not be a strict exercise — adherence to absolute precision in SQL scripts, where various tools have differing (or no) ability to help format a file, is difficult to achieve. The code reviewer should ensure that the file is readable and meets at least the spirit of the following preferences.
- Place the
CREATE
statement on a line by itself. - For the keywords
WITH
,SELECT
,FROM
,WHERE
,GROUP BY
,HAVING
:- Each should be on a line by itself.
- Indent once except when part of a CTE or sub-query.
- Within each clause:
- Each column or condition should be on a line by itself.
- Indent twice (preferably 4 spaces rather than tab character).
- This includes joins.
- In the
ON
clause for joins:- Each criteria should be on a line by itself.
- Each criteria should have an additional indentation.
- Case does not matter — but should be consistent within a query.
Example: Poor Formatting
CREATE VIEW [analytics].[UserDimension] AS SELECT [Staff].[StaffUSI] as [UserKey], [StaffElectronicMail].ElectronicMailAddress as [UserEmail], ( SELECT MAX([LastModifiedDate]) from (VALUES ([Staff].[LastModifiedDate]) -- [StaffElectronicMail] does not have a [LastModifiedDate] ,([StaffElectronicMail].[CreateDate]) ,([ElectronicMailType].[LastModifiedDate]) ) as value([LastModifiedDate]) ) as [LastModifiedDate] FROM [edfi].[Staff] join [edfi].[StaffElectronicMail] ON [Staff].[StaffUSI] = [StaffElectronicMail].StaffUSI JOIN [edfi].[ElectronicMailType] ON [StaffElectronicMail].[ElectronicMailTypeId] = [ElectronicMailType].[ElectronicMailTypeId] WHERE [ElectronicMailType].[CodeValue] = 'Work'
Problems with this example:
- Need to bring the create statement onto one line.
- The
SELECT
keyword should be indented. - First column in select statement is on the same line as
SELECT
. - Mix of case in the SQL keywords (e.g.,
SELECT
andfrom
are inconsistent). - Preference to include keyword
INNER
on the joins. - Join conditions must be separated.
- The
WHERE
condition should be on a line by itself.
Example: Improved Formatting
CREATE VIEW analytics.UserDimension AS SELECT Staff.StaffUSI as UserKey, StaffElectronicMail.ElectronicMailAddress as UserEmail, ( SELECT MAX(LastModifiedDate) FROM (VALUES (Staff.LastModifiedDate) -- StaffElectronicMail does not have a LastModifiedDate ,(StaffElectronicMail.CreateDate) ,(ElectronicMailType.LastModifiedDate) ) as value(LastModifiedDate) ) as LastModifiedDate FROM edfi.Staff INNER JOIN edfi.StaffElectronicMail ON Staff.StaffUSI = StaffElectronicMail.StaffUSI INNER JOIN edfi.ElectronicMailType ON StaffElectronicMail.ElectronicMailTypeId = ElectronicMailType.ElectronicMailTypeId WHERE ElectronicMailType.CodeValue = 'Work'
Contents