Categories
Rants SQL

DISTINCT in UNIONs

Developers absolutely love DISTINCT don’t they? They use them everywhere but often for the wrong reasons. My current pet hate in this area is DISTINCT in UNIONs. I’m guessing they:

  • Don’t realise UNION will give them the DISTINCT list
  • Somehow believe DISTINCT in that part of the UNION will make the statement miraculously faster

None of which are necessarily true – in fact the addition of a DISTINCT can adversely affect execution plans in some cases.

Categories
Hints and Tips SQL

Underscore matching in LIKE clauses

Speaking with an old colleague last week made me realize that some developers do not appreciate the significance of the underscore (_) in LIKE clauses.

A somewhat artificial example from PeopleSoft:

A developer wants to find all record definitions where the override SQL table name starts with ‘PS_’ followed by the string ‘DEFN’. They try this:

SELECT * FROM PSRECDEFN WHERE SQLTABLENAME LIKE 'PS_%DEFN%'

But that gives them too many rows because the underscore matches any character in that position. So they craft a better solution:

Categories
Hints and Tips SQL

Character Classes in SQL Server LIKE

Although SQL Server does not support regular expressions natively, character classes in LIKE clauses offer a subset of the flexibility that is often forgotten. Some examples:

For example:

SELECT TOP 10 * FROM PS_JOB WHERE EMPL_STATUS LIKE '[ALPS]'

Is equivalent to:

SELECT TOP 10 * FROM PS_JOB WHERE EMPL_STATUS IN ('A','L','P','S')

But this is NOT the same as this because the character class check is normally case sensitive:

SELECT TOP 10 * FROM PS_JOB WHERE EMPL_STATUS LIKE '[aLPS]'

You can also mix ranges and lists of values. For example, to find all OPRIDs in PSOPRDEFN starting with a lowercase letter or numerics 1,2,6,7 or 8:

Categories
Peoplesoft PeopleTools SQL SQL Server Tuning

The hidden cost of CONVERT_IMPLICIT() …

There are numerous articles on the web about CONVERT_IMPLICIT() and how it can impact performance of queries in SQL Server. Recently I had cause to look at a SQL statement that referenced tables in two databases on the same SQL Server instance but running on different Peopletools releases. As one system is on PeopleTools 8.49 and the other is on 8.53, there is an underlying difference in how character table columns are created – 8.49 uses CHAR(), 8.53 uses NVARCHAR().

However, when you join tables from the 8.49 system to the 8.53 system on a character column such as EMPLID, SQL Server data type precedence rules will silently add a CONVERT_IMPLICIT() around the 8.49 CHAR() column to “upgrade it” to an NVARCHAR(). Therein lies a performance issue – with large tables on the 8.49 side we see significant spills to TempDB and even TempDB exhaustion in the worst case. Execution plans are adversely affected as a result of the loss of cardinality information as well.

The fix? CAST( … AS CHAR()) around the 8.53 column e.g.

 

1
WHERE CAST(A.EMPLID AS CHAR(11)) = B.EMPLID 

The moral of course of the story is always check your data types – especially if you are joining pre-8.5x and post-8.5x PeopleTools versions.

This is, of course, a generic issue. The Case for Specifying Data Types and Query Performance is an excellent article on the problems you might encounter using .Net or ORM database abstraction layers if you fail to specify your data types correctly.

 

Categories
Peoplecode Peoplesoft PeopleTools SQL

RowInit PeopleCode vs Database Logic

Sometimes you need to do processing logic in RowInit PeopleCode that looks as though it is pretty harmless. But is it? Is RowInit always the correct place to do that logic?

Personally, I would nearly always recommend using RowInit PeopleCode as it is the “Peoplesoft Way” and PeopleSoft developers and functional consultants are more familiar with it. It is also generally easier to maintain and is database agnostic. But there are certain design patterns where I would consider moving logic to the database. Here is one such example: