DEV Community

Discussion on: 10 Things I Hate About SQL

Collapse
 
geraldew profile image
geraldew

On to #3 Aliases don’t work after the Select block

This is very much a matter of specific dialects - therefore of their engines and parsers.
For some, what you have there would be allowed and would do what you expect, but for others not.

In my current workplace:

  • for Teradata SQL that would work
  • for HiveQL it would not.

For HiveQL the problem seems to me to be the parser, as it's quite trivial to wrap the SELECT as a derived table and then do same WHERE clause. When used with the Hive-on-Spark engine, I'm fairly sure that the underlying engine does it in the one pass.

Which is to say that often the rant is about the implementation not the SQL language (but once again, more on this later).

Interestingly, your comment that "Some SQL engines will allow you to use having here instead" - that's not something I've encountered so I'll assume you're correct. I've only ever used HAVING with a GROUP BY.

  • (aside: gosh aren't these uppercase keywords handy when talking _about_SQL ?)

Maybe this is a good point to bring in that modern SQL uses at least three variants of the WHERE clause. There's:

  • WHERE
  • HAVING
  • QUALIFY

where each applies to a separate layer/round of under-the-hood processing. The overview being:

  • WHERE - applies first, filtering rows
  • HAVING - applies next, after the aggregations, typically for the rollups of a GROUP BY
  • QUALIFY - applies last, after the OLAP/analytical/windowing functions have been derived

Note: I'm deliberately leaving out the use of WHERE as a de-facto JOIN controller. That's a topic really about JOINs and has even more issues about dialect, parser and engine variations.

Again, dialects vary: in HiveQL you get the windowing functions but not the QUALIFY clause forcing another round of a table-within-a-table.

  • * I use the old fashioned term of "derived table" for the replacement of a FROM clause with an inner SELECT. The modern trend is to do these using a WITH clause instead. There's much more to it than that, so that's yet another topic.

p.s. predating the use of AS for aliases, some dialects had a (NAMED) mechanism. Strangely these applied at the text level before any real SQL parsing. This would let you specify a name for something inside an expression and re-use it elsewhere in the query - thus only typing it once and no risk of typos in the additional uses. I know dialects where the AS also works like a copy-paste "macro" rather than indicating something should be calculated once at run-time and then re-used rather than independently re-derived. The irony is that by now most engines have optimisers that will spot the repeated derivations regardless of whether they have the same alias or not.

Collapse
 
shaqq profile image
Shaker Islam

Thanks for the detailed explanation, though I believe we agree that this still hearkens back to #10.