Dwh.dev documentation
  • What is Dwh.dev?
  • Features
    • Data catalog
      • PIPEs
      • Fancy SQL Highlight
    • Data Lineage
      • Object-level lineage
      • Column-level Lineage
      • In-query lineage
      • Navigation
      • JOIN and WHERE
      • Equals column lineage
      • Strong and Weak dependencies
      • Default and Virtual Columns
      • Implicit Types casting
      • Circle dependencies
      • Argument forwarding
    • TASKs
  • Integrations
    • Snowflake
      • Offline mode
      • Secured online mode
      • Snowflake Marketplace
    • DBT
    • ETL
    • BI
      • Looker
      • Other
    • API
  • Snowflake SQL Syntax and Behavior
    • Identifiers
    • Reusing column aliases
    • SELECT * ILIKE EXCLUDE REPLACE RENAME
    • Scalar functions name resolution special behavior
    • Functions overloading
    • CTE as an expression alias
    • ASOF Join
    • UDF named arguments
    • Objects auto renaming
    • Columns auto renaming
Powered by GitBook
On this page

Was this helpful?

  1. Features
  2. Data Lineage

Equals column lineage

PreviousJOIN and WHERENextStrong and Weak dependencies

Last updated 8 months ago

Was this helpful?

We already have great functionality for displaying relationships from clauses.

But that's not all.

Remember what we learned in school? If we know that x = y, we can substitute one for the other anywhere. Right? Now, take a look at this query:

SELECT T1.C1 FROM
  T1 JOIN T2 ON T1.C1 = T2.C3

We get T1.C1 as the lineage result, correct?

But T1.C1 = T2.C3, which means this query is equivalent to:

SELECT T2.C3 FROM
  T1 JOIN T2 ON T1.C1 = T2.C3

See what's happening here? The lineage of the upstream column T2.C3 is hidden from your view!

Have you ever encountered a tool that reveals this to you? Sure, you'll see that there's a dependency on table T2 at the object level. But no details. Good luck debugging that!

It gets even worse! If x = y and y = z, then x = z. Right?

SELECT C1 FROM
  T1 
    JOIN T2 ON C1 = C3
    JOIN T3 ON C3 = C5;

You get the point…

All of this becomes even more complicated when you add mathematical operations, function calls, type conversions, unions…

Can we see additional data lineage generated by the equality conditions in JOIN and WHERE?

Sure! Here's what the full data lineage would look like for the examples above:

JOIN and WHERE
Equals lineage #1
Equals lineage #2