Equals column lineage
We already have great functionality for displaying relationships from JOIN and WHERE clauses.
But that's not all.
Remember what we learned in school? If we know that x = y, we can substitute one for the other anywhere. Right? Now, take a look at this query:
We get T1.C1 as the lineage result, correct?
But T1.C1 = T2.C3, which means this query is equivalent to:
See what's happening here? The lineage of the upstream column T2.C3 is hidden from your view!
Have you ever encountered a tool that reveals this to you? Sure, you'll see that there's a dependency on table T2 at the object level. But no details. Good luck debugging that!
It gets even worse! If x = y and y = z, then x = z. Right?
You get the point…
All of this becomes even more complicated when you add mathematical operations, function calls, type conversions, unions…
Can we see additional data lineage generated by the equality conditions in JOIN and WHERE?
Sure! Here's what the full data lineage would look like for the examples above:
Last updated