Datamartist is a powerful desktop ETL tool that lets you easily and visually join tables together.
Tables used for SQL Examples
In the screen shots I’ve configured Datamartist to only show the name columns to save space. The SQL code shown is “Select *” so it will return all the columns. You can see that in the Datamartist tool the type of join is selected by just checking the parts of the venn diagram that contain the rows you want.
1) Inner Join SQL Example
select * from dbo.Students S INNER JOIN dbo.Advisors A ON S.Advisor_ID=A.Advisor_ID
2) Left Outer Join SQL Example
select * from dbo.Students S LEFT OUTER JOIN dbo.Advisors A ON S.Advisor_ID=A.Advisor_ID
4) Full Outer Join SQL Example
select * from dbo.Students S FULL OUTER JOIN dbo.Advisors A ON S.Advisor_ID=A.Advisor_ID
5) SQL example for just getting the rows that don’t join
select * from dbo.Students S FULL OUTER JOIN dbo.Advisors A ON S.Advisor_ID=A.Advisor_ID where A.Advisor_ID is null or S.Student_ID is null
6) SQL example for just rows from one table that don’t join
select * from dbo.Students S FULL OUTER JOIN dbo.Advisors A ON S.Advisor_ID=A.Advisor_ID where A.Advisor_ID is null
But what about the duplicate row thing?
Now, since in this case we had a simple one to one relationship, the number of rows that were returned made the venn diagrams make sense, and add up pretty normally with table one and two.
What happens if the data in the tables are not a simple one to one relationship? What happens if we add one duplicate advisor with the same ID, but a different name?
A join will create a row for every combination of rows that join together. So if there are two advisors with the same key, for every student record that has that key, you will have two rows in the inner part of the join. The advisor duplicate makes duplicate student records for every student with that advisor.
You can see how this could add up to a lot of extra rows. The number of rows is the product of the two sets of joining rows. If the tables get big, just a few duplicates will cause the results of a join to be much larger than the total number of rows in the input tables- this is something you have to watch very carefully when joining- check your row counts.
So there you have it. If you want to try joining tables with the Datamartist tool- give it a try. It’s a super fast install, and you’ll be joining like a pro in no time.