Friday, March 18, 2011

Self Join

I've been on a kick about posting on joining tables lately. Sometimes you will come across a situation where you need to join a table to itself when executing a query. You can do this simply by aliasing your tables and using an inner join. There is no 'standard' scenario to explain this concept, so I decided to use a basic forum messaging system as our scenario.

Problem: Design a database where people can make posts, and people can comment on other people's post. Keep in mind that people can also comment on other peoples comments. Only use one table.

Let's start by deciding how our table will be. We want to keep track of the user who posted the comment, the subject and body of the comment, the date the comment was posted, and if the post is a comment, the comment that the post applies to.

Posts
-----------
postID        PK
postedBy    VARCHAR(50)
subject        VARCHAR(50)
body           VARCHAR(500) -- Note: this is an unusually large size for a comment's size.
datePosted  DATE
parentPost   INT

So, at this point we can create our database and table:

  1. USE master;
  2. GO
  3. CREATE DATABASE Forum;
  4. GO
  5. USE Forum;
  6. -- Create a Posts table.
  7. CREATE TABLE Posts
  8. (
  9.     postID INT IDENTITY(1,1) PRIMARY KEY
  10.     , postedBy VARCHAR(50)
  11.     , [subject] VARCHAR(50)
  12.     , body VARCHAR(500)
  13.     , datePosted DATE
  14.     , parentPost INT -- This field will be used for comments on our posts.
  15. );

Next, we will insert some records to play with:

  1. -- Insert some test records into our Posts table.
  2. INSERT INTO Posts
  3. (
  4.     postedBy
  5.     , [subject]
  6.     , body
  7.     , datePosted
  8. )
  9. VALUES
  10. (
  11.     'Self Destruct'
  12.     , 'Test Message'
  13.     , 'This is test message.'
  14.     , '3/17/2011'
  15. );
  16. INSERT INTO Posts
  17. (
  18.     postedBy
  19.     , [subject]
  20.     , body
  21.     , datePosted
  22. )
  23. VALUES
  24. (
  25.     'Self Destruct'
  26.     , 'Test Message #2'
  27.     , 'This is another test message.'
  28.     , '3/18/2011'
  29. );
  30. INSERT INTO Posts
  31. (
  32.     postedBy
  33.     , [subject]
  34.     , body
  35.     , datePosted
  36. )
  37. VALUES
  38. (
  39.     'Self Destruct'
  40.     , 'Test Message #3'
  41.     , 'This is yet another test message.'
  42.     , '3/18/2011'
  43. );
  44. INSERT INTO Posts
  45. (
  46.     postedBy
  47.     , [subject]
  48.     , body
  49.     , datePosted
  50.     , parentPost
  51. )
  52. VALUES
  53. (
  54.     'Self Destruct'
  55.     , 'Good post!'
  56.     , 'That was a good post'
  57.     , '3/18/2011'
  58.     , 1
  59. );
  60. INSERT INTO Posts
  61. (
  62.     postedBy
  63.     , [subject]
  64.     , body
  65.     , datePosted
  66.     , parentPost
  67. )
  68. VALUES
  69. (
  70.     'Self Destruct'
  71.     , 'Great post!'
  72.     , 'That was a great post'
  73.     , '3/18/2011'
  74.     , 1
  75. );
  76. INSERT INTO Posts
  77. (
  78.     postedBy
  79.     , [subject]
  80.     , body
  81.     , datePosted
  82.     , parentPost
  83. )
  84. VALUES
  85. (
  86.     'Self Destruct'
  87.     , 'Lalalala'
  88.     , 'allalalalala'
  89.     , '3/18/2011'
  90.     , 2
  91. );
  92. INSERT INTO Posts
  93. (
  94.     postedBy
  95.     , [subject]
  96.     , body
  97.     , datePosted
  98.     , parentPost
  99. )
  100. VALUES
  101. (
  102.     'Self Destruct'
  103.     , 'Bad Post'
  104.     , 'That was a bad post.'
  105.     , '3/18/2011'
  106.     , 3
  107. );

Now we got to think about how to get our original posts. Realize that if it is a normal post, it will not have a parentPost because it is not a comment on someone else's post. So we will start there:

  1. -- Get all parent posts.
  2. SELECT p.[subject] AS 'Subject', p.[body] AS 'Body', p.postedBy AS 'Posted By' , p.datePosted AS 'Posted On'
  3. FROM Posts AS P
  4. WHERE P.parentPost IS NULL;

Results of previous query.
Now, let's get all the posts that have comments associated with them. We will do this by creating an inner join from our Posts table, back to our Posts table. What you will need to is create two aliases for your table to compare back to itself. Let's learn by example:

  1. -- Let's retrieve all posts that have comments.
  2. SELECT p.[subject] AS 'Subject', p.[body] AS 'Body', p.postedBy AS 'Posted By' , p.datePosted AS 'Posted By', C.[subject] AS 'Comment'
  3. FROM Posts AS P
  4. INNER JOIN Posts AS C
  5. ON P.postID = C.parentPost;

In the above query, we are essentially referencing two copies of the Posts table, I called them P and C in our query. The query uses a simple inner join to grab all records from our C Posts table that have a parentPost value other than null.

The query results look like:

Results of our comments query.

You can use this technique to perform quite a range of cool things, such as comparing a table to itself to narrow query results

Referring back to the posts example, we can use another technique (called sub querying, which will be explained in future post) to obtain posts and the number of comments on that post:

  1. SELECT p.[subject] AS 'Subject', p.[body] AS 'Body', p.postedBy AS 'Posted By' , p.datePosted AS 'Posted On', (SELECT COUNT(T.postID) FROM Posts AS T WHERE T.parentPost = p.postID) AS 'Comments'
  2. FROM Posts AS P
  3. WHERE P.parentPost IS NULL;

The results look like this:

Results of our sub-query retrieving posts and the number of comments.

Full script:

  1. USE master;
  2. GO
  3. CREATE DATABASE Forum;
  4. GO
  5. USE Forum;
  6. -- Create a Posts table.
  7. CREATE TABLE Posts
  8. (
  9.     postID INT IDENTITY(1,1) PRIMARY KEY
  10.     , postedBy VARCHAR(50)
  11.     , [subject] VARCHAR(50)
  12.     , body VARCHAR(500)
  13.     , datePosted DATE
  14.     , parentPost INT -- This field will be used for comments on our posts.
  15. );
  16. -- Insert some test records into our Posts table.
  17. INSERT INTO Posts
  18. (
  19.     postedBy
  20.     , [subject]
  21.     , body
  22.     , datePosted
  23. )
  24. VALUES
  25. (
  26.     'Self Destruct'
  27.     , 'Test Message'
  28.     , 'This is test message.'
  29.     , '3/17/2011'
  30. );
  31. INSERT INTO Posts
  32. (
  33.     postedBy
  34.     , [subject]
  35.     , body
  36.     , datePosted
  37. )
  38. VALUES
  39. (
  40.     'Self Destruct'
  41.     , 'Test Message #2'
  42.     , 'This is another test message.'
  43.     , '3/18/2011'
  44. );
  45. INSERT INTO Posts
  46. (
  47.     postedBy
  48.     , [subject]
  49.     , body
  50.     , datePosted
  51. )
  52. VALUES
  53. (
  54.     'Self Destruct'
  55.     , 'Test Message #3'
  56.     , 'This is yet another test message.'
  57.     , '3/18/2011'
  58. );
  59. INSERT INTO Posts
  60. (
  61.     postedBy
  62.     , [subject]
  63.     , body
  64.     , datePosted
  65.     , parentPost
  66. )
  67. VALUES
  68. (
  69.     'Self Destruct'
  70.     , 'Good post!'
  71.     , 'That was a good post'
  72.     , '3/18/2011'
  73.     , 1
  74. );
  75. INSERT INTO Posts
  76. (
  77.     postedBy
  78.     , [subject]
  79.     , body
  80.     , datePosted
  81.     , parentPost
  82. )
  83. VALUES
  84. (
  85.     'Self Destruct'
  86.     , 'Great post!'
  87.     , 'That was a great post'
  88.     , '3/18/2011'
  89.     , 1
  90. );
  91. INSERT INTO Posts
  92. (
  93.     postedBy
  94.     , [subject]
  95.     , body
  96.     , datePosted
  97.     , parentPost
  98. )
  99. VALUES
  100. (
  101.     'Self Destruct'
  102.     , 'Lalalala'
  103.     , 'allalalalala'
  104.     , '3/18/2011'
  105.     , 2
  106. );
  107. INSERT INTO Posts
  108. (
  109.     postedBy
  110.     , [subject]
  111.     , body
  112.     , datePosted
  113.     , parentPost
  114. )
  115. VALUES
  116. (
  117.     'Self Destruct'
  118.     , 'Bad Post'
  119.     , 'That was a bad post.'
  120.     , '3/18/2011'
  121.     , 3
  122. );
  123. -- Let's retrieve all parent posts and the number of comments that post has.
  124. -- Note: This query contains a subquery.
  125. SELECT p.[subject] AS 'Subject', p.[body] AS 'Body', p.postedBy AS 'Posted By' , p.datePosted AS 'Posted On', (SELECT COUNT(T.postID) FROM Posts AS T WHERE T.parentPost = p.postID) AS 'Comments'
  126. FROM Posts AS P
  127. WHERE P.parentPost IS NULL;
  128. -- Let's retrieve all posts that have comments.
  129. SELECT p.[subject] AS 'Subject', p.[body] AS 'Body', p.postedBy AS 'Posted By' , p.datePosted AS 'Posted By', C.[subject] AS 'Comment'
  130. FROM Posts AS P
  131. INNER JOIN Posts AS C
  132. ON P.postID = C.parentPost;

No comments:

Post a Comment

I Break Code Where code gets done.
ASP.NET | HTML | SQL Server | VB.NET | Request A Topic |