How does this self inner join query is working?

Issue

Activity table:
+-----------+-----------+------------+--------------+
| player_id | device_id | event_date | games_played |
+-----------+-----------+------------+--------------+
| 1         | 2         | 2016-03-01 | 5            |
| 1         | 2         | 2016-05-02 | 6            |
| 1         | 3         | 2017-06-25 | 1            |
| 3         | 1         | 2016-03-02 | 0            |
| 3         | 4         | 2018-07-03 | 5            |
+-----------+-----------+------------+--------------+
Output: 
+-----------+------------+---------------------+
| player_id | event_date | games_played_so_far |
+-----------+------------+---------------------+
| 1         | 2016-03-01 | 5                   |
| 1         | 2016-05-02 | 11                  |
| 1         | 2017-06-25 | 12                  |
| 3         | 2016-03-02 | 0                   |
| 3         | 2018-07-03 | 5                   |
+-----------+------------+---------------------+

Query:

select a1.player_id, a1.event_date, sum(a2.games_played) as games_played_so_far
from activity as a1
inner join activity as a2
on a1.event_date >= a2.event_date
and a1.player_id = a2.player_id
group by  a1.player_id, a1.event_date

I am not able to understand how this self inner join works. Can anyone explain it a bit. I saw few videos but not helpful so far.

Solution

when you select the data without the group by you can see better what it does-

For every row in a2 it selects all rows from a1 that fit the criteria a1.event_date >= a2.event_date and a1.player_id = a2.player_id and joins them together

The group by only sums all rows in a2 up that have the same player id and a1.eventdate

What you so get is a Cumulative sum.

in MySQL 8 was this made easier with window functions.

select a1.player_id, a1.event_date,a2.player_id, a2.event_date
from activity as a1
inner join activity as a2
on a1.event_date >= a2.event_date
and a1.player_id = a2.player_id
player_id | event_date | player_id | event_date
--------: | :--------- | --------: | :---------
        1 | 2017-06-25 |         1 | 2016-03-01
        1 | 2016-05-02 |         1 | 2016-03-01
        1 | 2016-03-01 |         1 | 2016-03-01
        1 | 2017-06-25 |         1 | 2016-05-02
        1 | 2016-05-02 |         1 | 2016-05-02
        1 | 2017-06-25 |         1 | 2017-06-25
        3 | 2018-07-03 |         3 | 2016-03-02
        3 | 2016-03-02 |         3 | 2016-03-02
        3 | 2018-07-03 |         3 | 2018-07-03
select a1.player_id, a1.event_date, sum(a2.games_played) as games_played_so_far
from activity as a1
inner join activity as a2
on a1.event_date >= a2.event_date
and a1.player_id = a2.player_id
group by  a1.player_id, a1.event_date
ORDER BY a1.player_id, a1.event_date
player_id | event_date | games_played_so_far
--------: | :--------- | ------------------:
        1 | 2016-03-01 |                   5
        1 | 2016-05-02 |                  11
        1 | 2017-06-25 |                  12
        3 | 2016-03-02 |                   0
        3 | 2018-07-03 |                   5

db<>fiddle here

Answered By – nbk

This Answer collected from stackoverflow, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0

Leave a Reply

(*) Required, Your email will not be published