Sunday, 7 September 2014

Tabibitosan Method: Example 1


"Tabibitosan" is one of math problem.The key of the technique is to map rows belonging to the same pattern to the same number, which can then be used for grouping or partitioning.

The Tabibitosan technique lets you group consecutive rows easily, using just one level of analytic functions. And with a bit of imagination, you can also do some primitive SQL pattern matching.


Example 1:

SQL>  create table t1000
  2   as
  3   select 1 col from dual union all
  4   select 2 from dual union all
  5   select 3 from dual union all
  6   select 4 from dual union all
  7   select 7 from dual union all
  8   select 8 from dual union all
  9   select 9 from dual union all
 10   select 11 from dual union all
 11   select 12 from dual union all
 12   select 21 from dual union all
 13   select 22 from dual union all
 14   select 25 from dual
 15  /

Table created.
SQL> select * from t1000;

       COL
----------
         1
         2
         3
         4
         7
         8
         9
        11
        12
        21
        22
        25

12 rows selected.

Expected result:
1-4
7-9
11-12
21-22
25-25

That indicates we have to get the min and maximum of gap free consecutive number groups.
Hence you need to have an identifier which groups the gap free number into same groups.
So, how you will assign a group value to each set of consecutive gap free numbers?

Tricky part here is, we are using gap free sorted sequence column ie row_number() over (order by col)
for column "col".To form a group, we can subtract from col value to row_number() over (order by col).

SQL> select col,
  2         row_number() over (order by col) RN
  3  from t1000;

       COL                     RN
---------- ----------------------------
         1                            1
         2                            2
         3                            3
         4                            4
         7                            5
         8                            6
         9                            7
        11                            8
        12                            9
        21                           10
        22                           11
        25                           12

12 rows selected.

SQL> select col,
  2         col-row_number() over (order by col) grp
  3  from t1000;

       COL        GRP
---------- ----------
         1          0
         2          0
         3          0
         4          0
         ------------
         7          2
         8          2
         9          2
         --------------
        11          3
        12          3
        --------------
        21         11
        22         11
        ----------------
        25         13

12 rows selected.

Now we can group each set of gap free consecutive numbers and take the minimum and maximum of it and
then concatenate the values to get the desired result.

SQL> select min(col)||'-'||max(col) result
  2  from
  3  (select col,
  4         col-row_number() over (order by col) grp
  5  from t1000)
  6  group by grp;

RESULT
---------------------------------------------------------
1-4
7-9
11-12
21-22
25-25

Example 2:

SQL> create table t1001 as
  2  ( select to_date('01-01-2014','dd-mm-yyyy') dt1 from dual
  3    union all
  4    select to_date('02-01-2014','dd-mm-yyyy') dt1 from dual
  5    union all
  6    select to_date('03-01-2014','dd-mm-yyyy') dt1 from dual
  7    union all
  8    select to_date('07-01-2014','dd-mm-yyyy') dt1 from dual
  9    union all
 10    select to_date('12-01-2014','dd-mm-yyyy') dt1 from dual
 11    union all
 12    select to_date('13-01-2014','dd-mm-yyyy') dt1 from dual
 13    union all
 14    select to_date('14-01-2014','dd-mm-yyyy') dt1 from dual
 15    union all
 16    select to_date('15-01-2014','dd-mm-yyyy') dt1 from dual
 17   );

Table created.

SQL> 
SQL> select * from t1001;

DT1
---------
01-JAN-14
02-JAN-14
03-JAN-14
07-JAN-14
12-JAN-14
13-JAN-14
14-JAN-14
15-JAN-14

8 rows selected.

Expected output is to display the date range between consecutive dates (without any gaps).
Hence, expected output is something like below

effective from  effective to
01-01-2014         03-01-2014
07-01-2014         07-01-2014
12-01-2014        15-01-2014

Trick: How you will get the min and max range between two consecutive date? you need to have an
additional identified that does the grouping for you.


SQL> select
  2        dt1,
  3        dt1 - row_number() over ( order by dt1) diff
  4  from t1001;  

DT1       DIFF
--------- ---------
01-JAN-14 31-DEC-13
02-JAN-14 31-DEC-13
03-JAN-14 31-DEC-13
-------------------
07-JAN-14 03-JAN-14
-------------------
12-JAN-14 07-JAN-14
13-JAN-14 07-JAN-14
14-JAN-14 07-JAN-14
15-JAN-14 07-JAN-14

SQL>     select
  2            dt1,row_number() over ( order by dt1 ) RN,
  3            dt1 - row_number() over ( order by dt1 ) DIFF
  4      from t1001; 

DT1                                                                              RN   DIFF
-------------------------------------------------- ---------------------------- ----------------------
01-JAN-14                                                                     1 31-DEC-13
02-JAN-14                                                                     2 31-DEC-13
03-JAN-14                                                                     3 31-DEC-13
-------------------------------------------------------------------------------------------------------
07-JAN-14                                                                     4 03-JAN-14
-------------------------------------------------------------------------------------------------------
12-JAN-14                                                                     5 07-JAN-14
13-JAN-14                                                                     6 07-JAN-14
14-JAN-14                                                                     7 07-JAN-14
15-JAN-14                                                                     8 07-JAN-14

8 rows selected.

Now in the above sql we can form a group to get the minimum and maximum date between consecutive dates
(without gaps).

SQL> select min(a.dt1) as "effective from",
  2      max(a.dt1) as "effective to"
  3  from   
  4   (
  5   select  dt1,
  6     dt1 - row_number() over ( order by dt1) diff
  7   from t1001
  8   ) a
  9  group by a.diff;

effective from                                     effective to
-------------------------------------------------- -------------
01-JAN-14                                          03-JAN-14
07-JAN-14                                          07-JAN-14
12-JAN-14                                          15-JAN-14

contd...

No comments:

Post a Comment