Tabibitosan Method: Example 1
contd...
"Tabibitosan" is one of math problem.The key of the technique is
to map rows belonging to the same pattern to the same number, which can
then be used for grouping or partitioning.
The Tabibitosan technique lets you group consecutive rows easily, using
just one level of analytic functions. And with a bit of imagination, you
can also do some primitive SQL pattern matching.
Example 1:
SQL> create table t1000
2 as
3 select 1 col from dual union all
4 select 2 from dual union all
5 select 3 from dual union all
6 select 4 from dual union all
7 select 7 from dual union all
8 select 8 from dual union all
9 select 9 from dual union all
10 select 11 from dual union all
11 select 12 from dual union all
12 select 21 from dual union all
13 select 22 from dual union all
14 select 25 from dual
15 /
Table created.
SQL> select * from t1000;
COL
----------
1
2
3
4
7
8
9
11
12
21
22
25
12 rows selected.
Expected result:
1-4
7-9
11-12
21-22
25-25
That indicates we have to get the min and maximum of gap free consecutive number groups.
Hence you need to have an identifier which groups the gap free number into same groups.
So, how you will assign a group value to each set of consecutive gap free numbers?
Tricky part here is, we are using gap free sorted sequence column ie row_number() over (order by col)
for column "col".To form a group, we can subtract from col value to row_number() over (order by col).
SQL> select col,
2 row_number() over (order by col) RN
3 from t1000;
COL RN
---------- ----------------------------
1 1
2 2
3 3
4 4
7 5
8 6
9 7
11 8
12 9
21 10
22 11
25 12
12 rows selected.
SQL> select col,
2 col-row_number() over (order by col) grp
3 from t1000;
COL GRP
---------- ----------
1 0
2 0
3 0
4 0
------------
7 2
8 2
9 2
--------------
11 3
12 3
--------------
21 11
22 11
----------------
25 13
12 rows selected.
Now we can group each set of gap free consecutive numbers and take the minimum and maximum of it and
then concatenate the values to get the desired result.
SQL> select min(col)||'-'||max(col) result
2 from
3 (select col,
4 col-row_number() over (order by col) grp
5 from t1000)
6 group by grp;
RESULT
---------------------------------------------------------
1-4
7-9
11-12
21-22
25-25
Example 2:
SQL> create table t1001 as
2 ( select to_date('01-01-2014','dd-mm-yyyy') dt1 from dual
3 union all
4 select to_date('02-01-2014','dd-mm-yyyy') dt1 from dual
5 union all
6 select to_date('03-01-2014','dd-mm-yyyy') dt1 from dual
7 union all
8 select to_date('07-01-2014','dd-mm-yyyy') dt1 from dual
9 union all
10 select to_date('12-01-2014','dd-mm-yyyy') dt1 from dual
11 union all
12 select to_date('13-01-2014','dd-mm-yyyy') dt1 from dual
13 union all
14 select to_date('14-01-2014','dd-mm-yyyy') dt1 from dual
15 union all
16 select to_date('15-01-2014','dd-mm-yyyy') dt1 from dual
17 );
Table created.
SQL>
SQL> select * from t1001;
DT1
---------
01-JAN-14
02-JAN-14
03-JAN-14
07-JAN-14
12-JAN-14
13-JAN-14
14-JAN-14
15-JAN-14
8 rows selected.
Expected output is to display the date range between consecutive dates (without any gaps).
Hence, expected output is something like below
effective from effective to
01-01-2014 03-01-2014
07-01-2014 07-01-2014
12-01-2014 15-01-2014
Trick: How you will get the min and max range between two consecutive date? you need to have an
additional identified that does the grouping for you.
SQL> select
2 dt1,
3 dt1 - row_number() over ( order by dt1) diff
4 from t1001;
DT1 DIFF
--------- ---------
01-JAN-14 31-DEC-13
02-JAN-14 31-DEC-13
03-JAN-14 31-DEC-13
-------------------
07-JAN-14 03-JAN-14
-------------------
12-JAN-14 07-JAN-14
13-JAN-14 07-JAN-14
14-JAN-14 07-JAN-14
15-JAN-14 07-JAN-14
SQL> select
2 dt1,row_number() over ( order by dt1 ) RN,
3 dt1 - row_number() over ( order by dt1 ) DIFF
4 from t1001;
DT1 RN DIFF
-------------------------------------------------- ---------------------------- ----------------------
01-JAN-14 1 31-DEC-13
02-JAN-14 2 31-DEC-13
03-JAN-14 3 31-DEC-13
-------------------------------------------------------------------------------------------------------
07-JAN-14 4 03-JAN-14
-------------------------------------------------------------------------------------------------------
12-JAN-14 5 07-JAN-14
13-JAN-14 6 07-JAN-14
14-JAN-14 7 07-JAN-14
15-JAN-14 8 07-JAN-14
8 rows selected.
Now in the above sql we can form a group to get the minimum and maximum date between consecutive dates
(without gaps).
SQL> select min(a.dt1) as "effective from",
2 max(a.dt1) as "effective to"
3 from
4 (
5 select dt1,
6 dt1 - row_number() over ( order by dt1) diff
7 from t1001
8 ) a
9 group by a.diff;
effective from effective to
-------------------------------------------------- -------------
01-JAN-14 03-JAN-14
07-JAN-14 07-JAN-14
12-JAN-14 15-JAN-14
SQL> create table t1000
2 as
3 select 1 col from dual union all
4 select 2 from dual union all
5 select 3 from dual union all
6 select 4 from dual union all
7 select 7 from dual union all
8 select 8 from dual union all
9 select 9 from dual union all
10 select 11 from dual union all
11 select 12 from dual union all
12 select 21 from dual union all
13 select 22 from dual union all
14 select 25 from dual
15 /
Table created.
SQL> select * from t1000;
COL
----------
1
2
3
4
7
8
9
11
12
21
22
25
12 rows selected.
Expected result:
1-4
7-9
11-12
21-22
25-25
That indicates we have to get the min and maximum of gap free consecutive number groups.
Hence you need to have an identifier which groups the gap free number into same groups.
So, how you will assign a group value to each set of consecutive gap free numbers?
Tricky part here is, we are using gap free sorted sequence column ie row_number() over (order by col)
for column "col".To form a group, we can subtract from col value to row_number() over (order by col).
SQL> select col,
2 row_number() over (order by col) RN
3 from t1000;
COL RN
---------- ----------------------------
1 1
2 2
3 3
4 4
7 5
8 6
9 7
11 8
12 9
21 10
22 11
25 12
12 rows selected.
SQL> select col,
2 col-row_number() over (order by col) grp
3 from t1000;
COL GRP
---------- ----------
1 0
2 0
3 0
4 0
------------
7 2
8 2
9 2
--------------
11 3
12 3
--------------
21 11
22 11
----------------
25 13
12 rows selected.
Now we can group each set of gap free consecutive numbers and take the minimum and maximum of it and
then concatenate the values to get the desired result.
SQL> select min(col)||'-'||max(col) result
2 from
3 (select col,
4 col-row_number() over (order by col) grp
5 from t1000)
6 group by grp;
RESULT
---------------------------------------------------------
1-4
7-9
11-12
21-22
25-25
Example 2:
SQL> create table t1001 as
2 ( select to_date('01-01-2014','dd-mm-yyyy') dt1 from dual
3 union all
4 select to_date('02-01-2014','dd-mm-yyyy') dt1 from dual
5 union all
6 select to_date('03-01-2014','dd-mm-yyyy') dt1 from dual
7 union all
8 select to_date('07-01-2014','dd-mm-yyyy') dt1 from dual
9 union all
10 select to_date('12-01-2014','dd-mm-yyyy') dt1 from dual
11 union all
12 select to_date('13-01-2014','dd-mm-yyyy') dt1 from dual
13 union all
14 select to_date('14-01-2014','dd-mm-yyyy') dt1 from dual
15 union all
16 select to_date('15-01-2014','dd-mm-yyyy') dt1 from dual
17 );
Table created.
SQL>
SQL> select * from t1001;
DT1
---------
01-JAN-14
02-JAN-14
03-JAN-14
07-JAN-14
12-JAN-14
13-JAN-14
14-JAN-14
15-JAN-14
8 rows selected.
Expected output is to display the date range between consecutive dates (without any gaps).
Hence, expected output is something like below
effective from effective to
01-01-2014 03-01-2014
07-01-2014 07-01-2014
12-01-2014 15-01-2014
Trick: How you will get the min and max range between two consecutive date? you need to have an
additional identified that does the grouping for you.
SQL> select
2 dt1,
3 dt1 - row_number() over ( order by dt1) diff
4 from t1001;
DT1 DIFF
--------- ---------
01-JAN-14 31-DEC-13
02-JAN-14 31-DEC-13
03-JAN-14 31-DEC-13
-------------------
07-JAN-14 03-JAN-14
-------------------
12-JAN-14 07-JAN-14
13-JAN-14 07-JAN-14
14-JAN-14 07-JAN-14
15-JAN-14 07-JAN-14
SQL> select
2 dt1,row_number() over ( order by dt1 ) RN,
3 dt1 - row_number() over ( order by dt1 ) DIFF
4 from t1001;
DT1 RN DIFF
-------------------------------------------------- ---------------------------- ----------------------
01-JAN-14 1 31-DEC-13
02-JAN-14 2 31-DEC-13
03-JAN-14 3 31-DEC-13
-------------------------------------------------------------------------------------------------------
07-JAN-14 4 03-JAN-14
-------------------------------------------------------------------------------------------------------
12-JAN-14 5 07-JAN-14
13-JAN-14 6 07-JAN-14
14-JAN-14 7 07-JAN-14
15-JAN-14 8 07-JAN-14
8 rows selected.
Now in the above sql we can form a group to get the minimum and maximum date between consecutive dates
(without gaps).
SQL> select min(a.dt1) as "effective from",
2 max(a.dt1) as "effective to"
3 from
4 (
5 select dt1,
6 dt1 - row_number() over ( order by dt1) diff
7 from t1001
8 ) a
9 group by a.diff;
effective from effective to
-------------------------------------------------- -------------
01-JAN-14 03-JAN-14
07-JAN-14 07-JAN-14
12-JAN-14 15-JAN-14
contd...
No comments:
Post a Comment