求两个时间点之间的时间序列问题
create table t(start1 date,end1 date) tablespace users;
insert into t values(sysdate,sysdate+5);
commit;
现在需要的结果是;
trunc(sysdate)
trunc(sysdate+1 )
trunc(sysdate+2 )
trunc(sysdate+3 )
trunc(sysdate+4 )
trunc(sysdate+5 )
可能大多数人跟我一样,通过connect by去解决这个问题。看下面的测试
select start1 + level - 1 from t connect by level <= end1 - start1 + 1; set linesize 120 alter session set statistics_level=all; select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST')); SQL> select * from t; START1 END1 ------------------- ------------------- 2010-07-23 14:17:46 2010-07-28 14:17:46 Elapsed: 00:00:00.00 SQL> select start1 + level - 1 from t connect by level <= end1 - start1 + 1; START1+LEVEL-1 ------------------- 2010-07-23 14:17:46 2010-07-24 14:17:46 2010-07-25 14:17:46 2010-07-26 14:17:46 2010-07-27 14:17:46 2010-07-28 14:17:46 6 rows selected. Elapsed: 00:00:00.01 SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST')); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------ SQL_ID 42yx56f49f77c, child number 1 ------------------------------------- select start1 + level - :"SYS_B_0" from t connect by level <= end1 - start1 + :"SYS_B_1" Plan hash value: 1071874124 ----------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | ----------------------------------------------------------------------------------------------- | 1 | CONNECT BY WITHOUT FILTERING| | 1 | | 6 |00:00:00.01 | 7 | PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------ | 2 | TABLE ACCESS FULL | T | 1 | 1 | 1 |00:00:00.01 | 7 | ----------------------------------------------------------------------------------------------- 14 rows selected.
执行计划不错,但如果是多条数据呢
SQL> insert into t values(sysdate+10,sysdate+12); 1 row created. Elapsed: 00:00:00.01 SQL> commit; Commit complete. Elapsed: 00:00:00.01 SQL> select start1 + level - 1 from t connect by level <= end1 - start1 + 1; START1+LEVEL-1 ------------------- 2010-08-02 16:02:30 2010-08-03 16:02:30 2010-08-04 16:02:30 2010-07-26 14:17:46 2010-07-27 14:17:46 2010-07-28 14:17:46 2010-07-25 14:17:46 2010-07-26 14:17:46 2010-07-27 14:17:46 2010-07-28 14:17:46 2010-07-24 14:17:46 START1+LEVEL-1 ------------------- 2010-08-04 16:02:30 2010-07-26 14:17:46 2010-07-27 14:17:46 2010-07-28 14:17:46 2010-07-25 14:17:46 2010-07-26 14:17:46 2010-07-27 14:17:46 2010-07-28 14:17:46 2010-07-23 14:17:46 2010-08-03 16:02:30 2010-08-04 16:02:30 START1+LEVEL-1 ------------------- 2010-07-26 14:17:46 2010-07-27 14:17:46 2010-07-28 14:17:46 2010-07-25 14:17:46 2010-07-26 14:17:46 2010-07-27 14:17:46 2010-07-28 14:17:46 2010-07-24 14:17:46 2010-08-04 16:02:30 2010-07-26 14:17:46 2010-07-27 14:17:46 START1+LEVEL-1 ------------------- 2010-07-28 14:17:46 2010-07-25 14:17:46 2010-07-26 14:17:46 2010-07-27 14:17:46 2010-07-28 14:17:46 38 rows selected. SQL> SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST')); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------ SQL_ID 42yx56f49f77c, child number 1 ------------------------------------- select start1 + level - :"SYS_B_0" from t connect by level <= end1 - start1 + :"SYS_B_1" Plan hash value: 1071874124 ----------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | ----------------------------------------------------------------------------------------------- | 1 | CONNECT BY WITHOUT FILTERING| | 1 | | 38 |00:00:00.01 | 7 | PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------ | 2 | TABLE ACCESS FULL | T | 1 | 1 | 2 |00:00:00.01 | 7 | -----------------------------------------------------------------------------------------------
Oracle做了一个递归查询,递归查询的内部路径见下面的结果
select sys_connect_by_path(start1 + level - 1,'/') from t connect by level <= end1 - start1 + 1; SYS_CONNECT_BY_PATH(START1+LEV -------------------------------------------------------------------------------- /2010-08-02 16:02:30 /2010-08-02 16:02:30/2010-08-03 16:02:30 /2010-08-02 16:02:30/2010-08-03 16:02:30/2010-08-04 16:02:30 /2010-08-02 16:02:30/2010-08-03 16:02:30/2010-08-04 16:02:30/2010-07-26 14:17:46 /2010-08-02 16:02:30/2010-08-03 16:02:30/2010-08-04 16:02:30/2010-07-26 14:17:46 /2010-07-27 14:17:46 /2010-08-02 16:02:30/2010-08-03 16:02:30/2010-08-04 16:02:30/2010-07-26 14:17:46 /2010-07-27 14:17:46/2010-07-28 14:17:46 /2010-08-02 16:02:30/2010-08-03 16:02:30/2010-07-25 14:17:46 /2010-08-02 16:02:30/2010-08-03 16:02:30/2010-07-25 14:17:46/2010-07-26 14:17:46 /2010-08-02 16:02:30/2010-08-03 16:02:30/2010-07-25 14:17:46/2010-07-26 14:17:46 /2010-07-27 14:17:46 /2010-08-02 16:02:30/2010-08-03 16:02:30/2010-07-25 14:17:46/2010-07-26 14:17:46 /2010-07-27 14:17:46/2010-07-28 14:17:46 /2010-08-02 16:02:30/2010-07-24 14:17:46 /2010-08-02 16:02:30/2010-07-24 14:17:46/2010-08-04 16:02:30 /2010-08-02 16:02:30/2010-07-24 14:17:46/2010-08-04 16:02:30/2010-07-26 14:17:46 /2010-08-02 16:02:30/2010-07-24 14:17:46/2010-08-04 16:02:30/2010-07-26 14:17:46 /2010-07-27 14:17:46 /2010-08-02 16:02:30/2010-07-24 14:17:46/2010-08-04 16:02:30/2010-07-26 14:17:46 /2010-07-27 14:17:46/2010-07-28 14:17:46 /2010-08-02 16:02:30/2010-07-24 14:17:46/2010-07-25 14:17:46 /2010-08-02 16:02:30/2010-07-24 14:17:46/2010-07-25 14:17:46/2010-07-26 14:17:46 /2010-08-02 16:02:30/2010-07-24 14:17:46/2010-07-25 14:17:46/2010-07-26 14:17:46 /2010-07-27 14:17:46 /2010-08-02 16:02:30/2010-07-24 14:17:46/2010-07-25 14:17:46/2010-07-26 14:17:46 /2010-07-27 14:17:46/2010-07-28 14:17:46 /2010-07-23 14:17:46 /2010-07-23 14:17:46/2010-08-03 16:02:30 /2010-07-23 14:17:46/2010-08-03 16:02:30/2010-08-04 16:02:30 /2010-07-23 14:17:46/2010-08-03 16:02:30/2010-08-04 16:02:30/2010-07-26 14:17:46 /2010-07-23 14:17:46/2010-08-03 16:02:30/2010-08-04 16:02:30/2010-07-26 14:17:46 /2010-07-27 14:17:46 /2010-07-23 14:17:46/2010-08-03 16:02:30/2010-08-04 16:02:30/2010-07-26 14:17:46 /2010-07-27 14:17:46/2010-07-28 14:17:46 /2010-07-23 14:17:46/2010-08-03 16:02:30/2010-07-25 14:17:46 /2010-07-23 14:17:46/2010-08-03 16:02:30/2010-07-25 14:17:46/2010-07-26 14:17:46 /2010-07-23 14:17:46/2010-08-03 16:02:30/2010-07-25 14:17:46/2010-07-26 14:17:46 /2010-07-27 14:17:46 /2010-07-23 14:17:46/2010-08-03 16:02:30/2010-07-25 14:17:46/2010-07-26 14:17:46 /2010-07-27 14:17:46/2010-07-28 14:17:46 /2010-07-23 14:17:46/2010-07-24 14:17:46 /2010-07-23 14:17:46/2010-07-24 14:17:46/2010-08-04 16:02:30 /2010-07-23 14:17:46/2010-07-24 14:17:46/2010-08-04 16:02:30/2010-07-26 14:17:46 /2010-07-23 14:17:46/2010-07-24 14:17:46/2010-08-04 16:02:30/2010-07-26 14:17:46 /2010-07-27 14:17:46 /2010-07-23 14:17:46/2010-07-24 14:17:46/2010-08-04 16:02:30/2010-07-26 14:17:46 /2010-07-27 14:17:46/2010-07-28 14:17:46 /2010-07-23 14:17:46/2010-07-24 14:17:46/2010-07-25 14:17:46 /2010-07-23 14:17:46/2010-07-24 14:17:46/2010-07-25 14:17:46/2010-07-26 14:17:46 /2010-07-23 14:17:46/2010-07-24 14:17:46/2010-07-25 14:17:46/2010-07-26 14:17:46 /2010-07-27 14:17:46 /2010-07-23 14:17:46/2010-07-24 14:17:46/2010-07-25 14:17:46/2010-07-26 14:17:46 /2010-07-27 14:17:46/2010-07-28 14:17:46 38 rows selected.
Oracle根据level的层数在做递归查询,那么这样的话随着数据量的增加,比如上百万条的数据,那么Oracle将会完全耗费所有的CPU做递归,为了我想要的结果,还必须过滤递归后的重复数据,加一个distinct 那么这个sql的执行时间将变得遥遥无期。
最后从http://www.os2ora.com/how-to-return-continuous-dates-between-two-dates/发现一个同样的案例,OS2提供了一个非常不错的方法(关于connect by的性能分析详见http://www.os2ora.com/connect-by-performance-tunning/):
SELECT a1 FROM (
SELECT a+level-1 a1
FROM (
SELECT min(start1) a,max(end1) b
FROM t
)
CONNECT BY LEVEL<=b - a + 1) q,t
WHERE trunc(q.a1) BETWEEN trunc(t.start1) and trunc(t.end1)
/
这条SQL的中心思想是:
1、找出表中最小的start和最大的end;
2、计算start和end之间的连续时间
3、跟t表join,找出符合t表时间段的时间列表;
SQL> SELECT a1 FROM (
2 SELECT a+level-1 a1
3 FROM (
4 SELECT min(start1) a,max(end1) b
5 FROM t
6 )
7 CONNECT BY LEVEL<=b - a + 1) q,t
8 WHERE trunc(q.a1) BETWEEN trunc(t.start1) and trunc(t.end1)
9 /
A1
-------------------
2010-07-23 14:17:46
2010-07-24 14:17:46
2010-07-25 14:17:46
2010-07-26 14:17:46
2010-07-27 14:17:46
2010-07-28 14:17:46
2010-08-02 14:17:46
2010-08-03 14:17:46
2010-08-04 14:17:46
9 rows selected.
SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
SQL_ID 18v3ubnmwhq7n, child number 0
-------------------------------------
SELECT a1 FROM ( SELECT a+level-:"SYS_B_0" a1 FROM ( SELECT min(start1)
a,max(end1) b FROM t ) CONNECT BY LEVEL<=b - a + :"SYS_B_1") q,t WHERE
trunc(q.a1) BETWEEN trunc(t.start1) and trunc(t.end1)
Plan hash value: 1814807998
-------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------------------
| 1 | NESTED LOOPS | | 1 | 1 | 9 |00:00:00.01 | 99 |
| 2 | VIEW | | 1 | 1 | 13 |00:00:00.01 | 7 |
| 3 | CONNECT BY WITHOUT FILTERING| | 1 | | 13 |00:00:00.01 | 7 |
| 4 | VIEW | | 1 | 1 | 1 |00:00:00.01 | 7 |
| 5 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 7 |
| 6 | TABLE ACCESS FULL | T | 1 | 1 | 2 |00:00:00.01 | 7 |
|* 7 | TABLE ACCESS FULL | T | 13 | 1 | 9 |00:00:00.01 | 92 |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
7 - filter((TRUNC("Q"."A1")>=TRUNC(INTERNAL_FUNCTION("T"."START1")) AND
TRUNC("Q"."A1")<=TRUNC(INTERNAL_FUNCTION("T"."END1"))))
这个plan虽然看起来没有第一个好看,但随着数据量的增加,关键部分connect的计算将变得非常简单。影响这条SQL的唯一路径是最后和T表的NESTED LOOPS ,转换成HASH JOIN也不具备条件。在大批量数据中做 NESTED LOOPS显然不是最优的,还有一种方法就是在start1,end1上创建函数索引,这可能是唯一能降低执行时间的方法:
CREATE INDEX IDX_START1_END1 ON T (TRUNC(START1),TRUNC(END1));
SQL> set autotrace traceonly explain
SQL> SELECT a1 FROM (
2 SELECT a+level-1 a1
3 FROM (
4 SELECT min(start1) a,max(end1) b
5 FROM t
6 )
7 CONNECT BY LEVEL<=b - a + 1) q,t
8 WHERE trunc(q.a1) BETWEEN trunc(t.start1) and trunc(t.end1)
9 /
Elapsed: 00:00:00.00
Execution Plan
----------------------------------------------------------
Plan hash value: 2337532387
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 250 | 5500 | 90 (4)| 00:00:02 |
| 1 | NESTED LOOPS | | 250 | 5500 | 90 (4)| 00:00:02 |
| 2 | VIEW | | 1 | 6 | 86 (4)| 00:00:02 |
|* 3 | CONNECT BY WITHOUT FILTERING| | | | | |
| 4 | VIEW | | 1 | 18 | 86 (4)| 00:00:02 |
| 5 | SORT AGGREGATE | | 1 | 16 | | |
| 6 | TABLE ACCESS FULL | T | 100K| 1562K| 86 (4)| 00:00:02 |
|* 7 | INDEX RANGE SCAN | IDX_START1_END1 | 250 | 4000 | 4 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter(LEVEL<="B"-"A"+1)
7 - access(TRUNC("Q"."A1")<=TRUNC(INTERNAL_FUNCTION("END1")) AND
TRUNC("Q"."A1")>=TRUNC(INTERNAL_FUNCTION("START1")))
filter(TRUNC("Q"."A1")<=TRUNC(INTERNAL_FUNCTION("END1")))
Execution Plan
----------------------------------------------------------
Plan hash value: 1814807998
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 250 | 5500 | 172 (4)| 00:00:03 |
| 1 | NESTED LOOPS | | 250 | 5500 | 172 (4)| 00:00:03 |
| 2 | VIEW | | 1 | 6 | 86 (4)| 00:00:02 |
|* 3 | CONNECT BY WITHOUT FILTERING| | | | | |
| 4 | VIEW | | 1 | 18 | 86 (4)| 00:00:02 |
| 5 | SORT AGGREGATE | | 1 | 16 | | |
| 6 | TABLE ACCESS FULL | T | 100K| 1562K| 86 (4)| 00:00:02 |
|* 7 | TABLE ACCESS FULL | T | 250 | 4000 | 86 (4)| 00:00:02 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter(LEVEL<="B"-"A"+1)
7 - filter(TRUNC("Q"."A1")>=TRUNC(INTERNAL_FUNCTION("T"."START1")) AND
TRUNC("Q"."A1")<=TRUNC(INTERNAL_FUNCTION("T"."END1")))
从执行计划上看,加了加索引明显降低了CPU的使用,降低了执行的时间。