擴(kuò)展GROUP BY概述
在本章的開頭已經(jīng)簡(jiǎn)單描述了擴(kuò)展GROUP BY的應(yīng)用場(chǎng)景,Oracle擴(kuò)展GROUP BY允許使用SQL語(yǔ)句對(duì)數(shù)據(jù)匯總結(jié)果進(jìn)行多維展現(xiàn),從而生成復(fù)雜的報(bào)表,為決策者提供有效的數(shù)據(jù)支持。最重要的是,Oracle9i之后,擴(kuò)展GROUP BY的功能已經(jīng)趨于完善,能夠滿足大部分多維數(shù)據(jù)分析統(tǒng)計(jì)的工作。
主要表現(xiàn)在:
本章的表DEPT和EMP都來(lái)自于SCOTT用戶下,雖然表比較簡(jiǎn)單,但是足以能說明擴(kuò)展GROUP BY的功能。下面開始進(jìn)入主要內(nèi)容,探討強(qiáng)大的擴(kuò)展GROUP BY功能。
ROLLUP
假設(shè)有這樣的需求:
--需求1實(shí)現(xiàn)
SELECT a.dname,b.job,SUM(b.sal) sum_sal
FROM dept a,emp b
WHERE a.deptno = b.deptno
GROUP BY a.dname,b.job
UNION ALL
--需求2實(shí)現(xiàn)
SELECT a.dname,NULL, SUM(b.sal) sum_sal
FROM dept a,emp b
WHERE a.deptno = b.deptno
GROUP BY a.dname
UNION ALL
--需求3實(shí)現(xiàn)
SELECT NULL,NULL, SUM(b.sal) sum_sal
FROM dept a,emp b
WHERE a.deptno = b.deptno;
DNAME JOB SUM_SAL
---------------------------- ------------------ ----------
SALES MANAGER 2850
SALES CLERK 950
ACCOUNTING MANAGER 2450
ACCOUNTING PRESIDENT 5000
ACCOUNTING CLERK 1300
SALES SALESMAN 5600
RESEARCH MANAGER 2975
RESEARCH ANALYST 3000
RESEARCH CLERK 800
ACCOUNTING 8750
RESEARCH 6775
SALES 9400
24925
已選擇13行。
----------------------------------------------------------
Plan hash value: 3113041979
------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 17 | 423 | 13 (70)| 00:00:01 |
| 1 | UNION-ALL | | | | | |
| 2 | HASH GROUP BY | | 12 | 336 | 5 (20)| 00:00:01 |
| 3 | NESTED LOOPS | | 12 | 336 | 4 (0)| 00:00:01 |
| 4 | TABLE ACCESS FULL | EMP | 12 | 180 | 3 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 13 | 1 (0)| 00:00:01 |
|* 6 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| 00:00:01 |
| 7 | HASH GROUP BY | | 4 | 80 | 5 (20)| 00:00:01 |
| 8 | NESTED LOOPS | | 12 | 240 | 4 (0)| 00:00:01 |
| 9 | TABLE ACCESS FULL | EMP | 12 | 84 | 3 (0)| 00:00:01 |
| 10 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 13 | 1 (0)| 00:00:01 |
|* 11 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| 00:00:01 |
| 12 | SORT AGGREGATE | | 1 | 7 | | |
|* 13 | TABLE ACCESS FULL | EMP | 12 | 84 | 3 (0)| 00:00:01 |
------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
6 - access("A"."DEPTNO"="B"."DEPTNO")
11 - access("A"."DEPTNO"="B"."DEPTNO")
13 - filter("B"."DEPTNO" IS NOT NULL)
從執(zhí)行計(jì)劃可以看出,為了實(shí)現(xiàn)這樣的需求,需要多次訪問EMP、DEPT表以及DEPT表的索引,這里的測(cè)試數(shù)據(jù)很少而且表結(jié)構(gòu)簡(jiǎn)單,實(shí)際應(yīng)用中表結(jié)構(gòu)可能很復(fù)雜,經(jīng)常是多表關(guān)聯(lián),數(shù)據(jù)量可能達(dá)到百萬(wàn)級(jí),千萬(wàn)級(jí),甚至上億,那么使用UNION ALL,明顯性能低下,如果使用WITH子句將常規(guī)分組的結(jié)果固定下來(lái),然后在此基礎(chǔ)上再聚合,效率可能比單純UNION ALL好,但是還是不夠完美,現(xiàn)在的需求是對(duì)2列進(jìn)行多維分析,如果是很多列呢?必然增加語(yǔ)句復(fù)雜度,類似地,CUBE,GROUPING SETS用UNION ALL改寫也有此類問題,何況擴(kuò)展GROUP BY還提供了很多復(fù)雜功能,用UNION ALL改寫就更加麻煩了。
試想對(duì)于上面的需求,如果Oracle能提供一個(gè)這樣的分組功能就好了:
2.2 ROLLUP分組
SELECT … GROUP BY ROLLUP(grouping_column_reference_list)
SELECT a.dname,b.job, SUM(b.sal) sum_sal
FROM dept a,emp b
WHERE a.deptno = b.deptno
GROUP BY ROLLUP(a.dname,b.job);
DNAME JOB SUM_SAL
---------------------------- ------------------ ----------
SALES CLERK 950
SALES MANAGER 2850
SALES SALESMAN 5600
SALES 9400
RESEARCH CLERK 800
RESEARCH ANALYST 3000
RESEARCH MANAGER 2975
RESEARCH 6775
ACCOUNTING CLERK 1300
ACCOUNTING MANAGER 2450
ACCOUNTING PRESIDENT 5000
ACCOUNTING 8750
24925
已選擇13行。
----------------------------------------------------------
Plan hash value: 503922295
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 12 | 336 | 5 (20)| 00:00:01 |
| 1 | SORT GROUP BY ROLLUP | | 12 | 336 | 5 (20)| 00:00:01 |
| 2 | NESTED LOOPS | | 12 | 336 | 4 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL | EMP | 12 | 180 | 3 (0)| 00:00:01 |
| 4 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 13 | 1 (0)| 00:00:01 |
|* 5 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("A"."DEPTNO"="B"."DEPTNO")
ROLLUP分組具有方向性,從上面的結(jié)果看出,ROLLUP(a.dname,b.job)分組的過程是這樣的:
分組級(jí)別 | 描述 |
a,b,c | 標(biāo)準(zhǔn)分組 |
a,b | 對(duì)于每個(gè)a,b列值,計(jì)算橫跨c列的小計(jì) |
a | 對(duì)于每個(gè)a列值,計(jì)算橫跨b,c列的小計(jì) |
合計(jì)匯總 | 合計(jì) |
另外提一下,其實(shí)ROLLUP操作,如果使用HINT: expand_gset_to_union,那么則優(yōu)化器會(huì)將ROLLUP轉(zhuǎn)為對(duì)應(yīng)的UNION ALL操作,其它的GROUPING SETS、CUBE也可以,有興趣的可以試一下。
ROLLUP語(yǔ)法簡(jiǎn)單,而且具體處理過程也很簡(jiǎn)單,除了第1個(gè)是標(biāo)準(zhǔn)分組,然后就是列從右到左遞減的分組,最后合計(jì)。下面實(shí)現(xiàn)需求:
SELECT to_char(b.hiredate,yyyy) hire_year,a.dname,b.job, SUM(sal) sum_sal
FROM dept a,emp b
WHERE a.deptno = b.deptno
GROUP BY ROLLUP(to_char(b.hiredate,yyyy),a.dname,b.job);
HIRE_YEA DNAME JOB SUM_SAL
-------- ---------------------------- ------------------ ----------
1980 RESEARCH CLERK 800
1980 RESEARCH 800
1980 800
1981 SALES CLERK 950
1981 SALES MANAGER 2850
1981 SALES SALESMAN 5600
1981 SALES 9400
1981 RESEARCH ANALYST 3000
1981 RESEARCH MANAGER 2975
1981 RESEARCH 5975
1981 ACCOUNTING MANAGER 2450
1981 ACCOUNTING PRESIDENT 5000
1981 ACCOUNTING 7450
1981 22825
1982 ACCOUNTING CLERK 1300
1982 ACCOUNTING 1300
1982 1300
24925
相關(guān)行 | 描述 |
第7行 | 對(duì)第4、5、6行的所有JOB進(jìn)行小計(jì) |
第14行 | 對(duì)第7行、第13行的所有DNAME、JOB進(jìn)行小計(jì),當(dāng)然也相當(dāng)于對(duì)4-6、8-12行所有DNAME、JOB的小計(jì)。 |
最后一行 | 對(duì)所有入職日期(精確到年)、DNAME、JOB進(jìn)行合計(jì) |
因?yàn)镽OLLUP分組過程具有方向性,所以通過改變ROLLUP中列的順序就可以達(dá)到改變報(bào)表結(jié)果和含義的目的。比如將前面的ROLLUP(dname,job)改為ROLLUP(job,dname)則含義就發(fā)生了變化,現(xiàn)在需要查詢的就是標(biāo)準(zhǔn)分組、計(jì)算每個(gè)job的所有部門的小計(jì)、最后合計(jì),這里就兩個(gè)列,也就是小計(jì)的含義發(fā)生了變化,請(qǐng)看:
SELECT b.job,a.dname, SUM(b.sal) sum_sal
FROM dept a,emp b
WHERE a.deptno = b.deptno
GROUP BY ROLLUP(b.job,a.dname);
JOB DNAME SUM_SAL
------------------ ---------------------------- ----------
CLERK SALES 950
CLERK RESEARCH 800
CLERK ACCOUNTING 1300
CLERK 3050
ANALYST RESEARCH 3000
ANALYST 3000
MANAGER SALES 2850
MANAGER RESEARCH 2975
MANAGER ACCOUNTING 2450
MANAGER 8275
SALESMAN SALES 5600
SALESMAN 5600
PRESIDENT ACCOUNTING 5000
PRESIDENT 5000
24925
已選擇15行。
SELECT to_char(b.hiredate,yyyy) hire_year,a.dname,b.job, SUM(sal) sum_sal
FROM dept a,emp b
WHERE a.deptno = b.deptno
GROUP BY to_char(b.hiredate,yyyy),a.dname,ROLLUP(b.job);
文章版權(quán)歸作者所有,未經(jīng)允許請(qǐng)勿轉(zhuǎn)載,若此文章存在違規(guī)行為,您可以聯(lián)系管理員刪除。
轉(zhuǎn)載請(qǐng)注明本文地址:http://systransis.cn/yun/129114.html
摘要:當(dāng)向傳入一列時(shí),會(huì)得到一個(gè)總計(jì)行。結(jié)果當(dāng)向傳遞兩列時(shí),將會(huì)按照這兩列進(jìn)行分組,同時(shí)按照第一列的分組結(jié)果返回小計(jì)行。結(jié)果可以看出來(lái)結(jié)果是按照工廠和部門分別分組匯總的。選擇的就表示兩列都不為空。 Group By Group By 誰(shuí)不會(huì)啊?這不是最簡(jiǎn)單的嗎?越是簡(jiǎn)單的東西,我們?cè)綍?huì)忽略掉他,因?yàn)槲覀儾辉敢庠偃ド钊肓私馑? 小時(shí) SQL 極速入門(一)1 小時(shí) SQL 極速入門(二)1 ...
閱讀 1356·2023-01-11 13:20
閱讀 1707·2023-01-11 13:20
閱讀 1215·2023-01-11 13:20
閱讀 1906·2023-01-11 13:20
閱讀 4165·2023-01-11 13:20
閱讀 2757·2023-01-11 13:20
閱讀 1402·2023-01-11 13:20
閱讀 3671·2023-01-11 13:20