點(diǎn)擊上方“IT那活兒”公眾號,關(guān)注后了解更多內(nèi)容,不管IT什么活兒,干就完了?。。?/strong>
本文主要講述實(shí)現(xiàn)對指定的空行,按照前面非空或后面非空數(shù)據(jù)進(jìn)行填充。原來這種實(shí)現(xiàn)數(shù)據(jù)填充的方法,主要是用LAST_VALUE+IGNORE NULLS實(shí)現(xiàn),在11G中LAG分析函數(shù)也支持IGNORE NULLS,但是,在性能上,他們是有區(qū)別的。
本文討論2點(diǎn)內(nèi)容:
dingjun123@ORADB> SELECT * FROM t;
ID VAL CATE
---------- ---------- ----------
1 VAL1 CATE0
2 CATE0
3 CATE0
4 CATE0
5 CATE0
6 VAL6 CATE1
7 CATE1
8 CATE1
9 CATE1
9 rows selected.
在10g中有LAST_VALUE+IGNORE NULLS很好解決,如下:
dingjun123@ORADB> SELECT ID,
2 last_value(val IGNORE NULLS) over(ORDER BY ID) val,
3 cate
4 FROM t;
ID VAL CATE
---------- ---------- ----------
1 VAL1 CATE0
2 VAL1 CATE0
3 VAL1 CATE0
4 VAL1 CATE0
5 VAL1 CATE0
6 VAL6 CATE1
7 VAL6 CATE1
8 VAL6 CATE1
9 VAL6 CATE1
9 rows selected.
上面的SQL含義是ID排序直到當(dāng)前行(默認(rèn)是RANGE窗口),忽略VAL為空的值,因?yàn)槭荓AST_VALUE,所以找最近的不為空的VAL值來填充到當(dāng)前行。在11G中,LAG分析函數(shù)也帶IGNORE NULLS,所以也能實(shí)現(xiàn)上面的功能,因?yàn)長AG是找當(dāng)前行前面1行的值,所以需要加個(gè)NVL,LAST_VALUE不需要,它是直接找到當(dāng)前行,否則有值的可能為空,如下:
dingjun123@ORADB> SELECT ID,
2 nvl(val,lag(val IGNORE NULLS) over(ORDER BY ID)) val,
3 cate
4 FROM t;
ID VAL CATE
---------- ---------- ----------
1 VAL1 CATE0
2 VAL1 CATE0
3 VAL1 CATE0
4 VAL1 CATE0
5 VAL1 CATE0
6 VAL6 CATE1
7 VAL6 CATE1
8 VAL6 CATE1
9 VAL6 CATE1
9 rows selected.
dingjun123@ORADB> select id,val,cate from t;
ID VAL CATE
---------- ---------- ----------
1 CATE0
2 CATE0
3 VAL3 CATE0
4 CATE0
5 CATE0
6 VAL6 CATE1
7 CATE1
8 CATE1
9 CATE1
9 rows selected.
對于ID=1和ID=2的行,因?yàn)榍懊嬲也坏絍AL的值,所以用ID=3的來填充。很顯然,這里需要用到2次LAST_VALUE分析函數(shù),一次是正常用當(dāng)前行前面的VAL來填充,如果填充不了,就用按ID倒敘排列的最近一行來填充。如下:
dingjun123@ORADB> SELECT ID,
2 nvl(last_value(val IGNORE NULLS) over(ORDER BY ID),
3 last_value(val IGNORE NULLS) over(ORDER BY ID DESC)) val,
4 cate
5 FROM t
6 ORDER BY ID;
ID VAL CATE
---------- ---------- ----------
1 VAL3 CATE0
2 VAL3 CATE0
3 VAL3 CATE0
4 VAL3 CATE0
5 VAL3 CATE0
6 VAL6 CATE1
7 VAL6 CATE1
8 VAL6 CATE1
9 VAL6 CATE1
9 rows selected.
黃色區(qū)域的數(shù)據(jù)還是按向上查找的填充方式,紅色部分按照向下查找填充的方式。當(dāng)然,也可以使用LAG或LEAD來實(shí)現(xiàn)。
如下:
dingjun123@ORADB> SELECT ID,
2 nvl(val,nvl(lag(val IGNORE NULLS) over(ORDER BY ID),lag(val IGNORE NULLS) over(ORDER BY ID DESC))) val,
3 cate
4 FROM t
5 ORDER BY ID;
ID VAL CATE
---------- ---------- ----------
1 VAL3 CATE0
2 VAL3 CATE0
3 VAL3 CATE0
4 VAL3 CATE0
5 VAL3 CATE0
6 VAL6 CATE1
7 VAL6 CATE1
8 VAL6 CATE1
9 VAL6 CATE1
9 rows selected.
dingjun123@ORADB> SELECT ID,
2 nvl(val,nvl(lag(val IGNORE NULLS) over(ORDER BY ID),lead(val IGNORE NULLS) over(ORDER BY ID))) val,
3 cate
4 FROM t
5 ORDER BY ID;
ID VAL CATE
---------- ---------- ----------
1 VAL3 CATE0
2 VAL3 CATE0
3 VAL3 CATE0
4 VAL3 CATE0
5 VAL3 CATE0
6 VAL6 CATE1
7 VAL6 CATE1
8 VAL6 CATE1
9 VAL6 CATE1
9 rows selected.
--不加WINDOW窗口,不正確
dingjun123@ORADB> SELECT ID,
2 nvl(last_value(val IGNORE NULLS) over(ORDER BY ID),
3 first_value(val IGNORE NULLS) over(ORDER BY ID)) val,
4 cate
5 FROM t
6 ORDER BY ID;
ID VAL CATE
---------- ---------- ----------
1 CATE0
2 CATE0
3 VAL3 CATE0
4 VAL3 CATE0
5 VAL3 CATE0
6 VAL6 CATE1
7 VAL6 CATE1
8 VAL6 CATE1
9 VAL6 CATE1
9 rows selected.
--加WINDOW窗口的FIRST_VALUE,正確
dingjun123@ORADB> SELECT ID,
2 nvl(last_value(val IGNORE NULLS) over(ORDER BY ID),
3 first_value(val IGNORE NULLS) over(ORDER BY ID ROWS BETWEEN unbounded preceding AND unbounded following)) val,
4 cate
5 FROM t
6 ORDER BY ID;
ID VAL CATE
---------- ---------- ----------
1 VAL3 CATE0
2 VAL3 CATE0
3 VAL3 CATE0
4 VAL3 CATE0
5 VAL3 CATE0
6 VAL6 CATE1
7 VAL6 CATE1
8 VAL6 CATE1
9 VAL6 CATE1
9 rows selected.
dingjun123@ORADB> select id,val,cate from t;
ID VAL CATE
---------- ---------- ----------
1 CATE0
2 CATE0
3 VAL3 CATE0
4 CATE0
5 CATE0
6 CATE1
7 VAL7 CATE1
8 CATE1
9 CATE1
9 rows selected.
dingjun123@ORADB> SELECT ID,
2 nvl(last_value(val IGNORE NULLS) over(PARTITION BY cate ORDER BY ID),
3 last_value(val IGNORE NULLS) over( PARTITION BY cate ORDER BY ID DESC)) val,
4 cate
5 FROM t
6 ORDER BY ID;
ID VAL CATE
---------- ---------- ----------
1 VAL3 CATE0
2 VAL3 CATE0
3 VAL3 CATE0
4 VAL3 CATE0
5 VAL3 CATE0
6 VAL7 CATE1
7 VAL7 CATE1
8 VAL7 CATE1
9 VAL7 CATE1
9 rows selected.
SELECT ID,
nvl(val,nvl(lag(val IGNORE NULLS) over(PARTITION BY cate ORDER BY ID),
lead(val IGNORE NULLS) over(PARTITION BY cate ORDER BY ID))) val,
cate
FROM t
ORDER BY ID;
--結(jié)果一樣,省略
dingjun123@ORADB> SELECT ID,val,
2 nvl(val,last_value(val IGNORE NULLS) over(ORDER BY ID ROWS BETWEEN unbounded preceding AND 2 following)) new_val,
3 cate
4 FROM t;
ID VAL NEW_VAL CATE
---------- ---------- ------------------------------------------- ----------
1 VAL1 VAL1 CATE0
2 VAL1 CATE0
3 VAL1 CATE0
4 VAL6 CATE0
5 VAL6 CATE0
6 VAL6 VAL6 CATE1
7 VAL6 CATE1
8 VAL6 CATE1
9 VAL6 CATE1
9 rows selected.
如果上面的需求使用LAG分析函數(shù)來實(shí)現(xiàn),那就比較復(fù)雜了。
dingjun123@ORADB> DROP TABLE t;
Table dropped.
dingjun123@ORADB> CREATE TABLE t AS SELECT LEVEL ID,decode(MOD(LEVEL,5),1,VAL||LEVEL) val,
2 CATE||(trunc((LEVEL-1)/5)) cate FROM dual CONNECT BY LEVEL<10000;
Table created.
dingjun123@ORADB> select count(*) cnt,count(val) cnt_val from t;
CNT CNT_VAL
---------- ----------
9999 2000
1 row selected.
dingjun123@ORADB> SELECT ID,
2 nvl(val,last_value(val IGNORE NULLS) over(ORDER BY ID)) val,
3 cate
4 FROM t;
9999 rows selected.
Elapsed: 00:00:00.13
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
31 consistent gets
0 physical reads
0 redo size
207607 bytes sent via SQL*Net to client
7741 bytes received via SQL*Net from client
668 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
9999 rows processed
dingjun123@ORADB> SELECT ID,
2 nvl(val,lag(val IGNORE NULLS) over(ORDER BY ID)) val,
3 cate
4 FROM t;
9999 rows selected.
Elapsed: 00:00:22.49
Statistics
--------------------------------------------------------
0 recursive calls
0 db block gets
31 consistent gets
0 physical reads
0 redo size
207607 bytes sent via SQL*Net to client
7741 bytes received via SQL*Net from client
668 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
9999 rows processed
SELECT ID,
nvl(val,lag(val IGNORE NULLS) over(ORDER BY ID)) val,
cate
FROM t
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.02 0 1 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 668 21.98 22.08 0 31 0 9999
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 670 21.98 22.11 0 32 0 9999
文章版權(quán)歸作者所有,未經(jīng)允許請勿轉(zhuǎn)載,若此文章存在違規(guī)行為,您可以聯(lián)系管理員刪除。
轉(zhuǎn)載請注明本文地址:http://systransis.cn/yun/129306.html
摘要:中面向行和面向列的操作基本是平衡的。用層次化索引,將其表示為更高維度的數(shù)據(jù)。使用浮點(diǎn)值表示浮點(diǎn)和非浮點(diǎn)數(shù)組中的缺失數(shù)據(jù)。索引的的格式化輸出形式選取數(shù)據(jù)子集在內(nèi)層中進(jìn)行選取層次化索引在數(shù)據(jù)重塑和基于分組的操作中很重要。 我們在上一篇介紹了 NumPy,本篇介紹 pandas。 pandas入門 Pandas 是基于Numpy構(gòu)建的,讓以NumPy為中心的應(yīng)用變的更加簡單。 pandas...
摘要:年月日,發(fā)布版,對應(yīng)的版本為。相比版本,該版本對系統(tǒng)穩(wěn)定性易用性功能優(yōu)化器統(tǒng)計(jì)信息以及執(zhí)行引擎做了很多改進(jìn)。 2019 年 03 月 26 日,TiDB 發(fā)布 3.0.0 Beta.1 版,對應(yīng)的 TiDB-Ansible 版本為 3.0.0 Beta。相比 3.0.0 Beta 版本,該版本對系統(tǒng)穩(wěn)定性、易用性、功能、優(yōu)化器、統(tǒng)計(jì)信息以及執(zhí)行引擎做了很多改進(jìn)。 TiDB SQL ...
摘要:數(shù)據(jù)規(guī)整化清理轉(zhuǎn)換合并重塑數(shù)據(jù)聚合與分組運(yùn)算數(shù)據(jù)規(guī)整化清理轉(zhuǎn)換合并重塑合并數(shù)據(jù)集可根據(jù)一個(gè)或多個(gè)鍵將不同中的行鏈接起來。函數(shù)根據(jù)樣本分位數(shù)對數(shù)據(jù)進(jìn)行面元?jiǎng)澐?。字典或,給出待分組軸上的值與分組名之間的對應(yīng)關(guān)系。 本篇內(nèi)容為整理《利用Python進(jìn)行數(shù)據(jù)分析》,博主使用代碼為 Python3,部分內(nèi)容和書本有出入。 在前幾篇中我們介紹了 NumPy、pandas、matplotlib 三個(gè)...
閱讀 1356·2023-01-11 13:20
閱讀 1707·2023-01-11 13:20
閱讀 1215·2023-01-11 13:20
閱讀 1906·2023-01-11 13:20
閱讀 4165·2023-01-11 13:20
閱讀 2757·2023-01-11 13:20
閱讀 1402·2023-01-11 13:20
閱讀 3671·2023-01-11 13:20