Getting Started with MongoDB: Indexes

jsbintask, published 2019-06-26 16:57 / 2153 views

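Summary: Sort direction does not matter, because MongoDB can traverse an index in either direction. hint can be used to specify which index to use. The index on _id is a unique index and cannot be dropped. With a TTL index, documents are deleted once they expire.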

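An index is like the table of contents of a book. Without one, finding a piece of content means paging through the whole book, which is very inefficient. With one, you can quickly locate the section that holds the content, and lookups become far faster.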

Introduction to Indexes

First, open a command line and run mongo. By default the shell connects to the database named test.

$ mongo
MongoDB shell version: 2.4.9
connecting to: test
> show collections
>

show collections (or show tables) confirms that the database is empty.

Then run the following code in the mongo shell:

> for(var i=0;i<100000;i++) {
... db.users.insert({username:"user"+i})
... }
> show collections
system.indexes
users
>

Listing the collections again shows two new entries, system.indexes and users. The former stores the index definitions; the latter is the newly created collection.
The users collection now holds 100,000 documents.

> db.users.find()
{ "_id" : ObjectId("5694d5da8fad9e319c5b43e4"), "username" : "user0" }
{ "_id" : ObjectId("5694d5da8fad9e319c5b43e5"), "username" : "user1" }
{ "_id" : ObjectId("5694d5da8fad9e319c5b43e6"), "username" : "user2" }
{ "_id" : ObjectId("5694d5da8fad9e319c5b43e7"), "username" : "user3" }
{ "_id" : ObjectId("5694d5da8fad9e319c5b43e8"), "username" : "user4" }
{ "_id" : ObjectId("5694d5da8fad9e319c5b43e9"), "username" : "user5" }

Now look up any one of these documents, for example:

> db.users.find({username: "user1234"})
{ "_id" : ObjectId("5694d5db8fad9e319c5b48b6"), "username" : "user1234" }

The document is found. To see how the query was executed, append the explain() method:

> db.users.find({username: "user1234"}).explain()
{
    "cursor" : "BasicCursor",
    "isMultiKey" : false,
    "n" : 1,
    "nscannedObjects" : 100000,
    "nscanned" : 100000,
    "nscannedObjectsAllPlans" : 100000,
    "nscannedAllPlans" : 100000,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "millis" : 30,
    "indexBounds" : {
        
    },
    "server" : "root:27017"
}

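There are many fields; for now, look only at "nscanned" : 100000 and "millis" : 30.
nscanned is the total number of documents MongoDB scanned to complete the query. Every document in the collection was scanned, and the query took 30 milliseconds in total.
With 10 million documents, scanning the whole collection on every query would take a considerable amount of time.

For this kind of query, an index is a very good solution.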

> db.users.ensureIndex({"username": 1})

The number 1 or -1 gives the sort direction of the index; for a single-key index either will do.
Then look up user1234 again:

> db.users.ensureIndex({"username": 1})
> db.users.find({username: "user1234"}).explain()
{
    "cursor" : "BtreeCursor username_1",
    "isMultiKey" : false,
    "n" : 1,
    "nscannedObjects" : 1,
    "nscanned" : 1,
    "nscannedObjectsAllPlans" : 1,
    "nscannedAllPlans" : 1,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "millis" : 0,
    "indexBounds" : {
        "username" : [
            [
                "user1234",
                "user1234"
            ]
        ]
    },
    "server" : "root:27017"
}

The result is striking: the query completes almost instantly, because with the index MongoDB examined one document instead of 100,000.
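Why the index helps can be seen with a toy model. The sketch below is plain JavaScript for Node.js, not MongoDB code: an array stands in for the collection, and a sorted array searched with binary search stands in for the B-tree. The names collectionScan, indexLookup and usernameIndex are made up for this illustration.

```javascript
// Toy model, not MongoDB internals: an array plays the collection,
// a sorted array of {key, pos} entries plays the index on username.
const users = [];
for (let i = 0; i < 100000; i++) {
  users.push({ username: "user" + i });
}

// Without an index every document has to be examined.
function collectionScan(docs, username) {
  let scanned = 0;
  const matches = [];
  for (const doc of docs) {
    scanned++;
    if (doc.username === username) matches.push(doc);
  }
  return { matches, scanned };
}

// Build the index once: keys in ascending order, each pointing at a document.
const usernameIndex = users
  .map((doc, pos) => ({ key: doc.username, pos }))
  .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));

// With the index, a binary search finds the entry in a handful of steps.
function indexLookup(index, docs, username) {
  let lo = 0;
  let hi = index.length - 1;
  let steps = 0;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    steps++;
    if (index[mid].key === username) {
      return { matches: [docs[index[mid].pos]], steps };
    }
    if (index[mid].key < username) lo = mid + 1;
    else hi = mid - 1;
  }
  return { matches: [], steps };
}

const scanResult = collectionScan(users, "user1234");
const indexResult = indexLookup(usernameIndex, users, "user1234");
console.log(scanResult.scanned); // 100000
console.log(indexResult.steps);  // never more than 17 for 100000 keys
```

The scan count matches the nscanned value of the unindexed query above; the step count only illustrates the order of magnitude and is not what explain() reports.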

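Indexes come at a cost, of course: every index you add makes each write operation (insert, update, delete) take longer, because when data changes MongoDB has to update not only the document but also every index on the collection. For this reason MongoDB limits each collection to at most 64 indexes. As a rule of thumb, a given collection should not have more than two indexes.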

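Tip

If a query is very common, or it has become a performance bottleneck, building an index on the relevant field (such as username) is a very good choice. If the query is only used by administrators, who do not much care how long it takes, that field should not be indexed.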

Compound Indexes

Index values are kept in a defined order, so sorting documents by an indexed key is very fast.

db.users.find().sort({"age": 1, "username": 1})

This sorts by age first and then by username, so an index on username alone is of little help here. To optimize this sort, build an index on both age and username.

db.users.ensureIndex({"age":1, "username": 1})

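This creates a compound index (an index built on more than one field). It is very useful when the query conditions involve several keys.

Once the compound index is built, every index entry contains an age value and a username value and points to the location of the document on disk.
The entries are in strictly ascending order of age; entries with equal age are then in ascending order of username.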
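The ordering rule can be written down as a comparator. This is a minimal sketch in plain JavaScript, assuming a tiny made-up data set; people, compareEntries and ageUsernameEntries are hypothetical names, and the pos field merely stands in for the on-disk location.

```javascript
// Illustration only: index entries for {"age": 1, "username": 1}.
const people = [
  { age: 22, username: "carol" },
  { age: 21, username: "bob" },
  { age: 22, username: "alice" },
  { age: 21, username: "alice" },
];

// Compare by age first; fall back to username only when the ages are equal.
function compareEntries(a, b) {
  if (a.age !== b.age) return a.age - b.age;
  if (a.username < b.username) return -1;
  if (a.username > b.username) return 1;
  return 0;
}

// Each entry carries both keys plus the position of the document it points at.
const ageUsernameEntries = people
  .map((doc, pos) => ({ age: doc.age, username: doc.username, pos }))
  .sort(compareEntries);

const order = ageUsernameEntries.map((e) => e.age + "/" + e.username);
console.log(order.join(" ")); // 21/alice 21/bob 22/alice 22/carol
```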

Query Patterns

Point query

A point query looks for a single value (although many documents may contain that value).

db.users.find({"age": 21}).sort({"username": -1})

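The compound index on age and username is already in place, and both keys were declared ascending (the number 1). Suppose the collection still holds 100,000 documents. The point query {age: 21} probably matches more than one document, since many people are 21. sort({"username": -1}) then asks for those documents in descending username order. The index stores username in ascending order within each age, so to produce descending order MongoDB only has to start at the last index entry for age 21 and walk backward.

Sort direction does not matter here: MongoDB can traverse an index in either direction.

In short, a compound index is very efficient for point queries: MongoDB jumps straight to the age and returns the results without having to sort them.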
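The backward walk can be sketched in a few lines of plain JavaScript. This is an illustration under simplifying assumptions, not how MongoDB is implemented: the entries array is made-up data already in {"age": 1, "username": 1} order, and pointQueryDescending is a hypothetical helper.

```javascript
// Illustration only: entries already in {"age": 1, "username": 1} order.
const entries = [
  { age: 20, username: "zed" },
  { age: 21, username: "alice" },
  { age: 21, username: "bob" },
  { age: 21, username: "carol" },
  { age: 22, username: "alice" },
];

// Point query on age with usernames in descending order.
// Start at the last entry for that age and walk backward; nothing is sorted.
function pointQueryDescending(index, age) {
  // A real B-tree would seek to this position; a linear search keeps the sketch short.
  let i = index.length - 1;
  while (i >= 0 && index[i].age !== age) i--;
  const usernames = [];
  for (; i >= 0 && index[i].age === age; i--) {
    usernames.push(index[i].username);
  }
  return usernames;
}

const descending = pointQueryDescending(entries, 21);
console.log(descending); // [ 'carol', 'bob', 'alice' ]
```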

Multi-value query

db.users.find({"age": {"$gte": 21, "$lte": 30}})

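A multi-value query finds documents that match any of several values. It can be thought of as several point queries.
In the query above we want ages between 21 and 30. MongoDB uses the first key of the index, "age", to find the matches, and the results usually come back in index order.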

db.users.find({"age": {"$gte": 21, "$lte": 30}}).sort({"username": 1})

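This is like the previous query, except that the results must now be sorted.
Without sort, the results come back in index order: age 21 first, then age 22, and so on, and within a single age by username (0-9, A-Z). sort({"username": 1}) asks for all results in ascending username order, so this time MongoDB has to sort them in memory before returning them. This is less efficient than the previous query.

With very few documents, of course, the sort costs little.
If the result set is large, for example over 32 MB, MongoDB refuses to sort that much data.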
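The need for the extra sort shows up even on five made-up entries. In this plain JavaScript sketch (hypothetical names byAgeThenName, asScanned, sortedByName), the range query reads a contiguous slice of the index, but the usernames in that slice are not globally ordered.

```javascript
// Illustration only: entries in {"age": 1, "username": 1} order.
const byAgeThenName = [
  { age: 21, username: "dave" },
  { age: 21, username: "zoe" },
  { age: 25, username: "amy" },
  { age: 30, username: "bob" },
  { age: 31, username: "al" },
];

// The range 21..30 is one contiguous slice of the index...
const inRange = byAgeThenName.filter((e) => e.age >= 21 && e.age <= 30);
const asScanned = inRange.map((e) => e.username);

// ...but usernames are only ordered within a single age, so a separate sort is needed.
const sortedByName = [...asScanned].sort();

console.log(asScanned);    // [ 'dave', 'zoe', 'amy', 'bob' ]
console.log(sortedByName); // [ 'amy', 'bob', 'dave', 'zoe' ]
```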

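There is another solution.

You can build a second index, {"username": 1, "age": 1}. Because username comes first, sorting by username requires no extra work. However, MongoDB now has to scan the whole index to find the entries whose age is in range, so the search itself takes longer.

Which one is more efficient?

And if several indexes exist, how do you choose which one is used?

It depends. With no limit, the second index avoids the sort but has to search the entire collection, which takes far longer than the first approach. When only part of the result is returned (for example with limit(1000)), there is a new winner.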

>db.users.find({"age": {"$gte": 21, "$lte": 30}}).
sort({"username": 1}).
limit(1000).
hint({"age": 1, "username": 1}).
explain()["millis"]

2031ms

>db.users.find({"age": {"$gte": 21, "$lte": 30}}).
sort({"username": 1}).
limit(1000).
hint({"username": 1, "age": 1}).
explain()["millis"]

181ms

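hint specifies which index the query should use.
So the second index has a clear advantage in this case. In typical scenarios we do not fetch all the data, only the most recent part, and then this approach is the more efficient one.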
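A rough picture of why limit changes the outcome: with the index ordered by username, results are produced in the requested order, so the walk can stop as soon as enough matches are found. The sketch below is plain JavaScript over made-up data; firstMatches and byNameThenAge are hypothetical names, and the counts say nothing about real timings.

```javascript
// Illustration only: entries in {"username": 1, "age": 1} order.
const byNameThenAge = [
  { username: "al", age: 31 },
  { username: "amy", age: 25 },
  { username: "bob", age: 30 },
  { username: "dave", age: 21 },
  { username: "zoe", age: 21 },
];

// Walk the index in username order, keep entries whose age is in range,
// and stop as soon as `limit` results have been collected.
function firstMatches(index, minAge, maxAge, limit) {
  const usernames = [];
  let examined = 0;
  for (const entry of index) {
    if (usernames.length === limit) break;
    examined++;
    if (entry.age >= minAge && entry.age <= maxAge) usernames.push(entry.username);
  }
  return { usernames, examined };
}

const limited = firstMatches(byNameThenAge, 21, 30, 2);
console.log(limited.usernames); // [ 'amy', 'bob' ]
console.log(limited.examined);  // 3
```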

Index Types

Single-key index

The most ordinary kind of index, for example:

db.users.ensureIndex({"username": 1})

Unique index

A unique index guarantees that the indexed key has a unique value in every document of the collection.

db.users.ensureIndex({"username": 1}, {"unique": true})

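If you insert two documents with the same username, say both named Zhang San, the second insert fails. The index on _id is itself a unique index, and it cannot be dropped.
Mongoose offers something similar: a schema definition can specify unique: true.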

company: { // company name
    type: String,
    required: true,
    unique: true
}
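The behaviour of a unique index can be imitated with a set of keys that have already been seen. This is a toy model in plain JavaScript, not the MongoDB or Mongoose implementation; the UniqueIndex class and its error message are invented for the illustration.

```javascript
// Toy model of a unique index, not the MongoDB implementation.
class UniqueIndex {
  constructor(field) {
    this.field = field;
    this.keys = new Set();
  }

  // Register the document's key, or throw if another document already uses it.
  insert(doc) {
    const key = doc[this.field];
    if (this.keys.has(key)) {
      throw new Error("duplicate key for " + this.field + ": " + JSON.stringify(key));
    }
    this.keys.add(key);
  }
}

const uniqueUsernames = new UniqueIndex("username");
uniqueUsernames.insert({ username: "Zhang San" });

let secondInsertFailed = false;
try {
  uniqueUsernames.insert({ username: "Zhang San" });
} catch (err) {
  secondInsertFailed = true;
}
console.log(secondInsertFailed); // true
```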

Multikey index

If the value of an indexed key is an array in some document, the index is marked as a multikey index.
For example, suppose the members collection holds these three documents:

> db.members.find()
{ "_id" : ObjectId("1"), "tags" : [  "ame",  "fear",  "big" ] }
{ "_id" : ObjectId("2"), "tags" : [  "ame",  "fear",  "big",  "chi" ] }
{ "_id" : ObjectId("3"), "tags" : [  "ame",  "jr",  "big",  "chi" ] }

When I search for tags = "jr" without an index, MongoDB examines every document, so nscanned = 3, and one document is returned, so n = 1.

>db.members.find({tags: "jr"}).explain()
{
    "cursor" : "BasicCursor",
    "isMultiKey" : false,
    "n" : 1,
    "nscanned" : 3,
}

Then create the index:

> db.members.ensureIndex({tags:1})

Searching for tags = "jr" again now gives nscanned = 1, and isMultiKey has changed from false to true. This shows that MongoDB created index entries for multiple keys, that is, one for every element of the array.

> db.members.find({tags: "jr"}).explain()
{
    "cursor" : "BtreeCursor tags_1",
    "isMultiKey" : true,
    "n" : 1,
    "nscannedObjects" : 1,
    "nscanned" : 1,
}
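One way to picture a multikey index is as a map from each array element to the documents that contain it. The plain JavaScript sketch below uses the three documents from above with simplified _id values; memberDocs and tagIndex are hypothetical names, and a Map is only a stand-in for the real B-tree.

```javascript
// Toy model of a multikey index on tags, not the MongoDB implementation.
const memberDocs = [
  { _id: 1, tags: ["ame", "fear", "big"] },
  { _id: 2, tags: ["ame", "fear", "big", "chi"] },
  { _id: 3, tags: ["ame", "jr", "big", "chi"] },
];

// One index entry per array element: tag -> _ids of the documents containing it.
const tagIndex = new Map();
for (const doc of memberDocs) {
  for (const tag of doc.tags) {
    if (!tagIndex.has(tag)) tagIndex.set(tag, []);
    tagIndex.get(tag).push(doc._id);
  }
}

console.log(tagIndex.get("jr"));  // [ 3 ]
console.log(tagIndex.get("chi")); // [ 2, 3 ]
console.log(tagIndex.size);       // 5 distinct tags
```

Looking up "jr" touches a single entry, which corresponds to the nscanned value of 1 in the explain output.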

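TTL (expiring) index

A TTL index makes documents expire after a period of time: once a document's indexed date is older than the configured lifetime, the document is deleted. This suits data that becomes invalid after a while, such as user login sessions or stored logs.
Creating one is much like creating a single-key index, with one extra option, expireAfterSeconds, given in seconds.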

db.collectionName.ensureIndex({key: 1}, {expireAfterSeconds: 10})

First create the index; documents will be deleted after 30 seconds.

> db.members.ensureIndex({time:1}, {expireAfterSeconds: 30})

Insert a document:

> db.members.insert({time: new Date()})

Query:

> db.members.find()

{ "_id" : ObjectId("4"), "time" : ISODate("2016-01-16T12:27:20.171Z") }

Query again after 30 seconds and the document is gone.

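The indexed value must be a date (ISODate, for example new Date()); documents whose value is not a date are never deleted automatically.
A TTL index cannot be a compound index.
The deletion time is not exact: the background task runs once every 60 seconds and the deletion itself takes some time, so there is a delay.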
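The expiry rule can be sketched as a single pass over the documents. This is a simplified model in plain JavaScript, assuming one pass at a fixed moment; ttlSweep, its parameters and the sample sessions are hypothetical, and the real background task works on the index rather than on an array.

```javascript
// Toy model of one TTL pass; the function name and signature are hypothetical.
function ttlSweep(docs, field, expireAfterSeconds, now) {
  return docs.filter((doc) => {
    const value = doc[field];
    // Values that are not dates never expire.
    if (!(value instanceof Date)) return true;
    // Keep the document only while its lifetime has not yet elapsed.
    return value.getTime() + expireAfterSeconds * 1000 > now.getTime();
  });
}

const sessions = [
  { _id: 1, time: new Date("2016-01-16T12:27:20Z") }, // 40 s old at `now`
  { _id: 2, time: new Date("2016-01-16T12:27:50Z") }, // 10 s old at `now`
  { _id: 3, time: "2016-01-16T12:27:20Z" },           // a string, not a date
];
const now = new Date("2016-01-16T12:28:00Z");

const survivors = ttlSweep(sessions, "time", 30, now);
console.log(survivors.map((d) => d._id)); // [ 2, 3 ]
```

Document 3 survives even though its timestamp is old, because a string is not a date value.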

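Sparse index

The sparse option creates a sparse index, which only contains entries for documents that have the indexed field. It can be combined with unique: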

>db.users.ensureIndex({"email": 1}, {"unique": true, "sparse": true})

The following is taken from the official documentation.

Sparse Index with Unique Constraint

Consider a collection scores that contains the following documents:

{ "_id" : ObjectId("523b6e32fb408eea0eec2647"), "userid" : "newbie" }
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 }
{ "_id" : ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "nina", "score" : 90 }

You could create an index with a unique constraint and sparse filter on the score field using the following operation:

db.scores.createIndex( { score: 1 } , { sparse: true, unique: true } )

This index would permit the insertion of documents that had unique values for the score field or did not include a score field.

As such, given the existing documents in the scores collection, the index permits the following insert operations:

db.scores.insert( { "userid": "AAAAAAA", "score": 43 } )
db.scores.insert( { "userid": "BBBBBBB", "score": 34 } )
db.scores.insert( { "userid": "CCCCCCC" } )
db.scores.insert( { "userid": "DDDDDDD" } )

However, the index would not permit the addition of the following documents since documents already exists with score value of 82 and 90:

db.scores.insert( { "userid": "AAAAAAA", "score": 82 } )
db.scores.insert( { "userid": "BBBBBBB", "score": 90 } )
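The combined rule (skip documents without the field, reject duplicates among the rest) can be modelled in a few lines. This is a toy sketch in plain JavaScript, not the MongoDB implementation; SparseUniqueIndex and the variable names are invented, while the documents are the ones from the quoted example.

```javascript
// Toy model of a sparse unique index, not the MongoDB implementation.
class SparseUniqueIndex {
  constructor(field) {
    this.field = field;
    this.keys = new Set();
  }

  // Returns true if the insert is allowed, false if it violates uniqueness.
  insert(doc) {
    // Sparse: documents without the field get no index entry at all.
    if (!(this.field in doc)) return true;
    if (this.keys.has(doc[this.field])) return false;
    this.keys.add(doc[this.field]);
    return true;
  }
}

const scoreIndex = new SparseUniqueIndex("score");

// Existing documents from the example above.
const existingScores = [
  { userid: "newbie" },
  { userid: "abby", score: 82 },
  { userid: "nina", score: 90 },
];
existingScores.forEach((doc) => scoreIndex.insert(doc));

const insertResults = [
  scoreIndex.insert({ userid: "AAAAAAA", score: 43 }),
  scoreIndex.insert({ userid: "BBBBBBB", score: 34 }),
  scoreIndex.insert({ userid: "CCCCCCC" }),
  scoreIndex.insert({ userid: "DDDDDDD" }),
  scoreIndex.insert({ userid: "AAAAAAA", score: 82 }),
  scoreIndex.insert({ userid: "BBBBBBB", score: 90 }),
];
console.log(insertResults); // [ true, true, true, true, false, false ]
```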

Index Management

The system.indexes collection holds the details of every index:

db.system.indexes.find()

1. Creating indexes

Mongo shell

ensureIndex()

createIndex()

example

db.users.ensureIndex({"username": 1})

To build an index in the background, so that the database can keep serving reads and writes while the index is being built, pass the background option:

db.test.ensureIndex({"username":1},{"background":true})

Schema

var animalSchema = new Schema({
  name: String,
  type: String,
  tags: { type: [String], index: true } // field level
});

animalSchema.index({ name: 1, type: -1 }); // schema level

For indexes defined in a schema, the Mongoose documentation recommends against building them automatically in production:

When your application starts up, Mongoose automatically calls ensureIndex for each defined index in your schema. Mongoose will call ensureIndex for each index sequentially, and emit an "index" event on the model when all the ensureIndex calls succeeded or when there was an error. While nice for development, it is recommended this behavior be disabled in production since index creation can cause a significant performance impact . Disable the behavior by setting the autoIndex option of your schema to false, or globally on the connection by setting the option config.autoIndex to false.

2. Viewing indexes with getIndexes()

db.collectionName.getIndexes()

db.users.getIndexes()
[
    {
        "v" : 1,
        "key" : {
            "_id" : 1
        },
        "ns" : "test.users",
        "name" : "_id_"
    },
    {
        "v" : 1,
        "key" : {
            "username" : 1
        },
        "ns" : "test.users",
        "name" : "username_1"
    }
]

The v field is used internally only; it identifies the index version.

3. Dropping indexes with dropIndex

> db.users.dropIndex("username_1")
{ "nIndexesWas" : 2, "ok" : 1 }

or

> db.users.dropIndex({"username":1})

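Copyright belongs to the author. Do not reproduce without permission. If this article violates any rules, you can contact the administrator to have it removed.

When reproducing, please cite the address of this article: http://systransis.cn/yun/18804.html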
