Redis Source Code Analysis: robj (redisObject)

XINDOO的平行宇宙 (XINDOO's Parallel Universe), 2021-01-10 22:52


Field details

Compared with Redis's other data structures, robj is relatively simple: it has only a few fields, and each one has a clear meaning.

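    typedef struct redisObject {
        unsigned type:4;       /* data type: string, list, set, zset, hash, ... */
        unsigned encoding:4;   /* internal encoding used for this type */
        unsigned lru:LRU_BITS; /* LRU time (relative to global lru_clock) or
                                * LFU data (least significant 8 bits frequency
                                * and most significant 16 bits access time).
                                * Redis keeps the LRU or LFU information in
                                * these 24 bits: under LRU, the time of the
                                * last read or write at second resolution;
                                * under LFU, a 16-bit minute-resolution
                                * timestamp plus an 8-bit approximate
                                * access counter. */
        int refcount;          /* reference count */
        void *ptr;             /* points to the stored value; interpret it
                                * according to type and encoding */
    } robj;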

There are only five fields at its core. Let's go through them one by one.
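As a rough check of the layout described above, the sketch below declares a stand-in struct with the same five fields. The names `demo_robj`, `demo_robj_size`, `demo_expected_size` and `demo_store_type` are hypothetical and not part of Redis, and LRU_BITS is assumed to be 24 as stated in the text. Bit-field packing is implementation-defined, so the size relation holds on common compilers such as GCC and Clang but is not guaranteed by the C standard.

```c
#include <assert.h>
#include <stddef.h>

#define LRU_BITS 24 /* assumed, matching the 24 bits described in the text */

/* Hypothetical stand-in with the same field layout as redisObject. */
typedef struct demo_robj {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:LRU_BITS;
    int refcount;
    void *ptr;
} demo_robj;

/* Actual size of the stand-in struct on this platform. */
static size_t demo_robj_size(void) {
    return sizeof(demo_robj);
}

/* Size we expect if the three bit-fields (4 + 4 + 24 = 32 bits)
 * share a single unsigned int. */
static size_t demo_expected_size(void) {
    return sizeof(unsigned) + sizeof(int) + sizeof(void *);
}

/* A 4-bit field keeps only the low 4 bits of whatever is stored in it. */
static unsigned demo_store_type(unsigned t) {
    demo_robj o = {0};
    o.type = t;
    return o.type;
}
```

On a typical 64-bit build the two size helpers both return 16, which is the robj size used later in the EMBSTR arithmetic.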

type (4 bits)

type identifies the kind of data the current robj stores. Redis currently defines the following types.

Identifier	Value	Meaning
OBJ_STRING	0	string
OBJ_LIST	1	list
OBJ_SET	2	set
OBJ_ZSET	3	sorted set (zset)
OBJ_HASH	4	hash
OBJ_MODULE	5	module
OBJ_STREAM	6	stream

encoding (4 bits)

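The encoding field records how the value is represented internally. If every type had exactly one representation, type and encoding would be redundant and one of them could be dropped. But to save as much memory as possible, Redis uses different encodings for the same type depending on the data, so it needs a separate field to record which one is in use. Redis 6.2 defines the following encodings.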

Identifier	Value	Meaning
OBJ_ENCODING_RAW	0	the most basic representation; used only by strings
OBJ_ENCODING_INT	1	integer
OBJ_ENCODING_HT	2	dict
OBJ_ENCODING_ZIPMAP	3	zipmap; no longer used
OBJ_ENCODING_LINKEDLIST	4	old linked list; no longer used
OBJ_ENCODING_ZIPLIST	5	ziplist
OBJ_ENCODING_INTSET	6	intset
OBJ_ENCODING_SKIPLIST	7	skiplist
OBJ_ENCODING_EMBSTR	8	embedded sds
OBJ_ENCODING_QUICKLIST	9	quicklist
OBJ_ENCODING_STREAM	10	stream
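To make the table easier to use while debugging, here is a minimal sketch that maps the encoding values above to short labels. The constant values come from the table; the function `demo_encoding_name` and the label strings are assumptions chosen for illustration, not necessarily what Redis itself prints.

```c
#include <assert.h>
#include <string.h>

/* Values copied from the table above. */
#define OBJ_ENCODING_RAW        0
#define OBJ_ENCODING_INT        1
#define OBJ_ENCODING_HT         2
#define OBJ_ENCODING_ZIPMAP     3
#define OBJ_ENCODING_LINKEDLIST 4
#define OBJ_ENCODING_ZIPLIST    5
#define OBJ_ENCODING_INTSET     6
#define OBJ_ENCODING_SKIPLIST   7
#define OBJ_ENCODING_EMBSTR     8
#define OBJ_ENCODING_QUICKLIST  9
#define OBJ_ENCODING_STREAM     10

/* Hypothetical helper: returns an illustrative label for an encoding value. */
static const char *demo_encoding_name(unsigned encoding) {
    switch (encoding) {
    case OBJ_ENCODING_RAW:        return "raw";
    case OBJ_ENCODING_INT:        return "int";
    case OBJ_ENCODING_HT:         return "hashtable";
    case OBJ_ENCODING_ZIPMAP:     return "zipmap";
    case OBJ_ENCODING_LINKEDLIST: return "linkedlist";
    case OBJ_ENCODING_ZIPLIST:    return "ziplist";
    case OBJ_ENCODING_INTSET:     return "intset";
    case OBJ_ENCODING_SKIPLIST:   return "skiplist";
    case OBJ_ENCODING_EMBSTR:     return "embstr";
    case OBJ_ENCODING_QUICKLIST:  return "quicklist";
    case OBJ_ENCODING_STREAM:     return "stream";
    default:                      return "unknown";
    }
}
```

Because encoding is a 4-bit field, at most 16 distinct values can ever be stored in it, so a flat switch like this stays small.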

OBJ_ENCODING_EMBSTR deserves a closer look.

    robj *createEmbeddedStringObject(const char *ptr, size_t len) {
        robj *o = zmalloc(sizeof(robj)+sizeof(struct sdshdr8)+len+1);
        struct sdshdr8 *sh = (void*)(o+1);

        o->type = OBJ_STRING;
        o->encoding = OBJ_ENCODING_EMBSTR;
        o->ptr = sh+1;
        o->refcount = 1;
        if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
            o->lru = (LFUGetTimeInMinutes()<<8) | LFU_INIT_VAL;
        } else {
            o->lru = LRU_CLOCK();
        }

        sh->len = len;
        sh->alloc = len;
        sh->flags = SDS_TYPE_8;
        if (ptr == SDS_NOINIT)
            sh->buf[len] = '\0';
        else if (ptr) {
            memcpy(sh->buf,ptr,len);
            sh->buf[len] = '\0';
        } else {
            memset(sh->buf,0,len+1);
        }
        return o;
    }

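As the code shows, an EMBSTR object combines the robj and the sds: the sds header and buffer are placed directly after the robj in the same allocation. The string is limited to 44 bytes. The robj takes 16 bytes, the sdshdr8 header takes 3 bytes, and the terminating '\0' takes 1 byte, so capping the string at 44 bytes lets everything fit in 64 bytes (16 + 3 + 1 + 44 == 64).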
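The single-allocation idea can be sketched outside Redis as follows. Everything prefixed with `demo_` is a hypothetical stand-in: `demo_robj` mirrors the robj fields, `demo_sdshdr8` mirrors a 3-byte sds header, and plain `malloc` replaces `zmalloc`. The flags value 1 is assumed to correspond to SDS_TYPE_8, and the lru field is simply zeroed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_EMBSTR_LIMIT 44 /* limit described in the text */

typedef struct demo_robj {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:24;
    int refcount;
    void *ptr;
} demo_robj;

/* Three one-byte members, so the header is 3 bytes with no padding. */
struct demo_sdshdr8 {
    uint8_t len;
    uint8_t alloc;
    unsigned char flags;
    char buf[];
};

/* Total bytes needed for object + header + string + terminating NUL. */
static size_t demo_embstr_total(size_t len) {
    return sizeof(demo_robj) + sizeof(struct demo_sdshdr8) + len + 1;
}

/* Allocates the object and its string in one block; caller frees with free(). */
static demo_robj *demo_create_embstr(const char *s, size_t len) {
    if (len > DEMO_EMBSTR_LIMIT) return NULL;
    demo_robj *o = malloc(demo_embstr_total(len));
    if (o == NULL) return NULL;
    struct demo_sdshdr8 *sh = (void *)(o + 1); /* header starts right after the object */
    o->type = 0;       /* OBJ_STRING */
    o->encoding = 8;   /* OBJ_ENCODING_EMBSTR */
    o->lru = 0;
    o->refcount = 1;
    o->ptr = sh->buf;  /* points into the same allocation */
    sh->len = (uint8_t)len;
    sh->alloc = (uint8_t)len;
    sh->flags = 1;     /* assumed stand-in for SDS_TYPE_8 */
    memcpy(sh->buf, s, len);
    sh->buf[len] = '\0';
    return o;
}
```

With a 16-byte object header, `demo_embstr_total(44)` comes to 64, which matches the arithmetic above. A single `free(o)` releases both the object and its string, which is the point of the encoding.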

lru (24 bits)

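As is well known, Redis can evict data automatically when memory runs short. Which keys should be evicted first? The answer depends on the lru field. (Whether a key has expired is a separate matter, tracked through its TTL rather than through this field.) Redis gives lru 24 bits, but don't assume from the name that only the LRU policy uses it: LFU uses the same field. My guess is that the author implemented LRU first and named the field accordingly, then reused it when LFU was added later. The field means different things under each policy. Under LRU it holds the low 24 bits of a second-resolution clock (LRU_CLOCK() in the code above), recording when the object was last accessed. Under LFU the 24 bits are split into a 16-bit minute-resolution timestamp and an 8-bit approximate access counter. I won't go into more detail here; a later post will cover it.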
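The two interpretations of the 24 bits can be illustrated with a small sketch. The helper names (`demo_lfu_pack`, `demo_lfu_minutes`, `demo_lfu_counter`, `demo_lru_idle`) are hypothetical. The LFU split follows the layout described above (upper 16 bits time, lower 8 bits counter), and the idle-time helper shows one plausible way to handle wraparound of a 24-bit clock, counted in clock ticks.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_LRU_BITS 24
#define DEMO_LRU_MAX ((1u << DEMO_LRU_BITS) - 1) /* largest 24-bit value */

/* LFU view: 16-bit minute timestamp in the high bits, 8-bit counter below. */
static uint32_t demo_lfu_pack(uint32_t minutes, uint8_t counter) {
    return ((minutes & 0xFFFFu) << 8) | counter;
}

static uint32_t demo_lfu_minutes(uint32_t lru) {
    return (lru >> 8) & 0xFFFFu;
}

static uint8_t demo_lfu_counter(uint32_t lru) {
    return (uint8_t)(lru & 0xFFu);
}

/* LRU view: ticks elapsed since the last access. If the current clock is
 * smaller than the stored value, assume the 24-bit clock wrapped once. */
static uint32_t demo_lru_idle(uint32_t now, uint32_t last) {
    if (now >= last) return now - last;
    return (DEMO_LRU_MAX - last) + now;
}
```

For example, `demo_lfu_pack(0x1234, 5)` yields `0x123405`, and both parts can be read back with the two accessors. At one tick per second a 24-bit clock wraps roughly every 194 days, which is why the idle-time helper needs the wraparound branch.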

refcount

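The reference count records how many places currently reference this robj; it is what makes object sharing possible. Anyone familiar with garbage collection knows that one reclamation strategy is reference counting: when refcount drops to 0, the object is no longer in use and can be reclaimed. Redis implements this strategy.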

*ptr

This one is simple: the previous fields hold metadata about the current robj, while ptr holds the address of the actual data.

Encoding and decoding robj

Redis goes to great lengths to save memory. Here it applies special encodings to string robj values for that purpose. The encoding code, with comments, is as follows:

    /* Apply special encodings to a string robj to save memory. */
    robj *tryObjectEncoding(robj *o) {
        long value;
        sds s = o->ptr;
        size_t len;

        /* Make sure this is a string object, the only type we encode
         * in this function. Other types use encoded memory efficient
         * representations but are handled by the commands implementing
         * the type. */
        serverAssertWithInfo(NULL,o,o->type == OBJ_STRING);

        /* We try some specialized encoding only for objects that are
         * RAW or EMBSTR encoded, in other words objects that are still
         * in represented by an actually array of chars.
         * Anything that is not an sds string is returned unchanged. */
        if (!sdsEncodedObject(o)) return o;

        /* It's not safe to encode shared objects: shared objects can be shared
         * everywhere in the "object space" of Redis and may end in places where
         * they are not handled. We handle them only as values in the keyspace. */
        if (o->refcount > 1) return o;

        /* Check if we can represent this string as a long integer.
         * Note that we are sure that a string larger than 20 chars is not
         * representable as a 32 nor 64 bit integer. */
        len = sdslen(s);
        if (len <= 20 && string2l(s,len,&value)) {
            /* This object is encodable as a long. Try to use a shared object.
             * Note that we avoid using shared integers when maxmemory is used
             * because every object needs to have a private LRU field for the LRU
             * algorithm to work well.
             * In other words: if the value is in [0, OBJ_SHARED_INTEGERS), i.e.
             * below 10000, and no eviction policy that needs a private lru
             * field is active, reuse the shared object for that number. */
            if ((server.maxmemory == 0 ||
                !(server.maxmemory_policy & MAXMEMORY_FLAG_NO_SHARED_INTEGERS)) &&
                value >= 0 &&
                value < OBJ_SHARED_INTEGERS)
            {
                decrRefCount(o);
                incrRefCount(shared.integers[value]);
                return shared.integers[value];
            } else {
                /* A RAW object is converted in place: free the sds, switch
                 * to OBJ_ENCODING_INT and store the long directly in ptr. */
                if (o->encoding == OBJ_ENCODING_RAW) {
                    sdsfree(o->ptr);
                    o->encoding = OBJ_ENCODING_INT;
                    o->ptr = (void*) value;
                    return o;
                /* An EMBSTR object is released and replaced by a new
                 * object that holds the integer value. */
                } else if (o->encoding == OBJ_ENCODING_EMBSTR) {
                    decrRefCount(o);
                    return createStringObjectFromLongLongForValue(value);
                }
            }
        }

        /* Strings that cannot be converted to a long are handled below. */

        /* If the string is small and is still RAW encoded,
         * try the EMBSTR encoding which is more efficient.
         * In this representation the object and the SDS string are allocated
         * in the same chunk of memory to save space and cache misses.
         * The limit OBJ_ENCODING_EMBSTR_SIZE_LIMIT is 44 bytes. */
        if (len <= OBJ_ENCODING_EMBSTR_SIZE_LIMIT) {
            robj *emb;

            if (o->encoding == OBJ_ENCODING_EMBSTR) return o;
            emb = createEmbeddedStringObject(s,sdslen(s));
            decrRefCount(o);
            return emb;
        }

        /* We can't encode the object...
         *
         * Do the last try, and at least optimize the SDS string inside
         * the string object to require little space, in case there
         * is more than 10% of free space at the end of the SDS string.
         *
         * We do that only for relatively large strings as this branch
         * is only entered if the length of the string is greater than
         * OBJ_ENCODING_EMBSTR_SIZE_LIMIT. */
        trimStringObjectIfNeeded(o);

        /* Return the original object. */
        return o;
    }

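In short, tryObjectEncoding does the following:

1. Asserts that the object is a string, and returns it unchanged if it is not RAW or EMBSTR encoded.
2. Returns it unchanged if it is shared (refcount > 1); shared objects are not re-encoded.
3. If the string is at most 20 characters long and parses as a long, encodes it as an integer. Values from 0 to 9999 reuse the shared integer objects, unless maxmemory is set together with a policy that needs a private lru field per object.
4. Otherwise, if the string is at most 44 bytes long, stores it as OBJ_ENCODING_EMBSTR.
5. Otherwise (longer than 44 bytes), trims the sds if more than 10% of its space is free, to save memory.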
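The decision order above can be sketched as a standalone function that only reports which encoding would be chosen. This is a simplified illustration, not the Redis implementation: it ignores shared integers, refcount and sds trimming, and `demo_string_to_long` is a hypothetical strict parser standing in for `string2l` (optional minus sign, digits only, no leading zeros, must fit in a long).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

enum demo_encoding { DEMO_RAW = 0, DEMO_INT = 1, DEMO_EMBSTR = 8 };

#define DEMO_EMBSTR_LIMIT 44

/* Hypothetical strict parser: returns 1 and sets *out on success, else 0. */
static int demo_string_to_long(const char *s, size_t len, long *out) {
    size_t i = 0;
    int negative = 0;
    unsigned long limit, value = 0;

    if (len == 0) return 0;
    if (len == 1 && s[0] == '0') { *out = 0; return 1; }
    if (s[0] == '-') {
        negative = 1;
        i = 1;
        if (i == len) return 0; /* a lone "-" is not a number */
    }
    if (s[i] < '1' || s[i] > '9') return 0; /* rejects leading zeros */

    /* Largest magnitude that still fits: LONG_MAX, or LONG_MAX+1 if negative. */
    limit = negative ? (unsigned long)LONG_MAX + 1 : (unsigned long)LONG_MAX;
    for (; i < len; i++) {
        unsigned digit;
        if (s[i] < '0' || s[i] > '9') return 0;
        digit = (unsigned)(s[i] - '0');
        if (value > (limit - digit) / 10) return 0; /* would overflow */
        value = value * 10 + digit;
    }
    if (negative)
        *out = (value > (unsigned long)LONG_MAX) ? LONG_MIN : -(long)value;
    else
        *out = (long)value;
    return 1;
}

/* Mirrors the order of checks: integer first, then EMBSTR, then RAW. */
static enum demo_encoding demo_choose_encoding(const char *s, size_t len) {
    long value;
    if (len <= 20 && demo_string_to_long(s, len, &value)) return DEMO_INT;
    if (len <= DEMO_EMBSTR_LIMIT) return DEMO_EMBSTR;
    return DEMO_RAW;
}
```

Under these assumptions "12345" maps to the integer encoding, "007" falls through to EMBSTR because the strict parser rejects leading zeros, and any string longer than 44 bytes stays RAW.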

Where there is encoding there is decoding. The code is below and is relatively simple:

    /* Get a decoded version of an encoded object (returned as a new object).
     * If the object is already raw-encoded just increment the ref count. */
    robj *getDecodedObject(robj *o) {
        robj *dec;

        if (sdsEncodedObject(o)) {
            incrRefCount(o);
            return o;
        }
        if (o->type == OBJ_STRING && o->encoding == OBJ_ENCODING_INT) {
            char buf[32];

            ll2string(buf,32,(long)o->ptr);
            dec = createStringObject(buf,strlen(buf));
            return dec;
        } else {
            serverPanic("Unknown encoding type");
        }
    }
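The integer branch of the decoder boils down to turning a long back into its decimal text. A minimal sketch of that step, using the standard `snprintf` rather than Redis's own `ll2string`, could look like this; `demo_long_to_string` is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes the decimal text of value into buf.
 * Returns the string length, or 0 if buf is too small to hold it. */
static size_t demo_long_to_string(char *buf, size_t buflen, long value) {
    int n = snprintf(buf, buflen, "%ld", value);
    if (n < 0 || (size_t)n >= buflen) return 0;
    return (size_t)n;
}
```

A 32-byte buffer, as in the code above, is more than enough: a 64-bit long needs at most 20 characters plus the terminating NUL.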

Reference counting and automatic cleanup

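As mentioned above, Redis shares some objects to save space, and objects with no remaining references are cleaned up automatically. The author implements this GC with reference counting. The code is fairly simple: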

    void incrRefCount(robj *o) {
        if (o->refcount < OBJ_FIRST_SPECIAL_REFCOUNT) {
            o->refcount++;
        } else {
            if (o->refcount == OBJ_SHARED_REFCOUNT) {
                /* Nothing to do: this refcount is immutable. */
            } else if (o->refcount == OBJ_STATIC_REFCOUNT) {
                serverPanic("You tried to retain an object allocated in the stack");
            }
        }
    }

    /* Decrement the reference count; free the object when no references remain. */
    void decrRefCount(robj *o) {
        /* Last reference: free the value, then the robj itself. */
        if (o->refcount == 1) {
            switch(o->type) {
            case OBJ_STRING: freeStringObject(o); break;
            case OBJ_LIST: freeListObject(o); break;
            case OBJ_SET: freeSetObject(o); break;
            case OBJ_ZSET: freeZsetObject(o); break;
            case OBJ_HASH: freeHashObject(o); break;
            case OBJ_MODULE: freeModuleObject(o); break;
            case OBJ_STREAM: freeStreamObject(o); break;
            default: serverPanic("Unknown object type"); break;
            }
            zfree(o);
        } else {
            if (o->refcount <= 0) serverPanic("decrRefCount against refcount <= 0");
            if (o->refcount != OBJ_SHARED_REFCOUNT) o->refcount--;
        }
    }
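The same pattern can be tried in isolation with a toy object. In this sketch `demo_obj`, `demo_create`, `demo_incr` and `demo_decr` are hypothetical, and `DEMO_SHARED_REFCOUNT` is an assumed sentinel (here INT_MAX) playing the role of OBJ_SHARED_REFCOUNT: an object carrying it is never modified or freed.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define DEMO_SHARED_REFCOUNT INT_MAX /* assumed sentinel for immortal objects */

typedef struct demo_obj {
    int refcount;
    int payload;
} demo_obj;

/* New objects start with exactly one owner. */
static demo_obj *demo_create(int payload) {
    demo_obj *o = malloc(sizeof(*o));
    if (o == NULL) return NULL;
    o->refcount = 1;
    o->payload = payload;
    return o;
}

/* Adds an owner, unless the object is marked as shared forever. */
static void demo_incr(demo_obj *o) {
    if (o->refcount != DEMO_SHARED_REFCOUNT) o->refcount++;
}

/* Drops an owner. Returns 1 if this call freed the object, 0 otherwise. */
static int demo_decr(demo_obj *o) {
    if (o->refcount == 1) {
        free(o);
        return 1;
    }
    if (o->refcount != DEMO_SHARED_REFCOUNT) o->refcount--;
    return 0;
}
```

Note that the object is freed when the count is 1 at the time of the decrement, so the count never actually reaches 0 in memory. The sentinel lets frequently used objects, such as the shared small integers, skip the bookkeeping entirely.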

Summary

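To summarize, robj plays several roles, all covered above: it wraps every value type behind one uniform header (type and encoding), it carries the information eviction needs (lru), and it enables object sharing and automatic cleanup (refcount).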
