Java8之後，HashMap連結串列插入方式，為何要尾插入

首頁>技術>程式設計師springmeng2020-12-27 08:30

Java8之後，HashMap連結串列插入方式，為何要尾插入

資料結構

對 HashMap 的底層資料結構，相信大家都有所瞭解，不同的版本，底層資料結構會有所不同

1.7 的底層資料結構

/** * An empty table instance to share when the table is not inflated. */static final Entry<?,?>[] EMPTY_TABLE = {};/** * The table, resized as necessary. Length MUST Always be a power of two. */transient Entry<K,V>[] table = (Entry<K,V>[]) EMPTY_TABLE;static class Entry<K,V> implements Map.Entry<K,V> {    final K key;    V value;    Entry<K,V> next;    int hash;        ...}

1.8 的底層資料結構

/** * The table, initialized on first use, and resized as * necessary. When allocated, length is always a power of two. * (We also tolerate length zero in some operations to allow * bootstrapping mechanics that are currently not needed.) */transient Node<K,V>[] table;static class Node<K,V> implements Map.Entry<K,V> {    final int hash;    final K key;    V value;    Node<K,V> next;    ...}/** * Entry for Tree bins. Extends LinkedHashMap.Entry (which in turn * extends Node) so can be used as extension of either regular or * linked node. */static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {    TreeNode<K,V> parent;  // red-black tree links    TreeNode<K,V> left;    TreeNode<K,V> right;    TreeNode<K,V> prev;    // needed to unlink next upon deletion    boolean red;        ...}

但基礎結構還是：陣列 + 連結串列，稱作雜湊表或散列表

只是 1.8 做了最佳化，引進了紅黑樹，來提升連結串列中元素獲取的速度

JDK1.7 頭插

只有元素新增的時候，才會出現連結串列元素的插入，那麼我們先來看看 put 方法

put - 新增元素

原始碼如下

   /**     * Associates the specified value with the specified key in this map.     * If the map previously contained a mapping for the key, the old     * value is replaced.     *     * @param key key with which the specified value is to be associated     * @param value value to be associated with the specified key     * @return the previous value associated with <tt>key</tt>, or     *         <tt>null</tt> if there was no mapping for <tt>key</tt>.     *         (A <tt>null</tt> return can also indicate that the map     *         previously associated <tt>null</tt> with <tt>key</tt>.)     */    public V put(K key, V value) {        if (table == EMPTY_TABLE) {            inflateTable(threshold);        }        if (key == null)            return putForNullKey(value);        int hash = hash(key);        int i = indexFor(hash, table.length);        for (Entry<K,V> e = table[i]; e != null; e = e.next) {            Object k;            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {                V oldValue = e.value;                e.value = value;                e.recordAccess(this);                return oldValue;            }        }        modCount++;        addEntry(hash, key, value, i);        return null;    }

直接看程式碼可能不夠直觀，我們結合流程圖來看

什麼？還是不夠直觀？（樓主也這麼覺得）

那我們就結合具體案例來看下這個流程

假設 HashMap 初始狀態

然後依次往裡面新增元素：(2,b), (3,w), (5,e), (9,t), (16,p)

再利用斷點除錯，我們來看看真實情況

一切都對得上，進展的也挺順利

resize - 陣列擴容

上述提到了擴容，但是沒細講，我們來看看擴容的實現

關鍵程式碼如下

/** * Rehashes the contents of this map into a new array with a * larger capacity.  This method is called automatically when the * number of keys in this map reaches its threshold. * * If current capacity is MAXIMUM_CAPACITY, this method does not * resize the map, but sets threshold to Integer.MAX_VALUE. * This has the effect of preventing future calls. * * @param newCapacity the new capacity, MUST be a power of two; *        must be greater than current capacity unless current *        capacity is MAXIMUM_CAPACITY (in which case value *        is irrelevant). */void resize(int newCapacity) {    Entry[] oldTable = table;    int oldCapacity = oldTable.length;    if (oldCapacity == MAXIMUM_CAPACITY) {        threshold = Integer.MAX_VALUE;        return;    }    Entry[] newTable = new Entry[newCapacity];    transfer(newTable, initHashSeedAsNeeded(newCapacity));    table = newTable;    threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);}/** * Transfers all entries from current table to newTable. */void transfer(Entry[] newTable, boolean rehash) {    int newCapacity = newTable.length;    for (Entry<K,V> e : table) {        while(null != e) {            Entry<K,V> next = e.next;            if (rehash) {                e.hash = null == e.key ? 0 : hash(e.key);            }            int i = indexFor(e.hash, newCapacity);            e.next = newTable[i];            newTable[i] = e;            e = next;        }    }}

主要做了兩件事：1、建立一個新的 Entry 空陣列，長度是原陣列的 2 倍，2、遍歷原陣列，對每個元素重新計算新陣列的索引值，然後放入到新陣列的對應位置

有意思的是這個轉移方法：transfer，我們結合案例來仔細看看

假設擴容之前的狀態如下圖所示

擴容過程如下

利用斷點除錯，我們來看看真實情況

連結串列元素的轉移，還是採用的頭插法

連結串列成環

不管是元素的新增，還是陣列擴容，只要涉及到 hash 衝突，就會採用頭插法將元素新增到連結串列中

上面講了那麼多，看似風平浪靜，實則暗流湧動；單執行緒下，確實不會有什麼問題，那多執行緒下呢？我們接著往下看

將設擴容之前的的狀態如下所示

然後，執行緒 1 新增 (1,a) ，執行緒 2 新增 (19,n)，執行緒 1 會進行擴容，執行緒 2 也進行擴容，那麼 transfer 的時候就可能出現如下情況

哦豁，連結串列成環了，這就會導致：Infinite Loop

JDK1.8 尾插

1.8就不講那麼詳細了，我們主要來看看 resize 中的元素轉移部分

if (oldTab != null) {    // 從索引 0 開始逐個遍歷舊 table    for (int j = 0; j < oldCap; ++j) {        Node<K,V> e;        if ((e = oldTab[j]) != null) {            oldTab[j] = null;            if (e.next == null)    // 連結串列只有一個元素                newTab[e.hash & (newCap - 1)] = e;            else if (e instanceof TreeNode)    // 紅黑樹，先不管                ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);            else { // preserve order                // 拆連結串列，拆成兩個子連結串列：索引不變的元素連結串列和有相同偏移量的元素連結串列                // 每個連結串列都保持原有順序                Node<K,V> loHead = null, loTail = null;                Node<K,V> hiHead = null, hiTail = null;                Node<K,V> next;                do {                    next = e.next;                    if ((e.hash & oldCap) == 0) {                        // 索引不變的元素連結串列                        if (loTail == null)                            loHead = e;                        else    // 透過尾部去關聯 next，維持了元素原有順序                            loTail.next = e;                        loTail = e;                    }                    else {                        // 相同偏移量的元素連結串列                        if (hiTail == null)                            hiHead = e;                        else    // 透過尾部去關聯 next，維持了元素原有順序                            hiTail.next = e;                        hiTail = e;                    }                } while ((e = next) != null);                if (loTail != null) {                    loTail.next = null;                    newTab[j] = loHead;                }                if (hiTail != null) {                    hiTail.next = null;                    newTab[j + oldCap] = hiHead;                }            }        }    }}

透過尾插法，維護了連結串列元素的原有順序

在擴容時，頭插法會改變連結串列中元素原本的順序，以至於在併發場景下導致連結串列成環的問題，而尾插法，在擴容時會保持連結串列元素原本的順序，就不會出現連結串列成環的問題

相關疑惑

1、JDK 1.7及之前，為什麼採用頭插法

呃... 這個可能需要問頭插法的實現者了；

但有種說法，我覺得挺有道理：快取的時間區域性性原則，最近訪問過的資料下次大機率會再次訪問，把剛訪問過的元素放在連結串列最前面可以直接被查詢到，減少查詢次數

2、既然頭插法有連結串列成環的問題，為什麼直到 1.8 才採用尾插法來替代頭插法

只有在併發情況下，頭插法才會出現連結串列成環的問題，多執行緒情況下，HashMap 本就非執行緒安全，這就相當於你在它的規則之外出了問題，那能怪誰？

1.8 採用尾插，是對 1.7 的最佳化

3、既然 1.8 沒有連結串列成環的問題，那是不是說明可以把 1.8 中的 HashMap 用在多執行緒中

連結串列成環只是併發問題中的一種，1.8 雖然解決了此問題，但是還是會有很多其他的併發問題，比如：上秒 put 的值，下秒 get 的時候卻不是剛 put 的值；因為操作都沒有加鎖，不是執行緒安全的

總結

1、JDK 1.7 採用頭插法來新增連結串列元素，存在連結串列成環的問題，1.8 中做了最佳化，採用尾插法來新增連結串列元素

2、HashMap 不管在哪個版本都不是執行緒安全的，出了併發問題不要怪 HashMap，從自己身上找原因

參考

https://juejin.im/post/6844903682664824845

https://www.cnblogs.com/youzhibing/p/11833040.html

最新評論

劇多

Java8之後，HashMap連結串列插入方式，為何要尾插入

相關內容