【Java源码分析】HashTable源码分析

旧城等待, 2022-09-23 09:29 443阅读 0赞

类的定义

  1. public class Hashtable<K,V>
  2. extends Dictionary<K,V>
  3. implements Map<K,V>, Cloneable, java.io.Serializable {}

注意前面虽然说HashTable和HashMap是非常像的,但是这两个类的父类是不一样的。前者是字典类的子类,后者是抽象Map的子类。

  1. HashTable 也是将key映射到value的集合类,key不允许为null,并且作为key的类是必须实现hashCode()和equals()方法
  2. 影响HashTable的性能的两个因素也是容量capacity和装载因子load-factor。关于这一点和HashMap是一样的,默认装载因子也是0.75
  3. 初始化容量用于控制空间利用率和扩容之间的平衡,如果初始容量很小,那么后续可能引发多次扩容(扩容会重新分配空间,再hash,并拷贝数据,所以比较耗时);如果初始容量很大,可能会有点浪费存储空间;所以和HashMap一样,最好预估一个合理的初始大小
  4. 迭代器创建并迭代的过程中同样是不允许修改HashTable结构的(比如增加或者删除数据),否则出现fail-fast,同时fail-fast虽然抛出了ConcurrentModificationException但是它并不能保证线程安全性
  5. HashTable是线程安全的如果程序本身不是多线程环境下运行,那么建议使用HashMap,如果是高并发环境下,建议使用java.util.concurrent.ConcurrentHashMap,只有在一般并发环境下建议使用HashTable

重要的成员变量

  1. private transient Entry<K,V>[] table; // 基于数组实现
  2. private transient int count; // 实际存放的实体个数
  3. private int threshold; // 阈值,用于判断是否需要扩容
  4. private float loadFactor; // 装载因子

构造函数

  1. public Hashtable() {
  2. this(11, 0.75f);
  3. }
  4. public Hashtable(int initialCapacity) {
  5. this(initialCapacity, 0.75f);
  6. }
  7. public Hashtable(int initialCapacity, float loadFactor) {
  8. if (initialCapacity < 0)
  9. throw new IllegalArgumentException("Illegal Capacity: "+
  10. initialCapacity);
  11. if (loadFactor <= 0 || Float.isNaN(loadFactor))
  12. throw new IllegalArgumentException("Illegal Load: "+loadFactor);
  13. if (initialCapacity==0)
  14. initialCapacity = 1;
  15. this.loadFactor = loadFactor;
  16. table = new Entry[initialCapacity];
  17. threshold = (int)Math.min(initialCapacity * loadFactor, MAX_ARRAY_SIZE + 1);
  18. useAltHashing = sun.misc.VM.isBooted() &&
  19. (initialCapacity >= Holder.ALTERNATIVE_HASHING_THRESHOLD);
  20. }
  21. public Hashtable(Map<? extends K, ? extends V> t) {
  22. this(Math.max(2*t.size(), 11), 0.75f);
  23. putAll(t);
  24. }

构造函数和HashMap一样,不同点在于默认容量不一样,HashTable的默认容量是11

hash函数

  1. transient final int hashSeed = sun.misc.Hashing.randomHashSeed(this);
  2. private int hash(Object k) {
  3. if (useAltHashing) {
  4. if (k.getClass() == String.class) {
  5. return sun.misc.Hashing.stringHash32((String) k);
  6. } else {
  7. int h = hashSeed ^ k.hashCode();
  8. // This function ensures that hashCodes that differ only by
  9. // constant multiples at each bit position have a bounded
  10. // number of collisions (approximately 8 at default load factor).
  11. h ^= (h >>> 20) ^ (h >>> 12);
  12. return h ^ (h >>> 7) ^ (h >>> 4);
  13. }
  14. } else {
  15. return k.hashCode();
  16. }
  17. }

hash的计算和HashMap是一样的

判断是否存在给定值

  1. public synchronized boolean contains(Object value) {
  2. if (value == null) {
  3. throw new NullPointerException();
  4. }
  5. Entry tab[] = table;
  6. for (int i = tab.length ; i-- > 0 ;) {
  7. for (Entry<K,V> e = tab[i] ; e != null ; e = e.next) {
  8. if (e.value.equals(value)) {
  9. return true;
  10. }
  11. }
  12. }
  13. return false;
  14. }
  15. public boolean containsValue(Object value) {
  16. return contains(value);
  17. }

可以看到这里一旦发现value是null就直接抛出异常,这样直接说明HashTable是不允许null-value的。另外该方法虽然是contains(),但是实际功能和containsValue()是一样的,从containsValue()的实现可以看出

给定的key是否存在

  1. public synchronized boolean containsKey(Object key) {
  2. Entry tab[] = table;
  3. int hash = hash(key);
  4. int index = (hash & 0x7FFFFFFF) % tab.length;
  5. for (Entry<K,V> e = tab[index] ; e != null ; e = e.next) {
  6. if ((e.hash == hash) && e.key.equals(key)) {
  7. return true;
  8. }
  9. }
  10. return false;
  11. }

当且仅当给定的key是存在的时候才返回true,但是注意一点的是这里所说的是否存在是取决于equals()方法是如何实现的。这和开头所强调的存入HashTable的key必须实现hashCode和equals方法是一致的.这里(hash & 0x7FFFFFFF)的目的应该是为了保证hash是一个正值,毕竟只有第32位为0其他都是1

使用给定的key查找

  1. public synchronized V get(Object key) {
  2. Entry tab[] = table;
  3. int hash = hash(key);
  4. int index = (hash & 0x7FFFFFFF) % tab.length;
  5. for (Entry<K,V> e = tab[index] ; e != null ; e = e.next) {
  6. if ((e.hash == hash) && e.key.equals(key)) {
  7. return e.value;
  8. }
  9. }
  10. return null;
  11. }

同HashMap,因为这里的底层实现结构都是一样的,都是基于数组,数组的每个下标对应一个单链表。如果对应的key在HashTable中存在着对应的value那么返回找到的value,否则返回Null

扩容

  1. protected void rehash() {
  2. int oldCapacity = table.length;
  3. Entry<K,V>[] oldMap = table;
  4. // overflow-conscious code
  5. int newCapacity = (oldCapacity << 1) + 1;
  6. if (newCapacity - MAX_ARRAY_SIZE > 0) {
  7. if (oldCapacity == MAX_ARRAY_SIZE)
  8. // Keep running with MAX_ARRAY_SIZE buckets
  9. return;
  10. newCapacity = MAX_ARRAY_SIZE;
  11. }
  12. Entry<K,V>[] newMap = new Entry[newCapacity];
  13. modCount++;
  14. threshold = (int)Math.min(newCapacity * loadFactor, MAX_ARRAY_SIZE + 1);
  15. boolean currentAltHashing = useAltHashing;
  16. useAltHashing = sun.misc.VM.isBooted() &&
  17. (newCapacity >= Holder.ALTERNATIVE_HASHING_THRESHOLD);
  18. boolean rehash = currentAltHashing ^ useAltHashing;
  19. table = newMap;
  20. for (int i = oldCapacity ; i-- > 0 ;) {
  21. for (Entry<K,V> old = oldMap[i] ; old != null ; ) {
  22. Entry<K,V> e = old;
  23. old = old.next;
  24. if (rehash) {
  25. e.hash = hash(e.key);
  26. }
  27. int index = (e.hash & 0x7FFFFFFF) % newCapacity;
  28. e.next = newMap[index];
  29. newMap[index] = e;
  30. }
  31. }
  32. }

当实际装入的键值对的个数超过了table的长度也就是容量capacity和装载因子的乘积,就会进行扩容,扩容过程是自动的,但是比较耗时。扩增后的容量是最初的容量的2倍加1,这里和HashMap对比就很容易理解,HashMap的初始容量是偶数,而且要求容量一直是偶数,那么扩容的时候就直接扩大到原来的2倍。相应的,HashTable的初始容量是奇数11,2倍之后变成偶数,所以加1。不过这里没有要求HashTable的容量必须是奇数,但无论如何扩容后容量都会变成奇数

添加键值对

  1. public synchronized V put(K key, V value) {
  2. // Make sure the value is not null
  3. if (value == null) {
  4. throw new NullPointerException();
  5. }
  6. // Makes sure the key is not already in the hashtable.
  7. Entry tab[] = table;
  8. int hash = hash(key);
  9. int index = (hash & 0x7FFFFFFF) % tab.length;
  10. for (Entry<K,V> e = tab[index] ; e != null ; e = e.next) {
  11. if ((e.hash == hash) && e.key.equals(key)) {
  12. V old = e.value;
  13. e.value = value;
  14. return old;
  15. }
  16. }
  17. modCount++;
  18. if (count >= threshold) {
  19. // Rehash the table if the threshold is exceeded
  20. rehash();
  21. tab = table;
  22. hash = hash(key);
  23. index = (hash & 0x7FFFFFFF) % tab.length;
  24. }
  25. // Creates the new entry.
  26. Entry<K,V> e = tab[index];
  27. tab[index] = new Entry<>(hash, key, value, e);
  28. count++;
  29. return null;
  30. }

注意键和值都必须不能为NULL添加过程中如果发现给定的key已经存在,那么替换旧值并返回旧值,否则将新的键值对加入到HashTable,其中如果添加过程中发现实际键值对数目已经超过阈值,那么就进行扩容

删除键值对

  1. public synchronized V remove(Object key) {
  2. Entry tab[] = table;
  3. int hash = hash(key);
  4. int index = (hash & 0x7FFFFFFF) % tab.length;
  5. for (Entry<K,V> e = tab[index], prev = null ; e != null ; prev = e, e = e.next) {
  6. if ((e.hash == hash) && e.key.equals(key)) {
  7. modCount++;
  8. if (prev != null) {
  9. prev.next = e.next;
  10. } else {
  11. tab[index] = e.next;
  12. }
  13. count--;
  14. V oldValue = e.value;
  15. e.value = null;
  16. return oldValue;
  17. }
  18. }
  19. return null;
  20. }

如果找到了给定key对应的键值对,那么就做删除操作,如果没有找到,那么就什么也不做。实际的删除过程就是单链表的节点删除

清空操作

  1. public synchronized void clear() {
  2. Entry tab[] = table;
  3. modCount++;
  4. for (int index = tab.length; --index >= 0; )
  5. tab[index] = null;
  6. count = 0;
  7. }

和HashMap的清空操作一样,只是将每个slot置为NULL,剩下的交给GC

比较两个HashTable是否相等

  1. public synchronized boolean equals(Object o) {
  2. if (o == this)
  3. return true;
  4. if (!(o instanceof Map))
  5. return false;
  6. Map<K,V> t = (Map<K,V>) o;
  7. if (t.size() != size())
  8. return false;
  9. try {
  10. Iterator<Map.Entry<K,V>> i = entrySet().iterator();
  11. while (i.hasNext()) {
  12. Map.Entry<K,V> e = i.next();
  13. K key = e.getKey();
  14. V value = e.getValue();
  15. if (value == null) {
  16. if (!(t.get(key)==null && t.containsKey(key)))
  17. return false;
  18. } else {
  19. if (!value.equals(t.get(key)))
  20. return false;
  21. }
  22. }
  23. } catch (ClassCastException unused) {
  24. return false;
  25. } catch (NullPointerException unused) {
  26. return false;
  27. }
  28. return true;
  29. }

首先比较大小是否相等,如果大小相等,那么就依次比较实体集合中每个实体的是否是一样的键值对,如果所有的键值对都一样,那么就返回true,否则返回false ,注意由于HashTable中数据存储并不是有序的,所以实际比较的时候使用的是get()根据指定的key获取对应的value

获取HashTable的hashCode

  1. public synchronized int hashCode() {
  2. /*
  3. * This code detects the recursion caused by computing the hash code
  4. * of a self-referential hash table and prevents the stack overflow
  5. * that would otherwise result. This allows certain 1.1-era
  6. * applets with self-referential hash tables to work. This code
  7. * abuses the loadFactor field to do double-duty as a hashCode
  8. * in progress flag, so as not to worsen the space performance.
  9. * A negative load factor indicates that hash code computation is
  10. * in progress.
  11. */
  12. int h = 0;
  13. if (count == 0 || loadFactor < 0)
  14. return h; // Returns zero
  15. loadFactor = -loadFactor; // Mark hashCode computation in progress
  16. Entry[] tab = table;
  17. for (Entry<K,V> entry : tab)
  18. while (entry != null) {
  19. h += entry.hashCode();
  20. entry = entry.next;
  21. }
  22. loadFactor = -loadFactor; // Mark hashCode computation complete
  23. return h;
  24. }

实际计算HashTable的hashCode是根据每个实体的hashCode来计算的,所有实体的hashCode()之和就是HashTable的hashCode。loadFractor扮演了一个标志位的角色,标志计算的开始和结束。关于这里h += entry.hashCode()中h是否会溢出的问题,上面再hash()函数中已经看到了每次求hash的时候都会与0x7fffffff,也就是说hash永远为正,不会溢出

序列化和反序列化

  1. private void writeObject(java.io.ObjectOutputStream s)
  2. throws IOException {
  3. Entry<K, V> entryStack = null;
  4. synchronized (this) {
  5. // Write out the length, threshold, loadfactor
  6. s.defaultWriteObject();
  7. // Write out length, count of elements
  8. s.writeInt(table.length);
  9. s.writeInt(count);
  10. // Stack copies of the entries in the table
  11. for (int index = 0; index < table.length; index++) {
  12. Entry<K,V> entry = table[index];
  13. while (entry != null) {
  14. entryStack =
  15. new Entry<>(0, entry.key, entry.value, entryStack);
  16. entry = entry.next;
  17. }
  18. }
  19. }
  20. // Write out the key/value objects from the stacked entries
  21. while (entryStack != null) {
  22. s.writeObject(entryStack.key);
  23. s.writeObject(entryStack.value);
  24. entryStack = entryStack.next;
  25. }
  26. }
  27. private void readObject(java.io.ObjectInputStream s)
  28. throws IOException, ClassNotFoundException
  29. {
  30. // Read in the length, threshold, and loadfactor
  31. s.defaultReadObject();
  32. // set hashSeed
  33. UNSAFE.putIntVolatile(this, HASHSEED_OFFSET,
  34. sun.misc.Hashing.randomHashSeed(this));
  35. // Read the original length of the array and number of elements
  36. int origlength = s.readInt();
  37. int elements = s.readInt();
  38. // Compute new size with a bit of room 5% to grow but
  39. // no larger than the original size. Make the length
  40. // odd if it's large enough, this helps distribute the entries.
  41. // Guard against the length ending up zero, that's not valid.
  42. int length = (int)(elements * loadFactor) + (elements / 20) + 3;
  43. if (length > elements && (length & 1) == 0)
  44. length--;
  45. if (origlength > 0 && length > origlength)
  46. length = origlength;
  47. Entry<K,V>[] table = new Entry[length];
  48. threshold = (int) Math.min(length * loadFactor, MAX_ARRAY_SIZE + 1);
  49. count = 0;
  50. useAltHashing = sun.misc.VM.isBooted() &&
  51. (length >= Holder.ALTERNATIVE_HASHING_THRESHOLD);
  52. // Read the number of elements and then all the key/value objects
  53. for (; elements > 0; elements--) {
  54. K key = (K)s.readObject();
  55. V value = (V)s.readObject();
  56. // synch could be eliminated for performance
  57. reconstitutionPut(table, key, value);
  58. }
  59. this.table = table;
  60. }

注意在反序列化过程中,并不会重建一个和序列化时候一模一样的HashTable,相反,会根据实际的大小和装载因子等数据创建一个更加合理大小的数组来存放数据,也就是上面加注释的求取新的length的代码。首先需要保证的是新的length是不能超过原来的length,否则没有意义,另外是保证这个新的length是奇数,而且比原来增加5%.另外这里也说明了为什么capacity或者说length需要时奇数,主要是为了更均匀的映射。HashMap中设置为偶数也是为了均匀Hash

发表评论

表情:
评论列表 (有 0 条评论,443人围观)

还没有评论,来说两句吧...

相关阅读

    相关 HashTable分析

    一、前言 前面几篇介绍了List相关的几个类。本篇开始分析Map相关的集合常用类的源码,OK,从HashTable开始分析。我们知道HashTable是线程安全的,但是其