Python_set集合&dict字典-蒲公英云

-—————————集合set—————————

概念

set是可变的、无序的、不重复的元素集合（约定：set为集合，collection为集合类型或容器）

set操作

set初始化

>>> s1 = set() >>> s2 = {} >>> s3 = {1, 2, 3} # {}内元素非k-v格式并且不为空时，类型为set >>> type(s2), type(s3) (dict, set)

set元素
- 必须可hash：可hash的都为不可变类型，称为hash类型， hashable（不可hash的类型：list、set、bytearray）
  - 数值型：int、float、complex
  - 布尔型：True、False
  - 字符串: string、bytes
  - tuple
  - None
- 不可以是索引
- 可迭代
  
  hash语法：hash(object) # object ->对象 # 返回对象的哈希值 # 示例： >>> hash(11), hash(11.11) # 数值 (11, 253642731013505035) >>> hash(True), hash(False) # 布尔 (1, 0) >>> hash(‘summer’), hash(b’a’) # 字符串 (1892576216179648960, -6180982495755337473) >>> hash(None) # None -9223372036575868911 >>> hash((1,)) # tuple 3430019387558 >>> hash([1, 2, 3]) # list不可直接进行hash —————————————————————————————————————- TypeError Traceback (most recent call last) in ——> 1 hash([1, 2, 3]) TypeError: unhashable type: ‘list’

set增加

# add() 增加元素到set中，若元素存在，则不做操作 >>> s1 = set() >>> s1.add('summer') >>> s1 {'summer'} # update() 合并其他元素到set中，参数必须为可迭代对象，就地修改 >>> s1.update([18, 'qingdao']) >>> s1 {18, 'qingdao', 'summer'}

set删除

# remove() 从set中移除一个元素，若元素不存在，抛出KeyError异常 >>> s1 = {1, 2, 3, 'ab', 66, 'ff'} >>> s1.remove('ab') # 移除元素'ab' >>> s1 {1, 2, 3, 66, 'ff'} >>> s1.remove('ss') # 若移除元素不存在，则抛出异常 --------------------------------------------------------------------------- KeyError Traceback (most recent call last) <ipython-input-33-63061d263830> in <module> ----> 1 s1.remove('ss') KeyError: 'ss' # discard() 从set中移除一个元素，若元素不存在，则不做操作 >>> s1.discard(1) # 移除元素1 >>> s1 {1, 2, 3, 66, 'ff'} >>> s1.discard('mm') # 移除元素不存在，则不做任何操作 >>> s1 {1, 2, 3, 66, 'ff'} # pop() 移除并返回任意元素，空集返回KeyError异常 >>> s1 {2, 3, 66, 'ff'} >>> s1.pop() 66 >>> s1.pop() 2 >>> s1.pop() 3 >>> s1.pop() 'ff' >>> s1.pop() # 当移除元素的集合为空时，抛出异常 --------------------------------------------------------------------------- KeyError Traceback (most recent call last) <ipython-input-51-095118de218b> in <module> ----> 1 s1.pop() # clear() 移除所有元素 >>> s1 = set(range(10)) >>> s1 {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} >>> s1.clear() >>> s1 set()

set修改、查询
- set修改只能增加或删除元素
- set为非线形结构，无法索引
- 可进行遍历（迭代所有元素）
```
>>> s1 = set(range(5, 9)) >>> for i in s1: print(i) 8 5 6 7
```
set成员运算的比较
- list和set成员运算效率比较（set遍历的速率远高于list）

set和线性结构

线性结构的查询时间复杂度是O(n), 耗时随数据规模的增大而增加
set、dict等结构，内部使用hash值作为key，时间复杂度可做到O(1),且查询时间和规模无关

集合

概念
- 全集：所有元素的集合
- 子集subset和superset：A集合所有元素都在集合B中，A是B的子集，B是A的超集
- 真子集和真超集：A是B的子集，且A不等于B， A是B的真子集，B是A的真超集
- 并集：多个集合合并的结果
- 交集：多个集合共同的部分
- 差集：集合中除去其他集合共同的部分
集合运算
- 并集
  
  union(others) 返回多个集合合并后的新的集合 >>> s1, s2 ({0, 1, 2, 3, 4}, {3, 4, 5, 6, 7, 8, 9})= >>> s1.union(s2) {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} # 等同于union() >>> s1 | s2 {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} # update(.others) 和多个集合合并，就地修改 >>> s1.update(s2) >>> s1 {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} # |= 等同于update() >>> s3 = {‘a’, ‘b’, ‘c’} >>> s2 |= s3 >>> s2 {3, 4, 5, 6, 7, 8, 9, ‘a’, ‘b’, ‘c’}
- 交集
  
  a = set(range(6)) >>> b = set(range(3, 10)) # intersection(others) 返回多个集合的交集 >>> a.intersection(b) # 返回同时属于a集合和b集合的元素,a集合和b集合不变 {3, 4, 5} >>> a, b ({0, 1, 2, 3, 4, 5}, {3, 4, 5, 6, 7, 8, 9}) # & 等同intersection() >>> a & b {3, 4, 5} >>> a, b ({0, 1, 2, 3, 4, 5}, {3, 4, 5, 6, 7, 8, 9}) # intersection_update(others) 获取和多个集合的交集，并且就地修改 >>> a.intersection_update(b) # 就地修改a集合的元素为a和b的交集，b集合不变 >>> a, b ({3, 4, 5}, {3, 4, 5, 6, 7, 8, 9}) # &= 等同于intersection_update() >>> a, c ({3, 4, 5}, {4, 5, 6, 7}) >>> a &= c >>> a, c ({4, 5}, {4, 5, 6, 7})
- 差集
  
  s1, s2 ({100, 101, 102, 103, 104, 105, 106, 107, 108, 109}, {105, 106, 107, 108, 109, 110, 111, 112, 113, 114}) # difference(others) 返回多个集合的差集 >>> s1.difference(s2) # 返回所有属于s1集合且不属于s2集合的元素所组成的新集合 {100, 101, 102, 103, 104} >>> s1, s2 # 原s1和s2集合不变 ({100, 101, 102, 103, 104, 105, 106, 107, 108, 109}, {105, 106, 107, 108, 109, 110, 111, 112, 113, 114}) # - 等同于difference >>> s1 - s2 {100, 101, 102, 103, 104} >>> s1, s2 ({100, 101, 102, 103, 104, 105, 106, 107, 108, 109}, {105, 106, 107, 108, 109, 110, 111, 112, 113, 114}) # difference_update(others) 获取多个集合的差集并就地修改 >>> s1.difference_update(s2) # 获取s1和s2集合的差集，并就地修改s1，s2不变 >>> s1, s2 ({100, 101, 102, 103, 104}, {105, 106, 107, 108, 109, 110, 111, 112, 113, 114}) # -= 等同于difference_update() >>> s3 = set(range(110, 120)) >>> s2 -= s3s >>> s2, s3 ({105, 106, 107, 108, 109}, {110, 111, 112, 113, 114, 115, 116, 117, 118, 119})
- 对称差集：集合A和集合B，由不属于A和B的交集元素组成的组合，即：(A - B) U (B - A)
  
  symmetric_difference(other) 返回和另一个集合的对称差集 >>> A, B ({‘a’, ‘b’, ‘c’, ‘d’, ‘e’}, {‘a’, ‘b’, ‘c’, ‘f’, ‘h’, ‘zz’}) >>> A.symmetric_difference(B) # 返回A和B的对称差集 {‘d’, ‘e’, ‘f’, ‘h’, ‘zz’} >>> A, B ({‘a’, ‘b’, ‘c’, ‘d’, ‘e’}, {‘a’, ‘b’, ‘c’, ‘f’, ‘h’, ‘zz’}) # ^ 等同于symmetric_difference() >>> A ^ B {‘d’, ‘e’, ‘f’, ‘h’, ‘zz’} >>> A, B ({‘a’, ‘b’, ‘c’, ‘d’, ‘e’}, {‘a’, ‘b’, ‘c’, ‘f’, ‘h’, ‘zz’}) # symmetric_difference_update(other) 获取和另一个集合的对称差集并就地修改 >>> A, B ({‘a’, ‘b’, ‘c’, ‘d’, ‘e’}, {‘a’, ‘b’, ‘c’, ‘f’, ‘h’, ‘zz’}) >>> A.symmetric_difference_update(B) # A集合修改为A和B集合的对称差集元素 >>> A, B ({‘d’, ‘e’, ‘f’, ‘h’, ‘zz’}, {‘a’, ‘b’, ‘c’, ‘f’, ‘h’, ‘zz’}) # ^= 等同于 symmetric_difference_update() >>> C = {‘f’, ‘h’, ‘zz’} >>> A ^= C >>> A, C ({‘d’, ‘e’}, {‘f’, ‘h’, ‘zz’})
- 其他集合运算
  
  a, b ({0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, {0, 1, 2, 3, 4}) # issubset(other) 等同于<= 判断当前集合是否是另一个集合的子集 >>> a.issubset(b), b.issubset(a), a <= b, b <= a (False, True, False, True) # set1 < set2 判断set1是否是set2的真子集 >>> a < b, b < a (False, True) # set1 》 set2 判断set1是否是set2的真超集 >>> a > b, b > a (True, False) # isdisjoint(other) 判断当前集合和另一个集合没有交集，没有交集则返回True >>> a.isdisjoint(b), b.isdisjoint(a) (False, False) >>> c = set(range(20, 25)) >>> a.isdisjoint(c), b.isdisjoint(c) (True, True)

集合练习

随机产生2组各10个数字的列表，要求如下：
- 每个数字取值范围[10, 20]
- 统计20个数字中，一共有多少个不同的数字
- 2组比较，打印不同的数字及个数
- 2组比较，打印重复的数字及个数
import random lst1 = [] lst2 = [] for i in range(10): lst1.append(random.randint(1, 20)) lst2.append(random.randint(1, 20)) print(lst1,’\n’,lst2) s1 = set(lst1) s2 = set(lst2) print(s1, s2) print(‘Total: {} nums: {}’.format(len(s1 | s2), s1 | s2)) print(‘Total: {} The same nums: {}’.format(len(s1 & s2), s1 & s2)) print(‘Total: {} The diff num: {}’.format(len(s1 ^ s2), s1 ^ s2))

-————————-字典dict—————————

概念

dict是key-value键值对的数据的集合
dict是可变的、无序的、key不可重复
dict的key和set集合的元素要求一致，必须可hash

字典dict操作

dict定义（初始化）

# dict(**kwargs) 使用key=value对初始化一个字典，或直接使用{key:values,} >>> d1 = dict(a=1, b=2, c=3) >>> d2 = {'name': 'summer', 'age': 18} >>> d1, d2 ({'a': 1, 'b': 2, 'c': 3}, {'name': 'summer', 'age': 18}) # dict(iterable, **kwarg)使用可迭代对象构造字典，可迭代对象必须是一个二元结构 >>> d3 = dict(((1, 'a'), (2, 'b'))) >>> d4 = dict(([5, 'zz'], [6, 'ss']), c=100) >>> d3, d4 ({1: 'a', 2: 'b'}, {5: 'zz', 6: 'ss', 'c': 100}) # dict.fromkeys(iterable, value) 类方法，默认value=None >>> d6 = dict.fromkeys(range(5)) >>> d7 = dict.fromkeys(range(5), 0) >>> d6, d7 ({0: None, 1: None, 2: None, 3: None, 4: None}, {0: 0, 1: 0, 2: 0, 3: 0, 4: 0})

dict元素访问

# d[key] 返回key对应的value值，key不存在抛出KeyError异常 In [4]: d Out[4]: {5: 'zz', 6: 'ww', 7: 'qq', 8: 'aa', 9: 'ss'} In [5]: d[5], d[9] # 返回key对应的value值 Out[5]: ('zz', 'ss') In [6]: d[10] # key不存在抛异常 --------------------------------------------------------------------------- KeyError Traceback (most recent call last) <ipython-input-6-76a0c2738598> in <module> ----> 1 d[10] KeyError: 10 # get(key[, default]) 返回key对应的value值，key不存在返回缺省值，若无缺省值返回None In [10]: d.get(7), d.get(5) # 返回key对应的value值 Out[10]: ('qq', 'zz') In [12]: d.get(11), d.get(20, 'haha') # key值不存在返回None，设置缺省值则返回缺省值 Out[12]: (None, 'haha') # setdefault(key[, default]) 返回key对应的value值，key不存在，添加kv对，value默认为缺省值，若未设置缺省值则为None Out[13]: {5: 'zz', 6: 'ww', 7: 'qq', 8: 'aa', 9: 'ss'} In [14]: d.setdefault(9) # 返回key对应的value值 Out[14]: 'ss' In [15]: d.setdefault(11, 'hh') # key不存在，添加kv对，value默认为缺省值 Out[15]: 'hh' In [16]: d Out[16]: {5: 'zz', 6: 'ww', 7: 'qq', 8: 'aa', 9: 'ss', 11: 'hh'}

dict增加和修改

# d[key] = value 将key对应的值修改为value， key值不存在则添加新的kv对 In [17]: d Out[17]: {5: 'zz', 6: 'ww', 7: 'qq', 8: 'aa', 9: 'ss', 11: 'hh'} In [18]: d[11] = 'haha' # 修改key对应的value值 In [19]: d Out[19]: {5: 'zz', 6: 'ww', 7: 'qq', 8: 'aa', 9: 'ss', 11: 'haha'} In [20]: d[100] = 'world' # key值不存在则添加新kv对 In [21]: d Out[21]: {5: 'zz', 6: 'ww', 7: 'qq', 8: 'aa', 9: 'ss', 11: 'haha', 100: 'world'} # update([other]) 使用另一个字典的kv更新本字典，key不存在则创建，key存在则覆盖key对应的value值，就地修改 In [27]: d2 Out[27]: {111: 'boom', 6: 'mm', 7: 'wow'} In [28]: d.update(d2) # 使用字典d2更新字典d,key不存在则创建，存在则使用d2对应的value值覆盖d的value值 In [29]: d Out[29]: {5: 'zz', 6: 'mm', 7: 'wow', 8: 'aa', 9: 'ss', 11: 'haha', 100: 'world', 111: 'boom'}

dict删除

# pop(key[, default]) key存在，移除并返回对应value值；key不存在返回给定的defaule，default未设置，key不存在则抛出异常 In [31]: d.pop(5) # key存在，返回对应的value值 Out[31]: 'zz' In [32]: d.pop(22, 0) # key不存在，返回给定的值 Out[32]: 0 In [34]: d.pop(33) # key不存在且default为设置，则抛出异常 --------------------------------------------------------------------------- KeyError Traceback (most recent call last) <ipython-input-34-f427c8d8b6e4> in <module> ----> 1 d.pop(33) KeyError: 33 # popitem() 移除并返回任意一个键值对，字典为empty，抛出KeyError异常 In [40]: d2 Out[40]: {111: 'boom', 6: 'mm', 7: 'wow'} In [41]: d2.popitem() Out[41]: (7, 'wow') In [42]: d2.popitem() Out[42]: (6, 'mm') In [43]: d2.popitem() Out[43]: (111, 'boom') In [45]: d2 Out[45]: {} In [44]: d2.popitem() # 字典为空后，抛出异常 --------------------------------------------------------------------------- KeyError Traceback (most recent call last) <ipython-input-44-2d6551896b5f> in <module> ----> 1 d2.popitem() KeyError: 'popitem(): dictionary is empty' # clear() 清空字典 In [46]: d Out[46]: {6: 'mm', 7: 'wow', 8: 'aa', 9: 'ss', 11: 'haha'} In [47]: d.clear() In [48]: d Out[48]: {}

dict遍历

# 遍历values # 1 ----------------- for k in d: print(d[k]) # 2 ----------------- for k in d.keys(): print(d.get(k)) # 3 ----------------- for v in d.values(): print(v) # 4 ----------------- for _,v in d.items(): print(v) # 遍历item，即kv对 # 1 ----------------- for item in d.items(): print(item) # 2 ----------------- for k,v in d.items(): print(k, v)