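Master Python's Built-in Data Structures in One Article
Lists
Python has no traditional fixed-size array as a core data structure. Instead, it has the list, a dynamic array that can store values of mixed data types. Compared with a traditional array, this has advantages: you don't have to specify a size up front, and a single list can hold values of different types.
When a list is created, a block of memory is allocated internally to hold its elements (in CPython, references to the element objects). As more elements are added, the list dynamically reallocates memory to fit its growing size. When the list outgrows the allocated memory, Python allocates a larger block, copies all existing elements to it, and then frees the old memory.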
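The growth behavior described above can be observed with a minimal sketch. It assumes CPython, where sys.getsizeof reports the list's own allocation; the exact sizes and growth pattern are implementation details and may differ between versions. The names items and resize_points are illustrative only.

```python
import sys

items = []
last_size = sys.getsizeof(items)
resize_points = []  # list lengths at which the reported size changed

for i in range(64):
    items.append(i)
    size = sys.getsizeof(items)
    if size != last_size:
        # The list was given a bigger block to make room for more elements.
        resize_points.append(len(items))
        last_size = size

print(resize_points)
```

If the list over-allocates, the reported size changes far fewer than 64 times, which is why appending stays cheap on average.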
Let's look at the common operations that can be performed on lists:
fruits: list[str] = ["", "", "", ""]
# Accessing elements in a list
print(fruits[0]) #
print(fruits[1]) #
print(fruits[-1]) #
print(fruits[-2]) #
# Slicing a list (inclusive:exclusive)
print(fruits[1:3]) # ['', '']
print(fruits[1:]) # ['', '', '']
print(fruits[:2]) # ['', '']
print(fruits[:]) # ['', '', '', '']
# Modifying elements in a list
fruits[0] = ""
print(fruits) # ['', '', '', '']
# Removing elements from a list by index
del fruits[1]
print(fruits) # ['', '', '']
# Getting the length of a list
print(len(fruits)) # 3
# Joining lists
vegetables = ["", "", ""]
fruits_and_vegetables = fruits + vegetables
print(fruits_and_vegetables) # ['', '', '', '', '', '']
# Repeating a list
repeated_fruits = fruits * 2 # ['', '', '', '', '', '']
Note that referencing an index that does not exist raises an IndexError.
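A short sketch of that behavior, using a hypothetical letters list. It also shows a related detail that is easy to miss: slices do not raise for out-of-range bounds.

```python
letters = ["a", "b", "c"]

try:
    letters[5]
except IndexError as exc:
    message = str(exc)
    print(f"IndexError: {message}")  # IndexError: list index out of range

# Slice bounds are clamped to the list's length instead of raising.
print(letters[1:10])  # ['b', 'c']
print(letters[5:])    # []
```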
Some other useful examples of building lists with the range function:
# Creating a list with a range of numbers
numbers = list(range(10))
print(numbers) # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
# Creating a list with a range of numbers with a step
numbers = list(range(0, 10, 2))
print(numbers) # [0, 2, 4, 6, 8]
# Creating a list with a range of numbers in reverse
numbers = list(range(10, 0, -1))
print(numbers) # [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
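Copying and References
Lists are shared by reference: assigning a list to another name makes both identifiers refer to the same object in memory. Understanding this is important for avoiding unintended bugs. Strings, integers, floats, and booleans are often described as "simple" types that are shared by value, but Python has no such thing as a primitive type: these are objects too, and names refer to them in the same way. They merely behave as if shared by value because they are immutable, so they can never be changed through another name.
Sharing references is beneficial because no new memory is allocated for the value each time it is passed or assigned.
In the example below, the is operator shows that users and users_2 refer to the value at the same memory location.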
users: list[str] = ["Alice", "Bob", "Charlie"]
users_2 = users
users_3 = ["Alice", "Bob", "Charlie"]
print(users == users_2) # True
print(users is users_2) # True, same memory location
print(users == users_3) # True
print(users is users_3) # False
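To make the consequence of sharing concrete, here is a small sketch (the names alias, count, and other are illustrative). Mutating a list through one name is visible through the other, while "changing" an integer only rebinds a name to a new object.

```python
users = ["Alice", "Bob"]
alias = users            # no copy: both names refer to one list object
alias.append("Charlie")
print(users)             # ['Alice', 'Bob', 'Charlie']

count = 10
other = count            # both names refer to the same int object
other += 1               # ints are immutable, so this rebinds `other`
print(count, other)      # 10 11
```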
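Lists are mutable, that is, they can be modified. A very common pitfall is using an empty list as the default value of a function parameter. The problem is that default argument values in Python are evaluated once, when the function is defined, not each time the function is called. So if you use a mutable default (a list in this case), every call that relies on the default refers to the same list in memory.
def append_to_list(value, my_list=[]):
    my_list.append(value)
    return my_list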
result1 = append_to_list(1)
print(result1) # Output: [1]
result2 = append_to_list(2)
print(result2) # Output: [1, 2] (unexpected!)
result3 = append_to_list(3)
print(result3) # Output: [1, 2, 3] (unexpected!)
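The solution is to use the immutable value None as the default (the preferred approach) and create the list inside the function. In this case the parameter becomes my_list=None.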
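A minimal sketch of that fix, reusing the append_to_list name from the example above:

```python
def append_to_list(value, my_list=None):
    # Create a fresh list on every call where the caller did not pass one.
    if my_list is None:
        my_list = []
    my_list.append(value)
    return my_list


print(append_to_list(1))       # [1]
print(append_to_list(2))       # [2]
print(append_to_list(3, [9]))  # [9, 3]
```

Each call without an explicit list now starts from an empty list, so results no longer leak between calls.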
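If you want to copy a list rather than just share a reference, you can use the list's copy method or the copy module (for finer control over copying). The copy method performs only a shallow copy. For a deep copy, you need the deepcopy function.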
from copy import deepcopy, copy
users: list[dict] = [
    {"name": "John Doe", "age": 12},
    {"name": "Jane Doe", "age": 20},
    {"name": "Tom Smith", "age": 30},
]
users_2 = copy(users) # shallow
users_2_1 = users.copy() # shallow
users_3 = deepcopy(users) # deep
print(users_2 is users) # False
print(users_3 is users) # False
print(users_2[0] is users[0]) # True
print(users_3[0] is users[0]) # False
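A shallow copy creates a new outer object but does not copy the objects nested inside the original. Instead, it copies only the references to those nested objects. As a result, changes made to a nested object through the copy also affect the original, and vice versa.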
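The difference shows up as soon as a nested object is mutated. The following sketch uses a one-element users list and the illustrative names shallow and deep:

```python
from copy import copy, deepcopy

users = [{"name": "John Doe", "age": 12}]
shallow = copy(users)
deep = deepcopy(users)

shallow[0]["age"] = 13        # mutates the dict shared with `users`
print(users[0]["age"])        # 13
print(deep[0]["age"])         # 12, the deep copy has its own dict

shallow.append({"name": "Jane Doe", "age": 20})
print(len(users))             # 1, the outer lists are independent
```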
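The * operator offers another way to copy: unpacking a list into a new list literal builds a new list (again a shallow copy). It can also spread a list into function arguments.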
list1 = [1, 2, 3]
list2 = [4, 5, 6]
# Spreading elements from both lists into a new list
combined_list = [*list1, *list2]
print(combined_list) # Output: [1, 2, 3, 4, 5, 6]
def add_numbers(a, b, c):
    return a + b + c
numbers = [1, 2, 3]
result = add_numbers(*numbers)
print(result) # Output: 6
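As a brief sketch of the copying use specifically (variable names are illustrative), unpacking into a new list literal gives the same result as slicing or calling list(). All three are shallow copies.

```python
original = [1, 2, 3]

star_copy = [*original]    # unpack into a new list
slice_copy = original[:]   # full slice
list_copy = list(original)

star_copy.append(4)
print(original)               # [1, 2, 3]
print(star_copy)              # [1, 2, 3, 4]
print(star_copy is original)  # False
print(slice_copy == list_copy == original)  # True
```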
Common Methods
The following snippet shows examples of commonly used methods:
nums: list[int] = [1, 1, 6, 2, 5, 8, 9, 3]
print(nums)
print(nums.count(1)) # 2
print(nums.index(6)) # 2
nums.append(10)
print(nums) # [1, 1, 6, 2, 5, 8, 9, 3, 10]
nums.insert(2, 7)
print(nums) # [1, 1, 7, 6, 2, 5, 8, 9, 3, 10]
nums.remove(5)
print(nums) # [1, 1, 7, 6, 2, 8, 9, 3, 10]
nums.pop()
print(nums) # [1, 1, 7, 6, 2, 8, 9, 3]
nums.sort()
print(nums) # [1, 1, 2, 3, 6, 7, 8, 9]
nums.reverse()
print(nums) # [9, 8, 7, 6, 3, 2, 1, 1]
nums.extend([1, 2, 3, 4, 5])
print(nums) # [9, 8, 7, 6, 3, 2, 1, 1, 1, 2, 3, 4, 5]
nums.clear()
print(nums) # []
nums = [1, 1, 6, 2, 5, 8, 9, 3]
print("-".join([str(num) for num in nums])) # join needs an iterable of strings
# Output: 1-1-6-2-5-8-9-3
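The pop() method removes an item by index (the last item if no index is given), while remove() removes an item by value (only its first occurrence). If the index or value does not exist, you get an IndexError or a ValueError, respectively.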
nums: list[int] = []
print(nums.pop()) # IndexError: pop from empty list
nums = [1, 2, 3]
print(nums.pop(1)) # 2
print(nums) # [1, 3]
print(nums.remove(3)) # None
print(nums) # [1]
print(nums.remove(3)) # ValueError: list.remove(x): x not in list
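The sort() and reverse() methods operate in place, that is, they modify the original list rather than returning a new one. If we want to leave the original list unchanged and still perform these operations, we have to use the sorted() and reversed() functions.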
nums: list[int] = [3, 1, 7, 4, 5, 10, 6]
# In-place
nums.sort()
print(nums) # Output: [1, 3, 4, 5, 6, 7, 10]
nums.sort(reverse=True)
print(nums) # Output: [10, 7, 6, 5, 4, 3, 1]
nums.reverse()
print(nums) # Output: [1, 3, 4, 5, 6, 7, 10]
# New list
sorted_nums = sorted(nums)
print(sorted_nums) # Output: [1, 3, 4, 5, 6, 7, 10]
sorted_nums_desc = sorted(nums, reverse=True)
print(sorted_nums_desc) # Output: [10, 7, 6, 5, 4, 3, 1]
reversed_nums = list(reversed(nums))
print(reversed_nums) # Output: [10, 7, 6, 5, 4, 3, 1]
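A related pitfall, sketched below with a small illustrative list: because sort() works in place it returns None, so assigning its result loses the list. Likewise, reversed() returns a lazy iterator rather than a list, which is why the example above wraps it in list().

```python
nums = [3, 1, 2]

result = nums.sort()   # sorts in place and returns None
print(result)          # None
print(nums)            # [1, 2, 3]

rev = reversed(nums)   # an iterator, not a list
print(list(rev))       # [3, 2, 1]
print(nums)            # [1, 2, 3], the original is unchanged
```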
Use the in keyword to check whether a value exists in a list.
nums: list[int] = [3, 1, 7, 4, 5, 10, 6]
print(3 in nums) # True
print(9 in nums) # False
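Looping
You can iterate over a list with a for..in loop. You can also use the enumerate() function to iterate over a list and get the index of the item in the current iteration.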
tools: list[str] = ["", "", "", "", "", ""]
for tool in tools:
    print(tool)
# Output:
#
#
#
#
#
#
for index, tool in enumerate(tools):
    print(f"Tool {index + 1}: {tool}")
# Output:
# Tool 1:
# Tool 2:
# Tool 3:
# Tool 4:
# Tool 5:
# Tool 6:
The enumerate() function takes a second parameter, start, that specifies the starting index (0 by default).
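A short sketch of the start parameter, using placeholder tool names that are not from the original example. It removes the need for the manual index + 1.

```python
tools = ["hammer", "wrench", "saw"]  # placeholder values for illustration

for number, tool in enumerate(tools, start=1):
    print(f"Tool {number}: {tool}")
# Output:
# Tool 1: hammer
# Tool 2: wrench
# Tool 3: saw

numbered = list(enumerate(tools, start=1))
print(numbered)  # [(1, 'hammer'), (2, 'wrench'), (3, 'saw')]
```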
Unpacking
We can use the * operator to unpack the elements of a list. This is a very handy feature that saves a lot of time. Here are some examples:
superheros: list[str] = ["batman", "superman", "spiderman"]
supervillans: list[str] = ["joker", "lex luthor", "venom"]
hero1, hero2, hero3 = superheros
print(hero1, hero2, hero3) # Output: batman superman spiderman
villan1, *villans = supervillans
print(villan1, villans) # Output: joker ['lex luthor', 'venom']
comic_characters = [*superheros, *supervillans]
print(comic_characters)
# Output: ['batman', 'superman', 'spiderman', 'joker', 'lex luthor', 'venom']
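Tuples
A tuple is an ordered collection of elements. It is similar to a list, with some key differences. It is an immutable sequence, which means that once it is created you cannot modify its elements, nor add elements to or remove elements from it.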
# Creating an empty tuple
empty_tuple: tuple[()] = ()
# Creating a tuple with elements
num_and_str_tuple = (1, 2, 3, "a", "b", "c")
# Creating a tuple with a single element (note the comma)
single_element_tuple = (42,)
# Accessing elements of a tuple
first_element = num_and_str_tuple[0] # 1
last_element = num_and_str_tuple[-1] # "c"
# Slicing a tuple
first_three_elements = num_and_str_tuple[:3] # (1, 2, 3)
last_three_elements = num_and_str_tuple[-3:] # ("a", "b", "c")
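To illustrate the immutability itself, here is a minimal sketch with a hypothetical point tuple. Item assignment raises a TypeError, and "changing" a tuple really means building a new one.

```python
point = (1, 2, 3)

try:
    point[0] = 10
except TypeError as exc:
    error = str(exc)
    print(error)  # 'tuple' object does not support item assignment

# Build a new tuple from pieces of the old one instead.
moved = (10,) + point[1:]
print(moved)  # (10, 2, 3)
print(point)  # (1, 2, 3)
```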
Here are examples of unpacking values from tuples:
num_and_str_tuple = (1, 2, 3, "a", "b", "c")
# Unpacking a tuple
first, second, third, *rest = num_and_str_tuple
print(first, second, third, rest) # Output: 1 2 3 ['a', 'b', 'c']
# Accessing a single element from the tuple
single_element = num_and_str_tuple[0]
print(single_element) # Output: 1
# Unpacking a tuple with a single element using parentheses
# Won't work with multiple elements (for them use star operator)
(single_element,) = (1,)
print(single_element) # Output: 1
# Taking the first element and discarding the rest with the star operator
single_element, *_ = num_and_str_tuple
print(single_element) # Output: 1
# Taking the last element and discarding the rest with the star operator
*_, last_element = num_and_str_tuple
print(last_element) # Output: c
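Because tuples are immutable, they are hashable (as long as every element they contain is hashable) and can be used as dictionary keys or set elements. Their immutability also lets Python optimize storage and access times.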
# Creating a dictionary with tuples as keys
employee_info = {
    ("John", "Doe"): 50000,
    ("Alice", "Smith"): 60000,
    ("Bob", "Johnson"): 55000,
}
# Accessing values using tuple keys
print(employee_info[("John", "Doe")]) # Output: 50000
print(employee_info[("Alice", "Smith")]) # Output: 60000
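One caveat is worth a quick sketch: a tuple is only hashable when everything inside it is hashable. The names ok_key and bad_key are illustrative, and the exact error text may vary between Python versions.

```python
ok_key = ("John", "Doe")
print(hash(ok_key) == hash(("John", "Doe")))  # True, equal tuples hash equally

bad_key = ("John", ["Doe"])  # contains a mutable list
try:
    hash(bad_key)
except TypeError as exc:
    error = str(exc)
    print(error)  # a message saying the list is unhashable
```

Such a tuple cannot be used as a dictionary key or set element.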
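Sets
A set is an unordered collection of unique elements. It is implemented with a hash table, so operations such as membership testing and adding elements have average O(1) complexity: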
# Creating and modifying a set
numbers = {1, 2, 3}
numbers.add(4)
print(numbers) # Output: {1, 2, 3, 4}
numbers.discard(2)
print(numbers) # Output: {1, 3, 4}
# Set operations
odds = {1, 3, 5}
evens = {2, 4, 6}
print(odds.union(evens)) # Output: {1, 2, 3, 4, 5, 6}
print(odds.intersection(numbers)) # Output: {1, 3}
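The method calls above also have operator forms, and sets are a common way to deduplicate a list. The following is a small sketch with illustrative data. The display order of a set is not guaranteed, so the printed order may differ.

```python
odds = {1, 3, 5}
small = {1, 2, 3}

print(odds | small)  # union: {1, 2, 3, 5}
print(odds & small)  # intersection: {1, 3}
print(odds - small)  # difference: {5}
print(odds ^ small)  # symmetric difference: {2, 5}

readings = [3, 1, 3, 2, 1]
unique_sorted = sorted(set(readings))  # drop duplicates, then sort
print(unique_sorted)  # [1, 2, 3]
```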
There is also frozenset, an immutable version of set.
# Creating a frozenset
immutable_set = frozenset([1, 2, 2, 3])
print(immutable_set) # Output: frozenset({1, 2, 3})
# Frozensets support set operations
another_set = frozenset([3, 4, 5])
print(immutable_set.union(another_set)) # Output: frozenset({1, 2, 3, 4, 5})
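Because a frozenset is immutable, it is also hashable, unlike a plain set. Here is a minimal sketch, with the hypothetical names admins and permissions, using one as a dictionary key:

```python
admins = frozenset({"alice", "bob"})
permissions = {admins: "full access"}  # works: frozenset is hashable

# Element order does not matter when looking the key up.
print(permissions[frozenset({"bob", "alice"})])  # full access

print(hasattr(admins, "add"))  # False, there are no mutating methods

try:
    hash({"alice", "bob"})
except TypeError:
    print("a plain set is not hashable")
```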
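Dictionaries
A dictionary in Python is a mutable collection of key-value pairs. It was historically unordered, but since Python 3.7 insertion order is guaranteed to be preserved. Internally, it uses a hash table for efficient lookups. Keys must be unique and hashable.
The hash table maps keys to values, and hashing ensures fast lookups (average O(1) complexity). Collisions (when two keys hash to the same slot) are handled in hash tables generally by open addressing or chaining; CPython's dict uses open addressing.
Here is the basic usage of dictionaries in day-to-day work: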
# Creating a dictionary
person = {"name": "Alice", "age": 25}
# Accessing values
print(person["name"]) # Output: Alice
# Updating values
person["age"] = 26
print(person) # Output: {'name': 'Alice', 'age': 26}
# Using common methods
print(person.keys()) # Output: dict_keys(['name', 'age'])
print(person.values()) # Output: dict_values(['Alice', 26])
print(person.items()) # Output: dict_items([('name', 'Alice'), ('age', 26)])
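A few more everyday operations, sketched under the same person example. The email key and its value are hypothetical additions for illustration.

```python
person = {"name": "Alice", "age": 26}

# Indexing a missing key raises KeyError; get() returns a default instead.
print(person.get("email"))             # None
print(person.get("email", "unknown"))  # unknown
print("age" in person)                 # True

try:
    person["email"]
except KeyError as exc:
    print(f"missing key: {exc}")       # missing key: 'email'

# Iterating over key-value pairs, in insertion order.
for key, value in person.items():
    print(key, value)

person["email"] = "alice@example.com"  # hypothetical value
removed = person.pop("age")            # remove a key and return its value
print(removed)                         # 26
print(list(person))                    # ['name', 'email']
```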