Mastering Python's Built-in Data Structures in One Article

liftword · 3 months ago (04-09) · Technical Articles · 20

Lists

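Python has no built-in fixed-size array type. Instead, it has the list, a dynamic array that can hold values of mixed types. Compared with a traditional array, a list does not require you to declare a size up front or to restrict the elements to a single type.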

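When a list is created, a block of memory is allocated internally to hold references to its elements. As more elements are added, the list reallocates memory dynamically to accommodate its growing size. When the list outgrows the allocated memory, Python allocates a larger block, copies the existing element references into the new block, and then frees the old memory. CPython over-allocates on each resize, so appending remains amortized O(1).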
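The growth behaviour described above can be observed, roughly, with sys.getsizeof. This is a minimal sketch that assumes CPython; the exact byte counts and resize points are implementation details that vary between versions, so no expected output is shown.

```python
import sys

items: list[int] = []
sizes: list[int] = []

# Append 32 elements and record the list object's reported size after each append.
for i in range(32):
    items.append(i)
    sizes.append(sys.getsizeof(items))

# Lengths at which the reported size changed, i.e. where the list was resized.
resize_points = [n + 1 for n in range(1, len(sizes)) if sizes[n] != sizes[n - 1]]
print(resize_points)
print(f"{len(resize_points)} size changes across {len(items)} appends")
```

The size changes only at a handful of lengths rather than on every append, which is what over-allocation looks like from the outside.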

Let's look at the common operations you can perform on a list:

fruits: list[str] = ["apple", "banana", "cherry", "mango"]

# Accessing elements in a list
print(fruits[0])  # apple
print(fruits[1])  # banana
print(fruits[-1])  # mango
print(fruits[-2])  # cherry

# Slicing a list (inclusive:exclusive)
print(fruits[1:3])  # ['banana', 'cherry']
print(fruits[1:])  # ['banana', 'cherry', 'mango']
print(fruits[:2])  # ['apple', 'banana']
print(fruits[:])  # ['apple', 'banana', 'cherry', 'mango']

# Modifying elements in a list
fruits[0] = "grape"
print(fruits)  # ['grape', 'banana', 'cherry', 'mango']

# Removing elements from a list by index
del fruits[1]
print(fruits)  # ['grape', 'cherry', 'mango']

# Getting the length of a list
print(len(fruits))  # 3

# Joining lists
vegetables = ["carrot", "potato", "onion"]
fruits_and_vegetables = fruits + vegetables
print(fruits_and_vegetables)  # ['grape', 'cherry', 'mango', 'carrot', 'potato', 'onion']

# Repeating a list
repeated_fruits = fruits * 2  # ['grape', 'cherry', 'mango', 'grape', 'cherry', 'mango']

Note that referencing an index that does not exist raises an IndexError.
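As a small illustration of that behaviour, the sketch below catches the IndexError and contrasts it with slicing, which clamps out-of-range bounds instead of raising. The sample values are placeholders chosen for this example.

```python
fruits = ["apple", "banana", "cherry"]  # hypothetical sample data

# Indexing past the end raises IndexError.
try:
    print(fruits[5])
except IndexError as exc:
    print(f"IndexError: {exc}")  # IndexError: list index out of range

# Slices never raise for out-of-range bounds; the bounds are clamped.
print(fruits[1:10])  # ['banana', 'cherry']
print(fruits[5:])  # []
```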

Some other useful examples that build lists with the range function:

# Creating a list with a range of numbers
numbers = list(range(10))
print(numbers)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# Creating a list with a range of numbers with a step
numbers = list(range(0, 10, 2))
print(numbers)  # [0, 2, 4, 6, 8]

# Creating a list with a range of numbers in reverse
numbers = list(range(10, 0, -1))
print(numbers)  # [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]

Copying and References

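Lists are shared by reference: assigning a list to another name makes both identifiers refer to the same object in memory. Understanding this is important for avoiding unnecessary bugs. Strictly speaking, strings, integers, floats, and booleans are passed around the same way (Python has no "primitive" types), but because they are immutable they cannot be changed in place, so they behave as if they were shared by value.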

Sharing by reference is beneficial because no new memory is allocated every time a value is passed or assigned.

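In the example below, the is operator shows that users and users_2 refer to the same object in memory, while users_3 is an equal but separate list.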

users: list[str] = ["Alice", "Bob", "Charlie"]

users_2 = users
users_3 = ["Alice", "Bob", "Charlie"]

print(users == users_2)  # True
print(users is users_2)  # True, same memory location

print(users == users_3)  # True
print(users is users_3)  # False
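The practical consequence of sharing a reference is that a change made through one name is visible through the other. The following sketch shows this for a list and contrasts it with an integer, where += rebinds the name to a new object. The names count and count_2 are introduced here purely for illustration.

```python
users = ["Alice", "Bob", "Charlie"]
users_2 = users  # a second name for the same list object

# Mutating through one name is visible through the other.
users_2.append("Dave")
print(users)  # ['Alice', 'Bob', 'Charlie', 'Dave']
print(users is users_2)  # True

# Integers are immutable: += creates a new object and rebinds only count_2.
count = 1
count_2 = count
count_2 += 1
print(count, count_2)  # 1 2
```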

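Lists are mutable, i.e. they can be modified. A very common pitfall is using an empty list as the default value of a function parameter. The problem is that default argument values in Python are evaluated once, when the function is defined, not each time the function is called. So if you use a mutable default (a list in this case), every call that relies on the default refers to the same list in memory.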

def append_to_list(value, my_list=[]):
    my_list.append(value)
    return my_list


result1 = append_to_list(1)
print(result1)  # Output: [1]

result2 = append_to_list(2)
print(result2)  # Output: [1, 2]  (unexpected!)

result3 = append_to_list(3)
print(result3)  # Output: [1, 2, 3]  (unexpected!)

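The solution is to use the immutable value None as the default (preferred) and create the list inside the function. In this case, that means my_list=None.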
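A minimal sketch of that fix, reusing the function name from the example above, might look like this. The variable existing is a hypothetical caller-supplied list added to show that explicit arguments still work.

```python
def append_to_list(value, my_list=None):
    # Create a fresh list on each call when the caller does not supply one.
    if my_list is None:
        my_list = []
    my_list.append(value)
    return my_list


result1 = append_to_list(1)
print(result1)  # [1]

result2 = append_to_list(2)
print(result2)  # [2]

# Passing a list explicitly still appends to that list.
existing = [10]
print(append_to_list(3, existing))  # [10, 3]
```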

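If you want to copy a list rather than just reference it, you can use the list's copy method or the copy module (for finer control over copying). The copy method performs only a shallow copy. For a deep copy, you need the deepcopy function.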

from copy import deepcopy, copy

users: list[dict] = [
    {"name": "John Doe", "age": 12},
    {"name": "Jane Doe", "age": 20},
    {"name": "Tom Smith", "age": 30},
]

users_2 = copy(users)  # shallow
users_2_1 = users.copy()  # shallow
users_3 = deepcopy(users)  # deep

print(users_2 is users)  # False
print(users_3 is users)  # False

print(users_2[0] is users[0])  # True
print(users_3[0] is users[0])  # False

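A shallow copy creates a new outer object but does not copy the objects nested inside the original. Instead, it only copies the references to those nested objects. Changes made to a nested object through the copy therefore also affect the original, and vice versa.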
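To make the difference concrete, here is a short sketch that mutates a nested dictionary through a shallow copy and checks what the original and a deep copy see. The names shallow and deep are chosen for this example.

```python
from copy import deepcopy

users = [{"name": "John Doe", "age": 12}]
shallow = users.copy()  # new outer list, same inner dicts
deep = deepcopy(users)  # new outer list and new inner dicts

# Changing a nested dict through the shallow copy also changes the original.
shallow[0]["age"] = 13
print(users[0]["age"])  # 13
print(deep[0]["age"])  # 12

# The outer lists are independent: appending to one does not affect the other.
shallow.append({"name": "Jane Doe", "age": 20})
print(len(users), len(shallow))  # 1 2
```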

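Another way to copy elements is the * (unpacking) operator, which spreads the elements of an iterable into a new list or into the arguments of a function call. Like the copy method, this produces a shallow copy.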

list1 = [1, 2, 3]
list2 = [4, 5, 6]

# Spreading elements from both lists into a new list
combined_list = [*list1, *list2]
print(combined_list)  # Output: [1, 2, 3, 4, 5, 6]


def add_numbers(a, b, c):
    return a + b + c


numbers = [1, 2, 3]

result = add_numbers(*numbers)
print(result)  # Output: 6

Common Methods

The following snippet contains examples of commonly used methods:

nums: list[int] = [1, 1, 6, 2, 5, 8, 9, 3]
print(nums)

print(nums.count(1))  # 2
print(nums.index(6))  # 2

nums.append(10)
print(nums)  # [1, 1, 6, 2, 5, 8, 9, 3, 10]

nums.insert(2, 7)
print(nums)  # [1, 1, 7, 6, 2, 5, 8, 9, 3, 10]

nums.remove(5)
print(nums)  # [1, 1, 7, 6, 2, 8, 9, 3, 10]

nums.pop()
print(nums)  # [1, 1, 7, 6, 2, 8, 9, 3]

nums.sort()
print(nums)  # [1, 1, 2, 3, 6, 7, 8, 9]

nums.reverse()
print(nums)  # [9, 8, 7, 6, 3, 2, 1, 1]

nums.extend([1, 2, 3, 4, 5])
print(nums)  # [9, 8, 7, 6, 3, 2, 1, 1, 1, 2, 3, 4, 5]

nums.clear()
print(nums)  # []

nums = [1, 1, 6, 2, 5, 8, 9, 3]
print("-".join([str(num) for num in nums]))  # join needs an iterable of strings
# Output: 1-1-6-2-5-8-9-3

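The pop() method removes an item by index (the last item by default) and returns it, while the remove() method removes an item by value (only the first match). If the index or the value does not exist, you get an IndexError or a ValueError respectively.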

nums: list[int] = []
try:
    nums.pop()
except IndexError as exc:
    print(exc)  # pop from empty list

nums = [1, 2, 3]
print(nums.pop(1))  # 2
print(nums)  # [1, 3]

print(nums.remove(3))  # None
print(nums)  # [1]
try:
    nums.remove(3)
except ValueError as exc:
    print(exc)  # ValueError message, e.g. list.remove(x): x not in list

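The sort() and reverse() methods operate in place, i.e. they modify the original list instead of returning a new, modified list. If we want to leave the original list unchanged and still perform these operations, we must use the sorted() and reversed() functions.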

nums: list[int] = [3, 1, 7, 4, 5, 10, 6]

# In-place

nums.sort()
print(nums)  # Output: [1, 3, 4, 5, 6, 7, 10]

nums.sort(reverse=True)
print(nums)  # Output: [10, 7, 6, 5, 4, 3, 1]

nums.reverse()
print(nums)  # Output: [1, 3, 4, 5, 6, 7, 10]

# New list

sorted_nums = sorted(nums)
print(sorted_nums)  # Output: [1, 3, 4, 5, 6, 7, 10]

sorted_nums_desc = sorted(nums, reverse=True)
print(sorted_nums_desc)  # Output: [10, 7, 6, 5, 4, 3, 1]

reversed_nums = list(reversed(nums))
print(reversed_nums)  # Output: [10, 7, 6, 5, 4, 3, 1]

Use the in keyword to check whether a value exists in a list.

nums: list[int] = [3, 1, 7, 4, 5, 10, 6]
print(3 in nums)  # True
print(9 in nums)  # False

Looping

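You can iterate over a list with a for...in loop. You can also use the enumerate() function to iterate over a list while getting the index of the current item.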

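tools: list[str] = ["hammer", "screwdriver", "wrench", "saw", "drill", "pliers"]

for tool in tools:
    print(tool)

# Output:
# hammer
# screwdriver
# wrench
# saw
# drill
# pliers

for index, tool in enumerate(tools):
    print(f"Tool {index + 1}: {tool}")

# Output:
# Tool 1: hammer
# Tool 2: screwdriver
# Tool 3: wrench
# Tool 4: saw
# Tool 5: drill
# Tool 6: pliers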

The enumerate() function takes a second argument, start, that specifies the starting index (0 by default).
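With the start argument, the manual index + 1 from the loop above is no longer needed. A brief sketch, using placeholder tool names chosen for this example:

```python
tools = ["hammer", "screwdriver", "wrench"]  # hypothetical sample data

# start=1 makes enumerate count from 1 instead of 0.
lines = [f"Tool {number}: {tool}" for number, tool in enumerate(tools, start=1)]

for line in lines:
    print(line)

# Output:
# Tool 1: hammer
# Tool 2: screwdriver
# Tool 3: wrench
```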

Unpacking

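We can unpack the elements of a list into individual variables, and use the * operator to collect or spread the remaining elements. This is a very handy feature that saves a lot of time. Here are some examples: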

superheros: list[str] = ["batman", "superman", "spiderman"]
supervillans: list[str] = ["joker", "lex luthor", "venom"]

hero1, hero2, hero3 = superheros
print(hero1, hero2, hero3)  # Output: batman superman spiderman

villan1, *villans = supervillans
print(villan1, villans)  # Output: joker ['lex luthor', 'venom']

comic_characters = [*superheros, *supervillans]
print(comic_characters)
# Output: ['batman', 'superman', 'spiderman', 'joker', 'lex luthor', 'venom']

Tuples

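A tuple is an ordered collection of elements. It is similar to a list, with some key differences: it is an immutable sequence, which means that once it is created, its elements cannot be modified, and elements cannot be added to or removed from it.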

# Creating an empty tuple
empty_tuple: tuple[()] = ()

# Creating a tuple with elements
num_and_str_tuple = (1, 2, 3, "a", "b", "c")

# Creating a tuple with a single element (note the comma)
single_element_tuple = (42,)

# Accessing elements of a tuple
first_element = num_and_str_tuple[0]  # 1
last_element = num_and_str_tuple[-1]  # "c"

# Slicing a tuple
first_three_elements = num_and_str_tuple[:3]  # (1, 2, 3)
last_three_elements = num_and_str_tuple[-3:]  # ("a", "b", "c")
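Immutability can be checked directly: assigning to an index raises a TypeError, and any "change" has to build a new tuple. The following is a small sketch; the names point and moved are chosen for this example.

```python
point = (1, 2, 3)

# Item assignment is not supported on tuples.
try:
    point[0] = 10
except TypeError as exc:
    print(f"TypeError: {exc}")  # TypeError: 'tuple' object does not support item assignment

# To "modify" a tuple, build a new one from pieces of the old one.
moved = (10,) + point[1:]
print(moved)  # (10, 2, 3)
print(point)  # (1, 2, 3)
```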

Here are examples of unpacking values from tuples:

num_and_str_tuple = (1, 2, 3, "a", "b", "c")

# Unpacking a tuple
first, second, third, *rest = num_and_str_tuple
print(first, second, third, rest)  # Output: 1 2 3 ['a', 'b', 'c']

# Accessing a single element from the tuple
single_element = num_and_str_tuple[0]
print(single_element)  # Output: 1

# Unpacking a tuple with a single element using parentheses
# Won't work with multiple elements (for them use star operator)
(single_element,) = (1,)
print(single_element)  # Output: 1

# Taking only the first element of a tuple using the star operator
single_element, *_ = num_and_str_tuple
print(single_element)  # Output: 1

# Taking only the last element of a tuple using the star operator
*_, last_element = num_and_str_tuple
print(last_element)  # Output: c

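Because tuples are immutable, they are hashable (provided every element they contain is hashable too) and can be used as dictionary keys or as set elements. Their immutability also lets Python optimize storage and access time.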

# Creating a dictionary with tuples as keys
employee_info = {
    ("John", "Doe"): 50000,
    ("Alice", "Smith"): 60000,
    ("Bob", "Johnson"): 55000,
}

# Accessing values using tuple keys
print(employee_info[("John", "Doe")])  # Output: 50000
print(employee_info[("Alice", "Smith")])  # Output: 60000

Sets

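A set is an unordered collection of unique elements, implemented with a hash table. As a result, operations such as membership tests and adding elements have an average complexity of O(1):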

# Creating and modifying a set
numbers = {1, 2, 3}
numbers.add(4)
print(numbers)  # Output: {1, 2, 3, 4}

numbers.discard(2)
print(numbers)  # Output: {1, 3, 4}

# Set operations
odds = {1, 3, 5}
evens = {2, 4, 6}

print(odds.union(evens))  # Output: {1, 2, 3, 4, 5, 6}
print(odds.intersection(numbers))  # Output: {1, 3}

There is also frozenset, an immutable version of set.

# Creating a frozenset
immutable_set = frozenset([1, 2, 2, 3])
print(immutable_set)  # Output: frozenset({1, 2, 3})

# Frozensets support set operations
another_set = frozenset([3, 4, 5])
print(immutable_set.union(another_set))  # Output: frozenset({1, 2, 3, 4, 5})
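Because a frozenset is immutable, it is hashable, so it can serve as a dictionary key or as an element of another set, which a regular set cannot. A short sketch with hypothetical names (admins, permissions, nested):

```python
admins = frozenset(["alice", "bob"])

# A frozenset can be a dictionary key; element order does not matter for equality.
permissions = {admins: "full access"}
print(permissions[frozenset(["bob", "alice"])])  # full access

# There are no mutating methods such as add().
print(hasattr(admins, "add"))  # False

# A regular set cannot be nested inside a set, but a frozenset can.
try:
    nested = {{1, 2}}
except TypeError:
    nested = {frozenset({1, 2})}
print(nested)  # {frozenset({1, 2})}
```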

Dictionaries

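A Python dictionary is a mutable collection of key-value pairs. Dictionaries were unordered before Python 3.7; from 3.7 onward, insertion order is preserved. Internally, a dictionary uses a hash table for efficient lookups. Keys must be unique and hashable.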

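A dictionary uses a hash table to map keys to values. Hashing makes lookups fast (O(1) on average). Collisions (when two keys hash to the same slot) are generally handled in hash tables by open addressing or chaining; CPython's dict uses open addressing.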
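The requirement that keys be unique and hashable can be seen in a few lines. This is an illustrative sketch with a hypothetical lookup dictionary; the exact wording of the TypeError differs between Python versions, so it is not shown.

```python
lookup = {}

# A tuple is hashable, so it works as a key.
lookup[("x", 1)] = "tuple key"

# A list is not hashable, so using it as a key raises TypeError.
try:
    lookup[["x", 1]] = "list key"
except TypeError as exc:
    print(f"TypeError: {exc}")

# Keys that compare equal and hash equally are the same key: 1 and 1.0 collide.
lookup[1] = "int"
lookup[1.0] = "float"
print(lookup[1])  # float
print(len(lookup))  # 2
```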

Here is the basic usage of dictionaries in day-to-day work:

# Creating a dictionary
person = {"name": "Alice", "age": 25}

# Accessing values
print(person["name"])  # Output: Alice

# Updating values
person["age"] = 26
print(person)  # Output: {'name': 'Alice', 'age': 26}

# Using common methods
print(person.keys())  # Output: dict_keys(['name', 'age'])
print(person.values())  # Output: dict_values(['Alice', 26])
print(person.items())  # Output: dict_items([('name', 'Alice'), ('age', 26)])
