Python Regular Expressions: 10 Practical Examples

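Regular expressions, often called "regex", are a powerful tool for processing text. In Python, the re module handles pattern matching, searching, substitution, and similar operations. This article walks through 10 examples, from basic to more advanced, followed by a hands-on case, so you can pick up practical regex techniques.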

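Example 1: Basic matching
Goal: find all the words in a string.
import re

text = "Hello, world! Welcome to Python programming."
words = re.findall(r'\b\w+\b', text)
print(words) # Output: ['Hello', 'world', 'Welcome', 'to', 'Python', 'programming']

Explanation: \b marks a word boundary, and \w+ matches one or more word characters (letters, digits, or underscores).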
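To make the meaning of \w concrete, here is a small sketch. The sample string is made up for illustration; the re.ASCII flag is part of the standard re module and restricts \w to ASCII letters, digits, and the underscore.

```python
import re

# Hypothetical sample containing an underscore, a digit, and a non-ASCII letter
sample = "snake_case x2 café"

# By default, \w matches Unicode word characters in str patterns
tokens = re.findall(r'\b\w+\b', sample)
print(tokens)  # ['snake_case', 'x2', 'café']

# With re.ASCII, \w only matches [a-zA-Z0-9_], so the accented letter is excluded
ascii_tokens = re.findall(r'\b\w+\b', sample, re.ASCII)
print(ascii_tokens)  # ['snake_case', 'x2', 'caf']
```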
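Example 2: Extracting numbers
Goal: extract phone numbers (assuming the format XXX-XXX-XXXX).
phone_numbers = "My number is 123-456-7890."
matches = re.findall(r'\d{3}-\d{3}-\d{4}', phone_numbers)
print(matches) # Output: ['123-456-7890']

Tip: \d matches a digit, and {n} specifies the exact number of repetitions.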
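Example 3: Matching email addresses
Goal: find all email addresses in a piece of text.
text_email = "Contact us at info@example.com or support@example.co.uk."
emails = re.findall(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b', text_email)
print(emails) # Output: ['info@example.com', 'support@example.co.uk']

Note: the email pattern is fairly involved, but it matches most common address formats. Inside a character class, | is a literal character, so the top-level domain is written as [A-Za-z] rather than [A-Z|a-z].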
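Example 4: Substitution
Goal: replace every "Python" with "Python programming".
text_replace = "Python is fun. I love Python."
updated_text = re.sub(r'Python', 'Python programming', text_replace)
print(updated_text) # Output: Python programming is fun. I love Python programming.

What it does: re.sub() replaces every match of the pattern with the given replacement.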
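Two refinements of re.sub() are often useful in practice. The sketch below uses a made-up sentence; the \b anchors and the count parameter are standard features of the re module.

```python
import re

# Hypothetical text in which "Python" also appears inside a longer word
text = "Python is fun. Pythonic code is idiomatic Python."

# \b keeps the pattern from matching inside longer words such as "Pythonic"
whole_word = re.sub(r'\bPython\b', 'Python programming', text)
print(whole_word)
# Python programming is fun. Pythonic code is idiomatic Python programming.

# count limits how many replacements are made (here: only the first match)
first_only = re.sub(r'\bPython\b', 'Python programming', text, count=1)
print(first_only)
# Python programming is fun. Pythonic code is idiomatic Python.
```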
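Example 5: Greedy vs. non-greedy matching
Goal: extract the text between HTML tags using non-greedy matching.
html_text = "<p>Hello, world!</p><div>Welcome!</div>"
content = re.findall(r'<[^>]*>(.*?)</[^>]*>', html_text, re.DOTALL)
print(content) # Output: ['Hello, world!', 'Welcome!']

Key point: the ? after * makes the match non-greedy, and re.DOTALL lets . match any character, including newlines.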
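The difference is easiest to see side by side. This sketch runs the same pattern on the same HTML snippet twice, once with a greedy .* and once with a non-greedy .*?:

```python
import re

html_text = "<p>Hello, world!</p><div>Welcome!</div>"

# Greedy: .* grabs as much as possible, so the match runs to the last closing tag
greedy = re.findall(r'<[^>]*>(.*)</[^>]*>', html_text)
print(greedy)  # ['Hello, world!</p><div>Welcome!']

# Non-greedy: .*? stops at the first closing tag that lets the pattern succeed
lazy = re.findall(r'<[^>]*>(.*?)</[^>]*>', html_text)
print(lazy)  # ['Hello, world!', 'Welcome!']
```

For real-world HTML, a dedicated parser is usually a safer choice than a regular expression; the pattern here is only meant to illustrate greediness.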
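Example 6: Groups and references
Goal: extract the protocol and host parts of a URL.
url = "https://www.example.com/path"
protocol, host = re.search(r'^(https?://)([^/]+)', url).groups()
print(protocol, host) # Output: https:// www.example.com

How it works: parentheses define capture groups, and groups() returns their contents in order. Inside a pattern or a replacement string, \1 and \2 refer back to the first and second groups. Note that re.search() returns None when nothing matches, so check the result before calling groups() on untrusted input.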
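As a sketch of how groups can be referenced, the code below shows named groups and a backreference. The group names protocol and host and the sample sentence are choices made for this illustration, not part of the original example.

```python
import re

url = "https://www.example.com/path"

# Named groups (?P<name>...) make the extracted parts self-describing
match = re.search(r'^(?P<protocol>https?://)(?P<host>[^/]+)', url)
if match:  # re.search returns None when nothing matches
    protocol = match.group('protocol')
    host = match.group('host')
    print(protocol, host)  # https:// www.example.com

# \1 inside the pattern must repeat exactly what group 1 captured,
# and \1 in the replacement inserts that captured text once
deduplicated = re.sub(r'\b(\w+) \1\b', r'\1', "This is is a test test.")
print(deduplicated)  # This is a test.
```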
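Example 7: Repeated patterns
Goal: find digits that are repeated consecutively.
sequence = "112233344455556"
consecutive_digits = re.findall(r'(\d)\1+', sequence)
print(consecutive_digits) # Output: ['1', '2', '3', '4', '5']

Tip: \1+ matches one or more repetitions of whatever the first group captured. Because the pattern contains a group, findall() returns the captured digit rather than the whole run. A string with no repeated digits, such as "123456789012345", yields an empty list.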
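If the whole run is wanted rather than just the repeated digit, re.finditer() gives access to the full match. A minimal sketch, assuming an input string that contains runs of repeated digits:

```python
import re

# Assumed input: contains runs of the same digit, plus a single trailing 6
sequence = "112233344455556"

# m.group(0) is the whole run; m.group(1) is the digit being repeated
runs = [m.group(0) for m in re.finditer(r'(\d)\1+', sequence)]
print(runs)  # ['11', '22', '333', '444', '5555']
```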
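Example 8: Negative lookahead
Goal: find words that do not start with a digit.
text = "3 apples, no bananas, 10 oranges."
words = re.findall(r'\b(?!\d)\w+\b', text)
print(words) # Output: ['apples', 'no', 'bananas', 'oranges']

Explanation: (?!...) is a negative lookahead; it succeeds only if the text that follows does not match the given pattern. Punctuation is not part of the results because \w does not match commas or periods.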
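Comparing the pattern with and without the lookahead shows exactly what it filters out. The sample sentence below is a slightly extended, made-up variant that also includes a token starting with a digit ("2nd"):

```python
import re

text = "3 apples, no bananas, 10 oranges, 2nd place."

# Without the lookahead, every run of word characters is returned
all_words = re.findall(r'\b\w+\b', text)
print(all_words)
# ['3', 'apples', 'no', 'bananas', '10', 'oranges', '2nd', 'place']

# (?!\d) rejects any word whose first character is a digit
no_leading_digit = re.findall(r'\b(?!\d)\w+\b', text)
print(no_leading_digit)
# ['apples', 'no', 'bananas', 'oranges', 'place']
```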
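Example 9: Conditional matching
Goal: tell educational email accounts apart from business ones.
email_text = "edu@example.edu biz@example.biz"
edu_or_biz = re.findall(r'(\w+@\w+\.)(edu|biz)\b', email_text)
print(edu_or_biz) # Output: [('edu@example.', 'edu'), ('biz@example.', 'biz')]

Usage: the alternation (edu|biz) acts as a branch, matching either top-level domain. The first group must cover the domain name as well as the local part; a pattern such as (\w+@)(edu|biz)\. would find nothing here, because "example" sits between the @ and the top-level domain.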
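When only the full address is needed, a non-capturing group keeps the alternation without changing what findall() returns. A sketch, with an extra made-up .org address added to show that other domains are skipped:

```python
import re

email_text = "edu@example.edu biz@example.biz info@example.org"

# (?:...) groups the alternation without capturing it, so findall()
# returns the whole match instead of a tuple of groups
addresses = re.findall(r'\b\w+@\w+\.(?:edu|biz)\b', email_text)
print(addresses)  # ['edu@example.edu', 'biz@example.biz']
```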
Example 10: Global flags
Goal: perform a case-insensitive search.
mixed_case = "Python is fun. PYTHON too!"
result = re.findall(r'python', mixed_case, re.IGNORECASE)
print(result) # Output: ['Python', 'PYTHON']

Flag: re.IGNORECASE makes the match ignore letter case.
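Flags can be combined with the | operator. The sketch below pairs re.IGNORECASE with re.MULTILINE, which makes ^ and $ match at the start and end of each line; the log text is invented for illustration.

```python
import re

# Hypothetical multi-line log text
log = "error: disk full\nINFO: ok\nError: timeout"

# re.MULTILINE: ^ and $ apply per line; re.IGNORECASE: "error" matches "Error" too
errors = re.findall(r'^error: (.*)$', log, re.IGNORECASE | re.MULTILINE)
print(errors)  # ['disk full', 'timeout']
```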
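Hands-on case: cleaning invalid data out of a CSV file
Scenario: remove records whose phone number is not purely numeric from a CSV file.
import csv
import re

# Assume a valid phone number is exactly 10 digits
pattern = re.compile(r'^\d{10}$')

cleaned_data = []
with open('phone_numbers.csv', 'r', newline='') as file:
    reader = csv.reader(file)
    for row in reader:
        # Skip blank rows; assume the phone number is in the first column
        if row and pattern.match(row[0]):
            cleaned_data.append(row)

# Save the cleaned data to a new file
with open('cleaned_phone_numbers.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerows(cleaned_data)

Analysis: this case shows how regular expressions can be combined with file handling to solve a practical problem and keep data quality in check.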
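The filtering step can also be tried without touching the file system. The following is a minimal sketch under the same assumption (10-digit numbers in the first column); the helper clean_rows, the constant PHONE_PATTERN, and the sample rows are hypothetical names and data introduced for this illustration.

```python
import csv
import io
import re

# Assumed format: exactly 10 digits; fullmatch() makes ^ and $ unnecessary
PHONE_PATTERN = re.compile(r'\d{10}')

def clean_rows(rows):
    """Keep rows whose first column is a 10-digit phone number."""
    return [
        row for row in rows
        if row and PHONE_PATTERN.fullmatch(row[0].strip())
    ]

# In-memory stand-in for a CSV file, including a blank line and bad entries
sample = io.StringIO(
    "1234567890,Alice\n"
    "12345,Bob\n"
    "\n"
    "abcdefghij,Carol\n"
    "0987654321,Dave\n"
)

cleaned = clean_rows(csv.reader(sample))
print(cleaned)  # [['1234567890', 'Alice'], ['0987654321', 'Dave']]
```

Keeping the filter in a small function makes it easy to test against in-memory data before running it on real files.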
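Having worked through the examples and the hands-on case above, you have seen Python regular expressions applied to tasks ranging from basic to more advanced.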
