V2EX › All replies by lleon › Page 5 of 6

2018-05-12 14:42:05 +08:00

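Replied to a topic by Chengyaojin › Q&A › Which dictionaries on iOS are relatively good for learning each language? For English, for example, the Longman English-English dictionary is good

I can't follow many of Merriam-Webster's definitions, but the ones I can follow beat every other dictionary hands down.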

2018-05-12 14:36:40 +08:00

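Replied to a topic by mokeyjay › Q&A › Is there really no note-taking app that suits me? Venting about what I can't stand in the note-taking apps on the market

If Evernote supported an outline view, I could drop Markdown altogether.
For now I use Evernote for short notes; for long notes I use Typora on the computer and MWeb on the phone and tablet, synced through iCloud.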

2018-05-07 18:34:43 +08:00

Replied to a topic by zhaoFinger › OCD › Don't touch my screen

Find a way to electrify the screen.

2018-04-27 14:21:45 +08:00

Replied to a topic by cheese › Deals › JD Plus members can draw for iQiyi membership days

Thanks, I got 255 days.

2018-04-25 16:39:09 +08:00

Replied to a topic by lleon › iPhone › Is there an iOS app that lets you freely and precisely adjust the MP3 playback position?

@beijiaoff Thanks, I'll try it later.

2018-04-20 14:43:06 +08:00

Replied to a topic by lleon › iPhone › How do I view a local HTML file that has images on iOS?

Found that Documents can do it.

2017-07-30 13:45:08 +08:00

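Replied to a topic by yechengzhe › Python › Which tutorials or other resources are good for getting started with Python?

@onlyhot
Download RegexBuddy; the tutorials bundled in its help file are the best regular expression tutorial.
Also, it's best to memorize the metacharacter table:
. * ? + ( ) [ { ^ $ | \
Mnemonic: dot, star, question mark, plus; round, square, and curly brackets; head, tail, vertical bar.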
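
A small sketch of why the table is worth memorizing: any of these characters changes meaning inside a pattern, so literal text has to be escaped. The sample string and the METACHARS name are made up for illustration, and the set includes `{` on the assumption that the mnemonic's curly brackets belong in the table.

```python
import re

# The metacharacters from the table above ('{' included as an assumption).
METACHARS = r'.*?+()[{^$|' + '\\'

literal = '1+1=2 (approx.)'  # hypothetical text we want to find verbatim

# Unescaped, '+' means "one or more" and '(...)' is a group,
# so the pattern does not match its own text.
unescaped_hit = re.search(literal, literal)

# re.escape() backslash-escapes the special characters.
escaped_hit = re.search(re.escape(literal), literal)

# Every metacharacter matches itself once it has been escaped.
all_escaped_ok = all(re.fullmatch(re.escape(ch), ch) for ch in METACHARS)

print(unescaped_hit)        # None
print(escaped_hit.group())  # 1+1=2 (approx.)
print(all_escaped_ok)       # True
```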

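Python regular expression notes I wrote a while ago:
(References: Liao Xuefeng's Python tutorial and the book 正则指引)

The re module contains all of the regular expression functionality.

Inline mode modifiers inside a regular expression:

Case-insensitive: (?i)
Single-line (DOTALL) mode, i.e. . also matches '\n': (?s)
Multiline mode: (?m)
ASCII mode, i.e. \d, \w, \s do not match non-ASCII digits, letters, or whitespace: (?a)
Unicode mode, the opposite of ASCII mode and the default for str patterns in Python 3: (?u)
Verbose (comment) mode: (?x)

Note: a global inline flag such as (?i) always applies to the whole regular expression, so put it at the very start of the pattern; Python 3.6 deprecated placing it anywhere else and Python 3.11 made that an error. Since Python 3.6 a flag can also be limited to one group with (?i:...) and switched off inside a group with (?-i:...).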
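
A rough sketch of each inline modifier in action. The sample strings are invented for illustration, and the last two lines assume Python 3.6 or later, where flags scoped to a single group are accepted.

```python
import re

# (?i): ignore case.
ignore_case = re.findall(r'(?i)python', 'Python PYTHON python')

# (?s): '.' also matches a newline.
dot_default = re.search(r'a.b', 'a\nb')
dot_all = re.search(r'(?s)a.b', 'a\nb')

# (?m): '^' and '$' also match at the start and end of each line.
line_starts = re.findall(r'(?m)^\w+', 'one 1\ntwo 2')

# (?a): \d is limited to ASCII digits; '\uff11' is FULLWIDTH DIGIT ONE.
unicode_digits = re.findall(r'\d', '1\uff11')
ascii_digits = re.findall(r'(?a)\d', '1\uff11')

# (?x): whitespace and '#' comments inside the pattern are ignored.
date = re.compile(r'''(?x)
    (\d{4})   # year
    -(\d{2})  # month
    -(\d{2})  # day
''')

# Assumes Python 3.6+: a flag switched on or off for one group only.
scoped_on = re.findall(r'(?i:a)b', 'ab Ab AB')
scoped_off = re.fullmatch(r'(?i)a(?-i:b)c', 'ABC')

print(ignore_case)                        # ['Python', 'PYTHON', 'python']
print(dot_default, bool(dot_all))         # None True
print(line_starts)                        # ['one', 'two']
print(len(unicode_digits), ascii_digits)  # 2 ['1']
print(date.match('2010-12-20').groups())  # ('2010', '12', '20')
print(scoped_on, scoped_off)              # ['ab', 'Ab'] None
```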

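\A matches the start of the string and \Z matches the end of the string. Python had no \z before version 3.14; its \Z corresponds to \z in other regex flavors, matching only at the very end and not before a trailing newline.
If you use named groups, reference them with (?P=name) inside the regular expression and with \g<name> in the replacement. For example: r'(?P<char>[a-z])(?P=char)'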
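
To make both points concrete, here is a minimal sketch: the first half contrasts $ with \Z on a string that ends in a newline, and the second half uses the named backreference from the example above. The sample strings are invented for illustration.

```python
import re

# '$' also matches just before a trailing newline; '\Z' matches only at the very end.
dollar_hit = re.search(r'\d+$', '123\n')
big_z_hit = re.search(r'\d+\Z', '123\n')

# (?P=char) refers back to the named group inside the pattern itself...
doubled = re.compile(r'(?P<char>[a-z])(?P=char)')
doubled_letters = doubled.findall('bookkeeper')

# ...while \g<char> refers to it in the replacement string.
collapsed = doubled.sub(r'\g<char>', 'bookkeeper')

print(bool(dollar_hit), bool(big_z_hit))  # True False
print(doubled_letters)                    # ['o', 'k', 'e']
print(collapsed)                          # bokeper
```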

Inspecting the details of a regular expression (the exact output format varies between Python versions):

>>> re.compile(r'(ab|[cde])+', re.DEBUG)
max_repeat 1 2147483647
  subpattern 1
    branch
      literal 97
      literal 98
    or
      in
        literal 99
        literal 100
        literal 101

re.compile('(ab|[cde])+', re.DEBUG)

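Commonly used methods and attributes of Match objects:

import re
m = re.search(r'(\d{4})-(\d{2})-(\d{2})', '2010-12-20')
# m.start()/m.end() give the span of the match; m.pos/m.endpos are only the search bounds
print('%s start at %d and ends at %d' % (m.group(), m.start(), m.end()))
for i in range(1, m.lastindex + 1):  # m.lastindex is the index of the last matched group
    print('%s start at %d and ends at %d' %
          (m.group(i), m.start(i), m.end(i)))
print(m.expand(r'year:\1 month:\2 day:\3'))

Output: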

2010-12-20 start at 0 and ends at 10
2010 start at 0 and ends at 4
12 start at 5 and ends at 7
20 start at 8 and ends at 10
year:2010 month:12 day:20
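
A short sketch of a few more Match attributes, using a date embedded in a longer string so that the match span and the search bounds differ. The sample text and the group names are assumptions made for illustration.

```python
import re

pattern = re.compile(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})')
text = 'Released on 2010-12-20.'  # hypothetical input
m = pattern.search(text)

# span() is where the match sits; pos/endpos are the bounds that were searched.
match_span = m.span()
search_bounds = (m.pos, m.endpos)

# Named groups are also available as a dict, and spans work per group.
fields = m.groupdict()
month_span = m.span('month')

print(match_span)     # (12, 22)
print(search_bounds)  # (0, 23)
print(fields)         # {'year': '2010', 'month': '12', 'day': '20'}
print(month_span)     # (17, 19)
```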

Examples of common operations:

import re

# 1. Validation and searching

# search() returns a Match object if a match is found, otherwise None
if re.search(r'\A\d{4}-\d{2}-\d{2}\Z', '2010-12-20'):
    print('ok')

# If a regular expression is used repeatedly, compile it first for efficiency
dateRegex = re.compile(r'\A\d{4}-\d{2}-\d{2}\Z')
if dateRegex.search('2010-12-20'):  # or re.search(dateRegex, '2010-12-20')
    print('ok')

# The group() and groups() methods of a Match object
phoneNumRegex = re.compile(r'(\d\d\d)-(\d\d\d-\d\d\d\d)')
m = phoneNumRegex.search('My number is 415-555-4242.')
print(m.group())   # Output: 415-555-4242
print(m.group(0))  # Output: 415-555-4242
print(m.group(1))  # Output: 415
print(m.group(2))  # Output: 555-4242
# Get all groups, numbered from 1, in one call
print(m.groups())  # Output: ('415', '555-4242')

# match() and search() are very similar: same parameters, same return value.
# The only difference: match() matches only at the start of the string; search() has no such restriction.

# 2. Extraction

# findall() returns a list of strings or a list of tuples, or an empty list if nothing is found
print(re.findall(r'\d{4}-\d{2}-\d{2}', '2010-12-20 2011-02-14'))
# Output: ['2010-12-20', '2011-02-14']
print(re.findall(r'(\d{4})-(\d{2})-(\d{2})', '2010-12-20 2011-02-14'))
# Output: [('2010', '12', '20'), ('2011', '02', '14')]

# Iterate over matches with finditer()
for m in re.finditer(r'(\d{4})-(\d{2})-(\d{2})', '2010-12-20 2011-02-14'):
    print(m.group())  # m is a Match object
# Output: 2010-12-20
# Output: 2011-02-14

# Using named groups
regex = re.compile(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})')
for m in regex.finditer('2010-12-20 2011-02-14'):
    print(m.group('year'), m.group('month'), m.group('day'))

# 3. Replacement

print(re.sub(r'\d+', r'**', 'She is 22 years old.'))
# Output: She is ** years old.

regex = r'(\d{4})-(\d{2})-(\d{2})'
replacement = r'\2/\3/\1'  # or r'\g<2>/\g<3>/\g<1>'
print(re.sub(regex, replacement, '2010-12-20'))
# Output: 12/20/2010

# Using named groups
regex = r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})'
replacement = r'\g<month>/\g<day>/\g<year>'
print(re.sub(regex, replacement, '2010-12-20'))
# Output: 12/20/2010

# Using the text matched by the whole expression in the replacement
regex = r'\d+\.\d{0,2}'  # the dot is escaped so it matches only a literal '.'
replacement = r'$\g<0>'
print(re.sub(regex, replacement, 'the price is 12.99'))
# Output: the price is $12.99

# Limit the maximum number of replacements:
print(re.sub(r'\d+', 'x', '13 + 76 = 89', count=1))
# Output: x + 76 = 89

# Normalize every word to an uppercase first letter followed by lowercase letters:
def capitalize(match):
    return match.group(1).upper() + match.group(2).lower()
result = re.sub(r'(?i)\b([a-z])([a-z]+)\b', capitalize, 'one tWO THREE')
print(result)
# Output: One Two Three

# 4. Splitting strings
print(re.split(r'\s+', 'a b \t\r\n c '))
# Output: ['a', 'b', 'c', '']
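
Two details from the examples above, sketched under made-up inputs: how match(), search(), and fullmatch() differ on where they anchor, and how stripping the string first avoids the trailing empty item that split() produced.

```python
import re

date = r'\d{4}-\d{2}-\d{2}'

# match() anchors at the start, search() scans, fullmatch() needs the whole string.
match_middle = re.match(date, 'due 2010-12-20')
search_middle = re.search(date, 'due 2010-12-20')
match_prefix = re.match(date, '2010-12-20 extra')
full_prefix = re.fullmatch(date, '2010-12-20 extra')
full_exact = re.fullmatch(date, '2010-12-20')

# Trailing whitespace yields a final '' from split(); strip() first to avoid it.
messy = 'a b \t\r\n c '
parts = re.split(r'\s+', messy.strip())

print(match_middle)           # None
print(search_middle.group())  # 2010-12-20
print(match_prefix.group())   # 2010-12-20
print(full_prefix)            # None
print(full_exact.group())     # 2010-12-20
print(parts)                  # ['a', 'b', 'c']
```

fullmatch() can stand in for the \A...\Z anchors used in the validation example when the whole input must be a date.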

2017-07-30 12:56:30 +08:00

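Replied to a topic by lleon › Python › Is this a bug in Python?

(continued from the previous reply)
Console.WriteLine(@"dir ""c:\my docs""");

If a Python r-string contains quotes, you also have to choose the outer quotes to suit the quotes inside, which is a real hassle.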
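
For comparison, a sketch of the Python side of that complaint: the outer quote style has to be chosen around whatever quotes the raw string contains, with triple quotes as the fallback when it contains both kinds. The variable names and paths are illustrative only.

```python
# Contains double quotes: wrap the raw string in single quotes.
with_double = r'dir "c:\my docs"'

# Contains a single quote: wrap it in double quotes instead.
with_single = r"it's in c:\temp"

# Contains both kinds: fall back to triple quotes.
with_both = r'''say "hi" to c:\bob's files'''

# The non-raw spelling of the first string, with the backslash doubled.
escaped = 'dir "c:\\my docs"'

print(with_double)             # dir "c:\my docs"
print(with_single)             # it's in c:\temp
print(with_both)               # say "hi" to c:\bob's files
print(with_double == escaped)  # True
```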

2017-07-30 12:52:48 +08:00

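Replied to a topic by lleon › Python › Is this a bug in Python?

@wisej
In C#, the @ prefix, which plays a role similar to the r prefix, has much simpler semantics: it turns every \ in the string into an ordinary character.
Console.WriteLine(@"why\"); // Output: why\
When you need a ", write it as "":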
C

2017-07-30 10:50:24 +08:00

Replied to a topic by lleon › Python › Is this a bug in Python?

@RLib
thanks

2017-07-30 10:41:07 +08:00

Replied to a topic by lleon › Python › Is this a bug in Python?

(I left out the r)
In fact, print(r'why\\') prints why\\ rather than why\
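
A small sketch of the behavior under discussion. In a raw string both backslashes are kept, and a raw string cannot end in a single backslash because the backslash still stops the closing quote from ending the literal. The compile() call is just one way to show the syntax error without breaking the script, and the two workarounds are common idioms rather than anything taken from the thread.

```python
raw = r'why\\'

# r'why\' cannot be written directly, so compile its source text to show the error.
try:
    compile("r'why\\'", '<demo>', 'eval')
    raw_trailing_ok = True
except SyntaxError:
    raw_trailing_ok = False

# Two ways to get a string that ends in exactly one backslash.
plain = 'why\\'
joined = r'why' '\\'  # adjacent string literals are concatenated

print(raw, len(raw))      # why\\ 5
print(raw_trailing_ok)    # False
print(plain, len(plain))  # why\ 4
print(plain == joined)    # True
```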