V2EX  ›  Python

re raises an error on a Unicode range

  •   binjjam · 2017-06-01 14:27:02 +08:00 · 1860 views

    On https://repl.it/languages/python, this re call runs without problems under both python and python3:

    import re;re.findall(u'[\U00010000-\U0001FFFFF]', u'\U0001f61b',re.U)
    

    But under python and python3.4 on Ubuntu 14.04 LTS:

    Python 3.4.0 (default, Jun 19 2015, 14:20:21) 
    [GCC 4.8.2] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import re;re.findall(u'[\U00010000-\U0001FFFFF]', u'\U0001f61b',re.U)
    ['']
    >>> 
    
    
    
    Python 2.7.6 (default, Jun 22 2015, 17:58:13) 
    [GCC 4.8.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import re;re.findall(u'[\U00010000-\U0001FFFFF]', u'\U0001f61b',re.U)
    [u'\U0001f61b']
    >>> 
    

    On CentOS:

    Python 2.7.10 (default, Oct 21 2015, 19:55:03) 
    [GCC 4.4.7 20120313 (Red Hat 4.4.7-11)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import re 
    >>> re.findall(u'[\U00010000-\U0001FFFFF]', u'\U0001f61b',re.U)  
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/local/python2.7/lib/python2.7/re.py", line 181, in findall
    	return _compile(pattern, flags).findall(string)
      File "/usr/local/python2.7/lib/python2.7/re.py", line 251, in _compile
    	raise error, v # invalid expression
    sre_constants.error: bad character range
    >>> 
    
    Python 2.6.6 (r266:84292, Jul 23 2015, 15:22:56) 
    [GCC 4.4.7 20120313 (Red Hat 4.4.7-11)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import re;re.findall(u'[\U00010000-\U0001FFFFF]', u'\U0001f61b',re.U)  
    [u'\U0001f61b']
    >>> 
    

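    What results do the rest of you get? I compared them: the 2.7 re source is identical, while the GCC versions clearly differ, yet Python 2.6 on the same CentOS machine works correctly.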
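
One likely explanation for the CentOS 2.7.10 result, offered as an assumption rather than something confirmed in this thread: that interpreter may be a narrow (UCS-2) Unicode build, which is the default for a Python 2 source build, while the other Python 2 interpreters shown are wide (UCS-4) builds. On a narrow build every character above U+FFFF is stored as a surrogate pair, so the character class is read as `\ud800`, then the range `\udc00-\ud83f`, then `\udfff` and `F`, and that range runs backwards. The sketch below checks `sys.maxunicode` and prints the UTF-16 code units of the two endpoints. The helper names `unicode_build` and `utf16_units` are made up for illustration.

```python
import struct
import sys


def unicode_build(maxunicode=sys.maxunicode):
    """Classify the interpreter: 'narrow' (UCS-2) or 'wide' (UCS-4 / Python 3.3+)."""
    return "narrow" if maxunicode == 0xFFFF else "wide"


def utf16_units(text):
    """Return the UTF-16 code units of text, i.e. what a narrow build stores."""
    data = text.encode("utf-16-be")
    return list(struct.unpack(">%dH" % (len(data) // 2), data))


print(unicode_build())

# Range endpoints from the pattern, as a narrow build would see them.
low = utf16_units(u"\U00010000")   # [high surrogate, low surrogate]
high = utf16_units(u"\U0001FFFF")
print([hex(u) for u in low], [hex(u) for u in high])

# The '-' in the class sits between low[1] and high[0].
# If low[1] > high[0], the range is reversed and re rejects it.
print(low[1] > high[0])
```

Running `unicode_build()` on each of the interpreters above would show whether the one that raises `bad character range` is the only narrow build.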

    Addendum 1  ·  2017-06-01 15:08:45 +08:00
    1 reply    2017-06-01 15:51:40 +08:00
    wwqgtxx  
       2017-06-01 15:51:40 +08:00
    wwq@ubuntu:~$ python3.5
    Python 3.5.2 (default, Nov 17 2016, 17:05:23)
    [GCC 5.4.0 20160609] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import re;re.findall(u'[\U00010000-\U0001FFFFF]', u'\U0001f61b',re.U)
    ['😛']
    >>>
    wwq@ubuntu:~$ python3.6
    Python 3.6.1 (default, Apr 22 2017, 20:17:23)
    [GCC 5.4.0 20160609] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import re;re.findall(u'[\U00010000-\U0001FFFFF]', u'\U0001f61b',re.U)
    ['😛']
    >>>
    wwq@ubuntu:~$ python2.7
    Python 2.7.12 (default, Nov 19 2016, 06:48:10)
    [GCC 5.4.0 20160609] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import re;re.findall(u'[\U00010000-\U0001FFFFF]', u'\U0001f61b',re.U)
    [u'\U0001f61b']
    >>>
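
If the error does come from a narrow Unicode build (an assumption; nothing in the thread confirms it), one workaround is to pick the pattern based on `sys.maxunicode`: a plain range on wide builds, and an explicit surrogate-pair pattern on narrow ones. This sketch assumes the intended upper bound is U+1FFFF; as written in the posts, `\U0001FFFFF` is `\U0001FFFF` followed by a literal `F`. The name `PLANE1` is illustrative only.

```python
import re
import sys

if sys.maxunicode > 0xFFFF:
    # Wide build or Python 3.3+: code points above U+FFFF are single characters.
    PLANE1 = re.compile(u"[\U00010000-\U0001FFFF]")
else:
    # Narrow build: match a high surrogate followed by a low surrogate.
    # U+10000..U+1FFFF maps to D800..D83F followed by DC00..DFFF.
    PLANE1 = re.compile(u"[\ud800-\ud83f][\udc00-\udfff]")

# Compare against the expected match instead of printing the emoji,
# so the result does not depend on the terminal's encoding.
print(PLANE1.findall(u"a\U0001f61bb") == [u"\U0001f61b"])
```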