Question: reading a file in Python 10,000 lines at a time. Is there a ready-made function for this?

This topic was created 2305 days ago; the information in it may have since evolved or changed.

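I tried readlines(10000), but the documentation says: "Returns the next 10000 bytes of line. Only complete lines will be returned." Is there a ready-made function that reads 10,000 lines at a time, in order, and still returns normally when the last batch has fewer than 10,000 lines?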
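For context, the argument to `readlines` is a size hint rather than a line count: reading stops once the accumulated length of the lines read so far passes the hint. The following is only a small sketch of that behaviour; the helper name `demo_readlines_hint` and the sample file contents are made up for illustration.

```python
import os
import tempfile


def demo_readlines_hint(hint=10, total_lines=100):
    """Return how many lines readlines(hint) gives back from a throwaway file."""
    # Hypothetical sample data: short lines of 5 characters each.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write("aaaa\n" * total_lines)
        path = tmp.name
    try:
        with open(path) as f:
            # The hint bounds the accumulated size, not the number of lines.
            batch = f.readlines(hint)
    finally:
        os.remove(path)
    return len(batch)


if __name__ == "__main__":
    # With a hint of 10, only a handful of the 100 lines come back.
    print(demo_readlines_hint())
```

So `readlines(10000)` returns roughly 10,000 characters' worth of complete lines, which is why it does not help with reading exactly 10,000 lines per call.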

10 replies • 2018-07-25 11:34:42 +08:00

noqwerty

2018-07-25 02:24:32 +08:00

https://gist.github.com/SCP-028/ba9318ba819568fe2fa3466f5e373b96

sjmcefc2

2018-07-25 02:37:48 +08:00

@noqwerty Thank you very much, this is great. How do I go about finding the wheel I need?

noqwerty

2018-07-25 02:44:05 +08:00

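@sjmcefc2 #2 Make good use of Google and StackOverflow. I'm only half-competent myself, and pretty much every question I can come up with has already been asked by someone else...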

luzhongqiu

2018-07-25 08:42:08 +08:00

I'd suggest returning an iterator and passing the upper limit in as a parameter. Wouldn't that do it? -.-
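One way to read this suggestion is a generator built on `itertools.islice`, with the batch size passed in as a parameter. This is a minimal sketch under that interpretation, not necessarily what the poster had in mind; `read_in_batches` and `batch_sizes_for` are hypothetical names, and the UTF-8 encoding is an assumption.

```python
import os
import tempfile
from itertools import islice


def read_in_batches(path, batch_size=10000, encoding="utf-8"):
    """Yield lists of up to batch_size lines until the file is exhausted."""
    with open(path, encoding=encoding) as f:
        while True:
            # islice pulls at most batch_size lines from the file iterator.
            batch = list(islice(f, batch_size))
            if not batch:
                break
            yield batch


def batch_sizes_for(n_lines, batch_size):
    """Demo helper: write n_lines to a temp file and report each batch's length."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                     encoding="utf-8") as tmp:
        tmp.writelines(f"line {i}\n" for i in range(n_lines))
        path = tmp.name
    try:
        return [len(batch) for batch in read_in_batches(path, batch_size)]
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(batch_sizes_for(25, 10))  # [10, 10, 5]
```

The caller does not need to know the file's line count in advance: the last batch is simply shorter, and iteration ends when nothing is left.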

qsnow6

2018-07-25 09:36:27 +08:00

Take a look at the linecache module.
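For reference, `linecache.getline(filename, lineno)` fetches a single line by its 1-based number and returns an empty string past the end of the file. A rough sketch of using it to fetch a range of lines follows; `get_line_range` and `demo_linecache` are hypothetical helpers. Note that linecache caches the whole file in memory, so it suits random access better than streaming a very large file.

```python
import linecache
import os
import tempfile


def get_line_range(path, start, count):
    """Return up to `count` lines beginning at 1-based line number `start`."""
    lines = []
    for lineno in range(start, start + count):
        line = linecache.getline(path, lineno)
        if not line:  # getline returns '' once we run past the last line
            break
        lines.append(line)
    return lines


def demo_linecache():
    """Demo helper: ask for 10 lines starting at line 4 of a 5-line file."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.writelines(f"line {i}\n" for i in range(1, 6))
        path = tmp.name
    try:
        return get_line_range(path, 4, 10)
    finally:
        linecache.clearcache()  # drop the cached copy of the temp file
        os.remove(path)


if __name__ == "__main__":
    print(demo_linecache())  # ['line 4\n', 'line 5\n']
```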

wwwyiqiao

2018-07-25 09:37:45 +08:00

Just write a wrapper yourself.

huahuajun9527

2018-07-25 09:58:28 +08:00

```python
def rows_reader(filepath, size=10):
    """Yield lists of up to `size` lines from the file."""
    with open(filepath) as f:
        tmp = []
        for line in f:
            tmp.append(line)
            if len(tmp) >= size:
                yield tmp
                tmp = []
        # Yield the final, possibly shorter, batch.
        if tmp:
            yield tmp


for rows in rows_reader('m.md', size=10):
    print(rows)

```

necomancer

2018-07-25 10:33:11 +08:00

If you're reading data, you could try pandas:
import pandas as pd
f = pd.read_csv(<file_name>, chunksize=10000, ...)
For more parameters, see https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html
Then:
for chunk in f:
    do_sth(chunk)

sjmcefc2

2018-07-25 11:31:18 +08:00

@wwwyiqiao I still think it's better to stand on the shoulders of all you giants.

sjmcefc2

2018-07-25 11:34:42 +08:00

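@luzhongqiu I don't know how many lines each file has. I'll read up on iterators some more.

@necomancer Thank you very much. My file is a txt, though, and it has to be read line by line.

@huahuajun9527 This is great, I'll give it a try.
Also, how does everyone find these commonly used modules instead of writing a wrapper themselves? Of course, for an expert, writing one takes a second. But for an outsider like me, how do I find the ready-made modules?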