Author: 蜗牛 | Source: Internet | 2023-10-16 19:03
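Python was reportedly very popular in 2014, which has a lot to do with its focus on working with data.
So how do we use Python to process data?
First we open the file with Python, then read the data, then process it, and finally output the result.
The walkthrough below uses an example from the book Head First Python and serves as my study notes.
First, import the 'os' module and change the current working directory to the folder that contains the data file.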
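>>> import os  # import the os module
>>> os.getcwd()  # get the current working directory
'C:\\Python33'
>>> os.chdir(r'D:\Python\HeadFirstPython\Chapter3')  # change the current working directory (raw string keeps the backslashes literal)
>>> os.getcwd()
'D:\\Python\\HeadFirstPython\\Chapter3'
Next, open the data file, read its first two lines, and display them on screen.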
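>>> data = open('sketch.txt')  # open the named file and assign it to a file object called "data"
>>> print(data.readline(), end='')  # use the "readline()" method to get one line from the file, then the "print()" built-in function (BIF) to display it
Man: Is this the right room for an argument?
>>> print(data.readline(), end='')
Other Man: I've told you once.
Then "rewind" to the start of the file and use a for statement to process every line in it.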
>>> data.seek(0)
0
>>> for each_line in data:
...     print(each_line, end='')
Finally, close the file.
>>> data.close()
The for loop above prints every line of data (text) in the file:
Man: Is this the right room for an argument?
Other Man: I've told you once.
Man: No you haven't!
Other Man: Yes I have.
Man: When?
Other Man: Just now.
Man: No you didn't!
Other Man: Yes I did!
Man: You didn't!
Other Man: I'm telling you, I did!
Man: You did not!
Other Man: Oh I'm sorry, is this a five minute argument, or the full half hour?
Man: Ah! (taking out his wallet and paying) Just the five minutes.
Other Man: Just the five minutes. Thank you.
Other Man: Anyway, I did.
Man: You most certainly did not!
Other Man: Now let's get one thing quite clear: I most definitely told you!
Man: Oh no you didn't!
Other Man: Oh yes I did!
Man: Oh no you didn't!
Other Man: Oh yes I did!
Man: Oh look, this isn't an argument!
(pause)
Other Man: Yes it is!
Man: No it isn't!
(pause)
Man: It's just contradiction!
Other Man: No it isn't!
Man: It IS!
Other Man: It is NOT!
Man: You just contradicted me!
Other Man: No I didn't!
Man: You DID!
Other Man: No no no!
Man: You did just then!
Other Man: Nonsense!
Man: (exasperated) Oh, this is futile!!
(pause)
Other Man: No it isn't!
Man: Yes it is!
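The open / read / close steps can also be written as a script rather than typed at the prompt. The following is only a minimal sketch under a few assumptions that are not in the original notes: the helper name read_lines is hypothetical, the file is opened by its full path instead of calling os.chdir(), and a three-line stand-in for sketch.txt is written to a temporary folder so the example runs anywhere. It uses the with statement, which closes the file automatically even if reading fails.

```python
import os
import tempfile

# Hypothetical stand-in for sketch.txt: the first three lines of the data above.
SAMPLE = (
    "Man: Is this the right room for an argument?\n"
    "Other Man: I've told you once.\n"
    "Man: No you haven't!\n"
)


def read_lines(path):
    """Return every line of the file at `path`, without the trailing newline."""
    # The with statement closes the file for us, so no explicit close() is needed.
    with open(path, encoding="utf-8") as data:
        return [each_line.rstrip("\n") for each_line in data]


# Write the sample into a temporary folder so the sketch is self-contained.
with tempfile.TemporaryDirectory() as folder:
    sketch_path = os.path.join(folder, "sketch.txt")
    with open(sketch_path, "w", encoding="utf-8") as out:
        out.write(SAMPLE)
    lines = read_lines(sketch_path)

for each_line in lines:
    print(each_line)
```

With the real data file, the same read_lines function could be called with the path to sketch.txt in the Chapter3 folder.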
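Looking at the data, we can see that it follows a specific format: the actor's role, a colon, then the line the actor speaks.
In the next section, we will try to extract the individual parts of each data line.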
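As a preview of that extraction step, here is a rough sketch of how the "role, colon, spoken line" format might be taken apart. It is an illustration, not the book's solution: the function name split_line is hypothetical, and returning None for lines without a colon (such as "(pause)") is an assumption made here. Splitting only at the first colon keeps any later colons inside the spoken line.

```python
def split_line(each_line):
    """Split 'role: spoken line' at the first colon.

    Returns a (role, line_spoken) tuple, or None when the line has no colon.
    """
    if ":" not in each_line:
        return None  # e.g. "(pause)" has no role part
    # maxsplit=1 so that colons inside the spoken text are left alone.
    role, line_spoken = each_line.split(":", 1)
    return role.strip(), line_spoken.strip()


samples = [
    "Man: Is this the right room for an argument?",
    "Other Man: Now let's get one thing quite clear: I most definitely told you!",
    "(pause)",
]
for sample in samples:
    print(split_line(sample))
```

The second sample contains two colons, which is why the split is limited to one; without that limit the line would break into three pieces.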