FileNotFoundError: [Errno 2] No such file or directory: 'errors.out' (error in the last example of Section 5.6 of Natural Language Processing with Python)

Author: 广东庚舞飞扬 | Source: Internet | 2023-07-23 12:36


I was running the last example of Chapter 5 of Natural Language Processing with Python under Python 3.7:

from nltk.tbl import demo as brill_demo
brill_demo.demo()
print(open("errors.out").read())

It failed with the following error:

Traceback (most recent call last):
  File "E:/Python Practice/NLP/Chapter5.py", line 332, in <module>
    print(open("errors.out").read())
FileNotFoundError: [Errno 2] No such file or directory: 'errors.out'

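Taken literally, the message says that the file does not exist, and a look in the current directory confirmed that it really was not there. Searching turned up no ready-made solution, so I asked on StackOverflow. My suspicion was a version difference in the nltk.tbl.demo module: perhaps the newer module has some other, similar way of generating the errors.out file?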
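Before digging into NLTK, it can help to confirm what Python actually sees in the working directory, since a relative name such as errors.out is resolved against it. The following is a minimal sketch using only the standard library; `find_candidates` is a hypothetical helper name, not something from NLTK or the book.

```python
from pathlib import Path


def find_candidates(directory=".", pattern="errors.*"):
    """Return the sorted names of files in `directory` that match `pattern`."""
    return sorted(p.name for p in Path(directory).glob(pattern) if p.is_file())


if __name__ == "__main__":
    # Relative paths such as "errors.out" are resolved against this directory.
    print("Working directory:", Path.cwd())
    print("Matching files:", find_candidates())
```

An empty list here means that no step of the script wrote a file with that name prefix, which points at the code that was supposed to produce it rather than at the `open` call.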

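So I opened the source of the nltk/tbl/demo module and, sure enough, found a similar function:

def demo_error_analysis():
    """
    Writes a file with context for each erroneous word after tagging testing data
    """
    postag(error_output="errors.txt")

Judging by its docstring, this function does exactly what we need: it writes a file similar to errors.out. The natural idea is to call demo_error_analysis() first and then read the file it generates:

brill_demo.demo_error_analysis()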
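As a side note, a module's functions can also be listed programmatically instead of reading the source file by hand. This is a small sketch built on the standard `inspect` module; `public_functions` is a hypothetical helper, and passing it the imported `nltk.tbl.demo` module is an assumption about how you would use it, not something the original code does.

```python
import inspect


def public_functions(module, prefix=""):
    """Return sorted names of functions defined in `module` that start with `prefix`."""
    return sorted(
        name
        for name, obj in inspect.getmembers(module, inspect.isfunction)
        # Skip functions that were merely imported into the module.
        if name.startswith(prefix) and obj.__module__ == module.__name__
    )


# Possible usage, assuming NLTK is installed:
#   from nltk.tbl import demo as brill_demo
#   print(public_functions(brill_demo, prefix="demo"))
```

Filtering on `obj.__module__` keeps the list limited to functions the module itself defines, which makes candidates such as demo_error_analysis easier to spot.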

Things are rarely that simple, though. Calling brill_demo.demo_error_analysis() fails with the following error:

Traceback (most recent call last):
  File "E:/Python Practice/NLP/Chapter5.py", line 331, in <module>
    brill_demo.demo_error_analysis()
  File "D:\Anaconda3\lib\site-packages\nltk\tbl\demo.py", line 124, in demo_error_analysis
    postag(error_output="errors.txt")
  File "D:\Anaconda3\lib\site-packages\nltk\tbl\demo.py", line 322, in postag
    u"\n".join(error_list(gold_data, taggedtest)).encode("utf-8") + "\n"
TypeError: can't concat str to bytes

Following the path in the traceback leads to the source that raises the error:

# writing error analysis to file
if error_output is not None:
    with open(error_output, "w") as f:
        f.write("Errors for Brill Tagger %r\n\n" % serialize_output)
        f.write(
            u"\n".join(error_list(gold_data, taggedtest)).encode("utf-8") + "\n"
        )
    print("Wrote tagger errors including context to {0}".format(error_output))

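So the error comes from the line below, where the text built from error_list runs into a type mismatch:

u"\n".join(error_list(gold_data, taggedtest)).encode("utf-8") + "\n"

An article I found explains the cause: encode() returns a bytes object, which cannot be concatenated directly with a str. The value has to be converted back to str with decode() first.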
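The type mismatch can be reproduced in isolation, without NLTK. The sketch below uses made-up sample lines in place of the real `error_list` output, and shows both the failure and the conversion back to str described above.

```python
# Stand-in for the strings returned by error_list(); the content is made up.
lines = ["first error", "second error"]

encoded = "\n".join(lines).encode("utf-8")  # encode() returns bytes

try:
    encoded + "\n"  # bytes + str raises TypeError in Python 3
    concat_failed = False
except TypeError:
    concat_failed = True

# Decoding turns the bytes back into str, so the concatenation works again.
fixed_with_decode = encoded.decode("utf-8") + "\n"

# The round trip returns the original text, so this is the same string.
fixed_without_encode = "\n".join(lines) + "\n"

print(concat_failed)
print(fixed_with_decode == fixed_without_encode)
```

Both prints show `True`: the concatenation fails on bytes, and decoding gives the same result as never encoding in the first place.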

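The fix is therefore simple. Change that line to:

u"\n".join(error_list(gold_data, taggedtest)).encode("utf-8").decode() + "\n"  # add .decode()

(When you save the change, you may be asked to confirm the modification. Go ahead and make it; if you are worried, mark what you changed with a comment, as I did above.)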
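Because the file in the NLTK source is opened in text mode ("w"), `write` expects a str anyway, so an alternative is to skip the encoding step and let `open` handle it. The following is a sketch of that idea and not NLTK's actual code: `write_error_report` is a hypothetical helper, and the header and lines passed to it stand in for the real values.

```python
def write_error_report(path, header, error_lines):
    """Write a header and one error per line to `path` as UTF-8 text."""
    # encoding="utf-8" makes open() perform the str -> bytes conversion on write,
    # so the strings never need to be encoded by hand.
    with open(path, "w", encoding="utf-8") as f:
        f.write(header + "\n\n")
        f.write("\n".join(error_lines) + "\n")
```

Passing the encoding to `open` also makes the result independent of the platform's default encoding, which can differ between Windows and Linux.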

After this small change, run it again:

brill_demo.demo_error_analysis()

This time it works:

Loading tagged data from treebank...
Read testing data (200 sents/5251 wds)
Read training data (800 sents/19933 wds)
Read baseline data (800 sents/19933 wds) [reused the training set]
Trained baseline tagger
Accuracy on test set: 0.8366
Training tbl tagger...
TBL train (fast) (seqs: 800; tokens: 19933; tpls: 24; min score: 3; min acc: None)
Finding initial useful rules...
Found 12799 useful rules.

           B      |
   S   F   r   O  |        Score = Fixed - Broken
   c   i   o   t  |  R     Fixed = num tags changed incorrect -> correct
   o   x   k   h  |  u     Broken = num tags changed correct -> incorrect
   r   e   e   e  |  l     Other = num tags changed incorrect -> incorrect
   e   d   n   r  |  e
------------------+-------------------------------------------------------
  23  23   0   0  | POS->VBZ if Pos:…@[-2,-1]
  18  19   1   0  | NN->VB if Pos:…@[-2] & Pos:…@[-1]
  14  14   0   0  | VBP->VB if Pos:…@[-2,-1]
  12  12   0   0  | VBP->VB if Pos:…@[-1]
  11  11   0   0  | VBD->VBN if Pos:…@[-1]
  11  11   0   0  | IN->WDT if Pos:…@[1] & Pos:…@[2]
  10  11   1   0  | VBN->VBD if Pos:…@[-1]
   9  10   1   0  | VBD->VBN if Pos:…@[-1]
   8   8   0   0  | NN->VB if Pos:…@[-1]
   7   7   0   1  | VB->NN if Pos:…@[-1]
   7   7   0   0  | VB->VBP if Pos:…@[-1]
   7   7   0   0  | IN->WDT if Pos:…@[1] & Pos:…@[2]
   7   8   1   0  | IN->RB if Word:…@[2]
   6   6   0   0  | VBD->VBN if Pos:…@[-2,-1]
   6   6   0   1  | IN->WDT if Pos:…@[1] & Pos:…@[2]
   5   5   0   0  | POS->VBZ if Pos:…@[-1]
   5   5   0   0  | VB->VBP if Pos:…@[-1]
   5   5   0   0  | VBD->VBN if Word:…@[-2,-1]
   4   4   0   0  | POS->VBZ if Pos:``@[-2]
   4   4   0   0  | VBP->VB if Pos:…@[-2,-1]
   4   6   2   3  | RP->RB if Pos:…@[1,2]
   4   4   0   0  | RB->JJ if Pos:…@[-1] & Pos:…@[1]
   4   4   0   0  | NN->VBP if Pos:…@[-2] & Pos:…@[-1]
   4   5   1   0  | VBN->VBD if Pos:…@[-2] & Pos:…@[-1]
   4   4   0   0  | IN->WDT if Pos:…@[1] & Pos:…@[2]
   4   8   4   0  | VBD->VBN if Word:*@[1]
   4   4   0   0  | JJS->RBS if Word:…@[0] & Word:…@[-1] & Pos:…@[-1]
   3   3   0   0  | VBD->VBN if Pos:…@[-1]
   3   4   1   0  | VBN->VB if Pos:…@[-1]
   3   4   1   1  | IN->RB if Pos:…@[1]
   3   3   0   0  | JJ->RB if Pos:…@[1]
   3   3   0   0  | PRP$->PRP if Pos:…@[1]
   3   3   0   0  | NN->VBP if Pos:…@[-1] & Pos:…@[1]
   3   3   0   0  | VBP->VB if Word:n'…@[-2,-1]
Trained tbl tagger in 2.45 seconds
Accuracy on test set: 0.8572
Tagging the test data
Wrote tagger errors including context to errors.txt
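The legend in the training output defines Score = Fixed - Broken, which can be checked against individual rows. The sketch below parses three rows copied from the output above, with the rule conditions left out; `parse_rule_row` is a hypothetical helper written for this illustration.

```python
def parse_rule_row(row):
    """Split a rule row into (score, fixed, broken, other, rule_text)."""
    counts, rule = row.split("|", 1)
    score, fixed, broken, other = (int(n) for n in counts.split())
    return score, fixed, broken, other, rule.strip()


# Rows taken from the training output; the conditions after "if" are omitted.
rows = [
    "18 19 1 0 | NN->VB",
    "4 6 2 3 | RP->RB",
    "4 8 4 0 | VBD->VBN",
]

for row in rows:
    score, fixed, broken, other, rule = parse_rule_row(row)
    # Score = Fixed - Broken, as stated in the table legend.
    assert score == fixed - broken
    print(f"{rule}: {fixed} fixed - {broken} broken = {score}")
```

The Other column does not enter the score: it counts tags that were wrong before the rule was applied and are still wrong afterwards.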

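A new file, errors.txt, now appears in the current directory.

The last step is to read the file and print its contents:

print(open("errors.txt").read())

The output looks like this (partial):

And with that, the original problem is solved.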
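For a long report, it may be more convenient to print only the first few lines, and to close the file explicitly rather than leaving it to the garbage collector. This is a minimal sketch with a hypothetical helper, `read_head`; the default of 20 lines is an arbitrary choice.

```python
from itertools import islice


def read_head(path, max_lines=20, encoding=None):
    """Return up to `max_lines` lines from the start of a text file.

    encoding=None uses the platform default, which matches a file that was
    written by open(path, "w") without an explicit encoding.
    """
    with open(path, encoding=encoding) as f:  # closed automatically on exit
        return [line.rstrip("\n") for line in islice(f, max_lines)]


# Possible usage once errors.txt has been generated:
#   for line in read_head("errors.txt"):
#       print(line)
```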

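I am writing this up at the tail end of Singles' Day (November 11) as a summary of a problem that cost me two or three hours. I hope it helps whoever runs into it next.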
