Solved: AttributeError: 'Field' object has no attribute 'vocab'
While reproducing some code recently, I needed the Field class from torchtext.data. This post records the problem I ran into and how I fixed it.
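- Note: the torchtext version should not be too new
Newer versions of torchtext.data no longer provide Field (which is a class, not a method), so importing it from there fails.
Lesson: when reproducing someone else's code, also record the version information of the environment they used.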
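Before running the reproduced code, it can help to check where Field lives in the installed torchtext. The sketch below is only a probe: the helper name find_field_module is made up for illustration, and the second candidate, torchtext.legacy.data, is based on my understanding that Field was moved there around torchtext 0.9 and removed entirely around 0.12. Treat those version numbers as an assumption and check the release notes for your version.

```python
import importlib

# Module paths to probe, in order. The second entry is an assumption about
# where Field lived in the intermediate ("legacy") torchtext releases.
CANDIDATES = ("torchtext.data", "torchtext.legacy.data")


def find_field_module(candidates=CANDIDATES):
    """Return the first module name that exposes `Field`, or None if none does."""
    for name in candidates:
        try:
            module = importlib.import_module(name)
        except Exception:
            # torchtext is missing, too new, or incompatible with the installed torch
            continue
        if hasattr(module, "Field"):
            return name
    return None


location = find_field_module()
if location is None:
    print("Field not found; install the torchtext version the original code used.")
else:
    print(f"Field can be imported from {location}")
```

If the probe finds nothing, the practical fix is usually to pin torchtext to the version used by the original author rather than to patch the imports.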
- Run the following code:
from torchtext.data import Field

# tokenize_en and max_length are not shown in this post; they must be defined beforehand
SRC = Field(tokenize=tokenize_en, init_token='<sos>', eos_token='<eos>',
            fix_length=max_length, lower=True, batch_first=True, sequential=True)
TRG = Field(tokenize=tokenize_en, init_token='<sos>', eos_token='<eos>',
            fix_length=max_length, lower=True, batch_first=True, sequential=True)

print(SRC.vocab.stoi["<sos>"])
print(TRG.vocab.stoi["<sos>"])
Error message:
print(SRC.vocab.stoi["<sos>"]) # 2
AttributeError: 'Field' object has no attribute 'vocab'
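So I looked at the definition of the Field class for anything related to building the vocabulary, and found that its build_vocab() method is where the vocab attribute is created. build_vocab() is defined as follows: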
class Field(RawField):
    ...
    def build_vocab(self, *args, **kwargs):
        """Construct the Vocab object for this field from one or more datasets.

        Arguments:
            Positional arguments: Dataset objects or other iterable data
                sources from which to construct the Vocab object that
                represents the set of possible values for this field. If
                a Dataset object is provided, all columns corresponding
                to this field are used; individual columns can also be
                provided directly.
            Remaining keyword arguments: Passed to the constructor of Vocab.
        """
        counter = Counter()
        sources = []
        for arg in args:
            if isinstance(arg, Dataset):
                sources += [getattr(arg, name) for name, field in
                            arg.fields.items() if field is self]
            else:
                sources.append(arg)
        for data in sources:
            for x in data:
                if not self.sequential:
                    x = [x]
                try:
                    counter.update(x)
                except TypeError:
                    counter.update(chain.from_iterable(x))
        specials = list(OrderedDict.fromkeys(
            tok for tok in [self.unk_token, self.pad_token, self.init_token,
                            self.eos_token] + kwargs.pop('specials', [])
            if tok is not None))
        self.vocab = self.vocab_cls(counter, specials=specials, **kwargs)
    ...
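To see why "<sos>" ends up at index 2, the logic above can be imitated in plain Python. This is a simplified stand-in, not torchtext's real Vocab class: the function name build_stoi is hypothetical, the special tokens assume Field's default "<unk>" and "<pad>", and the ordering rule (special tokens first, then tokens by descending frequency with alphabetical tie-breaking) reflects my understanding of the default behaviour.

```python
from collections import Counter, OrderedDict


def build_stoi(sources, specials=("<unk>", "<pad>", "<sos>", "<eos>")):
    """Simplified imitation of build_vocab: return a token -> index mapping."""
    counter = Counter()
    for data in sources:          # each source is an iterable of examples
        for example in data:      # each example is a list of tokens
            counter.update(example)

    # De-duplicate the special tokens while keeping their order
    specials = list(OrderedDict.fromkeys(tok for tok in specials if tok is not None))
    for tok in specials:
        counter.pop(tok, None)    # specials get the first indices, not a frequency rank

    # Ordinary tokens: most frequent first, ties broken alphabetically
    ordinary = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
    itos = specials + [tok for tok, _ in ordinary]
    return {tok: idx for idx, tok in enumerate(itos)}


# No data at all: only the special tokens are in the vocabulary
empty = build_stoi([])
print(empty["<sos>"])  # 2

# With a tiny corpus, ordinary tokens are appended after the specials
corpus = [["a", "cat", "sat"], ["a", "dog", "sat"]]
stoi = build_stoi([corpus])
print(stoi["a"], stoi["dog"])  # 4 7
```

The point is that the vocabulary exists only once the counting step has run, and that calling it without data still yields the four special tokens.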
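Solution: after defining the Field objects, call SRC.build_vocab() and TRG.build_vocab(). Called with no arguments, build_vocab() produces a vocabulary containing only the special tokens; to build a real vocabulary, pass the training dataset to it. The program becomes:
SRC.build_vocab()
TRG.build_vocab()

print(SRC.vocab.stoi["<sos>"])  # output: 2
print(TRG.vocab.stoi["<sos>"])  # output: 2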
With that, the program runs without errors.
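Since the error only appears when vocab is first accessed, a small guard can make the missing build_vocab() call easier to spot. The sketch below uses a hypothetical TinyField class as a stand-in for Field so that it runs without torchtext; require_vocab is likewise an illustrative helper, not part of any library.

```python
class TinyField:
    """Hypothetical stand-in for Field: `vocab` exists only after build_vocab()."""

    def __init__(self, init_token="<sos>", eos_token="<eos>"):
        self.specials = ["<unk>", "<pad>", init_token, eos_token]

    def build_vocab(self, *sources):
        # Flatten every token of every example of every source
        tokens = [tok for data in sources for example in data for tok in example]
        # Specials first, then tokens in order of first appearance (simplified)
        ordered = list(dict.fromkeys(self.specials + tokens))
        self.vocab = {tok: idx for idx, tok in enumerate(ordered)}


def require_vocab(field, name):
    """Return field.vocab, or raise a message that names the missing call."""
    if not hasattr(field, "vocab"):
        raise RuntimeError(
            f"{name}.build_vocab(...) must be called before using {name}.vocab"
        )
    return field.vocab


SRC = TinyField()
try:
    require_vocab(SRC, "SRC")      # vocab has not been built yet
except RuntimeError as err:
    print(err)

SRC.build_vocab([["hello", "world"]])
print(require_vocab(SRC, "SRC")["<sos>"])  # 2
```

With the real Field class, the same hasattr(field, "vocab") check applies, because the attribute is only assigned inside build_vocab().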