Author: berryhu | Source: Internet | 2023-10-12 14:55
Bug/Feature Request Title
When I try to create a depth-3 seed feature, it always returns a column of zeros instead of the expected values.
Bug/Feature Request Description
I built an example dataset described like this:
```
Entityset: None
  Entities:
    pers [Rows: 1, Columns: 4]
    cli [Rows: 29, Columns: 5]
    k [Rows: 43, Columns: 4]
    fin_exp [Rows: 43, Columns: 4]
  Relationships:
    cli.hid -> pers.hid
    k.id_k -> cli.id_k
    fin_exp.id_fin -> k.id_fin
```
When I try to calculate a seed feature of depth 3, like

```python
import featuretools as ft
from featuretools.primitives import NumUnique, Sum
ft.Feature(ft.Feature(ft.Feature(tset['fin_exp']['premium'], parent_entity=tset['k'], primitive=Sum()), parent_entity=tset['cli'], primitive=Sum()), parent_entity=tset['pers'], primitive=Sum())
ft.Feature(ft.Feature(ft.Feature(tset['fin_exp']['id_fin'], parent_entity=tset['k'], primitive=NumUnique()), parent_entity=tset['cli'], primitive=Sum()), parent_entity=tset['pers'], primitive=Sum())
```
it returns a column (a single value in this case) full of zeros instead of the actual values. For comparison, if I count on `tset['k']['idf']`, a duplicate of the index `tset['k']['id_fin']`, for the parent entity
```python
ft.Feature(ft.Feature(tset['k']['idf'], parent_entity=tset['cli'], primitive=NumUnique()), parent_entity=tset['pers'], primitive=Sum())
```
I get 43, as expected, since fin_exp and k have a one-to-one relationship in the example.
Expected Output
```python
ft.Feature(ft.Feature(ft.Feature(tset['fin_exp']['id_fin'], parent_entity=tset['k'], primitive=NumUnique()), parent_entity=tset['cli'], primitive=Sum()), parent_entity=tset['pers'], primitive=Sum())
ft.Feature(ft.Feature(tset['k']['idf'], parent_entity=tset['cli'], primitive=NumUnique()), parent_entity=tset['pers'], primitive=Sum())
```
should return the same value.
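To make the expectation concrete, here is a minimal plain-Python sketch of the same roll-ups. It does not use featuretools. The rows are made-up toy data, and the helpers `aggregate`, `roll_up`, and `num_unique` are hypothetical names introduced only for illustration. Only the entity and column names come from the report. It assumes, as in the example dataset, that fin_exp and k are one-to-one.

```python
from collections import defaultdict

# Hypothetical toy rows (values invented); relationships mirror the report:
# fin_exp.id_fin -> k.id_fin, k.id_k -> cli.id_k, cli.hid -> pers.hid
fin_exp = [
    {"id_fin": 1, "premium": 100.0},
    {"id_fin": 2, "premium": 50.0},
    {"id_fin": 3, "premium": 25.0},
]
k = [
    {"id_fin": 1, "idf": 1, "id_k": "a"},  # idf duplicates the index id_fin
    {"id_fin": 2, "idf": 2, "id_k": "a"},
    {"id_fin": 3, "idf": 3, "id_k": "b"},
]
cli = [{"id_k": "a", "hid": "h1"}, {"id_k": "b", "hid": "h1"}]
pers = [{"hid": "h1"}]


def num_unique(values):
    """Count distinct values in a group."""
    return len(set(values))


def aggregate(child_rows, foreign_key, value, func):
    """Group child rows by foreign_key and apply func to each group's values."""
    groups = defaultdict(list)
    for row in child_rows:
        groups[row[foreign_key]].append(row[value])
    return {parent_id: func(values) for parent_id, values in groups.items()}


def roll_up(rows, value, steps):
    """Apply one aggregation per (foreign_key, parent_rows, parent_index, func) step."""
    for foreign_key, parent_rows, parent_index, func in steps:
        agg = aggregate(rows, foreign_key, value, func)
        # Parents without children get 0, the usual default for sums and counts.
        rows = [dict(p, agg=agg.get(p[parent_index], 0)) for p in parent_rows]
        value = "agg"  # the next level aggregates the value just computed
    return [row["agg"] for row in rows]


to_k = ("id_fin", k, "id_fin")
to_cli = ("id_k", cli, "id_k")
to_pers = ("hid", pers, "hid")

# Depth 3: Sum(Sum(Sum(fin_exp.premium)))
sum_premium = roll_up(fin_exp, "premium", [to_k + (sum,), to_cli + (sum,), to_pers + (sum,)])
# Depth 3: Sum(Sum(NumUnique(fin_exp.id_fin)))
depth3 = roll_up(fin_exp, "id_fin", [to_k + (num_unique,), to_cli + (sum,), to_pers + (sum,)])
# Depth 2: Sum(NumUnique(k.idf))
depth2 = roll_up(k, "idf", [to_cli + (num_unique,), to_pers + (sum,)])

print(sum_premium, depth3, depth2)
```

On this toy data both NumUnique-based roll-ups equal the number of rows in k, and the premium roll-up equals the total premium, so a result of all zeros would not be expected from either depth-3 feature.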
Output of `featuretools.show_info()`
`featuretools.show_info()` still doesn't print any output. Branch is slightly behind master.
Featuretools version: 0.9.0
Featuretools installation directory: y:\git\featuretools\featuretools
SYSTEM INFO
-----------
python: 3.7.3.final.0
python-bits: 64
OS: Windows
OS-release: 2008ServerR2
machine: AMD64
processor: Intel64 Family 6 Model 45 Stepping 2, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
INSTALLED VERSIONS
------------------
numpy: 1.16.4
pandas: 0.24.2
tqdm: 4.32.2
toolz: 0.9.0
PyYAML: 5.1.1
cloudpickle: 1.2.1
future: 0.17.1
dask: 2.0.0
distributed: 2.0.1
psutil: 5.6.3
Click: 7.0
scikit-learn: 0.21.2
pip: 19.1.1
setuptools: 41.0.1
This question comes from the open-source project: alteryx/featuretools
I can confirm that the issue is fixed in the `set-index-featureset-calculator` branch.