Author: cocoa_小米多本_148 | Source: Internet | 2023-09-15 09:12
Hello all,
I couldn't find anything relevant to this topic in the documentation. Can you please help me out?
I have the following problem:
I create a DataFrame from a collection of Java objects. After that, I factorize the DataFrame.
I train a model with this DataFrame and serialize the trained model.
Then I create a new DataFrame from a collection of Java objects, this time with only one new object in it. This new DataFrame also gets factorized. I load the serialized model in my program and use it to predict on the new DataFrame with the predict() method.
The model now always classifies this new DataFrame with the same (wrong) label.
(I tried this with several different objects in the DataFrame, and the model always predicts the same wrong label.)
I thought the mistake was caused by the different schemas. (The training DataFrame has a different schema than the new DataFrame used for the prediction.)
So I copied the previous DataFrame, which was used for the training, added the new object to it, and factorized it. After that, the labeling of the new row works fine.
How can I avoid having to use the "old, previous" DataFrame? Shouldn't the model also work with only a new one-row DataFrame?
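To illustrate what I suspect is happening, here is a minimal plain-Java sketch. It does not use the Smile API. The `CategoryEncoding` class and its `fit` and `encode` helpers are hypothetical names made up for this example, and it assumes that factorizing assigns integer codes from the sorted distinct values found in the DataFrame being factorized. Under that assumption, a one-row DataFrame maps every categorical value to code 0, which does not match the codes the model saw during training.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

public class CategoryEncoding {
    /** Builds a level-to-code mapping from the distinct values, in sorted order. */
    static Map<String, Integer> fit(List<String> values) {
        Map<String, Integer> codes = new TreeMap<>();
        for (String level : new TreeSet<>(values)) {
            codes.put(level, codes.size()); // size() is read before the entry is added
        }
        return codes;
    }

    /** Encodes values with an existing mapping; fails on levels the mapping has not seen. */
    static int[] encode(List<String> values, Map<String, Integer> codes) {
        int[] out = new int[values.size()];
        for (int i = 0; i < out.length; i++) {
            Integer code = codes.get(values.get(i));
            if (code == null) {
                throw new IllegalArgumentException("Unseen level: " + values.get(i));
            }
            out[i] = code;
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> train = List.of("red", "green", "blue", "green");
        List<String> single = List.of("red");

        // Mapping learned from the training column: blue=0, green=1, red=2
        Map<String, Integer> trainCodes = fit(train);

        // Re-fitting on the single row gives red=0, which the model would read as "blue"
        System.out.println(encode(single, fit(single))[0]); // prints 0

        // Reusing the training mapping keeps red=2
        System.out.println(encode(single, trainCodes)[0]); // prints 2
    }
}
```

If this is indeed the cause, the mapping learned from the training data would have to be stored together with the model and reused for every new row, instead of being rebuilt from the new DataFrame.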
This question comes from the open-source project: haifengl/smile
Okay, thanks a lot!