Is it possible to load the model once and reuse it again in python?

Question

I have trained a scikit-learn model and now want to use it in my Python code. Is there a way to reuse the same model instance? The simple approach is to load the model every time I need it, but because I need it frequently, I would like to load it once and reuse it.

Is there a way I can achieve this in Python?

Here is the code for one thread in prediction.py:

import joblib

clf = joblib.load('trainedsgdhuberclassifier.pkl')
clf.predict(userid)

Now, for another user, I don't want to run prediction.py again and spend time loading the model. Is there a way I can simply write:

new_recommendations = prediction(userid)

Should I be using multiprocessing here? I am not sure.

Oq01 · Accepted Answer · 2014-10-21 18:18:15Z

7

As per the Scikit-learn documentation the following code may help you:

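import pickle

from sklearn import datasets, svm

clf = svm.SVC()
iris = datasets.load_iris()
X, y = iris.data, iris.target
clf.fit(X, y)

s = pickle.dumps(clf)    # serialize the fitted model to a bytes object
clf2 = pickle.loads(s)   # restore an equivalent estimator
clf2.predict(X[0:1])     # predict expects a 2-D array: one row per sample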

In the specific case of scikit-learn, it may be more interesting to use joblib's replacement for pickle (joblib.dump and joblib.load), which is more efficient on objects that carry large NumPy arrays internally, as is often the case for fitted scikit-learn estimators, but can only pickle to disk and not to a string:

# sklearn.externals.joblib was removed in scikit-learn 0.23; import joblib directly
import joblib

joblib.dump(clf, 'filename.pkl')

Later you can load back the pickled model (possibly in another Python process) with:

clf = joblib.load('filename.pkl')

Once you have loaded your model again, you can reuse it without retraining it:

clf.predict(X[0:1])  # 2-D input: a single row

Source: http://scikit-learn.org/stable/modules/model_persistence.html
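The load-once behaviour asked about in the question could be sketched roughly as below. This is not from the scikit-learn documentation; it is a minimal sketch that caches the result of joblib.load with functools.lru_cache. The names MODEL_PATH and get_model are made up for illustration, prediction is the function name used in the question, and the sketch assumes each user is represented by a 1-D feature vector.

```python
from functools import lru_cache

import joblib

MODEL_PATH = "trainedsgdhuberclassifier.pkl"  # file name taken from the question


@lru_cache(maxsize=None)
def get_model(path=MODEL_PATH):
    """Load the model from disk on the first call for a given path.

    Later calls with the same path return the cached object without any I/O.
    """
    return joblib.load(path)


def prediction(features, path=MODEL_PATH):
    """Predict for one user; `features` is assumed to be a 1-D feature vector."""
    model = get_model(path)
    # scikit-learn estimators expect a 2-D input: one row per sample.
    return model.predict([features])
```

With this in prediction.py, other code can do `from prediction import prediction` and call it for every user; the file is read only on the first call. The cache lives inside one Python process, so with multiprocessing each worker process would load its own copy once.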

edited Oct 21, 2014 at 18:18

answered Oct 21, 2014 at 17:56

Oq01


7 Comments

ashu Over a year ago

Thanks for your answer. I am aware of joblib.load. What I want is to reuse the clf returned by joblib.load('filename.pkl'). How can I do that? I don't want to load it for every user, as that takes time.

Oq01 Over a year ago

OK, if I understand you correctly, you can just call clf.predict() or clf.transform(), depending on the type of estimator you used and on what you want to achieve. You don't have to fit the model again. It is difficult to help if you don't provide any code examples.

alvas Over a year ago

@ashu, please provide the code so that @Oq01 can help you out. pickle is the right way to go.

ashu Over a year ago

@alvas just did that in the question !

Oq01 Over a year ago

@ashu, thanks but it's confusing. You don't seem to be using scikit-learn. That looks like graphlab, so I am not really sure if the same methods will apply.


Andreas Mueller · Accepted Answer · 2014-10-24 22:25:08Z

First, you should check how much of a bottleneck this really is and whether avoiding the I/O is worth it. An SGDClassifier is usually quite small. You can easily reuse the model, but I would say the question is not really about how to reuse the model; it is about how to get the new user instances to the classifier.
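One way to check whether loading is a real bottleneck is simply to time it. The helper below is a hypothetical sketch (time_load is not part of joblib or scikit-learn); it takes any zero-argument loader and reports the best wall-clock time over a few runs.

```python
import time


def time_load(load, repeats=5):
    """Return the best wall-clock time, in seconds, over `repeats` calls to load()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        load()
        best = min(best, time.perf_counter() - start)
    return best


# Example use (assumes joblib is installed and the model file exists):
#   import joblib
#   seconds = time_load(lambda: joblib.load("trainedsgdhuberclassifier.pkl"))
#   print(f"best load time: {seconds:.4f} s")
```

Repeated loads of the same file are likely to be served from the operating system's file cache, so the first run may be noticeably slower than the best one.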

I would imagine userid is a feature vector, not an ID, right?

To make the model predict on new data, you need some kind of event-based processing that calls the model when a new input arrives. I am by no means an expert here, but I think one simple solution might be an HTTP interface served by a lightweight server such as Flask.
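A rough sketch of that idea, assuming Flask is installed: the model is loaded once when the server starts, and every request reuses it. The /predict route, the JSON shape {"features": [...]}, and the names create_app and main are assumptions made for illustration, not something prescribed by Flask or scikit-learn.

```python
from flask import Flask, jsonify, request


def create_app(model):
    """Build a Flask app around an already-loaded model (any object with .predict)."""
    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])
    def predict():
        payload = request.get_json(silent=True) or {}
        features = payload.get("features")
        if not isinstance(features, list):
            return jsonify(error="expected JSON of the form {'features': [...]}"), 400
        preds = model.predict([features])  # one row in, one prediction out
        # NumPy arrays are not JSON-serializable, so convert to plain Python values.
        preds = preds.tolist() if hasattr(preds, "tolist") else list(preds)
        return jsonify(prediction=preds[0])

    return app


def main():
    """Load the model once at startup, then serve requests with it."""
    import joblib  # imported here so create_app can be used without joblib

    clf = joblib.load("trainedsgdhuberclassifier.pkl")  # file name from the question
    create_app(clf).run(port=5000)  # Flask's development server
```

Calling main() starts the server; a client would then POST something like {"features": [0.1, 0.2]} to /predict. Flask's built-in server is meant for development, so a production setup would normally sit behind a proper WSGI server.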
