grid_pipeline.fit uses the default value of the solver parameter instead of the GridSearchCV value

Question {#heading}

I am trying to find the best combination of hyperparameters for scikit-learn's LogisticRegression. Here is an example of my code:

# Imports assumed for this snippet; SMOTE needs imbalanced-learn's Pipeline
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import StandardScaler

pipeline = Pipeline([("scaler", StandardScaler()),
                     ("smt",    SMOTE(random_state=42)),
                     ("logreg", LogisticRegression())])

parameters = [{'logreg__solver': ['saga']}, {'logreg__penalty': ['l1', 'l2']}, {'logreg__C': [1e-3, 0.1, 1, 10, 100]}]

grid_pipeline = GridSearchCV(pipeline, parameters, scoring='f1', n_jobs=5, verbose=5, return_train_score=True, cv=5)

grid_result = grid_pipeline.fit(X_train, y_train)

During fitting I get the following error:

ValueError: Solver lbfgs supports only 'l2' or 'none' penalties, got l1 penalty.

For some reason, the default value 'lbfgs' is used for the solver parameter instead of the chosen 'saga'. Why does this happen?

Answer 1 {#1}

Score: 1

I think the issue is how you have specified parameters. To get the desired behaviour, use a single dict as follows:

parameters = {'logreg__solver': ['saga'],
              'logreg__penalty': ['l1', 'l2'],
              'logreg__C': [1e-3, 0.1, 1, 10, 100]}

You had specified it as a list of dicts, and GridSearchCV treats each dict in such a list as a separate grid to search. Parameters that a given dict does not mention keep the estimator's defaults, so the grid {'logreg__penalty': ['l1', 'l2']} was fitted with the default solver, 'lbfgs', rather than 'saga'. That solver does not support the l1 penalty, which is what the error reports.
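
To make the difference between the two specifications concrete, here is a minimal standard-library sketch of how a grid spec expands into candidate settings. The helper `expand_grid` is hypothetical and only mimics the documented `param_grid` semantics; it is not scikit-learn's code. With scikit-learn installed, `sklearn.model_selection.ParameterGrid` can be used to inspect the real expansion.

```python
from itertools import product


def expand_grid(spec):
    """Expand a grid spec into a list of candidate settings.

    A dict yields the Cartesian product of its value lists. A list of
    dicts yields each dict's own product, one after another.
    (Hypothetical helper that imitates param_grid semantics.)
    """
    grids = [spec] if isinstance(spec, dict) else list(spec)
    candidates = []
    for grid in grids:
        keys = sorted(grid)
        for values in product(*(grid[k] for k in keys)):
            candidates.append(dict(zip(keys, values)))
    return candidates


C_VALUES = [1e-3, 0.1, 1, 10, 100]

# Specification from the question: three independent grids
list_of_dicts = [{'logreg__solver': ['saga']},
                 {'logreg__penalty': ['l1', 'l2']},
                 {'logreg__C': C_VALUES}]

# Specification from the answer: one grid, all parameters combined
single_dict = {'logreg__solver': ['saga'],
               'logreg__penalty': ['l1', 'l2'],
               'logreg__C': C_VALUES}

separate = expand_grid(list_of_dicts)
combined = expand_grid(single_dict)

# Candidates that request l1 without setting the solver at all,
# so the estimator would fall back to its default solver
l1_without_solver = [c for c in separate
                     if c.get('logreg__penalty') == 'l1'
                     and 'logreg__solver' not in c]

print(len(separate), len(combined), l1_without_solver)
```

In this sketch the list of dicts produces 1 + 2 + 5 candidates, none of which sets more than one parameter, while the single dict produces 1 × 2 × 5 candidates that all carry `'logreg__solver': 'saga'`.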

Answer 2 {#2}

Score: 0

Why are you passing your parameters as a list of dictionaries instead of a dictionary of lists?

Isn't

parameters = {'solver': ['saga'],
              'penalty': ['l1', 'l2'],
              'C': [0.001, 0.01, 0.1, 1, 10, 100]}

what you want?

It works here. Note that these unprefixed keys fit a search over a bare LogisticRegression; with the pipeline from the question, each key needs the logreg__ prefix (for example, 'logreg__solver').
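
As a small sketch of adapting a grid written for a bare estimator to a pipeline step, the snippet below rewrites each key using scikit-learn's `<step>__<parameter>` naming. `prefix_params` is a hypothetical helper, not part of scikit-learn, and the step name `logreg` is taken from the pipeline in the question.

```python
def prefix_params(step_name, params):
    """Return a copy of params with each key rewritten as '<step>__<param>'.

    Hypothetical convenience helper; the value lists are passed through
    unchanged.
    """
    return {f"{step_name}__{name}": values for name, values in params.items()}


# Grid as written in the answer, addressed to a bare LogisticRegression
bare_parameters = {'solver': ['saga'],
                   'penalty': ['l1', 'l2'],
                   'C': [0.001, 0.01, 0.1, 1, 10, 100]}

# Same grid, addressed to the "logreg" step of the pipeline
pipeline_parameters = prefix_params('logreg', bare_parameters)

print(sorted(pipeline_parameters))
```

The resulting dict is still a single grid, so every candidate sets the solver, the penalty and C together.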

