参考:https://pytorch.org/docs/master/optim.html#how-to-adjust-learning-rate
torch.optim.lr_scheduler提供了几种方法来根据迭代的数量来调整学习率
自己手动定义一个学习率衰减函数:
def adjust_learning_rate(optimizer, epoch, lr): """Sets the learning rate to the initial LR decayed by 10 every 2 epochs""" lr *= (0.1 ** (epoch // 2)) for param_group in optimizer.param_groups: param_group['lr'] = lr
optimizer通过param_group来管理参数组。param_group中保存了参数组及其对应的学习率,动量等等
使用:
model = AlexNet(num_classes=2) optimizer = optim.SGD(params = model.parameters(), lr=10) plt.figure() x = list(range(10)) y = [] lr_init = optimizer.param_groups[0]['lr'] for epoch in range(10): adjust_learning_rate(optimizer, epoch, lr_init) lr = optimizer.param_groups[0]['lr'] print(epoch, lr) y.append(lr) plt.plot(x,y)
返回:
0 10.0 1 10.0 2 1.0 3 1.0 4 0.10000000000000002 5 0.10000000000000002 6 0.010000000000000002 7 0.010000000000000002 8 0.0010000000000000002 9 0.0010000000000000002
如图:
举例先导入所需的库:
import torch import torch.optim as optim from torch.optim import lr_scheduler from torchvision.models import AlexNet import matplotlib.pyplot as plt
1.LambdaLR
CLASS torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda, last_epoch=-1)
将每个参数组的学习率设置为初始lr乘以给定函数。当last_epoch=-1时,将初始lr设置为lr。
参数:
optimizer (Optimizer) – 封装好的优化器
lr_lambda (function or list) –当是一个函数时,需要给其一个整数参数,使其计算出一个乘数因子,用于调整学习率,通常该输入参数是epoch数目;或此类函数的列表,根据在optimator.param_groups中的每组的长度决定lr_lambda的函数个数,如下报错。
last_epoch (int) – 最后一个迭代epoch的索引. Default: -1.
如:
optimizer = optim.SGD(params = model.parameters(), lr=0.05) lambda1 = lambda epoch:epoch // 10 #根据epoch计算出与lr相乘的乘数因子为epoch//10的值 lambda2 = lambda epoch:0.95 ** epoch #根据epoch计算出与lr相乘的乘数因子为0.95 ** epoch的值 scheduler = lr_scheduler.LambdaLR(optimizer, lr_lambda=[lambda1, lambda2])
报错:
--------------------------------------------------------------------------- ValueError Traceback (most recent call last)2-c02d2d9ffc0d> in 4 lambda1 = lambda epoch:epoch // 10 5 lambda2 = lambda epoch:0.95 ** epoch ----> 6 scheduler = lr_scheduler.LambdaLR(optimizer, lr_lambda=[lambda1, lambda2]) 7 plt.figure() 8 x = list(range(40)) /anaconda3/envs/deeplearning/lib/python3.6/site-packages/torch/optim/lr_scheduler.py in __init__(self, optimizer, lr_lambda, last_epoch) 83 if len(lr_lambda) != len(optimizer.param_groups): 84 raise ValueError("Expected {} lr_lambdas, but got {}".format( ---> 85 len(optimizer.param_groups), len(lr_lambda))) 86 self.lr_lambdas = list(lr_lambda) 87 self.last_epoch = last_epoch ValueError: Expected 1 lr_lambdas, but got 2
说明这里只需要一个lambda函数
举例:
1)使用的是lambda2
model = AlexNet(num_classes=2) optimizer = optim.SGD(params = model.parameters(), lr=0.05) #下面是两种lambda函数 #epoch=0到9时,epoch//10=0,所以这时的lr = 0.05*0=0 #epoch=10到19时,epoch//10=1,所以这时的lr = 0.05*1=0.05 lambda1 = lambda epoch:epoch // 10 #当epoch=0时,lr = lr * (0.2**0)=0.05;当epoch=1时,lr = lr * (0.2**1)=0.01 lambda2 = lambda epoch:0.2 ** epoch scheduler = lr_scheduler.LambdaLR(optimizer, lr_lambda=lambda2) plt.figure() x = list(range(40)) y = [] for epoch in range(40): scheduler.step() lr = scheduler.get_lr() print(epoch, scheduler.get_lr()[0]) y.append(scheduler.get_lr()[0]) plt.plot(x,y)
返回:
0 0.05 1 0.010000000000000002 2 0.0020000000000000005 3 0.00040000000000000013 4 8.000000000000002e-05 5 1.6000000000000006e-05 6 3.2000000000000015e-06 7 6.400000000000002e-07 8 1.2800000000000006e-07 9 2.5600000000000014e-08 10 5.120000000000003e-09 11 1.0240000000000006e-09 12 2.0480000000000014e-10 13 4.096000000000003e-11 14 8.192000000000007e-12 15 1.6384000000000016e-12 16 3.276800000000003e-13 17 6.553600000000007e-14 18 1.3107200000000014e-14 19 2.621440000000003e-15 20 5.242880000000006e-16 21 1.0485760000000013e-16 22 2.0971520000000027e-17 23 4.194304000000006e-18 24 8.388608000000012e-19 25 1.6777216000000025e-19 26 3.355443200000005e-20 27 6.71088640000001e-21 28 1.3421772800000022e-21 29 2.6843545600000045e-22 30 5.368709120000009e-23 31 1.0737418240000018e-23 32 2.147483648000004e-24 33 4.294967296000008e-25 34 8.589934592000016e-26 35 1.7179869184000033e-26 36 3.435973836800007e-27 37 6.871947673600015e-28 38 1.3743895347200028e-28 39 2.748779069440006e-29
如图:
也可以写成下面的格式:
def lambda_rule(epoch): lr_l = 0.2 ** epoch return lr_l scheduler = lr_scheduler.LambdaLR(optimizer, lr_lambda=lambda_rule)
2)使用的是lambda1函数:
返回:
0 0.0 1 0.0 2 0.0 3 0.0 4 0.0 5 0.0 6 0.0 7 0.0 8 0.0 9 0.0 10 0.05 11 0.05 12 0.05 13 0.05 14 0.05 15 0.05 16 0.05 17 0.05 18 0.05 19 0.05 20 0.1 21 0.1 22 0.1 23 0.1 24 0.1 25 0.1 26 0.1 27 0.1 28 0.1 29 0.1 30 0.15000000000000002 31 0.15000000000000002 32 0.15000000000000002 33 0.15000000000000002 34 0.15000000000000002 35 0.15000000000000002 36 0.15000000000000002 37 0.15000000000000002 38 0.15000000000000002 39 0.15000000000000002
如图:
其他函数:
load_state_dict(state_dict)
下载调试器状态
参数:
state_dict()
调用返回的对象.
state_dict()
以字典格式返回调试器的状态
它为self.__dict__中的每个变量都包含一个条目,这不是优化器。只有当学习率的lambda函数是可调用对象时才会保存它们,而当它们是函数或lambdas时则不会保存它们。
2.StepLR
CLASS torch.optim.lr_scheduler.StepLR(optimizer, step_size, gamma=0.1, last_epoch=-1)
每个step_size时间步长后使每个参数组的学习率降低。注意,这种衰减可以与此调度程序外部对学习率的其他更改同时发生。当last_epoch=-1时,将初始lr设置为lr。
参数:
optimizer (Optimizer) – 封装的优化器
step_size (int) – 学习率衰减的周期
gamma (float) – 学习率衰减的乘数因子。Default: 0.1.
last_epoch (int) – 最后一个迭代epoch的索引. Default: -1.
举例:
model = AlexNet(num_classes=2) optimizer = optim.SGD(params = model.parameters(), lr=0.05) #即每10次迭代,lr = lr * gamma scheduler = lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1) plt.figure() x = list(range(40)) y = [] for epoch in range(40): scheduler.step() lr = scheduler.get_lr() print(epoch, scheduler.get_lr()[0]) y.append(scheduler.get_lr()[0]) plt.plot(x,y)
返回:
0 0.05 1 0.05 2 0.05 3 0.05 4 0.05 5 0.05 6 0.05 7 0.05 8 0.05 9 0.05 10 0.005000000000000001 11 0.005000000000000001 12 0.005000000000000001 13 0.005000000000000001 14 0.005000000000000001 15 0.005000000000000001 16 0.005000000000000001 17 0.005000000000000001 18 0.005000000000000001 19 0.005000000000000001 20 0.0005000000000000001 21 0.0005000000000000001 22 0.0005000000000000001 23 0.0005000000000000001 24 0.0005000000000000001 25 0.0005000000000000001 26 0.0005000000000000001 27 0.0005000000000000001 28 0.0005000000000000001 29 0.0005000000000000001 30 5.0000000000000016e-05 31 5.0000000000000016e-05 32 5.0000000000000016e-05 33 5.0000000000000016e-05 34 5.0000000000000016e-05 35 5.0000000000000016e-05 36 5.0000000000000016e-05 37 5.0000000000000016e-05 38 5.0000000000000016e-05 39 5.0000000000000016e-05
如图:
3.MultiStepLR
CLASS torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones, gamma=0.1, last_epoch=-1)
当迭代数epoch达到某个里程碑时,每个参数组的学习率将被gamma衰减。注意,这种衰减可以与此调度程序外部对学习率的其他更改同时发生。当last_epoch=-1时,将初始lr设置为lr。
参数:
optimizer (Optimizer) – 封装的优化器
milestones (list) –迭代epochs指数列表. 列表中的值必须是增长的.
gamma (float) – 学习率衰减的乘数因子。Default: 0.1.
last_epoch (int) – 最后一个迭代epoch的索引. Default: -1.
举例:
model = AlexNet(num_classes=2) optimizer = optim.SGD(params = model.parameters(), lr=0.05) #在指定的epoch值,如[10,15,25,30]处对学习率进行衰减,lr = lr * gamma scheduler = lr_scheduler.MultiStepLR(optimizer, milestOnes=[10,15,25,30], gamma=0.1) plt.figure() x = list(range(40)) y = [] for epoch in range(40): scheduler.step() lr = scheduler.get_lr() print(epoch, scheduler.get_lr()[0]) y.append(scheduler.get_lr()[0]) plt.plot(x,y)
返回:
0 0.05 1 0.05 2 0.05 3 0.05 4 0.05 5 0.05 6 0.05 7 0.05 8 0.05 9 0.05 10 0.005000000000000001 11 0.005000000000000001 12 0.005000000000000001 13 0.005000000000000001 14 0.005000000000000001 15 0.0005000000000000001 16 0.0005000000000000001 17 0.0005000000000000001 18 0.0005000000000000001 19 0.0005000000000000001 20 0.0005000000000000001 21 0.0005000000000000001 22 0.0005000000000000001 23 0.0005000000000000001 24 0.0005000000000000001 25 5.0000000000000016e-05 26 5.0000000000000016e-05 27 5.0000000000000016e-05 28 5.0000000000000016e-05 29 5.0000000000000016e-05 30 5.000000000000001e-06 31 5.000000000000001e-06 32 5.000000000000001e-06 33 5.000000000000001e-06 34 5.000000000000001e-06 35 5.000000000000001e-06 36 5.000000000000001e-06 37 5.000000000000001e-06 38 5.000000000000001e-06 39 5.000000000000001e-06
如图:
4.ExponentialLR
CLASS torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma, last_epoch=-1)
每个epoch都对每个参数组的学习率进行衰减。当last_epoch=-1时,将初始lr设置为lr。
参数:
optimizer (Optimizer) – 封装的优化器
gamma (float) – 学习率衰减的乘数因子
last_epoch (int) – 最后一个迭代epoch的索引. Default: -1.
举例:
model = AlexNet(num_classes=2) optimizer = optim.SGD(params = model.parameters(), lr=0.2) #即每个epoch都衰减lr = lr * gamma,即进行指数衰减 scheduler = lr_scheduler.ExponentialLR(optimizer, gamma=0.2) plt.figure() x = list(range(10)) y = [] for epoch in range(10): scheduler.step() lr = scheduler.get_lr() print(epoch, scheduler.get_lr()[0]) y.append(scheduler.get_lr()[0]) plt.plot(x,y)
返回:
0 0.2 1 0.04000000000000001 2 0.008000000000000002 3 0.0016000000000000005 4 0.0003200000000000001 5 6.400000000000002e-05 6 1.2800000000000006e-05 7 2.560000000000001e-06 8 5.120000000000002e-07 9 1.0240000000000006e-07
如图:
5.CosineAnnealingLR
CLASS torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max, eta_min=0, last_epoch=-1)
使用余弦退火调度设置各参数组的学习率,其中ηmax设为初始lr, Tcur为SGDR上次重启后的时间间隔个数:
当last_epoch=-1时,将初始lr设置为lr。注意,由于调度是递归定义的,所以其他操作符可以在此调度程序之外同时修改学习率。如果学习速率仅由该调度程序设置。利用cos曲线降低学习率,该方法来源SGDR,学习率变换如下公式:
该方法已在SGDR中提出SGDR: Stochastic Gradient Descent with Warm Restarts。注意,这只实现了SGDR的余弦退火部分,而没有重新启动。
参数:
optimizer (Optimizer) – 封装的优化器
T_max (int) – 迭代的最大数量
eta_min (float) – 最小学习率 Default: 0.
last_epoch (int) – 最后一个迭代epoch的索引. Default: -1.
举例:
model = AlexNet(num_classes=2) optimizer = optim.SGD(params = model.parameters(), lr=10) #根据式子进行计算 scheduler = lr_scheduler.CosineAnnealingLR(optimizer, T_max=2) plt.figure() x = list(range(10)) y = [] for epoch in range(10): scheduler.step() lr = scheduler.get_lr() print(epoch, scheduler.get_lr()[0]) y.append(scheduler.get_lr()[0]) plt.plot(x,y)
返回:
0 10.0 1 5.0 2 0.0 3 4.999999999999999 4 10.0 5 5.000000000000001 6 0.0 7 4.999999999999998 8 10.0 9 5.000000000000002
如图:
该例子前三个学习率结果的计算方式:
6.ReduceLROnPlateau(动态衰减lr)
CLASS torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min', factor=0.1, patience=10, verbose=False, threshold=0.0001, threshold_mode='rel', cooldown=0, min_lr=0, eps=1e-08)
torch.optim.lr_scheduler.ReduceLROnPlateau
允许基于一些验证测量对学习率进行动态的下降
当评价指标停止改进时,降低学习率。一旦学习停滞不前,模型通常会从将学习率降低2-10倍中获益。这个调度器读取一个度量量,如果在“patience”时间内没有看到改进,那么学习率就会降低。
参数:
optimizer (Optimizer) – 封装的优化器
mode (str) – min, max两个模式中一个。在min模式下,当监测的数量停止下降时,lr会减少;在max模式下,当监视的数量停止增加时,它将减少。默认值:“分钟”。
factor (float) – 学习率衰减的乘数因子。new_lr = lr * factor. Default: 0.1.
patience (int) – 没有改善的迭代epoch数量,这之后学习率会降低。例如,如果patience = 2,那么我们将忽略前2个没有改善的epoch,如果loss仍然没有改善,那么我们只会在第3个epoch之后降低LR。Default:10。
verbose (bool) – 如果为真,则为每次更新打印一条消息到stdout. Default: False
.
threshold (float) – 阈值,为衡量新的最优值,只关注显著变化. Default: 1e-4.
threshold_mode (str) – rel, abs两个模式中一个. 在rel模式的“max”模式下的计算公式为dynamic_threshold = best * (1 + threshold),或在“min”模式下的公式为best * (1 - threshold)。在abs模式下的“max”模式下的计算公式为dynamic_threshold = best + threshold,在“min”模式下的公式为的best - threshold. Default: ‘rel’.
cooldown (int) – 减少lr后恢复正常操作前等待的时间间隔. Default: 0.
min_lr (float or list) – 标量或标量列表。所有参数组或每组的学习率的下界. Default: 0.
eps (float) – 作用于lr的最小衰减。如果新旧lr之间的差异小于eps,则忽略更新. Default: 1e-8.
举例:
import torchvision.models as models import torch.nn as nn model = models.resnet34(pretrained=True) fc_features = model.fc.in_features model.fc = nn.Linear(fc_features, 2) criterion = nn.CrossEntropyLoss() optimizer = optim.SGD(params = model.parameters(), lr=10) scheduler = lr_scheduler.ReduceLROnPlateau(optimizer, 'min') inputs = torch.randn(4,3,224,224) labels = torch.LongTensor([1,1,0,1]) plt.figure() x = list(range(60)) y = [] for epoch in range(60): optimizer.zero_grad() outputs = model(inputs) #print(outputs) loss = criterion(outputs, labels) print(loss) loss.backward() scheduler.step(loss) optimizer.step() lr = optimizer.param_groups[0]['lr'] print(epoch, lr) y.append(lr) plt.plot(x,y)
监督loss,如果loss在patient=10的epoch中没有改进,那么lr就会衰减
返回:
tensor(0.7329, grad_fn=) 0 10 tensor(690.2101, grad_fn= ) 1 10 tensor(2359.7373, grad_fn= ) 2 10 tensor(409365., grad_fn= ) 3 10 tensor(240161.7969, grad_fn= ) 4 10 tensor(3476952.2500, grad_fn= ) 5 10 tensor(5098666.5000, grad_fn= ) 6 10 tensor(719.7433, grad_fn= ) 7 10 tensor(6.7871, grad_fn= ) 8 10 tensor(5.5356, grad_fn= ) 9 10 tensor(4.2844, grad_fn= ) 10 10 tensor(3.0342, grad_fn= ) 11 1.0 tensor(2.9092, grad_fn= ) 12 1.0 tensor(2.7843, grad_fn= ) 13 1.0 tensor(2.6593, grad_fn= ) 14 1.0 tensor(2.5343, grad_fn= ) 15 1.0 tensor(2.4093, grad_fn= ) 16 1.0 tensor(2.2844, grad_fn= ) 17 1.0 tensor(2.1595, grad_fn= ) 18 1.0 tensor(2.0346, grad_fn= ) 19 1.0 tensor(1.9099, grad_fn= ) 20 1.0 tensor(1.7853, grad_fn= ) 21 1.0 tensor(1.6610, grad_fn= ) 22 0.1 tensor(1.6486, grad_fn= ) 23 0.1 tensor(1.6362, grad_fn= ) 24 0.1 tensor(1.6238, grad_fn= ) 25 0.1 tensor(1.6114, grad_fn= ) 26 0.1 tensor(1.5990, grad_fn= ) 27 0.1 tensor(1.5866, grad_fn= ) 28 0.1 tensor(1.5743, grad_fn= ) 29 0.1 tensor(1.5619, grad_fn= ) 30 0.1 tensor(1.5496, grad_fn= ) 31 0.1 tensor(1.5372, grad_fn= ) 32 0.1 tensor(1.5249, grad_fn= ) 33 0.010000000000000002 tensor(1.5236, grad_fn= ) 34 0.010000000000000002 tensor(1.5224, grad_fn= ) 35 0.010000000000000002 tensor(1.5212, grad_fn= ) 36 0.010000000000000002 tensor(1.5199, grad_fn= ) 37 0.010000000000000002 tensor(1.5187, grad_fn= ) 38 0.010000000000000002 tensor(1.5175, grad_fn= ) 39 0.010000000000000002 tensor(1.5163, grad_fn= ) 40 0.010000000000000002 tensor(1.5150, grad_fn= ) 41 0.010000000000000002 tensor(1.5138, grad_fn= ) 42 0.010000000000000002 tensor(1.5126, grad_fn= ) 43 0.010000000000000002 tensor(1.5113, grad_fn= ) 44 0.0010000000000000002 tensor(1.5112, grad_fn= ) 45 0.0010000000000000002 tensor(1.5111, grad_fn= ) 46 0.0010000000000000002 tensor(1.5110, grad_fn= ) 47 0.0010000000000000002 tensor(1.5108, grad_fn= ) 48 0.0010000000000000002 tensor(1.5107, grad_fn= ) 49 0.0010000000000000002 tensor(1.5106, grad_fn= ) 50 0.0010000000000000002 tensor(1.5105, grad_fn= ) 51 0.0010000000000000002 tensor(1.5103, grad_fn= ) 52 0.0010000000000000002 tensor(1.5102, grad_fn= ) 53 0.0010000000000000002 tensor(1.5101, grad_fn= ) 54 0.0010000000000000002 tensor(1.5100, grad_fn= ) 55 0.00010000000000000003 tensor(1.5100, grad_fn= ) 56 0.00010000000000000003 tensor(1.5099, grad_fn= ) 57 0.00010000000000000003 tensor(1.5099, grad_fn= ) 58 0.00010000000000000003 tensor(1.5099, grad_fn= ) 59 0.00010000000000000003
第一个loss为0.7329,一直向后的patient=10的10个epoch中都没有loss小于它,所以根据mode='min',lr = lr*factor=lr * 0.1,所以lr从10变为了1.0
如图:
⚠️该函数没有get_lr
(),所以使用optimizer.param_groups[0]['lr']得到当前的学习率
7.CyclicLR
CLASS torch.optim.lr_scheduler.CyclicLR(optimizer, base_lr, max_lr, step_size_up=2000, step_size_down=None, mode='triangular', gamma=1.0, scale_fn=None, scale_mode='cycle', cycle_momentum=True, base_momentum=0.8, max_momentum=0.9, last_epoch=-1)
根据循环学习速率策略(CLR)设置各参数组的学习速率。该策略以恒定的频率循环两个边界之间的学习率,论文 Cyclical Learning Rates for Training Neural Networks中进行详细介绍。两个边界之间的距离可以按每次迭代或每次循环进行缩放。
循环学习率策略会在每批batch数据之后改变学习率。该类的step()函数应在使用批处理进行训练后调用。
该类有三个内置策略:
optimizer (Optimizer) – 封装的优化器
base_lr (float or list) –初始学习率,即各参数组在循环中的下界。
max_lr (float or list) – 各参数组在循环中的学习率上限。在功能上,它定义了周期振幅(max_lr - base_lr)。任意周期的lr是base_lr和振幅的某种比例的和;因此,根据缩放函数,实际上可能无法达到max_lr。
step_size_up (int) – 在周期的上升部分中,训练迭代的次数. Default: 2000
step_size_down (int) – 在一个周期的下降部分的训练迭代次数。如果step_size_down为None,则将其设置为step_size_up. Default: None
mode (str) – {triangular, triangular2, exp_range}三个模式之一.值对应于上面详细说明的策略。如果scale_fn不为None,则忽略该参数. Default: ‘triangular’
gamma (float) – 在' exp_range '缩放函数中的常量 :计算公式为gamma**(循环迭代)。Default: 1.0
scale_fn (function) – 由一个参数lambda函数定义的自定义缩放策略,其中对于所有x >= 0, 0 <= scale_fn(x) <= 1。如果指定,则忽略“mode”. Default: None
scale_mode (str) – {‘cycle’, ‘iterations’}.定义scale_fn是根据循环数计算还是根据循环迭代(从循环开始的训练迭代)计算. Default: ‘cycle’
cycle_momentum (bool) – 如果为
True
,momentum与“base_momentum”和“max_momentum”之间的学习率成反比。 . Default: True
base_momentum (float or list) – 初始momentum是各参数组在周期中的下界. Default: 0.8
max_momentum (float or list) – 各参数组在周期中的最大momentum边界。在功能上,它定义了周期振幅(max_momentum - base_momentum)。任意周期的momentum等于最大动量与振幅的比例之差;因此,根据缩放函数,实际上可能无法达到base_momentum. Default: 0.9
last_epoch (int) – 最后一批batch的索引。此参数用于恢复训练工作。因为step()应该在每个批处理之后调用,而不是在每个epoch之后调用,所以这个数字表示计算的批处理总数,而不是计算的epoch总数。当last_epoch=-1时,调度将从头开始. Default: -1
举例:
出错:找不到CyclicLR,因为我1.0.1版本中没有这个类:
AttributeError: module 'torch.optim.lr_scheduler' has no attribute 'CyclicLR'
然后更新torch到最新版本1.1.0:
(deeplearning) userdeMBP:ageAndGender user$ pip install --upgrade torch torchvision Collecting torch ... Installing collected packages: torch Found existing installation: torch 1.0.1.post2 Uninstalling torch-1.0.1.post2: Successfully uninstalled torch-1.0.1.post2 Successfully installed torch-1.1.0
然后又报错:
(deeplearning) userdeMBP:ageAndGender user$ python Python 3.6.8 |Anaconda, Inc.| (default, Dec 29 2018, 19:04:46) [GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> import torch Traceback (most recent call last): File "", line 1, in File "/anaconda3/envs/deeplearning/lib/python3.6/site-packages/torch/__init__.py", line 79, in from torch._C import * ImportError: dlopen(/anaconda3/envs/deeplearning/lib/python3.6/site-packages/torch/_C.cpython-36m-darwin.so, 9): Library not loaded: /usr/local/opt/libomp/lib/libomp.dylib Referenced from: /anaconda3/envs/deeplearning/lib/python3.6/site-packages/torch/lib/libshm.dylib Reason: image not found >>>
没能解决这个问题,小伙伴有解决了的希望能告知
该类详情可见https://blog.csdn.net/u013166817/article/details/86503899
pix2pix代码相关部分为:
def get_scheduler(optimizer, opt): """返回学习率调试器 Parameters: optimizer -- 网络使用的优化器 opt (option class) -- 存储的所有的实验标记,需要是BaseOptions的子类 opt.lr_policy是学习率策略的名字,如: linear | step | plateau | cosine 对于'linear',对前个迭代(默认为100)中我们保持相同的学习率,并在下面的 个迭代(m默认为100)中线性衰减学习率到0 对于其他的调试器(step, plateau, and cosine),使用默认的pytorch配置 See https://pytorch.org/docs/stable/optim.html for more details. """ if opt.lr_policy == 'linear': def lambda_rule(epoch): #epoch_count表示epoch开始的值,默认为1 lr_l = 1.0 - max(0, epoch + opt.epoch_count - opt.niter) / float(opt.niter_decay + 1) return lr_l scheduler = lr_scheduler.LambdaLR(optimizer, lr_lambda=lambda_rule) elif opt.lr_policy == 'step': scheduler = lr_scheduler.StepLR(optimizer, step_size=opt.lr_decay_iters, gamma=0.1) elif opt.lr_policy == 'plateau': scheduler = lr_scheduler.ReduceLROnPlateau(optimizer, mode='min', factor=0.2, threshold=0.01, patience=5) elif opt.lr_policy == 'cosine': scheduler = lr_scheduler.CosineAnnealingLR(optimizer, T_max=opt.niter, eta_min=0) else: return NotImplementedError('learning rate policy [%s] is not implemented', opt.lr_policy) return scheduler