Author: apiaoapiao_622 | Source: Internet | 2023-06-28 12:58
Quickly Building a Character Recognizer

Before starting the experiment, import the libraries it needs:

    import paddle
    import paddle.fluid as fluid
    from paddle.fluid.dygraph.nn import Linear
    import numpy as np
    import os
    from PIL import Image

Dataset preprocessing

`paddle.dataset` provides a number of ready-made datasets:

mnist, cifar, Conll05, imdb, imikolov, movielens, sentiment, uci_housing, wmt14, wmt16

We can load the training data with `paddle.dataset.mnist.train()` and split it into batches with `paddle.batch`:

    trainset = paddle.dataset.mnist.train()
    train_reader = paddle.batch(trainset, batch_size=8)

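To make the role of `paddle.batch` more concrete, here is a minimal pure-Python sketch of a wrapper that groups the samples of a reader into fixed-size lists. The names `toy_reader` and `batch_reader` are hypothetical, the data is a placeholder, and this is a simplified stand-in for illustration, not Paddle's actual implementation.

```python
def toy_reader():
    """Hypothetical sample reader: yields (image, label) pairs, like the MNIST reader."""
    for i in range(20):
        image = [0.0] * 784   # placeholder for a flattened 28x28 image
        label = i % 10        # placeholder digit label
        yield image, label


def batch_reader(reader, batch_size):
    """Wrap a sample reader so that it yields lists of up to `batch_size` samples."""
    def batched():
        batch = []
        for sample in reader():
            batch.append(sample)
            if len(batch) == batch_size:
                yield batch
                batch = []
        if batch:  # emit the final, possibly smaller, batch
            yield batch
    return batched


train_reader_sketch = batch_reader(toy_reader, batch_size=8)
batch_sizes = [len(batch) for batch in train_reader_sketch()]
print(batch_sizes)  # [8, 8, 4]
```

With 20 samples and a batch size of 8, the last batch holds only the 4 remaining samples.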
Next, separate the img and label parts of the data. We iterate over the reader and split each batch in turn:

    for batch_id, data in enumerate(train_reader()):
        # Each sample is an (image, label) pair; collect them into two arrays
        img_data = np.array([x[0] for x in data]).astype('float32')
        label_data = np.array([x[1] for x in data]).astype('float32')
        print("Image data shape:", img_data.shape)
        print("Label data shape:", label_data.shape)
        break  # only inspect the first batch

We can also display one of the training images:

    print("\nFirst image of the first batch; its label is {}".format(label_data[0]))

    import matplotlib.pyplot as plt

    # Rescale the pixel values to [0, 255] for display
    img = np.array(img_data[0] + 1) * 127.5
    img = np.reshape(img, [28, 28]).astype(np.uint8)

    plt.figure("Image")
    plt.imshow(img)
    plt.axis('on')
    plt.title('image')
    plt.show()

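The printed output shows that reading one batch from the data reader `train_reader()` yields image data of shape (8, 784) and label data of shape (8,). The 8 corresponds to the configured batch_size, and 784 is the number of pixels in each MNIST image (28 × 28).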
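The display code above rescales each value with `(x + 1) * 127.5`, which suggests that the reader delivers pixels scaled to the range [-1, 1]. Assuming that convention, the mapping and its inverse can be sketched as follows; the helper names are hypothetical.

```python
def normalize_pixel(p):
    """Map a raw 8-bit pixel value in [0, 255] to the range [-1, 1]."""
    return p / 127.5 - 1.0


def denormalize_pixel(x):
    """Invert normalize_pixel: map a value in [-1, 1] back to [0, 255]."""
    return (x + 1.0) * 127.5


raw_pixels = [0, 51, 255]
scaled = [normalize_pixel(p) for p in raw_pixels]
restored = [denormalize_pixel(x) for x in scaled]

print(scaled)
print(restored)
print(28 * 28)  # number of values in one flattened image
```

The round trip recovers the raw values up to floating-point error, which is why the display code can cast the result back to `uint8`.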
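Recognition model

Building the model

As this is an introductory tutorial, we use the simplest possible linear network. Defining the model is similar to PyTorch, except that the class inherits from `fluid.dygraph.Layer`: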

    class minist_model(fluid.dygraph.Layer):
        def __init__(self):
            super(minist_model, self).__init__()
            # A single fully connected layer: 784 pixels in, one value out
            self.fc = Linear(input_dim=28 * 28, output_dim=1, act=None)

        def forward(self, inputs):
            outputs = self.fc(inputs)
            return outputs

    model = minist_model()
    model

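For intuition, a fully connected layer with 784 inputs and one output computes a weighted sum of the pixels plus a bias. The pure-Python sketch below shows that computation with made-up weights; it assumes the standard definition of a linear layer, and `linear_forward` is a hypothetical helper rather than Paddle's code.

```python
def linear_forward(pixels, weights, bias):
    """Compute the output of a one-unit linear layer: sum(w_i * x_i) + b."""
    if len(pixels) != len(weights):
        raise ValueError("pixels and weights must have the same length")
    return sum(w * x for w, x in zip(weights, pixels)) + bias


num_inputs = 28 * 28              # one weight per pixel
weights = [0.01] * num_inputs     # made-up weights for illustration
bias = 0.5                        # made-up bias
pixels = [1.0] * num_inputs       # a dummy flattened image

output = linear_forward(pixels, weights, bias)
print(round(output, 2))
```

Such a layer has 784 weights and one bias, and its single output is an unbounded real number rather than a score per digit class.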
Configuring the model

    with fluid.dygraph.guard():
        model = minist_model()
        model.train()
        train_loader = paddle.batch(paddle.dataset.mnist.train(), batch_size=16)
        # Plain stochastic gradient descent over all model parameters
        opt = fluid.optimizer.SGDOptimizer(learning_rate=0.001, parameter_list=model.parameters())
        opt

Training the model

    with fluid.dygraph.guard():
        model = minist_model()
        model.train()
        train_loader = paddle.batch(paddle.dataset.mnist.train(), batch_size=16)
        opt = fluid.optimizer.SGDOptimizer(learning_rate=0.001, parameter_list=model.parameters())
        for epoch_id in range(100):
            for batch_id, data in enumerate(train_loader()):
                # Prepare the batch; labels are reshaped to a column vector
                image_data = np.array([x[0] for x in data]).astype('float32')
                label_data = np.array([x[1] for x in data]).astype('float32').reshape(-1, 1)
                image = fluid.dygraph.to_variable(image_data)
                label = fluid.dygraph.to_variable(label_data)

                # Forward pass and mean squared error loss
                pre = model(image)
                loss = fluid.layers.square_error_cost(pre, label)
                avg_loss = fluid.layers.mean(loss)
                if batch_id != 0 and batch_id % 1000 == 0:
                    print("epoch: {}, batch: {}, loss is: {}".format(epoch_id, batch_id, avg_loss.numpy()))

                # Backward pass, parameter update, and gradient reset
                avg_loss.backward()
                opt.minimize(avg_loss)
                model.clear_gradients()
        # Save the trained parameters
        fluid.save_dygraph(model.state_dict(), 'mnist')

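The loop above follows the usual pattern of forward pass, squared-error loss, backward pass, and gradient-descent update. As an illustration only, the sketch below performs those steps by hand for a tiny linear model with two inputs. The data, learning rate, and helper names are invented, and the gradients are derived manually rather than computed by Paddle.

```python
def predict(weights, bias, x):
    """Linear model output for one sample."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def mean_squared_error(weights, bias, batch):
    """Average of (prediction - label)^2 over the batch."""
    return sum((predict(weights, bias, x) - y) ** 2 for x, y in batch) / len(batch)


def sgd_step(weights, bias, batch, learning_rate):
    """One gradient-descent update on the mean squared error of a batch."""
    n = len(batch)
    grad_w = [0.0] * len(weights)
    grad_b = 0.0
    for x, y in batch:
        error = predict(weights, bias, x) - y
        for i, xi in enumerate(x):
            grad_w[i] += 2.0 * error * xi / n   # d(loss)/d(w_i)
        grad_b += 2.0 * error / n               # d(loss)/d(b)
    new_weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
    new_bias = bias - learning_rate * grad_b
    return new_weights, new_bias


# Invented toy data generated by y = 2*x0 - x1 + 1
batch = [([1.0, 0.0], 3.0), ([0.0, 1.0], 0.0), ([1.0, 1.0], 2.0), ([2.0, 1.0], 4.0)]
weights, bias = [0.0, 0.0], 0.0

initial_loss = mean_squared_error(weights, bias, batch)
for _ in range(1000):
    weights, bias = sgd_step(weights, bias, batch, learning_rate=0.1)
final_loss = mean_squared_error(weights, bias, batch)

print(initial_loss, final_loss)
print([round(w, 3) for w in weights], round(bias, 3))
```

On this toy problem the loss shrinks from 7.25 toward zero because the data is exactly linear; MNIST labels are not a linear function of the pixels, so the tutorial's model cannot be expected to fit as well.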
Testing the model

First, load a new image:
```python
import matplotlib.image as Img
import matplotlib.pyplot as plt

example = Img.imread('./work/example_0.png')
plt.imshow(example)
plt.show()
```
First preprocess the image, then run the prediction:
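
    def load_image(img_path):
        # Open the image and convert it to single-channel grayscale
        im = Image.open(img_path).convert('L')
        # Image.LANCZOS is the same filter as the older Image.ANTIALIAS alias,
        # which was removed in Pillow 10
        im = im.resize((28, 28), Image.LANCZOS)
        im = np.array(im).reshape(1, -1).astype(np.float32)
        # Scale to [-1, 1] and invert, so dark strokes map to high values
        im = 1 - im / 127.5
        return im

    with fluid.dygraph.guard():
        # Use the class defined earlier (the original called an undefined MNIST())
        model = minist_model()
        params_file_path = 'mnist'
        img_path = './work/example_0.png'
        # Load the parameters saved during training
        model_dict, _ = fluid.load_dygraph(params_file_path)
        model.load_dict(model_dict)
        model.eval()
        tensor_img = load_image(img_path)
        print("Input shape:", tensor_img.shape)
        result = model(fluid.dygraph.to_variable(tensor_img))
        print("The predicted digit is", result.numpy().astype('int32'))
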
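Note that the model produces a single real number, and `astype('int32')` converts it by truncating toward zero rather than rounding. The sketch below illustrates the difference with Python's built-in `int` and `round` on made-up outputs; rounding appears only for comparison and is not what the code above does.

```python
raw_outputs = [6.9, 0.4, -0.7, 3.2]   # made-up model outputs

# int() drops the fractional part, the same direction as a float-to-int cast
truncated = [int(v) for v in raw_outputs]
# round() picks the nearest integer instead
rounded = [int(round(v)) for v in raw_outputs]

print(truncated)  # [6, 0, 0, 3]
print(rounded)    # [7, 0, -1, 3]
```

An output of 6.9 is therefore reported as 6, and nothing constrains the result to the range 0 to 9.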
Because the model above is only a linear network, the result is unsatisfactory. We can check the model's accuracy on the test set:

    correct = 0
    count = 0
    with fluid.dygraph.guard():
        # Use the class defined earlier (the original called an undefined MNIST())
        model = minist_model()
        model_dict, _ = fluid.load_dygraph("mnist")
        model.load_dict(model_dict)
        model.eval()
        test_loader = paddle.batch(paddle.dataset.mnist.test(), batch_size=16)
        for batch_id, data in enumerate(test_loader()):
            image_data = np.array([x[0] for x in data]).astype('float32')
            label_data = np.array([x[1] for x in data]).astype('float32').reshape(-1, 1)
            image = fluid.dygraph.to_variable(image_data)
            label = fluid.dygraph.to_variable(label_data)
            predict = model(image)
            # Truncate the real-valued output to an integer digit
            pre = predict.numpy().astype('int32')
            # Count predictions that exactly match the label
            correct = correct + np.sum(pre == label_data)
            count = count + len(image_data)
    print(f"Accuracy: {correct / count * 100:.2f}%")

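The accuracy computed above is the share of test images whose truncated prediction equals the label exactly. A minimal sketch of that calculation on invented predictions and labels, with `accuracy_percent` as a hypothetical helper:

```python
def accuracy_percent(predictions, labels):
    """Return the percentage of predictions that exactly match their labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels) * 100


predicted_digits = [7, 2, 1, 0, 4, 1, 4, 8]   # invented integer predictions
true_digits = [7, 2, 1, 0, 4, 1, 4, 9]        # invented labels

print(f"Accuracy: {accuracy_percent(predicted_digits, true_digits):.2f}%")  # Accuracy: 87.50%
```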