
A BP Neural Network in Python 3


Reposted from 麦子学院.

"""
network.py
~~~~~~~~~~

A module to implement the stochastic gradient descent learning
algorithm for a feedforward neural network.  Gradients are calculated
using backpropagation.  Note that I have focused on making the code
simple, easily readable, and easily modifiable.  It is not optimized,
and omits many desirable features.
"""

#### Libraries
# Standard library
import random

# Third-party libraries
import numpy as np

class Network(object):

    def __init__(self, sizes):
        """The list ``sizes`` contains the number of neurons in the
        respective layers of the network.  For example, if the list
        was [2, 3, 1] then it would be a three-layer network, with the
        first layer containing 2 neurons, the second layer 3 neurons,
        and the third layer 1 neuron.  The biases and weights for the
        network are initialized randomly, using a Gaussian
        distribution with mean 0, and variance 1.  Note that the first
        layer is assumed to be an input layer, and by convention we
        won't set any biases for those neurons, since biases are only
        ever used in computing the outputs from later layers."""
        self.num_layers = len(sizes)
        self.sizes = sizes
        self.biases = [np.random.randn(y, 1) for y in sizes[1:]]
        self.weights = [np.random.randn(y, x)
                        for x, y in zip(sizes[:-1], sizes[1:])]

    def feedforward(self, a):
        """Return the output of the network if ``a`` is input."""
        for b, w in zip(self.biases, self.weights):
            a = sigmoid(np.dot(w, a)+b)
        return a

    def SGD(self, training_data, epochs, mini_batch_size, eta,
            test_data=None):
        """Train the neural network using mini-batch stochastic
        gradient descent.  The ``training_data`` is a list of tuples
        ``(x, y)`` representing the training inputs and the desired
        outputs.  The other non-optional parameters are
        self-explanatory.  If ``test_data`` is provided then the
        network will be evaluated against the test data after each
        epoch, and partial progress printed out.  This is useful for
        tracking progress, but slows things down substantially."""
        if test_data: n_test = len(test_data)
        n = len(training_data)
        for j in range(epochs):
            random.shuffle(training_data)
            mini_batches = [
                training_data[k:k+mini_batch_size]
                for k in range(0, n, mini_batch_size)]
            for mini_batch in mini_batches:
                self.update_mini_batch(mini_batch, eta)
            if test_data:
                print("Epoch {0}: {1} / {2}".format(
                    j, self.evaluate(test_data), n_test))
            else:
                print("Epoch {0} complete".format(j))

    def update_mini_batch(self, mini_batch, eta):
        """Update the network's weights and biases by applying
        gradient descent using backpropagation to a single mini batch.
        The ``mini_batch`` is a list of tuples ``(x, y)``, and ``eta``
        is the learning rate."""
        nabla_b = [np.zeros(b.shape) for b in self.biases]
        nabla_w = [np.zeros(w.shape) for w in self.weights]
        # Examples are processed one at a time and their gradients accumulated;
        # this differs from Andrew Ng's fully vectorized mini-batch formulation.
        for x, y in mini_batch:
            delta_nabla_b, delta_nabla_w = self.backprop(x, y)
            nabla_b = [nb+dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]
            nabla_w = [nw+dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]
        self.weights = [w-(eta/len(mini_batch))*nw
                        for w, nw in zip(self.weights, nabla_w)]
        self.biases = [b-(eta/len(mini_batch))*nb
                       for b, nb in zip(self.biases, nabla_b)]

    def backprop(self, x, y):
        """Return a tuple ``(nabla_b, nabla_w)`` representing the
        gradient for the cost function C_x.  ``nabla_b`` and
        ``nabla_w`` are layer-by-layer lists of numpy arrays, similar
        to ``self.biases`` and ``self.weights``."""
        nabla_b = [np.zeros(b.shape) for b in self.biases]
        nabla_w = [np.zeros(w.shape) for w in self.weights]
        # feedforward
        activation = x
        activations = [x] # list to store all the activations, layer by layer
        zs = [] # list to store all the z vectors, layer by layer
        for b, w in zip(self.biases, self.weights):
            z = np.dot(w, activation)+b
            zs.append(z)
            activation = sigmoid(z)
            activations.append(activation)
        # backward pass
        delta = self.cost_derivative(activations[-1], y) * \
            sigmoid_prime(zs[-1])
        nabla_b[-1] = delta
        nabla_w[-1] = np.dot(delta, activations[-2].transpose())
        # Note that the variable l in the loop below is used a little
        # differently to the notation in Chapter 2 of the book.  Here,
        # l = 1 means the last layer of neurons, l = 2 is the
        # second-last layer, and so on.  It's a renumbering of the
        # scheme in the book, used here to take advantage of the fact
        # that Python can use negative indices in lists.
        for l in range(2, self.num_layers):
            z = zs[-l]
            sp = sigmoid_prime(z)
            delta = np.dot(self.weights[-l+1].transpose(), delta) * sp
            nabla_b[-l] = delta
            nabla_w[-l] = np.dot(delta, activations[-l-1].transpose())
        return (nabla_b, nabla_w)

    def evaluate(self, test_data):
        """Return the number of test inputs for which the neural
        network outputs the correct result. Note that the neural
        network's output is assumed to be the index of whichever
        neuron in the final layer has the highest activation."""
        test_results = [(np.argmax(self.feedforward(x)), y)
                        for (x, y) in test_data]
        return sum(int(x == y) for (x, y) in test_results)

    def cost_derivative(self, output_activations, y):
        r"""Return the vector of partial derivatives \partial C_x /
        \partial a for the output activations."""
        return (output_activations-y)

#### Miscellaneous functions
def sigmoid(z):
    """The sigmoid function."""
    return 1.0/(1.0+np.exp(-z))

def sigmoid_prime(z):
    """Derivative of the sigmoid function."""
    return sigmoid(z)*(1-sigmoid(z))
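
To make the interface concrete, here is a minimal usage sketch of the Network class above. The toy XOR data, layer sizes, and hyperparameters are illustrative assumptions of mine and are not part of the original post; the inputs and targets are column vectors, matching the docstrings of SGD and update_mini_batch.

import numpy as np

# Toy problem: a [2, 3, 1] network on XOR-style data (assumed example, not from the original).
# Each input and target is a NumPy column vector, as the Network class expects.
training_data = [
    (np.array([[0.0], [0.0]]), np.array([[0.0]])),
    (np.array([[0.0], [1.0]]), np.array([[1.0]])),
    (np.array([[1.0], [0.0]]), np.array([[1.0]])),
    (np.array([[1.0], [1.0]]), np.array([[0.0]])),
]

net = Network([2, 3, 1])
# 1000 epochs, mini-batch size 4, learning rate 3.0 -- illustrative values only
net.SGD(training_data, epochs=1000, mini_batch_size=4, eta=3.0)

for x, _ in training_data:
    print(x.ravel(), "->", net.feedforward(x).ravel())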

This algorithm is more accurate than the neural network I had written before, but I ran into errors while testing it, some of the comments were hard to follow, and the code does not map cleanly onto the theory. I improved it on top of the original to make the algorithm more extensible, and I tested the improved code myself; the results are excellent.

# -*- coding: utf-8 -*-
"""
Created on Thu Jan 18 15:27:24 2018

@author: markli
"""

import numpy as np
import random

def tanh(x):
    return np.tanh(x)

def tanh_derivative(x):
    return 1.0 - np.tanh(x)*np.tanh(x)

def logistic(x):
    return 1/(1 + np.exp(-x))

def logistic_derivative(x):
    return logistic(x)*(1-logistic(x))

def ReLU(x, a=1):
    # np.maximum works elementwise, so this also handles array inputs
    return np.maximum(0, a * x)

def ReLU_derivative(x, a=1):
    # np.where works elementwise, so this also handles array inputs
    return np.where(x < 0, 0, a)

class NeuralNetwork:
    '''
    Z = W * x + b
    A = sigmoid(Z)
    Z  net input
    x  sample set, n * m (n features, m samples)
    b  bias
    W  weights
    A  net output
    '''
    def __init__(self, layers, active_function=[logistic],
                 active_function_der=[logistic_derivative], learn_rate=0.9):
        """
        Initialize the neural network.
        layers holds the number of neurons in each layer; its length is the number of layers.
        active_function gives one activation function per layer; if its length is 1,
        the same activation function is used for every layer.
        active_function_der holds the derivatives of the activation functions.
        learn_rate is the learning rate.
        """
        self.weights = [np.random.randn(x, y) for x, y in zip(layers[1:], layers[:-1])]
        self.biases = [np.random.randn(x, 1) for x in layers[1:]]
        self.size = len(layers)
        self.rate = learn_rate
        if len(active_function) == self.size - 1:
            self.sigmoids = active_function
        else:
            self.sigmoids = [active_function[0]] * (self.size - 1)
        if len(active_function_der) == self.size - 1:
            self.sigmoids_der = active_function_der
        else:
            self.sigmoids_der = [active_function_der[0]] * (self.size - 1)

    def fit(self, TrainData, epochs=1000, mini_batch_size=32):
        """
        Learn the network parameters with the backpropagation algorithm.
        TrainData is a list of (X, Y) pairs:
        X  input feature matrix, m*n (n features, m samples)
        Y  actual target values, t*m (t class labels, m samples)
        epochs  number of passes over the training data
        mini_batch_size  size of a mini-batch; use mini_batch_size = 1 to disable mini-batching
        """
        n = len(TrainData)
        for i in range(epochs):
            random.shuffle(TrainData)
            mini_batches = [
                TrainData[k:k+mini_batch_size]
                for k in range(0, n, mini_batch_size)]
            for mini_batch in mini_batches:
                self.BP(mini_batch, self.rate)

    def predict(self, x):
        """Forward propagation. x holds one sample per column."""
        i = 0
        for b, w in zip(self.biases, self.weights):
            x = self.sigmoids[i](np.dot(w, x) + b)
            i = i + 1
        return x

    def BP(self, mini_batch, rate):
        """
        One gradient-descent update of the BP neural network on a single mini-batch.
        """
        size = len(mini_batch)

        nabla_b = [np.zeros(b.shape) for b in self.biases]   # accumulated gradients of the biases
        nabla_w = [np.zeros(w.shape) for w in self.weights]  # accumulated gradients of the weights
        # train on one example at a time and accumulate the gradients
        for x, y in mini_batch:
            delta_nabla_b, delta_nabla_w = self.backprop(x, y)
            nabla_b = [nb+dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]  # accumulate the change in b
            nabla_w = [nw+dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]  # accumulate the change in w
        self.weights = [w-(rate/size)*nw
                        for w, nw in zip(self.weights, nabla_w)]
        self.biases = [b-(rate/size)*nb
                       for b, nb in zip(self.biases, nabla_b)]

    def backprop(self, x, y):
        """
        x is a 1-D row vector
        y is a 1-D row vector
        """
        nabla_b = [np.zeros(b.shape) for b in self.biases]
        nabla_w = [np.zeros(w.shape) for w in self.weights]
        # feedforward
        activation = np.atleast_2d(x).reshape((len(x), 1))  # convert to a column vector
        activations = [activation]  # store the activations of every layer
        zs = []  # store the z vector of every layer
        i = 0
        for b, w in zip(self.biases, self.weights):
            z = np.dot(w, activation) + b
            zs.append(z)
            activation = self.sigmoids[i](z)
            activations.append(activation)
            i = i + 1
        # backward pass
        y = np.atleast_2d(y).reshape((len(y), 1))  # convert y to a column vector
        # delta is the partial derivative of the cost with respect to z
        delta = self.cost_der(activations[-1], y) * \
            self.sigmoids_der[-1](zs[-1])
        nabla_b[-1] = delta
        nabla_w[-1] = np.dot(delta, np.transpose(activations[-2]))
        # walk backwards through the layers, starting from the second-to-last layer
        for l in range(2, self.size):
            z = zs[-l]  # z of the current layer
            sp = self.sigmoids_der[-l](z)  # derivative of the activation with respect to z
            delta = np.multiply(np.dot(np.transpose(self.weights[-l+1]), delta), sp)  # error of the current layer
            nabla_b[-l] = delta
            nabla_w[-l] = np.dot(delta, np.transpose(activations[-l-1]))
        return (nabla_b, nabla_w)

    """
    Cost functions
    cost_der: derivative of the squared-error cost with respect to a
    cost_cross_entropy_der: derivative of the cross-entropy cost with respect to a
    """
    def cost_der(self, a, y):
        return a - y

    def cost_cross_entropy_der(self, a, y):
        return (a-y)/(a * (1-a))
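
The main extension mentioned above is that each layer can be given its own activation function. As an illustrative sketch (the layer sizes, function choices, and learning rate here are my own assumptions, not taken from the original post), a network whose hidden layer uses tanh and whose output layer uses the logistic sigmoid could be constructed like this:

# One activation function (and its derivative) per non-input layer: here, 2 layers.
nn = NeuralNetwork(
    [64, 100, 10],
    active_function=[tanh, logistic],
    active_function_der=[tanh_derivative, logistic_derivative],
    learn_rate=0.5,
)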

The above is the source code of the BP neural network. Below is a digit-recognition program used to verify that the code is correct.

import numpy as np
from sklearn.datasets import load_digits
from sklearn.metrics import confusion_matrix, classification_report
from sklearn.preprocessing import LabelBinarizer
from network_mark import NeuralNetwork
# train_test_split lives in sklearn.model_selection in current scikit-learn;
# the original post imported it from the long-removed sklearn.cross_validation
from sklearn.model_selection import train_test_split


digits = load_digits()
X = digits.data
y = digits.target
X -= X.min()  # normalize the values to bring them into the range 0-1
X /= X.max()

nn = NeuralNetwork([64, 100, 10])
X_train, X_test, y_train, y_test = train_test_split(X, y)
labels_train = LabelBinarizer().fit_transform(y_train)
labels_test = LabelBinarizer().fit_transform(y_test)

# X_train.shape      (1347, 64)
# y_train.shape      (1347,)
# labels_train.shape (1347, 10)
# labels_test.shape  (450, 10)

print("start fitting")
Data = [(x, y) for x, y in zip(X_train, labels_train)]
nn.fit(Data, epochs=500, mini_batch_size=32)
result = nn.predict(X_test.T)  # one column of outputs per test sample
predictions = [np.argmax(result[:, j]) for j in range(result.shape[1])]

print(predictions)
print(confusion_matrix(y_test, predictions))
print(classification_report(y_test, predictions))
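
The test script prints the confusion matrix and classification report. If a single headline number is wanted as well, overall accuracy can be computed from the same predictions; this small snippet is my own addition and not part of the original script:

accuracy = np.mean(np.array(predictions) == y_test)  # fraction of test digits classified correctly
print("overall accuracy: {:.3f}".format(accuracy))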

Finally, the test results: the accuracy is quite good.

