Continuing from the previous tutorials, this post walks through implementing the formulas we covered, step by step, in Python. I am using an IPython notebook here; this kind of notebook is one of the reasons I fell in love with Python. Code and the formulas it implements can sit side by side for comparison, and plots from experiments can go straight into the notebook as well, which makes it a wonderful tool. Next to each formula I attach the corresponding code; if anything is unclear, feel free to leave a comment and ask.

In [1]:
from IPython.core.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))
 
In [2]:
import numpy as np
 

Set the layer sizes of the network and initialize the parameters

In [3]:
sizes=[2,3,1]
num_layers = len(sizes)
biases = [np.random.randn(y, 1) for y in sizes[1:]]  # the input layer has no biases
weights = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])]  # weight matrix shapes: (3, 2) and (1, 3)
In [4]:
np.random.randn(2, 3) #Return a sample (or samples) from the “standard normal” distribution.
Out[4]:
array([[-0.40753557,  1.69663529,  1.68384902],
       [ 0.2270228 , -0.67659325, -1.56037246]])
 
  • The first array holds the hidden-layer biases; the second holds the output neuron's bias
In [5]:
biases
Out[5]:
[array([[ 1.67783431],
        [ 1.48388232],
        [ 1.25672544]]), array([[-0.37243728]])]
 
  • The first array holds the weights between the input layer and the hidden layer
  • The second array holds the weights between the hidden layer and the output layer
In [6]:
weights
Out[6]:
[array([[-0.67155003, -0.16412354],
        [ 1.03533557, -1.62712015],
        [ 0.81559388,  0.33848189]]),
 array([[ 1.55154725,  0.75984726, -0.83059525]])]
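Purely to make the dimensions explicit (this check is an addition, not one of the original notebook cells), the shapes follow directly from sizes = [2, 3, 1]:

print([b.shape for b in biases])   # [(3, 1), (1, 1)]: one bias column per non-input layer
print([w.shape for w in weights])  # [(3, 2), (1, 3)]: (neurons in layer l, neurons in layer l-1)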
 

Prepare zero matrices to store the computed partial derivatives

In [7]:
nabla_b = [np.zeros(b.shape) for b in biases]
nabla_b
Out[7]:
[array([[ 0.],
        [ 0.],
        [ 0.]]), array([[ 0.]])]
In [8]:
nabla_w = [np.zeros(w.shape) for w in weights]
nabla_w
Out[8]:
[array([[ 0.,  0.],
        [ 0.,  0.],
        [ 0.,  0.]]), array([[ 0.,  0.,  0.]])]
 

Define the helper functions

In [9]:
def sigmoid(z):
    """The sigmoid function."""
    return 1.0/(1.0+np.exp(-z))

def sigmoid_prime(z):
    """Derivative of the sigmoid function."""
    return sigmoid(z)*(1-sigmoid(z))

def cost_derivative(output_activations, y):
    """Return the vector of partial derivatives \partial C_x / \partial a for the output activations."""
    return (output_activations - y)
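The cost function itself never appears in the code; cost_derivative is the gradient of the quadratic cost that the notebook implicitly assumes. Written out, that cost would look like this (a reference sketch, not an original cell):

def cost(output_activations, y):
    """Quadratic cost C_x = 0.5 * ||a - y||^2; its derivative w.r.t. a is (a - y)."""
    return 0.5 * np.sum((output_activations - y) ** 2)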
 

Create the training data x, y

In [10]:
np.random.seed(1)
x = 10 * np.random.randn(sizes[0], 1)
y = np.array([1])
print(x)
print(y)
 
[[ 16.24345364]
 [ -6.11756414]]
[1]
 

Feedforward pass

 

Matrix view of the feedforward pass: W (layer 2 × layer 1) · X (layer 1 × N) = Z (layer 2 × N), where N is the number of samples (here N = 1). The rows of W index the hidden neurons, its columns index the input neurons:

$$\begin{bmatrix} w_{0,0} & w_{0,1} \\ w_{1,0} & w_{1,1} \\ w_{2,0} & w_{2,1} \end{bmatrix}\begin{bmatrix} x_{0} \\ x_{1} \end{bmatrix} + \begin{bmatrix} b_{0} \\ b_{1} \\ b_{2} \end{bmatrix} = \begin{bmatrix} w_{0,0}x_{0} + w_{0,1}x_{1} + b_{0} \\ w_{1,0}x_{0} + w_{1,1}x_{1} + b_{1} \\ w_{2,0}x_{0} + w_{2,1}x_{1} + b_{2} \end{bmatrix}$$

Each row of the result is the pre-activation z of one hidden neuron.
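To see the diagram's shape bookkeeping in the actual arrays (a small sanity check added here, not an original cell), the hidden layer's pre-activations can be computed in one line:

z_hidden = np.dot(weights[0], x) + biases[0]  # (3, 2) dot (2, 1) + (3, 1) -> (3, 1)
print(z_hidden.shape)                         # (3, 1): one z per hidden neuron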
In [11]:
activation = x
activations = [x] # list to store all the activations, layer by layer
zs = [] # list to store all the z vectors, layer by layer
for b, w in zip(biases, weights):    
    z = np.dot(w, activation)+b
    zs.append(z)
    activation = sigmoid(z)
    activations.append(activation)
 
  • The z values (pre-activations) of each layer's neurons. zs holds two arrays; np.array(zs)[0] is the first of them, i.e. the hidden layer's z values
In [12]:
print(np.array(zs, dtype=object))
 
[array([[ -8.22642123],
       [ 28.25531946],
       [ 12.43410211]])
 array([[-0.44276705]])]
 
  • The activation values a of each layer's neurons
In [13]:
print(np.array(activations, dtype=object))
 
[array([[ 16.24345364],
       [ -6.11756414]])
 array([[  2.67420380e-04],
       [  1.00000000e+00],
       [  9.99996020e-01]])
 array([[ 0.39108184]])]
 

Backward pass (backpropagation)

 
  • Compute the last layer's sensitivity (delta) $$\delta_{j}^{L} = \frac{\partial E}{\partial a_{j}^{L}}\ f^{'}\left( z_{j}^{L} \right)$$
In [14]:
delta = cost_derivative(activations[-1], y) * sigmoid_prime(zs[-1])
delta
Out[14]:
array([[-0.14500584]])
 
  • The derivative of the total error with respect to the output-layer bias b is exactly the last layer's delta $$\frac{\partial E}{\partial b_{j}^{L}} = \delta_{j}^{L}$$
In [15]:
nabla_b[-1] = delta
 
  • Apply the formula to get the derivative of the total error with respect to the last layer's weights $$\frac{\partial E}{\partial w_{\text{jk}}^{l}} = a_{k}^{l - 1}\delta_{j}^{l}$$
 

nabla_w[j, k] = delta[j, 1] * a[1, k]  (a j × 1 column of deltas times a 1 × k row of activations)

In [16]:
nabla_w[-1] = np.dot(delta, activations[-2].transpose())
nabla_w[-1]
Out[16]:
array([[ -3.87775177e-05,  -1.45005844e-01,  -1.45005266e-01]])
 
  • Compute the derivative of the activation function at the second-to-last layer $$f^{'}\left( z_{k}^{L - 1} \right)$$
In [17]:
z = zs[-2]
f_prime = sigmoid_prime(z)
print(f_prime)
 
[[  2.67348866e-04]
 [  5.35571587e-13]
 [  3.98047234e-06]]
 
  • Substitute into the formula $$\delta_{k}^{l - 1} = f^{'}\left( z_{k}^{l - 1} \right)*\sum_{j}^{}{\delta_{j}^{l}w_{\text{jk}}^{l}}$$
 

wᵀ[k, j] · delta[j, 1] * f_prime[k, 1]  (matrix product over j, then an element-wise product with f_prime)

In [18]:
delta_l_1 = np.dot(weights[-1].transpose(), delta) * f_prime
delta_l_1
Out[18]:
array([[ -6.01490617e-05],
       [ -5.90105057e-14],
       [  4.79412722e-07]])
 

Store the computed delta in the nabla_b matrix

In [19]:
nabla_b[-2] = delta_l_1
print(nabla_b[-2])
 
[[ -6.01490617e-05]
 [ -5.90105057e-14]
 [  4.79412722e-07]]
 
  • Apply the formula to get the derivative of the total error with respect to the second-to-last layer's weights $$\frac{\partial E}{\partial w_{\text{ki}}^{l-1}} = a_{i}^{l - 2}\delta_{k}^{l-1}$$
 

nabla_w[k, i] = delta[k, 1] * a[1, i]

In [20]:
nabla_w[-2] = np.dot(delta_l_1, activations[-2-1].transpose())
print(nabla_w[-2])
 
[[ -9.77028495e-04   3.67965743e-04]
 [ -9.58534413e-13   3.61000553e-13]
 [  7.78731833e-06  -2.93283808e-06]]
 

Putting it all together in one function

In [21]:
def backprop(x, y):
    nabla_b = [np.zeros(b.shape) for b in biases]
    nabla_w = [np.zeros(w.shape) for w in weights]
    # feedforward
    activation = x
    activations = [x] # list to store all the activations, layer by layer
    zs = [] # list to store all the z vectors, layer by layer
    for b, w in zip(biases, weights):
        z = np.dot(w, activation)+b
        zs.append(z)
        activation = sigmoid(z)
        activations.append(activation)
    # backward pass
    delta = cost_derivative(activations[-1], y) * \
        sigmoid_prime(zs[-1])
    nabla_b[-1] = delta
    nabla_w[-1] = np.dot(delta, activations[-2].transpose())
    # note: l is indexed differently in the code; l = 1 is the last layer, l = 2 the second-to-last
    for l in range(2, num_layers):
        z = zs[-l]
        f_prime = sigmoid_prime(z)
        delta = np.dot(weights[-l+1].transpose(), delta) * f_prime
        nabla_b[-l] = delta
        nabla_w[-l] = np.dot(delta, activations[-l-1].transpose())
    return (nabla_b, nabla_w)
 

Pass x as the input and y as the target; the backprop helper returns the partial derivatives

In [22]:
nabla_b, nabla_w=backprop(x,y)
In [23]:
nabla_w
Out[23]:
[array([[ -9.77028495e-04,   3.67965743e-04],
        [ -9.58534413e-13,   3.61000553e-13],
        [  7.78731833e-06,  -2.93283808e-06]]),
 array([[ -3.87775177e-05,  -1.45005844e-01,  -1.45005266e-01]])]
In [24]:
nabla_b
Out[24]:
[array([[ -6.01490617e-05],
        [ -5.90105057e-14],
        [  4.79412722e-07]]), array([[-0.14500584]])]
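As a sanity check (an addition to the notebook, assuming the quadratic cost C = 0.5 * (a - y)^2 discussed above), the analytic gradients for the first weight matrix can be compared against numerical ones obtained by nudging each weight and re-running the forward pass; the two should agree to several decimal places:

def feedforward(a):
    """Forward pass only, returning the network output for input a."""
    for b, w in zip(biases, weights):
        a = sigmoid(np.dot(w, a) + b)
    return a

eps = 1e-6
numeric_nabla_w0 = np.zeros_like(weights[0])
for j in range(weights[0].shape[0]):
    for k in range(weights[0].shape[1]):
        weights[0][j, k] += eps                       # nudge one weight up
        c_plus = 0.5 * np.sum((feedforward(x) - y) ** 2)
        weights[0][j, k] -= 2 * eps                   # nudge it down
        c_minus = 0.5 * np.sum((feedforward(x) - y) ** 2)
        weights[0][j, k] += eps                       # restore the original value
        numeric_nabla_w0[j, k] = (c_plus - c_minus) / (2 * eps)

print(np.allclose(numeric_nabla_w0, nabla_w[0], atol=1e-8))  # expected True if backprop is correct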
 

Summary of the key formulas

 

Formula 1: the derivative of the total error with respect to a layer's weights, computed from that layer's delta and the previous layer's activations $$\frac{\partial E}{\partial w_{\text{jk}}^{l}} = a_{k}^{l - 1}\delta_{j}^{l}$$

 

nabla_w[j, k] = delta[j, 1] * a[1, k]

 

Formula 2: compute a layer's delta from the delta of the layer after it, which the backward pass has already computed $$\delta_{k}^{l - 1} = f^{'}\left( z_{k}^{l - 1} \right)*\sum_{j}^{}{\delta_{j}^{l}w_{\text{jk}}^{l}}$$

 

wᵀ[k, j] · delta[j, 1] * f_prime[k, 1]
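Once nabla_b and nabla_w are available, the natural next step is a gradient-descent update that moves every parameter against its gradient. A minimal sketch follows (the learning rate eta = 3.0 is an arbitrary choice for illustration, not something fixed by this notebook):

eta = 3.0  # learning rate, arbitrary for this sketch
nabla_b, nabla_w = backprop(x, y)
weights = [w - eta * nw for w, nw in zip(weights, nabla_w)]
biases = [b - eta * nb for b, nb in zip(biases, nabla_b)]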

 
