DeepLearning Notes: Implementing the Backpropagation Algorithm in Python

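The procedure for updating the weights with backpropagation is as follows:

  • Set the weight steps for each layer to zero
    • Input → hidden weight steps: $\Delta w_{ij}=0$
    • Hidden → output weight steps: $\Delta W_j=0$
  • For each record in the training set:
    • Run a forward pass through the network to compute the output $\hat y$
    • Compute the error term of the output unit, $\delta^o=(y-\hat y)f'(z)$, where $z=\sum_j W_j a_j$ is the input to the output unit
    • Propagate the error back to the hidden layer: $\delta^h_j=\delta^o W_j f'(h_j)$
    • Accumulate the weight steps
      • $\Delta W_j = \Delta W_j + \delta^o a_j$
      • $\Delta w_{ij} = \Delta w_{ij} + \delta^h_j a_i$
  • Update the weights ($\eta$ is the learning rate, $m$ is the number of training records):
    • $W_j = W_j + \eta \Delta W_j / m$
    • $w_{ij} = w_{ij} + \eta \Delta w_{ij} / m$
  • Repeat the training steps for $e$ epochs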
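
To make the per-record steps concrete, here is a minimal sketch of one forward and backward pass for a single record. The input, target, and weight values are made-up toy numbers (not taken from any dataset), and the activation $f$ is assumed to be the sigmoid, so that $f'(z) = f(z)(1 - f(z))$.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Hypothetical toy values, chosen only for illustration
x = np.array([0.5, 0.1, -0.2])     # one input record, a_i
y = 0.6                            # its target
w = np.array([[0.5, -0.6],
              [0.1, -0.2],
              [0.1,  0.7]])        # input -> hidden weights w_ij, shape (3, 2)
W = np.array([0.1, -0.3])          # hidden -> output weights W_j, shape (2,)

# Forward pass
h = x @ w                          # hidden-layer inputs h_j
a = sigmoid(h)                     # hidden-layer activations a_j
y_hat = sigmoid(a @ W)             # network output

# Backward pass (sigmoid derivative written as f * (1 - f))
delta_o = (y - y_hat) * y_hat * (1 - y_hat)   # output error term
delta_h = delta_o * W * a * (1 - a)           # hidden error terms, shape (2,)

# Contribution of this record to the weight steps
step_W = delta_o * a               # same shape as W
step_w = np.outer(x, delta_h)      # same shape as w; entry [i, j] is delta_h[j] * x[i]

print("step_W:", step_W)
print("step_w:\n", step_w)
```

Note that `np.outer(x, delta_h)` produces the same matrix as the broadcasting expression `delta_h * x[:, None]`; either form gives one step per weight $w_{ij}$.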

The implementation in Python looks like this:

import numpy as np
# data_prep is a local helper module that is not shown in this post
from data_prep import features, targets, features_test, targets_test

np.random.seed(21)


def sigmoid(x):
    """
    Calculate sigmoid
    """
    return 1 / (1 + np.exp(-x))


# Hyperparameters
n_hidden = 2  # number of hidden units
epochs = 900
learnrate = 0.005

n_records, n_features = features.shape
last_loss = None
# Initialize weights
weights_input_hidden = np.random.normal(scale=1 / n_features ** .5,
                                        size=(n_features, n_hidden))
weights_hidden_output = np.random.normal(scale=1 / n_features ** .5,
                                         size=n_hidden)

for e in range(epochs):
    # Reset the accumulated weight steps at the start of every epoch
    del_w_input_hidden = np.zeros(weights_input_hidden.shape)
    del_w_hidden_output = np.zeros(weights_hidden_output.shape)
    for x, y in zip(features.values, targets):

        ## Forward pass ##

        # Calculate the output
        hidden_input = np.dot(x, weights_input_hidden)  # x·w
        hidden_output = sigmoid(hidden_input)
        output = sigmoid(np.dot(hidden_output, weights_hidden_output))

        ## Backward pass ##

        # Calculate the network's prediction error
        error = y - output

        # Calculate error term for the output unit
        output_error_term = error * output * (1 - output)

        ## propagate errors to hidden layer

        # Calculate the hidden layer's contribution to the error
        hidden_error = np.dot(output_error_term, weights_hidden_output)

        # Calculate the error term for the hidden layer
        hidden_error_term = hidden_error * hidden_output * (1 - hidden_output)

        # Update the change in weights
        del_w_hidden_output += output_error_term * hidden_output
        del_w_input_hidden += hidden_error_term * x[:, None]  # column vector of x

    # Update weights
    weights_input_hidden += learnrate * del_w_input_hidden / n_records
    weights_hidden_output += learnrate * del_w_hidden_output / n_records

    # Printing out the mean square error on the training set
    if e % (epochs // 10) == 0:
        # Use the whole training set here, not the last record x from the loop
        hidden_output = sigmoid(np.dot(features, weights_input_hidden))
        out = sigmoid(np.dot(hidden_output,
                             weights_hidden_output))
        loss = np.mean((out - targets) ** 2)

        if last_loss is not None and last_loss < loss:
            print("Train loss: ", loss, " WARNING - Loss Increasing")
        else:
            print("Train loss: ", loss)
        last_loss = loss

# Calculate accuracy on test data
hidden = sigmoid(np.dot(features_test, weights_input_hidden))
out = sigmoid(np.dot(hidden, weights_hidden_output))
predictions = out > 0.5
accuracy = np.mean(predictions == targets_test)
print("Prediction accuracy: {:.3f}".format(accuracy))
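
One way to gain confidence in the error terms used above is a numerical gradient check. The sketch below is not part of the original code: the helper names `loss`, `backprop_steps`, and `numerical_step` are hypothetical, the data are random toy values, and the per-record loss is assumed to be $\frac{1}{2}(y-\hat y)^2$, whose negative gradient is what the error terms $(y-\hat y)f'(z)$ correspond to. It compares the backpropagation steps with central finite differences.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def loss(x, y, w_ih, w_ho):
    """Assumed per-record loss: 0.5 * (y - y_hat)^2."""
    y_hat = sigmoid(sigmoid(x @ w_ih) @ w_ho)
    return 0.5 * (y - y_hat) ** 2

def backprop_steps(x, y, w_ih, w_ho):
    """Weight steps for one record, computed as in the training loop."""
    hidden_output = sigmoid(x @ w_ih)
    output = sigmoid(hidden_output @ w_ho)
    output_error_term = (y - output) * output * (1 - output)
    hidden_error_term = (output_error_term * w_ho
                         * hidden_output * (1 - hidden_output))
    return hidden_error_term * x[:, None], output_error_term * hidden_output

def numerical_step(f, weights, eps=1e-6):
    """Central-difference estimate of the negative gradient of f."""
    step = np.zeros_like(weights)
    for idx in np.ndindex(weights.shape):
        bump = np.zeros_like(weights)
        bump[idx] = eps
        step[idx] = -(f(weights + bump) - f(weights - bump)) / (2 * eps)
    return step

# Random toy data (an assumption, standing in for one real record)
rng = np.random.default_rng(0)
x = rng.normal(size=3)
y = 1.0
w_ih = rng.normal(scale=0.5, size=(3, 2))
w_ho = rng.normal(scale=0.5, size=2)

bp_ih, bp_ho = backprop_steps(x, y, w_ih, w_ho)
num_ih = numerical_step(lambda w: loss(x, y, w, w_ho), w_ih)
num_ho = numerical_step(lambda W: loss(x, y, w_ih, W), w_ho)

max_diff = max(np.abs(bp_ih - num_ih).max(), np.abs(bp_ho - num_ho).max())
print("max difference between backprop and numerical steps:", max_diff)
```

If the two sets of steps agree up to a very small difference, the backward pass is consistent with the assumed loss. A large difference usually points to a sign or indexing mistake in one of the error terms.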
