TensorFlow Study Notes (5): A Convolutional Neural Network (CNN) on MNIST Data

Posted by ad6623 on 2019-07-25 11:26 / 3098 views

Preface

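This article is based on the Tutorial on the official TensorFlow site. The input data is MNIST, short for Modified National Institute of Standards and Technology. It is a dataset of scanned handwritten digits collected by that institute, each paired with a label, and modified so that machine learning algorithms can read it easily. The dataset can be downloaded from the website of the formidable Professor Yann LeCun.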

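An earlier article in this series applied softmax regression to this dataset and reached close to 92% accuracy on the test set. This article uses a convolutional neural network to get a higher accuracy. The theory behind CNNs is covered in a separate article.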

Code

#!/usr/bin/env python
# -*- coding=utf-8 -*-
# @author: 陳水平
# @date: 2017-02-04
# @description: implement a CNN model upon MNIST handwritten digits
# @ref: http://yann.lecun.com/exdb/mnist/

import gzip
import struct
import numpy as np
from sklearn import preprocessing
import tensorflow as tf

# MNIST data is stored in a gzip-compressed binary (IDX) format;
# the following two utility functions convert it into numpy ndarray objects
def read_image(file_name):
    with gzip.open(file_name, "rb") as f:
        buf = f.read()
        index = 0
        # Header: four big-endian unsigned 32-bit integers
        magic, images, rows, columns = struct.unpack_from(">IIII" , buf , index)
        index += struct.calcsize(">IIII")

        image_size = ">" + str(images*rows*columns) + "B"
        ims = struct.unpack_from(image_size, buf, index)
        
        im_array = np.array(ims).reshape(images, rows, columns)
        return im_array

def read_label(file_name):
    with gzip.open(file_name, "rb") as f:
        buf = f.read()
        index = 0
        # Header: two big-endian unsigned 32-bit integers
        magic, labels = struct.unpack_from(">II", buf, index)
        index += struct.calcsize(">II")
        
        label_size = ">" + str(labels) + "B"
        labels = struct.unpack_from(label_size, buf, index)

        label_array = np.array(labels)
        return label_array

print("Start processing MNIST handwritten digits data...")
train_x_data = read_image("MNIST_data/train-images-idx3-ubyte.gz")  # shape: 60000x28x28
train_x_data = train_x_data.reshape(train_x_data.shape[0], train_x_data.shape[1], train_x_data.shape[2], 1).astype(np.float32)
train_y_data = read_label("MNIST_data/train-labels-idx1-ubyte.gz")  
test_x_data = read_image("MNIST_data/t10k-images-idx3-ubyte.gz")  # shape: 10000x28x28
test_x_data = test_x_data.reshape(test_x_data.shape[0], test_x_data.shape[1], test_x_data.shape[2], 1).astype(np.float32)
test_y_data = read_label("MNIST_data/t10k-labels-idx1-ubyte.gz")

train_x_minmax = train_x_data / 255.0
test_x_minmax = test_x_data / 255.0

# Of course you can also use the utility function to read in MNIST provided by tensorflow
# from tensorflow.examples.tutorials.mnist import input_data
# mnist = input_data.read_data_sets("MNIST_data/", one_hot=False)
# train_x_minmax = mnist.train.images
# train_y_data = mnist.train.labels
# test_x_minmax = mnist.test.images
# test_y_data = mnist.test.labels

# Reformat y into one-hot encoding style
lb = preprocessing.LabelBinarizer()
lb.fit(train_y_data)
train_y_data_trans = lb.transform(train_y_data)
test_y_data_trans = lb.transform(test_y_data)

print("Start evaluating CNN model by tensorflow...")

# Model input
x = tf.placeholder(tf.float32, shape=[None, 28, 28, 1])
y_ = tf.placeholder(tf.float32, [None, 10])

# Weight initialization
def weight_variable(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

# Convolution and Pooling
def conv2d(x, W):
    # `tf.nn.conv2d()` computes a 2-D convolution given 4-D `input` and `filter` tensors
    # input tensor shape: `[batch, in_height, in_width, in_channels]`, where batch is the number of observations
    # filter tensor shape: `[filter_height, filter_width, in_channels, out_channels]`
    # strides: the stride of the sliding window for each dimension of the input
    # padding: "SAME" or "VALID", determines the type of padding algorithm to use
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding="SAME")

def max_pool_2x2(x):
    # `tf.nn.max_pool` performs the max pooling on the input
    #  ksize: the size of the window for each dimension of the input tensor.
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")


# First convolutional layer
# Convolution: compute 32 features for each 5x5 patch
# Max pooling: reduce image size to 14x14.
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])

h_conv1 = tf.nn.relu(conv2d(x,  W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

# Second convolutional layer
# Convolution: compute 64 features for each 5x5 patch
# Max pooling: reduce image size to 7x7
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])

h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

# Densely connected layer
# Fully-connected layer with 1024 neurons
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])

h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

# Dropout
# To reduce overfitting, we apply dropout before the readout layer.
keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

# Readout layer
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])

y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2

# Train and evaluate
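# Pass labels and logits by keyword: TensorFlow 1.x rejects positional arguments for this op
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv))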
# y = tf.nn.softmax(y_conv)
# loss = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
optimizer = tf.train.AdamOptimizer(1e-4)
# optimizer = tf.train.GradientDescentOptimizer(1e-4)
train = optimizer.minimize(loss)

correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# tf.initialize_all_variables() is deprecated in favor of tf.global_variables_initializer()
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

for step in range(20000):
    sample_index = np.random.choice(train_x_minmax.shape[0], 50)
    batch_xs = train_x_minmax[sample_index, :]
    batch_ys = train_y_data_trans[sample_index, :]
    if step % 100 == 0:
        train_accuracy = sess.run(accuracy, feed_dict={
            x: batch_xs, y_: batch_ys, keep_prob: 1.0})
        print("step %d, training accuracy %g" % (step, train_accuracy))
    sess.run(train, feed_dict={x: batch_xs, y_: batch_ys, keep_prob: 0.5})

print("test accuracy %g" % sess.run(accuracy, feed_dict={
    x: test_x_minmax, y_: test_y_data_trans, keep_prob: 1.0}))

The output is as follows:

Start processing MNIST handwritten digits data...
Start evaluating CNN model by tensorflow...
step 0, training accuracy 0.1
step 100, training accuracy 0.82
step 200, training accuracy 0.92
step 300, training accuracy 0.96
step 400, training accuracy 0.92
step 500, training accuracy 0.92
step 600, training accuracy 1
step 700, training accuracy 0.94
step 800, training accuracy 0.96
step 900, training accuracy 0.96
step 1000, training accuracy 0.94
step 1100, training accuracy 0.98
step 1200, training accuracy 0.94
step 1300, training accuracy 0.98
step 1400, training accuracy 0.96
step 1500, training accuracy 1
...
step 15700, training accuracy 1
step 15800, training accuracy 0.98
step 15900, training accuracy 1
step 16000, training accuracy 1
step 16100, training accuracy 1
step 16200, training accuracy 1
step 16300, training accuracy 1
step 16400, training accuracy 1
step 16500, training accuracy 1
step 16600, training accuracy 1
step 16700, training accuracy 1
step 16800, training accuracy 1
step 16900, training accuracy 1
step 17000, training accuracy 1
step 17100, training accuracy 1
step 17200, training accuracy 1
step 17300, training accuracy 1
step 17400, training accuracy 1
step 17500, training accuracy 1
step 17600, training accuracy 1
step 17700, training accuracy 1
step 17800, training accuracy 1
step 17900, training accuracy 1
step 18000, training accuracy 1
step 18100, training accuracy 1
step 18200, training accuracy 1
step 18300, training accuracy 1
step 18400, training accuracy 1
step 18500, training accuracy 1
step 18600, training accuracy 1
step 18700, training accuracy 1
step 18800, training accuracy 1
step 18900, training accuracy 1
step 19000, training accuracy 1
step 19100, training accuracy 1
step 19200, training accuracy 1
step 19300, training accuracy 1
step 19400, training accuracy 1
step 19500, training accuracy 1
step 19600, training accuracy 1
step 19700, training accuracy 1
step 19800, training accuracy 1
step 19900, training accuracy 1
test accuracy 0.9929
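
The 14x14 and 7x7 sizes mentioned in the code comments follow from how "SAME" padding behaves: the output side length is ceil(input / stride). Here is a minimal sketch of that arithmetic in plain Python; `same_out` is a hypothetical helper name used only for illustration, not a TensorFlow function.

```python
import math

def same_out(size, stride):
    # With padding="SAME", the output side length is ceil(input / stride).
    return int(math.ceil(size / float(stride)))

side = 28
side = same_out(side, 1)   # conv1, stride 1: stays 28
side = same_out(side, 2)   # pool1, stride 2
after_pool1 = side
side = same_out(side, 1)   # conv2, stride 1: unchanged
side = same_out(side, 2)   # pool2, stride 2
after_pool2 = side

# Width of the flattened tensor fed to the dense layer (64 channels after conv2)
flat_features = after_pool2 * after_pool2 * 64
print(after_pool1, after_pool2, flat_features)
```

This is why the dense layer's weight matrix has 7 * 7 * 64 rows.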

Reflections

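Parameter count: the first convolutional layer has 5x5x1x32 = 800 parameters, the second convolutional layer has 5x5x32x64 = 51200, the third layer (fully connected) has 7x7x64x1024 = 3211264, and the fourth layer (output) has 1024x10 = 10240. The total is on the order of 3.3 million parameters, counting weights only and leaving out the bias terms. Training on a single machine takes about 30 minutes.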
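
The figures above count weights only. As a quick check, the sketch below recomputes them from the shapes in the model definition and also adds the bias terms; the `layers` table and the `prod` helper are written here for illustration.

```python
# (name, weight shape, bias length), copied from the model definition in the code
layers = [
    ("conv1", (5, 5, 1, 32), 32),
    ("conv2", (5, 5, 32, 64), 64),
    ("fc1", (7 * 7 * 64, 1024), 1024),
    ("fc2", (1024, 10), 10),
]

def prod(shape):
    # Product of all dimensions in a shape tuple
    n = 1
    for d in shape:
        n *= d
    return n

weight_total = sum(prod(shape) for _, shape, _ in layers)
bias_total = sum(bias for _, _, bias in layers)

for name, shape, bias in layers:
    print("%s: %d weights, %d biases" % (name, prod(shape), bias))
print("total: %d weights + %d biases = %d" % (
    weight_total, bias_total, weight_total + bias_total))
```

The biases add only about a thousand parameters, so the total stays at roughly 3.3 million, almost all of it in the first fully connected layer.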

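On the optimization algorithm: stochastic gradient descent needs a learning rate that shrinks over time, because sampling examples at random introduces noise, so the stochastic gradient is still non-zero even at the minimum. Batch gradient descent does not have this problem: at the minimum the gradient of the loss function becomes 0, so batch gradient descent can use a fixed learning rate. Several variant algorithms smooth the updates or adapt the effective learning rate during training: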
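
The point about noise at the minimum can be checked on a toy problem. The sketch below assumes a made-up one-dimensional loss, 0.5 * (w - a_i)**2 per sample, whose average is minimized at the mean of the data. It is only an illustration and is unrelated to the MNIST model.

```python
import numpy as np

rng = np.random.RandomState(0)
a = rng.normal(size=1000)      # toy data; per-sample loss is 0.5 * (w - a_i)**2
w_star = a.mean()              # exact minimizer of the average loss

# Gradient of the average loss over the whole data set, evaluated at the minimum
full_grad = np.mean(w_star - a)

# Gradient estimated from a random mini-batch of 50 samples at the same point
batch_idx = rng.choice(a.shape[0], 50)
mini_grad = np.mean(w_star - a[batch_idx])

print("full-batch gradient at the minimum: %g" % full_grad)
print("mini-batch gradient at the minimum: %g" % mini_grad)
```

The full-batch gradient vanishes up to rounding error, while the mini-batch estimate does not, so a fixed step size keeps pushing the parameter around the minimum.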

Momentum: the momentum algorithm accumulates an exponentially decaying moving average of past gradients and continues to move in their direction.
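
A minimal NumPy sketch of the momentum update, under assumed hyperparameters (`lr=0.1`, `alpha=0.9`) and a toy objective f(w) = 0.5 * w**2. The function name `momentum_step` is hypothetical.

```python
import numpy as np

def momentum_step(w, v, grad, lr=0.1, alpha=0.9):
    # v is the velocity: a decaying accumulation of past (negative) gradients
    v = alpha * v - lr * grad
    return w + v, v

# Toy objective f(w) = 0.5 * w**2, whose gradient is simply w
w, v = np.array([5.0]), np.zeros(1)
for _ in range(200):
    w, v = momentum_step(w, v, grad=w)

print("w after 200 steps: %g" % w[0])
```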

AdaGrad: AdaGrad adapts the learning rates of all model parameters by scaling them inversely proportional to the square root of the sum of all their historical squared gradient values. But accumulating squared gradients from the beginning of training can result in a premature and excessive decrease in the effective learning rate.
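
The shrinking effective learning rate can be seen in a small sketch. It assumes the same toy objective f(w) = 0.5 * w**2 and arbitrary values for `lr` and `eps`; `adagrad_step` is a hypothetical name.

```python
import numpy as np

def adagrad_step(w, r, grad, lr=0.5, eps=1e-7):
    r = r + grad * grad                       # running sum of squared gradients
    w = w - lr * grad / (eps + np.sqrt(r))    # per-parameter scaled step
    return w, r

lr, eps = 0.5, 1e-7
w, r = 5.0, 0.0
effective_rates = []
for _ in range(20):
    grad = w                                  # gradient of 0.5 * w**2
    w, r = adagrad_step(w, r, grad, lr=lr, eps=eps)
    effective_rates.append(lr / (eps + np.sqrt(r)))

print("first effective rate: %g" % effective_rates[0])
print("last effective rate:  %g" % effective_rates[-1])
```

Because `r` only ever grows, the effective rate can only fall, which is the premature decrease described above.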

RMSProp: AdaGrad is designed to converge rapidly when applied to a convex function. When applied to a non-convex function to train a neural network, the learning trajectory may pass through many different structures and eventually arrive at a region that is a locally convex bowl. AdaGrad shrinks the learning rate according to the entire history of the squared gradient and may have made the learning rate too small before arriving at such a convex structure.
RMSProp uses an exponentially decaying average to discard history from the extreme past so that it can converge rapidly after finding a convex bowl, as if it were an instance of the AdaGrad algorithm initialized within that bowl.
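
To illustrate how the decaying average discards old history, the sketch below feeds the same made-up gradient sequence (a few large gradients followed by many small ones) to an RMSProp accumulator and to an AdaGrad-style running sum. The hyperparameters and the name `rmsprop_step` are assumptions for illustration.

```python
import numpy as np

def rmsprop_step(w, r, grad, lr=0.001, rho=0.9, eps=1e-6):
    r = rho * r + (1.0 - rho) * grad * grad   # decaying average: old history fades
    w = w - lr * grad / np.sqrt(eps + r)
    return w, r

# Ten large gradients early on, then a hundred small ones
grads = [100.0] * 10 + [1.0] * 100

w, r_rms, r_ada = 0.0, 0.0, 0.0
for g in grads:
    w, r_rms = rmsprop_step(w, r_rms, g)
    r_ada += g * g                            # AdaGrad keeps the full sum

print("RMSProp accumulator: %g" % r_rms)
print("AdaGrad accumulator: %g" % r_ada)
```

After the small-gradient phase the RMSProp accumulator has nearly forgotten the early large gradients, while the AdaGrad sum is still dominated by them.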

Adam: The name "Adam" derives from the phrase "adaptive moments." It is a variant on the combination of RMSProp and momentum with a few important distinctions. Adam is generally regarded as being fairly robust to the choice of hyperparameters, though the learning rate sometimes needs to be changed from the suggested default.
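
A minimal NumPy sketch of one Adam update, combining a momentum-style first moment with an RMSProp-style second moment plus bias correction. The defaults below are the ones suggested in the Adam paper (the code in this article passes 1e-4 as the learning rate instead); `adam_step` is a hypothetical name and the objective is again the toy f(w) = 0.5 * w**2.

```python
import numpy as np

def adam_step(w, m, v, grad, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1.0 - beta1) * grad           # first moment (momentum-like)
    v = beta2 * v + (1.0 - beta2) * grad * grad    # second moment (RMSProp-like)
    m_hat = m / (1.0 - beta1 ** t)                 # bias correction for zero init
    v_hat = v / (1.0 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

w, m, v = 5.0, 0.0, 0.0
w, m, v = adam_step(w, m, v, grad=w, t=1)
first_step = 5.0 - w                               # size of the very first update

for t in range(2, 1001):
    w, m, v = adam_step(w, m, v, grad=w, t=t)

print("first step size: %g" % first_step)
print("w after 1000 steps: %g" % w)
```

Thanks to the bias correction, the first update has a magnitude of about `lr` regardless of how large the gradient is.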

If AdamOptimizer is swapped for GradientDescentOptimizer on the MNIST dataset (the commented-out line in the code), the test-set accuracy is 0.9296.

When reposting, please cite this article's address: http://systransis.cn/yun/38394.html
