Deep MNIST for Experts:
TensorFlow is a powerful library for doing large-scale numerical computation. One of the tasks at which it excels is implementing and training deep neural networks. In this tutorial we will learn the basic building blocks of a TensorFlow model while constructing a deep convolutional MNIST classifier.
This introduction assumes familiarity with neural networks and the MNIST dataset. If you don’t have a background with them, check out the introduction for beginners. Be sure to install TensorFlow before starting.
About this tutorial:
The first part of this tutorial explains what is happening in the mnist_softmax.py code, which is a basic implementation of a TensorFlow model. The second part shows some ways to improve the accuracy.
You can copy and paste each code snippet from this tutorial into a Python environment, or you can choose to just read through the code.
What we will accomplish in this tutorial:
a) Create a softmax regression function that is a model for recognizing MNIST digits, based on looking at every pixel in the image
b) Use TensorFlow to train the model to recognize digits by having it “l(fā)ook” at thousands of examples (and run our first TensorFlow session to do so)
c) Check the model’s accuracy with our test data
d) Build, train, and test a multilayer convolutional neural network to improve the results
Setup:
Before we create our model, we will first load the MNIST dataset, and start a TensorFlow session.
Load MNIST Data:
If you are copying and pasting in the code from this tutorial, start here with these two lines of code which will download and read in the data automatically:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
Here mnist is a lightweight class which stores the training, validation, and testing sets as NumPy arrays. It also provides a function for iterating through data minibatches, which we will use below.
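For example (a small sketch, not part of the original tutorial), one call to the minibatch function returns a tuple of NumPy arrays:

batch_xs, batch_ys = mnist.train.next_batch(100)
print(batch_xs.shape)  # (100, 784) -- 100 flattened 28x28 images
print(batch_ys.shape)  # (100, 10)  -- 100 one-hot labels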
Start TensorFlow InteractiveSession:
TensorFlow relies on a highly efficient C++ backend to do its computation. The connection to this backend is called a session. The common usage for TensorFlow programs is to first create a graph and then launch it in a session.
Here we instead use the convenient InteractiveSession class, which makes TensorFlow more flexible about how you structure your code. It allows you to interleave operations which build a computation graph with ones that run the graph. This is particularly convenient when working in interactive contexts like IPython. If you are not using an InteractiveSession, then you should build the entire computation graph before starting a session and launching the graph.
import tensorflow as tf
sess = tf.InteractiveSession()
Computation Graph:
To do efficient numerical computing in Python, we typically use libraries like NumPy that do expensive operations such as matrix multiplication outside Python, using highly efficient code implemented in another language. Unfortunately, there can still be a lot of overhead from switching back to Python every operation. This overhead is especially bad if you want to run computations on GPUs or in a distributed manner, where there can be a high cost to transferring data.
TensorFlow also does its heavy lifting outside Python, but it takes things a step further to avoid this overhead. Instead of running a single expensive operation independently from Python, TensorFlow lets us describe a graph of interacting operations that run entirely outside Python. This approach is similar to that used in Theano or Torch.
The role of the Python code is therefore to build this external computation graph, and to dictate which parts of the computation graph should be run. See the Computation Graph section of Basic Usage (https://www.tensorflow.org/versions/r0.12/get_started/basic_usage.html#the-computation-graph) for more detail.
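As a minimal illustration of this pattern (a sketch added for this write-up, reusing the sess created above), the Python lines below only describe operations; the actual multiplication happens in the backend when the graph is run:

node1 = tf.constant(2.0)
node2 = tf.constant(3.0)
product = node1 * node2   # only adds a node to the graph; nothing is computed yet
print(sess.run(product))  # the backend executes the graph and prints 6.0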
Build a Softmax Regression Model:
In this section we will build a softmax regression model with a single linear layer. In the next section, we will extend this to the case of softmax regression with a multilayer convolutional network.
Placeholders:
We start building the computation graph by creating nodes for the input images and target output classes.
x = tf.placeholder(tf.float32, shape=[None, 784])
y_ = tf.placeholder(tf.float32, shape=[None, 10])
Here x and y_ aren’t specific values. Rather, they are each a placeholder – a value that we’ll input when we ask TensorFlow to run a computation.
The input images x will consist of a 2d tensor of floating point numbers. Here we assign it a shape of [None, 784], where 784 is the dimensionality of a single flattened 28 by 28 pixel MNIST image, and None indicates that the first dimension, corresponding to the batch size, can be of any size. The target output classes y_ will also consist of a 2d tensor, where each row is a one-hot 10-dimensional vector indicating which digit class (zero through nine) the corresponding MNIST image belongs to.
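To make the one-hot encoding concrete, a hypothetical label row for the digit 3 (illustration only, not from the tutorial) looks like this:

import numpy as np
label_row = np.zeros(10, dtype=np.float32)  # one row of y_
label_row[3] = 1.0
print(label_row)  # [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]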
The shape argument to placeholder is optional, but it allows TensorFlow to automatically catch bugs stemming from inconsistent tensor shapes.
Variables:
We now define the weights W and biases b for our model. We could imagine treating these like additional inputs, but TensorFlow has an even better way to handle them: Variable. A Variable is a value that lives in TensorFlow’s computation graph. It can be used and even modified by the computation. In machine learning applications, one generally has the model parameters be Variables.
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
We pass the initial value for each parameter in the call to tf.Variable. In this case, we initialize both W and b as tensors full of zeros. W is a 784x10 matrix (because we have 784 input features and 10 outputs) and b is a 10-dimensional vector (because we have 10 classes).
Before Variables can be used within a session, they must be initialized using that session. This step takes the initial values (in this case tensors full of zeros) that have already been specified, and assigns them to each Variable. This can be done for all Variables at once:
sess.run(tf.global_variables_initializer())
Predicted Class and Loss Function:
We can now implement our regression model. It only takes one line! We multiply the vectorized input images x by the weight matrix W, add the bias b.
y = tf.matmul(x, W) + b
We can specify a loss function just as easily. Loss indicates how bad the model’s prediction was on a single example; we try to minimize that while training across all the examples. Here, our loss function is the cross-entropy between the target and the softmax activation function applied to the model’s prediction. As in the beginners tutorial, we use the stable formulation:
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
Note that tf.nn.softmax_cross_entropy_with_logits internally applies the softmax on the model’s unnormalized prediction and sums across all classes, and tf.reduce_mean takes the average over these sums.
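For reference, a sketch (using the names above; not shown in this tutorial) of the more direct but numerically less stable formulation the note warns about; it computes the same quantity as long as no softmax output underflows to 0:

y_softmax = tf.nn.softmax(y)
cross_entropy_naive = tf.reduce_mean(
    -tf.reduce_sum(y_ * tf.log(y_softmax), reduction_indices=[1]))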
Train the Model:
Now that we have defined our model and training loss function, it is straightforward to train using TensorFlow. Because TensorFlow knows the entire computation graph, it can use automatic differentiation to find the gradients of the loss with respect to each of the variables. TensorFlow has a variety of built-in optimization algorithms. For this example, we will use steepest gradient descent, with a step length of 0.5, to descend the cross entropy.
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
What TensorFlow actually did in that single line was to add new operations to the computation graph. These operations included ones to compute gradients, compute parameter update steps, and apply update steps to the parameters.
The returned operation train_step, when run, will apply the gradient descent updates to the parameters. Training the model can therefore be accomplished by repeatedly running train_step.
for i in range(1000):
  batch = mnist.train.next_batch(100)
  train_step.run(feed_dict={x: batch[0], y_: batch[1]})
We load 100 training examples in each training iteration. We then run the train_step operation, using feed_dict to replace the placeholder tensors x and y_ with the training examples. Note that you can replace any tensor in your computation graph using feed_dict – it’s not restricted to just placeholders.
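As a small sketch of that flexibility (the fake_logits and fake_label values are hypothetical, not from the tutorial), you can feed a value directly for the intermediate tensor y, bypassing the matmul entirely:

import numpy as np
fake_logits = np.zeros((1, 10), dtype=np.float32)  # pretend model output for one example
fake_label = np.zeros((1, 10), dtype=np.float32)
fake_label[0, 7] = 1.0                              # one-hot label for the digit 7
print(cross_entropy.eval(feed_dict={y: fake_logits, y_: fake_label}))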
Evaluate the Model:
How well did our model do?
First we’ll figure out where we predicted the correct label. tf.argmax is an extremely useful function which gives you the index of the highest entry in a tensor along some axis. For example, tf.argmax(y,1) is the label our model thinks is most likely for each input, while tf.argmax(y_,1) is the true label. We can use tf.equal to check if our prediction matches the truth.
Since each label is a one-hot vector of 0s and a single 1, the index of that 1 is the true class, so matching argmax indices means the prediction is correct.
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
That gives us a list of booleans. To determine what fraction are correct, we cast to floating point numbers and then take the mean. For example, [True, False, True, True] would become [1,0,1,1] which would become 0.75.
accuracy = tf.reduce_mean(tf.cast(correct_prediction,tf.float32))
Finally, we can evaluate our accuracy on the test data. This should be about 92% correct.
print(accuracy.eval(feed_dict={x:mnist.test.images,y_:mnist.test.labels}))
Build a Multilayer Convolutional Network:
Getting 92% accuracy on MNIST is bad. It’s almost embarrassingly bad. In this section, we’ll fix that, jumping from a very simple model to something moderately sophisticated: a small convolutional neural network. This will get us to around 99.2% accuracy – not state of the art, but respectable.
Weight Initialization
To create this model, we’re going to need to create a lot of weights and biases. One should generally initialize weights with a small amount of noise for symmetry breaking, and to prevent 0 gradients. Since we’re using ReLU neurons, it is also good practice to initialize them with a slightly positive initial bias to avoid “dead neurons”. Instead of doing this repeatedly while we build the model, let’s create two handy functions to do it for us.
def weight_variable(shape):
  initial = tf.truncated_normal(shape, stddev=0.1)
  return tf.Variable(initial)

def bias_variable(shape):
  initial = tf.constant(0.1, shape=shape)
  return tf.Variable(initial)
Convolution and Pooling
TensorFlow also gives us a lot of flexibility in convolution and pooling operations. How do we handle the boundaries? What is our stride size? In this example, we’re always going to choose the vanilla version. Our convolutions use a stride of one and are zero padded so that the output is the same size as the input. Our pooling is plain old max pooling over 2x2 blocks. To keep our code cleaner, let’s also abstract those operations into functions.
def conv2d(x, W):
  return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
  return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
First Convolutional Layer
We can now implement our first layer. It will consist of convolution, followed by max pooling. The convolution will compute 32 features for each 5x5 patch. Its weight tensor will have a shape of [5, 5, 1, 32]. The first two dimensions are the patch size, the next is the number of input channels, and the last is the number of output channels. We will also have a bias vector with a component for each output channel.
(Note: the 32 output channels can be thought of as 32 stacked feature maps, in the same way an RGB image stacks 3 channels along the depth axis. The shape [5, 5, 1, 32] therefore describes a 5x5 kernel that maps a 1-channel image to 32 channels. With stride 1 and 'VALID' padding a 28x28x1 input would shrink to 24x24x32, but with the 'SAME' padding used in this tutorial the spatial size stays 28x28. A 1x1 kernel such as [1, 1, 1024, 32] leaves the spatial size unchanged and only reduces the channel dimension, here from 1024 to 32.)
Note the tensor layouts that the convolution expects:
a) input shape: [batch, in_height, in_width, in_channels]
b) filter shape: [filter_height, filter_width, in_channels, out_channels]
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
To apply the layer, we first reshape x to a 4d tensor, with the second and third dimensions corresponding to image width and height, and the final dimension corresponding to the number of color channels.
(Because MNIST images are grayscale, the number of color channels here is 1; for an RGB image it would be 3.)
x_image = tf.reshape(x, [-1, 28, 28, 1])
# tf.reshape(tensor, shape) rearranges a tensor into the given shape.
# A single -1 in the shape means that dimension is inferred from the total number
# of elements (at most one -1 is allowed); here it becomes the batch size.
We then convolve x_image with the weight tensor, add the bias, apply the ReLU function, and finally max pool. The max_pool_2x2 method will reduce the image size to 14x14.
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)
Second Convolutional Layer:
In order to build a deep network, we stack several layers of this type. The second layer will have 64 features for each 5x5 patch.
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)
Densely Connected Layer
Now that the image size has been reduced to 7x7, we add a fully-connected layer with 1024 neurons to allow processing on the entire image. We reshape the tensor from the pooling layer into a batch of vectors, multiply by a weight matrix, add a bias, and apply a ReLU.
(Note: with 'SAME' padding the output of a convolution has the same spatial size as its input, whereas 'VALID' padding shrinks it; for example, convolving a 5x5 image with a 2x2 kernel under 'VALID' gives a (5-2+1)x(5-2+1) = 4x4 output. Since the convolutions in this tutorial use 'SAME' padding, only the two 2x2 max-pooling layers reduce the image: 28 -> 14 -> 7.)
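A quick way to check these sizes (a sketch, not part of the original code) is to print the static shapes of the intermediate tensors:

print(h_conv1.get_shape())  # (?, 28, 28, 32) -- 'SAME' convolution keeps 28x28
print(h_pool1.get_shape())  # (?, 14, 14, 32) -- after the first 2x2 max pool
print(h_pool2.get_shape())  # (?, 7, 7, 64)   -- after the second 2x2 max pool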
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
Dropout
To reduce overfitting, we will apply dropout before the readout layer. We create a placeholder for the probability that a neuron’s output is kept during dropout. This allows us to turn dropout on during training, and turn it off during testing. TensorFlow’s tf.nn.dropout op automatically handles scaling neuron outputs in addition to masking them, so dropout just works without any additional scaling.
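The code for this step is not included in the excerpt above; a minimal sketch, applying dropout to the h_fc1 tensor defined earlier, would be:

keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)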
Readout Layer
Finally, we add a layer, just like for the one layer softmax regression above.
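The code for the readout layer is likewise missing from this excerpt; a sketch consistent with the helper functions above (the name y_conv for the convolutional model’s unnormalized logits is an assumption):

W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2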