Logistic Regression

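Logistic regression is one of the most widely used classification algorithms in machine learning. It fits the data to a logistic function (also called the sigmoid function, which is the inverse of the logit function), so that it can predict the probability that an event occurs.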
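
To make the relationship between the logistic function and the logit concrete, here is a minimal sketch in plain Python. The weights, bias, and feature values are made up for illustration and do not come from any trained model.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real z to a probability in (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the equivalent form to avoid overflow in exp(-z).
    ez = math.exp(z)
    return ez / (1.0 + ez)

def logit(p):
    """Log-odds, the inverse of the logistic function, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

# Hypothetical weights for a model with two input features.
weights = [0.8, -0.4]
bias = 0.1
features = [2.0, 1.0]

z = sum(w * x for w, x in zip(weights, features)) + bias  # linear score
p = sigmoid(z)  # predicted probability of the positive class
print(z, p)
```

Applying logit to p recovers the linear score z, which is why the model is linear in the log-odds even though its output is a probability.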

The MNIST Dataset

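The MNIST dataset comes from the National Institute of Standards and Technology (NIST) in the United States. The training set consists of digits handwritten by 250 different people. It is a very well-known handwritten digit recognition dataset, and many tutorials use it as an introductory example for deep learning.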

Storage format
There are four compressed files:
train-images-idx3-ubyte.gz: training set images (9912422 bytes)
train-labels-idx1-ubyte.gz: training set labels (28881 bytes)
t10k-images-idx3-ubyte.gz: test set images (1648877 bytes)
t10k-labels-idx1-ubyte.gz: test set labels (4542 bytes)
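
The "idx3" and "idx1" in the file names refer to the IDX binary format: a 4-byte header (two zero bytes, a data-type code, and the number of dimensions), followed by one 4-byte big-endian size per dimension, followed by the raw values. The sketch below reads such a file using only the standard library. The function name read_idx is a hypothetical helper, not part of TensorFlow, and it only handles the unsigned-byte type used by MNIST.

```python
import gzip
import struct

def read_idx(path):
    """Read a gzip-compressed IDX file of unsigned bytes.

    Returns (dims, data): dims is a tuple of dimension sizes and data is
    the raw payload as bytes in row-major order.
    """
    with gzip.open(path, "rb") as f:
        # Header: two zero bytes, a type code, and the number of dimensions.
        zero, dtype_code, ndim = struct.unpack(">HBB", f.read(4))
        if zero != 0 or dtype_code != 0x08:
            raise ValueError("not an unsigned-byte IDX file")
        # Each dimension size is a 4-byte big-endian unsigned integer.
        dims = struct.unpack(">" + "I" * ndim, f.read(4 * ndim))
        data = f.read()
    expected = 1
    for d in dims:
        expected *= d
    if len(data) != expected:
        raise ValueError("payload size does not match header")
    return dims, data
```

Assuming the files have been downloaded into a MNIST_data directory, read_idx("MNIST_data/train-images-idx3-ubyte.gz") should return three dimensions (image count, rows, columns), while a label file returns a single dimension.
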
Number of samples
Training samples: 55000 in total
Validation samples: 5000 in total
Test samples: 10000 in total
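
These counts determine how many parameter updates the training script in this article performs. The short calculation below works that out. It assumes the batch size of 100 and the 25 epochs used by that script, and it relies on the fact that the TensorFlow reader holds the validation samples out of the 60000 images stored in the training file.

```python
# Split sizes quoted above.
num_train = 55000
num_validation = 5000
num_test = 10000

# Hyperparameters used by the training script in this article.
batch_size = 100
training_epochs = 25

# Training and validation samples together make up the images stored in
# train-images-idx3-ubyte.gz.
num_train_file = num_train + num_validation

# Mini-batches per epoch, and parameter updates over the whole run.
batches_per_epoch = num_train // batch_size
total_updates = batches_per_epoch * training_epochs

print(num_train_file, batches_per_epoch, total_updates)  # 60000 550 13750
```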

TensorFlow Implementation

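The code is as follows. It uses the TensorFlow 1.x API (tf.placeholder, tf.Session, and the tensorflow.examples.tutorials module), which is not available in TensorFlow 2.x: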

import tensorflow as tf

# Download the MNIST dataset
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data", one_hot=True)

# Hyperparameters
learning_rate = 0.01
training_epochs = 25
batch_size = 100
display_step = 1

# Model input: 784 is the size of a flattened 28*28 MNIST image
x = tf.placeholder(tf.float32, [None, 784])

# Target labels (one-hot): 10 is the number of classes
y = tf.placeholder(tf.float32, [None, 10])

# Model weights and bias
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

# Model structure: a linear layer followed by softmax
pred = tf.nn.softmax(tf.matmul(x, W) + b) # Softmax

# Use cross entropy as the loss function, summed over the 10 classes
# and averaged over the batch (axis replaces the deprecated reduction_indices)
cost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(pred), axis=1))

# Use gradient descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

# Initialize all variables
init = tf.global_variables_initializer()

# Start training
with tf.Session() as sess:

    # Run the initializer
    sess.run(init)

    # Training loop
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples/batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            # Run the optimizer and fetch the cost of this batch
            _, c = sess.run([optimizer, cost], feed_dict={x: batch_xs,
                                                          y: batch_ys})
            # Accumulate the average loss over the epoch
            avg_cost += c / total_batch
        # Display the result of each epoch
        if (epoch+1) % display_step == 0:
            print("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(avg_cost))

    print("Optimization Finished!")

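    # Test the model: a prediction is correct when the most probable class
    # matches the class marked in the one-hot label
    correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))

    # Compute accuracy (the fraction of correct predictions)
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    print("Accuracy:", accuracy.eval({x: mnist.test.images, y: mnist.test.labels}))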

Output:

Epoch: 0001 cost= 1.183872078
Epoch: 0002 cost= 0.665350118
Epoch: 0003 cost= 0.552830602
Epoch: 0004 cost= 0.498699041
Epoch: 0005 cost= 0.465488806
Epoch: 0006 cost= 0.442619649
Epoch: 0007 cost= 0.425471577
Epoch: 0008 cost= 0.412201005
Epoch: 0009 cost= 0.401415385
Epoch: 0010 cost= 0.392391824
Epoch: 0011 cost= 0.384738960
Epoch: 0012 cost= 0.378136856
Epoch: 0013 cost= 0.372445326
Epoch: 0014 cost= 0.367273882
Epoch: 0015 cost= 0.362716155
Epoch: 0016 cost= 0.358604888
Epoch: 0017 cost= 0.354853253
Epoch: 0018 cost= 0.351472244
Epoch: 0019 cost= 0.348347617
Epoch: 0020 cost= 0.345449658
Epoch: 0021 cost= 0.342724947
Epoch: 0022 cost= 0.340273546
Epoch: 0023 cost= 0.337938625
Epoch: 0024 cost= 0.335751063
Epoch: 0025 cost= 0.333709621
Optimization Finished!
Accuracy: 0.9138
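
To show what the softmax, cross-entropy, and gradient descent steps in the script compute, here is a rough sketch in plain Python on a toy problem. It is not the TensorFlow implementation: the 4-feature input, the learning rate of 0.5, and the helper names are all made up for illustration, and the small eps inside the logarithm is an added guard that the script above does not have.

```python
import math

def softmax(logits):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, one_hot, eps=1e-12):
    """-sum(y * log(p)); eps guards against log(0)."""
    return -sum(y * math.log(p + eps) for p, y in zip(probs, one_hot))

def sgd_step(W, b, x, y, lr):
    """One gradient descent update on a single sample.

    W is a list of rows (one per input feature), b is a list of biases.
    For softmax with cross entropy, d(loss)/d(logit_k) = p_k - y_k.
    Returns the loss measured before the update.
    """
    n_in, n_out = len(W), len(b)
    logits = [sum(x[i] * W[i][k] for i in range(n_in)) + b[k]
              for k in range(n_out)]
    probs = softmax(logits)
    grad = [probs[k] - y[k] for k in range(n_out)]
    for i in range(n_in):
        for k in range(n_out):
            W[i][k] -= lr * x[i] * grad[k]
    for k in range(n_out):
        b[k] -= lr * grad[k]
    return cross_entropy(probs, y)

# Toy problem: 4 input features and 10 classes (values are made up).
n_in, n_out = 4, 10
W = [[0.0] * n_out for _ in range(n_in)]
b = [0.0] * n_out
x = [0.0, 0.5, 1.0, 0.25]
y = [0.0] * n_out
y[3] = 1.0  # one-hot label for class 3

losses = [sgd_step(W, b, x, y, lr=0.5) for _ in range(5)]
print(losses[0])  # about 2.302585 (ln 10): zero weights give p = 0.1 per class
print(losses[-1])
```

With all-zero weights every class gets probability 0.1, so the very first loss is ln(10). The first-epoch cost in the output above is lower than that because it is averaged over all batches of the epoch, during which the weights are already being updated.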