Softmax classification: Multinomial classification

  • Prediction when there are multiple classes

Logistic regression review

  • Start from the linear hypothesis H(X) = WX
  • Let z = WX and look for a function g(z) that can squash the value toward 0 or 1
  • Use the sigmoid as g(z)
  • Final hypothesis: H(X) = g(WX) (see the sketch below)
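
A minimal NumPy sketch of this hypothesis; the weight and input values below are made up purely for illustration:

    import numpy as np

    def sigmoid(z):
        # g(z) = 1 / (1 + e^-z) squashes any real score into (0, 1)
        return 1.0 / (1.0 + np.exp(-z))

    W = np.array([0.5, -1.0, 2.0])  # illustrative weights
    x = np.array([1.0, 2.0, 0.5])   # illustrative input
    z = W.dot(x)                    # linear part: WX
    print(sigmoid(z))               # H(x) = g(WX), a value in (0, 1)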

Multinomial classification

  • The classes can be separated with binary classifiers, one per class
  • Rather than keeping a separate classifier for each class, stacking their weight vectors into one matrix computes everything in a single step (see the sketch below)
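
A sketch of the idea with three classes and three features; the numbers are made up. Each column of W is one binary classifier's weight vector, so one matrix multiply XW yields all three class scores at once:

    import numpy as np

    # one weight column per class (A, B, C); values are illustrative
    W = np.array([[ 0.1, -0.2,  0.3],
                  [ 0.5,  0.4, -0.1],
                  [-0.3,  0.2,  0.6]])
    x = np.array([[1.0, 2.0, 0.5]])  # one sample, shape (1, 3)

    scores = x.dot(W)                # XW: all three class scores in one multiply
    print(scores)                    # shape (1, 3), one raw score per class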

Sigmoid

  • Squash each score into the 0–1 range with the sigmoid
  • One-hot encoding: select the class by turning the largest value into 1.0 and the rest into 0 (see the sketch below)
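
A quick sketch of that one-hot selection; the per-class probabilities are made up:

    import numpy as np

    probs = np.array([0.7, 0.2, 0.9])  # per-class sigmoid outputs (illustrative)
    one_hot = np.zeros_like(probs)
    one_hot[np.argmax(probs)] = 1.0    # largest value becomes 1.0, the rest 0
    print(one_hot)                     # [0. 0. 1.]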

Softmax

  • Makes the whole output sum to 1
  • Scores -> probabilities (see the sketch below)
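
The softmax is S(y_i) = e^{y_i} / Σ_j e^{y_j}. A minimal NumPy version (subtracting the max is a standard numerical-stability trick and does not change the result):

    import numpy as np

    def softmax(scores):
        e = np.exp(scores - np.max(scores))  # shift for numerical stability
        return e / e.sum()                   # normalize so the outputs sum to 1

    print(softmax(np.array([2.0, 1.0, 0.1])))  # ~[0.66 0.24 0.10], sums to 1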

Cost function

  • Cross-entropy
  • Uses an element-wise multiplication of the one-hot label with the log of the prediction (see the sketch below)
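
Cross-entropy between the softmax output S and the one-hot label L is D(S, L) = -Σ_i L_i log(S_i); the element-wise product zeroes out everything except the log-probability of the true class. A sketch with made-up values:

    import numpy as np

    S = np.array([0.7, 0.2, 0.1])  # softmax output (illustrative)
    L = np.array([1.0, 0.0, 0.0])  # one-hot label: the true class is the first

    cost = -np.sum(L * np.log(S))  # element-wise multiply, then sum
    print(cost)                    # -log(0.7) ≈ 0.357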

Homework: the logistic cost function and the cross-entropy cost function turn out to be the same; think about why.
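
A hint, not the full answer: with two classes, the one-hot label is [1-y, y] and the prediction is [1-h, h], so the cross-entropy expands to exactly the logistic cost -y·log(h) - (1-y)·log(1-h). A quick numerical check with arbitrary values:

    import numpy as np

    y, h = 1.0, 0.8  # binary label and predicted P(y=1); values are arbitrary
    logistic = -(y * np.log(h) + (1 - y) * np.log(1 - h))

    L = np.array([1 - y, y])  # the same label, one-hot encoded
    S = np.array([1 - h, h])  # the same prediction as a distribution
    cross_entropy = -np.sum(L * np.log(S))

    print(logistic, cross_entropy)  # both ≈ 0.2231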

Gradient descent

  • Find the weight vector W that minimizes the cost function C
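
The full TensorFlow (1.x) example below puts all of this together: a softmax hypothesis, a cross-entropy cost, and gradient descent on W.
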
import tensorflow as tf
import numpy as np
import os

print(os.getcwd())
# load the training data: the first 3 rows are the input features,
# the remaining 3 rows the one-hot labels; transpose to (samples, features)
xy = np.loadtxt('softmax.in', unpack=True, dtype='float')
x_data = np.transpose(xy[0:3])
y_data = np.transpose(xy[3:])

# input placeholders: 3 features per sample, 3-class one-hot labels
X = tf.placeholder("float", [None, 3])
Y = tf.placeholder("float", [None, 3])
# set model weights
W = tf.Variable(tf.zeros([3, 3]))

# Construct model
# x_data is transposed when loaded above because the hypothesis computes XW rather than WX
hypothesis = tf.nn.softmax(tf.matmul(X, W))

# learning rate
learning_rate = 0.001

# Cross entropy
cost = tf.reduce_mean(-tf.reduce_sum(Y * tf.log(hypothesis), axis=1))

# Gradient descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

# Init
init = tf.global_variables_initializer()

# Launch
with tf.Session() as sess:
    sess.run(init)

    for step in range(2001):
        sess.run(optimizer, feed_dict={X: x_data, Y: y_data})
        if step % 200 == 0:
            print(step, sess.run(cost, feed_dict={X: x_data, Y: y_data}), sess.run(W))

    # Test: classify a few new inputs
    a = sess.run(hypothesis, feed_dict={X: [[1, 11, 7]]})
    print(a, sess.run(tf.argmax(a, 1)))  # class probabilities and predicted class

    b = sess.run(hypothesis, feed_dict={X: [[1, 3, 4]]})
    print(b, sess.run(tf.argmax(b, 1)))

    c = sess.run(hypothesis, feed_dict={X: [[1, 1, 0]]})
    print(c, sess.run(tf.argmax(c, 1)))

    # all three at once; 'all_preds' avoids shadowing the built-in all()
    all_preds = sess.run(hypothesis, feed_dict={X: [[1, 11, 7], [1, 3, 4], [1, 1, 0]]})
    print(all_preds, sess.run(tf.argmax(all_preds, 1)))

Output

    /Users/bruc2kim/PycharmProjects/b2yond
    0 1.09774 [[ -8.33333252e-05   4.16666626e-05   4.16666480e-05]
     [  1.66666694e-04   2.91666773e-04  -4.58333408e-04]
     [  1.66666636e-04   4.16666706e-04  -5.83333429e-04]]
    200 1.05962 [[-0.02051384 -0.00103983  0.02155367]
     [ 0.01406438  0.01097753 -0.02504191]
     [ 0.01431208  0.03574873 -0.05006079]]
    400 1.04985 [[-0.04282598 -0.00625899  0.04908497]
     [ 0.01747187  0.00156368 -0.01903554]
     [ 0.01831204  0.04954104 -0.06785304]]
    600 1.0407 [[-0.06517859 -0.01176361  0.07694222]
     [ 0.01943521 -0.00848972 -0.01094548]
     [ 0.0211558   0.06118308 -0.0823388 ]]
    800 1.03194 [[-0.08734013 -0.01729389  0.10463405]
     [ 0.0211172  -0.01796171 -0.00315548]
     [ 0.02396628  0.07198346 -0.09594974]]
    1000 1.02354 [[-0.10928574 -0.02282181  0.13210757]
     [ 0.02266254 -0.02678034  0.00411783]
     [ 0.02685853  0.08210357 -0.10896212]]
    1200 1.01547 [[-0.13101467 -0.02834092  0.15935561]
     [ 0.02409703 -0.03497621  0.01087923]
     [ 0.029832    0.091598   -0.12143001]]
    1400 1.0077 [[-0.15252906 -0.03384715  0.18637618]
     [ 0.02543302 -0.04258838  0.01715542]
     [ 0.03287466  0.10050805 -0.13338271]]
    1600 1.00021 [[-0.17383142 -0.03933692  0.21316831]
     [ 0.02668081 -0.04965464  0.0229739 ]
     [ 0.03597427  0.10887136 -0.14484565]]
    1800 0.992979 [[-0.19492456 -0.04480688  0.23973146]
     [ 0.02784993 -0.05621058  0.02836074]
     [ 0.03911954  0.11672314 -0.15584265]]
    2000 0.985988 [[-0.21581143 -0.05025396  0.26606542]
     [ 0.02894915 -0.06228962  0.03334056]
     [ 0.04230019  0.12409624 -0.16639642]]
    [[ 0.46272627  0.35483006  0.18244371]] [0]
    [[ 0.33820099  0.42101386  0.24078514]] [1]
    [[ 0.27002314  0.29085544  0.4391214 ]] [2]
    [[ 0.46272627  0.35483006  0.18244371]
     [ 0.33820099  0.42101386  0.24078514]
     [ 0.27002314  0.29085544  0.4391214 ]] [0 1 2]