Preface
This article trains a logistic regression model with tensorflow and compares the result with scikit-learn. The dataset comes from Andrew Ng's online open course Deep Learning.
Code
#!/usr/bin/env python
# -*- coding=utf-8 -*-
# @author: 陳水平
# @date: 2017-01-04
# @description: compare the logistic regression of tensorflow with sklearn based on the exercise of deep learning course of Andrew Ng.
# @ref: http://openclassroom.stanford.edu/MainFolder/DocumentPage.php?course=DeepLearning&doc=exercises/ex4/ex4.html

# Uses the TensorFlow 1.x-style graph API (tf.Session, tf.log).
from __future__ import print_function

import tensorflow as tf
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn import preprocessing

# Read x and y
x_data = np.loadtxt("ex4x.dat").astype(np.float32)
y_data = np.loadtxt("ex4y.dat").astype(np.float32)

# Standardize the features before running gradient descent
scaler = preprocessing.StandardScaler().fit(x_data)
x_data_standard = scaler.transform(x_data)

# We evaluate the x and y by sklearn to get a sense of the coefficients.
# Set C as a large positive number to minimize the regularization effect
reg = LogisticRegression(C=999999999, solver="newton-cg")
reg.fit(x_data, y_data)
print("Coefficients of sklearn: K=%s, b=%f" % (reg.coef_, reg.intercept_[0]))

# Now we use tensorflow to get similar results.
W = tf.Variable(tf.zeros([2, 1]))
b = tf.Variable(tf.zeros([1, 1]))
# Sigmoid of the full logit z = xW + b; the parentheses keep b inside the negation
y = 1 / (1 + tf.exp(-(tf.matmul(x_data_standard, W) + b)))
# Mean negative log-likelihood (binary cross-entropy)
loss = tf.reduce_mean(- y_data.reshape(-1, 1) * tf.log(y) - (1 - y_data.reshape(-1, 1)) * tf.log(1 - y))

optimizer = tf.train.GradientDescentOptimizer(1.3)
train = optimizer.minimize(loss)

init = tf.global_variables_initializer()

sess = tf.Session()
sess.run(init)
for step in range(100):
    sess.run(train)
    if step % 10 == 0:
        print(step, sess.run(W).flatten(), sess.run(b).flatten())

print("Coefficients of tensorflow (input should be standardized): K=%s, b=%s" % (sess.run(W).flatten(), sess.run(b).flatten()))
# Map the coefficients back to the raw (unstandardized) input scale
print("Coefficients of tensorflow (raw input): K=%s, b=%s" % (sess.run(W).flatten() / scaler.scale_, sess.run(b).flatten() - np.dot(scaler.mean_ / scaler.scale_, sess.run(W))))

# Problem solved and we are happy. But...
# I'd like to implement the logistic regression from a multi-class viewpoint instead of binary.
# In machine learning domain, it is called softmax regression
# In economic and statistics domain, it is called multinomial logit (MNL) model, proposed by Daniel McFadden, who shared the 2000 Nobel Memorial Prize in Economic Sciences.

print("------------------------------------------------")
print("We solve this binary classification problem again from the viewpoint of multinomial classification")
print("------------------------------------------------")

# As a tradition, sklearn first
reg = LogisticRegression(C=9999999999, solver="newton-cg", multi_class="multinomial")
reg.fit(x_data, y_data)
print("Coefficients of sklearn: K=%s, b=%f" % (reg.coef_, reg.intercept_[0]))
print("A little bit difference at first glance. What about multiply them with 2?")

# Then try tensorflow
W = tf.Variable(tf.zeros([2, 2]))  # first 2 is feature number, second 2 is class number
b = tf.Variable(tf.zeros([1, 2]))
V = tf.matmul(x_data_standard, W) + b
# tensorflow provide a utility function to calculate the probability of observer n choose alternative i,
# you can replace it with `y = tf.exp(V) / tf.reduce_sum(tf.exp(V), keep_dims=True, reduction_indices=[1])`
y = tf.nn.softmax(V)

# Encode the y label in one-hot manner
lb = preprocessing.LabelBinarizer()
lb.fit(y_data)
y_data_trans = lb.transform(y_data)
y_data_trans = np.concatenate((1 - y_data_trans, y_data_trans), axis=1)  # Only necessary for binary class

# Mean cross-entropy between the one-hot labels and the softmax probabilities
loss = tf.reduce_mean(-tf.reduce_sum(y_data_trans * tf.log(y), reduction_indices=[1]))
optimizer = tf.train.GradientDescentOptimizer(1.3)
train = optimizer.minimize(loss)

init = tf.global_variables_initializer()

sess = tf.Session()
sess.run(init)
for step in range(100):
    sess.run(train)
    if step % 10 == 0:
        print(step, sess.run(W).flatten(), sess.run(b).flatten())

print("Coefficients of tensorflow (input should be standardized): K=%s, b=%s" % (sess.run(W).flatten(), sess.run(b).flatten()))
# W has shape (features, classes): divide each feature row by that feature's scale
print("Coefficients of tensorflow (raw input): K=%s, b=%s" % ((sess.run(W) / scaler.scale_.reshape(-1, 1)).flatten(), sess.run(b).flatten() - np.dot(scaler.mean_ / scaler.scale_, sess.run(W))))
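The final print statements convert coefficients learned on standardized inputs back to the raw input scale. The NumPy sketch below illustrates that algebra on its own; the helper name `unstandardize`, the random inputs, and the example coefficient values are made up for illustration and are not part of the original script.

```python
import numpy as np

def unstandardize(w_std, b_std, mean, scale):
    """Map coefficients fitted on (x - mean) / scale back to raw x.

    w_std: (n_features, n_outputs), b_std: (n_outputs,)
    mean, scale: (n_features,)
    """
    w_raw = w_std / scale[:, None]          # each feature row is divided by its own scale
    b_raw = b_std - (mean / scale) @ w_std  # the shift by the mean is absorbed into the intercept
    return w_raw, b_raw

# Hypothetical inputs: 5 samples, 2 features (not the ex4 data)
rng = np.random.default_rng(0)
x = rng.normal(loc=50.0, scale=10.0, size=(5, 2))
mean, scale = x.mean(axis=0), x.std(axis=0)

# Hypothetical standardized coefficients for a 2-class model
w_std = np.array([[-0.73, 0.73], [-0.78, 0.78]])
b_std = np.array([0.03, -0.03])

w_raw, b_raw = unstandardize(w_std, b_std, mean, scale)
logits_std = ((x - mean) / scale) @ w_std + b_std  # logits from standardized inputs
logits_raw = x @ w_raw + b_raw                     # logits from raw inputs
print(np.allclose(logits_std, logits_raw))
```

Both parameterizations give the same logits, so they describe the same model; only the scale of the coefficients changes.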
Output:
Coefficients of sklearn: K=[[ 0.14834077 0.15890845]], b=-16.378743
0 [ 0.33699557 0.34786162] [ -4.84287721e-09]
10 [ 1.15830743 1.22841871] [ 0.02142336]
20 [ 1.3378191 1.42655993] [ 0.03946959]
30 [ 1.40735555 1.50197577] [ 0.04853692]
40 [ 1.43754184 1.53418231] [ 0.05283691]
50 [ 1.45117068 1.54856908] [ 0.05484771]
60 [ 1.45742035 1.55512536] [ 0.05578374]
70 [ 1.46030474 1.55814099] [ 0.05621871]
80 [ 1.46163988 1.55953443] [ 0.05642065]
90 [ 1.46225858 1.56017959] [ 0.0565144]
Coefficients of tensorflow (input should be standardized): K=[ 1.46252561 1.56045783], b=[ 0.05655487]
Coefficients of tensorflow (raw input): K=[ 0.14831361 0.15888004], b=[-16.26265144]
------------------------------------------------
We solve this binary classification problem again from the viewpoint of multinomial classification
------------------------------------------------
Coefficients of sklearn: K=[[ 0.07417039 0.07945423]], b=-8.189372
A little bit difference at first glance. What about multiply them with 2?
0 [-0.33699557 0.33699557 -0.34786162 0.34786162] [ 6.05359674e-09 -6.05359674e-09]
10 [-0.68416572 0.68416572 -0.72988117 0.72988123] [ 0.02157043 -0.02157041]
20 [-0.72234094 0.72234106 -0.77087188 0.77087194] [ 0.02693938 -0.02693932]
30 [-0.72958517 0.72958535 -0.7784785 0.77847856] [ 0.02802362 -0.02802352]
40 [-0.73103166 0.73103184 -0.77998811 0.77998811] [ 0.02824244 -0.02824241]
50 [-0.73132294 0.73132324 -0.78029168 0.78029174] [ 0.02828659 -0.02828649]
60 [-0.73138171 0.73138207 -0.78035289 0.78035301] [ 0.02829553 -0.02829544]
70 [-0.73139352 0.73139393 -0.78036523 0.78036535] [ 0.02829732 -0.0282972 ]
80 [-0.73139596 0.73139632 -0.78036767 0.78036791] [ 0.02829764 -0.02829755]
90 [-0.73139644 0.73139679 -0.78036815 0.78036839] [ 0.02829781 -0.02829765]
Coefficients of tensorflow (input should be standardized): K=[-0.7313965 0.73139679 -0.78036827 0.78036839], b=[ 0.02829777 -0.02829769]
Coefficients of tensorflow (raw input): K=[-0.07417037 0.07446811 -0.07913655 0.07945422], b=[ 8.1893692 -8.18937111]

Reflections
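For logistic regression, the loss function is somewhat more involved than for linear regression. First, the sigmoid function maps the output of the linear model to a probability between 0 and 1. Next, we write down the probability of observing each sample (its likelihood); the probability of observing all samples is the product of the per-sample probabilities. To make differentiation easier, we take the logarithm of this product: the logarithm is monotonic, so the maximizer is unchanged, and it turns the product into a sum (the derivative of a sum is much simpler than the derivative of a product). Maximum log-likelihood estimation maximizes this quantity. Machine learning conventionally calls its objective a loss, so the loss is defined as the negative log-likelihood, which turns the task into a minimization problem. The code above averages the per-sample terms with tf.reduce_mean, which differs from the sum only by a constant factor.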
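As a rough illustration of the steps just described, the following NumPy sketch computes the mean negative log-likelihood and minimizes it with plain gradient descent. The toy data, the learning rate of 0.5, the iteration count, and the function names are assumptions chosen for the example; they are not taken from the original script.

```python
import numpy as np

def sigmoid(z):
    """Map a real-valued logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def neg_log_likelihood(w, b, x, y):
    """Mean negative log-likelihood of labels y in {0, 1}."""
    p = sigmoid(x @ w + b)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def gradient(w, b, x, y):
    """Gradient of the mean negative log-likelihood with respect to w and b."""
    err = sigmoid(x @ w + b) - y  # derivative of the loss w.r.t. each sample's logit
    return x.T @ err / len(y), err.mean()

# Toy, non-separable data (hypothetical): one feature, six samples
x = np.array([[-2.0], [-1.0], [-0.5], [0.5], [1.0], [2.0]])
y = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 1.0])

w, b = np.zeros(1), 0.0
initial_loss = neg_log_likelihood(w, b, x, y)  # equals log(2) when all parameters are zero
for _ in range(200):
    gw, gb = gradient(w, b, x, y)
    w -= 0.5 * gw
    b -= 0.5 * gb
final_loss = neg_log_likelihood(w, b, x, y)
print(initial_loss, final_loss)
```

With zero parameters every sample gets probability 0.5, so the starting loss is log(2); gradient descent then lowers it.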
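When we speak of logistic regression, we usually mean binary classification; however, the same idea extends easily to multi-class problems, where the machine learning community usually calls it the softmax regression model. The author of this article has a background in statistics and econometrics, and therefore usually calls it the MNL model.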
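The two views can be connected numerically. The sketch below checks that a two-class softmax whose class parameters are plus and minus half of the binary coefficients yields the same probabilities as the sigmoid model, which is consistent with the multinomial sklearn coefficients in the output being half of the binary ones. The coefficient values are copied from the sklearn lines of the output; the three input rows are made up for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(v):
    e = np.exp(v - v.max(axis=1, keepdims=True))  # subtract the row max for numerical stability
    return e / e.sum(axis=1, keepdims=True)

# Binary coefficients as printed by sklearn in the output
w_bin = np.array([0.14834077, 0.15890845])
b_bin = -16.378743

# Symmetric two-class parameterization: class 1 gets +half, class 0 gets -half
w_soft = np.stack([-w_bin / 2, w_bin / 2], axis=1)  # shape (features, classes)
b_soft = np.array([-b_bin / 2, b_bin / 2])

# Hypothetical raw inputs (not rows from ex4x.dat)
x = np.array([[55.0, 45.0], [60.0, 65.0], [20.0, 80.0]])

p_bin = sigmoid(x @ w_bin + b_bin)             # P(class 1) from the binary model
p_soft = softmax(x @ w_soft + b_soft)[:, 1]    # P(class 1) from the two-class softmax
print(np.allclose(p_bin, p_soft))
```

The equality holds because the softmax probability of class 1 depends only on the difference between the two class logits, and that difference here equals the binary logit.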