
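Problem
The table below lists what eight students did on the day before an exam, together with their exam results (1 = yes, 0 = no):

Failed | Drank | Went shopping | Studied |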
---|---|---|---|
1 | 1 | 1 | 0 |
0 | 0 | 0 | 1 |
0 | 1 | 0 | 1 |
1 | 1 | 0 | 0 |
1 | 0 | 1 | 0 |
0 | 0 | 1 | 1 |
0 | 0 | 1 | 0 |
1 | 0 | 0 | 1 |
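
Using the data above and the naive Bayes method, determine whether a student who did not drink, did not go shopping, and did study will fail.
Algorithm steps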
The main goal of naive Bayes classification is to compute $P(y=1|x_1,x_2,x_3)$ and $P(y=0|x_1,x_2,x_3)$, then compare the two to make a decision.
In this problem, $y$ indicates whether the student fails, and $x_1, x_2, x_3$ indicate whether the student drank, went shopping, and studied, respectively.
For conditional probabilities, we have the following formulas:
$$P(A|B)=\frac{P(AB)}{P(B)}$$
$$P(B|A)=\frac{P(AB)}{P(A)}$$
From these we can derive:
$$P(A|B)=\frac{P(B|A)P(A)}{P(B)}$$
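As a quick sanity check, the identity above can be verified on the exam table by simple counting. This is a minimal sketch: the choice of events (A = "failed", B = "drank") and the helper name `prob` are illustrative assumptions, not part of the original article.

```python
from fractions import Fraction

# Rows of the exam table: (failed, drank, shopped, studied)
rows = [
    (1, 1, 1, 0),
    (0, 0, 0, 1),
    (0, 1, 0, 1),
    (1, 1, 0, 0),
    (1, 0, 1, 0),
    (0, 0, 1, 1),
    (0, 0, 1, 0),
    (1, 0, 0, 1),
]


def prob(event, given=lambda r: True):
    """Empirical P(event | given) as an exact fraction (hypothetical helper)."""
    pool = [r for r in rows if given(r)]
    return Fraction(sum(1 for r in pool if event(r)), len(pool))


def failed(r):  # event A
    return r[0] == 1


def drank(r):  # event B
    return r[1] == 1


# Left-hand side: P(A|B) counted directly among the students who drank.
p_a_given_b = prob(failed, given=drank)
# Right-hand side: P(B|A) * P(A) / P(B).
p_bayes = prob(drank, given=failed) * prob(failed) / prob(drank)

print(p_a_given_b, p_bayes)
```

Both sides evaluate to 2/3 on this table, as the formula requires.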
Applying this to our problem gives:
$$P(y|x_1,x_2,x_3)=\frac{P(x_1,x_2,x_3|y)P(y)}{P(x_1,x_2,x_3)}$$
Under the conditional independence assumption of naive Bayes (the features are independent of one another given the class), $P(x_1,x_2,x_3|y)=P(x_1|y)\times P(x_2|y)\times P(x_3|y)$.
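The factorization above is a modeling assumption, not a property of the data. The sketch below compares the product of per-feature conditionals with a direct count of the whole feature pattern on the exam table. The function names (`cond`, `naive_likelihood`, `joint_likelihood`) are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

# Rows of the exam table: (failed, drank, shopped, studied)
rows = [
    (1, 1, 1, 0),
    (0, 0, 0, 1),
    (0, 1, 0, 1),
    (1, 1, 0, 0),
    (1, 0, 1, 0),
    (0, 0, 1, 1),
    (0, 0, 1, 0),
    (1, 0, 0, 1),
]


def cond(feature_idx, value, label):
    """Empirical P(x_i = value | y = label)."""
    pool = [r for r in rows if r[0] == label]
    return Fraction(sum(1 for r in pool if r[feature_idx] == value), len(pool))


def naive_likelihood(x, label):
    """Product of per-feature conditionals (the naive Bayes factorization)."""
    result = Fraction(1)
    for i, value in enumerate(x, start=1):
        result *= cond(i, value, label)
    return result


def joint_likelihood(x, label):
    """Direct count of the full pattern within the class, no factorization."""
    pool = [r for r in rows if r[0] == label]
    return Fraction(sum(1 for r in pool if r[1:] == tuple(x)), len(pool))


x = (0, 0, 1)  # did not drink, did not go shopping, studied
for label in (0, 1):
    print(label, naive_likelihood(x, label), joint_likelihood(x, label))
```

With only eight rows, the two estimates differ noticeably. The factorized estimate is the one naive Bayes uses, because it needs far fewer samples per parameter than the full joint count.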
It follows that when $x_1=0, x_2=0, x_3=1$, the probabilities of $y=0$ and $y=1$ are:
$$P(y=0|x_1=0, x_2=0, x_3=1)=\frac{P(x_1=0, x_2=0, x_3=1|y=0)P(y=0)}{P(x_1=0, x_2=0, x_3=1)}$$
$$P(y=1|x_1=0, x_2=0, x_3=1)=\frac{P(x_1=0, x_2=0, x_3=1|y=1)P(y=1)}{P(x_1=0, x_2=0, x_3=1)}$$
Counting within the four students who passed ($y=0$) and the four who failed ($y=1$) gives
$$P(x_1=0, x_2=0, x_3=1|y=0)=P(x_1=0|y=0)\times P(x_2=0|y=0)\times P(x_3=1|y=0)=\frac{3}{4}\times\frac{2}{4}\times\frac{3}{4}=\frac{18}{64}$$
$$P(x_1=0, x_2=0, x_3=1|y=1)=P(x_1=0|y=1)\times P(x_2=0|y=1)\times P(x_3=1|y=1)=\frac{2}{4}\times\frac{2}{4}\times\frac{1}{4}=\frac{4}{64}$$
Therefore:
$$P(y=0|x_1=0, x_2=0, x_3=1)=\frac{\frac{18}{64}P(y=0)}{P(x_1=0, x_2=0, x_3=1)}$$
$$P(y=1|x_1=0, x_2=0, x_3=1)=\frac{\frac{4}{64}P(y=1)}{P(x_1=0, x_2=0, x_3=1)}$$
Since $P(y=0)=P(y=1)=\frac{1}{2}$, we obtain
$$P(y=0|x_1=0, x_2=0, x_3=1)=\frac{18}{128P(x_1=0, x_2=0, x_3=1)}$$
$$P(y=1|x_1=0, x_2=0, x_3=1)=\frac{4}{128P(x_1=0, x_2=0, x_3=1)}$$
The two expressions share the same denominator and $18 > 4$, so this student is more likely not to fail, and we classify $y$ as $y=0$.
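The hand calculation can be reproduced with a few lines of plain Python. This is a minimal sketch that estimates every probability by counting, with no smoothing. The function name `posterior` is an assumption for illustration, and the shared denominator $P(x_1,x_2,x_3)$ is replaced by the sum of the two numerators so that the posteriors add up to 1.

```python
from fractions import Fraction

# Rows of the exam table: (failed, drank, shopped, studied)
rows = [
    (1, 1, 1, 0),
    (0, 0, 0, 1),
    (0, 1, 0, 1),
    (1, 1, 0, 0),
    (1, 0, 1, 0),
    (0, 0, 1, 1),
    (0, 0, 1, 0),
    (1, 0, 0, 1),
]


def posterior(x):
    """Return {label: P(y = label | x)} under the naive Bayes model."""
    scores = {}
    for label in (0, 1):
        pool = [r for r in rows if r[0] == label]
        score = Fraction(len(pool), len(rows))       # prior P(y)
        for i, value in enumerate(x, start=1):
            matches = sum(1 for r in pool if r[i] == value)
            score *= Fraction(matches, len(pool))    # P(x_i | y)
        scores[label] = score                        # numerator only
    # Normalize by the sum of numerators (assumes at least one is non-zero).
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}


post = posterior((0, 0, 1))  # no drinking, no shopping, studied
prediction = max(post, key=post.get)
print(post, prediction)
```

After normalization the posteriors are 9/11 for $y=0$ and 2/11 for $y=1$, so the predicted class is 0 (does not fail).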
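Python implementation
The code below demonstrates the scikit-learn API using GaussianNB on the built-in handwritten digits dataset, not the exam table above.
1. Import the required libraries

    import numpy as np
    from sklearn.naive_bayes import GaussianNB
    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import confusion_matrix as CM
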
2. Load the data and split it into training and test sets

    digits = load_digits()
    X, y = digits.data, digits.target
    Xtrain, Xtest, Ytrain, Ytest = train_test_split(X, y, test_size=0.3, random_state=420)
    print(Xtrain.shape)
    print(Xtest.shape)
    print(Ytrain.shape)
    print(Ytest.shape)

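3. Fit the naive Bayes classifier

    gnb = GaussianNB().fit(Xtrain, Ytrain)
    # Accuracy on the test set
    acc_score = gnb.score(Xtest, Ytest)
    print(acc_score)
    # Predicted labels
    Y_pred = gnb.predict(Xtest)
    print(Y_pred)
    # Predicted class probabilities, one row per test sample
    prob = gnb.predict_proba(Xtest)
    print(prob.shape)
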
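4. Use a confusion matrix to inspect the classification results

    # Rows are true labels, columns are predicted labels
    print(CM(Ytest, Y_pred))
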
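To connect the library code back to the exam problem, the same table can be fed to scikit-learn's `BernoulliNB`, which models binary features. This is only a sketch under stated assumptions: choosing `BernoulliNB` here is a suggestion, not something the original article uses, and its default `alpha=1.0` applies Laplace smoothing, so the probabilities differ somewhat from an unsmoothed hand calculation.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

# Feature columns: drank, shopped, studied (one row per student)
X = np.array([
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 1],
    [0, 1, 0],
    [0, 0, 1],
])
# Labels: 1 = failed, 0 = did not fail
y = np.array([1, 0, 0, 1, 1, 0, 0, 1])

clf = BernoulliNB()  # default alpha=1.0, i.e. Laplace smoothing
clf.fit(X, y)

# Student who did not drink, did not go shopping, and studied
query = np.array([[0, 0, 1]])
pred = clf.predict(query)[0]
proba = clf.predict_proba(query)[0]  # ordered as clf.classes_, here [0, 1]
print(pred, proba)
```

With smoothing, the model still predicts class 0 (does not fail) for this student. Passing a much smaller `alpha` should bring the probabilities closer to the counts-only estimate.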