1. Getting Started with Data Mining

Published: 2021-05-14 23:10:35

Setup: installing IPython and scikit-learn

Install IPython together with its notebooks feature, then start the Notebook server:

pip install ipython[all]
ipython3 notebook

To shut down IPython Notebook, press Ctrl+C in the terminal where the Notebook server is running.

Install scikit-learn:

pip3 install -U scikit-learn
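
To confirm that the installation worked, a quick check like the following can help. It is only a sketch: the helper name is_installed is made up for this example, and it only tests whether Python can locate each module, not which version is installed.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can locate the module (hypothetical helper)."""
    return importlib.util.find_spec(module_name) is not None


# "IPython" and "sklearn" are the import names of the ipython and
# scikit-learn packages; numpy is used by the code in this article.
for module_name in ("IPython", "sklearn", "numpy"):
    status = "installed" if is_installed(module_name) else "missing"
    print("{}: {}".format(module_name, status))
```

Any module reported as missing needs another pip install run before continuing.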

1. Loading the dataset

Load the dataset with NumPy and check how many samples and features it contains:

import numpy as np
filename = 'affinity_dataset.txt'
x = np.loadtxt(filename)
n_samples, n_features = x.shape
print("This dataset has {} samples and {} features".format(n_samples, n_features))

The first 5 rows of the dataset:

[1, 0, 1, 1, 1]
[1, 1, 0, 1, 0]
[0, 1, 1, 0, 1]
[1, 0, 1, 1, 0]
[1, 1, 1, 1, 0]
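
If affinity_dataset.txt is not at hand, the loading step can be tried on the five rows shown above. The sketch below passes them to np.loadtxt through an in-memory buffer. It assumes the file stores one sample per line as whitespace-separated 0/1 values, which the article does not state explicitly.

```python
import io

import numpy as np

# Stand-in for affinity_dataset.txt.
# Assumed format: one sample per line, whitespace-separated 0/1 values.
sample_text = """1 0 1 1 1
1 1 0 1 0
0 1 1 0 1
1 0 1 1 0
1 1 1 1 0
"""

x = np.loadtxt(io.StringIO(sample_text))
n_samples, n_features = x.shape
print("This dataset has {} samples and {} features".format(n_samples, n_features))
print(x[:5])  # np.loadtxt returns floating-point values by default
```

Each row is one transaction and each column is one item, with 1 meaning the item was bought.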

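2. Computing support and confidence

As a first step, count how many people bought apples and how many bought bananas: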

num_apple = 0
num_banana = 0
for sample in x:
    if sample[3] == 1:  # this person bought apples
        num_apple += 1
    if sample[4] == 1:  # this person bought bananas
        num_banana += 1
print("Number of people who bought apples: {}".format(num_apple))
print("Number of people who bought bananas: {}".format(num_banana))
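
The same counts can be obtained without an explicit loop by summing a column. This is a sketch on the five sample rows listed earlier rather than on the full dataset. The constants APPLE and BANANA are names introduced here for the column indices 3 and 4 used in the loop.

```python
import numpy as np

# The five sample rows shown earlier in the article.
x = np.array([
    [1, 0, 1, 1, 1],
    [1, 1, 0, 1, 0],
    [0, 1, 1, 0, 1],
    [1, 0, 1, 1, 0],
    [1, 1, 1, 1, 0],
])

APPLE, BANANA = 3, 4  # column indices, as in the loop version

# Comparing a column with 1 gives a boolean mask; summing it counts the True values.
num_apple = int((x[:, APPLE] == 1).sum())
num_banana = int((x[:, BANANA] == 1).sum())
print("Number of people who bought apples: {}".format(num_apple))
print("Number of people who bought bananas: {}".format(num_banana))
```

On these five rows, 4 people bought apples and 2 bought bananas.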

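Next, count for every (premise, conclusion) pair how often the rule holds and how often it fails, then derive the support and confidence of each rule: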

from collections import defaultdict
valid_rules = defaultdict(int)     # times a rule held
invalid_rules = defaultdict(int)   # times a rule failed
num_occurances = defaultdict(int)  # times each premise occurred
for sample in x:
    for premise in range(n_features):
        if sample[premise] == 0:
            continue
        # Count the premise only for samples in which it actually occurs
        num_occurances[premise] += 1
        for conclusion in range(n_features):
            if premise == conclusion:
                continue
            if sample[conclusion] == 1:
                valid_rules[(premise, conclusion)] += 1
            else:
                invalid_rules[(premise, conclusion)] += 1
# Support: number of samples in which the rule holds
support = valid_rules
# Confidence: share of the premise's occurrences in which the rule holds
confidence = defaultdict(float)
for premise, conclusion in valid_rules.keys():
    confidence[(premise, conclusion)] = support[(premise, conclusion)] / num_occurances[premise]
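
To make the two measures concrete, here is a small worked example for a single rule on the five sample rows listed earlier. It takes support as the number of samples containing both premise and conclusion, and confidence as that count divided by the number of samples containing the premise. The function name rule_stats is hypothetical and not part of the original code.

```python
import numpy as np

# The five sample rows shown earlier in the article.
x = np.array([
    [1, 0, 1, 1, 1],
    [1, 1, 0, 1, 0],
    [0, 1, 1, 0, 1],
    [1, 0, 1, 1, 0],
    [1, 1, 1, 1, 0],
])


def rule_stats(data, premise, conclusion):
    """Return (support, confidence) for the rule premise -> conclusion."""
    has_premise = data[:, premise] == 1
    has_both = has_premise & (data[:, conclusion] == 1)
    n_premise = int(has_premise.sum())
    support = int(has_both.sum())
    # Guard against a premise that never occurs in the data.
    confidence = support / n_premise if n_premise else 0.0
    return support, confidence


# Rule "bought apples (column 3) -> bought bananas (column 4)"
support_ab, confidence_ab = rule_stats(x, 3, 4)
print("support = {}, confidence = {:.2f}".format(support_ab, confidence_ab))
```

Four of the five rows contain apples and only one of those also contains bananas, so the rule has support 1 and confidence 0.25.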

3. Ranking the rules

Sort the rules by support and by confidence, highest first:

from operator import itemgetter
# Sort rules by support, highest first
sorted_support = sorted(support.items(), key=itemgetter(1), reverse=True)
# Sort rules by confidence, highest first
sorted_confidence = sorted(confidence.items(), key=itemgetter(1), reverse=True)
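
Once the rules are sorted, the best ones can be printed in readable form. The sketch below does this for the five sample rows listed earlier. Only the apple and banana columns are named in the article, so the labels item0 to item2 are placeholders, and the helper top_rules is a hypothetical name.

```python
from collections import defaultdict
from operator import itemgetter

import numpy as np

# Columns 3 and 4 are apples and bananas; the other labels are placeholders.
features = ["item0", "item1", "item2", "apples", "bananas"]

# The five sample rows shown earlier in the article.
x = np.array([
    [1, 0, 1, 1, 1],
    [1, 1, 0, 1, 0],
    [0, 1, 1, 0, 1],
    [1, 0, 1, 1, 0],
    [1, 1, 1, 1, 0],
])
n_features = x.shape[1]

support = defaultdict(int)
num_occurances = defaultdict(int)
for sample in x:
    for premise in range(n_features):
        if sample[premise] == 0:
            continue
        num_occurances[premise] += 1
        for conclusion in range(n_features):
            if premise != conclusion and sample[conclusion] == 1:
                support[(premise, conclusion)] += 1

confidence = {rule: support[rule] / num_occurances[rule[0]] for rule in support}


def top_rules(scores, n=3):
    """Return the n highest-scoring (rule, score) pairs."""
    return sorted(scores.items(), key=itemgetter(1), reverse=True)[:n]


for (premise, conclusion), value in top_rules(confidence):
    print("If a person buys {}, they also buy {} (confidence {:.3f}, support {})".format(
        features[premise], features[conclusion], value, support[(premise, conclusion)]))
```

Passing support instead of confidence to top_rules ranks the rules by how often they hold rather than by how reliable they are.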

