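Dataset: 12 samples of adult height, weight, and shoe size, each labelled male or female.<br>
Program: uses a decision tree algorithm, implemented in Python, to predict whether a person is male or female from physical characteristics (height, weight, shoe size).<br>
For the algorithm theory, see: Decision Tree Algorithm<br>
Ipynb demo file: Ipynb file<br>
Python code: Python code<br>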
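
To give a rough feel for how a decision tree chooses a split on this data, the sketch below computes Gini impurity, which is the default splitting criterion of scikit-learn's `DecisionTreeClassifier`. It is a simplified illustration in plain Python, not how scikit-learn is implemented: the helper names `gini` and `split_impurity` and the height threshold of 170 are assumptions made up for this example, whereas a real tree searches over candidate thresholds on every feature.

```python
from collections import Counter

# Same 12 samples as in this document: [height, weight, shoe size] and labels.
X = [[181, 80, 44], [177, 70, 43], [160, 60, 38], [154, 54, 37],
     [166, 65, 40], [190, 90, 47], [175, 64, 39], [177, 70, 40],
     [159, 55, 38], [171, 75, 42], [181, 85, 43], [148, 70, 42]]
y = ['male', 'female', 'female', 'female',
     'female', 'female', 'female', 'female',
     'male', 'female', 'male', 'male']


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_impurity(rows, labels, feature, threshold):
    """Weighted Gini impurity after splitting on rows[i][feature] <= threshold."""
    left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
    right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


# 4 male and 8 female labels: 1 - (1/3)**2 - (2/3)**2 = 4/9
root = gini(y)
# Hypothetical candidate split on height (feature index 0), chosen only for illustration.
after = split_impurity(X, y, feature=0, threshold=170)

print(f"impurity before split: {root:.3f}")
print(f"impurity after height <= 170: {after:.3f}")
```

A split is attractive when the weighted impurity after it is lower than the impurity before it; the tree repeats this comparison at each node.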

```python
# Dataset: 12 adult samples of height, weight and shoe size, each labelled male or female.
import pickle

from sklearn import tree

# [height, weight, shoe size]
X = [[181, 80, 44], [177, 70, 43], [160, 60, 38], [154, 54, 37],
     [166, 65, 40], [190, 90, 47], [175, 64, 39], [177, 70, 40],
     [159, 55, 38], [171, 75, 42], [181, 85, 43], [148, 70, 42]]

y = ['male', 'female', 'female', 'female',
     'female', 'female', 'female', 'female',
     'male', 'female', 'male', 'male']

# Choose the decision tree algorithm: load a previously saved classifier if
# one exists, otherwise train a new one.
try:
    with open('DecisionTree.pickle', 'rb') as f:
        clf = pickle.load(f)
except Exception:
    # No usable saved model: train the classifier.
    clf = tree.DecisionTreeClassifier()
    clf.fit(X, y)

    # Serialize the trained classifier for later runs.
    with open('DecisionTree.pickle', 'wb') as f:
        pickle.dump(clf, f)

# Make a prediction.
prediction = clf.predict([[190, 70, 43], [156, 60, 36]])
print(prediction)
```
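
The load-or-train caching in the code above can also be factored into a small reusable helper. The following is a minimal sketch under stated assumptions: `load_or_build`, `build_fn`, and `build_model` are hypothetical names introduced here (they are not part of pickle or scikit-learn), and a plain dict stands in for the classifier so the example runs with the standard library alone.

```python
import os
import pickle
import tempfile


def load_or_build(path, build_fn):
    """Return the object pickled at `path`; if it cannot be loaded, build, save and return it."""
    try:
        with open(path, 'rb') as f:
            return pickle.load(f)
    except (OSError, EOFError, pickle.UnpicklingError):
        obj = build_fn()
        with open(path, 'wb') as f:
            pickle.dump(obj, f)
        return obj


# Stand-in for model training; records each call so we can see when it runs.
calls = []


def build_model():
    calls.append(1)
    return {'kind': 'stand-in model'}


with tempfile.TemporaryDirectory() as tmp:
    cache = os.path.join(tmp, 'model.pickle')
    first = load_or_build(cache, build_model)   # cache file missing: builds and saves
    second = load_or_build(cache, build_model)  # cache file present: loads from disk

print(first == second, len(calls))
```

With the classifier from this document, `build_fn` could be something like `lambda: tree.DecisionTreeClassifier().fit(X, y)`. Two caveats apply to any pickle cache: a saved model is not retrained when `X` or `y` change, so delete the file after editing the data, and only unpickle files you trust.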


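```python
"""Visualization: Graphviz export of the tree above, trained on the entire dataset;
the result is saved to the output file GenderClassifier.pdf."""

# Graphviz is a graph-drawing tool for rendering structured graphs and networks;
# it supports many output formats.
import graphviz

dot_data = tree.export_graphviz(clf, out_file=None,
                                feature_names=['height', 'weight', 'shoe size'],
                                # One name per class, in sorted order. A bare string
                                # such as 'gender' would be indexed character by character.
                                class_names=list(clf.classes_),
                                filled=True, rounded=True,
                                special_characters=True)
graph = graphviz.Source(dot_data)
graph.render('GenderClassifier')  # writes GenderClassifier.pdf (PDF is the default format)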