Parameter Learning
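Parameter learning for a Bayesian belief network (BBN) proceeds after the structure has been learned: given the graph, we estimate the conditional probability table of each node. Let’s see how this works.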
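As a rough intuition, with fully observed discrete data the parameters of a node are commonly estimated by counting: each conditional probability is the number of records in which the node takes a given value together with a given parent configuration, divided by the number of records with that parent configuration. The plain-Python sketch below illustrates the idea on a tiny made-up dataset. The helper `estimate_cpt` and the records are hypothetical, and this is not necessarily how pysparkbbn computes its estimates.

```python
from collections import Counter


def estimate_cpt(records, node, parents):
    """Estimate P(node | parents) by relative frequency (illustrative helper)."""
    joint = Counter()   # counts of (parent values, node value)
    margin = Counter()  # counts of parent values alone
    for r in records:
        pa = tuple(r[p] for p in parents)
        joint[(pa, r[node])] += 1
        margin[pa] += 1
    # Divide each joint count by the count of its parent configuration.
    return {(pa, v): c / margin[pa] for (pa, v), c in joint.items()}


# Made-up records, for illustration only.
records = [
    {"a": "t", "b": "t"},
    {"a": "t", "b": "f"},
    {"a": "t", "b": "t"},
    {"a": "f", "b": "f"},
]

cpt = estimate_cpt(records, node="b", parents=["a"])
for (pa, value), prob in sorted(cpt.items()):
    print(pa, value, prob)
```

A node without parents is handled by passing an empty `parents` list, in which case the estimate reduces to the marginal frequency of each value.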
Load data
Let’s read our data into a Spark DataFrame (SDF).
[1]:
from pysparkbbn.discrete.data import DiscreteData

# `spark` is assumed to be an existing SparkSession (e.g. one provided by the notebook)
sdf = spark.read.csv('hdfs://localhost/data-1479668986461.csv', header=True)
data = DiscreteData(sdf)
Structure learning
Let’s pick the MWST algorithm to learn the structure.
[2]:
from pysparkbbn.discrete.scblearn import Mwst
mwst = Mwst(data)
g = mwst.get_network()
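MWST is commonly read as "maximum weight spanning tree": score every pair of variables (often by mutual information) and keep the heaviest set of edges that forms a tree. The sketch below shows that general idea in plain Python on made-up records. The functions `mutual_information` and `max_weight_spanning_tree` are hypothetical helpers, not part of pysparkbbn, and the library's `Mwst` class may differ in its scoring and in how it directs the edges.

```python
from collections import Counter
from itertools import combinations
from math import log


def mutual_information(records, x, y):
    """Empirical mutual information (in nats) between columns x and y."""
    n = len(records)
    cx = Counter(r[x] for r in records)
    cy = Counter(r[y] for r in records)
    cxy = Counter((r[x], r[y]) for r in records)
    return sum((c / n) * log(c * n / (cx[a] * cy[b])) for (a, b), c in cxy.items())


def max_weight_spanning_tree(nodes, weight):
    """Kruskal's algorithm, taking the heaviest edges first."""
    parent = {v: v for v in nodes}  # union-find forest

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    edges = sorted(combinations(nodes, 2), key=lambda e: weight(*e), reverse=True)
    tree = []
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:  # the edge joins two components, so it creates no cycle
            parent[ru] = rv
            tree.append((u, v))
    return tree


# Made-up records: a and b always agree, c is unrelated to both.
records = [
    {"a": "t", "b": "t", "c": "t"},
    {"a": "t", "b": "t", "c": "f"},
    {"a": "f", "b": "f", "c": "t"},
    {"a": "f", "b": "f", "c": "f"},
]
nodes = ["a", "b", "c"]
tree = max_weight_spanning_tree(nodes, lambda x, y: mutual_information(records, x, y))
print(tree)
```

The result is an undirected tree with one edge fewer than the number of variables; choosing edge directions is a separate step that this sketch does not cover.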
Parameter learning
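Once we have a structure, we can learn the parameters. The result maps each node to the rows of its conditional probability table, with each probability stored under the `__p__` key.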
[3]:
from pysparkbbn.discrete.plearn import ParamLearner
import json
param_learner = ParamLearner(data, g)
params = param_learner.get_params()
print(json.dumps(params, indent=2))
{
  "n4": [
    {
      "n4": "f",
      "__p__": 0.40255
    },
    {
      "n4": "t",
      "__p__": 0.59745
    }
  ],
  "n3": [
    {
      "n3": "f",
      "n2": "f",
      "n4": "f",
      "__p__": 0.9914285714285714
    },
    {
      "n3": "t",
      "n2": "f",
      "n4": "f",
      "__p__": 0.008571428571428572
    },
    {
      "n3": "f",
      "n2": "f",
      "n4": "t",
      "__p__": 0.3994593202883625
    },
    {
      "n3": "t",
      "n2": "f",
      "n4": "t",
      "__p__": 0.6005406797116375
    },
    {
      "n3": "f",
      "n2": "t",
      "n4": "f",
      "__p__": 0.39842913245269546
    },
    {
      "n3": "t",
      "n2": "t",
      "n4": "f",
      "__p__": 0.6015708675473045
    },
    {
      "n3": "f",
      "n2": "t",
      "n4": "t",
      "__p__": 0.010762975364745277
    },
    {
      "n3": "t",
      "n2": "t",
      "n4": "t",
      "__p__": 0.9892370246352548
    }
  ],
  "n2": [
    {
      "n2": "f",
      "n1": "f",
      "__p__": 0.7991074402184773
    },
    {
      "n2": "t",
      "n1": "f",
      "__p__": 0.20089255978152268
    },
    {
      "n2": "f",
      "n1": "t",
      "__p__": 0.20473230399037498
    },
    {
      "n2": "t",
      "n1": "t",
      "__p__": 0.795267696009625
    }
  ],
  "n5": [
    {
      "n5": "maybe",
      "n4": "f",
      "__p__": 0.2997143212023351
    },
    {
      "n5": "no",
      "n4": "f",
      "__p__": 0.5976897279841014
    },
    {
      "n5": "yes",
      "n4": "f",
      "__p__": 0.10259595081356353
    },
    {
      "n5": "maybe",
      "n4": "t",
      "__p__": 0.29324629676123526
    },
    {
      "n5": "no",
      "n4": "t",
      "__p__": 0.09649343041258683
    },
    {
      "n5": "yes",
      "n4": "t",
      "__p__": 0.6102602728261779
    }
  ],
  "n1": [
    {
      "n1": "f",
      "__p__": 0.75065
    },
    {
      "n1": "t",
      "__p__": 0.24935
    }
  ]
}
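In the output above, each row appears to hold the node's own value, the values of its parents, and the probability under `__p__`. The sketch below works on a subset of that output (node `n2` with parent `n1`) to check that the probabilities sum to 1 for each parent configuration and to look up a single entry. The helpers `group_sums` and `lookup` are hypothetical and not part of pysparkbbn; treating every key other than the node and `__p__` as a parent is an assumption based on the printed rows.

```python
from collections import defaultdict

# A subset of the output above (node n2 with parent n1), copied verbatim.
params = {
    "n2": [
        {"n2": "f", "n1": "f", "__p__": 0.7991074402184773},
        {"n2": "t", "n1": "f", "__p__": 0.20089255978152268},
        {"n2": "f", "n1": "t", "__p__": 0.20473230399037498},
        {"n2": "t", "n1": "t", "__p__": 0.795267696009625},
    ]
}


def group_sums(node, rows):
    """Sum __p__ over the node's values for each parent configuration."""
    sums = defaultdict(float)
    for row in rows:
        # Assumption: keys other than the node and __p__ are its parents.
        parents = tuple(sorted((k, v) for k, v in row.items()
                               if k not in (node, "__p__")))
        sums[parents] += row["__p__"]
    return dict(sums)


def lookup(rows, **assignment):
    """Return __p__ for the first row matching every variable=value pair."""
    for row in rows:
        if all(row.get(k) == v for k, v in assignment.items()):
            return row["__p__"]
    raise KeyError(assignment)


sums = group_sums("n2", params["n2"])
for parents, total in sums.items():
    print(parents, round(total, 6))

p = lookup(params["n2"], n2="t", n1="f")  # P(n2=t | n1=f)
print(p)
```

The same two helpers can be applied to every entry of the full `params` dictionary, for example as a quick sanity check after learning.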