Parameter Learning

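Parameter learning for a Bayesian belief network (BBN) takes place after the structure has been learned: given the graph, we estimate the conditional probability table of each node. Let’s see how this works.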

Load data

Let’s read our data into a Spark DataFrame, sdf, and wrap it in a DiscreteData object.

[1]:
from pysparkbbn.discrete.data import DiscreteData

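# Assumes an active SparkSession named `spark`; header=True uses the first row as column names.
sdf = spark.read.csv('hdfs://localhost/data-1479668986461.csv', header=True)
# Wrap the DataFrame so the learning algorithms can work with it.
data = DiscreteData(sdf)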

Structure learning

Let’s use the maximum weight spanning tree (MWST) algorithm to learn the structure.

[2]:
from pysparkbbn.discrete.scblearn import Mwst

mwst = Mwst(data)
# g is the learned network structure; it is passed to the parameter learner below.
g = mwst.get_network()

Parameter learning

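Once we have a structure, we can learn the parameters. The result maps each node to a list of records; each record holds a value of the node, the values of its parents (if any), and the corresponding conditional probability under the key __p__.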
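As a rough intuition for what this step computes, the sketch below estimates a conditional probability table by relative frequency (maximum likelihood) in plain Python. It is not pysparkbbn's implementation, and the library's estimator may differ (for example, it might apply smoothing). The records and the helper estimate_cpt are made up for illustration; only the record layout with __p__ mirrors the output format of this page.

```python
from collections import Counter

# Toy records, invented for illustration; n1 is treated as the parent of n2.
records = [
    {"n1": "f", "n2": "f"},
    {"n1": "f", "n2": "f"},
    {"n1": "f", "n2": "t"},
    {"n1": "t", "n2": "t"},
]


def estimate_cpt(records, node, parents):
    """Estimate P(node | parents) by relative frequency (hypothetical helper)."""
    joint = Counter()   # counts of (parent values, node value)
    margin = Counter()  # counts of parent values alone
    for r in records:
        pa = tuple(r[p] for p in parents)
        joint[(pa, r[node])] += 1
        margin[pa] += 1

    rows = []
    for (pa, value), n in sorted(joint.items()):
        row = {node: value}
        row.update(dict(zip(parents, pa)))
        # Conditional probability = joint count / count of the parent configuration.
        row["__p__"] = n / margin[pa]
        rows.append(row)
    return rows


cpt_n2 = estimate_cpt(records, "n2", ["n1"])
for row in cpt_n2:
    print(row)
```

Value combinations that never occur in the data get no row in this sketch. With pysparkbbn, the parameter learning step on the Spark data looks like this.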

[3]:
from pysparkbbn.discrete.plearn import ParamLearner
import json

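# Learn the parameters from the data, given the learned structure g.
param_learner = ParamLearner(data, g)
params = param_learner.get_params()

# Pretty-print the learned parameters as JSON.
print(json.dumps(params, indent=2))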
{
  "n4": [
    {
      "n4": "f",
      "__p__": 0.40255
    },
    {
      "n4": "t",
      "__p__": 0.59745
    }
  ],
  "n3": [
    {
      "n3": "f",
      "n2": "f",
      "n4": "f",
      "__p__": 0.9914285714285714
    },
    {
      "n3": "t",
      "n2": "f",
      "n4": "f",
      "__p__": 0.008571428571428572
    },
    {
      "n3": "f",
      "n2": "f",
      "n4": "t",
      "__p__": 0.3994593202883625
    },
    {
      "n3": "t",
      "n2": "f",
      "n4": "t",
      "__p__": 0.6005406797116375
    },
    {
      "n3": "f",
      "n2": "t",
      "n4": "f",
      "__p__": 0.39842913245269546
    },
    {
      "n3": "t",
      "n2": "t",
      "n4": "f",
      "__p__": 0.6015708675473045
    },
    {
      "n3": "f",
      "n2": "t",
      "n4": "t",
      "__p__": 0.010762975364745277
    },
    {
      "n3": "t",
      "n2": "t",
      "n4": "t",
      "__p__": 0.9892370246352548
    }
  ],
  "n2": [
    {
      "n2": "f",
      "n1": "f",
      "__p__": 0.7991074402184773
    },
    {
      "n2": "t",
      "n1": "f",
      "__p__": 0.20089255978152268
    },
    {
      "n2": "f",
      "n1": "t",
      "__p__": 0.20473230399037498
    },
    {
      "n2": "t",
      "n1": "t",
      "__p__": 0.795267696009625
    }
  ],
  "n5": [
    {
      "n5": "maybe",
      "n4": "f",
      "__p__": 0.2997143212023351
    },
    {
      "n5": "no",
      "n4": "f",
      "__p__": 0.5976897279841014
    },
    {
      "n5": "yes",
      "n4": "f",
      "__p__": 0.10259595081356353
    },
    {
      "n5": "maybe",
      "n4": "t",
      "__p__": 0.29324629676123526
    },
    {
      "n5": "no",
      "n4": "t",
      "__p__": 0.09649343041258683
    },
    {
      "n5": "yes",
      "n4": "t",
      "__p__": 0.6102602728261779
    }
  ],
  "n1": [
    {
      "n1": "f",
      "__p__": 0.75065
    },
    {
      "n1": "t",
      "__p__": 0.24935
    }
  ]
}
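
A quick way to sanity-check output like this is to confirm that, for every node, the probabilities sum to 1 within each parent configuration. The sketch below does so on an excerpt of the parameters printed above (nodes n1 and n2 only). The helpers parent_config_totals and lookup are hypothetical and not part of pysparkbbn.

```python
from collections import defaultdict

# Excerpt of the learned parameters printed above (nodes n1 and n2 only).
params = {
    "n1": [
        {"n1": "f", "__p__": 0.75065},
        {"n1": "t", "__p__": 0.24935},
    ],
    "n2": [
        {"n2": "f", "n1": "f", "__p__": 0.7991074402184773},
        {"n2": "t", "n1": "f", "__p__": 0.20089255978152268},
        {"n2": "f", "n1": "t", "__p__": 0.20473230399037498},
        {"n2": "t", "n1": "t", "__p__": 0.795267696009625},
    ],
}


def parent_config_totals(node, rows):
    """Sum __p__ over the node's values for each parent configuration."""
    totals = defaultdict(float)
    for row in rows:
        # Every key other than the node itself and __p__ is a parent value.
        parents = tuple(sorted((k, v) for k, v in row.items() if k not in (node, "__p__")))
        totals[parents] += row["__p__"]
    return dict(totals)


def lookup(rows, **assignment):
    """Return __p__ for the first row that matches the given assignment."""
    for row in rows:
        if all(row.get(k) == v for k, v in assignment.items()):
            return row["__p__"]
    raise KeyError(assignment)


totals = {node: parent_config_totals(node, rows) for node, rows in params.items()}
for node, node_totals in totals.items():
    for parents, total in node_totals.items():
        # Allow for floating-point rounding when comparing to 1.
        assert abs(total - 1.0) < 1e-9, (node, parents, total)

# P(n2 = t | n1 = f), read from the table.
p = lookup(params["n2"], n2="t", n1="f")
```

A root node such as n1 has a single, empty parent configuration, so its whole table must sum to 1; a node with parents has one such group per combination of parent values.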