Probabilistic Logic Programming (PLP) introduces probabilistic reasoning into Logic Programs in order to represent uncertain information. It is receiving increased attention due to its applications, in particular in the field of Machine Learning.
                            
In this tutorial we will show how to use cplint on SWISH, a web application for performing inference and learning on user-defined probabilistic logic programs. You will learn how to write a probabilistic logic program processable by cplint on SWISH, how to execute the different types of queries allowed by the application, and how to perform learning.
                        
cplint on SWISH is based on SWISH, a web framework for Logic Programming, and on cplint, a suite of programs for inference and learning on probabilistic logic programs. It keeps the same syntax as cplint and, like cplint, it uses Logic Programs with Annotated Disjunctions (LPADs) as the formalism for representing probabilistic logic programs.
The tutorial is structured as follows. We first provide an introduction to LPADs and show how to write an LPAD processable by cplint on SWISH, together with how to perform simple and conditional exact inference over LPADs. We then illustrate how to perform simple and conditional approximate inference over LPADs, and how to do the same over hybrid programs, i.e., programs where some of the random variables are continuous. Next we describe how to learn the parameters of an LPAD and how to perform structure learning. Finally, we show how to test a learned program.
A Logic Program with Annotated Disjunction (LPAD) consists of a set of rules of the following form:
h_1 : a_1 ; ... ; h_n : a_n :- b_1, ..., b_m. 
where the h_i are atoms, the b_i are literals and the a_i are real numbers between 0 and 1 such that their sum is at most 1. The set of elements h_i : a_i composes the head of a rule, while the set of the b_i is the body. Disjunction in the head is represented with a semicolon and atoms in the head are separated from probabilities by a colon.
If the head of a rule contains only one element h : 1, we can simply write this element as h, i.e., the clause takes the form of a normal Prolog clause. Therefore
h : 1 :- b_1, ..., b_m.
                        is equivalent to
h :- b_1, ..., b_m.
                        If the clause has an empty body, it can be represented like this:
h_1 : a_1 ; ... ; h_n : a_n. 
                        If the sum of all the a_i is smaller than 1, an extra disjunct null is assumed with probability 1 - sum(a_i). Therefore
h_1 : 0.5 ; h_2 : 0.2 :- b_1, ..., b_m. 
                        is equivalent to
null : 0.3 ; h_1 : 0.5 ; h_2 : 0.2 :- b_1, ..., b_m.
We consider as an example a program which models the fact that, if somebody has the flu and the climate is cold, there is the possibility that an epidemic or a pandemic arises. We are uncertain about whether the climate is cold, but we know for sure that David and Robert have the flu.
The rule that we want to write is the one which states that, if somebody has the flu and the climate is cold, an epidemic arises with 60% probability, a pandemic arises with 30% probability, whereas we have a 10% probability that neither an epidemic nor a pandemic arises. We can write
epidemic : 0.6; pandemic : 0.3; null: 0.1 :- flu(_), cold.
As explained in the section on Logic Programs with Annotated Disjunctions, the null atom can be left implicit. Therefore the previous rule can be written, without changing its meaning, as
epidemic : 0.6; pandemic : 0.3 :- flu(_), cold.
The following probabilistic fact says that the climate is cold with 70% probability. Note that the null atom is implicit here as well.
cold : 0.7.
                        Now we assert that David and Robert have the flu.
flu(david).
flu(robert).
The program is almost complete; what we need now is to load the library pita in order to perform exact inference. We import this library with the built-in predicate use_module/1, so we need to write
:- use_module(library(pita)).
Moreover, after :- use_module(library(pita)) we need to write :- pita. in order to initialize the library, and the program should be enclosed between :- begin_lpad. and :- end_lpad. (at the beginning and at the end of the program respectively). These goals are mandatory to initialize the inference system.
The full LPAD of this example is shown below.
% load the 'pita' library to perform inference
:- use_module(library(pita)).
:- pita.
% to be written before the program
:- begin_lpad.
epidemic : 0.6; pandemic : 0.3 :- flu(_), cold.
cold : 0.7.
flu(david).
flu(robert).
% to be written after the program
:- end_lpad.
Ok, we have our program, now what?! Now it’s time to submit some queries!
To query a program you must use the prob/2 predicate with the following syntax
prob(:Query:atom,-Probability:float).
                        Query is an atom that represents the query that we want to perform, while Probability is the variable that will contain the probability that Query is true (a float between 0 and 1).
For instance, let us ask for the probability that an epidemic arises. To know it we just have to submit the following query.
prob(epidemic, P).
cplint on SWISH also allows conditional queries with the predicate prob/3
prob(:Query:atom,:Evidence:atom,-Probability:float).
       
For instance, we can ask for the probability that an epidemic arises given that the climate is cold
prob(epidemic, cold, P).
cplint on SWISH can show the probabilistic results of a query as histograms using the predicate bar/2. The syntax is
bar(+Probability:float,-Chart:dict).
where Probability is the probability of the query and Chart is a dictionary that will contain a bar chart with two bars, one for the probability of the atom being true and one for the probability of the atom being false (1 - the true probability). After the submission of the query, a bar chart of the two values will be plotted.
However, before submitting this kind of query, we need to 
                            specify that we want to use the renderer c3 
                            by adding the following line before the 
                            :- begin_lpad. goal
:- use_rendering(c3).
Moreover, cplint on SWISH allows drawing the Binary Decision Diagram (BDD) used to perform queries with exact inference. A BDD represents the explanations of a query. The syntax is the following:
bdd_dot_string(:Query:atom,-DotString:string,-Var:list).
A solid edge indicates a 1-child, a dashed edge indicates a 0-child and a dotted edge indicates a negated 0-child. Each level of the BDD is associated with a variable of the form XI_J, indicated on the left: I indicates the multivalued variable index and J the index of the Boolean variable of I. The table Var contains the associations between the rule groundings and the multivalued variables: the first column contains the multivalued variable index, the second column contains the rule index, corresponding to its position in the program, and the last column contains the list of constants grounding the rule, each replacing a variable in the order of appearance in the rule.
                        
However, before submitting bdd_dot_string/3,
                            we have to specify that we want to use the renderers graphviz
                            and table. 
                        
:- use_rendering(graphviz).
:- use_rendering(table,[header(['Multivalued variable index','Rule index','Grounding substitution'])]).
These rendering plugins are necessary to plot the BDD and to show the contents of Var as a table.
                        
Therefore our program becomes
% load the 'pita' library to perform inference
:- use_module(library(pita)).
:- pita.
% allows the creation of graphical results
:- use_rendering(c3).
% the following renderers allow to correctly show the BDDs and the Var table
:- use_rendering(graphviz).
:- use_rendering(table,[header(['Multivalued variable index','Rule index','Grounding substitution'])]).
% to be written before the program
:- begin_lpad.
epidemic : 0.6; pandemic : 0.3 :- flu(_), cold.
cold : 0.7.
flu(david).
flu(robert).
% to be written after the program
:- end_lpad.
                        Let us see the histogram of the first query of the example (simple inference)
prob(epidemic, P),bar(P,G).
If we want to see the histogram of the second query (conditional query)
prob(epidemic, cold, P),bar(P,G).
If we want to see the BDD of the first query of the example (simple inference)
bdd_dot_string(epidemic, BDD, Var).
This example shows that conclusions from different groundings of a rule are combined with a noisy-or rule: the probability of an epidemic is obtained by combining with noisy-or the conclusions of the two groundings of the rule, where the only variable is replaced by David or Robert. So the probability of an epidemic given that cold is true is 0.6+0.6-0.6*0.6=0.84. Since cold is itself uncertain, the overall probability is 0.84*0.7=0.588.
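We can check this arithmetic directly with a plain Prolog query:
P1 is 0.6+0.6-0.6*0.6, P is P1*0.7.
% binds P1 = 0.84 and P = 0.588 (up to floating-point rounding)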
Complete example: epidemic.pl
In this example we will show how to perform approximate inference with Monte Carlo sampling. In addition, as in the exact inference setting, it is possible to plot a histogram representing the probabilistic values.
To show these features we use the Markov chain example. We want to know the likelihood that an execution of a Markov chain starting from a state s reaches a final state t. The chains may be infinite, so the query may have an infinite number of explanations, and if we perform exact inference with PITA, the inference may not terminate. To ensure termination, we have two solutions. We may fix a bound on the depth of the derivations of PITA by setting the parameters
:- set_pita(depth_bound,true).
:- set_pita(depth,<level of depth (integer)>).
                        (see exact inference variant of this example). Alternatively, MCINTYRE can be used.
Here we will use the latter approach.
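For reference, a minimal sketch of the preamble for the exact-inference variant (the depth value below is illustrative, not prescribed by the example):
:- use_module(library(pita)).
:- pita.
:- set_pita(depth_bound,true).
% illustrative bound on the derivation depth
:- set_pita(depth,20).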
Below the full program of this example is shown
% load the library 'mcintyre' to perform approximate inference
:- use_module(library(mcintyre)).
% load the renderer 'c3' for graphical results
:- use_rendering(c3).
% initialize the library 'mcintyre'
:- mc.
% to be written before the program
:- begin_lpad.
reach(S, I, T) :-
trans(S, I, U),
reach(U, next(I), T).
reach(S, _, S).
trans(s0,S,s0):0.5; trans(s0,S,s1):0.3; trans(s0,S,s2):0.2.
trans(s1,S,s1):0.4; trans(s1,S,s3):0.1; trans(s1,S,s4):0.5.
trans(s4,_,s3).
% to be written after the program
:- end_lpad.
To execute queries we must use the predicate mc_prob/2.
mc_prob(:Query:atom,-Probability:float).
We ask for the probability that, starting from state 's0' at instant 0, state 's3' is reachable
mc_prob(reach(s0,0,s3),P).
If we want to see the probability histogram
mc_prob(reach(s0,0,s3),P),bar(P,G).
With MCINTYRE, you can also take a given number of samples with
mc_sample(:Query:atom,+Samples:int,-Probability:float).
                        For example this query
mc_sample(reach(s0,0,s3),1000,P).
samples reach(s0,0,s3) 1000 times and returns in P the estimated probability.
We can obtain a bar chart of the samples with the predicate bar/2 (note: remember to load the renderer c3):
mc_sample(reach(s0,0,s3),1000,P),bar(P,G).
We can also sample arguments of queries with the predicate mc_sample_arg/4
mc_sample_arg(:Query:atom,+Samples:int,?Arg:var,-Values:list).
The predicate samples Query a number of Samples times. Arg should be a variable in Query. The predicate returns in Values a list of couples L-N, where L is the list of values of Arg for which Query succeeds in a world sampled at random and N is the number of samples returning that list. If L is the empty list, the query failed in that sample. If L is a list with a single element, the query is determinate in that sample. If, in all couples L-N, L is a list with a single element, the clauses in the program are mutually exclusive, i.e., in every sample only one clause for each subgoal has a true body.
So for example we may sample the argument S of 
                            reach(s0,0,S) with
mc_sample_arg(reach(s0,0,S),50,S,Values).
If we want to see the bar graph of this sampling we use the predicate argbar/2
argbar(+List:list,-Chart:dict).
                        For example
mc_sample_arg(reach(s0,0,S),50,S,List),argbar(List,Chart).
Moreover, we can sample arguments of queries with the predicate mc_sample_arg_first/4
mc_sample_arg_first(:Query:atom,+Samples:int,?Arg:var,-Values:list)
that returns in Values a list of couples V-N, where V is the value of Arg returned as the first answer of Query in a world sampled at random and N is the number of samples returning that value. V is failure if the query fails. mc_sample_arg_first/4 differs from mc_sample_arg/4 in that it computes only the first answer of the query for each sampled world.
So for example we may sample 50 times the first answer for S in reach(s0,0,S) with
mc_sample_arg_first(reach(s0,0,S),50,S,Values).
Complete example: markov_chain.pl
With cplint on SWISH it is possible to
                                compute the expectation of a random variable.
                                
                                In this example we want to perform model checking of the 
                                Synchronous Leader Election Protocol 
                                
                                expressed in Probabilistic 
                                Computation Tree Logic (PCTL).
                                
                                Given a synchronous ring of N processes the Synchronous 
                                Leader Election Protocol is used to elect a leader 
                                (a uniquely designated processor) by sending messages around the ring.
                                The protocol proceeds in rounds and is parametrised by a
                                constant K. Each round begins by all processors (independently) 
                                choosing a random number (uniformly) from {1,…,K} as an 
                                id. The processors then pass their ids around the ring. 
                                If there is a unique id, then the processor with the 
                                maximum unique id is elected the leader, and otherwise 
                                the processors begin a new round.
The full program of this example is
:- use_module(library(mcintyre)).
:- if(current_predicate(use_rendering/1)).
:- use_rendering(c3).
:- endif.
:- dynamic kr/1,num/1.
:- mc.
:- begin_lpad.
% State Formulae 
models(_S, tt,_Hist,_Limit,_Time).
models(S, prop(P),_Hist,_Limit,_Time) :-
    proposition(P, S).
models(S, and(F1, F2),Hist,Limit,Time) :-
    models(S, F1,Hist,Limit,Time), models(S, F2,Hist,Limit,Time).
models(S, or(F1, _F2),Hist,Limit,Time) :-
    models(S, F1,Hist,Limit,Time).
models(S, or(F1, F2),Hist,Limit,Time) :-
    \+ models(S, F1,Hist,Limit,Time),
    models(S, F2,Hist,Limit,Time).
models(S, not(F), Hist,Limit,Time) :-
    \+ models(S, F,Hist,Limit,Time).
models(S, prob_until(comp(Op, P), F1, F2),Hist,Limit,Time) :-
    mc_sample(pmodels(S, until(F1, F2),Hist,Limit,Time),20, Q),
    comp(Q, Op, P).
models(S, prob_next(comp(Op, P), F),Hist,Limit,Time) :-
    mc_sample(pmodels(S, next(F),Hist,Limit,Time),20, Q),
    comp(Q, Op, P).
comp(Q,>,P):-
  Q>P.
comp(Q,>=,P):-
  Q>=P.
comp(Q,<,P):-
  Q<P.
comp(Q,=<,P):-
  Q=<P.
% Path Formulae
pmodels(S,F):-
  pmodels(S,F,[],nolimit,0,_Time).
pmodels(S,F,Hist,Limit,Time):-
  pmodels(S,F,Hist,Limit,Time,_Time).
pmodels(S, until(_F1, F2),Hist,Limit,Time,Time) :-
    models(S, F2,Hist,Limit,Time),!.
    
pmodels(S, until(F1, F2),Hist0,Limit,Time0,Time) :-
    within_limit(Time0,Limit),
    models(S, F1,Hist0,Limit,Time0),
    ctrans(S, _, T, Hist0, Hist),!,
    Time1 is Time0+1,
    pmodels(T, until(F1,F2),Hist,Limit,Time1,Time).
pmodels(S, next(F), Hist,Limit,Time0,Time) :-
    within_limit(Time0,Limit),
    ctrans(S, _, T, Hist, _),!,
    Time is Time0+1,
    models(T, F,Hist,Limit,Time).
within_limit(_Time,nolimit):-!.
within_limit(Time,Limit):-
  Time<Limit.
bounded_eventually(Prop,Rounds):-
  num(N),
  B is Rounds*(N+1),
  eventually(Prop,B,_T).
eventually(Prop):-
  eventually(Prop,_T).
eventually(Prop,Rounds):-
  eventually(Prop,nolimit,T),
  num(N),
  Rounds is T/(N+1).
eventually(Prop,Limit,T) :-
    init(S),
    pctlspec(Prop, F),
    pmodels(S, F,[],Limit,0,T).
pctlspec(X, until(tt, prop(X))).
proposition(P, S) :- final(P, S).
final(elect, [_|L]) :-
    num(N),
    gen_elected_state(N, L).
gen_elected_state(J, L) :-
    (J==0
    ->    L=[]
    ;     L = [state(3,_,_,_)|Rest],
          J1 is J-1,
          gen_elected_state(J1,Rest)
    ).
    
% transitions
% module counter
% [read] c<N-1 -> (c'=c+1);
% reading
trans(counter, counter(C), read, counter(D),_S,H,H) :-
  num(N),
  C < N-1,
  D is C+1.
% [read] c=N-1 -> (c'=c);
% finished reading
trans(counter, counter(C), read, counter(C),_S,H,H) :-
  num(N),
  C =:= N-1.
% [done] u1 | u2 | u3 | u4 -> (c'=c);
% done
trans(counter, counter(C), done, counter(C),S,H,H) :-
  get_processid(P), 
  nonlocal(process(P,_), uniqueid, 1,S).
   
% [retry] !(u1 | u2 | u3 | u4) -> (c'=1);
% pick again reset counter 
trans(counter, counter(_C), retry, counter(1),S,H,H) :-
        findall(P,get_processid(P),PL),
    maplist(nl(S),PL).
% [loop] s1=3 -> (c'=c);
% loop (when finished to avoid deadlocks)
trans(counter, counter(C), loop, counter(C),S,H,H) :-
  nonlocal(process(1,_), state, 3,S).
% module process
% local state
% s1=0 make random choice
% s1=1 reading
% s1=2 deciding
% s1=3 finished
% [pick] s1=0 -> 1/K : (s1'=1) & (p1'=0) & (v1'=0) & (u1'=true) + ...;
% pick value
trans(process(_N,_Next), state(0,_,_,_), pick, state(1,1,R,R),_S,H,[pick(R)|H]) :-
  pick(H,R).
%read 
% [read] s1=1 &  u1 & !p1=v2 & c<N-1 -> (u1'=true) & (v1'=v2);
trans(process(_N,Next), state(1,1,_,P), read, state(1,1,V,P),S,H,H) :-
  nonlocal(counter, counter, C,S),
  num(CN),
  C < CN - 1,
  nonlocal(process(Next,_), value, V,S),
  P \== V.
% [read] s1=1 &  u1 &  p1=v2 & c<N-1 -> (u1'=false) & (v1'=v2) & (p1'=0);
trans(process(_N,Next), state(1,1,_,P), read, state(1,0,V,0),S,H,H) :-
  nonlocal(counter, counter, C,S),
  num(CN),
  C < CN - 1,
  nonlocal(process(Next,_), value, V,S),
  P == V.
% [read] s1=1 & !u1 &  c<N-1 -> (u1'=false) & (v1'=v2);
trans(process(_N,Next), state(1,0,_,P), read, state(1,0,V,P),S,H,H) :-
  nonlocal(counter, counter, C,S),
  num(CN),
  C < CN - 1,
  nonlocal(process(Next,_), value, V,S).
 
% read and move to decide 
% [read] s1=1 &  u1 & !p1=v2 & c=N-1 -> (s1'=2) & (u1'=true) & (v1'=0) & (p1'=0);
trans(process(_N,Next), state(1,1,_,P), read, state(2,1,0,0),S,H,H) :-
  nonlocal(counter, counter, C,S),
  num(CN),
  C =:= CN - 1,
  nonlocal(process(Next,_), value, V,S),
  P \== V.
% [read] s1=1 &  u1 &  p1=v2 & c=N-1 -> (u1'=false) & (v1'=0) & (p1'=0);
trans(process(_N,Next), state(1,1,_,P), read, state(2,0,0,0),S,H,H) :-
  nonlocal(counter, counter, C,S),
  num(CN),
  C =:= CN - 1,
  nonlocal(process(Next,_), value, V,S),
  P == V.
% [read] s1=1 & !u1 &  c=N-1 -> (s1'=2) & (u1'=false) & (v1'=0);
trans(process(_N,_Next), state(1,0,_,P), read, state(2,0,0,P),S,H,H) :-
  nonlocal(counter, counter, C,S),
  num(CN),
  C =:= CN - 1.
% done
% [done] s1=2 -> (s1'=3) & (u1'=false) & (v1'=0) & (p1'=0);
trans(process(_N,_Next), state(2,_,_,_), done, state(3,0,0,0),_S,H,H).
% retry
% [retry] s1=2 -> (s1'=0) & (u1'=false) & (v1'=0) & (p1'=0);
trans(process(_N,_Next), state(2,_,_,_), retry, state(0,0,0,0),_S,H,H).
% loop (when finished to avoid deadlocks)
% [loop] s1=3 -> (s1'=3);
trans(process(_N,_Next), state(3,U,V,P), loop, state(3,U,V,P),_S,H,H).
pick(H,V):-
  kr(K),
  K1 is K-1,
  PH is 1/K,
  findall(I,between(0,K1,I),L),
  foldl(pick_value(H,PH),L,(1,_),(_,V)).
pick_value(_H,_PH,_I,(P0,V0),(P0,V0)):-
  nonvar(V0).
pick_value(H,PH,I,(P0,V0),(P1,V1)):-
  var(V0),
  PF is PH/P0,
  (pick_fact(H,V0,PF)->
    P1=PF,
    V1=I
  ;
    P1 is P0*(1-PF),
    V1=V0
  ).
pick_fact(_,_,P):P.
%pick(H,0):0.5; pick(H,1):0.5.
ctrans(S, A, T, Hi, Ho) :-
    config(P),
    find_matching_trans(P,S,S,[],A,T,Hi,Ho).
find_matching_trans([], [], _CS, _PA, A, [], H,H) :- nonvar(A).
find_matching_trans([P|Ps], [S|Ss], CS, PA, A, [T|Ts],Hi,Ho) :-
    pick_trans(P, S, CS, PA, A, T, Hi,H1),
    find_matching_trans(Ps, Ss, CS, PA, A, Ts,H1,Ho).
find_matching_trans([P|Ps], [S|Ss], CS, PA, A, [S|Ts], Hi,Ho) :-
    % skip current process; but then all transitions enabled in the current
    % process will be prohibited for selection in later processes.
    enabled_trans(P,L),
    (nonvar(A) -> \+ member(A,L); true),
    append(L, PA, PA1),
    find_matching_trans(Ps, Ss, CS, PA1, A, Ts, Hi, Ho).
pick_trans(P, S, CS, PA, A, T, H0,H) :-
    etrans(P, S, PA, A, T,CS, H0,H).
etrans(P, S, PA, A, T, CS,H0,H) :-
    trans(P, S, A, T,CS,H0,H),
    A \= epsilon,
    \+ member(A, PA).
enabled_trans(P, L) :-
    setof(A, enabled_trans_in_process(P,A), L).
enabled_trans_in_process(P,A) :-
    clause(trans(P,_,A,_,_,_,_),_),
    A \= epsilon.
nonlocal(Proc, Var, Val,CS) :-
    getpid(Proc, Var, Pid, Idx),
    nth1(Pid, CS, State),
    arg(Idx, State, Val).
%   writeln(nonlocal_read(Proc, State, Var, Val)).
getpid(Proc, Var, Pid, Idx) :-
    config(Config),
    nth1(Pid, Config, Proc),
    nonlocal_access(Proc, Var, Idx).
get_processid(P):-
  num(N),
  between(1,N,P).
config([counter| L]) :-
    findall(P,get_processid(P),PL),
    maplist(neighbor,PL,L).
neighbor(I,process(I,J)) :-
    num(N),
    (I<N
  ->
      J is I+1
    ;   J = 1
    ).
%config([counter, process(1,2), process(2,3), process(3,4), process(4,1)]).
init(S) :-
    config(P),
    maplist(init_state,P,S).
init_state(counter, counter(1)).
init_state(process(_,_), state(0,0,0,0)).
nonlocal_access(counter, counter, 1).
nonlocal_access(process(_,_), state, 1).
nonlocal_access(process(_,_), uniqueid, 2).
nonlocal_access(process(_,_), value, 3).
nl(S,P):-
  nonlocal(process(P, _), uniqueid, 0,S).
num(4).
kr(4).
:- end_lpad.
graph_prob(G):-
  retract(num(N)),
  retract(kr(K)),
  assert(num(4)),
  assert(kr(2)),
  findall(L-P,
    (between(1,6,L),mc_sample(bounded_eventually(elect,L),100,P)),LV),
  G=c3{data:_{x:x, rows:[x-'Probability of leader elected within L rounds (N=4, K=2)'|LV]},%legend:_{show: false},
    axis:_{x:_{min:1,max:6,label:'L',padding:_{bottom:0.0,top:0.0}},
           y:_{label:'Probability',padding:_{bottom:0.0,top:0.0}}}},
  retract(num(4)),
  retract(kr(2)),
  assert(num(N)),
  assert(kr(K)).
graph_exp_rounds_n(G):-
  retract(num(NI)),
  retract(kr(KI)),
  assert(kr(3)),
  findall(N-E,
    (between(3,8,N),
     assert(num(N)),
     mc_expectation(eventually(elect,T),100,T,E),
     retract(num(N))),
    LV),
  G=c3{data:_{x:x, rows:[x-'Expected rounds to elect a leader (K=3)'|LV]},%legend:_{show: false},
    axis:_{x:_{min:3,max:8,label:'N',padding:_{bottom:0.0,top:0.0}},
           y:_{label:'Expected rounds',padding:_{bottom:0.0,top:0.0}}}},
  retract(kr(3)),
  assert(num(NI)),
  assert(kr(KI)).
graph_exp_rounds_k(G):-
  retract(num(NI)),
  retract(kr(KI)),
  assert(num(3)),
  findall(K-E,
    (between(3,8,K),
     assert(kr(K)),
     mc_expectation(eventually(elect,T),500,T,E),
     retract(kr(K))),
    LV),
  G=c3{data:_{x:x, rows:[x-'Expected rounds to elect a leader (N=3)'|LV]},%legend:_{show: false},
    axis:_{x:_{min:3,max:8,label:'K',padding:_{bottom:0.0,top:0.0}},
           y:_{label:'Expected rounds',padding:_{bottom:0.0,top:0.0}}}},
  retract(num(3)),
  assert(num(NI)),
  assert(kr(KI)).
We consider the problem of computing expectations.
                                We would like to compute the expected number of rounds to elect a leader.
                                
                                We can compute expectations with
mc_expectation(:Query:atom,+N:int,?Arg:var,-Exp:float).
                            that computes the expected value of Arg in Query by
                                sampling.
                                It takes N samples of Query and sums up the value of Arg for
                                each sample. The overall sum is divided by N to give Exp.
An example of use of the above predicate is
mc_expectation(eventually(elect,T),1000,T,E).
that returns in E the expected number of rounds necessary to elect a leader, computed by taking 1000 samples.
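To make the averaging explicit, the same kind of estimate could be computed by hand from a list of V-N couples such as the one returned by mc_sample_arg_first/4. The helper below is a sketch, not part of cplint, and assumes all sampled values are numeric:
% hypothetical helper: weighted average of values V with counts N
expectation_from_samples(Couples, Exp) :-
  aggregate_all(sum(V*N), member(V-N, Couples), Sum),
  aggregate_all(sum(N), member(_-N, Couples), Count),
  Exp is Sum/Count.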
Complete example: pctl_slep.pl
In this example we want to show how to perform conditional inference in an approximate way using sampling. In particular, we will show how to use rejection sampling and Metropolis-Hastings.
This example generatively defines a random arithmetic function. The problem is to predict the value returned by the function given one or two input-output pairs, i.e., to compute a conditional probability. This program is translated from an example of the Church functional probabilistic programming language. Sampling is necessary as queries have an infinite number of explanations.
The full program of this example is
:- use_module(library(mcintyre)).
:- if(current_predicate(use_rendering/1)).
:- use_rendering(c3).
:- endif.
:- mc.
:- begin_lpad.
eval(X,Y):-
  random_fn(X,0,F),
  Y is F.
op(+):0.5;op(-):0.5.
random_fn(X,L,F):-
  comb(L),
  random_fn(X,l(L),F1),
  random_fn(X,r(L),F2),
  op(Op),
  F=..[Op,F1,F2].
random_fn(X,L,F):-
  \+ comb(L),
  base_random_fn(X,L,F).
comb(_):0.3.
base_random_fn(X,L,X):-
  identity(L).
base_random_fn(_X,L,C):-
  \+ identity(L),
  random_const(L,C).
identity(_):0.5.
random_const(L,0):0.1;random_const(L,1):0.1;random_const(L,2):0.1;
random_const(L,3):0.1;random_const(L,4):0.1;random_const(L,5):0.1;
random_const(L,6):0.1;random_const(L,7):0.1;random_const(L,8):0.1;
random_const(L,9):0.1.
:- end_lpad.
We know that the random function returns 3 for input 1 and we want to compute the probability that it returns 4 for input 2. We thus need to ask a conditional query, and sampling is necessary as queries have an infinite number of explanations. The simplest approach is to use rejection sampling: you first query the evidence and, if the query is successful, query the goal in the same sample; otherwise the sample is discarded. This can be done with
mc_rejection_sample(:Query:atom,:Evidence:atom,+Samples:int,
  -Probability:float,+Options:list).
that takes Samples samples of Query given that Evidence is true. Options is a list of options; the following are recognised:
successes(-Successes:int) number of successes
failures(-Failures:int) number of failures
An example of use of the above predicate is
mc_rejection_sample(eval(2,4),eval(1,3),1000,P,[]).
that performs rejection sampling of eval(2,4) given that eval(1,3) is true.
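The options can also be used to retrieve the number of accepted and discarded samples, for instance:
mc_rejection_sample(eval(2,4),eval(1,3),1000,P,[successes(S),failures(F)]).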
Differently from exact inference, in approximate inference 
                            the evidence can be a conjunction of atoms, so if 
                            you also know that eval(0,2) is true, you can use
mc_rejection_sample(eval(2,4),(eval(0,2),eval(1,3)),1000,P,[]).
and, as you can see, the query with more evidence is now almost certainly true.
In Metropolis-Hastings MCMC, a Markov chain is produced as follows: after a sample, a number of sampled probabilistic choices are deleted and the others are retained for the next sample. The sample is accepted with probability min{1, N0/N1}, where N0 is the number of choices sampled in the previous sample and N1 is the number of choices sampled in the current sample. Metropolis-Hastings is usually much faster than rejection sampling because fewer samples are discarded.
To use Metropolis-Hastings, the following predicate is available
mc_mh_sample(:Query:atom,:Evidence:atom,+Samples:int,
  -Probability:float,+Options:list).
where Options is a list of options; the following are recognised:
mix(+Mix:int) the first Mix samples are discarded (mixing time), default value 0
lag(+Lag:int) lag between each sample: Lag sampled choices are forgotten before taking a new sample, default value 1
successes(-Successes:int) number of successes
failures(-Failures:int) number of failures
bar(-BarChart:dict) BarChart is a bar chart of the possible values
For example
mc_mh_sample(eval(2,4),eval(1,3),10000,P,[]).
takes 10000 accepted samples and returns in P the
                            estimated probability.
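The mix and lag options can be supplied as well; a sketch with illustrative values:
mc_mh_sample(eval(2,4),eval(1,3),10000,P,[mix(100),lag(3)]).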
You can also sample arguments of queries, with rejection sampling and Metropolis-Hastings MCMC respectively, using
mc_rejection_sample_arg(:Query:atom,:Evidence:atom,+Samples:int,
  ?Arg:Var,-Values:list,+Options:list).
mc_mh_sample_arg(:Query:atom,:Evidence:atom,+Samples:int,
  ?Arg:Var,-Values:list,+Options:list).
These predicates return the distribution of values for Arg in Query in Samples samples of Query, given that Evidence is true. They return in Values a list of couples L-N, where L is the list of values of Arg for which Query succeeds in a world sampled at random where Evidence is true and N is the number of samples returning that list of values.
For example, if we want to sample argument Y of eval(2,Y) given that eval(1,3) is true, with Metropolis-Hastings MCMC:
mc_mh_sample_arg(eval(2,Y),eval(1,3),1000,Y,V,[]).
The printed results are pretty ugly. To visualize the results of sampling arguments as a bar chart you can use argbar/2.
If we want to plot the bar chart of the previous query
mc_mh_sample_arg(eval(2,Y),eval(1,3),1000,Y,L,[]),argbar(L,C).
You can also compute conditional expectations with
mc_mh_expectation(:Query:atom,:Evidence:atom,+N:int,?Arg:var,-Exp:float,+Options:list).
                        as in
mc_mh_expectation(eval(2,Y),eval(1,3),1000,Y,E,[]).
that computes the expectation of argument Y of eval(2,Y) given that
                            eval(1,3) is true by taking 1000 samples using Metropolis-Hastings MCMC.
Complete example: arithm.pl
Up to now we have considered only discrete random variables and discrete probability distributions. Here we consider Hybrid Probabilistic Logic Programs, where some of the random variables are continuous.
How can we consider continuous random variables and probability density functions, for example real variables following a Gaussian distribution?
cplint on SWISH, with its sampling inference module, allows the specification of density functions over arguments of atoms in the head of rules. To specify a probability density on an argument Var of an atom A you can use rules of the form
A : Density :- Body.
                        where Density is a special atom identifying 
                            a probability density on variable Var 
                            and Body (optional) is a regular clause body. 
                            Allowed Density atoms are:
                        
uniform(Var,L,U): Var 
                                    is uniformly distributed in \([L,U]\)
gaussian(Var,Mean,Variance): Var 
                                    follows a Gaussian distribution with mean Mean 
                                    and variance Variance
dirichlet(Var,Par): Var 
                                    is a list of real numbers following a Dirichlet distribution with 
                                    parameters \(\alpha\) 
                                    specified by the list Par
gamma(Var,Shape,Scale): Var 
                                    follows a gamma distribution with shape parameter 
                                    Shape and scale parameter 
                                    Scale.
beta(Var,Alpha,Beta): Var 
                                    follows a beta distribution with parameters 
                                    Alpha and Beta.
This syntax can also be used to describe discrete distributions, with either a finite or countably infinite support:
discrete(Var,D) or finite(Var,D): A is an atom containing variable Var and D is a list of couples Value:Prob assigning probability Prob to Value
uniform(Var,D): A is an atom containing variable Var and D is a list of values, each taking the same probability (1 over the length of D).
poisson(Var,Lambda): Var 
                                    follows a Poisson distribution with parameter 
                                    Lambda.
For example, the fact
g(X) : gaussian(X,0,1).
                        states that argument X of g(X) 
                            follows a Gaussian distribution with mean 0 and variance 1.
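Similarly, a sketch of a fact with a discrete density (the predicate die/1 is just an illustration) could be
% argument N of die(N) is uniformly distributed over the values 1..6
die(N) : uniform(N,[1,2,3,4,5,6]).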
In this section we will see how to perform simple 
                            queries over hybrid programs. 
                            
                            If an atom encodes a continuous random variable (such 
                            as g(X) above),
                            asking the probability that a ground instantiation, 
                            such as g(0.3),
                            is true is
                            not meaningful, as the probability that a continuous 
                            random variables takes a
                            specific value is always 0. In this case you are more 
                            interested in computing the
                            distribution of X of a goal g(X),
                            possibly after having observed some evidence.
                        
If the query is unconditional, we can use approximate inference with Monte Carlo sampling as described in the previous sections.
                            When you have continuous random variables, you may be 
                            interested in sampling arguments of goals representing 
                            continuous random variables. In this way you can build a
                            probability density of the sampled argument. To do that 
                            you can use the predicate mc_sample_arg/4.
                            
As an example let us consider the following program.
A biased coin is thrown, if it lands heads, X 
                            in mix(X)
                            is sampled from a Gaussian with mean 0 and variance 1. 
                            If it lands tails, X is sampled from a Gaussian with 
                            mean 5 and variance 2.
:- use_module(library(mcintyre)).
:- use_rendering(c3).
:- mc.
:- begin_lpad.
% a coin is thrown. The coin is biased: with probability 0.6 it lands heads,
% with probability 0.4 it lands tails
heads:0.6;tails:0.4. 
% X in g(X)  follows a Gaussian distribution with mean 0 and variance 1
g(X): gaussian(X,0,1).
% X in h(X)  follows a Gaussian distribution with mean 5 and variance 2
h(X): gaussian(X,5,2).
% if the coin lands heads, X in mix(X) is given by g(X)
mix(X) :- heads, g(X).
% if the coin lands tails, X in mix(X) is given by h(X)
mix(X) :- tails, h(X).
:- end_lpad.
For example we can now perform the query
mc_sample_arg(mix(X),1000,X,Values).
Values will contain a list of couples L-N, where L is the list of values of X for which the query succeeds in a world sampled at random and N is the number of samples returning L.
                            
                            Notice that, in every couple L-N,
                            L will contain just 
                            one element and N will be always 1. This 
                            is because the random variable X is continuous
                            and mix(X) always succeeds exactly once,
                            therefore the predicate mc_sample_arg/4
                            will sample 1000 different worlds and every world will
                            have a different value for X.
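Since every L contains a single element, the raw output can be flattened into a plain list of numbers with a small helper (a sketch, assuming the output format described above):
% hypothetical helper: extract the single value from each [X]-N couple
samples_to_list(Values, Samples) :-
  findall(X, member([X]-_, Values), Samples).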
                        
                            Of course watching the 
                            resulting values as a list of numerical 
                            values could make you feel the pesky sensation of your eyes melting. 
                            
                            It would be better if we could use some 
                            kind of graphical feature to represent the results. 
                            In other words our desire is to draw the 
                            probability density function of the argument.
                            
                            cplint on SWISH
                            provides the predicate histogram/3 that comes
                            in handy for these situations.
                        
histogram(+List:list,-Chart:dict,+Options:list).
where Options is a list of options, the following are recognised:
min(+Min:float) the minimum value of domain, default value the minimum in List
max(+Max:float) the maximum value of domain, default value the maximum in List
nbins(+NBins:int) the number of bins for dividing the domain, default value 40
This predicate draws a vertical histogram of the samples in 
                            List 
                            dividing the domain in NBins bins. 
                            List must be a list of couples of the 
                            form L-W where L is a 
                            list of sampled values and W is its weight. 
                            
                            So we can use the output list Values of
                            mc_sample_arg/4 as input list List
                            for histogram/3. In our example, if we 
                            want to plot the probability density function of X,
                            we can use
                        
mc_sample_arg(mix(X),1000,X,_Values), histogram(_Values,Chart,[]).
Note: we wrote _Values
                            instead of Values because we are not interested
                            in printing the results of the sampling.
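The options of histogram/3 can be used to control the domain and the number of bins; for instance, with illustrative values:
mc_sample_arg(mix(X),1000,X,_Values), histogram(_Values,Chart,[min(-4),max(10),nbins(20)]).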
                        
Complete examples: gaussian_mixture.pl
cplint on SWISH also allows executing conditional queries over hybrid programs. As in the previous section, we are interested in sampling arguments of goals representing continuous random variables and in drawing the probability density of the sampled argument. To perform this kind of query we must distinguish three cases, depending on the type of evidence we have: evidence on atoms with discrete values, evidence expressed as a constraint on a continuous argument, and evidence on ground atoms with continuous values as arguments. In the first two cases we can use mc_rejection_sample_arg/6 and mc_mh_sample_arg/6, the predicates for rejection sampling and Metropolis-Hastings MCMC seen in the previous section; the third case requires likelihood weighting, described below.
Let us consider the same program of the previous section. We want to take 1000 samples of X in mix(X), given that heads was true, using rejection sampling and Metropolis-Hastings MCMC
mc_rejection_sample_arg(mix(X),heads,1000,X,Values,[]).
mc_mh_sample_arg(mix(X),heads,1000,X,Values,[lag(2)]).
                            
                            To plot the distribution of the arguments we can use 
                            histogram/3.
                        
mc_rejection_sample_arg(mix(X),heads,1000,X,_V,[]), histogram(_V,Chart,[]).
mc_mh_sample_arg(mix(X),heads,1000,X,_V,[lag(2)]), histogram(_V,Chart,[]).
Let us consider again the same program. This time we want to take 1000 samples of X in mix(X), given that X>2 is true, using rejection sampling, and draw a histogram with 40 bins representing the probability density of X. To do that we can ask the query
mc_rejection_sample_arg(mix(X),(mix(Y),Y>2),1000,X,_V,[]), histogram(_V,Chart,[]).
We can do the same by using Metropolis-Hastings MCMC
mc_mh_sample_arg(mix(X),(mix(Y),Y>2),1000,X,_Values,[lag(2)]), 
  histogram(_Values,Chart,[]).
When you have evidence on ground atoms that have continuous values as arguments, you can use neither rejection sampling nor Metropolis-Hastings, as the probability of the evidence is 0; instead, you need to use likelihood weighting to obtain samples of continuous arguments of a goal. Let us take as an example the following program.
We are trying to estimate the true value of a Gaussian distributed random variable, given some observed data. The variance is known (2) and we suppose that the mean has a Gaussian prior distribution with mean 1 and variance 5. We take different measurements (e.g. at different times), indexed by an integer. Given that we observe 9 and 8 at indexes 1 and 2, how does the distribution of the random variable (the value at index 0) change with respect to the case of no observations?
:- use_module(library(mcintyre)).
:- use_rendering(c3).
:- mc.
:- begin_lpad.
% at time I we see X sampled from a Gaussian with mean M and variance 2.0
val(I,X) :- 
  mean(M),
  measurement(I,M,X).
% Gaussian distribution of the mean of the Gaussian of the variable
mean(M): gaussian(M,1.0, 5.0).
% Gaussian distribution of the variable
measurement(_,M,X): gaussian(X,M, 2.0).
:- end_lpad.
As in the previous cases we are interested in finding the posterior probability density function of an argument. To do that we can use the predicate
mc_lw_sample_arg(:Query:atom,:Evidence:atom,+N:int,?Arg:var,-ValList).
                        This predicate returns in ValList a list of couples 
                            V-W where V is a value 
                            of Arg for which Query 
                            succeeds and W is the weight computed 
                            by likelihood weighting according to Evidence 
                            (a conjunction of atoms is allowed here).
                            
                            For example we may ask given that we observe 9 and 8 at indexes 1 and 2, 
                            what is the distribution of the random 
                            variable (value at index 0)?
                        
mc_lw_sample_arg(val(0,X),(val(1,9),val(2,8)),1000,X,V).
Of course it would be better if we could plot the results in a graph, instead of just printing them (aah! my eyes!). We cannot use histogram/3 because it takes as input a list of couples L-W where L is a list of values, whereas mc_lw_sample_arg/5 returns a list of couples V-W where V is a single value. In this case you can use densities/4
                        
densities(+PriorList:list,+PostList:list,-Chart:dict,+Options:list)
                        This predicate draws a line chart of the density of two sets 
                            of samples, usually prior and post observations. 
                            The samples from the prior are in PriorList 
                            as couples L-W, where L is a 
                            list of values and W is its weight, 
                            while the samples from the posterior are in 
PostList as couples V-W, where V is a value and W its weight.
                            The same options as histogram/3 are recognized.
                            The lines 
                            are drawn dividing the domain in the number of bins provided as an option.
                            
                            If we want to plot the densities of the random variable before and after  
                            observing 9 and 8 by taking 1000 samples and by dividing the domain
                            in 40 bins, we can run
                        
mc_sample_arg(val(0,Y),1000,Y,_V0), 
mc_lw_sample_arg(val(0,X),(val(1,9),val(2,8)),1000,X,_V),
densities(_V0,_V,Chart,[]).
Complete examples: gaussian_mixture.pl, gauss_mean_est.pl
In this section we will see how to learn the parameters given a background knowledge and an initial program. We take into account the Machines dataset.
To execute a learning algorithm the input source, a .pl file, must be divided into five parts: preamble, background knowledge, initial program, language bias and example interpretations.
Here we will define a program step by step and then execute the algorithm EMBLEM which learns the parameters of a given initial LPAD program.
For more information on how to perform learning see the cplint on SWISH manual (PDF version).
In order to perform both EMBLEM and SLIPCOVER you need to load the library slipcover with the command
:- use_module(library(slipcover)).
                        After that you have to initialize slipcover with the command
:- sc.
                        
                        Now that we have defined the preamble, one can specify the background knowledge with a fact of the form
bg(<list of terms representing clauses>).
                        Alternatively, we can specify a set of clauses by including them in a section between the goals :- begin_bg. and :- end_bg.. We will use the latter approach.
:- begin_bg.
component(C):-
  replaceable(C).
component(C):-
  not_replaceable(C).
replaceable(gear).
replaceable(wheel).
replaceable(chain).
not_replaceable(engine).
not_replaceable(control_unit).
not_worn(C):-
  component(C),
  \+ worn(C).
one_worn:-
  worn(_).
none_worn:-
  \+ one_worn.
:- end_bg.
                        At this point we can define an initial program for which we want to learn the parameters. We can do it with a fact of the form
in(<list of terms representing clauses>).
                        Alternatively, you can specify an input program in a section between :- begin_in. and :- end_in.. We will use the latter method. Therefore in our example
:- begin_in.
class(sendback):0.5 :-
  worn(A),
  not_replaceable(A).
class(fix):0.6 :-
  worn(A),
  replaceable(A).
class(ok):0.5 :-
  not_worn(_A).
:- end_in.
                        The language bias part contains the declarations of the input and output predicates.
The typical input for EMBLEM will be a set of interpretations, i.e. sets of ground facts. Among the predicates for the input facts the user has to indicate which are the output predicates. Output predicates are declared as
output(<predicate>/<arity>).
                        In our example
output(class/1).
                        Input predicates are those whose atoms you are not interested in predicting.
You can declare closed world input predicates with
input_cw(<predicate>/<arity>).
                        For these predicates, the only true atoms are those in the interpretations and those derivable from them using the background knowledge; the clauses in the input/hypothesized program are not used to derive atoms for these predicates. Moreover, clauses in the background knowledge that define closed world input predicates and that call an output predicate in the body will not be used for deriving examples. In our example
input_cw(replaceable/1).
input_cw(not_replaceable/1).
input_cw(worn/1).
input_cw(not_worn/1).
input_cw(none_worn/0).
                        Besides closed world input predicates we can declare open world input predicates with
input(<predicate>/<arity>).
                        In our example we do not have open world input predicates.
The last part of the input file contains the data. 
                            You can specify data with two modalities: models and keys.
                            In the models type, you specify an example model (or interpretation) 
                            as a list of Prolog facts initiated by begin(model(<name>)). 
                            and terminated by end(model(<name>))..
                            Alternatively, with the keys modality, you can directly 
                            write the facts and the first argument will be interpreted as a model identifier.
                            
                            The two modalities, models and keys, can be mixed in the same source.
                            If we use the model modality for the example/interpretation 1, we get
begin(model(1)).
class(sendback).
neg(class(fix)).
neg(class(ok)).
worn(gear).
worn(engine).
end(model(1)).
If we use this modality the system asserts an int(<name>). fact for each model enclosed between begin(model(<name>)). and end(model(<name>))..
Instead, if we use the key modality, our example will be (note the first argument of each fact)
class(1,sendback).
neg(1,class(fix)).
neg(1,class(ok)).
worn(1,gear).
worn(1,engine).
If we use this modality, facts for int/1 are not asserted for interpretations, but they can be explicitly added by the user.
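For example, to make the intensional fold definition shown below work with the key modality, one would add an identifier fact by hand for each interpretation:
int(1).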
After defining the examples/interpretations we must indicate how the examples are divided into folds with facts of the form:
fold(<fold_name>,<list of model identifiers>).
                        as for example
fold(train1,[1,2,3]).
fold(train2,[4,5,6,7,8,9,10]).
We can also define the folds intensionally, as in
fold(all,F) :- findall(I,int(I),F).
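For instance, a hypothetical split of the interpretations into a training and a test fold could be written as
fold(train,[1,2,3,4,5,6,7,8,9,10]).
fold(test,[11,12,13,14,15]).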
                        The complete Machines input file is
% PREAMBLE %
:- use_module(library(slipcover)).
% use the renderer 'lpad'. It is not mandatory, but it prints the learned clauses in a more readable way
:- use_rendering(lpad).
:- sc.
:- set_sc(depth_bound,false).     % do not impose a depth bound on derivations
:- set_sc(neg_ex,given).          % negative examples are given as neg/1 facts
:- set_sc(megaex_bottom,15).      % number of mega-examples on which to build the bottom clauses
:- set_sc(max_iter,10).           % maximum number of EMBLEM iterations
:- set_sc(max_iter_structure,50). % maximum number of structure search iterations
:- set_sc(verbosity,1).           % amount of information printed
% BACKGROUND KNOWLEDGE %
:- begin_bg.
component(C):-
  replaceable(C).
component(C):-
  not_replaceable(C).
replaceable(gear).
replaceable(wheel).
replaceable(chain).
not_replaceable(engine).
not_replaceable(control_unit).
not_worn(C):-
  component(C),
  \+ worn(C).
one_worn:-
  worn(_).
none_worn:-
  \+ one_worn.
:- end_bg.
% INITIAL PROGRAM %
:- begin_in.
class(sendback):0.5 :-
  worn(A),
  not_replaceable(A).
class(fix):0.6 :-
  worn(A),
  replaceable(A).
class(ok):0.5 :-
  not_worn(_A).
:- end_in. 
% LANGUAGE BIAS %
% output predicates
output(class/1).
% input closed world predicates
input_cw(replaceable/1).
input_cw(not_replaceable/1).
input_cw(worn/1).
input_cw(not_worn/1).
input_cw(none_worn/0).
% EXAMPLES (interpretations) %
begin(model(1)).
class(sendback).
neg(class(fix)).
neg(class(ok)).
worn(gear).
worn(engine).
end(model(1)).
begin(model(2)).
class(ok).
neg(class(sendback)).
neg(class(fix)).
end(model(2)).
begin(model(3)).
class(fix).
neg(class(sendback)).
neg(class(ok)).
worn(gear).
end(model(3)).
begin(model(4)).
class(sendback).
neg(class(fix)).
neg(class(ok)).
worn(engine).
end(model(4)).
begin(model(5)).
class(fix).
neg(class(sendback)).
neg(class(ok)).
worn(gear).
worn(chain).
end(model(5)).
begin(model(6)).
class(fix).
neg(class(sendback)).
neg(class(ok)).
worn(wheel).
end(model(6)).
begin(model(7)).
class(sendback).
neg(class(fix)).
neg(class(ok)).
worn(wheel).
worn(control_unit).
end(model(7)).
begin(model(8)).
class(ok).
neg(class(sendback)).
neg(class(fix)).
end(model(8)).
begin(model(9)).
class(fix).
neg(class(sendback)).
neg(class(ok)).
worn(wheel).
worn(chain).
end(model(9)).
begin(model(10)).
class(sendback).
neg(class(fix)).
neg(class(ok)).
worn(engine).
worn(chain).
end(model(10)).
begin(model(11)).
class(sendback).
neg(class(fix)).
neg(class(ok)).
worn(engine).
worn(control_unit).
end(model(11)).
begin(model(12)).
class(fix).
neg(class(sendback)).
neg(class(ok)).
worn(chain).
worn(wheel).
worn(gear).
end(model(12)).
begin(model(13)).
class(sendback).
neg(class(fix)).
neg(class(ok)).
worn(chain).
worn(wheel).
worn(gear).
worn(engine).
end(model(13)).
begin(model(14)).
class(ok).
neg(class(sendback)).
neg(class(fix)).
end(model(14)).
begin(model(15)).
class(fix).
neg(class(sendback)).
neg(class(ok)).
worn(wheel).
worn(gear).
end(model(15)).
fold(all, F) :- findall(I,int(I),F).
To execute the parameter learning algorithm EMBLEM, we need to ask a query of the form
induce_par(<list of folds>,P).
                        where <list of folds> is a list of the folds for training and P will contain the
                            input program with updated parameters.
In our example we want to learn the parameters by using the fold which contains all the examples (all). Therefore
induce_par([all],P).
Complete example: mach.pl
For more information about how to perform learning and about EMBLEM, see the manual (or its PDF version) and the references in the About page.
cplint on SWISH also allows the user to perform structure learning. In this tutorial section we will learn how to do that. We consider the Machines dataset already illustrated in the previous section.
Note: the learning algorithms are available only if you use the Prolog editor.
To execute a learning algorithm, the input source must be divided into five parts: preamble, background knowledge, initial program, language bias, and example interpretations.
Here we will define a program step by step and then execute the SLIPCOVER algorithm, which learns both the structure and the parameters given the background knowledge. We keep the background knowledge, the initial LPAD, and the interpretations unchanged. However, in order to perform structure learning, we need to modify the preamble and the language bias.
For more information about how to perform learning, see the cplint on SWISH manual (PDF version).
We load the library slipcover, initialize it and then set some parameters. We use the same preamble as in the previous section; note in particular the settings :- set_sc(megaex_bottom,15). (the number of mega-examples on which to build the bottom clauses) and :- set_sc(max_iter_structure,50). (the maximum number of iterations of the theory search). So the preamble becomes
:- use_module(library(slipcover)).
:- sc.
:- set_sc(depth_bound,false).
:- set_sc(neg_ex,given).
:- set_sc(max_iter,10).
:- set_sc(max_iter_structure,50).
:- set_sc(megaex_bottom,15).
:- set_sc(verbosity,1).
In the previous example we already defined the set of output and input predicates for the EMBLEM algorithm. However, now we want to learn new clauses (the structure) for the program; to do that, we need to add mode declarations in the style of Progol.
To specify the atoms that can appear in the head of clauses you must use facts of the form
modeh(<recall>,<predicate>(<arg1>,...)).
                        To specify the atoms that can appear in the body of clauses you must use facts of the form
modeb(<recall>,<predicate>(<arg1>,...)).
                        where <recall> can be an integer or *. <recall> indicates how many atoms for the predicate specification are retained in the bottom clause during a saturation step. * stands for all those that are found.
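For instance (an illustrative declaration, not part of the Machines bias used below), a finite recall bounds the number of instances of an atom kept in the bottom clause:
% keep at most 2 worn/1 atoms in the bottom clause during a saturation step
modeb(2,worn(-comp)).
% keep all the not_worn/1 atoms that are found
modeb(*,not_worn(-comp)).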
We refer to the cplint on SWISH manual for further details.
In our example we use the following mode declarations
modeh(*,class(fix)).
modeh(*,class(ok)).
modeh(*,class(sendback)).
modeb(*,not_replaceable(-comp)).
modeb(*,replaceable(-comp)).
modeb(*,worn(-comp)).
modeb(*,not_worn(-comp)).
modeb(*,none_worn).
SLIPCOVER also requires Aleph-style facts for the determination/2 predicate, which indicate which predicates can appear in the body of clauses. For our program we have
determination(class/1,replaceable/1).
determination(class/1,not_replaceable/1).
determination(class/1,worn/1).
determination(class/1,not_worn/1).
determination(class/1,none_worn/0).
For example, the first determination/2 fact states that the predicate replaceable/1 can appear in the body of hypothesized clauses having class/1 in the head.
The complete program is shown below.
% PREAMBLE %
:- use_module(library(slipcover)).
% use the renderer 'lpad'. It is not mandatory, but it prints the learned clauses in a more readable way
:- use_rendering(lpad).
:- use_rendering(c3).
:- sc.
:- set_sc(depth_bound,false).
:- set_sc(neg_ex,given).
:- set_sc(megaex_bottom,15).
:- set_sc(max_iter,10).
:- set_sc(max_iter_structure,50).
:- set_sc(verbosity,1).
% BACKGROUND KNOWLEDGE %
:- begin_bg.
component(C):-
  replaceable(C).
component(C):-
  not_replaceable(C).
replaceable(gear).
replaceable(wheel).
replaceable(chain).
not_replaceable(engine).
not_replaceable(control_unit).
not_worn(C):-
  component(C),
  \+ worn(C).
one_worn:-
  worn(_).
none_worn:-
  \+ one_worn.
:- end_bg.
% INITIAL PROGRAM %
:- begin_in.
class(sendback):0.5 :-
  worn(A),
  not_replaceable(A).
class(fix):0.6 :-
  worn(A),
  replaceable(A).
class(ok):0.5 :-
  not_worn(_A).
:- end_in. 
% LANGUAGE BIAS %
% output predicates
output(class/1).
% input closed world predicates
input_cw(replaceable/1).
input_cw(not_replaceable/1).
input_cw(worn/1).
input_cw(not_worn/1).
input_cw(none_worn/0).
modeh(*,class(fix)).
modeh(*,class(ok)).
modeh(*,class(sendback)).
modeb(*,not_replaceable(-comp)).
modeb(*,replaceable(-comp)).
modeb(*,worn(-comp)).
modeb(*,not_worn(-comp)).
modeb(*,none_worn).
determination(class/1,replaceable/1).
determination(class/1,not_replaceable/1).
determination(class/1,worn/1).
determination(class/1,not_worn/1).
determination(class/1,none_worn/0).
% EXAMPLES (interpretations) %
begin(model(1)).
class(sendback).
neg(class(fix)).
neg(class(ok)).
worn(gear).
worn(engine).
end(model(1)).
begin(model(2)).
class(ok).
neg(class(sendback)).
neg(class(fix)).
end(model(2)).
begin(model(3)).
class(fix).
neg(class(sendback)).
neg(class(ok)).
worn(gear).
end(model(3)).
begin(model(4)).
class(sendback).
neg(class(fix)).
neg(class(ok)).
worn(engine).
end(model(4)).
begin(model(5)).
class(fix).
neg(class(sendback)).
neg(class(ok)).
worn(gear).
worn(chain).
end(model(5)).
begin(model(6)).
class(fix).
neg(class(sendback)).
neg(class(ok)).
worn(wheel).
end(model(6)).
begin(model(7)).
class(sendback).
neg(class(fix)).
neg(class(ok)).
worn(wheel).
worn(control_unit).
end(model(7)).
begin(model(8)).
class(ok).
neg(class(sendback)).
neg(class(fix)).
end(model(8)).
begin(model(9)).
class(fix).
neg(class(sendback)).
neg(class(ok)).
worn(wheel).
worn(chain).
end(model(9)).
begin(model(10)).
class(sendback).
neg(class(fix)).
neg(class(ok)).
worn(engine).
worn(chain).
end(model(10)).
begin(model(11)).
class(sendback).
neg(class(fix)).
neg(class(ok)).
worn(engine).
worn(control_unit).
end(model(11)).
begin(model(12)).
class(fix).
neg(class(sendback)).
neg(class(ok)).
worn(chain).
worn(wheel).
worn(gear).
end(model(12)).
begin(model(13)).
class(sendback).
neg(class(fix)).
neg(class(ok)).
worn(chain).
worn(wheel).
worn(gear).
worn(engine).
end(model(13)).
begin(model(14)).
class(ok).
neg(class(sendback)).
neg(class(fix)).
end(model(14)).
begin(model(15)).
class(fix).
neg(class(sendback)).
neg(class(ok)).
worn(wheel).
worn(gear).
end(model(15)).
fold(all,F):- findall(I,int(I),F).
:- fold(all,F),
   sample(4,F,FTr,FTe),
   assert(fold(train,FTr)),
   assert(fold(test,FTe)).
To execute the structure learning algorithm SLIPCOVER (which also learns the parameters of the learned program), we need to execute a query of the form
induce(<list of folds>,P).
                        where <list of folds> is a list of the folds for training and P will contain the learned program.
In our example we want to learn the structure (and the parameters) by using the fold which contains all the examples (all). Therefore
induce([all],P).
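P is bound to the learned program. Purely as an illustration (the clauses and probabilities below are invented, since SLIPCOVER's output depends on the search), the result rendered as an LPAD might look like:
% invented example output, not an actual SLIPCOVER run
class(sendback):0.95 :-
  worn(A),
  not_replaceable(A).
class(fix):0.84 :-
  worn(A),
  replaceable(A).
class(ok):0.9 :-
  none_worn.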
Complete example: mach.pl
For more information about how to perform learning and SLIPCOVER, see the cplint on SWISH manual (or PDF version) and the references in the About page.
In this tutorial section we will see how to execute a test on a program. We consider the Machines dataset already seen in the previous sections (Parameter Learning and Structure Learning).
A program can also be tested on a test set with a query of the form
?- test(<program>,<list_of_folds>,LL,AUCROC,ROC,AUCPR,PR).
where <program> is a list of terms representing clauses and <list_of_folds> is a list of folds. This returns the log likelihood of the test examples in LL, the Area Under the ROC curve in AUCROC, a dictionary containing the list of points (in the form of Prolog pairs x-y) of the ROC curve in ROC, the Area Under the PR curve in AUCPR, and a dictionary containing the list of points of the PR curve in PR.
Let us suppose now that we have two disjoint folds of examples named train and test. We will now see how to test a (learned) program.
We can test the input program on the fold test with a query of the form
?- in(P), test(P,[test],LL,AUCROC,ROC,AUCPR,PR).
Suppose we want to perform parameter learning on the initial program by using the train fold and then test the resulting program by using the test fold. Then we just have to run the query
?- induce_par([train],P), test(P,[test],LL,AUCROC,ROC,AUCPR,PR).
Suppose we want to learn new clauses (i.e. perform structure learning) by using the train fold and then test the resulting program by using the test fold. Then we just have to run the query
?- induce([train],P), test(P,[test],LL,AUCROC,ROC,AUCPR,PR).
It is possible to see the ROC and PR curves as graphs by including the renderer c3 before :- sc.. Moreover, we include the renderer lpad to have the output program pretty printed. Therefore we add the following commands in the preamble before :- sc..
:- use_rendering(c3).
:- use_rendering(lpad).
We can intensionally create the fold containing all the examples with
fold(all,F):- findall(I,int(I),F).
                        We can dynamically create the folds train and test with the following command
:- fold(all,F),
   sample(4,F,FTr,FTe),
   assert(fold(train,FTr)),
   assert(fold(test,FTe)).
This last command should however be inserted after the input interpretations. As can be seen, it uses sample(N,List,Sampled,Rest), exported by the library slipcover, which samples N elements from List and returns the sampled elements in Sampled and the rest in Rest. If List has N elements or fewer, Sampled is equal to List and Rest is empty.
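As a quick illustration of sample/4 (the selection is random, so the split below is only one of the possible outcomes):
?- sample(4,[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15],FTr,FTe).
FTr = [3,11,7,14],    % 4 randomly chosen identifiers (illustrative)
FTe = [1,2,4,5,6,8,9,10,12,13,15].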
If we want to learn the parameters of the initial program and then test the resulting program, we can use the following query
induce_par([train],P), test(P,[test],LL,AUCROC,ROC,AUCPR,PR).
If we want to learn a program and then test it, we can use the following query
induce([train],P), test(P,[test],LL,AUCROC,ROC,AUCPR,PR).
Complete example: mach.pl
For more information about how to perform learning see the cplint on SWISH manual (or PDF version).