API Documentation
BayesNets.BDeuPrior
— TypeAssigns equal scores to Markov equivalent structures
α_ijk = x / (q_i ⋅ r_i) for each j, k and some given x
see DMU section 2.4.3
BayesNets.BayesNetSampler
— TypeAbstract type for sampling with:
Random.rand(BayesNet, BayesNetSampler)
Random.rand(BayesNet, BayesNetSampler, nsamples)
Random.rand!(Assignment, BayesNet, BayesNetSampler)
BayesNets.DirectSampler
— TypeStraightforward sampling from a BayesNet. The default sampler.
BayesNets.DirichletPrior
— TypeBayesian structure learning seeks to maximize P(G|D). In the Bayesian fashion, we can provide a prior over the parameters of the network being learned. This is described using a Dirichlet prior.
BayesNets.DiscreteBayesNet
— TypeDiscreteBayesNets are Bayesian networks in which every variable is an integer within 1:Nᵢ and every distribution is Categorical.
This representation is very common and allows for the use of factors, for example in Probabilistic Graphical Models by Koller and Friedman.
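For orientation, a minimal construction sketch is shown here; the DiscreteCPD convenience constructors are assumptions based on the CPD documentation later in this section, and rand(bn, nsamples) is the DataFrame-producing method documented under Base.rand.
using BayesNets
using Distributions   # Categorical
# Two-node network a → b, each variable taking values in 1:2.
bn = DiscreteBayesNet()
push!(bn, DiscreteCPD(:a, [0.3, 0.7]))                        # P(a)
push!(bn, DiscreteCPD(:b, [:a], [2],                          # P(b | a): one Categorical per parental instantiation
    [Categorical([0.9, 0.1]), Categorical([0.2, 0.8])]))
rand(bn, 5)   # DataFrame with 5 sampled assignments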
BayesNets.ExactInference
— TypeExact inference using factors and variable elimination
BayesNets.Factor
— TypeFactor(bn, name, evidence=Assignment())
Create a factor for a node, given some evidence.
BayesNets.Factor
— TypeFactor(dims, potential)
Create a Factor corresponding to the potential.
BayesNets.Factor
— MethodFactor(dims, lengths, fill_value=0)
Create a factor with dimensions dims, each with lengths corresponding to lengths. fill_value fills the potential array with that value. To keep the potential uninitialized, use fill_value=nothing.
BayesNets.GibbsSampler
— TypeThe GibbsSampler type houses the parameters of the Gibbs sampling algorithm. The parameters are defined below:
burnin: The first burnin samples will be discarded. They will not be returned. The thinning parameter does not affect the burn-in period. This is used to ensure that the Gibbs sampler converges to the target stationary distribution before actual samples are drawn.
thinning: For every thinning + 1 samples drawn, only the last is kept. Thinning is used to reduce autocorrelation between samples. Thinning is not used during the burn-in period. E.g., if thinning is 1, samples will be drawn in groups of two and only the second sample will be in the output.
time_limit: The number of milliseconds to run the algorithm. The algorithm will return the samples it has collected when either nsamples samples have been collected or time_limit milliseconds have passed. If time_limit is null then the algorithm will run until nsamples have been collected. This means it is possible that zero samples are returned.
error_if_time_out: If error_if_time_out is true and the time limit expires, an error will be raised. If error_if_time_out is false and the time limit expires, the samples that have been collected so far will be returned. This means it is possible that zero samples are returned. Burn-in samples will not be returned. If time_limit is null, this parameter does nothing.
consistent_with: The assignment that all samples must be consistent with (i.e., Assignment(:A=>1) means all samples must have :A=1). Use to sample conditional distributions.
max_cache_size: If null, cache as much as possible; otherwise cache at most max_cache_size distributions.
variable_order: Determines the order in which variables are changed when generating a new sample. If null, use a random order for every sample (this is different from updating the variables at random). Otherwise it should be a list containing all the variables in the order they should be updated.
initial_sample: The initial assignment of variables to use. If null, the initial sample is chosen by briefly using a LikelihoodWeightedSampler.
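A hedged usage sketch: the keyword constructor below is an assumption based on the parameter list above, and rand(bn, sampler, nsamples) follows the BayesNetSampler interface documented earlier. rand_discrete_bn is documented later in this section.
using BayesNets
bn = BayesNets.rand_discrete_bn(5, 2)
evidence = Assignment(first(names(bn)) => 1)
# Hypothetical keyword construction using the parameter names listed above.
sampler = GibbsSampler(burn_in = 500, thinning = 1, consistent_with = evidence)
df = rand(bn, sampler, 1000)   # DataFrame of samples consistent with the evidence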
BayesNets.GibbsSamplerState
— TypeUsed to cache various things the Gibbs sampler needs
BayesNets.GibbsSamplingFull
— Typeinfer(im, inf)
Run Gibbs sampling for N iterations. Each iteration changes all nodes. Discards the first burn_in samples and keeps only every thin-th sample. For example, if thin=3, the first two samples in each group of three are discarded and the third is kept.
BayesNets.GibbsSamplingNodewise
— Typeinfer(GibbsSampling, state::Assignment, InferenceState)
Run Gibbs sampling for N iterations. Each iteration changes one node.
Discards the first burn_in samples and keeps only every thin-th sample. For example, if thin=3, the first two samples in each group of three are discarded and the third is kept.
BayesNets.K2GraphSearch
— TypeK2GraphSearch
A GraphSearchStrategy following the K2 algorithm. Takes polynomial time to find the optimal structure assuming a topological variable ordering.
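A hedged sketch of structure learning with K2. The K2GraphSearch constructor arguments (a topological variable ordering plus a CPD type) are assumptions; fitting follows the fit(::Type{BayesNet}, ::DataFrame, ::GraphSearchStrategy) method documented later in this section.
using BayesNets
using DataFrames
# Integer-valued columns, one per variable, with values in 1:Nᵢ
data = DataFrame(a = rand(1:2, 100), b = rand(1:3, 100), c = rand(1:2, 100))
strategy = K2GraphSearch([:a, :b, :c], DiscreteCPD)   # assumed constructor: ordering + CPD type
bn = fit(DiscreteBayesNet, data, strategy)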
BayesNets.LikelihoodWeightedSampler
— TypeLikelihood Weighted Sampling
BayesNets.LikelihoodWeightingInference
— TypeApproximates p(query|evidence) with N weighted samples using likelihood weighted sampling
BayesNets.LoopyBelief
— TypeLoopy belief propagation for a network.
Early stopping if the change in messages is < tol for iters_for_convergence iterations. For no early stopping, use tol < 0.
BayesNets.NegativeBayesianInformationCriterion
— TypeNegativeBayesianInformationCriterion
A ScoringFunction for the negative Bayesian information criterion.
BIC = -2⋅L + k⋅ln(n)
L - the log likelihood of the data under the cpd
k - the number of free parameters to be estimated
n - the sample size
BayesNets.RejectionSampler
— TypeRejection sampling, in which the assignments are forced to be consistent with the provided values. Each sample is attempted at most max_nsamples times before an empty assignment is returned.
BayesNets.ScoreComponentCache
— TypeScoreComponentCache
Used to store scores in a priority queue such that graph search algorithms know when a particular construction has already been made.
BayesNets.ScoreComponentCache
— MethodScoreComponentCache(data::DataFrame)
Construct an empty ScoreComponentCache the size of ncol(data)
BayesNets.ScoringFunction
— TypeScoringFunction
An abstract type for which subtypes allow extracting CPD score components, which are to be maximized: score_component(::ScoringFunction, cpd::CPD, data::DataFrame)
BayesNets.UniformPrior
— TypeA uniform Dirichlet prior such that all α are the same
Defaults to the popular K2 prior, α = 1, which is similar to Laplace Smoothing
https://en.wikipedia.org/wiki/Additive_smoothing
Base.:*
— MethodTable multiplication
Base.Broadcast.broadcast!
— Methodbroadcast!(f, ϕ, dims, values)
Broadcast a vector (or array of vectors) across the dimension(s) dims. Each vector in values will be broadcast across its respective dimension in dims.
See Base.broadcast for more info.
Base.Broadcast.broadcast
— Methodbroadcast(f, ϕ, dims, values)
Broadcast a vector (or array of vectors) across the dimension(s) dims. Each vector in values will be broadcast across its respective dimension in dims.
See Base.broadcast for more info.
Base.Sort.partialsort
— MethodGiven a Table, extract the rows which match the given assignment
Base.convert
— MethodConvert a Factor to a DataFrame
Base.convert
— Methodconvert(DiscreteCPD, cpd)
Construct a Factor from a DiscreteCPD.
Base.count
— MethodBase.count(bn::BayesNet, name::NodeName, data::DataFrame)
returns a table containing all observed assignments and their corresponding counts
Base.delete!
— Methoddelete!(bn::BayesNet, target::NodeName)
Removing CPDs will alter the vertex indices. In particular, removing the ith CPD will swap i and n and then remove n.
Base.eltype
— MethodReturns Float64
Base.getindex
— Methodgetindex(ϕ, a)
Get values with dimensions consistent with an assignment. Colons select entire dimension.
Base.in
— Methodin(dim, ϕ) -> Bool
Return true if dim is in the Factor ϕ
Base.indexin
— Methodindexin(dims, ϕ)
Return the index of dimension dim in ϕ, or 0 if it is not in ϕ.
Base.join
— Functionjoin(op, ϕ1, ϕ2, :outer, [v0])
join(op, ϕ1, ϕ2, :inner, [reducehow], [v0])
Performs either an inner or outer join.
An outer join returns a Factor with the union of the two dimensions. The two factors are combined with Base.broadcast(op, ...).
An inner join keeps the dimensions in common between the two Factors. The extra dimensions are reduced with reducedim(reducehow, ...) and then the two factors are combined with: op(ϕ1[commondims].potential, ϕ2[commondims].potential)
Base.length
— MethodTotal number of elements in Factor (potential)
Base.names
— MethodReturns the ordered list of NodeNames
Base.names
— MethodNames of each dimension
Base.push!
— MethodAppends a new dimension to a Factor
Base.rand
— MethodGenerates a DataFrame containing a dataset of variable assignments. Always returns a DataFrame with nsamples rows.
Base.rand
— MethodReturns an assignment sampled from the bn using the provided sampler
Base.rand
— MethodImplements Gibbs sampling. (https://en.wikipedia.org/wiki/Gibbs_sampling) For finite variables, the posterior distribution is sampled by building the exact distribution. For continuous variables, the posterior distribution is sampled using Metropolis Hastings MCMC. Discrete variables with infinite support are currently not supported. The Gibbs Sampler only supports CPDs that return Univariate Distributions. (CPD{D<:UnivariateDistribution})
Sampling requires a GibbsSampler object which contains the parameters for Gibbs sampling. See the GibbsSampler documentation for parameter details.
Base.similar
— Methodsimilar(ϕ)
Return a factor similar to ϕ with uninitialized values
Base.size
— Methodsize(ϕ, [dims...])
Returns a tuple of the dimensions of ϕ
Base.write
— Methodwrite(io, text/plain, bn)
Writes a text file containing the sufficient statistics for a discrete Bayesian network. This was inspired by the format listed in Appendix A of "Correlated Encounter Model for Cooperative Aircraft in the National Airspace System Version 1.0" by Mykel Kochenderfer.
The text file contains the following parameters:
- variable labels: A space-delimited list specifies the variable labels, which are symbols. The ordering of the variables in this list determines the ordering of the variables in the other tables. Note that the ordering of the variable labels is not necessarily topological.
- graphical structure: A binary matrix is used to represent the graphical structure of the Bayesian network. A 1 in the ith row and jth column means that there is a directed edge from the ith variable to the jth variable in the Bayesian network. The ordering of the variables is as defined in the variable labels section of the file. The entries are 0 or 1 and are not delimited.
- variable instantiations: A list of integers specifying the number of instantiations for each variable. The list is space-delimited.
- sufficient statistics: A space-delimited list of values Pₐⱼₖ which specifies the sufficient statistics. The array is ordered first by increasing k, then increasing j, then increasing a. The variable ordering is defined in the variable labels section of the file. The list is a set of flattened matrices, where each matrix is rₐ × qₐ, with rₐ the number of instantiations of variable a and qₐ the number of instantiations of the parents of variable a. The ordering is the same as the ordering of the distributions vector in the CategoricalCPD type. The entries in Pₐⱼₖ are floating point probability values.
For example, the network Success -> Forecast with Success ∈ [1, 2], P(1) = 0.2, P(2) = 0.8, and Forecast ∈ [1, 2, 3] with P(1 | 1) = 0.4, P(2 | 1) = 0.4, P(3 | 1) = 0.2, P(1 | 2) = 0.1, P(2 | 2) = 0.3, P(3 | 2) = 0.6
is output as:
Success Forecast 01 00 2 3 2 4 4 1 3
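A hedged usage sketch of the write method above; selecting the method via a MIME("text/plain") argument is an assumption based on the write(io, text/plain, bn) signature, and rand_discrete_bn is documented later in this section.
using BayesNets
bn = BayesNets.rand_discrete_bn(4, 2)
open("bn.txt", "w") do io
    write(io, MIME("text/plain"), bn)   # assumed to dispatch to the text/plain method
end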
BayesNets.CPDs.ProbabilisticGraphicalModels.infer
— MethodApproximates p(query|evidence) with nsamples likelihood weighted samples.
Since this uses a Factor, it is only efficient if the number of samples is (significantly) greater than the number of possible instantiations for the query variables
BayesNets.CPDs.ProbabilisticGraphicalModels.is_independent
— MethodReturns whether the set of node names x is d-separated from the set y given the set given
BayesNets.CPDs.ProbabilisticGraphicalModels.markov_blanket
— MethodReturn the children, parents, and parents of children (excluding target) as a Set of NodeNames
BayesNets.CPDs.parents
— MethodReturns the parents as a list of NodeNames
BayesNets._evidence_lambda
— MethodGet the lambda-message to itself for an evidence node. If it isn't an evidence node, this will break
BayesNets._get_parent_indeces
— Methodscore_component(a::ScoringFunction, cpd::CPD, data::DataFrame, cache::ScoreComponentCache)
As score_component(ScoringFunction, cpd, data), but returns pre-computed values from the cache if they exist, and populates the cache if they don't
BayesNets._init_gibbs_sample
— Function_init_gibbs_sample(bn, evidence)
A random sample of non-evidence nodes uniformly over their domain
BayesNets.bayesian_score
— Functionbayesian_score(G::DAG, names::Vector{Symbol}, data::DataFrame[, ncategories::Vector{Int}[, prior::DirichletPrior]])
Compute the Bayesian score for graph structure g, with the data in data. names contains a symbol corresponding to each vertex in g that is the name of a column in data. ncategories is a vector of the number of values that each variable in the Bayesian network can take.
Note that every entry in data must be an integer greater than 0.
BayesNets.bayesian_score_component
— MethodComputes the Bayesian score component for the given target variable index and Dirichlet prior counts given in alpha
INPUT:
i - index of the target variable
parents - list of indices of parent variables (should not contain self)
r - list of instantiation counts accessed by variable index; r[1] gives the number of discrete states variable 1 can take on
data - matrix of sufficient statistics / counts; d[j,k] gives the number of times the target variable took on its kth instantiation given the jth parental instantiation
OUTPUT: the Bayesian score, a Float64
BayesNets.children
— MethodReturns the children as a list of NodeNames
BayesNets.duplicate
— Methodduplicate(A, dims)
Repeats an array, but only through higher dimensions dims.
Custom version of repeat with only outer repetition: the array is duplicated the number of times specified in dims for dimensions greater than ndims(A). If dims is empty, returns a copy of A.
julia> duplicate(collect(1:3), (2,))
3×2 Array{Int64,2}:
1 1
2 2
3 3
julia> duplicate([1 3; 2 4], (3,))
2×2×3 Array{Int64,3}:
[:, :, 1] =
1 3
2 4
[:, :, 2] =
1 3
2 4
[:, :, 3] =
1 3
2 4
BayesNets.eval_mb_cpd
— Methodeval_mb_cpd(node, ncategories, assignment, mb_cpds)
Return the potential of all instances of a node given its Markov blanket as a WeightVec: P(node | pa_node) * ∏_{c ∈ children} P(c | pa_c)
Tries out all possible values of node (assumes it is categorical). The assignment should have values for everything in the Markov blanket, including the variable itself.
BayesNets.get_asia_bn
— MethodAn ergodic version of the asia network, with the E variable removed
Original network: Lauritzen, Steffen L. and David J. Spiegelhalter, 1988
BayesNets.get_finite_distribution!
— MethodHelper to sample_posterior_finite!
Modifies a and gss
BayesNets.get_mb_cpds
— MethodGet the CPDs of a node and its children
BayesNets.get_sat_fail_bn
— MethodSatellite failure network from DMU, pg 17
BayesNets.get_sprinkler_bn
— MethodThe usual sprinkler problem
BayesNets.get_weighted_dataframe
— MethodA dataset of variable assignments is obtained with an additional column of weights in accordance with the likelihood of each assignment.
BayesNets.get_weighted_sample!
— MethodDraw an assignment from the Bayesian network but set any variables in the evidence accordingly. Returns the assignment and the probability weighting associated with the evidence.
BayesNets.gibbs_sample
— MethodImplements Gibbs sampling. (https://en.wikipedia.org/wiki/Gibbs_sampling) For finite variables, the posterior distribution is sampled by building the exact distribution. For continuous variables, the posterior distribution is sampled using Metropolis Hastings MCMC. Discrete variables with infinite support are currently not supported. The Gibbs Sampler only supports CPDs that return Univariate Distributions. (CPD{D<:UnivariateDistribution})
bn:: A Bayesian Network to sample from. bn should only contain CPDs that return UnivariateDistributions.
nsamples: The number of samples to return.
burnin: The first burnin samples will be discarded. They will not be returned. The thinning parameter does not affect the burn-in period. This is used to ensure that the Gibbs sampler converges to the target stationary distribution before actual samples are drawn.
thinning: For every thinning + 1 samples drawn, only the last is kept. Thinning is used to reduce autocorrelation between samples. Thinning is not used during the burn-in period. E.g., if thinning is 1, samples will be drawn in groups of two and only the second sample will be in the output.
time_limit: The number of milliseconds to run the algorithm. The algorithm will return the samples it has collected when either nsamples samples have been collected or time_limit milliseconds have passed. If time_limit is null then the algorithm will run until nsamples have been collected. This means it is possible that zero samples are returned.
error_if_time_out: If error_if_time_out is true and the time limit expires, an error will be raised. If error_if_time_out is false and the time limit expires, the samples that have been collected so far will be returned. This means it is possible that zero samples are returned. Burn-in samples will not be returned. If time_limit is null, this parameter does nothing.
consistent_with: The assignment that all samples must be consistent with (i.e., Assignment(:A=>1) means all samples must have :A=1). Use to sample conditional distributions.
max_cache_size: If null, cache as much as possible; otherwise cache at most max_cache_size distributions.
variable_order: Determines the order in which variables are changed when generating a new sample. If null, use a random order for every sample (this is different from updating the variables at random). Otherwise it should be a list containing all the variables in the order they should be updated.
initial_sample: The initial assignment of variables to use. If null, the initial sample is chosen by briefly running get_weighted_dataframe.
BayesNets.gibbs_sample_main_loop
— MethodThe main loop associated with Gibbs sampling. Returns a data frame with nsamples samples.
Supports the various parameters supported by gibbs_sample. Refer to gibbs_sample for parameter meanings.
BayesNets.ndgrid_fill!
— Method???
BayesNets.pattern
— Methodpattern(ϕ, [dims])
Return an array with the pattern of each dimension's state for all possible instances
BayesNets.rand_bn_inference
— Functionrand_bn_inference(bn, num_query=2, num_evidence=3)
Generate a random inference state for a Bayesian network, with an evidence assignment sampled uniformly over the chosen nodes' domains.
BayesNets.rand_cpd
— Functionrand_cpd(bn::DiscreteBayesNet, ncategories::Int, target::NodeName, parents::NodeNames=NodeName[])
Return a CategoricalCPD with the given number of categories with random categorical distributions
BayesNets.rand_discrete_bn
— Functionrand_discrete_bn(num_nodes=16, max_num_parents=3, max_num_states=5, connected=true)
Generate a random DiscreteBayesNet.
Creates a DiscreteBayesNet with num_nodes nodes, each having a random number of states and parents, up to max_num_states and max_num_parents, respectively. If connected, each node (except the first) is guaranteed at least one parent, making the graph connected.
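For example, using the signature above (calling with the module prefix in case the function is not exported):
using BayesNets
bn = BayesNets.rand_discrete_bn(6, 2, 3)   # 6 nodes, ≤2 parents each, ≤3 states each
names(bn)                                  # the generated NodeNames
rand(bn, 10)                               # DataFrame of 10 sampled assignments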
BayesNets.readxdsl
— Methodreadxdsl( filename::AbstractString )
Return a DiscreteBayesNet read from the xdsl file
BayesNets.reducedim
— Functionreducedim(op, ϕ, dims, [v0])
Reduce dimensions dims in ϕ using function op.
BayesNets.sample_posterior!
— Methodset a[varname] ~ P(varname | not varname)
Modifies a and caches in gss
BayesNets.sample_posterior_continuous!
— MethodImplements Metropolis-Hastings with a normal distribution proposal with mean equal to the previous value of the variable "varname" and stddev equal to 10 times the standard deviation of the distribution of the target variable given its parents ( var_distribution should be get(bn, varname)(a) )
MH will go through nsamples iterations. If no proposal is accepted, the original value will remain
This function expects that a[varname] is within the support of the distribution; it will not check to make sure this is true.
Helper to sample_posterior. Should only be used for sampling continuous distributions.
set a[varname] ~ P(varname | not varname)
Modifies a and caches in gss
BayesNets.sample_posterior_finite!
— MethodHelper to sample_posterior. Should only be called if the variable associated with varname is discrete.
set a[varname] ~ P(varname | not varname)
Modifies both a and gss
BayesNets.sample_weighted_dataframe!
— MethodChooses a sample at random from a weighted dataframe
BayesNets.score_component
— Methodscore_component(a::ScoringFunction, cpd::CPD, data::DataFrame)
Extract a Float64 score for a cpd given the data. One seeks to maximize the score.
BayesNets.score_components
— Methodscore_components(a::ScoringFunction, cpd::CPD, data::DataFrame)
score_components(a::ScoringFunction, cpds::Vector{CPD}, data::DataFrame, cache::ScoreComponentCache)
Get a list of score components for all cpds
BayesNets.statistics
— Methodstatistics(
targetind::Int,
parents::AbstractVector{Int},
ncategories::AbstractVector{Int},
data::AbstractMatrix{Int}
)
outputs a sufficient statistics table for the target variable that is q × r, where r = ncategories[i] is the number of variable instantiations and q is the number of parental instantiations of variable i
The r-values are ordered from 1 → ncategories[i]. The q-values are ordered in the same ordering as ind2sub() in Julia Base. Thus the instantiation of the first parent (by order given in parents[i]) is varied the fastest.
ex:
Variable 1 has parents 2 and 3, with r₁ = 2, r₂ = 2, r₃ = 3
q for variable 1 is q = r₂×r₃ = 6
N will be a 6×2 matrix where:
N[1,1] is the number of times v₁ = 1, v₂ = 1, v₃ = 1
N[2,1] is the number of times v₁ = 1, v₂ = 2, v₃ = 1
N[3,1] is the number of times v₁ = 1, v₂ = 1, v₃ = 2
N[4,1] is the number of times v₁ = 1, v₂ = 2, v₃ = 2
N[5,1] is the number of times v₁ = 1, v₂ = 1, v₃ = 3
N[6,1] is the number of times v₁ = 1, v₂ = 2, v₃ = 3
N[1,2] is the number of times v₁ = 2, v₂ = 1, v₃ = 1
...
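A small usage sketch of the method above, with made-up data laid out as variables × samples (the q×r orientation follows the example above):
using BayesNets
# 3 variables (rows) × 5 samples (columns); variable i takes values in 1:ncategories[i]
data = [1 2 1 2 1;
        1 1 2 2 1;
        1 3 2 1 2]
ncategories = [2, 2, 3]
# Sufficient statistics for variable 1 given parents 2 and 3: a q×r = 6×2 count matrix
N = BayesNets.statistics(1, [2, 3], ncategories, data)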
BayesNets.statistics
— Methodstatistics(
parent_list::Vector{Vector{Int}},
ncategories::AbstractVector{Int},
data::AbstractMatrix{Int},
)
Computes sufficient statistics from a discrete dataset for a Discrete Bayesian Net structure
INPUT:
parents - list of lists of parent indices. A variable with index i has ncategories[i] categories and row data[i,:]. No acyclicity checking is done.
ncategories - list of variable bin counts, or the number of discrete values the variable can take on, v ∈ {1 : ncategories[i]}
data - table of discrete values [n×m] where n is the number of nodes and m is the number of samples
OUTPUT: N :: Vector{Matrix{Int}}, a sufficient statistics table for each variable. The variable with index i has statistics table N[i], which is q × r, where r = ncategories[i] is the number of variable instantiations and q is the number of parental instantiations of variable i.
The r-values are ordered from 1 → ncategories[i]
The q-values are ordered in the same ordering as ind2sub() in Julia Base
Thus the instantiation of the first parent (by order given in parents[i])
is varied the fastest.
ex:
Variable 1 has parents 2 and 3, with r₁ = 2, r₂ = 2, r₃ = 3
q for variable 1 is q = r₂×r₃ = 6
N[1] will be a 6×2 matrix where:
N[1][1,1] is the number of times v₁ = 1, v₂ = 1, v₃ = 1
N[1][2,1] is the number of times v₁ = 1, v₂ = 2, v₃ = 1
N[1][3,1] is the number of times v₁ = 1, v₂ = 1, v₃ = 2
N[1][4,1] is the number of times v₁ = 1, v₂ = 2, v₃ = 2
N[1][5,1] is the number of times v₁ = 1, v₂ = 1, v₃ = 3
N[1][6,1] is the number of times v₁ = 1, v₂ = 2, v₃ = 3
N[1][1,2] is the number of times v₁ = 2, v₂ = 1, v₃ = 1
...
This function uses sparse matrix black magic and was mercilessly stolen from Ed Schmerling.
BayesNets.sumout
— Methodsumout(t, v)
Table marginalization
BayesNets.table
— Methodtable(bn::DiscreteBayesNet, name::NodeName)
Constructs the CPD factor associated with the given node in the BayesNet
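For example, building on other functions documented in this section (a random network is used since node names depend on the generator):
using BayesNets
bn = BayesNets.rand_discrete_bn(4, 2)   # small random DiscreteBayesNet
node = first(names(bn))                 # pick one of its nodes
table(bn, node)                         # factor table with a :potential column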
Distributions.logpdf
— MethodThe logpdf of a given assignment after conditioning on the values
Distributions.ncategories
— MethodDistributions.ncategories(bn::DiscreteBayesNet, node::Symbol)
Return the number of categories for a node in the network.
Distributions.pdf
— MethodThe pdf of a given assignment after conditioning on the values
Graphs.dst
— MethodReturns all descendants as a list of NodeNames.
Graphs.has_edge
— MethodWhether the BayesNet contains the given edge
Graphs.neighbors
— MethodReturns all neighbors as a list of NodeNames.
LinearAlgebra.normalize!
— Methodnormalize!(ϕ, dims; p=1)
normalize!(ϕ; p=1)
Normalize the factor so that all instances of dims have (or the entire factor has) a p-norm of 1
LinearAlgebra.normalize!
— MethodTable normalization. Ensures that the :potential column sums to one.
LinearAlgebra.normalize
— Methodnormalize(ϕ, dims; p=1)
normalize(ϕ; p=1)
Return a normalized copy of the factor so that all instances of dims have (or the entire factor has) a p-norm of 1
Random.rand!
— MethodOverwrites assignment with a sample from bn using the sampler
Random.rand!
— MethodNOTE: this is inefficient. Use rand(bn, GibbsSampler, nsamples) whenever you can
Random.rand!
— Methodrand!(ϕ)
Fill with random values
StatsAPI.fit
— Methodtakes a list of observations of assignments represented as a DataFrame or a set of data samples (without :potential), takes the unique assignments, and estimates the associated probability of each assignment based on its frequency of occurrence.
StatsAPI.fit
— Methodfit{C<:CPD}(::Type{BayesNet{C}}, ::DataFrame, ::GraphSearchStrategy)
Run the graph search algorithm defined by GraphSearchStrategy
StatsAPI.fit
— Methodfit(::Type{BayesNet}, data, edges)
Fit a Bayesian Net whose variables are the columns in data and whose edges are given in edges
ex: fit(DiscreteBayesNet, data, (:A=>:B, :C=>:B))
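A slightly fuller sketch of the example above, assuming all data columns are integers in 1:Nᵢ as required for a DiscreteBayesNet:
using BayesNets
using DataFrames
data = DataFrame(A = rand(1:2, 200), B = rand(1:3, 200), C = rand(1:2, 200))
bn = fit(DiscreteBayesNet, data, (:A=>:B, :C=>:B))   # edges A→B and C→B
table(bn, :B)                                        # fitted CPD factor for B given A and C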
BayesNets.CPDs.CategoricalCPD
— TypeA categorical distribution, P(x|parents(x)) where all parents are discrete integers 1:N.
The ordering of the distributions array follows the convention in Decision Making Under Uncertainty. Suppose a variable has three discrete parents. The first parental instantiation assigns all parents to their first bin. The second will assign the first parent (as defined in parents) to its second bin and the other parents to their first bin. The sequence continues until all parents are instantiated to their last bins.
This is equivalent to:
X,Y,Z
1,1,1
2,1,1
1,2,1
2,2,1
1,1,2
...
BayesNets.CPDs.ConditionalLinearGaussianCPD
— TypeA conditional linear Gaussian CPD, always returns a Normal{Float64}
This is a combination of the CategoricalCPD and the LinearGaussianCPD.
For a variable with N discrete parents and M continuous parents, it will construct a linear Gaussian distribution over the M continuous parents for each discrete instantiation.
{ Normal(μ=a₁×continuous_parents(x) + b₁, σ₁) for discrete instantiation 1
P(x|parents(x)) = { Normal(μ=a₂×continuous_parents(x) + b₂, σ₂) for discrete instantiation 2
{ ...
BayesNets.CPDs.LinearGaussianCPD
— TypeA linear Gaussian CPD, always returns a Normal
Assumes that target and all parents can be converted to Float64 (ie, are numeric)
P(x|parents(x)) = Normal(μ=a×parents(x) + b, σ)
BayesNets.CPDs.StaticCPD
— TypeA CPD for which the distribution never changes.
target: name of the CPD's variable
parents: list of parent variables
d: a Distributions.jl distribution
While a StaticCPD can have parents, their assignments will not affect the distribution.
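A minimal sketch, assuming the StaticCPD(target, d) constructor implied by the field list above:
using BayesNets
using Distributions
cpd = StaticCPD(:a, Normal(0.0, 1.0))
name(cpd)        # :a
parentless(cpd)  # true, the CPD has no parents
rand(cpd)        # draws from Normal(0, 1) regardless of any assignment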
Base.get!
— Methodget!(a::Assignment, b::Assignment)
Modify and return the assignment to contain the ith entry
Base.rand
— Methodrand(cpd::CPD)
Condition and then draw from the distribution
BayesNets.CPDs.disttype
— Methoddisttype(cpd::CPD)
Return the type of the CPD's distribution
BayesNets.CPDs.infer_number_of_instantiations
— Methodinfer_number_of_instantiations{I<:Int}(arr::AbstractVector{I})
Infer the number of instantiations, N, for a data type, assuming that it takes on the values 1:N
BayesNets.CPDs.name
— Methodname(cpd::CPD)
Return the NodeName for the variable this CPD is defined for.
BayesNets.CPDs.nparams
— Methodnparams(cpd::CPD)
Return the number of free parameters that need to be estimated for the CPD
BayesNets.CPDs.parentless
— Methodparentless(cpd::CPD)
Return whether this CPD has no parents.
BayesNets.CPDs.parents
— Methodparents(cpd::CPD)
Return the parents for this CPD as a vector of NodeName.
BayesNets.CPDs.strip_arg
— Methodstrip_arg(arg::Symbol)
Strip anything extra (type annotations, default values, etc) from an argument. For now this cannot handle keyword arguments (it will throw an error).
Distributions.logpdf
— Methodlogpdf(cpd::CPD, data::DataFrame)
Return the logpdf across the dataset
Distributions.logpdf
— Methodlogpdf(cpd::CPD)
Condition and then return the logpdf
Distributions.ncategories
— MethodDistributions.ncategories(cpd::CategoricalCPD)
Return the number of categories for a cpd.
Distributions.pdf
— Methodpdf(cpd::CPD, data::DataFrame)
Return the pdf across the dataset
Distributions.pdf
— Methodpdf(cpd::CPD)
Condition and then return the pdf
StatsAPI.fit
— Methodfit(::Type{CPD}, data::DataFrame, target::NodeName, parents::NodeNames)
Construct a CPD for target by fitting it to the provided data
BayesNets.CPDs.@required_func
— Macrorequired_func(signature)
Provide a default function implementation that throws an error when called.
BayesNets.CPDs.ProbabilisticGraphicalModels
— ModuleProvides a basic interface for defining and working with probabilistic graphical models
BayesNets.CPDs.ProbabilisticGraphicalModels.GraphSearchStrategy
— TypeGraphSearchStrategy
An abstract type which defines a graph search strategy for learning probabilistic graphical model structures. These allow: fit(::Type{ProbabilisticGraphicalModel}, data, GraphSearchStrategy)
BayesNets.CPDs.ProbabilisticGraphicalModels.InferenceMethod
— TypeAbstract type for probability inference
BayesNets.CPDs.ProbabilisticGraphicalModels.InferenceState
— TypeType for capturing the inference state
BayesNets.CPDs.ProbabilisticGraphicalModels.Sampler
— TypeAbstract type for sampling with:
Base.rand(ProbabilisticGraphicalModel, Sampler, nsamples)
Base.rand!(Assignment, ProbabilisticGraphicalModel, Sampler)
Base.rand(ProbabilisticGraphicalModel, Sampler)
Base.length
— Methodlength(PGM)
Returns the number of variables in the probabilistic graphical model
Base.names
— Methodnames(PGM)
Returns a list of NodeNames
Base.rand
— MethodGenerates a DataFrame containing a dataset of variable assignments. Always returns a DataFrame with nsamples rows.
Base.rand
— MethodReturns a new Assignment sampled from the PGM using the provided sampler
BayesNets.CPDs.ProbabilisticGraphicalModels.consistent
— Methodconsistent(a::Assignment, b::Assignment)
True if all shared NodeNames have the same value
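For example (assuming consistent is exported when BayesNets is loaded):
using BayesNets
a = Assignment(:a => 1, :b => 2)
b = Assignment(:a => 1, :c => 3)
c = Assignment(:a => 2)
consistent(a, b)   # true:  the only shared name, :a, has the same value
consistent(a, c)   # false: :a differs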
BayesNets.CPDs.ProbabilisticGraphicalModels.infer
— Methodinfer(InferenceMethod, InferenceState)
Infer p(query|evidence)
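A hedged sketch: the InferenceState type is documented above but its constructor is not shown here, so InferenceState(pgm, query, evidence) is an assumption, as is the zero-argument LikelihoodWeightingInference() constructor.
using BayesNets
bn = BayesNets.rand_discrete_bn(4, 2)
query    = first(names(bn))
evidence = Assignment(last(names(bn)) => 1)
ϕ = infer(LikelihoodWeightingInference(), InferenceState(bn, query, evidence))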
BayesNets.CPDs.ProbabilisticGraphicalModels.is_independent
— Methodis_independent(PGM, x::NodeNames, y::NodeNames, given::NodeNames)
Returns whether the set of node names x is d-separated from the set y given the set given
BayesNets.CPDs.ProbabilisticGraphicalModels.markov_blanket
— Methodmarkov_blanket(PGM) Returns the list of NodeNames forming the Markov blanket for the PGM
BayesNets.CPDs.ProbabilisticGraphicalModels.nodenames
— Methodnodenames(a::Assignment)
Return a vector of NodeNames (aka symbols) for the assignment
Distributions.logpdf
— MethodThe logpdf of a set of assignments after conditioning on the values
Distributions.logpdf
— MethodThe logpdf of a given assignment after conditioning on the values
Distributions.pdf
— MethodThe pdf of a set of assignments after conditioning on the values
Distributions.pdf
— MethodThe pdf of a given assignment after conditioning on the values
Random.rand!
— MethodOverwrites Assignment with a sample from the PGM using the given Sampler
StatsAPI.fit
— Methodfit(::Type{ProbabilisticGraphicalModel}, data::DataFrame, params::GraphSearchStrategy)
Runs the graph search algorithm to learn a probabilistic graphical model of the provided type from data.
Tables.AbstractColumns
— TypeTables.AbstractColumns
An interface type defined as an ordered set of columns that support retrieval of individual columns by name or index. A retrieved column must be a 1-based indexable collection with known length, i.e. an object that supports length(col)
and col[i]
for any i = 1:length(col)
. Tables.columns
must return an object that satisfies the Tables.AbstractColumns
interface. While Tables.AbstractColumns
is an abstract type that custom "columns" types may subtype for useful default behavior (indexing, iteration, property-access, etc.), users should not use it for dispatch, as Tables.jl interface objects are not required to subtype, but only implement the required interface methods.
Interface definition:
Required Methods | Default Definition | Brief Description |
---|---|---|
Tables.getcolumn(table, i::Int) | getfield(table, i) | Retrieve a column by index |
Tables.getcolumn(table, nm::Symbol) | getproperty(table, nm) | Retrieve a column by name |
Tables.columnnames(table) | propertynames(table) | Return column names for a table as a 1-based indexable collection |
Optional methods | ||
Tables.getcolumn(table, ::Type{T}, i::Int, nm::Symbol) | Tables.getcolumn(table, nm) | Given a column eltype T , index i , and column name nm , retrieve the column. Provides a type-stable or even constant-prop-able mechanism for efficiency. |
Note that subtypes of Tables.AbstractColumns
must overload all required methods listed above instead of relying on these methods' default definitions.
While types aren't required to subtype Tables.AbstractColumns
, benefits of doing so include:
- Indexing interface defined (using getcolumn); i.e. tbl[i] will retrieve the column at index i
- Property access interface defined (using columnnames and getcolumn); i.e. tbl.col1 will retrieve the column named col1
- Iteration interface defined; i.e. for col in table will iterate each column in the table
- AbstractDict methods defined (get, haskey, etc.) for checking and retrieving columns
- A default show method
This allows a custom table type to behave as close as possible to a builtin NamedTuple
of vectors object.
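As a sketch of how the required methods fit together, here is a minimal column-oriented table type (the struct and its fields are illustrative, not part of Tables.jl):
using Tables
# A toy column table: parallel vectors of column names and column data.
struct MyColumns
    names::Vector{Symbol}
    cols::Vector{Vector{Float64}}
end
Tables.istable(::Type{MyColumns}) = true
Tables.columnaccess(::Type{MyColumns}) = true
Tables.columns(t::MyColumns) = t                      # already column-accessible
Tables.columnnames(t::MyColumns) = t.names
Tables.getcolumn(t::MyColumns, i::Int) = t.cols[i]
Tables.getcolumn(t::MyColumns, nm::Symbol) = t.cols[findfirst(==(nm), t.names)]
t = MyColumns([:x, :y], [[1.0, 2.0], [3.0, 4.0]])
Tables.columntable(t)    # (x = [1.0, 2.0], y = [3.0, 4.0])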
Tables.AbstractRow
— TypeTables.AbstractRow
Abstract interface type representing the expected eltype
of the iterator returned from Tables.rows(table)
. Tables.rows
must return an iterator of elements that satisfy the Tables.AbstractRow
interface. While Tables.AbstractRow
is an abstract type that custom "row" types may subtype for useful default behavior (indexing, iteration, property-access, etc.), users should not use it for dispatch, as Tables.jl interface objects are not required to subtype, but only implement the required interface methods.
Interface definition:
Required Methods | Default Definition | Brief Description |
---|---|---|
Tables.getcolumn(row, i::Int) | getfield(row, i) | Retrieve a column value by index |
Tables.getcolumn(row, nm::Symbol) | getproperty(row, nm) | Retrieve a column value by name |
Tables.columnnames(row) | propertynames(row) | Return column names for a row as a 1-based indexable collection |
Optional methods | ||
Tables.getcolumn(row, ::Type{T}, i::Int, nm::Symbol) | Tables.getcolumn(row, nm) | Given a column element type T , index i , and column name nm , retrieve the column value. Provides a type-stable or even constant-prop-able mechanism for efficiency. |
Note that subtypes of Tables.AbstractRow
must overload all required methods listed above instead of relying on these methods' default definitions.
While custom row types aren't required to subtype Tables.AbstractRow
, benefits of doing so include:
- Indexing interface defined (using getcolumn); i.e. row[i] will return the column value at index i
- Property access interface defined (using columnnames and getcolumn); i.e. row.col1 will retrieve the value for the column named col1
- Iteration interface defined; i.e. for x in row will iterate each column value in the row
- AbstractDict methods defined (get, haskey, etc.) for checking and retrieving column values
- A default show method
This allows the custom row type to behave as close as possible to a builtin NamedTuple
object.
Tables.ByRow
— TypeByRow <: Function
ByRow(f)
returns a function which applies function f
to each element in a vector.
ByRow(f)
can be passed two types of arguments:
- One or more 1-based AbstractVectors of equal length: in this case the returned value is a vector resulting from applying f to elements of the passed vectors element-wise. Function f is called exactly once for each element of the passed vectors (as opposed to map, which assumes for some types of source vectors, e.g. SparseVector, that the wrapped function is pure, and may call the function f only once for multiple equal values).
- A Tables.ColumnTable holding 1-based columns of equal length: in this case the function f is passed a NamedTuple created for each row of the passed table.
The return value of ByRow(f) is always a vector.
ByRow expects that at least one argument is passed to it and, in the case of a Tables.ColumnTable, that the passed table has at least one column. In some contexts of operations on tables (for example DataFrame) the user might want to pass no arguments (or an empty Tables.ColumnTable) to ByRow. This case must be separately handled by the code implementing the logic of processing the ByRow operation on this specific parent table (the reason is that passing such arguments to ByRow does not allow it to determine the number of rows of the source table).
Examples
julia> Tables.ByRow(x -> x^2)(1:3)
3-element Vector{Int64}:
1
4
9
julia> Tables.ByRow((x, y) -> x*y)(1:3, 2:4)
3-element Vector{Int64}:
2
6
12
julia> Tables.ByRow(x -> x.a)((a=1:2, b=3:4))
2-element Vector{Int64}:
1
2
julia> Tables.ByRow(x -> (a=x.a*2, b=sin(x.b), c=x.c))((a=[1, 2, 3],
b=[1.2, 3.4, 5.6],
c=["a", "b", "c"]))
3-element Vector{NamedTuple{(:a, :b, :c), Tuple{Int64, Float64, String}}}:
(a = 2, b = 0.9320390859672263, c = "a")
(a = 4, b = -0.2555411020268312, c = "b")
(a = 6, b = -0.6312666378723216, c = "c")
Tables.Columns
— TypeTables.Columns(tbl)
Convenience type that calls Tables.columns
on an input tbl
and wraps the resulting AbstractColumns
interface object in a dedicated struct to provide useful default behaviors (allows any AbstractColumns
to be used like a NamedTuple
of Vectors
):
- Indexing interface defined; i.e. row[i] will return the column at index i, row[nm] will return the column for column name nm
- Property access interface defined; i.e. row.col1 will retrieve the value for the column named col1
- Iteration interface defined; i.e. for x in row will iterate each column in the row
- AbstractDict methods defined (get, haskey, etc.) for checking and retrieving columns
Note that Tables.Columns
calls Tables.columns
internally on the provided table argument. Tables.Columns
can be used for dispatch if needed.
Tables.CopiedColumns
— TypeTables.CopiedColumns
For some sinks, there's a concern about whether they can safely "own" columns from the input. If mutation will be allowed, to be safe, they should always copy input columns, to avoid unintended mutation to the original source. When we've called buildcolumns
, however, Tables.jl essentially built/owns the columns, and it's happy to pass ownership to the sink. Thus, any built columns will be wrapped in a CopiedColumns
struct to signal to the sink that essentially "a copy has already been made" and they're safe to assume ownership.
Tables.LazyTable
— TypeTables.LazyTable(f, arg)
A "table" type that delays materialization until Tables.columns
or Tables.rows
is called. This allows, for example, sending a LazyTable
to a remote process or thread which can then call Tables.columns
or Tables.rows
to "materialize" the table. Is used by default in Tables.partitioner(f, itr)
where a materializer function f
is passed to each element of an iterable itr
, allowing distributed/concurrent patterns like:
for tbl in Tables.partitions(Tables.partitioner(CSV.File, list_of_csv_files))
    Threads.@spawn begin
        cols = Tables.columns(tbl)
        # do stuff with cols
    end
end
In this example, CSV.File
will be called like CSV.File(x)
for each element of the list_of_csv_files
iterable, but not until Tables.columns(tbl)
is called, which in this case happens in a thread-spawned task, allowing files to be parsed and processed in parallel.
Tables.Row
— TypeTables.Row(row)
Convenience type to wrap any AbstractRow
interface object in a dedicated struct to provide useful default behaviors (allows any AbstractRow
to be used like a NamedTuple
):
- Indexing interface defined; i.e. row[i] will return the column value at index i, row[nm] will return the column value for column name nm
- Property access interface defined; i.e. row.col1 will retrieve the value for the column named col1
- Iteration interface defined; i.e. for x in row will iterate each column value in the row
- AbstractDict methods defined (get, haskey, etc.) for checking and retrieving column values
Tables.Schema
— TypeTables.Schema(names, types)
Create a Tables.Schema
object that holds the column names and types for an AbstractRow
iterator returned from Tables.rows
or an AbstractColumns
object returned from Tables.columns
. Tables.Schema
is dual-purposed: provide an easy interface for users to query these properties, as well as provide a convenient "structural" type for code generation.
To get a table's schema, one can call Tables.schema
on the result of Tables.rows
or Tables.columns
, but also note that a table may return nothing
, indicating that its column names and/or column element types are unknown (usually not inferable). This is similar to the Base.EltypeUnknown()
trait for iterators when Base.IteratorEltype
is called. Users should account for the Tables.schema(tbl) => nothing
case by using the properties of the results of Tables.rows(x)
and Tables.columns(x)
directly.
To access the names, one can simply call sch.names
to return a collection of Symbols (Tuple
or Vector
). To access column element types, one can similarly call sch.types
, which will return a collection of types (like (Int64, Float64, String)
).
The actual type definition is
struct Schema{names, types}
storednames::Union{Nothing, Vector{Symbol}}
storedtypes::Union{Nothing, Vector{Type}}
end
Where names
is a tuple of Symbol
s or nothing
, and types
is a tuple type of types (like Tuple{Int64, Float64, String}
) or nothing
. Encoding the names & types as type parameters allows convenient use of the type in generated functions and other optimization use-cases, but users should note that when names
and/or types
are the nothing
value, the names and/or types are stored in the storednames
and storedtypes
fields. This is to account for extremely wide tables with columns in the 10s of thousands where encoding the names/types as type parameters becomes prohibitive to the compiler. So while optimizations can be written on the typed names
/types
type parameters, users should also consider handling the extremely wide tables by specializing on Tables.Schema{nothing, nothing}
.
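For example, querying the schema of a simple column table:
using Tables
tbl = (a = [1, 2, 3], b = ["x", "y", "z"])   # a NamedTuple of vectors is a valid table
sch = Tables.schema(Tables.columns(tbl))
sch.names   # (:a, :b)
sch.types   # (Int64, String)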
Tables.allocatecolumn
— MethodTables.allocatecolumn(::Type{T}, len) => returns a column type (usually `AbstractVector`) with size to hold `len` elements
Custom column types can override with an appropriate "scalar" element type that should dispatch to their column allocator. Alternatively, and more generally, custom scalars can overload DataAPI.defaultarray
to signal the default array type. In this case the signaled array type must support a constructor accepting undef
for initialization.
Tables.columnaccess
— FunctionTables.columnaccess(x) => Bool
Check whether an object has specifically defined that it implements the Tables.columns
function that does not copy table data. That is to say, Tables.columns(x)
must be done with O(1) time and space complexity when Tables.columnaccess(x) == true
. Note that Tables.columns
has generic fallbacks allowing it to produce AbstractColumns
objects, even if the input doesn't define columnaccess
. However, this generic fallback may copy the data from input table x
. Also note that just because an object defines columnaccess
doesn't mean a user should call Tables.columns
on it; Tables.rows
will also work, providing a valid AbstractRow
iterator. Hence, users should call Tables.rows
or Tables.columns
depending on what is most natural for them to consume instead of worrying about what and how the input is oriented.
It is recommended that for users implementing MyType
, they define only columnaccess(::Type{MyType})
. columnaccess(::MyType)
will then automatically delegate to this method.
Tables.columnindex
— MethodTables.columnindex(table, name::Symbol)
Return the column index (1-based) of a column by name
in a table with a known schema; returns 0 if name
doesn't exist in table
Tables.columnindex
— Methodgiven names and a Symbol name
, compute the index (1-based) of the name in names
Tables.columnnames
— FunctionTables.columnnames(::Union{AbstractColumns, AbstractRow}) => Indexable collection
Retrieves the list of column names as a 1-based indexable collection (like a Tuple
or Vector
) for a AbstractColumns
or AbstractRow
interface object. The default definition calls propertynames(x)
. The returned column names must be unique.
Tables.columns
— FunctionTables.columns(x) => AbstractColumns-compatible object
Accesses data of input table source x
by returning an AbstractColumns
-compatible object, which allows retrieving entire columns by name or index. A retrieved column is a 1-based indexable object that has a known length, i.e. supports length(col)
and col[i]
for any i = 1:length(col)
. Note that even if the input table source is row-oriented by nature, an efficient generic definition of Tables.columns
is defined in Tables.jl to build an AbstractColumns-compatible object from the input rows.
The Tables.Schema
of a AbstractColumns
object can be queried via Tables.schema(columns)
, which may return nothing
if the schema is unknown. Column names can always be queried by calling Tables.columnnames(columns)
, and individual columns can be accessed by calling Tables.getcolumn(columns, i::Int )
or Tables.getcolumn(columns, nm::Symbol)
with a column index or name, respectively.
Note that if x
is an object in which columns are stored as vectors, the check that these vectors use 1-based indexing is not performed (it should be ensured when x
is constructed).
Tables.columntable
— FunctionTables.columntable(x) => NamedTuple of AbstractVectors
Takes any input table source x
and returns a NamedTuple
of AbstractVector
s, also known as a "column table". A "column table" is a kind of default table type of sorts, since it satisfies the Tables.jl column interface naturally.
Note that if x
is an object in which columns are stored as vectors, the check that these vectors use 1-based indexing is not performed (it should be ensured when x
is constructed).
Not for use with extremely wide tables with # of columns > 67K; current fundamental compiler limits prevent constructing NamedTuple
s that large.
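For example, converting a row table (a Vector of NamedTuples) into a column table:
using Tables
rows = [(a = 1, b = "x"), (a = 2, b = "y")]
Tables.columntable(rows)   # (a = [1, 2], b = ["x", "y"])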
Tables.columntype
— MethodTables.columntype(table, name::Symbol)
Return the column element type of a column by name
in a table with a known schema; returns Union{} if name
doesn't exist in table
Tables.columntype
— Methodgiven tuple type and a Symbol name
, compute the type of the name in the tuples types
Tables.datavaluerows
— MethodTables.datavaluerows(x) => NamedTuple iterator
Takes any table input x
and returns a NamedTuple
iterator that will replace missing values with DataValue
-wrapped values; this allows any table type to satisfy the TableTraits.jl Queryverse integration interface by defining:
IteratorInterfaceExtensions.getiterator(x::MyTable) = Tables.datavaluerows(x)
Tables.dictcolumntable
— MethodTables.dictcolumntable(x) => Tables.DictColumnTable
Take any Tables.jl-compatible source x
and return a DictColumnTable
, which can be thought of as an OrderedDict
mapping column names as Symbol
s to AbstractVector
s. The order of the input table columns is preserved via the Tables.schema(::DictColumnTable)
.
For "schema-less" input tables, dictcolumntable
employs a "column unioning" behavior, as opposed to inferring the schema from the first row like Tables.columns
. This means that as rows are iterated, each value from the row is joined into an aggregate final set of columns. This is especially useful when input table rows may not include columns if the value is missing, instead of including an actual value missing
, which is common in json, for example. This results in a performance cost tracking all seen values and inferring the final unioned schemas, so it's recommended to use only when needed.
Tables.dictrowtable
— MethodTables.dictrowtable(x) => Tables.DictRowTable
Take any Tables.jl-compatible source x
and return a DictRowTable
, which can be thought of as a Vector
of OrderedDict
rows mapping column names as Symbol
s to values. The order of the input table columns is preserved via the Tables.schema(::DictRowTable)
.
For "schema-less" input tables, dictrowtable
employs a "column unioning" behavior, as opposed to inferring the schema from the first row like Tables.columns
. This means that as rows are iterated, each value from the row is joined into an aggregate final set of columns. This is especially useful when input table rows may not include columns if the value is missing, instead of including an actual value missing
, which is common in json, for example. This results in a performance cost tracking all seen values and inferring the final unioned schemas, so it's recommended to use only when the union behavior is needed.
Tables.eachcolumn
— FunctionTables.eachcolumn(f, sch::Tables.Schema{names, types}, x::Union{Tables.AbstractRow, Tables.AbstractColumns})
Tables.eachcolumn(f, sch::Tables.Schema{names, nothing}, x::Union{Tables.AbstractRow, Tables.AbstractColumns})
Takes a function f
, table schema sch
, x
, which is an object that satisfies the AbstractRow
or AbstractColumns
interfaces; it generates calls to get the value for each column (Tables.getcolumn(x, nm)
) and then calls f(val, index, name)
, where f
is the user-provided function, val
is the column value (AbstractRow
) or entire column (AbstractColumns
), index
is the column index as an Int
, and name
is the column name as a Symbol
.
An example using Tables.eachcolumn
is:
rows = Tables.rows(tbl)
sch = Tables.schema(rows)
if sch === nothing
    state = iterate(rows)
    state === nothing && return
    row, st = state
    sch = Tables.schema(Tables.columnnames(row), nothing)
    while state !== nothing
        Tables.eachcolumn(sch, row) do val, i, nm
            bind!(stmt, i, val)
        end
        state = iterate(rows, st)
        state === nothing && return
        row, st = state
    end
else
    for row in rows
        Tables.eachcolumn(sch, row) do val, i, nm
            bind!(stmt, i, val)
        end
    end
end
Note in this example we account for the input table potentially returning nothing
from Tables.schema(rows)
; in that case, we start iterating the rows, and build a partial schema using the column names from the first row sch = Tables.schema(Tables.columnnames(row), nothing)
, which is valid to pass to Tables.eachcolumn
.
Tables.getcolumn
— FunctionTables.getcolumn(::AbstractColumns, nm::Symbol) => Indexable collection with known length
Tables.getcolumn(::AbstractColumns, i::Int) => Indexable collection with known length
Tables.getcolumn(::AbstractColumns, T, i::Int, nm::Symbol) => Indexable collection with known length
Tables.getcolumn(::AbstractRow, nm::Symbol) => Column value
Tables.getcolumn(::AbstractRow, i::Int) => Column value
Tables.getcolumn(::AbstractRow, T, i::Int, nm::Symbol) => Column value
Retrieve an entire column (from AbstractColumns
) or single row column value (from an AbstractRow
) by column name (nm
), index (i
), or if desired, by column element type (T
), index (i
), and name (nm
). When called on a AbstractColumns
interface object, the returned object should be a 1-based indexable collection with known length. When called on a AbstractRow
interface object, it returns the single column value. The methods taking a single Symbol
or Int
are both required for the AbstractColumns
and AbstractRow
interfaces; the third method is optional if type stability is possible. The default definition of Tables.getcolumn(x, i::Int)
is getfield(x, i)
. The default definition of Tables.getcolumn(x, nm::Symbol)
is getproperty(x, nm)
.
Tables.isrowtable
— FunctionTables.isrowtable(x) => Bool
For convenience, some table objects that are naturally "row oriented" can define Tables.isrowtable(::Type{TableType}) = true
to simplify satisfying the Tables.jl interface. Requirements for defining isrowtable
include:
- Tables.rows(x) === x, i.e. the table object itself is a Row iterator
- If the table object is mutable, it should support:
  - push!(x, row): allow pushing a single row onto the table
  - append!(x, rows): allow appending a set of rows onto the table
- If the table object is mutable and indexable, it should support:
  - x[i] = row: allow replacing a row with another row by index
A table object that defines Tables.isrowtable
will have definitions for Tables.istable
, Tables.rowaccess
, and Tables.rows
automatically defined.
Tables.istable
— FunctionTables.istable(x) => Bool
Check if an object has specifically defined that it is a table. Note that not all valid tables will return true, since it's possible to satisfy the Tables.jl interface at "run-time", e.g. a Generator
of NamedTuple
s iterates NamedTuple
s, which satisfies the AbstractRow
interface, but there's no static way of knowing that the generator is a table.
It is recommended that for users implementing MyType
, they define only istable(::Type{MyType})
. istable(::MyType)
will then automatically delegate to this method.
istable
calls TableTraits.isiterabletable
as a fallback. This can have a considerable runtime overhead in some contexts. To avoid these and use istable
as a compile-time trait, it can be called on a type as istable(typeof(obj))
.
Tables.materializer
— FunctionTables.materializer(x) => Callable
For a table input, return the "sink" function or "materializing" function that can take a Tables.jl-compatible table input and make an instance of the table type. This enables "transform" workflows that take table inputs, apply transformations, potentially converting the table to a different form, and end with producing a table of the same type as the original input. The default materializer is Tables.columntable
, which converts any table input into a NamedTuple
of Vector
s.
It is recommended that for users implementing MyType
, they define only materializer(::Type{<:MyType})
. materializer(::MyType)
will then automatically delegate to this method.
Tables.matrix
— MethodTables.matrix(table; transpose::Bool=false)
Materialize any table source input as a new Matrix
or in the case of a MatrixTable
return the originally wrapped matrix. If the table column element types are not homogeneous, they will be promoted to a common type in the materialized Matrix
. Note that column names are ignored in the conversion. By default, input table columns will be materialized as corresponding matrix columns; passing transpose=true
will transpose the input with input columns as matrix rows or in the case of a MatrixTable
apply permutedims
to the originally wrapped matrix.
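For example:
using Tables
tbl = (a = [1, 2], b = [3.5, 4.5])
Tables.matrix(tbl)                    # 2×2 Matrix{Float64}; the Int column is promoted
Tables.matrix(tbl; transpose=true)    # input columns become matrix rows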
Tables.namedtupleiterator
— MethodTables.namedtupleiterator(x)
Pass any table input source and return a NamedTuple
iterator
Not for use with extremely wide tables with # of columns > 67K; current fundamental compiler limits prevent constructing NamedTuple
s that large.
Tables.nondatavaluerows
— MethodTables.nondatavaluerows(x)
Takes any Queryverse-compatible NamedTuple
iterator source and converts to a Tables.jl-compatible AbstractRow
iterator. Will automatically unwrap any DataValue
s, replacing NA
with missing
. Useful for translating Query.jl results back to non-DataValue
-based tables.
Tables.partitioner
— MethodTables.partitioner(f, itr)
Tables.partitioner(x)
Convenience methods to generate table iterators. The first method takes a "materializer" function f
and an iterator itr
, and will call Tables.LazyTable(f, x) for x in itr
for each iteration. This allows delaying table materialization until Tables.columns
or Tables.rows
are called on the LazyTable
object (which will call f(x)
). This allows a common desired pattern of materializing and processing a table on a remote process or thread, like:
for tbl in Tables.partitions(Tables.partitioner(CSV.File, list_of_csv_files))
    Threads.@spawn begin
        cols = Tables.columns(tbl)
        # do stuff with cols
    end
end
The second method is provided because the default behavior of Tables.partition(x)
is to treat x
as a single, non-partitioned table. This method allows users to easily wrap a Vector
or generator of tables as table partitions to pass to sink functions able to utilize Tables.partitions
.
Tables.partitions
— MethodTables.partitions(x)
Request a "table" iterator from x
. Each iterated element must be a "table" in the sense that one may call Tables.rows
or Tables.columns
to get a row-iterator or collection of columns. All iterated elements must have identical schema, so that users may call Tables.schema(first_element)
on the first iterated element and know that each subsequent iteration will match the same schema. The default definition is:
Tables.partitions(x) = (x,)
So that any input is assumed to be a single "table". This means users should feel free to call Tables.partitions
anywhere they're currently calling Tables.columns
or Tables.rows
, and get back an iterator of those instead. In other words, "sink" functions can use Tables.partitions
whether or not the user passes a partionable table, since the default is to treat a single input as a single, non-partitioned table.
Tables.partitioner(itr)
is a convenience wrapper to provide table partitions from any table iterator; this allows for easy wrapping of a Vector
or iterator of tables as valid partitions, since by default, they'd be treated as a single table.
A 2nd convenience method is provided with the definition:
Tables.partitions(x...) = x
That allows passing vararg tables and they'll be treated as separate partitions. Sink functions may allow vararg table inputs and can "splat them through" to partitions
.
For convenience, Tables.partitions(x::Iterators.PartitionIterator) = x
and Tables.partitions(x::Tables.Partitioner) = x
are defined to handle cases where user created partitioning with the Iterators.partition
or Tables.partitioner
functions.
Tables.rowaccess
— FunctionTables.rowaccess(x) => Bool
Check whether an object has specifically defined that it implements the Tables.rows
function that does not copy table data. That is to say, Tables.rows(x)
must be done with O(1) time and space complexity when Tables.rowaccess(x) == true
. Note that Tables.rows
will work on any object that iterates AbstractRow
-compatible objects, even if they don't define rowaccess
, e.g. a Generator
of NamedTuple
s. However, this generic fallback may copy the data from input table x
. Also note that just because an object defines rowaccess
doesn't mean a user should call Tables.rows
on it; Tables.columns
will also work, providing a valid AbstractColumns
object from the rows. Hence, users should call Tables.rows
or Tables.columns
depending on what is most natural for them to consume instead of worrying about what and how the input is oriented.
It is recommended that for users implementing MyType
, they define only rowaccess(::Type{MyType})
. rowaccess(::MyType)
will then automatically delegate to this method.
Tables.rowmerge
— Methodrowmerge(row, other_rows...)
rowmerge(row; fields_to_merge...)
Return a NamedTuple
by merging row
(an AbstractRow
-compliant value) with other_rows
(one or more AbstractRow
-compliant values) via Base.merge
. This function is similar to Base.merge(::NamedTuple, ::NamedTuple...)
, but accepts AbstractRow
-compliant values instead of NamedTuple
s.
A convenience method rowmerge(row; fields_to_merge...) = rowmerge(row, fields_to_merge)
is defined that enables the fields_to_merge
to be specified as keyword arguments.
Tables.rows
— FunctionTables.rows(x) => Row iterator
Accesses data of input table source x
row-by-row by returning an AbstractRow
-compatible iterator. Note that even if the input table source is column-oriented by nature, an efficient generic definition of Tables.rows
is defined in Tables.jl to return an iterator of row views into the columns of the input.
The Tables.Schema
of an AbstractRow
iterator can be queried via Tables.schema(rows)
, which may return nothing
if the schema is unknown. Column names can always be queried by calling Tables.columnnames(row)
on an individual row, and row values can be accessed by calling Tables.getcolumn(row, i::Int )
or Tables.getcolumn(row, nm::Symbol)
with a column index or name, respectively.
See also rowtable
and namedtupleiterator
.
Tables.rowtable
— FunctionTables.rowtable(x) => Vector{NamedTuple}
Take any input table source, and produce a Vector
of NamedTuple
s, also known as a "row table". A "row table" is a kind of default table type of sorts, since it satisfies the Tables.jl row interface naturally, i.e. a Vector
naturally iterates its elements, and NamedTuple
satisfies the AbstractRow
interface by default (allows indexing value by index, name, and getting all names).
For a lazy iterator over rows see rows
and namedtupleiterator
.
Not for use with extremely wide tables with # of columns > 67K; current fundamental compiler limits prevent constructing NamedTuple
s that large.
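For example, converting a column table into a row table:
using Tables
tbl = (a = [1, 2], b = ["x", "y"])
Tables.rowtable(tbl)   # [(a = 1, b = "x"), (a = 2, b = "y")]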
Tables.runlength
— Methodhelper function to calculate a run-length encoding of a tuple type
Tables.schema
— FunctionTables.schema(x) => Union{Nothing, Tables.Schema}
Attempt to retrieve the schema of the object returned by Tables.rows
or Tables.columns
. If the AbstractRow
iterator or AbstractColumns
object can't determine its schema, nothing
will be returned. Otherwise, a Tables.Schema
object is returned, with the column names and types available for use.
Tables.subset
— MethodTables.subset(x, inds; viewhint=nothing)
Return one or more rows from table x
according to the position(s) specified by inds
:
- If inds is a single non-boolean integer, return a row object.
- If inds is a vector of non-boolean integers, a vector of booleans, or a :, return a subset of the original table according to the indices. In this case, the returned type is not necessarily the same as the original table type.
If other types of inds
are passed than specified above the behavior is undefined.
The viewhint
argument tries to influence whether the returned object is a view of the original table or an independent copy:
- If viewhint=nothing (the default), then the implementation for a specific table type is free to decide whether to return a copy or a view.
- If viewhint=true, then a view is returned; if viewhint=false, a copy is returned. This applies both to returning a row and a table.
Any specialized implementation of subset
must support the viewhint=nothing
argument. Support for viewhint=true
or viewhint=false
is optional (i.e. implementations may ignore the keyword argument and return a view or a copy regardless of viewhint
value).
Tables.table
— MethodTables.table(m::AbstractVecOrMat; [header])
Wrap an AbstractVecOrMat
(Matrix
, Vector
, Adjoint
, etc.) in a MatrixTable
, which satisfies the Tables.jl interface. (An AbstractVector
is treated as a 1-column matrix.) This allows accessing the matrix via Tables.rows
and Tables.columns
. An optional keyword argument iterator header
can be passed which will be converted to a Vector{Symbol}
to be used as the column names. Note that no copy of the AbstractVecOrMat
is made.
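For example, wrapping a Matrix and reading it back through the interface:
using Tables
m = [1.0 4.0; 2.0 5.0; 3.0 6.0]
mt = Tables.table(m; header = [:x, :y])
Tables.columntable(mt)   # (x = [1.0, 2.0, 3.0], y = [4.0, 5.0, 6.0])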