API Documentation
Index
ExpFamilyPCA.EPCA
ExpFamilyPCA.Options
ExpFamilyPCA.NegativeDomain
ExpFamilyPCA.PositiveDomain
ExpFamilyPCA.compress
ExpFamilyPCA.decompress
ExpFamilyPCA.fit!
Functions
The core of the ExpFamilyPCA.jl API is the EPCA abstract type. All supported and custom EPCA specifications are subtypes of EPCA and include three methods in their EPCA interface: fit!, compress, and decompress.
ExpFamilyPCA.EPCA - Type
Supertype for exponential family principal component analysis models.
ExpFamilyPCA.fit! - Function
fit!(epca::EPCA, X::AbstractMatrix{T}; maxiter::Integer = 100, verbose::Bool = false, steps_per_print::Integer = 10) where T <: Real
Fits the EPCA model on the dataset X. Call this function after creating an EPCA object, or call it again to continue training on new data.
The default fit! may have long run times depending on the dataset size and model complexity. Consider adjusting the number of iterations (maxiter) to balance runtime against model performance, and enabling progress output (verbose) to monitor the loss.
Arguments
- epca::EPCA: The EPCA model.
- X::AbstractMatrix{T}: (n, indim) - The input training data matrix. Rows are observations. Columns are features or variables.
Keyword Arguments
- maxiter::Integer = 100: The maximum number of iterations performed during loss minimization. Defaults to 100. May converge early.
- verbose::Bool = false: A flag indicating whether to print optimization progress. If set to true, prints the loss value and iteration number at specified intervals (steps_per_print). Defaults to false.
- steps_per_print::Integer = 10: The number of iterations between printed progress updates when verbose is set to true. For example, if steps_per_print is 10, progress will be printed every 10 iterations. Defaults to 10.
Returns
- A::AbstractMatrix{T}: (n, outdim) - The compressed data.
Usage
Input:
using ExpFamilyPCA
using Random; Random.seed!(1)
# Create the model
indim = 10 # Input dimension
outdim = 5 # Output dimension
epca = BernoulliEPCA(indim, outdim)
# Generate some random training data
n = 100
X = rand(0:1, n, indim)
# Fit the model to the data
A = fit!(epca, X; maxiter=200, verbose=true, steps_per_print=50);
Output:
Iteration: 1/200 | Loss: 31.7721864082419
Iteration: 50/200 | Loss: 11.07389383509631
Iteration: 100/200 | Loss: 10.971490262772905
Iteration: 150/200 | Loss: 10.886018474442618
Iteration: 200/200 | Loss: 10.718703556787007
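The sample output above prints at iteration 1 and then at every multiple of steps_per_print. The following is a minimal sketch of that cadence in plain Julia. The helper name print_iterations is hypothetical and the rule is inferred from the sample output only, not taken from the library source.

```julia
# Hypothetical helper: lists the iterations at which progress would be printed,
# assuming output at iteration 1 and at every multiple of steps_per_print.
print_iterations(maxiter, steps_per_print) =
    [i for i in 1:maxiter if i == 1 || i % steps_per_print == 0]

println(print_iterations(200, 50))   # [1, 50, 100, 150, 200]
```

With maxiter=200 and steps_per_print=50 this reproduces the five iteration numbers shown in the output above.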
ExpFamilyPCA.compress - Function
compress(epca::EPCA, X::AbstractMatrix{T}; maxiter::Integer = 100, verbose::Bool = false, steps_per_print::Integer = 10) where T <: Real
Compresses the input data X with the EPCA model.
Arguments
- epca::EPCA: The fitted EPCA model.[1] fit! should be called before compress.
- X::AbstractMatrix{T}: (n, indim) - The input data matrix (can differ from the training data). Rows are observations. Columns are features or variables.
Keyword Arguments
- maxiter::Integer = 100: The maximum number of iterations performed during loss minimization. Defaults to 100. May converge early.
- verbose::Bool = false: A flag indicating whether to print optimization progress. If set to true, prints the loss value and iteration number at specified intervals (steps_per_print). Defaults to false.
- steps_per_print::Integer = 10: The number of iterations between printed progress updates when verbose is set to true. For example, if steps_per_print is 10, progress will be printed every 10 iterations. Defaults to 10.
Returns
- A::AbstractMatrix{T}: (n, outdim) - The compressed data.
Usage
# Generate some random test data
m = 10
Y = rand(0:1, m, indim)
# Compress the test data using the fitted model from the previous example
Y_compressed = compress(epca, Y)
ExpFamilyPCA.decompress - Function
decompress(epca::EPCA, A::AbstractMatrix{T}) where T <: Real
Decompresses the compressed matrix A with the EPCA model.
Arguments
- epca::EPCA: The fitted EPCA model.[1] fit! should be called before decompress.
- A::AbstractMatrix{T}: (n, outdim) - A compressed data matrix.
Returns
- X̂::AbstractMatrix{T}: (n, indim) - The reconstructed data matrix approximated using EPCA model parameters.
Usage
# Decompress the compressed test data from the previous example
Y_reconstructed = decompress(epca, Y_compressed)
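The three interface methods can be chained into one round trip. The sketch below uses only the calls and shapes documented in this section (BernoulliEPCA, fit!, compress, decompress). The variable names and the reduced maxiter are illustrative choices.

```julia
using ExpFamilyPCA
using Random; Random.seed!(1)

indim, outdim = 10, 5
epca = BernoulliEPCA(indim, outdim)

# Fit on training data (rows are observations, columns are features)
X = rand(0:1, 100, indim)
fit!(epca, X; maxiter = 50)

# Compress held-out data, then reconstruct it from the compressed form
m = 10
Y = rand(0:1, m, indim)
Y_compressed = compress(epca, Y)                   # documented shape: (m, outdim)
Y_reconstructed = decompress(epca, Y_compressed)   # documented shape: (m, indim)

@show size(Y_compressed)
@show size(Y_reconstructed)
```

Note that decompress takes the compressed matrix, not the original data, so its input has outdim columns.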
Options
ExpFamilyPCA.Options - Type
Options(; metaprogramming::Bool = true, μ::Real = 1, ϵ::Real = eps(), A_init_value::Real = 1.0, A_lower::Union{Real, Nothing} = nothing, A_upper::Union{Real, Nothing} = nothing, A_use_sobol::Bool = false, V_init_value::Real = 1.0, V_lower::Union{Real, Nothing} = nothing, V_upper::Union{Real, Nothing} = nothing, V_use_sobol::Bool = false, low = -1e10, high = 1e10, tol = 1e-10, maxiter = 1e6)
Defines a struct Options for configuring various parameters used in optimization and calculus. It provides flexible defaults for metaprogramming, initialization values, optimization boundaries, and binary search controls.
Fields
- metaprogramming::Bool: Enables metaprogramming for symbolic calculus conversions. Default is true.
- μ::Real: A regularization hyperparameter. Default is 1.
- ϵ::Real: A regularization hyperparameter. Default is eps().
- A_init_value::Real: Initial value for parameter A. Default is 1.0.
- A_lower::Union{Real, Nothing}: Lower bound for A, or nothing. Default is nothing.
- A_upper::Union{Real, Nothing}: Upper bound for A, or nothing. Default is nothing.
- A_use_sobol::Bool: Use Sobol sequences for initializing A. Default is false.
- V_init_value::Real: Initial value for parameter V. Default is 1.0.
- V_lower::Union{Real, Nothing}: Lower bound for V, or nothing. Default is nothing.
- V_upper::Union{Real, Nothing}: Upper bound for V, or nothing. Default is nothing.
- V_use_sobol::Bool: Use Sobol sequences for initializing V. Default is false.
- low::Real: Lower bound for binary search. Default is -1e10.
- high::Real: Upper bound for binary search. Default is 1e10.
- tol::Real: Tolerance for stopping binary search. Default is 1e-10.
- maxiter::Real: Maximum iterations for binary search. Default is 1e6.
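The low, high, tol, and maxiter fields are described above as binary search controls. As a rough illustration of how four such parameters usually interact, here is a generic bisection sketch in plain Julia. It is not the library's implementation: the function name bisect, the assumption that f is increasing, and the stopping rule on the function value are all choices made for this example.

```julia
# Generic bisection (illustrative only, not ExpFamilyPCA's code):
# finds x in [low, high] with f(x) ≈ target, assuming f is increasing.
function bisect(f, target; low = -1e10, high = 1e10, tol = 1e-10, maxiter = 1_000_000)
    for _ in 1:maxiter
        mid = (low + high) / 2
        value = f(mid)
        if abs(value - target) < tol
            return mid          # close enough: stop early
        elseif value < target
            low = mid           # solution lies in the upper half
        else
            high = mid          # solution lies in the lower half
        end
    end
    return (low + high) / 2     # iteration budget exhausted
end

x = bisect(x -> x^3, 8.0)       # solve x^3 = 8
println(x)
```

A wide [low, high] bracket costs only a few dozen extra halvings, which is why very large default bounds are affordable in a search of this kind.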
The metaprogramming flag controls whether metaprogramming is used during symbolic differentiation conversion. Conversion between Symbolics.jl atoms and base Julia can occur without metaprogramming, but that path is slower and requires more calls. The flag is provided for users who specifically want to avoid metaprogramming in their pipeline.
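A short sketch of building an Options value with a few overrides follows. The keyword names come from the signature above. The name is module-qualified because this section does not say whether Options is exported, and reading fields with dot syntax assumes Options is an ordinary Julia struct. The chosen override values are arbitrary.

```julia
using ExpFamilyPCA

# Override a few documented defaults; all other fields keep their defaults.
opts = ExpFamilyPCA.Options(;
    A_init_value = 0.5,    # start A at a smaller value
    A_lower = 1e-4,        # bound A below by a small positive value
    V_use_sobol = true,    # initialize V with a Sobol sequence
)

println(opts.A_init_value)      # overridden field
println(opts.tol)               # untouched field keeps its documented default
println(opts.metaprogramming)   # untouched field keeps its documented default
```

How an Options value is handed to a model constructor is not covered in this section, so that step is left out here.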
ExpFamilyPCA.NegativeDomain - Function
NegativeDomain(; metaprogramming::Bool = true, μ::Real = 1, ϵ::Real = eps(), low::Real = -1e10, high::Real = 1e10, tol::Real = 1e-10, maxiter::Real = 1e6)
Returns an instance of Options configured for optimization over the negative domain. Sets defaults for A and V parameters while keeping the remaining settings from Options.
Specific Settings
- A_init_value = -1: Initializes A with a negative value.
- A_upper = -1e-4: Upper bound for A is constrained to a small negative value.
- V_init_value = 1: Initializes V with a positive value.
- V_lower = 1e-4: Lower bound for V is constrained to a small positive value.
Other fields inherit from the Options struct.
ExpFamilyPCA.PositiveDomain - Function
PositiveDomain(; metaprogramming::Bool = true, μ::Real = 1, ϵ::Real = eps(), low::Real = -1e10, high::Real = 1e10, tol::Real = 1e-10, maxiter::Real = 1e6)
Returns an instance of Options configured for optimization over the positive domain. Sets defaults for A and V parameters while keeping the remaining settings from Options.
Specific Settings
- A_init_value = 1: Initializes A with a positive value.
- A_lower = 1e-4: Lower bound for A is constrained to a small positive value.
- V_init_value = 1: Initializes V with a positive value.
- V_lower = 1e-4: Lower bound for V is constrained to a small positive value.
Other fields inherit from the Options struct.
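The two domain presets differ only in how they initialize and bound A. The sketch below constructs both and prints the relevant fields, relying on the settings listed above. Names are module-qualified because this section does not say whether they are exported, and dot-syntax field access assumes Options is an ordinary Julia struct.

```julia
using ExpFamilyPCA

neg = ExpFamilyPCA.NegativeDomain()
pos = ExpFamilyPCA.PositiveDomain(; tol = 1e-8)   # documented keywords can still be overridden

# Compare the fields that the presets set differently
for (name, o) in (("NegativeDomain", neg), ("PositiveDomain", pos))
    println(name, ": A_init_value=", o.A_init_value,
            ", A_lower=", o.A_lower,
            ", A_upper=", o.A_upper,
            ", V_lower=", o.V_lower)
end
```

Both presets keep V positive. Only the sign of the initial value for A and the side on which A is bounded change.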
[1] If compress is called before fit!, X will be compressed using unfitted starting weights.