API Documentation

Functions

The core of the ExpFamilyPCA.jl API is the EPCA abstract type. Every supported or custom EPCA specification is a subtype of EPCA and implements the three methods of the EPCA interface: fit!, compress, and decompress.

ExpFamilyPCA.fit! (Function)
fit!(epca::EPCA, X::AbstractMatrix{T}; maxiter::Integer = 100, verbose::Bool = false, steps_per_print::Integer = 10) where T <: Real

Fits the EPCA model on the dataset X. Call this function after creating an EPCA struct, or call it again to continue training on more data.

Warning

The default fit! may have long run times depending on the dataset size and model complexity. Consider lowering the iteration cap (maxiter) to bound the runtime, and enabling progress output (verbose) to monitor whether the loss is still improving.

Arguments

  • epca::EPCA: The EPCA model.
  • X::AbstractMatrix{T}: (n, indim) - The input training data matrix. Rows are observations. Columns are features or variables.

Keyword Arguments

  • maxiter::Integer = 100: The maximum number of iterations performed during loss minimization. Defaults to 100. May converge early.
  • verbose::Bool = false: A flag indicating whether to print optimization progress. If set to true, prints the loss value and iteration number at specified intervals (steps_per_print). Defaults to false.
  • steps_per_print::Integer = 10: The number of iterations between printed progress updates when verbose is set to true. For example, if steps_per_print is 10, progress will be printed every 10 iterations. Defaults to 10.
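
The print cadence can be sketched in plain Julia. The rule below (print at iteration 1, then at every multiple of steps_per_print) is inferred from the sample output in the Usage section, not taken from the package source, and should_print is a hypothetical helper introduced only for illustration.

```julia
# Hypothetical helper (not part of ExpFamilyPCA.jl): returns true when a
# progress line would be printed at iteration `i` under the assumed rule.
should_print(i::Integer, steps_per_print::Integer) =
    i == 1 || i % steps_per_print == 0

maxiter = 200
steps_per_print = 50

# Collect the iterations at which progress would be reported
printed = [i for i in 1:maxiter if should_print(i, steps_per_print)]
println(printed)  # [1, 50, 100, 150, 200]
```

These are the same iteration numbers that appear in the sample output for maxiter=200 and steps_per_print=50.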

Returns

  • A::AbstractMatrix{T}: (n, outdim) - The compressed data.

Usage

Input:

using ExpFamilyPCA
using Random; Random.seed!(1)

# Create the model
indim = 10  # Input dimension
outdim = 5  # Output dimension
epca = BernoulliEPCA(indim, outdim)

# Generate some random training data
n = 100
X = rand(0:1, n, indim)

# Fit the model to the data
A = fit!(epca, X; maxiter=200, verbose=true, steps_per_print=50);

Output:

Iteration: 1/200 | Loss: 31.7721864082419
Iteration: 50/200 | Loss: 11.07389383509631
Iteration: 100/200 | Loss: 10.971490262772905
Iteration: 150/200 | Loss: 10.886018474442618
Iteration: 200/200 | Loss: 10.718703556787007

ExpFamilyPCA.compress (Function)
compress(epca::EPCA, X::AbstractMatrix{T}; maxiter::Integer = 100, verbose::Bool = false, steps_per_print::Integer = 10) where T <: Real

Compresses the input data X with the EPCA model.

Arguments

  • epca::EPCA: The fitted EPCA model.[1] fit! should be called before compress.
  • X::AbstractMatrix{T}: (n, indim) - The input data matrix (can differ from the training data). Rows are observations. Columns are features or variables.

Keyword Arguments

  • maxiter::Integer = 100: The maximum number of iterations performed during loss minimization. Defaults to 100. May converge early.
  • verbose::Bool = false: A flag indicating whether to print optimization progress. If set to true, prints the loss value and iteration number at specified intervals (steps_per_print). Defaults to false.
  • steps_per_print::Integer = 10: The number of iterations between printed progress updates when verbose is set to true. For example, if steps_per_print is 10, progress will be printed every 10 iterations. Defaults to 10.

Returns

  • A::AbstractMatrix{T}: (n, outdim) - The compressed data.

Usage

# Generate some random test data
m = 10
Y = rand(0:1, m, indim)

# Compress the test data using the fitted model from the previous example
Y_compressed = compress(epca, Y)

ExpFamilyPCA.decompress (Function)
decompress(epca::EPCA, A::AbstractMatrix{T}) where T <: Real

Decompress the compressed matrix A with the EPCA model.

Arguments

  • epca::EPCA: The fitted EPCA model.[1] fit! should be called before decompress.
  • A::AbstractMatrix{T}: (n, outdim) - A compressed data matrix.

Returns

  • X̂::AbstractMatrix{T}: (n, indim) - The reconstructed data matrix approximated using EPCA model parameters.

Usage

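# Reconstruct the test data from its compressed representation (see the compress example)
Y_reconstructed = decompress(epca, Y_compressed)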
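
For reference, the three interface methods can be chained into one self-contained round trip. This is a minimal sketch that uses only the calls documented on this page. The mean absolute error at the end is an illustrative metric chosen here, not something the package prescribes.

```julia
using ExpFamilyPCA
using Random; Random.seed!(1)
using Statistics: mean

indim, outdim = 10, 5
epca = BernoulliEPCA(indim, outdim)

# Fit on training data
X = rand(0:1, 100, indim)
fit!(epca, X; maxiter=50)

# Compress held-out data, then map it back to the input space
Y = rand(0:1, 10, indim)
Y_compressed = compress(epca, Y)                  # (10, outdim)
Y_reconstructed = decompress(epca, Y_compressed)  # (10, indim)

# Shapes follow the (n, outdim) and (n, indim) conventions documented above
@assert size(Y_compressed) == (10, outdim)
@assert size(Y_reconstructed) == size(Y)

# Illustrative metric (an assumption of this sketch): mean absolute error
mae = mean(abs.(Y .- Y_reconstructed))
println("mean absolute reconstruction error: ", mae)
```

Note that decompress takes the compressed matrix (n, outdim), not the original data matrix.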

Options

ExpFamilyPCA.Options (Type)
Options(; metaprogramming::Bool = true, μ::Real = 1, ϵ::Real = eps(), A_init_value::Real = 1.0, A_lower::Union{Real, Nothing} = nothing, A_upper::Union{Real, Nothing} = nothing, A_use_sobol::Bool = false, V_init_value::Real = 1.0, V_lower::Union{Real, Nothing} = nothing, V_upper::Union{Real, Nothing} = nothing, V_use_sobol::Bool = false, low = -1e10, high = 1e10, tol = 1e-10, maxiter = 1e6)

Options is a struct that configures the optimization and symbolic calculus routines. It provides defaults for metaprogramming, initialization values, optimization bounds, and binary search controls.

Fields

  • metaprogramming::Bool: Enables metaprogramming for symbolic calculus conversions. Default is true.
  • μ::Real: A regularization hyperparameter. Default is 1.
  • ϵ::Real: A regularization hyperparameter. Default is eps().
  • A_init_value::Real: Initial value for parameter A. Default is 1.0.
  • A_lower::Union{Real, Nothing}: Lower bound for A, or nothing. Default is nothing.
  • A_upper::Union{Real, Nothing}: Upper bound for A, or nothing. Default is nothing.
  • A_use_sobol::Bool: Use Sobol sequences for initializing A. Default is false.
  • V_init_value::Real: Initial value for parameter V. Default is 1.0.
  • V_lower::Union{Real, Nothing}: Lower bound for V, or nothing. Default is nothing.
  • V_upper::Union{Real, Nothing}: Upper bound for V, or nothing. Default is nothing.
  • V_use_sobol::Bool: Use Sobol sequences for initializing V. Default is false.
  • low::Real: Lower bound for binary search. Default is -1e10.
  • high::Real: Upper bound for binary search. Default is 1e10.
  • tol::Real: Tolerance for stopping binary search. Default is 1e-10.
  • maxiter::Real: Maximum iterations for binary search. Default is 1e6.
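
To show how bounds, a tolerance, and an iteration cap typically interact, here is a generic bisection routine in plain Julia. It is only a sketch: the function bisect is hypothetical, its keyword names merely mirror the four binary search fields above, and it is not the package's internal routine, which may differ.

```julia
# Generic bisection root finder for a function `f` that changes sign on
# [low, high]. Hypothetical illustration, NOT ExpFamilyPCA.jl's implementation.
function bisect(f; low::Real = -1e10, high::Real = 1e10,
                tol::Real = 1e-10, maxiter::Real = 1e6)
    lo, hi = float(low), float(high)
    f_lo = f(lo)
    for _ in 1:Int(maxiter)
        mid = (lo + hi) / 2
        f_mid = f(mid)
        # Stop when the residual or the half-width of the bracket is within tol
        if abs(f_mid) <= tol || (hi - lo) / 2 <= tol
            return mid
        end
        # Keep the half of the bracket that still contains the sign change
        if sign(f_mid) == sign(f_lo)
            lo, f_lo = mid, f_mid
        else
            hi = mid
        end
    end
    return (lo + hi) / 2  # iteration cap reached; return the best midpoint
end

# Example: solve exp(θ) = 3, whose exact solution is log(3)
root = bisect(θ -> exp(θ) - 3)
@assert abs(root - log(3)) < 1e-8
```

With the very wide default bracket, each iteration halves the interval, so roughly 70 iterations suffice to reach a tolerance of 1e-10; the large maxiter default acts only as a safety cap in this sketch.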

Info

The metaprogramming flag controls whether metaprogramming is used when converting symbolic derivatives from Symbolics.jl atoms to base Julia. The conversion also works without metaprogramming, but it is slower and requires more calls. The flag is provided for users who want to avoid metaprogramming in their pipeline.
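
A short sketch of overriding a few defaults follows. The Options keyword constructor and the field names come from the signature above. Passing the result through an options keyword on a model constructor is an assumption not documented on this page, so check the constructor docstrings before relying on it.

```julia
using ExpFamilyPCA

# Override a few defaults; every other field keeps its documented default
opts = Options(; μ = 0.5, A_init_value = 0.1, tol = 1e-8, metaprogramming = false)

@assert opts.μ == 0.5
@assert opts.tol == 1e-8
@assert opts.metaprogramming == false
@assert opts.V_init_value == 1.0   # untouched default

# ASSUMPTION: EPCA constructors accept an `options` keyword argument.
# This keyword is not shown on this page.
epca = BernoulliEPCA(10, 5; options = opts)
```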

ExpFamilyPCA.NegativeDomain (Function)
NegativeDomain(; metaprogramming::Bool = true, μ::Real = 1, ϵ::Real = eps(), low::Real = -1e10, high::Real = 1e10, tol::Real = 1e-10, maxiter::Real = 1e6)

Returns an instance of Options configured for optimization over the negative domain. Sets defaults for A and V parameters while keeping the remaining settings from Options.

Specific Settings

  • A_init_value = -1: Initializes A with a negative value.
  • A_upper = -1e-4: Upper bound for A is constrained to a small negative value.
  • V_init_value = 1: Initializes V with a positive value.
  • V_lower = 1e-4: Lower bound for V is constrained to a small positive value.

Other fields inherit from the Options struct.

ExpFamilyPCA.PositiveDomain (Function)
PositiveDomain(; metaprogramming::Bool = true, μ::Real = 1, ϵ::Real = eps(), low::Real = -1e10, high::Real = 1e10, tol::Real = 1e-10, maxiter::Real = 1e6)

Returns an instance of Options configured for optimization over the positive domain. Sets defaults for A and V parameters while keeping the remaining settings from Options.

Specific Settings

  • A_init_value = 1: Initializes A with a positive value.
  • A_lower = 1e-4: Lower bound for A is constrained to a small positive value.
  • V_init_value = 1: Initializes V with a positive value.
  • V_lower = 1e-4: Lower bound for V is constrained to a small positive value.

Other fields inherit from the Options struct.
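
The two domain helpers can be compared side by side. This sketch assumes the returned Options instance exposes the fields listed in the Options section through ordinary property access. The values checked are the ones stated under Specific Settings.

```julia
using ExpFamilyPCA

neg = NegativeDomain()
pos = PositiveDomain(; tol = 1e-8)   # shared keywords can still be overridden

# A is pushed to opposite sides of zero by the two helpers
@assert neg.A_init_value == -1 && neg.A_upper == -1e-4
@assert pos.A_init_value == 1 && pos.A_lower == 1e-4

# V is initialized and bounded the same way in both
@assert neg.V_init_value == 1 && neg.V_lower == 1e-4
@assert pos.V_init_value == 1 && pos.V_lower == 1e-4

# The keyword override is carried through
@assert pos.tol == 1e-8
```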

  • [1] If compress is called before fit!, X will be compressed using unfitted starting weights.