CyqloneSolver#

template<index_t VL = 4, class T = real_t, StorageOrder DefaultOrder = StorageOrder::ColMajor, class Ctx = parallel::Context<>> struct CyqloneSolver

Linear solver for systems with optimal control structure.

Template Parameters:

VL – Vector length.
T – Scalar type.
DefaultOrder – Storage order for the matrix workspaces (row/column major).
Ctx – Parallel execution context type, see parallel::Context.

Parallelization and vectorization

using tricyqle_t = TricyqleSolver<VL, T, DefaultOrder, Ctx>: Tricyqle solver type for solving block-tridiagonal systems in parallel.

using Context = tricyqle_t::Context

using SharedContext = tricyqle_t::SharedContext

using simd = tricyqle_t::simd

const index_t p: Number of processors/threads.

const index_t n = (N_horiz + p * v - 1) / (p * v): Number of stages per thread per vector lane (rounded up).

static index_t v = VL: Vector length.

inline index_t lv() const: log₂(v), logarithm of the vector length.

inline index_t lp() const: log₂(p), logarithm of the number of processors/threads, rounded up.

inline index_t ceil_p() const: The number of processors p rounded up to the next power of two.

inline index_t ceil_P() const: The number of parallel execution units P rounded up to the next power of two.

inline std::unique_ptr<SharedContext> create_parallel_context() const

inline void foreach_stage(Context &ctx, auto &&func, auto&&... xs) const

Call a function for each stage in the horizon, passing the stage index, the data batch index, and optionally the corresponding batches of the given arrays.

Iterates backwards in time (decreasing stage index j).

inline void foreach_stage_fwd(Context &ctx, auto &&func, auto&&... xs) const

Call a function for each stage in the horizon, passing the stage index, the data batch index, and optionally the corresponding batches of the given arrays.

Iterates forward in time (increasing stage index j).

Matrix data structures

template<StorageOrder O = column_major> using matrix = typename tricyqle_t::template matrix<O>: Owning type for a batch of matrices (with batch size v).

template<StorageOrder O = column_major> using view = typename tricyqle_t::template view<O>: Non-owning immutable view type for matrix.

template<StorageOrder O = column_major> using mut_view = typename tricyqle_t::template mut_view<O>: Non-owning mutable view type for matrix.

using layer_stride = typename tricyqle_t::layer_stride

template<StorageOrder O = column_major> using batch_view = typename tricyqle_t::template batch_view<O>: Non-owning immutable view type for a single batch of v matrices.

template<StorageOrder O = column_major> using mut_batch_view = typename tricyqle_t::template mut_batch_view<O>: Non-owning mutable view type for a single batch of v matrices.

static auto default_order = tricyqle_t::default_order: Default storage order for most matrices.

static auto column_major = tricyqle_t::column_major: Column-major storage order for column vectors and update matrices.

Problem dimensions

const index_t N_horiz: Horizon length of the optimal control problem.

const index_t nx: Number of states of the OCP.

const index_t nu: Number of controls of the OCP.

const index_t ny: Number of general constraints of the OCP per stage.

const index_t ny_0: Number of general constraints at stage 0, D(0) u(0).

const index_t ny_N: Number of general constraints at the final stage, C(N) x(N).

inline index_t num_variables() const: Get the total number of primal variables in the OCP.

Note

The actual number of variable stored in Cyqlone’s internal data structures may be larger.

inline index_t num_dynamics_constraints() const: Get the total number of dynamics constraints in the OCP.

Note

The actual number of constraints stored in Cyqlone’s internal data structures may be larger.

inline index_t num_general_constraints() const: Get the total number of general constraints in the OCP.

Note

The actual number of constraints stored in Cyqlone’s internal data structures may be larger.

Solver parameters

CyqloneParams<value_type> params = {}: Solver parameters and settings.

inline CyqloneParams<value_type> get_params() const: Get the current Cyqlone solver parameters.

inline void update_params(const CyqloneParams<value_type> &new_params): Update the Cyqlone solver parameters.

inline TricyqleParams<value_type> get_tricyqle_params() const: Get the current Tricyqle solver parameters.

inline void update_tricyqle_params(const TricyqleParams<value_type> &new_params): Update the Tricyqle solver parameters.

inline std::string get_params_string() const: Get a string representation of the main solver parameters. Used mainly for file names.

Tricyqle solver for block-tridiagonal systems

tricyqle_t tricyqle = {.block_size =nx,.max_rank = std::max(ny, ny_0 + ny_N) * N_horiz,.p = p,}: Block-tridiagonal solver (CR/PCR/PCG).

OCP data (reordered for use during the Cyqlone algorithm)

matrix<default_order> data_H = [this] {returnmatrix<default_order>{{.depth = ceil_N(), .rows = nu + nx, .cols = nu + nx}};}(): Stage-wise Hessian blocks H(j) = [ R(j) S(j); S(j)ᵀ Q(j) ] of the OCP cost function.

matrix<default_order> data_F = [this] {returnmatrix<default_order>{{.depth = ceil_N(), .rows = nx, .cols = nu + nx}};}(): Stage-wise dynamics matrices F(j) = [ B(j) A(j) ] of the OCP.

matrix< default_order > data_Gᵀ = [this] {const auto nyM = std::max(ny, ny_0 + ny_N);returnmatrix<default_order>{{.depth = ceil_N(), .rows = nu + nx, .cols = nyM}};}(): Stage-wise constraint Jacobians G(j)ᵀ = [ D(j) C(j) ]ᵀ of the OCP.

Modified Riccati data structures

matrix<default_order> riccati_LH = [this] {returnmatrix<default_order>{{.depth = p * v, .rows = nu + nx, .cols = n * (nu + nx)}};}()

Cholesky factors of the Hessian blocks for the Riccati recursion.

LH(j) = [ LR(j) 0; LS(j) LQ(j) ]

matrix<default_order> riccati_LAB = [this] {returnmatrix<default_order>{{.depth = p * v, .rows = nx, .cols = n * nx + n * nu}};}()

Storage for the matrices LB(j), Acl(j) and LA(j₁) for the Riccati recursion.

Grouped per thread, with layout [ Acl(jₙ) … Acl(j₂) LA(j₁) | LB(jₙ) … LB(j₁) ], so that LA(j₁) and LB(j) are contiguous (useful when evaluating the Schur complement).

matrix<default_order> riccati_V = [this] {const auto nyM = std::max(ny, ny_0 + ny_N);returnmatrix<default_order>{{.depth = p * v, .rows = nu + nx, .cols = nx+ nyM}};}()

Temporary storage for the V(j) = [ B(j)ᵀ LQ(j); A(j)ᵀ LQ(j) ] matrices during the Riccati recursion.

The workspace is wider than just V to also accommodate the active constraint Jacobians, since both are used to update the Hessian blocks during the Riccati recursion.

matrix<column_major> riccati_work = [this] {returnmatrix<column_major>{{.depth = p * v, .rows = nx, .cols = 1}};}(): Temporary workspace for the Riccati solve phase.

Riccati factorization update data structures

matrix< column_major > work_Σ = [this] {const auto nyM = std::max(ny, ny_0 + ny_N);returnmatrix<column_major>{{.depth = p * v, .rows = n* nyM, .cols = 1}};}(): Compressed representation of the nonzero diagonal elements of the matrix Σ, populated for each thread separately during the factorization update of the Riccati recursion.

matrix< column_major > riccati_Υ1 = [this] {const auto nyM = std::max(ny, ny_0 + ny_N);returnmatrix<column_major>{{.depth = p * v, .rows = nu + nx + nx, .cols = n* nyM}};}()

Workspace to store the update matrices Υu, Υx, Υλ, Φu, Φx and Φλ during the factorization update of the Riccati recursion.

Both riccati_Υ1 and riccati_Υ2 are used alternately.

matrix< column_major > riccati_Υ2 = [this] {const auto nyM = std::max(ny, ny_0 + ny_N);returnmatrix<column_major>{{.depth = p * v, .rows = nu + nx + nx, .cols = n* nyM}};}(): Alternate workspace to riccati_Υ1.

Indexing utilities

inline index_t ceil_N() const: Horizon length, rounded up to a multiple of the number of parallel execution units.

index_t ν2 (index_t i) const: 2-adic valuation ν₂.

index_t ν2p (index_t i) const: 2-adic valuation modulo p, i.e. ν2p(0) = ν2p(p) = lp().

index_t add_wrap_ceil_N(index_t a, index_t b) const: Add b to a modulo N_horiz.

index_t sub_wrap_ceil_N(index_t a, index_t b) const: Subtract b from a modulo N_horiz.

index_t add_wrap_p(index_t a, index_t b) const: Add b to a modulo p.

index_t sub_wrap_p(index_t a, index_t b) const: Subtract b from a modulo p.

index_t add_wrap_ceil_p(index_t a, index_t b) const: Add b to a modulo ceil_p().

index_t sub_wrap_ceil_p(index_t a, index_t b) const: Subtract b from a modulo ceil_p().

index_t sub_wrap_ceil_P(index_t a, index_t b) const

index_t add_wrap_ceil_P(index_t a, index_t b) const

index_t get_linear_batch_offset(index_t biA) const

Packing and unpacking of OCP data to Cyqlone storage format

void update_data(const CyqloneStorage<value_type> &ocp): Update the internal data structures to reflect changes in the OCP data (without changing the problem size).

void initialize_rhs(const CyqloneStorage<value_type> &ocp, mut_view<> rhs) const: Initialize the right-hand side vector for the dynamics constraints of the OCP, using the custom Cyqlone storage format.

matrix initialize_rhs(const CyqloneStorage<value_type> &ocp) const: Initialize the right-hand side vector for the dynamics constraints of the OCP, using the custom Cyqlone storage format.

void initialize_gradient(const CyqloneStorage<value_type> &ocp, mut_view<> grad) const: Initialize the gradient vector for the OCP cost function, using the custom Cyqlone storage format.

matrix initialize_gradient(const CyqloneStorage<value_type> &ocp) const: Initialize the gradient vector for the OCP cost function, using the custom Cyqlone storage format.

void initialize_bounds(const CyqloneStorage<value_type> &ocp, mut_view<> b_min, mut_view<> b_max) const: Initialize the lower and upper bounds for the general constraints of the OCP, using the custom Cyqlone storage format.

std::pair<matrix<>, matrix<>> initialize_bounds(const CyqloneStorage<value_type> &ocp) const: Initialize the lower and upper bounds for the general constraints of the OCP, using the custom Cyqlone storage format.

void pack_variables(std::span<const value_type> ux_lin, mut_view<> ux) const

matrix pack_variables(std::span<const value_type> ux_lin) const

void unpack_variables(view<> ux, std::span<value_type> ux_lin) const

std::vector<value_type> unpack_variables(view<> ux) const

void pack_dynamics (std::span< const value_type > λ_lin, mut_view<> λ) const

matrix pack_dynamics (std::span< const value_type > λ_lin) const

void unpack_dynamics (view<> λ, std::span< value_type > λ_lin) const

std::vector< value_type > unpack_dynamics (view<> λ) const

void pack_constraints(std::span<const value_type> y_lin, mut_view<> y, value_type fill = 0) const

matrix pack_constraints(std::span<const value_type> y_lin, value_type fill = 0) const

void unpack_constraints(view<> y, std::span<value_type> y_lin) const

std::vector<value_type> unpack_constraints(view<> y) const

matrix initialize_variables() const: Get a zero-initialized matrix for the primal variables u and x.

matrix initialize_dynamics_constraints() const: Get a zero-initialized matrix for the dynamics constraints (or their multipliers).

matrix initialize_general_constraints() const: Get a zero-initialized matrix for the general constraints (or their multipliers).

static CyqloneSolver build(const CyqloneStorage<value_type> &ocp, index_t p)

Initialize a Cyqlone solver for the given OCP.

Note: constraints on u(0) and x(N) should be independent.

              nx  nu
ocp.CD(0) = [ 0 | D ] ny₀
            [ 0 | 0 ] ny - ny₀

Since ocp.D(0) and ocp.C(N) will be merged, the top ny₀ rows of ocp.C(N) should be zero.

OCP cost gradient and constraints evaluation

void residual_dynamics_constr(Context &ctx, view<> x, view<> b, mut_view<> Mxb) const

Compute Mx + b, where M is the dynamics constraint Jacobian matrix of the OCP.

In other words, evaluate the residuals for all stages, i.e., \( A_j x^j + B_j u^j - x^{j+1} + b^j \).

void transposed_dynamics_constr (Context &ctx, view<> λ, mut_view<> Mᵀλ, bool accum=false) const

Compute Mᵀλ, where M is the dynamics constraint Jacobian matrix of the OCP.

Optionally add the result to the existing contents of Mᵀλ by setting accum to true.

void general_constr(Context &ctx, view<> ux, mut_view<> DCux) const

Compute the general constraints Gx, where G is the general constraint Jacobian matrix of the OCP.

In other words, evaluate the constraints for all stages, i.e., \( C_j x^j + D_j u^j \).

void transposed_general_constr (Context &ctx, view<> y, mut_view<> DCᵀy) const: Compute Gᵀy, where G is the general constraint Jacobian matrix of the OCP.

void transposed_general_constr (view<> y, mut_view<> DCᵀy) const: Compute Gᵀy, where G is the general constraint Jacobian matrix of the OCP.

void cost_gradient (Context &ctx, view<> ux, value_type α, view<> q, value_type β, mut_view<> grad_f) const

Compute the cost gradient, with optional scaling factors.

grad_f ← Q ux + α q + β grad_f

void cost_gradient_regularized (Context &ctx, value_type γ, view<> ux, view<> ux0, view<> q, mut_view<> grad_f) const: Compute the regularized cost gradient, with regularization parameter γ⁻¹, with respect to the point ux0.

void cost_gradient_remove_regularization (Context &ctx, value_type γ, view<> x, view<> x0, mut_view<> grad_f) const: Subtract the regularization term from the cost gradient.

Factorization and solve routines

void factor_solve (Context &ctx, value_type γ, view<> Σ, mut_view<> ux, mut_view<> λ)

Compute the Cyqlone factorization of the KKT matrix of the OCP and perform a forward solve (fused for improved locality).