Results

FluxAnalysis

FluxAnalysis(log)

Bases: BaseAnalysis

add_categorical

add_categorical(value: str, col_name: str = 'name') -> None

Add a categorical column to flux_df in the result

Parameters:

Name Type Description Default
value str

The value used to fill the categorical column

required
col_name str

The name of the categorical column

'name'

Returns:

Type Description
None

aggregate classmethod

aggregate(
    analyses: List[FluxAnalysis],
    method: str,
    log: Optional[dict] = None,
    **kwargs
)

Return the aggregated result. If the concat method is used, the resulting dataframe includes a 'name' column holding the model name.

Parameters:

Name Type Description Default
analyses List[FluxAnalysis]

FluxAnalysis objects to be aggregated

required
method str

The aggregation method. Possible choices are: concat, sum, mean, and median.

required
log Optional[dict]

A dict containing information about the new analysis result

None
kwargs

Additional keyword arguments added to the result dict

{}

Returns:

Name Type Description
aggregated_flux_analysis FluxAnalysis
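
To make the four aggregation methods concrete, here is a minimal sketch that works on plain dicts instead of FluxAnalysis objects. The helper `aggregate_fluxes`, the row layout used for concat, and the model names are hypothetical and not part of pipeGEM; the real method operates on the flux_df of each analysis.

```python
from statistics import mean, median
from typing import Dict, List, Union

FluxTable = Dict[str, float]  # reaction ID -> flux


def aggregate_fluxes(
    tables: Dict[str, FluxTable], method: str
) -> Union[List[dict], FluxTable]:
    """Hypothetical stand-in for the aggregation step (not pipeGEM code)."""
    if method == "concat":
        # Stack all rows and tag each one with the model it came from.
        return [
            {"name": name, "reaction": rxn, "fluxes": flux}
            for name, table in tables.items()
            for rxn, flux in table.items()
        ]
    reducers = {"sum": sum, "mean": mean, "median": median}
    if method not in reducers:
        raise ValueError(f"Unknown aggregation method: {method!r}")
    # Reduce each reaction across the models that contain it.
    rxn_ids = sorted({rxn for table in tables.values() for rxn in table})
    return {
        rxn: reducers[method]([t[rxn] for t in tables.values() if rxn in t])
        for rxn in rxn_ids
    }


tables = {
    "model_a": {"PGI": 1.0, "PFK": 2.0},
    "model_b": {"PGI": 3.0, "PFK": 4.0},
}
print(aggregate_fluxes(tables, "mean"))
print(len(aggregate_fluxes(tables, "concat")))
```

With concat, every input row survives and only gains a 'name' label; the other three methods collapse the models into one value per reaction.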

FBA_Analysis

FBA_Analysis(log)

Bases: FluxAnalysis

FBA analysis result.

Attributes:

Name Type Description
log

FVA_Analysis

FVA_Analysis(log)

Bases: FluxAnalysis

aggregate classmethod

aggregate(
    analyses: List[FVA_Analysis],
    method: str,
    log: Optional[dict] = None,
    **kwargs
)

Return an aggregated FVA_Analysis. If the concat method is used, the resulting dataframe includes a 'name' column holding the model name.

Parameters:

Name Type Description Default
analyses List[FVA_Analysis]

FVA_Analysis objects to be aggregated

required
method str

The aggregation method. Possible choices are: concat, sum, mean, and median.

required
log Optional[dict]

A dict containing information about the new analysis result

None
kwargs

Additional keyword arguments

{}

Returns:

Name Type Description
aggregated_flux_analysis FluxAnalysis

EFluxAnalysis

EFluxAnalysis(log)

Bases: BaseAnalysis

Analysis result object for the E-Flux algorithm.

Encapsulates the results from the apply_EFlux function, which constrains model reaction bounds based on expression scores.

Parameters:

Name Type Description Default
log dict

A dictionary storing parameters used during the E-Flux execution, such as max_ub, min_lb, and protected_rxns.

required

Attributes:

Name Type Description
rxn_bounds dict[str, tuple[float, float]]

A dictionary mapping reaction IDs to the final bounds ((lower_bound, upper_bound)) applied to the model based on the scaled expression scores.

rxn_scores dict[str, float]

The original input dictionary mapping reaction IDs to their expression scores.

flux_result DataFrame or None

A DataFrame containing the flux distribution obtained from running parsimonious FBA (pFBA) on the E-Flux constrained model. This is None if return_fluxes was set to False in the apply_EFlux call. Indexed by reaction ID, with a single column 'fluxes'.

result_model Model

The cobra.Model object with the E-Flux constraints applied to its reaction bounds. If remove_zero_fluxes was True in apply_EFlux, this model will also have reactions with near-zero pFBA flux removed.
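
The idea of deriving rxn_bounds from scaled expression scores can be sketched as follows. This is one common E-Flux formulation, written under assumptions: the helper `eflux_bounds`, the `reversible` mapping, and the choice to scale by the maximum score are illustrative, and pipeGEM's exact handling of min_lb and protected_rxns is not described in this reference.

```python
from typing import Dict, Tuple


def eflux_bounds(
    rxn_scores: Dict[str, float],
    reversible: Dict[str, bool],
    max_ub: float = 1000.0,
) -> Dict[str, Tuple[float, float]]:
    """Hypothetical sketch: scale scores into (lower_bound, upper_bound) pairs."""
    top = max(rxn_scores.values())
    if top <= 0:
        raise ValueError("At least one expression score must be positive")
    bounds = {}
    for rxn_id, score in rxn_scores.items():
        # Assumption: the highest-scoring reaction receives the full max_ub.
        ub = max_ub * max(score, 0.0) / top
        # Assumption: reversible reactions get a symmetric lower bound.
        lb = -ub if reversible.get(rxn_id, False) else 0.0
        bounds[rxn_id] = (lb, ub)
    return bounds


bounds = eflux_bounds(
    {"R1": 10.0, "R2": 5.0}, {"R1": False, "R2": True}, max_ub=1000.0
)
print(bounds)
```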

GIMMEAnalysis

GIMMEAnalysis(log)

Bases: BaseAnalysis

Analysis result object for the GIMME algorithm.

Encapsulates the results from the apply_GIMME function, which generates a context-specific model by minimizing flux through reactions inconsistent with expression data, while maintaining a required metabolic objective.

Parameters:

Name Type Description Default
log dict

A dictionary storing parameters used during the GIMME execution, such as high_exp (expression threshold) and obj_frac (objective fraction).

required

Attributes:

Name Type Description
rxn_coefficients dict[str, float]

A dictionary mapping reaction IDs to the objective coefficients (penalties) applied during the GIMME optimization. Penalties are typically calculated as (high_exp - score) for reactions with scores below high_exp.

rxn_scores dict[str, float]

The original input dictionary mapping reaction IDs to their expression scores.

flux_result DataFrame or None

A DataFrame containing the optimal flux distribution found by the GIMME optimization. This is None if return_fluxes was set to False in the apply_GIMME call. Indexed by reaction ID, with a single column 'fluxes'.

result_model Model or None

The pruned cobra.Model object after removing reactions with near-zero flux based on flux_threshold. This is None if remove_zero_fluxes was set to False in the apply_GIMME call.
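
The penalty rule described for rxn_coefficients can be written in a few lines. The function name `gimme_penalties` is hypothetical, and assigning a zero penalty to reactions at or above high_exp is an assumption; the actual apply_GIMME implementation may treat such reactions, or reactions without a score, differently.

```python
from typing import Dict


def gimme_penalties(rxn_scores: Dict[str, float], high_exp: float) -> Dict[str, float]:
    """Hypothetical sketch of the (high_exp - score) penalty rule."""
    return {
        rxn_id: (high_exp - score) if score < high_exp else 0.0
        for rxn_id, score in rxn_scores.items()
    }


penalties = gimme_penalties({"R1": 2.0, "R2": 8.0, "R3": 5.0}, high_exp=5.0)
print(penalties)
```

Only R1 falls below the threshold here, so it is the only reaction whose flux would be penalized in the minimization.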

SPOTAnalysis

SPOTAnalysis(log)

Bases: BaseAnalysis

Analysis result object for the SPOT algorithm.

Encapsulates the results from the apply_SPOT function, which generates an expression-guided flux distribution by maximising a weighted-sum objective over reaction expression scores while maintaining a fraction of the FBA-optimal objective and bounding total flux via an L1 norm constraint.

Parameters:

Name Type Description Default
log dict

A dictionary storing parameters used during the SPOT execution, such as obj_frac, norm_ub, protected_rxns, and remove_zero_fluxes.

required

Attributes:

Name Type Description
rxn_scores dict[str, float]

The original input dictionary mapping reaction IDs to their expression scores passed to apply_SPOT.

flux_result DataFrame or None

A DataFrame containing the SPOT flux distribution. Indexed by reaction ID with a single column 'fluxes'. None when return_fluxes=False was passed to apply_SPOT.

result_model Model or None

The pruned cobra.Model after removing reactions with near-zero SPOT flux. None when remove_zero_fluxes=False (default).

RIPTiDePruningAnalysis

RIPTiDePruningAnalysis(log)

Bases: BaseAnalysis

Analysis result object for the RIPTiDe pruning step.

Encapsulates the results from the apply_RIPTiDe_pruning function, including the pruned model and the objective weights used.

Parameters:

Name Type Description Default
log dict

A dictionary storing parameters used during the RIPTiDe pruning execution, such as max_gw, obj_frac, and threshold.

required

Attributes:

Name Type Description
result_model Model

The pruned context-specific metabolic model resulting from the pFBA-based removal of low-flux reactions.

removed_rxn_ids list[str]

A list of string IDs for the reactions that were removed from the original model during the pruning process.

obj_dict dict[str, float]

A dictionary mapping reaction IDs to the calculated objective weights used in the parsimonious FBA (pFBA) step. Weights are derived from reaction expression scores (RALs).

RIPTiDeSamplingAnalysis

RIPTiDeSamplingAnalysis(log)

Bases: BaseAnalysis

Analysis result object for the RIPTiDe sampling step.

Encapsulates the results from the apply_RIPTiDe_sampling function, primarily the flux sampling data if generated.

Parameters:

Name Type Description Default
log dict

A dictionary storing parameters used during the RIPTiDe sampling execution, such as max_gw, obj_frac, sampling_obj_frac, sampling_method, etc.

required

Attributes:

Name Type Description
sampling_result SamplingAnalysis or None

An object containing the results of the flux sampling process (e.g., flux distributions stored in sampling_result.flux_df). This is None if do_sampling was set to False in the apply_RIPTiDe_sampling call.

flux_result DataFrame or None

Property providing direct access to the flux sampling dataframe stored within sampling_result. Returns None if no sampling was performed.

flux_result property

flux_result

Flux sampling results as a pandas DataFrame, if available.

FASTCOREAnalysis

FASTCOREAnalysis(log)

Bases: BaseAnalysis

Analysis result object for the FASTCORE algorithm.

Encapsulates the results from the apply_FASTCORE function, which extracts a flux-consistent subnetwork from a larger metabolic model based on core and non-penalty reaction sets.

Parameters:

Name Type Description Default
log dict

A dictionary storing parameters used during the FASTCORE execution, notably the flux tolerance epsilon (which might be a single float or a pandas Series if rxn_scaling_coefs were used).

required

Attributes:

Name Type Description
result_model Model or None

The extracted, flux-consistent subnetwork as a cobra.Model object. This attribute is None if return_model was set to False during the apply_FASTCORE call.

removed_rxn_ids ndarray

A NumPy array containing the string IDs of reactions that were present in the original model but removed to create the result_model.

kept_rxn_ids ndarray

A NumPy array containing the string IDs of reactions from the original model that were retained in the final result_model.

algo_efficacy dict or None

A dictionary containing efficacy metrics (e.g., 'precision', 'recall', 'F1_score', 'MCC') evaluating the performance of the algorithm by comparing the kept_rxn_ids and removed_rxn_ids against the initial core (C) and non-core sets (derived from reactions not in C or nonP). This attribute is None if calc_efficacy was False during the apply_FASTCORE call.
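
As a rough guide to how such efficacy metrics relate to the reaction sets, the sketch below treats kept core reactions as true positives and removed non-core reactions as true negatives. That mapping, and the helper name `efficacy`, are assumptions made for illustration; the exact definitions used by pipeGEM are not given here.

```python
from math import sqrt
from typing import Dict, Set


def efficacy(
    kept: Set[str], removed: Set[str], core: Set[str], non_core: Set[str]
) -> Dict[str, float]:
    """Hypothetical sketch of precision, recall, F1, and MCC from reaction sets."""
    tp = len(kept & core)         # core reactions that were kept
    fp = len(kept & non_core)     # non-core reactions that were kept
    fn = len(removed & core)      # core reactions that were removed
    tn = len(removed & non_core)  # non-core reactions that were removed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "F1_score": f1, "MCC": mcc}


metrics = efficacy(
    kept={"A", "B", "C"},
    removed={"D", "E"},
    core={"A", "B", "D"},
    non_core={"C", "E"},
)
print(metrics)
```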

rFASTCORMICSAnalysis

rFASTCORMICSAnalysis(log)

Bases: BaseAnalysis

Analysis result object for the rFASTCORMICS algorithm.

Stores the outputs generated by the apply_rFASTCORMICS function.

Parameters:

Name Type Description Default
log dict

Dictionary storing parameters used to perform this analysis.

required

Attributes:

Name Type Description
fastcore_result FASTCOREAnalysis

The result object from the underlying FASTCORE run. Contains:

- result_model (cobra.Model): The final context-specific model.
- removed_rxn_ids (np.ndarray): IDs of reactions removed.
- kept_rxn_ids (np.ndarray): IDs of reactions kept.

threshold_analysis rFASTCORMICSThresholdAnalysis

Threshold analysis object defining core/non-core reactions.

core_rxns set[str]

Set of identified core reaction IDs.

noncore_rxns set[str]

Set of identified non-core reaction IDs.

nonP_rxns set[str]

Set of identified non-penalty reaction IDs.

result_model Model

Property accessing the final context-specific model from fastcore_result.

kept_rxn_ids ndarray

Property accessing the kept reaction IDs from fastcore_result.

removed_rxn_ids ndarray

Property accessing the removed reaction IDs from fastcore_result.

CORDA_Analysis

CORDA_Analysis(log)

Bases: BaseAnalysis

Analysis result object for the CORDA algorithm.

Encapsulates the results from the apply_CORDA function, which builds a context-specific metabolic model based on reaction confidence scores, typically derived from experimental data like transcriptomics.

Parameters:

Name Type Description Default
log dict

A dictionary storing parameters used during the CORDA execution, including penalty factors (penalty_factor, penalty_increase_factor), support thresholds (keep_if_support), flux thresholds (threshold, support_flux_value), and bounds (upper_bound).

required

Attributes:

Name Type Description
result_model Model

The final context-specific metabolic model generated by the CORDA algorithm, containing reactions deemed active in the specific context.

conf_scores dict[str, float]

A dictionary mapping reaction variable IDs (including both forward and reverse directions, e.g., 'ATPS4r' and 'ATPS4r_reverse') to their final confidence scores after the CORDA refinement process. Scores typically range from -1 (low confidence, likely removed) to 3 (high confidence/core reaction, kept).

threshold_analysis ThresholdAnalysis

An object containing details about the thresholding strategy used to convert continuous input data (e.g., gene expression) into the initial discrete confidence scores used by CORDA. The specific type of this object (e.g., rFASTCORMICSThresholdAnalysis) depends on the predefined_threshold or strategy used in apply_CORDA.

removed_rxn_ids ndarray

A NumPy array containing the cobra.Reaction objects (not just their IDs) that were present in the original model but were removed during the CORDA model building process based on confidence scores and dependency assessments.

algo_efficacy dict or None

A dictionary containing efficacy metrics (e.g., 'precision', 'recall', 'F1_score', 'MCC') evaluating the performance of the algorithm. It compares the reactions present in the result_model against the initial high-confidence (core) and low-confidence (non-core) sets derived from the input conf_scores.

MBA_Analysis

MBA_Analysis(log)

Bases: BaseAnalysis

Analysis result object for the Model Building Algorithm (MBA).

Encapsulates the results from the apply_MBA function, representing a context-specific model built by iteratively removing reactions based on confidence levels (high, medium, none).

Parameters:

Name Type Description Default
log dict

A dictionary storing parameters used during the MBA execution, such as tolerance, epsilon, and random_state.

required

Attributes:

Name Type Description
result_model Model

The final context-specific metabolic model generated by the MBA algorithm after iterative removal of no-confidence reactions.

removed_rxn_ids ndarray

A NumPy array containing the string IDs of reactions removed from the original model during the MBA process.

threshold_analysis ThresholdAnalysis or None

An object containing details about the thresholding strategy used to derive the initial high- and medium-confidence reaction sets from continuous data (like gene expression). This is None if confidence sets were provided directly instead of using data. The specific type depends on the predefined_threshold used.

algo_efficacy float or None

An efficacy score (e.g., F1-score) comparing the reactions present in the result_model against the initial high-confidence and no-confidence reaction sets.

mCADRE_Analysis

mCADRE_Analysis(log)

Bases: BaseAnalysis

Analysis result object for the mCADRE algorithm.

Encapsulates the results from the apply_mCADRE function, representing a context-specific model built by evaluating reactions based on expression, connectivity, evidence, and metabolic task performance.

Parameters:

Name Type Description Default
log dict

A dictionary storing parameters used during the mCADRE execution, such as exp_cutoff, absent_value, eta, and tol.

required

Attributes:

Name Type Description
result_model Model

The final context-specific metabolic model generated by mCADRE after iterative reaction removal.

removed_rxn_ids ndarray

A NumPy array containing the string IDs of reactions removed from the original model during the pruning process.

core_rxn_ids ndarray

A NumPy array containing the string IDs of reactions initially classified as 'core' based on the exp_cutoff threshold applied to mapped expression scores.

non_expressed_rxn_ids ndarray

A NumPy array containing the string IDs of reactions initially classified as 'non-expressed' based on the absent_value_indicator.

score_df DataFrame

A DataFrame containing the calculated scores for each reaction, indexed by reaction ID, with columns for 'expression' (mapped score), 'connectivity' (based on neighboring reaction scores), and 'evidence' (user-provided or default zero). This DataFrame is sorted to guide the removal process.

func_test_result TaskAnalysis or None

An object containing the results of the functional metabolic task tests performed on the model during the pruning process. None if no functional tests were provided or run.

salvage_test_result TaskAnalysis or None

An object containing the results of the salvage pathway task tests performed on the model during the pruning process. None if no salvage tests were provided or run.

threshold_analysis ThresholdAnalysis

An object containing details about the thresholding strategy used to convert continuous input data (e.g., gene expression) into the initial scores used for core/non-expressed classification. The specific type depends on the predefined_threshold used.

algo_efficacy float or None

An efficacy score (e.g., F1-score) comparing the reactions present in the result_model against the initial core_rxn_ids and non-core IDs.

iMAT_Analysis

iMAT_Analysis(log)

Bases: BaseAnalysis

Analysis result object for the iMAT algorithm.

Encapsulates the results from the apply_iMAT function, representing a context-specific model generated by maximizing flux through high-confidence reactions and minimizing flux through low-confidence reactions using MILP.

Parameters:

Name Type Description Default
log dict

A dictionary storing parameters used during the iMAT execution, such as eps, tol, and thresholding keyword arguments.

required

Attributes:

Name Type Description
result_model Model

The final context-specific metabolic model generated by iMAT after removing reactions with near-zero flux in the optimal MILP solution.

removed_rxn_ids ndarray

A NumPy array containing the string IDs of reactions removed from the original model based on the flux tolerance tol.

threshold_analysis ThresholdAnalysis

An object containing details about the thresholding strategy used to derive the initial high-confidence (core) and low-confidence (non-core) reaction sets from continuous data (like gene expression). The specific type depends on the predefined_threshold used.

INIT_Analysis

INIT_Analysis(log)

Bases: BaseAnalysis

Analysis result object for the INIT algorithm.

Encapsulates the results from the apply_INIT function, representing a context-specific model generated by maximizing the sum of reaction weights (derived from expression data) in a MILP framework.

Parameters:

Name Type Description Default
log dict

A dictionary storing parameters used during the INIT execution, such as eps, tol, weight_method, and thresholding keyword arguments.

required

Attributes:

Name Type Description
result_model Model

The final context-specific metabolic model generated by INIT after removing reactions with near-zero flux in the optimal MILP solution.

removed_rxn_ids ndarray

A NumPy array containing the string IDs of reactions removed from the original model based on the flux tolerance tol.

threshold_analysis ThresholdAnalysis or None

An object containing details about the thresholding strategy used to derive reaction weights if weight_method was 'threshold'. This is None if weight_method was 'default'. The specific type depends on the predefined_threshold used.

weight_dic dict[str, float]

A dictionary mapping reaction IDs to the calculated weights used in the INIT objective function.

fluxes DataFrame

A DataFrame containing the absolute flux values for all reactions in the model from the optimal MILP solution, before reaction removal based on tol. Indexed by reaction ID, with a single column 'fluxes'.
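
The relationship between fluxes, tol, and removed_rxn_ids can be illustrated with a small sketch. The helper `split_by_tolerance` is hypothetical, it uses a plain dict instead of a DataFrame, and whether the real comparison against tol is strict is an assumption.

```python
from typing import Dict, List, Tuple


def split_by_tolerance(
    abs_fluxes: Dict[str, float], tol: float = 1e-6
) -> Tuple[List[str], List[str]]:
    """Hypothetical sketch: partition reaction IDs by absolute flux vs. tol."""
    kept = [rxn for rxn, flux in abs_fluxes.items() if abs(flux) >= tol]
    removed = [rxn for rxn, flux in abs_fluxes.items() if abs(flux) < tol]
    return kept, removed


kept, removed = split_by_tolerance({"R1": 2.5, "R2": 0.0, "R3": 1e-9}, tol=1e-6)
print(kept, removed)
```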

rFASTCORMICSThresholdAnalysis

rFASTCORMICSThresholdAnalysis(log)

Bases: BaseAnalysis

Analysis result object for thresholds found using the rFASTCORMICS method.

Stores the results of fitting a bimodal Gaussian distribution to the expression data's Kernel Density Estimate (KDE). Provides access to the calculated expression and non-expression thresholds, the fitted curves, and the original KDE data. Also includes plotting functionality.

Attributes:

Name Type Description
exp_th float

The primary expression threshold (mean of the higher-expression Gaussian).

non_exp_th float

The primary non-expression threshold (mean of the lower-expression Gaussian).

init_threshold tuple[float, float]

The initial heuristic guesses for the expression and non-expression thresholds.

_result dict

Dictionary holding the detailed results:

- "x": np.ndarray, x-values for the KDE.
- "y": np.ndarray, y-values (density) for the KDE.
- "exp_th_arr": np.ndarray, array of best expression thresholds found (ranked).
- "nonexp_th_arr": np.ndarray, array of best non-expression thresholds found (ranked).
- "right_curve_arr": np.ndarray | None, array of y-values for the fitted higher-expression Gaussian curves.
- "left_curve_arr": np.ndarray | None, array of y-values for the fitted lower-expression Gaussian curves.
- "init_exp": float, initial guess for expression threshold.
- "init_nonexp": float, initial guess for non-expression threshold.

PercentileThresholdAnalysis

PercentileThresholdAnalysis(log)

Bases: BaseAnalysis

Analysis result object for thresholds found using simple percentiles.

Stores the results of calculating thresholds based on specified percentiles of the expression data. Provides access to the calculated expression threshold (which might be a single value or a series if multiple percentiles were used) and the original data. Includes plotting functionality.

Attributes:

Name Type Description
exp_th float | Series

The calculated expression threshold(s). If a single percentile p was used, this is a float. If multiple percentiles were specified, this might be the highest threshold (default) or a specific one if exp_p was set. Refer to the threshold_series for all calculated percentile values.

non_exp_th float | Series

The calculated non-expression threshold(s). Similar logic to exp_th, using the lowest threshold by default or a specific one if non_exp_p was set.

_result dict

Dictionary holding the detailed results:

- "data": np.ndarray, the filtered input expression data used for calculation.
- "exp_th": float, the final expression threshold selected.
- "non_exp_th": float, the final non-expression threshold selected.
- "threshold_series": pd.Series | None, Series containing thresholds for all calculated percentiles (index: "p=value"), or None if only one p was given.
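
The default selection logic (highest percentile value as exp_th, lowest as non_exp_th) can be sketched with the standard library alone. The helpers `percentile` and `percentile_thresholds` are hypothetical, and linear interpolation between ranks is an assumption about how the percentiles are computed.

```python
from typing import Dict, Sequence


def percentile(values: Sequence[float], p: float) -> float:
    """Linear-interpolation percentile of a non-empty sequence, p in [0, 100]."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def percentile_thresholds(data: Sequence[float], ps: Sequence[float]) -> Dict:
    """Hypothetical sketch mirroring the keys described for _result."""
    series = {f"p={p}": percentile(data, p) for p in ps}
    return {
        "exp_th": max(series.values()),      # default: highest threshold
        "non_exp_th": min(series.values()),  # default: lowest threshold
        "threshold_series": series,
    }


data = [float(v) for v in range(1, 12)]  # 1.0, 2.0, ..., 11.0
result = percentile_thresholds(data, [25, 75])
print(result)
```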

LocalThresholdAnalysis

LocalThresholdAnalysis(log)

Bases: BaseAnalysis

Analysis result object for locally calculated expression thresholds.

Stores gene-specific expression thresholds calculated for different sample groups based on within-group percentiles. Also stores optional global 'on' and 'off' thresholds used to override local thresholds for consistently high/low genes. Includes plotting functionality for visualizing expression distributions and thresholds.

Attributes:

Name Type Description
exp_ths DataFrame

DataFrame containing the local expression thresholds (genes x groups).

global_off_th Series

Series containing the global 'off' threshold for each group (index: group name). Genes with maximum expression below this in a group use this threshold.

global_on_th Series

Series containing the global 'on' threshold for each group (index: group name). Genes with minimum expression above this in a group use this threshold.

_result dict

Dictionary holding the detailed results:

- "exp_ths": pd.DataFrame, the local thresholds (genes x groups).
- "global_on_th": pd.Series, global 'on' thresholds per group.
- "global_off_th": pd.Series, global 'off' thresholds per group.
- "data": pd.DataFrame, the input expression data (genes x samples).
- "groups": pd.Series, mapping of samples (index) to group names (values).
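
The override rule for consistently low or high genes can be sketched for a single group as follows. The helper names, the dict-based inputs, and the use of a linear-interpolation percentile are assumptions made for illustration and do not reflect the actual pipeGEM implementation.

```python
from typing import Dict, Sequence


def percentile(values: Sequence[float], p: float) -> float:
    """Linear-interpolation percentile of a non-empty sequence, p in [0, 100]."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def local_thresholds(
    group_expr: Dict[str, Sequence[float]],
    p: float,
    global_off_th: float,
    global_on_th: float,
) -> Dict[str, float]:
    """Hypothetical sketch: per-gene thresholds within one sample group."""
    thresholds = {}
    for gene, values in group_expr.items():
        if max(values) < global_off_th:
            # Consistently low gene: fall back to the global 'off' threshold.
            thresholds[gene] = global_off_th
        elif min(values) > global_on_th:
            # Consistently high gene: fall back to the global 'on' threshold.
            thresholds[gene] = global_on_th
        else:
            thresholds[gene] = percentile(values, p)
    return thresholds


ths = local_thresholds(
    {"g_low": [0.1, 0.2, 0.3], "g_mid": [2.0, 4.0, 6.0], "g_high": [9.0, 10.0, 11.0]},
    p=50,
    global_off_th=1.0,
    global_on_th=8.0,
)
print(ths)
```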

TaskAnalysis

TaskAnalysis(log)

Bases: BaseAnalysis

An object containing the task analysis result. It should contain the following results:

- result_df (dict): A dataframe recording the details of all tests.
- score (int): Number of passed functionality (metabolic task) tests.

Parameters:

Name Type Description Default
log

A dict storing parameters used to perform this analysis

required

FastCCAnalysis

FastCCAnalysis(log)

Bases: ConsistencyAnalysis

FASTCC analysis result containing consistent_model, removed_rxn_ids, and kept_rxn_ids:

- consistent_model (pg.Model or cobra.Model): A model without inconsistent reactions. An inconsistent reaction cannot carry non-zero flux under any circumstances.
- removed_rxn_ids (np.ndarray): An array containing the IDs of the removed reactions.
- kept_rxn_ids (np.ndarray): An array containing the IDs of the remaining reactions.

Parameters:

Name Type Description Default
log

A dict storing parameters used to perform this analysis

required

FVAConsistencyAnalysis

FVAConsistencyAnalysis(log)

Bases: ConsistencyAnalysis

FVA analysis result containing consistent_model, removed_rxn_ids, and kept_rxn_ids:

- consistent_model (pg.Model or cobra.Model): A model without inconsistent reactions. An inconsistent reaction cannot carry non-zero flux under any circumstances.
- removed_rxn_ids (np.ndarray): An array containing the IDs of the removed reactions.
- kept_rxn_ids (np.ndarray): An array containing the IDs of the remaining reactions.

Parameters:

Name Type Description Default
log

A dict storing parameters used to perform this analysis

required

CorrelationAnalysis

CorrelationAnalysis(log)

Bases: BaseAnalysis

Correlation result. The result dict stores a pd.DataFrame under the key correlation_result.

Parameters:

Name Type Description Default
log

A dict storing parameters used to perform this analysis

required

corr_df property

corr_df

Return the computed correlation matrix.

PCA_Analysis

PCA_Analysis(log)

Bases: BaseAnalysis

embedding_df property

embedding_df

Return the coordinate dataframe for PCA/embedding analyses.

DataAggregation

DataAggregation(log)

Bases: BaseAnalysis

Aggregated GeneData that helps to perform local thresholding, correlation, and dimensionality reduction analysis.

Parameters:

Name Type Description Default
log

A dict storing parameters used to perform this analysis

required

bh_adjust

bh_adjust(p)

Benjamini-Hochberg p-value correction for multiple hypothesis testing.
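
For readers unfamiliar with the procedure, the standard Benjamini-Hochberg step-up adjustment can be sketched in pure Python. The function `bh_adjust_sketch` is a hypothetical illustration, not the library's bh_adjust, whose input and return types are not specified here.

```python
from typing import List, Sequence


def bh_adjust_sketch(pvals: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


padj = bh_adjust_sketch([0.01, 0.04, 0.03, 0.005])
print(padj)
```

Each p-value is multiplied by m / rank, and the running minimum keeps the adjusted values non-decreasing in the original p-value order.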

hypergeometric_test

hypergeometric_test(
    data: DataFrame, pathway_col: str, sig_col: str
) -> pd.DataFrame

Perform hypergeometric test on the given data.

Parameters:

Name Type Description Default
data DataFrame

A pandas DataFrame containing the data used for the test. It should have two columns: pathway_col indicating the pathway each reaction is categorized into, and sig_col, a boolean column indicating whether the differential test of the reaction is significant (True) or not (False).

required
pathway_col str

A string specifying the name of the column in the DataFrame that indicates the pathway each reaction belongs to.

required
sig_col str

A string specifying the name of the boolean column in the DataFrame that indicates whether a particular reaction is significant (True) or not (False) in the differential test.

required

Returns:

Name Type Description
result_df DataFrame

A pandas DataFrame with the following columns:

- `pval`: The raw p-value of the hypergeometric test for each pathway.
- `padj`: The Benjamini-Hochberg (BH)-adjusted p-value of the hypergeometric test for each pathway. The BH adjustment is a method to control the false discovery rate (FDR).
- `BgRatio`: The ratio of the number of reactions in a specific pathway to the total number of reactions in the dataset. This indicates the proportion of reactions in a pathway relative to the whole dataset.
- `SigRatio`: The ratio of the number of significant reactions in a specific pathway to the total number of significant reactions in the dataset. This shows the proportion of significant reactions in a pathway relative to the total number of significant reactions.
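
The per-pathway quantities can be reproduced on a toy dataset with the standard library. This sketch assumes an upper-tail (enrichment) test, which is the usual choice but is not confirmed by this reference; `pathway_enrichment` and the row layout are hypothetical, and the padj column is omitted.

```python
from math import comb
from typing import Dict, List, Tuple


def pathway_enrichment(rows: List[Tuple[str, bool]]) -> Dict[str, Dict[str, float]]:
    """Hypothetical sketch: rows are (pathway, is_significant) per reaction."""
    total = len(rows)                           # N: all reactions
    total_sig = sum(1 for _, sig in rows if sig)  # K: all significant reactions
    result = {}
    for pathway in sorted({p for p, _ in rows}):
        n = sum(1 for p, _ in rows if p == pathway)          # reactions in pathway
        k = sum(1 for p, sig in rows if p == pathway and sig)  # significant ones
        # Upper tail: probability of observing at least k significant reactions.
        pval = sum(
            comb(total_sig, i) * comb(total - total_sig, n - i)
            for i in range(k, min(n, total_sig) + 1)
        ) / comb(total, n)
        result[pathway] = {
            "pval": pval,
            "BgRatio": n / total,
            "SigRatio": k / total_sig if total_sig else 0.0,
        }
    return result


rows = [("A", True)] * 3 + [("A", False)] + [("B", True)] + [("B", False)] * 5
res = pathway_enrichment(rows)
print(res)
```

Pathway A holds 4 of 10 reactions but 3 of the 4 significant ones, so its p-value is much smaller than that of pathway B.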

prepare_PCA_dfs

prepare_PCA_dfs(
    feature_df: DataFrame,
    transform_func: Optional[Callable] = None,
    n_components: Optional[int] = None,
    standardize: bool = True,
    incremental: bool = False,
)

Prepare principal component analysis (PCA) dataframes from a feature dataframe.

Parameters:

Name Type Description Default
feature_df DataFrame

The feature dataframe to analyze. Rows represent features, and columns represent samples.

required
transform_func Optional[Callable]

A function to apply to the feature dataframe before analysis. Default is None.

None
n_components Optional[int]

The number of components in the result dataframes. If None, the minimum of the feature_df's shape[0] and shape[1] is used. Default is None.

None
standardize bool

Whether to standardize the feature dataframe before analysis by centering and scaling to unit variance. Default is True.

True
incremental bool

Whether to use an incremental PCA algorithm instead of a regular PCA algorithm. Default is False.

False

Returns:

Name Type Description
PC_df DataFrame

A dataframe containing the principal components (columns) of each sample (rows).

exp_var_df DataFrame

A dataframe containing the explained variance ratio of each principal component (rows).

component_df DataFrame

A dataframe containing the principal axes in feature space, representing the directions of maximum variance in the data.

References

https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html

prepare_embedding_dfs

prepare_embedding_dfs(
    feature_df: DataFrame,
    transform_func: Optional[Callable] = None,
    n_components: int = 3,
    reducer: str = "TSNE",
    standardize: bool = True,
    **kwargs
)

Get a dataframe containing an embedding result from a feature dataframe.

Parameters:

Name Type Description Default
feature_df DataFrame

A dataframe where the rows are features and the columns are samples.

required
transform_func Optional[Callable]

A function to apply to the feature dataframe before analysis.

None
n_components int

Number of components in the result dfs.

3
reducer str

A string or enum specifying the dimensionality reduction algorithm to use. Supported options are "TSNE", "Isomap", "MDS", "SpectralEmbedding", "LocallyLinearEmbedding", and "UMAP". Default is "TSNE".

'TSNE'
standardize bool

If True, standardize the dataframe before analysis by removing the mean and scaling to unit variance. Default is True.

True
**kwargs

Additional keyword arguments to be passed to the dimensionality reduction algorithm.

{}

Returns:

Name Type Description
df DataFrame

The embedding result containing the component values of each data (rows). The index of the returned dataframe is the embedding component number (e.g., "embedding 1", "embedding 2"). The columns are the sample names from the input feature_df.

References

https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html
https://scikit-learn.org/stable/modules/generated/sklearn.manifold.Isomap.html
https://scikit-learn.org/stable/modules/generated/sklearn.manifold.MDS.html
https://scikit-learn.org/stable/modules/generated/sklearn.manifold.SpectralEmbedding.html
https://scikit-learn.org/stable/modules/generated/sklearn.manifold.LocallyLinearEmbedding.html
https://umap-learn.readthedocs.io/en/latest/

save_model

save_model(
    model: Model, output_file_name: Union[str, PathLike]
) -> None

Save a cobra.Model

Parameters:

Name Type Description Default
model Model

The cobra.Model to save

required
output_file_name Union[str, PathLike]

File name of the saved model

required

Returns:

Type Description
None

load_model

load_model(model_file_path: str) -> cobra.Model

Parameters:

Name Type Description Default
model_file_path str
required

Returns:

Type Description
Model

get_logger

get_logger(name: str) -> logging.Logger

Get a logger within the pipeGEM namespace.

Parameters:

Name Type Description Default
name str

The name of the logger, typically __name__ of the calling module.

required

Returns:

Type Description
Logger
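
Python loggers form a dot-separated hierarchy, so a logger "within the pipeGEM namespace" is presumably a child of a logger named "pipeGEM". The sketch below shows that general pattern with the standard library; `get_namespaced_logger` is a hypothetical helper, and the prefixing rule is an assumption rather than the actual get_logger implementation.

```python
import logging


def get_namespaced_logger(name: str, root: str = "pipeGEM") -> logging.Logger:
    """Hypothetical sketch: return a logger that lives under the given root."""
    if name == root or name.startswith(root + "."):
        return logging.getLogger(name)
    return logging.getLogger(f"{root}.{name}")


logger = get_namespaced_logger("my_analysis")

# Configuring the parent logger affects every logger below it.
logging.getLogger("pipeGEM").setLevel(logging.WARNING)
print(logger.name, logging.getLevelName(logger.getEffectiveLevel()))
```

Keeping all loggers under one root lets a user adjust verbosity for the whole package with a single setLevel call.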