Model¶
Bases: GEMComposite
A comprehensive container for metabolic models and associated data.
This class wraps a cobra.Model object and extends it with capabilities
for managing and integrating various types of biological data, including
gene expression, enzyme kinetics, metabolite concentrations, and medium
compositions. It also facilitates task-based analysis and model consistency checks.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `name_tag` | `str` | A unique identifier for this model instance, used within | `None` |
| `model` | `Model` | An existing | `None` |
| `gene_data_factor_df` | `DataFrame` | A DataFrame specifying how different gene datasets should be grouped or factored during aggregation (e.g., by condition, time point). | `None` |
| `**kwargs` | | Additional key-value pairs to store as annotations for this model. | `{}` |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If the provided |
metabolite_ids
property
¶
metabolite_ids: List[str]
List[str]: A list of all metabolite IDs in the model.
subsystems
property
¶
subsystems: Dict[str, List[str]]
Dict[str, List[str]]: Reactions grouped by subsystem.
gene_data
property
¶
gene_data: Dict[str, GeneData]
Dict[str, GeneData]: Dictionary of associated gene data objects.
metabolite_data
property
¶
metabolite_data: Optional[MetaboliteData]
Optional[MetaboliteData]: Associated metabolite data object.
enzyme_data
property
¶
enzyme_data: Optional[EnzymeData]
Optional[EnzymeData]: Associated enzyme data object.
medium_data
property
¶
medium_data: Dict[str, MediumData]
Dict[str, MediumData]: Dictionary of associated medium data objects.
tasks
property
¶
tasks: Dict[str, TaskContainer]
Dict[str, TaskContainer]: Dictionary of associated task containers.
aggregated_gene_data
property
¶
aggregated_gene_data
GeneData: Aggregated gene data based on the factor DataFrame.
get_rxn_info ¶
get_rxn_info(attrs) -> pd.DataFrame
Get reaction information for specified attributes.
copy ¶
copy(
copy_gene_data=False,
copy_medium_data=False,
copy_tasks=False,
copy_merging_info=True,
)
Create a deep copy of this Model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `copy_gene_data` | | Also copy the gene data in this Model. | `False` |
| `copy_medium_data` | | Also copy the medium data in this Model. | `False` |
| `copy_tasks` | | Also copy the tasks in this Model. | `False` |
| `copy_merging_info` | | Also copy the merged reaction information. | `True` |
Returns:
| Name | Type | Description |
|---|---|---|
| `copied_model` | `Model` | |
add_medium_data ¶
add_medium_data(
name,
data: Union[MediumData, DataFrame],
data_kwargs=None,
**kwargs
) -> None
Add medium data to the model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | Name to assign to this medium dataset. | required |
| `data` | `Union[MediumData, DataFrame]` | The medium data, either as a MediumData object or a DataFrame. If a DataFrame, it will be converted to MediumData. | required |
| `data_kwargs` | `dict` | Keyword arguments for the MediumData constructor if | `None` |
| `**kwargs` | | Additional keyword arguments passed to the | `{}` |
apply_medium ¶
apply_medium(name, **kwargs)
Apply a defined medium composition to the model's exchange reactions.
add_gene_data ¶
add_gene_data(
name_or_prefix: str,
data: Union[GeneData, DataFrame, Series, AnnData],
data_kwargs: dict = None,
**kwargs
) -> None
Add gene data to the internal dictionary of gene data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `name_or_prefix` | `str` | The name or prefix of the gene data. If a prefix is provided, then the actual column names in the pd.DataFrame will be suffixed with the prefix. If an empty string is provided, then the column names will not be modified. | required |
| `data` | `Union[GeneData, DataFrame, Series, AnnData]` | The gene data to add to the internal dictionary. This can be a pd.DataFrame, pd.Series, anndata.AnnData, or GeneData object. If a pd.DataFrame is provided, each column is converted into a GeneData object with a modified name based on the name_or_prefix argument. If a pd.Series is provided, it is converted into a GeneData object with the name given by name_or_prefix. If a GeneData object is provided, it is added to the internal dictionary as-is. | required |
| `data_kwargs` | `dict` | Additional keyword arguments to pass to the GeneData constructor when converting a pd.DataFrame or pd.Series into GeneData objects. The default, None, means no additional arguments are passed. Ignored when the input data is already a GeneData. | `None` |
| `**kwargs` | | Additional keyword arguments to pass to the align method of the GeneData object(s) after they have been added to the internal dictionary. | `{}` |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If the data argument is not a pd.DataFrame, pd.Series, anndata.AnnData, or GeneData object. |
set_gene_data ¶
set_gene_data(name, data, data_kwargs=None, **kwargs)
Replace an existing gene dataset.
test_tasks ¶
test_tasks(
name, model_compartment_parenthesis="[{}]", **kwargs
)
Test the model's ability to perform defined metabolic tasks.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | The name of the TaskContainer to use for testing. | required |
| `model_compartment_parenthesis` | `str` | String format for compartment identifiers in the model, default "[{}]". | `'[{}]'` |
| `**kwargs` | | Additional arguments passed to | `{}` |
Returns:
| Type | Description |
|---|---|
| `TaskAnalysis` | An object containing the results of the task analysis. |
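The `model_compartment_parenthesis` argument is an ordinary Python format string. As a minimal sketch of what such a pattern produces (the helper `with_compartment`, the metabolite ID, and the way the pieces are joined are illustrative assumptions, not pipeGEM internals), the default `"[{}]"` can be compared with an underscore-style pattern:

```python
def with_compartment(met_id: str, compartment: str, pattern: str = "[{}]") -> str:
    """Append a compartment tag to a metabolite ID using a format pattern.

    Hypothetical helper for illustration only; not part of pipeGEM.
    """
    return met_id + pattern.format(compartment)


# Default bracket pattern versus an underscore-style alternative.
bracket_style = with_compartment("glc_D", "c")
underscore_style = with_compartment("glc_D", "c", pattern="_{}")
print(bracket_style, underscore_style)
```

If the model stores compartments as an underscore suffix rather than in brackets, a pattern such as `"_{}"` would presumably be the one to pass.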
calc_ind_task_score ¶
calc_ind_task_score(
data_name: str,
task_analysis: TaskAnalysis,
all_na_indicator=-1,
**kwargs
)
Calculate scores for individual tasks based on associated gene data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data_name` | `str` | Name of the GeneData object to use for scoring. | required |
| `task_analysis` | `TaskAnalysis` | The TaskAnalysis result object containing task definitions and supporting reactions. | required |
| `all_na_indicator` | `numeric` | Value to return if all genes associated with a task's reactions have NA scores. Default is -1. | `-1` |
| `**kwargs` | | Additional arguments passed to | `{}` |
Returns:
| Type | Description |
|---|---|
| `dict` | A dictionary mapping task IDs to their calculated scores. |
get_activated_tasks ¶
get_activated_tasks(
data_name,
task_analysis: TaskAnalysis,
all_na_indicator=-1,
score_threshold=5 * np.log10(2),
**kwargs
)
Identify tasks considered 'activated' based on gene data scores and task analysis results.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data_name` | `str` | Name of the GeneData object to use for scoring. | required |
| `task_analysis` | `TaskAnalysis` | The TaskAnalysis result object. | required |
| `all_na_indicator` | `numeric` | Indicator value used in | `-1` |
| `score_threshold` | `float` | Minimum score for a task to be considered activated. Default is 5*log10(2). | `5 * log10(2)` |
| `**kwargs` | | Additional arguments passed to | `{}` |
Returns:
| Type | Description |
|---|---|
| `list` | A list of task IDs considered activated. |
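To make the default `score_threshold` concrete, the sketch below evaluates `5 * log10(2)` and filters a dictionary of task scores against it. The task IDs and scores are made up, and treating a score equal to the threshold as activated (`>=`) is an assumption based on the wording "minimum score"; the library's actual comparison is not stated here.

```python
import math

# Same value as the documented default 5 * np.log10(2), roughly 1.505.
score_threshold = 5 * math.log10(2)

# Hypothetical task scores; -1 stands in for the all-NA indicator.
task_scores = {"task_1": 2.3, "task_2": 0.7, "task_3": -1}

# Assumption: a task counts as activated when its score reaches the threshold.
activated = [
    task_id for task_id, score in task_scores.items() if score >= score_threshold
]
print(round(score_threshold, 3), activated)
```

With the default `all_na_indicator` of -1, tasks whose genes all lack scores fall below the threshold and are filtered out in this sketch.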
get_activated_task_sup_rxns ¶
get_activated_task_sup_rxns(
data_name: str,
task_analysis: TaskAnalysis,
score_threshold: float = 5 * np.log10(2),
include_supp_rxns: bool = True,
**kwargs
)
Get supporting reactions for tasks identified as 'activated'.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data_name` | `str` | Name of the GeneData object to use for scoring. | required |
| `task_analysis` | `TaskAnalysis` | The TaskAnalysis result object. | required |
| `score_threshold` | `float` | Minimum score threshold used in | `5 * log10(2)` |
| `include_supp_rxns` | `bool` | Whether to include supplementary reactions defined in the tasks. Default is True. | `True` |
| `**kwargs` | | Additional arguments passed to | `{}` |
Returns:
| Type | Description |
|---|---|
| `list` | A list of unique reaction IDs supporting the activated tasks. |
check_rxn_scales ¶
check_rxn_scales(threshold=10000.0)
Check if reaction stoichiometric coefficients exceed a threshold.
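The idea behind this check can be sketched in plain Python. The function name, the dictionary layout, and the use of a strict `>` comparison on absolute values are assumptions for illustration; the real method operates on the wrapped model and its return type is not documented here.

```python
def find_large_coefficients(stoichiometry, threshold=10000.0):
    """Return the coefficients whose absolute value exceeds the threshold.

    `stoichiometry` maps reaction ID -> {metabolite ID: coefficient}.
    Hypothetical helper for illustration only; not part of pipeGEM.
    """
    flagged = {}
    for rxn_id, coefs in stoichiometry.items():
        large = {met: c for met, c in coefs.items() if abs(c) > threshold}
        if large:
            flagged[rxn_id] = large
    return flagged


# Made-up toy network: only the biomass-like reaction has a large coefficient.
stoichiometry = {
    "R1": {"A": -1.0, "B": 1.0},
    "R_biomass": {"A": -25000.0, "C": 1.0},
}
flagged = find_large_coefficients(stoichiometry)
print(flagged)
```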
check_model_scale ¶
check_model_scale(method='geometric_mean', n_iter=10)
Check the numerical scale of the model's stoichiometric matrix.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `method` | `str` | Scaling method to use ('geometric_mean', etc.). Default is "geometric_mean". | `'geometric_mean'` |
| `n_iter` | `int` | Number of iterations for the scaling algorithm. Default is 10. | `10` |
Returns:
| Type | Description |
|---|---|
| `ScalingResult` | An object containing the results of the scaling analysis. |
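This reference does not spell out the algorithm behind `'geometric_mean'`. As a rough sketch of the textbook technique of that name (an assumption about what the method resembles, not pipeGEM's implementation), each pass divides every row, then every column, by the square root of the product of its smallest and largest non-zero absolute entries:

```python
import math


def geometric_mean_scaling(matrix, n_iter=10):
    """Iteratively scale rows and columns by 1 / sqrt(min * max) of their
    non-zero absolute entries. Returns (row_factors, col_factors, scaled)."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    scaled = [row[:] for row in matrix]
    row_f = [1.0] * n_rows
    col_f = [1.0] * n_cols
    for _ in range(n_iter):
        for i in range(n_rows):  # row pass
            nz = [abs(v) for v in scaled[i] if v != 0]
            if nz:
                f = 1.0 / math.sqrt(min(nz) * max(nz))
                row_f[i] *= f
                scaled[i] = [v * f for v in scaled[i]]
        for j in range(n_cols):  # column pass
            nz = [abs(scaled[i][j]) for i in range(n_rows) if scaled[i][j] != 0]
            if nz:
                f = 1.0 / math.sqrt(min(nz) * max(nz))
                col_f[j] *= f
                for i in range(n_rows):
                    scaled[i][j] *= f
    return row_f, col_f, scaled


def magnitude_spread(matrix):
    """Ratio of the largest to the smallest non-zero absolute entry."""
    nz = [abs(v) for row in matrix for v in row if v != 0]
    return max(nz) / min(nz)


# Made-up, badly scaled toy matrix (rows: metabolites, columns: reactions).
S = [[1.0, -1000.0, 0.0],
     [0.0, 0.5, -2000.0],
     [-0.01, 0.0, 1.0]]
row_f, col_f, S_scaled = geometric_mean_scaling(S, n_iter=10)
print(magnitude_spread(S), magnitude_spread(S_scaled))
```

The returned row and column factors play the role that a scaling result would need to carry so the same scaling can later be applied to, or undone on, the model.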
scale_model ¶
scale_model(scaling_result)
Apply a previously calculated scaling to the model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `scaling_result` | `ScalingResult` | The result object obtained from | required |
Returns:
| Type | Description |
|---|---|
| `Model` | The rescaled pipeGEM Model object. |
check_consistency ¶
check_consistency(
method: str = "FASTCC", tol: float = 1e-06, **kwargs
)
Check the flux consistency of the model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `method` | `str` | Consistency checking algorithm ('FASTCC', etc.). Default is "FASTCC". | `'FASTCC'` |
| `tol` | `float` | Numerical tolerance for consistency checks. Default is 1e-6. | `1e-06` |
| `**kwargs` | | Additional arguments passed to the consistency checker's | `{}` |
Returns:
| Type | Description |
|---|---|
| `ConsistencyAnalysis` | An object containing the results of the consistency check, including a consistent sub-model. |
do_flux_analysis ¶
do_flux_analysis(method, solver='gurobi', **kwargs)
Perform flux balance analysis (FBA) or its variants.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `method` | `str` | Flux analysis method ('FBA', 'pFBA', 'FVA', etc.). | required |
| `solver` | `str` | LP solver to use ('gurobi', 'cplex', 'glpk', etc.). Default is "gurobi". | `'gurobi'` |
| `**kwargs` | | Additional arguments passed to the flux analyzer's | `{}` |
Returns:
| Type | Description |
|---|---|
| `FluxAnalysisResult` | An object containing the results of the flux analysis. |
simulate_ko_genes ¶
simulate_ko_genes(gene_ids, **kwargs)
Simulate gene knockouts by setting their associated reaction scores to zero.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `gene_ids` | `list` | List of gene IDs to knock out. | required |
| `**kwargs` | | Additional arguments passed to | `{}` |
Returns:
| Type | Description |
|---|---|
| `Series` | Reaction scores reflecting the simulated knockouts. |
do_ko_analysis ¶
do_ko_analysis(
method="single_KO", solver="gurobi", **kwargs
)
Perform gene knockout analysis.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `method` | `str` | Knockout analysis method ('single_KO', etc.). Default is "single_KO". | `'single_KO'` |
| `solver` | `str` | LP solver to use. Default is "gurobi". | `'gurobi'` |
| `**kwargs` | | Additional arguments passed to the knockout analyzer's | `{}` |
Returns:
| Type | Description |
|---|---|
| `KOAnalysisResult` | An object containing the results of the knockout analysis. |
integrate_enzyme_data ¶
integrate_enzyme_data(
prot_abund_data_name=None, method="GECKOLight", **kwargs
)
Integrate enzyme data using GECKO formulations.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `prot_abund_data_name` | `str` | Name of the ProteinAbundanceData attached to this model. If | `None` |
| `method` | `str` | GECKO method to use: | `'GECKOLight'` |
| `**kwargs` | | Additional keyword arguments passed to the integrator. | `{}` |
Returns:
| Type | Description |
|---|---|
| `GECKOLightAnalysis` or `GECKOFullAnalysis` | |
integrate_gene_data ¶
integrate_gene_data(
data_name,
integrator="GIMME",
integrator_init_kwargs=None,
rxn_scaling_coefs=None,
predefined_threshold=None,
protected_rxns=None,
**kwargs
)
Integrate gene data with this model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data_name` | | Name of the gene data to be integrated with the model. | required |
| `integrator` | | Name of the integrator (algorithm) to use. Possible choices for now: GIMME, CORDA, rFASTCORMICS, mCADRE, RIPTiDe, and Eflux. | `'GIMME'` |
| `integrator_init_kwargs` | | Keyword arguments for initializing the integrator. | `None` |
| `rxn_scaling_coefs` | | Reaction scaling coefficients for the integrator, if the model was rescaled beforehand. | `None` |
| `predefined_threshold` | | A threshold analysis object containing the required expression threshold, or a dict containing an expression threshold under the key exp_th and a non-expression threshold under the key non_exp_th. | `None` |
| `protected_rxns` | | List of protected reaction IDs. | `None` |
| `**kwargs` | | Keyword arguments for integrating the data. | `{}` |
Returns:
| Name | Type | Description |
|---|---|---|
| `integrating_result` | `BaseAnalysis` | Result object containing the gene data-integrated (context-specific) model. |
save_model ¶
save_model(file_name: str) -> None
Save the pipeGEM model and its annotations.
Saves the underlying cobra.Model to the specified file_name (e.g.,
'model.json', 'model.xml'). Additionally, saves model annotations
(including name_tag) to a corresponding TOML file (e.g.,
'model_annotations.toml') in the same directory.
This is a workaround for now, since I/O functions for all file types have not been implemented yet.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `file_name` | `str` | | required |
Returns:
| Type | Description |
|---|---|
| `None` | |
load_model
classmethod
¶
load_model(file_name: str)
Load a pipeGEM model from a model file (JSON, SBML, MAT, etc.) and a TOML file storing the model's metadata.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `file_name` | `str` | Model file name. In the same directory, there should be a TOML file with the same file name and a .toml suffix. For example, if a valid model file called 'model.json' is stored in a folder called 'folder', that folder should contain both 'model.json' and 'model.toml'. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| `model` | `Model` | |
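The file-name convention described for `load_model` can be checked before loading. The sketch below derives the expected TOML path from a model path with `pathlib`; the helper name `metadata_path_for` is hypothetical, and it only mirrors the 'model.json' / 'model.toml' pairing described above, not necessarily what the library does internally.

```python
from pathlib import Path


def metadata_path_for(model_file: str) -> Path:
    """Return the sibling .toml path expected next to a model file.

    Hypothetical helper for illustration only; not part of pipeGEM.
    """
    return Path(model_file).with_suffix(".toml")


toml_path = metadata_path_for("folder/model.json")
print(toml_path.name)  # the TOML file is expected in the same folder
```

Testing `toml_path.exists()` before loading gives a clearer error message than a failure deep inside the loader.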
update_merged_rxn ¶
update_merged_rxn(merged_rxn)
Update internal state when a reaction is merged.
Stores the original objective coefficients if not already done, adds the merged reaction to the lookup table, and handles empty merged reactions.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `merged_rxn` | `Reaction` | The reaction object representing the merged reaction. It should have a | required |
get_merged_rxn ¶
get_merged_rxn(rxn_id)
Retrieve the merged reaction object corresponding to an original reaction ID.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `rxn_id` | `str` | The ID of the original reaction before merging. | required |
Returns:
| Type | Description |
|---|---|
| `Reaction` or `None` | The merged reaction object if the original reaction was merged, otherwise None. |