Group
Bases: GEMComposite
A container for managing and comparing multiple pipeGEM.Model objects.
This class facilitates comparative analyses across a collection of metabolic models, such as comparing component numbers, calculating similarity indices (e.g., Jaccard), performing dimensionality reduction (PCA), and aggregating analysis results.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| group | Union[List[Model], Dict[str, Model], Dict[str, List[Model]], Dict[str, Dict[str, Model]]] | The collection of models to include in the group. Can be provided as: - A list of | required |
| name_tag | str | An identifier for this group. Defaults to "Unnamed_group". | None |
| factors | DataFrame | A DataFrame providing annotations for the models in the group. Index should correspond to model name tags, columns are annotation keys. | None |
| **kwargs | | Additional annotations provided as key-value pairs, where keys are annotation names and values are dictionaries mapping model name tags to annotation values (e.g., | {} |
Raises:
| Type | Description |
|---|---|
| ValueError | If input models have non-unique name tags or if the input |
| TypeError | If elements within the input |
| KeyError | If annotation dictionaries or factor DataFrames refer to model names not present in the group. |
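The uniqueness requirement on name tags can be illustrated with a small standalone sketch. It does not import pipeGEM: `StubModel` and `check_unique_name_tags` are hypothetical names introduced here, and the error message is illustrative rather than the library's own.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class StubModel:
    """Hypothetical stand-in for a model that carries a name tag."""
    name_tag: str


def check_unique_name_tags(models: List[StubModel]) -> List[str]:
    """Return the name tags in input order, or raise ValueError on duplicates."""
    tags = [m.name_tag for m in models]
    duplicated = sorted(tag for tag, n in Counter(tags).items() if n > 1)
    if duplicated:
        raise ValueError(f"Non-unique name tags: {duplicated}")
    return tags


tags = check_unique_name_tags([StubModel("liver"), StubModel("kidney")])
```

Passing two stub models that share a name tag raises `ValueError`, which mirrors the documented failure mode without claiming to reproduce the actual check.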
annotation
property
annotation: DataFrame
pd.DataFrame: Combined annotations from models and the group level.
reaction_ids
property
reaction_ids: List[str]
List[str]: A list of unique reaction IDs across all models in the group.
metabolite_ids
property
metabolite_ids: List[str]
List[str]: A list of unique metabolite IDs across all models in the group.
gene_ids
property
gene_ids: List[str]
List[str]: A list of unique gene IDs across all models in the group.
subsystems
property
subsystems: Dict[str, set]
Dict[str, set]: Unique reaction IDs grouped by subsystem across all models.
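The ID and subsystem properties above describe unions taken over every model in the group. A minimal sketch of that idea using plain Python containers follows. The helper names `unique_ids` and `merge_subsystems` are hypothetical, and keeping IDs in first-seen order is an assumption, since the ordering of the returned lists is not documented here.

```python
from typing import Dict, Iterable, List, Set


def unique_ids(per_model: Iterable[Iterable[str]]) -> List[str]:
    """Collect IDs from all models, dropping repeats (first-seen order)."""
    # dict.fromkeys preserves insertion order while removing duplicates.
    return list(dict.fromkeys(i for ids in per_model for i in ids))


def merge_subsystems(per_model: List[Dict[str, Set[str]]]) -> Dict[str, Set[str]]:
    """Union the reaction IDs of each subsystem across models."""
    merged: Dict[str, Set[str]] = {}
    for subsystems in per_model:
        for name, rxn_ids in subsystems.items():
            merged.setdefault(name, set()).update(rxn_ids)
    return merged


model_a = {"Glycolysis": {"HEX1", "PGI"}, "TCA cycle": {"CS"}}
model_b = {"Glycolysis": {"PGI", "PFK"}}
merged = merge_subsystems([model_a, model_b])
all_ids = unique_ids([["HEX1", "PGI", "CS"], ["PGI", "PFK"]])
```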
gene_data
property
gene_data: DataAggregation
GeneData: Aggregated gene data from all models in the group.
add_annotation
add_annotation(added, store_in_model=False)
Add annotations to the models in the group.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| added | Dict | A dictionary where keys are annotation names and values are dictionaries mapping model name tags to annotation values. Example: | required |
| store_in_model | bool | If True, add annotations directly to the individual | False |
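The nested shape expected for `added` (annotation name, then model name tag, then value) can be sketched without pipeGEM. The helper `annotations_by_model` is hypothetical. Raising `KeyError` for unknown name tags follows the behaviour documented for the constructor; whether `add_annotation` does the same is an assumption.

```python
from typing import Any, Dict, List


def annotations_by_model(
    added: Dict[str, Dict[str, Any]], model_names: List[str]
) -> Dict[str, Dict[str, Any]]:
    """Pivot {annotation: {model: value}} into {model: {annotation: value}}."""
    known = set(model_names)
    table: Dict[str, Dict[str, Any]] = {name: {} for name in model_names}
    for key, per_model in added.items():
        unknown = set(per_model) - known
        if unknown:
            raise KeyError(f"Unknown model name tags for '{key}': {sorted(unknown)}")
        for name, value in per_model.items():
            table[name][key] = value
    return table


added = {
    "tissue": {"m1": "liver", "m2": "kidney"},
    "condition": {"m1": "healthy"},
}
table = annotations_by_model(added, ["m1", "m2"])
```

A model that is missing from one of the inner dictionaries simply has no value for that annotation in this sketch.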
index
index(item, raise_err=True)
Get the position of a model within the group's internal ordering, which is backed by a dict.
Note: dict insertion order is guaranteed from Python 3.7 onward.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| item | str | The name tag of the model. | required |
| raise_err | bool | If True (default), raise KeyError if the item is not found. If False, return None. | True |
Returns:
| Type | Description |
|---|---|
| int or None | The index of the model, or None if not found and |
Raises:
| Type | Description |
|---|---|
| KeyError | If |
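The lookup semantics described for `index` can be sketched against an ordinary dict. The function `index_of` is a hypothetical stand-in, not the library's implementation. It relies only on dict insertion order and on the documented `raise_err` behaviour.

```python
from typing import Dict, Optional


def index_of(models: Dict[str, object], item: str, raise_err: bool = True) -> Optional[int]:
    """Return the insertion-order position of `item` among the dict keys."""
    for position, name in enumerate(models):
        if name == item:
            return position
    if raise_err:
        raise KeyError(item)
    return None


models = {"liver": object(), "kidney": object(), "brain": object()}
kidney_position = index_of(models, "kidney")
missing = index_of(models, "heart", raise_err=False)
```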
get_RAS
get_RAS(data_name, method='mean')
Calculate aggregated Reaction Activity Scores (RAS) across the group.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_name | str | The name of the gene data set to use within each model. The name is currently assumed to exist in every model; a missing name is not handled explicitly. | required |
| method | str | Aggregation method for RAS (e.g., 'mean', 'median'). | 'mean' |
Returns:
| Type | Description |
|---|---|
| Series or DataFrame | Aggregated RAS scores. The structure depends on the GeneData.aggregate implementation. |
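To make the `method` argument concrete, the sketch below aggregates per-reaction scores across models with either the mean or the median. It is only an illustration: the function `aggregate_scores`, the nested-dict input, and the choice to aggregate per reaction across models are all assumptions, since the real output structure depends on `GeneData.aggregate`.

```python
from statistics import mean, median
from typing import Callable, Dict, List

# Supported aggregation names in this sketch (assumed, not the library's list).
AGGREGATORS: Dict[str, Callable[[List[float]], float]] = {
    "mean": mean,
    "median": median,
}


def aggregate_scores(
    per_model: Dict[str, Dict[str, float]], method: str = "mean"
) -> Dict[str, float]:
    """Aggregate each reaction's score over the models that report it."""
    if method not in AGGREGATORS:
        raise ValueError(f"Unsupported method: {method!r}")
    collected: Dict[str, List[float]] = {}
    for scores in per_model.values():
        for rxn_id, score in scores.items():
            collected.setdefault(rxn_id, []).append(score)
    return {rxn_id: AGGREGATORS[method](values) for rxn_id, values in collected.items()}


per_model = {
    "m1": {"R1": 1.0, "R2": 4.0},
    "m2": {"R1": 3.0, "R2": 6.0},
    "m3": {"R1": 8.0},
}
mean_scores = aggregate_scores(per_model, "mean")
median_scores = aggregate_scores(per_model, "median")
```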
aggregate_models
aggregate_models(group_by)
Create new Group objects based on an annotation key.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| group_by | str | The annotation key to group models by. | required |
Returns:
| Type | Description |
|---|---|
| Dict[str, Group] | A dictionary where keys are the unique values of the |
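Grouping by an annotation key amounts to partitioning model name tags by the value each model holds for that key. The sketch below shows the partitioning step only. `group_names_by` is a hypothetical helper, and it returns lists of name tags rather than Group objects.

```python
from typing import Dict, List


def group_names_by(
    annotation: Dict[str, Dict[str, str]], group_by: str
) -> Dict[str, List[str]]:
    """Map each unique value of `group_by` to the model name tags holding it."""
    groups: Dict[str, List[str]] = {}
    for name, keys in annotation.items():
        if group_by not in keys:
            raise KeyError(f"Model '{name}' has no annotation '{group_by}'")
        groups.setdefault(keys[group_by], []).append(name)
    return groups


annotation = {
    "m1": {"tissue": "liver"},
    "m2": {"tissue": "kidney"},
    "m3": {"tissue": "liver"},
}
groups = group_names_by(annotation, "tissue")
```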
get_rxn_info
get_rxn_info(
models: Optional[Union[str, list]] = None,
attrs: list = None,
drop_duplicates=True,
) -> pd.DataFrame
Get reaction information across specified models in the group.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| models | Union[str, list] | A single model name tag or a list of name tags to include. If None (default), includes all models in the group. | None |
| attrs | list | A list of reaction attributes to retrieve (e.g., ['name', 'subsystem', 'gene_reaction_rule']). If None, behavior might depend on | None |
| drop_duplicates | bool | If True (default), remove duplicate rows from the combined DataFrame. | True |
Returns:
| Type | Description |
|---|---|
| DataFrame | A DataFrame containing the requested reaction information, indexed by reaction ID (potentially duplicated if drop_duplicates=False). |
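The effect of `drop_duplicates` can be sketched with plain tuples instead of a DataFrame. The helper `combine_rows` is hypothetical. Treating the reaction ID as part of the row when testing for duplicates is an assumption made for this illustration.

```python
from typing import Dict, List, Optional, Tuple

Row = Tuple[Optional[str], ...]


def combine_rows(
    per_model: Dict[str, Dict[str, Dict[str, str]]],
    attrs: List[str],
    drop_duplicates: bool = True,
) -> List[Row]:
    """Stack (reaction ID, *attrs) rows from every model, optionally deduplicated."""
    rows: List[Row] = []
    seen = set()
    for reactions in per_model.values():
        for rxn_id, info in reactions.items():
            row = (rxn_id,) + tuple(info.get(attr) for attr in attrs)
            if drop_duplicates and row in seen:
                continue
            seen.add(row)
            rows.append(row)
    return rows


hexokinase = {"name": "hexokinase", "subsystem": "Glycolysis"}
per_model = {
    "m1": {"R1": hexokinase},
    "m2": {"R1": hexokinase, "R2": {"name": "citrate synthase", "subsystem": "TCA cycle"}},
}
deduplicated = combine_rows(per_model, ["name", "subsystem"])
everything = combine_rows(per_model, ["name", "subsystem"], drop_duplicates=False)
```

With deduplication disabled, the shared reaction appears once per model, which corresponds to the duplicated index noted in the Returns description.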
rename
rename(name_tag, inplace=False)
Rename the group.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name_tag | str | The new name tag for the group. | required |
| inplace | bool | If True, modify the current group's name tag directly. If False (default), return a new Group instance with the new name. | False |
Returns:
| Type | Description |
|---|---|
| Group or None | The renamed Group instance if |
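The `inplace` switch follows a common pattern: mutate and return None, or leave the original untouched and return a renamed copy. A minimal sketch with a hypothetical `StubGroup` dataclass is shown below. How the real Group copies its models is not covered here.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class StubGroup:
    """Hypothetical stand-in that only carries a name tag."""
    name_tag: str

    def rename(self, name_tag: str, inplace: bool = False) -> Optional["StubGroup"]:
        if inplace:
            # Mutate this instance and signal that no new object was created.
            self.name_tag = name_tag
            return None
        # Leave this instance unchanged and hand back a renamed copy.
        return replace(self, name_tag=name_tag)


original = StubGroup("Unnamed_group")
renamed = original.rename("tissues")
in_place_result = original.rename("cohort", inplace=True)
```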
do_flux_analysis
do_flux_analysis(
method: str,
aggregate_method: str = "concat",
solver: str = "gurobi",
group_by: str = None,
**kwargs
)
Do flux analysis on the models contained in this group.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| method | str | The analysis to perform on the models. | required |
| aggregate_method | str | The aggregation method applied to the flux results. | 'concat' |
| solver | str | The solver used for the analysis. | 'gurobi' |
| group_by | str | Determines the groups over which aggregate_method is applied. | None |
| kwargs | | Keyword arguments passed to model.do_flux_analysis(). | {} |
Returns:
| Name | Type | Description |
|---|---|---|
| flux_result | FluxAnalysis | |
get_info
get_info(models=None, features=None) -> pd.DataFrame
Get an information table by traversing the object structure.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| models | | The name tags of the selected models. If None, all models are used. | None |
| features | | The features to collect during the traversal. | None |
Returns:
| Name | Type | Description |
|---|---|---|
| information_table | DataFrame | |
compare
compare(
models: Optional[Union[str, list, ndarray]] = None,
group_by: Optional[str] = "group_name",
method: Literal["jaccard", "PCA", "num"] = "jaccard",
**kwargs
)
Compare models within the group based on their components.
This method provides different ways to compare the models contained within this group, or subsets/aggregations thereof. Comparisons can be based on component overlap (Jaccard index), component counts, or dimensionality reduction (PCA) of component presence/absence.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| models | str or list[str] or ndarray | Specifies which models from the group to include in the comparison. Can be a single model name tag, a list/array of name tags, or None to include all models in the current group. | None |
| group_by | str | An annotation key present in the group's | 'group_name' |
| method | Literal['jaccard', 'PCA', 'num'] | The comparison method to use: - 'jaccard': Calculate pairwise Jaccard similarity based on shared components (genes, reactions, metabolites). See | 'jaccard' |
| **kwargs | | Additional keyword arguments passed to the specific comparison method (e.g., | {} |
Returns:
| Type | Description |
|---|---|
| Union[ComponentComparisonAnalysis, ComponentNumberAnalysis, PCA_Analysis] | An analysis result object corresponding to the chosen |
Raises:
| Type | Description |
|---|---|
| ValueError | If an invalid |
| KeyError | If |
See Also
- _compare_components_jaccard : Calculates Jaccard similarity.
- _compare_component_num : Compares component counts.
- _compare_component_PCA : Performs PCA on component presence.
- aggregate_models : Aggregates models based on annotations.
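The Jaccard index used by the 'jaccard' method is the size of the intersection of two component sets divided by the size of their union. The sketch below computes it pairwise over reaction ID sets, along with the simple per-model counts that the 'num' method compares. The function names are hypothetical, the input is a plain dict of sets instead of a Group, and returning 1.0 for two empty sets is a convention chosen for this sketch.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Return |a & b| / |a | b|, treating two empty sets as identical."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def pairwise_jaccard(components: Dict[str, Set[str]]) -> Dict[Tuple[str, str], float]:
    """Compute the Jaccard index for every unordered pair of models."""
    return {
        (m, n): jaccard(components[m], components[n])
        for m, n in combinations(components, 2)
    }


reactions = {
    "m1": {"R1", "R2", "R3"},
    "m2": {"R2", "R3", "R4"},
    "m3": {"R5"},
}
scores = pairwise_jaccard(reactions)
counts = {name: len(ids) for name, ids in reactions.items()}
```

Here `m1` and `m2` share two of four distinct reactions, so their index is 0.5, while `m3` shares nothing with either and scores 0.0.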