Group¶

Bases: GEMComposite

A container for managing and comparing multiple pipeGEM.Model objects.

This class facilitates comparative analyses across a collection of metabolic models, such as comparing component numbers, calculating similarity indices (e.g., Jaccard), performing dimensionality reduction (PCA), and aggregating analysis results.

Parameters:

Name	Type	Description	Default
`group`	`Union[List[Model], Dict[str, Model], Dict[str, List[Model]], Dict[str, Dict[str, Model]]]`	The collection of models to include in the group. Can be provided as: - A list of `pipeGEM.Model` objects. - A dictionary mapping desired name tags to `cobra.Model` objects. - A dictionary mapping subgroup names to lists of `pipeGEM.Model` objects. - A dictionary mapping subgroup names to dictionaries mapping model names to `cobra.Model` objects.	required
`name_tag`	`str`	An identifier for this group. If None, "Unnamed_group" is used.	`None`
`factors`	`DataFrame`	A DataFrame providing annotations for the models in the group. Index should correspond to model name tags, columns are annotation keys.	`None`
`**kwargs`		Additional annotations provided as key-value pairs, where keys are annotation names and values are dictionaries mapping model name tags to annotation values (e.g., `condition={'model1': 'control', 'model2': 'treated'}`).	`{}`

Raises:

Type	Description
`ValueError`	If input models have non-unique name tags or if the input `group` format is invalid.
`TypeError`	If elements within the input `group` are not of the expected types.
`KeyError`	If annotation dictionaries or factor DataFrames refer to model names not present in the group.
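
The four accepted shapes of `group` can be illustrated with a small, self-contained sketch. `StubModel` and `flatten_group_input` are hypothetical names written for this example and are not part of pipeGEM; the real constructor may normalize its input differently and performs its own type checks.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class StubModel:
    """Hypothetical stand-in for a model that carries a name tag."""
    name_tag: str


GroupInput = Union[
    List[StubModel],
    Dict[str, StubModel],
    Dict[str, List[StubModel]],
    Dict[str, Dict[str, StubModel]],
]


def flatten_group_input(group: GroupInput) -> Dict[str, StubModel]:
    """Flatten any of the four input shapes into {name_tag: model}."""
    flat: Dict[str, StubModel] = {}

    def add(name: str, model: StubModel) -> None:
        # Mirrors the documented ValueError for non-unique name tags.
        if name in flat:
            raise ValueError(f"Duplicate model name tag: {name!r}")
        flat[name] = model

    if isinstance(group, list):                # list of models
        for model in group:
            add(model.name_tag, model)
    elif isinstance(group, dict):
        for key, value in group.items():
            if isinstance(value, list):        # subgroup -> list of models
                for model in value:
                    add(model.name_tag, model)
            elif isinstance(value, dict):      # subgroup -> {name: model}
                for name, model in value.items():
                    add(name, model)
            else:                              # name tag -> model
                add(key, value)
    else:
        raise ValueError("Unsupported group format")
    return flat


a, b = StubModel("a"), StubModel("b")
flat_from_list = flatten_group_input([a, b])
flat_from_dict = flatten_group_input({"a": a, "b": b})
flat_from_subgroups = flatten_group_input({"ctrl": [a], "treated": [b]})
flat_from_nested = flatten_group_input({"ctrl": {"a": a}, "treated": {"b": b}})
```

All four calls yield the same flat mapping of name tags to models, which is the view the rest of this page assumes when it refers to "model name tags".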

annotation `property` ¶

annotation: DataFrame

pd.DataFrame: Combined annotations from models and the group level.

reaction_ids `property` ¶

reaction_ids: List[str]

List[str]: A list of unique reaction IDs across all models in the group.

metabolite_ids `property` ¶

metabolite_ids: List[str]

List[str]: A list of unique metabolite IDs across all models in the group.

gene_ids `property` ¶

gene_ids: List[str]

List[str]: A list of unique gene IDs across all models in the group.

subsystems `property` ¶

subsystems: Dict[str, set]

Dict[str, set]: Unique reaction IDs grouped by subsystem across all models.
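
As a rough illustration of what the ID and subsystem properties describe, the sketch below merges per-model reaction data into a unique ID list and a subsystem-to-reactions mapping. The input dictionary and both helper functions are hypothetical; pipeGEM's own ordering and data structures may differ.

```python
from typing import Dict, List, Set

# Hypothetical per-model data: reaction ID -> subsystem name.
model_reactions: Dict[str, Dict[str, str]] = {
    "model1": {"HEX1": "Glycolysis", "PGI": "Glycolysis", "CS": "TCA cycle"},
    "model2": {"HEX1": "Glycolysis", "ACONT": "TCA cycle"},
}


def unique_reaction_ids(models: Dict[str, Dict[str, str]]) -> List[str]:
    """Collect each reaction ID once, in first-seen order."""
    seen: Dict[str, None] = {}
    for reactions in models.values():
        for rxn_id in reactions:
            seen.setdefault(rxn_id, None)
    return list(seen)


def reactions_by_subsystem(models: Dict[str, Dict[str, str]]) -> Dict[str, Set[str]]:
    """Group unique reaction IDs by subsystem across all models."""
    grouped: Dict[str, Set[str]] = {}
    for reactions in models.values():
        for rxn_id, subsystem in reactions.items():
            grouped.setdefault(subsystem, set()).add(rxn_id)
    return grouped


all_ids = unique_reaction_ids(model_reactions)
by_subsystem = reactions_by_subsystem(model_reactions)
```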

gene_data `property` ¶

gene_data: DataAggregation

DataAggregation: Aggregated gene data from all models in the group.

items ¶

items()

Return an iterator over the group's (name_tag, model) items.

add_annotation ¶

add_annotation(added, store_in_model=False)

Add annotations to the models in the group.

Parameters:

Name	Type	Description	Default
`added`	`Dict`	A dictionary where keys are annotation names and values are dictionaries mapping model name tags to annotation values. Example: `{'condition': {'model1': 'A', 'model2': 'B'}}`	required
`store_in_model`	`bool`	If True, add annotations directly to the individual `pipeGEM.Model` objects. If False (default), store them at the `Group` level.	`False`
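
The shape of `added` is annotation-first: each annotation name maps to a per-model dictionary. The sketch below pivots that structure into one row per model, which is roughly how it would appear in an annotation table. `annotations_by_model` is a hypothetical helper for illustration only and does not show how pipeGEM stores annotations internally.

```python
from typing import Any, Dict


def annotations_by_model(added: Dict[str, Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Pivot {annotation: {model: value}} into {model: {annotation: value}}."""
    table: Dict[str, Dict[str, Any]] = {}
    for annotation_name, per_model in added.items():
        for model_name, value in per_model.items():
            table.setdefault(model_name, {})[annotation_name] = value
    return table


added = {
    "condition": {"model1": "A", "model2": "B"},
    "batch": {"model1": 1, "model2": 2},
}
rows = annotations_by_model(added)
```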

index ¶

index(item, raise_err=True)

Get the numerical index of a model in the group's internal dictionary, following insertion order.

Note: Dictionaries preserve insertion order in Python 3.7+.

Parameters:

Name	Type	Description	Default
`item`	`str`	The name tag of the model.	required
`raise_err`	`bool`	If True (default), raise KeyError if the item is not found. If False, return None.	`True`

Returns:

Type	Description
`int or None`	The index of the model, or None if not found and `raise_err` is False.

Raises:

Type	Description
`KeyError`	If `item` is not found and `raise_err` is True.
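
The lookup semantics described here can be sketched with a plain insertion-ordered dictionary. `index_of` is a hypothetical stand-in, and zero-based numbering is an assumption of this example rather than something the documentation states.

```python
from typing import Dict, Optional


def index_of(models: Dict[str, object], item: str, raise_err: bool = True) -> Optional[int]:
    """Return the position of `item` among the dict keys (zero-based here)."""
    for position, name in enumerate(models):
        if name == item:
            return position
    if raise_err:
        raise KeyError(item)
    return None


models = {"model1": object(), "model2": object(), "model3": object()}
found = index_of(models, "model2")
missing = index_of(models, "unknown", raise_err=False)
```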

get_RAS ¶

get_RAS(data_name, method='mean')

Calculate aggregated Reaction Activity Scores (RAS) across the group.

Parameters:

Name	Type	Description	Default
`data_name`	`str`	The name of the gene data set to use within each model. The name is currently assumed to exist in all models; a missing name may not be reported with a clear error.	required
`method`	`str`	Aggregation method for RAS (e.g., 'mean', 'median'). Default is 'mean'.	`'mean'`

Returns:

Type	Description
`Series or DataFrame`	Aggregated RAS scores. Structure depends on GeneData.aggregate implementation.
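
To make the `method` argument concrete, the following sketch aggregates per-reaction scores across models with either the mean or the median. It assumes that aggregation happens per reaction across models, which is an interpretation for this example; the actual behavior is defined by pipeGEM's gene data aggregation. `aggregate_ras` and the input values are hypothetical.

```python
from statistics import mean, median
from typing import Callable, Dict, List

# Supported aggregation functions for this sketch only.
AGGREGATORS: Dict[str, Callable[[List[float]], float]] = {
    "mean": mean,
    "median": median,
}


def aggregate_ras(ras_per_model: Dict[str, Dict[str, float]], method: str = "mean") -> Dict[str, float]:
    """Aggregate each reaction's score across all models."""
    if method not in AGGREGATORS:
        raise ValueError(f"Unknown aggregation method: {method!r}")
    collected: Dict[str, List[float]] = {}
    for scores in ras_per_model.values():
        for rxn_id, score in scores.items():
            collected.setdefault(rxn_id, []).append(score)
    aggregate = AGGREGATORS[method]
    return {rxn_id: aggregate(values) for rxn_id, values in collected.items()}


# Hypothetical reaction activity scores for three models.
ras_per_model = {
    "model1": {"HEX1": 2.0, "PGI": 4.0},
    "model2": {"HEX1": 4.0, "PGI": 1.0},
    "model3": {"HEX1": 9.0, "PGI": 1.0},
}
mean_ras = aggregate_ras(ras_per_model, method="mean")
median_ras = aggregate_ras(ras_per_model, method="median")
```

The median is less sensitive to a single outlying model (here `model3` for `HEX1`) than the mean, which is one reason to expose the method as a parameter.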

aggregate_models ¶

aggregate_models(group_by)

Create new Group objects based on an annotation key.

Parameters:

Name	Type	Description	Default
`group_by`	`str`	The annotation key to group models by.	required

Returns:

Type	Description
`Dict[str, Group]`	A dictionary where keys are the unique values of the `group_by` annotation, and values are new `Group` objects containing the corresponding models. Returns {self.name_tag: self} if group_by is None.
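
The grouping step can be sketched as splitting model names by the value of one annotation key. In this illustration the subgroups are plain lists of name tags rather than `Group` objects, and `split_by_annotation` is a hypothetical helper, not the library's implementation.

```python
from collections import defaultdict
from typing import Any, Dict, List


def split_by_annotation(annotation: Dict[str, Dict[str, Any]], group_by: str) -> Dict[Any, List[str]]:
    """Map each unique value of `group_by` to the model names carrying it."""
    subgroups: Dict[Any, List[str]] = defaultdict(list)
    for model_name, keys in annotation.items():
        # Raises KeyError if a model lacks the requested annotation key.
        subgroups[keys[group_by]].append(model_name)
    return dict(subgroups)


# Hypothetical annotation table: model name -> {annotation key: value}.
annotation = {
    "m1": {"condition": "control"},
    "m2": {"condition": "treated"},
    "m3": {"condition": "control"},
}
subgroups = split_by_annotation(annotation, "condition")
```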

get_rxn_info ¶

get_rxn_info(
    models: Optional[Union[str, list]] = None,
    attrs: list = None,
    drop_duplicates=True,
) -> pd.DataFrame

Get reaction information across specified models in the group.

Parameters:

Name	Type	Description	Default
`models`	`Union[str, list]`	A single model name tag or a list of name tags to include. If None (default), includes all models in the group.	`None`
`attrs`	`list`	A list of reaction attributes to retrieve (e.g., ['name', 'subsystem', 'gene_reaction_rule']). If None, behavior might depend on `Model.get_rxn_info`.	`None`
`drop_duplicates`	`bool`	If True (default), remove duplicate rows from the combined DataFrame.	`True`

Returns:

Type	Description
`DataFrame`	A DataFrame containing the requested reaction information, indexed by reaction ID (potentially duplicated if drop_duplicates=False).

rename ¶

rename(name_tag, inplace=False)

Rename the group.

Parameters:

Name	Type	Description	Default
`name_tag`	`str`	The new name tag for the group.	required
`inplace`	`bool`	If True, modify the current group's name tag directly. If False (default), return a new Group instance with the new name.	`False`

Returns:

Type	Description
`Group or None`	The renamed Group instance if `inplace` is False, otherwise None.

do_flux_analysis ¶

do_flux_analysis(
    method: str,
    aggregate_method: str = "concat",
    solver: str = "gurobi",
    group_by: str = None,
    **kwargs
)

Do flux analysis on the models contained in this group.

Parameters:

Name	Type	Description	Default
`method`	`str`	Name of the flux analysis method to run on each model.	required
`aggregate_method`	`str`	How the per-model flux results are aggregated.	`'concat'`
`solver`	`str`	Solver used for the analysis.	`'gurobi'`
`group_by`	`str`	Determines the groups over which `aggregate_method` is applied.	`None`
`**kwargs`		Keyword arguments passed to `model.do_flux_analysis()`.	`{}`

Returns:

Name	Type	Description
`flux_result`	`FluxAnalysis`

get_info ¶

get_info(models=None, features=None) -> pd.DataFrame

Get an information table by traversing the object structure.

Parameters:

Name	Type	Description	Default
`models`		The name tags of the selected models. If None, all models are used.	`None`
`features`		The features to collect during the traversal.	`None`

Returns:

Name	Type	Description
`information_table`	`DataFrame`

compare ¶

compare(
    models: Optional[Union[str, list, ndarray]] = None,
    group_by: Optional[str] = "group_name",
    method: Literal["jaccard", "PCA", "num"] = "jaccard",
    **kwargs
)

Compare models within the group based on their components.

This method provides different ways to compare the models contained within this group, or subsets/aggregations thereof. Comparisons can be based on component overlap (Jaccard index), component counts, or dimensionality reduction (PCA) of component presence/absence.

Parameters:

Name	Type	Description	Default
`models`	`str or list[str] or ndarray`	Specifies which models from the group to include in the comparison. Can be a single model name tag, a list/array of name tags, or None to include all models in the current group. Defaults to None.	`None`
`group_by`	`str`	An annotation key present in the group's `annotation` DataFrame. If provided, models are first aggregated into subgroups based on the unique values of this annotation before comparison. The comparison (e.g., Jaccard index, PCA) is then performed between these aggregated subgroups. If None, comparison happens between individual models specified by the `models` parameter (or all models if `models` is None). Defaults to "group_name" (which might exist if groups were nested).	`'group_name'`
`method`	`(jaccard, PCA, num)`	The comparison method to use: - 'jaccard': Calculate pairwise Jaccard similarity based on shared components (genes, reactions, metabolites). See `_compare_components_jaccard`. - 'PCA': Perform Principal Component Analysis on a matrix where rows are models/groups and columns are components (presence/absence). See `_compare_component_PCA`. - 'num': Compare the number of components (genes, reactions, metabolites) across models/groups. See `_compare_component_num`. Defaults to "jaccard".	`'jaccard'`
`**kwargs`		Additional keyword arguments passed to the specific comparison method (e.g., `components` for 'jaccard' and 'num', `n_components` for 'PCA').	`{}`

Returns:

Type	Description
`Union[ComponentComparisonAnalysis, ComponentNumberAnalysis, PCA_Analysis]`	An analysis result object corresponding to the chosen `method`.

Raises:

Type	Description
`ValueError`	If an invalid `method` is specified.
`KeyError`	If `group_by` refers to an annotation key not present, or if `models` contains names not in the group.
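
For reference, the Jaccard index of two component sets is the size of their intersection divided by the size of their union. The sketch below computes it pairwise over reaction ID sets; the model names, reaction IDs, and the `jaccard` helper are hypothetical, and the result is a plain dictionary rather than the analysis object that `compare` returns.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard index |a & b| / |a | b|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 1.0  # convention chosen for this sketch
    return len(a & b) / len(union)


# Hypothetical reaction ID sets for three models.
reactions: Dict[str, Set[str]] = {
    "model1": {"HEX1", "PGI", "CS"},
    "model2": {"HEX1", "PGI", "ACONT"},
    "model3": {"HEX1"},
}

# One score per unordered pair of models.
pairwise: Dict[Tuple[str, str], float] = {
    (m, n): jaccard(reactions[m], reactions[n])
    for m, n in combinations(reactions, 2)
}
```

Here `model1` and `model2` share two of four distinct reactions, giving a score of 0.5, while each shares only `HEX1` with `model3`.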
