Quickstart

pipeGEM is a Python package for visualizing and analyzing genome-scale metabolic models (GEMs) and for integrating omics data with them. Flux analysis builds on COBRApy; pipeGEM adds model and group containers, omics-data handling, thresholding, metabolic tasks, medium utilities, and model-extraction workflows.

Command line

Generate template configuration files for a pipeline:

pipeGEM template -p integration -o ./

This writes template TOML files under a configs directory. Edit those files to point at your model, gene data, thresholds, mapping tables, and integration settings.
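
Before editing, it can help to confirm which template files were actually written. The following is a minimal sketch that assumes the templates land in a configs directory under the output path; list_templates is a hypothetical helper for illustration, not part of pipeGEM.

```python
from pathlib import Path


def list_templates(config_dir="configs"):
    """Return the relative paths of all TOML files under config_dir, sorted."""
    root = Path(config_dir)
    # rglob also picks up nested files such as flux_analysis/pFBA.toml
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.toml"))


if __name__ == "__main__":
    for name in list_templates():
        print(name)
```

Only the TOML files are listed, so other files in the directory do not clutter the output.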

Run model processing:

pipeGEM process -t configs/model.toml

Run expression-data integration:

pipeGEM integrate \
  -g configs/gene_data.toml \
  -t configs/model.toml \
  -r configs/threshold.toml \
  -m configs/mapping.toml \
  -i configs/integration.toml
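
When the integration step is driven from a script, the same command can be assembled in Python. This is a sketch under stated assumptions: the flags and file names are copied from the command above, the configs directory is assumed to be relative to the working directory, and build_integrate_cmd and missing_configs are hypothetical helpers rather than pipeGEM functions.

```python
import subprocess
from pathlib import Path

# Flag -> template file name, taken from the command above.
INTEGRATE_CONFIGS = {
    "-g": "gene_data.toml",
    "-t": "model.toml",
    "-r": "threshold.toml",
    "-m": "mapping.toml",
    "-i": "integration.toml",
}


def build_integrate_cmd(config_dir="configs"):
    """Assemble the argv list for `pipeGEM integrate`."""
    cmd = ["pipeGEM", "integrate"]
    for flag, file_name in INTEGRATE_CONFIGS.items():
        cmd += [flag, (Path(config_dir) / file_name).as_posix()]
    return cmd


def missing_configs(config_dir="configs"):
    """Return the config files that do not exist yet, so errors surface early."""
    return [
        file_name
        for file_name in INTEGRATE_CONFIGS.values()
        if not (Path(config_dir) / file_name).is_file()
    ]


if __name__ == "__main__":
    missing = missing_configs()
    if missing:
        print("Missing config files:", ", ".join(missing))
    else:
        # check=True raises CalledProcessError if pipeGEM exits with a non-zero status
        subprocess.run(build_integrate_cmd(), check=True)
```

Passing the arguments as a list avoids shell quoting issues, and checking for missing files first gives a clearer message than a failure inside the pipeline.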

Run flux analysis:

pipeGEM flux -f configs/flux_analysis/pFBA.toml -t configs/model.toml

Run model comparison:

pipeGEM compare -c configs/comparison.toml

Legacy CLI

The older invocation styles, python -m pipeGEM.cli -n <pipeline> and pipeGEM -n <pipeline>, are still accepted for compatibility, but new scripts should use the subcommands shown above.

Runnable Python example

This example loads the E. coli core model, wraps it in pipeGEM.Model, and runs an optimization through the underlying COBRA model.

import pipeGEM as pg
from pipeGEM import load_remote_model

cobra_model = load_remote_model("e_coli_core")
model = pg.Model(name_tag="e_coli_core", model=cobra_model)

print(model)
print(len(model.reaction_ids))

solution = model.cobra_model.optimize()
print(solution.objective_value)

For local analyses, load an SBML, JSON, YAML, or MAT model and then call pipeGEM workflows:

import pipeGEM as pg
from pipeGEM.utils import load_model

cobra_model = load_model("your_model_path")
model = pg.Model(name_tag="model_name", model=cobra_model)

flux_analysis = model.do_flux_analysis("pFBA", solver="glpk")
flux_analysis.plot(
    rxn_ids=["rxn_a", "rxn_b"],
    file_name="pfba_flux.png",
)

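Multiple models

A Group collects several models under named subgroups. The treatments mapping labels each model, and the same analysis call then runs across the whole group: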

import pipeGEM as pg
from pipeGEM.utils import load_model

model_a1 = load_model("your_model_path_1")
model_a2 = load_model("your_model_path_2")
model_b1 = load_model("your_model_path_3")
model_b2 = load_model("your_model_path_4")

group = pg.Group(
    {
        "group_a": {
            "model_a_dmso": model_a1,
            "model_a_metformin": model_a2,
        },
        "group_b": {
            "model_b_dmso": model_b1,
            "model_b_metformin": model_b2,
        },
    },
    name_tag="my_group",
    treatments={
        "model_a_dmso": "DMSO",
        "model_b_dmso": "DMSO",
        "model_a_metformin": "metformin",
        "model_b_metformin": "metformin",
    },
)

flux_analysis = group.do_flux_analysis("pFBA")
flux_analysis.plot(rxn_ids=["rxn_a", "rxn_b"])
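
With many models, writing the treatments mapping by hand gets error-prone. If model names end in a treatment token, as they do in the example above, the mapping can be derived from the names. This is only a sketch: TREATMENT_LABELS and treatments_from_names are hypothetical and rely on that naming convention, which pipeGEM does not require.

```python
# Hypothetical suffix -> label table; adjust to your own naming scheme.
TREATMENT_LABELS = {"dmso": "DMSO", "metformin": "metformin"}


def treatments_from_names(model_names, labels=TREATMENT_LABELS):
    """Map each model name to a treatment label based on its last "_" token."""
    treatments = {}
    for name in model_names:
        suffix = name.rsplit("_", 1)[-1]
        if suffix not in labels:
            # Fail loudly rather than silently leaving a model unlabeled
            raise ValueError(f"No treatment label for model {name!r}")
        treatments[name] = labels[suffix]
    return treatments


names = ["model_a_dmso", "model_a_metformin", "model_b_dmso", "model_b_metformin"]
treatments = treatments_from_names(names)
print(treatments)
```

The resulting dictionary has the same shape as the treatments argument shown above.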

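Context-specific models

This example generates synthetic gene-expression data, applies a log2 transform, attaches the data to the model, and extracts a context-specific model with GIMME: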

import numpy as np
import pipeGEM as pg
from pipeGEM.data import GeneData, synthesis
from pipeGEM.utils import load_model

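# Wrap the COBRA model in a pipeGEM Model.
model = pg.Model(name_tag="sample_0", model=load_model("your_model_path"))
# Generate synthetic expression data for three samples (a stand-in for real data).
dummy_data = synthesis.get_syn_gene_data(model, n_sample=3)

# Take the sample_0 entry and log2-transform it.
gene_data = GeneData(
    data=dummy_data["sample_0"],
    data_transform=lambda x: np.log2(x),
    absent_expression=-np.inf,
)

# Attach the expression data to the model under the name "sample_0".
model.add_gene_data(
    name_or_prefix="sample_0",
    data=gene_data,
    or_operation="nanmax",
    threshold=-np.inf,
    absent_value=-np.inf,
)

# Run GIMME on the attached data.
result = model.integrate_gene_data(
    data_name="sample_0",
    integrator="GIMME",
    high_exp=5 * np.log10(2),
)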

context_specific_model = result.result_model
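
The or_operation="nanmax" argument suggests that gene-level values are combined into reaction-level scores with a NaN-aware maximum over alternative genes. The sketch below illustrates that common convention for gene-protein-reaction rules (maximum for OR, minimum for AND) in plain Python. It is an assumption about the general approach, not pipeGEM's actual implementation, and the helper names and expression values are hypothetical.

```python
import math


def nan_max(values):
    """Maximum that ignores NaN entries; NaN if nothing is left."""
    kept = [v for v in values if not math.isnan(v)]
    return max(kept) if kept else math.nan


def nan_min(values):
    """Minimum that ignores NaN entries; NaN if nothing is left."""
    kept = [v for v in values if not math.isnan(v)]
    return min(kept) if kept else math.nan


# Hypothetical log2 expression values: g3 was not measured, g4 is absent.
expr = {"g1": 3.0, "g2": 1.5, "g3": math.nan, "g4": -math.inf}

# Rule: (g1 and g2) or g3 or g4
complex_score = nan_min([expr["g1"], expr["g2"]])  # AND: limited by the lowest subunit
reaction_score = nan_max([complex_score, expr["g3"], expr["g4"]])  # OR: best alternative
print(reaction_score)
```

Under this convention, an absent value of negative infinity never wins an OR unless every alternative is absent, and unmeasured genes are skipped instead of propagating NaN.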