hormone2cell.hormone_strength
- hormone2cell.hormone_strength(ave_all: DataFrame, geneset_definition: DataFrame, celltype_column: str = 'Celltype_unique', tissue_col: str | None = 'Tissue', adjustment: bool = True, assay: str | None = 'cell', include_cols: str = ['hormonegene_include1', 'hormonegene_include2', 'hormonegene_include3', 'hormonegene_include4'], exclude_cols: str = ['hormonegene_exclude1', 'hormonegene_exclude2'], thresh_expr_low=0.01, thresh_pct=5, specificity_threshold=0.8, coverage_threshold=0.5, thresh_included_initial=0.1, thresh_excluded_initial=0.15, thresh_included_adjusted=0.5, thresh_excluded_adjusted=0.15, use_precomputed: str | None = None, max_expression_file='gene_max_expression.csv') → DataFrame
End-to-end pipeline to compute hormone strength per cell type and return a long-form, annotated table.
Workflow
Collect all included/excluded genes from geneset_definition (the columns named in include_cols and exclude_cols).
Compute a wide hormone × cell-type matrix of strengths.
- If adjustment=True, run a specificity/coverage-adjusted pass and replace the affected hormones.
- If adjustment=False, run a single unadjusted pass.
Convert the wide matrix to long format and append hormone annotations (and the assay label if provided).
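The first and last workflow steps can be sketched with plain pandas. This is a minimal illustration under stated assumptions, not the library's implementation: the helpers collect_genes and wide_to_long and the toy data are hypothetical, and the sketch assumes the wide matrix is indexed by the values of ‘hormone_short’. Only the column names are taken from this page.

```python
import pandas as pd

INCLUDE_COLS = ["hormonegene_include1", "hormonegene_include2"]
EXCLUDE_COLS = ["hormonegene_exclude1"]


def collect_genes(geneset_definition, cols):
    """Hypothetical helper: unique, non-missing gene IDs found in `cols`."""
    present = [c for c in cols if c in geneset_definition.columns]
    values = geneset_definition[present].to_numpy().ravel()
    return sorted({g for g in values if pd.notna(g)})


def wide_to_long(wide, geneset_definition,
                 celltype_column="Celltype_unique", assay="cell"):
    """Hypothetical helper: melt a hormone x cell-type matrix and annotate it.

    Assumes `wide` is indexed by the values of 'hormone_short'.
    """
    long_df = (
        wide.rename_axis("Hormone")
        .reset_index()
        .melt(id_vars="Hormone", var_name=celltype_column, value_name="Strength")
    )
    # Keep only the annotation columns that are actually present.
    optional = ("hormone_display", "hormone_figures", "Tier")
    annot_cols = ["hormone_short"] + [
        c for c in optional if c in geneset_definition.columns
    ]
    long_df = long_df.merge(
        geneset_definition[annot_cols],
        left_on="Hormone", right_on="hormone_short", how="left",
    )
    if assay is not None:
        long_df["assay"] = assay  # constant label column
    return long_df


# Toy inputs with placeholder hormone and gene names.
geneset = pd.DataFrame({
    "hormone_short": ["H1", "H2"],
    "hormone_display": ["Hormone 1", "Hormone 2"],
    "hormonegene_include1": ["GENE_A", "GENE_B"],
    "hormonegene_include2": ["GENE_C", None],
    "hormonegene_exclude1": [None, "GENE_D"],
})
wide = pd.DataFrame(
    [[0.9, 0.0], [0.1, 0.7]],
    index=["H1", "H2"],
    columns=["beta cell", "alpha cell"],
)

included = collect_genes(geneset, INCLUDE_COLS)
excluded = collect_genes(geneset, EXCLUDE_COLS)
long_df = wide_to_long(wide, geneset)
print(long_df)
```

Each (hormone, cell type) pair of the wide matrix becomes one row, so a 2 × 2 matrix yields four rows. The left merge keeps every strength value even when a hormone has no annotation row.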
Parameters
- ave_all : pd.DataFrame
Input table used by downstream steps. Must contain at least the column named by celltype_column; if adjustment=True, it must also contain tissue_col.
- geneset_definition : pd.DataFrame
Hormone definition/annotation table. Must contain ‘hormone_short’; may optionally include ‘hormone_display’, ‘hormone_figures’, ‘Tier’. Columns listed in include_cols / exclude_cols should hold gene IDs.
- celltype_column : str, default “Celltype_unique”
Column name identifying cell types.
- tissue_col : Optional[str], default “Tissue”
Column name identifying tissue; only required when adjustment=True.
- adjustment : bool, default True
Whether to run the specificity/coverage-adjusted second pass.
- assay : Optional[str], default “cell”
If provided, added as a constant ‘assay’ column in the output.
- include_cols, exclude_cols : list of str
Lists of column names in geneset_definition that define the include/exclude gene sets.
- thresh_expr_low, thresh_pct : float, defaults 0.01 and 5
Initial filtering thresholds used inside the calculation.
- specificity_threshold, coverage_threshold : float, defaults 0.8 and 0.5
Thresholds for identifying hormones that need adjustment (τ specificity and coverage across cell types within tissues).
- thresh_included_initial, thresh_excluded_initial : float
Expression thresholds for the first pass (included/excluded genes).
- thresh_included_adjusted, thresh_excluded_adjusted : float
Expression thresholds used for the adjusted pass.
- use_precomputed : ‘cell’, ‘nucleus’, or None, default None
If set, load precomputed maximum log-expression values for hormone-related genes; the value selects the assay whose precomputed thresholds are loaded.
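The column requirements stated for ave_all and geneset_definition can be checked before calling the pipeline. The function below is a hypothetical pre-flight check written for illustration; it is not part of hormone2cell and only encodes the requirements listed above.

```python
import pandas as pd


def check_inputs(ave_all, geneset_definition,
                 celltype_column="Celltype_unique",
                 tissue_col="Tissue", adjustment=True):
    """Hypothetical check: raise ValueError if a documented column is missing."""
    required = [celltype_column]
    if adjustment:
        # The tissue column is only needed for the adjusted pass.
        required.append(tissue_col)
    missing = [c for c in required if c not in ave_all.columns]
    if missing:
        raise ValueError(f"ave_all is missing column(s): {missing}")
    if "hormone_short" not in geneset_definition.columns:
        raise ValueError("geneset_definition is missing column 'hormone_short'")


# Toy frames: ave_all has no 'Tissue' column.
ave_all = pd.DataFrame({"Celltype_unique": ["beta cell", "alpha cell"]})
geneset = pd.DataFrame({"hormone_short": ["H1", "H2"]})

check_inputs(ave_all, geneset, adjustment=False)  # passes: tissue not required

try:
    check_inputs(ave_all, geneset, adjustment=True)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

Failing early with the names of the missing columns is usually easier to debug than a KeyError raised deep inside the calculation.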
Returns
- pd.DataFrame
Long-form DataFrame with columns like: [‘Hormone’, <celltype_column>, ‘Strength’, ‘hormone_short’, ‘hormone_display’, ‘hormone_figures’, ‘Tier’, ‘assay’ (optional)].
‘Strength’ holds the values of the wide hormone × cell-type matrix, one row per hormone and cell type.