annsel



Annsel is a user-friendly library that brings familiar dataframe-style operations to AnnData objects.

It's built on the narwhals compatibility layer for dataframes.

Take a look at the GitHub Projects board for features and future plans: Annsel Features

Getting started

Please refer to the documentation, in particular, the API documentation.

There's also a brief tutorial on how to use all the features of annsel: All of Annsel.

Installation

You need to have Python 3.10 or newer installed on your system. If you don't have Python installed, we recommend installing uv. There are several ways to install annsel:

Install the most recent release:

With uv:
```
uv add annsel
```
With pip:
```
pip install annsel
```

Install the latest development version:

With uv:
```
uv add git+https://github.com/srivarra/annsel
```
With pip:
```
pip install git+https://github.com/srivarra/annsel.git@main
```
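
To confirm that the interpreter meets the version requirement above and that the package is visible to it, a quick check along these lines may help. This is a minimal sketch that uses only the standard library; the helper names `installed_version` and `python_is_supported` are illustrative and not part of annsel.

```python
import sys
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def python_is_supported(minimum: Tuple[int, int] = (3, 10)) -> bool:
    """Compare the running interpreter against a minimum (major, minor) version."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("annsel version:", installed_version("annsel"))
```

If the second line prints `None`, the package is not installed in the environment that this interpreter belongs to.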

Examples

annsel comes with a small dataset from Cell X Gene to help you get familiar with the API.

```
import annsel as an

adata = an.datasets.leukemic_bone_marrow_dataset()
```

The dataset looks like this:

```
AnnData object with n_obs × n_vars = 31586 × 458
    obs: 'Cluster_ID', 'donor_id', 'Sample_Tag', 'Cell_label', 'is_primary_data', 'organism_ontology_term_id', 'self_reported_ethnicity_ontology_term_id', 'assay_ontology_term_id', 'tissue_ontology_term_id', 'Genotype', 'development_stage_ontology_term_id', 'sex_ontology_term_id', 'disease_ontology_term_id', 'cell_type_ontology_term_id', 'suspension_type', 'tissue_type', 'cell_type', 'assay', 'disease', 'organism', 'sex', 'tissue', 'self_reported_ethnicity', 'development_stage', 'observation_joinid'
    var: 'vst.mean', 'vst.variance', 'vst.variance.expected', 'vst.variance.standardized', 'vst.variable', 'feature_is_filtered', 'Unnamed: 0', 'feature_name', 'feature_reference', 'feature_biotype', 'feature_length', 'feature_type'
    uns: 'cell_type_ontology_term_id_colors', 'citation', 'default_embedding', 'schema_reference', 'schema_version', 'title'
    obsm: 'X_bothumap', 'X_pca', 'X_projected', 'X_projectedmean', 'X_tsneni', 'X_umapni'
```

Filter

You can filter on obs, var, var_names, obs_names, X and its layers, as well as the obsm and varm matrices. The obsm and varm matrices are passed as a key-value pair containing the attribute's key name and the predicate to filter on. Currently, the columns of obsm and varm matrices are addressed by numerical index.

```
adata.an.filter(
    obs=(
        an.col(["Cell_label"]).is_in(["Classical Monocytes", "CD8+CD103+ tissue resident memory T cells"]),
        an.col(["sex"]) == "male",
    ),
    var=an.col(["vst.mean"]) >= 3,
    obsm={"X_pca": an.col([0]) > 0},  # PC1 values greater than 0
    copy=False,  # Whether to return a copy of the AnnData object or just a view of it.
)
```

```
View of AnnData object with n_obs × n_vars = 736 × 67
    obs: 'Cluster_ID', 'donor_id', 'Sample_Tag', 'Cell_label', 'is_primary_data', 'organism_ontology_term_id', 'self_reported_ethnicity_ontology_term_id', 'assay_ontology_term_id', 'tissue_ontology_term_id', 'Genotype', 'development_stage_ontology_term_id', 'sex_ontology_term_id', 'disease_ontology_term_id', 'cell_type_ontology_term_id', 'suspension_type', 'tissue_type', 'cell_type', 'assay', 'disease', 'organism', 'sex', 'tissue', 'self_reported_ethnicity', 'development_stage', 'observation_joinid'
    var: 'vst.mean', 'vst.variance', 'vst.variance.expected', 'vst.variance.standardized', 'vst.variable', 'feature_is_filtered', 'Unnamed: 0', 'feature_name', 'feature_reference', 'feature_biotype', 'feature_length', 'feature_type'
    uns: 'cell_type_ontology_term_id_colors', 'citation', 'default_embedding', 'schema_reference', 'schema_version', 'title'
    obsm: 'X_bothumap', 'X_pca', 'X_projected', 'X_projectedmean', 'X_tsneni', 'X_umapni'
```
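
For intuition, the filter call above corresponds roughly to the boolean masking one would otherwise write by hand. The sketch below uses plain pandas and NumPy on a tiny synthetic table rather than annsel itself. The toy values are invented, and combining all predicates with a logical AND is an assumption made for illustration, not something this README states.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for adata.obs, adata.var and adata.obsm["X_pca"] (values are invented).
obs = pd.DataFrame(
    {
        "Cell_label": ["Classical Monocytes", "B cells", "Classical Monocytes", "NK cells"],
        "sex": ["male", "male", "female", "male"],
    },
    index=["cell_0", "cell_1", "cell_2", "cell_3"],
)
var = pd.DataFrame({"vst.mean": [0.5, 3.0, 4.2]}, index=["gene_a", "gene_b", "gene_c"])
x_pca = np.array([[1.5, 0.1], [2.0, -0.3], [0.7, 0.9], [-1.0, 0.4]])

# Row mask: every obs predicate and the obsm predicate must hold (assumed AND).
obs_mask = (
    obs["Cell_label"].isin(["Classical Monocytes", "CD8+CD103+ tissue resident memory T cells"])
    & (obs["sex"] == "male")
    & (x_pca[:, 0] > 0)  # first principal component greater than 0
).to_numpy()

# Column mask: the var predicate.
var_mask = (var["vst.mean"] >= 3).to_numpy()

kept_cells = obs.index[obs_mask].tolist()
kept_genes = var.index[var_mask].tolist()
print(kept_cells, kept_genes)
```

With a real AnnData object, masks like these could be applied as `adata[obs_mask, var_mask]`, which returns a view of the original object.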

Select

You can select on obs, var, var_names, obs_names, X and its layers. Selecting returns a new AnnData object. It's useful if you don't need all the columns in obs or var and just want to work with a few.

```
adata.an.select(
    obs=an.col(["Cell_label"]),
    var=an.col(["vst.mean", "vst.variance"]),
)
```

Group By

You can group over obs and var columns. This returns a generator of objects, each containing the grouped data and the grouping parameters.

```
gb_adata_result = adata.an.group_by(
    obs=an.col(["Cell_label"]),
    var=an.col(["feature_type"]),
    copy=False,
)
```

Here's what the first group looks like:

```
next(adata.an.group_by(
    obs=an.col(["Cell_label"]),
    copy=False,
))
```

```
GroupByAnnData:
  ├── Observations:
  │   └── Cell_label: Lymphomyeloid prog
  ├── Variables:
  │   └── (all variables)
  └── AnnData:
      View of AnnData object with n_obs × n_vars = 913 × 458
          obs: 'Cluster_ID', 'donor_id', 'Sample_Tag', 'Cell_label', 'is_primary_data', 'organism_ontology_term_id', 'self_reported_ethnicity_ontology_term_id', 'assay_ontology_term_id', 'tissue_ontology_term_id', 'Genotype', 'development_stage_ontology_term_id', 'sex_ontology_term_id', 'disease_ontology_term_id', 'cell_type_ontology_term_id', 'suspension_type', 'tissue_type', 'cell_type', 'assay', 'disease', 'organism', 'sex', 'tissue', 'self_reported_ethnicity', 'development_stage', 'observation_joinid'
          var: 'vst.mean', 'vst.variance', 'vst.variance.expected', 'vst.variance.standardized', 'vst.variable', 'feature_is_filtered', 'Unnamed: 0', 'feature_name', 'feature_reference', 'feature_biotype', 'feature_length', 'feature_type'
          uns: 'cell_type_ontology_term_id_colors', 'citation', 'default_embedding', 'schema_reference', 'schema_version', 'title'
          obsm: 'X_bothumap', 'X_pca', 'X_projected', 'X_projectedmean', 'X_tsneni', 'X_umapni'
```
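
The generator pattern described above can be illustrated with plain pandas: groups are handed out one at a time, so `next` pulls only the first one. This is a sketch on a synthetic obs table, not annsel's implementation; the helper `iter_obs_groups` and the toy labels are hypothetical.

```python
from typing import Iterator, Tuple

import pandas as pd


def iter_obs_groups(obs: pd.DataFrame, column: str) -> Iterator[Tuple[str, pd.Index]]:
    """Lazily yield (group label, row names) for each distinct value in `column`."""
    # sort=False keeps groups in order of first appearance.
    for label, frame in obs.groupby(column, sort=False):
        yield label, frame.index


# Toy stand-in for adata.obs (values are invented).
obs = pd.DataFrame(
    {"Cell_label": ["Lymphomyeloid prog", "B cells", "Lymphomyeloid prog"]},
    index=["cell_0", "cell_1", "cell_2"],
)

groups = iter_obs_groups(obs, "Cell_label")
first_label, first_cells = next(groups)  # only the first group is requested here
print(first_label, list(first_cells))
```

The remaining groups can be consumed with a regular `for` loop over `groups`.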

Pipe

There's also a small utility method, pipe, which lets you chain operations together as in Xarray and Pandas.

```
import scanpy as sc

adata.an.pipe(sc.pl.embedding, basis="X_tsneni", color="Cell_label")
```
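
In pandas, `obj.pipe(func, *args, **kwargs)` calls `func(obj, *args, **kwargs)`. Assuming annsel follows the same convention, the call above would be equivalent to `sc.pl.embedding(adata, basis="X_tsneni", color="Cell_label")`. The sketch below shows the pattern on a hypothetical `PipeList` class, with made-up helper functions, to illustrate why it helps with chaining.

```python
from typing import Any, Callable


class PipeList(list):
    """Hypothetical list subclass illustrating the pandas-style pipe convention."""

    def pipe(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        # Pass the object itself as the first positional argument.
        return func(self, *args, **kwargs)


def scale(values: PipeList, factor: float) -> PipeList:
    """Multiply every element by `factor`."""
    return PipeList(v * factor for v in values)


def shift(values: PipeList, offset: float = 0) -> PipeList:
    """Add `offset` to every element."""
    return PipeList(v + offset for v in values)


# The chained form reads left to right ...
result = PipeList([1, 2, 3]).pipe(scale, 2).pipe(shift, offset=1)

# ... while the equivalent nested form reads inside out.
nested = shift(scale(PipeList([1, 2, 3]), 2), offset=1)

print(result, result == nested)
```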

Release notes

See the changelog.

Contact

For questions and help requests, you can reach out in the scverse discourse or the discussions tab. If you found a bug, please use the issue tracker.

Citation

Varra, S. R. annsel [Computer software]. https://github.com/srivarra/annsel
