ChemMap

ChemMap is a Python library that tries to bridge the gap from metabolomics to proteomics using existing databases.

Table of Contents
ChemMap in a Nutshell
How to Download
How to Use

ChemMap in a Nutshell

A sketch of ChemMap's main method is shown in the following diagram.


Schema showing the workflow of ChemMap

The main functionality of ChemMap, the function map_smiles_to_proteins, accepts a SMILES string or a list of them. In the first phase, it tries to retrieve the PubChem and ChEBI identifiers of each molecule using the PUG REST API. If you set search_method to "expand_all" or "expand_pubchem", ChemMap then also finds structurally similar molecules through the PUG REST fastsimilarity_2d endpoint, which uses Tanimoto similarity scores. Note that ChEBI identifiers are extracted at this stage only if they are reported on PubChem, which might not be the case for newly reported ChEBI substances.
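To make the first phase more concrete, here is a minimal sketch of the kind of PUG REST URLs involved. It is not ChemMap's internal code: the helper names cid_lookup_url and similarity_url are hypothetical, and the threshold and record limits are illustrative values. The sketch only builds the URLs and does not send any request.

```python
from urllib.parse import quote, urlencode

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def cid_lookup_url(smiles: str) -> str:
    """Build a URL that asks PubChem for the CIDs matching a SMILES string."""
    # Percent-encode the SMILES so characters such as '=', '(' and '#'
    # survive being placed in the URL path.
    return f"{PUG_REST}/compound/smiles/{quote(smiles, safe='')}/cids/JSON"


def similarity_url(smiles: str, threshold: int = 90, max_records: int = 50) -> str:
    """Build a URL for a 2D (Tanimoto) similarity search around a SMILES string."""
    # Threshold is the minimum Tanimoto score in percent; MaxRecords caps the hits.
    params = urlencode({"Threshold": threshold, "MaxRecords": max_records})
    return (
        f"{PUG_REST}/compound/fastsimilarity_2d/smiles/"
        f"{quote(smiles, safe='')}/cids/JSON?{params}"
    )


aspirin = "CC(=O)OC1=CC=CC=C1C(=O)O"
print(cid_lookup_url(aspirin))
print(similarity_url(aspirin, threshold=95))
```

SMILES strings that contain a "/" (stereo bonds) may be better sent as a query argument or in a POST body rather than in the URL path.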

In the second phase, if search_method was set to "expand_all" or "expand_chebi", the workflow uses libChEBIpy to find substances related to the ones already found through one of the following relationships: is_conjugate_base_of, is_conjugate_acid_of, is_a, is_tautomer_of or is_enantiomer_of.
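The relationship filter of this phase can be illustrated with plain Python. The sketch below does not call libChEBIpy. It works on placeholder (source, relationship, target) triples, and both the expand_ids helper and the single-hop expansion are assumptions made for illustration.

```python
# Relationship types named in the text above.
ALLOWED_RELATIONS = {
    "is_conjugate_base_of",
    "is_conjugate_acid_of",
    "is_a",
    "is_tautomer_of",
    "is_enantiomer_of",
}


def expand_ids(seed_ids, relations, allowed=ALLOWED_RELATIONS):
    """Return the seeds plus every ID one allowed relationship away from a seed."""
    expanded = set(seed_ids)
    for source, relation_type, target in relations:
        # Keep a target only if it hangs off a seed through an allowed relationship.
        if source in seed_ids and relation_type in allowed:
            expanded.add(target)
    return expanded


# Placeholder triples for illustration only; these are not real ChEBI records.
toy_relations = [
    ("CHEBI:A", "is_conjugate_acid_of", "CHEBI:B"),
    ("CHEBI:A", "has_role", "CHEBI:C"),
    ("CHEBI:D", "is_a", "CHEBI:E"),
]

print(sorted(expand_ids({"CHEBI:A"}, toy_relations)))
# ['CHEBI:A', 'CHEBI:B']
```

Here "CHEBI:C" is dropped because has_role is not in the allowed set, and "CHEBI:E" is dropped because its source is not one of the seeds.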

In the last step, the ChEBI identifiers are used to search for Rhea reactions in which the compound appears as a substrate. If one is found, ChemMap retrieves the EC number and the UniProt protein identifier, if available. Behind the scenes, this relies on the UniProt SPARQL endpoint and on the fact that Rhea and UniProt are synchronized on every UniProt release (more here).
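As a rough illustration of this step, the sketch below builds a SPARQL query that links a ChEBI identifier to Rhea reactions and then to UniProt entries and EC numbers. It is an assumption-laden sketch, not the query ChemMap actually sends: the build_rhea_query helper is hypothetical, the federated SERVICE clause and the property paths follow the publicly documented Rhea and UniProt RDF vocabularies as best understood here, and the query matches a compound on any side of a reaction rather than only as a substrate.

```python
UNIPROT_SPARQL_ENDPOINT = "https://sparql.uniprot.org/sparql"

# Doubled braces are literal braces once str.format has filled in the ID.
QUERY_TEMPLATE = """\
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rh: <http://rdf.rhea-db.org/>
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX CHEBI: <http://purl.obolibrary.org/obo/CHEBI_>

SELECT DISTINCT ?rhea ?protein ?ec
WHERE {{
  SERVICE <https://sparql.rhea-db.org/sparql> {{
    ?rhea rdfs:subClassOf rh:Reaction ;
          rh:side/rh:contains/rh:compound/rh:chebi CHEBI:{chebi_number} .
  }}
  OPTIONAL {{
    ?protein up:annotation/up:catalyticActivity ?activity .
    ?activity up:catalyzedReaction ?rhea .
    OPTIONAL {{ ?activity up:enzymeClass ?ec . }}
  }}
}}
"""


def build_rhea_query(chebi_id: str) -> str:
    """Return a SPARQL query for reactions and proteins linked to a ChEBI ID."""
    # Accept both "CHEBI:15365" and "15365"; reject anything non-numeric so
    # arbitrary text cannot be injected into the query.
    number = chebi_id.split(":")[-1]
    if not number.isdigit():
        raise ValueError(f"Not a ChEBI identifier: {chebi_id!r}")
    return QUERY_TEMPLATE.format(chebi_number=number)


print(build_rhea_query("CHEBI:15365"))
```

The resulting string could then be sent to UNIPROT_SPARQL_ENDPOINT with any HTTP or SPARQL client. Both OPTIONAL blocks reflect the "if available" wording above: a reaction is still returned when no protein or EC number is attached to it.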

The output of this process is three dataframes containing, respectively, compound data (from the first and second phases), reaction data (from the last step) and reaction data of similar structures. If the to_tsv parameter is passed to the method, the data is also saved in a folder named after the current date and time, down to the second.
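The timestamped output folder can be pictured with the following standard-library sketch. The save_tables helper, the file names and the exact timestamp format are assumptions; ChemMap's own naming may differ. Tables are represented here as lists of dictionaries to avoid extra dependencies. With pandas dataframes, the equivalent write would be DataFrame.to_csv(path, sep="\t").

```python
import csv
import tempfile
from datetime import datetime
from pathlib import Path


def save_tables(tables, base_dir=".", now=None):
    """Write each table to <base_dir>/<timestamp>/<name>.tsv and return the folder."""
    now = now or datetime.now()
    # Second-level resolution, e.g. 2024-01-02_03-04-05 (format is an assumption).
    out_dir = Path(base_dir) / now.strftime("%Y-%m-%d_%H-%M-%S")
    out_dir.mkdir(parents=True, exist_ok=True)
    for name, rows in tables.items():
        if not rows:
            continue  # nothing to write for an empty table
        with open(out_dir / f"{name}.tsv", "w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(rows[0]), delimiter="\t")
            writer.writeheader()
            writer.writerows(rows)
    return out_dir


# Demo with placeholder rows, written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    folder = save_tables(
        {"compounds": [{"smiles": "CCO", "source": "placeholder"}]},
        base_dir=tmp,
    )
    print(sorted(path.name for path in folder.iterdir()))
```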

How to Download

This library can be downloaded through pip

pip install chemmap

or by direct clone using git

git clone git@github.com:anguera5/ChemMap.git

and installing the dependencies with poetry.

cd ChemMap
poetry install
poetry env activate

How to Use

A minimal use case looks as follows. Suppose we are interested in all the chemical identifiers of Aspirin and the reactions it takes part in. A quick Google search will show us that the SMILES for Aspirin is CC(=O)OC1=CC=CC=C1C(=O)O.

from ChemMap.chem_map import ChemMap

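# SMILES string for Aspirin
smiles = "CC(=O)OC1=CC=CC=C1C(=O)O"
# "expand_all" enables both the PubChem similarity and the ChEBI relationship expansions
search_method = "expand_all"
cm = ChemMap()
cm.map_smiles_to_proteins(smiles, search_method=search_method)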
