tooluniverse.chem_tool module¶
- tooluniverse.chem_tool.quote('abc def') -> 'abc%20def' [source][source]¶
Each part of a URL, e.g. the path info, the query, etc., has a different set of reserved characters that must be quoted. The quote function offers a cautious (not minimal) way to quote a string for most of these parts.
RFC 3986 Uniform Resource Identifier (URI): Generic Syntax lists the following (un)reserved characters.
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
reserved = gen-delims / sub-delims
gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
Each of the reserved characters is reserved in some component of a URL, but not necessarily in all of them.
The quote function %-escapes all characters that are neither in the unreserved chars (“always safe”) nor the additional chars set via the safe arg.
The default for the safe arg is ‘/’. The character is reserved, but in typical usage the quote function is being called on a path where the existing slash characters are to be preserved.
Since Python 3.7, quote follows RFC 3986 instead of RFC 2396 when quoting URL strings, so “~” is now included in the set of unreserved characters.
string and safe may be either str or bytes objects. encoding and errors must not be specified if string is a bytes object.
The optional encoding and errors parameters specify how to deal with non-ASCII characters, as accepted by the str.encode method. By default, encoding='utf-8' (characters are encoded with UTF-8), and errors='strict' (unsupported characters raise a UnicodeEncodeError).
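A brief sketch of the quoting rules described above. It assumes that this quote is the standard-library urllib.parse.quote re-exported by the module. The docstring matches that function, but the import path used here is an assumption.

```python
from urllib.parse import quote

# Spaces are percent-escaped; "/" survives because it is the default `safe` value.
print(quote("/compound name/abc def"))   # /compound%20name/abc%20def

# Passing safe="" also escapes slashes, which suits a single path segment.
print(quote("a/b", safe=""))             # a%2Fb

# Non-ASCII text is encoded as UTF-8 before escaping.
print(quote("café"))                     # caf%C3%A9

# "~" is unreserved under RFC 3986, so it is left alone (Python 3.7+).
print(quote("~user"))                    # ~user
```

Choosing safe="" is the cautious option whenever the quoted value is one path segment (for example a SMILES string) rather than a whole path.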
- class tooluniverse.chem_tool.BaseTool(tool_config)[source][source]¶
Bases:
object
- classmethod get_default_config_file()[source][source]¶
Get the path to the default configuration file for this tool type.
This method uses a robust path resolution strategy that works across different installation scenarios:
Installed packages: Uses importlib.resources for proper package resource access
Development mode: Falls back to file-based path resolution
Legacy Python: Handles importlib.resources and importlib_resources
Override this method in subclasses to specify a custom defaults file.
- Returns:
Path or resource object pointing to the defaults file
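The resolution strategy listed above can be sketched roughly as follows. This is not the BaseTool implementation: resolve_defaults and its parameters are hypothetical names, the sketch requires Python 3.9+ for importlib.resources.files, and it leaves out the importlib_resources backport mentioned for legacy Python.

```python
from importlib import resources
from pathlib import Path


def resolve_defaults(package: str, filename: str, fallback_dir: Path):
    """Return a resource for `filename` in `package`, else a plain file path."""
    try:
        # Installed packages: ask importlib.resources (Python 3.9+).
        candidate = resources.files(package).joinpath(filename)
        if candidate.is_file():
            return candidate
    except (ModuleNotFoundError, TypeError):
        # The package is not importable or exposes no resource reader.
        pass
    # Development mode: fall back to a path relative to the source tree.
    return fallback_dir / filename


# The standard-library "json" package stands in for a tool package here.
print(resolve_defaults("json", "__init__.py", Path(".")))
```

The fallback path is returned without checking that it exists, so a caller would still need to handle a missing file.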
- tooluniverse.chem_tool.register_tool(tool_type_name=None, config=None)[source][source]¶
Decorator to automatically register tool classes and their configs.
- Usage:
@register_tool('CustomToolName', config={…})
class MyTool:
    pass
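To show how a registering decorator of this shape can work, here is a minimal stand-in. It is an illustration only: TOOL_REGISTRY and TOOL_CONFIGS are hypothetical names, and falling back to the class name when tool_type_name is None is an assumption, not documented behaviour of tooluniverse.

```python
# Hypothetical stand-ins; these are not tooluniverse's internal registries.
TOOL_REGISTRY = {}
TOOL_CONFIGS = {}


def register_tool(tool_type_name=None, config=None):
    """Class decorator that records a class (and optional config) by name."""
    def decorator(cls):
        # Assumption: use the class name when no explicit name is given.
        name = tool_type_name or cls.__name__
        TOOL_REGISTRY[name] = cls
        if config is not None:
            TOOL_CONFIGS[name] = config
        return cls  # the class itself is returned unchanged
    return decorator


@register_tool("CustomToolName", config={"description": "example"})
class MyTool:
    pass


print(TOOL_REGISTRY["CustomToolName"] is MyTool)  # True
```

Because the decorator returns the class unchanged, MyTool can still be instantiated and subclassed as usual.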
- class tooluniverse.chem_tool.Indigo[source][source]¶
Bases:
object
- deserialize(arr: bytes) IndigoObject [source][source]¶
Creates molecule or reaction object from binary serialized CMF format
- Parameters:
arr (bytes) – array of bytes
- Returns:
molecule or reaction object
- Return type:
IndigoObject
- unserialize(arr: bytes) IndigoObject [source][source]¶
[DEPRECATED] Creates molecule or reaction object from binary serialized CMF format
- Parameters:
arr (list) – array of bytes
- Returns:
molecule or reaction object
- Return type:
IndigoObject
- convertToArray(iterable)[source][source]¶
Converts iterable object to array
- Parameters:
iterable (IndigoObject) – iterable object
- Raises:
IndigoException – if object is not iterable
- Returns:
array of objects
- Return type:
IndigoObject
- versionInfo()[source][source]¶
Returns Indigo version info
- Returns:
version info string
- Return type:
str
- countReferences()[source][source]¶
Returns the number of objects in pool
- Returns:
number of objects
- Return type:
int
- writeFile(filename)[source][source]¶
Creates file writer object
- Parameters:
filename (str) – full path to the file
- Returns:
file writer object
- Return type:
IndigoObject
- writeBuffer()[source][source]¶
Creates buffer to write an object
- Returns:
buffer object
- Return type:
IndigoObject
- createMolecule()[source][source]¶
Creates molecule object
- Returns:
molecule object
- Return type:
IndigoObject
- createQueryMolecule()[source][source]¶
Creates query molecule object
- Returns:
query molecule
- Return type:
IndigoObject
- loadMolecule(string)[source][source]¶
Loads molecule from string. Format is automatically recognized.
- Parameters:
string (str) – molecule format
- Returns:
molecule object
- Return type:
IndigoObject
- Raises:
IndigoException – Exception if structure format is incorrect
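A hedged usage sketch for loadMolecule and its error path. It assumes the Indigo toolkit is importable as the indigo package, and it uses canonicalSmiles, an IndigoObject method that is not documented in this section. The helper name canonical_smiles is hypothetical.

```python
def canonical_smiles(structure: str):
    """Return canonical SMILES for `structure`, or None if it cannot be produced."""
    try:
        from indigo import Indigo, IndigoException
    except ImportError:
        return None  # Indigo toolkit is not installed in this environment
    indigo = Indigo()
    try:
        molecule = indigo.loadMolecule(structure)  # format is auto-detected
    except IndigoException:
        return None  # input was not a recognisable structure
    return molecule.canonicalSmiles()


print(canonical_smiles("C1=CC=CC=C1"))
```

Returning None for both failure modes keeps the sketch short. Real code would probably want to tell a missing dependency apart from bad input.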
- loadMoleculeFromFile(filename)[source][source]¶
Loads molecule from file. Automatically detects input format.
- Parameters:
filename (str) – full path to a file
- Returns:
loaded molecular structure
- Return type:
IndigoObject
- Raises:
IndigoException – Exception if structure format is incorrect
- loadMoleculeFromBuffer(data)[source][source]¶
Loads molecule from buffer. Automatically detects input format.
- Parameters:
data (bytes) – input byte array
- Returns:
loaded molecular structure
- Return type:
IndigoObject
- Raises:
IndigoException – Exception if structure format is incorrect
Examples
with open(.., 'rb') as f:
    m = indigo.loadMoleculeFromBuffer(f.read())
- loadMoleculeWithLib(string, library)[source][source]¶
Loads molecule from string. Format is automatically recognized.
- Parameters:
string (str) – molecule format
library (IndigoObject) – monomer library object
- Returns:
molecule object
- Return type:
IndigoObject
- Raises:
IndigoException – Exception if structure format is incorrect
- loadMoleculeWithLibFromFile(filename, library)[source][source]¶
Loads molecule from file. Automatically detects input format.
- Parameters:
filename (str) – full path to a file
library (IndigoObject) – monomer library object
- Returns:
loaded molecular structure
- Return type:
IndigoObject
- Raises:
IndigoException – Exception if structure format is incorrect
- loadMoleculeWithLibFromBuffer(data, library)[source][source]¶
Loads molecule from buffer. Automatically detects input format.
- Parameters:
data (bytes) – input byte array
library (IndigoObject) – monomer library object
- Returns:
loaded molecular structure
- Return type:
IndigoObject
- Raises:
IndigoException – Exception if structure format is incorrect
Examples
with open(.., 'rb') as f:
    m = indigo.loadMoleculeWithLibFromBuffer(f.read(), library)
- loadQueryMolecule(string)[source][source]¶
Loads query molecule from string. Format will be automatically recognized.
- Parameters:
string (str) – molecule format
- Returns:
query molecule object
- Return type:
IndigoObject
- Raises:
IndigoException – Exception if structure format is incorrect
- loadQueryMoleculeFromFile(filename)[source][source]¶
Loads query molecule from file. Automatically detects input format.
- Parameters:
filename (str) – full path to a file
- Returns:
loaded query molecular structure
- Return type:
IndigoObject
- Raises:
IndigoException – Exception if structure format is incorrect
- loadQueryMoleculeWithLib(string, library)[source][source]¶
Loads query molecule from string. Format will be automatically recognized.
- Parameters:
string (str) – molecule format
library (IndigoObject) – monomer library object
- Returns:
query molecule object
- Return type:
IndigoObject
- Raises:
IndigoException – Exception if structure format is incorrect
- loadQueryMoleculeWithLibFromFile(filename, library)[source][source]¶
Loads query molecule from file. Automatically detects input format.
- Parameters:
filename (str) – full path to a file
library (IndigoObject) – monomer library object
- Returns:
loaded query molecular structure
- Return type:
IndigoObject
- Raises:
IndigoException – Exception if structure format is incorrect
- loadSmarts(string)[source][source]¶
Loads query molecule from string in SMARTS format
- Parameters:
string (str) – smarts string
- Returns:
loaded query molecular structure
- Return type:
IndigoObject
- Raises:
IndigoException – Exception if structure format is incorrect
- loadSmartsFromFile(filename)[source][source]¶
Loads query molecule from file in SMARTS format
- Parameters:
filename (str) – full path to the file with smarts strings
- Returns:
loaded query molecular structure
- Return type:
IndigoObject
- Raises:
IndigoException – Exception if structure format is incorrect
- loadMonomerLibrary(string)[source][source]¶
Loads monomer library from ket string
- Parameters:
string (str) – ket
- Returns:
loaded monomer library
- Return type:
IndigoObject
- Raises:
IndigoException – Exception if structure format is incorrect
- loadMonomerLibraryFromFile(filename)[source][source]¶
Loads monomer library from file in ket format
- Parameters:
filename (str) – full path to the file with ket
- Returns:
loaded monomer library
- Return type:
IndigoObject
- Raises:
IndigoException – Exception if structure format is incorrect
- loadKetDocument(string)[source][source]¶
Loads ket document from ket string
- Parameters:
string (str) – ket
- Returns:
loaded ket document
- Return type:
IndigoObject
- Raises:
IndigoException – Exception if structure format is incorrect
- loadKetDocumentFromFile(filename)[source][source]¶
Loads ket document from file in ket format
- Parameters:
filename (str) – full path to the file with ket
- Returns:
loaded ket document
- Return type:
IndigoObject
- Raises:
IndigoException – Exception if structure format is incorrect
- loadSequence(string, seq_type, library)[source][source]¶
Loads molecule from DNA/RNA/PEPTIDE sequence string
- loadSequenceFromFile(filename, seq_type, library)[source][source]¶
Loads query molecule from file in sequence format
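- Parameters:
filename (str) – full path to the file with sequence string
seq_type (str) – sequence type (DNA, RNA or PEPTIDE)
library (IndigoObject) – monomer library object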
- Returns:
loaded query molecular structure
- Return type:
IndigoObject
- Raises:
IndigoException – Exception if structure format is incorrect
- loadFasta(string, seq_type, library)[source][source]¶
Loads molecule from DNA/RNA/PEPTIDE sequence string
- loadFastaFromFile(filename, seq_type, library)[source][source]¶
Loads query molecule from file in sequence format
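- Parameters:
filename (str) – full path to the file with sequence string
seq_type (str) – sequence type (DNA, RNA or PEPTIDE)
library (IndigoObject) – monomer library object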
- Returns:
loaded query molecular structure
- Return type:
IndigoObject
- Raises:
IndigoException – Exception if structure format is incorrect
- loadIdt(string, library)[source][source]¶
Loads molecule from IDT string
- Parameters:
string (str) – sequence string
library (IndigoObject) – monomer library object
- Returns:
loaded query molecular structure
- Return type:
IndigoObject
- Raises:
IndigoException – Exception if structure format is incorrect
- loadIdtFromFile(filename, library)[source][source]¶
Loads query molecule from file in IDT sequence format
- Parameters:
filename (str) – full path to the file with sequence string
library (IndigoObject) – monomer library object
- Returns:
loaded query molecular structure
- Return type:
IndigoObject
- Raises:
IndigoException – Exception if structure format is incorrect
- loadHelm(string, library)[source][source]¶
Loads molecule from HELM string
- Parameters:
string (str) – sequence string
library (IndigoObject) – monomer library object
- Returns:
loaded query molecular structure
- Return type:
IndigoObject
- Raises:
IndigoException – Exception if structure format is incorrect
- loadHelmFromFile(filename, library)[source][source]¶
Loads query molecule from file in HELM sequence format
- Parameters:
filename (str) – full path to the file with sequence string
library (IndigoObject) – monomer library object
- Returns:
loaded query molecular structure
- Return type:
IndigoObject
- Raises:
IndigoException – Exception if structure format is incorrect
- loadReaction(string)[source][source]¶
Loads reaction from string. Format will be automatically recognized.
- Parameters:
string (str) – reaction format
- Returns:
reaction object
- Return type:
IndigoObject
- Raises:
IndigoException – Exception if structure format is incorrect
- loadReactionFromFile(filename)[source][source]¶
Loads reaction from file
- Parameters:
filename (str) – full path to a file
- Returns:
loaded reaction
- Return type:
IndigoObject
- Raises:
IndigoException – Exception if structure format is incorrect
- loadQueryReaction(string)[source][source]¶
Loads query reaction from string. Format will be automatically recognized.
- Parameters:
string (str) – reaction format
- Returns:
query reaction object
- Return type:
IndigoObject
- loadQueryReactionFromFile(filename)[source][source]¶
Loads query reaction from file. Automatically detects input format.
- Parameters:
filename (str) – full path to a file
- Returns:
loaded query reaction object
- Return type:
IndigoObject
- Raises:
IndigoException – Exception if structure format is incorrect
- loadReactionWithLib(string, library)[source][source]¶
Loads reaction from string. Format will be automatically recognized.
- Parameters:
string (str) – reaction format
library (IndigoObject) – monomer library object
- Returns:
reaction object
- Return type:
IndigoObject
- Raises:
IndigoException – Exception if structure format is incorrect
- loadReactionFromFileWithLib(filename, library)[source][source]¶
Loads reaction from file
- Parameters:
filename (str) – full path to a file
library (IndigoObject) – monomer library object
- Returns:
loaded reaction
- Return type:
IndigoObject
- Raises:
IndigoException – Exception if structure format is incorrect
- loadQueryReactionWithLib(string, library)[source][source]¶
Loads query reaction from string. Format will be automatically recognized.
- Parameters:
string (str) – reaction format
library (IndigoObject) – monomer library object
- Returns:
query reaction object
- Return type:
IndigoObject
- loadQueryReactionFromFileWithLib(filename, library)[source][source]¶
Loads query reaction from file. Automatically detects input format.
- Parameters:
filename (str) – full path to a file
library (IndigoObject) – monomer library object
- Returns:
loaded query reaction object
- Return type:
IndigoObject
- Raises:
IndigoException – Exception if structure format is incorrect
- loadReactionSmarts(string)[source][source]¶
Loads query reaction from string in SMARTS format
- Parameters:
string (str) – smarts string
- Returns:
loaded query reaction
- Return type:
IndigoObject
- Raises:
IndigoException – Exception if structure format is incorrect
- loadReactionSmartsFromFile(filename)[source][source]¶
Loads query reaction from file in SMARTS format
- Parameters:
filename (str) – full path to the file with smarts strings
- Returns:
loaded query reaction
- Return type:
IndigoObject
- Raises:
IndigoException – Exception if structure format is incorrect
- loadStructureFromBuffer(structure_data, parameter=None)[source][source]¶
Loads structure object from buffer
- loadFingerprintFromBuffer(buffer)[source][source]¶
Creates a fingerprint from the supplied binary data
- Parameters:
buffer (list) – array of bytes
- Returns:
fingerprint object
- Return type:
IndigoObject
- loadFingerprintFromDescriptors(descriptors, size, density)[source][source]¶
Packs a list of molecule descriptors into a fingerprint object
- createReaction()[source][source]¶
Creates reaction object
- Returns:
reaction object
- Return type:
IndigoObject
- createQueryReaction()[source][source]¶
Creates query reaction object
- Returns:
query reaction object
- Return type:
IndigoObject
- exactMatch(item1, item2, flags='')[source][source]¶
Creates match object for the given structures
- Parameters:
item1 (IndigoObject) – first target structure (molecule or reaction)
item2 (IndigoObject) – second target structure (molecule or reaction)
flags (str) – exact match options. Optional, defaults to “”.
- Returns:
match object
- Return type:
IndigoObject
- clearTautomerRules()[source][source]¶
Clears all tautomer rules
- Returns:
1 if there are no errors
- Return type:
int
- commonBits(fingerprint1, fingerprint2)[source][source]¶
Returns the number of common 1 bits for the given fingerprints
- Parameters:
fingerprint1 (IndigoObject) – first fingerprint object
fingerprint2 (IndigoObject) – second fingerprint object
- Returns:
number of common bits
- Return type:
int
- similarity(item1, item2, metrics='')[source][source]¶
Returns the similarity measure between two structures. Accepts two molecules, two reactions, or two fingerprints.
- Parameters:
item1 (IndigoObject) – molecule, reaction or fingerprint object
item2 (IndigoObject) – molecule, reaction or fingerprint object
metrics (str) – “tanimoto”, “tversky”, “tversky <alpha> <beta>”, “euclid-sub” or “normalized-edit”. Optional; the default empty string selects “tanimoto”.
- Returns:
similarity value for the chosen metric
- Return type:
float
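For orientation, the textbook definitions behind commonBits and the “tanimoto” and “tversky” metrics can be written in plain Python on raw fingerprint bytes. This is a sketch of the formulas, not Indigo's implementation: the function names are hypothetical, and returning 0.0 for two empty fingerprints is a choice made for this example.

```python
def count_bits(fp: bytes) -> int:
    """Number of 1 bits in a fingerprint."""
    return sum(bin(byte).count("1") for byte in fp)


def common_bits(fp1: bytes, fp2: bytes) -> int:
    """Number of positions where both fingerprints have a 1 bit."""
    return sum(bin(a & b).count("1") for a, b in zip(fp1, fp2))


def tversky(fp1: bytes, fp2: bytes, alpha: float, beta: float) -> float:
    """Tversky index: c / (c + alpha * only_in_1 + beta * only_in_2)."""
    c = common_bits(fp1, fp2)
    only1 = count_bits(fp1) - c
    only2 = count_bits(fp2) - c
    denom = c + alpha * only1 + beta * only2
    return c / denom if denom else 0.0  # sketch-specific choice for empty input


def tanimoto(fp1: bytes, fp2: bytes) -> float:
    """Tanimoto coefficient, i.e. Tversky with alpha = beta = 1."""
    return tversky(fp1, fp2, 1.0, 1.0)


fp_a = bytes([0b11110000])
fp_b = bytes([0b11001100])
print(common_bits(fp_a, fp_b))         # 2
print(tversky(fp_a, fp_b, 0.5, 0.5))   # 0.5
```

With alpha = beta = 1 the denominator reduces to a + b - c, which is the usual Tanimoto form. With alpha = beta = 0.5 the same expression gives the Dice coefficient.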
- iterateSDFile(filename)[source][source]¶
Returns iterator for SDF files
- Parameters:
filename (str) – full file path
- Returns:
SD iterator object
- Return type:
IndigoObject
- iterateRDFile(filename)[source][source]¶
Returns iterator for RDF files
- Parameters:
filename (str) – full file path
- Returns:
RD iterator object
- Return type:
IndigoObject
- iterateSmilesFile(filename)[source][source]¶
Returns iterator for smiles files
- Parameters:
filename (str) – full file path
- Returns:
smiles iterator object
- Return type:
IndigoObject
- iterateCMLFile(filename)[source][source]¶
Returns iterator for CML files
- Parameters:
filename (str) – full file path
- Returns:
CML iterator object
- Return type:
IndigoObject
- iterateCDXFile(filename)[source][source]¶
Returns iterator for CDX files
- Parameters:
filename (str) – full file path
- Returns:
CDX iterator object
- Return type:
IndigoObject
- createSaver(obj, format_)[source][source]¶
Creates saver object
- Parameters:
obj (IndigoObject) – output object
format_ (str) – format settings
- Returns:
saver object
- Return type:
IndigoObject
- substructureMatcher(target, mode='')[source][source]¶
Creates substructure matcher
- Parameters:
target (IndigoObject) – target molecule or reaction
mode (str) – substructure mode. Optional, defaults to “”.
- Returns:
substructure matcher
- Return type:
IndigoObject
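A hedged sketch of a substructure check built on substructureMatcher and loadSmarts. It assumes the indigo package is installed and relies on match, an IndigoObject method not documented in this section, which is taken to return None when there is no hit. The helper name contains_substructure is hypothetical.

```python
def contains_substructure(target_smiles: str, query_smarts: str):
    """Return True/False for a substructure hit, or None without Indigo."""
    try:
        from indigo import Indigo
    except ImportError:
        return None  # Indigo toolkit is not installed in this environment
    indigo = Indigo()
    target = indigo.loadMolecule(target_smiles)
    query = indigo.loadSmarts(query_smarts)
    matcher = indigo.substructureMatcher(target)  # default mode ""
    # Assumption: match() yields a match object on success and None otherwise.
    return matcher.match(query) is not None


print(contains_substructure("Cc1ccccc1", "c1ccccc1"))
```

One matcher can be reused for several queries against the same target, which avoids rebuilding it for every call.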
- extractCommonScaffold(structures, options='')[source][source]¶
Extracts common scaffold for the given structures
- Parameters:
structures (IndigoObject) – array object of molecule structures
options (str) – extraction options. Optional, defaults to “”.
- Returns:
scaffold object
- Return type:
IndigoObject
- decomposeMolecules(scaffold, structures)[source][source]¶
Creates deconvolution object for the given structures
- Parameters:
scaffold (IndigoObject) – query molecule object
structures (IndigoObject) – array of molecule structures
- Returns:
deconvolution object
- Return type:
IndigoObject
- rgroupComposition(molecule, options='')[source][source]¶
Creates composition iterator
- Parameters:
molecule (IndigoObject) – target molecule object
options (str) – rgroup composition options. Optional, defaults to “”.
- Returns:
composition iterator
- Return type:
IndigoObject
- getFragmentedMolecule(elem, options='')[source][source]¶
Returns fragmented molecule for the given composition element
- Parameters:
elem (IndigoObject) – composition element object
options (str) – Fragmentation options. Optional, defaults to “”.
- Returns:
fragmented structure object
- Return type:
IndigoObject
- createDecomposer(scaffold)[source][source]¶
Creates deconvolution object for the given scaffold
- Parameters:
scaffold (IndigoObject) – scaffold molecular structure
- Returns:
deconvolution object
- Return type:
IndigoObject
- reactionProductEnumerate(replaced_action, monomers)[source][source]¶
Creates reaction product enumeration iterator
- Parameters:
replaced_action (IndigoObject) – query reaction for the enumeration
monomers (IndigoObject) – array of objects to enumerate
- Returns:
result products iterator
- Return type:
IndigoObject
- transform(reaction, monomers)[source][source]¶
Transforms the given monomers by reaction
- Parameters:
reaction (IndigoObject) – query reaction
monomers (IndigoObject) – array of objects to transform
- Returns:
mapping object
- Return type:
IndigoObject
- loadBuffer(buf)[source][source]¶
Creates scanner object from buffer
- Parameters:
buf (list) – array of bytes
- Returns:
scanner object
- Return type:
IndigoObject
- loadString(string)[source][source]¶
Creates scanner object from string
- Parameters:
string (str) – string with information
- Returns:
scanner object
- Return type:
IndigoObject
- iterateSDF(reader)[source][source]¶
Creates SDF iterator from scanner object
- Parameters:
reader (IndigoObject) – scanner object
- Returns:
SD iterator object
- Return type:
IndigoObject
- iterateSmiles(reader)[source][source]¶
Creates smiles iterator from scanner object
- Parameters:
reader (IndigoObject) – scanner object
- Returns:
smiles iterator object
- Return type:
IndigoObject
- iterateCML(reader)[source][source]¶
Creates CML iterator from scanner object
- Parameters:
reader (IndigoObject) – scanner object
- Returns:
CML iterator object
- Return type:
IndigoObject
- iterateCDX(reader)[source][source]¶
Creates CDX iterator from scanner object
- Parameters:
reader (IndigoObject) – scanner object
- Returns:
CDX iterator object
- Return type:
IndigoObject
- iterateRDF(reader)[source][source]¶
Creates RDF iterator from scanner object
- Parameters:
reader (IndigoObject) – scanner object
- Returns:
RD iterator object
- Return type:
IndigoObject
- iterateTautomers(molecule, params)[source][source]¶
Iterates tautomers for the given molecule
- Parameters:
molecule (IndigoObject) – molecule to find tautomers from
params (str) – tau iteration parameters. “INCHI” or “RSMARTS”. Defaults to “RSMARTS”
- Returns:
molecule iterator object
- Return type:
IndigoObject
- nameToStructure(name, params=None)[source][source]¶
Converts a chemical name into a corresponding structure
- transformHELMtoSCSR(item)[source][source]¶
Transforms HELM to SCSR object
- Parameters:
item (IndigoObject) – object with HELM information
- Returns:
molecule with SCSR object
- Return type:
IndigoObject
- class tooluniverse.chem_tool.ChEMBLTool(tool_config, base_url='https://www.ebi.ac.uk/chembl/api/data')[source][source]¶
Bases:
BaseTool
Tool to search for molecules similar to a given compound name or SMILES using the ChEMBL Web Services API.
- run(arguments)[source][source]¶
Execute the tool.
The default BaseTool implementation accepts an optional arguments mapping to align with most concrete tool implementations which expect a dictionary of inputs.
- get_chembl_id_by_name(compound_name)[source][source]¶
Search ChEMBL for a compound by name and return the ChEMBL ID of the first match.
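As an illustration of the kind of request such a lookup involves, the sketch below only builds URLs against the default base_url and sends nothing. The molecule/search and similarity paths are public ChEMBL Web Services endpoints, but whether ChEMBLTool calls exactly these is an assumption, and both function names are hypothetical.

```python
from urllib.parse import quote, urlencode

# Default base_url taken from the ChEMBLTool signature.
BASE_URL = "https://www.ebi.ac.uk/chembl/api/data"


def molecule_search_url(compound_name: str, base_url: str = BASE_URL) -> str:
    """Build a molecule-search URL for a compound name (no request is sent)."""
    query = urlencode({"q": compound_name, "format": "json"})
    return f"{base_url}/molecule/search?{query}"


def similarity_url(smiles: str, threshold: int, base_url: str = BASE_URL) -> str:
    """Build a similarity URL; the SMILES is one path segment, so "/" is escaped too."""
    return f"{base_url}/similarity/{quote(smiles, safe='')}/{threshold}?format=json"


print(molecule_search_url("aspirin"))
print(similarity_url("CC(=O)O", 70))
```

The SMILES is quoted with safe='' because characters such as “/”, “(” and “=” are common in SMILES and would otherwise change the shape of the path.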