Appendix#
Which descriptors are computed#
The Chemistry Development Kit provides a large set of molecular descriptors, listed in the API documentation of the org.openscience.cdk.qsar.descriptors.molecular package.
The list of these descriptors can be obtained by running the following code:
import re
from urllib.request import urlopen

API_URL = 'https://cdk.github.io/cdk/2.9/docs/api/org/openscience/cdk/qsar/descriptors/molecular/package-summary.html'
# Class links in the package summary look like: href="<ClassName>.html" title="class in ..."
PATTERN = re.compile(r'href="(\w[^"]+)\.html" title="class in')

cdk_descriptors = set()
with urlopen(API_URL) as inf:
    for line in inf.read().decode('utf-8').splitlines():
        if m := PATTERN.search(line):
            cdk_descriptors.add(m.group(1))
cdk_descriptors
{'ALOGPDescriptor',
'APolDescriptor',
'AcidicGroupCountDescriptor',
'AminoAcidCountDescriptor',
'AromaticAtomsCountDescriptor',
'AromaticBondsCountDescriptor',
'AtomCountDescriptor',
'AutocorrelationDescriptorCharge',
'AutocorrelationDescriptorMass',
'AutocorrelationDescriptorPolarizability',
'BCUTDescriptor',
'BPolDescriptor',
'BasicGroupCountDescriptor',
'BondCountDescriptor',
'CPSADescriptor',
'CarbonTypesDescriptor',
'ChiChainDescriptor',
'ChiClusterDescriptor',
'ChiPathClusterDescriptor',
'ChiPathDescriptor',
'EccentricConnectivityIndexDescriptor',
'FMFDescriptor',
'FractionalCSP3Descriptor',
'FractionalPSADescriptor',
'FragmentComplexityDescriptor',
'GravitationalIndexDescriptor',
'HBondAcceptorCountDescriptor',
'HBondDonorCountDescriptor',
'HybridizationRatioDescriptor',
'IPMolecularLearningDescriptor',
'JPlogPDescriptor',
'KappaShapeIndicesDescriptor',
'KierHallSmartsDescriptor',
'LargestChainDescriptor',
'LargestPiSystemDescriptor',
'LengthOverBreadthDescriptor',
'LongestAliphaticChainDescriptor',
'MDEDescriptor',
'MannholdLogPDescriptor',
'MomentOfInertiaDescriptor',
'PetitjeanNumberDescriptor',
'PetitjeanShapeIndexDescriptor',
'RotatableBondsCountDescriptor',
'RuleOfFiveDescriptor',
'SmallRingDescriptor',
'SpiroAtomCountDescriptor',
'TPSADescriptor',
'VABCDescriptor',
'VAdjMaDescriptor',
'WHIMDescriptor',
'WeightDescriptor',
'WeightedPathDescriptor',
'WienerNumbersDescriptor',
'XLogPDescriptor',
'ZagrebIndexDescriptor'}
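To see what the pattern captures without accessing the network, the same extraction can be tried on a few lines of HTML. This is only a sketch: the `SAMPLE_HTML` snippet and the `extract_class_names` helper are hypothetical, written to resemble the links of a Javadoc package summary, and the literal dot before `html` is escaped in the pattern.

```python
import re

# Same idea as the pattern used above, with the dot before "html" escaped.
PATTERN = re.compile(r'href="(\w[^"]+)\.html" title="class in')

# Hypothetical sample resembling a Javadoc package summary (not the real page).
SAMPLE_HTML = '''\
<a href="ALOGPDescriptor.html" title="class in org.openscience.cdk.qsar.descriptors.molecular">ALOGPDescriptor</a>
<a href="package-tree.html">Tree</a>
<a href="TPSADescriptor.html" title="class in org.openscience.cdk.qsar.descriptors.molecular">TPSADescriptor</a>
'''


def extract_class_names(html):
    """Return the set of class names linked in the given HTML text."""
    names = set()
    for line in html.splitlines():
        if m := PATTERN.search(line):
            names.add(m.group(1))
    return names


sample_names = extract_class_names(SAMPLE_HTML)
print(sorted(sample_names))  # ['ALOGPDescriptor', 'TPSADescriptor']
```

The navigation link is skipped because it lacks the `title="class in` attribute that the pattern requires.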
However, not all of these descriptors are computed by the jp²rt package. More precisely, the set of descriptors that are not computed can be obtained as a set difference:
from jp2rt import descriptors
jp2rt_descriptors = set(descriptors())
not_computed = cdk_descriptors - jp2rt_descriptors
not_computed
{'CPSADescriptor',
'GravitationalIndexDescriptor',
'IPMolecularLearningDescriptor',
'LengthOverBreadthDescriptor',
'LongestAliphaticChainDescriptor',
'MomentOfInertiaDescriptor',
'VABCDescriptor',
'WHIMDescriptor'}
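The set difference can also be tried without installing the package. The following sketch uses small stand-in sets (hypothetical subsets of the names listed above, not the real output of `descriptors()`) and additionally checks the opposite difference, which would reveal names known to jp²rt but absent from the CDK list.

```python
# Hypothetical stand-ins: small subsets of the descriptor names listed above.
cdk_sample = {'ALOGPDescriptor', 'TPSADescriptor', 'WHIMDescriptor', 'CPSADescriptor'}
jp2rt_sample = {'ALOGPDescriptor', 'TPSADescriptor'}

# Names provided by CDK but not computed by the package.
not_computed_sample = cdk_sample - jp2rt_sample
# Names the package reports that the CDK list does not contain (ideally empty).
unexpected_sample = jp2rt_sample - cdk_sample

# Sets are unordered, so sort before printing to get a stable listing.
print(sorted(not_computed_sample))  # ['CPSADescriptor', 'WHIMDescriptor']
print(sorted(unexpected_sample))    # []
```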
These descriptors are not computed because their computation returns only NaN values or raises exceptions, as one can easily check with the compute_single_descriptor() function.
import numpy as np
from jp2rt import compute_single_descriptor

# SMILES of tyrosine, used as a test molecule
smiles = 'O=C(O)C(N)CC1=CC=C(O)C=C1'
for descriptor in not_computed:
    print(descriptor, all(np.isnan(f) for f in compute_single_descriptor(descriptor, smiles)))
CPSADescriptor True
GravitationalIndexDescriptor True
LongestAliphaticChainDescriptor True
VABCDescriptor True
WHIMDescriptor True
IPMolecularLearningDescriptor True
MomentOfInertiaDescriptor True
LengthOverBreadthDescriptor True
Apr 04, 2024 10:49:09 AM it.unimi.di.jp2rt.WrappedMolecularDescriptor calculate
WARNING: Ignoring exception during clone/calculate/getValue of LongestAliphaticChainDescriptor, descriptors replaced with 1 NaN
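The check above relies on `all()`, which is vacuously true for an empty sequence. A minimal sketch of a stricter variant follows; the `all_nan` helper is hypothetical (it is not part of jp²rt), it assumes the values are plain floats, and it uses `math.isnan` from the standard library instead of NumPy.

```python
import math


def all_nan(values):
    """Return True only if there is at least one value and every value is NaN."""
    values = list(values)
    return bool(values) and all(math.isnan(v) for v in values)


print(all_nan([float('nan')]))        # True
print(all_nan([float('nan'), 1.5]))   # False
print(all_nan([]))                    # False
```

With this variant, a descriptor that returns no values at all is not mistaken for one that returns only NaN values.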
How this documentation is produced#
This documentation is generated using Jupyter Book; its source is available in the docs directory of the jp²rt repository.
Every code sample (both in Python and shell) is executed during the build of the documentation, so all the output present in the documentation is up-to-date and corresponds exactly to the output produced by the current version of the package.
If you want to run the code of this documentation yourself, you need to install Jupyter Book in addition to the jp²rt package (with plot dependencies included). Otherwise, you can download a precompiled copy of the documentation from the Releases page of the jp²rt repository.
The following table reports the computation time of the various code samples for every section of this documentation.
Document | Modified | Method | Run Time (s) | Status |
---|---|---|---|---|
 | 2024-04-04 10:48 | cache | 155.98 | ✅ |
 | 2024-04-04 10:48 | cache | 35.01 | ✅ |
 | 2024-04-04 10:48 | cache | 0.94 | ✅ |
 | 2024-04-04 10:49 | cache | 7.8 | ✅ |
 | 2024-04-04 10:49 | cache | 2.33 | ✅ |
 | 2024-04-04 10:49 | cache | 3.32 | ✅ |
 | 2024-04-04 10:49 | cache | 8.85 | ✅ |
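As a quick summary of the run times reported above, the following sketch adds them up and shows how much of the total is due to the slowest section. The values are copied by hand from the table; the variable names are only illustrative.

```python
# Run times in seconds, copied from the table above.
run_times = [155.98, 35.01, 0.94, 7.8, 2.33, 3.32, 8.85]

total = sum(run_times)
slowest = max(run_times)

print(f'total: {total:.2f} s')                    # total: 214.23 s
print(f'slowest section: {slowest / total:.0%}')  # slowest section: 73%
```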
Changelog#
You can find the CHANGELOG in the jp²rt repository.