Discovering a new drug candidate, taking it through clinical trials, and bringing it to market is extremely hard, time-consuming, and expensive. Fewer than one out of every 10,000 drug candidates becomes an approved marketed drug. Only three out of every 20 approved drugs bring in enough revenue to cover development costs. Moreover, developing each new drug takes approximately 10-15 years and costs $1-3 billion on average. Computer algorithms can help in this process, for example, by suggesting novel molecules with optimal property profiles. This process is called de novo molecular design. The goal of de novo methods is to create novel molecules with desired properties. It typically comprises three tasks: 1) molecule generation, 2) scoring, and 3) optimization [Schneider and Fechner, 2005]. These steps can be performed sequentially or jointly, by either a human expert or a machine.
Machine learning systems are radically transforming the practice of chemical and molecular sciences [Butler et al., 2018]. Drug discovery is well positioned to be the next frontier for a potential breakthrough. Not surprisingly, recent advances in machine learning methods have also facilitated the automated generation of new molecules with desired properties. Much of this progress is due to deep neural networks, which are now well developed and optimized for continuous signals with naturally defined neighborhoods of their simplest elements. For example, convolutional neural networks group pixels into local neighborhoods in images, and recurrent neural networks group words in natural language. Graphs, however, are a more complex structure with non-uniform neighborhoods, and neural networks for them have been introduced only recently.
While several representations of molecules exist (SMILES, fingerprints, 3D atom configuration), graphs are the most natural one, with a direct mapping of atoms to nodes and bonds to edges. A molecular graph is undirected but has several node types (carbon, oxygen, nitrogen, etc.) as well as edge types (single, double, triple, and aromatic bonds). Other representations are more complicated and lack a clear notion of distance. For example, SMILES (a natural-language-like representation of a molecule in the form of a string) adds a layer of complexity with its grammatical rules; more importantly, a pair of molecules that share a common scaffold (core) can be encoded by very different SMILES strings. We introduce a molecular graph recurrent generative model and show that incremental molecular graph construction seamlessly incorporates the proposed valency-based rejection sampling procedure, which yields 100% valid molecules during inference, while the model also receives a signal from invalid intermediate molecules through our structural penalty.
Once the model is trained in an unsupervised manner to match the distribution of large training datasets, property optimization becomes the step of greatest practical interest. We show that the generated molecules can be optimized towards a specific property range through reinforcement learning, where the reward is constructed from the output of a critic.
We summarize our main contributions as follows:
• molecular graph recurrent model, MolecularRNN, for direct generation of realistic molecular graph structures that shows high validity/uniqueness/novelty
• valency-based rejection sampling method during inference that produces 100% valid molecules, and the structural penalty during training for atoms violating valency constraints
• target property optimization with reinforcement learning for improving drug-likeness, lipophilicity, and melting temperature
• an unprecedented large-scale experimental analysis and application through amalgamation of these techniques.
Various approaches to computational de novo molecule generation have been proposed. The fundamental differences between these approaches lie in how molecules are represented. The most well-studied way to represent a chemical molecule is a simplified molecular-input line-entry system (SMILES) string [Weininger, 1988]. A SMILES string consists of symbols corresponding to the nodes of the molecular graph in depth-first order, unambiguously describing the composition and structure of the molecule. Approaches that generate molecules as SMILES strings use a recurrent neural network to learn a language model of SMILES [Olivecrona et al., 2017, Gómez-Bombarelli et al., 2018, Popova et al., 2018]. Probably the biggest limitation of these methods is imperfect validity (i.e., some of the generated samples are not chemically valid molecules), which stems from the difficulty of learning complex grammatical rules. Another limitation is that SMILES-based approaches cannot be naturally extended to scaffold optimization, where generation starts from a given molecular core and the task is to find a molecule with better properties and a better pattern of substituents while maintaining the same core.
Another way to represent a chemical molecule is through its molecular graph. Graph-based approaches typically do not suffer from the problem of invalid generated molecules. It is also possible to enforce physical constraints on the valency, i.e., how many neighbors each atom can have depending on the atom type. Moreover, these models are more interpretable and more intuitive to chemists. Various algorithms for generating molecular graphs have been developed [Jin et al., 2018, Li et al., 2018a]. Jin et al. [2018] proposed a junction tree variational autoencoder. This model first generates a junction tree in which every node corresponds to a structural fragment rather than a single atom. The junction tree is then converted into a valid molecule with a sampling procedure. This approach produces valid molecules by design; however, the sampling step makes the conversion of a generated junction tree into a molecule ambiguous. While this is not a problem for the unconstrained generative process, it may cause difficulties with property optimization, because molecules with the same junction tree may differ drastically in property value. Jin et al. [2018] argue that it is beneficial to generate a graph from fragments; however, atom-by-atom models have already proven to be a strong baseline [You et al., 2018a, Liu et al., 2018, Li et al., 2018b]. In [Li et al., 2018a] the process of graph generation is sequential: nodes are generated one at a time and then connected to the existing partial graph. With a sequential process, the same graph can be generated by multiple sequences of steps, one for each node ordering, and this work does not address the problem of node order permutation. Another limitation of this work is the constraint on graph size: only molecular graphs with at most 20 heavy atoms were considered, which is too small for most practical purposes.
In [You et al., 2018a], molecular graph generation is formulated as a Markov decision process. The model uses a graph convolutional network (GCN) for goal-directed graph generation with reinforcement learning and adversarial training. This work, like [Jin et al., 2018], reports only the top 3 molecules, which may not represent model performance as well as the distribution of a property obtained from a large number of generated samples. Recently, the GraphRNN model [You et al., 2018b] was proposed for the generation of undirected graphs. We extend this model to include node and edge type predictions.
Previous works have explored a variety of methods for optimizing molecular properties of interest: fine-tuning [Olivecrona et al., 2017], transfer learning [Segler et al., 2017], reinforcement learning [Popova et al., 2018], and adversarial training [Kadurin et al., 2017]. Physicochemical properties such as the octanol-water partition coefficient (logP), molecular weight (MW), and melting temperature ($T_m$) are often used as convenient proxies for drug-likeness and for the chances that a particular molecule becomes a successful drug candidate.
The core of our approach is the MolecularRNN model, which extends the GraphRNN model [You et al., 2018b] to generate graphs with node and edge types. Section 3.1 gives background on GraphRNN, and the extension is described in Section 3.2. We introduce a method of valency-based rejection sampling in Section 3.3 that yields 100% validity in inference mode. We show a distribution shift towards desired property values with reinforcement learning in Section 3.4. Finally, in Section 3.5 we introduce our structural penalty, which provides a signal from invalid samples during training.
3.1 Background: GraphRNN model
GraphRNN [You et al., 2018b] was introduced for the generation of undirected graphs $G = (V, E)$ with a set of $n$ nodes $V = \{v_1, \dots, v_n\}$ and a set of undirected edges $E = \{(v_i, v_j)\}$ between those nodes. Under some node ordering $\pi$, this graph is represented by its adjacency matrix $A^\pi \in \{0, 1\}^{n \times n}$ with $A^\pi_{i,j} = \mathbb{1}[(\pi(v_i), \pi(v_j)) \in E]$. The model generates graphs as sequences of adjacency vectors $S^\pi_i$ that describe the edges from node $\pi(v_i)$ to the previous nodes under $\pi$. Thus, $S^\pi_i = (A^\pi_{1,i}, \dots, A^\pi_{i-1,i})^T$, and the likelihood $p(S^\pi)$ can be modelled sequentially, being decomposed as

$$p(S^\pi) = \prod_{i=1}^{n+1} p(S^\pi_i \mid S^\pi_{<i}) \qquad (1)$$

with the special end-of-sequence token (EOS) as an extra node $n + 1$, where $S^\pi_{<i} = (S^\pi_1, \dots, S^\pi_{i-1})$.

The state-transition function carries the information from step $i - 1$ to step $i$, generating a node, and the output function predicts the parameters for sampling the current adjacency vector $S^\pi_i$ of edges. Following GraphRNN, we use recurrent neural networks for both the state-transition function (NodeRNN) and the output function (EdgeRNN). Thus, NodeRNN unrolls across nodes, updating its hidden state, while EdgeRNN unrolls across the edges from node $i$ to the previous nodes, producing the sampling parameters through a small MLP head with sigmoid activation, which models $S^\pi_i$ as a dependent Bernoulli sequence:

$$p(S^\pi_i \mid S^\pi_{<i}) = \prod_{j=1}^{i-1} p(S^\pi_{i,j} \mid S^\pi_{i,<j}, S^\pi_{<i}) \qquad (2)$$

One of the key insights of the method is to re-order the nodes with breadth-first search (BFS), starting from $\pi(v_1)$, which reduces the space complexity of graph representations. Moreover, the BFS order also reduces the number of edge predictions that have to be made, limiting the size of $S^\pi_i$ to $M$ dimensions, where $M$ turns out to be a small number in practical tasks. For our modified MolecularRNN (Section 3.2) we empirically establish $M = 12$.
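The effect of BFS ordering on the length of the adjacency vectors can be illustrated with a small sketch. This is not the GraphRNN implementation: the function names (`bfs_order`, `bandwidth`, `adjacency_vectors`) and the six-membered ring example are hypothetical, and the graph is stored as a plain dictionary of neighbor sets.

```python
from collections import deque


def bfs_order(adj, start=0):
    """Return the nodes of a connected graph in breadth-first order."""
    order, seen, queue = [], {start}, deque([start])
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in sorted(adj[u]):  # sorted only to make the order deterministic
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return order


def bandwidth(adj, order):
    """Largest distance in the ordering between two connected nodes."""
    pos = {node: k for k, node in enumerate(order)}
    return max(abs(pos[u] - pos[v]) for u in adj for v in adj[u])


def adjacency_vectors(adj, order, M):
    """For each node after the first, list its edges to at most M predecessors.

    Entry 0 of each vector refers to the nearest predecessor in the ordering.
    """
    vectors = []
    for i in range(1, len(order)):
        window = order[max(0, i - M):i]
        vectors.append([int(prev in adj[order[i]]) for prev in reversed(window)])
    return vectors


# Hypothetical example: a six-membered ring 0-1-2-3-4-5-0.
ring = {k: {(k - 1) % 6, (k + 1) % 6} for k in range(6)}
order = bfs_order(ring)
vectors = adjacency_vectors(ring, order, M=2)
```

In this example the ring-closing edge spans five positions under the naive ordering 0..5 but only two under the BFS ordering, so vectors of length 2 are enough to encode every edge.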
3.2 MolecularRNN
To represent a molecule as a graph, atoms are mapped to nodes and bonds are mapped to edges. Adjacency vector entries now represent categorical bond types $S^\pi_{i,j} \in \{0, 1, 2, 3\}$, corresponding to no, single, double, and triple bonds (molecules are modeled in kekulized form as defined in RDKit [Landrum et al., 2006]). Similarly, a categorical type $C^\pi_i$ (oxygen, nitrogen, chlorine, etc.) is assigned to each node. Notice that here a node always has a valid atom class. That is, there is no "terminal node" class, as the notion of a terminal node is already incorporated into $S^\pi_i$. Specifically, a generated node that has no edges to any of the previous nodes is terminal. Atom class prediction is ignored for this node in our setting.

The likelihood in Equation 1 is rewritten accordingly for MolecularRNN:

$$p(S^\pi, C^\pi) = \prod_{i=1}^{n+1} p(C^\pi_i \mid S^\pi_{<i}, C^\pi_{<i}) \, p(S^\pi_i \mid C^\pi_i, S^\pi_{<i}, C^\pi_{<i}) \qquad (3)$$

with $S^\pi_{n+1} = \mathbf{0}$ for the terminal node $n + 1$. Here $S^\pi_{<i}$ and $C^\pi_{<i}$ denote the adjacency vectors and atom types of the first $i - 1$ nodes.

In our model, once the sub-graph on the first $i - 1$ nodes under permutation $\pi$ is completed, NodeRNN can immediately decide on the atom type of the following node $i$. Thus, the process represents a dependent multivariate distribution. Accounting for the sub-graph as well as the $i$-th atom type, the model switches to EdgeRNN, which links the newly generated node to the set $\{\pi(v_1), \dots, \pi(v_{i-1})\}$. That step is in turn modeled with a dependent multivariate distribution, as EdgeRNN is unrolled across the nodes that precede $i$. The overall MolecularRNN structure is shown in Figure 1. The model uses embeddings for categorical inputs, and a two-layer MLP with softmax output activation is added on top of the hidden states for categorical prediction, so Equation 2 is modified:

$$p(S^\pi_i \mid C^\pi_i, S^\pi_{<i}, C^\pi_{<i}) = \prod_{j=1}^{i-1} p(S^\pi_{i,j} \mid S^\pi_{i,<j}, C^\pi_i, S^\pi_{<i}, C^\pi_{<i}) \qquad (4)$$
In our BFS ordering, the first node is always a carbon atom, since every organic molecule contains at least one such atom.
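The typed molecular graph described in this section can be sketched as a small data structure. This is an illustrative sketch only: the `MolGraph` class, its method names, and the acetic acid example are hypothetical and not part of the model code. Bond types follow the kekulized convention above, so aromatic bonds do not appear.

```python
from dataclasses import dataclass, field

# Bond types as integer orders: 0 = no bond, 1 = single, 2 = double, 3 = triple.
BOND_NAMES = {1: "single", 2: "double", 3: "triple"}


@dataclass
class MolGraph:
    atoms: list = field(default_factory=list)   # node types, e.g. "C", "O"
    bonds: dict = field(default_factory=dict)   # frozenset({i, j}) -> bond order

    def add_atom(self, symbol):
        """Append a node and return its index."""
        self.atoms.append(symbol)
        return len(self.atoms) - 1

    def add_bond(self, i, j, order):
        """Add an undirected typed edge between two distinct atoms."""
        if i == j or order not in BOND_NAMES:
            raise ValueError("invalid bond")
        self.bonds[frozenset((i, j))] = order

    def bond_order_sum(self, i):
        """Total bond order currently attached to atom i."""
        return sum(o for pair, o in self.bonds.items() if i in pair)


# Hypothetical example: heavy atoms of acetic acid, CC(=O)O.
mol = MolGraph()
c0, c1, o2, o3 = (mol.add_atom(s) for s in ["C", "C", "O", "O"])
mol.add_bond(c0, c1, 1)
mol.add_bond(c1, o2, 2)
mol.add_bond(c1, o3, 1)
```

Storing each bond under an unordered pair keeps the graph undirected, and the running bond-order sum per atom is the quantity that a valency check needs.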
3.3 Valency-based rejection sampling
As described above, MolecularRNN samples edge types at each sub-step from a multinomial distribution whose parameters come from softmax predictions. Even when the model is trained well to produce valid molecules, the softmax output is nonzero for every class, so with enough samples any graph in the support can be produced. However, real molecules must satisfy valency constraints: the total bond order at each atom cannot exceed the valency allowed for its type. Consequently, at each step we can ensure that the current sum of bond orders does not exceed the allowed valency. When generating an edge corresponding to a bond of order $k$ between atoms $i$ and $j$, we check the rejection sampling constraint for both atoms:

$$\sum_{l} A^\pi_{i,l} \le \text{valency}(C^\pi_i) \quad \text{and} \quad \sum_{l} A^\pi_{l,j} \le \text{valency}(C^\pi_j) \qquad (5)$$

where $A^\pi_{i,l}$ is the order of the bond between atoms $i$ and $l$ (including the candidate bond of order $k$) and $C^\pi_i$ is the type of atom $i$.
Figure 1: MolecularRNN model. The model consists of NodeRNN that unrolls across atoms, predicting the type of the next atom in the molecular graph, and EdgeRNN that for every atom is initialized with NodeRNN hidden state, and unrolls across preceding atoms to predict bond types.
For the final molecule, atoms that have not filled up their valencies are completed with hydrogens. Notice that valency can be enforced directly only for graphs, unlike the SMILES representation, where intermediate sub-strings are not chemically meaningful.
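A minimal sketch of the rejection step for a single candidate bond is given below. It is not the authors' implementation: the valency table is a simplified assumption (P and S are given one fixed maximum although they have several valence states), and `sample_bond`, `implicit_hydrogens`, and the probability vector are hypothetical names and values.

```python
import random

# Assumed maximum valencies; a simplification for illustration only.
MAX_VALENCY = {"C": 4, "N": 3, "O": 2, "F": 1, "P": 5, "S": 6,
               "Cl": 1, "Br": 1, "I": 1}


def sample_bond(probs, atom_i, used_i, atom_j, used_j, rng, max_tries=1000):
    """Sample a bond order 0..3 and reject draws that would break valency.

    probs  -- softmax output over bond orders (no, single, double, triple)
    used_* -- sum of bond orders already attached to each atom
    """
    orders = range(len(probs))
    for _ in range(max_tries):
        k = rng.choices(orders, weights=probs)[0]
        if (used_i + k <= MAX_VALENCY[atom_i]
                and used_j + k <= MAX_VALENCY[atom_j]):
            return k
    return 0  # "no bond" never violates the constraint


def implicit_hydrogens(atom, used):
    """Hydrogens added to the final molecule to fill the remaining valency."""
    return MAX_VALENCY[atom] - used


rng = random.Random(0)
probs = [0.1, 0.2, 0.3, 0.4]  # hypothetical model output
# Oxygen already holds one single bond, so only orders 0 and 1 can be accepted.
draws = [sample_bond(probs, "O", 1, "C", 2, rng) for _ in range(200)]
```

Because order 0 is always admissible, the loop accepts a sample with probability one whenever the model assigns nonzero mass to "no bond", which is always the case for a softmax output.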
3.4 Property optimization
While generating realistic molecules is an appealing goal, our ultimate aim is to shift the distribution of the generated samples towards a desired property. To optimize the chosen property, we use the policy gradient algorithm. In this formulation, MolecularRNN acts as a policy network that outputs the probability of the next action given the current state. The set of actions is defined as the set of atom labels times the set of possible combinations of connections between the generated atom and the existing graph. The set of states is defined as all possible sub-graphs of graphs with up to a fixed number of $N$ nodes. Consistent with the BFS ordering in MolecularRNN, the initial state is a graph of a single carbon atom. The set of final states is defined as the set of all graphs that correspond to a valid molecule with up to $N$ heavy atoms. The reward $r(s_N)$ for a final state $s_N$ (without loss of generality, $s_N$ is used even if $n < N$ in the generated graph) is calculated with a critic. We distribute the final reward over all intermediate steps with a discounting factor, which gave more stable convergence in our experiments. Thus, intermediate rewards $r(s_i)$ are obtained by discounting the final reward with a fixed factor $\gamma$. The transition probabilities $p(s_i \mid s_{i-1}; \theta)$, where $\theta$ denotes the parameters of MolecularRNN, are the elements of the product in Equation 3. Given those, we can write down the loss function for the policy gradient optimization algorithm by Williams [1987], which is designed to maximize the expected reward:

$$L(\theta) = -\sum_{i=1}^{N} r(s_i) \log p(s_i \mid s_{i-1}; \theta) \qquad (6)$$
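The discounted policy gradient loss can be sketched numerically as follows. The exponent convention is an assumption made for this sketch: step $i$ of $N$ receives $\gamma^{N-i}$ times the final reward, so the last step gets the undiscounted reward. The function names and the numbers in the example are hypothetical.

```python
import math


def discounted_rewards(final_reward, n_steps, gamma):
    """Spread the final reward over steps 1..n_steps.

    Assumed convention: step i receives gamma ** (n_steps - i) * final_reward.
    """
    return [final_reward * gamma ** (n_steps - i) for i in range(1, n_steps + 1)]


def policy_gradient_loss(step_log_probs, final_reward, gamma):
    """Negative reward-weighted log-likelihood of one generated trajectory."""
    rewards = discounted_rewards(final_reward, len(step_log_probs), gamma)
    return -sum(r * lp for r, lp in zip(rewards, step_log_probs))


# Hypothetical trajectory of three steps, each taken with probability 0.5.
log_probs = [math.log(0.5)] * 3
loss = policy_gradient_loss(log_probs, final_reward=2.0, gamma=1.0)
```

Minimizing this loss with a positive reward raises the log-probability of the sampled steps, and a larger reward raises it more strongly. In a training loop the log-probabilities would come from the model and carry gradients.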
3.5 Structural penalty
Valency-based rejection sampling can be used in inference, as described above. However, the invalid intermediate structures obtained during training can provide a useful signal to the model. For example, a molecule can be almost realistic except for a few invalid bonds. We introduce an additional structural penalty for the atoms that violate their valencies. Thus, instead of penalizing the whole molecule, we target specific atoms, which steers the parameter updates towards respecting valency constraints.
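The per-atom nature of the penalty can be sketched as follows. The penalty value of -1.0, the valency table, and the function name `atom_penalties` are placeholders chosen for illustration, not values from the experiments.

```python
def atom_penalties(atoms, bonds, max_valency, penalty=-1.0):
    """Return one reward term per atom: `penalty` if its valency is exceeded.

    atoms -- list of element symbols
    bonds -- dict mapping (i, j) index pairs to integer bond orders
    """
    used = [0] * len(atoms)
    for (i, j), order in bonds.items():
        used[i] += order
        used[j] += order
    return [penalty if used[k] > max_valency[atoms[k]] else 0.0
            for k in range(len(atoms))]


# Hypothetical invalid fragment C=O-C: the oxygen carries bond order 3.
valency = {"C": 4, "O": 2}  # assumed maximum valencies
penalties = atom_penalties(["C", "O", "C"], {(0, 1): 2, (1, 2): 1}, valency)
```

Only the oxygen is penalized here, so the learning signal is attached to the generation steps that created the offending atom rather than to the molecule as a whole.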
To validate the quality of our results and compare them to state-of-the-art methods, we use validity, uniqueness, novelty, internal diversity, synthetic accessibility score (SA score) [Ertl and Schuffenhauer, 2009], and drug-likeness score (QED) [Bickerton et al., 2012]. Validity is the percentage of chemically valid molecules. Uniqueness is the percentage of unique molecules in the generated pool. Notice that uniqueness is highly dependent on the pool size and may drop significantly for a large generated library. In our experiments, we report uniqueness for up to a million generated samples. Internal diversity, as proposed in the MOSES benchmark [Polykovskiy et al., 2018], is a quantitative metric of the richness of the generated library and is calculated as the average pairwise distance between all pairs of molecules in the library. SA score is an estimate of how hard it is to synthesize a given molecule, which also reflects its structural complexity. Molecules with a higher score are more complex and harder to synthesize. However, molecules with a very low score might not be complex enough to have the desired property. The values of interest for this metric are in the range between 2 and 4. Finally, QED is a measure of drug-likeness in the range from 0 to 1.
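The set-based metrics can be sketched in a few lines. Two assumptions are made here that the text does not state: novelty is taken as the fraction of unique generated molecules absent from the training set, and the pairwise distance is the Tanimoto distance between fingerprint bit sets. Molecules are represented by canonical strings and fingerprints by plain Python sets, both hypothetical stand-ins.

```python
from itertools import combinations


def uniqueness(generated):
    """Fraction of distinct molecules in the generated pool."""
    return len(set(generated)) / len(generated)


def novelty(generated, training):
    """Fraction of unique generated molecules not seen in training (assumed definition)."""
    unique = set(generated)
    return len(unique - set(training)) / len(unique)


def tanimoto_distance(a, b):
    """1 - |a & b| / |a | b| for two sets of fingerprint bits."""
    union = len(a | b)
    return 0.0 if union == 0 else 1.0 - len(a & b) / union


def internal_diversity(fingerprints):
    """Average distance over all unordered pairs in the library."""
    pairs = list(combinations(fingerprints, 2))
    return sum(tanimoto_distance(a, b) for a, b in pairs) / len(pairs)


generated = ["CCO", "CCO", "CCN", "c1ccccc1"]   # hypothetical canonical strings
training = ["CCO"]
fps = [{1, 2, 3}, {2, 3, 4}, {5}]               # hypothetical fingerprint bits
```

Uniqueness computed this way depends on the pool size, which is why the number of generated samples is reported alongside it.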
4.1 Unsupervised likelihood training
We first pretrain MolecularRNN on a large unlabeled dataset of molecules to teach the model to generate diverse, realistic samples. Three training datasets are used: ChEMBL [Gaulton et al., 2011], 250k random molecules from ZINC [Irwin and Shoichet, 2005], and MOSES [Polykovskiy et al., 2018]. These datasets differ in their statistics, which are shown in Table 1. The ChEMBL dataset contains around 1.5 million real bioactive molecules (every molecule has at least one experimental bioactivity measurement) and is the most diverse of the three datasets that we considered. The ZINC 250k dataset contains 250 thousand molecules randomly selected from a database of commercially available compounds [Irwin and Shoichet, 2005]. The MOSES dataset contains almost 2 million molecules selected from the ZINC database with several filters that retain only molecules with drug-like properties.
Table 1: Statistics for training datasets.
We considered the 9 most common elements (C, N, O, F, P, S, Cl, Br, I) and 3 bond types (single, double, and triple). The number of atoms in a molecule is restricted to the range from 10 to 50, chosen based on the ChEMBL dataset, where 96% of molecules lie in this range. EdgeRNN is unrolled (as discussed in Section 3.1) for M = 12 steps for each atom. The following architectural parameters are used in all our settings: node embedding of size 128, edge embedding of size 16, NodeRNN with 4 GRU layers of hidden size 256 each, a two-layer NodeMLP with hidden size 128 and ReLU nonlinearity after the first layer, and EdgeRNN with 4 GRU layers of hidden size 128 each. During the unsupervised phase, models are trained with the Adam optimizer for 250 epochs on 4 GPUs with a per-GPU batch size of 512. The starting learning rate is 0.001 with a multiplicative drop of 0.999 every k iterations, and k is chosen based on the dataset so that the learning rate drops to end of the training. MolecularRNN trained with likelihood maximization on the training datasets achieves a validity rate of 65% without valency-based rejection sampling. We further used the structural penalty described in Section 3.5 to shift the model towards generating molecules that respect valency constraints. To that end, every atom that violates its valency constraints is assigned a penalty of
, and then the model is optimized with the policy gradient method. After training with the structural penalty, our model achieves a validity rate of 90% without valency-based rejection sampling. Enabling valency-based rejection sampling results in a 100% validity rate for all models.
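The choice of the interval k for the stepwise learning-rate decay can be worked out from the total number of training iterations. In the sketch below, the starting rate 0.001 and the factor 0.999 come from the text, while the final learning rate and the iteration count are placeholder values chosen only for illustration; the function names are hypothetical.

```python
import math


def decay_interval(total_iters, lr_start, lr_end, factor=0.999):
    """Number of iterations between drops so that lr_end is reached by the end.

    Solves lr_start * factor ** (total_iters / k) = lr_end for k and rounds
    down, so the final learning rate is at or slightly below lr_end.
    """
    n_drops = math.log(lr_end / lr_start) / math.log(factor)
    return max(1, int(total_iters / n_drops))


def lr_at(iteration, lr_start, k, factor=0.999):
    """Learning rate after `iteration` steps with a drop every k iterations."""
    return lr_start * factor ** (iteration // k)


# Placeholder numbers: 100,000 iterations and a final learning rate of 1e-5.
k = decay_interval(total_iters=100_000, lr_start=1e-3, lr_end=1e-5)
final_lr = lr_at(100_000, 1e-3, k)
```

Since a larger dataset has more iterations per epoch, the same number of epochs and the same final learning rate lead to a larger k, which is why k depends on the dataset.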
Table 2: Statistics for 1 million molecules generated by 3 models pretrained on 3 training datasets
Table 2 summarizes the results of unsupervised likelihood training of MolecularRNN on the three datasets. Statistics are calculated on 1 million generated graphs, which is a much larger scale than previously reported. For comparison, Jin et al. [2018] sample 5 thousand graphs, and Li et al. [2018a] evaluate a set of 100 thousand. In all cases, the model produces novel, diverse, and realistic molecules.
We also compare our model with GCPN [You et al., 2018a] and JT-VAE [Jin et al., 2018] in Table 3 on 30K molecules generated from each method. MolecularRNN produces comparable results to the baselines in terms of validity, uniqueness, and novelty. GCPN tends to generate overly complex, hard to synthesize molecules (high SA score). Samples from our model are more realistic, and also have higher internal diversity than the ones from JT-VAE.
Table 3: Comparison of MolecularRNN, GCPN [You et al., 2018a] and JT-VAE [Jin et al., 2018]. Models are trained on ZINC 250k dataset. Statistics are calculated for 30000 generated molecules.
4.2 Property optimization with reinforcement learning
We performed experiments on optimizing the properties of generated molecules, starting from our strong pretrained model and using the policy gradient algorithm (Section 3.4). We choose maximization of penalized logP as defined in [Jin et al., 2018] and of QED [Bickerton et al., 2012], starting from MolecularRNN likelihood-pretrained on the ZINC 250k dataset. We also performed an additional experiment with maximization of melting temperature. To our knowledge, such an analysis has not been reported for graph-based generative models before. This is an appealing exercise because it requires training an additional model for melting temperature prediction, whereas logP and QED can be computed directly from the molecular graph structure. This experiment mimics a realistic drug discovery scenario, where toxicity or bioactivity is optimized, and paves the way for further research in this important direction.
Penalized logP and QED maximization. As in [You et al., 2018a, Jin et al., 2018], we independently maximize two properties – penalized logP and QED score. MolecularRNN is tuned for 300 iterations with a generated batch size of 512 and Adam optimizer with a constant learning rate of . The objective function in Equation 6 maximizes the following rewards:
We use discount factor . The top 3 scores after optimization for both properties are shown in Table 4 and demonstrate the distribution shift. In this experiment, our model outperforms all baselines in both tasks. The top 3 molecules are shown in Figure 2. Samples with high logP values are very realistic, as the model learned to grow a chain of aromatic rings that would bind very strongly to a lipid membrane (high lipophilicity). This indicates that the model learned some of the underlying physics relating molecular structure to properties.
Table 4: Comparison of the top 3 scores for penalized logP and QED.
Figure 2: Top 3 molecules for MolecularRNN optimized with policy gradient
Figure 3: Distribution of maximized QED for MolecularRNN and GCPN.
We took a step further and looked not only at the molecules with the top 3 scores but also at the full distribution of the maximized QED for libraries generated with our MolecularRNN and with GCPN [You et al., 2018a], the best baseline. We argue that reporting only the top 3 scores is not the most informative benchmarking metric, since the top 3 may not reflect the real performance of the model. Instead, we encourage reporting the statistics of the optimized distribution. Figure 3 shows that MolecularRNN shifts the distribution farther towards the maximum values of QED than GCPN does.
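The difference between top-3 reporting and distribution statistics can be seen in a toy comparison. The two score lists below are invented for illustration and are not experimental results; `summarize` is a hypothetical helper.

```python
import statistics


def summarize(scores, top_k=3):
    """Top-k scores together with summary statistics of the whole library."""
    ordered = sorted(scores, reverse=True)
    return {
        "top_k": ordered[:top_k],
        "mean": statistics.fmean(scores),
        "median": statistics.median(scores),
        "stdev": statistics.pstdev(scores),
    }


# Invented QED-like scores for two hypothetical libraries.
library_a = [0.9, 0.9, 0.9, 0.2, 0.2, 0.2]
library_b = [0.9, 0.9, 0.9, 0.8, 0.8, 0.8]
summary_a = summarize(library_a)
summary_b = summarize(library_b)
```

Both libraries have identical top-3 scores, yet their means and medians differ substantially, which only the distribution-level summary reveals.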
Melting temperature maximization. We train a graph convolution regression model introduced in [Kipf and Welling, 2016] to predict the melting point of a molecule. The training and test sets contain 37,940 and 9,458 molecules, respectively, with $T_m$ ranging from
to
. The model has 4 layers with hidden sizes of 128. We use Adam optimizer, starting with a learning rate of 0.001 and exponential decay with
after every epoch. The model is trained with a batch size of 32 for 30 epochs. The model converges to RMS error of
, which is comparable to the state of the art on the same dataset [Tetko et al., 2014]. This model is then used to assign a reward function
, where
is the normalized predicted melting temperature for a molecule.
For this experiment, we used the model pretrained on the ChEMBL dataset and optimized it with the same settings as in the previous experiments: 300 iterations with a batch size of 512 and the Adam optimizer with a constant learning rate of . Figure 4a shows the relative distribution shift of the predicted property between molecules sampled from the pretrained model and molecules sampled from the optimized model. Examples of generated molecules with predicted values of
are shown in Figure 4b. Interestingly, in this experiment, MolecularRNN rediscovered two known chemical phenomena. First, fusing multiple aromatic rings significantly increases the
. Second, the
and heterocyclic nitrogens make molecules more polar. This usually enhances dipole-dipole interactions and subsequently increases $T_m$.
Figure 4: Melting temperature maximization
We proposed MolecularRNN, a model for generating realistic molecular graphs. MolecularRNN learns diverse distributions through unsupervised pretraining and generates 100% valid molecules in inference, while still receiving negative feedback from invalid ones during training. Combined with policy gradient optimization, MolecularRNN addresses the problem of generating molecules with desired properties. Optimized MolecularRNN outperforms other state-of-the-art methods on the benchmark tasks. Furthermore, we use a predictive model as a critic to optimize melting temperature, a property that cannot be calculated directly from a molecular graph. Future work will address multi-objective property optimization and completion of a molecular graph from a given scaffold.
O.I. acknowledges support from DOD-ONR (N00014-16-1-2311), National Science Foundation (NSF CHE-1802789), and Eshelman Institute for Innovation (EII) awards. M.P. acknowledges The Molecular Sciences Software Institute (MolSSI) Software Fellowship and NVIDIA Graduate Fellowship. We gratefully acknowledge the support and hardware donation from NVIDIA Corporation.
Gisbert Schneider and Uli Fechner. Computer-based de novo design of drug-like molecules. Nature Reviews Drug Discovery, 4(8):649, 2005.
Keith T Butler, Daniel W Davies, Hugh Cartwright, Olexandr Isayev, and Aron Walsh. Machine learning for molecular and materials science. Nature, 559(7715):547, 2018.
David Weininger. Smiles, a chemical language and information system. 1. introduction to methodology and encoding rules. Journal of chemical information and computer sciences, 28(1):31–36, 1988.
Marcus Olivecrona, Thomas Blaschke, Ola Engkvist, and Hongming Chen. Molecular de-novo design through deep reinforcement learning. Journal of cheminformatics, 9(1):48, 2017.
Rafael Gómez-Bombarelli, Jennifer N. Wei, David Duvenaud, José Miguel Hernández-Lobato, Benjamín Sánchez-Lengeling, Dennis Sheberla, Jorge Aguilera-Iparraguirre, Timothy D. Hirzel, Ryan P. Adams, and Alán Aspuru-Guzik. Automatic chemical design using a data-driven continuous representation of molecules. ACS Central Science, 4(2):268–276, 2018. doi: 10.1021/acscentsci.7b00572. URL https://doi.org/10.1021/acscentsci.7b00572.
Mariya Popova, Olexandr Isayev, and Alexander Tropsha. Deep reinforcement learning for de novo drug design. Science advances, 4(7):eaap7885, 2018.
Wengong Jin, Regina Barzilay, and Tommi Jaakkola. Junction tree variational autoencoder for molecular graph generation. arXiv preprint arXiv:1802.04364, 2018.
Yujia Li, Oriol Vinyals, Chris Dyer, Razvan Pascanu, and Peter Battaglia. Learning deep generative models of graphs. arXiv preprint arXiv:1803.03324, 2018a.
Jiaxuan You, Bowen Liu, Zhitao Ying, Vijay Pande, and Jure Leskovec. Graph convolutional policy network for goal-directed molecular graph generation. In Advances in Neural Information Processing Systems, pages 6410–6421, 2018a.
Qi Liu, Miltiadis Allamanis, Marc Brockschmidt, and Alexander Gaunt. Constrained graph variational autoencoders for molecule design. In Advances in Neural Information Processing Systems, pages 7795–7804, 2018.
Yibo Li, Liangren Zhang, and Zhenming Liu. Multi-objective de novo drug design with conditional graph generative model. Journal of cheminformatics, 10(1):33, 2018b.
Jiaxuan You, Rex Ying, Xiang Ren, William L Hamilton, and Jure Leskovec. Graphrnn: Generating realistic graphs with deep auto-regressive models. arXiv preprint arXiv:1802.08773, 2018b.
Marwin HS Segler, Thierry Kogej, Christian Tyrchan, and Mark P Waller. Generating focused molecule libraries for drug discovery with recurrent neural networks. ACS central science, 4(1): 120–131, 2017.
Artur Kadurin, Sergey Nikolenko, Kuzma Khrabrov, Alex Aliper, and Alex Zhavoronkov. drugan: an advanced generative adversarial autoencoder model for de novo generation of new molecules with desired molecular properties in silico. Molecular pharmaceutics, 14(9):3098–3104, 2017.
Greg Landrum et al. Rdkit: Open-source cheminformatics, 2006.
R. Williams. A class of gradient-estimation algorithms for reinforcement learning in neural networks. In International Conference on Neural Networks, 1987.
Peter Ertl and Ansgar Schuffenhauer. Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions. Journal of cheminformatics, 1(1):8, 2009.
G Richard Bickerton, Gaia V Paolini, Jérémy Besnard, Sorel Muresan, and Andrew L Hopkins. Quantifying the chemical beauty of drugs. Nature chemistry, 4(2):90, 2012.
Daniil Polykovskiy, Alexander Zhebrak, Benjamin Sanchez-Lengeling, Sergey Golovanov, Oktai Tatanov, Stanislav Belyaev, Rauf Kurbanov, Aleksey Artamonov, Vladimir Aladinskiy, Mark Veselov, et al. Molecular sets (moses): A benchmarking platform for molecular generation models. arXiv preprint arXiv:1811.12823, 2018.
A. Gaulton, L.J. Bellis, A.P. Bento, J. Chambers, M. Davies, A. Hersey, Y. Light, S. McGlinchey, D. Michalovich, B. Al-Lazikani, and J.P. Overington. Chembl: a large-scale bioactivity database for drug discovery. Nucleic acids research, 40(D1):D1100–D1107, 2011.
John J Irwin and Brian K Shoichet. Zinc - a free database of commercially available compounds for virtual screening. Journal of chemical information and modeling, 45(1):177–182, 2005.
Gabriel Lima Guimaraes, Benjamin Sanchez-Lengeling, Carlos Outeiral, Pedro Luis Cunha Farias, and Alán Aspuru-Guzik. Objective-reinforced generative adversarial networks (organ) for sequence generation models. arXiv preprint arXiv:1705.10843, 2017.
Thomas N Kipf and Max Welling. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907, 2016.
I.V. Tetko, Y. Sushko, S. Novotarskyi, L. Patiny, I. Kondratov, A.E. Petrenko, L. Charochkina, and A.M. Asiri. How accurately can we predict the melting points of drug-like compounds? Journal of chemical information and modeling, 54(12):3320–3329, 2014.