Oct 18 2016

We investigated how many cases of the same chemical sold as different products (at possibly different prices) occurred in a prototypical large aggregated database and simultaneously tested the tautomerism definitions in the chemoinformatics toolkit CACTVS. We applied the standard CACTVS tautomeric transforms plus a set of recently developed ring–chain transforms to the Aldrich Market Select (AMS) database of 6 million screening samples and building blocks. In 30 000 cases, two or more AMS products were found to be just different tautomeric forms of the same compound. We purchased and analyzed 166 such tautomer pairs and triplets by 1H and 13C NMR to determine whether the CACTVS transforms accurately predicted what is the same “stuff in the bottle”. Essentially all prototropic transforms with examples in the AMS were confirmed. Some of the ring–chain transforms were found to be too “aggressive”, i.e. to equate structures with one another that were different compounds.

Experimental and Chemoinformatics Study of Tautomerism in a Database of Commercially Available Screening Samples

Chemical Biology Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Frederick, Maryland 21702, United States
§ Basic Science Program, Chemical Biology Laboratory, Leidos Biomedical Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland 21702, United States
J. Chem. Inf. Model., Article ASAP
Publication Date (Web): September 26, 2016
Copyright © 2016 American Chemical Society
Laura Guasch

Chemoinformatics Data Scientist

National Institutes of Health
Bethesda, MD, United States

I am a highly motivated computational chemist and cheminformatician who employs, analyzes, and develops computer-based methods to aid in drug discovery. I enjoy working with experimentalists from the fields of biology, pharmacy, medicine, and chemistry to answer relevant questions in drug discovery.

Specialties: CADD; Virtual screening; Pharmacophores; Docking; Homology modeling; Quantum chemistry; SAR/QSAR; ADME/Tox modeling; Drug metabolism; Chemoinformatics; Data mining; Molecular informatics.

Research experience

  • Feb 2012–present PostDoc Position
    National Institutes of Health · National Cancer Institute (NCI) · Chemical Biology Laboratory
    United States · Frederick
    Development of novel approaches for tautomerism analysis. Structure-based and ligand-based identification and design of anti-cancer and anti-viral agents.
  • Jan 2009–Jun 2009 Research Intern
    University of Innsbruck · Institute of General, Inorganic and Theoretical Chemistry · Theoretical Chemistry
    Austria · Innsbruck
    Discovery of Natural Product PPAR-gamma Partial Agonists by a Pharmacophore-Based Virtual Screening Workflow.
  • Sep 2007–Dec 2011 PhD Student
    Universitat Rovira i Virgili · Department of Biochemistry and Biotechnology · Nutrigenomics Research Group
    Spain · Tarragona
    Identification of natural products as antidiabetic agents using computer-aided drug design methods.
  • Jan 2007–Jun 2007 Research Fellow
    Universitat Rovira i Virgili · Department of Physical and Inorganic Chemistry · Quantum Chemistry Group
    Spain · Tarragona
    Prediction of enantiomeric excesses in asymmetric catalysis using a new QSSR approach based on three-dimensional DFT molecular descriptors
  • Jul 2006–Aug 2006 Summer Internship
    Barcelona Science Park · Quantum Simulation of Biological Processes
    Spain · Barcelona
    Structure and electronic configuration of compound I intermediates of Penicillium vitale catalases using molecular dynamics simulation techniques


Education

  • Sep 2007–Jun 2008 Universitat Rovira i Virgili
    Nutrition and Metabolism · Master of Science
    Spain · Tarragona
  • Sep 2004–Jun 2007 Universitat Rovira i Virgili
    Biochemistry · BSc
    Spain · Tarragona
  • Sep 2002–Jun 2007 Universitat Rovira i Virgili
    Chemistry · Bachelor of Science
    Spain · Tarragona

Awards & achievements

  • Dec 2011 Award: European Doctorate Mention
  • Dec 2011 Award: PhD Extraordinary Award

Marc C. Nicklaus, Ph.D.
Senior Scientist
Head, Computer-Aided Drug Design (CADD) Group
Dr. Nicklaus received his Ph.D. in applied physics from the Eberhard-Karls-Universität, Tübingen, Germany, and then served as a postdoctoral fellow in the Molecular Modeling Section of what was then called the Laboratory of Medicinal Chemistry, NCI. He became a staff fellow in 1998 and a Senior Scientist in 2002. In 2000, he founded the Computer-Aided Drug Design (CADD) Group, which he has headed ever since.

Dr. Nicklaus pioneered work on making large small-molecule databases and related chemoinformatics tools available to the scientific public on the CADD Group’s web server. He also pioneered the analysis of conformational energies of small molecule ligands bound to proteins. As Head of the CADD Group, he oversees the group’s research program in chemoinformatics, fundamentals of protein-ligand interactions, and in silico screening for targets of high interest to NCI. He makes the latter resources available in collaborative projects to improve NCI’s efforts in hit identification and drug design.

Areas of Expertise

1) chemoinformatics, 2) small-molecule databases, 3) protein-ligand interactions, 4) (quantitative) structure-activity relationships, 5) computer-aided drug design, 6) computational chemistry


Marc C. Nicklaus, Ph.D.
Center for Cancer Research
National Cancer Institute
Building 376, Room 207
Frederick, MD 21702-1201
Ph: 301-846-5903

Computer-Aided Drug Design. The Computer-Aided Drug Design (CADD) Group is a research unit within the Chemical Biology Laboratory (CBL) that employs, analyzes, and develops computer-based methods to aid in the drug discovery, design, and development projects of the CBL and other researchers at the NIH. We split our efforts about evenly between support-type projects and research projects initiated and conducted by CADD staff members. We are implementing many projects, and making available resources developed by the CADD Group, in a Web-based manner. This offers three advantages: (1) it frees all users, including the group members themselves, from platform constraints and the concomitant expenses for specific software/hardware, (2) it makes resources and results immediately available for sharing among all collaborators regardless of their location, and (3) it helps, without additional effort, to further the mission of the NCI as a publicly funded institution by providing data and services directly to the (scientific) public.

Chemical Identifier Resolver (CIR). CIR works as a resolver for many different chemical structure identifiers (e.g. chemical names, InChI, SMILES etc.) and allows one to convert the given structure identifier into a full structure representation or another structure identifier including references to particular databases in which the corresponding structure or structure identifier occurs. CIR offers a simple to use, programmatic application programming interface (API) based on URLs requested by HTTP. This allows easy linking of CIR and its content to other scientific web services and program packages. CIR currently provides access to 120 million structure records.
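
As a rough illustration of the URL-based access described above, the following Python sketch builds a CIR request of the form `/chemical/structure/<identifier>/<representation>` on the CADD Group's public server. The helper names `cir_url` and `resolve` are made up for this example, and percent-encoding every reserved character in the identifier is an assumption about how the server expects special characters to be passed. Treat it as a minimal sketch, not as an official client.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path of the public CIR service on the CADD Group's web server.
CIR_BASE = "https://cactus.nci.nih.gov/chemical/structure"


def cir_url(identifier: str, representation: str) -> str:
    """Build a CIR request URL (hypothetical helper, not part of CIR itself).

    identifier     -- a chemical name, SMILES, InChI, etc.
    representation -- the requested output, e.g. "smiles" or "stdinchikey"
    """
    # Assumption: reserved characters are percent-encoded before sending.
    return f"{CIR_BASE}/{quote(identifier, safe='')}/{quote(representation, safe='')}"


def resolve(identifier: str, representation: str, timeout: float = 10.0) -> str:
    """Fetch the plain-text answer from CIR (requires network access)."""
    with urlopen(cir_url(identifier, representation), timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


if __name__ == "__main__":
    # Only prints the URL; call resolve("aspirin", "smiles") to query the service.
    print(cir_url("aspirin", "smiles"))
```

Because every request is just a URL, the same lookup can also be pasted into a browser or called from any tool that speaks HTTP, which is what makes linking CIR to other web services straightforward.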

Enhanced NCI Database Browser. The Enhanced NCI Database Browser can be used to search the 250,000-compound Open NCI Database. This dataset is the publicly available part of the half-million structure collection assembled by the NCI’s Developmental Therapeutics Program during the program’s 50+ years of screening compounds against cancer and, more recently, AIDS. Visit the CADD Group’s home page or the Enhanced NCI Database Browser service for more information.

Fundamentals of Protein-Ligand Interactions. The non-covalent binding of a drug to the binding site of an enzyme (or other biomacromolecule) is the fundamental process of most drug actions. In spite of a vast body of experimental data available on protein-ligand complexes, mostly obtained by X-ray crystallography, there are still open questions about how this binding process occurs at the atomic and quantitative energetic level. One of the issues is the range of conformational energies one can expect to find for small-molecule ligands bound to proteins, which we found to be higher than generally assumed. This has led us to broader questions regarding X-ray crystallographic methodologies, such as whether quantum-mechanical refinement (or re-refinement) of protein-ligand structures may improve structural quality in various ways.

HIV Integrase. A long-standing interest of our group has been HIV integrase (IN) as a drug development target. This enzyme catalyzes the integration of the viral DNA into the human DNA, which is an essential step in the viral replication cycle. Only a handful of approved drugs so far are based on IN inhibition. We have been utilizing all available experimental results, be they structural, mechanistic, or biochemical, to model and better understand inhibition of IN by small molecules. A recent expansion of these efforts is our work aimed at developing HIV microbicides that prevent HIV infection through topical application, for example as vaginal gels.

Among our main collaborators are Stephen Hughes and Yves Pommier, NCI; Wolf-Dietrich Ihlenfeldt, Xemistry, Germany; Vladimir Poroikov, Russian Academy of Medical Sciences, Moscow; and Raul Cachau, Leidos, FNLCR.

Scientific Focus Areas:

Biomedical Engineering and Biophysics, Chemical Biology, Computational Biology, Structural Biology
