We investigated how many cases of the same chemical sold as different products (at possibly different prices) occurred in a prototypical large aggregated database and simultaneously tested the tautomerism definitions in the chemoinformatics toolkit CACTVS. We applied the standard CACTVS tautomeric transforms plus a set of recently developed ring–chain transforms to the Aldrich Market Select (AMS) database of 6 million screening samples and building blocks. In 30 000 cases, two or more AMS products were found to be just different tautomeric forms of the same compound. We purchased and analyzed 166 such tautomer pairs and triplets by 1H and 13C NMR to determine whether the CACTVS transforms accurately predicted what is the same “stuff in the bottle”. Essentially all prototropic transforms with examples in the AMS were confirmed. Some of the ring–chain transforms were found to be too “aggressive”, i.e. to equate structures with one another that were different compounds.
Experimental and Chemoinformatics Study of Tautomerism in a Database of Commercially Available Screening Samples
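The duplicate-detection step described in the abstract above (finding products that are only different tautomeric forms of one compound) can be illustrated with a minimal sketch. It assumes each catalog record already carries a tautomer-invariant key produced by some toolkit after applying its tautomeric transforms. The function name, the record layout, and the key strings are hypothetical and do not represent the CACTVS API or the AMS data format.

```python
from collections import defaultdict


def find_tautomer_duplicates(records):
    """Group (product_id, tautomer_key) pairs by key and keep only
    the keys shared by two or more distinct products."""
    groups = defaultdict(list)
    for product_id, tautomer_key in records:
        groups[tautomer_key].append(product_id)
    return {key: ids for key, ids in groups.items() if len(set(ids)) > 1}


# Hypothetical records: (product ID, precomputed tautomer-invariant key).
records = [
    ("P001", "key-A"),  # e.g. one tautomeric form of a compound
    ("P002", "key-A"),  # e.g. another form that maps to the same key
    ("P003", "key-B"),  # a compound with no tautomeric duplicate
]

duplicates = find_tautomer_duplicates(records)
```

In this sketch the quality of the result depends entirely on the key: a transform set that is too aggressive would assign the same key to genuinely different compounds, which is the failure mode the abstract reports for some ring–chain transforms.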
Chemoinformatics Data Scientist
I am a highly motivated computational chemist and cheminformatician who employs, analyzes, and develops computer-based methods to aid drug discovery. I enjoy working with experimentalists from the fields of biology, pharmacy, medicine, and chemistry to answer relevant questions in drug discovery.
Specialties: CADD; Virtual screening; Pharmacophores; Docking; Homology modeling; Quantum chemistry; SAR/QSAR; ADME/Tox modeling; Drug metabolism; Chemoinformatics; Data mining; Molecular informatics.
Feb 2012–present · PostDoc Position · National Institutes of Health · National Cancer Institute (NCI): National Institute of Health · Chemical Biology Laboratory · United States · Frederick
Development of novel approaches for tautomerism analysis. Structure-based and ligand-based identification and design of anti-cancer and anti-viral agents.
Jan 2009–Jun 2009 · Research Intern · University of Innsbruck · Institute of General, Inorganic and Theoretical Chemistry · Theoretical Chemistry · Austria · Innsbruck
Discovery of Natural Product PPAR-gamma Partial Agonists by a Pharmacophore-Based Virtual Screening Workflow.
Sep 2007–Dec 2011 · PhD Student · Universitat Rovira i Virgili · Department of Biochemistry and Biotechnology · Nutrigenomics Research Group · Spain · Tarragona
Identification of natural products as antidiabetic agents using computer-aided drug design methods.
Jan 2007–Jun 2007 · Research Fellow · Universitat Rovira i Virgili · Department of Physical and Inorganic Chemistry · Quantum Chemistry Group · Spain · Tarragona
Prediction of enantiomeric excesses in asymmetric catalysis using a new QSSR approach based on three-dimensional DFT molecular descriptors.
Jul 2006–Aug 2006 · Summer Internship · Barcelona Science Park · Quantum Simulation of Biological Processes · Spain · Barcelona
Structure and electronic configuration of compound I intermediates of Penicillium vitale catalases using molecular dynamics simulation techniques.
Sep 2007–Jun 2008 · Universitat Rovira i Virgili · Nutrition and Metabolism · Master of Science · Spain · Tarragona
Sep 2004–Jun 2007 · Universitat Rovira i Virgili · Biochemistry · BSc · Spain · Tarragona
Sep 2002–Jun 2007 · Universitat Rovira i Virgili · Chemistry · Bachelor of Science · Spain · Tarragona
Awards & achievements
Dec 2011 Award: European Doctorate Mention
Dec 2011 Award: PhD Extraordinary Award
Marc C. Nicklaus, Ph.D.
Dr. Nicklaus pioneered work on making large small-molecule databases and related chemoinformatics tools available to the scientific public on the CADD Group’s web server. He also pioneered the analysis of conformational energies of small molecule ligands bound to proteins. As Head of the CADD Group, he oversees the group’s research program in chemoinformatics, fundamentals of protein-ligand interactions, and in silico screening for targets of high interest to NCI. He makes the latter resources available in collaborative projects to improve NCI’s efforts in hit identification and drug design.
Areas of Expertise
Computer-Aided Drug Design. The Computer-Aided Drug Design (CADD) Group is a research unit within the Chemical Biology Laboratory (CBL) that employs, analyzes, and develops computer-based methods to aid in the drug discovery, design, and development projects of the CBL and other researchers at the NIH. We split our efforts about evenly between support-type projects and research projects initiated and conducted by CADD staff members. We are implementing many projects, and making available resources developed by the CADD Group, in a Web-based manner. This offers three advantages: (1) it frees all users, including the group members themselves, from platform constraints and the concomitant expenses for specific software/hardware, (2) it makes resources and results immediately available for sharing among all collaborators regardless of their location, and (3) it helps, without additional effort, to further the mission of the NCI as a publicly funded institution by providing data and services directly to the (scientific) public.
Chemical Identifier Resolver (CIR). CIR works as a resolver for many different chemical structure identifiers (e.g. chemical names, InChI, SMILES etc.) and allows one to convert a given structure identifier into a full structure representation or another structure identifier, including references to particular databases in which the corresponding structure or structure identifier occurs. CIR offers a simple-to-use application programming interface (API) based on URLs requested over HTTP. This allows easy linking of CIR and its content to other scientific web services and program packages. CIR currently provides access to 120 million structure records.
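Because the CIR interface is based on URLs requested over HTTP, a lookup amounts to composing a URL from an identifier and the desired output representation. The sketch below shows only that composition step. The base URL, the `identifier/representation` path layout, and the representation names `smiles` and `stdinchikey` are assumptions not stated in the text above, and the helper `cir_url` is hypothetical; check them against the service's own documentation before relying on them.

```python
from urllib.parse import quote

# Assumed base URL of the resolver; not given in the text above.
CIR_BASE_URL = "https://cactus.nci.nih.gov/chemical/structure"


def cir_url(identifier, representation):
    """Build a resolver URL of the assumed form
    <base>/<identifier>/<representation>, percent-encoding both parts
    so that characters such as '=', '#', or '/' in SMILES strings
    cannot be misread as URL syntax."""
    return (
        f"{CIR_BASE_URL}/"
        f"{quote(identifier, safe='')}/"
        f"{quote(representation, safe='')}"
    )


# Build lookup URLs for a chemical name and for a SMILES string.
name_url = cir_url("aspirin", "smiles")
smiles_url = cir_url("C1=CC=CC=C1", "stdinchikey")
```

No request is sent here; passing one of these URLs to any HTTP client would perform the actual lookup.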
Enhanced NCI Database Browser. The Enhanced NCI Database Browser can be used to search the 250,000-compound Open NCI Database. This dataset is the publicly available part of the half-million structure collection assembled by the NCI’s Developmental Therapeutics Program during the program’s 50+ years of screening compounds against cancer and, more recently, AIDS. Visit the CADD Group’s home page or the Enhanced NCI Database Browser service for more information.
Fundamentals of Protein-Ligand Interactions. The non-covalent binding of a drug to the binding site of an enzyme (or other biomacromolecule) is the fundamental process of most drug actions. In spite of a vast body of experimental data available on protein-ligand complexes, mostly obtained by X-ray crystallography, there are still open questions about how this binding process occurs at the atomic and quantitative energetic level. One of the issues is the range of conformational energies one can expect to find for small-molecule ligands bound to proteins, which we found to be higher than generally assumed. This has led us to broader questions regarding X-ray crystallographic methodologies, such as whether quantum-mechanical refinement (or re-refinement) of protein-ligand structures may improve structural quality in various ways.
HIV Integrase. A long-standing interest of our group has been HIV integrase (IN) as a drug development target. This enzyme catalyzes the integration of the viral DNA into the human DNA, which is an essential step in the viral replication cycle. Only a handful of approved drugs so far are based on IN inhibition. We have been utilizing all available experimental results, be they structural, mechanistic, or biochemical, to model and better understand inhibition of IN by small molecules. A recent expansion of these efforts is our work aimed at developing HIV microbicides for the prevention of infection with HIV by topical application such as vaginal gels.
Among our main collaborators are Stephen Hughes and Yves Pommier, NCI; Wolf-Dietrich Ihlenfeldt, Xemistry, Germany; Vladimir Poroikov, Russian Academy of Medical Sciences, Moscow; and Raul Cachau, Leidos, FNLCR.