Kim Henrick, European Bioinformatics Institute
Kim Henrick joined the European Bioinformatics Institute in 1996 and became the Macromolecular Structure Database (MSD) group leader in 2001. He received his PhD in inorganic chemistry in 1974 from the University of Western Australia. After 10 years in organometallic chemistry at the Polytechnic of North London, he moved into protein crystallography at Imperial College. This was followed by a period as secretary to CCP4 at Daresbury Laboratory and by post-doctoral positions at the Laboratory of Molecular Biology in Cambridge (UK) and the National Institute for Medical Research in London.
The MSD group is part of the wwPDB and processes 3D structures for the Protein Data Bank (PDB). It applies relational database technologies to create databases derived from the PDB, which are used to ensure data uniformity across the whole archive, and it is working towards the integration of various bioinformatics data resources. Search systems are being developed in conjunction with new visualization tools that can present both structure and sequence data in a unified interface.
Assembly, Distribution and Retrieval of Biological Information in Drug Discovery
Kim Henrick, European Bioinformatics Institute
The MSD is a comprehensive relational database of 3D structures of biological molecules. Its goal is to assemble and distribute this information. The MSD stresses data ontologies and data standards, data acquisition, data retrieval, and the integration of molecular biology data from different sources. Our users access this data to answer questions relating to the structural basis of biomolecular processes, evolution, disease and drug design. The database serves a worldwide community of scientists through a web site built on data warehouse technologies and through improved software and hardware that facilitate integration and access.
The MSD consists of several relational databases that are currently freely accessible through WWW-based tools:
· The archive database - used for data conformity. All original data are loaded into this database using an extensive and complex set of MSD/Oracle-specific procedures. All subsequent databases are transforms of this database.
· The residue reference database - contains the complete chemical descriptions of all chemical species contained in PDB entries.
· The reference ligand environment database.
· The reference superposition database - used for grouping and 3D structure matching.
· The transformed de-normalised staging database - used to derive additional information, to merge in reference, site and grouping data, and to create inter-database links.
· The search database - a transform of the staging database.
The search and retrieval systems provide a range of sophisticated methods that enable users to carry out complex searches of the data. Our systems are based on current industry-strength relational database technology and provide the ability to perform fast and sophisticated searches of structural data in ways that previously would have required the user to write dedicated computer programs. A generic search interface is available, coupled with a fast secondary structure domain search tool. We have also recognised the importance of speed of access, and maintain that the majority of our search services should respond within 3 seconds to provide a truly interactive service. There are eight different search engines, each offering a different view of the data. The systems and the types of data they handle are:
· PQS: covers the biologically active molecule
· MSDpro: a generic interface for the general biologist
· MSDlite: the text-based general search interface
· MSDchem: the search interface for the ligands contained in the PDB
· MSDsite: the search interface to the active site database
· MSDtarget: the search interface to the target database
· MSDfold: secondary structure alignment search
· MSDmine: an expert's system to expose the full potential of the MSD to the experienced scientist
NEW - released in August 2005:
· MSDmotif: search interface for 3D motifs and any chi/phi/psi combination
· MSDpisa: search interface for protein interfaces, surfaces and assemblies
· MSDtemplate: search interface for data-mined significant 3D arrangements of residues found in the PDB
Acknowledgements
The MSD group gratefully acknowledges support from the Wellcome Trust, the EU, CCP4, the BBSRC, the MRC and EMBL.