Welcome!

Wed, Aug 10, 2011 23:09 Posted by Stanley Hsu

This site contains the materials for our paper:

Quantitatively Integrating Molecular Structure and Bioactivity Profile Evidence into Drug-target Relationship Analysis

The manuscript, data, supplemental materials, and the scripts used to perform similarity fusion are collected here and provided with a link to download. If you have any questions, please feel free to contact the author.

Paper Summary



Public resources describing chemical compound features are growing rapidly, both in quantity and in the types of data representation. Understanding how these intrinsic features relate to a compound's interactions with protein targets is essential for evaluating potential protein-binding function in virtual drug screening. Previous studies have proposed correlations between bioactivity profiles and target networks, especially when the chemical structures are related. Because effective quantitative methods for uncovering such correlations are lacking, it is necessary to integrate information from multiple data sources to produce an integrated assessment of the similarity between small molecules, and to use that integrated information to quantitatively uncover the relationship between compounds and their targets.


A multi-view based clustering algorithm was introduced to quantitatively integrate compound similarity from both bioactivity profiles and structural fingerprints. Hierarchical clustering was then performed on the fused similarity. Compared with clustering in a single view, the integrated similarity increased the overall number of common targets within the fused classes, which indicates that the multi-view based clustering is more effective at identifying clusters whose members share more common targets. Further analysis of certain classes reveals that the two views complement each other in describing compounds, which helps to recover similar compounds that are missed when only a single view is applied. Thus, by combining features from different data representations, an improved assessment of target-specific compound similarity can be achieved. Our study presents an efficient, extensible, and quantitative computational model for integrating different compound representations, and it is expected to provide new clues for improving virtual drug screening from various pharmacological properties.
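
The workflow summarized above (one similarity per view, a fused similarity, hierarchical clustering, and a count of targets shared within each cluster) can be illustrated on toy data. The following is only a minimal sketch and not the scripts distributed with the paper: the fusion step is a plain weighted average standing in for the multi-view algorithm, whose details are not given on this page. The function names, the choice of Tanimoto and Pearson measures, the weight, the threshold, and all compound data are assumptions made for illustration.

```python
from itertools import combinations

def tanimoto(a, b):
    """Tanimoto similarity between two sets of 'on' fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def pearson(x, y):
    """Pearson correlation between two equal-length bioactivity profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    vx = sum((p - mx) ** 2 for p in x)
    vy = sum((q - my) ** 2 for q in y)
    return cov / (vx * vy) ** 0.5 if vx and vy else 0.0

def fused_similarity(fingerprints, profiles, weight=0.5):
    """Weighted average of the two views (an assumed stand-in for the
    paper's fusion step). Keys are unordered compound pairs."""
    sim = {}
    for a, b in combinations(sorted(fingerprints), 2):
        s_struct = tanimoto(fingerprints[a], fingerprints[b])
        s_bio = (pearson(profiles[a], profiles[b]) + 1) / 2  # rescale to [0, 1]
        sim[frozenset((a, b))] = weight * s_struct + (1 - weight) * s_bio
    return sim

def average_linkage(names, sim, threshold):
    """Agglomerative clustering: repeatedly merge the two most similar
    clusters until the best average similarity drops below threshold."""
    clusters = [{n} for n in names]

    def link(c1, c2):
        total = sum(sim[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        i, j = max(combinations(range(len(clusters)), 2),
                   key=lambda p: link(clusters[p[0]], clusters[p[1]]))
        if link(clusters[i], clusters[j]) < threshold:
            break
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters

def common_targets(cluster, targets):
    """Targets shared by every compound in a cluster."""
    return set.intersection(*(targets[c] for c in cluster))

# Toy data (hypothetical compounds, fingerprint bits, profiles, targets).
fingerprints = {"A": {1, 2, 3, 4}, "B": {1, 2, 3, 5},
                "C": {7, 8, 9}, "D": {7, 8, 10}}
profiles = {"A": [0.9, 0.1, 0.8, 0.2], "B": [0.8, 0.2, 0.9, 0.1],
            "C": [0.1, 0.9, 0.2, 0.8], "D": [0.2, 0.8, 0.1, 0.9]}
targets = {"A": {"T1", "T2"}, "B": {"T1", "T2", "T3"},
           "C": {"T4"}, "D": {"T4", "T5"}}

sim = fused_similarity(fingerprints, profiles, weight=0.5)
clusters = average_linkage(sorted(fingerprints), sim, threshold=0.5)
for cluster in clusters:
    print(sorted(cluster), sorted(common_targets(cluster, targets)))
```

On this toy input the fused similarity groups A with B and C with D, and the shared targets of each group can then be compared against what either view would give alone. With real data, the weight and the threshold would need to be chosen and validated rather than fixed at 0.5.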


Scripts, supplementary materials and data used in this study are publicly available at http://lifecenter.sgst.cn/fusion/.