Crowd Computing for Cheminformatics (2C4C)

What is it all about?
2C4C is a large-scale collaborative programme involving community participation towards creating one of the most comprehensive cheminformatics resources for understanding biological properties of chemical molecules.

How do we do it?
We manually curate bioassay datasets available from PubChem and other public resources, and use machine learning algorithms to create predictive cheminformatics models for the biological activity of chemical molecules.
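
As a rough illustration of the modelling step, the sketch below trains a Bernoulli naive Bayes classifier on binary structural fingerprints labelled active or inactive. It is a minimal sketch under stated assumptions, not the programme's actual pipeline: the choice of classifier, the 4-bit fingerprints, the made-up labels and the function names train_bernoulli_nb and predict are all hypothetical, and real bioassay data would first have to be downloaded and curated.

```python
import math


def train_bernoulli_nb(fingerprints, labels):
    """Fit a Bernoulli naive Bayes model on binary fingerprints (lists of 0/1)."""
    n_bits = len(fingerprints[0])
    model = {}
    for cls in sorted(set(labels)):
        rows = [fp for fp, y in zip(fingerprints, labels) if y == cls]
        # Laplace smoothing keeps every bit probability strictly between 0 and 1.
        bit_probs = [
            (sum(row[i] for row in rows) + 1) / (len(rows) + 2)
            for i in range(n_bits)
        ]
        # Store the log prior of the class together with its per-bit probabilities.
        model[cls] = (math.log(len(rows) / len(labels)), bit_probs)
    return model


def predict(model, fingerprint):
    """Return the class with the highest log posterior for one fingerprint."""
    def score(cls):
        log_prior, bit_probs = model[cls]
        return log_prior + sum(
            math.log(p if bit else 1.0 - p)
            for bit, p in zip(fingerprint, bit_probs)
        )
    return max(model, key=score)


# Hypothetical toy data: 4-bit fingerprints, 1 = active, 0 = inactive.
TOY_FINGERPRINTS = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 1, 1, 1],
]
TOY_LABELS = [1, 1, 1, 0, 0, 0]

model = train_bernoulli_nb(TOY_FINGERPRINTS, TOY_LABELS)
print(predict(model, [1, 1, 0, 0]))
print(predict(model, [0, 0, 1, 1]))
```

In practice the fingerprints would be computed from the curated molecular structures, and any model would be evaluated on held-out molecules before being used to rank candidates.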

Why do we need models for biological assays?
Predicting the biological properties of selected molecules is a key step in the drug discovery process. Through our initiative we hope to build an open-access compendium of cheminformatics models for biological assays. This would help prioritise molecules for biological assays and further development.

Integrating Cheminformatics Models for Drug Discovery
The models generated as part of this project have been extensively used to prioritise molecules for drug discovery. Read more on how we are using cheminformatics datasets to prioritise molecules for Tuberculosis Drug Discovery.

Who can Participate?
2C4C is open to individuals who are willing to learn cheminformatics and to spend some time, thought and compute resources on creating predictive models for the biological activities of chemical molecules. The programme is best suited to students interested in taking up the project as part of an internship during their courses. The programme is completely online, and students who meet the requisite criteria could register online for participation.

Important Dates

Registration for this programme is closed.
Feel free to post and discuss anything relevant to the field on the online forum.

Start of Programme: Jan 01, 2013
The first assignment and manuals are posted online.

Collaborating on this programme
We are open to academic collaboration with individuals, institutes or organisations on this programme. We are also looking for publication partners who would be interested in publishing the research outcomes of this initiative as a special issue. Please contact Dr. Vinod Scaria for more details.


Jamal S, OSDD Consortium, Scaria V
Data-mining of potential antitubercular activities from molecular ingredients of Traditional Chinese Medicines
PeerJ (2014) Accepted

Jamal S, OSDD Consortium, Scaria V*
Cheminformatic models based on machine learning for pyruvate kinase inhibitors of Leishmania mexicana
BMC Bioinformatics (2013) Accepted

Jamal S, Periwal V, OSDD Consortium, Scaria V*
Predictive cheminformatics analysis of anti-malarial molecules inhibiting apicoplast formation
BMC Bioinformatics 2013, 14:55

Jamal S, Periwal V, OSDD Consortium, Scaria V*
Computational analysis and predictive modeling of small molecule modulators of microRNA
Journal of Cheminformatics 2012, 4:16

Periwal V, Kishtapuram S, OSDD Consortium, Scaria V*
Computational models for in-vitro anti-tubercular activity of molecules based on high-throughput chemical biology screening datasets
BMC Pharmacology 2012, 12:1

Periwal V, Jinuraj KR, OSDD Consortium, Jaleel UC*, Scaria V*
Predictive models for anti-tubercular molecules using machine learning on high-throughput biological screening datasets
BMC Research Notes 2011, 4:504

Funding and Resources
We acknowledge support from the Open Source Drug Discovery Initiative/CSIR for the programme. We also acknowledge the communication and compute resources made available by NKN and CDAC-Garuda for the project.
