PRESS RELEASE | Bangalore, India, August 2019
To promote data sharing in research, an increasing number of publishers, funders, and institutions have adopted data sharing policies that either recommend or mandate that data associated with a research article be shared in a public repository.
Even when data sharing is mandatory, compliance with these policies is patchy. Authors are often unsure which datasets they should share and which repository is most appropriate, and stakeholders cannot easily assess whether authors’ sharing efforts meet their policy requirements.
DataSeer is being developed to address these challenges. DataSeer is an open-source, web-based service that uses Natural Language Processing (NLP) to identify datasets associated with a particular article. Authors are shown best sharing practices for their type of data, and stakeholders are sent a report detailing the completeness of the authors’ sharing efforts.
The project is led by Dr. Tim Vines, a peer review workflow expert who conceived of DataSeer while trying to enforce the data sharing policy at the journal Molecular Ecology. To train the DataSeer NLP algorithm to find dataset mentions in research articles, Dr. Vines engaged Molecular Connections. “Molecular Connections are working with us to generate a database of sentences in research articles that describe data collection. We are currently focused on the Methods sections of 2000 published research articles from a wide range of research fields. In each of those, Molecular Connections has identified the main data collection sentence and the type of data being collected, and noted any specialist equipment the researchers used,” commented Dr. Vines.
He continued, “Our aim is to develop this database into a high-quality open resource for anyone interested in how researchers describe their data collection efforts. Going forward, we will be working with Molecular Connections to expand the database to new fields and incorporate other components of the research cycle, such as data analysis.”
Krishna K., Director of Sales & Marketing at Molecular Connections, commented, “Content curation is our core business, and we started Molecular Connections back in 2000 as a ‘text mining’ curation company. Today, we are the world leaders in this space and work with publishers (both primary and secondary), societies, and pharmaceutical companies, supporting all their content and technology needs. Over the last two decades, our 2000+ SMEs have dealt with a myriad of content types and have a 360-degree view of the challenges faced. We are extremely delighted to work with Dr. Tim Vines, sharing our experience, perspective, and the best practices learnt from our association with the industry, and translating them into a best-in-class workflow solution for DataSeer.”