The American Institute of Physics (AIP) is a leading publisher of research journals, magazines and conference proceedings in the area of physics. AIP partnered with Molecular Connections to supply and maintain a market-leading concept extraction and content classification system.
AIP had a large corpus of around 2 million documents on its publishing platform (Scitation), spanning a wide range of document types including research articles, magazines, news articles, letters and blogs. It wanted to improve the overall discoverability of content on the platform and create topic-based collections.
Molecular Connections applies an ensemble approach, combining a comprehensive ontology with machine-learning-based tools, to identify topics from a corpus of text. These topics form a critical component of various data-led solutions in publishing workflows, including topic-based subject collections, improved content discovery, finding relevant or related journals and articles, and similar workflows. Using Molecular Connections' AI-enabled workflow for topic extraction, AIP has developed topic-based collections of journal articles for each of the journals on its Scitation publication platform.
For this project, Molecular Connections adopted an ontology-based approach to topic identification. It developed a comprehensive physics ontology using its proprietary accelerated ontology development approach (MCLEXICON™) and enriched the concepts in the ontology with synonyms and acronyms. This approach produced an ontology of more than 35,000 concepts in less than 6 months. MCMINER™ (Molecular Connections' proprietary text mining engine) was deployed for named entity extraction and recognition tasks, and statistical methods were used to identify relevant topics. The system also provides a feedback loop for continuous improvement based on researcher input.
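As a rough illustration of ontology-based topic identification, the minimal Python sketch below matches a text against a tiny ontology whose concepts carry synonyms and acronyms, then ranks concepts by how often they occur. It is not MCMINER or MCLEXICON: the ontology entries, the function names (build_patterns, extract_topics) and the simple frequency ranking are all assumptions made for this example, standing in for the proprietary engine and its statistical methods.

```python
import re
from collections import Counter

# Hypothetical toy ontology: preferred concept label -> synonyms and acronyms.
# A production physics ontology would hold tens of thousands of concepts.
ONTOLOGY = {
    "scanning tunneling microscopy": ["STM", "scanning tunnelling microscopy"],
    "density functional theory": ["DFT"],
    "superconductivity": ["superconducting"],
}


def build_patterns(ontology):
    """Compile one case-insensitive, whole-word regex per concept."""
    patterns = {}
    for concept, variants in ontology.items():
        # Longest surface forms first so longer phrases win over shorter ones.
        terms = sorted([concept, *variants], key=len, reverse=True)
        alternation = "|".join(re.escape(term) for term in terms)
        patterns[concept] = re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)
    return patterns


def extract_topics(text, patterns, min_count=1):
    """Return (concept, mention_count) pairs, most frequent first.

    Raw mention frequency is an assumed stand-in for the statistical
    relevance scoring described in the text.
    """
    counts = Counter()
    for concept, pattern in patterns.items():
        counts[concept] = len(pattern.findall(text))
    return [(c, n) for c, n in counts.most_common() if n >= min_count]


if __name__ == "__main__":
    sample = (
        "We use DFT and density functional theory to study superconducting "
        "films imaged by STM. DFT results agree with experiment."
    )
    for concept, count in extract_topics(sample, build_patterns(ONTOLOGY)):
        print(f"{concept}: {count}")
```

Mapping every synonym and acronym back to one preferred label is what lets differently worded documents land in the same topic collection. A real system would also need disambiguation and relevance weighting, which this sketch leaves out.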
Furthermore, the thesaurus manager has been integrated within the AIP Publishing Editorial Management System. The thesaurus-based browse and navigate module in Scitation (AIP Publishing’s content hosting platform) is intuitive and supports accurate retrieval of content of interest. The linked data content store developed by Molecular Connections has been very useful in enabling analytics and a content recommendation engine, and it also addresses future business development needs. Molecular Connections will deploy APIs for real-time indexing as the backbone of the referee finder and contextual advertising within the Scitation platform.