Posted on Nov 23, 2017

ReaderBench - Semantic Models and Topic Mining

Extracts the keywords of a text together with their relevance scores and semantic links between them.

Short non-technical description:

Extracts keywords and topics of a text, together with the corresponding relevance scores and semantic links between them.

This component represents a core constituent within all ReaderBench modules in terms of discourse analysis and text mining.

Given an input text, this component returns the list of concepts, their relevance and the links between them.

The component is available in English and French. Dutch and Romanian will be available soon.

Technical description:

ReaderBench introduced a generalized assessment model based on the cohesion graph. The model applies both to plain essay- or story-like texts and to Computer-Supported Collaborative Learning (CSCL) conversations, in particular chats, forum discussion threads and blog communities.

Text cohesion, viewed as lexical, grammatical and semantic relationships that link together textual units, is defined within our implemented model in terms of semantic similarity measured through semantic distances in: lexicalized ontologies (e.g. WordNet), Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA).
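As an illustration of similarity in a vector-space model such as LSA, the sketch below computes the cosine similarity between two vectors that stand for two textual units. It is a minimal example and not ReaderBench code: the class name CohesionSketch and the sample vectors are hypothetical, and the WordNet- and LDA-based measures mentioned above would need different formulas.

```java
final class CohesionSketch {

    /**
     * Cosine similarity between two equal-length vectors.
     * Returns 0.0 when either vector has zero length (all components are 0).
     */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Hypothetical 3-dimensional vectors for two sentences.
        double[] first = {0.2, 0.7, 0.1};
        double[] second = {0.3, 0.6, 0.0};
        System.out.println("similarity = " + cosine(first, second));
    }
}
```

Values close to 1 indicate that the two units point in nearly the same direction of the semantic space, which a cohesion-based model would read as a strong link.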

Additionally, specific natural language processing techniques are applied to reduce noise and improve the system's accuracy: tokenizing, splitting, part-of-speech tagging, parsing, stop-word elimination, dictionary-only word selection, stemming, lemmatizing, named entity recognition and co-reference resolution.
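Two of the steps listed above, tokenizing and stop-word elimination, can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the ReaderBench pipeline: the class PreprocessSketch, the regular expression and the very short stop-word list are hypothetical, and the remaining steps (tagging, parsing, lemmatizing and so on) are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class PreprocessSketch {

    // Tiny illustrative stop-word list (an assumption, not ReaderBench's list).
    static final Set<String> STOP_WORDS = new HashSet<>(
            Arrays.asList("the", "a", "an", "of", "and", "is", "in", "to"));

    /** Lowercases the text and splits it on runs of non-letter, non-digit characters. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    /** Keeps only the tokens that are not in the stop-word list. */
    static List<String> removeStopWords(List<String> tokens) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (!STOP_WORDS.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> tokens = tokenize("The cohesion of a text is measured.");
        System.out.println(removeStopWords(tokens));
    }
}
```

Removing function words before the semantic models are applied keeps frequent but uninformative tokens from dominating the similarity scores.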

Moreover, we have developed a topic mining module that integrates the previously defined semantic models (available for English, French and Italian).
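The shape of the output (keywords, relevance scores and links) can be illustrated with a small sketch. Everything here is an assumption made for illustration: relevance is approximated by relative frequency, the similarity function is supplied by the caller, and the class TopicSketch and its methods are hypothetical names. The scoring actually used by ReaderBench is not described in this entry.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToDoubleBiFunction;

final class TopicSketch {

    /** Relevance as relative frequency of each lemma (a stand-in for the real scoring). */
    static Map<String, Double> relevance(List<String> lemmas) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String lemma : lemmas) {
            counts.merge(lemma, 1, Integer::sum);
        }
        Map<String, Double> scores = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            scores.put(entry.getKey(), entry.getValue() / (double) lemmas.size());
        }
        return scores;
    }

    /** Links every pair of keywords whose similarity reaches the threshold. */
    static List<String[]> links(List<String> keywords,
                                ToDoubleBiFunction<String, String> similarity,
                                double threshold) {
        List<String[]> result = new ArrayList<>();
        for (int i = 0; i < keywords.size(); i++) {
            for (int j = i + 1; j < keywords.size(); j++) {
                String first = keywords.get(i);
                String second = keywords.get(j);
                if (similarity.applyAsDouble(first, second) >= threshold) {
                    result.add(new String[] {first, second});
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lemmas = Arrays.asList("text", "cohesion", "text", "graph");
        System.out.println(relevance(lemmas));
        // Toy similarity: 1.0 when both words start with the same letter, else 0.0.
        ToDoubleBiFunction<String, String> toy =
                (x, y) -> x.charAt(0) == y.charAt(0) ? 1.0 : 0.0;
        for (String[] link : links(Arrays.asList("text", "topic", "graph"), toy, 0.5)) {
            System.out.println(link[0] + " - " + link[1]);
        }
    }
}
```

In a real setting the toy similarity would be replaced by a measure derived from the semantic models, for example a cosine similarity in the LSA space.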

Support levels: The component is available "as is", without warranties or conditions of any kind. Reported bugs will be fixed, and support will continue for new versions of the OS and game engines. New features will be added according to the developer's roadmap; they can also be added upon request, which requires a service contract.

Detailed description:

The ReaderBench framework can either be cloned from our GitLab Repository or simply used as a deployment library.

The Repository contains three projects:

  1. The ReaderBench Core
  2. The ReaderBench Desktop Client
  3. The ReaderBench API

The ReaderBench Core exposes the Natural Language Processing functionalities and operations performed by ReaderBench. You may either clone this project and explore its contents, or simply use it as a Maven dependency retrieved from our Artifactory server.

The ReaderBench Desktop Client can be used to test ReaderBench functionalities through a Java Swing interface. This project uses the ReaderBench Core, so you may use it as a guide to integrating ReaderBench into your projects.

The ReaderBench API can be used to explore how the ReaderBench Application Programming Interface works. As with the ReaderBench Desktop Client, it also shows how to integrate the ReaderBench Core into a project.

Language: English, French

Access URL:

keywords extraction

topic mining

semantic models


