Posted on Nov 23, 2017 | Rating

ReaderBench Multilingual Natural Language Processing Framework

ReaderBench encompasses advanced Natural Language Processing techniques for a vast collection of language functionalities. It support English, French, Spanish, Dutch, Romanian and Italian languages.

Short non-technical description:

ReaderBench uses advanced Natural Language Processing techniques to provide a vast collection of language functionalities.

ReaderBench is centred on cohesion and dialogism and is comprised of multiple services which address various comprehension assessment and prediction scenarios. Examples of such services include the ability for tutors to execute the assessment of the learning materials and to estimate the students' essays or self-explanations.

You may simply download and run the deployment libraries on your local machine to explore some of the main ReaderBench functionalities. The following libraries are available:

  • Global ReaderBench Desktop Client
  • English ReaderBench Desktop Client
  • French ReaderBench Desktop Client
  • Spanish ReaderBench Desktop Client

The Global version comprises of English, French and Spanish. The other libraries might be used if you intend to use just one of the languages. Some other languages will be published as deployment version in the future, too.

Technical description:

The ReaderBench framework uses the Standard Core NLP library to implement Natural Language Processing services.

A pre-processing pipeline performs the cleaning of the input texts to achieve better results. It comprises of the following steps: tokenisation, splitting, part of speech tagging, lemmatisation, named entity recognition, dependency parsing, and co-reference resolution.

ReaderBench uses semantic models such as Latent Semantic Analysis, Latent Dirichlet Allocation and Word2Vec, and ontologies like WordNet to calculate cohesion scores and similarities between units of texts. The cohesion scores allows the determination of different relations established between documents, paragraphs, sentences and words.

Support levels: The component is available "as is" without warranties or conditions of any kind. Reported bugs will be fixed. Continued support for new versions of the OS and game engines. New features will be added according to the developer's roadmap. New features can be added upon request (requires a service contract)

Detailed description:

The ReaderBench framework can be either cloned from our GitLab Repository or simply used as deployment library.

The Repository contains three projects:

  1. The ReaderBench Core
  2. The ReaderBench Desktop Client
  3. The ReaderBench API

The ReaderBench Core can be accessed to explore the Natural Language Processing functionalities and operations performed by ReaderBench. You may either clone this project and explore its contents, or you can simply use it as a Maven dependency by cloning it from our Artifactory server.

The ReaderBench Desktop Client can be used to test ReaderBench functionalities with the help of a Java Swing interface. This project uses the ReaderBench Core, so you may use it as a guide into integrating ReaderBench in your projects.

The ReaderBench API can be used to explore how the ReaderBench Application Programming Interface works. Similar to the ReaderBench Desktop Client, you may discover how to integrate the ReaderBench Core into a project.

Language: English, French, Spanish, Dutch, Romanian, Italian

Access URL:


natural language processing

semantic similarity

semantic cohesion

semantic models


latent semantic analysis

latent dirichlet allocation



Personal skills Provide explanations Information exploring and managing Reviewing and evaluating
Document management and text processing Annotation Document analysis
Education Collaborative learning
Component Language Processing

Related Articles

ReaderBench - Sentiment Analysis on Texts
UPB, Rage project, Dascalu Mihai

ReaderBench - Automated Essay Grading
UPB, Rage project, Dascalu Mihai

ReaderBench - Automated Identification of Reading Strategies
UPB, Rage project, Dascalu Mihai

ReaderBench - Semantic Models and Topic Mining
UPB, Rage project, Dascalu Mihai

L2F/INESC-ID, Rage project, L2F/INESC-ID

ReaderBench: An Integrated Cohesion-Centered Framework
Mihai Dascalu, Larise Stavrache, Philippe Dessus, Stefan Trausan-Matu, Danielle McNamara, Maryse Bianco

ReaderBench: Automated evaluation of collaboration based on cohesion and dialogism
Mihai Dascalu, Stefan Trausan-Matu, Danielle McNamara, Philippe Dessus

Visualization of polyphonic voices inter-animation in CSCL chats
Mihai Dascalu, Stefan Trausan-Matu

Predicting Newcomer Integration in Online Knowledge Communities by Automated Dialog Analysis
Nicolae Nistor, Mihai Dascalu, Lucia Larise Stavarache, Christian Tarnai, Stefan Trausan-Matu