Sport news classification

Automatic classification of sport news into different categories using state-of-the-art Natural Language Processing (NLP). The project builds upon an existing dataset of Norwegian soccer news.

Natural language processing recently received a substantial performance boost from the BERT model. BERT (Bidirectional Encoder Representations from Transformers) has achieved state-of-the-art results on a wide range of NLP tasks within the research community. The Norwegian language has not yet been well studied in terms of NLP, and research in this direction is needed.

Goal

The goal of this master project is to build upon an existing soccer news dataset and the BERT model to create a model that can process Norwegian news. The output of the model should be flexible, ranging from multi-class classification to regression. A classification model could be used to classify news into different categories, whereas a regression model could, for example, determine the importance of an article.
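To illustrate the difference between the two output types, the following is a minimal sketch in plain Python, not the project's implementation. It assumes that a BERT encoder has already turned an article into a fixed-size embedding vector (here a toy list of three floats) and shows how a classification head and a regression head could sit on top of that same vector. The category names, weights, and function names are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence, Tuple


def linear(weights: Sequence[Sequence[float]], bias: Sequence[float],
           x: Sequence[float]) -> List[float]:
    """Compute y = W x + b, with one row of W per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(logits: Sequence[float]) -> List[float]:
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(embedding: Sequence[float], weights: Sequence[Sequence[float]],
             bias: Sequence[float], labels: Sequence[str]) -> Tuple[str, List[float]]:
    """Classification head: one output unit per category, then softmax."""
    probs = softmax(linear(weights, bias, embedding))
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs


def score(embedding: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Regression head: a single linear output unit, no softmax."""
    return linear([weights], [bias], embedding)[0]


# Toy stand-in for a BERT sentence embedding (assumption: real ones are much larger).
embedding = [0.5, -1.0, 2.0]

# Hypothetical news categories and hand-picked weights, not learned values.
labels = ["match report", "transfer", "injury"]
cls_weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
cls_bias = [0.0, 0.0, 0.0]

label, probs = classify(embedding, cls_weights, cls_bias, labels)
importance = score(embedding, [0.2, 0.1, 0.4], 0.0)

print("predicted category:", label)
print("importance score:", importance)
```

In this sketch only the final layer changes between the two tasks, which is one way the same encoder could serve both classification and regression. In a trained model the weights would be learned from the labelled news dataset rather than set by hand.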

Learning outcome

  • Work with complex deep neural network models and state-of-the-art NLP

Qualifications

  • Python programming
  • Knowledge about machine learning is an advantage

Supervisors

  • Pål Halvorsen
  • Michael Riegler
  • Steven Hicks

Associated contacts

Pål Halvorsen

Chief Research Scientist/Research Professor, Head of Department, Professor

Steven Hicks

Research Scientist

Michael Riegler

Head of AI Strategy, Professor