Natural Language Processing

Cross-model Alignment ●●○

How similarly is linguistic information represented across models?

  • Data: Any linguistically-motivated dataset, ideally multilingual (e.g., Universal Dependencies for multilingual syntactic parsing; Nivre et al., 2020) + a diverse selection of pre-trained language models.
  • Method: Encode the same raw data + probe for linguistically-motivated tasks. Compare embeddings and probes for the same data across models (see the sketch below).
  • Evaluation: Quantitative metric used by the relevant dataset(s) / Qualitative analysis.
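
A minimal sketch of the comparison step, using linear centered kernel alignment (CKA) between mean-pooled sentence representations; the two model names and example sentences are illustrative assumptions, not fixed requirements:

```python
import torch
from transformers import AutoModel, AutoTokenizer

def encode(model_name, sentences):
    """Mean-pool the last hidden layer over non-padding tokens."""
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    batch = tok(sentences, padding=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state     # (B, T, D)
    mask = batch["attention_mask"].unsqueeze(-1)      # (B, T, 1)
    return (hidden * mask).sum(1) / mask.sum(1)       # (B, D)

def linear_cka(X, Y):
    """Linear Centered Kernel Alignment between two (B, D) matrices."""
    X = X - X.mean(0, keepdim=True)
    Y = Y - Y.mean(0, keepdim=True)
    return ((Y.T @ X).norm() ** 2 / ((X.T @ X).norm() * (Y.T @ Y).norm())).item()

sentences = ["The cat sat on the mat.",
             "Dependency trees are graphs.",
             "Shared structure may survive across models."]
emb_a = encode("bert-base-multilingual-cased", sentences)
emb_b = encode("xlm-roberta-base", sentences)
print(f"CKA(mBERT, XLM-R) = {linear_cka(emb_a, emb_b):.3f}")
```
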
Syntactic Annotation of Lyrics ●○○

How are syntactic characteristics reflected in lyrics from different types of music?

  • Data: Various options are available (e.g., DALI; Meseguer-Brocal et al., 2020).
  • Method: Joint annotation to create a novel dataset + extensive linguistic analysis.
  • Evaluation: Inter-annotator agreement (sketched below) / LAS / qualitative analysis.
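
A sketch of the agreement computation, assuming two annotators' labels have been aligned token-by-token; the label sequences here are made up:

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical aligned annotations for five tokens of a lyric line.
annotator_a = ["NOUN", "VERB", "NOUN", "ADP", "NOUN"]
annotator_b = ["NOUN", "VERB", "PRON", "ADP", "NOUN"]
print(f"Cohen's kappa: {cohen_kappa_score(annotator_a, annotator_b):.3f}")
```
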
Asking the Right Questions ●●○

Can a model ask questions about a piece of text which are interesting to humans?

  • Data: A Question Answering dataset such as SQuAD (Rajpurkar et al., 2016).
  • Method: Identify suitable models for representing article/paragraph-level information and use a generative, ranking, or classification model to identify/generate the questions which were asked by humans (the ranking variant is sketched below).
  • Addition: Use a model trained on QA to generate questions which are easier/harder to answer.
  • Evaluation: F1-Score / BLEU (for generation).
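
A sketch of the ranking variant, scoring candidate questions against a paragraph with an off-the-shelf sentence encoder; the checkpoint name and example texts are illustrative assumptions:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
paragraph = ("The Norman conquest of England began in 1066 with the "
             "invasion led by William, Duke of Normandy.")
candidates = [
    "When did the Norman conquest of England begin?",
    "Who led the invasion of England?",
    "What is the capital of France?",
]
p_emb = model.encode(paragraph, convert_to_tensor=True)
q_emb = model.encode(candidates, convert_to_tensor=True)
scores = util.cos_sim(q_emb, p_emb).squeeze(-1)       # one score per question
for score, question in sorted(zip(scores.tolist(), candidates), reverse=True):
    print(f"{score:.3f}  {question}")
```
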
Bootstrapping Dependency Relations ●○○

Are dependency relations of the same type represented similarly by masked language models?

  • Data: Universal Dependencies (Nivre et al., 2020).
  • Method: Use contextualized embeddings to extract partial parses from unlabelled data based on a set of seed examples (a minimal sketch follows below).
  • Evaluation: F1-Score.
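
A rough sketch of the bootstrapping step: represent a (head, dependent) pair as the difference of its contextualized token embeddings and label it by its nearest seed pair. The seed examples and the subword-to-word alignment are deliberately simplified:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModel.from_pretrained("bert-base-cased")

def pair_vector(sentence, head_idx, dep_idx):
    """Embedding difference between head and dependent (first subtoken each)."""
    enc = tok(sentence.split(), is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]
    word_ids = enc.word_ids()
    first = {w: i for i, w in reversed(list(enumerate(word_ids))) if w is not None}
    return hidden[first[head_idx]] - hidden[first[dep_idx]]

# Seed examples: (sentence, head index, dependent index, relation label).
seeds = [
    ("the dog barked", 2, 1, "nsubj"),
    ("she reads books", 1, 2, "obj"),
]
seed_vecs = torch.stack([pair_vector(s, h, d) for s, h, d, _ in seeds])

# Label an unseen pair by its nearest seed.
query = pair_vector("the cat slept", 2, 1)
sims = torch.nn.functional.cosine_similarity(seed_vecs, query.unsqueeze(0))
print("predicted relation:", seeds[sims.argmax().item()][3])
```
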
Prompting for Dependencies ●○○

Can targeted prompting of pre-trained language models elicit graphical structures?

  • Data: Universal Dependencies (Nivre et al., 2020).
  • Method: Identify a scheme which allows prompting for dependency graphs from pre-trained masked language models (a toy template is sketched below).
  • Evaluation: UAS/LAS.
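
A toy sketch of one possible prompting scheme, asking a masked language model to name the head of a given word. The template is an untested assumption; finding a scheme that actually works is the point of the project:

```python
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-cased")
template = ('In the sentence "the dog barked", '
            'the word "dog" is attached to the word "[MASK]".')
for pred in fill(template, top_k=3):
    print(f'{pred["token_str"]:>12}  {pred["score"]:.3f}')
```
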
Graphical Probing ●●○

What kinds of graphical relations can be extracted from contextualized embeddings using linear probes?

  • Data: Graphical tasks such as Relation Extraction (ACE).
  • Method: Linear probe optimized to map embedding distances to graph distances (Hewitt and Manning, 2019); the objective is sketched below.
  • Evaluation: F1-Score.
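
A minimal sketch of the structural-probe objective in the spirit of Hewitt and Manning (2019): learn a linear map B so that squared distances in the projected space approximate gold graph distances. The shapes and training data below are toy stand-ins:

```python
import torch

dim, rank = 768, 64                               # embedding size, probe rank
B = torch.nn.Parameter(torch.randn(rank, dim) * 0.01)
opt = torch.optim.Adam([B], lr=1e-3)

def probe_loss(H, D):
    """H: (T, dim) token embeddings; D: (T, T) gold graph distances."""
    proj = H @ B.T                                # (T, rank)
    diff = proj.unsqueeze(0) - proj.unsqueeze(1)  # all pairwise differences
    pred = (diff ** 2).sum(-1)                    # predicted squared distances
    return (pred - D).abs().mean()                # L1, following the original probe

# One toy optimisation step on random stand-ins for a parsed sentence.
H = torch.randn(10, dim)
D = torch.randint(0, 5, (10, 10)).float()
D = (D + D.T) / 2
D.fill_diagonal_(0)
loss = probe_loss(H, D)
loss.backward()
opt.step()
print(f"probe loss: {loss.item():.3f}")
```
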
Storyline Coherence ●●○

Are long-sequence transformers able to accurately estimate the timeline coherence of a story?

  • Data: Crawl a fanfiction archive, including chapter and paragraph metadata.
  • Method: Ranking, regression, or classification using a pre-trained long-sequence transformer model (the regression variant is sketched below).
  • Alternative: Investigate more recent ∞-former architecture (Martins et al., 2022).
  • Evaluation: Rank correlation.
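
A sketch of the regression variant, assuming stories paired with gold coherence scores; the story text and the target score below are placeholders:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "allenai/longformer-base-4096"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=1)

story = "Chapter 1: ... Chapter 2: ..."  # stand-in for a crawled story
batch = tok(story, truncation=True, max_length=4096, return_tensors="pt")
target = torch.tensor([0.8])             # hypothetical gold coherence score

out = model(**batch, labels=target)      # num_labels=1 -> MSE regression loss
out.loss.backward()
print(f"predicted coherence: {out.logits.item():.3f}")
```
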
Song Title Generation ●○○

Can we generate an appropriate title given a song’s lyrics?

  • Data: Million Song Dataset (Bertin-Mahieux et al., 2011) + musiXmatch lyrics / LyricsLab data.
  • Method: Generative models (e.g., n-gram language models, Recurrent Neural Networks, Transformers); a zero-shot baseline is sketched below.
  • Evaluation: BLEU, word overlap / Qualitative evaluation.
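
As a zero-shot baseline before any fine-tuning, title generation can be framed as extreme summarization; the model choice and the lyrics below are illustrative:

```python
from transformers import pipeline

summarizer = pipeline("summarization", model="t5-small")
lyrics = ("I'm walking down the road alone, the night is cold and long, "
          "but every step I take brings me closer to your song.")
title = summarizer(lyrics, max_length=10, min_length=2, do_sample=False)
print(title[0]["summary_text"])
```
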

Multimodality

Synesthetic Relation Matching ●●●

Which relationships between entities in one modality best map to those in another?

  • Data: Multimodal datasets such as COCO or ImageNet.
  • Method: Train self-supervised representations separately for each modality. Extract relation projections within each modality, and attempt to match them to find corresponding entity pairs (sketched below).
  • Evaluation: Top-k matching accuracy / Qualitative analysis.
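
A synthetic sketch of the matching step: fit an orthogonal Procrustes map between two embedding spaces on anchor entities, then check whether relation (difference) vectors transfer. All data here is randomly generated:

```python
import numpy as np
from scipy.linalg import orthogonal_procrustes

rng = np.random.default_rng(0)
text_emb = rng.normal(size=(50, 16))            # entity embeddings, modality A
Q, _ = np.linalg.qr(rng.normal(size=(16, 16)))  # hidden orthogonal "modality gap"
img_emb = text_emb @ Q                          # entity embeddings, modality B

# Fit an orthogonal map on 30 anchor entities with known correspondence.
W, _ = orthogonal_procrustes(text_emb[:30], img_emb[:30])

# Does the relation (difference) vector of a held-out pair transfer?
rel_text = (text_emb[40] - text_emb[45]) @ W
rel_img = img_emb[40] - img_emb[45]
cos = rel_text @ rel_img / (np.linalg.norm(rel_text) * np.linalg.norm(rel_img))
print(f"cosine of matched relation vectors: {cos:.3f}")  # ~1.0 on this toy data
```
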
Matching Songs and Artworks ●●○

Are matches in textual content or sentiment representative of audio-visual correspondence?

Cross-modal Onomatopoeia ●●○

How much cross-modal information can be inferred from textual onomatopoeia?

  • Data: Japanese onomatopoeia dictionaries (e.g., learning resources) + annotated image data (e.g., COCO), speech data (e.g., Common Voice), or additional textual data (e.g., Multilingual Amazon Reviews).
  • Method: Extract pairs of relevant onomatopoeia and images/speech/text (e.g., 「凸凹」 (dekoboko, "bumpy") ↔ unpaved road, 「ヒラリ」 (hirari, "fluttering") ↔ feather); investigate whether relations between onomatopoeia in one latent space hold in the other (the evaluation is sketched below).
  • Evaluation: Predictive accuracy / Top-k retrieval accuracy.
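
A sketch of the top-k retrieval evaluation, assuming onomatopoeia and image/speech embeddings have already been mapped into a shared space; the embeddings below are synthetic:

```python
import numpy as np

def top_k_accuracy(queries, targets, k=5):
    """queries[i] should retrieve targets[i]; rows are unit-normalised."""
    sims = queries @ targets.T                   # cosine similarities
    ranks = (-sims).argsort(axis=1)              # best match first
    hits = (ranks[:, :k] == np.arange(len(queries))[:, None]).any(axis=1)
    return hits.mean()

rng = np.random.default_rng(0)
onom = rng.normal(size=(100, 32))
imgs = onom + 0.5 * rng.normal(size=(100, 32))   # noisy aligned pairs
onom /= np.linalg.norm(onom, axis=1, keepdims=True)
imgs /= np.linalg.norm(imgs, axis=1, keepdims=True)
print(f"top-5 retrieval accuracy: {top_k_accuracy(onom, imgs):.2f}")
```
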
Music Genre Classification ●○○

Which features or combinations thereof best predict musical genre?

  • Data: Million Song Dataset (Bertin-Mahieux et al., 2011) + tagtraum (Schreiber, 2015) or similar.
  • Method: Favourite supervised learning algorithm (a random forest baseline is sketched below).
  • Evaluation: F1-Score.
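
A minimal supervised baseline; the features here are random stand-ins for audio features from the Million Song Dataset with tagtraum genre labels:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 20 "audio features", 5 "genres".
X, y = make_classification(n_samples=1000, n_features=20, n_classes=5,
                           n_informative=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
print(f"macro F1: {f1_score(y_te, clf.predict(X_te), average='macro'):.3f}")
```
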
International Space Station Sightings ●○○

How accurately can a model fit the ISS's orbital path based on visibility data?

  • Data: Sighting data of the International Space Station for 6.8k locations across 3 years from the Flyover service.
  • Method: Favourite supervised time-series prediction algorithm (see the sketch below).
  • Addition: Incorporate cross-modal information sources to improve accuracy.
  • Evaluation: Absolute difference evaluated for unseen locations.
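
A sketch of the supervised setup with synthetic stand-ins for the sighting records, holding out unseen locations rather than random rows, as the evaluation requires:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
lat = rng.uniform(-90, 90, 2000)
lon = rng.uniform(-180, 180, 2000)
day = rng.integers(1, 366, 2000)
X = np.column_stack([lat, lon, day])
y = 5 + 0.02 * np.abs(lat) + rng.normal(0, 0.5, 2000)  # toy pass duration (min)

train, test = lat < 45, lat >= 45        # hold out unseen locations, not rows
model = GradientBoostingRegressor().fit(X[train], y[train])
mae = mean_absolute_error(y[test], model.predict(X[test]))
print(f"MAE on unseen locations: {mae:.2f} minutes")
```
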
Public Transport Prediction ●●○

What influences public transportation punctuality in Denmark?

  • Data: Crawl real-time public transport data from the Rejseplanen API.
  • Method: Favourite supervised regression algorithm.
  • Addition: Incorporate cross-modal information sources to improve accuracy.
  • Evaluation: Absolute difference evaluated for unseen data.

Artsy

Spectral Analysis of Music Embeddings ●●●

Do neuron activation frequencies in generative models for music correspond to long/short-term structure?

  • Data: Pre-trained Music Transformer model (Huang et al., 2018).
  • Method: Apply the discrete cosine transform to the model's hidden representations (Tamkin et al., 2020); see the sketch below.
  • Evaluation: Qualitative evaluation / Perplexity over time given hidden representations filtered at different frequencies.
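
A sketch of the spectral-analysis step in the spirit of Tamkin et al. (2020): apply a DCT along the time axis of hidden states and split them into low- and high-frequency bands. The activations and the cutoff below are stand-ins:

```python
import numpy as np
from scipy.fft import dct, idct

hidden = np.random.randn(128, 512)      # (timesteps, dim) stand-in activations
freq = dct(hidden, axis=0, norm="ortho")

low, high = freq.copy(), freq.copy()
low[16:] = 0                            # keep low frequencies (long-term structure)
high[:16] = 0                           # keep high frequencies (short-term structure)

long_term = idct(low, axis=0, norm="ortho")
short_term = idct(high, axis=0, norm="ortho")
print(long_term.shape, short_term.shape)
```
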
Pokémon Generation ●●●

Can a generative model with limited target modality data benefit from weakly aligned data in another modality?

  • Data: Pokemon Images Dataset + Pokemon Stats Dataset.
  • Method: (Sin-)GAN (Shaham et al., 2019) / VAE with latent space alignment (Theodoridis et al., 2020); the alignment idea is sketched below.
  • Evaluation: Qualitative evaluation / Output classification with regards to desired statistics.
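
A toy sketch of the latent-alignment idea, using a deterministic autoencoder in place of a full VAE/GAN: two encoders (images, stats) are pushed toward a shared latent space by an alignment term next to the reconstruction loss. All sizes and data are made up:

```python
import torch
import torch.nn as nn

img_enc = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 16))
stat_enc = nn.Linear(6, 16)                      # 6 base stats -> latent
img_dec = nn.Linear(16, 32 * 32 * 3)

imgs = torch.rand(8, 3, 32, 32)                  # stand-in sprite batch
stats = torch.rand(8, 6)                         # stand-in stats (HP, attack, ...)

z_img, z_stat = img_enc(imgs), stat_enc(stats)
recon = img_dec(z_img).view(8, 3, 32, 32)
loss = nn.functional.mse_loss(recon, imgs) \
     + nn.functional.mse_loss(z_img, z_stat)     # pull the two latents together
loss.backward()
print(f"toy loss: {loss.item():.3f}")
```
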
Departure Melody Generation ●●●

Are railway station properties sufficient to conditionally generate appropriate departure melodies?

  • Data: Custom MIDI dataset + crawling publicly available sources.
  • Method: Sequence generation model conditioned on, e.g., station names and locations from crawled data (a toy conditioning scheme is sketched below).
  • Evaluation: Qualitative evaluation / Output classification with regards to desired statistics.
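
A toy sketch of one conditioning scheme: project the station features into the model's embedding space and prepend them to a note-level recurrent language model. All sizes and data are illustrative:

```python
import torch
import torch.nn as nn

vocab, cond_dim, hidden = 128, 8, 64               # MIDI pitches, feature size
embed = nn.Embedding(vocab, hidden)
cond_proj = nn.Linear(cond_dim, hidden)
rnn = nn.GRU(hidden, hidden, batch_first=True)
head = nn.Linear(hidden, vocab)

station = torch.rand(1, cond_dim)                  # e.g., location features
notes = torch.randint(0, vocab, (1, 16))           # a melody fragment
x = torch.cat([cond_proj(station).unsqueeze(1), embed(notes)], dim=1)
logits = head(rnn(x)[0][:, :-1])                   # predict each next note
loss = nn.functional.cross_entropy(logits.reshape(-1, vocab), notes.reshape(-1))
print(f"toy loss: {loss.item():.3f}")
```
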

These are a few ideas that could be suitable as supervised projects. The ●○○ markers indicate the estimated difficulty, effort, and uncertainty. If you have a project in mind that fits a similar profile, please feel free to reach out as well.