Natural Language Processing
- Cross-model Alignment ●●○
- How similarly is linguistic information represented across models?
- Data: Any linguistically-motivated dataset, ideally multilingual (e.g., Universal Dependencies for multilingual syntactic parsing; Nivre et al., 2020) + a diverse selection of pre-trained language models.
- Method: Encode the same raw data + probe for linguistically-motivated tasks. Compare embeddings and probes for the same data across models.
- Evaluation: Quantitative metric used by the relevant dataset(s) / Qualitative analysis.
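One standard way to compare embeddings of the same inputs across models with different dimensionalities is linear Centered Kernel Alignment (CKA); a minimal numpy sketch (function name and shapes are illustrative):

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between two embedding matrices for the same inputs.

    X: (n_samples, dim_a), Y: (n_samples, dim_b) — the two models may
    have different embedding sizes. Returns a similarity in [0, 1]."""
    X = X - X.mean(axis=0)  # center each feature
    Y = Y - Y.mean(axis=0)
    num = np.linalg.norm(X.T @ Y, ord="fro") ** 2
    den = np.linalg.norm(X.T @ X, ord="fro") * np.linalg.norm(Y.T @ Y, ord="fro")
    return num / den
```

CKA is invariant to orthogonal rotations of either embedding space, which is exactly the property needed when the two models' coordinate systems are arbitrary.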
- Syntactic Annotation of Lyrics ●○○
- How are syntactic characteristics reflected in lyrics from different types of music?
- Data: Various options available (e.g., DALI by Meseguer-Brocal et al., 2020).
- Method: Joint annotation to create a novel dataset + extensive linguistic analysis.
- Evaluation: Inter-annotator agreement / LAS / qualitative analysis.
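For the inter-annotator agreement part, Cohen's kappa over two annotators' label sequences is a natural starting point; a minimal stdlib sketch, assuming two annotators labelling the same items:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators on the same items."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # expected agreement if both annotators labelled at random
    # according to their own label distributions
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[l] * counts_b[l] for l in counts_a) / n ** 2
    return (observed - expected) / (1 - expected)
```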
- Asking the Right Questions ●●○
- Can a model ask questions about a piece of text which are interesting to humans?
- Data: A Question Answering dataset such as SQuAD (Rajpurkar et al., 2016).
- Method: Identify suitable models for representing article/paragraph-level information and use a generative, ranking, or classification model to identify/generate the questions which were asked by humans.
- Addition: Use a model trained on QA to generate questions which are easier/harder to answer.
- Evaluation: F1-Score / BLEU (for generation).
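For the generation variant, sentence-level BLEU against the human-written question is a reasonable first metric; a single-reference stdlib sketch (clipped n-gram precision with brevity penalty, uniform weights):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU with a single reference; returns 0.0 if any
    n-gram order has no overlap (no smoothing)."""
    cand, ref = candidate.split(), reference.split()
    log_prec = 0.0
    for n in range(1, max_n + 1):
        c_counts = Counter(ngrams(cand, n))
        r_counts = Counter(ngrams(ref, n))
        clipped = sum(min(c, r_counts[g]) for g, c in c_counts.items())
        if clipped == 0:
            return 0.0
        log_prec += math.log(clipped / sum(c_counts.values())) / max_n
    brevity = min(1.0, math.exp(1 - len(ref) / len(cand)))
    return brevity * math.exp(log_prec)
```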
- Bootstrapping Dependency Relations ●○○
- Are dependency relations of the same type represented similarly by masked language models?
- Data: Universal Dependencies (Nivre et al., 2020).
- Method: Use contextualized embeddings to extract partial parses from unlabelled data based on a set of seed examples.
- Evaluation: F1-Score.
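The seed-based labelling step could be framed as nearest-neighbour classification: represent each head-dependent pair as a vector (e.g., the difference of contextualized embeddings) and assign it the relation label of its most similar seed. A numpy sketch with illustrative names:

```python
import numpy as np

def label_by_seed(candidate_vecs, seed_vecs, seed_labels):
    """Assign each candidate head-dependent pair the relation label of
    its most cosine-similar seed pair.

    candidate_vecs: (n_cand, dim), seed_vecs: (n_seed, dim)."""
    def normalize(M):
        return M / np.linalg.norm(M, axis=1, keepdims=True)
    sims = normalize(candidate_vecs) @ normalize(seed_vecs).T
    return [seed_labels[i] for i in sims.argmax(axis=1)]
```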
- Prompting for Dependencies ●○○
- Can targeted prompting of pre-trained language models elicit graphical structures?
- Data: Universal Dependencies (Nivre et al., 2020).
- Method: Identify a scheme which allows prompting for dependency graphs from pre-trained masked language models.
- Evaluation: UAS/LAS.
- Graphical Probing ●●○
- What kinds of graphical relations can be extracted from contextualized embeddings using linear probes?
- Data: Graphical tasks such as Relation Extraction (ACE).
- Method: Linear probe optimized to map embedding distances to graph distances (Hewitt and Manning, 2019).
- Evaluation: F1-Score.
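The Hewitt and Manning (2019) structural probe learns a linear map B such that squared distances between projected embeddings approximate gold graph distances; a minimal numpy sketch of the distance computation and loss (names illustrative; the training loop over B is omitted):

```python
import numpy as np

def probe_distances(H, B):
    """Squared probe distances d(i, j) = ||B h_i - B h_j||^2.

    H: (n_tokens, dim) contextualized embeddings, B: (rank, dim) probe."""
    P = H @ B.T                           # project into probe space
    diff = P[:, None, :] - P[None, :, :]  # pairwise differences
    return (diff ** 2).sum(-1)

def probe_loss(H, B, graph_dist):
    """L1 loss between probed distances and gold graph distances,
    normalized by the squared sentence length (as in the paper)."""
    n = len(H)
    return np.abs(probe_distances(H, B) - graph_dist).sum() / n ** 2
```

Optimizing B by gradient descent on this loss, then thresholding or minimum-spanning-tree decoding over the predicted distances, recovers candidate graph edges.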
- Storyline Coherence ●●○
- Are long-sequence transformers able to accurately estimate the timeline coherence of a story?
- Data: Crawl fanfiction archive including chapter and paragraph metadata.
- Method: Ranking, regression or classification using a pre-trained long-sequence transformer model.
- Alternative: Investigate more recent ∞-former architecture (Martins et al., 2022).
- Evaluation: Rank correlation.
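Rank correlation between predicted and gold coherence scores needs no external dependencies; a minimal Spearman's rho sketch, assuming no tied scores (the simplified formula below is only valid without ties):

```python
def rank(values):
    """1-based ranks; ties broken by position (fine for distinct scores)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    for r, i in enumerate(order, start=1):
        ranks[i] = float(r)
    return ranks

def spearman_rho(a, b):
    """Spearman rank correlation between two score lists of equal length."""
    n = len(a)
    d2 = sum((x - y) ** 2 for x, y in zip(rank(a), rank(b)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))
```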
- Song Title Generation ●○○
- Can we generate an appropriate title given a song’s lyrics?
- Data: Million Song Dataset (Bertin-Mahieux et al., 2011) + musiXmatch lyrics / LyricsLab data.
- Method: Generative models (e.g. n-gram language models, Recurrent Neural Networks, Transformers).
- Evaluation: BLEU, word overlap / Qualitative evaluation.
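As a baseline before neural models, a bigram language model over lyric lines can already generate candidate titles; a toy stdlib sketch (names illustrative):

```python
import random
from collections import defaultdict

def train_bigrams(lines):
    """Collect bigram continuations from whitespace-tokenized lyric lines."""
    model = defaultdict(list)
    for line in lines:
        tokens = ["<s>"] + line.split() + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            model[prev].append(nxt)  # duplicates encode frequency
    return model

def generate_title(model, max_len=6, seed=0):
    """Sample a short title by walking the bigram chain from <s>."""
    rng = random.Random(seed)
    token, out = "<s>", []
    while len(out) < max_len:
        token = rng.choice(model[token])
        if token == "</s>":
            break
        out.append(token)
    return " ".join(out)
```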
Multimodality
- Synesthetic Relation Matching ●●●
- Which relationships between entities in one modality best map to those in another?
- Data: Multimodal datasets such as COCO or ImageNet.
- Method: Train self-supervised representations separately for each modality. Extract relation projections within each modality, and attempt to match them to find corresponding entity pairs.
- Evaluation: Top-k matching accuracy / Qualitative analysis.
- Matching Songs and Artworks ●●○
- Are matches in textual content or sentiment representative of audio-visual correspondence?
- Data: Behance Artistic Media Dataset (Wilber et al., 2017) + Million Song Dataset (Bertin-Mahieux et al., 2011).
- Method (1): Measure audio-visual similarity by proxy, via the textual similarity of BAM captions and musiXmatch lyrics.
- Method (2): Discriminate between artwork and lyrics with matching/differing BAM emotion and MusicMood labels.
- Evaluation: Human preference / retrieval performance on MusicMood dev.
- Cross-modal Onomatopoeia ●●○
- How much cross-modal information can be inferred from textual onomatopoeia?
- Data: Japanese onomatopoeia dictionaries (e.g., learning resources) + annotated image data (e.g., COCO), speech data (e.g., Common Voice), or additional textual data (e.g., Multilingual Amazon Reviews).
- Method: Extract pairs of relevant onomatopoeia and images/speech/text (e.g., 「凸凹」↔ unpaved road, 「ヒラリ」↔ feather); investigate whether relations between onomatopoeia in one latent space hold in the other.
- Evaluation: Predictive accuracy / Top-k retrieval accuracy.
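Top-k retrieval accuracy over paired cross-modal embeddings (assuming queries and targets are aligned by index, i.e., query i's true match is target i) might be sketched as:

```python
import numpy as np

def top_k_accuracy(query_vecs, target_vecs, k=5):
    """Fraction of queries whose true target (same row index) appears
    among the k most cosine-similar targets."""
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    t = target_vecs / np.linalg.norm(target_vecs, axis=1, keepdims=True)
    sims = q @ t.T
    topk = np.argsort(-sims, axis=1)[:, :k]  # indices of k best targets
    return float(np.mean([i in topk[i] for i in range(len(q))]))
```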
- Music Genre Classification ●○○
- Which features or combinations thereof best predict musical genre?
- Data: Million Song Dataset (Bertin-Mahieux et al., 2011) + tagtraum (Schreiber, 2015) or similar.
- Method: Favourite supervised learning algorithm.
- Evaluation: F1-Score.
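A sensible first baseline for the supervised step is a nearest-centroid classifier over audio features; a numpy sketch with illustrative names:

```python
import numpy as np

def nearest_centroid(X_train, y_train, X_test):
    """Assign each test song to the genre whose mean feature vector
    (centroid) is closest in Euclidean distance.

    X_train: (n, dim) features, y_train: genre labels, X_test: (m, dim)."""
    labels = sorted(set(y_train))
    centroids = np.stack([X_train[np.array(y_train) == l].mean(axis=0)
                          for l in labels])
    dists = ((X_test[:, None, :] - centroids[None]) ** 2).sum(-1)
    return [labels[i] for i in dists.argmin(axis=1)]
```

Anything from logistic regression to gradient-boosted trees should beat this; the baseline mainly calibrates how informative each feature set is on its own.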
- International Space Station Sightings ●○○
- How accurately can a model fit the ISS's orbital path based on visibility data?
- Data: Sighting data of the International Space Station for 6.8k locations across 3 years from the Flyover service.
- Method: Favourite supervised time-series prediction algorithm.
- Addition: Incorporate cross-modal information sources to improve accuracy.
- Evaluation: Absolute difference evaluated for unseen locations.
- Public Transport Prediction ●●○
- What influences public transportation punctuality in Denmark?
- Data: Crawl realtime public transport from Rejseplanen API.
- Method: Favourite supervised regression algorithm.
- Addition: Incorporate cross-modal information sources to improve accuracy.
- Evaluation: Absolute difference evaluated for unseen data.
Artsy
- Spectral Analysis of Music Embeddings ●●●
- Do neuron activation frequencies in generative models for music correspond to long/short-term structure?
- Data: Pre-trained Music Transformer model (Huang et al., 2018).
- Method: Apply the discrete cosine transform to the model's hidden representations (Tamkin et al., 2020).
- Evaluation: Qualitative evaluation / Perplexity over time given hidden representations filtered at different frequencies.
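Following Tamkin et al. (2020), the idea is to apply the discrete cosine transform along the time axis of each neuron's activation trace and inspect which frequencies dominate. A self-contained DCT-II sketch (in practice a library routine such as scipy's would be used; `dominant_frequency` is an illustrative helper):

```python
import numpy as np

def dct_ii(x):
    """DCT-II of a 1-D activation trace (one neuron over timesteps)."""
    N = len(x)
    n = np.arange(N)
    return np.array([2.0 * (x * np.cos(np.pi * (2 * n + 1) * k / (2 * N))).sum()
                     for k in range(N)])

def dominant_frequency(trace):
    """Index of the strongest non-DC DCT coefficient: low values suggest
    the neuron tracks long-term structure, high values short-term."""
    coeffs = np.abs(dct_ii(np.asarray(trace, dtype=float)))
    return 1 + int(coeffs[1:].argmax())
```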
- Pokémon Generation ●●●
- Can a generative model with limited target modality data benefit from weakly aligned data in another modality?
- Data: Pokemon Images Dataset + Pokemon Stats Dataset.
- Method: (Sin-)GAN (Shaham et al., 2019) / VAE with latent space alignment (Theodoridis et al., 2020).
- Evaluation: Qualitative evaluation / Output classification with regards to desired statistics.
- Departure Melody Generation ●●●
- Are railway station properties sufficient to conditionally generate appropriate departure melodies?
- Data: Custom MIDI dataset + crawling publicly available sources.
- Method: Sequence generation model conditioned on, e.g., station names and locations from crawled data.
- Evaluation: Qualitative evaluation / Output classification with regards to desired statistics.
A few ideas that could be suitable as supervised projects. The ●○○ markers indicate the estimated difficulty, effort, and uncertainty of each project. If you have a project in mind that fits a similar profile, please feel free to reach out as well.