Scaling laws for generative mixed-modal language models
Massively multilingual sentence embeddings for zero-shot cross-lingual transfer and beyond
Aya 23: Open weight releases to further multilingual progress
XLS-R: Self-supervised cross-lingual speech representation learning at scale
wav2vec 2.0: A framework for self-supervised learning of speech representations
mSLAM: Massively multilingual joint pre-training for speech and text
Improving image generation with better captions
Generating long sequences with sparse transformers
Scaling instruction-finetuned language models
w2v-BERT: Combining contrastive learning and masked language modeling for self-supervised speech pre-training
Unsupervised cross-lingual representation learning at scale
LCFO: Long context and long form output dataset and benchmarking
BERT: Pre-training of deep bidirectional transformers for language understanding
Text and Context: Explorations in the Semantics and Pragmatics of Discourse
SONAR Expressive: Zero-shot expressive speech-to-speech translation
Modular speech-to-text translation for zero-shot cross-modal transfer
Language-agnostic BERT sentence embedding
Scene-based text-to-image generation with human priors
Using BERT encoding and sentence-level language model for sentence ordering
Accurate, large minibatch SGD: Training ImageNet in 1 hour
Effective parallel corpus mining using bilingual sentence embeddings
Efficient diffusion training via Min-SNR weighting strategy
XL-Sum: Large-scale multilingual abstractive summarization for 44 languages
Problem-solving recognition in scientific text
Teaching machines to read and comprehend
Classifier-free diffusion guidance
spaCy: Industrial-strength Natural Language Processing in Python
Elucidating the design space of diffusion-based generative models
SAMU-XLSR: Semantically-aligned multimodal utterance-level cross-lingual speech representation
Understanding diffusion objectives as the ELBO with simple data augmentation
Reformer: The efficient transformer
Reformulating unsupervised style transfer as paraphrase generation
Autoregressive image generation using residual quantization
ROUGE: A package for automatic evaluation of summaries
Common diffusion noise schedules and sample steps are flawed
Improved residual vector quantization for high-dimensional approximate nearest neighbor search
DPM-Solver: A fast ODE solver for diffusion probabilistic model sampling in around 10 steps
A corpus and cloze evaluation for deeper understanding of commonsense stories
Don’t give me the details, just the summary! Topic-aware convolutional neural networks for extreme summarization
Improved denoising diffusion probabilistic models
Elucidating the exposure bias in diffusion models
BLEU: A method for automatic evaluation of machine translation
Speech disturbances in schizophrenia: Assessing cross-linguistic generalizability of NLP automated measures of coherence
Scalable diffusion models with transformers
FiLM: Visual reasoning with a general conditioning layer
Language models are unsupervised multitask learners
Exploring the limits of transfer learning with a unified text-to-text transformer
Sentence-BERT: Sentence embeddings using Siamese BERT-networks
Progressive distillation for fast sampling of diffusion models
GLU variants improve transformer
LOLA: An open-source massively multilingual large language model
One embedder, any task: Instruction-finetuned text embeddings
RoFormer: Enhanced transformer with rotary position embedding
Llama 2: Open foundation and fine-tuned chat models
Aya model: An instruction finetuned open-access multilingual language model
Neural network acceptability judgments
Neural text generation with unlikelihood training
Improving multilingual sentence embedding using bi-directional dual encoder with additive margin softmax
SoundStream: An end-to-end neural audio codec
Root mean square layer normalization
PLANNER: Generating diversified paragraph via latent language diffusion model