Attention Is All You Need

About

"Attention is all you need" is the title of a 2017 machine learning paper, that is sometimes jokingly referred to in other contexts as a catchphrase "X is all you need".

Origin

The landmark paper was written by Google Brain researcher Ashish Vaswani and his co-authors and marks one of the most important steps forward in recent deep learning history. It introduced the Transformer architecture, which became the basis for seminal advances in language processing, image classification [https://paperswithcode.com/paper/an-image-is-worth-16x16-words-transformers-1] and generative models [https://paperswithcode.com/paper/transgan-two-transformers-can-make-one-strong]. OpenAI's widely discussed GPT-3, for example, is a Transformer-based architecture ("Generative Pre-trained Transformer 3").
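
For context on the title itself: the paper's core operation is scaled dot-product attention, Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, which computes each output as a weighted average of value vectors. Below is a minimal, illustrative NumPy sketch of that formula; it is not the paper's own code, and the toy shapes and names are assumptions made for this example.

    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        # Q: (num_queries, d_k); K: (num_keys, d_k); V: (num_keys, d_v)
        d_k = Q.shape[-1]
        # Score each query against each key, scaled by sqrt(d_k) as in the paper.
        scores = Q @ K.T / np.sqrt(d_k)
        # Numerically stable row-wise softmax: weights in each row sum to 1.
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        # Each output row is a weighted average of the value rows.
        return weights @ V

    # Toy usage: 4 tokens, d_k = d_v = 8 (sizes chosen arbitrarily).
    rng = np.random.default_rng(0)
    Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
    print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8)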

Notable Examples

Comments under a video by machine-learning YouTuber Yannic Kilcher, riffing on the paper's title and on other well-known paper titles:

carlos dg: "Yannic Kilcher is all you need"
omer sahban: "The unreasonable efficiency of Yannic Kilcher"
John Gilbert: "Learning to summarize from Yannic Kilcher"
John Gilbert: "Self-training with Noisy Yannic Kilcher"
Pedro Abreu: "Meta-Yannic Kilcher"
Mattias W.: "Hold on to your papers fellow scholars. What a time to be sentient."
anotherplatypus: "Machine Learning Research Paper Summarization Models are Yannic Kilchers!"
Seong: "Yannicsformer"

A parody abstract, titled "Money Is All You Need" and attributed to "Nick Debu" of the "Tokyo Institute of Bamboo Steamer":

"Transformer-based models routinely achieve state-of-the-art results on a number of tasks, but training these models can be prohibitively costly, especially on long sequences. We introduce one technique to improve the performance of Transformers. We replace NVIDIA P100s with TPUs, changing the memory from hoge GB to piyo GB. The resulting model performs on par with Transformer-based models while being much more 'TSUYO TSUYO'." ("Hoge" and "piyo" are Japanese placeholder names akin to "foo" and "bar"; "tsuyo tsuyo" plays on the Japanese tsuyoi, "strong.")
