FLAN-T5
https://huggingface.co/docs/transformers/model_doc/flan-t5
Paper: https://arxiv.org/pdf/2210.11416.pdf
An encoder-decoder model based on T5
Scaling Instruction-Finetuned Language Models
An improved version of T5, fine-tuned across many tasks
Flan
An instruction fine-tuning method based on prompting
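As a quick illustration, here is a minimal sketch of prompting FLAN-T5 through the transformers library; the small checkpoint (google/flan-t5-small) is used only to keep the example light, and any other size can be substituted.

```python
# Minimal sketch: prompting FLAN-T5 with Hugging Face transformers.
# The checkpoint size is an assumption for a quick local test.
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-small")
model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-small")

# Flan-T5 is instruction-tuned, so tasks can be stated directly in the prompt.
inputs = tokenizer("Translate English to German: How old are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```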
FLAN-T5-XL
https://huggingface.co/google/flan-t5-xl
Supports 60 languages, including Korean
A T5 model trained on the Flan collection of datasets, including taskmaster2, djaym7/wiki_dialog, deepmind/code_contests, lambada, gsm8k, aqua_rat, esnli, quasc, and qed
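A sketch of loading the 3B-parameter XL checkpoint on GPU follows; device_map="auto" requires the accelerate package, and the Korean-translation prompt is only an assumed example tying into the language support note above (output quality will vary).

```python
# Sketch: running google/flan-t5-xl on a CUDA GPU.
# device_map="auto" (from accelerate) places the weights automatically.
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-xl")
model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-xl", device_map="auto")

input_ids = tokenizer("translate English to Korean: Hello, how are you?", return_tensors="pt").input_ids.to("cuda")
outputs = model.generate(input_ids, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```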
FLAN-T5-XXL
https://huggingface.co/google/flan-t5-xxl
An 11-billion-parameter model in the Flan-T5 family
Language(s) (NLP): English, German, French
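At 11B parameters, the XXL checkpoint is too large for most single GPUs in full precision; below is a hedged sketch of loading it in 8-bit, assuming the bitsandbytes and accelerate packages are installed and a CUDA GPU with enough memory is available.

```python
# Sketch: loading google/flan-t5-xxl in 8-bit to roughly halve memory vs. fp16
# (assumes bitsandbytes + accelerate are installed and a CUDA GPU is present).
from transformers import T5Tokenizer, T5ForConditionalGeneration, BitsAndBytesConfig

tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-xxl")
model = T5ForConditionalGeneration.from_pretrained(
    "google/flan-t5-xxl",
    device_map="auto",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
)

input_ids = tokenizer("Answer the question: what is instruction tuning?", return_tensors="pt").input_ids.to("cuda")
outputs = model.generate(input_ids, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```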