Generative AI/Language Model

[Large Language Model] BLOOMZ & mT0

데이터 세상 2023. 7. 11. 13:46

BLOOMZ & mT0

https://huggingface.co/bigscience/mt0-xxl

 


 

Paper: https://arxiv.org/abs/2211.01786

 

Crosslingual Generalization through Multitask Finetuning

Multitask prompted finetuning (MTF) has been shown to help large language models generalize to new tasks in a zero-shot setting, but so far explorations of MTF have focused on English data and models.

 

a family of models capable of following human instructions in dozens of languages zero-shot

The pretrained multilingual BLOOM and mT5 models are fine-tuned on a crosslingual task mixture (xP3); the resulting models are capable of crosslingual generalization to unseen tasks and languages.
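As a quick sketch of zero-shot instruction following, a small mT0 checkpoint can be loaded with Hugging Face transformers. The checkpoint choice (mt0-small, to keep the download manageable) and generation settings are illustrative only; BLOOMZ checkpoints are decoder-only and would use AutoModelForCausalLM instead of AutoModelForSeq2SeqLM.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# mt0-small is used here only to keep the example lightweight; the same
# pattern works for larger mT0 checkpoints (mt0-base ... mt0-xxl).
CHECKPOINT = "bigscience/mt0-small"

def generate(prompt: str, max_new_tokens: int = 40) -> str:
    """Tokenize an instruction prompt, generate, and decode the answer."""
    tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
    model = AutoModelForSeq2SeqLM.from_pretrained(CHECKPOINT)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    # A clearly terminated prompt (see the Limitations section below).
    print(generate("Translate to English: Je t'aime."))
```

The model call is kept behind the `__main__` guard because the checkpoint is downloaded on first use.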

 

Datasets

Pretraining: mC4

https://huggingface.co/datasets/mc4

108 languages including Korean

 

Finetuning: xP3

https://huggingface.co/datasets/bigscience/xP3

Crosslingual Public Pool of Prompts

A collection of prompts and datasets covering 46 languages and 16 NLP tasks.

Used to train BLOOMZ and mT0, multilingual language models capable of following human instructions in dozens of languages zero-shot.

 

Name / Explanation / Example models:

xP3: Mixture of 13 training tasks in 46 languages with English prompts
  - Example models: bloomz & mt0-xxl
  - Does not include Korean (language code: ko, country code: kr)
  - Programming languages covered: C, C++, C#, Go, Java, JavaScript, Lua, PHP, Python, Ruby, Rust, Scala, TypeScript

xP3x: Mixture of 17 tasks in 277 languages (including Korean) with English prompts
  - Korean subset: code kor_Hang, 4,642,468 kilobytes (0.68%), 3,415,920 samples (0.64%)
  - Example models: WIP - Join us at Project Aya @C4AI to help!

xP3mt: Mixture of 13 training tasks in 46 languages with prompts in 20 languages (machine-translated from English)
  - Example models: bloomz-mt & mt0-xxl-mt

xP3all: xP3 + evaluation datasets adding an additional 3 tasks for a total of 16 tasks in 46 languages with English prompts

xP3megds: Megatron-DeepSpeed processed version of xP3
  - Example models: bloomz

P3: Reprocessed version of the English-only P3 with 8 training tasks
  - Example models: bloomz-p3 & mt0-xxl-p3
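The variant table above can be kept as a small lookup when scripting. A sketch that hard-codes the rows (task/language counts and released models) as Python data; the dict layout is this post's own, not an official API:

```python
# Dataset variant -> coverage and released example checkpoints,
# transcribed from the table above.
XP3_VARIANTS = {
    "xP3":      {"tasks": 13, "languages": 46,  "models": ["bloomz", "mt0-xxl"]},
    "xP3x":     {"tasks": 17, "languages": 277, "models": []},  # WIP (Project Aya)
    "xP3mt":    {"tasks": 13, "languages": 46,  "models": ["bloomz-mt", "mt0-xxl-mt"]},
    "xP3all":   {"tasks": 16, "languages": 46,  "models": []},
    "xP3megds": {"tasks": 13, "languages": 46,  "models": ["bloomz"]},
    "P3":       {"tasks": 8,  "languages": 1,   "models": ["bloomz-p3", "mt0-xxl-p3"]},
}

def models_for(variant: str) -> list[str]:
    """Return the checkpoints finetuned on a given prompt-dataset variant."""
    return XP3_VARIANTS[variant]["models"]
```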

 

Architecture

Same as mT5-xxl.


mT5 (Multilingual T5)

https://github.com/google-research/multilingual-t5

Language (101 languages)

Afrikaans, Albanian, Amharic, Arabic, Armenian, Azerbaijani, Basque, Belarusian, Bengali, Bulgarian, Burmese, Catalan, Cebuano, Chichewa, Chinese, Corsican, Czech, Danish, Dutch, English, Esperanto, Estonian, Filipino, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hmong, Hungarian, Icelandic, Igbo, Indonesian, Irish, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Kurdish, Kyrgyz, Lao, Latin, Latvian, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Mongolian, Nepali, Norwegian, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Samoan, Scottish Gaelic, Serbian, Shona, Sindhi, Sinhala, Slovak, Slovenian, Somali, Sotho, Spanish, Sundanese, Swahili, Swedish, Tajik, Tamil, Telugu, Thai, Turkish, Ukrainian, Urdu, Uzbek, Vietnamese, Welsh, West Frisian, Xhosa, Yiddish, Yoruba, Zulu

 

 


BLOOMZ & mT0 Model Family

Multitask finetuned on xP3. Recommended for prompting in English.
  - mT0: mt0-small (300M), mt0-base (580M), mt0-large (1.2B), mt0-xl (3.7B), mt0-xxl (13B)
  - BLOOMZ: bloomz-560m (560M), bloomz-1b1 (1.1B), bloomz-1b7 (1.7B), bloomz-3b (3B), bloomz-7b1 (7.1B), bloomz (176B)

Multitask finetuned on xP3mt. Recommended for prompting in non-English.
  - mt0-xxl-mt (13B), bloomz-7b1-mt (7.1B), bloomz-mt (176B)

Multitask finetuned on P3. Released for research purposes only; strictly inferior to the models above.
  - mt0-xxl-p3 (13B), bloomz-7b1-p3 (7.1B), bloomz-p3 (176B)

Original pretrained checkpoints. Not recommended.
  - mT5: mt5-small, mt5-base, mt5-large, mt5-xl, mt5-xxl
  - BLOOM: bloom-560m, bloom-1b1, bloom-1b7, bloom-3b, bloom-7b1, bloom
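The recommendations above can be folded into a small helper that picks a BLOOMZ checkpoint id from the prompt language and a parameter budget. The checkpoint names follow the released bigscience models; the selection logic itself is this post's own sketch, not an official API:

```python
# Parameter count -> xP3-finetuned BLOOM checkpoint, from the table above.
BLOOMZ_BY_PARAMS = {
    "560M": "bloomz-560m", "1.1B": "bloomz-1b1", "1.7B": "bloomz-1b7",
    "3B": "bloomz-3b", "7.1B": "bloomz-7b1", "176B": "bloomz",
}

def recommend(english_prompts: bool, params: str = "7.1B") -> str:
    """Return a Hugging Face model id following the table's recommendations."""
    base = BLOOMZ_BY_PARAMS[params]
    if english_prompts:
        return f"bigscience/{base}"  # xP3-finetuned, best for English prompts
    # xP3mt (-mt) variants were released only at 7.1B and 176B.
    if params not in ("7.1B", "176B"):
        raise ValueError("xP3mt variants exist only at 7.1B and 176B")
    return f"bigscience/{base}-mt"
```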

 


Limitations

Prompt engineering:

Performance may vary depending on the prompt.

For BLOOMZ models, it is recommended to make it very clear where the input stops, so that the model does not try to continue it.

For example, the prompt "Translate to English: Je t'aime" without a period (.) at the end may cause the model to try to continue the French sentence.

Better prompts are, for example:

"Translate to English: Je t'aime.", "Translate to English: Je t'aime. Translation:", "What is "Je t'aime." in English?"
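That advice can be codified in a tiny helper that appends a period when a prompt lacks a clear end marker. Both the helper and its marker set are a heuristic sketched for this post, not part of the model's API:

```python
# Characters treated as an unambiguous "input ends here" signal for BLOOMZ.
STOP_MARKS = (".", "?", "!", ":", '"')

def terminate_prompt(prompt: str) -> str:
    """Append a period when the prompt lacks a clear stop marker."""
    prompt = prompt.rstrip()
    return prompt if prompt.endswith(STOP_MARKS) else prompt + "."
```

For example, `terminate_prompt("Translate to English: Je t'aime")` adds the missing period, while already well-terminated prompts pass through unchanged.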

 

It is also recommended to give the model as much context as possible.

For example, if you want it to answer in Telugu, tell the model:

"Explain in a sentence in Telugu what is backpropagation in neural networks."

 
