author    Vasilev Ruslan <53991623+artnitolog@users.noreply.github.com>  2022-06-27 11:13:01 +0300
committer GitHub <noreply@github.com>                                    2022-06-27 11:13:01 +0300
commit    c91b7d7fe8dbf39c9e307d6d324446d0df136a23 (patch)
tree      a7b47a2ced9385f75dd34fccc08fe943b4512f7d
parent    db9ecf1745616569338a20dca77d313710827a61 (diff)
Weights on HF git-lfs
-rw-r--r--  README.md | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
@@ -15,6 +15,7 @@ Make sure to have 200GB of free disk space before downloading weights. The model
 * Run `bash download/download.sh` to download model weights and vocabulary.
 * By default, weights will be downloaded to `./yalm100b_checkpoint/weights/`, and vocabulary will be downloaded to `./yalm100b_checkpoint/vocab/`.
+* As another option, you can [clone our HF repo](https://huggingface.co/yandex/yalm-100b/tree/main) and [pull the checkpoint](https://huggingface.co/yandex/yalm-100b/tree/main/yalm100b_checkpoint).
 
 ### Docker
@@ -38,7 +39,7 @@ The model is published under the Apache 2.0 license that permits both research a
 
 ### Dataset composition
 
-Dataset used for the training of YaLM-100B is comprised of the following parts (rough percentages are measured in tokens seen by the model)
+Dataset used for the training of YaLM-100B is comprised of the following parts (rough percentages are measured in tokens seen by the model):
 
 * **25%** [The Pile](https://pile.eleuther.ai/) — open English dataset by Eleuther AI team
@@ -66,4 +67,4 @@ Some subsets were traversed up to 3 times during the training.
 
 ### Training process
 
-Model was trained on a cluster of 800 A100 for ~65 days. In that time it consumed 300B tokens. You can see TensorBoard with LR and ramp up schedule, training metrics and our "thermometers" on the [HF page](https://huggingface.co/yandex/yalm-100b).
+Model was trained on a cluster of 800 A100 for ~65 days. In that time it consumed 300B tokens. You can see [TensorBoard](https://huggingface.co/yandex/yalm-100b/tensorboard) with LR and ramp up schedule, training metrics and our "thermometers" on the [HF page](https://huggingface.co/yandex/yalm-100b).