Update README.md

author: Vasilev Ruslan <53991623+artnitolog@users.noreply.github.com> 2022-06-23 10:35:06 +0300
committer: GitHub <noreply@github.com> 2022-06-23 10:35:06 +0300
commit: 06e5164d5de93e97d01912e9a388b849ac7b64c6 (patch)
tree: 9748c4185a1f42eeefd29e110d5867650ea7898c
parent: 09187f20241f195fe4089c3f887b81d8c8038dc9 (diff)
1 file changed, 5 insertions, 5 deletions
diff --git a/README.md b/README.md
index dfe6263..4a19ed0 100644
--- a/README.md
+++ b/README.md
@@ -5,22 +5,22 @@ The model leverages 100 billion parameters. It took 65 days to train the model o
 
 Training details and best practices on acceleration and stabilizations can be found on **[Medium](https://medium.com/p/d1df53d0e9a6)** (English) and **[Habr](https://habr.com/ru/company/yandex/blog/672396/)** (Russian) articles.
 
-# Setup
+## Setup
 
 Make sure to have 200GB of free disk space before downloading weights. The model *(code is based on [microsoft/DeepSpeedExamples/Megatron-LM-v1.1.5-ZeRO3](https://github.com/microsoft/DeepSpeedExamples/tree/068e6561188e9192104e014f70fbe25224b5eb62/Megatron-LM-v1.1.5-ZeRO3))* is supposed to run on multiple GPUs with tensor parallelism. It was tested on 4 (A100 80g) and 8 (V100 32g) GPUs, but is able to work with different configurations with ≈200GB of GPU memory in total which divide weight dimensions correctly (e.g. 16, 64, 128).
 
-## Downloading checkpoint
+### Downloading checkpoint
 
 * Run `bash download/download.sh` to download model weights and vocabulary.
 * By default, weights will be downloaded to `./yalm100b_checkpoint/weights/`, and vocabulary will be downloaded to `./yalm100b_checkpoint/vocab/`.
 
-## Docker
+### Docker
 
 * We [published](https://hub.docker.com/r/yandex/yalm-cuda11-ds) image on Docker Hub, it can be pulled with `docker/pull.sh`. It is compatible with A100 and V100.
 * Alternatively, you can build docker image from source using `docker/build.sh` (which will just build docker image from `docker/Dockerfile`).
 * To run container, use `docker/run.sh` *(volumes, name and other parameters can be changed)*.
 
-# Usage
+## Usage
 
 You can start with the following scripts:
 * `examples/generate_interactive.sh`: interactive generation from command line, the simplest way to try the model.
@@ -28,6 +28,6 @@ You can start with the following scripts:
 * `examples/generate_conditional_greedy.sh`: same as previous, but generation is greedy. Suitable for solving problems with few-shot.
 * `examples/generate_unconditional.sh`: unconditional generation. No input is used, output will be jsonlines.
 
-# License
+## License
 
 The model is published under the Apache 2.0 license that permits both research and commercial use, Megatron-LM is licensed under the [Megatron-LM license](megatron_lm/LICENSE).
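
Note on the Setup paragraph shown as context in the first hunk: it states that the model needs roughly 200GB of GPU memory in total and that the GPU count has to divide the weight dimensions. A minimal sketch of that arithmetic follows. The function names and the `required_gb` default are illustrative assumptions, not part of the repository, and the weight dimension passed to `splits_evenly` is a hypothetical value supplied by the caller.

```python
def fits_in_gpu_memory(num_gpus: int, gb_per_gpu: float, required_gb: float = 200.0) -> bool:
    """Return True if the combined GPU memory covers the approximate requirement.

    required_gb defaults to the ~200GB figure quoted in the README; it is a
    rough threshold, not an exact measurement.
    """
    return num_gpus * gb_per_gpu >= required_gb


def splits_evenly(dimension: int, num_gpus: int) -> bool:
    """Return True if a weight dimension can be sharded evenly across the GPUs.

    The dimension is supplied by the caller; the real model dimensions are not
    stated in the README, so none are hard-coded here.
    """
    return dimension % num_gpus == 0


if __name__ == "__main__":
    # The two configurations the README reports as tested.
    for num_gpus, gb_per_gpu in [(4, 80), (8, 32)]:
        total = num_gpus * gb_per_gpu
        ok = fits_in_gpu_memory(num_gpus, gb_per_gpu)
        print(f"{num_gpus} x {gb_per_gpu}GB = {total}GB total, enough memory: {ok}")
```

Both tested configurations pass the memory check (320GB and 256GB in total), while a setup such as 4 GPUs with 32GB each would not.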