Using Docker and other basic tools
Common command for starting a regular Docker container:
sudo docker run --name XXX --shm-size="200g" --privileged=true --device /dev/biren -it -d -v /xx:/xx --net=host xxximage /bin/bash
Common command for starting a Docker container in an NVIDIA environment:
docker run -it --gpus=all --ipc=host --network=host --name xxx -v /xx:/xx xximage /bin/bash
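The two docker run variants above differ only in a few flags, so they can be assembled from one helper. This is a minimal sketch, not a definitive wrapper: the function name build_docker_run and its parameters are hypothetical, and the image, container name, and mount paths are the same placeholders used in the notes.

```python
import shlex


def build_docker_run(image, name, mounts, nvidia=False, shm_size="200g"):
    """Return the argv list for `docker run` (hypothetical helper, not a Docker API)."""
    cmd = ["docker", "run", "-it", "--name", name, "--net=host"]
    if nvidia:
        # NVIDIA environment: expose all GPUs and share the host IPC namespace.
        cmd += ["--gpus=all", "--ipc=host"]
    else:
        # Regular variant from the notes: detached, large shared memory,
        # privileged, with the /dev/biren device passed through.
        cmd += ["-d", f"--shm-size={shm_size}", "--privileged=true",
                "--device", "/dev/biren"]
    for host_path, container_path in mounts:
        cmd += ["-v", f"{host_path}:{container_path}"]
    cmd += [image, "/bin/bash"]
    return cmd


# Placeholder values copied from the notes; replace them before running.
regular_cmd = build_docker_run("xxximage", "XXX", [("/xx", "/xx")])
nvidia_cmd = build_docker_run("xximage", "xxx", [("/xx", "/xx")], nvidia=True)
print(shlex.join(regular_cmd))
print(shlex.join(nvidia_cmd))
```

The list form can be passed directly to subprocess.run, which avoids shell quoting problems when a mount path contains spaces.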
Launching with Slurm:
ENROOT_ALLOW_SUPERUSER=yes \
ENROOT_ROOTFS_WRITABLE=yes \
PYTORCH_VERSION=1 \
MELLANOX_VISIBLE_DEVICES=all \
MELLANOX_MOUNT_DRIVER=1 \
SRUN_CHROOT=/PATH \
srun -w server$1 \
--container-env=ENROOT_ALLOW_SUPERUSER,ENROOT_ROOTFS_WRITABLE,PYTORCH_VERSION,MELLANOX_VISIBLE_DEVICES,MELLANOX_MOUNT_DRIVER \
--container-image=xxx \
--container-mounts=/xxx:/xxx \
--container-name='xxx' \
--pty bash
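Git: merge all commits of an MR into one (use git merge --squash followed by git commit to get a single commit)
GitHub: git fetch origin pull/188/head, then git merge FETCH_HEAD
GitLab: git fetch origin refs/merge-requests/22/head:pr22, then git merge pr22
Git: stop file-permission changes from showing up as modifications: git config core.filemode false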
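The GitHub and GitLab recipes above differ only in the ref that is fetched. The sketch below generates both command pairs from a request number. The function name fetch_and_merge_commands and the local branch naming pr<number> are illustrative assumptions; the ref layouts are the ones used in the notes.

```python
def fetch_and_merge_commands(host, number, remote="origin"):
    """Return the two git commands for fetching and merging a PR/MR (hypothetical helper)."""
    if host == "github":
        # GitHub exposes pull requests under pull/<n>/head; the fetched
        # commit is then available as FETCH_HEAD.
        return [f"git fetch {remote} pull/{number}/head",
                "git merge FETCH_HEAD"]
    if host == "gitlab":
        # GitLab exposes merge requests under refs/merge-requests/<n>/head;
        # here the ref is fetched into a local branch named pr<n>.
        branch = f"pr{number}"
        return [f"git fetch {remote} refs/merge-requests/{number}/head:{branch}",
                f"git merge {branch}"]
    raise ValueError(f"unsupported host: {host!r}")


for line in fetch_and_merge_commands("github", 188):
    print(line)
for line in fetch_and_merge_commands("gitlab", 22):
    print(line)
```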
DEBUG
import pdb; pdb.set_trace()
from remote_pdb import set_trace; set_trace()
gdb -p <PID>  (use the PID reported by os.getpid() in the target process)
import torch
def info_log(a, b, c):
    # Print only on rank 0 so the message appears once per job.
    if torch.distributed.get_rank() == 0:
        print(f"{a} -- {b} -- {c}")
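The rank-0 logging idea above can also be written without importing torch, which helps when the process group is not yet initialized. This is a minimal sketch under one assumption: the launcher (torchrun, for example) exports the RANK environment variable, and a missing RANK means a single-process run. The helper get_rank and the optional rank parameter are hypothetical additions.

```python
import os


def get_rank():
    # Assumption: the launcher exports RANK; fall back to 0 when it is absent.
    return int(os.environ.get("RANK", "0"))


def info_log(*fields, rank=None):
    """Print the fields joined by ' -- ' on rank 0 only.

    Returns the printed line, or None when the call was skipped.
    """
    rank = get_rank() if rank is None else rank
    if rank != 0:
        return None
    line = " -- ".join(str(field) for field in fields)
    print(line, flush=True)
    return line


# Prints only when this process is rank 0 (or RANK is unset).
info_log("step", 10, "loss=0.5")
```

Accepting any number of fields keeps call sites short, and flush=True makes the line appear promptly when stdout is redirected to a job log.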