llama.cpp for SYCL

Background
News
OS
Intel GPU
Docker
Linux
Windows
Environment Variable
Known Issue
Q&A
Todo

Background

SYCL is a higher-level programming model to improve programming productivity on various hardware accelerators—such as CPUs, GPUs, and FPGAs. It is a single-source embedded domain-specific language based on pure C++17.

oneAPI is a specification that is open and standards-based, supporting multiple architecture types including but not limited to GPU, CPU, and FPGA. The spec has both direct programming and API-based programming paradigms.

Intel uses SYCL as its direct programming language to support CPUs, GPUs, and FPGAs.

To avoid reinventing the wheel, this code follows other code paths in llama.cpp (such as OpenBLAS, cuBLAS, and CLBlast). We used the open-source tool SYCLomatic (commercial release: Intel® DPC++ Compatibility Tool) to migrate the code to SYCL.

llama.cpp for SYCL is used to support Intel GPUs.

For Intel CPUs, we recommend llama.cpp for x86 (built with Intel MKL).

News

2024.3
- A blog post is published: Run LLM on all Intel GPUs Using llama.cpp: intel.com or medium.com.
- New baseline is ready: tag b2437.
- Support multiple cards: --split-mode: [none|layer]; [row] is not supported yet and is under development.
- Support assigning the main GPU with --main-gpu, which replaces $GGML_SYCL_DEVICE.
- Support detecting all GPUs with level-zero and the same top Max compute units.
- Support OPs:
  - hardsigmoid
  - hardswish
  - pool2d
2024.1
- Create SYCL backend for Intel GPU.
- Support Windows build.

OS

Intel GPU

Verified

Note: If an iGPU has fewer than 80 EUs (Execution Units), the inference speed will be too slow to be usable.

Memory

Memory is the limiting factor when running LLMs on GPUs.

When llama.cpp runs, it prints a log line showing the memory allocated on the GPU, so you can see how much memory your case needs. For example: llm_load_tensors: buffer size = 3577.56 MiB.

For an iGPU, make sure enough host memory is available to share with the GPU. For llama-2-7b.Q4_0, we recommend 8 GB or more of host memory.

For a dGPU, make sure the device memory is sufficient. For llama-2-7b.Q4_0, we recommend 4 GB or more of device memory.
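As a rough aid for the memory check described above, the sketch below sums the "buffer size = N MiB" values in a saved log and compares a size in MiB with a device memory size in bytes. The names sum_buffer_mib and fits_in_bytes are hypothetical helpers, not part of llama.cpp, and the log format is assumed to match the example line shown above.

```sh
# Sum every "buffer size = N MiB" value found in a log read from stdin.
# Assumes the log line format shown in the example above.
sum_buffer_mib() {
    awk '/buffer size *=/ {
             for (i = 1; i <= NF; i++)
                 if ($i == "=" && $(i + 2) == "MiB") total += $(i + 1)
         }
         END { printf "%.2f\n", total }'
}

# Succeed when a size in MiB fits into a memory size given in bytes.
fits_in_bytes() {
    awk -v mib="$1" -v bytes="$2" 'BEGIN { exit !(mib * 1048576 <= bytes) }'
}
```

For instance, `sum_buffer_mib < run.log` prints the total in MiB, where run.log is a placeholder name for a saved log, and `fits_in_bytes 3577.56 16225243136` succeeds because 3577.56 MiB is smaller than 16225243136 bytes.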

Nvidia GPU

Verified

|Nvidia GPU| Status | Verified Model|
|-|-|-|
|Ampere Series| Support| A100|

oneMKL for CUDA

The current oneMKL release does not contain the oneMKL cuBLAS backend. As a result, for Nvidia GPUs, oneMKL must be built from source.

git clone https://github.com/oneapi-src/oneMKL
cd oneMKL
mkdir build
cd build
cmake -G Ninja .. -DCMAKE_CXX_COMPILER=icpx -DCMAKE_C_COMPILER=icx -DENABLE_MKLGPU_BACKEND=OFF -DENABLE_MKLCPU_BACKEND=OFF -DENABLE_CUBLAS_BACKEND=ON
ninja
# Add paths as necessary

Docker

Note:

Only Docker on Linux is tested. Docker on WSL may not work.
You may need to install the Intel GPU driver on the host machine (see the Linux section for how to do that).

Build the image

You can choose between F16 and F32 build. F16 is faster for long-prompt inference.

# For F16:
#docker build -t llama-cpp-sycl --build-arg="LLAMA_SYCL_F16=ON" -f .devops/main-intel.Dockerfile .

# Or, for F32:
docker build -t llama-cpp-sycl -f .devops/main-intel.Dockerfile .

# Note: you can also use the ".devops/server-intel.Dockerfile", which compiles the "server" example

Run

# Firstly, find all the DRI cards:
ls -la /dev/dri
# Then, pick the card that you want to use.

# For example with "/dev/dri/card1"
docker run -it --rm -v "$(pwd):/app:Z" --device /dev/dri/renderD128:/dev/dri/renderD128 --device /dev/dri/card1:/dev/dri/card1 llama-cpp-sycl -m "/app/models/YOUR_MODEL_FILE" -p "Building a website can be done in 10 simple steps:" -n 400 -e -ngl 33
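To avoid typing the --device options by hand, a small helper can generate them from the DRI nodes that exist on the host. This is only a sketch: dri_device_args is a hypothetical function name, and it passes every card* and renderD* node it finds, which is broader than picking a single card as described above.

```sh
# Print one "--device PATH:PATH" argument per DRI node found in a directory
# (default: /dev/dri). Only nodes named card* and renderD* are included.
dri_device_args() {
    dir="${1:-/dev/dri}"
    for node in "$dir"/card* "$dir"/renderD*; do
        [ -e "$node" ] || continue   # skip glob patterns that matched nothing
        printf '%s %s:%s\n' --device "$node" "$node"
    done
}
```

The printed lines can be substituted for the two --device options in the docker run command above. Trim the list if only one card should be visible inside the container.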

Linux

Setup Environment

Install Intel GPU driver.

a. Install the Intel GPU driver by following the official guide: Install GPU Drivers.

Note: for an iGPU, install the client GPU driver.

b. Add the user to the video and render groups.

sudo usermod -aG render username
sudo usermod -aG video username

Note: log in again for the change to take effect.
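After logging in again, the group membership can be checked with a short helper. This is a minimal sketch; missing_gpu_groups is a hypothetical name, and it only inspects the group list passed to it.

```sh
# Print each required group (render, video) that is absent from a
# space-separated group list such as the output of `id -nG`.
missing_gpu_groups() {
    for group in render video; do
        case " $1 " in
            *" $group "*) ;;                  # group is present
            *) printf '%s\n' "$group" ;;      # group is missing
        esac
    done
}
```

Calling `missing_gpu_groups "$(id -nG)"` prints nothing when the current user already belongs to both groups.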

c. Check

sudo apt install clinfo
sudo clinfo -l

Output (example):

Platform #0: Intel(R) OpenCL Graphics
 `-- Device #0: Intel(R) Arc(TM) A770 Graphics


Platform #0: Intel(R) OpenCL HD Graphics
 `-- Device #0: Intel(R) Iris(R) Xe Graphics [0x9a49]

Install Intel® oneAPI Base toolkit.

a. Follow the procedure in Get the Intel® oneAPI Base Toolkit.

We recommend installing to the default folder: /opt/intel/oneapi.

The rest of this guide uses the default folder as an example. If you install to another folder, adjust the paths in this guide accordingly.

b. Check

source /opt/intel/oneapi/setvars.sh

sycl-ls

There should be one or more level-zero devices. Please confirm that at least one GPU is present, like [ext_oneapi_level_zero:gpu:0].

Output (example):

[opencl:acc:0] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device OpenCL 1.2  [2023.16.10.0.17_160000]
[opencl:cpu:1] Intel(R) OpenCL, 13th Gen Intel(R) Core(TM) i7-13700K OpenCL 3.0 (Build 0) [2023.16.10.0.17_160000]
[opencl:gpu:2] Intel(R) OpenCL Graphics, Intel(R) Arc(TM) A770 Graphics OpenCL 3.0 NEO  [23.30.26918.50]
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) Arc(TM) A770 Graphics 1.3 [1.3.26918]
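The check for a level-zero GPU can also be scripted. The sketch below counts the matching entries in sycl-ls output; count_level_zero_gpus is a hypothetical helper, and it assumes the line prefix shown in the example output above.

```sh
# Count the Level Zero GPU entries in `sycl-ls` output read from stdin.
# Prints 0 (and returns a non-zero status) when none are present.
count_level_zero_gpus() {
    grep -c '^\[ext_oneapi_level_zero:gpu:[0-9]*]'
}
```

With the oneAPI environment enabled, `sycl-ls | count_level_zero_gpus` should print 1 or more.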

Build locally:

Note:

You can choose between F16 and F32 build. F16 is faster for long-prompt inference.

By default, all binaries are built, which takes more time. To reduce build time, we recommend building example/main only.

mkdir -p build
cd build
source /opt/intel/oneapi/setvars.sh

# For FP16:
#cmake .. -DLLAMA_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DLLAMA_SYCL_F16=ON

# Or, for FP32:
cmake .. -DLLAMA_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx

# For Nvidia GPUs
cmake .. -DLLAMA_SYCL=ON -DLLAMA_SYCL_TARGET=NVIDIA -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx

# Build example/main only
#cmake --build . --config Release --target main

# Or, build all binaries
cmake --build . --config Release -v

cd ..

Or build with the script:

./examples/sycl/build.sh
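The three configure variants above differ only in one extra flag, so they can be selected by name. The helper below is a sketch: sycl_cmake_flags and its variant names (fp32, fp16, nvidia) are hypothetical, while the flags themselves are the ones listed above.

```sh
# Print the CMake configure flags for a build variant: fp32, fp16 or nvidia.
sycl_cmake_flags() {
    base="-DLLAMA_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx"
    case "$1" in
        fp32)   printf '%s\n' "$base" ;;
        fp16)   printf '%s\n' "$base -DLLAMA_SYCL_F16=ON" ;;
        nvidia) printf '%s\n' "$base -DLLAMA_SYCL_TARGET=NVIDIA" ;;
        *)      echo "unknown variant: $1" >&2; return 1 ;;
    esac
}
```

Inside the build folder, `cmake .. $(sycl_cmake_flags fp16)` then configures the FP16 build. The command substitution is left unquoted on purpose so that the flags are split into separate arguments.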

Run

Put the model file in the models folder

You can download llama-2-7b.Q4_0.gguf as an example.

Enable oneAPI running environment
```
source /opt/intel/oneapi/setvars.sh
```
List device ID

Run without parameters:

./build/bin/ls-sycl-device

# or run the "main" executable and check the output log:

./build/bin/main

Check the ID in the startup log, for example:

found 6 SYCL devices:
|  |                  |                                             |Compute   |Max compute|Max work|Max sub|               |
|ID|       Device Type|                                         Name|capability|units      |group   |group  |Global mem size|
|--|------------------|---------------------------------------------|----------|-----------|--------|-------|---------------|
| 0|[level_zero:gpu:0]|               Intel(R) Arc(TM) A770 Graphics|       1.3|        512|    1024|     32|    16225243136|
| 1|[level_zero:gpu:1]|                    Intel(R) UHD Graphics 770|       1.3|         32|     512|     32|    53651849216|
| 2|    [opencl:gpu:0]|               Intel(R) Arc(TM) A770 Graphics|       3.0|        512|    1024|     32|    16225243136|
| 3|    [opencl:gpu:1]|                    Intel(R) UHD Graphics 770|       3.0|         32|     512|     32|    53651849216|
| 4|    [opencl:cpu:0]|         13th Gen Intel(R) Core(TM) i7-13700K|       3.0|         24|    8192|     64|    67064815616|
| 5|    [opencl:acc:0]|               Intel(R) FPGA Emulation Device|       1.2|         24|67108864|     64|    67064815616|

Device selection and execution of llama.cpp

There are two device selection modes:

Single device: Use one device assigned by the user.
Multiple devices: Automatically choose the devices that share the highest Max compute units value.
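To preview which devices the multiple-device mode is likely to pick, the rule can be approximated on the device table printed at startup. This is a sketch of the rule as described here, not the backend's own code: top_level_zero_ids is a hypothetical helper, and it assumes the column layout of the table shown above.

```sh
# Read the "found N SYCL devices" table from stdin and print the IDs of the
# level_zero GPU rows that share the highest "Max compute units" value.
top_level_zero_ids() {
    awk -F'|' '
        BEGIN { best = -1 }
        $3 ~ /level_zero:gpu/ {
            id = $2 + 0        # column 2: device ID
            units = $6 + 0     # column 6: Max compute units
            if (units > best)       { best = units; ids = id "" }
            else if (units == best) { ids = ids " " id }
        }
        END { if (ids != "") print ids }'
}
```

Fed the example table above, it prints 0, because only the Arc A770 level_zero entry has 512 compute units.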

Examples:

Use device 0:

ZES_ENABLE_SYSMAN=1 ./build/bin/main -m models/llama-2-7b.Q4_0.gguf -p "Building a website can be done in 10 simple steps:" -n 400 -e -ngl 33 -sm none -mg 0

or run by script:

./examples/sycl/run_llama2.sh 0

Use multiple devices:

ZES_ENABLE_SYSMAN=1 ./build/bin/main -m models/llama-2-7b.Q4_0.gguf -p "Building a website can be done in 10 simple steps:" -n 400 -e -ngl 33 -sm layer

or run by script:

./examples/sycl/run_llama2.sh

Note:

By default, mmap is used to read the model file. In some cases, this causes a hang. We recommend using the --no-mmap parameter to disable mmap() and avoid this issue.

Verify the device ID in output

Verify that the selected GPU is shown in the output, for example:

detect 1 SYCL GPUs: [0] with top Max compute units:512

use 1 SYCL GPUs: [0] with Max compute units:512
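This verification can be scripted against a saved log. The sketch below extracts the ID list from the "use N SYCL GPUs" line; used_sycl_gpus is a hypothetical helper, and it assumes the line format shown above.

```sh
# Print the device ID list from the "use N SYCL GPUs: [...]" line of a log
# read from stdin. Prints nothing when no such line is present.
used_sycl_gpus() {
    sed -n 's/^use [0-9]* SYCL GPUs: \[\(.*\)] with.*/\1/p'
}
```

For the example output above it prints 0, which should match the ID passed with -mg.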

Windows

Setup Environment

Install Intel GPU driver.

Install the Intel GPU driver by following the official guide: Install GPU Drivers.

Note: The driver is mandatory for compute functions.

Install Visual Studio.

Install Visual Studio, which is required to enable the oneAPI environment on Windows.

Install Intel® oneAPI Base toolkit.

a. Follow the procedure in Get the Intel® oneAPI Base Toolkit.

We recommend installing to the default folder: C:\Program Files (x86)\Intel\oneAPI.

The rest of this guide uses the default folder as an example. If you install to another folder, adjust the paths in this guide accordingly.

b. Enable the oneAPI running environment in one of two ways:

From Search: type 'oneAPI', then open "Intel oneAPI command prompt for Intel 64 for Visual Studio 2022".

From Run: open CMD and run:

"C:\Program Files (x86)\Intel\oneAPI\setvars.bat" intel64

c. Check GPU

In oneAPI command line:

sycl-ls

There should be one or more level-zero devices. Please confirm that at least one GPU is present, like [ext_oneapi_level_zero:gpu:0].

Output (example):

[opencl:acc:0] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device OpenCL 1.2  [2023.16.10.0.17_160000]
[opencl:cpu:1] Intel(R) OpenCL, 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz OpenCL 3.0 (Build 0) [2023.16.10.0.17_160000]
[opencl:gpu:2] Intel(R) OpenCL Graphics, Intel(R) Iris(R) Xe Graphics OpenCL 3.0 NEO  [31.0.101.5186]
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) Iris(R) Xe Graphics 1.3 [1.3.28044]

Install cmake & make

a. Download & install cmake for Windows: https://cmake.org/download/

b. Download & install mingw-w64 make for Windows provided by w64devkit

Download the 1.19.0 version of w64devkit.
Extract w64devkit on your PC.
Add the bin folder path to the Windows system PATH environment variable, for example C:\xxx\w64devkit\bin\.

Build locally:

In oneAPI command line window:

if not exist build mkdir build
cd build
@call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat" intel64 --force

::  for FP16
::  faster for long-prompt inference
::  cmake -G "MinGW Makefiles" ..  -DLLAMA_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icx  -DCMAKE_BUILD_TYPE=Release -DLLAMA_SYCL_F16=ON

::  for FP32
cmake -G "MinGW Makefiles" ..  -DLLAMA_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icx  -DCMAKE_BUILD_TYPE=Release


::  build example/main only
::  make main

::  build all binaries
make -j
cd ..

Or build with the script:

.\examples\sycl\win-build-sycl.bat

Note:

By default, all binaries are built, which takes more time. To reduce build time, we recommend building example/main only.

Run

Put the model file in the models folder

You can download llama-2-7b.Q4_0.gguf as an example.

Enable oneAPI running environment

Use one of two ways:

From Search: type 'oneAPI', then open "Intel oneAPI command prompt for Intel 64 for Visual Studio 2022".

From Run: open CMD and run:

"C:\Program Files (x86)\Intel\oneAPI\setvars.bat" intel64

List device ID

Run without parameters:

build\bin\ls-sycl-device.exe

or

build\bin\main.exe

Check the ID in the startup log, for example:

found 6 SYCL devices:
|  |                  |                                             |Compute   |Max compute|Max work|Max sub|               |
|ID|       Device Type|                                         Name|capability|units      |group   |group  |Global mem size|
|--|------------------|---------------------------------------------|----------|-----------|--------|-------|---------------|
| 0|[level_zero:gpu:0]|               Intel(R) Arc(TM) A770 Graphics|       1.3|        512|    1024|     32|    16225243136|
| 1|[level_zero:gpu:1]|                    Intel(R) UHD Graphics 770|       1.3|         32|     512|     32|    53651849216|
| 2|    [opencl:gpu:0]|               Intel(R) Arc(TM) A770 Graphics|       3.0|        512|    1024|     32|    16225243136|
| 3|    [opencl:gpu:1]|                    Intel(R) UHD Graphics 770|       3.0|         32|     512|     32|    53651849216|
| 4|    [opencl:cpu:0]|         13th Gen Intel(R) Core(TM) i7-13700K|       3.0|         24|    8192|     64|    67064815616|
| 5|    [opencl:acc:0]|               Intel(R) FPGA Emulation Device|       1.2|         24|67108864|     64|    67064815616|

Device selection and execution of llama.cpp

There are two device selection modes:

Single device: Use one device assigned by the user.
Multiple devices: Automatically choose the devices that share the highest Max compute units value.

Examples:

Use device 0:

build\bin\main.exe -m models\llama-2-7b.Q4_0.gguf -p "Building a website can be done in 10 simple steps:\nStep 1:" -n 400 -e -ngl 33 -s 0 -sm none -mg 0

Use multiple devices:

build\bin\main.exe -m models\llama-2-7b.Q4_0.gguf -p "Building a website can be done in 10 simple steps:\nStep 1:" -n 400 -e -ngl 33 -s 0 -sm layer

or run by script:

.\examples\sycl\win-run-llama2.bat

Note:

By default, mmap is used to read the model file. In some cases, this causes a hang. We recommend using the --no-mmap parameter to disable mmap() and avoid this issue.

Verify the device ID in output

Verify that the selected GPU is shown in the output, for example:

detect 1 SYCL GPUs: [0] with top Max compute units:512

use 1 SYCL GPUs: [0] with Max compute units:512

Environment Variable

Build

Running

Known Issue

Hang during startup

llama.cpp uses mmap by default to read the model file and copy it to the GPU. On some systems, the memcpy behaves abnormally and blocks.

Solution: add --no-mmap or --mmap 0.

Split-mode: [row] is not supported

It is under development.

Q&A

Note: please add the prefix [SYCL] to the issue title so that we can check it as soon as possible.

Error: error while loading shared libraries: libsycl.so.7: cannot open shared object file: No such file or directory.

The oneAPI running environment is not enabled.

Install the oneAPI Base Toolkit and enable it with: source /opt/intel/oneapi/setvars.sh.
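Before launching the binary, the same problem can be spotted from ldd output. This is a minimal sketch; missing_shared_libs is a hypothetical helper that relies on the "=> not found" marker that ldd prints for unresolved libraries.

```sh
# Print the names of shared libraries that `ldd` output (read from stdin)
# reports as "not found".
missing_shared_libs() {
    awk '/=> not found/ { print $1 }'
}
```

Running `ldd ./build/bin/main | missing_shared_libs` prints nothing once the oneAPI environment is enabled in the current shell. If libsycl.so.7 is listed, source setvars.sh and try again.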

On Windows, the program produces no result and no error.

The oneAPI running environment is not enabled.

A compile error occurs.

Remove the build folder and try again.

I cannot see [ext_oneapi_level_zero:gpu:0] after installing the GPU driver on Linux.

Run sudo sycl-ls.

If the device appears in the result, add your user to the video and render groups:

  sudo usermod -aG render username
  sudo usermod -aG video username

Then log in again.

If it does not appear, check the GPU installation steps again.

Todo

Support row split mode for multiple-card runs.
