Model Configuration YAML
Full example of a model configuration YAML file
The YAML example below shows every supported configuration option for creating a model card.
workspace: "./src"
entry_point: "main_entry.py"
bootstrap: |
echo "Bootstrap start..."
sh ./config/bootstrap.sh
echo "Bootstrap finished"
inference_image_name: "fedml/fedml-default-inference-backend"
enable_custom_image: false
image_pull_policy: "IfNotPresent"
docker_registry_user_name: fedml
docker_registry_user_password: passwd
docker_registry: fedml-official
entry_cmd: tritonserver --model-repository=/model
port_inside_docker: 8000
server_external_port: 2204
server_internal_port: 2203
use_gpu: true
use_triton: true
request_input_example: '{"text": "Hello"}'
authentication_token: "myPrivateToken"
data_cache_dir: "~/data_cache"
environment_variables:
TOP_K: "5"
PROMPT_STYLE: "llama_orca"
deploy_timeout: 600
auto_detect_public_ip: true
computing:
minimum_num_gpus: 1 # minimum # of GPUs to provision
maximum_cost_per_hour: $3000 # max cost per hour for your job per gpu card
resource_type: A100-80G # e.g., A100-80G,
#allow_cross_cloud_resources: true # true, false
#device_type: CPU # options: GPU, CPU, hybrid
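Only `workspace` and `entry_point` are required; every other field falls back to the defaults listed in the table below. As a starting point, a minimal model card can be as small as the following sketch (the directory and file names are placeholders for illustration):

```yaml
# Minimal model card: only the required fields, plus an input example.
# "./src" and "main_entry.py" are placeholder names.
workspace: "./src"
entry_point: "main_entry.py"
request_input_example: '{"text": "Hello"}'
```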
Detailed specification
Name | Default | Description |
---|---|---|
workspace | | Directory where your source code is located. [required]
entry_point | | Entry point file name. [required]
bootstrap | "" | Shell commands to install dependencies during the setup stage.
inference_image_name | "fedml/fedml-default-inference-backend" | The base image for the inference container.
enable_custom_image | false | Set this to true if you use an image other than the official FedML images listed in Advanced Features (see the custom-image sketch after this table).
image_pull_policy | "IfNotPresent" | Whether to pull the image (name:tag) again when deploying or updating an endpoint. Either "IfNotPresent" or "Always".
docker_registry_user_name / password | None | Username and password for your Docker registry.
entry_cmd | None | If you use your own image, the entry command(s) for that container.
port_inside_docker | 2345 | By default, container port 2345 is mounted to a port on the host machine. If your server listens on a different port inside the container, indicate it here.
worker_port | random | By default, a randomly chosen port on the host machine is opened and mounted to the port inside the container. Since a random port might conflict with your firewall policy, you can indicate an accessible one here (see the port sketch after this table).
use_gpu | true | Enable GPUs for inference. This only applies to local on-premise mode; for GPU cloud mode, specify resources under computing.
use_triton | false | Set to true if your image purely provides the Triton Inference Server service.
request_input_example | "" | An input example for the inference endpoint, shown on the UI for reference.
authentication_token | Randomly generated by the MLOps backend | The authentication token passed as a parameter in the inference curl command.
data_cache_dir | "" | For on-premise mode, a folder that will not be packaged into the model card; instead, the worker reads it from the host machine.
environment_variables | None | Environment variables that can be read in the entry_point file.
deploy_timeout | 900 | Maximum waiting time, in seconds, for the endpoint to be established.
auto_detect_public_ip | false | For on-premise mode, automatically detect the public IPs of the master and worker nodes.
computing | None | For GPU cloud mode, indicate the resources you need for inference. You can visit https://TensorOpera.ai/compute/ to check the available resource types.
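As referenced above, the sketch below shows how the custom-image options fit together when you serve from your own container instead of the default FedML inference backend. The image name, registry, credentials, entry command, and port are placeholder assumptions; substitute your own values:

```yaml
# Sketch: deploying a custom image (all names below are placeholders).
workspace: "./src"
entry_point: "main_entry.py"
enable_custom_image: true                          # required for non-FedML images
inference_image_name: "myorg/my-llm-server:latest" # your own image
image_pull_policy: "Always"                        # re-pull the tag on every deploy/update
docker_registry: my-private-registry
docker_registry_user_name: myuser
docker_registry_user_password: mypasswd
entry_cmd: python3 /app/server.py                  # entry command inside your container
port_inside_docker: 8080                           # the port your server listens on
```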
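Likewise, for an on-premise deployment behind a restrictive firewall, here is a sketch that pins the host and container ports instead of relying on a random mapping; the port numbers are arbitrary examples:

```yaml
# Sketch: fixing ports for on-premise mode (example values only).
workspace: "./src"
entry_point: "main_entry.py"
use_gpu: true                 # local on-premise mode; GPU cloud uses `computing` instead
worker_port: 30500            # host port to open instead of a random one
port_inside_docker: 2345      # container port mounted to the host port
auto_detect_public_ip: true   # let the master and workers detect their public IPs
```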