21 MLHub Configuration File

An MLHub YAML configuration file is all that is required to turn a git-based repository into quickly accessible, pre-built machine learning models that are ready to run, explore, rebuild, and even deploy. Git hosting services such as GitHub, GitLab, and Bitbucket are where many researchers today publish their algorithms.

Because MLHub installs directly from git repositories, a package developer need only add an MLHUB.yaml file and the appropriate Python or R scripts (typically within an mlhub folder) to their git repository. The repository is then an MLHub package. The scripts are generally short wrappers around the package functionality that accept command line arguments.
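As a rough illustration of such a wrapper script, the following minimal Python sketch exposes one function through command line arguments. The function predict(), the command name demo, and the --digits option are all hypothetical and stand in for whatever the real package provides.

```python
import argparse


def predict(values):
    """Hypothetical stand-in for the package's real model: returns the mean."""
    return sum(values) / len(values) if values else 0.0


def main(argv=None):
    # Parse the command line; argv=None means "use the real command line".
    parser = argparse.ArgumentParser(
        prog="demo",  # hypothetical command name
        description="Apply the stand-in model to numbers given on the command line.",
    )
    parser.add_argument("values", nargs="*", type=float, help="input values")
    parser.add_argument("--digits", type=int, default=2, help="decimal places to print")
    args = parser.parse_args(argv)

    # Delegate the real work to the wrapped function and report the result.
    result = round(predict(args.values), args.digits)
    print(result)
    return result


if __name__ == "__main__":
    main()
```

Keeping the argument parsing in main() and the work in a separate function means the wrapper stays short and the underlying functionality can still be imported and tested on its own.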

A single git repository may contain multiple MLHub packages, each described by its own YAML file, which can have any name:

$ ml install gjwgit/recommenders:mlhub/sar.yaml
$ ml readme sar
$ ...
$ ml install gjwgit/recommenders:mlhub/rbm.yaml
$ ml readme rbm
$ ...
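The install arguments above appear to follow the shape owner/repository:path-to-yaml. The Python sketch below splits an argument of that shape into its parts. It is only a reading of the examples shown here, not MLHub's own parser, and the names PackageSpec and parse_spec are hypothetical. Other forms that ml install may accept are not handled.

```python
from typing import NamedTuple, Optional


class PackageSpec(NamedTuple):
    owner: str
    repo: str
    yaml_path: Optional[str]  # None when no explicit YAML file is given


def parse_spec(spec: str) -> PackageSpec:
    """Split 'owner/repo[:path/to/file.yaml]' into its parts (assumed format)."""
    # Everything after the first ':' is taken as the YAML path inside the repository.
    location, _, yaml_path = spec.partition(":")
    # The part before ':' is expected to be 'owner/repo'.
    owner, sep, repo = location.partition("/")
    if not (owner and sep and repo):
        raise ValueError(f"expected 'owner/repo[:path]', got {spec!r}")
    return PackageSpec(owner, repo, yaml_path or None)
```

For example, parse_spec("gjwgit/recommenders:mlhub/sar.yaml") yields the owner gjwgit, the repository recommenders, and the YAML path mlhub/sar.yaml.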

Each configuration file identifies the other files required specifically for its package.

When MLHub installs a package it downloads the git repository as a zip file.
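To make the zip download concrete, here is a small Python sketch under stated assumptions: the repository is hosted on GitHub, the archive comes from GitHub's /archive/REF.zip URL, and the branch is named master. None of this is taken from MLHub's source, and the helper names archive_url and list_files are hypothetical. The sketch performs no network access; it only builds the URL and shows how the files inside a downloaded archive could be listed.

```python
import io
import zipfile


def archive_url(owner: str, repo: str, ref: str = "master") -> str:
    """Build a GitHub zip-archive URL; the default ref is an assumption."""
    return f"https://github.com/{owner}/{repo}/archive/{ref}.zip"


def list_files(zip_bytes: bytes) -> list:
    """Return archive member paths with the top-level folder stripped.

    GitHub archives wrap the repository in a single 'repo-ref/' folder,
    so that first path component is dropped here.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        # Skip directory entries, which end with '/'.
        names = [n for n in archive.namelist() if not n.endswith("/")]
    return sorted(n.split("/", 1)[1] for n in names if "/" in n)
```

With the file list in hand, an installer could keep only the files that the chosen configuration file names and ignore the rest of the repository.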

Copyright © 1995-2021 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0.