tooluniverse.embedding_sync module

Embedding Sync Tool for ToolUniverse

Synchronize embedding databases with HuggingFace Hub for sharing and collaboration. Supports uploading local databases to HuggingFace and downloading databases from HuggingFace.
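The two directions described above (upload and download) might look roughly like the sketch below. It is not taken from this module's source: the function names `upload_database` and `download_database`, the use of a dataset repo, and the one-file-per-database layout are assumptions made for illustration. Only `HfApi.create_repo`, `HfApi.upload_file` and `hf_hub_download` are actual huggingface_hub calls, and they are imported lazily so the definitions load even where the library is not installed.

```python
from pathlib import Path


def upload_database(local_db: Path, repo_id: str, private: bool = True) -> None:
    """Upload one local database file to a dataset repo (hypothetical helper)."""
    from huggingface_hub import HfApi  # lazy import; needs huggingface_hub installed

    api = HfApi()
    # Create the repo if it does not exist yet; exist_ok avoids an error on re-runs.
    api.create_repo(repo_id, repo_type="dataset", private=private, exist_ok=True)
    api.upload_file(
        path_or_fileobj=str(local_db),
        path_in_repo=local_db.name,  # assumption: store the file at the repo root
        repo_id=repo_id,
        repo_type="dataset",
    )


def download_database(repo_id: str, filename: str, target_dir: Path) -> Path:
    """Download one database file from a dataset repo (hypothetical helper)."""
    from huggingface_hub import hf_hub_download  # lazy import

    local_path = hf_hub_download(
        repo_id=repo_id,
        filename=filename,
        repo_type="dataset",
        local_dir=str(target_dir),
    )
    return Path(local_path)
```

Both helpers rely on the locally saved token for authentication, which is the default described for `HfApi` further down.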

class tooluniverse.embedding_sync.Path(*args, **kwargs)[source][source]

Bases: PurePath

PurePath subclass that can make system calls.

Path represents a filesystem path but unlike PurePath, also offers methods to do system calls on path objects. Depending on your system, instantiating a Path will return either a PosixPath or a WindowsPath object. You can also instantiate a PosixPath or WindowsPath directly, but cannot instantiate a WindowsPath on a POSIX system or vice versa.

classmethod cwd()[source][source]

Return a new path pointing to the current working directory (as returned by os.getcwd()).

classmethod home()[source][source]

Return a new path pointing to the user’s home directory (as returned by os.path.expanduser(‘~’)).

samefile(other_path)[source][source]

Return whether other_path is the same or not as this file (as returned by os.path.samefile()).

iterdir()[source][source]

Iterate over the files in this directory. Does not yield any result for the special paths ‘.’ and ‘..’.

glob(pattern)[source][source]

Iterate over this subtree and yield all existing files (of any kind, including directories) matching the given relative pattern.

rglob(pattern)[source][source]

Recursively yield all existing files (of any kind, including directories) matching the given relative pattern, anywhere in this subtree.

absolute()[source][source]

Return an absolute version of this path. This function works even if the path doesn’t point to anything.

No normalization is done, i.e. all ‘.’ and ‘..’ will be kept along. Use resolve() to get the canonical path to a file.

resolve(strict=False)[source][source]

Make the path absolute, resolving all symlinks on the way and also normalizing it (for example turning slashes into backslashes under Windows).
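The difference between absolute() and resolve() is easiest to see on a path that contains a ‘..’ component. The sketch below uses a made-up relative path (`data/../embeddings.db`) that does not need to exist, since resolve() defaults to strict=False.

```python
from pathlib import Path

# A relative path with a '..' component; the file itself need not exist.
relative = Path("data") / ".." / "embeddings.db"

# absolute() only anchors the path at the current directory; '..' is kept.
absolute_path = relative.absolute()

# resolve() also normalizes the path, so the '..' component disappears.
resolved_path = relative.resolve()

print(absolute_path)
print(resolved_path)
```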

stat(*, follow_symlinks=True)[source][source]

Return the result of the stat() system call on this path, like os.stat() does.

owner()[source][source]

Return the login name of the file owner.

group()[source][source]

Return the group name of the file gid.

open(mode='r', buffering=-1, encoding=None, errors=None, newline=None)[source][source]

Open the file pointed by this path and return a file object, as the built-in open() function does.

read_bytes()[source][source]

Open the file in bytes mode, read it, and close the file.

read_text(encoding=None, errors=None)[source][source]

Open the file in text mode, read it, and close the file.

write_bytes(data)[source][source]

Open the file in bytes mode, write to it, and close the file.

write_text(data, encoding=None, errors=None, newline=None)[source][source]

Open the file in text mode, write to it, and close the file.

readlink()

Return the path to which the symbolic link points.

touch(mode=438, exist_ok=True)[source][source]

Create this file with the given access mode, if it doesn’t exist.

mkdir(mode=511, parents=False, exist_ok=False)[source][source]

Create a new directory at this given path.
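The default modes shown in the touch() and mkdir() signatures are decimal renderings of the usual octal permission values: 438 is 0o666 and 511 is 0o777. The sketch below checks that and shows a typical mkdir/touch sequence inside a temporary directory; the directory and file names are illustrative only.

```python
import tempfile
from pathlib import Path

# The decimal defaults in the signatures are the familiar octal modes.
default_touch_mode = oct(438)  # '0o666'
default_mkdir_mode = oct(511)  # '0o777'

with tempfile.TemporaryDirectory() as tmp:
    nested = Path(tmp) / "embeddings" / "v1"

    # parents=True creates missing parent directories;
    # exist_ok=True makes a repeated call a no-op instead of an error.
    nested.mkdir(parents=True, exist_ok=True)
    nested.mkdir(parents=True, exist_ok=True)

    # touch() creates an empty file if it does not exist yet.
    marker = nested / "index.db"
    marker.touch()

    created = nested.is_dir() and marker.is_file()
    found = sorted(p.name for p in Path(tmp).rglob("*.db"))
```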

chmod(mode, *, follow_symlinks=True)[source][source]

Change the permissions of the path, like os.chmod().

lchmod(mode)[source][source]

Like chmod(), except if the path points to a symlink, the symlink’s permissions are changed, rather than its target’s.

unlink(missing_ok=False)

Remove this file or link. If the path is a directory, use rmdir() instead.

rmdir()[source][source]

Remove this directory. The directory must be empty.

lstat()[source][source]

Like stat(), except if the path points to a symlink, the symlink’s status information is returned, rather than its target’s.

rename(target)[source][source]

Rename this path to the target path.

The target path may be absolute or relative. Relative paths are interpreted relative to the current working directory, not the directory of the Path object.

Returns the new Path instance pointing to the target path.

replace(target)[source][source]

Rename this path to the target path, overwriting if that path exists.

The target path may be absolute or relative. Relative paths are interpreted relative to the current working directory, not the directory of the Path object.

Returns the new Path instance pointing to the target path.

symlink_to(target, target_is_directory=False)

Make this path a symlink pointing to the target path. Note the order of arguments (link, target) is the reverse of os.symlink.

hardlink_to(target)

Make this path a hard link pointing to the same file as target.

Note the order of arguments (self, target) is the reverse of os.link’s.

link_to(target)

Make the target path a hard link pointing to this path.

Note this function does not make this path a hard link to target, despite the implication of the function and argument names. The order of arguments (target, link) is the reverse of Path.symlink_to, but matches that of os.link.

Deprecated since Python 3.10 and scheduled for removal in Python 3.12. Use hardlink_to() instead.

exists()[source][source]

Whether this path exists.

is_dir()[source][source]

Whether this path is a directory.

is_file()[source][source]

Whether this path is a regular file (also True for symlinks pointing to regular files).

is_mount()[source][source]

Check if this path is a POSIX mount point.

is_symlink()

Whether this path is a symbolic link.

is_block_device()[source][source]

Whether this path is a block device.

is_char_device()[source][source]

Whether this path is a character device.

is_fifo()[source][source]

Whether this path is a FIFO.

is_socket()[source][source]

Whether this path is a socket.

expanduser()[source][source]

Return a new path with expanded ~ and ~user constructs (as returned by os.path.expanduser).

class tooluniverse.embedding_sync.datetime(year, month, day[, hour[, minute[, second[, microsecond[, tzinfo]]]]])[source][source]

Bases: date

The year, month and day arguments are required. tzinfo may be None, or an instance of a tzinfo subclass. The remaining arguments may be ints.

hour[source]
minute[source]
second[source]
microsecond[source]
tzinfo[source]
fold[source]
fromtimestamp()[source]

timestamp[, tz] -> tz’s local time from POSIX timestamp.

utcfromtimestamp()[source]

Construct a naive UTC datetime from a POSIX timestamp.

now()[source]

Returns new datetime object representing current time local to tz.

tz

Timezone object.

If no tz is specified, uses local timezone.

utcnow()[source]

Return a new datetime representing UTC day and time.
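utcnow() and utcfromtimestamp() return naive datetimes (tzinfo is None), which are easy to misread as local time; both are deprecated as of Python 3.12. A minimal sketch of the timezone-aware equivalents, using only the standard library:

```python
from datetime import datetime, timezone

# Aware "now" in UTC: same instant as utcnow(), but tzinfo is set.
aware_now = datetime.now(timezone.utc)

# Aware datetime from a POSIX timestamp (0 is the Unix epoch).
epoch = datetime.fromtimestamp(0, tz=timezone.utc)

print(aware_now.isoformat())
print(epoch.isoformat())
```

Because these objects carry their offset, timestamp() converts them back to POSIX time without depending on the machine's local timezone.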

combine()[source]

date, time -> datetime with same date and time fields

fromisoformat()[source]

string -> datetime from datetime.isoformat() output

timetuple()[source]

Return time tuple, compatible with time.localtime().

timestamp()[source]

Return POSIX timestamp as float.

utctimetuple()[source]

Return UTC time tuple, compatible with time.localtime().

date()[source]

Return date object with same year, month and day.

time()[source]

Return time object with same time but with tzinfo=None.

timetz()[source]

Return time object with same time and tzinfo.

replace()[source]

Return datetime with new specified fields.

astimezone()[source]

tz -> convert to local time in new timezone tz

ctime()[source]

Return ctime() style string.

isoformat()[source]

[sep] -> string in ISO 8601 format, YYYY-MM-DDT[HH[:MM[:SS[.mmm[uuu]]]]][+HH:MM]. sep is used to separate the date from the time, and defaults to ‘T’. The optional argument timespec specifies the number of additional terms of the time to include. Valid options are ‘auto’, ‘hours’, ‘minutes’, ‘seconds’, ‘milliseconds’ and ‘microseconds’.
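A short illustration of how sep and timespec shape the output, and of the round trip through fromisoformat(). The date used is arbitrary.

```python
from datetime import datetime

stamp = datetime(2024, 1, 2, 3, 4, 5, 678901)

# Default: 'T' separator, microseconds included because they are non-zero.
default_text = stamp.isoformat()

# A custom separator and a coarser timespec.
spaced = stamp.isoformat(sep=" ", timespec="seconds")

# 'milliseconds' truncates the fractional part to three digits.
millis = stamp.isoformat(timespec="milliseconds")

# fromisoformat() parses the output of isoformat() back into a datetime.
round_trip = datetime.fromisoformat(default_text)
```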

__repr__()[source]

Return repr(self).

__str__()[source]

Return str(self).

strptime()[source]

string, format -> new datetime parsed from a string (like time.strptime()).

utcoffset()[source]

Return self.tzinfo.utcoffset(self).

tzname()[source]

Return self.tzinfo.tzname(self).

dst()[source]

Return self.tzinfo.dst(self).

max = datetime.datetime(9999, 12, 31, 23, 59, 59, 999999)[source]
min = datetime.datetime(1, 1, 1, 0, 0)[source]
resolution = datetime.timedelta(microseconds=1)[source]
class tooluniverse.embedding_sync.HfApi(endpoint: str | None = None, token: str | bool | None = None, library_name: str | None = None, library_version: str | None = None, user_agent: Dict | str | None = None, headers: Dict[str, str] | None = None)[source][source]

Bases: object

Client to interact with the Hugging Face Hub via HTTP.

The client is initialized with some high-level settings used in all requests made to the Hub (HF endpoint, authentication, user agents…). Using the HfApi client is preferred but not mandatory as all of its public methods are exposed directly at the root of huggingface_hub.

Parameters:
  • endpoint (str, optional) – Endpoint of the Hub. Defaults to <https://huggingface.co>.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

  • library_name (str, optional) – The name of the library that is making the HTTP request. Will be added to the user-agent header. Example: "transformers".

  • library_version (str, optional) – The version of the library that is making the HTTP request. Will be added to the user-agent header. Example: "4.24.0".

  • user_agent (str, dict, optional) – The user agent info in the form of a dictionary or a single string. It will be completed with information about the installed packages.

  • headers (dict, optional) – Additional headers to be sent with each request. Example: {"X-My-Header": "value"}. Headers passed here take precedence over the default headers.

__init__(endpoint: str | None = None, token: str | bool | None = None, library_name: str | None = None, library_version: str | None = None, user_agent: Dict | str | None = None, headers: Dict[str, str] | None = None) None[source][source]
run_as_future(fn: Callable[[...], R], *args, **kwargs) Future[R][source][source]

Run a method in the background and return a Future instance.

The main goal is to run methods without blocking the main thread (e.g. to push data during training). Background jobs are queued to preserve order and are not run in parallel. If you need to speed up your scripts by parallelizing many calls to the API, you must set up and use your own [ThreadPoolExecutor](https://docs.python.org/3/library/concurrent.futures.html#threadpoolexecutor).

Note: Most-used methods like [upload_file], [upload_folder] and [create_commit] have a run_as_future: bool argument to directly call them in the background. This is equivalent to calling api.run_as_future(...) on them but less verbose.

Parameters:
  • fn (Callable) – The method to run in the background.

  • *args – Arguments with which the method will be called.

  • **kwargs – Arguments with which the method will be called.

Returns:

a [Future](https://docs.python.org/3/library/concurrent.futures.html#future-objects) instance to get the result of the task.

Return type:

Future

Example

>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> future = api.run_as_future(api.whoami) # instant
>>> future.done()
False
>>> future.result() # wait until complete and return result
(...)
>>> future.done()
True
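Since run_as_future queues jobs one after another, parallel calls need a separate executor, as noted above. The sketch below shows the general pattern with the standard library only. `fetch_info` is a hypothetical stand-in for a blocking Hub call (for example api.model_info) so that the snippet runs without network access, and the repo ids are made up.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_info(repo_id: str) -> dict:
    """Hypothetical stand-in for a blocking Hub request."""
    return {"repo_id": repo_id, "name_length": len(repo_id)}


repo_ids = ["user/model-a", "user/model-b", "user/model-c"]

# Up to three calls run concurrently; map() yields results in input order.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(fetch_info, repo_ids))

for info in results:
    print(info["repo_id"], info["name_length"])
```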
whoami(token: bool | str | None = None) Dict[source][source]

Call HF API to know “whoami”.

Parameters:

token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

get_token_permission(token: bool | str | None = None) Literal['read', 'write', 'fineGrained', None][source][source]

Check if a given token is valid and return its permissions.

Warning: This method is deprecated and will be removed in version 1.0. Permissions have become more complex since get_token_permission was first introduced: OAuth and fine-grained tokens allow for more detailed permissions. If you need to know the permissions associated with a token, use whoami and check the 'auth' key.

For more details about tokens, please refer to https://huggingface.co/docs/hub/security-tokens#what-are-user-access-tokens.

Parameters:

token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

Permission granted by the token (“read”, “write” or “fineGrained”). Returns None if no token is passed, if the token is invalid or if the role is not returned by the server. This typically happens when the token is an OAuth token.

Return type:

Literal["read", "write", "fineGrained", None]

get_model_tags() Dict[source][source]

List all valid model tags as a nested namespace object.

get_dataset_tags() Dict[source][source]

List all valid dataset tags as a nested namespace object.

list_models(*, filter: str | Iterable[str] | None = None, author: str | None = None, apps: str | List[str] | None = None, gated: bool | None = None, inference: Literal['warm'] | None = None, inference_provider: Literal['all'] | 'PROVIDER_T' | List['PROVIDER_T'] | None = None, model_name: str | None = None, trained_dataset: str | List[str] | None = None, search: str | None = None, pipeline_tag: str | None = None, emissions_thresholds: Tuple[float, float] | None = None, sort: Literal['last_modified'] | str | None = None, direction: Literal[-1] | None = None, limit: int | None = None, expand: List[ExpandModelProperty_T] | None = None, full: bool | None = None, cardData: bool = False, fetch_config: bool = False, token: bool | str | None = None, language: str | List[str] | None = None, library: str | List[str] | None = None, tags: str | List[str] | None = None, task: str | List[str] | None = None) Iterable[ModelInfo][source][source]

List models hosted on the Hugging Face Hub, given some filters.

Parameters:
  • filter (str or Iterable[str], optional) – A string or list of strings to filter models on the Hub. Models can be filtered by library, language, task, tags, and more.

  • author (str, optional) – A string that identifies the author (user or organization) of the returned models.

  • apps (str or List, optional) – A string or list of strings to filter models on the Hub that support the specified apps. Example values include "ollama" or ["ollama", "vllm"].

  • gated (bool, optional) – A boolean to filter models on the Hub that are gated or not. By default, all models are returned. If gated=True is passed, only gated models are returned. If gated=False is passed, only non-gated models are returned.

  • inference (Literal["warm"], optional) – If “warm”, filter models on the Hub currently served by at least one provider.

  • inference_provider (Literal["all"] or str, optional) – A string to filter models on the Hub that are served by a specific provider. Pass "all" to get all models served by at least one provider.

  • library (str or List, optional) – Deprecated. Pass a library name in filter to filter models by library.

  • language (str or List, optional) – Deprecated. Pass a language in filter to filter models by language.

  • model_name (str, optional) – A string that contains a complete or partial name of models on the Hub, such as “bert” or “bert-base-cased”.

  • task (str or List, optional) – Deprecated. Pass a task in filter to filter models by task.

  • trained_dataset (str or List, optional) – A string tag or a list of string tags of the trained dataset for a model on the Hub.

  • tags (str or List, optional) – Deprecated. Pass tags in filter to filter models by tags.

  • search (str, optional) – A string that will be contained in the returned model ids.

  • pipeline_tag (str, optional) – A string pipeline tag to filter models on the Hub by, such as summarization.

  • emissions_thresholds (Tuple, optional) – A tuple of two ints or floats representing the minimum and maximum carbon footprint, in grams, used to filter the resulting models.

  • sort (Literal["last_modified"] or str, optional) – The key with which to sort the resulting models. Possible values are “last_modified”, “trending_score”, “created_at”, “downloads” and “likes”.

  • direction (Literal[-1] or int, optional) – Direction in which to sort. The value -1 sorts by descending order while all other values sort by ascending order.

  • limit (int, optional) – The limit on the number of models fetched. Leaving this option to None fetches all models.

  • expand (List[ExpandModelProperty_T], optional) – List properties to return in the response. When used, only the properties in the list will be returned. This parameter cannot be used if full, cardData or fetch_config are passed. Possible values are "author", "cardData", "config", "createdAt", "disabled", "downloads", "downloadsAllTime", "gated", "gguf", "inference", "inferenceProviderMapping", "lastModified", "library_name", "likes", "mask_token", "model-index", "pipeline_tag", "private", "safetensors", "sha", "siblings", "spaces", "tags", "transformersInfo", "trendingScore", "widgetData", "resourceGroup" and "xetEnabled".

  • full (bool, optional) – Whether to fetch all model data, including the last_modified, the sha, the files and the tags. This is set to True by default when using a filter.

  • cardData (bool, optional) – Whether to grab the metadata for the model as well. Can contain useful information such as carbon emissions, metrics, and datasets trained on.

  • fetch_config (bool, optional) – Whether to fetch the model configs as well. This is not included in full due to its size.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

an iterable of [huggingface_hub.hf_api.ModelInfo] objects.

Return type:

Iterable[ModelInfo]

Example:

>>> from huggingface_hub import HfApi

>>> api = HfApi()

# List all models
>>> api.list_models()

# List text classification models
>>> api.list_models(filter="text-classification")

# List models from the KerasHub library
>>> api.list_models(filter="keras-hub")

# List models served by Cohere
>>> api.list_models(inference_provider="cohere")

# List models with "bert" in their name
>>> api.list_models(search="bert")

# List models with "bert" in their name and pushed by google
>>> api.list_models(search="bert", author="google")
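The parameter list above states that expand cannot be combined with full, cardData or fetch_config. A caller who builds keyword arguments dynamically could check this up front; the helper below is a hypothetical illustration of that rule and is not part of huggingface_hub.

```python
from typing import Any, Dict


def check_list_models_kwargs(**kwargs: Any) -> Dict[str, Any]:
    """Hypothetical pre-flight check for list_models keyword arguments."""
    # Options documented as incompatible with `expand`.
    conflicting = [
        name for name in ("full", "cardData", "fetch_config") if kwargs.get(name)
    ]
    if kwargs.get("expand") and conflicting:
        raise ValueError(
            "`expand` cannot be combined with: " + ", ".join(conflicting)
        )
    return kwargs


# Accepted: only `expand` is set.
ok_kwargs = check_list_models_kwargs(expand=["downloads", "likes"], limit=5)

# Rejected: `expand` together with `full`.
try:
    check_list_models_kwargs(expand=["downloads"], full=True)
    rejected = False
except ValueError:
    rejected = True
```

The validated dictionary can then be passed on as `api.list_models(**ok_kwargs)`.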
list_datasets(*, filter: str | Iterable[str] | None = None, author: str | None = None, benchmark: str | List[str] | None = None, dataset_name: str | None = None, gated: bool | None = None, language_creators: str | List[str] | None = None, language: str | List[str] | None = None, multilinguality: str | List[str] | None = None, size_categories: str | List[str] | None = None, task_categories: str | List[str] | None = None, task_ids: str | List[str] | None = None, search: str | None = None, sort: Literal['last_modified'] | str | None = None, direction: Literal[-1] | None = None, limit: int | None = None, expand: List[Literal['author', 'cardData', 'citation', 'createdAt', 'description', 'disabled', 'downloads', 'downloadsAllTime', 'gated', 'lastModified', 'likes', 'paperswithcode_id', 'private', 'resourceGroup', 'sha', 'siblings', 'tags', 'trendingScore', 'usedStorage', 'xetEnabled']] | None = None, full: bool | None = None, token: bool | str | None = None, tags: str | List[str] | None = None) Iterable[DatasetInfo][source][source]

List datasets hosted on the Hugging Face Hub, given some filters.

Parameters:
  • filter (str or Iterable[str], optional) – A string or list of strings to filter datasets on the Hub.

  • author (str, optional) – A string that identifies the author of the returned datasets.

  • benchmark (str or List, optional) – A string or list of strings that can be used to identify datasets on the Hub by their official benchmark.

  • dataset_name (str, optional) – A string that can be used to identify datasets on the Hub by name, such as SQAC or wikineural.

  • gated (bool, optional) – A boolean to filter datasets on the Hub that are gated or not. By default, all datasets are returned. If gated=True is passed, only gated datasets are returned. If gated=False is passed, only non-gated datasets are returned.

  • language_creators (str or List, optional) – A string or list of strings that can be used to identify datasets on the Hub with how the data was curated, such as crowdsourced or machine_generated.

  • language (str or List, optional) – A string or list of strings representing two-character language codes to filter datasets by on the Hub.

  • multilinguality (str or List, optional) – A string or list of strings representing a filter for datasets that contain multiple languages.

  • size_categories (str or List, optional) – A string or list of strings that can be used to identify datasets on the Hub by the size of the dataset such as 100K<n<1M or 1M<n<10M.

  • tags (str or List, optional) – Deprecated. Pass tags in filter to filter datasets by tags.

  • task_categories (str or List, optional) – A string or list of strings that can be used to identify datasets on the Hub by the designed task, such as audio_classification or named_entity_recognition.

  • task_ids (str or List, optional) – A string or list of strings that can be used to identify datasets on the Hub by the specific task such as speech_emotion_recognition or paraphrase.

  • search (str, optional) – A string that will be contained in the returned datasets.

  • sort (Literal["last_modified"] or str, optional) – The key with which to sort the resulting datasets. Possible values are “last_modified”, “trending_score”, “created_at”, “downloads” and “likes”.

  • direction (Literal[-1] or int, optional) – Direction in which to sort. The value -1 sorts by descending order while all other values sort by ascending order.

  • limit (int, optional) – The limit on the number of datasets fetched. Leaving this option to None fetches all datasets.

  • expand (List[ExpandDatasetProperty_T], optional) – List properties to return in the response. When used, only the properties in the list will be returned. This parameter cannot be used if full is passed. Possible values are "author", "cardData", "citation", "createdAt", "disabled", "description", "downloads", "downloadsAllTime", "gated", "lastModified", "likes", "paperswithcode_id", "private", "siblings", "sha", "tags", "trendingScore", "usedStorage", "resourceGroup" and "xetEnabled".

  • full (bool, optional) – Whether to fetch all dataset data, including the last_modified, the card_data and the files. Can contain useful information such as the PapersWithCode ID.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

an iterable of [huggingface_hub.hf_api.DatasetInfo] objects.

Return type:

Iterable[DatasetInfo]

Example usage with the filter argument:

>>> from huggingface_hub import HfApi

>>> api = HfApi()

# List all datasets
>>> api.list_datasets()


# List only the text classification datasets
>>> api.list_datasets(filter="task_categories:text-classification")


# List only the datasets in russian for language modeling
>>> api.list_datasets(
...     filter=("language:ru", "task_ids:language-modeling")
... )

# List FiftyOne datasets (identified by the tag "fiftyone" in dataset card)
>>> api.list_datasets(tags="fiftyone")

Example usage with the search argument:

>>> from huggingface_hub import HfApi

>>> api = HfApi()

# List all datasets with "text" in their name
>>> api.list_datasets(search="text")

# List all datasets with "text" in their name made by google
>>> api.list_datasets(search="text", author="google")
list_spaces(*, filter: str | Iterable[str] | None = None, author: str | None = None, search: str | None = None, datasets: str | Iterable[str] | None = None, models: str | Iterable[str] | None = None, linked: bool = False, sort: Literal['last_modified'] | str | None = None, direction: Literal[-1] | None = None, limit: int | None = None, expand: List[Literal['author', 'cardData', 'createdAt', 'datasets', 'disabled', 'lastModified', 'likes', 'models', 'private', 'resourceGroup', 'runtime', 'sdk', 'sha', 'siblings', 'subdomain', 'tags', 'trendingScore', 'usedStorage', 'xetEnabled']] | None = None, full: bool | None = None, token: bool | str | None = None) Iterable[SpaceInfo][source][source]

List Spaces hosted on the Hugging Face Hub, given some filters.

Parameters:
  • filter (str or Iterable, optional) – A string tag or list of tags that can be used to identify Spaces on the Hub.

  • author (str, optional) – A string that identifies the author of the returned Spaces.

  • search (str, optional) – A string that will be contained in the returned Spaces.

  • datasets (str or Iterable, optional) – Whether to return Spaces that make use of a dataset. The name of a specific dataset can be passed as a string.

  • models (str or Iterable, optional) – Whether to return Spaces that make use of a model. The name of a specific model can be passed as a string.

  • linked (bool, optional) – Whether to return Spaces that make use of either a model or a dataset.

  • sort (Literal["last_modified"] or str, optional) – The key with which to sort the resulting Spaces. Possible values are “last_modified”, “trending_score”, “created_at” and “likes”.

  • direction (Literal[-1] or int, optional) – Direction in which to sort. The value -1 sorts by descending order while all other values sort by ascending order.

  • limit (int, optional) – The limit on the number of Spaces fetched. Leaving this option to None fetches all Spaces.

  • expand (List[ExpandSpaceProperty_T], optional) – List properties to return in the response. When used, only the properties in the list will be returned. This parameter cannot be used if full is passed. Possible values are "author", "cardData", "datasets", "disabled", "lastModified", "createdAt", "likes", "models", "private", "runtime", "sdk", "siblings", "sha", "subdomain", "tags", "trendingScore", "usedStorage", "resourceGroup" and "xetEnabled".

  • full (bool, optional) – Whether to fetch all Spaces data, including the last_modified, siblings and card_data fields.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

an iterable of [huggingface_hub.hf_api.SpaceInfo] objects.

Return type:

Iterable[SpaceInfo]

unlike(repo_id: str, *, token: bool | str | None = None, repo_type: str | None = None) None[source][source]

Unlike a given repo on the Hub (e.g. remove from favorite list).

To prevent spam usage, it is not possible to like a repository from a script.

See also [list_liked_repos].

Parameters:
  • repo_id (str) – The repository to unlike. Example: "user/my-cool-model".

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

  • repo_type (str, optional) – Set to "dataset" or "space" if unliking a dataset or space, None or "model" if unliking a model. Default is None.

Raises:

[RepositoryNotFoundError] – If repository is not found (error 404): wrong repo_id/repo_type, private but not authenticated or repo does not exist.

Example:

>>> from huggingface_hub import list_liked_repos, unlike
>>> "gpt2" in list_liked_repos().models # we assume you have already liked gpt2
True
>>> unlike("gpt2")
>>> "gpt2" in list_liked_repos().models
False
list_liked_repos(user: str | None = None, *, token: bool | str | None = None) UserLikes[source][source]

List all public repos liked by a user on huggingface.co.

This list is public so token is optional. If user is not passed, it defaults to the logged in user.

See also [unlike].

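Parameters:
  • user (str, optional) – Name of the user whose liked repos are listed. Defaults to the logged-in user.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
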
Returns:

object containing the user name and 3 lists of repo ids (1 for models, 1 for datasets and 1 for Spaces).

Return type:

[UserLikes]

Raises:

[ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) – If user is not passed and no token is found (either from the argument or from the machine).

Example:

>>> from huggingface_hub import list_liked_repos
>>> likes = list_liked_repos("julien-c")
>>> likes.user
"julien-c"
>>> likes.models
["osanseviero/streamlit_1.15", "Xhaheen/ChatGPT_HF", ...]
list_repo_likers(repo_id: str, *, repo_type: str | None = None, token: bool | str | None = None) Iterable[User][source][source]

List all users who liked a given repo on the Hugging Face Hub.

See also [list_liked_repos].

Parameters:
  • repo_id (str) – The repository whose likers are retrieved. Example: "user/my-cool-model".

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

  • repo_type (str, optional) – Set to "dataset" or "space" if the repo is a dataset or space, None or "model" if it is a model. Default is None.

Returns:

an iterable of [huggingface_hub.hf_api.User] objects.

Return type:

Iterable[User]

model_info(repo_id: str, *, revision: str | None = None, timeout: float | None = None, securityStatus: bool | None = None, files_metadata: bool = False, expand: List[Literal['author', 'baseModels', 'cardData', 'childrenModelCount', 'config', 'createdAt', 'disabled', 'downloads', 'downloadsAllTime', 'gated', 'gguf', 'inference', 'inferenceProviderMapping', 'lastModified', 'library_name', 'likes', 'mask_token', 'model-index', 'pipeline_tag', 'private', 'resourceGroup', 'safetensors', 'sha', 'siblings', 'spaces', 'tags', 'transformersInfo', 'trendingScore', 'usedStorage', 'widgetData', 'xetEnabled']] | None = None, token: bool | str | None = None) ModelInfo[source][source]

Get info on one specific model on huggingface.co.

Model can be private if you pass an acceptable token or are logged in.

Parameters:
  • repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.

  • revision (str, optional) – The revision of the model repository from which to get the information.

  • timeout (float, optional) – Timeout, in seconds, for the request to the Hub.

  • securityStatus (bool, optional) – Whether to retrieve the security status from the model repository as well. The security status will be returned in the security_repo_status field.

  • files_metadata (bool, optional) – Whether or not to retrieve metadata for files in the repository (size, LFS metadata, etc). Defaults to False.

  • expand (List[ExpandModelProperty_T], optional) – List properties to return in the response. When used, only the properties in the list will be returned. This parameter cannot be used if securityStatus or files_metadata are passed. Possible values are "author", "baseModels", "cardData", "childrenModelCount", "config", "createdAt", "disabled", "downloads", "downloadsAllTime", "gated", "gguf", "inference", "inferenceProviderMapping", "lastModified", "library_name", "likes", "mask_token", "model-index", "pipeline_tag", "private", "safetensors", "sha", "siblings", "spaces", "tags", "transformersInfo", "trendingScore", "widgetData", "usedStorage", "resourceGroup" and "xetEnabled".

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

The model repository information.

Return type:

[huggingface_hub.hf_api.ModelInfo]

Raises:
  • [RepositoryNotFoundError] – If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.

  • [RevisionNotFoundError] – If the revision to download from cannot be found.
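The constraint that expand cannot be combined with securityStatus or files_metadata can be checked before any request is sent. The following is a minimal sketch, not part of huggingface_hub: build_model_info_kwargs is a hypothetical helper name, and it only assembles keyword arguments for model_info.

```python
from typing import Any, Dict, List, Optional


def build_model_info_kwargs(
    expand: Optional[List[str]] = None,
    files_metadata: bool = False,
    security_status: Optional[bool] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for model_info (hypothetical helper).

    Rejects the combination the parameter list rules out: expand together
    with securityStatus or files_metadata.
    """
    if expand and (files_metadata or security_status):
        raise ValueError(
            "expand cannot be combined with securityStatus or files_metadata"
        )
    kwargs: Dict[str, Any] = {}
    if expand:
        kwargs["expand"] = list(expand)
    if files_metadata:
        kwargs["files_metadata"] = True
    if security_status is not None:
        kwargs["securityStatus"] = security_status
    return kwargs


# Intended use (needs network access and huggingface_hub installed):
#   from huggingface_hub import HfApi
#   kwargs = build_model_info_kwargs(expand=["downloads", "likes"])
#   info = HfApi().model_info("google/gemma-7b", **kwargs)
```

Validating locally keeps the failure close to the call site instead of surfacing it as an error from the request.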

dataset_info(repo_id: str, *, revision: str | None = None, timeout: float | None = None, files_metadata: bool = False, expand: List[Literal['author', 'cardData', 'citation', 'createdAt', 'description', 'disabled', 'downloads', 'downloadsAllTime', 'gated', 'lastModified', 'likes', 'paperswithcode_id', 'private', 'resourceGroup', 'sha', 'siblings', 'tags', 'trendingScore', 'usedStorage', 'xetEnabled']] | None = None, token: bool | str | None = None) DatasetInfo[source][source]

Get info on one specific dataset on huggingface.co.

Dataset can be private if you pass an acceptable token.

Parameters:
  • repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.

  • revision (str, optional) – The revision of the dataset repository from which to get the information.

  • timeout (float, optional) – Timeout, in seconds, for the request to the Hub.

  • files_metadata (bool, optional) – Whether or not to retrieve metadata for files in the repository (size, LFS metadata, etc). Defaults to False.

  • expand (List[ExpandDatasetProperty_T], optional) – List properties to return in the response. When used, only the properties in the list will be returned. This parameter cannot be used if files_metadata is passed. Possible values are "author", "cardData", "citation", "createdAt", "disabled", "description", "downloads", "downloadsAllTime", "gated", "lastModified", "likes", "paperswithcode_id", "private", "siblings", "sha", "tags", "trendingScore", "usedStorage", "resourceGroup" and "xetEnabled".

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

The dataset repository information.

Return type:

[hf_api.DatasetInfo]

Raises:
  • [RepositoryNotFoundError] – If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.

  • [RevisionNotFoundError] – If the revision to download from cannot be found.

space_info(repo_id: str, *, revision: str | None = None, timeout: float | None = None, files_metadata: bool = False, expand: List[Literal['author', 'cardData', 'createdAt', 'datasets', 'disabled', 'lastModified', 'likes', 'models', 'private', 'resourceGroup', 'runtime', 'sdk', 'sha', 'siblings', 'subdomain', 'tags', 'trendingScore', 'usedStorage', 'xetEnabled']] | None = None, token: bool | str | None = None) SpaceInfo[source][source]

Get info on one specific Space on huggingface.co.

Space can be private if you pass an acceptable token.

Parameters:
  • repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.

  • revision (str, optional) – The revision of the space repository from which to get the information.

  • timeout (float, optional) – Timeout, in seconds, for the request to the Hub.

  • files_metadata (bool, optional) – Whether or not to retrieve metadata for files in the repository (size, LFS metadata, etc). Defaults to False.

  • expand (List[ExpandSpaceProperty_T], optional) – List properties to return in the response. When used, only the properties in the list will be returned. This parameter cannot be used if files_metadata is passed. Possible values are "author", "cardData", "createdAt", "datasets", "disabled", "lastModified", "likes", "models", "private", "runtime", "sdk", "siblings", "sha", "subdomain", "tags", "trendingScore", "usedStorage", "resourceGroup" and "xetEnabled".

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

The space repository information.

Return type:

[~hf_api.SpaceInfo]

Raises:
  • [RepositoryNotFoundError] – If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.

  • [RevisionNotFoundError] – If the revision to download from cannot be found.

repo_info(repo_id: str, *, revision: str | None = None, repo_type: str | None = None, timeout: float | None = None, files_metadata: bool = False, expand: Literal['author', 'baseModels', 'cardData', 'childrenModelCount', 'config', 'createdAt', 'disabled', 'downloads', 'downloadsAllTime', 'gated', 'gguf', 'inference', 'inferenceProviderMapping', 'lastModified', 'library_name', 'likes', 'mask_token', 'model-index', 'pipeline_tag', 'private', 'resourceGroup', 'safetensors', 'sha', 'siblings', 'spaces', 'tags', 'transformersInfo', 'trendingScore', 'usedStorage', 'widgetData', 'xetEnabled'] | Literal['author', 'cardData', 'citation', 'createdAt', 'description', 'disabled', 'downloads', 'downloadsAllTime', 'gated', 'lastModified', 'likes', 'paperswithcode_id', 'private', 'resourceGroup', 'sha', 'siblings', 'tags', 'trendingScore', 'usedStorage', 'xetEnabled'] | Literal['author', 'cardData', 'createdAt', 'datasets', 'disabled', 'lastModified', 'likes', 'models', 'private', 'resourceGroup', 'runtime', 'sdk', 'sha', 'siblings', 'subdomain', 'tags', 'trendingScore', 'usedStorage', 'xetEnabled'] | None = None, token: bool | str | None = None) ModelInfo | DatasetInfo | SpaceInfo[source][source]

Get the info object for a given repo of a given type.

Parameters:
  • repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.

  • revision (str, optional) – The revision of the repository from which to get the information.

  • repo_type (str, optional) – Set to "dataset" or "space" if getting repository info from a dataset or a space, None or "model" if getting repository info from a model. Default is None.

  • timeout (float, optional) – Timeout, in seconds, for the request to the Hub.

  • expand (ExpandModelProperty_T or ExpandDatasetProperty_T or ExpandSpaceProperty_T, optional) – List properties to return in the response. When used, only the properties in the list will be returned. This parameter cannot be used if files_metadata is passed. For an exhaustive list of available properties, check out [model_info], [dataset_info] or [space_info].

  • files_metadata (bool, optional) – Whether or not to retrieve metadata for files in the repository (size, LFS metadata, etc). Defaults to False.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

The repository information, as a [huggingface_hub.hf_api.DatasetInfo], [huggingface_hub.hf_api.ModelInfo] or [huggingface_hub.hf_api.SpaceInfo] object.

Return type:

Union[SpaceInfo, DatasetInfo, ModelInfo]

Raises:
  • [RepositoryNotFoundError] – If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.

  • [RevisionNotFoundError] – If the revision to download from cannot be found.
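The repo_type parameter selects which of the three type-specific info methods applies. The sketch below illustrates that routing under stated assumptions; it is not the library's implementation. fetch_repo_info is a hypothetical name, and api is assumed to be any object exposing model_info, dataset_info and space_info (for example an HfApi instance).

```python
from typing import Any, Optional


def fetch_repo_info(
    api: Any, repo_id: str, repo_type: Optional[str] = None, **kwargs: Any
) -> Any:
    """Route to the type-specific info method based on repo_type.

    None and "model" both select model_info, mirroring the parameter
    description for repo_type.
    """
    if repo_type is None or repo_type == "model":
        return api.model_info(repo_id, **kwargs)
    if repo_type == "dataset":
        return api.dataset_info(repo_id, **kwargs)
    if repo_type == "space":
        return api.space_info(repo_id, **kwargs)
    raise ValueError(f"Unsupported repo type: {repo_type!r}")


# Intended use (needs network access and huggingface_hub installed):
#   from huggingface_hub import HfApi
#   info = fetch_repo_info(HfApi(), "bigcode/the-stack", repo_type="dataset")
```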

repo_exists(repo_id: str, *, repo_type: str | None = None, token: str | bool | None = None) bool[source][source]

Checks if a repository exists on the Hugging Face Hub.

Parameters:
  • repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.

  • repo_type (str, optional) – Set to "dataset" or "space" if getting repository info from a dataset or a space, None or "model" if getting repository info from a model. Default is None.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

True if the repository exists, False otherwise.

Examples

>>> from huggingface_hub import repo_exists
>>> repo_exists("google/gemma-7b")
True
>>> repo_exists("google/not-a-repo")
False
revision_exists(repo_id: str, revision: str, *, repo_type: str | None = None, token: str | bool | None = None) bool[source][source]

Checks if a specific revision exists on a repo on the Hugging Face Hub.

Parameters:
  • repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.

  • revision (str) – The revision of the repository to check.

  • repo_type (str, optional) – Set to "dataset" or "space" if getting repository info from a dataset or a space, None or "model" if getting repository info from a model. Default is None.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

True if the repository and the revision exist, False otherwise.

Examples

>>> from huggingface_hub import revision_exists
>>> revision_exists("google/gemma-7b", "float16")
True
>>> revision_exists("google/gemma-7b", "not-a-revision")
False
file_exists(repo_id: str, filename: str, *, repo_type: str | None = None, revision: str | None = None, token: str | bool | None = None) bool[source][source]

Checks if a file exists in a repository on the Hugging Face Hub.

Parameters:
  • repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.

  • filename (str) – The name of the file to check, for example: "config.json"

  • repo_type (str, optional) – Set to "dataset" or "space" if getting repository info from a dataset or a space, None or "model" if getting repository info from a model. Default is None.

  • revision (str, optional) – The revision of the repository from which to get the information. Defaults to "main" branch.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

True if the file exists, False otherwise.

Examples

>>> from huggingface_hub import file_exists
>>> file_exists("bigcode/starcoder", "config.json")
True
>>> file_exists("bigcode/starcoder", "not-a-file")
False
>>> file_exists("bigcode/not-a-repo", "config.json")
False
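When several files must be present before a download or sync is attempted, file_exists can be called once per file. This is a minimal sketch under stated assumptions: find_missing_files is a hypothetical helper, and api is assumed to expose file_exists with the signature documented above.

```python
from typing import Any, Iterable, List, Optional


def find_missing_files(
    api: Any,
    repo_id: str,
    filenames: Iterable[str],
    repo_type: Optional[str] = None,
    revision: Optional[str] = None,
) -> List[str]:
    """Return the filenames that file_exists reports as absent.

    As the examples above show, file_exists also returns False when the
    repository itself does not exist, so every name is reported missing
    in that case.
    """
    return [
        name
        for name in filenames
        if not api.file_exists(
            repo_id, name, repo_type=repo_type, revision=revision
        )
    ]


# Intended use (needs network access and huggingface_hub installed):
#   from huggingface_hub import HfApi
#   find_missing_files(HfApi(), "bigcode/starcoder", ["config.json", "not-a-file"])
```

Each name costs one request, so for long lists a single list_repo_files call followed by a set difference may be cheaper.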
list_repo_files(repo_id: str, *, revision: str | None = None, repo_type: str | None = None, token: str | bool | None = None) List[str][source][source]

Get the list of files in a given repo.

Parameters:
  • repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.

  • revision (str, optional) – The revision of the repository from which to get the information.

  • repo_type (str, optional) – Set to "dataset" or "space" if listing files from a dataset or space, None or "model" if listing files from a model. Default is None.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

The list of files in the given repository.

Return type:

List[str]

list_repo_tree(repo_id: str, path_in_repo: str | None = None, *, recursive: bool = False, expand: bool = False, revision: str | None = None, repo_type: str | None = None, token: str | bool | None = None) Iterable[RepoFile | RepoFolder][source][source]

List a repo tree’s files and folders and get information about them.

Parameters:
  • repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.

  • path_in_repo (str, optional) – Relative path of the tree (folder) in the repo, for example: "checkpoints/1fec34a/results". Will default to the root tree (folder) of the repository.

  • recursive (bool, optional, defaults to False) – Whether to list tree’s files and folders recursively.

  • expand (bool, optional, defaults to False) – Whether to fetch more information about the tree’s files and folders (e.g. last commit and files’ security scan results). This operation is more expensive for the server so only 50 results are returned per page (instead of 1000). As pagination is implemented in huggingface_hub, this is transparent for you except for the time it takes to get the results.

  • revision (str, optional) – The revision of the repository from which to get the tree. Defaults to "main" branch.

  • repo_type (str, optional) – The type of the repository from which to get the tree ("model", "dataset" or "space"). Defaults to "model".

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

The information about the tree’s files and folders, as an iterable of [RepoFile] and [RepoFolder] objects. The order of the files and folders is not guaranteed.

Return type:

Iterable[Union[RepoFile, RepoFolder]]

Raises:
  • [RepositoryNotFoundError] – If repository is not found (error 404): wrong repo_id/repo_type, private but not authenticated or repo does not exist.

  • [RevisionNotFoundError] – If revision is not found (error 404) on the repo.

  • [EntryNotFoundError] – If the tree (folder) does not exist (error 404) on the repo.

Examples

Get information about a repo’s tree:

>>> from huggingface_hub import list_repo_tree
>>> repo_tree = list_repo_tree("lysandre/arxiv-nlp")
>>> repo_tree
<generator object HfApi.list_repo_tree at 0x7fa4088e1ac0>
>>> list(repo_tree)
[
    RepoFile(path='.gitattributes', size=391, blob_id='ae8c63daedbd4206d7d40126955d4e6ab1c80f8f', lfs=None, last_commit=None, security=None),
    RepoFile(path='README.md', size=391, blob_id='43bd404b159de6fba7c2f4d3264347668d43af25', lfs=None, last_commit=None, security=None),
    RepoFile(path='config.json', size=554, blob_id='2f9618c3a19b9a61add74f70bfb121335aeef666', lfs=None, last_commit=None, security=None),
    RepoFile(
        path='flax_model.msgpack', size=497764107, blob_id='8095a62ccb4d806da7666fcda07467e2d150218e',
        lfs={'size': 497764107, 'sha256': 'd88b0d6a6ff9c3f8151f9d3228f57092aaea997f09af009eefd7373a77b5abb9', 'pointer_size': 134}, last_commit=None, security=None
    ),
    RepoFile(path='merges.txt', size=456318, blob_id='226b0752cac7789c48f0cb3ec53eda48b7be36cc', lfs=None, last_commit=None, security=None),
    RepoFile(
        path='pytorch_model.bin', size=548123560, blob_id='64eaa9c526867e404b68f2c5d66fd78e27026523',
        lfs={'size': 548123560, 'sha256': '9be78edb5b928eba33aa88f431551348f7466ba9f5ef3daf1d552398722a5436', 'pointer_size': 134}, last_commit=None, security=None
    ),
    RepoFile(path='vocab.json', size=898669, blob_id='b00361fece0387ca34b4b8b8539ed830d644dbeb', lfs=None, last_commit=None, security=None)
]

Get even more information about a repo’s tree (last commit and files’ security scan results):

>>> from huggingface_hub import list_repo_tree
>>> repo_tree = list_repo_tree("prompthero/openjourney-v4", expand=True)
>>> list(repo_tree)
[
    RepoFolder(
        path='feature_extractor',
        tree_id='aa536c4ea18073388b5b0bc791057a7296a00398',
        last_commit={
            'oid': '47b62b20b20e06b9de610e840282b7e6c3d51190',
            'title': 'Upload diffusers weights (#48)',
            'date': datetime.datetime(2023, 3, 21, 9, 5, 27, tzinfo=datetime.timezone.utc)
        }
    ),
    RepoFolder(
        path='safety_checker',
        tree_id='65aef9d787e5557373fdf714d6c34d4fcdd70440',
        last_commit={
            'oid': '47b62b20b20e06b9de610e840282b7e6c3d51190',
            'title': 'Upload diffusers weights (#48)',
            'date': datetime.datetime(2023, 3, 21, 9, 5, 27, tzinfo=datetime.timezone.utc)
        }
    ),
    RepoFile(
        path='model_index.json',
        size=582,
        blob_id='d3d7c1e8c3e78eeb1640b8e2041ee256e24c9ee1',
        lfs=None,
        last_commit={
            'oid': 'b195ed2d503f3eb29637050a886d77bd81d35f0e',
            'title': 'Fix deprecation warning by changing ``CLIPFeatureExtractor`` to ``CLIPImageProcessor``. (#54)',
            'date': datetime.datetime(2023, 5, 15, 21, 41, 59, tzinfo=datetime.timezone.utc)
        },
        security={
            'safe': True,
            'av_scan': {'virusFound': False, 'virusNames': None},
            'pickle_import_scan': None
        }
    )
    ...
]
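Because the returned iterable mixes files and folders, a consumer usually has to tell them apart. The sketch below is one way to do that, assuming (from the output shown above) that file entries carry a size attribute and folder entries do not. summarize_tree is a hypothetical helper, not part of huggingface_hub.

```python
from typing import Any, Dict, Iterable


def summarize_tree(entries: Iterable[Any]) -> Dict[str, int]:
    """Count files and folders in a tree listing and sum the file sizes.

    Assumption: entries with a `size` attribute are files; entries
    without one are folders.
    """
    summary = {"files": 0, "folders": 0, "total_bytes": 0}
    for entry in entries:
        size = getattr(entry, "size", None)
        if size is None:
            summary["folders"] += 1
        else:
            summary["files"] += 1
            summary["total_bytes"] += size
    return summary


# Intended use (needs network access and huggingface_hub installed):
#   from huggingface_hub import list_repo_tree
#   summarize_tree(list_repo_tree("lysandre/arxiv-nlp", recursive=True))
```

The helper consumes the iterable in a single pass, so it also works directly on the generator that list_repo_tree returns.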
list_repo_refs(repo_id: str, *, repo_type: str | None = None, include_pull_requests: bool = False, token: str | bool | None = None) GitRefs[source][source]

Get the list of refs of a given repo (both tags and branches).

Parameters:
  • repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.

  • repo_type (str, optional) – Set to "dataset" or "space" if listing refs from a dataset or a Space, None or "model" if listing from a model. Default is None.

  • include_pull_requests (bool, optional) – Whether to include refs from pull requests in the list. Defaults to False.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Example:

>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> api.list_repo_refs("gpt2")
GitRefs(branches=[GitRefInfo(name='main', ref='refs/heads/main', target_commit='e7da7f221d5bf496a48136c0cd264e630fe9fcc8')], converts=[], tags=[])
>>> api.list_repo_refs("bigcode/the-stack", repo_type='dataset')
GitRefs(
    branches=[
        GitRefInfo(name='main', ref='refs/heads/main', target_commit='18edc1591d9ce72aa82f56c4431b3c969b210ae3'),
        GitRefInfo(name='v1.1.a1', ref='refs/heads/v1.1.a1', target_commit='f9826b862d1567f3822d3d25649b0d6d22ace714')
    ],
    converts=[],
    tags=[
        GitRefInfo(name='v1.0', ref='refs/tags/v1.0', target_commit='c37a8cd1e382064d8aced5e05543c5f7753834da')
    ]
)
Returns:

Object containing all information about branches and tags for a repo on the Hub.

Return type:

[GitRefs]
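A common follow-up is to resolve a branch or tag name to the commit it points at. The sketch below assumes the returned object exposes branches and tags as sequences of items with name and target_commit attributes, as in the example output. resolve_ref is a hypothetical helper.

```python
from typing import Any, Optional


def resolve_ref(refs: Any, name: str) -> Optional[str]:
    """Return the target commit for a branch or tag name, or None.

    Branches are checked before tags, so a branch wins if both share a name.
    """
    for ref in list(refs.branches) + list(refs.tags):
        if ref.name == name:
            return ref.target_commit
    return None


# Intended use (needs network access and huggingface_hub installed):
#   from huggingface_hub import HfApi
#   refs = HfApi().list_repo_refs("bigcode/the-stack", repo_type="dataset")
#   resolve_ref(refs, "v1.0")
```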

list_repo_commits(repo_id: str, *, repo_type: str | None = None, token: bool | str | None = None, revision: str | None = None, formatted: bool = False) List[GitCommitInfo][source][source]

Get the list of commits of a given revision for a repo on the Hub.

Commits are sorted by date (last commit first).

Parameters:
  • repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.

  • repo_type (str, optional) – Set to "dataset" or "space" if listing commits from a dataset or a Space, None or "model" if listing from a model. Default is None.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

  • revision (str, optional) – The git revision to list commits from. Defaults to the head of the "main" branch.

  • formatted (bool) – Whether to return the HTML-formatted title and description of the commits. Defaults to False.

Example:

>>> from huggingface_hub import HfApi
>>> api = HfApi()

# Commits are sorted by date (last commit first)
>>> initial_commit = api.list_repo_commits("gpt2")[-1]

# Initial commit is always a system commit containing the .gitattributes file.
>>> initial_commit
GitCommitInfo(
    commit_id='9b865efde13a30c13e0a33e536cf3e4a5a9d71d8',
    authors=['system'],
    created_at=datetime.datetime(2019, 2, 18, 10, 36, 15, tzinfo=datetime.timezone.utc),
    title='initial commit',
    message='',
    formatted_title=None,
    formatted_message=None
)

# Create an empty branch by deriving from initial commit
>>> api.create_branch("gpt2", "new_empty_branch", revision=initial_commit.commit_id)

Returns:

List of objects containing information about the commits for a repo on the Hub.

Return type:

List[[GitCommitInfo]]

Raises:
  • [RepositoryNotFoundError] – If repository is not found (error 404): wrong repo_id/repo_type, private but not authenticated or repo does not exist.

  • [RevisionNotFoundError] – If revision is not found (error 404) on the repo.
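Since commits come back newest first, selecting everything after a cutoff date can stop at the first older commit. This is a minimal sketch that relies on that ordering and on each item having a timezone-aware created_at attribute, as in the GitCommitInfo output above. commits_since is a hypothetical helper.

```python
from datetime import datetime
from typing import Any, Iterable, List


def commits_since(commits: Iterable[Any], cutoff: datetime) -> List[Any]:
    """Return commits created at or after cutoff, newest first.

    Relies on the input being sorted by date with the last commit first,
    so iteration stops at the first commit older than the cutoff.
    """
    recent = []
    for commit in commits:
        if commit.created_at < cutoff:
            break
        recent.append(commit)
    return recent


# Intended use (needs network access and huggingface_hub installed):
#   from datetime import timezone
#   from huggingface_hub import HfApi
#   commits = HfApi().list_repo_commits("gpt2")
#   commits_since(commits, datetime(2023, 1, 1, tzinfo=timezone.utc))
```

The cutoff must be timezone-aware as well, because Python refuses to compare naive and aware datetime values.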

get_paths_info(repo_id: str, paths: List[str] | str, *, expand: bool = False, revision: str | None = None, repo_type: str | None = None, token: str | bool | None = None) List[RepoFile | RepoFolder][source][source]

Get information about a repo’s paths.

Parameters:
  • repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.

  • paths (Union[List[str], str]) – The paths to get information about. If a path does not exist, it is ignored without raising an exception.

  • expand (bool, optional, defaults to False) – Whether to fetch more information about the paths (e.g. last commit and files’ security scan results). This operation is more expensive for the server so only 50 results are returned per page (instead of 1000). As pagination is implemented in huggingface_hub, this is transparent for you except for the time it takes to get the results.

  • revision (str, optional) – The revision of the repository from which to get the information. Defaults to "main" branch.

  • repo_type (str, optional) – The type of the repository from which to get the information ("model", "dataset" or "space"). Defaults to "model".

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

The information about the paths, as a list of [RepoFile] and [RepoFolder] objects.

Return type:

List[Union[RepoFile, RepoFolder]]

Raises:
  • [RepositoryNotFoundError] – If repository is not found (error 404): wrong repo_id/repo_type, private but not authenticated or repo does not exist.

  • [RevisionNotFoundError] – If revision is not found (error 404) on the repo.

Example:

>>> from huggingface_hub import get_paths_info
>>> paths_info = get_paths_info("allenai/c4", ["README.md", "en"], repo_type="dataset")
>>> paths_info
[
    RepoFile(path='README.md', size=2379, blob_id='f84cb4c97182890fc1dbdeaf1a6a468fd27b4fff', lfs=None, last_commit=None, security=None),
    RepoFolder(path='en', tree_id='dc943c4c40f53d02b31ced1defa7e5f438d5862e', last_commit=None)
]
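Because paths that do not exist are dropped silently, a caller that needs to know which ones were missing has to compare the request against the result. The sketch below assumes each returned item has a path attribute equal to the requested string. missing_paths is a hypothetical helper.

```python
from typing import Any, Iterable, List


def missing_paths(requested: Iterable[str], infos: Iterable[Any]) -> List[str]:
    """Return requested paths that have no matching entry in infos.

    Assumption: the `path` attribute of each returned entry matches the
    requested string exactly (no trailing slash or other normalization).
    """
    found = {info.path for info in infos}
    return [path for path in requested if path not in found]


# Intended use (needs network access and huggingface_hub installed):
#   from huggingface_hub import get_paths_info
#   wanted = ["README.md", "en", "does-not-exist"]
#   infos = get_paths_info("allenai/c4", wanted, repo_type="dataset")
#   missing_paths(wanted, infos)
```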
super_squash_history(repo_id: str, *, branch: str | None = None, commit_message: str | None = None, repo_type: str | None = None, token: str | bool | None = None) None[source][source]

Squash commit history on a branch for a repo on the Hub.

Squashing the repo history is useful when you know you’ll make hundreds of commits and you don’t want to clutter the history. Squashing commits can only be performed from the head of a branch.

Warning: Once squashed, the commit history cannot be retrieved. This is a non-revertible operation.

Warning: Once the history of a branch has been squashed, it is not possible to merge it back into another branch since their history will have diverged.

Parameters:
  • repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.

  • branch (str, optional) – The branch to squash. Defaults to the head of the "main" branch.

  • commit_message (str, optional) – The commit message to use for the squashed commit.

  • repo_type (str, optional) – Set to "dataset" or "space" if squashing the history of a dataset or a Space, None or "model" if squashing the history of a model. Default is None.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Raises:
  • [RepositoryNotFoundError] – If repository is not found (error 404): wrong repo_id/repo_type, private but not authenticated or repo does not exist.

  • [RevisionNotFoundError] – If the branch to squash cannot be found.

  • [BadRequestError] – If invalid reference for a branch. You cannot squash history on tags.

Example:

>>> from huggingface_hub import HfApi
>>> api = HfApi()

# Create repo
>>> repo_id = api.create_repo("test-squash").repo_id

# Make a lot of commits.
>>> api.upload_file(repo_id=repo_id, path_in_repo="file.txt", path_or_fileobj=b"content")
>>> api.upload_file(repo_id=repo_id, path_in_repo="lfs.bin", path_or_fileobj=b"content")
>>> api.upload_file(repo_id=repo_id, path_in_repo="file.txt", path_or_fileobj=b"another_content")

# Squash history
>>> api.super_squash_history(repo_id=repo_id)

list_lfs_files(repo_id: str, *, repo_type: str | None = None, token: bool | str | None = None) Iterable[LFSFileInfo][source][source]

List all LFS files in a repo on the Hub.

This is primarily useful to count how much storage a repo is using and to eventually clean up large files with [permanently_delete_lfs_files]. Note that deletion is a permanent action that affects all commits referencing the deleted files and cannot be undone.

Parameters:
  • repo_id (str) – The repository for which you are listing LFS files.

  • repo_type (str, optional) – Type of repository. Set to "dataset" or "space" if listing from a dataset or space, None or "model" if listing from a model. Default is None.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

An iterator of [LFSFileInfo] objects.

Return type:

Iterable[LFSFileInfo]

Example

>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> lfs_files = api.list_lfs_files("username/my-cool-repo")

# Filter files to delete based on a combination of ``filename``, ``pushed_at``, ``ref`` or ``size``.
# e.g. select only LFS files in the "checkpoints" folder
>>> lfs_files_to_delete = (lfs_file for lfs_file in lfs_files if lfs_file.filename.startswith("checkpoints/"))

# Permanently delete LFS files
>>> api.permanently_delete_lfs_files("username/my-cool-repo", lfs_files_to_delete)
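To see where storage goes before deleting anything, the listing can be grouped by top-level folder. This is a minimal sketch, assuming each item exposes the filename and size attributes mentioned in the example above. lfs_usage_by_folder is a hypothetical helper and performs no deletion.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable


def lfs_usage_by_folder(lfs_files: Iterable[Any]) -> Dict[str, int]:
    """Sum LFS file sizes per top-level folder.

    Files at the repository root are grouped under ".".
    """
    usage: Dict[str, int] = defaultdict(int)
    for lfs_file in lfs_files:
        name = lfs_file.filename
        folder = name.split("/", 1)[0] if "/" in name else "."
        usage[folder] += lfs_file.size
    return dict(usage)


# Intended use (needs network access and huggingface_hub installed):
#   from huggingface_hub import HfApi
#   lfs_usage_by_folder(HfApi().list_lfs_files("username/my-cool-repo"))
```

Reviewing the totals first makes it easier to pick a narrow filter, which matters because the deletion step cannot be undone.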
permanently_delete_lfs_files(repo_id: str, lfs_files: Iterable[LFSFileInfo], *, rewrite_history: bool = True, repo_type: str | None = None, token: bool | str | None = None) None[source][source]

Permanently delete LFS files from a repo on the Hub.

Warning: This is a permanent action that will affect all commits referencing the deleted files and might corrupt your repository. This is a non-revertible operation. Use it only if you know what you are doing.

Parameters:
  • repo_id (str) – The repository from which you are deleting LFS files.

  • lfs_files (Iterable[LFSFileInfo]) – An iterable of [LFSFileInfo] items to permanently delete from the repo. Use [list_lfs_files] to list all LFS files from a repo.

  • rewrite_history (bool, optional, defaults to True) – Whether to rewrite repository history to remove file pointers referencing the deleted LFS files (recommended).

  • repo_type (str, optional) – Type of repository. Set to "dataset" or "space" if deleting from a dataset or space, None or "model" if deleting from a model. Default is None.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Example

>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> lfs_files = api.list_lfs_files("username/my-cool-repo")

# Filter files to delete based on a combination of ``filename``, ``pushed_at``, ``ref`` or ``size``.
# e.g. select only LFS files in the "checkpoints" folder
>>> lfs_files_to_delete = (lfs_file for lfs_file in lfs_files if lfs_file.filename.startswith("checkpoints/"))

# Permanently delete LFS files
>>> api.permanently_delete_lfs_files("username/my-cool-repo", lfs_files_to_delete)
create_repo(repo_id: str, *, token: str | bool | None = None, private: bool | None = None, repo_type: str | None = None, exist_ok: bool = False, resource_group_id: str | None = None, space_sdk: str | None = None, space_hardware: SpaceHardware | None = None, space_storage: SpaceStorage | None = None, space_sleep_time: int | None = None, space_secrets: List[Dict[str, str]] | None = None, space_variables: List[Dict[str, str]] | None = None) RepoUrl[source][source]

Create an empty repo on the HuggingFace Hub.

Parameters:
  • repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

  • private (bool, optional) – Whether to make the repo private. If None (default), the repo will be public unless the organization’s default is private. This value is ignored if the repo already exists.

  • repo_type (str, optional) – Set to "dataset" or "space" if creating a dataset or space, None or "model" if creating a model. Default is None.

  • exist_ok (bool, optional, defaults to False) – If True, do not raise an error if repo already exists.

  • resource_group_id (str, optional) – Resource group in which to create the repo. Resource groups are only available for Enterprise Hub organizations and let you define which members of the organization can access the resource. The ID of a resource group can be found in the URL of the resource’s page on the Hub (e.g. "66670e5163145ca562cb1988"). To learn more about resource groups, see https://huggingface.co/docs/hub/en/security-resource-groups.

  • space_sdk (str, optional) – Choice of SDK to use if repo_type is "space". Can be "streamlit", "gradio", "docker", or "static".

  • space_hardware (SpaceHardware or str, optional) – Choice of Hardware if repo_type is "space". See [SpaceHardware] for a complete list.

  • space_storage (SpaceStorage or str, optional) – Choice of persistent storage tier. Example: "small". See [SpaceStorage] for a complete list.

  • space_sleep_time (int, optional) – Number of seconds of inactivity to wait before a Space is put to sleep. Set to -1 if you don’t want your Space to sleep (default behavior for upgraded hardware). For free hardware, you can’t configure the sleep time (value is fixed to 48 hours of inactivity). See https://huggingface.co/docs/hub/spaces-gpus#sleep-time for more details.

  • space_secrets (List[Dict[str, str]], optional) – A list of secret keys to set in your Space. Each item is in the form {"key": ..., "value": ..., "description": ...} where description is optional. For more details, see https://huggingface.co/docs/hub/spaces-overview#managing-secrets.

  • space_variables (List[Dict[str, str]], optional) – A list of public environment variables to set in your Space. Each item is in the form {"key": ..., "value": ..., "description": ...} where description is optional. For more details, see https://huggingface.co/docs/hub/spaces-overview#managing-secrets-and-environment-variables.

Returns:

URL to the newly created repo. Value is a subclass of str containing attributes like endpoint, repo_type and repo_id.

Return type:

[RepoUrl]
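The item shape described for space_secrets and space_variables can be checked locally before calling create_repo. The following is a minimal sketch using only the standard library; the helper name check_space_entries is hypothetical, and rejecting unexpected keys is a choice of this sketch rather than documented Hub behaviour.

```python
REQUIRED_KEYS = {"key", "value"}
OPTIONAL_KEYS = {"description"}


def check_space_entries(entries, kind="secret"):
    """Return a cleaned copy of `entries`, or raise ValueError on a malformed item."""
    cleaned = []
    for index, entry in enumerate(entries):
        keys = set(entry)
        missing = REQUIRED_KEYS - keys
        unknown = keys - REQUIRED_KEYS - OPTIONAL_KEYS
        if missing:
            raise ValueError(f"{kind} #{index} is missing {sorted(missing)}")
        if unknown:
            # Assumption of this sketch: extra keys are treated as mistakes.
            raise ValueError(f"{kind} #{index} has unexpected keys {sorted(unknown)}")
        # The documented type is Dict[str, str], so coerce values to strings.
        cleaned.append({name: str(value) for name, value in entry.items()})
    return cleaned


# Placeholder values for illustration only.
space_secrets = check_space_entries(
    [{"key": "HF_TOKEN", "value": "hf_xxx", "description": "Token used by the app"}]
)
space_variables = check_space_entries(
    [{"key": "MODEL_REPO_ID", "value": "username/my-model"}], kind="variable"
)
```

The two lists can then be passed as the space_secrets and space_variables arguments of create_repo together with repo_type="space".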

delete_repo(repo_id: str, *, token: str | bool | None = None, repo_type: str | None = None, missing_ok: bool = False) None[source][source]

Delete a repo from the HuggingFace Hub. CAUTION: this is irreversible.

Parameters:
  • repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

  • repo_type (str, optional) – Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model.

  • missing_ok (bool, optional, defaults to False) – If True, do not raise an error if repo does not exist.

Raises:

[RepositoryNotFoundError] – If the repository to delete from cannot be found and missing_ok is set to False (default).

update_repo_visibility(repo_id: str, private: bool = False, *, token: str | bool | None = None, repo_type: str | None = None) Dict[str, bool][source][source]

Update the visibility setting of a repository.

Deprecated. Use update_repo_settings instead.

Parameters:
  • repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.

  • private (bool, optional, defaults to False) – Whether the repository should be private.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

  • repo_type (str, optional) – Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.

Returns:

The HTTP response in json.

<Tip>

Raises the following errors:

  • [RepositoryNotFoundError] – If the repository cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.

</Tip>

update_repo_settings(repo_id: str, *, gated: Literal['auto', 'manual', False] | None = None, private: bool | None = None, token: str | bool | None = None, repo_type: str | None = None, xet_enabled: bool | None = None) None[source][source]

Update the settings of a repository, including gated access and visibility.

To give more control over how repos are used, the Hub allows repo authors to enable access requests for their repos, and also to set the visibility of the repo to private.

Parameters:
  • repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.

  • gated (Literal["auto", "manual", False], optional) –

    The gated status for the repository. If set to None (default), the gated setting of the repository won’t be updated.

    • “auto”: The repository is gated, and access requests are automatically approved or denied based on predefined criteria.

    • “manual”: The repository is gated, and access requests require manual approval.

    • False: The repository is not gated, and anyone can access it.

  • private (bool, optional) – Whether the repository should be private.

  • token (Union[str, bool, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

  • repo_type (str, optional) – The type of the repository to update settings from ("model", "dataset" or "space"). Defaults to "model".

  • xet_enabled (bool, optional) – Whether the repository should be enabled for Xet Storage.

Raises:
  • [ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) – If gated is not one of “auto”, “manual”, or False.

  • [ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) – If repo_type is not one of the values in constants.REPO_TYPES.

  • [HfHubHTTPError] – If the request to the Hugging Face Hub API fails.

  • [RepositoryNotFoundError] – If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
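Since update_repo_visibility is deprecated in favour of update_repo_settings, a thin wrapper can validate the gated value locally and forward only the settings that were actually given. This is a sketch under assumptions: apply_settings and _RecordingApi are hypothetical names, and the stub merely records calls so the example runs without network access. In real use, an instance of huggingface_hub.HfApi would be passed instead.

```python
def is_valid_gated(value):
    """Accept exactly the documented values: None, False, "auto" or "manual"."""
    return value is None or value is False or value in ("auto", "manual")


def apply_settings(api, repo_id, *, gated=None, private=None, repo_type=None):
    """Forward the requested settings to api.update_repo_settings."""
    if not is_valid_gated(gated):
        raise ValueError(f"gated must be 'auto', 'manual' or False, got {gated!r}")
    kwargs = {"repo_id": repo_id, "repo_type": repo_type}
    if gated is not None:
        kwargs["gated"] = gated
    if private is not None:
        kwargs["private"] = private
    api.update_repo_settings(**kwargs)
    return kwargs


class _RecordingApi:
    """Hypothetical stand-in for HfApi that records calls instead of sending them."""

    def __init__(self):
        self.calls = []

    def update_repo_settings(self, **kwargs):
        self.calls.append(kwargs)


demo_api = _RecordingApi()
# Make the repo private without touching its gated status.
sent = apply_settings(demo_api, "username/my-dataset", private=True, repo_type="dataset")
```

Leaving gated out of the call mirrors the documented behaviour that None keeps the current gated setting unchanged.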

move_repo(from_id: str, to_id: str, *, repo_type: str | None = None, token: str | bool | None = None)[source][source]

Move a repository from namespace1/repo_name1 to namespace2/repo_name2.

Note there are certain limitations. For more information about moving repositories, please see https://hf.co/docs/hub/repositories-settings#renaming-or-transferring-a-repo.

Parameters:
  • from_id (str) – A namespace (user or an organization) and a repo name separated by a /. Original repository identifier.

  • to_id (str) – A namespace (user or an organization) and a repo name separated by a /. Final repository identifier.

  • repo_type (str, optional) – Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

<Tip>

Raises the following errors:

  • [RepositoryNotFoundError] – If the repository cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.

</Tip>

create_commit(repo_id: str, operations: Iterable[CommitOperation], *, commit_message: str, commit_description: str | None = None, token: str | bool | None = None, repo_type: str | None = None, revision: str | None = None, create_pr: bool | None = None, num_threads: int = 5, parent_commit: str | None = None, run_as_future: Literal[False] = False) CommitInfo[source][source]
create_commit(repo_id: str, operations: Iterable[CommitOperation], *, commit_message: str, commit_description: str | None = None, token: str | bool | None = None, repo_type: str | None = None, revision: str | None = None, create_pr: bool | None = None, num_threads: int = 5, parent_commit: str | None = None, run_as_future: Literal[True] = False) Future[CommitInfo]

Creates a commit in the given repo, deleting & uploading files as needed.

<Tip warning={true}>

The input list of CommitOperation will be mutated during the commit process. Do not reuse the same objects for multiple commits.

</Tip>

<Tip warning={true}>

create_commit assumes that the repo already exists on the Hub. If you get a Client error 404, please make sure you are authenticated and that repo_id and repo_type are set correctly. If repo does not exist, create it first using [~hf_api.create_repo].

</Tip>

<Tip warning={true}>

create_commit is limited to 25k LFS files and a 1GB payload for regular files.

</Tip>

Parameters:
  • repo_id (str) – The repository in which the commit will be created, for example: "username/custom_transformers"

  • operations (Iterable of [~hf_api.CommitOperation]) –

    An iterable of operations to include in the commit, either:

    • [~hf_api.CommitOperationAdd] to upload a file

    • [~hf_api.CommitOperationDelete] to delete a file

    • [~hf_api.CommitOperationCopy] to copy a file

    Operation objects will be mutated to include information relative to the upload. Do not reuse the same objects for multiple commits.

  • commit_message (str) – The summary (first line) of the commit that will be created.

  • commit_description (str, optional) – The description of the commit that will be created

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

  • repo_type (str, optional) – Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.

  • revision (str, optional) – The git revision to commit from. Defaults to the head of the "main" branch.

  • create_pr (boolean, optional) – Whether or not to create a Pull Request with that commit. Defaults to False. If revision is not set, PR is opened against the "main" branch. If revision is set and is a branch, PR is opened against this branch. If revision is set and is not a branch name (example: a commit oid), a RevisionNotFoundError is returned by the server.

  • num_threads (int, optional) – Number of concurrent threads for uploading files. Defaults to 5. Setting it to 2 means at most 2 files will be uploaded concurrently.

  • parent_commit (str, optional) – The OID / SHA of the parent commit, as a hexadecimal string. Shorthands (7 first characters) are also supported. If specified and create_pr is False, the commit will fail if revision does not point to parent_commit. If specified and create_pr is True, the pull request will be created from parent_commit. Specifying parent_commit ensures the repo has not changed before committing the changes, and can be especially useful if the repo is updated / committed to concurrently.

  • run_as_future (bool, optional) – Whether or not to run this method in the background. Background jobs are run sequentially without blocking the main thread. Passing run_as_future=True will return a [Future](https://docs.python.org/3/library/concurrent.futures.html#future-objects) object. Defaults to False.

Returns:

Instance of [CommitInfo] containing information about the newly created commit (commit hash, commit url, pr url, commit message,…). If run_as_future=True is passed, returns a Future object which will contain the result when executed.

Return type:

[CommitInfo] or Future

Raises:
  • [ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) – If commit message is empty.

  • [ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) – If parent commit is not a valid commit OID.

  • [ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) – If a README.md file with an invalid metadata section is committed. In this case, the commit will fail early, before trying to upload any file.

  • [ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) – If create_pr is True and revision is neither None nor "main".

  • [RepositoryNotFoundError] – If repository is not found (error 404): wrong repo_id/repo_type, private but not authenticated or repo does not exist.
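The parent_commit and run_as_future parameters described above can be combined in a small helper: the commit is pinned to an expected parent, and the caller handles a plain result or a Future uniformly. This is a sketch, not the library's own code. commit_on_parent, resolve and _StubApi are hypothetical names, and the stub returns a dictionary in place of a real CommitInfo so the example runs offline.

```python
from concurrent.futures import Future


def commit_on_parent(api, repo_id, operations, message, parent_commit, background=False):
    """Create a commit that is pinned to `parent_commit`."""
    if not message or not message.strip():
        # Mirrors the documented ValueError for an empty commit message.
        raise ValueError("commit_message must not be empty")
    return api.create_commit(
        repo_id=repo_id,
        operations=operations,
        commit_message=message,
        parent_commit=parent_commit,
        run_as_future=background,
    )


def resolve(result, timeout=None):
    """Return the commit info, waiting on it first if it is a Future."""
    if isinstance(result, Future):
        return result.result(timeout=timeout)
    return result


class _StubApi:
    """Hypothetical stand-in for HfApi so the sketch runs without network access."""

    def create_commit(self, *, repo_id, operations, commit_message,
                      parent_commit=None, run_as_future=False):
        info = {"repo_id": repo_id, "message": commit_message, "parent": parent_commit}
        if run_as_future:
            future = Future()
            future.set_result(info)  # a real client would resolve this later
            return future
        return info


stub = _StubApi()
blocking = resolve(commit_on_parent(stub, "username/my-model", [], "Add shard", "abc1234"))
pending = commit_on_parent(stub, "username/my-model", [], "Add shard", "abc1234", background=True)
```

With a real client, a fresh list of CommitOperation objects should be built for every call, because the operations are mutated during the commit.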

preupload_lfs_files(repo_id: str, additions: Iterable[CommitOperationAdd], *, token: str | bool | None = None, repo_type: str | None = None, revision: str | None = None, create_pr: bool | None = None, num_threads: int = 5, free_memory: bool = True, gitignore_content: str | None = None)[source][source]

Pre-upload LFS files to S3 in preparation for a future commit.

This method is useful if you are generating the files to upload on-the-fly and you don’t want to store them in memory before uploading them all at once.

<Tip warning={true}>

This is a power-user method. You shouldn’t need to call it directly to make a normal commit. Use [create_commit] directly instead.

</Tip>

<Tip warning={true}>

Commit operations will be mutated during the process. In particular, the attached path_or_fileobj will be removed after the upload to save memory (and replaced by an empty bytes object). Do not reuse the same objects except to pass them to [create_commit]. If you don’t want to remove the attached content from the commit operation object, pass free_memory=False.

</Tip>

Parameters:
  • repo_id (str) – The repository in which you will commit the files, for example: "username/custom_transformers".

  • additions (Iterable of [CommitOperationAdd]) – The list of files to upload. Warning: the objects in this list will be mutated to include information relative to the upload. Do not reuse the same objects for multiple commits.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

  • repo_type (str, optional) – The type of repository to upload to (e.g. "model" -default-, "dataset" or "space").

  • revision (str, optional) – The git revision to commit from. Defaults to the head of the "main" branch.

  • create_pr (boolean, optional) – Whether or not you plan to create a Pull Request with that commit. Defaults to False.

  • num_threads (int, optional) – Number of concurrent threads for uploading files. Defaults to 5. Setting it to 2 means at most 2 files will be uploaded concurrently.

  • gitignore_content (str, optional) – The content of the .gitignore file to know which files should be ignored. The order of priority is to first check if gitignore_content is passed, then check if the .gitignore file is present in the list of files to commit and finally default to the .gitignore file already hosted on the Hub (if any).

Example:

>>> from huggingface_hub import CommitOperationAdd, preupload_lfs_files, create_commit, create_repo
>>> repo_id = create_repo("test_preupload").repo_id

# Generate and preupload LFS files one by one
>>> operations = []  # List of all CommitOperationAdd objects that will be generated
>>> for i in range(5):
...     content = ...  # generate binary content
...     addition = CommitOperationAdd(path_in_repo=f"shard_{i}_of_5.bin", path_or_fileobj=content)
...     preupload_lfs_files(repo_id, additions=[addition])  # upload + free memory
...     operations.append(addition)

# Create commit
>>> create_commit(repo_id, operations=operations, commit_message="Commit all shards")

upload_file(*, path_or_fileobj: str | Path | bytes | BinaryIO, path_in_repo: str, repo_id: str, token: str | bool | None = None, repo_type: str | None = None, revision: str | None = None, commit_message: str | None = None, commit_description: str | None = None, create_pr: bool | None = None, parent_commit: str | None = None, run_as_future: Literal[False] = False) CommitInfo[source][source]
upload_file(*, path_or_fileobj: str | Path | bytes | BinaryIO, path_in_repo: str, repo_id: str, token: str | bool | None = None, repo_type: str | None = None, revision: str | None = None, commit_message: str | None = None, commit_description: str | None = None, create_pr: bool | None = None, parent_commit: str | None = None, run_as_future: Literal[True] = False) Future[CommitInfo]

Upload a local file (up to 50 GB) to the given repo. The upload is done through an HTTP POST request, and doesn’t require git or git-lfs to be installed.

Parameters:
  • path_or_fileobj (str, Path, bytes, or IO) – Path to a file on the local machine or binary data stream / fileobj / buffer.

  • path_in_repo (str) – Relative filepath in the repo, for example: "checkpoints/1fec34a/weights.bin"

  • repo_id (str) – The repository to which the file will be uploaded, for example: "username/custom_transformers"

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

  • repo_type (str, optional) – Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.

  • revision (str, optional) – The git revision to commit from. Defaults to the head of the "main" branch.

  • commit_message (str, optional) – The summary / title / first line of the generated commit

  • commit_description (str, optional) – The description of the generated commit

  • create_pr (boolean, optional) – Whether or not to create a Pull Request with that commit. Defaults to False. If revision is not set, PR is opened against the "main" branch. If revision is set and is a branch, PR is opened against this branch. If revision is set and is not a branch name (example: a commit oid), a RevisionNotFoundError is returned by the server.

  • parent_commit (str, optional) – The OID / SHA of the parent commit, as a hexadecimal string. Shorthands (7 first characters) are also supported. If specified and create_pr is False, the commit will fail if revision does not point to parent_commit. If specified and create_pr is True, the pull request will be created from parent_commit. Specifying parent_commit ensures the repo has not changed before committing the changes, and can be especially useful if the repo is updated / committed to concurrently.

  • run_as_future (bool, optional) – Whether or not to run this method in the background. Background jobs are run sequentially without blocking the main thread. Passing run_as_future=True will return a [Future](https://docs.python.org/3/library/concurrent.futures.html#future-objects) object. Defaults to False.

Returns:

Instance of [CommitInfo] containing information about the newly created commit (commit hash, commit url, pr url, commit message,…). If run_as_future=True is passed, returns a Future object which will contain the result when executed.

Return type:

[CommitInfo] or Future

<Tip warning={true}>

upload_file assumes that the repo already exists on the Hub. If you get a Client error 404, please make sure you are authenticated and that repo_id and repo_type are set correctly. If repo does not exist, create it first using [~hf_api.create_repo].

</Tip>

Example:

>>> from huggingface_hub import upload_file

>>> with open("./local/filepath", "rb") as fileobj:
...     upload_file(
...         path_or_fileobj=fileobj,
...         path_in_repo="remote/file/path.h5",
...         repo_id="username/my-dataset",
...         repo_type="dataset",
...         token="my_token",
...     )
"https://huggingface.co/datasets/username/my-dataset/blob/main/remote/file/path.h5"

>>> upload_file(
...     path_or_fileobj=".\\local\\file\\path",
...     path_in_repo="remote/file/path.h5",
...     repo_id="username/my-model",
...     token="my_token",
... )
"https://huggingface.co/username/my-model/blob/main/remote/file/path.h5"

>>> upload_file(
...     path_or_fileobj=".\\local\\file\\path",
...     path_in_repo="remote/file/path.h5",
...     repo_id="username/my-model",
...     token="my_token",
...     create_pr=True,
... )
"https://huggingface.co/username/my-model/blob/refs%2Fpr%2F1/remote/file/path.h5"
upload_folder(*, repo_id: str, folder_path: str | Path, path_in_repo: str | None = None, commit_message: str | None = None, commit_description: str | None = None, token: str | bool | None = None, repo_type: str | None = None, revision: str | None = None, create_pr: bool | None = None, parent_commit: str | None = None, allow_patterns: List[str] | str | None = None, ignore_patterns: List[str] | str | None = None, delete_patterns: List[str] | str | None = None, run_as_future: Literal[False] = False) CommitInfo[source][source]
upload_folder(*, repo_id: str, folder_path: str | Path, path_in_repo: str | None = None, commit_message: str | None = None, commit_description: str | None = None, token: str | bool | None = None, repo_type: str | None = None, revision: str | None = None, create_pr: bool | None = None, parent_commit: str | None = None, allow_patterns: List[str] | str | None = None, ignore_patterns: List[str] | str | None = None, delete_patterns: List[str] | str | None = None, run_as_future: Literal[True] = False) Future[CommitInfo]

Upload a local folder to the given repo. The upload is done through HTTP requests, and doesn’t require git or git-lfs to be installed.

The structure of the folder will be preserved. Files with the same name already present in the repository will be overwritten. Others will be left untouched.

Use the allow_patterns and ignore_patterns arguments to specify which files to upload. These parameters accept either a single pattern or a list of patterns. Patterns are Standard Wildcards (globbing patterns) as documented [here](https://tldp.org/LDP/GNU-Linux-Tools-Summary/html/x11655.htm). If both allow_patterns and ignore_patterns are provided, both constraints apply. By default, all files from the folder are uploaded.
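How the two pattern arguments combine can be approximated with the standard library's fnmatch module. This is only an illustrative sketch: select_files is a hypothetical helper, and the library's actual matching may differ in details such as case handling or directory patterns.

```python
from fnmatch import fnmatchcase


def _as_list(patterns):
    """Accept None, a single pattern, or a list of patterns."""
    if patterns is None:
        return None
    if isinstance(patterns, str):
        return [patterns]
    return list(patterns)


def select_files(paths, allow_patterns=None, ignore_patterns=None):
    """Keep paths that match an allow pattern (if any) and no ignore pattern."""
    allow = _as_list(allow_patterns)
    ignore = _as_list(ignore_patterns)
    selected = []
    for path in paths:
        if allow is not None and not any(fnmatchcase(path, p) for p in allow):
            continue  # not explicitly allowed
        if ignore is not None and any(fnmatchcase(path, p) for p in ignore):
            continue  # explicitly ignored
        selected.append(path)
    return selected


local_files = ["model.bin", "config.json", "run1/logs/train.txt", "run1/weights.bin"]

everything = select_files(local_files)
only_bin = select_files(local_files, allow_patterns="*.bin")
no_logs = select_files(local_files, ignore_patterns="**/logs/*.txt")
top_level = select_files(local_files, allow_patterns=["*.bin", "*.json"], ignore_patterns="run1/*")
```

Note that with fnmatch-style matching, * also matches path separators, so "*.bin" selects run1/weights.bin as well as model.bin.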

Use the delete_patterns argument to specify remote files you want to delete. Input type is the same as for allow_patterns (see above). If path_in_repo is also provided, the patterns are matched against paths relative to this folder. For example, upload_folder(..., path_in_repo="experiment", delete_patterns="logs/*") will delete any remote file under ./experiment/logs/. Note that the .gitattributes file will not be deleted even if it matches the patterns.

Any .git/ folder present in any subdirectory will be ignored. However, please be aware that the .gitignore file is not taken into account.

Uses HfApi.create_commit under the hood.

Parameters:
  • repo_id (str) – The repository to which the file will be uploaded, for example: "username/custom_transformers"

  • folder_path (str or Path) – Path to the folder to upload on the local file system

  • path_in_repo (str, optional) – Relative path of the directory in the repo, for example: "checkpoints/1fec34a/results". Will default to the root folder of the repository.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

  • repo_type (str, optional) – Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.

  • revision (str, optional) – The git revision to commit from. Defaults to the head of the "main" branch.

  • commit_message (str, optional) – The summary / title / first line of the generated commit. Defaults to: f"Upload {path_in_repo} with huggingface_hub"

  • commit_description (str, optional) – The description of the generated commit

  • create_pr (boolean, optional) – Whether or not to create a Pull Request with that commit. Defaults to False. If revision is not set, PR is opened against the "main" branch. If revision is set and is a branch, PR is opened against this branch. If revision is set and is not a branch name (example: a commit oid), a RevisionNotFoundError is returned by the server.

  • parent_commit (str, optional) – The OID / SHA of the parent commit, as a hexadecimal string. Shorthands (7 first characters) are also supported. If specified and create_pr is False, the commit will fail if revision does not point to parent_commit. If specified and create_pr is True, the pull request will be created from parent_commit. Specifying parent_commit ensures the repo has not changed before committing the changes, and can be especially useful if the repo is updated / committed to concurrently.

  • allow_patterns (List[str] or str, optional) – If provided, only files matching at least one pattern are uploaded.

  • ignore_patterns (List[str] or str, optional) – If provided, files matching any of the patterns are not uploaded.

  • delete_patterns (List[str] or str, optional) – If provided, remote files matching any of the patterns will be deleted from the repo while committing new files. This is useful if you don’t know which files have already been uploaded. Note: to avoid discrepancies the .gitattributes file is not deleted even if it matches the pattern.

  • run_as_future (bool, optional) – Whether or not to run this method in the background. Background jobs are run sequentially without blocking the main thread. Passing run_as_future=True will return a [Future](https://docs.python.org/3/library/concurrent.futures.html#future-objects) object. Defaults to False.

Returns:

Instance of [CommitInfo] containing information about the newly created commit (commit hash, commit url, pr url, commit message,…). If run_as_future=True is passed, returns a Future object which will contain the result when executed.

Return type:

[CommitInfo] or Future

<Tip>

Raises the following errors:

  • if the HuggingFace API returned an error

  • [ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) if some parameter value is invalid

</Tip>

<Tip warning={true}>

upload_folder assumes that the repo already exists on the Hub. If you get a Client error 404, please make sure you are authenticated and that repo_id and repo_type are set correctly. If repo does not exist, create it first using [~hf_api.create_repo].

</Tip>

<Tip>

When dealing with a large folder (thousands of files or hundreds of GB), we recommend using [~hf_api.upload_large_folder] instead.

</Tip>

Example:

# Upload checkpoints folder except the log files
>>> upload_folder(
...     folder_path="local/checkpoints",
...     path_in_repo="remote/experiment/checkpoints",
...     repo_id="username/my-dataset",
...     repo_type="dataset",
...     token="my_token",
...     ignore_patterns="**/logs/*.txt",
... )
"https://huggingface.co/datasets/username/my-dataset/tree/main/remote/experiment/checkpoints"

# Upload checkpoints folder including logs while deleting existing logs from the repo
# Useful if you don't know exactly which log files have already been pushed
>>> upload_folder(
...     folder_path="local/checkpoints",
...     path_in_repo="remote/experiment/checkpoints",
...     repo_id="username/my-dataset",
...     repo_type="dataset",
...     token="my_token",
...     delete_patterns="**/logs/*.txt",
... )
"https://huggingface.co/datasets/username/my-dataset/tree/main/remote/experiment/checkpoints"

# Upload checkpoints folder while creating a PR
>>> upload_folder(
...     folder_path="local/checkpoints",
...     path_in_repo="remote/experiment/checkpoints",
...     repo_id="username/my-dataset",
...     repo_type="dataset",
...     token="my_token",
...     create_pr=True,
... )
"https://huggingface.co/datasets/username/my-dataset/tree/refs%2Fpr%2F1/remote/experiment/checkpoints"
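Because upload_folder assumes the repo already exists, a common pattern is to call create_repo with exist_ok=True first. The sketch below shows that ordering. publish_folder and _RecordingApi are hypothetical names, and the stub records calls rather than contacting the Hub. With a real client, an instance of huggingface_hub.HfApi would be passed as api.

```python
def publish_folder(api, repo_id, folder_path, *, repo_type="dataset",
                   private=None, ignore_patterns=None):
    """Create the repo if needed, then upload the folder in one commit."""
    # exist_ok=True makes the call a no-op when the repo is already there.
    api.create_repo(repo_id=repo_id, repo_type=repo_type, private=private, exist_ok=True)
    return api.upload_folder(
        repo_id=repo_id,
        folder_path=folder_path,
        repo_type=repo_type,
        ignore_patterns=ignore_patterns,
        commit_message=f"Upload {folder_path}",
    )


class _RecordingApi:
    """Hypothetical stand-in for HfApi that records calls instead of sending them."""

    def __init__(self):
        self.calls = []

    def create_repo(self, **kwargs):
        self.calls.append(("create_repo", kwargs))

    def upload_folder(self, **kwargs):
        self.calls.append(("upload_folder", kwargs))
        return f"commit for {kwargs['repo_id']}"


recorder = _RecordingApi()
outcome = publish_folder(
    recorder,
    "username/my-dataset",
    "local/checkpoints",
    ignore_patterns="**/logs/*.txt",
)
```

Passing repo_type to both calls keeps them aligned, which avoids the 404 described in the warning above when the types differ.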
delete_file(path_in_repo: str, repo_id: str, *, token: str | bool | None = None, repo_type: str | None = None, revision: str | None = None, commit_message: str | None = None, commit_description: str | None = None, create_pr: bool | None = None, parent_commit: str | None = None) CommitInfo[source][source]

Deletes a file in the given repo.

Parameters:
  • path_in_repo (str) – Relative filepath in the repo, for example: "checkpoints/1fec34a/weights.bin"

  • repo_id (str) – The repository from which the file will be deleted, for example: "username/custom_transformers"

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

  • repo_type (str, optional) – Set to "dataset" or "space" if the file is in a dataset or space, None or "model" if in a model. Default is None.

  • revision (str, optional) – The git revision to commit from. Defaults to the head of the "main" branch.

  • commit_message (str, optional) – The summary / title / first line of the generated commit. Defaults to f"Delete {path_in_repo} with huggingface_hub".

  • commit_description (str, optional) – The description of the generated commit

  • create_pr (boolean, optional) – Whether or not to create a Pull Request with that commit. Defaults to False. If revision is not set, PR is opened against the "main" branch. If revision is set and is a branch, PR is opened against this branch. If revision is set and is not a branch name (example: a commit oid), a RevisionNotFoundError is returned by the server.

  • parent_commit (str, optional) – The OID / SHA of the parent commit, as a hexadecimal string. Shorthands (7 first characters) are also supported. If specified and create_pr is False, the commit will fail if revision does not point to parent_commit. If specified and create_pr is True, the pull request will be created from parent_commit. Specifying parent_commit ensures the repo has not changed before committing the changes, and can be especially useful if the repo is updated / committed to concurrently.

delete_files(repo_id: str, delete_patterns: List[str], *, token: bool | str | None = None, repo_type: str | None = None, revision: str | None = None, commit_message: str | None = None, commit_description: str | None = None, create_pr: bool | None = None, parent_commit: str | None = None) CommitInfo[source][source]

Delete files from a repository on the Hub.

If a folder path is provided, the entire folder is deleted as well as all files it contained.

Parameters:
  • repo_id (str) – The repository from which the folder will be deleted, for example: "username/custom_transformers"

  • delete_patterns (List[str]) – List of files or folders to delete. Each string can either be a file path, a folder path or a Unix shell-style wildcard. E.g. ["file.txt", "folder/", "data/*.parquet"]

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

  • repo_type (str, optional) – Type of the repo to delete files from. Can be "model", "dataset" or "space". Defaults to "model".

  • revision (str, optional) – The git revision to commit from. Defaults to the head of the "main" branch.

  • commit_message (str, optional) – The summary (first line) of the generated commit. Defaults to f"Delete files using huggingface_hub".

  • commit_description (str, optional) – The description of the generated commit.

  • create_pr (boolean, optional) – Whether or not to create a Pull Request with that commit. Defaults to False. If revision is not set, PR is opened against the "main" branch. If revision is set and is a branch, PR is opened against this branch. If revision is set and is not a branch name (example: a commit oid), a RevisionNotFoundError is returned by the server.

  • parent_commit (str, optional) – The OID / SHA of the parent commit, as a hexadecimal string. Shorthands (7 first characters) are also supported. If specified and create_pr is False, the commit will fail if revision does not point to parent_commit. If specified and create_pr is True, the pull request will be created from parent_commit. Specifying parent_commit ensures the repo has not changed before committing the changes, and can be especially useful if the repo is updated / committed to concurrently.
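Since deletion cannot be undone by a later call, it can help to preview which remote paths a delete_patterns list would cover. The sketch below approximates the matching with fnmatch and treats a trailing-slash folder path as "everything under that folder". Both choices are assumptions of this example, as are the helper names, and the library's real matching may differ. The file list is made up; with a real client it could come from a repo file listing.

```python
from fnmatch import fnmatchcase


def expand_pattern(pattern):
    """Treat a trailing-slash folder path as 'everything under that folder'."""
    return pattern + "*" if pattern.endswith("/") else pattern


def preview_deletions(repo_files, delete_patterns):
    """Return the paths from `repo_files` that match any delete pattern."""
    patterns = [expand_pattern(p) for p in delete_patterns]
    return [path for path in repo_files if any(fnmatchcase(path, p) for p in patterns)]


# Hypothetical repository content for illustration.
repo_files = [
    "file.txt",
    "folder/a.bin",
    "folder/sub/b.bin",
    "data/x.parquet",
    "data/y.csv",
    "keep.md",
]

to_delete = preview_deletions(repo_files, ["file.txt", "folder/", "data/*.parquet"])
```

Here the three patterns correspond to the three documented forms: a file path, a folder path and a Unix shell-style wildcard.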

delete_folder(path_in_repo: str, repo_id: str, *, token: bool | str | None = None, repo_type: str | None = None, revision: str | None = None, commit_message: str | None = None, commit_description: str | None = None, create_pr: bool | None = None, parent_commit: str | None = None) CommitInfo[source][source]

Deletes a folder in the given repo.

Simple wrapper around the [create_commit] method.

Parameters:
  • path_in_repo (str) – Relative folder path in the repo, for example: "checkpoints/1fec34a".

  • repo_id (str) – The repository from which the folder will be deleted, for example: "username/custom_transformers"

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

  • repo_type (str, optional) – Set to "dataset" or "space" if the folder is in a dataset or space, None or "model" if in a model. Default is None.

  • revision (str, optional) – The git revision to commit from. Defaults to the head of the "main" branch.

  • commit_message (str, optional) – The summary / title / first line of the generated commit. Defaults to f"Delete folder {path_in_repo} with huggingface_hub".

  • commit_description (str, optional) – The description of the generated commit.

  • create_pr (boolean, optional) – Whether or not to create a Pull Request with that commit. Defaults to False. If revision is not set, PR is opened against the "main" branch. If revision is set and is a branch, PR is opened against this branch. If revision is set and is not a branch name (example: a commit oid), a RevisionNotFoundError is returned by the server.

  • parent_commit (str, optional) – The OID / SHA of the parent commit, as a hexadecimal string. Shorthands (7 first characters) are also supported. If specified and create_pr is False, the commit will fail if revision does not point to parent_commit. If specified and create_pr is True, the pull request will be created from parent_commit. Specifying parent_commit ensures the repo has not changed before committing the changes, and can be especially useful if the repo is updated / committed to concurrently.
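As a rough illustration of how parent_commit guards against concurrent updates, the sketch below wraps delete_folder in a hypothetical helper. The name delete_folder_if_unchanged, the commit message, and the stand-in _RecordingApi class are assumptions made for this example and are not part of the library. The helper accepts any object exposing a delete_folder method with the signature documented above, so the call shape can be checked without network access.

```python
def delete_folder_if_unchanged(api, repo_id, folder, expected_head, repo_type="dataset"):
    """Delete `folder` only if the branch head is still `expected_head`.

    `api` is assumed to expose `delete_folder` as documented above
    (for example a `huggingface_hub.HfApi` instance). The helper name and
    the commit message are illustrative only.
    """
    return api.delete_folder(
        path_in_repo=folder,
        repo_id=repo_id,
        repo_type=repo_type,
        parent_commit=expected_head,  # the commit fails if the repo has moved on
        commit_message=f"Remove stale folder {folder}",
    )


class _RecordingApi:
    """Stand-in that records keyword arguments instead of calling the Hub."""

    def __init__(self):
        self.calls = []

    def delete_folder(self, **kwargs):
        self.calls.append(kwargs)
        return "commit-info-placeholder"


fake_api = _RecordingApi()
result = delete_folder_if_unchanged(
    fake_api, "username/custom_transformers", "checkpoints/1fec34a", "abc1234"
)
```

In real use one would presumably pass an HfApi instance in place of the stand-in and take expected_head from the last commit that was inspected.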

upload_large_folder(repo_id: str, folder_path: str | Path, *, repo_type: str, revision: str | None = None, private: bool | None = None, allow_patterns: str | List[str] | None = None, ignore_patterns: str | List[str] | None = None, num_workers: int | None = None, print_report: bool = True, print_report_every: int = 60) None[source][source]

Upload a large folder to the Hub in the most resilient way possible.

Several workers are started to upload files in an optimized way. Before being committed to a repo, files must be hashed and be pre-uploaded if they are LFS files. Workers will perform these tasks for each file in the folder. At each step, some metadata information about the upload process is saved in the folder under .cache/.huggingface/ to be able to resume the process if interrupted. The whole process might result in several commits.

Parameters:
  • repo_id (str) – The repository to which the file will be uploaded. E.g. "HuggingFaceTB/smollm-corpus".

  • folder_path (str or Path) – Path to the folder to upload on the local file system.

  • repo_type (str) – Type of the repository. Must be one of "model", "dataset" or "space". Unlike in all other HfApi methods, repo_type is explicitly required here. This avoids mistakes when uploading a large folder to the Hub, and therefore avoids having to re-upload everything.

  • revision (str, optional) – The branch to commit to. If not provided, the main branch will be used.

  • private (bool, optional) – Whether the repository should be private. If None (default), the repo will be public unless the organization’s default is private.

  • allow_patterns (List[str] or str, optional) – If provided, only files matching at least one pattern are uploaded.

  • ignore_patterns (List[str] or str, optional) – If provided, files matching any of the patterns are not uploaded.

  • num_workers (int, optional) – Number of workers to start. Defaults to os.cpu_count() - 2 (minimum 2). A higher number of workers may speed up the process if your machine allows it. However, on machines with a slower connection, it is recommended to keep the number of workers low to ensure better resumability. Indeed, partially uploaded files will have to be completely re-uploaded if the process is interrupted.

  • print_report (bool, optional) – Whether to print a report of the upload progress. Defaults to True. The report is printed to sys.stdout every X seconds (60 by default) and overwrites the previous report.

  • print_report_every (int, optional) – Frequency at which the report is printed. Defaults to 60 seconds.

<Tip>

A few things to keep in mind:
  • Repository limits still apply: https://huggingface.co/docs/hub/repositories-recommendations

  • Do not start several processes in parallel.

  • You can interrupt and resume the process at any time.

  • Do not upload the same folder to several repositories. If you need to do so, you must delete the local .cache/.huggingface/ folder first.

</Tip>

<Tip warning={true}>

While much more robust for uploading large folders, upload_large_folder is more limited than [upload_folder] feature-wise. In practice:
  • you cannot set a custom path_in_repo. If you want to upload to a subfolder, you need to set the proper structure locally.

  • you cannot set a custom commit_message and commit_description since multiple commits are created.

  • you cannot delete from the repo while uploading. Please make a separate commit first.

  • you cannot create a PR directly. Please create a PR first (from the UI or using [create_pull_request]) and then commit to it by passing revision.

</Tip>
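Since path_in_repo cannot be set, one way to target a subfolder is to reproduce the desired layout locally before uploading. The helper below is a minimal sketch of that idea; stage_under_subfolder and the directory names are hypothetical, and copying is only one option (it doubles disk usage, so moving files or building the structure in place may be preferable for very large folders).

```python
import shutil
import tempfile
from pathlib import Path


def stage_under_subfolder(source_dir, staging_root, path_in_repo):
    """Copy `source_dir` to `staging_root/path_in_repo` and return `staging_root`.

    Hypothetical helper: uploading the returned folder would place the files
    under `path_in_repo` in the repo, because the local layout is mirrored.
    """
    staging_root = Path(staging_root)
    target = staging_root / path_in_repo
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copytree(source_dir, target)
    return staging_root


# Demonstration with throwaway directories.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "embeddings"
    src.mkdir()
    (src / "vectors.bin").write_bytes(b"\x00" * 8)

    staged = stage_under_subfolder(src, Path(tmp) / "staging", "databases/v1")
    staged_files = sorted(
        p.relative_to(staged).as_posix() for p in staged.rglob("*") if p.is_file()
    )
```

The staged folder could then be passed as folder_path to upload_large_folder, together with repo_id and the required repo_type.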

**Technical details:**

The upload_large_folder process is as follows:
  1. (Check parameters and setup.)

  2. Create repo if missing.

  3. List local files to upload.

  4. Run validation checks and display warnings if repository limits might be exceeded:
    • Warns if the total number of files exceeds 100k (recommended limit).

    • Warns if any folder contains more than 10k files (recommended limit).

    • Warns about files larger than 20GB (recommended) or 50GB (hard limit).

  5. Start workers. Workers can perform the following tasks:
    • Hash a file.

    • Get upload mode (regular or LFS) for a list of files.

    • Pre-upload an LFS file.

    • Commit a bunch of files.

    Once a worker finishes a task, it will move on to the next task based on the priority list (see below) until all files are uploaded and committed.

  6. While workers are up, regularly print a report to sys.stdout.

Order of priority:
  1. Commit if more than 5 minutes since last commit attempt (and at least 1 file).

  2. Commit if at least 150 files are ready to commit.

  3. Get upload mode if at least 10 files have been hashed.

  4. Pre-upload LFS file if at least 1 file and no worker is pre-uploading.

  5. Hash file if at least 1 file and no worker is hashing.

  6. Get upload mode if at least 1 file and no worker is getting upload mode.

  7. Pre-upload LFS file if at least 1 file (exception: if hf_transfer is enabled, only 1 worker can preupload LFS at a time).

  8. Hash file if at least 1 file to hash.

  9. Get upload mode if at least 1 file to get upload mode.

  10. Commit if at least 1 file to commit and at least 1 min since last commit attempt.

  11. Commit if at least 1 file to commit and all other queues are empty.

Special rules:
  • If hf_transfer is enabled, only 1 LFS uploader at a time. Otherwise the CPU would be bloated by hf_transfer.

  • Only one worker can commit at a time.

  • If no tasks are available, the worker waits for 10 seconds before checking again.
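The priority order and special rules above can be read as a single decision function. The following is a simplified sketch, not the library's implementation: QueueState, its field names, and the returned task labels are all invented for illustration, and rule 11 is approximated by checking that the other queues are empty.

```python
from dataclasses import dataclass


@dataclass
class QueueState:
    """Hypothetical snapshot of the upload queues and worker activity."""

    to_hash: int = 0               # files waiting to be hashed
    to_get_mode: int = 0           # hashed files waiting for their upload mode
    to_preupload: int = 0          # LFS files waiting to be pre-uploaded
    to_commit: int = 0             # files ready to be committed
    workers_hashing: int = 0
    workers_getting_mode: int = 0
    workers_preuploading: int = 0
    committing: bool = False       # only one worker can commit at a time
    seconds_since_commit: float = 0.0
    hf_transfer: bool = False


def next_task(s):
    """Return the next task label following the documented priority order."""
    can_commit = not s.committing and s.to_commit > 0
    if can_commit and s.seconds_since_commit > 5 * 60:        # rule 1
        return "commit"
    if not s.committing and s.to_commit >= 150:               # rule 2
        return "commit"
    if s.to_get_mode >= 10:                                   # rule 3
        return "get_upload_mode"
    if s.to_preupload > 0 and s.workers_preuploading == 0:    # rule 4
        return "preupload_lfs"
    if s.to_hash > 0 and s.workers_hashing == 0:              # rule 5
        return "hash"
    if s.to_get_mode > 0 and s.workers_getting_mode == 0:     # rule 6
        return "get_upload_mode"
    # Rule 7: with hf_transfer, only one LFS uploader at a time.
    if s.to_preupload > 0 and not (s.hf_transfer and s.workers_preuploading > 0):
        return "preupload_lfs"
    if s.to_hash > 0:                                         # rule 8
        return "hash"
    if s.to_get_mode > 0:                                     # rule 9
        return "get_upload_mode"
    if can_commit and s.seconds_since_commit >= 60:           # rule 10
        return "commit"
    others_empty = s.to_hash == 0 and s.to_get_mode == 0 and s.to_preupload == 0
    if can_commit and others_empty:                           # rule 11
        return "commit"
    return "wait"  # nothing to do: sleep 10 seconds, then check again


overdue_commit = next_task(QueueState(to_commit=1, seconds_since_commit=301))
batch_commit = next_task(QueueState(to_commit=150))
mode_first = next_task(QueueState(to_get_mode=10, to_preupload=5))
hash_when_uploader_busy = next_task(
    QueueState(to_hash=3, to_preupload=2, workers_preuploading=1)
)
blocked_by_hf_transfer = next_task(
    QueueState(to_preupload=2, workers_preuploading=1, hf_transfer=True)
)
second_uploader = next_task(QueueState(to_preupload=2, workers_preuploading=1))
final_commit = next_task(QueueState(to_commit=1, seconds_since_commit=10))
idle = next_task(QueueState())
```

Reading the rules this way shows why commits are checked both first (to flush overdue or large batches) and last (to flush leftovers once nothing else is pending).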

get_hf_file_metadata(*, url: str, token: bool | str | None = None, proxies: Dict | None = None, timeout: float | None = 10) HfFileMetadata[source][source]

Fetch metadata of a file versioned on the Hub for a given url.

Parameters:
  • url (str) – File url, for example returned by [hf_hub_url].

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

  • proxies (dict, optional) – Dictionary mapping protocol to the URL of the proxy passed to requests.request.

  • timeout (float, optional, defaults to 10) – How many seconds to wait for the server to send metadata before giving up.

Returns:

A [HfFileMetadata] object containing metadata such as location, etag, size and commit_hash.

hf_hub_download(repo_id: str, filename: str, *, subfolder: str | None = None, repo_type: str | None = None, revision: str | None = None, cache_dir: str | Path | None = None, local_dir: str | Path | None = None, force_download: bool = False, proxies: Dict | None = None, etag_timeout: float = 10, token: bool | str | None = None, local_files_only: bool = False, resume_download: bool | None = None, force_filename: str | None = None, local_dir_use_symlinks: bool | Literal['auto'] = 'auto') str[source][source]

Download a given file if it’s not already present in the local cache.

The new cache file layout looks like this:
  • The cache directory contains one subfolder per repo_id (namespaced by repo type).

  • Inside each repo folder:
    • refs is a list of the latest known revision => commit_hash pairs.

    • blobs contains the actual file blobs (identified by their git-sha or sha256, depending on whether they’re LFS files or not).

    • snapshots contains one subfolder per commit; each “commit” contains the subset of the files that have been resolved at that particular commit. Each filename is a symlink to the blob at that particular commit.

[  96]  .
└── [ 160]  models--julien-c--EsperBERTo-small
    ├── [ 160]  blobs
    │   ├── [321M]  403450e234d65943a7dcf7e05a771ce3c92faa84dd07db4ac20f592037a1e4bd
    │   ├── [ 398]  7cb18dc9bafbfcf74629a4b760af1b160957a83e
    │   └── [1.4K]  d7edf6bd2a681fb0175f7735299831ee1b22b812
    ├── [  96]  refs
    │   └── [  40]  main
    └── [ 128]  snapshots
        ├── [ 128]  2439f60ef33a0d46d85da5001d52aeda5b00ce9f
        │   ├── [  52]  README.md -> ../../blobs/d7edf6bd2a681fb0175f7735299831ee1b22b812
        │   └── [  76]  pytorch_model.bin -> ../../blobs/403450e234d65943a7dcf7e05a771ce3c92faa84dd07db4ac20f592037a1e4bd
        └── [ 128]  bbc77c8132af1cc5cf678da3f1ddf2de43606d48
            ├── [  52]  README.md -> ../../blobs/7cb18dc9bafbfcf74629a4b760af1b160957a83e
            └── [  76]  pytorch_model.bin -> ../../blobs/403450e234d65943a7dcf7e05a771ce3c92faa84dd07db4ac20f592037a1e4bd
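To make the layout concrete, the sketch below walks it by hand: it derives the repo folder name, reads the commit hash stored under refs, and looks for the file under snapshots. The helper names are hypothetical and the demonstration writes a regular file where the real cache would hold a symlink; for actual lookups the library's own download functions should be preferred.

```python
import tempfile
from pathlib import Path


def repo_cache_folder(repo_id, repo_type="model"):
    """Folder name for a repo, e.g. 'models--julien-c--EsperBERTo-small'."""
    return "--".join([f"{repo_type}s", *repo_id.split("/")])


def resolve_cached_file(cache_dir, repo_id, filename, revision="main", repo_type="model"):
    """Return the snapshot path for `filename`, or None if it is not cached."""
    repo_dir = Path(cache_dir) / repo_cache_folder(repo_id, repo_type)
    ref_file = repo_dir / "refs" / revision
    # A ref file stores the commit hash the revision pointed to;
    # without one, `revision` is treated as a commit hash.
    commit = ref_file.read_text().strip() if ref_file.is_file() else revision
    candidate = repo_dir / "snapshots" / commit / filename
    return candidate if candidate.exists() else None


# Build a tiny fake cache that mirrors the tree shown above.
with tempfile.TemporaryDirectory() as cache:
    repo_dir = Path(cache) / "models--julien-c--EsperBERTo-small"
    commit = "bbc77c8132af1cc5cf678da3f1ddf2de43606d48"
    (repo_dir / "refs").mkdir(parents=True)
    (repo_dir / "refs" / "main").write_text(commit)
    (repo_dir / "snapshots" / commit).mkdir(parents=True)
    (repo_dir / "snapshots" / commit / "README.md").write_text("# demo")

    found = resolve_cached_file(cache, "julien-c/EsperBERTo-small", "README.md")
    found_rel = found.relative_to(repo_dir).as_posix()
    missing = resolve_cached_file(cache, "julien-c/EsperBERTo-small", "config.json")
```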

If local_dir is provided, the file structure from the repo will be replicated in this location. When using this option, the cache_dir will not be used and a .cache/huggingface/ folder will be created at the root of local_dir to store some metadata related to the downloaded files. While this mechanism is not as robust as the main cache-system, it’s optimized for regularly pulling the latest version of a repository.

Parameters:
  • repo_id (str) – A user or an organization name and a repo name separated by a /.

  • filename (str) – The name of the file in the repo.

  • subfolder (str, optional) – An optional value corresponding to a folder inside the repository.

  • repo_type (str, optional) – Set to "dataset" or "space" if downloading from a dataset or space, None or "model" if downloading from a model. Default is None.

  • revision (str, optional) – An optional Git revision id which can be a branch name, a tag, or a commit hash.

  • cache_dir (str, Path, optional) – Path to the folder where cached files are stored.

  • local_dir (str or Path, optional) – If provided, the downloaded file will be placed under this directory.

  • force_download (bool, optional, defaults to False) – Whether the file should be downloaded even if it already exists in the local cache.

  • proxies (dict, optional) – Dictionary mapping protocol to the URL of the proxy passed to requests.request.

  • etag_timeout (float, optional, defaults to 10) – When fetching the ETag, how many seconds to wait for the server to send data before giving up. The value is passed to requests.request.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

  • local_files_only (bool, optional, defaults to False) – If True, avoid downloading the file and return the path to the local cached file if it exists.

Returns:

Local path of file or if networking is off, last version of file cached on disk.

Return type:

str

Raises:
  • [RepositoryNotFoundError] – If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.

  • [RevisionNotFoundError] – If the revision to download from cannot be found.

  • [EntryNotFoundError] – If the file to download cannot be found.

  • [LocalEntryNotFoundError] – If network is disabled or unavailable and file is not found in cache.

  • [EnvironmentError] (https://docs.python.org/3/library/exceptions.html#EnvironmentError) – If token=True but the token cannot be found.

  • [OSError] (https://docs.python.org/3/library/exceptions.html#OSError) – If ETag cannot be determined.

  • [ValueError] (https://docs.python.org/3/library/exceptions.html#ValueError) – If some parameter value is invalid.
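One pattern that follows from local_files_only and the errors listed above is to try the cache first and only then go to the network. The sketch below is an assumption-laden illustration: download_offline_first is a hypothetical helper, the download callable is injected (in real use it would presumably be hf_hub_download), and the default miss_errors assumes that the cache-miss error derives from FileNotFoundError. If that assumption does not hold in your version, pass LocalEntryNotFoundError explicitly.

```python
def download_offline_first(download, repo_id, filename,
                           miss_errors=(FileNotFoundError,), **kwargs):
    """Return a local path, preferring the cache over the network.

    `download` is any callable with the calling convention of the function
    documented above. `miss_errors` lists the exception types that signal
    "not in the local cache" (an assumption; adjust as needed).
    """
    try:
        return download(repo_id, filename, local_files_only=True, **kwargs)
    except miss_errors:
        return download(repo_id, filename, local_files_only=False, **kwargs)


# Stand-in download function so the control flow can be checked offline.
calls = []


def fake_download(repo_id, filename, *, local_files_only=False, **kwargs):
    calls.append(local_files_only)
    if local_files_only:
        raise FileNotFoundError(f"{filename} is not cached")
    return f"/cache/{repo_id}/{filename}"


path = download_offline_first(
    fake_download, "user/embeddings", "index.faiss", repo_type="dataset"
)
```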

snapshot_download(repo_id: str, *, repo_type: str | None = None, revision: str | None = None, cache_dir: str | Path | None = None, local_dir: str | Path | None = None, proxies: Dict | None = None, etag_timeout: float = 10, force_download: bool = False, token: bool | str | None = None, local_files_only: bool = False, allow_patterns: str | List[str] | None = None, ignore_patterns: str | List[str] | None = None, max_workers: int = 8, tqdm_class: Type[tqdm_asyncio] | None = None, local_dir_use_symlinks: bool | Literal['auto'] = 'auto', resume_download: bool | None = None) str[source][source]

Download repo files.

Download a whole snapshot of a repo’s files at the specified revision. This is useful when you want all files from a repo, because you don’t know which ones you will need a priori. All files are nested inside a folder in order to keep their actual filename relative to that folder. You can also filter which files to download using allow_patterns and ignore_patterns.

If local_dir is provided, the file structure from the repo will be replicated in this location. When using this option, the cache_dir will not be used and a .cache/huggingface/ folder will be created at the root of local_dir to store some metadata related to the downloaded files. While this mechanism is not as robust as the main cache-system, it’s optimized for regularly pulling the latest version of a repository.

An alternative would be to clone the repo but this requires git and git-lfs to be installed and properly configured. It is also not possible to filter which files to download when cloning a repository using git.

Parameters:
  • repo_id (str) – A user or an organization name and a repo name separated by a /.

  • repo_type (str, optional) – Set to "dataset" or "space" if downloading from a dataset or space, None or "model" if downloading from a model. Default is None.

  • revision (str, optional) – An optional Git revision id which can be a branch name, a tag, or a commit hash.

  • cache_dir (str, Path, optional) – Path to the folder where cached files are stored.

  • local_dir (str or Path, optional) – If provided, the downloaded files will be placed under this directory.

  • proxies (dict, optional) – Dictionary mapping protocol to the URL of the proxy passed to requests.request.

  • etag_timeout (float, optional, defaults to 10) – When fetching the ETag, how many seconds to wait for the server to send data before giving up. The value is passed to requests.request.

  • force_download (bool, optional, defaults to False) – Whether the file should be downloaded even if it already exists in the local cache.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

  • local_files_only (bool, optional, defaults to False) – If True, avoid downloading the file and return the path to the local cached file if it exists.

  • allow_patterns (List[str] or str, optional) – If provided, only files matching at least one pattern are downloaded.

  • ignore_patterns (List[str] or str, optional) – If provided, files matching any of the patterns are not downloaded.

  • max_workers (int, optional) – Number of concurrent threads to download files (1 thread = 1 file download). Defaults to 8.

  • tqdm_class (tqdm, optional) – If provided, overwrites the default behavior for the progress bar. Passed argument must inherit from tqdm.auto.tqdm or at least mimic its behavior. Note that the tqdm_class is not passed to each individual download. Defaults to the custom HF progress bar that can be disabled by setting HF_HUB_DISABLE_PROGRESS_BARS environment variable.

Returns:

folder path of the repo snapshot.

Return type:

str

Raises:
  • [RepositoryNotFoundError] – If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.

  • [RevisionNotFoundError] – If the revision to download from cannot be found.

  • [EnvironmentError] (https://docs.python.org/3/library/exceptions.html#EnvironmentError) – If token=True and the token cannot be found.

  • [OSError] (https://docs.python.org/3/library/exceptions.html#OSError) – If ETag cannot be determined.

  • [ValueError] (https://docs.python.org/3/library/exceptions.html#ValueError) – If some parameter value is invalid.
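The interaction between allow_patterns and ignore_patterns can be previewed locally before downloading anything. The sketch below uses the standard library's fnmatchcase to approximate it; select_files is a hypothetical helper and the file names are made up, so the library's own matching may differ in details.

```python
from fnmatch import fnmatchcase


def select_files(paths, allow_patterns=None, ignore_patterns=None):
    """Keep paths matching any allow pattern and no ignore pattern."""
    if isinstance(allow_patterns, str):
        allow_patterns = [allow_patterns]
    if isinstance(ignore_patterns, str):
        ignore_patterns = [ignore_patterns]

    selected = []
    for path in paths:
        # With allow_patterns set, a file must match at least one of them.
        if allow_patterns is not None and not any(
            fnmatchcase(path, pattern) for pattern in allow_patterns
        ):
            continue
        # A file matching any ignore pattern is skipped.
        if ignore_patterns is not None and any(
            fnmatchcase(path, pattern) for pattern in ignore_patterns
        ):
            continue
        selected.append(path)
    return selected


repo_files = [
    "README.md",
    "embeddings/part-0.parquet",
    "embeddings/part-1.parquet",
    "index/faiss.index",
    "logs/run.txt",
]
kept = select_files(
    repo_files,
    allow_patterns=["*.parquet", "index/*"],
    ignore_patterns="*part-1*",
)
```

Note that in fnmatch-style matching `*` also matches path separators, which is why `*.parquet` selects files inside subfolders here.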

get_safetensors_metadata(repo_id: str, *, repo_type: str | None = None, revision: str | None = None, token: str | bool | None = None) SafetensorsRepoMetadata[source][source]

Parse metadata for a safetensors repo on the Hub.

We first check if the repo has a single safetensors file or a sharded safetensors repo. If it’s a single safetensors file, we parse the metadata from this file. If it’s a sharded safetensors repo, we parse the metadata from the index file and then parse the metadata from each shard.

To parse metadata from a single safetensors file, use [parse_safetensors_file_metadata].

For more details regarding the safetensors format, check out https://huggingface.co/docs/safetensors/index#format.

Parameters:
  • repo_id (str) – A user or an organization name and a repo name separated by a /.

  • repo_type (str, optional) – Set to "dataset" or "space" if the file is in a dataset or space, None or "model" if in a model. Default is None.

  • revision (str, optional) – The git revision to fetch the file from. Can be a branch name, a tag, or a commit hash. Defaults to the head of the "main" branch.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

information related to safetensors repo.

Return type:

[SafetensorsRepoMetadata]

Raises:
  • [NotASafetensorsRepoError] – If the repo is not a safetensors repo i.e. doesn’t have either a model.safetensors or a model.safetensors.index.json file.

  • [SafetensorsParsingError] – If a safetensors file header couldn’t be parsed correctly.

Example

# Parse repo with single weights file
>>> metadata = get_safetensors_metadata("bigscience/bloomz-560m")
>>> metadata
SafetensorsRepoMetadata(
    metadata=None,
    sharded=False,
    weight_map={'h.0.input_layernorm.bias': 'model.safetensors', ...},
    files_metadata={'model.safetensors': SafetensorsFileMetadata(...)}
)
>>> metadata.files_metadata["model.safetensors"].metadata
{'format': 'pt'}

# Parse repo with sharded model
>>> metadata = get_safetensors_metadata("bigscience/bloom")
Parse safetensors files: 100%|██████████████████████████████████████████| 72/72 [00:12<00:00,  5.78it/s]
>>> metadata
SafetensorsRepoMetadata(metadata={'total_size': 352494542848}, sharded=True, weight_map={...}, files_metadata={...})
>>> len(metadata.files_metadata)
72  # All safetensors files have been fetched

# Parse repo that is not a safetensors repo
>>> get_safetensors_metadata("runwayml/stable-diffusion-v1-5")
NotASafetensorsRepoError: 'runwayml/stable-diffusion-v1-5' is not a safetensors repo. Couldn't find 'model.safetensors.index.json' or 'model.safetensors' files.

parse_safetensors_file_metadata(repo_id: str, filename: str, *, repo_type: str | None = None, revision: str | None = None, token: str | bool | None = None) SafetensorsFileMetadata[source][source]

Parse metadata from a safetensors file on the Hub.

To parse metadata from all safetensors files in a repo at once, use [get_safetensors_metadata].

For more details regarding the safetensors format, check out https://huggingface.co/docs/safetensors/index#format.

Parameters:
  • repo_id (str) – A user or an organization name and a repo name separated by a /.

  • filename (str) – The name of the file in the repo.

  • repo_type (str, optional) – Set to "dataset" or "space" if the file is in a dataset or space, None or "model" if in a model. Default is None.

  • revision (str, optional) – The git revision to fetch the file from. Can be a branch name, a tag, or a commit hash. Defaults to the head of the "main" branch.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

information related to a safetensors file.

Return type:

[SafetensorsFileMetadata]

Raises:
  • [NotASafetensorsRepoError] – If the repo is not a safetensors repo i.e. doesn’t have either a model.safetensors or a model.safetensors.index.json file.

  • [SafetensorsParsingError] – If a safetensors file header couldn’t be parsed correctly.
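For readers unfamiliar with the format linked above, a safetensors file starts with an 8-byte little-endian integer giving the header length, followed by a JSON header that describes each tensor and may carry a `__metadata__` entry. The sketch below parses such a header from an in-memory blob built for the occasion; parse_safetensors_header is a hypothetical name, and this says nothing about how the library itself fetches or validates headers.

```python
import json
import struct


def parse_safetensors_header(blob):
    """Split a safetensors header into (metadata, tensors)."""
    # First 8 bytes: unsigned 64-bit little-endian header length.
    (header_len,) = struct.unpack("<Q", blob[:8])
    header = json.loads(blob[8 : 8 + header_len].decode("utf-8"))
    # The optional "__metadata__" entry holds free-form string metadata;
    # every other key describes one tensor.
    metadata = header.pop("__metadata__", None)
    return metadata, header


# Build a minimal file image: header length, JSON header, then raw tensor bytes.
header = {
    "__metadata__": {"format": "pt"},
    "weight": {"dtype": "F32", "shape": [2, 2], "data_offsets": [0, 16]},
}
header_bytes = json.dumps(header).encode("utf-8")
blob = struct.pack("<Q", len(header_bytes)) + header_bytes + bytes(16)

metadata, tensors = parse_safetensors_header(blob)
```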

create_branch(repo_id: str, *, branch: str, revision: str | None = None, token: bool | str | None = None, repo_type: str | None = None, exist_ok: bool = False) None[source][source]

Create a new branch for a repo on the Hub, starting from the specified revision (defaults to main). To find a revision suiting your needs, you can use [list_repo_refs] or [list_repo_commits].

Parameters:
  • repo_id (str) – The repository in which the branch will be created. Example: "user/my-cool-model".

  • branch (str) – The name of the branch to create.

  • revision (str, optional) – The git revision to create the branch from. It can be a branch name or the OID/SHA of a commit, as a hexadecimal string. Defaults to the head of the "main" branch.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

  • repo_type (str, optional) – Set to "dataset" or "space" if creating a branch on a dataset or space, None or "model" if creating a branch on a model. Default is None.

  • exist_ok (bool, optional, defaults to False) – If True, do not raise an error if branch already exists.

Raises:
  • [RepositoryNotFoundError] – If repository is not found (error 404): wrong repo_id/repo_type, private but not authenticated or repo does not exist.

  • [BadRequestError] – If invalid reference for a branch. Ex: refs/pr/5 or refs/foo/bar.

  • [HfHubHTTPError] – If the branch already exists on the repo (error 409) and exist_ok is set to False.

delete_branch(repo_id: str, *, branch: str, token: bool | str | None = None, repo_type: str | None = None) None[source][source]

Delete a branch from a repo on the Hub.

Parameters:
  • repo_id (str) – The repository in which a branch will be deleted. Example: "user/my-cool-model".

  • branch (str) – The name of the branch to delete.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

  • repo_type (str, optional) – Set to "dataset" or "space" if deleting a branch from a dataset or space, None or "model" if deleting a branch from a model. Default is None.

Raises:
  • [RepositoryNotFoundError] – If repository is not found (error 404): wrong repo_id/repo_type, private but not authenticated or repo does not exist.

  • [HfHubHTTPError] – If trying to delete a protected branch. Ex: main cannot be deleted.

  • [HfHubHTTPError] – If trying to delete a branch that does not exist.

create_tag(repo_id: str, *, tag: str, tag_message: str | None = None, revision: str | None = None, token: bool | str | None = None, repo_type: str | None = None, exist_ok: bool = False) None[source][source]

Tag a given commit of a repo on the Hub.

Parameters:
  • repo_id (str) – The repository in which a commit will be tagged. Example: "user/my-cool-model".

  • tag (str) – The name of the tag to create.

  • tag_message (str, optional) – The description of the tag to create.

  • revision (str, optional) – The git revision to tag. It can be a branch name or the OID/SHA of a commit, as a hexadecimal string. Shorthands (7 first characters) are also supported. Defaults to the head of the "main" branch.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

  • repo_type (str, optional) – Set to "dataset" or "space" if tagging a dataset or space, None or "model" if tagging a model. Default is None.

  • exist_ok (bool, optional, defaults to False) – If True, do not raise an error if tag already exists.

Raises:
  • [RepositoryNotFoundError] – If repository is not found (error 404): wrong repo_id/repo_type, private but not authenticated or repo does not exist.

  • [RevisionNotFoundError] – If revision is not found (error 404) on the repo.

  • [HfHubHTTPError] – If the tag already exists on the repo (error 409) and exist_ok is set to False.
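create_branch and create_tag can be combined to mark a release of a shared database. The following is only a sketch under stated assumptions: cut_release, the branch and tag naming scheme, and the _RecordingApi stand-in are invented for illustration. The helper accepts any object exposing the two methods with the signatures documented above.

```python
def cut_release(api, repo_id, version, base_revision=None, repo_type="dataset"):
    """Create a release branch and tag its head; return (branch, tag).

    The 'release-<version>' and 'v<version>' names are illustrative choices.
    exist_ok=True makes the helper safe to re-run.
    """
    branch = f"release-{version}"
    tag = f"v{version}"
    api.create_branch(
        repo_id, branch=branch, revision=base_revision,
        repo_type=repo_type, exist_ok=True,
    )
    api.create_tag(
        repo_id, tag=tag, tag_message=f"Release {version}", revision=branch,
        repo_type=repo_type, exist_ok=True,
    )
    return branch, tag


class _RecordingApi:
    """Stand-in that records calls instead of contacting the Hub."""

    def __init__(self):
        self.calls = []

    def create_branch(self, repo_id, **kwargs):
        self.calls.append(("create_branch", repo_id, kwargs))

    def create_tag(self, repo_id, **kwargs):
        self.calls.append(("create_tag", repo_id, kwargs))


fake_api = _RecordingApi()
branch, tag = cut_release(fake_api, "user/my-cool-model", "1.2.0")
```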

delete_tag(repo_id: str, *, tag: str, token: bool | str | None = None, repo_type: str | None = None) None[source][source]

Delete a tag from a repo on the Hub.

Parameters:
  • repo_id (str) – The repository in which a tag will be deleted. Example: "user/my-cool-model".

  • tag (str) – The name of the tag to delete.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

  • repo_type (str, optional) – Set to "dataset" or "space" if tagging a dataset or space, None or "model" if tagging a model. Default is None.

Raises:
  • [RepositoryNotFoundError] – If repository is not found (error 404): wrong repo_id/repo_type, private but not authenticated or repo does not exist.

  • [RevisionNotFoundError] – If tag is not found.

get_full_repo_name(model_id: str, *, organization: str | None = None, token: bool | str | None = None)[source][source]

Returns the repository name for a given model ID and optional organization.

Parameters:
  • model_id (str) – The name of the model.

  • organization (str, optional) – If passed, the repository name will be in the organization namespace instead of the user namespace.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

The repository name in the user’s namespace ({username}/{model_id}) if no organization is passed, and under the organization namespace ({organization}/{model_id}) otherwise.

Return type:

str

get_repo_discussions(repo_id: str, *, author: str | None = None, discussion_type: Literal['all', 'discussion', 'pull_request'] | None = None, discussion_status: Literal['all', 'open', 'closed'] | None = None, repo_type: str | None = None, token: bool | str | None = None) Iterator[Discussion][source][source]

Fetches Discussions and Pull Requests for the given repo.

Parameters:
  • repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.

  • author (str, optional) – Pass a value to filter by discussion author. None means no filter. Default is None.

  • discussion_type (str, optional) – Set to "pull_request" to fetch only pull requests, "discussion" to fetch only discussions. Set to "all" or None to fetch both. Default is None.

  • discussion_status (str, optional) – Set to "open" (respectively "closed") to fetch only open (respectively closed) discussions. Set to "all" or None to fetch both. Default is None.

  • repo_type (str, optional) – Set to "dataset" or "space" if fetching from a dataset or space, None or "model" if fetching from a model. Default is None.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

An iterator of [Discussion] objects.

Return type:

Iterator[Discussion]

Example

Collecting all discussions of a repo in a list:

>>> from huggingface_hub import get_repo_discussions
>>> discussions_list = list(get_repo_discussions(repo_id="bert-base-uncased"))

Iterating over discussions of a repo:

>>> from huggingface_hub import get_repo_discussions
>>> for discussion in get_repo_discussions(repo_id="bert-base-uncased"):
...     print(discussion.num, discussion.title)

get_discussion_details(repo_id: str, discussion_num: int, *, repo_type: str | None = None, token: bool | str | None = None) DiscussionWithDetails[source][source]

Fetches the details of a Discussion or Pull Request from the Hub.

Parameters:
  • repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.

  • discussion_num (int) – The number of the Discussion or Pull Request. Must be a strictly positive integer.

  • repo_type (str, optional) – Set to "dataset" or "space" if fetching from a dataset or space, None or "model" if fetching from a model. Default is None.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns: [DiscussionWithDetails]

<Tip>

Raises the following errors:

</Tip>

create_discussion(repo_id: str, title: str, *, token: bool | str | None = None, description: str | None = None, repo_type: str | None = None, pull_request: bool = False) DiscussionWithDetails[source][source]

Creates a Discussion or Pull Request.

Pull Requests created programmatically will be in "draft" status.

Creating a Pull Request with changes can also be done at once with [HfApi.create_commit].

Parameters:
  • repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.

  • title (str) – The title of the discussion. It can be up to 200 characters long, and must be at least 3 characters long. Leading and trailing whitespaces will be stripped.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

  • description (str, optional) – An optional description for the Pull Request. Defaults to "Discussion opened with the huggingface_hub Python library"

  • pull_request (bool, optional) – Whether to create a Pull Request or discussion. If True, creates a Pull Request. If False, creates a discussion. Defaults to False.

  • repo_type (str, optional) – Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.

Returns: [DiscussionWithDetails]
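
The title constraints listed above (3 to 200 characters, surrounding whitespace stripped) can be checked locally before any request is sent. The following is a minimal sketch, not part of the library: `normalize_discussion_title` and `open_discussion` are hypothetical helper names, and it assumes the length limit applies to the title after stripping.

```python
def normalize_discussion_title(title: str) -> str:
    """Strip surrounding whitespace and enforce the documented 3-200 character limit."""
    cleaned = title.strip()
    # Assumption: the limit is measured on the stripped title.
    if not 3 <= len(cleaned) <= 200:
        raise ValueError(
            f"title must be 3 to 200 characters after stripping, got {len(cleaned)}"
        )
    return cleaned


def open_discussion(repo_id: str, title: str, pull_request: bool = False):
    """Hypothetical wrapper: validate the title locally, then call the Hub."""
    # Imported lazily so the validation helper works without huggingface_hub installed.
    from huggingface_hub import HfApi

    return HfApi().create_discussion(
        repo_id=repo_id,
        title=normalize_discussion_title(title),
        pull_request=pull_request,
    )
```

Calling `open_discussion("username/repo_name", "  Fix typo in README  ")` would submit the title `"Fix typo in README"`, while a title such as `"ab"` is rejected before any network call is made.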


create_pull_request(repo_id: str, title: str, *, token: bool | str | None = None, description: str | None = None, repo_type: str | None = None) DiscussionWithDetails[source][source]

Creates a Pull Request. Pull Requests created programmatically will be in "draft" status.

Creating a Pull Request with changes can also be done at once with [HfApi.create_commit].

This is a wrapper around [HfApi.create_discussion].

Parameters:
  • repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.

  • title (str) – The title of the discussion. It can be up to 200 characters long, and must be at least 3 characters long. Leading and trailing whitespaces will be stripped.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

  • description (str, optional) – An optional description for the Pull Request. Defaults to "Discussion opened with the huggingface_hub Python library"

  • repo_type (str, optional) – Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.

Returns: [DiscussionWithDetails]


comment_discussion(repo_id: str, discussion_num: int, comment: str, *, token: bool | str | None = None, repo_type: str | None = None) DiscussionComment[source][source]

Creates a new comment on the given Discussion.

Parameters:
  • repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.

  • discussion_num (int) – The number of the Discussion or Pull Request. Must be a strictly positive integer.

  • comment (str) – The content of the comment to create. Comments support markdown formatting.

  • repo_type (str, optional) – Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

the newly created comment

Return type:

[DiscussionComment]

Examples

>>> comment = """
... Hello @otheruser!
...
... # This is a title
...
... **This is bold**, *this is italic* and ~this is strikethrough~
... And [this](http://url) is a link
... """

>>> from huggingface_hub import HfApi
>>> HfApi().comment_discussion(
...     repo_id="username/repo_name",
...     discussion_num=34,
...     comment=comment,
... )
# DiscussionComment(id='deadbeef0000000', type='comment', ...)


rename_discussion(repo_id: str, discussion_num: int, new_title: str, *, token: bool | str | None = None, repo_type: str | None = None) DiscussionTitleChange[source][source]

Renames a Discussion.

Parameters:
  • repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.

  • discussion_num (int) – The number of the Discussion or Pull Request. Must be a strictly positive integer.

  • new_title (str) – The new title for the discussion

  • repo_type (str, optional) – Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

the title change event

Return type:

[DiscussionTitleChange]

Examples

>>> from huggingface_hub import HfApi
>>> new_title = "New title, fixing a typo"
>>> HfApi().rename_discussion(
...     repo_id="username/repo_name",
...     discussion_num=34,
...     new_title=new_title,
... )
# DiscussionTitleChange(id='deadbeef0000000', type='title-change', ...)


change_discussion_status(repo_id: str, discussion_num: int, new_status: Literal['open', 'closed'], *, token: bool | str | None = None, comment: str | None = None, repo_type: str | None = None) DiscussionStatusChange[source][source]

Closes or re-opens a Discussion or Pull Request.

Parameters:
  • repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.

  • discussion_num (int) – The number of the Discussion or Pull Request. Must be a strictly positive integer.

  • new_status (str) – The new status for the discussion, either "open" or "closed".

  • comment (str, optional) – An optional comment to post with the status change.

  • repo_type (str, optional) – Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

the status change event

Return type:

[DiscussionStatusChange]

Examples

>>> from huggingface_hub import HfApi
>>> HfApi().change_discussion_status(
...     repo_id="username/repo_name",
...     discussion_num=34,
...     new_status="closed",
... )
# DiscussionStatusChange(id='deadbeef0000000', type='status-change', ...)


merge_pull_request(repo_id: str, discussion_num: int, *, token: bool | str | None = None, comment: str | None = None, repo_type: str | None = None)[source][source]

Merges a Pull Request.

Parameters:
  • repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.

  • discussion_num (int) – The number of the Discussion or Pull Request. Must be a strictly positive integer.

  • comment (str, optional) – An optional comment to post with the status change.

  • repo_type (str, optional) – Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

the status change event

Return type:

[DiscussionStatusChange]


edit_discussion_comment(repo_id: str, discussion_num: int, comment_id: str, new_content: str, *, token: bool | str | None = None, repo_type: str | None = None) DiscussionComment[source][source]

Edits a comment on a Discussion / Pull Request.

Parameters:
  • repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.

  • discussion_num (int) – The number of the Discussion or Pull Request. Must be a strictly positive integer.

  • comment_id (str) – The ID of the comment to edit.

  • new_content (str) – The new content of the comment. Comments support markdown formatting.

  • repo_type (str, optional) – Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

the edited comment

Return type:

[DiscussionComment]
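
Both `edit_discussion_comment` and `hide_discussion_comment` need a `comment_id`. One way to obtain it is to read the events returned by `get_discussion_details`. The sketch below is illustrative only: the helper names are hypothetical, and it assumes the returned details object exposes an `events` list whose comment entries carry `id`, `type` and `hidden` attributes.

```python
from typing import Iterable, List


def visible_comment_ids(events: Iterable[object]) -> List[str]:
    """Return the IDs of events that are comments and have not been hidden."""
    ids = []
    for event in events:
        # Skip status changes, title changes and other non-comment events.
        if getattr(event, "type", None) != "comment":
            continue
        # Hidden comments can no longer be retrieved, so editing them is pointless.
        if getattr(event, "hidden", False):
            continue
        ids.append(event.id)
    return ids


def edit_last_comment(repo_id: str, discussion_num: int, new_content: str):
    """Hypothetical wrapper: rewrite the most recent visible comment."""
    # Imported lazily so the filtering helper works without huggingface_hub installed.
    from huggingface_hub import HfApi

    api = HfApi()
    details = api.get_discussion_details(repo_id=repo_id, discussion_num=discussion_num)
    ids = visible_comment_ids(details.events)  # assumption: `events` attribute
    if not ids:
        raise LookupError("no visible comment to edit")
    return api.edit_discussion_comment(
        repo_id=repo_id,
        discussion_num=discussion_num,
        comment_id=ids[-1],
        new_content=new_content,
    )
```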


hide_discussion_comment(repo_id: str, discussion_num: int, comment_id: str, *, token: bool | str | None = None, repo_type: str | None = None) DiscussionComment[source][source]

Hides a comment on a Discussion / Pull Request.

<Tip warning={true}> Hidden comments’ content cannot be retrieved anymore. Hiding a comment is irreversible. </Tip>

Parameters:
  • repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.

  • discussion_num (int) – The number of the Discussion or Pull Request. Must be a strictly positive integer.

  • comment_id (str) – The ID of the comment to hide.

  • repo_type (str, optional) – Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

the hidden comment

Return type:

[DiscussionComment]


add_space_secret(repo_id: str, key: str, value: str, *, description: str | None = None, token: bool | str | None = None) None[source][source]

Adds or updates a secret in a Space.

Secrets let you provide secret keys or tokens to a Space without hardcoding them. For more details, see https://huggingface.co/docs/hub/spaces-overview#managing-secrets.

Parameters:
  • repo_id (str) – ID of the repo to update. Example: "bigcode/in-the-stack".

  • key (str) – Secret key. Example: "GITHUB_API_KEY"

  • value (str) – Secret value. Example: "your_github_api_key".

  • description (str, optional) – Secret description. Example: "Github API key to access the Github API".

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

delete_space_secret(repo_id: str, key: str, *, token: bool | str | None = None) None[source][source]

Deletes a secret from a Space.

Secrets let you provide secret keys or tokens to a Space without hardcoding them. For more details, see https://huggingface.co/docs/hub/spaces-overview#managing-secrets.

Parameters:
  • repo_id (str) – ID of the repo to update. Example: "bigcode/in-the-stack".

  • key (str) – Secret key. Example: "GITHUB_API_KEY".

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

get_space_variables(repo_id: str, *, token: bool | str | None = None) Dict[str, SpaceVariable][source][source]

Gets all variables from a Space.

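Variables let you set environment variables for a Space without hardcoding them. For more details, see https://huggingface.co/docs/hub/spaces-overview#managing-secrets-and-environment-variables

Parameters:
  • repo_id (str) – ID of the repo to query. Example: "bigcode/in-the-stack".

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

add_space_variable(repo_id: str, key: str, value: str, *, description: str | None = None, token: bool | str | None = None) Dict[str, SpaceVariable][source][source]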

Adds or updates a variable in a Space.

Variables let you set environment variables for a Space without hardcoding them. For more details, see https://huggingface.co/docs/hub/spaces-overview#managing-secrets-and-environment-variables

Parameters:
  • repo_id (str) – ID of the repo to update. Example: "bigcode/in-the-stack".

  • key (str) – Variable key. Example: "MODEL_REPO_ID"

  • value (str) – Variable value. Example: "the_model_repo_id".

  • description (str, optional) – Description of the variable. Example: "Model Repo ID of the implemented model".

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

delete_space_variable(repo_id: str, key: str, *, token: bool | str | None = None) Dict[str, SpaceVariable][source][source]

Deletes a variable from a Space.

Variables let you set environment variables for a Space without hardcoding them. For more details, see https://huggingface.co/docs/hub/spaces-overview#managing-secrets-and-environment-variables

Parameters:
  • repo_id (str) – ID of the repo to update. Example: "bigcode/in-the-stack".

  • key (str) – Variable key. Example: "MODEL_REPO_ID"

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

get_space_runtime(repo_id: str, *, token: bool | str | None = None) SpaceRuntime[source][source]

Gets runtime information about a Space.

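Parameters:
  • repo_id (str) – ID of the Space to query. Example: "bigcode/in-the-stack".

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns: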

Runtime information about a Space including Space stage and hardware.

Return type:

[SpaceRuntime]

request_space_hardware(repo_id: str, hardware: SpaceHardware, *, token: bool | str | None = None, sleep_time: int | None = None) SpaceRuntime[source][source]

Request new hardware for a Space.

Parameters:
  • repo_id (str) – ID of the repo to update. Example: "bigcode/in-the-stack".

  • hardware (str or [SpaceHardware]) – Hardware on which to run the Space. Example: "t4-medium".

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

  • sleep_time (int, optional) – Number of seconds of inactivity to wait before a Space is put to sleep. Set to -1 if you don’t want your Space to sleep (default behavior for upgraded hardware). For free hardware, you can’t configure the sleep time (value is fixed to 48 hours of inactivity). See https://huggingface.co/docs/hub/spaces-gpus#sleep-time for more details.

Returns:

Runtime information about a Space including Space stage and hardware.

Return type:

[SpaceRuntime]

<Tip>

It is also possible to request hardware directly when creating the Space repo! See [create_repo] for details.

</Tip>

set_space_sleep_time(repo_id: str, sleep_time: int, *, token: bool | str | None = None) SpaceRuntime[source][source]

Set a custom sleep time for a Space running on upgraded hardware.

Your Space will go to sleep after X seconds of inactivity. You are not billed when your Space is in “sleep” mode. If a new visitor lands on your Space, it will “wake it up”. Only upgraded hardware can have a configurable sleep time. To know more about the sleep stage, please refer to https://huggingface.co/docs/hub/spaces-gpus#sleep-time.

Parameters:
  • repo_id (str) – ID of the repo to update. Example: "bigcode/in-the-stack".

  • sleep_time (int) – Number of seconds of inactivity to wait before a Space is put to sleep. Set to -1 if you don’t want your Space to sleep (default behavior for upgraded hardware). For free hardware, you can’t configure the sleep time (value is fixed to 48 hours of inactivity). See https://huggingface.co/docs/hub/spaces-gpus#sleep-time for more details.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

Runtime information about a Space including Space stage and hardware.

Return type:

[SpaceRuntime]

<Tip>

It is also possible to set a custom sleep time when requesting hardware with [request_space_hardware].

</Tip>
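
Because `sleep_time` is expressed in seconds and uses -1 to mean "never sleep", a small conversion helper can make call sites easier to read. This is a sketch under stated assumptions: `sleep_time_seconds` and `apply_sleep_time` are hypothetical names, and using `None` to request "never sleep" is a convention of this example only.

```python
from typing import Optional


def sleep_time_seconds(hours: Optional[float]) -> int:
    """Convert hours of inactivity to the seconds value expected by sleep_time."""
    if hours is None:
        # -1 is the documented value for a Space that should never sleep.
        return -1
    if hours <= 0:
        raise ValueError("hours must be positive; pass None to disable sleeping")
    return int(hours * 3600)


def apply_sleep_time(repo_id: str, hours: Optional[float]):
    """Hypothetical wrapper around set_space_sleep_time."""
    # Imported lazily so the conversion helper works without huggingface_hub installed.
    from huggingface_hub import HfApi

    return HfApi().set_space_sleep_time(
        repo_id=repo_id, sleep_time=sleep_time_seconds(hours)
    )
```

The same value can be passed as `sleep_time` to `request_space_hardware`. As noted above, the setting only applies to upgraded hardware.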

pause_space(repo_id: str, *, token: bool | str | None = None) SpaceRuntime[source][source]

Pause your Space.

A paused Space stops executing until manually restarted by its owner. This is different from the sleeping state that free Spaces enter after 48h of inactivity. Paused time is not billed to your account, no matter the hardware you’ve selected. To restart your Space, use [restart_space] or go to your Space settings page.

For more details, please visit [the docs](https://huggingface.co/docs/hub/spaces-gpus#pause).

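Parameters:
  • repo_id (str) – ID of the Space to pause. Example: "Salesforce/BLIP2".

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns: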

Runtime information about your Space including stage=PAUSED and requested hardware.

Return type:

[SpaceRuntime]

Raises:
  • [RepositoryNotFoundError] – If your Space is not found (error 404). Most probably wrong repo_id or your space is private but you are not authenticated.

  • [HfHubHTTPError] – 403 Forbidden: only the owner of a Space can pause it. If you want to manage a Space that you don’t own, either ask the owner by opening a Discussion or duplicate the Space.

  • [BadRequestError] – If your Space is a static Space. Static Spaces are always running and never billed. If you want to hide a static Space, you can set it to private.

restart_space(repo_id: str, *, token: bool | str | None = None, factory_reboot: bool = False) SpaceRuntime[source][source]

Restart your Space.

This is the only way to programmatically restart a Space if you’ve put it on Pause (see [pause_space]). You must be the owner of the Space to restart it. If you are using an upgraded hardware, your account will be billed as soon as the Space is restarted. You can trigger a restart no matter the current state of a Space.

For more details, please visit [the docs](https://huggingface.co/docs/hub/spaces-gpus#pause).

Parameters:
  • repo_id (str) – ID of the Space to restart. Example: "Salesforce/BLIP2".

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

  • factory_reboot (bool, optional) – If True, the Space will be rebuilt from scratch without caching any requirements.

Returns:

Runtime information about your Space.

Return type:

[SpaceRuntime]

Raises:
  • [RepositoryNotFoundError] – If your Space is not found (error 404). Most probably wrong repo_id or your space is private but you are not authenticated.

  • [HfHubHTTPError] – 403 Forbidden: only the owner of a Space can restart it. If you want to restart a Space that you don’t own, either ask the owner by opening a Discussion or duplicate the Space.

  • [BadRequestError] – If your Space is a static Space. Static Spaces are always running and never billed. If you want to hide a static Space, you can set it to private.
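
The error cases listed for `pause_space` and `restart_space` can be turned into readable messages by inspecting the HTTP status of the raised error. The sketch below is illustrative: the helper names are hypothetical, the import path `huggingface_hub.utils` is an assumption that may differ between library versions, and mapping [BadRequestError] to status 400 is also an assumption. To my understanding the more specific errors derive from [HfHubHTTPError], which is why a single `except` clause is used here.

```python
from typing import Optional

# Reasons paraphrased from the "Raises" lists above.
# Assumption: BadRequestError corresponds to HTTP 400.
SPACE_ERROR_REASONS = {
    404: "Space not found: check repo_id, or authenticate if the Space is private.",
    403: "Forbidden: only the owner of a Space can pause or restart it.",
    400: "Bad request: static Spaces are always running and cannot be paused or restarted.",
}


def explain_space_error(status_code: Optional[int]) -> str:
    """Map an HTTP status code to the documented failure reason."""
    return SPACE_ERROR_REASONS.get(
        status_code, f"Unexpected HTTP status: {status_code}"
    )


def restart_space_or_explain(repo_id: str, factory_reboot: bool = False):
    """Hypothetical wrapper: restart a Space and re-raise failures with context."""
    # Imported lazily so explain_space_error works without huggingface_hub installed.
    from huggingface_hub import HfApi
    from huggingface_hub.utils import HfHubHTTPError  # assumed import path

    try:
        return HfApi().restart_space(repo_id, factory_reboot=factory_reboot)
    except HfHubHTTPError as err:
        status = getattr(err.response, "status_code", None)
        raise RuntimeError(explain_space_error(status)) from err
```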

duplicate_space(from_id: str, to_id: str | None = None, *, private: bool | None = None, token: bool | str | None = None, exist_ok: bool = False, hardware: SpaceHardware | None = None, storage: SpaceStorage | None = None, sleep_time: int | None = None, secrets: List[Dict[str, str]] | None = None, variables: List[Dict[str, str]] | None = None) RepoUrl[source][source]

Duplicate a Space.

Programmatically duplicate a Space. The new Space will be created in your account and will be in the same state as the original Space (running or paused). You can duplicate a Space no matter the current state of a Space.

Parameters:
  • from_id (str) – ID of the Space to duplicate. Example: "pharma/CLIP-Interrogator".

  • to_id (str, optional) – ID of the new Space. Example: "dog/CLIP-Interrogator". If not provided, the new Space will have the same name as the original Space, but in your account.

  • private (bool, optional) – Whether the new Space should be private or not. Defaults to the same privacy as the original Space.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

  • exist_ok (bool, optional, defaults to False) – If True, do not raise an error if repo already exists.

  • hardware (SpaceHardware or str, optional) – Choice of Hardware. Example: "t4-medium". See [SpaceHardware] for a complete list.

  • storage (SpaceStorage or str, optional) – Choice of persistent storage tier. Example: "small". See [SpaceStorage] for a complete list.

  • sleep_time (int, optional) – Number of seconds of inactivity to wait before a Space is put to sleep. Set to -1 if you don’t want your Space to sleep (default behavior for upgraded hardware). For free hardware, you can’t configure the sleep time (value is fixed to 48 hours of inactivity). See https://huggingface.co/docs/hub/spaces-gpus#sleep-time for more details.

  • secrets (List[Dict[str, str]], optional) – A list of secret keys to set in your Space. Each item is in the form {"key": ..., "value": ..., "description": ...} where description is optional. For more details, see https://huggingface.co/docs/hub/spaces-overview#managing-secrets.

  • variables (List[Dict[str, str]], optional) – A list of public environment variables to set in your Space. Each item is in the form {"key": ..., "value": ..., "description": ...} where description is optional. For more details, see https://huggingface.co/docs/hub/spaces-overview#managing-secrets-and-environment-variables.

Returns:

URL to the newly created repo. Value is a subclass of str containing attributes like endpoint, repo_type and repo_id.

Return type:

[RepoUrl]

Raises:
  • [RepositoryNotFoundError] – If one of from_id or to_id cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.

  • [HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) – If the HuggingFace API returned an error.

Example

>>> from huggingface_hub import duplicate_space

# Duplicate a Space to your account
>>> duplicate_space("multimodalart/dreambooth-training")
RepoUrl('https://huggingface.co/spaces/nateraw/dreambooth-training',...)

# Can set custom destination id and visibility flag.
>>> duplicate_space("multimodalart/dreambooth-training", to_id="my-dreambooth", private=True)
RepoUrl('https://huggingface.co/spaces/nateraw/my-dreambooth',...)
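
The `secrets` and `variables` parameters of `duplicate_space` expect lists of `{"key": ..., "value": ..., "description": ...}` items. When the configuration already lives in plain dictionaries, a small converter keeps the call compact. This is a minimal sketch: `to_space_entries` and `duplicate_with_config` are hypothetical helpers, not library functions.

```python
from typing import Dict, List, Optional


def to_space_entries(
    values: Dict[str, str], descriptions: Optional[Dict[str, str]] = None
) -> List[Dict[str, str]]:
    """Build the list-of-dicts form described for `secrets` and `variables`."""
    descriptions = descriptions or {}
    entries = []
    for key, value in values.items():
        entry = {"key": key, "value": value}
        # The description field is optional, so it is only added when provided.
        if key in descriptions:
            entry["description"] = descriptions[key]
        entries.append(entry)
    return entries


def duplicate_with_config(
    from_id: str, secrets: Dict[str, str], variables: Dict[str, str]
):
    """Hypothetical wrapper: duplicate a Space and set its configuration in one call."""
    # Imported lazily so the converter works without huggingface_hub installed.
    from huggingface_hub import duplicate_space

    return duplicate_space(
        from_id,
        secrets=to_space_entries(secrets),
        variables=to_space_entries(variables),
    )
```

In practice the secret values would be read from the environment or a secrets manager rather than written into source code.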

request_space_storage(repo_id: str, storage: SpaceStorage, *, token: bool | str | None = None) SpaceRuntime[source][source]

Request persistent storage for a Space.

Parameters:
  • repo_id (str) – ID of the Space to update. Example: "open-llm-leaderboard/open_llm_leaderboard".

  • storage (str or [SpaceStorage]) – Storage tier. Either ‘small’, ‘medium’, or ‘large’.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

Runtime information about a Space including Space stage and hardware.

Return type:

[SpaceRuntime]

<Tip>

It is not possible to decrease persistent storage after it is granted. To do so, you must delete it via [delete_space_storage].

</Tip>
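
Since a granted tier cannot be decreased in place, moving to a smaller tier takes two calls. The sketch below only plans which calls are needed. `storage_change_steps` is a hypothetical helper, and it assumes that the tiers are ordered small < medium < large and that moving to a larger tier needs only a new request. Deleting storage may discard the data it holds, so check that before running the planned steps.

```python
from typing import List, Optional

# Assumption: tiers are ordered from smallest to largest.
STORAGE_TIERS = ("small", "medium", "large")


def storage_change_steps(current: Optional[str], requested: str) -> List[str]:
    """Return the names of the API calls needed to reach the requested tier."""
    if requested not in STORAGE_TIERS:
        raise ValueError(f"unknown storage tier: {requested!r}")
    if current is None:
        # No persistent storage yet: a single request is enough.
        return ["request_space_storage"]
    if current not in STORAGE_TIERS:
        raise ValueError(f"unknown storage tier: {current!r}")
    current_rank = STORAGE_TIERS.index(current)
    requested_rank = STORAGE_TIERS.index(requested)
    if requested_rank == current_rank:
        return []
    if requested_rank > current_rank:
        return ["request_space_storage"]
    # Decreasing is not possible directly: delete first, then request again.
    return ["delete_space_storage", "request_space_storage"]
```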

delete_space_storage(repo_id: str, *, token: bool | str | None = None) SpaceRuntime[source][source]

Delete persistent storage for a Space.

Parameters:
  • repo_id (str) – ID of the Space to update. Example: "open-llm-leaderboard/open_llm_leaderboard".

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

Runtime information about a Space including Space stage and hardware.

Return type:

[SpaceRuntime]

Raises:

[BadRequestError] – If space has no persistent storage.

list_inference_endpoints(namespace: str | None = None, *, token: str | bool | None = None) List[InferenceEndpoint][source][source]

Lists all inference endpoints for the given namespace.

Parameters:
  • namespace (str, optional) – The namespace to list endpoints for. Defaults to the current user. Set to "*" to list all endpoints from all namespaces (i.e. personal namespace and all orgs the user belongs to).

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

A list of all inference endpoints for the given namespace.

Return type:

List[InferenceEndpoint]

Example

>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> api.list_inference_endpoints()
[InferenceEndpoint(name='my-endpoint', ...), ...]

create_inference_endpoint(name: str, *, repository: str, framework: str, accelerator: str, instance_size: str, instance_type: str, region: str, vendor: str, account_id: str | None = None, min_replica: int = 1, max_replica: int = 1, scale_to_zero_timeout: int | None = None, revision: str | None = None, task: str | None = None, custom_image: Dict | None = None, env: Dict[str, str] | None = None, secrets: Dict[str, str] | None = None, type: InferenceEndpointType = InferenceEndpointType.PROTECTED, domain: str | None = None, path: str | None = None, cache_http_responses: bool | None = None, tags: List[str] | None = None, namespace: str | None = None, token: str | bool | None = None) InferenceEndpoint[source][source]

Create a new Inference Endpoint.

Parameters:
  • name (str) – The unique name for the new Inference Endpoint.

  • repository (str) – The name of the model repository associated with the Inference Endpoint (e.g. "gpt2").

  • framework (str) – The machine learning framework used for the model (e.g. "custom").

  • accelerator (str) – The hardware accelerator to be used for inference (e.g. "cpu").

  • instance_size (str) – The size or type of the instance to be used for hosting the model (e.g. "x4").

  • instance_type (str) – The cloud instance type where the Inference Endpoint will be deployed (e.g. "intel-icl").

  • region (str) – The cloud region in which the Inference Endpoint will be created (e.g. "us-east-1").

  • vendor (str) – The cloud provider or vendor where the Inference Endpoint will be hosted (e.g. "aws").

  • account_id (str, optional) – The account ID used to link a VPC to a private Inference Endpoint (if applicable).

  • min_replica (int, optional) – The minimum number of replicas (instances) to keep running for the Inference Endpoint. To enable scaling to zero, set this value to 0 and adjust scale_to_zero_timeout accordingly. Defaults to 1.

  • max_replica (int, optional) – The maximum number of replicas (instances) to scale to for the Inference Endpoint. Defaults to 1.

  • scale_to_zero_timeout (int, optional) – The duration in minutes before an inactive endpoint is scaled to zero, or no scaling to zero if set to None and min_replica is not 0. Defaults to None.

  • revision (str, optional) – The specific model revision to deploy on the Inference Endpoint (e.g. "6c0e6080953db56375760c0471a8c5f2929baf11").

  • task (str, optional) – The task on which to deploy the model (e.g. "text-classification").

  • custom_image (Dict, optional) – A custom Docker image to use for the Inference Endpoint. This is useful if you want to deploy an Inference Endpoint running on the text-generation-inference (TGI) framework (see examples).

  • env (Dict[str, str], optional) – Non-secret environment variables to inject in the container environment.

  • secrets (Dict[str, str], optional) – Secret values to inject in the container environment.

  • type ([InferenceEndpointType], optional) – The type of the Inference Endpoint, which can be "protected" (default), "public" or "private".

  • domain (str, optional) – The custom domain for the Inference Endpoint deployment. If set, the Inference Endpoint will be available at this domain (e.g. "my-new-domain.cool-website.woof").

  • path (str, optional) – The custom path to the deployed model, should start with a / (e.g. "/models/google-bert/bert-base-uncased").

  • cache_http_responses (bool, optional) – Whether to cache HTTP responses from the Inference Endpoint. Defaults to False.

  • tags (List[str], optional) – A list of tags to associate with the Inference Endpoint.

  • namespace (str, optional) – The namespace where the Inference Endpoint will be created. Defaults to the current user’s namespace.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

Information about the updated Inference Endpoint.

Return type:

[InferenceEndpoint]

Example

>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> endpoint = api.create_inference_endpoint(
...     "my-endpoint-name",
...     repository="gpt2",
...     framework="pytorch",
...     task="text-generation",
...     accelerator="cpu",
...     vendor="aws",
...     region="us-east-1",
...     type="protected",
...     instance_size="x2",
...     instance_type="intel-icl",
... )
>>> endpoint
InferenceEndpoint(name='my-endpoint-name', status="pending",...)

# Run inference on the endpoint
>>> endpoint.client.text_generation(...)
"..."

# Start an Inference Endpoint running Zephyr-7b-beta on TGI
>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> endpoint = api.create_inference_endpoint(
...     "aws-zephyr-7b-beta-0486",
...     repository="HuggingFaceH4/zephyr-7b-beta",
...     framework="pytorch",
...     task="text-generation",
...     accelerator="gpu",
...     vendor="aws",
...     region="us-east-1",
...     type="protected",
...     instance_size="x1",
...     instance_type="nvidia-a10g",
...     env={
...         "MAX_BATCH_PREFILL_TOKENS": "2048",
...         "MAX_INPUT_LENGTH": "1024",
...         "MAX_TOTAL_TOKENS": "1512",
...         "MODEL_ID": "/repository"
...     },
...     custom_image={
...         "health_route": "/health",
...         "url": "ghcr.io/huggingface/text-generation-inference:1.1.0",
...     },
...     secrets={"MY_SECRET_KEY": "secret_value"},
...     tags=["dev", "text-generation"],
... )

# Start an Inference Endpoint running ProsusAI/finbert while scaling to zero in 15 minutes
>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> endpoint = api.create_inference_endpoint(
...     "finbert-classifier",
...     repository="ProsusAI/finbert",
...     framework="pytorch",
...     task="text-classification",
...     min_replica=0,
...     scale_to_zero_timeout=15,
...     accelerator="cpu",
...     vendor="aws",
...     region="us-east-1",
...     type="protected",
...     instance_size="x2",
...     instance_type="intel-icl",
... )
>>> endpoint.wait(timeout=300)
# Run inference on the endpoint
>>> endpoint.client.text_generation(...)
TextClassificationOutputElement(label='positive', score=0.8983615040779114)

create_inference_endpoint_from_catalog(repo_id: str, *, name: str | None = None, token: bool | str | None = None, namespace: str | None = None) InferenceEndpoint[source][source]

Create a new Inference Endpoint from a model in the Hugging Face Inference Catalog.

The goal of the Inference Catalog is to provide a curated list of models that are optimized for inference and for which default configurations have been tested. See https://endpoints.huggingface.co/catalog for a list of available models in the catalog.

Parameters:
  • repo_id (str) – The ID of the model in the catalog to deploy as an Inference Endpoint.

  • name (str, optional) – The unique name for the new Inference Endpoint. If not provided, a random name will be generated.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication).

  • namespace (str, optional) – The namespace where the Inference Endpoint will be created. Defaults to the current user’s namespace.

Returns:

information about the new Inference Endpoint.

Return type:

[InferenceEndpoint]

Warning: create_inference_endpoint_from_catalog is experimental. Its API is subject to change in the future. Please provide feedback if you have any suggestions or requests.

list_inference_catalog(*, token: bool | str | None = None) List[str][source][source]

List models available in the Hugging Face Inference Catalog.

The goal of the Inference Catalog is to provide a curated list of models that are optimized for inference and for which default configurations have been tested. See https://endpoints.huggingface.co/catalog for a list of available models in the catalog.

Use [create_inference_endpoint_from_catalog] to deploy a model from the catalog.

Parameters:

token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication).

Returns:

A list of model IDs available in the catalog.

Return type:

List[str]

Warning: list_inference_catalog is experimental. Its API is subject to change in the future. Please provide feedback if you have any suggestions or requests.
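
The two catalog methods can be combined so that a deployment is only requested for a model the catalog actually lists. The sketch below is an illustration under assumptions, not the library's own workflow: `deploy_from_catalog` is a hypothetical helper name, and `api` is assumed to be an `HfApi` instance (or any object exposing the same two methods).

```python
from typing import Any, Optional


def deploy_from_catalog(api: Any, repo_id: str, name: Optional[str] = None) -> Any:
    """Deploy `repo_id` only if it appears in the Inference Catalog.

    Hypothetical helper: `api` is assumed to expose `list_inference_catalog()`
    and `create_inference_endpoint_from_catalog()` as documented above.
    """
    available = api.list_inference_catalog()
    if repo_id not in available:
        raise ValueError(f"{repo_id!r} is not listed in the Inference Catalog")
    # `name` is optional; a random name is generated when it is omitted.
    return api.create_inference_endpoint_from_catalog(repo_id, name=name)
```

Failing early with a `ValueError` keeps the catalog lookup and the deployment request as two distinct steps, which makes the failure reason easier to report.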

get_inference_endpoint(name: str, *, namespace: str | None = None, token: str | bool | None = None) InferenceEndpoint[source][source]

Get information about an Inference Endpoint.

Parameters:
  • name (str) – The name of the Inference Endpoint to retrieve information about.

  • namespace (str, optional) – The namespace in which the Inference Endpoint is located. Defaults to the current user.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

information about the requested Inference Endpoint.

Return type:

[InferenceEndpoint]

Example:

>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> endpoint = api.get_inference_endpoint("my-text-to-image")
>>> endpoint
InferenceEndpoint(name='my-text-to-image', ...)

# Get status
>>> endpoint.status
'running'
>>> endpoint.url
'https://my-text-to-image.region.vendor.endpoints.huggingface.cloud'

# Run inference
>>> endpoint.client.text_to_image(...)
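
Because `get_inference_endpoint` returns a snapshot of the endpoint, reading `status` repeatedly is one way to detect when a deployment becomes usable. The following is a minimal polling sketch under assumptions: `wait_until_running` is a hypothetical helper, `api` is assumed to be an `HfApi`-like object, and `'running'` is taken from the example above as the ready state. The `endpoint.wait(timeout=...)` call used in an earlier example is the more direct route; this sketch only illustrates reading `status`.

```python
import time
from typing import Any, Optional


def wait_until_running(
    api: Any,
    name: str,
    *,
    namespace: Optional[str] = None,
    timeout: float = 300.0,
    poll_interval: float = 5.0,
) -> Any:
    """Poll `get_inference_endpoint` until the endpoint reports 'running'.

    Hypothetical helper; raises TimeoutError if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        endpoint = api.get_inference_endpoint(name, namespace=namespace)
        if endpoint.status == "running":
            return endpoint
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{name!r} is still {endpoint.status!r} after {timeout}s")
        time.sleep(poll_interval)
```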

update_inference_endpoint(name: str, *, accelerator: str | None = None, instance_size: str | None = None, instance_type: str | None = None, min_replica: int | None = None, max_replica: int | None = None, scale_to_zero_timeout: int | None = None, repository: str | None = None, framework: str | None = None, revision: str | None = None, task: str | None = None, custom_image: Dict | None = None, env: Dict[str, str] | None = None, secrets: Dict[str, str] | None = None, domain: str | None = None, path: str | None = None, cache_http_responses: bool | None = None, tags: List[str] | None = None, namespace: str | None = None, token: str | bool | None = None) InferenceEndpoint[source][source]

Update an Inference Endpoint.

This method allows the update of either the compute configuration, the deployed model, the route, or any combination. All arguments are optional but at least one must be provided.

For convenience, you can also update an Inference Endpoint using [InferenceEndpoint.update].

Parameters:
  • name (str) – The name of the Inference Endpoint to update.

  • accelerator (str, optional) – The hardware accelerator to be used for inference (e.g. "cpu").

  • instance_size (str, optional) – The size or type of the instance to be used for hosting the model (e.g. "x4").

  • instance_type (str, optional) – The cloud instance type where the Inference Endpoint will be deployed (e.g. "intel-icl").

  • min_replica (int, optional) – The minimum number of replicas (instances) to keep running for the Inference Endpoint.

  • max_replica (int, optional) – The maximum number of replicas (instances) to scale to for the Inference Endpoint.

  • scale_to_zero_timeout (int, optional) – The duration in minutes before an inactive endpoint is scaled to zero.

  • repository (str, optional) – The name of the model repository associated with the Inference Endpoint (e.g. "gpt2").

  • framework (str, optional) – The machine learning framework used for the model (e.g. "custom").

  • revision (str, optional) – The specific model revision to deploy on the Inference Endpoint (e.g. "6c0e6080953db56375760c0471a8c5f2929baf11").

  • task (str, optional) – The task on which to deploy the model (e.g. "text-classification").

  • custom_image (Dict, optional) – A custom Docker image to use for the Inference Endpoint. This is useful if you want to deploy an Inference Endpoint running on the text-generation-inference (TGI) framework (see examples).

  • env (Dict[str, str], optional) – Non-secret environment variables to inject in the container environment.

  • secrets (Dict[str, str], optional) – Secret values to inject in the container environment.

  • domain (str, optional) – The custom domain for the Inference Endpoint deployment. If set, the Inference Endpoint will be available at this domain (e.g. "my-new-domain.cool-website.woof").

  • path (str, optional) – The custom path to the deployed model. It should start with a / (e.g. "/models/google-bert/bert-base-uncased").

  • cache_http_responses (bool, optional) – Whether to cache HTTP responses from the Inference Endpoint.

  • tags (List[str], optional) – A list of tags to associate with the Inference Endpoint.

  • namespace (str, optional) – The namespace where the Inference Endpoint will be updated. Defaults to the current user’s namespace.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

information about the updated Inference Endpoint.

Return type:

[InferenceEndpoint]
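
Since every field is optional but at least one must be provided, a thin wrapper can enforce that rule before any request is made. This is a sketch under assumptions, not part of the library: `update_endpoint` is a hypothetical name, and `api` is assumed to be an `HfApi`-like object. Fields left as `None` are dropped, which matches the documented defaults.

```python
from typing import Any


def update_endpoint(api: Any, name: str, **changes: Any) -> Any:
    """Forward only the fields that were actually set.

    Hypothetical helper: raises ValueError when nothing would be updated,
    mirroring the "at least one must be provided" rule described above.
    """
    provided = {key: value for key, value in changes.items() if value is not None}
    if not provided:
        raise ValueError("at least one field to update must be provided")
    return api.update_inference_endpoint(name, **provided)
```

For instance, `update_endpoint(api, "my-endpoint-name", instance_size="x4", max_replica=2)` would change only the compute configuration and leave the deployed model untouched.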

delete_inference_endpoint(name: str, *, namespace: str | None = None, token: str | bool | None = None) None[source][source]

Delete an Inference Endpoint.

This operation is not reversible. If you don’t want to be charged for an Inference Endpoint, it is preferable to pause it with [pause_inference_endpoint] or scale it to zero with [scale_to_zero_inference_endpoint].

For convenience, you can also delete an Inference Endpoint using [InferenceEndpoint.delete].

Parameters:
  • name (str) – The name of the Inference Endpoint to delete.

  • namespace (str, optional) – The namespace in which the Inference Endpoint is located. Defaults to the current user.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

pause_inference_endpoint(name: str, *, namespace: str | None = None, token: str | bool | None = None) InferenceEndpoint[source][source]

Pause an Inference Endpoint.

A paused Inference Endpoint will not be charged. It can be resumed at any time using [resume_inference_endpoint]. This differs from scaling the Inference Endpoint to zero with [scale_to_zero_inference_endpoint], after which the endpoint restarts automatically when a request is made to it.

For convenience, you can also pause an Inference Endpoint using [InferenceEndpoint.pause].

Parameters:
  • name (str) – The name of the Inference Endpoint to pause.

  • namespace (str, optional) – The namespace in which the Inference Endpoint is located. Defaults to the current user.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

information about the paused Inference Endpoint.

Return type:

[InferenceEndpoint]

resume_inference_endpoint(name: str, *, namespace: str | None = None, running_ok: bool = True, token: str | bool | None = None) InferenceEndpoint[source][source]

Resume an Inference Endpoint.

For convenience, you can also resume an Inference Endpoint using [InferenceEndpoint.resume].

Parameters:
  • name (str) – The name of the Inference Endpoint to resume.

  • namespace (str, optional) – The namespace in which the Inference Endpoint is located. Defaults to the current user.

  • running_ok (bool, optional) – If True, the method will not raise an error if the Inference Endpoint is already running. Defaults to True.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

information about the resumed Inference Endpoint.

Return type:

[InferenceEndpoint]

scale_to_zero_inference_endpoint(name: str, *, namespace: str | None = None, token: str | bool | None = None) InferenceEndpoint[source][source]

Scale Inference Endpoint to zero.

An Inference Endpoint scaled to zero will not be charged. It will be resumed on the next request to it, with a cold start delay. This differs from pausing the Inference Endpoint with [pause_inference_endpoint], which requires a manual resume with [resume_inference_endpoint].

For convenience, you can also scale an Inference Endpoint to zero using [InferenceEndpoint.scale_to_zero].

Parameters:
  • name (str) – The name of the Inference Endpoint to scale to zero.

  • namespace (str, optional) – The namespace in which the Inference Endpoint is located. Defaults to the current user.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

information about the scaled-to-zero Inference Endpoint.

Return type:

[InferenceEndpoint]
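
Pausing and scaling to zero both stop charges without deleting the endpoint; the difference is whether it comes back by itself. The sketch below makes that choice explicit. It is illustrative only: `stop_billing` and its `auto_restart` flag are hypothetical names, and `api` is assumed to be an `HfApi`-like object.

```python
from typing import Any, Optional


def stop_billing(api: Any, name: str, *, auto_restart: bool, namespace: Optional[str] = None) -> Any:
    """Stop charges for an endpoint without deleting it (hypothetical helper).

    auto_restart=True  -> scale to zero: restarts on the next request, with a cold start.
    auto_restart=False -> pause: stays down until resume_inference_endpoint is called.
    """
    if auto_restart:
        return api.scale_to_zero_inference_endpoint(name, namespace=namespace)
    return api.pause_inference_endpoint(name, namespace=namespace)
```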

list_collections(*, owner: List[str] | str | None = None, item: List[str] | str | None = None, sort: Literal['lastModified', 'trending', 'upvotes'] | None = None, limit: int | None = None, token: bool | str | None = None) Iterable[Collection][source][source]

List collections on the Hugging Face Hub, given some filters.

Warning: When listing collections, the item list per collection is truncated to 4 items maximum. To retrieve all items from a collection, you must use [get_collection].

Parameters:
  • owner (List[str] or str, optional) – Filter by owner’s username.

  • item (List[str] or str, optional) – Filter collections containing particular items. Example: "models/teknium/OpenHermes-2.5-Mistral-7B", "datasets/squad" or "papers/2311.12983".

  • sort (Literal["lastModified", "trending", "upvotes"], optional) – Sort collections by last modified, trending or upvotes.

  • limit (int, optional) – Maximum number of collections to be returned.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

an iterable of [Collection] objects.

Return type:

Iterable[Collection]
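
Because listing truncates each collection to 4 items, code that needs the full item list has to fetch every collection again by slug. A minimal sketch under assumptions: `iter_full_collections` is a hypothetical helper, and `api` is assumed to be an `HfApi`-like object whose listed collections expose a `slug` attribute.

```python
from typing import Any, Iterator, Optional


def iter_full_collections(api: Any, owner: str, limit: Optional[int] = None) -> Iterator[Any]:
    """Yield each of `owner`'s collections with its complete item list (hypothetical helper)."""
    for summary in api.list_collections(owner=owner, limit=limit):
        # The listing holds at most 4 items per collection, so re-fetch by slug.
        yield api.get_collection(summary.slug)
```

This costs one extra request per collection, so passing `limit` is advisable when the owner has many collections.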

get_collection(collection_slug: str, *, token: str | bool | None = None) Collection[source][source]

Gets information about a Collection on the Hub.

Parameters:
  • collection_slug (str) – Slug of the collection on the Hub. Example: "TheBloke/recent-models-64f9a55bb3115b4f513ec026".

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns: [Collection]

Example:

>>> from huggingface_hub import get_collection
>>> collection = get_collection("TheBloke/recent-models-64f9a55bb3115b4f513ec026")
>>> collection.title
'Recent models'
>>> len(collection.items)
37
>>> collection.items[0]
CollectionItem(
    item_object_id='651446103cd773a050bf64c2',
    item_id='TheBloke/U-Amethyst-20B-AWQ',
    item_type='model',
    position=88,
    note=None
)
create_collection(title: str, *, namespace: str | None = None, description: str | None = None, private: bool = False, exists_ok: bool = False, token: str | bool | None = None) Collection[source][source]

Create a new Collection on the Hub.

Parameters:
  • title (str) – Title of the collection to create. Example: "Recent models".

  • namespace (str, optional) – Namespace of the collection to create (username or org). Will default to the owner name.

  • description (str, optional) – Description of the collection to create.

  • private (bool, optional) – Whether the collection should be private or not. Defaults to False (i.e. public collection).

  • exists_ok (bool, optional) – If True, do not raise an error if collection already exists.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns: [Collection]

Example:

>>> from huggingface_hub import create_collection
>>> collection = create_collection(
...     title="ICCV 2023",
...     description="Portfolio of models, papers and demos I presented at ICCV 2023",
... )
>>> collection.slug
"username/iccv-2023-64f9a55bb3115b4f513ec026"
update_collection_metadata(collection_slug: str, *, title: str | None = None, description: str | None = None, position: int | None = None, private: bool | None = None, theme: str | None = None, token: str | bool | None = None) Collection[source][source]

Update metadata of a collection on the Hub.

All arguments are optional. Only provided metadata will be updated.

Parameters:
  • collection_slug (str) – Slug of the collection to update. Example: "TheBloke/recent-models-64f9a55bb3115b4f513ec026".

  • title (str, optional) – Title of the collection to update.

  • description (str, optional) – Description of the collection to update.

  • position (int, optional) – New position of the collection in the list of collections of the user.

  • private (bool, optional) – Whether the collection should be private or not.

  • theme (str, optional) – Theme of the collection on the Hub.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns: [Collection]

Example:

>>> from huggingface_hub import update_collection_metadata
>>> collection = update_collection_metadata(
...     collection_slug="username/iccv-2023-64f9a55bb3115b4f513ec026",
...     title="ICCV Oct. 2023",
...     description="Portfolio of models, datasets, papers and demos I presented at ICCV Oct. 2023",
...     private=False,
...     theme="pink",
... )
>>> collection.slug
"username/iccv-oct-2023-64f9a55bb3115b4f513ec026"
# ^collection slug got updated but not the trailing ID
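
The example shows that renaming a collection changes the readable part of its slug while the trailing ID stays the same. When a stable key is needed across renames, the trailing ID can be split off. The sketch below assumes the slug layout seen in the examples (`namespace/title-part-id`); that layout is inferred from this page, and `split_collection_slug` is a hypothetical helper.

```python
from typing import Tuple


def split_collection_slug(slug: str) -> Tuple[str, str, str]:
    """Split 'namespace/title-part-id' into (namespace, title_part, trailing_id).

    Assumes the slug layout shown in the examples above; not a library function.
    """
    namespace, _, rest = slug.partition("/")
    title_part, _, trailing_id = rest.rpartition("-")
    return namespace, title_part, trailing_id


before = split_collection_slug("username/iccv-2023-64f9a55bb3115b4f513ec026")
after = split_collection_slug("username/iccv-oct-2023-64f9a55bb3115b4f513ec026")
# The title part differs after the rename, while the trailing ID is identical.
same_collection = before[2] == after[2]
```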
delete_collection(collection_slug: str, *, missing_ok: bool = False, token: str | bool | None = None) None[source][source]

Delete a collection on the Hub.

Parameters:
  • collection_slug (str) – Slug of the collection to delete. Example: "TheBloke/recent-models-64f9a55bb3115b4f513ec026".

  • missing_ok (bool, optional) – If True, do not raise an error if collection doesn’t exist.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Example:

>>> from huggingface_hub import delete_collection
>>> delete_collection("username/useless-collection-64f9a55bb3115b4f513ec026", missing_ok=True)

Warning: This action is irreversible. A deleted collection cannot be restored.

add_collection_item(collection_slug: str, item_id: str, item_type: Literal['model', 'dataset', 'space', 'paper', 'collection'], *, note: str | None = None, exists_ok: bool = False, token: str | bool | None = None) Collection[source][source]

Add an item to a collection on the Hub.

Parameters:
  • collection_slug (str) – Slug of the collection to update. Example: "TheBloke/recent-models-64f9a55bb3115b4f513ec026".

  • item_id (str) – ID of the item to add to the collection. It can be the ID of a repo on the Hub (e.g. "facebook/bart-large-mnli") or a paper id (e.g. "2307.09288").

  • item_type (str) – Type of the item to add. Can be one of "model", "dataset", "space", "paper" or "collection".

  • note (str, optional) – A note to attach to the item in the collection. The maximum size for a note is 500 characters.

  • exists_ok (bool, optional) – If True, do not raise an error if item already exists.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns: [Collection]

Raises:
  • HTTPError (https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) – HTTP 403 if you only have read-only access to the repo. This can be the case if you don’t have write or admin role in the organization the repo belongs to or if you passed a read token.

  • HTTPError (https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) – HTTP 404 if the item you try to add to the collection does not exist on the Hub.

  • HTTPError (https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) – HTTP 409 if the item you try to add to the collection is already in the collection (and exists_ok=False).

Example:

>>> from huggingface_hub import add_collection_item
>>> collection = add_collection_item(
...     collection_slug="davanstrien/climate-64f99dc2a5067f6b65531bab",
...     item_id="pierre-loic/climate-news-articles",
...     item_type="dataset"
... )
>>> collection.items[-1].item_id
"pierre-loic/climate-news-articles"
# ^item got added to the collection on last position

# Add item with a note
>>> add_collection_item(
...     collection_slug="davanstrien/climate-64f99dc2a5067f6b65531bab",
...     item_id="datasets/climate_fever",
...     item_type="dataset",
...     note="This dataset adopts the FEVER methodology that consists of 1,535 real-world claims regarding climate-change collected on the internet."
... )
(...)
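
The `exists_ok` flags on `create_collection` and `add_collection_item` make it possible to write a setup step that can be re-run safely. The following is a sketch under assumptions, not a library function: `ensure_collection` is a hypothetical helper, and `api` is assumed to be an `HfApi`-like object.

```python
from typing import Any, Iterable, Tuple


def ensure_collection(api: Any, title: str, items: Iterable[Tuple[str, str]]) -> Any:
    """Create `title` if needed and add each (item_id, item_type) pair.

    Hypothetical helper: exists_ok=True on both calls makes a second run a no-op
    instead of an HTTP 409 error.
    """
    collection = api.create_collection(title=title, exists_ok=True)
    for item_id, item_type in items:
        collection = api.add_collection_item(
            collection.slug, item_id=item_id, item_type=item_type, exists_ok=True
        )
    return collection
```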
update_collection_item(collection_slug: str, item_object_id: str, *, note: str | None = None, position: int | None = None, token: str | bool | None = None) None[source][source]

Update an item in a collection.

Parameters:
  • collection_slug (str) – Slug of the collection to update. Example: "TheBloke/recent-models-64f9a55bb3115b4f513ec026".

  • item_object_id (str) – ID of the item in the collection. This is not the id of the item on the Hub (repo_id or paper id). It must be retrieved from a [CollectionItem] object. Example: collection.items[0].item_object_id.

  • note (str, optional) – A note to attach to the item in the collection. The maximum size for a note is 500 characters.

  • position (int, optional) – New position of the item in the collection.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Example:

>>> from huggingface_hub import get_collection, update_collection_item

# Get collection first
>>> collection = get_collection("TheBloke/recent-models-64f9a55bb3115b4f513ec026")

# Update item based on its ID (add note + update position)
>>> update_collection_item(
...     collection_slug="TheBloke/recent-models-64f9a55bb3115b4f513ec026",
...     item_object_id=collection.items[-1].item_object_id,
...     note="Newly updated model!",
...     position=0,
... )
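
Since `item_object_id` is not the Hub ID of the item, a common step is to look it up from the Hub ID first. The sketch below does that and also checks the documented 500-character note limit before calling the API. `annotate_item` and `MAX_NOTE_LENGTH` are hypothetical names, and `api` is assumed to be an `HfApi`-like object.

```python
from typing import Any

MAX_NOTE_LENGTH = 500  # documented maximum size for a note


def annotate_item(api: Any, collection: Any, item_id: str, note: str) -> None:
    """Attach `note` to the collection item whose Hub ID is `item_id` (hypothetical helper)."""
    if len(note) > MAX_NOTE_LENGTH:
        raise ValueError(f"note is {len(note)} characters; the maximum is {MAX_NOTE_LENGTH}")
    for item in collection.items:
        if item.item_id == item_id:
            # The API expects the collection-specific object ID, not the Hub ID.
            api.update_collection_item(
                collection.slug, item_object_id=item.item_object_id, note=note
            )
            return
    raise LookupError(f"{item_id!r} is not in collection {collection.slug!r}")
```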
delete_collection_item(collection_slug: str, item_object_id: str, *, missing_ok: bool = False, token: str | bool | None = None) None[source][source]

Delete an item from a collection.

Parameters:
  • collection_slug (str) – Slug of the collection to update. Example: "TheBloke/recent-models-64f9a55bb3115b4f513ec026".

  • item_object_id (str) – ID of the item in the collection. This is not the id of the item on the Hub (repo_id or paper id). It must be retrieved from a [CollectionItem] object. Example: collection.items[0].item_object_id.

  • missing_ok (bool, optional) – If True, do not raise an error if item doesn’t exist.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Example:

>>> from huggingface_hub import get_collection, delete_collection_item

# Get collection first
>>> collection = get_collection("TheBloke/recent-models-64f9a55bb3115b4f513ec026")

# Delete item based on its ID
>>> delete_collection_item(
...     collection_slug="TheBloke/recent-models-64f9a55bb3115b4f513ec026",
...     item_object_id=collection.items[-1].item_object_id,
... )
list_pending_access_requests(repo_id: str, *, repo_type: str | None = None, token: bool | str | None = None) List[AccessRequest][source][source]

Get pending access requests for a given gated repo.

A pending request means the user has requested access to the repo but the request has not been processed yet. If the approval mode is automatic, this list should be empty. Pending requests can be accepted or rejected using [accept_access_request] and [reject_access_request].

For more info about gated repos, see https://huggingface.co/docs/hub/models-gated.

Parameters:
  • repo_id (str) – The id of the repo to get access requests for.

  • repo_type (str, optional) – The type of the repo to get access requests for. Must be one of model, dataset or space. Defaults to model.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

A list of [AccessRequest] objects. Each item contains a username, email, status and timestamp attribute. If the gated repo has a custom form, the fields attribute will be populated with the user’s answers.

Return type:

List[AccessRequest]

Raises:
  • HTTPError (https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) – HTTP 400 if the repo is not gated.

  • HTTPError (https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) – HTTP 403 if you only have read-only access to the repo. This can be the case if you don’t have write or admin role in the organization the repo belongs to or if you passed a read token.

Example:

>>> from huggingface_hub import list_pending_access_requests, accept_access_request

# List pending requests
>>> requests = list_pending_access_requests("meta-llama/Llama-2-7b")
>>> len(requests)
411
>>> requests[0]
[
    AccessRequest(
        username='clem',
        fullname='Clem 🤗',
        email='***',
        timestamp=datetime.datetime(2023, 11, 23, 18, 4, 53, 828000, tzinfo=datetime.timezone.utc),
        status='pending',
        fields=None,
    ),
]

# Accept Clem's request
>>> accept_access_request("meta-llama/Llama-2-7b", "clem")
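
When approval is manual, pending requests are often triaged against a known list of users. The sketch below accepts only allowlisted usernames and leaves the rest pending. It is illustrative: `accept_allowlisted` is a hypothetical helper, and `api` is assumed to be an `HfApi`-like object whose requests expose a `username` attribute as described above.

```python
from typing import Any, Iterable, List


def accept_allowlisted(api: Any, repo_id: str, allowlist: Iterable[str]) -> List[str]:
    """Accept pending requests from allowlisted users; return the accepted usernames.

    Hypothetical helper: requests from other users are left in the pending list.
    """
    allowed = set(allowlist)
    accepted: List[str] = []
    for request in api.list_pending_access_requests(repo_id):
        if request.username in allowed:
            api.accept_access_request(repo_id, request.username)
            accepted.append(request.username)
    return accepted
```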

list_accepted_access_requests(repo_id: str, *, repo_type: str | None = None, token: bool | str | None = None) List[AccessRequest][source][source]

Get accepted access requests for a given gated repo.

An accepted request means the user has requested access to the repo and the request has been accepted. The user can download any file of the repo. If the approval mode is automatic, this list should contain all requests by default. Accepted requests can be cancelled or rejected at any time using [cancel_access_request] and [reject_access_request]. A cancelled request will go back to the pending list while a rejected request will go to the rejected list. In both cases, the user will lose access to the repo.

For more info about gated repos, see https://huggingface.co/docs/hub/models-gated.

Parameters:
  • repo_id (str) – The id of the repo to get access requests for.

  • repo_type (str, optional) – The type of the repo to get access requests for. Must be one of model, dataset or space. Defaults to model.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

A list of [AccessRequest] objects. Each item contains a username, email, status and timestamp attribute. If the gated repo has a custom form, the fields attribute will be populated with the user’s answers.

Return type:

List[AccessRequest]

Raises:
  • HTTPError (https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) – HTTP 400 if the repo is not gated.

  • HTTPError (https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) – HTTP 403 if you only have read-only access to the repo. This can be the case if you don’t have write or admin role in the organization the repo belongs to or if you passed a read token.

Example:

>>> from huggingface_hub import list_accepted_access_requests
>>> requests = list_accepted_access_requests("meta-llama/Llama-2-7b")
>>> len(requests)
411
>>> requests[0]
[
    AccessRequest(
        username='clem',
        fullname='Clem 🤗',
        email='***',
        timestamp=datetime.datetime(2023, 11, 23, 18, 4, 53, 828000, tzinfo=datetime.timezone.utc),
        status='accepted',
        fields=None,
    ),
    ...
]
list_rejected_access_requests(repo_id: str, *, repo_type: str | None = None, token: bool | str | None = None) List[AccessRequest][source][source]

Get rejected access requests for a given gated repo.

A rejected request means the user has requested access to the repo and the request has been explicitly rejected by a repo owner (either you or another user from your organization). The user cannot download any file of the repo. Rejected requests can be accepted or cancelled at any time using [accept_access_request] and [cancel_access_request]. A cancelled request will go back to the pending list while an accepted request will go to the accepted list.

For more info about gated repos, see https://huggingface.co/docs/hub/models-gated.

Parameters:
  • repo_id (str) – The id of the repo to get access requests for.

  • repo_type (str, optional) – The type of the repo to get access requests for. Must be one of model, dataset or space. Defaults to model.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

A list of [AccessRequest] objects. Each item contains a username, email, status and timestamp attribute. If the gated repo has a custom form, the fields attribute will be populated with the user’s answers.

Return type:

List[AccessRequest]

Raises:
  • HTTPError (https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) – HTTP 400 if the repo is not gated.

  • HTTPError (https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) – HTTP 403 if you only have read-only access to the repo. This can be the case if you don’t have write or admin role in the organization the repo belongs to or if you passed a read token.

Example:

>>> from huggingface_hub import list_rejected_access_requests
>>> requests = list_rejected_access_requests("meta-llama/Llama-2-7b")
>>> len(requests)
411
>>> requests[0]
[
    AccessRequest(
        username='clem',
        fullname='Clem 🤗',
        email='***',
        timestamp=datetime.datetime(2023, 11, 23, 18, 4, 53, 828000, tzinfo=datetime.timezone.utc),
        status='rejected',
        fields=None,
    ),
    ...
]
cancel_access_request(repo_id: str, user: str, *, repo_type: str | None = None, token: bool | str | None = None) None[source][source]

Cancel an access request from a user for a given gated repo.

A cancelled request will go back to the pending list and the user will lose access to the repo.

For more info about gated repos, see https://huggingface.co/docs/hub/models-gated.

Parameters:
  • repo_id (str) – The id of the repo to cancel access request for.

  • user (str) – The username of the user whose access request should be cancelled.

  • repo_type (str, optional) – The type of the repo to cancel access request for. Must be one of model, dataset or space. Defaults to model.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Raises:
  • [HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 400 if the repo is not gated.

  • [HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 403 if you only have read-only access to the repo. This can be the case if you don’t have write or admin role in the organization the repo belongs to or if you passed a read token.

  • [HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 404 if the user does not exist on the Hub.

  • [HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 404 if the user access request cannot be found.

  • [HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 404 if the user access request is already in the pending list.

accept_access_request(repo_id: str, user: str, *, repo_type: str | None = None, token: bool | str | None = None) None[source][source]

Accept an access request from a user for a given gated repo.

Once the request is accepted, the user will be able to download any file of the repo and access the community tab. If the approval mode is automatic, you don’t have to accept requests manually. An accepted request can be cancelled or rejected at any time using [cancel_access_request] and [reject_access_request].

For more info about gated repos, see https://huggingface.co/docs/hub/models-gated.

Parameters:
  • repo_id (str) – The id of the repo to accept access request for.

  • user (str) – The username of the user whose access request should be accepted.

  • repo_type (str, optional) – The type of the repo to accept access request for. Must be one of model, dataset or space. Defaults to model.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Raises:
  • [HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 400 if the repo is not gated.

  • [HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 403 if you only have read-only access to the repo. This can be the case if you don’t have write or admin role in the organization the repo belongs to or if you passed a read token.

  • [HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 404 if the user does not exist on the Hub.

  • [HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 404 if the user access request cannot be found.

  • [HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 404 if the user access request is already in the accepted list.

reject_access_request(repo_id: str, user: str, *, repo_type: str | None = None, rejection_reason: str | None, token: bool | str | None = None) None[source][source]

Reject an access request from a user for a given gated repo.

A rejected request will go to the rejected list. The user cannot download any file of the repo. Rejected requests can be accepted or cancelled at any time using [accept_access_request] and [cancel_access_request]. A cancelled request will go back to the pending list while an accepted request will go to the accepted list.

For more info about gated repos, see https://huggingface.co/docs/hub/models-gated.

Parameters:
  • repo_id (str) – The id of the repo to reject access request for.

  • user (str) – The username of the user whose access request should be rejected.

  • repo_type (str, optional) – The type of the repo to reject access request for. Must be one of model, dataset or space. Defaults to model.

  • rejection_reason (str, optional) – Optional rejection reason that will be visible to the user (max 200 characters).

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Raises:
  • [HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 400 if the repo is not gated.

  • [HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 403 if you only have read-only access to the repo. This can be the case if you don’t have write or admin role in the organization the repo belongs to or if you passed a read token.

  • [HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 404 if the user does not exist on the Hub.

  • [HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 404 if the user access request cannot be found.

  • [HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 404 if the user access request is already in the rejected list.

grant_access(repo_id: str, user: str, *, repo_type: str | None = None, token: bool | str | None = None) None[source][source]

Grant access to a user for a given gated repo.

Granting access doesn’t require the user to send an access request themselves. The user is automatically added to the accepted list, meaning they can download the files. You can revoke the granted access at any time using [cancel_access_request] or [reject_access_request].

For more info about gated repos, see https://huggingface.co/docs/hub/models-gated.

Parameters:
  • repo_id (str) – The id of the repo to grant access to.

  • user (str) – The username of the user to grant access.

  • repo_type (str, optional) – The type of the repo to grant access to. Must be one of model, dataset or space. Defaults to model.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Raises:
  • [HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 400 if the repo is not gated.

  • [HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 400 if the user already has access to the repo.

  • [HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 403 if you only have read-only access to the repo. This can be the case if you don’t have write or admin role in the organization the repo belongs to or if you passed a read token.

  • [HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 404 if the user does not exist on the Hub.
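
The access-request functions above move a request between three lists (pending, accepted, rejected). The following is a toy, offline model of those transitions as the docstrings describe them, not the Hub's implementation. The names `Status`, `TARGET`, `apply_action` and `can_download` are hypothetical, and a `ValueError` stands in for the HTTP 404 the Hub returns when a request is already in the target list.

```python
from enum import Enum


class Status(str, Enum):
    PENDING = "pending"
    ACCEPTED = "accepted"
    REJECTED = "rejected"


# List each action moves a request to, following the descriptions above.
# grant_access is not modeled: it adds a user to the accepted list directly.
TARGET = {
    "cancel": Status.PENDING,
    "accept": Status.ACCEPTED,
    "reject": Status.REJECTED,
}


def apply_action(current: Status, action: str) -> Status:
    """Return the list a request moves to, or raise if it is already there."""
    target = TARGET[action]
    if current is target:
        # The Hub reports this case as HTTP 404; ValueError is a stand-in.
        raise ValueError(f"request is already in the {target.value} list")
    return target


def can_download(status: Status) -> bool:
    """Only users in the accepted list can download files of a gated repo."""
    return status is Status.ACCEPTED
```

For instance, `apply_action(Status.ACCEPTED, "cancel")` returns `Status.PENDING`, which matches the note that a cancelled request goes back to the pending list and the user loses access.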

get_webhook(webhook_id: str, *, token: bool | str | None = None) WebhookInfo[source][source]

Get a webhook by its id.

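Parameters:
  • webhook_id (str) – The unique identifier of the webhook to get.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns: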

Info about the webhook.

Return type:

[WebhookInfo]

Example

>>> from huggingface_hub import get_webhook
>>> webhook = get_webhook("654bbbc16f2ec14d77f109cc")
>>> print(webhook)
WebhookInfo(
    id="654bbbc16f2ec14d77f109cc",
    watched=[WebhookWatchedItem(type="user", name="julien-c"), WebhookWatchedItem(type="org", name="HuggingFaceH4")],
    url="https://webhook.site/a2176e82-5720-43ee-9e06-f91cb4c91548",
    secret="my-secret",
    domains=["repo", "discussion"],
    disabled=False,
)
list_webhooks(*, token: bool | str | None = None) List[WebhookInfo][source][source]

List all configured webhooks.

Parameters:

token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

List of webhook info objects.

Return type:

List[WebhookInfo]

Example

>>> from huggingface_hub import list_webhooks
>>> webhooks = list_webhooks()
>>> len(webhooks)
2
>>> webhooks[0]
WebhookInfo(
    id="654bbbc16f2ec14d77f109cc",
    watched=[WebhookWatchedItem(type="user", name="julien-c"), WebhookWatchedItem(type="org", name="HuggingFaceH4")],
    url="https://webhook.site/a2176e82-5720-43ee-9e06-f91cb4c91548",
    secret="my-secret",
    domains=["repo", "discussion"],
    disabled=False,
)
create_webhook(*, url: str, watched: List[Dict | WebhookWatchedItem], domains: List[Literal['repo', 'discussions']] | None = None, secret: str | None = None, token: bool | str | None = None) WebhookInfo[source][source]

Create a new webhook.

Parameters:
  • url (str) – URL to send the payload to.

  • watched (List[WebhookWatchedItem]) – List of [WebhookWatchedItem] to be watched by the webhook. It can be users, orgs, models, datasets or spaces. Watched items can also be provided as plain dictionaries.

  • domains (List[Literal["repo", "discussion"]], optional) – List of domains to watch. It can be “repo”, “discussion” or both.

  • secret (str, optional) – A secret to sign the payload with.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

Info about the newly created webhook.

Return type:

[WebhookInfo]

Example

>>> from huggingface_hub import create_webhook
>>> payload = create_webhook(
...     watched=[{"type": "user", "name": "julien-c"}, {"type": "org", "name": "HuggingFaceH4"}],
...     url="https://webhook.site/a2176e82-5720-43ee-9e06-f91cb4c91548",
...     domains=["repo", "discussion"],
...     secret="my-secret",
... )
>>> print(payload)
WebhookInfo(
    id="654bbbc16f2ec14d77f109cc",
    url="https://webhook.site/a2176e82-5720-43ee-9e06-f91cb4c91548",
    watched=[WebhookWatchedItem(type="user", name="julien-c"), WebhookWatchedItem(type="org", name="HuggingFaceH4")],
    domains=["repo", "discussion"],
    secret="my-secret",
    disabled=False,
)
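
Since watched items can be passed as plain dictionaries, a small helper can build and sanity-check them before calling create_webhook. This is a minimal sketch: `build_watched` and `ALLOWED_TYPES` are hypothetical names, and only the "user" and "org" type strings appear in the example above; "model", "dataset" and "space" are assumed from the parameter description.

```python
from typing import Dict, List

# "user" and "org" come from the example above; the other three strings are
# an assumption based on the prose ("models, datasets or spaces").
ALLOWED_TYPES = {"user", "org", "model", "dataset", "space"}


def build_watched(items: List[str]) -> List[Dict[str, str]]:
    """Turn "type:name" strings into the plain-dict form of `watched`."""
    watched = []
    for item in items:
        # Split on the first colon only, so names may contain "/" or ":".
        kind, sep, name = item.partition(":")
        if not sep or kind not in ALLOWED_TYPES or not name:
            raise ValueError(f"expected '<type>:<name>', got {item!r}")
        watched.append({"type": kind, "name": name})
    return watched
```

The result of `build_watched(["user:julien-c", "org:HuggingFaceH4"])` has the same shape as the `watched` argument in the example above.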
update_webhook(webhook_id: str, *, url: str | None = None, watched: List[Dict | WebhookWatchedItem] | None = None, domains: List[Literal['repo', 'discussions']] | None = None, secret: str | None = None, token: bool | str | None = None) WebhookInfo[source][source]

Update an existing webhook.

Parameters:
  • webhook_id (str) – The unique identifier of the webhook to be updated.

  • url (str, optional) – The URL to which the payload will be sent.

  • watched (List[WebhookWatchedItem], optional) – List of items to watch. It can be users, orgs, models, datasets, or spaces. Refer to [WebhookWatchedItem] for more details. Watched items can also be provided as plain dictionaries.

  • domains (List[Literal["repo", "discussion"]], optional) – The domains to watch. This can include “repo”, “discussion”, or both.

  • secret (str, optional) – A secret to sign the payload with, providing an additional layer of security.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

Info about the updated webhook.

Return type:

[WebhookInfo]

Example

>>> from huggingface_hub import update_webhook
>>> updated_payload = update_webhook(
...     webhook_id="654bbbc16f2ec14d77f109cc",
...     url="https://new.webhook.site/a2176e82-5720-43ee-9e06-f91cb4c91548",
...     watched=[{"type": "user", "name": "julien-c"}, {"type": "org", "name": "HuggingFaceH4"}],
...     domains=["repo"],
...     secret="my-secret",
... )
>>> print(updated_payload)
WebhookInfo(
    id="654bbbc16f2ec14d77f109cc",
    url="https://new.webhook.site/a2176e82-5720-43ee-9e06-f91cb4c91548",
    watched=[WebhookWatchedItem(type="user", name="julien-c"), WebhookWatchedItem(type="org", name="HuggingFaceH4")],
    domains=["repo"],
    secret="my-secret",
    disabled=False,
)
enable_webhook(webhook_id: str, *, token: bool | str | None = None) WebhookInfo[source][source]

Enable a webhook (makes it “active”).

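Parameters:
  • webhook_id (str) – The unique identifier of the webhook to enable.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns: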

Info about the enabled webhook.

Return type:

[WebhookInfo]

Example

>>> from huggingface_hub import enable_webhook
>>> enabled_webhook = enable_webhook("654bbbc16f2ec14d77f109cc")
>>> enabled_webhook
WebhookInfo(
    id="654bbbc16f2ec14d77f109cc",
    url="https://webhook.site/a2176e82-5720-43ee-9e06-f91cb4c91548",
    watched=[WebhookWatchedItem(type="user", name="julien-c"), WebhookWatchedItem(type="org", name="HuggingFaceH4")],
    domains=["repo", "discussion"],
    secret="my-secret",
    disabled=False,
)
disable_webhook(webhook_id: str, *, token: bool | str | None = None) WebhookInfo[source][source]

Disable a webhook (makes it “disabled”).

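Parameters:
  • webhook_id (str) – The unique identifier of the webhook to disable.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns: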

Info about the disabled webhook.

Return type:

[WebhookInfo]

Example

>>> from huggingface_hub import disable_webhook
>>> disabled_webhook = disable_webhook("654bbbc16f2ec14d77f109cc")
>>> disabled_webhook
WebhookInfo(
    id="654bbbc16f2ec14d77f109cc",
    url="https://webhook.site/a2176e82-5720-43ee-9e06-f91cb4c91548",
    watched=[WebhookWatchedItem(type="user", name="julien-c"), WebhookWatchedItem(type="org", name="HuggingFaceH4")],
    domains=["repo", "discussion"],
    secret="my-secret",
    disabled=True,
)
delete_webhook(webhook_id: str, *, token: bool | str | None = None) None[source][source]

Delete a webhook.

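Parameters:
  • webhook_id (str) – The unique identifier of the webhook to delete.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns: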

None

Example

>>> from huggingface_hub import delete_webhook
>>> delete_webhook("654bbbc16f2ec14d77f109cc")
get_user_overview(username: str, token: str | bool | None = None) User[source][source]

Get an overview of a user on the Hub.

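Parameters:
  • username (str) – Username of the user to get an overview of.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns: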

A [User] object with the user’s overview.

Return type:

User

Raises:

[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 404 if the user does not exist on the Hub.

list_organization_members(organization: str, token: str | bool | None = None) Iterable[User][source][source]

List of members of an organization on the Hub.

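Parameters:
  • organization (str) – Name of the organization whose members are listed.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns: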

A list of [User] objects with the members of the organization.

Return type:

Iterable[User]

Raises:

[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 404 if the organization does not exist on the Hub.

list_user_followers(username: str, token: str | bool | None = None) Iterable[User][source][source]

Get the list of followers of a user on the Hub.

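Parameters:
  • username (str) – Username of the user whose followers are listed.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns: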

A list of [User] objects with the followers of the user.

Return type:

Iterable[User]

Raises:

[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 404 if the user does not exist on the Hub.

list_user_following(username: str, token: str | bool | None = None) Iterable[User][source][source]

Get the list of users followed by a user on the Hub.

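Parameters:
  • username (str) – Username of the user whose followed users are listed.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns: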

A list of [User] objects with the users followed by the user.

Return type:

Iterable[User]

Raises:

[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 404 if the user does not exist on the Hub.

list_papers(*, query: str | None = None, token: str | bool | None = None) Iterable[PaperInfo][source][source]

List daily papers on the Hugging Face Hub given a search query.

Parameters:
  • query (str, optional) – A search query string to find papers. If provided, returns papers that match the query.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

Returns:

An iterable of [huggingface_hub.hf_api.PaperInfo] objects.

Return type:

Iterable[PaperInfo]

Example

>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> # List all papers with "attention" in their title
>>> api.list_papers(query="attention")
paper_info(id: str) PaperInfo[source][source]

Get information for a paper on the Hub.

Parameters:

id (str) – ArXiv id of the paper.

Returns:

A PaperInfo object.

Return type:

PaperInfo

Raises:

[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 404 if the paper does not exist on the Hub.

auth_check(repo_id: str, *, repo_type: str | None = None, token: str | bool | None = None) None[source][source]

Check if the provided user token has access to a specific repository on the Hugging Face Hub.

This method verifies whether the user, authenticated via the provided token, has access to the specified repository. If the repository is not found or if the user lacks the required permissions to access it, the method raises an appropriate exception.

Parameters:
  • repo_id (str) – The repository to check for access. Format should be "user/repo_name". Example: "user/my-cool-model".

  • repo_type (str, optional) – The type of the repository. Should be one of "model", "dataset", or "space". If not specified, the default is "model".

  • token (Union[bool, str, None], optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.

Raises:
  • [RepositoryNotFoundError] – Raised if the repository does not exist, is private, or the user does not have access. This can occur if the repo_id or repo_type is incorrect or if the repository is private but the user is not authenticated.

  • [GatedRepoError] – Raised if the repository exists but is gated and the user is not authorized to access it.

Example

Check if the user has access to a repository:

>>> from huggingface_hub import auth_check
>>> from huggingface_hub.utils import GatedRepoError, RepositoryNotFoundError
>>> try:
...     auth_check("user/my-cool-model")
... except GatedRepoError:
...     # Handle gated repository error
...     print("You do not have permission to access this gated repository.")
... except RepositoryNotFoundError:
...     # Handle repository not found error
...     print("The repository was not found or you do not have access.")

In this example:
  • If the user has access, the method completes successfully.

  • If the repository is gated or does not exist, appropriate exceptions are raised, allowing the user to handle them accordingly.

run_job(*, image: str, command: List[str], env: Dict[str, Any] | None = None, secrets: Dict[str, Any] | None = None, flavor: SpaceHardware | None = None, timeout: int | float | str | None = None, namespace: str | None = None, token: str | bool | None = None) JobInfo[source][source]

Run compute Jobs on Hugging Face infrastructure.

Parameters:
  • image (str) – The Docker image to use. Examples: "ubuntu", "python:3.12", "pytorch/pytorch:2.6.0-cuda12.4-cudnn9-devel". Example with an image from a Space: "hf.co/spaces/lhoestq/duckdb".

  • command (List[str]) – The command to run. Example: ["echo", "hello"].

  • env (Dict[str, Any], optional) – Defines the environment variables for the Job.

  • secrets (Dict[str, Any], optional) – Defines the secret environment variables for the Job.

  • flavor (str, optional) – Flavor for the hardware, as in Hugging Face Spaces. See [SpaceHardware] for possible values. Defaults to "cpu-basic".

  • timeout (Union[int, float, str], optional) – Max duration for the Job: int/float with s (seconds, default), m (minutes), h (hours) or d (days). Example: 300 or "5m" for 5 minutes.

  • namespace (str, optional) – The namespace where the Job will be created. Defaults to the current user’s namespace.

  • token (Union[bool, str, None], optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.

Example

Run your first Job:

>>> from huggingface_hub import run_job
>>> run_job(image="python:3.12", command=["python", "-c", "print('Hello from HF compute!')"])

Run a GPU Job:

>>> from huggingface_hub import run_job
>>> image = "pytorch/pytorch:2.6.0-cuda12.4-cudnn9-devel"
>>> command = ["python", "-c", "import torch; print(f'This code ran with the following GPU: {torch.cuda.get_device_name()}')"]
>>> run_job(image=image, command=command, flavor="a10g-small")
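
The timeout parameter accepts either a number of seconds or a string with an s, m, h or d suffix. The sketch below shows one way such a value could be normalized to seconds on the client side. `timeout_to_seconds` is a hypothetical helper written for illustration; it is not part of huggingface_hub and may differ from how the Hub parses the value.

```python
from typing import Union

# Seconds per unit suffix, as listed in the timeout description above.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def timeout_to_seconds(timeout: Union[int, float, str]) -> float:
    """Convert 300, "300", "5m", "1.5h" or "2d" to a number of seconds."""
    if isinstance(timeout, (int, float)):
        return float(timeout)
    text = timeout.strip()
    if text and text[-1] in _UNIT_SECONDS:
        # Numeric part times the multiplier for the trailing unit letter.
        return float(text[:-1]) * _UNIT_SECONDS[text[-1]]
    # No suffix: seconds is the default unit.
    return float(text)
```

Both `timeout_to_seconds(300)` and `timeout_to_seconds("5m")` describe the same five-minute limit used in the parameter description.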
fetch_job_logs(*, job_id: str, namespace: str | None = None, token: str | bool | None = None) Iterable[str][source][source]

Fetch all the logs from a compute Job on Hugging Face infrastructure.

Parameters:
  • job_id (str) – ID of the Job.

  • namespace (str, optional) – The namespace where the Job is running. Defaults to the current user’s namespace.

  • token (Union[bool, str, None], optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.

Example

>>> from huggingface_hub import fetch_job_logs, run_job
>>> job = run_job(image="python:3.12", command=["python", "-c", "print('Hello from HF compute!')"])
>>> for log in fetch_job_logs(job_id=job.id):
...     print(log)
Hello from HF compute!
list_jobs(*, timeout: int | None = None, namespace: str | None = None, token: str | bool | None = None) List[JobInfo][source][source]

List compute Jobs on Hugging Face infrastructure.

Parameters:
  • timeout (int, optional) – Timeout for the request to the Hub.

  • namespace (str, optional) – The namespace from which to list the Jobs. Defaults to the current user’s namespace.

  • token (Union[bool, str, None], optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.

inspect_job(*, job_id: str, namespace: str | None = None, token: str | bool | None = None) JobInfo[source][source]

Inspect a compute Job on Hugging Face infrastructure.

Parameters:
  • job_id (str) – ID of the Job.

  • namespace (str, optional) – The namespace where the Job is running. Defaults to the current user’s namespace.

  • token (Union[bool, str, None], optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.

Example

>>> from huggingface_hub import inspect_job, run_job
>>> job = run_job(image="python:3.12", command=["python", "-c", "print('Hello from HF compute!')"])
>>> inspect_job(job_id=job.id)
JobInfo(
    id='68780d00bbe36d38803f645f',
    created_at=datetime.datetime(2025, 7, 16, 20, 35, 12, 808000, tzinfo=datetime.timezone.utc),
    docker_image='python:3.12',
    space_id=None,
    command=['python', '-c', "print('Hello from HF compute!')"],
    arguments=[],
    environment={},
    secrets={},
    flavor='cpu-basic',
    status=JobStatus(stage='RUNNING', message=None)
)
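
Because inspect_job returns the current status.stage of a Job, a caller can poll it until the Job leaves the running state. The sketch below keeps the polling logic independent of the Hub by taking a callable. `wait_for_stage` and its parameters are hypothetical, and every stage name other than 'RUNNING' (shown in the output above) is an assumption the caller must supply.

```python
import time
from typing import Callable, Iterable


def wait_for_stage(
    get_stage: Callable[[], str],
    terminal_stages: Iterable[str],
    poll_seconds: float = 5.0,
    max_polls: int = 120,
) -> str:
    """Poll `get_stage` until it returns a terminal stage or polls run out."""
    terminal = set(terminal_stages)
    stage = ""
    for attempt in range(max_polls):
        stage = get_stage()
        if stage in terminal:
            return stage
        # Sleep between polls, but not after the last one.
        if attempt < max_polls - 1:
            time.sleep(poll_seconds)
    raise TimeoutError(f"job still in stage {stage!r} after {max_polls} polls")
```

With the real client, `get_stage` could be something like `lambda: inspect_job(job_id=job.id).status.stage`, with `terminal_stages` set to whichever stage names your Jobs actually report on completion or failure.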
cancel_job(*, job_id: str, namespace: str | None = None, token: str | bool | None = None) None[source][source]

Cancel a compute Job on Hugging Face infrastructure.

Parameters:
  • job_id (str) – ID of the Job.

  • namespace (str, optional) – The namespace where the Job is running. Defaults to the current user’s namespace.

  • token (Union[bool, str, None], optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.

run_uv_job(script: str, *, script_args: List[str] | None = None, dependencies: List[str] | None = None, python: str | None = None, image: str | None = None, env: Dict[str, Any] | None = None, secrets: Dict[str, Any] | None = None, flavor: SpaceHardware | None = None, timeout: int | float | str | None = None, namespace: str | None = None, token: bool | str | None = None, _repo: str | None = None) JobInfo[source][source]

Run a UV script Job on Hugging Face infrastructure.

Parameters:
  • script (str) – Path or URL of the UV script, or a command.

  • script_args (List[str], optional) – Arguments to pass to the script or command.

  • dependencies (List[str], optional) – Dependencies to use to run the UV script.

  • python (str, optional) – Use a specific Python version. Default is 3.12.

  • image (str, optional, defaults to "ghcr.io/astral-sh/uv:python3.12-bookworm") – Use a custom Docker image with uv installed.

  • env (Dict[str, Any], optional) – Defines the environment variables for the Job.

  • secrets (Dict[str, Any], optional) – Defines the secret environment variables for the Job.

  • flavor (str, optional) – Flavor for the hardware, as in Hugging Face Spaces. See [SpaceHardware] for possible values. Defaults to "cpu-basic".

  • timeout (Union[int, float, str], optional) – Max duration for the Job: int/float with s (seconds, default), m (minutes), h (hours) or d (days). Example: 300 or "5m" for 5 minutes.

  • namespace (str, optional) – The namespace where the Job will be created. Defaults to the current user’s namespace.

  • token (Union[bool, str, None], optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.

Example

Run a script from a URL:

>>> from huggingface_hub import run_uv_job
>>> script = "https://raw.githubusercontent.com/huggingface/trl/refs/heads/main/trl/scripts/sft.py"
>>> script_args = ["--model_name_or_path", "Qwen/Qwen2-0.5B", "--dataset_name", "trl-lib/Capybara", "--push_to_hub"]
>>> run_uv_job(script, script_args=script_args, dependencies=["trl"], flavor="a10g-small")

Run a local script:

>>> from huggingface_hub import run_uv_job
>>> script = "my_sft.py"
>>> script_args = ["--model_name_or_path", "Qwen/Qwen2-0.5B", "--dataset_name", "trl-lib/Capybara", "--push_to_hub"]
>>> run_uv_job(script, script_args=script_args, dependencies=["trl"], flavor="a10g-small")

Run a command:

>>> from huggingface_hub import run_uv_job
>>> script = "lighteval"
>>> script_args = ["endpoint", "inference-providers", "model_name=openai/gpt-oss-20b,provider=auto", "lighteval|gsm8k|0|0"]
>>> run_uv_job(script, script_args=script_args, dependencies=["lighteval"], flavor="a10g-small")
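
The script_args lists in the examples above follow the usual `--name value` command-line pattern, with bare flags such as `--push_to_hub` for switches. As a convenience, such a list could be generated from a dictionary. `to_script_args` is a hypothetical helper sketched here for illustration and is not provided by huggingface_hub.

```python
from typing import Dict, List, Union


def to_script_args(options: Dict[str, Union[str, int, float, bool]]) -> List[str]:
    """Flatten {"name": value} pairs into ["--name", "value", ...]."""
    args: List[str] = []
    for name, value in options.items():
        flag = f"--{name}"
        if value is True:
            args.append(flag)  # boolean switch, e.g. --push_to_hub
        elif value is not False:
            args.extend([flag, str(value)])
        # A False value means the switch is simply omitted.
    return args
```

Calling it with `{"model_name_or_path": "Qwen/Qwen2-0.5B", "dataset_name": "trl-lib/Capybara", "push_to_hub": True}` reproduces the script_args list used in the first two examples.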
create_scheduled_job(*, image: str, command: List[str], schedule: str, suspend: bool | None = None, concurrency: bool | None = None, env: Dict[str, Any] | None = None, secrets: Dict[str, Any] | None = None, flavor: SpaceHardware | None = None, timeout: int | float | str | None = None, namespace: str | None = None, token: str | bool | None = None) ScheduledJobInfo[source][source]

Create scheduled compute Jobs on Hugging Face infrastructure.

Parameters:
  • image (str) – The Docker image to use. Examples: "ubuntu", "python:3.12", "pytorch/pytorch:2.6.0-cuda12.4-cudnn9-devel". Example with an image from a Space: "hf.co/spaces/lhoestq/duckdb".

  • command (List[str]) – The command to run. Example: ["echo", "hello"].

  • schedule (str) – One of “@annually”, “@yearly”, “@monthly”, “@weekly”, “@daily”, “@hourly”, or a CRON schedule expression (e.g., ‘0 9 * * 1’ for 9 AM every Monday).

  • suspend (bool, optional) – If True, the scheduled Job is suspended (paused). Defaults to False.

  • concurrency (bool, optional) – If True, multiple instances of this Job can run concurrently. Defaults to False.

  • env (Dict[str, Any], optional) – Defines the environment variables for the Job.

  • secrets (Dict[str, Any], optional) – Defines the secret environment variables for the Job.

  • flavor (str, optional) – Flavor for the hardware, as in Hugging Face Spaces. See [SpaceHardware] for possible values. Defaults to "cpu-basic".

  • timeout (Union[int, float, str], optional) – Max duration for the Job: int/float with s (seconds, default), m (minutes), h (hours) or d (days). Example: 300 or "5m" for 5 minutes.

  • namespace (str, optional) – The namespace where the Job will be created. Defaults to the current user’s namespace.

  • token (Union[bool, str, None], optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.

Example

Create your first scheduled Job:

>>> from huggingface_hub import create_scheduled_job
>>> create_scheduled_job(image="python:3.12", command=["python", "-c", "print('Hello from HF compute!')"], schedule="@hourly")

Use a CRON schedule expression:

>>> from huggingface_hub import create_scheduled_job
>>> create_scheduled_job(image="python:3.12", command=["python", "-c", "print('this runs every 5min')"], schedule="*/5 * * * *")

Create a scheduled GPU Job:

>>> from huggingface_hub import create_scheduled_job
>>> image = "pytorch/pytorch:2.6.0-cuda12.4-cudnn9-devel"
>>> command = ["python", "-c", "import torch; print(f'This code ran with the following GPU: {torch.cuda.get_device_name()}')"]
>>> create_scheduled_job(image=image, command=command, flavor="a10g-small", schedule="@hourly")
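
The schedule parameter takes either one of the listed shortcuts or a CRON expression. A quick shape check before submitting can catch typos early. The sketch below assumes the standard five-field CRON layout used in the examples above and does not validate the individual fields; `looks_like_schedule` and `SHORTCUTS` are hypothetical names.

```python
# Shortcuts listed in the schedule parameter description above.
SHORTCUTS = {"@annually", "@yearly", "@monthly", "@weekly", "@daily", "@hourly"}


def looks_like_schedule(schedule: str) -> bool:
    """Shape check only: a known shortcut, or five whitespace-separated fields."""
    if schedule in SHORTCUTS:
        return True
    if schedule.startswith("@"):
        # An "@..." string that is not a listed shortcut.
        return False
    return len(schedule.split()) == 5
```

Both `"0 9 * * 1"` and `"*/5 * * * *"` pass this check, while a misspelled shortcut or a four-field expression does not.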
list_scheduled_jobs(*, timeout: int | None = None, namespace: str | None = None, token: str | bool | None = None) List[ScheduledJobInfo][source][source]

List scheduled compute Jobs on Hugging Face infrastructure.

Parameters:
  • timeout (int, optional) – Timeout for the request to the Hub.

  • namespace (str, optional) – The namespace from which to list the scheduled Jobs. Defaults to the current user’s namespace.

  • token (Union[bool, str, None], optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.

inspect_scheduled_job(*, scheduled_job_id: str, namespace: str | None = None, token: str | bool | None = None) ScheduledJobInfo[source][source]

Inspect a scheduled compute Job on Hugging Face infrastructure.

Parameters:
  • scheduled_job_id (str) – ID of the scheduled Job.

  • namespace (str, optional) – The namespace where the scheduled Job is. Defaults to the current user’s namespace.

  • token (Union[bool, str, None], optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.

Example

>>> from huggingface_hub import inspect_scheduled_job, create_scheduled_job
>>> scheduled_job = create_scheduled_job(image="python:3.12", command=["python", "-c", "print('Hello from HF compute!')"], schedule="@hourly")
>>> inspect_scheduled_job(scheduled_job_id=scheduled_job.id)
delete_scheduled_job(*, scheduled_job_id: str, namespace: str | None = None, token: str | bool | None = None) None[source][source]

Delete a scheduled compute Job on Hugging Face infrastructure.

Parameters:
  • scheduled_job_id (str) – ID of the scheduled Job.

  • namespace (str, optional) – The namespace where the scheduled Job is. Defaults to the current user’s namespace.

  • token (Union[bool, str, None], optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.

suspend_scheduled_job(*, scheduled_job_id: str, namespace: str | None = None, token: str | bool | None = None) None[source][source]

Suspend (pause) a scheduled compute Job on Hugging Face infrastructure.

Parameters:
  • scheduled_job_id (str) – ID of the scheduled Job.

  • namespace (str, optional) – The namespace where the scheduled Job is. Defaults to the current user’s namespace.

  • token (Union[bool, str, None], optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.

resume_scheduled_job(*, scheduled_job_id: str, namespace: str | None = None, token: str | bool | None = None) None[source][source]

Resume (unpause) a scheduled compute Job on Hugging Face infrastructure.

Parameters:
  • scheduled_job_id (str) – ID of the scheduled Job.

  • namespace (str, optional) – The namespace where the scheduled Job is. Defaults to the current user’s namespace.

  • token (Union[bool, str, None], optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.
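
The listing, suspend, and resume calls above can be combined. As a rough sketch, the helper below pauses every scheduled Job in a namespace. The name suspend_all_scheduled_jobs and the injectable list_fn / suspend_fn parameters are illustrative assumptions, not part of huggingface_hub; they exist so the logic can be exercised without network access. When they are omitted, the helper falls back to the functions documented in this section.

```python
from typing import Callable, List, Optional


def suspend_all_scheduled_jobs(
    namespace: Optional[str] = None,
    *,
    list_fn: Optional[Callable] = None,     # hypothetical hook, for testing
    suspend_fn: Optional[Callable] = None,  # hypothetical hook, for testing
) -> List[str]:
    """Suspend every scheduled Job in `namespace` and return their IDs."""
    if list_fn is None or suspend_fn is None:
        # Imported lazily so the helper can also run with stand-in callables.
        from huggingface_hub import list_scheduled_jobs, suspend_scheduled_job

        list_fn = list_fn or list_scheduled_jobs
        suspend_fn = suspend_fn or suspend_scheduled_job

    suspended = []
    for job in list_fn(namespace=namespace):
        # Both calls take keyword-only arguments, per the signatures above.
        suspend_fn(scheduled_job_id=job.id, namespace=namespace)
        suspended.append(job.id)
    return suspended
```

Calling resume_scheduled_job with the returned IDs would undo the operation.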

create_scheduled_uv_job(script: str, *, script_args: List[str] | None = None, schedule: str, suspend: bool | None = None, concurrency: bool | None = None, dependencies: List[str] | None = None, python: str | None = None, image: str | None = None, env: Dict[str, Any] | None = None, secrets: Dict[str, Any] | None = None, flavor: SpaceHardware | None = None, timeout: int | float | str | None = None, namespace: str | None = None, token: bool | str | None = None, _repo: str | None = None) ScheduledJobInfo[source][source]

Create a scheduled UV script Job on Hugging Face infrastructure.

Parameters:
  • script (str) – Path or URL of the UV script, or a command.

  • script_args (List[str], optional) – Arguments to pass to the script, or a command.

  • schedule (str) – One of “@annually”, “@yearly”, “@monthly”, “@weekly”, “@daily”, “@hourly”, or a CRON schedule expression (e.g., ‘0 9 * * 1’ for 9 AM every Monday).

  • suspend (bool, optional) – If True, the scheduled Job is suspended (paused). Defaults to False.

  • concurrency (bool, optional) – If True, multiple instances of this Job can run concurrently. Defaults to False.

  • dependencies (List[str], optional) – Dependencies to use to run the UV script.

  • python (str, optional) – Use a specific Python version. Default is 3.12.

  • image (str, optional, defaults to "ghcr.io/astral-sh/uv:python3.12-bookworm") – Use a custom Docker image with uv installed.

  • env (Dict[str, Any], optional) – Defines the environment variables for the Job.

  • secrets (Dict[str, Any], optional) – Defines the secret environment variables for the Job.

  • flavor (str, optional) – Flavor for the hardware, as in Hugging Face Spaces. See [SpaceHardware] for possible values. Defaults to "cpu-basic".

  • timeout (Union[int, float, str], optional) – Max duration for the Job: int/float with s (seconds, default), m (minutes), h (hours) or d (days). Example: 300 or "5m" for 5 minutes.

  • namespace (str, optional) – The namespace where the Job will be created. Defaults to the current user’s namespace.

  • token (Union[bool, str, None], optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.
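
The schedule and timeout formats described in the parameter list can be checked on the client before a Job is submitted. The sketch below is only an approximation built from the descriptions above: the helper names are hypothetical, the CRON check merely counts five fields, and the Hub performs its own validation, which may differ.

```python
import re

# Shortcuts listed in the `schedule` parameter description.
SHORTCUTS = {"@annually", "@yearly", "@monthly", "@weekly", "@daily", "@hourly"}
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def looks_like_valid_schedule(schedule: str) -> bool:
    """Accept a documented shortcut or a five-field CRON-style expression."""
    if schedule in SHORTCUTS:
        return True
    # Shallow check: five whitespace-separated fields, no per-field validation.
    return len(schedule.split()) == 5


def timeout_to_seconds(timeout) -> float:
    """Convert 300, 300.0 or a string such as "5m" into seconds."""
    if isinstance(timeout, (int, float)):
        return float(timeout)  # bare numbers are seconds
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([smhd]?)\s*", timeout)
    if match is None:
        raise ValueError(f"Unrecognized timeout: {timeout!r}")
    value, unit = match.groups()
    return float(value) * _UNIT_SECONDS[unit or "s"]
```

For instance, timeout_to_seconds("5m") and timeout_to_seconds(300) both describe the five-minute limit used as the example in the timeout description.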

Example

Schedule a script from a URL:

>>> from huggingface_hub import create_scheduled_uv_job
>>> script = "https://raw.githubusercontent.com/huggingface/trl/refs/heads/main/trl/scripts/sft.py"
>>> script_args = ["--model_name_or_path", "Qwen/Qwen2-0.5B", "--dataset_name", "trl-lib/Capybara", "--push_to_hub"]
>>> create_scheduled_uv_job(script, script_args=script_args, dependencies=["trl"], flavor="a10g-small", schedule="@weekly")

Schedule a local script:

>>> from huggingface_hub import create_scheduled_uv_job
>>> script = "my_sft.py"
>>> script_args = ["--model_name_or_path", "Qwen/Qwen2-0.5B", "--dataset_name", "trl-lib/Capybara", "--push_to_hub"]
>>> create_scheduled_uv_job(script, script_args=script_args, dependencies=["trl"], flavor="a10g-small", schedule="@weekly")

Schedule a command:

>>> from huggingface_hub import create_scheduled_uv_job
>>> script = "lighteval"
>>> script_args = ["endpoint", "inference-providers", "model_name=openai/gpt-oss-20b,provider=auto", "lighteval|gsm8k|0|0"]
>>> create_scheduled_uv_job(script, script_args=script_args, dependencies=["lighteval"], flavor="a10g-small", schedule="@weekly")

tooluniverse.embedding_sync.upload_folder(*, repo_id: str, folder_path: str | Path, path_in_repo: str | None = None, commit_message: str | None = None, commit_description: str | None = None, token: str | bool | None = None, repo_type: str | None = None, revision: str | None = None, create_pr: bool | None = None, parent_commit: str | None = None, allow_patterns: str | List[str] | None = None, ignore_patterns: str | List[str] | None = None, delete_patterns: str | List[str] | None = None, run_as_future: bool = False) CommitInfo | Future[CommitInfo][source]

Upload a local folder to the given repo. The upload is done through HTTP requests and doesn’t require git or git-lfs to be installed.

The structure of the folder will be preserved. Files with the same name already present in the repository will be overwritten. Others will be left untouched.

Use the allow_patterns and ignore_patterns arguments to specify which files to upload. These parameters accept either a single pattern or a list of patterns. Patterns are Standard Wildcards (globbing patterns) as documented [here](https://tldp.org/LDP/GNU-Linux-Tools-Summary/html/x11655.htm). If both allow_patterns and ignore_patterns are provided, both constraints apply. By default, all files from the folder are uploaded.
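
To make the combination of allow_patterns and ignore_patterns concrete, here is a minimal sketch using the standard library's fnmatch module. The select_files helper is hypothetical and only approximates the filtering described above; the library's own matching may differ in edge cases.

```python
from fnmatch import fnmatch
from typing import Iterable, List, Optional, Union

Patterns = Optional[Union[str, List[str]]]


def _as_list(patterns: Patterns) -> Optional[List[str]]:
    # Both a single pattern and a list of patterns are accepted.
    if isinstance(patterns, str):
        return [patterns]
    return patterns


def select_files(
    paths: Iterable[str],
    allow_patterns: Patterns = None,
    ignore_patterns: Patterns = None,
) -> List[str]:
    """Keep paths matching at least one allow pattern and no ignore pattern."""
    allow = _as_list(allow_patterns)
    ignore = _as_list(ignore_patterns)
    selected = []
    for path in paths:
        if allow is not None and not any(fnmatch(path, p) for p in allow):
            continue  # fails the allow constraint
        if ignore is not None and any(fnmatch(path, p) for p in ignore):
            continue  # hits the ignore constraint
        selected.append(path)
    return selected  # with no patterns at all, every path is kept
```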

Use the delete_patterns argument to specify remote files you want to delete. Input type is the same as for allow_patterns (see above). If path_in_repo is also provided, the patterns are matched against paths relative to this folder. For example, upload_folder(..., path_in_repo="experiment", delete_patterns="logs/*") will delete any remote file under ./experiment/logs/. Note that the .gitattributes file will not be deleted even if it matches the patterns.
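
The interaction between path_in_repo and delete_patterns can be sketched as follows. This is an illustrative approximation of the behaviour described above, with a hypothetical files_to_delete helper; it is not the library's implementation.

```python
from fnmatch import fnmatch
from typing import Iterable, List


def files_to_delete(
    remote_files: Iterable[str],
    path_in_repo: str,
    delete_patterns: List[str],
) -> List[str]:
    """Return the remote files that the patterns would select for deletion."""
    folder = path_in_repo.strip("/")
    prefix = f"{folder}/" if folder else ""
    doomed = []
    for remote in remote_files:
        if not remote.startswith(prefix):
            continue  # outside the target folder, so never considered
        relative = remote[len(prefix):]  # patterns are matched on this path
        if relative == ".gitattributes":
            continue  # documented exception: never deleted
        if any(fnmatch(relative, pattern) for pattern in delete_patterns):
            doomed.append(remote)
    return doomed
```

With path_in_repo="experiment" and delete_patterns=["logs/*"], only files under experiment/logs/ are selected, matching the example in the paragraph above.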

Any .git/ folder present in any subdirectory will be ignored. However, please be aware that the .gitignore file is not taken into account.

Uses HfApi.create_commit under the hood.

Parameters:
  • repo_id (str) – The repository to which the file will be uploaded, for example: "username/custom_transformers"

  • folder_path (str or Path) – Path to the folder to upload on the local file system

  • path_in_repo (str, optional) – Relative path of the directory in the repo, for example: "checkpoints/1fec34a/results". Will default to the root folder of the repository.

  • token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.

  • repo_type (str, optional) – Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.

  • revision (str, optional) – The git revision to commit from. Defaults to the head of the "main" branch.

  • commit_message (str, optional) – The summary / title / first line of the generated commit. Defaults to: f"Upload {path_in_repo} with huggingface_hub"

  • commit_description (str, optional) – The description of the generated commit.

  • create_pr (boolean, optional) – Whether or not to create a Pull Request with that commit. Defaults to False. If revision is not set, PR is opened against the "main" branch. If revision is set and is a branch, PR is opened against this branch. If revision is set and is not a branch name (example: a commit oid), a RevisionNotFoundError is returned by the server.

  • parent_commit (str, optional) – The OID / SHA of the parent commit, as a hexadecimal string. Shorthands (7 first characters) are also supported. If specified and create_pr is False, the commit will fail if revision does not point to parent_commit. If specified and create_pr is True, the pull request will be created from parent_commit. Specifying parent_commit ensures the repo has not changed before committing the changes, and can be especially useful if the repo is updated / committed to concurrently.

  • allow_patterns (List[str] or str, optional) – If provided, only files matching at least one pattern are uploaded.

  • ignore_patterns (List[str] or str, optional) – If provided, files matching any of the patterns are not uploaded.

  • delete_patterns (List[str] or str, optional) – If provided, remote files matching any of the patterns will be deleted from the repo while committing new files. This is useful if you don’t know which files have already been uploaded. Note: to avoid discrepancies the .gitattributes file is not deleted even if it matches the pattern.

  • run_as_future (bool, optional) – Whether or not to run this method in the background. Background jobs are run sequentially without blocking the main thread. Passing run_as_future=True will return a [Future](https://docs.python.org/3/library/concurrent.futures.html#future-objects) object. Defaults to False.

Returns:

Instance of [CommitInfo] containing information about the newly created commit (commit hash, commit url, pr url, commit message,…). If run_as_future=True is passed, returns a Future object which will contain the result when executed.

Return type:

[CommitInfo] or Future

<Tip>

Raises the following errors:

  • [HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) – if the HuggingFace API returned an error.

  • [ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) – if some parameter value is invalid.

</Tip>

<Tip warning={true}>

upload_folder assumes that the repo already exists on the Hub. If you get a Client error 404, please make sure you are authenticated and that repo_id and repo_type are set correctly. If repo does not exist, create it first using [~hf_api.create_repo].

</Tip>
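
A minimal sketch of the "create the repo first" advice, assuming a dataset repository. The create_then_upload name and the injectable create_repo_fn / upload_folder_fn parameters are hypothetical, added only so the flow can be tested offline; by default the helper uses create_repo and upload_folder from huggingface_hub.

```python
def create_then_upload(
    repo_id,
    folder_path,
    *,
    repo_type="dataset",    # assumption: embedding databases live in a dataset repo
    create_repo_fn=None,    # hypothetical hook, for testing
    upload_folder_fn=None,  # hypothetical hook, for testing
):
    """Make sure the repo exists, then upload the folder to it."""
    if create_repo_fn is None or upload_folder_fn is None:
        # Imported lazily so the helper can also run with stand-in callables.
        from huggingface_hub import create_repo, upload_folder

        create_repo_fn = create_repo_fn or create_repo
        upload_folder_fn = upload_folder_fn or upload_folder

    # exist_ok=True avoids an error when the repo is already there.
    create_repo_fn(repo_id, repo_type=repo_type, exist_ok=True)
    return upload_folder_fn(
        repo_id=repo_id, folder_path=folder_path, repo_type=repo_type
    )
```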

<Tip>

When dealing with a large folder (thousands of files or hundreds of GB), we recommend using [~hf_api.upload_large_folder] instead.

</Tip>

Example:

# Upload checkpoints folder except the log files
>>> upload_folder(
...     folder_path="local/checkpoints",
...     path_in_repo="remote/experiment/checkpoints",
...     repo_id="username/my-dataset",
...     repo_type="dataset",
...     token="my_token",
...     ignore_patterns="**/logs/*.txt",
... )
# "https://huggingface.co/datasets/username/my-dataset/tree/main/remote/experiment/checkpoints"

# Upload checkpoints folder including logs while deleting existing logs from the repo
# Useful if you don't know exactly which log files have already been pushed
>>> upload_folder(
...     folder_path="local/checkpoints",
...     path_in_repo="remote/experiment/checkpoints",
...     repo_id="username/my-dataset",
...     repo_type="dataset",
...     token="my_token",
...     delete_patterns="**/logs/*.txt",
... )
# "https://huggingface.co/datasets/username/my-dataset/tree/main/remote/experiment/checkpoints"

# Upload checkpoints folder while creating a PR
>>> upload_folder(
...     folder_path="local/checkpoints",
...     path_in_repo="remote/experiment/checkpoints",
...     repo_id="username/my-dataset",
...     repo_type="dataset",
...     token="my_token",
...     create_pr=True,
... )
# "https://huggingface.co/datasets/username/my-dataset/tree/refs%2Fpr%2F1/remote/experiment/checkpoints"

tooluniverse.embedding_sync.snapshot_download(repo_id: str, *, repo_type: str | None = None, revision: str | None = None, cache_dir: str | Path | None = None, local_dir: str | Path | None = None, library_name: str | None = None, library_version: str | None = None, user_agent: Dict | str | None = None, proxies: Dict | None = None, etag_timeout: float = 10, force_download: bool = False, token: bool | str | None = None, local_files_only: bool = False, allow_patterns: List[str] | str | None = None, ignore_patterns: List[str] | str | None = None, max_workers: int = 8, tqdm_class: Type[tqdm_asyncio] | None = None, headers: Dict[str, str] | None = None, endpoint: str | None = None, local_dir_use_symlinks: bool | Literal['auto'] = 'auto', resume_download: bool | None = None) str[source][source]

Download repo files.

Download a whole snapshot of a repo’s files at the specified revision. This is useful when you want all files from a repo, because you don’t know which ones you will need a priori. All files are nested inside a folder in order to keep their actual filename relative to that folder. You can also filter which files to download using allow_patterns and ignore_patterns.

If local_dir is provided, the file structure from the repo will be replicated in this location. When using this option, the cache_dir will not be used and a .cache/huggingface/ folder will be created at the root of local_dir to store some metadata related to the downloaded files. While this mechanism is not as robust as the main cache-system, it’s optimized for regularly pulling the latest version of a repository.
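
As an illustration of the local_dir option combined with file filtering, the sketch below downloads selected files of a repo into a chosen directory. The download_database name, the default file patterns, the "dataset" repo type, and the download_fn hook are all assumptions made for the example; only the snapshot_download keyword arguments come from the signature above.

```python
from pathlib import Path


def download_database(
    repo_id,
    local_dir,
    *,
    patterns=("*.db", "*.json"),  # hypothetical file layout
    download_fn=None,             # hypothetical hook, for testing
) -> Path:
    """Replicate the matching repo files under `local_dir` and return the folder."""
    if download_fn is None:
        # Imported lazily so the helper can also run with a stand-in callable.
        from huggingface_hub import snapshot_download

        download_fn = snapshot_download

    folder = download_fn(
        repo_id=repo_id,
        repo_type="dataset",  # assumption: the database is stored as a dataset
        local_dir=str(local_dir),
        allow_patterns=list(patterns),
    )
    return Path(folder)  # snapshot_download returns the folder path as a str
```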

An alternative would be to clone the repo but this requires git and git-lfs to be installed and properly configured. It is also not possible to filter which files to download when cloning a repository using git.

Parameters:
  • repo_id (str) – A user or an organization name and a repo name separated by a /.

  • repo_type (str, optional) – Set to "dataset" or "space" if downloading from a dataset or space, None or "model" if downloading from a model. Default is None.

  • revision (str, optional) – An optional Git revision id which can be a branch name, a tag, or a commit hash.

  • cache_dir (str, Path, optional) – Path to the folder where cached files are stored.

  • local_dir (str or Path, optional) – If provided, the downloaded files will be placed under this directory.

  • library_name (str, optional) – The name of the library to which the object corresponds.

  • library_version (str, optional) – The version of the library.

  • user_agent (str, dict, optional) – The user-agent info in the form of a dictionary or a string.

  • proxies (dict, optional) – Dictionary mapping protocol to the URL of the proxy passed to requests.request.

  • etag_timeout (float, optional, defaults to 10) – When fetching the ETag, the number of seconds to wait for the server to send data before giving up. This value is passed to requests.request.

  • force_download (bool, optional, defaults to False) – Whether the file should be downloaded even if it already exists in the local cache.

  • token (str, bool, optional) –

    A token to be used for the download.
    • If True, the token is read from the HuggingFace config folder.

    • If a string, it’s used as the authentication token.

  • headers (dict, optional) – Additional headers to include in the request. Those headers take precedence over the others.

  • local_files_only (bool, optional, defaults to False) – If True, avoid downloading the file and return the path to the local cached file if it exists.

  • allow_patterns (List[str] or str, optional) – If provided, only files matching at least one pattern are downloaded.

  • ignore_patterns (List[str] or str, optional) – If provided, files matching any of the patterns are not downloaded.

  • max_workers (int, optional) – Number of concurrent threads to download files (1 thread = 1 file download). Defaults to 8.

  • tqdm_class (tqdm, optional) – If provided, overwrites the default behavior for the progress bar. Passed argument must inherit from tqdm.auto.tqdm or at least mimic its behavior. Note that the tqdm_class is not passed to each individual download. Defaults to the custom HF progress bar that can be disabled by setting HF_HUB_DISABLE_PROGRESS_BARS environment variable.

Returns:

folder path of the repo snapshot.

Return type:

str

Raises:
  • [RepositoryNotFoundError] – If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.

  • [RevisionNotFoundError] – If the revision to download from cannot be found.

  • [EnvironmentError](https://docs.python.org/3/library/exceptions.html#EnvironmentError) – If token=True and the token cannot be found.

  • [OSError](https://docs.python.org/3/library/exceptions.html#OSError) – If the ETag cannot be determined.

  • [ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) – If some parameter value is invalid.

exception tooluniverse.embedding_sync.HfHubHTTPError(message: str, response: Response | None = None, *, server_message: str | None = None)[source][source]

Bases: HTTPError

HTTPError to inherit from for any custom HTTP Error raised in HF Hub.

Any HTTPError is converted at least into a HfHubHTTPError. If some information is sent back by the server, it will be added to the error message.

Added details:

  • Request id from the “X-Request-Id” header if it exists. If not, falls back to the “X-Amzn-Trace-Id” header if it exists.

  • Server error message from the “X-Error-Message” header.

  • Server error message if one can be found in the response body.
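
The request id fallback described above can be sketched in a few lines. The pick_request_id helper is hypothetical and only mirrors the documented header order; it is not the library's code.

```python
from typing import Mapping, Optional


def pick_request_id(headers: Mapping[str, str]) -> Optional[str]:
    """Prefer X-Request-Id, fall back to X-Amzn-Trace-Id, else None."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return lowered.get("x-request-id") or lowered.get("x-amzn-trace-id")
```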

Example:

```py
import requests
from huggingface_hub.utils import get_session, hf_raise_for_status, HfHubHTTPError

response = get_session().post(...)
try:
    hf_raise_for_status(response)
except HfHubHTTPError as e:
    print(str(e))  # formatted message
    e.request_id, e.server_message  # details returned by server

    # Complete the error message with additional information once it's raised
    e.append_to_message("\ncreate_commit expects the repository to exist.")
    raise
```

__init__(message: str, response: Response | None = None, *, server_message: str | None = None)[source][source]

Initialize RequestException with request and response objects.

append_to_message(additional_message: str) None[source][source]

Append additional information to the HfHubHTTPError initial message.

class tooluniverse.embedding_sync.BaseTool(tool_config)[source][source]

Bases: object

__init__(tool_config)[source][source]

classmethod get_default_config_file()[source][source]

Get the path to the default configuration file for this tool type.

This method uses a robust path resolution strategy that works across different installation scenarios:

  1. Installed packages: Uses importlib.resources for proper package resource access

  2. Development mode: Falls back to file-based path resolution

  3. Legacy Python: Handles importlib.resources and importlib_resources
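
The three-step resolution strategy could look roughly like the sketch below. This is not ToolUniverse's implementation: find_default_config, its parameters, and the fallback_dir argument are hypothetical names chosen for illustration.

```python
from pathlib import Path

try:
    from importlib.resources import files  # available on Python 3.9+
except ImportError:
    # Legacy interpreters: assumes the importlib_resources backport is installed.
    from importlib_resources import files


def find_default_config(package: str, filename: str, fallback_dir: Path):
    """Look for `filename` inside `package`, else fall back to a plain path."""
    try:
        # Installed packages: proper package resource access.
        candidate = files(package).joinpath(filename)
        if candidate.is_file():
            return candidate
    except (ModuleNotFoundError, TypeError):
        pass  # package not importable, or not a regular package
    # Development mode: file-based path resolution.
    return Path(fallback_dir) / filename
```

A subclass overriding get_default_config_file could return a path of its own instead of relying on such a lookup.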

Override this method in subclasses to specify a custom defaults file.

Returns:

Path or resource object pointing to the defaults file

classmethod load_defaults_from_file()[source][source]

Load defaults from the configuration file

run(arguments=None)[source][source]

Execute the tool.

The default BaseTool implementation accepts an optional arguments mapping to align with most concrete tool implementations which expect a dictionary of inputs.

check_function_call(function_call_json)[source][source]

get_required_parameters()[source][source]

Retrieve required parameters from the endpoint definition.

Returns:

List of required parameters for the given endpoint.

Return type:

list

tooluniverse.embedding_sync.register_tool(tool_type_name=None, config=None)[source][source]

Decorator to automatically register tool classes and their configs.

Usage:

    @register_tool('CustomToolName', config={...})
    class MyTool:
        pass

tooluniverse.embedding_sync.get_logger(name: str | None = None) Logger[source][source]

Get a logger instance

Parameters:

name (str, optional) – Logger name (usually __name__)

Returns:

Logger instance

Return type:

logging.Logger

class tooluniverse.embedding_sync.EmbeddingSync(tool_config)[source][source]

Bases: BaseTool

Sync embedding databases with HuggingFace Hub. Supports uploading local databases and downloading shared databases.

__init__(tool_config)[source][source]

run(arguments)[source][source]

Main entry point for the tool