tooluniverse.embedding_sync module¶
Embedding Sync Tool for ToolUniverse
Synchronize embedding databases with HuggingFace Hub for sharing and collaboration. Supports uploading local databases to HuggingFace and downloading databases from HuggingFace.
- class tooluniverse.embedding_sync.Path(*args, **kwargs)[source][source]¶
Bases:
PurePath
PurePath subclass that can make system calls.
Path represents a filesystem path but unlike PurePath, also offers methods to do system calls on path objects. Depending on your system, instantiating a Path will return either a PosixPath or a WindowsPath object. You can also instantiate a PosixPath or WindowsPath directly, but cannot instantiate a WindowsPath on a POSIX system or vice versa.
- classmethod cwd()[source][source]¶
Return a new path pointing to the current working directory (as returned by os.getcwd()).
- classmethod home()[source][source]¶
Return a new path pointing to the user’s home directory (as returned by os.path.expanduser('~')).
- samefile(other_path)[source][source]¶
Return whether other_path refers to the same file as this path (as determined by os.path.samefile()).
- iterdir()[source][source]¶
Iterate over the files in this directory. Does not yield any result for the special paths ‘.’ and ‘..’.
- glob(pattern)[source][source]¶
Iterate over this subtree and yield all existing files (of any kind, including directories) matching the given relative pattern.
- rglob(pattern)[source][source]¶
Recursively yield all existing files (of any kind, including directories) matching the given relative pattern, anywhere in this subtree.
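The difference between glob() and rglob() can be sketched as follows. This is a minimal illustration in a temporary directory; the helper name find_files and the file names are hypothetical and not part of tooluniverse.

```python
import tempfile
from pathlib import Path
from typing import List


def find_files(root: Path, pattern: str, recursive: bool = False) -> List[str]:
    """Return sorted POSIX-style paths, relative to root, that match pattern."""
    matches = root.rglob(pattern) if recursive else root.glob(pattern)
    return sorted(p.relative_to(root).as_posix() for p in matches)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "nested").mkdir()
    (root / "top.db").touch()
    (root / "nested" / "deep.db").touch()

    # glob() only sees the direct children that match the pattern.
    top_level = find_files(root, "*.db")
    # rglob() also descends into subdirectories.
    everywhere = find_files(root, "*.db", recursive=True)

print(top_level)
print(everywhere)
```

Both methods return lazy iterators, so the helper materializes and sorts the results to get a stable order.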
- absolute()[source][source]¶
Return an absolute version of this path. This function works even if the path doesn’t point to anything.
No normalization is done, i.e. all ‘.’ and ‘..’ components are kept. Use resolve() to get the canonical path to a file.
- resolve(strict=False)[source][source]¶
Make the path absolute, resolving all symlinks on the way and also normalizing it (for example turning slashes into backslashes under Windows).
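A small sketch of how absolute() and resolve() differ on a path that contains ‘..’. The path used here is hypothetical and does not need to exist, since resolve() is called with its default strict=False.

```python
from pathlib import Path

relative = Path("data") / ".." / "embeddings.db"  # hypothetical file name

# absolute() only anchors the path at the current working directory.
absolute_path = relative.absolute()

# resolve() also collapses ".." and follows symlinks along the way.
resolved_path = relative.resolve()

print(absolute_path)
print(resolved_path)
```

The first path still contains the ‘..’ component, while the second is the canonical form.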
- stat(*, follow_symlinks=True)[source][source]¶
Return the result of the stat() system call on this path, like os.stat() does.
- open(mode='r', buffering=-1, encoding=None, errors=None, newline=None)[source][source]¶
Open the file pointed to by this path and return a file object, as the built-in open() function does.
- read_text(encoding=None, errors=None)[source][source]¶
Open the file in text mode, read it, and close the file.
- write_text(data, encoding=None, errors=None, newline=None)[source][source]¶
Open the file in text mode, write to it, and close the file.
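As an illustration, write_text() and read_text() can round-trip a small JSON document without an explicit open()/close() pair. The file name and the metadata values below are made up for the example.

```python
import json
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    meta_path = Path(tmp) / "metadata.json"  # hypothetical file name
    metadata = {"name": "demo-db", "dimension": 384}  # made-up values

    # write_text() opens, writes and closes; it returns the character count.
    written = meta_path.write_text(json.dumps(metadata), encoding="utf-8")

    # read_text() opens, reads the whole file and closes it again.
    loaded = json.loads(meta_path.read_text(encoding="utf-8"))

print(written, loaded)
```

Passing encoding explicitly avoids depending on the platform default.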
- touch(mode=438, exist_ok=True)[source][source]¶
Create this file with the given access mode, if it doesn’t exist.
- mkdir(mode=511, parents=False, exist_ok=False)[source][source]¶
Create a new directory at this given path.
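The decimal defaults shown in the signatures of touch() and mkdir() are the usual octal permission modes. The sketch below checks that and shows how parents and exist_ok behave; the directory layout is hypothetical.

```python
import tempfile
from pathlib import Path

# 438 and 511 are the decimal spellings of the octal modes 0o666 and 0o777.
assert 438 == 0o666 and 511 == 0o777

with tempfile.TemporaryDirectory() as tmp:
    db_dir = Path(tmp) / "cache" / "embeddings"  # hypothetical layout

    # parents=True also creates "cache"; exist_ok=True makes the call repeatable.
    db_dir.mkdir(parents=True, exist_ok=True)
    db_dir.mkdir(parents=True, exist_ok=True)

    marker = db_dir / ".synced"
    marker.touch()  # creates the empty file
    marker.touch()  # exist_ok=True by default, so this does not raise

    try:
        db_dir.mkdir()  # exist_ok defaults to False for mkdir()
        raised = False
    except FileExistsError:
        raised = True
    created = marker.is_file()

print(created, raised)
```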
- chmod(mode, *, follow_symlinks=True)[source][source]¶
Change the permissions of the path, like os.chmod().
- lchmod(mode)[source][source]¶
Like chmod(), except if the path points to a symlink, the symlink’s permissions are changed, rather than its target’s.
- unlink(missing_ok=False)[source][source]¶
Remove this file or link. If the path is a directory, use rmdir() instead.
- lstat()[source][source]¶
Like stat(), except if the path points to a symlink, the symlink’s status information is returned, rather than its target’s.
- rename(target)[source][source]¶
Rename this path to the target path.
The target path may be absolute or relative. Relative paths are interpreted relative to the current working directory, not the directory of the Path object.
Returns the new Path instance pointing to the target path.
- replace(target)[source][source]¶
Rename this path to the target path, overwriting if that path exists.
The target path may be absolute or relative. Relative paths are interpreted relative to the current working directory, not the directory of the Path object.
Returns the new Path instance pointing to the target path.
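Because relative targets are interpreted against the current working directory, a rename that should stay in the same directory needs an explicit sibling path. A minimal sketch, using with_name() and made-up file names:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "old.db"
    source.write_text("payload", encoding="utf-8")

    # source.replace("new.db") would target <cwd>/new.db, not <tmp>/new.db.
    # with_name() builds the sibling path explicitly.
    target = source.replace(source.with_name("new.db"))

    moved_in_place = target.parent == source.parent
    old_gone = not source.exists()
    content = target.read_text(encoding="utf-8")

print(moved_in_place, old_gone, content)
```

replace() is used here instead of rename() so that an existing target would be overwritten.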
- symlink_to(target, target_is_directory=False)[source][source]¶
Make this path a symlink pointing to the target path. Note the order of arguments (link, target) is the reverse of os.symlink.
- hardlink_to(target)[source][source]¶
Make this path a hard link pointing to the same file as target.
Note the order of arguments (self, target) is the reverse of os.link’s.
- link_to(target)[source][source]¶
Make the target path a hard link pointing to this path.
Note this function does not make this path a hard link to target, despite the implication of the function and argument names. The order of arguments (target, link) is the reverse of Path.symlink_to, but matches that of os.link.
Deprecated since Python 3.10 and scheduled for removal in Python 3.12. Use hardlink_to() instead.
- class tooluniverse.embedding_sync.datetime(year, month, day[, hour[, minute[, second[, microsecond[, tzinfo]]]]])[source][source]¶
Bases:
date
The year, month and day arguments are required. tzinfo may be None, or an instance of a tzinfo subclass. The remaining arguments may be ints.
- now()[source]¶
Returns new datetime object representing current time local to tz.
- Parameters:
tz – Timezone object. If no tz is specified, uses local timezone.
- isoformat()[source]¶
[sep] -> string in ISO 8601 format, YYYY-MM-DDT[HH[:MM[:SS[.mmm[uuu]]]]][+HH:MM]. sep is used to separate the date from the time, and defaults to ‘T’. The optional argument timespec specifies the number of additional terms of the time to include. Valid options are ‘auto’, ‘hours’, ‘minutes’, ‘seconds’, ‘milliseconds’ and ‘microseconds’.
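The effect of sep and timespec can be seen on a fixed timestamp. The values below are arbitrary and only serve the illustration.

```python
from datetime import datetime, timezone

stamp = datetime(2024, 1, 2, 3, 4, 5, 678901)

default_form = stamp.isoformat()
spaced_millis = stamp.isoformat(sep=" ", timespec="milliseconds")
minutes_only = stamp.isoformat(timespec="minutes")

# A timezone-aware datetime gets a UTC offset suffix.
aware = datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc).isoformat()

print(default_form)   # 2024-01-02T03:04:05.678901
print(spaced_millis)  # 2024-01-02 03:04:05.678
print(minutes_only)   # 2024-01-02T03:04
print(aware)          # 2024-01-02T03:04:05+00:00
```

Note that timespec truncates the fractional part rather than rounding it.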
- class tooluniverse.embedding_sync.HfApi(endpoint: str | None = None, token: str | bool | None = None, library_name: str | None = None, library_version: str | None = None, user_agent: Dict | str | None = None, headers: Dict[str, str] | None = None)[source][source]¶
Bases:
object
Client to interact with the Hugging Face Hub via HTTP.
The client is initialized with some high-level settings used in all requests made to the Hub (HF endpoint, authentication, user agents…). Using the HfApi client is preferred but not mandatory, as all of its public methods are exposed directly at the root of huggingface_hub.
- Parameters:
endpoint (str, optional) – Endpoint of the Hub. Defaults to <https://huggingface.co>.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
library_name (str, optional) – The name of the library that is making the HTTP request. Will be added to the user-agent header. Example: "transformers".
library_version (str, optional) – The version of the library that is making the HTTP request. Will be added to the user-agent header. Example: "4.24.0".
user_agent (str, dict, optional) – The user agent info in the form of a dictionary or a single string. It will be completed with information about the installed packages.
headers (dict, optional) – Additional headers to be sent with each request. Example: {"X-My-Header": "value"}. Headers passed here take precedence over the default headers.
- __init__(endpoint: str | None = None, token: str | bool | None = None, library_name: str | None = None, library_version: str | None = None, user_agent: Dict | str | None = None, headers: Dict[str, str] | None = None) None [source][source]¶
- run_as_future(fn: Callable[[...], R], *args, **kwargs) Future[R] [source][source]¶
Run a method in the background and return a Future instance.
The main goal is to run methods without blocking the main thread (e.g. to push data during training). Background jobs are queued to preserve order but are not run in parallel. If you need to speed up your scripts by parallelizing many calls to the API, you must set up and use your own [ThreadPoolExecutor](https://docs.python.org/3/library/concurrent.futures.html#threadpoolexecutor).
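The suggestion to use your own ThreadPoolExecutor for parallel calls can be sketched as follows. The function fetch_info is a hypothetical stand-in for a blocking Hub call, so the example runs offline; in real code it would wrap a method of an HfApi instance.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_info(repo_id: str) -> dict:
    """Hypothetical stand-in for a blocking Hub call."""
    time.sleep(0.05)  # simulate network latency
    return {"id": repo_id}


repo_ids = ["user/model-a", "user/model-b", "user/model-c", "user/model-d"]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    # map() runs the calls concurrently but yields results in input order.
    results = list(pool.map(fetch_info, repo_ids))
elapsed = time.perf_counter() - start

print([r["id"] for r in results], round(elapsed, 2))
```

Unlike the single background queue described above, the pool runs several calls at once, so the order of side effects is no longer guaranteed even though the results come back in input order.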
Note: Most-used methods like [upload_file], [upload_folder] and [create_commit] have a run_as_future: bool argument to directly call them in the background. This is equivalent to calling api.run_as_future(...) on them but less verbose.
- Parameters:
fn (Callable) – The method to run in the background.
*args – Positional arguments with which the method will be called.
**kwargs – Keyword arguments with which the method will be called.
- Returns:
a [Future](https://docs.python.org/3/library/concurrent.futures.html#future-objects) instance to get the result of the task.
- Return type:
Future
Example
>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> future = api.run_as_future(api.whoami)  # instant
>>> future.done()
False
>>> future.result()  # wait until complete and return result
(...)
>>> future.done()
True
- whoami(token: bool | str | None = None) Dict [source][source]¶
Call HF API to know “whoami”.
- Parameters:
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- get_token_permission(token: bool | str | None = None) Literal['read', 'write', 'fineGrained', None] [source][source]¶
Check if a given token is valid and return its permissions.
Warning: This method is deprecated and will be removed in version 1.0. Permissions are more complex than when get_token_permission was first introduced. OAuth and fine-grained tokens allow for more detailed permissions. If you need to know the permissions associated with a token, please use whoami and check the 'auth' key.
For more details about tokens, please refer to https://huggingface.co/docs/hub/security-tokens#what-are-user-access-tokens.
- Parameters:
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
Permission granted by the token (“read”, “write” or “fineGrained”). Returns None if no token is passed, if the token is invalid or if the role is not returned by the server. This typically happens when the token is an OAuth token.
- Return type:
Literal["read", "write", "fineGrained", None]
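The recommended replacement, calling whoami and checking the 'auth' key, might look like the sketch below. The nested "accessToken" and "role" keys are an assumption about the response layout, and FakeApi is a hypothetical offline stand-in for an HfApi instance.

```python
from typing import Any, Optional


def token_role(api: Any) -> Optional[str]:
    """Return the role found under the 'auth' key of whoami(), if any.

    `api` is expected to behave like huggingface_hub.HfApi. The nested
    "accessToken" / "role" layout is an assumption, not a documented contract.
    """
    auth = api.whoami().get("auth") or {}
    access_token = auth.get("accessToken") or {}
    return access_token.get("role")


class FakeApi:
    """Offline stand-in so the sketch runs without network access."""

    def whoami(self) -> dict:
        return {"name": "demo-user", "auth": {"accessToken": {"role": "read"}}}


role = token_role(FakeApi())
print(role)
```

Using .get() at each level keeps the helper from raising when the server omits the role, which the text above says can happen with OAuth tokens.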
- list_models(*, filter: str | Iterable[str] | None = None, author: str | None = None, apps: str | List[str] | None = None, gated: bool | None = None, inference: Literal['warm'] | None = None, inference_provider: Literal['all'] | 'PROVIDER_T' | List['PROVIDER_T'] | None = None, model_name: str | None = None, trained_dataset: str | List[str] | None = None, search: str | None = None, pipeline_tag: str | None = None, emissions_thresholds: Tuple[float, float] | None = None, sort: Literal['last_modified'] | str | None = None, direction: Literal[-1] | None = None, limit: int | None = None, expand: List[ExpandModelProperty_T] | None = None, full: bool | None = None, cardData: bool = False, fetch_config: bool = False, token: bool | str | None = None, language: str | List[str] | None = None, library: str | List[str] | None = None, tags: str | List[str] | None = None, task: str | List[str] | None = None) Iterable[ModelInfo] [source][source]¶
List models hosted on the Hugging Face Hub, given some filters.
- Parameters:
filter (str or Iterable[str], optional) – A string or list of strings to filter models on the Hub. Models can be filtered by library, language, task, tags, and more.
author (str, optional) – A string which identifies the author (user or organization) of the returned models.
apps (str or List, optional) – A string or list of strings to filter models on the Hub that support the specified apps. Example values include "ollama" or ["ollama", "vllm"].
gated (bool, optional) – A boolean to filter models on the Hub that are gated or not. By default, all models are returned. If gated=True is passed, only gated models are returned. If gated=False is passed, only non-gated models are returned.
inference (Literal["warm"], optional) – If “warm”, filter models on the Hub currently served by at least one provider.
inference_provider (Literal["all"] or str, optional) – A string to filter models on the Hub that are served by a specific provider. Pass "all" to get all models served by at least one provider.
library (str or List, optional) – Deprecated. Pass a library name in filter to filter models by library.
language (str or List, optional) – Deprecated. Pass a language in filter to filter models by language.
model_name (str, optional) – A string that contains complete or partial names for models on the Hub, such as “bert” or “bert-base-cased”.
task (str or List, optional) – Deprecated. Pass a task in filter to filter models by task.
trained_dataset (str or List, optional) – A string tag or a list of string tags of the trained dataset for a model on the Hub.
tags (str or List, optional) – Deprecated. Pass tags in filter to filter models by tags.
search (str, optional) – A string that will be contained in the returned model ids.
pipeline_tag (str, optional) – A string pipeline tag to filter models on the Hub by, such as summarization.
emissions_thresholds (Tuple, optional) – A tuple of two ints or floats representing a minimum and maximum carbon footprint, in grams, to filter the resulting models with.
sort (Literal["last_modified"] or str, optional) – The key with which to sort the resulting models. Possible values are “last_modified”, “trending_score”, “created_at”, “downloads” and “likes”.
direction (Literal[-1] or int, optional) – Direction in which to sort. The value -1 sorts by descending order while all other values sort by ascending order.
limit (int, optional) – The limit on the number of models fetched. Leaving this option to None fetches all models.
expand (List[ExpandModelProperty_T], optional) – List properties to return in the response. When used, only the properties in the list will be returned. This parameter cannot be used if full, cardData or fetch_config are passed. Possible values are "author", "cardData", "config", "createdAt", "disabled", "downloads", "downloadsAllTime", "gated", "gguf", "inference", "inferenceProviderMapping", "lastModified", "library_name", "likes", "mask_token", "model-index", "pipeline_tag", "private", "safetensors", "sha", "siblings", "spaces", "tags", "transformersInfo", "trendingScore", "widgetData", "resourceGroup" and "xetEnabled".
full (bool, optional) – Whether to fetch all model data, including the last_modified, the sha, the files and the tags. This is set to True by default when using a filter.
cardData (bool, optional) – Whether to grab the metadata for the model as well. Can contain useful information such as carbon emissions, metrics, and datasets trained on.
fetch_config (bool, optional) – Whether to fetch the model configs as well. This is not included in full due to its size.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
an iterable of [huggingface_hub.hf_api.ModelInfo] objects.
- Return type:
Iterable[ModelInfo]
Example:
>>> from huggingface_hub import HfApi
>>> api = HfApi()
# List all models
>>> api.list_models()
# List text classification models
>>> api.list_models(filter="text-classification")
# List models from the KerasHub library
>>> api.list_models(filter="keras-hub")
# List models served by Cohere
>>> api.list_models(inference_provider="cohere")
# List models with "bert" in their name
>>> api.list_models(search="bert")
# List models with "bert" in their name and pushed by google
>>> api.list_models(search="bert", author="google")
- list_datasets(*, filter: str | Iterable[str] | None = None, author: str | None = None, benchmark: str | List[str] | None = None, dataset_name: str | None = None, gated: bool | None = None, language_creators: str | List[str] | None = None, language: str | List[str] | None = None, multilinguality: str | List[str] | None = None, size_categories: str | List[str] | None = None, task_categories: str | List[str] | None = None, task_ids: str | List[str] | None = None, search: str | None = None, sort: Literal['last_modified'] | str | None = None, direction: Literal[-1] | None = None, limit: int | None = None, expand: List[Literal['author', 'cardData', 'citation', 'createdAt', 'description', 'disabled', 'downloads', 'downloadsAllTime', 'gated', 'lastModified', 'likes', 'paperswithcode_id', 'private', 'resourceGroup', 'sha', 'siblings', 'tags', 'trendingScore', 'usedStorage', 'xetEnabled']] | None = None, full: bool | None = None, token: bool | str | None = None, tags: str | List[str] | None = None) Iterable[DatasetInfo] [source][source]¶
List datasets hosted on the Hugging Face Hub, given some filters.
- Parameters:
filter (str or Iterable[str], optional) – A string or list of strings to filter datasets on the Hub.
author (str, optional) – A string which identifies the author of the returned datasets.
benchmark (str or List, optional) – A string or list of strings that can be used to identify datasets on the Hub by their official benchmark.
dataset_name (str, optional) – A string or list of strings that can be used to identify datasets on the Hub by name, such as SQAC or wikineural.
gated (bool, optional) – A boolean to filter datasets on the Hub that are gated or not. By default, all datasets are returned. If gated=True is passed, only gated datasets are returned. If gated=False is passed, only non-gated datasets are returned.
language_creators (str or List, optional) – A string or list of strings that can be used to identify datasets on the Hub by how the data was curated, such as crowdsourced or machine_generated.
language (str or List, optional) – A string or list of strings representing a two-character language code to filter datasets by on the Hub.
multilinguality (str or List, optional) – A string or list of strings representing a filter for datasets that contain multiple languages.
size_categories (str or List, optional) – A string or list of strings that can be used to identify datasets on the Hub by the size of the dataset, such as 100K<n<1M or 1M<n<10M.
tags (str or List, optional) – Deprecated. Pass tags in filter to filter datasets by tags.
task_categories (str or List, optional) – A string or list of strings that can be used to identify datasets on the Hub by the designed task, such as audio_classification or named_entity_recognition.
task_ids (str or List, optional) – A string or list of strings that can be used to identify datasets on the Hub by the specific task, such as speech_emotion_recognition or paraphrase.
search (str, optional) – A string that will be contained in the returned datasets.
sort (Literal["last_modified"] or str, optional) – The key with which to sort the resulting datasets. Possible values are “last_modified”, “trending_score”, “created_at”, “downloads” and “likes”.
direction (Literal[-1] or int, optional) – Direction in which to sort. The value -1 sorts by descending order while all other values sort by ascending order.
limit (int, optional) – The limit on the number of datasets fetched. Leaving this option to None fetches all datasets.
expand (List[ExpandDatasetProperty_T], optional) – List properties to return in the response. When used, only the properties in the list will be returned. This parameter cannot be used if full is passed. Possible values are "author", "cardData", "citation", "createdAt", "disabled", "description", "downloads", "downloadsAllTime", "gated", "lastModified", "likes", "paperswithcode_id", "private", "siblings", "sha", "tags", "trendingScore", "usedStorage", "resourceGroup" and "xetEnabled".
full (bool, optional) – Whether to fetch all dataset data, including the last_modified, the card_data and the files. Can contain useful information such as the PapersWithCode ID.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
an iterable of [huggingface_hub.hf_api.DatasetInfo] objects.
- Return type:
Iterable[DatasetInfo]
Example usage with the filter argument:
>>> from huggingface_hub import HfApi
>>> api = HfApi()
# List all datasets
>>> api.list_datasets()
# List only the text classification datasets
>>> api.list_datasets(filter="task_categories:text-classification")
# List only the datasets in Russian for language modeling
>>> api.list_datasets(
...     filter=("language:ru", "task_ids:language-modeling")
... )
# List FiftyOne datasets (identified by the tag "fiftyone" in dataset card)
>>> api.list_datasets(tags="fiftyone")
Example usage with the search argument:
>>> from huggingface_hub import HfApi
>>> api = HfApi()
# List all datasets with "text" in their name
>>> api.list_datasets(search="text")
# List all datasets with "text" in their name made by google
>>> api.list_datasets(search="text", author="google")
- list_spaces(*, filter: str | Iterable[str] | None = None, author: str | None = None, search: str | None = None, datasets: str | Iterable[str] | None = None, models: str | Iterable[str] | None = None, linked: bool = False, sort: Literal['last_modified'] | str | None = None, direction: Literal[-1] | None = None, limit: int | None = None, expand: List[Literal['author', 'cardData', 'createdAt', 'datasets', 'disabled', 'lastModified', 'likes', 'models', 'private', 'resourceGroup', 'runtime', 'sdk', 'sha', 'siblings', 'subdomain', 'tags', 'trendingScore', 'usedStorage', 'xetEnabled']] | None = None, full: bool | None = None, token: bool | str | None = None) Iterable[SpaceInfo] [source][source]¶
List Spaces hosted on the Hugging Face Hub, given some filters.
- Parameters:
filter (str or Iterable, optional) – A string tag or list of tags that can be used to identify Spaces on the Hub.
author (str, optional) – A string which identifies the author of the returned Spaces.
search (str, optional) – A string that will be contained in the returned Spaces.
datasets (str or Iterable, optional) – Whether to return Spaces that make use of a dataset. The name of a specific dataset can be passed as a string.
models (str or Iterable, optional) – Whether to return Spaces that make use of a model. The name of a specific model can be passed as a string.
linked (bool, optional) – Whether to return Spaces that make use of either a model or a dataset.
sort (Literal["last_modified"] or str, optional) – The key with which to sort the resulting Spaces. Possible values are “last_modified”, “trending_score”, “created_at” and “likes”.
direction (Literal[-1] or int, optional) – Direction in which to sort. The value -1 sorts by descending order while all other values sort by ascending order.
limit (int, optional) – The limit on the number of Spaces fetched. Leaving this option to None fetches all Spaces.
expand (List[ExpandSpaceProperty_T], optional) – List properties to return in the response. When used, only the properties in the list will be returned. This parameter cannot be used if full is passed. Possible values are "author", "cardData", "datasets", "disabled", "lastModified", "createdAt", "likes", "models", "private", "runtime", "sdk", "siblings", "sha", "subdomain", "tags", "trendingScore", "usedStorage", "resourceGroup" and "xetEnabled".
full (bool, optional) – Whether to fetch all Spaces data, including the last_modified, siblings and card_data fields.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
an iterable of [huggingface_hub.hf_api.SpaceInfo] objects.
- Return type:
Iterable[SpaceInfo]
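How the sort, direction and limit parameters combine can be sketched as follows. The helper top_liked_spaces and the FakeApi class are hypothetical; FakeApi only stands in for an HfApi instance so the example runs offline, and the id attribute on the returned objects is assumed to match SpaceInfo.

```python
from types import SimpleNamespace
from typing import Any, List


def top_liked_spaces(api: Any, author: str, limit: int = 3) -> List[str]:
    """Return the ids of an author's most-liked Spaces."""
    # direction=-1 requests descending order, as described above.
    spaces = api.list_spaces(author=author, sort="likes", direction=-1, limit=limit)
    return [space.id for space in spaces]


class FakeApi:
    """Offline stand-in that records the keyword arguments it receives."""

    def list_spaces(self, **kwargs):
        self.kwargs = kwargs
        return iter([SimpleNamespace(id="demo/space-a"), SimpleNamespace(id="demo/space-b")])


fake = FakeApi()
space_ids = top_liked_spaces(fake, author="demo")
print(space_ids)
```

With a real client, the same call shape would be made on an HfApi instance instead of FakeApi.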
- unlike(repo_id: str, *, token: bool | str | None = None, repo_type: str | None = None) None [source][source]¶
Unlike a given repo on the Hub (e.g. remove from favorite list).
To prevent spam usage, it is not possible to like a repository from a script.
See also [list_liked_repos].
- Parameters:
repo_id (str) – The repository to unlike. Example: "user/my-cool-model".
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – Set to "dataset" or "space" if unliking a dataset or space, None or "model" if unliking a model. Default is None.
- Raises:
[RepositoryNotFoundError] – If repository is not found (error 404): wrong repo_id/repo_type, private but not authenticated or repo does not exist.
Example:
>>> from huggingface_hub import list_liked_repos, unlike
>>> "gpt2" in list_liked_repos().models  # we assume you have already liked gpt2
True
>>> unlike("gpt2")
>>> "gpt2" in list_liked_repos().models
False
- list_liked_repos(user: str | None = None, *, token: bool | str | None = None) UserLikes [source][source]¶
List all public repos liked by a user on huggingface.co.
This list is public so token is optional. If user is not passed, it defaults to the logged in user.
See also [unlike].
- Parameters:
user (str, optional) – Name of the user for which you want to fetch the likes.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
object containing the user name and 3 lists of repo ids (1 for models, 1 for datasets and 1 for Spaces).
- Return type:
[UserLikes]
- Raises:
[ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) – If user is not passed and no token found (either from argument or from machine).
Example:
>>> from huggingface_hub import list_liked_repos
>>> likes = list_liked_repos("julien-c")
>>> likes.user
"julien-c"
>>> likes.models
["osanseviero/streamlit_1.15", "Xhaheen/ChatGPT_HF", ...]
- list_repo_likers(repo_id: str, *, repo_type: str | None = None, token: bool | str | None = None) Iterable[User] [source][source]¶
List all users who liked a given repo on the Hugging Face Hub.
See also [list_liked_repos].
- Parameters:
repo_id (str) – The repository whose likers to retrieve. Example: "user/my-cool-model".
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – Set to "dataset" or "space" if the repository is a dataset or Space, None or "model" if it is a model. Default is None.
- Returns:
an iterable of [huggingface_hub.hf_api.User] objects.
- Return type:
Iterable[User]
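Since the return value is an iterable, it can be consumed lazily instead of being fully materialized. The sketch below takes only the first few likers with itertools.islice; first_likers and FakeApi are hypothetical, and the username attribute is assumed to match the User objects returned by the real client.

```python
from itertools import islice
from types import SimpleNamespace
from typing import Any, List, Optional


def first_likers(api: Any, repo_id: str, n: int = 2, repo_type: Optional[str] = None) -> List[str]:
    """Return the usernames of the first n users who liked repo_id."""
    likers = api.list_repo_likers(repo_id, repo_type=repo_type)
    # islice stops pulling from the iterable once n items were taken.
    return [user.username for user in islice(likers, n)]


class FakeApi:
    """Offline stand-in that yields users one at a time."""

    def __init__(self):
        self.yielded = 0

    def list_repo_likers(self, repo_id, *, repo_type=None, token=None):
        for name in ["alice", "bob", "carol"]:
            self.yielded += 1
            yield SimpleNamespace(username=name)


fake = FakeApi()
usernames = first_likers(fake, "user/my-cool-model")
print(usernames)
```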
- model_info(repo_id: str, *, revision: str | None = None, timeout: float | None = None, securityStatus: bool | None = None, files_metadata: bool = False, expand: List[Literal['author', 'baseModels', 'cardData', 'childrenModelCount', 'config', 'createdAt', 'disabled', 'downloads', 'downloadsAllTime', 'gated', 'gguf', 'inference', 'inferenceProviderMapping', 'lastModified', 'library_name', 'likes', 'mask_token', 'model-index', 'pipeline_tag', 'private', 'resourceGroup', 'safetensors', 'sha', 'siblings', 'spaces', 'tags', 'transformersInfo', 'trendingScore', 'usedStorage', 'widgetData', 'xetEnabled']] | None = None, token: bool | str | None = None) ModelInfo [source][source]¶
Get info on one specific model on huggingface.co.
Model can be private if you pass an acceptable token or are logged in.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
revision (str, optional) – The revision of the model repository from which to get the information.
timeout (float, optional) – Timeout, in seconds, for the request to the Hub.
securityStatus (bool, optional) – Whether to retrieve the security status from the model repository as well. The security status will be returned in the security_repo_status field.
files_metadata (bool, optional) – Whether or not to retrieve metadata for files in the repository (size, LFS metadata, etc). Defaults to False.
expand (List[ExpandModelProperty_T], optional) – List properties to return in the response. When used, only the properties in the list will be returned. This parameter cannot be used if securityStatus or files_metadata are passed. Possible values are "author", "baseModels", "cardData", "childrenModelCount", "config", "createdAt", "disabled", "downloads", "downloadsAllTime", "gated", "gguf", "inference", "inferenceProviderMapping", "lastModified", "library_name", "likes", "mask_token", "model-index", "pipeline_tag", "private", "safetensors", "sha", "siblings", "spaces", "tags", "transformersInfo", "trendingScore", "widgetData", "usedStorage", "resourceGroup" and "xetEnabled".
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
The model repository information.
- Return type:
[huggingface_hub.hf_api.ModelInfo]
- Raises:
[~utils.RepositoryNotFoundError] – If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
[~utils.RevisionNotFoundError] – If the revision to download from cannot be found.
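The repo_id format described above, a namespace and a repo name separated by a /, can be illustrated with a small parsing helper. split_repo_id is hypothetical and not part of huggingface_hub; it also accepts ids without a namespace, such as "gpt2", which is an assumption about how such ids should be treated.

```python
from typing import Optional, Tuple


def split_repo_id(repo_id: str) -> Tuple[Optional[str], str]:
    """Split "namespace/name" into (namespace, name).

    Hypothetical helper. Ids without a namespace yield (None, name).
    """
    namespace, sep, name = repo_id.rpartition("/")
    # Reject empty names, empty namespaces and more than one separator.
    if not name or (sep and not namespace) or "/" in namespace:
        raise ValueError(f"not a valid repo id: {repo_id!r}")
    return (namespace if sep else None), name


print(split_repo_id("user/my-cool-model"))  # ('user', 'my-cool-model')
print(split_repo_id("gpt2"))                # (None, 'gpt2')
```

Validating the id locally like this only catches malformed strings; whether the repository exists is still decided by the Hub.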
- dataset_info(repo_id: str, *, revision: str | None = None, timeout: float | None = None, files_metadata: bool = False, expand: List[Literal['author', 'cardData', 'citation', 'createdAt', 'description', 'disabled', 'downloads', 'downloadsAllTime', 'gated', 'lastModified', 'likes', 'paperswithcode_id', 'private', 'resourceGroup', 'sha', 'siblings', 'tags', 'trendingScore', 'usedStorage', 'xetEnabled']] | None = None, token: bool | str | None = None) DatasetInfo [source][source]¶
Get info on one specific dataset on huggingface.co.
Dataset can be private if you pass an acceptable token.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
revision (str, optional) – The revision of the dataset repository from which to get the information.
timeout (float, optional) – Timeout, in seconds, for the request to the Hub.
files_metadata (bool, optional) – Whether or not to retrieve metadata for files in the repository (size, LFS metadata, etc). Defaults to False.
expand (List[ExpandDatasetProperty_T], optional) – List properties to return in the response. When used, only the properties in the list will be returned. This parameter cannot be used if files_metadata is passed. Possible values are "author", "cardData", "citation", "createdAt", "disabled", "description", "downloads", "downloadsAllTime", "gated", "lastModified", "likes", "paperswithcode_id", "private", "siblings", "sha", "tags", "trendingScore", "usedStorage", "resourceGroup" and "xetEnabled".
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
The dataset repository information.
- Return type:
hf_api.DatasetInfo
- Raises:
[RepositoryNotFoundError] – If the repository cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
[RevisionNotFoundError] – If the revision cannot be found.
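As a minimal sketch of what files_metadata=True makes possible, the helper below sums file sizes from an info object's siblings list. The helper name total_repo_size and the file names are hypothetical, and the info object is a stand-in built with SimpleNamespace so the snippet runs offline; it assumes each sibling exposes rfilename and size attributes.

```python
from types import SimpleNamespace


def total_repo_size(info):
    """Sum the sizes of all files listed in an info object's `siblings`.

    Sizes are only expected to be populated when the info was requested
    with files_metadata=True; entries without a size count as 0.
    """
    siblings = getattr(info, "siblings", None) or []
    return sum((getattr(s, "size", None) or 0) for s in siblings)


# Stand-in for the object returned by dataset_info(..., files_metadata=True).
fake_info = SimpleNamespace(
    siblings=[
        SimpleNamespace(rfilename="embeddings.db", size=2048),
        SimpleNamespace(rfilename="index.faiss", size=4096),
        SimpleNamespace(rfilename="README.md", size=None),
    ]
)

print(total_repo_size(fake_info))
```

With a real client, the same helper would presumably be called on the result of dataset_info(repo_id, files_metadata=True).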
- space_info(repo_id: str, *, revision: str | None = None, timeout: float | None = None, files_metadata: bool = False, expand: List[Literal['author', 'cardData', 'createdAt', 'datasets', 'disabled', 'lastModified', 'likes', 'models', 'private', 'resourceGroup', 'runtime', 'sdk', 'sha', 'siblings', 'subdomain', 'tags', 'trendingScore', 'usedStorage', 'xetEnabled']] | None = None, token: bool | str | None = None) SpaceInfo [source][source]¶
Get info on one specific Space on huggingface.co.
Space can be private if you pass an acceptable token.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
revision (str, optional) – The revision of the space repository from which to get the information.
timeout (float, optional) – Timeout, in seconds, for the request to the Hub.
files_metadata (bool, optional) – Whether or not to retrieve metadata for files in the repository (size, LFS metadata, etc). Defaults to False.
expand (List[ExpandSpaceProperty_T], optional) – List properties to return in the response. When used, only the properties in the list will be returned. This parameter cannot be used if files_metadata is passed. Possible values are "author", "cardData", "createdAt", "datasets", "disabled", "lastModified", "likes", "models", "private", "runtime", "sdk", "siblings", "sha", "subdomain", "tags", "trendingScore", "usedStorage", "resourceGroup" and "xetEnabled".
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
The space repository information.
- Return type:
hf_api.SpaceInfo
- Raises:
[RepositoryNotFoundError] – If the repository cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
[RevisionNotFoundError] – If the revision cannot be found.
- repo_info(repo_id: str, *, revision: str | None = None, repo_type: str | None = None, timeout: float | None = None, files_metadata: bool = False, expand: Literal['author', 'baseModels', 'cardData', 'childrenModelCount', 'config', 'createdAt', 'disabled', 'downloads', 'downloadsAllTime', 'gated', 'gguf', 'inference', 'inferenceProviderMapping', 'lastModified', 'library_name', 'likes', 'mask_token', 'model-index', 'pipeline_tag', 'private', 'resourceGroup', 'safetensors', 'sha', 'siblings', 'spaces', 'tags', 'transformersInfo', 'trendingScore', 'usedStorage', 'widgetData', 'xetEnabled'] | Literal['author', 'cardData', 'citation', 'createdAt', 'description', 'disabled', 'downloads', 'downloadsAllTime', 'gated', 'lastModified', 'likes', 'paperswithcode_id', 'private', 'resourceGroup', 'sha', 'siblings', 'tags', 'trendingScore', 'usedStorage', 'xetEnabled'] | Literal['author', 'cardData', 'createdAt', 'datasets', 'disabled', 'lastModified', 'likes', 'models', 'private', 'resourceGroup', 'runtime', 'sdk', 'sha', 'siblings', 'subdomain', 'tags', 'trendingScore', 'usedStorage', 'xetEnabled'] | None = None, token: bool | str | None = None) ModelInfo | DatasetInfo | SpaceInfo [source][source]¶
Get the info object for a given repo of a given type.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
revision (str, optional) – The revision of the repository from which to get the information.
repo_type (str, optional) – Set to "dataset" or "space" if getting repository info from a dataset or a space, None or "model" if getting repository info from a model. Default is None.
timeout (float, optional) – Timeout, in seconds, for the request to the Hub.
expand (ExpandModelProperty_T or ExpandDatasetProperty_T or ExpandSpaceProperty_T, optional) – List properties to return in the response. When used, only the properties in the list will be returned. This parameter cannot be used if files_metadata is passed. For an exhaustive list of available properties, check out model_info, dataset_info or space_info.
files_metadata (bool, optional) – Whether or not to retrieve metadata for files in the repository (size, LFS metadata, etc). Defaults to False.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
The repository information, as a huggingface_hub.hf_api.DatasetInfo, huggingface_hub.hf_api.ModelInfo or huggingface_hub.hf_api.SpaceInfo object.
- Return type:
Union[SpaceInfo, DatasetInfo, ModelInfo]
- Raises:
[RepositoryNotFoundError] – If the repository cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
[RevisionNotFoundError] – If the revision cannot be found.
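The repo_type rule described above can be sketched as a small dispatcher that picks model_info, dataset_info or space_info. This is an illustration only, not the library's own implementation: select_info_method and FakeApi are hypothetical names, and the fake client exists purely so the snippet runs without network access.

```python
def select_info_method(api, repo_type=None):
    """Pick the info method matching `repo_type` (None is treated as "model")."""
    if repo_type is None or repo_type == "model":
        return api.model_info
    if repo_type == "dataset":
        return api.dataset_info
    if repo_type == "space":
        return api.space_info
    raise ValueError(f"Unsupported repo type: {repo_type!r}")


class FakeApi:
    """Offline stand-in exposing the three method names used above."""

    def model_info(self, repo_id, **kwargs):
        return ("model", repo_id)

    def dataset_info(self, repo_id, **kwargs):
        return ("dataset", repo_id)

    def space_info(self, repo_id, **kwargs):
        return ("space", repo_id)


api = FakeApi()
print(select_info_method(api, "dataset")("user/embeddings"))
print(select_info_method(api)("user/some-model"))
```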
- repo_exists(repo_id: str, *, repo_type: str | None = None, token: str | bool | None = None) bool [source][source]¶
Checks if a repository exists on the Hugging Face Hub.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
repo_type (str, optional) – Set to "dataset" or "space" if getting repository info from a dataset or a space, None or "model" if getting repository info from a model. Default is None.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
True if the repository exists, False otherwise.
Examples
>>> from huggingface_hub import repo_exists
>>> repo_exists("google/gemma-7b")
True
>>> repo_exists("google/not-a-repo")
False
- revision_exists(repo_id: str, revision: str, *, repo_type: str | None = None, token: str | bool | None = None) bool [source][source]¶
Checks if a specific revision exists on a repo on the Hugging Face Hub.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
revision (str) – The revision of the repository to check.
repo_type (str, optional) – Set to "dataset" or "space" if getting repository info from a dataset or a space, None or "model" if getting repository info from a model. Default is None.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
True if both the repository and the revision exist, False otherwise.
Examples
>>> from huggingface_hub import revision_exists
>>> revision_exists("google/gemma-7b", "float16")
True
>>> revision_exists("google/gemma-7b", "not-a-revision")
False
- file_exists(repo_id: str, filename: str, *, repo_type: str | None = None, revision: str | None = None, token: str | bool | None = None) bool [source][source]¶
Checks if a file exists in a repository on the Hugging Face Hub.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
filename (str) – The name of the file to check, for example: "config.json".
repo_type (str, optional) – Set to "dataset" or "space" if getting repository info from a dataset or a space, None or "model" if getting repository info from a model. Default is None.
revision (str, optional) – The revision of the repository from which to get the information. Defaults to "main" branch.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
True if the file exists, False otherwise.
Examples
>>> from huggingface_hub import file_exists
>>> file_exists("bigcode/starcoder", "config.json")
True
>>> file_exists("bigcode/starcoder", "not-a-file")
False
>>> file_exists("bigcode/not-a-repo", "config.json")
False
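Before downloading a synced database it can help to check several files at once. The sketch below wraps file_exists in a hypothetical helper, missing_files; the client is passed in as a parameter, and FakeApi and the file names are illustrative stand-ins so the snippet runs offline.

```python
def missing_files(api, repo_id, filenames, **kwargs):
    """Return the names in `filenames` for which api.file_exists(...) is False.

    Extra keyword arguments (repo_type, revision, token) are forwarded as-is.
    """
    return [name for name in filenames if not api.file_exists(repo_id, name, **kwargs)]


class FakeApi:
    """Offline stand-in: reports a fixed set of files as present."""

    def __init__(self, present):
        self._present = set(present)

    def file_exists(self, repo_id, filename, **kwargs):
        return filename in self._present


api = FakeApi(present={"embeddings.db", "README.md"})
wanted = ["embeddings.db", "index.faiss", "README.md"]
print(missing_files(api, "user/embeddings", wanted, repo_type="dataset"))
```

Each name costs one call to file_exists, so for long lists a single list_repo_files call followed by a set difference is likely cheaper.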
- list_repo_files(repo_id: str, *, revision: str | None = None, repo_type: str | None = None, token: str | bool | None = None) List[str] [source][source]¶
Get the list of files in a given repo.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
revision (str, optional) – The revision of the repository from which to get the information.
repo_type (str, optional) – Set to "dataset" or "space" if listing files from a dataset or space, None or "model" if listing files from a model. Default is None.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
the list of files in a given repository.
- Return type:
List[str]
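Since list_repo_files returns plain path strings, selecting the files of interest is ordinary string filtering. A minimal sketch, assuming glob-style patterns are acceptable; select_files and the file names are hypothetical.

```python
from fnmatch import fnmatchcase


def select_files(files, patterns):
    """Keep the paths matching at least one glob pattern (case-sensitive).

    Note that fnmatch's `*` also matches "/", so "*.db" matches nested paths.
    """
    return [f for f in files if any(fnmatchcase(f, p) for p in patterns)]


# Stand-in for the list returned by list_repo_files(...).
repo_files = ["README.md", "tools.db", "tools.faiss", "archive/old.db"]
print(select_files(repo_files, ["*.db", "*.faiss"]))
```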
- list_repo_tree(repo_id: str, path_in_repo: str | None = None, *, recursive: bool = False, expand: bool = False, revision: str | None = None, repo_type: str | None = None, token: str | bool | None = None) Iterable[RepoFile | RepoFolder] [source][source]¶
List a repo tree’s files and folders and get information about them.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
path_in_repo (str, optional) – Relative path of the tree (folder) in the repo, for example: "checkpoints/1fec34a/results". Will default to the root tree (folder) of the repository.
recursive (bool, optional, defaults to False) – Whether to list tree’s files and folders recursively.
expand (bool, optional, defaults to False) – Whether to fetch more information about the tree’s files and folders (e.g. last commit and files’ security scan results). This operation is more expensive for the server so only 50 results are returned per page (instead of 1000). As pagination is implemented in huggingface_hub, this is transparent for you except for the time it takes to get the results.
revision (str, optional) – The revision of the repository from which to get the tree. Defaults to "main" branch.
repo_type (str, optional) – The type of the repository from which to get the tree ("model", "dataset" or "space"). Defaults to "model".
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
The information about the tree’s files and folders, as an iterable of RepoFile and RepoFolder objects. The order of the files and folders is not guaranteed.
- Return type:
Iterable[Union[RepoFile, RepoFolder]]
- Raises:
[RepositoryNotFoundError] – If repository is not found (error 404): wrong repo_id/repo_type, private but not authenticated or repo does not exist.
[RevisionNotFoundError] – If revision is not found (error 404) on the repo.
[EntryNotFoundError] – If the tree (folder) does not exist (error 404) on the repo.
Examples
Get information about a repo’s tree:
>>> from huggingface_hub import list_repo_tree
>>> repo_tree = list_repo_tree("lysandre/arxiv-nlp")
>>> repo_tree
<generator object HfApi.list_repo_tree at 0x7fa4088e1ac0>
>>> list(repo_tree)
[
    RepoFile(path='.gitattributes', size=391, blob_id='ae8c63daedbd4206d7d40126955d4e6ab1c80f8f', lfs=None, last_commit=None, security=None),
    RepoFile(path='README.md', size=391, blob_id='43bd404b159de6fba7c2f4d3264347668d43af25', lfs=None, last_commit=None, security=None),
    RepoFile(path='config.json', size=554, blob_id='2f9618c3a19b9a61add74f70bfb121335aeef666', lfs=None, last_commit=None, security=None),
    RepoFile(
        path='flax_model.msgpack', size=497764107, blob_id='8095a62ccb4d806da7666fcda07467e2d150218e',
        lfs={'size': 497764107, 'sha256': 'd88b0d6a6ff9c3f8151f9d3228f57092aaea997f09af009eefd7373a77b5abb9', 'pointer_size': 134}, last_commit=None, security=None
    ),
    RepoFile(path='merges.txt', size=456318, blob_id='226b0752cac7789c48f0cb3ec53eda48b7be36cc', lfs=None, last_commit=None, security=None),
    RepoFile(
        path='pytorch_model.bin', size=548123560, blob_id='64eaa9c526867e404b68f2c5d66fd78e27026523',
        lfs={'size': 548123560, 'sha256': '9be78edb5b928eba33aa88f431551348f7466ba9f5ef3daf1d552398722a5436', 'pointer_size': 134}, last_commit=None, security=None
    ),
    RepoFile(path='vocab.json', size=898669, blob_id='b00361fece0387ca34b4b8b8539ed830d644dbeb', lfs=None, last_commit=None, security=None)
]
Get even more information about a repo’s tree (last commit and files’ security scan results):
>>> from huggingface_hub import list_repo_tree
>>> repo_tree = list_repo_tree("prompthero/openjourney-v4", expand=True)
>>> list(repo_tree)
[
    RepoFolder(
        path='feature_extractor',
        tree_id='aa536c4ea18073388b5b0bc791057a7296a00398',
        last_commit={
            'oid': '47b62b20b20e06b9de610e840282b7e6c3d51190',
            'title': 'Upload diffusers weights (#48)',
            'date': datetime.datetime(2023, 3, 21, 9, 5, 27, tzinfo=datetime.timezone.utc)
        }
    ),
    RepoFolder(
        path='safety_checker',
        tree_id='65aef9d787e5557373fdf714d6c34d4fcdd70440',
        last_commit={
            'oid': '47b62b20b20e06b9de610e840282b7e6c3d51190',
            'title': 'Upload diffusers weights (#48)',
            'date': datetime.datetime(2023, 3, 21, 9, 5, 27, tzinfo=datetime.timezone.utc)
        }
    ),
    RepoFile(
        path='model_index.json',
        size=582,
        blob_id='d3d7c1e8c3e78eeb1640b8e2041ee256e24c9ee1',
        lfs=None,
        last_commit={
            'oid': 'b195ed2d503f3eb29637050a886d77bd81d35f0e',
            'title': 'Fix deprecation warning by changing ``CLIPFeatureExtractor`` to ``CLIPImageProcessor``. (#54)',
            'date': datetime.datetime(2023, 5, 15, 21, 41, 59, tzinfo=datetime.timezone.utc)
        },
        security={
            'safe': True,
            'av_scan': {'virusFound': False, 'virusNames': None},
            'pickle_import_scan': None
        }
    )
    ...
]
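The entries yielded by list_repo_tree can be aggregated without further requests. The sketch below totals file sizes per top-level folder; size_by_top_level is a hypothetical helper, and the entries are SimpleNamespace stand-ins that mimic the path, size and tree_id fields shown in the examples above, on the assumption that folder entries carry no size.

```python
from collections import defaultdict
from types import SimpleNamespace


def size_by_top_level(entries):
    """Total file sizes per top-level folder ("." for files at the repo root)."""
    totals = defaultdict(int)
    for entry in entries:
        size = getattr(entry, "size", None)
        if size is None:
            # Folder entries have no size in the examples above; skip them.
            continue
        top = entry.path.split("/", 1)[0] if "/" in entry.path else "."
        totals[top] += size
    return dict(totals)


# Stand-ins for what list_repo_tree(..., recursive=True) might yield.
tree = [
    SimpleNamespace(path="checkpoints", tree_id="abc123"),
    SimpleNamespace(path="checkpoints/a.bin", size=100),
    SimpleNamespace(path="checkpoints/b.bin", size=50),
    SimpleNamespace(path="README.md", size=10),
]
print(size_by_top_level(tree))
```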
- list_repo_refs(repo_id: str, *, repo_type: str | None = None, include_pull_requests: bool = False, token: str | bool | None = None) GitRefs [source][source]¶
Get the list of refs of a given repo (both tags and branches).
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
repo_type (str, optional) – Set to "dataset" or "space" if listing refs from a dataset or a Space, None or "model" if listing from a model. Default is None.
include_pull_requests (bool, optional) – Whether to include refs from pull requests in the list. Defaults to False.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
Example:
>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> api.list_repo_refs("gpt2")
GitRefs(branches=[GitRefInfo(name='main', ref='refs/heads/main', target_commit='e7da7f221d5bf496a48136c0cd264e630fe9fcc8')], converts=[], tags=[])

>>> api.list_repo_refs("bigcode/the-stack", repo_type='dataset')
GitRefs(
    branches=[
        GitRefInfo(name='main', ref='refs/heads/main', target_commit='18edc1591d9ce72aa82f56c4431b3c969b210ae3'),
        GitRefInfo(name='v1.1.a1', ref='refs/heads/v1.1.a1', target_commit='f9826b862d1567f3822d3d25649b0d6d22ace714')
    ],
    converts=[],
    tags=[
        GitRefInfo(name='v1.0', ref='refs/tags/v1.0', target_commit='c37a8cd1e382064d8aced5e05543c5f7753834da')
    ]
)
- Returns:
object containing all information about branches and tags for a repo on the Hub.
- Return type:
[
GitRefs
]
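A common follow-up is resolving a branch or tag name to the commit it points at, for example to pin a download to an exact revision. A minimal sketch with a hypothetical helper, find_target_commit; the refs object is an offline stand-in that reuses the names and hashes from the example output above.

```python
from types import SimpleNamespace


def find_target_commit(refs, name):
    """Return the target commit of the branch or tag called `name`, else None.

    Branches are searched before tags.
    """
    for ref in list(refs.branches) + list(refs.tags):
        if ref.name == name:
            return ref.target_commit
    return None


# Stand-in mirroring the GitRefs structure printed in the example above.
refs = SimpleNamespace(
    branches=[
        SimpleNamespace(name="main", ref="refs/heads/main",
                        target_commit="18edc1591d9ce72aa82f56c4431b3c969b210ae3"),
    ],
    converts=[],
    tags=[
        SimpleNamespace(name="v1.0", ref="refs/tags/v1.0",
                        target_commit="c37a8cd1e382064d8aced5e05543c5f7753834da"),
    ],
)
print(find_target_commit(refs, "v1.0"))
```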
- list_repo_commits(repo_id: str, *, repo_type: str | None = None, token: bool | str | None = None, revision: str | None = None, formatted: bool = False) List[GitCommitInfo] [source][source]¶
Get the list of commits of a given revision for a repo on the Hub.
Commits are sorted by date (last commit first).
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
repo_type (str, optional) – Set to "dataset" or "space" if listing commits from a dataset or a Space, None or "model" if listing from a model. Default is None.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
revision (str, optional) – The git revision to list commits from. Defaults to the head of the "main" branch.
formatted (bool) – Whether to return the HTML-formatted title and description of the commits. Defaults to False.
Example:
>>> from huggingface_hub import HfApi
>>> api = HfApi()

# Commits are sorted by date (last commit first)
>>> initial_commit = api.list_repo_commits("gpt2")[-1]

# Initial commit is always a system commit containing the .gitattributes file.
>>> initial_commit
GitCommitInfo(
    commit_id='9b865efde13a30c13e0a33e536cf3e4a5a9d71d8',
    authors=['system'],
    created_at=datetime.datetime(2019, 2, 18, 10, 36, 15, tzinfo=datetime.timezone.utc),
    title='initial commit',
    message='',
    formatted_title=None,
    formatted_message=None
)

# Create an empty branch by deriving from initial commit
>>> api.create_branch("gpt2", "new_empty_branch", revision=initial_commit.commit_id)
- Returns:
list of objects containing information about the commits for a repo on the Hub.
- Return type:
List[GitCommitInfo]
- Raises:
[RepositoryNotFoundError] – If repository is not found (error 404): wrong repo_id/repo_type, private but not authenticated or repo does not exist.
[RevisionNotFoundError] – If revision is not found (error 404) on the repo.
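Because commits come back newest first, a "what changed since my last sync" check can stop at the first commit older than a cutoff. This is a sketch under that ordering assumption; commits_since is a hypothetical helper and the commit objects are stand-ins carrying only commit_id and created_at.

```python
from datetime import datetime, timezone
from types import SimpleNamespace


def commits_since(commits, cutoff):
    """Return commits created at or after `cutoff`, newest first.

    Relies on the documented ordering (last commit first), so iteration
    stops at the first commit older than the cutoff.
    """
    recent = []
    for commit in commits:
        if commit.created_at < cutoff:
            break
        recent.append(commit)
    return recent


# Stand-ins for the objects returned by list_repo_commits(...).
history = [
    SimpleNamespace(commit_id="c3", created_at=datetime(2024, 3, 1, tzinfo=timezone.utc)),
    SimpleNamespace(commit_id="c2", created_at=datetime(2024, 2, 1, tzinfo=timezone.utc)),
    SimpleNamespace(commit_id="c1", created_at=datetime(2024, 1, 1, tzinfo=timezone.utc)),
]
cutoff = datetime(2024, 1, 15, tzinfo=timezone.utc)
print([c.commit_id for c in commits_since(history, cutoff)])
```

The cutoff should be timezone-aware, since the created_at values in the example above carry tzinfo and Python refuses to compare naive and aware datetimes.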
- get_paths_info(repo_id: str, paths: List[str] | str, *, expand: bool = False, revision: str | None = None, repo_type: str | None = None, token: str | bool | None = None) List[RepoFile | RepoFolder] [source][source]¶
Get information about a repo’s paths.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
paths (Union[List[str], str]) – The paths to get information about. If a path does not exist, it is ignored without raising an exception.
expand (bool, optional, defaults to False) – Whether to fetch more information about the paths (e.g. last commit and files’ security scan results). This operation is more expensive for the server so only 50 results are returned per page (instead of 1000). As pagination is implemented in huggingface_hub, this is transparent for you except for the time it takes to get the results.
revision (str, optional) – The revision of the repository from which to get the information. Defaults to "main" branch.
repo_type (str, optional) – The type of the repository from which to get the information ("model", "dataset" or "space"). Defaults to "model".
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
The information about the paths, as a list of RepoFile and RepoFolder objects.
- Return type:
List[Union[RepoFile, RepoFolder]]
- Raises:
[RepositoryNotFoundError] – If repository is not found (error 404): wrong repo_id/repo_type, private but not authenticated or repo does not exist.
[RevisionNotFoundError] – If revision is not found (error 404) on the repo.
Example:
>>> from huggingface_hub import get_paths_info
>>> paths_info = get_paths_info("allenai/c4", ["README.md", "en"], repo_type="dataset")
>>> paths_info
[
    RepoFile(path='README.md', size=2379, blob_id='f84cb4c97182890fc1dbdeaf1a6a468fd27b4fff', lfs=None, last_commit=None, security=None),
    RepoFolder(path='en', tree_id='dc943c4c40f53d02b31ced1defa7e5f438d5862e', last_commit=None)
]
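Because nonexistent paths are silently dropped from the result, callers who need to know what was missing have to compare the result against the request themselves. A minimal sketch with a hypothetical helper, split_found_missing; the result objects are offline stand-ins exposing only path.

```python
from types import SimpleNamespace


def split_found_missing(requested, infos):
    """Split `requested` paths into (found, missing) using the returned infos."""
    found_paths = {info.path for info in infos}
    found = [p for p in requested if p in found_paths]
    missing = [p for p in requested if p not in found_paths]
    return found, missing


requested = ["README.md", "en", "does-not-exist.txt"]
# Stand-in for get_paths_info(...): the missing path is simply absent.
infos = [SimpleNamespace(path="README.md"), SimpleNamespace(path="en")]
print(split_found_missing(requested, infos))
```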
- super_squash_history(repo_id: str, *, branch: str | None = None, commit_message: str | None = None, repo_type: str | None = None, token: str | bool | None = None) None [source][source]¶
Squash commit history on a branch for a repo on the Hub.
Squashing the repo history is useful when you know you’ll make hundreds of commits and you don’t want to clutter the history. Squashing commits can only be performed from the head of a branch.
Warning: Once squashed, the commit history cannot be retrieved. This is a non-revertible operation.
Warning: Once the history of a branch has been squashed, it is not possible to merge it back into another branch since their history will have diverged.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
branch (str, optional) – The branch to squash. Defaults to the head of the "main" branch.
commit_message (str, optional) – The commit message to use for the squashed commit.
repo_type (str, optional) – Set to "dataset" or "space" if squashing the history of a dataset or a Space, None or "model" if squashing the history of a model. Default is None.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Raises:
[RepositoryNotFoundError] – If repository is not found (error 404): wrong repo_id/repo_type, private but not authenticated or repo does not exist.
[RevisionNotFoundError] – If the branch to squash cannot be found.
[BadRequestError] – If invalid reference for a branch. You cannot squash history on tags.
Example:
>>> from huggingface_hub import HfApi
>>> api = HfApi()

# Create repo
>>> repo_id = api.create_repo("test-squash").repo_id

# Make a lot of commits.
>>> api.upload_file(repo_id=repo_id, path_in_repo="file.txt", path_or_fileobj=b"content")
>>> api.upload_file(repo_id=repo_id, path_in_repo="lfs.bin", path_or_fileobj=b"content")
>>> api.upload_file(repo_id=repo_id, path_in_repo="file.txt", path_or_fileobj=b"another_content")

# Squash history
>>> api.super_squash_history(repo_id=repo_id)
- list_lfs_files(repo_id: str, *, repo_type: str | None = None, token: bool | str | None = None) Iterable[LFSFileInfo] [source][source]¶
List all LFS files in a repo on the Hub.
This is primarily useful to count how much storage a repo is using and to eventually clean up large files with permanently_delete_lfs_files. Note that this would be a permanent action that will affect all commits referencing these deleted files and that cannot be undone.
- Parameters:
repo_id (str) – The repository for which you are listing LFS files.
repo_type (str, optional) – Type of repository. Set to "dataset" or "space" if listing from a dataset or space, None or "model" if listing from a model. Default is None.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
An iterator of LFSFileInfo objects.
- Return type:
Iterable[LFSFileInfo]
Example
>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> lfs_files = api.list_lfs_files("username/my-cool-repo")

# Filter files to delete based on a combination of `filename`, `pushed_at`, `ref` or `size`.
# e.g. select only LFS files in the "checkpoints" folder
>>> lfs_files_to_delete = (lfs_file for lfs_file in lfs_files if lfs_file.filename.startswith("checkpoints/"))

# Permanently delete LFS files
>>> api.permanently_delete_lfs_files("username/my-cool-repo", lfs_files_to_delete)
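Since deletion is irreversible, it may be worth computing a dry-run plan first: which files would be selected and how many bytes they account for. The sketch below uses a hypothetical helper, plan_lfs_cleanup, and stand-in objects exposing only the filename and size fields mentioned above.

```python
from types import SimpleNamespace


def plan_lfs_cleanup(lfs_files, prefix, min_size=0):
    """Select LFS files under `prefix` with at least `min_size` bytes.

    Returns (selected_files, total_bytes). Nothing is deleted here. The
    input iterable is consumed once and the selection is kept in a list
    so it can be reviewed before being passed on.
    """
    selected = [
        f for f in lfs_files
        if f.filename.startswith(prefix) and f.size >= min_size
    ]
    return selected, sum(f.size for f in selected)


# Stand-ins for the objects yielded by list_lfs_files(...).
lfs_files = [
    SimpleNamespace(filename="checkpoints/step-100.bin", size=500),
    SimpleNamespace(filename="checkpoints/step-200.bin", size=50),
    SimpleNamespace(filename="model.safetensors", size=900),
]
selected, total = plan_lfs_cleanup(iter(lfs_files), "checkpoints/", min_size=100)
print([f.filename for f in selected], total)
```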
- permanently_delete_lfs_files(repo_id: str, lfs_files: Iterable[LFSFileInfo], *, rewrite_history: bool = True, repo_type: str | None = None, token: bool | str | None = None) None [source][source]¶
Permanently delete LFS files from a repo on the Hub.
Warning: This is a permanent action that will affect all commits referencing the deleted files and might corrupt your repository. This is a non-revertible operation. Use it only if you know what you are doing.
- Parameters:
repo_id (str) – The repository from which you are deleting LFS files.
lfs_files (Iterable[LFSFileInfo]) – An iterable of LFSFileInfo items to permanently delete from the repo. Use list_lfs_files to list all LFS files from a repo.
rewrite_history (bool, optional, defaults to True) – Whether to rewrite repository history to remove file pointers referencing the deleted LFS files (recommended).
repo_type (str, optional) – Type of repository. Set to "dataset" or "space" if deleting from a dataset or space, None or "model" if deleting from a model. Default is None.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
Example
>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> lfs_files = api.list_lfs_files("username/my-cool-repo")

# Filter files to delete based on a combination of `filename`, `pushed_at`, `ref` or `size`.
# e.g. select only LFS files in the "checkpoints" folder
>>> lfs_files_to_delete = (lfs_file for lfs_file in lfs_files if lfs_file.filename.startswith("checkpoints/"))

# Permanently delete LFS files
>>> api.permanently_delete_lfs_files("username/my-cool-repo", lfs_files_to_delete)
- create_repo(repo_id: str, *, token: str | bool | None = None, private: bool | None = None, repo_type: str | None = None, exist_ok: bool = False, resource_group_id: str | None = None, space_sdk: str | None = None, space_hardware: SpaceHardware | None = None, space_storage: SpaceStorage | None = None, space_sleep_time: int | None = None, space_secrets: List[Dict[str, str]] | None = None, space_variables: List[Dict[str, str]] | None = None) RepoUrl [source][source]¶
Create an empty repo on the HuggingFace Hub.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
private (bool, optional) – Whether to make the repo private. If None (default), the repo will be public unless the organization’s default is private. This value is ignored if the repo already exists.
repo_type (str, optional) – Set to "dataset" or "space" if creating a dataset or space, None or "model" if creating a model. Default is None.
exist_ok (bool, optional, defaults to False) – If True, do not raise an error if repo already exists.
resource_group_id (str, optional) – Resource group in which to create the repo. Resource groups are only available for Enterprise Hub organizations and let you define which members of the organization can access the resource. The ID of a resource group can be found in the URL of the resource’s page on the Hub (e.g. "66670e5163145ca562cb1988"). To learn more about resource groups, see https://huggingface.co/docs/hub/en/security-resource-groups.
space_sdk (str, optional) – Choice of SDK to use if repo_type is "space". Can be "streamlit", "gradio", "docker", or "static".
space_hardware (SpaceHardware or str, optional) – Choice of Hardware if repo_type is "space". See SpaceHardware for a complete list.
space_storage (SpaceStorage or str, optional) – Choice of persistent storage tier. Example: "small". See SpaceStorage for a complete list.
space_sleep_time (int, optional) – Number of seconds of inactivity to wait before a Space is put to sleep. Set to -1 if you don’t want your Space to sleep (default behavior for upgraded hardware). For free hardware, you can’t configure the sleep time (value is fixed to 48 hours of inactivity). See https://huggingface.co/docs/hub/spaces-gpus#sleep-time for more details.
space_secrets (List[Dict[str, str]], optional) – A list of secret keys to set in your Space. Each item is in the form {"key": ..., "value": ..., "description": ...} where description is optional. For more details, see https://huggingface.co/docs/hub/spaces-overview#managing-secrets.
space_variables (List[Dict[str, str]], optional) – A list of public environment variables to set in your Space. Each item is in the form {"key": ..., "value": ..., "description": ...} where description is optional. For more details, see https://huggingface.co/docs/hub/spaces-overview#managing-secrets-and-environment-variables.
- Returns:
URL to the newly created repo. Value is a subclass of str containing attributes like endpoint, repo_type and repo_id.
- Return type:
RepoUrl
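For a sync workflow, repo creation is usually wanted as an idempotent step, which exist_ok=True provides. The sketch below wraps the call in a hypothetical helper, ensure_dataset_repo; RecordingApi and the returned URL are stand-ins so the snippet runs offline and shows exactly which arguments would be sent.

```python
def ensure_dataset_repo(api, repo_id, private=True):
    """Create a dataset repo if needed; safe to call repeatedly.

    exist_ok=True suppresses the error for an existing repo. As documented
    above, `private` is ignored when the repo already exists.
    """
    return api.create_repo(repo_id, repo_type="dataset", private=private, exist_ok=True)


class RecordingApi:
    """Offline stand-in that records create_repo calls instead of sending them."""

    def __init__(self):
        self.calls = []

    def create_repo(self, repo_id, **kwargs):
        self.calls.append((repo_id, kwargs))
        return f"https://huggingface.co/datasets/{repo_id}"


api = RecordingApi()
url = ensure_dataset_repo(api, "user/embeddings")
print(url, api.calls[0][1])
```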
- delete_repo(repo_id: str, *, token: str | bool | None = None, repo_type: str | None = None, missing_ok: bool = False) None [source][source]¶
Delete a repo from the HuggingFace Hub. CAUTION: this is irreversible.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – Set to "dataset" or "space" if deleting a dataset or space, None or "model" if deleting a model.
missing_ok (bool, optional, defaults to False) – If True, do not raise an error if repo does not exist.
- Raises:
[RepositoryNotFoundError] – If the repository to delete cannot be found and missing_ok is set to False (default).
- update_repo_visibility(repo_id: str, private: bool = False, *, token: str | bool | None = None, repo_type: str | None = None) Dict[str, bool] [source][source]¶
Update the visibility setting of a repository.
Deprecated. Use update_repo_settings instead.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
private (bool, optional, defaults to False) – Whether the repository should be private.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – Set to "dataset" or "space" if updating a dataset or space, None or "model" if updating a model. Default is None.
- Returns:
The HTTP response in json.
- Raises:
[RepositoryNotFoundError] – If the repository cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
- update_repo_settings(repo_id: str, *, gated: Literal['auto', 'manual', False] | None = None, private: bool | None = None, token: str | bool | None = None, repo_type: str | None = None, xet_enabled: bool | None = None) None [source][source]¶
Update the settings of a repository, including gated access and visibility.
To give more control over how repos are used, the Hub allows repo authors to enable access requests for their repos, and also to set the visibility of the repo to private.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
gated (Literal["auto", "manual", False], optional) – The gated status for the repository. If set to None (default), the gated setting of the repository won’t be updated.
* "auto": The repository is gated, and access requests are automatically approved or denied based on predefined criteria.
* "manual": The repository is gated, and access requests require manual approval.
* False: The repository is not gated, and anyone can access it.
private (bool, optional) – Whether the repository should be private.
token (Union[str, bool, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – The type of the repository to update settings from ("model", "dataset" or "space"). Defaults to "model".
xet_enabled (bool, optional) – Whether the repository should be enabled for Xet Storage.
- Raises:
[ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) – If gated is not one of "auto", "manual", or False.
[ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) – If repo_type is not one of the values in constants.REPO_TYPES.
[HfHubHTTPError] – If the request to the Hugging Face Hub API fails.
[RepositoryNotFoundError] – If the repository cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
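The following is a minimal sketch of how a caller might prepare arguments for update_repo_settings, assuming the rules stated above (None means "leave unchanged"; gated must be "auto", "manual" or False). The helper names build_settings_kwargs and apply_settings are hypothetical and not part of tooluniverse or huggingface_hub; only HfApi.update_repo_settings comes from the documented API.

```python
from typing import Any, Dict, Optional, Union


def build_settings_kwargs(
    gated: Union[str, bool, None] = None,
    private: Optional[bool] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: keep only the settings that should change."""
    # Mirror the documented ValueError: gated must be "auto", "manual" or False.
    if gated is not None and gated is not False and gated not in ("auto", "manual"):
        raise ValueError(f"Invalid gated value: {gated!r}")
    kwargs: Dict[str, Any] = {}
    if gated is not None:
        kwargs["gated"] = gated
    if private is not None:
        kwargs["private"] = private
    return kwargs


def apply_settings(repo_id: str, repo_type: str = "dataset", **settings: Any) -> None:
    """Hypothetical wrapper; needs huggingface_hub, network access and a token."""
    from huggingface_hub import HfApi  # imported lazily so the sketch loads without it

    HfApi().update_repo_settings(
        repo_id=repo_id, repo_type=repo_type, **build_settings_kwargs(**settings)
    )
```

A call such as apply_settings("username/my-embeddings", private=True) would then change only the visibility and leave the gated status untouched; the repository name here is a placeholder.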
- move_repo(from_id: str, to_id: str, *, repo_type: str | None = None, token: str | bool | None = None)[source][source]¶
Move a repository from namespace1/repo_name1 to namespace2/repo_name2.
Note that certain limitations apply. For more information about moving repositories, see https://hf.co/docs/hub/repositories-settings#renaming-or-transferring-a-repo.
- Parameters:
from_id (str) – A namespace (user or an organization) and a repo name separated by a /. Original repository identifier.
to_id (str) – A namespace (user or an organization) and a repo name separated by a /. Final repository identifier.
repo_type (str, optional) – Set to "dataset" or "space" if the repository is a dataset or space, None or "model" if it is a model. Default is None.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Raises:
[RepositoryNotFoundError] – If the repository cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
- create_commit(repo_id: str, operations: Iterable[CommitOperation], *, commit_message: str, commit_description: str | None = None, token: str | bool | None = None, repo_type: str | None = None, revision: str | None = None, create_pr: bool | None = None, num_threads: int = 5, parent_commit: str | None = None, run_as_future: Literal[False] = False) CommitInfo [source][source]¶
- create_commit(repo_id: str, operations: Iterable[CommitOperation], *, commit_message: str, commit_description: str | None = None, token: str | bool | None = None, repo_type: str | None = None, revision: str | None = None, create_pr: bool | None = None, num_threads: int = 5, parent_commit: str | None = None, run_as_future: Literal[True] = False) Future[CommitInfo]
Creates a commit in the given repo, deleting & uploading files as needed.
<Tip warning={true}>
The input list of CommitOperation will be mutated during the commit process. Do not reuse the same objects for multiple commits.
</Tip>
<Tip warning={true}>
create_commit assumes that the repo already exists on the Hub. If you get a Client error 404, please make sure you are authenticated and that repo_id and repo_type are set correctly. If repo does not exist, create it first using [~hf_api.create_repo].
</Tip>
<Tip warning={true}>
create_commit is limited to 25k LFS files and a 1GB payload for regular files.
</Tip>
- Parameters:
repo_id (str) – The repository in which the commit will be created, for example: "username/custom_transformers".
operations (Iterable of [~hf_api.CommitOperation]) – An iterable of operations to include in the commit, either:
[~hf_api.CommitOperationAdd] to upload a file
[~hf_api.CommitOperationDelete] to delete a file
[~hf_api.CommitOperationCopy] to copy a file
Operation objects will be mutated to include information relative to the upload. Do not reuse the same objects for multiple commits.
commit_message (str) – The summary (first line) of the commit that will be created.
commit_description (str, optional) – The description of the commit that will be created.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.
revision (str, optional) – The git revision to commit from. Defaults to the head of the "main" branch.
create_pr (boolean, optional) – Whether or not to create a Pull Request with that commit. Defaults to False. If revision is not set, PR is opened against the "main" branch. If revision is set and is a branch, PR is opened against this branch. If revision is set and is not a branch name (example: a commit oid), a RevisionNotFoundError is returned by the server.
num_threads (int, optional) – Number of concurrent threads for uploading files. Defaults to 5. Setting it to 2 means at most 2 files will be uploaded concurrently.
parent_commit (str, optional) – The OID / SHA of the parent commit, as a hexadecimal string. Shorthands (first 7 characters) are also supported. If specified and create_pr is False, the commit will fail if revision does not point to parent_commit. If specified and create_pr is True, the pull request will be created from parent_commit. Specifying parent_commit ensures the repo has not changed before committing the changes, and can be especially useful if the repo is updated / committed to concurrently.
run_as_future (bool, optional) – Whether or not to run this method in the background. Background jobs are run sequentially without blocking the main thread. Passing run_as_future=True will return a [Future](https://docs.python.org/3/library/concurrent.futures.html#future-objects) object. Defaults to False.
- Returns:
Instance of [CommitInfo] containing information about the newly created commit (commit hash, commit url, pr url, commit message, ...). If run_as_future=True is passed, returns a Future object which will contain the result when executed.
- Return type:
[CommitInfo] or Future
- Raises:
[ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) – If commit message is empty.
[ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) – If parent commit is not a valid commit OID.
[ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) – If a README.md file with an invalid metadata section is committed. In this case, the commit will fail early, before trying to upload any file.
[ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) – If create_pr is True and revision is neither None nor "main".
[RepositoryNotFoundError] – If repository is not found (error 404): wrong repo_id/repo_type, private but not authenticated or repo does not exist.
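As an illustration of how create_commit combines uploads and deletions in one commit, the sketch below plans which files to add and which to delete when mirroring a local set of files to a repo. The helpers plan_operations and commit_plan are hypothetical names introduced here, and the mirroring policy (delete every remote file that is absent locally) is an assumption for the example, not behaviour of the library. Operation objects are built fresh on every call because, as noted above, they are mutated during the commit.

```python
from typing import Dict, Iterable, List, Tuple

Planned = Tuple[str, str]  # (kind, path_in_repo) where kind is "add" or "delete"


def plan_operations(local_files: Dict[str, bytes], remote_paths: Iterable[str]) -> List[Planned]:
    """Hypothetical planner: add every local file, delete remote files missing locally."""
    adds = [("add", path) for path in sorted(local_files)]
    deletes = [("delete", path) for path in sorted(set(remote_paths) - set(local_files))]
    return adds + deletes


def commit_plan(repo_id: str, local_files: Dict[str, bytes], remote_paths: Iterable[str], message: str):
    """Hypothetical wrapper; needs huggingface_hub, network access and a token."""
    from huggingface_hub import CommitOperationAdd, CommitOperationDelete, HfApi

    operations = []
    for kind, path in plan_operations(local_files, remote_paths):
        if kind == "add":
            # New object per commit: operations must not be reused across commits.
            operations.append(CommitOperationAdd(path_in_repo=path, path_or_fileobj=local_files[path]))
        else:
            operations.append(CommitOperationDelete(path_in_repo=path))
    return HfApi().create_commit(repo_id=repo_id, operations=operations, commit_message=message)
```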
- preupload_lfs_files(repo_id: str, additions: Iterable[CommitOperationAdd], *, token: str | bool | None = None, repo_type: str | None = None, revision: str | None = None, create_pr: bool | None = None, num_threads: int = 5, free_memory: bool = True, gitignore_content: str | None = None)[source][source]¶
Pre-upload LFS files to S3 in preparation for a future commit.
This method is useful if you are generating the files to upload on-the-fly and you don’t want to store them in memory before uploading them all at once.
<Tip warning={true}>
This is a power-user method. You shouldn’t need to call it directly to make a normal commit. Use [create_commit] directly instead.
</Tip>
<Tip warning={true}>
Commit operations will be mutated during the process. In particular, the attached path_or_fileobj will be removed after the upload to save memory (and replaced by an empty bytes object). Do not reuse the same objects except to pass them to [create_commit]. If you don’t want to remove the attached content from the commit operation object, pass free_memory=False.
</Tip>
- Parameters:
repo_id (str) – The repository in which you will commit the files, for example: "username/custom_transformers".
additions (Iterable of [CommitOperationAdd]) – The list of files to upload. Warning: the objects in this list will be mutated to include information relative to the upload. Do not reuse the same objects for multiple commits.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – The type of repository to upload to ("model" by default, "dataset" or "space").
revision (str, optional) – The git revision to commit from. Defaults to the head of the "main" branch.
create_pr (boolean, optional) – Whether or not you plan to create a Pull Request with that commit. Defaults to False.
num_threads (int, optional) – Number of concurrent threads for uploading files. Defaults to 5. Setting it to 2 means at most 2 files will be uploaded concurrently.
gitignore_content (str, optional) – The content of the .gitignore file to know which files should be ignored. The order of priority is to first check if gitignore_content is passed, then check if the .gitignore file is present in the list of files to commit and finally default to the .gitignore file already hosted on the Hub (if any).
Example:
>>> from huggingface_hub import CommitOperationAdd, preupload_lfs_files, create_commit, create_repo
>>> repo_id = create_repo("test_preupload").repo_id
# Generate and preupload LFS files one by one
>>> operations = []  # List of all CommitOperationAdd objects that will be generated
>>> for i in range(5):
...     content = ...  # generate binary content
...     addition = CommitOperationAdd(path_in_repo=f"shard_{i}_of_5.bin", path_or_fileobj=content)
...     preupload_lfs_files(repo_id, additions=[addition])  # upload + free memory
...     operations.append(addition)
# Create commit
>>> create_commit(repo_id, operations=operations, commit_message="Commit all shards")
- upload_file(*, path_or_fileobj: str | Path | bytes | BinaryIO, path_in_repo: str, repo_id: str, token: str | bool | None = None, repo_type: str | None = None, revision: str | None = None, commit_message: str | None = None, commit_description: str | None = None, create_pr: bool | None = None, parent_commit: str | None = None, run_as_future: Literal[False] = False) CommitInfo [source][source]¶
- upload_file(*, path_or_fileobj: str | Path | bytes | BinaryIO, path_in_repo: str, repo_id: str, token: str | bool | None = None, repo_type: str | None = None, revision: str | None = None, commit_message: str | None = None, commit_description: str | None = None, create_pr: bool | None = None, parent_commit: str | None = None, run_as_future: Literal[True] = False) Future[CommitInfo]
Upload a local file (up to 50 GB) to the given repo. The upload is done through an HTTP POST request, and doesn’t require git or git-lfs to be installed.
- Parameters:
path_or_fileobj (str, Path, bytes, or IO) – Path to a file on the local machine or binary data stream / fileobj / buffer.
path_in_repo (str) – Relative filepath in the repo, for example: "checkpoints/1fec34a/weights.bin".
repo_id (str) – The repository to which the file will be uploaded, for example: "username/custom_transformers".
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.
revision (str, optional) – The git revision to commit from. Defaults to the head of the "main" branch.
commit_message (str, optional) – The summary / title / first line of the generated commit.
commit_description (str, optional) – The description of the generated commit.
create_pr (boolean, optional) – Whether or not to create a Pull Request with that commit. Defaults to False. If revision is not set, PR is opened against the "main" branch. If revision is set and is a branch, PR is opened against this branch. If revision is set and is not a branch name (example: a commit oid), a RevisionNotFoundError is returned by the server.
parent_commit (str, optional) – The OID / SHA of the parent commit, as a hexadecimal string. Shorthands (first 7 characters) are also supported. If specified and create_pr is False, the commit will fail if revision does not point to parent_commit. If specified and create_pr is True, the pull request will be created from parent_commit. Specifying parent_commit ensures the repo has not changed before committing the changes, and can be especially useful if the repo is updated / committed to concurrently.
run_as_future (bool, optional) – Whether or not to run this method in the background. Background jobs are run sequentially without blocking the main thread. Passing run_as_future=True will return a [Future](https://docs.python.org/3/library/concurrent.futures.html#future-objects) object. Defaults to False.
- Returns:
Instance of [CommitInfo] containing information about the newly created commit (commit hash, commit url, pr url, commit message, ...). If run_as_future=True is passed, returns a Future object which will contain the result when executed.
- Return type:
[CommitInfo] or Future
<Tip>
Raises the following errors:
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) if the HuggingFace API returned an error
[ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) if some parameter value is invalid
[~utils.RepositoryNotFoundError] If the repository cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
[~utils.RevisionNotFoundError] If the revision cannot be found.
</Tip>
<Tip warning={true}>
upload_file assumes that the repo already exists on the Hub. If you get a Client error 404, please make sure you are authenticated and that repo_id and repo_type are set correctly. If repo does not exist, create it first using [~hf_api.create_repo].
</Tip>
Example:
>>> from huggingface_hub import upload_file
>>> with open("./local/filepath", "rb") as fobj:
...     upload_file(
...         path_or_fileobj=fobj,
...         path_in_repo="remote/file/path.h5",
...         repo_id="username/my-dataset",
...         repo_type="dataset",
...         token="my_token",
...     )
"https://huggingface.co/datasets/username/my-dataset/blob/main/remote/file/path.h5"
>>> upload_file(
...     path_or_fileobj=".\\local\\file\\path",
...     path_in_repo="remote/file/path.h5",
...     repo_id="username/my-model",
...     token="my_token",
... )
"https://huggingface.co/username/my-model/blob/main/remote/file/path.h5"
>>> upload_file(
...     path_or_fileobj=".\\local\\file\\path",
...     path_in_repo="remote/file/path.h5",
...     repo_id="username/my-model",
...     token="my_token",
...     create_pr=True,
... )
"https://huggingface.co/username/my-model/blob/refs%2Fpr%2F1/remote/file/path.h5"
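The run_as_future description above says background jobs run sequentially without blocking the main thread. The sketch below models that behaviour with a single-worker thread pool from the standard library; fake_upload is a stand-in for a real upload call, and using one worker is an assumption made here to reproduce "sequential" execution, not a statement about the library's internals.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor

# One worker: submitted jobs are queued and executed one after another.
_pool = ThreadPoolExecutor(max_workers=1)
completed = []


def fake_upload(name: str, delay: float) -> str:
    """Stand-in for an upload; sleeps instead of talking to the Hub."""
    time.sleep(delay)
    completed.append(name)
    return f"committed {name}"


def upload_in_background(name: str, delay: float = 0.05) -> Future:
    # Returns immediately; the caller keeps working while the job runs.
    return _pool.submit(fake_upload, name, delay)


first = upload_in_background("shard_0.bin", delay=0.1)
second = upload_in_background("shard_1.bin", delay=0.0)
# .result() blocks until the corresponding job has finished.
results = [first.result(), second.result()]
_pool.shutdown(wait=True)
```

Even though the second job is faster, it completes after the first because both share the single worker, which is the ordering guarantee a caller would rely on when queuing several uploads.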
- upload_folder(*, repo_id: str, folder_path: str | Path, path_in_repo: str | None = None, commit_message: str | None = None, commit_description: str | None = None, token: str | bool | None = None, repo_type: str | None = None, revision: str | None = None, create_pr: bool | None = None, parent_commit: str | None = None, allow_patterns: List[str] | str | None = None, ignore_patterns: List[str] | str | None = None, delete_patterns: List[str] | str | None = None, run_as_future: Literal[False] = False) CommitInfo [source][source]¶
- upload_folder(*, repo_id: str, folder_path: str | Path, path_in_repo: str | None = None, commit_message: str | None = None, commit_description: str | None = None, token: str | bool | None = None, repo_type: str | None = None, revision: str | None = None, create_pr: bool | None = None, parent_commit: str | None = None, allow_patterns: List[str] | str | None = None, ignore_patterns: List[str] | str | None = None, delete_patterns: List[str] | str | None = None, run_as_future: Literal[True] = False) Future[CommitInfo]
Upload a local folder to the given repo. The upload is done through HTTP requests, and doesn’t require git or git-lfs to be installed.
The structure of the folder will be preserved. Files with the same name already present in the repository will be overwritten. Others will be left untouched.
Use the allow_patterns and ignore_patterns arguments to specify which files to upload. These parameters accept either a single pattern or a list of patterns. Patterns are Standard Wildcards (globbing patterns) as documented [here](https://tldp.org/LDP/GNU-Linux-Tools-Summary/html/x11655.htm). If both allow_patterns and ignore_patterns are provided, both constraints apply. By default, all files from the folder are uploaded.
Use the delete_patterns argument to specify remote files you want to delete. Input type is the same as for allow_patterns (see above). If path_in_repo is also provided, the patterns are matched against paths relative to this folder. For example, upload_folder(..., path_in_repo="experiment", delete_patterns="logs/*") will delete any remote file under ./experiment/logs/. Note that the .gitattributes file will not be deleted even if it matches the patterns.
Any .git/ folder present in any subdirectory will be ignored. However, please be aware that the .gitignore file is not taken into account.
Uses HfApi.create_commit under the hood.
- Parameters:
repo_id (str) – The repository to which the folder will be uploaded, for example: "username/custom_transformers".
folder_path (str or Path) – Path to the folder to upload on the local file system.
path_in_repo (str, optional) – Relative path of the directory in the repo, for example: "checkpoints/1fec34a/results". Will default to the root folder of the repository.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.
revision (str, optional) – The git revision to commit from. Defaults to the head of the "main" branch.
commit_message (str, optional) – The summary / title / first line of the generated commit. Defaults to: f"Upload {path_in_repo} with huggingface_hub".
commit_description (str, optional) – The description of the generated commit.
create_pr (boolean, optional) – Whether or not to create a Pull Request with that commit. Defaults to False. If revision is not set, PR is opened against the "main" branch. If revision is set and is a branch, PR is opened against this branch. If revision is set and is not a branch name (example: a commit oid), a RevisionNotFoundError is returned by the server.
parent_commit (str, optional) – The OID / SHA of the parent commit, as a hexadecimal string. Shorthands (first 7 characters) are also supported. If specified and create_pr is False, the commit will fail if revision does not point to parent_commit. If specified and create_pr is True, the pull request will be created from parent_commit. Specifying parent_commit ensures the repo has not changed before committing the changes, and can be especially useful if the repo is updated / committed to concurrently.
allow_patterns (List[str] or str, optional) – If provided, only files matching at least one pattern are uploaded.
ignore_patterns (List[str] or str, optional) – If provided, files matching any of the patterns are not uploaded.
delete_patterns (List[str] or str, optional) – If provided, remote files matching any of the patterns will be deleted from the repo while committing new files. This is useful if you don’t know which files have already been uploaded. Note: to avoid discrepancies the .gitattributes file is not deleted even if it matches the pattern.
run_as_future (bool, optional) – Whether or not to run this method in the background. Background jobs are run sequentially without blocking the main thread. Passing run_as_future=True will return a [Future](https://docs.python.org/3/library/concurrent.futures.html#future-objects) object. Defaults to False.
- Returns:
Instance of [CommitInfo] containing information about the newly created commit (commit hash, commit url, pr url, commit message, ...). If run_as_future=True is passed, returns a Future object which will contain the result when executed.
- Return type:
[CommitInfo] or Future
<Tip>
Raises the following errors:
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) if the HuggingFace API returned an error
[ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) if some parameter value is invalid
</Tip>
<Tip warning={true}>
upload_folder assumes that the repo already exists on the Hub. If you get a Client error 404, please make sure you are authenticated and that repo_id and repo_type are set correctly. If repo does not exist, create it first using [~hf_api.create_repo].
</Tip>
<Tip>
When dealing with a large folder (thousands of files or hundreds of GB), we recommend using [~hf_api.upload_large_folder] instead.
</Tip>
Example:
# Upload checkpoints folder except the log files
>>> upload_folder(
...     folder_path="local/checkpoints",
...     path_in_repo="remote/experiment/checkpoints",
...     repo_id="username/my-dataset",
...     repo_type="dataset",
...     token="my_token",
...     ignore_patterns="**/logs/*.txt",
... )
"https://huggingface.co/datasets/username/my-dataset/tree/main/remote/experiment/checkpoints"
# Upload checkpoints folder including logs while deleting existing logs from the repo
# Useful if you don't know exactly which log files have already been pushed
>>> upload_folder(
...     folder_path="local/checkpoints",
...     path_in_repo="remote/experiment/checkpoints",
...     repo_id="username/my-dataset",
...     repo_type="dataset",
...     token="my_token",
...     delete_patterns="**/logs/*.txt",
... )
"https://huggingface.co/datasets/username/my-dataset/tree/main/remote/experiment/checkpoints"
# Upload checkpoints folder while creating a PR
>>> upload_folder(
...     folder_path="local/checkpoints",
...     path_in_repo="remote/experiment/checkpoints",
...     repo_id="username/my-dataset",
...     repo_type="dataset",
...     token="my_token",
...     create_pr=True,
... )
"https://huggingface.co/datasets/username/my-dataset/tree/refs%2Fpr%2F1/remote/experiment/checkpoints"
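To preview which local files allow_patterns and ignore_patterns would select before calling upload_folder, the rule described above (a file must match at least one allow pattern and no ignore pattern) can be approximated with the standard library. This is a sketch only: select_files is a hypothetical helper, and matching with fnmatchcase is an assumption that may differ from the library's own matcher in edge cases.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Optional, Union

Patterns = Optional[Union[str, List[str]]]


def _as_list(patterns: Patterns) -> Optional[List[str]]:
    """Accept a single pattern or a list of patterns, as the parameters do."""
    if patterns is None:
        return None
    return [patterns] if isinstance(patterns, str) else list(patterns)


def select_files(paths: Iterable[str], allow_patterns: Patterns = None,
                 ignore_patterns: Patterns = None) -> List[str]:
    """Hypothetical preview of the allow/ignore filtering; both constraints apply."""
    allow = _as_list(allow_patterns)
    ignore = _as_list(ignore_patterns)
    selected = []
    for path in paths:
        if allow is not None and not any(fnmatchcase(path, p) for p in allow):
            continue  # not matched by any allow pattern
        if ignore is not None and any(fnmatchcase(path, p) for p in ignore):
            continue  # matched by an ignore pattern
        selected.append(path)
    return selected


paths = ["weights.bin", "logs/train.txt", "notes.md"]
only_bin = select_files(paths, allow_patterns="*.bin")
no_logs = select_files(paths, allow_patterns=["*.bin", "*.txt"], ignore_patterns="logs/*")
```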
- delete_file(path_in_repo: str, repo_id: str, *, token: str | bool | None = None, repo_type: str | None = None, revision: str | None = None, commit_message: str | None = None, commit_description: str | None = None, create_pr: bool | None = None, parent_commit: str | None = None) CommitInfo [source][source]¶
Deletes a file in the given repo.
- Parameters:
path_in_repo (str) – Relative filepath in the repo, for example: "checkpoints/1fec34a/weights.bin".
repo_id (str) – The repository from which the file will be deleted, for example: "username/custom_transformers".
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – Set to "dataset" or "space" if the file is in a dataset or space, None or "model" if in a model. Default is None.
revision (str, optional) – The git revision to commit from. Defaults to the head of the "main" branch.
commit_message (str, optional) – The summary / title / first line of the generated commit. Defaults to f"Delete {path_in_repo} with huggingface_hub".
commit_description (str, optional) – The description of the generated commit.
create_pr (boolean, optional) – Whether or not to create a Pull Request with that commit. Defaults to False. If revision is not set, PR is opened against the "main" branch. If revision is set and is a branch, PR is opened against this branch. If revision is set and is not a branch name (example: a commit oid), a RevisionNotFoundError is returned by the server.
parent_commit (str, optional) – The OID / SHA of the parent commit, as a hexadecimal string. Shorthands (first 7 characters) are also supported. If specified and create_pr is False, the commit will fail if revision does not point to parent_commit. If specified and create_pr is True, the pull request will be created from parent_commit. Specifying parent_commit ensures the repo has not changed before committing the changes, and can be especially useful if the repo is updated / committed to concurrently.
<Tip>
Raises the following errors:
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) if the HuggingFace API returned an error
[ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) if some parameter value is invalid
[~utils.RepositoryNotFoundError] If the repository cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
[~utils.RevisionNotFoundError] If the revision cannot be found.
[~utils.EntryNotFoundError] If the file cannot be found.
</Tip>
- delete_files(repo_id: str, delete_patterns: List[str], *, token: bool | str | None = None, repo_type: str | None = None, revision: str | None = None, commit_message: str | None = None, commit_description: str | None = None, create_pr: bool | None = None, parent_commit: str | None = None) CommitInfo [source][source]¶
Delete files from a repository on the Hub.
If a folder path is provided, the entire folder is deleted as well as all files it contained.
- Parameters:
repo_id (str) – The repository from which the files will be deleted, for example: "username/custom_transformers".
delete_patterns (List[str]) – List of files or folders to delete. Each string can either be a file path, a folder path or a Unix shell-style wildcard. E.g. ["file.txt", "folder/", "data/*.parquet"].
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – Type of the repo to delete files from. Can be "model", "dataset" or "space". Defaults to "model".
revision (str, optional) – The git revision to commit from. Defaults to the head of the "main" branch.
commit_message (str, optional) – The summary (first line) of the generated commit. Defaults to f"Delete files using huggingface_hub".
commit_description (str, optional) – The description of the generated commit.
create_pr (boolean, optional) – Whether or not to create a Pull Request with that commit. Defaults to False. If revision is not set, PR is opened against the "main" branch. If revision is set and is a branch, PR is opened against this branch. If revision is set and is not a branch name (example: a commit oid), a RevisionNotFoundError is returned by the server.
parent_commit (str, optional) – The OID / SHA of the parent commit, as a hexadecimal string. Shorthands (first 7 characters) are also supported. If specified and create_pr is False, the commit will fail if revision does not point to parent_commit. If specified and create_pr is True, the pull request will be created from parent_commit. Specifying parent_commit ensures the repo has not changed before committing the changes, and can be especially useful if the repo is updated / committed to concurrently.
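The three kinds of delete_patterns entries described above (file path, folder path, wildcard) can be previewed against a list of remote paths with a small helper. This is a rough sketch under an assumption: a trailing "/" is treated as "everything under that folder" by appending "*", and matching uses fnmatchcase. The helper files_to_delete is hypothetical, and the Hub's actual matching may differ.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def files_to_delete(remote_files: Iterable[str], delete_patterns: List[str]) -> List[str]:
    """Hypothetical preview of which remote files the patterns would select."""
    # Assumption: "folder/" means the folder and everything below it.
    expanded = [p + "*" if p.endswith("/") else p for p in delete_patterns]
    return [f for f in remote_files if any(fnmatchcase(f, p) for p in expanded)]


remote = ["file.txt", "folder/a.bin", "folder/sub/b.bin",
          "data/x.parquet", "data/y.csv", "keep.md"]
doomed = files_to_delete(remote, ["file.txt", "folder/", "data/*.parquet"])
```

Running such a preview first is a cheap safeguard, since a delete commit removes every matching file at once.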
- delete_folder(path_in_repo: str, repo_id: str, *, token: bool | str | None = None, repo_type: str | None = None, revision: str | None = None, commit_message: str | None = None, commit_description: str | None = None, create_pr: bool | None = None, parent_commit: str | None = None) CommitInfo [source][source]¶
Deletes a folder in the given repo.
Simple wrapper around the [create_commit] method.
- Parameters:
path_in_repo (str) – Relative folder path in the repo, for example: "checkpoints/1fec34a".
repo_id (str) – The repository from which the folder will be deleted, for example: "username/custom_transformers".
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – Set to "dataset" or "space" if the folder is in a dataset or space, None or "model" if in a model. Default is None.
revision (str, optional) – The git revision to commit from. Defaults to the head of the "main" branch.
commit_message (str, optional) – The summary / title / first line of the generated commit. Defaults to f"Delete folder {path_in_repo} with huggingface_hub".
commit_description (str, optional) – The description of the generated commit.
create_pr (boolean, optional) – Whether or not to create a Pull Request with that commit. Defaults to False. If revision is not set, PR is opened against the "main" branch. If revision is set and is a branch, PR is opened against this branch. If revision is set and is not a branch name (example: a commit oid), a RevisionNotFoundError is returned by the server.
parent_commit (str, optional) – The OID / SHA of the parent commit, as a hexadecimal string. Shorthands (first 7 characters) are also supported. If specified and create_pr is False, the commit will fail if revision does not point to parent_commit. If specified and create_pr is True, the pull request will be created from parent_commit. Specifying parent_commit ensures the repo has not changed before committing the changes, and can be especially useful if the repo is updated / committed to concurrently.
- upload_large_folder(repo_id: str, folder_path: str | Path, *, repo_type: str, revision: str | None = None, private: bool | None = None, allow_patterns: str | List[str] | None = None, ignore_patterns: str | List[str] | None = None, num_workers: int | None = None, print_report: bool = True, print_report_every: int = 60) None [source][source]¶
Upload a large folder to the Hub in the most resilient way possible.
Several workers are started to upload files in an optimized way. Before being committed to a repo, files must be hashed and, if they are LFS files, pre-uploaded. Workers perform these tasks for each file in the folder. At each step, some metadata about the upload process is saved in the folder under .cache/.huggingface/ so that the process can be resumed if interrupted. The whole process might result in several commits.
- Parameters:
repo_id (str) – The repository to which the files will be uploaded. E.g. "HuggingFaceTB/smollm-corpus".
folder_path (str or Path) – Path to the folder to upload on the local file system.
repo_type (str) – Type of the repository. Must be one of "model", "dataset" or "space". Unlike in all other HfApi methods, repo_type is explicitly required here. This is to avoid any mistake when uploading a large folder to the Hub, and therefore to avoid having to re-upload everything.
revision (str, optional) – The branch to commit to. If not provided, the main branch will be used.
private (bool, optional) – Whether the repository should be private. If None (default), the repo will be public unless the organization’s default is private.
allow_patterns (List[str] or str, optional) – If provided, only files matching at least one pattern are uploaded.
ignore_patterns (List[str] or str, optional) – If provided, files matching any of the patterns are not uploaded.
num_workers (int, optional) – Number of workers to start. Defaults to os.cpu_count() - 2 (minimum 2). A higher number of workers may speed up the process if your machine allows it. However, on machines with a slower connection, it is recommended to keep the number of workers low to ensure better resumability, since partially uploaded files have to be completely re-uploaded if the process is interrupted.
print_report (bool, optional) – Whether to print a report of the upload progress. Defaults to True. The report is printed to sys.stdout every print_report_every seconds (60 by default) and overwrites the previous report.
print_report_every (int, optional) – Frequency at which the report is printed. Defaults to 60 seconds.
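The interplay of allow_patterns and ignore_patterns can be sketched locally as follows. This is a minimal illustration that assumes shell-style wildcard matching as provided by Python's fnmatch; the helper name select_paths is hypothetical, and the library's own matching may differ in details.

```python
from fnmatch import fnmatch
from typing import Iterable, List, Optional, Union

Patterns = Optional[Union[str, List[str]]]


def _as_list(patterns: Patterns) -> Optional[List[str]]:
    """Normalize a single pattern or a list of patterns to a list (or None)."""
    if patterns is None:
        return None
    return [patterns] if isinstance(patterns, str) else list(patterns)


def select_paths(
    paths: Iterable[str],
    allow_patterns: Patterns = None,
    ignore_patterns: Patterns = None,
) -> List[str]:
    """Keep paths matching at least one allow pattern and no ignore pattern."""
    allow = _as_list(allow_patterns)
    ignore = _as_list(ignore_patterns)
    selected = []
    for path in paths:
        # With allow patterns set, a path must match at least one of them.
        if allow is not None and not any(fnmatch(path, p) for p in allow):
            continue
        # A path matching any ignore pattern is dropped.
        if ignore is not None and any(fnmatch(path, p) for p in ignore):
            continue
        selected.append(path)
    return selected
```

Note that with fnmatch the `*` wildcard also matches path separators, so "data/*" selects files in nested subfolders as well.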
Note: a few things to keep in mind:
- Repository limits still apply: https://huggingface.co/docs/hub/repositories-recommendations
- Do not start several processes in parallel.
- You can interrupt and resume the process at any time.
- Do not upload the same folder to several repositories. If you need to do so, you must delete the local .cache/.huggingface/ folder first.
Warning: while much more robust for uploading large folders, upload_large_folder is more limited than upload_folder feature-wise. In practice:
- You cannot set a custom path_in_repo. If you want to upload to a subfolder, you need to set up the proper structure locally.
- You cannot set a custom commit_message or commit_description, since multiple commits are created.
- You cannot delete from the repo while uploading. Please make a separate commit first.
- You cannot create a PR directly. Please create a PR first (from the UI or using create_pull_request) and then commit to it by passing revision.
Technical details:
The upload_large_folder process is as follows:
1. (Check parameters and setup.)
2. Create repo if missing.
3. List local files to upload.
4. Run validation checks and display warnings if repository limits might be exceeded:
   - Warns if the total number of files exceeds 100k (recommended limit).
   - Warns if any folder contains more than 10k files (recommended limit).
   - Warns about files larger than 20GB (recommended) or 50GB (hard limit).
5. Start workers. Workers can perform the following tasks:
   - Hash a file.
   - Get upload mode (regular or LFS) for a list of files.
   - Pre-upload an LFS file.
   - Commit a bunch of files.
   Once a worker finishes a task, it moves on to the next task based on the priority list (see below) until all files are uploaded and committed.
6. While workers are up, regularly print a report to sys.stdout.
Order of priority:
1. Commit if more than 5 minutes since last commit attempt (and at least 1 file).
2. Commit if at least 150 files are ready to commit.
3. Get upload mode if at least 10 files have been hashed.
4. Pre-upload LFS file if at least 1 file and no worker is pre-uploading.
5. Hash file if at least 1 file and no worker is hashing.
6. Get upload mode if at least 1 file and no worker is getting upload mode.
7. Pre-upload LFS file if at least 1 file (exception: if hf_transfer is enabled, only 1 worker can pre-upload LFS at a time).
8. Hash file if at least 1 file to hash.
9. Get upload mode if at least 1 file to get upload mode.
10. Commit if at least 1 file to commit and at least 1 min since last commit attempt.
11. Commit if at least 1 file to commit and all other queues are empty.
Special rules:
- If hf_transfer is enabled, only 1 LFS uploader at a time. Otherwise the CPU would be bloated by hf_transfer.
- Only one worker can commit at a time.
- If no tasks are available, the worker waits for 10 seconds before checking again.
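The order of priority and the special rules above can be read as a single decision function that each idle worker evaluates. The sketch below is one way to encode that list; it is not the library's code, and the QueueState fields and task names are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueueState:
    """Hypothetical snapshot of the upload queues and worker activity."""
    to_hash: int = 0               # files waiting to be hashed
    to_get_mode: int = 0           # hashed files waiting for their upload mode
    to_preupload: int = 0          # LFS files waiting to be pre-uploaded
    to_commit: int = 0             # files ready to be committed
    workers_hashing: int = 0
    workers_getting_mode: int = 0
    workers_preuploading: int = 0
    workers_committing: int = 0
    seconds_since_commit: float = 0.0
    hf_transfer: bool = False


def next_task(s: QueueState) -> Optional[str]:
    """Pick the next task for an idle worker, or None if it should wait."""
    # Special rule: only one worker can commit at a time.
    can_commit = s.to_commit > 0 and s.workers_committing == 0
    # Special rule: with hf_transfer, only one LFS uploader at a time.
    can_preupload = s.to_preupload > 0 and not (
        s.hf_transfer and s.workers_preuploading > 0
    )
    if can_commit and s.seconds_since_commit > 5 * 60:          # rule 1
        return "commit"
    if can_commit and s.to_commit >= 150:                       # rule 2
        return "commit"
    if s.to_get_mode >= 10:                                     # rule 3
        return "get_upload_mode"
    if s.to_preupload > 0 and s.workers_preuploading == 0:      # rule 4
        return "preupload_lfs"
    if s.to_hash > 0 and s.workers_hashing == 0:                # rule 5
        return "hash"
    if s.to_get_mode > 0 and s.workers_getting_mode == 0:       # rule 6
        return "get_upload_mode"
    if can_preupload:                                           # rule 7
        return "preupload_lfs"
    if s.to_hash > 0:                                           # rule 8
        return "hash"
    if s.to_get_mode > 0:                                       # rule 9
        return "get_upload_mode"
    if can_commit and s.seconds_since_commit >= 60:             # rule 10
        return "commit"
    others_empty = s.to_hash == 0 and s.to_get_mode == 0 and s.to_preupload == 0
    if can_commit and others_empty:                             # rule 11
        return "commit"
    return None  # nothing to do: the worker waits before checking again
```

A worker loop would call next_task repeatedly and sleep (10 seconds, per the special rules) whenever it returns None.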
- get_hf_file_metadata(*, url: str, token: bool | str | None = None, proxies: Dict | None = None, timeout: float | None = 10) HfFileMetadata [source][source]¶
Fetch metadata of a file versioned on the Hub for a given url.
- Parameters:
url (str) – File url, for example returned by hf_hub_url.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
proxies (dict, optional) – Dictionary mapping protocol to the URL of the proxy passed to requests.request.
timeout (float, optional, defaults to 10) – How many seconds to wait for the server to send metadata before giving up.
- Returns:
A HfFileMetadata object containing metadata such as location, etag, size and commit_hash.
- hf_hub_download(repo_id: str, filename: str, *, subfolder: str | None = None, repo_type: str | None = None, revision: str | None = None, cache_dir: str | Path | None = None, local_dir: str | Path | None = None, force_download: bool = False, proxies: Dict | None = None, etag_timeout: float = 10, token: bool | str | None = None, local_files_only: bool = False, resume_download: bool | None = None, force_filename: str | None = None, local_dir_use_symlinks: bool | Literal['auto'] = 'auto') str [source][source]¶
Download a given file if it’s not already present in the local cache.
The new cache file layout looks like this:
- The cache directory contains one subfolder per repo_id (namespaced by repo type).
- Inside each repo folder:
   - refs is a list of the latest known revision => commit_hash pairs.
   - blobs contains the actual file blobs (identified by their git-sha or sha256, depending on whether they’re LFS files or not).
   - snapshots contains one subfolder per commit; each “commit” contains the subset of the files that have been resolved at that particular commit. Each filename is a symlink to the blob at that particular commit.

[  96]  .
└── [ 160]  models--julien-c--EsperBERTo-small
    ├── [ 160]  blobs
    │   ├── [321M]  403450e234d65943a7dcf7e05a771ce3c92faa84dd07db4ac20f592037a1e4bd
    │   ├── [ 398]  7cb18dc9bafbfcf74629a4b760af1b160957a83e
    │   └── [1.4K]  d7edf6bd2a681fb0175f7735299831ee1b22b812
    ├── [  96]  refs
    │   └── [  40]  main
    └── [ 128]  snapshots
        ├── [ 128]  2439f60ef33a0d46d85da5001d52aeda5b00ce9f
        │   ├── [  52]  README.md -> ../../blobs/d7edf6bd2a681fb0175f7735299831ee1b22b812
        │   └── [  76]  pytorch_model.bin -> ../../blobs/403450e234d65943a7dcf7e05a771ce3c92faa84dd07db4ac20f592037a1e4bd
        └── [ 128]  bbc77c8132af1cc5cf678da3f1ddf2de43606d48
            ├── [  52]  README.md -> ../../blobs/7cb18dc9bafbfcf74629a4b760af1b160957a83e
            └── [  76]  pytorch_model.bin -> ../../blobs/403450e234d65943a7dcf7e05a771ce3c92faa84dd07db4ac20f592037a1e4bd
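To make the layout above concrete, the sketch below rebuilds the path of a file inside a snapshot from its repo id and commit hash. The folder naming rule is inferred from the "models--julien-c--EsperBERTo-small" entry in the tree; the helper names are hypothetical, and applying the same rule to other repo types is an assumption.

```python
from pathlib import Path


def repo_cache_folder(repo_id: str, repo_type: str = "model") -> str:
    """Build the per-repo folder name, e.g. 'models--julien-c--EsperBERTo-small'."""
    # Assumption: the repo type is pluralized and joined to the repo id parts by '--'.
    return "--".join([f"{repo_type}s", *repo_id.split("/")])


def snapshot_file_path(
    cache_dir: Path,
    repo_id: str,
    commit_hash: str,
    filename: str,
    repo_type: str = "model",
) -> Path:
    """Return where a file of a given commit sits under the snapshots folder."""
    return cache_dir / repo_cache_folder(repo_id, repo_type) / "snapshots" / commit_hash / filename
```

In the real cache the returned path would be a symlink pointing into the blobs folder, which is why two snapshots can share one large file.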
If local_dir is provided, the file structure from the repo will be replicated in this location. When using this option, the cache_dir will not be used and a .cache/huggingface/ folder will be created at the root of local_dir to store some metadata related to the downloaded files. While this mechanism is not as robust as the main cache system, it’s optimized for regularly pulling the latest version of a repository.
- Parameters:
repo_id (str) – A user or an organization name and a repo name separated by a /.
filename (str) – The name of the file in the repo.
subfolder (str, optional) – An optional value corresponding to a folder inside the repository.
repo_type (str, optional) – Set to "dataset" or "space" if downloading from a dataset or space, None or "model" if downloading from a model. Default is None.
revision (str, optional) – An optional Git revision id which can be a branch name, a tag, or a commit hash.
cache_dir (str, Path, optional) – Path to the folder where cached files are stored.
local_dir (str or Path, optional) – If provided, the downloaded file will be placed under this directory.
force_download (bool, optional, defaults to False) – Whether the file should be downloaded even if it already exists in the local cache.
proxies (dict, optional) – Dictionary mapping protocol to the URL of the proxy passed to requests.request.
etag_timeout (float, optional, defaults to 10) – When fetching ETag, how many seconds to wait for the server to send data before giving up. This value is passed to requests.request.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
local_files_only (bool, optional, defaults to False) – If True, avoid downloading the file and return the path to the local cached file if it exists.
- Returns:
Local path of the file or, if networking is off, the last version of the file cached on disk.
- Return type:
str
- Raises:
RepositoryNotFoundError – If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
RevisionNotFoundError – If the revision to download from cannot be found.
EntryNotFoundError – If the file to download cannot be found.
LocalEntryNotFoundError – If network is disabled or unavailable and file is not found in cache.
EnvironmentError (https://docs.python.org/3/library/exceptions.html#EnvironmentError) – If token=True but the token cannot be found.
OSError (https://docs.python.org/3/library/exceptions.html#OSError) – If ETag cannot be determined.
ValueError (https://docs.python.org/3/library/exceptions.html#ValueError) – If some parameter value is invalid.
- snapshot_download(repo_id: str, *, repo_type: str | None = None, revision: str | None = None, cache_dir: str | Path | None = None, local_dir: str | Path | None = None, proxies: Dict | None = None, etag_timeout: float = 10, force_download: bool = False, token: bool | str | None = None, local_files_only: bool = False, allow_patterns: str | List[str] | None = None, ignore_patterns: str | List[str] | None = None, max_workers: int = 8, tqdm_class: Type[tqdm_asyncio] | None = None, local_dir_use_symlinks: bool | Literal['auto'] = 'auto', resume_download: bool | None = None) str [source][source]¶
Download repo files.
Download a whole snapshot of a repo’s files at the specified revision. This is useful when you want all files from a repo, because you don’t know which ones you will need a priori. All files are nested inside a folder in order to keep their actual filename relative to that folder. You can also filter which files to download using allow_patterns and ignore_patterns.
If local_dir is provided, the file structure from the repo will be replicated in this location. When using this option, the cache_dir will not be used and a .cache/huggingface/ folder will be created at the root of local_dir to store some metadata related to the downloaded files. While this mechanism is not as robust as the main cache system, it’s optimized for regularly pulling the latest version of a repository.
An alternative would be to clone the repo but this requires git and git-lfs to be installed and properly configured. It is also not possible to filter which files to download when cloning a repository using git.
- Parameters:
repo_id (str) – A user or an organization name and a repo name separated by a /.
repo_type (str, optional) – Set to "dataset" or "space" if downloading from a dataset or space, None or "model" if downloading from a model. Default is None.
revision (str, optional) – An optional Git revision id which can be a branch name, a tag, or a commit hash.
cache_dir (str, Path, optional) – Path to the folder where cached files are stored.
local_dir (str or Path, optional) – If provided, the downloaded files will be placed under this directory.
proxies (dict, optional) – Dictionary mapping protocol to the URL of the proxy passed to requests.request.
etag_timeout (float, optional, defaults to 10) – When fetching ETag, how many seconds to wait for the server to send data before giving up. This value is passed to requests.request.
force_download (bool, optional, defaults to False) – Whether the file should be downloaded even if it already exists in the local cache.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
local_files_only (bool, optional, defaults to False) – If True, avoid downloading the file and return the path to the local cached file if it exists.
allow_patterns (List[str] or str, optional) – If provided, only files matching at least one pattern are downloaded.
ignore_patterns (List[str] or str, optional) – If provided, files matching any of the patterns are not downloaded.
max_workers (int, optional) – Number of concurrent threads to download files (1 thread = 1 file download). Defaults to 8.
tqdm_class (tqdm, optional) – If provided, overwrites the default behavior for the progress bar. Passed argument must inherit from tqdm.auto.tqdm or at least mimic its behavior. Note that the tqdm_class is not passed to each individual download. Defaults to the custom HF progress bar that can be disabled by setting the HF_HUB_DISABLE_PROGRESS_BARS environment variable.
- Returns:
folder path of the repo snapshot.
- Return type:
str
- Raises:
RepositoryNotFoundError – If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
RevisionNotFoundError – If the revision to download from cannot be found.
EnvironmentError (https://docs.python.org/3/library/exceptions.html#EnvironmentError) – If token=True and the token cannot be found.
OSError (https://docs.python.org/3/library/exceptions.html#OSError) – If ETag cannot be determined.
ValueError (https://docs.python.org/3/library/exceptions.html#ValueError) – If some parameter value is invalid.
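The "1 thread = 1 file download" model behind max_workers can be sketched with a thread pool. This is an illustration of the concurrency pattern only, not the library's implementation; download_all and the fetch_one callable are hypothetical, and the real per-file download is replaced by whatever function the caller supplies.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def download_all(
    filenames: List[str],
    fetch_one: Callable[[str], str],
    max_workers: int = 8,
) -> Dict[str, str]:
    """Fetch every file with at most `max_workers` concurrent threads.

    `fetch_one` is a caller-supplied function that downloads a single file
    and returns its local path.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the order of `filenames`.
        return dict(zip(filenames, pool.map(fetch_one, filenames)))
```

Because each thread handles one file at a time, raising max_workers mostly helps when a repo contains many small files.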
- get_safetensors_metadata(repo_id: str, *, repo_type: str | None = None, revision: str | None = None, token: str | bool | None = None) SafetensorsRepoMetadata [source][source]¶
Parse metadata for a safetensors repo on the Hub.
We first check if the repo has a single safetensors file or a sharded safetensors repo. If it’s a single safetensors file, we parse the metadata from this file. If it’s a sharded safetensors repo, we parse the metadata from the index file and then parse the metadata from each shard.
To parse metadata from a single safetensors file, use parse_safetensors_file_metadata.
For more details regarding the safetensors format, check out https://huggingface.co/docs/safetensors/index#format.
- Parameters:
repo_id (str) – A user or an organization name and a repo name separated by a /.
repo_type (str, optional) – Set to "dataset" or "space" if the file is in a dataset or space, None or "model" if in a model. Default is None.
revision (str, optional) – The git revision to fetch the file from. Can be a branch name, a tag, or a commit hash. Defaults to the head of the "main" branch.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
Information related to the safetensors repo.
- Return type:
SafetensorsRepoMetadata
- Raises:
NotASafetensorsRepoError – If the repo is not a safetensors repo, i.e. doesn’t have either a model.safetensors or a model.safetensors.index.json file.
SafetensorsParsingError – If a safetensors file header couldn’t be parsed correctly.
Example
# Parse repo with single weights file
>>> metadata = get_safetensors_metadata("bigscience/bloomz-560m")
>>> metadata
SafetensorsRepoMetadata(
    metadata=None,
    sharded=False,
    weight_map={'h.0.input_layernorm.bias': 'model.safetensors', ...},
    files_metadata={'model.safetensors': SafetensorsFileMetadata(...)}
)
>>> metadata.files_metadata["model.safetensors"].metadata
{'format': 'pt'}

# Parse repo with sharded model
>>> metadata = get_safetensors_metadata("bigscience/bloom")
Parse safetensors files: 100%|██████████████████████████████████████████| 72/72 [00:12<00:00, 5.78it/s]
>>> metadata
SafetensorsRepoMetadata(metadata={'total_size': 352494542848}, sharded=True, weight_map={...}, files_metadata={...})
>>> len(metadata.files_metadata)
72  # All safetensors files have been fetched

# Parse repo that is not a safetensors repo
>>> get_safetensors_metadata("runwayml/stable-diffusion-v1-5")
NotASafetensorsRepoError: 'runwayml/stable-diffusion-v1-5' is not a safetensors repo. Couldn't find 'model.safetensors.index.json' or 'model.safetensors' files.
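For a sharded repo, the index file maps each tensor name to the shard that stores it, which is how the list of shard files to parse is derived. The sketch below works on an in-memory index with the same "metadata" / "weight_map" shape as shown in the example output; the function name and the shard filenames are made up for illustration.

```python
import json
from typing import Dict, List


def tensors_by_shard(index_json: str) -> Dict[str, List[str]]:
    """Group tensor names by the shard file listed in the index's weight_map."""
    index = json.loads(index_json)
    grouped: Dict[str, List[str]] = {}
    for tensor_name, shard_file in index["weight_map"].items():
        grouped.setdefault(shard_file, []).append(tensor_name)
    return grouped


# Hypothetical index content, shaped like a model.safetensors.index.json file.
example_index = json.dumps(
    {
        "metadata": {"total_size": 48},
        "weight_map": {
            "h.0.weight": "model-00001-of-00002.safetensors",
            "h.0.bias": "model-00001-of-00002.safetensors",
            "h.1.weight": "model-00002-of-00002.safetensors",
        },
    }
)
```

The keys of the returned dictionary are the distinct shard files, each of which would then have its own header parsed.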
- parse_safetensors_file_metadata(repo_id: str, filename: str, *, repo_type: str | None = None, revision: str | None = None, token: str | bool | None = None) SafetensorsFileMetadata [source][source]¶
Parse metadata from a safetensors file on the Hub.
To parse metadata from all safetensors files in a repo at once, use get_safetensors_metadata.
For more details regarding the safetensors format, check out https://huggingface.co/docs/safetensors/index#format.
- Parameters:
repo_id (str) – A user or an organization name and a repo name separated by a /.
filename (str) – The name of the file in the repo.
repo_type (str, optional) – Set to "dataset" or "space" if the file is in a dataset or space, None or "model" if in a model. Default is None.
revision (str, optional) – The git revision to fetch the file from. Can be a branch name, a tag, or a commit hash. Defaults to the head of the "main" branch.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
Information related to a safetensors file.
- Return type:
SafetensorsFileMetadata
- Raises:
NotASafetensorsRepoError – If the repo is not a safetensors repo, i.e. doesn’t have either a model.safetensors or a model.safetensors.index.json file.
SafetensorsParsingError – If a safetensors file header couldn’t be parsed correctly.
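Parsing a safetensors header needs only the beginning of the file: the first 8 bytes hold the header size as an unsigned little-endian 64-bit integer, followed by that many bytes of JSON. The sketch below parses such a header from bytes already in memory; it is a local illustration, not the library's code, and it leaves out how the bytes are fetched from the Hub. The helper names are hypothetical.

```python
import json
import struct
from typing import Any, Dict


def build_safetensors_prefix(header: Dict[str, Any]) -> bytes:
    """Serialize a header the way it sits at the start of a safetensors file."""
    payload = json.dumps(header).encode("utf-8")
    return struct.pack("<Q", len(payload)) + payload


def parse_safetensors_header(data: bytes) -> Dict[str, Any]:
    """Read the 8-byte length prefix, then decode the JSON header that follows."""
    if len(data) < 8:
        raise ValueError("need at least 8 bytes to read the header size")
    (header_size,) = struct.unpack("<Q", data[:8])
    if len(data) < 8 + header_size:
        raise ValueError("data is shorter than the announced header size")
    return json.loads(data[8 : 8 + header_size].decode("utf-8"))
```

The optional "__metadata__" entry of the decoded header is where free-form values such as {'format': 'pt'} live; every other key describes one tensor.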
- create_branch(repo_id: str, *, branch: str, revision: str | None = None, token: bool | str | None = None, repo_type: str | None = None, exist_ok: bool = False) None [source][source]¶
Create a new branch for a repo on the Hub, starting from the specified revision (defaults to main). To find a revision suiting your needs, you can use list_repo_refs or list_repo_commits.
- Parameters:
repo_id (str) – The repository in which the branch will be created. Example: "user/my-cool-model".
branch (str) – The name of the branch to create.
revision (str, optional) – The git revision to create the branch from. It can be a branch name or the OID/SHA of a commit, as a hexadecimal string. Defaults to the head of the "main" branch.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – Set to "dataset" or "space" if creating a branch on a dataset or space, None or "model" if creating a branch on a model. Default is None.
exist_ok (bool, optional, defaults to False) – If True, do not raise an error if branch already exists.
- Raises:
RepositoryNotFoundError – If repository is not found (error 404): wrong repo_id/repo_type, private but not authenticated or repo does not exist.
BadRequestError – If invalid reference for a branch. Ex: refs/pr/5 or refs/foo/bar.
HfHubHTTPError – If the branch already exists on the repo (error 409) and exist_ok is set to False.
- delete_branch(repo_id: str, *, branch: str, token: bool | str | None = None, repo_type: str | None = None) None [source][source]¶
Delete a branch from a repo on the Hub.
- Parameters:
repo_id (str) – The repository in which a branch will be deleted. Example: "user/my-cool-model".
branch (str) – The name of the branch to delete.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – Set to "dataset" or "space" if deleting a branch on a dataset or space, None or "model" if deleting a branch on a model. Default is None.
- Raises:
RepositoryNotFoundError – If repository is not found (error 404): wrong repo_id/repo_type, private but not authenticated or repo does not exist.
HfHubHTTPError – If trying to delete a protected branch. Ex: main cannot be deleted.
HfHubHTTPError – If trying to delete a branch that does not exist.
- create_tag(repo_id: str, *, tag: str, tag_message: str | None = None, revision: str | None = None, token: bool | str | None = None, repo_type: str | None = None, exist_ok: bool = False) None [source][source]¶
Tag a given commit of a repo on the Hub.
- Parameters:
repo_id (str) – The repository in which a commit will be tagged. Example: "user/my-cool-model".
tag (str) – The name of the tag to create.
tag_message (str, optional) – The description of the tag to create.
revision (str, optional) – The git revision to tag. It can be a branch name or the OID/SHA of a commit, as a hexadecimal string. Shorthands (first 7 characters) are also supported. Defaults to the head of the "main" branch.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – Set to "dataset" or "space" if tagging a dataset or space, None or "model" if tagging a model. Default is None.
exist_ok (bool, optional, defaults to False) – If True, do not raise an error if tag already exists.
- Raises:
RepositoryNotFoundError – If repository is not found (error 404): wrong repo_id/repo_type, private but not authenticated or repo does not exist.
RevisionNotFoundError – If revision is not found (error 404) on the repo.
HfHubHTTPError – If the tag already exists on the repo (error 409) and exist_ok is set to False.
- delete_tag(repo_id: str, *, tag: str, token: bool | str | None = None, repo_type: str | None = None) None [source][source]¶
Delete a tag from a repo on the Hub.
- Parameters:
repo_id (str) – The repository in which a tag will be deleted. Example: "user/my-cool-model".
tag (str) – The name of the tag to delete.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – Set to "dataset" or "space" if deleting a tag on a dataset or space, None or "model" if deleting a tag on a model. Default is None.
- Raises:
RepositoryNotFoundError – If repository is not found (error 404): wrong repo_id/repo_type, private but not authenticated or repo does not exist.
RevisionNotFoundError – If tag is not found.
- get_full_repo_name(model_id: str, *, organization: str | None = None, token: bool | str | None = None)[source][source]¶
Returns the repository name for a given model ID and optional organization.
- Parameters:
model_id (str) – The name of the model.
organization (str, optional) – If passed, the repository name will be in the organization namespace instead of the user namespace.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
The repository name in the user’s namespace ({username}/{model_id}) if no organization is passed, and under the organization namespace ({organization}/{model_id}) otherwise.
- Return type:
str
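The naming rule in the Returns description can be sketched as a pure function. This is only an illustration of the rule: the documented method resolves the user name from the access token, whereas the hypothetical helper below takes username as an explicit argument.

```python
from typing import Optional


def full_repo_name(model_id: str, username: str, organization: Optional[str] = None) -> str:
    """Return '{organization}/{model_id}' if an organization is given, else '{username}/{model_id}'."""
    namespace = organization if organization is not None else username
    return f"{namespace}/{model_id}"
```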
- get_repo_discussions(repo_id: str, *, author: str | None = None, discussion_type: Literal['all', 'discussion', 'pull_request'] | None = None, discussion_status: Literal['all', 'open', 'closed'] | None = None, repo_type: str | None = None, token: bool | str | None = None) Iterator[Discussion] [source][source]¶
Fetches Discussions and Pull Requests for the given repo.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
author (str, optional) – Pass a value to filter by discussion author. None means no filter. Default is None.
discussion_type (str, optional) – Set to "pull_request" to fetch only pull requests, "discussion" to fetch only discussions. Set to "all" or None to fetch both. Default is None.
discussion_status (str, optional) – Set to "open" (respectively "closed") to fetch only open (respectively closed) discussions. Set to "all" or None to fetch both. Default is None.
repo_type (str, optional) – Set to "dataset" or "space" if fetching from a dataset or space, None or "model" if fetching from a model. Default is None.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
An iterator of Discussion objects.
- Return type:
Iterator[Discussion]
Example
Collecting all discussions of a repo in a list:
>>> from huggingface_hub import get_repo_discussions
>>> discussions_list = list(get_repo_discussions(repo_id="bert-base-uncased"))
Iterating over discussions of a repo:
>>> from huggingface_hub import get_repo_discussions
>>> for discussion in get_repo_discussions(repo_id="bert-base-uncased"):
...     print(discussion.num, discussion.title)
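The filter semantics of author, discussion_type and discussion_status (where "all" and None both mean "no filter") can be illustrated with a local filter over in-memory records. This is a sketch under assumptions: DiscussionRecord is a hypothetical stand-in rather than the library's Discussion class, only the "open" and "closed" statuses are modeled, and the actual filtering may be done differently by the Hub.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class DiscussionRecord:
    """Hypothetical stand-in for a discussion entry (not the library's class)."""
    num: int
    title: str
    author: str
    is_pull_request: bool
    status: str  # "open" or "closed" in this sketch


def filter_discussions(
    items: Iterable[DiscussionRecord],
    *,
    author: Optional[str] = None,
    discussion_type: Optional[str] = None,
    discussion_status: Optional[str] = None,
) -> Iterator[DiscussionRecord]:
    """Yield the records that pass every filter; 'all' and None disable a filter."""
    for item in items:
        if author is not None and item.author != author:
            continue
        if discussion_type == "pull_request" and not item.is_pull_request:
            continue
        if discussion_type == "discussion" and item.is_pull_request:
            continue
        if discussion_status in ("open", "closed") and item.status != discussion_status:
            continue
        yield item
```

Like the documented method, the sketch returns an iterator, so callers wrap it in list() when they need all results at once.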
- get_discussion_details(repo_id: str, discussion_num: int, *, repo_type: str | None = None, token: bool | str | None = None) DiscussionWithDetails [source][source]¶
Fetches the details of a Discussion or Pull Request from the Hub.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
discussion_num (int) – The number of the Discussion or Pull Request. Must be a strictly positive integer.
repo_type (str, optional) – Set to "dataset" or "space" if the repo is a dataset or space, None or "model" if it is a model. Default is None.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
DiscussionWithDetails
- Raises:
HTTPError (https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) – If the HuggingFace API returned an error.
ValueError (https://docs.python.org/3/library/exceptions.html#ValueError) – If some parameter value is invalid.
RepositoryNotFoundError – If the repository cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
- create_discussion(repo_id: str, title: str, *, token: bool | str | None = None, description: str | None = None, repo_type: str | None = None, pull_request: bool = False) DiscussionWithDetails [source][source]¶
Creates a Discussion or Pull Request.
Pull Requests created programmatically will be in "draft" status.
Creating a Pull Request with changes can also be done at once with HfApi.create_commit.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
title (str) – The title of the discussion. It can be up to 200 characters long, and must be at least 3 characters long. Leading and trailing whitespaces will be stripped.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
description (str, optional) – An optional description for the Pull Request. Defaults to "Discussion opened with the huggingface_hub Python library".
pull_request (bool, optional) – Whether to create a Pull Request or discussion. If True, creates a Pull Request. If False, creates a discussion. Defaults to False.
repo_type (str, optional) – Set to "dataset" or "space" if the repo is a dataset or space, None or "model" if it is a model. Default is None.
- Returns:
DiscussionWithDetails
- Raises:
HTTPError (https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) – If the HuggingFace API returned an error.
ValueError (https://docs.python.org/3/library/exceptions.html#ValueError) – If some parameter value is invalid.
RepositoryNotFoundError – If the repository cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
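The constraints on title (whitespace stripped, between 3 and 200 characters) can be checked before any request is sent. The helper below is a hypothetical client-side pre-check written for illustration; whether and where the library or the Hub enforces these limits is not stated in this section.

```python
def normalize_title(title: str) -> str:
    """Strip surrounding whitespace and check the documented length limits."""
    cleaned = title.strip()
    if not 3 <= len(cleaned) <= 200:
        raise ValueError(
            f"title must be 3 to 200 characters long after stripping, got {len(cleaned)}"
        )
    return cleaned
```

Checking the length after stripping matters: a title made of two letters padded with spaces would otherwise look long enough.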
- create_pull_request(repo_id: str, title: str, *, token: bool | str | None = None, description: str | None = None, repo_type: str | None = None) DiscussionWithDetails [source][source]¶
Creates a Pull Request. Pull Requests created programmatically will be in "draft" status.
Creating a Pull Request with changes can also be done at once with [HfApi.create_commit].
This is a wrapper around [HfApi.create_discussion].
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
title (str) – The title of the discussion. It can be up to 200 characters long, and must be at least 3 characters long. Leading and trailing whitespace will be stripped.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
description (str, optional) – An optional description for the Pull Request. Defaults to "Discussion opened with the huggingface_hub Python library".
repo_type (str, optional) – Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.
- Returns:
[DiscussionWithDetails]
<Tip>
Raises the following errors:
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) if the HuggingFace API returned an error
[ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) if some parameter value is invalid
[~utils.RepositoryNotFoundError] If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
</Tip>
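The title constraints above (3 to 200 characters, surrounding whitespace stripped) can be checked before any request is sent. The following is a minimal sketch, not part of the library: `open_draft_pull_request` is a hypothetical helper, and `api` is assumed to be a `huggingface_hub.HfApi` instance.

```python
def open_draft_pull_request(api, repo_id, title, description=None, repo_type=None):
    """Validate the title locally, then open a draft Pull Request.

    `api` is assumed to be a `huggingface_hub.HfApi` instance (hypothetical helper).
    """
    cleaned = title.strip()  # the docs state that surrounding whitespace is stripped
    if not 3 <= len(cleaned) <= 200:
        raise ValueError(f"title must be 3 to 200 characters, got {len(cleaned)}")
    return api.create_pull_request(
        repo_id=repo_id,
        title=cleaned,
        description=description,
        repo_type=repo_type,
    )
```

With the real client this would be called as `open_draft_pull_request(HfApi(), "username/repo_name", "Fix typo")`. Failing early on a bad title avoids a round trip that would presumably end in a `ValueError` or an HTTP error anyway.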
- comment_discussion(repo_id: str, discussion_num: int, comment: str, *, token: bool | str | None = None, repo_type: str | None = None) DiscussionComment [source][source]¶
Creates a new comment on the given Discussion.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
discussion_num (int) – The number of the Discussion or Pull Request. Must be a strictly positive integer.
comment (str) – The content of the comment to create. Comments support markdown formatting.
repo_type (str, optional) – Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
the newly created comment
- Return type:
[DiscussionComment]
Examples
>>> comment = """
... Hello @otheruser!
...
... # This is a title
...
... **This is bold**, *this is italic* and ~this is strikethrough~
... And [this](http://url) is a link
... """
>>> HfApi().comment_discussion(
...     repo_id="username/repo_name",
...     discussion_num=34,
...     comment=comment
... )
# DiscussionComment(id='deadbeef0000000', type='comment', ...)
<Tip>
Raises the following errors:
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) if the HuggingFace API returned an error
[ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) if some parameter value is invalid
[~utils.RepositoryNotFoundError] If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
</Tip>
- rename_discussion(repo_id: str, discussion_num: int, new_title: str, *, token: bool | str | None = None, repo_type: str | None = None) DiscussionTitleChange [source][source]¶
Renames a Discussion.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
discussion_num (int) – The number of the Discussion or Pull Request. Must be a strictly positive integer.
new_title (str) – The new title for the discussion.
repo_type (str, optional) – Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
the title change event
- Return type:
[DiscussionTitleChange]
Examples
>>> new_title = "New title, fixing a typo"
>>> HfApi().rename_discussion(
...     repo_id="username/repo_name",
...     discussion_num=34,
...     new_title=new_title
... )
# DiscussionTitleChange(id='deadbeef0000000', type='title-change', ...)
<Tip>
Raises the following errors:
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) if the HuggingFace API returned an error
[ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) if some parameter value is invalid
[~utils.RepositoryNotFoundError] If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
</Tip>
- change_discussion_status(repo_id: str, discussion_num: int, new_status: Literal['open', 'closed'], *, token: bool | str | None = None, comment: str | None = None, repo_type: str | None = None) DiscussionStatusChange [source][source]¶
Closes or re-opens a Discussion or Pull Request.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
discussion_num (int) – The number of the Discussion or Pull Request. Must be a strictly positive integer.
new_status (str) – The new status for the discussion, either "open" or "closed".
comment (str, optional) – An optional comment to post with the status change.
repo_type (str, optional) – Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
the status change event
- Return type:
[DiscussionStatusChange]
Examples
>>> HfApi().change_discussion_status(
...     repo_id="username/repo_name",
...     discussion_num=34,
...     new_status="closed"
... )
# DiscussionStatusChange(id='deadbeef0000000', type='status-change', ...)
<Tip>
Raises the following errors:
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) if the HuggingFace API returned an error
[ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) if some parameter value is invalid
[~utils.RepositoryNotFoundError] If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
</Tip>
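Because `new_status` only accepts "open" or "closed", a thin wrapper taking a boolean can keep call sites free of string typos. This is a sketch under assumptions: `set_discussion_state` is a hypothetical name, and `api` is assumed to be a `huggingface_hub.HfApi` instance.

```python
def set_discussion_state(api, repo_id, discussion_num, is_open, note=None):
    """Open or close a Discussion / Pull Request, optionally leaving a comment.

    `api` is assumed to be a `huggingface_hub.HfApi` instance (hypothetical helper).
    """
    if discussion_num <= 0:
        raise ValueError("discussion_num must be a strictly positive integer")
    return api.change_discussion_status(
        repo_id=repo_id,
        discussion_num=discussion_num,
        new_status="open" if is_open else "closed",
        comment=note,  # posted together with the status change
    )
```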
- merge_pull_request(repo_id: str, discussion_num: int, *, token: bool | str | None = None, comment: str | None = None, repo_type: str | None = None)[source][source]¶
Merges a Pull Request.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
discussion_num (int) – The number of the Discussion or Pull Request. Must be a strictly positive integer.
comment (str, optional) – An optional comment to post with the status change.
repo_type (str, optional) – Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
the status change event
- Return type:
[DiscussionStatusChange]
<Tip>
Raises the following errors:
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) if the HuggingFace API returned an error
[ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) if some parameter value is invalid
[~utils.RepositoryNotFoundError] If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
</Tip>
- edit_discussion_comment(repo_id: str, discussion_num: int, comment_id: str, new_content: str, *, token: bool | str | None = None, repo_type: str | None = None) DiscussionComment [source][source]¶
Edits a comment on a Discussion / Pull Request.
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
discussion_num (int) – The number of the Discussion or Pull Request. Must be a strictly positive integer.
comment_id (str) – The ID of the comment to edit.
new_content (str) – The new content of the comment. Comments support markdown formatting.
repo_type (str, optional) – Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
the edited comment
- Return type:
[DiscussionComment]
<Tip>
Raises the following errors:
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) if the HuggingFace API returned an error
[ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) if some parameter value is invalid
[~utils.RepositoryNotFoundError] If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
</Tip>
- hide_discussion_comment(repo_id: str, discussion_num: int, comment_id: str, *, token: bool | str | None = None, repo_type: str | None = None) DiscussionComment [source][source]¶
Hides a comment on a Discussion / Pull Request.
<Tip warning={true}> Hidden comments’ content cannot be retrieved anymore. Hiding a comment is irreversible. </Tip>
- Parameters:
repo_id (str) – A namespace (user or an organization) and a repo name separated by a /.
discussion_num (int) – The number of the Discussion or Pull Request. Must be a strictly positive integer.
comment_id (str) – The ID of the comment to hide.
repo_type (str, optional) – Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
the hidden comment
- Return type:
[DiscussionComment]
<Tip>
Raises the following errors:
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) if the HuggingFace API returned an error
[ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) if some parameter value is invalid
[~utils.RepositoryNotFoundError] If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
</Tip>
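Since hiding a comment is irreversible, callers may want an explicit opt-in before the request is sent. The guard below is a sketch of that idea only: `hide_comment` and its `confirm` flag are hypothetical, and `api` is assumed to be a `huggingface_hub.HfApi` instance.

```python
def hide_comment(api, repo_id, discussion_num, comment_id, confirm=False):
    """Hide a comment only when the caller explicitly confirms.

    `api` is assumed to be a `huggingface_hub.HfApi` instance (hypothetical helper).
    """
    if not confirm:
        # Hidden content cannot be retrieved afterwards, so refuse by default.
        raise RuntimeError("hiding a comment is irreversible; pass confirm=True")
    return api.hide_discussion_comment(
        repo_id=repo_id,
        discussion_num=discussion_num,
        comment_id=comment_id,
    )
```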
- add_space_secret(repo_id: str, key: str, value: str, *, description: str | None = None, token: bool | str | None = None) None [source][source]¶
Adds or updates a secret in a Space.
Secrets let you set secret keys or tokens on a Space without hardcoding them. For more details, see https://huggingface.co/docs/hub/spaces-overview#managing-secrets.
- Parameters:
repo_id (str) – ID of the repo to update. Example: "bigcode/in-the-stack".
key (str) – Secret key. Example: "GITHUB_API_KEY".
value (str) – Secret value. Example: "your_github_api_key".
description (str, optional) – Secret description. Example: "Github API key to access the Github API".
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
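One way to avoid hardcoding secret values in scripts is to read them from the local environment and forward them. The sketch below assumes `api` is a `huggingface_hub.HfApi` instance; `push_secrets_from_env` and its `environ` parameter are hypothetical names introduced for illustration.

```python
import os


def push_secrets_from_env(api, repo_id, keys, environ=None):
    """Copy selected environment variables into Space secrets.

    Returns the keys that were not set locally and were therefore skipped.
    `api` is assumed to be a `huggingface_hub.HfApi` instance (hypothetical helper).
    """
    environ = os.environ if environ is None else environ
    missing = []
    for key in keys:
        value = environ.get(key)
        if value is None:
            missing.append(key)  # nothing to upload for this key
            continue
        api.add_space_secret(repo_id=repo_id, key=key, value=value)
    return missing
```

Passing `environ` explicitly keeps the helper easy to test; in normal use it falls back to `os.environ`.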
- delete_space_secret(repo_id: str, key: str, *, token: bool | str | None = None) None [source][source]¶
Deletes a secret from a Space.
Secrets let you set secret keys or tokens on a Space without hardcoding them. For more details, see https://huggingface.co/docs/hub/spaces-overview#managing-secrets.
- Parameters:
repo_id (str) – ID of the repo to update. Example: "bigcode/in-the-stack".
key (str) – Secret key. Example: "GITHUB_API_KEY".
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- get_space_variables(repo_id: str, *, token: bool | str | None = None) Dict[str, SpaceVariable] [source][source]¶
Gets all variables from a Space.
Variables let you set environment variables on a Space without hardcoding them. For more details, see https://huggingface.co/docs/hub/spaces-overview#managing-secrets-and-environment-variables
- Parameters:
repo_id (str) – ID of the repo to query. Example: "bigcode/in-the-stack".
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- add_space_variable(repo_id: str, key: str, value: str, *, description: str | None = None, token: bool | str | None = None) Dict[str, SpaceVariable] [source][source]¶
Adds or updates a variable in a Space.
Variables let you set environment variables on a Space without hardcoding them. For more details, see https://huggingface.co/docs/hub/spaces-overview#managing-secrets-and-environment-variables
- Parameters:
repo_id (str) – ID of the repo to update. Example: "bigcode/in-the-stack".
key (str) – Variable key. Example: "MODEL_REPO_ID".
value (str) – Variable value. Example: "the_model_repo_id".
description (str, optional) – Description of the variable. Example: "Model Repo ID of the implemented model".
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- delete_space_variable(repo_id: str, key: str, *, token: bool | str | None = None) Dict[str, SpaceVariable] [source][source]¶
Deletes a variable from a Space.
Variables let you set environment variables on a Space without hardcoding them. For more details, see https://huggingface.co/docs/hub/spaces-overview#managing-secrets-and-environment-variables
- Parameters:
repo_id (str) – ID of the repo to update. Example: "bigcode/in-the-stack".
key (str) – Variable key. Example: "MODEL_REPO_ID".
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
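The three variable methods (`get_space_variables`, `add_space_variable`, `delete_space_variable`) can be combined to bring a Space in line with a desired set of variables. The following is a sketch, assuming `api` is a `huggingface_hub.HfApi` instance and that each returned `SpaceVariable` exposes a `value` attribute; `sync_space_variables` is a hypothetical helper.

```python
def sync_space_variables(api, repo_id, desired):
    """Make the Space's variables match `desired` (a dict of key -> value).

    `api` is assumed to be a `huggingface_hub.HfApi` instance (hypothetical helper).
    """
    current = api.get_space_variables(repo_id=repo_id)
    # Add new keys and update keys whose value differs.
    for key, value in desired.items():
        existing = current.get(key)
        if existing is None or existing.value != value:
            api.add_space_variable(repo_id=repo_id, key=key, value=value)
    # Remove keys that are no longer wanted.
    for key in current:
        if key not in desired:
            api.delete_space_variable(repo_id=repo_id, key=key)
    return api.get_space_variables(repo_id=repo_id)
```

Unchanged keys are skipped, which keeps the number of requests proportional to the number of actual differences.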
- get_space_runtime(repo_id: str, *, token: bool | str | None = None) SpaceRuntime [source][source]¶
Gets runtime information about a Space.
- Parameters:
repo_id (str) – ID of the repo to query. Example: "bigcode/in-the-stack".
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
Runtime information about a Space including Space stage and hardware.
- Return type:
[SpaceRuntime]
- request_space_hardware(repo_id: str, hardware: SpaceHardware, *, token: bool | str | None = None, sleep_time: int | None = None) SpaceRuntime [source][source]¶
Request new hardware for a Space.
- Parameters:
repo_id (str) – ID of the repo to update. Example: "bigcode/in-the-stack".
hardware (str or [SpaceHardware]) – Hardware on which to run the Space. Example: "t4-medium".
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
sleep_time (int, optional) – Number of seconds of inactivity to wait before a Space is put to sleep. Set to -1 if you don’t want your Space to sleep (default behavior for upgraded hardware). For free hardware, you can’t configure the sleep time (value is fixed to 48 hours of inactivity). See https://huggingface.co/docs/hub/spaces-gpus#sleep-time for more details.
- Returns:
Runtime information about a Space including Space stage and hardware.
- Return type:
[SpaceRuntime]
<Tip>
It is also possible to request hardware directly when creating the Space repo! See [create_repo] for details.
</Tip>
- set_space_sleep_time(repo_id: str, sleep_time: int, *, token: bool | str | None = None) SpaceRuntime [source][source]¶
Set a custom sleep time for a Space running on upgraded hardware.
Your Space will go to sleep after sleep_time seconds of inactivity. You are not billed when your Space is in “sleep” mode. If a new visitor lands on your Space, it will “wake it up”. Only upgraded hardware can have a configurable sleep time. To know more about the sleep stage, please refer to https://huggingface.co/docs/hub/spaces-gpus#sleep-time.
- Parameters:
repo_id (str) – ID of the repo to update. Example: "bigcode/in-the-stack".
sleep_time (int) – Number of seconds of inactivity to wait before a Space is put to sleep. Set to -1 if you don’t want your Space to sleep (default behavior for upgraded hardware). For free hardware, you can’t configure the sleep time (value is fixed to 48 hours of inactivity). See https://huggingface.co/docs/hub/spaces-gpus#sleep-time for more details.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
Runtime information about a Space including Space stage and hardware.
- Return type:
[SpaceRuntime]
<Tip>
It is also possible to set a custom sleep time when requesting hardware with [request_space_hardware].
</Tip>
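`sleep_time` is expressed in seconds, with -1 meaning "never sleep", which is easy to get wrong when thinking in hours. A small conversion wrapper might look like the sketch below; `configure_sleep` is a hypothetical helper and `api` is assumed to be a `huggingface_hub.HfApi` instance.

```python
def configure_sleep(api, repo_id, idle_hours=None):
    """Set a Space's sleep time from a value in hours.

    `idle_hours=None` maps to -1 (never sleep).
    `api` is assumed to be a `huggingface_hub.HfApi` instance (hypothetical helper).
    """
    if idle_hours is None:
        sleep_time = -1
    else:
        sleep_time = int(idle_hours * 3600)  # hours -> seconds
        if sleep_time <= 0:
            raise ValueError("idle_hours must be positive, or None to disable sleeping")
    return api.set_space_sleep_time(repo_id=repo_id, sleep_time=sleep_time)
```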
- pause_space(repo_id: str, *, token: bool | str | None = None) SpaceRuntime [source][source]¶
Pause your Space.
A paused Space stops executing until manually restarted by its owner. This is different from the sleeping state that free Spaces enter after 48 hours of inactivity. Paused time is not billed to your account, no matter the hardware you’ve selected. To restart your Space, use [restart_space] or go to your Space settings page.
For more details, please visit [the docs](https://huggingface.co/docs/hub/spaces-gpus#pause).
- Parameters:
repo_id (str) – ID of the Space to pause. Example: "Salesforce/BLIP2".
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
Runtime information about your Space including stage=PAUSED and requested hardware.
- Return type:
[SpaceRuntime]
- Raises:
[RepositoryNotFoundError] – If your Space is not found (error 404). Most probably wrong repo_id or your space is private but you are not authenticated.
[HfHubHTTPError] – 403 Forbidden: only the owner of a Space can pause it. If you want to manage a Space that you don’t own, either ask the owner by opening a Discussion or duplicate the Space.
[BadRequestError] – If your Space is a static Space. Static Spaces are always running and never billed. If you want to hide a static Space, you can set it to private.
- restart_space(repo_id: str, *, token: bool | str | None = None, factory_reboot: bool = False) SpaceRuntime [source][source]¶
Restart your Space.
This is the only way to programmatically restart a Space if you’ve put it on Pause (see [pause_space]). You must be the owner of the Space to restart it. If you are using upgraded hardware, your account will be billed as soon as the Space is restarted. You can trigger a restart no matter the current state of a Space.
For more details, please visit [the docs](https://huggingface.co/docs/hub/spaces-gpus#pause).
- Parameters:
repo_id (str) – ID of the Space to restart. Example: "Salesforce/BLIP2".
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
factory_reboot (bool, optional) – If True, the Space will be rebuilt from scratch without caching any requirements.
- Returns:
Runtime information about your Space.
- Return type:
[SpaceRuntime]
- Raises:
[RepositoryNotFoundError] – If your Space is not found (error 404). Most probably wrong repo_id or your space is private but you are not authenticated.
[HfHubHTTPError] – 403 Forbidden: only the owner of a Space can restart it. If you want to restart a Space that you don’t own, either ask the owner by opening a Discussion or duplicate the Space.
[BadRequestError] – If your Space is a static Space. Static Spaces are always running and never billed. If you want to hide a static Space, you can set it to private.
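Restarting a Space on upgraded hardware starts billing, so it can be useful to restart only when the Space is actually paused. The sketch below assumes `api` is a `huggingface_hub.HfApi` instance and that the `stage` attribute of the returned runtime compares equal to the string "PAUSED" when paused; `restart_if_paused` is a hypothetical helper.

```python
def restart_if_paused(api, repo_id, factory_reboot=False):
    """Restart a Space only when its current stage is PAUSED.

    Returns the runtime after the restart, or the unchanged runtime otherwise.
    `api` is assumed to be a `huggingface_hub.HfApi` instance (hypothetical helper).
    """
    runtime = api.get_space_runtime(repo_id=repo_id)
    if runtime.stage == "PAUSED":
        return api.restart_space(repo_id=repo_id, factory_reboot=factory_reboot)
    return runtime  # already running, building, sleeping, etc.
```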
- duplicate_space(from_id: str, to_id: str | None = None, *, private: bool | None = None, token: bool | str | None = None, exist_ok: bool = False, hardware: SpaceHardware | None = None, storage: SpaceStorage | None = None, sleep_time: int | None = None, secrets: List[Dict[str, str]] | None = None, variables: List[Dict[str, str]] | None = None) RepoUrl [source][source]¶
Duplicate a Space.
Programmatically duplicate a Space. The new Space will be created in your account and will be in the same state as the original Space (running or paused). You can duplicate a Space no matter the current state of a Space.
- Parameters:
from_id (str) – ID of the Space to duplicate. Example: "pharma/CLIP-Interrogator".
to_id (str, optional) – ID of the new Space. Example: "dog/CLIP-Interrogator". If not provided, the new Space will have the same name as the original Space, but in your account.
private (bool, optional) – Whether the new Space should be private or not. Defaults to the same privacy as the original Space.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
exist_ok (bool, optional, defaults to False) – If True, do not raise an error if repo already exists.
hardware (SpaceHardware or str, optional) – Choice of Hardware. Example: "t4-medium". See [SpaceHardware] for a complete list.
storage (SpaceStorage or str, optional) – Choice of persistent storage tier. Example: "small". See [SpaceStorage] for a complete list.
sleep_time (int, optional) – Number of seconds of inactivity to wait before a Space is put to sleep. Set to -1 if you don’t want your Space to sleep (default behavior for upgraded hardware). For free hardware, you can’t configure the sleep time (value is fixed to 48 hours of inactivity). See https://huggingface.co/docs/hub/spaces-gpus#sleep-time for more details.
secrets (List[Dict[str, str]], optional) – A list of secret keys to set in your Space. Each item is in the form {"key": ..., "value": ..., "description": ...} where description is optional. For more details, see https://huggingface.co/docs/hub/spaces-overview#managing-secrets.
variables (List[Dict[str, str]], optional) – A list of public environment variables to set in your Space. Each item is in the form {"key": ..., "value": ..., "description": ...} where description is optional. For more details, see https://huggingface.co/docs/hub/spaces-overview#managing-secrets-and-environment-variables.
- Returns:
URL to the newly created repo. Value is a subclass of str containing attributes like endpoint, repo_type and repo_id.
- Return type:
[RepoUrl]
- Raises:
[RepositoryNotFoundError] – If one of from_id or to_id cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError) – If the HuggingFace API returned an error
Example:
>>> from huggingface_hub import duplicate_space
# Duplicate a Space to your account
>>> duplicate_space("multimodalart/dreambooth-training")
RepoUrl('https://huggingface.co/spaces/nateraw/dreambooth-training',...)
# Can set custom destination id and visibility flag.
>>> duplicate_space("multimodalart/dreambooth-training", to_id="my-dreambooth", private=True)
RepoUrl('https://huggingface.co/spaces/nateraw/my-dreambooth',...)
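The `secrets` and `variables` arguments expect a list of `{"key": ..., "value": ..., "description": ...}` items rather than a plain mapping. A small converter, sketched below, can build that list from a dict; `as_space_entries` is a hypothetical helper, not part of `huggingface_hub`.

```python
def as_space_entries(mapping, descriptions=None):
    """Convert {key: value} into the list-of-dicts form used by duplicate_space.

    `descriptions` optionally maps a key to its description; keys without one
    simply omit the "description" field, which the docs describe as optional.
    """
    descriptions = descriptions or {}
    entries = []
    for key, value in mapping.items():
        entry = {"key": key, "value": value}
        if key in descriptions:
            entry["description"] = descriptions[key]
        entries.append(entry)
    return entries
```

The result can then be passed along, for example `duplicate_space(from_id, variables=as_space_entries({"MODEL_REPO_ID": "the_model_repo_id"}))`.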
- request_space_storage(repo_id: str, storage: SpaceStorage, *, token: bool | str | None = None) SpaceRuntime [source][source]¶
Request persistent storage for a Space.
- Parameters:
repo_id (str) – ID of the Space to update. Example: "open-llm-leaderboard/open_llm_leaderboard".
storage (str or [SpaceStorage]) – Storage tier. Either ‘small’, ‘medium’, or ‘large’.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
Runtime information about a Space including Space stage and hardware.
- Return type:
[SpaceRuntime]
<Tip>
It is not possible to decrease persistent storage after it has been granted. To do so, you must delete it via [delete_space_storage].
</Tip>
- delete_space_storage(repo_id: str, *, token: bool | str | None = None) SpaceRuntime [source][source]¶
Delete persistent storage for a Space.
- Parameters:
repo_id (str) – ID of the Space to update. Example: "open-llm-leaderboard/open_llm_leaderboard".
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
Runtime information about a Space including Space stage and hardware.
- Return type:
[SpaceRuntime]
- Raises:
[BadRequestError] – If space has no persistent storage.
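Because a storage tier cannot be decreased in place, moving to a smaller tier means deleting the storage first and requesting it again. The sketch below makes that explicit; it assumes `api` is a `huggingface_hub.HfApi` instance, that the runtime's `storage` attribute compares equal to the tier name (or is None), and that deleting storage discards its contents. `resize_space_storage` and `TIERS` are hypothetical names.

```python
TIERS = ["small", "medium", "large"]  # ordered from smallest to largest


def resize_space_storage(api, repo_id, target, allow_delete=False):
    """Request the `target` storage tier, deleting first when downsizing.

    `api` is assumed to be a `huggingface_hub.HfApi` instance (hypothetical helper).
    """
    target_rank = TIERS.index(target)  # raises ValueError for unknown tiers
    current = api.get_space_runtime(repo_id=repo_id).storage
    if current is not None and target_rank < TIERS.index(current):
        if not allow_delete:
            # Assumption: deleting storage loses the stored data, so require opt-in.
            raise RuntimeError("downsizing requires deleting storage; pass allow_delete=True")
        api.delete_space_storage(repo_id=repo_id)
    return api.request_space_storage(repo_id=repo_id, storage=target)
```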
- list_inference_endpoints(namespace: str | None = None, *, token: str | bool | None = None) List[InferenceEndpoint] [source][source]¶
Lists all inference endpoints for the given namespace.
- Parameters:
namespace (str, optional) – The namespace to list endpoints for. Defaults to the current user. Set to "*" to list all endpoints from all namespaces (i.e. personal namespace and all orgs the user belongs to).
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
A list of all inference endpoints for the given namespace.
- Return type:
List[InferenceEndpoint]
Example:
>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> api.list_inference_endpoints()
[InferenceEndpoint(name='my-endpoint', ...), ...]
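When a namespace holds many endpoints, grouping them by status gives a quick overview. This is a sketch assuming `api` is a `huggingface_hub.HfApi` instance and that each returned endpoint exposes `name` and `status` attributes, as the example outputs in this section suggest; `endpoints_by_status` is a hypothetical helper.

```python
from collections import defaultdict


def endpoints_by_status(api, namespace=None):
    """Group Inference Endpoint names by their status.

    Pass namespace="*" to cover every namespace the user belongs to.
    `api` is assumed to be a `huggingface_hub.HfApi` instance (hypothetical helper).
    """
    grouped = defaultdict(list)
    for endpoint in api.list_inference_endpoints(namespace=namespace):
        grouped[endpoint.status].append(endpoint.name)
    return dict(grouped)
```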
- create_inference_endpoint(name: str, *, repository: str, framework: str, accelerator: str, instance_size: str, instance_type: str, region: str, vendor: str, account_id: str | None = None, min_replica: int = 1, max_replica: int = 1, scale_to_zero_timeout: int | None = None, revision: str | None = None, task: str | None = None, custom_image: Dict | None = None, env: Dict[str, str] | None = None, secrets: Dict[str, str] | None = None, type: InferenceEndpointType = InferenceEndpointType.PROTECTED, domain: str | None = None, path: str | None = None, cache_http_responses: bool | None = None, tags: List[str] | None = None, namespace: str | None = None, token: str | bool | None = None) InferenceEndpoint [source][source]¶
Create a new Inference Endpoint.
- Parameters:
name (str) – The unique name for the new Inference Endpoint.
repository (str) – The name of the model repository associated with the Inference Endpoint (e.g. "gpt2").
framework (str) – The machine learning framework used for the model (e.g. "custom").
accelerator (str) – The hardware accelerator to be used for inference (e.g. "cpu").
instance_size (str) – The size or type of the instance to be used for hosting the model (e.g. "x4").
instance_type (str) – The cloud instance type where the Inference Endpoint will be deployed (e.g. "intel-icl").
region (str) – The cloud region in which the Inference Endpoint will be created (e.g. "us-east-1").
vendor (str) – The cloud provider or vendor where the Inference Endpoint will be hosted (e.g. "aws").
account_id (str, optional) – The account ID used to link a VPC to a private Inference Endpoint (if applicable).
min_replica (int, optional) – The minimum number of replicas (instances) to keep running for the Inference Endpoint. To enable scaling to zero, set this value to 0 and adjust scale_to_zero_timeout accordingly. Defaults to 1.
max_replica (int, optional) – The maximum number of replicas (instances) to scale to for the Inference Endpoint. Defaults to 1.
scale_to_zero_timeout (int, optional) – The duration in minutes before an inactive endpoint is scaled to zero, or no scaling to zero if set to None and min_replica is not 0. Defaults to None.
revision (str, optional) – The specific model revision to deploy on the Inference Endpoint (e.g. "6c0e6080953db56375760c0471a8c5f2929baf11").
task (str, optional) – The task on which to deploy the model (e.g. "text-classification").
custom_image (Dict, optional) – A custom Docker image to use for the Inference Endpoint. This is useful if you want to deploy an Inference Endpoint running on the text-generation-inference (TGI) framework (see examples).
env (Dict[str, str], optional) – Non-secret environment variables to inject in the container environment.
secrets (Dict[str, str], optional) – Secret values to inject in the container environment.
type ([InferenceEndpointType], optional) – The type of the Inference Endpoint, which can be "protected" (default), "public" or "private".
domain (str, optional) – The custom domain for the Inference Endpoint deployment. If set up, the Inference Endpoint will be available at this domain (e.g. "my-new-domain.cool-website.woof").
path (str, optional) – The custom path to the deployed model, should start with a / (e.g. "/models/google-bert/bert-base-uncased").
cache_http_responses (bool, optional) – Whether to cache HTTP responses from the Inference Endpoint. Defaults to False.
tags (List[str], optional) – A list of tags to associate with the Inference Endpoint.
namespace (str, optional) – The namespace where the Inference Endpoint will be created. Defaults to the current user’s namespace.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
information about the created Inference Endpoint.
- Return type:
[InferenceEndpoint]
Example
```python
>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> endpoint = api.create_inference_endpoint(
...     "my-endpoint-name",
...     repository="gpt2",
...     framework="pytorch",
...     task="text-generation",
...     accelerator="cpu",
...     vendor="aws",
...     region="us-east-1",
...     type="protected",
...     instance_size="x2",
...     instance_type="intel-icl",
... )
>>> endpoint
InferenceEndpoint(name='my-endpoint-name', status="pending",...)

# Run inference on the endpoint
>>> endpoint.client.text_generation(...)
"..."
```
```python
# Start an Inference Endpoint running Zephyr-7b-beta on TGI
>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> endpoint = api.create_inference_endpoint(
...     "aws-zephyr-7b-beta-0486",
...     repository="HuggingFaceH4/zephyr-7b-beta",
...     framework="pytorch",
...     task="text-generation",
...     accelerator="gpu",
...     vendor="aws",
...     region="us-east-1",
...     type="protected",
...     instance_size="x1",
...     instance_type="nvidia-a10g",
...     env={
...         "MAX_BATCH_PREFILL_TOKENS": "2048",
...         "MAX_INPUT_LENGTH": "1024",
...         "MAX_TOTAL_TOKENS": "1512",
...         "MODEL_ID": "/repository"
...     },
...     custom_image={
...         "health_route": "/health",
...         "url": "ghcr.io/huggingface/text-generation-inference:1.1.0",
...     },
...     secrets={"MY_SECRET_KEY": "secret_value"},
...     tags=["dev", "text-generation"],
... )
```
```python
# Start an Inference Endpoint running ProsusAI/finbert while scaling to zero in 15 minutes
>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> endpoint = api.create_inference_endpoint(
...     "finbert-classifier",
...     repository="ProsusAI/finbert",
...     framework="pytorch",
...     task="text-classification",
...     min_replica=0,
...     scale_to_zero_timeout=15,
...     accelerator="cpu",
...     vendor="aws",
...     region="us-east-1",
...     type="protected",
...     instance_size="x2",
...     instance_type="intel-icl",
... )
>>> endpoint.wait(timeout=300)
# Run inference on the endpoint
>>> endpoint.client.text_classification(...)
TextClassificationOutputElement(label='positive', score=0.8983615040779114)
```
- create_inference_endpoint_from_catalog(repo_id: str, *, name: str | None = None, token: bool | str | None = None, namespace: str | None = None) InferenceEndpoint [source][source]¶
Create a new Inference Endpoint from a model in the Hugging Face Inference Catalog.
The goal of the Inference Catalog is to provide a curated list of models that are optimized for inference and for which default configurations have been tested. See https://endpoints.huggingface.co/catalog for a list of available models in the catalog.
- Parameters:
repo_id (str) – The ID of the model in the catalog to deploy as an Inference Endpoint.
name (str, optional) – The unique name for the new Inference Endpoint. If not provided, a random name will be generated.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication).
namespace (str, optional) – The namespace where the Inference Endpoint will be created. Defaults to the current user’s namespace.
- Returns:
Information about the new Inference Endpoint.
- Return type:
[InferenceEndpoint]
<Tip warning={true}>
create_inference_endpoint_from_catalog is experimental. Its API is subject to change in the future. Please provide feedback if you have any suggestions or requests.
</Tip>
- list_inference_catalog(*, token: bool | str | None = None) List[str] [source][source]¶
List models available in the Hugging Face Inference Catalog.
The goal of the Inference Catalog is to provide a curated list of models that are optimized for inference and for which default configurations have been tested. See https://endpoints.huggingface.co/catalog for a list of available models in the catalog.
Use [create_inference_endpoint_from_catalog] to deploy a model from the catalog.
- Parameters:
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication).
- Returns:
A list of model IDs available in the catalog.
- Return type:
List[str]
<Tip warning={true}>
list_inference_catalog is experimental. Its API is subject to change in the future. Please provide feedback if you have any suggestions or requests.
</Tip>
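The two catalog methods can be combined into a small guard. The following is a minimal sketch, not part of the library: `deploy_from_catalog` is a hypothetical helper name, and `api` is assumed to be an `HfApi` instance (or any object exposing the two methods documented above). It checks that the requested model is listed before asking for a deployment.

```python
from typing import Any, Optional


def deploy_from_catalog(api: Any, repo_id: str, name: Optional[str] = None) -> Any:
    """Deploy `repo_id` only if it is listed in the Inference Catalog.

    Hypothetical helper: `api` is assumed to expose `list_inference_catalog`
    and `create_inference_endpoint_from_catalog` as documented above.
    """
    available = api.list_inference_catalog()
    if repo_id not in available:
        raise ValueError(f"{repo_id!r} is not in the Inference Catalog")
    # `name` is optional; when it is None the service picks a random name.
    return api.create_inference_endpoint_from_catalog(repo_id, name=name)
```

With a real client this would be called as `deploy_from_catalog(HfApi(), repo_id)`, where `repo_id` is one of the IDs returned by `list_inference_catalog`. Both methods are marked experimental, so the sketch may need adjusting if their API changes.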
- get_inference_endpoint(name: str, *, namespace: str | None = None, token: str | bool | None = None) InferenceEndpoint [source][source]¶
Get information about an Inference Endpoint.
- Parameters:
name (str) – The name of the Inference Endpoint to retrieve information about.
namespace (str, optional) – The namespace in which the Inference Endpoint is located. Defaults to the current user.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
Information about the requested Inference Endpoint.
- Return type:
[InferenceEndpoint]
Example:
>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> endpoint = api.get_inference_endpoint("my-text-to-image")
>>> endpoint
InferenceEndpoint(name='my-text-to-image', ...)

# Get status
>>> endpoint.status
'running'
>>> endpoint.url
'https://my-text-to-image.region.vendor.endpoints.huggingface.cloud'

# Run inference
>>> endpoint.client.text_to_image(...)
- update_inference_endpoint(name: str, *, accelerator: str | None = None, instance_size: str | None = None, instance_type: str | None = None, min_replica: int | None = None, max_replica: int | None = None, scale_to_zero_timeout: int | None = None, repository: str | None = None, framework: str | None = None, revision: str | None = None, task: str | None = None, custom_image: Dict | None = None, env: Dict[str, str] | None = None, secrets: Dict[str, str] | None = None, domain: str | None = None, path: str | None = None, cache_http_responses: bool | None = None, tags: List[str] | None = None, namespace: str | None = None, token: str | bool | None = None) InferenceEndpoint [source][source]¶
Update an Inference Endpoint.
This method allows the update of either the compute configuration, the deployed model, the route, or any combination. All arguments are optional but at least one must be provided.
For convenience, you can also update an Inference Endpoint using [InferenceEndpoint.update].
- Parameters:
name (str) – The name of the Inference Endpoint to update.
accelerator (str, optional) – The hardware accelerator to be used for inference (e.g. "cpu").
instance_size (str, optional) – The size or type of the instance to be used for hosting the model (e.g. "x4").
instance_type (str, optional) – The cloud instance type where the Inference Endpoint will be deployed (e.g. "intel-icl").
min_replica (int, optional) – The minimum number of replicas (instances) to keep running for the Inference Endpoint.
max_replica (int, optional) – The maximum number of replicas (instances) to scale to for the Inference Endpoint.
scale_to_zero_timeout (int, optional) – The duration in minutes before an inactive endpoint is scaled to zero.
repository (str, optional) – The name of the model repository associated with the Inference Endpoint (e.g. "gpt2").
framework (str, optional) – The machine learning framework used for the model (e.g. "custom").
revision (str, optional) – The specific model revision to deploy on the Inference Endpoint (e.g. "6c0e6080953db56375760c0471a8c5f2929baf11").
task (str, optional) – The task on which to deploy the model (e.g. "text-classification").
custom_image (Dict, optional) – A custom Docker image to use for the Inference Endpoint. This is useful if you want to deploy an Inference Endpoint running on the text-generation-inference (TGI) framework (see examples).
env (Dict[str, str], optional) – Non-secret environment variables to inject in the container environment.
secrets (Dict[str, str], optional) – Secret values to inject in the container environment.
domain (str, optional) – The custom domain for the Inference Endpoint deployment. If set up, the Inference Endpoint will be available at this domain (e.g. "my-new-domain.cool-website.woof").
path (str, optional) – The custom path to the deployed model. It should start with a / (e.g. "/models/google-bert/bert-base-uncased").
cache_http_responses (bool, optional) – Whether to cache HTTP responses from the Inference Endpoint.
tags (List[str], optional) – A list of tags to associate with the Inference Endpoint.
namespace (str, optional) – The namespace where the Inference Endpoint will be updated. Defaults to the current user’s namespace.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
Information about the updated Inference Endpoint.
- Return type:
[InferenceEndpoint]
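Because every setting is optional but at least one must be provided, a thin wrapper can filter out unset values before calling the method. This is a minimal sketch under stated assumptions: `update_endpoint` is a hypothetical helper, `api` is assumed to be an `HfApi` instance, and the client-side check reflects the rule stated above rather than how the library itself validates input.

```python
from typing import Any


def update_endpoint(api: Any, name: str, **changes: Any) -> Any:
    """Forward only the settings that were actually provided.

    Hypothetical helper: drops `None` values and refuses to send an empty
    update, since at least one argument must be provided.
    """
    provided = {key: value for key, value in changes.items() if value is not None}
    if not provided:
        raise ValueError("at least one setting must be provided")
    return api.update_inference_endpoint(name, **provided)
```

For example, `update_endpoint(api, "my-endpoint-name", min_replica=0, max_replica=None)` would forward only `min_replica=0`.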
- delete_inference_endpoint(name: str, *, namespace: str | None = None, token: str | bool | None = None) None [source][source]¶
Delete an Inference Endpoint.
This operation is not reversible. If you don’t want to be charged for an Inference Endpoint, it is preferable to pause it with [pause_inference_endpoint] or scale it to zero with [scale_to_zero_inference_endpoint].
For convenience, you can also delete an Inference Endpoint using [InferenceEndpoint.delete].
- Parameters:
name (str) – The name of the Inference Endpoint to delete.
namespace (str, optional) – The namespace in which the Inference Endpoint is located. Defaults to the current user.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- pause_inference_endpoint(name: str, *, namespace: str | None = None, token: str | bool | None = None) InferenceEndpoint [source][source]¶
Pause an Inference Endpoint.
A paused Inference Endpoint will not be charged. It can be resumed at any time using [resume_inference_endpoint]. This is different from scaling the Inference Endpoint to zero with [scale_to_zero_inference_endpoint], which would be automatically restarted when a request is made to it.
For convenience, you can also pause an Inference Endpoint using [InferenceEndpoint.pause].
- Parameters:
name (str) – The name of the Inference Endpoint to pause.
namespace (str, optional) – The namespace in which the Inference Endpoint is located. Defaults to the current user.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
Information about the paused Inference Endpoint.
- Return type:
[InferenceEndpoint]
- resume_inference_endpoint(name: str, *, namespace: str | None = None, running_ok: bool = True, token: str | bool | None = None) InferenceEndpoint [source][source]¶
Resume an Inference Endpoint.
For convenience, you can also resume an Inference Endpoint using [InferenceEndpoint.resume].
- Parameters:
name (str) – The name of the Inference Endpoint to resume.
namespace (str, optional) – The namespace in which the Inference Endpoint is located. Defaults to the current user.
running_ok (bool, optional) – If True, the method will not raise an error if the Inference Endpoint is already running. Defaults to True.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
Information about the resumed Inference Endpoint.
- Return type:
[InferenceEndpoint]
- scale_to_zero_inference_endpoint(name: str, *, namespace: str | None = None, token: str | bool | None = None) InferenceEndpoint [source][source]¶
Scale Inference Endpoint to zero.
An Inference Endpoint scaled to zero will not be charged. It will be resumed on the next request to it, with a cold start delay. This is different from pausing the Inference Endpoint with [pause_inference_endpoint], which would require a manual resume with [resume_inference_endpoint].
For convenience, you can also scale an Inference Endpoint to zero using [InferenceEndpoint.scale_to_zero].
- Parameters:
name (str) – The name of the Inference Endpoint to scale to zero.
namespace (str, optional) – The namespace in which the Inference Endpoint is located. Defaults to the current user.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
Information about the scaled-to-zero Inference Endpoint.
- Return type:
[InferenceEndpoint]
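Pausing and scaling to zero both stop charges but differ in how the endpoint comes back: a paused endpoint needs a manual resume, while a scaled-to-zero endpoint restarts on the next request after a cold start. A minimal sketch of that choice follows, assuming `api` is an `HfApi` instance; `stop_billing` and its `auto_restart` flag are hypothetical names used only for illustration.

```python
from typing import Any, Optional


def stop_billing(api: Any, name: str, *, auto_restart: bool, namespace: Optional[str] = None) -> Any:
    """Stop charges for an endpoint using one of the two documented methods.

    Hypothetical helper:
    - auto_restart=True  -> scale to zero (restarts on the next request)
    - auto_restart=False -> pause (requires resume_inference_endpoint later)
    """
    if auto_restart:
        return api.scale_to_zero_inference_endpoint(name, namespace=namespace)
    return api.pause_inference_endpoint(name, namespace=namespace)
```

Scaling to zero suits endpoints that should stay reachable at the cost of a cold start; pausing suits endpoints that should stay off until someone resumes them explicitly.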
- list_collections(*, owner: List[str] | str | None = None, item: List[str] | str | None = None, sort: Literal['lastModified', 'trending', 'upvotes'] | None = None, limit: int | None = None, token: bool | str | None = None) Iterable[Collection] [source][source]¶
List collections on the Hugging Face Hub, given some filters.
<Tip warning={true}>
When listing collections, the item list per collection is truncated to 4 items maximum. To retrieve all items from a collection, you must use [get_collection].
</Tip>
- Parameters:
owner (List[str] or str, optional) – Filter by owner’s username.
item (List[str] or str, optional) – Filter collections containing particular items. Example: "models/teknium/OpenHermes-2.5-Mistral-7B", "datasets/squad" or "papers/2311.12983".
sort (Literal["lastModified", "trending", "upvotes"], optional) – Sort collections by last modified, trending or upvotes.
limit (int, optional) – Maximum number of collections to be returned.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
An iterable of [Collection] objects.
- Return type:
Iterable[Collection]
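Since listed collections carry at most 4 items each, retrieving complete item lists takes one extra `get_collection` call per collection. The sketch below illustrates that pattern; `iter_full_collections` is a hypothetical helper, and `api` is assumed to be an `HfApi` instance whose returned `Collection` objects expose a `slug` attribute.

```python
from typing import Any, Iterator, Optional


def iter_full_collections(api: Any, owner: str, limit: Optional[int] = None) -> Iterator[Any]:
    """Yield each of `owner`'s collections with its complete item list.

    Hypothetical helper: list_collections truncates items, so every listed
    collection is fetched again by slug with get_collection.
    """
    for summary in api.list_collections(owner=owner, limit=limit):
        yield api.get_collection(summary.slug)
```

This issues one request per collection, so passing a `limit` is advisable for owners with many collections.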
- get_collection(collection_slug: str, *, token: str | bool | None = None) Collection [source][source]¶
Gets information about a Collection on the Hub.
- Parameters:
collection_slug (str) – Slug of the collection on the Hub. Example: "TheBloke/recent-models-64f9a55bb3115b4f513ec026".
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
Returns: [Collection]
Example:
>>> from huggingface_hub import get_collection
>>> collection = get_collection("TheBloke/recent-models-64f9a55bb3115b4f513ec026")
>>> collection.title
'Recent models'
>>> len(collection.items)
37
>>> collection.items[0]
CollectionItem(
    item_object_id='651446103cd773a050bf64c2',
    item_id='TheBloke/U-Amethyst-20B-AWQ',
    item_type='model',
    position=88,
    note=None
)
- create_collection(title: str, *, namespace: str | None = None, description: str | None = None, private: bool = False, exists_ok: bool = False, token: str | bool | None = None) Collection [source][source]¶
Create a new Collection on the Hub.
- Parameters:
title (str) – Title of the collection to create. Example: "Recent models".
namespace (str, optional) – Namespace of the collection to create (username or org). Will default to the owner name.
description (str, optional) – Description of the collection to create.
private (bool, optional) – Whether the collection should be private or not. Defaults to False (i.e. public collection).
exists_ok (bool, optional) – If True, do not raise an error if collection already exists.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
Returns: [Collection]
Example:
>>> from huggingface_hub import create_collection
>>> collection = create_collection(
...     title="ICCV 2023",
...     description="Portfolio of models, papers and demos I presented at ICCV 2023",
... )
>>> collection.slug
"username/iccv-2023-64f9a55bb3115b4f513ec026"
- update_collection_metadata(collection_slug: str, *, title: str | None = None, description: str | None = None, position: int | None = None, private: bool | None = None, theme: str | None = None, token: str | bool | None = None) Collection [source][source]¶
Update metadata of a collection on the Hub.
All arguments are optional. Only provided metadata will be updated.
- Parameters:
collection_slug (str) – Slug of the collection to update. Example: "TheBloke/recent-models-64f9a55bb3115b4f513ec026".
title (str, optional) – Title of the collection to update.
description (str, optional) – Description of the collection to update.
position (int, optional) – New position of the collection in the list of collections of the user.
private (bool, optional) – Whether the collection should be private or not.
theme (str, optional) – Theme of the collection on the Hub.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
Returns: [Collection]
Example:
>>> from huggingface_hub import update_collection_metadata
>>> collection = update_collection_metadata(
...     collection_slug="username/iccv-2023-64f9a55bb3115b4f513ec026",
...     title="ICCV Oct. 2023",
...     description="Portfolio of models, datasets, papers and demos I presented at ICCV Oct. 2023",
...     private=False,
...     theme="pink",
... )
>>> collection.slug
"username/iccv-oct-2023-64f9a55bb3115b4f513ec026"
# ^collection slug got updated but not the trailing ID
- delete_collection(collection_slug: str, *, missing_ok: bool = False, token: str | bool | None = None) None [source][source]¶
Delete a collection on the Hub.
- Parameters:
collection_slug (str) – Slug of the collection to delete. Example: "TheBloke/recent-models-64f9a55bb3115b4f513ec026".
missing_ok (bool, optional) – If True, do not raise an error if the collection doesn’t exist.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
Example:
>>> from huggingface_hub import delete_collection
>>> delete_collection("username/useless-collection-64f9a55bb3115b4f513ec026", missing_ok=True)
<Tip warning={true}>
This is a non-revertible action. A deleted collection cannot be restored.
</Tip>
- add_collection_item(collection_slug: str, item_id: str, item_type: Literal['model', 'dataset', 'space', 'paper', 'collection'], *, note: str | None = None, exists_ok: bool = False, token: str | bool | None = None) Collection [source][source]¶
Add an item to a collection on the Hub.
- Parameters:
collection_slug (str) – Slug of the collection to update. Example: "TheBloke/recent-models-64f9a55bb3115b4f513ec026".
item_id (str) – ID of the item to add to the collection. It can be the ID of a repo on the Hub (e.g. "facebook/bart-large-mnli") or a paper id (e.g. "2307.09288").
item_type (str) – Type of the item to add. Can be one of "model", "dataset", "space", "paper" or "collection".
note (str, optional) – A note to attach to the item in the collection. The maximum size for a note is 500 characters.
exists_ok (bool, optional) – If True, do not raise an error if item already exists.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
Returns: [Collection]
- Raises:
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 403 if you only have read-only access to the repo. This can be the case if you don’t have write or admin role in the organization the repo belongs to or if you passed a read token.
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 404 if the item you try to add to the collection does not exist on the Hub.
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 409 if the item you try to add to the collection is already in the collection (and exists_ok=False).
Example:
>>> from huggingface_hub import add_collection_item
>>> collection = add_collection_item(
...     collection_slug="davanstrien/climate-64f99dc2a5067f6b65531bab",
...     item_id="pierre-loic/climate-news-articles",
...     item_type="dataset"
... )
>>> collection.items[-1].item_id
"pierre-loic/climate-news-articles"
# ^item got added to the collection on last position

# Add item with a note
>>> add_collection_item(
...     collection_slug="davanstrien/climate-64f99dc2a5067f6b65531bab",
...     item_id="datasets/climate_fever",
...     item_type="dataset",
...     note="This dataset adopts the FEVER methodology that consists of 1,535 real-world claims regarding climate-change collected on the internet."
... )
(...)
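The constraints on `item_type` and on the note length can be checked locally before any request is sent. The following is a minimal sketch, not library behaviour: `add_item_checked` and the two constants are hypothetical names, `api` is assumed to be an `HfApi` instance, and the allowed types are taken from the method signature shown above.

```python
from typing import Any, Optional

# Values taken from the add_collection_item signature documented above.
ALLOWED_ITEM_TYPES = ("model", "dataset", "space", "paper", "collection")
# Documented maximum size for a note, in characters.
MAX_NOTE_LENGTH = 500


def add_item_checked(
    api: Any,
    collection_slug: str,
    item_id: str,
    item_type: str,
    note: Optional[str] = None,
) -> Any:
    """Validate arguments locally, then add the item to the collection.

    Hypothetical helper: passes exists_ok=True so that an item already in
    the collection does not raise an HTTP 409 error.
    """
    if item_type not in ALLOWED_ITEM_TYPES:
        raise ValueError(f"unsupported item_type: {item_type!r}")
    if note is not None and len(note) > MAX_NOTE_LENGTH:
        raise ValueError(f"note exceeds {MAX_NOTE_LENGTH} characters")
    return api.add_collection_item(
        collection_slug, item_id, item_type, note=note, exists_ok=True
    )
```

Failing early this way avoids a round trip for input that would be rejected anyway; HTTP 403 and 404 errors can still be raised by the call itself.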
- update_collection_item(collection_slug: str, item_object_id: str, *, note: str | None = None, position: int | None = None, token: str | bool | None = None) None [source][source]¶
Update an item in a collection.
- Parameters:
collection_slug (str) – Slug of the collection to update. Example: "TheBloke/recent-models-64f9a55bb3115b4f513ec026".
item_object_id (str) – ID of the item in the collection. This is not the id of the item on the Hub (repo_id or paper id). It must be retrieved from a [CollectionItem] object. Example: collection.items[0].item_object_id.
note (str, optional) – A note to attach to the item in the collection. The maximum size for a note is 500 characters.
position (int, optional) – New position of the item in the collection.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
Example:
>>> from huggingface_hub import get_collection, update_collection_item

# Get collection first
>>> collection = get_collection("TheBloke/recent-models-64f9a55bb3115b4f513ec026")

# Update item based on its ID (add note + update position)
>>> update_collection_item(
...     collection_slug="TheBloke/recent-models-64f9a55bb3115b4f513ec026",
...     item_object_id=collection.items[-1].item_object_id,
...     note="Newly updated model!",
...     position=0,
... )
- delete_collection_item(collection_slug: str, item_object_id: str, *, missing_ok: bool = False, token: str | bool | None = None) None [source][source]¶
Delete an item from a collection.
- Parameters:
collection_slug (str) – Slug of the collection to update. Example: "TheBloke/recent-models-64f9a55bb3115b4f513ec026".
item_object_id (str) – ID of the item in the collection. This is not the id of the item on the Hub (repo_id or paper id). It must be retrieved from a [CollectionItem] object. Example: collection.items[0].item_object_id.
missing_ok (bool, optional) – If True, do not raise an error if the item doesn’t exist.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
Example:
>>> from huggingface_hub import get_collection, delete_collection_item

# Get collection first
>>> collection = get_collection("TheBloke/recent-models-64f9a55bb3115b4f513ec026")

# Delete item based on its ID
>>> delete_collection_item(
...     collection_slug="TheBloke/recent-models-64f9a55bb3115b4f513ec026",
...     item_object_id=collection.items[-1].item_object_id,
... )
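Both `update_collection_item` and `delete_collection_item` take the `item_object_id`, not the Hub ID of the repo or paper. A minimal sketch of the lookup between the two follows; `find_item_object_id` is a hypothetical helper that relies only on the `item_id` and `item_object_id` attributes of `CollectionItem` shown in the `get_collection` example.

```python
from typing import Any, Optional


def find_item_object_id(collection: Any, item_id: str) -> Optional[str]:
    """Return the item_object_id of the first item whose Hub ID is `item_id`.

    Hypothetical helper: returns None when the collection has no such item.
    """
    for item in collection.items:
        if item.item_id == item_id:
            return item.item_object_id
    return None
```

The collection passed in should come from `get_collection`, because collections returned by `list_collections` contain at most 4 items.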
- list_pending_access_requests(repo_id: str, *, repo_type: str | None = None, token: bool | str | None = None) List[AccessRequest] [source][source]¶
Get pending access requests for a given gated repo.
A pending request means the user has requested access to the repo but the request has not been processed yet. If the approval mode is automatic, this list should be empty. Pending requests can be accepted or rejected using [accept_access_request] and [reject_access_request].
For more info about gated repos, see https://huggingface.co/docs/hub/models-gated.
- Parameters:
repo_id (str) – The id of the repo to get access requests for.
repo_type (str, optional) – The type of the repo to get access requests for. Must be one of model, dataset or space. Defaults to model.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
A list of [AccessRequest] objects. Each item contains a username, email, status and timestamp attribute. If the gated repo has a custom form, the fields attribute will be populated with the user’s answers.
- Return type:
List[AccessRequest]
- Raises:
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 400 if the repo is not gated.
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 403 if you only have read-only access to the repo. This can be the case if you don’t have write or admin role in the organization the repo belongs to or if you passed a read token.
Example:
>>> from huggingface_hub import list_pending_access_requests, accept_access_request

# List pending requests
>>> requests = list_pending_access_requests("meta-llama/Llama-2-7b")
>>> len(requests)
411
>>> requests
[
    AccessRequest(
        username='clem',
        fullname='Clem 🤗',
        email='***',
        timestamp=datetime.datetime(2023, 11, 23, 18, 4, 53, 828000, tzinfo=datetime.timezone.utc),
        status='pending',
        fields=None,
    ),
    ...
]

# Accept Clem's request
>>> accept_access_request("meta-llama/Llama-2-7b", "clem")
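Listing and accepting can be combined to process pending requests in bulk. The sketch below is one possible policy, not a library feature: `accept_allowlisted` is a hypothetical helper, `api` is assumed to be an `HfApi` instance, and the allowlist of usernames is an assumption made for illustration. Users outside the allowlist are left pending for manual review.

```python
from typing import Any, Collection, List, Tuple


def accept_allowlisted(
    api: Any, repo_id: str, allowed_usernames: Collection[str]
) -> Tuple[List[str], List[str]]:
    """Accept pending requests from allowlisted users.

    Hypothetical helper: returns (accepted, left_pending) username lists.
    Requests from users not in `allowed_usernames` are not modified.
    """
    accepted: List[str] = []
    left_pending: List[str] = []
    for request in api.list_pending_access_requests(repo_id):
        if request.username in allowed_usernames:
            api.accept_access_request(repo_id, request.username)
            accepted.append(request.username)
        else:
            left_pending.append(request.username)
    return accepted, left_pending
```

Passing a `set` for `allowed_usernames` keeps the membership test cheap when the pending list is long.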
- list_accepted_access_requests(repo_id: str, *, repo_type: str | None = None, token: bool | str | None = None) List[AccessRequest] [source][source]¶
Get accepted access requests for a given gated repo.
An accepted request means the user has requested access to the repo and the request has been accepted. The user can download any file of the repo. If the approval mode is automatic, this list should contain by default all requests. Accepted requests can be cancelled or rejected at any time using [cancel_access_request] and [reject_access_request]. A cancelled request will go back to the pending list while a rejected request will go to the rejected list. In both cases, the user will lose access to the repo.
For more info about gated repos, see https://huggingface.co/docs/hub/models-gated.
- Parameters:
repo_id (str) – The id of the repo to get access requests for.
repo_type (str, optional) – The type of the repo to get access requests for. Must be one of model, dataset or space. Defaults to model.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
A list of [AccessRequest] objects. Each item contains a username, email, status and timestamp attribute. If the gated repo has a custom form, the fields attribute will be populated with the user’s answers.
- Return type:
List[AccessRequest]
- Raises:
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 400 if the repo is not gated.
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 403 if you only have read-only access to the repo. This can be the case if you don’t have write or admin role in the organization the repo belongs to or if you passed a read token.
Example:
>>> from huggingface_hub import list_accepted_access_requests

>>> requests = list_accepted_access_requests("meta-llama/Llama-2-7b")
>>> len(requests)
411
>>> requests
[
    AccessRequest(
        username='clem',
        fullname='Clem 🤗',
        email='***',
        timestamp=datetime.datetime(2023, 11, 23, 18, 4, 53, 828000, tzinfo=datetime.timezone.utc),
        status='accepted',
        fields=None,
    ),
    ...
]
- list_rejected_access_requests(repo_id: str, *, repo_type: str | None = None, token: bool | str | None = None) List[AccessRequest] [source][source]¶
Get rejected access requests for a given gated repo.
A rejected request means the user has requested access to the repo and the request has been explicitly rejected by a repo owner (either you or another user from your organization). The user cannot download any file of the repo. Rejected requests can be accepted or cancelled at any time using [accept_access_request] and [cancel_access_request]. A cancelled request will go back to the pending list while an accepted request will go to the accepted list.
For more info about gated repos, see https://huggingface.co/docs/hub/models-gated.
- Parameters:
repo_id (str) – The id of the repo to get access requests for.
repo_type (str, optional) – The type of the repo to get access requests for. Must be one of model, dataset or space. Defaults to model.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
A list of [AccessRequest] objects. Each item contains a username, email, status and timestamp attribute. If the gated repo has a custom form, the fields attribute will be populated with the user’s answers.
- Return type:
List[AccessRequest]
- Raises:
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 400 if the repo is not gated.
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 403 if you only have read-only access to the repo. This can be the case if you don’t have write or admin role in the organization the repo belongs to or if you passed a read token.
Example:
>>> from huggingface_hub import list_rejected_access_requests

>>> requests = list_rejected_access_requests("meta-llama/Llama-2-7b")
>>> len(requests)
411
>>> requests
[
    AccessRequest(
        username='clem',
        fullname='Clem 🤗',
        email='***',
        timestamp=datetime.datetime(2023, 11, 23, 18, 4, 53, 828000, tzinfo=datetime.timezone.utc),
        status='rejected',
        fields=None,
    ),
    ...
]
- cancel_access_request(repo_id: str, user: str, *, repo_type: str | None = None, token: bool | str | None = None) None [source][source]¶
Cancel an access request from a user for a given gated repo.
A cancelled request will go back to the pending list and the user will lose access to the repo.
For more info about gated repos, see https://huggingface.co/docs/hub/models-gated.
- Parameters:
repo_id (str) – The id of the repo to cancel access request for.
user (str) – The username of the user whose access request should be cancelled.
repo_type (str, optional) – The type of the repo to cancel access request for. Must be one of model, dataset or space. Defaults to model.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Raises:
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 400 if the repo is not gated.
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 403 if you only have read-only access to the repo. This can be the case if you don’t have write or admin role in the organization the repo belongs to or if you passed a read token.
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 404 if the user does not exist on the Hub.
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 404 if the user access request cannot be found.
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 404 if the user access request is already in the pending list.
- accept_access_request(repo_id: str, user: str, *, repo_type: str | None = None, token: bool | str | None = None) None [source][source]¶
Accept an access request from a user for a given gated repo.
Once the request is accepted, the user will be able to download any file of the repo and access the community tab. If the approval mode is automatic, you don’t have to accept requests manually. An accepted request can be cancelled or rejected at any time using cancel_access_request and reject_access_request.
For more info about gated repos, see https://huggingface.co/docs/hub/models-gated.
- Parameters:
repo_id (str) – The id of the repo to accept access request for.
user (str) – The username of the user whose access request should be accepted.
repo_type (str, optional) – The type of the repo to accept access request for. Must be one of model, dataset or space. Defaults to model.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Raises:
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 400 if the repo is not gated.
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 403 if you only have read-only access to the repo. This can be the case if you don’t have write or admin role in the organization the repo belongs to or if you passed a read token.
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 404 if the user does not exist on the Hub.
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 404 if the user access request cannot be found.
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 404 if the user access request is already in the accepted list.
- reject_access_request(repo_id: str, user: str, *, repo_type: str | None = None, rejection_reason: str | None, token: bool | str | None = None) None [source][source]¶
Reject an access request from a user for a given gated repo.
A rejected request will go to the rejected list. The user cannot download any file of the repo. Rejected requests can be accepted or cancelled at any time using accept_access_request and cancel_access_request. A cancelled request will go back to the pending list while an accepted request will go to the accepted list.
For more info about gated repos, see https://huggingface.co/docs/hub/models-gated.
- Parameters:
repo_id (str) – The id of the repo to reject access request for.
user (str) – The username of the user whose access request should be rejected.
repo_type (str, optional) – The type of the repo to reject access request for. Must be one of model, dataset or space. Defaults to model.
rejection_reason (str, optional) – Optional rejection reason that will be visible to the user (max 200 characters).
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Raises:
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 400 if the repo is not gated.
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 403 if you only have read-only access to the repo. This can be the case if you don’t have write or admin role in the organization the repo belongs to or if you passed a read token.
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 404 if the user does not exist on the Hub.
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 404 if the user access request cannot be found.
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 404 if the user access request is already in the rejected list.
- grant_access(repo_id: str, user: str, *, repo_type: str | None = None, token: bool | str | None = None) None [source][source]¶
Grant access to a user for a given gated repo.
Granting access does not require the user to send an access request themselves. The user is automatically added to the accepted list, meaning they can download the files. You can revoke the granted access at any time using cancel_access_request or reject_access_request.
For more info about gated repos, see https://huggingface.co/docs/hub/models-gated.
- Parameters:
repo_id (str) – The id of the repo to grant access to.
user (str) – The username of the user to grant access.
repo_type (str, optional) – The type of the repo to grant access to. Must be one of model, dataset or space. Defaults to model.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Raises:
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 400 if the repo is not gated.
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 400 if the user already has access to the repo.
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 403 if you only have read-only access to the repo. This can be the case if you don’t have write or admin role in the organization the repo belongs to or if you passed a read token.
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 404 if the user does not exist on the Hub.
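The access-request methods documented above move a request between three lists: pending, accepted and rejected. The following is a minimal local sketch of those transitions, not the Hub's implementation. `AccessStatus`, `AlreadyInListError`, `apply_action` and `can_download` are hypothetical names introduced only for illustration, and the error class stands in for the documented HTTP 404 "already in the ... list" case.

```python
from enum import Enum


class AccessStatus(Enum):
    PENDING = "pending"
    ACCEPTED = "accepted"
    REJECTED = "rejected"


class AlreadyInListError(Exception):
    """Hypothetical stand-in for the HTTP 404 'already in the ... list' error."""


# Target list for each documented action (cancel / accept / reject).
_TARGET = {
    "cancel": AccessStatus.PENDING,
    "accept": AccessStatus.ACCEPTED,
    "reject": AccessStatus.REJECTED,
}


def apply_action(current: AccessStatus, action: str) -> AccessStatus:
    """Return the list a request lands in after applying `action`."""
    target = _TARGET[action]
    if current is target:
        raise AlreadyInListError(f"request is already in the {target.value} list")
    return target


def can_download(status: AccessStatus) -> bool:
    """Only requests in the accepted list give access to the repo files."""
    return status is AccessStatus.ACCEPTED


if __name__ == "__main__":
    status = apply_action(AccessStatus.PENDING, "accept")
    print(status, can_download(status))
```

Modelling the lists as an enum keeps the "already in the target list" check in one place, which mirrors how each method documents its own 404 case.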
- get_webhook(webhook_id: str, *, token: bool | str | None = None) WebhookInfo [source][source]¶
Get a webhook by its id.
- Parameters:
webhook_id (str) – The unique identifier of the webhook to get.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
Info about the webhook.
- Return type:
WebhookInfo
Example
>>> from huggingface_hub import get_webhook
>>> webhook = get_webhook("654bbbc16f2ec14d77f109cc")
>>> print(webhook)
WebhookInfo(
    id="654bbbc16f2ec14d77f109cc",
    watched=[WebhookWatchedItem(type="user", name="julien-c"), WebhookWatchedItem(type="org", name="HuggingFaceH4")],
    url="https://webhook.site/a2176e82-5720-43ee-9e06-f91cb4c91548",
    secret="my-secret",
    domains=["repo", "discussion"],
    disabled=False,
)
- list_webhooks(*, token: bool | str | None = None) List[WebhookInfo] [source][source]¶
List all configured webhooks.
- Parameters:
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
List of webhook info objects.
- Return type:
List[WebhookInfo]
Example
>>> from huggingface_hub import list_webhooks
>>> webhooks = list_webhooks()
>>> len(webhooks)
2
>>> webhooks[0]
WebhookInfo(
    id="654bbbc16f2ec14d77f109cc",
    watched=[WebhookWatchedItem(type="user", name="julien-c"), WebhookWatchedItem(type="org", name="HuggingFaceH4")],
    url="https://webhook.site/a2176e82-5720-43ee-9e06-f91cb4c91548",
    secret="my-secret",
    domains=["repo", "discussion"],
    disabled=False,
)
- create_webhook(*, url: str, watched: List[Dict | WebhookWatchedItem], domains: List[Literal['repo', 'discussions']] | None = None, secret: str | None = None, token: bool | str | None = None) WebhookInfo [source][source]¶
Create a new webhook.
- Parameters:
url (str) – URL to send the payload to.
watched (List[WebhookWatchedItem]) – List of WebhookWatchedItem to be watched by the webhook. It can be users, orgs, models, datasets or spaces. Watched items can also be provided as plain dictionaries.
domains (List[Literal["repo", "discussion"]], optional) – List of domains to watch. It can be “repo”, “discussion” or both.
secret (str, optional) – A secret to sign the payload with.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
Info about the newly created webhook.
- Return type:
WebhookInfo
Example
>>> from huggingface_hub import create_webhook
>>> payload = create_webhook(
...     watched=[{"type": "user", "name": "julien-c"}, {"type": "org", "name": "HuggingFaceH4"}],
...     url="https://webhook.site/a2176e82-5720-43ee-9e06-f91cb4c91548",
...     domains=["repo", "discussion"],
...     secret="my-secret",
... )
>>> print(payload)
WebhookInfo(
    id="654bbbc16f2ec14d77f109cc",
    url="https://webhook.site/a2176e82-5720-43ee-9e06-f91cb4c91548",
    watched=[WebhookWatchedItem(type="user", name="julien-c"), WebhookWatchedItem(type="org", name="HuggingFaceH4")],
    domains=["repo", "discussion"],
    secret="my-secret",
    disabled=False,
)
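Because watched items may be passed as plain dictionaries, it can help to check them locally before calling create_webhook. The sketch below is one possible pre-flight check, not part of huggingface_hub. `build_webhook_kwargs` is a hypothetical helper, and the singular type names "model", "dataset" and "space" are an assumption extrapolated from the "user" and "org" values shown in the example above.

```python
from typing import Dict, List, Optional

# "user" and "org" appear in the documented example; the other three
# singular names are assumed for illustration.
WATCHED_TYPES = {"user", "org", "model", "dataset", "space"}
DOMAINS = {"repo", "discussion"}


def build_webhook_kwargs(
    url: str,
    watched: List[Dict[str, str]],
    domains: Optional[List[str]] = None,
    secret: Optional[str] = None,
) -> Dict[str, object]:
    """Validate the arguments and return keyword arguments for create_webhook."""
    for item in watched:
        if item.get("type") not in WATCHED_TYPES:
            raise ValueError(f"unknown watched type: {item.get('type')!r}")
        if not item.get("name"):
            raise ValueError("each watched item needs a non-empty 'name'")
    if domains is not None:
        unknown = set(domains) - DOMAINS
        if unknown:
            raise ValueError(f"unknown domains: {sorted(unknown)}")
    kwargs: Dict[str, object] = {"url": url, "watched": watched}
    # Optional arguments are only included when the caller set them.
    if domains is not None:
        kwargs["domains"] = list(domains)
    if secret is not None:
        kwargs["secret"] = secret
    return kwargs


if __name__ == "__main__":
    print(build_webhook_kwargs(
        "https://example.com/hook",
        [{"type": "user", "name": "julien-c"}],
        domains=["repo"],
    ))
```

The returned dictionary could then be unpacked into the call, for example `create_webhook(**kwargs)`.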
- update_webhook(webhook_id: str, *, url: str | None = None, watched: List[Dict | WebhookWatchedItem] | None = None, domains: List[Literal['repo', 'discussions']] | None = None, secret: str | None = None, token: bool | str | None = None) WebhookInfo [source][source]¶
Update an existing webhook.
- Parameters:
webhook_id (str) – The unique identifier of the webhook to be updated.
url (str, optional) – The URL to which the payload will be sent.
watched (List[WebhookWatchedItem], optional) – List of items to watch. It can be users, orgs, models, datasets, or spaces. Refer to WebhookWatchedItem for more details. Watched items can also be provided as plain dictionaries.
domains (List[Literal["repo", "discussion"]], optional) – The domains to watch. This can include “repo”, “discussion”, or both.
secret (str, optional) – A secret to sign the payload with, providing an additional layer of security.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
Info about the updated webhook.
- Return type:
WebhookInfo
Example
>>> from huggingface_hub import update_webhook
>>> updated_payload = update_webhook(
...     webhook_id="654bbbc16f2ec14d77f109cc",
...     url="https://new.webhook.site/a2176e82-5720-43ee-9e06-f91cb4c91548",
...     watched=[{"type": "user", "name": "julien-c"}, {"type": "org", "name": "HuggingFaceH4"}],
...     domains=["repo"],
...     secret="my-secret",
... )
>>> print(updated_payload)
WebhookInfo(
    id="654bbbc16f2ec14d77f109cc",
    url="https://new.webhook.site/a2176e82-5720-43ee-9e06-f91cb4c91548",
    watched=[WebhookWatchedItem(type="user", name="julien-c"), WebhookWatchedItem(type="org", name="HuggingFaceH4")],
    domains=["repo"],
    secret="my-secret",
    disabled=False,
)
- enable_webhook(webhook_id: str, *, token: bool | str | None = None) WebhookInfo [source][source]¶
Enable a webhook (makes it “active”).
- Parameters:
webhook_id (str) – The unique identifier of the webhook to enable.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
Info about the enabled webhook.
- Return type:
WebhookInfo
Example
>>> from huggingface_hub import enable_webhook
>>> enabled_webhook = enable_webhook("654bbbc16f2ec14d77f109cc")
>>> enabled_webhook
WebhookInfo(
    id="654bbbc16f2ec14d77f109cc",
    url="https://webhook.site/a2176e82-5720-43ee-9e06-f91cb4c91548",
    watched=[WebhookWatchedItem(type="user", name="julien-c"), WebhookWatchedItem(type="org", name="HuggingFaceH4")],
    domains=["repo", "discussion"],
    secret="my-secret",
    disabled=False,
)
- disable_webhook(webhook_id: str, *, token: bool | str | None = None) WebhookInfo [source][source]¶
Disable a webhook (makes it “disabled”).
- Parameters:
webhook_id (str) – The unique identifier of the webhook to disable.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
Info about the disabled webhook.
- Return type:
WebhookInfo
Example
>>> from huggingface_hub import disable_webhook
>>> disabled_webhook = disable_webhook("654bbbc16f2ec14d77f109cc")
>>> disabled_webhook
WebhookInfo(
    id="654bbbc16f2ec14d77f109cc",
    url="https://webhook.site/a2176e82-5720-43ee-9e06-f91cb4c91548",
    watched=[WebhookWatchedItem(type="user", name="julien-c"), WebhookWatchedItem(type="org", name="HuggingFaceH4")],
    domains=["repo", "discussion"],
    secret="my-secret",
    disabled=True,
)
- delete_webhook(webhook_id: str, *, token: bool | str | None = None) None [source][source]¶
Delete a webhook.
- Parameters:
webhook_id (str) – The unique identifier of the webhook to delete.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
None
Example
>>> from huggingface_hub import delete_webhook
>>> delete_webhook("654bbbc16f2ec14d77f109cc")
- get_user_overview(username: str, token: str | bool | None = None) User [source][source]¶
Get an overview of a user on the Hub.
- Parameters:
username (str) – Username of the user to get an overview of.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
A User object with the user’s overview.
- Return type:
User
- Raises:
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 404 if the user does not exist on the Hub.
- list_organization_members(organization: str, token: str | bool | None = None) Iterable[User] [source][source]¶
List of members of an organization on the Hub.
- Parameters:
organization (str) – Name of the organization to get the members of.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
A list of User objects with the members of the organization.
- Return type:
Iterable[User]
- Raises:
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 404 if the organization does not exist on the Hub.
- list_user_followers(username: str, token: str | bool | None = None) Iterable[User] [source][source]¶
Get the list of followers of a user on the Hub.
- Parameters:
username (str) – Username of the user to get the followers of.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
A list of User objects with the followers of the user.
- Return type:
Iterable[User]
- Raises:
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 404 if the user does not exist on the Hub.
- list_user_following(username: str, token: str | bool | None = None) Iterable[User] [source][source]¶
Get the list of users followed by a user on the Hub.
- Parameters:
username (str) – Username of the user whose followed users should be listed.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
A list of User objects with the users followed by the user.
- Return type:
Iterable[User]
- Raises:
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 404 if the user does not exist on the Hub.
- list_papers(*, query: str | None = None, token: str | bool | None = None) Iterable[PaperInfo] [source][source]¶
List daily papers on the Hugging Face Hub given a search query.
- Parameters:
query (str, optional) – A search query string to find papers. If provided, returns papers that match the query.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
- Returns:
An iterable of huggingface_hub.hf_api.PaperInfo objects.
- Return type:
Iterable[PaperInfo]
Example:
>>> from huggingface_hub import HfApi
>>> api = HfApi()
# List all papers with "attention" in their title
>>> api.list_papers(query="attention")
- paper_info(id: str) PaperInfo [source][source]¶
Get information for a paper on the Hub.
- Parameters:
id (str) – ArXiv id of the paper.
- Returns:
A PaperInfo object.
- Return type:
PaperInfo
- Raises:
[HTTPError](https://requests.readthedocs.io/en/latest/api/#requests.HTTPError): HTTP 404 if the paper does not exist on the Hub.
- auth_check(repo_id: str, *, repo_type: str | None = None, token: str | bool | None = None) None [source][source]¶
Check if the provided user token has access to a specific repository on the Hugging Face Hub.
This method verifies whether the user, authenticated via the provided token, has access to the specified repository. If the repository is not found or if the user lacks the required permissions to access it, the method raises an appropriate exception.
- Parameters:
repo_id (str) – The repository to check for access. Format should be "user/repo_name". Example: "user/my-cool-model".
repo_type (str, optional) – The type of the repository. Should be one of "model", "dataset", or "space". If not specified, the default is "model".
token (Union[bool, str, None], optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.
- Raises:
RepositoryNotFoundError – Raised if the repository does not exist, is private, or the user does not have access. This can occur if the repo_id or repo_type is incorrect or if the repository is private but the user is not authenticated.
GatedRepoError – Raised if the repository exists but is gated and the user is not authorized to access it.
Example
Check if the user has access to a repository:
>>> from huggingface_hub import auth_check
>>> from huggingface_hub.utils import GatedRepoError, RepositoryNotFoundError
try:
    auth_check("user/my-cool-model")
except GatedRepoError:
    # Handle gated repository error
    print("You do not have permission to access this gated repository.")
except RepositoryNotFoundError:
    # Handle repository not found error
    print("The repository was not found or you do not have access.")
In this example:
- If the user has access, the method completes successfully.
- If the repository is gated or does not exist, appropriate exceptions are raised, allowing the user to handle them accordingly.
- run_job(*, image: str, command: List[str], env: Dict[str, Any] | None = None, secrets: Dict[str, Any] | None = None, flavor: SpaceHardware | None = None, timeout: int | float | str | None = None, namespace: str | None = None, token: str | bool | None = None) JobInfo [source][source]¶
Run compute Jobs on Hugging Face infrastructure.
- Parameters:
image (str) – The Docker image to use. Examples: "ubuntu", "python:3.12", "pytorch/pytorch:2.6.0-cuda12.4-cudnn9-devel". Example with an image from a Space: "hf.co/spaces/lhoestq/duckdb".
command (List[str]) – The command to run. Example: ["echo", "hello"].
env (Dict[str, Any], optional) – Defines the environment variables for the Job.
secrets (Dict[str, Any], optional) – Defines the secret environment variables for the Job.
flavor (str, optional) – Flavor for the hardware, as in Hugging Face Spaces. See SpaceHardware for possible values. Defaults to "cpu-basic".
timeout (Union[int, float, str], optional) – Max duration for the Job: int/float with s (seconds, default), m (minutes), h (hours) or d (days). Example: 300 or "5m" for 5 minutes.
namespace (str, optional) – The namespace where the Job will be created. Defaults to the current user’s namespace.
token (Union[bool, str, None], optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.
Example
Run your first Job:
>>> from huggingface_hub import run_job
>>> run_job(image="python:3.12", command=["python", "-c", "print('Hello from HF compute!')"])
Run a GPU Job:
>>> from huggingface_hub import run_job
>>> image = "pytorch/pytorch:2.6.0-cuda12.4-cudnn9-devel"
>>> command = ["python", "-c", "import torch; print(f'This code ran with the following GPU: {torch.cuda.get_device_name()}')"]
>>> run_job(image=image, command=command, flavor="a10g-small")
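The timeout parameter above accepts either a number of seconds or a string with a unit suffix. The sketch below shows one way to normalise such values to seconds on the client side. `timeout_to_seconds` is a hypothetical helper written for illustration under the documented unit rules; it is not the parser huggingface_hub uses.

```python
from typing import Union

# Unit suffixes as documented: s (seconds, default), m, h, d.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def timeout_to_seconds(timeout: Union[int, float, str]) -> float:
    """Convert a timeout such as 300, "5m" or "1.5h" to seconds."""
    if isinstance(timeout, bool):
        # bool is a subclass of int; reject it explicitly.
        raise TypeError("timeout must be a number or a string, not bool")
    if isinstance(timeout, (int, float)):
        return float(timeout)
    text = timeout.strip()
    if not text:
        raise ValueError("timeout string is empty")
    unit = text[-1].lower()
    if unit in _UNIT_SECONDS:
        return float(text[:-1]) * _UNIT_SECONDS[unit]
    # No suffix: the value is interpreted as seconds.
    return float(text)


if __name__ == "__main__":
    for value in (300, "5m", "1.5h", "2d"):
        print(value, timeout_to_seconds(value))
```

With this helper, 300 and "5m" both map to the same duration, matching the documented example.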
- fetch_job_logs(*, job_id: str, namespace: str | None = None, token: str | bool | None = None) Iterable[str] [source][source]¶
Fetch all the logs from a compute Job on Hugging Face infrastructure.
- Parameters:
job_id (str) – ID of the Job.
namespace (str, optional) – The namespace where the Job is running. Defaults to the current user’s namespace.
token (Union[bool, str, None], optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.
Example
>>> from huggingface_hub import fetch_job_logs, run_job
>>> job = run_job(image="python:3.12", command=["python", "-c", "print('Hello from HF compute!')"])
>>> for log in fetch_job_logs(job_id=job.id):
...     print(log)
Hello from HF compute!
- list_jobs(*, timeout: int | None = None, namespace: str | None = None, token: str | bool | None = None) List[JobInfo] [source][source]¶
List compute Jobs on Hugging Face infrastructure.
- Parameters:
timeout (float, optional) – Whether to set a timeout for the request to the Hub.
namespace (str, optional) – The namespace from where it lists the jobs. Defaults to the current user’s namespace.
token (Union[bool, str, None], optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.
- inspect_job(*, job_id: str, namespace: str | None = None, token: str | bool | None = None) JobInfo [source][source]¶
Inspect a compute Job on Hugging Face infrastructure.
- Parameters:
job_id (str) – ID of the Job.
namespace (str, optional) – The namespace where the Job is running. Defaults to the current user’s namespace.
token (Union[bool, str, None], optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.
Example
>>> from huggingface_hub import inspect_job, run_job
>>> job = run_job(image="python:3.12", command=["python", "-c", "print('Hello from HF compute!')"])
>>> inspect_job(job_id=job.id)
JobInfo(
    id='68780d00bbe36d38803f645f',
    created_at=datetime.datetime(2025, 7, 16, 20, 35, 12, 808000, tzinfo=datetime.timezone.utc),
    docker_image='python:3.12',
    space_id=None,
    command=['python', '-c', "print('Hello from HF compute!')"],
    arguments=[],
    environment={},
    secrets={},
    flavor='cpu-basic',
    status=JobStatus(stage='RUNNING', message=None)
)
- cancel_job(*, job_id: str, namespace: str | None = None, token: str | bool | None = None) None [source][source]¶
Cancel a compute Job on Hugging Face infrastructure.
- Parameters:
job_id (str) – ID of the Job.
namespace (str, optional) – The namespace where the Job is running. Defaults to the current user’s namespace.
token (Union[bool, str, None], optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.
- run_uv_job(script: str, *, script_args: List[str] | None = None, dependencies: List[str] | None = None, python: str | None = None, image: str | None = None, env: Dict[str, Any] | None = None, secrets: Dict[str, Any] | None = None, flavor: SpaceHardware | None = None, timeout: int | float | str | None = None, namespace: str | None = None, token: bool | str | None = None, _repo: str | None = None) JobInfo [source][source]¶
Run a UV script Job on Hugging Face infrastructure.
- Parameters:
script (str) – Path or URL of the UV script, or a command.
script_args (List[str], optional) – Arguments to pass to the script or command.
dependencies (List[str], optional) – Dependencies to use to run the UV script.
python (str, optional) – Use a specific Python version. Default is 3.12.
image (str, optional, defaults to "ghcr.io/astral-sh/uv:python3.12-bookworm") – Use a custom Docker image with uv installed.
env (Dict[str, Any], optional) – Defines the environment variables for the Job.
secrets (Dict[str, Any], optional) – Defines the secret environment variables for the Job.
flavor (str, optional) – Flavor for the hardware, as in Hugging Face Spaces. See SpaceHardware for possible values. Defaults to "cpu-basic".
timeout (Union[int, float, str], optional) – Max duration for the Job: int/float with s (seconds, default), m (minutes), h (hours) or d (days). Example: 300 or "5m" for 5 minutes.
namespace (str, optional) – The namespace where the Job will be created. Defaults to the current user’s namespace.
token (Union[bool, str, None], optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.
Example
Run a script from a URL:
>>> from huggingface_hub import run_uv_job
>>> script = "https://raw.githubusercontent.com/huggingface/trl/refs/heads/main/trl/scripts/sft.py"
>>> script_args = ["--model_name_or_path", "Qwen/Qwen2-0.5B", "--dataset_name", "trl-lib/Capybara", "--push_to_hub"]
>>> run_uv_job(script, script_args=script_args, dependencies=["trl"], flavor="a10g-small")
Run a local script:
>>> from huggingface_hub import run_uv_job
>>> script = "my_sft.py"
>>> script_args = ["--model_name_or_path", "Qwen/Qwen2-0.5B", "--dataset_name", "trl-lib/Capybara", "--push_to_hub"]
>>> run_uv_job(script, script_args=script_args, dependencies=["trl"], flavor="a10g-small")
Run a command:
>>> from huggingface_hub import run_uv_job
>>> script = "lighteval"
>>> script_args = ["endpoint", "inference-providers", "model_name=openai/gpt-oss-20b,provider=auto", "lighteval|gsm8k|0|0"]
>>> run_uv_job(script, script_args=script_args, dependencies=["lighteval"], flavor="a10g-small")
- create_scheduled_job(*, image: str, command: List[str], schedule: str, suspend: bool | None = None, concurrency: bool | None = None, env: Dict[str, Any] | None = None, secrets: Dict[str, Any] | None = None, flavor: SpaceHardware | None = None, timeout: int | float | str | None = None, namespace: str | None = None, token: str | bool | None = None) ScheduledJobInfo [source][source]¶
Create scheduled compute Jobs on Hugging Face infrastructure.
- Parameters:
image (str) – The Docker image to use. Examples: "ubuntu", "python:3.12", "pytorch/pytorch:2.6.0-cuda12.4-cudnn9-devel". Example with an image from a Space: "hf.co/spaces/lhoestq/duckdb".
command (List[str]) – The command to run. Example: ["echo", "hello"].
schedule (str) – One of “@annually”, “@yearly”, “@monthly”, “@weekly”, “@daily”, “@hourly”, or a CRON schedule expression (e.g., ‘0 9 * * 1’ for 9 AM every Monday).
suspend (bool, optional) – If True, the scheduled Job is suspended (paused). Defaults to False.
concurrency (bool, optional) – If True, multiple instances of this Job can run concurrently. Defaults to False.
env (Dict[str, Any], optional) – Defines the environment variables for the Job.
secrets (Dict[str, Any], optional) – Defines the secret environment variables for the Job.
flavor (str, optional) – Flavor for the hardware, as in Hugging Face Spaces. See SpaceHardware for possible values. Defaults to "cpu-basic".
timeout (Union[int, float, str], optional) – Max duration for the Job: int/float with s (seconds, default), m (minutes), h (hours) or d (days). Example: 300 or "5m" for 5 minutes.
namespace (str, optional) – The namespace where the Job will be created. Defaults to the current user’s namespace.
token (Union[bool, str, None], optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.
Example
Create your first scheduled Job:
>>> from huggingface_hub import create_scheduled_job
>>> create_scheduled_job(image="python:3.12", command=["python", "-c", "print('Hello from HF compute!')"], schedule="@hourly")
Use a CRON schedule expression:
>>> from huggingface_hub import create_scheduled_job
>>> create_scheduled_job(image="python:3.12", command=["python", "-c", "print('this runs every 5min')"], schedule="*/5 * * * *")
Create a scheduled GPU Job:
>>> from huggingface_hub import create_scheduled_job
>>> image = "pytorch/pytorch:2.6.0-cuda12.4-cudnn9-devel"
>>> command = ["python", "-c", "import torch; print(f'This code ran with the following GPU: {torch.cuda.get_device_name()}')"]
>>> create_scheduled_job(image=image, command=command, flavor="a10g-small", schedule="@hourly")
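The schedule parameter above takes either one of the listed shortcuts or a CRON expression. A rough local check of the string's shape can catch typos before a request is sent. The sketch below assumes five-field CRON expressions, as in the two documented examples, and only checks the number of fields, not their values. `looks_like_schedule` is a hypothetical helper, not part of huggingface_hub.

```python
# Shortcuts listed in the schedule parameter description.
SHORTCUTS = {"@annually", "@yearly", "@monthly", "@weekly", "@daily", "@hourly"}


def looks_like_schedule(schedule: str) -> bool:
    """Return True if `schedule` is a known shortcut or has five CRON fields."""
    text = schedule.strip()
    if text.startswith("@"):
        return text in SHORTCUTS
    # Assumption: minute, hour, day of month, month, day of week.
    return len(text.split()) == 5


if __name__ == "__main__":
    for candidate in ("@hourly", "0 9 * * 1", "*/5 * * * *", "@minutely"):
        print(candidate, looks_like_schedule(candidate))
```

This is only a shape check; the Hub remains the authority on whether a given expression is accepted.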
- list_scheduled_jobs(*, timeout: int | None = None, namespace: str | None = None, token: str | bool | None = None) List[ScheduledJobInfo] [source][source]¶
List scheduled compute Jobs on Hugging Face infrastructure.
- Parameters:
timeout (float, optional) – Whether to set a timeout for the request to the Hub.
namespace (str, optional) – The namespace from where it lists the jobs. Defaults to the current user’s namespace.
token (Union[bool, str, None], optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.
- inspect_scheduled_job(*, scheduled_job_id: str, namespace: str | None = None, token: str | bool | None = None) ScheduledJobInfo [source][source]¶
Inspect a scheduled compute Job on Hugging Face infrastructure.
- Parameters:
scheduled_job_id (str) – ID of the scheduled Job.
namespace (str, optional) – The namespace where the scheduled Job is. Defaults to the current user’s namespace.
token (Union[bool, str, None], optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.
Example
>>> from huggingface_hub import inspect_scheduled_job, create_scheduled_job
>>> scheduled_job = create_scheduled_job(image="python:3.12", command=["python", "-c", "print('Hello from HF compute!')"], schedule="@hourly")
>>> inspect_scheduled_job(scheduled_job_id=scheduled_job.id)
- delete_scheduled_job(*, scheduled_job_id: str, namespace: str | None = None, token: str | bool | None = None) None [source][source]¶
Delete a scheduled compute Job on Hugging Face infrastructure.
- Parameters:
scheduled_job_id (str) – ID of the scheduled Job.
namespace (str, optional) – The namespace where the scheduled Job is. Defaults to the current user’s namespace.
token (Union[bool, str, None], optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.
- suspend_scheduled_job(*, scheduled_job_id: str, namespace: str | None = None, token: str | bool | None = None) None [source][source]¶
Suspend (pause) a scheduled compute Job on Hugging Face infrastructure.
- Parameters:
scheduled_job_id (str) – ID of the scheduled Job.
namespace (str, optional) – The namespace where the scheduled Job is. Defaults to the current user’s namespace.
token (Union[bool, str, None], optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.
- resume_scheduled_job(*, scheduled_job_id: str, namespace: str | None = None, token: str | bool | None = None) None [source][source]¶
Resume (unpause) a scheduled compute Job on Hugging Face infrastructure.
- Parameters:
scheduled_job_id (str) – ID of the scheduled Job.
namespace (str, optional) – The namespace where the scheduled Job is. Defaults to the current user’s namespace.
token (Union[bool, str, None], optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.
- create_scheduled_uv_job(script: str, *, script_args: List[str] | None = None, schedule: str, suspend: bool | None = None, concurrency: bool | None = None, dependencies: List[str] | None = None, python: str | None = None, image: str | None = None, env: Dict[str, Any] | None = None, secrets: Dict[str, Any] | None = None, flavor: SpaceHardware | None = None, timeout: int | float | str | None = None, namespace: str | None = None, token: bool | str | None = None, _repo: str | None = None) ScheduledJobInfo [source][source]¶
Create a scheduled UV script Job on Hugging Face infrastructure.
- Parameters:
script (str) – Path or URL of the UV script, or a command.
script_args (List[str], optional) – Arguments to pass to the script, or a command.
schedule (str) – One of “@annually”, “@yearly”, “@monthly”, “@weekly”, “@daily”, “@hourly”, or a CRON schedule expression (e.g., ‘0 9 * * 1’ for 9 AM every Monday).
suspend (bool, optional) – If True, the scheduled Job is suspended (paused). Defaults to False.
concurrency (bool, optional) – If True, multiple instances of this Job can run concurrently. Defaults to False.
dependencies (List[str], optional) – Dependencies to use to run the UV script.
python (str, optional) – Use a specific Python version. Default is 3.12.
image (str, optional, defaults to "ghcr.io/astral-sh/uv:python3.12-bookworm") – Use a custom Docker image with uv installed.
env (Dict[str, Any], optional) – Defines the environment variables for the Job.
secrets (Dict[str, Any], optional) – Defines the secret environment variables for the Job.
flavor (str, optional) – Flavor for the hardware, as in Hugging Face Spaces. See [SpaceHardware] for possible values. Defaults to "cpu-basic".
timeout (Union[int, float, str], optional) – Max duration for the Job: int/float with s (seconds, default), m (minutes), h (hours) or d (days). Example: 300 or "5m" for 5 minutes.
namespace (str, optional) – The namespace where the Job will be created. Defaults to the current user’s namespace.
token (Union[bool, str, None], optional) – A valid user access token. If not provided, the locally saved token will be used, which is the recommended authentication method. Set to False to disable authentication. Refer to: https://huggingface.co/docs/huggingface_hub/quick-start#authentication.
Example
Schedule a script from a URL:
>>> from huggingface_hub import create_scheduled_uv_job
>>> script = "https://raw.githubusercontent.com/huggingface/trl/refs/heads/main/trl/scripts/sft.py"
>>> script_args = ["--model_name_or_path", "Qwen/Qwen2-0.5B", "--dataset_name", "trl-lib/Capybara", "--push_to_hub"]
>>> create_scheduled_uv_job(script, script_args=script_args, dependencies=["trl"], flavor="a10g-small", schedule="@weekly")
Schedule a local script:
>>> from huggingface_hub import create_scheduled_uv_job
>>> script = "my_sft.py"
>>> script_args = ["--model_name_or_path", "Qwen/Qwen2-0.5B", "--dataset_name", "trl-lib/Capybara", "--push_to_hub"]
>>> create_scheduled_uv_job(script, script_args=script_args, dependencies=["trl"], flavor="a10g-small", schedule="@weekly")
Schedule a command:
>>> from huggingface_hub import create_scheduled_uv_job
>>> script = "lighteval"
>>> script_args = ["endpoint", "inference-providers", "model_name=openai/gpt-oss-20b,provider=auto", "lighteval|gsm8k|0|0"]
>>> create_scheduled_uv_job(script, script_args=script_args, dependencies=["lighteval"], flavor="a10g-small", schedule="@weekly")
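The schedule parameter takes either one of the listed aliases or a CRON expression. A local sanity check before submitting a Job might look like the sketch below. It assumes five whitespace-separated CRON fields (as in the '0 9 * * 1' example) and only checks the shape of the value; `looks_like_schedule` and `SCHEDULE_ALIASES` are hypothetical names, and the Hub remains the authority on what it accepts.

```python
# Aliases listed in the schedule parameter description above.
SCHEDULE_ALIASES = {"@annually", "@yearly", "@monthly", "@weekly", "@daily", "@hourly"}


def looks_like_schedule(schedule: str) -> bool:
    """Hypothetical pre-check: a known alias, or five whitespace-separated fields.

    Only the shape is checked; the value ranges of the individual
    CRON fields are not validated.
    """
    value = schedule.strip()
    if value.startswith("@"):
        return value in SCHEDULE_ALIASES
    return len(value.split()) == 5
```

Such a check only catches obvious typos early; a value that passes can still be rejected by the server.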
- tooluniverse.embedding_sync.upload_folder(*, repo_id: str, folder_path: str | Path, path_in_repo: str | None = None, commit_message: str | None = None, commit_description: str | None = None, token: str | bool | None = None, repo_type: str | None = None, revision: str | None = None, create_pr: bool | None = None, parent_commit: str | None = None, allow_patterns: str | List[str] | None = None, ignore_patterns: str | List[str] | None = None, delete_patterns: str | List[str] | None = None, run_as_future: bool = False) CommitInfo | Future[CommitInfo] [source]¶
Upload a local folder to the given repo. The upload is done through HTTP requests and doesn’t require git or git-lfs to be installed.
The structure of the folder will be preserved. Files with the same name already present in the repository will be overwritten. Others will be left untouched.
Use the allow_patterns and ignore_patterns arguments to specify which files to upload. These parameters accept either a single pattern or a list of patterns. Patterns are Standard Wildcards (globbing patterns) as documented [here](https://tldp.org/LDP/GNU-Linux-Tools-Summary/html/x11655.htm). If both allow_patterns and ignore_patterns are provided, both constraints apply. By default, all files from the folder are uploaded.
Use the delete_patterns argument to specify remote files you want to delete. Input type is the same as for allow_patterns (see above). If path_in_repo is also provided, the patterns are matched against paths relative to this folder. For example, upload_folder(..., path_in_repo="experiment", delete_patterns="logs/*") will delete any remote file under ./experiment/logs/. Note that the .gitattributes file will not be deleted even if it matches the patterns.
Any .git/ folder present in any subdirectory will be ignored. However, please be aware that the .gitignore file is not taken into account.
Uses HfApi.create_commit under the hood.
- Parameters:
repo_id (str) – The repository to which the file will be uploaded, for example: "username/custom_transformers".
folder_path (str or Path) – Path to the folder to upload on the local file system.
path_in_repo (str, optional) – Relative path of the directory in the repo, for example: "checkpoints/1fec34a/results". Will default to the root folder of the repository.
token (Union[bool, str, None], optional) – A valid user access token (string). Defaults to the locally saved token, which is the recommended method for authentication (see https://huggingface.co/docs/huggingface_hub/quick-start#authentication). To disable authentication, pass False.
repo_type (str, optional) – Set to "dataset" or "space" if uploading to a dataset or space, None or "model" if uploading to a model. Default is None.
revision (str, optional) – The git revision to commit from. Defaults to the head of the "main" branch.
commit_message (str, optional) – The summary / title / first line of the generated commit. Defaults to: f"Upload {path_in_repo} with huggingface_hub".
commit_description (str, optional) – The description of the generated commit.
create_pr (boolean, optional) – Whether or not to create a Pull Request with that commit. Defaults to False. If revision is not set, PR is opened against the "main" branch. If revision is set and is a branch, PR is opened against this branch. If revision is set and is not a branch name (example: a commit oid), a RevisionNotFoundError is returned by the server.
parent_commit (str, optional) – The OID / SHA of the parent commit, as a hexadecimal string. Shorthands (7 first characters) are also supported. If specified and create_pr is False, the commit will fail if revision does not point to parent_commit. If specified and create_pr is True, the pull request will be created from parent_commit. Specifying parent_commit ensures the repo has not changed before committing the changes, and can be especially useful if the repo is updated / committed to concurrently.
allow_patterns (List[str] or str, optional) – If provided, only files matching at least one pattern are uploaded.
ignore_patterns (List[str] or str, optional) – If provided, files matching any of the patterns are not uploaded.
delete_patterns (List[str] or str, optional) – If provided, remote files matching any of the patterns will be deleted from the repo while committing new files. This is useful if you don’t know which files have already been uploaded. Note: to avoid discrepancies the .gitattributes file is not deleted even if it matches the pattern.
run_as_future (bool, optional) – Whether or not to run this method in the background. Background jobs are run sequentially without blocking the main thread. Passing run_as_future=True will return a [Future](https://docs.python.org/3/library/concurrent.futures.html#future-objects) object. Defaults to False.
- Returns:
Instance of [CommitInfo] containing information about the newly created commit (commit hash, commit url, pr url, commit message,…). If run_as_future=True is passed, returns a Future object which will contain the result when executed.
- Return type:
[CommitInfo] or Future
<Tip>
Raises the following errors:
- HTTPError if the HuggingFace API returned an error
- [ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) if some parameter value is invalid
</Tip>
<Tip warning={true}>
upload_folder assumes that the repo already exists on the Hub. If you get a Client error 404, please make sure you are authenticated and that repo_id and repo_type are set correctly. If repo does not exist, create it first using [~hf_api.create_repo].
</Tip>
<Tip>
When dealing with a large folder (thousands of files or hundreds of GB), we recommend using [~hf_api.upload_large_folder] instead.
</Tip>
Example:
# Upload checkpoints folder except the log files
>>> upload_folder(
...     folder_path="local/checkpoints",
...     path_in_repo="remote/experiment/checkpoints",
...     repo_id="username/my-dataset",
...     repo_type="dataset",
...     token="my_token",
...     ignore_patterns="**/logs/*.txt",
... )
# "https://huggingface.co/datasets/username/my-dataset/tree/main/remote/experiment/checkpoints"
# Upload checkpoints folder including logs while deleting existing logs from the repo
# Useful if you don't know exactly which log files have already been pushed
>>> upload_folder(
...     folder_path="local/checkpoints",
...     path_in_repo="remote/experiment/checkpoints",
...     repo_id="username/my-dataset",
...     repo_type="dataset",
...     token="my_token",
...     delete_patterns="**/logs/*.txt",
... )
"https://huggingface.co/datasets/username/my-dataset/tree/main/remote/experiment/checkpoints"
# Upload checkpoints folder while creating a PR
>>> upload_folder(
...     folder_path="local/checkpoints",
...     path_in_repo="remote/experiment/checkpoints",
...     repo_id="username/my-dataset",
...     repo_type="dataset",
...     token="my_token",
...     create_pr=True,
... )
"https://huggingface.co/datasets/username/my-dataset/tree/refs%2Fpr%2F1/remote/experiment/checkpoints"
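To make the "both constraints apply" rule for allow_patterns and ignore_patterns concrete, here is a minimal sketch of such a filter. It assumes fnmatch-style wildcard matching from the standard library, which may differ in detail from what upload_folder does internally; `select_files` and `_as_list` are hypothetical helpers, not part of huggingface_hub.

```python
from fnmatch import fnmatch
from typing import Iterable, List, Optional, Union

Patterns = Optional[Union[str, List[str]]]


def _as_list(patterns: Patterns) -> Optional[List[str]]:
    """Accept a single pattern or a list of patterns, like the parameters above."""
    if patterns is None:
        return None
    return [patterns] if isinstance(patterns, str) else list(patterns)


def select_files(paths: Iterable[str], allow_patterns: Patterns = None,
                 ignore_patterns: Patterns = None) -> List[str]:
    """Hypothetical filter: keep paths matching any allow pattern and no ignore pattern."""
    allow = _as_list(allow_patterns)
    ignore = _as_list(ignore_patterns)
    selected = []
    for path in paths:
        # An allow list, when given, must match at least once.
        if allow is not None and not any(fnmatch(path, p) for p in allow):
            continue
        # An ignore list, when given, must not match at all.
        if ignore is not None and any(fnmatch(path, p) for p in ignore):
            continue
        selected.append(path)
    return selected
```

With no patterns at all the function returns every path, which mirrors the documented default of uploading all files in the folder.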
- tooluniverse.embedding_sync.snapshot_download(repo_id: str, *, repo_type: str | None = None, revision: str | None = None, cache_dir: str | Path | None = None, local_dir: str | Path | None = None, library_name: str | None = None, library_version: str | None = None, user_agent: Dict | str | None = None, proxies: Dict | None = None, etag_timeout: float = 10, force_download: bool = False, token: bool | str | None = None, local_files_only: bool = False, allow_patterns: List[str] | str | None = None, ignore_patterns: List[str] | str | None = None, max_workers: int = 8, tqdm_class: Type[tqdm_asyncio] | None = None, headers: Dict[str, str] | None = None, endpoint: str | None = None, local_dir_use_symlinks: bool | Literal['auto'] = 'auto', resume_download: bool | None = None) str [source][source]¶
Download repo files.
Download a whole snapshot of a repo’s files at the specified revision. This is useful when you want all files from a repo, because you don’t know which ones you will need a priori. All files are nested inside a folder in order to keep their actual filename relative to that folder. You can also filter which files to download using allow_patterns and ignore_patterns.
If local_dir is provided, the file structure from the repo will be replicated in this location. When using this option, the cache_dir will not be used and a .cache/huggingface/ folder will be created at the root of local_dir to store some metadata related to the downloaded files. While this mechanism is not as robust as the main cache-system, it’s optimized for regularly pulling the latest version of a repository.
An alternative would be to clone the repo but this requires git and git-lfs to be installed and properly configured. It is also not possible to filter which files to download when cloning a repository using git.
- Parameters:
repo_id (str) – A user or an organization name and a repo name separated by a /.
repo_type (str, optional) – Set to "dataset" or "space" if downloading from a dataset or space, None or "model" if downloading from a model. Default is None.
revision (str, optional) – An optional Git revision id which can be a branch name, a tag, or a commit hash.
cache_dir (str, Path, optional) – Path to the folder where cached files are stored.
local_dir (str or Path, optional) – If provided, the downloaded files will be placed under this directory.
library_name (str, optional) – The name of the library to which the object corresponds.
library_version (str, optional) – The version of the library.
user_agent (str, dict, optional) – The user-agent info in the form of a dictionary or a string.
proxies (dict, optional) – Dictionary mapping protocol to the URL of the proxy passed to requests.request.
etag_timeout (float, optional, defaults to 10) – When fetching ETag, how many seconds to wait for the server to send data before giving up which is passed to requests.request.
force_download (bool, optional, defaults to False) – Whether the file should be downloaded even if it already exists in the local cache.
token (str, bool, optional) – A token to be used for the download. If True, the token is read from the HuggingFace config folder. If a string, it’s used as the authentication token.
headers (dict, optional) – Additional headers to include in the request. Those headers take precedence over the others.
local_files_only (bool, optional, defaults to False) – If True, avoid downloading the file and return the path to the local cached file if it exists.
allow_patterns (List[str] or str, optional) – If provided, only files matching at least one pattern are downloaded.
ignore_patterns (List[str] or str, optional) – If provided, files matching any of the patterns are not downloaded.
max_workers (int, optional) – Number of concurrent threads to download files (1 thread = 1 file download). Defaults to 8.
tqdm_class (tqdm, optional) – If provided, overwrites the default behavior for the progress bar. Passed argument must inherit from tqdm.auto.tqdm or at least mimic its behavior. Note that the tqdm_class is not passed to each individual download. Defaults to the custom HF progress bar that can be disabled by setting HF_HUB_DISABLE_PROGRESS_BARS environment variable.
- Returns:
folder path of the repo snapshot.
- Return type:
str
- Raises:
[RepositoryNotFoundError] – If the repository to download from cannot be found. This may be because it doesn’t exist, or because it is set to private and you do not have access.
[RevisionNotFoundError] – If the revision to download from cannot be found.
[EnvironmentError](https://docs.python.org/3/library/exceptions.html#EnvironmentError) – If token=True and the token cannot be found.
[OSError](https://docs.python.org/3/library/exceptions.html#OSError) – If ETag cannot be determined.
[ValueError](https://docs.python.org/3/library/exceptions.html#ValueError) – If some parameter value is invalid.
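The max_workers parameter is described as "1 thread = 1 file download". A minimal sketch of that pattern with the standard library is shown below. It is only an illustration of the threading model, not the code snapshot_download runs: `download_all` and `fake_download` are hypothetical, and the stand-in downloader touches neither the network nor the disk.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def download_all(filenames: List[str], download_one: Callable[[str], str],
                 max_workers: int = 8) -> Dict[str, str]:
    """Hypothetical sketch: one download per thread, at most max_workers at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with filenames.
        local_paths = list(pool.map(download_one, filenames))
    return dict(zip(filenames, local_paths))


def fake_download(filename: str) -> str:
    """Stand-in for a real per-file download; it only builds a fake local path."""
    return f"/tmp/snapshot/{filename}"
```

Raising max_workers helps mostly when a repo has many small files; with a few large files the network link tends to be the limit.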
- exception tooluniverse.embedding_sync.HfHubHTTPError(message: str, response: Response | None = None, *, server_message: str | None = None)[source][source]¶
Bases:
HTTPError
HTTPError to inherit from for any custom HTTP Error raised in HF Hub.
Any HTTPError is converted at least into a HfHubHTTPError. If some information is sent back by the server, it will be added to the error message.
Added details:
- Request id from “X-Request-Id” header if exists. If not, fallback to “X-Amzn-Trace-Id” header if exists.
- Server error message from the header “X-Error-Message”.
- Server error message if we can find one in the response body.
import requests
from huggingface_hub.utils import get_session, hf_raise_for_status, HfHubHTTPError

response = get_session().post(...)
try:
    hf_raise_for_status(response)
except HfHubHTTPError as e:
    print(str(e))  # formatted message
    e.request_id, e.server_message  # details returned by server
# Complete the error message with additional information once it’s raised e.append_to_message(”
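The request-id fallback described for HfHubHTTPError (use “X-Request-Id”, otherwise “X-Amzn-Trace-Id”) can be sketched as a small lookup. This is an illustration only: `pick_request_id` is a hypothetical helper, and it assumes a plain case-sensitive mapping, whereas real HTTP header containers are usually case-insensitive.

```python
from typing import Mapping, Optional


def pick_request_id(headers: Mapping[str, str]) -> Optional[str]:
    """Hypothetical helper: prefer X-Request-Id, fall back to X-Amzn-Trace-Id."""
    # `or` moves on to the fallback when the first header is missing or empty.
    return headers.get("X-Request-Id") or headers.get("X-Amzn-Trace-Id")
```

When neither header is present the helper returns None, so callers can simply omit the id from the message.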
- class tooluniverse.embedding_sync.BaseTool(tool_config)[source][source]¶
Bases:
object
- classmethod get_default_config_file()[source][source]¶
Get the path to the default configuration file for this tool type.
This method uses a robust path resolution strategy that works across different installation scenarios:
Installed packages: Uses importlib.resources for proper package resource access
Development mode: Falls back to file-based path resolution
Legacy Python: Handles importlib.resources and importlib_resources
Override this method in subclasses to specify a custom defaults file.
- Returns:
Path or resource object pointing to the defaults file
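The resolution strategy described above (package resource first, file-based path as a fallback) could look roughly like the sketch below. It is not ToolUniverse's implementation: `find_defaults_file` and its arguments are hypothetical, it assumes Python 3.9+ for `importlib.resources.files`, and it leaves out the `importlib_resources` backport mentioned for legacy Python.

```python
from importlib import resources
from pathlib import Path


def find_defaults_file(package: str, filename: str, fallback_dir: Path):
    """Hypothetical resolver: try the package resource, then a path on disk."""
    try:
        candidate = resources.files(package) / filename
        if candidate.is_file():
            return candidate
    except (ModuleNotFoundError, TypeError):
        # The package is not importable, e.g. when running from a source checkout.
        pass
    # Development-mode fallback: resolve relative to a known directory.
    return fallback_dir / filename
```

A subclass overriding get_default_config_file could return the result of such a helper with its own package and file names.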
- tooluniverse.embedding_sync.register_tool(tool_type_name=None, config=None)[source][source]¶
Decorator to automatically register tool classes and their configs.
- Usage:
@register_tool('CustomToolName', config={...})
class MyTool:
    pass
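For readers unfamiliar with registering decorators, the general pattern behind a call like the one above can be sketched as follows. This is a generic illustration, not ToolUniverse's actual code: `register_tool_sketch`, `TOOL_REGISTRY` and `TOOL_CONFIGS` are hypothetical names, and the real registry may store classes and configs differently.

```python
from typing import Any, Dict, Optional

# Hypothetical registries; ToolUniverse's real storage may differ.
TOOL_REGISTRY: Dict[str, type] = {}
TOOL_CONFIGS: Dict[str, Dict[str, Any]] = {}


def register_tool_sketch(tool_type_name: Optional[str] = None,
                         config: Optional[Dict[str, Any]] = None):
    """Return a class decorator that records the class (and config) by name."""
    def decorator(cls: type) -> type:
        # Fall back to the class name when no explicit name is given.
        name = tool_type_name or cls.__name__
        TOOL_REGISTRY[name] = cls
        if config is not None:
            TOOL_CONFIGS[name] = config
        return cls  # the class itself is returned unchanged
    return decorator


@register_tool_sketch("CustomToolName", config={"description": "demo"})
class MyTool:
    pass
```

Because the decorator returns the class unchanged, MyTool can still be instantiated and subclassed as usual; registration is a side effect at import time.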
- tooluniverse.embedding_sync.get_logger(name: str | None = None) Logger [source][source]¶
Get a logger instance
- Parameters:
name (str, optional) – Logger name (usually __name__)
- Returns:
Logger instance
- Return type: