This note was generated by AI analysis.
Created: 2026-03-28
Source: https://huggingface.co/docs/huggingface_hub/en/guides/upload
Summary
Official Hugging Face Hub Python library documentation for uploading files to repositories. Covers upload_file(), upload_folder(), upload_large_folder(), CLI (hf upload), non-blocking uploads (run_as_future), scheduled uploads (CommitScheduler), and low-level create_commit() operations.
Key Points
- upload_file(): upload a single file; specify path_or_fileobj, path_in_repo, repo_id, repo_type (model/dataset/space)
- upload_folder(): upload an entire folder; respects .gitignore; supports allow_patterns, ignore_patterns, delete_patterns; creates a single commit
- upload_large_folder(): resumable, multi-threaded, resilient upload for large datasets; splits the work into many small tasks with local caching; creates multiple commits
- CLI: hf upload [repo_id] [local_path] [path_in_repo]; hf upload-large-folder for large datasets
- Non-blocking: run_as_future=True returns a Future object; background uploads are queued and run in order
- CommitScheduler: push data to the Hub periodically from a local folder; designed for append-only streaming data (training logs, user feedback); uses scheduler.lock for thread-safety
- create_commit(): low-level API; supports CommitOperationAdd, CommitOperationDelete, CommitOperationCopy
- hf_xet: new chunk-based deduplication storage (enabled by default in huggingface_hub ≥ 0.32.0); faster uploads; set HF_XET_HIGH_PERFORMANCE=1 for maximum throughput
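The high-level calls in the list above can be sketched as follows. This is a minimal illustration, not the guide's own example: the helper name `upload_run_artifacts`, the target path `runs/latest`, and the glob patterns are hypothetical. The `api` argument is assumed to be a `huggingface_hub.HfApi` instance; it is passed in rather than created here so the sketch can be exercised without network access.

```python
from concurrent.futures import Future
from typing import Any


def upload_run_artifacts(api: Any, repo_id: str, run_dir: str) -> Future:
    """Upload a README synchronously, then the run folder in the background.

    `api` is assumed to be a `huggingface_hub.HfApi` instance (hypothetical
    helper; names and patterns below are illustrative only).
    """
    # Single file: this call blocks until the commit has been created.
    api.upload_file(
        path_or_fileobj=f"{run_dir}/README.md",
        path_in_repo="README.md",
        repo_id=repo_id,
        repo_type="model",
    )
    # Whole folder as a single commit, restricted by glob patterns.
    # run_as_future=True makes the call return a Future instead of blocking.
    return api.upload_folder(
        folder_path=run_dir,
        path_in_repo="runs/latest",
        repo_id=repo_id,
        repo_type="model",
        allow_patterns=["*.safetensors", "*.json"],  # only upload these
        ignore_patterns=["**/logs/*.txt"],           # never upload these
        delete_patterns=["*.bin"],                   # remove matching remote files
        run_as_future=True,
    )
```

With the real library, the call would look like `upload_run_artifacts(HfApi(), "username/my-model", "./run")`, where the repo id is a placeholder; calling `.result()` on the returned Future waits for the background upload to finish.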
Insights
CommitScheduler is the pattern to know for ML training pipelines: instead of polluting the git history with thousands of checkpoint commits, schedule periodic batch commits. The preupload_lfs_files() + create_commit() pattern is important for large shard uploads: pre-uploading each shard to S3 as it is produced, before making the final commit, avoids out-of-memory (OOM) issues from holding all shards in memory at once. The hf_xet storage system (Rust-based, chunk-deduplication) is a significant improvement for large model repositories where many versions share most weights.
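The CommitScheduler point relies on writers and the background upload thread not touching a file at the same time. A minimal sketch of the writer side, assuming the lock passed in is the scheduler's `scheduler.lock`; the helper name `append_record` and the JSON Lines layout are illustrative choices, not part of the library:

```python
import json
from pathlib import Path
from typing import Any, ContextManager, Dict


def append_record(log_file: Path, lock: ContextManager[Any], record: Dict[str, Any]) -> None:
    """Append one JSON line while holding the given lock.

    `lock` is assumed to be `CommitScheduler.lock`, so the scheduler never
    commits a half-written line (hypothetical helper for illustration).
    """
    with lock:
        # Append-only writes fit the scheduler's intended use: existing
        # content is never rewritten, only extended.
        with log_file.open("a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
```

With the real library, the scheduler would be created once, for example `CommitScheduler(repo_id="username/my-logs", repo_type="dataset", folder_path=log_file.parent, every=10)`, where the repo id is a placeholder and `every` is expressed in minutes; each write would then pass `scheduler.lock` as the `lock` argument.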
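The shard-by-shard pre-upload pattern can be sketched roughly as below. The assumptions: `api` is a `huggingface_hub.HfApi` instance, `make_add` is `huggingface_hub.CommitOperationAdd`, and shards arrive lazily as `(path_in_repo, bytes)` pairs. The function name `commit_shards` and the default commit message are hypothetical; both collaborators are passed in so the sketch runs without network access.

```python
from typing import Any, Callable, Iterable, List, Tuple


def commit_shards(
    api: Any,
    repo_id: str,
    shards: Iterable[Tuple[str, bytes]],
    make_add: Callable[..., Any],
    repo_type: str = "dataset",
    commit_message: str = "Upload all shards",
) -> Any:
    """Pre-upload each shard as it is produced, then create one commit.

    `api` is assumed to be an HfApi instance and `make_add` the
    CommitOperationAdd class (hypothetical helper for illustration).
    """
    operations: List[Any] = []
    for path_in_repo, payload in shards:
        addition = make_add(path_in_repo=path_in_repo, path_or_fileobj=payload)
        # Send this shard's content now, so only one shard at a time
        # needs to be held in memory by the caller.
        api.preupload_lfs_files(repo_id, additions=[addition], repo_type=repo_type)
        operations.append(addition)
    # One commit referencing every pre-uploaded shard.
    return api.create_commit(
        repo_id,
        operations=operations,
        commit_message=commit_message,
        repo_type=repo_type,
    )
```

Passing a generator for `shards` is the point of the design: each payload is created, uploaded, and can be released before the next one is built, while the final history still contains a single commit.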
Connections
Raw Excerpt
Sharing your files and work is an important aspect of the Hub. The huggingface_hub offers several options for uploading your files to the Hub. You can use these functions independently or integrate them into your library.