Huggingface Hub

Learn about using Sentry for Huggingface Hub.

This integration connects Sentry with the Huggingface Hub Python SDK and has been confirmed to work with Huggingface Hub version 0.21.4.

Once you've installed this SDK, you can use Sentry LLM Monitoring, a Sentry dashboard that helps you understand what's going on with your AI pipelines.

Sentry LLM Monitoring will automatically collect information about prompts, tokens, and models from providers like OpenAI. Learn more about it here.

Install sentry-sdk from PyPI with the huggingface_hub extra:

pip install --upgrade 'sentry-sdk[huggingface_hub]'

If you have the huggingface_hub package in your dependencies, the Huggingface Hub integration will be enabled automatically when you initialize the Sentry SDK.

Configuration should happen as early as possible in your application's lifecycle.

import sentry_sdk

sentry_sdk.init(
    dsn="https://examplePublicKey@o0.ingest.sentry.io/0",
    # Set traces_sample_rate to 1.0 to capture 100%
    # of transactions for tracing.
    traces_sample_rate=1.0,
    # Set profiles_sample_rate to 1.0 to profile 100%
    # of sampled transactions.
    # We recommend adjusting this value in production.
    profiles_sample_rate=1.0,
)
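One way to follow the advice about adjusting sample rates in production is to derive the options from the environment before calling sentry_sdk.init(). The sketch below is only an illustration: the helper name sentry_options, the APP_ENV variable, and the 0.2 rate are assumptions made for this example, not values recommended by Sentry.

```python
import os


def sentry_options(environ=os.environ):
    """Build keyword arguments for sentry_sdk.init() from an environment mapping."""
    # APP_ENV is a hypothetical variable name used only in this sketch.
    is_production = environ.get("APP_ENV", "development") == "production"
    # 0.2 is an arbitrary illustrative rate; choose one that fits your traffic.
    rate = 0.2 if is_production else 1.0
    return {
        "dsn": environ.get("SENTRY_DSN"),
        "traces_sample_rate": rate,
        "profiles_sample_rate": rate,
    }


options = sentry_options({"APP_ENV": "production"})
# At application startup you would then call:
# sentry_sdk.init(**options)
```

Keeping the decision in a small function makes it easy to check the chosen rates without sending any data to Sentry.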

Verify that the integration works by creating an AI pipeline. The resulting data should show up in your LLM monitoring dashboard.

import sentry_sdk
from sentry_sdk.ai.monitoring import ai_track
from huggingface_hub import InferenceClient

sentry_sdk.init(...)  # same as above

client = InferenceClient(token="(your Huggingface Hub API token)", model="HuggingFaceH4/zephyr-7b-beta")

@ai_track("My AI pipeline")
def my_pipeline():
    with sentry_sdk.start_transaction(op="ai-inference", name="The result of the AI inference"):
        print(client.text_generation(prompt="say hello", details=True))

# Call the pipeline so the transaction and its span are actually recorded.
my_pipeline()

After running this script, a pipeline will be created in the LLM Monitoring section of the Sentry dashboard. The pipeline will have an associated Huggingface Hub span for the text_generation operation.

It may take a couple of moments for the data to appear in sentry.io.

  • The Huggingface Hub integration will connect Sentry with all supported Huggingface Hub methods automatically.

  • All exceptions in supported SDK methods are reported to Sentry automatically.

  • Currently, the only supported method is InferenceClient.text_generation.

  • Sentry treats LLM and tokenizer inputs and outputs as PII and does not send them by default. To include this data, set send_default_pii=True in the sentry_sdk.init() call. To exclude prompts and outputs even when send_default_pii=True, configure the integration with include_prompts=False, as shown in the options example below.
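The interaction between send_default_pii and include_prompts described above behaves like a logical AND. The function below is a simplified model of that rule, not the SDK's own code; the name prompts_are_sent is hypothetical, and the defaults are inferred from the text (PII is off by default, prompts are included unless explicitly excluded).

```python
def prompts_are_sent(send_default_pii: bool = False, include_prompts: bool = True) -> bool:
    """Model of when prompts and outputs would be attached to Sentry data.

    Both switches must allow it: the global PII setting and the
    integration-level include_prompts option.
    """
    return send_default_pii and include_prompts


# Walk through the four combinations of the two settings.
for pii in (False, True):
    for include in (False, True):
        print(f"send_default_pii={pii}, include_prompts={include} -> {prompts_are_sent(pii, include)}")
```

Only the combination send_default_pii=True with include_prompts=True results in prompts and outputs being sent.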

After adding HuggingfaceHubIntegration to your sentry_sdk.init() call explicitly, you'll be able to set options to change its behavior:

import sentry_sdk
from sentry_sdk.integrations.huggingface_hub import HuggingfaceHubIntegration

sentry_sdk.init(
    # ...
    send_default_pii=True,
    integrations=[
        HuggingfaceHubIntegration(
            # LLM/tokenizer inputs/outputs will not be sent to Sentry, despite send_default_pii=True
            include_prompts=False,
        ),
    ],
)

Supported versions:

  • huggingface_hub: 0.21.4+
  • Python: 3.9+
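If you want to confirm at startup that your environment meets these minimum versions, a check along the following lines may help. This is a rough sketch using only the standard library: the helper names are hypothetical, and the version parsing is deliberately simple (it reads the leading digits of the first three components and ignores suffixes such as "rc1").

```python
import sys
from importlib import metadata


def parse_version(text):
    """Turn '0.21.4' into (0, 21, 4), ignoring non-numeric suffixes."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    return tuple(parts)


def environment_is_supported():
    """Return True if Python and huggingface_hub meet the listed minimums."""
    if sys.version_info < (3, 9):
        return False
    try:
        installed = metadata.version("huggingface_hub")
    except metadata.PackageNotFoundError:
        # The package is not installed, so the integration cannot be enabled.
        return False
    return parse_version(installed) >= (0, 21, 4)


print("Supported environment:", environment_is_supported())
```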