Integration: Hugging Face Transformers
Run Transformers models locally in your Haystack pipelines
Overview
Transformers is Hugging Face’s library for state-of-the-art machine learning models. With this integration, you can run models from the Hugging Face Hub locally, on your own machine, in your Haystack pipelines.
Haystack supports Hugging Face models in other ways too:
- Sentence Transformers for local embedding and ranking models
- Hugging Face API to call models via Inference Providers, Inference Endpoints, or self-hosted TGI/TEI
- Optimum for high-performance inference with ONNX Runtime
Installation
pip install transformers-haystack
Usage
Components
This integration provides several components that run Transformers models locally:
- TransformersChatGenerator: chat generation with local LLMs.
- TransformersExtractiveReader: extracts answers from documents using question answering models.
- TransformersTextRouter and TransformersZeroShotTextRouter: route text to different pipeline branches based on classification.
- TransformersZeroShotDocumentClassifier: classifies documents with zero-shot classification models.
- TransformersNamedEntityExtractor: annotates named entities in documents.
Chat Generation
Use TransformersChatGenerator to run a chat model locally:
from haystack_integrations.components.generators.transformers import TransformersChatGenerator
from haystack.dataclasses import ChatMessage
generator = TransformersChatGenerator(model="Qwen/Qwen3-0.6B")
messages = [ChatMessage.from_user("What's Natural Language Processing? Be brief.")]
print(generator.run(messages))
Extractive Question Answering
Use TransformersExtractiveReader to extract answers from the relevant context:
from haystack import Document, Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack_integrations.components.readers.transformers import TransformersExtractiveReader
docs = [Document(content="Paris is the capital of France."),
        Document(content="Berlin is the capital of Germany."),
        Document(content="Rome is the capital of Italy."),
        Document(content="Madrid is the capital of Spain.")]
document_store = InMemoryDocumentStore()
document_store.write_documents(docs)
retriever = InMemoryBM25Retriever(document_store=document_store)
reader = TransformersExtractiveReader(model="deepset/roberta-base-squad2-distilled")
extractive_qa_pipeline = Pipeline()
extractive_qa_pipeline.add_component(instance=retriever, name="retriever")
extractive_qa_pipeline.add_component(instance=reader, name="reader")
extractive_qa_pipeline.connect("retriever.documents", "reader.documents")
query = "What is the capital of France?"
extractive_qa_pipeline.run(data={"retriever": {"query": query, "top_k": 3},
                                 "reader": {"query": query, "top_k": 2}})
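A Haystack pipeline run returns a dictionary keyed by component name, so the reader's output sits under the "reader" key. The sketch below shows one way to pull answer texts out of such a result. It assumes the reader follows the shape used by Haystack's extractive readers: a list under "answers" whose items expose `data` (the answer text, or `None` for a "no answer" entry) and `score`. `FakeAnswer`, `best_answers`, and the scores are hypothetical stand-ins so the snippet runs without downloading a model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeAnswer:
    """Hypothetical stand-in for an extracted answer (assumed fields: data, score)."""
    data: Optional[str]
    score: float


def best_answers(result: dict, min_score: float = 0.5) -> list:
    """Return answer texts with score >= min_score, highest score first.

    Entries whose data is None (the assumed "no answer" placeholder) are skipped.
    """
    answers = result.get("reader", {}).get("answers", [])
    kept = [a for a in answers if a.data is not None and a.score >= min_score]
    kept.sort(key=lambda a: a.score, reverse=True)
    return [a.data for a in kept]


# Hypothetical pipeline result; the scores are made up for illustration.
fake_result = {
    "reader": {
        "answers": [
            FakeAnswer(data="Rome", score=0.12),
            FakeAnswer(data="Paris", score=0.98),
            FakeAnswer(data=None, score=0.02),
        ]
    }
}

print(best_answers(fake_result))
```

To try it on real output, assign the return value of the pipeline run to a variable and pass it to `best_answers` in place of `fake_result`, after checking that the attribute names match your installed version.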
Zero-Shot Document Classification
Use TransformersZeroShotDocumentClassifier to classify documents with labels of your choice, without fine-tuning:
from haystack import Document
from haystack_integrations.components.classifiers.transformers import TransformersZeroShotDocumentClassifier
documents = [Document(content="Today was a nice day!"),
             Document(content="Yesterday was a bad day!")]
classifier = TransformersZeroShotDocumentClassifier(
    model="cross-encoder/nli-deberta-v3-xsmall",
    labels=["positive", "negative"],
)
result = classifier.run(documents=documents)
print([doc.meta["classification"]["label"] for doc in result["documents"]])
# ['positive', 'negative']
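Zero-shot classifiers built on natural language inference (NLI) models, such as the cross-encoder used here, typically turn each candidate label into a hypothesis, score how strongly the document entails it, and normalize those entailment scores across labels. The following is a rough sketch of that last normalization step only, not the component's actual code. The logits are made up and stand in for model output, and `softmax` and `pick_label` are hypothetical helper names.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def pick_label(entailment_logits: dict):
    """Normalize per-label entailment logits and return (best_label, scores)."""
    labels = list(entailment_logits)
    probs = softmax([entailment_logits[name] for name in labels])
    scores = dict(zip(labels, probs))
    best = max(scores, key=scores.get)
    return best, scores


# Made-up entailment logits for "Today was a nice day!" (illustration only).
label, scores = pick_label({"positive": 3.0, "negative": -1.0})
print(label, scores)
```

Because the scores are normalized across the labels you supply, adding or removing a label changes every score, which is worth keeping in mind when comparing runs with different label sets.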
Named Entity Recognition
Use TransformersNamedEntityExtractor to annotate named entities in documents:
from haystack import Document
from haystack_integrations.components.extractors.transformers import TransformersNamedEntityExtractor
documents = [
    Document(content="I'm Merlin, the happy pig!"),
    Document(content="My name is Clara and I live in Berkeley, California."),
]
extractor = TransformersNamedEntityExtractor(model="dslim/bert-base-NER")
results = extractor.run(documents=documents)["documents"]
annotations = [TransformersNamedEntityExtractor.get_stored_annotations(doc) for doc in results]
print(annotations)
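The printed annotations are easier to read once they are mapped back to the text they cover. The sketch below assumes each annotation carries an entity label plus `start` and `end` character offsets into the document content, as Haystack's named entity annotations do. `FakeAnnotation`, `make_annotation`, and `entity_spans` are hypothetical names, and the PER/LOC labels are illustrative rather than actual model output.

```python
from dataclasses import dataclass


@dataclass
class FakeAnnotation:
    """Hypothetical stand-in for a stored annotation (assumed fields: entity, start, end)."""
    entity: str
    start: int
    end: int


def entity_spans(content: str, annotations: list) -> list:
    """Pair each annotation's label with the substring it covers."""
    return [(a.entity, content[a.start:a.end]) for a in annotations]


content = "My name is Clara and I live in Berkeley, California."


def make_annotation(label: str, word: str) -> FakeAnnotation:
    """Build an illustrative annotation by locating `word` in `content`."""
    start = content.index(word)
    return FakeAnnotation(entity=label, start=start, end=start + len(word))


# Illustrative annotations; a real run would produce these from the model.
fake_annotations = [
    make_annotation("PER", "Clara"),
    make_annotation("LOC", "Berkeley"),
    make_annotation("LOC", "California"),
]

print(entity_spans(content, fake_annotations))
```

With real output, the same slicing would be applied per document, pairing each document's content with its own list of annotations.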
