ValyuContext
Valyu allows AI applications and agents to search the internet and proprietary data sources for relevant, LLM-ready information.
This notebook goes over how to use the Valyu deep search tool in LangChain.
First, get a Valyu API key and add it as an environment variable. Get $10 in free credit by signing up here.
Setup
The integration lives in the langchain-valyu package.
%pip install -qU langchain-valyu
To use the package, you will also need to set the VALYU_API_KEY environment variable to your Valyu API key.
import os
valyu_api_key = os.environ["VALYU_API_KEY"]
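Note that os.environ["VALYU_API_KEY"] raises a KeyError when the variable is unset. If you prefer a non-raising lookup, os.environ.get accepts a fallback (the "your-valyu-api-key" placeholder below is illustrative, not a real key):

```python
import os

# Fall back to a placeholder instead of raising KeyError when the variable is unset.
# "your-valyu-api-key" is an illustrative placeholder, not a real key.
valyu_api_key = os.environ.get("VALYU_API_KEY", "your-valyu-api-key")
```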
Instantiation
Now we can instantiate our retriever:
The ValyuRetriever can be configured with several parameters:
- k: int = 10. The number of top results to return for each query.
- search_type: str = "all". The type of search to perform: 'all', 'proprietary', or 'web'. Defaults to 'all'.
- relevance_threshold: float = 0.5. The minimum relevance score (between 0 and 1) required for a document to be considered relevant. Defaults to 0.5.
- max_price: float = 50.0. The maximum price (in USD) you are willing to spend per query. Defaults to 50.0.
- is_tool_call: bool = True. Set to True when the retriever is called by AI agents/tools (output is optimized for LLM consumption). Defaults to True.
- start_date: Optional[str] = None. Start date for time filtering, in YYYY-MM-DD format (optional).
- end_date: Optional[str] = None. End date for time filtering, in YYYY-MM-DD format (optional).
- included_sources: Optional[List[str]] = None. List of URLs, domains, or datasets to include in search results (optional).
- excluded_sources: Optional[List[str]] = None. List of URLs, domains, or datasets to exclude from search results (optional).
- response_length: Optional[Union[int, str]] = None. Content length per item: an int for a character count, or 'short' (25k), 'medium' (50k), 'large' (100k), 'max' (full content) (optional).
- country_code: Optional[str] = None. Two-letter ISO country code (e.g., 'GB', 'US') to bias search results toward a specific country (optional).
- fast_mode: bool = False. Enable fast mode for faster but shorter results. Defaults to False.
- client: Optional[Valyu] = None. An optional custom Valyu client instance. If not provided, a new client will be created internally.
- valyu_api_key: Optional[str] = None. Your Valyu API key. If not provided, the retriever will look for the VALYU_API_KEY environment variable.
from langchain_valyu import ValyuRetriever
retriever = ValyuRetriever(
k=5,
search_type="all",
relevance_threshold=0.5,
max_price=30.0,
start_date="2024-01-01",
end_date="2024-12-31",
client=None,
valyu_api_key=os.environ["VALYU_API_KEY"],
)
Usage
query = "What are the benefits of renewable energy?"
docs = retriever.invoke(query)
for doc in docs:
print(doc.page_content)
print(doc.metadata)
Use within a chain
We can easily combine this retriever into a chain.
Content Extraction Retriever
The package also includes a ValyuContentsRetriever
for extracting content from specific URLs:
from langchain_valyu import ValyuContentsRetriever
# Initialize the contents retriever with specific URLs
contents_retriever = ValyuContentsRetriever(
urls=["https://www.example.com/article1", "https://www.example.com/article2"],
summary=True, # Enable content summarization
extract_effort="normal", # Extraction effort level
response_length="medium", # Content length preference
valyu_api_key=os.environ["VALYU_API_KEY"],
)
# Extract content from the configured URLs
# Note: The query parameter is not used for pre-configured URLs
docs = contents_retriever.invoke("extract content")
for doc in docs:
print(doc.page_content)
print(doc.metadata)
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI
prompt = ChatPromptTemplate.from_template(
"""Answer the question based only on the context provided.
Context: {context}
Question: {question}"""
)
llm = ChatOpenAI(model="gpt-4o-mini")
def format_docs(docs):
return "\n\n".join(doc.page_content for doc in docs)
chain = (
{"context": retriever | format_docs, "question": RunnablePassthrough()}
| prompt
| llm
| StrOutputParser()
)
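To make the data flow through this chain concrete before calling a live model, here is a self-contained, plain-Python sketch of the same wiring (no network calls; the Doc class and fake_retriever below are hypothetical stand-ins for LangChain Documents and the Valyu retriever):

```python
# Minimal stand-ins illustrating the chain's data flow: retrieve -> format -> prompt.
class Doc:
    def __init__(self, page_content):
        self.page_content = page_content

def fake_retriever(question):
    # Stand-in for retriever.invoke(question); returns canned documents.
    return [Doc("Solar power cuts emissions."), Doc("Wind power is renewable.")]

def format_docs(docs):
    # Same helper as in the chain above: join documents into one context string.
    return "\n\n".join(doc.page_content for doc in docs)

def build_prompt(context, question):
    # Mirrors the ChatPromptTemplate used in the chain.
    return (
        "Answer the question based only on the context provided.\n"
        f"Context: {context}\n"
        f"Question: {question}"
    )

question = "What are the benefits of renewable energy?"
prompt_text = build_prompt(format_docs(fake_retriever(question)), question)
print(prompt_text)
```

The real chain does the same thing, except the prompt text is then piped into ChatOpenAI and the response is parsed by StrOutputParser, e.g. chain.invoke("What are the benefits of renewable energy?").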
API reference
For detailed documentation of all Valyu Context API features and configurations, head to the API reference: https://docs.valyu.network/overview
Related
- Retriever conceptual guide
- Retriever how-to guides