LlamaIndex-Chapter 2 (QA and Assessment)

Production-level examples

SEC-Insights

QA

Use case:

What

  • Semantic query (semantic search / top-k retrieval)
  • Summarization

Where

How

The links above all point to the same material: the Q&A patterns below.

Understanding: Q&A patterns

One of the simplest Q&A setups: load documents from a folder, build a vector index over them, and query it.

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# Load every file in the "data" folder into Document objects.
documents = SimpleDirectoryReader("data").load_data()
# Embed the documents and build an in-memory vector index.
index = VectorStoreIndex.from_documents(documents)
# Wrap the index in a query engine: semantic top-k retrieval plus answer synthesis.
query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
print(response)

Select different data sources (routing)

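The notes above only link out to the docs, so here is a minimal sketch of routing, assuming one vector index and one summary index over the same "data" folder (the tool descriptions are illustrative assumptions):

from llama_index.core import SimpleDirectoryReader, SummaryIndex, VectorStoreIndex
from llama_index.core.query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector
from llama_index.core.tools import QueryEngineTool

documents = SimpleDirectoryReader("data").load_data()
vector_engine = VectorStoreIndex.from_documents(documents).as_query_engine()
summary_engine = SummaryIndex.from_documents(documents).as_query_engine()

# Each data source becomes a tool; the description tells the selector when to use it.
tools = [
    QueryEngineTool.from_defaults(
        query_engine=vector_engine,
        description="Useful for specific questions over the documents.",
    ),
    QueryEngineTool.from_defaults(
        query_engine=summary_engine,
        description="Useful for summarization questions over the documents.",
    ),
]

# An LLM-based selector reads the descriptions and routes each query to one tool.
router_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(),
    query_engine_tools=tools,
)
print(router_engine.query("Summarize the documents."))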

Compare/Contrast Queries

I don't understand this yet. (These are queries that ask the engine to weigh two or more documents against each other; the SubQuestionQueryEngine below is one way to serve them.)

Multi Document Queries

Besides the explicit synthesis/routing flows described above, LlamaIndex can also support more general multi-document queries through its SubQuestionQueryEngine class. Given a query, this query engine generates a "query plan" containing sub-queries against sub-documents before synthesizing the final answer.

This query engine can execute any number of sub-queries against any subset of query engine tools before synthesizing the final answer. This makes it especially well-suited for compare/contrast queries across documents, as well as queries pertaining to a specific document. A sketch follows.
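A minimal sketch, assuming two annual filings stored in hypothetical data/uber and data/lyft folders (tool names and descriptions are assumptions):

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.query_engine import SubQuestionQueryEngine
from llama_index.core.tools import QueryEngineTool, ToolMetadata

uber_engine = VectorStoreIndex.from_documents(
    SimpleDirectoryReader("data/uber").load_data()
).as_query_engine()
lyft_engine = VectorStoreIndex.from_documents(
    SimpleDirectoryReader("data/lyft").load_data()
).as_query_engine()

tools = [
    QueryEngineTool(
        query_engine=uber_engine,
        metadata=ToolMetadata(name="uber_10k", description="Uber 2021 10-K filing"),
    ),
    QueryEngineTool(
        query_engine=lyft_engine,
        metadata=ToolMetadata(name="lyft_10k", description="Lyft 2021 10-K filing"),
    ),
]

# The engine plans sub-questions per tool, runs them, then synthesizes one answer.
sub_question_engine = SubQuestionQueryEngine.from_defaults(query_engine_tools=tools)
print(sub_question_engine.query(
    "Compare and contrast Uber's and Lyft's revenue growth in 2021."
))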

Multi-Step Queries

LlamaIndex can also support iterative multi-step queries. Given a complex query, it breaks the query down into an initial subquestion, then sequentially generates follow-up subquestions based on the returned answers until the final answer can be produced.

For instance, given the question "Who was in the first batch of the accelerator program the author started?", the module will first decompose the query into a simpler initial question, "What was the accelerator program the author started?", query the index, and then ask follow-up questions. A sketch follows.
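A minimal sketch of multi-step querying, rebuilding the index from the first example (the index_summary string is an assumption):

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.indices.query.query_transform.base import (
    StepDecomposeQueryTransform,
)
from llama_index.core.query_engine import MultiStepQueryEngine

index = VectorStoreIndex.from_documents(SimpleDirectoryReader("data").load_data())

# The transform asks the LLM to decompose the query into sequential steps,
# feeding each answer back in before generating the next subquestion.
multistep_engine = MultiStepQueryEngine(
    query_engine=index.as_query_engine(),
    query_transform=StepDecomposeQueryTransform(verbose=True),
    index_summary="Used to answer questions about the author",  # assumption
)
print(multistep_engine.query(
    "Who was in the first batch of the accelerator program the author started?"
))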

Temporal Query

Eval

Introduction to concepts

  • Response evaluation
  • Retrieval evaluation

Detailed overview and process

  • Response evaluation
    • Use GPT-4 as the judge
    • Dimensions of assessment
      • Generated answer vs. reference answer: correctness and semantic similarity
      • Generated answer vs. retrieved contexts: faithfulness
      • Generated answer vs. query: answer relevancy
      • Retrieved contexts vs. query: context relevancy
    • Generate reference answers
  • Retrieval evaluation
    • How to evaluate: ranking metrics such as mean reciprocal rank (MRR), hit rate, precision, and more (see the sketch after this list).
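A minimal sketch of both sides using LlamaIndex's built-in evaluators, reusing the index and query_engine from the first example (the query and the ground-truth node ID are assumptions):

from llama_index.core.evaluation import FaithfulnessEvaluator, RetrieverEvaluator
from llama_index.llms.openai import OpenAI

# Response evaluation: is the answer faithful to the retrieved contexts?
judge = OpenAI(model="gpt-4")
faithfulness = FaithfulnessEvaluator(llm=judge)
response = query_engine.query("What did the author do growing up?")
print(faithfulness.evaluate_response(response=response).passing)

# Retrieval evaluation: ranking metrics against known-relevant node IDs.
retriever_evaluator = RetrieverEvaluator.from_metric_names(
    ["mrr", "hit_rate"], retriever=index.as_retriever()
)
print(retriever_evaluator.evaluate(
    query="What did the author do growing up?",
    expected_ids=["node_id_1"],  # assumption: labeled ground-truth node IDs
))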

Generate dataset
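A sketch of one way to build a retrieval-eval dataset: generate_question_context_pairs has an LLM write questions for each node, producing (query, expected node) pairs to feed the RetrieverEvaluator above (the folder name and question count are assumptions):

from llama_index.core import SimpleDirectoryReader
from llama_index.core.evaluation import generate_question_context_pairs
from llama_index.core.node_parser import SentenceSplitter
from llama_index.llms.openai import OpenAI

documents = SimpleDirectoryReader("data").load_data()
nodes = SentenceSplitter().get_nodes_from_documents(documents)

# Each node gets LLM-written questions; node IDs serve as ground-truth labels.
qa_dataset = generate_question_context_pairs(
    nodes, llm=OpenAI(model="gpt-4"), num_questions_per_chunk=2
)
print(list(qa_dataset.queries.values())[:5])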

Use case

Integration with other tools

  • UpTrain: 1.9K stars. Available for trial, but booking a demo is required, and at a glance it does not look cheap.
  • Tonic Validate (includes a web UI for visualizing results): there is a commercial version with a trial, then US$200/month.
  • DeepEval: 1.6K stars
  • Ragas: 4.4K stars
    • Feels good
    • LlamaIndex --> Ragas --> LangSmith and other tools
    • Unfortunately, the quick start failed to run, raising ModuleNotFoundError: No module named 'ragas.metrics'; 'ragas' is not a package (this error usually means a local file or folder named ragas is shadowing the installed package).

Cost assessment
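A minimal sketch of measuring token usage (the basis of cost) with LlamaIndex's TokenCountingHandler; the tokenizer model is an assumption:

import tiktoken
from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.callbacks import CallbackManager, TokenCountingHandler

# Count tokens for every embedding and LLM call the pipeline makes.
token_counter = TokenCountingHandler(
    tokenizer=tiktoken.encoding_for_model("gpt-3.5-turbo").encode  # assumption
)
Settings.callback_manager = CallbackManager([token_counter])

index = VectorStoreIndex.from_documents(SimpleDirectoryReader("data").load_data())
index.as_query_engine().query("What did the author do growing up?")

print("embedding tokens:", token_counter.total_embedding_token_count)
print("LLM prompt tokens:", token_counter.prompt_llm_token_count)
print("LLM completion tokens:", token_counter.completion_llm_token_count)

Multiply the token counts by the provider's per-token prices to estimate spend.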

Optimize

Basic optimization

Retrieval