Ragrank API documentation
Ragrank is a user-friendly Python library that makes it easier to evaluate Retrieval-Augmented Generation (RAG) models.
- ragrank.evaluate(dataset: Dataset | DataNode | dict, *, llm: BaseLLM | None = None, metrics: BaseMetric | List[BaseMetric] | None = None) → EvalResult
Evaluate the performance of a given dataset using specified metrics.
- Parameters:
dataset (Union[Dataset, DataNode, dict]) – The dataset to be evaluated. It can be provided as a Dataset object, a DataNode object, or a dict representing the dataset.
llm (Optional[BaseLLM]) – The LLM (large language model) used for evaluation. If None, a default LLM is used.
metrics (Optional[Union[BaseMetric, List[BaseMetric]]]) – The metric or list of metrics used for evaluation. If None, the response relevancy metric is used.
- Returns:
An object containing the evaluation results.
- Return type:
EvalResult
Examples:
    from ragrank import evaluate
    from ragrank.dataset import from_dict

    # Build a dataset from a single question/context/response record.
    data = from_dict({
        "question": "Who is the 46th president of the US?",
        "context": [
            "Joseph Robinette Biden is an American politician, "
            "he is the 46th and current president of the United States.",
        ],
        "response": "Joseph Robinette Biden",
    })

    # With no metrics given, the response relevancy metric is used.
    result = evaluate(data)
    print(result)
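
The signature above makes llm and metrics keyword-only (they follow the bare *), and metrics accepts None, a single metric, or a list. The following is a minimal sketch of how a function with that shape could normalize its metrics argument. It is not Ragrank's implementation: StubMetric, DEFAULT_METRIC, normalize_metrics, and evaluate_sketch are hypothetical stand-ins that do not exist in the library, and no LLM is called.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class StubMetric:
    """Hypothetical stand-in for a BaseMetric subclass; not part of Ragrank."""
    name: str


# Assumption: stands in for the default metric used when `metrics` is None.
DEFAULT_METRIC = StubMetric("response_relevancy")


def normalize_metrics(
    metrics: Optional[Union[StubMetric, List[StubMetric]]] = None,
) -> List[StubMetric]:
    """Turn None, a single metric, or a list of metrics into a list."""
    if metrics is None:
        return [DEFAULT_METRIC]
    if isinstance(metrics, StubMetric):
        return [metrics]
    return list(metrics)


def evaluate_sketch(dataset: dict, *, llm=None, metrics=None) -> List[str]:
    """Mirror the documented call shape: `llm` and `metrics` are keyword-only.

    Returns only the names of the metrics that would be run.
    """
    return [m.name for m in normalize_metrics(metrics)]


data = {"question": "q", "context": ["c"], "response": "r"}
print(evaluate_sketch(data))                                # default metric
print(evaluate_sketch(data, metrics=StubMetric("custom")))  # single metric
```

Because of the bare *, a call such as evaluate_sketch(data, None, None) raises a TypeError; the optional arguments have to be passed by name.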