ragrank.evaluation.outputs
Contains the outputs of an evaluation.
- class ragrank.evaluation.outputs.EvalResult(*, llm: BaseLLM, metrics: List[BaseMetric], dataset: Dataset, scores: List[List[float]], response_time: float)
Represents the result of an evaluation.
- metrics
List of metrics used for evaluation.
- Type:
List[BaseMetric]
- scores
List of scores for each metric.
- Type:
List[List[float]]
- response_time
Response time for the evaluation process.
- Type:
float
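The `scores` field is indexed first by metric and then by datapoint: `scores[i][j]` is the score that metric `i` assigned to datapoint `j`. A minimal plain-Python sketch of that layout (the metric names and values below are illustrative, not produced by ragrank):

```python
# Illustrative layout of EvalResult-style fields (names and values are made up).
metrics = ["relevancy", "context_utilization"]  # one entry per metric
scores = [
    [0.91, 0.78, 0.84],  # metric 0: one score per datapoint
    [0.66, 0.71, 0.59],  # metric 1: one score per datapoint
]

# Invariants implied by the validator: one row per metric,
# and every row covers the same number of datapoints.
assert len(scores) == len(metrics)
assert len({len(row) for row in scores}) == 1

# Example aggregation: average score per metric.
averages = {m: sum(row) / len(row) for m, row in zip(metrics, scores)}
```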
- model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}
A dictionary of computed field names and their corresponding ComputedFieldInfo objects.
- model_config: ConfigDict = {'arbitrary_types_allowed': True, 'frozen': True}
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- model_fields: ClassVar[dict[str, FieldInfo]] = {'dataset': FieldInfo(annotation=Dataset, required=True, description='The dataset used for evaluation'), 'llm': FieldInfo(annotation=BaseLLM, required=True, description='The language model used for evaluation'), 'metrics': FieldInfo(annotation=List[BaseMetric], required=True, description='List of metrics used for evaluation.'), 'response_time': FieldInfo(annotation=float, required=True, description='Response time for the evaluation process.', metadata=[Gt(gt=0)]), 'scores': FieldInfo(annotation=List[List[float]], required=True, description='List of scores for each metric')}
Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo].
This replaces Model.__fields__ from Pydantic V1.
- to_dataframe() DataFrame
Convert the evaluation result to a pandas DataFrame.
- Returns:
A DataFrame containing the evaluation results.
- Return type:
DataFrame
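A rough sketch of how such a conversion could look, using pandas. This is an assumption about the output layout (one column per metric, one row per datapoint), not ragrank's actual implementation, and the function name is hypothetical:

```python
import pandas as pd

def to_dataframe_sketch(metrics: list, scores: list) -> pd.DataFrame:
    """Build a DataFrame with one column per metric and one row per datapoint.

    `metrics` holds metric names; `scores[i]` holds the per-datapoint
    scores for metric i, mirroring EvalResult's List[List[float]] field.
    """
    return pd.DataFrame({name: row for name, row in zip(metrics, scores)})

df = to_dataframe_sketch(["relevancy"], [[0.9, 0.7]])
```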
- to_dict() Dict[str, List[str] | str] | Dict[str, List[str] | List[List[str]]]
Convert the evaluation result to a dict.
- Returns:
A dict containing the evaluation results.
- Return type:
dict
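The return annotation suggests a mapping from field names to scalars or (nested) lists. A plain-Python sketch of one plausible shape follows; the keys and the function name are assumptions for illustration, not ragrank's exact output:

```python
def to_dict_sketch(metrics, scores, response_time):
    """Collect evaluation fields into a plain dict (keys are assumed)."""
    return {
        "metrics": metrics,          # List[str]
        "scores": scores,            # List[List[float]]
        "response_time": response_time,  # float
    }

result = to_dict_sketch(["relevancy"], [[0.9, 0.7]], 1.25)
```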
- validator() EvalResult
Validate the evaluation result after instantiation.
- Raises:
ValueError – If the number of metrics and scores are not equal, or if the number of datapoints and scores are not balanced.
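The checks described above can be sketched in plain Python. This is a standalone approximation of the post-instantiation validation, not ragrank's source code:

```python
def validate_result(metrics, scores, num_datapoints):
    """Raise ValueError if metric/score counts disagree or rows are unbalanced.

    Mirrors the two failure modes documented for EvalResult's validator:
    len(scores) must equal len(metrics), and every score row must have
    one entry per datapoint in the dataset.
    """
    if len(metrics) != len(scores):
        raise ValueError("The number of metrics and scores are not equal.")
    if any(len(row) != num_datapoints for row in scores):
        raise ValueError("The number of datapoints and scores are not balanced.")

validate_result(["relevancy"], [[0.9, 0.7]], num_datapoints=2)  # passes
```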