Overview
Evaluating or verifying AI solutions is technically challenging and imposes specific requirements on the application, which can vary from setting internal benchmarks to establishing safety conditions. Evaluation and verification are performed using a broad set of tools ranging from straightforward performance evaluation metrics to formal methods for the verification of safety requirements. Formal methods can also be used to evaluate requirement fulfilment and systematically discover scenarios showing that AI systems need further improvements. Alternatively, customers can receive technical assistance and advice on improving AI systems that do not meet requirements.
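As an illustration of the two ends of this range, the sketch below checks a simple performance requirement (an accuracy threshold) and then searches for a scenario that violates a robustness requirement. It is a minimal sketch under stated assumptions, not a description of the tooling RISE uses: the names `toy_model`, `accuracy` and `find_counterexample`, the threshold values and the data are all hypothetical. The grid search is a sampling-based falsification, so finding no counterexample is not a formal proof that the requirement holds.

```python
from typing import Callable, Optional, Sequence


def accuracy(model: Callable[[float], int],
             inputs: Sequence[float],
             labels: Sequence[int]) -> float:
    """Fraction of inputs for which the model predicts the expected label."""
    correct = sum(1 for x, y in zip(inputs, labels) if model(x) == y)
    return correct / len(inputs)


def find_counterexample(model: Callable[[float], int],
                        x: float,
                        epsilon: float,
                        steps: int = 20) -> Optional[float]:
    """Search an evenly spaced grid in [x - epsilon, x + epsilon] for an
    input whose prediction differs from model(x).

    Returns the first such input, or None if the grid contains none.
    """
    reference = model(x)
    for i in range(steps + 1):
        delta = -epsilon + 2 * epsilon * i / steps
        if model(x + delta) != reference:
            return x + delta
    return None


def toy_model(x: float) -> int:
    """Hypothetical stand-in for the system under test: a threshold classifier."""
    return 1 if x >= 0.5 else 0


# Hypothetical evaluation data and requirement thresholds.
inputs = [0.1, 0.4, 0.6, 0.9]
labels = [0, 0, 1, 1]

report = {
    "accuracy_requirement_met": accuracy(toy_model, inputs, labels) >= 0.9,
    "no_violation_found_near_0.10": find_counterexample(toy_model, 0.10, 0.05) is None,
    "counterexample_near_0.48": find_counterexample(toy_model, 0.48, 0.05),
}
```

Here the accuracy requirement is met and no violation is found near 0.10, while the search near 0.48 returns an input close to the decision threshold where the prediction flips. A scenario of that kind is what would prompt a reformulated requirement or a round of model improvement.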
More about the service
• The formulation of requirements to be tested is an iterative process back and forth between RISE and the customer.
• Requirement testing is mainly conducted by RISE, with feedback from the customer used to refine and reformulate requirements when needed.
• For model improvement, RISE can provide advice on how the AI system can be improved to satisfy the given requirements, followed by another round of requirement testing.
Delivery Period: The service is available throughout the year, ensuring access to support when required.
Duration: The service execution can span several weeks, depending on the complexity of the tests.
Location: The service can be executed remotely or on-site, depending on the customer’s preference and the nature of the testing.
Customer Requirements:
• To provide the service, RISE must be granted access to the model parameters. Access to only a binary or encrypted model is sufficient only for simple testing.
• Access to relevant datasets is generally not required but may be necessary for evaluating specific requirements.
• A memorandum (PM) detailing test setup considerations must be prepared and approved before starting the tests.
• Any non-disclosure agreements must be in place before the provision of the service.
Deliverables:
• Output: A verification and evaluation report on the results will be provided. Results will also be presented in a final review meeting.