AI model evaluation and verification

The service provides technical support and tools for AI model evaluation and verification in light of the EU AI Act.

Interested in this service? Contact us at agrifoodtef@ri.se

Overview

Evaluating or verifying AI solutions is technically challenging, and the requirements to be checked depend on the application: they can range from internal benchmarks to safety conditions. Evaluation and verification are performed using a broad set of tools, from straightforward performance metrics to formal methods for verifying safety requirements. Formal methods can also be used to evaluate requirement fulfilment and to systematically discover scenarios in which an AI system needs further improvement. In addition, customers can receive technical assistance and advice on improving AI systems that do not meet requirements.
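As a rough illustration of the two ends of this toolset, the Python sketch below computes a simple performance metric (accuracy) and then uses interval bound propagation, one basic formal technique, to check an output-range requirement on a tiny ReLU network. The network weights, the input ranges, and the requirement itself are hypothetical values chosen for illustration; they are not taken from the service, and the tools actually used may differ.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]
Layer = Tuple[List[List[float]], List[float]]  # (weight rows, biases)


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that match the labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def affine_bounds(weights: List[List[float]], bias: List[float],
                  inputs: List[Interval]) -> List[Interval]:
    """Bounds on W x + b when each input x_i lies in its interval."""
    out = []
    for row, b in zip(weights, bias):
        lo = hi = b
        for w, (x_lo, x_hi) in zip(row, inputs):
            if w >= 0:
                lo += w * x_lo
                hi += w * x_hi
            else:
                # A negative weight swaps which input end gives the minimum.
                lo += w * x_hi
                hi += w * x_lo
        out.append((lo, hi))
    return out


def relu_bounds(intervals: List[Interval]) -> List[Interval]:
    """ReLU is monotone, so it can be applied to both interval ends."""
    return [(max(0.0, lo), max(0.0, hi)) for lo, hi in intervals]


def output_bounds(layers: List[Layer], inputs: List[Interval]) -> List[Interval]:
    """Propagate input intervals through the network (ReLU on hidden layers)."""
    bounds = inputs
    for i, (weights, bias) in enumerate(layers):
        bounds = affine_bounds(weights, bias, bounds)
        if i < len(layers) - 1:
            bounds = relu_bounds(bounds)
    return bounds


# Hypothetical toy controller: 2 inputs -> 2 hidden units -> 1 output.
LAYERS: List[Layer] = [
    ([[0.5, -0.25], [0.25, 0.5]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.0]),
]
# Hypothetical operating range: both inputs normalised to [0, 1].
INPUT_BOX: List[Interval] = [(0.0, 1.0), (0.0, 1.0)]

(lo, hi), = output_bounds(LAYERS, INPUT_BOX)  # approximately (-0.85, 0.4)

# Hypothetical safety requirement: the output must stay within [-1, 1].
requirement_holds = -1.0 <= lo and hi <= 1.0
print(f"output bounds: [{lo:.2f}, {hi:.2f}], requirement holds: {requirement_holds}")
```

Unlike a metric computed on a finite test set, the bounds cover every input in the given box, not only sampled ones. They are conservative, however: if the check fails, the requirement may still hold, and a tighter method would be needed to decide. The sketch also ignores floating-point rounding, which a rigorous verifier has to account for.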

More about the service

Discover more about our service, including how it can benefit you, the delivery process, and the options for customisation tailored to your specific needs!

This service is directed towards customers that develop or use AI systems in their business activities and need assistance testing and evaluating such systems. The AI models may be developed by the customer, open-source, or developed by a third party. Service S00331 can also provide technical advice and recommendations to help customers comply with the EU AI Act. The service is applicable to all AI systems and tasks relevant to the agricultural sector. Of particular relevance are AI systems for robotic control, drone control, and other control systems where formal methods can be used in conjunction with traditional testing.

The requirement formalisation stage is carried out iteratively in close collaboration with the customer. The customer can provide their own set of requirements, while RISE suggests additional requirements based on the EU AI Act. Requirement testing and verification are carried out by RISE and further refined following customer feedback. If relevant, assistance and advice on how to retrain AI models to align with requirements can also be provided.

Logistics:
• The formulation of requirements to be tested is an iterative process back and forth between RISE and the customer.
• Requirement testing is mainly conducted by RISE, with feedback from the customer used to refine and reformulate requirements when needed.
• For model improvement, RISE can provide advice on how the AI system can be improved to satisfy the given requirements, followed by another round of requirement testing.
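One round of requirement testing, as outlined in the points above, could be sketched in Python as follows. Each requirement is written as an executable check, and a random search looks for an input that violates it. The model, the input ranges, and both requirements are hypothetical stand-ins, not part of the service definition.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence


@dataclass
class Requirement:
    name: str
    # Returns True when the (input, output) pair satisfies the requirement.
    holds: Callable[[Sequence[float], float], bool]


def toy_model(x: Sequence[float]) -> float:
    """Hypothetical stand-in for the system under test: a steering command
    computed from a lateral offset x[0] and a heading error x[1]."""
    return 0.8 * x[0] + 0.5 * x[1]


def sample_input(rng: random.Random) -> List[float]:
    """Hypothetical operating range: both inputs lie in [-1, 1]."""
    return [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]


def find_counterexample(model: Callable[[Sequence[float]], float],
                        requirement: Requirement,
                        rng: random.Random,
                        trials: int) -> Optional[List[float]]:
    """Return the first sampled input that violates the requirement, or None."""
    for _ in range(trials):
        x = sample_input(rng)
        if not requirement.holds(x, model(x)):
            return x
    return None


def run_round(model: Callable[[Sequence[float]], float],
              requirements: List[Requirement],
              seed: int = 0,
              trials: int = 10_000) -> Dict[str, Optional[List[float]]]:
    """Test every requirement once and map its name to a counterexample."""
    rng = random.Random(seed)  # fixed seed so a round can be reproduced
    return {r.name: find_counterexample(model, r, rng, trials)
            for r in requirements}


# Hypothetical requirements agreed between the two parties.
REQUIREMENTS = [
    Requirement("command magnitude at most 1.5", lambda x, y: abs(y) <= 1.5),
    Requirement("command magnitude at most 1.0", lambda x, y: abs(y) <= 1.0),
]

report = run_round(toy_model, REQUIREMENTS)
for name, counterexample in report.items():
    status = "no violation found" if counterexample is None else f"violated at {counterexample}"
    print(f"{name}: {status}")
```

A counterexample found in this way is the kind of result that would feed the next iteration: either the requirement is reformulated or the model is improved and tested again. Random search can only reveal violations; finding none does not prove that a requirement holds, which is where formal methods come in.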

Delivery Period: The service is available throughout the year, ensuring access to support when required.
Duration: The service execution can span several weeks, depending on the complexity of the tests.
Location: The service can be executed remotely or on-site, depending on the customer’s preference and the nature of the testing.
Customer Requirements:
• For the provision of the service, access to the model parameters must be granted. Access to binary-only or encrypted models is sufficient only for simple testing.
• Access to relevant datasets is generally not required but may be necessary for evaluating specific requirements.
• A memorandum (PM) detailing test setup considerations must be prepared and approved before starting the tests.
• Any non-disclosure agreements must be in place before the provision of the service.

Deliverables:
• Output: A verification and evaluation report on the results will be provided. Results will also be presented in a final review meeting. 

The service can be adapted to specific customer needs. The assessment journey starts with a joint meeting where the customer discusses different alternatives with a technical team from agrifoodTEF, supplemented with domain experts from RISE and members from the customer support team. A roadmap for the service is established, and the service can commence.

Location:
• Remote
• Sweden

Type of sector:
• Arable farming
• Food processing
• Greenhouse
• Horticulture
• Livestock farming
• Tree crops
• Viticulture

Type of service:
• AI model training
• Collection of test data
• Conformity assessment
• ELSA assessment
• Performance evaluation
• Test design
• Test execution

Accepted type of products:
• Data
• Design / Documentation
• Software or AI model