
Overview
This service covers training AI models on behalf of the customer for a specific task and optimisation objective, e.g., improving accuracy on crop classification from image data. The target model is the solution provided by the customer, which needs to be enhanced with respect to a set of pre-determined features to reach the desired performance level. If required, the training can also be applied to additional state-of-the-art models available on the market for benchmarking purposes. If not defined by the customer, some aspects of the training process can be identified via service S00179 (desk assessment activities for digital systems and/or data): for instance, the model features to improve, the reference model baselines to include in the performance comparison, and the benchmark datasets. The data used for training the model can be provided by the customer, annotated ad hoc as a preparatory activity to model training (via service S00290 - Data Labelling), or drawn from openly available reference benchmark datasets. We will also agree with the customer on the level of hardware acceleration required, based on the AI models under consideration: e.g., GPU acceleration via connection to a remote server vs. on-device training.
More about the service
This service provides the customer with access to a team of engineers experienced in setting up and executing AI model training operations, as well as to a computational infrastructure that can support the training and ensure it is completed within a short timeframe.
The first phase of the service involves one or more interviews in which the customer defines, together with AgrifoodTEF, the goals and tools of the training. If the model to be trained is provided by the customer, it is handed over during this phase, under NDA if needed. The customer is also asked to provide AgrifoodTEF with the data to be used for training, unless publicly available datasets or datasets already available to AgrifoodTEF are chosen.
The second phase of the service involves the setup and execution of the training, performed by AgrifoodTEF. The training procedure will be monitored by tracking the evaluation metrics that are relevant to the end task (e.g., training loss with respect to the optimisation objective, average classification precision and accuracy, ...). The outcome of the service is the trained model, provided in the format preferred by the customer.
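As an illustration of the metric tracking described above, the following minimal Python sketch computes accuracy and precision for a binary classifier after each training epoch and stores them in a history list. The function name, the label encoding (1 = positive class) and the sample predictions are assumptions made for this example only; they do not describe the tooling actually used by AgrifoodTEF.

```python
from typing import List, Sequence, Tuple


def accuracy_and_precision(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, precision) for binary labels, where 1 is the positive class."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    pred_pos = sum(p == 1 for p in y_pred)
    accuracy = correct / len(y_true)
    # Precision is undefined when nothing is predicted positive; report 0.0 by convention.
    precision = true_pos / pred_pos if pred_pos else 0.0
    return accuracy, precision


# Hypothetical validation labels and per-epoch predictions (illustrative values only).
y_true = [1, 0, 1, 1, 0, 0]
predictions_per_epoch: List[List[int]] = [
    [0, 0, 1, 0, 1, 0],  # predictions after epoch 1
    [1, 0, 1, 0, 0, 0],  # predictions after epoch 2
]

history = []
for epoch, y_pred in enumerate(predictions_per_epoch, start=1):
    acc, prec = accuracy_and_precision(y_true, y_pred)
    history.append({"epoch": epoch, "accuracy": acc, "precision": prec})
```

A history of this kind can then be plotted or inspected to decide when to stop training or which iteration to keep.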
The following is an example of a service instance.
Example service: The customer is interested in promptly identifying the emergence of the Peronospora (downy mildew) disease in vineyards. Peronospora symptoms can be detected by inspecting changes on the leaf surface (appearance of small spots, gradual changes in the leaf colour).
The customer has already implemented a computer vision model to classify leaves as healthy or unhealthy. However, the model needs to be re-trained to account for the collection of higher-quality images and annotations of disease symptoms (e.g., via S00113 - Collection of test data during physical testing and via S00290 - Data labelling). Since the solution is expected to work in real time, we use a TPU-accelerated stick readily available on the market to train the model directly on the device, as opposed to training the model offline on a remote server. Given the real-time performance requirements, we opt for an incremental training protocol, where only a few image examples (i.e., shots) are used for each update of the model parameters, so that the customer will be able to update the model incrementally in the future, as soon as additional images are acquired.
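The incremental protocol described above can be sketched, under strong simplifying assumptions, as a model whose parameters are updated from a handful of examples at a time. The sketch below uses a tiny logistic-regression classifier on two made-up numeric features as a stand-in for the customer's computer vision model; the class name, the features, the learning rate and the sample shots are all hypothetical, and no TPU-specific tooling is involved.

```python
import math
from typing import List, Sequence, Tuple

# (feature vector, label) where 1 = unhealthy leaf and 0 = healthy leaf.
Example = Tuple[Sequence[float], int]


class IncrementalClassifier:
    """Logistic-regression stand-in for a model updated a few shots at a time."""

    def __init__(self, n_features: int, learning_rate: float = 0.5) -> None:
        self.weights: List[float] = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, features: Sequence[float]) -> float:
        """Probability that the input belongs to the unhealthy class."""
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, shots: Sequence[Example]) -> None:
        """Apply one gradient step computed from a small batch of examples."""
        grad_w = [0.0] * len(self.weights)
        grad_b = 0.0
        for features, label in shots:
            error = self.predict_proba(features) - label
            for i, x in enumerate(features):
                grad_w[i] += error * x
            grad_b += error
        n = len(shots)
        self.weights = [w - self.learning_rate * g / n for w, g in zip(self.weights, grad_w)]
        self.bias -= self.learning_rate * grad_b / n


# Hypothetical shots: the first batch is available now, the second arrives later.
first_batch: List[Example] = [([0.9, 0.8], 1), ([0.1, 0.2], 0)]
later_batch: List[Example] = [([0.8, 0.7], 1), ([0.2, 0.1], 0)]

model = IncrementalClassifier(n_features=2)
for _ in range(200):
    model.update(first_batch)   # initial training on the shots available today
model.update(later_batch)       # later update, without retraining from scratch
```

The point of the design is that `update` only needs the new shots and the current parameters, so the model can be refreshed whenever new images are acquired.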
The performance of the model across training iterations is tracked with respect to the binary cross-entropy loss associated with the healthy and unhealthy classes and to the mean average precision of the detected leaf regions. Ultimately, we deliver to the customer the model checkpoints that achieved the highest performance, together with a report explaining how the best model was chosen and the parameters used at training time (e.g., learning rate, batch size, momentum, …). The trained model can be provided in a lightweight format that supports on-device learning (e.g., TFLite), but also in more interoperable formats such as ONNX, to facilitate conversion across different deep learning frameworks and computing devices.
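To make the tracking and checkpoint selection in this example more concrete, the sketch below shows one way to compute the mean binary cross-entropy loss and to pick the best checkpoint from a list of logged metrics. The function names, the record fields (`step`, `val_loss`, `val_map`) and the numeric values are hypothetical and chosen only for illustration; the mean average precision is assumed to be computed elsewhere and simply read from the log.

```python
import math
from typing import Dict, List, Sequence


def binary_cross_entropy(y_true: Sequence[int], y_prob: Sequence[float], eps: float = 1e-12) -> float:
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(y_true)


def select_best_checkpoint(
    records: List[Dict[str, float]], metric: str, higher_is_better: bool = True
) -> Dict[str, float]:
    """Return the record with the best value of `metric`; ties go to the earliest record."""
    if higher_is_better:
        return max(records, key=lambda r: r[metric])
    return min(records, key=lambda r: r[metric])


# Hypothetical training log: one record per saved checkpoint (made-up values).
records = [
    {"step": 100, "val_loss": 0.52, "val_map": 0.61},
    {"step": 200, "val_loss": 0.41, "val_map": 0.70},
    {"step": 300, "val_loss": 0.44, "val_map": 0.68},
]

best_by_map = select_best_checkpoint(records, "val_map")
best_by_loss = select_best_checkpoint(records, "val_loss", higher_is_better=False)
```

Recording the selection criterion explicitly, as in `select_best_checkpoint`, makes it straightforward to state in the final report how the delivered checkpoint was chosen.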