IndustReal: A Dataset for Procedure Step Recognition Handling Execution Errors in Egocentric Videos in an Industrial-Like Setting

1Eindhoven University of Technology, 2ASML Research

Published in: WACV 2024

Presenting IndustReal: a multi-modal dataset for procedure understanding in industrial-like settings!

Abstract

Although action recognition for procedural tasks has received notable attention, it has a fundamental flaw in that no measure of success for actions is provided. This limits the applicability of such systems especially within the industrial domain, since the outcome of procedural actions is often significantly more important than the mere execution. To address this limitation, we define the novel task of procedure step recognition (PSR), focusing on recognizing the correct completion and order of procedural steps. Alongside the new task, we also present the multi-modal IndustReal dataset.

Unlike currently available datasets, IndustReal contains procedural errors (such as omissions) as well as execution errors. A significant share of these errors is present exclusively in the validation and test sets, making IndustReal suitable for evaluating the robustness of algorithms to new, unseen mistakes. Additionally, to encourage reproducibility and allow for scalable approaches trained on synthetic data, the 3D models of all parts are publicly available. Annotations and benchmark performance are provided for action recognition and assembly state detection, as well as the new PSR task. IndustReal, along with the code and model weights, is available on this project page.

Procedure step recognition

By formalizing the task of procedure step recognition (PSR), along with an evaluation scheme, we encourage researchers to develop methods to automatically recognize the completion of steps, rather than the (partial) execution of activities. Additionally, PSR systems should explicitly leverage procedural knowledge and allow a flexible execution order for procedural tasks, when the procedure allows it.


Task definition

The objective of PSR is to estimate all procedure steps correctly performed by a person up to time t, based on the sensory inputs X_t observed up to that time and a descriptive set P of the procedural actions to be performed. The predicted completed procedure steps at time t are defined as $$ \hat{y}_t = \mathcal{F}(X_t, \mathcal{P}). $$

Crucially, this definition allows for real-time operation, since contrary to existing tasks, PSR does not require a full recording of the procedure as input.
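The online nature of this definition can be illustrated with a minimal sketch. The names `run_psr_online`, `toy_detector`, and the `"done"` key are hypothetical and used for illustration only; they are not part of the IndustReal code. The point is simply that the estimate at time t is computed from the inputs seen so far, never from the full recording.

```python
from typing import Callable, Dict, Iterable, List, Sequence

# Hypothetical types: one frame of sensory input and a step identifier.
Frame = Dict[str, list]
StepId = int


def run_psr_online(
    frames: Iterable[Frame],
    procedure: Sequence[StepId],
    detect_completed: Callable[[List[Frame], Sequence[StepId]], List[StepId]],
) -> List[List[StepId]]:
    """Return the estimate y_hat_t after every frame, using only inputs up to t."""
    seen: List[Frame] = []
    estimates: List[List[StepId]] = []
    for frame in frames:
        seen.append(frame)  # X_t: everything observed so far
        estimates.append(list(detect_completed(seen, procedure)))
    return estimates


def toy_detector(seen: List[Frame], procedure: Sequence[StepId]) -> List[StepId]:
    """Stand-in for F: collect steps flagged as done, in order of first appearance."""
    completed: List[StepId] = []
    for frame in seen:
        for step in frame.get("done", []):
            # Ignore anything that is not part of the procedure description P.
            if step in procedure and step not in completed:
                completed.append(step)
    return completed


frames = [{"done": []}, {"done": [2]}, {"done": [2, 1]}, {"done": [9]}]
estimates = run_psr_online(frames, procedure=[1, 2, 3], detect_completed=toy_detector)
```

In a real system, `toy_detector` would be replaced by a model operating on video, gaze, or hand data; the loop structure stays the same.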


Evaluation metrics

Procedure order similarity (POS). We propose to measure the quality of the predicted step order for an entire recording by its similarity to the ground-truth order. This is treated as a string similarity problem, since a string is a sequence of characters in which both the order and the type of each character matter. We define POS as $$ \textrm{POS} = 1 - \min\left(\frac{\textrm{DamLev}(y, \hat{y})}{|y|},\, 1\right), $$ where DamLev() is a weighted Damerau-Levenshtein edit distance function.
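As a rough illustration of the POS formula, the sketch below computes it for sequences of step identifiers. It makes two assumptions that are not taken from the dataset code: it uses the optimal string alignment variant of the Damerau-Levenshtein distance (adjacent transpositions only), and all edit weights default to 1, since the actual weights are not specified here. The function and parameter names are hypothetical.

```python
from typing import Sequence


def damerau_levenshtein(
    a: Sequence, b: Sequence,
    w_ins: float = 1.0, w_del: float = 1.0,
    w_sub: float = 1.0, w_swap: float = 1.0,
) -> float:
    """Weighted edit distance (optimal string alignment variant) from a to b."""
    n, m = len(a), len(b)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = d[i - 1][0] + w_del  # delete everything from a
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + w_ins  # insert everything from b
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0.0 if a[i - 1] == b[j - 1] else w_sub
            d[i][j] = min(
                d[i - 1][j] + w_del,      # deletion
                d[i][j - 1] + w_ins,      # insertion
                d[i - 1][j - 1] + cost,   # match or substitution
            )
            # Swap of two adjacent elements.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + w_swap)
    return d[n][m]


def pos(y_true: Sequence, y_pred: Sequence, **weights: float) -> float:
    """POS = 1 - min(DamLev(y, y_hat) / |y|, 1)."""
    if len(y_true) == 0:
        raise ValueError("ground-truth sequence must not be empty")
    dist = damerau_levenshtein(y_true, y_pred, **weights)
    return 1.0 - min(dist / len(y_true), 1.0)
```

With unit weights, a prediction that swaps two adjacent steps in a four-step procedure, such as `pos([1, 2, 3, 4], [1, 3, 2, 4])`, costs one edit and gives a POS of 0.75. The `min(..., 1)` clamp keeps the score at 0 when the prediction needs more edits than there are ground-truth steps.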

F1-score.

Average delay. This metric complements the aforementioned metrics with a temporal component by quantifying the time between the ground-truth completion of a step and its corresponding recognition.
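A minimal sketch of the average delay is given below. It assumes that completion times are stored per step identifier in seconds and that only steps present in both the ground truth and the prediction contribute to the average. How missed or premature predictions are handled in the official evaluation is not stated here, so this is an illustration rather than the benchmark implementation. The name `average_delay` is hypothetical.

```python
from typing import Dict, Optional


def average_delay(
    gt_times: Dict[int, float], pred_times: Dict[int, float]
) -> Optional[float]:
    """Mean of (recognition time - ground-truth completion time) over matched steps."""
    # Assumption: a step is matched when its id occurs in both dictionaries.
    delays = [pred_times[s] - gt_times[s] for s in gt_times if s in pred_times]
    if not delays:
        return None  # no step was recognized, so the delay is undefined
    return sum(delays) / len(delays)


gt = {1: 10.0, 2: 20.0, 3: 30.0}   # step id -> ground-truth completion time (s)
pred = {1: 12.0, 2: 21.0}          # step 3 was never recognized
delay = average_delay(gt, pred)
```

Here steps 1 and 2 are recognized 2 s and 1 s after completion, so the average delay is 1.5 s; the missed step 3 does not enter this metric under the stated assumption.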


IndustReal dataset

Construction-toy car

The IndustReal dataset contains 6 hours of recordings, demonstrating how 27 participants assemble and perform a maintenance task on a construction-toy car. IndustReal uses 3D printed parts, and instructions for printing your own model can be found here.

Overview of the parts and components in IndustReal.

We are thankful to the STEMFIE project for open-sourcing their 3D printed toy construction sets.


Novel aspects of the dataset

Variety of execution errors. IndustReal features 38 errors, of which 14 are exclusive to the validation and test sets. Whilst some datasets already include procedural errors (e.g. omissions), IndustReal is the first to also include a variety of execution errors (e.g. wrong type of nut used).

Subgoal-oriented execution. Other datasets either contain “free-style” assemblies or follow a strict, step-by-step execution order. IndustReal combines these execution types in a subgoal-oriented assembly style, where participants are free to determine the execution order between subgoals. This approach more closely resembles industrial procedures, since it maintains a hierarchy in procedure execution whilst allowing for flexibility where possible. IndustReal contains 48 different execution orders.

Open-source geometries. Scalability is an important factor for many industrial tasks, and technical drawings of the parts are often available in such settings. Therefore, 3D models for all parts are published to stimulate the use of synthetic data in procedural action understanding, e.g. by sim2real domain adaptation or generalization.

3D printed parts. To ensure reproducibility, future availability of the model, and growth via community effort, all parts are 3D printed and open source.


Annotations

IndustReal is specifically created to stimulate research on the novel procedure step recognition task. However, to broaden the possible uses of the dataset, we provide action recognition and object detection (assembly state detection) labels alongside the PSR annotations:

Samples from a clip in the IndustReal dataset, demonstrating the modalities and annotations for all three tasks. Gaze is indicated by the cross, detected hand joints by the dots. AR: action recognition, ASD: assembly state detection, PSR: procedure step recognition.


BibTeX

@inproceedings{schoonbeek2024industreal,
  title={IndustReal: A Dataset for Procedure Step Recognition Handling Execution Errors in Egocentric Videos in an Industrial-Like Setting},
  author={Schoonbeek, Tim J and Houben, Tim and Onvlee, Hans and van der Sommen, Fons and others},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  pages={4365--4374},
  year={2024}
}