9.00 - 9.30
|
Registration & Coffee
|
9.30 - 9.45
|
Opening (General Chairs & Host Institution Representatives)
|
9.45 - 10.45
|
Keynote (Prof. Dr. Björn Eskofier)
|
10.45 - 11.15
|
Coffee Break
|
11.15 - 12.35
|
Session #1: Ubiquitous computing (Chair: Frédéric Li)
Compressed, Real-Time Voice Activity Detection with Open Source Implementation for Small Devices (Lasse R. Andersen, Lukas J. Jacobsen and David Campos)
This paper proposes a real-time voice activity detection (VAD) system that utilizes a compressed convolutional neural network (CNN) model. On general-purpose computers, the system is capable of accurately classifying the presence of speech in audio with low latency. When implemented on small devices, however, the system shows higher latency, which presumably indicates high-load computations in the preprocessing steps. The results of the evaluation indicate that the proposed VAD system is an improvement over existing solutions in terms of reducing the model size and improving accuracy across different evaluation metrics. Furthermore, the proposed VAD system extends its applicability by training the CNN model on a different and more diverse data set. Moreover, the proposed architecture can be compressed to approximately one-eleventh of its size, facilitating eventual deployment on small devices. In contrast to existing, closed VAD solutions, the entire pipeline of the proposed VAD system is developed in Python and made available as open source, ensuring the verifiability and accessibility of the work. ✎ Paper
Self-supervised representation learning using multimodal Transformer for emotion recognition (Theresa Götz, Pulkit Arora, F. X. Erick, Nina Holzer and Shrutika Sawant)
In this paper, we present Modality-Agnostic Transformer based Self-Supervised Learning (MATS2L) for emotion recognition using physiological signals. The proposed approach consists of two stages: a) a pretext stage, where the transformer model is pre-trained on unlabeled physiological signal data, using masked signal prediction as the pre-training task, to form contextualized signal representations; b) a downstream stage, where self-supervised learning (SSL) representations extracted from the pre-trained model are utilized for emotion recognition tasks. The modality-agnostic approach allows the transformer model to focus on exploring mutual features among different physiological signals and learning more meaningful embeddings to estimate emotions effectively. We conduct several experiments on the public WESAD dataset and perform comparisons with fully supervised and other competitive SSL approaches. Experimental results show that the proposed approach is capable of learning meaningful features and is superior to other competitive SSL approaches. Moreover, a transformer model trained on SSL features outperforms a fully supervised transformer model. We also present detailed ablation studies to demonstrate the robustness of our approach. ✎ Paper
WiFi Sensing with Single-Antenna Devices for Ambient Assisted Living (Robert Schumann, Frédéric Li and Marcin Grzegorzek)
The absolute coverage of WiFi networks is higher than it has ever been, and WiFi sensing offers a device-free, contactless alternative to intrusive wearable devices for a variety of applications, in particular for ambient assisted living (AAL). However, a majority of sensing systems proposed in the literature use multiple antennas for sensing, resulting in high cost that hinders development of such solutions in real life. This work examines existing single-antenna systems as a low-cost solution in an AAL scenario. The capabilities of these systems are reviewed regarding their practical applicability based on testing in AAL environments, leveraging multiple links (i.e. transmitter-receiver pairs) and considering mobile and repositioned receivers. It is found that while the AAL use-cases of respiration monitoring, fall detection and activity recognition are realised by existing systems, further testing in realistic AAL environments with the inclusion of activities of daily living is needed. Additionally, the full use of multiple links and consideration of a mobile receiver are still rare, but show promising improvements. A multi-task system enabling all three applications is discussed using the Model for Ethical Evaluation of Socio-Technical Arrangements (MEESTAR). We suggest an ethically sensitive use, but identify a need for mitigation strategies to address privacy-related concerns of potentially unwilling users. ✎ Paper
TongueMendous: IR-Based Tongue-Gesture Interface with Tiny Machine Learning [Best Paper] (Davy P. Y. Wong and Pai H. Chou)
This paper presents TongueMendous, a non-intrusive, pervasive tongue-gesture recognition interface for the general population and use cases. It uses an infrared sensor to detect tongue gestures when the tongue sticks out in different directions. The collected data is recognized by a tiny machine learning (TinyML) model, allowing TongueMendous to classify tongue gestures on a microcontroller. Evaluations on the initial prototype reported a 91.7% cross-validation accuracy and 89.4% leave-one-person-out accuracy. We also conduct a study to explore the user experience and future design space. These results suggest that the proposed system can be accurate and work well across different users. ✎ Paper
|
12:35 - 14:00
|
Lunch Break
|
14:00 - 15:20
|
Session #2: Visual technology-based human activity recognition (Chair: Denys Matthies)
Survey on food intake methods using visual technologies (Sudhir Kumar Dubey, Dimitri Kraft, Nicola Drüeke and Gerald Bieber)
Assessing food intake is important for reasons of well-being, lifestyle, health, appearance, or fun. Particularly in the field of medicine, the intake of appropriate foods and quantities of food is considered elementary and always related to physical activity.
Various food tracking techniques exist, ranging from pen-based and purchase-based to calorie-counting and camera-based systems. Here, it is important that automated systems can recognize ingredients and estimate quantities. Therefore, there are many camera-based systems, but they differ in terms of accuracy, speed or performance.
This review provides an overview of existing technologies and describes new approaches in the area of volume-sensitive sensing methods using lidar and true-depth technologies. ✎ Paper
Interactive Exercises for Computer-based Work Using a Webcam (Angelina Schmidt, Hassan Shahid, Dimitri Kraft, Gerald Bieber and Michael Fellmann)
Sedentary behavior in office environments has become a widespread concern due to its negative impact on individuals' health and well-being. This study not only addresses this issue by providing details about the musculoskeletal disorders pertinent to the wrist, shoulders, and neck that can develop due to immobility or prolonged sitting in front of a computer workstation, but also promotes the regular incorporation of three specific exercises for office workers. In particular, this study contains a comprehensive literature review covering the trade-offs between wearable devices and computer vision techniques in monitoring and counting the repetitive movements of various physical activities. Moreover, this study utilizes the Mediapipe pose estimation technique to track exercise performance and develops algorithms using a state machine for accurately counting repetitions during active breaks in an office environment. The dataset used to evaluate the methods employed consisted of a total of 36 videos and was gathered by engaging the employees working at Fraunhofer IGD and the University of Rostock. The findings of the research validated that the state machine could count the interventions with a mean accuracy of 92%. This suggests its incorporation in the future on a larger scale by selecting more exercises, a larger dataset, and various environmental settings. ✎ Paper
Understanding the Challenges and Opportunities of Pose-based Anomaly Detection
(Ghazal Alinezhad Noghre, Armin Danesh Pazho, Vinit Katariya and Hamd Tabkhi)
Pose-based anomaly detection is a video-analysis technique for detecting anomalous events or behaviors by examining human pose extracted from the video frames. Human anomaly detection plays a crucial role in various applications, such as smart cities and intelligent surveillance systems, for the safety of public environments. Utilizing pose data alleviates privacy and ethical issues while reducing computational complexity compared to pixel-based approaches. However, it introduces further challenges, such as noisy skeleton data, the loss of important pixel information, and insufficiently rich features. These problems are exacerbated by the scarcity of high-quality anomaly detection datasets that are sufficiently representative of real-world scenarios. In this work, we analyze and quantify the characteristics of two video anomaly datasets to better understand the difficulties of pose-based anomaly detection. We take a step forward, exploring the discriminating power of pose and trajectory for video anomaly detection and their effectiveness based on context. Consequently, our findings will provide valuable insights into the benefits and limitations of pose-based approaches for anomaly detection. ✎ Paper
Pedestrian Collision Prediction Using a Monocular Camera
(Shiyuan Chen, Xue Qin, Zeyd Boukhers, John See, Wei Sui and Cong Yang)
This paper introduces a simple yet efficient method, PedView, for pedestrian collision warning in Advanced Driver Assistance Systems (ADAS). In contrast to existing approaches that rely on LiDAR and stereo cameras for pedestrian-vehicle distance calculation, our proposed PedView stands out in three key aspects. Firstly, it leverages a forward-looking monocular camera for 3D pedestrian detection, particularly suitable for resource-limited environments like dealer-installed ADAS. Secondly, PedView goes beyond conventional methods that rely solely on distance and car speed for collision prediction. Instead, it takes an end-to-end approach by incorporating pedestrian location and intent derived from our proposed 3D detector and fuzzy rules. This integration results in a significant improvement in prediction accuracy. Lastly, extensive experiments conducted on two datasets demonstrate the efficiency of PedView, showcasing its superior performance compared to the discrete conditional rules method (DCR) (Precision 0.937 vs. 0.844 and Recall 0.835 vs. 0.746). These results highlight PedView's robustness across various real-world scenarios. ✎ Paper
|
15:20 - 16:20
|
Coffee Break & Poster Madness (Chair: Marco Gabrecht)
Exploring False Statement Detection with Force Plate (Davy P. Y. Wong and Sheng-Chieh Yang)
We present a novel lie detection approach based on a force plate. The COP (center of pressure) trajectory length of the interviewee is extracted to test honesty. The preliminary study shows that the COP trajectory length of honest statements is shorter than that of dishonest statements, and indicates the feasibility of using a force plate as a polygraph. ✎ Paper
HomeGrid - Experimental Displays for SmartHome Devices and Interfaces
(Moritz Krause and Michael Zöllner)
In our society, displays are becoming increasingly prevalent. While it is nearly inconceivable to imagine a daily life without screens, scientific research indicates that our mental well-being is negatively affected by the increasing use and time spent on screens. The "HomeGrid" project has been developed as a smart home concept, aiming to explore how traditional screens in private environments can be reduced while still effectively conveying information in an intuitive manner. To achieve this, two main approaches were employed. Firstly, experimentation was conducted on the capture and communication of artificial light, and secondly, precise monitoring and visualization of indoor air quality were explored. These factors are fundamental indicators of both our mental and physical well-being, as high air quality, for example, enhances concentration, and light serves as a crucial regulator of our circadian rhythm. ✎ Paper
A Study on Hyperparameters Configurations for an Efficient Human Activity Recognition System
(Paulo J.S. Ferreira, João Mendes-Moreira and João M.P. Cardoso)
Human Activity Recognition (HAR) has been a popular research field due to the widespread availability of devices with sensors and computational power (e.g., smartphones and smartwatches). Applications for HAR systems have been extensively researched in recent literature, mainly due to the benefits of improving quality of life in areas like health and fitness monitoring. However, since people have different motion patterns when performing physical activities, a HAR system needs to be adapted to the characteristics of the user in order to maintain or improve accuracy. Mobile devices, such as smartphones, used to implement HAR systems, have limited resources (for example, battery life). They also have difficulty adapting to the device’s constraints to work efficiently for long periods. In this work, we present a kNN-based HAR system and an extensive study of the influence of hyperparameters (window size, overlap, distance function, and the value of k) and parameters (sampling frequency) on the system accuracy, energy consumption, and response time. We also study how hyperparameter configurations affect the model's performance for the users and the activities. Experimental results show that adjusting the system's behavior to the user, the device, and the target service is possible by adapting the hyperparameters. These results motivate the development of a HAR system capable of automatically adapting the hyperparameters for the user, the device, and the service. ✎ Paper
Challenges in Modelling Cooking Task Execution for User Assistance
(Tomasz Sosnowski, Teodor Stoev, Thomas Kirste and Kristina Yordanova)
Executing a complex physical task according to an instruction or a checklist is typical for various fields, such as aviation or healthcare. It is possible that the person is inexperienced or under stress and therefore unable to promptly consult the instruction text for correct execution of the task. To address this problem, different works propose the usage of automated assistive systems that could guide the user through the task execution. This is also addressed by the Defense Advanced Research Projects Agency (DARPA) Perceptually-enabled Task Guidance (PTG) program, which aims to provide users with augmented reality goggles capable of tracking the state of the user during task execution and displaying hints relevant for the completion of the task.
In this paper we discuss one important part of this project, namely, the ability to track the state of the user and the environment in order to be able to assist them. We discuss our modelling approach for the development of a probabilistic state tracker, which uses sensor observations to identify the current state and actions of the user, as well as the goal they are pursuing. The paper provides useful insights into the modelling of user behaviour, the challenges associated with modelling multi-user behaviour and how to tackle them. ✎ Paper
PneuShoe: A Pneumatic Smart Shoe for Activity Recognition, Terrain Identification, and Weight Estimation
(Marco Gabrecht, Hengyu Wang and Denys J.C. Matthies)
We present a footwear prototype that can detect activities, distinguish terrains, and estimate the user’s weight. The insole features two air chambers with pressure sensors and a 6-DOF IMU. A machine learning model, a decision tree, was trained to distinguish standing, walking, and running. Further, we can discriminate between different terrains, such as soft sand, asphalt, and grass. Moreover, we showcase how the air pressure sensors can be utilized to provide a weight estimation. ✎ Paper
Using Deep Learning to Identify Persons by their Movement on a Sensor Floor
(Felicia Bader, Laura Liebenow and Axel Steinhage)
We present an approach to identify persons based on their movement on a sensor floor. Three types of deep learning neural networks were trained on five subjects' sensor data collected during ordinary working days in a test room. A Transformer network architecture proved to be the most successful, achieving a recognition rate of over 90% in the task of assigning just one minute of movement data to the correct person. Since the sensor floor can be installed invisibly under normal flooring, the findings result in new applications, e.g. for security systems or in the early detection of health problems that are reflected in the gait pattern. ✎ Paper
Assessment of Quality of Gyrocardiograms Based on Features Derived from Symmetric Projection Attractor Reconstruction
(Szymon Sieciński, Muhammad Tausif Irshad, Md Abid Hasan, Ewaryst Tkacz and Marcin Grzegorzek)
Signal quality assessment is essential for biomedical signal processing, analysis, and interpretation. Various methods exist, including averaged numerical values, thresholding, time- or frequency-domain analysis, and nonlinear approaches. This study evaluated the quality of gyrocardiographic signals (GCG) using symmetric projection attractor reconstruction (SPAR) analysis. Two classifiers, random forest and bagged trees, were used to assess the performance of the SPAR-based approach. Eleven features were extracted from the variables v and w, calculated on the basis of the signal delay. These features included minimum and maximum values, mean, standard deviation (SD), median, and Euclidean distance. The results showed that the SPAR-based approach achieved high accuracy, precision, and recall. The random forest classifier achieved 0.729 accuracy, 0.726 precision, and 0.729 recall, while the bagged trees classifier achieved 0.792 accuracy, 0.804 precision, and 0.792 recall. These findings suggest that the SPAR-based approach is a promising method to accurately assess the quality of GCG signals. ✎ Paper
Preliminary studies of measuring skateboarding forces by combining inertial sensors and camera-based pose estimation
(Michael Zöllner and Moritz Krause)
Understanding acceleration forces and making progress in learning skateboarding is a process of trial and error. In our paper, we describe our preliminary experiments for characterizing the complex interactions while pushing for speed in ramps and pump tracks. To this end, we capture and visualize the body movement, the joint relations from hip to ankle and the resulting forces by combining inertial sensors on the skateboard with camera-based machine learning pose estimation of the athlete. ✎ Paper
The impact of cerebellar transcranial alternating current stimulation (tACS) and simultaneous motor network activation via motor sequence learning (MSL) on movements and muscle strength
(Kathinka von Möller, Rebecca Herzog, Christina Bolte, Alexander Münchau, Tobias Bäumer and Anne Weissbach)
The cerebellum and its connections to the cerebrum can be modulated by noninvasive brain stimulation (NIBS) techniques, especially transcranial alternating current stimulation (tACS). This modulation may affect movements and muscle strength by increasing the cortical excitability. Moreover, the effect depends on the state of the neurons involved. Therefore, it is important to gain further insight into how cerebellar tACS with and without simultaneous activation of the motor network via motor sequence learning (MSL) affects movements and muscle strength. Using inertial measurement units (IMUs) and electromyography (EMG), we were able to record and evaluate movements of 20 participants regarding this issue. We could show that simultaneous activation of the motor network partly led to longer task durations because it is suspected to interfere with the tACS effect. Concerning the muscle strength, a strength enhancing effect occurred, due to the irritation of the motor system by tACS and particularly the simultaneous MSL. These findings are of importance for future therapeutic approaches using tACS. ✎ Paper
Effects of Time-Series Data Pre-processing on the Transformer-based Classification of Activities from Smart Glasses
(Gabriela Augustinov, Marcin Grzegorzek and Sebastian Fudickar)
Time series classification is gaining significance in pattern recognition as time series data becomes more abundant along with the increasing digitization of daily life and the rise of the Internet of Things (IoT). One of the biggest challenges lies in the ordered nature of time series attributes, making traditional machine learning (ML) algorithms designed for static data unsuitable for processing temporal data. The Transformer architecture was introduced as a novel approach in natural language processing for machine translation tasks, relying solely on attention mechanisms without the need for convolution or recurrence. Since machine translation is similar to time-series data, where order is an important factor, it is also worth considering the Transformer for time-series classification. Pre-processing the data is a crucial step in the ML process and can influence the data and impact the effectiveness of the ML models. In this paper, we aim to address the effects of time-series pre-processing and data representation in combination with the Transformer model for Human Activity Recognition (HAR) using IMU data from smart glasses as input. We analyze the results based on established evaluation metrics such as the F1-score and the area under the curve (AUC). ✎ Paper
|
16:20 - 16:30
|
Bus Pickup to the Boat (Parking Lot Fraunhofer IMTE)
|
17:00 - 18:30
|
Boat Ride (Schiffsanleger Quandt-Linie-Lübeck -> Travemünde)
|
18:30 - 20:30
|
Best Paper Banquet & Dining (Travemünde)
|
20.30 - 22.00
|
Boat Ride (Travemünde -> Schiffsanleger Quandt-Linie-Lübeck)
|
9.45 - 10.45
|
Keynote (Prof. Dr. Niels Henze)
|
10:45 - 11:15
|
Coffee Break
|
11:15 - 12:15
|
Session #3: Wearable human activity recognition (Chair: Arjan Kuijper)
Miss-placement Prediction of Multiple On-body Devices for Human Activity Recognition (Robin Dönnebrink, Fernando Moya Rueda, René Grzeszick and Maximilian Stach)
Nowadays, automatic human activity recognition (HAR) plays a central role in industrial applications. Human-centered activity recognition methods using on-body devices (OBDs) especially address situations where the identity has to be protected. However, practitioners strongly assume that end-users use OBDs correctly at deployment. In reality, this is hardly the case. Thus, there is a need for a robust HAR system, either at the recording stage or at the recognition stage. This contribution addresses a combination of both stages. It proposes a miss-placement recognition of OBDs on the human body when performing an activity. We deploy a limb-oriented temporal convolutional neural network (CNN) to recognize either that a miss-placement occurs or the type of miss-placement. Preliminary results on a proposed dataset suggest that miss-placement classification is possible, which can be used for end-user feedback during recording or leveraged in data post-processing. ✎ Paper
Exploring the Benefits of Time Series Data Augmentation for Wearable Human Activity Recognition (Md Abid Hasan, Frédéric Li, Artur Piet, Philip Gouverneur, Muhammad Tausif Irshad and Marcin Grzegorzek)
Wearable Human Activity Recognition (HAR) is an important field of research in smart assistive technologies. Collecting the data needed to train reliable HAR classifiers is complex and expensive. As a way to mitigate data scarcity, time series data augmentation (TSDA) techniques have emerged as a promising approach for generating synthetic HAR data. TSDA is not as trivial as image augmentation and has been relatively less investigated. In this paper, a comparative study of various state-of-the-art TSDA techniques is conducted in the context of wearable HAR. More specifically, we investigate the classification of human activities on the OPPORTUNITY dataset using a deep CNN-LSTM architecture trained on raw and synthetic data. Our study highlights the importance of TSDA for performance enhancement on multivariate multi-class datasets. Interestingly, very simple time domain-based TSDA techniques notably outperform complex ones based on Generative Adversarial Networks. We provide practical advice on how to apply TSDA to imbalanced datasets in order to generate the ideal amount of synthetic data and achieve optimal classification accuracy. Our TSDA-based approach outperforms the previous state-of-the-art on the OPPORTUNITY dataset by 4.66% and 1.66% in average and weighted F1-scores, respectively. ✎ Paper
A Real-time Human Pose Estimation Approach for Optimal Sensor Placement in Sensor-based Human Activity Recognition
(Orhan Konak, Alexander Wischmann, Robin van de Water and Bert Arnrich)
Sensor-based Human Activity Recognition facilitates unobtrusive monitoring of human movements. However, determining the most effective sensor placement for optimal classification performance remains challenging. This paper introduces a novel methodology to resolve this issue, using real-time 2D pose estimations derived from video recordings of target activities. The derived skeleton data provides a unique strategy for identifying the optimal sensor location. We validate our approach through a feasibility study, applying inertial sensors to monitor 13 different activities across ten subjects. Our findings indicate that the vision-based method for sensor placement offers comparable results to the conventional deep learning approach, demonstrating its efficacy. This research significantly advances the field of Human Activity Recognition by providing a lightweight, on-device solution for determining the optimal sensor placement, thereby enhancing data anonymization and supporting a multimodal classification approach. ✎ Paper
|
12:15 - 13:15
|
Lunch Break
|
13:15 - 14:15
|
Session #4: Sensor-based healthcare applications (Chair: Sebastian Fudickar)
Classification of freezing of gait using accelerometer data: A systematic performance evaluation approach (Aditi Site, Jari Nurmi and Elena Simona Lohan)
Parkinson’s disease is one of the most common neurodegenerative chronic diseases and can affect the patient’s quality of life by creating several motor and non-motor impairments. Freezing of gait is one such motor impairment, which can cause the inability to move forward despite the intention to walk. The identification of freezing-of-gait events using sensor technology and machine-learning algorithms can improve the quality of life and decrease the risk of falls in Parkinson’s patients. Our study focuses on a systematic performance evaluation of machine learning algorithms for developing a well-fitted and generalized model. In this work, we train fully connected artificial and deep neural network algorithms on time-domain and frequency-domain-transform-based features to classify freezing-of-gait events in patients using accelerometer data. We evaluate these algorithms for hyperparameters such as batch size, optimizer type, and window size in a step-wise process. We identify an optimal combination of parameters according to the accuracy and model fit optimality metrics for the artificial and deep neural networks to classify freezing-of-gait events in Parkinson’s patients. We were able to achieve a classification accuracy of 89%-90% with the Adam optimizer, batch sizes of 256 and 8, and 60 and 40 epochs for the ANN and DNN, respectively. ✎ Paper
Sensor-Based Detection of Food Hypersensitivity Using Machine Learning (Lennart Jablonski, Torge Jensen, Greta M. Ahlemann, Xinyu Huang, Vivian V. Tetzlaff-Lelleck, Artur Piet, Franziska Schmelter, Valerie S. Dinkler, Christian Sina and Marcin Grzegorzek)
The recognition of physiological reactions with the help of machine learning methods already plays a major role in many research areas, but is still little represented in the field of food hypersensitivity recognition. The present work addresses the question of how food hypersensitivity can be detected by analysing sensor data with explainable machine learning algorithms. In a first step, the Empatica E4 wristband, a wearable device that can be easily integrated into everyday life, collects raw data on various physiological patterns, and algorithms are implemented to extract a variety of features from the raw data. Subsequently, machine learning methods are used to target this classification problem and examine how food hypersensitivity can be detected using these objectively measurable features. In a subject-independent setup, an accuracy of 91% could be achieved, which provides a promising basis for a new non-invasive and objectively measurable method to detect food hypersensitivity. ✎ Paper
Making Noise - Improving Seismocardiography Based Heart Analysis With Denoising Autoencoders
(Jonas Burian, Helmut Tödtmann, Marian Haescher, Mario Aehnelt and Arjan Kuijper)
Seismocardiography is a method commonly used to monitor and prevent cardiovascular diseases. However, noise and artifacts in the signals often interfere with the assessment of cardiac health and the analysis of the signal morphology. Therefore, this work presents a new approach to denoise seismocardiography signals by applying fully convolutional denoising autoencoders. In order to investigate the suitability and robustness of this approach, a series of experiments have been carried out with respect to the optimal configuration for the denoising task and a comparison with wavelet denoising as a traditional approach. Furthermore, the practical applicability of the method is tested with the use case of transforming noisy seismocardiography signals into electrocardiography signals. Our approach using autoencoders outperforms the commonly used wavelet denoising. Additionally, we demonstrate that the widespread usage of Butterworth filters may not only be unnecessary but even detrimental. Finally, the generalizability of the method is verified on unseen data. With those combined improvements in noise reduction, the assessment of cardiac health using seismocardiography in the presence of noise may be facilitated in the future. ✎ Paper
|
14.15 - 14.30
|
Closing
|