Intelligent Cockpit Monitoring System DMS Product Design
While most people are already familiar with DMS (Driver Monitoring System) and OMS (Occupant Monitoring System), IMS is a concept that has only become popular in the past two years. IMS, the In-cabin Monitoring System, is an intelligent visual monitoring system for the automotive cockpit. Broadly speaking, IMS covers both DMS and OMS, along with face recognition, gesture recognition, vital-sign recognition, remote monitoring and more.
With the development of intelligent connected vehicles in recent years, and especially the continuous iteration of intelligent cockpit design, the in-cabin intelligent experience has steadily improved; the exploration of a series of intelligent scenarios, such as the fusion of DMS with ADAS and in-cabin lifestyle scenarios, has also enriched the cockpit experience.
For DMS, out of public travel-safety considerations, both the EU and China have introduced laws and regulations. China has taken the lead in making DMS installation mandatory for commercial vehicles in the “two-passenger, one-hazardous” categories (long-distance coaches, tourist coaches and hazardous-goods transport vehicles), and requirements for passenger cars are also being drafted. At the same time, DMS has become a key element of, and a necessary condition for, the Euro NCAP five-star safety rating. In the past two years many companies have started offering DMS solutions; few of them have turned a profit, yet DMS has become a standard feature on new models.
The core functions of DMS fall into two modules: fatigue monitoring and hazardous/distracting-behavior monitoring.
Fatigue monitoring: during driving, the camera samples the driver’s eye-closure and yawning behavior; DMS combines this with driving time, vehicle speed and other factors to determine whether the driver is fatigued and at what level. Based on the fatigue level, the system issues corresponding warnings to the driver, such as chimes, voice alerts, seatbelt tightening and instrument-cluster alerts.
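The following is a minimal sketch of this kind of level fusion, assuming an eye-closure ratio, a yawn rate, driving time and vehicle speed are already available from upstream modules. All thresholds, level names and the function itself are illustrative, not values from the original design.

```python
# Hypothetical fatigue-level fusion: combine eye closure, yawning and
# driving context into a 0-3 level. Thresholds are illustrative only.

def fatigue_level(eye_closure_ratio: float, yawns_per_min: float,
                  driving_minutes: float, speed_kmh: float) -> int:
    """Return 0 (alert) through 3 (severe fatigue)."""
    level = 0
    if eye_closure_ratio > 0.15 or yawns_per_min >= 3:
        level = 1
    if eye_closure_ratio > 0.25 or (yawns_per_min >= 3 and driving_minutes > 120):
        level = 2
    if eye_closure_ratio > 0.40:
        level = 3
    if speed_kmh < 10:
        level = min(level, 1)   # suppress aggressive warnings when nearly stopped
    return level
```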
How fatigue monitoring is implemented:
Face detection: the process is subdivided into face localization, face recognition and face tracking. Face localization detects a face by identifying facial feature points in the image and marking their locations; face recognition matches the facial data detected in a new image against stored data; face tracking follows, on each frame, the faces found in the previous image frames.
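As a rough illustration of the detect-then-track flow, here is a sketch using OpenCV’s bundled Haar cascade and a simple overlap-based association between frames. A production DMS would use an IR camera and dedicated CNN detector/landmark models; the camera index and parameters here are assumptions.

```python
# Minimal face detection + tracking loop (illustrative, not production code).
import cv2

face_det = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def iou(a, b):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a; bx, by, bw, bh = b
    x1, y1 = max(ax, bx), max(ay, by)
    x2, y2 = min(ax + aw, bx + bw), min(ay + ah, by + bh)
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    return inter / float(aw * ah + bw * bh - inter + 1e-6)

last_box = None
cap = cv2.VideoCapture(0)            # DMS camera stream (index is illustrative)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    boxes = face_det.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(boxes):
        # "Tracking": keep the detection that best overlaps the previous frame.
        last_box = (max(boxes, key=lambda b: iou(b, last_box))
                    if last_box is not None else boxes[0])
cap.release()
```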
Head features: the head pose consists of three pose angles. A CNN-based head-tracking system takes the face region of the image as input; the detected facial feature points, combined with a default head model, yield an approximate head pose. By further tracking the detected facial features and finding additional ones, more data can be added to the head model to update its geometric properties. While the system runs, this process loops continuously, outputting the current head pose as three 3D pose angles.
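A sketch of the “landmarks plus default head model” step, assuming the 2D feature points already come from a landmark detector. The generic 3D model coordinates and the pinhole camera approximation are rough, commonly used values, not the original system’s model.

```python
# Approximate head pose (yaw/pitch/roll) from 2D landmarks and a generic
# 3D head model via solvePnP. Values are illustrative approximations.
import cv2
import numpy as np

# Generic 3D model points (head-centered frame): nose tip, chin,
# left/right eye outer corners, left/right mouth corners.
MODEL_3D = np.array([
    (0.0, 0.0, 0.0), (0.0, -330.0, -65.0),
    (-225.0, 170.0, -135.0), (225.0, 170.0, -135.0),
    (-150.0, -150.0, -125.0), (150.0, -150.0, -125.0)], dtype=np.float64)

def head_pose(landmarks_2d, frame_w, frame_h):
    """landmarks_2d: 6x2 points in the same order as MODEL_3D."""
    focal = frame_w                          # rough pinhole approximation
    cam = np.array([[focal, 0, frame_w / 2],
                    [0, focal, frame_h / 2],
                    [0, 0, 1]], dtype=np.float64)
    dist = np.zeros((4, 1))                  # assume no lens distortion
    ok, rvec, tvec = cv2.solvePnP(
        MODEL_3D, np.asarray(landmarks_2d, dtype=np.float64),
        cam, dist, flags=cv2.SOLVEPNP_ITERATIVE)
    R, _ = cv2.Rodrigues(rvec)
    # Euler angles (degrees) from the rotation matrix, x-y-z convention.
    sy = np.hypot(R[0, 0], R[1, 0])
    pitch = np.degrees(np.arctan2(R[2, 1], R[2, 2]))
    yaw = np.degrees(np.arctan2(-R[2, 0], sy))
    roll = np.degrees(np.arctan2(R[1, 0], R[0, 0]))
    return yaw, pitch, roll
```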
Eye detection: the gaze direction is used to determine whether the driver is distracted. An approximate gaze direction can be deduced from the head pose obtained above. When the pupil and cornea can be recognized reliably, the exact gaze direction can be further calculated from the Purkinje images (corneal reflections). Then, using the in-cabin component layout data built into the system, the driver’s current observation target can be determined; combined with the current driving behavior, the system decides whether the driver is distracted.
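A minimal sketch of mapping an estimated gaze direction to a cockpit observation zone. The zone names and angle ranges are invented for illustration; in a real system they would come from the actual camera and cockpit geometry.

```python
# Map gaze direction (yaw/pitch in degrees, 0/0 = straight at the road)
# to an observation zone. Boundaries are illustrative only.
ZONES = [
    # (name, yaw_min, yaw_max, pitch_min, pitch_max)
    ("road_ahead",      -15, 15, -10, 15),
    ("left_mirror",     -60, -30, -10, 15),
    ("right_mirror",     30, 60, -10, 15),
    ("rearview_mirror",  10, 35, 15, 35),
    ("center_screen",    15, 45, -40, -10),
    ("instrument",      -15, 15, -35, -10),
]

def gaze_zone(yaw: float, pitch: float) -> str:
    for name, y0, y1, p0, p1 in ZONES:
        if y0 <= yaw <= y1 and p0 <= pitch <= p1:
            return name
    return "off_road"   # e.g. lap, side window, passenger side
```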
Blink detection: the positions and states of the eyes are further recognized from the detected face and head pose, and are mainly used to compute the fatigue state and whether attention is distracted. Here, information such as the degree of eye opening is used to determine the fatigue state based on PERCLOS. This includes blink information (blink rate and duration) and eye-state information (open vs. closed). The eye state is a binary classification problem and only needs a small neural network; the blink information requires analyzing the past several frames.
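A minimal PERCLOS sketch: the fraction of frames within a sliding window in which the eyes are judged closed. The window length, frame rate and the eye-openness threshold in the usage comment are illustrative values, not a calibrated standard.

```python
# PERCLOS over a sliding window of per-frame eye-closed flags.
from collections import deque

class Perclos:
    def __init__(self, fps: int = 30, window_s: int = 60):
        self.frames = deque(maxlen=fps * window_s)   # 1 = closed, 0 = open

    def update(self, eye_closed: bool) -> float:
        self.frames.append(1 if eye_closed else 0)
        return sum(self.frames) / len(self.frames)

perclos = Perclos()
# per frame: p = perclos.update(eye_openness < 0.2)  # eye_openness from the eye-state net
```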
Hazardous and distracting behavior monitoring: any action that takes the driver’s attention away from driving is treated as hazardous driving behavior, for example looking down for something, looking out the window, answering the phone, smoking, drinking, not wearing a seatbelt, or deliberately covering the camera. These abnormal situations can be categorized in two ways: looking down or out of the window is head-related, so obtaining the head-pose angle solves the problem; drinking, making phone calls and similar actions belong to action recognition and can be handled by designing an action-recognition algorithm.
Judgment is made against an angle threshold for the head-pose offset; once the threshold is triggered, a timer starts, the distraction level is determined by how long the offset lasts, and the corresponding prompt is given, such as a chime, voice alert, seatbelt tightening or instrument-cluster alert.
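A sketch of this threshold-and-timer logic. The yaw limit, durations and level mapping are illustrative assumptions.

```python
# Distraction level from how long the head-pose offset stays over a threshold.
import time
from typing import Optional

YAW_LIMIT_DEG = 30.0                        # head-yaw offset threshold (illustrative)
LEVELS = [(1.0, 1), (2.5, 2), (5.0, 3)]     # (seconds over threshold, level)

class DistractionMonitor:
    def __init__(self) -> None:
        self.off_road_since: Optional[float] = None

    def update(self, head_yaw_deg: float, now: Optional[float] = None) -> int:
        now = time.monotonic() if now is None else now
        if abs(head_yaw_deg) < YAW_LIMIT_DEG:
            self.off_road_since = None      # back within the threshold: reset timer
            return 0
        if self.off_road_since is None:
            self.off_road_since = now       # threshold just triggered: start timing
        elapsed = now - self.off_road_since
        level = 0
        for seconds, lvl in LEVELS:
            if elapsed >= seconds:
                level = lvl
        return level
```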
Head pose detection: there are two common methods for obtaining the head-pose angles. The first uses the landmarks mentioned above: since the coordinates of the facial key points are known, a PnP algorithm can directly fit the three-dimensional head angles. The second is to train a small network directly on face images labeled with the three angles yaw, pitch and roll, with the output layer of the network directly regressing the three float values; this is simple and blunt, and achieves high precision.
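For the second method, here is a sketch of a small regression network with a three-output head. The backbone depth, input size and plain L2 loss are illustrative choices, not a specific production design.

```python
# Small CNN that regresses yaw/pitch/roll directly from a cropped face image.
import torch
import torch.nn as nn

class HeadPoseNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),   # 1-channel IR crop
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, 3)          # yaw, pitch, roll (degrees)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

model = HeadPoseNet()
loss_fn = nn.MSELoss()
# angles_pred = model(face_batch)             # face_batch: (N, 1, 64, 64)
# loss = loss_fn(angles_pred, angles_gt)      # angles_gt from labeled data
```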
Abnormal action recognition: recognizing actions such as making phone calls, drinking and smoking purely as action/behavior recognition with LSTM or 3D-convolution algorithms is very computationally expensive. Therefore, two simpler methods are generally used. The first treats such action recognition as an object detection problem: for phone-call recognition, the cell phone is the target to detect; for drinking detection, the cup is the target to detect.
The idea of this method is very simple, but it has a drawback: for example, with the DMS camera mounted on the A-pillar, a phone call made with the right hand may not be visible in the image, so the detection rate is low in that scenario. The other approach treats action recognition as a single-frame image classification task: collect images of the various action behaviors, manually label them as supervision for training, and use a classifier to determine the final action class.
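As a sketch of the detection-based variant, the snippet below maps generic detector output to an action label based on where the object sits relative to the face box. The detector output format, class names and proximity rule are assumptions for illustration only.

```python
# Infer an action from object detections near the driver's face.
def classify_action(detections, face_box):
    """detections: list of (label, (x, y, w, h), score); face_box: (x, y, w, h)."""
    fx, fy, fw, fh = face_box
    face_cx, face_cy = fx + fw / 2, fy + fh / 2
    for label, (x, y, w, h), score in detections:
        if score < 0.5:
            continue
        cx, cy = x + w / 2, y + h / 2
        near_face = abs(cx - face_cx) < 1.5 * fw and abs(cy - face_cy) < 1.5 * fh
        if label == "cell phone" and near_face:
            return "phone_call"
        if label in ("cup", "bottle") and near_face:
            return "drinking"
    return "none"
```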
DMS System Flow
Cockpit DMS-related human-computer interaction design (IVI)
This part mainly addresses how to remind the user effectively, generally through audio, images and text on the instrument cluster, the center screen and the speakers, but also through seatbelt vibration (somatosensory) and smell (olfactory) to convey information to the user. Different OEMs or cockpit R&D projects have different focuses, but the main product features are still designed around the points above.
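One possible way to organize these channels is a simple severity-to-channel mapping; the channel names and mapping below are illustrative product choices, not a fixed standard.

```python
# Dispatch a warning to the HMI channels mentioned above (illustrative).
CHANNELS_BY_SEVERITY = {
    "low":    ["cluster_icon"],
    "medium": ["cluster_icon", "center_screen_banner", "chime"],
    "high":   ["cluster_icon", "center_screen_banner", "voice_prompt",
               "seatbelt_vibration"],
}

def dispatch_warning(severity: str) -> list:
    channels = CHANNELS_BY_SEVERITY.get(severity, [])
    for channel in channels:
        print(f"[HMI] trigger {channel}")   # placeholder for the real IVI call
    return channels
```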