Merge and Plot Billions of Data Points: On-the-Fly, No Pre-processing

By - Drivetech
20.10.23 11:59 AM
DriveTech was recently working on a client project that involved durability testing for a fleet of vehicles. The testing generated huge amounts of data on a daily basis, and at the end of each day the Client expected both an incremental and a cumulative analysis of the data collected to date.
The Client was essentially facing the following four challenges:
-  Merging multiple signal files together.
-  Preventing loss of data due to sampling.
-  Preventing data duplication (caused by sampling and re-sampling of the original data).
-  Sharing files and data analysis with team members.
DriveTech developed a solution that addresses all of the above and more.

Our Research:
DriveTech collaborated with the Client’s teams to gain a deeper understanding of the issues they faced. The Client’s teams were running a sequence of road tests on one of their prototype vehicles and recording CAN data during each test. Each test generated a large number of data files, and the testing team wanted to analyze all of them as a single data set. They converted the message files to signal files and tried to combine multiple files manually, but this caused their system to crash, so they had to analyze each file individually. Further, they were sampling the files at a standard sampling interval. Whenever a different sampling interval was needed, new re-sampled files had to be painstakingly created, which duplicated data, increased storage requirements, and added the time needed to re-sample the data.
To quantify this: on average, each 8-hour drive test generated ~20 signal files of ~150 MB each. Analyzing each file required at least 1 man-hour, so analyzing 20 files would take 20 man-hours, i.e., 2 to 3 working days, making individual file analysis laborious and impractical.
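The figures above can be reproduced with a few lines of arithmetic. This is only an illustrative sketch: the 8-hour working day is an assumption (the text gives the result as "2 to 3 working days" without stating the day length), and all variable names are hypothetical.

```python
# Approximate figures quoted in the text for one 8-hour drive test.
files_per_test = 20        # ~20 signal files
file_size_mb = 150         # ~150 MB per file
hours_per_file = 1         # at least 1 man-hour of analysis per file

# Assumption (not stated in the text): one working day is 8 hours.
hours_per_workday = 8

total_size_gb = files_per_test * file_size_mb / 1000   # decimal gigabytes
total_hours = files_per_test * hours_per_file
working_days = total_hours / hours_per_workday

print(f"Data per drive test: ~{total_size_gb} GB")
print(f"Analysis effort: {total_hours} man-hours (~{working_days} working days)")
```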

The Solution:
To meet these requirements, DriveTech introduced a new feature called ‘Dynamic Sampling’ (DS) to StellarAi, its automotive test data analytics platform. Dynamic Sampling allows Users to visualize a group of files together as a single data set. StellarAi slices the consolidated original data set to isolate the parameters to be visualized and the time period of interest. This eliminates the need to handle the entire data set and reduces StellarAi's own computing requirements.
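The idea of treating several files as one data set and then slicing out a single parameter and time window can be sketched in a few lines of Python. This is not StellarAi's implementation, which the article does not describe; the in-memory "files", the signal names, and the helper functions `merge_signal` and `slice_window` are all hypothetical stand-ins used for illustration.

```python
import heapq
from bisect import bisect_left, bisect_right

def merge_signal(files, signal):
    """Lazily merge one signal's samples from several files in time order.

    Each hypothetical "file" is a dict mapping a signal name to a
    time-sorted list of (timestamp_s, value) tuples.
    """
    streams = [f[signal] for f in files if signal in f]
    return heapq.merge(*streams, key=lambda sample: sample[0])

def slice_window(samples, t_start, t_end):
    """Return the samples with t_start <= timestamp <= t_end.

    `samples` must be a list sorted by timestamp.
    """
    times = [t for t, _ in samples]
    lo = bisect_left(times, t_start)
    hi = bisect_right(times, t_end)
    return samples[lo:hi]

# Two made-up signal files recorded back to back.
file_a = {"speed": [(0.0, 10.0), (1.0, 12.0)], "rpm": [(0.0, 800.0)]}
file_b = {"speed": [(2.0, 15.0), (3.0, 11.0)]}

merged = list(merge_signal([file_a, file_b], "speed"))
window = slice_window(merged, 1.0, 2.0)
print(merged)
print(window)
```

Only the requested signal is read from each file and only the requested window is returned, which mirrors the point made above: the full data set never has to be handled at once.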
Dynamic Sampling helps StellarAi intelligently set a fixed sampling interval, optimized for the available data bandwidth and the rendering capacity of the User's workstation. Users can visualize a signal parameter across the entire data set at this fixed sampling interval. DS also enables Users to drill down into a particular slice of the graph.
As the User drills down, Dynamic Sampling adjusts the view to display just the selected data window. This frees up computing resources and effectively crops the data under study, which allows a shorter sampling interval within the selected window.
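One simple way to picture this behavior is a point budget: the viewer keeps at most a fixed number of points on screen and picks the coarsest stride that fits the visible window. The sketch below assumes plain stride-based decimation and a hypothetical `max_points` budget; the article does not say how StellarAi actually chooses its sampling interval, so treat this as an illustration of the concept only.

```python
import math

def pick_stride(n_visible, max_points):
    """Smallest stride that keeps the plotted points within the budget."""
    return max(1, math.ceil(n_visible / max_points))

def dynamic_sample(samples, t_start, t_end, max_points):
    """Return (points_to_plot, stride) for the window [t_start, t_end].

    `samples` is a list of (timestamp_s, value) tuples. The original list
    is never modified, so no re-sampled copies of the data are created.
    """
    visible = [s for s in samples if t_start <= s[0] <= t_end]
    stride = pick_stride(len(visible), max_points)
    return visible[::stride], stride

# Made-up signal: 1000 samples, one per second.
samples = [(float(i), float(i * i)) for i in range(1000)]

# Full overview: every 10th sample fits a 100-point budget.
overview, overview_stride = dynamic_sample(samples, 0.0, 999.0, max_points=100)

# Drill-down: 50 visible samples fit the budget, so raw data is shown.
zoomed, zoomed_stride = dynamic_sample(samples, 100.0, 149.0, max_points=100)

print(len(overview), overview_stride)
print(len(zoomed), zoomed_stride)
```

With a stride of 1 the view shows the raw, non-sampled data. Plain striding can skip short peaks at coarse zoom levels; a real viewer might keep the minimum and maximum of each bucket instead, which is a design choice outside the scope of this sketch.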

The result is an enhanced user experience in which Users can keep drilling down into specific events within the data, even to the point where they are viewing the actual, non-sampled data. The DS approach also ensures that all Users can view comprehensive overviews of massive amounts of data, reliably rendered in a deterministic period of time.
Equipped with DS, StellarAi eliminated the need for manual sampling while keeping the original data intact and readily accessible. This also ensured no data duplication, no data loss due to sampling, and the ability to adjust the sampling interval on the fly to meet the User’s analytical needs.
Further, because of StellarAi’s centralized ‘Data Repository’ and its existing collaboration environment, Users were able to share data analysis with their colleagues and effectively collaborate with each other without having to physically share large amounts of data.
Due to the modular architecture of StellarAi, DriveTech was able to quickly roll out Dynamic Sampling, a new feature that effectively addressed the Client’s challenges. It ensured that the Client is able to:

-  Analyze multiple files together.
-  Avoid manually sampling or pre-processing the data.
-  Eliminate data duplication.
-  Share their work with their colleagues easily.