Researchers at the Agency for Science, Technology and Research (A*STAR) in Singapore have designed a new way to structure data that is robust against cyber-attacks and allows it to be processed in record time.
Cyber-security solutions must track all the data flowing through the networks to protect them from threats. However, it is difficult to design a solution that works fast enough to process all the exponentially growing volumes of information in real time, and to block threats before they can strike. The way that network traffic is tracked has a huge effect on the speed at which it can be analysed and checked for malicious activities.
The team’s work improves on widely used data structures called ‘hash tables’. A hash table stores each value at a specific location labelled with an index: a hash function computes the index into an array of buckets or slots, from which the desired value can then be retrieved.
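As a rough illustration of the idea, the sketch below shows a toy hash table in Python. It is not taken from the A*STAR study: the class name `SimpleHashTable`, the bucket count and the example flow identifier are all assumptions made purely for illustration.

```python
class SimpleHashTable:
    """Toy hash table using separate chaining (illustrative only)."""

    def __init__(self, num_buckets=8):
        # Each bucket is a list of (key, value) pairs.
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, key):
        # The hash function turns the key into a bucket index.
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing entry
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        for existing_key, value in self.buckets[self._index(key)]:
            if existing_key == key:
                return value
        return default


# Hypothetical flow identifier: (source address, destination address, port).
table = SimpleHashTable()
flow = ("10.0.0.1", "10.0.0.2", 443)
table.put(flow, 1500)   # bytes seen so far
table.put(flow, 3000)   # a later update replaces the stored value
```

A lookup for the same flow goes straight to one bucket rather than scanning every stored entry, which is what makes hash tables attractive for tracking traffic at speed.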
Dr. Vrizlynn Thing from A*STAR's Institute for Infocomm Research, who led the study, explained, “The challenges are that millions of values need to be stored, and the values are generated and transmitted extremely quickly.”
Traditional hash tables are becoming inefficient as the Internet grows and data flows get larger. Researchers have developed alternative hash-table designs known as Cuckoo and Peacock, but under attack these tables fill up quickly, which erodes their performance.
The new data structure developed by Thing and her team is called REX, which stands for Resilient and Efficient data Structure (X for structure). REX exploits certain inherent characteristics of Internet traffic. For example, it takes into account the ‘heavy-tail’ behaviour of data flows: a small number of large so-called ‘elephant flows’ account for a much larger share of the total traffic volume than the many small ‘mice flows’. Based on this fact, REX employs a hierarchy of sub-tables that increase in size from top to bottom. This structure effectively segregates the different types of flows.
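The article does not describe how REX decides which sub-table a flow belongs to, so the Python sketch below is only one hypothetical way a hierarchy of sub-tables that grow from top to bottom could separate elephant flows from mice flows. The class `TieredFlowTable`, the tier capacities and the byte thresholds for promotion are all assumptions, not details from the study.

```python
class TieredFlowTable:
    """Hypothetical hierarchy of sub-tables; tier 0 is the small top tier."""

    def __init__(self, capacities=(2, 8, 32), promote_bytes=(10_000, 1_000)):
        # capacities[i] is the maximum number of flows in tier i.
        # promote_bytes[i] is the byte count a flow needs to move
        # from tier i + 1 up to tier i (assumed policy).
        self.capacities = capacities
        self.promote_bytes = promote_bytes
        self.tiers = [dict() for _ in capacities]

    def tier_of(self, flow_id):
        for level, tier in enumerate(self.tiers):
            if flow_id in tier:
                return level
        return None

    def update(self, flow_id, num_bytes):
        level = self.tier_of(flow_id)
        if level is None:
            level = len(self.tiers) - 1  # new flows start in the large bottom tier
            if len(self.tiers[level]) >= self.capacities[level]:
                return False  # bottom tier full: this sketch simply drops the flow
            self.tiers[level][flow_id] = 0
        self.tiers[level][flow_id] += num_bytes
        # Promote the flow while it meets the threshold and there is room above.
        while level > 0:
            above = self.tiers[level - 1]
            too_small = self.tiers[level][flow_id] < self.promote_bytes[level - 1]
            if too_small or len(above) >= self.capacities[level - 1]:
                break
            above[flow_id] = self.tiers[level].pop(flow_id)
            level -= 1
        return True


flows = TieredFlowTable()
for _ in range(20):
    flows.update("elephant", 1_000)  # 20,000 bytes in total
flows.update("mouse", 200)
```

In this sketch the heavy flow climbs to the small top tier while the light flow stays in the large bottom tier, so the few flows that carry most of the traffic end up grouped together.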
REX also takes advantage of the different properties of the two main types of computer RAM (Random Access Memory). Its design uses both fast, expensive Static RAM (SRAM) and slower, cheaper Dynamic RAM (DRAM). The faster SRAM is used to process the few large, important flows, allowing fast tracking and frequent updates, while DRAM handles the low-priority flows in secondary sub-tables.
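A back-of-the-envelope calculation suggests why this split can pay off. The Python sketch below computes a weighted average lookup time for a two-tier memory; the function name `mean_lookup_ns`, the latencies of 2 ns and 50 ns, and the 80 per cent figure are illustrative assumptions only, not measurements from the study.

```python
def mean_lookup_ns(fast_fraction, fast_ns, slow_ns):
    """Average lookup time when a fraction of lookups hit the fast memory."""
    if not 0.0 <= fast_fraction <= 1.0:
        raise ValueError("fast_fraction must be between 0 and 1")
    return fast_fraction * fast_ns + (1.0 - fast_fraction) * slow_ns


# Assumed latencies: 2 ns for the fast tier, 50 ns for the slow tier.
all_slow = mean_lookup_ns(0.0, fast_ns=2.0, slow_ns=50.0)
# Assume the few elephant flows held in fast memory account for 80% of lookups.
mixed = mean_lookup_ns(0.8, fast_ns=2.0, slow_ns=50.0)
```

Under these assumed figures the average lookup time falls from 50 ns to roughly 11.6 ns, even though only a handful of flows occupy the fast memory.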
In tests using real recorded network traffic, REX was faster and more efficient at analysing data than Cuckoo and Peacock. The researchers plan to further investigate the efficiency and scalability of this new data structure for security analysis in larger scale environments.