Flexense Data Management Software

File Classification Performance

DiskBoss is optimized for modern multi-core and multi-CPU systems and can classify files stored on multiple disks, directories or network shares in parallel using all CPUs installed in the computer. DiskBoss provides a number of performance optimization options that make it possible to tune file classification operations for user-specific hardware and storage configurations.

DiskBoss File Classification Performance Options

To customize the file classification performance options, open the file classification operation dialog, press the 'Options' button and select the 'Advanced' tab. The 'File Classification Threads' option controls how many parallel threads are used to classify files. The 'Directories Scanning Threads' option controls how many parallel threads are used to scan input disks, directories and network shares. In the 'Fault-Tolerant' directory scanning mode, DiskBoss uses an individual processing thread for each input disk, directory or network share, but limits the maximum number of parallel scanning threads to the specified value. In the 'High-Performance' directory scanning mode, DiskBoss always uses the specified number of parallel directory scanning threads, even when processing a single input disk, directory or network share.
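The difference between the two directory scanning modes can be sketched in a few lines of Python. This is only an illustration of the rule as described above, not DiskBoss code: the function name, the mode strings and the parameter names are hypothetical.

```python
def scanning_thread_count(mode: str, num_inputs: int, max_threads: int) -> int:
    """Estimate how many directory scanning threads run in parallel.

    mode        -- "fault-tolerant" or "high-performance" (hypothetical labels)
    num_inputs  -- number of input disks, directories or network shares
    max_threads -- value of the 'Directories Scanning Threads' option
    """
    if num_inputs < 1 or max_threads < 1:
        raise ValueError("num_inputs and max_threads must be at least 1")
    if mode == "fault-tolerant":
        # One thread per input, capped at the configured maximum.
        return min(num_inputs, max_threads)
    if mode == "high-performance":
        # The configured number of threads is used even for a single input.
        return max_threads
    raise ValueError(f"unknown mode: {mode!r}")


# A single network share with the option set to 8 threads:
print(scanning_thread_count("fault-tolerant", 1, 8))
print(scanning_thread_count("high-performance", 1, 8))
```

With a single input, the fault-tolerant rule yields one scanning thread, while the high-performance rule yields the full configured number, which is why the latter matters most when only one disk or share is being processed.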

NVMe SSD Disk File Classification Performance

For example, when classifying files stored on a high-speed NVMe SSD disk, DiskBoss reaches up to 63,000 files/sec using a single file classification thread. With two parallel file classification threads, the performance scales up to 98,000 files/sec, and with four parallel file classification threads, the performance increases up to 123,000 files/sec, showing good multi-threaded performance scalability. With six processing threads, the file classification performance reaches up to 127,000 files/sec, and with eight processing threads, the performance increases up to 129,000 files/sec, which makes it possible to classify and categorize 10 million files within two minutes.
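The arithmetic behind these figures can be checked with a short Python sketch. The throughput values are the ones quoted above; the helper names are hypothetical, and "efficiency" here simply means speedup divided by thread count.

```python
# Reported NVMe throughput (files/sec) by number of classification threads.
nvme_rates = {1: 63_000, 2: 98_000, 4: 123_000, 6: 127_000, 8: 129_000}


def scaling(rates: dict[int, int]) -> dict[int, tuple[float, float]]:
    """Return {threads: (speedup vs. one thread, speedup per thread)}."""
    base = rates[1]
    return {n: (r / base, r / base / n) for n, r in rates.items()}


def seconds_to_classify(num_files: int, files_per_sec: int) -> float:
    """Time needed to classify num_files at a sustained rate."""
    return num_files / files_per_sec


for threads, (speedup, efficiency) in scaling(nvme_rates).items():
    print(f"{threads} threads: {speedup:.2f}x speedup, {efficiency:.0%} efficiency")

elapsed = seconds_to_classify(10_000_000, nvme_rates[8])
print(f"10 million files at 8 threads: about {elapsed:.0f} seconds")
```

At the eight-thread rate, 10 million files take roughly 78 seconds, which is consistent with the "within two minutes" statement. The speedup figures also show that most of the gain is reached by four threads.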

When classifying files stored on regular SATA SSD drives, which are significantly slower than NVMe SSD drives, the performance of the file classification process reaches up to 24,000 files/sec using a single file classification thread and scales up to 69,000 files/sec with four parallel file classification threads. With eight parallel file classification threads, the performance reaches up to 98,000 files/sec, which makes it possible to classify and categorize huge numbers of files relatively fast.

SATA SSD Disk File Classification Performance

Classifying files stored on a NAS storage device via a network is more complicated because the user needs to take into account the speed and the latency of the network. If the computer on which DiskBoss is installed is connected to the NAS storage device via a high-speed, low-latency network, the performance of the file classification operations may reach up to 12,000 files/sec with one file classification thread, scale up to 49,000 files/sec with four parallel file classification threads and increase up to 82,000 files/sec with eight parallel file classification threads.

NAS Server File Classification Performance

On the other hand, if DiskBoss needs to access network shares via the Internet or via a long-distance, high-latency network, the performance of the file classification operations will be relatively slow. One way to increase the performance of the file classification operations in such configurations is to select the 'High-Performance' directory scanning mode and increase the number of parallel directory scanning threads to 16 or even 32, regardless of how many CPUs are actually installed in the computer.
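Why more threads than CPUs can help on a high-latency network can be illustrated with a small Python simulation. This is a generic sketch, not DiskBoss's implementation: the 20 ms latency, the path names and the `list_remote_directory` function are all assumptions made for the example, and a sleep stands in for the network round trip.

```python
import time
from concurrent.futures import ThreadPoolExecutor

SIMULATED_LATENCY_SEC = 0.02  # assumed delay per remote directory listing


def list_remote_directory(path: str) -> str:
    """Stand-in for a remote listing: the thread waits, the CPU stays idle."""
    time.sleep(SIMULATED_LATENCY_SEC)
    return path


def scan(paths: list[str], num_threads: int) -> float:
    """Scan all paths with num_threads workers and return elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        scanned = list(pool.map(list_remote_directory, paths))
    assert len(scanned) == len(paths)
    return time.perf_counter() - start


paths = [f"share/dir{i}" for i in range(32)]  # hypothetical directories
for threads in (1, 4, 16):
    print(f"{threads:>2} threads: {scan(paths, threads):.2f} s")
```

Because each request spends most of its time waiting rather than computing, the elapsed time in this simulation drops roughly in proportion to the number of threads, independent of the CPU count.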

Modern USB flash drives provide plenty of storage space and are reasonably fast, making it possible to store vast amounts of data for backup purposes. Sometimes, it may be necessary to classify and categorize files on a USB flash drive in order to free up disk space. When classifying files stored on a USB flash drive, DiskBoss can reach up to 15,000 files/sec with a single file classification thread. With two parallel file classification threads, the performance increases up to 18,000 files/sec, but using more than two file classification threads will slightly degrade the performance of the file classification operations.

USB Flash Drive File Classification Performance

Modern IT environments widely deploy virtual servers and/or virtual workstations. Most of the popular virtualization platforms provide a high level of performance, but some performance degradation is inevitable when a file classification operation is executed on a guest virtual machine compared to the same file classification operation executed directly on the host computer.

Virtual Machine File Classification Performance

For example, when a virtual machine with 4 virtual CPUs is stored on an NVMe SSD disk and classifies files stored on a virtual local disk drive, which is physically stored on the same NVMe SSD disk, the performance of the file classification operations reaches up to 25,000 files/sec using a single file classification thread. With two parallel file classification threads, the performance scales up to 37,000 files/sec, and with four parallel file classification threads, the performance increases up to 48,000 files/sec, which makes it possible to classify and categorize huge numbers of files relatively fast.
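To put the virtualization overhead in perspective, the virtual machine figures can be compared with the NVMe SSD figures quoted earlier for one, two and four threads. The Python sketch below assumes that both sets of numbers were measured on comparable hardware, which the text does not state, so the ratios are indicative only. The four-thread virtual machine value is read as files/sec.

```python
# Reported throughput (files/sec) by number of classification threads.
host_nvme = {1: 63_000, 2: 98_000, 4: 123_000}
guest_vm = {1: 25_000, 2: 37_000, 4: 48_000}


def guest_to_host_ratio(guest: dict[int, int], host: dict[int, int]) -> dict[int, float]:
    """Return guest throughput as a fraction of host throughput per thread count."""
    return {n: guest[n] / host[n] for n in guest if n in host}


for threads, ratio in guest_to_host_ratio(guest_vm, host_nvme).items():
    print(f"{threads} threads: guest reaches {ratio:.0%} of the host rate")
```

Under that assumption, the guest reaches somewhat under half of the host rate at each of the three thread counts.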