DiskBoss is optimized for modern multi-core and multi-CPU systems and is capable of searching duplicate files stored on multiple disks,
directories or network shares in parallel using all CPUs installed in the computer. DiskBoss provides a number of different performance
optimization options allowing one to tune the duplicate files search operations for user-specific hardware and storage configurations.
In order to customize the duplicate files search performance optimization options, open the duplicate files search operation dialog,
press the 'Options' button and select the 'Advanced' tab. The 'Dup Files Search Threads' option controls how many parallel threads
are used to search duplicate files. The 'Directories Scanning Threads' option controls how many parallel threads are used to scan
input disks, directories and network shares. In the 'Fault-Tolerant' directory scanning mode, DiskBoss uses an individual processing
thread for each input disk, directory or network share, but limits the maximum number of parallel scanning threads to the specified
value. In the 'High-Performance' directory scanning mode, DiskBoss always uses the specified number of parallel directory scanning
threads, even when processing a single input disk, directory or network share.
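
The difference between the two directory scanning modes can be sketched as a small function. This is only an illustration of the rule
described above, not DiskBoss code: the function name, the mode strings and the parameters are hypothetical.

```python
def effective_scan_threads(mode: str, input_count: int, max_threads: int) -> int:
    """Return how many directory scanning threads would run in parallel.

    Hypothetical helper: it only restates the rule from the text.
    """
    if input_count < 1 or max_threads < 1:
        raise ValueError("input_count and max_threads must be positive")
    if mode == "fault-tolerant":
        # One thread per input disk, directory or share, capped at the configured value.
        return min(input_count, max_threads)
    if mode == "high-performance":
        # Always the configured value, even for a single input.
        return max_threads
    raise ValueError(f"unknown mode: {mode}")


# A single input directory with 'Directories Scanning Threads' set to 8:
print(effective_scan_threads("fault-tolerant", 1, 8))    # 1
print(effective_scan_threads("high-performance", 1, 8))  # 8
```

Under this reading, the 'Fault-Tolerant' mode only benefits from a higher thread setting when several inputs are scanned at once.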
For example, when searching duplicate files stored on a high-speed NVMe SSD disk, DiskBoss reaches up to 3,000 files/sec using
a single search thread. With two parallel search threads, the performance scales up to 6,000 files/sec and with four parallel
search threads, the performance increases up to 11,000 files/sec showing a very good level of multi-threaded performance scalability.
With six processing threads the duplicate files search performance reaches up to 14,000 files/sec and with eight processing threads
the performance increases up to 16,000 files/sec allowing one to quickly process large numbers of files and identify how many files
are duplicates and how much duplicate disk space these files are using.
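
To put the NVMe figures above in perspective, the following sketch computes the speedup relative to a single thread and the parallel
efficiency (speedup divided by thread count). The rates are the "up to" values quoted above; the function and variable names are illustrative.

```python
# Peak rates quoted in the text for an NVMe SSD: threads -> files/sec.
NVME_FILES_PER_SEC = {1: 3000, 2: 6000, 4: 11000, 6: 14000, 8: 16000}


def scaling_table(rates):
    """Return (threads, speedup, efficiency) rows relative to the 1-thread rate."""
    base = rates[1]
    rows = []
    for threads in sorted(rates):
        speedup = rates[threads] / base
        efficiency = speedup / threads
        rows.append((threads, round(speedup, 2), round(efficiency, 2)))
    return rows


for threads, speedup, efficiency in scaling_table(NVME_FILES_PER_SEC):
    print(f"{threads} threads: {speedup:.2f}x speedup, {efficiency:.0%} efficiency")
```

With these numbers, efficiency stays at 100% with two threads, is about 92% with four and drops to about 67% with eight, so each
additional thread still helps but contributes less.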
When searching duplicate files stored on regular SATA SSD drives, which are significantly slower than NVMe SSD drives, the performance
of the duplicate files search process reaches up to 2,000 files/sec using a single search thread and scales up to 4,400 files/sec with
four parallel duplicate files search threads. With eight parallel threads, the performance reaches up to 5,900 files/sec, which allows
one to process large numbers of files relatively quickly.
Searching duplicate files stored on a NAS storage device via a network is a more complicated task because the user needs to take into account
the speed and the latency of the network. If the computer on which DiskBoss is installed is connected to the NAS storage device via a high-speed,
low-latency network, the performance of the duplicate files search operations may reach up to 200 files/sec with one duplicate files search thread,
scale up to 684 files/sec with four parallel search threads and increase up to 1,057 files/sec with eight parallel duplicate files search threads.
On the other hand, if DiskBoss needs to access network shares via the Internet or via a long-distance, high-latency network, the performance
of the duplicate files search operations will be relatively slow. One of the options to increase the performance of the duplicate files search
operations in such configurations is to set the 'High-Performance' directory scanning mode and increase the number of parallel duplicate files
search threads to 16 or even 32, regardless of how many CPUs are actually installed in the computer.
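
A rough way to see why thread counts beyond the number of CPUs can help on a high-latency link: if threads spend most of their time
waiting for network round trips, the achievable rate is bounded by the number of outstanding requests rather than by CPU capacity.
The sketch below is a simplified model under that assumption, not a description of how DiskBoss works internally; the function name
and the 50 ms round-trip time are hypothetical.

```python
def latency_bound_rate(threads: int, round_trip_sec: float,
                       round_trips_per_file: int = 1) -> float:
    """Upper bound on files/sec when every thread is mostly waiting on the network.

    Simplified model: ignores bandwidth limits, CPU time and server-side limits.
    """
    if threads < 1 or round_trip_sec <= 0 or round_trips_per_file < 1:
        raise ValueError("all arguments must be positive")
    return threads / (round_trip_sec * round_trips_per_file)


# Hypothetical 50 ms round trip, one round trip per file.
for threads in (4, 16, 32):
    rate = latency_bound_rate(threads, round_trip_sec=0.05)
    print(f"{threads} threads: at most {rate:.0f} files/sec")
```

In this model the bound grows linearly with the thread count, which is why 16 or 32 threads can be worthwhile even on a computer
with only a few CPUs. Real results will flatten out once bandwidth or the remote server becomes the limit.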
Searching duplicate files stored in one or more NAS servers may be a very time-consuming operation, and one of the ways to speed up the duplicate
files search process is to use a 2.5 Gigabit Ethernet network. With 2.5 Gigabit Ethernet, the performance of the DiskBoss duplicate files search
operations continues to scale up to 3,800 files/sec with eight parallel duplicate files search threads, which represents a 69% improvement compared
to standard Gigabit Ethernet.
Due to a very wide adoption of laptops and NAS servers with built-in WiFi network interfaces, many users may consider searching duplicate files
stored in NAS servers via a wireless network. However, the latency of a wireless network is much higher, and therefore it takes much more time
to complete the duplicate files search operation via a wireless network. The question is how much longer the user will need to wait and whether
searching duplicate files via a wired network would save a significant amount of time.
Based on our benchmarks, via a 5 GHz wireless network, DiskBoss reaches up to 54 files/sec with a single duplicate files search thread and scales
up to 400 files/sec with eight parallel duplicate files search threads, which is approximately 6 times slower than standard Gigabit Ethernet
and approximately 10 times slower than 2.5 Gigabit Ethernet. So, if the user needs to search duplicate files in a NAS server
with 100,000 files or more, a low-latency Gigabit Ethernet or 2.5 Gigabit Ethernet network is required.
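
As a back-of-the-envelope check, the time needed to process a given number of files can be estimated by dividing the file count by the
throughput. The sketch below uses the peak eight-thread rates quoted above and assumes, optimistically, that the peak rate is sustained for
the whole operation; the function and variable names are illustrative.

```python
def scan_seconds(file_count: int, files_per_sec: float) -> float:
    """Estimated duration, assuming the given rate is sustained throughout."""
    if file_count < 0 or files_per_sec <= 0:
        raise ValueError("file_count must be >= 0 and files_per_sec must be > 0")
    return file_count / files_per_sec


# Peak eight-thread rates quoted in the text, in files/sec.
PEAK_RATES = {"5 GHz wireless": 400, "2.5 Gigabit Ethernet": 3800}

for link, rate in PEAK_RATES.items():
    print(f"{link}: about {scan_seconds(100_000, rate):.0f} s for 100,000 files")
```

The gap widens proportionally with the file count, and real operations are likely to run below the quoted peak rates.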
Modern USB flash drives provide plenty of storage space and are reasonably fast, allowing one to store vast amounts of data for backup purposes.
Sometimes, it may be required to search duplicate files on a USB flash drive in order to free up disk space. When searching duplicate files
stored on a USB flash drive, DiskBoss can reach up to 357 files/sec with a single search thread. With two parallel search threads, the performance
increases up to 644 files/sec, with four parallel threads the performance increases up to 1,008 files/sec and with eight parallel duplicate files
search threads the performance scales up to 1,174 files/sec.
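
One way to decide how many threads are worth configuring is to look at the gain per additional thread between successive measurements.
The sketch below does this for the USB flash drive figures quoted above; the function and variable names are illustrative.

```python
# (threads, files/sec) pairs quoted in the text for a USB flash drive.
USB_FILES_PER_SEC = [(1, 357), (2, 644), (4, 1008), (8, 1174)]


def marginal_gains(points):
    """Return (threads, extra files/sec per added thread) for each step up."""
    gains = []
    for (t0, r0), (t1, r1) in zip(points, points[1:]):
        gains.append((t1, (r1 - r0) / (t1 - t0)))
    return gains


for threads, gain in marginal_gains(USB_FILES_PER_SEC):
    print(f"up to {threads} threads: +{gain:.1f} files/sec per added thread")
```

With these figures, each added thread contributes less than the one before, so going from four to eight threads yields only a small
improvement on this kind of device.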
Modern IT environments widely deploy virtual servers and/or virtual workstations. Most of the popular virtualization platforms provide a high
level of performance, but some performance degradation is inevitable, and depending on the target hardware and software platforms it may be significant, when a duplicate
files search operation is executed on a guest virtual machine compared to the same duplicate files search operation executed directly on the host computer.
For example, when a virtual machine with 4 virtual CPUs is stored on an SSD disk and searches duplicate files stored on a virtual local disk drive
that is physically stored on the same SSD disk, the performance of the duplicate files search operations reaches up to 223 files/sec using a single
search thread. With two parallel search threads, the performance scales up to 443 files/sec and with four parallel search threads,
the performance increases up to 770 files/sec.