About Multiple Archive Paths

The multiple archive paths feature extends the ability to configure a default archive path for each data store.

This feature enables you to automatically store your archives in different locations based on the age of the archive. It is useful when you split the data for a data store across multiple hard drives, primarily to optimize performance and cost. It can also help you work within disk space constraints.

For instance, in a performance and cost optimization scenario, the most recent data could reside on a fast/expensive NVMe or SSD drive, intermediate data on a slower/cheaper local HDD, and older data on even slower/cheaper long-term storage (for example, NAS or cloud). Each drive must be mapped to the Historian server node as a drive or folder and must be accessible by the DataArchiver service. For example, it could be a shared drive such as \\Server\HistorianData mapped to H:\HistorianData. The physical location of the drive does not matter as long as the DataArchiver service can access it.

Note: There are many sources of drives and various methods to map them. The Historian team has not tested all possibilities and cannot guarantee compatibility with every configuration. However, if the drive can be accessed (with read and write permissions) by the DataArchiver service, it will likely function properly.
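
The read and write requirement in the note above can be checked with a small probe before you configure a path. The following is a minimal sketch, not a Historian tool: the function name is_read_write_accessible is hypothetical, and the result is only meaningful if the script runs under the same account as the DataArchiver service.

```python
import tempfile


def is_read_write_accessible(directory):
    """Return True if a temporary file can be created, written, and read back
    in `directory` by the account running this script."""
    try:
        # The probe file is removed automatically when the block exits.
        with tempfile.TemporaryFile(dir=directory) as probe:
            probe.write(b"probe")
            probe.seek(0)
            return probe.read() == b"probe"
    except OSError:
        # Covers a missing directory, denied permissions, or an unreachable share.
        return False


if __name__ == "__main__":
    # Example path taken from the text above; adjust it to your own mapping.
    print(is_read_write_accessible(r"H:\HistorianData"))
```

A True result only shows that the probing account can read and write in the directory; it does not guarantee compatibility with every drive type.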

About Multiple Archive Path Configuration

Multiple archive paths are configured on a per-data store basis. Each data store can have a different configuration, so you can decide which data stores use multiple archive paths. When configuring the archive paths, you specify a path and an age.

Where,

The age defines how far in the past the archive end time must be before the archive can be moved.

The path is the directory to which the archive is moved.

Once the end time of the archive becomes older than the current time minus the age, the archive is moved. Because moving an archive can be a slow process, the archive must not be modified during the move. Therefore, only archives that have become read-only can be moved. Archives become read-only when their end time is older than the Data is Read Only After (Hours) property, so be sure to set each age to a value greater than this property. If you set an age smaller than this property, the value of the property is used instead.
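
The eligibility rule described above can be sketched as follows. This is an illustration only, not the DataArchiver implementation: the function and parameter names are hypothetical, and the values in the example (a 744-hour read-only period and a 2232-hour age) are assumptions.

```python
from datetime import datetime, timedelta


def effective_age_hours(configured_age_hours, read_only_after_hours):
    """An age smaller than the read-only period is replaced by that period."""
    return max(configured_age_hours, read_only_after_hours)


def is_eligible_for_move(archive_end, now, configured_age_hours, read_only_after_hours):
    """True once the archive end time is older than (now - effective age)."""
    age = effective_age_hours(configured_age_hours, read_only_after_hours)
    return archive_end < now - timedelta(hours=age)


if __name__ == "__main__":
    now = datetime(2024, 6, 1)
    # Archive that ended 3000 hours ago, age 2232 hours, read-only after 744 hours.
    print(is_eligible_for_move(now - timedelta(hours=3000), now, 2232, 744))  # True
    # An age of 100 hours is raised to 744, so a 500-hour-old archive stays put.
    print(is_eligible_for_move(now - timedelta(hours=500), now, 100, 744))  # False
```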

The duration of the move depends on the size of the archives and the speed of the source and destination disks. During the move, the source archive is still available for reading. Once the move has completed successfully, the source archive is deleted.

A move is stopped in the following scenarios:

  • If a move failure occurs: The move is stopped, and an appropriate message is logged in the log file. The move is retried later.
  • If the system shuts down during the move: The move is stopped. Because archives are moved and deleted during this process, it is important to have proper backups of the system in case errors occur.

Note: When configuring archive paths in a Mirror setup, the configured paths must exist on all mirror nodes.

Example of Multiple Archive Paths

Consider that your system setup includes the following:

  • A fast NVMe drive (N:), though not very large.
  • A slow HDD (H:), which is larger than the fast NVMe drive (N:).
  • A remote NAS drive mapped to (Z:), the largest of the three drives.
    Note: All three drives use the \Proficy Historian Data\Archives directory.

Your system design requirement is to keep your frequently used data on the NVMe drive for the best performance. NVMe drives offer much better read and write performance than traditional HDDs. Based on your data access needs, data storage rate, and the size of the NVMe drive, you decide to keep three months of data there. In the same way, you decide to keep the next nine months of data on the HDD. Data older than that goes to the NAS drive. For simplicity, assume that you use daily archives. Finally, you decide that data becomes read-only after one month.

Before you configure multiple archive paths, know the following:

  • The multiple archive paths feature is available for any data store other than the System data store.
  • Moving archives can be helpful, but if it is not managed well, it can strain disk resources and impact system performance. Although the archive move is designed not to impact system performance, it is recommended that you move archives in smaller sizes and check that your configuration does not affect system performance.
  • Archive performance, particularly read performance, is an important consideration. When data is moved to slower drives, slower read performance is expected, but this is acceptable only for data on those slower paths. Data reads on faster paths should remain unaffected. To manage this, Historian treats all paths except the default one as "slower" locations. However, it is not always possible to determine which archives a query needs until after the query is executed.

    For instance, queries like "Raw By Number" reading backwards could require access to any archive in the past. However, queries with specific time ranges reveal which archives will be necessary. If the system identifies that archives on these "slower" drives are required, the query will be executed at a lower priority using Low Priority Read Threads. The configured number of read threads is divided into three categories: High, Medium, and Low, each with its specific number of threads. Additionally, each category can utilize lower priority threads if necessary.

    For example, a high priority query initially uses a High priority thread but can use a Medium or Low priority thread if all high priority threads are busy. The same flexibility applies to Medium priority queries. It is important to note that thread priority does not affect Windows thread priority; all threads run at the same Windows priority level, so executing queries on Low Priority threads does not affect their performance. Instead, it operates as a thread reservation system. By identifying and assigning "slow" queries to Low priority threads, High and Medium threads remain available for other queries. Generally, most queries are Medium priority unless programmatically changed. You must determine the appropriate number of read threads to configure to ensure adequate resources are provided.
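
The thread reservation behavior described above can be sketched as a simple counter model. This is a minimal illustration under stated assumptions, not Historian code: the class ReadThreadReservation, the function effective_priority, and the thread counts are all hypothetical.

```python
PRIORITIES = ["high", "medium", "low"]  # ordered from highest to lowest


class ReadThreadReservation:
    """Tracks free read threads per category (a reservation model, not OS priority)."""

    def __init__(self, high, medium, low):
        self.free = {"high": high, "medium": medium, "low": low}

    def acquire(self, query_priority):
        """Take a thread from the query's own category, else from a lower one.
        Returns the category used, or None if no suitable thread is free."""
        start = PRIORITIES.index(query_priority)
        for category in PRIORITIES[start:]:
            if self.free[category] > 0:
                self.free[category] -= 1
                return category
        return None

    def release(self, category):
        """Return a thread to its category when the query finishes."""
        self.free[category] += 1


def effective_priority(requested, needs_slow_archives):
    """Queries known to need archives on non-default paths run as low priority."""
    return "low" if needs_slow_archives else requested


if __name__ == "__main__":
    pool = ReadThreadReservation(high=1, medium=1, low=1)
    print(pool.acquire("high"))  # high
    print(pool.acquire("high"))  # medium (all high priority threads are busy)
    print(pool.acquire(effective_priority("medium", needs_slow_archives=True)))  # low
```

In this model a low priority query can never occupy a High or Medium thread, which is how queries against "slower" paths are kept from starving other queries.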

Example Configuration Procedure

  1. Configure the default path. This can be done in the same way as you do for data stores. In this example, this is set to N:\Proficy Historian Data\Archives.
  2. Configure the archive locations based on the time.
    Time Offset (Hours)    Path
    2232 (3 months)        H:\Proficy Historian Data\Archives
    8760 (1 year)          Z:\Proficy Historian Data\Archives

Based on the configuration, the DataArchiver periodically checks whether any archives need to be moved. Only archives that are closed (that is, are time-based or have a fixed end time) and read-only are checked. The DataArchiver compares the archive’s end time with each configured age, from smallest to largest. The largest age that the archive’s end time exceeds determines the directory to which the archive is moved.

So, based on this example, the following will happen:

  1. The newly created archive goes into the default path, that is, N:\Proficy Historian Data\Archives. The archive remains writable there for one month. At that point, it becomes read-only, as configured.
  2. When the archive is three months (2232 hours) old, the DataArchiver moves it to H:\Proficy Historian Data\Archives, because its end time is older than the configured 2232 hours but not older than the 8760 hours. If the move is successful, the original archive is deleted.
  3. When the archive is older than 8760 hours, the process repeats and the archive is moved to Z:\Proficy Historian Data\Archives. It stays there unless the configuration is changed. If, for example, the 8760 is changed to 16520, an archive that is 10000 hours old is moved back to H:\Proficy Historian Data\Archives.
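
The outcome of this example can be reproduced with a short sketch. It follows the behavior shown in the steps above (the archive goes to the path of the largest time offset it has passed) and is not the DataArchiver's actual code. The function target_path is hypothetical, and the 744-hour read-only period is an assumed value for "one month".

```python
DEFAULT_PATH = r"N:\Proficy Historian Data\Archives"
H_PATH = r"H:\Proficy Historian Data\Archives"
Z_PATH = r"Z:\Proficy Historian Data\Archives"

READ_ONLY_AFTER_HOURS = 744  # assumed: 31 days * 24 hours

# (time offset in hours, path); 2232 = 93 days * 24, 8760 = 365 days * 24
RULES = [(2232, H_PATH), (8760, Z_PATH)]


def target_path(archive_age_hours, rules, default_path, read_only_after_hours):
    """Return the directory where an archive of the given age belongs."""
    target = default_path
    for offset, path in sorted(rules):
        # An offset below the read-only period is replaced by that period.
        effective = max(offset, read_only_after_hours)
        if archive_age_hours > effective:
            target = path  # keep the path of the largest offset passed so far
    return target


if __name__ == "__main__":
    for age in (100, 3000, 10000):
        print(age, target_path(age, RULES, DEFAULT_PATH, READ_ONLY_AFTER_HOURS))
    # After changing 8760 to 16520, a 10000-hour-old archive belongs on H: again.
    changed = [(2232, H_PATH), (16520, Z_PATH)]
    print(10000, target_path(10000, changed, DEFAULT_PATH, READ_ONLY_AFTER_HOURS))
```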