

# Data collection
<a name="discovery-tool-data-collection"></a>

## Discovery tool collection schedule
<a name="discovery-tool-scheduling"></a>

After your initial discovery collection, the discovery tool continues to run on a staggered schedule to avoid resource contention:
+ VMware discovery – every hour (at :00 UTC)
+ Hyper-V discovery – every hour (at :20 UTC)

The discovery tool also collects OS metrics through the following independent modules, each with its own staggered schedule:
+ Database discovery – once a day
+ Network metrics – every 15 seconds (might be less frequent in large environments)
+ Server performance metrics – every 10 minutes (at :03, :13, :23, :33, :43, :53 UTC)
+ Storage performance metrics – every 10 minutes (at :07, :17, :27, :37, :47, :57 UTC)
+ Server provisioning data – daily (at 00:05 UTC)
+ Storage provisioning data – daily (at 00:35 UTC)
+ Network interfaces – daily (at 01:05 UTC)
+ Running processes – hourly (at :40 UTC)
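
To see how the staggered offsets above interleave, the following minimal sketch computes the next scheduled run for a module. The module keys and the `next_run` helper are hypothetical names used for illustration. Only the minute offsets come from the schedule above. The daily modules are left out for brevity.

```python
from datetime import datetime, timedelta, timezone

# Minute-of-hour offsets (UTC) copied from the schedule above.
# The dictionary keys are illustrative names, not identifiers used by the tool.
HOURLY_OFFSETS = {
    "vmware_discovery": [0],
    "hyperv_discovery": [20],
    "server_performance": [3, 13, 23, 33, 43, 53],
    "storage_performance": [7, 17, 27, 37, 47, 57],
    "running_processes": [40],
}


def next_run(module: str, now: datetime) -> datetime:
    """Return the first scheduled run strictly after `now` (UTC)."""
    base = now.replace(minute=0, second=0, microsecond=0)
    # Check the remaining slots in the current hour, then the next hour.
    for hour_shift in (0, 1):
        for minute in HOURLY_OFFSETS[module]:
            candidate = base + timedelta(hours=hour_shift, minutes=minute)
            if candidate > now:
                return candidate
    raise AssertionError("unreachable: every listed module runs at least hourly")


now = datetime(2024, 1, 1, 12, 25, tzinfo=timezone.utc)
print(next_run("server_performance", now))  # the next :33 slot
print(next_run("vmware_discovery", now))    # the next :00 slot
```

Because no two modules share a minute offset, their runs never start at the same time.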

You can start or stop each OS metrics module independently, or trigger an immediate collection by using **Collect data now**.

To control a module, choose one of the following from the **Actions** menu:
+ **Start** – Enables the discovery module.
+ **Stop** – Disables the discovery module.
+ **Collect data now** – Starts discovery immediately. Use this option, for example, after you make a change in your network.

Each action applies only to the module that you select.

### OS data collection attempts
<a name="discovery-tool-os-collection-attempts"></a>

When a new server is discovered, the discovery tool attempts each configured credential for each IP address and the hostname. After the discovery tool finds a valid credential, it continues to use that credential unless you add a new credential.

After a collection failure, the discovery tool attempts to collect networking data for a server after 3 minutes, 30 minutes, 2 hours, and then 6 hours. After 4 failed attempts, the discovery tool continues to try all configured credentials once every 6 hours.
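
The retry cadence described above can be sketched as a sequence of delays. This is an illustration only. It assumes that each delay is measured from the previous attempt, which the text does not state explicitly. The names `INITIAL_DELAYS`, `STEADY_DELAY`, and `retry_delays` are hypothetical.

```python
from datetime import timedelta
from itertools import chain, islice, repeat

# Delays from the text: 3 minutes, 30 minutes, 2 hours, and 6 hours,
# followed by one attempt every 6 hours.
INITIAL_DELAYS = [
    timedelta(minutes=3),
    timedelta(minutes=30),
    timedelta(hours=2),
    timedelta(hours=6),
]
STEADY_DELAY = timedelta(hours=6)


def retry_delays():
    """Yield the wait before each successive retry, indefinitely."""
    return chain(INITIAL_DELAYS, repeat(STEADY_DELAY))


# Assumption: each delay starts when the previous attempt fails.
first_six = list(islice(retry_delays(), 6))
elapsed = sum(first_six, timedelta())
print([str(delay) for delay in first_six])
print("time elapsed after six retries:", elapsed)
```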

## Discovered inventory
<a name="discovery-tool-inventory"></a>

After you configure a discovery source, the **Number of discovered servers** value in the **Discovery tool status** frame begins to increment. The discovery status for the configured source changes to **Enabled** in the **Collection module** frame. The inventory page shows servers from all configured sources: VMware VMs, Hyper-V VMs, and imported servers. Each server shows its source and collection status per module.

Navigate to the **Discovered inventory** page to see the servers that the discovery tool has found. From this page, choose **Download inventory** to download a ZIP file (`discovery_tool_export.zip`) that contains up to 30 days of collected data, including MPA files for all configured sources, performance utilization data, database information, and server-to-server communication information.

You can download the ZIP file while the discovery tool continues to work and obtain partial results. Upload this file to [Migration assessment](https://docs.aws.amazon.com/transform/latest/userguide/transform-app-assessments.html) to obtain a business case for migration.
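
Before you upload the export, you might want to check what a partial download contains. The following sketch lists the members of the ZIP file by using the Python standard library. It makes no assumption about the file names inside the archive. The `list_export_members` helper is a hypothetical name.

```python
import zipfile
from pathlib import Path


def list_export_members(source) -> list[str]:
    """Return the sorted names of all files in an export ZIP (path or file object)."""
    with zipfile.ZipFile(source) as archive:
        return sorted(archive.namelist())


export_path = Path("discovery_tool_export.zip")
if export_path.exists():
    for name in list_export_members(export_path):
        print(name)
else:
    print(f"{export_path} not found in the current directory")
```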

### Export options
<a name="discovery-tool-export-options"></a>

When exporting data, you can customize the export with the following options:

**Date range**

Select a start date and end date to export only data collected within that time period. Both dates are inclusive. The maximum date range is 30 days.

**Note**  
The discovery tool stores up to 30 days of collected data. If you need data spanning more than 30 days, run incremental exports every 30 days to capture all data.
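
As a planning aid for incremental exports, the following sketch splits a longer period into consecutive windows that each fit the export limit. It assumes that the 30-day maximum counts both the start date and the end date. The `export_windows` helper is a hypothetical name. Because the tool stores only 30 days of data, you still have to run each export before its window ages out.

```python
from datetime import date, timedelta

# Assumption: a 30-day range counts both the start date and the end date.
MAX_RANGE_DAYS = 30


def export_windows(start: date, end: date) -> list[tuple[date, date]]:
    """Split [start, end] into consecutive inclusive windows of at most 30 days."""
    windows = []
    cursor = start
    while cursor <= end:
        window_end = min(cursor + timedelta(days=MAX_RANGE_DAYS - 1), end)
        windows.append((cursor, window_end))
        cursor = window_end + timedelta(days=1)
    return windows


for window_start, window_end in export_windows(date(2024, 1, 1), date(2024, 3, 15)):
    print(window_start.isoformat(), "to", window_end.isoformat())
```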

**Module selection**

Choose which data modules to include in the export. You can export all modules or select specific ones:


| Module | Description | 
| --- | --- | 
| VMware data | Virtual machine inventory from vCenter servers | 
| Hyper-V data | Virtual machine inventory from Hyper-V hosts | 
| Network data | Network connections between servers | 
| Database data | SQL Server database inventory | 
| Server inventory | Server hardware and OS information | 
| Server performance metrics | CPU, memory, and network utilization | 
| Server storage performance | Disk IOPS and throughput | 
| Storage config | Disk and volume configuration | 
| Network interfaces | Network adapter details | 
| Process metrics | Running processes | 

If you don't select any modules, the discovery tool exports all available data.

### Data points collected
<a name="discovery-tool-data-points"></a>

The discovery tool gathers comprehensive data across VMware, Hyper-V, OS metrics, database, and network components. The following sections detail the specific data points collected for each component.

#### VMware data collection
<a name="discovery-tool-vmware-data"></a>

This table describes the VMware virtual machine information collected by the discovery tool:


| Name | Type | Category | Sample Value | 
| --- | --- | --- | --- | 
| vm\_name | String | VM Info | "w2k22-snmpd-v2-en-us-mssql-2022-testcase4-1" | 
| vm\_id | String | VM Info | "vm-30920" | 
| vm\_uuid | String | VM Info | "4201ecf8-cc44-ee7e-01da-34dfb2acf6c0" | 
| powerstate | String | VM Info | "poweredOn" | 
| host | String | VM Info | "esxi-70-node1.testlab.local" | 
| primary\_ip\_address | String | VM Info | "192.168.0.52" | 
| cpus | Integer | VM Info | 2 | 
| memory | Integer | VM Info | 4096 | 
| total\_disk\_capacity\_mib | Integer | VM Info | 32768 | 
| os\_according\_to\_the\_configuration\_file | String | VM Info | "Microsoft Windows Server 2016 or later (64-bit)" | 
| max\_cpu\_usage\_pct\_dec | Float | VM Performance | 79.33 | 
| avg\_cpu\_usage\_pct\_dec | Float | VM Performance | 45.06 | 
| max\_ram\_usage\_pct\_dec | Float | VM Performance | 63.99 | 
| avg\_ram\_utl\_pct\_dec | Float | VM Performance | 29.27 | 

#### Hyper-V data collection
<a name="discovery-tool-hyperv-data"></a>

This table describes the Hyper-V virtual machine information collected by the discovery tool:


| Name | Type | Category | Sample Value | 
| --- | --- | --- | --- | 
| vm\_name | String | VM Info | "win2022-hyperv-test-01" | 
| vm\_id | String | VM Info | "a1b2c3d4-e5f6-7890-abcd-ef1234567890" | 
| powerstate | String | VM Info | "Running" | 
| cpus | Integer | VM Info | 4 | 
| memory\_mb | Integer | VM Info | 8192 | 
| disk\_paths | String | Disk | "C:\\\\VMs\\\\disk1.vhdx" | 
| disk\_size\_gb | Float | Disk | 127.0 | 
| network\_adapters | String | Network | "00:15:5D:01:02:03" | 
| ip\_addresses | String | Network | "10.0.1.50" | 
| host\_name | String | Host | "hyperv-host-01.example.com" | 
| host\_os\_version | String | Host | "Windows Server 2022 Datacenter" | 
| cluster\_name | String | Host | "FailoverCluster01" | 
| hypervisor | String | VM Info | "Hyper-V" | 

#### Imported server data
<a name="discovery-tool-bare-metal-data"></a>

Imported servers are not auto-discovered. They are imported through a CSV file. The discovery tool does not collect hypervisor-level data for imported servers. Instead, it collects database, network, and OS metrics data by using the OS credentials associated with each server during import.

## Discovery tool's OS-related data
<a name="discovery-tool-os-data"></a>

### OS metrics data collection
<a name="discovery-tool-os-metrics-data"></a>

The discovery tool collects OS-level metrics from servers through SSH (Linux) and WinRM (Windows). Data is collected across six sub-modules and exported into six CSV files.

#### Server inventory (server\_inventory.csv)
<a name="discovery-tool-os-server-inventory"></a>

Combines server provisioning (hardware and OS configuration) with aggregated storage performance. Collected every 24 hours.


| Name | Type | Category | Sample Value | 
| --- | --- | --- | --- | 
| server\_id | String | Server Info | "vm-web-server-01" | 
| server\_name | String | Server Info | "web-server-01" | 
| resource\_type | String | Server Info | "virtual\_machine" | 
| power\_state | String | Server Info | "Running" | 
| os\_type | String | Server Info | "Linux" | 
| os\_name | String | Server Info | "Amazon Linux" | 
| os\_version | String | Server Info | "2023" | 
| primary\_hostname | String | Server Info | "web-server-01.example.com" | 
| primary\_ip\_address | String | Server Info | "10.0.2.101" | 
| netmask | String | Server Info | "255.255.255.0" | 
| total\_num\_network\_cards | Integer | Server Info | 2 | 
| total\_num\_disks | Integer | Server Info | 1 | 
| cpu\_count | Integer | Server Info | 4 | 
| total\_memory\_gb | Float | Server Info | 15.88 | 
| server\_uuid | String | Server Info | "4201ecf8-cc44-ee7e-01da-34dfb2acf6c0" | 
| smbios\_uuid | String | Server Info | "4201ecf8-cc44-ee7e-01da-34dfb2acf6c0" | 
| cluster\_name | String | Server Info | "production-cluster-01" | 
| hypervisor\_object\_id | String | Server Info | "vm-30920" | 
| hypervisor\_type | String | Server Info | "VMware" | 
| hypervisor\_version | String | Server Info | "8.0.0" | 
| hypervisor\_hostname | String | Server Info | "esxi-node1.example.com" | 
| hypervisor\_host\_id | String | Server Info | "host-1234" | 
| hypervisor\_id | String | Server Info | "4201ecf8-cc44-ee7e-01da-34dfb2acf6c0" | 
| disk\_read\_iops\_avg | Float | Storage Performance | 12.5 | 
| disk\_read\_iops\_peak | Float | Storage Performance | 245.0 | 
| disk\_write\_iops\_avg | Float | Storage Performance | 8.3 | 
| disk\_write\_iops\_peak | Float | Storage Performance | 180.0 | 
| disk\_total\_iops\_avg | Float | Storage Performance | 20.8 | 
| disk\_total\_iops\_peak | Float | Storage Performance | 425.0 | 
| disk\_read\_throughput\_avg\_mbps | Float | Storage Performance | 1.2 | 
| disk\_read\_throughput\_peak\_mbps | Float | Storage Performance | 24.5 | 
| disk\_write\_throughput\_avg\_mbps | Float | Storage Performance | 0.8 | 
| disk\_write\_throughput\_peak\_mbps | Float | Storage Performance | 18.0 | 
| disk\_total\_throughput\_avg\_mbps | Float | Storage Performance | 2.0 | 
| disk\_total\_throughput\_peak\_mbps | Float | Storage Performance | 42.5 | 

#### Server performance metrics (server\_performance\_metrics.csv)
<a name="discovery-tool-os-server-performance"></a>

CPU, memory, and network throughput utilization. Sampled every 10 minutes, aggregated over 30 days.


| Name | Type | Category | Sample Value | 
| --- | --- | --- | --- | 
| server\_id | String | Server Info | "vm-web-server-01" | 
| data\_source | String | Server Info | "OS" | 
| cpu\_utilization\_avg\_pct | Float | CPU | 45.06 | 
| cpu\_utilization\_peak\_pct | Float | CPU | 79.33 | 
| cpu\_count | Integer | CPU | 4 | 
| memory\_total\_gb | Float | Memory | 15.88 | 
| memory\_utilization\_avg\_pct | Float | Memory | 29.27 | 
| memory\_utilization\_peak\_pct | Float | Memory | 63.99 | 
| network\_in\_avg\_mbps | Float | Network | 0.52 | 
| network\_in\_peak\_mbps | Float | Network | 12.3 | 
| network\_out\_avg\_mbps | Float | Network | 0.31 | 
| network\_out\_peak\_mbps | Float | Network | 8.7 | 
| network\_total\_avg\_mbps | Float | Network | 0.83 | 
| network\_total\_peak\_mbps | Float | Network | 21.0 | 

#### Storage performance (server\_storage\_performance.csv)
<a name="discovery-tool-os-storage-performance"></a>

Per-volume disk I/O and space utilization. Sampled every 10 minutes, aggregated over 30 days.


| Name | Type | Category | Sample Value | 
| --- | --- | --- | --- | 
| server\_id | String | Server Info | "vm-web-server-01" | 
| data\_source | String | Server Info | "OS" | 
| disk\_volume\_id | String | Volume Info | "/dev/nvme0n1p1" | 
| disk\_mount\_point | String | Volume Info | "/" | 
| file\_system | String | Volume Info | "xfs" | 
| disk\_total\_gb | Float | Disk Space | 30.0 | 
| disk\_used\_gb | Float | Disk Space | 12.5 | 
| disk\_free\_gb | Float | Disk Space | 17.5 | 
| disk\_read\_iops\_avg | Float | Disk I/O | 12.5 | 
| disk\_read\_iops\_peak | Float | Disk I/O | 245.0 | 
| disk\_write\_iops\_avg | Float | Disk I/O | 8.3 | 
| disk\_write\_iops\_peak | Float | Disk I/O | 180.0 | 
| disk\_total\_iops\_avg | Float | Disk I/O | 20.8 | 
| disk\_total\_iops\_peak | Float | Disk I/O | 425.0 | 
| disk\_read\_throughput\_avg\_mbps | Float | Disk Throughput | 1.2 | 
| disk\_read\_throughput\_peak\_mbps | Float | Disk Throughput | 24.5 | 
| disk\_write\_throughput\_avg\_mbps | Float | Disk Throughput | 0.8 | 
| disk\_write\_throughput\_peak\_mbps | Float | Disk Throughput | 18.0 | 
| disk\_total\_throughput\_avg\_mbps | Float | Disk Throughput | 2.0 | 
| disk\_total\_throughput\_peak\_mbps | Float | Disk Throughput | 42.5 | 

#### Storage configuration (storage\_config.csv)
<a name="discovery-tool-os-storage-config"></a>

Physical disk hardware details. Collected every 24 hours.


| Name | Type | Category | Sample Value | 
| --- | --- | --- | --- | 
| server\_id | String | Server Info | "vm-web-server-01" | 
| disk\_controller\_id | String | Disk Info | "/dev/sda" | 
| vmdk\_vhd\_file\_name | String | Disk Info | "web-server-01.vmdk" | 
| disk\_volume\_type | String | Disk Info | "Virtual" | 
| disk\_provisioned\_gb | Float | Disk Info | 30.0 | 
| disk\_device\_type | String | Disk Info | "SCSI HDD" | 
| disk\_interface\_type | String | Disk Info | "SCSI" | 
| disk\_protocol | String | Disk Info | "LSI Logic SAS" | 

#### Network interfaces (network\_interfaces.csv)
<a name="discovery-tool-os-network-interfaces"></a>

Network adapter configuration. Collected every 24 hours.


| Name | Type | Category | Sample Value | 
| --- | --- | --- | --- | 
| server\_id | String | Server Info | "vm-web-server-01" | 
| interface\_name | String | Interface Info | "eth0" | 
| interface\_index | Integer | Interface Info | 2 | 
| mac\_address | String | Interface Info | "0A:1B:2C:3D:4E:5F" | 
| adapter\_type | String | Interface Info | "vmxnet3" | 
| virtual\_network\_name | String | Interface Info | "VM Network" | 
| virtual\_network\_id | String | Interface Info | "dvportgroup-1234" | 
| virtual\_switch | String | Interface Info | "vSwitch0" | 
| ipv4\_address | String | IP Config | "10.0.2.101" | 
| ipv4\_subnet\_mask | String | IP Config | "255.255.255.0" | 
| ipv4\_gateway | String | IP Config | "10.0.2.1" | 
| ipv6\_address | String | IP Config | "fe80::a1b:2cff:fe3d:4e5f" | 
| ipv6\_prefix\_length | Integer | IP Config | 64 | 
| ipv6\_gateway | String | IP Config | "fe80::1" | 
| dns\_servers | String | IP Config | "10.0.0.2" | 
| dhcp\_enabled | Boolean | IP Config | false | 
| interface\_status | String | Interface Info | "Up" | 
| vlan\_id | Integer | Interface Info | 100 | 
| is\_primary | Boolean | Interface Info | true | 

#### Running processes (process\_metrics.csv)
<a name="discovery-tool-os-running-processes"></a>

Snapshot of running processes. Collected every hour, deduplicated over 30 days.


| Name | Type | Category | Sample Value | 
| --- | --- | --- | --- | 
| server\_id | String | Server Info | "vm-web-server-01" | 
| process\_name | String | Process Info | "sshd" | 
| process\_id | Integer | Process Info | 1234 | 
| process\_command\_line | String | Process Info | "/usr/sbin/sshd -D" | 
| process\_user | String | Process Info | "root" | 

### Network collection
<a name="discovery-tool-network-collection"></a>

The Network collection module helps you discover dependencies among servers in your on-premises data center. This network data accelerates your migration planning by providing visibility into how applications communicate across servers.

This module collects network data for servers from all configured sources, including VMware, Hyper-V, and imported servers. It uses WinRM to collect data from Windows servers and uses SSH, SNMPv2, and SNMPv3 to collect data from Linux servers.

#### Network data collection
<a name="discovery-tool-network-data"></a>

The Network collection module captures TCP IPv4 connections in ESTABLISHED or TIME\_WAIT state between servers in your discovered inventory. A connection appears in the output only when both the source and target IP addresses belong to servers that the discovery tool has discovered or that you have imported. Connections to or from IP addresses outside your inventory — such as external services, cloud endpoints, or servers not yet added to the discovery tool — are not included.

This design focuses the network data on server-to-server dependencies within your environment, which is the information needed for application dependency mapping and migration wave planning.

These data points are collected for each connection:
+ Source IP, port, process ID, and process name
+ Target IP, port, process ID, and process name
+ State (ESTABLISHED or TIME\_WAIT)
+ Transport protocol (TCP)
+ IP version (IPv4)
+ Count (number of times this unique connection was observed)
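
The filtering and counting rules above can be sketched as follows. This is not the tool's implementation. The `Connection` type, the choice of fields that make a connection unique, and the sample addresses are assumptions made for illustration.

```python
from collections import Counter
from typing import NamedTuple


class Connection(NamedTuple):
    source_ip: str
    source_port: int
    target_ip: str
    target_port: int
    state: str


# Connection states that the text says are captured.
CAPTURED_STATES = {"ESTABLISHED", "TIME_WAIT"}


def count_inventory_connections(observations, inventory_ips):
    """Count observations whose state is captured and whose endpoints are both in inventory."""
    counts = Counter()
    for conn in observations:
        in_inventory = conn.source_ip in inventory_ips and conn.target_ip in inventory_ips
        if conn.state in CAPTURED_STATES and in_inventory:
            counts[conn] += 1
    return counts


inventory = {"10.0.2.101", "10.0.2.102"}
observed = [
    Connection("10.0.2.101", 49152, "10.0.2.102", 1433, "ESTABLISHED"),
    Connection("10.0.2.101", 49152, "10.0.2.102", 1433, "ESTABLISHED"),   # seen again
    Connection("10.0.2.101", 49153, "203.0.113.10", 443, "ESTABLISHED"),  # target outside inventory
    Connection("10.0.2.102", 49154, "10.0.2.101", 22, "CLOSE_WAIT"),      # state not captured
]
result = count_inventory_connections(observed, inventory)
for conn, count in result.items():
    print(conn, count)
```

In this sample, only the first connection is kept, with a count of 2. The other two observations are dropped because one endpoint is outside the inventory or the state is not captured.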

**Tip**  
To maximize the completeness of your network dependency map, configure all discovery sources (VMware, Hyper-V, and server CSV import) and add OS credentials before reviewing network data. The more servers in your inventory, the more connections the network module can capture.

#### Private address network collection
<a name="discovery-tool-private-address-collection"></a>

By default, the Network collection module only captures connections where both endpoints are servers in your discovered inventory. You can enable private address collection to also capture connections to and from RFC 1918 private IP addresses (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16) that are not in your inventory.
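
To check whether an address falls in one of the three RFC 1918 ranges listed above, you can use the Python `ipaddress` module. The sketch tests the ranges explicitly because the module's built-in `is_private` property also covers other reserved ranges, such as link-local addresses. The `is_rfc1918` helper is a hypothetical name.

```python
import ipaddress

# The three RFC 1918 ranges named in the text.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address: str) -> bool:
    """Return True if `address` is an IPv4 address inside an RFC 1918 range."""
    ip = ipaddress.ip_address(address)
    return ip.version == 4 and any(ip in network for network in RFC1918_NETWORKS)


for candidate in ["10.0.2.101", "172.31.255.255", "172.32.0.1", "169.254.1.1"]:
    print(candidate, is_rfc1918(candidate))
```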

**To start private address collection**

1. On the **Collector configuration** page, locate the **Collection modules** section.

1. Find **Network connections discovery** under **Application discovery**.

1. Open the **Actions** dropdown.

1. Choose **Start private address collection**.

1. If the module status was **Enabled**, it changes to **Enabled · Private Address on**.

To stop private address collection, open the **Actions** dropdown and choose **Stop private address collection**. Previously collected private address data is retained even after stopping.

Private address connections appear only in the full CSV export (inside the ZIP file), not in the MPA CSV files. If an IP address belongs to a server already in your inventory, it is always identified by its discovered server ID regardless of this setting.

This setting persists across restarts. You can start or stop private address collection at any time. Previously collected private address data is exported regardless of the current setting.

### Database collection
<a name="discovery-tool-database-collection"></a>

The Database collection module gathers database (SQL Server) information from Windows servers across all configured sources, including VMware, Hyper-V, and imported servers. The module uses the WinRM protocol to remotely connect to each Windows server and run PowerShell queries to get information about all installed SQL Server services (components) on the server by using WMI namespaces, registry, and file properties.

A SQL Server component is a specific service or feature instance installed as part of a SQL Server deployment on a Windows server. The discovery tool collects information about the Database Engine, Analysis Services, Reporting Services, and Integration Services components.

#### Database data collection
<a name="discovery-tool-database-data"></a>

The Database collection module gathers SQL Server component information. This table describes key database data points collected:


| Name | Type | Category | Sample Value | 
| --- | --- | --- | --- | 
| Engine Type | String | Component | sql\_server | 
| Is Engine Component | Boolean | Component | Y | 
| Status | String | Service | Running, Stopped, StartPending | 
| Version | String | Service | 2015.131.5026.0 | 
| Edition | String | Service | Developer Edition (64-bit) | 
| SQL Service Name | String | Service | MsDtsServer130, Mssql | 
| SQL Service Type | String | Service | SQL Server service, Integration Services service | 
| Instance Name | String | Instance | MSSQLSERVER | 
| Display Name | String | Service | SQL Server (MSSQLSERVER2017) | 
| Start Mode | String | Service | Automatic, Manual, Disabled | 
| Service Account Name | String | Service | NT Service\\MsDtsServer130 | 
| Is Clustered | Boolean | Configuration | N | 

**Note**  
The full format includes all service types. The MPA format includes only database engine components. Some fields might be unavailable, depending on the SQL service type and configuration.
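
The difference between the two formats can be sketched as a filter on the **Is Engine Component** field. This is an illustration under stated assumptions: the dictionary keys and the `Y`/`N` values mirror the table above, and the sample rows and the `engine_components_only` helper are hypothetical. The actual export layout might differ.

```python
# Illustrative rows; the keys and Y/N values mirror the table above.
full_rows = [
    {
        "SQL Service Type": "SQL Server service",
        "Instance Name": "MSSQLSERVER",
        "Is Engine Component": "Y",
    },
    {
        "SQL Service Type": "Integration Services service",
        "SQL Service Name": "MsDtsServer130",
        "Is Engine Component": "N",
    },
]


def engine_components_only(rows):
    """Keep only the rows flagged as database engine components."""
    return [row for row in rows if row.get("Is Engine Component") == "Y"]


mpa_rows = engine_components_only(full_rows)
print(len(full_rows), "rows in the full format;", len(mpa_rows), "after filtering")
```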