Using Aggregate to perform summary calculations on selected fields

To use the Aggregate transform
  1. Add the Aggregate node to the job diagram.

  2. On the Node properties tab, optionally choose the fields to group by from the drop-down field. You can select more than one field at a time, or search for a field name by typing in the search bar.

    When fields are selected, the name and datatype are shown. To remove a field, choose 'X' on the field.

    The screenshot shows the Transform tab for the Aggregate node.
  3. Choose Aggregate another column. You must select at least one field to aggregate.

    The screenshot shows the fields when choosing Aggregate another column.
  4. Choose a field in the Field to aggregate drop-down.

  5. Choose the aggregation function to apply to the chosen field:

    • avg - calculates the average

    • countDistinct - calculates the number of unique non-null values

    • count - calculates the number of non-null values

    • first - returns the first value that satisfies the 'group by' criteria

    • last - returns the last value that satisfies the 'group by' criteria

    • kurtosis - calculates the sharpness of the peak of a frequency-distribution curve

    • max - returns the highest value that satisfies the 'group by' criteria

    • min - returns the lowest value that satisfies the 'group by' criteria

    • skewness - measures the asymmetry of the distribution of values about its mean

    • stddev_pop - calculates the population standard deviation (the square root of the population variance)

    • sum - the sum of all values in the group

    • sumDistinct - the sum of distinct values in the group

    • var_samp - the sample variance of the group (ignores nulls)

    • var_pop - the population variance of the group (ignores nulls)
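To make the difference between some of these functions concrete, the following is a minimal sketch in plain Python (standard library only). It is not the code that AWS Glue generates or runs; it only imitates, on a small in-memory list, what grouping by one field and aggregating another computes. The sample rows, the field names `region` and `amount`, and the helper `aggregate` are all hypothetical. `first`, `last`, `kurtosis`, and `skewness` are omitted to keep the sketch short.

```python
from collections import defaultdict
from statistics import mean, pstdev, pvariance, variance

# Hypothetical sample rows: (region, amount). None stands for a null value.
rows = [
    ("east", 10.0),
    ("east", 20.0),
    ("east", 20.0),
    ("east", None),
    ("west", 5.0),
    ("west", 7.0),
]


def aggregate(rows):
    """Group rows by the first field and summarize the second field."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)

    summary = {}
    for key, values in groups.items():
        # This sketch drops nulls before every calculation.
        non_null = [v for v in values if v is not None]
        summary[key] = {
            "count": len(non_null),               # non-null values
            "countDistinct": len(set(non_null)),  # unique non-null values
            "sum": sum(non_null),
            "sumDistinct": sum(set(non_null)),    # each distinct value once
            "avg": mean(non_null),
            "min": min(non_null),
            "max": max(non_null),
            "var_pop": pvariance(non_null),       # divides by n
            "var_samp": variance(non_null),       # divides by n - 1
            "stddev_pop": pstdev(non_null),       # sqrt of var_pop
        }
    return summary


result = aggregate(rows)
for region, stats in result.items():
    print(region, stats)
```

In this sample, the `east` group has three non-null values but only two distinct ones, so `count` and `countDistinct` differ, as do `sum` and `sumDistinct`. The sample variance is larger than the population variance because it divides by n - 1 instead of n.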

© 2025, Amazon Web Services, Inc. or its affiliates. All rights reserved.