# Memory Group By

## ![](/files/0ZoNuhDNAofymKffVdHE) Memory Group By

### Description <a href="#description" id="description"></a>

The Memory Group By transform calculates aggregate values (such as sums, averages, or counts) over groups of rows that share the same values in the grouping fields.

This transform processes all rows within memory and therefore does not require a sorted input. However, it **does** require all data to fit into memory.
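
To illustrate why in-memory grouping does not need sorted input, here is a minimal conceptual sketch in Python. It is not the transform's actual implementation; the function name `memory_group_by` and the sample field names are hypothetical. It keeps one accumulator per group in a dictionary, so rows can arrive in any order, and memory use grows with the data held.

```python
from collections import defaultdict


def memory_group_by(rows, group_fields, value_field):
    """Sum value_field per group, keeping one accumulator per group in memory.

    Hypothetical helper for illustration only.
    """
    totals = defaultdict(float)
    for row in rows:  # input order does not matter
        key = tuple(row[f] for f in group_fields)
        totals[key] += row[value_field]
    return dict(totals)


# Unsorted sample input: the two "EU" rows are not adjacent.
rows = [
    {"region": "EU", "amount": 10.0},
    {"region": "US", "amount": 5.0},
    {"region": "EU", "amount": 2.5},
]

result = memory_group_by(rows, ["region"], "amount")
print(result)  # {('EU',): 12.5, ('US',): 5.0}
```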

When the number of rows is too large to fit into memory, use a combination of [Sort Rows](/data-shaper-1.21/knowing-the-data-shaper-designer/pipelines/transforms/sort.md) and [Group By](/data-shaper-1.21/knowing-the-data-shaper-designer/pipelines/transforms/groupby.md) transforms.
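
The sort-then-group alternative can be sketched in the same conceptual way. In this Python illustration (the function name `sorted_group_by` is hypothetical), the sort brings equal keys next to each other, so the grouping pass only needs to hold one group's running total at a time. Python's `sorted` itself works in memory, so it only stands in for the Sort Rows transform here.

```python
from itertools import groupby
from operator import itemgetter


def sorted_group_by(rows, group_field, value_field):
    """Yield (key, sum) pairs from rows grouped on a single field.

    Hypothetical helper for illustration only.
    """
    # Stand-in for Sort Rows: equal keys become adjacent.
    ordered = sorted(rows, key=itemgetter(group_field))
    # Stand-in for Group By: each run of adjacent equal keys is one group.
    for key, group in groupby(ordered, key=itemgetter(group_field)):
        yield key, sum(r[value_field] for r in group)


sample = [
    {"region": "EU", "amount": 10.0},
    {"region": "US", "amount": 5.0},
    {"region": "EU", "amount": 2.5},
]

grouped = list(sorted_group_by(sample, "region", "amount"))
print(grouped)  # [('EU', 12.5), ('US', 5.0)]
```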

| Hop Engine | <sup>✓</sup> |
| ---------- | ------------ |
| Spark      | <sup>✓</sup> |
| Flink      | <sup>✓</sup> |
| Dataflow   | <sup>✓</sup> |

### Options

| Option                            | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
| --------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Transform name                    | Name of the transform. This name must be unique within a single pipeline. |
| Always give back a result row     | <p>If you enable this option, the transform always returns a result row, even if there is no input row.</p><p>This can be useful if you want to count the number of rows. Without this option you would never get a count of zero (0).</p> |
| The fields that make up the group | Specify the fields over which you want to group. Click Get Fields to add all fields from the input stream(s).                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| Aggregates                        | <p>Specify the fields that must be aggregated, the method and the name of the resulting new field. Click Get lookup fields to add all fields from the input stream(s). Here are the available aggregation methods:</p><ul><li>Sum</li><li>Average (Mean)</li><li>Median</li><li>Percentile</li><li>Minimum</li><li>Maximum</li><li>Number of values (N)</li><li>Concatenate strings separated by , (comma)</li><li>First non-null value</li><li>Last non-null value</li><li>First value (including null)</li><li>Last value (including null)</li><li>Standard deviation</li><li>Concatenate strings separated by \<Value>: specify the separator in the Value column (This supports hexadecimals)</li><li>Number of distinct values</li><li>Number of rows (without field argument)</li><li>Concatenate distinct values separated by \<Value>: specify the separator in the Value column (This supports hexadecimals)</li></ul> |

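The effect of the "Always give back a result row" option on a row count can be sketched as follows. This is a conceptual Python illustration under the assumption that no grouping fields are set; the function name `count_rows`, the parameter `always_give_back_result_row`, and the output field `row_count` are hypothetical names, not part of the product.

```python
def count_rows(rows, always_give_back_result_row=False):
    """Count input rows and return the result as a list of output rows.

    Hypothetical helper for illustration only.
    """
    count = 0
    for _ in rows:
        count += 1
    if count == 0 and not always_give_back_result_row:
        return []  # no input rows, so no output row at all
    return [{"row_count": count}]


without_option = count_rows([])
with_option = count_rows([], always_give_back_result_row=True)

print(without_option)  # []
print(with_option)     # [{'row_count': 0}]
```

With the option disabled, an empty input produces no output row, so a count of zero never reaches downstream transforms.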
