- Set tez.am.resource.memory.mb to a multiple of yarn.scheduler.minimum-allocation-mb, but less than yarn.scheduler.maximum-allocation-mb.
- Set the Application Master Java heap size (-Xmx in tez.am.launch.cmd-opts) to about 80% of tez.am.resource.memory.mb by default.
- Set hive.tez.container.size to the YARN container size (yarn.scheduler.minimum-allocation-mb) or a small multiple of it (1 or 2 times), but NEVER more than yarn.scheduler.maximum-allocation-mb, so there is headroom for multiple containers to be spun up.
- Set container reuse to true (the default is true): tez.am.container.reuse.enabled
- Prewarm containers when HiveServer2 starts: set hive.prewarm.enabled to true and hive.prewarm.numcontainers to a value greater than 1.
- Set the container Java heap size (-Xmx in hive.tez.java.opts) to about 80% of the container size, hive.tez.container.size, by default.
- Set tez.runtime.io.sort.mb: this is the buffer memory used when the output needs to be sorted.
- Set tez.runtime.unordered.output.buffer.size-mb: this is the buffer memory used when the output does not need to be sorted.
- Perform map joins as much as possible. hive.auto.convert.join.noconditionaltask.size is a very important parameter here: it sets the size threshold (in bytes) under which the small tables of a join are loaded into memory and the join is converted to a map join.
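
The sizing rules in the bullets above are simple arithmetic, so a small sketch may help when checking a cluster's values. The function name `suggest_tez_memory`, its parameters, and the example numbers (4096 MB minimum, 16384 MB maximum) are hypothetical and chosen only for illustration; the 80% heap ratio is the default suggested above, not a hard rule.

```python
def suggest_tez_memory(yarn_min_mb, yarn_max_mb,
                       container_multiple=1, am_multiple=1, heap_percent=80):
    """Derive Hive-on-Tez memory settings from the YARN allocation bounds.

    Hypothetical helper: it only encodes the rules of thumb from the list
    above and does not read or write any real cluster configuration.
    """
    container_mb = yarn_min_mb * container_multiple
    am_mb = yarn_min_mb * am_multiple

    # The Tez container must never exceed the YARN maximum allocation.
    if container_mb > yarn_max_mb:
        raise ValueError("hive.tez.container.size exceeds "
                         "yarn.scheduler.maximum-allocation-mb")
    # The Application Master should stay below the YARN maximum allocation.
    if am_mb >= yarn_max_mb:
        raise ValueError("tez.am.resource.memory.mb must be less than "
                         "yarn.scheduler.maximum-allocation-mb")

    # Java heap is roughly 80% of the memory granted by YARN (integer MB).
    am_heap_mb = am_mb * heap_percent // 100
    container_heap_mb = container_mb * heap_percent // 100

    return {
        "tez.am.resource.memory.mb": str(am_mb),
        "tez.am.launch.cmd-opts": f"-Xmx{am_heap_mb}m",
        "hive.tez.container.size": str(container_mb),
        "hive.tez.java.opts": f"-Xmx{container_heap_mb}m",
    }


if __name__ == "__main__":
    # Example values only: 4 GB minimum and 16 GB maximum YARN allocation.
    for key, value in suggest_tez_memory(4096, 16384, container_multiple=2).items():
        print(f"{key}={value}")
```

The helper raises an error instead of silently clamping, so an oversized multiple is caught before the values reach a configuration file. Real deployments usually carry additional JVM flags in the two `opts` properties; only the `-Xmx` part is derived here.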
- The following parameters control the number of mappers for splittable formats with Tez: