TEZ Memory Tuning Checklist

TEZ Application Master

  1. tez.am.resource.memory.mb should be a multiple of yarn.scheduler.minimum-allocation-mb and no more than yarn.scheduler.maximum-allocation-mb
  2. Application Master Java heap size (-Xmx in tez.am.launch.cmd-opts) should by default be about 80% of tez.am.resource.memory.mb
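
The Application Master sizing rule can be sketched as a small helper. This is only an illustration: the function name am_settings, its parameters, and the sample numbers are hypothetical, and it assumes the AM size is meant to be a multiple of the YARN minimum allocation, capped at the YARN maximum allocation, with the heap at 80% of the result.

```python
def am_settings(requested_mb, yarn_min_mb, yarn_max_mb, heap_percent=80):
    """Suggest Tez AM memory settings (hypothetical helper, not a Tez API).

    Rounds requested_mb up to a multiple of yarn_min_mb, caps it at the
    largest multiple that fits under yarn_max_mb, and derives the -Xmx
    value as heap_percent of the resulting AM memory.
    """
    if requested_mb <= 0 or yarn_min_mb <= 0 or yarn_max_mb < yarn_min_mb:
        raise ValueError("sizes must be positive and yarn_max_mb >= yarn_min_mb")

    # Ceiling division: number of minimum-allocation units needed.
    units = -(-requested_mb // yarn_min_mb)
    # Largest multiple of the minimum allocation that YARN will still grant.
    cap_mb = (yarn_max_mb // yarn_min_mb) * yarn_min_mb
    memory_mb = min(units * yarn_min_mb, cap_mb)

    # Integer arithmetic keeps the heap value a whole number of megabytes.
    heap_mb = memory_mb * heap_percent // 100
    return {
        "tez.am.resource.memory.mb": memory_mb,
        # Only the heap flag; any other JVM options would be appended to it.
        "tez.am.launch.cmd-opts": f"-Xmx{heap_mb}m",
    }


if __name__ == "__main__":
    # Illustrative numbers only: 1 GB minimum, 8 GB maximum allocation.
    for key, value in am_settings(3000, 1024, 8192).items():
        print(f"{key}={value}")
```

The remaining 20% of the AM memory is left for non-heap JVM overhead, which is the reason for not setting -Xmx equal to the full allocation.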

TEZ Container

  1. Set hive.tez.container.size to the YARN minimum container size, yarn.scheduler.minimum-allocation-mb, or a small multiple of it (1 or 2 times), but NEVER more than yarn.scheduler.maximum-allocation-mb. Keeping containers small leaves headroom for multiple containers to be spun up.
  2. Set container reuse to true (the default is true): tez.am.container.reuse.enabled
  3. Prewarm containers when HiveServer2 starts: hive.prewarm.enabled and hive.prewarm.numcontainers (> 1)
  4. Container Java heap size (-Xmx in hive.tez.java.opts) should by default be about 80% of the container size, hive.tez.container.size.
  5. Set tez.runtime.io.sort.mb, the memory used when the output needs to be sorted
  6. Set tez.runtime.unordered.output.buffer.size-mb, the memory used when the output does not need to be sorted
  7. Perform map joins as much as possible. hive.auto.convert.join.noconditionaltask.size is a very important parameter here: it limits the combined size of the small tables that Hive will load into memory for a map join, so size it against the container memory.
  8. The following parameters control the number of mappers for splittable formats with Tez:
    set tez.grouping.min-size
    set tez.grouping.max-size
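
The container sizing items above can be sketched in the same way. The helper names container_settings and as_set_statements, the parameters, and every number below are hypothetical. The checklist does not prescribe values for the sort and unordered buffers, so those percentages are left to the caller and skipped when not given.

```python
def container_settings(yarn_min_mb, yarn_max_mb, multiple=1, heap_percent=80,
                       sort_percent=None, unordered_percent=None):
    """Suggest Hive-on-Tez container settings (hypothetical helper).

    The container is sized as multiple * yarn_min_mb and rejected if it
    exceeds yarn_max_mb. The heap is heap_percent of the container. The
    buffer sizes are emitted only when the caller supplies a percentage.
    """
    if yarn_min_mb <= 0 or multiple < 1:
        raise ValueError("yarn_min_mb must be positive and multiple >= 1")

    container_mb = yarn_min_mb * multiple
    if container_mb > yarn_max_mb:
        raise ValueError("container size exceeds yarn.scheduler.maximum-allocation-mb")

    settings = {
        "hive.tez.container.size": container_mb,
        "hive.tez.java.opts": f"-Xmx{container_mb * heap_percent // 100}m",
    }
    # Buffer sizes are caller-chosen assumptions, not values from the checklist.
    if sort_percent is not None:
        settings["tez.runtime.io.sort.mb"] = container_mb * sort_percent // 100
    if unordered_percent is not None:
        settings["tez.runtime.unordered.output.buffer.size-mb"] = (
            container_mb * unordered_percent // 100
        )
    return settings


def as_set_statements(settings):
    """Render the settings as Hive session 'set key=value;' lines."""
    return [f"set {key}={value};" for key, value in settings.items()]


if __name__ == "__main__":
    # Illustrative numbers only: 2 GB minimum allocation, container = 2x minimum.
    for line in as_set_statements(container_settings(2048, 8192, multiple=2)):
        print(line)
```

Raising an error when the container would exceed the YARN maximum, rather than silently capping it, makes a misconfigured multiple visible before any query is run.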

Reference: https://community.hortonworks.com/articles/14309/demystify-tez-tuning-step-by-step.html