José Valerio Oracle Technology

RAC Interconnect Performance

05.19.2010 · Posted in Internals, Performance, RAC

Block contention can be measured using the block transfer statistics. The first step is to determine the block transfer time by examining two statistics in the gv$sysstat view:

  • global cache cr block receive time
  • global cache cr blocks received

The average transfer time is the ratio of global cache cr block receive time to global cache cr blocks received, both taken from the gv$sysstat view. The receive time is recorded in centiseconds, which is why the query below multiplies the ratio by 10 to report milliseconds.

From SQL*Plus:

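column "AVG RECEIVE TIME (ms)" format 9999999.9
column inst_id format 9999
prompt GCS CR BLOCKS
-- Receive time is in centiseconds; multiplying by 10 gives milliseconds.
-- On 10g and later the statistics are named 'gc cr block receive time'
-- and 'gc cr blocks received'.
select b1.inst_id,
       b2.value "RECEIVED",
       b1.value "RECEIVE TIME",
       ((b1.value / b2.value) * 10) "AVG RECEIVE TIME (ms)"
  from gv$sysstat b1, gv$sysstat b2
 where b1.name = 'global cache cr block receive time'
   and b2.name = 'global cache cr blocks received'
   and b1.inst_id = b2.inst_id
 order by b1.inst_id;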

INST_ID   RECEIVED RECEIVE TIME AVG RECEIVE TIME (ms)
------- ---------- ------------ ---------------------
      1       2792         3285                  11.8
      2       3761         7481                  19.9

In the example above, both nodes in the cluster show excessive transfer times, so the cluster interconnects should be checked with system-level commands to verify that they are functioning correctly. Instance two exhibits a higher average receive time than instance one. In my experience, if these values are over 5 ms you are in trouble. A high transfer time has a significant impact on performance.
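The same ratio can be checked outside the database. The following is a minimal Python sketch, not part of the original script, that applies the query's arithmetic to the RECEIVED and RECEIVE TIME counters from the sample output and compares the result with the 5 ms rule of thumb. The function name and the threshold constant are illustrative assumptions.

```python
# Rule-of-thumb threshold from the text above (the author's experience), in ms.
THRESHOLD_MS = 5.0


def avg_receive_ms(receive_time_cs, blocks_received):
    """Average CR block receive time in milliseconds.

    receive_time_cs is the cumulative receive time in centiseconds,
    so the ratio is multiplied by 10, exactly as in the SQL query.
    """
    if blocks_received == 0:
        return 0.0  # nothing received yet; avoid division by zero
    return receive_time_cs / blocks_received * 10


# (blocks received, receive time) per instance, copied from the sample output.
sample = {1: (2792, 3285), 2: (3761, 7481)}

for inst_id, (received, receive_time) in sorted(sample.items()):
    avg = avg_receive_ms(receive_time, received)
    status = "check interconnect" if avg > THRESHOLD_MS else "ok"
    print(f"inst {inst_id}: {avg:.1f} ms ({status})")
```

Because gv$sysstat counters are cumulative since instance startup, the same function can also be fed the difference between two snapshots to get the average for a specific interval.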

In most known configurations to date, the bandwidth of 1 GbE should be sufficient. The actual utilization depends on the size of the cluster nodes in terms of CPU power, the number of nodes accessing the same data, and the size of the application's working set. Most applications have good cache locality, so interconnect requirements do not increase when the application is scaled out by adding cluster nodes and distributing the work over more instances, or when additional load is added. For small working sets that fit into a small percentage of the available global buffer cache, the interconnect traffic may increase if the set remains constant while the cluster grows.

The actual utilization is difficult to predict, but in most cases there is no reason for concern when it comes to providing adequate bandwidth. Typical utilization for OLTP is usually much lower than the total available network capacity of 1 GbE. Keep in mind that 10,000-12,000 8 KB blocks per second can saturate a single 1 Gb Ethernet link (75-80% of theoretical bandwidth).
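The saturation figure can be sanity-checked with simple arithmetic. As a rough Python sketch (payload only, ignoring UDP/IP/Ethernet and message header overhead, and assuming that 8 KB means 8192 bytes and that 1 GbE means 10^9 bits per second), a block rate converts to link utilization as follows. The function name is illustrative.

```python
LINK_BITS_PER_SEC = 1_000_000_000  # assumed nominal 1 GbE line rate
BLOCK_BYTES = 8192                 # assumed 8 KB database block


def link_utilization(blocks_per_sec, block_bytes=BLOCK_BYTES,
                     link_bps=LINK_BITS_PER_SEC):
    """Fraction of the link consumed by block payload alone."""
    return blocks_per_sec * block_bytes * 8 / link_bps


for rate in (10_000, 12_000):
    print(f"{rate} blocks/s -> {link_utilization(rate):.0%} of 1 GbE")
```

Under these assumptions the payload alone takes roughly 66% to 79% of the nominal line rate, and protocol overhead pushes the real figure higher, which is in line with the saturation range quoted above.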

Block Latency

The average latency reported in Statspack or AWR is based on round-trip, end-to-end latencies measured at the buffer cache layer. The individual latencies can have a “narrower” or a “wider” dispersion. Larger variation and a tendency toward higher delays can be caused by the load on a node, particularly when it affects the block server processes, or by interconnect bandwidth saturation. The lower bound for a block transfer of 4KB is about 300 microseconds. What one sees in Statspack and AWR reports are usually averages, which are based on a particular distribution. Normally, latencies for the 90th percentile are in the < 500 microsecond bucket, but this can be significantly impacted by the system load.
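To see why an average alone can mislead, consider the Python sketch below. The latency samples are invented purely for illustration (they are not measurements), and the nearest-rank `percentile` helper is a hypothetical name. It shows how a single slow transfer raises the mean while the 90th percentile stays under 500 microseconds.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical round-trip latencies in microseconds (illustrative only).
latencies_us = [310, 320, 330, 340, 350, 360, 380, 400, 450, 5000]

mean_us = sum(latencies_us) / len(latencies_us)
p90_us = percentile(latencies_us, 90)
print(f"mean = {mean_us:.0f} us, p90 = {p90_us} us")
```

With this made-up data the mean is 824 microseconds even though nine of the ten transfers completed in 450 microseconds or less.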

Block access latency is defined as round-trip time. Typical measurements:

  • ~300 microseconds is the lowest measured with UDP over Gigabit Ethernet and 2K blocks
  • ~120 microseconds is the lowest measured with RDS over InfiniBand and 2K blocks

Note: This post is based on public Oracle information and personal research.