## Path-Based Statistical Timing Analysis Considering Inter- and Intra-Die Correlations

Aseem Agarwal, David Blaauw, \*Vladimir Zolotov, \*Savithri Sundareswaran, \*Min Zhao, \*Kaushik Gala, \*Rajendran Panda

> University of Michigan, Ann Arbor, MI \*Motorola, Inc., Austin, TX

#### Abstract

Process variation has become a significant concern for static timing analysis. In this paper, we present a new method for path-based statistical timing analysis. We first propose a method for modeling inter- and intra-die device length variations. Based on this model, we then present an efficient method for computing the total path delay probability distribution using a combination of device length enumeration for inter-die variation and an analytical approach for intra-die variation. We also propose a simple and effective model of spatial correlation of intra-die device length variation. The analysis is then extended to include spatial correlation. We test the proposed methods on paths from an industrial high-performance microprocessor and present comparisons with traditional path analysis, which does not distinguish between inter- and intra-die variations. The characteristics of the device length distributions were obtained from measured data of 8 test chips with a total of 17688 device length measurements. Spatial correlation data was also obtained from these measurements. We demonstrate the accuracy of the proposed approach by comparing our results with Monte Carlo simulation.

#### **Categories and Subject Descriptors**

B.8.2 [Performance and Reliability]: Performance analysis

#### **General Terms**

Algorithms, performance, reliability

## **1** Introduction

Static timing analysis has become the primary method for performance verification of high performance designs. Static timing analysis has the advantage that it does not require input vectors and has a run time that is linear with the size of the circuit. A number of methods have been proposed to increase the accuracy of static timing analysis through improved delay models and analysis techniques. In recent technologies, the variability of circuit delay due to process variations has become a significant concern. As process geometries continue to shrink, the ability to control critical device parameters is becoming increasingly difficult, and significant variations in device length, doping concentrations, and oxide thicknesses have resulted.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. *TAU'02*, December 2-3, 2002, Monterey, California, USA. Copyright 2002 ACM 1-58113-526-2/02/0012...\$5.00.

These process variations pose a significant problem for timing yield prediction and require that static timing analysis models the circuit delay not as a deterministic value, but as a random variable.

Process variations can be classified as systematic or random, where systematic variations are deterministic in nature and are caused by the structure of a particular gate and its topological environment.

For instance, wire thicknesses will polish differently during CMP depending on the density of the surrounding routing. Also, poly gate width has a deterministic dependence on the spacing of neighboring poly lines due to limitations of the lithography and the application of OPC methods. Random variations are unpredictable in nature and include random variations in the device length, discrete doping fluctuations, and oxide thickness variations. Analysis of the impact of deterministic variations on circuit delay is relatively straightforward, given accurate models of their dependence on physical topologies and the needed layout information at the time of analysis. Methods have been proposed to include deterministic device length variations [1] and interconnect variations [2] in the analysis of circuit performance. However, the necessary models and layout information for incorporating deterministic variations in delay computation are often not available, and hence deterministic variations are treated as random variations.

Process variations can be further classified as inter-die and intra-die variations. Intra-die variations are variations in device features that are present within a single chip, meaning that a device feature varies between different locations on the same die. Often, intra-die variations exhibit spatial correlations, where devices that are close to each other have a higher probability of being alike than devices that are placed far apart. Intra-die variations also exhibit structural correlations, meaning that devices that are structurally similar have an increased likelihood of having similar device features; for instance, devices oriented in the same direction tend to be more alike. Inter-die variations are variations that occur from one die to the next, meaning that the same device on a chip has different features among different die of one wafer, from wafer to wafer, and from wafer lot to wafer lot. With increased process scaling, intra-die variations are becoming a more dominant portion of the overall variability of device features, meaning that devices on the same die can no longer be treated as identical copies of the same device.

In this paper, we are concerned with the impact of random inter- and intra-die variations on circuit performance. Traditionally, these process variations have been modeled using case analysis, where a set of worst-case and best-case device features is constructed based on the 3-sigma points of their distributions. Deterministic timing analysis is then performed for each case of device features. A significant drawback of case-based timing analysis is that inter- and intra-die variations cannot be distinguished, since each device has identical (best-case or worst-case) features during the analysis. In practice, device features vary among the devices on a chip, and the likelihood that all devices have a worst-case feature is extremely small. Case analysis is therefore pessimistic, since on an actual die, devices with worse delay are compensated for by other devices on the same die that have better delay. The impact of intra-die variations on path delay will vary from path to path, due to the differing number of gates in a path and their spatial locations. Case-based timing analysis may therefore identify incorrect critical paths, thereby resulting in incorrect circuit optimization. With continued process scaling, intra-die variations are becoming a dominant portion of the overall process variation, and the traditional timing analysis approach will therefore become too restrictive for aggressive circuit design.

With increasing awareness of process variation, a number of techniques have been developed which model random delay variations and perform statistical timing analysis. These can be classified into *full-chip* analysis and *path-based* analysis approaches. Full-chip analysis models the delay of a circuit as a random variable and endeavors to compute its probability distribution [3-7]. This task is complicated by the reconvergence between circuit paths, giving rise to correlations of path delays. Since the underlying problem has an exponential complexity, the proposed methods are heuristic in nature and have a very high worst-case computational complexity. Also, they are based on very simple delay models, where the dependence of gate delay due to slope variation at the input of the gate and load variation at the output of the gate is not modeled. From both a run time and accuracy perspective, full chip statistical timing analysis is therefore not yet practical for industrial designs.

In a path based approach, deterministic timing analysis is first performed and the top n critical paths are enumerated, where n is a sufficiently large number to ensure that all paths that have a significant probability of being critical on a manufactured die are included. For instance, if the delay variability is expected to be 10% of nominal, all paths that have a deterministic delay within 10% of the worst-case circuit delay must be included. The delay of each path is then statistically analyzed resulting in the probability distribution of each path delay. The 3-sigma delay (or any other desired confidence point) is then computed for each path and is compared against the required circuit performance. This approach avoids the issue of path reconvergence thereby simplifying the problem and allowing for the use of more accurate models. Path-based statistical timing analysis provides statistical information on a path-by-path basis. It accounts for intra-die process variations and hence eliminates the pessimism in deterministic timing analysis based on case files. It also provides a more accurate measure of which paths are critical under process variability, allowing more correct optimization of the circuit.

In [8], a path-based statistical timing analysis approach was proposed. However, this approach does not include the load dependence of the gate delay due to variability of fanout gates and does not address spatial correlations of intra-die variability. In this paper, we therefore propose a new path-based approach to statistical timing analysis. We accurately model variations of gate delay due to variations of the input slope and output loads resulting from variations of fanin and fanout stages in the path. We propose a model where inter- and intra-die variations are modeled as two separate components and propose efficient methods to compute the path delay variability due to either source as well as their combined effect. We also propose a new model for intra-die correlations that models the impact of spatial separation of gates in a circuit path. We demonstrate how the proposed analysis can be extended to efficiently include this spatial correlation model.

The proposed model and analysis method were applied to device length variations in this paper, although extensions to other device parameters are straightforward. To obtain intra-die device length variations and their spatial correlation, we examined an extensive set of device length measurements from an industrial 0.18 µm process. To compute the intra-die path delay component of process variability, we first compute the sensitivity of gate delay, output slope, and input load with respect to the input slope, output load and device length. Using these sensitivities, we then express the path delay variation as an analytical expression of the device length variation, allowing for very efficient analysis of intra-die variability, including an accurate model for spatial correlation. Since the inter-die component of path delay variability is dependent on a single random variable, we can compute it efficiently through enumeration of its probability distribution. We then compute the joint path delay distribution through convolution of the inter- and intra-die delay distribution components to obtain the distribution of the total delay variability.

The proposed model assumptions are validated through Monte Carlo simulation, which shows that the proposed approach yields very accurate results. The most computationally intensive part of the analysis is the initial computation of sensitivities. Since these sensitivities are precomputed once and do not need to be recomputed during the analysis of individual paths, the proposed approach is very efficient. We present results on critical paths from an industrial high performance microprocessor and show that the proposed statistical analysis can significantly improve the accuracy of performance analysis. Furthermore, we demonstrate the importance of including spatial correlation information in the analysis, showing that ignoring such correlations may result in an underestimation of the computed variability.

The remainder of this paper is organized as follows. Section 2 discusses the delay model assumptions and properties. Section 3 presents the proposed approach for computing the path delay distribution under inter- and intra-die device length variability. Section 4 presents our model and analysis method for spatial correlation of intra-die variations. Section 5 contains experimental results and in Section 6 we draw our conclusions.

## 2 Statistical Timing Analysis Model

We first consider process variation due to inter- and intra-die variation, while ignoring spatial correlations. Extensions of the model to include spatial correlation are presented in Section 4. We propose the following model, where the device length $L_{total,i}$ of device *i* is the algebraic sum of an inter-die device length $L_{inter}$ and an intra-die device length variation $\Delta L_{intra,i}$:

$$L_{total,i} = L_{inter} + \Delta L_{intra,i}, \qquad (EQ 1)$$

where $L_{inter}$ and $\Delta L_{intra,i}$ are random variables with normal distributions. All devices on a die share one variable $L_{inter}$ for the inter-die component of their total device length, which represents the mean device length of a particular die. For the intra-die component of device length, each device has a separate independent random variable $\Delta L_{intra,i}$, where all random variables $\Delta L_{intra,i}$ have identical probability distributions. Both the total device length $L_{total}$ and the inter-die device length $L_{inter}$ have a mean which is equal to the nominal value of the device length. The intra-die variations $\Delta L_{intra,i}$ have a mean of zero. We assume that all three random variables $L_{total}$, $L_{inter}$, and $\Delta L_{intra}$ have a normal distribution, which is a common assumption since device length is a physical quantity. It is important to notice, however, that the gate delays do not have normal distributions, since the delay of a gate is a non-linear function of the device length.

In this paper, we compute the two components $L_{inter}$ and $\Delta L_{intra}$ as follows. The total device length variation $L_{total}$ is typically well characterized during process development, and the mean and sigma of $L_{total}$ are available from the SPICE parameter file. The statistical parameters of $L_{inter}$ and $\Delta L_{intra}$ are typically not directly measured during process development. Therefore, we analyzed device length measurements from test die on 8 manufactured wafers. Each test die consisted of 378 test structures covering 63 different test sites with 6 different structures per test site, for a total of 17688 device length measurements. We computed the intra-die standard deviation for each type of structure on each die and set the standard deviation of $\Delta L_{intra}$ equal to their average. Since $\Delta L_{intra}$ represents a device length deviation from the chip mean, $\Delta L_{intra}$ has a mean of zero. Given the distributions of $L_{total}$ and $\Delta L_{intra}$, the standard deviation of the inter-die variation is computed from the following equation:

$$\sigma^{2}_{L_{total}} = \sigma^{2}_{L_{inter}} + \sigma^{2}_{\Delta L_{intra}} \qquad (EQ 2)$$
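As a quick numerical check of EQ2, the following minimal sketch solves it for the inter-die standard deviation. The function name `sigma_inter` is ours, not from the paper, and the inputs are the percentages of nominal device length quoted in Section 5 (6.6% total, 4.41% intra-die).

```python
import math

def sigma_inter(sigma_total: float, sigma_intra: float) -> float:
    """Solve EQ2 for the inter-die standard deviation."""
    if sigma_intra > sigma_total:
        raise ValueError("intra-die sigma cannot exceed total sigma")
    return math.sqrt(sigma_total**2 - sigma_intra**2)

# Values quoted in Section 5, as percentages of nominal device length.
print(round(sigma_inter(6.6, 4.41), 2))
```

This gives roughly 4.91%, in line with the 4.907% reported in Section 5; the small difference is presumably due to rounding of the quoted inputs.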

## 3 Inter- and Intra-die Analysis Method

We have modeled the total device length as the sum of two independent random variables. Our objective is to obtain the distribution of the path delay  $D_p$  resulting from the variation of the total device length of the individual gates in the path,

$$D_p = \sum_{i} D_i (L_{inter} + \Delta L_{intra,i}), \qquad (EQ 3)$$

where $D_i$ is the delay of gate *i* as a function of its device length and the sum is taken over all gates of a path. The path delay $D_p$ is a random variable. However, computing its distribution is difficult since $D_i$ is a non-linear function that cannot be accurately expressed in closed form. One method for computing the distribution of $D_p$ is through Monte Carlo simulation. However, since each iteration of Monte Carlo involves SPICE simulation of the entire circuit path, this approach will have unacceptable run time. We therefore make the following simplifying assumption:

$$D_i(L_{inter} + \Delta L_{intra,i}) = D_i(L_{inter}) + \Delta D_i(\Delta L_{intra,i}), \qquad (EQ 4)$$

where $\Delta D_i(\Delta L_{intra,i})$ is the change of gate delay due to a small change in device length. In other words, the gate delay at the sum of the inter- and intra-die device lengths is approximated by the sum of the gate delay at the inter-die device length and the delay change due to the intra-die variation. Note that $\Delta D_i$ is assumed to be independent of $L_{inter}$, which is an approximation that is valid if $\Delta L_{intra,i}$ is small compared to $L_{inter}$. The assumption of EQ4 allows us to compute $D_i(L_{inter})$ and $\Delta D_i(\Delta L_{intra,i})$ independently and then combine them to obtain the total path delay distribution $D_p$, as follows:

$$D_p = \sum_{i} D_i(L_{inter}) + \sum_{i} \Delta D_i(\Delta L_{intra,i}), \qquad (EQ 5)$$

We discuss the computation of the two components  $D_{p,inter} = \sum_{i} D_i(L_{inter})$  and  $D_{p,intra} = \sum_{i} \Delta D_i(\Delta L_{intra,i})$  in the following two Sections.

### 3.1 Inter-die variability analysis

To compute the delay due to inter-die variation we need to compute $D_{p,inter} = \sum_{i} D_i(L_{inter})$ as a function of the inter-die device length. Since all gate delays $D_i(L_{inter})$ in $D_{p,inter}$ share a single random variable, $D_{p,inter}$ can be efficiently computed through enumeration of the distribution of $L_{inter}$. We enumerate different possibilities from the worst-case to the best-case process corners, and compute the path delay $D_{p,inter}$ for each case. The distribution of $D_{p,inter}$ is then computed by considering the probability of the selected device length from $L_{inter}$ and its resulting path delay for each enumeration. In our experiments, discretization of $L_{inter}$ into 20 device lengths was sufficient to obtain a high level of accuracy. This requires simulating each path 20 times, which is a relatively low cost for computing $D_{p,inter}$.
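The enumeration described above can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used in the paper: `toy_path_delay` is a hypothetical stand-in for the circuit simulation of a path, and the choice of equal-width bins over plus or minus three sigma is ours.

```python
import math

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def enumerate_inter_die(path_delay, mu, sigma, n_points=20, span=3.0):
    """Return [(delay, probability)] for n_points values of L_inter.

    path_delay maps one device length to a path delay; in the paper this is
    a circuit simulation of the whole path, here it is a placeholder.
    """
    step = 2 * span * sigma / n_points
    # Midpoints of n_points equal-width bins covering mu +/- span * sigma.
    lengths = [mu - span * sigma + (k + 0.5) * step for k in range(n_points)]
    weights = [normal_pdf(L, mu, sigma) * step for L in lengths]
    total = sum(weights)  # renormalise, since the tails beyond the span are dropped
    return [(path_delay(L), w / total) for L, w in zip(lengths, weights)]

def toy_path_delay(L):
    """Hypothetical path: 10 identical gates of 100 ps, delay growing as L^1.5."""
    return 10 * 100.0 * (L / 0.18) ** 1.5

pmf = enumerate_inter_die(toy_path_delay, mu=0.18, sigma=0.04907 * 0.18)
mean_delay = sum(d * p for d, p in pmf)
```

Because the toy delay is a convex function of device length, the resulting delay distribution is skewed and its mean lies slightly above the nominal delay, which illustrates why $D_{p,inter}$ is not normal even though $L_{inter}$ is.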

### 3.2 Intra-die variability analysis

The path delay variation due to intra-die device length variation $D_{p,intra} = \sum_{i} \Delta D_i (\Delta L_{intra,i})$ is a function of multiple independent random variables. Therefore, the number of simulations required for computing $D_{p,intra}$ through enumeration is $m^n$, where *m* is the number of discretizations of $\Delta L_{intra,i}$ and *n* is the number of gates in the path. Even for paths consisting of a few gates, this approach is computationally infeasible. We therefore make the second simplifying assumption, namely that $\Delta D_i (\Delta L_{intra,i})$ can be approximated linearly as follows:

$$\Delta D_i(\Delta L_{intra,i}) = \frac{\partial D_i}{\partial L_{intra,i}} \times \Delta L_{intra,i}, \qquad (EQ 6)$$

for small values of $\Delta L_{intra,i}$, where the sensitivity of the delay with respect to device length $\frac{\partial D_i}{\partial L_{intra,i}}$ is computed at the nominal device length. The simplification of EQ6 allows us to compute the change of path delay $D_{p,intra}$ due to intra-die device length variation analytically and efficiently using precomputed delay sensitivities. When computing $D_{p,intra}$, the dependence of the delay of gate *i* on the input load of its fanout gate *i*+1 must be considered; this load is a function of the device length variation $\Delta L_{intra,i+1}$. Similarly, the delay of gate *i* is dependent on its input slope, which is a function of all device length variations $\Delta L_{intra,j}$, where gate *j* < *i* precedes gate *i* in the path. We therefore extend the linear assumption of EQ6 to the change of a gate delay and output slope due to input slope and output load and formulate the computation of $D_{p,intra}$ as follows.

The change in path delay $D_{p,intra}$ is the sum of the individual gate delay changes $\Delta D_i$, where each of the gate delay changes and their corresponding output slope changes are a function of the change in output slope of the *preceding* gate ($\Delta S_{i-1}$), the change in input load of the *succeeding* gate ($\Delta Cl_{i+1}$), and the intra-die device length:

$$\Delta D_i = f(\Delta Cl_{i+1}, \Delta S_{i-1}, \Delta L_{intra,i}) \qquad (EQ 7)$$

$$\Delta S_i = f(\Delta Cl_{i+1}, \Delta S_{i-1}, \Delta L_{intra,i}) \qquad (EQ 8)$$

The change in delay, slope and input capacitance of a single gate is approximated as a sum of products of the sensitivities and the change in the parameter values:

$$\Delta D_{i} = \frac{\partial D_{i}}{\partial S_{i-1}} \times \Delta S_{i-1} + \frac{\partial D_{i}}{\partial L_{i}} \times \Delta L_{intra,i} + \frac{\partial D_{i}}{\partial Cl_{i+1}} \times \Delta Cl_{i+1} \qquad (EQ 9)$$

$$\Delta S_{i} = \frac{\partial S_{i}}{\partial S_{i-1}} \times \Delta S_{i-1} + \frac{\partial S_{i}}{\partial L_{i}} \times \Delta L_{intra,i} + \frac{\partial S_{i}}{\partial Cl_{i+1}} \times \Delta Cl_{i+1} \qquad (EQ 10)$$

$$\Delta Cl_i = \frac{\partial Cl_i}{\partial L_i} \times \Delta L_{intra,i} \qquad (EQ 11)$$

The seven basic sensitivities of delay and slope with respect to input slope, output load and device length and the sensitivity of gate input load with respect to device length are precomputed for each gate over a range of output load and input slope conditions. In this paper, we computed the sensitivities numerically, although methods for directly computing these sensitivities during circuit simulation are also possible. These basic sensitivities are then stored in tables and are then accessed during the computation of  $D_{p,intra}$  for a particular path using linear interpolation of the stored values in the table.
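A minimal sketch of this precharacterization step is given below, assuming a central-difference estimate of the sensitivity and bilinear interpolation over an (input slope, output load) grid. The delay model `toy_delay`, the grid values and all function names are hypothetical; in practice each table entry would come from circuit simulation.

```python
from bisect import bisect_right

def central_difference(f, x, h):
    """Numerical sensitivity df/dx at x."""
    return (f(x + h) - f(x - h)) / (2 * h)

def _bracket(axis, x):
    """Index i and fraction t such that x lies between axis[i] and axis[i + 1]."""
    i = min(max(bisect_right(axis, x) - 1, 0), len(axis) - 2)
    t = (x - axis[i]) / (axis[i + 1] - axis[i])
    return i, min(max(t, 0.0), 1.0)  # clamp outside the table range

def lookup(table, slopes, loads, slope, load):
    """Bilinear interpolation of table[i][j], indexed by (slopes[i], loads[j])."""
    i, t = _bracket(slopes, slope)
    j, u = _bracket(loads, load)
    row_lo = (1 - u) * table[i][j] + u * table[i][j + 1]
    row_hi = (1 - u) * table[i + 1][j] + u * table[i + 1][j + 1]
    return (1 - t) * row_lo + t * row_hi

def toy_delay(length, slope, load):
    """Hypothetical gate delay (ps); stands in for a circuit simulation."""
    return (20.0 + 4.0 * load) * (length / 0.18) ** 1.5 + 0.3 * slope

slopes = [50.0, 100.0, 200.0]  # input slope grid (ps), illustrative
loads = [1.0, 2.0, 3.0, 4.0]   # output load grid (fF), illustrative

# Table of the delay sensitivity to device length at the nominal length.
dD_dL = [[central_difference(lambda L: toy_delay(L, s, c), 0.18, 1e-4)
          for c in loads] for s in slopes]

sens = lookup(dD_dL, slopes, loads, slope=120.0, load=2.5)
```

The same table construction would be repeated for each of the seven sensitivities; only the differentiated quantity and the variable being perturbed change.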

We then substitute EQ11 in EQ10 and EQ10 in EQ9 to obtain an expression of  $\Delta D_i$  as a function of basic sensitivities and intra-die device length variations. Note that  $\Delta D_i$  is a function of all intra-die device lengths *j*, where  $j \le i + 1$ , due to the recursive dependence of  $\Delta S_i$  on  $\Delta S_{i-1}$ . The change in delay of gate *i* therefore depends on the intra-die device length of the gate itself, the succeeding gate and all preceding gates and is expressed as a linear function of these intra-die device lengths. The delay change coefficients of this function are efficiently computed for all gates in the path using a single traversal of the path using the basic seven sensitivities. We then collect all coefficients of gate delays with respect to each intra-die device length and express the total change in path delay  $D_{p,intra}$  as follows:

$$D_{p,intra} = \sum_{i} (K_i \times \Delta L_i), \qquad (EQ \ 12)$$

where  $K_i$  is the coefficient of total path delay change due to intra-die device length  $\Delta L_i$  at gate *i*. Given the mean  $\mu_i$  and the standard deviation  $\sigma_i$  for intra-die device length  $\Delta L_i$  with normal distribution and the coefficients  $K_i$ , we can compute mean and standard deviation of the probability distribution for  $D_{p,intra}$  directly using the following standard equations:

$$\mu_{D_{p,intra}} = \sum_{i} K_i \times \mu_i \tag{EQ 13}$$

$$\sigma_{D_{p,intra}}^{2} = \sum_{i} (K_{i}^{2} \times \sigma_{i}^{2}) \qquad (EQ 14)$$

Given precharacterized sensitivities, the final computation of the distribution of  $D_{p,intra}$  is performed very efficiently and requires only a single traversal of the path. To validate the accuracy of the proposed approach, we compare the distribution of  $D_{p,intra}$  computed through the proposed analytical approach with that obtained through Monte Carlo simulation in Section 5.
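One way to realize the single path traversal is sketched below. It is our reading of EQ9 through EQ14, not code from the paper: the class and function names are illustrative, the sensitivity values are made up, and we assume that the slope at the path input and the load at the path output do not vary. The helper `path_delay_change` evaluates EQ9 to EQ11 directly and serves only as a cross-check of the coefficients.

```python
import math
from dataclasses import dataclass

@dataclass
class GateSens:
    """The seven basic sensitivities of one gate (field names are illustrative)."""
    dD_dS: float  # delay vs. input slope
    dD_dL: float  # delay vs. own device length
    dD_dC: float  # delay vs. output load
    dS_dS: float  # output slope vs. input slope
    dS_dL: float  # output slope vs. own device length
    dS_dC: float  # output slope vs. output load
    dC_dL: float  # input load vs. own device length

def path_coefficients(gates):
    """Coefficients K_i of EQ12, computed in one backward traversal."""
    n = len(gates)
    K = [0.0] * n
    w = 0.0  # downstream delay change per unit change of the current gate's output slope
    for i in range(n - 1, -1, -1):
        g = gates[i]
        K[i] += g.dD_dL + g.dS_dL * w  # effect through gate i itself
        if i + 1 < n:                  # gate i+1 acts as the load of gate i
            K[i + 1] += gates[i + 1].dC_dL * (g.dD_dC + g.dS_dC * w)
        w = g.dD_dS + g.dS_dS * w      # weight seen at the output of gate i-1
    return K

def path_delay_change(gates, dL):
    """Directly evaluate EQ9 to EQ11 along the path for length changes dL."""
    total, dS_prev = 0.0, 0.0
    for i, g in enumerate(gates):
        dC_next = gates[i + 1].dC_dL * dL[i + 1] if i + 1 < len(gates) else 0.0
        total += g.dD_dS * dS_prev + g.dD_dL * dL[i] + g.dD_dC * dC_next
        dS_prev = g.dS_dS * dS_prev + g.dS_dL * dL[i] + g.dS_dC * dC_next
    return total

def intra_die_stats(K, mu, sigma):
    """Mean and standard deviation of D_p,intra from EQ13 and EQ14."""
    mean = sum(k * m for k, m in zip(K, mu))
    var = sum((k * s) ** 2 for k, s in zip(K, sigma))
    return mean, math.sqrt(var)

# Four identical gates with made-up sensitivity values.
gates = [GateSens(0.3, 400.0, 2.0, 0.5, 300.0, 3.0, 5.0) for _ in range(4)]
K = path_coefficients(gates)
mean, sigma = intra_die_stats(K, mu=[0.0] * 4, sigma=[0.008] * 4)
```

Perturbing one device length at a time in `path_delay_change` reproduces the corresponding $K_i$, which is a convenient sanity check when the sensitivities come from real characterization data.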

### 3.3 Combined analysis and comparison to traditional approach

After computing the two components of path delay variation, $D_{p,inter}(L_{inter})$ and $D_{p,intra}(\Delta L_{intra,i})$ (EQ5), we compute the distribution of the total path delay $D_p$. Since $L_{inter}$ and $\Delta L_{intra,i}$ are independent random variables, this involves the convolution of the two distributions. However, since $D_{p,inter}$ is not normal, the convolution cannot be performed analytically and must be performed numerically. This is done by discretizing the two distributions and then taking their convolution numerically. The total path delay distribution is again validated using Monte Carlo simulation in Section 5.
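The numerical convolution can be sketched as follows, assuming both distributions have been resampled onto grids with the same delay step. The function names and the small example distributions are ours and purely illustrative.

```python
def convolve_pmfs(pmf_a, start_a, pmf_b, start_b):
    """Distribution of A + B for independent A and B on grids with a common step.

    pmf_x[k] is the probability of the value start_x + k * step.
    Returns (pmf_sum, start_sum) on the same step.
    """
    out = [0.0] * (len(pmf_a) + len(pmf_b) - 1)
    for i, pa in enumerate(pmf_a):
        for j, pb in enumerate(pmf_b):
            out[i + j] += pa * pb
    return out, start_a + start_b

def moments(pmf, start, step):
    """Mean and variance of a pmf whose k-th entry sits at start + k * step."""
    mean = sum(p * (start + k * step) for k, p in enumerate(pmf))
    var = sum(p * (start + k * step - mean) ** 2 for k, p in enumerate(pmf))
    return mean, var

step = 10.0                   # ps, illustrative
inter = [0.2, 0.5, 0.3]       # skewed inter-die delay distribution starting at 2400 ps
intra = [0.25, 0.5, 0.25]     # zero-mean intra-die delay change starting at -10 ps

total, total_start = convolve_pmfs(inter, 2400.0, intra, -10.0)
total_mean, total_var = moments(total, total_start, step)
```

For independent components the means and variances add, so in this example the total has the mean of the inter-die distribution and the sum of both variances, while its shape retains the skew of the inter-die component.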

We also compute the path delay distribution when we treat the total variation as inter-die variation and the intra-die variation as zero, i.e., $\sigma_{L_{inter}} = \sigma_{L_{total}}$ and $\sigma_{\Delta L_{intra}} = 0$. We again use enumeration of the distribution of $L_{inter}$ to obtain the path delay distribution. We refer to this delay distribution as the *traditional* delay distribution, since traditionally all variations are treated as inter-die variations and computed using case analysis. We compare the delay distribution obtained with the proposed approach to the traditional delay distribution in Section 5.

## 4 Model and Analysis of Spatial Correlations

We propose a new model for spatial correlation of intra-die device length variation. We first divide the area of the die into regions using a multi-level quad-tree partitioning, as shown in Figure 1.

Figure 1. Spatial correlations

For each level *l*, the die area is partitioned into $2^{l}$-by-$2^{l}$ squares, where the first or top level 0 has a single region for the entire die and the last or bottom level *k* has $4^{k}$ regions. We then associate an independent random variable $\Delta L_{l,r}$ with each region (l, r) to represent a component of the total intra-die device length variation. The variation of a gate *i* is then composed of a sum of intra-die device length components $\Delta L_{l,r}$, where level *l* ranges from 0 to *k* and the region *r* at a particular level is the region that intersects with the position of gate *i* on the die. For the gate in region (2,1) in Figure 1, the components of intra-die device length variation would be $\Delta L_{0,1}$, $\Delta L_{1,1}$ and $\Delta L_{2,1}$. The intra-die device length components are defined such that the sum of all random variables $\Delta L_{l,r}$ associated with a gate is equal to $\Delta L_{intra,i}$:

$$\Delta L_{intra,i} = \sum_{0 \le l \le k,\ r \text{ intersects } i} \Delta L_{l,r} \qquad (EQ 15)$$

Gates that lie within close proximity of each other will have many common intra-die device length components, resulting in a strong intra-die length correlation. Gates that lie far apart on a die share few common components and therefore have weaker correlation. Figure 1 shows an example of a die with 3 levels of partitioning, resulting in 16 regions at the bottom level. Since the number of regions at the bottom level grows as $4^k$, it is possible to obtain a fine partitioning of the die with only a moderate number of levels. Note also that the length $\Delta L_{0,1}$ associated with the region at the top level of the hierarchy is equivalent to the inter-die device length $L_{inter}$, since it is shared by all gates on the die.

We can control how quickly the spatial correlation diminishes as the separation between two gates increases by correctly allocating the total intra-die device length variation among the different levels. If the total intra-die variance is largely allocated to the bottom levels, and the regions at top levels have only a small variance, there is less sharing of device length variation between gates that are far apart and the spatial correlation will diminish quickly. This will yield results that are close to uncorrelated intra-die analysis. On the other hand, if the total intra-die variance is predominantly allocated to the regions at the top levels of the hierarchy, then even gates that are widely spaced apart will still have significant correlation. This will yield results that are close to the traditional approach, where all gates are perfectly correlated and the intra-die device length variation is zero. The proposed model is therefore flexible and can be easily fit to measured device length data. Also, it is straightforward to extend the model to include topological and structural correlations, such as gate orientation.
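To make the role of the variance allocation concrete, the sketch below maps a gate position to its regions and computes the correlation between two gates implied by EQ15. Several details are assumptions of ours: the die is normalized to the unit square, every region of a level has the same variance, and regions are numbered hierarchically so that the children of region *r* are regions 4(*r*-1)+1 to 4*r* of the next level. That numbering is consistent with the regions (2,1), (2,4) and (2,15) used in the Figure 1 example, but the quadrant order within a parent is a guess.

```python
def region_path(x, y, levels):
    """Regions (level, index) containing point (x, y) on a die spanning [0,1) x [0,1).

    Indices are 1-based; the quadrant order within a parent is an assumption.
    """
    path, index = [], 0
    for level in range(levels):
        if level > 0:
            cells = 2 ** level
            col = int(x * cells) % 2  # left or right half of the parent region
            row = int(y * cells) % 2  # lower or upper half of the parent region
            index = 4 * index + 2 * row + col
        path.append((level, index + 1))
    return path

def correlation(path_a, path_b, level_variance):
    """Correlation of the intra-die device lengths of two gates under EQ15.

    level_variance[l] is the variance of every component at level l.
    """
    shared = sum(level_variance[level] for level, _ in set(path_a) & set(path_b))
    return shared / sum(level_variance)

# Positions chosen to fall in regions (2,1), (2,4) and (2,15).
g1 = region_path(0.1, 0.1, levels=3)
g2 = region_path(0.3, 0.3, levels=3)
g3 = region_path(0.6, 0.9, levels=3)

bottom_heavy = [1.0, 1.0, 2.0]  # illustrative variances for levels 0, 1, 2
near = correlation(g1, g2, bottom_heavy)
far = correlation(g1, g3, bottom_heavy)
```

With these illustrative variances the two nearby gates have correlation 0.5 and the distant pair 0.25. Shifting variance toward level 0 raises both values, and shifting it toward level 2 lowers them, which is the tuning behavior described above.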

We illustrate the spatial correlation model for the three gates shown in Figure 1 in regions (2,1), (2,4) and (2,15). The intra-die device length variation of each of these gates is the sum of the device length variation components associated with the regions in which the gate lies, leading to the following equations:

$$\Delta L_{intra,1} = \Delta L_{2,1} + \Delta L_{1,1} + \Delta L_{0,1} \qquad (EQ 16)$$

$$\Delta L_{intra,2} = \Delta L_{2,4} + \Delta L_{1,1} + \Delta L_{0,1} \qquad (EQ 17)$$

$$\Delta L_{intra,3} = \Delta L_{2,15} + \Delta L_{1,4} + \Delta L_{0,1} \qquad (EQ 18)$$

We can observe from the intra-die device length equations that gates 1 and 2 are strongly correlated, as they share the common variables $\Delta L_{1,1}$ and $\Delta L_{0,1}$. On the other hand, gates 1 and 3 are more weakly correlated, as they share only the common variable $\Delta L_{0,1}$. The change in delay due to intra-die device length variation for these gates can be expressed as the product of their intra-die device length components with their respective coefficients of the total path delay change. Using equation EQ12, we get the following equations:

$$\Delta D_1 = K_1 (\Delta L_{2,1} + \Delta L_{1,1} + \Delta L_{0,1}) \qquad (EQ 19)$$

$$\Delta D_2 = K_2 (\Delta L_{2,4} + \Delta L_{1,1} + \Delta L_{0,1}) \qquad (EQ 20)$$

$$\Delta D_3 = K_3 (\Delta L_{2,15} + \Delta L_{1,4} + \Delta L_{0,1}) \qquad (EQ 21)$$

Summing up the $\Delta D_i$ terms in EQ19 through EQ21, we get the change in the path delay $D_{p,intra}$ due to spatially correlated intra-die device length variation as follows:

$$\begin{aligned} D_{p,intra} = {} & K_1 \Delta L_{2,1} + K_2 \Delta L_{2,4} + K_3 \Delta L_{2,15} + (K_1 + K_2)\Delta L_{1,1} \\ & + K_3 \Delta L_{1,4} + (K_1 + K_2 + K_3)\Delta L_{0,1} \end{aligned} \qquad (EQ 22)$$

We then compute the path delay distribution in the same way as the intra-die variability analysis using equations EQ13 and EQ14.
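The regrouping in EQ22 followed by EQ14 can be sketched as below. The coefficients, the per-level variances and the function name are illustrative assumptions; only the three gates and their regions come from EQ16 through EQ18.

```python
import math
from collections import defaultdict

def correlated_sigma(K, gate_regions, level_variance):
    """Standard deviation of D_p,intra with spatial correlation.

    K[i]: path delay coefficient of gate i (EQ12).
    gate_regions[i]: (level, region) components of gate i (EQ15).
    level_variance[l]: variance of each component at level l, assumed
    equal for all regions of a level.
    """
    coeff = defaultdict(float)
    for k, regions in zip(K, gate_regions):
        for component in regions:
            coeff[component] += k  # gates sharing a component add their K_i (EQ22)
    var = sum(c * c * level_variance[level] for (level, _), c in coeff.items())
    return math.sqrt(var)  # EQ14 applied to the independent components

# The three gates of EQ16 to EQ18, with illustrative coefficients and variances.
K = [1.0, 1.0, 1.0]
regions = [
    [(2, 1), (1, 1), (0, 1)],
    [(2, 4), (1, 1), (0, 1)],
    [(2, 15), (1, 4), (0, 1)],
]
sigma_corr = correlated_sigma(K, regions, level_variance=[1.0, 1.0, 2.0])

# Same total variance per gate (4.0), but with no component shared between gates.
sigma_uncorr = correlated_sigma(K, [[(0, i)] for i in range(3)], level_variance=[4.0])
```

In this toy setting the correlated variance is 20 against 12 for the uncorrelated case, illustrating why ignoring spatial correlation tends to underestimate the path delay variability when all $K_i$ have the same sign.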

## 5 Experimental Results

We apply our approach to critical paths extracted from an industrial, high performance design. The SPICE simulations were performed using a process with 0.18 micron nominal device length. The standard deviation used for intra-die variability was based on measurements from a test chip and was 4.41% of the nominal device length. The total variability had a standard deviation of 6.6% of nominal. The standard deviation of inter-die device length was computed using EQ2 and was 4.907%. Normal distributions were used for all variations.

The proposed inter- and intra-die analysis methods were implemented, as well as the traditional approach. Intra-die analysis with spatial correlations was also implemented, using a 6-level hierarchy. The variances of the intra-die variability components at each level were obtained from test chip measurements.

In Figure 2, we show a plot of the path delay probability density function of path p2 for both the traditional approach and our proposed method considering intra- and inter-die device length variations. The means of both these distributions are aligned at 2493.1 ps. The distribution obtained by our approach is narrower than that of the traditional approach, indicating less variability and a smaller standard deviation. The 3-sigma delay point with our approach is also smaller than that obtained with the traditional approach, which means that the path delay distribution is less pessimistic with our approach.

Figure 2. Comparison of probability density function for traditional approach and proposed approach

Figure 3 shows the same comparison, but instead of the probability density function, we have plotted the cumulative distribution functions (cdf) of both approaches. The cdf at any time point shows the probability of an event occurring at or before that time point. The figure shows a significant difference between the approaches at the 99% point.

Figure 3. Cumulative distribution function for traditional and proposed approaches

In Table 1, we show the path characteristics: the number of gates, the mean delay of the path, and the standard deviation and 3-sigma points of the path delay distribution using our approach and the traditional approach. The table also shows the percentage reductions in the standard deviation and 3-sigma delay points obtained with our approach. The variability using the proposed approach is reduced by 27.2% on average compared to the traditional analysis. The reduction in the 3-sigma delay points is 4.46% on average.

In Figure 4, we show the comparison between the results obtained using our proposed analytical approach and Monte Carlo simulation for intra-die delay variability analysis of path p2. The plot shows a close match between the analytical approach and the Monte Carlo simulation. In Figure 5, we compare the total path delay probability distribution for the two approaches for path p2. The mean and sigma of the distribution using Monte Carlo simulation were 2487 ps and 107 ps, which are closely matched by the mean and sigma obtained using our analytical approach, 2493 ps and 112 ps.

Figure 4. Comparison of Monte Carlo simulation and analytical approach for intra-die delay variability

Figure 5. Comparison of Monte Carlo simulation and analytical approach for total delay variability
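A comparison of this kind can be sketched for a simplified model. The code below assumes a linear delay model in which each gate's delay deviation is its sensitivity times the sum of a shared inter-die and a per-gate independent intra-die device length deviation, both normal. The sensitivities, sample count, and seed are hypothetical, and the closed-form sigma is the one that follows from this linear assumption, not necessarily the paper's enumeration-based computation.

```python
import math
import random
import statistics


def analytical_sigma(sensitivities, sigma_inter, sigma_intra):
    """Closed-form sigma of the path delay under the linear model."""
    inter_variance = (sum(sensitivities) * sigma_inter) ** 2
    intra_variance = sum((k * sigma_intra) ** 2 for k in sensitivities)
    return math.sqrt(inter_variance + intra_variance)


def monte_carlo_sigma(sensitivities, sigma_inter, sigma_intra, samples, seed):
    """Sample die and gate deviations and measure the spread of the delay."""
    rng = random.Random(seed)
    deviations = []
    for _ in range(samples):
        inter = rng.gauss(0.0, sigma_inter)  # one draw per die
        deviation = sum(
            k * (inter + rng.gauss(0.0, sigma_intra))  # one draw per gate
            for k in sensitivities
        )
        deviations.append(deviation)
    return statistics.pstdev(deviations)


# Hypothetical gate sensitivities (ps per percent of Delta L) for 12 gates.
sensitivities = [0.8 + 0.05 * i for i in range(12)]

analytical = analytical_sigma(sensitivities, 4.907, 4.41)
simulated = monte_carlo_sigma(sensitivities, 4.907, 4.41, samples=20000, seed=1)

print(f"analytical sigma:  {analytical:.2f} ps")
print(f"Monte Carlo sigma: {simulated:.2f} ps")
```

With 20000 samples the sampling error on the estimated sigma is well under one percent, so the two numbers should agree closely when the linear model holds.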

Table 1. Results of proposed approach and traditional approach

| Critical path | No. of gates | Mean delay (ps) | Std. dev., traditional approach (ps) | Std. dev., our approach (ps) | Std. dev. % red. | 3-sigma delay, traditional approach (ps) | 3-sigma delay, our approach (ps) | 3-sigma delay % red. |
|---------------|--------------|-----------------|--------------------------------------|------------------------------|------------------|------------------------------------------|----------------------------------|----------------------|
| p1            | 14           | 2188.3          | 139                                  | 103                          | 26%              | 2605.7                                   | 2498.0                           | 4.1%                 |
| p2            | 12           | 2493.1          | 152                                  | 112                          | 26%              | 2950.5                                   | 2830.1                           | 4.1%                 |
| p3            | 25           | 4449.3          | 276                                  | 199                          | 28%              | 5276.7                                   | 5046.0                           | 4.4%                 |
| p4            | 32           | 3935.6          | 283                                  | 203                          | 28%              | 4785.8                                   | 4546.3                           | 5.0%                 |
| p5            | 23           | 4177.1          | 276                                  | 199                          | 28%              | 5004.3                                   | 4774.5                           | 4.6%                 |
| p6            | 43           | 3922.0          | 266                                  | 191                          | 28%              | 4721.9                                   | 4494.6                           | 4.8%                 |
| p7            | 20           | 3895.9          | 237                                  | 172                          | 27%              | 4606.8                                   | 4412.3                           | 4.2%                 |

In Table 2, we show the results of the intra-die variability analysis using spatial correlations. The uncorrelated standard deviation values are for the intra-die path delay distribution without any spatial correlation. We then show the sigma values for the intra-die path delay with correlations, calculated using our model for spatial correlation. The variability is increased on average by 80.7% when spatial correlation is considered, compared to uncorrelated analysis. We then show the 3-sigma delay values for the total path delay distribution, and report the percentage reduction with spatially correlated analysis over the traditional analysis which was 3.88% on average.

Table 2. Path delay distribution with different spatial correlations

| Critical paths | sigma for $D_{p,intra}$ (ps) (uncorrelated) | sigma for $D_{p,intra}$ (ps) (correlated) | % increase in sigma | 3-sigma point with our approach (correlated) (ps) | Correlated % red. |
|----------------|---------------------------------------------|-------------------------------------------|---------------------|---------------------------------------------------|-------------------|
| p1             | 43.5                                        | 68.5                                      | 57.4%               | 2519.7                                            | 3.3%              |
| p2             | 45.1                                        | 73.1                                      | 62.1%               | 2853.0                                            | 3.3%              |
| p3             | 60.0                                        | 114.6                                     | 91.0%               | 5068.8                                            | 3.9%              |
| p4             | 56.8                                        | 109.6                                     | 93.0%               | 4568.8                                            | 4.5%              |
| p5             | 60.9                                        | 115.8                                     | 90.0%               | 4797.2                                            | 4.1%              |
| p6             | 48.4                                        | 89.8                                      | 85.5%               | 4515.1                                            | 4.4%              |
| p7             | 55.4                                        | 103.1                                     | 86.1%               | 4437.1                                            | 3.7%              |
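The percentage increases in Table 2 follow directly from the two sigma columns. The short check below recomputes them from the tabulated values, assuming the increase is defined relative to the uncorrelated sigma; the variable names are ours.

```python
# (uncorrelated sigma, correlated sigma) in ps, copied from Table 2.
sigmas = {
    "p1": (43.5, 68.5),
    "p2": (45.1, 73.1),
    "p3": (60.0, 114.6),
    "p4": (56.8, 109.6),
    "p5": (60.9, 115.8),
    "p6": (48.4, 89.8),
    "p7": (55.4, 103.1),
}

# Percentage increase of the correlated sigma over the uncorrelated sigma.
increase = {
    path: 100.0 * (correlated - uncorrelated) / uncorrelated
    for path, (uncorrelated, correlated) in sigmas.items()
}
average_increase = sum(increase.values()) / len(increase)

for path, value in increase.items():
    print(f"{path}: {value:.1f}%")
print(f"average: {average_increase:.1f}%")
```

The average lands within a tenth of a percentage point of the reported 80.7%; the exact last digit depends on whether the per-path values are rounded before averaging.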

#### 6 Conclusions

In conclusion, we have presented a new method for computing the delay distribution of critical paths that considers inter- and intra-die variations. We proposed a model for inter- and intra-die device length variation and showed how the delay distribution can be efficiently computed using delay sensitivities. We also proposed a new model for spatial correlations that can accurately capture the effect of intra-die spatial correlations. The methods were tested on paths from a high-performance microprocessor. Monte Carlo simulation was used to demonstrate the high accuracy of the proposed approach.

#### Acknowledgements

This research was supported by SRC contract 2001-HJ-959 and NSF grant CCR-0205227.

#### References

- [1] M. Orshansky, L. Milor, P. Chen, K. Keutzer, C. Hu, "Impact of systematic spatial intra-chip gate length variability on performance of high-speed digital circuits," ICCAD 2000, pp. 62-67.
- [2] V. Mehrotra, S. L. Sam, D. Boning, A. Chandrakasan, R. Vallishayee, S. Nassif, "A methodology for modelling the effects of systematic within-die interconnect and device variation on circuit performance," DAC 2000.
- [3] S. Devadas, H. F. Jyu, K. Keutzer, S. Malik, "Statistical timing analysis of combinational circuits," ICCD 1992, pp. 38-43.
- [4] R. B. Brashear, N. Menezes, C. Oh, L. Pillage, R. Mercer, "Predicting circuit performance using circuit-level statistical timing analysis," European Design and Test Conference, 1994.
- [5] M. Berkelaar, "Statistical Delay Calculation, a Linear Time Method," Proceedings of TAU 97, Austin, TX, December 1997.
- [6] J. J. Liou, K. T. Cheng, S. Kundu, A. Krstic, "Fast Statistical Timing Analysis By Probabilistic Event Propagation," DAC 2001.
- [7] M. Orshansky, K. Keutzer, "A general probabilistic framework for worst-case timing analysis," Proc. DAC 2002.
- [8] A. Gattiker, S. Nassif, R. Dinakar, C. Long, "Timing Yield Estimation from Static Timing Analysis," Proc. ISQED 2001.