Instituto Tecnológico y de Estudios Superiores de Occidente

Official recognition of validity of higher-education studies under secretarial agreement 15018, published in the Diario Oficial de la Federación on November 29, 1976.

## Department of Electronics, Systems and Informatics, Master's Program in Electronic Design



# ANALYSIS AND DESIGN OF POWER DELIVERY NETWORKS EXPLOITING SIMULATION TOOLS AND NUMERICAL OPTIMIZATION TECHNIQUES

THESIS submitted to obtain the DEGREE of MASTER IN ELECTRONIC DESIGN

Presented by: BENJAMIN MERCADO CASILLAS

Advisor: DR. JOSÉ ERNESTO RAYAS SÁNCHEZ

Tlaquepaque, Jalisco. February 2021.

### MASTER IN ENGINEERING (2020) Master's Program in Electronic Design

| TITLE:          | Analysis and Design of Power Delivery Networks Exploiting Simulation Tools and Numerical Optimization Techniques |
|-----------------|------------------------------------------------------------------------------------------------------------------|
| AUTHOR:         | Benjamin Mercado Casillas                                                                                        |
|                 | Communications and Electronics Engineer (Universidad de Guadalajara, Mexico)                                     |
| THESIS ADVISOR: | José Ernesto Rayas Sánchez                                                                                       |
|                 | Department of Electronics, Systems and Informatics, ITESO                                                        |
|                 | Electronics Engineer (ITESO, Mexico)                                                                             |
|                 | Master in Electronic Systems (ITESM Campus Monterrey, Mexico)                                                    |
|                 | Doctor in Electrical Engineering (McMaster University, Canada)                                                   |
|                 | Senior Member, IEEE                                                                                              |

NUMBER OF PAGES: ix, 112

### Dedication

I dedicate this thesis to my wife Ana Lilia, my daughter Fernanda, and my son Pedro Yazael; they kept me motivated throughout this master's degree program and always encouraged me to overcome all obstacles.

To my mother, Mercedes, and my father, Pedro, for the great values they gave me, which have helped me to steer my life with perseverance and honesty.

To my brothers and sisters, for being my role models.

To God, who always gives me the willpower to persevere and leads me to make the right decisions throughout my life.

## Summary

Higher computing performance is demanded year after year, driving the digital industry to compete fiercely to offer the fastest computer system at the lowest cost. As computing performance grows, power delivery network (PDN) and power integrity (PI) design becomes increasingly relevant because of the faster speeds and greater parallelism needed to achieve that growth. The largest data throughput at the lowest power consumption is a common goal for most commercial computing systems. As a consequence of these performance and power delivery tradeoffs, the complexity of analyzing and designing PDNs in digital systems keeps increasing. This complexity leads to longer design cycles when traditional design tools are used. More efficient design methods are therefore needed to keep bringing products to market quickly, which pushes PDN designers to look for methodologies that simplify analysis and reduce design cycle times. The main objective of this Master's thesis is to propose alternative methods that exploit reliable simulation approaches and efficient numerical optimization techniques to analyze and design PDNs and ensure power integrity. This thesis explores the use of circuital models and electromagnetic (EM) field solvers in combination with numerical optimization methods, including parameter extraction (PE) formulations. It also establishes a sound basis for using space mapping (SM) methodologies in future developments, so as to exploit the advantages of the most accurate and powerful models, such as 3D full-wave EM simulators, while conserving the simplicity and low computational cost of analytical, circuital, and empirical models.

## Contents

- Summary
- Contents
- Introduction
- 1. An Introduction to Power Delivery Networks for IC Packages and Printed Circuit Boards
    - 1.1. Power Delivery Main Problems
        - 1.1.1 DC Drop
        - 1.1.2 Power Loss
        - 1.1.3 Transient Voltage Droop
        - 1.1.4 Ground Bounce
        - 1.1.5 Electromagnetic Interference (EMI)
    - 1.2. Main Causes of PDN Problems
        - 1.2.1 Resistance
        - 1.2.2 Inductance
        - 1.2.3 Parallel Resonance
    - 1.3. Methodologies to Analyze PDN
        - 1.3.1 Models Based on Analytical Expressions
        - 1.3.2 Circuit Simulator Based Modeling
        - 1.3.3 Method Based on Electromagnetic Field Simulator
        - 1.3.4 Lumped Power Delivery Network Model
    - 1.4. Conclusions
- 2. Methodology Based on Electromagnetic Field Solvers and Circuital Models for PDN Analysis
    - 2.1. Power Distribution Network Analysis Using Field Solvers and Circuital Models
    - 2.2. [...]
        - 2.2.1 Field Solver Simulation and Data Generation
    - 2.3. [...]
- 3. Accurate and Computationally Efficient Power Delivery Network Lumped Models Obtained from Parameter Extraction
    - 3.1. Evaluating the Influence of PDN on Signal Integrity
    - 3.2. Distributed PDN Model Description
    - 3.3. Lumped PDN Model Definition
        - 3.3.1 Lumped Model Topology
        - 3.3.2 Lumped Elements Classification
    - 3.4. Description of the Parameter Extraction by Optimization
        - 3.4.1 Optimization Variables, Pre-Assigned Parameters, and Response of Interest
        - 3.4.2 Objective Function
        - 3.4.3 Optimization Problem Formulation
        - 3.4.4 Optimization Method
        - 3.4.5 Seed Values
        - 3.4.6 Optimization Results
    - 3.5. Optimized PDN Applied to DDR Signal Integrity Analysis
        - 3.5.1 SI Analysis Assumptions
        - 3.5.2 SI Tests Description
        - 3.5.3 SI Analysis Results
    - 3.6. Conclusions
- 4. Accurate Simulation of Package Substrate Air Core Inductors
    - 4.1. Designing Package Substrate Air Core Inductors (ACI)
        - 4.1.1 ACI Design Process
        - 4.1.2 ACI Design Optimization Process
    - 4.2. Physical Structure and Electrical Characteristics of an ACI
        - 4.2.1 Input Design Parameters of Physical Characteristics of the ACI
        - 4.2.2 Output Responses of Electrical Characteristics of the ACI
    - 4.3. ACI Fine Model Description and Simulation Results
        - 4.3.1 3D Solver Parameters for Simulating the Fine Model
        - 4.3.2 Fine Model Simulation Results
        - 4.3.3 Fine Model Simulation Resources
        - 4.3.4 Conclusions
- 5. Developing a Coarse Model of a Package Substrate Air Core Inductor for Space Mapping Applications
    - 5.1. Coarse Model Description
        - 5.1.1 Coarse Model 1: 3D Zero Order with Coarse Initial Mesh (Using PSI-3D)
        - 5.1.2 Coarse Model 2: 3D Zero Order with Perfect Conductor (Using PSI-3D)
        - 5.1.3 Coarse Model 3: 3D Zero Order Light Model (Using PSI-3D)
        - 5.1.4 Coarse Model 4: 3D Zero Order with Zero Metal Thickness Threshold (Using PSI-3D)
        - 5.1.5 Coarse Model 5: 2.5D Model (Using PowerSI)
    - 5.2. Coarse Model Decision Criteria
        - 5.2.1 Coarse Model Accuracy
        - 5.2.2 Simulation Time and Computing Resources
        - 5.2.3 Nature of Design Variables
        - 5.2.4 Geometric Characteristics Parametrization
    - 5.3. Conclusions
- General Conclusions
- Appendices
    - A. List of Internal Research Reports
    - B. Published Papers
- Bibliography
- Index

## Introduction

This Master's thesis elaborates on alternative methods to analyze and design power distribution networks (PDN) to ensure power integrity by combining reliable and accurate simulation tools with efficient numerical optimization techniques. In particular, it explores the use of circuital lumped and distributed models, as well as electromagnetic (EM) field solvers, in combination with numerical optimization methods that make use of parameter extraction (PE) formulations. The thesis also establishes a solid basis for applying advanced space mapping (SM) optimization methodologies in future developments, aiming at exploiting the most accurate and powerful PDN models, such as those based on 3D full-wave EM simulators, while conserving the simplicity and low computational cost of circuital representations. The thesis is organized as follows.

A power distribution network (PDN) overview is presented in Chapter 1 as a background on this subject. The main power delivery problems and their respective root causes are described in general terms. This chapter also mentions some typical methodologies used in industry to design and analyze PDNs.

Chapter 2 presents a well-established methodology based on electromagnetic field solvers and equivalent circuits for modeling and analyzing power distribution networks. In this chapter, an illustrative case study is also analyzed in order to exemplify the techniques typically used in this methodology.

In Chapter 3, a low computational cost optimization method based on a parameter extraction (PE) technique is proposed to develop an accurate PDN lumped model. Once this model is available, it is used in the simulation process during the signal integrity (SI) and power integrity (PI) design cycle, making the whole design process more efficient. As a consequence, the design cycle time is reduced, and at a much lower computational cost.

A common industrial strategy to obtain efficient power consumption and meet performance targets consists of placing high-frequency voltage regulators (VR) close to the silicon devices and inserting VR output air core inductors (ACI) in the package substrates of microprocessors and chipsets. Chapter 4 describes the relevance of package substrate air core inductors in delivering power to the silicon die. It also explains the ACI physical structure and its simulation using a detailed physical model, or fine model, to obtain highly accurate electrical responses, which are needed for designing an efficient high-frequency VR.

In an effort to speed up analysis and design, as well as to reduce complexity, PDN designers are looking for methodologies to simplify these tasks. Many of these methodologies, such as space mapping, rely on surrogate or coarse models that replace the fine or detailed models, which typically require long simulation times and a large computational cost. Surrogate models facilitate the iterative search for an optimal solution within the design parameter space at a low computational cost. Suitable coarse models are also a key element of the space mapping optimization process. Chapter 5 presents a computationally efficient coarse model for a package substrate air core inductor (ACI) used in a PDN. In a future development, this coarse model can be used in a space mapping formulation in conjunction with the original fine model.

The author thanks Dr. José E. Rayas-Sánchez, director of the Research Group on Computer-Aided Engineering of Circuits and Systems (CAECAS) at ITESO, for his valuable guidance throughout the development of this thesis. The author thanks Intel Corporation, Santa Clara, CA, for making PowerSI<sup>TM</sup>, HSPICE<sup>®</sup>, and Matlab<sup>®</sup> licenses available for running all evaluations in this work. The author thanks Mauro Lai and Charles Fulcher, from Intel Corporation, Wisconsin and Oregon, respectively, for providing SI simulation data including the PDN impacts. He also thanks Ivan Cinco-Galicia and Felipe Leal-Romo, from Intel Corporation, Guadalajara: Ivan helped to extract the initial values of the PDN lumped elements presented in Chapter 3, and Felipe provided guidance to the author during the development of this thesis. The author thanks Joel Auernheimer, from Intel Corporation, Arizona, for his guidance in defining the problem statement for the PI-SI co-simulation discussed in Chapter 3. Finally, the author thanks Dat Le, from Intel Corporation, Oregon, for providing DDR3 modeling assumptions.

## 1. An Introduction to Power Delivery Networks for IC Packages and Printed Circuit Boards

The number of transistors in a microprocessor die has been growing exponentially for decades. This trend has been possible because transistor technology has improved continuously, yielding smaller and faster transistors. At the same time, current consumption has steadily increased, making it very important to have a low-impedance path from the power supply to the die. If this path is not correctly designed, it can produce excessive noise that might degrade the performance of the components. As described in [Bogatin-10], this path is called the power distribution network (PDN); it starts at the power supply or voltage regulator module (VRM) and ends at the circuits in the silicon die or chip. In other words, the PDN comprises all the system elements that help supply the necessary power to a target load: the VRM, the supply output inductors, the bulk and decoupling capacitors, the copper traces in the board, the plated through-holes (PTHs), the socket connectors of package substrates, the package planes and vias, the controlled collapse chip connection (C4) bumps connecting to the die, the metal layers inside the chip, the metal-insulator-metal (MIM) capacitors or extrinsic capacitors, the die capacitance (Cdie) or intrinsic capacitors, and the interconnections in the chip. The target load is basically the silicon component, such as a microprocessor, microcontroller, FPGA, CPLD, or in general a chipset, which must receive this power to perform its function flawlessly. Fig. 1-1, taken from [Molex-04], illustrates a power delivery network. This definition indicates that the PDN can be considered an ecosystem in which the silicon resides, and its main purposes are to supply sufficient and clean power to the target load, to minimize ground bounce, and to minimize electromagnetic interference (EMI) problems.

The voltage provided by the PDN to the silicon device has to be within a specified range, which allows the transistors in the device to work properly. As soon as the circuitry in the integrated circuit (IC) device starts switching, it draws current from the PDN, creating a current transient that interacts with the inherent impedance of the network and generates a voltage fluctuation, or transient noise, in the power rails. This noise has to be within a valid range; otherwise it can affect the digital or analog signals.

Fig. 1-1 Main components in a power distribution network (PDN). Figure taken from [Molex-04].

Ground bounce is related to the noise level of the reference path of the signals, or the return path of many power rails, which in most cases is a power plane in the PDN. When many signals switch simultaneously and share the same reference plane, many of their return currents overlap. This phenomenon is also known as simultaneous switching noise (SSN).

The power planes in a board are the largest conductive paths. They transport a considerable amount of current, usually with high frequency noise components, and hence these planes are more likely to radiate electromagnetic interference to the rest of the components in the board.

### 1.1. Power Delivery Main Problems

As mentioned before, one of the main goals of a PDN is to provide a very stable and clean voltage to the load at any current demand. Such a voltage could only be provided by an ideal voltage source connected directly to the load. Unfortunately, such sources do not exist; instead, there is a large resistive and inductive path from the source to the load. The good news is that, with an appropriate PDN, this voltage can be kept within a specified tolerance band that does not affect the correct functionality of the silicon circuitry or the environment. This voltage tolerance has to be maintained from DC up to the frequency bandwidth of the switching current, which is typically up to 1 GHz. Several drawbacks make it difficult for the PDN to meet those tolerances; some of them are described in the following paragraphs.

#### 1.1.1 DC Drop

If the silicon die demands a constant DC current from the PDN, a DC drop is produced across the power planes all the way from the load (silicon die) to the voltage regulator (VR). This drop is also known as IR drop since it is caused by ohmic losses, produced by the inherent resistivity of the planes and interconnects. During PDN design, DC drop must be taken into account for several reasons, for instance to create a compensation mechanism so that the DC voltage at the die pads is set closer to the nominal required voltage. The compensation mechanism uses a feedback sense point: the VR sense monitors the voltage level at a certain location and provides feedback to the VR for proper regulation. DC drop analysis is used to place the VR sense point at a convenient location in order to achieve a good balance. Additionally, these DC studies help to identify possible current bottlenecks, allowing the designer to reinforce specific areas such as PTHs, vias, and planes. For more information, consult [Bogatin-10], [Novak-07].

#### 1.1.2 Power Loss

When a current passes through a resistive path such as power planes, vias, or connectors, the DC voltage decreases along that path, producing a power loss, that is, energy transformed into heat. This power loss is energy that does not reach the load or silicon circuitry, and therefore the VR needs to compensate by providing additional power to satisfy the load requirements. There are limits for the temperature of the PDN. If these limits are exceeded, PDN operation begins to deteriorate, creating problems and failures in the die. Moreover, if this power loss is not taken into account, the VRM may reach its power limits and overheat, or simply consume too much power. Another aspect to consider is the heat in a printed circuit board (PCB) or package board, which needs to be properly dissipated; it is therefore important to locate where the power is being dissipated, so that a thermal analysis can be done and a proper thermal solution can be implemented.
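As a rough numerical illustration of the DC drop and power loss described above, the following minimal sketch applies Ohm's law to a single lumped path resistance. The function name and the numerical values (50 A, 1 mΩ, 1.0 V) are hypothetical and chosen only for illustration; they are not taken from this thesis.

```python
def dc_drop_and_loss(current_a, path_resistance_ohm, vr_voltage_v):
    """Return (load voltage, IR drop, power lost in the path) for a DC current.

    Assumes the whole VR-to-die path is lumped into one resistance,
    which is a simplification made only for this example.
    """
    ir_drop_v = current_a * path_resistance_ohm          # V = I * R
    load_voltage_v = vr_voltage_v - ir_drop_v            # voltage left at the die pads
    power_loss_w = current_a ** 2 * path_resistance_ohm  # P = I^2 * R, dissipated as heat
    return load_voltage_v, ir_drop_v, power_loss_w


# Hypothetical operating point: 50 A through a 1 mOhm path from a 1.0 V regulator.
load_v, drop_v, loss_w = dc_drop_and_loss(
    current_a=50.0, path_resistance_ohm=1e-3, vr_voltage_v=1.0
)
print(f"IR drop: {drop_v * 1e3:.1f} mV, load voltage: {load_v:.3f} V, loss: {loss_w:.2f} W")
```

Even a milliohm-level path resistance costs tens of millivolts and a few watts at these hypothetical currents, which is why the VR sense point and the thermal solution both depend on where this resistance is located.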

#### 1.1.3 Transient Voltage Droop

As the activity of the die changes, the die current fluctuates. This change is most pronounced when the die switches from a low-activity state to a high-activity state. This is also known as the dI/dt event, which represents the instantaneous current change driven by the die activity. Because of this transient current, not only the resistive part of the impedance acts; the capacitive and inductive components also influence the total impedance. When the activity state changes, the die tries to draw current from the source, in this case the VRM, but because of the short transition time the impedance from the die pads to the VRM increases, limiting the flow of current to the die. Consequently, the power available at the die pads is insufficient because the VRM cannot supply the required current, and the voltage on the die pads fluctuates as a function of time during this transitional state. This voltage fluctuation is known as the droop, and the order of the droop depends on the time scale at which it occurs. A solution for this problem is multi-stage decoupling, with different types of capacitors located at specific areas along the system. When the die draws a high-frequency current step, the first tier of capacitors, such as the land side capacitors (LSC) located underneath the die package, starts providing power to the die. The resulting voltage droop at this moment is known as the first droop. For the middle-frequency components of the original current step, a second tier of capacitors located slightly farther from the die pads, such as the die side capacitors (DSC), provides power to the die once the first tier of capacitors has run out of energy. The die pad voltage at this time is known as the second droop. Finally, for the low-frequency components of the original current step, the bulk and decoupling capacitors on the motherboard start providing power to the die; the die pad voltage at this moment is known as the third droop. For more details see [Bogatin-10], [Novak-07].
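To give a feel for the magnitude of a dI/dt event, the sketch below uses a first-order estimate in which the path between a decoupling stage and the die is reduced to a series resistance and inductance, so the droop is approximately L·dI/dt + R·ΔI. This series R-L simplification, the function name, and all numerical values are assumptions made for illustration; they are not the model or data used in this thesis.

```python
def first_order_droop(delta_i_a, rise_time_s, loop_inductance_h, path_resistance_ohm=0.0):
    """Estimate the voltage droop caused by a linear current ramp.

    Assumes a series R-L path (an illustrative simplification):
    droop = L * dI/dt + R * delta_I
    """
    di_dt = delta_i_a / rise_time_s                   # slope of the current step, A/s
    inductive_term_v = loop_inductance_h * di_dt      # voltage across the inductance
    resistive_term_v = path_resistance_ohm * delta_i_a
    return inductive_term_v + resistive_term_v


# Hypothetical event: 10 A step in 1 ns through 5 pH of loop inductance and 1 mOhm.
droop_v = first_order_droop(
    delta_i_a=10.0, rise_time_s=1e-9, loop_inductance_h=5e-12, path_resistance_ohm=1e-3
)
print(f"Estimated droop: {droop_v * 1e3:.1f} mV")
```

Under these assumptions the inductive term dominates for fast edges, which is consistent with placing the first tier of capacitors as close to the die as possible.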

#### 1.1.4 Ground Bounce

The ground bounce effect is also known as simultaneous switching noise (SSN). It is observed when the return path is constricted and the return currents from different signals overlap, and it is also generated when many output buffers switch at the same time. Ground bounce causes a significant voltage drop in the power rails [Bogatin-10], [Novak-07].

#### 1.1.5 Electromagnetic Interference (EMI)

Since power nets are usually the largest conductors in a board and carry high levels of current with high-frequency noise components, power planes can radiate electromagnetic emissions. Hence the noise of one power rail can couple to other planes, to signal lines, or into the environment, generally causing failures.

If these aspects are not considered in the PDN design, there will be excessive noise on the power pads of the ICs, affecting timing, logic, and functionality. Also, when there is excessive noise on the pads, the devices attached to the IC may be driven into the non-linear region, or breakdown may even occur in low-voltage devices [Bogatin-10], [Novak-07].

### 1.2. Main Causes of PDN Problems

Even though PDNs are essentially planes that connect the source of power to the target load, resembling an ideal short circuited link, there are some physical parasitics that make this connection not as clean as one may think.

#### 1.2.1 Resistance

The first effect is the resistivity of the planes. The most common conductor used to build the planes is copper, which is not an ideal conductor and has a finite resistivity. This parasitic produces a resistance in the planes that depends on the length *l*, the cross-sectional area *A*, and the resistivity $\rho$ of the conductor. The relationship between these variables for a rectangular conductor is described by

$$R = \frac{\rho l}{A} \tag{1-1}$$

where *A* is the area of the conductor cross section, i.e., the product of the width *w* and the thickness *t* of the plane. This resistance is also known as $R_{dc}$ and causes a voltage drop, or IR drop, that depends on the current passing through the conductor. The IR drop is easily determined when the load sinks a DC current, but it is also present across all die switching frequencies. Equation (1-1) assumes that the electric current is homogeneously distributed across the conductor cross section, which is valid only at low frequencies. Fig. 1-2 shows a rectangular conductor and the parameters that define its resistance.
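As a quick numerical illustration of (1-1), the following sketch evaluates the DC resistance of a rectangular copper strip and the resulting IR drop. The resistivity, the 1 oz copper thickness, the strip dimensions and the 10 A load are assumed values chosen only for illustration; they are not taken from any design discussed in this work.

```python
def plane_resistance(rho, length, width, thickness):
    """DC resistance of a rectangular conductor, R = rho * l / (w * t), in ohms."""
    return rho * length / (width * thickness)

RHO_CU = 1.68e-8   # ohm*m, copper near room temperature (assumed value)
T_1OZ = 35e-6      # m, nominal thickness of 1 oz copper (assumed value)

# Hypothetical strip: 50 mm long, 10 mm wide, 1 oz copper
r_dc = plane_resistance(RHO_CU, length=0.05, width=0.01, thickness=T_1OZ)
ir_drop = 10.0 * r_dc   # volts, for a hypothetical 10 A DC load

print(f"R_dc = {r_dc * 1e3:.2f} mOhm, IR drop = {ir_drop * 1e3:.1f} mV")
```

With these assumed numbers the strip presents about 2.4 mΩ, so a 10 A load loses roughly 24 mV before reaching the die, which can be a noticeable share of the noise budget of a low-voltage rail.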

Temperature is another variable that affects the resistivity of the material: the higher the temperature, the higher the resistance of the plane.

The skin effect is an additional factor that influences the plane's resistance. When a DC current flows through the power plane, the entire cross-sectional area is used. However, when an AC current flows through the plane, it concentrates near the outer surface of the conductor. This is called the skin effect, and the normalized resistance under high-frequency conditions ( $f \ge 500 \text{ MHz}$ ) for a rectangular plane is described by

$$\frac{R}{R_{\rm dc}} \cong \frac{t}{2\delta} \,, \tag{1-2}$$

$$\delta = \frac{1}{\sqrt{\pi f \mu \sigma}} \tag{1-3}$$

where $\delta$ is the skin depth, *f* is the frequency, $\mu$ is the permeability and $\sigma$ is the conductivity. This phenomenon matters at high frequency because the current tends to flow near the surface of the structure even if the plane is thick.
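The sketch below evaluates (1-3) and (1-2) for copper at a few frequencies. The conductivity value and the 35 µm plane thickness are assumptions. The floor at 1 in `resistance_ratio` is also an assumption added here, since (1-2) only holds when the skin depth is much smaller than the thickness and would otherwise predict a resistance below $R_{dc}$.

```python
import math

MU0 = 4e-7 * math.pi   # H/m, permeability of free space
SIGMA_CU = 5.8e7       # S/m, copper conductivity (assumed nominal value)

def skin_depth(f, sigma=SIGMA_CU, mu=MU0):
    """Skin depth in metres, equation (1-3)."""
    return 1.0 / math.sqrt(math.pi * f * mu * sigma)

def resistance_ratio(f, thickness):
    """R/R_dc from (1-2), floored at 1 where the approximation breaks down."""
    return max(1.0, thickness / (2.0 * skin_depth(f)))

for f in (1e6, 500e6, 1e9):
    d = skin_depth(f)
    print(f"{f / 1e6:7.0f} MHz: delta = {d * 1e6:6.2f} um, "
          f"R/Rdc ~ {resistance_ratio(f, 35e-6):.2f}")
```

At 1 GHz the skin depth of copper is close to 2 µm, so a 35 µm plane already shows several times its DC resistance under this approximation.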

#### 1.2.2 Inductance

Inductance is another parasitic in the power planes, and it plays a significant role in PDN performance. When the load is switching, it demands an alternating current that propagates to the planes. If the inductance of the plane is too high, the electrical path from the source to the load will exhibit high impedance to this alternating current. Therefore, most of the energy of the power supply will no longer reach the load and, as a consequence, the load will experience a voltage droop.

Inductance is a proportionality constant that describes the sensitivity of the generated voltage to a changing current. It can also be defined as the ratio of the voltage to the rate of change of the current flowing through an inductive element. In other words, inductance measures the opposition to a change in current. Therefore, the higher the inductance, the higher the sensitivity to current changes and hence the larger the voltage drop. With a DC current, where there is no current change, the inductive element acts as a short circuit; for higher-frequency currents, the inductive element shows higher impedance.
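The two statements above, voltage proportional to the rate of change of current and impedance growing with frequency, can be made concrete with a minimal sketch. The 100 pH loop inductance, the 10 A step and the 1 ns transition are hypothetical values used only for illustration.

```python
import math

def inductive_drop(l_loop, delta_i, delta_t):
    """Voltage across an inductance for a linear current ramp, V = L * dI/dt."""
    return l_loop * delta_i / delta_t

def inductive_reactance(l_loop, f):
    """Magnitude of the inductor impedance, |Z| = 2*pi*f*L."""
    return 2.0 * math.pi * f * l_loop

L_LOOP = 100e-12   # H, hypothetical loop inductance

print(inductive_drop(L_LOOP, delta_i=10.0, delta_t=1e-9))  # volts for 10 A in 1 ns
print(inductive_reactance(L_LOOP, 0.0))                    # ohms at DC (short circuit)
print(inductive_reactance(L_LOOP, 100e6))                  # ohms at 100 MHz
```

Even a loop inductance as small as 100 pH produces a drop on the order of 1 V for this assumed current step, which is why the inductance of the supply path dominates the transient behavior.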

#### 1.2.3 Parallel Resonance

At frequencies where inductance starts to dominate the PDN, capacitors play an important role by lowering the impedance. Moreover, the capacitors act as local batteries that provide the necessary energy to the load components. These capacitors have to be placed at specific locations across the board and need to be large enough to provide an adequate amount of charge.

We know that the impedance of an ideal capacitor decays as the frequency increases. If an ideal capacitor were available, it would be placed close to the die pads. Its energy would then be transferred to the die at all frequencies, and power delivery issues would be significantly reduced. However, real capacitors have two series parasitics: the equivalent series inductance (ESL) and the equivalent series resistance (ESR). In addition, capacitors with larger capacitance tend to exhibit larger parasitic inductance. A real capacitor can be approximated by an RLC circuit model. It therefore behaves as an ideal capacitor at low frequencies, reaches its lowest impedance at a certain frequency, and then behaves as an inductor, with an impedance that increases as the frequency grows. Fig. 1-3 shows the impedance profile of a real capacitor.

Fig. 1-2 Parameters to define the resistance of a rectangular conductor plane.

Fig. 1-3 Impedance profile of a real capacitor represented by an RLC model. Figure taken from [Bogatin-10].

The frequency at which the capacitor presents its lowest impedance is the self-resonant frequency, given by

$$f_{\rm SRF} = \frac{1}{2\pi} \frac{1}{\sqrt{LC}} \tag{1-4}$$

where $f_{SRF}$ is the self-resonant frequency, *L* is the equivalent series inductance and *C* is the capacitance. With *L* in H and *C* in F, $f_{SRF}$ is obtained in Hz; with *L* in nH and *C* in nF, it is obtained in GHz. The capacitor's ESL is related to the design of the complete path of the power and return currents from the pads of the die, or from wherever the next tier of capacitors is, to the capacitor element. For instance, the ESL associated with the capacitor and its path to the load can include: the loop inductance of the surface traces, the loop inductance of the vias crossing the planes, the spreading inductance from the capacitor vias to the vias of the BGA, the loop inductance from the cavity under the package to the leads or solder balls of the package, etc.
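The series RLC model of a real capacitor and its self-resonant frequency (1-4) can be sketched as follows. The component values (100 nF, 1 nH, 10 mΩ) are hypothetical, and SI units are used throughout so that the result is in Hz.

```python
import numpy as np

def cap_impedance(f, c, esl, esr):
    """Impedance of a series R-L-C capacitor model at frequency f (Hz)."""
    w = 2.0 * np.pi * np.asarray(f, dtype=float)
    return esr + 1j * w * esl + 1.0 / (1j * w * c)

def srf(c, esl):
    """Self-resonant frequency in Hz, equation (1-4) with SI units."""
    return 1.0 / (2.0 * np.pi * np.sqrt(esl * c))

C, ESL, ESR = 100e-9, 1e-9, 10e-3   # hypothetical capacitor
f0 = srf(C, ESL)

print(f"SRF = {f0 / 1e6:.2f} MHz")
print(f"|Z| at SRF = {abs(cap_impedance(f0, C, ESL, ESR)) * 1e3:.2f} mOhm")
print(f"|Z| one decade below = {abs(cap_impedance(f0 / 10, C, ESL, ESR)):.3f} Ohm")
print(f"|Z| one decade above = {abs(cap_impedance(f0 * 10, C, ESL, ESR)):.3f} Ohm")
```

At the SRF the capacitive and inductive reactances cancel, so the impedance reduces to the ESR; one decade on either side the magnitude is almost the same, which reflects the V-shaped profile of Fig. 1-3.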

As mentioned before, by selecting the right number and values of capacitors, the impedance can be maintained below the target value. This can be achieved by placing several capacitors in parallel. Sometimes the capacitors are identical, to achieve a specific capacitance at a specific location on the board; other times capacitors with different values are combined to reduce the impedance over different frequency ranges. When the capacitors have the same value, the SRF stays at the same frequency but the overall impedance is lower. However, when capacitors with different capacitance values are placed in parallel, there is one SRF dip for each capacitor, and between the two SRFs a new impedance peak is generated, as shown in Fig. 1-4. This peak is called the parallel resonant peak, and it occurs at the parallel resonant frequency (PRF). This frequency depends on the ESL of the larger capacitor and the capacitance of the smaller capacitor. An approximation can be calculated using

$$f_{\rm PRF} \approx \frac{1}{2\pi} \frac{1}{\sqrt{C_2 ESL_1}} \tag{1-5}$$

where $f_{PRF}$ is the parallel resonant frequency, $C_2$ is the capacitance of the smaller capacitor and $ESL_1$ is the equivalent series inductance of the larger capacitor. With SI units the result is in Hz; with $C_2$ in nF and $ESL_1$ in nH it is in GHz. This approximation is valid only when the SRFs of the two capacitors are far apart from each other. Fig. 1-4 shows the parallel combination of two different capacitors. When two capacitors with different SRFs are connected in parallel, they create a parallel resonant impedance peak between their self-resonant dips. This peak can be high enough to make the PDN fail its specifications, and hence special care should be taken during PDN design.

Fig. 1-4 Impedance profile of two RLC circuits in parallel with same R and L but different C values. Figure taken from [Bogatin-10].

Fig. 1-5 Partitioning impedance profile of a complete PDN divided by the frequency range they influence. Figure taken from [Bogatin-10].
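To visualize the parallel resonant peak, the sketch below combines two series RLC capacitor models with the same ESR and ESL but different capacitance, as in Fig. 1-4, and locates the peak numerically. All component values are hypothetical. Besides the estimate (1-5), the code evaluates the resonance of the loop formed by both capacitors, obtained by setting the imaginary part of $Z_1 + Z_2$ to zero. That expression is an addition made here for comparison and is not part of the source formulation. It includes the ESL of the smaller capacitor, which (1-5) neglects.

```python
import numpy as np

def cap_z(f, c, esl, esr):
    """Impedance of a series R-L-C capacitor model."""
    w = 2.0 * np.pi * f
    return esr + 1j * w * esl + 1.0 / (1j * w * c)

# Hypothetical values: same ESR and ESL, different C
C1, C2 = 10e-6, 100e-9
ESL1 = ESL2 = 1e-9
ESR = 10e-3

f = np.logspace(6, 8, 4001)
z1, z2 = cap_z(f, C1, ESL1, ESR), cap_z(f, C2, ESL2, ESR)
z_par = z1 * z2 / (z1 + z2)

srf1 = 1.0 / (2.0 * np.pi * np.sqrt(ESL1 * C1))
srf2 = 1.0 / (2.0 * np.pi * np.sqrt(ESL2 * C2))

# Search for the peak only between the two self-resonant dips
between = (f > srf1) & (f < srf2)
f_peak = f[between][np.argmax(np.abs(z_par[between]))]

f_prf_15 = 1.0 / (2.0 * np.pi * np.sqrt(C2 * ESL1))            # estimate (1-5)
c_series = C1 * C2 / (C1 + C2)
f_prf_loop = 1.0 / (2.0 * np.pi * np.sqrt((ESL1 + ESL2) * c_series))

print(f"SRF1 = {srf1 / 1e6:.2f} MHz, SRF2 = {srf2 / 1e6:.2f} MHz")
print(f"numerical peak      = {f_peak / 1e6:.2f} MHz")
print(f"estimate (1-5)      = {f_prf_15 / 1e6:.2f} MHz")
print(f"loop-resonance form = {f_prf_loop / 1e6:.2f} MHz")
```

With equal ESLs, the estimate (1-5) coincides with the SRF of the smaller capacitor, while the numerical peak falls closer to the loop-resonance value. The estimate is therefore expected to work best when the ESL of the smaller capacitor is much lower than that of the larger one.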

As suggested by the previous drawbacks, the main goal of PDN design is to maintain low impedance from the die pads to the voltage regulator at all switching frequencies of the die. A very useful representation of a PDN can be obtained by partitioning the interconnections in the frequency domain into different regions, as suggested by Fig. 1-5. The plot shows the PDN impedance that the die sees across the frequency spectrum from 100 Hz to 10 GHz. The first impedance region is dominated by the voltage regulator and the $R_{dc}$ of the network. From DC (0 Hz) to 10 kHz the VR inductor starts presenting high impedance. After this, from 10 kHz to 100 kHz, the bulk capacitors lower the impedance. As the frequency increases, the lead inductance of the bulk capacitors and the spreading inductance of the board make the impedance increase again. The range from 100 kHz to 100 MHz is dominated by the board and the inductance of the package. After this range, the package capacitors start providing low impedance. Beyond 100 MHz, the inductance of the path from the package capacitors to the die presents high impedance. At this point the on-die capacitance plays a significant role in the network, since the loop inductance associated with it is low and it offers the lowest impedance at the highest frequencies.

### 1.3. Methodologies to Analyze PDN

Since most of the pre-silicon power delivery design decisions are made based on modeling data, it is important to have models that accurately describe the real behavior. When the actual physical component is not available, we use models, which are simplified replicas of the real device valid within a specific range of input parameters. The accuracy of a model depends on the complexity of the underlying physics and on how the input parameters interact within the desired range. Therefore, depending on the accuracy needed, we can select from a wide variety of modeling tools, ranging from an ideal lumped electric circuit model to a full-wave 2D or 3D distributed model with thousands of interconnected nodes.

Nowadays there are many techniques for modeling PDNs, which can basically be classified into three groups: methods based on analytical expressions, methods based on circuit simulators, and methods based on electromagnetic field simulators. Techniques that use analytical expressions are useful only for a limited set of practical problems or when a feasibility assessment is required to start a design. Circuit simulator techniques translate physical features such as conductors, dielectrics and interconnections into equivalent simple circuit elements such as resistors, inductors, capacitors and conductances, either lumped or distributed. By simulating all these elements we can obtain a frequency profile or a transient response and determine the performance of the whole PDN. Field solvers describe the electromagnetic behavior of the structures and hence include coupling effects in the models. Methods based on field solvers can be more accurate than analytical models or circuit-simulator-based methods. On the other hand, electromagnetic field simulators can slow down the PDN analysis because of the high computing resources and typically long run times they require.

#### 1.3.1 Models Based on Analytical Expressions



Fig. 1-6 Parallel plane structure. Box-shaped structure extracted to be analyzed using analytical expressions through the use of spreadsheet calculations.

Analytical expressions can be evaluated using tools such as spreadsheets, numerical analysis software or simply hand calculations.

Spreadsheet calculations are well suited for frequency-domain analysis or basic DC drop analysis of simple plane shapes, such as parallel rectangular power/ground planes with a few bulk and decoupling capacitors. An example of a structure that can be analyzed using a spreadsheet is presented in Fig. 1-6. In this figure a small box-shaped portion of the plane pair is extracted and analyzed using a double infinite summation of modal harmonics to obtain the impedance of the parallel-plate structure [Novak-07]. The known variables are: the length and width of the parallel planes, the thickness of the copper plane, the relative permittivity and loss tangent of the dielectric, and the plane separation. An example of an impedance profile of a parallel-plate structure is shown in Fig. 1-7, where a good correlation with measurement data was obtained. Using this AC profile, the parallel and series resonances up to 10 GHz can be identified. As suggested by Fig. 1-6, these data, corresponding to a square parallel-plane pair, can be integrated into a larger model that includes bulk and decoupling capacitors.

Fig. 1-7 Self-impedance of a parallel pair of planes using a spreadsheet that evaluates an analytical expression for the plane impedance (lossless). The right chart is an enlarged view of the left chart. Figure taken from [Novak-07].

Analytical expressions can also be evaluated using tools such as Matlab<sup>1</sup>. Its ability to compute iterations makes this tool well suited for the method of double infinite summation of modal harmonics. This method evaluates a doubly nested series summation using loops, plus an additional loop for the frequency sweep. Another analytical method that can be handled using Matlab is the cavity model for lossy power-return planes [Xu-03]. In this method the dielectric and conductor losses are included in the plane-pair impedance calculation. This model is accurate when the skin depth of the conductor is much smaller than the plane thickness.
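The loop structure described above (two nested loops over the modal indices and a third loop over frequency) can be sketched in a few lines. The source does not print the summation, so the expression used here is an assumption: the commonly used lossless cavity-model form for the self-impedance of a rectangular plane pair with open edges and an ideal point port. The plane dimensions, the dielectric constant and the truncation at 40 × 40 modes are also assumed values. Python is used instead of Matlab only for illustration.

```python
import numpy as np

EPS0 = 8.854187817e-12   # F/m
MU0 = 4e-7 * np.pi       # H/m

def plane_self_impedance(f, a, b, h, eps_r, x, y, n_modes=40):
    """Lossless cavity-model self-impedance of an a-by-b plane pair with
    separation h, seen at a point port located at (x, y).
    The infinite double sum is truncated at n_modes per index (assumption);
    the result diverges if f hits a cavity resonance exactly."""
    w = 2.0 * np.pi * f
    k2 = w**2 * MU0 * EPS0 * eps_r
    total = 0.0
    for m in range(n_modes):              # first nested loop: index m
        for n in range(n_modes):          # second nested loop: index n
            chi2 = (1 if m == 0 else 2) * (1 if n == 0 else 2)
            kmn2 = (m * np.pi / a)**2 + (n * np.pi / b)**2
            shape = (np.cos(m * np.pi * x / a) * np.cos(n * np.pi * y / b))**2
            total += chi2 * shape / (a * b * (kmn2 - k2))
    return 1j * w * MU0 * h * total

# Hypothetical 100 mm x 100 mm plane pair, 100 um dielectric, eps_r = 4, center port
A, B, H, EPS_R = 0.1, 0.1, 1e-4, 4.0
freqs = np.logspace(6, 9, 61)             # third loop: frequency sweep
z = np.array([plane_self_impedance(f, A, B, H, EPS_R, A / 2, B / 2) for f in freqs])

c_plane = EPS0 * EPS_R * A * B / H
print(f"static plane capacitance = {c_plane * 1e9:.2f} nF")
print(f"|Z| at 1 MHz = {abs(z[0]):.2f} Ohm")
```

At low frequency only the m = n = 0 term matters and the result reduces to the impedance of the static parallel-plate capacitance, which gives a simple sanity check for the implementation. For a point port the self-impedance series converges slowly, so a practical implementation would normally account for a finite port size.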

Another modeling method that uses analytical expressions is the Path-Based Equivalent Circuit (PBEC) [Kim-04]. This equivalent circuit is a power-bus model that includes decoupling capacitors and provides a simple and accurate lumped equivalent circuit model for a PDN. The model estimates the frequency behavior of a PCB. It considers the interference between the current paths of the decoupling capacitors, whereas conventional lumped models assume that all decoupling capacitors are connected in parallel, independently of each other. It also models the equivalent electrical parameters of the board to account for the board parasitics precisely, whereas conventional lumped models employ only the inter-plane capacitance of the power-ground planes. An example of a PBEC model is presented in Fig. 1-8, where the relationship between the PBEC and the physical board geometry is described. The PBEC can show the direct relationship between physical parameters and the impedance characteristic curves. Moreover, the PBEC is fast and can be solved using simple hand calculations. In [Kim-03] a method based on PDN synthesis with the PBEC model is reviewed, where a methodology that integrates the on-chip and off-chip power bus is proposed. This allows designers to accurately include the resonant peaks. One of those resonant peaks occurs due to a parallel resonant circuit composed of the on-chip decoupling capacitors and the off-chip power bus circuit components. As described previously, this parallel peak is one of the critical factors causing I/O switching noise in high-speed digital systems. The first step in this methodology is to determine the target impedance. After that, using genetic algorithms and direct search optimization, the decoupling capacitors are located in the most convenient places. Finally, the design parameters of the PDN are determined, such as the total on-chip decoupling capacitance, the effective inductance of the package power bus, the locations of the off-chip decoupling capacitors, and the type of capacitors. This method is very useful at early design steps.

Fig. 1-8 Relationship between the geometry of the physical board and the PBEC element model.

<sup>1</sup> MATLAB, ver. 7.0.1 (R14), The MathWorks Inc., Natick, MA, 2004.

When a more complex PDN is analyzed, such as a package or board with numerous vias, decoupling capacitors, irregular geometries and multiple plane layers, the multi-input multi-output transmission matrix method proposed in [Kim-01] can be used. This method supports frequency- and time-domain analysis for mixed-signal applications. It is 7 to 13 times faster and requires less memory than SPICE-based methods. In this method, the power/ground planes are divided into unit cells, each modeled with lumped elements (equivalent R, L, C and G components), and a cell matrix is built. Using a multi-input multi-output transfer function, the matrix for the entire PDN is calculated as the product of the individual square matrices formed by 2N-port networks having N input and N output ports. Once every parameter of the individual cell is computed, the transmission matrix method can be used for any plane geometry. The method uses the $\pi$ model for each unit cell in order to calculate the impedances, with or without decoupling capacitors, between two or more ports for irregular geometries. The modeling method is shown in Fig. 1-9. The circuit parameters of the cell are calculated from quasi-static models, using physical parameters of the cell such as the lateral dimension, the separation of the plane pair and the metal thickness, together with material parameters such as the dielectric constant, loss tangent and metal conductivity. To obtain good accuracy, the unit cell dimension must be at least 10 times smaller than the wavelength at the highest frequency of interest. Basically, the transmission matrix method consists of dividing the rectangular plane into $N \times M$ unit cells and extracting the transmission matrix of a 2N-port network that corresponds to a column subsection of the total cell matrix. This subsection is represented by a dashed-line rectangle in Fig. 1-9. Next, the transmission matrix of the overall PDN is computed by cascading, i.e., multiplying, the individual matrices, from which the voltage and current equations are derived. Finally, from the transmission matrix of the network, the self- and trans-impedances between ports can be calculated.

Fig. 1-9 Multi-input and multi-output transmission matrix: a) Complete plane structure and its corresponding cell partitioning, b) unit cell and its equivalent $\pi$ model. Figure taken from [Kim-01].
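The cascading idea can be illustrated with a deliberately reduced sketch: a one-dimensional chain of identical $\pi$ cells, each described by a 2 × 2 transmission (ABCD) matrix, multiplied together to obtain the matrix of the whole chain. This is not the 2N-port formulation of [Kim-01]; it only shows the matrix-product step and the $\lambda/10$ cell-size rule. The cell values (1 mΩ, 0.1 nH, 10 pF, no shunt conductance) and the function names are assumptions made for illustration.

```python
import numpy as np

C0 = 299792458.0   # m/s, speed of light in vacuum

def max_cell_size(f_max, eps_r):
    """Largest cell side allowed by the lambda/10 rule at f_max."""
    return C0 / (np.sqrt(eps_r) * f_max) / 10.0

def pi_cell_abcd(w, r, l, c, g):
    """ABCD matrix of a symmetric pi cell: shunt Y/2, series Z, shunt Y/2."""
    z = r + 1j * w * l
    y = g + 1j * w * c
    return np.array([[1 + z * y / 2, z],
                     [y * (1 + z * y / 4), 1 + z * y / 2]])

def cascade(w, n_cells, r, l, c, g):
    """Transmission matrix of n_cells identical cells connected in cascade."""
    total = np.eye(2, dtype=complex)
    cell = pi_cell_abcd(w, r, l, c, g)
    for _ in range(n_cells):
        total = total @ cell
    return total

N_CELLS, R, L, C, G = 20, 1e-3, 0.1e-9, 10e-12, 0.0   # hypothetical cell values
w = 2.0 * np.pi * 1e6
abcd = cascade(w, N_CELLS, R, L, C, G)
z_in_open = abcd[0, 0] / abcd[1, 0]   # input impedance with the far end open

print(f"max cell size at 1 GHz, eps_r = 4: {max_cell_size(1e9, 4.0) * 1e3:.1f} mm")
print(f"|Z_in| at 1 MHz = {abs(z_in_open):.1f} Ohm")
```

Because each cell is reciprocal, the determinant of the cascaded matrix stays at 1. At low frequency the open-ended chain looks like the sum of all cell capacitances, which provides a simple check of the matrix product.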

An alternative method for time-domain PDN analysis using analytical models is the parallel-distributed circuit simulation algorithm based on the Latency Insertion Method (LIM) [Watanabe-06]. In this method, the PDN of a PCB and package is modeled as two-dimensional power and ground planes discretized into unit cells, where each cell is formed by an RLCG equivalent circuit. The parameters of each element are derived from the dimensions and medium coefficients. LIM is an algorithm for time-domain simulation related to the finite-difference time-domain (FDTD) method, in which the node-voltage vector and the branch-current vector are computed alternately. In this algorithm the whole circuit to be analyzed is divided into several subcircuits, and each subcircuit is simulated by a separate processing element (PE). LIM requires that each branch has an inductor and each node has a grounded capacitor. For a branch that contains a resistor, an inductor and a voltage source, Kirchhoff's voltage law is applied to obtain the branch equation, and the branch current is updated. Next, for a node that has a parallel combination of a capacitor, a conductance and a current source to ground, Kirchhoff's current law is applied, and the node voltage is updated. The time-domain simulation is performed by alternating updates of branch currents and node voltages. In addition, actual PDNs have frequency-dependent properties such as the skin effect and dielectric losses, so a transmission line model is used instead of the simple RLCG model described before. Transmission line computation usually takes long CPU time and large memory capacity. Therefore, these frequency-dependent parameters are approximated using the first-order Debye rational function [Watanabe-06], with which the distributed series impedance and shunt admittance are approximated. From this function a frequency-dependent unit cell model can be derived, as represented in Fig. 1-10. The number of poles is obtained from the distributed series impedance formulation, and the number of parallel RL networks is equal to this number of poles. Likewise, the number of series GC networks is determined by the number of poles given by the shunt admittance formulation. The unknown parameters of the Debye functions ($R_i$, $L_i$, $G_i$ and $C_i$ for $i = 1 \dots N$) are determined using optimization methods, where the series impedance and shunt admittance formulations are fitted to sampled data calculated from the distributed series impedance and shunt admittance of the transmission line formulation.

Fig. 1-10 Unit cell model containing frequency-dependent parameters calculated using the first-order Debye model. Drawing taken from [Watanabe-06].
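The alternating update of node voltages and branch currents can be sketched for a simple one-dimensional chain of RLCG cells. This is a minimal single-threaded illustration of the leapfrog scheme only. It does not partition the circuit among processing elements and does not include the Debye frequency-dependent terms. The update expressions follow the usual semi-implicit LIM form, which is assumed here because the source does not print it, and all element values and the function name `lim_chain` are hypothetical.

```python
import numpy as np

def lim_chain(n_nodes, l, c, r, g_load, i_src, dt, n_steps):
    """Leapfrog LIM simulation of a chain: each node has a grounded capacitor c,
    each branch k -> k+1 has a series r and l. A step current i_src is injected
    at node 0 and a conductance g_load loads the last node."""
    v = np.zeros(n_nodes)          # node voltages (half time steps)
    i = np.zeros(n_nodes - 1)      # branch currents (integer time steps)
    g = np.zeros(n_nodes)
    g[-1] = g_load
    h = np.zeros(n_nodes)
    h[0] = i_src
    for _ in range(n_steps):
        # KCL at every node: net branch current leaving the node
        out = np.zeros(n_nodes)
        out[:-1] += i              # branch k leaves node k
        out[1:] -= i               # and enters node k+1
        v = (c / dt * v + h - out) / (c / dt + g)
        # KVL on every branch, using the freshly updated node voltages
        i = i + dt / l * (v[:-1] - v[1:] - r * i)
    return v, i

# Hypothetical chain: 10 nodes, 1 nH and 0.5 Ohm per branch, 1 pF per node
N, L, C, R = 10, 1e-9, 1e-12, 0.5
G_LOAD, I_SRC = 0.05, 0.1
DT = 1e-11                         # below sqrt(L*C) ~ 31.6 ps for stability
v, i = lim_chain(N, L, C, R, G_LOAD, I_SRC, DT, n_steps=5000)

print(f"load voltage   = {v[-1]:.4f} V (DC value {I_SRC / G_LOAD:.4f} V)")
print(f"source voltage = {v[0]:.4f} V")
```

After the transient has died out, the simulated voltages settle at the DC solution of the resistive chain, which is a convenient check that the two update equations are consistent with Kirchhoff's laws.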

#### 1.3.2 Circuit Simulator Based Modeling

By using a circuit simulator we can perform nonlinear dc, nonlinear transient and linear AC analysis starting from an electric circuit that includes resistors, capacitors, inductors, mutual inductors, independent voltage sources, independent current sources, dependent current and voltage sources, lossless and lossy transmission lines, semiconductors, etc.

Typical circuit simulators are based on SPICE (Simulation Program with Integrated Circuit Emphasis). Linear frequency-domain analysis (.AC) is a very useful simulation technique for analyzing PDNs. With this command, SPICE first calculates the DC operating point, i.e., all DC voltages and currents at all nodes in the circuit. Next, SPICE linearizes the whole circuit around the DC operating point. Finally, a modified nodal analysis (MNA) calculates the results over a frequency range using the MNA matrix with the real and imaginary parts of the impedance.
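For a network that is already linear, the frequency sweep performed by an AC analysis can be imitated with plain nodal analysis: at each frequency a complex admittance matrix is assembled, a 1 A current is injected at the port, and the resulting node voltage equals the port impedance. The sketch below is a simplified stand-in, not the SPICE implementation. It omits the operating-point and linearization steps and supports only two-terminal impedance branches excited by a current source. The function name `ac_impedance` and the branch description format are assumptions.

```python
import numpy as np

def ac_impedance(freqs, branches, n_nodes, port):
    """Port impedance versus frequency by nodal analysis.
    branches: list of (node_a, node_b, z_of_w), where node 0 is ground and
    z_of_w(w) returns the branch impedance at angular frequency w."""
    z = np.empty(len(freqs), dtype=complex)
    for idx, f in enumerate(freqs):
        w = 2.0 * np.pi * f
        y = np.zeros((n_nodes, n_nodes), dtype=complex)
        for a, b, z_of_w in branches:
            yb = 1.0 / z_of_w(w)
            # Stamp the branch admittance, skipping rows/columns of ground
            for p, q, sign in ((a, a, 1), (b, b, 1), (a, b, -1), (b, a, -1)):
                if p and q:
                    y[p - 1, q - 1] += sign * yb
        rhs = np.zeros(n_nodes, dtype=complex)
        rhs[port - 1] = 1.0                      # 1 A injected at the port
        z[idx] = np.linalg.solve(y, rhs)[port - 1]
    return z

# Hypothetical example: series R-L from node 1 to node 2, capacitor to ground
R, L, C = 10e-3, 1e-9, 100e-9
branches = [(1, 2, lambda w: R + 1j * w * L),
            (2, 0, lambda w: 1.0 / (1j * w * C))]
freqs = np.logspace(5, 9, 201)
z_port = ac_impedance(freqs, branches, n_nodes=2, port=1)

print(f"minimum |Z| = {np.min(np.abs(z_port)) * 1e3:.2f} mOhm "
      f"at {freqs[np.argmin(np.abs(z_port))] / 1e6:.1f} MHz")
```

The same routine can be reused for larger grid-based models by listing more branches, which is essentially what the simulator automates when it builds its matrix from a netlist.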

To use circuit simulators, we first have to construct a PDN model. This network can be a group of small pieces called grid cells, consisting of discrete RLCG elements or transmission line elements; such networks are called grid-based models. Fig. 1-11 shows an example of grid-based transmission line and discrete RLCG PDN models. Voids, odd-shaped planes, split planes, etc., must be considered when modeling a real board, so an adaptive grid size can be used to capture these details in the model. If a transmission line grid is selected, the unit cells are sized such that each transmission line segment in the model is a small fraction of the wavelength of the highest frequency component. For more details consult section 4.3 of [Novak-07]. The transmission line formulation is used to calculate each parameter. The characteristic impedance can be calculated using lossless or lossy characteristics, and propagation delay expressions are also used. The plane capacitance *C* per cell area can be approximated using the quasi-static formulation; then, from the propagation delay $t_{pd}$, the inductance *L* and the characteristic impedance $Z_0$ are obtained using the transmission line equations,

$$t_{\rm pd} = \sqrt{LC} \tag{1-6}$$

$$L = \frac{t_{\rm pd}^2}{C} \tag{1-7}$$

$$Z_0 = \sqrt{\frac{L}{C}} \tag{1-8}$$

where L and C are the total inductance and capacitance of each cell area represented by the transmission line.
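The sequence described above (capacitance from the quasi-static formula, then delay, inductance and characteristic impedance from the transmission line relations) can be sketched for a single square cell. The parallel-plate capacitance expression and the delay computed from the dielectric constant are assumptions made here. The sketch also treats the cell as one transmission line and ignores any correction factors that a two-dimensional grid of lines may need.

```python
import math

EPS0 = 8.854e-12    # F/m
C0 = 299792458.0    # m/s, speed of light in vacuum

def tline_cell(side, h, eps_r):
    """Parameters of one square cell represented by a transmission line.
    Returns (C, t_pd, L, Z0) in F, s, H and ohms."""
    c = EPS0 * eps_r * side * side / h       # quasi-static parallel-plate capacitance
    t_pd = side * math.sqrt(eps_r) / C0      # propagation delay across the cell
    l = t_pd**2 / c                          # equation (1-7)
    z0 = math.sqrt(l / c)                    # equation (1-8)
    return c, t_pd, l, z0

# Hypothetical cell: 5 mm side, 100 um plane separation, eps_r = 4
c, t_pd, l, z0 = tline_cell(side=5e-3, h=100e-6, eps_r=4.0)
print(f"C = {c * 1e12:.2f} pF, t_pd = {t_pd * 1e12:.2f} ps, "
      f"L = {l * 1e12:.2f} pH, Z0 = {z0:.3f} Ohm")
```

For a square cell the inductance obtained this way reduces to the permeability times the plane separation, independent of the cell side, which is a useful sanity check on the numbers.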

To add the conductive losses, each plane resistance is calculated separately. The total resistance per cell, R, can be calculated by adding the DC resistance,  $R_{dc}$ , and the skin-effect resistance,  $R_{skin}$ ,

$$R = R_{\rm dc} + R_{\rm skin} \tag{1-9}$$

The conductance G, which is the shunt element, is the sum of the dc conductance, $G_{dc}$, and the conductance due to dielectric loss, $G_{diel}$. It is assumed that $G_{diel}$ has a linear frequency dependence,

$$G = G_{\rm dc} + G_{\rm diel} \tag{1-10}$$

Fig. 1-11 Grid-based PDN model: a) Grid construction, b) discrete RLCG circuit element model, c) transmission line model. Figure taken from [Novak-07].

Another important aspect of the simulation is the input stimulus, which represents the electric current drawn by the die in its different operating states. In simulation, these stimuli are represented by current sources. The location of those current sources and the location of the probe point play a significant role in the final impedance profile. For instance, at low frequencies the simulated impedance is very similar all across the board, but at or above the first series resonance frequency it can be very different at various board locations. Therefore, the locations of the stimulus sources and probe points are very important in PDN analysis, and an adequate grid granularity is required at each probe or current source location.

#### 1.3.3 Method Based on Electromagnetic Field Simulator

The third group of modeling and analysis methods for PDNs consists of using field solvers. Employing these tools allows us to capture electromagnetic effects in the structures. From these effects, the losses, parasitics, coupling effects, fields, current distributions, self and mutual parameters, etc., can be described.

We can describe field solvers by three different aspects: the type of structure geometry to be analyzed, the numerical method applied and the solution domain.

There are three geometrical classes of field solvers:

- a) 2D cross-section or transverse field distributions
- b) Field solvers that mesh the surface of planar metals
- c) Field solvers that mesh a 3D volume.

For the first class, the solver works for any two-dimensional geometrical shape and assumes a uniform, constant cross-section along the longitudinal direction. This assumption makes this kind of tool relatively quick in comparison with the other two geometrical classes. A commercial 2D field solver is Ansoft Maxwell 2D<sup>2</sup>. For the second class, the field solver meshes the metallic surfaces in both the longitudinal and transversal directions, and it can consider vertical connections such as vias. These tools are sometimes known as 2.5D field solvers. Commercial tools that use the 2.5D approach are Sonnet EM<sup>3</sup>, Agilent Momentum<sup>4</sup>, and Ansoft SIwave<sup>5</sup>. The third geometrical class of field solvers meshes the geometry in three dimensions. With these tools more complex problems can be solved, and with more accuracy, thanks to the volume discretization of via transitions, multilayer discontinuities, structure coupling, capacitor mounting, connector coupling, etc. This class of tools usually takes more time to solve the problem, but its advantage is that it can be used for a wider range of problems. Some commercial tools that use 3D modeling are HFSS<sup>6</sup> and CST<sup>7</sup>.

The other aspect of electromagnetic field solvers is the numerical method applied, that is, how the corresponding Maxwell's equations are solved. Most solvers work by subdividing the geometry into cells that are a very small fraction of the wavelength. The tools then compute the field in each cell and its interaction with the neighboring cells. At the end, a summation of all responses is calculated. For frequency-domain simulations, basically three different methods are used: the finite element method (FEM), the method of moments (MOM) and the finite integration technique (FIT). For time-domain simulations, the most common methods are finite-difference time-domain (FDTD) and transmission line matrix (TLM). 2.5D solvers typically use MOM, 2D solvers use MOM and FEM, and 3D solvers typically use FEM, FIT or FDTD. The advantages and disadvantages of different field solvers are documented in [Swanson-03].

<sup>2</sup> Maxwell<sup>®</sup>, ver. 12, Ansys Inc., Canonsburg, PA, 2008.

<sup>3</sup> Sonnet<sup>®</sup> Suites<sup>TM</sup> ver. 12, Sonnet Software Inc., North Syracuse, NY, 2010.

<sup>4</sup> Agilent Momentum 3D<sup>TM</sup>, Agilent Inc., Santa Clara CA, 2010.

<sup>5</sup> SIwave<sup>TM</sup>, ver. 4. Ansys Inc., Canonsburg, PA. 2009.

<sup>6</sup> HFSS<sup>TM</sup>, ver. 12.1. Ansys Inc., Canonsburg, PA. 2010.

<sup>7</sup> CST Studio Suite<sup>TM</sup>, ver. 2010, CST Computer Simulation Technology AG., Framingham, MA, 2010.

The third aspect to consider when using electromagnetic solvers is the solution domain. In frequency-domain solvers the independent variable is the frequency, and the output responses take the form of scattering, impedance, or admittance parameters. These solvers discretize the structure geometry into cells and then calculate the parameters that build a matrix; the solution is obtained by inverting the matrix. When the transient response is needed, time-domain field solvers are used, where the geometry of the structure is discretized and an impulse is applied to obtain the transient response. If a frequency response is needed from a time-domain response, a fast Fourier transform (FFT) is calculated.
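The conversion from a time-domain response to a frequency response can be illustrated without a field solver. The sketch below assumes a parallel RC network whose voltage response to a unit current impulse is known in closed form, samples it, and applies an FFT scaled by the time step to approximate the impedance spectrum. The circuit, its values and the sampling choices are assumptions made only to show the post-processing step.

```python
import numpy as np

R, C = 0.1, 1e-6                 # hypothetical parallel RC network
tau = R * C
dt = tau / 1000.0                # time step, much shorter than the time constant
t = np.arange(20000) * dt        # record covering 20 time constants

# Voltage response to a unit current impulse (1 A*s): v(t) = exp(-t/tau) / C
v_impulse = np.exp(-t / tau) / C

# The FFT scaled by dt approximates the continuous Fourier transform, i.e. Z(f)
z_fft = np.fft.rfft(v_impulse) * dt
f = np.fft.rfftfreq(t.size, dt)
z_exact = R / (1.0 + 1j * 2.0 * np.pi * f * R * C)

print(f"Z(0) from FFT = {z_fft[0].real:.5f} Ohm (exact {R:.5f} Ohm)")
print(f"frequency resolution = {f[1] / 1e3:.1f} kHz")
```

The record length sets the frequency resolution and the time step sets the highest usable frequency, so both have to be chosen with the frequency range of interest in mind.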

Electromagnetic simulators are also classified as quasi-static or full-wave solvers. Basically, when the solution is required in a narrow frequency spectrum, a relatively simple formulation can be used. In practical terms, when the length of the device is much smaller than the wavelength, $l \ll \lambda$, the problem becomes static or quasi-static. An example of a commercial quasi-static solver is Ansoft Q3D Extractor<sup>8</sup>. When the above condition is not met, a formulation that solves Maxwell's equations without the quasi-static assumption is needed. This formulation is known as full-wave because it considers the three possible components of the electric and magnetic fields at each frequency point or at each instant of time. Full-wave solvers are well suited to problems that involve field coupling and electromagnetic radiation, as well as non-transverse electromagnetic propagation.

In conclusion, a 3D full-wave FEM solver can be a very good tool, with accurate results, for PDN modeling and analysis, but only when the structure is relatively small. If a large PCB is analyzed, a 2.5D MOM solver should be used because it requires fewer computing resources and the solution can be obtained faster. Therefore, the best tool depends on the problem to be analyzed and the accuracy required.

<sup>8</sup> Q3D Extractor<sup>®</sup>, ver. 8.1. Ansys Inc., Canonsburg, PA. 2009.

#### 1.3.4 Lumped Power Delivery Network Model

The lumped PDN method uses a discrete RLCG circuit element model consisting of a few resistors, inductors and capacitors, where each element represents a concentrated parameter of a specific piece of the power delivery structure. For instance, the path from the bulk capacitor location to the decoupling capacitor location on a motherboard could be represented by a resistor in series with an inductor. These two elements concentrate the resistance and inductance of that path. If lumped elements are used, the cell size should be selected according to the size of the reference element; for instance, the cell size could be the pin pitch of a package. The goal is to identify the main blocks of impedance discontinuities and replace all the parasitics in each block by circuit elements. By connecting all those blocks we get the complete PDN circuit model. Using this method, we can get an idea of the performance of the PDN at the very early design steps. The circuit in Fig. 1-5 shows an example of a lumped circuit that represents a PDN. As an example, the lumped model is used in [Ren-04].
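A lumped model of this kind can be evaluated with a few series and parallel reductions. The sketch below assumes a ladder with a VRM output impedance, a bulk capacitor, a board path, a decoupling capacitor, a package path and an on-die capacitance, and computes the impedance seen from the die. The topology is only loosely inspired by the partitioning of Fig. 1-5, and every element value is hypothetical.

```python
import numpy as np

def par(za, zb):
    """Parallel combination of two impedances."""
    return za * zb / (za + zb)

def cap(w, c, esr, esl):
    """Series R-L-C capacitor model."""
    return esr + 1j * w * esl + 1.0 / (1j * w * c)

def pdn_impedance(f):
    """Impedance seen from the die for a hypothetical lumped PDN ladder."""
    w = 2.0 * np.pi * np.asarray(f, dtype=float)
    z = 1e-3 + 1j * w * 10e-9                     # VRM output: R + L
    z = par(z, cap(w, 1e-3, 5e-3, 5e-9))          # bulk capacitor
    z = z + (0.5e-3 + 1j * w * 0.5e-9)            # board path
    z = par(z, cap(w, 10e-6, 5e-3, 1e-9))         # decoupling capacitor
    z = z + (0.2e-3 + 1j * w * 50e-12)            # package path
    z = par(z, cap(w, 100e-9, 10e-3, 5e-12))      # on-die capacitance
    return z

freqs = np.logspace(2, 10, 801)                   # 100 Hz to 10 GHz
z_die = np.abs(pdn_impedance(freqs))
k = np.argmax(z_die)
print(f"|Z| at 100 Hz = {z_die[0] * 1e3:.2f} mOhm")
print(f"highest peak  = {z_die[k] * 1e3:.1f} mOhm at {freqs[k] / 1e6:.2f} MHz")
```

A sweep like this takes a fraction of a second, which is what makes lumped models attractive for early what-if studies before a layout exists.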

#### 1.4. Conclusions

This chapter confirmed that the PDN is a critical design factor nowadays because of the exponential increase of transistor density in silicon devices, higher current consumption and faster speeds. The main problems involved in PDN design are DC drop, power loss, transient voltage droops, ground bounce, EMI, and their corresponding impact on PDN performance. It was also observed that the main contributors to detrimental PDN performance are inductance and resistance, but they can be compensated by adding capacitors at strategic locations across the PDN. Modeling and analysis techniques were reviewed and the importance of this modeling process during the design was mentioned. With well-calibrated modeling and analysis methodologies, the designer can make sound decisions with a good cost-performance trade-off in a timely manner. Based on this, new techniques for modeling and analysis of PDNs are explored and proposed in the following chapters.

# 2. Methodology Based on Electromagnetic Field Solvers and Circuital Models for PDN Analysis

This chapter describes the principles and gives an overview of a well-established methodology, based on electromagnetic field solvers and circuital models, for analyzing power distribution networks (PDNs). Since this technique makes use of EM simulators and equivalent circuits, it is regarded as a hybrid strategy. The methodology covers different power delivery effects, such as voltage fluctuations in power and ground systems, interaction between the signal and power delivery systems, multi-driven simultaneous switching outputs (SSO), electromagnetic coupling between vias, resonances in PCBs and packages, effects of decoupling capacitors, radiation from board edges, and other miscellaneous effects. The methodology is well suited for system-level power delivery analysis, where the system can be an entire PCB, an IC package or the combination of both. With this kind of analysis, power integrity guidelines can be developed before a system is implemented in a layout. Additionally, the performance of a post-layout design can be verified and improved without a physical prototype.

In this chapter, an illustrative case study is presented, where a simple pair of parallel planes with a pair of vias represents the PDN. This example illustrates the main modeling and simulation techniques of the methodology. Additionally, the data obtained using these electromagnetic and circuital simulation techniques are compared with reference measurements.

## 2.1. Power Distribution Network Analysis Using Field Solvers and Circuital Models

A typical PDN analysis methodology is composed of three main processes: power plane structure modeling, SPICE model extraction, and transient/frequency simulation. Modeling the power plane structure consists of extracting the electrical information of a physical structure, such as package or mother board power planes, in the form of network parameters. The network parameters can be scattering (S), impedance (Z) or admittance (Y) parameters. Sometimes the network parameter information is enough at this stage for the designer to make important decisions. On the other hand, when more specific data is needed, such as a transient response, this electrical information is used to construct a circuit model in SPICE format for detailed circuit simulations. The extracted circuit model can be simulated to obtain DC, transient or frequency-domain data. Fig. 2-1 shows a typical flow diagram of a PDN analysis using field solvers.

### 2.1.1 EM-based Data Generation Process

Nowadays many accurate and reliable field solvers are available in the market for modeling power plane structures. As reviewed in [Mercado-Casillas-10], some of these tools are HFSS<sup>9</sup>, Sonnet, CST, PowerSI, and Q3D. This section describes a method for modeling PDN structures using the commercial solver PowerSI<sup>10</sup> from Sigrity<sup>11</sup>.

Fig. 2-1 Flow diagram of a power distribution network analysis using field solvers to construct an equivalent circuit model for SPICE simulation.

<sup>9</sup> HFSS<sup>TM</sup>, ver. 12.1. Ansys Inc., Canonsburg, PA. 2010.

PowerSI provides full-wave simulation results suited for high-speed power delivery analysis. The method is applied for modeling packages and printed circuit boards. This solver describes the electromagnetic field phenomena through simulations in the frequency domain. The results are given as network parameters, namely scattering (S), impedance (Z) and admittance (Y) matrices for an *N*-port network. The extracted matrix information is written in the Touchstone format and can be used for subsequent analysis of larger scale systems.

This solver uses full-wave dynamic electromagnetic analysis that takes into account interactions in physical structures. PowerSI uses a hybrid solver strategy, which means that the solver automatically selects the appropriate solution method based on the complexity of the problem. The tool decomposes the 3D structure into several components, and each component is simulated with specialized fast algorithms. For instance, the tool applies a full-wave solution to sections of the structure that are comparable to a wavelength, adds transmission line effects for sections that exhibit such behavior, and uses a lumped circuit for parasitics that require neither a full-wave description nor a transmission line model.

The solvers used by this tool are an EM solver, a circuit solver and a transmission line solver. The three solvers are restricted to the frequency domain. The EM field solver takes into account coupling between vias, reflection from edges, resonances, power and ground voltage fluctuations, and metal/dielectric losses. These effects are described by solving Maxwell's equations. A circuit solver is linked directly with the field solver. The circuit solver takes circuit data files in SPICE format. It evaluates linear and non-linear circuit components, including those of IBIS models and HSPICE transistor models. For this circuit solver, the tool uses the modified nodal analysis (MNA) simulation method with a SPICE engine. The transmission line solver includes skin-effect loss, dielectric loss, frequency-dependent dielectrics, and coupling between lines. These three solvers run simultaneously in order to find the solution considering the interaction between field propagation in the planes and circuit switching. Fig. 2-2 shows the process flow to perform a PDN model extraction using PowerSI from Sigrity. For more details, see the PowerSI user's guide<sup>12</sup>.

<sup>10</sup> PowerSI<sup>™</sup>, ver. 9.0.2.06151, Sigrity Inc., Campbell, CA, 2008.

<sup>11</sup> Sigrity Incorporated. May 04, 2010, <u>http://www.sigrity.com</u>.

### 2.1.2 SPICE Model Extraction Process

Transient analysis is required at some point during the PDN design. A macro model is created from the frequency-domain network parameters that resulted from the structure modeling. This macro model consists of a simplified circuit netlist that can be simulated to obtain transient data. In this section, the Broadband SPICE<sup>13</sup> (BBS) tool from Sigrity is described.

Broadband SPICE can generate circuit models based on active or passive elements that are compatible with commercial SPICE engines. This tool converts network parameters to SPICE equivalent circuits by using curve fitting. The equivalent circuit is valid up to a user-specified maximum frequency for passive *N*-port S, Z, or Y parameters in Touchstone format. The network parameters can be obtained from measurements or from electromagnetic field solvers. A Touchstone file is an ASCII text file in which data appears line by line. The contents of the file can be categorized into three types: comments, specification lines, and data lines. The synthesized SPICE circuits can then be used for DC, AC and transient analysis using HSPICE<sup>14</sup>, PSPICE or any other SPICE-compatible circuit simulator.
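The three kinds of lines in a Touchstone file can be illustrated with a minimal reader. The sketch below handles only one-port, version 1 files, where comments start with `!`, the specification (option) line starts with `#`, and each data line holds a frequency and one value pair. The function name and the sample data are hypothetical, and the code is not related to how Broadband SPICE reads its input.

```python
import cmath
import math

UNIT_SCALE = {"HZ": 1.0, "KHZ": 1e3, "MHZ": 1e6, "GHZ": 1e9}


def parse_touchstone_1port(text):
    """Parse the text of a minimal one-port Touchstone (v1) file.

    Returns (options, points), where points is a list of (freq_hz, complex_value).
    """
    # Defaults defined by the format when the option line omits a field
    opts = {"unit": "GHZ", "param": "S", "fmt": "MA", "z0": 50.0}
    points = []
    for raw in text.splitlines():
        line = raw.split("!", 1)[0].strip()      # strip comments
        if not line:
            continue
        if line.startswith("#"):                  # specification line
            tokens = line[1:].upper().split()
            i = 0
            while i < len(tokens):
                tok = tokens[i]
                if tok in UNIT_SCALE:
                    opts["unit"] = tok
                elif tok in ("S", "Y", "Z", "G", "H"):
                    opts["param"] = tok
                elif tok in ("RI", "MA", "DB"):
                    opts["fmt"] = tok
                elif tok == "R" and i + 1 < len(tokens):
                    opts["z0"] = float(tokens[i + 1])
                    i += 1
                i += 1
            continue
        f, a, b = (float(x) for x in line.split()[:3])   # data line
        if opts["fmt"] == "RI":
            value = complex(a, b)
        elif opts["fmt"] == "MA":
            value = cmath.rect(a, math.radians(b))
        else:  # "DB": magnitude in dB, angle in degrees
            value = cmath.rect(10 ** (a / 20.0), math.radians(b))
        points.append((f * UNIT_SCALE[opts["unit"]], value))
    return opts, points


SAMPLE = """! hypothetical one-port data, not taken from the thesis
# MHz S RI R 50
100  0.5  -0.5
200  0.0   1.0
"""

if __name__ == "__main__":
    options, data = parse_touchstone_1port(SAMPLE)
    print(options)
    print(data)
```

Multi-port files spread each frequency point over several values and lines, so a production reader needs considerably more logic than this sketch.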

During the translation of network parameters to circuit elements, the order and corresponding topology of these circuits are determined automatically according to the complexity of the network parameter curves. The tool has built-in passivity enforcement algorithms for creating the SPICE circuit models. A passive circuit is guaranteed to be stable, whereas a non-passive circuit is often the cause of computational divergence or nonphysical oscillations in transient simulations.

<sup>12</sup> PowerSI<sup>TM</sup> User's Guide, ver. 9.0, Sigrity Inc., Campbell, CA, 2008.

<sup>13</sup> Broadband SPICE<sup>™</sup>, ver. 9.0.2.06111, Sigrity Inc., Campbell, CA, 2008.

<sup>14</sup> HSPICE<sup>®</sup>, ver. B-2008.09-SP1, Synopsys Inc., Mountain View, CA, 2008.



Fig. 2-2 Flow diagram of a power plane structure modeling methodology using PowerSI from Sigrity.

Broadband SPICE has two output netlist formats for the equivalent circuit model: an HSPICE-compatible format and a generic SPICE-compatible format. When using the HSPICE-compatible format, the tool uses a Laplace rational function, which relies on recursive convolution, to handle the frequency response data. This is an efficient simulation scheme and hence can be fast and accurate. The generic SPICE-compatible format is generated by the tool using a direct convolution scheme to handle the frequency response data. Fig. 2-3 explains the process flow for a circuit model extraction using Broadband SPICE from Sigrity.

### 2.1.3 Transient/Frequency Simulation Process

Once the circuit model is available as an *N*-port black box, the main deck file has to be constructed. The main deck is a SPICE-format file where all the models are integrated into the PDN so that it can be simulated for further analysis. Two different main deck files are needed, one for each type of analysis of the PDN: one for transient analysis and the other for AC analysis. In this main file all elements of the PDN are integrated, such as the mother board, voltage regulator module (VRM), capacitors, sockets, connectors, packages, die circuits, stimulus circuits representing the silicon die, etc. In some cases, another *N*-port black box is integrated to represent other ingredients of the PDN, such as the package of a microprocessor or chipset. Fig. 2-4 shows a representation of a SPICE deck for transient/DC/AC simulation using the circuit model (black box) that resulted from the SPICE model extraction.

Fig. 2-3 SPICE model extraction using Broadband SPICE from Sigrity.

Fig. 2-4 SPICE deck for transient/DC/AC simulation using the circuit model (black box) that resulted from the SPICE model extraction.

In a transient simulation deck, the stimulus of the network is a time waveform that represents the silicon die loading condition. These loading conditions can be generated using current sources representing the current demanded by the die. In some cases, a step current is used to represent the transition between silicon states, e.g., from system standby state to full working state or power virus. In this case, the transient analysis gives the voltage droops generated on the PDN due to the current transition (di/dt) applied at the silicon nodes. This analysis helps the PDN designer to make sure that all voltages at the silicon nodes (C4 bumps) meet the specifications. Fig. 2-5 shows a typical transient deck simulation output.
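The origin of a voltage droop under a current step can be reproduced with a very small lumped model. The sketch below assumes an ideal supply feeding a series R-L path and a single capacitor at the die node, with the load stepping from 0 A to 10 A. The topology, the element values and the integration scheme are assumptions for illustration and do not correspond to the deck of Fig. 2-4.

```python
def simulate_droop(vs=1.0, r=2e-3, l=1e-9, c=100e-6,
                   i_step=10.0, t_step=1e-6, t_end=20e-6, dt=1e-9):
    """Integrate L di/dt = vs - r*i - v and C dv/dt = i - i_load(t).

    A semi-implicit Euler scheme is used: the inductor current is updated
    first and the new value is used to update the capacitor voltage.
    """
    n = int(round(t_end / dt))
    i_l, v = 0.0, vs                 # steady state before the load step
    times, volts = [], []
    for k in range(n + 1):
        t = k * dt
        times.append(t)
        volts.append(v)
        i_load = i_step if t >= t_step else 0.0
        i_l += dt * (vs - r * i_l - v) / l
        v += dt * (i_l - i_load) / c
    return times, volts


if __name__ == "__main__":
    t, v = simulate_droop()
    v_min = min(v)
    print(f"minimum die voltage: {v_min:.4f} V at t = {t[v.index(v_min)] * 1e6:.2f} us")
    print(f"settled die voltage: {v[-1]:.4f} V")
```

In this model the voltage settles at the supply value minus the resistive drop, while the transient minimum is lower because the path inductance delays the current supplied to the capacitor.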

Fig. 2-5 Output plot of a transient deck simulation: voltage droops through the PDN.

For the AC simulation deck, there are two options. One option is using the .lin command in HSPICE, where the main deck is handled as an *N*-port network and Z parameters are extracted. The other option is using a 1 A AC current source connected at the die connection of the main deck. By sweeping the frequency of the 1 A source applied to the network, we can measure the voltage at the current source terminal, which is numerically equal to the impedance at that point. Fig. 2-6 shows the impedance profile of the PDN that resulted from the simulation of a frequency deck.
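The second option can be mimicked outside a circuit simulator by solving the nodal equations of a small network with a 1 A injection, so that the node voltage in volts equals the impedance in ohms. The two-node network and its element values below are hypothetical and serve only to show the principle.

```python
import numpy as np


def port_voltage_for_1a(freq_hz):
    """Solve Y @ V = I with a 1 A source at node 0 (the 'die' node).

    Hypothetical network: C0 from node 0 to ground, a series R-L between
    nodes 0 and 1, and C1 from node 1 to ground.
    """
    w = 2 * np.pi * freq_hz
    y_c0 = 1j * w * 100e-9
    y_c1 = 1j * w * 10e-6
    y_rl = 1.0 / (2e-3 + 1j * w * 1e-9)
    y = np.array([[y_c0 + y_rl, -y_rl],
                  [-y_rl, y_c1 + y_rl]])
    i = np.array([1.0, 0.0])        # 1 A injected at node 0
    v = np.linalg.solve(y, i)
    return v[0]                     # volts, numerically equal to Z in ohms


if __name__ == "__main__":
    for f in np.logspace(5, 9, 5):
        print(f"{f:12.0f} Hz  |Z| = {abs(port_voltage_for_1a(f)):.4e} ohm")
```

Because the source amplitude is exactly 1 A, no division is needed after the solve, which is the same convenience exploited in the SPICE deck.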

## 2.2. Illustrative Case Study for a PDN Analysis Using Field Solver and Circuital Models

In this section, an illustrative case study is presented in order to exemplify the methodology reviewed in the previous section and to provide the reference for testing the concept of the PDN analysis method to be proposed. In addition, accuracy is verified by comparing results against measurement data documented in a reference book [Novak-07]. For this example, we extract the self-impedance of a bare rectangular pair of parallel plates through a frequency-domain analysis using the methodology based on the Sigrity field solver. This same pair of parallel plates was analyzed in [Novak-07], where the power shapes were analyzed with a spreadsheet model based on analytical expressions and then compared to lab measurements. The results obtained in the reference are compared to those obtained using the field solver method explained in the previous section. The analysis is limited to the frequency domain, so a simple AC main deck is built.

Fig. 2-6 Output plot of an AC deck simulation: impedance profile of a PDN.

The PowerSI tool is used for extracting the parametric behavior of the parallel plates, including the electromagnetic field effects. Broadband SPICE is used for the SPICE macro model generation, and HSPICE is used for the AC simulation.

### 2.2.1 Field Solver Simulation and Data Generation

Fig. 2-7 Simple PDN consisting of two parallel planes with a pair of vias, to illustrate the electromagnetic field solver methodology.

For this analysis, the PDN comprises a simple set of two parallel planes and two vias with the following characteristics: the conductor material is copper, with conductivity $\sigma = 5.8 \times 10^{7}$ S/m; the plane width is x = 5.21 cm (2.05 inch); the plane length is y = 2.86 cm (1.13 inch); the copper plane thickness is t = 29.21 µm (1.15 mils); the dielectric thickness is h = 63.12 µm (2.49 mils); the dielectric constant or relative permittivity is $\varepsilon_{\rm r} = 3.9$; and the loss tangent is $\tan \delta = 0.021$. The structure is shown in Fig. 2-7.
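Before running a field solver, the dimensions listed above allow a quick order-of-magnitude estimate of the plane pair behavior. The sketch below uses two standard textbook approximations that are not part of the PowerSI flow: the static parallel-plate capacitance (fringing neglected) and the lossless rectangular-cavity resonances with open edges. The function names are hypothetical.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
C0 = 299_792_458.0        # speed of light in vacuum, m/s

# Dimensions and material data quoted in the text
X = 5.21e-2      # plane width, m
Y = 2.86e-2      # plane length, m
H = 63.12e-6     # dielectric thickness, m
EPS_R = 3.9      # relative permittivity


def plate_capacitance(x=X, y=Y, h=H, eps_r=EPS_R):
    """Static parallel-plate capacitance, fringing fields neglected."""
    return EPS0 * eps_r * x * y / h


def cavity_resonance(m, n, x=X, y=Y, eps_r=EPS_R):
    """Resonance (m, n) of a lossless rectangular plane pair with open edges."""
    return C0 / (2 * math.sqrt(eps_r)) * math.hypot(m / x, n / y)


if __name__ == "__main__":
    print(f"C    = {plate_capacitance() * 1e12:.0f} pF")
    print(f"f_10 = {cavity_resonance(1, 0) / 1e9:.2f} GHz")
    print(f"f_01 = {cavity_resonance(0, 1) / 1e9:.2f} GHz")
```

Under these approximations the capacitance is about 815 pF and the first cavity resonance falls near 1.46 GHz, so several plane resonances are expected inside a sweep that extends to 10 GHz. These figures ignore the vias and the losses, so they only indicate where to look in the simulated response.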

Each via is a cylinder with a diameter of 0.4 mm, with a pad 0.5 mm in diameter and an antipad 0.75 mm in diameter. The via cross-section and the pad-stack are presented in Fig. 2-9. In this figure, the square shape indicates the top surface of the via and the cross shape indicates the bottom surface of the via. In addition, the positive power rail is in red (left) and the ground rail is in green (right).

The self-impedance of the pair of planes is measured at the port location, which is fixed between the two vias at the level of the top layer or top metal plane (see Fig. 2-9a).

As mentioned before, this test structure is analyzed using the PowerSI tool, which is an electromagnetic field solver. As described in Fig. 2-2, the first step is to translate the PCB post-layout database into the .spd format so that PowerSI can handle the structure. For this example, the structure is simple and can be generated using the PowerSI editor. However, if a structure is already in a post-layout format such as BRD, MCM, DSN, NDD, ASC, etc., then the corresponding file translator tool (brd2spd, dsn2spd, ndd2spd, pad2spd, etc.) should be used. For instance, if the structure was implemented using Allegro PCB Editor<sup>15</sup> from Cadence, the output file format will be .brd and the Sigrity Brd2Spd Translator<sup>16</sup> is used.

As a second step, the planes and vias are edited using PowerSI. The plane shapes and the stack-up are shown in Fig. 2-8. The vias were placed at the center of the plane in the x direction and distributed along the y axis, with a separation of 1 mm from the center of the y axis to each via, as shown in Fig. 2-9.

After this editing process, the port assignment is performed. In this test board only one port is defined because we want to extract the self-impedance of the structure at the center of the planes. As mentioned before, the port was defined between the two via pads on the top layer.

Next, the sweep frequency range is defined. For this analysis, the starting frequency is 100 MHz and the ending frequency is 10 GHz. Finally, the simulation is launched.

<sup>15</sup> Allegro® PCB Editor, ver. 16.2, Cadence Design Systems Inc., San Jose, CA 95134, 2008.

<sup>16</sup> Sigrity – BRD2SPD Translator®, ver. 8, Sigrity Inc., Campbell, CA, 2008.

Fig. 2-8 Pair of planes implemented in PowerSI: a) First layer (Vcc plane), b) stack up.

The resulting data is saved in a Sigrity format file (.bnp), but it can also be saved in Touchstone format or in comma-separated value (.csv) format. In this analysis both .csv and .bnp file formats are used: .csv is used to export data to Matlab and .bnp is used to generate the macro model using Broadband SPICE. The .csv file contains the frequency, the real part and the imaginary part of the impedance. The Touchstone file can be generated such that it contains rectangular format data (RI; real, imaginary), decibel format data (DB; magnitude in dB, angle in degrees) or polar format data (MA; magnitude, angle in degrees). The resulting data contains the *S*, *Z* and *Y* parameters.
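The three Touchstone number formats carry the same information, so a value read from the exported data can be converted freely among them. The sketch below assumes a .csv row laid out as `frequency,real,imaginary` without a header; that layout and the helper names are assumptions, since the exact export format is not detailed here.

```python
import cmath
import math


def from_csv_row(row):
    """Read 'frequency,real,imaginary' (assumed layout) into (freq_hz, complex)."""
    f, re, im = (float(x) for x in row.split(","))
    return f, complex(re, im)


def to_ri(z):
    """Rectangular format: (real, imaginary)."""
    return (z.real, z.imag)


def to_ma(z):
    """Polar format: (magnitude, angle in degrees)."""
    mag, ang = cmath.polar(z)
    return (mag, math.degrees(ang))


def to_db(z):
    """Decibel format: (20*log10 of magnitude, angle in degrees)."""
    mag, ang = cmath.polar(z)
    return (20.0 * math.log10(mag), math.degrees(ang))


if __name__ == "__main__":
    freq, z11 = from_csv_row("1e8,0.0,-2.0")   # hypothetical data point
    print(freq, to_ri(z11), to_ma(z11), to_db(z11))
```

The decibel conversion is undefined for a value of exactly zero, so a practical script should guard against that case.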



[Fig. 2-9 screenshots: a) via locations; b) PowerSI "Via Editing" dialog listing layers Plane02, Medium02 and Plane01, with pad stack padstack_1; c) "PadStack Library" dialog for Padstack_1 (units: mm): circular pad 0.5 and antipad 0.75 on Plane02 and Plane01, outer radius 0.2, solid via, conductivity 5.8e+007.]

Fig. 2-9 Implementing vias in PowerSI: a) vias location, b) via cross-section, c) via pad-stack.

The $Z_{11}$ parameter from this analysis represents the self-impedance of the parallel plate structure. The output of this analysis is compared to the data presented in [Novak-07], where the self-impedance was obtained through an analytical expression method and lab measurements. This comparison is shown in Fig. 2-10. A strong correlation with the measurement data is observed, which confirms the high accuracy of the modeling method. However, the correlation in terms of amplitude with the data obtained using the analytical expression method employed in [Novak-07] is weaker, since the expressions used in that analytical method for the plane impedance do not include losses (in addition to other simplifications); hence sharper resonances are observed. Obtaining the impedance profile of Fig. 2-10a using PowerSI takes about 3.8 seconds using a computer with a 2 Duo CPU @ 2.4 GHz processor and 3 GB of RAM.



Fig. 2-10 Self-impedance of the parallel planes: a) Z<sub>11</sub> using EM-based simulation on PowerSI, b) Z<sub>11</sub> using a spreadsheet that evaluates an analytical expression for the plane impedance (lossless) and using lab measurements. Right charts are an enlarged view of the left charts. Bottom graphics were taken from [Novak-07].

### 2.2.2 Translation of Frequency-Domain Data

As reviewed in the previous section, the frequency response of the PDN can be obtained by directly extracting S, Z or Y parameters using PowerSI. However, when the PDN requires time-domain analysis, the parametric data is translated into a macro model using Broadband SPICE through curve-fitting methods. In this illustrative example only the self-impedance is being evaluated, hence extracting the Z parameters using PowerSI would be sufficient. Nevertheless, since we want to evaluate the validity of the method and exemplify it, the macro-model process is also carried out. At the end, we evaluate the macro-model by constructing an AC SPICE deck and comparing its response against the PowerSI output data already extracted.

As a starting point, the .bnp output file of the parallel planes structure generated by PowerSI is loaded into Broadband SPICE. In this program we can visualize the frequency-domain curves of the amplitude and phase of the S, Z or Y parameters per port, as shown in Fig. 2-11.

The extraction mode is selected from the options menu. There are two choices, passivity mode and precision mode; in this case, we enabled the passivity mode. Passive means that the element only consumes energy. In contrast, active circuits generate or provide energy to the rest of the elements in the circuit. Circuit theory states that interconnections of passive circuits are guaranteed to be stable. Stable but non-passive equivalent circuits are often the cause of computational divergence in transient simulations. In precision mode, Broadband SPICE extracts a highly accurate equivalent circuit model, but does not enforce the passivity of the extracted model. In passivity mode, the tool uses algorithms that enforce the SPICE circuit macro-model to be completely passive.

The precision mode extracts circuit models that might lead to non-convergence or to erroneous oscillations at a later simulation stage, while the passivity mode extracts models that avoid convergence issues. Fig. 2-12 illustrates the result of a transient simulation of a SPICE deck using passivity mode and precision mode.
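A simple way to screen sampled S-parameter data for passivity violations is to verify that the largest singular value of the S matrix does not exceed 1 at every frequency point (for a one-port, this reduces to $|S_{11}| \le 1$). The sketch below implements only this sampled check; it is not the enforcement algorithm used by Broadband SPICE, and passing it does not guarantee passivity between or beyond the sampled frequencies.

```python
import numpy as np


def is_passive_s(s_matrices, tol=1e-9):
    """Sampled passivity test for S-parameters.

    s_matrices has shape (n_freq, n_ports, n_ports). Returns a flag and the
    largest singular value of S at each frequency sample.
    """
    s = np.asarray(s_matrices, dtype=complex)
    sigma_max = np.linalg.svd(s, compute_uv=False)[:, 0]
    return bool(np.all(sigma_max <= 1.0 + tol)), sigma_max


if __name__ == "__main__":
    # Hypothetical one-port samples: the last one reflects more power than it receives
    s11 = np.array([0.99, 0.8 + 0.5j, 1.05]).reshape(-1, 1, 1)
    passive, sigma = is_passive_s(s11)
    print(passive, sigma)
```

In this hypothetical data set the third sample has a magnitude above 1, so the check flags the model as non-passive.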





Fig. 2-11 BBS windows: a) parallel plane  $Z_{11}$  parameter (amplitude and phase), b) input S-parameter and macro model options.



Fig. 2-12 Result of a transient simulation of a SPICE deck using precision mode and passivity mode.

Broadband SPICE can generate two types of equivalent circuits: HSPICE-compatible and general SPICE-compatible. For the parallel planes example, the HSPICE-compatible equivalent circuit was selected. In this case, the generated equivalent circuit uses specific statements supported by HSPICE and contains fewer circuit nodes and components. Hence, the circuit simulation runs faster than with the general SPICE-compatible option.

Next, the upper frequency limit is set to 10 GHz, which is the maximum frequency to be analyzed. Then, the "highlight errors greater than" option is enabled. This error limit is the average magnitude difference between the original S parameters and the S parameters of the extracted equivalent circuit response. For this analysis, the error limit was set to 0.02 and the actual error was 0.00022.
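The error figure described above can be approximated outside the tool. The sketch below takes one plausible reading of "average magnitude difference", namely the mean of $|S_{\text{orig}} - S_{\text{fit}}|$ over all samples; the exact definition used by Broadband SPICE is not given here, so this is an assumption, as are the function names.

```python
import numpy as np


def average_magnitude_error(s_ref, s_fit):
    """Mean of |S_ref - S_fit| over all frequency samples and matrix entries."""
    s_ref = np.asarray(s_ref, dtype=complex)
    s_fit = np.asarray(s_fit, dtype=complex)
    return float(np.mean(np.abs(s_ref - s_fit)))


def exceeds_limit(s_ref, s_fit, limit=0.02):
    """True when the average error is larger than the chosen limit."""
    return average_magnitude_error(s_ref, s_fit) > limit


if __name__ == "__main__":
    # Hypothetical S11 samples from a solver and from a fitted model
    s_solver = np.array([0.999, 0.95 - 0.20j, 0.60 - 0.70j])
    s_model = np.array([0.998, 0.95 - 0.21j, 0.61 - 0.70j])
    print(average_magnitude_error(s_solver, s_model), exceeds_limit(s_solver, s_model))
```

An alternative reading, the difference between the magnitudes $|S_{\text{orig}}| - |S_{\text{fit}}|$, would ignore phase errors, so the two definitions can give different numbers for the same data.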

Fig. 2-13 Comparison between the S-parameters that resulted from the EM simulation in PowerSI and the S-parameters obtained from the macro-model using Broadband SPICE.

Another important aspect for the accuracy of the equivalent circuit is the S-parameter value at DC. If this value is not provided, the software automatically extrapolates the DC values for each entry in the parameter matrix. This was the case in the parallel planes example. The parallel planes structure was modeled from a starting frequency of 100 MHz and the extrapolated DC value was 0.999158. Because the parallel plate structure is an open circuit at DC, the ideal reflection coefficient would be 1 ($S_{11} = 1$). This DC value can be edited manually.
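For a one-port, the reflection coefficient and the input impedance are related by $S_{11} = (Z - Z_0)/(Z + Z_0)$, so an $S_{11}$ slightly below 1 corresponds to a large but finite resistance rather than an ideal open circuit. The sketch below evaluates this relation for the extrapolated value, assuming a reference impedance of 50 Ω, which is not stated in the text.

```python
def s11_to_z(s11, z0=50.0):
    """Input impedance of a one-port from its reflection coefficient."""
    return z0 * (1 + s11) / (1 - s11)


def z_to_s11(z, z0=50.0):
    """Reflection coefficient of a one-port from its input impedance."""
    return (z - z0) / (z + z0)


if __name__ == "__main__":
    s_dc = 0.999158                     # extrapolated DC value quoted in the text
    z_dc = s11_to_z(s_dc)               # assumes z0 = 50 ohm
    print(f"equivalent DC resistance: {z_dc / 1e3:.1f} kohm")
```

Under the 50 Ω assumption, the extrapolated value is equivalent to a DC resistance on the order of 100 kΩ between the planes, whereas setting $S_{11} = 1$ manually would represent the ideal open circuit.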

Fig. 2-13 shows a comparison between the S parameters that resulted from the EM simulation in PowerSI and those obtained from the macro-model using Broadband SPICE. In this comparison, $S_{11}$ from BBS shows sufficient correlation with the $S_{11}$ calculated using PowerSI across all frequencies. However, most of the differences appear at lower frequencies.

### 2.2.3 Frequency-Domain Analysis Using SPICE Circuit Model

Fig. 2-14 Diagram representing the AC SPICE deck for the frequency domain analysis of the parallel\_plane\_52p1x28p6 structure.

At this point the SPICE circuit model has been generated and can be used for transient or frequency analysis. Frequency-domain analysis is performed in this example in order to review the methodology. For this purpose, an AC simulation deck is constructed in order to extract the self-impedance of the parallel plate structure. This deck consists of an AC current source of 1 A amplitude connected to port 1 of the 52.1 × 28.6 mm parallel plate circuit model. Fig. 2-14 represents the connection of these two elements. By injecting a 1 A AC current, we can measure the impedance indirectly by measuring the voltage at the port 1 terminals. This AC main deck was simulated; the results are shown in Fig. 2-15, where a very good match is observed between the PowerSI response (EM simulation) and the BBS model (macro-model). Nevertheless, a small error is observed at lower frequencies (100 MHz) and at the parallel resonance peaks. This error is about 0.017 Ω at 100 MHz, which represents 0.9% of the total impedance at that frequency. In practice, it has been observed that the error produced by the EM solver at very low frequencies (around 10 Hz) is very significant with respect to measurements. Since the equivalent circuit model follows the EM response, this model also yields low accuracy in the very low frequency range.
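As a consistency check on the figures quoted above, the 0.9% relative error implies an impedance magnitude at 100 MHz of approximately

$$
|Z_{11}| \approx \frac{0.017\ \Omega}{0.009} \approx 1.9\ \Omega .
$$

This agrees reasonably well with a purely capacitive estimate based on the plane dimensions. The estimate below is a textbook approximation that neglects fringing fields and the via inductance, not a result of the simulations:

$$
C \approx \frac{\varepsilon_0 \varepsilon_{\rm r}\, x\, y}{h} \approx 815\ \text{pF}, \qquad
\frac{1}{2\pi f C} \approx \frac{1}{2\pi \,(100\ \text{MHz})(815\ \text{pF})} \approx 1.95\ \Omega .
$$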

## 2.3. Conclusions

In this chapter, an industrial methodology based on an electromagnetic field solver and equivalent circuits for modeling and analysis of power distribution networks (PDNs) was reviewed and analyzed. The three main processes were studied: the EM-based data generation process, the SPICE model extraction process, and the transient/frequency simulation process. After this, an illustrative example using the EM field solver techniques was reviewed. In this example, the EM field solver results were compared with measurement data and strong correlation was observed in the frequency domain. It was also noticed that the extracted SPICE model has a sufficient correlation with the original structure response, although there are some almost negligible errors at lower frequencies. It was reviewed that instability in transient simulation can affect the results if the model does not satisfy passivity conditions. Some tool options were reviewed to guarantee the passivity of the circuit model.

Fig. 2-15 Comparison between the PowerSI $Z_{11}$ parameter and that of the SPICE circuit model generated by BBS.

# 3. Accurate and Computationally Efficient Power Delivery Network Lumped Models Obtained from Parameter Extraction

During the design process of a computing system, there are key aspects that need to be accounted for in order to come up with a functional product that is attractive to the market. Those aspects can be measured and quantified in order to demonstrate the system capabilities and advantages over competing products. One of these aspects is the set of performance metrics of the system, which essentially focus on the ability of the system to complete a large number of transactions per unit of time. Performance parameters are very important in a computer system because they define the selling price of the product and thus affect the manufacturer's revenue.

There are several performance metrics well defined in the computer industry. For example, in the central processing unit (CPU) context, some of the most popular performance metrics are per-core performance (PCP), millions of instructions per second (MIPS), millions of operations per second (MOPS), floating-point operations per second (FLOPS), and cycles per instruction (CPI). In the memory context, a typical performance metric is mega-transfers per second (MT/s). In the input/output (I/O) interface context, a typical performance metric is speed in gigabits per second (Gb/s). In the graphics context, tera-operations per second (TOPS) is a frequently used performance metric. All of these are just examples of distinct metrics that industry and users are interested in when they decide to produce or buy a product. At the same time, all of these performance metrics also depend on other factors. One of these factors is the noise associated with the digital signal, which produces a time deviation and an amplitude deviation from an ideal digital waveform. The amplitude deviation is defined as the amplitude noise and the time deviation is defined as the timing jitter. This noise can limit the frequency of the signal if it exceeds certain amplitude or timing thresholds for detecting a logic 0 or a logic 1 during the acquisition of a digital signal. A broader description of amplitude noise and timing jitter is given in [Li-07]. Amplitude noise and timing jitter can be severely degraded if the voltage supply of the transceiver is not clean enough. Supply voltage fluctuations can also affect the speed paths in a CPU, which are basically the digital signal propagation paths through millions of flip-flops inside a core. Therefore, a robust voltage supply and power delivery network (PDN) are critical and should be part of the system design process, as described in [Mercado-Casillas-10].

Several works include optimization techniques in the design process of a PDN. The authors in [Chen-07] describe an optimization application for finding the right decoupling capacitors (caps) in a package, using a simulated annealing algorithm to minimize the total cost of decoupling capacitors subject to noise-driven constraints. Similarly, in [Pan-13] the authors use an integer linear programming (ILP) optimization algorithm for finding the right location of decoupling caps, now also considering the proper placement and patterns of power and ground vias. Another optimization approach, based on analytical methods implemented in a spreadsheet, is described in [Intel Altera-05], in which guidance is obtained to improve the efficiency of the PDN by wisely selecting the proper decoupling capacitance. All of the optimization techniques mentioned above are intended to improve the decoupling caps in a PDN; however, there are other optimization techniques in which more design variables are included aside from the decoupling caps. In [Xu-14], the main design characteristics of a DC-DC converter and the different levels of decoupling caps are treated as design variables. The corresponding objective functions or output responses monitored during the optimization flow include voltage supply noise, leakage savings, and area overhead. There are other power delivery areas in which optimization techniques are applied. For instance, in [Jung-10] optimization is used to minimize the total energy dissipation in a system that uses a power management technique called dynamic voltage scaling (DVS). The authors of [Jung-10] apply a mathematical programming model to solve the energy optimization problem by finding the right set of power states in functional blocks, so that the average energy dissipation is minimized while all quality constraints are met.

All of the works cited above are intended to improve and optimize some specific components of a PDN. In contrast, in this chapter, optimization is used to reduce the complexity of the PDN models while conserving the accuracy of the more complex and detailed model, such that the signal integrity (SI) and PDN co-design process becomes more efficient. A low-cost optimization method based on a parameter extraction (PE) technique is used. As a result, the design cycle time is reduced at a low computational cost. The PDN quality of a double data rate (DDR) memory sub-system is characterized using the proposed PE methodology, in order to evaluate how it can affect the SI and then determine whether the PDN design is adequate. Since this SI design verification process implies multiple PDN evaluations, it is imperative to simplify the models. It is in this part of the design process that the PE optimization methodology plays a critical role, by converting a distributed PDN model into a lumped PDN model while keeping a similar electrical response. The SI evaluation time is therefore reduced while conserving sufficient accuracy to make design decisions.

This verification of the PDN impact on SI can be computationally expensive, and sometimes prohibitive, if the designer decides to evaluate SI using large and complex PDN models, such as distributed PDN models, which can take from seconds to minutes or even hours in each iteration. In contrast, the evaluation of the lumped PDN model can take a few milliseconds.

Ultimately, the aim of this PE optimization method is to make the whole design process more efficient, and consequently, to shorten the design process cycle time and the computational cost. Relying on these techniques, the designer can meet the system performance targets using accurate models within a short time, becoming more competitive. This chapter corresponds to an extended version of the work in [Mercado-Casillas-19b].

# 3.1. Evaluating the Influence of PDN on Signal Integrity

In this section, a typical industrial process to evaluate the impact of PDN on signal integrity is described, with emphasis on DDR memory. In a DDR memory sub-system, the main power supply is known as VDD. VDD is the primary voltage supply that goes to the memory device power pins and memory controller (MC) pins. A section of the VDD feeds the digital blocks of memory devices and MC. Another section of VDD, known as VDDQ, feeds the output stage of input/output buffers as well as phase locked loops (PLL) and strobes.

The goal of sizing the effects of the PDN on the SI is to include and assess the performance impact of the noise introduced by the VDD power delivery rail on the signal integrity at the dynamic random access memory (DRAM) level. The PDN is incorporated into a DDR3 SI simulation deck in which the PDN is analyzed multiple times; for this reason, the PDN simulation must be fast enough. This SI simulation deck is normally used in a units per million (UPM) analysis [Zhang-15]. The UPM number is obtained through a statistical analysis and represents the number of units out of a million parts that will exceed the bit error rate (BER) targets.

As mentioned before, the scope of the present report is to describe an optimization method based on PE techniques to reduce the design time of the SI-PI networks by testing an optimized lumped PDN model over a SI simulation deck. The total design time benefit in a UPM analysis is inferred based on the SI-PI simulation time reduction.

The first step is to extract the distributed PDN model. As described in [Mercado-Casillas-10], a PDN is composed of all interconnects from the voltage regulator (VR) to the pads on the chips and the metallization on the die that locally distribute power and return current. Interconnects and metallization are extracted using electromagnetic field solvers to obtain the distributed PDN model, which is composed of thousands of interconnected nodes in a black box known as a macro-model. After building the distributed PDN model, a lumped model topology using inductors, resistors, and capacitors is proposed. The interconnection of these lumped elements describes the main physical paths between the VR and the different silicon die components. Once the lumped topology is completed, a PE optimization algorithm is applied to find the lumped element parameter values that make the lumped PDN model response as close as possible to the distributed PDN model response. Once the lumped parameter values are extracted, the lumped circuit is attached to the signal integrity simulation deck in order to size the effects on the signal quality by evaluating the voltage or timing margins. This SI-PI simulation deck can be run several times to collect statistical data and obtain the UPM number; however, this step is not covered in the present thesis. If the voltage or timing margin targets are violated due to the noise injected through the PDN, then the PDN needs to be redesigned to reduce the noise. The solution space could consist of increasing the decoupling caps in certain areas to compensate specific frequency ranges of the noise, increasing the copper weight in the motherboard stack-up, changing the VR compensation components to increase the bandwidth response, etc.

The overall process flow described above is shown in Fig. 3-1, where the lumped model shown is a simplification of the actual lumped circuit used during the simulation. Tables I, II, III, and IV describe the correspondence between each lumped element of the actual circuit and its simplified lumped group. In summary, Fig. 3-1 shows a pictographic description of the process of assessing the impact of the PDN on the SI performance, using the PE optimization method to accelerate the design process.



Fig. 3-1 Overall process flow of assessing the impact of the PDN into the SI using PE optimization method.

TABLE I PRE-ASSIGNED PARAMETERS OF THE MOTHER BOARD LUMPED MODEL

| Lumped element name          | Simplified lumped group        | Vector element |
|------------------------------|--------------------------------|----------------|
| $R_{\mathrm{VR}}$            | $R_{\mathrm{MB}}$              | $p_1$          |
| $L_{\mathrm{VR}}$            | $L_{\mathrm{MB}}$              | $p_2$          |
| $L_{\mathrm{brd-bulk}}$      | $L_{\mathrm{MB}}$              | $p_3$          |
| $L_{\mathrm{bulk}}$          | $L_{\mathrm{MB}}$              | $p_4$          |
| $R_{\mathrm{bulk}}$          | $R_{\mathrm{MB}}$              | $p_5$          |
| $C_{\mathrm{bulk}}$          | $C_{\mathrm{bulk}}$            | $p_6$          |
| $L_{\mathrm{brd-MC}}$        | $L_{\mathrm{MB}}$              | $p_7$          |
| $L_{\mathrm{brd-D0}}$        | $L_{\mathrm{MB}}$              | $p_8$          |
| $L_{\mathrm{dcapD0}}$        | $L_{\mathrm{MB}}$              | $p_9$          |
| $R_{\mathrm{cbrdD0}}$        | $R_{\mathrm{MB}}$              | $p_{10}$       |
| $C_{\mathrm{cbrdD0}}$        | $C_{\mathrm{decoupling}}$      | $p_{11}$       |
| $L_{\mathrm{cbrdD00}}$       | $L_{\mathrm{MB}}$              | $p_{12}$       |
| $R_{\mathrm{cbrdD00}}$       | $R_{\mathrm{MB}}$              | $p_{13}$       |
| $C_{\mathrm{cbrdD00}}$       | $C_{\mathrm{decoupling}}$      | $p_{14}$       |
| $R_{\mathrm{brd-D1}}$        | $R_{\mathrm{MB}}$              | $p_{15}$       |
| $L_{\mathrm{brd-D1}}$        | $L_{\mathrm{MB}}$              | $p_{16}$       |
| $L_{\mathrm{cbrdD1}}$        | $L_{\mathrm{MB}}$              | $p_{17}$       |
| $R_{\mathrm{cbrdD1}}$        | $R_{\mathrm{MB}}$              | $p_{18}$       |
| $C_{\mathrm{cbrdD1}}$        | Board Section                  | $p_{19}$       |

TABLE II OPTIMIZATION VARIABLES OF THE MOTHER BOARD LUMPED MODEL

| Lumped element name          | Simplified lumped group        | Vector element |
|------------------------------|--------------------------------|----------------|
| $R_{\mathrm{brd-bulk}}$      | $R_{\mathrm{MB}}$              | $x_1$          |
| $R_{\mathrm{brd-MC}}$        | $R_{\mathrm{MB}}$              | $x_2$          |
| $R_{\mathrm{brd-brd-D0}}$    | $R_{\mathrm{MB}}$              | $x_3$          |

TABLE III PRE-ASSIGNED PARAMETERS OF THE DIMM LUMPED MODEL

| Lumped element                 | Simplified lumped group     | Vector element |
|--------------------------------|-----------------------------|----------------|
| $R_{\mathrm{conn}}$            | $R_{\mathrm{DIMM}}$         | $p_{20}$       |
| $L_{\mathrm{conn}}$            | $L_{\mathrm{DIMM}}$         | $p_{21}$       |
| $L_{c1}$                       | $L_{\mathrm{DIMM}}$         | $p_{22}$       |
| $R_{c1}$                       | $R_{\mathrm{DIMM}}$         | $p_{23}$       |
| $C_1$                          | $C_{\mathrm{decoupling}}$   | $p_{24}$       |
| $L_{c2}$                       | $L_{\mathrm{DIMM}}$         | $p_{25}$       |
| $R_{c2}$                       | $R_{\mathrm{DIMM}}$         | $p_{26}$       |
| $C_2$                          | $C_{\mathrm{decoupling}}$   | $p_{27}$       |
| $R_{\mathrm{vdd\_r0}}$         | $R_{\mathrm{DIMM}}$         | $p_{28}$       |
| $L_{\mathrm{vdd\_r0}}$         | $L_{\mathrm{DIMM}}$         | $p_{29}$       |
| $R_{\mathrm{vdd\_dcap0}}$      | $DRAM_{\mathrm{DIE}}$       | $p_{30}$       |
| $C_{\mathrm{vdd\_dcap0}}$      | $DRAM_{\mathrm{DIE}}$       | $p_{31}$       |
| $R_{\mathrm{vddq\_r0}}$        | $R_{\mathrm{DIMM}}$         | $p_{32}$       |
| $L_{\mathrm{vddq\_r0}}$        | $L_{\mathrm{DIMM}}$         | $p_{33}$       |
| $R_{\mathrm{vddq\_dcap0}}$     | $DRAM_{\mathrm{DIE}}$       | $p_{34}$       |
| $C_{\mathrm{vddq\_dcap0}}$     | $DRAM_{\mathrm{DIE}}$       | $p_{35}$       |
| $R_{\mathrm{vdd\_r1}}$         | $R_{\mathrm{DIMM}}$         | $p_{36}$       |
| $L_{\mathrm{vdd\_r1}}$         | $L_{\mathrm{DIMM}}$         | $p_{37}$       |
| $R_{\mathrm{vdd\_dcap1}}$      | $DRAM_{\mathrm{DIE}}$       | $p_{38}$       |
| $C_{\mathrm{vdd\_dcap1}}$      | $DRAM_{\mathrm{DIE}}$       | $p_{39}$       |
| $R_{\mathrm{vddq\_r1}}$        | $R_{\mathrm{DIMM}}$         | $p_{40}$       |
| $L_{\mathrm{vddq\_r1}}$        | $L_{\mathrm{DIMM}}$         | $p_{41}$       |
| $R_{\mathrm{vddq\_dcap1}}$     | $DRAM_{\mathrm{DIE}}$       | $p_{42}$       |
| $C_{\mathrm{vddq\_dcap1}}$     | $DRAM_{\mathrm{DIE}}$       | $p_{43}$       |

TABLE IV OPTIMIZATION VARIABLES OF THE DIMM LUMPED MODEL

| Lumped element                 | Simplified lumped group     | Vector element |
|--------------------------------|-----------------------------|----------------|
| $R_{\mathrm{dimm-c1}}$         | $R_{\mathrm{DIMM}}$         | $x_4$          |
| $L_{\mathrm{dimm-c1}}$         | $L_{\mathrm{DIMM}}$         | $x_5$          |
| $R_{\mathrm{dimm-c2}}$         | $R_{\mathrm{DIMM}}$         | $x_6$          |
| $L_{\mathrm{dimm-c2}}$         | $L_{\mathrm{DIMM}}$         | $x_7$          |
| $L_{\mathrm{dimm-bump}}$       | $L_{\mathrm{DIMM}}$         | $x_8$          |
| $R_{\mathrm{dram\_r0}}$        | $R_{\mathrm{DIMM}}$         | $x_9$          |
| $L_{\mathrm{dram\_r0}}$        | $L_{\mathrm{DIMM}}$         | $x_{10}$       |
| $R_{\mathrm{dram\_r1}}$        | $R_{\mathrm{DIMM}}$         | $x_9$          |
| $L_{\mathrm{dram\_r1}}$        | $L_{\mathrm{DIMM}}$         | $x_{10}$       |

# 3.2. Distributed PDN Model Description

DDR3 is the third generation of the DDR memory interface technology. In a DDR3 sub-system the PDN consists of two power supplies: VDD, with a nominal voltage of 1.5 V, and VTT, with a nominal voltage of VDD/2. As mentioned in the previous section, VDD provides energy to the MC in the CPU and to the DRAM memory devices in all dual in-line memory modules (DIMMs). The VDD supply is normally implemented through a switching VR. The VTT power supply, on the other hand, provides the proper voltage to the termination resistors on the command, address, and control signals on the DIMM, and it is a sink/source supply. VTT can be a low drop out (LDO) active device or a switching VR. This work focuses on the VDD power supply and not on VTT.

The VDD power supply, or VDD PDN in this work, is composed of a switching VR, VR phase inductors, a group of large bulk capacitors close to the VR, various groups of ceramic caps placed close to the DIMM memory modules, a group of decoupling caps close to the MC, a motherboard with different stacked copper layers which interconnects the different elements in the sub-system, socket connectors, DIMM connectors, DIMM boards, ceramic caps inside the DIMM boards, the MC or CPU package substrate with copper connections from pins to bumps, bumps to connect to silicon active devices, active devices in silicon, and metallization inside the silicon at both ends, DRAM and MC. Each ingredient of the VDD PDN is modeled by extracting an electrical representation of its behavior. For the sake of maintaining accuracy, most of the electrical behavior extractions require electromagnetic full-wave 3D field solvers, which produce a set of scattering parameters (S-parameters). The commercial solver PowerSI from Cadence<sup>17</sup> is used in this application. Once all components are modeled, they are interconnected to form the distributed PDN model. Fig. 3-2 shows the block diagram representing the different ingredients in the distributed PDN model of the VDD power rail.

<sup>17</sup> Cadence Design Systems Inc., San Jose, CA 95134, 2008.



Fig. 3-2 Block diagram of the distributed PDN of VDD power rail in a DDR3 memory subsystem.

One of the most complex ingredients to model is the mother board (MB). It is composed of plated through-holes (PTHs), multiple layers of metallic power planes with different shapes and dimensions, flame retardant (FR4) dielectric material between layers, CPU or MC pads, and soldered DIMM connectors. There are other components on the motherboard, such as decoupling caps close to the DIMM pin field, decoupling caps close to the MC, and bulk caps close to the VR. A pre-layout of the mother board and its main components is shown in Fig. 3-3. Once the layout structure is completed, it is modeled through S-parameters using the PowerSI<sup>TM</sup> 3D modeler. A similar procedure is applied to each of the blocks in Fig. 3-2.

Fig. 3-3 Pre-layout of the motherboard and its main components.

The other main components in the memory sub-system are the DIMMs. This application uses dual rank (DR) memory modules with DRAM devices of 4 bits (x4) of data width and a capacity of 4 Gb in a planar package per DRAM module, running at 1600 MT/s. These DIMMs belong to the DDR3 standard and are designated as raw card E by JEDEC<sup>18</sup>. As in the mother board, the PDN in the DIMM modules is composed of PTHs, multiple copper layers of ground and VDD planes, decoupling capacitors responding at different frequencies, etc. In DIMM modules the external power and ground connections are provided by the gold fingers, which are inserted in the board DIMM sockets. Fig. 3-4 shows the raw card E DIMM module and its main components. The DIMM distributed model is extracted using PowerSI<sup>TM</sup>.

<sup>18</sup> Joint Electron Device Engineering Council, Arlington, Virginia, U.S.

To give a better idea of the level of complexity of the resulting distributed PDN model, for this particular application it consists of two main macro models or black boxes. Both macro models are created from a Touchstone S-parameter matrix by a tool that performs a curve fitting process. One of these macro models represents the mother board and the other represents the DIMM module. In this distributed PDN model the DIMM model is instantiated twice. The mother board macro model has 35 ports for connecting the DIMMs, the VR, and capacitors. The DIMM module macro model has 38 ports for connecting the DRAM devices and the mother board. In total, the distributed PDN model has 2,380 nodes and 98,319 elements, comprising 1,618 resistors, 159 capacitors, 594 inductors, 95,368 voltage controlled current sources, 3 voltage controlled voltage sources, 111 current controlled current sources, 393 voltage sources, and 73 current sources.





Total elapsed time for a typical transient simulation is about 3,066 seconds (approx. 51 minutes) and total memory used during simulation is 127.9 MB.

# 3.3. Lumped PDN Model Definition

# 3.3.1 Lumped Model Topology

The proposed lumped model is divided into three independent sections: board section, DIMM section, and MC section. The topology of the lumped model sections is selected simply by following the connectivity path between the main PDN components. In this work, PE optimization is applied only to the board and DIMM sections to extract the lumped parameter values. The MC lumped elements are extracted using an empirical methodology that is not covered in this thesis.

The first step is to define the circuit nodes. Fig. 3-5 shows the physical layout of the motherboard and the corresponding lumped model topology representing the mother board. In the mother board section model, node 1 represents the connection between the VR output inductor and the board resistance and inductor parasitics to the bulk capacitors, node 2 is the connection between bulk caps and the main parasitics to the MC pins, node 3 is the connection between MC pins and parasitics to the DIMM0 connector, node 4 represents the connection between DIMM0 connector, DIMM0 board capacitors and parasitics to the DIMM1 connector.



Fig. 3-5 Top figure shows the physical layout of the motherboard and bottom figure shows the corresponding lumped model topology that represents the mother board.

Finally, node 5 is the connection between DIMM1 connector and DIMM1 board capacitors. In this application, DIMM2 and DIMM3 are not used.

Fig. 3-6 The lower left pictures show the physical layout of the DIMM card; the top left and right figures show the corresponding lumped model topology that represents the DIMM card and DRAM package.

The lumped model of the DIMM section is shown in Fig. 3-6. Node 1 represents the connection between the MB and the DIMM connector, node 2 is the connection between the DIMM connector and the parasitics to the first tier of low frequency (LF) capacitors, node 3 is the connection between the LF caps and the parasitics to the mid frequency (MF) caps, node 4 connects the MF caps and the parasitics to the DRAM package bumps, and node 5 connects the DRAM package bumps to the parasitics of the DRAM package toward the upper die and lower die in a dual die package (DDP) configuration. Nodes 6 and 9 represent the bifurcation between the VDD and VDDQ sections of the DRAM package. Nodes 7 and 10 represent the connections to the high frequency (HF) caps of the VDD die. Nodes 8 and 11 are the connections to the HF caps of the VDDQ die.
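The node-by-node description above amounts to a ladder of series R-L parasitics with shunt capacitor branches. The following Python sketch illustrates how the impedance of such a ladder could be evaluated at one frequency. It is only an illustration under stated assumptions: the function names, the two-stage example, and all element values are hypothetical, the VR is idealized as an AC short to ground, and the actual topologies of Figs. 3-5 and 3-6 contain more branches.

```python
import math


def parallel(z1, z2):
    """Impedance of two branches connected in parallel."""
    return z1 * z2 / (z1 + z2)


def ladder_impedance(freq_hz, stages):
    """Impedance seen at the last node, looking back toward the VR.

    stages: list of (r_series, l_series, shunt) tuples ordered from the VR
    to the load. shunt is None or an (esr, esl, cap) tuple describing a
    capacitor branch connected from that node to ground.
    freq_hz must be positive.
    """
    omega = 2.0 * math.pi * freq_hz
    z = 0j  # assumption: ideal VR behaves as an AC short to ground
    for r_series, l_series, shunt in stages:
        # Series parasitic path (resistance plus inductance) to the next node.
        z += complex(r_series, omega * l_series)
        if shunt is not None:
            esr, esl, cap = shunt
            # Capacitor branch with its ESR and ESL, from node to ground.
            z_branch = complex(esr, omega * esl - 1.0 / (omega * cap))
            z = parallel(z, z_branch)
    return z


# Hypothetical two-stage example: bulk capacitor stage, then ceramic cap stage.
example_stages = [
    (1e-3, 1e-9, (5e-3, 2e-9, 500e-6)),
    (2e-3, 0.5e-9, (3e-3, 0.5e-9, 10e-6)),
]
for f in (1e3, 1e5, 1e7):
    print(f, abs(ladder_impedance(f, example_stages)))
```

Sweeping `freq_hz` over a logarithmic grid would give an impedance profile of the kind that the lumped model is later fitted against.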

#### 3.3.2 Lumped Elements Classification

Some lumped elements are fixed because their values are already known, some are obtained from datasheets, and some others are extracted from standards. In our optimization formulation, all of these fixed elements are treated as pre-assigned parameters and remain constant across different designs and during PE optimization. In this application, some pre-assigned parameters are common to both the lumped and the distributed models, such as the DIMM connector resistance/inductance, the capacitance and ESL/ESR of discrete capacitors, the VR inductor, etc. Pre-assigned parameters are represented by vector $p \in \Re^m$.

There is another group of lumped elements whose parameter values are unknown. Their values must be found so that the lumped model approximates the response of the distributed PDN model as closely as possible. In our PE optimization formulation these elements are called design variables or optimization variables, and they are contained in vector $x \in \Re^n$. Table I and Table II show all mother board lumped elements and their corresponding parameter type. Table III and Table IV show all DIMM lumped elements and their corresponding parameter type.

# 3.4. Description of the Parameter Extraction by Optimization

# 3.4.1 Optimization Variables, Pre-Assigned Parameters, and Response of Interest

As mentioned before, all optimization variables are in $x \in \Re^n$ and all pre-assigned parameters are in $p \in \Re^m$. The optimization variables for the mother board are shown in Table II and those for the DIMM are in Table IV, including 12 lumped elements in total. From those 12 lumped elements, only 10 distinct values are used in x, since $R_{\mathrm{dram\_r0}} = R_{\mathrm{dram\_r1}} = x_9$ and $L_{\mathrm{dram\_r0}} = L_{\mathrm{dram\_r1}} = x_{10}$: Rank0 and Rank1 have exactly the same copper distribution in the DIMM, so the corresponding lumped parasitics should be the same. This reduces the optimization complexity because the dimensionality is reduced from 12 to 10 (n = 10). The pre-assigned parameters for the mother board are in Table I and those for the DIMM are in Table III, including 43 pre-assigned parameters (m = 43). The response of interest is denoted as R(x, p) and corresponds to the frequency response of the PDN.
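The reduction from 12 lumped elements to 10 optimization variables can be pictured as a simple index map. The sketch below is illustrative only: the element names follow Tables II and IV, but the dictionary layout and the helper `expand` are hypothetical and not part of the original flow.

```python
# Map each optimized lumped element (Tables II and IV) to its 0-based
# position in the design vector x. Rank 1 reuses the rank 0 entries.
ELEMENT_TO_INDEX = {
    "R_brd-bulk": 0, "R_brd-MC": 1, "R_brd-brd-D0": 2,  # x1..x3 (Table II)
    "R_dimm-c1": 3, "L_dimm-c1": 4,                      # x4, x5
    "R_dimm-c2": 5, "L_dimm-c2": 6,                      # x6, x7
    "L_dimm-bump": 7,                                    # x8
    "R_dram_r0": 8, "L_dram_r0": 9,                      # x9, x10
    "R_dram_r1": 8, "L_dram_r1": 9,                      # shared with rank 0
}


def expand(x):
    """Return the 12 lumped element values implied by a 10-entry vector x."""
    if len(x) != 10:
        raise ValueError("x must contain exactly 10 optimization variables")
    return {name: x[index] for name, index in ELEMENT_TO_INDEX.items()}


# Usage with placeholder values 1..10 standing in for x1..x10.
elements = expand(list(range(1, 11)))
print(len(elements), elements["R_dram_r0"], elements["R_dram_r1"])
```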

### 3.4.2 Objective Function

The objective function to be minimized is built from three scalar error functions of the multidimensional vector $\mathbf{x}$: $e_1(\mathbf{x})$, $e_2(\mathbf{x})$, and $e_3(\mathbf{x})$. It is defined as

$$u(\mathbf{x}) = \max\{e_1(\mathbf{x}), e_2(\mathbf{x}), e_3(\mathbf{x})\} \tag{3-1}$$

where

$$e_1(\mathbf{x}) = \|\mathbf{R}(\mathbf{x}, \mathbf{p}) - \mathbf{R}^t\|_2^2 \tag{3-2}$$

$$e_2(\mathbf{x}) = \max(\mathbf{x} - \mathbf{x}_{\max}) \tag{3-3}$$

$$e_3(\mathbf{x}) = \max(\mathbf{x}_{\min} - \mathbf{x}) \tag{3-4}$$

The objective function value in this PE problem is given by the maximum value among the three error function values. Error function $e_1$ is the squared $l_2$ norm of the difference between the AC response of the lumped PDN model, $\mathbf{R}(\mathbf{x}, \mathbf{p})$, and the target response $\mathbf{R}^t$ (AC response of the distributed model). Since the pre-assigned parameters $\mathbf{p}$ remain fixed, $e_1$ is treated as a function of $\mathbf{x}$ only. Error functions $e_2$ and $e_3$ play the role of penalty functions; they are used to constrain the optimization variables to a feasible region defined by the upper and lower bounds $\mathbf{x}_{\max}$ and $\mathbf{x}_{\min}$.
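A compact way to read (3-1) to (3-4) is as a single function that returns the largest of the three error terms. The Python sketch below is a minimal illustration under stated assumptions: the name `objective`, the callable `response` (standing in for the circuit simulation that returns the sampled response R(x, p) with p held fixed), and the toy response used in the usage lines are hypothetical.

```python
def objective(x, response, target, x_min, x_max):
    """Evaluate u(x) = max{e1(x), e2(x), e3(x)} as in (3-1) to (3-4).

    response: callable mapping x to the sampled model response R(x, p);
              the pre-assigned parameters p are assumed fixed inside it.
    target:   sampled target response R^t, same length as response(x).
    """
    r = response(x)
    # e1: squared l2 norm of the response error, (3-2).
    e1 = sum(abs(ri - ti) ** 2 for ri, ti in zip(r, target))
    # e2: largest violation of the upper bounds, (3-3).
    e2 = max(xi - ui for xi, ui in zip(x, x_max))
    # e3: largest violation of the lower bounds, (3-4).
    e3 = max(li - xi for li, xi in zip(x_min, x))
    return max(e1, e2, e3)


# Toy usage: one variable, two response samples (hypothetical model).
toy_response = lambda x: [x[0], 2.0 * x[0]]
print(objective([1.0], toy_response, [1.0, 2.0], [0.0], [2.0]))
```

Inside the feasible region both penalty terms are non-positive, so the value of u(x) reduces to the response error $e_1$; outside it, a bound violation can dominate.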

### 3.4.3 Optimization Problem Formulation

The aim of parameter extraction in this work consists of finding within a feasible region the optimal x that minimizes the error between the VDD PDN lumped model response R(x, p) in AC and the target response  $R^t$  given by the VDD PDN distributed model. The optimization problem for finding  $x^*$  is

$$\boldsymbol{x}^* = \arg\min_{\boldsymbol{x}} \boldsymbol{u}(\boldsymbol{x}) \tag{3-5}$$

The PE problem definition in (3-2) is a particular case of the general *p*-th norm formulation, for which the least-squares $l_2$ norm (p = 2) is selected. The objective function u(x) defined in (3-1) is minimized in (3-5) and, as a consequence, the error between the lumped and the distributed model responses defined by (3-2) is minimized. In addition to the $l_2$ formulation in (3-2), the objective function is complemented with box constraints, defined by (3-3) and (3-4) as penalty functions.

#### 3.4.4 Optimization Method

This parameter extraction formulation is solved using the classical Nelder-Mead optimization method, which belongs to the group of direct search methods available in Matlab<sup>19</sup>. This method does not require gradients and converges relatively quickly. A more detailed description of this method is given in [Gilli-11].
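To make the idea of a gradient-free simplex search concrete, the following Python sketch implements a basic Nelder-Mead variant (reflection, expansion, inside contraction, and shrink) and applies it to a toy parameter extraction problem. It is not the Matlab routine used in this work: the function `nelder_mead`, its default coefficients, the stopping rule, and the two-parameter series R-L toy model with normalized frequencies are all assumptions made for illustration.

```python
def nelder_mead(f, x0, step=0.1, xtol=1e-6, max_iter=1000):
    """Minimize f from seed x0 with a basic Nelder-Mead simplex search."""
    n = len(x0)
    # Initial simplex: the seed plus one point displaced along each axis.
    simplex = [list(x0)] + [
        [v + (step if j == i else 0.0) for j, v in enumerate(x0)]
        for i in range(n)
    ]
    values = [f(p) for p in simplex]
    for _ in range(max_iter):
        # Sort vertices from best (lowest f) to worst.
        order = sorted(range(n + 1), key=lambda k: values[k])
        simplex = [simplex[k] for k in order]
        values = [values[k] for k in order]
        best, worst = simplex[0], simplex[-1]
        # Stop when the variation of the parameters falls below xtol.
        if max(abs(a - b) for p in simplex[1:] for a, b in zip(p, best)) < xtol:
            break
        centroid = [sum(p[i] for p in simplex[:-1]) / n for i in range(n)]
        reflected = [c + (c - w) for c, w in zip(centroid, worst)]
        f_r = f(reflected)
        if f_r < values[0]:
            # Reflection is the new best point: try to expand further.
            expanded = [c + 2.0 * (c - w) for c, w in zip(centroid, worst)]
            f_e = f(expanded)
            if f_e < f_r:
                simplex[-1], values[-1] = expanded, f_e
            else:
                simplex[-1], values[-1] = reflected, f_r
        elif f_r < values[-2]:
            simplex[-1], values[-1] = reflected, f_r
        else:
            # Contract the worst vertex toward the centroid.
            contracted = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            f_c = f(contracted)
            if f_c < values[-1]:
                simplex[-1], values[-1] = contracted, f_c
            else:
                # Shrink every vertex toward the best one.
                simplex = [best] + [
                    [b + 0.5 * (a - b) for a, b in zip(p, best)]
                    for p in simplex[1:]
                ]
                values = [values[0]] + [f(p) for p in simplex[1:]]
    k = min(range(n + 1), key=lambda i: values[i])
    return simplex[k], values[k]


# Toy PE problem: recover R and L of a series R-L branch from its impedance.
OMEGAS = [0.5, 1.0, 2.0, 4.0]  # normalized angular frequencies (hypothetical)
TARGET = [complex(2.0, w * 0.5) for w in OMEGAS]  # built from R = 2, L = 0.5


def error(x):
    """Squared l2 norm of the complex impedance error."""
    r, l = x
    return sum(abs(complex(r, w * l) - t) ** 2 for w, t in zip(OMEGAS, TARGET))


x_star, u_star = nelder_mead(error, [1.0, 1.0])
print(x_star, u_star)
```

The search only needs function evaluations, which is why this family of methods suits objective functions evaluated through a circuit simulator.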

#### 3.4.5 Seed Values

The seed values, or starting point values, of the optimization variables are selected by estimating the resistance and inductance from the S-parameter matrix of the distributed parameter model. Scattering parameters in Touchstone format are translated into a Z-parameter matrix using PowerSI<sup>TM</sup>; the Z-parameter matrix can then be reduced by short circuiting any port. In this way, the impedance of the path between ports is plotted in its real and imaginary components. Using that information, the approximate resistance is obtained from the real component and the inductance from the reactive (imaginary) component. Initial values of design variables are  $\mathbf{x}^{(0)} = [1.79 \text{m}\Omega \ 2.92 \text{m}\Omega \ 3.75 \text{m}\Omega \ 0.6 \text{m}\Omega \ 64 \text{pH} \ 2.56 \text{m}\Omega \ 0.1 \text{pH} \ 20 \text{pH} \ 1\text{m}\Omega \ 5\text{pH}]^{T}$ .
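The seed estimation described above can be summarized, for a path that behaves as a series R-L branch, as R ≈ Re{Z} and L ≈ Im{Z}/(2πf). The Python sketch below shows this calculation on a synthetic impedance sample. The function name and the 100 MHz evaluation frequency are hypothetical, and the estimate is only meaningful at frequencies where the imaginary part is inductive (positive).

```python
import math


def estimate_series_rl(z, freq_hz):
    """Estimate series resistance and inductance from one impedance sample.

    z: complex path impedance in ohms at freq_hz (must be positive).
    Returns (resistance in ohms, inductance in henries).
    """
    if freq_hz <= 0.0:
        raise ValueError("freq_hz must be positive")
    resistance = z.real
    inductance = z.imag / (2.0 * math.pi * freq_hz)
    return resistance, inductance


# Synthetic sample built from two of the seed values (1.79 mOhm, 64 pH),
# evaluated at a hypothetical frequency of 100 MHz.
f = 100e6
z_sample = complex(1.79e-3, 2.0 * math.pi * f * 64e-12)
print(estimate_series_rl(z_sample, f))
```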

To improve the performance of the optimization algorithm, design variables are scaled to reduce the variability in range of values. The scaling process consists of multiplying each design variable element by a scaling factor, as follows:  $x_1(1\times10^3)$ ,  $x_2(1\times10^3)$ ,  $x_3(1\times10^3)$ ,  $x_4(1\times10^3)$ ,  $x_5(1\times10^{12})$ ,  $x_6(1\times10^3)$ ,  $x_7(1\times10^{12})$ ,  $x_8(1\times10^{12})$ ,  $x_9(1\times10^3)$ ,  $x_{10}(1\times10^{12})$ .
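As an illustration of this scaling, the Python sketch below converts the seed vector from SI units (ohms and henries) to the scaled variables seen by the optimizer, and back. The scaling factors and seed values are those listed above; the names `SCALE`, `X0_SI`, `to_scaled`, and `to_si` are hypothetical.

```python
# Scaling factors for x1..x10: 1e3 for resistances, 1e12 for inductances.
SCALE = [1e3, 1e3, 1e3, 1e3, 1e12, 1e3, 1e12, 1e12, 1e3, 1e12]

# Seed vector x(0) expressed in SI units (ohms and henries).
X0_SI = [1.79e-3, 2.92e-3, 3.75e-3, 0.6e-3, 64e-12,
         2.56e-3, 0.1e-12, 20e-12, 1e-3, 5e-12]


def to_scaled(x_si):
    """Multiply each design variable by its scaling factor."""
    return [value * factor for value, factor in zip(x_si, SCALE)]


def to_si(x_scaled):
    """Undo the scaling before passing values to the circuit model."""
    return [value / factor for value, factor in zip(x_scaled, SCALE)]


# After scaling, all seed values lie between 0.1 and 64 instead of
# spanning roughly ten orders of magnitude.
print(to_scaled(X0_SI))
```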

#### 3.4.6 Optimization Results

The system response using seed values  $\mathbf{R}(\mathbf{x}^{(0)}, \mathbf{p})$  is evaluated and compared with the target response  $\mathbf{R}^{t}$ . As a result, it is found that the typical methodology of extracting lumped values from Touchstone data is not accurate enough to predict the PDN behavior. Fig. 3-7 shows the impedance response of the PDN for the lumped model using seed values and for the distributed model, or target response. The impedance error at very low frequencies (below 1 kHz) is about 54%, the impedance error at the first resonance frequency is about 7%, the impedance error at the second resonance frequency is about 48% with a frequency shift of about 9 MHz, and the error at the third resonance frequency is about 38% with a 4 MHz shift.

Fig. 3-7 Lumped model response with seed optimization parameters  $R(x^{(0)}, p)$  vs. target response  $R^{t}$  from the distributed model.

<sup>19</sup> MATLAB, Version 9.1.0, the MathWorks, Inc., 3 Apple Hill Drive, Natick MA 01760-2098, 2016.

Comparing the system response after optimization  $R(x^*, p)$  with the target  $R^t$ , both responses are now much better matched than before optimization, as shown in Fig. 3-8. In this comparison, the impedance error at very low frequencies (below 1 kHz) is negligible, the impedance error at the first resonance frequency is just 1%, the impedance error at the second resonance frequency is about 5% with no frequency shift, and the error at the third resonance frequency is about 32% with only a 2 MHz shift. Table V summarizes the differences between the three responses.

TABLE V IMPEDANCE MAGNITUDES AND RESONANT FREQUENCIES OF MEMORY PDN BEFORE AND AFTER OPTIMIZATION

| Modeling type | Z(f) at <1 kHz | Z(f) at first resonance | First resonance frequency | Z(f) at second resonance | Second resonance frequency (MHz) | Z(f) at third resonance | Third resonance frequency (MHz) |
|---|---|---|---|---|---|---|---|
| Distributed model $R^{t}$ | 9.5 | 23.5 | 11.7 | 22.7 | 40.7 | 11.1 | 89.1 |
| Response at starting point $R(x^{(0)}, p)$<br>(error vs target in %) | 14.6<br>(54%) | 25.1<br>(7%) | 11.2<br>(-4%) | 33.6<br>(48%) | 31.6<br>(-22%) | 6.9<br>(-38%) | 85.1<br>(-4%) |
| Response optimized $R(x^*, p)$<br>(error vs target in %) | 9.5<br>(0%) | 23.8<br>(1%) | 11.5<br>(-2%) | 23.9<br>(5%) | 40.7<br>(0%) | 7.5<br>(-32%) | 87.1<br>(-2%) |

Fig. 3-8 Optimized lumped model response  $\mathbf{R}(\mathbf{x}^*, \mathbf{p})$  vs. target response  $\mathbf{R}^t$  from the distributed model.

The error in the impedance magnitude at the third resonance frequency is still outside the 5% tolerance. This frequency range is governed by $R_{\mathrm{vdd\_r0/1}}$ and $L_{\mathrm{vdd\_r0/1}}$, but those elements are not included in the optimization loop, as mentioned in Section 3.3; this is the reason why that resonance peak does not improve significantly.
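The percentage errors reported in Table V follow from the usual relative error with respect to the target. The short Python sketch below reproduces a few of them; the helper name `percent_error` is hypothetical and the numerical values are taken from Table V.

```python
def percent_error(value, target):
    """Signed relative error of value with respect to target, in percent."""
    return 100.0 * (value - target) / target


# Impedance below 1 kHz: seed response vs. distributed (target) response.
print(round(percent_error(14.6, 9.5)))    # seed, low frequency
# Second resonance: impedance and frequency of the seed response.
print(round(percent_error(33.6, 22.7)))   # seed, impedance
print(round(percent_error(31.6, 40.7)))   # seed, frequency (negative: lower)
# Second resonance: impedance of the optimized response.
print(round(percent_error(23.9, 22.7)))   # optimized, impedance
```

A negative value indicates that the lumped model response lies below the target.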

The PE optimization process takes 172 iterations and 286 function evaluations using the Nelder-Mead simplex direct search algorithm. The optimization stops when the variation of the design parameters falls below the tolerance criterion, set at  $1 \times 10^{-3}$ , with a final objective function value of 0.15092, meaning that the algorithm likely converged to a local minimum. Since the output impedance tolerance is met, this optimal solution is accepted. The optimized design variables obtained are  $\mathbf{x}^* = [1.3 \text{m}\Omega \ 2\text{m}\Omega \ 1\text{m}\Omega \ 0.013675\text{m}\Omega \ 0.00033503\text{pH} \ 2.4893\text{m}\Omega \ 0.065312\text{pH} \ 8.8692\text{pH} \ 0.7246\text{m}\Omega \ 4.4651\text{pH}]^{\text{T}}.$ 

Regarding computational cost, a large difference is also found between the distributed and lumped models. Both circuit models are simulated in the transient regime using the HSPICE<sup>20</sup> simulation engine. As expected, a significant reduction in computational cost is observed with the lumped model: it uses only 1.2% of the total memory and 0.2% of the CPU time used by the distributed model. Table VI shows this comparison in computational cost between the lumped and distributed models.
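The ratios quoted above can be checked directly from the figures in Table VI. The Python sketch below does so; the dictionary names are hypothetical and the numbers are the ones reported in the table.

```python
# Transient simulation cost reported in Table VI.
distributed = {"memory_kb": 127880, "cpu_s": 3064.45, "elapsed_s": 3066.96}
lumped = {"memory_kb": 1565, "cpu_s": 7.60, "elapsed_s": 10.27}

# Lumped cost as a percentage of the distributed cost.
ratio_percent = {
    key: 100.0 * lumped[key] / distributed[key] for key in distributed
}

# Speed-up factor in CPU time (distributed over lumped).
cpu_speedup = distributed["cpu_s"] / lumped["cpu_s"]

for key, value in ratio_percent.items():
    print(key, round(value, 1))
print("CPU time speed-up:", round(cpu_speedup))
```

The CPU time ratio corresponds to a speed-up of roughly 400 times.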

# 3.5. Optimized PDN Applied to DDR Signal Integrity Analysis

As a test vehicle to assess the benefits of the proposed PE optimization methodology for developing efficient and accurate lumped PDN models, the optimized lumped model is now applied to the SI-PI co-design of a DDR memory sub-system. This section summarizes the results for the main signal integrity (SI) characteristics obtained by incorporating the optimized lumped PDN model into the SI simulation deck.

### 3.5.1 SI Analysis Assumptions

As mentioned before, the memory configuration to be evaluated consists of one memory channel populated with two DIMMs per channel (2 DPC); the DIMMs are dual rank (2R) and run at 1600 MT/s. The PDN stimuli and the I/O bit pattern assume an alternating 1-0 pattern (10101010 ...).

TABLE VI COMPARISON OF THE TRANSIENT SIMULATION COMPUTATIONAL COST BETWEEN LUMPED AND DISTRIBUTED PDN MODELS

| Modeling type          | Total memory used<br>(kBytes) | Total CPU time<br>(Seconds) | Total Elapsed time<br>(Seconds) |
|------------------------|-------------------------------|-----------------------------|---------------------------------|
| Distributed model      | 127,880                       | 3,064.45                    | 3,066.96                        |
| Lumped seed            | 1,565                         | 7.60                        | 10.27                           |
| Lumped/Distributed (%) | 1.2%                          | 0.2%                        | 0.3%                            |

<sup>20</sup> HSPICE®, ver. B-2008.09-SP1, Synopsys Inc., Mountain View, CA, 2008.

### 3.5.2 SI Tests Description

During the SI test, several simulation cases are evaluated. Test 1 consists of running the simulation with a single DQ lane toggling between logic 1 and 0 with PDN noise stimuli. Test 2 drives a train of 64 bits in 10 DQ lanes with 101010... bit pattern transitions and with PDN noise stimuli. Test 3 is the control experiment: it drives 64 bits in 10 DQ lanes with the same 101010... bit pattern transitions but without PDN noise stimuli, in order to obtain a baseline of the SI characteristics and then size the impact of the PDN noise stimuli.
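As a compact way of keeping the three cases apart, the test matrix can be written down as data. The sketch below is only an illustration: the class name, field names and helper function are hypothetical, and the number of bits for Test 1 is left undefined because the text does not state it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SiCase:
    name: str
    dq_lanes: int          # number of DQ lanes toggling
    bits: Optional[int]    # bits driven per lane (None: not stated in the text)
    pdn_noise: bool        # True when the PDN noise stimulus is applied


def alternating_pattern(n_bits, first_bit=1):
    """Return the 1010... interleaving pattern as a list of 0/1 integers."""
    return [(first_bit + i) % 2 for i in range(n_bits)]


SI_CASES = [
    SiCase("Test 1", dq_lanes=1, bits=None, pdn_noise=True),
    SiCase("Test 2", dq_lanes=10, bits=64, pdn_noise=True),
    SiCase("Test 3", dq_lanes=10, bits=64, pdn_noise=False),  # baseline
]

for case in SI_CASES:
    if case.bits is not None:
        preview = "".join(str(b) for b in alternating_pattern(case.bits)[:8])
        print(f"{case.name}: {case.dq_lanes} lanes, {case.bits} bits "
              f"({preview}...), PDN noise: {case.pdn_noise}")
```

Tests 2 and 3 differ only in the `pdn_noise` flag, which is what makes Test 3 usable as the baseline.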

### 3.5.3 SI Analysis Results

The SI simulation time for a single run (64-bit transmission through 10 DQ lanes) without the PDN is around 30-40 minutes; once the optimized lumped PDN is incorporated, a single run takes 11-12 hours. The SI simulation using the distributed PDN is not tested due to the much longer simulation time implied (the distributed model simulation time is approximately 400 times longer than that of the lumped model). Simulation time reduction is the main benefit of using the lumped model instead of the distributed PDN model.

First of all, Test 1 shows how the PDN noise in the main VDDQ rail is coupled into the RX signal. The VDDQ power rail presents a voltage swing of ~191 mV peak to peak, which appears in the RX signal as ~127 mV peak to peak. Fig. 3-9 shows the Test 1 VDDQ noise and how it is coupled into the signal at the receiver. This additional PDN noise coupled into the RX signal degrades the eye height (EH) margin.

In Test 2, the signal amplitude of the 64-bit train of pulses is modulated by a low frequency noise signal coming from the VDDQ rail; this low frequency noise is the voltage variation resulting from the PDN noise. This low frequency noise modulation on the RX and TX signals is shown in Fig. 3-10. Test 3 is the baseline and shows the TX and RX voltage waveforms of the DQ signal without the effect of the PDN noise. Using the resulting signal of Test 3 as the baseline waveform and the resulting signal of Test 2 as the waveform with the PDN included, it is concluded that the PDN causes a 90 mV degradation in the high-level input voltage ( $V_{\rm IH}$ ) at the receiver side (RX). Fig. 3-11 compares the DQ signal waveforms with and without PDN noise. This signal degradation due to the PDN can severely affect the eye height (EH) margins in a UPM analysis.

As an outcome, it is found that the PDN introduces ~127 mV of noise into the propagated DQ signal, which degrades  $V_{IH}$  by 90 mV at the RX side. As a consequence, this noise severely reduces the EH margins in a UPM analysis, which will significantly degrade the SI performance of the DDR channel.

In this test vehicle it is observed that the voltage noise in the DQ signal is asymmetrical: the  $V_{\rm IH}$  noise level is larger than the low-level input voltage ( $V_{\rm IL}$ ) noise (see Fig. 3-9). This asymmetry arises because both the lumped and the distributed PDN models are single sided. This means that all the ground ( $V_{\rm ss}$ ) parasitics are assigned to the power side (VDD); consequently,  $V_{\rm IH}$  presents most of the total PDN noise and  $V_{\rm IL}$  only a very small portion of it. This asymmetry could change the EH shape, and the margins in a UPM analysis could be different. A possible solution to this issue would be to use a dual sided model, in which the actual parasitics in the ground and power rails are distributed accordingly.



Fig. 3-9 Test 1 PDN VDDQ noise and how it is coupled into the signal at receiver RX side in the high-level input voltage ( $V_{\text{IH}}$ ).



Fig. 3-10 Test 2 driver (TX) and receiver (RX) signals including PDN noise. 64 bits in 10 DQ lines with 101010 pattern.



Fig. 3-11 Test 2 and Test 3 transmitter (TX) and receiver (RX) signals with and without PDN noise. 64 bits in 10 DQ lines with 101010 pattern.

# 3.6. Conclusions

A new power delivery design methodology was described in this chapter for simplifying a PDN model using a PE technique based on optimization methods. This methodology can significantly increase design efficiency: it reduces the PDN simulation time by approximately 99.8% with respect to a standard industry methodology based on 3D full-wave distributed models. It was also demonstrated that the optimized PDN model keeps a sufficient correlation with the distributed model. However, at the highest resonance frequency (87 MHz) the model error is around 32%, because the lumped elements that affect that frequency were not included in the optimization process. It is expected that adding those elements to the optimization loop will reduce the error.

The proposed PE methodology was tested in an SI-PI co-design of a DDR sub-memory system. In this test deck, it was found that the PDN noise impacts the signal integrity by injecting excessive noise into the RX and TX signals. This design problem was detected by running the SI-PI simulation deck based on the optimized lumped PDN model in three iterations, each taking between 11 and 12 hours. In contrast, the SI-PI simulation deck based on a distributed PDN model proved prohibitive due to its excessively high computational cost.

# 4. Accurate Simulation of Package Substrate Air Core Inductors

The current trend of increasing speed and performance in computing systems comes together with a larger and more expensive power consumption. This performance demand is being addressed by increasing the central processing unit (CPU) speed, increasing the number of CPU cores working in parallel, or both. In either case, the price is paid in total power demand. This increment in power translates directly into larger electrical currents flowing from the main power supply to the silicon chips. As a consequence, efficiently delivering power to the silicon die becomes more challenging.

One strategy for improving power efficiency consists of installing high-frequency power supplies as close as possible to the circuits at die level, establishing power control for individual CPU cores in order to avoid unnecessary power losses, as described in [Burton-14] and [Mathuna-12]. For this purpose, package substrates are being used to implement physical structures that serve as the output inductor of switching power supplies installed close to or in the silicon die. The electrical characteristics of these package substrate inductors are critical to achieve the highest efficiency of the voltage regulator (VR), as well as to keep output voltage ripple and voltage noise within specifications.

Given the physical space constraints in the package substrate, due to the high cost of real estate and the ongoing miniaturization of the silicon process technology, the design of these package substrate inductors is a challenge. As a consequence, the design process of these package inductors must obtain suitable electrical and mechanical characteristics within the given physical constraints.

This chapter describes an accurate simulation process, based on a high-fidelity or fine model, of a package substrate inductor using 3D full-wave electromagnetic (EM) simulation tools. This accurate simulation process can be used during the design of an ACI in order to satisfy the high-frequency VR requirements. Simulation time is another main factor that should be considered during the simulation process, and it is the main drawback of this accurate methodology using a fine model. Simulation time therefore becomes an important practical criterion to select the most appropriate inductor model. In any case, the resultant electrical responses from these highly accurate but computationally expensive fine model simulations will be used as a reference in Chapter 5, where less accurate but faster models will be developed.

# 4.1. Designing Package Substrate Air Core Inductors (ACI)

The technological progress in fitting high-power transistors in small silicon cases allows high-efficiency DC-DC buck converters that work with switching frequencies in the order of multi-MHz. At the same time, this increment in frequency reduces the inductance requirement of the output inductor of the switching voltage regulators (VR), as pointed out in [Mathuna-10]. The requirements of this type of VR, such as voltage ripple, transient response, current handling capability, and reduction of inductance, allow the implementation of substrate air core inductors (ACI) at low cost, without significant impact on the real estate [Lambert-14]. Package air core inductors must be carefully designed since they impact the main VR metrics.

### 4.1.1 ACI Design Process

The ACI design process consists of finding the right inductor to meet output ripple and transient response specifications, maximize VR efficiency, and meet thermal and reliability limits [Bharath-16]. This design process typically involves finding the minimum inductance that meets the output ripple target, the maximum inductance that meets the transient response requirements, and the minimum inductance that meets the input-to-output noise rejection targets; defining the number of phases needed to support the maximum load requirements; defining the minimum number of solenoid turns that can produce the target inductance within the available real estate of the package; and quantifying the amount of dummy metals required to yield a manufacturable ACI, among others.

### 4.1.2 ACI Design Optimization Process

Once all the relevant inductor target characteristics for a given case are defined, such as those listed in the previous subsection, the first ACI is implemented. From that starting point, the ACI can be optimized using empirical or numerical optimization processes to obtain the target inductance and maximize the efficiency. This optimization process can be time consuming and can need large computational resources because it typically requires multiple simulations of the ACI fine model.

# 4.2. Physical Structure and Electrical Characteristics of an ACI

Air core inductors are embedded passive elements built in the stack up of an organic package through routing copper layers, vias, and FR4 dielectrics. There are two major ACI topologies: snake inductors (vertical solenoids) and racetrack inductors (horizontal solenoids). Snake inductors are vertical loops implemented through the package core to obtain the volume for the inductor. Racetrack inductors are horizontal loops implemented through the planes in the bottom layers of the package. Both ACI topologies have some advantages and disadvantages over each other.

Snake inductor advantages over racetrack inductors include: larger VR efficiencies due to higher quality factors, higher inductance per area, higher input noise rejection, more free space in the bottom package layers for signal and package capacitor routing, and better power/ground planes.

Racetrack inductor advantages over snake inductors include: thinner core layers and shorter vertical connectivity to bottom package capacitors, as well as cheaper substrate package solution.

The typical area of an ACI depends on the silicon lithography technology used in the die. Smaller silicon technology requires a smaller ACI area. For Intel desktop technology, a typical ACI dimension is 1.4 mm x 1.2 mm [Lambert-16].

### 4.2.1 Input Design Parameters or Physical Characteristics of the ACI

We select a snake ACI type for describing the main input design characteristics. The snake ACI type vertical loops are implemented mainly by PTH vias crossing the core of the package substrate in order to obtain the inductor void.

The selected snake ACI is shown in Fig. 4-1. It consists of 6 inductor coils or solenoids connected in parallel. Each solenoid has only one turn. The input of each solenoid is connected to the power transistors of a multi-phase buck converter. In this case, each solenoid connects to a phase of a 6-phase buck converter. The output of each solenoid is connected to a common node, called the cold bar, which distributes the power to the load. Another important characteristic of this ACI is the coupling effect between solenoids, which improves the efficiency of the buck converter. Coupling is achieved by placing the solenoids such that the current flows through them in opposite directions, as shown in Fig. 4-1b.

Some of the physical characteristics of the ACI are parametrized into two groups: first-order design parameters and second-order design parameters. First-order design parameters are those physical characteristics that can be modified, within some limits, without any impact on other structures. The parameters in this category are depicted in Fig. 4-2 and they include: top coil plane width ( $w_1$ ), bottom coil plane width ( $w_2$ ), coil PTH diameter ( $d_1$ ), copper weight per layer ( $t_1 - t_8$ ), and dielectric thickness between layers ( $h_1 - h_7$ ). First-order design parameters are stored in vector  $\mathbf{x} = [w_1 \ w_2 \ d_1 \ t_1 \ t_2 \ t_3 \ t_4 \ t_5 \ t_6 \ t_7 \ t_8 \ h_1 \ h_2 \ h_3 \ h_4 \ h_5 \ h_6 \ h_7]^{\text{T}}$ , whose values are listed in Table VII.



Fig. 4-1 a) Package substrate snake type coupled ACI with 6 inductor coils or solenoids connected in parallel. b) Small section of the ACI showing the coupling effect obtained by placing the solenoids such that the currents in adjacent coils flow in opposite directions.



Fig. 4-2 First order design parameters: top and bottom coil plane width  $(w_1, w_2)$ , coil PTH diameter  $(d_1)$ , copper and dielectric thickness per layer  $(t_1 - t_8, h_1 - h_7)$ .

| Name  | Value (mm) | Vector element |
|-------|------------|----------------|
| $w_1$ | 0.406      | $x_1$          |
| $w_2$ | 0.351      | $x_2$          |
| $d_1$ | 0.15       | $x_3$          |
| $t_1$ | 0.015      | $x_4$          |
| $t_2$ | 0.035      | $x_5$          |
| $t_3$ | 0.035      | $x_6$          |
| $t_4$ | 0.015      | $x_7$          |
| $t_5$ | 0.015      | $x_8$          |
| $t_6$ | 0.015      | $x_9$          |
| $t_7$ | 0.015      | $x_{10}$       |
| $t_8$ | 0.015      | $x_{11}$       |
| $h_1$ | 0.03       | $x_{12}$       |
| $h_2$ | 0.74       | $x_{13}$       |
| $h_3$ | 0.03       | $x_{14}$       |
| $h_4$ | 0.025      | $x_{15}$       |
| $h_5$ | 0.025      | $x_{16}$       |
| $h_6$ | 0.025      | $x_{17}$       |
| $h_7$ | 0.025      | $x_{18}$       |

TABLE VII FIRST-ORDER DESIGN PARAMETERS OF THE ACI
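For scripting purposes, the design vector of Table VII can be held in an ordered mapping, so that each value keeps its name while the vector order $x_1 \ldots x_{18}$ is preserved. The sketch below is an illustration only; the variable and function names are hypothetical, and summing the copper and dielectric thicknesses assumes that $t_1 - t_8$ and $h_1 - h_7$ cover the complete stack-up.

```python
# First-order design parameters from Table VII, in mm, in vector order x1..x18.
FIRST_ORDER_MM = {
    "w1": 0.406, "w2": 0.351, "d1": 0.15,
    "t1": 0.015, "t2": 0.035, "t3": 0.035, "t4": 0.015,
    "t5": 0.015, "t6": 0.015, "t7": 0.015, "t8": 0.015,
    "h1": 0.03, "h2": 0.74, "h3": 0.03, "h4": 0.025,
    "h5": 0.025, "h6": 0.025, "h7": 0.025,
}

# Python dictionaries preserve insertion order, so this is the vector x.
x = list(FIRST_ORDER_MM.values())


def total_thickness(params, prefix):
    """Sum all parameters whose name starts with the given prefix ('t' or 'h')."""
    return sum(value for name, value in params.items() if name.startswith(prefix))


copper_mm = total_thickness(FIRST_ORDER_MM, "t")
dielectric_mm = total_thickness(FIRST_ORDER_MM, "h")

print(f"design vector length: {len(x)}")
print(f"copper: {copper_mm:.3f} mm, dielectric: {dielectric_mm:.3f} mm, "
      f"total: {copper_mm + dielectric_mm:.3f} mm")
```

Under that assumption, the core dielectric $h_2$ accounts for most of the stack-up height, which is consistent with the snake inductor using the package core to obtain its volume.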

Second-order parameters are those physical characteristics that can affect other structures. The parameters in this category are shown in Fig. 4-3 and they include: top coil plane length ( $l_1$ ), bottom coil plane length ( $l_2$ ), uvia position offset ( $v_1 - v_{12}$ ), PTH position offset ( $p_1 - p_{12}$ ), ground PTH proximity (*G*), dummy metals placement offset (*D*), VccIN PTH proximity (*I*), coils alignment (*A*), coils proximity ( $P_1 - P_5$ ), and the total dielectric volume inside of each coil (*V*). All second-order parameters, except *V*, are stored in vector  $\mathbf{y} = [l_1 \ l_2 \ v_1 \ v_2 \ v_3 \ v_4 \ v_5 \ v_6 \ v_7 \ v_8 \ v_9 \ v_{10} \ v_{11} \ v_{12} \ p_1 \ p_2 \ p_3 \ p_4 \ p_5 \ p_6 \ p_7 \ p_8 \ p_9 \ p_{10} \ p_{11} \ p_{12} \ G \ D \ I \ A \ P_1 \ P_2 \ P_3 \ P_4 \ P_5]^T$ , whose values are listed in Table VIII.

The total dielectric volume inside of each coil (V), also known as coil volume, is dependent on both the first-order and second-order design parameters. It is seen from Fig. 4-2 and Fig. 4-3 that V is a function of the top and bottom coil plane widths, dielectric thickness between layers, top and bottom coil plane lengths, uvia positions, and PTH positions.



Fig. 4-3 Second-order design parameters: ground PTH proximity (*G*), dummy metals placement (*D*), VccIN PTH proximity (*I*), coils alignment (*A*), coils proximity (*P*), top and bottom coil plane length  $(l_1, l_2)$ , uvia position  $(v_1 - v_{12})$ , PTH position  $(p_1 - p_{12})$ , and dielectric volume (*V*).

| Name     | Value $(x, y)$ (mm) | Vector element | Name     | Value $(x, y)$ (mm) | Vector element |
|----------|---------------------|----------------|----------|---------------------|----------------|
| $l_1$    | (1.730, N/A)        | $y_1$          | $p_5$    | (0, 0)              | $y_{19}$       |
| $l_2$    | (1.730, N/A)        | $y_2$          | $p_6$    | (0, 0)              | $y_{20}$       |
| $v_1$    | (0, 0)              | $y_3$          | $p_7$    | (0, 0)              | $y_{21}$       |
| $v_2$    | (0, 0)              | $y_4$          | $p_8$    | (0, 0)              | $y_{22}$       |
| $v_3$    | (0, 0)              | $y_5$          | $p_9$    | (0, 0)              | $y_{23}$       |
| $v_4$    | (0, 0)              | $y_6$          | $p_{10}$ | (0, 0)              | $y_{24}$       |
| $v_5$    | (0, 0)              | $y_7$          | $p_{11}$ | (0, 0)              | $y_{25}$       |
| $v_6$    | (0, 0)              | $y_8$          | $p_{12}$ | (0, 0)              | $y_{26}$       |
| $v_7$    | (0, 0)              | $y_9$          | $G$      | (0.4725, 0.2425)    | $y_{27}$       |
| $v_8$    | (0, 0)              | $y_{10}$       | $D$      | (0, 0)              | $y_{28}$       |
| $v_9$    | (0, 0)              | $y_{11}$       | $I$      | (0.432, -0.164)     | $y_{29}$       |
| $v_{10}$ | (0, 0)              | $y_{12}$       | $A$      | (0.2205, N/A)       | $y_{30}$       |
| $v_{11}$ | (0, 0)              | $y_{13}$       | $P_1$    | (N/A, 0.3)          | $y_{31}$       |
| $v_{12}$ | (0, 0)              | $y_{14}$       | $P_2$    | (N/A, 0.536)        | $y_{32}$       |
| $p_1$    | (0, 0)              | $y_{15}$       | $P_3$    | (N/A, 0.3)          | $y_{33}$       |
| $p_2$    | (0, 0)              | $y_{16}$       | $P_4$    | (N/A, 0.536)        | $y_{34}$       |
| $p_3$    | (0, 0)              | $y_{17}$       | $P_5$    | (N/A, 0.3)          | $y_{35}$       |
| $p_4$    | (0, 0)              | $y_{18}$       |          |                     |                |

TABLE VIII SECOND-ORDER DESIGN PARAMETERS OF THE ACI

### 4.2.2 Output Responses or Electrical Characteristics of the ACI

As mentioned before, the ACI design process consists of finding the right inductor: the minimum inductance that meets output ripple targets, the maximum inductance that meets transient requirements, and the maximum resistance that meets reliability specifications while losing the least power ( $P_{LOSS}$ ).

The on-die VR power loss,  $P_{LOSS}$ , can be estimated using [Lambert-16]

$$P_{\rm LOSS} = I_{\rm out}^2 \left( R_{\rm DS,on} + R_{\rm dc} \right) + I_{\rm RMS,ac}^2 \left( R_{\rm DS,on} + R_{\rm ac} \right) + P_{\rm sw} \tag{4-1}$$

where three terms are considered: DC losses or conduction losses, AC losses, and switching losses  $P_{sw}$ . In particular,  $R_{dc}$  is the DC resistance of the inductor,  $R_{ac}$  is the AC resistance of the inductor at the switching frequency,  $R_{DS,on}$  represents the drain to source resistance of the power transistor when it is in the conduction region,  $I_{out}$  is the steady state or DC current, and  $I_{RMS,ac}$  represents the RMS current through the inductor excluding  $I_{out}$ .  $I_{RMS,ac}$  depends on the ripple current of the ACI. In general, the best inductor minimizes  $R_{dc}$  and  $R_{ac}$  in order to reduce the power loss; these two electrical characteristics are part of the main output responses of the ACI.
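To make the role of each term in (4-1) concrete, the expression can be evaluated numerically. The sketch below is only an illustration: the function name is hypothetical, and the operating point (currents, $R_{DS,on}$ and $P_{sw}$) consists of made-up round numbers, not data from this work. The two inductor resistances are the fine model averages reported in Section 4.3.2.

```python
def vr_power_loss(i_out, i_rms_ac, r_ds_on, r_dc, r_ac, p_sw):
    """Evaluate (4-1). Currents in A, resistances in ohm, powers in W."""
    dc_loss = i_out ** 2 * (r_ds_on + r_dc)       # conduction (DC) losses
    ac_loss = i_rms_ac ** 2 * (r_ds_on + r_ac)    # AC losses at the switching frequency
    return dc_loss + ac_loss + p_sw               # plus switching losses


# Hypothetical operating point, chosen only for illustration.
operating_point = dict(i_out=1.0, i_rms_ac=0.5, r_ds_on=10e-3, p_sw=50e-3)

# Inductor resistances: R_dc ~ R_LF = 13.76 mOhm and R_ac = 75.88 mOhm.
baseline = vr_power_loss(r_dc=13.76e-3, r_ac=75.88e-3, **operating_point)

# Same operating point with a hypothetical inductor having half the AC resistance.
improved = vr_power_loss(r_dc=13.76e-3, r_ac=0.5 * 75.88e-3, **operating_point)

print(f"baseline loss: {baseline * 1e3:.2f} mW")
print(f"with halved R_ac: {improved * 1e3:.2f} mW")
```

Because the inductor resistances enter (4-1) weighted by the squared currents, the relative benefit of reducing $R_{ac}$ or $R_{dc}$ depends on the ratio between the ripple current and the DC current of the converter.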

Another important electrical characteristic of an ACI is its inductance, which is classified into self-inductance ( $L_{self}$ ) and mutual-inductance ( $L_{mutual}$ ), also called coupled inductance. Both are computed at the VR switching frequency. Self-inductance is produced by each coil independently, without the interaction of the neighboring coils. It should be neither so large that transient noise violates the noise specification, nor so small that the ripple noise requirements of the buck converter are violated. Coupled inductance is obtained when two separate windings are interleaved, and it can be negative or positive. Negative coupling is preferable since it reduces the AC losses. It is obtained when the two windings are oriented opposite to each other, as observed in Fig. 4-1b. Coupling is quantified through the coupling factor (K), which is the ratio of mutual to self-inductance

$$K = \frac{L_{mutual}}{L_{self}} \tag{4-2}$$

One important factor for the AC losses in (4-1) is that the AC current in the inductor  $(I_{\text{RMS},ac})$  is inversely proportional to the quality factor (Q), which is used as a metric of the inductor AC performance and is defined as

$$Q = \frac{2\pi f L}{R_{ac}} \tag{4-3}$$

where *L* is the equivalent self-inductance of the ACI at the switching frequency of the VR and f is the switching frequency of the VR.

In summary, the most relevant output responses of the ACI are  $\mathbf{R} = [R_{dc} \ R_{ac} \ L_{self} \ L_{mutual} \ K \ Q]^{T}$ ; they can be used for defining the design specifications for a numerical optimization (including a space mapping formulation).

# 4.3. ACI Fine Model Description and Simulation Results

As mentioned before, a fine model is an accurate computational representation of the physical structure of the ACI, and it is obtained through high accuracy simulations. In our case, it results from solving the 3D structure by applying the finite element method (FEM) using the Cadence<sup>®21</sup> Sigrity<sup>TM</sup> PowerSI<sup>®22</sup> tool in 3DFEM full-wave extraction mode [Cadence-19] (PSI-3D).

An accurate ACI model is critical during the design phase of an ACI because it reduces the fabrication cost by allowing the designer to produce the target product in a single fabrication iteration, without spending on many prototypes. An accurate ACI model also shortens the development time and helps guarantee the correct functionality of the device by allowing the designer to run different simulations to validate the design performance.

### 4.3.1 3D Solver Parameters for Simulating the Fine Model

Since the skin depth at the VR switching frequency ( $f_{SW} \sim 150$  MHz) is comparable to the copper thickness, errors could be introduced when calculating the losses during the simulation. Hence, in PSI-3D the parameter "metal type" is set to the option Metal\_Inside, which solves the fields inside the copper in order to accurately capture the losses in the solenoid turn.
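The order of magnitude of the skin depth can be checked with the standard good-conductor expression $\delta = \sqrt{\rho / (\pi f \mu_0 \mu_r)}$. The sketch below is a rough estimate only: it assumes a textbook bulk copper resistivity of $1.68 \times 10^{-8}\ \Omega\cdot$m and $\mu_r = 1$, which are not values taken from this work or from the package stack-up, and the function name is illustrative.

```python
import math

MU_0 = 4e-7 * math.pi      # vacuum permeability in H/m
RHO_COPPER = 1.68e-8       # assumed bulk copper resistivity in ohm*m (textbook value)


def skin_depth(frequency_hz, resistivity=RHO_COPPER, mu_r=1.0):
    """Skin depth in metres for a good conductor at the given frequency."""
    return math.sqrt(resistivity / (math.pi * frequency_hz * MU_0 * mu_r))


THINNEST_COPPER_UM = 15.0  # thinnest copper layer in Table VII (0.015 mm)

for f_mhz in (150, 158):
    delta_um = skin_depth(f_mhz * 1e6) * 1e6
    print(f"{f_mhz} MHz: skin depth ~ {delta_um:.1f} um "
          f"({THINNEST_COPPER_UM / delta_um:.1f} skin depths in a 15 um layer)")
```

Under these assumptions the skin depth is a few micrometres, that is, the same order of magnitude as the 15 um and 35 um copper layers of Table VII, so the current is neither uniform across the conductor nor confined to a negligible surface layer. This is the regime in which solving the fields inside the metal matters.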

The substrate ACI fine model in this work uses the first order elements option in PSI-3D to improve accuracy. With first order elements, the electric field (E-field) is represented with high order polynomials inside and along the edges of the elements, while with the zero order element option the E-field is constant along the edges and linear inside the elements [Cadence-18]. Furthermore, first order elements lead to a reasonable mesh size for accurately solving the fields inside the metals.

Our fine model also includes conductive objects close to the ACI (~1 mm) in order to calculate losses due to induction. For this reason, the fine model includes metals surrounding the ACI structure, such as dummy metals and other power rails, like the VR input (VccIN) plated through hole (PTH), as shown in Fig. 4-3.

### 4.3.2 Fine Model Simulation Results

<sup>21</sup> Cadence Design Systems Inc., San Jose, CA 95134.

<sup>22</sup> PowerSI<sup>®</sup>, ver. 2018, Cadence<sup>®</sup> Sigrity<sup>TM</sup>.

Results are obtained in Matlab by reading a Touchstone file from PSI-3D, based on the technique used in [Leal-Romo-17a]. The resultant inductance matrix for the ACI fine model is in Table IX (at the VR switching frequency  $f_{sw} = 158$  MHz). The diagonal terms show the self-inductance of each of the 6 coils, while the coupling or mutual-inductance between coils is in the off-diagonal terms. The average self-inductance  $L_{self}$  is

$$L_{\text{self}} = \frac{\sum_{j=1}^{n} (L_{jj})}{n} \quad \text{with } n = 6 \tag{4-4}$$

yielding  $L_{self} = 1.039$  nH. The mutual-inductance  $L_{mutual}$  is taken as the average, over all coils, of the strongest (most negative) coupling seen by each coil,

$$L_{\text{mutual}} = \frac{\sum_{i=1}^{m} \left[ \min_{1 \le j \le n,\ j \neq i} (L_{ij}) \right]}{m} \quad \text{with } n = m = 6 \tag{4-5}$$

yielding  $L_{\text{mutual}} = -0.290 \text{ nH}.$ 

Table X contains the resultant AC resistance matrix for the ACI fine model at  $f_{sw}$ . In this matrix, the AC resistance ( $R_{ac}$ ) or input resistance of each coil is shown in the diagonal. Off-diagonal data is not captured since it is not required for our purposes. The average AC resistance  $R_{ac}$  is

$$R_{\rm ac} = \frac{\sum_{j=1}^{n} R_{jj}}{n} \tag{4-6}$$

yielding  $R_{\rm ac} = 75.88 \text{ m}\Omega$ .

In this thesis,  $R_{dc}$  is not directly simulated, to avoid an additional simulation domain; instead, a low frequency AC resistance ( $R_{LF}$ ) is used, as shown in Table XI. Applying (4-6) to the matrix in Table XI we obtain  $R_{dc} \approx R_{LF} = 13.76 \text{ m}\Omega$ .

Applying (4-2) we obtain K = -0.279, and using (4-3) then Q = 13.65. A summary of all the output responses from this fine model simulation is presented in Table XII. In addition, the frequency response of the self-inductance of coil 1 is represented in Fig. 4-4, showing an inductance value of  $L_{11} = 1.023$  nH at the desired frequency  $f_{sw} = 158$  MHz. As mentioned above, the average self-inductance for all 6 coils of the VR is  $L_{self} = 1.039$  nH at  $f_{sw} = 158$  MHz.

TABLE IX FINE MODEL SIX-PHASE ACI INDUCTANCE MATRIX AT 158 MHZ

| Inductance matrix elements at $f_{sw}$ (nH) |        |        |        |        |        |
|---------------------------------------------|--------|--------|--------|--------|--------|
| 1.023                                       | 0.088  | 0.016  | -0.023 | -0.163 | -0.281 |
| 0.088                                       | 1.029  | 0.104  | -0.177 | -0.279 | -0.039 |
| 0.016                                       | 0.104  | 1.081  | -0.311 | -0.046 | -0.008 |
| -0.023                                      | -0.177 | -0.311 | 1.049  | 0.096  | 0.014  |
| -0.163                                      | -0.279 | -0.046 | 0.096  | 1.022  | 0.092  |
| -0.281                                      | -0.039 | -0.008 | 0.014  | 0.092  | 1.033  |
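The averages defined in (4-4) to (4-6), together with (4-2) and (4-3), can be re-evaluated from the tabulated matrices. The sketch below is an illustrative Python check, not the Matlab script used in this work; it takes the rounded entries of Tables IX, X and XI as input, and all variable names are hypothetical.

```python
import math

F_SW = 158e6  # VR switching frequency in Hz

# Table IX: inductance matrix at f_sw, in nH.
L_NH = [
    [1.023, 0.088, 0.016, -0.023, -0.163, -0.281],
    [0.088, 1.029, 0.104, -0.177, -0.279, -0.039],
    [0.016, 0.104, 1.081, -0.311, -0.046, -0.008],
    [-0.023, -0.177, -0.311, 1.049, 0.096, 0.014],
    [-0.163, -0.279, -0.046, 0.096, 1.022, 0.092],
    [-0.281, -0.039, -0.008, 0.014, 0.092, 1.033],
]
# Diagonals of Table X (at f_sw) and Table XI (at 1 kHz), in mOhm.
R_AC_MOHM = [75.012, 74.796, 81.941, 75.268, 73.006, 75.307]
R_LF_MOHM = [13.487, 13.493, 14.643, 13.518, 13.420, 14.010]


def mean(values):
    return sum(values) / len(values)


n = len(L_NH)
l_self = mean([L_NH[j][j] for j in range(n)])                      # (4-4)
l_mutual = mean([min(L_NH[i][j] for j in range(n) if j != i)       # (4-5)
                 for i in range(n)])
r_ac = mean(R_AC_MOHM)                                             # (4-6)
r_lf = mean(R_LF_MOHM)                                             # (4-6) on Table XI
k = l_mutual / l_self                                              # (4-2)
q = 2 * math.pi * F_SW * (l_self * 1e-9) / (r_ac * 1e-3)           # (4-3)

print(f"L_self = {l_self:.4f} nH, L_mutual = {l_mutual:.4f} nH, K = {k:.4f}")
print(f"R_ac = {r_ac:.3f} mOhm, R_LF = {r_lf:.3f} mOhm, Q = {q:.2f}")
```

With the rounded table entries, the inductances, resistances and coupling factor agree with Table XII to within the last printed digit. The quality factor evaluates to about 13.6, slightly below the reported 13.65. One possible explanation, stated here only as an assumption, is that the reported value was computed from unrounded simulation data.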

### 4.3.3 Fine Model Simulation Resources

TABLE X FINE MODEL SIX-PHASE ACI RESISTANCE MATRIX AT 158 MHZ

| Resistance matrix elements at $f_{sw}$ (m $\Omega$ ) |        |        |        |        |        |
|------------------------------------------------------|--------|--------|--------|--------|--------|
| 75.012                                               |        |        |        |        |        |
|                                                      | 74.796 |        |        |        |        |
|                                                      |        | 81.941 |        |        |        |
|                                                      |        |        | 75.268 |        |        |
|                                                      |        |        |        | 73.006 |        |
|                                                      |        |        |        |        | 75.307 |

TABLE XI FINE MODEL SIX-PHASE ACI RESISTANCE MATRIX AT 1 KHZ

| Resistance matrix elements at LF (m $\Omega$ ) |        |        |        |        |        |
|------------------------------------------------|--------|--------|--------|--------|--------|
| 13.487                                         |        |        |        |        |        |
|                                                | 13.493 |        |        |        |        |
|                                                |        | 14.643 |        |        |        |
|                                                |        |        | 13.518 |        |        |
|                                                |        |        |        | 13.420 |        |
|                                                |        |        |        |        | 14.010 |

TABLE XII OUTPUT PARAMETER VALUES FROM THE SIMULATION OF THE FINE MODEL

| Modeling type                | $R_{\mathrm{LF}}$ (m $\Omega$ ) | $R_{\mathrm{ac}}$ (m $\Omega$ ) | $L_{\text{self}}$ (nH) | $L_{\text{mutual}}$ (nH) | K      | Q     |
|------------------------------|---------------------------------|---------------------------------|------------------------|--------------------------|--------|-------|
| Fine model<br>(using PSI-3D) | 13.76                           | 75.88                           | 1.039                  | -0.290                   | -0.279 | 13.65 |

Fig. 4-4 Self-inductance of coil 1,  $L_{11}$ , in ACI fine model using PSI-3D across multiple frequencies. It is seen that  $L_{11} = 1.023$  nH at  $f_{sw} = 158$  MHz.

The total elapsed time for the fine model simulation of this structure is 5 hours 47 minutes 32 seconds, and the peak memory usage is 103.980 GB. This simulation time was obtained on a server computer system with a total of 1 TB of installed RAM and 4 Intel<sup>™</sup> Xeon<sup>®</sup> processors with a total of 32 cores and a maximum frequency of 2.6 GHz.

It is seen that an iterative process for optimizing the ACI using only the fine model is prohibitive due to the large computing resources and time implied. Hence, it is important to find a suitable coarse model for a future SM-based optimization.

### 4.3.4 Conclusions

Substrate package ACIs play a key role in the efficient delivery of power to microprocessors or chipsets. An ACI acts as the output inductor of a high frequency DC-DC buck converter that can provide a clean voltage and a large amount of current to the load. The electrical characteristics of an ACI are critical to achieve the VR requirements, such as voltage ripple, transient response, current handling capability, and power efficiency. We can validate whether all these requirements are met by running highly accurate simulations, without the need to fabricate the physical inductor. In this chapter, we described the ACI physical structure and its simulation using a highly accurate model implemented in Cadence<sup>®23</sup> Sigrity<sup>TM</sup> PowerSI<sup>®24</sup> (PSI-3D).

<sup>23</sup> Cadence Design Systems Inc., San Jose, CA 95134.

<sup>24</sup> PowerSI<sup>®</sup>, ver. 2018, Cadence<sup>®</sup> Sigrity<sup>™</sup>.

# 5. Developing a Coarse Model of a Package Substrate Air Core Inductor for Space Mapping Applications

Performance scaling in computing systems comes with an increase in power consumption. For this reason, new mechanisms to efficiently deliver power to microprocessors and chipsets are evolving [Burton-14], including placing the power supply closer to the load and using embedded passive elements, such as substrate package air core inductors (ACI). As explained in [Mercado-Casillas-19a], the current package substrate inductor design methodology involves multiple simulation iterations of a high-fidelity or fine model of the substrate inductor using 3D full-wave electromagnetic (EM) simulation tools, making the design process long and computationally expensive.

There are several techniques to reduce the computational cost and accelerate an EM-based design optimization process. Among the most powerful is the space mapping (SM) optimization methodology [Bandler-04], [Koziel-08], [Rayas-Sánchez-16]. This technique requires a surrogate model, also known as a coarse model, which has two main properties: it should represent the general behavior of the original fine model, and it should be computationally cheap. These properties affect: a) the quality of the final design; b) the efficiency of the SM method in finding the right design parameters; c) the computational cost involved; and d) the time needed to find the optimal solution. Therefore, coarse model selection should be one of the main areas of focus when using the space mapping optimization methodology.

In this chapter, a coarse model is developed for a substrate package inductor used in a power delivery network (PDN). The model is required to correlate sufficiently with the original 3D full-wave EM (PSI-3D) fine model while keeping a low computational cost. This coarse model will be used in future work based on a space mapping formulation. The goal of that future work is a methodology that reduces the design time of a substrate package air core inductor while achieving an accuracy level similar to that obtained using purely fine models in commercial 3D full-wave EM simulation tools.

# 5.1. Coarse Model Description

A typical ACI design process is described in [Mercado-Casillas-19a]. It typically relies on an accurate simulation method based on a high-fidelity or fine model of the ACI. The process requires multiple iterations before the ACI meets all the requirements, and each fine-model iteration requires a long simulation time and large computing resources. Designing only with fine or high-fidelity ACI models is therefore impractical, and using coarse ACI models within a space mapping (SM) approach is more convenient during the design process.

A coarse model is a simplification of a high-fidelity model with a lower computational cost. The coarse model represents the behavior of the fine model and is one of the main ingredients in a space mapping optimization process, because it affects the efficiency of the method in finding the right design, as described in [Koziel-07].

As mentioned before, a coarse model has two main properties. First, it should represent the behavior of the original fine model with a good level of accuracy. In this application, the output responses of interest include the equivalent inductances ( $L_{self}$  and  $L_{mutual}$ ), resistances ( $R_{LF}$  and  $R_{ac}$ ), coupling factor (K), and quality factor (Q); they are collected in the vector  $\mathbf{R} = [R_{LF} R_{ac} L_{self} L_{mutual} K Q]^{T}$ .  $\mathbf{R}_{f}$  denotes the output responses of the fine model and  $\mathbf{R}_{c}$  those of the coarse model. The fine model input design parameters are represented by  $\mathbf{x}_{f}$  and those of the coarse model by  $\mathbf{x}_{c}$ . A more detailed description of the input design parameters and the output responses is given in [Mercado-Casillas-19a]. Second, the coarse model should be computationally cheap, consuming fewer computing resources and requiring a shorter simulation time than its corresponding fine model. The simulation time of a coarse model should be at least one order of magnitude shorter than that of the original fine model. In this chapter, several options for representing a coarse ACI model are explored in order to select the best candidate. The best coarse model will then be used in a space mapping optimization process in future work.

All coarse models are simulated on the same computer system for a fair comparison. The computer system used for the coarse model simulations has a total of 768 GB of installed DDR3 RAM running at 1,333 MHz and 2 Intel<sup>TM</sup> Xeon® processors with a total of 24 cores and a maximum frequency of 2.89 GHz.

# 5.1.1 Coarse Model 1: 3D Zero Order with Coarse Initial Mesh (Using PSI-3D)

Coarse model 1 consists of reducing the complexity of the original ACI fine model using the same solver (PSI-3D). This is done by simplifying the ACI structure and by relaxing some of the simulation parameters.

The ACI structure is simplified by removing some copper shapes: dummy metals and structures from adjacent ACIs are removed, and two power rail elements are eliminated, keeping only the input voltage rail (VccIN), the actual ACI elements, and the ground planes (Vss). Together, these simplifications reduce the whole structure area by 40%.

The simulation is also simplified by relaxing some solver options. The parameter *Basis Function Order* is set to zero order element, which means that the electric field is constant along the edges of the elements and linear inside the elements. The parameter *Target Delta S* in the Adaptive Solution menu is changed from 0.01 to 0.05; this drives the adaptive meshing iterations to continue until the S-parameter difference is less than 0.05, producing a faster solution with reasonable accuracy. The parameter *Number of Points for Via, Wire, Ball, Bump* in the Geometry Options menu is changed from 6 to 4. This parameter sets the accuracy level for modeling cylinders: its value is used to split the cylinder objects, and a larger number produces more accurate models. The PSI-3D user manual [Cadence-18] indicates that 4 is accurate enough for most applications and that 8 is recommended for applications that require accurate modeling of cylinders, such as a via array or, in our case, a fine model ACI. Additionally, *Meshing Algorithm* in the Geometry Options menu is changed from default mesh to coarser initial mesh. This meshing algorithm is used for generating 3D tetrahedral elements. The coarser initial mesh option reduces the total mesh size and, consequently, the simulation time.

The output responses of the 3D zero order model with this coarser initial mesh are calculated using matrices in Tables XIII-XV, yielding  $L_{self} = 0.964$  nH,  $L_{mutual} = -0.286$  nH, K = -0.296,  $R_{LF} = 11.43$  m $\Omega$ ,  $R_{ac} = 31.83$  m $\Omega$  and Q = 30.21.

Simulating this ACI coarse model takes 18 min. 30 sec. with a peak RAM usage of 4.472 GB.

The modeling process of this coarse model is of the same nature as that of the fine model. Consequently, both models have very similar responses, a characteristic that could help reduce the complexity of a space mapping optimization.

# TABLE XIII COARSE MODEL 1: 3D ZERO ORDER WITH COARSER INITIAL MESH SIX-PHASE ACI INDUCTANCE MATRIX AT 158 MHZ

| Inductance matrix elements at $f_{sw}$ (nH) |        |        |        |        |        |
|---------------------------------------------|--------|--------|--------|--------|--------|
| 0.934                                       | 0.087  | 0.016  | -0.023 | -0.160 | -0.264 |
| 0.087                                       | 0.957  | 0.107  | -0.179 | -0.275 | -0.038 |
| 0.016                                       | 0.107  | 1.040  | -0.319 | -0.047 | -0.007 |
| -0.023                                      | -0.179 | -0.319 | 0.978  | 0.096  | 0.014  |
| -0.160                                      | -0.275 | -0.047 | 0.096  | 0.946  | 0.087  |
| -0.264                                      | -0.038 | -0.007 | 0.014  | 0.087  | 0.932  |

### TABLE XIV COARSE MODEL 1: 3D ZERO ORDER WITH COARSER INITIAL MESH SIX-PHASE ACI RESISTANCE MATRIX AT 158 MHZ

| Resistance matrix elements at $f_{sw}$ (m $\Omega$ ) |        |        |        |        |        |
|------------------------------------------------------|--------|--------|--------|--------|--------|
| 31.899                                               |        |        |        |        |        |
|                                                      | 32.090 |        |        |        |        |
|                                                      |        | 35.372 |        |        |        |
|                                                      |        |        | 30.186 |        |        |
|                                                      |        |        |        | 31.183 |        |
|                                                      |        |        |        |        | 30.253 |

### TABLE XV COARSE MODEL 1: 3D ZERO ORDER WITH COARSER INITIAL MESH SIX-PHASE ACI RESISTANCE MATRIX AT 1 KHZ

| Resistance matrix elements at LF (m $\Omega$ ) |        |        |        |        |        |
|------------------------------------------------|--------|--------|--------|--------|--------|
| 11.190                                         |        |        |        |        |        |
|                                                | 11.181 |        |        |        |        |
|                                                |        | 12.075 |        |        |        |
|                                                |        |        | 11.199 |        |        |
|                                                |        |        |        | 11.220 |        |
|                                                |        |        |        |        | 11.741 |
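The scalar responses quoted for coarse model 1 can be approximately recovered from the matrices in Tables XIII-XV. The sketch below is only illustrative and rests on assumptions that are not stated in the text: $L_{self}$, $R_{ac}$ and $R_{LF}$ are taken as the mean of the diagonal entries, $L_{mutual}$ as the mean over the coupled pairs (ports 1 and 6, 2 and 5, 3 and 4), $K = L_{mutual}/L_{self}$, and $Q = 2\pi f_{sw} L_{self}/R_{ac}$. The function and variable names are hypothetical.

```python
import math

F_SW_HZ = 158e6  # switching frequency stated in the text

# Table XIII: inductance matrix at f_sw, in nH
L_NH = [
    [0.934, 0.087, 0.016, -0.023, -0.160, -0.264],
    [0.087, 0.957, 0.107, -0.179, -0.275, -0.038],
    [0.016, 0.107, 1.040, -0.319, -0.047, -0.007],
    [-0.023, -0.179, -0.319, 0.978, 0.096, 0.014],
    [-0.160, -0.275, -0.047, 0.096, 0.946, 0.087],
    [-0.264, -0.038, -0.007, 0.014, 0.087, 0.932],
]
# Tables XIV and XV: diagonal resistance entries, in mOhm
R_AC_MOHM = [31.899, 32.090, 35.372, 30.186, 31.183, 30.253]
R_LF_MOHM = [11.190, 11.181, 12.075, 11.199, 11.220, 11.741]
# Coupled solenoid pairs (ports 1-6, 2-5, 3-4), zero-based indices
COIL_PAIRS = [(0, 5), (1, 4), (2, 3)]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def summarize_responses(l_nh, r_ac_mohm, r_lf_mohm, pairs, f_hz=F_SW_HZ):
    """Collapse the extracted matrices into scalar responses.

    The averaging rules and the K and Q formulas are assumptions made
    for this sketch, not definitions taken from the thesis.
    """
    l_self = mean(l_nh[i][i] for i in range(len(l_nh)))
    l_mutual = mean(l_nh[i][j] for i, j in pairs)
    r_ac = mean(r_ac_mohm)
    r_lf = mean(r_lf_mohm)
    k = l_mutual / l_self
    q = 2 * math.pi * f_hz * (l_self * 1e-9) / (r_ac * 1e-3)
    return {"R_LF": r_lf, "R_ac": r_ac, "L_self": l_self,
            "L_mutual": l_mutual, "K": k, "Q": q}


responses = summarize_responses(L_NH, R_AC_MOHM, R_LF_MOHM, COIL_PAIRS)
for name, value in responses.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the inductances, resistances, and coupling factor agree with the reported values to within rounding, while Q comes out within about 1% of the reported 30.21, which suggests that the thesis computes Q in a slightly different way.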

# 5.1.2 Coarse Model 2: 3D Zero Order with Perfect Conductor (Using PSI-3D)

In PSI-3D the simulation complexity can be further reduced by selecting a different *Metal Type* parameter in the Solver Options menu. The most memory and time efficient metal type is *Metal\_PEC* which models metals as perfect conductors. The solution using this option neglects metal losses, which are required for low and high frequency resistance  $R_{LF}$  and  $R_{ac}$  calculation. In our ACI application we need low frequency resistance in order to calculate conduction losses and high frequency resistance to calculate quality factor and efficiency at switching frequency ( $f_{sw}$ ). Therefore, for our application *Metal\_PEC* is not a good option for creating the coarse model. However, *Metal\_PEC* option is very efficient if the only interest is the inductive effect, such as self-inductance ( $L_{self}$ ) and mutual-inductance ( $L_{mutual}$ ).

This coarse model includes all the simulation engine simplifications listed in Subsection 5.1.1. Additionally, the parameter *Outer Box Boundary Conditions* uses the PEC option, which means that the outer box surfaces are treated as perfect conductors.

The output responses of the 3D zero order model with perfect conductor are calculated using matrix in Table XVI, yielding  $L_{self} = 0.850$  nH,  $L_{mutual} = -0.278$  nH, K = -0.327.

Simulating this coarse model takes 4 min. 54 sec. with a peak RAM usage of 1.823 GB.

| Inductance matrix elements at $f_{sw}$ (nH) |        |        |        |        |        |  |  |  |
|---------------------------------------------|--------|--------|--------|--------|--------|--|--|--|
| 0.839                                       | -0.023 | 0.015  | -0.023 | -0.164 | -0.264 |  |  |  |
| -0.023                                      | 0.857  | 0.1    | -0.17  | -0.268 | -0.038 |  |  |  |
| 0.015                                       | 0.1    | 0.9    | -0.303 | -0.045 | -0.007 |  |  |  |
| -0.023                                      | -0.17  | -0.303 | 0.866  | 0.093  | 0.014  |  |  |  |
| -0.164                                      | -0.268 | -0.045 | 0.093  | 0.837  | 0.088  |  |  |  |
| -0.264                                      | -0.038 | -0.007 | 0.014  | 0.088  | 0.804  |  |  |  |

TABLE XVI COARSE MODEL 2: 3D ZERO ORDER WITH PERFECT CONDUCTOR SIX-PHASE ACI INDUCTANCE MATRIX AT 158 MHZ

### 5.1.3 Coarse Model 3: 3D Zero Order Light Model (Using PSI-3D)

The original ACI target structure includes six very similar solenoids arranged in pairs, where each pair captures most of the coupling effects exhibited by the complete ACI structure. Since the solenoids are very similar, the self-inductance ( $L_{self}$ ) and resistances ( $R_{LF}$  and  $R_{ac}$ ) can be extracted from a single coil.  $L_{self}$  for the different solenoids is shown on the diagonal of the matrix in Table XIII and, similarly,  $R_{ac}$  and  $R_{LF}$  are shown in Tables XIV and XV, respectively. Moreover, the coupling factor is mainly driven by the interaction within each pair of solenoids, as observed in the off-diagonal elements of Table XIII (ports 1 and 6, 2 and 5, 3 and 4). Therefore, a good representation of the main output responses of the ACI is obtained by modeling only one solenoid pair, which is exploited in coarse model 3.
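The pairing argument above can be checked directly on the Table XIII data. The following sketch, with hypothetical helper names, finds for each port the port with the largest-magnitude mutual inductance and measures how much the self-inductances differ across coils; it illustrates the reasoning and is not part of the original flow.

```python
# Table XIII: coarse model 1 inductance matrix at f_sw, in nH
L_NH = [
    [0.934, 0.087, 0.016, -0.023, -0.160, -0.264],
    [0.087, 0.957, 0.107, -0.179, -0.275, -0.038],
    [0.016, 0.107, 1.040, -0.319, -0.047, -0.007],
    [-0.023, -0.179, -0.319, 0.978, 0.096, 0.014],
    [-0.160, -0.275, -0.047, 0.096, 0.946, 0.087],
    [-0.264, -0.038, -0.007, 0.014, 0.087, 0.932],
]


def dominant_partner(l_nh):
    """Map each port (1-based) to the port it couples to most strongly."""
    partners = {}
    for i, row in enumerate(l_nh):
        off_diagonal = [(abs(v), j) for j, v in enumerate(row) if j != i]
        partners[i + 1] = max(off_diagonal)[1] + 1
    return partners


def self_inductance_spread(l_nh):
    """Relative spread (max - min) / mean of the diagonal entries."""
    diagonal = [l_nh[i][i] for i in range(len(l_nh))]
    average = sum(diagonal) / len(diagonal)
    return (max(diagonal) - min(diagonal)) / average


print(dominant_partner(L_NH))
print(f"L_self spread: {self_inductance_spread(L_NH):.1%}")
```

For this matrix the strongest coupling of every port is with its pair partner, which is consistent with modeling a single solenoid pair in coarse model 3.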

In addition to reducing the structure size, some additional simulation parameters in the modeler engine are relaxed. The parameter *Polygon Simplification Threshold* is changed from a very fine resolution (0.004098 mm) to a coarser resolution (0.116 mm). This parameter smooths the metal conductor outlines, which cleans up bad geometries and improves numerical stability. The guidance from Cadence<sup>®25</sup> is to use 1/3 of the trace width. In this application, the ACI plane width (0.3441 mm) was used as the reference for calculating the threshold. This produces simplified polygons and, consequently, shapes that are easier for the solver to compute.
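As a quick check of the one-third guideline, taking the ACI plane width as the reference width (the symbol $w$ is introduced here only for illustration) gives

$$
\frac{w}{3} = \frac{0.3441\ \mathrm{mm}}{3} \approx 0.115\ \mathrm{mm},
$$

which is close to the 0.116 mm threshold used in coarse model 3.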

The Auto\_Fitting option in the Metal Type parameter is selected for a better system memory usage and simulation efficiency. It uses Metal\_Inside in the low-frequency range and Metal\_Skin\_Impedance in the high-frequency range. Metal\_Inside option captures low-frequency losses very accurately but this option is the least memory efficient. On the other hand, Metal\_Skin\_Impedance cannot capture resistance losses at low frequencies, but it is very accurate capturing high-frequency resistive losses with better memory and timing efficiency.

For visual comparison, the coils in ACI fine model structure are shown in Fig. 5-1a while the ACI coarse model 3 with 3D Zero Order Light Model option is in Fig. 5-1b.

<sup>25</sup> Cadence Design Systems Inc., San Jose, CA 95134.



Fig. 5-1 ACI models: a) fine model with 6 inductors; b) coarse model 3 using 3D Zero Order Light Model with only 2 inductors.

Coarse model 3 includes most of the simulation engine simplifications listed in Subsection 5.1.2. There is only one exception: the parameter *Target Delta S* in the Adaptive Solution menu is set to 0.5 instead of 0.05, which further reduces the number of adaptive meshing iterations. This change reduced the simulation time by 20% while degrading the output response accuracy by 8%.

The output responses of the 3D Zero Order Light Model option are calculated using matrix in Tables XVII-XIX, yielding  $L_{self} = 0.918$  nH,  $L_{mutual} = -0.274$  nH, K = -0.298,  $R_{LF} = 9.10 \text{ m}\Omega$ ,  $R_{ac} = 51.52 \text{ m}\Omega$  and Q = 17.75.

Simulating this coarse model takes 2 min. 32 sec. with a peak RAM usage of 967 MB.

# TABLE XVII COARSE MODEL 3: 3D ZERO ORDER LIGHT MODEL TWO-PHASE ACI INDUCTANCE MATRIX AT 158 MHZ

| Inductance matrix elements at $f_{sw}$ (nH) |        |  |
|---------------------------------------------|--------|--|
| 0.937                                       | -0.274 |  |
| -0.274                                      | 0.9    |  |

### TABLE XVIII COARSE MODEL 3: 3D ZERO ORDER LIGHT MODEL TWO-PHASE ACI RESISTANCE MATRIX AT 158 MHZ

| Resistance matrix elements at $f_{sw}$ (m $\Omega$ ) |       |
|------------------------------------------------------|-------|
| 52.77                                                |       |
|                                                      | 50.27 |

# TABLE XIX COARSE MODEL 3: 3D ZERO ORDER LIGHT MODEL TWO-PHASE ACI RESISTANCE MATRIX AT 1 KHZ

| Resistance matrix elements at LF (m $\Omega$ ) |       |
|------------------------------------------------|-------|
| 9.434                                          |       |
|                                                | 8.772 |

# 5.1.4 Coarse Model 4: 3D Zero Order with Zero Metal Thickness Threshold (Using PSI-3D)

In Solver Options menu the parameter *Zero Metal Thickness Threshold* can be changed from 0 to 0.1 mm, reducing the model size significantly. This option allows the simulation engine to model as zero thickness sheet any metal layer whose thickness is smaller than this threshold. Coarse model 4 makes use of this option.

If the metal layer thickness is not one of the input design parameters, coarse model 4 can be used to accelerate the simulation process. However, in our application we want to optimize the metal layer thickness to control  $R_{ac}$ , so it may not be a good option. For instance, the metal layer thickness in our ACI is 15 µm, which is smaller than the 0.1 mm assigned to the *Zero Metal Thickness Threshold* parameter, so all planes are modeled as zero-thickness sheets. Therefore,  $R_{ac}$  would be a function only of the vertical vias and not of the horizontal metal planes.

This coarse model option includes all the simulation engine simplifications listed in Subsection 5.1.3. The output responses of the 3D Zero Order Light Model with the Zero Metal Thickness Threshold option are obtained from Tables XX-XXII, yielding  $L_{self} = 0.89$  nH,  $L_{mutual} = -0.256$  nH, K = -0.287,  $R_{LF} = 9.91$  m $\Omega$ ,  $R_{ac} = 36.12$  m $\Omega$  and Q = 24.53.

### TABLE XX COARSE MODEL 4: 3D ZERO ORDER WITH ZERO METAL THICKNESS THRESHOLD TWO-PHASE ACI INDUCTANCE MATRIX AT 158 MHZ

| Inductance matrix elements at $f_{sw}$ (nH) |        |
|---------------------------------------------|--------|
| 0.925                                       | -0.256 |
| -0.256                                      | 0.855  |

TABLE XXI COARSE MODEL 4: 3D ZERO ORDER WITH ZERO METAL THICKNESS THRESHOLD TWO-PHASE ACI RESISTANCE MATRIX AT 158 MHZ

| Resistance matrix elements at $f_{sw}$ (m $\Omega$ ) |        |
|------------------------------------------------------|--------|
| 36.09                                                |        |
|                                                      | 36.148 |

TABLE XXII COARSE MODEL 4: 3D ZERO ORDER WITH ZERO METAL THICKNESS THRESHOLD TWO-PHASE ACI RESISTANCE MATRIX AT 1 KHZ

| Resistance matrix elements at LF (m $\Omega$ ) |       |
|------------------------------------------------|-------|
| 10.168                                         |       |
|                                                | 9.657 |

Simulating this coarse model takes 1 min. 19 sec. with a peak RAM usage of 428 MB.

Coarse model 4 turned out to be the most efficient scheme using PSI-3D. However, it is recommended only when the metal layer thickness or copper weight is fixed and is not part of the input design parameters of the ACI.

#### 5.1.5 Coarse Model 5: 2.5D Model (Using PowerSI)

PowerSI can be used in parameter extraction mode to translate the electromagnetic (EM) field behavior of the simulated structure into an equivalent circuit represented by S, Z, or Y parameters. Extraction mode has several options, including 3D-EM Full-Wave Extraction Mode and simply Extraction Mode. In this work, we refer to 3D-EM Full-Wave Extraction Mode as PSI-3D and to Extraction Mode as PowerSI [Cadence-15]. Extraction mode has a hybrid simulation engine that combines EM field, transmission line, and circuit solvers. The solver used by the tool depends on the geometry of the structure section to be analyzed and on the simulation frequencies. The EM full-wave solver is used when sections of the PDN structure are comparable to the smallest simulated wavelength, the transmission line solver is used for the PDN sections that exhibit such behavior, and the lumped circuit solver is used for sections that are much smaller than the smallest simulated wavelength. The main difference with respect to PSI-3D is that PowerSI solves vias or vertical structures independently from planes or horizontal structures. For this reason, PowerSI used in Extraction Mode produces 2.5D models. PSI-3D, on the other hand, generates a 3D mesh that solves all vias, planes, and their EM interaction together. As a result, a PDN model in PSI-3D is more accurate and significantly more computationally expensive than one in PowerSI. Normally, PSI-3D is used for accurate analysis of complex but relatively small 3D structures, while PowerSI is used for larger PDN structures. In this case, PowerSI is used as an option for extracting the coarse model of the ACI, considerably reducing the simulation time while keeping an acceptable correlation with respect to the fine model, which uses PSI-3D.

The output responses of the 2.5D Model with simplified grid option are obtained from Tables XXIII-XXV, yielding  $L_{self} = 1.084$  nH,  $L_{mutual} = -0.260$  nH, K = -0.240,  $R_{LF} = 19.64$  m $\Omega$ ,  $R_{ac} = 83.6$  m $\Omega$  and Q = 12.92.

| Inductance matrix elements at $f_{sw}$ (nH) |        |        |        |        |        |
|---------------------------------------------|--------|--------|--------|--------|--------|
| 1.071                                       | 0.082  | 0.014  | -0.021 | -0.145 | -0.24  |
| 0.082                                       | 1.081  | 0.094  | -0.162 | -0.256 | -0.034 |
| 0.014                                       | 0.094  | 1.116  | -0.285 | -0.041 | -0.006 |
| -0.021                                      | -0.162 | -0.285 | 1.109  | 0.089  | 0.012  |
| -0.145                                      | -0.256 | -0.041 | 0.089  | 1.078  | 0.079  |
| -0.24                                       | -0.034 | -0.006 | 0.012  | 0.079  | 1.052  |

TABLE XXIII COARSE MODEL 5: 2.5D MODEL SIX-PHASE ACI INDUCTANCE MATRIX AT 158 MHZ

# TABLE XXIV COARSE MODEL 5: 2.5D MODEL SIX-PHASE ACI RESISTANCE MATRIX AT 158 MHZ

| Resistance matrix elements at $f_{sw}$ (m $\Omega$ ) |        |        |        |        |        |
|------------------------------------------------------|--------|--------|--------|--------|--------|
| 83.31                                                |        |        |        |        |        |
|                                                      | 82.297 |        |        |        |        |
|                                                      |        | 88.529 |        |        |        |
|                                                      |        |        | 83.388 |        |        |
|                                                      |        |        |        | 81.342 |        |
|                                                      |        |        |        |        | 82.654 |

TABLE XXV COARSE MODEL 5: 2.5D MODEL SIX-PHASE ACI RESISTANCE MATRIX AT 1 KHZ

| Resistance matrix elements at LF (m $\Omega$ ) |        |        |        |        |        |
|------------------------------------------------|--------|--------|--------|--------|--------|
| 19.414                                         |        |        |        |        |        |
|                                                | 19.284 |        |        |        |        |
|                                                |        | 21.014 |        |        |        |
|                                                |        |        | 19.305 |        |        |
|                                                |        |        |        | 19.158 |        |
|                                                |        |        |        |        | 19.684 |

Simulating this coarse model takes only 20.8 sec. with a peak RAM usage of 651 MB.

PowerSI is a good tool candidate for generating an ACI coarse model due to the short simulation times. This allows a quick coarse model iteration during the SM optimization flow. However, the output responses of this coarse model show lower correlation with those of the fine model at low frequencies.

# 5.2. Coarse Model Decision Criteria

As mentioned before, the coarse model plays an important role in the success of SM. It influences the quality of the design and the computational complexity of the optimization process, as described in [Koziel-07]. Therefore, the coarse model must be selected carefully. For this reason, the following selection criteria are applied to the different coarse model alternatives described in Section 5.1.

#### 5.2.1 Coarse Model Accuracy

The ACI coarse model should be a good representation of the ACI fine model. This feature improves the quality of the final design as well as convergence rate of the SM optimization algorithm. This in turn reduces the computational complexity by minimizing the number of fine model evaluations.

Accuracy of each ACI coarse model is measured by comparing their responses ( $\mathbf{R}_c$ ) versus the ACI fine model responses ( $\mathbf{R}_f$ ) at a reference design across different frequencies, as shown in Fig. 5-2 and Fig. 5-3.

It is seen from Fig. 5-2a that the resistance Res(f) of coarse model 5 presents the best accuracy for f > 20 MHz, while coarse model 1 is the most accurate for f < 20 MHz. Coarse model 3 is also good enough, since it balances the accuracy level across all frequencies. The accuracy of Res(f) is important at both low and high frequencies, since *Res* must be optimized at 1 kHz ( $R_{LF}$ ) and at 158 MHz ( $R_{ac}$ ).

Fig. 5-2 Comparing fine and coarse model responses: a) resistance ( $R_{LF}$  at low frequencies and  $R_{ac}$  at high frequencies); b) self-inductance  $L_{self}$ .

Fig. 5-3 Comparing fine and coarse model responses: a) quality factor Q; b) mutual inductance  $L_{\text{mutual}}$ .

Regarding the self-inductance  $L_{self}(f)$  (see Fig. 5-2b), coarse model 1 presents the best accuracy for f < 3 MHz and coarse model 5 for f > 3 MHz. However, coarse model 5 shows the worst deviation for low frequencies, which might be irrelevant since optimizing  $L_{self}$  is only required at  $f_{sw} = 158$  MHz.

Regarding the quality factor Q(f), the best accuracy at  $f_{sw}$  is achieved by coarse model 5, as shown in Fig. 5-3a. Coarse models 1 and 4 present the best accuracy for f < 30 MHz.

Regarding the mutual inductance,  $L_{mutual}(f)$ , all coarse models present good accuracy at  $f_{sw}$ , as shown in Fig. 5-3b, with a slightly better behavior in coarse model 1 across all frequencies. Similarly to  $L_{self}(f)$ , the optimization process in  $L_{mutual}(f)$  is focused more at  $f_{sw}$ , consequently, any of the evaluated coarse models can be a good representation of the fine model for that response.

In summary, the most suited coarse model for the intended optimization in terms of accuracy is coarse model 5. This model obtains the best accuracy in 3 out of the 6 responses of interest ( $R_{ac}$ ,  $L_{self}$ , and Q) and presents an acceptable deviation in the rest ( $R_{LF}$ ,  $L_{mutual}$  and K). A summary of the main responses of interest from the fine model and the different coarse models is presented in Table XXVI.
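The accuracy ranking can be cross-checked at the tabulated operating points using the values summarized in Table XXVI. The sketch below uses the relative deviation from the fine model as the figure of merit, which is an assumption of this example (the comparison in Figs. 5-2 and 5-3 is made across frequency); the function names are hypothetical.

```python
RESPONSES = ["R_LF", "R_ac", "L_self", "L_mutual", "K", "Q"]

# Values from Table XXVI; None marks responses reported as N/A
FINE = [13.76, 75.88, 1.039, -0.290, -0.279, 13.65]
COARSE = {
    "Coarse model 1": [11.43, 31.83, 0.964, -0.286, -0.296, 30.21],
    "Coarse model 2": [None, None, 0.850, -0.278, -0.327, None],
    "Coarse model 3": [9.10, 51.52, 0.918, -0.274, -0.298, 17.75],
    "Coarse model 4": [9.91, 36.12, 0.89, -0.256, -0.287, 24.53],
    "Coarse model 5": [19.64, 83.60, 1.084, -0.260, -0.240, 12.92],
}


def relative_error(coarse_value, fine_value):
    """Relative deviation of a coarse response from the fine response."""
    return abs(coarse_value - fine_value) / abs(fine_value)


def best_model_per_response(fine, coarse_models, names):
    """Return, for each response, the coarse model with the smallest error."""
    best = {}
    for idx, name in enumerate(names):
        errors = {
            model: relative_error(values[idx], fine[idx])
            for model, values in coarse_models.items()
            if values[idx] is not None
        }
        best[name] = min(errors, key=errors.get)
    return best


for response, model in best_model_per_response(FINE, COARSE, RESPONSES).items():
    print(f"{response}: {model}")
```

With this metric, coarse model 5 is the closest to the fine model for $R_{ac}$, $L_{self}$, and Q, in line with the summary above.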

#### 5.2.2 Simulation Time and Computing Resources

The coarse model should be computationally cheap. This minimizes the computational resources used by the SM optimization process, particularly during the coarse model direct optimization and the parameter extraction iterations.

The simulation time of a coarse model should be at least one order of magnitude shorter than that of the original fine model (at most 10% of the original time). However, in this ACI optimization problem the fine model takes almost 6 hours (20,852 s), so even 10% might be too long. Therefore, our target is a coarse model that takes less than 1% of the fine model simulation time. The simulation resources of all coarse models described in Section 5.1 are reported in Table XXVII. The best timing is given by coarse model 5, whose simulation takes only 0.1% of the total elapsed time of the fine model.
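The two timing targets can be evaluated directly from the simulation times reported in Table XXVII. The following sketch is only a bookkeeping aid; the constant and function names are hypothetical.

```python
FINE_TIME_S = 20852  # fine model simulation time from Table XXVII

# Coarse model simulation times in seconds, from Table XXVII
COARSE_TIME_S = {1: 1110, 2: 294, 3: 152, 4: 79, 5: 21}


def models_within(fraction, coarse_times=COARSE_TIME_S, fine_time=FINE_TIME_S):
    """Return the coarse models whose time is below fraction * fine_time."""
    return sorted(m for m, t in coarse_times.items() if t < fraction * fine_time)


for model, seconds in COARSE_TIME_S.items():
    print(f"Coarse model {model}: {seconds / FINE_TIME_S:.1%} of fine model time")

print("Below 10%:", models_within(0.10))
print("Below 1%:", models_within(0.01))
```

All five coarse models satisfy the one-order-of-magnitude rule, while only coarse models 3, 4, and 5 meet the stricter 1% target.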

### 5.2.3 Nature of Design Variables

TABLE XXVI MAIN RESPONSES OF INTEREST FROM THE SIMULATION OF THE FINE MODEL AND THE DIFFERENT COARSE MODELS

| Modeling type  | $R_{\rm LF}({ m m}\Omega)$ | $R_{\rm ac}({ m m}\Omega)$ | $L_{\text{self}}(nH)$ | L <sub>mutual</sub> (nH) | K      | Q     |
|----------------|----------------------------|----------------------------|-----------------------|--------------------------|--------|-------|
| Fine model     | 13.76                      | 75.88                      | 1.039                 | -0.290                   | -0.279 | 13.65 |
| Coarse model 1 | 11.43                      | 31.83                      | 0.964                 | -0.286                   | -0.296 | 30.21 |
| Coarse model 2 | N/A                        | N/A                        | 0.850                 | -0.278                   | -0.327 | N/A   |
| Coarse model 3 | 9.10                       | 51.52                      | 0.918                 | -0.274                   | -0.298 | 17.75 |
| Coarse model 4 | 9.91                       | 36.12                      | 0.89                  | -0.256                   | -0.287 | 24.53 |
| Coarse model 5 | 19.64                      | 83.60                      | 1.084                 | -0.260                   | -0.240 | 12.92 |

TABLE XXVII SIMULATION RESOURCES COMPARISON: FINE MODEL VS. COARSE MODELS

| Modeling type  | Time (seconds) | Time vs fine<br>model | Memory (GB) | Memory vs fine<br>model |
|----------------|----------------|-----------------------|-------------|-------------------------|
| Fine model     | 20852          | 100%                  | 103.980     | 100%                    |
| Coarse model 1 | 1110           | 5.3%                  | 4.472       | 4.3%                    |
| Coarse model 2 | 294            | 1.4%                  | 1.823       | 1.8%                    |
| Coarse model 3 | 152            | 0.7%                  | 0.967       | 0.9%                    |
| Coarse model 4 | 79             | 0.4%                  | 0.428       | 0.4%                    |
| Coarse model 5 | 21             | 0.1%                  | 0.651       | 0.6%                    |

Coarse and fine models whose design parameters  $x_c$  and  $x_f$  are of the same nature facilitate the SM algorithm, since the Broyden matrix B can be initialized with the identity matrix. In contrast, if  $x_c$  and  $x_f$  are of a different nature, then B should be initialized by evaluating the Jacobian, which can consume large computational resources [Rayas-Sánchez-16]. All the ACI coarse models described in Section 5.1 meet this criterion, since  $x_c$  and  $x_f$  both represent the same geometric physical parameters of the substrate package ACI.
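To illustrate why an identity initialization of B is convenient, the following minimal sketch runs a Broyden-based aggressive space mapping loop on a two-parameter toy problem. Everything specific here is an assumption made for illustration: the function `toy_parameter_extraction` is a hypothetical linear stand-in for the parameter extraction step (which in practice requires coarse model optimizations), and the numerical values are arbitrary. The rank-one Broyden update used is the standard one in aggressive space mapping.

```python
def solve_2x2(B, rhs):
    """Solve B h = rhs for a 2x2 matrix B using Cramer's rule."""
    (a, b), (c, d) = B
    det = a * d - b * c
    return [(rhs[0] * d - b * rhs[1]) / det,
            (a * rhs[1] - c * rhs[0]) / det]


def toy_parameter_extraction(x_f):
    """Hypothetical mapping from fine to coarse design parameters.

    It is close to the identity, mimicking coarse and fine models whose
    design parameters are of the same nature.
    """
    return [1.10 * x_f[0] + 0.05 * x_f[1] + 0.02,
            -0.03 * x_f[0] + 0.90 * x_f[1] - 0.01]


def aggressive_space_mapping(x_c_opt, extract, tol=1e-9, max_iter=20):
    """Find x_f such that extract(x_f) matches the coarse optimum x_c_opt."""
    x_f = list(x_c_opt)               # start from the coarse optimum
    B = [[1.0, 0.0], [0.0, 1.0]]      # Broyden matrix initialized to identity
    f = [p - t for p, t in zip(extract(x_f), x_c_opt)]
    iterations = 0
    while max(abs(v) for v in f) > tol and iterations < max_iter:
        h = solve_2x2(B, [-f[0], -f[1]])          # quasi-Newton step: B h = -f
        x_f = [x + s for x, s in zip(x_f, h)]
        f = [p - t for p, t in zip(extract(x_f), x_c_opt)]
        hth = h[0] * h[0] + h[1] * h[1]
        # Broyden rank-one update: B <- B + f h^T / (h^T h)
        B = [[B[i][j] + f[i] * h[j] / hth for j in range(2)] for i in range(2)]
        iterations += 1
    return x_f, iterations


X_C_OPT = [0.30, 0.015]  # hypothetical coarse model optimum
x_sm, n_iter = aggressive_space_mapping(X_C_OPT, toy_parameter_extraction)
print("space-mapped solution:", x_sm, "after", n_iter, "iterations")
```

Because the toy mapping is close to the identity, the identity start already gives a reasonable first step and no Jacobian evaluation is needed, which is the situation described above for design parameters of the same nature.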

### 5.2.4 Geometric Characteristics Parametrization

During the parameter extraction of SM optimization, the coarse model design parameters in  $x_c$  change repeatedly. For this reason, another important criterion for the coarse model selection is the ability to parametrize the geometric physical characteristics so that these changes can be automated. In our case, the design parameters in all coarse models described in Section 5.1 can be parametrized and automated using the PowerSI and Matlab®<sup>26</sup> driver, as in [Leal-Romo-17b].

# 5.3. Conclusions

This chapter presented a set of package substrate ACI coarse models with reasonable correlation with respect to an original 3D full-wave EM ACI fine model, and much lower computational cost. The best coarse model found will be used in future work during an SM optimization process. This study of alternative coarse models aims at improving efficiency and accuracy during the SM optimization. The selection process consists of evaluating multiple coarse model options against different criteria, including accuracy level, simulation time, computing resources, and parametrization capability.

<sup>26</sup> MATLAB, Version 9.1.0, the MathWorks, Inc., 3 Apple Hill Drive, Natick MA 01760-2098, 2016.
# **General Conclusions**

This thesis presented the basics of power delivery and demonstrated the importance of power delivery networks (PDN) in the design of digital systems. Some typical analysis and design methodologies used in industry were also described. Additionally, some numerical optimization techniques were proposed to increase efficiency during the analysis and design of PDNs.

Chapter 1 presented an overview of the main power delivery network concepts. It described the main problems encountered during the design of a PDN, such as DC drop, power loss, transient voltage droops, ground bounce, and EMI. It also showed some typical modeling methodologies to design and analyze this important element of digital system design. It was learned that when the modeling and analysis methodologies are well calibrated, the designer can make sound decisions with a good cost-performance trade-off in a timely manner.

Chapter 2 described a well-established industry methodology for analyzing PDNs based on electromagnetic field solvers. It explained the three basic steps in this process: EM simulation, SPICE model generation, and transient/frequency simulation. This process was illustrated through a specific example, for which some of the main challenges were listed.

In Chapter 3, a typical challenge between signal integrity and power delivery was presented. It was shown that including power delivery effects in the signal integrity analysis increases complexity, resulting in longer design cycle times. In that same chapter, a low-cost optimization method based on a parameter extraction (PE) technique was proposed to develop an accurate and efficient PDN model. This efficient PDN model was used during the SI analysis, reducing design cycle times considerably. The work in that chapter also provided the main inputs for a paper presented at the International Conference on Electrical Performance of Electronic Packaging and Systems (EPEPS 2019), held in Montreal, Canada. The material in that chapter also provided some inputs to a paper presented at the IEEE Latin American Symposium on Circuits and Systems (LASCAS 2020), held in San Jose, Costa Rica.

The last two chapters focused on the design and analysis of air core inductors (ACI). Starting from the concept and benefits of ACIs, those chapters showed a typical analysis and design methodology and then presented a coarse ACI model concept that lays the foundation for an alternative methodology, which consists of using surrogate models to simplify the analysis and design process and reduce its cycle time. Chapter 4 described the importance of ACIs for providing power to the circuits on die in order to increase power delivery efficiency. It also explained the ACI physical structure and its simulation process using a detailed, or fine, model implemented in Cadence<sup>®</sup> Sigrity<sup>TM</sup> PowerSI<sup>®</sup> (PSI-3D) to obtain highly accurate electrical responses. Chapter 5 proposed a procedure for selecting a good and computationally efficient ACI coarse model candidate that could be used during a space mapping optimization process. The selection process consisted of evaluating multiple coarse model options across different criteria, including accuracy level, simulation time, computing resources, and parametrization capability.

The coarse model developed in Chapter 5 and the fine model obtained in Chapter 4, in conjunction with the PE algorithm developed in Chapter 3, can serve as the backbone of a space mapping optimization process in future research.

# Appendices

## **A. LIST OF INTERNAL RESEARCH REPORTS**

- [1] B. Mercado-Casillas and J. E. Rayas-Sánchez, "An introduction to power delivery networks for IC packages and printed circuit boards," Internal Report *CAECAS-10-07-R*, ITESO, Tlaquepaque, Mexico, Jul. 2010.
- [2] B. Mercado-Casillas and J. E. Rayas-Sánchez, "Methodology based on electromagnetic field solvers and circuital models for PDN analysis," Internal Report *CAECAS-10-09-R*, ITESO, Tlaquepaque, Mexico, Sep. 2010.
- [3] B. Mercado-Casillas and J. E. Rayas-Sánchez, "Accurate and computationally efficient power delivery network lumped models obtained from parameter extraction," Internal Report *CAECAS-19-03-R*, ITESO, Tlaquepaque, Mexico, Apr. 2019.
- [4] B. Mercado-Casillas and J. E. Rayas-Sánchez, "Accurate simulation of package substrate air core inductors," Internal Report *CAECAS-19-16-R*, ITESO, Tlaquepaque, Mexico, Dec. 2019.
- [5] B. Mercado-Casillas and J. E. Rayas-Sánchez, "Developing a coarse model of a package substrate air core inductor for space mapping applications," Internal Report *CAECAS-20-02-R*, ITESO, Tlaquepaque, Mexico, May 2020.

## **B. PUBLISHED PAPERS**

- [1] B. Mercado-Casillas and J. E. Rayas-Sánchez, "Towards signal-power integrity analysis by efficient power delivery network lumped models obtained from parameter extraction," in *Int. Conf. Electrical Performance of Electronic Packaging and Systems (EPEPS 2019)*, Montreal, Canada, Oct. 2019, pp. 1-3. (ISSN: 2165-4107; eISSN: 2165-4115; ISBN: 978-1-7281-4586-0; e-ISBN: 978-1-7281-4585-3; DOI: 10.1109/EPEPS47316.2019.193214).
- [2] J. E. Rayas-Sánchez, F. E. Rangel-Patiño, B. Mercado-Casillas, F. Leal-Romo, and J. L. Chávez-Hurtado, "Machine learning techniques and space mapping approaches to enhance signal and power integrity in high-speed links and power delivery networks," in *IEEE Latin American Symp. Circuits and Systems Dig. (LASCAS 2020)*, San Jose, Costa Rica, Feb. 2020, pp. 1-4. (ISSN: 2330-9954; eISSN: 2473-4667; ISBN: 978-1-7281-3428-4; e-ISBN: 978-1-7281-3427-7; DOI: 10.1109/LASCAS45839.2020.9068994).

# Bibliography

| [Bandler-04]      | J. W. Bandler, Q. Cheng, S. A. Dakroury, A. S. Mohamed, M. H. Bakr, K. Madsen, and J. Søndergaard, "Space mapping: the state of the art," <i>IEEE Trans. Microwave Theory Tech.</i> , vol. 52, no. 1, pp. 337-361, Jan. 2004.                                                    |
|-------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| [Bharath-16]      | K. Bharath and S. Venkatraman, "Power delivery design and analysis of 14nm multicore server CPUs with integrated voltage regulators," in <i>IEEE 66th Electronic Components and Technology Conf.</i>, Las Vegas, NV, June 2016, pp. 368-373. |
| [Bogatin-10]      | E. Bogatin, Signal and Power Integrity Simplified. Boston, MA: Prentice Hall, 2010.                                                                                                                                                                                              |
| [Burton-14]       | A. E. Burton et al., "FIVR – fully integrated voltage regulators on 4 <sup>th</sup> generation Intel® Core <sup>TM</sup> SoCs," in <i>IEEE Applied Power Electronics Conference and Exposition - APEC 2014</i> , Fort Worth, TX, Mar. 2014, pp. 432-439.                         |
| [Cadence-15]      | Cadence Design Systems, Inc. (2015). <i>Cadence Sigrity PowerSI</i> [Online]. Available: <u>https://www.cadence.com/content/dam/cadence-www/global/en_US/documents/tools/pcb-design-analysis/sigrity-powersi-ds.pdf</u>                                                          |
| [Cadence-18]      | PowerSI 3D-EM User Manual, Cadence Design Systems, Inc., San Jose, CA 95134, USA, 2018.                                                                                                                                                                                          |
| [Cadence-19]      | Cadence Design Systems, Inc. (2019). <i>Sigrity PowerSI 3D EM Extraction Option</i> [Online]. Available: <u>https://www.cadence.com/content/dam/cadence-www/global/en_US/documents/tools/ic-package-design-analysis/sigrity-powersi-3d-em-extraction-ds.pdf</u> |
| [Chen-07]         | J. Chen and L. He, "Efficient in-package decoupling capacitor optimization for I/O power integrity," <i>IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems</i> , vol. 26, no. 4, Apr. 2007, pp. 734-738.                                              |
| [Gilli-11]        | M. Gilli, D. Maringer, and E. Schumann, <i>Numerical Methods and Optimization in Finance</i> . San Diego, CA: Academic Press, 2011.                                                                                                                                              |
| [Intel Altera-05] | Intel Altera. (2005). <i>Using the Altera PDN Tool to Optimize your Power Delivery Network Design</i> (rev Jul. 2015) [Online]. Available: https://www.intel.com/content/www/us/en/programmable/documentation/jba1434040249865.html |
| [Jung-10]         | H. Jung and M. Pedram, "Optimizing the power delivery network in dynamically voltage scaled systems with uncertain power mode transition times," in <i>2010 Design, Automation &amp; Test in Europe Conference &amp; Exhibition</i> , Dresden, Germany, March 2010, pp. 351-356. |
| [Kim-01]          | Y-J. Kim, M. Swaminathan and Y. Sub, "Modeling of power distribution networks for mixed signal applications," in <i>IEEE Int. Symp. Electromagnetic Compatibility (EMC 2001)</i> , Montreal, Quebec, 2001, pp. 1117-1122.                                                        |

| [Kim-03]               | Y-J. Kim, J-H. Kang, K. Woo-Park, J-K. Wee and K-H. Hong, "Synthesis method for design of power distribution network in high-speed digital systems," in <i>IEEE Symp. Electrical Performance of Electronic Packaging</i>, Princeton, NJ, Oct. 2003, pp. 133-136. |
|------------------------|----------------------------------------------------------------|
| [Kim-04]               | Y-J. Kim, H-S. Yoon, S. Lee, G. Moon, J. Kim and J-K. Wee, "An efficient path-based equivalent circuit model for design, synthesis, and optimization of power distribution networks in multilayer printed circuit boards," <i>IEEE Trans. on Advanced Packaging</i>, vol. 27, pp. 97-106, Feb. 2004. |
| [Koziel-07]            | S. Koziel and J.W. Bandler, "Coarse models for efficient space mapping optimisation of microwave structures," <i>IET Microwaves, Antennas &amp; Propagation</i> , vol. 4, no. 4, Apr. 2007, pp. 453-465.                                                                                                     |
| [Koziel-08]            | S. Koziel, Q. S. Cheng, and J. W. Bandler, "Space mapping," <i>IEEE Microwave Magazine</i> , vol. 9, no. 6, pp. 105-122, Dec. 2008.                                                                                                                                                                          |
| [Lambert-14]           | W. J. Lambert, M. J. Hill, K. Radhakrishnan, L. Wojewoda, and A. E. Augustine, "Package embedded inductors for integrated voltage regulators," in <i>IEEE 64th Electronic Components and Technology Conf. (ECTC)</i>, Orlando, FL, May 2014, pp. 528-534. |
| [Lambert-16]           | W. J. Lambert, M. J. Hill, K. Radhakrishnan, L. Wojewoda, and A. E. Augustine, "Package inductors for Intel fully integrated voltage regulators," <i>IEEE Transactions on Components, Packaging and Manufacturing Technology</i>, vol. 6, no. 1, pp. 3-11, Jan. 2016. |
| [Leal-Romo-17a]        | F. J. Leal-Romo, M. Cabrera-Gómez, D. M. García-Mora, and J. E. Rayas-Sánchez, "Design optimization of a 3D spiral inductor using space mapping," Internal Report <i>PhDEngScITESO-17-06-R (CAECAS-17-05-R)</i>, ITESO, Tlaquepaque, Mexico, May 2017. |
| [Leal-Romo-17b]        | F. Leal-Romo, M. Cabrera-Gómez, J. E. Rayas-Sánchez, and D. M. García-Mora, "Design optimization of a planar spiral inductor using space mapping," in <i>Int. Conf. Electrical Performance of Electronic Packaging and Systems (EPEPS 2017)</i> , San Jose, CA, Oct. 2017, pp. 1-3.                          |
| [Li-07]                | M. P. Li, Jitter, Noise, and Signal Integrity at High-Speed. Boston, MA: Prentice Hall, 2007.                                                                                                                                                                                                                |
| [Mathuna-10]           | S. C. O. Mathuna, T. O'Donnell, N. Wang, and K. Rinne, "Magnetics on silicon: an enabling technology for power supply on chip," <i>IEEE Trans. on Power Electronics</i> , vol. 20, no. 4, pp. 585-592, Apr. 2010.                                                                                            |
| [Mathuna-12]           | S. C. O. Mathuna, N. Wang, S. Kulkarni, and S. Roy, "Review of integrated magnetics for power supply on chip (PwrSoC)," <i>IEEE Trans. on Power Electronics</i> , vol. 27, no. 11, pp. 4799-4816, Nov. 2012.                                                                                                 |
| [Mercado-Casillas-10]  | B. Mercado-Casillas and J. E. Rayas-Sánchez, "An introduction to power delivery networks for IC packages and printed circuit boards," Internal Report <i>CAECAS-10-07-R</i>, ITESO, Tlaquepaque, Mexico, Jul. 2010. |
| [Mercado-Casillas-19a] | B. Mercado-Casillas and J. E. Rayas-Sánchez, "Accurate simulation of package substrate air core inductors," Internal Report <i>CAECAS-19-16-R</i> , ITESO, Tlaquepaque, Mexico, Dec. 2019.                                                                                                                   |

| [Mercado-Casillas-19b] | B. Mercado-Casillas and J. E. Rayas-Sánchez, "Towards signal-power integrity analysis by efficient power delivery network lumped models obtained from parameter extraction," in <i>Int. Conf. Electrical Performance of Electronic Packaging and Systems (EPEPS 2019)</i>, Montreal, Canada, Oct. 2019, pp. 1-3. |
|------------------------|----------------------------------------------------------------|
| [Molex-04]             | Molex, "Z-Axis power delivery (ZAPD)," in <i>F-Molex Power Symp.</i>, vol. 3, Sep. 2004. |
| [Novak-07]             | I. Novak and J. R. Miller, <i>Frequency-Domain Characterization of Power Distribution Networks</i>. Norwood, MA: Artech House, 2007. |
| [Pan-13]               | S. Pan and B. Achkir, "Optimization of power delivery network design for multiple supply voltages," in <i>2013 IEEE International Symposium on Electromagnetic Compatibility</i>, Denver, CO, Nov. 2013, pp. 333-337. |
| [Rayas-Sánchez-16]     | J. E. Rayas-Sánchez, "Power in simplicity with ASM: tracing the aggressive space mapping algorithm over two decades of development and engineering applications," <i>IEEE Microwave Magazine</i>, vol. 17, no. 4, pp. 64-76, Apr. 2016. |
| [Ren-04]               | Y. Ren, K. Yao, M. Xu, F. C. Lee, "Analysis of the power delivery path from the 12V VR to the microprocessor," <i>IEEE Trans. Power Electronics</i>, vol. 19, pp. 1507-1514, Nov. 2004. |
| [Swanson-03]           | D. G. Swanson and W. J. R. Hoefer, <i>Microwave Circuit Modeling Using Electromagnetic Field Simulation</i>. Norwood, MA: Artech House, 2003. |
| [Watanabe-06]          | T. Watanabe, Y. Tanji, H. Kubota and H. Asai, "Parallel-distributed time-domain circuit simulation of power distribution networks with frequency-dependent parameters," in <i>Asia and South Pacific Conf. Design Automation</i>, Yokohama, Japan, Jan. 2006, pp. 6. |
| [Xu-03]                | M. Xu, H. Wang and T. Hubing, "Application of the cavity model to lossy power-return plane structures in printed circuit boards," <i>IEEE Trans. on Advanced Packaging</i>, vol. 23, pp. 73-80, Feb. 2003. |
| [Xu-14]                | T. Xu, <i>Circuit and System Level Design Optimization for Power Delivery and Management</i>, Ph.D. thesis, Dep. Graduate and professional studies, Texas A&M University, Texas, 2014. |
| [Zhang-10]             | H. Zhang, S. Krooswyk, and J. Ou, <i>High Speed Digital Design: Design of High Speed Interconnects and Signaling</i>. Waltham, MA: Morgan Kaufmann, 2015. |
