Abstract: This paper extends the methods developed by Hamilton (1989) and Chib (1996) to identified multiple-equation models. It details how to obtain Bayesian estimation and inference for a class of models with different degrees of time variation and discusses both analytical and computational difficulties.

JEL classification: C3

Key words: simultaneity, identification, time variation, volatility, Bayesian method

I. INTRODUCTION

We consider nonlinear stochastic dynamic simultaneous equations of the structural form:

$$y_t' A_0(s_t) = x_t' A_+(s_t) + \varepsilon_t', \quad t = 1, \dots, T, \tag{1}$$

$$\Pr(s_t = i \mid s_{t-1} = k) = p_{ik}, \quad i, k = 1, \dots, h, \tag{2}$$

where $s$ is an unobserved state, $y$ is an $n \times 1$ vector of endogenous variables, $x$ is an $m \times 1$ vector of exogenous and lagged endogenous variables, $A_0$ is an $n \times n$ matrix of parameters, $A_+$ is an $m \times n$ matrix of parameters, $T$ is the sample size, and $h$ is the total number of states.

Denote the longest lag length in the system of equations (1) by $\nu$. The vector of right-hand variables, $x_t$, is ordered from the $n$ endogenous variables for the first lag down to the $n$ variables for the last ($\nu$th) lag, with the last element of $x_t$ being the constant term. For $t = 1, \dots, T$, denote $Y_t = \{y_1, \dots, y_t\}$. We treat as given the initial lagged values of the endogenous variables, $Y_0 = \{y_{1-\nu}, \dots, y_0\}$.

Structural disturbances are assumed to have the distribution: [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] where $N(a, b)$ refers to the normal pdf with mean $a$ and covariance matrix $b$, and $I_n$ is an $n \times n$ identity matrix. Following Hamilton (1989) and Chib (1996), we impose no restrictions on the transition matrix $P = [p_{ik}]$.
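To make the data-generating process in (1)-(2) concrete, the following is a minimal simulation sketch, not the estimation method of the paper. It rests on several assumptions that are not in the text: the function name `simulate_ms_svar` is hypothetical, the disturbances are drawn as standard normal (consistent with the identity covariance mentioned above and with equation (6)), the initial lagged values are set to zero, the initial state is drawn uniformly, and states are indexed from 0 rather than 1. The transition matrix follows the convention of (2), so each column sums to one.

```python
import numpy as np


def simulate_ms_svar(A0, Aplus, P, T, nlags, rng):
    """Simulate y_t' A0(s_t) = x_t' A+(s_t) + eps_t' with Markov regimes.

    A0:    (h, n, n) array; A0[k] is the contemporaneous matrix in state k.
    Aplus: (h, m, n) array with m = n * nlags + 1 (constant term last).
    P:     (h, h) array; P[i, k] = Pr(s_t = i | s_{t-1} = k), columns sum to 1.
    """
    h, n, _ = A0.shape
    m = n * nlags + 1
    assert Aplus.shape == (h, m, n)
    assert np.allclose(P.sum(axis=0), 1.0)

    # The first nlags rows hold the initial values Y_0 (zeros: an assumption).
    y = np.zeros((T + nlags, n))
    s = np.zeros(T, dtype=int)
    state = rng.integers(h)  # arbitrary initial state (an assumption)

    for t in range(T):
        # Draw s_t from the column of P that belongs to s_{t-1}.
        state = rng.choice(h, p=P[:, state])
        s[t] = state
        row = t + nlags
        # x_t stacks lag 1 first, ..., lag nlags last, then the constant.
        lags = [y[row - j] for j in range(1, nlags + 1)]
        x = np.concatenate(lags + [np.ones(1)])
        eps = rng.standard_normal(n)  # assumed N(0, I_n)
        # y_t' A0 = x_t' A+ + eps'  is equivalent to  A0' y_t = A+' x_t + eps.
        rhs = Aplus[state].T @ x + eps
        y[row] = np.linalg.solve(A0[state].T, rhs)

    return y[nlags:], s


# Illustrative two-variable, two-state system with one lag (made-up numbers).
A0_demo = np.stack([np.eye(2), 2.0 * np.eye(2)])
Aplus_demo = np.stack([
    np.vstack([0.5 * np.eye(2), np.zeros((1, 2))]),
    np.vstack([np.eye(2), np.zeros((1, 2))]),
])
P_demo = np.array([[0.9, 0.2],
                   [0.1, 0.8]])
y_demo, s_demo = simulate_ms_svar(
    A0_demo, Aplus_demo, P_demo, T=50, nlags=1, rng=np.random.default_rng(0)
)
```

Solving the linear system with the transpose of $A_0(s_t)$ avoids forming an explicit inverse in every period. Both demo states share the reduced-form coefficient matrix $0.5\, I_2$ but differ in shock volatility, which is one simple example of restricted time variation.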
The reduced-form system of equations implied by (1) is:

$$y_t' = x_t' B(s_t) + u_t'(s_t), \quad t = 1, \dots, T, \tag{3}$$

where

$$B(s_t) = A_+(s_t) A_0^{-1}(s_t), \tag{4}$$

$$u_t(s_t) = A_0'^{-1}(s_t)\, \varepsilon_t, \tag{5}$$

$$E\big(u_t(s_t)\, u_t(s_t)'\big) = \big(A_0(s_t) A_0'(s_t)\big)^{-1}. \tag{6}$$

In the reduced form (4)-(6), $B(s_t)$ and $u_t(s_t)$ mix the structural parameters and shocks across equations, so regime shifts in one structural equation cannot be distinguished from those in another. In contrast, the structural form (1) allows one to identify regime switches in each structural equation, such as the policy rule.

II. PRIOR RESTRICTIONS

II.1. Restrictions on time variation. If we let all parameters vary across states, it is relatively straightforward to apply the existing methods of Chib (1996) and Sims and Zha (1998) to the model estimation, because $A_0(s_t)$ and $A_+(s_t)$ in each given state can be estimated independently of the parameters in other states. But with such an unrestricted form of time variation, the number of free parameters becomes impractically large when the system of equations is large or the lag length is long. For a typical monthly model with 13 lags and 6 endogenous variables, for example, the number of parameters in $A_+(s_t)$ is of order 468 for each state ($13 \times 6 \times 6 = 468$, excluding constant terms). Given the post-war macroeconomic data, however, it is not uncommon for some states to last only a few years, so the number of associated observations is far smaller than 468. It is therefore essential to simplify the model by restricting the degree of time variation in the model's parameters. Such a restriction entails complexity and difficulties that have not been dealt with in the simultaneous-equation literature.
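Two points in the preceding discussion can be checked numerically: the mapping from structural to reduced-form parameters in (4)-(6), and the parameter count for the monthly example. The sketch below is illustrative only; the function names `reduced_form` and `n_lag_coefficients` are hypothetical, the example matrices are made up, and the covariance formula assumes structural shocks with identity covariance.

```python
import numpy as np


def reduced_form(A0, Aplus):
    """Map one state's structural (A0, A+) to reduced-form (B, Sigma).

    A0 is n x n and A+ is m x n, as in equation (1).
    Assumes structural shocks with identity covariance.
    """
    B = Aplus @ np.linalg.inv(A0)        # equation (4)
    Sigma = np.linalg.inv(A0 @ A0.T)     # equation (6)
    return B, Sigma


def n_lag_coefficients(n_vars, n_lags):
    """Number of lag coefficients in A+ for one state, constants excluded."""
    return n_vars * n_lags * n_vars


# Illustrative two-variable system with one lag and a constant (m = 3).
A0_example = np.array([[2.0, 0.5],
                       [0.0, 1.0]])
Aplus_example = np.arange(6, dtype=float).reshape(3, 2)
B_example, Sigma_example = reduced_form(A0_example, Aplus_example)

# Monthly example from the text: 6 variables, 13 lags.
count_example = n_lag_coefficients(6, 13)
```

Every column of $B(s_t)$ depends on the whole of $A_0^{-1}(s_t)$, which illustrates why a change confined to one structural equation spreads across all reduced-form equations. Adding the 6 constant terms to the lag-coefficient count would give 474 parameters per state, the same order of magnitude as the figure quoted in the text.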
To begin with, we rewrite A+ as [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (7) where [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] If we place a prior distribution on D([s. …