For anyone who wanders onto this site: I have moved to a new domain and website. The new site is built with Eleventy and hosted on Netlify, and it is more a personal website than just a blog.
This article shows an algorithm to solve the Fredholm integral equation of the first kind.

A Fredholm integral equation of the first kind is written as

$$g(x) = \int_a^b K(x, t)\, f(t)\, \mathrm{d}t.$$

The problem is to find $f(t)$, given that $g(x)$ and the kernel $K(x, t)$ are known. This equation occurs quite often in many different areas.
Here we describe a discretization-based method to solve the Fredholm integral equation. The integral is approximately replaced by a Riemann summation over a grid,

$$g(x_i) \approx \sum_{j} \Delta\, K(x_i, t_j)\, f(t_j),$$

where $\Delta$ is the grid size along the $t$ dimension and $x_i$, $t_j$ are the grid points, with $i$ and $j$ indicating their indices. When the grid size $\Delta \to 0$, the summation converges to the true integral. It is more convenient to write this in matrix form,

$$\mathbf{g} = \mathbf{K}\,\mathbf{c},$$

where $g_i = g(x_i)$, $K_{ij} = K(x_i, t_j)$, and $\mathbf{c}$ collects the unknowns $c_j = \Delta\, f(t_j)$; this is why the solution returned by the code below is divided by the grid size at the end.
Now solving the Fredholm integral equation is equivalent to solving a system of linear equations. The standard approach is ordinary least squares linear regression: find the vector $\mathbf{c}$ minimizing the norm $\lVert \mathbf{K}\mathbf{c} - \mathbf{g} \rVert^2$. In principle, the Fredholm integral equation may have non-unique solutions, so the corresponding linear system is ill-posed. The most commonly used method for ill-posed problems is Tikhonov regularization, which minimizes

$$\lVert \mathbf{K}\mathbf{c} - \mathbf{g} \rVert^2 + \alpha^2 \lVert \mathbf{c} \rVert^2.$$

Note that this is actually a special case of Tikhonov regularization (also called ridge regularization) in which the Tikhonov matrix is $\alpha \mathbf{I}$, with $\alpha$ a constant.
In many cases, both $f$ and $g$ are probability density functions (PDFs), and $K(x, t)$ is a conditional PDF, equivalent to $P(x \mid t)$. Thus there are two constraints on the solution $f$: $f(t) \ge 0$ and $\int f(t)\,\mathrm{d}t = 1$. These translate to $c_j \ge 0$ for every $j$ and $\sum_j c_j = 1$. Hence, we need to solve the Tikhonov regularization problem subject to these two constraints.
In the following, I show how to solve the Tikhonov regularization problem with both equality and inequality constraints. First, I show that the Tikhonov problem with a non-negativity constraint can easily be translated into a regular non-negative least squares (NNLS) problem, which can be solved using the active set algorithm.
Let us construct the augmented matrix,

$$\mathbf{A} = \begin{pmatrix} \mathbf{K} \\ \alpha \mathbf{I} \end{pmatrix},$$

and the vector,

$$\mathbf{b} = \begin{pmatrix} \mathbf{g} \\ \mathbf{0} \end{pmatrix},$$

where $\mathbf{I}$ is the identity matrix and $\mathbf{0}$ is the zero vector of size $n$. It is easy to show that the Tikhonov regularization problem subject to $c_j \ge 0$ is equivalent to the regular NNLS problem,

$$\min_{\mathbf{c} \ge 0} \lVert \mathbf{A}\mathbf{c} - \mathbf{b} \rVert^2.$$
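This equivalence is easy to verify numerically. The sketch below (matrix sizes, seed, and $\alpha$ are arbitrary choices of mine) checks, for an arbitrary candidate vector, that the augmented least-squares objective equals the Tikhonov objective:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 8, 6
K = rng.random((n, m))
g = rng.random(n)
alpha = 0.5
c = rng.random(m)  # any candidate solution vector

# augmented system: A = [K; alpha*I], b = [g; 0]
A = np.vstack((K, alpha * np.identity(m)))
b = np.hstack((g, np.zeros(m)))

lhs = np.sum((A @ c - b) ** 2)                             # ||A c - b||^2
rhs = np.sum((K @ c - g) ** 2) + alpha**2 * np.sum(c**2)   # Tikhonov objective
print(np.isclose(lhs, rhs))  # True
```

The extra rows $\alpha \mathbf{I}$ contribute exactly the penalty term $\alpha^2 \lVert \mathbf{c} \rVert^2$, so minimizing one objective minimizes the other.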
Now we add the equality constraint $\sum_j c_j = 1$, or $\mathbf{1}^{T}\mathbf{c} = 1$ written in matrix form. My implementation follows the algorithm described by Haskell and Hanson [1]. According to their method, the problem becomes another NNLS problem,

$$\min_{\mathbf{c} \ge 0} \left\lVert \begin{pmatrix} \epsilon \mathbf{A} \\ \mathbf{1}^{T} \end{pmatrix} \mathbf{c} - \begin{pmatrix} \epsilon \mathbf{b} \\ 1 \end{pmatrix} \right\rVert^2.$$

The solution of this problem converges to the true constrained solution as $\epsilon \to 0$. This completes the algorithm for solving the Fredholm equation of the first kind when $f$ is a probability density function. I call the algorithm described above non-negative Tikhonov regularization with equality constraint (NNETR).
Here I show the core code of the algorithm described above.
```python
import numpy as np
import scipy.optimize

# core algorithm of non-negative equality Tikhonov regularization (NNETR)
def NNETR(K, f, Delta, epsilon, alpha):
    # first step: Tikhonov augmentation, A_nn = [K; alpha*I], b_nn = [f; 0]
    A_nn = np.vstack((K, alpha * np.identity(K.shape[1])))
    b_nn = np.hstack((f, np.zeros(K.shape[1])))
    # second step: append the equality constraint (solution sums to 1)
    A_nne = np.vstack((epsilon * A_nn, np.full(A_nn.shape[1], 1.0)))
    b_nne = np.hstack((epsilon * b_nn, 1.0))
    # use the NNLS solver provided by scipy
    sol, residue = scipy.optimize.nnls(A_nne, b_nne)
    # solution should be divided by Delta (grid size)
    sol = sol / Delta
    return sol, residue
```
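To see the algorithm at work, here is a small self-contained demo on a synthetic problem. The Gaussian kernel, the grid, and the parameter values below are arbitrary choices of mine for illustration, not from the original post:

```python
import numpy as np
import scipy.optimize

def NNETR(K, f, Delta, epsilon, alpha):
    # same algorithm as above: Tikhonov augmentation + equality-constrained NNLS
    A_nn = np.vstack((K, alpha * np.identity(K.shape[1])))
    b_nn = np.hstack((f, np.zeros(K.shape[1])))
    A_nne = np.vstack((epsilon * A_nn, np.full(A_nn.shape[1], 1.0)))
    b_nne = np.hstack((epsilon * b_nn, 1.0))
    sol, residue = scipy.optimize.nnls(A_nne, b_nne)
    return sol / Delta, residue

# synthetic test: a Gaussian blur kernel and a Gaussian "true" density
n = 50
t = np.linspace(0.0, 1.0, n)
Delta = t[1] - t[0]
f_true = np.exp(-0.5 * ((t - 0.5) / 0.1) ** 2)
f_true /= np.sum(f_true) * Delta                  # normalize to a PDF on the grid
Kmat = np.exp(-0.5 * ((t[:, None] - t[None, :]) / 0.15) ** 2) \
       / (0.15 * np.sqrt(2.0 * np.pi))
g = Kmat @ (Delta * f_true)                       # discretized Fredholm integral

sol, _ = NNETR(Kmat, g, Delta, epsilon=1e-4, alpha=0.01)
print(np.min(sol) >= 0.0)    # True: non-negativity is enforced by NNLS
print(np.sum(sol) * Delta)   # close to 1: the equality constraint
```

By construction, the recovered solution is non-negative, and with a small `epsilon` the equality row dominates the augmented system, so the solution integrates to one to good accuracy.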
[1]: Haskell, Karen H., and Richard J. Hanson. "An algorithm for linear least squares problems with equality and nonnegativity constraints." Mathematical Programming 21.1 (1981): 98-118.
This note is about the effectiveness of using a multi-dimensional LSTM network to learn matrix operations, such as linear and nonlinear mappings. Recently I have been trying to solve a research problem involving a mapping between two matrices, and I came up with the idea of applying a neural network to it. The recurrent neural network (RNN) came to my attention not long after I started searching, since it seems able to capture the spatiotemporal dependence and correlation between different elements, whereas a convolutional neural network is not (I may well be wrong; this is just my very simple understanding, and I am not an expert on neural networks). Perhaps the most famous article about RNNs online is this blog post where Andrej Karpathy demonstrated the effectiveness of using RNNs to generate meaningful text. For anyone interested in traditional Chinese culture, here is a github repo on using an RNN to generate 古诗 (classical Chinese poetry).
However, the above examples only deal with taking sequential input data and producing sequential predictions. My problem is learning a mapping between two matrices, which is multi-dimensional. After a little research, I found that a multi-dimensional recurrent neural network can be used here. If you google "Multidimensional Recurrent Neural Network", the first entry is this paper by Alex Graves et al. However, I want to point out that almost exactly the same idea was proposed long before, back in 2003, in the context of protein contact map prediction in this paper.
I had never used neural networks before. Instead of learning from scratch, I decided it would probably be more efficient to find an available github repo and study the code from there. Fortunately, I did find a very good exemplary implementation here.
The question is: can MDLSTM learn the mapping between two matrices? From basic linear algebra, we know there are two types of mapping, linear and nonlinear, so it is natural to study the problem in both cases. Any linear mapping can be represented by a matrix. For simplicity, I use a random matrix $M$ to represent the linear mapping we want to learn, and apply it to a gaussian-random-field matrix $I$ to produce a new, transformed matrix $O$, i.e. $O = M I$. We feed $I$ and $O$ into our MDLSTM network as inputs and targets. Our goal is to predict $O$ given the input $I$, where the element values of $O$ are continuous rather than categorical, so we use a linear activation function and mean squared error as our loss function.
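As a quick refresher on what "linear" means here, a toy numpy check (sizes and seed are arbitrary): matrix multiplication respects superposition, which is exactly the structure the network has to learn in the linear case.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.random((10, 10))          # random matrix representing the linear map
I1 = rng.normal(size=(10, 10))
I2 = rng.normal(size=(10, 10))
a, b = 2.0, -3.0

# defining property of a linear map: M (a*I1 + b*I2) = a*(M I1) + b*(M I2)
lhs = M @ (a * I1 + b * I2)
rhs = a * (M @ I1) + b * (M @ I2)
print(np.allclose(lhs, rhs))  # True
```

A nonlinear map, such as the matrix square used later, does not satisfy this property, which is what makes it a harder test case.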
```python
import numpy as np

def fft_ind_gen(n):
    a = list(range(0, int(n / 2 + 1)))
    b = list(range(1, int(n / 2)))
    b.reverse()
    b = [-i for i in b]
    return a + b

def gaussian_random_field(pk=lambda k: k ** -3.0, size1=100, size2=100,
                          anisotropy=True):
    def pk2(kx_, ky_):
        if kx_ == 0 and ky_ == 0:
            return 0.0
        if anisotropy:
            if kx_ != 0 and ky_ != 0:
                return 0.0
        return np.sqrt(pk(np.sqrt(kx_ ** 2 + ky_ ** 2)))
    noise = np.fft.fft2(np.random.normal(size=(size1, size2)))
    amplitude = np.zeros((size1, size2))
    for i, kx in enumerate(fft_ind_gen(size1)):
        for j, ky in enumerate(fft_ind_gen(size2)):
            amplitude[i, j] = pk2(kx, ky)
    return np.fft.ifft2(noise * amplitude)

def next_batch_linear_map(bs, h, w, mapping, anisotropy=True):
    x = []
    for i in range(bs):
        o = gaussian_random_field(pk=lambda k: k ** -4.0, size1=h, size2=w,
                                  anisotropy=anisotropy).real
        x.append(o)
    x = np.array(x)
    y = []
    for idx, item in enumerate(x):
        y.append(np.dot(mapping, item))
    y = np.array(y)
    # data normalization
    for idx, item in enumerate(x):
        x[idx] = (item - item.mean()) / item.std()
    for idx, item in enumerate(y):
        y[idx] = (item - item.mean()) / item.std()
    return x, y
```
Note that we normalize the matrix elements so that their mean equals zero and their variance equals one. We can visualize the mapping by plotting the matrices.
```python
h, w = 10, 10
batch_size = 10
linear_map = np.random.rand(h, w)
batch_x, batch_y = next_batch_linear_map(batch_size, h, w, linear_map)
fig, ax = plt.subplots(1, 3)
ax[0].imshow(batch_x[0], cmap='jet', interpolation='none')
ax[1].imshow(linear_map, cmap='jet', interpolation='none')
ax[2].imshow(batch_y[0], cmap='jet', interpolation='none')
ax[0].set_title(r'$\mathrm{Input\ Matrix\ }I$')
ax[1].set_title(r'$\mathrm{Linear\ Mapping\ Matrix\ }M$')
ax[2].set_title(r'$\mathrm{Output\ Matrix\ }O$')
ax[0].axis('off')
ax[1].axis('off')
ax[2].axis('off')
plt.tight_layout()
plt.show()
```
As shown, the matrix $M$ maps $I$ to $O$. Such a transformation is called a linear mapping. I will show that MDLSTM can indeed learn this mapping to reasonable accuracy. I use the code here. The following code is the training part.
```python
import numpy as np
import tensorflow as tf
import tensorflow.contrib.slim as slim
from time import time

anisotropy = False
learning_rate = 0.005
batch_size = 200
h = 10
w = 10
channels = 1
x = tf.placeholder(tf.float32, [batch_size, h, w, channels])
y = tf.placeholder(tf.float32, [batch_size, h, w, channels])
linear_map = np.random.rand(h, w)
hidden_size = 100
# multi_dimensional_rnn_while_loop comes from the MDLSTM implementation linked above
rnn_out, _ = multi_dimensional_rnn_while_loop(rnn_size=hidden_size, input_data=x,
                                              sh=[1, 1])
# use linear activation function
model_out = slim.fully_connected(inputs=rnn_out, num_outputs=1, activation_fn=None)
# use a slightly different loss function from the original code
loss = tf.sqrt(tf.reduce_sum(tf.square(tf.subtract(y, model_out))))
grad_update = tf.train.AdamOptimizer(learning_rate).minimize(loss)
sess = tf.Session(config=tf.ConfigProto(log_device_placement=False))
sess.run(tf.global_variables_initializer())
# add tensorboard (really useful)
train_writer = tf.summary.FileWriter('Tensorboard_out' + '/MDLSTM', sess.graph)
steps = 1000
mypredict_result = []
loss_series = []
for i in range(steps):
    batch = next_batch_linear_map(batch_size, h, w, linear_map, anisotropy)
    st = time()
    batch_x = np.expand_dims(batch[0], axis=3)
    batch_y = np.expand_dims(batch[1], axis=3)
    mypredict, loss_val, _ = sess.run([model_out, loss, grad_update],
                                      feed_dict={x: batch_x, y: batch_y})
    mypredict_result.append([batch_x, batch_y, mypredict])
    print('steps = {0} | loss = {1:.3f} | time {2:.3f}'.format(str(i).zfill(3),
                                                               loss_val,
                                                               time() - st))
    loss_series.append([i + 1, loss_val])
```
The loss as a function of training steps is shown in the figure below; it seems to saturate around 70-75. Now let's see how well our neural network learns. The following figures show five predictions on newly generated random input matrices. The results are pretty good for the purpose of illustration, though I am sure there is room for improvement.
I choose the square of the matrix as the test for nonlinear mapping, $O = I \cdot I$.
```python
def next_batch_nonlinear_map(bs, h, w, anisotropy=True):
    x = []
    for i in range(bs):
        o = gaussian_random_field(pk=lambda k: k ** -4.0, size1=h, size2=w,
                                  anisotropy=anisotropy).real
        x.append(o)
    x = np.array(x)
    y = []
    for idx, item in enumerate(x):
        y.append(np.dot(item, item))  # the only change is here
    y = np.array(y)
    # data normalization
    for idx, item in enumerate(x):
        x[idx] = (item - item.mean()) / item.std()
    for idx, item in enumerate(y):
        y[idx] = (item - item.mean()) / item.std()
    return x, y
```
The following images show the loss function and the results.
As you can see, the results are not great but very promising. Any suggestions to improve the results would be welcome.
A long time ago, I wrote about using the pivot algorithm to generate equilibrium conformations of a random walk, either self-avoiding or not. The volume exclusion of a self-avoiding chain makes it non-trivial to generate conformations, while a gaussian chain is very easy to generate. In addition to the efficient pivot algorithm, in this article I will show another interesting, less straightforward method to generate gaussian chain conformations.
To illustrate this method, which generates static conformations of a gaussian chain, we first need to consider the dynamics of such a system. It is well known that the dynamics of a gaussian/ideal chain can be modeled by the Brownian motion of beads connected along a chain, which is guaranteed to give the correct equilibrium ensemble. The model is called the "Rouse model" and is very well studied. I strongly suggest the book The Theory of Polymer Dynamics by Doi and Edwards to understand the method used here. I also found a useful reference here. I will not go through the details of the solution of the Rouse model. In short, the motion of a gaussian chain is just a linear combination of an infinite number of independent normal modes. Mathematically, that is,
$$\mathbf{R}_n = \mathbf{X}_0 + 2 \sum_{p=1}^{\infty} \mathbf{X}_p \cos\!\left(\frac{p \pi n}{N}\right),$$

where $\mathbf{R}_n$ is the position of bead $n$ and the $\mathbf{X}_p$ are the normal modes. Each $\mathbf{X}_p$ is the solution of a Langevin equation $\xi_p \dot{\mathbf{X}}_p = -k_p \mathbf{X}_p + \mathbf{f}_p(t)$. This is a special case of the Ornstein-Uhlenbeck process, and the equilibrium solution is just a normal distribution with mean zero and variance (per Cartesian component)

$$\langle X_p^2 \rangle = \frac{N b^2}{6 \pi^2 p^2},$$

where $N$ is the number of beads (or number of steps) and $b$ is the Kuhn length.
Since the normal modes are independent of each other and are just gaussian random numbers, it is very easy and straightforward to generate them. We can then transform them into the actual bead positions using the first equation, which gives us the conformation. The idea is that simple! It may seem non-obvious at first glance, but it gives the correct result. To test this, let's implement the algorithm in python.
```python
import numpy as np

def generate_gaussian_chain(N, b, pmax):
    # N = number of beads
    # b = Kuhn length
    # pmax = maximum number of normal modes used in the summation
    # compute normal modes xpx, xpy and xpz
    p = np.arange(1, pmax + 1)
    scale = np.sqrt(N * b ** 2.0 / (6.0 * np.pi ** 2.0 * p ** 2.0))
    xpx = np.random.normal(scale=scale)
    xpy = np.random.normal(scale=scale)
    xpz = np.random.normal(scale=scale)
    # compute cosine terms cos(p * pi * n / N) for each mode p and bead n
    cosarray = np.cos(np.outer(p, np.pi * np.arange(1, N + 1) / N))
    # transform normal modes to actual positions of beads
    x = 2.0 * np.sum(xpx[:, None] * cosarray, axis=0)
    y = 2.0 * np.sum(xpy[:, None] * cosarray, axis=0)
    z = 2.0 * np.sum(xpz[:, None] * cosarray, axis=0)
    return np.dstack((x, y, z))[0]
```
Note that there is a parameter called `pmax`. Although the actual position is a linear combination of an infinite number of normal modes, numerically we must truncate the summation; `pmax` sets the number of normal modes computed. Also, the above code uses numpy broadcasting to keep things concise and efficient. Let's use this code to generate three conformations with different values of `pmax` and plot them.
```python
# N = 300, b = 1.0
conf1 = generate_gaussian_chain(300, 1.0, 10)    # pmax = 10
conf2 = generate_gaussian_chain(300, 1.0, 100)   # pmax = 100
conf3 = generate_gaussian_chain(300, 1.0, 1000)  # pmax = 1000
fig = plt.figure(figsize=(15, 5))
# matplotlib codes here
plt.show()
```
The three plots show conformations generated with $p_{\max} = 10$, $100$ and $1000$. As clearly shown, a larger number of modes gives a more accurate result. The normal modes of small $p$ correspond to the low-frequency motion of the chain, so with a small `pmax` we are only including the low-frequency modes, and the generated conformation can be considered a somewhat coarse-grained representation of a gaussian chain. The larger `pmax` is, the more high-frequency modes are included, leading to more detailed structure. This coarsening can be vividly observed in the figure above from right to left (large `pmax` to small `pmax`). To test that our conformations are indeed those of a gaussian chain, we compute the mean end-to-end distance and check whether we get the correct Flory scaling, $R \sim N^{1/2}$.
As shown in the above plot, we indeed recover the correct scaling, $R \sim N^{0.5}$. When using this method, care should be taken in setting the parameter `pmax`, the number of normal modes computed: it must be large enough to ensure a correct result, and the longer the chain, the larger `pmax` should be.
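To make the scaling test concrete, here is a self-contained sketch (a Python 3 re-implementation of the generator above; the chain lengths, `pmax`, and sample counts are arbitrary choices of mine) that estimates the scaling exponent $\nu$ from the mean squared end-to-end distance:

```python
import numpy as np

def generate_gaussian_chain(N, b, pmax, rng):
    # same normal-mode construction as above, vectorized
    p = np.arange(1, pmax + 1)
    scale = np.sqrt(N * b**2 / (6.0 * np.pi**2 * p**2))
    cosarray = np.cos(np.outer(p, np.pi * np.arange(1, N + 1) / N))
    conf = np.empty((N, 3))
    for d in range(3):                  # x, y, z components
        xp = rng.normal(scale=scale)    # independent gaussian normal modes
        conf[:, d] = 2.0 * np.sum(xp[:, None] * cosarray, axis=0)
    return conf

rng = np.random.default_rng(42)
b, pmax, nsamples = 1.0, 200, 1000
r2 = {}
for N in (100, 400):
    acc = 0.0
    for _ in range(nsamples):
        c = generate_gaussian_chain(N, b, pmax, rng)
        acc += np.sum((c[-1] - c[0]) ** 2)  # squared end-to-end distance
    r2[N] = acc / nsamples

# R ~ N^nu  =>  <R^2> ~ N^(2 nu); estimate nu from the two chain lengths
nu = 0.5 * np.log(r2[400] / r2[100]) / np.log(400 / 100)
print(nu)  # should be close to 0.5 for an ideal chain
```

With enough samples the estimated exponent lands near the ideal-chain value $\nu = 1/2$, which is the check performed in the plot above.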
I have migrated to my new domain and website here ⇒ https://www.guangshi.io/
LAMMPS is a very powerful molecular dynamics simulation package I use in my daily research. In our research group, we mainly run Langevin dynamics (LD) or Brownian dynamics (BD) simulations. However, for some reason LAMMPS doesn't provide a way to do BD simulation. Both LD and BD can be used to sample the correct canonical ensemble, which is sometimes also called the NVT ensemble.
BD is the large-friction limit of LD, in which inertia is neglected; BD is therefore also called overdamped Langevin dynamics. It is important to know the difference between LD and BD, since the two terms seem to be used interchangeably by many people, which is simply not correct.
The equation of motion of LD is,

$$m \ddot{\mathbf{x}} = -\zeta \dot{\mathbf{x}} + \mathbf{F}(\mathbf{x}) + \mathbf{R}(t),$$

where $m$ is the mass of the particle, $\mathbf{x}$ is its position, $\zeta$ is the drag (damping) coefficient, $\mathbf{F}$ is the deterministic force, and $\mathbf{R}(t)$ is the random force. The random force is subject to the fluctuation-dissipation theorem,

$$\langle R_{\alpha}(t) R_{\beta}(t') \rangle = 2 \zeta k_B T\, \delta_{\alpha\beta}\, \delta(t - t').$$

$\mathbf{R}(t)$ is nowhere differentiable; its integral is called a Wiener process. Denote the Wiener process as $\mathbf{W}(t)$. It has the property that $\mathrm{d}W(t) = W(t + \mathrm{d}t) - W(t)$ is a Gaussian random variable of zero mean and variance $\mathrm{d}t$.
The fix `langevin` provided in LAMMPS is a numerical simulation of the above equation. LAMMPS uses a very simple integration scheme: the velocity-Verlet algorithm, where the force on a particle includes the friction drag term and the noise term. Since this is only a first-order algorithm in terms of the random noise, it cannot be used in the large-friction case. Thus the `langevin` fix in LAMMPS is mainly used as a way to conserve the temperature (a thermostat) while sampling conformation space. However, in many cases we want to study the dynamics of a system realistically, where the friction is much larger than the inertia. Then we need to do a BD simulation.
For an overdamped system, $\zeta$ is very large. Taking the limit $\zeta \to \infty$, the bath becomes infinitely dissipative (overdamped), and we can neglect the inertial term on the left-hand side of the LD equation of motion. Thus for BD, the equation of motion becomes

$$\dot{\mathbf{x}} = \frac{\mathbf{F}(\mathbf{x})}{\zeta} + \sqrt{\frac{2 k_B T}{\zeta}}\, \frac{\mathrm{d}\mathbf{W}(t)}{\mathrm{d}t}.$$

The first-order integration scheme for this equation is called the Euler-Maruyama algorithm, given as

$$\mathbf{x}_{n+1} = \mathbf{x}_n + \frac{\Delta t}{\zeta}\, \mathbf{F}(\mathbf{x}_n) + \sqrt{\frac{2 k_B T \Delta t}{\zeta}}\, \mathbf{G}_n,$$
where $\mathbf{G}_n$ is a vector of normal random variables with zero mean and unit variance. Since for BD the velocities are no longer well defined, only the positions are updated. The implementation of this scheme in LAMMPS is straightforward. Based on the source files `fix_langevin.cpp` and `fix_langevin.h` in LAMMPS, I wrote a custom BD `fix` myself. The core part of the code is below; the whole code is here.
```cpp
...
void FixBD::initial_integrate(int vflag)
{
  double dtfm;
  double randf;

  // update x of atoms in group
  double **x = atom->x;
  double **f = atom->f;
  double *rmass = atom->rmass;
  double *mass = atom->mass;
  int *type = atom->type;
  int *mask = atom->mask;
  int nlocal = atom->nlocal;
  if (igroup == atom->firstgroup) nlocal = atom->nfirst;

  if (rmass) {
    for (int i = 0; i < nlocal; i++)
      if (mask[i] & groupbit) {
        dtfm = dtf / rmass[i];
        randf = sqrt(rmass[i]) * gfactor;
        x[i][0] += dtv * dtfm * (f[i][0] + randf * random->gaussian());
        x[i][1] += dtv * dtfm * (f[i][1] + randf * random->gaussian());
        x[i][2] += dtv * dtfm * (f[i][2] + randf * random->gaussian());
      }
  } else {
    for (int i = 0; i < nlocal; i++)
      if (mask[i] & groupbit) {
        dtfm = dtf / mass[type[i]];
        randf = sqrt(mass[type[i]]) * gfactor;
        x[i][0] += dtv * dtfm * (f[i][0] + randf * random->gaussian());
        x[i][1] += dtv * dtfm * (f[i][1] + randf * random->gaussian());
        x[i][2] += dtv * dtfm * (f[i][2] + randf * random->gaussian());
      }
  }
}
...
```
As one can see, the implementation of the integration scheme is easy, as shown above. `dtv` is the time step used; `dtfm` supplies the $1/\zeta$ prefactor (computed from the particle mass and the damping parameter), and `randf` sets the amplitude of the random force.
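Independently of LAMMPS, the Euler-Maruyama update can be sanity-checked with a minimal Python sketch for a single particle in a harmonic trap (reduced units; all parameter values are arbitrary choices of mine). At equilibrium, the position variance should approach $k_B T / k$:

```python
import numpy as np

# Euler-Maruyama update for overdamped Langevin dynamics in a harmonic trap:
#   x_{n+1} = x_n + (dt/zeta) F(x_n) + sqrt(2 kT dt / zeta) G_n,  F(x) = -k x
rng = np.random.default_rng(7)
zeta, kT, k_spring, dt = 1.0, 1.0, 1.0, 0.01
nsteps, nburn = 400000, 20000
noise = rng.normal(size=nsteps)
amp = np.sqrt(2.0 * kT * dt / zeta)

x = 0.0
samples = np.empty(nsteps - nburn)
for n in range(nsteps):
    x += dt / zeta * (-k_spring * x) + amp * noise[n]
    if n >= nburn:
        samples[n - nburn] = x

var = samples.var()
print(var)  # equilibrium variance should be close to kT / k_spring = 1.0
```

The small residual bias of the first-order scheme scales with $k\,\Delta t / \zeta$, which is one reason higher-order schemes (next section) are attractive.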
The Euler-Maruyama scheme is a simple first-order algorithm. Many studies have been done on higher-order integration schemes that allow larger time steps. I also implemented the method shown in this paper; the integration scheme, called `BAOAB`, is very simple. The source code of this method can be downloaded here. In addition, feel free to fork my GitHub repository for `fix bd` and `fix bd/baoab`. I have done some tests and have been using this code in my research for a while without finding problems, but please test the code yourself if you intend to use it, and any feedback is welcome if you find issues.
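For reference, here is a minimal Python sketch of the overdamped "BAOAB-limit" update as I understand it from Leimkuhler and Matthews (the averaged-noise form below is my reading of their method, not code from this post), again tested on a hypothetical harmonic trap in reduced units:

```python
import numpy as np

# BAOAB-limit update (my reading of Leimkuhler & Matthews): the noise term
# averages two successive Gaussian variates,
#   x_{n+1} = x_n + (dt/zeta) F(x_n) + sqrt(2 kT dt / zeta) * (G_n + G_{n+1}) / 2
rng = np.random.default_rng(11)
zeta, kT, k_spring, dt = 1.0, 1.0, 1.0, 0.05
nsteps, nburn = 200000, 10000
G = rng.normal(size=nsteps + 1)
amp = np.sqrt(2.0 * kT * dt / zeta)

x = 0.0
samples = np.empty(nsteps - nburn)
for n in range(nsteps):
    x += dt / zeta * (-k_spring * x) + amp * 0.5 * (G[n] + G[n + 1])
    if n >= nburn:
        samples[n - nburn] = x

var = samples.var()
print(var)  # should be close to kT / k_spring = 1.0
```

For this harmonic test the averaged-noise update reproduces the equilibrium variance very accurately even at the fairly large time step used here.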
May is not the best season for camping in Big Bend: the desert can reach 40°C in the hottest part of the day, and water is scarce, so few people choose summer for a Big Bend trip. After moving to Austin, I happened to see a photo taken after sunset at the South Rim: a purple sky, with the endless Chihuahuan Desert below. After seeing that photo I was determined to go. But we never found a good time, and in the best season for Big Bend (winter) campsites are hard to reserve, so more than a year passed without us making it. When the semester ended in May, we decided to simply go. Looking back, we are both glad we took this somewhat hastily prepared trip; otherwise who knows when we would have gone.
Big Bend NP sits on the Texas-Mexico border, separated by the Rio Grande; the whole park is part of the Chihuahuan Desert. Because of its remote location (it took us 8 hours to drive from Austin, and San Antonio, the closest big city, is still 6 hours away), it is one of the least visited national parks in the US (42nd in visitation among all 59). The most popular activities are hiking and camping; the park's trails total 180 miles (290 km), and with so few visitors, camping and hiking here are an unforgettable experience. Big Bend is also famous for its night sky: it has some of the darkest nights in America (the surroundings are so empty there is almost no light pollution). Unfortunately, we didn't get to properly enjoy the Milky Way; if you have the chance, I recommend staying up past midnight to see it. Here is the park's official stargazing page.
From May to August, the desert has extremely dry, hot weather during the day. For people without wilderness experience, multi-day hiking and camping on Big Bend's desert floor in summer can be life-threatening because of sun exposure and lack of water. So we decided to stay in the High Chisos, a mountain range rising out of the center of the park, about 3000 feet above the desert floor at its highest point. Even at the height of summer, the temperature there is bearable, so most people who hike and camp in Big Bend in summer stay in the High Chisos.
Accommodation inside Big Bend comes in two kinds: camping and lodging.
Lodging, as the name suggests, means staying in the park's built rooms, with air conditioning and showers, located in the Chisos Basin. Most information online says it must be booked more than half a year in advance, but we got lucky this time: after changing our itinerary, we came down from the mountain, went straight to the counter, and managed to book the last remaining room. If you need a shower, AC, and a comfortable bed, the lodge is the best choice: it is in the center of the park, the most convenient base for driving everywhere else, and at relatively high elevation, so nights are cool even in summer.
But it would be a great pity to visit Big Bend without camping at least once. Camping roughly divides into campgrounds and backcountry camping. Campgrounds are the most common option: sites built by the park for visitors, close together, with shared restrooms. Big Bend has three campgrounds: Chisos Basin, Cottonwood, and Rio Grande Village. The Chisos Basin campground is in the same place as the lodging described above; the other two are on the west and east sides of the park, and Rio Grande Village is the only one that can take RVs. The advantage of a campground is the restrooms; if you just want to try sleeping in a tent, it is the best choice, and with other campers around it is very safe (no need to worry about wildlife). Each campground has some sites that can be reserved online in advance (reservation site); the rest are walk-up: go to the campground, find an unoccupied site, and self check-in at its entrance. If you haven't reserved online, I suggest arriving in the morning or even before dawn, as some people leave very early; arriving in the afternoon in high season, you will basically not find a spot.
But if you want the real Big Bend experience, I recommend backcountry camping: no one else around, truly camping in the wild. Backcountry camping roughly divides into primitive roadside campsites and backpacking camping. Roadside campsites can be reached directly by car; many people who bring RVs to Big Bend choose this. Note that the roadside campsites are all on the desert floor, and the park does not allow idling your engine for long periods to run the AC, so I think this option only suits fall and winter (unless you don't mind the heat). In fall or winter, with more people and a relaxed pace, renting an RV for roadside camping is the way I would most recommend.
For truly immersing yourself in nature, backpacking is undoubtedly the best way. It comes in two kinds: camping at park-designated sites along the trails, and wilderness camping. Wilderness camping, as the name suggests, means you can camp and hike anywhere (on the park's desert floor), after telling the rangers your itinerary in advance (rough hiking route and each night's campsite). It requires plenty of desert survival experience: you might not see another soul during a week of hiking, and if you run into danger, essentially no one can help. Beginners shouldn't even think about it; it is not worth dying for. So we chose the first kind. These campsites are along the park's most popular High Chisos trails, each one not far from the trail, so there is little danger. None of these backpacking campsites can be reserved in advance; you can only book them at the visitor center on the first day of your trip. This site shows the status of each campsite, and this PDF gives very detailed information on each one. Our plan was a three-day, two-night trip starting from the Chisos Basin, picking one campsite on the ascent route and one on the descent, so we could fully enjoy the whole experience.
First, I should say that the two of us had absolutely no camping experience before this, so we were true beginners, and hiking in Big Bend is genuinely demanding. We completed about half of the plan: the three-day, two-night trip became two days and one night, mainly because my wife was too scared to sleep at night... So if you have some experience, don't hesitate; and if you have none, don't shrink back either. We had our own doubts at first and considered the park hotel or a campground, until about a week before departure, while searching for tips online, I stumbled on the forum bigbendchats. There is very little information online about hiking Big Bend (and no Chinese trip reports at all, which is one reason I am writing this one). That forum helped enormously: it has detailed resources, trip reports, and discussion threads, and I found most of the useful information there.
Below is a day-by-day account of our trip.
Friday morning we loaded our gear into the car, and after the group meeting ended at 5:30 pm, we left straight from campus, arriving around 11:30 pm at a motel we had booked in Fort Stockton. As soon as midnight struck, I got on the campsite-status website to study which sites were still free, and we settled on our two nights' campsites (LW3 and SW3). Although summer is the off season, somewhat disappointingly, the most popular campsites had already been taken by earlier arrivals.
Saturday we got up at 7 and left at 8. Fort Stockton is about an hour and a half from the park entrance, and from the entrance to the Chisos Basin visitor center is another half hour. In high season, if you want a camping permit, you need to apply at the visitor center as early as possible; the permit is just a form you must carry with you. We got ours at 10 am, packed up, and set out early. We chose the western route (Laguna Meadow Trail) up to our first night's campsite, LW3; what should have been a 3-hour hike took us a full 6 hours because our packs were far too heavy. LW3 is about a mile off the main trail, with two other campsites nearby, the nearest a few hundred meters away, so it is a very secluded spot. After pitching the tent, I went looking for signs of wildlife: several piles of deer droppings around camp, and one obvious bear trail with bear scat every few dozen meters. The scat was all very dry, suggesting no "visitors" for some time. I also found a small pool, though the water had turned yellow; we filtered a bottle with our filter plus purification tablets, but it stayed yellow, and after much hesitation we didn't drink it and used it to wash our hands instead. After 5 pm we boiled water and ate our freeze-dried food, which was unexpectedly good. We turned in before dark; since it was our first time camping in the wild, we woke many times, always feeling something moving around us. To conserve water, we went thirsty through the night, and that night made me truly appreciate, for the first time, how precious water is. For me, the scariest thing was the bugs: there is a fly-like insect whose bite leaves an egg-sized, very itchy welt. My wife says my blood must simply taste better, because it never bit her once. And since she was too scared to leave the tent at night, we sadly missed Big Bend's starry sky.
We got up after 8 the next morning and had breakfast. Half our water was gone, our packs were still heavy, and a long day lay ahead, so very reluctantly we decided to give up the second night and instead descend the same day, turning onto the eastern route via the Colima Trail. From our campsite we climbed for another hour or so to the Colima Trail, then began the long descent. Crossing the Colima Trail, we reached Boot Canyon, one of the most popular spots in the High Chisos; its creek is the only "reliable" water source in the High Chisos, and we found it flowing well, with plenty of birds. After lunch at Boot Canyon (freeze-dried kung pao chicken) and a short rest, we continued down. One stretch of the eastern route is very steep, and we were glad we had come up the west side, or we might have given up halfway on day one. After another seemingly endless three-plus hours of descent, we finally reached the Chisos Basin, headed straight to the store for a Coke, and wanted nothing more than a cold shower. It was past 3 pm by then, and driving straight back to Austin would have been exhausting, so we tried our luck at the Basin campground. We did find a few empty sites, but none with a shade shelter, and with the sun at its fiercest, we gave up on the campground. Everything outside the park was full too, leaving one option: see whether the lodge had a room left. We weren't hopeful, but we actually snagged the last vacant room. A bit expensive, but we were too exhausted to care. We showered, slept wonderfully, and got up at 6 pm to watch the sunset at Sotol Vista. Being so tired, we had almost decided to sleep straight through to the next day; thankfully we didn't, or we would have missed an unforgettable sunset. Here you can see the sunset photos we took at Sotol Vista.
On the third day, before the morning heat set in, we drove to Rio Grande Village for a look. The campground was basically empty. A short trail climbs a small hill, and from the top you can see a Mexican town in the distance. The Rio Grande makes a nearly 180-degree bend here, so from the hilltop, both your left and right are actually Mexico. The most amusing sight was a Mexican man on a donkey, leisurely crossing the Rio Grande from the Mexican side into the US, then ambling back across the river into Mexico. Thinking of Trump's plan to build a wall, I looked around: this place is too vast and too remote, and the idea seemed like a pipe dream.
Around 10 am we decided to head back to Austin, stopping on the way out of the park at the Fossil Discovery Exhibit. Hundreds of millions of years ago, Big Bend was actually an ocean, then a wetland, and later a desert; dinosaurs once thrived here. The exhibit is built next to a former dig site and displays locally excavated fossils (including the skull of a giant prehistoric crocodile, a pterosaur with a wingspan of about 10 meters, a 2-meter Alamosaurus leg bone, and more). There is also a trail where visitors can hunt for fossils, and whatever you find you may keep. But it was already noon when we arrived, and it was so hot that we left right after seeing the exhibit.
My research work involves heavy use of IPython Notebook. I usually work on an office Mac, but I also often need access from home. After a brief search, I found these three wonderful articles on the topic.
I had been doing this manually for a while, but it eventually occurred to me how good it would be to make the process automatic. So I wrote a python script to carry out the procedure described in those three articles. I am sure there is a more elegant way to do this, but this is what I have so far, and it works.
```python
import paramiko
import sys
import subprocess
import socket
import argparse
import pyperclip

# function to get an available port
def get_free_port():
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind(('localhost', 0))
    s.listen(1)
    port = s.getsockname()[1]
    s.close()
    return port

# print the output from the paramiko SSH connection
def print_output(output):
    for line in output:
        print(line)

parser = argparse.ArgumentParser(
    description='Locally open IPython Notebook on remote server\n')
parser.add_argument('-t', '--terminate', dest='terminate', action='store_true',
                    help='terminate the IPython notebook on remote server')
args = parser.parse_args()

host = "***"  # host name
user = "***"  # username

# write a temporary python script to upload to the server to execute;
# this script prints an available port number on the remote machine
def temp():
    with open('free_port_tmp.py', 'w') as f:
        f.write('import socket\nimport sys\n')
        f.write('def get_free_port():\n')
        f.write('    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)\n')
        f.write("    s.bind(('localhost', 0))\n")
        f.write('    s.listen(1)\n')
        f.write('    port = s.getsockname()[1]\n')
        f.write('    s.close()\n')
        f.write('    return port\n')
        f.write("sys.stdout.write('{}'.format(get_free_port()))\n")
        f.write('sys.stdout.flush()\n')

def connect():
    # create SSH client
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.load_system_host_keys()
    client.connect(host, username=user)
    # generate the temp file and upload it to the server
    temp()
    ftpClient = client.open_sftp()
    ftpClient.put('free_port_tmp.py', "/tmp/free_port_tmp.py")
    # execute the python script on the remote server to get an available port id
    stdin, stdout, stderr = client.exec_command("python /tmp/free_port_tmp.py")
    stderr_lines = stderr.readlines()
    print_output(stderr_lines)
    port_remote = int(stdout.readlines()[0])
    print('REMOTE IPYTHON NOTEBOOK FORWARDING PORT: {}\n'.format(port_remote))
    ipython_remote_command = (
        "source ~/.zshrc; tmux new-session -d -s remote_ipython_session "
        "'ipython notebook --no-browser --port={}'".format(port_remote))
    stdin, stdout, stderr = client.exec_command(ipython_remote_command)
    stderr_lines = stderr.readlines()
    if len(stderr_lines) != 0:
        if 'duplicate session: remote_ipython_session' in stderr_lines[0]:
            print("ERROR: \"duplicate session: remote_ipython_session already exists\"\n")
            sys.exit(0)
        print_output(stderr_lines)
    # delete the temp files on the local machine and the server
    subprocess.run('rm -rf free_port_tmp.py', shell=True)
    client.exec_command('rm -rf /tmp/free_port_tmp.py')
    client.close()
    port_local = int(get_free_port())
    print('LOCAL SSH TUNNELING PORT: {}\n'.format(port_local))
    ipython_local_command = (
        "ssh -N -f -L localhost:{}:localhost:{} "
        "gs27722@wel14531.cm.utexas.edu".format(port_local, port_remote))
    subprocess.run(ipython_local_command, shell=True)
    pyperclip.copy('http://localhost:{}'.format(port_local))

def close():
    # create SSH client
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.load_system_host_keys()
    client.connect(host, username=user)
    stdin, stdout, stderr = client.exec_command(
        "source ~/.zshrc; tmux kill-session -t remote_ipython_session")
    stderr_lines = stderr.readlines()
    if len(stderr_lines) == 0:
        print('Successfully terminated the IPython notebook on {}\n'.format(host))
    else:
        print_output(stderr_lines)
    client.close()
    # kill the ssh tunneling process on the local machine
    try:
        pid = int(subprocess.check_output(
            "ps aux | grep localhost | grep -v grep | awk '{print $2}'", shell=True))
        subprocess.run("kill {}".format(pid), shell=True)
    except ValueError:
        pass

if args.terminate:
    close()
else:
    connect()
```
Basically this script does the following:
1. Connect to the server using the python package paramiko.
2. Upload a temporary python script and use paramiko to execute it; this script finds an available port on localhost on the remote machine.
3. Open IPython Notebook on the remote machine using the port obtained in the last step. I use tmux for this; the reason is that tmux lets us detach after launching the notebook, otherwise we could not proceed with the following steps. Also, my shell is zsh; you can modify that part of the code to suit your own setup.
4. On the local machine, find an available port and create an SSH tunnel that forwards the remote port to the local one.
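The port-picking trick used in steps 2 and 4 can be tried on its own; this standalone snippet is the same idea as the script's `get_free_port`:

```python
import socket

def get_free_port():
    # bind to port 0 so the OS assigns an unused ephemeral port
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind(('localhost', 0))
    s.listen(1)
    port = s.getsockname()[1]
    s.close()
    return port

port = get_free_port()
print(port)  # an unused port number chosen by the OS
```

The port is released as soon as the socket is closed, so there is a small race window before the notebook or tunnel actually binds it; in practice this is rarely a problem.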
If the script runs successfully, you will see something like this.
If you want to check whether IPython Notebook is really running on the remote machine, SSH to it and use the command `tmux ls`; a tmux session named `remote_ipython_session` should exist.
In a browser, open http://localhost:50979 (the local port printed by the script). You should be able to access your IPython notebook.
To terminate the IPython notebook on the remote machine, simply run the script with the `-t` (`--terminate`) flag.
P.S. If you don't even want to type the link into the browser, you can use the python package pyperclip to send the link to the clipboard, as the script above does.
This is a note about learning the LAMMPS source code. It focuses on the compute style of LAMMPS, which is used to compute certain quantities during a simulation run. Of course, you can also compute such quantities in post-processing, but it is usually faster to do it in the simulation, since you can take advantage of all the distances, forces, etc. generated during the run instead of computing them again afterwards. I will go through the LAMMPS source files `compute_gyration.h` and `compute_gyration.cpp`. I am not very familiar with C++, so I will also explain some language-related details I learned while studying the code. I hope this article is helpful to anyone who wants to modify or write their own LAMMPS compute style.
compute_gyration.h
```cpp
#ifdef COMPUTE_CLASS

ComputeStyle(gyration,ComputeGyration)

#else

#ifndef LMP_COMPUTE_GYRATION_H
#define LMP_COMPUTE_GYRATION_H

#include "compute.h"

namespace LAMMPS_NS {

class ComputeGyration : public Compute {
 public:
  ComputeGyration(class LAMMPS *, int, char **);
  ~ComputeGyration();
  void init();
  double compute_scalar();
  void compute_vector();

 private:
  double masstotal;
};

}

#endif
#endif
```
The first part of this code,
```cpp
#ifdef COMPUTE_CLASS

ComputeStyle(gyration,ComputeGyration)

#else
```
is where this specific compute style is registered. If you want to write your own compute style, say for the intermediate scattering function, then you would write it like this:
```cpp
#ifdef COMPUTE_CLASS

ComputeStyle(isf,ComputeISF)   // ISF stands for intermediate scattering function

#else
```
Moving on to the rest of the file: `#include "compute.h"` and `namespace LAMMPS_NS` bring in the base class and the LAMMPS namespace. Nothing special here; you need them in every compute style header file.
```cpp
class ComputeGyration : public Compute {
 public:
  ComputeGyration(class LAMMPS *, int, char **);
  ~ComputeGyration();
  void init();
  double compute_scalar();
  void compute_vector();
 private:
  double masstotal;
```
You can see the overall structure `class A : public B` in the code above. This basically means that our derived class A inherits all the public and protected members of class B. More details can be found here, here, and here.
Next, we declare two types of members of our derived class: `public` and `private`. `public` members are those we want the rest of the code to access, and `private` members are only used within the scope of the derived class. Now let's look at the public class members. Note that there is no type declaration for the members `ComputeGyration` and `~ComputeGyration`. They are the class constructor and destructor, which are typically used to set up initial values for certain member variables, as we will see later in `compute_gyration.cpp`. Note that for some compute styles, such as `compute_msd.h`, the destructor is virtual, i.e. `virtual ~ComputeMSD` instead of just `~ComputeMSD`. This is because class `ComputeMSD` is itself inherited by the derived class `ComputeMSDNonGauss`, so you need to declare the base destructor as virtual. Look at this page for more details. Now let's move forward.
```cpp
void init();
double compute_scalar();
void compute_vector();
```
Here the functions `init`, `compute_scalar`, and `compute_vector` are all base class members already declared in `compute.h`. However, they are virtual functions, which means they can be overridden in the derived class, here `ComputeGyration`. This page and this page provide some basic explanations of the use of virtual functions, and the LAMMPS documentation lists some examples of the virtual functions you can override in your derived class. In our case, the gyration computation returns a scalar and a vector, so we need `compute_scalar()` and `compute_vector()`. The private member `masstotal` is a quantity calculated locally; it is only used within the class and not needed by the rest of the code.
compute_gyration.cpp
Now let's look at `compute_gyration.cpp`.
```cpp
#include <math.h>
#include "compute_gyration.h"
#include "update.h"
#include "atom.h"
#include "group.h"
#include "domain.h"
#include "error.h"
```
Here the necessary header files are included. The names of these header files are self-explanatory; for instance, `update.h` declares the functions that update the timestep, etc.
```cpp
ComputeGyration::ComputeGyration(LAMMPS *lmp, int narg, char **arg) :
  Compute(lmp, narg, arg)
{
  if (narg != 3) error->all(FLERR,"Illegal compute gyration command");

  scalar_flag = vector_flag = 1;
  size_vector = 6;
  extscalar = 0;
  extvector = 0;

  vector = new double[6];
}
```
The above code defines what the constructor `ComputeGyration` actually does. `::` is the scope operator; it specifies that the function being defined is a member of class `ComputeGyration` (in this case the constructor, which has the same name as its class) and not a regular non-member function. The structure `ComputeGyration(...) : Compute(lmp, narg, arg)` is called a member initializer list: it initializes the base class `Compute` with the arguments `lmp, narg, arg`. `narg` is the number of arguments provided. `scalar_flag`, `vector_flag`, `size_vector`, `extscalar`, and `extvector` are all flag parameters defined in `compute.h`. For instance, `scalar_flag = 1/0` indicates that we will/won't use the function `compute_scalar()` in our derived class; the meaning of each parameter is explained in `compute.h`. The line `vector = new double[6]` dynamically allocates memory for an array of length 6. Normally the syntax of the new operator is as follows:
```cpp
double *vector = NULL;
vector = new double[6];
```
Here the line `double *vector = NULL;` actually lives in `compute.h` and `compute.cpp`: the pointer `vector` is declared in `compute.h`, and its value is set to `NULL` in `compute.cpp`.
```cpp
ComputeGyration::~ComputeGyration()
{
  delete [] vector;
}
```
The above code specifies the destructor, that is, what will be executed when class `ComputeGyration` goes out of scope or is deleted. In this case, it deletes the gyration tensor array allocated above. The syntax of the delete operator for an array is `delete [] vector`. Details on new and delete can be found here.
```cpp
void ComputeGyration::init()
{
  masstotal = group->mass(igroup);
}
```
This part performs one-time setup, like initialization. The operator `->` is just syntactic sugar: `class->member` is equivalent to `(*class).member`. What `group->mass(igroup)` does is call the member function `mass()` of class `Group`, given the group ID, and return the total mass of that group. How the value of `igroup` is set can be examined in `compute.cpp`; it is the second argument of the compute style.
```cpp
double ComputeGyration::compute_scalar()
{
  invoked_scalar = update->ntimestep;

  double xcm[3];
  group->xcm(igroup,masstotal,xcm);
  scalar = group->gyration(igroup,masstotal,xcm);
  return scalar;
}
```
`invoked_scalar` is defined in the base class `Compute`; its value is the last timestep on which `compute_scalar()` was invoked. `ntimestep` is a member of class `Update` holding the current timestep. The `xcm` function of class `Group` calculates the center-of-mass coordinates; the result is stored in `xcm`. The `gyration` function calculates the gyration of a group, given the total mass and the center of mass of the group. The total mass is calculated in `init()`, and in order for it to be accessible here, it is declared as private in `compute_gyration.h`. Notice that there is no explicit code here to calculate the gyration scalar, because the member function that does this job is already defined in class `Group`; we just need to call it. However, since we also want the gyration tensor, we need to write a function to calculate it ourselves.
```cpp
void ComputeGyration::compute_vector()
{
  invoked_vector = update->ntimestep;

  double xcm[3];
  group->xcm(igroup,masstotal,xcm);

  double **x = atom->x;
  int *mask = atom->mask;
  int *type = atom->type;
  imageint *image = atom->image;
  double *mass = atom->mass;
  double *rmass = atom->rmass;
  int nlocal = atom->nlocal;

  double dx,dy,dz,massone;
  double unwrap[3];

  double rg[6];
  rg[0] = rg[1] = rg[2] = rg[3] = rg[4] = rg[5] = 0.0;

  for (int i = 0; i < nlocal; i++)
    if (mask[i] & groupbit) {
      if (rmass) massone = rmass[i];
      else massone = mass[type[i]];

      domain->unmap(x[i],image[i],unwrap);
      dx = unwrap[0] - xcm[0];
      dy = unwrap[1] - xcm[1];
      dz = unwrap[2] - xcm[2];

      rg[0] += dx*dx * massone;
      rg[1] += dy*dy * massone;
      rg[2] += dz*dz * massone;
      rg[3] += dx*dy * massone;
      rg[4] += dx*dz * massone;
      rg[5] += dy*dz * massone;
    }
  MPI_Allreduce(rg,vector,6,MPI_DOUBLE,MPI_SUM,world);

  if (masstotal == 0.0) return;
  for (int i = 0; i < 6; i++) vector[i] = vector[i]/masstotal;
}
```
The above code does the actual computation of the gyration tensor. Here is the meaning of each variable:
- `x`: 2D array of atom positions.
- `mask`: array of group information for each atom. `if (mask[i] & groupbit)` checks whether atom `i` is in the group on which we want to perform the calculation.
- `type`: type of each atom.
- `image`: image flags of atoms. For instance, a value of 2 means add 2 box lengths to get the unwrapped coordinate.
- `mass`: per-type masses of atoms.
- `rmass`: masses of finite-size particles (which can have rotational motion). Notice that the mass of such a particle is set by density and diameter, not directly by mass; that is why there are two variables, `rmass` and `mass`. To extract the mass of atom `i`, use `rmass[i]` or `mass[type[i]]`.
- `nlocal`: number of atoms on this processor.

Look at the line `domain->unmap(x[i],image[i],unwrap)`: `domain.cpp` shows that the function `unmap` returns the unwrapped coordinates of an atom in `unwrap`. The following lines accumulate the gyration tensor. The MPI call `MPI_Allreduce(rg,vector,6,MPI_DOUBLE,MPI_SUM,world)` sums all six components of `rg` calculated by each processor, stores the result in `vector`, and then distributes `vector` to all processors. Refer to this article for details.
Here are two good articles about understanding and hacking the LAMMPS code.
Installing `numpy` through `pip` won't link `lapack` correctly. However, since Deepthought 2 has the `openblas` software available, we can build `numpy` manually to link it against the LAPACK/`openblas` library.
Download numpy

```shell
pip download numpy
```

or

```shell
git clone https://github.com/numpy/numpy.git
```
Deepthought 2

Load the multithreaded version of `openblas`:

```shell
module load openblas/0.2.15/gcc/4.9.3/pthreads
```
Load `gcc`:

```shell
module load gcc/4.9.3
```

Load `lapack`:

```shell
module load lapack/3.4.2
```
Edit `site.cfg` to link `openblas`

Find the `[openblas]` section and edit it as follows:

```
....
[openblas]
libraries = openblas
library_dirs = /cell_root/software/openblas/0.2.15/gcc/4.9.3/pthreads/sys/lib
include_dirs = /cell_root/software/openblas/0.2.15/gcc/4.9.3/pthreads/sys/include
runtime_library_dirs = /cell_root/software/openblas/0.2.15/gcc/4.9.3/pthreads/sys/lib
....
```
Build numpy

Check the configuration first:

```shell
python setup.py config
```

The output should look something like this:

```
....
openblas_info:
  FOUND:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/opt/OpenBLAS/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
  FOUND:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/opt/OpenBLAS/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
....
```
Then build and install:

```shell
python setup.py build && python setup.py install
```
Use these two scripts to test performance. To test it properly, use a batch script like this:

```tcsh
#!/bin/tcsh
#SBATCH -n 1
#SBATCH -t 00:01:00
setenv OMP_NUM_THREADS 4
source ~/.virtualenv/py2env/bin/activate.csh
python test_numpy_script.py
```

`setenv OMP_NUM_THREADS 4` tells `openblas` to use four threads.
These instructions are based on this.
I have a book list to finish reading before taking my Political Analysis comprehensive exam this May. Interpreting NAFTA is the first book I finished, and it is a fascinating one. This book not only tells vivid stories behind the negotiation of NAFTA, it also provides a multilevel framework we can use to analyze political decisions. For those who have read this book, I hope you enjoyed it as much as I did.
Frederick W. Mayer, Interpreting NAFTA: The Science and Art of Political Analysis, New York: Columbia University Press, 1998.
Ch1 Introduction
1. Goals of the book:
In February of 1990, Mexico (President Salinas) asked the U.S. to consider a bilateral trade agreement, to which the U.S. (President George Bush) agreed. Canada (Prime Minister Mulroney) was not initially interested in participating (they had completed a Canada-U.S. bilateral agreement in 1987 with substantial political costs), but later changed its mind. Negotiations formally began in 1991, lasting until August of 1992 – which meant that the U.S. presidential election campaigns were under way. Candidate Clinton came out in favor of NAFTA, but conditional on strong environmental and labor side agreements (Bush had already made some concessions to these groups, but they had more leverage with Clinton, both because of the election and because he was a Democrat). NAFTA was passed in the U.S. in November of 1993 following a major political battle.
Ch2 A Framework for Political Analysis (most important chapter)
Understanding the history of NAFTA (including why it took the specific form that it did) requires the use of multiple modes/theories and levels of analysis. Different theories and levels are more important at different stages of the process, but we cannot fully explain the observed outcomes without drawing from (usually competing) models of rational choice, institutional process, and symbolic politics – often nested at the international, domestic, and individual levels.
Based on his findings, Mayer argues against the idea that one or another theory or mode of analysis is dominant or most important, and that using them synergistically is the best approach for political analysis in many cases.
Modes of politics (columns) by level of analysis (rows):

| Level of Analysis | Rational Choice (“Interests”) | Institutional Process (“Institutions”) | Symbolic Response (“Ideas”) |
| --- | --- | --- | --- |
| International | Realism | Regime Theory | Epistemic Communities |
| Domestic | Political Economy | Organizational Behavior | Cultural Anthropology |
| Individual | Public Choice & Institutional Economics | Cognitive Psychology | Constructivism, Symbolic Interactionism |
Dimensions of the Analytic Framework
International Level
Focuses on processes in the international arena (politics among states). The dominant theories in International Relations concentrate on this level (states as unitary actors).
Domestic Level
Analyzes international relations as the consequence of domestic politics. “National behavior is determined by the action and interaction of bureaucracies and legislatures, political parties, business, and union lobbies, and other advocacy groups.” (p. 15)
Individual Actors
Focuses on elite actions and the particular worldviews of individuals. This may include theories that concentrate on elite decisionmaking or the individual interaction within groups.
Modes of Politics (assumptions about the fundamental nature of political behavior and political processes)
Rational Choice
Assumes the behavior of actors in the system is determined by rational choice. There are two types of rationality: substantive rationality (actors seek to maximize the same set of core interests) and procedural rationality (actors are strategically rational). This approach seeks to predict how individual choices will aggregate into collective action. Consider the example of NAFTA.
Institutional Process
The behavior of actors is determined by preexisting institutions. “In this concept of politics, rules, norms, routines, and other institutions limit options for action and at least partially predetermine their selection, thus channeling behavior along established paths. Political behavior is less a choice than an execution of programs” (p. 19).
Symbolic Response
Political behavior is a response to the way in which political circumstances are symbolically constructed. People respond to and interpret symbols. The emphasis is on ideas, symbolic representations, and their competing influences. “The essence of politics, therefore, is a contest among competing constructions.” (p. 20).
Section 1 Deciding to Negotiate
Ch3 Why a North American Free Trade Agreement?
Background: In early February 1990, Mexican President Carlos Salinas de Gortari decided to ask the United States to negotiate a free trade agreement with Mexico. US President George Bush agreed, and Canadian Prime Minister Brian Mulroney decided he would like a seat at the negotiating table. In January 1991, the three countries finally agreed to negotiate.
Apply the framework with why NAFTA:
International Level:
Neorealism (international level rational choice theory) assumes (1) states have clear interests, primarily in security; (2) states are rational actors; (3) the international system is anarchic, so that international outcomes are determined solely by the balance of power and the strategic interplay among states.
In this case, neorealism could be used to explain why NAFTA was in the economic interest of all three countries (benefits of free trade), but it can’t explain the timing of the decision to negotiate, or why the road to an agreement mutually beneficial to all three states would be so long and uncertain.
International Regime Theory argues that international institutions matter and that internationally held rules, norms, and organizations affect the choices that nations make. Most International Regime Theory assumes a moderately weak system of international institutions in which countries are rational actors but occasionally bounded by international regimes (much like a repeated Prisoner’s Dilemma game). Regimes can facilitate mutually advantageous cooperation under the right circumstances. This theory helps explain (1) that the three countries were creating a new regime by negotiating; (2) that the timing of NAFTA was influenced by Mexico’s entry into new international regimes when it had to obtain IMF stabilization loans in the 1980s and, therefore, open its economy along mandated lines; and (3) the importance of free trade agreements, which were the regimes of choice in the 1980s and 1990s for increasing cooperation. Neither neorealism nor regime theory can explain the exact timing of the decision to negotiate.
International Symbolic Politics maintains that ideas, ideologies, and worldviews affect the behavior of nations, independent of interests and institutions. This includes various notions including (1) that ideas are potential solutions to collective action problems, (2) symbolic systems establish beliefs about cause and effect (e.g. what we believe is the outcome of trade), (3) ideas frame issues to make some dimension more important than another (e.g. free trade vs. human rights), and (4) ideas and symbols help construct a sense of national interest. This helps to explain Mexico’s decision to seek a freetrade agreement – it was a symbolic step signaling a changed Mexico to an international audience, and a signal to international investors.
Domestic Level:
While much of the decision to negotiate can be explained at the international level, we must turn to the domestic level to understand the key question of why countries must negotiate something that it is in their economic interest to do unilaterally. In fact, the need to negotiate an international agreement can be explained by the need to overcome a domestic collective action problem in international trade. International agreements serve as vehicles for solving domestic political problems by tying the hands of domestic interests. Mexico in particular used NAFTA as a way of committing domestic interests to free market policies.
Individual Level:
There is decision-making by elites. This is a story about the relationship among the three heads of state. Bush was a Texan, comfortable with Mexico, and a free trader; Salinas had a graduate degree from the US and was much bolder than his predecessor in liberalizing the economy; Brian Mulroney was a pro-trade Conservative prime minister who, at the time of the decision to negotiate, was in a politically secure position despite the unpopularity of the Canada-US free trade agreement a few years before (he was no longer PM when the agreement was signed).
Ch4 Domestic Politics Matter: The Fast Track Fight
Background: For Mexico and Canada, once Salinas and Mulroney committed to NAFTA, the decision was made. But for the United States, Bush would need to obtain “fast track” negotiating authority from Congress. “Fast track” is the standard process by which Congress delegates authority to the president to negotiate on its behalf and commits itself both to limited debate and to an up-or-down vote. The 1988 Trade Act had authorized the fast track for three years, so Bush needed to ask for an extension by March 1991. Organizations concerned about workers’ rights, the environment, food safety, family farmers, and other issues raised opposition. In May 1991, Congress granted the president two years of negotiating authority, while Bush had to address the labor and environment issues in the negotiation.
Apply the framework with why Fast Track:
Domestic Level:
Domestic Rational Choice: Interest groups understood quite well the effects of fast track and correctly anticipated the likely outcome of the extension, so their fight over the institution of fast track can be viewed as a rational choice process among competing domestic interests. There were three kinds of interest groups: supporters, especially business groups, who fought hard for it; opponents, like unions, some grassroots environmental organizations, and highly protected sectors; and opportunists, mainly mainstream environmental organizations, who gained leverage to advance issues of concern to them. The politics of the fast track extension can be regarded as a bargain between two players – business and mainstream environmentalists. By linking the fast track extension to environmental problems, environmental groups led the Bush administration to make a modest side payment in exchange for support. The payment took the form of the Action Plan: a promise to address the border issues, to discuss greater cooperation with Mexico on environmental issues, and to put environmentalists on the trade advisory committees.
Domestic Institutions: the choice of rules in one period must be made according to rules established at an earlier period. The congressional decision about Fast Track in 1991 was based on rules laid down in 1988. In the Omnibus Trade and Competitiveness Act, Congress granted the president three years of fast track negotiating authority, and allowed for an extension of two more years. The 1988 process was itself structured by institutions decided upon earlier, most notably in 1974 when the fast track process was invented for the Tokyo Round of the GATT negotiations.
Section 2 International Negotiation
Ch5 TwoLevel Bargaining: The NAFTA Negotiation
Background:
President Salinas, President Bush, and Prime Minister Mulroney hoped to conclude the negotiation by the end of 1991 or early 1992, before the US presidential campaign heated up. But not until August 12, 1992, did the negotiators finally reach agreement, and a new Congress and a new president would decide what to do with it. The NAFTA agreement committed all three countries to eliminating most barriers to trade in goods and services, to opening investment in most sectors of the economy, and to adhering to new rules for intellectual property, government procurement, and dispute settlement (p. 110). However, a few politically privileged industries in all three countries were exempted, e.g. Mexican oil, Canadian cultural industries, and US shipping. Only a narrow class of professionals and businesspersons was free to work across national borders.
Two Stage Analytical Framework: the auto rule of origin
Much of the NAFTA negotiation focused on the rules of origin that established how much of a good needed to be made in North America for it to be considered “North American” and thus qualify for preferential tariff treatment (p. 155). After long and tough bargaining, the negotiators finally settled on 62.5 percent. This number was the result of both an international-level bargain and domestic-level bargains among strong interest groups. For instance, at the US domestic level, the three automakers (GM, Ford, and Chrysler) were interested in a high rule of origin to make it more difficult for European and Japanese competitors to locate assembly plants in Canada or Mexico and thereby ship auto parts to the US duty-free. Based on its own pattern of production and competitive position, each of them set a different threshold (p. 156-158).
Ch6 Making Side Issues Central: The Labor and Environment Negotiations
Background: With Bill Clinton’s election in November, the side issues (environment and labor) became central, since Clinton had courted Democratic supporters sympathetic to them during the campaign. In May 1993, the Clinton administration tabled its first position, one intended to satisfy enough Democrats sympathetic to labor and environmental interests to win Congressional approval, but it was rejected by Mexico and Canada. The negotiation had nearly failed by summer’s end of 1993, when the vast majority of Democrats in Congress were ready to oppose the agreement. Negotiators for all three countries found ways to compromise and finally reached an agreement on August 13, 1993.
Three Stage Analytical Framework: the environment negotiation
The toughness (“teeth”) of the enforcement mechanism for environmental issues is measured from “gum” to “fangs”. At the international level, Mexico preferred no enforcement; Canada preferred some enforcement; and the United States had a preference for sharp teeth. The US preference at the international level reflected the domestic-level bargain between business and mainstream environmental groups. The presidential election between Bush and Clinton changed the domestic game in the U.S. in a way that gave more power to unions and environmentalists, forcing the negotiation of labor and environmental side agreements with “teeth.” Mainstream environmentalists’ preferences at the domestic level reflected the individual-level bargain between members and staff. The mainstream environmentalist leadership did not originally insist on sanction enforcement for the environmental side agreement, but their position changed in part because of the way the issue was framed to the public – which influenced their membership and therefore constrained their actions.
Section 3 The Politics of Ratification
Ch7 Symbolic Politics: Growing Grassroots Opposition
Background: In the fall of 1993, NAFTA faced remarkable breadth of opposition in the United States.
Symbolic politics played against NAFTA at the domestic level in all three countries. Canadian and Mexican fears about losing their identity were particularly strong, although US opinion also soon became highly charged against NAFTA. In the US, citizens’ participation in opposing NAFTA can be explained by ideological consumption overcoming the logic of collective action: individuals received satisfaction from the act of joining, writing, or attending because of some form of ideological consumption (p. 260). But why would they join the opposition? Symbolic politics is powerful when the costs of information acquisition and processing are high and the potential benefits of using symbols to reinforce identities or worldviews are also high (p. 272). As apprehended by union members, grassroots environmentalists, and so on, NAFTA became a story of corrupt politicians and greedy corporations. Opposition to NAFTA was a matter of honor, a moral imperative, and an affirmation of identity (p. 272).
Ch8 Diagnosis and Strategy
Background: NAFTA supporters won the battle.
To diagnose NAFTA’s problem and revise strategy, what is needed is a framework that allows us to assess the nature of Congressional institutions, interest group politics, and symbolic constructions in the broader public; to understand the complex interrelationships among them; and, most importantly, to identify the possible points of leverage in this system (p. 334).