We've Moved In!

January 17th, 2012

About two weeks ago, I had the electricity turned on at the new apartment and I set up a new Charter Cable Internet account so there wouldn't be a gap in Internet service between moving out of the old apartment and moving into the new one. After spending several days cleaning the dust out of the new apartment, I began moving some of my books and book cases into the new apartment.  I rented a U-Haul truck on Thursday (12/12/2012). Melissa and I moved the bulk of our stuff out of the old apartment and into the new apartment and storage unit. I spent the next few days thoroughly cleaning the old apartment and moving the remainder of our belongings.  Everything was finished by yesterday afternoon! Now we have to unpack and organize things. I should find out how much of my security deposit will be returned within 20 days.

The new place is small and the plumbing and wiring leave a lot to be desired (the bathroom sink supplies a mere trickle of cold water comparable to a water pistol and our small heat pump shares the same circuit as half of the outlets in the apartment, so it's quite easy to accidentally trip the breaker), but it is relatively quiet!  There's no loud dog barking endlessly next door, the train is less noisy here, there is no roar of freeway traffic 24/7, there are no gamers playing FPS video games, and there is no crazy pot head downstairs laughing maniacally at all hours of the day and night and polluting our apartment with smoke. I've been resting better without the noise.  Saving several hundred dollars per month on rent is also a major benefit. I may actually be able to earn enough from tutoring to pay my half of the rent.

Side notes:

I put my three containers of Irish moss (pearlwort or sagina subulata) in the yard last night temporarily until I could figure out what to do with them.  A dog apparently got in the yard during the night or early morning and tore one of the trays apart. So now I know that it's not safe to put my bonsai in the back yard. The fence apparently does nothing to protect the yard from dogs. I may plant some little bits of the Irish moss in the yard to see if it survives...

My assigned parking spot is directly beside the dumpster.  This seems like a recipe for trouble.

Among the chaos of moving, I managed to apply for UCR's Dissertation Year Program (DYP) Fellowship. The fellowship is merit-based and more than one person in the Physics Department will receive it this year, so I have a decent chance of success.

Wikipedia and a few other websites will go dark tomorrow in protest of the horrible SOPA and PIPA legislation.

 

Sunrise is Compiled!

December 30th, 2011

I have finally managed to compile the latest (default) version of Sunrise! I had to tinker with it periodically over the course of a few days, but I finally managed to do it.  The current version of the code in the Mercurial repository has evidently been designed to compile only with Arepo support, but I do not have Arepo and I will not be using Arepo for my project.  It seems that compiling the code without Arepo is planned, but not fully implemented in the current version.  There are several errors that occur when Arepo is absent.  In order to get the code to compile, I had to extensively edit the pre-processor directives.

There may be a cleaner  way of doing this, but this works:

First, in the config.h header, I had to uncomment the line:

#undef WITH_AREPO

Then, I wrapped each block of code containing the word "arepo" in

#ifdef WITH_AREPO

blah blah arepo blah

#endif

Some (but not enough) pieces of code were already wrapped in this way. In at least two source files, it was done incorrectly (ir_grid.cc and aux_grid.cc). Once I had properly prevented all Arepo-specific code from being compiled, the compilation was successful.  I now have the three executables sfrhist, mcrx, and broadband.  I haven't tried running the exucutables yet---except for the very simple test of seeing whether they ask for input files.  I also haven't attempted to compile with CUDA support, but both of these tasks should be completed soon and I'll find out if there are any run-time errors in this version...fun!

 

And Now, A Two-Point Cubic Spline

November 11th, 2011

The simple quadratic spline discussed my previous post seems to be sufficient for interpolating particle positions between two GADGET-2 snapshots for the purposes of making animations, but it's also possible to use a cubic spline for the task.  The advantages of using a cubic spline are (1) it can be more accurate---especially in cases where the acceleration changes sign once between snapshots, and (2) the derivatives at the end-points of the spline match the derivatives specified in the input data. In the case of the quadratic spline, if the acceleration (second derivative) changes sign even once during the interval, the interpolation is poor and the derivative of the spline at the end points is not guaranteed to match the specified value. Thus, in the case of the quadratic spline, the derivative is generally discontinuous when moving from one interval to another. This doesn't matter too much when making animations with reasonably small intervals, but it would be important for certain other situations. The disadvantage of the cubic spline is that it adds complexity.

Cubic spline interpolation seems to be quite popular. Derivations of the general method can be found in most textbooks on numerical methods. I'll just discuss the specific result that's useful for my application---the cubic spline in terms of the end-point positions and velocities. To derive the result, we just have to slightly modify the derivation given in the previous post. We add a jerk term \mathbf{j}(t) (third derivative of the position with respect to time) to the third and fourth Taylor expansions,

\mathbf{r}_{0}\approx\mathbf{r}(t)+(t_{0}-t)\mathbf{v}(t)+\frac{1}{2}(t_{0}-t)^2\mathbf{a}(t)

\mathbf{r}_{1}\approx\mathbf{r}(t)+(t_{1}-t)\mathbf{v}(t)+\frac{1}{2}(t_{1}-t)^2\mathbf{a}(t)

\mathbf{v}_{0}\approx\mathbf{v}(t)+(t_{0}-t)\mathbf{a}(t)+\frac{1}{2}(t_{1}-t)^2\mathbf{j}(t)

\mathbf{v}_{1}\approx\mathbf{v}(t)+(t_{1}-t)\mathbf{a}(t)+\frac{1}{2}(t_{1}-t)^2\mathbf{j}(t)

and solve the system of equations for \mathbf{r}(t). There are, of course, several ways of doing this. I'll skip the details and show the end result. I defined

\delta t \equiv t_1-t_0,\qquad \tau_0 \equiv t_0-t,\qquad \tau_1 \equiv t_1-t

In terms of these definitions,

\displaystyle\mathbf{r}(t)=\frac{\mathbf{r}_0\tau_1^2+\mathbf{r}_1\tau_0 ^2+\tau_0\tau_1(\mathbf{v}_1\tau_0 ^2-\mathbf{v}_0\tau_1^2)/\delta t}{\tau_1^2+\tau_0^2}

This can be written more compactly as

\displaystyle\mathbf{r}(t)=\frac{\mathbf{r}_0+\rho \mathbf{r}_1+(\rho \mathbf{v}_1-\mathbf{v}_0)\tau_1\tau_0/\delta t}{\rho+1}

where

\rho\equiv\frac{\tau_0^2}{\tau_1^2}=\frac{(t_0-t)^2}{(t_1-t)^2}

As an example, the cubic function generated by this technique when given values corresponding to \sin(x) at x=0 and x=2\pi is

\displaystyle s(x)=\frac{(2\pi-x)(\pi-x)x}{x^2-2\pi x+2\pi^2}

In the plot below, \sin(x) is the blue curve and s(x) is the dotted black curve.

Quadratic Spline Interpolation

November 10th, 2011

While animating my first GADGET-2 galaxy mergers, I discovered that setting the frame rate to the common value of 25 frames per second resulted in movies that were too fast. The complexity of the merger process couldn't be appreciated--even though the snapshot interval was quite small (0.005 simulation time units). I had to interpolate the GADGET-2 snapshots in order to produce intermediate frames that could be used to slow the animation down. Linear interpolation didn't work well enough to meet my standards, so I looked for an efficient quadratic interpolation scheme. I wanted to perform a quadratic interpolation using only two snapshots at a time (a quadratic interpolation using two snapshots is possible in this case because each snapshot contains particle position and velocity information). The first thing that came to mind was simply using the familiar expression

\displaystyle\mathbf{r}(t)=\mathbf{r}_0+\mathbf{v}t

to take a half-step forward from the first snapshot and a half step backward from the second snapshot and then average the two results together to estimate the likely position of each particle in the snapshot like this:

\displaystyle\mathbf{r}_{1/2}=\frac{1}{2}(\mathbf{r}_0+\mathbf{r}_1)+\frac{1}{2}\left(\mathbf{v}_0\frac{\delta t}{2} - \mathbf{v}_1\frac{\delta t}{2}\right)

Where \delta t is the time interval between snapshots 0 and 1, \mathbf{r}_{1/2} is the position of a particle at time \delta t/2 after snapshot 0, \mathbf{r}_0\mathbf{r}_1, \mathbf{v}_0, and \mathbf{v}_1 are the positions and velocities of the particle at snapshot 0 and snapshot 1. This method is slightly better than linear interpolation, but not as good as using a quadratic spline. I tested it out and the results were disappointing. I searched my computational physics and numerical analysis textbooks and searched the web, but I couldn't find what I was looking for, so I derived the method myself. My final result was

\displaystyle\mathbf{r}(t)=\frac{1}{\delta t}\left[\mathbf{r}_0(t_1-t)-\mathbf{r}_1(t_0-t)\right]+\frac{1}{2\delta t}\left(\mathbf{v}_1-\mathbf{v}_0\right)(t_1-t)(t_0-t)


This is very closely related to a quadratic Bézier curve. The derivation follows:

The Taylor expansion of a function f(x) about the point x=x_0 is

\displaystyle f(x)=f(x_0)+(x-x_0)f'(x_0)+\frac{1}{2}(x-x_0)^2 f''(x_0)+\cdots

It is possible to approximate the value of the function at x_0 using a Taylor expansion by swapping x with x_0 in the expansion above and truncating the series:

\displaystyle f(x_0)\approx f(x)+(x_0-x)f'(x)+\frac{1}{2}(x_0-x)^2 f''(x)

Writing expansions for \mathbf{r}_0, \mathbf{r}_1, \mathbf{v}_0 and \mathbf{v_1},

\mathbf{r}_{0}\approx\mathbf{r}(t)+(t_{0}-t)\mathbf{v}(t)+\frac{1}{2}(t_{0}-t)^2\mathbf{a}(t)

\mathbf{r}_{1}\approx\mathbf{r}(t)+(t_{1}-t)\mathbf{v}(t)+\frac{1}{2}(t_{1}-t)^2\mathbf{a}(t)

\mathbf{v}_{0}\approx\mathbf{v}(t)+(t_{0}-t)\mathbf{a}(t)

\mathbf{v}_{1}\approx\mathbf{v}(t)+(t_{1}-t)\mathbf{a}(t)

Eliminating \mathbf{a}(t) using the third and fourth equations:

\displaystyle\mathbf{a}(t)\approx\frac{\mathbf{v}_{1}-\mathbf{v}(t)}{(t_{1}-t)}\approx\frac{\mathbf{v}_{0}-\mathbf{v}(t)}{(t_{0}-t)}

The first two equations then become

\mathbf{r}_{0}\approx\mathbf{r}(t)+\frac{1}{2}(t_{0}-t)[\mathbf{v}_{0}+\mathbf{v}(t)]

\mathbf{r}_{1}\approx\mathbf{r}(t)+\frac{1}{2}(t_{1}-t)[\mathbf{v}_{1}+\mathbf{v}(t)]

Now we eliminate \mathbf{v}(t) between the two equations

\frac{2\mathbf{r}_{0}-2\mathbf{r}(t)}{(t_{0}-t)}-\mathbf{v}_{0}=\frac{2\mathbf{r}_{1}-2\mathbf{r}(t)}{(t_{1}-t)}-\mathbf{v}_{1}

Solving for \mathbf{r}(t) and defining \delta t=t_1-t_0 yields the result stated above.

This is a decent approximation for the motion of each particle between snapshots. In fact, it's a pretty good approximation for any function with specified end point values and first derivatives as long as the sign of the second derivative doesn't change during the interval.

For demonstration purposes, the blue curve below is \sin(x). The black dashed curve is the interpolating parabola generated by the expression derived above for endpoints at x_0=0 and x_1=\pi. The red dashed curve is the interpolating parabola for endpoints at x_0=0 and x_1=\pi/2

First Simulations

November 8th, 2011

Several months have passed since I last wrote a research log entry.  I should probably write these a bit more frequently.  Much has happened since the previous entry!

  • My HST Cycle 19 Theory proposal was accepted and all of requested funding has been provided!
  • I became the first physics graduate student at UCR to be an instructor for a course (rather than just a teaching assistant). I taught Physics 40A during the summer session.
  • I got married!
  • I finished writing my first paper.  The paper is still being reviewed as I write this, so it's technically not finished because it hasn't been revised, resubmitted, and accepted for publication yet.
  • I compiled P-GADGET-2 and ran several small simulations on Crunch.
  • I've written a code in C++ to read, analyze, manipulate, and output GADGET-2 snapshot files. The code is very basic at the moment, but it should be mostly finished in a few months. I'll use this code for the non-Sunrise component of my thesis project. I plan to write another research log entry describing the first application of the code: quadratic spline interpolation of snapshot files for purposes of making animations.
  • I've built a GADGET-aware version of IFrIT.  This was somewhat more time-consuming than it should have been since there were no explicit instructions on how to build the GADGET plugin into IFrIT. I also managed to get SPLASH to read and analyze / visualize my GADGET-2 snapshots.  I still haven't managed to get Splotch to read my snapshot format though...I'm going to move on to working with Sunrise now.  I've heard that successfully building Sunrise takes some effort.
  • I've started reviewing the core of theoretical physics and I've finally started learning some elementary particle physics / quantum field theory.

For the next few weeks, I'll likely be practicing using Sunrise, writing code, and benchmarking GADGET-2 and Sunrise with various compilers and with different compiler optimization flags. I'm testing GCC 4.4 & 4.6 as well as the latest versions of the Open64, EKOPath, LLVM, Intel, and Oracle compilers.

The video below shows the results of the highest resolution simulation that I've performed. The video shows the evolution of the stars and gas during the simulation from two different view-points.  The AGN feedback is quite obvious. I've also made animations of the evolution of the dark matter during the simulation; they look exactly like the dissipationless mergers that I analyzed in my qualifier project and first paper.

 

Density profile calculator

February 24th, 2011

I had to find an efficient way to plot the density profiles of spherical systems of point particles of varying mass.  Here's a summary:

First note that the mass enclosed in a sphere of radius r is

\displaystyle M(r)=4\pi\int_0^r\rho(r')r'^2 dr'

I'm trying to find \rho(r), so I'd like to get rid of the integral sign. Thus I differentiate with respect to r

\displaystyle \frac{dM(r)}{dr}=4\pi \rho(r)r^2

then 

\displaystyle \rho(r)=\frac{1}{4\pi r^2}\frac{dM(r)}{dr}.

The task has been reduced to the problem of computing {dM(r)}/{dr}, which is rather straightforward.  If a fast sorting algorithm, such as quicksort, is used to sort the particles by distance from the center of the spherical distribution (i.e. radius), then a mass array M(i) can be quickly constructed such that each entry i is the mass contained between r and r+i\delta r for 0\leq r\leq R, where R=n\delta r and n is the number of elements in the array. Then a finite differencing scheme can be used to compute the derivative {dM(r)}/{dr} at each point. Finish by properly handling the case for which the derivative at i is vastly different from the derivatives at the surrounding points i-1 and i+1. This problem can be mitigated by using (at least) a central differencing scheme and choosing an appropriate value of \delta r; it should be no smaller than the resolution limit of the particle distribution in the region where the density is being computed/plotted.

There are, of course, several other ways of computing the density profile.  In particular, if the potential has already been computed everywhere, then the density distribution can be inferred from that. The method described above seemed to be the best method for my particular code since the quicksort algorithm typically scales as \mathcal{O}(N\log N) and all of the other operations either scale as \mathcal{O}(N) or \mathcal{O}(n).

UPDATE:  Here's a plot of the density profile using only 10,000 particles and no smoothing (blue line) compared to the exact analytic density profile from which the distribution was built (red line).  Density is in arbitrary units and is scaled by the central density, so \rho = \mbox{const} \rho(r)/\rho_c.

When cloud-in-cell smoothing was added (green line below), the result was roughly the same as the un-smoothed version, but with less random variation and some systematic differences; the gradient in the particle number density caused a slight bias toward higher density at larger radii and lower central density. This is expected behavior because the particle distribution has been "blurred" in this case, which softens the density gradient.

Paper in preparation, P-GADGET-2, & Git

February 21st, 2011

Last week I finally obtained a copy of the full-physics version of GADGET-2.  According to Desika Narayanan, it's a version which has been verified to work with Sunrise. Once I'm no longer busy with other things, I'll start working with it.

I have almost finished writing a cycle 19 HST theory proposal. The deadline is in a few days.

I'm in the early stages of working on a paper describing my qualifier project.  After reading through the literature from the last 34 years and thinking about the deficiencies in my project as it stands, I've decided to further modify the code to include several more features.  Yesterday I wrote an energy tracking function and I've begun writing a density profile plotter.  I've also improved the performance of the code somewhat.  Functions for plotting the line-of-sight velocity distribution and for computing the skewness of the distribution will be added shortly.  I may also try writing a hierarchical time-stepping scheme to improve performance further.

I've decided to maintain the qualifier code using a revision control system.  I've used Subversion for projects in the past, but Mercurial and Git seemed to be much better options.  I ultimately decided to use Git.

Git Tutorial,  Git Manual

Git should also come in handy when collaborating on LaTeX documents.

Crunch

February 14th, 2011

Dr. Gillian Wilson, one of my thesis committee members, was kind enough to provide me with funding to build a workstation for my project. I needed a fast shared memory machine with at least 32 GB of RAM in order to run Sunrise efficiently. After configuring several systems on the HP, Dell, Apple, and other websites, I discovered that it would be considerably more cost-effective if I designed and built the machine myself. Ideally, I would have built a system consisting of 2 or 4 of Intel's latest Xeon processors and an Nvidia Tesla c2050 card, but I tried to keep the price as low as possible for the level of performance necessary.  I eventually obtained the following combination of components:

Motherboard: Asus KGPE-D16 Dual Socket G34 AMD SR5690 SSI EEB 3.61
Processors: 2x AMD Opteron 6172 Magny-Cours 2.1 GHz  (24 cores total)
Memory: 16x 4 GB DDR3 1333 unregistered DRAM (64 GB total)
Graphics: EVGA Nvidia GeForce GTX 580 (containing 512 CUDA cores and 1.5 GB GDDR5 )
System disk: 64 GB Crucial RealSSD C300
Data disks: 2x 1.0 TB Western Digital Caviar Black WD1002FAEX
Chassis: Intel 5U Server Chassis SC5650WSNA, with 1000W PSU

Note 1: The Magny-Cours Opteron processors are slower per clock cycle than the current Intel Xeons, however (1) the performance of the Magny-Cours systems scales better with core count than Xeon-based machines, (2) the codes that I will be using scale rather well with thread count, (3) the Opterons are considerably cheaper for a given performance level, and (4) The Magny-Cours Opteron processors allowed me to use a large quantity of inexpensive unregistered memory modules.  The only Xeons that support this amount of unregistered memory are approximately twice as expensive as the Opterons. To summarize, I could have built an equivalent machine with fewer Xeon cores running at a higher clock rate, but such a system would have been considerably more expensive.

Note 2: The GTX 580 graphics card is typically faster than the Tesla c2050 for CUDA as long as the problem can fit in its smaller memory.  The primary deficiency of the GTX is that it was not expressly designed for sustained HPC workloads, and thus it is likely to fail sooner under heavy workloads.  On the other hand, one could buy four GTX 580 cards for less than the price of one Tesla c2050.

I've tentatively named the system "Crunch".  It's running Ubuntu 10.04.2 LTS with the 2.6.38-rc4 Linux kernel.  I'll likely upgrade to the final 2.6.38 kernel when it's released and keep  the kernel version unchanged until I finish my Ph.D. work. I have installed all of the development tools, compilers, and libraries that I think I'll need, including the latest Nvidia driver and CUDA Toolkit.  I have successfully compiled and run sample CUDA programs from the CUDA SDK, as well as my qualifier project code, the ray-tracing code POV-Ray 3.7, and benchmarks from the Phoronix Test Suite. I also installed Apache2, PHP5, MySQL, and many modules so that the system can be used as a web server. The latest version of WordPress has been installed on the web server.  The server could potentially be used for data-sharing.

Here are the outputs of bandwidthTest, deviceQuery, hwinfo, lshw, lsmod, lspci, and  lsusb.

For a listing of all software installed from the Ubuntu repositories (i.e. all pre-compiled packages), click here.

Here's a screenshot from the machine's first parallel code execution:

Running my qualifier project code remotely and viewing the gnome-system-monitor using ssh -X.

Some images taken during the assembly process:

The processors (top-side)

Processors (under-side)

16 DIMMs

GTX 580

A populated server board

Heatsink and fan

Almost finished

The finished product

Beamer

December 11th, 2010

I used the Beamer \LaTeX class to create the presentation file that I used for my advancement exam.  Beamer is an easy to use tool that makes very nice presentations.  It's much less of a hassle and yields better results than OpenOffice Impress or MS PowerPoint. The source to my presentation can be found here:

beamer-presentation-source.zip

The full source, including the videos that I showed during the presentation, can be found here:

beamer-presentation-full.zip

Hopefully these can be of use to someone who is learning to use Beamer. Having examples is a good way of learning. To compile the document, run the shell script called compile.sh or manually issue the commands that you find in the script. Alternatively, a \LaTeX editor such as Kile can be used to compile the document as long as you tell it to use pdflatex.

The output PDF will end up looking like this.

Simulation Lectures

May 3rd, 2010

I designed a series of three lectures for my fellow astronomy / astrophysics students here at UC Riverside.  The purpose of the lectures was two-fold:

  1. To introduce the students to the essentials of N-body methods and computational fluid dynamics.
  2. To motivate me to study the subject in more detail and breadth.

Having finished preparing for the lectures, I can say that the second goal has definitely been met.  Two of the three lectures were presented last week and the feedback has been positive, so it seems that the first goal was also met. Tomorrow is the third and last lecture of the series. After that, I will be able to spend time focusing on my research project once again!

The PDF presentation can be found at www.idius.net/lectures/N-body-methods.pdf.