2016-08-28

where's the information?

It came up frequently in discussions this summer (and last): Where is the information (in, say, a spectrum of a star) about some parameter of interest (say, the potassium abundance of the star, or the radial velocity), and how much information is there? The answer is very simple! But the issues can be subtle, because there is only calculable information within the context of some kind of model. And by “model” here, I mean a probability density function for the data, parameterized by the parameters of interest. That is, a likelihood function.

The fast answer is this: The information about parameter θ is related to the (inverse squared) amount you can move parameter θ and still get reasonable probability for the data. The nice thing is that you can compute this, often, without doing a full inference. It is easiest in linear (or linearized) models with Gaussian noise! That's the case we will work through here.

When you have a linear or linearized model with Gaussian noise, there are derivatives of the expectation Y for the data with respect to the parameter of interest, dY/dθ. Here (for now) Y is an N-vector, where N is the size of your data, and θ is a scalar parameter (let's call it the velocity!). So the derivative dY/dθ is an N-vector. The information about θ in the data is related to the dot product of this vector with itself: The precision with which you can measure θ given data with Gaussian noise with N×N covariance matrix C (possibly diagonal if the N data points are independent) is:

$$\sigma_\theta^{-2} = [dY/d\theta]^T \, C^{-1} \, [dY/d\theta]$$

where σ_θ is the uncertainty on θ. That is, the inverse variance on the θ parameter is the inner product of the derivative vectors, where that inner product uses the inverse variance tensor of the noise in the data as its metric! Here we have implicitly assumed that the vectors are column vectors. When the N data points are independent, the C matrix is diagonal, as is its inverse. Note the units too: The inverse variance tensor has inverse Y-squared units, and the inner product uses the derivatives to change this to inverse θ-squared units.
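
As a minimal sketch of that inner product in Python: the function name `inverse_variance` and the three-pixel toy numbers below are made up for illustration and do not come from any real spectrum.

```python
import numpy as np

def inverse_variance(dY_dtheta, C):
    """Return sigma_theta^{-2} = dY^T C^{-1} dY for a scalar parameter theta."""
    dY = np.asarray(dY_dtheta, dtype=float)
    # Solve C x = dY instead of forming the inverse of C explicitly.
    return float(dY @ np.linalg.solve(C, dY))

# Toy example (assumed values): N = 3 independent data points.
dY_dtheta = np.array([1.0, 2.0, 0.0])        # Y-units per theta-unit
C = np.diag(np.array([0.5, 0.5, 0.5]) ** 2)  # noise variances, Y-units squared

info = inverse_variance(dY_dtheta, C)        # inverse theta-squared units
sigma_theta = info ** -0.5                   # expected uncertainty on theta
print(info, sigma_theta)
```

Note that the third data point has zero derivative and so adds nothing to `info`, however small its noise is.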

(When there are multiple parameters in θ—say K parameters—the inner product generalizes to making a K×K inverse covariance matrix for the parameter vector, and the expected variance on each parameter is obtained by inverting that inverse variance matrix and looking at the diagonals.)
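
The multi-parameter version can be sketched the same way. Here the N×K matrix `A` holds the derivatives of Y with respect to each of the K parameters; the name `parameter_uncertainties` and the straight-line toy model (an offset and a slope, with unit noise) are assumptions chosen only to keep the example small.

```python
import numpy as np

def parameter_uncertainties(A, C):
    """Return the K x K inverse covariance F = A^T C^{-1} A and the
    expected one-sigma uncertainty on each of the K parameters."""
    F = A.T @ np.linalg.solve(C, A)
    cov = np.linalg.inv(F)          # invert the inverse variance matrix
    return F, np.sqrt(np.diag(cov))

# Toy linear model Y = a + b x at three points (assumed values).
x = np.array([0.0, 1.0, 2.0])
A = np.stack([np.ones_like(x), x], axis=1)   # columns: dY/da, dY/db
C = np.eye(3)                                # independent, unit-variance noise

F, sigmas = parameter_uncertainties(A, C)
# What you would get by (wrongly) ignoring the other parameter:
naive = 1.0 / np.sqrt(np.diag(F))
print(sigmas, naive)
```

The uncertainties from the inverted matrix are never smaller than the `naive` ones; the difference is the price of not knowing the other parameter.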

But we started with the question: Where is the information in the data? In this case, it means: Where in the spectrum is the information about the velocity? The answer is simple: It is where the data (or really the inverse variance tensor for the noise in the data) makes large contributions to the inverse variance computed above for θ. You can think of splitting the data into fine chunks, and asking this question about every chunk; the chunks or pixels or data subsets that contribute most to the scalar inverse variance are the subsets that contain the most information about θ.
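
To make the pixel-by-pixel question concrete, here is a hedged sketch for independent pixels, where each pixel contributes (dY/dθ)² divided by its noise variance. Everything specific is an assumption for illustration: a single Gaussian absorption line, the wavelength grid, the noise level, and a finite-difference derivative with respect to velocity.

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s

def model(wave, v, center=5000.0, depth=0.5, width=0.2):
    """Continuum-normalized spectrum with one Gaussian absorption line,
    Doppler shifted by velocity v (km/s). All defaults are made up."""
    shifted = center * (1.0 + v / C_KMS)
    return 1.0 - depth * np.exp(-0.5 * ((wave - shifted) / width) ** 2)

wave = np.linspace(4998.0, 5002.0, 401)   # hypothetical pixel grid (Angstrom)
sigma = np.full(wave.shape, 0.01)         # assumed per-pixel noise

# Centered finite difference for dY/dv at v = 0.
dv = 0.01  # km/s
dY_dv = (model(wave, dv) - model(wave, -dv)) / (2.0 * dv)

contrib = dY_dv ** 2 / sigma ** 2         # per-pixel inverse variance on v
total = contrib.sum()                     # diagonal-C inner product
best = wave[np.argmax(contrib)]           # most informative pixel
print(best, total ** -0.5)
```

In this toy setup the information sits on the steep flanks of the line, not at the line center (where the derivative vanishes) and not in the continuum.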

2016-08-14

walk or take the elevator?

I'm just generally excited about getting back into the classroom after a long sabbatical. I'm thinking about problem-set problems for the Physics Majors. Here's what's in my head right now:

NYC has had a hot summer, with most buildings running air conditioning on a thermostat continuously. To save energy, NYU (and other large entities in NYC) asked their employees to conserve energy in various ways, some of which we might take issue with. Here's an uncontroversial one: You should take the stairs, not the elevator.

But is that uncontroversial? What considerations are required to figure out whether this policy would reduce or increase energy consumption? Obviously—if you take the stairs—you use less elevator energy, but then you drop a metabolic load on the building air-conditioning. Which uses more power in the end? Use a combination of web research and simple physical arguments to make cases, and identify weaknesses in your argument as you change assumptions. Things that matter include: Neither humans nor elevators are 100-percent efficient delivery vehicles for potential energy (in fact, can you see a fundamental argument that elevators must spend more than 50 percent of their energy generating heat?). Elevators are heavy but counter-weighted. Some buildings have very busy elevators, so your contribution to the elevator load is only the marginal contribution; in other buildings you are typically the only person in the elevator. Air conditioning systems have efficiencies limited by fundamental ideas in thermodynamics, but are probably much less efficient than the limits. And so on!
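
One way to organize the bookkeeping is sketched below. It is a starting point for the argument and not an answer: the function names are invented, every number is a placeholder to be replaced by web research, and it ignores counterweights, the trip back down, and basal metabolism.

```python
G = 9.8  # gravitational acceleration, m/s^2

def stairs_ac_energy(mass_kg, height_m, human_eff, cop):
    """Electrical energy (J) the air conditioning spends removing the
    waste heat a person generates while climbing."""
    potential = mass_kg * G * height_m
    waste_heat = potential / human_eff - potential
    return waste_heat / cop

def elevator_energy(mass_kg, height_m, elevator_eff, cop):
    """Marginal electrical energy (J) to lift one passenger, plus the
    air-conditioning energy to remove the elevator's own waste heat."""
    potential = mass_kg * G * height_m
    electrical = potential / elevator_eff
    waste_heat = electrical - potential
    return electrical + waste_heat / cop

# Placeholder inputs (assumptions, not measurements).
stairs = stairs_ac_energy(mass_kg=70.0, height_m=10.0, human_eff=0.25, cop=3.0)
lift = elevator_energy(mass_kg=70.0, height_m=10.0, elevator_eff=0.5, cop=3.0)
print(stairs, lift)
```

The comparison flips or holds depending on the efficiencies and the air-conditioning coefficient of performance you put in, which is exactly where the weaknesses in the argument live.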

Thanks to Andrei Gruzinov (NYU) for starting me thinking about this one.

2016-02-07

Syllabus (the book)

I just read Syllabus by Lynda Barry. It is a set of syllabi and daily in-class instructions, along with some reflections, from a few years of teaching cartooning and writing at Wisconsin. It is hilarious and insightful, both about learning to draw and cartoon and about the practice of teaching.


A page from Lynda Barry's Syllabus.

The syllabi and daily exercises include many great practices. For example, when students are listening to something (being read or previously recorded), she has them either draw tight spirals in their notebooks or else color in a line drawing. She has the students sketch in non-photo blue pencil and ink in later; it reduces inhibitions. She has simple forms for students to write daily diary or journal entries so that they make close observations of the world, in words and pictures. She loves most the students who have never drawn (since childhood), and she makes exercises that capitalize on their newness (and defeat those who are more experienced at rendering), like forcing them to do drawings in increasingly short time intervals.

The book is hilarious in part because she shows lots of great student work (it is not clear she has proper permissions here, because she does not individually credit each student drawing), and because the whole book is a collage of drawings, writings, and found objects, in the Lynda Barry style. I doubt there is another book out there on teaching that makes you laugh out loud all the way through.

A great question for me is what of Barry's practices could carry over to a class in Physics?

2015-09-14

Make it Stick

I just read Make it Stick, about research-based results in how people learn and the implications for education. I loved it; it is filled with simple, straightforward ideas that will be useful in the classroom. For example, it is better to do many low-stakes quizzes than a few high-stakes exams. For another, it isn't useful for students to re-read the textbook, and it is useful for the lectures and the textbook to be misaligned. For another, students' perception of their learning is often wrong and misguided. For another, it is useful to interleave topics and not just do “massed practice”. It points out that it is very adaptive for learners to believe that their brains are plastic and their abilities are not innately limited. Luckily this also appears to be true. All of these things will come into my next pre-health (or other big) class. And I will explicitly explain to the students why.

My quibbles with the book are few. One is that they slag off unschooling, and then immediately follow with a long profile of Bruce Hendry, who is a perfect example of the power of unschooling (he is entirely self-taught through self-directed projects of great importance to himself!). I also found the writing repetitive and a bit slow. But the book is filled with good ideas. Also, it is not just informative, it is responsible: The authors clearly differentiate between research findings and speculations or over-generalizations of them. This is a great contribution to the literature on teaching and learning.

2015-01-27

emission lines from stars

At the end of Mike Blanton's brown-bag talk at NYU yesterday, Matt Kleban asked: Why don't stars produce emission lines; why only absorption lines? Maryam Modjaz said "because they are hotter on the inside and cooler on the outside". That's true! But it is slightly non-trivial to see why the consequence is always absorption-lines only. And does it mean that if the stars were cold, condensed objects bathed in a hotter radiation field, they would produce emission lines? (I think the answer here might be "yes"; think of a gas cloud bombarded with ionizing radiation.) Also Kleban pointed out that actually the very outside of the Sun is in fact hotter than the surface, which is true, but it must be that this is just so optically thin it barely matters.

In some ways, the biggest paradox about stars is that they aren't all the same temperature: After all, the "surface temperature" of a star is the temperature around the place where the photosphere becomes optically thin; shouldn't this be around 10,000 K for all stars? That's the temperature around which hydrogen atoms recombine (see, for example, the CMB). I don't know any simple answer to this paradoxical question; from my (outsider) perspective it seems like the answer is always all about detailed atomic physics.

2014-11-14

oscillations and the metric

In class, I was solving the normal-mode problem for a solid object near equilibrium, using generalized coordinates, in the usual manner. This starts by orthogonalizing the coordinates to make (what I call) the "mass tensor" (the tensor that comes in to the quadratic kinetic-energy term) proportional to (or identical to) the identity. This operation was annoying me: Why do we have to get explicit about the coordinates? The whole point is that the coordinates are general and we don't have to get specific about their form!

In my anger, I solved the problem without this orthogonalization. It turns out that this solution is easier! Of course it is: I can do everything with pure matrix operations.

I had two other in-class epiphanies about the problem. The first is that the solution you get when you don't do the orthogonalization is more analogous to the simple one-dimensional problem in every way. The second is that, in a D-dimensional problem with D generalized coordinates, the tensor that goes in to the kinetic energy term is some kind of spatial metric for a D-dimensional dynamical problem. (Or proportional to it, anyway.) That is simultaneously obvious and deep.
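
Here is a sketch of what "pure matrix operations" might look like, under my own assumptions about the setup: with mass tensor M and potential-energy (stiffness) tensor K in arbitrary generalized coordinates, the normal modes satisfy K a = ω² M a, so the squared frequencies are the eigenvalues of M⁻¹K, the matrix analog of ω² = k/m. The function name `normal_modes` is invented, and the test case is the textbook equal-mass, equal-length double pendulum in units where g/l = 1.

```python
import numpy as np

def normal_modes(M, K):
    """Solve K a = omega^2 M a without first transforming M to the identity.
    Returns squared frequencies (ascending) and mode vectors as columns."""
    omega2, modes = np.linalg.eig(np.linalg.solve(M, K))
    order = np.argsort(omega2.real)
    return omega2.real[order], modes.real[:, order]

# Small-angle double pendulum: the mass tensor is not diagonal in the
# natural angle coordinates (units with m = l = g = 1).
M = np.array([[2.0, 1.0],
              [1.0, 1.0]])
K = np.array([[2.0, 0.0],
              [0.0, 1.0]])

omega2, modes = normal_modes(M, K)
print(omega2)
```

The mode vectors come out in the original coordinates, with no coordinate change to undo afterwards.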

2014-10-22

many-body systems; composite objects

Every time I teach mechanics (and this is something like the 21st year I have taught it at the undergraduate level) I learn something new. This week we are talking about many-body systems; I had two epiphanies (both trivial, but still): The first is that the description of the object in terms of a center-of-mass vector and then many difference vectors away from the center of mass (one per "atom") is purely a coordinate transform. Indeed, it is just a generalized coordinate system that is related to the Newtonian coordinates by a holonomic transformation. Awesome! So when the Lagrangian separates into external and internal terms, this is just a result of the appropriateness of that transformation.
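
For the kinetic term, the separation can be written out in a few lines. The symbols are my own choices for illustration: R is the center-of-mass vector, r'_i is the difference vector for atom i, and M is the total mass.

```latex
\begin{align}
  \mathbf{r}_i &= \mathbf{R} + \mathbf{r}'_i ,
  \qquad M \equiv \sum_i m_i ,
  \qquad \sum_i m_i\,\mathbf{r}'_i = 0 \\
  T &= \frac{1}{2}\sum_i m_i\,\bigl|\dot{\mathbf{R}} + \dot{\mathbf{r}}'_i\bigr|^2 \\
    &= \frac{1}{2}\,M\,\bigl|\dot{\mathbf{R}}\bigr|^2
     + \dot{\mathbf{R}}\cdot\sum_i m_i\,\dot{\mathbf{r}}'_i
     + \frac{1}{2}\sum_i m_i\,\bigl|\dot{\mathbf{r}}'_i\bigr|^2 \\
    &= \frac{1}{2}\,M\,\bigl|\dot{\mathbf{R}}\bigr|^2
     + \frac{1}{2}\sum_i m_i\,\bigl|\dot{\mathbf{r}}'_i\bigr|^2
\end{align}
```

The cross term vanishes because it is the time derivative of the constraint that defines the center of mass. That same constraint means the difference vectors are not all independent.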

The second is that the definition of the many-body system is completely arbitrary. It should be chosen not on the grounds of being bound or solid or connected but rather on the grounds of whether choosing it that way simplifies the problem solution. Both of these realizations are simple and obvious, but it took a lot of teaching for me to get them fully. I am reminded as I realize these things that the physics concepts we expect first-year undergraduates to manipulate and be comfortable with are in fact pretty damned hard.