Thursday, March 23, 2017

Normal Science: Chapter 2 of Kuhn

I've been going through Thomas Kuhn's book/essay The Structure of Scientific Revolutions, and I'm sharing my thoughts as I go (Chapter 1). Chapter 2 is about what Normal Science is and how it emerges from the general confusion that precedes the first settled scientific paradigm in a given field. The chapter feels a bit like it's set up chronologically backwards: it starts from what being inside a paradigm is like, then goes back to how the paradigm forms, and finally talks a bit about what doing science pre-paradigm is like.

I think I'll go in chronological order instead, using an example like the early study of electricity. By "early" I don't mean going back before Newtonian mechanics became a thing, but early enough that we really only know that a few strange things are happening and have no clue why. So actually, let's back up one more step and say that you don't want to study electricity; instead, you want to study why static shocks happen. After all, "studying electricity" implies that you already know a whole bunch of events and data are connected. In your study of static shocks, you learn that rubbing certain materials causes shocks more than others, sometimes. You learn that certain combinations work better. You also might realize that shocks come off of batteries (or rudimentary things like them). But now you have to try to put all of this together. Here are some of Kuhn's thoughts:
In the absence of a paradigm or some candidate for paradigm, all of the facts that could possibly pertain to the development of a given science are likely to seem equally relevant. As a result, early fact-gathering is a far more nearly random activity than the one that subsequent scientific development makes familiar. Furthermore, in the absence of a reason for seeking some particular form of more recondite information, early fact-gathering is usually restricted to the wealth of data that lie ready to hand.
So before you get started, you have no idea what could affect your measurements of static. Time of day, how dusty your floor is, what shoes you wear, and whether you hold your pinky out or not are all variables that could somehow matter. In the case of static, air moisture actually does matter and was probably really hard to control or even measure. I can imagine a number of scientists trying to discern esoteric patterns in the day-to-day (or year-long) fluctuations caused by moisture in the air. Suppose, after years and years of data collection and analysis, they finally did figure it out without a more general theory of electricity. All that analysis would tell them is that moisture in the air affects static shocks. I suppose there could be some benefit there, but from a modern perspective it seems like a big distraction.

Also, without a paradigm, you have no way of agreeing on the relevance of data collected by someone else studying the same thing. If any variable could matter, then it's impossible to report your results completely. Did you collect your data over multiple days? Did you include the moisture level? If I'm someone who thinks moisture is incredibly important, and all you want to talk about is the materials you rub together to make static shocks, you may not have taken the time and energy to collect the moisture data (or the data that some third person really cares about). If one hundred people all think something different matters to the creation of static shocks, then collecting the right data to discuss becomes practically impossible.

So now we see we need some way to agree on what's relevant and what isn't. To do that we need a paradigm. More Kuhn:
Men whose research is based on shared paradigms are committed to the same rules and standards for scientific practice. That commitment and the apparent consensus it produces are prerequisites for normal science, i.e., for the genesis and continuation of a particular research tradition.
...
No wonder, then, that in the early stages of the development of any science different men confronting the same range of phenomena, but not usually all the same particular phenomena, describe and interpret them in different ways. What is surprising, and perhaps also unique in its degree to the fields we call science, is that such initial divergences should ever largely disappear.
So once you've gathered enough data early on, and you convince a few fellow researchers that some set of parameters is what matters, you're on your way to studying static. Another aspect of the paradigm (mentioned above) is the way you interpret your data. I actually think that the parameter paradigm and the interpretation paradigm are separate things. Both make it easier to communicate with like-minded researchers, but the data is still the data. A highly religious mystic and a hard materialist could agree on a parameter paradigm, but then one would interpret electricity as the wrath of spirits while the other might interpret it as the emission of stored energy. Those disagreements will make it harder to agree on what the follow-up experiments should be, and peer review will likely be tricky, but the data itself would still be acceptable.

After all of this is agreed upon, we get to do what Kuhn calls normal science. Kuhn sometimes talks as though normal science happens most of the time, but at other times, when he's talking about paradigms within a very specialized field, it makes me wonder if paradigms are almost always shifting. In psychological research there is constant discussion and disagreement about which human behaviors, optical illusions, or EEG signals should be lumped together and why. My only experience with something like normal science was when I was on a team designing a sonar array (a set of underwater microphones on a string towed behind a boat). Even in that case, where the behavior of sound in water was almost entirely agreed upon, when I came up with some unintuitively good results (a computer model of the array could detect torpedoes in the water better than we expected), and someone else replicated them, everyone was still incredibly skeptical. In spite of having a shared paradigm, it felt as though pushing people out of their comfort zone still required a shift in how they thought, which eventually meant using non-simulated data. In the end, though, this process of convincing the Navy using real data probably does fall within normal science, because there was a final agreed-upon test of the real system that would prove I was right.

One interesting note from this chapter is that Kuhn spends a few pages talking about book-writing. It took me a while to figure out what he meant, but I guess in his day you would write a book if you were addressing the public, while journal articles and reports were for one's peers.

Only in the earlier, pre-paradigm, stages of the development of the various sciences did the book ordinarily possess the same relation to professional achievement that it still retains in other creative fields. And only in those fields that still retain the book, with or without the article, as a vehicle for research communication are the lines of professionalization still so loosely drawn that the layman may hope to follow progress by reading the practitioners’ original reports.
...
Although it has become customary, and is surely proper, to deplore the widening gulf that separates the professional scientist from his colleagues in other fields, too little attention is paid to the essential relationship between that gulf and the mechanisms intrinsic to scientific advance.

So Kuhn is arguing that the community created by a scientific paradigm separates from the lay public and develops its own subculture, which in turn makes it harder for the scientists in that subculture to communicate with outsiders. People often complain that scientists are terrible at explaining their work to the public, and this gives a reason why that gulf may actually be necessary. It's an interesting thought, and it leaves open the possibility that scientific translation may become more valuable as science progresses. I'll just leave a link to the interesting idea of research debt that I've seen a few times this week. If Kuhn is right that the development of a paradigm automatically separates the scientists from the lay community, then I think the solution to bringing research to a wider audience is more nuanced than many people presume.

PS - I improved the layout a bit. I hope it looks decent.

Wednesday, March 8, 2017

Scientific Method: Chapter 1 of "The Structure of Scientific Revolutions" by Kuhn

Before I get started, here's the PDF I'm using.

I started thinking about this post a while ago because I wanted to talk about what the scientific method is like in reality as opposed to how it works according to textbooks and simple graphs like the one below (taken from here).


After looking around a little, I realized I could try to reinvent the wheel, or I could start from what other people have said and go from there. I listened to a discussion of this book (Kuhn calls his 100-plus-page book an essay), and it sounded like the author had something interesting to say, though I will probably disagree enough that it will be interesting to write about my thoughts.

Kuhn starts off by talking about how the field of history has a role to play in upending the textbook story of how science progresses. According to Kuhn, the textbook progression of science is just like the diagram above, and importantly the textbook progression also says that we are steadfastly marching towards the truth. Go Science! But, Kuhn says, that's not how science actually works. We often ignore data points as outliers. You may actually remember doing things like this in statistics class with box-and-whisker plots. Scientists also tend to get stuck on particularly compelling theories, even when the data doesn't entirely support them. Sometimes the holders of a previous theory have to die off before a new one is fully accepted. I can imagine many Newtonian physicists were too appalled by relativity to ever embrace it (hopefully covered in a later chapter). Lastly, Kuhn discusses how the order of discovery seems to affect which theories get adopted. He talks about how a new scientist entering a field with scientific skills but no knowledge of that field will likely arrive at a very different theory from another person doing the same thing, depending on the order in which experiments are done.

Those are the phenomena that Kuhn is trying to explain. An interesting connection this sparked for me is that the rejection of outliers is also related to why scientists don't often report negative results. If a project is going horribly, it may be because you aren't using your equipment right, or aren't training an animal correctly, or any number of other completely scientifically boring reasons. When I meet with professors to discuss a work in progress, the first five or so iterations of the work are completely unpublishable because minor errors make the analysis (or the collected data) meaningless. Kuhn does raise an interesting point that properly identifying these sorts of errors is crucial to doing science well, and it is hard to tell (especially as an outsider) what should be categorized as an outlier and what should be categorized as an interesting anomaly. It is common practice for a scientist to tell another scientist that they must have done something wrong because the data "looks off". This is usually good advice.
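(Just as a concrete reminder of that statistics-class version of outlier rejection, and nothing to do with Kuhn himself, here is the usual box-and-whisker rule in a few lines of Python; the numbers are made up.)

```python
import numpy as np

def iqr_outliers(data):
    """Flag points outside the 1.5 * IQR 'whiskers' of a box plot."""
    data = np.asarray(data, dtype=float)
    q1, q3 = np.percentile(data, [25, 75])   # edges of the middle half of the data
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return data[(data < low) | (data > high)]

print(iqr_outliers([2.1, 2.3, 2.2, 2.4, 2.2, 9.7]))   # -> [9.7]
```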

Kuhn then goes on to introduce the idea of a scientific paradigm, which consists of a set of standards for what is normal, what amounts of error are acceptable in certain measurements, and what interpretations of data are valid. He says that normal science works basically as described in the diagram above, with the standards all in place. Here's a quote on it:

Normal science, the activity in which most scientists inevitably spend almost all their time, is predicated on the assumption that the scientific community knows what the world is like. Much of the success of the enterprise derives from the community’s willingness to defend that assumption, if necessary at considerable cost. Normal science, for example, often suppresses fundamental novelties because they are necessarily subversive of its basic commitments.

He's quite persuasive in his writing, but I have to admit I disagree with his reasoning about why scientists behave this way. The suppression of fundamental novelties happens because the scientist is acting entirely reasonably. Taking the motion of the planets as an example, a scientist who believed in Newtonian mechanics would be quite justified in that belief, having seen both the discovery of Neptune from the orbits of other planets and the accurate prediction of every planet's motion except Mercury's. So being that scientist, do you really expect that you need to completely rethink everything you know about time and space? Maybe instead there is something strange about Mercury that we haven't discovered. Can it be explained by strange composition? Maybe tidal forces with the sun? An invisible moon? Who knows? Furthermore, I regularly read about physics professors having crackpots tell them they solved every physics problem with some mystical-sounding theory (neuroscience has its share too). If the new theory sounds like one of those, it's not surprising that it might take extraordinary persuasion for the professor to overturn Newton. At the very least, I hope Kuhn will discuss this as a very reasonable possibility, and if he does reject it, I hope it is with good reason.

Kuhn wants to draw a distinction between normal science as described above and times of paradigm shift, when the concepts in use are changing, the standards shift, and a new paradigm emerges, under which we continue to do normal science again. I'm not sure the two are so separate, but maybe that is because the field of neuroscience is relatively young. In neuroscience we know we are unable to measure most of the variables that affect what we care to study, so new theories are practically expected, though if someone claimed magic rays caused our thoughts and that it had nothing to do with neurons, I would certainly be dismissive.

As a final caveat to all of this, I have heard (though I don't think he mentions it in this chapter) that Kuhn believes the data itself changes when a paradigm shift occurs. Whatever he means by that, I hope he doesn't mean that repeating a procedure after the shift will produce a different outcome. That way lies madness.

Wednesday, February 22, 2017

LSTM musings

This week I'm reading about a type of artificial neural network called a Long Short-Term Memory network (LSTM for short), which I've heard about a number of times but never actually learned what it is beyond the name. The tutorial I'm reading comes highly recommended by my friend Google (I searched "LSTM tutorial").

The details are useful, and the idea of explicitly having memory management in an RNN makes a lot of sense. The basic idea is that if you want your artificial neural network to have memory of stuff that happened in the past, then you should specifically arrange it so that it has that property. Interestingly, just taking some outputs and plugging them back in as inputs (a standard RNN) struggles to learn to represent long-term memories. From my perspective, the idea of structuring your network to have a desired property is the same rough idea as how convolutional neural networks apply the same operation over all of visual space, since we expect that vision in the middle of the image should be pretty similar to vision at the edges.
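To make "explicit memory management" a little more concrete, here's a minimal single-time-step sketch of an LSTM cell in numpy (my own toy code, not taken from the tutorial; the weight layout is just one common convention):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step. W maps the concatenated [h_prev, x] to the four
    stacked gate pre-activations; b is the matching bias."""
    n = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x]) + b   # all gate pre-activations at once
    f = sigmoid(z[0 * n:1 * n])               # forget gate: what to erase from memory
    i = sigmoid(z[1 * n:2 * n])               # input gate: what to write into memory
    g = np.tanh(z[2 * n:3 * n])               # candidate values to write
    o = sigmoid(z[3 * n:4 * n])               # output gate: what to expose
    c = f * c_prev + i * g                    # explicit memory cell update
    h = o * np.tanh(c)                        # hidden state read out from memory
    return h, c

# toy usage: 3 inputs, 4 hidden units, random weights, a 5-step sequence
rng = np.random.default_rng(0)
n_in, n_hid = 3, 4
W = rng.normal(scale=0.1, size=(4 * n_hid, n_hid + n_in))
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in rng.normal(size=(5, n_in)):
    h, c = lstm_step(x, h, c, W, b)
```

The point of the sketch is just that the memory cell c is updated by explicit erase/write/read operations, rather than hoping the network discovers something equivalent on its own.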

Anyway, the tutorial does a great job of explaining how they work, so I won't repeat it here. I'm curious what advances have happened since 2015 when it was written. I know people have already tried applying 'attention' to networks, as mentioned in the conclusion, though I don't know how well those work or how well they mimic the brain's version of attention. Certainly no one has found an LSTM module in the brain yet, so I doubt anyone adding something on to these networks is aiming too much for biological plausibility.

One thing I struggle with when dealing with these kinds of learning algorithms is that it's very difficult to tell (without playing around with them for a while) which kinds of changes to the architecture are going to make substantial changes in how the network behaves. The tutorial has a few variants that seem to conceptually allow the network to do the same thing. Do those differences matter substantially? It's likely they do, but it's certainly not clear how exactly. I'd love to know if anyone has any decent ways to figure out an answer to that kind of question. Is the answer just a shotgun approach? Throw your algorithm/structure at everything and see what works? Are there any good ways to visualize the loss function that might provide insights? For example, are there classes of problems where the loss function is a spiky mess full of local minima?

I'll continue to ponder all of this, but I'd love to get some input if you have any to give.

Wednesday, February 8, 2017

Interesting paper on why you have to make a good null hypothesis

Today I'll be talking about this paper, titled "Spike-Centered Jitter Can Mistake Temporal Structure." This is the topic that first got me interested in my lab, because I felt that I had something to add to the neuroscience community. This is a case where it is important to get the mathematical rigor right, and the neuroscience community at large is not as focused on mathematical rigor as I am (that isn't a criticism, and one day I'll write a post about why). Before I get to what the paper is about, let me give some background.

Background


A while ago, we (meaning other scientists that aren't me at all, because I wasn't alive yet) figured out that neurons tend to respond to particular stimuli and are organized in a particular way. We generally describe which stimuli activate a neuron using a receptive field. That response is generally driven by the part of the network that starts with the stimulus (e.g., light hitting your eyes) and goes further into the brain (e.g., visual cortex). This direction (stimulus to brain, and further into the brain) is usually called the feedforward direction.

Next we started asking what the brain does with that feedforward information. Many, many of the connections in the brain aren't feedforward at all. There are lateral connections within an area, feedback loops that span many different areas, and everything in between. If we want to know how the brain works, we had better figure out what all those connections are doing. And to figure out what they're doing, we had better figure out which connections are there and which ones aren't. The "easy" way would be to literally look at all of the connections from one neuron to another. This is really hard because there are billions of neurons (about 86 billion in a human brain) and the connections are tiny and/or long. So what we want is a way to record from a pair of neurons and see if they're connected, and if we can make it fancier, we want to record from tons of neurons and figure out the wiring diagram.

We started doing just that by looking at the number of times neurons fire at the same time (this is called synchrony), or at some specific delay from one another. If they're firing at the same time, they likely have some common input, and if they fire at a delay, one may be causing the other to fire. Well, that seems to solve the problem, but then you get the age-old scientific question: what's the null hypothesis? And now things get tricky again. You could say "I expect a flat firing rate." Then you can count the number of synchronous spikes and compute how many you would expect by chance pretty easily. But in most experiments, you don't see flat firing rates. Even worse, you can see firing rates that co-vary between neurons even though the neurons aren't connected at all. This can happen because the stimuli that activate the neurons are being presented at the same time, or because of waves of electrical activity that propagate across cortex and have little to do with direct connections between neurons.
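To make that "flat firing rate" version concrete, here's the back-of-the-envelope calculation for two independent neurons firing at constant rates (the rates, coincidence window, and recording length here are made up for illustration):

```python
# Expected chance coincidences for two independent neurons firing at constant
# rates r1 and r2 over T seconds, calling two spikes "synchronous" if they
# land within +/- w seconds of each other.
r1, r2 = 10.0, 15.0   # spikes per second (made-up rates)
T = 600.0             # recording length in seconds
w = 0.005             # +/- 5 ms coincidence window
n1, n2 = r1 * T, r2 * T                    # expected spike counts
expected_coincidences = n1 * n2 * (2 * w) / T
print(expected_coincidences)               # ~900 coincidences expected by chance alone
```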

As computers have gotten faster, we've been able to come up with more complicated null hypotheses. A good one (I thought so until I read this stuff) is the spike-centered jitter hypothesis. It says that if the null hypothesis is true, the number of synchronous spikes should be the same if we randomly shift the spikes around by a little bit. We can simulate this pretty easily by making fake spike trains where the spikes have been shifted around and counting the number of times the spikes still line up.
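Here's a rough sketch of what that resampling procedure looks like in practice (this is my own toy version, not the paper's code; the jitter width and coincidence window are arbitrary choices):

```python
import numpy as np

def count_synchrony(train_a, train_b, w=0.005):
    """Count spikes in train_a that have a spike in train_b within +/- w seconds."""
    train_a = np.asarray(train_a)
    train_b = np.sort(np.asarray(train_b))
    idx = np.searchsorted(train_b, train_a)
    near_left = np.abs(train_a - train_b[np.clip(idx - 1, 0, len(train_b) - 1)]) <= w
    near_right = np.abs(train_a - train_b[np.clip(idx, 0, len(train_b) - 1)]) <= w
    return int(np.sum(near_left | near_right))

def spike_centered_jitter_pvalue(train_a, train_b, delta=0.025, n_surrogates=1000, seed=0):
    """Spike-centered jitter: shift each spike of train_a by an independent
    uniform offset in [-delta, +delta] around its ORIGINAL time, then see how
    often the surrogate synchrony count matches or beats the observed one."""
    rng = np.random.default_rng(seed)
    train_a = np.asarray(train_a)
    observed = count_synchrony(train_a, train_b)
    counts = []
    for _ in range(n_surrogates):
        surrogate = train_a + rng.uniform(-delta, delta, size=len(train_a))
        counts.append(count_synchrony(surrogate, train_b))
    counts = np.asarray(counts)
    return (1 + np.sum(counts >= observed)) / (n_surrogates + 1)
```

The detail to hold onto is that every surrogate spike is centered on an original spike, which is exactly the data dependence that becomes a problem below.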

On to the New Stuff


It turns out there's a problem with that. If you look at the null hypothesis, it's dependent on the data. That could be okay if the dependence were handled correctly, but here it isn't. The problem is that if you had a whole bunch of these jittered spike trains, you could pick out the original as the one where each of the spikes sits at the center of where the jittered spikes fall. My first reaction to this was to ask, "Is this just a quirk of the math, or do I actually need to care about it?"

Now the authors have shown me the answer. Everyone should care. What they do in this paper is show a number of surprisingly reasonable cases where the spike-centered jitter hypothesis would make scientists detect synchrony when there isn't any connection between the spike trains. The best example they give is a case where one of the spike trains fires rhythmically, which does actually happen, and the other one is just a few random spikes. They show that if there are enough random spikes, you can get fooled by the spike-centered jitter hypothesis at any p-value threshold you can think of.

They also show that their alternative method (which is already mathematically proven but takes a bit more computing power and thought) avoids all of these problems.

Take-Aways


The first take-away from this paper is obvious: don't use the spike-centered jitter hypothesis. Just don't do it. Use interval jitter instead, or better yet, use my algorithm to implement it! The second take-away is that being careful about your null hypothesis is important. If it has any dependence on your original data, scrutinize it a second or third time to make sure it doesn't fall into the trap that the spike-centered jitter hypothesis does. If still unsure, check with a statistician.
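For contrast, here's my understanding of the interval-jitter resampling step (again a sketch, not the authors' implementation): the bins are laid down on a fixed grid that doesn't depend on the data, and each spike gets redrawn uniformly within its own bin, so the surrogate trains no longer "remember" exactly where the original spikes sat.

```python
import numpy as np

def interval_jitter(train, delta=0.025, rng=None):
    """Interval jitter: partition time into fixed, pre-defined bins of width
    delta (the bin edges do NOT depend on the spike times), then replace each
    spike with a uniform draw from within its own bin. The per-bin spike
    counts are preserved, but nothing in the null distribution singles out
    the original spike positions."""
    rng = np.random.default_rng() if rng is None else rng
    train = np.asarray(train)
    bins = np.floor(train / delta)                      # which fixed bin each spike lives in
    return np.sort((bins + rng.uniform(size=len(train))) * delta)
```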

Wednesday, January 18, 2017

My Views on the Mind/Body Problem

I had originally started my last post wanting to talk about what I think about the mind/body problem, but I got derailed by what I (maybe narcissistically) thought was an interesting story about how I got where I am. With that out of the way, I'd like to discuss how I think the mind/body problem arises and what can be done about it. This is largely a rephrasing of what David Chalmers had to say about consciousness in The Conscious Mind (1996), but I hope it's helpful anyway.

Chalmers spends a long portion of the beginning of his book discussing philosophical zombies (or p-zombies; see Wikipedia and the Stanford Encyclopedia). The idea behind a p-zombie is that, unlike regular zombies, it behaves externally just like a person: it eats just like a person, laughs just like a person, etc., but internally it is devoid of consciousness. If you were to take this idea way too seriously, you might wonder whether other people in your life are actually conscious or just p-zombies, but please don't do that.

So if we don't think they really exist, why talk about p-zombies at all? The reason is that, given the current scientific perspective, everyone looks more like a p-zombie than an actually conscious person. Science so far has done wonderfully with reductionist explanations of things we observe. Why is the sky blue? Because the air in the atmosphere scatters light in a certain way. Why does it scatter light in that particular way? Because of the mixture and arrangement of its atoms. Why do atoms behave that way? Because they are made of electrons, protons, and neutrons that behave a certain way. I'm not sure I could keep going with this too much longer, but at some point you get to quantum physics, which is ludicrously accurate at predicting the behavior of very small objects, and we don't have anything at a smaller scale than that. So given that very low-level theory, we should be able to put together the pieces and get a human. Start with quantum stuff, work your way up to atoms, build up atoms into proteins and other complex chemicals. Those chemicals sit in a lump called a cell. Some cells that we particularly care about (neurons) have excitable membranes that change voltage very suddenly. That voltage change gets transmitted along the neuron until it gets to another cell and either transmits current directly into that cell or releases neurotransmitters. Then the next cell gets excited and the process continues. Getting into the neuroscience side of things now, we know that the spiking of some neurons is modified by inputs to a human (sensory stuff like touch, taste, sight, hearing, etc.) and that other neurons' spiking causes our muscles to undertake the actions we humans take, as well as controlling a bunch of stuff in the rest of our body. Combine all that with the fact that we can prove that a large network of simplified neurons can compute anything (yes, anything, so long as the network is big enough), and that means that, barring working out the (ludicrously complicated but not metaphysically mysterious) details, we have a really good theory to explain human behavior entirely.

But hold on: we now have a model that is a bunch of entirely unthinking, unconscious particles following basic laws in a certain way to produce actions that mimic thoughts and consciousness. That sounds like... a p-zombie! One solution you might hear a lot is that mimicking consciousness IS consciousness. I think the phrases "We're just a bag of chemicals" and "love is just a hormone in your brain" are both sentiments that reflect this idea. Another solution is to say that your brain makes you believe you are conscious but you really aren't. This seems like a cop-out, since it denies one of your most basic direct experiences. Neither of these explanations explains why it is different to actually be you instead of being an external observer.

The correct approach when you see this kind of contradiction between your model of the physical world (which is essentially a p-zombie) and reality (the fact that you do in fact have conscious experiences that aren't just a behavioral or computational trick) is to first recognize that your model is incomplete and then look for ways to get data to guide you in improving your model. Chalmers suggests correlating what people report themselves as being conscious of with what we can measure, which I think is a good place to start. Then, after figuring out many of those ludicrously complicated but not metaphysically mysterious details I mentioned earlier, we might be able to see which physical reactions or computations correspond to conscious experience and go from there.

I love having this framework in my head to remind myself why all the intricate details of the physics of the brain are guiding the human race towards an understanding of something that has mystified us for so long. It's an awesome project to be part of.

Wednesday, January 4, 2017

What I Like About the Brain

Before I got to grad school, I knew I was interested in the brain, but I'm not sure I really knew why. The brain is an amazing piece of hardware that is able to compute pretty much anything using much less power than your average computer, while arguably having more memory and processing power. It is more creative than any existing computer. It can process visual and audio input better than any computer (barring weird optical and audio illusions). Robots have only developed walking in the last few years, while each human can figure it out in about a year, along with a host of other motor behaviors. I was surprised to find out over the course of my studies that I don't care a ton about all of this. It certainly isn't what gets me up in the morning.

There is lots of really cool technology associated with the brain. I still don't know the math behind how fMRI works, and some day I'd like to read up on it. I know it uses a time-varying magnetic field (which is why you don't want to wear anything made of conductive metal in there, especially loops of metal), and I know it does some transformation from a frequency-domain representation of the blood-oxygen signal into a spatial representation, but I don't know a number of the important details of how it does those things, and I have to admit it would be fun to learn. I even worked in an fMRI lab one summer in undergrad, but was surprised that I didn't enjoy the field very much. Why? At the time I couldn't really say. Some of it was that I thought the interpretation of experiments was a little overblown, but I don't think that captured all of my internal issues.

There is also the incredibly sexy technology known as Brain-Machine Interfaces (BMIs), which connect a computer (or a robot arm, or whatever) to the brain or the nervous system by a set of electrodes (wires) measuring the activity of neurons. I originally went to graduate school to study and hopefully improve on BMIs, but after the initial wow factor wore off I once again found myself frustrated with the work. While the idea of making cyborgs sounds amazing (who wouldn't want to say they build cyborgs for a living?), the practice was difficult, as all research is, and rather than push through that difficulty I found myself asking why I cared. The other problem was that I found very little reason to choose between the possible changes to how BMIs already work. Though we know a fair bit about the brain, we don't really know a lot about how to hook up electrodes to the brain in a useful way. We mostly just place the wires and then let the brain figure out what to do with them. It usually does an okay job, but we don't know what the brain is doing, so it's difficult (or a very long research program) to try to figure out how to make things better.

In between working on fMRI and BMIs, I had read books and articles about the brain written for non-neuroscientists that grabbed my attention. I liked the portion of A Universe of Consciousness that discussed the idea of "The World Knot," which was their way of describing the interlinking between personal experience and the rest of the world, though I found it unsatisfactory. I had also liked The Brain that Changes Itself as an introduction to some of the complexity of neuroscience and to the changes in the scientific perspective since I was younger.

Finally, the biggest personal reason that I thought "of course the brain is interesting" is that it is a big, complicated puzzle. Generally speaking, I love puzzles. I play lots of puzzle games in my spare time and fiddle with a Rubik's Cube every now and then, refusing to look up how to solve it because I want to figure it out myself. But what is the puzzle of the brain that is interesting? When I was looking for labs in neuroscience, many of them studied one aspect of the brain or another. They would pick an area of the brain that they thought was interesting, or a topic like visual input, motor control, memory, or attention. Each of these sounds like a field with lots of hard questions, but again I didn't find myself terribly attached to any of them.

In the summer of 2015, after I had already settled into my current lab, I watched a TED talk by David Chalmers, which gave me the rare experience of feeling like someone else was explaining my own unconscious thoughts to me. I usually don't love TED talks because they tend to oversimplify things, but I thought Chalmers did a really good job.

The brain is interesting because it is the only thing that we are really sure has experiences. Experiences are so fundamental to our existence that we have to learn that other things don't have them. By the time you're an adult, you think of a rock as something that doesn't move unless something pushes it, and when it does move it doesn't care in the least. If someone said "What is it like to be a rock?" you'd laugh at them because rocks don't have experiences. But you do! Why? I don't know, but that's what's so interesting! With this perspective, suddenly the fields of sensation, motor control, memory, attention, and bits of computer science all seem really interesting in how they relate to this more fundamental question. So these days I study attention with the hope that as we improve our understanding of how the pieces of the brain work together, we will be able to work out how the computations performed by the brain allow it to be conscious and have experiences instead of just being like a wet rock.

I'll end this with some suggested reading: The Conscious Mind by David Chalmers, and Consciousness Explained by Dan Dennett. Both books are dense but interesting. I only read the last few chapters of Consciousness Explained because the rest was only tangentially related to my interests, but I read nearly all of The Conscious Mind.  Dennett argues that anything that acts like it has a mind must be conscious (I disagree), but reading both sides of this argument helps keep an open mind and helps me try to avoid mental pitfalls that are easy to fall into when reading philosophy. Chalmers argues that we need to add mental properties to our understanding of the universe to explain consciousness because modern physics does not predict the existence of an internal mental experience (I agree).

PS - Happy 2017!