What is a Gene?

Alongside their usual “UNICORNS CAUSE CANCER” style headlines, the tabloid press are also quite fond of “BOFFINS DISCOVER THE GENE FOR BELIEVING IN UNICORNS” style headlines. I think it kind of goes without saying that most of those who write such headlines have only the vaguest idea of what a gene is. To be fair, the more we discover, the vaguer the scientific notion of what a gene is has become, but the basics are very well established.

So what is a gene?

There are all sorts of useful analogies, similes, and metaphors we can use here. I think my favourite is the story of the pilgrim who asked for an audience with the Dalai Lama.

He was told he must first spend five years in contemplation. After the five years, he was ushered into the Dalai Lama's presence, who said, 'Well, my son, what do you wish to know?' So the pilgrim said, 'I wish to know the meaning of life, father.'

And the Dalai Lama smiled and said, 'Well my son, life is like a beanstalk, isn't it?'

“In Held 'twas In I” by Procol Harum

But I’m going to try here to describe what a gene (the real secret of life) really is instead of what it is a bit like.

Now we’ve all heard of the “chromosome”. Say this word to most people (and indeed Google images) and it probably conjures up an image like this:

Now there’s a good reason why the word “chromosome” conjures up an image like this. Basically, it’s when chromosomes look like this that we can see them under normal microscopes. But chromosomes only look like this (all bunched up and double) when they are getting ready to divide. Most of the time, and in most organisms, chromosomes look nothing like this.

Most people (even journalists) who’ve heard of chromosomes have also heard of “DNA” and are aware that it comes in the form of a double helix:

This is basically what you are looking at (ignoring all sorts of caveats that we can sweep under the lab bench for now) when you look at a length of chromosome (or at one of the strands of the chromosome in the doubled up chromosome in the chromosome picture).

So there you have it: chromosomes are (caveats aside) basically long strands of DNA.

But we haven’t mentioned “genes” yet, I hear you cry.

Well a gene is a short(ish) bit of chromosome (or DNA strand if you prefer). Now (returning to analogies) “genes” are often compared here to beads on a string. But, since there isn’t really any “string” (just molecules and links between them) popper beads maybe provide a better analogy …. except that there aren’t really any beads either.

Let’s look at the DNA molecule in more detail:

DNA is made from nucleotides (the “N” in “DNA” actually stands for “nucleic”, as in deoxyribonucleic acid, but it is the same root). There are just four different nucleotides involved, distinguished by their bases: adenine, cytosine, guanine, and thymine, which are usually denoted by their initial letters: A, C, G and T.

If we un-twist the DNA and look at a short bit of it, it looks a bit like this:

But that’s already a bit complicated, so let’s simplify things still further:

(For any pedants reading, each box here represents a base together with its phosphate and deoxyribose, which is to say a whole nucleotide; but let's keep things simple.)

Now the more astute among you will have noticed that these two strands are complementary: the sequence of Gs, Cs, As and Ts in the strand at the bottom can be inferred from the sequence of Gs, Cs, As and Ts in the strand at the top (and vice versa).
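For the programmers among you, that inference can be sketched in a few lines of Python. The function name `complement` and the example sequence are made up for illustration; the only biology in it is the pairing rule (A with T, C with G). It works position by position, like the boxes in the picture, and ignores the fact that the two real strands run in opposite directions.

```python
# Base-pairing rule: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the strand that would sit opposite `strand`, position by position."""
    return "".join(PAIRS[base] for base in strand.upper())


top = "GATTACA"  # made-up example sequence
bottom = complement(top)
print(top)     # GATTACA
print(bottom)  # CTAATGT
```

Running `complement` on the bottom strand gives you the top strand back again, which is the “and vice versa” part.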

As this implies, we only really need one strand and, indeed, we are only really interested in one strand today: the “sense” strand. The complementary strand is “anti-sense” and we can ignore it until we come to DNA duplication, which we’re not going to come to in this post.

Going back to analogies again for a second, it’s a bit like every time Guardian journalist Ben Goldacre (@BenGoldacre / http://www.badscience.net) writes a sensible sentence in his blog, Daily Mail journalist Melanie Phillips (@MelanieLatest / http://melaniephillips.com) writes a completely irrational and nonsensical sentence in her blog, and the two kind of cancel each other out.

Anyway, this leaves us with:

These are a bit like popper beads I suppose, but they are nucleotides not genes. There may be, not billions and billions and squillions (said in a Lancashire accent), but certainly hundreds or thousands of these in one gene.

So what use is that?

Well these four nucleotides form a kind of code: a code comprising only four “letters”, but a very powerful code for all that.

But if a chromosome is just a long series of nucleotides and a gene is simply a part of that series, how do we know where one gene ends and the next one begins?

Well I suppose (and here I’m going to resort to a serious(ish) analogy) it’s a bit like the old style telegrams where you were restricted to twenty-six capital letters and that was it. You had to write stuff like ….


…. in order to avoid misreading (try it without the STOPs).

It’s like that with the genetic code. There’s no punctuation, it’s all in the sequence of “letters”, but, as has been noted, we don’t even have twenty-six, we only have four. These make up three-letter “words” called “DNA triplets” and each triplet codes for one amino acid.
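Because there is no punctuation, where you start reading matters. Here is a minimal Python sketch of chopping a sequence into triplets; the function name `triplets` and the nine-letter sequence are invented purely for illustration.

```python
def triplets(sequence: str) -> list[str]:
    """Chop a sequence into consecutive three-letter words."""
    usable = len(sequence) - len(sequence) % 3  # drop any leftover letters
    return [sequence[i:i + 3] for i in range(0, usable, 3)]


dna = "GATTACAGA"  # made-up nine-letter sequence
print(triplets(dna))      # ['GAT', 'TAC', 'AGA']
print(triplets(dna[1:]))  # ['ATT', 'ACA'] (same letters, read one position later)
```

Shift the starting point by a single letter and every “word” changes, which is rather like the telegram without its STOPs.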

Just as a DNA strand is a string of nucleotides, a protein is a sequence of amino acids, and each gene codes for the string of amino acids that makes up a particular protein. Like this:

So the sequence of nucleotides CTA codes for the amino acid “aspartic acid”, AAA codes for the amino acid “phenylalanine” and ATT codes for “stop making protein”.
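As a rough sketch, that lookup can be written as a tiny Python dictionary. Everything here is illustrative: the names `CODE` and `translate` are made up, and the table holds only the few triplets this example needs rather than all sixty-four. It also assumes the triplets are written as they appear on the DNA strand the cell actually reads from, which is the convention under which CTA gives aspartic acid and AAA gives phenylalanine. Under that convention the standard “stop” triplets are ATT, ATC and ACT, so those are the ones the sketch uses.

```python
STOP = "stop"

# A deliberately tiny slice of the genetic code (the real table has 64 entries).
# Keys are DNA triplets on the strand that gets read (an assumption, see above).
CODE = {
    "CTA": "aspartic acid",
    "AAA": "phenylalanine",
    "ATT": STOP,
    "ATC": STOP,
    "ACT": STOP,
}


def translate(dna: str) -> list[str]:
    """Read `dna` three letters at a time until a stop triplet turns up."""
    chain = []
    for i in range(0, len(dna) - 2, 3):
        meaning = CODE[dna[i:i + 3]]  # KeyError for triplets outside the toy table
        if meaning == STOP:
            break
        chain.append(meaning)
    return chain


print(translate("CTAAAAATT"))  # ['aspartic acid', 'phenylalanine']
```

Anything after the stop triplet is simply never read, which is how nine letters end up as a two-amino-acid chain.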

Since this “protein” only has two amino acids in it, I’m not sure you can really call it a “protein”. It would more usually be called a “dipeptide”. But you’ve almost certainly eaten some of this (give or take a methyl group); it is the artificial sweetener called “aspartame” or “Nutrasweet”. I doubt that there are actually any real genes out in the wild for making aspartame, but I suppose there could be, and it’s a nice simple example of what a very short gene could do.


So now you understand what a gene is. It’s a sequence of nucleotides that codes for a protein (or at least part of a protein – some proteins are made from more than one amino acid chain).

I suppose, armed only with the understanding presented above, you could (naively) begin to imagine that if you have lots of genes for (say) muscle protein (or genes that produce extra good quality muscle protein) you might be more likely to make it as an athlete, but how does it all get so complicated and how can you have a gene for believing in unicorns?

Well part of the answer (the full answers really are complicated) is that proteins, as well as being structural like muscle proteins, can be regulatory, like enzymes – which control all sorts of things that go on in our bodies.

Once you consider that the products of some genes can control what other genes do (in all sorts of complicated direct and indirect ways that we don’t need to go into here) you begin to realize that genetics is very sophisticated and subtle and complex.

Your computer is not really built from transistors any more (and still less from valves) but the principle is the same. A transistor is a switch that turns another switch on and off. Once you start putting a few transistors together, you rapidly start to get quite complex behaviour. Put shedloads together and you get something that can do stuff like decide to stall my Ford Galaxy just before I want to set off from a junction (while producing a fault-code which my garage insists doesn’t exist).

Anyway, I digress. My point is that even simple feedback mechanisms (and the feedback mechanisms in genetics are far from simple) can produce really, really complex behaviour.
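To see how little it takes, here is a toy sketch in Python: three imaginary genes in a ring, each one switched off whenever the gene before it is on. Nothing here models real biology (the genes, the all-or-nothing switching and the lock-step timing are all simplifying assumptions), but even this three-switch loop never settles down.

```python
def step(state: tuple[bool, bool, bool]) -> tuple[bool, bool, bool]:
    """Advance the ring one tick: each gene is on only if its repressor is off."""
    a, b, c = state
    return (not c, not a, not b)


state = (True, False, False)  # arbitrary starting point: only gene A is on
history = [state]
for _ in range(6):
    state = step(state)
    history.append(state)

# Print each tick as a row of 1s (gene on) and 0s (gene off).
for s in history:
    print("".join("1" if on else "0" for on in s))
```

From this starting point the ring cycles through six different on/off patterns before it gets back to where it began, and then goes round again.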

Some species of bird are genetically programmed to build very sophisticated nests to lie in. My cats are genetically programmed to catch birds (fortunately for the birds they’re both rather crap at it) but are not genetically programmed (and not bright enough) to even move a twig out of the way before lying down on an otherwise perfectly comfortable and sunny patch of grass in the garden.

These complex behaviours require lots of genes (and maybe lots of so-called “junk” DNA) working in harmony. On the other hand, the colours of my cats (one is black and the other is tortoiseshell) arise from the actions of just one or two genes (though even here, especially in the case of the tortoiseshell, things are a bit more complicated than you might imagine).

So while you probably can’t really have a gene for believing in unicorns, you probably can (for example) have a genetic makeup that makes you more susceptible to superstition and irrational views.

At heart, however, a gene is simply a code for making a protein.