ElGordillo
Tuesday, October 05, 2004
  Rathergate and Syllogisms

Some people, notably the lackwits at this site, are still going on as if Prof. David Hailey's risible hack job on the Rather/Burkett/CBS memos hasn't been busted wide open by Kevin and Paul at Wizbang. But let's throw some formal logic at Hailey's hypothesis. From what I understand, he's played about with some sort of typewriter font in Photoshop and concluded that because (in his opinion) he can resurrect the metrics of a long-dead typewriter font via...I don't know, a Ouija board...that the Bush TANG memos were typed. OK, OK, laughable on its face. Charles Johnson and Jeff Harrell have already disposed of this from an Occam's Razor point of view. Dr. Joseph Newcomer's work has been dispositive (and the interest in it appears to have taken his site down pro tem). But let's say that there existed some magic word-processing typewriter five years before Erik Estrada got the casting call to play Ponch in 'CHiPs'. Now in formal logic there's a thing called modus ponens. It's a syllogism. It says. "if A then B, A, therefore B. No argument there. But it has an evil twin, known as affirming the consequent. This one says: if A then B, B, therefore A. It's the formal logic equivalent of 'begging the question'. Simply put, subsitute for A 'there exists 1972 technology that could create 21st Century-looking documents' and for B 'the documents look 21st century'. Then you see the fallacy: even if these documents could have been created in the 1970's, their existence does not prove that they were.

OK, that's a bit dry. But it shows what a false trail all the amateur 'forensic document examiners' with their My First Forgery Detection Kits from MB Games are following. So what if, by some miracle, you find a machine from the era of Watergate and The Joy of Sex (1st Ed.) that can re-create Microsoft Word documents? The Arrow of Causality is pointing in the wrong direction. Your job isn't to prove that a document from the twenty-first century can be reproduced in the 1970's. It's your job to prove that a document from the 1970's can't be reproduced in the twenty-first century.

 
Thursday, September 30, 2004
  Another Derbyshire Maths Problem

John Derbyshire has another cute end-of-the-month maths puzzle. Here's how he puts it:


Consider the powers of two: 2, 4, 8, 16, 32, 64, 128... Their right-most digits follow a simple repetitive pattern: 2, 4, 8, 6, 2, 4, 8, 6, 2, 4, 8, 6... What about their left-most digits, though? Here are the first 40-odd [actually, it's the first 149 - Ed.]. (Read right to left, line by line. I've just jammed the digits together, leaving out the courtesy commas, to save space.)

24813612512481361251248136125

124813612512481371251249137125

124913712512491371361249137136

124913713612512481361251248136

125124813612512481361251248137...

There is a pattern there, but it keeps breaking down. When you get up into really big powers of 2, in fact, it breaks down altogether. For the ten powers of 2 from the 30th to the 39th, for instance, we have lead digits 1, 2, 4, 8, 1, 3, 6, 1, 2, 5. For the ten powers from the billionth to the 1,000,000,009th, by contrast, we have lead digits 4, 9, 1, 3, 7, 1, 2, 5, 1, 2. Every digit (except, of course, zero) shows up, though.

I am going to refer to this infinite sequence of digits as "the sequence." Now I am going to ask the following two questions: In the first N digits of the sequence, how many occurrences of 3 shall I find? And how many occurrences of 4?

Call the first number U(N), the second V(N). The first few values of U(N) and V(N), for N from 1 to 15, look like this, as you can easily verify:

U(N) = 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2,...

V(N) = 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2,...

Show that for sufficiently large N, U(N) will always be bigger than V(N). Find the limit of the ratio U(N) to V(N) as N tends to infinity.


To answer these questions we need to know how to calculate the leading (leftmost) digits of the powers of two. We could just start off by raising two to succesively higher powers: 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768, 65536, 131071,... but this won't tell us anything about the distribution of that leading digit in general. To get a direct formula for the leading digit we will need to use logarithms.

First let's review a few properties of logarithms (or logs). To start, here's the definition. The logarithm to base b of x is that number that when you raise b to it, you get x. It is written logb x. A couple of examples: log8 512 = 3, because 83 = 512. Similarly, log3.66 12.8901 = 1.97035 beacuse 3.661.97035 = 12.8901. And blogb x = x.

A second important property is that addition of logs is like multiplication. log (x y) = log x + log y. So, log2 32 = 5 = log2 4 + log2 8 = 2 + 3.

Thirdly, raising a number to a power is like multiplying a log by a constant. For example, logb xy = y logb x.

OK, so now we can derive our formula for the leading digit of 2x. If we take log10 2x, we get x log10 2 (in other words, 10x log10 2 = 2x). This will be some decimal number, like 4.21442 (log10 16384). We can split this into two pieces, by our second property: the integer part and the fractional part. The integer part represents a power of ten, in our example 104 = 10,000. The fractional part represents the amount by which you have to multiply this power of ten to get our original number back, in this case 100.21442 = 1.6384. And therein lies our answer. Let's introduce two new functions, called floor and frac. The 'floor' of x is the largest integer that is smaller than x. So floor(3.2) is 3, floor(-4.6) is -5. The 'frac' function just takes the fractional part of a number: frac(4.386743) = 0.386743.

If we raise 10 to the power frac(x log10 2), we get some number. If we then take the floor of this number, we get the leading digit of 2x. So our final function is floor(10frac(x log10 2)).

The frac function returns a value between 0 and 1. But does frac(x log10 2) take on all values between 0 and 1? Yes. The reason for this is highly theoretical and depends on a thing called Weyl's criterion. Without going into too much detail, Weyl's criterion says that frac(x y) (x is an integer) takes on all values between 0 and 1 as x gets bigger if y is irrational (the technical term is frac(x y) is equidistributed and dense in [0,1]). Another property of logarithms is that logb a is irrational if a and b are integers and one of them has a prime factor that the other does not. So log10 2 is irrational. If we raise 10 to the power of the frac function as x gets bigger and bigger, we eventually get a smooth curve that varies between 1 and 10. Applying the floor function turns it into a 'staircase', with risers of height 1 and treads of varying widths. Each tread is a power of two with the corresponding lead digit. The widths decrease as we go up. As they get narrower, the chance that a random power of x will have that leading digit goes down. So we have the answer to Derb's first question: for sufficiently large N, powers of two start with a three more often than a four. How much more often? About 1.29 times more often, but that will have to wait for an update.

UPDATE: What are the probabilities of a power of two having a given leading digit?. In other words, what are the widths of the treads in the staircase function we obtained by applying floor to 10x? The function first takes a step up when 10x = 2, or x = log10 2. That value is approximately 0.30103. So roughly 30% of powers of two have leading digit 1. The next step has width log10 3 - log10 2 = log10 3/2 ~ 0.1761, and so on. The ratio of the third step to the fourth step (in Derb's question, the ratio of U(N) to V(N)) is log10 4/3 divided by log10 5/4 ~ 1.2892.

If we actually calculate the leading digits, for the first 100,000 powers of two the number of results with leading digit 3 and 4 are 12492 and 9692 respectively, which is a ratio of 1.2889, pretty close to the theoretical value. As we take more and more powers, the result will get even closer (although working with the results gets hard: 21000000 has around 300,000 digits).
 
Sunday, September 26, 2004
  Sunday Recipe

Nothing I can add on the Rathergate issue, so instead here's Sunday's recipe for basic meatballs:

200g finely ground minced beef
1/2 egg, beaten
1 tsp onion powder
1 tsp garlic powder
1/4 cup breadcrumbs
1/4 tsp dried oregano
1 tsp Worcester sauce
1/2 tsp Bovril or other beef extract
1/2 tsp salt
1 tsp pepper
3 tbsp vegetable oil

Mix all the ingredients, except the vegetable oil, in a bowl or food processor. Roll into 3 cm balls (this quantity will make about six). Roll them in flour and leave them to sit in a refrigerator for 30 mins or so to firm. Heat 3 tbsp of vegetable oil (sunflower is best) in a heavy skillet over high heat. Place the floured meatballs in the pan, reduce the heat to medium and fry them for 9–10 minutes (in batches if necessary — don't overload the pan), turning frequently so that they are browned on all sides.

Serve them with buttered fettucini or a tomato sauce. If you want to make them a bit more tangy (a.k.a 'Mama mia, that's a spicy meatball!'). substitute half the black pepper for cayenne and add two tbsp of Tabasco habañero sauce. Then WASH YOUR HANDS!

This quantity's enough for one person; scale accordingly.

 
Tuesday, September 21, 2004
  What a tangled web

I'm a hardcore wonk when it comes to Memogate, but with the volume of material that Ardolino, Harrell, Allah, Ace, Hewitt, Powerline, Geraghty, Johnson et al. have been putting out I feel a bit Inside Baseball. I actually have a job and a social life, and those two bad boys are seriously cutting into the time I have to study the flood of data that is coming my way thanks to the blogosphere. It's like being back in University cramming for an exam.

 
Monday, September 20, 2004
  SHOCK: Woody Allen is not a Bush fan

This just in: Woody Allen is not fond of Dubya. Who'd'a thunk it? But if ever a life could be described as islands of comedy against a background of tragedy, it sure ain't George Bush's. For that description, one merely needs to look at Allen's recent filmic output.

 
Sunday, September 19, 2004
  Epistemology of a forgery

There was some poor witless girl nattering on about the 'epistemology' of the forged memos (she was linked to in Jonathan Last's Weekly Standard article; looking up the link was almost more than I could bear). Dear little thing trotted out the meme that MS Word was designed to mimic 1970's typewriters. To anyone who has more than a nanogram of clue in their whole body, that fails the laugh test (why would you hobble an incredibly powerful piece of software with constraints like that? It's like saying an F-15 was designed to mimic a Sopwith Camel). Furthermore, that whole argument is totally bass-ackwards. The bone of contention is not whether 1970's documents can be created with twenty-first century technology. It's not whether twenty-first century documents can be created, in 2004, with 1970's technology. It's whether twenty-first century documents can be created in the seventies. The hapless waif is making the same mistake that creationists make. They conflate logical possibility with epistemological possibility. Is it totally out of the question that these documents are 1970's-vintage? No. Is the probability that the documents are genuine a number sufficiently close to zero as to render it ludicrous? Yes. The truth of the statement 'in the limit as n goes to infinity, (1 + 1/n)n is e, the base of natural logarithms' is in a different class of statements than 'there does not exist a teapot in orbit about Pluto'. We 'know' both these statements to be true, but we 'know' them in a different way. Epistemology is the study of 'knowing how we know what we know'. It's not amenable, in most instances, to logical proof. If someone calls upon the argument 'well, it could happen', you know the foundations of his case are shaky.

In the interim, the debate has moved forward. Now that all but an irrelevant remnant of observers has concluded that these documents are inept forgeries (even the author of the piece of cod-epistemology that I link to above has partially resiled from her original position), the real meat of the story is: whodunit?

 
Thursday, September 16, 2004
  Put your money where your mouth is

Tradesports has long contracts on Bush winning the election at a 67.0/67.8 bid/ask spread tonight. Of course the final winner of the election will be whoever cracks a majority in the college, and sites like Election Projection have a much more detailed (nuanced?) analysis of the outcome, but a verdict like this from an efficent information aggregator does not bode well for Kerry.

 
  Orthographic horrorshow

The InstaProf points to an excellent synoptic article about l'affaire Rather in The Scotsman. He makes a parenthetical remark about pyjamas although, bizarrely, he spells them as 'pajamas' [sic]. Even more oddly, he insists on using the curious young participle 'spelled' when of course he intended to write 'spelt'. Those crazy Americans and their neologisms.

 
Musings from Costa Rica

Contact me: d a g g i l l i e s @ y a h o o . c o m 


About me



ARCHIVES
2002/04/14 - 2002/04/21
2002/04/21 - 2002/04/28
2002/04/28 - 2002/05/05
2002/05/05 - 2002/05/12
2002/05/19 - 2002/05/26
2002/05/26 - 2002/06/02
2002/06/02 - 2002/06/09
2002/06/09 - 2002/06/16
2002/06/16 - 2002/06/23
2002/06/23 - 2002/06/30
2002/07/07 - 2002/07/14
2002/07/14 - 2002/07/21
2002/07/21 - 2002/07/28
2002/09/15 - 2002/09/22
2003/04/13 - 2003/04/20
2004/08/08 - 2004/08/15
2004/08/15 - 2004/08/22
2004/08/22 - 2004/08/29
2004/08/29 - 2004/09/05
2004/09/05 - 2004/09/12
2004/09/12 - 2004/09/19
2004/09/19 - 2004/09/26
2004/09/26 - 2004/10/03
2004/10/03 - 2004/10/10


LINKS

Blogs:

News:

Current Affairs:

Sci/Tech:


Powered by Blogger