Four short links: 2 May 2012
29 days ago by rahuldave
Punting on SxSW (Brad Feld) -- I came across this old post and thought: if you can make money by being a dick, or make money by being a caring family person, why would you choose to be a dick? As far as I can tell, being a dick is optional. Brogrammers, take note. Be more like Brad Feld, who prioritises his family and acts accordingly.
Probabilistic Structures for Data Mining -- readable introduction to useful algorithms and datastructures showing their performance, reliability, and resources trade-off. (via Hacker News)
Dataset -- a Javascript library for transforming, querying, manipulating data from different sources.
Many HTTPS Servers are Insecure -- 75% still vulnerable to the BEAST attack.
algorithms
bigdata
bradfeld
cs
culture
javascript
machinelearning
math
opensource
security
ssl
worklifebalance
from google
Probabilistic Structures for Data Mining -- readable introduction to useful algorithms and datastructures showing their performance, reliability, and resources trade-off. (via Hacker News)
Dataset -- a Javascript library for transforming, querying, manipulating data from different sources.
Many HTTPS Servers are Insecure -- 75% still vulnerable to the BEAST attack.
29 days ago by rahuldave
Traveling salesman art
4 weeks ago by rahuldave
Bill Cook sent me a file yesterday that renders the Endeavour photo on my blog as the solution to a 66,290-city Traveling Salesman problem. His iPhone app Concord TSP chose 66,290 points and then solved for the shortest path connecting these points, a feat that would have strained a supercomputer a few years ago. (Bill Cook and I are not related as far as I know.)
Here is a thumbnail image of the full TSP tour:
You can find the full PDF here (1.24 MB). To show some of the detail, here is a close-up from near the top-left corner of the image:
I asked how the tour was constructed:
How do you construct a set of points whose TSP solution resembles a photograph? Is it sufficient to add more “cities” in regions where you want darker shading? And are the cities added at random with a density specified by color depth?
Bill Cook replied:
By default, the app will select the points along the line you describe: it splits the image into a grid, computes the average gray scale in each grid region, and drops a number of random cities into each grid region in proportion to the square of the average gray scale. This technique was first proposed by Bob Bosch and Adrianne Herman at Oberlin College. It is the default since it takes almost no time to compute, but I include two other options, that each take about a minute to render a large image on an iPhone 4.
The image of The Endeavour was created with a method Jim Bumgarnder proposed in his Stipple Cam project.
Related post:
Moore’s law squared
Computing
Math
from google
Here is a thumbnail image of the full TSP tour:
You can find the full PDF here (1.24 MB). To show some of the detail, here is a close-up from near the top-left corner of the image:
I asked how the tour was constructed:
How do you construct a set of points whose TSP solution resembles a photograph? Is it sufficient to add more “cities” in regions where you want darker shading? And are the cities added at random with a density specified by color depth?
Bill Cook replied:
By default, the app will select the points along the line you describe: it splits the image into a grid, computes the average gray scale in each grid region, and drops a number of random cities into each grid region in proportion to the square of the average gray scale. This technique was first proposed by Bob Bosch and Adrianne Herman at Oberlin College. It is the default since it takes almost no time to compute, but I include two other options, that each take about a minute to render a large image on an iPhone 4.
The image of The Endeavour was created with a method Jim Bumgarnder proposed in his Stipple Cam project.
Related post:
Moore’s law squared
4 weeks ago by rahuldave
Fractional integration
8 weeks ago by rahuldave
Define the integration operator I by
so I f is an antiderivative of f.
Define the second antiderivative I2 by applying I to f twice:
It turns out that
To see this, notice that both expressions for I2 are equal when x = a, and they have the same derivative, so they must be equal everywhere.
Now define I3 by applying I to I2 and so forth defining In for all positive integers n. Then one can show that the nth antiderivative is given by
The right side of this equation makes sense for non-integer values of n, and so we can take it as a definition of fractional integrals when n is not an integer.
Related post:
Fractional derivatives
Math
Integration
from google
so I f is an antiderivative of f.
Define the second antiderivative I2 by applying I to f twice:
It turns out that
To see this, notice that both expressions for I2 are equal when x = a, and they have the same derivative, so they must be equal everywhere.
Now define I3 by applying I to I2 and so forth defining In for all positive integers n. Then one can show that the nth antiderivative is given by
The right side of this equation makes sense for non-integer values of n, and so we can take it as a definition of fractional integrals when n is not an integer.
Related post:
Fractional derivatives
8 weeks ago by rahuldave
Book review: Functional Analysis
february 2012 by rahuldave
Functional Analysis by Elias Stein and Rami Shakarchi is a fast-paced book on functional analysis and related topics. By page 60, you’ve had a decent course in functional analysis and you’ve got 360 pages left.
This book is the last in a series of four volumes based on a series of lectures that began at Princeton in 2000. The first three volumes are devoted to
Fourier series and integrals
Complex analysis
Measure theory, Lebesgue integration, and Hilbert spaces.
The first three books are not necessarily prerequisites for the fourth book, though the final book does assume familiarity with the basics of the topics in the earlier books. The final book does make fairly frequent references to its predecessors. Someone who has not read the first three volumes — I have not — can let these references go by.
Stein and Shakarchi bring in several topics that may not be considered functional analysis per se but are often included in functional analysis books, namely harmonic analysis and generalized functions. It goes into territory less often included in a functional analysis text: probability, Brownian motion, and an introduction to several complex variables. This broad selection of topics is in keeping with the stated aims of the lecture series
to present, in an integrated manner, the core areas of analysis … to make plain the organic unity that exists between the various parts of the subject …
The goal of integrating various parts of analysis may be most clearly seen in the fourth chapter: Applications of the Baire Category Theorem. The material here is not organized by result but rather by proof technique.
Each chapter ends with a set of “exercises” and a set of “problems.” The former are closely related to the material in the book and include generous hints. The latter are more challenging and go beyond the scope of the book.
Math
Books
from google
This book is the last in a series of four volumes based on a series of lectures that began at Princeton in 2000. The first three volumes are devoted to
Fourier series and integrals
Complex analysis
Measure theory, Lebesgue integration, and Hilbert spaces.
The first three books are not necessarily prerequisites for the fourth book, though the final book does assume familiarity with the basics of the topics in the earlier books. The final book does make fairly frequent references to its predecessors. Someone who has not read the first three volumes — I have not — can let these references go by.
Stein and Shakarchi bring in several topics that may not be considered functional analysis per se but are often included in functional analysis books, namely harmonic analysis and generalized functions. It goes into territory less often included in a functional analysis text: probability, Brownian motion, and an introduction to several complex variables. This broad selection of topics is in keeping with the stated aims of the lecture series
to present, in an integrated manner, the core areas of analysis … to make plain the organic unity that exists between the various parts of the subject …
The goal of integrating various parts of analysis may be most clearly seen in the fourth chapter: Applications of the Baire Category Theorem. The material here is not organized by result but rather by proof technique.
Each chapter ends with a set of “exercises” and a set of “problems.” The former are closely related to the material in the book and include generous hints. The latter are more challenging and go beyond the scope of the book.
february 2012 by rahuldave
Jinc function
february 2012 by rahuldave
This afternoon I ran across the jinc function for the first time.
The sinc function is defined by
sinc(t) = sin(t) / t.
The jinc function is defined analogously by
jinc(t) = J1(t) / t
where J1 is a Bessel function. Bessel functions are analogous to sines, so the jinc function is analogous to the sinc function.
Here’s what the sinc and jinc functions look like.
The jinc function is not as common as the sinc function. For example, both Mathematica and SciPy have built-in functions for sinc but not for jinc. [There are actually two definitions of sinc. Mathematica uses the definition above, but SciPy uses sin(πt)/πt. The SciPy convention is more common in digital signal processing.]
As I write this, Wikipedia has an entry for sinc but not for jinc. Someone want to write one?
For small t, jinc(t) is approximately cos(t/2) / 2. This approximation has error O(t4), so it’s very good for small t, useless for large t.
For large values of t, jinc(t) is like a damped, shifted cosine. Specifically,
with an error that decreases like O( |t|-2 ).
Like the sinc function, the jinc function has a simple Fourier transform. Both transforms are zero outside the interval [-1, 1]. Inside this interval, the transform of sinc is a constant, √(π/8). On the same interval, the transform of jinc is √(2/π) √(1 – ω2).
Update: How to compute jinc(x)
Related posts:
How to visualize Bessel functions
Diagram of Bessel function relationships
Math
from google
The sinc function is defined by
sinc(t) = sin(t) / t.
The jinc function is defined analogously by
jinc(t) = J1(t) / t
where J1 is a Bessel function. Bessel functions are analogous to sines, so the jinc function is analogous to the sinc function.
Here’s what the sinc and jinc functions look like.
The jinc function is not as common as the sinc function. For example, both Mathematica and SciPy have built-in functions for sinc but not for jinc. [There are actually two definitions of sinc. Mathematica uses the definition above, but SciPy uses sin(πt)/πt. The SciPy convention is more common in digital signal processing.]
As I write this, Wikipedia has an entry for sinc but not for jinc. Someone want to write one?
For small t, jinc(t) is approximately cos(t/2) / 2. This approximation has error O(t4), so it’s very good for small t, useless for large t.
For large values of t, jinc(t) is like a damped, shifted cosine. Specifically,
with an error that decreases like O( |t|-2 ).
Like the sinc function, the jinc function has a simple Fourier transform. Both transforms are zero outside the interval [-1, 1]. Inside this interval, the transform of sinc is a constant, √(π/8). On the same interval, the transform of jinc is √(2/π) √(1 – ω2).
Update: How to compute jinc(x)
Related posts:
How to visualize Bessel functions
Diagram of Bessel function relationships
february 2012 by rahuldave
Solution to Renaissance problem
november 2011 by rahuldave
The previous post presented a problem first posed by Johannes Müller in 1471.
Where you should stand so that a vertical bar appears longest?
To be more precise, suppose a vertical bar is hanging so that the top of the bar is a distance a above your eye level and the bottom is a distance b above your eye level. Let x be the horizontal distance to the bar. For what value of x does the bar appear longest?
In the following diagram, we wish to maximize the angle θ.
Since tangent is an increasing function, it suffices to maximize tan(θ). Let α = θ + β. Then
Now use tan α = a/x and tan β = b/x to reduce the expression above to
Now we have a function of x to maximize. Take the derivative and find where it is zero. The maximum occurs at √ab, the geometric mean of a and b.
However, when Müller proposed his problem in 1471, calculus had not yet been invented, so we can be pretty sure Müller did not take derivatives. I don’t know how (or even if) Müller solved his problem, but the book where I found the problem showed how it could be solved without calculus. The derivation is a little longer, but it only depends on simple algebra and the arithmetic-geometric mean inequality, i.e. the observation that (a + b) /2 ≥ √ab.
Update: Here is a purely geometric solution by George Papademetriou.
Update: See this post for more historical background.
Other posts about the geometric mean:
Means and inequalities
The middle size of the universe
Math
from google
Where you should stand so that a vertical bar appears longest?
To be more precise, suppose a vertical bar is hanging so that the top of the bar is a distance a above your eye level and the bottom is a distance b above your eye level. Let x be the horizontal distance to the bar. For what value of x does the bar appear longest?
In the following diagram, we wish to maximize the angle θ.
Since tangent is an increasing function, it suffices to maximize tan(θ). Let α = θ + β. Then
Now use tan α = a/x and tan β = b/x to reduce the expression above to
Now we have a function of x to maximize. Take the derivative and find where it is zero. The maximum occurs at √ab, the geometric mean of a and b.
However, when Müller proposed his problem in 1471, calculus had not yet been invented, so we can be pretty sure Müller did not take derivatives. I don’t know how (or even if) Müller solved his problem, but the book where I found the problem showed how it could be solved without calculus. The derivation is a little longer, but it only depends on simple algebra and the arithmetic-geometric mean inequality, i.e. the observation that (a + b) /2 ≥ √ab.
Update: Here is a purely geometric solution by George Papademetriou.
Update: See this post for more historical background.
Other posts about the geometric mean:
Means and inequalities
The middle size of the universe
november 2011 by rahuldave
Surprising applications of math
november 2011 by rahuldave
The comments in the previous post touched on surprising applications of math, so I thought I’d expand this theme into it’s own post. Below I’ll give a couple general examples of surprising applications and then I’ll give a couple more personal applications I found surprising.
Number theory has traditionally been the purest of pure mathematics. People study number theory for the joy of doing so, not to make money. At least that was largely true until the advent of public key cryptography. The difficulty of solving certain number theory problems now ensures the difficulty of decrypting private communication, or so we hope. (By the way, I’ve always thought Euler deserved part of the credit for the RSA encryption scheme. Maybe it should be called RSAE encryption. R, S, and A came up with the brilliant idea to apply E’s theorem to cryptography.)
Non-euclidean geometry started as a pure mathematical abstraction. Of course the physical world is Euclidean, but let’s see what happens if we monkey with Euclid’s fifth postulate. Then along came Einstein and suddenly the real world is non-Euclidean.
One personal application of math that I found surprising was using Fibonacci numbers in practical computation. Computing Fibonacci numbers is a computer science cliché, but I actually needed to compute Fibonacci numbers for a numerical integration problem. I wrote up the details in Fibonacci numbers at work.
Another application that surprised me was using the trapezoid rule for real work. The trapezoid rule is a crude numerical integration technique. It’s good for teaching because it’s very simple, but it’s not very accurate. Or so I thought. It’s not very accurate in general, but in the right circumstances, it can be extraordinarily accurate. I explain more in Three surprises with the trapezoid rule.
One surprising non-application has been differential equations. For the past three centuries, differential equations have been at the heart of applied math. One could argue that to first approximation, applied math equals differential equations and supporting material. But I personally have not had nearly as much opportunity to use differential equations professionally as I expected, even though that was my specialization in grad school.
Related posts:
Ten surprises in numerical linear algebra
Impure math
Math
from google
Number theory has traditionally been the purest of pure mathematics. People study number theory for the joy of doing so, not to make money. At least that was largely true until the advent of public key cryptography. The difficulty of solving certain number theory problems now ensures the difficulty of decrypting private communication, or so we hope. (By the way, I’ve always thought Euler deserved part of the credit for the RSA encryption scheme. Maybe it should be called RSAE encryption. R, S, and A came up with the brilliant idea to apply E’s theorem to cryptography.)
Non-euclidean geometry started as a pure mathematical abstraction. Of course the physical world is Euclidean, but let’s see what happens if we monkey with Euclid’s fifth postulate. Then along came Einstein and suddenly the real world is non-Euclidean.
One personal application of math that I found surprising was using Fibonacci numbers in practical computation. Computing Fibonacci numbers is a computer science cliché, but I actually needed to compute Fibonacci numbers for a numerical integration problem. I wrote up the details in Fibonacci numbers at work.
Another application that surprised me was using the trapezoid rule for real work. The trapezoid rule is a crude numerical integration technique. It’s good for teaching because it’s very simple, but it’s not very accurate. Or so I thought. It’s not very accurate in general, but in the right circumstances, it can be extraordinarily accurate. I explain more in Three surprises with the trapezoid rule.
One surprising non-application has been differential equations. For the past three centuries, differential equations have been at the heart of applied math. One could argue that to first approximation, applied math equals differential equations and supporting material. But I personally have not had nearly as much opportunity to use differential equations professionally as I expected, even though that was my specialization in grad school.
Related posts:
Ten surprises in numerical linear algebra
Impure math
november 2011 by rahuldave
Scientifically, You Are Likely In the Slowest Line
december 2010 by rahuldave
MojoKid writes "As you wait in the checkout line for the holidays, your observation is most likely correct. That other line is moving faster than yours. That's what Bill Hammack (the Engineer Guy), from the Department of Chemical and Biomolecular Engineering at the University of Illinois — Urbana proves in this video. Ironically, the most efficient set-up is to have one line feed into several cashiers. This is because if any one line slows because of an issue, the entry queue continues to have customers reach check-out optimally. However, this is also perceived by customers as the least efficient, psychologically."
Read more of this story at Slashdot.
math
from google
Read more of this story at Slashdot.
december 2010 by rahuldave
Simple approximation to normal distribution
april 2010 by rahuldave
Here’s a simple approximation to the normal distribution I just ran across. The density function is
f(x) = (1 + cos(x))/2π
over the interval (-π, π). The plot below graphs this density with a solid blue line. For comparison, the density of a normal distribution with the same variance is plotted with a dashed orange line.
The approximation is good enough to use for teaching. Students may benefit from doing an exercise twice, once with this approximation and then again with the normal distribution. Having an approximation they can integrate in closed form may help take some of the mystery out of the normal distribution.
The approximation may have practical uses. The agreement between the PDFs isn’t great. However, the agreement between the CDFs (which is more important) is surprisingly good. The maximum difference between the two CDFs is only 0.018. (The differences between the PDFs oscillate, and so their integrals, the CDFs, are closer together.)
I ran across this approximation here. It goes back to the 1961 paper “A cosine approximation to the normal distribution” by D. H. Raab and E. H. Green, Psychometrika, Volume 26, pages 447-450.
Update 1: See the paper referenced in the first comment. It gives a much more accurate approximation using a logistic function. The cosine approximation is a little simpler and may be better for teaching. However, the logistic approximation has infinite support. That could be an advantage since students might be distracted by the finite support of the cosine approximation.
The logistic approximation for the standard normal CDF is
F(x) = 1/(1 + exp(-0.07056 x3 – 1.5976 x))
and has a maximum error of 0.00014 at x = ± 3.16.
Update 2:
How might you use this approximation the other way around, approximating a cosine by a normal density? See Always invert.
Related posts:
Rolling dice for normal samples
Sums of uniform random variables
Math
Statistics
Probability_and_Statistics
from google
f(x) = (1 + cos(x))/2π
over the interval (-π, π). The plot below graphs this density with a solid blue line. For comparison, the density of a normal distribution with the same variance is plotted with a dashed orange line.
The approximation is good enough to use for teaching. Students may benefit from doing an exercise twice, once with this approximation and then again with the normal distribution. Having an approximation they can integrate in closed form may help take some of the mystery out of the normal distribution.
The approximation may have practical uses. The agreement between the PDFs isn’t great. However, the agreement between the CDFs (which is more important) is surprisingly good. The maximum difference between the two CDFs is only 0.018. (The differences between the PDFs oscillate, and so their integrals, the CDFs, are closer together.)
I ran across this approximation here. It goes back to the 1961 paper “A cosine approximation to the normal distribution” by D. H. Raab and E. H. Green, Psychometrika, Volume 26, pages 447-450.
Update 1: See the paper referenced in the first comment. It gives a much more accurate approximation using a logistic function. The cosine approximation is a little simpler and may be better for teaching. However, the logistic approximation has infinite support. That could be an advantage since students might be distracted by the finite support of the cosine approximation.
The logistic approximation for the standard normal CDF is
F(x) = 1/(1 + exp(-0.07056 x3 – 1.5976 x))
and has a maximum error of 0.00014 at x = ± 3.16.
Update 2:
How might you use this approximation the other way around, approximating a cosine by a normal density? See Always invert.
Related posts:
Rolling dice for normal samples
Sums of uniform random variables
april 2010 by rahuldave
The coolest elevator I’ve seen
march 2010 by rahuldave
Elevator in Mumbai office building
This was amazing! I loved that the elevator proudly displayed both a “0 floor” for the ground AND a “-1 floor” for the subterranean parking garage.
This kind of logic is not widespread in buildings. A few strange examples are explained in Wikipedia’s entry for building story’s. The gist is that most floor numbering systems seem to suffer from superstition (no 13th floor), innumeracy (mezzanine level), or inconsistency (a building on a hill may have two “ground” floors).
I applaud these architects and hope this is a genuine sign that math literacy is improving.
Have you seen other interesting floor numbering systems? Write your story in the comments or send a pic to me and I’ll be happy to post it.
Tangents
elevator
funny
math
from google
This was amazing! I loved that the elevator proudly displayed both a “0 floor” for the ground AND a “-1 floor” for the subterranean parking garage.
This kind of logic is not widespread in buildings. A few strange examples are explained in Wikipedia’s entry for building story’s. The gist is that most floor numbering systems seem to suffer from superstition (no 13th floor), innumeracy (mezzanine level), or inconsistency (a building on a hill may have two “ground” floors).
I applaud these architects and hope this is a genuine sign that math literacy is improving.
Have you seen other interesting floor numbering systems? Write your story in the comments or send a pic to me and I’ll be happy to post it.
march 2010 by rahuldave
related tags
algorithms ⊕ bigdata ⊕ Books ⊕ bradfeld ⊕ Computing ⊕ cs ⊕ culture ⊕ elevator ⊕ funny ⊕ Integration ⊕ javascript ⊕ machinelearning ⊕ math ⊖ opensource ⊕ Probability_and_Statistics ⊕ security ⊕ ssl ⊕ Statistics ⊕ Tangents ⊕ worklifebalance ⊕Copy this bookmark: