Reading for the Rushed [Reading]
march 2012 by rahuldave
People sometimes ask me how I'm able to read 70+ books every year despite my extra-curricular, professional, and authoring activities. The truth is that although my reading count has remained fairly consistent over the past few years, it's about half my historical count and far below that of truly prolific readers. Still, people ask, so I'll try to answer as best I can. Below you'll find a short list of principles that help me maximize my reading time and motivation.
Reading
Books
Education
Learning
Top
from google
Statistics project ideas for students
february 2012 by rahuldave
(This article was first published on Simply Statistics, and kindly contributed to R-bloggers)
Here are a few ideas that might make for interesting student projects at all levels (from high-school to graduate school). I’d welcome ideas/suggestions/additions to the list as well. All of these ideas depend on free or scraped data, which means that anyone can work on them. I’ve given a ballpark difficulty for each project to give people some idea.
Happy data crunching!
Data Collection/Synthesis
1. Creating a webpage that explains conceptual statistical issues like randomization, margin of error, overfitting, cross-validation, concepts in data visualization, and sampling. The webpage should not use any math at all and should explain the concepts so a general audience could understand them. Bonus points if you make short 30-second animated YouTube clips that explain the concepts. (Difficulty: Lowish; Effort: Highish)
2. Building an aggregator for statistics papers across disciplines that can be the central resource for statisticians. Journals ranging from PLoS Genetics to Neuroimage now routinely publish statistical papers, but no single resource aggregates all the statistics papers published across disciplines. Such a resource would be hugely useful to statisticians. You could build it using blogging software like WordPress so that articles could be tagged and the resource could be added to your RSS reader. (Difficulty: Lowish; Effort: Mediumish)
Data Analyses
1. Scrape the LivingSocial/Groupon sites for the daily deals and develop a prediction of how successful a deal will be based on location, price, and type of deal. You could use either the RCurl R package or the XML R package to scrape the data. (Difficulty: Mediumish; Effort: Mediumish)
2. You could use the data from your city (here are a few cities with open data) to: (a) identify the best and worst neighborhoods to live in based on different metrics like how many parks are within walking distance, crime statistics, etc.; (b) identify concrete measures your city could take to improve quality-of-life metrics like those described above, say, where the city should put a park; or (c) see if you can predict when and where crimes will occur (like these guys did). (Difficulty: Mediumish; Effort: Highish)
3. Download data on State of the Union speeches from here and use the tm package in R to analyze the patterns of word use over time. (Difficulty: Lowish; Effort: Lowish)
4. Use this data set from Donors Choose to determine the characteristics that make the funding of projects more likely. You could send your results to the Donors Choose folks to help them improve the funding rate for their projects. (Difficulty: Mediumish; Effort: Mediumish)
5. Which basketball player would you want on your team? Here is a really simple analysis done by Rafa, but it doesn’t take into account things like defense. If you want to take on this project, you should take a look at this Dennis Rodman analysis, which is the gold standard. (Difficulty: Mediumish; Effort: Highish)
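As a rough illustration of the State of the Union word-use project above, here is a minimal sketch in Python using only the standard library. The post itself suggests the tm package in R, so this is a swapped-in stand-in rather than the suggested tooling. The function name `word_share_by_year` and the two short "speeches" are hypothetical placeholders, not real speech text.

```python
import re
from collections import Counter


def word_share_by_year(speeches, word):
    """Map each year to the fraction of tokens equal to `word` (case-insensitive)."""
    shares = {}
    for year, text in sorted(speeches.items()):
        # Crude tokenizer: runs of letters and apostrophes, lower-cased.
        tokens = re.findall(r"[a-z']+", text.lower())
        counts = Counter(tokens)
        shares[year] = counts[word.lower()] / len(tokens) if tokens else 0.0
    return shares


# Hypothetical placeholder text, NOT real speeches.
speeches = {
    1990: "the economy is strong and the union is strong",
    2000: "the economy grows and jobs grow with the economy",
}

shares = word_share_by_year(speeches, "economy")
for year, share in shares.items():
    print(year, round(share, 3))
```

Using a share of tokens rather than a raw count keeps long and short speeches comparable; a real analysis would likely also want stop-word removal and stemming, which tm provides.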
Data visualization
1. Creating an R package that wraps the SVGAnnotation package. This package can be used to create dynamic graphics in R, but is still a bit too flexible for most people to use. Writing some wrapper functions that simplify the interface would be potentially high impact. Maybe something like svgPlot() to create simple, dynamic graphics with only a few options. (Difficulty: Mediumish; Effort: Mediumish)
2. The same as project 1 but for D3.js. The impact could potentially be a bit higher, since the graphics are a bit more professional, but the level of difficulty and effort would also both be higher. (Difficulty: Highish; Effort: Highish)
R_bloggers
DIY
education
Projects
R
from google
Apple to announce tools, platform to "digitally destroy" textbook publishing
january 2012 by rahuldave
Apple is slated to announce the fruits of its labor on improving the use of technology in education at its special media event on Thursday, January 19. While speculation has so far centered on digital textbooks, sources close to the matter have confirmed to Ars that Apple will announce tools to help create interactive e-books—the "GarageBand for e-books," so to speak—and expand its current platform to distribute them to iPhone and iPad users.
Along with the details we were able to gather from our sources, we also spoke to two experts in the field of digital publishing to get a clearer picture of the significance of what Apple is planning to announce.
News
Apple
digitaltextbooks
ebooks
education
ibooks
ipad
mobile
stevejobs
from google
Teaching Bayesian stats backward
april 2011 by rahuldave
Most presentations of Bayesian statistics I’ve seen start with elementary examples of Bayes’ Theorem. And most of these use the canonical example of testing for rare diseases. But the connection between these examples and Bayesian statistics is not obvious at first. Maybe this isn’t the best approach.
What if we begin with the end in mind? Bayesian calculations produce posterior probability distributions on parameters. An effective way to teach Bayesian statistics might be to start there. Suppose we had probability distributions on our parameters. Never mind where they came from. Never mind classical objections that say you can’t do this. What if you could? If you had such distributions, what could you do with them?
For starters, point estimation and interval estimation become trivial. You could, for example, use the distribution mean as a point estimate and the area between two quantiles as an interval estimate. The distributions tell you far more than point estimates or interval estimates could; these estimates are simply summaries of the information contained in the distributions.
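To make the point about summaries concrete, here is a minimal sketch in Python. In the spirit of "never mind where they came from", it simply assumes we already have draws from a posterior distribution on a parameter. The Beta(8, 4) distribution, the 95% interval level, and the function name `summarize_posterior` are all illustrative assumptions, not anything specified in the post.

```python
import random
import statistics


def summarize_posterior(samples, lower=0.025, upper=0.975):
    """Return (mean, (low, high)) computed from a list of posterior draws."""
    ordered = sorted(samples)
    n = len(ordered)

    def quantile(q):
        # Simple order-statistic quantile; adequate for a large number of draws.
        index = min(n - 1, max(0, int(q * (n - 1))))
        return ordered[index]

    return statistics.fmean(ordered), (quantile(lower), quantile(upper))


random.seed(0)
# Assumed stand-in for a posterior on a proportion: Beta(8, 4).
draws = [random.betavariate(8, 4) for _ in range(10_000)]

mean, (low, high) = summarize_posterior(draws)
print("point estimate (posterior mean):", round(mean, 3))
print("interval estimate (2.5% to 97.5% quantiles):", round(low, 3), round(high, 3))
```

The draws themselves carry more information than either summary: the same list could just as easily answer a question such as "what is the probability that the parameter exceeds 0.5?" by counting the fraction of draws above 0.5.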
It makes logical sense to start with Bayes’ Theorem since that’s the tool used to construct posterior distributions. But I think it makes pedagogical sense to start with the posterior distribution and work backward to how one would come up with such a thing.
Bayesian statistics is so named because Bayes’ Theorem is essential to its calculations. But that’s a little like calling classical statistics “Central Limitist” statistics because it relies heavily on the Central Limit Theorem.
The key idea of Bayesian statistics is to represent all uncertainty by probability distributions. That idea can be obscured by an early emphasis on calculations.
Statistics
Bayesian
Education
from google
HTML5 For Web Designers
may 2010 by rahuldave
When Mandy Brown, Jason Santa Maria and I formed A Book Apart, one topic burned uppermost in our minds, and there was only one author for the job.
Nothing else, not even “real fonts” or CSS3, has stirred the standards-based design community like the imminent arrival of HTML5. Born out of dissatisfaction with the pacing and politics of the W3C, and conceived for a web of applications (not just documents), this new edition of the web’s lingua franca has in equal measure excited, angered, and confused the web design community.
Win free copies of HTML5 For Web Designers on Gowalla!
Just as he did with the DOM and JavaScript, Jeremy Keith has a unique ability to illuminate HTML5 and cut straight to what matters to accessible, standards-based designer-developers. And he does it in this book, using only as many words and pictures as are needed.
Watch Jeremy Keith discuss HTML5 with Dan Benjamin and me live on The Big Web Show this Thursday at 1:00 PM Eastern.
There are other books about HTML5, and there will be many more. There will be 500-page technical books for application developers, whose needs drove much of HTML5’s development. There will be even longer secret books for browser makers, addressing technical challenges that you and I are blessed never to need to think about.
But this is a book for you—you who create web content, who mark up web pages for sense and semantics, and who design accessible interfaces and experiences. Call it your user guide to HTML5. Its goal—one it will share with every title in the forthcoming A Book Apart catalog—is to shed clear light on a tricky subject, and do it fast, so you can get back to work.
4 May 2010
Jeffrey Zeldman, Publisher
A Book Apart “for people who make websites”
In Association with A List Apart
An imprint of Happy Cog™
The present-day content producer refuses to die.
And don’t miss…
Read Chapter One free in today’s issue of A List Apart!
The author, Mr Jeremy Keith himself, shares his thoughts!
Creative director Jason Santa Maria discusses the design of A Book Apart!
Editor Mandy Brown discusses the business side of A Book Apart!
Announcements
Applications
Code
Design
Education
HTML
HTML5
Jeremy_Keith
Publications
Publishing
Web_Design
Web_Design_History
Web_Standards
Zeldman
development
editorial
industry
jeremy
keith
thursday
discusses
books
book
gowalla
from google
The 21st-century textbook
april 2010 by rahuldave
With new technologies constantly coming on-line, and with states like California, Texas, and Oregon allowing digital curriculum to replace printed curriculum, the question arises: what will textbooks look like in the coming years?
Dale’s post, "A hunger for good learning," featured a fantastic video about teaching math. In a few brief minutes, Dan Meyer showed us a photo of a math problem involving filling a tank of water and calculating how long that would take, then showed us why traditional approaches to teaching this problem stifled student learning. The picture showed a traditional math problem with a line drawing of the tank, a problem set-up written in text (octagonal tank, straight sides, 27oz per second, etc.) followed by short sub-steps that are needed to solve the problem (calculate the surface area of the base, calculate the volume). Then, finally, it asks the question “how long will it take to fill the tank?” Dan’s view is that this spoon-feeding of problem solving in little steps trains students not to think like mathematicians and not to have the patience for solving complex problems. Instead, Dan prefers to show his students a video of the tank filling up, agonizingly slowly, until the students are eager to know “How long until that tank fills up, anyway?” And then they’re off -- discussing, questioning, and, most importantly, formulating the problem on their own, just as good mathematicians do.
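For reference, the traditional sub-steps described above (area of the base, then volume, then fill time) can be sketched in a few lines of Python. The post gives only the shape (octagonal, straight sides) and the flow rate (27 oz per second), so the side length and height below are hypothetical values, and the code assumes a regular octagon and US fluid ounces.

```python
import math

# 1 US gallon = 231 cubic inches = 128 US fluid ounces.
CUBIC_INCHES_PER_US_FL_OZ = 231 / 128


def octagon_area(side):
    """Area of a regular octagon with the given side length."""
    return 2 * (1 + math.sqrt(2)) * side ** 2


def fill_time_seconds(side_in, height_in, rate_oz_per_s=27.0):
    """Seconds needed to fill a straight-sided octagonal tank at a constant rate."""
    volume_cubic_in = octagon_area(side_in) * height_in
    return volume_cubic_in / (rate_oz_per_s * CUBIC_INCHES_PER_US_FL_OZ)


# Hypothetical dimensions in inches; the post does not state the tank's size.
t = fill_time_seconds(side_in=10.0, height_in=20.0)
print(f"about {t:.0f} seconds ({t / 60:.1f} minutes)")
```

This is exactly the kind of step-by-step recipe Dan argues students should arrive at on their own rather than be handed.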
It seems that the textbook of the 21st century will look a lot more like Dan's presentation than the bound paper tomes we grew up with. If the 21st-century textbook is delivered digitally to students, we can expect it to be far more than a .pdf representation of a traditional text. For example, let's say the textbook publisher chose to experiment with findings from the research community that kids learn better from authentic and difficult problems than they do from bite-sized steps laid out one after the other. The publisher does what Dan Meyer did: recreates the tank problem and pushes an updated version of the textbook to a handful of beta testers. The next morning, Dan’s students walk into class and open the book to chapter 5. The old problem is gone; instead there is just a video of a tank and instructions that say “watch me fill up -- when you know how long it takes, please enter the answer.” Sure, a student might choose to watch the video for seven-plus hours and finally write down the time it took. But when boredom sets in, a more engaging option is to just play with the problem. By staying up to date with new information and practices, this textbook is living.
In this example, the student finds all the needed tools lying around the page: a ruler for measuring the size of the tank, a cup of known size and a stopwatch to measure the rate of water flow, and various other tools. It is left to the student to decide which ones are relevant to solving the problem. This textbook is interactive.
On the opposite page of the book is a chat window where students can share hypotheses, discuss approaches, and share results from using the tools (“I get 18.4 inches for the height, but you got 18.7”). This textbook is participative.
Of course, for the kid who already understands this deeply and finishes quickly, there are better challenges waiting: similar problems with trickier tank shapes, problems where more than one pipe fills the tank, and problems where the rate of water flow varies. This textbook provides each student with the right level of challenge at any given time -- it is adaptive.
If some of these problems require new tools and concepts, the student has the ability to research on the Internet, connect with tutors in higher grades, chat with other students across the world who happen to be wrestling with this same problem right now, or find and watch a YouTube-sized lecture on a relevant topic. This textbook is connected.
In addition to being living, interactive, participative, adaptive, and connected, we can expect the 21st-century textbook to be personalized and mashable. Beyond that, though, could the 21st-century textbook hold out a unique promise: that the student who uses this kind of textbook no longer needs to wait for high-stakes, anxiety-inducing tests to determine whether he has learned a topic? What if the digital textbook were instrumented to collect and interpret data in such a way that it could tell a student's level of mastery without test-taking, just from how he engaged with the content? Some of these measurements and interpretations are easy to imagine, such as 'Which digital tool did the student first pick up to make measurements in the tank-filling problem?' and 'What keywords did he search for on the internet?' Other kinds of data will be harder to interpret, such as 'What solutions did he try on his scratch pad?', 'What questions did he ask his peers?', and 'Which of his peers' questions did he answer?' But to whatever degree it is possible, what would it mean for a textbook to understand a student's level of mastery in real time from his work in this digital medium? With that information, could a teacher know exactly what next challenge would be optimal for each student’s learning on a daily basis?
What if the textbook publishers could see, in aggregate, how effective their content is, learn from that, adapt their textbooks, and redistribute new and improved content in months, weeks, or days rather than the current seven-year adoption cycles -- much in the way that Google measures our interactions with its applications and improves them based on the results? Depending on how well the beta testers in Dan's classroom learned to solve algebra problems, the textbook modifications might become standard for all algebra students. What if the instrumented 21st-century textbook were able to measure both a student's learning and its own effectiveness, and that capability moved education innovation itself to Internet time?
edu20
education
publishing
textbooks
from google
Photo Tutor Teaches Basic Camera Exposure on the Go from Your iPhone [Downloads]
march 2010 by rahuldave
iPhone/iPod touch: If you're interested in getting beyond automatic mode on your camera but don't have a lot of time, the free versions of Photo Tutor Module Lite 1 and 2 explain the finer points of aperture and shutter speed on the go.
Downloads
Education
Featured_iPhone_Download
iPhone
iPhone_Apps
ipod_touch
Learning
Photography
Photography_Tip
Photos
Pictures
from google