wrrn + statistics   68

Statisticians Investigate Political Bias On Wikipedia - Slashdot
The team first identified 1,000 political phrases based on the number of times these phrases appeared in the text of the 2005 Congressional Record and applied statistical methods to identify the phrases that separated Democratic representatives from Republican representatives, under the model that each group speaks to its respective constituents with a distinct set of coded language.
statistics  digital-humanities  data-mining  information-retrieval  connectionmachine 
3 days ago by wrrn
BioPhysEngr Blog: EigenBracket 2012: Using Graph Theory to Predict NCAA March Madness Basketball
A simplified (and mostly accurate) way to think about this is that every team starts out with an equal number of "quality points".  Every time the computer says "Go", teams distribute their quality points to all the teams that beat them.  Thus, good teams get more quality points than they gave away (and vice versa for bad teams).  After a few rounds of this procedure, the quality points for every team approaches convergence.
graph-theory  mathematics  statistics  prediction 
11 weeks ago by wrrn
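A minimal sketch of the "quality points" procedure described in the EigenBracket excerpt above. The function name `quality_points`, the `(winner, loser)` input format, and the rule that an undefeated team simply keeps its points are assumptions made for illustration; the blog's actual method may differ (for example, it may add damping so that points do not all drain to undefeated teams).

```python
def quality_points(games, rounds=50):
    """Redistribute points from losers to the teams that beat them.

    games: list of (winner, loser) pairs (hypothetical input format).
    Returns a dict mapping each team to its points after `rounds` rounds.
    """
    teams = sorted({team for game in games for team in game})

    # beaten_by[t] holds the winner of every game that t lost
    # (a team appears twice if it beat t twice).
    beaten_by = {t: [] for t in teams}
    for winner, loser in games:
        beaten_by[loser].append(winner)

    # Every team starts with the same number of points.
    points = {t: 1.0 for t in teams}

    for _ in range(rounds):
        nxt = {t: 0.0 for t in teams}
        for t in teams:
            if beaten_by[t]:
                # Split this team's points equally among its losses.
                share = points[t] / len(beaten_by[t])
                for w in beaten_by[t]:
                    nxt[w] += share
            else:
                # Assumption: an undefeated team keeps what it has.
                nxt[t] += points[t]
        points = nxt
    return points


if __name__ == "__main__":
    season = [("A", "B"), ("A", "C"), ("B", "C")]
    print(quality_points(season))
```

The total number of points is conserved in every round, so the result can be read as a share of overall "quality" per team.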
Decision Theories: A Less Wrong Primer - Less Wrong
a decision theory is an algorithm for making decisions. The inputs are an agent's knowledge of the world, and the agent's goals and values; the output is a particular action (or plan of actions). Actually, in many cases the goals and values are implicit in the algorithm rather than given as input, but it's worth keeping them distinct in theory.
decision-trees  critical_thinking  game-theory  statistics 
11 weeks ago by wrrn
Classification with more than two classes
Classification for classes that are not mutually exclusive is called any-of, multilabel, or multivalue classification. In this case, a document can belong to several classes simultaneously, or to a single class, or to none of the classes.
statistics  bayesian  machine-learning 
11 weeks ago by wrrn
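One common way to realise the any-of setting in the excerpt above is to run one independent yes/no classifier per class and collect every class that accepts the document. The sketch below assumes that approach; the keyword-based classifiers and the name `any_of_classify` are toy placeholders, not anything taken from the linked chapter.

```python
def any_of_classify(document, classifiers):
    """Return the set of all labels whose binary classifier accepts the document.

    classifiers: dict mapping label -> function(document) -> bool.
    The result may contain several labels, one label, or none.
    """
    return {label for label, accepts in classifiers.items() if accepts(document)}


def keyword_classifier(keywords):
    """Build a toy binary classifier that fires on any of the given keywords."""
    def accepts(document):
        words = set(document.lower().split())
        return bool(words & keywords)
    return accepts


# Hypothetical classes and keywords, for illustration only.
CLASSIFIERS = {
    "politics": keyword_classifier({"election", "parliament"}),
    "economy": keyword_classifier({"markets", "inflation"}),
}

if __name__ == "__main__":
    print(any_of_classify("Election result boosts markets", CLASSIFIERS))
    print(any_of_classify("Nothing relevant here", CLASSIFIERS))
```

Because each decision is made separately, adding a class never changes the verdicts for the existing ones.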
How data science is like magic | Anne Z.
As much as it was like anything, magic was like a language. And like a language, textbooks and teachers treated it as an orderly system for the purposes of teaching it, but in reality it was complex and chaotic and organic. It obeyed rules only to the extent that it felt like it, and there were almost as many special cases and one-time variations as there were rules.
data-mining  statistics  algorithms  science  learning 
12 weeks ago by wrrn
Google Maps Help Predict Meth Labs Before They Open | Fast Company
Burnum and Lu examined data collected from 2002 to 2005 on seized meth lab equipment and where rogue chemists dumped the toxic by-products of methamphetamine manufacture. Map data analyzed over time successfully demonstrated the spread of meth labs throughout a metropolitan area--and even predicted where they would pop up next.
google  maps  opendata  crime  statistics  probability 
february 2012 by wrrn
The Coalition Government has only a 1 in 3 chance of lasting its term. Statistical modelling predicts its fall in October of 2014 | British Politics and Policy at LSE
this particular model is an example of duration analysis. Duration analysis is used in lots of fields, but with different names. Engineers might talk about time-to-failure models. Epidemiologists might talk about survival models. In all cases, we’re trying to make predictions about the time until a particular event – failure of a key mechanical part, or death due to disease
politics  statistics  mathematics  models  connectionmachine 
february 2012 by wrrn
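To make "duration analysis" in the excerpt above concrete, here is a sketch of the Kaplan-Meier estimator, a standard non-parametric survival curve. It is a swapped-in illustration, not the model used in the linked LSE post, and the function name and input format are assumptions. Each subject has a duration and a flag saying whether the event was actually observed (True) or the subject was still "alive" when observation stopped (False, i.e. censored).

```python
def kaplan_meier(durations, observed):
    """Estimate the survival function S(t) with the Kaplan-Meier product.

    durations: time until the event or until observation ended.
    observed:  True if the event happened, False if censored.
    Returns a list of (time, survival_probability) at each event time.
    """
    pairs = sorted(zip(durations, observed))
    at_risk = len(pairs)      # subjects still under observation
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        events = 0
        leaving = 0
        # Group all subjects that share the same duration.
        while i < len(pairs) and pairs[i][0] == t:
            events += 1 if pairs[i][1] else 0
            leaving += 1
            i += 1
        if events:
            # Multiply by the fraction of at-risk subjects that survive t.
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        # Events and censored subjects both leave the risk set.
        at_risk -= leaving
    return curve


if __name__ == "__main__":
    # Hypothetical government durations in months; False = still in office.
    print(kaplan_meier([12, 30, 30, 48, 60], [True, True, False, True, False]))
```

Censored subjects never lower the curve themselves, but they shrink the risk set, which makes later events count for more.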
Markov chain - Wikipedia, the free encyclopedia
a mathematical system that undergoes transitions from one state to another, between a finite or countable number of possible states. It is a random process characterized as memoryless: the next state depends only on the current state and not on the sequence of events that preceded it. This specific kind of "memorylessness" is called the Markov property. Markov chains have many applications as statistical models of real-world processes.
programming  statistics  mathematics  algorithms  study 
february 2012 by wrrn
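The Markov property in the excerpt above can be illustrated with a short simulation in which the next state is drawn using only the current state. The two-state weather chain and its probabilities are made up for the example, and `simulate_chain` is a hypothetical helper name.

```python
import random


def simulate_chain(transitions, start, steps, rng=None):
    """Walk a discrete-time Markov chain for `steps` transitions.

    transitions: dict mapping state -> {next_state: probability}.
    Returns the list of visited states, including the start state.
    """
    rng = rng or random.Random()
    state = start
    path = [state]
    for _ in range(steps):
        # Only the current state is consulted: this is the memorylessness.
        next_states = list(transitions[state])
        weights = [transitions[state][s] for s in next_states]
        state = rng.choices(next_states, weights=weights, k=1)[0]
        path.append(state)
    return path


# Made-up transition probabilities, for illustration only.
WEATHER = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}

if __name__ == "__main__":
    print(simulate_chain(WEATHER, "sunny", 10, random.Random(0)))
```

Passing a seeded `random.Random` makes a run reproducible without touching the global random state.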
How To Build a Naive Bayes Classifier
In this article I'm describing the math behind it. Don't fear the math, as this is simple enough that a high-schooler understands. And even though there are a lot of libraries out there that already do this, you're far better off for understanding the concept behind it, otherwise you won't be able to tweak the implementation in response to your needs.
probability  statistics  NaiveBayes  programming  mathematics 
february 2012 by wrrn
How Companies Learn Your Secrets - NYTimes.com
There is a calculus, it turns out, for mastering our subconscious urges. For companies like Target, the exhaustive rendering of our conscious and unconscious patterns into data sets and algorithms has revolutionized what they know about us and, therefore, how precisely they can sell.
marketing  privacy  data-mining  statistics  information-society  thepropagandasarecoming 
february 2012 by wrrn
Statistical Science and Philosophy of Science: Where Do (Should) They Meet in 2011 and Beyond?
The "meeting grounds" of statistical science and philosophy of science are or should be connected by a two-way street: while general philosophical questions about evidence and inference bear on statistical questions (about methods to use, and how to interpret them), statistical methods bear on philosophical problems about inference and knowledge
statistics  philosophy  science  research  logic  knowledge  beinghuman 
february 2012 by wrrn
Stanford School of Engineering - Stanford Engineering Everywhere
The Motivation & Applications of Machine Learning, The Logistics of the Class, The Definition of Machine Learning, The Overview of Supervised Learning, The Overview of Learning Theory, The Overview of Unsupervised Learning, The Overview of Reinforcement Learning
ai  lectures  video  machine-learning  statistics  algorithms  beinghuman  study  Online-Courses 
january 2012 by wrrn
Foundations of Statistical Natural Language Processing
For a theoretical underpinning of whatever you are going to code, you may want to check out Foundations of Statistical Natural Language Processing by Chris Manning and Hinrich Schütze.
book  linguistics  statistics  NLP 
january 2012 by wrrn
How Khan Academy is using Machine Learning to Assess Student Mastery | David Hu
Logistic regression takes into account prior performance. So, getting lots correct is always a good thing, and you’ll be able to recover faster from a wrong answer if you were previously doing well. Contrast with the streak model, which loses all memory after a single incorrect answer.
education  learning  machine-learning  python  pedagogy  tools  mathematics  statistics  Online-Courses 
november 2011 by wrrn
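The contrast drawn in the Khan Academy excerpt above can be sketched as follows. The streak function resets to zero on any wrong answer, while the logistic score still credits earlier correct answers. The two features (counts of correct and wrong answers) and all weights are invented for illustration; they are not Khan Academy's actual model or parameters.

```python
import math


def streak_length(history):
    """Number of consecutive correct answers at the end of the history."""
    streak = 0
    for correct in reversed(history):
        if not correct:
            break
        streak += 1
    return streak


def logistic_mastery(history, bias=-1.0, w_correct=0.5, w_wrong=-1.0):
    """Toy logistic model of mastery; the weights are hypothetical.

    Uses the counts of correct and wrong answers as features, so one
    mistake lowers the score without erasing prior performance.
    """
    n_correct = sum(1 for correct in history if correct)
    n_wrong = len(history) - n_correct
    z = bias + w_correct * n_correct + w_wrong * n_wrong
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    history = [True] * 9 + [False]   # nine right, then one wrong
    print(streak_length(history))
    print(round(logistic_mastery(history), 3))
```

In a real system the weights would be fitted to student data rather than chosen by hand.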
The holes in my philosophy of Bayesian data analysis « Statistical Modeling, Causal Inference, and Social Science
caused me to wonder whether it was possible to have a consistent philosophy of data analysis and whether it could be possible that Godel’s incompleteness theorem extends as far as to say that it wasn’t possible?
statistics  philosophy  mathematics  ideas  machine-learning 
november 2011 by wrrn
Information retrieval - Wikipedia, the free encyclopedia
An information retrieval process begins when a user enters a query into the system. Queries are formal statements of information needs, for example search strings in web search engines. In information retrieval a query does not uniquely identify a single object in the collection. Instead, several objects may match the query, perhaps with different degrees of relevancy.
information-retrieval  datamining  search  machine-learning  statistics  from delicious
september 2011 by wrrn
TODAY: Innovation in Search and Artificial Intelligence - YouTube
Recent advances in Artificial Intelligence, and in Internet search, have been driven by the ability to build improved models from large amounts of data. This talk looks at the process of gathering and processing the data, building the models, and using them for new applications in language processing, computer vision, and other fields.
machine-learning  ai  data-mining  statistics  information-retrieval  video  from delicious
september 2011 by wrrn
On Chomsky and the Two Cultures of Statistical Learning
Science is a combination of gathering facts and making theories; neither can progress on its own. I think Chomsky is wrong to push the needle so far towards theory over facts; in the history of science, the laborious accumulation of facts is the dominant mode, not a novelty. The science of understanding language is no different than other sciences in this respect.
Language  statistics  machine-learning  science  research  Norvig  Chomsky  linguistics  from delicious
september 2011 by wrrn
Lecture 1 | Machine Learning (Stanford) - YouTube
Lecture by Professor Andrew Ng for Machine Learning (CS 229) in the Stanford Computer Science department. Professor Ng provides an overview of the course in this introductory meeting.
mathematics  computing  statistics  study  machine-learning  Online-Courses  from delicious
september 2011 by wrrn
CS 229: Machine Learning
This course provides a broad introduction to machine learning and statistical pattern recognition. Topics include: supervised learning (generative/discriminative learning, parametric/non-parametric learning, neural networks, support vector machines); unsupervised learning (clustering, dimensionality reduction, kernel methods); learning theory (bias/variance tradeoffs; VC theory; large margins); reinforcement learning and adaptive control
machine-learning  mathematics  computing  statistics  study  learning-theory  Online-Courses  from delicious
september 2011 by wrrn
AIMS Review Course: Markov Chains and Monte Carlo Methods
An example of a stochastic process in discrete time would be the sequence of temperatures recorded every morning at Braemar in the Scottish Highlands. Another example would be the price of a share recorded at the opening of the market every day. During the day we can trace the share price continuously, which would constitute a stochastic process in continuous time.
learning  mathematics  statistics  machine-learning  python  howto  from delicious
september 2011 by wrrn
Pattern Recognition and Machine Learning Information Science and Statistics: Amazon.co.uk: Christopher M. Bishop: Books
This is the first textbook on pattern recognition to present the Bayesian viewpoint. The book presents approximate inference algorithms that permit fast approximate answers in situations where exact answers are not feasible. It uses graphical models to describe probability distributions when no other books apply graphical models to machine learning.
patternrecodition  machine-learning  statistics  mathematics  books  connectionmachine  from delicious
september 2011 by wrrn
pandas: powerful Python data analysis toolkit — pandas v0.4.0 documentation
fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open source data analysis / manipulation tool available in any language.
python  programming  data  data-mining  mathematics  statistics  from delicious
september 2011 by wrrn
Unsupervised Feature Learning and Deep Learning Tutorial
This tutorial will teach you the main ideas of Unsupervised Feature Learning and Deep Learning. By working through it, you will also get to implement several feature learning/deep learning algorithms, get to see them work for yourself, and learn how to apply/adapt these ideas to new problems.
study  machine-learning  data-mining  connectionmachine  programming  statistics  learning  from delicious
june 2011 by wrrn
introduction to machine learning
The purpose of this chapter is to provide the reader with an overview over the vast range of applications which have at their heart a machine learning problem and to bring some degree of order to the zoo of problems.
books  free  algorithms  statistics  machine-learning  mathematics  from delicious
june 2011 by wrrn
Principles of Uncertainty | Statistics, Mathematics, Philosophy
probability and statistics textbook, for maths students to build up to understanding Bayesian reasoning.
mathematics  statistics  philosophy  probability  learning  machine-learning  study  from delicious
june 2011 by wrrn
Tutorial on Crowdsourcing and Human Computation - A Computer Scientist in a Business School
Their presented material will cover topics from a variety of fields, including computer science, statistics, economics, and psychology. Furthermore, the material will include real-life examples and case studies from years of experience in running and managing crowdsourcing applications in business settings.
human-computation  learning  howto  computing  statistics  psychology  connectionmachine  from delicious
may 2011 by wrrn
Machined Learnings: Regarding leveraging the Twitter social graph
Leveraging the unlabeled social graph data more extensively. This includes trying more unsupervised
twitter  statistics  machine-learning  data-mining  information  connectionmachine  from delicious
may 2011 by wrrn
RStudio
RStudio brings together everything you need to be productive with R in a single, customizable environment.
R  programming  statistics  mathematics  tools  from delicious
march 2011 by wrrn
3 skills a data scientist needs - O'Reilly Radar
The first skill, as you might expect, is a base in statistics, algorithms, machine learning, and mathematics. "You need to have a solid grounding in those principles to actually extract signals from this data and build things with it,"
data  information-society  learning  beinghuman  study  mathematics  statistics  data-mining  from delicious
january 2011 by wrrn
dataists
The basic data science pipeline is on its way to becoming an open one. From Open Data, through an open source analysis, and ending up in results released as part of the Creative Commons, every step of data science can be performed openly.
data  statistics  blog  programming  connectionmachine 
december 2010 by wrrn
Regression analysis - Wikipedia, the free encyclopedia
techniques for modeling and analyzing several variables, when the focus is on the relationship between a dependent variable and one or more independent variables. More specifically, regression analysis helps us understand how the typical value of the dependent variable changes when any one of the independent variables is varied, while the other independent variables are held fixed.
statistics  mathematics  data  information  datamining  machine-learning  ML-Class 
september 2010 by wrrn
How to visualize data with cartoonish faces ala Chernoff
The point of Chernoff faces is to display multiple variables at once by positioning parts of the human face, such as ears, hair, eyes, and nose, based on numbers in a dataset. The assumption is that we can read people's faces easily in real life, so we should be able to recognize small differences when they represent data. Now that's a pretty big assumption, but debate aside, they're fun to make.
R  statistics  information  visualization  howto  data 
september 2010 by wrrn
Self-Improving Bayesian Sentiment Analysis for Twitter
What I’ve now done is set up an automated script that checks Twitter every hour for new customer service/support tweets.  Each tweet is run through the classifier, and any high-confidence classifications are automatically added to the corpus, thereby gradually improving the accuracy without manual input
SentimentAnalysis  code  php  ideas  statistics  bayesian  machine-learning  connectionmachine 
august 2010 by wrrn
Elements of Statistical Learning: data mining, inference, and prediction. 2nd Edition.
data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework.
statistics  books  learning  information  data  free 
august 2010 by wrrn
Marginal Revolution: LA Times Ranks Teachers
The Times obtained seven years of math and English test scores from the Los Angeles Unified School District and used the information to estimate the effectiveness of L.A. teachers — something the district could do but has not.
education  politics  teaching  information  information-society  statistics 
august 2010 by wrrn
ZIA Code Repository « Zero Intelligence Agents
I often develop code in posts, either for research or for examples. This is a public repository of this code organized by language and area of application.
data  information-society  programming  python  R  thepropagandasarecoming  statistics 
august 2010 by wrrn
Pandora, MOG, Apple, and online music’s future : The New Yorker
anonymous programmers who write the algorithms that control the series of songs in these streaming services may end up having a huge effect on the way that people think of musical narrative
music  computing  recommendation-engines  statistics  dj  themusicsarecoming 
august 2010 by wrrn
Crime software may help police predict violent offences | UK news | The Observer
According to Cleverley, the company is now refining the system to enable it to sample data from an even wider range of sources and process the results faster. "At some point in the future we hope to include analysis of feeds from CCTV cameras and public sources from the internet such as Facebook posts." Had such a system been in place it might have prevented Raoul Moat's rampage.
statistics  probability  surveillance  cctv  information-society 
july 2010 by wrrn
[1006.5731] A Taxonomy of Networks
there have been very few attempts to investigate the interrelatedness of the different classes of networks studied by different disciplines. Here, we introduced a framework to establish a taxonomy of networks from various origins. The provision of this family tree not only helps understand the kinship of networks, but also facilitates the transfer of empirical analysis, theoretical modeling, and conceptual developments across disciplinary boundaries.
network-theory  data-mining  statistics  probability  theory  connectionmachine 
july 2010 by wrrn
Britain faces drink, drugs and obesity health crisis | Society | guardian.co.uk
In the latest edition of the ONS's Social Trends report, researchers show that while life expectancy at birth in Britain is expected to continue rising and reach 81.5 years for men in 2021, much of the population faces major health issues.
UK  data  health  statistics  information-society  connectionmachine 
july 2010 by wrrn
Smarter Than You Think - I.B.M.'s Supercomputer to Challenge 'Jeopardy!' Champions - NYTimes.com
The great shift in artificial intelligence began in the last 10 years, when computer scientists began using statistics to analyze huge piles of documents, like books and news stories. They wrote algorithms that could take any subject and automatically learn what types of words are, statistically speaking, most (and least) associated with it.
ai  computing  language  statistics  connectionmachine 
june 2010 by wrrn
Unemployment benefit claimants constituency by constituency: full data | Business | guardian.co.uk
We've gone for claimants rather than unemployed because – although the numbers are lower – they are bang up to date and available at a really local level, so you can see exactly what's happening near where you live.
data  uk  economics  employment  statistics 
june 2010 by wrrn
How to Make a Heatmap – a Quick and Easy Solution
How do you make a heatmap? This came from kerimcan in the FlowingData forums, and krees followed up with a couple of good links on how to do them in R. It really is super easy. Here's how to make a heatmap, with just a few lines of code, like this.
R  statistics  programming  howto  information  visualization  data  thepropagandasarecoming 
may 2010 by wrrn
Lee Byron » How » Code to generate Streamgraphs, now available
Streamgraphs had been seen before in a project I did with last.fm data called Listening History as well as in a graphic in the New York Times called Ebb and Flow at the Box Office.
processing  visualization  information  programming  opensource  statistics 
may 2010 by wrrn
Mapreduce & Hadoop Algorithms in Academic Papers (3rd update)
Learn from academic literature about how the mapreduce parallel model and hadoop implementation are used to solve algorithmic problems.
data  computing  mapreduce  hadoop  mathematics  statistics  complexity 
may 2010 by wrrn
SPSS Statistics and R | SPSS Inside-Out
By bringing R and SPSS together, you get the best of both worlds: a large collection of statistical and graphical tools from R with the ease of use, data handling, and output presentation of SPSS Statistics.
spss  software  data  data-mining  statistics  R 
april 2010 by wrrn
SPSS - Wikipedia, the free encyclopedia
SPSS is among the most widely used programs for statistical analysis in social science. It is used by market researchers, health researchers, survey companies, government, education researchers, marketing organizations and others. The original SPSS manual (Nie, Bent & Hull, 1970) has been described as one of "sociology's most influential books"
data  software  statistics  spss  dd202  sociology 
april 2010 by wrrn
Rise of the Data Scientist
Even if you're not into visualization, you're going to need at least a subset of the skills that Fry highlights if you want to seriously mess with data. Statisticians should know APIs, databases, and how to scrape data; designers should learn to do things programmatically; and computer scientists should know how to analyze and find meaning in data.
data-mining  data  mathematics  research  statistics  visualization  information  beinghuman 
april 2010 by wrrn
Is Britain really broken? Measuring the UK's civic health | News | guardian.co.uk
The aim of the statistics is to measure the quality of life in each local authority around Britain. We can see, for example, that perceived anti-social behaviour is a big problem in Newham, where 47.9% of those questioned think it is a problem, whereas in City of London only 7% feel the same way.
UK  data  politics  election  statistics  government  civic  opengov 
march 2010 by wrrn
Minister of Truth: Meet Britain’s Top Data Cop | Magazine
“I got hooked on what you might call the politics of statistics,” he says. Now, as head of assessment, he monitors figures from roughly 200 public agencies. He opens the data to peer review, publicly calls out bureaucrats, and even drags them before Parliament if need be.
information-society  data  politics  uk  statistics  thepropagandasarecoming 
march 2010 by wrrn
Inkling Magazine - Crescat Graffiti, Vita Excolatur
The graffiti preserved in Pompeii after the eruption of Mount Vesuvius provided unique insights into Roman street life. The Mayan graffiti found in Tekal and the graffiti left by Vikings also give us small glimpses into the past. What kind of insight might a longitudinal study of the graffiti on the walls at the University of Chicago’s main library provide into the lives and minds of this community of college students?
culture  data  language  graffiti  statistics  research 
february 2010 by wrrn
How did the Tories get their crime figures in a twist? | News | guardian.co.uk
The National Crime Recording Standard changed a crucial element of recorded crime: instead of police officers deciding whether an incident should be recorded as a violent crime, the decision was given to the alleged victim. It had the effect of forcing up recorded violence by an estimated 35% in the first year.
politics  uk  crime  statistics  Tories 
february 2010 by wrrn
Next Big Sound
Understand Your Fans' Online Actions. We're tracking the number of plays, views, fans, comments, mentions, and other key metrics for 493,049 artists across major web properties like Facebook, MySpace, Last.fm, Twitter, and more. And we're more accurate than your intern.
music  information  marketing  industry  data  realitymining  statistics  connectionmachine  data-mining 
november 2009 by wrrn
Hans Rosling reveals new insights on poverty | Video on TED.com
Researcher Hans Rosling uses his cool data tools to show how countries are pulling themselves out of poverty. He demos Dollar Street, comparing households of varying income levels worldwide. Then he does something really amazing.
design  data  visualization  poverty  statistics  globalisation  video  TED 
december 2008 by wrrn
Music Ally | Blog Archive » Exclusive: Warner Chappell reveals Radiohead’s ‘In Rainbows’ pot of gold
The topline figure, though, is that there were three million purchases of In Rainbows, including physical CDs, box-sets, and all downloads - including those from the band’s own website and from other digital music stores.
download  business  marketing  Radiohead  statistics  music  industry 
october 2008 by wrrn
Why the cloud cannot obscure the scientific method
nobody, including Anderson himself if he had thought about it, should be happy with stopping at this level of understanding of the natural world.
data  philosophy  science  theory  information  cloud  PetabyteAge  critical_thinking  statistics 
june 2008 by wrrn
The End of Theory: The Data Deluge Makes the Scientific Method Obsolete
The new availability of huge amounts of data, along with the statistical tools to crunch these numbers, offers a whole new way of understanding the world. Correlation supersedes causation, and science can advance even without coherent models, unified theo
realitymining  evolution  google  philosophy  science  statistics  information-society  critical_thinking  theory  data  PetabyteAge 
june 2008 by wrrn
Visualizing Data, Making it Accessible | White African
There are now a number of excellent blogs, agencies and consultants who deal with this stuff every day.
data  statistics  information  visualization 
march 2008 by wrrn
CrimLinks Home
Criminal and community justice and criminology are expanding fast. There is huge demand for information in these areas. CrimLinks helps practitioners, academics and students find key online sources quickly.
crime  statistics  research 
october 2007 by wrrn
The numbers game: statistics and politics | openDemocracy
Its fault is to encourage disengagement. If we have nothing better to show than cynicism, we resign ourselves to powerlessness, and turn away, understanding nothing, griping about everything.
statistics  economics  politics  mathematics 
october 2007 by wrrn

