Using Internet Data for Economic Research
20 days ago by cshalizi
"The data used by economists can be broadly divided into two categories. First, structured datasets arise when a government agency, trade association, or company can justify the expense of assembling records. The Internet has transformed how economists interact with these datasets by lowering the cost of storing, updating, distributing, finding, and retrieving this information. Second, some economic researchers affirmatively collect data of interest. For researcher-collected data, the Internet opens exceptional possibilities both by increasing the amount of information available for researchers to gather and by lowering researchers' costs of collecting information. In this paper, I explore the Internet's new datasets, present methods for harnessing their wealth, and survey a sampling of the research questions these data help to answer. The first section of this paper discusses "scraping" the Internet for data—that is, collecting data on prices, quantities, and key characteristics that are already available on websites but not yet organized in a form useful for economic research. A second part of the paper considers online experiments, including experiments that the economic researcher observes but does not control (for example, when Amazon or eBay alters site design or bidding rules); and experiments in which a researcher participates in design, including those conducted in partnership with a company or website, and online versions of laboratory experiments. Finally, I discuss certain limits to this type of data collection, including both "terms of use" restrictions on websites and concerns about privacy and confidentiality."
to:NB
economics
data_sets
web
re:your_favorite_dsge_sucks
PeteSearch: Keep the web weird
10 weeks ago by cshalizi
"I'm doing a short talk at SXSW tomorrow, as part of a panel on Creating the Internet of Entities. Preparing is tough because I don't believe it's possible, and even if it was I wouldn't like it. Opposing better semantic tagging feels like hating on Girl Scout cookies, but I've realized that I like an internet full of messy, redundant, ambiguous data.
"The stated goal of an Internet of Entities is a web where "real-world people, places, and things can be referenced unambiguously". We already have that. Most pages give enough context and attributes for a person to figure out which real world entity it's talking about. What the definition is trying to get at is a reference that a machine can understand.
"The implicit goal of this and similar initiatives like Stephen Wolfram's .data proposal is to make a web that's more computable. Right now, the pages that make up the web are a soup of human-readable text, a long way from the structured numbers and canonical identifiers that programs need to calculate with. I often feel frustrated as I try to divine answers from chaotic, unstructured text, but I've also learned to appreciate the advantages of the current state of things."
to:blog
warden.peter
web
internet
semantic_web
tagging
networked_life
MathJax | Beautiful math in all browsers
december 2010 by cshalizi
See if this is already installed on the host? latex2html grows irksome.
latex
web
blogging