jschneider + reproducibility 17
Reconstructing the List of 44 British Novelistic Genres in Graphs, Maps, Trees
"A list of periods can be viewed as describing a process consisting of a sequence of arrivals and departures (or births and deaths). This kind of data occurs often in historical research."
"any research program benefits from some "worked examples" that are reproducible in detail."
reproducibility
periodization
25 days ago by jschneider
We should care about the reconstructability of the periodizations for two reasons. First, any corrections or adjustments might well diminish the observed clustering of the genres, weakening important claims in "Graphs" (see pp. 17-30). Second, for interested students and researchers any difficulty in reconstructing the data is unwelcoming. Those who, having read through "Graphs," want to pursue the project further will likely turn to the expert studies cited. Differences between Moretti's periodizations and those of the genre experts he cites risk creating confusion. For a nascent research field within literary studies, this would be regrettable.
How a New Hope in Cancer Fell Apart - NYTimes.com
5 weeks ago by jschneider
"Doctors say the heart of the problem is the intricacy of the analyses in this emerging field and the difficulty in finding errors. Even well-respected scientists often “oversee a machine they do not understand and cannot supervise directly” because each segment of the research requires different areas of expertise, said Dr. Lajos Pusztai, a breast cancer researcher at M. D. Anderson Cancer Center at the University of Texas. As a senior scientist, he added, “It’s true for me, too.”"
nytimes
cancer
statistics
datacuration
reproducibility
Bulletin August/September 2011
august 2011 by jschneider
"Print-on-paper materials are made accessible through a slowly evolved infrastructure of scholarly norms (for example, acknowledgement and citation), genres of technical writing, specialized publishers and distribution channels, libraries, and bibliographies, catalogs and indexes. The infrastructure for publishing and bibliographical access was established by scholars, societies, librarians and publishers. During the second half of the 20th century digital methods made new techniques feasible. (One thinks of Chemical Abstracts, Medline and the Science Citation Index.) It is a creaky system but it works.
No comparable infrastructure is in place yet for data sets, which undermines the credibility of even well-intentioned data management plans. It is not known how bad the situation really is. If one were to pick a random selection of papers reporting the results of federally funded projects completed five or 10 years ago and sought to re-use the data sets they were based on, the effort would probably generate more frustration and embarrassment than success."
data
bibliographies
preservation
reproducibility
ActivePapers — Theoretical Biophysics, Molecular Simulation, and Numerically Intensive Computation
june 2011 by jschneider
"The main question I am addressing in this project is the one of data management. The ActivePapers infrastructure is about packaging data, code, and text in files that have a well-defined format and can be used with various kinds of tools (desktop software, server-based software, databases, ...). It is also about re-using published material (mostly data and code) through a reference system that can also improve bibliometry."
beyondthepdf
reproducibility
data
datacuration
On the Importance of Replication in HCI and Social Computing Research | blog@CACM | Communications of the ACM
june 2011 by jschneider
"What’s interesting is the interpretation of the results suggest that squeezing more information onto the screen does not improve subject perceptual and search performance. Instead, the experiment show that there is a very complex interaction between visual attention/search with density of information of the display. Under high scent conditions, information seems to ‘pop out’ in the hyperbolic browser, helping to achieve higher performance."
reproducibility
information
visual-attention
information-scent
HCI
social-computing
Science in the Open » Blog Archive » Best practice in Science and Coding. Holding up a mirror.
april 2011 by jschneider
"I think that code and experiment are actually linked at a deeper level. Both are an instantiation of process that take inputs and generate outputs. These are (to a first approximation – good enough for this discussion) deterministic in any given instance. But they are meaningless without context. Useless without the meaning that documentation and testing provide."
"Too often when we write a scientific paper it’s the last part of the process. We fabricate a story that makes sense so that we can fit in the bits we want to. Now there’s nothing wrong with this. Humans are narrative processing systems, we need stories to make sense of the world. But its not the whole story. What if, as we collect and capture the events that we ultimately use to tell our story, that we also collect and structure the story of what actually happened? Of the experiments that didn’t work, of the statistical spread of good and bad results. There’s a sarcastic term in synthetic organic chemistry, the “American Yield” in which we imagine that 20 PhD students have been tasked with making a compound and the one who manages to get the highest overall yield gets to be first author. This isn’t actually a particularly useful number. Much more useful to the chemist who wants to use this prep is the spread of values, information that is generally thrown away. The difference between actually incorporating the running of the code into the documentation, and just showing one log file, cut and pasted, from when it worked well. You lose the information about when it doesn’t work."
"Best practice in coding mirrors best practice in science. Documentation, testing, integration are at the core. Best practice is also a long way ahead of common practice in both science and coding. Both, perhaps are driven increasingly by a celebrity culture that is more dependent on what your outputs look like (and where they get published) than whether anyone uses them. Testing and documentation are hardly glamorous activities."
science
coding
reproducibility
context
Dexy
provenance
The 2010 ACM SIGMOD/PODS Conference: Indianapolis, Indiana, USA - SIGMOD Conference Experimental Repeatability Requirements
january 2011 by jschneider
via http://twitter.com/ananelson/status/27875345394831361 and Yolanda Gil's mention
beyondthepdf
reproducibility
scholarly-communication
scientific-communication
Go To Hellman: Real Research Gets Reproduced
november 2010 by jschneider
"In science, it's usual that a surprising result will only be accepted once it has been reproduced by someone else. My scientific training has sometimes gotten me in trouble in the world of libraries and publishing. When presented with something that seems surprising to me, I ask for the evidence. In cultures that are more comfortable assigning and recognizing authority, my questions have sometimes been seen as irritants."
reproducible-research
reproducibility