arthegall + linguistics 31
Corpus of Historical American English (COHA)
june 2010 by arthegall
"This is an alpha version of the 400 million word Corpus of Historical American English (COHA), which is the largest structured corpus of historical English (or any language, for that matter)." -- Horrific web interface (terrible use of frames) but awesome tool nonetheless.
web
english
american-history
language
linguistics
database
text
search
via:languagelog
june 2010 by arthegall
Zellig Harris, "Grammar on Mathematical Principles" (1978)
september 2009 by arthegall
[JSTOR: Journal of Linguistics, Vol. 14, No. 1 (Mar., 1978), pp. 1-20]
zellig-harris
grammar
mathematics
language
linguistics
jstor
september 2009 by arthegall
"Fucking shut the fuck up" (Language Log)
july 2009 by arthegall
"(I once offered to put five dollars in the tips jar at the Stevenson College Coffee House at UC Santa Cruz if they would stop playing the Van Morrison CD they had put on. They did, and I did. So his music has negative cash value for me: I have actually paid money to not hear it.)" -- It's like Geoff Pullum has been reading Andrew Gelman (http://www.stat.columbia.edu/~cook/movabletype/archives/2009/07/there_is_no_uti.html)
humor
van-morrison
music
value
utility
language
linguistics
swearing
july 2009 by arthegall
"Musical protolanguage: Darwin’s theory of language evolution revisited" (Tecumseh Fitch at the Language Log)
february 2009 by arthegall
I was listening to a description of a paper Darwin wrote, "A Biographical Sketch of an Infant," about the development of his own son over his first four years, and comparing it to observations he had made earlier about a baby orangutan. At some point, I'd like to come back to this and track down that paper...
biology
evolution
language
linguistics
development
darwin
february 2009 by arthegall
"Formality and interpretation" (Language Log)
february 2009 by arthegall
"The one positive conclusion from Fish's work that I believe I've grasped, so far, is the crucial role of what he calls "interpretive communities" in providing enough of a shared context — even if ephemeral and unfounded — for some minimal communication to take place. So it's ironic that he so completely fails to understand Kempson's work in the context of her native interpretive community." --- That sound you hear is Stanley Fish getting smacked down *with logic*. (That is to say, not only is it a logical smackdown, but its actual performance includes a prominent use of the phrase, "model theory.") Also, the Kempson article is really good.
humor
language
linguistics
logic
stanley-fish
representationalism
february 2009 by arthegall
Publications, Mark Johnson, Cognitive and Linguistic Sciences, Brown University
february 2009 by arthegall
Saw Mark Johnson give a talk about "Adaptor Grammars" (man, that 'o' really bothers me) two days ago. It turned out to be ... an extremely boring talk, although the idea itself seems modestly interesting and it included several reasonable animations of hierarchical Chinese Restaurant processes that were modestly illuminating. At any rate, I sat in the back, doodled on my notebook, and started to idly wonder if issues of "frequentist consistency" for this sort of learning process had been examined (or were even worth examining) at all...
statistics
machinelearning
bayesian-methods
grammar
nlp
linguistics
consistency
nonparametric-methods
mark-johnson
chinese-restaurant-process
february 2009 by arthegall
Lieberman, Michel, Jackson, Tang, & Nowak "Quantifying the evolutionary dynamics of language" (Nature)
january 2009 by arthegall
The regularization of irregular verbs, and the death of old words. I remember reading about this (on the Language Log, probably?) when it was published.
language
research-article
linguistics
evolution
via:WanderingAengus
january 2009 by arthegall
"Words and Credit Scores" (Social Science Statistics Blog)
november 2008 by arthegall
Word-frequency models for P(default), based on loan applications from P2P lending sites. Filing this away for use at a later date, when (a) I have more money, and (b) it wouldn't be insane to lend it to someone on "a P2P lending site."
money
nlp
linguistics
machinelearning
credit-scores
november 2008 by arthegall
The Nature of Field Work in a Monolingual Setting
august 2008 by arthegall
An excerpt from a description of Kenneth Pike eliciting information about a new language from a native speaker without a translator -- "monolingual elicitation." Benzon at the Valve links to this, asking "what did Pike know that Quine didn't?" But that seems like a complete misreading of Quine, who's describing how the process of asking (I think he terms them) "ostensive" questions narrows down some intermediate translation without ever permanently settling the question. Would someone like Pike ever really dispute that? Anyway, I think language-learning-games like this are probably a great lab for thinking about science too -- "nature" as the native speaker.
language
science
quote
kenneth-pike
linguistics
learning
quine
via:the-valve
august 2008 by arthegall
"Canoe wives and unnatural semantic relations" (Language Log)
august 2008 by arthegall
"In my view, there is no sense in trying to develop a taxonomy of possible semantic relations that noun-noun compounds can express, given that one of them would apparently have to be a relation that permits N1 N2 to hold of a person x iff N2 is the name of the relation that x bears to some person y such that y was involved in an incident in which an object of the type N1 played a salient role. Define the notion "natural semantic relation" as you will, this surely isn't one."
linguistics
generative-grammar
language
english
amnesia
semantics
august 2008 by arthegall
WNDB(5WN) manual page
june 2008 by arthegall
The WordNet database files are (a) text files, but (b) are indexed by byte-offsets. I can't tell if this is hideous or hilarious (probably both).
file-format
humor
computer
language
linguistics
nlp
documentation
data
june 2008 by arthegall
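A rough sketch of what "text files indexed by byte-offsets" means in practice, for my own reference. This does not read real WordNet files: the toy file, the record contents, and the helper names (`build_toy_data_file`, `read_record`, `demo`) are all made up for illustration. The only thing borrowed from the format is the idea that each line begins with its own 8-digit byte offset, so a reader can seek straight to a record instead of scanning the file.

```python
import os
import tempfile


def build_toy_data_file(path, records):
    """Write records so that each line starts with its own 8-digit byte offset.

    Hypothetical helper: it only mimics the offset-as-key idea, not the
    full layout of a real WordNet data file.
    """
    offsets = []
    with open(path, "wb") as f:
        for body in records:
            offset = f.tell()  # byte position where this line begins
            offsets.append(offset)
            f.write(f"{offset:08d} {body}\n".encode("ascii"))
    return offsets


def read_record(path, offset):
    """Seek directly to a byte offset and return the single line found there."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.readline().decode("ascii").rstrip("\n")


def demo():
    """Build a toy file, then fetch every record by offset alone."""
    records = [
        "n dog | a domesticated canine",
        "n cat | a small feline",
        "v run | move fast on foot",
    ]
    fd, path = tempfile.mkstemp(suffix=".toy")
    os.close(fd)
    try:
        offsets = build_toy_data_file(path, records)
        return [read_record(path, off) for off in offsets]
    finally:
        os.remove(path)


if __name__ == "__main__":
    for line in demo():
        print(line)
```

The hideous part is visible here too: editing any line by hand shifts every later offset, so the "text file" is only safely editable by a program that rewrites the offsets.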
Yan, Isard, and Liberman. "Different Roles of Pitch and Duration in Distinguishing Word Stress in English"
june 2008 by arthegall
Via the Language Log: http://languagelog.ldc.upenn.edu/nll/?p=252
Includes an analysis of American Supreme Court recordings, although "Clarence Thomas didn't speak often enough to be included in the analyzed data."
research-article
linguistics
computational-linguistics
word-stress
supreme-court
language
june 2008 by arthegall
Harris and Mattick, "Science Sublanguages and the Prospects for a Global Language of Science" (JSTOR)
may 2008 by arthegall
Annals of the American Academy of Political and Social Science, Vol. 495, Telescience: Scientific Communication in the Information Age (Jan., 1988), pp. 73-83
via:cshalizi
journal-article
jstor
science
language
linguistics
research
thesis
zellig-harris
may 2008 by arthegall
"Ontological Promiscuity v. Recursion" (Language Log)
february 2008 by arthegall
Several links to the controversy over the Pirahã.
linguistics
ontology
langauge
february 2008 by arthegall
Online Learning of Relaxed CCG Grammars for Parsing to Logical Form | Lambda the Ultimate
november 2007 by arthegall
Paper by a student down the hall from me, and Michael Collins downstairs. And what *I* wonder is how one might think of applying techniques like this to biological sequences.
paper
research
research-article
language
nlp
linguistics
machinelearning
november 2007 by arthegall
languagehat.com: MODULO.
september 2007 by arthegall
The OED (and others) on the origins and use of the word "modulo," which I first heard as a college student from my undergraduate advisor. Not much discussion of the mathematical context, but...whatever.
mathematics
linguistics
etymology
language
dictionary
september 2007 by arthegall
XLE Project
september 2007 by arthegall
"XLE consists of cutting-edge algorithms for parsing and generating Lexical Functional Grammars (LFGs) along with a rich graphical user interface for writing and debugging such grammars."
linguistics
nlp
language
parser
tools
software
september 2007 by arthegall
Anna Szabolcsi
august 2007 by arthegall
Faculty member, NYU Linguistics department. Co-author of 2006 paper with Bernardi that looks interesting.
semantics
linguistics
homepage
faculty
computerscience
august 2007 by arthegall
Joshua Tauberer's Homepage
may 2007 by arthegall
Homepage for the guy who created the govtrack.us site.
homepage
linguistics
rdf
nlp
quote
language
may 2007 by arthegall
Statistical NLP / corpus-based computational linguistics resources
march 2007 by arthegall
A set of links to NLP resources: papers, programs, and datasets.
machinelearning
ai
resources
index
links
nlp
linguistics
language
march 2007 by arthegall
"Decline of Grammar," by Geoffrey Nunberg
march 2007 by arthegall
An article from the Atlantic Monthly in 1983, by a linguist and Language Log contributor, about language and grammar, prescriptivism vs. descriptivism.
language
linguistics
magazine-article
march 2007 by arthegall
Speech Accent Archive
march 2007 by arthegall
"The speech accent archive uniformly presents a large set of speech samples from a variety of language backgrounds."
speech
speechanalysis
linguistics
audio
accents
language
march 2007 by arthegall