Vaguery + linguistics   23

[1007.3254] Distinguishing Fact from Fiction: Pattern Recognition in Texts Using Complex Networks
"We establish concrete mathematical criteria to distinguish between different kinds of written storytelling, fictional and non-fictional. Specifically, we constructed a semantic network from both novels and news stories, with $N$ independent words as vertices or nodes, and edges or links allotted to words occurring within $m$ places of a given vertex; we call $m$ the word distance. We then used measures from complex network theory to distinguish between news and fiction, studying the minimal text length needed as well as the optimized word distance $m$. The literature samples were found to be most effectively represented by their corresponding power laws over degree distribution $P(k)$ and clustering coefficient $C(k)$; we also studied the mean geodesic distance, and found all our texts were small-world networks.…"
nudge-targets  computational-linguistics  linguistics  classification  machine-learning  statistics  natural-language-processing 
august 2010 by Vaguery
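A minimal sketch of the construction the abstract above describes: distinct words become vertices, and two words are linked when they occur within $m$ places of each other. The tokenization rule and the function names (`build_word_network`, `degree_distribution`, `clustering`) are assumptions made for illustration only; the excerpt does not say how the authors tokenized their texts or computed their measures.

```python
import re
from collections import Counter, defaultdict


def build_word_network(text, m=2):
    """Return {word: set(neighbours)}, linking words at most m positions apart.

    Assumption: words are lowercase runs of letters/apostrophes; the paper's
    own preprocessing is not given in the excerpt.
    """
    words = re.findall(r"[a-z']+", text.lower())
    graph = defaultdict(set)
    for i, word in enumerate(words):
        graph[word]  # make sure isolated words still appear as vertices
        for other in words[i + 1 : i + 1 + m]:
            if other != word:  # no self-loops
                graph[word].add(other)
                graph[other].add(word)
    return dict(graph)


def degree_distribution(graph):
    """P(k): fraction of vertices that have exactly k neighbours."""
    counts = Counter(len(nbrs) for nbrs in graph.values())
    total = len(graph)
    return {k: c / total for k, c in sorted(counts.items())}


def clustering(graph, node):
    """Local clustering coefficient: share of neighbour pairs that are linked."""
    nbrs = list(graph[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in graph[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))
```

Calling `build_word_network(some_text, m=3)` and then `degree_distribution(...)` gives the raw material for a $P(k)$ plot; fitting a power law to it is a separate step not shown here.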
[1005.4803] Hirsch index as a network centrality measure
"…The h index is compared with the Degree centrality (a local measure), the Betweenness and Eigenvector centralities (two non-local measures) in the case of a biological network (Yeast interaction protein-protein network) and a linguistic network (Moby Thesaurus II). In both networks, the Hirsch index has poor correlation with Betweenness centrality but correlates well with Eigenvector centrality, especially for the more important nodes that are relevant for ranking purposes, say in Search Engine Optimization. In the thesaurus network, the h index seems even to outperform the Eigenvector centrality measure as evaluated by simple linguistic criteria."
network-theory  linguistics  search-engines  algorithms  nudge-targets  classification  machine-learning 
july 2010 by Vaguery
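A rough illustration of an h index used as a node centrality. The excerpt above does not spell out the definition, so this sketch assumes the usual one: a node has index h if h is the largest number such that at least h of its neighbours each have degree at least h. The function name `node_h_index` and the toy graph are hypothetical.

```python
def node_h_index(graph, node):
    """Largest h such that `node` has at least h neighbours of degree >= h.

    `graph` is an undirected adjacency mapping {node: set(neighbours)}.
    """
    # Degrees of the neighbours, largest first.
    degrees = sorted((len(graph[n]) for n in graph[node]), reverse=True)
    h = 0
    for rank, degree in enumerate(degrees, start=1):
        if degree >= rank:
            h = rank
        else:
            break
    return h


# Toy graph: a triangle a-b-c with one pendant vertex d attached to c.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
scores = {n: node_h_index(toy, n) for n in toy}
```

Unlike plain degree, the score depends on how well connected a node's neighbours are, which may be why the abstract reports it tracking Eigenvector centrality more closely than Betweenness.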
[1005.0950] On Duplication in Mathematical Repositories
"Building a repository of proof-checked mathematical knowledge is without any doubt a lot of work, and besides the actual formalization process there also is the task of maintaining the repository. Thus it seems obvious to keep a repository as small as possible, in particular each piece of mathematical knowledge should be formalized only once. In this paper, however, we claim that it might be reasonable or even necessary to duplicate knowledge in a mathematical repository. We analyze different situations and reasons for doing so and provide a number of examples supporting our thesis."
parsimony  pragmatism  library2.0  mathematics  linguistics  that-Gödel-fellow-said-something-relevant 
may 2010 by Vaguery
Phrase Detectives - The AnaWiki annotation game
"Lovers of literature, grammar and language, this is the place where you can work together to improve future generations of technology. By indicating relationships between words and phrases you will help to create a resource that is rich in linguistic information.
Simply register a username and password and you can get started."
linguistics  crowdsourcing  collaboration  serious-games  English  corpus  annotation 
january 2010 by Vaguery
Ἡλληνιστεύκοντος: Your Fractal Analysis of Esperanto does not add up
"As the review said, one of the things McMahon points out in the book is, there is a regrettable tendency in numerical approaches to linguistics to just put the raw data into the Analysatrons, and see what happens. And she said, in a more measured and thoughtful way than I just did, that this is nonsense: a linguist still needs to make sense of the input, identify what correlations are worth pursuing, and filter out what methodologically needs filtering out.

I mean, word lengths and word frequencies? Even Plato had a more sophisticated understanding of language structure than that; and that's not saying much."
linguistics  computational-methods  aptly-harsh  everything-is-made-of-physics 
september 2009 by Vaguery
Great Ape Trust graduate student's paper sheds light on bonobo language
"After applying conversational analysis tools, Pedersen asserted that language is more than the simple act of transferring information, but a conversational interaction between active participants. Language-competent bonobos use lexigrams, which are made up of arbitrary symbols that represent words, as the basis for conversations with humans.

Pedersen said linguistic aspects of the conversation included turn taking, negotiation, pauses and repetition, and went far beyond information sharing made possible through the use of lexigram symbols.

"She was using language to get at what she wanted," Pedersen said. "She is very, very clever and is fully capable of following the conversation the same way a human does. This tells me that Panbanisha's knowledge of language is far beyond understanding the words, to understanding how to use them in a conversation to get what she wants."
language  anthropology  linguistics  apes  speciesism  analysis 
august 2008 by Vaguery
DARE WEBPAGE
Somebody was asking about this in a conversation. Ed?
DARE  local  language  dictionary  closed  books  geography  regionalism  project  reference  culture  linguistics 
july 2008 by Vaguery
Language Log: Après Fish, le déluge?
One wants to know how set boundaries may be made fluid again. One wants, I think, to let people do what they enjoy. There are enough of us for that.
via:cshalizi  disintermediation  (?)  academia  education  humanities  linguistics  scholarship 
january 2008 by Vaguery
Mixed Soda Name? | Ask MetaFilter
I ate a delicious mariposa plum the other day from Produce Station, very ripe, and it tasted //exactly// like a "suicide" Mister Misty used to taste. ////Exactly////.
nostalgia  food  fruit  cooking  taste  Dairy-Queen  Produce-Station  plums  language  dialect  regional  linguistics 
august 2007 by Vaguery
Language Log: But is it a recursive combination?
"The Shadows' own name for themselves is 10,000 letters long, and unpronounceable by many," says Wikipedia....
linguistics  marketing  humor  symbolism  pronunciation  branding  intellectual-property 
april 2007 by Vaguery
Odd Ends
Two linguistic just-so stories about the follies of English pronunciation. Blame the floppy-throated Frenchies.
linguistics  history  digitization  Distributed-Proofreaders  19C  funny  pronunciation  spelling  English 
april 2007 by Vaguery

