rybesh + classification   49

About Campaign 2012 in the Media | Project for Excellence in Journalism (PEJ)
To arrive at the results regarding the tone of coverage, PEJ employed computer coding software developed by Crimson Hexagon along with PEJ's traditional media research methods.

The technology for Crimson Hexagon is rooted in an algorithm created by Gary King, a professor at Harvard University's Institute for Quantitative Social Science. (Click here to view the study explaining the algorithm.)

According to Crimson Hexagon, the purpose of computer coding is to "take as data a potentially large set of text documents, of which a small subset is hand coded into an investigator-chosen set of mutually exclusive and exhaustive categories. As output, the methods give approximately unbiased and statistically consistent estimates of the proportion of all documents in each category."
news  textanalysis  sentiment  machinelearning  classification 
17 hours ago by rybesh
Modeling the Evolution of Science
This browseable 75-topic dynamic topic model of the Journal Science (1880-2002) is part of the on-line supplement to the submission "Modeling the Evolution of Science." This browser allows a user to visualize the dynamic topic model, and use the hidden topics that it has uncovered to guide an exploration of the original collection of documents.
linguistics  topicmodels  classification  science  libraries 
21 days ago by rybesh
timjurka/RTextTools
RTextTools is a free, open source machine learning package for automatic text classification that makes it simple for both novice and advanced users to get started with supervised learning. The package includes nine algorithms for ensemble classification (svm, slda, boosting, bagging, random forests, glmnet, decision trees, neural networks, maximum entropy), comprehensive analytics, and thorough documentation.
textanalysis  classification  tools  research 
11 weeks ago by rybesh
[1003.0783] Supervised Topic Models
We introduce supervised latent Dirichlet allocation (sLDA), a statistical model of labelled documents. The model accommodates a variety of response types. We derive an approximate maximum-likelihood procedure for parameter estimation, which relies on variational methods to handle intractable posterior expectations. Prediction problems motivate this research: we use the fitted model to predict response values for new documents. We test sLDA on two real-world problems: movie ratings predicted from reviews, and the political tone of amendments in the U.S. Senate based on the amendment text. We illustrate the benefits of sLDA versus modern regularized regression, as well as versus an unsupervised LDA analysis followed by a separate regression.
slda  classification  lda  topicmodels  textanalysis  machinelearning 
12 weeks ago by rybesh
Supervised latent Dirichlet allocation for classification
This is a C++ implementation of supervised latent Dirichlet allocation (sLDA) for classification.
c++  slda  classification  topicmodels  lda  machinelearning  textanalysis 
12 weeks ago by rybesh
natural language processing blog: Making sense of Wikipedia categories
Wikipedia's category hierarchy forms a graph. It's definitely cyclic (Category:Ethology belongs to Category:Behavior, which in turn belongs to Category:Ethology).

At any rate, did you know that "Chicago Stags coaches" are a subcategory of "Natural sciences"?
wikipedia  classification  categorization  inls520 
february 2012 by rybesh
If French language is a class ...
... any idea of what an instance could be?

Looking closely at http://id.loc.gov/vocabulary/iso639-1/fr for the first
time seriously (shame on me, can't even tell since when this URI has been
available) ...
I read that it is a *rdfs:subClassOf*
http://id.loc.gov/vocabulary/iso639-1/iso639-1_Language

Well, why isn't it an *instance* of this class?
I can see the rationale: there is not "one" French language, one can
imagine further subclasses such as Canadian French, Middle-Age French etc.
so French is a class of languages OK.
But are there any subclasses of French defined at id.loc.gov ?

And if it were the case, where does one stop the subclass recursion and
introduce instances, if any? Is it turtles all the way down?
modeling  classification  inls520  taxonomy 
february 2012 by rybesh
Automatic text analytics using DBpedia and PoolParty – A Live Demo |The Semantic Puzzle
Let me show you which steps have to be taken to generate a high-quality text mining application, ready to be used to annotate and to categorize any kind of text or documents covering nearly any domain. With our approach of thesaurus based text mining your documents can also be linked to the world of linked (open) data; enrich your documents with data from the LOD cloud!
webinfo  inls520  semweb  textanalysis  classification  skos  tools 
february 2012 by rybesh
Michael Buckland's Paul Otlet Page
Michael Buckland's notes on Paul Otlet, with links to other Otlet resources.

"Paul Otlet (portrait) was born in Brussels, Belgium, in 1868. His monumental book Traité de documentation. (Brussels, 1934) was both central and symbolic in the development of information science - then called 'Documentation' - in the first half of this century. In addition, it reminds us of something that has been too widely forgotten: That this field did have a lively existence in the early decades of this century and a sophistication concerning theory and information technology that now commonly surprises people."
webhistory  webinfo  otlet  cataloging  classification  history  hypertext  libraries 
january 2012 by rybesh
Library of Congress Subject Headings (LCSH) Approved Lists - Cataloging and Acquisitions (Library of Congress)
Lists of new and changed subject headings are posted to this Web site by the Policy and Standards Division as they are approved.
classification  loc  vocabularies 
november 2011 by rybesh
LBSC773: Classification Theory - Kari Kraus
Survey of classificatory principles from bibliographic, philosophical, biological, psychological, and linguistic perspectives. Challenges to traditional principles from the cognitive sciences and their implications for bibliographic classification.
inls520  classification  syllabus 
august 2011 by rybesh
Corpus-Based Study of Scientific Methodology: Comparing the Historical and Experimental Sciences
This chapter studies the use of textual features based on systemic functional linguistics, for genre-based text categorization. We describe feature sets that represent different types of conjunctions and modal assessment, which together can partially indicate how different genres structure text and may prefer certain classes of attitudes towards propositions in the text. This enables analysis of large-scale rhetorical differences between genres by examining which features are important for classification. The specific domain we studied comprises scientific articles in historical and experimental sciences (paleontology and physical chemistry, respectively). We applied the SMO learning algorithm, which with our feature set achieved over 83% accuracy for classifying articles according to field, though no field-specific terms were used as features. The most highly-weighted features for each were consistent with hypothesized methodological differences between historical and experimental sciences, thus lending empirical evidence to the recent philosophical claim of multiple scientific methods.
nlp  rhetoric  science  history  language  genre  classification  linguistics 
july 2011 by rybesh
Sapping Attention: Fresh set of eyes
If we treat each lettered heading in the Library of Congress Catalog as a single, long text, we can ask the computer to find similar genres based on word usage.
classification  clustering  inls520 
february 2011 by rybesh
Google Prediction API - Google Code
The Prediction API enables access to Google's machine learning algorithms to analyze your historic data and predict likely future outcomes.
machinelearning  api  classification 
july 2010 by rybesh
Apache Mahout
Mahout's goal is to build scalable machine learning libraries.
machinelearning  opensource  hadoop  apache  recommendation  clustering  classification  datamining 
november 2009 by rybesh
FUMSI -- Helping you Find, Use, Manage and Share Information
This two-part article is a step-by-step guide for those wishing to create new taxonomies for their business unit, or client. It will outline the many different elements that make up a quality taxonomy and the pitfalls you should be aware of when starting a new project.
classification  taxonomy  information  architecture  methods  design  analysis  howto 
october 2008 by rybesh
Cataloger's Desktop
Access to the most widely used cataloging documentation resources in an integrated, online system.
library  cataloging  classification  reference 
july 2008 by rybesh
Library of Congress Authorities
Browse and view authority headings for Subject, Name, Title and Name/Title combinations; and download authority records in MARC format.
library  metadata  cataloging  classification  reference  authority  tools 
november 2007 by rybesh
Guide to the ADL Gazetteer Content Standard
A comprehensive framework for recording descriptions of named geographic places, including the core elements of toponyms (and their history), spatial location (in various representations), and classification (according to referenced typing schemes), and
locative  metadata  thesaurus  classification  reference  standards 
october 2007 by rybesh
ADL Feature Type Thesaurus 070302
A set of terms for categories of geographic places; terms to indicate the nature of a place. It has been designed to be used with the Alexandria Digital Library (ADL) Gazetteer.
locative  metadata  thesaurus  classification  reference  library 
october 2007 by rybesh
Revision 926: /lcco-skos/trunk/rdfizer
An RDF graph of the Library of Congress Classification Outline.
python  semweb  library  classification  tools 
september 2007 by rybesh
i d e a n t: Tag Literacy
Distributed classification systems function at the intersection of individual choices and the shared linguistic/semantic norms of a social group (the folks in folksonomy).
social  metadata  categorization  classification  collaboration  linguistics  semantics 
july 2007 by rybesh
Articles - “K” is for… Tags? | iStockphoto.com
The implementation of the Controlled Vocabulary takes great steps to help reduce inappropriate tagging, and to ease translation into any of the 12 languages now offered on iStock, by categorizing the search terms into a huge vocabulary tree.
metadata  classification  interface  image  graphics  visualweb 
march 2007 by rybesh
W3C Content Labels
Proposed W3C standard for labeling content as, e.g., mobile-friendly, kid-safe, trustworthy, dangerous, or otherwise certified to be something by somebody.
web  standards  metadata  mobile  pornography  classification  trust 
february 2007 by rybesh
Library of Congress Authorities (Search for Name, Subject, Title and Name/Title)
Using Library of Congress Authorities, you can browse and view authority headings for Subject, Name, Title and Name/Title combinations; and download authority records in MARC format for use in a local library system.
archives  bibliography  books  catalogs  classification  government  library  metadata  reference  search 
february 2007 by rybesh
Reference to transcoding in _A companion to museum studies_
[Database] interfaces are modeled on existing genres and media, including the museum because of its archiving and classifying functions...
visualmedia  manovich  transcoding  museum  library  genre  media  interface  database  archives  classification 
january 2007 by rybesh
MBA RSS Feeds | Media Bloggers Association
The MBA RSS Edited Feeds project is intended to create feeds from member blogs by subject, geography, both - or events.
public  blog  journalism  news  syndication  classification  locative  events 
january 2007 by rybesh
FUTEF API
FUTEF API provides a search API to explore and find Wikipedia content - including faceted categories.
wiki  webservices  classification  reference  api 
december 2006 by rybesh
Northrop
A genre categorizer that lets users narrow down searches to particular genres like editorials, financial reports or scientific writing or group search results according to genre.
genre  search  organization  nlp  classification  machinelearning 
october 2006 by rybesh
BBC - Radio 1 Superstar VJS - Clips index
Faceted classification of BBC Creative Archive clips.
video  archives  classification  remix  culture 
january 2006 by rybesh
How I Learned (1-4)
Jennifer and Kevin McCoy have exhaustively catalogued all the individual shots from all of the episodes of the 1970s television show Kung Fu and recompiled the shots according to genres.
tv  video  classification  art  archives  genre 
december 2005 by rybesh
Effects of iTunes RSS on the podcasting community
Podcasters were not included in the new, popularized search and discovery mechanism where I believe most wanted to be; the directory is where the popular control lies now, especially when a directory is large and not open.
audio  video  blog  community  search  classification  commercial  YRB  timetags 
november 2005 by rybesh
broadbandsports.com
Sports video clip site combining taxonomic "channels" with fauxsonomic "tags."
sports  video  social  metadata  classification 
november 2005 by rybesh
Orange
Orange is a component-based data mining software. It includes a range of preprocessing, modelling and data exploration techniques.
machinelearning  classification  code  datamining  python  opensource  tools  nlp  statistics 
october 2005 by rybesh
Divmod.org :: Reverend
Reverend is a general purpose Bayesian classifier. Use the Reverend to quickly add Bayesian smarts to your app.
machinelearning  bayes  classification  python  statistics  opensource  code 
october 2005 by rybesh
apophenia: articles on tagging (help?)
Articles that analyze tagging either through data, through situated comparisons or through philosophical hammering.
annotation  social  classification  reference  research  metadata  collaboration  community 
october 2005 by rybesh
How to build on bubble-up folksonomies...
Understanding semantic relationships between concepts makes folksonomic tagging even more exciting.
social  metadata  classification  ideas  music 
september 2005 by rybesh
SIMPLIcity / ALIP: Object Concept Recognition / Content Based Image Retrieval / Annotation / Search
This content-based image search and automatic learning-based linguistic indexing project was started in 1995.
annotation  automatic  classification  image  metadata  research  search  computervision 
august 2005 by rybesh
Semantic Wave: Evil Hierarchy vs. Good Tags
"One thing that disturbs me about attacks on organized classification schemes (in general) is the ease in which decades of thinking and research are cast aside in favor of trendy, book-selling concepts of the moment."
social  metadata  semweb  quote  classification 
august 2005 by rybesh
Introducing SKOS
Nice introduction to the SKOS RDF vocabulary for expressing concept schemes like thesauri and controlled vocabularies.
metadata  semweb  standards  classification  msmdx 
june 2005 by rybesh
Tag Sorting: Another tool in an information architect's toolbox
What would an information architect do with the wealth of information given by del.icio.us / flickr / technorati tags?
social  metadata  architecture  ideas  classification 
may 2005 by rybesh
Marcia J. Bates
Dr. Bates has published widely in the areas of information system search strategy, user-centered design of information retrieval systems, and information seeking behavior.
classification  search  library  academia  people 
april 2005 by rybesh
Freetag - an Open Source Tagging / Folksonomy module
Freetag is an easy tagging and folksonomy-enabled plugin for use with MySQL-PHP applications. It allows you to create tags on existing database schemas, and access and manage your tags through a robust API.
classification  code  database  metadata  opensource  php  tools  social  web  msmdx 
april 2005 by rybesh
Development of the Genre Concept
The value of genre theory for content engineering is that it helps to look at design problems from a different perspective, and to ask a different type of question, with a broader scope, than prevailing usability engineering strategies raise.
classification  ideas  theory  genre  documents  usability 
april 2005 by rybesh
Vimeo - Automatic Movies
Can assemble clips with common tags into automatic movies.
web  video  social  metadata  classification  msmdx 
february 2005 by rybesh
wixonomy
This is a collection of collaboratively edited taxonomies.
social  metadata  wiki  classification  collaboration 
february 2005 by rybesh

