rybesh + classification 49
About Campaign 2012 in the Media | Project for Excellence in Journalism (PEJ)
17 hours ago by rybesh
To arrive at the results regarding the tone of coverage, PEJ employed computer coding software developed by Crimson Hexagon along with PEJ's traditional media research methods.
The technology for Crimson Hexagon is rooted in an algorithm created by Gary King, a professor at Harvard University's Institute for Quantitative Social Science.
According to Crimson Hexagon, the purpose of computer coding is to "take as data a potentially large set of text documents, of which a small subset is hand coded into an investigator-chosen set of mutually exclusive and exhaustive categories. As output, the methods give approximately unbiased and statistically consistent estimates of the proportion of all documents in each category."
news
textanalysis
sentiment
machinelearning
classification
17 hours ago by rybesh
Modeling the Evolution of Science
21 days ago by rybesh
This browsable 75-topic dynamic topic model of the journal Science (1880-2002) is part of the online supplement to the submission "Modeling the Evolution of Science." This browser allows a user to visualize the dynamic topic model, and use the hidden topics that it has uncovered to guide an exploration of the original collection of documents.
linguistics
topicmodels
classification
science
libraries
21 days ago by rybesh
timjurka/RTextTools
11 weeks ago by rybesh
RTextTools is a free, open source machine learning package for automatic text classification that makes it simple for both novice and advanced users to get started with supervised learning. The package includes nine algorithms for ensemble classification (svm, slda, boosting, bagging, random forests, glmnet, decision trees, neural networks, maximum entropy), comprehensive analytics, and thorough documentation.
textanalysis
classification
tools
research
11 weeks ago by rybesh
Support Vector Machines: Software
12 weeks ago by rybesh
Nice ranked list of SVM software.
svm
machinelearning
classification
12 weeks ago by rybesh
[1003.0783] Supervised Topic Models
12 weeks ago by rybesh
We introduce supervised latent Dirichlet allocation (sLDA), a statistical model of labelled documents. The model accommodates a variety of response types. We derive an approximate maximum-likelihood procedure for parameter estimation, which relies on variational methods to handle intractable posterior expectations. Prediction problems motivate this research: we use the fitted model to predict response values for new documents. We test sLDA on two real-world problems: movie ratings predicted from reviews, and the political tone of amendments in the U.S. Senate based on the amendment text. We illustrate the benefits of sLDA versus modern regularized regression, as well as versus an unsupervised LDA analysis followed by a separate regression.
slda
classification
lda
topicmodels
textanalysis
machinelearning
12 weeks ago by rybesh
Supervised latent Dirichlet allocation for classification
12 weeks ago by rybesh
This is a C++ implementation of supervised latent Dirichlet allocation (sLDA) for classification.
c++
slda
classification
topicmodels
lda
machinelearning
textanalysis
12 weeks ago by rybesh
natural language processing blog: Making sense of Wikipedia categories
february 2012 by rybesh
Wikipedia's category hierarchy forms a graph. It's definitely cyclic (Category:Ethology belongs to Category:Behavior, which in turn belongs to Category:Ethology).
At any rate, did you know that "Chicago Stags coaches" are a subcategory of "Natural sciences"?
wikipedia
classification
categorization
inls520
february 2012 by rybesh
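The cycle mentioned in the excerpt above can be checked mechanically. This is a minimal sketch, assuming the category graph is held as a plain dictionary mapping each category to its parent categories; the function name `find_cycle` and the `parents` structure are illustrative, and only the two edges named in the excerpt are included.

```python
def find_cycle(parents, start):
    """Return a list of categories forming a cycle reachable from start, or None."""
    path = []        # categories on the current depth-first path, in order
    on_path = set()  # same categories, for fast membership tests
    done = set()     # categories fully explored without finding a cycle

    def visit(node):
        if node in on_path:
            # We came back to a category already on the path: that is a cycle.
            return path[path.index(node):] + [node]
        if node in done:
            return None
        path.append(node)
        on_path.add(node)
        for parent in parents.get(node, ()):
            cycle = visit(parent)
            if cycle:
                return cycle
        path.pop()
        on_path.discard(node)
        done.add(node)
        return None

    return visit(start)


# Only the two "belongs to" edges quoted in the excerpt.
parents = {
    "Category:Ethology": ["Category:Behavior"],
    "Category:Behavior": ["Category:Ethology"],
}
cycle = find_cycle(parents, "Category:Ethology")
print(" -> ".join(cycle))
```

The recursive walk is fine for a small sample like this; a graph the size of the real category hierarchy would likely need an iterative version to avoid hitting Python's recursion limit.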
If French language is a class ...
february 2012 by rybesh
... any idea of what an instance could be?
Looking closely at http://id.loc.gov/vocabulary/iso639-1/fr for the first
time seriously (shame on me, can't even tell since when this URI has been
available) ...
I read that it is a *rdfs:subClassOf*
http://id.loc.gov/vocabulary/iso639-1/iso639-1_Language
Well, why isn't it an *instance* of this class?
I can see the rationale : there is not "one" French language, one can
imagine further subclasses such as Canadian French, Middle-Age French etc.
so French is a class of languages OK.
But are there any subclasses of French defined at id.loc.gov ?
And if that were the case, where does one stop the subclass recursion and
introduce instances, if any? Is it turtles all the way down?
modeling
classification
inls520
taxonomy
february 2012 by rybesh
Automatic text analytics using DBpedia and PoolParty – A Live Demo | The Semantic Puzzle
february 2012 by rybesh
Let me show you which steps have to be taken to generate a high-quality text mining application, ready to be used to annotate and to categorize any kind of text or documents covering nearly any domain. With our approach of thesaurus based text mining your documents can also be linked to the world of linked (open) data; enrich your documents with data from the LOD cloud!
webinfo
inls520
semweb
textanalysis
classification
skos
tools
february 2012 by rybesh
Michael Buckland's Paul Otlet Page
january 2012 by rybesh
Michael Buckland's notes on Paul Otlet, with links to other Otlet resources.
"Paul Otlet was born in Brussels, Belgium, in 1868. His monumental book Traité de documentation (Brussels, 1934) was both central and symbolic in the development of information science - then called 'Documentation' - in the first half of this century. In addition, it reminds us of something that has been too widely forgotten: That this field did have a lively existence in the early decades of this century and a sophistication concerning theory and information technology that now commonly surprises people."
webhistory
webinfo
otlet
cataloging
classification
history
hypertext
libraries
january 2012 by rybesh
Library of Congress Subject Headings (LCSH) Approved Lists - Cataloging and Acquisitions (Library of Congress)
november 2011 by rybesh
Lists of new and changed subject headings are posted to this Web site by the Policy and Standards Division as they are approved.
classification
loc
vocabularies
november 2011 by rybesh
LBSC773: Classification Theory - Kari Kraus
august 2011 by rybesh
Survey of classificatory principles from bibliographic, philosophical, biological, psychological, and linguistic perspectives. Challenges to traditional principles from the cognitive sciences and their implications for bibliographic classification.
inls520
classification
syllabus
august 2011 by rybesh
Corpus-Based Study of Scientific Methodology: Comparing the Historical and Experimental Sciences
july 2011 by rybesh
This chapter studies the use of textual features based on systemic functional linguistics, for genre-based text categorization. We describe feature sets that represent different types of conjunctions and modal assessment, which together can partially indicate how different genres structure text and may prefer certain classes of attitudes towards propositions in the text. This enables analysis of large-scale rhetorical differences between genres by examining which features are important for classification. The specific domain we studied comprises scientific articles in historical and experimental sciences (paleontology and physical chemistry, respectively). We applied the SMO learning algorithm, which with our feature set achieved over 83% accuracy for classifying articles according to field, though no field-specific terms were used as features. The most highly-weighted features for each were consistent with hypothesized methodological differences between historical and experimental sciences, thus lending empirical evidence to the recent philosophical claim of multiple scientific methods.
nlp
rhetoric
science
history
language
genre
classification
linguistics
july 2011 by rybesh
Sapping Attention: Fresh set of eyes
february 2011 by rybesh
If we treat each lettered heading in the Library of Congress Catalog as a single, long text, we can ask the computer to find similar genres based on word usage.
classification
clustering
inls520
february 2011 by rybesh
Google Prediction API - Google Code
july 2010 by rybesh
The Prediction API enables access to Google's machine learning algorithms to analyze your historic data and predict likely future outcomes.
machinelearning
api
classification
july 2010 by rybesh
Apache Mahout
november 2009 by rybesh
Mahout's goal is to build scalable machine learning libraries.
machinelearning
opensource
hadoop
apache
recommendation
clustering
classification
datamining
november 2009 by rybesh
FUMSI -- Helping you Find, Use, Manage and Share Information
october 2008 by rybesh
This two-part article is a step-by-step guide for those wishing to create new taxonomies for their business unit, or client. It will outline the many different elements that make up a quality taxonomy and the pitfalls you should be aware of when starting a new project.
classification
taxonomy
information
architecture
methods
design
analysis
howto
october 2008 by rybesh
Classification Web
october 2008 by rybesh
Browsable LOC classification.
library
classification
cataloging
reference
october 2008 by rybesh
Cataloger's Desktop
july 2008 by rybesh
Access to the most widely used cataloging documentation resources in an integrated, online system.
library
cataloging
classification
reference
july 2008 by rybesh
Library of Congress Authorities
november 2007 by rybesh
Browse and view authority headings for Subject, Name, Title and Name/Title combinations; and download authority records in MARC format.
library
metadata
cataloging
classification
reference
authority
tools
november 2007 by rybesh
Guide to the ADL Gazetteer Content Standard
october 2007 by rybesh
A comprehensive framework for recording descriptions of named geographic places, including the core elements of toponyms (and their history), spatial location (in various representations), and classification (according to referenced typing schemes), and
locative
metadata
thesaurus
classification
reference
standards
october 2007 by rybesh
ADL Feature Type Thesaurus 070302
october 2007 by rybesh
A set of terms for categories of geographic places; terms to indicate the nature of a place. It has been designed to be used with the Alexandria Digital Library (ADL) Gazetteer.
locative
metadata
thesaurus
classification
reference
library
october 2007 by rybesh
Revision 926: /lcco-skos/trunk/rdfizer
september 2007 by rybesh
An RDF graph of the Library of Congress Classification Outline.
python
semweb
library
classification
tools
september 2007 by rybesh
i d e a n t: Tag Literacy
july 2007 by rybesh
Distributed classification systems function at the intersection of individual choices and the shared linguistic/semantic norms of a social group (the folks in folksonomy).
social
metadata
categorization
classification
collaboration
linguistics
semantics
july 2007 by rybesh
Articles - “K” is for… Tags? | iStockphoto.com
march 2007 by rybesh
The implementation of the Controlled Vocabulary takes great steps to help reduce inappropriate tagging, and to ease translation into any of the 12 languages now offered on iStock, by categorizing the search terms into a huge vocabulary tree.
metadata
classification
interface
image
graphics
visualweb
march 2007 by rybesh
W3C Content Labels
february 2007 by rybesh
Proposed W3C standard for labeling content as, e.g., mobile-friendly, kid-safe, trustworthy, dangerous, or otherwise certified to be something by somebody.
web
standards
metadata
mobile
pornography
classification
trust
february 2007 by rybesh
Library of Congress Authorities (Search for Name, Subject, Title and Name/Title)
february 2007 by rybesh
Using Library of Congress Authorities, you can browse and view authority headings for Subject, Name, Title and Name/Title combinations; and download authority records in MARC format for use in a local library system.
archives
bibliography
books
catalogs
classification
government
library
metadata
reference
search
february 2007 by rybesh
Reference to transcoding in _A companion to museum studies_
january 2007 by rybesh
[Database] interfaces are modeled on existing genres and media, including the museum because of its archiving and classifying functions...
visualmedia
manovich
transcoding
museum
library
genre
media
interface
database
archives
classification
january 2007 by rybesh
MBA RSS Feeds | Media Bloggers Association
january 2007 by rybesh
The MBA RSS Edited Feeds project is intended to create feeds from member blogs by subject, geography, both - or events.
public
blog
journalism
news
syndication
classification
locative
events
january 2007 by rybesh
FUTEF API
december 2006 by rybesh
FUTEF API provides a search API to explore and find Wikipedia content - including faceted categories.
wiki
webservices
classification
reference
api
december 2006 by rybesh
Northrop
october 2006 by rybesh
A genre categorizer that lets users narrow down searches to particular genres like editorials, financial reports or scientific writing or group search results according to genre.
genre
search
organization
nlp
classification
machinelearning
october 2006 by rybesh
BBC - Radio 1 Superstar VJS - Clips index
january 2006 by rybesh
Faceted classification of BBC Creative Archive clips.
video
archives
classification
remix
culture
january 2006 by rybesh
How I Learned (1-4)
december 2005 by rybesh
Jennifer and Kevin McCoy have exhaustively catalogued all the individual shots from all of the episodes of the 1970s television show Kung Fu and recompiled the shots according to genres.
tv
video
classification
art
archives
genre
december 2005 by rybesh
Effects of iTunes RSS on the podcasting community
november 2005 by rybesh
Podcasters were not included in the new, popularized search and discovery mechanism where I believe most wanted to be; the directory is where the popular control lies now, especially when a directory is large and not open.
audio
video
blog
community
search
classification
commercial
YRB
timetags
november 2005 by rybesh
broadbandsports.com
november 2005 by rybesh
Sports video clip site combining taxonomic "channels" with fauxsonomic "tags."
sports
video
social
metadata
classification
november 2005 by rybesh
mefeedia video tags
november 2005 by rybesh
Faceted tagclouds.
social
metadata
interface
ideas
infoviz
classification
november 2005 by rybesh
Orange
october 2005 by rybesh
Orange is component-based data mining software. It includes a range of preprocessing, modelling and data exploration techniques.
machinelearning
classification
code
datamining
python
opensource
tools
nlp
statistics
october 2005 by rybesh
Divmod.org :: Reverend
october 2005 by rybesh
Reverend is a general purpose Bayesian classifier. Use the Reverend to quickly add Bayesian smarts to your app.
machinelearning
bayes
classification
python
statistics
opensource
code
october 2005 by rybesh
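For readers unfamiliar with what a Bayesian text classifier does, here is a minimal from-scratch sketch of a multinomial naive Bayes classifier with add-one smoothing. It is not Reverend's code, and the class name `NaiveBayes` and its `train`/`guess` methods are illustrative names chosen for this example, not a claim about Reverend's documented API.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial naive Bayes classifier over whitespace-separated words."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of training texts

    def train(self, label, text):
        self.doc_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def guess(self, text):
        """Return the label with the highest log-probability for the text."""
        vocab = {w for counts in self.word_counts.values() for w in counts}
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label, counts in self.word_counts.items():
            total_words = sum(counts.values())
            # Log prior: share of training texts carrying this label.
            score = math.log(self.doc_counts[label] / total_docs)
            for word in text.lower().split():
                # Add-one (Laplace) smoothing so unseen words do not zero out a label.
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
            scores[label] = score
        return max(scores, key=scores.get)


classifier = NaiveBayes()
classifier.train("french", "le la les du un une je il et")
classifier.train("english", "the a an i you he she it and")
print(classifier.guess("le chat et la souris"))
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied together.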
apophenia: articles on tagging (help?)
october 2005 by rybesh
Articles that analyze tagging either through data, through situated comparisons or through philosophical hammering.
annotation
social
classification
reference
research
metadata
collaboration
community
october 2005 by rybesh
How to build on bubble-up folksonomies...
september 2005 by rybesh
Understanding semantic relationships between concepts makes folksonomic tagging even more exciting.
social
metadata
classification
ideas
music
september 2005 by rybesh
SIMPLIcity / ALIP: Object Concept Recognition / Content Based Image Retrieval / Annotation / Search
august 2005 by rybesh
This content-based image search and automatic learning-based linguistic indexing project was started in 1995.
annotation
automatic
classification
image
metadata
research
search
computervision
august 2005 by rybesh
Semantic Wave: Evil Hierarchy vs. Good Tags
august 2005 by rybesh
"One thing that disturbs me about attacks on organized classification schemes (in general) is the ease in which decades of thinking and research are cast aside in favor of trendy, book-selling concepts of the moment."
social
metadata
semweb
quote
classification
august 2005 by rybesh
Introducing SKOS
june 2005 by rybesh
Nice introduction to the SKOS RDF vocabulary for expressing concept schemes like thesauri and controlled vocabularies.
metadata
semweb
standards
classification
msmdx
june 2005 by rybesh
Tag Sorting: Another tool in an information architect's toolbox
may 2005 by rybesh
What would an information architect do with the wealth of information given by del.icio.us / flickr / technorati tags?
social
metadata
architecture
ideas
classification
may 2005 by rybesh
Marcia J. Bates
april 2005 by rybesh
Dr. Bates has published widely in the areas of information system search strategy, user-centered design of information retrieval systems, and information seeking behavior.
classification
search
library
academia
people
april 2005 by rybesh
Freetag - an Open Source Tagging / Folksonomy module
april 2005 by rybesh
Freetag is an easy tagging and folksonomy-enabled plugin for use with MySQL-PHP applications. It allows you to create tags on existing database schemas, and access and manage your tags through a robust API.
classification
code
database
metadata
opensource
php
tools
social
web
msmdx
april 2005 by rybesh
Development of the Genre Concept
april 2005 by rybesh
The value of genre theory for content engineering is that it helps one look at design problems from a different perspective and ask a different type of question, with a broader scope than prevailing usability engineering strategies raise.
classification
ideas
theory
genre
documents
usability
april 2005 by rybesh
Vimeo - Automatic Movies
february 2005 by rybesh
Can assemble clips with common tags into automatic movies.
web
video
social
metadata
classification
msmdx
february 2005 by rybesh
wixonomy
february 2005 by rybesh
This is a collection of collaboratively edited taxonomies.
social
metadata
wiki
classification
collaboration
february 2005 by rybesh