Automatic text analytics using DBpedia and PoolParty – A Live Demo | The Semantic Puzzle
8 hours ago
Let me show you which steps have to be taken to generate a high-quality text mining application, ready to be used to annotate and categorize any kind of text or document covering nearly any domain. With our approach of thesaurus-based text mining, your documents can also be linked to the world of linked (open) data; enrich your documents with data from the LOD cloud!
webinfo
inls520
semweb
textanalysis
classification
skos
tools
8 hours ago
N-grams: corpus based (COCA, COHA, Spanish, Portuguese)
9 hours ago
These n-grams are based on the largest publicly available, genre-balanced corpus of English -- the 425 million word Corpus of Contemporary American English (COCA). With this n-gram data (2-, 3-, 4-, and 5-word sequences, with their frequencies), you can carry out powerful queries offline -- without needing to access the corpus via the web interface.
english
corpus
linguistics
nlp
ngrams
9 hours ago
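As a rough illustration of the n-gram frequency data described in the entry above, here is a minimal Python sketch that counts word sequences in a toy token list. The function name `ngram_counts` and the sample sentence are assumptions made for illustration; this is not how the COCA lists were produced.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count every contiguous n-token sequence in a list of tokens."""
    # zip over n staggered copies of the token list yields each n-gram as a tuple
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(grams)


# Toy input (hypothetical); a real corpus would be tokenized far more carefully.
tokens = "the cat sat on the mat and the cat slept".split()
bigrams = ngram_counts(tokens, 2)
trigrams = ngram_counts(tokens, 3)
print(bigrams.most_common(1))
```

Once the counts are stored in a table like this, frequency queries can run offline, which is the use the entry describes.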
cheese it, the cops!: absurd
10 hours ago
"the last thing you want to do is prove professors of communication right."
from twitter
10 hours ago
The Code4Lib Journal – HTML5 Microdata and Schema.org
11 hours ago
This article is an introduction to Microdata and Schema.org. The first section describes what HTML5, Microdata and Schema.org are, and the problems they have been designed to solve. With this foundation in place section 2 provides a practical tutorial of how to use Microdata and Schema.org using a real life example from the cultural heritage sector. Along the way some tools for implementers will also be introduced. Issues with applying these technologies to cultural heritage materials will crop up along with opportunities to improve the situation.
webinfo
microdata
11 hours ago
Conditional Random Fields
13 hours ago
Conditional random fields (CRFs) are a probabilistic framework for labeling and segmenting structured data, such as sequences, trees and lattices. The underlying idea is that of defining a conditional probability distribution over label sequences given a particular observation sequence, rather than a joint distribution over both label and observation sequences. The primary advantage of CRFs over hidden Markov models is their conditional nature, resulting in the relaxation of the independence assumptions required by HMMs in order to ensure tractable inference. Additionally, CRFs avoid the label bias problem, a weakness exhibited by maximum entropy Markov models (MEMMs) and other conditional Markov models based on directed graphical models. CRFs outperform both MEMMs and HMMs on a number of real-world tasks in many fields, including bioinformatics, computational linguistics and speech recognition.
machinelearning
nlp
crf
textmining
metadata
13 hours ago
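For reference, the conditional distribution mentioned in the entry above is commonly written as follows for the linear-chain case. The notation (feature functions f_k, weights λ_k, normalizer Z) is the usual textbook convention and is an assumption here, not taken from the bookmarked page.

```latex
p(\mathbf{y} \mid \mathbf{x})
  = \frac{1}{Z(\mathbf{x})}
    \exp\Big( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, \mathbf{x}, t) \Big),
\qquad
Z(\mathbf{x})
  = \sum_{\mathbf{y}'}
    \exp\Big( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y'_{t-1}, y'_t, \mathbf{x}, t) \Big)
```

Because the model conditions on the whole observation sequence x, each feature function may inspect any part of x, which is the relaxation of independence assumptions the description refers to.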
Olivier Labs | Jason
yesterday
Jason is a JSON viewer & editor for Mac OS X. It can open local documents as well as download JSON data via HTTP and, in case of invalid data, an error message is presented and the line containing the error is highlighted.
json
tools
yesterday
Groundhog Day Is Worth Revisiting, Wouldn’t You Say? | Tor.com
yesterday
Happy Groundhog Day.
from twitter
yesterday
PhantomJS: Headless WebKit with JavaScript API
yesterday
PhantomJS is a headless WebKit with JavaScript API. It has fast and native support for various web standards: DOM handling, CSS selector, JSON, Canvas, and SVG.
PhantomJS is an optimal solution for fast headless testing, site scraping, pages capture, SVG renderer, network monitoring and many other use cases.
javascript
scraping
testing
yesterday
If Dickens Came Back to America, He Would Note Today's WSJ - James Fallows - National - The Atlantic
yesterday
If Dickens came back to America:
from twitter
yesterday
Graphs Beyond the Hairball | eagereyes
yesterday
Networks are usually drawn using a technique called node-link diagrams. While that works well for small graphs (the technical name for networks), it breaks down beyond a few dozen nodes. Better techniques exist, though these are currently focused on specific types of graphs or answer particular questions.
infoviz
networks
yesterday
N-Quads: Extending N-Triples with Context
2 days ago
This document describes N-Quads, a format that extends N-Triples with context. Each triple in an N-Quads document can have an optional context value.
semweb
rdf
standards
2 days ago
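To make the "triple plus optional context" idea in the entry above concrete, here is a small Python sketch that serializes a statement as one N-Quads line. It handles IRIs only (no literals or blank nodes), and the helper name `to_nquad` and the example.org IRIs are hypothetical.

```python
def to_nquad(subject, predicate, obj, context=None):
    """Serialize IRI terms as one N-Quads line; the context term is optional."""
    terms = [subject, predicate, obj]
    if context is not None:
        # The fourth term names the context (graph) the triple belongs to.
        terms.append(context)
    return " ".join(f"<{term}>" for term in terms) + " ."


# Hypothetical IRIs, used only for illustration.
plain = to_nquad("http://example.org/s", "http://example.org/p", "http://example.org/o")
with_context = to_nquad(
    "http://example.org/s",
    "http://example.org/p",
    "http://example.org/o",
    context="http://example.org/graph1",
)
print(plain)
print(with_context)
```

Without a context value the output is an ordinary N-Triples line, which is why the format is described as an extension of N-Triples.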
Web Data Commons
2 days ago
Web Data Commons will extract all Microformat, Microdata and RDFa data that is contained in the Common Crawl corpus and will provide the extracted data for free download in the form of RDF-quads as well as CSV-tables for common entity types (e.g. product, organization, location, ...).
semweb
rdfa
web
metadata
webinfo
microdata
microformats
database
2 days ago
Book Banning in Arizona « Academe Blog
2 days ago
Is there a worse place in the U.S. than Arizona? I will never spend a dime there.
from twitter
2 days ago
Before and After Demonstration: Overview
2 days ago
The Before and After Demonstration is a multi-page resource that shows an inaccessible website and a retrofitted version of this same website. Each web page includes inline annotations that can be activated to highlight some of the key accessibility barriers or repairs. Each web page is also accompanied by an evaluation report to inform the developers on the level of conformance to the Web Content Accessibility Guidelines (WCAG).
accessibility
interface
design
standards
2 days ago
Digital humanities
2 days ago
Humanities students often do not realize (or even imagine) that 1) they are capable of learning to write useful and practical computer programs within the course of a semester even if they have no prior background in programming; 2) the ability to write one’s own programs can be valuable for scholars in the humanities, especially because commercial software often does not address research needs in the humanities; and 3) practical computer programming, no less than reading, writing, and arithmetic, is a useful skill that is within the reach of any educated person regardless of academic specialization.
This course will introduce students to the role that computational methods can play in primary research and scholarship in the humanities, using as a technological framework eXtensible Markup Language (XML) and related technologies.
digitalhumanities
syllabus
xml
2 days ago
How to bypass firewalls or captive portals with dns2tcp | fosk.it! 2.0
4 days ago
Classic wireless hot spots commonly allow two protocols: ICMP and DNS (UDP/53). ICMP (Internet Control Message Protocol) is used to report errors and warnings to the client, and DNS is mandatory to resolve hostnames. While ICMP can also be used as a transport protocol (see PTunnel), firewalls may block unusual ICMP packets (e.g., suspiciously large packets). On the other hand, there are often fewer restrictions on DNS traffic.
internet
howto
4 days ago
Fuchs
4 days ago
"the role of surveillance in Google’s form of capital accumulation is explained"
from twitter
4 days ago
Historical Controversies Now
4 days ago
Instead of going to the library or the archive, we increasingly access history, the past, through the web. But what kind of history or histories, past or pasts are we accessing online? And what does this accessing entail? Following Leong et al., we approach temporality on the web “as a multiplicity of times derived from relations between different elements” (2009, 1279). This project is specifically focused on contentious historical moments, pasts that have had and potentially still have a major emotional impact, and which have been the subject of struggle. Moreover, we are not interested in sites specifically devoted to history, but in the major platforms on the web.
Confronting the historical events on the various platforms and opening up to a multiplicity of times, we immediately realized that the traditional linear conception of time does not work online. First, most platforms do not work in a chronological fashion, but with a reverse chronology. Second, because the platforms order sources according to ‘relevance’, the chronology of the sources as they are presented to us is radically mixed up. Third, sources do their own trick with time as well. Some focus on the historical event itself, while others rework the event. This reworking happens in a wide variety of ways, for example, by metaphorically invoking the event, by turning it into a historiographic debate, or by incorporating the event in a personal account (reading a history book, visiting a historical site, listening to a song). Crucially, in some of these reworkings, the event is actualized as controversial. These temporal complications directly informed our research, analysis, and visualization.
The above considerations translate into the following research questions:
Source time: Do we primarily find contemporary sources or historical sources in the various spheres? Does this vary across controversies?
Historical time: Do the sources on a platform focus on the historical moment itself, or a contemporary reworking of the moment? Does this vary across controversies?
Heat of the controversy: Is the controversy treated as settled, or is it actualized as still controversial? Does this vary across platforms and controversies?
history
datamining
web
publichistory
4 days ago
API Ecology
4 days ago
The concept of the mashup implies the combination of different data sources or functions. Practically, this often means that a mashup makes use of several APIs and tries to produce new insights or new functionalities by mixing them together. The patterns of combination are not random: one can imagine that certain APIs are brought together more often than others. This (short) project proposes to examine this dynamic more closely.
webservices
api
networks
infoviz
4 days ago
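The question raised in the entry above, which APIs are combined more often than others, can be sketched as a pair co-occurrence count. The following Python example assumes each mashup is simply a list of API names; the function name `api_pair_counts` and the sample data are invented for illustration and do not come from the project.

```python
from collections import Counter
from itertools import combinations


def api_pair_counts(mashups):
    """Count how often each unordered pair of APIs appears in the same mashup."""
    counts = Counter()
    for apis in mashups:
        # Deduplicate and sort so that each pair has one canonical ordering.
        for pair in combinations(sorted(set(apis)), 2):
            counts[pair] += 1
    return counts


# Invented sample data: each inner list is one mashup's set of APIs.
mashups = [
    ["maps", "photos"],
    ["maps", "photos", "video"],
    ["maps", "video"],
]
pair_counts = api_pair_counts(mashups)
print(pair_counts.most_common())
```

The resulting counts can serve as edge weights in a network of APIs, which would fit the entry's networks and infoviz tags.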
yellow garlic - Google Search
6 days ago
If I accomplish nothing else in life, at least I can say that I am the Web's authority on yellow garlic.
from twitter
6 days ago
A Position on Peer Reviewing in HCI, part 1 « Interaction Culture
7 days ago
"peer reviewing is a critical, not a scientific, activity"
from twitter
7 days ago
Thoms, William John (DNB00) - Wikisource
7 days ago
In 1849 he resumed his project of providing a paper ‘in which literary men could answer one another's questions.’ Dilke encouraged him, with the result that the first number of ‘Notes and Queries’ appeared on 3 Nov. 1849. The name was chosen by Thoms, and he selected for a motto Captain Cuttle's phrase, ‘When found, make a note of.’ In form the journal was modelled on the ‘Somerset House Gazette.’
scholarlycommunication
scholarship
history
editorsnotes
7 days ago
MURK AVENUE, I FOUND ICE CUBES 'GOOD DAY'
7 days ago
How to do historical research: I FOUND ICE CUBES ‘GOOD DAY’
from twitter
7 days ago
Telomere - Wikipedia, the free encyclopedia
7 days ago
Telomeres are repetitive nucleotide sequences located at the termini of linear chromosomes. Mons and Velterop refer to concepts linked by a predicate in an RDF triple as "telomeric concepts," an interesting metaphor demonstrating that Otlet's dream of a science of documentation that mirrors the science of natural phenomena is alive and well.
documentation
semweb
science
scholarlycommunication
7 days ago
Oxford Journals | Humanities | Notes and Queries
7 days ago
Founded under the editorship of the antiquary W J Thoms, the primary intention of Notes and Queries was, and still remains, the asking and answering of readers' questions. It is devoted principally to English language and literature, lexicography, history, and scholarly antiquarianism.
history
language
literature
editorsnotes
scholarlycommunication
scholarship
7 days ago
Notes and Queries (Bookshelf) - Gutenberg
7 days ago
Notes and Queries (originally subtitled "a medium of inter-communication for literary men, artists, antiquaries, genealogists, etc") is a London-based, quarterly publication, part academic journal, part correspondence magazine, in which scholars and interested amateurs can exchange knowledge on literature and history.
editorsnotes
scholarship
scholarlycommunication
history
literature
7 days ago
Mocha - the fun, simple, flexible JavaScript test framework
9 days ago
Mocha is a feature-rich JavaScript test framework running on node and the browser, making asynchronous testing simple and fun. Mocha tests run serially, allowing for flexible and accurate reporting, while mapping uncaught exceptions to the correct test cases.
nodejs
javascript
testing
qa
9 days ago
A New Part of Your Digital Humanities Toolkit | Tapas Project
9 days ago
Tapas is the TEI Archival Publishing and Access Service for scholars and other creators of TEI data who need a place to publish their materials in different forms and ensure it remains accessible over time. Tapas is also for anyone interested in reading and exploring TEI data, and communicating with those that share that interest.
tei
publishing
digitalhumanities
9 days ago
mhevery/jasmine-node - GitHub
10 days ago
Write the specifications for your code in *.js and *.coffee files in the spec/ directory (note: your specification files must end with either .spec.js or .spec.coffee; otherwise jasmine-node won't find them!). You can use sub-directories to better organise your specs.
javascript
nodejs
testing
qa
coffeescript
10 days ago
nytd/ice - GitHub
11 days ago
Ice is a track changes implementation, built in javascript, for anything that is contenteditable on the web.
editing
interface
versioning
javascript
html
11 days ago
splitta - statistical sentence boundary detection
13 days ago
Sentence tokenizer written in python. Includes proper tokenization and models for very high accuracy sentence boundary detection (English only for now). The models are trained from Wall Street Journal news combined with the Brown Corpus which is intended to be widely representative of written English. Error rates on test news data are near 0.25%.
nlp
python
13 days ago
Paul A Lombardo - Legal Archaeology: Recovering the Stories behind the Cases
13 days ago
Every lawsuit is a potential drama: a story of conflict, often with victims and villains, leading to justice done or denied. Yet a great deal, if not all, that we learn about the most noteworthy of lawsuits — the truly great cases — comes from reading the opinion of an appellate court, written by a judge who never saw the parties of the case, who worked at a time and a place far removed from the events that gave rise to litigation. We focus on “the facts of the case,” as described in a judge’s opinion, and then we describe the way the court applied the law to such facts as doctrine, hardly pausing to note the irony of this ex cathedra image, smacking of infallibility. Rarely do we admit that the official factual account contained in an appellate opinion may have only the most tenuous relationship to the events that actually led the parties to court. The complex stories — turning on small facts, seemingly trivial circumstances, and inter-contingent events — fade away as the “case” takes on a life of its own as it leaves the court of appeals.
law
narrative
history
facts
archives
archaeology
health
13 days ago
The Little Book on CoffeeScript
14 days ago
CoffeeScript is a little language that compiles down to JavaScript. The syntax is inspired by Ruby and Python, and implements many features from those two languages. This book is designed to help you learn CoffeeScript, understand best practices and start building awesome client side applications. The book is little, only five chapters, but that's rather apt as CoffeeScript is a little language too.
coffeescript
javascript
reference
14 days ago
tmpvar/jsdom - GitHub
14 days ago
A javascript implementation of the W3C DOM.
dom
javascript
nodejs
jquery
scraping
14 days ago
ARL Report on Digital Humanities
15 days ago
Washington DC--The Association of Research Libraries (ARL) has published Digital Humanities, SPEC Kit 326, which provides a snapshot of research library experiences with digital scholarship centers or services that support the humanities (e.g., history, art, music, film, literature, philosophy, religion, etc.) and the benefits and challenges of hosting them. The survey asked ARL libraries about the organization of these services, how they are staffed and funded, what services they offer and to whom, what technical infrastructure is provided, whether the library manages or archives the digital resources produced, and how services are assessed, among other questions.
This survey revealed that library-based support for the digital humanities is offered predominantly on an ad hoc basis. However, as demand for services supporting the digital humanities has grown, libraries have begun to re-evaluate their provisional service and staffing models. Many respondents expressed a desire to implement practices, policies, and procedures that would allow them to cope with increases in demand for services.
This SPEC Kit includes documentation from respondents that describes the mission or purpose of digital humanities centers, the services offered, policies and procedures, examples of digital projects, fellowship and grant opportunities, promotional materials, and repositories for digital projects.
digitalhumanities
research
libraries
15 days ago
Tableau Public | Tableau Software
15 days ago
Tableau Public is a free service that lets you create and share data visualizations on the web. Thousands use it to share data on websites and blogs and through social media like Facebook and Twitter. Tableau Public allows you to see data efficiently and powerfully without any programming.
visualization
infoviz
15 days ago
t.co / Twitter
16 days ago
RT @fivethirtyeight: It's Republicans, not Democrats, who are responsible for #SOPA and #PIPA seeming to go down in flames. ...
PIPA
SOPA
from twitter
16 days ago
The Problem of the Yellow Milkmaid: A Business Model Perspective on Open Metadata
16 days ago
"The Milkmaid," one of Johannes Vermeer's most famous pieces, depicts a scene of a woman quietly pouring milk into a bowl. During a survey the Rijksmuseum discovered that there were over 10,000 copies of the image on the internet—mostly poor, yellowish reproductions. As a result of all of these low-quality copies on the web, according to the Rijksmuseum, "people simply didn't believe the postcards in our museum shop were showing the original painting. This was the trigger for us to put high-resolution images of the original work with open metadata on the web ourselves. Opening up our data is our best defence against the 'yellow Milkmaid.'"
metadata
business
art
museum
16 days ago
Data Clustering Software | Karypis Lab
16 days ago
CLUTO is a software package for clustering low- and high-dimensional datasets and for analyzing the characteristics of the various clusters. CLUTO is well-suited for clustering data sets arising in many diverse application areas including information retrieval, customer purchasing transactions, web, GIS, science, and biology.
clustering
datamining
16 days ago
Chris Dodd’s paid SOPA crusading - Salon.com
16 days ago
RT @Lawgeek: Former Senator, now #MPAA CEO Chris Dodd goes back on his promise not to become a lobbyist to push #SOPA:
MPAA
SOPA
from twitter
16 days ago
Diction Software - Home
16 days ago
Diction 6.0 uses dictionaries (word-lists) to search a text for these qualities:
· Certainty - Language indicating resoluteness, inflexibility, and completeness and a tendency to speak ex cathedra.
· Activity - Language featuring movement, change, the implementation of ideas and the avoidance of inertia.
· Optimism - Language endorsing some person, group, concept or event, or highlighting their positive entailments.
· Realism - Language describing tangible, immediate, recognizable matters that affect people's everyday lives.
· Commonality - Language highlighting the agreed-upon values of a group and rejecting idiosyncratic modes of engagement.
textanalysis
sentiment
digitalhumanities
16 days ago
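The dictionary-based approach described in the entry above can be sketched in a few lines of Python: each quality is represented by a word list, and a text is scored by counting how many of its words fall in each list. The tiny word lists below are invented placeholders, not Diction's actual dictionaries, and `score_text` is a hypothetical helper.

```python
import re
from collections import Counter


def score_text(text, dictionaries):
    """Return, for each named word list, how many words of the text it matches."""
    # Lowercase and keep alphabetic runs (plus apostrophes) as words.
    counts = Counter(re.findall(r"[a-z']+", text.lower()))
    return {
        name: sum(counts[word] for word in wordlist)
        for name, wordlist in dictionaries.items()
    }


# Placeholder word lists, invented for illustration only.
dictionaries = {
    "certainty": {"always", "never", "must"},
    "optimism": {"good", "praise", "hope"},
}
scores = score_text("We must always hope. Never doubt: hope is good.", dictionaries)
print(scores)
```

Real tools of this kind typically normalize by text length and use much larger lists; this sketch shows only the counting step.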
Statistics 110: Introduction to Probability
16 days ago
Statistics 110 (Introduction to Probability), taught at Harvard University by Joe Blitzstein in Fall 2011. Lecture videos, homework, review material, practice exams, and a large collection of practice problems with detailed solutions are provided. This course is an introduction to probability as a language and set of tools for understanding statistics, science, risk, and randomness. The ideas and methods are useful in statistics, science, philosophy, engineering, economics, finance, and everyday life. Topics include the following. Basics: sample spaces and events, conditional probability, Bayes’ Theorem. Random variables and their distributions: cumulative distribution functions, moment generating functions, expectation, variance, covariance, correlation, conditional expectation. Univariate distributions: Normal, t, Binomial, Negative Binomial, Poisson, Beta, Gamma. Multivariate distributions: joint, conditional, and marginal distributions, independence, transformations, Multinomial, Multivariate Normal. Limit theorems: law of large numbers, central limit theorem. Markov chains: transition probabilities, stationary distributions, reversibility, convergence.
statistics
education
16 days ago
Blacksmith
17 days ago
A static site generator built with Node.js, JSDOM, and Weld.
nodejs
web
tools
blog
17 days ago
Definition of User Agent - WAI UA Wiki
17 days ago
A user agent is any software that retrieves and presents Web content for end users or is implemented using Web technologies. User agents include Web browsers, media players, and plug-ins that help in retrieving, rendering and interacting with Web content. The family of user agents also includes operating system shells, consumer electronics with Web-widgets, and stand-alone applications or embedded applications whose user interface is implemented as a combination of Web technologies.
webinfo
definitions
17 days ago
Network Protocol Headers
17 days ago
Nice diagrams of various internet protocol headers.
internet
networking
webinfo
17 days ago
Kanso
21 days ago
Kanso can be described as the NPM for CouchApps, with tools for installing and publishing shared packages while managing dependencies. The Kanso community provides reusable build-tools, modules, templates and more via the online repository. Kanso's built around a powerful packaging system, meaning almost all the functionality can be customized by you.
couchdb
javascript
21 days ago
Discovering the Template | Easily Distracted
22 days ago
I can see that another thing I often do in my courses, particularly thematic classes, is provide a “spine” narrative that supports the discussion. For all that I think “coverage” is an uninteresting objective for a class, I clearly recognize that without some core storyline or knowledge base, a class would be nothing but 14 weeks of “another interesting reading”: fun and diverting, but not giving students any sense of cumulative ownership over the subject, a sense that they know something that can be brought to bear in unexpected and creative ways on later readings (and on later experiences once the class is over).
narrative
education
history
22 days ago
Augmenting Human Intellect: A Conceptual Framework - 1962 (AUGMENT,3906,) - Doug Engelbart Institute
23 days ago
By "augmenting human intellect" we mean increasing the capability of a man to approach a complex problem situation, to gain comprehension to suit his particular needs, and to derive solutions to problems. Increased capability in this respect is taken to mean a mixture of the following: more-rapid comprehension, better comprehension, the possibility of gaining a useful degree of comprehension in a situation that previously was too complex, speedier solutions, better solutions, and the possibility of finding solutions to problems that before seemed insoluble. And by "complex situations" we include the professional problems of diplomats, executives, social scientists, life scientists, physical scientists, attorneys, designers--whether the problem situation exists for twenty minutes or twenty years. We do not speak of isolated clever tricks that help in particular situations. We refer to a way of life in an integrated domain where hunches, cut-and-try, intangibles, and the human "feel for a situation" usefully co-exist with powerful concepts, streamlined terminology and notation, sophisticated methods, and high-powered electronic aids.
hci
hypertext
webhistory
23 days ago
One Book, Many Readings
23 days ago
Visualizations of the structures of Choose Your Own Adventure Books.
hypertext
infoviz
design
23 days ago
Jakob Nielsen - Hypertext '87
23 days ago
Hypertext '87 was the first large-scale meeting devoted to the hypertext concept. Before the workshop, hypertext had been considered a somewhat esoteric concept of interest to a few fanatics only.
hypertext
23 days ago
Emanuel Goldberg, Electronic Document Retrieval, And Vannevar Bush's Memex
23 days ago
Vannevar Bush's famous paper "As We May Think" (1945) described an imaginary information retrieval machine, the Memex. The Memex is usually viewed, unhistorically, in relation to subsequent developments using digital computers. This paper attempts to reconstruct the little-known background of information retrieval in and before 1939 when "As We May Think" was originally written. The Memex was based on Bush's work during 1938-1940 developing an improved photoelectric microfilm selector, an electronic retrieval technology pioneered by Emanuel Goldberg of Zeiss Ikon, Dresden, in the 1920s. Visionary statements by Paul Otlet (1934) and Walter Schuermeyer (1935) and the development of electronic document retrieval technology before Bush are examined.
goldberg
webhistory
webinfo
memex
searchengine
history
23 days ago
Michael Buckland's Wilhelm Ostwald Page
23 days ago
Michael Buckland's notes on Wilhelm Ostwald.
"Ostwald discussed problems of information management with Paul Otlet, co-founder of the International Institute for Bibliography in Brussels, in 1910. He used most of his Nobel Prize money to finance a similar organization, Die Bruecke ('The Bridge'), an 'international institute for the organizing of intellectual work,' which he founded in Munich with Karl Wilhelm Buehrer and Adolf Saager in June 1911. The manifesto of The Bridge, entitled 'The Organizing of Intellectual Work,' was published in German and in Esperanto ('everybody's second language') in 1911."
"They advocated 'the monographic principle' (hypertext), technical standards, the use of the Universal Decimal Classification, and the idea of a World Brain. The Bridge ended in 1913 after publishing numerous pamphlets. Ostwald died in 1932. One lasting legacy of his work is the international standard for paper sizes (A4 etc.)."
history
information
ostwald
23 days ago
Alle Kennis van de Wereld (Biography of Paul Otlet)
23 days ago
A free documentary about Paul Otlet, narrated by W. Boyd Rayward, his biographer.
otlet
biography
documentary
video
23 days ago
True Films: The Man Who Wanted to Classify the World
23 days ago
Kevin Kelly's notes on _The Man Who Wanted to Classify the World_, a French documentary on Paul Otlet.
otlet
history
documentary
webhistory
webinfo
23 days ago
The Mundaneum Museum Honors the First Concept of the World Wide Web
23 days ago
NYT article on Paul Otlet, with an excellent graphic explaining the Mundaneum system, and a video excerpt from the documentary on him.
webhistory
webinfo
otlet
history
information
technology
23 days ago
Michael Buckland's Emanuel Goldberg Page
23 days ago
Michael Buckland's notes on Emanuel Goldberg, with links to other resources.
"Emanuel Goldberg (Portrait) was born in Moscow, Russia, in 1881, a chemist, inventor, and industrialist who contributed to almost all aspects of imaging technology in the first half of the twentieth century: photographic sensitometry, reprographics, standardized film speeds, color printing (moiré effect), aerial photography, extreme microphotography (microdots), optics, camera design (the Contax), the important, early hand-held Kinamo movie camera, and early television technology. He received his doctorate from Wilhelm Ostwald's institute in Leipzig in 1906."
goldberg
webhistory
history
film
microfilm
searchengine
23 days ago
Michael Buckland's Paul Otlet Page
23 days ago
Michael Buckland's notes on Paul Otlet, with links to other Otlet resources.
"Paul Otlet (portrait) was born in Brussels, Belgium, in 1868. His monumental book Traité de documentation (Brussels, 1934) was both central and symbolic in the development of information science - then called 'Documentation' - in the first half of this century. In addition, it reminds us of something that has been too widely forgotten: That this field did have a lively existence in the early decades of this century and a sophistication concerning theory and information technology that now commonly surprises people."
webhistory
webinfo
otlet
cataloging
classification
history
hypertext
libraries
23 days ago
Ian Bogost - This is a Blog Post about the Digital Humanities
23 days ago
Last sentence here nails the State of Digital Humanities in 2012:
from twitter
23 days ago
Learn by Doing - Code School
24 days ago
Code School is all about learning by doing. Our educational courses combine video, coding in the browser, and gamification principles to make learning more fun and therefore more effective.
programming
education
tutorials
24 days ago
Learn to code | Codecademy
24 days ago
Codecademy is the easiest way to learn how to code.
programming
tutorial
education
webinfo
24 days ago
RDF Cookbook for Digital Humanities
25 days ago
The purpose of this cookbook is to document and discuss the use of RDF in digital humanities. Its focus is specific applications as found in the real world, though a few general principles are suggested. It assumes that you’re vaguely comfortable with RDF and RDFa.
rdf
rdfa
linkeddata
digitalhumanities
25 days ago
Simple JavaScript Applications with CouchDB - CouchApp.org
25 days ago
CouchApps are JavaScript and HTML5 applications served directly from CouchDB. If you can fit your application into those constraints, then you get CouchDB's scalability and flexibility "for free" (and deploying your app is as simple as replicating it to the production server).
couchdb
html5
javascript
webinfo
25 days ago
Pannapacker at MLA: Alt-Ac Is the Future of the Academy - Brainstorm - The Chronicle of Higher Education
25 days ago
While I fully support the notion of "alt-ac" for humanities PhDs, I hope that it doesn't mislead students into thinki…
from twitter
25 days ago
The Meaning and The Mining of Legal Texts
26 days ago
Positive law, inscribed in legal texts, entails an authority not inherent in literary texts, generating legal consequences that can have real effects on a person’s life and liberty. The interpretation of legal texts, necessarily a normative undertaking, resists the mechanical application of rules, though still requiring a measure of predictability, coherence with other relevant legal norms and compliance with constitutional safeguards. The present proliferation of legal texts on the internet (codes, statutes, judgments, treaties, doctrinal treatises) renders the selection of relevant texts and cases next to impossible. We may expect that systems to mine these texts to find arguments that support one’s case, as well as expert systems that support the decision-making process of courts, will end up doing much of the work.
This raises the question of the difference between human interpretation and computational pattern-recognition and the issue of whether this difference makes a difference for the meaning of law. Possibly, data mining will produce patterns that disclose habits of the minds of judges and legislators that would have otherwise gone unnoticed (reinforcing the argument of the ‘legal realists’ at the beginning of the 20th century). Also, after the data analysis it will still be up to the judge to decide how to interpret the results or up to the prosecution which patterns to engage in the construction of evidence (requiring a hermeneutics of computational patterns instead of texts). My focus in this paper regards the fact that the mining process necessarily disambiguates the legal texts in order to transform them into a machine-readable data set, while the algorithms used for the analysis embody a strategy that will co-determine the outcome of the patterns. There seems a major due process concern here to the extent that these patterns are invisible for the naked human eye and will not be contestable in a court of law, due to their hidden complexity and computational nature.
This position paper aims to explain what is at stake in the computational turn with regard to legal texts. This prepares for the question I want to put forward to those involved in distant reading and not-reading of texts: could a visualization of computational patterns constitute a new way of un-hiding the complexity involved, opening the results of computational ‘knowledge’ to citizens’ scrutiny?
textmining
machinelearning
visualization
digitalhumanities
law
26 days ago
The Association of American Publishers
4 weeks ago
I just threw up in my mouth a little. @RepoRat: DAMMIT NO. MT @CopyrightLibn: Vile: Research Works Act
from twitter
4 weeks ago
Untitled (http://lists.okfn.org/pipermail/open-bibliography/2012-January/001272.html)
4 weeks ago
RT @acka47: Obviously, @oclc hasn't understood #opendata yet: Can someone please scrape & publish the FAST data? /c ...
opendata
from twitter
4 weeks ago