FDADAP floppy disk adapter
22 days ago
The D Bit FDADAP board is a small adapter which adapts 8" floppy disk drives (Shugart SA800 style bus) to work with the PC 3.5"/5.25" floppy disk cable pinout.
floppies
digital-curation
hardware
Lightgrep: a Multipattern Regular Expression Search Tool for Digital Forensics
7 weeks ago
We describe lightgrep, our implementation of a multipattern regular
expression search tool, which can efficiently search huge data streams.
lightgrep addresses several shortcomings of existing tools used in digital
forensics by taking advantage of some recent (and some not-so-recent)
developments in automata theory. We state the characteristics of regular
expression searching in forensics, review current approaches, and then
discuss lightgrep’s implementation in detail, based on direct simulation
of nondeterministic finite automata, including a number of practical
optimizations related to searching with large pattern sets.
forensics
grep
regex
software
research
pdf
DanielHeath/excesselt
8 weeks ago
A Ruby library that you could use instead of XSLT. Very clean syntax.
ruby
xml
programming
xslt
MIST: The MITRE Identification Scrubber Toolkit
8 weeks ago
The MITRE Identification Scrubber Toolkit (MIST) is a suite of tools for identifying and redacting personally identifiable information (PII) in free-text medical records. MIST helps you replace these PII either with obscuring fillers, such as [NAME], or with artificial, synthesized, but realistic English fillers.
nlp
redaction
tools
software
MSpace at the University of Manitoba: Filling up the house: building an appraisal strategy for curling archives in Manitoba
10 weeks ago
Curling is an important part of the Canadian cultural landscape, and nowhere is this more evident than in Manitoba. However, the documentation of curling records within archival repositories in the province has occurred without a strategic plan. This thesis first explores the modern archival appraisal theories and then proposes an appraisal model that utilizes a combination of the documentation strategy and macroappraisal in order to develop a strategy for the documentation of curling in Manitoba. Using this model, this thesis first examines the historical and contemporary context of Canadian sport in order to determine curling’s place within it, and then identifies five key functions of curling in order to evaluate, using function-based appraisal methodologies, the quality of the records that have been collected in archival repositories. The functions, structures, and records of two urban curling clubs and one rural curling club in Manitoba are then examined as case studies, and an appraisal strategy is suggested in order to better ensure that the records documenting curling in Manitoba are preserved. This strategy can be used as a template not only for appraising the records of curling, but for all sports.
archives
research
sports
appraisal
Open Metadata Pathway
11 weeks ago
The project is a collaboration between King's College London Archives, King's Centre for e-Research and the University of London Computer Centre (ULCC), utilising data held by partners in AIM25 (Archives in London and the M25) - the archive description aggregation service maintained by King's Archives and ULCC. The Open Metadata Pathway project will deliver a demonstrator of the effectiveness of opening up archival catalogues to widened automated linking and discovery.
archives
metadata
linked-data
lodlam
Archive@NYU : Demographics of Mechanical Turk
february 2012
We present the results of a survey that collected information about the demographics of participants on Amazon Mechanical Turk, together with information about their level of activity and motivation for working on Amazon Mechanical Turk. We find that approximately 50% of the workers come from the United States and 40% come from India. Country of origin tends to change the motivating reasons for workers to participate in the marketplace. Significantly more workers from India participate on Mechanical Turk because the online marketplace is a primary source of income, while in the US most workers consider Mechanical Turk a secondary source of income. While money is a primary motivating reason for workers to participate in the marketplace, workers also cite a variety of other motivating reasons, including entertainment and education.
labor
mturk
amazon
demographics
statistics
Beerjobber.com - Online Beer Store | Buy Beer Online | Craft Beer Shopping and Delivery
february 2012
Browse our beers. Add to your cart and checkout quickly and securely. The beer is then picked up fresh from the brewery and delivered straight to your door. For brewers, we deal with all the hassle of shipping across state lines, so their beer can get into the hands of the faithful, without the headaches.
beer
shopping
Apache Hadoop Goes Realtime at Facebook
february 2012
Facebook recently deployed Facebook Messages, its first ever
user-facing application built on the Apache Hadoop platform.
Apache HBase is a database-like layer built on Hadoop designed
to support billions of messages per day. This paper describes the
reasons why Facebook chose Hadoop and HBase over other
systems such as Apache Cassandra and Voldemort and discusses
the application's requirements for consistency, availability,
partition tolerance, data model and scalability. We explore the
enhancements made to Hadoop to make it a more effective
realtime system, the tradeoffs we made while configuring the
system, and how this solution has significant advantages over the
sharded MySQL database scheme used in other applications at
Facebook and many other web-scale companies. We discuss the
motivations behind our design choices, the challenges that we
face in day-to-day operations, and future capabilities and
improvements still under development. We offer these
observations on the deployment as a model for other companies
who are contemplating a Hadoop-based solution over traditional
sharded RDBMS deployments.
facebook
hadoop
hbase
mapreduce
programming
Princeton University Library Finding Aids
february 2012
dev site for princeton's finding aids, using bootstrap
archives
ux
bootstrap
css
michael/substance - GitHub
february 2012
A data-driven and cloud-aware document authoring engine
cms
couchdb
writing