nrabinowitz/pjscrape
23 days ago by edsu
pjscrape is a framework for anyone who's ever wanted a command-line tool for web scraping using Javascript and jQuery. Built to run with PhantomJS, it allows you to scrape pages in a fully rendered, Javascript-enabled context from the command line, no browser required.
javascript
scraping
web
jquery
nodejs
Web Data Commons
9 weeks ago by edsu
Forgot the URL for that report; I guess it's more analysis performed by Web Data Commons on the @commoncrawl data.
web
metadata
webarchiving
MementoProxy: Proxied Web Archives
10 weeks ago by edsu
@azaroth42 @hvdsomp is mementoproxy down? clicking on links leads to "unknown proxy command" pages
web
archiving
memento
Joho the Blog » [2b2k] Linking is a public good
february 2012 by edsu
@dweinberger thanks for putting words to why links (especially external ones) are a public good
urls
links
web
Opening Keynote – Collisions, Chimera and Consonance in Web Content - Jeni Tennison
february 2012 by edsu
... and much-needed humor too, such as: what does the A-Team have to do with web formats like HTML, XML, JSON and RDF?
web
html
rdf
linkeddata
json
xml
videos
Web Science and Digital Libraries Research Group: 2012-02-11: Losing My Revolution: A year after the Egyptian Revolution, 10% of the social media documentation is gone.
february 2012 by edsu
RT @blefurgy: A year after the Egyptian Revolution, 10% of the social media documentation is gone. via @kboughia @p ...
history
webarchiving
preservation
web
egypt
Odeo Releases Twttr | TechCrunch
january 2012 by edsu
"I imagine most users are not going to want to have all of their Twttr messages published on a public website."
twitter
history
web
David Galbraith’s Blog » Blog Archive » Tim Berners-Lee. Confirming The Exact Location Where the Web Was Invented
january 2012 by edsu
“Confirming The Exact Location Where the Web Was Invented”
web
history
timbl
cern
from twitter_favs
A List Apart: Articles: What I Learned About the Web in 2011
december 2011 by edsu
RT @erinengle: Some good observations | What I Learned About the Web in 2011 via @alistapart #community #users #content
web
content
users
community
discontents - It’s all about the stuff: collections, interfaces, power and people
december 2011 by edsu
@wragge's It's All About The Stuff nicely complements @gnat's Libraries: Where It All Went Wrong
archives
libraries
linkeddata
web
politics
Dart; or Why JavaScript has already won - QuirksBlog
october 2011 by edsu
If you want to work with the web, learn JavaScript. If you don’t want to learn JS, stay the hell away from the web.
javascript
google
web
[1105.3459] Analyzing the Persistence of Referenced Web Resources with Memento
october 2011 by edsu
RT @MarthaBunton: Scholarly papers references to web resources are endangered @hvdsomp #fpw11
web
preservation
linkrot
science
fpw11
from iphone
draft-duerst-anno-link-01 - The 'annotation-server' Link Relation Type
september 2011 by edsu
neat, an IETF proposal for an annotation-server link relation to discover annotation services
annotation
web
ietf
rfcs
REST Fest on Vimeo
august 2011 by edsu
RT @talios: Just found a bunch of videos from #restfest 2011 being posted at - time to start watching!
rest
videos
conferences
http
web
restfest