arthegall + opensource   206

boyter/BATF - GitHub
"Big-Ass Text File" -- constantly-updated, versioned, infinitely (YKWIM)-sized text file. Backed by mysql.
via:?  editor  opensource  text  writing 
december 2011 by arthegall
tmux
Opensource alternative to 'screen'.
screen  linux  terminal  utility  opensource 
december 2011 by arthegall
RDKit
Whoa -- did *not* realize that Greg Landrum was (also) "The RDKit guy."
cheminformatics  machinelearning  greg-landrum  software  library  python  opensource 
august 2011 by arthegall
Apache Tika - Apache Tika
Automated metadata extraction, integrated with Lucene and SOLR.
apache  opensource  metadata  java  lucene  parsing 
july 2011 by arthegall
JCIFS
Does NTLM v2. I wish we didn't have to, though.
ntlm  authentication  networking  java  samba  library  opensource  lgpl 
july 2011 by arthegall
TileMill | Home
Maybe a fun project to play with in the future...
cartography  mapping  geography  web  javascript  css  opensource 
june 2011 by arthegall
Delve inside the Lucene indexing mechanism
Understanding some of the details of the Lucene index has been on my to-do list for a while now...
lucene  java  ibm  tutorial  search  text  opensource  tab-dump 
april 2011 by arthegall
OSQA | The Open Source Q&A System
Django-based Question-and-answer site. Stack-exchange-ey.
stack-exchange  django  opensource  python  web  questions  software  collaboration 
april 2011 by arthegall
SIREn: Semantic Information Retrieval Engine
"SIREn - Semantic Information Retrieval Engine - a Lucene plugin to overcome these shortcomings and efficiently index and query RDF, as well as any textual document with an arbitrary amount of metadata fields."
lucene  rdf  semanticweb  apache  opensource  library  search  text  indexing 
february 2011 by arthegall
JSON Tools - Home
LGPL'ed Java tools for JSON, including a JSON Schema validator. To compare with my own JSON schema validator project...
json  javascript  java  programming  library  opensource  lgpl 
february 2011 by arthegall
elda - Project Hosting on Google Code
Linked-Data-in-Java. The jury's still out (for me) on how useful this kind of thing is, but maybe it's a good idea to accumulate a few implementations.
java  linked-data  google-code  opensource  library  software 
january 2011 by arthegall
PoDoFo
Another PDF parsing and extraction library, this one in C++.
c++  pdfs  parsing  library  software  opensource  from delicious
january 2011 by arthegall
COIN-OR
Open source "industrial strength" linear and integer programming.
linear-programming  integer-programming  software  tool  opensource  from delicious
january 2011 by arthegall
Main Page - Open Babel
"Open Babel is a chemical toolbox designed to speak the many languages of chemical data. It's an open, collaborative project allowing anyone to search, convert, analyze, or store data from molecular modeling, chemistry, solid-state materials, biochemistry, or related areas." -- Includes a parser/filter engine for SMARTS expressions.
smarts  structure  bioinformatics  chemoinformatics  chemistry  software  opensource  molecular-modeling  from delicious
january 2011 by arthegall
pgbovine's CDE at master - GitHub
On Linux, observes the execution of a script or program and (in a fairly general way) builds a virtual environment (think: "like virtualenv, but not restricted to python") that can be packaged up and given to someone else, so that they can run the same code with the same dependencies. A research project, so caveat downloader. It's (of course) a really cool idea, and probably very useful, although poking through the code makes it seem like the "black magic quotient" is a little high.
virtualization  environment  github  opensource  linux  programming  repeatability  via:amitp 
november 2010 by arthegall
Mac OS X Multitouch Event API — Project Kenai
Apache-licensed OS X library that lets you, via JNI, get access to finger motions on the trackpad in Java programs.
multitouch  mac  os-x  java  programming  library  opensource  apache-license  ui 
october 2010 by arthegall
Blip: Biomedical Logic Programming
A package from Chris Mungall... (Can I modify that hoary Greenspun quote about "sufficiently complicated systems" and Scheme interpreters here? Something like, "any sufficiently complicated bioinformatics software system contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Prolog, Sparql, or a DL reasoner"... of course, it helps if you *start* with Prolog, as in this case. Not that any of this is a bad thing. Maybe worthwhile thinking about how to generalize.)
greenspun  programming  bioinformatics  chris-mungall  logic-programming  prolog  opensource 
september 2010 by arthegall
Apache Ivy
Dependency management, integrated with ant.
ant  build  programming  software  ivy  apache  opensource 
september 2010 by arthegall
wavii's pfp at master - GitHub
pfp == "pretty fast parser." (== "like the Stanford NLP parser, but faster") Now all I need is a JavaCC grammar for JavaCC grammars, and we'll be good to go, right?
parsing  grammar  dynamic-programming  software  tool  opensource  nlp  language 
september 2010 by arthegall
FileUpload - Home
"FileUpload parses HTTP requests which conform to RFC 1867, "Form-based File Upload in HTML"."
apache  programming  java  web  forms  opensource  library 
august 2010 by arthegall
reJ - Project homepage
Bytecode viewing and manipulation library for Java.
java  reflection  bytecode  library  opensource  programming 
august 2010 by arthegall
Ganymed SSH-2 for Java
"Ganymed SSH-2 for Java is an open source library which implements the SSH-2 protocol in pure Java (tested on J2SE 1.4.2 and 5.0). It allows one to connect to SSH servers from within Java programs. It supports SSH sessions (remote command execution and shell access), local and remote port forwarding, local stream forwarding, X11 forwarding, SCP and SFTP. There are no dependencies on any JCE provider, as all crypto functionality is included."
java  scp  ssh  library  bsd  opensource  networking  cryptography 
august 2010 by arthegall
"Open-Source Pharmaceutical Babble" (In the Pipeline)
"And that's it; that's the payoff. We'll all just hop to it, enabling and facilitating, expanding and evolving, stimulating and focusing. None of those are concrete verbs suggesting real courses of action. Whenever you see someone slip into that sort of talk, you can be sure that (at the very least) they have difficulty communicating whatever specific ideas they have. Or (more likely) that they don't have any specific ideas to tell you about at all."
opensource  buzzwords  pharmaceuticals  research  community  web  futurism  science 
july 2010 by arthegall
openscholar
"A full-featured web site-creation package solely for the academic community. Scholars create web sites in seconds and can easily manage everything themselves (for free)" -- yet another Drupal-based system for "community creation." (YADSCC.)
drupal  education  academia  research  software  web  community  cms  opensource 
july 2010 by arthegall
Eastern District of New York - Live Database Version 4.0.3-U.S. District Court
"Please be aware that RECAP is "open-source" software, which can be freely obtained by anyone with Internet access and modified for benign or malicious purposes, such as facilitating unauthorized access to restricted documents or seeding the repository with falsified or spurious documents." --- ZOMG, not freely obtainable on the interwebs!!1! Noooo000ooo!!!
via:carl-malamud  recap  pacer  opensource  legal  idiocy 
july 2010 by arthegall
"Why WordPress Themes are Derivative of WordPress" (Mark on WordPress)
For all the sound and fury about this GPL-vs-WordPress-plugin-author mess, why do I feel as if this sentence -- "If modules are designed to run linked together in a shared address space..." -- is actually the crux of the argument? None of the other mess about "common language," or "uses an API," or "intended to work together," actually functions as a bright line. To my eye, this is the only reasonable proposal that's actually been made for differentiating derivative vs. non-derivative software with different origins (no surprise that it's from the FSF, who ostensibly know what they're talking about) -- although technically, that's not the case here, since some of the Thesis code is actually copied from WP, meaning it's pretty clearly "derivative." Now, given that it's the only reasonable proposal put forward, the next question is: Is this a distinction that's actually present in statutes or case law? Or is this simply the desired reading of the FSF itself?
gpl  fsf  opensource  licensing  web  argument  copyright  software  wordpress  law  license  via:manuel 
july 2010 by arthegall
java-twitter - Project Hosting on Google Code
For when I finally get around to creating a command-line Twitter client for myself, so I don't have to reload the stupid thing in Firefox all the time.
twitter  java  api  software  library  client  programming  google  opensource 
july 2010 by arthegall
OpenSCAD - The Programmers Solid 3D CAD Modeller
"OpenSCAD is not an interactive modeller. Instead it is something like a 3D-compiler that reads in a script file that describes the object and renders the 3D model from this script file (see examples below). This gives you (the designer) full control over the modelling process and enables you to easily change any step in the modelling process or make designes that are defined by configurable parameters."
via:ltu  3d  software  opensource  graphics  modeling  visualization  design  cad 
june 2010 by arthegall
logkext - Project Hosting on Google Code
Keylogger for OS X, as I muse more about that "log all your own keystrokes" idea. Browsing this code (reduced C++ which abuses #defines in some weird ways) has helped me understand a little more about low-level OS X programming. At the same time, I'm dead scared of installing something like this on my laptop without completely reviewing every inch of the code first.
osx  programming  c++  opensource  keystroke-logging  apple  keylogger 
april 2010 by arthegall
Guava: Google Core Libraries for Java 1.6
Apparently includes the Google Collections library, plus a bunch of other stuff, some of which I end up rewriting on any Java system I use anyway.
guava  google  java  programming  library  opensource  collections 
january 2010 by arthegall
Jpcap Tutorial
Jpcap is a Java library for capturing packets and, more importantly, for reading and writing previously captured packet-capture (pcap) dump files.
library  java  software  networking  debugging  opensource 
december 2009 by arthegall
salmon-protocol - Project Hosting on Google Code
"We want to stop fragmenting conversations on the Internet. The Salmon protocol defines how comments can swim upstream to the resource being discussed. It's open, standards based, decentralized, abuse resistant, and user centric."
via:manuel  web  conversation  discussion  protocol  opensource 
november 2009 by arthegall
kowari
Open-source cross-platform triple-store? Another candidate for "alternative to NC's Virtuoso installation."
triple-store  software  opensource  data  database  java  semanticweb  tool  work 
november 2009 by arthegall
Welcome to Solr
Apache project. "Solr is an open source enterprise search server based on the Lucene Java search library, with XML/HTTP and JSON APIs, hit highlighting, faceted search, caching, replication, a web administration interface and many more features. It runs in a Java servlet container such as Tomcat."
solr  apache  lucene  search  web  java  opensource  server  software 
october 2009 by arthegall
GNU M4 - GNU Project - Free Software Foundation (FSF)
JAR also (apparently) makes heavy use of M4, the pre-processor. I will admit that this (on its face) offends my sensibilities a little bit. But he's smarter than me, so I probably owe it a second look...
preprocessor  programming  tool  software  opensource  gnu  jonathan-rees 
october 2009 by arthegall
LSW - ESW Wiki
These are Alan's tools for creating OWL and dealing with SPARQL. Jonathan uses XQuery and Perl, and the time is fast approaching when I'm going to need to choose between the two (or forge my own, third, path).
sparql  owl  tool  semanticweb  lisp  library  ontology  opensource  alan-ruttenberg 
october 2009 by arthegall
Virtuoso Open-Source Wiki : OpenLink Virtuoso Open-Source Edition: Downloads
Software bridges (downloads) from Virtuoso to other platforms -- notably, Jena and Sesame.
jena  sesame  software  opensource  java  semanticweb  virtuoso 
october 2009 by arthegall
"Java OCR" (Ron Cemer's Blog)
Pure Java OCR == good. (Text identification and alignment seem like the real issues in a free-floating environment, but this is a start.)
via:yaroslavb  ocr  java  software  opensource 
september 2009 by arthegall
OBO-Edit - Overview
"an open-source ontology editor written in Java" -- for OBO files, obvs.
obo  ontology  editor  java  software  opensource  semanticweb  work 
september 2009 by arthegall
Scrapy | An open source web scraping framework for Python
"Scrapy is a high level scraping and web crawling framework for writing spiders to crawl sites and parse their pages for all kinds of purposes"
python  web  software  opensource  crawling 
august 2009 by arthegall
G3DATA
Free software for the same! Even better.
via:arsyed  odr  data  visualization  graphics  extraction  software  opensource 
august 2009 by arthegall
"Apollo 11 mission's 40th Anniversary: One large step ..." (Google Code Blog)
"To commemorate this event the Command Module code (Comanche054) and Lunar Module code (Luminary099) have been transcribed from scanned images to run on yaAGC (an open source AGC emulator) by the Virtual AGC and AGS project." --- Awe-some.
opensource  google  lunar-lander  nasa  history  computing  moon  via:chl 
july 2009 by arthegall
skia - Google Code
This is the (2D graphics) library for which a skeleton was released when Chrome came out, but not the real thing... right? But now, it appears to be completely released? If so, very very cool.
library  programming  graphics  google  chrome  opensource  via:shivak 
march 2009 by arthegall
http://www.sagebase.org/publications.html
Ah, so this is what happened to Rosetta... "The foundation for Sage’s activities are the pioneering studies conducted by researchers at Rosetta Inpharmatics, a subsidiary of Merck & Co., Inc.. Here is a sampling of 2008 publications by these scientists that illustrate the value and potential of the advanced Sage technology." --- That list is fine, but likely not enough to build an entire company around, right?
list  opensource  publications  pharmaceuticals  sage 
march 2009 by arthegall
"Gene Expression: You Haven't Been Thinking Big Enough?" (In the Pipeline)
"Well, here’s another crack at open-source science. Stephen Friend, the previous head of Rosetta (before and after being bought by Merck), is heading out on his own to form a venture in Seattle called Sage. The idea is to bring together genomic studies from all sorts of laboratories into a common format and database, with the expectation that interesting results will emerge that couldn’t be found from just one lab’s data. ... once you get down to the many labs that can do high-level genomics (or to the even larger number that can do less extensive sequencing), the problems will be many. Sage is also going to look at gene expression levels, something that's easier to do (although we're still not in weekend-garage territory yet). Some people would say that it's a bit too easy to do: there are a lot of different techniques in this field, not all of which always yield comparable data, to put it mildly. ... Then you've got the really hard issues: intellectual property, for one."
science  genomics  technology  opensource 
march 2009 by arthegall
Simple 4.1.5
"The primary focus of the project is to provide a truly embeddable Java based HTTP engine capable of handling enormous loads. Simple provides a truly asynchronous service model, request completion is driven using an internal, transparent, monitoring system. This allows Simple to vastly outperform most popular Java based servers in a multi-tier environment, as it requires only a very limited number of threads to handle very high quantities of concurrent clients." --- I keep thinking that these standalone, Java-based, easily-deployable server-options would be really useful in some kind of distributed graph database project. But I still haven't nailed down the details in my mind... obviously this isn't going to happen while I'm still at school.
web  opensource  java  http  server 
february 2009 by arthegall
sparqlite - Google Code
"An implementation of the SPARQL protocol, exploring issues of robustness and scalability. Currently based on the Jena SDB/TDB libraries." It'd be nice to see more documentation, to understand exactly what's going on here.
opensource  semanticweb  google-code  sparql  jena  via:danja 
february 2009 by arthegall
HttpClient - HttpClient Home
"Although the java.net package provides basic functionality for accessing resources via HTTP, it doesn't provide the full flexibility or functionality needed by many applications. The Jakarta Commons HttpClient component seeks to fill this void by providing an efficient, up-to-date, and feature-rich package implementing the client side of the most recent HTTP standards and recommendations. See the Features page for more details on standards compliance and capabilities."
web  java  programming  apache  opensource  http 
january 2009 by arthegall
Pronto—A Probabilistic Reasoner for OWL DL and Pellet
"Pronto is an extension of Pellet that enables probabilistic knowledge representation and reasoning in OWL ontologies. Pronto is distributed as a Java library equipped with a command line tool for demonstrating its basic capabilities."
software  opensource  semanticweb  probabilistic-methods  description-logic  owl  reasoner 
january 2009 by arthegall
"Google Blog Converters 1.0 Released" (Google Open Source Blog)
So, embedded in this code somewhere must be stuff that parses out content from different prominent blogging platforms... Hmm....
web  tool  blogging  programming  google  opensource  parsing  wordpress 
january 2009 by arthegall
Canto
"Canto is an Atom/RSS feed reader for the console that is meant to be quick, concise, and colorful." --- Hmm, yes, this might be quite useful (when running in a screen, of course).
web  via:chl  software  opensource  rss  unix  reader 
december 2008 by arthegall
Pellet: The Open Source OWL DL Reasoner
"Pellet is an open source reasoner for OWL 2 DL in Java. It provides standard and cutting-edge reasoning services for OWL ontologies." We used this in our 6.830 class project, as a background example. Ted claims that Jena+Pellet is state-of-the-art for the Semantic Web, that it's "used by NASA."
inference  semanticweb  java  opensource  logic  ontology  description-logic 
december 2008 by arthegall
GDAL: GDAL - Geospatial Data Abstraction Library
"GDAL is a translator library for raster geospatial data formats that is released under an X/MIT style Open Source license by the Open Source Geospatial Foundation. As a library, it presents a single abstract data model to the calling application for all supported formats."
mapping  library  software  opensource  api  gis 
december 2008 by arthegall
TISCH - Tangible Interactive Surfaces for Collaboration between Humans
"We believe that the logical next step is the design of a software architecture, which allows developers to focus on writing applications instead of focusing on low-level stuff like hardware issues, gesture recognition and so on." A software library, for doing higher-level programming with multi-touch stuff. "Inspired by Jeff Han."
multitouch  library  software  opensource 
december 2008 by arthegall
Lemonade Stand
Promises Visual-Basic source code that is faithful to the original lemonade-stand game. But I can't seem to extract it properly (and I can't find a writeup of the original rules on the web). Help, please? I just want to look at the source code!
lemonade-stand  game  childhood  software  opensource 
november 2008 by arthegall
ScapeToad - cartogram software by the Choros laboratory
An alternate implementation of the Newman and Gastner diffusion cartogram method. Open source, and in Java.
visualization  opensource  maps  software  java  via:bryan 
november 2008 by arthegall
Apache POI - Java API To Access Microsoft Format Files
"The POI project consists of APIs for manipulating various file formats based upon Microsoft's OLE 2 Compound Document format using pure Java. In short, you can read and write MS Excel files using Java. Soon, you'll be able to read and write Word, PowerPoint and Visio files using Java. POI is your Java Excel solution as well as your Java Word solution. However, we have a complete API for porting other OLE 2 Compound Document formats, and welcome others to participate. "
apache  microsoft  office  file-format  programming  library  opensource 
october 2008 by arthegall
Sylvester - Vector and Matrix math for JavaScript
MIT-Licensed linear algebra routines for JavaScript. Can you see where this is going? :-)
programming  library  javascript  browser  linear-algebra  opensource 
october 2008 by arthegall
Cascading
Dataflow language, in Java, on top of Hadoop. Via Simon Willison.
tools  software  opensource  parallel  programming  java  mapreduce  hadoop 
october 2008 by arthegall
Alchemy - Open Source AI
Implementations of, among other things, the Markov Logic Network stuff. Linked to from Pedro Domingos's ongoing class (thank you, Prof. Shalizi!), for which there are now five lectures online.
opensource  machinelearning  software  markov-logic  inference  logic 
september 2008 by arthegall
bibapp - Google Code
"BibApp matches researchers on your campus with their publication data and allows you to mine the data to see collaborations and to find experts in research areas. BibApp makes it easy to see what publications can be archived for greater access and impact and makes it easy to push those publications directly into an institutional or other repository. "
tool  software  google-code  opensource  citations  research  collaboration 
september 2008 by arthegall