donturn + analysis   71

riivo/pwum - GitHub
pwum is a set of Python scripts for working with web log files: extracting frequent patterns and clustering sessions.

Two main functions:

Finding frequent patterns. Extracts pages that are frequently accessed together in web sessions. Uses the traditional frequent pattern mining algorithm Apriori.

Finding similar sessions based on behaviour, i.e., visited pages, by clustering. The available methods either build a Markov-chain-like transition matrix from each session and cluster those matrices, or represent sessions as simple feature vectors. Clustering is currently done with the k-means algorithm.
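
The two functions described above can be illustrated with a minimal sketch. This is not pwum's own code: the toy sessions, the function names `frequent_itemsets` and `transition_vector`, and the `min_support` parameter are all assumptions made for illustration. The first function does an Apriori-style level-wise search for pages that occur together in at least `min_support` sessions; the second turns one session into a flattened, row-normalised transition matrix, the kind of vector that could then be passed to a k-means implementation.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each session is the ordered list of pages one visitor requested.
sessions = [
    ["/home", "/docs", "/download"],
    ["/home", "/docs"],
    ["/home", "/blog"],
    ["/docs", "/download"],
]


def frequent_itemsets(sessions, min_support):
    """Apriori-style search: return {frozenset of pages: number of sessions containing it}."""
    baskets = [frozenset(s) for s in sessions]
    # Level 1: count single pages and keep those meeting the support threshold.
    counts = Counter(frozenset([page]) for b in baskets for page in b)
    current = {s: n for s, n in counts.items() if n >= min_support}
    result = dict(current)
    k = 2
    while current:
        # Join step: unions of frequent (k-1)-sets that have exactly k pages.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        counts = Counter(c for b in baskets for c in candidates if c <= b)
        current = {s: n for s, n in counts.items() if n >= min_support}
        result.update(current)
        k += 1
    return result


def transition_vector(session, pages):
    """Flatten a row-normalised page-to-page transition matrix into a feature vector."""
    index = {page: i for i, page in enumerate(pages)}
    n = len(pages)
    matrix = [[0.0] * n for _ in range(n)]
    # Count each consecutive (source page, destination page) step in the session.
    for src, dst in zip(session, session[1:]):
        matrix[index[src]][index[dst]] += 1.0
    # Normalise each non-empty row so it sums to 1, as in a Markov chain.
    for row in matrix:
        total = sum(row)
        if total:
            row[:] = [value / total for value in row]
    return [value for row in matrix for value in row]


if __name__ == "__main__":
    for pages, support in frequent_itemsets(sessions, min_support=2).items():
        print(sorted(pages), support)
    all_pages = sorted({page for s in sessions for page in s})
    print(transition_vector(sessions[0], all_pages))
```

Flattening the matrix keeps the order in which pages were visited, which a plain "which pages were seen" feature vector would discard; the cost is a vector whose length grows with the square of the number of distinct pages.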
python  logs  analysis  web  analytics 
may 2011 by donturn
Orange - Data Mining Fruitful & Fun
the Orange app w/python & graphical dev interfaces for data mining & visualization looks nice #opensource
datamining  db  machinelearning  opensource  python  dev  viz  analysis  textmining 
february 2011 by donturn
Wrangler
looks a lot like excel. but in the cloud? but with bigger data sets?
analysis  analytics  data  tools  visualization  viz  etl  kdd  excel 
february 2011 by donturn
google-refine - Project Hosting on Google Code
Google Refine looks like a great tool for cleaning & transforming messy data for use w/web services
analysis  data  datamining  google  tools  dev  datascience  from twitter
november 2010 by donturn
Woopra
real time blog analytics with chat functionality too.
analysis  analytics  blog  messaging  dashboard  statistics  stats  web  wordpress  website  chat 
march 2008 by donturn
R for Mac OS X
Macintosh download page for R, the stat app. From r-project.org.
osx  mac  statistics  stats  analysis  quant  rstats 
november 2006 by donturn
web site use analysis tool
looks interesting with graphical overlays for hosted pages.
access_log  analysis  analytics  web_logs  webtrends 
january 2006 by donturn
