cshalizi + automata_theory   15

Efficiently identifying deterministic real-time automata from labeled data
"We develop a novel learning algorithm RTI for identifying a deterministic real-time automaton (DRTA) from labeled time-stamped event sequences. The RTI algorithm is based on the current state of the art in deterministic finite-state automaton (DFA) identification, called evidence-driven state-merging (EDSM). In addition to having a DFA structure, a DRTA contains time constraints between occurrences of consecutive events. Although this seems a small difference, we show that the problem of identifying a DRTA is much more difficult than the problem of identifying a DFA: identifying only the time constraints of a DRTA given its DFA structure is already NP-complete. In spite of this additional complexity, we show that RTI is a correct and complete algorithm that converges efficiently (from polynomial time and data) to the correct DRTA in the limit. To the best of our knowledge, this is the first algorithm that can identify a timed automaton model from time-stamped event sequences.
A straightforward alternative to identifying DRTAs is to identify a DFA that models time implicitly, i.e., a DFA that uses different states for different points in time. Such a DFA can be identified by first sampling the timed sequences using a fixed frequency, and subsequently applying EDSM to the resulting non-timed event sequences. We evaluate the performance of both RTI and this sampling approach experimentally on artificially generated data. In these experiments RTI outperforms the sampling approach significantly. Thus, we show that if we obtain data from a real-time system, it is easier to identify a DRTA from this data than to identify an equivalent DFA."
in_NB  automata_theory  machine_learning  grammar_induction 
february 2012 by cshalizi
APPLICATIONS OF AUTOMATA THEORY AND ALGEBRA
I actually read chunks of this while in graduate school. It was... strange.
complexity  automata_theory  borderline_psychoceramica  books:noted 
february 2010 by cshalizi
Statistical Estimation Requires Unbounded Memory
"We investigate the existence of bounded-memory consistent estimators of various statistical functionals. This question is resolved in the negative in a rather strong sense. We propose various bounded-memory approximations, using techniques from automata theory and stochastic processes. Some questions of potential interest are raised for future work."

In other words: you need an unbounded memory even to estimate the bias of coin-flips, even if you know that there are only just two possible biases. This is pretty sweet (in a thoroughly negative way). --- A lot of work is done here by the strength of the notion of consistency employed (see e.g. the counter-examples with finite but small limiting error probabilities), but this is the standard one!
automata_theory  statistics  statistical_inference_for_stochastic_processes  computational_complexity  kontorovich.aryeh  kith_and_kin  have_read 
july 2009 by cshalizi
FSA utilities: A toolbox to manipulate finite-state automata
"This paper describes the FSA Utilities toolbox: a collection of utilities to manipulate finite-state automata and finite-state transducers. Manipulations include determinization (both for finite-state acceptors and finite-state transducers), minimization, composition, complementation, intersection, Kleene closure, etc. Furthermore, various visualization tools are available to browse finite-state automata. The toolbox is implemented in SICStus Prolog." Where's something comparable in a sane language? (N.B., must handle transducers.)
automata_theory  programming  re:AoS_project 
june 2009 by cshalizi
[0812.1949] Prediction with Restricted Resources and Finite Automata
"We obtain an index of the complexity of a random sequence by allowing the role of the measure in classical probability theory to be played by a function we call the generating mechanism. Typically, this generating mechanism will be a finite automata. We generate a set of biased sequences by applying a finite state automata with a specified number, $m$, of states to the set of all binary sequences. Thus we can index the complexity of our random sequence by the number of states of the automata. We detail optimal algorithms to predict sequences generated in this way."
prediction  automata_theory  complexity_measures  to_read  to_be_shot_after_a_fair_trial 
december 2008 by cshalizi

Copy this bookmark:



description:


tags: