java.lang.Object
  morfologik.fsa.FSA
public abstract class FSA
This class implements Finite State Automaton traversal as described in Jan Daciuk's Incremental Construction of Finite-State Automata and Transducers, and Their Use in the Natural Language Processing (PhD thesis, Technical University of Gdansk).
This is an abstract base class for all forms of binary storage present in Jan Daciuk's FSA package.
Field Summary | |
---|---|
protected byte | filler - The meaning of this field is not clear (check the FSA documentation). |
protected byte | gotoLength - Size of transition's destination node "address". |
protected byte | version - Dictionary version (derived from the combination of flags). |
static byte | VERSION_5 - Version number for version 5 of the automaton. |
Constructor Summary | |
---|---|
protected | FSA(java.io.InputStream fsaStream, java.lang.String dictionaryEncoding) - Creates a new automaton reading the FSA automaton from an input stream. |
Method Summary | |
---|---|
char | getAnnotationSeparator() - Returns the annotation separator character, converted to a character according to the encoding scheme passed to the constructor of this class. |
abstract int | getArc(int node, byte label) - Returns the identifier of an arc leaving node and labeled with label. |
abstract byte | getArcLabel(int arc) - Returns the label associated with a given arc. |
abstract int | getEndNode(int arc) - Returns the end node pointed to by a given arc. |
char | getFillerCharacter() - Returns the filler character, converted to a character according to the encoding scheme passed to the constructor of this class. |
abstract int | getFirstArc(int node) - Returns the identifier of the first arc leaving node, or 0 if the node has no outgoing arcs. |
int | getFlags() - Returns a set of flags for this FSA instance. |
static FSA | getInstance(java.io.File fsaFile, java.lang.String dictionaryEncoding) - Attempts to instantiate an appropriate implementation of the FSA for the version found in the file given as the input argument. |
static FSA | getInstance(java.io.InputStream fsaStream, java.lang.String dictionaryEncoding) - Attempts to instantiate an appropriate implementation of the FSA for the version found in the stream given as the input argument. |
abstract int | getNextArc(int node, int arc) - Returns the identifier of the next arc after arc and leaving node. |
abstract int | getNumberOfArcs() - Returns the number of arcs in this automaton. |
abstract int | getNumberOfNodes() - Returns the number of nodes in this automaton. |
abstract int | getRootNode() - Returns the identifier of the root node of this automaton. |
FSATraversalHelper | getTraversalHelper() - Returns an object which can be used to walk the edges of this finite state automaton and match arbitrary sequences against its states. |
int | getVersion() - Returns the version number of the binary representation of this FSA. |
abstract boolean | isArcFinal(int arc) - Returns true if the destination node at the end of this arc corresponds to an input sequence created when building this automaton. |
abstract boolean | isArcTerminal(int arc) - Returns true if this arc does not have a terminating node. |
java.util.Iterator<java.nio.ByteBuffer> | iterator() - Returns an iterator over all binary sequences starting from the initial FSA state and ending in final nodes. |
protected byte[] | readFully(java.io.InputStream stream) - Reads all bytes from an input stream. |
protected void | readHeader(java.io.DataInput in, long fileSize) - Reads an FSA header from a stream. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static final byte VERSION_5
protected byte version
protected byte filler
protected byte gotoLength
Constructor Detail |
---|
protected FSA(java.io.InputStream fsaStream, java.lang.String dictionaryEncoding) throws java.io.IOException
fsaStream - An input stream with FSA automaton.
java.io.IOException - if the dictionary file cannot be read, or version of the file is not supported.
Method Detail |
---|
public final int getVersion()
Returns the version number of the binary representation of this FSA. The version number is derived from the combination of flags and is exactly the same as in Jan Daciuk's FSA package.
public final int getFlags()
Returns a set of flags for this FSA instance. To test for a particular flag, for example the FSAFlags.FLEXIBLE flag, one must perform a bitwise AND:
boolean isFlexible = ((dict.getFlags() & FSA.FSA_FLEXIBLE) != 0);
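The bitwise test above can be wrapped in a small helper. The following is only a sketch: the constant values are hypothetical placeholders chosen for illustration, and `hasFlag` is not part of the FSA class. In real code the integer would come from `getFlags()` and the mask from the library's own constants.

```java
public class FlagCheckSketch {
    // Hypothetical bit values for illustration only; the real constants
    // are defined by the library and may differ.
    static final int FLEXIBLE = 1 << 0;
    static final int OTHER = 1 << 1;

    /** Returns true if every bit of {@code flag} is set in {@code flags}. */
    static boolean hasFlag(int flags, int flag) {
        return (flags & flag) == flag;
    }

    public static void main(String[] args) {
        int flags = FLEXIBLE; // stands in for the value of dict.getFlags()
        System.out.println(hasFlag(flags, FLEXIBLE)); // true
        System.out.println(hasFlag(flags, OTHER));    // false
    }
}
```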
public final char getAnnotationSeparator()
public final char getFillerCharacter()
public abstract int getNumberOfArcs()
public abstract int getNumberOfNodes()
public FSATraversalHelper getTraversalHelper()
public static FSA getInstance(java.io.File fsaFile, java.lang.String dictionaryEncoding) throws java.io.IOException
java.io.IOException - An exception is thrown if no corresponding FSA parser is found or if the input file cannot be opened.
public static FSA getInstance(java.io.InputStream fsaStream, java.lang.String dictionaryEncoding) throws java.io.IOException
java.io.IOException - An exception is thrown if no corresponding FSA parser is found or if the input file cannot be opened.
protected void readHeader(java.io.DataInput in, long fileSize) throws java.io.IOException
java.io.IOException - If the stream is not a dictionary, or if the version is not supported.
protected byte[] readFully(java.io.InputStream stream) throws java.io.IOException
stream
-
java.io.IOException
public java.util.Iterator<java.nio.ByteBuffer> iterator()
Returns an iterator over all binary sequences starting from the initial FSA state and ending in final nodes. The iterator returns a ByteBuffer that changes on each call to Iterator.next(), so if the content should be preserved, it must be copied somewhere else.
It is guaranteed that the returned byte buffer is backed by a byte array and that the content of the byte buffer starts at the array's index 0.
Specified by: iterator in interface java.lang.Iterable<java.nio.ByteBuffer>
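Because the buffer is reused between calls, each sequence has to be copied before the iterator advances. The sketch below illustrates that pattern with a stand-in iterator (`reusingIterator`, hypothetical and limited to sequences of at most 64 bytes) that mimics the documented behavior; with the real class, the iterator would come from `iterator()` instead.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class IteratorCopySketch {
    /** Stand-in iterator that reuses one array-backed buffer for every element. */
    static Iterator<ByteBuffer> reusingIterator(final byte[][] sequences) {
        return new Iterator<ByteBuffer>() {
            private final byte[] shared = new byte[64]; // overwritten on each next()
            private int index = 0;

            public boolean hasNext() { return index < sequences.length; }

            public ByteBuffer next() {
                if (!hasNext()) throw new NoSuchElementException();
                byte[] seq = sequences[index++];
                System.arraycopy(seq, 0, shared, 0, seq.length);
                return ByteBuffer.wrap(shared, 0, seq.length);
            }

            public void remove() { throw new UnsupportedOperationException(); }
        };
    }

    /** Copies the content of each returned buffer before asking for the next one. */
    static List<byte[]> collect(Iterator<ByteBuffer> it) {
        List<byte[]> out = new ArrayList<byte[]>();
        while (it.hasNext()) {
            ByteBuffer bb = it.next();
            // The content starts at array index 0, so copy the first remaining() bytes.
            out.add(Arrays.copyOf(bb.array(), bb.remaining()));
        }
        return out;
    }
}
```

Storing the `ByteBuffer` references themselves instead of copies would leave every list entry pointing at the same, last-written content.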
public abstract int getRootNode()
Returns the identifier of the root node of this automaton.
See Also: getTraversalHelper()
public abstract int getFirstArc(int node)
Returns the identifier of the first arc leaving node, or 0 if the node has no outgoing arcs.
See Also: getTraversalHelper()
public abstract int getArc(int node, byte label)
Returns the identifier of an arc leaving node and labeled with label. An identifier equal to 0 means the node has no outgoing arc labeled label.
See Also: getTraversalHelper()
public abstract int getNextArc(int node, int arc)
Returns the identifier of the next arc after arc and leaving node. Zero is returned if no more arcs are available for the node.
See Also: getTraversalHelper()
public abstract int getEndNode(int arc)
Returns the end node pointed to by a given arc. Terminal arcs (those that point to a terminal state) have no end node representation, and calling this method on one throws a runtime exception.
See Also: getTraversalHelper()
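The low-level methods above can be combined to test whether a byte sequence is stored in an automaton. The sketch below is an assumption-laden illustration, not the library's implementation: `Automaton` is a hypothetical stand-in interface that only borrows the method names and contracts from this page, and `tiny()` is a hand-built three-arc automaton. In practice, `getTraversalHelper()` is the documented entry point for matching.

```java
public class ArcMatchSketch {
    /** Hypothetical stand-in for the arc-level methods; not the library class. */
    interface Automaton {
        int getRootNode();
        int getArc(int node, byte label);
        int getEndNode(int arc);
        boolean isArcFinal(int arc);
        boolean isArcTerminal(int arc);
    }

    /** Hand-built automaton accepting "a", "ab" and "c". Arc 0 and node 0 are unused. */
    static Automaton tiny() {
        final byte[] label = {0, 'a', 'c', 'b'};
        final boolean[] fin = {false, true, true, true};
        final int[] end = {0, 2, 0, 0};  // 0 marks a terminal arc
        final int[] next = {0, 2, 0, 0}; // next arc leaving the same node
        final int[] first = {0, 1, 3};   // first arc of nodes 1 and 2
        return new Automaton() {
            public int getRootNode() { return 1; }
            public int getArc(int node, byte l) {
                for (int arc = first[node]; arc != 0; arc = next[arc]) {
                    if (label[arc] == l) return arc;
                }
                return 0; // no outgoing arc with this label
            }
            public int getEndNode(int arc) {
                if (end[arc] == 0) throw new IllegalStateException("terminal arc");
                return end[arc];
            }
            public boolean isArcFinal(int arc) { return fin[arc]; }
            public boolean isArcTerminal(int arc) { return end[arc] == 0; }
        };
    }

    /** Follows one arc per input byte; accepts if the last arc is final. */
    static boolean accepts(Automaton fsa, byte[] seq) {
        int node = fsa.getRootNode();
        for (int i = 0; i < seq.length; i++) {
            int arc = fsa.getArc(node, seq[i]);
            if (arc == 0) return false;                  // label not found
            if (i == seq.length - 1) return fsa.isArcFinal(arc);
            if (fsa.isArcTerminal(arc)) return false;    // path ends before the input does
            node = fsa.getEndNode(arc);                  // safe: arc is not terminal
        }
        return false; // empty input is treated as not accepted in this sketch
    }

    public static void main(String[] args) {
        Automaton fsa = tiny();
        System.out.println(accepts(fsa, new byte[] {'a', 'b'})); // true
        System.out.println(accepts(fsa, new byte[] {'c', 'b'})); // false
    }
}
```

Checking `isArcTerminal` before `getEndNode` avoids the runtime exception described for terminal arcs.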
public abstract byte getArcLabel(int arc)
Returns the label associated with a given arc.
public abstract boolean isArcFinal(int arc)
Returns true if the destination node at the end of this arc corresponds to an input sequence created when building this automaton.
public abstract boolean isArcTerminal(int arc)
Returns true if this arc does not have a terminating node.
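As a rough illustration of how getFirstArc, getNextArc, isArcFinal and isArcTerminal fit together, the sketch below enumerates every stored sequence with a depth-first walk. The `Automaton` interface and the `tiny()` automaton are hypothetical stand-ins that follow the contracts described on this page (0 means "no arc"); labels are assumed to be single-byte ASCII so they can be shown as characters.

```java
import java.util.ArrayList;
import java.util.List;

public class ArcWalkSketch {
    /** Hypothetical stand-in for the arc-level methods; not the library class. */
    interface Automaton {
        int getRootNode();
        int getFirstArc(int node);
        int getNextArc(int node, int arc);
        int getEndNode(int arc);
        byte getArcLabel(int arc);
        boolean isArcFinal(int arc);
        boolean isArcTerminal(int arc);
    }

    /** Hand-built automaton accepting "a", "ab" and "c". Arc 0 and node 0 are unused. */
    static Automaton tiny() {
        final byte[] label = {0, 'a', 'c', 'b'};
        final boolean[] fin = {false, true, true, true};
        final int[] end = {0, 2, 0, 0};  // 0 marks a terminal arc
        final int[] next = {0, 2, 0, 0}; // next arc leaving the same node
        final int[] first = {0, 1, 3};   // first arc of nodes 1 and 2
        return new Automaton() {
            public int getRootNode() { return 1; }
            public int getFirstArc(int node) { return first[node]; }
            public int getNextArc(int node, int arc) { return next[arc]; }
            public int getEndNode(int arc) {
                if (end[arc] == 0) throw new IllegalStateException("terminal arc");
                return end[arc];
            }
            public byte getArcLabel(int arc) { return label[arc]; }
            public boolean isArcFinal(int arc) { return fin[arc]; }
            public boolean isArcTerminal(int arc) { return end[arc] == 0; }
        };
    }

    /** Depth-first walk: emit the prefix at final arcs, descend through non-terminal arcs. */
    static void walk(Automaton fsa, int node, StringBuilder prefix, List<String> out) {
        for (int arc = fsa.getFirstArc(node); arc != 0; arc = fsa.getNextArc(node, arc)) {
            prefix.append((char) fsa.getArcLabel(arc)); // assumes ASCII labels
            if (fsa.isArcFinal(arc)) out.add(prefix.toString());
            if (!fsa.isArcTerminal(arc)) walk(fsa, fsa.getEndNode(arc), prefix, out);
            prefix.setLength(prefix.length() - 1);      // backtrack
        }
    }

    static List<String> allSequences(Automaton fsa) {
        List<String> out = new ArrayList<String>();
        walk(fsa, fsa.getRootNode(), new StringBuilder(), out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(allSequences(tiny())); // [a, ab, c]
    }
}
```

An arc can be final without being terminal, which is how "a" and "ab" are both stored along the same path.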