com.redhat.et.silex.text

LogTokenizer

object LogTokenizer extends LogTokenizing


Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  5. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  6. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  7. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  8. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  9. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  10. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  11. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  12. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  13. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  14. def leadingPunctuation: Regex

    A regular expression describing punctuation to strip from the beginning of tokens; matches will be stripped by replacing them with their first match group. Override this definition to customize tokenizer behavior. Defaults to

    "(\\s)[^\\sA-Za-z0-9-_/]+|()^[^\\sA-Za-z0-9-_/]+"
    
    .

    Definition Classes
    LogTokenizing
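
As a rough illustration of "replacing matches with their first match group", the sketch below applies the documented default pattern with `scala.util.matching.Regex.replaceAllIn`. The object `LeadingPunctuationSketch` and the helper `stripLeading` are hypothetical names used only for this example; they are not part of silex, and the library's internal implementation may differ.

```scala
import scala.util.matching.Regex

object LeadingPunctuationSketch {
  // Default pattern copied from the documentation above.
  val leadingPunctuation: Regex =
    "(\\s)[^\\sA-Za-z0-9-_/]+|()^[^\\sA-Za-z0-9-_/]+".r

  // Hypothetical helper: replace every match with its first capture group,
  // i.e. the preserved whitespace character, or nothing at the very start
  // of the input (where group 1 does not participate in the match).
  def stripLeading(s: String): String =
    leadingPunctuation.replaceAllIn(s, "$1")
}
```

With this sketch, `LeadingPunctuationSketch.stripLeading("[error] (main) failed")` yields `"error] main) failed"`: only punctuation at the start of a token is removed, and the closing brackets are left for the trailing pattern.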
  15. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  16. final def notify(): Unit

    Definition Classes
    AnyRef
  17. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  18. def rejectedIntratokenPunctuation: Regex

    A regular expression describing punctuation to strip from within tokens; matches will be stripped by replacing them with the empty string. Override this definition to customize tokenizer behavior. Defaults to

    "[^A-Za-z0-9-_./:@]"
    
    if not overridden.

    Definition Classes
    LogTokenizing
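
The following sketch shows what replacing matches of the default pattern with the empty string does to a single token. `IntratokenSketch` and `stripWithin` are hypothetical names for illustration only, not silex API.

```scala
import scala.util.matching.Regex

object IntratokenSketch {
  // Default pattern copied from the documentation above: anything that is
  // not a letter, digit, or one of - _ . / : @ is rejected.
  val rejectedIntratokenPunctuation: Regex = "[^A-Za-z0-9-_./:@]".r

  // Hypothetical helper: delete every rejected character from a token.
  def stripWithin(token: String): String =
    rejectedIntratokenPunctuation.replaceAllIn(token, "")
}
```

Here `stripWithin("foo(bar)=baz!")` gives `"foobarbaz"`, while `stripWithin("user@host:/var/log")` is returned unchanged. Note that the default character class does not include whitespace, so applying the pattern to an unsplit message would also delete the spaces between tokens.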
  19. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  20. def toString(): String

    Definition Classes
    AnyRef → Any
  21. def tokens(msg: String, post: (String) ⇒ String = identity[String], pred: (String) ⇒ Boolean = str => true): Seq[String]

    Splits a log message into a sequence of tokens, by

    • collapsing runs of whitespace into single spaces,
    • stripping rejected intertoken punctuation (see leadingPunctuation and trailingPunctuation),
    • stripping rejected intratoken punctuation (see rejectedIntratokenPunctuation),
    • splitting on whitespace,
    • rejecting candidate tokens not containing at least one letter, and
    • applying optional user-supplied transformation and filtering functions.

    returns

    a sequence of tokens

    Definition Classes
    LogTokenizing
    See also

    Using word2vec on log messages
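
The steps listed above can be sketched as a standalone function. This is a minimal re-implementation for illustration, not the silex source: the object `TokenizeSketch` is hypothetical, the three patterns are the documented defaults, and two details are assumptions. First, the intratoken pattern is applied to each token after splitting, because the default pattern would otherwise remove the separating spaces. Second, `post` is applied before `pred`.

```scala
import scala.util.matching.Regex

object TokenizeSketch {
  // Documented default patterns.
  val leading: Regex    = "(\\s)[^\\sA-Za-z0-9-_/]+|()^[^\\sA-Za-z0-9-_/]+".r
  val trailing: Regex   = "[^\\sA-Za-z0-9-_/]+(\\s)|()[^\\sA-Za-z0-9-_/]+$".r
  val intratoken: Regex = "[^A-Za-z0-9-_./:@]".r
  // Assumption: "at least one letter" means an ASCII letter.
  val hasLetter: Regex  = "[A-Za-z]".r

  def tokens(msg: String,
             post: String => String = identity[String],
             pred: String => Boolean = _ => true): Seq[String] = {
    // 1. Collapse runs of whitespace into single spaces.
    val collapsed = msg.replaceAll("\\s+", " ")
    // 2. Strip punctuation at token boundaries, keeping the whitespace group.
    val stripped = trailing.replaceAllIn(leading.replaceAllIn(collapsed, "$1"), "$1")
    stripped.split(" ").toSeq                          // 3. split on whitespace
      .map(t => intratoken.replaceAllIn(t, ""))        // 4. strip within tokens
      .filter(t => hasLetter.findFirstIn(t).isDefined) // 5. require a letter
      .map(post)                                       // 6. user transformation
      .filter(pred)                                    //    and user filter
  }
}
```

For the input `"Jan  5 12:00:01 host sshd[42]: Failed   password for root!"` this sketch returns `Seq("Jan", "host", "sshd42", "Failed", "password", "for", "root")`. Passing `_.toLowerCase` as `post` and `_.length > 3` as `pred` narrows that to `Seq("host", "sshd42", "failed", "password", "root")`.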

  22. def trailingPunctuation: Regex

    A regular expression describing punctuation to strip from the end of tokens; matches will be stripped by replacing them with their first match group. Override this definition to customize tokenizer behavior. Defaults to

    "[^\\sA-Za-z0-9-_/]+(\\s)|()[^\\sA-Za-z0-9-_/]+$"
    
    if not overridden.

    Definition Classes
    LogTokenizing
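
A small sketch of the trailing case, again using `replaceAllIn` with the first match group as the replacement. `TrailingPunctuationSketch` and `stripTrailing` are hypothetical names, not silex API, and the library may apply the pattern differently.

```scala
import scala.util.matching.Regex

object TrailingPunctuationSketch {
  // Default pattern copied from the documentation above.
  val trailingPunctuation: Regex =
    "[^\\sA-Za-z0-9-_/]+(\\s)|()[^\\sA-Za-z0-9-_/]+$".r

  // Hypothetical helper: punctuation before whitespace is replaced by that
  // whitespace; punctuation at the very end of the input is removed.
  def stripTrailing(s: String): String =
    trailingPunctuation.replaceAllIn(s, "$1")
}
```

For example, `stripTrailing("error] main) failed.")` gives `"error main failed"`, and `stripTrailing("host.example.com: up")` gives `"host.example.com up"`: the dots inside the host name survive because they are not followed by whitespace or the end of the input.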
  23. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  24. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  25. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )

Inherited from LogTokenizing

Inherited from AnyRef

Inherited from Any
