KMedoids

Instance Constructors

new KMedoids(metric: (T, T) ⇒ Double, k: Int, maxIterations: Int, epsilon: Double, fractionEpsilon: Double, sampleSize: Int, numThreads: Int, seed: Long)

metric
The distance metric imposed on data elements.
k
The number of clusters to use. If k is zero, the clustering attempts to identify a number of clusters that is "good" with respect to Minimum Description Length.
maxIterations
The maximum number of model refinement iterations to run.
epsilon
The absolute threshold on cost improvement. Must be >= 0. If c0 is the cost of the previous clustering model and c1 is the cost of the current model, refinement halts when (c0 - c1) <= epsilon (lower cost is better).
fractionEpsilon
The fractional threshold on cost improvement. Must be >= 0. If c0 is the cost of the previous clustering model and c1 is the cost of the current model, refinement halts when (c0 - c1) / c0 <= fractionEpsilon (lower cost is better).
sampleSize
The target size of the random sample taken from the input data for clustering. Must be > 0.
numThreads
The number of threads to use while clustering.
seed
The random seed to use for the RNG. Training runs that start from the same seed produce the same result. By default, training runs vary randomly.
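
The two halting rules described for epsilon and fractionEpsilon can be sketched as a small standalone check. This is an illustration only: HaltingSketch and shouldHalt are hypothetical names, not part of the KMedoids API, and combining the two rules with a logical OR, as well as the handling of a zero previous cost, are assumptions that this page does not confirm.

```scala
// Hypothetical helper, not part of the KMedoids API.
object HaltingSketch {
  // c0: cost of the previous model, c1: cost of the current model.
  def shouldHalt(c0: Double, c1: Double, epsilon: Double, fractionEpsilon: Double): Boolean = {
    require(epsilon >= 0.0, "epsilon must be >= 0")
    require(fractionEpsilon >= 0.0, "fractionEpsilon must be >= 0")
    val improvement = c0 - c1
    // Absolute rule: (c0 - c1) <= epsilon
    val absoluteHalt = improvement <= epsilon
    // Fractional rule: (c0 - c1) / c0 <= fractionEpsilon.
    // Assumption: a previous cost of zero cannot improve, so halt.
    val fractionalHalt = if (c0 == 0.0) true else improvement / c0 <= fractionEpsilon
    // Assumption: either rule is enough to stop refinement.
    absoluteHalt || fractionalHalt
  }
}
```

For example, with c0 = 100.0 and c1 = 99.0 the absolute improvement is 1.0 and the fractional improvement is 0.01, so a fractionEpsilon of 0.02 halts refinement even when epsilon is 0.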

Value Members

final def !=(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def ##(): Int

Definition Classes
AnyRef → Any
final def ==(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def asInstanceOf[T0]: T0

Definition Classes
Any
def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( ... )
val epsilon: Double

The epsilon threshold to use. Must be >= 0. If c0 is the cost of the previous clustering model and c1 is the cost of the current model, refinement halts when (c0 - c1) <= epsilon (lower cost is better).
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
val fractionEpsilon: Double

The fractionEpsilon threshold to use. Must be >= 0. If c0 is the cost of the previous clustering model and c1 is the cost of the current model, refinement halts when (c0 - c1) / c0 <= fractionEpsilon (lower cost is better).
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
val k: Int

The number of clusters to use. If k is zero, the clustering attempts to identify a number of clusters that is "good" with respect to Minimum Description Length.
def logDebug(msg: ⇒ String): Unit

Attributes
protected
Definition Classes
Logging
def logError(msg: ⇒ String): Unit

Attributes
protected
Definition Classes
Logging
def logInfo(msg: ⇒ String): Unit

Attributes
protected
Definition Classes
Logging
def logWarning(msg: ⇒ String): Unit

Attributes
protected
Definition Classes
Logging
def logger: Logger

Definition Classes
Logging
val maxIterations: Int

The maximum number of model refinement iterations to run.
val metric: (T, T) ⇒ Double

The distance metric imposed on data elements.
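
Any function of type (T, T) => Double can serve as the metric. As a minimal sketch, two candidate metrics are shown here; MetricSketch, absDiff and euclidean are hypothetical names chosen for illustration and do not come from the library.

```scala
// Hypothetical examples of functions usable as a (T, T) => Double metric.
object MetricSketch {
  // Absolute difference, for T = Double.
  val absDiff: (Double, Double) => Double =
    (a, b) => math.abs(a - b)

  // Euclidean distance, for T = Vector[Double] with equal-length vectors.
  val euclidean: (Vector[Double], Vector[Double]) => Double = (a, b) => {
    require(a.length == b.length, "vectors must have the same length")
    math.sqrt(a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum)
  }
}
```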
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
final def notifyAll(): Unit

Definition Classes
AnyRef
val numThreads: Int

The number of threads to use while clustering.
def run(data: Seq[T]): KMedoidsModel[T]

Perform a K-Medoids clustering model training run on input data held in a local Seq.
data
The input data to train the clustering model on.
returns
A KMedoidsModel object representing the clustering model.
def run(data: RDD[T]): KMedoidsModel[T]

Perform a K-Medoids clustering model training run on input data held in an RDD.
data
The input data to train the clustering model on.
returns
A KMedoidsModel object representing the clustering model.
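
This page refers to a clustering model cost without defining it. A common choice for k-medoids is the sum, over all data elements, of the distance to the nearest medoid. The sketch below assumes that definition; CostSketch, nearest and cost are hypothetical names, and the cost actually used by run may differ.

```scala
// Hypothetical sketch of a k-medoids cost; not taken from the library source.
object CostSketch {
  // Index of the medoid closest to x under the given metric.
  def nearest[T](x: T, medoids: Seq[T], metric: (T, T) => Double): Int = {
    require(medoids.nonEmpty, "at least one medoid is required")
    medoids.indices.minBy(j => metric(x, medoids(j)))
  }

  // Assumed cost: sum of each element's distance to its nearest medoid.
  def cost[T](data: Seq[T], medoids: Seq[T], metric: (T, T) => Double): Double = {
    require(medoids.nonEmpty, "at least one medoid is required")
    data.map(x => medoids.map(m => metric(x, m)).min).sum
  }
}
```

Under this definition, a refinement step that replaces a medoid is an improvement exactly when it lowers the value returned by cost.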
val sampleSize: Int

The target size of the random sample taken from the input data for clustering. Must be > 0.
val seed: Long

The random seed to use for the RNG. Training runs that start from the same seed produce the same result. By default, training runs vary randomly.
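
The interaction between sampleSize and seed can be illustrated with a seeded random sample. This is a sketch under stated assumptions: SampleSketch and sample are hypothetical names, and sampling without replacement via a shuffle is an assumption, not a description of how the library draws its sample.

```scala
import scala.util.Random

// Hypothetical sketch of seeded sampling; not the library's implementation.
object SampleSketch {
  // Draw up to sampleSize elements without replacement using a seeded RNG.
  def sample[T](data: Seq[T], sampleSize: Int, seed: Long): Seq[T] = {
    require(sampleSize > 0, "sampleSize must be > 0")
    // The same seed yields the same shuffle, and so the same sample.
    new Random(seed).shuffle(data).take(sampleSize)
  }
}
```

Calling sample twice with the same data and seed returns the same elements in the same order, which is the property that makes seeded training runs repeatable.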
def setEpsilon(epsilon_: Double): KMedoids[T]

Set the epsilon halting threshold for clustering cost improvement between refinements.
If c0 is the cost of the previous clustering model and c1 is the cost of the current model, refinement halts when (c0 - c1) <= epsilon (lower cost is better).
epsilon_
The epsilon threshold to use. Must be >= 0.
returns
Copy of this instance, with updated value of epsilon
def setFractionEpsilon(fractionEpsilon_: Double): KMedoids[T]

Set the fractionEpsilon halting threshold for clustering cost improvement between refinements.
If c0 is the cost of the previous clustering model and c1 is the cost of the current model, refinement halts when (c0 - c1) / c0 <= fractionEpsilon (lower cost is better).
fractionEpsilon_
The fractionEpsilon threshold to use. Must be >= 0.
returns
Copy of this instance, with updated fractionEpsilon setting
def setK(k_: Int): KMedoids[T]

Set the number of clusters to train.
k_
The number of clusters. Must be >= 0. If k is zero, the clustering attempts to identify a number of clusters that is "good" with respect to Minimum Description Length.
returns
Copy of this instance with new value for k
def setMaxIterations(maxIterations_: Int): KMedoids[T]

Set the maximum number of iterations to allow before halting cluster refinement.
maxIterations_
The maximum number of refinement iterations. Must be > 0.
returns
Copy of this instance, with updated value for maxIterations
def setMetric(metric_: (T, T) ⇒ Double): KMedoids[T]

Set the distance metric to use over data elements.
metric_
The distance metric
returns
Copy of this instance with new metric
def setNumThreads(numThreads_: Int): KMedoids[T]

Set the number of threads to use for clustering runs.
numThreads_
The number of threads to use while clustering. Must be > 0.
returns
Copy of this instance with updated value of numThreads
def setSampleSize(sampleSize_: Int): KMedoids[T]

Set the size of the random sample to take from the input data for clustering.
sampleSize_
The target size of the random sample. Must be > 0.
returns
Copy of this instance, with updated value of sampleSize
def setSeed(seed_: Long): KMedoids[T]

Set the random number generation (RNG) seed.
Training runs that start from the same seed produce the same result. By default, training runs vary randomly.
seed_
The random seed to use for RNG
returns
Copy of this instance, with updated random seed
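
Each setter above returns a copy of the instance rather than modifying it, so calls can be chained while the original stays unchanged. The sketch below shows that pattern on a stand-in case class; Settings, newK and newMaxIterations are hypothetical names, only two of the eight parameters are modeled, and implementing the setters with the case-class copy method is an assumption about the real class.

```scala
// Hypothetical stand-in for KMedoids, modeling only two parameters.
case class Settings(k: Int, maxIterations: Int) {
  // Returns a copy with a new k; the receiver is left unchanged.
  def setK(newK: Int): Settings = {
    require(newK >= 0, "k must be >= 0")
    copy(k = newK)
  }

  // Returns a copy with a new maxIterations; the receiver is left unchanged.
  def setMaxIterations(newMaxIterations: Int): Settings = {
    require(newMaxIterations > 0, "maxIterations must be > 0")
    copy(maxIterations = newMaxIterations)
  }
}
```

Because every call yields a new value, a base configuration can be shared safely and specialized per training run, for example base.setK(5).setMaxIterations(25).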
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )

Related Docs: object KMedoids | package cluster

case class KMedoids[T](metric: (T, T) ⇒ Double, k: Int, maxIterations: Int, epsilon: Double, fractionEpsilon: Double, sampleSize: Int, numThreads: Int, seed: Long) extends Serializable with Logging with Product

Inherited from Product

Inherited from Equals

Inherited from Logging

Inherited from Serializable

Inherited from AnyRef

Inherited from Any
