The RBB ML package integrates the RBB with the Sandia Cognitive Foundry machine learning library, to make predictions from data in an RBB using algorithms such as nearest neighbor, decision tree induction, support vector machines, and linear regression. This functionality goes beyond the Blackboard concept, and is not necessary for basic RBB use.
The software design of RBB ML differs from the base RBB package. RBB ML applications must be written in Java (or using Java bindings from another language). Some basic RBB ML applications may be developed using the RBB ML command line utilities, but there is no full SQL interface.
The information in an RBB ML application is divided into four categories, which may be stored in different RBBs. These are:
- Session: The record of "what happened", that is, the source data being modeled
- Model: Training examples and model parameters, gleaned from (potentially) multiple Sessions
- Predictions: Stored Problem Sets (prediction parameters) and prediction results (such as generated Flags or synthesized Entities) for any number of Session/Model combinations
- Coordination: The user-selected Session Time and selected Data (to be added as training data, for example)
There is a corresponding Java class for each of these in the package gov.sandia.rbb.ml.parts, but normally access to them is coordinated through the MLMain class.
For simple applications all of the information can be stored in a single RBB. All the RBB ML commands support this using a single -rbb parameter, for example:
java -jar rbb.jar ml predict -rbb jdbc:h2:SoccerGame
However, depending on the application it may be necessary to store the information in separate RBBs, for example:
- SessionRBB: The Session might be opened read-only, either so the original record of the source data cannot be altered, or because it is in a .zip file to save space.
- ModelRBB: A Model might be created, validated, and version-controlled by a subject matter expert, and applied by multiple end-users to multiple Sessions. Note: a ModelRBB can only hold one Model, so separate ModelRBBs should be used if multiple Models will be applied to one or more Sessions.
- PredictionsRBB: The Session is read-only, or different users are independently applying the same Models to the same Sessions, and don't want or need to share their results.
- CoordinationRBB: The Session is read-only, or different users are independently accessing the Session (they don't want linked Time or Data Selection)
java -jar rbb.jar ml predict -sessionRBB jdbc:h2:SoccerGame -modelRBB jdbc:h2:GoalScored -predictionsRBB jdbc:h2:GoalsByHomeTeam -coordinationRBB jdbc:h2:MySession
All the RBB ML commands allow all of the -xxxRBB options, even though some of them are not required for some commands. This allows the same RBB parameters to be used in each application for all commands:
RBB="-sessionRBB jdbc:h2:SoccerGame -modelRBB jdbc:h2:GoalScored -predictionsRBB jdbc:h2:GoalsByHomeTeam -coordinationRBB jdbc:h2:MySession"
java -jar rbb.jar ml print $RBB # the -sessionRBB parameter is not required, and silently ignored.
java -jar rbb.jar ml predict $RBB
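Since RBB ML applications are written in Java, the same reuse of one option set can be sketched there as well. The following is only an illustration: the class RbbCommandSketch and its command method are hypothetical names, not part of RBB ML, and the sketch merely assembles the command lines shown above as argument lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of RBB ML): builds ml command lines from one shared option list. */
public class RbbCommandSketch {
    // The shared RBB options, copied from the shell example above.
    static final List<String> RBB_OPTIONS = Arrays.asList(
            "-sessionRBB", "jdbc:h2:SoccerGame",
            "-modelRBB", "jdbc:h2:GoalScored",
            "-predictionsRBB", "jdbc:h2:GoalsByHomeTeam",
            "-coordinationRBB", "jdbc:h2:MySession");

    /** Returns the full argument list for one ml subcommand, e.g. "print" or "predict". */
    public static List<String> command(String subcommand) {
        List<String> cmd = new ArrayList<>(
                Arrays.asList("java", "-jar", "rbb.jar", "ml", subcommand));
        cmd.addAll(RBB_OPTIONS); // every subcommand receives the same options
        return cmd;
    }

    public static void main(String[] args) {
        for (String sub : new String[] {"print", "predict"}) {
            System.out.println(String.join(" ", command(sub)));
            // To actually launch it (assumes rbb.jar is in the working directory):
            // new ProcessBuilder(command(sub)).inheritIO().start().waitFor();
        }
    }
}
```

Keeping the options as a list of separate arguments avoids relying on shell word splitting of a single string.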
Perhaps more commonly, an application will require some data to be separate while putting the rest together in a single RBB. The -rbb option sets the default RBB, which is overridden by the individual RBB options:
java -jar rbb.jar ml predict -sessionRBB jdbc:h2:SoccerGame -rbb Analysis
Here the soccer game itself is separate (perhaps it is read-only) while the model, its parameters, and the results it produces will be contained in Analysis. The order of the -rbb and -xxxRBB options does not matter; the -rbb option is overridden by any of the others, whether they appear before or after it.
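The override rule can be illustrated with a small sketch. This is not the actual RBB ML option parser; the class RbbOptionSketch and its resolve method are hypothetical, and only the option names are taken from this section. It collects all options first and applies the -rbb default afterwards, which is one simple way to make the result independent of option order.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of the -rbb / -xxxRBB override rule; not the real RBB ML parser. */
public class RbbOptionSketch {
    // The four per-category options named in this section.
    static final String[] ROLES = {"sessionRBB", "modelRBB", "predictionsRBB", "coordinationRBB"};

    /** Maps each category to its RBB; an explicit -xxxRBB wins over -rbb regardless of order. */
    public static Map<String, String> resolve(String... args) {
        String defaultRbb = null;
        Map<String, String> explicit = new LinkedHashMap<>();
        // First pass: record every option and its value, ignoring position.
        for (int i = 0; i + 1 < args.length; i++) {
            if (args[i].equals("-rbb")) {
                defaultRbb = args[++i];
            } else {
                for (String role : ROLES) {
                    if (args[i].equals("-" + role)) {
                        explicit.put(role, args[++i]);
                        break;
                    }
                }
            }
        }
        // Second pass: fall back to the -rbb default only where no explicit option was given.
        Map<String, String> resolved = new LinkedHashMap<>();
        for (String role : ROLES) {
            resolved.put(role, explicit.getOrDefault(role, defaultRbb));
        }
        return resolved;
    }

    public static void main(String[] args) {
        // Same options in both orders; the resolved mapping is identical.
        System.out.println(resolve("-sessionRBB", "jdbc:h2:SoccerGame", "-rbb", "Analysis"));
        System.out.println(resolve("-rbb", "Analysis", "-sessionRBB", "jdbc:h2:SoccerGame"));
    }
}
```

In both calls the session resolves to jdbc:h2:SoccerGame and the other three categories resolve to Analysis.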