Architecture
This tutorial aims to describe the main principles behind the design of our domestic robotics architecture as well as its intended use.
Our domestic architecture is based on the concepts of skills and behaviours that, when combined, allow a robot to perform complex tasks.
We distinguish between skills and behaviours as follows:
skills are "primitive" functionalities that a robot can perform (and are thus equivalent to actions in the classical planning sense)
behaviours change the context in which actions are performed and thus usually reparameterise skills in some manner
It is clearly difficult - if not impossible - to come up with a generic level of granularity for skills and behaviours. We have attempted to follow what is generally done in the robotics literature, such that we have implementations of skills such as move_base, turn_base, pickup, place, and perceive_plane; on the other hand, we have behaviours such as pick_closest_from_surface and place_based_on_category, although we also have generic behaviours that exactly match the skill names (e.g. move_base or place), which have a default set of assumptions about the actions and their context.
For historical reasons, our domestic architecture is ROS-based at its core. Since both robots that we are currently working with - the Care-O-bot 3 and the Toyota HSR - rely heavily on ROS, having a ROS-based architecture simplifies the integration of our functionalities with the already existing robot components (for instance sensor drivers and navigation components); in addition, ROS makes it possible to develop different components in different programming languages, which is convenient since it allows our developers to use the language they feel most comfortable with for a given purpose.
While ROS-based architectures have various advantages, relying completely on ROS makes it difficult to apply certain software engineering best practices and often introduces unnecessary overhead in how components are used. This has sometimes led to difficult-to-maintain components in our code base.
So, while we do rely on ROS as a backbone, our architecture also makes use of various home-made ROS-independent components that are meant to eliminate some of the disadvantages mentioned above. One caveat of this is that most of our domestic software is now written in Python (since we currently only support ROS Kinetic, we still live in the past and use Python 2).
Given all the different aspects that need to be taken into account to make robots useful (perception, navigation, manipulation, reasoning), robot software tends to get quite complex in practice, and managing all the different components can be fairly complicated. In order to simplify this management process, our architecture makes use of a multi-repository setup, with the major components hosted in separate repositories rather than in a single monolithic repository.
A multi-repository setup keeps different functionalities separate and allows developers to focus on different aspects without interfering with each other (too much). The downside is that getting a development setup up and running becomes more difficult, since the repository dependencies need to be clearly specified. To alleviate this problem, we make extensive use of advanced tooling; in particular, we:
specify the dependencies of a repository in rosinstall files and use wstool for acquiring them automatically
make use of continuous integration (in particular Travis)
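As an illustration, a rosinstall file is simply a YAML list of repository entries that wstool can clone and update in one go; the entries and branch names below are illustrative placeholders, not a complete dependency list:

```yaml
# dependencies.rosinstall - each entry tells wstool where to clone a
# repository from and which branch to check out (illustrative entries)
- git:
    local-name: mas_domestic_robotics
    uri: https://github.com/b-it-bots/mas_domestic_robotics.git
    version: kinetic
- git:
    local-name: mas_knowledge_base
    uri: https://github.com/b-it-bots/mas_knowledge_base.git
    version: master
```

A workspace is then set up with `wstool init src dependencies.rosinstall`, and kept up to date with `wstool update`.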
Our primary repository, which contains the core functionalities, is mas_domestic_robotics; that is where we have all our skills and behaviours and all the "glue" code that combines everything together. This repository is, however, at the center of a multitude of repositories that we use. A diagram illustrating the interaction between our various repositories is shown below.
A brief description of these is given below:
A set of generic components shared by the b-it-bots @Home and @Work teams.
Various libraries dealing with perception-related aspects (mostly vision), such as object detection and recognition, point cloud processing, and image segmentation.
A repository used for configuring navigation-related components (although we don't use these very often).
mas_knowledge_base: Hosts interfaces to an ontology and an online knowledge base in the form of predicates and fluents that describe the world.
mas_execution_manager: A library for easily configuring state machines (which we use for specifying scenario configurations); the library also contains a base class interface for implementing actions in a fault-tolerant manner.
An action execution library that aims to centralise the process of making decisions at execution time and includes: an ontology of actions and action failures, as well as implementations of models for executing actions (e.g. for the place action, one execution model samples poses on a given surface regardless of whether they are in free space and returns a candidate placing pose on the surface, while another one samples poses in free space only; for the pickup action, an execution model samples candidate grasping poses and returns the closest feasible grasping pose).
A library exposing an interface to a state machine that can be used for implementing components in a fault-tolerant manner.
Unfortunately, robots often get outdated and it sometimes becomes necessary to replace an existing platform with a newer one. To make the accommodation of new platforms as painless as possible, our architecture distinguishes between robot-independent and robot-dependent components:
Robot-independent components form the core of our architecture and are implemented in a way that - hopefully - makes as few robot-dependent assumptions as possible, which should simplify the process of interfacing with new robots when the need arises
Robot-dependent components either inherit from the robot-independent components or simply make use of them by reparameterising them for a specific robot platform.
To give an example, our robot-independent code base includes implementations of various robot actions, such as pick, place, move_base, and so forth; these are parameterised as much as possible (for example, ROS topic names that are likely to differ between robots are made parameters) so that using them on a different robot is mostly a matter of setting the correct values for those parameters (although this is how everything should work in theory, in practice we often find that implementations do have various robot-dependent assumptions, which requires rethinking the components).
Our system embeds skills in a planning-oriented architecture built around ROSPlan (even though we don't always plan as such), where each skill is implemented as a separate action. The implementation of actions is split into two components - an action server and an action client - where the server is the one that takes care of the actual execution (e.g. the place server performs the necessary robot motions for placing an object), while the client is the one that waits for requests (coming from a plan/task dispatcher). In order to maintain an updated state of the world, actions perform knowledge base updates after execution.
In order to be as flexible and as intelligent as possible, we develop our robots so that they behave in a "closed-loop" manner, using both encyclopedic and online knowledge about the world. The functionalities for knowledge management are centralised in the mas_knowledge_base repository, which exposes interfaces for querying the knowledge base and updating its state. This repository also contains an interface for interacting with OWL ontologies, since we use an OWL ontology (also included there) for encoding encyclopedic knowledge about the world.
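To make the query/update pattern concrete, here is a minimal sketch; the class and method names are hypothetical and only illustrate the kind of interface mas_knowledge_base exposes, not its actual API:

```python
# A minimal knowledge base sketch: the world state is a set of predicates
# that actions can query and update (all names here are hypothetical)
class DomesticKBInterface(object):
    def __init__(self):
        self.predicates = set()  # e.g. ('robot_at', 'table')

    def assert_predicate(self, name, *args):
        self.predicates.add((name,) + args)

    def retract_predicate(self, name, *args):
        self.predicates.discard((name,) + args)

    def predicate_holds(self, name, *args):
        return ((name,) + args) in self.predicates


# Example: a move_base action updates the robot's location after execution
kb = DomesticKBInterface()
kb.assert_predicate('robot_at', 'charging_station')
kb.retract_predicate('robot_at', 'charging_station')
kb.assert_predicate('robot_at', 'table')
print(kb.predicate_holds('robot_at', 'table'))  # True
```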
A diagram illustrating the interactions between components that occur during execution is shown below.
To simplify the implementation of robot-dependent functionalities, the mas_domestic_robotics repository defines various robot-independent interfaces that can then be implemented for a specific robot. For instance, the mdr_sound_vocalisation package defines a base class for implementing functionalities related to making robot sounds.
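Since the original listing is not reproduced here, the following is a minimal sketch of what such a base class looks like; the method names say and make_sound are assumptions for illustration, not necessarily the exact mdr_sound_vocalisation API:

```python
# Robot-independent base class for sound vocalisation (a sketch; the
# method names are assumed, not the exact mdr_sound_vocalisation API)
class SoundVocaliserBase(object):
    def say(self, sentence):
        """Vocalises the given sentence on the robot."""
        raise NotImplementedError()

    def make_sound(self, sound_name):
        """Plays a predefined sound on the robot."""
        raise NotImplementedError()


# A robot-specific component then implements the interface, e.g.
class HSRSoundVocaliser(SoundVocaliserBase):
    def say(self, sentence):
        print('Saying: {0}'.format(sentence))

    def make_sound(self, sound_name):
        print('Playing sound: {0}'.format(sound_name))
```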
A similar interface is provided by the mdr_gripper_controller package, which defines an interface to the end-effector of a robot manipulator.
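Again as a sketch, with assumed method names rather than the exact interface:

```python
# Robot-independent interface to a manipulator's end-effector (a sketch;
# the method names are assumptions for illustration)
class GripperControllerBase(object):
    def open(self):
        """Opens the gripper."""
        raise NotImplementedError()

    def close(self):
        """Closes the gripper."""
        raise NotImplementedError()

    def grasping_successful(self):
        """Returns True if an object is held after closing the gripper."""
        raise NotImplementedError()
```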
Another good example is an interface for implementing action clients (namely components that wait for action requests, execute an action, and update the knowledge base after the execution), which is implemented in the mdr_rosplan_interface package.
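A condensed sketch of the pattern behind this interface (the constructor signature and method names are assumptions, not the exact mdr_rosplan_interface class):

```python
# Base class for action clients: wait for an action request, call the
# corresponding action server, and update the knowledge base afterwards
class ActionClientBase(object):
    def __init__(self, action_name, kb_interface):
        self.action_name = action_name
        self.kb_interface = kb_interface

    def call_action(self, request):
        """Invoked when a plan/task dispatcher requests this action."""
        raise NotImplementedError()

    def update_knowledge_base(self, *args):
        """Applies the action's effects to the knowledge base."""
        raise NotImplementedError()
```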
The above interfaces illustrate a general pattern that we want to follow in our architecture, namely creating abstractions of functionalities that can then be extended for a particular purpose. This is simply good software engineering practice that we try to adhere to.
Robots are unfortunately rather error-prone, so this aspect has to be taken into account in the development of robot functionalities. As mentioned above, the mas_execution_manager repository includes an interface for implementing actions in a fault-tolerant manner.
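The listing below is a condensed sketch of that interface; the state names match the ones referenced in this tutorial (init, running, recovering), but the details of the actual ActionSMBase class in mas_execution_manager differ:

```python
# A sketch of the fault-tolerant action interface: a small state machine
# whose states can be overridden by concrete actions
class FTSMStates(object):
    INIT = 'init'
    RUNNING = 'running'
    RECOVERING = 'recovering'
    DONE = 'done'
    FAILED = 'failed'


class ActionSMBase(object):
    def __init__(self, name, max_recovery_attempts=1):
        self.name = name
        self.max_recovery_attempts = max_recovery_attempts
        self.state = FTSMStates.INIT

    def run(self):
        """Steps through the state machine until the action succeeds or fails."""
        recovery_attempts = 0
        while self.state not in (FTSMStates.DONE, FTSMStates.FAILED):
            if self.state == FTSMStates.INIT:
                self.state = self.init()
            elif self.state == FTSMStates.RUNNING:
                self.state = self.running()
            elif self.state == FTSMStates.RECOVERING:
                if recovery_attempts == self.max_recovery_attempts:
                    self.state = FTSMStates.FAILED
                    break
                recovery_attempts += 1
                self.state = self.recovering()
        return self.state

    def init(self):
        """Initialises the action (e.g. waits for its dependencies)."""
        return FTSMStates.RUNNING

    def running(self):
        """Performs the actual action; must be overridden."""
        raise NotImplementedError()

    def recovering(self):
        """Specifies what happens if the action fails."""
        return FTSMStates.RUNNING
```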
This interface defines a so-called fault-tolerant state machine (FTSM), which is illustrated in the diagram below:
This component is a fairly recent addition to our set of libraries, and almost all of our actions are now implemented by reusing this abstraction. There is a clear connection between the fault-tolerant implementation of actions and the ontology in the action-execution library, but the integration of the two is ongoing work.
Implementing skills in our domestic code base is always a three-step process:
1. The action needs to be implemented as a component inheriting from the ActionSMBase class described above. Not all states of the state machine have to be overridden, but it usually makes sense to override init for initialising the action (e.g. by waiting for its dependencies to become available), running for actually performing the action, and recovering for specifying what needs to happen if the action fails.
2. An actionlib server for exposing the action needs to be defined.
3. An action client inheriting from the ActionClientBase shown above needs to be exposed. The action client calls the action through the above actionlib server and updates the knowledge base after the execution is complete.
This process is exemplified below for the move_base action.
Our move_base action assumes that a robot with a manipulator is used, such that the manipulator is brought to a safe position before the robot starts moving. In the init method of the state machine, we thus wait for the server of the move_arm action to become available.
The running method of the state machine is where the logic behind executing the action is implemented. In our case, we can move either to a named target (named targets are expected to be defined in a YAML file) or to an explicitly specified pose. Either way, the arm is moved to a safe configuration before the robot starts moving.
The implementation of the action FTSM is sketched below for convenience.
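This sketch builds on the ActionSMBase sketch above; the move_arm_server and move_base server names, as well as the mdr_move_arm_action import, are assumptions for illustration:

```python
import rospy
import actionlib
from move_base_msgs.msg import MoveBaseAction, MoveBaseGoal
from mdr_move_arm_action.msg import MoveArmAction  # import path assumed


class MoveBaseSM(ActionSMBase):
    def __init__(self, destination_pose=None):
        ActionSMBase.__init__(self, 'move_base')
        self.destination_pose = destination_pose
        self.move_arm_client = None
        self.move_base_client = None

    def init(self):
        # wait for the move_arm server so that the manipulator can be
        # brought to a safe configuration before the base starts moving
        self.move_arm_client = actionlib.SimpleActionClient('move_arm_server',
                                                            MoveArmAction)
        if not self.move_arm_client.wait_for_server(rospy.Duration(10.)):
            return FTSMStates.FAILED

        self.move_base_client = actionlib.SimpleActionClient('move_base',
                                                             MoveBaseAction)
        if not self.move_base_client.wait_for_server(rospy.Duration(10.)):
            return FTSMStates.FAILED
        return FTSMStates.RUNNING

    def running(self):
        # the arm is first moved to a safe configuration (call omitted here)
        # and the navigation goal is then sent to the base
        goal = MoveBaseGoal()
        goal.target_pose = self.destination_pose
        self.move_base_client.send_goal(goal)
        if self.move_base_client.wait_for_result(rospy.Duration(120.)):
            return FTSMStates.DONE
        return FTSMStates.RECOVERING

    def recovering(self):
        # a simple recovery behaviour: cancel the active goal and retry
        self.move_base_client.cancel_all_goals()
        return FTSMStates.RUNNING
```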
The move_base action server, which is shown below, simply reads the necessary parameters for the action and then waits for action requests. This action server implementation is in fact a template that all of our action servers follow.
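A condensed sketch of the server, building on the MoveBaseSM sketch above; the move_base_server name and the reuse of move_base_msgs for the action specification are assumptions (mdr defines its own action specification):

```python
import rospy
import actionlib
from move_base_msgs.msg import MoveBaseAction, MoveBaseResult


class MoveBaseServer(object):
    def __init__(self):
        # parameters that may differ between robots are read from the
        # parameter server instead of being hard-coded
        self.safe_arm_config = rospy.get_param('~safe_arm_joint_config',
                                               'folded')
        self.server = actionlib.SimpleActionServer('move_base_server',
                                                   MoveBaseAction,
                                                   self.execute_cb, False)
        self.server.start()

    def execute_cb(self, goal):
        # delegate the actual execution to the fault-tolerant state machine
        sm = MoveBaseSM(destination_pose=goal.target_pose)
        if sm.run() == FTSMStates.DONE:
            self.server.set_succeeded(MoveBaseResult())
        else:
            self.server.set_aborted()


if __name__ == '__main__':
    rospy.init_node('move_base_server')
    MoveBaseServer()
    rospy.spin()
```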
Finally, the move_base action client simply waits for action requests from a plan/task dispatcher, calls the action server, and, if the action completes successfully, updates the knowledge base with the fact that the robot is now at its destination location rather than at its original location. The implementation of the move_base action client is given below.
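A sketch of the client, building on the ActionClientBase and knowledge base sketches above; the request format is an assumption:

```python
# move_base action client: waits for dispatch requests, calls the action
# server, and updates the knowledge base on success (a sketch)
class MoveBaseClient(ActionClientBase):
    def __init__(self, kb_interface):
        ActionClientBase.__init__(self, 'move_base', kb_interface)

    def call_action(self, request):
        # the request is assumed to carry the original and goal locations
        original = request['from']
        destination = request['to']

        # ... call the move_base action server here and wait for the result ...
        action_succeeded = True  # placeholder for the actual server result

        if action_succeeded:
            self.update_knowledge_base(original, destination)

    def update_knowledge_base(self, original, destination):
        # the robot is now at its destination rather than its original location
        self.kb_interface.retract_predicate('robot_at', original)
        self.kb_interface.assert_predicate('robot_at', destination)
```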
One interesting aspect of our architecture is the creation of a complex scenario that a robot can execute. Most of the time, we define scenarios using state machines, and we use the components in mas_execution_manager for defining and executing them. A full description of mas_execution_manager is beyond the scope of this tutorial; please consult its documentation for that.
To illustrate how we can define a state machine for a scenario, let's consider a simple perpetual pick-and-place functionality (this is one of our lab demos). In particular, let's assume that we want the robot to:
1. go to a table in the lab
2. scan the table for any objects
3. pick the closest object from the table
4. place the object back on the table
5. repeat 2-4 until the execution is manually terminated
We also want some recovery in the execution; in particular, we want the robot to repeat a state if it detects a failure there, such that if a given state fails a predefined number of consecutive times, the state machine is interrupted.
We can encode this scenario using the YAML-based state machine specification described in mas_execution_manager as follows:
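A sketch of such a specification is given below; the exact schema is documented in mas_execution_manager, and the module names and argument values here are assumptions for illustration:

```yaml
# Scenario definition for the perpetual pick-and-place demo (a sketch;
# module/class/argument names are assumed for illustration)
sm_id: pick_and_place
states: [GO_TO_TABLE, SCAN_TABLE, PICK_OBJECT, PLACE_OBJECT]
outcomes: [FAILED]
state_descriptions:
    - state:
        name: GO_TO_TABLE
        state_module_name: mdr_behaviours.move_base
        state_class_name: MoveBase
        transitions:
            - transition:
                name: succeeded
                state: SCAN_TABLE
            - transition:
                name: failed
                state: GO_TO_TABLE
            - transition:
                name: failed_after_retrying
                state: FAILED
        arguments:
            - argument:
                name: destination_locations
                value: [TABLE]
            - argument:
                name: number_of_retries
                value: 3
    # SCAN_TABLE, PICK_OBJECT, and PLACE_OBJECT are specified analogously,
    # with PLACE_OBJECT transitioning back to SCAN_TABLE on success so that
    # the loop continues until the execution is manually terminated
```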
The resulting state machine from this scenario definition is shown below:
In the state machine definition file, state_module_name and state_class_name specify the package and class name that define the state to be executed (states are thus dynamically loaded from the configuration file rather than hard-coded in a scenario script). In this particular case, all states are reused from the mdr_behaviours metapackage, but they could also be redefined for a specific scenario to capture any particular nuances of the scenario.
Behaviours are implemented as smach states that inherit from a ScenarioStateBase class defined in mas_execution_manager. To interact with the skills, the behaviours use the action clients described above, i.e. skills are not invoked by creating an actionlib action client, but by sending a message to the action client, much like a plan dispatcher would (though there is nothing preventing one from creating an explicit client instead).
An example implementation of the move_base behaviour used in the above GO_TO_TABLE state illustrates this principle and is shown below (the behaviour is part of the mdr_behaviours metapackage):
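The listing is a sketch: the import path, the constructor arguments, and the dispatch helpers (send_action_request, wait_for_action_result) are assumptions that only illustrate the pattern:

```python
# move_base behaviour as a smach state (a sketch)
from mas_execution_manager.scenario_state_base import ScenarioStateBase  # path assumed


class MoveBase(ScenarioStateBase):
    def __init__(self, **kwargs):
        ScenarioStateBase.__init__(self, 'move_base',
                                   outcomes=['succeeded', 'failed',
                                             'failed_after_retrying'])
        self.destination_locations = kwargs.get('destination_locations', [])
        self.number_of_retries = kwargs.get('number_of_retries', 0)
        self.retry_count = 0

    def execute(self, userdata):
        # the skill is invoked by sending a request to the move_base action
        # client - much like a plan dispatcher would - rather than by
        # creating an actionlib client here
        for destination in self.destination_locations:
            self.send_action_request(destination)   # hypothetical helper
            if not self.wait_for_action_result():   # hypothetical helper
                if self.retry_count == self.number_of_retries:
                    return 'failed_after_retrying'
                self.retry_count += 1
                return 'failed'
        return 'succeeded'
```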
Author(s): Alex Mitrevski
Contributors: Minh Nguyen, Argentina Ortega
Last update: 23.02.2019