MITM proxy for the Tezos P2P network.
What is it?
This tool acts as a Man in the middle proxy for a set of Tezos nodes, communicating on the P2P network.
It can be used to:
- monitor messages exchanged by participants on a predefined Tezos P2P network,
- simulate delays and loss of messages on the network for tests, and even clock drifts,
- play (and replay) scenarios (expressed in terms of message passing)
Mitten was developed to (re)play scenarios for the Tenderbake consensus protocol. Tezos nodes that are connected to Mitten are real nodes and execute real Tezos code, which means that:
- the instrumentation on Tezos code is nil (almost)
- any scenario that can be played with Mitten can also happen in real life on the real Tezos network
Quickstart
Getting Started
Prerequisites
Mitten depends on the Tezos code base (and libraries), and is distributed with it.
Optionally, if you want to simulate clock drifts on certain nodes and bakers, you need to have libfaketime installed.
Building
The easiest way to build Mitten is inside the Tezos repository (after having built Tezos itself).
dune build src/mitten
or, at the root of the repo
make mitten
Running
Forwarder/logger Proxy
The binary that is build by default is a simple forwarder/logger for the P2P network (logging the messages of the DDB).
The Mitten forwarder can be run with:
_build/default/src/mitten/mitten.exe --stats --verbose 3
Scenarios
Scenarios for Mitten are written (for now) as OCaml files, compiled with the Mitten libraries. Each scenario is built to an executable which, when run, starts an instance of Mitten parameterized by the scenario.
Running a Mitten scenario amounts to running the generated executable, e.g.:
_build/default/src/mitten/scenarios/baker_not_participating_round_r_because_pqc_previous_round_diff_playload.exe
Architecture and Design of Mitten
MITM Proxy
The Man in the middle proxy is configured to communicate with a set of Tezos nodes on multiple TCP sockets. Each Tezos node is started in private mode communicating with the other nodes of the network only through the proxy.
For instance, on the schema above, the node that is started on port
71
communicates with the node started on port 72
only through the
proxy on port 82
and never directly. Basically, each node sees its
peers through the Mitten proxy, which allows the latter to act on
messages that go through.
In the forwarder mode, this allows e.g. Mitten to log all P2P (DDB) messages before forwarding them to their intended destination.
Modularity
The machinery inside the mitten proxy can be replaced with anything that acts on the messages.
For now there are two machinery plugins in Mitten:
- a logger/forwarder,
- a scenario player, which is in essence a forwarder (that can delay, drop or modify messages) parameterized by a scenario written in a specific DSL.
Configuration
In this section you will see an overview of the different command line and configuration options.
Command Line Options
Option | Description |
---|---|
--config <file.json> | Use configuration file <file.json> for network |
--verbose <level> | Set verbosity level (0 - 5 ), default: 3 |
--debug | Show debug traces |
--stats | Show connections stats periodically |
--details | Show details of check results |
--only-proxy | Start only proxies |
--dont-track-block | Deactivate tracking of block hashes for each node |
--dont-track-operation | Deactivate tracking of operation hashes for each node |
--track-message | Track the known messages for each node |
--keep-alive | Keep nodes and bakers alive after termination |
--queue-level-limit | Delete messages older than limit from the queue, default: 2 |
-help | Display this list of options |
--help | Display this list of options |
Configuration File
The JSON file given in argument to --config
, allows to configure
certain parameters of Mitten (in particular on the sandboxed network)
that are common to multiple runs.
Here is an example configuration files which sets all the possible fields. A detailed explanation for each one is given afterwards.
{
"chain": "SANDBOXED_TEZOS",
"proxies" : [
{ "node" : {"name" : "b1", "port" : 19771, "rpc_port" : 18731 },
"proxy_port" : 19781
},
{ "node" : {"name" : "b2", "port" : 19772, "rpc_port" : 18732 },
"proxy_port" : 19782,
"baker" : {
"secret_key" : "edsk37djhrngD2GrZxhyu8PMFqbjv66Ee3j1d1zNVGdnDzMAx29CUK",
"state_file" : "/tmp/baker_state_b2.json"
}
},
{ "node" : {"name" : "b3", "port" : 19773, "rpc_port" : 18733 },
"proxy_port" : 19783,
"baker" : { },
"accuser" : false,
"drift" : { "add" : 0.2, "mul" : 1.001 }
},
{ "node" : {"name" : "b4", "port" : 19774, "rpc_port" : 18734 },
"proxy_port" : 19784,
"baker" : { }
}
],
"data_dir": "/tmp/data",
"logs_dir": "/tmp/logs",
"timeout" : 60.0,
"libfaketime" : "/opt/local/lib/faketime/libfaketime.1.dylib",
"rpc_proxy" : false,
"delegate_selection" : "round_robin",
"committee" : 4,
"quorum" : 3,
"round_durations" : [1, 2],
"node_binary" : "/path/to/tezos-node",
"client_binary" : "/path/to/tezos-client",
"baker_binary" : "/path/to/tezos-baker-alpha",
"accuser_binary" : "/path/to/tezos-accuser-alpha"
}
chain
By default "SANDBOXED_TEZOS"
. This is to allow the proxy to
communicate with the nodes on the P2P network.
proxies
A list of predefined proxies with their parameters that are started by mitten. Proxies are started automatically when playing scenarios, but the configuration file is the only way to configure the network when mitten is started in forwarder/logger mode.
Each proxy contains :
- node: a configuration for the node with
- name : The name that will be used to denote the node and baker in the logs and scenarios,
- port : the port on which the node listens for P2P messages (by default 19770 + node index)
- rpc_port : the port on which the node listens for RPC requests (by default 19780 + node index)
- proxy_port : the port for the P2P proxy in Mitten (on which the other nodes will connect)
- baker : a configuration for the baker. If this field is absent
no baker is started on this node. If this field is the empty object
{}
the parameters for the baker are chosen automatically.- secret_key : an unencrypted secret key for the delegate with which the baker will bake (and endorse) (optional)
- state_file : a path to the dump file for this baker (optional)
- accuser : whether or not to start the accuser automatically for this node
- drift : affine parameters for the clock drift of the associated baker and node
data_dir
Directory where to store node data and client data, for network
spawned. Default is a uniquely generated directory in /tmp
.
logs_dir
Directory where to store node, bakers, and accuser logs, for network
spawned. Default is a uniquely generated directory in /tmp
.
timeout
Global timeout in seconds for the proxy. When this time is elapsed the proxy is terminated.
libfaketime
The path to the library file for libfaketime (only useful for drifts).
rpc_proxy
By default false
. Whether to also start a proxy on the RPC port of
the node. This is useful to log and intercepts messages between a
baker and its node.
delegate_selection
The selection strategy for delegates (in the committee) at each level
and round. By default random
, which follows what is implemented in
the Tezos node.
Either :
"random"
: the default. Which follows what is implemented in the Tezos node. Delegates are chosen at random depending on their stake and a sampling strategy. Use this if you want to be as close as possible to a mainnet Tezos node and if your scenario does not depend on the delegates and the committee."round_robin"
: delegates are chosen in a round robin manner over the list of delegates listed as bakers in the proxies field.[["b1", "b2", "b3"], ["b3", "b1", "b2"]]
The explicit list of who is in the committee and who is the baker for each level and round. The last item of the list is iterated over in a round robin manner for the latter levels.
committee
Committee size for tenderbake. This is expressed in terms of slots. By default, the committee size is the number of bakers started on the network.
quorum
Quorum size for tenderbake. This is expressed in terms of slots. By
default it is committee * 2 / 3 + 1
.
node_binary
Path to the Tezos node binary. Can be absolute or relative to where
mitten is launched. Defaults to ./tezos-node
.
client_binary
Path to the Tezos client binary. Can be absolute or relative to where
mitten is launched. Defaults to ./tezos-client
.
baker_binary
Path to the Tezos baker binary. Can be absolute or relative to where
mitten is launched. Defaults to ./tezos-baker-alpha
.
accuser_binary
Path to the Tezos accuser binary. Can be absolute or relative to where
mitten is launched. Defaults to ./tezos-accuser-alpha
.
Scenarios in Mitten
A scenario in Mitten describes which messages are allowed by a sequence (or any other combination) of steps.
Each message is pattern matched with the current step of the scenario to decide if it should go through or not. Messages that are matched are forwarded, and messages which are not matched are discarded (but can be reconsidered later on).
Steps can contain variables, constants, and catch-all patterns.
Scenario Language
Variables and Constants
Free variables can be defined simply by giving them a name
#![allow(unused)] fn main() { let b = Free "b" let r = Free "r" let l = Free "l" }
Constants are written with the constructor Bound
. A node/baker constant is
written by its name:
#![allow(unused)] fn main() { let b1 = Bound (N "b1") }
There are smart constructors for level and round constants so that they can be written easily:
#![allow(unused)] fn main() { let l3 = level 3 let r0 = round 0 }
Once a free variable is bound to a constant by matching, it is bound to this same constant for the rest of the scenario.
Catch-all variables are written by the constructor Any
or simply
(two underscore symbols):
#![allow(unused)] fn main() { __ }
Elementary Steps
Elementary (or leaf) steps describe the actual messages themselves. In Tenderbake, we care only about the consensus messages which are either a proposal, a preendorsement, or an endorsement.
Proposal
The step written by
#![allow(unused)] fn main() { propose b1 l r ph [b2] }
where b1
, l
, r
, ph
and b2
are variables or constants (as
described above) will match a proposal
message whose :
- sender (signer) matches
b1
- level matches
l
- round matches
r
- payload hash matches
ph
- and destination matches
b2
.
Preendorsement
Similarly, the step written by
#![allow(unused)] fn main() { preendorse b1 l r ph [b2] }
will match a corresponding preendorsement message.
Endorsement
Similarly, the step written by
#![allow(unused)] fn main() { endorse b1 l r ph [b2] }
will match a corresponding endorsement message.
Step Combinators
Conjunction
A step which is a conjunction of other steps is matched when all the steps of the conjunction are matched (in any order).
Conjunctions are written with the infix symbol &&
(or the
constructor And
).
For instance, the following step will be matched when a proposal of level 1 is sent and a proposal of level 2 is sent (irrespective of their round, sender, destination, etc.).
#![allow(unused)] fn main() { propose __ (level 1) __ __ [__] && propose __ (level 2) __ __ [__] }
Elementary steps accept many destinations, which is simply syntactic sugar for the corresponding conjunction.
For instance,
#![allow(unused)] fn main() { propose b r l __ [b1; b2; b3] }
is syntactic sugar for
#![allow(unused)] fn main() { propose b r l __ [b1] && propose b r l __ [b2] && propose b r l __ [b3] }
Disjunction
A step which is a disjunction of other steps is matched as soon as one of the steps of the disjunction is matched.
Disjunctions are written with the infix symbol ||
(or the
constructor Or
).
For instance, the following step will be matched when either a proposal of round 0 is sent or a proposal of round 1 is sent.
#![allow(unused)] fn main() { propose __ __ (round 0) __ [__] || propose __ __ (round 1) __ [__] }
Negation
A step which is the negation of another step is matched by any message which does not match the negated step.
Negations are written with the prefix symbol ~!
(or the
constructor Not
).
For instance, the following step will be matched by any preendorsement
and endorsement, and by any proposal of someone other than b1
.
#![allow(unused)] fn main() { ~! propose b1 __ __ __ [__] }
Sequence
A sequence [s1; s2]
of steps is matched when step s1
is first matched
and when step s2
is then matched.
Note that this effectively imposes an order on messages. However,
because discarded messages are reconsidered for matching, this means
that s2
can arrive before s1
but s1
will always be forwarded
before s2
. Sequences can be seen as a way to reorder messages.
Sequences are written with
#![allow(unused)] fn main() { seq [ s1; s2; s3; ... sn; ] }
where s1
, ..., sn
are steps.
Loop
Loop steps are used to wait for a particular message.
A loop is written
#![allow(unused)] fn main() { repeat_step ->? stop_step }
and will be matched when :
-
The step representing the stopping condition
stop_step
is matched, or -
A message matching
repeat_step
is matched and then the same loop must be matched again#![allow(unused)] fn main() { repeat_step ->? stop_step }
For instance,
#![allow(unused)] fn main() { ( preendorse __ __ __ [__] || endorse __ __ __ [__] ) ->? propose __ __ __ [__] }
waits for a proposal message and forwards any preendorsement and endorsement messages in the meantime.
Structure of a scenario
A scenario in Mitten is represented by values of the following
scenario
type:
#![allow(unused)] fn main() { type scenario = { nodes : node list; parameters : sandbox_param; constraints : var_constraint list; code : step; timeout : float option; } }
nodes
The field nodes
contains a list of nodes for the scenario. Each node
must be given a unique name and can be configured to have a baker and
accuser associated.
Mitten will automatically start (and stop, after execution) the network with all the bakers and accusers.
Each node is a value of the type node
:
#![allow(unused)] fn main() { type node = { name : name; baking : bool; accusing : bool; drift : drift; } }
Nodes (with their respective baker and accuser) can be made to have a clock drift for simulation purposes. This is useful for instance to test how the network behaves, in particular how the consensus evolves in the presence of nodes which do not have the same clock (which happens in practice).
#![allow(unused)] fn main() { type drift = { add : float option; mul : float option; } }
The time for a node with a clock drift is an affine function of the real time (the one on the machine which runs ) :
For instance, a node whose drift is set to
#![allow(unused)] fn main() { { add = Some (- 1.0); mul = Some 1.01 } }
will have a clock which is initially 1 second early, but whose time passes 1% slower (i.e. the node will "loose" one second every hundred seconds).
A drift of nodrift = { add = None; mul = None }
is functionally
equivalent to { add = Some 0.0; mul = Some 1.0 }
but the system
calls will not be intercepted by libfaketime.
To simulate clock drifts, you need to have libfaketime installed on your system.
parameters
The network with which the scenario is run can be configured in the
parameters
field.
The components of this object are a subset of the ones that can be found in the global configuration file (see this section for example).
#![allow(unused)] fn main() { type sandbox_param = { delegate_selection: mitten_delegate_selection; committee: int option; quorum: int option; round_durations: (int * int) option; } }
delegate_selection
The selection strategy for delegates (i.e. validators in the committee) at each level and round.
It can be either :
random
#![allow(unused)] fn main() { Random }
This follows what is implemented in the Tezos node. Delegates are chosen at random depending on their stake and a sampling strategy. Use this if you want to be as close as possible to a mainnet Tezos node and if your scenario does not depend on the delegates and the committee.
round_robin
#![allow(unused)] fn main() { Round_robin }
Delegates are chosen in a round robin manner over the list of nodes.
round_robin_over
#![allow(unused)] fn main() { Round_robin_over [ [N "b1"; N "b2"; N "b3"]; [N "b2"]; [N "b3"; N "b1"; N "b2"] ] }
The explicit list of who is in the committee and who is the baker
for each level and round. The last item of the list is iterated over
in a round robin manner for the latter levels. In the example above,
the first level (of the protocol) will have b1
be the proposer for
round 0, b2
be the proposer for round 1, b3
be the proposer for
round 2, b1
be the proposer for round 3, and so on. In the second
level, b2
is the proposer for all the rounds. In subsequent levels,
the proposer is chosen in the list [b3, b1, b2]
by an index i who is
computed (by round robin) by:
or more generally, if the delegate selection strategy is
Round_robin_over l
(and we note by the element of
the list modulo its size):
-
When the level is smaller than :
-
When the level is greater or equal than :
committee
Committee size for tenderbake. This is expressed in terms of slots. If
None
, the committee size is the number of nodes.
quorum
Quorum size for tenderbake. This is expressed in terms of slots. If
None
, it is
round_durations
The durations for each round in seconds. If None
, it is set to (15, 15)
. The first component of the pair is the duration of the first
round and the second component is the duration of the second
round. All subsequent rounds have linearly increasing durations with
the same increase between the first and the second round.
For instance,
#![allow(unused)] fn main() { round_durations = Some (3, 5) }
the actual durations of the round will be (in seconds):
- round 0 : 3
- round 1 : 5
- round 2 : 7
- round 3 : 9
- round 4 : 11
- ...
Similarly if we set
#![allow(unused)] fn main() { round_durations = Some (1, 1) }
each round will be 1 second.
constraints
An optional set of constraints can be added on variables. This can be
useful to add e.g. arithmetic constraints like l' = l + 1
or r2 > r1
on variables for levels and rounds. See the section on
constraints for more details.
code
This is the actual code that should be executed by the scenario.
timeout
A global timeout can be specified, in seconds, for the execution of the scenario. This is useful for developing or fine tuning scenarios to avoid having to wait for too long, but it is most useful for scenarios that should be replayed as parts of tests in the CI (for instance) to avoid exhausting resources.
Running a Scenario
Writing Scenario Files
The file should start with an open of either:
open Proxy_scenario_engine
or
open Mockup_scenario_engine
depending on the desired mode to execute the scenario. The mode "proxy" is the default one and starts a whole sandboxed, proxied network whereas the mode "mockup" simulates multiple bakers and nodes in a single executable (see the section on the mockup engine for more details).
There is a single function to call to execute a scenario,
run_scenario
which takes as argument a value of type scenario
.
This call should be the last top-level call of the scenario file, as the program will exit after the scenario is over with a code indicating success or a failure in the execution of the scenario.
Compiling a Scenario File
The Simple Way
To have a scenario file be automatically compiled with the correct
incantation, it is only necessary to add it to the
src/mitten/scenarios
directory.
For instance, adding myscenario.ml
to this directory, the following
will compile Mitten and all scenarios in the scenarios
directory.
make mitten
If you want to only compile your scenario file you can do
dune build src/mitten/scenario/myscenario.exe
The hard Way
If you want to compile a scenario "manually", you can place it
anywhere together with a dune
file containing (for myscenario.ml
):
(executable
(name myscenario)
(libraries lib_mitten)
(flags (:standard
-open Tezos_base__TzPervasives
-open Lib_mitten
-open Lib_scenarios
-open Scenarios
-open Types
-linkall)))
and call
dune build path/to/myscenario.exe
Structure of a scenario
A scenario in Mitten is represented by values of the following
scenario
type:
#![allow(unused)] fn main() { type scenario = { nodes : node list; parameters : sandbox_param; constraints : var_constraint list; code : step; timeout : float option; } }
Grammar of Constraints x
where is a variable or a constant as described previously.
When a non-empty set of constraints is provided as part of a scenario, matching of messages, and thus variables, is done modulo these constraints.
For messages to match, the substitutions that result from the matching must keep the constraints set satisfiable, in which case we say the step is matched modulo this set of constraints.
Deductions that can be made straightforwardly are propagated automatically, but there is no real solver at the moment (for instance there is no union-find structure for equalities). This is something that can be improved later on if these constraints turn out to be important for describing some scenarios.
Syntax
The Mitten library provides a "helper" module to write constraints in a simpler way.
TODO
The Mockup Engine
Instead of spawning a new sandboxed network and acting as a proxy between the nodes for replaying scenario, Mitten can make use of the mockup mode of the Tezos node.
When run this way, the node is simulated and the bakers are run (together with the simulated node) in the same executable.
This is a more lightweight approach (as opposed to starting a full sandboxed network and proxies) which allows in theory to have a more deterministic setup for replaying scenarios, even though race conditions can still occur.
To use the mockup engine, the only thing that needs to be changed is the initial open, which should be replaced by:
open Mockup_scenario_engine
Assertions
To check properties and write meaningful scenarios, Mitten provides a mechanism for adding assertions throughout the code.
In fact, these assertions are very flexible and allow to execute arbitrary code. This gives the user total freedom on how to track information and check important properties, while still providing a general framework for recording and reporting results of the execution together with a few helper functions for usual checks and actions.
Pre- and Post-conditions
Assertions (and other code) are written in pre- and post-conditions of
scenario steps. They can be added with the function add_spec
:
val add_spec : ?pre:(check list) -> ?post:(check_list) -> step -> step
The pre-condition will be executed before propagating the message matched by this step, and the post-condition will be executed after propagating the message.
💡 Note that, because a post-condition is executed after the message is propagated to its intended destination, there are no guarantees that it is already taken into account by the node. In particular, checking properties on the state of the destination node/baker after the propagation is most of the time meaningless.
Assertions Language
Checks that can be added in pre/post-conditions can be of three kinds as shown by the type:
#![allow(unused)] fn main() { type check = | Exec of (check_arg -> unit tzresult Lwt.t) | Check of string * on_fail * (check_arg -> bool tzresult Lwt.t) | AsyncCheck of string * on_fail * (check_arg -> bool tzresult Lwt.t) }
Code Execution
For instance, the following scenario counts the number of proposals of round 1 until level 10:
#![allow(unused)] fn main() { let cpt = ref 0 let scenario_code = (add_spec ~post:[ Exec (fun _ -> incr cpt; return_unit) ] @@ proposal __ __ (round 1) __ [__] ) || any_step ->? proposal __ (level 10) __ __ [__] }
The scenario is composed of a single loop which waits for a proposal
of level 10. In the body of the loop, we accept proposals of round 1
or any other step, but we have added a post-condition to the step
proposal
which increments a counter.
At the end of the execution, the variable cpt
will contain the
number of proposal messages at round 1 that the proxy has seen go
through.
Another use of Exec
is to perform queries or actions on the node,
such as injecting operations to specific nodes:
#![allow(unused)] fn main() { Exec (Helpers.inject_dummy (N "b1")) }
Injects a dummy transaction of 1 mutez from b1 to itself, onto the node b1.
Checks
A check can be executed in a pre/post condition, when it is constructed
with Check
. It must be given a descriptive name and whether or not
the scenario should abort if the check fails.
The function which is executed by the check takes as argument the actual message and the step that was matched. This allows to extract information from the message (for instance we can fail a check if we see a proposal of round > 10).
The function must return either true
, if the check succeeded, or
false
if the check failed. The check itself can also fail in the
error monad or with an exception, for unforeseen executions.
Check execution results are gathered by Mitten while executing the scenario and reported at the end. Failures are also reported while they happen. Mitten differentiates failed checks from the ones that errored or raised an exception.
For instance, the following scenario check ensures that a baker that proposes has a node whose head is on the same level, and continues even if the test fails:
#![allow(unused)] fn main() { let scenario_code = wait @@ add_spec ~pre:[ Check ("Node has same level as proposal", Continue_on_fail, (fun _ -> Helpers.get_node_head_level b1 >|=? fun l -> l = 5l )) ] @@ proposal b1 (level 5) __ __ [__] }
Asynchronous Checks
Checks can be asynchronous when using the constructor AsyncCheck
instead of Check
. In this case the execution of the check does not
block the progression of the execution of the scenario.
This is useful to implement checks that must wait for a certain amount of time, or wait for events to perform checks. For instance, with this we can add a check that waits 10 seconds and checks that the receiving node has handled the proposal.
#![allow(unused)] fn main() { Check ("Node has handled proposal after 10s", Continue_on_fail, (fun (msg, _) -> match msg with | Proposal header -> ( let hash = Block_header.hash header in Lwt_unix.wait 10. >>= fun () -> Helpers.get_node_block b1 (`Hash (hash, 0)) >>= fun _ -> | Ok _ -> return true (* block known *) | Error _ -> return false (* block not known *) ) | _ -> return true )) }