IMP Manual for IMP version 2.16.0
rmf.md
RMF {#rmf}
===

The IMP.rmf module acts as a link between RMF files and %IMP data structures.
Currently, three types of data are linked:

- hierarchies (IMP::atom::Hierarchy)
- restraints (IMP::Restraint)
- geometry (IMP::display::Geometry)

For each type of data, there is a set of functions declared in `foo_io.h`
and defined in `foo_io.cpp`. These functions support four major operations:

- addition of new data to the RMF file: this creates a hierarchy of nodes
  in the RMF and stores a link between those nodes and the corresponding
  %IMP data structures
- creation of %IMP structures from the RMF: %IMP data structures are created
  based on corresponding data contained in the RMF file
- linking of existing %IMP structures to the RMF: existing %IMP data structures
  are linked to data stored in the RMF. The two must correspond exactly
  (e.g., the RMF must have been created by adding those %IMP data structures
  to the file).
- saving a frame to a file: a new frame is added to the RMF file and all
  linked data is used to write data to that frame.

The links between %IMP data structures and the RMF file are implemented by
IMP::rmf::LoadLink and IMP::rmf::SaveLink. These are data structures associated
with the RMF file (the RMF library provides a mechanism for this) that store
persistent state that is available as long as the file is open. There is one
load and one save link class per type of data.
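As a rough illustration of the four operations and of link state that lives with the file, the following self-contained Python sketch models a file that owns at most one save link and one load link. Every name in it (`ToyFile`, `ToySaveLink`, `ToyLoadLink`, `get_link`, and their methods) is hypothetical; it is a toy model of the idea, not the IMP.rmf or RMF API.

```python
class ToyFile:
    """Stand-in for an RMF file: static nodes plus a list of frames."""
    def __init__(self):
        self.nodes = []        # node names, fixed once added
        self.frames = []       # one {node index: value} dict per frame
        self.associated = {}   # persistent per-file state (the links)


class ToySaveLink:
    """Remembers which objects feed which nodes when saving."""
    def __init__(self):
        self.pairs = []        # (node index, object) pairs

    def add(self, file, obj):
        file.nodes.append(obj["name"])
        self.pairs.append((len(file.nodes) - 1, obj))

    def save(self, file):
        file.frames.append({i: obj["value"] for i, obj in self.pairs})


class ToyLoadLink:
    """Remembers which objects receive which nodes when loading."""
    def __init__(self):
        self.pairs = []

    def create(self, file):
        objs = [{"name": n, "value": None} for n in file.nodes]
        self.pairs = list(enumerate(objs))
        return objs

    def link(self, file, objs):
        # Linked objects must correspond exactly to what was added.
        if [o["name"] for o in objs] != file.nodes:
            raise ValueError("objects do not match the file contents")
        self.pairs = list(enumerate(objs))

    def load(self, file, frame):
        for i, obj in self.pairs:
            obj["value"] = file.frames[frame][i]


def get_link(file, cls):
    """Fetch the single link of a given class, creating it on first use."""
    return file.associated.setdefault(cls.__name__, cls())


f = ToyFile()
a = {"name": "a", "value": 1.0}
get_link(f, ToySaveLink).add(f, a)             # addition
get_link(f, ToySaveLink).save(f)               # saving a frame
a["value"] = 2.0
get_link(f, ToySaveLink).save(f)

created = get_link(f, ToyLoadLink).create(f)   # creation
get_link(f, ToyLoadLink).load(f, 1)            # read the second frame back
```

Because the links are stored on the file object, repeated calls find the same state for as long as the file exists, which is the property the paragraph above relies on.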

## Hierarchy ##
IMP::atom::Hierarchy data is stored as RMF::REPRESENTATION nodes in the file.

There is exactly one node in the RMF file per IMP::Particle in the hierarchy.
The link classes are implemented in terms of a set of helper classes,
each of which handles a particular type of data.

The main difficulty is handling rigid bodies. In particular, the particles
defining the rigid body are not always part of the %IMP hierarchy, and
older RMF files (where rigid bodies were written differently) must also be
handled.

Rigid bodies that are not part of the IMP::atom::Hierarchy are handled by:

- at each node in the hierarchy, checking whether all IMP::core::XYZ particles
  in the subtree are part of the same rigid body
- if so, treating the current particle as the actual rigid body
- since this can result in one %IMP rigid body having many corresponding
  nodes in the RMF, adding a tag (an integer id) to such rigid bodies
  so they can be combined when creating from the file
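The steps above can be sketched on a toy tree. This is only an illustration under stated assumptions: `Node`, `bodies_in_subtree`, and `tag_rigid_nodes` are hypothetical names, rigid bodies are represented by plain strings, and stopping at the topmost qualifying node is a choice made for this sketch rather than something the text specifies.

```python
class Node:
    """Hypothetical hierarchy node; `rigid_body` is None or a body name."""
    def __init__(self, name, children=(), has_xyz=False, rigid_body=None):
        self.name = name
        self.children = list(children)
        self.has_xyz = has_xyz
        self.rigid_body = rigid_body


def bodies_in_subtree(node):
    """Collect the rigid body (or None) of every XYZ particle below node."""
    found = {node.rigid_body} if node.has_xyz else set()
    for child in node.children:
        found |= bodies_in_subtree(child)
    return found


def tag_rigid_nodes(node, body_ids, tags):
    """Tag the topmost nodes whose XYZ particles all share one rigid body."""
    found = bodies_in_subtree(node)
    if len(found) == 1 and None not in found:
        body = next(iter(found))
        # The same body always gets the same integer id, so separate
        # nodes can be merged back into one body when reading.
        tags[node.name] = body_ids.setdefault(body, len(body_ids))
        return
    for child in node.children:
        tag_rigid_nodes(child, body_ids, tags)


root = Node("root", [
    Node("A", [Node("a1", has_xyz=True, rigid_body="rb0"),
               Node("a2", has_xyz=True, rigid_body="rb0")]),
    Node("B", [Node("b1", has_xyz=True, rigid_body="rb0"),
               Node("b2", has_xyz=True)]),
])
tags = {}
tag_rigid_nodes(root, {}, tags)
```

Here the single body `rb0` ends up represented by two nodes, `A` and `b1`, which carry the same tag so that a reader can recombine them.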

When writing coordinates of particles that belong to rigid bodies, the internal
coordinates are written to the RMF rather than the global coordinates. This
makes the files more compact. It works because the rigid bodies are written as
RMF::decorator::ReferenceFrame nodes in the file, which effectively provide
a transformation for the coordinates of everything underneath them.
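To make the compactness argument concrete, here is a minimal sketch in plain Python of how global coordinates can be recovered from internal coordinates plus one reference frame per saved frame. The helper names `apply_frame` and `rotation_z` and the sample numbers are assumptions made for illustration; they are not part of IMP or RMF.

```python
import math


def apply_frame(rotation, translation, local):
    """Return rotation * local + translation for 3-vectors."""
    return tuple(
        sum(rotation[i][j] * local[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def rotation_z(angle):
    """3x3 rotation matrix about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))


# Written once: coordinates of the members relative to the body.
internal = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]

# Written per frame: a single transformation for the whole body.
frames = [
    (rotation_z(0.0), (0.0, 0.0, 0.0)),
    (rotation_z(math.pi / 2), (10.0, 0.0, 0.0)),
]

# Global coordinates are derived on demand, never stored.
global_coords = [
    [apply_frame(rot, trans, p) for p in internal] for rot, trans in frames
]
```

With this layout, the per-frame cost is one transformation per rigid body, independent of how many members the body has.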

## Restraints ##
Restraints are stored as RMF::FEATURE nodes in the file.

Restraints are more or less write-only, in contrast to the other types. There
are create functions, but since the RMF doesn't contain enough data to
actually create the correct %IMP type, these just create dummy restraints
whose score is simply the value stored in the file (and which don't have any
derivatives). As a result, it is not clear that they can be used for much
beyond verifying scores and some simple analysis.

Each IMP::Restraint added to the RMF is written as follows:

- an RMF node is created for the restraint itself, containing one score per frame
- IMP::Restraint::create_current_decomposition() is called on the restraint.
  If this decomposition contains more than one restraint, then child nodes
  are added for each restraint and they are stored recursively.
- all leaves have the (static) list of RMF nodes that they involve written.
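The recursion described in the list can be sketched with plain dictionaries. This is a toy model: `decompose` is a hypothetical stand-in for the decomposition call that simply returns a list of parts, and the dictionary layout of restraints and nodes is invented for the example.

```python
def decompose(restraint):
    """Hypothetical stand-in for create_current_decomposition()."""
    return restraint.get("parts") or [restraint]


def store(restraint):
    """Build a toy node tree for one restraint."""
    node = {
        "name": restraint["name"],
        "scores": [],       # one entry would be appended per saved frame
        "children": [],
        "inputs": None,     # only set on leaves
    }
    parts = decompose(restraint)
    if len(parts) > 1:
        # More than one term: add a child per term and recurse.
        node["children"] = [store(p) for p in parts]
    else:
        # Leaf: record the static list of nodes it involves.
        node["inputs"] = sorted(parts[0]["inputs"])
    return node


bond = {"name": "bond", "inputs": ["p1", "p0"]}
angle = {"name": "angle", "inputs": ["p0", "p1", "p2"]}
total = {"name": "total", "parts": [bond, angle]}
tree = store(total)
```

In this sketch only the leaves (`bond` and `angle`) carry an input list, while the parent `total` carries children.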

Mapping between the sub-restraints created by create_current_decomposition()
and RMF nodes is slightly non-trivial. It is done based on the set of input
particles for each restraint returned by create_current_decomposition().
For each such restraint, the link first checks in a map whether there is already
a node with that set of inputs. If so, it reuses that node; if not, it creates
a new one.
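A minimal sketch of such a lookup, assuming the map is keyed on an unordered set of particle names: `node_for` and the data layout are hypothetical, and particles are represented by strings purely for illustration.

```python
def node_for(inputs, nodes_by_inputs):
    """Return the node for this set of input particles, creating it once."""
    key = frozenset(inputs)
    if key not in nodes_by_inputs:
        nodes_by_inputs[key] = {"inputs": sorted(key), "scores": []}
    return nodes_by_inputs[key]


nodes_by_inputs = {}

# Two frames: each decomposition yields fresh (inputs, score) terms,
# but terms acting on the same particles land in the same node.
frames = [
    [(["p0", "p1"], 1.0), (["p1", "p2"], 0.5)],
    [(["p1", "p0"], 0.8), (["p2", "p3"], 0.2)],
]
for terms in frames:
    for inputs, score in terms:
        node_for(inputs, nodes_by_inputs)["scores"].append(score)
```

Keying on a `frozenset` makes the lookup independent of the order in which a term lists its inputs, so the pair `p0`, `p1` maps to one node in both frames while the new pair `p2`, `p3` gets a node of its own.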

## Geometry ##
Geometry is stored as RMF::SHAPE nodes in the file.

The functionality provided in IMP::display::Geometry doesn't map entirely
naturally onto RMF's geometry support. On the %IMP side, things are
implemented in terms of an ephemeral decomposition of complex geometric
objects into simple ones (e.g., a bounding box can be decomposed into edges,
but each time you ask for the decomposition, you will get a different set
of IMP::Objects). The RMF side expects a hierarchy whose structure stays
constant across frames. As a result, each type of geometry object has to
be special-cased in IMP.rmf. Currently there is support for:

- segments
- balls
- bounding boxes
- cylinders
- points
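One way a special case can reconcile an ephemeral decomposition with a fixed node structure is to decompose in a deterministic order and match pieces to nodes by position. The sketch below does this for a bounding box; it is an assumption about how such a special case could look, not a description of the IMP.rmf implementation, and `box_edges`, `NODE_NAMES`, and `frame_values` are hypothetical names.

```python
from itertools import product


def box_edges(lower, upper):
    """Return the 12 edges of an axis-aligned box in a fixed order."""
    corners = list(product(*zip(lower, upper)))
    edges = []
    for i, a in enumerate(corners):
        for b in corners[i + 1:]:
            # Two corners share an edge when exactly one coordinate differs.
            if sum(x != y for x, y in zip(a, b)) == 1:
                edges.append((a, b))
    return edges


# Created once: the child nodes whose structure never changes.
NODE_NAMES = [f"edge {i}" for i in range(12)]


def frame_values(lower, upper):
    """Pair each fixed node with the freshly computed edge at its index."""
    return dict(zip(NODE_NAMES, box_edges(lower, upper)))


frame0 = frame_values((0, 0, 0), (1, 1, 1))
frame1 = frame_values((0, 0, 0), (2, 1, 1))
```

Each call produces new edge objects, yet the set of nodes and their order is identical across frames.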

## Considerations ##
For each piece of data stored in the RMF, a decision has to be made whether
to store it once per file (using the RMF set_static_foo() methods), once
per frame (using the RMF set_frame_foo() methods), or automatically
(using the RMF set_foo() methods). In IMP.rmf, things are in general stored
once per file, with the exception of coordinates/transformations of things
that are not rigid parts of rigid bodies, which are stored once per frame.
This could be changed to use automatic storage for almost everything, but that is:

- a bit slower, as it has to check each time whether the value is different
  from what is already stored
- not something that other programs expect for things like residue index

In particular, it might make sense for radii to become automatic.
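The trade-off between the three policies can be illustrated with a toy attribute store. `ToyAttribute` and its methods are hypothetical, and the automatic policy shown (store the first value as static, then store a per-frame value only when it differs) is an assumption based on the description above rather than a statement of what the RMF library does internally.

```python
class ToyAttribute:
    """One attribute of one node, with static and per-frame storage."""
    def __init__(self):
        self.static = None
        self.frames = {}
        self.comparisons = 0   # work done by the automatic policy

    def set_static(self, value):
        self.static = value

    def set_frame(self, frame, value):
        self.frames[frame] = value

    def set_auto(self, frame, value):
        if self.static is None:
            self.static = value
            return
        # The extra cost: every call compares against the stored value.
        self.comparisons += 1
        if value != self.static:
            self.frames[frame] = value

    def get(self, frame):
        return self.frames.get(frame, self.static)


radius = ToyAttribute()
coordinate = ToyAttribute()
for frame, (r, x) in enumerate([(1.5, 0.0), (1.5, 0.1), (1.5, 0.2)]):
    radius.set_auto(frame, r)
    coordinate.set_auto(frame, x)
```

In this sketch a constant radius collapses to a single static value, while a changing coordinate gets per-frame entries; both pay for a comparison on every frame after the first.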

One must be very careful not to store any RMF::NodeHandle or RMF::FileHandle
objects in the adaptors, as these will keep the RMF file open indefinitely
(the handles keep the file alive, the file keeps the adaptor alive, and
the adaptor keeps the handles alive).
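The ownership cycle can be demonstrated with a small Python analogue. Python normally breaks such cycles with its garbage collector, so the sketch disables it temporarily to mimic pure reference counting, which is closer to shared ownership in C++. All names (`ToyFile`, `LeakyAdaptor`, `SafeAdaptor`, `released_without_cycle_collector`) are hypothetical, and using a weak reference is just one possible way to avoid the cycle.

```python
import gc
import weakref


class ToyFile:
    """Stand-in for an open file that owns its adaptors."""
    def __init__(self):
        self.adaptors = []


class LeakyAdaptor:
    """Stores a strong handle back to the file, closing the cycle."""
    def __init__(self, file):
        self.handle = file


class SafeAdaptor:
    """Stores only a weak reference, so the file can be released."""
    def __init__(self, file):
        self._file = weakref.ref(file)


def released_without_cycle_collector(adaptor_cls):
    """Report whether dropping the last user reference frees the file."""
    was_enabled = gc.isenabled()
    gc.disable()               # rely on reference counting alone
    try:
        f = ToyFile()
        f.adaptors.append(adaptor_cls(f))
        probe = weakref.ref(f)
        del f                  # the user "closes" the file
        return probe() is None
    finally:
        if was_enabled:
            gc.enable()


leaky_released = released_without_cycle_collector(LeakyAdaptor)
safe_released = released_without_cycle_collector(SafeAdaptor)
```

With the strong handle the file object survives the `del`, whereas with the weak reference it is released immediately.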